Benthic monitoring of salmon farms in Norway using foraminiferal metabarcoding

The rapid growth of the salmon industry necessitates the development of fast and accurate tools to assess its environmental impact. Macrobenthic monitoring is commonly used to measure the impact of organic enrichment associated with salmon farm activities. However, classical benthic monitoring can hardly answer the rapidly growing demand because the morphological identification of macro-invertebrates is time-consuming, expensive and requires taxonomic expertise. Environmental DNA (eDNA) metabarcoding of meiofauna-sized organisms, such as Foraminifera, was proposed to overcome the drawbacks of macrofauna-based benthic monitoring. Here, we tested the application of foraminiferal metabarcoding to benthic monitoring of salmon farms in Norway. We analysed 140 samples of eDNA and environmental RNA (eRNA) extracted from surface sediment samples collected at 4 salmon farming sites in Norway. We sequenced the variable region 37f of the 18S rRNA gene specific to Foraminifera. We compared our data to the results of macrofaunal surveys of the same sites and tested the congruence between various diversity indices inferred from metabarcoding and morphological data. The results of our study confirm the usefulness of Foraminifera as bioindicators of organic enrichment associated with salmon farming. The foraminiferal diversity increased with the distance to fish cages, and metabarcoding provides an assessment of the ecological quality comparable to the morphological analyses. The foraminiferal metabarcoding approach appears to be a promising alternative to classical benthic monitoring, providing a solution to the morpho-taxonomic bottleneck of macrofaunal surveys.


INTRODUCTION
By 2030, aquaculture is projected to supply over 60% of fish destined for direct human consumption (World Bank 2013).Among the different species produced, farming of Atlantic salmon Salmo salar has grown substantially in the past 40 yr, and currently represents approximately 60% of the world's salmon production.According to FAO statistics (STECF 2014), Norway is the world's leading producer of farmed Atlantic salmon, exporting to 140 countries.Future development of the sector depends on complying with regulatory requirements related to environmental protection (Taranger et al. 2015).
The assessment of benthic diversity is one of the mandatory tools required to comply with the standards established for monitoring the environmental impact of salmon farming.Various biotic indices have been developed based on macrofaunal inventories, including the infaunal trophic index (ITI, Maurer et al. 1999), and the AZTI marine biotic index (AMBI, Borja et al. 2000).In New Zealand, multiple biotic indices are used in conjunction with chemical and other biological indicators to provide weight-ofevidence-based multivariable overall assessment of enrichment stage (Keeley et al. 2012).In Norway, unique indices such as the Norwegian sensitivity index (NSI, Rygg & Norling 2013), and the Norwegian quality index (NQI) are used in conjunction with many other biological indicators to obtain an overall assessment.These indices provide a meaningful evaluation of the ecological quality status based on our current knowledge of the ecological niches of recorded species.However, calculation of biotic indices requires the morphological identification of sorted macro-invertebrates, which is time-consuming and requires taxonomic expertise.The lack of trained taxonomists causes important delays in the analysis of rapidly growing numbers of samples, which seriously limits the efficiency and time-sensitive aspects of benthic monitoring.
Over the last decade, there has been a spectacular development of next-generation sequencing (NGS)based environmental DNA (eDNA) surveys, also called NGS eDNA metabarcoding (Taberlet et al. 2012).Until now, most of the metabarcoding studies related to biomonitoring focused on freshwater ecosystems, either applying the metabarcoding approach to diatom biomonitoring (Kermarrec et al. 2013, 2014, Zimmermann 2014, 2015, Visco et al. 2015), assessing the diversity of benthic macrofauna in fixed bulk samples (Hajibabaei et al. 2012, Stein et al. 2013) or testing the congruence between species inventories inferred from NGS and morphological studies in aquatic insects (Yu et al. 2012, Zhou et al. 2013, Carew et al. 2013).A few studies performed eDNA surveys of benthic communities to monitor marine ecosystems (e.g.Chariton et al. 2010, 2014, 2015, Bik et al. 2012, Pawlowski et al. 2014a, Cowart et al. 2015, Guardiola et al. 2015, Lejzerowicz et al. 2015).
Foraminifera are among the most common and diversified groups of marine meiofauna-sized protists extensively used in ecotoxicological studies (Alve 1995, Frontalini & Coccioni 2011, Schönfeld et al. 2012).The Foraminifera are sensitive to local conditions and often have short life cycles, making them highly responsive to environmental perturbations, including organic enrichment and physical disturbances.Previous studies showed that foraminiferal communities rapidly change under organic pollution exposures associated with fish farming (Scott et al. 1995, Angel et al. 2000, Vidovi et al. 2009, 2014).Foraminifera are also good indicators of the impact of offshore drilling activities (Mojtahid et al. 2008, Jorissen et al. 2009, Denoyelle et al. 2010, Schwing et al. 2015) and heavy metal pollution (Bergin et al. 2006, Frontalini et al. 2009).However, all of these studies were restricted to the hard-shelled species microscopically identified in dried sediment samples (Schönfeld et al. 2012, Alve et al. 2016).
Here, we take advantage of well-established protocols developed for the purpose of molecular identification and classification of Foraminifera (Pawlowski & Lecroq 2010, Pawlowski & Holzmann 2014).We built and currently maintain the most extensive database of reference sequences comprising a fragment of 18S rRNA gene (forambarcoding.unige.ch)for diverse foraminiferal taxa including species collected in northern European coastal habitats.Several metabarcoding studies have been conducted in order to explore the hidden diversity (Lecroq et al. 2011) and spatial micro-distribution of Foraminifera (Lejzerowicz et al. 2014), and to test the preservation of ancient foraminiferal DNA in downcore sediments (Lejzerowicz et al. 2013, Pawłowska et al. 2014).The conclusions of these studies and their perspectives have been reviewed by Pawlowski et al. (2014b).
In previous studies, we used both DNA-and RNAbased metabarcoding to investigate the impact of organic enrichment associated with salmon farming on the diversity of benthic Foraminifera in Scotland and New Zealand.In Scotland, we surveyed the response of foraminiferans at various distances from cages both at the community and species levels (Pawlowski et al. 2014a).Correlative analyses based on common diversity metrics and exploratory analyses based on Bray-Curtis community distances showed that ecological responses could be captured better and appeared more robust when using RNA molecules.In New Zealand, we focused on the RNA signal to evidence foraminiferal responses along well-defined organic enrichment gradients and flow regimes (Pochon et al. 2015).We proposed that RNA sequence abundance profiles of selected foraminiferans can be used to predict their ecological preferences and therefore their value as bioindicators.
In the present study, we tested the accuracy of foraminiferal metabarcoding as an alternative to macrofaunal benthic monitoring in Norway.To achieve this objective, (1) we used molecular data to describe the communities of benthic Foraminifera living in the vicinity of salmon farms; (2) we analysed the changes of foraminiferal communities inferred from metabarcoding data in relation to environmental gradients (distance to cages); and (3) we evaluated the potential congruence between diversity metrics of foraminiferal metabarcoding data and benthic macrofaunal indices.

Sampling
The samples were collected in 4 fish farming sites situated in Norway, in the coastal regions 'West' in April 2014 (Bjørlykkestranda and Rundreimstranda) and 'Central' in June 2014 (Kornstad and Smøla, the later including sites Brettingen and Bremnessvaet; see Table S1 and Fig. S1 in the Supplement at www. int-res.com/articles/ suppl/ q008 p371 _ supp.pdf).Up to 10 stations located at increasing distance (0−3000 m) from the fish cages were sampled per site (Table S2 and Fig. S1 in the Supplement).At each station, 1 to 2 van Veen grabs of 1000 cm 2 (model 12.211, KC-Denmark) were deployed, and within each grab, we sub-sampled 2 to 3 replicates of 2 ml from the top 2 cm of the surface sediment.In total, 142 sediment samples were collected.Each sample was placed in a tube containing 5 ml of Life Guard Soil Preservation Solution (MoBio).The samples were collected using gloves and disposable spoons in order to avoid extraneous contamination.They were stored in a cooler at 4°C and then frozen at −20°C after returning to the laboratory.For each sediment sample, 2 measurements of redox potentials were taken with a probe (model IntelliCAL ORP-REDOX MTC 101, Hach), following the Norwegian Standard NS 9410:2007 protocol (measurement at 1 cm into the sediment layer).
At each station, additional surface sediment material (about 5 ml) was sub-sampled for morphological analyses of foraminiferal communities.The samples were fixed in 4% buffered formalin and transferred to the laboratory, where Rose Bengal stain was added following the recommendations of FOBIMO (Schönfeld et al. 2012).The sediment fraction retained by 100 µm mesh size sieves was searched for living (i.e. Rose Bengal stained) Foraminifera under a stereomicroscope.Each isolated specimen was identified following the reference literature for northern European Foraminifera (Höglund 1947, Cedhagen 2006).
The remaining sediments were sieved through a 1 mm mesh size sieve and fixed in 4% formalin for sorting and counting of macrofauna.The species identification, counting and calculation of macrofaunal indices was done by Havbrukstjenesten AS (West region) and Akvaplan-niva AS (Central region).

eDNA and eRNA extractions and cDNA synthesis
The frozen sediments were thawed on ice and centrifuged at 1170 × g (5 min) in order to discard the Life Guard Preservation solution supernatant.The total RNA and DNA contents of each sediment sample replicate were extracted using the PowerSoil Total RNA Isolation Kit and the DNA Elution Accessory Kit, respectively, according to the manufacturer's instructions (MoBio), and in RNase-free conditions.The quality and purity of crude RNA extracts were checked visually by gel electrophoresis (1.5% agarose) and analytically by spectrophotometry (NanoDrop 1000), respectively.One blank extraction control without sediment was incorporated for each extraction session (up to 11 samples per session).Blank controls were processed in parallel throughout the workflow until the PCR step in order to monitor extraneous or cross-contamination events.Carriedover DNA molecules were digested from RNA extracts by 2 consecutive DNase treatments, and the purified RNA molecules were reverse-transcribed into complementary DNA (cDNA) as explained by Langlet et al. (2013).Pristine aliquots of each sample's RNA, cDNA and DNA extracts were immediately frozen at −80°C in case of contamination during PCR and sequencing or for further research.

PCR amplification and high-throughput sequencing
The foraminiferal 37f hypervariable region of the 18S rDNA was enriched from each metagenomic extract using modified versions of the amplification primers s14F1 (forward: 5'-AAG GGC ACC ACA AGA ACG C-3') and s15 (reverse: 5'-CCA CCT ATC ACA YAA TCA TG-3').The primers' modifications consisted of 8-nucleotide-long tag sequences appended at their 5' end in order to multiplex the PCR products obtained from each sample into sequencing libraries, as described by Pawlowski et al. (2014a).Hence, each sample was PCR amplified using a unique combination of tagged primers, according to an optimized multiplexing design, as explained by Esling et al. (2015).However, each PCR mixture determined by a unique com-bination of tags was performed in duplicate, and the duplicate PCR products were later pooled in separate libraries in order to obtain technical PCR and sequencing replicates.Each PCR was performed in a total volume of 20 µl containing 1× AmpliTaq Gold Buffer, 2.5 mM of MgCl 2 , 1 U of AmpliTaq Gold DNA polymerase (Applied Biosystems), 0.2 mM of each dNTP, 0.2 µM of each tagged primer and ca. 10 ng of DNA (or cDNA) extract.After a pre-incubation at 94°C for 5 min and 30 cycles of 94°C for 20 s, 52°C for 20 s and 72°C for 20 s, the PCR products were incubated at 72°C for 2 min.A subset of PCR products was purified using the High Pure PCR Clean up Micro Kit (Roche) and quantified either by a fluorometric method (QuBit HS dsDNA kit, Invitrogen) or by using relative gel electrophoresis band intensities (ImageLab 4.0.1 on the Gel DocTM XR+ transilluminator, BioRad), as in Lejzerowicz et al. (2014).After quantification, the PCR products of each sample were pooled in equimolar quantities (ca.20 ng) per library, and each library pool was subjected to size-selection and purification, end repair, adapter ligation and library-indexing PCR with the Illumina PE adapters using the TruSeq Nano DNA LT Sample Prep kit following the manufacturer's instructions (Illumina).The libraries were sequenced on a MiSeq instrument for 2× 151 cycles (paired-end) using MiSeq Reagent Nano Kits (v2) pooled in separate runs.The raw sequencing reads were submitted to the Short Read Archive under accession number PRJNA314454.

NGS data analysis
We assembled, quality-filtered and de-multiplexed the raw sequence data using a computational pipe line specifically tailored for analysing diversity data generated by Illumina sequencing platforms (Paw lowski et al. 2014a).We filtered cross-contamination events that stem from the library-preparation artefact referred to as the mistagging phenomenon (i.e.switching of the tags labeling the amplicons) following the method described by Esling et al. (2015).After assembly of the paired-end sequencing reads into contiguous sequences, we only kept the unique copies of these sequences (strict dereplication).We removed the unique sequences with a single occurrence in a library sample (i.e. in a sequenced PCR replicate) and compared them for each pair of samples corresponding to PCR duplicates pooled in separate libraries.We only kept a unique sequence if it occurred in both technical duplicates.The number of reads underlying such a sequence corresponded to the average of its number of reads across the duplicates.We then assigned taxonomies to unique sequences based on the Needleman-Wunsch global alignments as explained by Pawlowski et al. (2014a).We used a manually curated reference sequence data base comprising 996 foraminiferal species entries.We defined the assigned taxon by taking the consensus among the taxonomic levels of the best matches (i.e. the taxonomy common to all matches).If a unique sequence shares less than 80% of similarity with every entry of the database, it is classified as an unknown species.We then grouped the unique sequences based on their taxonomies to create large pre-clusters of sequences that we divided into operational taxonomic units (OTUs), which can be considered equivalent to molecular species.We defined OTUs by performing a complete linkage clustering based on the pairwise Needleman-Wunsch distances computed between each pair of sequences from the pre-cluster, as explained by Lejzerowicz et al. (2014).
Because of the uneven distribution of sequence reads among the samples, we performed a normalization of the OTUs-to-samples dataset in order to allow further comparisons.We used a rarefaction approach similar to that described by de Cárcer et al. (2011).Briefly, we randomly subsampled the OTUs of each sample replicate 100 times (with replacement), and considered the median of the number of reads per sample as the OTU abundance.We then kept the average number of reads per OTU and normalized the samples so that each would be composed of 10 000 reads.We discarded the OTUs represented by fewer than 10 reads.

Benthic indices
We analysed the beta-diversity and computed several diversity indices based on the OTUs found in at least 1 DNA and 1 RNA sample simultaneously.We computed Bray-Curtis dissimilarity matrices for each pair of DNA and RNA samples based on presence/ absence in order to group samples using a complete linkage hierarchical clustering in MATLAB R2014b.For foraminiferal data, we computed the species richness (S; number of species), SN factor (SN = lnS / ln [lnN ]; N: number of individuals), Shannon diversity index (H') and Chao diversity index (Gotelli & Colwell 2011).For the metazoan morpho-taxonomic data, we used the species lists obtained from the sampling sites to compute the following diversity indices: Shannon diversity index (H'), NSI, NQI1, AMBI and indicator species index (ISI2012).The taxon-specific sensitivity values for NSI, ISI and AMBI were extracted from Rygg & Norling (2013).

Statistical analyses
To test the effect of distance from cages on the compositional variation of foraminiferal communities (based on Bray-Curtis distance matrix), we used permutational multivariate analysis of variance (PERMANOVA, Anderson 2001) using a nested model ('Distance from cage' nested in the 'Farm' factor).The PER-MANOVAs were performed with the adonis function of the R vegan package (Oksanen et al. 2015), using 999 permutations.The multivariate component of variation (percentage of the total variation) between farms, the variation along the distance from the cages and the variation between grab replicates was calculated from the mean squares of the PERMANOVA, using the method of moments (Searle et al. 1992).
We further tested the congruence between diversity metrics of foraminiferal metabarcoding data and benthic macrofaunal indices with the kappa2 function of the R irr package (Gamer et al. 2012), with a squared weight because the raters are ordered (from 'very poor' to 'very good' ecological status).Agreement between raters was classified from 'poor', i.e.Kappa value ranging from 0.01 to 0.2, and 'almost per fect', i.e.Kappa value ranging from 0.8 to 1 (Landis & Koch 1977).

NGS data statistics
In total, we obtained and analysed about 24 million DNA and RNA sequences of the foraminiferal hypervariable region 37f of the 18S rRNA gene for a total of 142 samples.After quality filtering (to remove various technical errors), we obtained 13 758 447 foraminiferal sequences (called reads).The sequences were analysed in 2 datasets, corresponding to the sampling regions (Table 1).
After filtering, the good reads were dereplicated and clustered into OTUs.In order to avoid the potential biases induced by the presence of extracellular DNA, we merged DNA and RNA data by retaining only those OTUs that were present at least once in both the DNA and RNA datasets (following Pawlowski et al. 2014a).The number of foraminiferal OTUs varied from 513 in Rundreimstranda to 1545 in Brettingen + Bremnessvaet (Smøla).About half of the OTUs were represented by <100 reads.The maximum number of reads found in a single OTU after normalization was 600 979 (assigned to Vellaria pellucidus).The 20 most abundant OTUs comprised more than 55% of the reads per site.
The OTUs were assigned based on our reference database, which comprises about 1000 rDNA sequences obtained from single Foraminifera, including many specimens collected in Norway, Scotland, the Faroes and Iceland.The sequences of almost all species previously isolated and individually sequenced in this area were found in the present NGS data.However, we also found many sequences that could not be assigned to our database.In total, 255 OTUs could be assigned to the specific or generic level.Some of these OTUs were assigned to the same reference data and were considered to be genetic variants of the same morphospecies.After combining them, 150 OTUs were retained for taxonomic analyses.
About one-third of the foraminiferal OTUs (47/150) were present at all sampling sites.Of these, 12 species were found among the 20 most abundant species present at 3 or more sites (Table 2).At each site, the most abundant OTUs also included some unassigned OTUs, but the majority were abundant at a single site only.In total, 37 out of 150 OTUs were present at a single site.The majority of these (88%) were found in Brettingen + Bremnessvaet, which are characterized by the highest species richness both in NGS and morphological analyses.

Taxonomic composition of metabarcoding data
The large majority of the assigned foraminiferal OTUs belong to Class Monothalamea (Fig. 1), char- The taxonomic composition of Foraminifera varied across the sampling sites.In particular, we observed an exceptionally high proportion of Rotaliida and Tex tulariida (Fig. 1) in the samples collected near (0−25 m) cages in Bjørlykkestranda (Stations 1, 2, 7).However, this trend was less pronounced in Rundreimstranda, where the relative abundance of monothalamid OTUs was higher and more equally distributed among stations.In terms of read abundance, the rotaliids were almost absent at Stns 9 and 10 of Rundreimstranda.These stations were characterized by higher abundances of miliolid sequences, which were only sporadically found in other stations.The taxonomic composition of foraminiferal assemblages was less variable and did not seem to be correlated to the distance from cages in Kornstad and Brettingen + Bremnessvaet.In terms of OTU richness, the relative abundance of different taxonomic groups was more homogeneous, with only a slight increase of rotaliid and textulariid OTUs at the stations located near the cages (Kornstad 1, 2; Bremnessvaet 1, Brettingen 6).In terms of read abun- dance, the variability between stations did not seem to be related to their positions.The comparison between 150 OTUs and 105 foraminiferal morphospecies isolated from the fixed samples showed a relatively good congruence between these 2 datasets (Table S3 in the Supplement).The highest similarity was observed for rotaliids, for which more than 50% of species were in common between the NGS and morphological data.Almost all OTUs assigned to the textulariids have been identified in morphological samples.However, the number of morphologically identified textulariids was almost twice the number found in the molecular data.Similarly, the number of morphologically identified miliolids and other tubothalamids ( 12) was much higher than the number of OTUs (3) assigned to this group.
Among the most common species present at all sites, we found 3 rotaliids (Stainforthia fusiformis, Bulimina marginata and Cibicidoides lobatulus), a textulariid (Reophax sp.) and 6 monothalamids (Bathysiphon argenteus, Micrometula hyalinostriata, Psammophaga sp., Nemogullmia longivariabilis, Cylindrogullmia alba and Tinogullmia sp.).All of these species are well known in the Norwegian coastal waters.A detailed list of species with corresponding numbers of reads is provided in Table S3.

Comparison of foraminiferal communities
We compared the foraminiferal communities by cluster analyses (Fig. 2) and by multivariate ANOVA (Table 3), across all samples (including technical replicates) and for each sampling site separately.
The variability of replicate samples taken from the same grab was low.Indeed, very distinctive clusters formed, which were composed of the technical replicates of the same grab and also of the different grabs of the same stations.Overall, this clustering appeared highly correlated to the distance from the cages.Indeed, in all cases, the stations close to cages formed one set of clusters, while distant samples formed another distinct set of clusters.In the case of Smøla, which comprised 2 distinct fish farms (Brettingen and Bremnessvaet), the stations close to the cages (1, 6) were in the same set of clusters (Fig. 2).
The same effect was also observed at other sites, albeit in a less perceptible manner.Indeed, in the case of both Rundreimstranda and Bjørlykkestranda, the group of samples closest to the cages (left part) was partly scrambled in terms of replicates, whereas as we moved away from the cages (middle to right), the samples gradually began to group neatly based both on their technical replicates and different grabs (Fig. 2).This observation suggests that the impact of the cages could even be stronger than the local patchiness that can be observed across different grabs of the same sampling station.
While the samples grouped relatively well in relation to the distance from the cages, this was not the case for redox values, which differed largely between and within sites.However, we noted very large variations of redox in samples from Rundreimstranda and Bjørlykkestranda, especially for samples collected close to the cages.The correspondence between the redox and distance was slightly better for Kornstad and Smøla, with generally higher redox values close to the cages (Fig. 2).
ANOVA confirmed the effect of distance on the compositional variation of foraminiferal communities (Table 3).Variance partitioning showed that differences between farms accounted for the largest variation in the data, with 75 and 45% for the West coastal region (Bjørlykkestranda and Rundreistranda) and the Central coastal region (Kornstad and Smøla), respectively.The distance from the cages, although significant, accounted for only 19 and 36% of the variance in the West and Central regions, respectively.The replicate samples accounted for the smallest variation in the data, with 6 and 19%, respectively.

Foraminiferal diversity vs. environmental gradients
We analysed the foraminiferal diversity indices S (richness), SN, H ' (Shannon) and Chao depending on the distance to cages, in order to investigate the diversity of foraminiferal communities along the environmental gradients.We calculated these indices for each sample categorized per class of distance to the cage.This allowed us to compute the mean, standard deviation and confidence interval of each set.The patterns of the Chao diversity index changes are illustrated in Fig. 3.The detailed analyses of all indices are available in Fig. S2 in the Supplement.
Overall, the variation in foraminiferal diversity tended to increase with increasing distance from cages (Fig. 3).Hence, the diversity was lower with increasing proximity to the cages.We typically observed around 4 times fewer species near the cages than at more distant stations, but the number of species declined by up to 10 times fewer species near cages in the Bjørlykke stranda site.It seems that the standard deviation and confidence interval of the diversity indices have a tendency to be extremely narrow near the cages and then to 'spread out' as the dis-tance increases.This suggests an effect of the farm on the diversity of foraminiferal communities.Moreover, Fig. 3 and Fig. S2 in the Supplement exhibit an increasing richness following different slopes, which might be explained by different types of conditions, such as the sediment type or the water flow.

Correlation between foraminiferal diversity and macro-invertebrate indices
We investigated the potential correlation between different foraminiferal diversity indices (S, SN, H ', Chao) and the 4 standard indices based on benthic macrofaunal analysis commonly used in Norway (H ', NSI, NQI1, NQI2).For a global overlook at this correlation, we plotted the values of a foraminiferal diversity metric (y-axis) compared to a macrofaunal index (x-axis) for all samples from a particular sampling site.The best correlation was obtained by using the Chao diversity index (Fig. 4).The other correlations are presented in Fig. S3 in the Supplement.
A positive correlation was evident between the foraminiferal Chao diversity index and macrofaunal indices.In all cases, the correlations were almost linear, and the relationships (R 2 ranging from 0.601 to  The distance to cages could not be given for some stations without cages (7.1 and 7.2) in Smøla.The branches belonging to sampling stations that were located ≤50 m from salmon cages are coloured in red 0.848) were highly significant (all pvalues < 0.0001).This suggests that the Chao diversity index provides information equivalent to the macrofaunal indices in their respective changes.Moreover, the deviation between values was typically quite narrow, and most of the points lay within this margin of error.It was always the same few samples that deviated from one index to another (Rundreimstranda Stns 1 and 2, and Bremnessvaet Stn 3), which might point out to technical problems in the corresponding samples.

Fig. 2 (continued)
To determine whether the foraminiferal diversity indices (S, SN, H ', Chao) could provide a scaling system equivalent to macrofaunal indices, we analysed the similarities of ecological quality statuses inferred from foraminiferal and macrofaunal indices.To do so, we first reordered the different samples based on increasing values of the selected macrofaunal index.This also allowed us to regroup the samples based on their ecological quality statuses.
As shown in Fig. 5, the foraminiferal diversity indices provide a view of the ecological quality status that closely resembles the assessment provided by macrofaunal indices.The Chao diversity index appeared to be the most accurate overall, but in some cases other indices also gave a good correlation (for example, SN vs NQI1 in the Central region dataset or H ' vs. NSI in both datasets).This indicates that refined versions or combinations of these indices could provide a more accurate view of the eco logical statuses.
The weighted Kappa analysis showed that the rating agreement of ecological status inferred from macrofaunal data and from foraminiferal molecular data were moderate to substantial with significant confidence (Table 4).The kappa values ranged from 0.412 to 0.753 for the Central region between the macrofaunal NGI1 index and the molecular Shannon index, and for the West region between the macrofaunal Shannon index and the molecular Chao index.

DISCUSSION
The diversity and composition of foraminiferal communities seems to be affected by the presence of fish farms.Both macrofaunal and foraminiferal communities were less diverse close to the farm cages, and there was a significant compositional change along the distance-to-cage gradient.Furthermore, replicates sampled near the cages, within the same van Veen grab, will likely be very similar, while replicates within the same grab collected at distant stations will deviate more from each other.Even though the distance-tocage is no substitute for the environmental gradients that drive community changes, our data show that there is a steady trend towards increasing diversity and diversity variation among sample replicates away from the cages, a pattern which is consistent across farms and with previous studies (Pawlowski et al. 2014a, Pochon et al. 2015).Taken together, our results suggest that the fish farms affect their immediate surroundings, creating environmental conditions that are particularly suitable for a set of species that could then be viewed as potential bioindicators.Our study confirms that the metabarcoding of meiofaunal-sized organisms, such as Foraminifera, provides an interesting alternative solution to benthic monitoring.The ecosystem status inferred from macrofaunal counts and molecular foraminiferal data are leading to similar classification, as shown by the Kappa analysis.This means that foraminiferal com-munities provide relevant infor mation for the biomonitoring of benthic ecosystems.The impact of fish farms on the foraminiferal community has al ready been studied using a meta barcoding approach in Scot land (Pawlowski et al. 2014a) and New Zealand (Pochon et al. 2015).The previous studies focused on the comparison of (1) molecular and morphological taxonomic composition, (2) DNA and RNA data and (3) the abundance profiles of particular species.The results of these studies evidenced the effects of environmental impact on beta-diversity of foraminiferal communities, as well as on OTU abundance, at least for the most sequenced OTUs that are more likely to be present in both the DNA and RNA extracts of each sample.The previous studies also highlighted the importance of replicates and stringent qualityand mistag-based filters (Esling et al. 2015), which is critical to reduce the number of potentially spurious rare OTUs that might bias richness estimates (Pawlowski et al. 2014a).
Compared to the previous studies, here we directly compared the biotic indices inferred from foramini feral molecular data and macrofaunal counts in order to evaluate the ecological quality status.Our results show that assessing environmental effects of fish farms based on different biological communities leads to similar conclusions.This is consistent with the results of a recent study, showing that it is possible to exploit the response of a given taxonomic group as a surrogate for the response of another taxonomic group (i.e.diatoms and macrofauna), provided that analyses are restricted to a small spatial scale and to a specific type of habitat (Bae et al. 2014).Other studies infer- ring biotic indices from meta barcoding data showed that similar values of ITI and AMBI are obtained for the impact assessment of salmon farms in Scotland when calculated from the macro-invertebrate morpho-taxonomic inventories and metazoan, mainly meiofaunal, metabarcoding data (Lejzerowicz et al. 2015).Similarly, comparable values of the benthic diatom index have been obtained for rivers and streams based on metabarcoding and microscopic data (Visco et al. 2015).All of these studies confirm the potential of metabarcoding to assess biodiversity and quality status of marine and freshwater ecosystems (Ji et al. 2013, Boh mann et al. 2014, Gibson et al. 2015, Guardiola et al. 2015).
The advantages of including metabarcoding in routine benthic monitoring are numerous.First, given that the process of eDNA analysis is much faster, the waiting time for the impact assessment could be reduced from 3−6 mo to a few weeks.Second, the metabarcoding workflow could be automated, and the costs of DNA sequencing are rapidly decreasing (Junemann et al. 2013).Third, biodiversity assessments will be more accurate because species identification is based on publicly available reference databases that may be more reliable than personal taxonomic knowledge.Additionally, the ecosystem evaluation will be enhanced by covering a wider range of taxa, including bacteria and protists, as well as the meiofauna, which is increasingly considered a reservoir of potential ecological indicators for various applications (Zeppilli et al. 2015).Finally, the metabarcoding approach not only increases the taxonomic coverage but also provides the ability to dis tinguish multiple OTUs that might belong to the same morphotype but have different ecological characteristics.
There are still several challenges that metabarcoding will need to address to become fully operational for biomonitoring purposes.Among them, the accuracy of NGS data analysis in terms of filtering, OTU clustering and taxonomic assignment needs to be improved.Another issue is that the reference database of DNA barcodes needs to be completed.Considerable efforts are being made to enrich reference databases with the sequencing of DNA barcodes from various marine species, especially for macro-invertebrates (Barcode of Life Database, Aylagas et al. 2014) and to a lesser extent for meiofauna and small-sized eukaryotes (Tang et al. 2012, del Campo et al. 2014).However, the task is daunting given the gigantic diversity of meiofaunal and microbial taxa as well as the paucity of morphologically distinctive characters and trained taxonomists for their description (Boero 2010).Further research is also needed to better understand the ecological meaning of metabarcoding data.In fact, the biggest challenge is to assign ecological values to sequenced species or OTUs.Most of the macrofaunal indices rely on characteristic species associated with ecological (e.g.AMBI, Borja et al. 2000) or trophic (ITI, Word 1979) groups, which have been defined empirically.Such ecological groups have also been defined for Foraminifera (e.g.Alve et al. 2016), but only for hard-shelled, fossilized taxa (e.g.rotaliids) and not for soft-shelled monothalamids that represent the majority of the sequenced foraminiferal diversity.In the case of meiofauna and other incon-  spicuous protist species with unknown autecology, being able to directly link OTUs with environmental gradients could be much more efficient than attempting to taxonomically and ecologically characterize the morphospecies corresponding to these OTUs.
To conclude, we think that ecologically meaningful information can be inferred from metabarcoding data by bypassing a taxonomical approach, and ultimately unveil what we call 'eco-barcodes', i.e. eDNA sequences that would be used as bio-indicators.We further think that an approach using both empirical knowledge about the ecology of species and community ecology theory might soon lead to the discovery of such eco-barcodes.This approach requires a large, multi-site dataset in order to identify these eco-barcodes and to design relevant biotic indices as exemplified for the development of the I 2 M 2 index, which is based on the best combination of the known indices (Mondy et al. 2012).Such an approach could complement or even replace traditional morphologybased monitoring in the future.

Fig. 2 .
Fig.2.Comparison of foraminiferal communities at aquaculture farms in Norway across samples based on their Bray-Curtis similarities.Sampling stations are indicated by coloured squares at each leaf and annotated with the corresponding sample label: the first number indicates the station and the second number indicates the van Veen grab (identical numbers correspond to technical replicates).The values of distance to cage (blue, logarithmic scale) and redox (green) are indicated under the dendrogram and reordered based on the leaf orders.The distance to cages could not be given for some stations without cages (7.1 and 7.2) in Smøla.The branches belonging to sampling stations that were located ≤50 m from salmon cages are coloured in red

Fig. 3 .
Fig. 3. Changes in foraminiferal diversity at aquaculture farms in Norway based on the distance to salmon cages.The values of the Chao index (y-axis) for all samples from a particular sampling site (grey dots) are plotted at their corresponding position (on a scaled x-axis).The green patch shows the confidence interval, and the dashed green line shows the mean.The boxplot at each distance position summarizes the corresponding mean (red line), standard deviation (red box) and confidence interval (blue box) of the set

Fig. 4 .
Fig. 4. Correlation between the Chao diversity index in Foraminifera and macrofaunal indices (Shannon diversity index [H '], Norwegian Sensitivity Index [NSI], Norwegian Quality Indices [NQI1, NQI2]).The values of molecular diversity (y-axis) compared to a macrofaunal index (x-axis) are plotted for all samples of a particular sampling site.The dotted green line represents the results of a model II regression with least squares fitting over all the points, while the surrounding green area shows the interquartile range of the standard deviations.The regression R 2 value and the p-value are indicated directly on each graph

Fig. 5 .
Fig. 5. Comparison of ecological quality status inferred from foraminiferal diversity indices (H ', Chao) and macrofaunal indices (H ', Norwegian Quality Index [NQI1]).The values of foraminiferal indices are indicated by magenta bars and macrofaunal indices by green bars.The samples are grouped according to their ecological quality statuses inferred from morphological data and indicated at the bottom of each histogram.The ecological quality inferred from molecular data is also indicated by the multicoloured bars on the right side, and the corresponding dotted lines indicate their boundary values.The results are presented separately for the (A) West region (sites Bjørlykkestranda + Rundreimstranda) and (B) Central region (sites Kornstad + Bremnessvaet + Brettingen).Codes on top of the graph indicate the site name abbrevations, the sampling station and the replicate number

Table 1 .
Number of foraminiferal reads (next-generation sequences) before and after quality filtering based on a total of 142 samples from 2 regions in Norway; 37f: foraminiferal hypervariable region 37f of the 18S rRNA gene acterized by a single-chambered test with an organic or agglutinated wall.The 3 other groups most commonly represented in our data are Orders Rotaliida, Textulariida and Miliolida.The first 2 orders are characterized by hard-shelled calcareous (rotaliids) or agglutinated (textulariids) multichambered tests.The miliolids also possess multichambered calcareous tests, but they differ from rotaliids by imperforate, porcelaneous wall.

Table 2 .
Most abundant species of Foraminifera based on the number of reads (next-generation sequences) at each of 4 sampled aquaculture sites in Norway.
Undet.: undetermined Fig.1.High-level taxonomic composition of foraminiferal communities at aquaculture farms in Norway based on the number of operational taxonomic units (OTUs).For each station, indicated as numbers, the replicates from each grab were combined (n = 2 replicates per grab for Bjørlykkestranda, Rundreimstranda, Kornstad, and n = 3 per grab for Smøla)

Table 3 .
Permutational multivariate analysis of variances of the compositional dissimilarities of foraminiferal communities at aquaculture farms in 2 regions in Norway.***Significant at p < 0.001