Siblingship tests connect two seemingly independent farmed Atlantic salmon escape events

Aquaculture escapees represent a threat to the genetic integrity of native populations, may spread infectious agents and display ecological interactions with wild fish. DNA-based identification methods are well established for tracing Atlantic salmon escapees back to their farms of origin. However, traditional genetic assignment approaches are not always able to single out the farm of origin in cases where several potential farm sources rear fish from the same genetic line, and display strongly overlapping allele frequencies. We investigated whether an alternative statistical approach, which involves ad hoc identification of sibling relationships, circumvents the challenge of overlapping allele frequencies. We analysed the following samples collected in 2013: (1) 221 farmed escapees captured in several rivers in the Ryfylke region of Norway, (2) 139 farmed escapees captured some 150 km away in an upstream fish migration trap in the River Etne, and (3) 779 farmed salmon sampled from 17 cages on 10 farms in Ryfylke. Siblingship tests increased the precision of identification of escapees back to their farm of origin over genetic assignment and population statistic approaches. Together with other non-genetic data, siblingship tests were also able to connect 2 seemingly independent escape events, demonstrating that some of the salmon escaping from 1 or 2 farms in Ryfylke took approximately 1 mo to migrate 150 km northwards before entering the River Etne. Finally, we demonstrated that the genetic background of the escapees captured in the River Etne during the course of an entire season was represented by the 3 major breeding programs in Norway.


INTRODUCTION
Atlantic salmon Salmo salar L. farming is typically based upon a freshwater-rearing phase involving egg hatching and smolt production (9 to 15 mo), followed by rearing in sea cages until market size (12 to 24 mo). One of the challenges to the further development of a sustainable aquaculture industry is escapes that occur in both the freshwater (Clifford et al. 1998a) and marine stages of production (Lund et al. 1991, Crozier 1993, 2000, Clifford et al. 1998b, Glover et al. 2008). Escapees represent a potential threat to wild populations in the form of ecological interactions (Jonsson & Jonsson 2006), disease transmission (Madhun et al. 2015), spread of parasites (Nylund et al. 1999, Naylor et al. 2005, Finstad & Bjørn 2011, Krkošek et al. 2013) and genetic introgression (Crozier 1993, Skaala et al. 2006, Bourret et al. 2011, Glover et al. 2012, 2013a). Therefore, the monitoring of escapees in time and space is of crucial importance.
In Norway, which is the world's largest producer of Atlantic salmon, farmers are legally obliged to report escapes of farmed fish to the Norwegian Directorate of Fisheries (NDF). Although tens to hundreds of thousands of escapees are reported to the NDF annually, underreporting represents a challenge and the true numbers of escapees have been estimated to be 2 to 4 times higher, putting these numbers in the millions in some years (Skilbrei et al. 2015a). Despite efforts to minimise and monitor escapes, large numbers of farmed salmon have been observed in native spawning populations in Norway for several decades (Fiske et al. 2006), to the extent that in some rivers, escaped farmed salmon have accounted for 50% of the total brood stock across different years (Saegrov et al. 1997, Fiske et al. 2006). Escapees have also been observed in rivers located in countries where salmon farming is not even practiced, thus demonstrating their potential for long-distance dispersal (Morris et al. 2008).
Farmed and wild salmon can be differentiated based upon morphology and scale characteristics (Lund & Hansen 1991). However, this information alone does not permit identification of the farms from which the escapees originated. This presents a challenge given that salmon can disperse over large distances after escaping (Hansen 2006, Skilbrei 2010, Skilbrei et al. 2015a). In order to address this issue, Glover et al. (2008) developed a method based upon genetic assignment statistics called the 'DNA stand-by-method', which permits the identification of the farm of origin for salmon escapees. This method has been under continuous development since its successful implementation in 2007 (Glover et al. 2009, 2011b, 2013a, Glover 2010, Zhang et al. 2013), and it is now commonly applied in cases of unreported escapes of farmed fish in Norway. However, it is sometimes not possible to conclusively differentiate between 2 or more potential farm sources that rear fish with a similar genetic background. This specific challenge results from the commercial production of Atlantic salmon in Norway being largely based upon fish arising from 3 commercial breeding programs; and while each program typically has several or more breeding strains, there is occasionally overlap in the genetic profile (allele frequencies) between salmon reared in neighbouring farms (e.g. Glover et al. 2011b), especially if smolts have been purchased from the same supplier (Glover 2010, his Fig. 2). Non-genetic supplementary methods, such as fatty acid (Grahl-Nielsen & Glover 2010) or pathogen profiling (Glover et al. 2013b, Madhun et al. 2015) have been successfully used as additional information to assist the identification of the specific farm of origin when the genetic background of several nearby farms is very similar, and also to monitor the escape history of the fish (Skilbrei et al. 2015b). However, thus far, other statistical approaches using the available genetic data to identify farms of origin for escapees, e.g. siblingship reconstruction methods (Wang & Santure 2009), have not been fully evaluated. Hence, it is possible that a more comprehensive analysis of the available genetic data may increase the resolution of identification even in challenging cases where several farms rear fish with varying degrees of overlapping allele frequencies.
In the summer of 2013, large numbers of presumed escaped farmed salmon were captured by local anglers in rivers located in the Ryfylke area, SW Norway (Fig. 1). While farm escapees are commonly observed in rivers in this region, some of the escapees entered the rivers early in the season (July), which is less common, many were silver in colouration (i.e. probably not sexually mature) and were in the 1-3 kg category. Based upon experience, these factors suggested that the fish may have escaped recently, and, potentially, from a single farm. However, none of the farming companies operating in the area that had salmon of similar size reported having lost fish in this period; therefore, the NDF decided to implement the DNA stand-by method to identify the source. This process involved taking samples of salmon from cages and farms in the region that could represent the potential origin of the escapees, in addition to the extensive sampling conducted by local fishermen in the rivers.
In the following months, a number of escaped farmed salmon that were very similar in appearance and size to the escapees recaptured in Ryfylke were observed some 150 km away, ascending an upstream migration trap in the River Etne, outer Hardangerfjord (Fig. 1). The question was whether these were 2 independent escape events, or whether the fish escaping from Ryfylke had also dispersed north to the Hardangerfjord, and thereafter entered the River Etne. The large number of samples gathered in connection with these 2 cases offered a unique opportunity to: (1) evaluate the potential of siblingship statistical methods to increase the precision of identification of escapees to farms and (2) investigate the potential connection between 2 geographically distinct observations of farm escapees.

Samples from the Ryfylke case
A total of 313 fish ranging mostly between 1 and 3 kg (∼50-65 cm long) and suspected of being of farm origin (based upon the anglers' preliminary observations) were captured in 7 rivers located in Ryfylke (SW Norway, Fig. 1) from 15 July to 23 November 2013. The NDF decided to try to identify the origin of these putative escapees using the 'DNA stand-by-method' (Glover et al. 2008, Glover 2010) and commissioned the Norwegian consultancy company Rådgivende Biologer AS to read the fish scales in order to verify these fish as escapees or wild. A total of 221 of the fish were identified as farm escapees based upon scale growth characteristics (Lund & Hansen 1991) and were retained for genetic analyses. Of these confirmed escapees, 75% were recaptured in the River Suldalslågen. To identify the putative source of these escapees, the NDF sampled salmon from the farms in the area that had fish overlapping in size with the escapees. These samples, which included fish from 17 cage samples on 10 farms, are referred to as the 'baseline samples' and consisted of ~47 individuals per cage sample (see Table 1). These farmed salmon originated from 3 different breeding strains: AquaGen (cage samples 2B, 6A, 8A, 9A, 9B and 9C), Mowi (cage samples 1A, 1B, 2A, 2C, 2D, 3A, 4A and 131A) and SalmoBreed (cage samples 5A, 7A and 8B). The names of the companies operating the investigated farms remain anonymous for legal reasons.

Samples from the Etne case
In a similar time frame to the Ryfylke case (14 June to 24 November 2013), 168 salmon, also thought to be escapees, were captured in the upstream migration trap (Resistance Border Weir) located in the River Etne (Fig. 1). After scale inspection, 139 of these fish were confirmed as farm escapees, and retained for genetic analyses. In spite of this river being a minimum of ~150 km by sea from the Ryfylke region, the same baseline samples were used to explore the possibility of these escapees having their origin in the aforementioned farms in the Ryfylke area. Post escape, they could have migrated northwards either west of Karmøy or through the narrow Karmøysundet, turned east into the Hardangerfjord and thereafter entered the River Etne.

Statistical analysis
Total number of alleles and allelic richness were calculated with Microsatellite Analyser (MSA) (Dieringer & Schlötterer 2003), whereas observed heterozygosity (Ho) was computed with GenAlEx (Peakall & Smouse 2006). The genotype distribution of each locus per year class and its direction (heterozygote deficit or excess) was compared with the expected Hardy-Weinberg distribution using the program GENEPOP 7 (Rousset 2008), as was the linkage disequilibrium. Effective population size (Ne) per sample together with the 95% confidence interval was obtained by the jackknife method using LDNE (Waples & Do 2008), implementing the following threshold values of lowest allele frequency: 0.05, 0.02 and 0.01. STRUCTURE v.2.3.4 (Pritchard et al. 2000) was used to identify genetic groups under a model assuming admixture and correlated allele frequencies without using population information. Ten runs with a burn-in period consisting of 100 000 replications and a run length of 1 000 000 Markov Chain Monte Carlo (MCMC) iterations were performed for a number of clusters ranging from K = 1 to K = 5. STRUCTURE Harvester (Earl & von Holdt 2012) was thereafter used to calculate the Evanno et al. (2005) ad hoc summary statistic ΔK, which is based on the rate of change of the 'estimated likelihood' between successive K values and allows determination of the uppermost hierarchical level of structure in the data. Runs were averaged with CLUMPP v.1.1.1 (Jakobsson & Rosenberg 2007) using the LargeK-Greedy algorithm and the G' pairwise matrix similarity statistic, and were graphically displayed using barplots.
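The ΔK statistic mentioned above can be illustrated with a minimal Python sketch. It assumes that ΔK at a given K is the absolute second-order difference of the mean ln P(D) across replicate runs, divided by the standard deviation of ln P(D) at that K. The function name `delta_k` and the input layout are hypothetical and chosen for illustration; this is not the STRUCTURE Harvester code, and no values from the present study are used.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} from replicate STRUCTURE log-likelihoods.

    lnp_by_k maps each K to a list of ln P(D) values, one per replicate run.
    Delta K is only defined for K values that have both neighbours (K-1, K+1).
    """
    mean_l = {k: mean(v) for k, v in lnp_by_k.items()}
    sd_l = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in mean_l and k + 1 in mean_l and sd_l[k] > 0:
            # Absolute second-order rate of change of the mean likelihood
            second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
            result[k] = second_diff / sd_l[k]
    return result
```

With runs for K = 1 to K = 5, as used here, this sketch yields ΔK values for K = 2 to K = 4 only, and the K with the largest ΔK is taken as the uppermost level of structure.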
The individual assignment of the escapees to their potential source farms was conducted with the program GeneClass 2 (Piry et al. 2004) using the Rannala & Mountain (1997) method of computation. A combination of direct genetic assignment and exclusion of every escaped fish from each farm source with significance thresholds of α = 0.05 and α = 0.001 was used to identify individual probabilities of fitting in with the genetic profile of each of the 17 cage samples in the baseline. Furthermore, the collective inference from STRUCTURE and GeneClass allowed escapees to be classified into 4 different categories: AquaGen, Mowi and SalmoBreed genetic strains and 'Unknown'. Fish placed into the 'Unknown' category originated from farms that were not sampled to establish the genetic baseline, and although such escapees probably belonged to 1 of the 3 aforementioned strains (due to the fact that these strains overwhelmingly dominate Norwegian production), they could not be unequivocally identified to strain as they did not necessarily exactly resemble the reference samples. This could be caused, for example, by these fish belonging to different year classes to those sampled here and thus displaying large genetic differences also within strains (Glover et al. 2011b). The frequency of individuals being classified into each of the aforementioned categories was assessed in relation to the time of sampling and in connection with their size. This specific analysis was limited to the escapees captured in the Suldalslågen and Etne rivers due to the fact that the other rivers sampled in Ryfylke did not produce high enough numbers to provide statistical power.
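The exact decision rule behind this 4-category classification is not spelled out in the text. The following Python sketch shows one possible reading, under the assumption that a fish excluded from every baseline cage sample at α = 0.001 is labelled 'Unknown' regardless of its STRUCTURE membership, and that any other fish takes the strain of its largest STRUCTURE cluster membership. The function name `classify_escapee`, the input dictionaries and all example values are hypothetical.

```python
def classify_escapee(cluster_membership, exclusion_p, alpha=0.001):
    """Assign an escapee to a strain or to 'Unknown' (illustrative rule only).

    cluster_membership: dict strain -> STRUCTURE inferred membership (0 to 1).
    exclusion_p: dict cage sample -> GeneClass probability of belonging.
    """
    # Assumption: exclusion from all cage samples overrides cluster membership
    if all(p < alpha for p in exclusion_p.values()):
        return "Unknown"
    # Otherwise take the strain with the highest inferred membership
    return max(cluster_membership, key=cluster_membership.get)


# Hypothetical fish with high SalmoBreed membership but excluded everywhere
fish_q = {"AquaGen": 0.010, "Mowi": 0.007, "SalmoBreed": 0.983}
fish_p = {"5A": 0.0004, "7A": 0.0008, "8B": 0.0001}
label = classify_escapee(fish_q, fish_p)
```

Under this assumed rule, a fish can carry a clear strain signal in STRUCTURE and still be placed in the 'Unknown' category because no sampled cage fits its genotype.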
The number of individuals hosted in a cage on a commercial farm can exceed tens of thousands; however, the number of families represented is often limited. Therefore, a complementary method to assess the likelihood of escaped individuals coming from a specific source is to analyse the sibling relationships between them. We used the software COLONY v.2.0.5.1 (Jones & Wang 2010), which implements full-pedigree likelihood methods to simultaneously infer siblingship and parentage among individuals using multilocus genotype data, to compute the number of full-sibling dyads between farmed fish and escapees. Analyses were run with no information on parental genotypes, assuming both male and female polygamy as well as possible inbreeding. The full-likelihood model was chosen with run length and precision set to medium. In the Ryfylke case, we investigated the possible relationships between the escaped individuals and all of the baseline farms.
Likewise, in the Etne case, we also aimed to determine whether some of the escapees could possibly have originated from the escape event in Ryfylke. Hence, the escapees captured in the upstream migration trap in the River Etne were tested against the Ryfylke farm samples as well as the escapees captured in Ryfylke using this siblingship analysis approach.
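To make the dyad-based summary concrete, the following Python sketch counts full-sibling dyads that link a cage sample to escapees, together with the unique escapees involved. It assumes that the inferred full-sibling pairs have already been read into a list of identifier pairs; the function name `summarise_dyads` and all fish identifiers are hypothetical, and no COLONY file parsing is attempted.

```python
def summarise_dyads(dyads, cage_ids, escapee_ids):
    """Count cage-escapee full-sibling dyads and the unique escapees in them.

    dyads: iterable of (id1, id2) pairs inferred to be full siblings.
    Returns (number of cage-escapee dyads, set of escapee identifiers).
    """
    cage_ids, escapee_ids = set(cage_ids), set(escapee_ids)
    n_dyads = 0
    unique_escapees = set()
    for a, b in dyads:
        # A dyad counts only if one member is a cage fish and the other an escapee
        if a in cage_ids and b in escapee_ids:
            n_dyads += 1
            unique_escapees.add(b)
        elif b in cage_ids and a in escapee_ids:
            n_dyads += 1
            unique_escapees.add(a)
    return n_dyads, unique_escapees


# Hypothetical example: one escapee is a full sibling of two cage fish
pairs = [("7A-01", "RF-001"), ("7A-02", "RF-001"), ("RF-002", "7A-01"),
         ("RF-001", "RF-002"), ("7A-01", "7A-02")]
n, fish = summarise_dyads(pairs, ["7A-01", "7A-02"], ["RF-001", "RF-002"])
```

Because a single escapee can be paired with several siblings in the same cage sample, the number of dyads can exceed the number of unique escapees.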

Fatty acid analysis and classification according to dietary history
The dietary history of a fish is mirrored in its fatty acid profile. Fatty acid profiling permits the classification of farm escapees into early or recent. The content of triacylglycerols in the adipose fin of the escapees was analysed for a subset of individuals: 36 from the River Suldalslågen and 114 from the River Etne. This analysis followed the methodology detailed by Olsen et al. (2013). In short, total lipids were extracted and depot lipids (triacylglycerols) were separated using slight modifications of the methods of Folch et al. (1957) and Olsen & Henderson (1989). Fish were then classified into either early escapees or newly escaped fish according to the content of the signature fatty acid 18:2(n-6). The level of this fatty acid is normally below 2.5% in sea-run salmon and in farmed salmon that have escaped early in life, and approximately 10 to 12% in fish feed and recently escaped farmed salmon (Jónsson et al. 1997, Olsen & Skilbrei 2010). According to the method described by Skilbrei et al. (2015b), the fish here were classified into 2 categories based on the level of the fatty acid 18:2(n-6): recently escaped (i.e. the same year as capture) farmed salmon with 18:2(n-6) > 7% and farmed salmon believed to have escaped at an early age (i.e. 1-2 yr prior to capture) with levels of 18:2(n-6) ≤ 7%.
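The 7% threshold rule can be written as a short Python sketch. The function name `classify_escape_history` and the category labels are hypothetical; only the threshold and the direction of the comparison are taken from the text above.

```python
def classify_escape_history(pct_18_2_n6, threshold=7.0):
    """Classify an escapee from its 18:2(n-6) content (% of fatty acids).

    Values above the threshold indicate a recent escape (same year as capture);
    values at or below it indicate an early escape (1-2 yr before capture).
    """
    return "recent" if pct_18_2_n6 > threshold else "early"
```

A fish exactly at 7% falls into the early category, following the ≤ 7% boundary given above.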

Summary statistics
The level of genetic variation, measured as total number of alleles and allelic richness, ranged from 106 and 6.1, respectively, in cage sample 5A to 154 and 8.8 in cage sample 8A (Table 1). The scope of variation per cage sample can be explained by the number of families placed in each, as well as whether they contain fish of single or mixed origin. In contrast, the observed level of genetic variation in the samples of the escapees captured in the rivers was much higher than in any of the cage samples of the farms (218 and 10.8 in Etne; and 208 and 9.0 in Ryfylke, respectively), which suggests that the escapees in the rivers originated from multiple sources. This idea is further corroborated by the high effective population size found in the sample of escapees in the River Etne (Table 1) in comparison with any of the cage samples.

Identification of escapees
Based upon the results from STRUCTURE, samples from the farms were predominantly placed into each of the 3 genetic clusters matching the 3 breeding lines in Norway (Fig. 2). However, not all the cage samples belonged exclusively to 1 cluster; e.g. cage sample 1A showed signs of clustering to both Mowi and SalmoBreed genetic groups (which could suggest genetic mixing). STRUCTURE analysis also demonstrated that the escapees captured in Ryfylke were mainly of SalmoBreed origin, while the escapees captured in the River Etne were more evenly distributed among the 3 breeding strains.
GeneClass directly assigned most of the Ryfylke escapees into 2 farms rearing the SalmoBreed strain: 7A (59.7%) and 5A (15.8%); whereas the remaining ones were assigned to the rest of the cage samples in very low proportions (Fig. 3a). In addition, 31 to 35% of the escapees could not be excluded from cage samples 5A and 8B, whereas 73% could not be excluded from cage sample 7A. In contrast, most of the escapees captured in Etne were assigned to 3 cage samples producing each of the breeding strains in even proportions (Fig. 3b): 1A, 7A and 8A (12.9 to 15.8%), followed by cage samples rearing Mowi: 3A and 2C (10.1 to 11.5%).
Using the collective inference provided by STRUCTURE and GeneClass, escapees were classified into 4 groups according to their origin (i.e. AquaGen, SalmoBreed, Mowi or 'Unknown'). The Ryfylke escapees (pie chart in Fig. 1) mostly belonged to the SalmoBreed strain (77%) in contrast to Mowi (10%) and AquaGen (4%). On the other hand, escapees captured in the River Etne (pie chart in Fig. 1) were more evenly distributed among the strains, ranging between 14% (AquaGen) and 30% (Mowi). When looking specifically at the escapees that were excluded from all cage samples at a significance threshold of α = 0.001, the proportion of individuals of 'Unknown' origin was 3.6-fold higher in Etne (36%) than in Ryfylke (10%). At this stage, the genetic analyses strongly pointed towards different profiles of escapees in each of the regions.

Time and space analysis of escapees
The proportion of escapees belonging to each of the 4 genetic categories was analysed separately at different time periods in both cases. In the River Suldalslågen in Ryfylke (Fig. 4a), SalmoBreed individuals dominated the whole temporal frame of the investigation (frequency ranging from 76 to 89% from September to November), even during the first period of sampling (July and August), although at a lower frequency (38%). Significantly, in the River Etne (Fig. 4b), SalmoBreed individuals were first recorded 1 mo later than in Ryfylke (i.e. in September) and kept discrete proportions for the remaining period of time (12 to 25%). In contrast, a constant and high presence of individuals of 'Unknown' origin was registered in the River Etne, in particular from June to September when they accounted for 53 to 60% of the total.
Fish-length analyses showed that the proportion of escapees belonging to the largest size category was some 4 times larger in Etne than in Suldalslågen (in numbers: 56 vs. 21, Fig. 4c,d). Interestingly, the escapees that were genetically identified as of SalmoBreed origin were of the same size in both rivers (< 50 to 65 cm) and these coinciding lengths, together with the delay of almost 1 mo that occurred between the first record of SalmoBreed escapees in Suldalslågen and Etne, suggested that these fish may have escaped from the same farms, and some of them migrated north, ultimately ascending the River Etne.

Siblingship reconstruction
COLONY analyses provided important further insights linking the escapees captured in Ryfylke with 2 of the cage samples to which the majority of the escapees had already been assigned using the genetic assignment methods described above (Table 2). A total of 32 dyads of full-sibling escapees were observed for cage sample 7A, accounting for 29 unique escapees. Of those, 28 were classified as SalmoBreed, whereas the remaining 1 (RF-199) was classified as 'Unknown' despite having shown inferred membership of the SalmoBreed STRUCTURE cluster of 0.983, as it was excluded from all cage samples at p < 0.001 by GeneClass. Cage sample 5A shared a slightly lower number of full siblings with the escapees: 22 pairs involving 11 unique escaped individuals, all of them assigned to the SalmoBreed strain. No full siblings were recorded between escapees and cage sample 8B. Similarly, no full siblings were registered in any pair-wise comparison between cage samples.
The sibling relationships between the escapees captured in Etne and SalmoBreed-rearing farms were also examined (Table 2). Cage sample 7A shared 4 dyads of full siblings with the escapees, accounting for 4 unique escaped individuals. Again, 3 of them were classified as SalmoBreed, whereas the remaining 1 (RF-341) was classified as 'Unknown', despite showing inferred membership of the SalmoBreed STRUCTURE cluster of 0.985, as it was excluded from all cage samples at p < 0.001 by GeneClass. Cage sample 5A and the Etne escapees shared 6 dyads, which accounted for 3 unique escaped individuals, all of them SalmoBreed. Again, no full siblings were recorded between the escapees captured in Etne and cage sample 8B.
Importantly, COLONY further supported the hypothesis that SalmoBreed escapees trapped in Etne originated from the Ryfylke escape, as 22 dyads accounting for 31 unique individuals were detected between both sets of escaped salmon (Table 2).

Lipid profiles
Mean content of fatty acid 18:2(n-6) was 1.38% (± 0.33 SD) in fish classified as early escapees and 11.07% (± 1.37) in recently escaped fish. In both episodes, almost all of the individuals identified to the SalmoBreed strain were categorised as recent escapees based upon their fatty acid profiles; i.e. 94% in Suldalslågen and 100% in Etne (Fig. 5a,b). In Ryfylke, very few individuals of 'Unknown' origin had a fatty acid profile, and all of them were categorised as having escaped recently; in the River Etne, however, this group was quite evenly distributed across time (63% early escapees vs. 37% recent ones).

DISCUSSION
The use of DNA analysis to identify the farm(s) of origin of escaped fish is a well-developed technique that is routinely used in the management and regulation of Atlantic salmon farming in Norway (Glover 2010). In addition, DNA methods have been successfully tested and implemented to identify escapees back to their farms of origin for a range of other aquaculture species, including rainbow trout Oncorhynchus mykiss (Glover 2008, Consuegra et al. 2011), European seabass Dicentrarchus labrax (Brown et al. 2015), Asian sea bass Lates calcarifer (Yue et al. 2012, Noble et al. 2014) and Atlantic cod Gadus morhua (Glover et al. 2011a). In the present study, we implemented siblingship tests for the first time, which led to an increase in the accuracy and confidence of assignment of the salmon escapees back to their farm(s) of origin. The identification of the relationships among siblings, either those that share 1 (half sibling) or both (full sibling) parents, has formerly been utilised to address a wide variety of questions in biology and ecology, such as elucidating fine-scale patterns of larval dispersal for a rocky reef fish on the open coast (Schunter et al. 2014), determining individual variability in reproductive success (Hudy et al. 2008, Liu & Ely 2009) and dispersal (Hudy et al. 2008), providing some insight into the mating systems by inferring genotypes of unknown parents (Wang 2004, Kanno et al. 2011) and tracing market product to the farm of origin in the event of detection of disease or toxins in the market fish (Hayes et al. 2005).
The singularity of this study resides in the fact that the suite of procedures conducted here allowed, for the first time, the connection of 2 seemingly independent escape events. The connection of both episodes was possible through the joint analysis of several pieces of evidence, including biometric data of the escapees, their genetic background and the sampling time. Importantly, however, the siblingship tests revealed not only the existence of dyads of full siblings between the escapees and cage farms, but also that some of the escapees reported in Ryfylke displayed full-sibling relationships with some of the escapees observed ~150 km north, in an upstream migration trap located in the River Etne. Based upon this evidence, we conclude that the fish identified as full siblings in these 2 regions originated from the same farm, despite being sampled in 2 locations separated by 150 km. These findings highlight the utility of siblingship tests, even when the baseline samples from farms used to assess the escape episode are out of the geographic scope, as is the case for the River Etne.
Non-genetic data were also available to support the connection between the 2 escape events. Firstly, the overlapping sizes of escaped SalmoBreed fish in both episodes (∼50 to 65 cm) suggest that they could have originated from the same farm and cage sample (Fig. 4c,d). Secondly, the 1 mo delay between the first record of SalmoBreed escapees in Ryfylke and in Etne (Fig. 4a,b) suggests that fish would have had enough time to cover the ~150 km distance that separates both sites. This is in agreement with field experiments in this region reporting that the majority of the escapees disperse from the escape area after ~1 mo (Skilbrei 2010, Skilbrei & Jørgensen 2010). Likewise, Chittenden et al. (2011) experimentally showed that fish dispersed rapidly (9.5 ± 19.2 km d⁻¹) in the days following escape, travelling outward to coastal waters along the edges of the fjord. In addition, the findings from the present study illustrate an even longer migration along the coast and into another fjord system, allowing the escaped farmed fish to spread into multiple rivers along the coast.
The genetic clustering analyses conducted in this study allowed the successful identification of the breeding strain of origin of 80.3% of all escapees. The number of escapees that were assigned to a strain was higher for the escapees captured in Ryfylke than those in the trap in Etne (90.5 vs. 64%, respectively). This is likely attributable to the closer geographical proximity between the rivers where escapees were captured and the farms sampled as baseline reference. Thus, in Ryfylke, SalmoBreed fish were overrepresented and accounted for 77% of the escapees captured in this region, whereas in the River Etne, Mowi was the dominating strain (30% of the escapees). When considering the fish that were statistically excluded from all cage samples at a threshold of α = 0.001, the proportion of individuals of 'Unknown' origin was 3.6-fold higher in Etne (36%) than in Ryfylke (10%). This would account for fish coming from a variety of sources that had been in the sea for longer before initiating the migration to fresh water, a suggestion supported by their lipid profiles (Fig. 5). The larger genetic variation (total number of alleles, allelic richness) observed among the escapees in relation to the neighbouring farms further supports this aforementioned mixed origin (Skaala et al. 2004), an issue that has already been invoked in former escape episodes (Zhang et al. 2013). The higher percentage of individuals of 'Unknown', and hence, mixed origin is particularly relevant in Etne and explains why this sample shows larger genetic variation and effective population size than any of the remaining ones.
The 'DNA stand-by-method' is implemented to identify the cage sample of origin of farmed salmon in order to provide scientific ground for regulatory and legal procedures (Glover 2010). As previously discussed, none of the farms sampled in this study represented the primary source of the escapees captured in the River Etne. However, cage sample 7A was pin-pointed as the primary source of the escapees captured in the Ryfylke area. In addition, cage sample 5A could not be completely excluded as a potential source of some of those escapees (Fig. 3a). The sibling-based analyses helped identify the source of these escapees in Ryfylke based upon the identification of some full- and half-sibling relationships between the escapees and the farms of origin. This represents an advance in the implementation of DNA methods to identify escapees, as fish originating from the same breeding strain will often be genetically similar (Glover 2010, his Fig. 2) and therefore difficult to differentiate using standard genetic assignment methods based upon allele frequencies alone. However, as demonstrated here, sibling-reconstruction methods will provide extra resolution. Furthermore, if sibling-based statistical methods are combined with increased numbers of genetic markers and larger sample sizes, it is possible that even the most challenging escape events, where there is large genetic overlap between farms that rear fish of similar genetic background, may be partially or fully resolved.

Fig. 1. Study area at the SW Norwegian coast, showing the rivers where the escapees were collected (red circles) and the farms selected for analysis. The size of the red circles is directly proportional to the number of escaped fish caught in the corresponding river. Upper right insert: upstream fish migration trap (Resistance Border Weir) operating in the River Etne. Pie charts: frequency and number of escapees assigned per breeding line in each of the escape events (Etne above and Ryfylke below). Breeding lines shown in the pie charts: AquaGen, Mowi, SalmoBreed and 'Unknown'. Lines: hypothetical migration routes of the escapees from Ryfylke towards the River Etne

Fig. 3. Assignment of escapees to each of the cage samples in the (a) Ryfylke and (b) Etne cases.(Black bars) Number of individuals directly assigned to each baseline cage sample, (dark grey bars) number of escapees not excluded at p < 0.05 and (light grey bars) at p < 0.001

Table 1. Summary statistics per sample: number of individuals (N); number of alleles; allelic richness (AR, based on a minimum sample of 39 diploid individuals); observed heterozygosity (Ho); number of deviations (dev.) from Hardy-Weinberg equilibrium (HWE) at α = 0.05; number of deviations from linkage disequilibrium (LD) at α = 0.05; and effective population size (Ne) with 95% confidence interval obtained by the jackknife method (in brackets) calculated using 3 different values of lowest allele frequency (0.05, 0.02 and 0.01)

Table 2. Best maximum likelihood assignment of full-sibling dyads obtained with COLONY v.2.0.5.1: number of full-sibling dyads between cage farms (5A, 7A, 8B) and escapees, and between both escape events; number of escapees in the full-sibling dyads; classification of the escapees belonging to those dyads according to the consensus information provided by the STRUCTURE and GeneClass methods; and number of escapees directly assigned to SalmoBreed cage samples by GeneClass b
Individual RF-341 was classified as 'Unknown' despite being directly assigned to cage sample 5A and showing inferred membership to the SalmoBreed STRUCTURE cluster of 0.985, as it was excluded from all cage samples at p < 0.001
c Five individuals were classified as 'Unknown', 1 of them (RF-331) despite being directly assigned to cage sample 5A and showing inferred membership to the SalmoBreed STRUCTURE cluster of 0.982, as it was excluded from all cage samples at p < 0.001