Genetic analysis of red lionfish Pterois volitans from Florida, USA, leads to alternative North Atlantic introduction scenarios

: The red lionfish Pterois volitans is a successful invasive predator across the western North Atlantic, Caribbean, and Gulf of Mexico. The southeast coast of Florida (USA) has been identified as the original introduction location, but genetic analyses including Florida lionfish have yet to investigate introduction scenarios. Here, we assessed the potential lionfish invasion pathways using 1795 sequences from previously published mitochondrial D-loop sequences (n = 1558) and new samples (n = 237) from 6 locations: The Bahamas, Florida Keys, northwest Florida, North Carolina, Panamá, and southeast Florida. None of the assessed Florida lionfish (n = 394) contained the H05−H09 D-loop haplotypes found in The Bahamas, North Carolina, and Bermuda (the Northern Region), indicating that Florida was not the source for these haplotypes. Assessing the mitochond - rial population structure, the Florida east coast lionfish grouped with the Caribbean/ Gulf of Mexico, as opposed to the Northern Region. To further explore connectivity and invasion pathways, 14 nuclear microsatellite loci were multiplexed on lionfish collected from 15 locations (n = 394). As found in other nuclear lionfish studies, the analyses identified a lack of population structure likely due to founding effects and/or inbreeding in aquaculture brood stocks. Together, the significant haplotype differences and H01−H04 haplotypes refute Florida as the sole source of red lionfish introduction. The results of this study support alternative invasion scenarios, in which Florida was colonized as a secondary introduction site or by individuals from the Northern Region. Understanding invasive species’ population boundaries and dispersal patterns informs local control efforts and management planning for future invasive species introductions.


INTRODUCTION
The red lionfish Pterois volitans, hereafter referred to as lionfish, is the first non-native marine fish species to successfully invade the North Atlantic Ocean basin (Schofield 2009). Lionfish are predatory reef fish native to the Indian and South Pacific Oceans. The rapid growth and colonization of the North Atlantic basin by the invasive lionfish population has negatively impacted native communities by reducing native reef fish abundance, species diversity, and net recruitment of nurseries (Barbour et al. 2010, Green et al. 2012, Albins & Hixon 2013, Pimiento et al. 2015, Wilcox et al. 2018. Of particular concern is the reduction of native keystone species, such as herbivorous fishes like the striped parrotfish Scarus iseri, that prevent macroalgae and seaweeds from overgrowing coral species (Albins & Hixon 2008). Reduction of native herbivorous fish species and the resultant algal overgrowth could lead to cascade effects and further decline of coral cover in coastal tropical ecosystems.
Invasive lionfish were first formally reported in the U.S. Geological Survey (USGS) Nonindigenous Aquatic Species (NAS) database in 1985 on the southeast coast of Florida (Schofield 2009), North Carolina in 2000 (Whitfield et al. 2002), and The Bahamas in 2004 (Schofield 2009, 2010, Ferreira et al. 2015. A subsequent report by the Florida Fish and Wildlife Conservation Commission documented unsubstantiated sightings on the Florida east coast from 1993 to 2002, and 3 specimens confirmed near St. Augustine, Florida, in 2002 (Westbrooks & Westbrooks 2011). These reports led to the inferred order of lionfish introductions and dispersal, with Florida being identified as the single source of the species introduction (Courtenay 1995, Ruiz-Carus et al. 2006, Betancur-R. et al. 2011, Bors et al. 2019b. The NAS database contains sightings-only records which capture the presence of a species, but not its absence (Schofield 2009, 2010, USGS-NAS 2021, and may not accurately reflect the true presence or temporal patterns of invasions due to observational biases and the ad hoc nature of public observations.
Defining non-native species invasion pathways, larval sources, and spatial structure is important for designing effective control strategies (Allendorf & Lundquist 2003, Travis & Park 2004, Gleeson et al. 2006. Adult lionfish are not thought to disperse long distances due to sedentary behavior and strong site fidelity within the reef system (Fishelson 1997). The release of free-floating egg masses, however, yields a broad-range dispersal mechanism using planktonic larvae (Morris & Akins 2009). Studies indicate that a female lionfish can release up to 2 million eggs a year with a 20−35 d period of planktonic larval development, which is enough time to travel long distances in ocean currents (Ahrenholz & Morris 2010, Vásquez-Yeomans et al. 2011, Kulbicki et al. 2012.
The use of genetic markers to study population connectivity provides essential information for identifying parental populations, hybridization, and evidence of gene flow, especially in species with larval dispersal, like the invasive lionfish (Guerrero & Franco 2008, González et al. 2009, Morris & Akins 2009, Schofield 2009, Dimitriou et al. 2019. To clarify introduction history and spatial structure, mitochondrial (mtDNA) D-loop sequences from more than 1500 lionfish across the majority of the invasive range have been published to date (Hamner et al. 2007, Fresh water et al. 2009, Betancur-R. et al. 2011, Toledo-Hernández 2014, Butterfield et al. 2015, Johnson et al. 2016, Bors et al. 2019b, Whitaker & Janosik 2020. However, direct assessment and genetic assignment of lionfish from Florida to either the Northern Region (Bermuda, North Carolina, and The Bahamas) or Southern Region (Gulf of Mexico and Greater Caribbean basin south and west of The Bahamas to Brazil) is needed to more comprehensively investigate invasive-range relationships and dispersal pathways (Freshwater et al. 2009, Betancur-R. et al. 2011, Bors et al. 2019b. Genetic studies have principally classified lionfish samples into 2 populations, the Northern Region or Southern Region, excepting the unassessed Florida population (Cowen et al. 2006, Betancur-R. et al. 2011, Butterfield et al. 2015, Johnson et al. 2016). To date, 5 of the primary haplotypes (H05−H09) have been recovered in the Northern Region but are nearly absent in the Southern Region (Freshwater et al. 2009, Butterfield et al. 2015. Additionally, lowfrequency Northern haplotypes are of high frequency in the Southern Region, and novel haplotypes are found in each region (or in unique localities within a region), supporting the hypothesis of multiple introductions or larval dispersal colonizing the Northern and Southern Regions separately (Butterfield et al. 2015). Nuclear data have failed to reveal population structure using standard population genetic methods (Pérez-Portela et al. 2018, Bors et al. 2019b, La bastida-Estrada et al. 2019, Whitaker & Janosik 2020). This genetic homogeneity over large areas is a common feature of tropical marine fishes and can reflect elevated dispersal capability and high levels of gene flow (Shulman & Bermingham 1995).
Larval dispersal models and some genetic studies have stated that the invasion pathway began in south-ern Florida, based on the NAS sightings-only data; however, as indicated previously, lionfish from Florida have not been genetically analyzed or incorporated into range-wide datasets (Briggs 1995, Roberts 1997, Paris et al. 2005, Cowen et al. 2006, Freshwater et al. 2009, Schofield 2009, 2010, Betancur-R. et al. 2011). Betancur-R. et al. (2011) assessed regional mtDNA haplo types (excepting Florida lionfish) and indicated that lionfish in The Bahamas, Bermuda, and North Carolina originated from a single-source introduction in Florida. Similarly, Freshwater et al. (2009) hypothesized that the invasion pathway for lionfish in North Carolina and The Bahamas started in Florida: 'The genetic homogeneity of the Bahamian and North Carolina populations suggests that they ultimately trace back to the same introduction…the initial source of lionfish in The Bahamas is dispersal from the U.S. east coast population across the Florida Straits.' (Freshwater et al. 2009(Freshwater et al. , p. 1219). Yet, it has been suggested that movement from Florida east to The Bahamas is likely suppressed because the Gulf Stream current acts as a barrier to east−west larval movement (Briggs 1995, Ro berts 1997, Paris et al. 2005, Cowen et al. 2006, Freshwater et al. 2009). Comprehensive genetic in vestigations are needed to further assess these patterns.
In this study, we provide the most comprehensive geographic assessment of the invasive range to date to elucidate the Northern and Southern Regional boundaries. We investigate introduction scenarios by assessing the geographic distribution of genetic variation across mtDNA D-loop haplotypes for both newly generated (N = 237) and previously published (N = 1558) lionfish samples. We include Florida east coast and Florida Keys (N = 178) mtDNA sequences from lionfish collected in 2010−2011 to further de mar cate the split between the regions and shed light on the invasion source(s) and pathway(s). We also created novel multiplexes for 14 previously developed nuclear microsatellite markers (Schultz et al. 2013) and assessed 394 lionfish samples collected throughout the invasive range between 2008 and 2016. This work will help to more accurately determine introduction source locations and dispersal patterns to aid lionfish control efforts and management of this and other nonnative marine species at the regional level.

Study area
Previously unsampled red lionfish Pterois volitans (N = 235) were analyzed for mtDNA from The Bahamas (n (single location) = 6), Florida Keys (n = 119), northwest Florida (n = 29), Panamá (n = 22), and southeast Florida (n = 59). Lionfish were collected using hand nets or spears while snorkeling as described by Fishelson (1997). Previously unpublished North Carolina sequences were also obtained from the collection held by D. W. Freshwater at the University of North Carolina at Wilmington (n = 2) for a total of 237 new sequences (Table 1). We also generated and analyzed microsatellite genotypes from tissue samples previously used for mtDNA analysis by Freshwater et al. (2009), Butterfield et al. (2015, Bors et al. (2019b), and Whitaker & Janosik (2020) ( Table 1). DNA extraction of all lionfish samples was conducted either using a cetyltrimethylammonium bromide (CTAB) isolation protocol or DNeasy extraction kits (Qiagen). Sequences archived in GenBank from the Northern Region, Caribbean, and the Gulf of Mexico were used in the analyses (N = 1558; Tables S1 & S2 in the Supplement at www.int-res . com/ articles/suppl/ m675 p133_supp.pdf). These analyses included a lion fish from North Carolina published in GenBank with the H46 haplotype by D. W. Freshwater (accession number FJ516454.1), that had yet to be included in a range-wide assessment.

Mitochondrial DNA analyses
PCR amplification of the mtDNA D-loop was performed using the LionA _H and LionB_L primers (Freshwater et al. 2009). PCR reactions followed the protocol described by Butterfield et al. (2015). We used ExoSap-IT ® (Affymetrix) for PCR cleanup and sequenced following the Applied Biosystems Big Dye v.3 kit protocol using an Applied Biosystems 3130xl genetic analyzer (Life Technologies). Se quences were aligned and edited in Geneious 5.4.7 (Biomatters), and species identification was confirmed using the NCBI standard nucleotide BLAST tool (Johnson et al. 2008) with default 'megablast' and 'nt' database options. In addition, sequence quality was checked using phred Quality Scores, and sequences with at least 20% of their total length missing or with phred score < 20 were removed from the study (N = 17) (Ewing & Green 1998. We generated a haplotype network using TCS v.1.21 (Clement et al. 2002) from 1795 mtDNA D-loop sequences using PopArt 1.7 (Leigh & Bryant 2015) to visualize the geographic distribution of haplotype diversity across the invasive range of the lionfish. Genetic diversity and population differentiation parameters were also calculated to determine haplo- Table 1. Invasive red lionfish sample sizes and genetic variation by location. Lionfish samples are assembled by the 22 locations, ordered geographically (north to south) with the cited study noted in superscript. For reference, sample sizes are shown for the nuclear microsatellite (nDNA) dataset (see Table S1 for details). type structure. Summary statistics including the num ber of polymorphic sites (S), nucleotide diversity (π), average number of nucleotide differences (k), Tajima's D standard neutrality test (D), number of haplotypes (H), and haplotype diversity (h) were calculated using DNASP v.5.0 for each location. Population differentiation and spatial patterns of genetic structure across all locations and among regions were studied using pairwise Φ ST , a population differentiation value analogous to F ST, implemented in Arlequin v.3.5.1.2 (Excoffier & Lischer 2010). We conducted several analyses of molecular variance (AMOVAs) in Arlequin v.3.5.1.2 (Excoffier & Lischer 2010) to determine the alignment of lionfish from southeast Florida and Jacksonville, Florida (herein, the Florida east coast) and the Florida Keys with the previously identified structure of the invasive range (Butterfield et al. 2015). Permutations were set to 10 000 and the Jukes & Cantor (1969) distance method was used to estimate the best fit model identified by Butterfield et al. (2015). The grouping scheme that maximized the percent variation found among the regions was considered to have the highest support and selected as the best fit to the data. Following Butterfield et al. (2015), 5 grouping schematics were tested (i−v; see  (Fig. 1). The Florida east coast and Florida Keys lionfish were each assessed within the Northern Region (scheme (ii)) or Southern Region (scheme (iii)), and the best supported results determined where the Florida east coast and Keys lionfish were then grouped in schemes (iv) and (v).
Additionally, we calculated over-water and Euclidean geographic distances to assess whether genetic distance (Φ ST ) or diversity (S, π, k, D, H, h) parameters showed any correlation to geographic distances between sampling locations. All geographic distances were measured in Google Earth Pro Desktop v.7.3.2.7699 (Table S3). First, 2 geographic distance matrices were created by measuring the shortest over-water pathway between sampling locations while trying to stay in shallow water systems. One geographic distance matrix hypothesized high connectivity among all locations, eliminating any geographic barriers. Another geographic distance matrix hypothesized 4 geographic breaks at (A)/(B)/(C)/(D), described above and (E) separating the Gulf of Mexico from the western Caribbean ( Fig. 1). Mantel tests of isolation by distance were then calculated in GenAlEx 6.501 (Peakall & Smouse 2006 and performed to identify any correlation between either the geographic matrix or the Φ ST genetic matrix (Peakall & Smouse 2006). Second, Euclidean distances from potential sources of North Carolina, The Bahamas, and southeast Florida were measured to each sampling location in this study. To test this, the Euclidean distances from each source location were compared with mtDNA genetic distances to test for correlation using linear regression analyses ('stats' package, R Core Team 2019; modified from Bors et al. 2019b).
Haplotype rarefaction curves were constructed to extrapolate each location's haplotype richness that may not be represented in this study. The curves were generated by 'iNEXT' utilizing the Chao2 estimate of sample-based haplotype richness under the assumption that the number of rare haplotypes is related to the number of haplotypes not sampled (Mestre et al. 2016, R Core Team 2019, Hsieh et al. 2020). All sampling locations were analyzed together with 200 knots based on the reference sample size. Haplotype rarefaction curves made with 'ggiNEXT' and 95% confidence intervals (CI) were displayed (Wickham 2016, Hsieh et al. 2020). Additionally, 'iNEXT' calculated observed haplotype richness (H R ) and asymptotic estimates of total haplotype richness if sample sizes ran to infinity with the bootstrap standard error (Chao2 ± SE). For reference, when observed haplotype richness is equal to estimates of extrapolated total haplotype richness, the SE < 1.0.

Microsatellite DNA collection and statistical analyses
To address fine-scale genetic diversity and population structure in the invasive population, we com-bined 17 population-specific microsatellites (Schultz et al. 2013) into 6 newly developed multiplexes to increase efficiency (Table S4). All PCR products were analyzed on an ABI 3130xl (Applied Biosystems). Due to poor amplification or chromatograms, 3 loci were dropped, leaving 14 successful loci. The fragment data were scored using Geneious 5.4.7.
Micro-Checker (Van Oosterhout et al. 2004) was used to identify loci with evidence of null alleles. Genecap (Wilberg & Dreher 2004) calculated the probability of identity (P (ID) ), which is the probability that 2 individuals drawn at random from a population will have the same genotype at the assessed loci (Paetkau & Strobeck 1994), and sibling probability of identity (P (ID)sib ), a related, more conservative statistic for calculating P (ID) among siblings (Evett & Weir 1998). The program additionally searched for duplicate genotypes. Hardy-Weinberg equilibrium (HWE; Jorde et al. 2007) exact tests and linkage disequilibrium expectations were tested using the randomization method of Raymond & Rousset (1995) for all pairs of loci within collections in Genepop 4.0 (dememorization: 10 000; batches: 100; iterations per batch: 5000). Sequential Bonferroni adjustments (Rice 1989) were used to determine significance for those tests.
The program Structure 2.3.4 (Pritchard et al. 2000) was used to identify the population structure among sampling locations. Structure, a model-based clustering algorithm, infers population structure by probabilistically assigning individuals, without a priori geo graphic or ancestral knowledge, to a specific number of clusters (presumably populations). In determining the number of clusters (K ), the algorithm attempts to minimize deviations from HWE (Jorde et al. 2007). Simulations were conducted using default parameters including the correlated allele frequency model and admixture model, which assumes that individuals could have some proportion of membership (q) to each of K clusters. Multiple Markov chains can delineate differences within populations; therefore, 10 iterations were analyzed for different so lutions of each K = 1−15 clustering schemes, with a run-length of 200 000 Markov chain Monte Carlo repetitions, following a burn-in period of 50 000 iterations. The most probable K was assessed in Structure Harvester using the mean log likelihood (LnP[K]) and by calculating ΔK, an ad hoc quantity related to the change in posterior probabilities be tween runs of K values (Evanno et al. 2005, Cristescu et al. 2010). Since uncertainty can be found in selecting K-values, StructureSelector was also assessed (Li & Liu 2018). To further identify cluster probabilities and assess individual assignment predictions, we con ducted a discriminant analysis of principal compo nents (DAPC) multivariate analysis along with the K-means clustering algorithm (Pritchard et al. 2000, Jombart et al. 2010). Individuals were assigned to clusters with q ≥ 0.90 membership. Genetic distances values (F ST ) were calculated in GenAlEx 6.501 to further evaluate significant structure for pairwise comparisons of sampling locations with alpha adjusted using Bonferroni correction (α = 0.004167) (Rice 1989).
Genetic diversity across all locations and globally across all loci was estimated by the average number of alleles (N A ), average effective number of alleles (E A ), and observed (H o ) and expected heterozygosity (H e ) for departures from HWE using GenAlEx 6.501 (Peakall & Smouse 2006). Kruskal-Wallis (KW) nonparametric tests were run to determine significant differences among diversity parameters across locations ('stats' package, R Core Team 2019). To accommodate varying sample sizes, we also performed rarefaction on private alleles using HP-Rare v.1.1 (Kalinowski 2005). We used Bottleneck 1.2.02 to evaluate heterozygote excess of populations under the sign test, 1-tailed Wilcoxon signed-rank test for mutation-drift equilibrium, and the allele frequency distribution test using the default values (Piry et al. 1999). GenAlEx 6.501 was used to calculate the inbreeding coefficient, F IS , which is close to 0 when the population is undergoing random mating. Pairwise relatedness values were calculated between all individual lionfish using Queller & Goodnight (1989) estimators in GenAlEx 6.501.The effective population size (N E ) was calculated in NeEstimator v1.3 for lionfish with known collection years (N = 347) to provide baselines for future sampling efforts (Peel et al. 2004). Although it is complex to accurately estimate N E in large populations with gene flow, the N E linkage disequilibrium method was reported with 95% CIs for the population as a whole.
The same 5 a priori groups, (i)−(v), used to investigate mtDNA structure in the invasive range, were also assessed using nuclear microsatellite DNA to further assess the extent of regional separation. AMOVA tests for the genotype groups were calculated in GenAlEx 6.501. The same program was used to conduct Mantel tests of isolation by distance to identify any correlation between either geographic distance matrix described above and the calculated nuclear F ST genetic distance matrix. Additionally, linear regressions were calculated between Euclidean geographic distances and microsatellite diversity parameters (i.e. H o , H e , F IS ) (R Core Team 2019). Spatiotemporal patterns were explored using a principal component analysis (PCA) on a limited subset of samples that were published with the inclusion of specific collection dates (as opposed to date ranges).
In the full dataset, H01−H09 haplotypes were found in The Bahamas, North Carolina, and/or Bermuda. Two unpublished lionfish samples from North Carolina were sequenced from the archive of D. W. Freshwater and matched to the Indonesian haplotypes, H15 and H22 (GenBank acc. nos. FJ516423.1 and FJ516430.1). The H01−H04 haplotypes were found at various frequencies in the Southern Region (which now included the Florida locality): Barbados; Belize; Bonaire; Cuba; Florida Keys; Grand Cayman; Honduras; Jacksonville, Florida; Jamaica; northwest Florida; Panamá; Puerto Rico; Saint Petersburg, Florida; San Andres Island; Santa Marta; southeast Florida; Texas; Trinidad and Tobago; Virgin Islands ( Fig. 1). Of note, Sealy (2013) reported that Barbados contained H01−H04 and one H05 and H07 sequence (Table 1).
A haplotype network was generated to visualize the haplotype diversity and associations and indicated few nucleotide differences between common haplotypes and the newly generated sequences (Fig. S1). Haplotype diversity was greatest for areas within the Northern Region (h average = 0.63, 0.71) and Honduras (h = 0.63) ( Table 1). Genetic distance Φ ST values revealed weak structure among many of the studied locations within the Caribbean but found significant differentiation for most of the pairwise comparisons between The Bahamas, North Carolina, and Bermuda (Table 2). Significant differentiation was also found when Puerto Rico was compared to St. Petersburg, Florida, and Texas, as well as between San Andres Island and St. Petersburg, Florida ( Table 2).
The AMOVA revealed significant structure among sampling locations for all 5 grouping schemes (Table 3). When Florida lionfish (Jacksonville, South east Florida, and Florida Keys) were grouped with the Southern Region in scheme (iii), the proportion of genetic variation among regions was maximized (7.11%). Grouping the east coast Florida lionfish (without the Florida Keys) with the Northern Region in scheme (i) resulted in a greater percentage of variation (4.83%; p < 0.001) than when the Florida Keys were included in the Northern Region (scheme ii) (2.74%). Florida grouping with the Southern Region (iii) was the highest supported grouping scheme. For completeness, the Southern Region was further split into 3 regions (iv) and 4 regions (v) to assess the structure when including Florida; however, the percent of variation decreased for each, although it was statistically significant (p < 0.001; Table 3).
To further assess source introductions or invasion pathways, genetic distances and diversity parameters were correlated with geographic distances, but no significant values were found (p ≥ 0.3). The Mantel tests of isolation by distance did not identify a cor-relation between pairwise Φ ST genetic distances and either geographic distance matrices of high connectivity with no geographic breaks (R 2 = 0.0002, p = 0.424) or low connectivity with (A)/(B)/(C)/(D)/(E) geographic breaks (R 2 = 0.0004, p = 0.412). The linear regression analyses for Euclidean distances from all putative invasive-range dispersal sources and the genetic diversity parameters (S, π, k, D, H, h) were also not statistically significant (p > 0.349). This analysis was modified from Bors et al. (2019b), who found similar results among Euclidean, ocean, and modified oceanic distances using single nucleotide polymorphism (SNP) data.
The haplotype rarefaction curves indicated that the observed haplotype richness (H R ) in most locations was approximately equal to the extrapolated total estimates (Chao2) (Fig. 2). However, H R values were lower than Chao2 values for locations in the Northern Region and parts of the Caribbean. Specifically, North Carolina had the greatest differential (H R = 11, Chao2 = 15.48 ± 7.17) due to the ratio of sample size to the number of rare haplotypes, followed by Barbados (H R = 6, Chao2 = 8.98 ± 4.39) (Fig. 2a). The Bahamas (H R = 8, Chao2 = 8.99), Bermuda (H R = 5, Chao2 = 5.98), Grand Cayman (H R = 4, Chao2 = 4.99), and Virgin Islands (H R = 3, Chao2 = 3.9) all had similar differentials between observed and extrapolated estimates (SE = 1.93−2.24) (Fig. 2a). Most extrapolated haplotype sampling curves for Florida locations reached asymptotes, suggesting that our sampling efforts were sufficient to characterize haplotype diversity (see Fig. 2b). Southeast Florida was the only Florida location where the confidence envelope indicated any variability in the potential haplotype richness (Fig. 2b). The 95% CI included 5 haplotypes, suggesting a possibility of the presence of 1 unsampled haplotype at that location.

Microsatellite DNA
A total of 394 lionfish were genotyped from 15 of the 22 locations assessed using mtDNA (Table 1). Of the 394 genotyped individuals, mtDNA sequences were generated for 359 (91% of genotyped lionfish and 20% of the 1795 total mtDNA sequences). Conversely, 35 lionfish were genotyped, but not analyzed for mtDNA. Locations in the microsatellite dataset included: The Bahamas; Belize; Bonaire; Cuba; Florida Keys; Jacksonville, Florida; Jamaica; North Carolina; northwest Florida; Panamá; Puerto Rico; southeast Florida; Saint Petersburg, Florida; Trinidad and Tobago; and the Virgin Islands. Of note, northwest  Florida (n = 6), the Virgin Islands (n = 5), and Tri nidad and Tobago (n = 1) had low sample sizes, which could influence results. The majority of lionfish were genotyped at all 14 loci (56.3%), with no genotype missing more than 3 loci. Only the PVM41 marker indicated evidence of null alleles due to homozygote excess, but there was no evidence of stuttering, large allele dropout, or linkage disequilibrium. Across the 394 lionfish, no pattern of Hardy-Weinberg disequilibrium was found for any location or microsatellite locus. After the sequential Bonferroni adjustments, linkage disequilibrium was found for 3 out of 91 (3.3%) comparisons, PVM27 and PVM37, PVM15 and PVM42, and PVM 21 and PVM42, likely due to inbreeding or cryptic subpopulation structure (i.e. Wahlund effect). Across all lionfish samples, the un -biased P (ID) estimate was 4.10 × 10 −10 and the P (ID)sib estimate was 8.43 × 10 −5 , indicating that unique indivi duals can be confidently identified across the region. We used Bayesian evaluation methods along with genetic distance values (F ST ) to determine population structure. The Bayesian Structure analysis LnP(K ) estimates indicated that the highest support was for the K = 1 cluster. The Structure Harvester ΔK analysis supported K = 2 clusters (note that the program is unable to evaluate a K = 1 hypothesis) as did Struc-tureSelector. The q-values for each genotype, however, were relatively equal for each cluster in K = 2 (as shown in Fig. 3). Therefore, as recommended in the software documentation, K = 1 was selected as the best supported hypothesis (Pritchard et al. 2000). Table 3. Investigation of invasive lionfish genetic structure and geographic breaks using an analysis of molecular variance. Three grouping schemes, (i), (ii), and (iii), tested the support of assignment for the Southeast Florida (SEF), Jacksonville (JAX), and Florida Keys (KEY) lionfish with the Northern or Southern Regions. Separation of all other lionfish samples into the Northern or Southern Region (Caribbean/Gulf of Mexico [GoM]) followed the groupings of Butterfield et al. (2015) and Johnson et al. (2016). After the Florida lionfish samples were statistically assigned to the Southern Region, the grouping schemes (iv) and (v) were used to assess the number of regions based on predetermined geographic breaks (A−E, as displayed in Fig. 1; Betancur-R. et al. 2011). The results are reported as the degrees of freedom (df), sum of squares (SS), the variance of each hierarchy, percent variation explained by each hierarchy of structure (% variation), and the p-value indicating significant structure (α = 0.05)  Table 1 We also conducted a DACP multivariate analysis using the K-means clustering method and found no biological or geographical meaning to any of the inferred populations. We were unable to calculate genetic distance values for northwest Florida, Virgin Islands, and Trinidad and Tobago due to low sample sizes. Pairwise distances were significant when North Carolina was compared to all other tested locations (p < 0.003; F ST = 0.01−0.10). Other significant distances were found between Bonaire and The Bahamas (p = 0.003; F ST = 0.01), and Bonaire and Jacksonville, Florida (p = 0.001; F ST = 0.04).
Overall, no differences were found among nuclear diversity parameters for each sampling location (p > 0.25). Across all loci, the average number of alleles was 3.77 ± 0.10 and H e = 0.55 ± 0.01 (Table S5). We performed rarefaction to assess differences in private allelic richness and found no significant unique diversity among the regions or locations. Assessing the 394 lionfish together for evidence of a bottleneck, the 2-phase model (TPM; p < 0.001) and infinite allele model (IAM; p < 0.001) of the sign test were significant. The Wilcoxon test was also significant for the TPM (p < 0.001) and IAM (p = 0.00003). However, a normal 'L' shaped allele distribution curve was obtained, indicating a larger proportion of alleles in the low-frequency allele classes. When separated by locality, negative F IS values were found for all locations except southeast Florida, The Bahamas, and North Carolina, although values were low (Table S6). The global F IS was −0.03 ± 0.01, which indicates minimal heterozygote excess (outbreeding) compared  Evanno et al. 2005) supported the K = 2 hypothesis, but cannot evaluate a K = 1 hypothesis. (Lower panel) The K = 2 bar plot is shown for reference, however, K = 2 was not selected following the recommendations of the software documentation and lack of biological and geographical inference. Location abbreviations as in Table 1 with HWE expectations. Additionally, standard error values globally across locations and all loci were ≤ 1.0, further indicating very close diversity metrics and absent structure (Table S6). Queller & Goodnight (1989) pairwise relatedness estimators showed no pattern within or among sampling locations and a negative global average across all genotype comparisons (mean = −0.003 ± 0.001 SE). The effective population size for all samples grouped as a single population was N E = 395.9; 95% CI = 305.5, 540.8. Individual location values are provided to allow for comparison with future sampling efforts, but these values must be interpreted carefully within the greater panmictic population (Table S7).
The addition of the east coast Florida lionfish revealed no evidence for a geographic barrier to geneflow, distinguishable structure or regions, or a source introduction among the locations studied. AMOVAs consistently revealed <1% of variation among any number of regions hypothesized, then 2% variation among locations, 8% among individuals, and 90% within individuals with all partitions statistically significant (p < 0.036). Mantel tests of isolation by distance revealed no significant correlation between F ST distance values and geographic distances of highly connected locations (R 2 = 0.038) or locations with (A)/ (B)/(C)/(D)/(E) geographic barriers (R 2 = 3.0 × 10 −5 ). Linear regressions found no significant differences (p > 0.27) between any of the nu clear diversity parameters (H o , H e , F IS ) and Euclidean distances from each location to the 3 suspected source locations (The Bahamas, North Carolina, and southeast Florida).

DISCUSSION
The results of our analysis of 1795 red lionfish Pterois volitans mtDNA sequences were consistent with the previous work showing genetic separation between the invasive Northern and Southern Regions (Φ ST of 0.071, p < 0.001; Butterfield et al. 2015). Florida was assigned to the Southern Region in the expanded analysis incorporating 394 Florida lionfish mtDNA haplotypes (H01−H04) collected between 2010 and 2016. Our data indicate that Florida is unlikely to be the single invasion source of lionfish in the North Atlantic Ocean. Significant Φ ST values separated southeast Florida from The Bahamas and North Carolina, and rarefaction curves for Florida stabilized at the 4 existing haplotypes, indicating that the additional Northern Region haplotypes are unlikely to be present. Previous regional studies have proposed that the lionfish invasion began on the Florida east coast (Freshwater et al. 2009, Betancur-R. et al. 2011, but these studies did not include lionfish collected in Florida. Based on our findings, we propose alternative scenarios that better explain the haplotype diversity seen across the western North Atlantic, in which primary or secondary introductions of lionfish occurred in The Bahamas, North Carolina, or across multiple locations, including Florida. Schofield (2009Schofield ( , 2010 documented the chronology of the lionfish invasion based on the USGS NAS database (USGS-NAS 2021) sightings reports beginning in Florida and moving throughout the western North Atlantic Ocean, Caribbean Sea, and Gulf of Mexico. While sightings reports are useful to document the presence of a species, they are less successful in documenting its absence, potentially resulting in pseudoabsence error (O'Malia et al. 2018). The sightings can also be opportunistic in nature or biased towards organized surveillance efforts. It is possible that founding populations of lionfish were present but undetected, or unreported, in certain areas and are therefore absent in the NAS reports (Schofield 2009). Undetected populations combined with multiple introductions likely created a non-linear sequence of lionfish introduction and spread, which this study attempts to help clarify using genetic data.

Investigation of invasion locations and pathways
Genetic analysis of lionfish samples collected between 2010 and 2016 from Florida, including Jacksonville, southeast Florida, and the Florida Keys, resulted in separation with the Northern Region and support for alternative introduction scenarios outside of Florida for the lionfish population. Florida has a low probability of being the source of the H05−H09 haplotypes since they were not found in the 394 Florida lionfish sequenced to date. If the H05−H09 haplotypes were present in Florida, the frequency is so low that we did not detect them in our sample set. Haplotype rarefaction curves showed limited evidence that haplotypes are missing from the Florida locations, although the presence of a single unsampled haplotype at the southeast Florida location was possible. The lack of lionfish with the H05−H09 haplotypes in Florida is unlikely to be due to local extinction or poor habitat suitability because those haplotypes are found in lionfish from both North Carolina and The Bahamas, which have habitat similarities with the Florida coastlines (Whitfield et al. 2002). Of note, temporal and spatial patterns were explored through a PCA for lionfish samples in our dataset containing published dates and locations; however, no patterns were evident in the available data.
The colonization process for scenario (1) includes lionfish with haplotypes H01−H07 and H09 invading North Carolina, then traveling south and/or east across and/or against the Gulf Stream current to Florida and/ or The Bahamas. The North Carolina population was presumably first colonized (through releases and then short-or long-distance transport) on the warmer offshore shelf, where lionfish can survive perennially (Whitfield et al. 2002). Here, we report 2 new haplotypes (H15 and H22) in North Carolina lionfish samples that had only previously been found in the Indonesian native range (Freshwater et al. 2009). We also statistically analyzed haplotype H46 from North Carolina for the first time after its publication in GenBank. These rare haplotypes, including 2 from the native range, may suggest independent releases/ colonization of lionfish in North Carolina. Additionally, the other lionfish species, Pterois miles, has only been confirmed to date in North Carolina, adding support to the region being a localized source of introductions, or colonization through passive larval transport in currents (Hamner et al. 2007, Betancur-R. et al. 2011. A colonization in North Carolina could have been followed by fish with the highest frequency haplotypes (H01−H02) or the best adapted genotypes dispersing through directed movements to colonize the US southeast coast, The Bahamas, and the Caribbean. The northward Gulf Stream current, however, is considered a strong barrier against passive larval transport between the Florida Keys and The Bahamas (Roberts 1997, Cowen et al. 2006, Freshwater et al. 2009), and presumably the east coast of Florida. Still, some rare dispersal events (e.g. hurricanes) could lead to higher than predicted connectivity (Johnston & Purkis 2015). In this scenario (1), H08 would have been present in North Carolina, but undetected by genetic studies, and fish containing the H05−H09 haplotypes would have had to secondarily colonize The Bahamas.
Alternatively, in scenario (2), the presence of H01− H08 lionfish in the Northern Region occurred after a single source introduction in The Bahamas, where larval recruits or free-swimming fish were transported north in the Gulf Stream currents from The Bahamas to colonize the deep outer reefs of North Carolina. From there, fish could disperse south along the US coast to Florida and the Caribbean. From the Bahamas, they could have dispersed west and/or south. Bermuda, in the Northern Region, may have been colonized by lionfish originating from The Bahamas (Freshwater et al. 2009) or potentially North Carolina. A single fish with the H08 haplotype has been found in The Bahamas, but H08 is absent in North Carolina, po tentially supporting its introduction there. However, the haplotype rarefaction curves for North Carolina indicate the presence of unsampled haplotypes de spite high sample sizes, creating un certainty regarding the source of the H08 haplotype. Further, a single, or private, H09 haplotype has been collected in North Carolina, suggesting it likely originated or dispersed there and not in a single colonization of The Bahamas.
The private haplotypes in The Bahamas and North Carolina, and the disparate Florida and Northern Region haplotype frequencies support scenario (3) H04 haplotypes being found in Florida and across the entire range. Lionfish from Jacksonville, Florida only had 2 of the 4 haplotypes in Florida, so it is possible that southeast Florida was a source of introduction that dispersed along the east coast of Florida. The characteristics of many invasive populations include the low diversity (and high relatedness) in the founding population, coupled with rapid reproductive rates and population growth. These properties likely reduce the ability to track changes in diversity during lionfish population expansion and colonization (Freshwater et al. 2009). Theoretically, if one of the locations assessed here was the center of the range expansion, we might expect to see genetic diversity decrease with increasing geographic distance away from that location (isolation by distance). However, we did not find evidence of this in mtDNA or microsatellite data, beyond the higher genetic diversity found in the Northern Region as compared to the Southern Region.
In the current study, we did not have microsatellite data from the native range or aquarium trade source population(s) which would allow us to assess and compare changes in the nuclear microsatellite diversity post-expansion. We did not detect a bottleneck, possibly due to known limitations with the statistical tests after severe reductions in population size followed by rapid growth (Hundertmark & Van Daele 2010). The low estimate of N E in comparison to the census size (N C ) is also reflective of the low diversity in the population, although it should be noted that large populations with low to moderate levels of gene flow are challenging to derive accurate estimates for N E . Applying the average N E :N C ratio of 0.1 (Hoban et al. 2020) would result in a calculation of ~4000 fish in the Atlantic lionfish population; however, the population is estimated to be larger than that, with < 200 individuals per acre in some locations (FWC 2021).

Spatial and temporal patterns of invasive lionfish genetic diversity
In previous studies, genetic analyses of lionfish samples from the US eastern seaboard and The Bahamas were characterized by reduced genetic diversity, likely due to a strong founder effect (Hamner et al. 2007, Freshwater et al. 2009). Our analyses using an expanded sample set, including the Florida Keys and the southeast coast of Florida, supported previous reports of low diversity. Other studies that assessed nuclear data using SNPs also showed a lack of lionfish population structure (Pérez-Portela et al. 2018, Bors et al. 2019b). Low nuclear and mitochondrial diversity in invasive species may be related to an inbred founding population in aquaculture (Selwyn et al. 2017), where genetic diversity is often limited (Hawley et al. 2006, Hunter & Nico 2015, Michaelides et al. 2018. Multiple introductions have also been indicated in genetic studies for Pterois miles in the Mediterranean Sea (Dimitriou et al. 2019) and in numerous other marine and aquatic invasive species. Multiple introductions can improve the likelihood of establishment with increased genetic and morphological diversity and the chance for releases occurring in favorable environments (Blackburn et al. 2015). Species of note with multiple invasion histories include the European green crab Carcinus maenas, Asian black carp Mylopharyngodon piceus, Asian caridean shrimp Palaemon macrodactylus, and saltmarsh cordgrass Sporobolus alterniflorus (syn. Spartina alterniflora) (Roman 2006, Hunter & Nico 2015, Bors et al. 2019a, Dimitriou et al. 2019, Xia et al. 2020. Invasion success and pathways often depend on numerous variables such as habitat types and dispersal vectors (human-mediated, life history, oceanography). Increa sed genetic diversity during population founding often improves invasion success and adaptability to novel environments (Roman 2006, Wellband et al. 2017, Card et al. 2018. One potential contribution to the invasion success of lionfish is the evidence of considerable hybridization found in the Pterois genus in its native range (Wilcox et al. 2018). Further, hybridization between P. volitans and P. miles has been found using mtDNA and nuclear markers in 1.8% of invasive lionfish samples, along with evidence for pseudogenes (Whitaker & Janosik 2020).
The lack of genetic structure among lionfish sampled from the Caribbean and Gulf of Mexico supports a single panmictic population across the Southern Region of the invasive range (Pérez-Portela et al. 2018, Bors et al. 2019b, Labastida-Estrada et al. 2019, Whitaker & Janosik 2020. However low frequency haplotypes have been identified in Panamá (H03) and Barbados (H05, H07) in the Southern Region in 3 lionfish (Sealy 2013, Butterfield et al. 2015. These locally unique haplotypes may indicate recent releases from the pool of haplotypes available in the aquarium trade and could lead to increased diversity and improved adaptation potential for the invasive populations (Wellband et al. 2017, Dimitriou et al. 2019). Measures to prevent additional releases could help to prevent gene exchange, admixture, and heterosis (hybrid vigor) (Xia et al. 2020).

Management implications and future efforts
Our genetic assessment of invasive North Atlantic lionfish collected between 2007 and 2016 identified evidence of 2 populations and multiple introductions. The N E and genetic diversity can be used as a baseline to monitor changes and to inform removal and prevention efforts. Haplotype frequency patterns can help to identify source populations within the current invasive range or expansion into new areas of South or Central America (Díaz-Ferguson & Hunter 2019). Lionfish have proven to be highly adaptable to the invaded environments, and visual or environmental DNA (eDNA) monitoring in Uruguay, the Panama Canal and the coastal Pacific will be important to detect population expansion (Moyer & Diaz-Ferguson 2013, Grieve et al. 2016, Díaz-Ferguson & Hunter 2019, Norton & Norton 2021. The presented lionfish introduction and invasion pathways may also help to inform management strategies for future marine invaders (Wellband et al. 2017). Managing broadly dispersing invasive and commercial marine species among country seaboards requires collaborative and unilateral efforts.
Our dataset primarily included lionfish from shallow-water habitats, excepting numerous lionfish collected from the deep outer reefs of North Carolina, limiting our ability to investigate the effects of water depth. Moving forward, the collection of additional lionfish from deeper waters is needed to investigate the effects of diversity and connectivity of those habitats. Improved reporting of specific collection dates and locations could help to further elucidate spatiotemporal haplotype frequency changes (Hundertmark & Van Daele 2010, Bors et al. 2019b.
Local control efforts have been shown to reduce lionfish on adjacent reefs in some situations (Peiffer et al. 2017, Harms-Tuohy et al. 2018. It is believed that re-colonization is most likely due to recruitment or ontogenetic migration, as opposed to lateral adult migration (Harms-Tuohy et al. 2018). To target source populations and help increase efficiency of removal efforts, knowledge of genetic and larval connectivity could be applied, especially in high-priority conservation areas like marine protected areas, buffer zones, and native fish and lionfish nursery habitats (Harms-Tuohy et al. 2018). Moreover, highly sensitive eDNA methods could focus search and removal efforts to bring lionfish densities below the threshold necessary to allow for native fish communities to recover (Benkwitt 2015, Díaz-Ferguson & Hunter 2019.