Using sonobuoys and visual surveys to characterize North Atlantic right whale (Eubalaena glacialis) calling behavior in the Gulf of St. Lawrence

ABSTRACT: The appropriate use and interpretation of passive acoustic data for monitoring the Critically Endangered North Atlantic right whale Eubalaena glacialis (hereafter right whale) rely on knowledge of their calling behavior and how it varies with respect to time, space, demographics, and observed behavior. To assess such relationships in a habitat of increased management importance, sonobuoys (disposable drifting hydrophones) were deployed in the Gulf of St. Lawrence, Canada, to record sounds from aggregating right whales during visual aerial surveys in the summers (June through August) of 2017 (n = 8), 2018 (n = 13), and 2019 (n = 16). Upcalls, gunshots, and various mid-frequency (250–800 Hz) tonal calls were compared to demographics and observed behaviors of concurrently observed right whales using correlation matrices, linear regressions, and generalized linear models. Our results show that (1) call rates increased from June to August for all call types; (2) calling rates were associated negatively with observed foraging behavior and positively with observed socializing behavior; (3) upcalls were occasionally produced at higher rates (>20 calls h⁻¹) when in association with gunshots and tonal calls; (4) acoustic monitoring did not always detect right whale presence at fine timescales (2–6 h), but presence estimates were improved when multiple call types were considered; and (5) calling rates were too variable to provide reliable density estimates of observed right whales. These results have important implications for the interpretation of passive acoustic monitoring in this habitat and provide evidence that some whale behaviors (e.g. socializing) may be reliably inferred from acoustics alone.


INTRODUCTION
Since the cessation of historic and large-scale commercial whaling, the recovery of the Critically Endangered North Atlantic right whale Eubalaena glacialis (hereafter right whale) has been limited by its distribution in urbanized areas along the east coast of the USA and Canada, where it is susceptible to impacts from anthropogenic activities (Cooke 2020). As of 2020, estimates suggest this species consists of fewer than 350 individuals and is in decline (Pettis et al. 2022). The primary sources of mortality and sublethal trauma that impede the health and growth of this species are vessel strikes and fishing gear entanglements (e.g. Knowlton & Kraus 2001, Corkeron et al. 2018, Sharp et al. 2019, Moore et al. 2021, Stewart et al. 2021), with additional stressors from noise pollution (Rolland et al. 2012) and climate-driven shifts in prey (Record et al. 2019, Gutbrod-Meyer et al. 2021). The species' recovery has been further compromised by a recent unusual mortality event in which 32 individuals were found dead between 2017 and 2020, of which 16 were discovered in the Gulf of St. Lawrence (GSL), Canada (NOAA 2022a). The persistent right whale occupancy in the GSL since at least 2015 (Simard et al. 2019, Crowe et al. 2021, Johnson et al. 2021), frequent observations of both socializing and foraging behavior (Crowe et al. 2021 and this paper), and a dramatic increase in observed mortalities suggest that the GSL is an important high-risk habitat.
Observations of right whales in the GSL precipitated numerous research and risk mitigation efforts, the majority of which rely on near real-time knowledge of the spatial and temporal distribution of right whales (Davies & Brilliant 2019). One established approach to monitoring right whale distribution is to conduct visual surveys (e.g. Baumgartner et al. 2003, Nichols et al. 2008, Cole et al. 2013). Dedicated right whale visual surveys allow for rapid detection of individuals to inform dynamic risk mitigation strategies; such strategies include implementing vessel speed restrictions and fishing closures in areas determined to be high risk (DFO [Fisheries and Oceans Canada] 2021, Transport Canada 2021, NOAA 2022b). These surveys also enable the collection of additional data, such as photographs or biological samples, that contribute to large, well-established databases used to derive essential conservation information such as abundance, distribution, demographics, behavior, and health (e.g. Hamilton et al. 1998, Brown et al. 2001, Pettis et al. 2004, Schick et al. 2013, Pace et al. 2017). Photographs are especially valuable, as right whales can be identified by unique callosity (cornified skin colonized by cyamids) patterns on their heads, as well as scars and other unique markings on their bodies (Kraus et al. 1986). Information including sex, age, behavior, sightings history, and health for each photographed individual can be determined through the North Atlantic Right Whale Catalog, which maintains these records for all known individuals in the species (Brown et al. 1994, Hamilton et al. 1998, Pettis et al. 2004, Frasier et al. 2007).
Visual surveys are subject to several limitations. Cost, platform endurance, day length, and weather conditions prevent continuous survey coverage. Environmental conditions and additional factors, such as survey platform (e.g. vessel, aircraft) and observer experience, affect the probability of detection (detection bias; e.g. Baumgartner & Mussoline 2011, Ganley et al. 2019). These surveys also require whales to be at or near the surface to be observed (availability bias), where they typically spend a small proportion of their time (Baumgartner & Mate 2003, Ganley et al. 2019), though their surfacing behavior is variable and not well characterized (Matthews et al. 2001, Baumgartner & Mate 2003, Parks et al. 2011). Because of these limitations, continuous visual monitoring of right whales in space and time is difficult.
Passive acoustic monitoring (PAM) provides an alternative method for surveying right whales and is achieved by listening for the many sounds right whales produce (hereafter referred to as calls unless otherwise specified). Though unable to collect many ancillary data streams accessible by visual surveys, PAM surveys offer several advantages including persistent, cost-effective data collection. Numerous studies have demonstrated that PAM can provide a reliable indication of right whale presence (e.g. Clark et al. 2010, Durette-Morin et al. 2019) and have used this information to make inferences about right whale distribution over large temporal and spatial scales (e.g. Davis et al. 2017, Durette-Morin 2021). The development of near real-time PAM systems (e.g. Baumgartner et al. 2013, 2019, 2020, Gervaise et al. 2021) has facilitated the use of acoustic detections to inform dynamic risk mitigation measures in Canada and the USA (DFO 2021, Transport Canada 2021, NOAA 2022b). Though PAM is typically used to determine whale presence, modified distance sampling methods have been developed and applied to PAM to estimate whale abundance (Marques et al. 2013). These methods have not been directly applied to North Atlantic right whales but have been used to estimate the density of several other baleen whale species, such as North Pacific right whales E. japonica (Marques et al. 2011) and fin whales Balaenoptera physalus in the Pacific Ocean (Harris et al. 2018).
The effectiveness of passive acoustics as a monitoring tool is limited by a variety of factors that influence call detection and availability. These include system characteristics (e.g. depth, system noise, duty cycle), ambient noise levels, sound propagation, and analysis procedure (e.g. Johnson et al. 2022). Though accounting for these factors is often not trivial, they can be readily measured and their impacts corrected for (e.g. Helble et al. 2013). A more pervasive challenge in PAM is characterizing and accounting for variation in call availability. This is important to consider, as changes in calling rates can introduce substantial bias and/or uncertainty into estimates of acoustic presence, distribution, and/or abundance (Marques et al. 2013).
Right whale call availability is influenced by calling behavior, which is challenging to measure, as it varies depending on environmental and biological contexts. Initial studies reported associations between the type of call produced and the observed behavioral state. For instance, the upcall, which is one of the best-characterized calls, has been identified as a contact call that all ages and sexes produce year-round, throughout their geographical range (Parks & Clark 2007). Additionally, observations from the Bay of Fundy, Canada, suggest gunshot calls are likely produced by males when reproductively active (Parks et al. 2005), and mid-frequency tonal calls are associated with focal females in surface active groups (SAGs) (Parks & Tyack 2005, Parks et al. 2007). Results from acoustic tagging experiments show that individual calling rates are highly variable; call rates ranged from 0 to 20 calls h⁻¹, with more than half of the tagged animals (28 of 46) not producing any calls (Parks et al. 2011). Parks et al. (2011) also found that calling rates were dependent on behavioral context, where they were highest during surface activity and travelling and lowest during foraging or logging. In addition, call characteristics appear to convey individual and age-specific information (McCordic et al. 2016), change as an individual ages (Root-Gutteridge et al. 2018), and vary in response to environmental noise (Parks et al. 2009, 2011). Calling behavior also appears to vary based on habitat. For example, right whale mother–calf pairs exhibit a shift in call type and rate as they move from the calving grounds (e.g. Florida and Georgia coasts) to foraging and socializing grounds (e.g. Cape Cod Bay, Gulf of Maine, Bay of Fundy, GSL; Cusano et al. 2019, Parks et al. 2019). Right whale calling therefore varies with behavioral state, demographics, ambient noise, and habitat.
In this study, we collected the first concurrent fine-scale visual and acoustic observations of right whales in the southern GSL to characterize variability in calling behavior with respect to the number of whales present, sex, day of year, and observed behavior. These results have implications for the appropriate use and interpretation of PAM in this habitat and are especially important to evaluate given that near real-time acoustic detections are currently being used to inform dynamic management measures.

Data collection
Right whale visual surveys in the southwestern region of the GSL were conducted in 2017, 2018, and 2019 by NOAA in collaboration with DFO aboard a De Havilland DHC-6-300 Twin Otter aircraft. These surveys occurred in June, July, and August during daylight hours on days suitable for flying and observing (less than ~15 knots of wind, ceilings above ~400 m [1500 ft], etc.; Fig. 1). The surveys did not follow a typical systematic line transect approach but instead were directed to search areas with the intention to photograph as many individual right whales as possible (see Crowe et al. 2021 for more details on these mark–recapture aerial surveys). The aircraft circled an aggregation long enough to collect sufficient photos for photo-identification, and as such, the number of photos per individual varied among deployments. The surveys were conducted at a nominal speed of 185 km h⁻¹ (100 knots) and an altitude of 305 m (1000 ft). An observer on each side of the aircraft scanned for whales through bubble windows positioned perpendicular to the front of the aircraft, and a photographer was positioned at the window on the rear left side of the aircraft. When right whales were spotted, the plane circled over the whales to collect images of individuals for photo-identification. The images were taken using a Canon digital single-lens reflex camera with a fixed 300 mm lens. Visual survey protocols are described in detail in Cole et al. (2013) and Crowe et al. (2021).
When 3 or more right whales were sighted and the aircraft could remain in the area for an hour or more, a sonobuoy (model AN/SSQ-53F DIFAR) was deployed from an altitude of 244 m (800 ft) approximately 0.5 to 1 km away from the whales to collect passive acoustic data. These requirements were put in place to reduce the subjectivity of deploying a sonobuoy. A sonobuoy is a drifting disposable hydrophone system consisting of a hydrophone, cable, and surface float containing a radio that transmits directional acoustic data to a nearby receiver (i.e. on an aircraft or vessel) and has been widely used for marine mammal monitoring (e.g. McDonald & Moore 2002, Laurinolli et al. 2003, Crance et al. 2019). The sonobuoys were programmed to deploy the hydrophone to approximately 27.4 m (90 ft) depth, record in DIFAR mode, and transmit on a pre-determined radio channel for a maximum of 8 h. The transmitted data were received onboard the aircraft using a WR-G39WSBe sonobuoy receiver and were digitized using a Fireface 400 sound card. The received signal for each sonobuoy was sampled at 48 kHz and saved as 5 min wav files using Raven Pro software (Center for Conservation Bioacoustics 2019).

Visual data
The visual sightings for 2017 and 2018 were provided by the North Atlantic Right Whale Consortium (NARWC 2020), while the visual data for 2019 were provided directly by the NOAA Northeast Fisheries Science Center (NEFSC). The aerial photographs of right whales were reviewed and compared to the North Atlantic Right Whale Catalog, and the subsequent identification, age class, and sex data were provided by the NARWC Photo-Identification database. Photo-identification data processing methods are described in detail by Hamilton et al. (2007). Age classes were defined as follows: adults were whales that had given birth or were of known age and older than 8 yr or of unknown age with sightings histories spanning at least 8 yr, juveniles were known-age whales between 1 and 8 yr of age that had not given birth, and calves were whales born that year (Knowlton et al. 1994, Hamilton et al. 1998).
The sightings data associated with each sonobuoy deployment were restricted to the estimated temporal (recording duration ±1 h) and spatial (30 km radius) scales of acoustic monitoring capabilities. This time constraint was chosen to include whales that may have been calling during the deployment but were photographed before the sonobuoy recording began or after it finished. A 30 km radius was chosen based on maximum observed acoustic detection distances measured in the Bay of Fundy (Laurinolli et al. 2003) and modeled maximum detection distances in the southwestern region of the GSL (Simard et al. 2019), the same general area as our study. Whale movement is unlikely to increase this range, as observations of right whales in the southern GSL suggest that 75% of daily distances between sightings of the same individual were less than 10 km (Crowe et al. 2021), and movement simulations estimate that within a 6 h time frame (which is longer than the longest deployment duration in our study), both socializing and feeding whales would likely remain within approximately 30 km of the original point of the acoustic detection (Johnson et al. 2020). Alternative space and time range combinations (20 and 60 km with 0.5 and 12 h) were considered and had little influence on the results. The number of whales sighted within this defined time and space is referred to as whale count in the analyses. Other species (e.g. bowhead whale Balaena mysticetus, minke whale Balaenoptera acutorostrata, dolphins) were occasionally observed with right whales, but their presence and potential interactions were not considered in this study.
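The space and time restriction described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the record fields (`whale_id`, `lat`, `lon`, `time`), the function names, and the use of a fixed deployment position (the buoys drift) are all assumptions made for the example.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sightings_near_deployment(sightings, buoy_lat, buoy_lon, rec_start, rec_end,
                              radius_km=30.0, buffer_h=1.0):
    """Keep sightings within radius_km of the buoy and within the
    recording period extended by buffer_h hours on each side."""
    t0 = rec_start - timedelta(hours=buffer_h)
    t1 = rec_end + timedelta(hours=buffer_h)
    return [s for s in sightings
            if t0 <= s["time"] <= t1
            and haversine_km(buoy_lat, buoy_lon, s["lat"], s["lon"]) <= radius_km]


def whale_count(sightings):
    """Number of distinct individuals among the retained sightings."""
    return len({s["whale_id"] for s in sightings})
```

Changing `radius_km` and `buffer_h` corresponds to the alternative space and time combinations mentioned in the text.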
Photographically documented whale behaviors were classified into 1 of 2 categories: foraging or socializing. Foraging behaviors included subsurface feeding, mouth closing, and skim feeding, while socializing behaviors included SAGs, rolling, lobtailing, and breaching (Zani & Hamilton 2017; see Table S1 in Supplement 1 for full list of behaviors in these categories; all supplements available at www.int-res.com/articles/suppl/n049p159_supp.pdf). Of note, 5 whales entangled in fishing gear were encountered, 2 of which exhibited behaviors consistent with our definition of socializing behavior (i.e. tail slashing, lobtailing, and rolling). These whales and associated behaviors were included in the analysis even though we were not able to determine if the behaviors were social in nature or a result of the entanglement (see Table S4 in Supplement 2 for statistical comparison). Behavior rates were calculated as the sum of observed behaviors in a particular category (foraging or socializing) divided by the whale count for each deployment. Rates, rather than counts, were used to correct for occasions where individuals exhibited multiple behaviors during a single sonobuoy deployment. Additionally, the male:female ratio, defined as the total count of observed males of all ages divided by the total count of observed females of all ages, was calculated for each deployment. The calculation of the male:female ratio omits 3 individuals of unknown sex that were observed in 7 deployments: 1 individual was observed in 4 deployments, a second individual in 2 deployments, and a third individual in 1 deployment.
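As a worked illustration of the two per-deployment metrics defined above, the following minimal sketch computes a behavior rate and a male:female ratio. The function names, the sex codes ("M", "F", "U"), and the example numbers are hypothetical; returning `None` when no females were observed is a choice made for this sketch, not something stated in the text.

```python
def behavior_rate(n_behaviors, whale_count):
    """Observed behaviors in one category divided by the whale count."""
    return n_behaviors / whale_count


def male_female_ratio(sexes):
    """Males of all ages divided by females of all ages.

    Individuals of unknown sex ("U") are omitted, as in the text.
    Returns None when no females were observed (sketch-level choice).
    """
    males = sum(1 for s in sexes if s == "M")
    females = sum(1 for s in sexes if s == "F")
    if females == 0:
        return None
    return males / females


# Hypothetical deployment: 22 whales, 11 socializing behaviors recorded.
social_rate = behavior_rate(11, 22)                        # 0.5
ratio = male_female_ratio(["M", "M", "M", "F", "F", "U"])  # 1.5
```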

Acoustic data
Acoustic recordings were displayed as spectrograms and were manually reviewed both visually and aurally by an experienced analyst using Raven Pro 1.6 and 2.0 software (Center for Conservation Bioacoustics 2019). The spectrogram parameters (8192-sample discrete Fourier transform, with 50% overlap using a Hann window) resulted in a time resolution of approximately 0.1 s and a frequency resolution of 5.9 Hz. Since the sonobuoys were recorded in DIFAR mode, approximately 3 kHz of omnidirectional acoustic data were transmitted, where a bandwidth of 50 to 1500 Hz was available for spectrogram review. Higher frequencies are allocated for directional data, which were not used in this analysis. Right whale calls were identified and categorized as either upcalls, gunshots, or tonals. Upcalls were 0.4 to 2 s long frequency-modulated upsweeps between approximately 50 and 400 Hz (e.g. Vanderlaan et al. 2003, Parks & Tyack 2005). Gunshots were distinguished as short-duration (~1 s) broadband sounds (Parks et al. 2005). Tonals consisted of variable-frequency modulated sounds with fundamental frequencies ranging from approximately 250 to 800 Hz and often contained harmonics. These were similar to the 'moan' classification by Matthews et al. (2001), a combination of the 'mid-frequency' and 'high-frequency' categorizations by Parks et al. (2011), and the 'tonal low', 'modulated', and 'hybrid' categories by Trygonis et al. (2013). Calls from other species (e.g. blue whale Balaenoptera musculus, minke whale) were also observed visually and aurally; however, only calls that could be confidently and unambiguously assigned to one of the defined right whale call categories were considered in this study.
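The stated spectrogram resolutions follow from the sampling rate and transform settings. The short sketch below reproduces that arithmetic, assuming that "time resolution" refers to the step between successive spectrogram frames (the hop size); the function name is illustrative.

```python
def spectrogram_resolution(sample_rate_hz, n_fft, overlap):
    """Return (frequency resolution in Hz, time step in s) for a spectrogram.

    Frequency resolution is the DFT bin spacing; the time step is the hop
    between successive frames, i.e. the non-overlapping part of the window.
    """
    freq_res_hz = sample_rate_hz / n_fft
    hop_samples = n_fft * (1 - overlap)
    time_step_s = hop_samples / sample_rate_hz
    return freq_res_hz, time_step_s


# Settings reported in the text: 48 kHz sampling, 8192-sample DFT, 50% overlap.
freq_res_hz, time_step_s = spectrogram_resolution(48_000, 8192, 0.5)
print(round(freq_res_hz, 1), round(time_step_s, 1))
```

With these settings the bin spacing is 48000 / 8192 Hz and the hop is 4096 samples, which round to the values given in the text.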
Deployment duration, photo-documentation duration, call count, call rate, and call production rate for each deployment were calculated. Deployment duration was defined as the total duration of audio that had the presence of prominent DIFAR signals. Photo-documentation duration was the difference in hours from when the first whale was photographed to when the last whale was photographed within a sonobuoy deployment. Call count was the total number of calls within the deployment duration, call rate (units of calls h⁻¹) was the call count divided by the deployment duration, and call production rate (units of calls h⁻¹ whale⁻¹) was the call rate divided by the whale count associated with a given deployment. The call count and rate metrics were calculated for each call type and deployment.
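The two rate definitions above reduce to simple divisions. The sketch below shows them with hypothetical numbers; the function names are illustrative and not taken from the authors' code.

```python
def call_rate(call_count, deployment_duration_h):
    """Calls per hour: call count divided by deployment duration."""
    return call_count / deployment_duration_h


def call_production_rate(call_count, deployment_duration_h, whale_count):
    """Calls per hour per whale: call rate divided by whale count."""
    return call_rate(call_count, deployment_duration_h) / whale_count


# Hypothetical deployment: 30 upcalls in 2 h of audio with 5 whales sighted.
rate = call_rate(30, 2.0)                     # 15.0 calls per hour
per_whale = call_production_rate(30, 2.0, 5)  # 3.0 calls per hour per whale
```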

Statistical analysis
An additional variable, day of year, defined as an integer assigned to each day of the year (1–365), was used to assess temporal patterns. There was little evidence that whale count, age-class, sex-class, behaviors, or call rates varied significantly among years (Table S2 in Supplement 2), so the data from all years were combined in all analyses. Non-parametric Spearman's correlation coefficients were used to quantify correlations between all available variables (Fig. S2 in Supplement 3) and select an appropriate subset (whale count, male:female ratio, foraging and socializing behavior rates, and call rates or call counts, where appropriate) for use in subsequent regression analyses to prevent overfitting (Fig. 2). All statistical tests were evaluated to a significance level (α) of 0.05. Explicit consideration of environmental variables and their influence on right whale calling was beyond the scope of this study.
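For readers unfamiliar with Spearman's coefficient, it is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration of that definition, not the corrplot-based workflow used in the study; the function names are assumptions.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any monotonic relationship (linear or not) yields a coefficient of magnitude 1, which suits skewed variables such as call rates.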

Characterizing call rates
Variation in calling rates was assessed using negative binomial generalized linear models (GLMs) with a log link and an offset term using the following form:

$$\log(\text{call}) = \beta_0 + \beta_i x_i + \text{offset} \qquad (1)$$

where the dependent variable, call, is the upcall, gunshot, or tonal call count; $\beta_0$ is the intercept coefficient; $\beta_i$ is the coefficient of the independent variable (or slope); $x_i$ is the independent variable; and the offset is the log of the deployment duration. The independent variables were day of year, male:female ratio, whale count, foraging rate, and social rate. A negative binomial model was chosen to represent these data because the call rates were zero inflated and exhibited overdispersion (see Supplement 3 for an evaluation of the assumptions associated with negative binomial GLMs). To account for variability in recording effort, an offset was used to convert call counts into call rates. A maximum of 100 iterations was sufficient to fit each negative binomial model. The intercept and independent variable coefficients were produced for each regression. A negative binomial likelihood ratio test was conducted to determine if the null model (containing only the intercept and offset terms) differed from the single-variable model (Eq. 1) for all models. All the independent variables were then combined to create the following full model:

$$\log(\text{call}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \text{offset} \qquad (2)$$

where call and offset refer to the same call-dependent variables and offset, respectively, described in Eq. (1); and $\beta_0$ and $\beta_i$ (where i = 1, 2, 3, 4, 5) are as described in Eq. (1). Variance inflation factors showed little evidence of correlation among independent variables for all call types, suggesting the GLM assumption of no multicollinearity was not violated (see Table S5 in Supplement 3). The full models were subjected to forward and backward stepwise selection using Akaike information criterion (AIC). The same models were selected using both forward and backward selection procedures (see Table S12 in Supplement 4 for details). The most parsimonious models were compared to the null models (includes intercept and offset terms only) using a likelihood ratio test for negative binomial models.
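The role of the offset can be illustrated numerically: with a log link, adding log(duration) to the linear predictor makes the expected count proportional to recording effort, so the remaining terms describe a rate. The sketch below is only an illustration of that identity with hypothetical coefficients; it does not fit a model and is not the glm.nb() workflow used in the study.

```python
import math


def expected_call_count(beta0, beta1, x, duration):
    """Expected count under a log-link model with offset log(duration)."""
    return math.exp(beta0 + beta1 * x + math.log(duration))


def expected_call_rate(beta0, beta1, x):
    """Expected calls per unit duration implied by the same coefficients."""
    return math.exp(beta0 + beta1 * x)


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.5
for duration in (1.5, 4.0):
    count = expected_call_count(b0, b1, x=2.0, duration=duration)
    # Dividing by duration recovers the same rate for any recording length.
    assert abs(count / duration - expected_call_rate(b0, b1, 2.0)) < 1e-12
```

A one-unit increase in the independent variable multiplies the expected rate by exp(β), which is how slopes from such models are usually read.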

Characterizing whale count
Linear regressions were applied to determine if variation in whale count changed over time or could be explained by calling behavior. The generic form of the regression equation was as follows:

$$\text{whale count} = \beta_0 + \beta_i x_i \qquad (3)$$

where the dependent variable is whale count; and $\beta_0$, $\beta_i$, and $x_i$ are as described in Eq. (1). The independent variables used were upcall rate, gunshot rate, tonal rate, and day of year. A linear regression was chosen because whale count was normally distributed and did not violate typical linear regression assumptions (see Supplement 3 for an evaluation of model assumptions). A multiple linear regression was not performed because all call types were highly correlated with each other and with day of year (see Fig. 2; Table S5 in Supplement 3), resulting in a collinearity issue that would prevent the appropriate evaluation of the effect of each independent variable. The coefficient terms (intercept and variable coefficient) for each regression were computed, and ANOVA (analysis of variance) statistics were assessed to determine whether the slope of each regression differed from 0. All analyses were conducted using the R programming language version 4.0.2 (R Core Team 2019). Data processing and visualization were achieved using tidyverse packages (Wickham et al. 2019). Spearman's rank correlation matrices were calculated and visualized using the corrplot package (Wei & Simko 2017). Negative binomial models and likelihood ratio tests were implemented with the glm.nb() and anova() functions, respectively, from the MASS package (Venables & Ripley 2002). All additional statistical analyses were implemented using the stats package (R Core Team 2019) unless otherwise noted. The R code used for this analysis is available from GitHub (https://github.com/kimfranklin/narw_sonobuoys).
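To make the single-predictor regression and its ANOVA test concrete, the sketch below computes ordinary least squares coefficients and the F statistic for the slope in plain Python. It is an illustration of the general technique with made-up data, not the authors' R code; the p-value lookup from the F distribution is omitted because it needs a statistics library.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns the coefficients and the ANOVA F statistic for the slope,
    which has (1, n - 2) degrees of freedom.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Partition total variation into regression and residual sums of squares.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    ssr = sst - sse
    f_stat = ssr / (sse / (n - 2))
    return {"slope": slope, "intercept": intercept,
            "f_stat": f_stat, "df": (1, n - 2)}


# Made-up data for illustration only.
fit = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```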

Data collection and processing
In total, 37 sonobuoys were successfully deployed; 8 were deployed from 27 June through 26 July 2017, 13 from 6 June through 12 August 2018, and 16 from 4 June through 26 August 2019 (Table 1). The majority (29 of 37) of the sonobuoys were deployed in the Shediac Valley in water depths between 70 and 100 m (Fig. 1). Four sonobuoys were deployed southeast of the Shediac Valley in early 2017, and 4 were deployed northeast of the majority of the sonobuoys in early 2019. The time of day the sonobuoys were deployed ranged from approximately 11:00 to 19:00 h UTC (08:00 to 16:00 h local Atlantic time; Table 1). The number of right whales observed during the deployments varied from 4 to 56 whales (median = 22 whales; Table 2). A total of 142 unique whales were photographed, 130 of which (91.5%) were photographed in more than 1 deployment. Foraging and socializing behaviors were observed during all deployments except for the 3 in June 2018. The male:female ratio was 1 or greater for all deployments except for 4 in late June–early July in 2017 and 2019. There were 6 deployments with no upcalls detected; 9 with no tonal calls detected; 10 with no gunshots detected; and 3 deployments, 1 in each year, where none of the call types were detected. The call rates for all call types were highly variable. The median upcall rate was 4.29 calls h⁻¹ (interquartile range [IQR]: 9.21 calls h⁻¹), while 2 deployments had upcall rates of 70.49 and 199.44 calls h⁻¹. The median rates for gunshot and tonal calls were 4.50 calls h⁻¹ (IQR: 34.86 calls h⁻¹) and 5.80 calls h⁻¹ (IQR: 41.63 calls h⁻¹), respectively. Counts of these latter 2 call types were relatively similar; for instance, 8 of the 9 deployments with gunshot counts above 100 also had tonal counts above 100.

Statistical analysis
All the call rates for the different call types were significantly and positively correlated with each other (ρ ≥ 0.59; Fig. 2). Foraging rate was significantly negatively correlated (ρ ≥ −0.46) with every variable except whale count, male:female ratio, and socializing rate. Upcall, gunshot, tonal, and socializing rates, as well as whale count, were positively correlated with day of year (ρ ≥ 0.38). The male:female ratio was positively correlated with gunshot rate and tonal rate (ρ = 0.36 and 0.44, respectively).

Characterizing call rates
The slopes and corresponding intercept coefficients of the single-variable GLMs were similar for each call type (e.g. socializing rate slope and intercept were similar for all call types; Table 3a). Foraging rate was the only independent variable that exhibited a negative slope for every call type regression (Table 3a, Fig. 3a,b). For the upcall rate regressions, the variables day of year, male:female ratio, foraging rate, and socializing rate were all significant (p ≤ 0.01) (Table 3a). Day of year, foraging rate, and socializing rate variables explained a significant proportion of the variation in gunshot rates (p ≤ 0.01). For tonal rates, the variables day of year, male:female ratio, foraging rate, and socializing rate were significant (p ≤ 0.02).
The backward AIC stepwise selection process conducted on the multivariate GLMs (Eq. 2) suggested that socializing rate and whale count best explained variation in upcall rate (Table 3b, Table S12 in Supplement 4). For the gunshot rate regression, the stepwise selection suggested that day of year was the best predictor. Lastly, for the tonal rate, the stepwise selection suggested day of year and male:female ratio were the best predictors. All the selected stepwise models explained more variance than the null models, which contained only the intercept and offset terms.

Table 3. All models and associated p-values for (a) single-variable negative binomial generalized linear models (GLMs) used to characterize call rates, (b) stepwise-selected negative binomial GLMs used to characterize call rates, and (c) linear models used to characterize whale count. The p-values for (a) and (b) were derived from likelihood ratio tests for negative binomial regressions, while those for (c) were derived from ANOVA tables. Models (a) and (b) used an offset term (log of deployment duration) that is not shown here. See Supplement 4 for full ANOVA tests and likelihood ratio test for negative binomial regressions. **Significant (α = 0.05)

(a) Characterizing call rates
Model | p-value
Upcall = (0.03 ± 0.01) × day of year + (−12.5 ± 1.67) | < 0.001**
Upcall = (0.88 ± 0.39) × male:female ratio + (−7.4 ± 0.71) | 0.002**
Upcall = (0.02 ± 0.02) × whale count + (−5.95 ± 0.6) | 0.57
Upcall = (−3.95 ± 1.44) × foraging rate + (−5.27 ± 0.31) | 0.01**
Upcall = (5.21 ± 1.28) × socializing rate + (−6.84 ± 0.26) | < 0.001**
Gunshot = (0.07 ± 0.01) × day of year + (−18.2 ± 2.04) | < 0.001**
Gunshot = (0.63 ± 0.52) × male:female ratio + (−6.07 ± 0.96) | 0.14
Gunshot = (0.03 ± 0.03) × whale count + (−5.56 ± 0. …

… that gunshot activity was lowest in May and June and progressively increased to a maximum in autumn (October, November, and December). In addition to gunshots, we also observed an increase in tonal calling rates over time. This constitutes the first known report of a temporal trend in right whale mid-frequency tonal calling. The temporal increases in gunshot and tonal rates appeared to coincide with a potential transition from primarily foraging to increased socializing behavior. This is consistent with observations from the Bay of Fundy, where gunshots and mid-frequency tonals were produced in SAGs and suggested to be part of reproductive displays (Parks & Tyack 2005, Parks et al. 2005). Additionally, Parks et al. (2011) observed similar results in the Bay of Fundy, where call rates (upcall, gunshot, and tonal) were lower when whales were foraging and higher when they were socializing. One speculative explanation is that right whales, as capital breeders (e.g. van der Hoop et al. 2017), may be focused on foraging during the early part of our study period and, having accumulated sufficient energy stores, socialize and call more later in the study period.
However, it is possible that the perceived decreases in foraging over time may be attributed to our inability to effectively observe right whales using different foraging strategies later in the study period (e.g. feeding near the ocean floor rather than at the surface) to adapt to shifts in the vertical distribution of their prey. The primary prey of right whales in the GSL, copepods of the genus Calanus, are likely engaged in diel vertical migration in the early part of our study period and primarily in diapausing layers near the ocean floor later in the study period (Baumgartner & Tarrant 2017, Brennan et al. 2019, Plourde et al. 2019, Sorochan et al. 2019). Foraging at or near the surface and socializing behaviors are more readily observed, as both are often characterized by visually obvious features (splashes, wakes, etc.). Though the causes for observed behavioral shifts remain unknown, if behavior and call rate relationships are consistent in this habitat, it may be possible to infer the presence, or lack thereof, of socializing and foraging behavior from acoustic data alone. This would potentially allow managers to identify when and where right whales are engaged in certain behaviors and adjust risk mitigation measures accordingly.
Our results also show a positive relationship between tonal call rate and male:female ratio, such that more tonal calls were produced in aggregations composed of a greater proportion of males. Observations from the Bay of Fundy suggest tonal calls are often associated with the focal female of a SAG, and tonal call production varied depending on the demographic composition of the SAG (Parks & Tyack 2005). A possible explanation for these observations is that tonal call production plays a role in coordinating mating behavior, but additional evidence is necessary to support this idea. In contrast, gunshot rate was not always related to male:female ratio but did increase over the course of the study period. Gunshot sounds have been attributed to males and potentially serve as a reproductive advertisement directed towards females and/or an agonistic display directed towards other males (Parks et al. 2005). Perhaps the lack of an association between gunshot rate and male:female ratio we observed was because gunshot calls were being produced in multiple behavioral contexts (i.e. not only produced during mating behavior).

Upcalls
Upcall production rates increased over the study period (Fig. 3a), which is consistent with what others have reported in several regions in the Gulf of Maine (Mussoline et al. 2012, Bort et al. 2015) and the Scotian Shelf (Mellinger et al. 2007). The results of those studies suggest that upcalls exhibited seasonal trends similar to those of gunshots, with the fewest upcalls in May and June, then increasing in July and August, and reaching a maximum in mid- to late autumn. The previously mentioned studies were solely acoustic without concurrent visual observations and therefore could not determine if the increase in upcalls was due to an increase in the number of whales or to an increase in the rate at which individual whales produced upcalls. Here, we provide the first report that upcall rate in this habitat can be influenced by both the number of whales and their behavioral state. This agrees with speculation made by Clark et al. (2010) that other factors such as social context, environmental conditions, and whale abundance likely affect upcall production. Our results were inconclusive about the age and sex composition of the whales sighted and their relationship with upcalls. However, 2 late-season (August) recordings with upcall rates of over 70 calls h⁻¹ were associated with prolific gunshot and tonal call production and the highest socializing rates. This suggests upcalls are not always contact calls but are occasionally incorporated into acoustic displays. Changes in upcall rate affect the probability of acoustic detection (e.g. Johnson et al. 2022) and should be considered in the interpretation of PAM results.
Most right whale PAM used to inform management schemes relies exclusively on upcalls to determine daily right whale presence (e.g. Baumgartner et al. 2013, Davis et al. 2017). This is because upcalls are a relatively identifiable species-specific signal that is used by all age and sex classes, and their acoustic presence within 24 to 48 h correlates well with visual presence estimates (Baumgartner et al. 2019). In this study, recordings from 6 deployments did not have upcalls despite the presence of 4 to 31 right whales within the area, although 3 of these recordings had either gunshot or tonal calls (Table 1). Due to the variability in right whale calling, our results indicate that using upcalls alone to assess right whale presence−absence at fine timescales (<1 d) is unreliable. The consideration of additional call types improves presence−absence estimates but still does not fully resolve the instances when a whale is present and not calling. We suggest the most reliable PAM approach would use all call types over longer time periods (≥1 d) for detecting right whale presence and absence.
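The effect of the choice of call types on a presence estimate can be illustrated with a minimal sketch. It assumes that whales were sighted during every deployment and counts a deployment as an acoustic detection if at least one call of any chosen type was recorded. The records, field names, and counts below are hypothetical and are not the values in Table 1.

```python
from typing import Iterable

# Hypothetical per-deployment call counts (NOT the values in Table 1).
# Whales are assumed to have been sighted during every deployment.
deployments = [
    {"upcall": 12, "gunshot": 0, "tonal": 3},
    {"upcall": 0, "gunshot": 5, "tonal": 0},
    {"upcall": 0, "gunshot": 0, "tonal": 0},
    {"upcall": 0, "gunshot": 0, "tonal": 2},
]


def detected(counts: dict, call_types: Iterable[str]) -> bool:
    """Return True if at least one call of any listed type was recorded."""
    return any(counts.get(call_type, 0) > 0 for call_type in call_types)


def detection_fraction(records: list, call_types: Iterable[str]) -> float:
    """Fraction of deployments with at least one call of the listed types."""
    call_types = tuple(call_types)
    return sum(detected(r, call_types) for r in records) / len(records)


upcall_only = detection_fraction(deployments, ["upcall"])
all_types = detection_fraction(deployments, ["upcall", "gunshot", "tonal"])
print(f"upcalls only: {upcall_only:.2f}, all call types: {all_types:.2f}")
```

In this toy example, adding gunshots and tonal calls raises the detection fraction, but the silent deployment is still missed, which mirrors the limitation described above.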

Characterizing whale count
As we have shown, right whale calling is ephemeral, highly variable, and dependent on behavioral state rather than the number of observed individuals. Our results align with those of Clark et al. (2010) but contrast with findings from Matthews et al. (2001) and Durette-Morin et al. (2019), both of which reported a statistically significant predictive relationship between the numbers of calls and the numbers of whales observed. The discrepancy may be attributed to the aggregation sizes considered: 89% (33 of 37) of our deployments were associated with observations of more than 10 whales, compared to 14% (3 of 12) and 22% (5 of 23) from Matthews et al. (2001) and Durette-Morin et al. (2019), respectively. The aggregations Clark et al. (2010) observed were mostly groups of 2 or 3 whales. Another reason for the discrepancy may be that Matthews et al. (2001) and Durette-Morin et al. (2019) used vessels as visual survey platforms, which may not have been as effective as an aircraft at collecting observations at the spatial scale of acoustic monitoring. Furthermore, Durette-Morin et al. (2019) used fixed moorings that monitor constantly (24 h d⁻¹, as opposed to 0.5 to 6 h d⁻¹ when weather is good) regardless of the number of whales present. Our study also occurred in a different habitat and time of year from the previously mentioned papers, as Clark et al. (2010) studied Cape Cod Bay from January to May, Durette-Morin et al. (2019) studied Roseway Basin from August to September, and Matthews et al. (2001) studied the Great South Channel and Cape Cod from April to May and the Bay of Fundy from August to September. Thus, habitat and time of year may affect whale behavior and contribute to the observed differences among these studies.
The variability in calling rates estimated here will undoubtedly lead to large uncertainties in acoustic density estimates, which may render such estimates uninformative, especially when compared to more precise estimates from visual mark−recapture methods (e.g. Crowe et al. 2021). For example, the coefficient of variation for our estimated upcall production rate (also called cue rate), a common requirement for many density estimation methods (Marques et al. 2013), was 265% (median 0.2 call h⁻¹ whale⁻¹, IQR: 0.367 call h⁻¹ whale⁻¹; Table 2) and indicates large uncertainty. Perhaps a more tractable approach to acoustic density estimation for right whales and other small populations of ephemerally, facultatively vocalizing whales would be to use an acoustic mark−recapture framework, where certain characteristics of calls can be reliably associated with an individual. For instance, McCordic et al. (2016) conducted an experiment where upcalls from 14 right whales of known age and sex were found to contain enough information to derive right whale identity. In a case study with bottlenose dolphins Tursiops truncatus, Longden et al. (2020) used an acoustic mark−recapture framework to estimate local abundance using known signature whistles. Detailed analysis of the call parameters (e.g. amplitude, frequency, call duration) was beyond the scope of our study but could be done in the future to evaluate acoustic mark−recapture for right whales in this habitat.
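A minimal sketch of how a cue rate and its coefficient of variation might be computed is given below. It assumes the cue rate for a deployment is the number of calls divided by the product of recording hours and whales sighted, and that the coefficient of variation is the sample standard deviation divided by the mean. The input values are hypothetical, and the exact estimator used for Table 2 is not reproduced here.

```python
import statistics

# Hypothetical (calls, recording hours, whales sighted) per deployment.
# These are illustrative values, not data from this study.
observations = [(0, 2.0, 10), (4, 2.0, 10), (60, 3.0, 20), (2, 1.0, 4)]


def cue_rate(calls: int, hours: float, whales: int) -> float:
    """Calls per hour per whale for one deployment."""
    return calls / (hours * whales)


rates = [cue_rate(calls, hours, whales) for calls, hours, whales in observations]

mean_rate = statistics.mean(rates)
median_rate = statistics.median(rates)
# Sample standard deviation (n - 1 denominator); an assumption of this sketch.
sd_rate = statistics.stdev(rates)
cv_percent = 100 * sd_rate / mean_rate

print(f"median = {median_rate:.3f}, mean = {mean_rate:.3f}, CV = {cv_percent:.0f}%")
```

A coefficient of variation above 100% means the standard deviation exceeds the mean, so any density estimate that divides a call count by such a cue rate inherits at least that much relative uncertainty.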

Sources of variation and biases
This dataset is subject to numerous biases that must be considered carefully before drawing conclusions about right whale acoustic ecology. For instance, we cannot make robust inferences about spatial, daily, seasonal (i.e. longer than 3 mo), and environmental variability due to the limited and haphazard distribution of our observations and inconsistent visual survey effort in the vicinity of each deployed sonobuoy.
Whale observations were made over short periods of time (e.g. in most cases, seconds to a few minutes) during daylight hours from an aircraft. It is certain that many behaviors were not documented because of the limited time spent observing the whales, as well as visual observations being limited to whales photographed at the surface for individual identification. We assumed that behaviors were missed consistently in each deployment and our observations represented relative changes in right whale behavior. Similarly, given that the assumed maximum acoustic detection radius (30 km) was much larger than the assumed maximum visual detection radius (1.5 km), the aircraft did not conduct comprehensive surveys of the area monitored by the sonobuoy, and the recording duration typically exceeded the photo-documentation duration, it is possible that some whales that were acoustically detected were not visually detected (see Table S3 in Supplement 2 for additional details on estimated spatial coverage). Again, we assumed that this discrepancy did not introduce any systematic bias and that our results reflect relative patterns in right whale abundance and calling rates.
Our observations were limited to between 11:00 and 19:00 h (UTC) on a given day, which precluded characterization of diel patterns in calling that have been observed in other habitats (Mussoline et al. 2012). Furthermore, sounds from the aircraft can occasionally be heard on the acoustic recordings. Other studies have shown that noise from platforms can disrupt normal marine mammal behaviors (e.g. Patenaude et al. 2002, Erbe et al. 2019). Although this may have impacted the visual and acoustic observations in this study, a preliminary analysis comparing the number of calls received in the first and second half of each deployment provided limited evidence that call production was not strongly affected by the aircraft (Fig. S1 in Supplement 2). The study design prevented more robust analysis of these potential sampling artifacts. We assumed both acoustic and visual data were impartial and unbiased because both were manually reviewed by experienced analysts using common protocols.
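The first-half versus second-half comparison can be tabulated as in the sketch below. It assumes that call times are available as hours since the start of each recording and that a call falling exactly on the midpoint is assigned to the second half. The function name and example times are hypothetical, and the sketch does not reproduce the analysis behind Fig. S1.

```python
def split_half_counts(call_times_h: list, duration_h: float) -> tuple:
    """Count calls in the first and second half of one deployment.

    call_times_h: call times in hours since the start of the recording.
    duration_h: total recording duration in hours.
    """
    midpoint = duration_h / 2
    first = sum(1 for t in call_times_h if 0 <= t < midpoint)
    second = sum(1 for t in call_times_h if midpoint <= t <= duration_h)
    return first, second


# Hypothetical call times for a 2 h recording.
example_times = [0.1, 0.4, 1.2, 1.6, 1.9]
first_half, second_half = split_half_counts(example_times, duration_h=2.0)
print(first_half, second_half)
```

If the aircraft suppressed calling while it was overhead early in a deployment, one would expect consistently fewer calls in the first half across deployments.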
Entangled whales were present during 5 deployments. The calling and behavioral rates were statistically indistinguishable between deployments with and without entanglements (Table S4 in Supplement 2). To further assess the potential impact of including entangled whales in the results, statistical models (Eqs. 1−3) were repeated with these deployments removed. The results were nearly identical to those of the full dataset. The only exception was the tonal call rate variability analysis (Eq. 2), where the model selected using the full dataset included the term male:female ratio (Table 3b), while the model selected without entangled whales did not (Table S15b in Supplement 5). This provides little evidence that the presence of entangled whales altered the acoustic or visual behavior observed during a deployment.
Another source of variation may arise from repeated sampling (visually and acoustically), which would violate the independence assumption of the models used. Although we have the identity of every right whale observed, and 91.5% of the individual whales observed were photographed in more than 1 deployment (see Crowe et al. 2021 for more information on right whale residency in the GSL), we do not have the tools to determine which individuals were acoustically active or whether the same individuals were repeatedly acoustically active. Only 4 of 37 sonobuoys were deployed in proximity (within 24 h and 30 km) of another sonobuoy (these deployments are identified in Table 1). Due to the low number of neighboring deployments and the high variability of acoustic and visual whale behavior, we presumed that the independence assumption was not violated.
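The proximity criterion (within 24 h and 30 km) can be sketched as a pairwise check on deployment positions and start times. The sketch assumes great-circle (haversine) distance between deployment positions, a mean Earth radius of 6371 km, and comparison of start times only; sonobuoy drift is ignored. The deployment records and identifiers are hypothetical.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def neighboring(deps: list, max_km: float = 30.0,
                max_gap: timedelta = timedelta(hours=24)) -> set:
    """Return ids of deployments close in both space and time to another one."""
    flagged = set()
    for i, a in enumerate(deps):
        for b in deps[i + 1:]:
            close_time = abs(a["start"] - b["start"]) <= max_gap
            close_space = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
            if close_time and close_space:
                flagged.update((a["id"], b["id"]))
    return flagged


# Hypothetical deployments (positions in decimal degrees, times in UTC).
deps = [
    {"id": "A", "lat": 47.5, "lon": -63.5, "start": datetime(2019, 7, 1, 12)},
    {"id": "B", "lat": 47.6, "lon": -63.5, "start": datetime(2019, 7, 1, 18)},
    {"id": "C", "lat": 48.5, "lon": -62.0, "start": datetime(2019, 7, 2, 12)},
    {"id": "D", "lat": 47.5, "lon": -63.5, "start": datetime(2019, 7, 10, 12)},
]
flagged = neighboring(deps)
print(sorted(flagged))
```

Here A and B are flagged because they are about 11 km and 6 h apart, whereas D shares a position with A but falls outside the 24 h window.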

Conclusions
We compared concurrent visual and acoustic observations to characterize the calling behavior of North Atlantic right whales in the southern GSL, Canada. The call rates increased from June to August and were associated negatively with observed foraging behavior and positively with observed socializing behavior. We occasionally observed prolific production of upcalls (>20 calls h⁻¹) in association with gunshot and tonal calls, suggesting that upcalls may be incorporated into acoustic displays. We found that call rates were highly variable, making it impractical to acoustically estimate the number of right whales accurately. Considering all call types will improve PAM at fine time scales but will not resolve instances when whales are present but not acoustically active. The associations between whale calling and behavioral state suggest that we may be able to reliably infer some whale behavior from acoustics alone in this habitat, thus advancing acoustic monitoring beyond a presence-only tool.

Table 1. Raw data collected during aerial visual surveys. Raw data are counts of all visual and acoustic variables collected for each deployed sonobuoy, with corresponding sonobuoy location, date and time, and duration of acoustic recording. DD: decimal degrees

Table 2. Summary variables collected or derived for all deployments (n = 37), where whale count is a count and not a rate