Experimental field studies to measure behavioral responses of cetaceans to sonar

Substantial recent progress has been made in directly measuring behavioral responses of free-ranging marine mammals to sound using controlled exposure experiments. Many studies were motivated by concerns about observed and potential negative effects of military sonar, including stranding events. Well-established experimental methods and increasingly sophisticated technologies have enabled fine-resolution measurement of many aspects of baseline behavior and responses to sonar. Studies have considered increasingly diverse taxa, but primarily odontocete and mysticete cetaceans that are endangered, particularly sensitive, or frequently exposed to sonar. This review focuses on recent field experiments studying cetacean responses to simulated or actual active military sonars in the 1 to 8 kHz band. Overall results demonstrate that some individuals of different species display clear yet varied responses, some of which have negative implications, while others appear to tolerate relatively high levels, although such exposures may have other consequences not measured. Responses were highly variable and may not be fully predictable with simple acoustic exposure metrics (e.g. received sound level). Rather, differences among species and individuals along with contextual aspects of exposure (e.g. behavioral state) appear to affect response probability. These controlled experiments provide critically needed documentation of identified behavioral responses occurring upon known sonar exposures, and they directly inform regulatory assessments of potential effects. They also inform more targeted opportunistic monitoring of potential responses of animals during sonar operations and have stimulated adaptations of field methods to consider increasingly realistic exposure scenarios and how contextual factors such as behavioral state and source proximity influence response type and probability.


INTRODUCTION
Early ethologists recognized that baseline observations provide key insights for developing hypotheses about animal behavior and designing experiments to test them. Recognizing the need to separate anecdotal observation from controlled measurements, and building upon experimental methods in other scientific disciplines, pioneering researchers developed techniques to empirically document causal links between exposure to stimuli and behavioral responses (Lorenz 1937, Tinbergen 1963). These and other early studies established experimental methods as an essential foundation for studying animal behavior. Increasingly sophisticated experimental methods have been derived and applied to understand marine mammal behavior for over half a century (e.g. experimental demonstration of echolocation in dolphins by Norris et al. 1961).
Experimental presentations of external stimuli to wild and laboratory animals have documented myriad aspects of behavioral ecology, such as how animals recognize one another, compete and succeed in mating, select feeding or breeding habitat, and respond to various stimuli (see Bradbury & Vehrencamp 1998 for a review). For species that rely heavily on sound to perform vital functions, including most vertebrates and all marine mammals, experimental methods have been fundamentally important in understanding how and why they use sound and how they may respond to it.
In recent decades, interest has grown concerning how cetacean behavior may be affected by active sonar, particularly military systems operating in the lower-frequency (~0.1−2 kHz) and mid-frequency bands (2−8 kHz). Research on possible behavioral effects of the US Navy's SURTASS low frequency active (LFA) (~0.1−0.5 kHz band) sonar (Miller et al. 2000, Fristrup et al. 2003) was initiated given the relatively high source level of this low frequency sonar, which can propagate large distances. Public concern about the possible behavioral impact of sonar on killer whales was heightened following an unintentional sonar exposure of the endangered (US Endangered Species Act [ESA], listed in 2005) southern resident population of killer whales in Haro Strait, WA, USA in 2003 (Southall & Gentry 2005). In a similar event, multi-national naval exercises in 2000 were associated with reduced killer whale presence in the Vestfjorden basin of Norway (WWF-Norway 2001). But despite these concerns and short-term observations of response, the National Research Council Ocean Studies Board (NRC 2005) concluded that there were no data or methods available to quantify possible population consequences of these kinds of disturbances.
Perhaps the strongest motivation for additional behavioral research arose from a series of lethal cetacean strandings coinciding with active sonar (~0.5−8 kHz) testing and training. This phenomenon was first noted from mass strandings in the Canary Islands and Greece (Simmonds & Lopez-Jurado 1991, Frantzis 1998) and became more widely recognized following a well documented, highly publicized mass stranding in the Bahamas in 2000 (Balcomb & Claridge 2001), for which the US Navy acknowledged a causal association (Evans & England 2001). These and subsequent events, including continuing strandings in the Mediterranean (e.g. Tethys 2014, www.tethys.org/tethys/strandedwhalesupdate/), have raised awareness, interest, and debate within scientific, military, regulatory, and environmental communities about the nature and extent of the association between active sonar and strandings. Broad analyses have identified other potential sonar-associated strandings globally, observing commonalities and raising key questions, including the potential role of behavioral responses (Brownell et al. 2004, Cox et al. 2006, Filadelfo et al. 2009).
Sonar-associated strandings have typically included multiple animals found within hours to several days, spread over tens of kilometers, in places with deep water near shore, and have predominately involved several species of deep-diving, pelagic beaked whales (family Ziphiidae), most commonly Cuvier's beaked whale Ziphius cavirostris (Cox et al. 2006, Filadelfo et al. 2009). The Bahamas 2000 stranding involved some of the most powerful tactical military mid-frequency active sonar (MFAS) systems (SQS-53C) with primary energy in the 2.5−5 kHz range (as did a number of others based on unpublished reports). However, a lower-frequency (0.45−0.7 kHz) sonar system was involved in the 1998 Greece stranding (Frantzis 1998, D'Amico et al. 2009, Filadelfo et al. 2009), as well as a 3 kHz sonar transmitted at the same time. In some instances, physical trauma associated with the formation of gas or fat emboli has been documented in sonar-associated strandings (e.g. Jepson et al. 2003, Fernández et al. 2005). The underlying cause of injury remains unknown. However, from the relatively tight temporal and broad spatial distribution pattern of most strandings and the quite high exposure levels required to induce changes in sensitive hearing systems in most cetaceans tested in laboratories (see Finneran 2015), it is unlikely that they resulted directly from physiological consequences of exposure (e.g. tissue damage caused directly by sound). Stranding patterns suggest that other less direct mechanisms were involved, unless by some unlikely coincidence all animals involved were within a few hundreds of meters of sound sources. Specifically, exposure to sonar may have resulted in behavioral responses that ultimately resulted in a cascade of events leading to lethal stranding, potentially including decompression injuries caused by alteration of dive behavior (Cox et al. 2006). Anecdotal observations of strong and potentially harmful behavioral responses in several other (non-beaked whale) cetaceans associated with or identified as likely resulting from exposure to approximately 2−8 kHz tactical sonars have provided additional support for the notion that behavioral reactions may play a key role in the sequence of events leading to injury or mortality (Southall & Gentry 2005, Southall et al. 2006).
Given a general lack of empirical data on cetacean behavioral responses to sonar systems involved in stranding events, focused research was clearly needed. This was explicitly identified by international scientific and governmental regulatory bodies in the mid- to late 2000s. These recommended controlled experiments to measure behavioral response in cetaceans, including but not limited to beaked whales, to improve the scientific basis for understanding how they may respond to and potentially be harmed by active sonar (ICES 2005, NRC 2005, IACMST 2006, Nowacek et al. 2007, Southall et al. 2007, 2009). Efforts to formulate science-based noise exposure criteria (e.g. Southall et al. 2007) were motivated by the need to improve regulatory assessments and mitigation of potential harm from human sound sources, including sonar. This increased interest and concern has resulted in major advances in both basic and applied scientific understanding of marine mammal behavior and potential effects of noise.
A series of independent but cross-pollinating international research efforts have focused on cetacean behavioral responses to different sonar systems. Studies have applied well established scientific principles to develop robust and adaptive experimental methods. The methods, results, and broad conclusions from a decade of focused research, financially supported largely by the US, Norwegian, and Dutch defense organizations, are described here. These studies have substantial scientific and regulatory implications for endangered, threatened, and protected species.

EVOLUTION OF EXPERIMENTAL APPROACHES AND METHODS
Experimental approaches using controlled exposure experiments (CEEs) to study marine mammal behavioral responses to noise have been derived from traditional sound playback experiments to study natural communication. These experiments involve the presentation of natural signals and often synthetic control signals in different contexts (e.g. known versus unfamiliar conspecifics) to investigate aspects of behavior such as individual recognition, territoriality, or parental attendance of young (e.g. McGregor 1992, 2013, Falls 1992, Hopp & Morton 1997). Results from naturalistic observations informed by empirical playback experiments have formed the foundation of much of what is known about communication systems and behavior in many taxa, including birds (e.g. Falls et al. 1982), terrestrial mammals (e.g. Cheney et al. 1995), and fish (e.g. McGregor 1992) as well as marine mammals (e.g. Tyack 1983, Janik et al. 2006, Holt et al. 2010). Playback methods have been adapted and applied to free-ranging marine mammals using CEEs to test potential responses to various human noise sources, including examples involving low frequency coded signals (Frankel & Clark 2000), LFA sonar (Miller et al. 2000, Fristrup et al. 2003), seismic airgun surveys (Miller et al. 2009, Cato et al. 2013, Dunlop et al. 2015), and low-level tonal signals (Nowacek et al. 2004, Dunlop et al. 2013). Their application to studying behavioral responses to naval sonar in the 1−8 kHz range is an area of active research considered here.
The basic approach involves measuring aspects of individual behavior within 'pre-exposure baseline' (no stimulus), 'exposure' (controlled stimulus presentation using specified protocols), and 'post-exposure' (after stimulus) periods (e.g. Tyack et al. 2003). During exposure, received sound at the experimental subject is controlled in order to meet research objectives. This involves a presentation of exposure stimuli with specific characteristics, often including an escalation in the exposure level to identify whether and when behavioral response(s) may occur. Behavior during exposure is compared with baseline behavior to identify potential change points. Post-exposure behavior is evaluated in terms of whether and when behavior returns to baseline.
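The three-phase comparison described above can be illustrated with a minimal Python sketch. It flags the first exposure sample that departs from the baseline distribution by more than a chosen number of standard deviations, and checks whether post-exposure samples fall back within that range. The function names, the z-score threshold, and the synthetic values are illustrative assumptions only; the cited studies use considerably more sophisticated statistical methods than this simple rule.

```python
from statistics import mean, stdev


def first_change_point(baseline, exposure, z_threshold=3.0):
    """Return the index of the first exposure sample that deviates from
    the baseline mean by more than z_threshold baseline standard
    deviations, or None if no sample does. (Hypothetical helper.)"""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variability")
    for i, value in enumerate(exposure):
        if abs(value - mu) / sigma > z_threshold:
            return i
    return None


def returned_to_baseline(baseline, post, z_threshold=3.0):
    """True if every post-exposure sample lies within the baseline range
    defined by the same z-score threshold. (Hypothetical helper.)"""
    mu = mean(baseline)
    sigma = stdev(baseline)
    return all(abs(v - mu) / sigma <= z_threshold for v in post)


# Synthetic values of one behavioral metric (arbitrary units), not real data.
baseline = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0]
exposure = [10.2, 9.8, 14.0, 15.0]
post = [10.1, 9.9, 10.4]

onset = first_change_point(baseline, exposure)   # third exposure sample
recovered = returned_to_baseline(baseline, post)
print(onset, recovered)
```

A single-metric threshold of this kind ignores autocorrelation and the multivariate nature of tag data, which is one reason the analytical methods discussed later in this review were developed.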

Methodological approaches to measuring marine mammal behavioral response
Early experiments were conducted as part of the US Navy's SURTASS-LFA scientific research program. Miller et al. (2000) demonstrated changes in humpback whale song durations during exposure to a similar system by visually following individual singers and listening with towed hydrophone arrays. Croll et al. (2001) also studied responses of foraging fin and blue whales to the same sound source, showing that whale behavior appeared more strongly linked to prey than sonar exposure. Tyack (2009) demonstrated localized avoidance of an active (0.1−0.5 kHz) sonar source by migrating grey whales using shore-based visual tracking methods. However, broad application of these observational methods and the types of behavior that can be studied are limited.
In conjunction with increasing research effort in the early 2000s, animal-attached archival tags that record fine-scale individual movement and broadband acoustic data were developed (Johnson & Tyack 2003). These tags contain pressure sensors to measure depth, 3-axis magnetometers and accelerometers to record movement, and hydrophones to measure vocal behavior of the tagged animal along with environmental sounds. The behavioral and noise exposure recording capability of these tags combined with the ability to follow tagged individuals using integrated radio beacons were critical technological developments enabling subsequent experimental studies.
These novel tags quickly enabled researchers to record fine-scale diving, vocalization, and movement behavior in cetaceans (e.g. Nowacek et al. 2004, Johnson et al. 2006). These and other studies revealed aspects of behavior impossible to measure with surface observations, passive listening, or lower-resolution archival or satellite-linked tags. Their successful application on many species, including the poorly known, deep-diving beaked whales that were the subject of concern regarding sonar-associated strandings (Johnson et al. 2006, Tyack et al. 2006), provided critically needed baseline behavioral data. Focal study individuals (Altmann 1974) could be followed using tag-specific radio beacons, enabling measurement of responses within an experimental framework comparing the focal individual before, during, and after exposure (e.g. Nowacek et al. 2004, Miller et al. 2009).
A detailed understanding of baseline (undisturbed) behavior patterns from these tags has been crucial in CEE design and in meaningfully interpreting behavior and potential response. Baseline studies have focused on key behavioral parameters likely to be important in detecting potential behavioral changes during CEEs. For instance, Tyack et al. (2011) used baseline data from tagged Blainville's beaked whales Mesoplodon densirostris to design research protocols to present sonar exposure during a particular part of the dive cycle in which baseline behavior seldom changed. They then compared numerous dive parameters in intentionally unexposed Blainville's beaked whales to those in sonar CEEs using identical high-resolution behavioral metrics. Similarly, studies of diving kinematics and feeding strategies of blue whales Balaenoptera musculus in relation to prey distribution (Friedlaender et al. 2015, Goldbogen et al. 2015, Hazen et al. 2015) have provided critical insight into the interpretation of responses of feeding whales during CEEs (Friedlaender et al. 2016). Furthermore, baseline acoustic data from the tags have contributed to understanding the typical animal distribution in study areas before, during and after exposure to sound (McCarthy et al. 2011, Tyack et al. 2011, Yack et al. 2013). Finally, novel methods to identify vocalizing individuals within groups are enabling baseline vocalization measurements and examinations of potential effects of noise on calling and echolocation (e.g. Goldbogen et al. 2014, Stimpert et al. 2015, Arranz et al. 2016).
The combination of focal individual sampling with high-resolution tags to measure fine-scale movement, sound production, and sound exposure within an experimental context poses some analytical challenges. For instance, studies are often limited by small sample sizes, although this depends on the relative efficacy of locating and tagging subjects. Species such as baleen whales that are relatively amenable may have larger sample sizes (e.g. Goldbogen et al. 2013, Sivle et al. 2015), but for more challenging subjects, including the high-priority beaked whales, sample sizes have remained small (e.g. DeRuiter et al. 2013a, Stimpert et al. 2014, Miller et al. 2015). Additionally, the multivariate nature of rich time-series behavioral data, differences in temporal resolution, and differential variability in behavioral parameters pose complex statistical challenges. Sustained development of associated analytical methods to deal with these challenges, through interactions between tag designers, acousticians, field biologists, and statisticians, has significantly improved CEE analyses. These have included the application of traditional and novel statistical methods, including expert elicitation, generalized linear mixed models, generalized estimating equations, time-series analysis techniques, and state-space analyses (see Harris et al. 2015, 2016).

Methodological considerations of sonar CEEs
The interaction of scientific objectives with the logistical realities of obtaining behavioral response data from free-ranging cetaceans in increasingly realistic sonar exposure conditions has shaped CEE experimental design and methodology. These present a number of practical challenges, including conducting multiple CEE replicates on the same wild individuals to test for potentially different effects of stimulus type, repeated exposure, variable received level or other exposure parameters. In laboratory settings, these include issues of habituation and/or sensitization in addition to the very different exposure context and consequences relative to free-ranging animals. But it is often difficult to re-acquire and test wild individuals multiple times and there are challenges in interpreting order effects where sequential noise exposures are presented within individuals over short durations (e.g. Tyack et al. 2011). Exposure dose escalation is often used where aspects of exposure (e.g. received sound levels) are increased within presentations to each individual subject by increasing the source level or by moving sound sources transmitting at a constant level to identify the lowest exposure condition at which the onset of a particular response may occur. Multiple individuals are then tested in separate experiments or using sequences of repeated exposures and controls to determine exposure conditions within which behavioral responses may occur and to describe intra-individual variability.
Numerous experimental and methodological considerations drive operational aspects of cetacean sonar CEEs. These broad considerations, which depend on species life history, general movement patterns, social structure, and geographic location, include at least the following:
(1) Sonar source type, including frequency spectra, source level, signal modulation pattern (including rise time and duty cycle), and overall duration.
(2) Source−subject range and relative movement, determined by source capabilities, desired received levels, and whether level escalation is used and is achieved by increasing stationary source level or source movement or both.
(3) Mitigation requirements of permitted research, which may include source level ramp-up (which has been required as a permit condition in some experiments and is consistent with exposure level-escalation objectives), source shut-down zones, acoustic detection of sensitive species such as beaked whales during exposure, observations for aberrant behavior that indicate a risk of harm to the subjects, and other considerations.
(4) Type of control stimuli used, which may include no-sonar control sequences, signals with acoustic features similar to those of interest (e.g. pseudorandom noise at overlapping frequencies), or presumably salient biological signals (e.g. sounds of conspecifics or predators).
(5) Whether or not the same or multiple stimuli are repeated during a CEE sequence.
These methodological considerations have been approached in slightly different ways across the diverse experimental sonar exposure studies that have been conducted over the past decade, and they have collectively yielded a large, increasing body of published results despite acknowledged inherent challenges.

Recent and ongoing marine mammal sonar CEEs
Several recent research programs from around the world have resulted in major advances in studying cetacean behavioral responses to sonar. The application of high-resolution, multi-sensor tags within a CEE context in these studies has allowed direct measurements of potential behavioral responses that are directly applicable to and are being incorporated into ongoing regulatory assessments regarding military sonar. These studies have some common goals, objectives, and methods, with considerable overlap in researchers, including the authors of this review. Each study is discussed here in some detail, as collectively they represent much of the recent progress in this field.
This was the first study to measure beaked whale behavioral responses to simulated and real naval sonar, including the potential mechanisms underlying the adverse effects of specific active military sonar systems (AN/SQS-53C and AN/SQS-56C tactical sonars) that use 2−8 kHz sonar signals and were involved in previous mass stranding events. It was supported by the US Navy along with the primary US federal regulatory agency for marine mammals (US National Oceanic and Atmospheric Administration [NOAA]) and included a multidisciplinary collaboration of academic, private sector, and government scientists. The study was directly motivated by the 2000 beaked whale stranding in a relatively nearby area of the Bahamas (see Filadelfo et al. 2009) and was designed to investigate behavioral responses of beaked whales using 3 different complementary methods suited to measure potential responses on different time, space, and data resolution scales.
The experimental component of the Tyack et al. (2011) study was referred to as the AUTEC Behavioral Response Study (AUTEC-BRS). Empirical measurements of acoustic exposure, behavior, and potential changes in behavior were made using CEEs to compare baseline behavior with that occurring during and following exposure. Three different sound stimuli were projected from an experimental sound source deployed from a research vessel: (1) sonar waveforms used in operational navy systems; (2) pseudorandom noise in the same frequency band; and (3) sounds from mammal-eating killer whales, a potential predator (see Tyack et al. 2011 for more details). The maximum source level used for the sonar and pseudorandom noise stimuli was 211 to 212 dB re 1 µPa at 1 m, and that for the killer whale playback was 190 to 203 dB re 1 µPa at 1 m. As the first study was specifically designed to expose beaked whales to sonar signals simulating those for which lethal strandings had previously occurred, extensive protective measures for conducting CEEs were applied (Southall et al. in press). The focal species was Blainville's beaked whales M. densirostris, but several delphinid cetaceans, including short-finned pilot whales Globicephala macrorhynchus, melon-headed whales Peponocephala electra, and false killer whales Pseudorca crassidens, were included as comparison species and to evaluate whether differential social responses to potential threats might affect the probability of flight reactions and potential associated risk of stranding. Insufficient samples were obtained across these species to fully evaluate differences between species, but some observations of behavioral changes are illustrative and are discussed below. The beaked whale CEE measurements provide extremely detailed, high-resolution measurements of individual behavior for relatively short periods (hours) before, during, and after CEEs using experimental protocols designed to identify potential behavioral response onset.
A second method employed by Tyack et al. (2011) was to deploy satellite tags on individual Blainville's beaked whales to opportunistically monitor movements during a period of days before, during, and after a sonar training exercise conducted on the AUTEC range. These measurements provided relatively coarse measures of surface locations, but extended the total duration monitored to many days. These data included a baseline (before exposure) period of several days, followed by a period of intense sonar operation in the general vicinity of the tagged individual, followed by a period with no sonar transmissions.
The third approach also involved observational monitoring of marine mammal behavior before, during, and after sonar transmissions, but expanded the spatial scale to include the entire AUTEC range. Passive acoustic monitoring was conducted using the array of bottom-mounted hydrophones modified to track cetaceans vocalizing within the AUTEC range. Each of nearly 80 hydrophones in an array that covers hundreds of square kilometers was simultaneously monitored to detect echolocation clicks of multiple beaked whales. The duration of clicking for groups of beaked whales was called a group clicking period (GCP); their location and duration were quantified for comparable periods (20 to 23 h total) before, during, and after active sonar from realistic training operations. This approach lacks an experimental framework and resolution of individual responses, and is limited to measurements of vocalizing animals. However, it provides a broader perspective on potential responses to real operations and movement patterns for the local population of animals on the range.
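To make the before/during/after GCP comparison concrete, the following Python sketch tallies the number and total duration of group clicking periods in each phase. The function name, the rule of assigning each GCP to a phase by its start time, and the synthetic times are assumptions made for illustration; they are not the procedure or data of Tyack et al. (2011).

```python
def summarize_gcps(gcps, sonar_start, sonar_end):
    """Count GCPs and sum their durations per phase.

    gcps: list of (start, end) times in hours.
    A GCP is assigned to a phase by its start time (an assumption).
    """
    summary = {
        phase: {"count": 0, "total_duration": 0.0}
        for phase in ("before", "during", "after")
    }
    for start, end in gcps:
        if start < sonar_start:
            phase = "before"
        elif start < sonar_end:
            phase = "during"
        else:
            phase = "after"
        summary[phase]["count"] += 1
        summary[phase]["total_duration"] += end - start
    return summary


# Synthetic GCPs (hours since start of monitoring), not real data.
gcps = [(1.0, 1.5), (3.0, 3.75), (10.0, 10.25), (20.0, 20.5), (22.0, 23.0)]
summary = summarize_gcps(gcps, sonar_start=8.0, sonar_end=16.0)
for phase, stats in summary.items():
    print(phase, stats["count"], stats["total_duration"])
```

Because the monitored periods in the study were of comparable length, counts and durations can be compared across phases directly; with unequal periods, rates per hour would be the more appropriate quantity.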
The combined results of these experimental and observational methods provide a unique perspective into the nature of beaked whale responses to active military sonar. As shown in Fig. 1 (from Tyack et al. 2011), beaked whales in all 3 scenarios clearly responded to sonar exposure, but each method provides different kinds of insights. For the beaked whale exposed to simulated sonar (upper left panel), individual dive behavior was significantly affected. The animal ceased vocalizing early in the dive and demonstrated a long shallow ascent moving horizontally away from the sound source. This effect was more pronounced following subsequent exposure to killer whale signals, after which the animal completely avoided the area. Similarly, the individual tagged with a satellite-linked transmitter (upper right panel) moved tens of kilometers away from the range area during sonar operations, but returned several days following sonar cessation. Finally, passive acoustic monitoring of the entire range to document GCPs before, during, and after active sonar (bottom panel) demonstrated either a complete cessation of vocalizations or, more likely, a temporary avoidance of the sonar use area on the range.
The AUTEC-BRS provided both a unique example of integrating complementary measurements on different time and space scales, and key lessons for subsequent cetacean sonar CEEs. By analyzing behavior and potential responses at the individual level at very high resolution but short duration, as well as coarser resolution but longer duration for fewer data types, it is evident that the responses of Blainville's beaked whales to sonar include avoidance of the sound source as well as changes in vocal and diving behavior. By comparing vocal activity in a large area, analyses on the local population level also suggested a general avoidance of sonar use areas. Each approach has strengths and limitations, but by integrating them, Tyack et al.

MED-09
The AUTEC-BRS project conducted sonar CEEs with beaked whales within a naval range where animals were routinely exposed to sonar without documented stranding events and where a hydrophone array could detect clicking beaked whales in real time. This setting was selected to minimize the risk to animals while maximizing the probability of experimental success. Once this project successfully and safely conducted CEEs that measured beaked whale responses to sonar, there was a high priority for a similar study with other beaked whales in the Mediterranean Sea area where most sonar-related strandings had been documented. In order to develop research capabilities critical for such a study, the Sirena08 cruise tagged Cuvier's beaked whales Ziphius cavirostris in the Mediterranean and monitored whales in real time, using a hydrophone array towed from a very quiet ship (Pavan et al. 2010). This cruise succeeded in using the hydrophone array to detect and track Cuvier's beaked whales, link detections with visual sightings, and to tag an individual whale in the Alboran Sea (Haun et al. 2008). Given this success, the MED-09 study was designed to study Cuvier's beaked whale responses using high-resolution tags and simulated sonar CEEs. Given the high priority assigned to working with Cuvier's beaked whales, the team maintained a focus on this species, even when conditions made this difficult. During the 39 operational days at sea, this cruise was unable to tag Cuvier's beaked whales due to a combination of weather and difficulties in approaching whales closely enough. While a singular focus on beaked whales was maintained in MED-09 because of their high research priority, these challenges led to the evolution of a broader taxonomic approach in later studies, focusing on more species, and a mode of operation that was less dependent on 1 large research vessel (Southall et al. 2012).

The 3S experiments
The 3S project comprised an international collaboration of scientists from Norway, the UK, the USA, and the Netherlands to study the effects of sonar on cetaceans and fish in Norway. Across many NATO countries, new 1−2 kHz sonars were becoming operational (Kvadsheim et al. 2007), and there was concern that these new systems might have greater effects on some species than the higher frequency (6−8 kHz) sonars that had been primarily used up to that point in European waters. Specific events involving killer whales in both Norway (Vestfjorden; WWF-Norway 2001) and the USA (Haro Strait; Southall & Gentry 2005) raised specific concerns of how killer whales might respond to sonar in ways that could adversely affect whale-watching of this iconic species. Within Norway, specific concern also existed that sonar might affect the presence of herring (WWF-Norway 2001), a key component of Norway's multibillion dollar fishing industry.
The initial phase of 3S was thus focused on killer whales Orcinus orca and their prey, herring Clupea harengus, and a motivation to compare sensitivity of killer whales and herring to the new 1−2 kHz sonar relative to existing European naval sonars operating in the 6−8 kHz band. Measurements of hearing in killer whales suggested they have approximately 30 dB lower sensitivity in the 1−2 kHz frequency band than in the 6−7 kHz band (Hall & Johnson 1972, Szymanski et al. 1999). In contrast, herring were predicted to be sensitive to 1−2 kHz sounds but much less sensitive to 6−7 kHz (Enger 1967, Mann et al. 2005). While it is generally accepted that a sound must be audible to induce behavioral responses, the effect of exposure sensation level or perceived loudness on strength or probability of response remains poorly understood.
Initial 3S CEEs with killer whales and herring occurred in November 2006 when large numbers of herring and killer whales gathered in Vestfjorden, Norway. Observing herring with a bottom-mounted echosounder, 6 experiments were conducted over 9 nights comprising 42 exposure sessions. Results indicated that herring did not respond to either 1−2 or 6−7 kHz sonar signals, but did respond to playback of killer whale sounds (Doksaeter et al. 2009). In contrast, just 2 controlled exposure experiment sessions (one using 6−7 kHz sonar at a maximum source level of 197 dB re 1 µPa at 1 m and the other using 1−2 kHz sonar at a maximum source level of 209 dB re 1 µPa at 1 m) were conducted with killer whales (Kvadsheim et al. 2007). A NATO fleet exercise (FLOTEX Silver 2006) took place in the same waters of Vestfjorden during the same period. A retrospective study of killer whale sightings from 2002 to 2008, including research surveys, concluded that declines in killer whale sightings were most strongly correlated with the decline in herring biomass, but that sonar usage in a low herring biomass year (2006) likely had some impact on killer whale distribution (Kuningas et al. 2013).
The 3S study period was changed from winter to summer in 2008 to maximize cetacean sightings and available daylight. As killer whales are more dispersed in summer, sperm whales Physeter macrocephalus and long-finned pilot whales Globicephala melas were added as subject species. In the second phase of the 3S study (3S2), only the 1−2 kHz sonar was studied further, and species studied included the humpback whale Megaptera novaeangliae, minke whale Balaenoptera acutorostrata, and northern bottlenose whale Hyperoodon ampullatus.
The core experimental protocol was to escalate received exposure dose (quantified as received level) during 30 to 60 min of controlled exposure to sonar signals from a towable, operational military sonar source (Socrates, TNO). Following a pre-exposure baseline, an exposure session was initiated with the source vessel positioned 8 to 10 km from the tagged whale. Transmission source levels started at approximately 150 dB re 1 µPa at 1 m, and were increased to the full source level of 209 dB re 1 µPa at 1 m over 5 to 20 min. Except for the single northern bottlenose whale tested (Miller et al. 2015), additional escalation of the sound level received by the whale was achieved by moving the source vessel toward the whale position, continuously updated by sightings from a separate observation vessel. The source vessel heading was fixed and no further maneuvers were made once the vessel reached a distance of 1 km from the whale, allowing avoiding animals to move away from the oncoming vessel. Using this method, received levels varied from very low levels of < 90 dB re 1 µPa to as high as 180 dB re 1 µPa (Wensveen et al. 2015).
Experiments initially consisted of both 1−2 and 6−7 kHz hyperbolic upsweep sonar signals. Paired presentations within each CEE sought to control for expected between-whale variation in response, while accounting for presentation order by alternating which signal was presented in the first session. Additional control and sonar exposures included no-sonar control sessions to account for the possible effect of the approaching vessel itself, 1−2 kHz hyperbolic downsweep exposures, and playback of killer whale sounds at natural source levels (Miller et al. 2011). No-sonar control sessions were identical to exposure sessions, except that no sonar signal was transmitted. Initially, no-sonar control sessions were conducted randomly within the exposure schedule for each experiment (Miller et al. 2011). However, in 3S2, a no-sonar session was conducted as the first session for each experiment, except for the bottlenose whale, to test the effect of vessel approach alone before there was any possibility of the whales becoming sensitized to the signature of a vessel that had recently transmitted sonar.
Two broad classes of analytical procedures were used. Inspired by dose-escalation ('titration') procedures used in phase I clinical trials in human medicine (Simon et al. 1997), diverse analyses identified whether responses occurred during each session, and the received acoustic level associated with the response was used as a 'response threshold' datapoint. Thus, each exposure session was evaluated on a case-by-case basis using quantitative multivariate time-series break-point analyses (e.g. Mahalanobis distance; Antunes et al. 2014, Miller et al. 2014, 2015) as well as expert evaluation of the complex datasets recorded for each exposure session (Miller et al. 2012, Sivle et al. 2015). The broad range of received levels from the dose-escalation exposure sessions was important as it enabled identification of behavioral response onset, and lack of any response during a session could be treated using right-censored data methods, which are also standard in medical trials. Relevant covariates for each session (e.g. subject identity or behavioral state, sonar frequency, session order) enabled explicit evaluation of how such contextual variables influenced threshold and quantification of between- and within-animal variability in response thresholds.
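The break-point idea described above can be illustrated with a minimal sketch. It assumes only 2 behavioral metrics per time window (hypothetically, swim speed and dive depth), a baseline sample taken before exposure, and a fixed distance cutoff; the function names, variables, and cutoff are illustrative assumptions and do not reproduce the windowing or metric sets of the cited analyses.

```python
import math
from statistics import mean


def mahalanobis_2d(point, baseline):
    """Mahalanobis distance of a 2-D point from a baseline sample.

    `baseline` is a list of (x, y) pairs recorded before exposure.
    """
    xs = [p[0] for p in baseline]
    ys = [p[1] for p in baseline]
    mx, my = mean(xs), mean(ys)
    n = len(baseline)
    # Sample variances and covariance of the baseline period.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in baseline) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form using the explicit inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(d2, 0.0))


def response_threshold(baseline, exposure, received_levels, cutoff):
    """Received level (dB) of the first exposure window exceeding the cutoff.

    Returns None when no window exceeds the cutoff, i.e. the session is
    right-censored at the maximum received level.
    """
    for point, level in zip(exposure, received_levels):
        if mahalanobis_2d(point, baseline) > cutoff:
            return level
    return None


# Hypothetical values: (swim speed in m/s, dive depth in m) per window.
baseline = [(1.0, 10.0), (1.2, 12.0), (0.8, 9.0), (1.1, 10.5), (0.9, 11.0)]
exposure = [(1.0, 10.5), (1.1, 11.0), (3.0, 2.0)]
levels_db = [100, 120, 140]
print(response_threshold(baseline, exposure, levels_db, cutoff=3.0))
```

A session that returns None contributes a censored observation rather than a threshold, which is what allows non-responding animals to remain in the analysis.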
Using this response threshold identification approach, it was possible to quantify the probability of a particular response occurring relative to a specific received sound level (Fig. 2) for these experimental contexts. Initial probabilistic dose-response curves were reported for avoidance in killer whales (Miller et al. 2014) and long-finned pilot whales (Antunes et al. 2014). For killer whales, response thresholds were generally lower for 6−7 kHz than for 1−2 kHz sonar, but no order effect was observed. However, the sonar frequency effect was not statistically significant, so a single dose-response curve for both frequencies was derived, with considerable uncertainty resulting from the relatively high levels of between- and within-animal variability in response. A broader analysis applying expert-identified behavioral response severity was used to derive dose-response functions for low, moderate, and severe responses using Cox proportional hazard models (Harris et al. 2015). In the model selection, species, sonar frequency, and behavioral context prior to exposure were included, so that more specific dose-response functions could justifiably be derived (Fig. 3). Killer whales were found to be more likely to respond to sonar in these exposure contexts at lower received levels than sperm whales or long-finned pilot whales.
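To show how response thresholds and right-censored sessions can be combined into a probability-of-response curve, the sketch below uses a Kaplan-Meier style estimator with received level in place of time. This is a simpler nonparametric stand-in chosen for illustration, not the Bayesian or Cox proportional hazard models used in the cited studies, and the observations are invented.

```python
def dose_response_km(observations):
    """Empirical dose-response curve from possibly censored thresholds.

    `observations` is a list of (level_db, responded) pairs. For a
    responding session, level_db is the response threshold; for a
    non-responding session, it is the maximum received level reached
    (a right-censored observation).

    Returns (level_db, cumulative probability of response) pairs at
    each level where a response was observed.
    """
    event_levels = sorted({lvl for lvl, responded in observations if responded})
    not_yet_responded = 1.0
    curve = []
    for lvl in event_levels:
        # Sessions still "at risk": no response or censoring below this level.
        at_risk = sum(1 for l, _ in observations if l >= lvl)
        events = sum(1 for l, r in observations if l == lvl and r)
        not_yet_responded *= 1.0 - events / at_risk
        curve.append((lvl, 1.0 - not_yet_responded))
    return curve


# Hypothetical sessions: 3 responses and 2 censored non-responses.
sessions = [(120, True), (140, True), (150, False), (160, True), (170, False)]
for level, prob in dose_response_km(sessions):
    print(level, round(prob, 2))
```

Censored sessions never add a step to the curve, but they remain in the at-risk count up to their maximum level, so they lower the estimated response probability at the levels they tolerated.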
In addition to the session-by-session analyses used to derive dose-response functions, several analyses have treated the exposure period as a block in a more traditional before-during-after design. Analysis of expert-identified behavioral responses indicated significantly more severe responses to actual sonar exposures than no-sonar control approach sessions (Miller et al. 2012). A hidden-state analysis of sperm whale behavior identified increases in non-foraging, non-resting 'silent-active' behavior of sperm whales in response to 1−2 kHz sonar and playback of killer whale sounds, but no such change for 6−7 kHz sonar playbacks, no-sonar vessel controls, or a sonar incidentally recorded at lower levels on 3 non-experiment tag records (Fig. 4; from Isojunno et al. 2016).
The 3S2 experiments conducted 1−2 kHz sonar CEEs with 11 humpback whales, as well as a single experiment for each of the difficult-to-tag minke and northern bottlenose whales. Severity scores of expert-identified responses indicated that both the minke and bottlenose whales showed high-severity behavioral responses to 1−2 kHz sonar, while humpbacks demonstrated less severe responses for higher received levels (Sivle et al. 2015). The individual northern bottlenose whale switched from typical surface behavior between deeper foraging dives to avoidance behavior and conducted the deepest and longest dive recorded for the species, during which no feeding-related echolocation sounds were produced (Miller et al. 2015). Acoustic and visual detections of other whales in the area decreased substantially after the CEE, suggesting that other individuals were also likely affected.
Humpback whales were studied using similar methods, but the protocols were specifically designed to enable measurement of low sonar levels received by the whale to test how behavior was potentially affected by adding a 'ramp-up' procedure prior to full-level transmissions. Results indicated a non-significant effect overall, but that levels were more strongly reduced (−6 dB for maximum sound pressure level, SPLmax) for whales that avoided the approaching sonar source (Wensveen 2016). Consistent with theoretical modeling (von Benda-Beckmann et al. 2014), the effectiveness of ramp-up to reduce received sound levels depended critically upon responsiveness. This humpback study demonstrated that experimental tests of mitigation procedures could be a useful application of future CEEs and that ramp-up can reduce exposure levels in some conditions.

SOCAL-BRS
The Southern California Behavioral Response Study (SOCAL-BRS) began in 2010 and evolved from the AUTEC-BRS and MED-09 studies, sharing similar methodological approaches and objectives. Like the other marine mammal and sonar CEEs, SOCAL-BRS involves an interdisciplinary, multi-team collaboration of researchers, although specific measures were implemented to reduce the overall size and cost of the effort while increasing sample sizes of experimental subjects for some species (see Southall et al. 2012 and the following paragraphs). SOCAL-BRS was designed to increase understanding of marine mammal reactions to sound and provide a more robust scientific basis for estimating the impact of specific naval active sonar systems operating in the approximately 3−4 kHz range, systems which are used by a number of navies. The overall species focus was 3-fold: particularly sensitive species (beaked whales), endangered species (baleen whales and sperm whales), and common species (delphinids). Species from all 3 groups were successfully tagged and included in CEEs, providing considerable base-
The AUTEC-BRS was successful in demonstrating the nature of Blainville's beaked whale responses to simulated and actual military sonar using complementary methods on very different time and space scales. However, the total sample size was low relative to the total effort and cost of the CEE aspects of the study, and results for other species were limited. The US Navy and NOAA still had a pressing interest in obtaining data on additional marine mammals, including endangered species that are regularly exposed to sonar as well as Cuvier's beaked whales Z. cavirostris, the species most commonly represented in sonar-associated strandings. SOCAL-BRS was planned for areas off southern California with both a wide diversity and abundance of marine mammals and regular sonar operations. As described in greater detail by Southall et al. (2012), the overall approach was to apply similar sonar CEE methods to new species, but to streamline field teams and maximize flexibility by adjusting the species focus based on in situ distribution of animals and weather conditions.
A key enabler was the development of a small, hand-deployable vertical line array sound source capable of generating high source levels (up to 212 dB re 1 µPa at 1 m) but not requiring a large, oceanographic research ship. SOCAL-BRS built on the Tyack et al. (2011) approach, as well as successes and lessons learned in MED-09 and 3S which helped define goals and objectives. The MED-09 project demonstrated the risks of focusing exclusively on a high priority species such as Ziphius that requires a rare combination of special conditions for tagging to succeed. The 3S project demonstrated the benefits of combining a focus on such high priority species that were difficult to tag with lower priority species that were easier to tag. These insights and capabilities enabled the use of a smaller centralized research platform supporting fast, independent tag boats capable of covering large areas to locate, tag, and track several different species. This evolution was extremely effective in enabling SOCAL-BRS to use smaller, more agile teams able to work adaptively with more species, resulting in sample sizes of dozens for many focal species as opposed to much smaller sample sizes in earlier projects. Sample sizes for beaked whales remained small given the inherent difficulty in locating and tagging them; however, this adaptive approach enabled focus on these species when conditions allowed but alternative options when they did not. Additional methodological advances included using active fisheries acoustics to measure prey distribution for feeding baleen whales, resulting in key insights into baseline foraging ecology (e.g. Goldbogen et al. 2013) and the context-dependent nature of behavioral responses (Friedlaender et al. 2016).
While some SOCAL-BRS data were still being collected and analyzed at the time of this review, many results were already published, notably for blue and beaked whales. Goldbogen et al. (2013) provided the first analyses for blue whales, demonstrating that for a subset of individuals, changes in a suite of diving behaviors occurred during some CEEs with both simulated sonar and pseudorandom noise, but that individual behavioral state was critically important in determining response probability. As shown in the example in Fig. 5 for simulated sonar, the composite dive response (a suite of associated dive metrics within the principal component analysis) changed as a function of sonar exposure for animals engaged in deep (> 50 m) feeding behavior much more than for those in shallow or non-feeding conditions. Goldbogen et al. (2013) did not analyze individual responses quantitatively. However, specific examples demonstrated the nature of response associated with the general response patterns described above. Some individuals engaged in shallow feeding behavior demonstrated no clear changes in diving or movement even for quite high (~160 dB re 1 µPa) received levels for exposures to 3−4 kHz sonar signals (Fig. 6A). Other individuals exhibited clear responses, including altered dive behavior and increased speed at exposure onset with much lower (~100 to 140 dB re 1 µPa) received levels of both pseudorandom noise (Fig. 6B) and sonar (Fig. 6C) signals.
Subsequent analyses have further illustrated the importance of measuring and directly incorporating key contextual variables in response analyses. Friedlaender et al. (2016) demonstrated a 5-fold increase in the ability to quantify variability in blue whale diving behavior when prey distribution and density variables were incorporated into across-individual analyses. These results provided the first integration of direct measures of prey distribution into the analyses of potential responses to sound exposure in feeding marine mammals and illustrated that responses evaluated without such measurements for foraging animals may be misleading. Additional studies are expanding on the findings of Goldbogen et al. (2013 Miller et al. (2015) in demonstrating clear, strong, and pronounced behavioral changes, including sustained avoidance with associated energetic swimming and cessation of feeding behavior at quite low received levels (~100 to 135 dB re 1 µPa) for exposures with relatively nearby (2 to 5 km) sound sources in simulated sonar CEEs.
In terms of exposure to actual military sonar, limited beaked whale data from SOCAL-BRS represent the only high-resolution, multi-sensor measurements of detailed behavior in these instances presently available. As evident in Fig. 7, the 2011 subject was incidentally exposed to distant actual sonar at the onset of the tag deployment. However, no measurements were made before these exposures and this was an observational, uncontrolled exposure with key aspects (such as source-animal range and orientation) unknown, limiting the conclusions that may be drawn. Nevertheless, the observed lack of response of this individual is similar to measured behavior of a Cuvier's beaked whale in the SOCAL-BRS CEE conducted with an actual US Navy AN/SQS-53C sonar system (see Southall et al. 2014). The limited observational and CEE results to date with operational navy vessels suggest that additional contextual variables (e.g. exposure range) may also mediate the probability of response, possibly along with received level. This is one of a number of key questions regarding factors influencing response probability that require considerable additional focused research (discussed below).

General conclusions from recent sonar-related CEEs
The above studies represent significant progress in experimental methods to measure behavioral responses (or lack thereof) of various cetaceans to different sonar signals on very fine scales in a causative rather than correlative manner. This is possible because of the use of a controlled, experimental approach with high-resolution measurement tags. No-sonar control sequences have been particularly useful to evaluate potential changes relative to natural variability in behavioral patterns and to test for potential experimental effects such as source movement. Escalating received levels through sound source movement and/or source level escalation has been important for achieving a broad range of received levels needed to derive robust exposure-response curves. However, each mode of dose escalation may differentially affect whale response.
The stationary source form of dose escalation does not duplicate all features of exposure to sonar from a moving naval vessel, but may simulate some aspects of an approach given the rapidly increasing levels used. The escalation methodology using an approaching source is more realistic, and likely increases the relative probability of response compared to situations where the source is stationary or moving away. The resulting sample sizes of active sonar CEEs with cetaceans have been relatively small for some, but not all, species. However, many useful and very timely results with practical management implications have been generated, and several broad conclusions are evident.
Clearly identifiable behavioral responses to sonar exposure have been measured in some, but not all, individuals of all species tested. While many, but not all, responses have been relatively mild and/or brief, it is important to note that these were short-term experiments intentionally designed not to harm subjects, but rather to identify the onset and nature of different behavioral responses. Common responses have included avoidance of the area of sonar exposure, either directed away from a stationary source or, often, perpendicular to the track of an approaching vessel (Miller et al. 2012, Goldbogen et al. 2013). Cessation or modification of vocal behavior during sonar exposure has been shown in multiple studies (e.g. DeRuiter et al. 2013b, Miller et al. 2012, 2014, Alves et al. 2014). Additionally, cessation of foraging has been documented in multiple studies, based on acoustic (e.g. Miller et al. 2012, DeRuiter et al. 2013a) or kinematic (e.g. lunges) indicators of feeding (Goldbogen et al. 2013, Friedlaender et al. 2016). The responses observed during and after relatively short (tens of minutes) exposure periods are not thought to have produced significant or permanent adverse effects, an important aspect of the study design. For example, subject md07_245 from the Tyack et al. (2011) study, which showed a strong and prolonged response, has been sighted numerous times in subsequent years. However, as noted, these experiments were deliberately designed to demonstrate the onset of response and not to produce adverse effects. Nevertheless, these kinds of sub-lethal behavioral changes may still have significant energetic and physiological consequences given sustained or repeated exposure to these kinds of stimuli, something which is certainly a common occurrence in some high-use sonar areas. The severity of behavioral responses has been considered within some studies (e.g. Miller et al. 2012, Sivle et al. 2015) using an adaptation of the Southall et al. (2007) response severity scaling.
Both the AUTEC and 3S programs conducted playback of killer whales as a positive control to sonar exposures. These killer whale playbacks were useful for demonstrating the ability of the observation systems to detect behavioral responses (i.e. Doksaeter et al. 2009, Miller et al. 2012) and to provide a useful biologically relevant 'yardstick' against which to compare responses to sonar (Curé et al. 2012, 2013, 2015). Animals are predicted to have evolved behavioral responses to predators (cessation of foraging, avoidance) that may be costly in the short term provided they are effective at reducing the risk of being killed (Frid & Dill 2002). The killer whale playbacks made it possible to test the hypothesis that the response(s) documented to sonar exposure resemble(s) a predator threat response (Tyack et al. 2011, Miller et al. 2012). The overall nature of behavioral responses of sperm whales and humpback whales to sonar was quite similar to responses to playbacks of mammal-eating killer whale sounds recorded in places other than where they were played back (Miller et al. 2012, Curé et al. 2013, 2015, 2016, Sivle et al. 2015, Isojunno et al. 2016). Long-finned pilot whales responded strongly and consistently to playback of herring-feeding killer whale sounds, but responses indicated a mobbing response (Curé et al. 2012), which was not observed during sonar exposures. Responses similar to mobbing behavior have been documented for subsequent dedicated playback experiments using feeding-associated sounds produced by unfamiliar killer whales (P. J. O. Miller unpubl. data), indicating that this may be a common strategy of long-finned pilot whales responding to killer whale presence (De Stephanis et al. 2015). These results suggest that cetaceans respond to sonar in a manner that may be shaped by anti-predator adaptations and association, but that sonar sounds are not categorized as so intense a threat or competitor as actual killer whale sounds.
While the probability of response to sonar varied considerably within and across the marine mammals tested, one particular generalization across the various taxa tested appears clear. Namely, individuals of all 4 beaked whale species tested, Mesoplodon densirostris (Tyack et al. 2011), Ziphius cavirostris (DeRuiter et al. 2013a), Berardius bairdii (Stimpert et al. 2014), and Hyperoodon ampullatus (Miller et al. 2015), have shown particular sensitivity in terms of the probability of response (100% of individuals tested have responded strongly to simulated sonar), very low received levels at the onset of behavioral response, and the relative strength and sustained duration of individual responses, compared to some other cetaceans such as blue whales (Goldbogen et al. 2013) and pilot whales (Antunes et al. 2014). However, the responses of some species (notably Baird's beaked whales) were not markedly stronger than those documented in some other non-beaked whale cetaceans such as killer whales (Miller et al. 2014). Individuals of all 4 beaked whale species did respond with cessation of echolocation-based foraging and extended dive durations consistent with avoidance reactions. These collective results appear categorically different from those of most other cetacean species, where more varied responses and, in some cases, a lack of any apparent behavioral response even for relatively intense exposures have been observed. The strong and consistent nature of behavioral responses measured in beaked whales to date is consistent with earlier observations based on stranding events that they may be more sensitive to noise exposure (e.g. Cox et al. 2006). The strong response for a bottlenose whale measured by Miller et al. (2015) was particularly striking in that it resulted from a 1−2 kHz sonar signal, which is even further from the species-typical sound production than the 2−7 kHz sonars in the other studies, presented over 5 km from the source in an area where operational military sonars seldom occur. While the CEE results for beaked whales now span 4 different species in different parts of the world, these studies have all involved small sample sizes (1 to 2 individuals per species), and additional experimental work is clearly needed. Some individuals of non-beaked whale species have also demonstrated relatively strong reactions to sonar (e.g. killer whale subject oo09_144a in Miller et al. 2012, 2014; the blue whale example in Goldbogen et al. 2013 in Fig. 6C). Interestingly, the 3S results (Miller et al. 2012, 2014) have demonstrated that some killer whales are also particularly sensitive to sonar compared to other species. This represents an interesting mismatch with the predation-risk theory posed and tested for other species in that they may treat sonar exposure as a general threat to which responses have been shaped by natural responses to the threats posed by predation.
Among the non-beaked whale CEE results, there is considerable variability within and between species, even in the same studies, in terms of the probability and nature of behavioral response. The combined results indicate diversity in the probability of responses within species, some of which may vary because of variation in the received levels and because of differences in context (as suggested by Ellison et al. 2012). Unfortunately, it is not possible to control or measure the large variety of potentially relevant contextual factors. However, contextual variables have been increasingly identified and successfully described within sonar CEEs in terms of their relative contributions to response probability. For instance, within the 3S studies, a Bayesian dose-response analysis quantified a high degree of within- and between-animal variability in the onset of behavioral response (Antunes et al. 2014, Miller et al. 2014). Behavioral state at the start of exposure also appears a relevant consideration. Blue whales feeding on deep, dispersed prey were more likely to change diving behavior and avoid simulated sonar sources than whales feeding at shallow depths on highly concentrated prey (Goldbogen et al. 2013). These findings demonstrate the importance of identifying and accounting for both intrinsic and extrinsic behavioral and exposure variables to fully predict responses to sonar.
Key questions remain regarding potential sources of context-dependent variability that may be important in predicting the probability of response, notably the relative importance of behavioral state and spatial relationships between sound sources and animals. However, significant progress has been made in this regard, guiding future studies and future observational monitoring. There is increasing recognition and quantification of these issues within CEEs, including the recognition that not every complex context interaction must be directly studied. Provided that CEEs involve subjects sharing the same context with the areas, seasons, and populations with which sonar operations typically occur, results can be used to generate probabilistic risk functions that can estimate the responses of these populations without the requirement to quantify all of the context-specific causes for the variability. Studies on focal animals where context can be quantified provide a common basis for comparison and analysis not possible with uncontrolled observational methods. Experimental studies can identify behavioral changes caused by sonar. Once these responses have been defined, they can be used to inform more targeted monitoring programs designed to identify when and where such responses occur during actual sonar exercises.

Applications for assessing potential effects of sonar
The combined BRS results obtained over the past decade have significantly improved our understanding of the types and probabilities of response in marine mammals to simulated and actual naval sonar sources; recent efforts in SOCAL-BRS have begun to explore the responses of cetaceans to controlled exposures of actual naval sonar systems in realistic scenarios, including use of the most intense sources at long ranges. In many conditions, animals do not respond to sonar in any detectable way. However, various behavioral responses have been observed in some conditions; these have been quantified in fine-scale resolution and can be experimentally demonstrated to result from sonar exposure. There is considerable within- and between-species variability in terms of the probability of response, likely related to both intrinsic (e.g. individual differences in threshold for response, behavioral state during exposure) and extrinsic (e.g. spatial orientation of source and receiver) exposure contextual factors. Within the experimental contexts tested, received sound level has been associated with the probability of responses in an experimentally controlled dose-escalation context to generate preliminary, probabilistic dose-response functions (e.g. Miller et al. 2014). In some conditions, individual-specific factors such as behavioral state may interact with exposure variables such as received level to influence the probability of response (e.g. Goldbogen et al. 2013). However, received level remains an important exposure variable for consideration and one that has been central within the regulatory context. Sonar BRS results that include exposure levels and other relevant identifiable contextual variables are clearly needed by navies and regulatory agencies. Current predictions of response probability based on the available, rapidly increasing data provide vastly superior estimates relative to earlier response functions based upon anecdotal, uncontrolled observations, recreations of strandings, and/or captive studies with little relevance to free-ranging animals (Department of the Navy 2008).
Many of the regulatory compliance questions and data requirements facing navies that operate active sonars require sufficient detail on the probability of behavioral response to sonar exposures. This is most directly accomplished through the use of experiments to establish causal links between exposure and responses, with high-resolution measurements with the potential to detect subtle aspects of individual response behavior. Once such responses have been identified, targeted observational methods building on experimental results can be developed to monitor and measure the effects of realistic sonar exercises over the associated predicted ranges. Then monitoring can be developed to measure effects of actual sonar exercises over broader time and space scales for local populations. An excellent example of this comes from Moretti et al. (2014), who used passive acoustic monitoring of echolocation-based foraging behavior in beaked whales. Using these methods coupled with sound propagation modeling, Moretti et al. (2014) derived a received level exposure-response function for the initiation of foraging dives as a function of exposure to actual sonar exercises. The results have associated caveats: the units of analysis involve an unknown mixture of responses from different individuals and repeated measurements from the same individuals, and they may not represent the most sensitive individuals because they are sampled from a range where sonar is commonly used and sensitive individuals may be less likely to be in the range than in other areas of their habitat. However, the clear benefit of this kind of approach is that it builds on both experimental and observational studies within a species (Tyack et al. 2011) to investigate the probability of a particular behavioral response over more representative time and space scales in the context of real naval operations. This combination of high-resolution experimental and coarser-resolution observational methods is a timely and important element of this field of research; we discuss this in greater detail in 'Directions for future research' below. Here we propose a focused and adaptive application of experimental sonar CEE approaches as the fundamental basis for approaching and understanding cetacean behavioral response to inform and direct more targeted and effective observational monitoring of local populations of animals during realistic operations.
The combined results are useful for improving the assessment of exposure-response functions and the type and severity of potential adverse effects of sonar operations. Recent results are already being applied within the ongoing evolution of regulatory compliance processes. However, it remains challenging to assess the relative biological significance and broader ecological and population consequences of short-term experimental measurements of behavioral response onset relative to the much larger and more complex navy training and testing operations. The evaluation of measured response severity is one effort to try to assess the significance of behavioral responses documented in BRS studies. Longer duration of exposure may lead to more severe responses (i.e. longer disruption of behavior or sensitization to repeated exposures) or might, alternatively, result in tolerance or habituation following some initial response. Relative motion and spatial orientation of sound exposure may be particularly important (e.g. Ellison et al. 2012): vessels that consistently approach a whale, as in the 3S protocol, may lead to different patterns of response than vessels moving obliquely or away from an animal (Miller et al. 2012). Clearly the sound sources and exposure context of most of these experiments differ in a number of fundamental ways (e.g. source level, directivity, range, spatial orientation) from those of real-world sonar systems. Indeed, these experiments were carefully designed to minimize risk of stranding while still detecting responses that may be of regulatory and biological significance. Because these contextual factors differ between CEEs and actual sonar exercises in ways that may affect the probability and type of behavioral responses (Ellison et al. 2012), direct extrapolation and application of experimental results to full-scale operational systems need to be considered a fundamental part of the experimental design in future studies, as was done in Tyack et al. (2011) and Moretti et al. (2014) for sonar exercises at AUTEC.
Teasing apart the relationship between physical range from the sonar source to the whale (proximity) and received level of the signal at the whale is a related issue that could be addressed by using sonar systems with a variety of source levels. The initial BRS experiments logically began with lower-power, stationary systems to minimize the potential risk of stranding, particularly with beaked whales, for which lethal consequences of sonar exposure were known (e.g. Southall et al. in press). Later BRS studies have used moderate-power sonar systems that have lower source levels than the most intense naval sonars. For example, the 3S experiments with killer whales used a source level ranging from 197 to 214 dB re 1 µPa at 1 m. These source levels are consistent with some European naval sonar systems, but are much lower than those of the most intense sonars used by a number of navies around the world. The average acoustic avoidance response threshold of 142 dB re 1 µPa was recorded at an average source−whale distance of 3.8 to 4.6 km (Miller et al. 2014). However, a 142 dB re 1 µPa received level from an operational sonar with a source level of 225 dB re 1 µPa at 1 m would occur over much larger distances, up to 60× the distances in these experiments. It remains unknown to what degree the distance from the whale to the source might influence the acoustic thresholds for behavioral response, but DeRuiter et al. (2013a) found that distance to the source was likely an important factor for Cuvier's beaked whales located near an operational navy range. Because of the importance of spatial context of sonar exposure in estimating how many animals may be behaviorally affected by sonar, future research to quantify effects of source distance and movement is strongly recommended. Given that we have now demonstrated that such studies can be safely and effectively conducted, including the ability to measure and quantify the significance of contextual covariates, more direct investigation of the relative significance of spatial context is possible. This is a key area of research within SOCAL-BRS, as described above, along with the deliberate integration of experimental approaches using the most powerful operational active naval sonar systems (e.g. AN/SQS-53C) in the context of actual navy operations.
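The range comparison in this paragraph can be illustrated with a rough calculation. The sketch below is a minimal example, assuming simple geometric spreading in which transmission loss is k·log10(r) and frequency-dependent absorption is ignored. The source does not state the propagation model behind the 60× figure; the function name `range_ratio` and the spreading coefficients (20 for spherical spreading, 15 for an intermediate case) are illustrative assumptions. The source levels are those quoted in the text.

```python
# Illustrative only: how much farther a fixed received level extends
# when source level rises.
# Assumes transmission loss TL = k * log10(r) with no absorption. This is an
# assumption, not the propagation model used in the cited studies.

def range_ratio(delta_sl_db: float, k: float) -> float:
    """Factor by which the range to a fixed received level grows when the
    source level increases by delta_sl_db, for spreading coefficient k."""
    return 10.0 ** (delta_sl_db / k)

OPERATIONAL_SL_DB = 225.0            # dB re 1 µPa at 1 m, from the text
EXPERIMENTAL_SL_DB = (197.0, 214.0)  # 3S source level range, from the text

for sl in EXPERIMENTAL_SL_DB:
    delta = OPERATIONAL_SL_DB - sl
    for k in (20.0, 15.0):  # spherical and intermediate spreading (assumed)
        print(f"SL {sl:.0f} dB, k={k:.0f}: range x{range_ratio(delta, k):.1f}")
```

Under these assumptions, the 28 dB difference between 197 and 225 dB corresponds to a range multiplier of about 25 with spherical spreading and about 74 with the intermediate coefficient, values that bracket the factor quoted in the text. Real propagation conditions and absorption would change these numbers.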
The application of rapidly increasing data from cetacean sonar CEEs to evaluate potential impacts and mitigation efficacy for realistic sonar operations is clearly challenging. Many factors beyond the type and relevance of available data on response probability affect such assessments, such as differential regulatory requirements, paradigms, and distinctions in different jurisdictions and ecosystems. Regulatory assessments must account for operational factors such as the type and nature of active transmissions in terms of overall magnitude, distribution and use patterns, and differential source and signal types. They must also consider the population status of exposed animals and the quality of potential alternate habitat surrounding areas where temporary displacement may regularly occur.
The past decade of research has unquestionably increased the scientific basis for improving regulatory assessments of potential sonar risks to cetaceans. It has also revealed a number of key questions and considerations for ongoing and future research. Studies have identified numerous sources of variability in the probability of response, including species and individual differences and context dependencies for response. Research indicates the need to move beyond step-function thresholds based on received level to empirically based probabilistic exposure-response functions. If the variability of responses makes probabilistic exposure-response functions too broad for effective regulation, further evolution might consider different exposure-response functions for broad contextual categories (e.g. particularly sensitive species, behavioral state differences, or source-receiver range). These should remain distinct from operational mitigation requirements, as it is clearly unrealistic to identify practical methods by which navies could effectively incorporate such contextual variables in their real-world use of sonar.
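The contrast between a step-function threshold and a probabilistic exposure-response function can be sketched as follows. This is a minimal illustration, not the model fitted in any of the cited studies: it assumes that individual response thresholds are normally distributed, uses the 142 dB re 1 µPa average killer whale avoidance threshold quoted earlier as the mean, and takes a purely hypothetical standard deviation of 15 dB. The function names are illustrative.

```python
import math

MEAN_THRESHOLD_DB = 142.0  # average avoidance threshold quoted in the text
ASSUMED_SD_DB = 15.0       # hypothetical spread among individuals (assumption)

def step_response(rl_db: float, threshold_db: float = MEAN_THRESHOLD_DB) -> float:
    """Step function: every animal responds at or above one received level."""
    return 1.0 if rl_db >= threshold_db else 0.0

def probabilistic_response(rl_db: float,
                           mean_db: float = MEAN_THRESHOLD_DB,
                           sd_db: float = ASSUMED_SD_DB) -> float:
    """Normal CDF: fraction of animals whose threshold is at or below rl_db."""
    z = (rl_db - mean_db) / (sd_db * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

for rl in (120, 135, 142, 150, 165):
    print(f"RL {rl} dB: step={step_response(rl):.0f}, "
          f"probabilistic={probabilistic_response(rl):.2f}")
```

The step function predicts no responses just below the threshold, whereas the probabilistic form assigns a nonzero response probability across a wide range of received levels, which is closer to the variability among individuals described in this review.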

Directions for future research
One of the major positive outcomes of the substantial body of recent research discussed here is the resulting interdisciplinary and international collaboration of biologists, engineers, statisticians, and acousticians. Ongoing and future research will require sustained effort and significant financial support from various navies for teams that can tackle these complex problems using novel experimental and analytical methods. Many critical, current, and pressing applied questions require experimentally testing the correct taxa, locations, time periods, and exposure contexts in order to define how these animals respond to relevant exposures. The longer-term objective is to develop science-based methods from these relevant behavioral metrics to monitor critical responses during actual operations in order to understand and predict the potential severity of effects from active military sonar. A number of targeted near-term research needs have been discussed throughout this review, and are summarized here:
(1) Increase the realism of sonar exposures used within experimental studies. Considerable effort is underway to conduct CEEs using the most intense active sonar systems with increasingly realistic exposure conditions, matching the received levels tested with simulated sources by using full-power systems at much greater physical range. These comparisons may support the evaluation of the relative importance of proximity relative to received exposure level for operational sonars.
(2) Evaluate the relative familiarity of experimental subjects with sonar. Many, though not all, active military sonar transmissions around the world occur in relatively concentrated regions around training and testing areas, where most monitoring and research on potential effects has logically occurred. However, animals in these areas may be less sensitive to sonar because of habituation or tolerance, or because sensitive animals may have left the area. This means that predictions derived from areas where sonars frequently transmit may underestimate potential responses for animals in regions where sonar use is less common but still occurs. Additional studies are needed in areas of less regular sonar use.
(3) Select appropriate locations, time periods, and focal species. Species tested have either been thought to be particularly sensitive, endangered or threatened, or very common in areas of sonar operations. Harris et al. (2016) present statistical techniques for deciding when and how to pool different species, contexts, and signal types based upon empirical observations of responses to sonar, but this can be challenging given the large number of different species and observed species-, individual-, and context-mediated differences in response behavior. Subsequent research should evaluate patterns of social structure, susceptibility to predation risk, and key life history parameters to develop generalized models of functional sensitivity categories based on species groups (e.g. beaked whales) or behavioral state (e.g. feeding vs. traveling). These analyses can help to direct decisions about how many subjects of which species in which states are most effective in reducing uncertainty in response functions for specific sites and sources.
(4) Expand spatial and temporal scales of experimental studies to overlap with monitoring methods. Experimental responses need to be considered within the context of broader monitoring observations (as in Moretti et al. 2014) and relative to increasingly detailed, longer-term movement and diving behavior from satellite tags (e.g. Schorr et al. 2014), particularly where sonar exposure is sufficiently documented. Specifically, there is a clear need to apply CEE methodologies in tandem with observational monitoring in conditions where longer-duration or sequential exposures over multiple days can be measured and where relatively high-resolution measurements of individual behavior are possible using measurement devices that can be attached for days to weeks. Evaluations are also needed of both individual fitness parameters (using metrics of health or body condition that can be measured with archival tags) and foraging habitat metrics, in order to evaluate potential population-level effects.
We argue here for an integrated research strategy using complementary experimental and observational methods to further advance our understanding of the potential effects and biological significance of military sonar on cetaceans. Focused work is needed to (1) improve our understanding of the relationship between acoustic exposure and the probability and intensity of response, including key contextual variables (e.g. behavioral state, source-receiver proximity); (2) increase the temporal and spatial scales over which responses can be evaluated with both targeted experimental studies and increasingly informed observational monitoring; and (3) identify the linkages of short- and medium-term behavioral responses to changes in individual vital rates. Given the substantial progress made in the past decade thanks to the efforts and collaborations described here, targeted studies in these areas over the next decade will fundamentally change the way we understand, manage, and mitigate the effects of active sonar on cetaceans.
Acknowledgements. The authors would like to acknowledge the many dedicated and talented people who have done so much hard work in the AUTEC, MED-09, 3S, and SOCAL-BRS projects. While they are too numerous to name individually here, many are represented in the authorship and acknowledgments of the referenced publications from these projects. All studies described were conducted under requisite local, federal, and institutional permits and animal care and use protocols.
. (2011) studied Blainville's beaked whale behavior on and around a navy training and testing range in the Bahamas called the Atlantic Undersea Testing and Evaluation Center (AUTEC).
(2011) were able to demonstrate the general nature of behavioral responses that occurred and the kinds of conditions under which they were likely to begin. Several additional findings from the AUTEC-BRS provide important insight into other aspects of cetacean behavior. DeRuiter et al. (2013b) studied potential vocal responses of delphinids and found little change in pilot whale whistle rates during simulated sonar CEEs but an increased probability of melon-headed whales producing sonar-like whistles. These findings demonstrated differences in responses across species and also context dependence of changes in vocal behavior during sound exposure. Additionally, oceanographic ecological variables (prey distribution and density) have been measured with scientific echosounders (Hazen et al. 2011), setting the stage for future integration of relevant environmental and ecological variables in CEEs with foraging animals.

Fig. 1. Three different analytical methods of studying behavioral responses of Blainville's beaked whales Mesoplodon densirostris to 3−4 kHz military sonar signals on different time and spatial scales (from Tyack et al. 2011). The panel in the upper left (i) shows diving behavior (in m) relative to time of day (in decimal hours local Bahamas time), echolocation clicking (blue line), and exposure to sonar or orca signals (red; received level [RL] at the animal). The upper right panel (ii) shows movement of an individual beaked whale using a satellite tag relative to the AUTEC range (shaded area) in periods (A) before, (B) during, (C) 0 to 72 h after and (D) 72 to 144 h after a US Navy sonar training operation on the range; as well as (E) relative distance from the center of the AUTEC range during this period. The bottom panel (iii) shows the locations of hydrophones detecting beaked whale clicks (red circles) for periods of (A) 20 h before, (B) 23 h during, and (C) 22 h following a US Navy sonar training operation

Fig. 2. Dose-response function for the onset of avoidance of sonar by killer whales Orcinus orca, as a function of sonar received level (sound pressure level [SPL]; from Miller et al. 2014). The solid central line represents the mean, followed by 50%, 95%, and 99% credible interval lines. The dose-response model assumes the signal is audible, but the limited data on the hearing threshold are marked in the figure with small arrows for 1 (left) and 2 kHz (right) signals, respectively. Reproduced from Miller PJO, Antunes RN, Wensveen PJ, Samarra FI, Alves AC, Tyack PL, Thomas L (2014) Dose-response relationships for the onset of avoidance of sonar by free-ranging killer whales. J Acoust Soc Am 135:975-993, with the permission of the Acoustical Society of America

Fig. 3. The probability of a moderate-severity response occurring in (dashed lines) killer whales Orcinus orca, (dotted lines) long-finned pilot whales Globicephala melas and (solid lines) sperm whales Physeter macrocephalus versus received acoustic energy (cumulative sound exposure level [SELcum]) at a signal of (left panels) 6−7 kHz sonar (mid-frequency active sonar, MFAS) and (right panels) 1−2 kHz sonar (low-frequency active sonar, LFAS) and behavioral states of (top panels) feeding and (bottom panels) non-feeding. Mean probabilities are all shown in black, 95% confidence intervals are shown in grey (from Harris et al. 2015)

Fig. 5. Scaled dive response values for blue whales Balaenoptera musculus exposed to simulated 3−4 kHz mid-frequency active sonar (MFAS) in controlled exposure experiments (CEEs), derived using principal component analyses and generalized mixed models (from Goldbogen et al. 2013). Results are from sequential 30 min periods before, during, and after CEEs for a total of 16 blue whales. Error bars represent 1 SD across individuals

Fig. 4. Time budgets of sperm whales Physeter macrocephalus during baseline, incidental sonar exposures (SON05_30), 6−7 kHz mid-frequency active sonar (MFAS) and 1−2 kHz low-frequency active sonar (LFAS) sessions, and no-sonar control sessions (from Isojunno et al. 2016). Note the strong increase in the proportion of time spent in the 'silent active' state during 1−2 kHz LFAS sessions and the corresponding decrease in layer-restricted search (LRS), which is considered to represent foraging. Time budgets for other exposure types were more similar to time budgets observed in baseline periods

Fig. 6. Individual blue whale Balaenoptera musculus behavior during SOCAL-BRS controlled exposure experiments (CEEs) (from Goldbogen et al. 2013). Black lines indicate animal depth (in m, left axis) before and after sound exposure, while blue segments indicate depth during sound exposure. Gray lines show relative animal speed estimated from recorded flow noise. Received levels for each signal transmission during CEEs are shown as red circles (in dB, right axis) and red-dashed lines are estimates of median values for the full exposure period, including some relatively low exposure level periods where they were masked by noise internal to the tag. Results are shown for CEEs with (A) a shallow feeding individual exposed to 3−4 kHz sonar, (B) a deep feeding individual exposed to pseudorandom noise, and (C) a non-feeding individual exposed to 3−4 kHz sonar

Fig. 7. Individual Cuvier's beaked whale Ziphius cavirostris behavior during SOCAL-BRS controlled exposure experiments (CEEs) (from DeRuiter et al. 2013a). Results are shown for individuals tested in (A) 2010 and (B) 2011. For each individual, the top panel shows animal depth (left axis) and received levels (dB re 1 µPa) (red line) for each 3−4 kHz mid-frequency active (MFA) sonar transmission (right axes), with (light blue lines) echolocation clicking periods; lower panels show integrated response intensity using a Mahalanobis distance statistic that integrates multiple response variables to a univariate difference metric over time before, during, and after CEEs. The whale in 2011 (panel B) was also incidentally exposed to much lower levels of sonar (shown in dark blue) from other, uncontrolled sources during periods before and after received signals from the CEE (red)
Other sensors (e.g. Argos, GPS; see Costa et al. 2010) or visual focal-follow methods describing individual or social behaviors measured (e.g. Visser et al. 2014) have been used with archival tags to provide complementary data.