Animal Counting Toolkit: a practical guide to small-boat surveys for estimating abundance of coastal marine mammals

Small cetaceans (dolphins and porpoises) face serious anthropogenic threats in coastal habitats. These include bycatch in fisheries; exposure to noise, plastic and chemical pollution; disturbance from boaters; and climate change. Generating reliable abundance estimates is essential to assess the sustainability of bycatch in fishing gear or any other form of anthropogenic removal, and to design conservation and recovery plans for endangered species. Cetacean abundance estimates are lacking for the coastal waters of many developing countries. Lack of funding and training opportunities makes it difficult to fill these data gaps. Even if international funding were found for surveys in developing countries, building local capacity would be necessary to sustain efforts over time to detect trends and monitor biodiversity loss. Large-scale, shipboard surveys can cost tens of thousands of US dollars each day. We focus on methods to generate preliminary abundance estimates from low-cost, small-boat surveys that embrace a ‘training-while-doing’ approach to fill in data gaps while simultaneously building regional capacity for data collection. Our toolkit offers practical guidance on simple design and field data collection protocols that work with small boats and small budgets, but we expect analysis to involve collaboration with a quantitative ecologist or statistician. Our audience includes independent scientists, government conservation agencies, NGOs and indigenous coastal communities, with a primary focus on fisheries bycatch. We apply our Animal Counting Toolkit to a small-boat survey in Canada’s Pacific coastal waters to illustrate the key steps in collecting line transect survey data used to estimate and monitor marine mammal abundance.

spatially explicit risk assessments (McCann et al. 2006) for human activities (Stelzenmüller et al. 2010), or to determine when populations warrant endangered or threatened species listing or de-listing (Rodrigues et al. 2006). Marine mammal abundance estimates are about to become increasingly important as the USA considers a rule that would ban access to US seafood markets unless countries can demonstrate that marine mammal bycatch is sustainable relative to marine mammal population size (Young & Iudicello 2007, Williams et al. 2016a).
A major drawback of using surveys to generate reliable abundance estimates of marine megafauna is that they usually require expensive ship time. Daily charter rates for government survey vessels routinely run into the tens of thousands of US dollars (https://swfsc.noaa.gov/uploadedFiles/Divisions/PRD/Projects/Research_Cruises/PacMAPPS/PacMAPPS-DevelopingAStrategicPlan.pdf), which puts data collection out of reach for many low-income countries, coastal and indigenous communities, independent scientists and conservation NGOs (Devictor et al. 2010). It is unsurprising that published cetacean density estimates are accessible for only ~25% of the world ocean, and only 5% of the ocean has been surveyed frequently enough to offer any opportunity to detect trends in abundance (Kaschner et al. 2012). Gaps in marine mammal survey coverage are especially striking in the coastal waters of many countries in the Global South (Kaschner et al. 2012).
We consider 2 primary applications that require information on abundance and distribution of marine megafauna: (1) assessing the sustainability of bycatch or other takes incidental to human activities (Wade 1998); and (2) assessing the conservation status of populations (Hoffmann et al. 2008). A recent review of priority research areas for marine mammal bycatch concluded that documenting risk is limited by major gaps in the data available at the population level (Reeves et al. 2013). Secondarily, the systematic effort and sightings data that yield abundance estimates also yield good distribution data, which are useful for marine spatial planning and prioritizing areas to protect in order to meet global biodiversity conservation targets.

Distance sampling: assumptions and challenges for small-boat line-transect surveys
Two methods commonly used to estimate marine mammal abundance are mark-recapture methods that rely on data collected from marked individuals and the family of distance sampling methods, including line-transect surveys (Seber 1982). These methods can be complementary, but they estimate different attributes of a population (Calambokidis & Barlow 2004). Broadly speaking, a mark-recapture estimator samples individuals, whereas distance sampling methods sample area (Fig. 1). A mark-recapture experiment estimates the total number of uniquely identifiable individuals that had a non-zero probability of being in the survey region during the study period, whereas a line-transect survey estimates the average number of individuals that were in the survey region at the time of the survey (Fig. 1). Besides abundance, mark-recapture methods allow estimation of a number of other population parameters (e.g. site fidelity, survival, reproductive rate) that cannot be estimated from a line-transect survey, but such methods apply only to cases in which individuals can be identified reliably on multiple sampling occasions (Hammond et al. 1990, Wilson et al. 1999). Line-transect surveys can generate abundance estimates for many species from a single sampling occasion. We encourage colleagues to consider systematic sampling design (Tyne et al. 2016) and modifying field protocols to include collection of perpendicular distance data (Read et al. 2003) in small-boat, photo-ID studies. At best, combining the 2 approaches will allow researchers to make inferences about the population by comparing the 2 estimates (Calambokidis & Barlow 2004). If funding prevents a second photo-ID field season, following line-transect survey protocols will increase the chances of generating usable results from a single survey.
The term 'distance sampling' describes a family of techniques in which animal abundance in an area is estimated by measuring density along a representative sample of transects (Thomas et al. 2007). We use the term 'transect' to refer to the independent sampling unit placed throughout the survey region, and the term 'trackline' to refer to the line followed by the observer from which distances and angles are measured. Sample density is multiplied by the size of the surveyed area from which the sample transect was drawn to derive an estimate of population size:
\[ D = \frac{n}{a \, P_a} \tag{1} \]
where D is density, n is the number of animals (or clusters of animals) observed along the trackline, a is the area surveyed along the transects, and P_a is an estimate of the probability that an animal was seen in the surveyed area (Buckland et al. 2015). Surveyed area, a, is not known with certainty, but it can be estimated by multiplying the length of the transects, L, by twice the effective strip width, μ, the area effectively searched by the observers. Distance sampling assumes that the probability of detecting an animal is highest directly along the trackline, and drops off with increasing perpendicular distance from the trackline. A detection function is used to model the probability of detection as a function, g(x), of perpendicular distance. The probability density function (PDF) of perpendicular distances to detected objects, f(x), is the detection function g(x) rescaled so that it integrates to 1. In conventional distance sampling (CDS), trackline detectability is assumed to be certain (100%). The effective strip width, μ, is defined such that the number of animals detected beyond μ is equal to the number of animals missed within μ. The program Distance has a number of built-in functions (e.g. half normal, hazard rate) to solve for the parameter μ (Buckland et al. 2001, Thomas et al. 2010).
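The relationship between the detection function, the effective strip width and density can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used by the program Distance: it assumes an untruncated half-normal detection function g(x) = exp(−x²/(2σ²)), for which the maximum-likelihood estimate of σ and the integral of g(x) have closed forms, and it uses the standard conventional estimator D = n/(2μL), in which the effectively searched area 2μL plays the role of a·P_a. The function names and all numbers are hypothetical.

```python
import math


def fit_half_normal_sigma(perp_distances):
    """Closed-form MLE of sigma for an untruncated half-normal detection function."""
    n = len(perp_distances)
    return math.sqrt(sum(x * x for x in perp_distances) / n)


def half_normal_esw(sigma, w=math.inf):
    """Effective strip (half-)width: integral of g(x) = exp(-x^2 / (2 sigma^2)) from 0 to w."""
    full = sigma * math.sqrt(math.pi / 2)
    if math.isinf(w):
        return full
    return full * math.erf(w / (sigma * math.sqrt(2)))


def density(n, line_length, esw):
    """Conventional line-transect density estimate, D = n / (2 * mu * L)."""
    return n / (2 * esw * line_length)


# Hypothetical example: perpendicular distances (km) for 6 sightings
# along 150 km of trackline. These are NOT data from the survey.
distances_km = [0.05, 0.10, 0.12, 0.20, 0.30, 0.45]
sigma_hat = fit_half_normal_sigma(distances_km)
mu_hat = half_normal_esw(sigma_hat)
d_hat = density(len(distances_km), 150.0, mu_hat)  # clusters per km^2
print(f"sigma = {sigma_hat:.3f} km, mu = {mu_hat:.3f} km, D = {d_hat:.4f} per km^2")
```

Multiplying the density by the size of the survey region (in the same units) would give the corresponding abundance estimate. A real analysis would also consider truncation, alternative key functions, cluster size and variance estimation, which is where collaboration with a statistician becomes valuable.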
Although the analysis uses information on perpendicular distance, it is common in boat-based sightings surveys to record radial distance and angle to animals and convert these to perpendicular distance at the analysis stage. The program Distance allows data to be imported in either perpendicular distance or a combination of radial distance and angle. In a conventional line-transect survey (Buckland et al. 2001), a number of key assumptions are made to ensure that (1) the density measured in a sample of transects is representative of the survey area from which transects were drawn, and (2) animal density is measured accurately in the samples.
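The conversion from radial distance and angle to perpendicular distance is simple trigonometry: x = r·sin(θ), where θ is the angle between the trackline and the sighting. A minimal Python sketch follows; the function name is hypothetical, and it assumes angles are recorded in degrees relative to the bow (0° = dead ahead), with port and starboard treated symmetrically.

```python
import math


def perpendicular_distance(radial_distance, angle_deg):
    """Perpendicular distance from the trackline, x = r * sin(theta).

    angle_deg is measured from the trackline (0 = dead ahead, 90 = abeam).
    The absolute value makes port (negative) and starboard angles equivalent.
    """
    return abs(radial_distance * math.sin(math.radians(angle_deg)))


# A sighting 400 m away at 30 degrees off the bow lies about 200 m from the trackline.
print(round(perpendicular_distance(400.0, 30.0), 1))
```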
First, although rarely stated explicitly, there is an implicit assumption in any conventional distance sampling study that samples (lines or points) are distributed in such a way that average animal density in the sample is representative of animal density in the region. This is accomplished by placing lines or points randomly, or systematically (with a random start point), throughout the survey region. Good survey design promotes accuracy of sample density. Replication promotes precision in the density estimated from the sample, and statisticians generally recommend 15−20 transects per stratum (Thomas et al. 2007). Of course, the desired level of precision is a function of the intended purpose. Our toolkit approach can be used for a pilot study to inform a power analysis (Gerrodette 1987).
Fig. 1. A hypothetical distribution of whales during a line-transect survey conducted within the area demarcated by the grey box. Line-transect surveys sample an area (i.e. the area within the grey box, in this example) and use density along the tracklines to estimate the average number of individuals in the surveyed area at the time of the survey. Sample density along the transects is converted to abundance by multiplying by the size of the area. In contrast, mark-recapture surveys of individually recognizable ('marked') individuals sample animals, rather than area. Mark-recapture methods estimate the number of individuals available to be 'captured' during the study (i.e. animals whose travel paths, in grey dashed lines, crossed into the box). A mark-recapture abundance estimate therefore includes animals that may move in and out of the area while the survey is being completed. Thus, as long as all assumptions have been met, if there is movement into and out of the survey area, the simplest mark-recapture methods are likely to produce larger estimates than line-transect methods, because the 2 approaches estimate different population attributes.
Second, objects on the trackline (perpendicular distance from trackline = 0) are detected with certainty. This is often termed the 'g(0) = 1' assumption, because data are analyzed with the constraint that detection probability at zero distance is 1 (see Eq. 1). This assumption is often violated in surveys of diving animals, through a combination of availability bias (i.e. animals are underwater and unavailable for detection) and perception bias (i.e. observers missed the animals) (Marsh & Sinclair 1989). Violating this assumption, as our surveys no doubt do, will generate a minimum abundance estimate. It is important to assess at the outset whether a survey is being conducted to estimate absolute abundance or to provide a relative abundance index for detecting trends (Dawson et al. 2008).
Third, objects do not move before perpendicular distance from the trackline has been established and recorded. Field protocols are developed to ensure that observers search well ahead of the vessel, so distance and angle can be recorded before animals have responded to the boat. Satisfying this assumption allows accurate estimation of μ. Responsive movement following detection is not a problem, but responsive movement prior to detection can cause bias (Buckland et al. 2015). If animals are attracted to the boat before distance and angle to the sighting are recorded, density estimates will be positively biased. If animals avoid the boat, density estimates will be negatively biased.
And fourth, perpendicular distances are measured without error. Satisfying this assumption allows accurate estimation of μ. In practice, the methods are generally robust to some random variability, but they are not robust to systematic bias in recording perpendicular distances (Buckland et al. 2015). If observers tend to overestimate distances, then density estimates will be negatively biased. If observers tend to underestimate distances, then density estimates will be positively biased. Gauging the amount of bias to tolerate will depend on the primary objective of the study: abundance or trends (Dawson et al. 2008).
The first assumption is dealt with at the study design stage, whereas assumptions 2−4 are addressed at the data collection stage. Surveys designed to address the first assumption are termed 'design-unbiased,' whereas surveys that violate this assumption may require advanced, model-based methods to address spatial bias in the survey design (Buckland et al. 2007, Thomas et al. 2010, Miller et al. 2013). We designed a systematic sightings survey for a small (6 m) boat (see Materials and Methods) to illustrate how to use free GIS software to define a survey region and design a spatially unbiased survey in the program Distance (Thomas et al. 2010). The automated survey design algorithms in the program Distance allow users to avoid violating the first assumption altogether by creating a design-unbiased survey (Thomas et al. 2007). Although spatial modeling methods are advancing rapidly, they were never intended to salvage spatially biased data (Hedley et al. 1999, Miller et al. 2013).
Violating assumptions 2−4 is common in low-cost, small-boat surveys. With low survey platforms, marine mammals surface fewer times within an observer's field of view than would be the case for observers working on ships with higher viewing platforms. Lower platforms also mean that observers cannot search as far ahead of the vessel as they could from a higher platform and, consequently, observers may first see an animal after it has already approached or avoided the boat. Nearshore surveys in small boats rarely have unobstructed views to the horizon that would allow distances to be measured easily with reticle binoculars. Taken as a whole, violation of assumptions 2−4 can introduce both bias and poor precision in abundance estimates. Given a growing trend toward adapting methods from shipboard surveys for small boats (Dawson et al. 2004, Stensland et al. 2006, Braulik et al. 2012), it is important that these limitations be clearly communicated so that researchers can take proactive steps to maximize the quality of their survey data while also understanding the practical limits to informing population assessments from small-boat surveys. For a region where no information was previously available, a minimum or imprecise abundance estimate may be very useful (Dawson et al. 2008, Williams & Thomas 2009). For example, estimating population size to the correct order of magnitude (e.g. tens of thousands of animals) may be sufficient to evaluate whether a quantified mortality source (e.g. bycatch) is likely to pose a population threat. However, abundance estimates from small-boat line-transect surveys, even when adhering to best practices such as those we recommend here, are unlikely to yield accurate estimates of rare species or reliable inferences about population trends except for the most rapidly declining (or recovering) populations. As stakeholders identify conservation priorities, the accuracy and precision of abundance estimates can always be improved iteratively as data collection methods, sample size and analytical techniques improve over time (Taylor & Gerrodette 1993).
A practical guide for conducting small-boat surveys to estimate population abundance
All too frequently we hear from colleagues who have spent years collecting hard-won field data but cannot generate abundance estimates because effort data are missing, the researchers failed to record zeroes or null findings, or the protocols used for data collection violated some fundamental statistical assumption. The primary objective of our study was to take lessons learned from designing and conducting small-boat surveys for coastal marine megafauna, apply them in western Canadian waters, and distill the methods into a practical guide to data collection that we call our Animal Counting Toolkit. The toolkit includes a series of conceptual guidelines, software resources, and examples of hardware and methods that have worked for our projects over the years. All of the survey design, data collection and analysis projects will be placed online and, over time, supplemented with simple how-to videos. The target audience for the Animal Counting Toolkit is a conservation practitioner trying to fill in data gaps in regions where having even minimum or imprecise abundance estimates would advance discussions about assessing risk or guiding conservation prioritization. Our target audience includes independent or early career scientists (e.g. graduate students), government wildlife or park managers and rangers, First Nations, environmental nongovernmental organizations (ENGOs) and coastal communities. Our study was motivated by a common scenario in which a small ENGO wishes to contribute to 'best available science' in informing some impending conservation or management decision (e.g. assessing sustainability of marine mammal bycatch in fisheries or the proposed construction of a windfarm or pipeline near important marine mammal habitat). We include a case study as a worked example of the process of generating minimum estimates of marine mammal abundance from design through data collection to analysis.
We illustrate our methods for study design, data collection and analysis using a small (6 m) boat survey for marine mammals in the coastal waters off northeastern Vancouver Island, British Columbia, Canada. This survey is used in the present study as a generic case study to illustrate how to (1) design a survey in which animal density measured in the sample of transects is expected to be representative of animal density in the survey region (i.e. it is 'design-unbiased'), and (2) collect field data in a way that satisfies the assumption that animal density is measured accurately. We focus primarily on field survey methods rather than data analysis. At each stage, we illustrate the process using freely available software. All data and projects are available in the Animal counting toolkit file in the Supplement (www.int-res.com/articles/suppl/n034p149_supp.zip). We hope that by following these guidelines, future ecologists may be able to avoid some important analysis pitfalls, and so provide data that are useful for trend analysis or fill knowledge gaps in distribution patterns.
Although we illustrate our toolkit with several marine mammal species as case studies, the fundamental principles we outline apply to many surface-oriented marine top predators, including seabirds, some sharks, sea turtles (Fuentes et al. 2015, Jackson et al. 2015) and sunfish. Recent reviews of seabird bycatch in longline fisheries (Anderson et al. 2011) and global research priorities for seabirds (Lewison et al. 2012) both identified that lack of data on at-sea abundance hinders our ability to assess sustainability of seabird mortality in fisheries. Estimating abundance of sea turtles is easier to do on their nesting grounds than foraging grounds, but sightings surveys can at least facilitate analyses that integrate information on at-sea distribution and spatial extent of anthropogenic threats (e.g. fishing pressure, ocean noise or oil spill risk; Hamann et al. 2010).
Our team includes ecologists and statisticians who have collaborated for many years on designing and conducting surveys to estimate marine mammal abundance. The statisticians have focused on survey design, as well as statistical methods to estimate abundance and infer trends (Buckland et al. 2001, 2007, Thomas et al. 2004, 2007, 2010, Moore & Barlow 2011, Borchers et al. 2015). The ecologists have focused on field and analytical methods to generate abundance estimates from small boats (Williams & Thomas 2007, 2009), dedicated studies whose primary focus was not to estimate abundance (Williams et al. 2011), and platforms of opportunity (Williams et al. 2006). Our own research has benefited from strong collaborations between statisticians and ecologists, but we note that many researchers do not have the resources that can be taken for granted in academic settings in wealthy countries (Gimenez et al. 2013). We have been involved in international collaborations to estimate wildlife abundance and conservation status in countries with little funding for biodiversity monitoring (Moore et al. 2010, Savage et al. 2010, Lewison et al. 2014, Williams et al. 2016b) and in efforts to help colleagues, particularly students, leverage (i.e. make use of existing but unpublished) historical survey data.
Conservation practitioners must define for themselves what they hope to accomplish with their abundance estimate (Dawson et al. 2008). An accurate (i.e. unbiased) estimate may be needed to quantify extinction risk (Mace et al. 2008). Precision (i.e. low variance) may be more important than accuracy for detecting trends; an index of relative abundance can be useful for inferring trends as long as the methods are repeatable and consistent (Yoccoz et al. 2001). A successful survey is one that generates estimates that are fit for purpose.

MATERIALS AND METHODS
The following methods describe the approach we followed in our small-boat survey in western Canadian waters to ensure that (1) the density measured in a sample of transects was representative of the survey area from which the transects were drawn, and (2) animal density was measured accurately along the tracklines.

Defining a survey region
The objective of any line-transect survey is to estimate the number of animals in a survey area at the time of the survey. Defining the boundaries of the survey area becomes particularly important when scaling up the number of animals observed along the trackline to total abundance in the survey area. One of us (E.A.) has conducted a photo-ID study of Pacific white-sided dolphins Lagenorhynchus obliquidens in the region since 2007 (Ashe 2015). To illustrate our Animal Counting Toolkit approach, we used Ashe's core study area to define the boundaries of our line-transect survey area (Fig. 2). We exported her tracks from a handheld Garmin GPS unit and imported them to QGIS (Fig. 2, www.qgis.org). Ashe used the same small boat for the majority of her photo-ID effort (survey tracks, Fig. 2). We used Ashe's search effort to outline a region that we felt confident we could survey using the same small boat, and that would allow us to return each day to our field accommodation (red star, Fig. 1). We downloaded a shapefile of British Columbia from the provincial government's Geospatial Data Downloads website (www.empr.gov.bc.ca/MINING/GEOSCIENCE/MAPPLACE/GEODATA/Pages/default.aspx) and imported both that file and Ashe's GPS tracks to QGIS 2.8 (QGIS Development Team 2015). We created a general outline of the proposed study area using the approximate northern, southern, eastern and western extents of Ashe's tracklines and clipped the study area to define the boundaries of the line-transect survey we wanted to conduct. Because it is more common to find terrestrial, rather than marine, shapefiles, we used QGIS Geospatial tools to create a raster of the entire area shown in the inset of Fig. 2. We joined that layer to the British Columbia shapefile, and scored each cell as a 2 if it was on land, and 1 if it was on water. We clipped out the land, and were left with a shapefile defining only the marine component (the red, irregularly shaped polygon in the inset of Fig. 2). We exported the marine study area to a new shapefile, and used it to design a systematic line-transect survey. Using the Geospatial tools in QGIS, the survey area was estimated to cover 1191 km².

Designing a survey to provide representative coverage of the survey region
Randomization and replication are key elements in any good survey design. We followed the recommendations outlined previously for survey design for complex survey regions (Thomas et al. 2007). We imported the survey area polygon (red, irregularly shaped polygon in the inset of Fig. 2) into the program Distance 6 (Thomas et al. 2010). We designed a survey with 6 km spacing of parallel lines with a randomly chosen start point (using the RAND() function in Excel to choose the longitude of the first transect, and systematic placement of transects 6 km apart thereafter). Parallel lines were chosen over zigzag samplers because they give even coverage, even in complex survey regions (Thomas et al. 2007). The spacing was chosen to be wider than the effective strip widths covered for all species in a multi-species marine mammal survey from a 21 m vessel (Williams & Thomas 2007), while also allowing placement of the recommended 15−20 transects per stratum to give reasonable variance estimates (Thomas et al. 2007). The final survey design (Fig. 3 and see the Supplement files ACT_design.zip and DesignedTransectCoordinates.xlsx) was intended to cover 209 km along 17 transects. The complex coastline led to a survey design that we knew was extremely inefficient, because observers would have to spend a great deal of time navigating around islands, and to and from their home base (near the bottom of Transect 5) each day. We did not see a practical alternative to this, under the circumstances. Efficiency would have been improved had we had access to a live-aboard vessel or a helicopter, but neither was possible given the budget.
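The random-start, fixed-spacing placement described above can be sketched in Python. This is an illustration of the principle only, not the survey design engine in the program Distance: it assumes the survey region has already been projected to planar coordinates in km, it places lines along one axis of a hypothetical 100 km wide bounding box, and it leaves the clipping of each line to the marine polygon to GIS software. The function name and the extent are assumptions.

```python
import random


def systematic_line_positions(x_min_km, x_max_km, spacing_km, rng):
    """Positions of parallel transects: random start, then fixed spacing.

    The first line falls uniformly within one spacing of the western edge,
    which gives every location the same probability of being covered.
    """
    start = x_min_km + rng.random() * spacing_km
    positions = []
    k = 0
    while start + k * spacing_km <= x_max_km:
        positions.append(start + k * spacing_km)
        k += 1
    return positions


# Hypothetical bounding box 100 km wide, 6 km spacing as in the text.
# A fixed seed is used here only so the example is repeatable.
lines = systematic_line_positions(0.0, 100.0, 6.0, random.Random(42))
print(len(lines), [round(x, 2) for x in lines[:3]])
```

In a real design the random start should be drawn once and recorded, so that the design can be documented and reproduced.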
In the field, observers found that they were unable to navigate safely from their base to the easternmost extent of the survey area and back. The observers decided to drop Transects 16 and 17 (not shown), and the southern legs of Transects 13, 14 and 15 from the survey (Fig. 3). Because the parallel line design provided even coverage probability (Thomas et al. 2007), the unsurveyed area could simply be removed from the calculation of abundance. After revisiting the survey design due to safety concerns, the observers felt they could safely cover ~89% of the planned transects (183 versus 209 km) and ~96% of the planned survey area (1140 versus 1191 km²).

Field data collection
Effort and sightings data were collected from 3 to 24 August 2013, using a 6 m fiberglass powerboat with a 115 hp outboard. The vessel steamed at approximately 10 knots (19 km h⁻¹) during searching effort. The team consisted of 2 people. The starboard observer was driving the boat, and the port observer collected effort and sightings data on a Trimble Juno T41 handheld device equipped with a GPS unit and CyberTracker (www.cybertracker.org/) software. Observers followed protocols outlined previously (Williams & Thomas 2007). With the trackline representing 12 o'clock, the starboard observer searched a sector from 11 o'clock (just port of the trackline) to 3 o'clock, and the port observer searched from 9 o'clock to 1 o'clock, scanning continuously. This overlap at the trackline was meant to maximize the chances of satisfying the g(0) = 1 assumption. A customized CyberTracker template was created to allow observers to rotate through a series of screens and toggle commonly used entries to keep track of search status (off effort, on a transect or on effort during a transit leg), transect numbers and sightings. The template is included in the Supplement (CyberTracker TOOLKIT _1.CTX). The CyberTracker template prompted observers to collect data on sighting conditions, but in practice, all search effort had to be conducted in very low sea state (Beaufort sea state 1 or 2), given the size of the boat. Whenever a sighting was made, the port observer entered the sighting in CyberTracker, which automatically assigns a serial sighting number. An angle board mounted on the dash was used to measure radial angle to the animal or centre of a cluster of animals. The person making the sighting was responsible for gauging distance and angle to the first sighting, identifying species, estimating group size and recording behavior. For most sightings, this was done while the boat was still underway. In cases where a second opinion was needed, sightings made by the driver were 'passed' to the port observer to use 7×50 binoculars to estimate group size or confirm species identity. During the time it took for the port observer to complete the sighting record, the driver scanned the entire sector from the port beam to the starboard beam. On the few occasions where the port observer needed assistance, the driver stopped the boat temporarily and the observers worked together to complete the sighting record.
The CyberTracker template prompted observers to collect data on sighting cue (e.g. the animal's body breaking the surface, blow [exhaled breath], seabird activity [i.e. potentially allowing marine mammal detection beyond the horizon]), the animal's behavior (swimming normally, avoid, approach) and its heading relative to the boat (profile, head-on, tail-on or other/unsure), but these data were not used in the analysis. Environmental conditions affecting sightability can be used as covariates in the detection function (Marques & Buckland 2003). It is possible to use information on orientation relative to the boat to assess, quantitatively, whether responsive movement is biasing the abundance estimate (Palka & Hammond 2001).
Accurate distance estimation can be a problem in any sightings survey (Marques et al. 2006). Distance sampling methods are robust to modest levels of measurement error, but not bias (Buckland et al. 2001). In shipboard sightings surveys, ranges can be measured using photogrammetry or reticle binoculars (Hammond et al. 2002). Low platforms can make it difficult to use these methods on small-boat surveys. One solution is to use distance estimation experiments to generate a quantitative relationship for each observer between estimated and true distances using laser rangefinders, radar or photogrammetry, and then to apply that relationship to remove the systematic bias in visual estimates of distance (Williams et al. 2007). These experiments work best with more than 2 people, especially when one is driving the boat. Another solution is to conduct distance estimation training throughout the survey, and this is the approach we used for this illustrative case study. While in transit, and not collecting effort and sightings data, the port observer identified candidate objects (logs, boats, rocks) to use as trials. Both observers estimated distance visually, and then the port observer announced the true distance using a Bushnell Yardage Pro laser rangefinder. These training exercises were conducted daily.
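The first solution, a per-observer calibration from distance estimation experiments, can be sketched as a simple regression of true (rangefinder) distance on visually estimated distance. The sketch below assumes a straight-line relationship fitted by ordinary least squares; the form of the relationship is our assumption for illustration and is not necessarily the model used by Williams et al. (2007). The trial values and function names are hypothetical.

```python
def fit_calibration(estimated, true):
    """Least-squares fit of true = intercept + slope * estimated for one observer."""
    n = len(estimated)
    mean_est = sum(estimated) / n
    mean_true = sum(true) / n
    sxx = sum((x - mean_est) ** 2 for x in estimated)
    sxy = sum((x - mean_est) * (y - mean_true) for x, y in zip(estimated, true))
    slope = sxy / sxx
    intercept = mean_true - slope * mean_est
    return intercept, slope


def correct_distance(estimated, intercept, slope):
    """Apply the observer-specific calibration to a visual distance estimate."""
    return intercept + slope * estimated


# Hypothetical trials (m): this observer underestimates distance by about 20%.
visual = [100.0, 200.0, 300.0, 400.0]
laser = [120.0, 240.0, 360.0, 480.0]
a, b = fit_calibration(visual, laser)
print(round(correct_distance(250.0, a, b), 1))
```

Plotting the paired trials before fitting is a sensible check, because a curved relationship or increasing scatter with distance would call for a different model.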
After dropping the few transects that the observers could not cover safely, there was sufficient boat time to allow observers to cover the remaining transects twice. Total line lengths covered (i.e. twice the original line length, in most cases) were entered into the program Distance for calculating encounter rate and variance. Because observers stayed on effort during transit, a number of additional 'transit-leg' sightings were collected. Transit-leg sightings were scored as taking place on Transect 0. This allowed the transit-leg sightings to be used in fitting the detection function, but not to estimate density (Williams & Thomas 2009).
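The bookkeeping behind this distinction can be sketched as follows: all sightings contribute perpendicular distances to the detection function, but only sightings and effort on designed transects enter the encounter rate (n/L). The record layout, field names and numbers below are hypothetical and do not reproduce the CyberTracker export format.

```python
TRANSIT_LEG = 0  # transit-leg sightings are scored on Transect 0


def detection_distances(sightings):
    """Perpendicular distances from ALL sightings, including transit legs."""
    return [s["perp_km"] for s in sightings]


def encounter_rate(sightings, effort_km):
    """Sightings per km using designed transects only (Transect 0 excluded)."""
    n = sum(1 for s in sightings if s["transect"] != TRANSIT_LEG)
    total_km = sum(km for t, km in effort_km.items() if t != TRANSIT_LEG)
    return n / total_km


# Hypothetical records. Effort is the total length covered per transect,
# i.e. both passes summed where a transect was surveyed twice.
sightings = [
    {"transect": 1, "perp_km": 0.12},
    {"transect": 0, "perp_km": 0.30},
    {"transect": 2, "perp_km": 0.05},
]
effort_km = {0: 40.0, 1: 20.0, 2: 30.0}

print(len(detection_distances(sightings)), encounter_rate(sightings, effort_km))
```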

Analyzing the data to estimate animal density and abundance
Effort and sightings data were exported from CyberTracker as comma-separated value (CSV) files for editing in Excel (see the Supplement, efforts and sightings.xlsx). The most common data entry errors were in recording group size, distance or angle. Because the port observer entered a comment in CyberTracker any time this took place, it was easy to correct those sightings. On a few occasions, the observers went off effort to collect identification photos of Pacific white-sided dolphins to contribute to an ongoing photo-ID study (Ashe 2015). On those occasions, notes recorded by observers in a photo-ID notebook were considered more reliable estimates of group size than the estimates recorded in CyberTracker at the initial sighting. Those corrections were made manually after reconciling the field notebook and the CyberTracker records. The CSV and CyberTracker effort and sightings files are available in the Supplement.
The effort and sightings data were compiled into a single 'flatfile' format (http://creem2.st-andrews.ac.uk/preparing-your-data-for-use-in-distance/) to create a new project in the program Distance (see the Supplement, ACT_analysis_MCDS.zip). Small sample size limited the number of analyses that could be explored, so only half-normal and hazard-rate detection functions were tested for each species. Model selection was conducted using Akaike's information criterion (AIC), with one exception. In small-boat surveys, some species (e.g. Dall's porpoises Phocoenoides dalli and Pacific white-sided dolphins) are not seen until they have approached the boat, and this can manifest in the form of a spike near zero distance in a histogram of perpendicular distances (Williams & Thomas 2007). The analysis methods rely on biological interpretation in addition to information-theoretic approaches; use of AIC alone might have led to selection of a hazard-rate model to fit the apparent spike near zero, which can result in underestimating detection probability and overestimating abundance. In cases where observers made comments indicating attraction to the boat, the half-normal model was chosen over the hazard-rate model (even if not supported by AIC) to avoid fitting a spike at zero distance.
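The selection rule described above can be sketched in code. The example below is a rough stand-in for what the program Distance does, not a reproduction of it: the perpendicular distances, the truncation distance W, the coarse grid search used in place of a proper optimizer, and the function names are all assumptions made for illustration.

```python
import math


def half_normal(x, sigma):
    """Half-normal detection probability at perpendicular distance x."""
    return math.exp(-x * x / (2.0 * sigma * sigma))


def hazard_rate(x, sigma, b):
    """Hazard-rate detection probability, with g(0) = 1."""
    if x <= 0.0:
        return 1.0
    return -math.expm1(-((x / sigma) ** (-b)))


def esw(g, w, steps=200):
    """Effective strip half-width: trapezoid-rule integral of g from 0 to w."""
    h = w / steps
    inner = sum(g(i * h) for i in range(1, steps))
    return h * (0.5 * (g(0.0) + g(w)) + inner)


def log_lik(g, distances, w):
    """Log-likelihood of the distances under the density f(x) = g(x) / esw."""
    return sum(math.log(g(x)) for x in distances) - len(distances) * math.log(esw(g, w))


def aic(loglik, n_params):
    return 2.0 * n_params - 2.0 * loglik


def fit_half_normal(distances, w):
    """Grid search over sigma; returns (AIC, sigma)."""
    ll, sigma = max(
        (log_lik(lambda x: half_normal(x, s), distances, w), s)
        for s in range(10, 301, 5)
    )
    return aic(ll, 1), sigma


def fit_hazard_rate(distances, w):
    """Grid search over sigma and b; returns (AIC, sigma, b)."""
    ll, sigma, b = max(
        (log_lik(lambda x: hazard_rate(x, s, shape), distances, w), s, shape)
        for s in range(10, 301, 10)
        for shape in (1.5, 2.0, 2.5, 3.0, 4.0, 5.0)
    )
    return aic(ll, 2), sigma, b


def choose_model(aic_hn, aic_hr, attraction_noted):
    """AIC decides, unless field notes indicate attraction to the boat."""
    if attraction_noted:
        return "half-normal"
    return "half-normal" if aic_hn <= aic_hr else "hazard-rate"


# Hypothetical perpendicular distances (m) and truncation distance (m).
distances = [5, 12, 20, 33, 41, 58, 75, 90, 110, 140, 180, 240]
W = 300.0
aic_hn, sigma_hn = fit_half_normal(distances, W)
aic_hr, sigma_hr, b_hr = fit_hazard_rate(distances, W)
selected = choose_model(aic_hn, aic_hr, attraction_noted=True)
```

The point of the sketch is the last function: when observers have noted attraction to the boat, the half-normal model is retained regardless of which model AIC prefers.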
In a small-boat, low-cost or pilot study, small sample size is common. Conceptually, it is possible to use transit-leg sightings to fill out the detection function, but not in the calculations of density (Buckland et al. 2001). In our experience, statisticians using that approach tend to conduct analyses outside of Distance (e.g. see analyses and advice for rare species by Len Thomas; Williams & Thomas 2009). It is possible to do much of this work in Distance, but the methods are not well documented. First, we set up a separate stratum (a sub-region) for transit-leg sightings, using Transect=0 as a filter. Importantly, the area of that stratum must be set to zero to avoid affecting the resulting abundance estimates. Line length cannot be zero (Eq. 1), so we entered the total length of search effort conducted in transit-leg mode. For each species, we set up an analysis with the detection function estimated globally (i.e. pooling both the transit-leg stratum and the designed stratum) and density by stratum as well as globally. This is accomplished in Distance under the Model Definition, Estimate tab of the CDS engine, by ticking 'Use layer type Stratum' and, under Quantities to estimate, ticking Density Global and Stratum, and Detection function Global. In the multiple covariate distance sampling (MCDS) engine, one would tick Detection function Global and Stratum. We set the Global density estimate to be the mean of the stratum-level estimates, weighted by Area, which is the default in the Estimate tab. When running any model, Distance will issue a warning that Area=0 for the transit-leg stratum. That warning can be ignored. Distance uses all sightings (including transit-leg sightings) for fitting the detection function. We ignored the encounter rate and density estimates for the transit-leg stratum, and only interpreted the results for the designed stratum. Had we conducted a stratified survey, the global density estimate would still be correct (i.e. it would ignore the transit-leg stratum), because the area of the transit-leg stratum was set to zero. The global density is a weighted average of the stratum-level estimates with the weight being the area, and so the weight for this stratum is zero.
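The effect of setting the transit-leg stratum area to zero can be checked with a small calculation. The densities and areas below are invented for illustration, and the function name is hypothetical; the code simply evaluates an area-weighted mean of stratum densities.

```python
def global_density(strata):
    """Area-weighted mean of stratum densities; strata is a list of (density, area)."""
    total_area = sum(area for _, area in strata)
    return sum(density * area for density, area in strata) / total_area


# Hypothetical values: animals per km^2 and km^2.
designed = (0.50, 1000.0)    # designed stratum
transit = (2.00, 0.0)        # transit-leg stratum, area deliberately set to zero

with_transit = global_density([designed, transit])
designed_only = global_density([designed])
```

Because the transit-leg stratum has zero weight, the global estimate equals the designed-stratum estimate no matter how high the apparent density along the transit legs.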
We used CDS analyses initially for all species for which 10 or more sightings were made (Table 1). This is well below the 60–80 sightings recommended for fitting a robust detection function (Buckland et al. 2001), but an accurate abundance estimate has been obtained from a small-boat survey for killer whales Orcinus orca based on only 18 sightings (Williams & Thomas 2009). To attempt to estimate abundance of rarely seen species, we followed previous recommendations (Barlow 1995, Barlow & Forney 2007) to pool species into groups thought to share similar detectability in order to use a pooled detection function. We used the MCDS approach to 'borrow strength' across species, that is, to use the information from commonly seen species to make inference about the strip width effectively searched for rare species that are thought to be similarly sightable (Table 1). The 3 species groups were: small cetaceans (i.e. harbor porpoise Phocoena phocoena, Dall's porpoise and Pacific white-sided dolphins); whales (i.e. common minke Balaenoptera acutorostrata, humpback Megaptera novaeangliae and killer whales); and pinnipeds (i.e. harbor seal Phoca vitulina and Steller sea lion Eumetopias jubatus) (Barlow 1995).
The version of Distance we used (version 7, Beta 3) is unable to stratify the data by planned/transit-leg effort strata while also using species as a covariate. We therefore used a 2-step process, in which only the effort and sightings data from the planned stratum were used to estimate species-specific encounter rates (and associated coefficients of variation) and expected school size (and associated coefficient of variation). Next, all sightings data (i.e. including both the planned and transit-leg strata) were used to estimate parameters associated with the detection function (i.e. f(0) and its coefficient of variation). These were combined outside of Distance (using an attached R script in this case, although the calculations could be done in Excel; see the Supplement, script estimate abund from 2 runs of Distance.r) using simple calculations to compute species-specific abundance estimates and associated measures of precision for the MCDS analyses. There is a trade-off between complexity of the analysis and the number of estimates that can be generated. The CDS analyses are simpler and can be done entirely within Distance, but the MCDS analyses done in combination with Distance and some simple subsequent calculations allowed estimation of effective strip width for species with too few sightings (<10) to fit a species-specific detection function.
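The combination step can be sketched as follows. This is not the R script in the Supplement; it is a minimal illustration using the standard line-transect estimator and delta-method variance (Buckland et al. 2001), with a lognormal confidence interval. All input values and function names are hypothetical.

```python
import math


def combine(n, line_km, f0_per_km, mean_school, area_km2, cv_er, cv_f0, cv_school):
    """Combine the outputs of two Distance runs into an abundance estimate.

    n, line_km, mean_school and their CVs come from the designed stratum only;
    f0_per_km and its CV come from the run that pools all sightings.
    """
    density = n * f0_per_km * mean_school / (2.0 * line_km)   # animals per km^2
    abundance = density * area_km2
    # Delta method: squared CVs of independent components add.
    cv = math.sqrt(cv_er ** 2 + cv_f0 ** 2 + cv_school ** 2)
    # Lognormal 95% confidence interval.
    c = math.exp(1.96 * math.sqrt(math.log(1.0 + cv ** 2)))
    return abundance, cv, (abundance / c, abundance * c)


# Hypothetical inputs: 10 schools on 100 km of designed effort, f(0) = 4 per km
# (an effective strip half-width of 0.25 km), mean school size 2, area 500 km^2.
abundance, cv, ci = combine(
    n=10, line_km=100.0, f0_per_km=4.0, mean_school=2.0, area_km2=500.0,
    cv_er=0.30, cv_f0=0.20, cv_school=0.10,
)
```

The calculation treats encounter rate, f(0) and school size as independent, which is the usual simplification. Comparing the squared CVs of the three components also shows which one dominates the final variance.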

RESULTS
The survey was completed largely as planned, with the exception of the southeastern transects and segments dropped for safety reasons as mentioned. Because some transects were covered twice, the final search effort totaled 368 km, in contrast to the 209-km-long planned survey (Fig. 4). In addition to the designed transects, observers recorded sightings along 1503 km in transit to and from the transects. This indicates an extremely inefficient survey design, and the imbalance between transects and transits translates to ~80% of the survey effort being conducted in transit.
Observers recorded 163 sightings of all species (Table 1; sightings of commonly seen species along the planned tracklines shown in Fig. 4). Of these, 81 sightings were collected while observers were in searching mode (i.e. 'on effort') but in transit between transects or between the base and a transect (scored as 'Transect 0' in Distance). The remaining 82 sightings were made on the planned transects.
For illustrative purposes, the selected detection function for Dall's porpoise is shown in Fig. 5. The half-normal detection function was chosen in all CDS analyses except Pacific white-sided dolphins. The difference between the half-normal and hazard-rate function was slight (ΔAIC < 2), except in the case of harbor seals, in which the support for the half-normal over the hazard-rate function was large (ΔAIC = 6). The choice of detection function made little difference in the CDS abundance estimates, so abundance estimates from the half-normal models were shown in all cases for illustrative purposes.
Minimum abundance estimates for all 8 species are shown in Table 1. Precision was generally low, and coefficients of variation approached or exceeded 100% for Pacific white-sided dolphins and killer whales. Expanding the analyses to MCDS methods allowed abundance to be estimated for 3 rarely seen species: common minke whale, killer whale and Steller sea lion. All of the abundance estimates presented are tentative, but the greater sample sizes make the MCDS estimates likely more reliable than the CDS estimates. Overall, we conclude that killer whales, common minke whales and Steller sea lions were the least common marine mammals in our study area at the time of the survey, with each species probably numbering in the tens of individuals. It is likely that hundreds of humpback whales, harbor and Dall's porpoises, and harbor seals were in the survey region. It is likely that the Pacific white-sided dolphin was the most abundant species in the area at the time of the survey, with a point estimate of 1441 animals.

DISCUSSION
The survey satisfied its main objective by generating preliminary abundance estimates for 8 marine mammal species from a low-cost, small-boat survey. Although the precision is low and the estimates are uncorrected for perception or availability bias, there are precautionary methods for using imprecise and negatively biased abundance estimates for marine mammals in management of human activities (Taylor & Gerrodette 1993, Wade 1998). By quantifying the low precision associated with these abundance estimates, it becomes possible to use precautionary procedures based on lower bounds on the abundance estimate in order to minimize harm to populations through fisheries bycatch (Wade 1998). There are cases in which a minimum abundance estimate is useful. For example, many conservation assessments use thresholds of abundance as proxies for extinction risk (Gerber & DeMaster 1999). A minimum abundance estimate can suffice to estimate degree of depletion and rate of recovery from commercial whaling (Williams et al. 2011).
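One widely used precautionary procedure of this kind is the potential biological removal (PBR) calculation of Wade (1998), in which the lower bound is the 20th percentile of a lognormal distribution around the abundance estimate. The sketch below applies it to the Pacific white-sided dolphin estimate reported here (N = 1441, CV = 0.92). The maximum growth rate (0.04, a commonly used default for cetaceans) and the recovery factor (0.5) are assumptions chosen for illustration, not values proposed by this study.

```python
import math


def n_min(n_hat, cv, z=0.842):
    """Lower 20th percentile of a lognormally distributed abundance estimate."""
    return n_hat / math.exp(z * math.sqrt(math.log(1.0 + cv * cv)))


def pbr(n_hat, cv, r_max, recovery_factor):
    """Potential biological removal: N_min * 0.5 * R_max * F_r."""
    return n_min(n_hat, cv) * 0.5 * r_max * recovery_factor


# Point estimate and CV from this survey; r_max and recovery_factor are assumed.
dolphin_n_min = n_min(1441.0, 0.92)
dolphin_pbr = pbr(1441.0, 0.92, r_max=0.04, recovery_factor=0.5)
```

With a CV this high, the minimum estimate is roughly half the point estimate, which shows how low precision translates directly into a lower allowable removal rather than preventing the calculation altogether.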
The survey also accomplished its secondary goal, which was to provide a detailed and transparent description of the steps involved in defining a survey region, designing a systematic survey, describing field protocols used to collect the data, and conducting 2 relatively simple distance-sampling analyses. A related issue that we do not discuss involves data management, which is an important aspect of open science and citizen science (Stuart-Smith et al. 2013). The more sophisticated of our 2 sets of analyses generated more abundance estimates, by including species with small sample size, and these analyses are probably more reliable than the simpler, conventional distance sampling analyses. Overall, the abundance estimates themselves were of low precision, which was largely the result of sample size. None of the species generated the 60–80 sightings recommended for fitting the detection function (Buckland et al. 2001). Future analyses could use the very large amount of transit-leg effort and sightings data to model encounter rate as a function of spatial and environmental covariates, rather than using those data only in the detection function (Miller et al. 2013). Future fieldwork could target high-density areas preferentially to collect additional sightings for fitting the detection function, but not for use in variance estimates (i.e. to increase the sample size for the MCDS analyses we presented). Simply repeating the survey would increase sample size and reduce variance due to variability in encounter rate: a great deal of improvement in precision is attainable given enough effort (Thomas et al. 2010). Encounter rate variance was the largest contributor to the variance in most cases, because many transects had no sightings of a given species, and others had several. The program Distance takes variable transect length into account when estimating variance (Thomas et al. 2010), but there may be extreme cases where the encounter rate variance is overdispersed relative to a Poisson distribution. For novice users, this problem may be a minor one (e.g. addressing negatively biased abundance estimates by accounting for g(0) < 1 may be a higher priority than addressing negatively biased variance estimates). For advanced users, this issue may be worth pursuing. Some researchers have used Monte Carlo methods to resample transect segments (Barlow & Forney 2007). Others have developed approaches to estimate a suitable variance inflation factor (Caughley & Grigg 1981, Pollock et al. 2006b, Moore & Barlow 2014). Density surface models offer a powerful way to explore and explain spatial patterns in encounter rate, when spatial heterogeneity is a feature of interest rather than simply a nuisance to be resolved to generate unbiased abundance and variance estimates (Miller et al. 2013).
The low precision of some estimates and the inability to correct for availability or perception bias illustrate the need to be clear about a study's main objective so the estimates can be fit for purpose (Dawson et al. 2008). If an absolute abundance estimate is needed, our toolkit approach may only provide a pilot study for improving design of a future survey that has sufficient power to detect trends (Gerrodette 1993) or uses platforms that can support 2 independent sets of observers to estimate g(0) (Buckland & Turnock 1992, Pollock et al. 2006a). If the survey is intended to detect trends, a relative abundance estimate may suffice, as long as it surveys a constant proportion of the population through time (Norvell et al. 2003). As observers, platforms and technologies change through time, this constant-proportionality assumption may be violated. Bayesian methods may allow advanced users to account for these changes and detect trends even from sparse data (Moore & Barlow 2011, 2014).
Our abundance estimate for Pacific white-sided dolphins was unusual, in that it was the only one in which the variance was driven largely by variability in school size, rather than variability in encounter rate. School sizes of Pacific white-sided dolphins ranged from 4 to 200, with a mean of 48 individuals. In practice, this means that a large proportion of the estimated number of dolphins in the area could be found in 1 or 2 clusters. Statistically, there is no way of avoiding high variance when large proportions of the population are found in relatively few clusters. Worse, the Pacific white-sided dolphin variance estimate was actually an underestimate, because it does not include uncertainty in group size estimation itself. Observers recorded low, best and high estimates of the size of the groups of Pacific white-sided dolphins they encountered, but aerial photography or some other method would help replace visual estimates with a more accurate empirical estimate of group size. Low precision in this case is not an artifact of sampling error or small sample size, but rather the consequence of an attribute of this highly social species. If a precise abundance estimate is needed, one can develop explicit protocols that allow observers to split very large schools into many subgroups, with a distance and angle recorded to the centroid of each subgroup. This will increase the number of schools for fitting the detection function, reduce uncertainty in school size estimation itself, and reduce the contribution of heterogeneity in school size to the final variance estimate. For trend estimation, one could simply monitor abundance of schools, rather than individuals, but this raises at least 2 concerns. Firstly, the field protocols for defining a school must remain constant over time. Secondly, even when population size remains relatively uniform through time, ecological processes such as inter-annual variability in prey density or seasonal mating behavior can cause school size to vary (Lusseau et al. 2004). Both factors would confound trend estimation. An alternative might be to explore mark-recapture methods from photo-identification studies using natural markings (Morton 2000). That is not a panacea, because photo-ID studies can be difficult when populations are large, not all individuals are marked and capture probability is low (Stevick et al. 2001).
It is difficult to groundtruth any of the estimates. The study area spans 2 of the strata in a much larger survey that covered much of the British Columbia continental shelf (Williams & Thomas 2007), which makes it difficult to compare spatially and temporally incompatible estimates. A long-term photo-ID study of Pacific white-sided dolphins indicates that average abundance in the region is 1577 (95% CI: 910–2243), which is comparable to the estimated 1441 dolphins (CV = 0.92) from the present study. Killer whale abundance in the area is highly variable. On average, 6.5% of the population (numbering 290 whales in 2014; Towers et al. 2015) was found in the area in summer months between 1995 and 2002, so a point estimate of 27 killer whales from a snapshot in time (i.e. this line-transect survey) is in line with the 19 animals one would expect to be in the study area from a long-term study (Williams et al. 2009).
When working from small boats, observers have an extremely restricted field of view. As field of view narrows, the expected number of observations that can be used to fit the detection function declines. This underscores the importance of managing expectations when conducting a small-boat survey, and setting an expectation that knowledge will improve as studies mature and sample size increases (Walters 1986). It is best to think of the survey as generating abundance estimates that can be used as hypotheses to explore with additional data, order-of-magnitude estimates to compare with what are often equally tentative or provisional bycatch rates (Moore et al. 2010), or simply one of many quantitative and qualitative inputs in a structured decision-making and adaptive management process (Lyons et al. 2008), Bayesian belief network (Marcot et al. 2006) or relative risk model (Landis 2004). Commonly used methods for estimating allowable harm limits are robust to imprecise estimates. The precautionary nature of such methods simply allows very low levels of allowable harm in the face of uncertainty (Wade 1998). Importantly, high levels of variance preclude any hope of trend estimation (Taylor et al. 2007).

Robustness of abundance estimates to violation of distance sampling assumptions
(1) Design-unbiased sampling
In practice, 15–20 parallel lines with a random start point will usually provide reasonably unbiased coverage of even complex survey regions (see discussion and alternatives in Thomas et al. 2007). Our survey was designed to avoid introducing bias in the abundance estimates, but increased survey effort in future could reduce the variance. A grid of parallel lines is generally a good idea for a pilot study, because search effort can easily be doubled in future, without designing a new survey, simply by interspersing new tracklines midway between the first set of parallel lines. Our survey's parallel-line design is in marked contrast to many small-boat surveys, which run parallel to the coast. Placing transects parallel to the coast is not recommended, because it can introduce an animal density gradient within the detection strip that is confounded with detectability. In the case of a density gradient away from the coast, perpendicular transects would capture the density gradient and reduce between-transect variability. If data were collected in such a way that there was an animal density gradient within the truncation distance of the transect, we recommend using procedures described elsewhere to remove this bias (Chapter 11 in Buckland et al. 2015).

(2) Objects on the trackline are detected with certainty
This is often termed the 'g(0) = 1' assumption, because data are analyzed with the constraint that detection probability at zero distance is 1. Methods exist for relaxing the g(0) = 1 assumption, but most require 2 independent platforms (Buckland et al. 2007). Because it is difficult to isolate observers on a small boat to set up experimental trials to estimate the proportion of animals observers miss on the trackline, many small-boat surveys, including this one, assume falsely that all animals directly on the trackline were seen. This underestimates true abundance, because it fails to account for submerged animals or animals that were not detected on the trackline. Our survey no doubt violated this assumption, but the relatively slow survey speed gave shallow-diving animals several opportunities to surface within the field of view as the boat traveled along the transect (Barlow 1995). Minimum estimates are precautionary from the perspective of bycatch sustainability (Wade 1998), because allowable harm limits can use a lower bound when abundance spans a wide range (Williams et al. 2008). But field protocols must attempt to keep the degree of this underestimation constant if there is any interest in detecting trends (Moore & Barlow 2013).
(3) Objects do not move before perpendicular distance from the trackline is recorded
Responsive movement (attraction or avoidance) that occurs after detection is not a problem; bias arises when animals respond to the boat before they are detected. If animals are attracted to the boat, density estimates will be positively biased. If animals avoid the boat, density estimates will be negatively biased. Small-boat surveys, including ours, tend to violate this assumption. As a general rule, the lower the survey platform, the smaller the observers' field of view, so animals are more likely to have reacted to the boat before they are first seen. It is possible to minimize this bias by building a platform to raise the observers' eye height (Dawson et al. 2004), using binoculars to search far ahead of the vessel, or using data on the animals' orientation relative to the vessel to generate correction factors to account for responsive movement statistically (Palka & Hammond 2001).
(4) Perpendicular distances are measured without error
In practice, the methods are robust to some variability, but they are not robust to systematic bias in recording perpendicular distances. If observers tend to overestimate distances, then density estimates will be negatively biased. If observers tend to underestimate distances, then density estimates will be positively biased. This assumption is often violated in small-boat surveys, including ours. Where accurate abundance estimates are needed, we recommend measuring distances where possible (e.g. using rangefinders or measuring declination below a horizon), or generating observer-specific correction factors using distance estimation experiments (Williams et al. 2007).
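The direction of this bias can be illustrated numerically. The sketch below converts radial distance and angle to perpendicular distance, then compares the effective strip half-width (ESW) from true distances with that from distances overestimated by 20%. The sightings, the 20% figure and the use of an untruncated half-normal model (for which the maximum-likelihood estimate has a closed form) are assumptions made for the example.

```python
import math


def perpendicular_distance(radial_m, angle_deg):
    """Perpendicular distance from the trackline, given radial distance and angle."""
    return radial_m * abs(math.sin(math.radians(angle_deg)))


def half_normal_esw(perp_distances):
    """Untruncated half-normal: sigma^2 = mean(x^2), ESW = sigma * sqrt(pi / 2)."""
    sigma = math.sqrt(sum(x * x for x in perp_distances) / len(perp_distances))
    return sigma * math.sqrt(math.pi / 2.0)


# Hypothetical sightings: (radial distance in m, angle from the trackline in degrees).
sightings = [(120.0, 30.0), (80.0, 60.0), (200.0, 10.0), (50.0, 85.0), (150.0, 45.0)]

true_x = [perpendicular_distance(r, a) for r, a in sightings]
biased_x = [perpendicular_distance(1.2 * r, a) for r, a in sightings]  # 20% overestimate

esw_ratio = half_normal_esw(biased_x) / half_normal_esw(true_x)
# For a fixed number of sightings and line length, density is proportional to 1 / ESW.
density_ratio = 1.0 / esw_ratio
```

Overestimating every distance by 20% inflates the ESW by the same factor, so the density estimate falls to about 83% of its unbiased value; underestimating distances has the opposite effect.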

Next steps: training while doing
Since developing our Animal Counting Toolkit in Canada, we field-tested the approach on a small-boat survey in Indonesia with our colleague, Dr Putu Liza Mustika (Conservation International Indonesia). The Indonesia survey involved classroom training for 38 faculty and students at Udayana University in Denpasar, Indonesia. Of those, 6 participants received hands-on training while rotating through our field survey crew. This collaborative effort exemplifies our intent for future applications of the toolkit. Our long-term goal is to identify regions that are predicted (from habitat suitability models) to be rich in marine mammal species (Williams et al. 2014) and likely to be problem areas for bycatch (Reeves et al. 2013, Lewison et al. 2014) but that are previously unsurveyed for marine mammals (Kaschner et al. 2012). Those 3 criteria, along with identification of an in-country partner willing to collaborate on a field study, will determine our priority areas to fill data gaps using this 'training-while-doing' approach. We anticipate that besides filling data gaps, our program will help build capacity in regions where it is most needed but currently in short supply. We note with interest that our Animal Counting Toolkit parallels similar efforts by marine ecologists and indigenous communities in northern Australia for monitoring dugong, turtle and coastal dolphin abundance (Fuentes et al. 2015, Jackson et al. 2015). The Canada case study described in this paper was intended to inform a complementary, online component of our capacity-building program. The online and hands-on training are not meant to replace training in statistical analysis, but are intended to help emerging researchers who are working independently to collect reliable data (Jackson et al. 2015).

Fig. 2. GPS tracks of effort from a previous photo-ID study (Ashe 2015) were used to outline a survey region for designing a systematic line-transect survey (main map). Tracklines covered by the small (6 m) vessel in previous years were considered to outline the extent of the team's core study area. The visual mass of previous tracklines was used to outline a rough polygon in QGIS (www.qgis.org) to define a survey region. After clipping out the land, the red, irregular polygon (study area inset) was saved as a shapefile in QGIS for use in designing a systematic survey in Distance 6 (http://distancesampling.org/). The red star refers to our Pearse Islands base camp

Fig. 3. A parallel-line survey designed to provide even (systematic) coverage of the survey area. The planned survey design had 17 transects, totaling 208.6 km of survey effort. The dotted lines indicate planned transects that were not completed. The area sampled by those lines was removed from the estimated survey area when calculating abundance

Fig. 4. Map of realized search effort (black solid lines are the designed transects, and the white dotted line represents transit legs when observers remained on effort). Sightings are categorized by species, with the size of the dot proportional to school size

Table 1. Abundance estimates (N) (and coefficients of variation, CV) using conventional distance sampling (CDS), in which detection functions were fitted separately for each species, and multiple covariate distance sampling (MCDS), in which detection functions were shared among 3 species groups. Note that no attempt was made to derive CDS estimates for minke or killer whales, or Steller sea lions, due to small sample size (column labeled 'No. sightings'). NA: not available