Using triaxial accelerometers to identify wild polar bear behaviors

Tri-axial accelerometers have been used to remotely identify the behaviors of a wide range of taxa. Assigning behaviors to accelerometer data often involves the use of captive animals or surrogate species, as their accelerometer signatures are generally assumed to be similar to those of their wild counterparts. However, this has rarely been tested. Validated accelerometer data are needed for polar bears Ursus maritimus to understand how habitat conditions may influence behavior and energy demands. We used accelerometer and water conductivity data to remotely distinguish 10 polar bear behaviors. We calibrated accelerometer and conductivity data collected from collars with behaviors observed from video-recorded captive polar bears and brown bears U. arctos, and with video from camera collars deployed on free-ranging polar bears on sea ice and on land. We used random forest models to predict behaviors and found strong ability to discriminate the most common wild polar bear behaviors using a combination of accelerometer and conductivity sensor data from captive or wild polar bears. In contrast, models using data from captive brown bears failed to reliably distinguish most active behaviors in wild polar bears. Our ability to discriminate behavior was greatest when species- and habitat-specific data from wild individuals were used to train models. Data from captive individuals may be suitable for calibrating accelerometers, but may provide reduced ability to discriminate some behaviors. The accelerometer calibrations developed here provide a method to quantify polar bear behaviors to evaluate the impacts of declines in Arctic sea ice.

Tri-axial accelerometers, which collect high frequency measures of acceleration in the form of gravitational and inertial velocity (Brown et al. 2013), have provided a means to remotely identify animal behaviors (Yoda et al. 1999, Watanabe et al. 2005). Accelerometers have been particularly useful in studying widely dispersed animals or those occurring in remote habitats, such as marine mammals and birds (Brown et al. 2013). Once calibrated, tri-axial accelerometer data from wild animals can be used to remotely identify behaviors such as resting, walking, running, and even feeding events (Yoda et al. 2001, Shepard et al. 2008, Wilson et al. 2008, Watanabe & Takahashi 2013, Williams et al. 2014). Calibration typically involves time-synchronizing behavioral observations with their associated accelerometer readings, which often necessitates the use of captive animals or surrogate species (e.g. Yoda et al. 2001, Shepard et al. 2008, Nathan et al. 2012, Campbell et al. 2013). Alternatively, animal-borne video cameras can be used to directly calibrate accelerometers (e.g. Watanabe & Takahashi 2013, Nakamura et al. 2015, Volpov et al. 2015), but cameras can be expensive and can only collect data over limited durations.
Polar bears Ursus maritimus typically occupy remote environments, and few quantitative data exist on their behaviors or activity budgets. Much of what is known about polar bear behavior on the sea ice comes from coastal indigenous resident knowledge (e.g. Nelson 1966, Kalxdorff 1997, Kochnev et al. 2003, Voorhees et al. 2014) and direct observational research limited to 2 locations over limited time periods (Stirling 1974, Stirling & Latour 1978, Hansson & Thomassen 1983, Stirling et al. 2016). Satellite telemetry has been used to track polar bears in some subpopulations since the late 1970s (Schweinsburg & Lee 1982, Taylor 1986) and has helped to identify important habitats (Ferguson et al. 2000, Mauritzen et al. 2003, Durner et al. 2009, Wilson et al. 2014). However, detailed behavioral data in association with habitat conditions are lacking. Recent declines in Arctic sea ice have already caused declines in abundance, survival, or body condition of polar bears in some subpopulations (Stirling et al. 1999, Regehr et al. 2007, Rode et al. 2010, 2012, Bromaghin et al. 2015, Obbard et al. 2016) and models project increasing negative impacts in the 21st century (Amstrup et al. 2008, Hunter et al. 2010, Molnár et al. 2010, Atwood et al. 2016). In order to better predict the impacts of projected sea ice loss on polar bears, it will be important to understand the behavioral and physiological mechanisms driving current declines (Vongraven et al. 2012, Atwood et al. 2016). Accelerometers could be used in combination with satellite telemetry to better understand the behavioral consequences of sea ice loss. This mechanistic information would allow for improved assessment of the relationships between habitat loss, individual health, and vital rates in polar bear populations.
In this study, we developed a method to quantify wild polar bear behaviors using accelerometers and conductivity sensor data, validated through animal-borne video camera data. Additionally, we evaluated the effectiveness of using accelerometer data from captive polar and brown bears U. arctos to predict behaviors of wild polar bears. Though it is generally assumed that accelerometer signatures of captives or surrogates are similar to those of their instrumented wild counterparts (Williams et al. 2014, McClune et al. 2015, Wang et al. 2015, Hammond et al. 2016), this has rarely been tested. Captive individuals may exhibit different behaviors and/or kinematics than wild counterparts (McPhee & Carlstead 2010), which could potentially influence accelerometer signatures. Because polar bears use both sea ice and terrestrial habitats and because differences in habitat substrate or gradient could also affect accelerometer signatures (Bidder et al. 2012, Shepard et al. 2013, McClune et al. 2014), we examined data from wild polar bears in both of these habitats. Lastly, because sampling frequency affects the longevity of accelerometers during deployment as well as computational power for analyses, we evaluated the ability of accelerometers to predict wild polar bear behaviors using 3 different sampling frequencies (16, 8, and 4 Hz).

Accelerometer recordings on captive bears
We deployed collars with archival loggers (TDR10-X-340D; Wildlife Computers) on 3 adult female polar bears Ursus maritimus housed at the Alaska Zoo, Oregon Zoo, and San Diego Zoo, USA, as well as 2 adult female brown bears U. arctos housed at the Bear Research, Education, and Conservation Center at Washington State University (WSU; Table 1), USA. Archival loggers recorded tri-axial acceleration (m s⁻²) at 16 Hz (range: ±20 m s⁻²), time-of-day, and wet/dry conductions (via an on-board conductivity sensor; Fig. 1). Conductivity data were sampled at 1 Hz. Bears at the Oregon and San Diego Zoos were trained to voluntarily place their heads into crates in which collars could be applied or removed, and wore collars for 1 to 4 h sessions. Bears at the Alaska Zoo and WSU were anesthetized for collaring, with a combination of tiletamine HCl and zolazepam HCl (Telazol®; Pfizer Animal Health) and dexmedetomidine HCl (Dexdomitor®; Pfizer Animal Health) (Teisberg et al. 2014). Following collar placement, the effects of the anesthetic were reversed with atipamezole HCl (Antisedan®; Pfizer Animal Health). We used release mechanisms (Lotek Wireless) to remove collars from bears at the Alaska Zoo and WSU. We matched accelerometer recordings to the behaviors of captive bears while they moved freely around enclosures based on visual examination of time-stamped video recordings (Sony camcorder model DCR-TRV280 or OpenEye Digital Video Security Solutions).

Accelerometer recordings on free-ranging polar bears
We deployed GPS-equipped video camera collars (Exeye) and archival loggers (TDR10-X-340D; Wildlife Computers) on 4 adult female polar bears and 1 subadult female polar bear captured on the sea ice of the southern Beaufort Sea in April 2014 and 2015 (hereafter 'ice bears') and 2 subadult polar bears (1 male and 1 female) captured on land on Akimiski Island, Nunavut, Canada, in September 2015 (hereafter 'land bears'; Table 1). Video collars, including archival loggers and release mechanisms, weighed 1.6 to 2.1 kg (0.8 to 1.5% of body mass of bears in this study). We captured polar bears by injecting them with immobilizing drugs through projectile syringes fired from a helicopter. On the sea ice, we anesthetized bears using a combination of tiletamine HCl and zolazepam HCl (Telazol®) with no reversal (Stirling et al. 1989). On land, we anesthetized bears with a combination of medetomidine (Domitor®; Pfizer Animal Health) and tiletamine HCl and zolazepam HCl (Telazol®) and reversed with atipamezole HCl (Antisedan®) (Cattet et al. 1997). Archival loggers were attached to collars in the same location and orientation as captive deployments (Fig. 1) and similarly recorded tri-axial acceleration at 16 Hz (range: ±20 m s⁻²), time-of-day, and wet/dry conductions (via an on-board conductivity sensor). Conductivity data were sampled at 1 Hz. Video cameras were programmed to record at varying frequencies during daylight periods (see Table S1 in the Supplement at www.int-res.com/articles/suppl/n032p019_supp.pdf) and programmed to turn off if the temperature of the collar fell below −17°C to protect video equipment. Collars deployed on ice and land bears were recovered 4 to 23 d following deployment, either by recapture of the individual or by remote activation of the collar release and retrieval of the dropped collar by the field crew. We matched accelerometer data to behavior of ice and land bears based on visual examination of the time-stamped video recordings from the collar.

Behaviors
Behaviors were annotated based on the video data on a per second basis. For bears that were anesthetized, we excluded behaviors on the day of capture and during retrieval of the collar. Resting behaviors included standing, sitting, and lying down. Head movements while standing, sitting, or lying down were included as resting behaviors, but limb movements were treated as transitionary behaviors (Knudsen 1978, Williams 1983). Swimming included surface swimming and diving. We excluded from analyses any behaviors that were not indicative of natural movements in captive bears (e.g. stereotypic behaviors), were rare (e.g. fighting, breeding, drinking), were transitionary, or were non-descript.

Modeling
We derived summary statistics from the accelerometer data and linked the accelerometer data with corresponding behaviors of interest (SAS version 9.3; SAS Institute). We converted accelerometer measures from m s⁻² to g (1 g = 9.81 m s⁻²). We calculated magnitude (Q) as a fourth dimension, where $Q = \sqrt{\mathrm{heave}^2 + \mathrm{surge}^2 + \mathrm{sway}^2}$ (Nathan et al. 2012). We used a 2 s running mean of the raw acceleration data to calculate static acceleration (gravitational acceleration) and subtracted the static acceleration from the raw acceleration data to calculate dynamic acceleration (Wilson et al. 2006, Shepard et al. 2008). We calculated overall dynamic body acceleration (ODBA) as the absolute sum of dynamic acceleration across the 3 axes (Wilson et al. 2006). We used a Fast Fourier Transform to calculate the dominant power spectrum (dps) and frequency (fdps) for each axis (Watanabe et al. 2005, Shamoun-Baranes et al. 2012). In total, we derived 25 predictor variables based on previous accelerometer studies (e.g. Watanabe et al. 2005, Nathan et al. 2012, Shamoun-Baranes et al. 2012, Wang et al. 2015). Predictor variables were extracted from the accelerometer data over 2 s intervals; mean conductivity data (wet/dry) were also extracted over 2 s intervals using program R (R Core Team 2014) (Table 2). Video-linked behaviors that lasted less than 2 s were excluded from analyses. We used a random forest supervised machine learning algorithm (Breiman 2001) in R ('randomForest' package) to predict polar bear behaviors. Random forest models use multiple classification trees from a random subset of predictor variables and then replicate this process over multiple iterations using a subset of the data for each iteration to determine the best variables for making predictions (Breiman 2001). An estimate of error is derived by using the remaining data not used in each iteration to test the predictive ability of the model, which is termed the 'out-of-bag' (OOB) error rate (Breiman 2001, Liaw & Wiener 2002). The random forest algorithm has previously shown high accuracy (> 80%) for predicting animal behaviors from accelerometer data (Nathan et al. 2012, Resheff et al. 2014, Graf et al. 2015, Lush et al. 2015, Rekvik 2015, Wang et al. 2015, Alvarenga et al. 2016). We fit 500 classification trees to each training dataset and used a random subset of 5 predictor variables for each split in the tree.
Table 2. Parameters extracted from tri-axial accelerometer and conductivity data and used in random forest models to predict wild polar bear Ursus maritimus behaviors. Respective acceleration measures from the surge (X), heave (Y), sway (Z), and magnitude (Q) axes.
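As an illustration of the derived acceleration measures described in this section, the following Python sketch computes static acceleration, dynamic acceleration, ODBA, and magnitude (Q) from raw tri-axial data. It is not the SAS code used in this study. The function names, the centered form of the running mean, and the handling of record edges are assumptions made for the example.

```python
import math

G = 9.81  # m s^-2 per g, the conversion stated in the text


def to_g(values_ms2):
    """Convert acceleration from m s^-2 to g."""
    return [v / G for v in values_ms2]


def running_mean(values, window):
    """Centered running mean (assumed form); the window shrinks at record edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def derive_measures(surge, heave, sway, rate_hz=16, smooth_s=2):
    """Return static and dynamic acceleration, ODBA, and magnitude Q (all in g)."""
    axes = (surge, heave, sway)
    window = rate_hz * smooth_s  # about 2 s of samples
    # Static (gravitational) acceleration: running mean of the raw signal.
    static = [running_mean(axis, window) for axis in axes]
    # Dynamic acceleration: raw minus static, per axis.
    dynamic = [
        [raw - st for raw, st in zip(axis, st_axis)]
        for axis, st_axis in zip(axes, static)
    ]
    # ODBA: sum of absolute dynamic acceleration across the 3 axes.
    odba = [abs(dx) + abs(dy) + abs(dz) for dx, dy, dz in zip(*dynamic)]
    # Q: vector magnitude of the raw acceleration.
    q = [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(*axes)]
    return {"static": static, "dynamic": dynamic, "odba": odba, "q": q}
```

Under these assumptions, a motionless logger with its heave axis aligned with gravity yields an ODBA of 0 and a magnitude of 1 g, which is a convenient sanity check before the measures are summarized over 2 s intervals.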

Analyses
Unbalanced datasets can bias the predictive ability of classification algorithms toward the most dominant classes (Chen et al. 2004). Therefore, we performed 3 initial analyses to test the effect of uneven distributions on predictive ability. The first analysis used an uneven distribution in which for ice and land bears, we randomly selected 70% of each behavior for the training dataset and used the remaining 30% to test the predictive ability of the random forest algorithm (e.g. Nathan et al. 2012, Alvarenga et al. 2016). For captive polar and brown bears we used the entire datasets to train the random forest algorithm. The second analysis used a subsampling approach in which we attempted to reduce the uneven distribution of more frequent behaviors (e.g. resting) in our training dataset. To reduce the uneven distribution of behaviors in the dataset from ice bears, we randomly selected 5% of the resting behaviors, 30% of the walking behaviors, and 70% of each of the remaining behaviors for training the random forest algorithm. We used the remaining data from ice bears for testing predictions. To reduce the uneven distribution of the dataset from land bears, we randomly selected 5% of the resting behaviors and 70% of each of the remaining behaviors for training and used the remaining data to test predictions. To reduce the uneven distribution of the datasets from captive polar bears and brown bears, we randomly selected 10% of the resting behaviors, 30% of the walking behaviors, and 100% of each of the remaining behaviors for training the random forest algorithm. The third analysis used a completely balanced distribution in which we used identical sample sizes of 500 observations for each behavior in the training dataset, used the remaining observations for testing, and excluded behaviors with fewer than 500 observations. Based on these 3 analyses, we used the sampling distribution (i.e. uneven, subsampled, or balanced distribution) with the greatest predictive ability for further analyses.
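The per-behavior sampling schemes described above can be sketched in Python as follows. This is a minimal illustration rather than the procedure used in the study: the function name, the `per_class` argument, the rounding of sample counts, and the fixed random seed are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, per_class=None, seed=0):
    """Split observation indices into training and testing sets by behavior.

    labels: one behavior label per 2 s observation.
    per_class: optional dict overriding the training fraction for a behavior.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train, test = [], []
    for label, indices in by_class.items():
        rng.shuffle(indices)  # random selection within each behavior
        fraction = (per_class or {}).get(label, train_fraction)
        k = round(fraction * len(indices))
        train.extend(indices[:k])
        test.extend(indices[k:])
    return sorted(train), sorted(test)


labels = ["resting"] * 100 + ["walking"] * 20 + ["swimming"] * 10

# Uneven distribution: 70% of each behavior for training.
uneven_train, uneven_test = stratified_split(labels)

# Subsampled distribution for ice bears: 5% resting, 30% walking, 70% otherwise.
sub_train, sub_test = stratified_split(
    labels, per_class={"resting": 0.05, "walking": 0.30}
)
```

Sampling within each behavior, rather than across the pooled dataset, keeps every behavior represented in both the training and testing sets even when some behaviors are rare.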
To evaluate our ability to predict behaviors of ice bears, we used 3 different datasets to train the random forest models and evaluated the ability of each of these models. First, we used a random subset of the data from ice bears as the training dataset and the remaining data from ice bears to test predictions (testing dataset). Second, we used the data from captive polar bears as the training dataset. Third, we used the data from captive brown bears as the training dataset.
To evaluate our ability to predict behaviors of land bears, we conducted 4 additional analyses. First, we used a random subset of the data from land bears as the training dataset and the remaining data from land bears to test predictions (testing dataset). Second, we used the training data from ice bears as the training dataset. Third, we used the training data from captive polar bears as the training dataset. Fourth, we used the training data from captive brown bears as the training dataset.
To examine the effect of sampling frequency on our ability to discriminate behaviors, we subsampled our 16 Hz accelerometer data to lower data acquisition rates of 8 and 4 Hz using SAS, and repeated the predictive analyses above for both ice and land bears.
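A simple way to emulate lower acquisition rates from a 16 Hz record is to keep every second or every fourth sample, as in the Python sketch below. The text states only that the data were subsampled in SAS, so plain decimation without filtering is an assumption here, as is the function name.

```python
def subsample(samples, source_hz=16, target_hz=8):
    """Keep every n-th sample to emulate a lower acquisition rate."""
    if target_hz <= 0 or source_hz % target_hz != 0:
        raise ValueError("target rate must evenly divide the source rate")
    step = source_hz // target_hz
    return samples[::step]


two_seconds_16hz = list(range(32))  # 2 s of samples at 16 Hz
at_8hz = subsample(two_seconds_16hz, target_hz=8)  # 16 samples per 2 s
at_4hz = subsample(two_seconds_16hz, target_hz=4)  # 8 samples per 2 s
```

Each 2 s interval then contains 16 or 8 samples instead of 32, and the predictor variables can be recomputed from the reduced record.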
Predicted behaviors were categorized as true positive (TP) if they correctly matched the actual behavior, true negative (TN) if they were correctly identified as a different behavior, false positive (FP) if they incorrectly identified the behavior, and false negative (FN) if they were incorrectly identified as a different behavior. We evaluated the predictive abilities of these models based on Matthews' correlation coefficient (MCC; e.g. Basu et al. 2013, Martins et al. 2016), the percent precision, recall, and F-measure. We used MCC in place of accuracy due to the unbalanced nature of our dataset.

MCC, defined as
$$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}},$$
provides a measure of the agreement between the predicted and actual classifications, where +1 represents a perfect prediction and −1 represents total disagreement (Matthews 1975). Precision is the proportion of positive classifications that were correctly classified (TP/(TP + FP)), recall is the probability that a behavior will be correctly classified (TP/(TP + FN)), and F-measure is the harmonic mean of precision and recall (2 × precision × recall/(precision + recall)).
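The evaluation metrics defined above can be computed from the four classification counts for a single behavior, as in this Python sketch. The function names and the convention of returning 0 when a denominator is zero are assumptions made for the example.

```python
import math


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Return MCC, precision, recall, and F-measure for one behavior."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "mcc": mcc,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure(precision, recall),
    }


# Hypothetical counts for one behavior, chosen only for illustration.
example = classification_metrics(tp=90, tn=900, fp=10, fn=10)
```

As a consistency check on the F-measure definition, a precision of 0.67 and a recall of 0.56 give an F-measure of about 0.61.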
We used 2-sample t-tests to evaluate whether MCC, precision, and recall differed significantly using a 16 Hz sampling frequency compared to either an 8 or 4 Hz sampling frequency based on the ice and land datasets.

Behavior on the sea ice
Video collars on ice bears Ursus maritimus recorded 14 to 55 h of video (x̄ = 38 h, SD = 17 h, n = 5). For predicting the behavior of ice bears, we collected a total of 140 h of video-linked accelerometer data from ice bears, 37 h from captive polar bears, and 72 h from captive brown bears U. arctos. We identified 10 different behaviors from ice bears, with resting, walking, and eating being the most prevalent (Table 3). Ice bears ate recently killed adult, subadult, or pup ringed seals Pusa hispida, seal carcasses, bowhead whale Balaena mysticetus carcasses, or unidentifiable carcasses. Captive polar bears consumed fish, and captive brown bears ate dry omnivore chow. Captive brown bears also grazed on grass, which was excluded from analyses predicting behaviors of ice bears, but was included as eating for predicting behaviors of land bears.
Our models using an uneven distribution of behaviors in which we used 70% of each behavior from ice bears and all of the available data from captive polar or brown bears exhibited 5% greater predictive ability overall compared to the subsampled distribution, and 7% greater predictive ability overall compared to the balanced distribution based on F-measure (see Table S2 in the Supplement at www.int-res.com/articles/suppl/n032p019_supp.pdf). In particular, the datasets with an uneven distribution exhibited greater ability to discriminate less frequent behaviors such as swimming, eating, and running (Table S2). Therefore, we used the datasets with uneven distributions for subsequent analyses (Table 3).
Our model with training data from ice bears had an OOB error rate of 2.0% and exhibited the greatest predictive abilities for all 10 behaviors (Fig. 2) compared to all other models tested. Our models with training data from captive polar bears and brown bears had OOB error rates of 3.7 and 0.5% respectively, indicating that both models performed well in discriminating captive behaviors. Both the ice bear and captive polar bear models exhibited strong predictive ability for identifying resting and walking behaviors in wild bears (> 90% MCC, precision, recall, and F-measure; Table 4 & Table S3 in the Supplement). Predictive abilities for other behaviors varied, with swimming and head shaking exhibiting strong predictive ability using the ice bear model (> 75% MCC, precision, recall, and F-measure), but lower predictive ability for eating, running, pouncing, grooming, digging, and rolling (Fig. 2, Tables 4 & 5). The model from ice bears had particularly greater ability than the captive polar bear model for swimming, pouncing, and digging (Fig. 2, Table S3).
The captive brown bear model provided weaker ability to distinguish behaviors of ice bears for walking, eating, and grooming (< 65% MCC and F-measure), but reliably distinguished resting (Fig. 2, Table S4).
Using the model from ice bears, eating had a high rate of false positive classifications resulting from digging behavior being incorrectly classified as eating (Table 5) as well as a high rate of false negative classifications with eating behavior incorrectly classified as either resting or walking (Table 5). A post hoc test using only feeding behavior while eating a recently killed ringed seal within the training and testing datasets failed to improve our ability to discriminate eating (MCC = 0.61, precision = 0.67, recall = 0.56, F-measure = 0.61). Additionally, running was often misclassified as walking, whereas rolling was often misclassified as resting (Table 5).
The most important predictors using the model from ice bears were static acceleration in the heave (staticY) and surge directions (staticX), wet/dry conductivity (wetdry), and frequency at the dominant power spectrum in the surge direction (fdpsX; Fig. 3). Differences in the intensity of behaviors were discernible in the ODBA measures, with head shaking having the greatest ODBA and resting having the lowest (Table S5). Eating and swimming showed similar mean ODBA values, but had differing mean static acceleration values (Table S5). Eating and grooming had low values of static acceleration in the heave direction (staticY), which was indicative of a head-down posture. Walking and running exhibited periodic undulating patterns in static acceleration in the heave direction (staticY; Fig. 4 & Fig. S1 in the Supplement), which was indicative of the bear's head moving up and down as it stepped. Wet/dry conductivity while swimming was lower for wild polar bears (x̄ = 81.9, SD = 81.5) than captive polar bears (x̄ = 205.3, SD = 57.8) and lower than all other behaviors (all x̄ > 234). A post hoc test excluding the conductivity variable reduced the ability of the algorithm to correctly identify swimming behaviors using the training dataset for ice bears (MCC = 0.47, precision = 0.77, recall = 0.29, F-measure = 0.42) with a high rate of swimming behaviors misclassified as resting.
Fig. 3 (caption, partial). The importance plot provides a relative ranking of parameters in which higher values indicate parameters that contributed more toward classification accuracy. Mean decrease in accuracy is normalized by dividing by the standard errors of the parameters (i.e. z-score). See Table 2 for description of parameters.
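To illustrate the frequency-domain predictors (dps and fdps) that appear in the importance ranking, the Python sketch below finds the dominant spectral peak and its frequency for one axis over a single 2 s window. It uses a direct discrete Fourier transform rather than an FFT library to stay dependency-free. The removal of the window mean and the power normalization are assumptions, since the study does not report those details.

```python
import cmath
import math


def dominant_frequency(window, rate_hz=16):
    """Return (power, frequency in Hz) of the largest non-zero-frequency peak."""
    n = len(window)
    mean = sum(window) / n
    centered = [v - mean for v in window]  # assumed: remove the static offset
    best_power, best_freq = 0.0, 0.0
    for k in range(1, n // 2 + 1):  # positive frequencies up to rate_hz / 2
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        power = abs(coeff) ** 2 / n
        if power > best_power:
            best_power, best_freq = power, k * rate_hz / n
    return best_power, best_freq


# Synthetic 2 s window at 16 Hz containing a 2 Hz oscillation,
# loosely analogous to a periodic head movement while stepping.
synthetic = [math.sin(2 * math.pi * 2 * t / 16) for t in range(32)]
dps, fdps = dominant_frequency(synthetic)
```

With a 2 s window the frequency resolution is 0.5 Hz, and the highest resolvable frequency is half the sampling rate.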

Behaviors on land
Video collars on land bears recorded 19 to 36 h of video (x̄ = 27 h, SD = 12 h, n = 2) and in total we collected 36 h of video-linked accelerometer data for the behaviors of interest. We identified 5 different behaviors from land bears, with resting being the most prevalent, followed by eating (Table 3). Eating on land consisted of berries, primarily crowberries Empetrum nigrum.
Our model with training data from land bears had an OOB error rate of 0.5% and had the greatest success in discriminating on-land behaviors (Fig. 5, Table S6). All behaviors, except for grooming and head shaking, had MCC, precision, recall, and F-measure values > 90% using the model from land bears (Fig. 5, Table S6). In particular, the model from land bears was able to distinguish eating (MCC = 0.95, precision = 0.95, recall = 0.96, F-measure = 0.95), which was not possible with the other datasets. Our model with training data from ice bears had success in discriminating resting behaviors on land (MCC = 0.60, precision = 0.96, recall = 1.0, F-measure = 0.98) and walking on land (MCC = 0.82, precision = 0.89, recall = 0.76, F-measure = 0.82), but eating was often misclassified as resting or walking (FP). The captive polar bear model performed similarly to the model from ice bears for discriminating behaviors on land (Fig. 5). The captive brown bear model performed less well than the other models for discriminating walking on land, but otherwise performed similarly to the models from ice bears and captive polar bears (Fig. 5).

Sampling frequency
The OOB error rate using the data from ice bears increased from 2.0 to 2.2% using an 8 Hz sampling frequency and to 2.6% using a 4 Hz sampling frequency. OOB error rate using data from land bears increased from 0.5 to 0.6% at 8 Hz and to 0.8% at 4 Hz. Predictive ability using an 8 Hz sampling frequency was nearly identical to 16 Hz among all behaviors using the dataset from ice bears (t₅₈ = 0.70, p = 0.24) and land bears (t₂₈ = 0.61, p = 0.27) based on MCC, precision, and recall. Predictive ability using a 4 Hz sampling frequency was lower than predictive ability using 16 Hz for ice bears (t₅₅ = 1.8, p = 0.04), but not for land bears (t₂₇ = 0.59, p = 0.28). In particular, the ability to discriminate the high intensity behaviors of pouncing and head shaking declined using a 4 Hz sampling rate (Fig. 6).

DISCUSSION
Our results show that tri-axial accelerometers in combination with measures of conductivity can reliably distinguish the 3 most common behaviors of wild polar bears Ursus maritimus (resting, walking, and swimming; Stirling 1974, Latour 1981, Hansson & Thomassen 1983, Lunn & Stirling 1985). This will provide a method to remotely document the activity budgets of these far-ranging animals, which can be further linked with location data from satellite collars to examine the effects of habitat on behavior and energy expenditure. Our results indicate that differences among habitats and species can impact the ability to discriminate behaviors in wild individuals using accelerometers. We found no loss in predictive ability using an 8 Hz sampling frequency, which would allow for twice the battery longevity of a 16 Hz rate and reduce the computational power needed for analyses. Although accelerometer studies on smaller species appear to require greater sampling frequencies (e.g. > 30 Hz; Broell et al. 2013, Brown et al. 2013), our results are similar to data obtained by Rekvik (2015) from captive brown bears U. arctos, and by Wang et al. (2015) from captive mountain lions Puma concolor, which both found little loss in predictive ability at sampling frequencies ≥8 Hz.

Habitat effects
Our results indicate that accelerometer signatures on sea ice are similar to signatures on land for most behaviors, but eating berries by land bears had a distinct signature that our ice bear model and captive bear models misclassified as grooming, resting, or walking. This highlights the value in linking observational and accelerometer data from wild subjects over multiple time periods and habitats, and the importance of accounting for as many behaviors as possible in training datasets. Knowledge of eating frequency and duration would provide insight in determining foraging success, an important determinant of individual reproductive success and survival (Stirling et al. 1999, Regehr et al. 2007, 2010). Although we had success discriminating eating events by land bears, we had lower precision and recall in discriminating eating events by ice bears. This was likely related in part to the movement pattern of bears eating berries, in which they typically stood with their head down and grazed. Conversely, bears eating on the sea ice exhibited a variety of positions including standing, sitting, and lying down, and both tore pieces of food from seals or gnawed on carcasses. Since most kill events involve bears pouncing on their seal prey (Stirling 1988, Derocher 2012), we may be able to identify successful kills based on the combination of a pouncing signature followed by eating signatures (e.g. Williams et al. 2014), but this requires further evaluation. Additionally, feeding on a seal would typically last for a prolonged period; hence, if the model primarily predicted eating over a prolonged period this could be used as an indication of a feeding event, but this also requires further evaluation.
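The idea of flagging possible kills from a pouncing signature followed by eating signatures could be prototyped along the lines of the Python sketch below, which scans a sequence of predicted behaviors (one label per 2 s interval). This is a hypothetical heuristic that has not been evaluated: the function name and both thresholds (`lookahead` and `min_eating`) are placeholders rather than values derived from this study.

```python
def candidate_kills(labels, lookahead=150, min_eating=30):
    """Return indices of pounces followed by sustained eating.

    labels: predicted behavior for each consecutive 2 s interval.
    lookahead: number of intervals inspected after a pounce (placeholder).
    min_eating: eating intervals required within that span (placeholder).
    """
    events = []
    for i, label in enumerate(labels):
        if label != "pouncing":
            continue
        following = labels[i + 1 : i + 1 + lookahead]
        if following.count("eating") >= min_eating:
            events.append(i)
    return events


predicted = (
    ["walking"] * 5
    + ["pouncing"]
    + ["eating"] * 40
    + ["resting"] * 10
    + ["pouncing"]  # a pounce with no eating afterwards
    + ["walking"] * 50
)
kills = candidate_kills(predicted)
```

Requiring a minimum amount of eating after the pounce, rather than a single eating interval, is intended to reduce the influence of the isolated misclassifications between eating, resting, and walking reported for ice bears.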

Use of captive animals and surrogate species
Our ability to discriminate behaviors was greatly improved by including data from free-ranging polar bears rather than using data from captive bears alone. However, resting and walking could be reliably discriminated using data from either captive or wild polar bears. This illustrates the value of collecting data from captive individuals when data collection is difficult or impossible from wild counterparts. However, data from captive brown bears exhibited poorer performance for predicting active behaviors in wild polar bears. This may be related to differences in walking kinematics between polar and brown bears as well as potential differences in limb lengths between the species (Renous et al. 1998).
Additionally, polar bears have longer necks relative to their body size than other ursid species (DeMaster & Stirling 1981), which could also affect accelerometer signatures from a neck-worn collar. Although Campbell et al. (2013) proposed the use of surrogate species to predict the behaviors of other species, our findings suggest that polar bear accelerometer signatures are likely species- and habitat-specific, at least for distinguishing specific behaviors. The brown bear model did reliably distinguish resting behavior in wild polar bears, which suggests that surrogate species could be used to distinguish coarse activity patterns such as active versus inactive (e.g. Gervasi et al. 2006, Ware et al. 2015).
Our analyses indicate that conductivity measures are needed to reliably discriminate swimming. Greater conductivity measures in captive polar bears that were swimming in fresh water likely caused the poorer performance for discriminating swimming in wild polar bears that were swimming in salt water. For pouncing, captive polar bears pounced on large plastic barrels, which resulted in similar measures of ODBA as wild counterparts, but had different signatures of static acceleration (i.e. body posture). Digging by wild bears, which was often through snow and ice into subnivean lairs to locate seals, exhibited greater ODBA measures and slightly different static acceleration than captive bears digging in snow and ice. These results suggest that some behaviors of captive bears may not fully reflect behaviors of their wild counterparts, which further illustrates the value of collecting simultaneous observational data (e.g. video) from free-ranging individuals to calibrate accelerometer-based behavioral data.

Accelerometer attachment
Regardless of which training dataset was used, we found lower precision and recall for predicting 5 of the behaviors tested for bears on the sea ice. Eating, grooming, and rolling had high rates of misclassification as resting, whereas running and digging had high rates of misclassification as walking. These results suggest the random forest algorithm may be prone to slightly overestimating the amount of true resting and walking behaviors when quantifying activity budgets. Our lower precision and recall for discriminating some behaviors were likely due in part to the attachment of the accelerometer on a collar. Although a number of studies have successfully discriminated behaviors using accelerometers on collars (Watanabe et al. 2005, Martiskainen et al. 2009, Soltis et al. 2012, McClune et al. 2014, Lush et al. 2015, Rekvik 2015, Wang et al. 2015), many of these studies limited their analyses to 4 or 5 behaviors or documented high misclassification rates for distinguishing some behaviors. Wang et al. (2015) similarly reported low accuracy of accelerometers on collars for predicting eating and grooming by captive mountain lions, and Lush et al. (2015) reported low accuracy for predicting some behaviors, including grooming, in wild brown hares Lepus europaeus. Attachment of the accelerometer to a collar, as opposed to attachment directly on the animal, likely introduces noise in the data due to independent collar motion (i.e. the collar must be fitted to ensure animals do not remove it, but loose enough to accommodate potential changes in body mass) and may reduce the ability of the accelerometer to detect some low intensity movements (Shepard et al. 2008). The effect of independent collar motion is evident in our large values of ODBA when bears shook their heads. This behavior may be useful for identifying the end of a swim, as bears are known to shake and roll in the snow following a swim. Additionally, our ability to discriminate head shaking allows for excluding it from potential energetic analyses using accelerometers. Use of a higher sampling frequency than was used in this study (i.e. >16 Hz) could potentially improve the ability to discriminate some fine-scale body movements (Nathan et al. 2012) such as eating, though Wang et al. (2015) sampled at 64 Hz and had low accuracy in discriminating eating behaviors of captive mountain lions.
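The precision and recall values discussed in this section, together with the F-measure and Matthews' correlation coefficient (MCC) named in the figure and table captions, can all be derived from a confusion matrix laid out with predicted behaviors in rows and actual behaviors in columns. The following is a minimal sketch using the standard one-versus-rest definitions; the function name and the counts are hypothetical and are not taken from the study's tables.

```python
from math import sqrt


def per_class_metrics(confusion, labels):
    """Per-behavior precision, recall, F-measure, and MCC (one-versus-rest).

    confusion[i][j] is the number of windows predicted as labels[i]
    whose video-confirmed behavior was labels[j].
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[k]) - tp                  # predicted k, actually another behavior
        fn = sum(row[k] for row in confusion) - tp   # actually k, predicted as another behavior
        tn = total - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (
            2 * precision * recall / (precision + recall)
            if precision + recall else 0.0
        )
        denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        mcc = (tp * tn - fp * fn) / denom if denom else 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
            "mcc": mcc,
        }
    return metrics


# Hypothetical counts in which eating is often misclassified as resting.
labels = ["resting", "walking", "eating"]
confusion = [
    [90, 5, 20],   # predicted resting
    [10, 80, 0],   # predicted walking
    [0, 5, 10],    # predicted eating
]
results = per_class_metrics(confusion, labels)
for label in labels:
    print(label, {name: round(value, 3) for name, value in results[label].items()})
```

With these illustrative counts, recall for eating is low because most eating windows fall in the resting row, which in turn inflates the predicted share of resting in an activity budget, the pattern described above.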

Video calibration
Having video-linked observational data from camera-mounted collars on wild polar bears was the most practical method to calibrate accelerometers on free-ranging individuals. However, because the animal's body was not visible in the video, some behaviors may have been incorrectly classified. For example, distinguishing walking from running was often challenging, as was determining when bears were actively swimming versus resting in the water. Both of these difficulties could have contributed to the misclassifications between running and walking and between swimming and resting. Additionally, the models had greater success discriminating behaviors as sample sizes increased. Although unbalanced datasets are known to affect the predictive ability of random forest algorithms (Chen et al. 2004), we found that the inclusion of larger sample sizes in the training dataset was more important than imbalance. This highlights the value of calibrating accelerometers from multiple individuals over prolonged periods.

CONCLUSIONS
Our results underscore the importance of thoroughly validating accelerometers for use in remote detection of behavior, ideally on a species- and habitat-specific level. The use of tri-axial accelerometers, as shown here, will enable detailed assessments of polar bear behaviors to better understand polar bear habitat use and the implications for energy demands. For example, measures of acceleration could be combined with measures of oxygen consumption from captive bears while resting, walking, and swimming to both quantify activity budgets and estimate the energetic costs of these behaviors (e.g. Wilson et al. 2006, 2012, Halsey et al. 2009, 2011, Gómez Laich et al. 2011, Williams et al. 2014). Future advances are needed that would enable remote transmission of raw accelerometer data to further enhance the applicability of these devices to animals occurring in remote environments and obviate the need for sensor recovery. As declines in sea ice are expected to increase the activity rates of polar bears across much of their range (Derocher et al. 2004, Molnár et al. 2010, Sahanatien & Derocher 2012), the use of accelerometers provides a method to monitor the impacts of habitat change on activity and energy budgets to better understand the implications for body condition, reproductive success, and survival of this Arctic apex predator.

Fig. 3. Variable importance plot from the random forest model of accelerometer data from polar bears on the sea ice. The importance plot provides a relative ranking of parameters in which higher values indicate parameters that contributed more toward classification accuracy. Mean decrease in accuracy is normalized by dividing by the standard errors of the parameters (i.e. z-score). See Table 2 for description of parameters

Fig. 4. Accelerometer signatures of static acceleration in the surge (X), heave (Y), and sway (Z) directions and overall dynamic body acceleration (ODBA) while walking, swimming, standing, and eating a seal from an adult female polar bear Ursus maritimus on the sea ice of the southern Beaufort Sea

Table 1. Polar bears Ursus maritimus and brown bears U. arctos wearing collars with tri-axial accelerometers that were video recorded (captive bears) or that wore video-equipped collars (wild bears)

Fig. 1. Orientation of an archival logger containing a triaxial accelerometer attached to a collar for use on polar Ursus maritimus and brown bears U. arctos

Table 3. Number of 2 s long behaviors used in random forest training datasets for predicting behaviors of wild polar bears Ursus maritimus. Ice bears: polar bears on the sea ice of the southern Beaufort Sea. Land bears: polar bears on Akimiski Island, Nunavut

Fig. 2. Ability (F-measure) of the random forest model to predict 10 behaviors of polar bears Ursus maritimus on the sea ice from 3 different training datasets of accelerometer data. Ice bears: polar bears on the sea ice of the southern Beaufort Sea

Table 4. Performance of a random forest model using accelerometer data from polar bears Ursus maritimus on the sea ice to predict behaviors of bears on the sea ice as verified by video data. MCC: Matthews' correlation coefficient

Table 5. Cross-validation comparing predicted behaviors (rows) from accelerometer analyses of polar bears Ursus maritimus on the sea ice to actual behaviors (columns) confirmed by video recordings. Correct classifications are denoted in bold. See Table 4 for performance statistics in predicting behaviors