Open data and the future of conservation biology

The efficiency of conservation measures largely depends on our ability to understand numerous biotic and abiotic factors, and the broad array of their interactions and dependencies, which are often scale-sensitive. To generate precise evidence for causes, outputs and processes, and thus to translate knowledge into conservation actions, advanced methods and an unbiased synthesis of data are required. Following scientific advances, along with the support of new technologies, research questions in conservation biology have gradually evolved. To address these questions, new methods and approaches have been developed, aiming to capture a more holistic picture of the dynamic and multidimensional processes structuring biodiversity patterns. The demands of such modern methodological tools can only be met through the use of transparent and credible data, collected over various spatial and temporal scales. Here, the basic concept behind recent methodological advances — viz. (1) decision support tools for spatial conservation planning, (2) cumulative effect assessments and (3) ecological niche models — which offer innovative analyses for the conservation of biodiversity, is briefly presented. The need for standardized analytical methodologies seems to be properly acknowledged. Yet, the application, precision and validation of any such modern tool largely depend on the available data. The need for transparent and credible open-access data is more urgent than ever.

An oversimplified conceptual model delineating the processes involved in conservation biology would likely involve 3 main modules: information, knowledge and policy (Fig. 1). Although background information on environmental, ecological, socioeconomic, demographic, cultural and technological drivers of change represents the cornerstone for this dynamic chain, the ability to translate information into knowledge is the key process that could actually result in concrete output (Cook et al. 2010, Dicks et al. 2014). As the next step, policy and management instruments capitalize on this knowledge. This transfer of theory and of empirical evidence into practice represents a fundamental part of conservation science, as it enhances public awareness and provides the evidence needed to structure societal perceptions and foster policy capacity on environmental issues (Norton 1998, Arlettaz et al. 2010, Cook et al. 2013, Walsh et al. 2015).
Even if we assume that conservation capacity comprises only these 3 basic structural elements (i.e. information, knowledge and policy), a degree of complexity could be added once we accept that the processes and interactions related to these components are scale-sensitive (Henle et al. 2014). Human (e.g. disturbance, exploitation) and natural (e.g. evolution, succession, colonization) processes operate at various temporal and spatial scales (Fig. 1). These scales actually determine the interactive nature of such processes, as well as the directionality and intensity of their effects upon ecosystem functionality and services (Sala et al. 2000, Tzanopoulos et al. 2013).
The overall objectives of conservation biology, as a main branch of environmental science, are to adequately protect natural environments, preserve biological diversity, maintain ecosystem functionality and services and design measures towards mitigating and adapting to global environmental change (e.g. land use and climate change) (Lubchenco 1998). Conservation biology also aims to contribute to the improvement of human livelihoods and the quality of human life (Lubchenco et al. 1991, Robinson 2006).
The efficiency of management, restoration and conservation measures largely depends on our consideration of numerous biotic and abiotic factors, and the broad array of their interactions and dependencies (Loreau et al. 2001). Solving this puzzle of complexity offers a unique opportunity to provide an in-depth understanding of natural systems and their processes (Brand & Karvonen 2007), while at the same time it triggers the development of new areas of expertise and new scientific disciplines (Gibbons et al. 1994, Meine et al. 2006). Yet, under this new order in science, nobody can actually be an expert in all aspects involved in conservation biology (Ludwig et al. 2001).
Although the sharing of knowledge can be satisfied by the increasing volume of published scientific evidence, integrated and improved outputs rely on the transparency and credibility of data (Livoreil et al. 2016, Schofield 2017, this Theme Section). To this end, international research projects and scientific collaborations enhance the exchange of information and target the standardization of methodologies and available data (Levin et al. 2014, Stephenson et al. 2016, Katsanevakis et al. 2017, A. D. Mazaris et al. unpubl.). Nevertheless, the main question is whether the valuable information collected globally (including citizen-based information) is available and thus could support the most effective measures for conserving biodiversity and ecosystem services. To generate precise evidence, conservation biology requires cross-sectoral collaboration and an unbiased synthesis of data (Sutherland et al. 2006, Hays et al. 2010, 2014). The lack of freely available, accurate information on the types of data and the methods used to collect them could largely decelerate this progress.

ADVANCED METHODS, INCREASED DATA NEED
Following the evolution of conservation biology, the applied methodologies and the research questions themselves have evolved in a way to capture a more holistic picture of the dynamic and multidimensional processes structuring biodiversity patterns (Sutherland et al. 2006). Once again, it is clear that the improvement of tools and the development of comprehensive methods drive, but are also directed by, advances in our understanding of how natural systems function (Brook et al. 2000, Mac Nally 2000, Pearson & Dawson 2003, Hampe & Petit 2005). Below, I briefly present 2 advanced, modern systematic frameworks for conservation decision making and planning, which clearly reflect the evolution of knowledge, delineating the need for multidisciplinary efforts. Both the optimized models for systematic planning of conservation areas (Margules & Pressey 2000, Moilanen 2007, Pressey et al. 2007, Watts et al. 2009) and the cumulative effect assessments (Schindler 2001, Halpern et al. 2008, Maxwell et al. 2013) require information on (1) the distribution, abundance and dynamics of biotic communities and populations, (2) the potential changes in environmental, physical, social and economic drivers, (3) the impacts of global environmental changes and (4) coarse- and fine-scale data on the functionality of ecosystems. Considering the variety of information needed to draw accurate policy recommendations, it becomes apparent that open-access datasets could significantly improve the applicability of such tools, supporting decisions on conservation planning and prioritization.

Optimized models for systematic conservation planning
Currently, more than 200 000 protected areas exist globally (UNEP-WDPA 2016). Although political and scientific criteria were originally applied for the design and selection of conservation areas (Margules & Usher 1981), from the 1980s attempts were mainly focused on the scientific components of prioritization (Justus & Sarkar 2002). Still, the efficiency of the network of protected areas and their ability to adequately protect species and habitats are often questioned (Rodrigues et al. 2004, Joppa & Pfaff 2009, Mazaris et al. 2014, Pendoley et al. 2014, Pimm et al. 2014). In order to fill conservation gaps and maximize gains from management, advanced tools have been developed. Sophisticated conservation planning modelling frameworks (such as MARXAN, Possingham et al. 2000; Zonation, Moilanen et al. 2005; or MARXAN with Zones, Watts et al. 2009) have extended traditional prioritization methods by implementing multi-objective planning, addressing species-specific connectivity patterns and considering various socioeconomic factors while attempting to quantitatively maximize biodiversity representation targets (e.g. Beger et al. 2010, Moilanen et al. 2011). Such spatial conservation planning tools are open to the public, with software packages including manuals, tutorials and data for testing.
Such improved modelling schemes offer spatially explicit alternatives towards directing conservation decisions. However, to achieve their promising goals, these models need to be parameterized with a great amount of spatially explicit data. Thus, data are needed on biodiversity features (e.g. distribution of species, habitats, ecosystem services, distribution of alleles), physical backgrounds (e.g. land cover, vegetation layers, elevation/bathymetry maps) and socioeconomic parameters (e.g. land costs, opportunity costs, management costs, culturally significant areas), while they could further incorporate uncertainty due to various drivers (e.g. climate change, land use changes).
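These data streams feed an optimization problem: represent every biodiversity feature at minimum cost. As a rough illustration only (MARXAN and Zonation use far more sophisticated objectives and algorithms, e.g. simulated annealing), a greedy sketch of the underlying minimum-set problem might look as follows; all sites, features, costs and targets below are hypothetical:

```python
# Greedy sketch of the minimum-set reserve selection problem that tools
# such as MARXAN solve with more sophisticated algorithms (e.g. simulated
# annealing). All sites, features, costs and targets are hypothetical.

def greedy_reserve_selection(occurrences, costs, targets):
    """occurrences: dict site -> set of biodiversity features present
    costs:       dict site -> acquisition/management cost
    targets:     dict feature -> number of selected sites required"""
    selected = []
    covered = {f: 0 for f in targets}
    remaining = set(occurrences)
    while any(covered[f] < targets[f] for f in targets) and remaining:
        def gain(site):  # features at this site still short of target
            return sum(1 for f in occurrences[site]
                       if covered.get(f, 0) < targets.get(f, 0))
        # pick the site with the best coverage gain per unit cost
        best = max(sorted(remaining), key=lambda s: gain(s) / costs[s])
        if gain(best) == 0:
            break  # no remaining site contributes to an unmet target
        selected.append(best)
        remaining.discard(best)
        for f in occurrences[best]:
            if f in covered:
                covered[f] += 1
    return selected

occurrences = {"A": {"seagrass", "turtle"}, "B": {"turtle"},
               "C": {"seagrass", "monk_seal"}, "D": {"monk_seal"}}
costs = {"A": 2.0, "B": 1.0, "C": 3.0, "D": 1.0}
targets = {"seagrass": 1, "turtle": 1, "monk_seal": 1}
print(greedy_reserve_selection(occurrences, costs, targets))  # ['A', 'D']
```

In practice, the planning-unit table, feature occurrences and cost layers of such a problem are exactly the open-access spatial data layers discussed above.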
Once again, the main question remains: How can we obtain this information in order to ensure that we will gain the maximum benefits from such systematic conservation planning tools over different scales? Acknowledging the increased conservation needs and the requirements of these promising conservation tools, the option of open-access data represents a viable solution towards enhancing our conservation ability (Reichman et al. 2011, Hampton et al. 2013, Turner et al. 2015).

Cumulative effect assessments
A challenge for the scientific community, conservationists and policy makers lies in the identification and evaluation of key effects derived from various sources and events upon biotic communities and ecosystem processes (Adams 2005, Crain et al. 2008, Borja et al. 2016b). The systematic methodology developed to address this challenge is often referred to as 'cumulative effect assessment'. Taking the marine environment as an example, direct (e.g. climate change, pollution; Harley et al. 2006) or indirect drivers of change (e.g. demographic, economic, sociopolitical conditions; Mazaris et al. 2017), operating at different scales, could affect various components and processes (Halpern et al. 2008). Human activities (e.g. overfishing, inputs of nutrients, physical modifications) can alter the state of marine ecosystems and influence the viability of populations (e.g. Coll et al. 2008). These causes, events and actions are actually the source of pressures and effects upon different receptors. Receptors could represent ecological (e.g. fish, seagrass, sea mammals, habitats), physical (beaches, water column) or socioeconomic (fishing industry, tourism) entities (Judd et al. 2015). Interestingly, the pathways (e.g. hydrodynamic regime, alteration of biophysical conditions) by which a receptor is exposed to pressures are not linear. For example, the interactions of various pressures could result in additive, multiplicative or even mitigative effects upon ecosystem components (Korpinen & Andersen 2016). Similarly, causes and events (e.g. a human activity) could have additive, cumulative, synergistic and antagonistic effects upon various ecosystem entities, while these effects upon different receptors could be either continuous or discontinuous through time (Scherer 2011).
The issue of spatial scales when describing pressures and interactions, by definition, raises the need for various data sources and types. It is the spatial extent of the ecosystems, the heterogeneity of the sea- and landscape and the non-linearities of anthropogenic drivers and pressures that further highlight the necessity for increasing information. Under the ultimate goal of protecting the environment, modern conservation should be based on innovative analyses of processes and data across scales. Systematic evaluations of cumulative impacts at coarse scales might limit the ability to represent the real environmental conditions and the potential problems at finer scales (Guarnieri et al. 2016). Regional specificities and idiosyncratic, behavioural, biological and evolutionary attributes of different biotic components could lead to discrepancies between expected and observed impacts (Jones 2016). Additionally, various types of uncertainty could accompany any such assessment (Regan et al. 2002). The need to minimize the impact of these sources of uncertainty is therefore being increasingly recognized in order to improve the performance and efficiency of cumulative effect assessments (Stelzenmüller et al. 2015, Stock & Micheli 2016). The improvement of knowledge exchange, the harmonization of methods for collecting and mapping data and the reinforcement of data accessibility are top priorities (Levin et al. 2014, Cvitanovic et al. 2015).
Various methods, techniques and conceptual frameworks have been developed and applied to assess cumulative effects on ecosystems, with the principal objective being the generation of human impact maps. The most widely applied model was introduced by Halpern et al. (2008), allowing the generation of spatial cumulative effects at various spatial and administrative scales. This additive model combines information on (1) the intensity of stressors in any cell of the sea- or landscape mosaic, (2) each ecosystem component (i.e. populations, species, habitats) identified in this cell and (3) the relative sensitivity of each component to each pressure. This basic model has been applied in numerous studies, triggering the development of new integrative assessment methods towards considering multiple ecosystem components in a more holistic way (Borja et al. 2016a). The majority of such tools are available to the public (e.g. the code for the original model of Halpern et al. 2008 and Stock 2016; or the data, code and methods of the advanced Ocean Health Index model of Halpern et al. 2012), and their use is rather straightforward; these factors make them very attractive for use. Still, the efficiency of the assessments generated depends explicitly on the quality and resolution of the available data, the assumptions made (Stock & Micheli 2016) and the information on the pressure−state links for the different organizational levels (i.e. populations, species, habitats; Borja et al. 2016a). Therefore, open-access data could not only parameterize the existing frameworks, but could also be used for model validation and thus contribute to methodological development.
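The additive structure described above can be sketched in a few lines: for each cell, the impact score sums stressor intensity × component presence × a stressor-by-component sensitivity weight. This is a minimal, hypothetical illustration of the structure of the Halpern et al. (2008) index, not their implementation; the intensities, components and weights below are invented:

```python
# Minimal sketch of an additive cumulative-impact score per cell, in the
# spirit of Halpern et al. (2008): sum over stressors and ecosystem
# components of intensity x presence x sensitivity weight. All values
# below are invented for illustration; the original method also
# log-transforms and rescales stressor intensities.

def cumulative_impact(intensity, presence, sensitivity):
    """intensity:   dict stressor -> intensity in the cell (0-1)
    presence:    dict component -> 1 if present in the cell else 0
    sensitivity: dict (stressor, component) -> weight mu"""
    return sum(intensity[s] * presence[c] * sensitivity.get((s, c), 0.0)
               for s in intensity for c in presence)

intensity = {"fishing": 0.8, "nutrients": 0.3}
presence = {"seagrass": 1, "reef": 0}   # only seagrass occurs in this cell
sensitivity = {("fishing", "seagrass"): 1.5,
               ("nutrients", "seagrass"): 2.0,
               ("fishing", "reef"): 3.0}
print(round(cumulative_impact(intensity, presence, sensitivity), 2))  # 1.8
```

Mapping this score across every cell of a sea- or landscape mosaic yields the human impact maps mentioned above, which is precisely why wall-to-wall open-access stressor and habitat layers are so valuable.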

PROJECTING IN TIME AND SPACE
The complexity of the phenomena studied in ecology and addressed within the field of conservation biology is not the only reason why open access to data is a predominant need for a sustainable future.
Conservation and management plans are developed by targeting the future. Therefore, considering that future targets are defined based on current knowledge, it is critical to quantitatively validate the generated suggestions.
For example, future predictions of species distributions are important to mitigate the impacts of global environmental challenges such as climate change (Elith et al. 2010, Mazaris et al. 2015, Almpanidou et al. 2016). Following the increased need to understand potential range shifts, over the last decade species distribution models, also known as ecological niche models (ENMs), have become a popular tool for providing quantitative estimates of species distributions across a range of spatial and temporal scales (Pearson & Dawson 2003, Elith & Graham 2009, Elith & Leathwick 2009, Beale & Lennon 2012). The concept behind ENMs is rather straightforward and involves associating information on the occurrence of a given species with a set of environmental predictor variables and then identifying sites with similar attributes as potential occurrence areas (Guisan & Zimmermann 2000, Kearney et al. 2010).
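To make the concept concrete, a BIOCLIM-style environmental envelope, one of the simplest correlative ENM approaches, flags a site as potentially suitable when each of its predictor values falls within the range observed at known occurrence sites. The sketch below uses invented predictor names and values:

```python
# BIOCLIM-style environmental envelope, one of the simplest correlative
# ENM approaches: a site is potentially suitable when every predictor
# value lies within the range observed at occupied sites. Predictor
# names and values are invented for illustration.

def fit_envelope(occurrence_env):
    """occurrence_env: list of dicts, predictor -> value at occupied sites."""
    predictors = occurrence_env[0].keys()
    return {p: (min(e[p] for e in occurrence_env),
                max(e[p] for e in occurrence_env))
            for p in predictors}

def suitable(envelope, site_env):
    return all(lo <= site_env[p] <= hi for p, (lo, hi) in envelope.items())

# hypothetical sea-surface temperature (degC) and depth (m) at known sites
records = [{"sst": 24.0, "depth": 5.0},
           {"sst": 27.5, "depth": 12.0},
           {"sst": 26.0, "depth": 8.0}]
env = fit_envelope(records)
print(suitable(env, {"sst": 25.0, "depth": 10.0}))  # True: within envelope
print(suitable(env, {"sst": 19.0, "depth": 10.0}))  # False: too cold
```

Modern ENMs replace the crude min−max envelope with statistical or machine-learning response curves, but the data requirement is the same: georeferenced occurrences plus environmental predictor layers, both of which open-access repositories can supply.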
Regardless of the wide range of ENM applications in predicting species distributions (Elith & Leathwick 2009, Beale & Lennon 2012), the accuracy of ENMs is often questioned (Fitzpatrick & Hargrove 2009, Elith et al. 2010, Synes & Osborne 2011). This is mainly because they aim to identify fundamental and not realized niches (as defined by Hutchinson 1957), they do not take into account biotic interactions and species plasticity, and the predictor variables could themselves represent a source of uncertainty affecting model outputs (e.g. Real et al. 2010).
It should be noted that predictions generated by these models could serve as inputs for systematic conservation planning tools, while ENMs are often applied over large spatial scales (regional, continental or global). It is therefore critical to quantitatively validate the outputs and the uncertainties of these models (Fitzpatrick & Hargrove 2009, Wiens et al. 2009, Synes & Osborne 2011, Beale & Lennon 2012). Open-access data could offer a window of opportunity for using valuable ecological information towards improving prediction accuracy, or even for validating model performance when applying and testing models over different locations. As the main obstacle impeding the practical use of ENM outputs is the difficulty of measuring the accuracy of the predictions at the present time (Araújo & Guisan 2006), open-access data could provide the basis for empirically testing the predictions over space and time, while also allowing for model improvements by including more biotic attributes and incorporating local environmental heterogeneity into the models.
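Validation against independent records is commonly summarized by the area under the ROC curve (AUC), i.e. the probability that a randomly chosen presence site receives a higher predicted suitability than a randomly chosen absence site. A minimal sketch, with invented suitability scores and labels standing in for independent, open-access field records:

```python
# AUC as a rank-based discrimination measure for ENM validation:
# the probability that a random presence outranks a random absence
# (ties count 0.5). Scores and labels below are invented.

def auc(scores, labels):
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

suit = [0.9, 0.6, 0.4, 0.7, 0.2, 0.8]   # model-predicted suitability
obs = [1, 1, 0, 1, 0, 0]                # independent presence/absence
print(auc(suit, obs))  # ~0.778: 7 of 9 presence-absence pairs ranked correctly
```

An AUC of 0.5 indicates no discrimination and 1.0 perfect discrimination; testing the same model against records from a different region or time period, as the text suggests, simply means recomputing such a score on those independent data.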

RECENT POLICY STRATEGIES INDICATE THE NEED FOR OPEN-ACCESS DATA
Although there is a temporal mismatch between scientific interventions and policy, the first positive signs of acceptance of the need for a multidisciplinary approach to modern conservation are evident. As an example, the recent agenda of the European Union (EU), presented under the flag of the 'EU Strategy on Green Infrastructure', pointedly highlights the need for integrating knowledge and combining research fields. It is the original definition of green infrastructure by the European Environment Agency (EEA) which makes clear that a broad horizon of knowledge is needed to enhance conservation, with a list of key words reflecting the multiple dimensions of integration: 'Green infrastructure is a strategically planned network of natural and semi-natural areas with other environmental features designed and managed to deliver a wide range of ecosystem services … to provide environmental, economic and social benefits through natural solutions' (EEA 2016).
The future of EU conservation, as reflected by the green infrastructure, does not ignore major challenges such as 'climate change', while it also prioritizes human health and quality of life. It is therefore clear that this new policy requires a holistic transdisciplinary approach in order to understand social−ecological interdependencies and to design the most effective conservation, management and restoration plans. Yet, a key question arises: Can such integration be achieved? Based on the few examples mentioned above, it is apparent that 'green infrastructure' reflects a new direction of conservation biology. Thus, the potential and efficiency of 'green infrastructure' itself largely depend on the availability and quality of data from the natural, formal and social sciences.
A shift in the EU to a more anthropocentric focus is detected in the major financial instrument 'Horizon 2020', which aims to achieve smart, sustainable and inclusive growth through funding of research and innovation. A novelty in Horizon 2020 is the Open Research Data Pilot, which aims to improve and maximize access to and re-use of research data generated by projects. In the same context, EU agencies coordinate several programmes (e.g. the Corine Land Cover inventory of the EEA) and operate repositories of open environmental datasets (e.g. Copernicus of the European Space Agency; the EEA data portal; the EU Open Data Portal). Thus, the message that we get from the EU is not only a redirection of needs or priorities but mainly a shift to a more holistic consideration of the environment and its interactions. The advanced methodological tools, updated datasets and revised policies and targets seem extremely promising, but the development of effective conservation strategies also needs to rely on improving the precision of the outputs generated. The collection, management and dissemination of scientific open-access data could offer this advantage.