Using ensemble-mean climate scenarios for future crop yield projections: a stochastic weather generator approach

Using climate scenarios from only 1 or a small number of global climate models (GCMs) in climate change impact studies may lead to biased assessment due to large uncertainty in climate projections. Ensemble means in impact projections derived from a multi-GCM ensemble are often used as best estimates to reduce bias. However, it is often time-consuming to run process-based models (e.g. hydrological and crop models) in climate change impact studies using numerous climate scenarios. It would be interesting to investigate if using a reduced number of climate scenarios could lead to a reasonable estimate of the ensemble mean. In this study, we generated a single ensemble-mean climate scenario (En-WG scenario) using ensemble means of the change factors derived from 20 GCMs included in CMIP5 to perturb the parameters in a weather generator, LARS-WG, for selected locations across Canada. We used En-WG scenarios to drive crop growth models in DSSAT ver. 4.7 to simulate crop yields for canola and spring wheat under RCP4.5 and RCP8.5 emission scenarios. We evaluated the potential of using the En-WG scenarios to simulate crop yields by comparing them with crop yields simulated with the LARS-WG generated climate scenarios based on each of the 20 GCMs (WG scenarios). Our results showed that simulated crop yields using the En-WG scenarios were often close to the ensemble means of simulated crop yields using the 20 WG scenarios with a high probability of outperforming simulations based on a randomly selected GCM. Further studies are required, as the results of the proposed approach may be influenced by selected crop types, crop models, weather generators, and GCM ensembles.


INTRODUCTION
Climate change has significant impacts on various systems worldwide, including ecology, hydrology, and agriculture (IPCC 2014). Climate change impact assessments are essential to developing policies and strategies for climate change adaptation and mitigation. Climate change impact studies are often based on modelling results using future climate scenarios to drive impact models, such as hydrological models for water resources, crop growth models for crop production, and ecosystem models for the environment. Global climate models (GCMs) are the major tools used to project future climate scenarios, and their outputs are often used in the studies of climate change impacts (e.g. Qian et al. 2016a,b, Al Samouly et al. 2018). An increasing number of GCMs have been available in recent years (Lutz et al. 2016). Moreover, improvements in other aspects of climate modelling are incorporated in the Coupled Model Intercomparison Project Phase 5 (CMIP5), including a new set of emission scenarios, i.e. representative concentration pathways (RCPs) (Moss et al. 2010). In spite of improvements in CMIP5, uncertainty in climate projections is still large due to differences in model response, forcing scenarios, and internal climate variability. Thus, it is recommended to use multi-model ensembles in climate change impact studies to reduce potential biases in impact projections (Semenov & Stratonovitch 2010). For example, Gao et al. (2019) assessed the responses of hydrological processes to climate change over the southeastern Tibetan Plateau using 18 GCMs in CMIP5. Qian et al. (2019a) conducted a comprehensive evaluation of climate change impacts on Canada's crop production under different levels of global warming based on 20 GCMs.
However, it is often time-consuming to run process-based models, such as watershed hydrological models or crop growth models, in climate change impact studies when a large number of climate scenarios are used. For example, in a recent study (Webber et al. 2018), simulations were run on 8157 grids in Europe to estimate the impact of climate change on wheat and maize using 10 crop models and 5 GCMs. The ensemble means provide essential information to policymakers and the general public, as ensemble means can serve as a best estimate for the impact of climate change (Challinor et al. 2018). Moreover, using ensemble means will reduce bias in assessments where only 1 GCM is used. For example, Qian et al. (2020) found that projections of crop production in Canada using climate scenarios from 1 climate model differ substantially from the ensemble means derived using climate scenarios from multiple GCMs in CMIP5. Moreover, multiple runs of the GCMs, i.e. perturbed physics ensembles (PPEs), are often available, but only 1 run (member) of the GCMs is often used in climate change impact studies where the multi-GCM approach is adopted. Large uncertainty due to internal climate variability was quantified for crop yield projections in Canada using large ensembles of 1 GCM and 1 regional climate model (Qian et al. 2020). Therefore, it is practically useful if a single climate scenario representative for climate change scenarios from multiple GCMs, and particularly multiple members of PPEs, can be generated to drive process-based impact models for efficient estimation of ensemble means in climate change impact studies. Adopting this approach may substantially reduce the use of resources in running simulations with climate scenarios from a large and increasing number of GCMs and PPEs to derive ensemble means, thus freeing up resources for other aspects in climate change impact and adaptation studies, in addition to reducing bias.
It has been reported that multi-model averaging can enhance the reliability of climate projections (Wang et al. 2017, Al Samouly et al. 2018), largely resulting from the cancellation or compensation of errors in the individual models even on the regional scale (Pierce et al. 2009). However, directly averaging daily climate outputs from multiple climate models to drive process-based models is not applicable because averaging smooths climate variability. It is even more difficult, if not impossible, to develop such a single scenario by averaging GCMs, as the relationship between climate variables and their impacts may not be linear (Wang et al. 2018, Whitfield & Cannon 2000). In this study, we attempted to generate the aforementioned single ensemble-mean climate scenarios (En-WG scenarios) by using ensemble means of the change factors derived from 20 GCMs in CMIP5 to perturb distributions of the site parameters of climatic variables in the Long Ashton Research Station Weather Generator (LARS-WG) (Semenov & Barrow 1997). The generated En-WG scenarios for 2 future periods (2040−2069 and 2070−2099) under RCP4.5 and RCP8.5 were used to drive crop growth models in the Decision Support System for Agrotechnology Transfer (DSSAT) ver. 4.7 (Hoogenboom et al. 2017) to simulate yields for 2 major crops (canola and spring wheat) at selected locations across Canada. Simulated crop yields using the En-WG scenarios were compared with the ensemble means of simulated yields using 20 climate scenarios generated by perturbing the LARS-WG site parameters with change factors estimated from each of the 20 GCMs (WG scenarios), for the same future periods and RCPs.
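The point that directly averaging daily outputs smooths variability can be illustrated with a small synthetic experiment. The sketch below is not based on any GCM data: it assumes 20 hypothetical, statistically independent daily temperature series drawn from the same normal distribution, and all names and numbers in it are illustrative only.

```python
import random
import statistics

N_MODELS = 20    # size of the hypothetical ensemble
N_DAYS = 3000    # length of each synthetic daily series


def simulate_daily_series(n_days, mean, sd, seed):
    """Return a synthetic daily temperature series (illustrative only)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n_days)]


# One synthetic series per hypothetical model, all with the same climate.
series = [simulate_daily_series(N_DAYS, 15.0, 5.0, seed=m)
          for m in range(N_MODELS)]

# Day-by-day average across the models.
daily_mean_series = [statistics.fmean(day) for day in zip(*series)]

# Day-to-day variability of a typical member versus the averaged series.
mean_member_sd = statistics.fmean([statistics.pstdev(s) for s in series])
averaged_sd = statistics.pstdev(daily_mean_series)

print(f"mean SD of individual members: {mean_member_sd:.2f}")
print(f"SD of the day-by-day average:  {averaged_sd:.2f}")
```

Under these assumptions the averaged series keeps the ensemble-mean temperature but loses most of its day-to-day variability, which is why it would be a poor direct input to a process-based model.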
The objective of this study was to investigate the potential of generating 1 single ensemble-mean climate scenario (En-WG), using a stochastic approach based on multiple GCMs, for estimating crop yields and to compare them with the ensemble means of simulated crop yields using individual climate scenarios (WG scenarios) from the multi-GCM ensemble under different RCPs.
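As a rough illustration of the ensemble-mean idea, the sketch below computes monthly change factors for each GCM and then averages them across GCMs. It assumes additive factors (future minus baseline) for monthly mean temperatures and ratio factors (future divided by baseline) for all other statistics; this convention, the variable names, and the toy numbers are assumptions for illustration and are not taken from the study.

```python
from statistics import fmean

# Assumption: temperature means are perturbed additively, everything else
# (precipitation, spell lengths, temperature SD) multiplicatively.
ADDITIVE_VARIABLES = {"tmax", "tmin"}


def change_factors(baseline, future):
    """Monthly change factors for one GCM.

    baseline, future: dicts mapping a variable name to 12 monthly statistics.
    """
    factors = {}
    for name, base_values in baseline.items():
        if name in ADDITIVE_VARIABLES:
            factors[name] = [f - b for b, f in zip(base_values, future[name])]
        else:
            factors[name] = [f / b for b, f in zip(base_values, future[name])]
    return factors


def ensemble_mean_change_factors(per_gcm_factors):
    """Average the change factors month by month across all GCMs."""
    names = per_gcm_factors[0].keys()
    return {
        name: [fmean([g[name][m] for g in per_gcm_factors]) for m in range(12)]
        for name in names
    }


# Toy example with 3 hypothetical GCMs and constant monthly values.
baseline = {"tmax": [20.0] * 12, "prec": [50.0] * 12}
futures = [
    {"tmax": [22.0] * 12, "prec": [55.0] * 12},
    {"tmax": [23.0] * 12, "prec": [45.0] * 12},
    {"tmax": [24.0] * 12, "prec": [50.0] * 12},
]
per_gcm = [change_factors(baseline, fut) for fut in futures]
en_factors = ensemble_mean_change_factors(per_gcm)
print(en_factors["tmax"][0], en_factors["prec"][0])
```

In this sketch each entry of `per_gcm` would correspond to one WG scenario and `en_factors` to the single En-WG scenario, so the weather generator and the crop model are run once instead of once per GCM.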

Study areas
In this study, we selected 10 locations with diverse climatic conditions and soils, covering agricultural production areas across Canada (Fig. 1), for canola and spring wheat yield simulation. Basic information (including geographical position, climate, and soil) for these 10 locations is presented in Table 1.

Climate data
Observed daily maximum temperature (Tmax), daily minimum temperature (Tmin), and daily precipitation (Prec) for 1971−2000 at the 10 locations were obtained from Environment and Climate Change Canada's National Climate Data and Information Archive. Values of daily solar radiation (Rad) were extracted from a high-resolution global dataset of meteorological forcings for land surface modelling (Sheffield et al. 2006) because Rad was not observed at most locations.
Bias-corrected and downscaled GCM data including daily Tmax, Tmin, Prec, and Rad (GCM scenarios) were used to drive the crop models and calculate the climate change factors used in stochastic weather generation. Daily outputs of Tmax, Tmin, Prec, and Rad from the 20 GCMs in the CMIP5 archive (Table 2) for the baseline period 1971−2000 and 2 projection periods, 2040−2069 and 2070−2099, under the forcing scenarios RCP4.5 and RCP8.5 were used in this study. RCP4.5 and RCP8.5 represent medium-low and high emission scenarios with a radiative forcing of 4.5 and 8.5 W m⁻² at the end of the 21st century, respectively (IPCC 2014).
Observed data were used to bias correct and downscale the GCM simulations as well as for calibrating the LARS-WG site parameters. Bias correction/downscaling is performed using a multivariate form of quantile mapping, namely multivariate bias correction using N-dimensional probability distribution transfer (MBCn) (Kirchmeier-Young et al. 2017, Cannon 2018), that, first, corrects GCM marginal distributions and the multivariate dependence structure between sites and variables to match historical observations and, second, preserves GCM-projected changes in quantiles in future periods. Specifically, Tmax, Tmin, Prec, and Rad from each GCM at each location are corrected simultaneously, using the 1971−2000 observational period for calibration. The MBCn bias correction algorithm (ver. 0.10-1, https://cran.r-project.org/package=MBC; R ver. 3.3.2, www.r-project.org) is applied over 30 yr sliding windows on concatenated historical (1950−2005) and RCP scenario (2006−2100) periods. In each window, the central decade is replaced, the window is slid forward 1 decade, etc., until the end of the projection period is reached. To ensure an unbiased seasonal cycle, adjustments are applied over data that have been pooled over 33 day-of-year sliding blocks: the central 11 d are replaced, the block is slid 11 d, etc. To ensure that corrected values of Tmax exceed Tmin on all days, MBCn is applied to the diurnal temperature range (Tmax − Tmin) and approximate mean temperature ([Tmin + Tmax]/2) variables. Outside of the 1971−2000 calibration period, changes in corrected quantiles are constrained to match those in the raw climate model simulations (i.e. the adjustments made by MBCn are change preserving on a quantile-by-quantile basis).

Stochastic weather generation
LARS-WG was used to generate the En-WG and WG scenarios. LARS-WG is a stochastic weather generator based on the series approach (Semenov & Barrow 1997, Semenov & Stratonovitch 2010), in comparison with the Richardson-type stochastic weather generators that use either first- or higher-order Markov chains (Richardson 1981). It utilizes observed daily weather data of a given site to compute a set of parameters for probability distributions of weather variables as well as correlations between them (Semenov & Stratonovitch 2010). This set of parameters is used to generate synthetic weather time series with statistical characteristics corresponding to the observed datasets. To generate future weather data, the LARS-WG parameters from historical climate data are perturbed by a scenario of climate change in terms of change factors. LARS-WG is available from https://sites.google.com/view/lars-wg/. Bias-corrected and downscaled daily Tmax, Tmin, and Prec from 20 GCMs for the baseline period (1971−2000) and 2 relevant future periods (2040−2069 and 2070−2099) under RCP4.5 and RCP8.5 at 10 locations across Canada were used to calculate change factors. For each GCM at each location in the baseline period and the 2 future periods under both emission scenarios, monthly mean Tmax and Tmin, monthly mean Prec, monthly mean duration of wet and dry spells, and SDs of daily mean temperature were calculated. Change factors of these climate statistics for the 2 future periods with respect to the

Crop simulation
The CSM-CERES-wheat model and the CSM-CROPGRO-canola model included in DSSAT ver. 4.7 were used to simulate crop yields for spring wheat and canola in this study. Crop models in DSSAT have been widely used in climate change impact studies around the world (e.g. He et al. 2018, Hussain et al. 2020, Ye et al. 2020). Furthermore, these 2 models have been calibrated and evaluated with field experimental data in Canada (Jing et al. 2016, 2017) and used to assess climate change impacts (Qian et al. 2016a, 2019a). Climate data, soil information, crop cultivar parameters, and crop management practices are required as inputs to the crop models. Soil data for each site were obtained from the Canadian Soil Information System, Soil Landscapes of Canada, ver. 3.2 (Soil Landscapes of Canada Working Group 2010). Spring wheat cultivar AC Barrie and canola cultivar InVigor 5440 calibrated in Canada by Jing et al. (2016, 2017) were used to simulate crop yields as continuous spring wheat and canola. All simulations included the direct effects of elevated atmospheric CO2 concentration. Planting date in simulations for both spring wheat and canola was May 15 for the baseline period 1971−2000 and May 8 for the future periods. A fixed planting date was used for simplicity in crop simulations, as planting date is considered a crop management practice that can vary from year to year and by location. Crops were harvested automatically at physiological maturity in all the simulations. Although N fertilizer applications are an important agronomic management practice for rainfed crop production, we simulated only the water-limited yield (Yw) of crops grown without N stress to emphasize the climate impacts. Soil texture may have significant impacts on crop growth and yield in the simulations of Yw; therefore, simulated crop yields at the selected locations can be different if other soils are used.

Quantitative evaluation
To quantitatively assess how close simulated crop yields using En-WG scenarios are to the ensemble means of simulated yields using multiple climate scenarios (WG scenarios), we compared them with the yields from all (i.e. 20 in this study) or any one of the WG scenarios using 2 statistical measures. Closeness is defined as the absolute value of the difference between the long-term mean of simulated yields using En-WG scenarios (Y_En) and ensemble mean yields (Y_m), i.e. |Y_En − Y_m|; Y_m is the average of the long-term means of simulated yields (Y_i, i = 1, ..., 20) using the 20 WG scenarios. We defined relative difference (RD) as this closeness expressed as a percentage of Y_m in Eq. (1). We used the proportion of GCMs as an estimate of the probability (p) that the WG scenario based on a randomly chosen GCM from the 20 GCMs (Y_i, i = 1, ..., 20) has a smaller difference to the ensemble mean (Y_m) than En-WG (Y_En), i.e. |Y_i − Y_m| < |Y_En − Y_m|, where N_s is the total number of GCMs with a smaller difference, in Eq. (2). Smaller values of these indicators indicate better performance of the En-WG scenario at reproducing the ensemble mean. The values of these indicators can be averaged across locations for evaluating the overall performance.

$$RD = \frac{|Y_{En} - Y_m|}{Y_m} \times 100\% \qquad (1)$$

$$p = \frac{N_s}{20} \qquad (2)$$
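The two indicators can be computed in a few lines. The sketch below assumes that RD is the closeness |Y_En − Y_m| expressed as a percentage of Y_m and that p is the share of WG scenarios lying strictly closer to Y_m than the En-WG scenario; the function name and the yield values are hypothetical.

```python
from statistics import fmean


def evaluate_en_wg(y_en, y_members):
    """Return (RD in %, p) for one location, period and RCP.

    y_en: long-term mean yield simulated with the En-WG scenario.
    y_members: long-term mean yields simulated with each WG scenario.
    """
    y_m = fmean(y_members)                # ensemble mean yield
    closeness = abs(y_en - y_m)           # |Y_En - Y_m|
    rd = 100.0 * closeness / y_m          # relative difference in percent
    # Count WG scenarios that are strictly closer to the ensemble mean.
    n_s = sum(1 for y_i in y_members if abs(y_i - y_m) < closeness)
    p = n_s / len(y_members)
    return rd, p


# Hypothetical yields (kg/ha) for a 5-member ensemble, for illustration only.
members = [2000.0, 2200.0, 2400.0, 2600.0, 2800.0]
rd, p = evaluate_en_wg(2450.0, members)
print(f"RD = {rd:.2f}%, p = {p:.2f}")
```

With these toy numbers only one member lies closer to the ensemble mean than the En-WG yield, so p is 1 out of 5. A small p therefore means that a randomly picked single-GCM run would rarely do better than the En-WG run.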

RESULTS
The long-term means of canola yields and spring wheat yields simulated using En-WG scenarios were compared with those using 20 WG scenarios for 2 future periods (2040−2069 and 2070−2099) under RCP4.5 and RCP8.5 in Figs. 2 & 3, respectively. As shown, the ranges of the simulated yields across the [...] wheat yields simulated using the 20 WG scenarios. This is true across most of the 10 locations and all 4 future scenarios. Two statistical indicators for quantitatively assessing the performance of the single En-WG scenarios in terms of reproducing the ensemble means of yields derived from multiple WG scenarios, RD and p, were calculated for each location for the future periods 2040−2069 and 2070−2099 under RCP4.5 and RCP8.5. The results are shown in Table 3 for canola and Table 4 for spring wheat. In Table 3, all values for p are smaller than 0.3, and most of them are smaller than 0.05, indicating that the single En-WG scenarios, in most cases, outperformed a randomly selected GCM for reproducing the ensemble means of yields. For Harrow in 2070−2099 under RCP8.5, the simulated canola yield using the En-WG scenario shows the largest relative difference from the ensemble mean yield, with an RD value of 11.6%. The p value is 0.20, which indicates that the En-WG scenario would still be more likely to outperform the cases using 1 GCM. In other cases, the RDs are all less than 10% and mostly less than 5%, showing small differences between the En-WG simulated canola yields and the ensemble means of canola yields derived from the 20 WG scenarios. The overall performances of the single En-WG scenarios across the 10 locations are satisfactory in the 2 future periods (2040−2069 and 2070−2099) under both RCP4.5 and RCP8.5.
In Table 4, all values for RD are less than 5%. The small RD values indicate very small differences between spring wheat yields simulated using the En-WG scenarios and the ensemble means of spring wheat yields simulated using the 20 WG scenarios. [...] indicating that the ensemble mean using the En-WG scenarios is more likely than not to outperform a randomly selected WG scenario. However, even in the 3 cases where the En-WG scenarios are less likely to outperform a randomly selected GCM, the RDs between En-WG simulated spring wheat yields and the ensemble means of yields simulated with the 20 WG scenarios are very small (2.5, 3.8, and 1.9%, respectively), reflecting good performance of the single En-WG scenarios. In fact, the ranges of the simulated yields across the 20 WG scenarios are often relatively small in these 3 cases; hence, all ensemble members are close to the ensemble mean, which results in a large p in these 3 cases.
The average values for RD across the 10 locations in 2 future periods under 2 RCPs are all less than 3%, and the average values for p are all smaller than 0.30, implying satisfactory overall performance of the En-WG scenarios for producing crop yields close to the ensemble mean yields based on the 20 WG scenarios.

Table 4. Relative difference (RD, %) and probability (p) for assessing the performance of En-WG scenarios for spring wheat yield simulations

DISCUSSION
In this study, a stochastic weather generator, LARS-WG, was used to develop future climate scenarios (i.e. WG and En-WG scenarios) from multiple GCMs under 2 forcing scenarios, RCP4.5 and RCP8.5. Stochastic weather generators have attracted attention in the past decades as a convenient tool for producing daily climate scenarios in climate change impact studies (Qian et al. 2005, Kilsby et al. 2007). Qian et al. (2011) compared simulated crop yields with observed and synthetic weather data generated by a stochastic weather generator. They found that reliable crop yield estimates could be obtained by using the stochastic weather data to drive DSSAT crop growth models at the Canadian locations in their study. However, stochastic weather generators suffer from the problem known as overdispersion, i.e. they underestimate the interannual variability of climate (Katz & Parlange 1998, Qian et al. 2004, Chen & Brissette 2014). To represent the interannual variability of climate in a 30 yr period, we calculated the SD of mean temperature (Tmean) and the coefficient of variation of accumulated precipitation (Ptotal) in the crop growing season (May 1 to Aug 31) based on 300 yr long climate data from the 20 WG scenarios. The interannual variability of Tmean and Ptotal in the growing season derived from 20 bias-corrected and downscaled GCMs (GCM scenarios) was also calculated for a comparison. Fig. 4 shows the results in 2070−2099 under RCP8.5 for 2 locations as examples, Swift Current on the Canadian Prairies and Ottawa in eastern Canada. As observed in Fig. 4, the interannual variability of Tmean and Ptotal in the crop growing season derived from GCM scenarios is notably larger than that derived from WG scenarios. The bias correction of GCMs, used previously in Qian et al. (2019a, 2020), did not change the relative year-to-year variations of climate (not shown).
However, LARS-WG does not consider potential changes in climate variability from year to year (Prudhomme et al. 2002), in addition to the common overdispersion issue for the baseline climate in stochastic weather generation, and thus underestimates interannual variability.
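The interannual variability measures used in this comparison can be sketched as follows. The code assumes that each year is supplied as a pair of daily lists (mean temperature and precipitation) already restricted to the growing season, and that the sample standard deviation is used; both choices, the function name, and the toy values are assumptions for illustration.

```python
from statistics import fmean, stdev


def interannual_variability(years):
    """Return (SD of seasonal Tmean, CV of seasonal Ptotal).

    years: list of (daily_tmean, daily_prec) pairs, one pair per year,
    each covering the growing season (e.g. May 1 to Aug 31).
    """
    seasonal_tmean = [fmean(tmean) for tmean, _ in years]   # one value per year
    seasonal_ptotal = [sum(prec) for _, prec in years]      # one value per year
    sd_tmean = stdev(seasonal_tmean)
    cv_ptotal = stdev(seasonal_ptotal) / fmean(seasonal_ptotal)
    return sd_tmean, cv_ptotal


# Toy data: 3 years with 2 "days" each, for illustration only.
toy_years = [
    ([15.0, 17.0], [0.0, 10.0]),
    ([17.0, 19.0], [5.0, 15.0]),
    ([19.0, 21.0], [10.0, 20.0]),
]
sd_t, cv_p = interannual_variability(toy_years)
print(f"SD(Tmean) = {sd_t:.2f}, CV(Ptotal) = {cv_p:.2f}")
```

Applying the same function to the WG series and to the bias-corrected GCM series for one location and period would give the kind of pairwise comparison summarized in Fig. 4.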
Due to the underestimation of the interannual variability of climate in stochastic weather generation by LARS-WG, it is very likely that the interannual variability of crop yields simulated using WG scenarios will also be underestimated. Thus, we compared the interannual variability of crop yields simulated using GCM scenarios with that simulated using WG scenarios. Here we present only the results for canola yield simulations in 2070−2099 under RCP8.5 (Fig. 5), as those from all other cases are similar. Although ensemble-mean yields derived from WG and GCM scenarios are relatively close for most locations (Figs. S1 & S2 in the Supplement at www.int-res.com/articles/suppl/c083p161_supp.pdf), the interannual variability (represented by SD) of crop yields derived from WG scenarios is always lower than that derived from GCM scenarios, which is consistent with what is seen in the WG-generated climate.
Our results support the need for corrections or improvements in stochastic weather generation algorithms, such as LARS-WG used in this study, to remedy the underestimation of interannual climate variability, as several previous studies do (Semenov et al. 1998, Mavromatis & Hansen 2001, Qian et al. 2004). Efforts have been made to reduce overdispersion in stochastic weather generators (Wang & Nathan 2007, Kim et al. 2012). For example, Kim et al. (2012) coupled a generalized linear modelling (GLM) approach into a stochastic weather generator, with seasonally aggregated climate statistics as additional covariates to the GLM-based weather generator, through which the overdispersion phenomenon was effectively reduced. Similar measures should be incorporated into other stochastic weather generators, especially for studies in which the interannual variability is important. However, implementation of such measures in generating future climate scenarios is still a substantial challenge. The goal of our study is to generate a single En-WG scenario to drive the crop models in DSSAT and evaluate its performance in reproducing ensemble means of crop yields derived from the WG scenarios of a multi-model ensemble. Our results also show that the SDs of simulated yields using En-WG scenarios match well to the ensemble means of the SDs of yields simulated using WG scenarios. Improving the overdispersion issue in stochastic weather generation is beyond the scope of this study. To the best of our knowledge, our study is the first assessment on the potential of using a single ensemble-mean climate scenario for crop yield projections in Canada, and there are no published studies available for reference or comparison.

[Figure caption fragment] Boxplots show the 10th, 25th, 50th, 75th, and 90th percentiles of SDs/CVs across 20 GCM/WG scenarios. Red dashes represent the means of SDs/CVs derived from 20 GCM/WG scenarios
The En-WG scenarios are developed for effectively simulating the ensemble means; thus, they are not expected to produce information on the spread in crop yield projections associated with the uncertainty in climate projections in a multi-model ensemble. As shown in Figs. 2 & 3, the spread in projected crop yields across the 20 WG scenarios is large at most locations, implying considerable uncertainty in crop yield projections. In addition to the ensemble means of crop yields as the best estimates, the ranges of crop yields (uncertainty) will, in many contexts, also be required by stakeholders for decision making. Considering the large amount of time needed to run process-based crop models driven by a large number of climate scenarios, an algorithm for selecting a limited number of climate scenarios from a multi-model ensemble will be useful in crop yield projections to account for uncertainty. One of the possible options would be to select a small number of GCMs based on their climatic sensitivity representative to the full CMIP5 ensemble (Semenov & Stratonovitch 2015) using a procedure such as the Katsavounidis-Kuo-Zhang algorithm (Katsavounidis et al. 1994), which can be applied for recursively selecting members that best span the spread of an ensemble (Cannon 2015). Using estimates of the ranges based on a small subset of GCMs in combination with the ensemble means derived using the approach proposed in this study could effectively meet the demand of end users. However, further studies are required to evaluate the effectiveness of such selection methods on covering the ranges of full ensembles for crop yield projections.
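A recursive selection of this kind can be sketched as below. The sketch assumes one common variant of the Katsavounidis-Kuo-Zhang procedure: the first member is the one closest to the ensemble centroid, and each further member is the one whose minimum Euclidean distance to the already selected members is largest. The initialization rule, the use of already standardized change signals as coordinates, and the toy points are assumptions and were not used in this study.

```python
import math


def kkz_select(points, k):
    """Return indices of k members that span the spread of the ensemble.

    points: list of equal-length tuples, one per GCM (e.g. standardized
    changes in temperature and precipitation).
    """
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]

    # Step 1 (assumed variant): start from the member nearest the centroid.
    first = min(range(n), key=lambda i: math.dist(points[i], centroid))
    selected = [first]

    # Step 2: repeatedly add the member farthest from its nearest
    # already-selected neighbour.
    while len(selected) < min(k, n):
        remaining = [i for i in range(n) if i not in selected]
        next_index = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[j]) for j in selected),
        )
        selected.append(next_index)
    return selected


# Toy 2-D change signals for 5 hypothetical GCMs.
toy_points = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 3.0), (0.1, 0.1)]
subset = kkz_select(toy_points, 3)
print(subset)
```

The order of the returned indices matters: truncating the list gives the best-spanning subset of any smaller size, which makes it easy to trade simulation cost against coverage of the ensemble range.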
Nevertheless, more studies may be needed to better understand the potential of the proposed approach for using ensemble-mean climate scenarios in climate change impact studies with multi-GCM ensembles and to explore whether its results could be reproduced well for different crop types, crop models, weather generators, and GCM ensembles, although 2 different spring crops and 2 structurally different crop models were used in this study.

CONCLUSIONS
In this study, we developed single ensemble-mean climate scenarios (En-WG) for 2 future periods (2040−2069 and 2070−2099) under 2 RCPs (RCP4.5 and RCP8.5) at 10 locations across Canada using LARS-WG based on the ensemble means of climate change factors estimated from 20 CMIP5 GCMs. We compared simulated crop yields using En-WG scenarios with the ensemble means of simulated yields using 20 climate scenarios (WG scenarios) generated by LARS-WG based on 20 individual GCMs. We introduced 2 statistical measures: the relative difference (RD) between the En-WG yield and the ensemble mean yield, and the probability (p) of a randomly chosen GCM outperforming the En-WG scenario for reproducing the ensemble mean estimate. The simulated crop yields using the En-WG scenarios were close to the ensemble means of the simulated crop yields using the 20 WG scenarios. Moreover, using single En-WG scenarios in crop yield projections often outperformed using individual WG scenarios. However, climate scenarios generated by LARS-WG usually resulted in the underestimation of interannual variability in the simulated crop yields, which is common for many weather generators. The En-WG scenarios have the potential of efficiently estimating the ensemble means of future crop yield projections in a multi-GCM ensemble when a stochastic approach is applied to each of the individual GCMs, in terms of much less time (only 5% of a 20 GCM ensemble) for running simulations and a reasonable accuracy (approximately a 2% error on average).