Probabilistic Forecasts of Wind Speed: Ensemble Model Output Statistics using Heteroskedastic Censored Regression
Thordis L. Thorarinsdottir 1,2 and Tilmann Gneiting 1

1 University of Washington, Seattle, Washington, USA
2 University of Aarhus, Aarhus, Denmark

Technical Report no. 546
Department of Statistics, University of Washington
December 2008

Abstract

As wind energy penetration continues to grow, there is a critical need for probabilistic forecasts of wind resources. In addition, there are many other societally relevant uses for forecasts of wind speed, ranging from aviation to ship routing and recreational boating. Over the past two decades, ensembles of numerical weather prediction (NWP) models have been developed, in which multiple estimates of the current state of the atmosphere are used to generate a collection of deterministic forecasts. However, even state-of-the-art ensemble systems are uncalibrated and biased. Here we propose a novel way of statistically post-processing NWP ensembles for wind speed using heteroskedastic censored (Tobit) regression, where location and spread derive from the ensemble forecast. The resulting ensemble model output statistics (EMOS) method is applied to 48-hour ahead forecasts of maximum wind speed over the North American Pacific Northwest in 2003 using the University of Washington Mesoscale Ensemble. The statistically post-processed EMOS density forecasts turn out to be calibrated and sharp, and result in substantial improvement over the unprocessed NWP ensemble or climatological reference forecasts.

Key words and phrases: Continuous ranked probability score; Density forecast; Ensemble system; Numerical weather prediction; Heteroskedastic censored regression; Tobit model; Wind energy.
1 Introduction

Accurate forecasts of wind speed are of critical importance in many applications of societal relevance, ranging from severe weather warnings for the general public to risk assessment and decision making in aviation, ship routing, recreational boating, and agriculture. Wind storms often lead to power failures and can cause extensive damage as well as a threat to human life. For example, during the December 14-15, 2006 Hanukkah Eve Wind Storm over the Pacific Northwest, including the northwestern United States and southwestern Canada, more than 1.3 million customers lost power for up to a week, at least 13 individuals lost their lives, and estimates of damage reached a billion dollars (Mass 2008). Figure 1 shows a scene at the second author's home in the aftermath of the storm.

Principled risk management depends on probabilistic forecasts that take the form of predictive probability distributions for future quantities or events (National Research Council 2006; Gneiting 2008a). Farmers might be interested in the chance of the wind being calm enough for them to spray pesticides, while recreational boaters may want to know how likely it is that the wind speed will be substantial. Arguably, the most pronounced need for probabilistic forecasts of wind resources stems from the global proliferation of wind energy, whose installed capacity increased by 27% in 2007 alone (Global Wind Energy Council 2008). Wind power provides an attractive, emissions-free alternative to fossil fuels. However, it is an intermittent source of energy, and its continued spread hinges on the ability to reliably predict wind speed (Genton and Hering 2007). From an economic perspective, under- and over-prediction of wind power result in heavy financial penalties in deregulated energy markets.
The optimal point forecast depends on current, rapidly changing market features and typically is a quantile of the predictive distribution (Roulston, Kaplan, Hardenberg, and Smith 2003; Pinson, Chevallier, and Kariniotakis 2007). More generally, access to the full predictive distribution provides users with the ability to tailor a point forecast or decision to the loss structure at hand (Diebold, Gunther, and Tay 1998; Gneiting 2008b).

Here we are concerned with probabilistic forecasts of surface wind speed at a prediction horizon of 48 hours. Short-range weather forecasts at prediction horizons of only a few hours are typically done by purely statistical approaches, using time series models or neural networks (Brown, Katz, and Murphy 1984; Kretzschmar, Eckert, Cattani, and Eggimann 2004; Gneiting, Larson, Westrick, Genton, and Aldrich 2006; Costa, Crespo, Navarro, Lizcano, Madsen, and Feitosa 2008). In the medium range, at prediction horizons of one to ten days, forecasts based on numerical weather prediction (NWP) models outperform purely statistical forecasts (Campbell and Diebold 2005). However, NWP forecasts are deterministic and do not account for the uncertainty that arises from incomplete initial estimates of atmospheric conditions, or imperfections and discretisation in the numerical model. To take these sources of uncertainty into account, a commonly used approach is to employ an ensemble of NWP forecasts, where the ensemble members differ from each other by the initial conditions and/or the numerical model being used (Palmer 2002; Gneiting and Raftery 2005). For almost all operational ensemble systems, a positive association between the forecast error and the spread in the forecast ensemble has been established. However, even state-of-the-art ensembles are uncalibrated and subject to systematic bias, in addition to being limited by the size of the ensemble, which typically comprises five to 50 members.

Figure 1: Tree damage to a home on View Ridge in Seattle, Washington after the Hanukkah Eve Wind Storm of December 2006.

In view of these limitations, ensemble forecasts call for some form of statistical post-processing before a predictive distribution is passed on to the user. For wind speed and wind power, the most common approach is to use quantile regression (Bremnes 2004; Nielsen, Madsen, and Nielsen 2006; Møller, Nielsen, and Madsen 2008). This method yields quantile and interval forecasts, but not a full predictive distribution.

Here we propose a simple post-processing technique that uses heteroskedastic censored regression to obtain predictive distributions. This builds on the ensemble model output statistics (EMOS) approach of Gneiting, Raftery, Westveld, and Goldman (2005), who employ Gaussian predictive distributions for surface temperature, with a mean that is linear in the ensemble member forecasts, and a variance that is an affine function of the ensemble variance. This approach is simple and powerful (Wilks 2006a; Wilks and Hamill 2007), but does not apply directly to non-negative weather quantities, such as wind speed. To address the non-negativity of the predictand, we adapt the EMOS technique and employ truncated normal predictive distributions with a cut-off at zero. This is akin to the heteroskedastic censored (Tobit) regression model (Tobin 1958; Chen and Khan 2000) and yields predictive distributions that condition on the NWP ensemble, while correcting for biases and dispersion errors.
To give an example, Figure 2 shows a 48-hour ahead EMOS density forecast of maximum wind speed valid June 14, 2003 at The Dalles, Oregon in the North American Pacific Northwest, using the eight-member University of Washington Mesoscale Ensemble (Eckel and Mass 2005). The EMOS density forecast, which is estimated on a 40-day rolling training period, corrects for the low bias and under-dispersion in the NWP ensemble.

Figure 2 (axis: wind speed in knots): 48-hour ahead Local EMOS density forecast of maximum wind speed valid June 14, 2003 at The Dalles, Oregon. The black lines represent the eight members of the University of Washington Mesoscale Ensemble. The red lines border the 77.8% central prediction interval for the EMOS density forecast, which is shown in grey. The broken red line represents the EMOS median forecast, and the blue line the verifying observation, at 18 knots.

The remainder of the paper is organized as follows. Section 2 introduces the University of Washington Mesoscale Ensemble (UWME) and describes the EMOS post-processing technique. In Section 3 we apply the EMOS technique to create daily 48-hour ahead forecasts of surface wind speed over the Pacific Northwest in the calendar year 2003, based on the UWME system. In these experiments, the EMOS density forecasts turn out to be calibrated and sharp and compare favorably to reference forecasts. The paper closes with a discussion of potential extensions and future challenges in Section 4.

2 Data and methods

2.1 Forecast and observation data

We consider 48-hour ahead forecasts of maximum wind speed in the Pacific Northwest in the period from 1 November 2002 through 31 December 2003, using the eight-member University of Washington Mesoscale Ensemble (UWME) system (Eckel and Mass 2005). The ensemble member forecasts rely on initial conditions supplied by eight different operational forecast centers that drive the fifth-generation Pennsylvania State University-National Center for Atmospheric Research (PSU-NCAR) Mesoscale Model (MM5) (Grell, Dudhia, and Stauffer 1995). The MM5 model runs on a 12-kilometer grid over the Pacific Northwest which, in general, does not match the observation locations.
Forecasts at observation sites are thus created by bilinear interpolation from the model grid, as is common practice in the meteorological community. More sophisticated interpolation schemes are unlikely to do any better and do not justify the extra computational effort (Shao, Stein, and Ching 2007; Jun, Knutti, and Nychka 2008).

Figure 3: Surface airway observation (SAO) stations at airports in the Pacific Northwest, including the Canadian provinces of British Columbia (BC) and Alberta (AB), and the US states of Washington (WA), Oregon (OR), Idaho (ID), California (CA) and Nevada (NV). The arrows indicate the View Ridge area in Seattle, Washington (see Figure 1) and the city of The Dalles, Oregon (see Figure 10).

Our database contains the eight ensemble member forecasts and verifying observations of maximum wind speed at 107 surface airway observation (SAO) stations in the United States and Canada, as illustrated in Figure 3. Maximum wind speed is defined as the maximum of the hourly instantaneous wind speed 10 meters above ground over the previous eighteen hours, where an hourly instantaneous wind speed is a 2-minute average taken over the two minutes preceding the hour. Wind speed observations are rounded to the nearest whole knot when recorded, except that wind speeds below one knot are recorded as zero. One knot is equal to approximately 0.514 meters per second, or 1.151 miles per hour.

For the calendar year 2003, data are available for 291 days, for a total of 29,542 individual forecast cases at 107 meteorological stations in the Pacific Northwest. Only 43 of these observations are at one knot or lower. The cases from 2002 were used for training purposes only. All data were subject to the quality control procedures described by Baars (2005).
2.2 Ensemble model output statistics using heteroskedastic censored regression

Our goal here is to adapt the ensemble model output statistics (EMOS) post-processing approach of Gneiting et al. (2005) to non-negative weather variables, such as wind speed. The name stems from the term model output statistics (MOS), which is used by atmospheric scientists to refer to regression approaches that use output from NWP models as predictor variables (Glahn and Lowry 1972; Wilks 2006b). Traditionally, MOS techniques have been used for point or probability of precipitation forecasts from a single NWP model, or for point forecasts from ensembles of seasonal weather or climate models (Krishnamurti et al. 1999; Kharin and Zwiers 2002).

Specifically, let $X_1, \ldots, X_k$ denote an ensemble of individually distinguishable forecasts for a univariate continuous quantity $Y$ that takes values on $\mathbb{R}^+$, the non-negative real axis. Here we think of $Y$ as wind speed, but the method applies more generally. To address the non-negativity of the predictand, we follow Gneiting et al. (2006) and employ a truncated normal predictive distribution with a cut-off at zero, namely

$$\mathcal{N}^0(\mu, \sigma^2) \quad \text{where} \quad \mu = a + b_1 X_1 + \cdots + b_k X_k \quad \text{and} \quad \sigma^2 = c + d S^2. \qquad (1)$$

The location parameter, $\mu$, is a linear function of the ensemble member forecasts, and the spread parameter, $\sigma^2$, is an affine function of the ensemble variance,

$$S^2 = \frac{1}{k} \sum_{i=1}^{k} (X_i - \bar{X})^2, \quad \text{where} \quad \bar{X} = \frac{1}{k} \sum_{i=1}^{k} X_i.$$

The EMOS predictive density for the future weather quantity $Y$ thus becomes

$$f(y) = \left[ \frac{1}{\sigma} \, \varphi\!\left( \frac{y - \mu}{\sigma} \right) \right] \Big/ \, \Phi\!\left( \frac{\mu}{\sigma} \right)$$

for $y > 0$ and $f(y) = 0$ otherwise, where $\varphi$ and $\Phi$ denote the standard normal density function and standard normal cumulative distribution function, respectively. To ensure that (1) specifies a valid probability distribution, the spread parameters $c$ and $d$ need to be non-negative.
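The predictive model (1) and its density can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `emos_parameters` and `truncnorm_pdf` are hypothetical, the coefficient values in any call are made up by the caller, and only the standard library is used.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def emos_parameters(ensemble, a, b, c, d):
    """Location mu and spread sigma^2 of model (1) for one forecast case.

    ensemble: the k member forecasts X_1..X_k
    a, b:     intercept and the k regression coefficients b_1..b_k
    c, d:     non-negative variance parameters
    """
    k = len(ensemble)
    xbar = sum(ensemble) / k
    s2 = sum((x - xbar) ** 2 for x in ensemble) / k  # ensemble variance S^2 (divisor k)
    mu = a + sum(bi * xi for bi, xi in zip(b, ensemble))
    sigma2 = c + d * s2
    return mu, sigma2


def truncnorm_pdf(y, mu, sigma):
    """Density of the normal distribution truncated at zero; zero for y <= 0."""
    if y <= 0:
        return 0.0
    return phi((y - mu) / sigma) / (sigma * Phi(mu / sigma))
```

For an eight-member ensemble, `b` would hold eight coefficients; the density is evaluated with `sigma = math.sqrt(sigma2)`, which requires `sigma2 > 0`.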
We furthermore constrain the regression coefficients $b_1, \ldots, b_k$ to be non-negative; this does not deteriorate predictive performance, while enhancing interpretability and stabilizing the estimates (Gneiting et al. 2005). To incorporate these constraints into the EMOS model we write

$$b_1 = \beta_1^2, \; \ldots, \; b_k = \beta_k^2, \quad c = \gamma^2, \quad d = \delta^2, \qquad (2)$$

where $\beta_1, \ldots, \beta_k$, $\gamma$ and $\delta$ are unconstrained real parameters. Thus, the EMOS density forecast is the fitted truncated normal distribution (1) with the constraints in (2).

The model parameters allow for direct interpretation, with the intercept $a$ being a bias correction term, and the regression coefficients $b_1, \ldots, b_k$ reflecting the overall contributions of the ensemble members to the predictive skill over a training period. The variance parameters $c$ and $d$ can be interpreted in terms of the relationship between ensemble spread and forecast skill (Whitaker and Loughe 1998). All else being equal, larger values of the parameter $d$ indicate a more pronounced spread-skill correlation. If spread and forecast skill are independent of each other, the parameter $d$ will be negligibly small. More general and more flexible parameterisations of the variance term in (1) are feasible, but seem unlikely to result in improved predictive performance.

The EMOS approach fits the general framework of heteroskedastic regression (Leslie, Kohn, and Nott 2007) and can be interpreted as heteroskedastic censored (Tobit) regression (Tobin 1958; Chib 1992; Chen and Khan 2000). Previous uses of truncated normal distributions for weather variables include applications to quantitative precipitation (Sansò and Guenni 1999; Allcroft and Glasbey 2003) and wind speed (Gneiting et al. 2006).

2.3 Estimation

The goal in probabilistic forecasting is to maximize the sharpness of the predictive distributions subject to calibration (Gneiting, Balabdaoui, and Raftery 2007; Pal 2009). Calibration refers to the statistical consistency between the predictive distributions and the observations. This goal should thus be reflected in the choice of the optimization method for the parameter estimation. One way to achieve this goal is to estimate the parameters by optimizing a proper scoring rule as a function of the parameter values (here, the real parameters $a$, $\beta_1, \ldots, \beta_k$, $\gamma$ and $\delta$) on training data. Gneiting et al. (2005) and Gneiting and Raftery (2007) refer to this general approach as optimum score estimation. The propriety of the scoring rule ensures that both calibration and sharpness are addressed.

The most popular scoring rules for density forecasts are the logarithmic score (Good 1952) and the continuous ranked probability score (Matheson and Winkler 1976; Gneiting and Raftery 2007). Both rules are proper and negatively oriented, that is, the smaller the better. The logarithmic score is simply the negative of the logarithm of the predictive density evaluated at the observation.
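To make the logarithmic score concrete for the truncated normal density, a minimal sketch follows. The function names are hypothetical. Returning infinity for an observation of zero is simply the consequence of evaluating a density that vanishes at and below zero, and is noted here as an observation about the formula rather than as a statement about how such cases were handled in the study.

```python
import math


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_score_truncnorm(mu, sigma, y):
    """Negative log of the zero-truncated normal density at the observation y."""
    if y <= 0:
        return math.inf  # the density is zero at and below the cut-off
    z = (y - mu) / sigma
    log_density = (-0.5 * z * z
                   - 0.5 * math.log(2.0 * math.pi)
                   - math.log(sigma)
                   - math.log(Phi(mu / sigma)))
    return -log_density


def mean_log_score(cases):
    """Average logarithmic score over (mu, sigma, y) triples.

    Minimizing this average over the model parameters is the same as
    maximizing the log-likelihood of the training data.
    """
    return sum(log_score_truncnorm(mu, s, y) for mu, s, y in cases) / len(cases)
```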
Thus, optimum score estimation based on the logarithmic score is simply maximum likelihood estimation. The continuous ranked probability score is defined as

$$\mathrm{crps}(F, y) = \int_{-\infty}^{\infty} \left( F(x) - \mathbb{1}\{x \geq y\} \right)^2 \, dx,$$

where $F$ is the predictive cumulative distribution function and $y$ is the verifying observation. For a truncated normal predictive distribution and an observation $y \geq 0$ we get

$$\mathrm{crps}\left( \mathcal{N}^0(\mu, \sigma^2), y \right) = \sigma \, \Phi\!\left( \frac{\mu}{\sigma} \right)^{-2} \left[ \frac{y - \mu}{\sigma} \, \Phi\!\left( \frac{\mu}{\sigma} \right) \left\{ 2 \Phi\!\left( \frac{y - \mu}{\sigma} \right) + \Phi\!\left( \frac{\mu}{\sigma} \right) - 2 \right\} + 2 \varphi\!\left( \frac{y - \mu}{\sigma} \right) \Phi\!\left( \frac{\mu}{\sigma} \right) - \frac{1}{\sqrt{\pi}} \, \Phi\!\left( \sqrt{2} \, \frac{\mu}{\sigma} \right) \right].$$

For a point forecast or Dirac measure, the continuous ranked probability score reduces to the absolute error (Grimit, Gneiting, Berrocal, and Johnson 2006).

Following Gneiting et al. (2005), we employ optimum score estimation based on the continuous ranked probability score, which is a more robust choice than maximum likelihood estimation (Gneiting and Raftery 2007). In other words, we find the values of $a$, $\beta_1, \ldots, \beta_k$, $\gamma$ and $\delta$ that minimize

$$\frac{1}{n} \sum_{j=1}^{n} \mathrm{crps}\left( \mathcal{N}^0\!\left( a + \beta_1^2 X_{j1} + \cdots + \beta_k^2 X_{jk}, \; \gamma^2 + \delta^2 S_j^2 \right), \, Y_j \right),$$

where the sum extends over the forecast cases in the training set. The method relies on numerical optimization, which is done with the Broyden-Fletcher-Goldfarb-Shanno algorithm (Bertsekas 1999, Section 1.7) as implemented in R. A critical question remains, namely that of an appropriate choice of the training set. This will be addressed in the following section.

2.4 Choice of training data

In real-time forecasting, a popular choice for the training set is that of a rolling training period. Hence, on any given day, we use training data from the most recent m days available. Two decisions are to be made here. One is the choice of the length m of the rolling training period. Clearly, there is a trade-off in doing this. Shorter training periods adapt rapidly to seasonally varying model biases, changes in the performance of the ensemble member models, and changes in environmental conditions. Longer training periods, on the other hand, reduce the statistical variability in the estimation.

Another important decision is the choice of the geographical composition of the training set. We distinguish two different methods for doing this, to which we refer as the Local EMOS and the Regional EMOS technique, respectively. The Regional EMOS technique uses training data from all 107 stations to estimate a single set of parameters across the Pacific Northwest, which is then used to create EMOS forecasts at all stations. Nott, Dunsmuir, Kohn, and Woodcock (2001) and Gneiting, Stanberry, Grimit, Held, and Johnson (2008) noted that localized statistical post-processing can address locally varying biases and dispersion errors in NWP models and ensemble systems.
The Local EMOS technique thus restricts the training set to the station at hand, and obtains a separate set of parameter estimates at each station.

For the Regional EMOS method, we use a rolling training period that consists of the m = 20 most recent available days. This is a subjective choice, and turns out not to be critical, since the method is highly robust against changes in the length of the training period. In experiments with training periods of m = 10, 15, ..., 50 days the out-of-sample domain-wide mean absolute error (MAE) and mean continuous ranked probability score (CRPS) in 2003 differed by less than 0.5%.

For the Local EMOS technique, we follow recommendations in the extant literature (Wilson, Beauregard, Raftery, and Verret 2007) and use a rolling training period of length m = 40 days. Not unexpectedly, the choice of m matters more than for the Regional EMOS method, because the training set contains one case per day only. Figure 4 shows the out-of-sample MAE and CRPS for the 29 stations in the US state of Washington for training periods of m = 18, 20, ..., 80 days. To facilitate interpretation, the values are normalized station by station, as the ratio of the value at hand and the respective mean over the candidate periods. While there is considerable variability in the empirically optimal value of m, the ratio mostly remains between 0.95 and [...]. Training periods of 35 to 70 days generally seem adequate.

Figure 4: Out-of-sample performance measures as a function of the length of the rolling training period for Local EMOS forecasts at the 29 SAO stations in Washington State, for a test period ranging from March through December 2003. (a) Relative mean absolute error (MAE) of the EMOS median forecast. (b) Relative mean continuous ranked probability score (CRPS). For each station, values are normalized in terms of the ratio of the value at hand and the mean over the candidate periods. The station-specific best choice is indicated by a black dot.

3 Results

In this case study, we use the eight-member University of Washington Mesoscale Ensemble (UWME) (Eckel and Mass 2005) to create daily 48-hour ahead probabilistic forecasts of surface wind speed at 107 meteorological stations in the Pacific Northwest in 2003. As noted above, the Regional EMOS technique employs a rolling training period consisting of the 20 most recent available days. The Local EMOS method uses a 40-day rolling training period. For calendar year 2003, data are available for 291 days, for a total of 29,542 individual forecast cases, as described in Section 2.1.
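The estimation setup of Sections 2.3 and 2.4 can be sketched as follows: select the cases of a rolling training period, then evaluate the mean CRPS of the truncated normal model as a function of the unconstrained parameters. This is a minimal sketch, not the implementation used in the study. The function names, the tuple layout of a case, and the rule that a day counts as available once it is no later than `issue_day` are all assumptions made for illustration.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def crps_truncnorm(mu, sigma, y):
    """Closed-form CRPS of the zero-truncated normal for an observation y >= 0."""
    z = (y - mu) / sigma
    p = Phi(mu / sigma)
    bracket = (z * p * (2.0 * Phi(z) + p - 2.0)
               + 2.0 * phi(z) * p
               - Phi(math.sqrt(2.0) * mu / sigma) / math.sqrt(math.pi))
    return sigma * bracket / (p * p)


def select_training_cases(cases, issue_day, m, station=None):
    """Cases from the m most recent available days up to and including issue_day.

    Each case is a tuple (day, station, ensemble, observation); day can be any
    sortable value, for example a datetime.date. station=None pools all
    stations (Regional); a station identifier restricts to that site (Local).
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    pool = [c for c in cases
            if c[0] <= issue_day and (station is None or c[1] == station)]
    recent_days = set(sorted({c[0] for c in pool})[-m:])
    return [c for c in pool if c[0] in recent_days]


def mean_crps(params, training_cases):
    """Mean CRPS as a function of params = (a, beta_1..beta_k, gamma, delta)."""
    a, *betas, gamma, delta = params
    total = 0.0
    for _day, _station, ens, y in training_cases:
        k = len(ens)
        xbar = sum(ens) / k
        s2 = sum((x - xbar) ** 2 for x in ens) / k
        mu = a + sum(b * b * x for b, x in zip(betas, ens))  # b_i = beta_i^2
        sigma = math.sqrt(gamma ** 2 + delta ** 2 * s2)      # c = gamma^2, d = delta^2
        total += crps_truncnorm(mu, sigma, y)
    return total / len(training_cases)
```

Any quasi-Newton optimizer can then be applied to `mean_crps`; in Python, `scipy.optimize.minimize` with `method="BFGS"` would be one option. The objective is undefined when `gamma` and `delta` are both zero, so starting values should avoid that point.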
Figure 5: Regional EMOS parameter estimates for the predictive model (1) over the Pacific Northwest. The daily index ranges over the 290 days for which UWME forecasts are available in 2003. (a) Fitted intercept a. (b) The panel indicates the UWME member with the largest coefficient b_i, with identifications as in Table 1. (c) Fitted variance parameter c. (d) Fitted variance parameter d.

3.1 Fitted predictive model

The Regional EMOS technique uses data from all 107 stations to estimate a single set of parameters across the Pacific Northwest from a 20-day rolling training period. Figure 5 shows how the parameter estimates for the predictive model (1) evolve over calendar year 2003. The fitted intercept, a, ranges from −3 to 3 knots with the higher values occurring in the warm season. Most of the time, the first or the eighth ensemble member receives the largest regression coefficient b_i. As Table 1 shows, these members are driven by initial conditions supplied by the Aviation Model (AVN), run by the US National Centers for Environmental Prediction (NCEP), and by the Unified Model run by the UK Met Office (UKMO), which are generally considered the best sources. The fitted variance parameter c ranges from 10 to 30 with the higher values occurring in the cold season, and the fitted variance parameter d is mostly positive.

The Local EMOS technique restricts the training set to the station at hand, and obtains a separate set of parameter estimates for each station. Figure 6 shows the fit at the station in The Dalles, Oregon, one of the windiest places in the Pacific Northwest. The estimates are generally less stable than the Regional EMOS estimates, which is unsurprising in view of the smaller, local training set. The variance parameter d is frequently estimated very near zero.
One such instance is shown in Table 1, which gives details for the Local EMOS forecast at The Dalles, Oregon, valid June 14, 2003. The eight UWME ensemble member forecasts range from [...] to [...]. The Local EMOS method fits an intercept of 3.65 and assigns the highest coefficients to the CMCG, ETA and GASP members. It corrects for the low bias in the numerical model and adjusts the ensemble spread, to give a realistic estimate of the forecast uncertainty. The Local EMOS median forecast is [...] knots, and the verifying observation is 18 knots. A graphical illustration is given in Figure 2.
Figure 6: Local EMOS parameter estimates for the predictive model (1) at The Dalles, Oregon. The daily index ranges over the 274 days for which UWME forecasts are available in 2003 at this site. (a) Fitted intercept a. (b) The panel indicates the UWME member with the largest coefficient b_i, with identifications as in Table 1. (c) Fitted variance parameter c. (d) Fitted variance parameter d.

Table 1: Local EMOS forecast at The Dalles, Oregon, valid June 14, 2003. The first row shows the parameter estimates for the predictive model (1). The second row shows the UWME member forecasts in knots. Individual members are identified by the acronyms used by Eckel and Mass (2005). The EMOS median forecast is [...] knots, and the verifying wind speed is 18 knots. See Figure 2 for a graphical illustration.

Parameter: a, b_1, b_2, b_3, b_4, b_5, b_6, b_7, b_8, c, d
Source (for b_1 to b_8): AVN, CMCG, ETA, GASP, JMA, NGPS, TCWB, UKMO
Estimate: [...]
Forecast: [...]

3.2 Predictive performance over the Pacific Northwest

We now assess the out-of-sample predictive performance of the EMOS technique relative to the unprocessed ensemble forecast, as well as persistence and climatological reference forecasts. The persistence forecast is a naive point forecast: it predicts future wind speeds by the most recent available observed wind speed at the given site. The regional climatological forecast is the predictive distribution obtained from the empirical distribution of the wind observations when aggregated over the Pacific Northwest and the calendar year 2003. In the local climatological forecast, a site-specific predictive distribution is created for each site based on the observations at that site from the calendar year 2003. Strictly speaking, this is not a forecast, because both past and future observations are used. However, it provides an often used reference standard.
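The two kinds of reference forecast can be sketched briefly. The persistence forecast is a single number, so its CRPS is the absolute error. For a climatological forecast given by an empirical distribution, the sketch below uses the standard identity crps(F, y) = E|X − y| − ½ E|X − X'|, where X and X' are independent draws from F; the identity comes from the scoring rule literature rather than from this section. All function names are hypothetical, and the quadratic-time double sum is chosen for clarity, not for efficiency on tens of thousands of observations.

```python
from statistics import median


def persistence_forecast(past_observations):
    """Point forecast: the most recent available observation at the site."""
    return past_observations[-1]


def crps_empirical(sample, y):
    """CRPS of the empirical distribution of `sample` at the observation y."""
    n = len(sample)
    mean_abs_error = sum(abs(x - y) for x in sample) / n
    mean_abs_spread = sum(abs(xi - xj) for xi in sample for xj in sample) / (n * n)
    return mean_abs_error - 0.5 * mean_abs_spread


def climatological_median(sample):
    """Median of the climatological sample, used as its point forecast."""
    return median(sample)
```

With a one-element sample the spread term vanishes and `crps_empirical` reduces to the absolute error, which is the point-forecast case mentioned in Section 2.3.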
As noted, the goal in probabilistic forecasting is to maximize the sharpness of the predictive distributions subject to calibration (Gneiting, Balabdaoui, and Raftery 2007). Calibration refers to the statistical consistency between the predictive distributions and the observations. Sharpness refers to the concentration of the predictive distributions: the more concentrated, the sharper, and the sharper, the better, subject to calibration. In addition to calibration and sharpness checks, we report summary measures of predictive performance, such as the mean continuous ranked probability score (CRPS) and the mean absolute error (MAE). For a probabilistic forecast, we report the MAE for the optimal point forecast under the linear loss function, namely the median of the respective predictive distribution (Gneiting 2008b).

Table 2: Mean continuous ranked probability score (CRPS) and mean absolute error (MAE), and coverage and average width of the 77.8% central prediction interval for probabilistic forecasts of wind speed over the Pacific Northwest in 2003. Coverage in percent, all other values in knots. The MAE refers to the point forecast given by the median of the respective predictive distribution.

Columns: Forecast, CRPS, MAE, Coverage, Width
Rows: Persistence (4.49), Regional Climatology, Local Climatology, UWME, Regional EMOS, Local EMOS

Table 2 shows the CRPS and MAE for the various types of forecasts. All results are spatially and temporally aggregated, over the Pacific Northwest and calendar year 2003. Both the Regional EMOS and the Local EMOS technique show better predictive performance than the unprocessed UWME forecast, persistence, or the climatology forecasts. The Local EMOS technique clearly performs the best.

To assess calibration, we consider Figure 7, which shows the verification rank histogram for the UWME forecast and probability integral transform (PIT) histograms for the EMOS density forecasts. The verification rank histogram plots the rank of each observed wind speed relative to the eight ensemble member forecasts (Anderson 1996; Hamill and Colucci 1997; Talagrand, Vautard, and Strauss 1997).
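The calibration diagnostics behind Figure 7 and the interval columns of Table 2 can be sketched as follows. The PIT is the predictive cumulative distribution function evaluated at the observation, and for an ensemble of k members the matching central interval runs between the 1/(k+1) and k/(k+1) quantiles. The function names are hypothetical, and the tie rule in `verification_rank` (an observation equal to a member forecast receives the lower rank) is an assumption; with observations rounded to whole knots another rule, such as random tie-breaking, may be preferable.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, provides cdf and inv_cdf


def verification_rank(ensemble, observation):
    """Rank of the observation within the pooled set of k members plus itself (1..k+1)."""
    return 1 + sum(1 for x in ensemble if x < observation)


def truncnorm_cdf(y, mu, sigma):
    """CDF of the normal truncated at zero; evaluated at the observation it is the PIT."""
    if y <= 0:
        return 0.0
    p = _STD.cdf(mu / sigma)
    return (_STD.cdf((y - mu) / sigma) - (1.0 - p)) / p


def truncnorm_quantile(prob, mu, sigma):
    """Quantile of the zero-truncated normal for 0 < prob < 1."""
    p = _STD.cdf(mu / sigma)
    return mu + sigma * _STD.inv_cdf(1.0 - (1.0 - prob) * p)


def central_interval(mu, sigma, k=8):
    """Central prediction interval with nominal coverage (k - 1) / (k + 1)."""
    lower = truncnorm_quantile(1.0 / (k + 1), mu, sigma)
    upper = truncnorm_quantile(k / (k + 1), mu, sigma)
    return lower, upper
```

The median forecast is `truncnorm_quantile(0.5, mu, sigma)`. Empirical coverage is then the fraction of cases whose observation falls inside `central_interval`, and sharpness is summarized by the average of `upper - lower`.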
If the ensemble members and the observation are exchangeable, all possible ranks are equally likely, and the histogram is uniform. The PIT is the value that the predictive cumulative distribution function attains at the observation and is a continuous analog of the verification rank (Dawid 1984; Diebold et al. 1998; Gneiting et al. 2007). Again, deviations from uniformity indicate a lack of calibration.

Figure 7: Calibration checks for probabilistic forecasts of wind speed over the Pacific Northwest in 2003. (a) Verification rank histogram for the UWME forecast. (b) PIT histogram for the Regional EMOS technique. (c) PIT histogram for the Local EMOS technique.

The unprocessed ensemble forecast (UWME) is under-dispersed, in that too many observations fall outside the ensemble range, which forms a nominal 7/9 or 77.8% central prediction interval. Indeed, under exchangeability there is a 1/9 chance for the observed wind speed to fall below the range of the eight-member ensemble, and a 1/9 chance to fall above. The PIT histograms for the post-processed EMOS techniques show substantial improvement in calibration. To allow for direct comparisons, Table 2 shows the empirical coverage and the average width of 7/9 or 77.8% central prediction intervals from the various types of probabilistic forecasts. The results for the empirical coverage echo what we see in the histograms, in that the unprocessed ensemble forecast is highly uncalibrated. The EMOS intervals are much better calibrated, even though the Local EMOS forecasts are, on average, slightly under-dispersed.

To quantify sharpness, Table 2 shows the average width of the 77.8% central prediction interval for the various methods. The unprocessed ensemble forecast (UWME) is very sharp, but at the expense of being uncalibrated. Local EMOS returns sharper predictive distributions than Regional EMOS, and both EMOS methods are sharper than the climatological reference forecasts.

3.3 Predictive performance at individual stations

We now turn to results at the 107 SAO stations individually. Figures 8 and 9 compare the Local EMOS forecast to the unprocessed ensemble forecast (UWME) and the Regional EMOS forecast in terms of station-specific CRPS and MAE values. The color of the points in the scatter-plots indicates the site-specific average observed wind speed in calendar year 2003. The lower tercile of the stations is shown in blue, the middle tercile in green, and the upper tercile in red.

Several patterns emerge. Forecasts at stations with higher mean wind speed generally are more difficult. Mostly, the Local EMOS forecast performs the best and shows the lowest CRPS and MAE values.
The Regional EMOS technique shows the lowest CRPS at 31 sites, that is, at roughly one in three stations, corresponding approximately to the middle tercile of the mean wind speed. For the MAE, this number is slightly lower. In the cases in which the Regional EMOS technique outperforms the Local EMOS method, the improvement is marginal.
Conversely, if the Local EMOS method outperforms the Regional EMOS technique, the improvement can be substantial. Hence, for a typical station in the middle tercile, the predictive performance can be improved slightly by including off-site stations in the training set. For an atypical station, however, the inclusion of training data from other stations may lead to bias and dispersion corrections that are not representative of the local climate. In light of this, we prefer the Local EMOS method if predictions at a specific location are sought, such as at a wind energy or wind surfing site.

That said, it is possible that even for atypical stations the addition of carefully selected off-site data to the training set turns out to be beneficial. Potentially, climatological and geographic information can serve to cluster observation sites, to provide guidance in the spatial composition of the training set. Such a method could also be used to post-process NWP ensemble forecasts directly on the model grid, similarly to the bias removal technique developed by Mass, Baars, Wedam, Grimit, and Steed (2008).

Table 3: Mean continuous ranked probability score (CRPS) and mean absolute error (MAE), and coverage and average width of the 77.8% central prediction interval for probabilistic forecasts of wind speed at The Dalles, Oregon in 2003. Coverage in percent, all other values in knots. The MAE refers to the point forecast given by the median of the respective predictive distribution.

Columns: Forecast, CRPS, MAE, Coverage, Width
Rows: Persistence (6.17), Regional Climatology, Local Climatology, UWME, Regional EMOS, Local EMOS

Finally, we consider results at the city of The Dalles, Oregon, which is located at the eastern terminus of the Columbia River Gorge on the border between the US states of Washington and Oregon. The winds at The Dalles are generally dictated by the channeling effects of the Columbia River Gorge, the sole near-sea-level passage through the Cascade Mountains (Mass 2008).
This is one of the windiest places in the Pacific Northwest. There are several wind farms nearby, and the city has a reputation for being the best place to learn wind surfing. Summary measures of the predictive performance for the various forecast techniques at The Dalles are shown in Table 3. The Local EMOS method outperforms its competitors. Figure 10 illustrates the Local EMOS forecast distributions at The Dalles for the period of June 14 through July 31. Note that the Local EMOS density forecast for the first day in the display, June 14, is illustrated in Table 3 and Figure 2.
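The 77.8% level of the central prediction interval equals 7/9, which is the nominal coverage (m - 1)/(m + 1) of the range of m = 8 exchangeable ensemble members, so the interval is presumably chosen to be comparable with the range of the eight-member ensemble. The following is a minimal Python sketch of that arithmetic and of a central interval and median under a normal distribution censored at zero. The function names, the parameter values, and the specific censoring convention (all mass below zero moved to a point mass at zero) are assumptions made for illustration, not the authors' implementation.

```python
from statistics import NormalDist


def nominal_range_coverage(m: int) -> float:
    """Nominal coverage of the range of m exchangeable ensemble members."""
    return (m - 1) / (m + 1)


def censored_normal_quantile(p: float, mu: float, sigma: float) -> float:
    """Quantile at level p of a normal N(mu, sigma^2) censored at zero.

    Assumption: probability mass below zero is placed at zero, so the
    quantile is the ordinary normal quantile floored at zero.
    """
    return max(0.0, mu + sigma * NormalDist().inv_cdf(p))


def central_interval(mu: float, sigma: float, coverage: float):
    """Lower and upper bounds of the central interval with the given coverage."""
    tail = (1.0 - coverage) / 2.0
    lower = censored_normal_quantile(tail, mu, sigma)
    upper = censored_normal_quantile(1.0 - tail, mu, sigma)
    return lower, upper


# Hypothetical location and scale, in knots.
coverage = nominal_range_coverage(8)          # 7/9, about 77.8%
lower, upper = central_interval(10.0, 4.0, coverage)
median = censored_normal_quantile(0.5, 10.0, 4.0)  # point forecast
```

With a small location relative to the scale, the lower bound of the interval collapses to zero, which is the practical effect of the censoring.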
Figure 8: Comparison of the station-specific mean continuous ranked probability score (CRPS) for (a) the Local EMOS technique versus the unprocessed UWME forecast and (b) the Local EMOS technique versus the Regional EMOS method. The color of the points indicates the site-specific average observed wind speed. The lower tercile of the stations is shown in blue, the middle tercile in green, and the upper tercile in red. The scores are aggregated over the calendar year.

Figure 9: Same as Figure 8, but for the mean absolute error (MAE).
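For readers who want to reproduce comparisons of this kind, the CRPS of a raw ensemble can be computed from the kernel representation CRPS(F, y) = E|X - y| - (1/2) E|X - X'| discussed by Gneiting and Raftery (2007), with F taken to be the empirical distribution of the ensemble members. The sketch below assumes this empirical-distribution convention; the function names are hypothetical. For a single-valued forecast the score reduces to the absolute error, which is why the CRPS and the MAE can be reported side by side.

```python
from itertools import product


def crps_ensemble(members, obs):
    """CRPS of the empirical distribution of `members` at observation `obs`."""
    m = len(members)
    # E|X - y|: mean absolute difference between members and the observation.
    term_obs = sum(abs(x - obs) for x in members) / m
    # E|X - X'|: mean absolute difference over all ordered pairs of members.
    term_spread = sum(abs(x - y) for x, y in product(members, repeat=2)) / (m * m)
    return term_obs - 0.5 * term_spread


def mean_score(scores):
    """Average a collection of scores, e.g. over all days at one station."""
    return sum(scores) / len(scores)


# A one-member "ensemble" is a point forecast: the CRPS is the absolute error.
point_score = crps_ensemble([5.0], 7.0)
# A two-member ensemble bracketing the observation.
pair_score = crps_ensemble([1.0, 3.0], 2.0)
```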
Figure 10: 48-hour ahead Local EMOS forecasts of maximum wind speed at The Dalles, Oregon for June 14 through July 31. The Local EMOS 77.8% prediction interval is shown in gray. The small black dots represent the eight members of the University of Washington Mesoscale Ensemble; the blue points the verifying wind speeds. Missing days are due to missing data.

4 Discussion

We have shown how to apply heteroskedastic censored (Tobit) regression to statistically post-process ensemble forecasts of wind speed. This is in the tradition of regression or model output statistics (MOS) approaches that yield substantial improvement in the accuracy of point forecasts from numerical weather prediction (NWP) models. In Gneiting et al. (2005) and the current paper these methods have been developed further to yield the ensemble model output statistics (EMOS) technique, which generates full predictive distributions for future weather quantities, rather than just a bias-corrected point forecast.

In experiments with the University of Washington Mesoscale Ensemble (UWME) (Eckel and Mass 2005), we applied the EMOS technique to create 48-hour ahead forecasts of surface wind speed over the North American Pacific Northwest. The EMOS density forecasts turn out to be calibrated and sharp. They correct for biases and are much better calibrated than the raw ensemble, which is under-dispersive. The Local EMOS forecast distributions are sharp, in that the prediction intervals are much shorter on average than prediction intervals based on climatology. Furthermore, the median of the Local EMOS density provides a point forecast with much lower MAE than the ensemble median, or other reference forecasts. The UWME member forecasts come from clearly distinguishable sources.
Other ensemble systems, such as the global ensembles run by the European Centre for Medium-Range Weather Forecasts and the US National Centers for Environmental Prediction, have members that differ only in some random perturbations (Buizza et al. 2005). In these cases, members with identical statistical properties ought to be treated as exchangeable, and thus ought to have equal EMOS coefficients. This can be enforced easily, by constraining the regression coefficients $b_i = \beta_i^2$ in (1) and (2) to be equal.

Two general approaches to the statistical post-processing of NWP ensembles have emerged recently (Wilks 2006a; Bröcker and Smith 2008). The ensemble model output statistics (EMOS) approach pursued here fits a single, parametric predictive distribution using summary statistics from the ensemble. Another approach is based on kernel dressing or Bayesian model averaging (BMA), where each individual ensemble member is associated with a kernel function (Raftery et al. 2005; Sloughter et al. 2007). Pinson and Madsen (2008) use Gaussian kernels in a wind energy application. In Sloughter et al. (2008), BMA with kernel functions given by gamma densities is applied to obtain predictive distributions for wind speed, using the same data as presented here. The BMA method uses regional parameter estimation; thus it corresponds to the Regional EMOS technique and results for the two methods can be directly compared.¹ The predictive performance of the two techniques is nearly the same, as can be seen from our Table 2 and Table 1 of Sloughter et al. (2007). However, the EMOS technique is much simpler conceptually and is easier to implement. An R package tentatively named ensemblemos is under preparation.

The EMOS method does not take temporal or spatial correlation into account, in contrast to the approach taken by Gneiting et al. (2006), who build a spatio-temporal statistical model for short-range forecasts of wind speed at a prediction horizon of two hours. Indeed, the modeling of spatial or temporal correlation does not appear to be justifiable in the current context, since the dynamic evolution of the atmosphere is already captured by the NWP model.
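The exchangeability constraint mentioned in this discussion, equal coefficients $b_i = \beta_i^2$ for all members, can be illustrated with a short Python sketch. Equations (1) and (2) are not reproduced here, so the forms used below are assumptions: a location that is affine in the member forecasts, mu = a + b_1 x_1 + ... + b_m x_m, and a squared scale that is affine in the ensemble variance, sigma^2 = c + d S^2, with S^2 taken as the sample variance. All names and numerical values are hypothetical. Under these assumptions, equal coefficients make the location depend on the ensemble only through its mean.

```python
from math import sqrt
from statistics import mean, variance


def emos_location_scale(members, a, betas, c, d):
    """Location and scale under the assumed EMOS forms.

    mu = a + sum_i beta_i^2 * x_i  (squaring keeps each b_i nonnegative)
    sigma^2 = c + d * S^2          (S^2: sample variance of the members)
    """
    mu = a + sum(beta ** 2 * x for beta, x in zip(betas, members))
    sigma = sqrt(c + d * variance(members))
    return mu, sigma


def exchangeable_betas(beta, m):
    """Enforce exchangeability by giving all m members the same coefficient."""
    return [beta] * m


# Hypothetical eight-member ensemble of wind speed forecasts, in knots.
members = [8.0, 10.0, 12.0, 9.0, 11.0, 10.0, 13.0, 7.0]
betas = exchangeable_betas(0.3, len(members))
mu, sigma = emos_location_scale(members, a=1.0, betas=betas, c=2.0, d=0.5)

# With equal coefficients the location is a + m * b * (ensemble mean).
mu_from_mean = 1.0 + len(members) * 0.3 ** 2 * mean(members)
```

In this form the constrained model has a single regression coefficient for the ensemble, rather than one per member, which is the reduction the constraint is meant to achieve.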
The EMOS model addresses forecast biases, dispersion and phase errors, and the predictive distribution is conditional on the NWP ensemble forecast. In many applications, a wind speed forecast at a single prediction horizon and a single location is needed, such as a wind farm, an airport, or a wind surfing or sailing site. The EMOS method in its current form is tailored to this situation. If instead we are interested in future temporal and/or spatial trajectories of wind speed, the modeling of temporal and/or spatial dependencies becomes critical. Methods for probabilistic weather forecasting at multiple locations simultaneously have been developed for temperature (Gel, Raftery, and Gneiting 2004; Berrocal, Raftery, and Gneiting 2007) and precipitation (Berrocal, Raftery, and Gneiting 2008), and can possibly be adapted to wind speed. Pinson, Madsen, Nielsen, Papaefthymiou, and Klöckl (2008) study methods of probabilistic forecasting for temporal trajectories of wind resources. Presumably, these methods could be combined and applied to wind speed and wind energy, to provide decision support in a wealth of problems that are of economic, environmental and societal importance, such as air traffic control, ship routing, and wind power generation over a region or country. A first, exploratory step in this direction is taken by Vlasova, Pinson, Kotwa, Madsen, and Nielsen (2008).

¹ Sloughter et al. (2008) work with two additional SAO stations, which we discard, because they have fewer than 40 days of training data in late 2002, and hence are unable to provide our standard Local EMOS forecast. However, we ran Regional EMOS with the two stations added, which does not lead to any changes in Table 2. A Local BMA technique has not been implemented yet.

Acknowledgements

We are grateful to Jeff Baars, Veronica J. Berrocal, Chris Fraley, Clifford F. Mass, Adrian E. Raftery and J. McLean Sloughter for helpful discussions and for providing code and data. This research was supported by the National Science Foundation under Awards ATM and DMS, and by the Joint Ensemble Forecasting System (JEFS) under subcontract S from the University Corporation for Atmospheric Research (UCAR).

References

Allcroft, D. J. and C. A. Glasbey (2003). A latent Gaussian Markov random-field model for spatiotemporal rainfall disaggregation. Applied Statistics 52.
Anderson, J. L. (1996). A method for producing and evaluating probabilistic forecasts from ensemble model integrations. Journal of Climate 9.
Baars, J. (2005). Observations QC summary page. Available at washington.edu/mm5rt/qc obs/qc obs stats.html.
Berrocal, V. J., A. E. Raftery, and T. Gneiting (2007). Combining spatial statistical and ensemble information in probabilistic weather forecasts. Monthly Weather Review 135.
Berrocal, V. J., A. E. Raftery, and T. Gneiting (2008). Probabilistic quantitative precipitation field forecasting using a two-stage spatial model. Annals of Applied Statistics, in press.
Bertsekas, D. P. (1999). Nonlinear Programming (2nd ed.). Athena Scientific.
Bremnes, J. B. (2004). Probabilistic wind power forecasts using local quantile regression. Wind Energy 7.
Bröcker, J. and L. A. Smith (2008). From ensemble forecasts to predictive distribution functions. Tellus Ser. A 60.
Brown, B. G., R. W. Katz, and A. H. Murphy (1984). Time series models to simulate and forecast wind speed and wind power. Journal of Climate and Applied Meteorology 23.
Buizza, R., P. L. Houtekamer, Z. Toth, G. Pellerin, M. Wei, and Y. Zhu (2005). A comparison of the ECMWF, MSC and NCEP global ensemble prediction systems. Monthly Weather Review 133.
Campbell, S. D. and F. X. Diebold (2005). Weather forecasting for weather derivatives. Journal of the American Statistical Association 100.
Chen, S. and S. Khan (2000). Estimating censored regression models in the presence of nonparametric multiplicative heteroskedasticity. Journal of Econometrics 98.
Chib, S. (1992). Bayes inference in the Tobit censored regression model. Journal of Econometrics 51.
Costa, A., A. Crespo, J. Navarro, G. Lizcano, H. Madsen, and E. Feitosa (2008). A review on the young history of the wind power short-term prediction. Renewable and Sustainable Energy Reviews 12.
Dawid, A. P. (1984). Statistical theory: The prequential approach (with discussion and rejoinder). Journal of the Royal Statistical Society Ser. A 147.
Diebold, F. X., T. A. Gunther, and A. S. Tay (1998). Evaluating density forecasts with applications to financial risk management. International Economic Review 39.
Eckel, A. F. and C. F. Mass (2005). Aspects of effective mesoscale, short-range ensemble forecasting. Weather and Forecasting 20.
Gel, Y., A. E. Raftery, and T. Gneiting (2004). Calibrated probabilistic mesoscale weather field forecasting: The geostatistical output perturbation (GOP) method (with discussion and rejoinder). Journal of the American Statistical Association 99.
Genton, M. and A. Hering (2007). Blowing in the wind. Significance 4.
Glahn, H. R. and D. A. Lowry (1972). The use of model output statistics (MOS) in objective weather forecasting. Journal of Applied Meteorology 11.
Global Wind Energy Council (2008). Global Wind 2007 Report. Available at
Gneiting, T. (2008a). Editorial: Probabilistic forecasting. Journal of the Royal Statistical Society Ser. A 171.
Gneiting, T. (2008b). Quantiles as optimal point predictors. Technical Report 538, Department of Statistics, University of Washington. Available at washington.edu/research/reports/.
Gneiting, T., F. Balabdaoui, and A. E. Raftery (2007). Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society Ser. B 69.
Gneiting, T., K. Larson, K. Westrick, M. G. Genton, and E. Aldrich (2006). Calibrated probabilistic forecasting at the Stateline wind energy center: The regime-switching space-time method. Journal of the American Statistical Association 101.
Gneiting, T. and A. E. Raftery (2005). Weather forecasting with ensemble methods. Science 310.
Gneiting, T. and A. E. Raftery (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association 102.
Gneiting, T., A. E. Raftery, A. H. Westveld, and T. Goldman (2005). Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Monthly Weather Review 133.
Gneiting, T., L. I. Stanberry, E. P. Grimit, L. Held, and N. A. Johnson (2008). Assessing probabilistic forecasts of multivariate quantities, with applications to ensemble predictions of surface winds (with discussion and rejoinder). Test 17.
Good, I. J. (1952). Rational decisions. Journal of the Royal Statistical Society Ser. B 14.
Grell, G. A., J. Dudhia, and D. R. Stauffer (1995). A description of the fifth-generation Penn State/NCAR mesoscale model (MM5). Technical Note NCAR/TN-398+STR. Available at
Grimit, E. P., T. Gneiting, V. J. Berrocal, and N. A. Johnson (2006). The continuous ranked probability score for circular variables and its application to mesoscale forecast ensemble verification. Quarterly Journal of the Royal Meteorological Society 132.
Hamill, T. M. and S. J. Colucci (1997). Verification of Eta-RSM short-range ensemble forecasts. Monthly Weather Review 125.
Jun, S., R. Knutti, and D. W. Nychka (2008). Spatial analysis to quantify numerical model bias and dependence: How many climate models are there? Journal of the American Statistical Association 103.
Kharin, V. V. and F. W. Zwiers (2002). Climate predictions with multimodel ensembles. Journal of Climate 15.
Kretzschmar, R., P. Eckert, D. Cattani, and F. Eggimann (2004). Neural network classifiers for local wind prediction. Journal of Applied Meteorology 43.
Krishnamurti, T. N., C. M. Kishtawal, T. E. LaRow, D. R. Bachiochi, Z. Zhang, C. E. Williford, S. Gadgil, and S. Surendran (1999). Improved weather and seasonal climate forecasts from multimodel superensemble. Science 285.
Leslie, D. S., R. Kohn, and D. J. Nott (2007). A general approach to heteroskedastic regression. Statistics and Computing 17.
Mass, C. (2008). The Weather of the Pacific Northwest. University of Washington Press.
Mass, C. F., J. Baars, G. Wedam, E. Grimit, and R. Steed (2008). Removal of systematic model bias on a model grid. Weather and Forecasting 23.
Matheson, J. E. and R. L. Winkler (1976). Scoring rules for continuous probability distributions. Management Science 22.
Møller, J. K., H. A. Nielsen, and H. Madsen (2008). Time-adaptive quantile regression. Computational Statistics & Data Analysis 52.
National Research Council (2006). Completing the Forecast: Characterizing and Communicating Uncertainty for Better Decisions Using Weather and Climate Forecasts. The National Academies Press.
Nielsen, H. A., H. Madsen, and T. S. Nielsen (2006). Using quantile regression to extend an existing wind power forecasting system with probabilistic forecasts. Wind Energy 9.
Nott, D. J., W. T. M. Dunsmuir, R. Kohn, and F. Woodcock (2001). Statistical correction of a deterministic numerical weather prediction model. Journal of the American Statistical Association 96.
Pal, S. (2009). On a conjectured sharpness principle for probabilistic forecasting with calibration. Biometrika, in press.
Palmer, T. N. (2002). The economic value of ensemble forecasts as a tool for risk assessment: From days to decades. Quarterly Journal of the Royal Meteorological Society 128.
Pinson, P., C. Chevallier, and G. N. Kariniotakis (2007). Trading wind generation with short-term probabilistic forecasts of wind power. IEEE Transactions on Power Systems 22.
Pinson, P. and H. Madsen (2008). Ensemble-based probabilistic forecasting at Horns Rev. Wind Energy, in press.
Pinson, P., H. Madsen, H. A. Nielsen, G. Papaefthymiou, and B. Klöckl (2008). From probabilistic forecasts to statistical scenarios of short-term wind power production. Wind Energy, in press.
Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski (2005). Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review 133.
Roulston, M. S., D. T. Kaplan, J. Hardenberg, and L. A. Smith (2003). Using medium-range weather forcasts to improve the value of wind energy production. Renewable Energy 28.
Sansò, B. and L. Guenni (1999). Venezuelan rainfall data analysed by using a Bayesian space-time model. Applied Statistics 48.
Shao, X., M. L. Stein, and J. Ching (2007). Statistical comparisons of methods for interpolating the output of a numerical air quality model. Journal of Statistical Planning and Inference 137.
Sloughter, J. M., T. Gneiting, and A. E. Raftery (2008). Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. Technical Report 544, Department of Statistics, University of Washington. Available at edu/research/reports/.
Non Linear Dependence Structures: a Copula Opinion Approach in Portfolio Optimization Jean- Damien Villiers ESSEC Business School Master of Sciences in Management Grande Ecole September 2013 1 Non Linear
More informationPrediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
More informationA Procedure for Classifying New Respondents into Existing Segments Using Maximum Difference Scaling
A Procedure for Classifying New Respondents into Existing Segments Using Maximum Difference Scaling Background Bryan Orme and Rich Johnson, Sawtooth Software March, 2009 Market segmentation is pervasive
More informationAn exploratory study of the possibilities of analog postprocessing
Intern rapport ; IR 2009-02 An exploratory study of the possibilities of analog postprocessing Dirk Wolters De Bilt, 2009 KNMI internal report = intern rapport; IR 2009-02 De Bilt, 2009 PO Box 201 3730
More informationRenewable Energy Management System (REMS): Using optimisation to plan renewable energy infrastructure investment in the Pacific
Renewable Energy Management System (REMS): Using optimisation to plan renewable energy infrastructure investment in the Pacific Abstract: Faisal Wahid PhD Student at the Department of Engineering Science,
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More informationIS0 14040 INTERNATIONAL STANDARD. Environmental management - Life cycle assessment - Principles and framework
INTERNATIONAL STANDARD IS0 14040 First edition 1997006-15 Environmental management - Life cycle assessment - Principles and framework Management environnemental - Analyse du cycle de vie - Principes et
More informationThe Best of Both Worlds:
The Best of Both Worlds: A Hybrid Approach to Calculating Value at Risk Jacob Boudoukh 1, Matthew Richardson and Robert F. Whitelaw Stern School of Business, NYU The hybrid approach combines the two most
More informationLDA at Work: Deutsche Bank s Approach to Quantifying Operational Risk
LDA at Work: Deutsche Bank s Approach to Quantifying Operational Risk Workshop on Financial Risk and Banking Regulation Office of the Comptroller of the Currency, Washington DC, 5 Feb 2009 Michael Kalkbrener
More informationBig Ideas in Mathematics
Big Ideas in Mathematics which are important to all mathematics learning. (Adapted from the NCTM Curriculum Focal Points, 2006) The Mathematics Big Ideas are organized using the PA Mathematics Standards
More informationEngineering Problem Solving and Excel. EGN 1006 Introduction to Engineering
Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques
More informationSouth Africa. General Climate. UNDP Climate Change Country Profiles. A. Karmalkar 1, C. McSweeney 1, M. New 1,2 and G. Lizcano 1
UNDP Climate Change Country Profiles South Africa A. Karmalkar 1, C. McSweeney 1, M. New 1,2 and G. Lizcano 1 1. School of Geography and Environment, University of Oxford. 2. Tyndall Centre for Climate
More informationA Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data
A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt
More informationLecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization
Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization 2.1. Introduction Suppose that an economic relationship can be described by a real-valued
More informationAIR TEMPERATURE IN THE CANADIAN ARCTIC IN THE MID NINETEENTH CENTURY BASED ON DATA FROM EXPEDITIONS
PRACE GEOGRAFICZNE, zeszyt 107 Instytut Geografii UJ Kraków 2000 Rajmund Przybylak AIR TEMPERATURE IN THE CANADIAN ARCTIC IN THE MID NINETEENTH CENTURY BASED ON DATA FROM EXPEDITIONS Abstract: The paper
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationThe Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy
BMI Paper The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy Faculty of Sciences VU University Amsterdam De Boelelaan 1081 1081 HV Amsterdam Netherlands Author: R.D.R.
More informationCommunicating the value of probabilistic forecasts with weather roulette
METEOROLOGICAL APPLICATIONS Meteorol. Appl. (28) Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 1.12/met.92 Communicating the value of probabilistic forecasts with weather roulette
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationA Fuel Cost Comparison of Electric and Gas-Powered Vehicles
$ / gl $ / kwh A Fuel Cost Comparison of Electric and Gas-Powered Vehicles Lawrence V. Fulton, McCoy College of Business Administration, Texas State University, lf25@txstate.edu Nathaniel D. Bastian, University
More informationSolar Irradiance Forecasting Using Multi-layer Cloud Tracking and Numerical Weather Prediction
Solar Irradiance Forecasting Using Multi-layer Cloud Tracking and Numerical Weather Prediction Jin Xu, Shinjae Yoo, Dantong Yu, Dong Huang, John Heiser, Paul Kalb Solar Energy Abundant, clean, and secure
More informationModelling of Smart Low-Carbon Energy Systems. Imperial College London
Modelling of Smart Low-Carbon Energy Systems Imperial College London Goran Strbac, Marko Aunedi, Dimitrios Papadaskalopoulos, Meysam Qadrdan, Rodrigo Moreno, Danny Pudjianto, Predrag Djapic, Ioannis Konstantelos,
More informationUncertainty of Power Production Predictions of Stationary Wind Farm Models
Uncertainty of Power Production Predictions of Stationary Wind Farm Models Juan P. Murcia, PhD. Student, Department of Wind Energy, Technical University of Denmark Pierre E. Réthoré, Senior Researcher,
More informationA Project to Create Bias-Corrected Marine Climate Observations from ICOADS
A Project to Create Bias-Corrected Marine Climate Observations from ICOADS Shawn R. Smith 1, Mark A. Bourassa 1, Scott Woodruff 2, Steve Worley 3, Elizabeth Kent 4, Simon Josey 4, Nick Rayner 5, and Richard
More informationHong Kong Observatory Summer Placement Programme 2015
Annex I Hong Kong Observatory Summer Placement Programme 2015 Training Programme : An Observatory mentor with relevant expertise will supervise the students. Training Period : 8 weeks, starting from 8
More informationPS 271B: Quantitative Methods II. Lecture Notes
PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.
More informationMultivariate Analysis of Ecological Data
Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology
More information99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm
Error Analysis and the Gaussian Distribution In experimental science theory lives or dies based on the results of experimental evidence and thus the analysis of this evidence is a critical part of the
More informationLean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY
TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY Before we begin: Turn on the sound on your computer. There is audio to accompany this presentation. Audio will accompany most of the online
More informationPractical. I conometrics. data collection, analysis, and application. Christiana E. Hilmer. Michael J. Hilmer San Diego State University
Practical I conometrics data collection, analysis, and application Christiana E. Hilmer Michael J. Hilmer San Diego State University Mi Table of Contents PART ONE THE BASICS 1 Chapter 1 An Introduction
More informationOBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS
OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS CLARKE, Stephen R. Swinburne University of Technology Australia One way of examining forecasting methods via assignments
More informationModel-based Synthesis. Tony O Hagan
Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that
More informationSouth Carolina College- and Career-Ready (SCCCR) Probability and Statistics
South Carolina College- and Career-Ready (SCCCR) Probability and Statistics South Carolina College- and Career-Ready Mathematical Process Standards The South Carolina College- and Career-Ready (SCCCR)
More informationUSING SIMULATED WIND DATA FROM A MESOSCALE MODEL IN MCP. M. Taylor J. Freedman K. Waight M. Brower
USING SIMULATED WIND DATA FROM A MESOSCALE MODEL IN MCP M. Taylor J. Freedman K. Waight M. Brower Page 2 ABSTRACT Since field measurement campaigns for proposed wind projects typically last no more than
More informationEXPLORING SPATIAL PATTERNS IN YOUR DATA
EXPLORING SPATIAL PATTERNS IN YOUR DATA OBJECTIVES Learn how to examine your data using the Geostatistical Analysis tools in ArcMap. Learn how to use descriptive statistics in ArcMap and Geoda to analyze
More informationNOAA to Provide Enhanced Frost Forecast Information to Improve Russian River Water Management
NOAA to Provide Enhanced Frost Forecast Information to Improve Russian River Water Management David W. Reynolds Meteorologist in Charge (Retired) National Weather Service Forecast Office San Francisco
More informationPie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.
Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of
More informationThis unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.
Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course
More information