Statistical Software for Weather and Climate
|
|
|
- Brianne Lawrence
- 10 years ago
- Views:
Transcription
1 Statistical Software for Weather and Climate Eric Gilleland, National Center for Atmospheric Research, Boulder, Colorado, U.S.A. Effects of climate change: coastal systems, policy implications, and the role of statistics. Interdisciplinary Workshop March 2009, Preluna Hotel and Spa, Sliema, Malta
2 The R programming language R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN , Vance A, Data analysts captivated by R s power. New York Times, 6 January Available at: business-computing/07program.html?_r=2
3 Already familiar with R? Advanced (potentially useful) topics: Reading and Writing NetCDF file formats: A Climate Related Precipitation Example for Colorado:
4 R preliminaries Assuming R is installed on your computer... In linux, unix, and Mac (terminal/xterm) the directory in which R is opened is (by default) the current working directory. In Windows (Mac GUI?), the working directory is usually in one spot, but can be changed (tricky). Open an R workspace: Type R at the command prompt (linux/unix, Mac terminal/xterm) or double click on R s icon (Windows, Mac GUI). getwd() # Find out which directory is the current working directory.
5 R preliminaries Assigning vectors and matrices to objects: # Assign a vector containing the numbers -1, 4 and 0 to an object called x x < c( -1, 4, 0) # Assign a 3 2 matrix with column vectors: 2, 1, 5 and # 3, 7, 9 to an object called y. y < cbind( c( 2, 1, 5), c(3, 7, 9)) # Write x and y out to the screen. x y
6 R preliminaries Saving a workspace and exiting # To save a workspace without exiting R. save.image() # To exit R while also saving the workspace. q("yes") # Exit R without saving the workspace. q("no") # Or, interactively... q()
7 R preliminaries Subsetting vectors: # Look at only the 3-rd element of x. x[3] # Look at the first two elements of x. x[1:2] # The first and third. x[c(1,3)] # Everything but the second element. x[-2]
8 R preliminaries Subsetting matrices: # Look at the first row of y. y[1,] # Assign the first column of y to a vector called y1. # Similarly for the 2nd column. y1 < y[,1] y2 < y[,2] # Assign a "missing value" to the second row, first column # element of y. y[2,1] < NA
9 R preliminaries Logicals and Missing Values: # Do x and/or y have any missing values? any( is.na( x)) any( is.na( y)) # Replace any missing values in y with y[ is.na( y)] < # Which elements of x are equal to 0? x == 0
10 R preliminaries Contributed packages # Install some useful packages. Need only do once. install.packages( c("fields", # A spatial stats package. "evd", # An EVA package. "evdbayes", # Bayesian EVA package. "ismev", # Another EVA package. "maps", # For adding maps to plots. "SpatialExtremes") # Now load them into R. Must do for each new session. library( fields) library( evd) library( evdbayes) library( ismev) library( SpatialExtremes)
11 R preliminaries See hierarchy of loaded packages: search() # Detach the SpatialExtremes package. detach(pos=2) See how to reference a contributed package: citation("fields")
12 R preliminaries Simulate some random fields From the help file for the fields function sim.rf help( sim.rf) #Simulate a Gaussian random field with an exponential # covariance function, range parameter = 2.0 and the # domain is [0, 5] [0, 5] evaluating the field at a # grid. grid < list( x= seq( 0,5 100), y= seq(0,5 100)) obj < Exp.image.cov( grid=grid, theta=.5, setup=true) look < sim.rf( obj) # Now simulate another... look2 < sim.rf( obj)
13 R preliminaries Plotting the simulated fields # setup the plotting device to have two plots side-by-side set.panel(2,1) # Image plot with a color scale. image.plot( grid$x, grid$y, look) title("simulated gaussian field") image.plot( grid$x, grid$y, look2) title("another (independent) realization...")
14 R preliminaries Basics of plotting in R: First must open a device on which to plot. Most plotting commands (e.g., plot) open a device (that you can see) if one is not already open. If a device is open, it will write over the current plot. X11() will also open a device that you can see. To create a file with the plot(s), use postscript, jpeg, png, or pdf (before calling the plotting routines. Use dev.off() to close the device and create the file. plot and many other plotting functions use the par values to define various characteristics (e.g., margins, plotting symbols, character sizes, etc.). Type help( plot) and help( par) for more information.
15 R preliminaries Simple plot example. plot( 1:10, z <- rnorm(10), type="l", xlab="", ylab="z", main="std Normal Random Sample") points( 1:10, z, col="red", pch="s", cex=2) lines( 1:10, rnorm(10), col="blue", lwd=2, lty=2) # Make a standard normal qq-plot of z. qqnorm( z) # Shut off the device. dev.off()
16 Background on Extreme Value Analysis (EVA) Motivation Sums, averages and proportions (Normality) Central Limit Theorem (CLT) Limiting distribution of binomial distribution Extremes Normal distribution inappropriate Bulk of data may be misleading Extremes are often rare, so often not enough data
17 Background on Extreme Value Analysis (EVA) Simulations # Simulate a sample of 1000 from a Unif(0,1) distribution. U < runif( 1000) hist( U) # Simulate a sample of 1000 from a N(0,1) distribution. Z < rnorm( 1000) hist( Z) # Simulate a sample of 1000 from a Gumbel distribution. M < rgev( 1000) hist( M)
18 Background on Extreme Value Analysis (EVA) Simulations # Simulate 1000 maxima from samples of size 30 from # the normal distribution. Zmax < matrix( NA, 30, 1000) dim( Zmax) for( i in 1:1000) Zmax[,i] < rnorm( 30) Zmax < apply( Zmax, 2, max) dim( Zmax) class( Zmax) length( Zmax) class( Zmax) hist( Zmax, breaks="fd", col="blue")
19 Background on Extreme Value Analysis (EVA) Simulations # Simulate maxima from samples of size 30 from # the Unif(0,1) distribution. Umax <- matrix( NA, 30, 1000) for( i in 1:1000) Umax[,i] < runif( 30) Umax < apply( Umax, 2, max) hist( Umax, breaks="fd", col="blue") # Simulate 1000 maxima from samples of size 30 from # the Fréchet distribution. Fmax < matrix( NA, 30, 1000) for( i in 1:1000) Fmax[,i] < rgev( 30, loc=2, scale=2.5, shape=0.5) Fmax < apply( Fmax, 2, max, finite=true, na.rm=true) hist( Fmax, breaks="fd", col="blue")
20 Background on Extreme Value Analysis (EVA) Simulations Last one # Simulate 1000 maxima from samples of size 30 from # the exponential distribution. Emax <- matrix( NA, 30, 1000) for( i in 1:1000) Emax[,i] <- rexp( 30) Emax <- apply( Emax, 2, max) # This time, compare with samples from the # exponential distribution. par( mfrow=c(1,2)) hist( Emax, breaks="fd", col="blue") hist( rexp(1000), breaks="fd", col="blue")
21 Background on Extreme Value Analysis (EVA) Extremal Types Theorem Let X 1,..., X n be a sequence of independent and identically distributed (iid) random variables with common distribution function, F. Want to know the distribution of M n = max{x 1,..., X n }. Example: X 1,..., X n could represent hourly precipitation, daily ozone concentrations, daily average temperature, etc. Interest for now is in maxima of these processes over particular blocks of time.
22 Background on Extreme Value Analysis (EVA) Extremal Types Theorem If interest is in the minimum over blocks of data (e.g., monthly minimum temperature), then note that min{x 1,..., X n } = max{ X 1,..., X n } Therefore, we can focus on the maxima.
23 Background on Extreme Value Analysis (EVA) Extremal Types Theorem Could try to derive the distribution for M n exactly for all n as follows. Pr{M n z} = Pr{X 1 z,..., X n z} indep. = Pr{X 1 z} Pr{X n z} ident. dist. = {F (z)} n.
24 Background on Extreme Value Analysis (EVA) Extremal Types Theorem Could try to derive the distribution for M n exactly for all n as follows. Pr{M n z} = Pr{X 1 z,..., X n z} indep. = Pr{X 1 z} Pr{X n z} ident. dist. = {F (z)} n. But! If F is not known, this is not very helpful because small discrepancies in the estimate of F can lead to large discrepancies for F n.
25 Background on Extreme Value Analysis (EVA) Extremal Types Theorem Could try to derive the distribution for M n exactly for all n as follows. Pr{M n z} = Pr{X 1 z,..., X n z} indep. = Pr{X 1 z} Pr{X n z} ident. dist. = {F (z)} n. But! If F is not known, this is not very helpful because small discrepancies in the estimate of F can lead to large discrepancies for F n. Need another strategy!
26 Background on Extreme Value Analysis (EVA) Extremal Types Theorem If there exist sequences of constants {a n > 0} and {b n } such that { } Mn b n Pr z G(z) as n, a n where G is a non-degenerate distribution function, then G belongs to one of the following three types.
27 Background on Extreme Value Analysis (EVA) Extremal Types Theorem I. Gumbel II. Fréchet III. Weibull G(z) = exp G(z) = G(z) = { [ exp with parameters a, b and α > 0. ( )]} z b, < z < a { 0, { z b, exp ( ) } z b α a, z > b; { { [ ( exp z b ) α ]} a, z < b, 1, z b
28 Background on Extreme Value Analysis (EVA) Extremal Types Theorem The three types can be written as a single family of distributions, known as the generalized extreme value (GEV) distribution. G(z) = exp { [ 1 + ξ ( z µ σ )] 1/ξ + }, where y + = max{y, 0}, < µ, ξ < and σ > 0.
29 Background on Extreme Value Analysis (EVA) GEV distribution Three parameters: location (µ), scale (σ) and shape (ξ). 1. ξ = 0 (Gumbel type, limit as ξ 0) 2. ξ > 0 (Fréchet type) 3. ξ < 0 (Weibull type)
30 Background on Extreme Value Analysis (EVA) Gumbel type Light tail Domain of attraction for many common distributions (e.g., normal, lognormal, exponential, gamma) Gumbel pdf
31 Background on Extreme Value Analysis (EVA) Fréchet type Heavy tail E[X r ] = for r 1/ξ (i.e., infinite variance if ξ 1/2) Of interest for precipitation, streamflow, economic impacts Frechet pdf
32 Background on Extreme Value Analysis (EVA) Weibull type Bounded upper tail at µ σ ξ Of interest for temperature, wind speed, sea level Weibull pdf
33 Background on Extreme Value Analysis (EVA) Normal vs. GEV ## The probability of exceeding increasingly high values ## (as they double). # Normal pnorm( c(1,2,4,8,16,32), lower.tail=false) # Gumbel pgev( c(1,2,4,8,16,32), lower.tail=false) # Fréchet pgev( c(1,2,4,8,16,32), shape=0.5, lower.tail=false) # Weibull (note bounded upper tail!) pgev( c(1,2,4,8,16,32), shape=-0.5, lower.tail=false)
34 Background on Extreme Value Analysis (EVA) Normal vs. GEV # Find Pr{X < x} for X = 0,..., 20. cdfnorm < pnorm( 0:20) cdfgum < pgev( 0:20) cdffrech < pgev( 0:20, shape=0.5) cdfweib < pgev( 0:20, shape=-0.5) # Now find Pr{X = x} for X = 0,..., 20. pdfnorm < dnorm( 0:20) pdfgum < dgev( 0:20) pdffrech < dgev( 0:20, shape=0.5) pdfweib < dgev( 0:20, shape=-0.5)
35 Background on Extreme Value Analysis (EVA) Normal vs. GEV par( mfrow=c(2,1), mar=c(5,4,0.5,0.5)) plot( 0:20, cdfnorm, ylim=c(0,1), type="l", xaxt="n", col="blue", lwd=2, xlab="", ylab="f(x)") lines( 0:20, cdfgum, col="green", lty=2, lwd=2) lines( 0:20, cdffrech, col="red", lwd=2) lines( 0:20, cdfweib, col="orange", lwd=2) legend( 10, 0.05, legend=c("normal", "Gumbel", "Frechet", "Weibull"), col=c("blue", "green", "red", "orange"), lty=c(1,2,1,1), bty="n", lwd=2)
36 Background on Extreme Value Analysis (EVA) Normal vs. GEV plot( 0:20, pdfnorm, ylim=c(0,1), typ="l", col="blue", lwd=2, xlab="x", ylab="f(x)") lines( 0:20, pdfgum, col="green", lty=2, lwd=2) lines( 0:20, pdffrech, col="red", lwd=2) lines( 0:20, pdfweib, col="orange", lwd=2)
37 Example Fort Collins, Colorado daily precipitation amount Time series of daily precipitation amount (in), Semi-arid region. Marked annual cycle in precipitation (wettest in late spring/early summer, driest in winter). No obvious long-term trend. Recent flood, 28 July (substantial damage to Colorado State University)
38 Example Fort Collins, Colorado precipitation Fort Collins daily precipitation Precipitation (in) Jan Mar May Jul Sep Nov
39 Example Fort Collins, Colorado precipitation Annual Maxima Fort Collins annual maximum daily precipitation 1997 Precipitation (in) Year
40 Example Fort Collins, Colorado precipitation How often is such an extreme expected? Assume no long-term trend emerges. Using annual maxima removes effects of annual trend in analysis. Annual Maxima fit to GEV. # Fit Fort Collins annual maximum precipitation to GEV. fit < gev.fit( ftcanmax$prec/100) # Check the quality of the fit. gev.diag( fit)
41 Example Fort Collins, Colorado precipitation Fit looks good (from diagnostic plots). Parameter Estimate (Std. Error) Location (µ) (0.617) Scale (σ) (0.488) Shape (ξ) (0.092)
42 Example Fort Collins, Colorado precipitation # Is the shape parameter really not zero? # Perform likelihood ratio test against Gumbel type. fit0 < gum.fit( ftcanmax$prec/100) Dev < 2*(fit0$nllh - fit$nllh) pchisq( Dev, 1, lower.tail=false) Likelihood ratio test for ξ = 0 rejects hypothesis of Gumbel type (p-value 0.038).
43 Example Fort Collins, Colorado precipitation 95% Confidence intervals for ξ, using profile likelihood, are: (0.009, 0.369). Profile Log likelihood Use gev.profxi and locator(2) to find CI s Shape Parameter
44 Example Fort Collins, Colorado precipitation Return Levels # Currently must assign the class "gev.fit" to the # fitted object, fit, in order to use the "extremes" # function, return.level. class( fit) < "gev.fit" fit.rl < return.level( fit) Return Estimated Return 95% CI Period Level (in) (2.41, 3.21) ?(3.35, 6.84).
45 Example Fort Collins, Colorado precipitation Return Levels CI s from return.level are based on the delta method, which assumes normality for the return levels. For longer return periods (e.g., beyond the range of the data), this assumption may not be valid. Can check by looking at the profile likelihood. gev.prof( fit, m=100, xlow=2, xup=8) Highly skewed! Using locator(2), a better approximation for the (95%) 100-year return level CI is about (3.9, 8.0).
46 Example Fort Collins, Colorado precipitation Probability of annual maximum precipitation at least as large as that during the 28 July 1997 flood (i.e., Pr{max(X) 1.54 in.}). # Using the pgev function from the "evd" package. pgev( 1.54, loc=fit$mle[1], scale=fit$mle[2], shape=fit$mle[3], lower.tail=false) pgev( 4.6, loc=fit$mle[1], scale=fit$mle[2], shape=fit$mle[3], lower.tail=false)
47 Peaks Over Thresholds (POT) Approach Let X 1, X 2,... be an iid sequence of random variables, again with marginal distribution, F. Interest is now in the conditional probability of X s exceeding a certain value, given that X already exceeds a sufficiently large threshold, u. Pr{X > u + y X > u} = 1 F (u + y), y > 0 1 F (u) Once again, if we know F, then the above probability can be computed. Generally not the case in practice, so we turn to a broadly applicable approximation.
48 Peaks Over Thresholds (POT) Approach If Pr{max{X 1,..., X n } z} G(z), where { [ ( )] } 1/ξ z µ G(z) = exp 1 + ξ σ for some µ, ξ and σ > 0, then for sufficiently large u, the distribution [X u X > u], is approximately the generalized Pareto distribution (GPD). Namely, ( H(y) = ξỹ ) 1/ξ, y > 0, σ with σ = σ + ξ(u µ) (σ, ξ and µ as in G(z) above). +
49 Peaks Over Thresholds (POT) Approach Pareto type (ξ > 0) heavy tail Beta type (ξ < 0) bounded above at u σ/ξ Exponential type (ξ = 0) light tail pdf GPD Pareto (xi>0) Beta (xi<0) Exponential (xi=0)
50 Peaks Over Thresholds (POT) Approach Hurricane damage Economic Damage from Hurricanes ( ) Economic damage caused by hurricanes from 1926 to billion US$ Trends in societal vulnerability removed. Excess over threshold of u = 6 billion US$ Year
51 Peaks Over Thresholds (POT) Approach Hurricane damage GPD pdf ˆσ = ˆξ Likelihood ratio test for ξ = 0 (p-value 0.018) 95% CI for shape parameter using profile likelihood < ξ < 1.56
52 Peaks Over Thresholds (POT) Approach Choosing a threshold Variance/bias trade-off Low threshold allows for more data (low variance). Theoretical justification for GPD requires a high threshold (low bias). Modified Scale Threshold gpd.fitrange( damage$dam, 2, 7) Shape Threshold
53 Peaks Over Thresholds (POT) Approach Dependence above threshold Often, threshold excesses are not independent. For example, a hot day is likely to be followed by another hot day. Various procedures to handle dependence. Model the dependence. De-clustering (e.g., runs de-clustering). Resampling to estimate standard errors (avoid tossing out information about extremes).
54 Peaks Over Thresholds (POT) Approach Dependence above threshold Phoenix airport (negative) Min. Temperature (deg. F) Phoenix (airport) minimum temperature ( o F). July and August Urban heat island (warming trend as cities grow). Model lower tail as upper tail after negation.
55 Peaks Over Thresholds (POT) Approach Dependence above threshold # Fit without de-clustering. phx.fit0 < gpd.fit( -Tphap$MinT, -73) # With runs de-clustering (r=1). phx.dc < dclust( -Tphap$MinT, u=-73, r=1, cluster.by=tphap$year) phxdc.fit0 < gpd.fit( phx.dc$xdat.dc, -73)
56 Peaks Over Thresholds (POT) Approach Point Process: frequency and intensity of threshold excesses Event is a threshold excess (i.e., X > u). Frequency of occurrence of an event (rate parameter), λ > 0. Pr{no events in [0, T ]} = e λt Mean number of events in [0, T ] = λt. GPD for excess over threshold (intensity).
57 Peaks Over Thresholds (POT) Approach Point Process: frequency and intensity of threshold excesses Relation of parameters of GEV(µ,σ,ξ) to parameters of point process (λ,σ,ξ). Shape parameter, ξ, identical. log λ = 1 ξ log ( 1 + ξ u µ ) σ σ = σ + ξ(u µ) More detail: Time scaling constant, h. For example, for annual maximum of daily data, h 1/ Change of time scale, h, for GEV(µ,σ,ξ) to h ( ) [ ξ h σ = σ and µ = µ + 1 ( ) ]} ξ h {σ 1 h ξ h
58 Peaks Over Thresholds (POT) Approach Point Process: frequency and intensity of threshold excesses Two ways to estimate PP parameters Orthogonal approach (estimate frequency and intensity separately). Convenient to estimate. Difficult to interpret in presence of covariates. GEV re-parameterization (estimate both simultaneously). More difficult to estimate. Interpretable even with covariates.
59 Peaks Over Thresholds (POT) Approach Point Process: frequency and intensity of threshold excesses Fort Collins, Colorado daily precipitation Analyze daily data instead of just annual maxima (ignoring annual cycle for now). Orthogonal Approach ˆλ = No. X i > No. X i 10.6 per year ˆσ 0.323, ˆξ 0.212
60 Peaks Over Thresholds (POT) Approach Point Process: frequency and intensity of threshold excesses Fort Collins, Colorado daily precipitation Analyze daily data instead of just annual maxima (ignoring annual cycle for now). Point Process ˆλ = [ 1 + ˆξ (u ˆµ) ˆσ ˆµ ˆσ = ˆξ ] 1/ˆξ 10.6 per year
61 Risk Communication Under Stationarity Unchanging climate Return level, z p, is the value associated with the return period, 1/p. That is, z p is the level expected to be exceeded on average once every 1/p years. That is, Return level, z p, with 1/p-year return period is z p = F 1 (1 p). For example, p = 0.01 corresponds to the 100-year return period. Easy to obtain from GEV and GP distributions (stationary case).
62 Risk Communication Under Stationarity Unchanging climate For example, GEV return level is given by z p = µ σ ξ [1 ( log(1 p))] ξ Return level with (1/p) year return period GEV pdf p p Similar for GPD, but must take λ into account. z_p
63 Risk Communication Under Stationarity Unchanging climate Compare previous GPD fits (with and without de-clustering). Must first assign the class name, gpd.fit" to each so that the extremes function, return.level, will know whether the objects refer to a GEV or GPD fit. # Without de-clustering. class( phx.fit0) < "gpd.fit" return.level( phx.fit0, make.plot=false) # With de-clustering. class( phxdc.fit0) < "gpd.fit" return.level( phxdc.fit0, make.plot=false)
64 Non-Stationarity Sources Trends: climate change: trends in frequency and intensity of extreme weather events. Cycles: Annual and/or diurnal cycles often present in meteorological variables. Other.
65 Non-Stationarity Theory No general theory for non-stationary case. Only limited results under restrictive conditions. Can introduce covariates in the distribution parameters.
66 Non-Stationarity Phoenix minimum temperature Phoenix summer minimum temperature Minimum temperature (deg. F) Year
67 Non-Stationarity Phoenix minimum temperature Recall: min{x 1,..., X n } = max{ X 1,..., X n }. Assume summer minimum temperature in year t = 1, 2,... has GEV distribution with: µ(t) = µ 0 + µ 1 t log σ(t) = σ 0 + σ 1 t ξ(t) = ξ
68 Non-Stationarity Phoenix minimum temperature Note: To convert back to min{x 1,..., X n }, change sign of location parameters. But note that model is Pr{ X x} = Pr{X x} = 1 F ( x). ˆµ(t) t log ˆσ(t) t Likelihood ratio test ˆξ 0.21 for µ 1 = 0 (p-value < 10 5 ), for σ 1 = 0 (p-value 0.366).
69 Non-Stationarity Phoenix minimum temperature Model Checking. Found the best model from a range of models, but is it a good representation of the data? Transform data to a common distribution, and check the qq-plot. 1. Non-stationary GEV to exponential { ε t = 1 + ˆξ(t) ˆσ(t) [X t ˆµ(t)] } 1/ˆξ(t) 2. Non-stationary GEV to Gumbel (used by ismev/extremes ε t = 1 { ( )} ˆξ(t) log 1 + ˆξ(t) Xt ˆµ(t) ˆσ(t)
70 Non-Stationarity Phoenix minimum temperature Model Checking. Found the best model from a range of models, but is it a good representation of the data? Transform data to a common distribution, and check the qq-plot. Q Q Plot (Gumbel Scale): Phoenix Min Temp Empirical Model
71 Non-Stationarity Physically based covariates Winter maximum daily temperature at Port Jervis, New York Let X 1,..., X n be the winter maximum temperatures, and Z 1,..., Z n the associated Arctic Oscillation (AO) winter index. Given Z = z, assume conditional distribution of winter maximum temperature is GEV with parameters µ(z) = µ 0 + µ 1 z log σ(z) = σ 0 + σ 1 z ξ(z) = ξ
72 Non-Stationarity Physically based covariates Winter maximum daily temperature at Port Jervis, New York ˆµ(z) z log ˆσ(z) = z ξ(z) = Likelihood ratio test for µ 1 = 0 (p-value < 0.001) Likelihood ratio test for σ 1 = 0 (p-value 0.635)
73 Non-Stationarity Cyclic variation Fort Collins, Colorado precipitation Fort Collins daily precipitation Precipitation (in) Jan Mar May Jul Sep Nov
74 Non-Stationarity Cyclic variation Fort Collins, Colorado precipitation Orthogonal approach. First fit annual cycle to Poisson rate parameter (T = ): ( ) ( ) 2πt 2πt log λ(t) = λ 0 + λ 1 sin + λ 2 cos T T Giving ( ) ( ) log ˆλ(t) 2πt 2πt sin 0.85 cos T T Likelihood ratio test for λ 1 = λ 2 = 0 (p-value 0).
75 Non-Stationarity Cyclic variation Fort Collins, Colorado precipitation Orthogonal approach. Next fit GPD with annual cycle in scale parameter. ( ) ( ) 2πt 2πt log σ (t) = σ0 + σ1 sin + σ2 cos T T Giving ( ) ( ) 2πt 2πt log ˆσ (t) sin 0.30 cos T T Likelihood ratio test for σ 1 = σ 2 = 0 (p-value < 10 5 )
76 Non-Stationarity Cyclic variation Fort Collins, Colorado precipitation Annual cycle in location and scale parameters of the GEV re-parameterization approach point process model with t = 1, 2,..., and T = µ(t) = µ 0 + µ 1 sin ( 2πt T log σ(t) = σ 0 + σ 1 sin ( 2πt T ξ(t) = ξ ) + µ2 cos ( ) 2πt T ) + σ2 cos ( 2πt T )
77 Non-Stationarity Cyclic variation Fort Collins, Colorado precipitation ˆµ(t) sin ( 2πt T log ˆσ(t) sin ( 2πt T ˆξ ) ( cos 2πt ) T ) cos ( 2πt T ) Likelihood ratio test for µ 1 = µ 2 = 0 (p-value 0). Likelihood ratio test for σ 1 = σ 2 = 0 (p-value 0).
78 Non-Stationarity Cyclic variation Fort Collins, Colorado precipitation Residual quantile Plot (Exptl. Scale) empirical model
79 Risk Communication (Under Non-Stationarity) Return period/level does not make sense anymore because of changing distribution (e.g., with time). Often, one uses an effective" return period/level instead. That is, compute several return levels for varying probabilities over time. Can also determine a single return period/level assuming temporal independence. 1 1 n m = Pr {max(x 1,..., X n ) z m } p i, where p i = { 1 1 n y 1/ξ i i, for y i > 0, 1, otherwise where y i = 1 + ξ i σ i (z m µ i ), and (µ i, σ i, ξ i ) are the parametrs of the point process model for observation i. Can be easily solved for z m (using numerical methods). Difficulty is in calculating the uncertainty (See Coles, 2001, chapter 7). i=1
80 References Coles S, An introduction to statistical modeling of extreme values. Springer, London. 208 pp. Katz RW, MB Parlange, and P Naveau, Statistics of extremes in hydrology. Adv. Water Resources, 25: Stephenson A and E Gilleland, Software for the analysis of extreme events: The current state and future directions. Extremes, 8:
An introduction to the analysis of extreme values using R and extremes
An introduction to the analysis of extreme values using R and extremes Eric Gilleland, National Center for Atmospheric Research, Boulder, Colorado, U.S.A. http://www.ral.ucar.edu/staff/ericg Graybill VIII
The R programming language. Statistical Software for Weather and Climate. Already familiar with R? R preliminaries
Statistical Software for Weather and Climate he R programming language Eric Gilleland, National Center for Atmospheric Research, Boulder, Colorado, U.S.A. http://www.ral.ucar.edu/staff/ericg R Development
An Introduction to Extreme Value Theory
An Introduction to Extreme Value Theory Petra Friederichs Meteorological Institute University of Bonn COPS Summer School, July/August, 2007 Applications of EVT Finance distribution of income has so called
The Extremes Toolkit (extremes)
The Extremes Toolkit (extremes) Weather and Climate Applications of Extreme Value Statistics Eric Gilleland National Center for Atmospheric Research (NCAR), Research Applications Laboratory (RAL) Richard
Estimation and attribution of changes in extreme weather and climate events
IPCC workshop on extreme weather and climate events, 11-13 June 2002, Beijing. Estimation and attribution of changes in extreme weather and climate events Dr. David B. Stephenson Department of Meteorology
Review of Random Variables
Chapter 1 Review of Random Variables Updated: January 16, 2015 This chapter reviews basic probability concepts that are necessary for the modeling and statistical analysis of financial data. 1.1 Random
Extreme Value Analysis of Corrosion Data. Master Thesis
Extreme Value Analysis of Corrosion Data Master Thesis Marcin Glegola Delft University of Technology, Faculty of Electrical Engineering, Mathematics and Computer Science, The Netherlands July 2007 Abstract
Operational Risk Management: Added Value of Advanced Methodologies
Operational Risk Management: Added Value of Advanced Methodologies Paris, September 2013 Bertrand HASSANI Head of Major Risks Management & Scenario Analysis Disclaimer: The opinions, ideas and approaches
Extreme-Value Analysis of Corrosion Data
July 16, 2007 Supervisors: Prof. Dr. Ir. Jan M. van Noortwijk MSc Ir. Sebastian Kuniewski Dr. Marco Giannitrapani Outline Motivation Motivation in the oil industry, hundreds of kilometres of pipes and
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
SUMAN DUVVURU STAT 567 PROJECT REPORT
SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller Getting to know the data An important first step before performing any kind of statistical analysis is to familiarize
CATASTROPHIC RISK MANAGEMENT IN NON-LIFE INSURANCE
CATASTROPHIC RISK MANAGEMENT IN NON-LIFE INSURANCE EKONOMIKA A MANAGEMENT Valéria Skřivánková, Alena Tartaľová 1. Introduction At the end of the second millennium, a view of catastrophic events during
Generalized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
Exploratory Data Analysis
Exploratory Data Analysis Johannes Schauer [email protected] Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011
Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this
4 Other useful features on the course web page. 5 Accessing SAS
1 Using SAS outside of ITCs Statistical Methods and Computing, 22S:30/105 Instructor: Cowles Lab 1 Jan 31, 2014 You can access SAS from off campus by using the ITC Virtual Desktop Go to https://virtualdesktopuiowaedu
Logistic Regression (a type of Generalized Linear Model)
Logistic Regression (a type of Generalized Linear Model) 1/36 Today Review of GLMs Logistic Regression 2/36 How do we find patterns in data? We begin with a model of how the world works We use our knowledge
Time Series Analysis of Aviation Data
Time Series Analysis of Aviation Data Dr. Richard Xie February, 2012 What is a Time Series A time series is a sequence of observations in chorological order, such as Daily closing price of stock MSFT in
Probabilistic Forecasting of Medium-Term Electricity Demand: A Comparison of Time Series Models
Fakultät IV Department Mathematik Probabilistic of Medium-Term Electricity Demand: A Comparison of Time Series Kevin Berk and Alfred Müller SPA 2015, Oxford July 2015 Load forecasting Probabilistic forecasting
Time Series Analysis with R - Part I. Walter Zucchini, Oleg Nenadić
Time Series Analysis with R - Part I Walter Zucchini, Oleg Nenadić Contents 1 Getting started 2 1.1 Downloading and Installing R.................... 2 1.2 Data Preparation and Import in R.................
Uncertainty quantification for the family-wise error rate in multivariate copula models
Uncertainty quantification for the family-wise error rate in multivariate copula models Thorsten Dickhaus (joint work with Taras Bodnar, Jakob Gierl and Jens Stange) University of Bremen Institute for
How To Check For Differences In The One Way Anova
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way
Statistics in Retail Finance. Chapter 6: Behavioural models
Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural
Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page
Errata for ASM Exam C/4 Study Manual (Sixteenth Edition) Sorted by Page 1 Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page Practice exam 1:9, 1:22, 1:29, 9:5, and 10:8
Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
Data Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,
Bootstrap Example and Sample Code
U.C. Berkeley Stat 135 : Concepts of Statistics Bootstrap Example and Sample Code 1 Bootstrap Example This section will demonstrate how the bootstrap can be used to generate confidence intervals. Suppose
Graphics in R. Biostatistics 615/815
Graphics in R Biostatistics 615/815 Last Lecture Introduction to R Programming Controlling Loops Defining your own functions Today Introduction to Graphics in R Examples of commonly used graphics functions
How To Test For Significance On A Data Set
Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
LOGNORMAL MODEL FOR STOCK PRICES
LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as
ITSM-R Reference Manual
ITSM-R Reference Manual George Weigt June 5, 2015 1 Contents 1 Introduction 3 1.1 Time series analysis in a nutshell............................... 3 1.2 White Noise Variance.....................................
Package SHELF. February 5, 2016
Type Package Package SHELF February 5, 2016 Title Tools to Support the Sheffield Elicitation Framework (SHELF) Version 1.1.0 Date 2016-01-29 Author Jeremy Oakley Maintainer Jeremy Oakley
Dongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
An Introduction to Point Pattern Analysis using CrimeStat
Introduction An Introduction to Point Pattern Analysis using CrimeStat Luc Anselin Spatial Analysis Laboratory Department of Agricultural and Consumer Economics University of Illinois, Urbana-Champaign
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
MATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
Climate Projections for Transportation Infrastructure Planning, Operations & Maintenance, and Design
Climate Projections for Transportation Infrastructure Planning, Operations & Maintenance, and Design KATHARINE HAYHOE, ANNE STONER, JO DANIEL, JENNIFER JACOBS and PAUL KIRSHEN THE INFRASTRUCTURE CLIMATE
Temporal variation in snow cover over sea ice in Antarctica using AMSR-E data product
Temporal variation in snow cover over sea ice in Antarctica using AMSR-E data product Michael J. Lewis Ph.D. Student, Department of Earth and Environmental Science University of Texas at San Antonio ABSTRACT
STAT 35A HW2 Solutions
STAT 35A HW2 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/09/spring/stat35.dir 1. A computer consulting firm presently has bids out on three projects. Let A i = { awarded project i },
How to assess the risk of a large portfolio? How to estimate a large covariance matrix?
Chapter 3 Sparse Portfolio Allocation This chapter touches some practical aspects of portfolio allocation and risk assessment from a large pool of financial assets (e.g. stocks) How to assess the risk
9th Russian Summer School in Information Retrieval Big Data Analytics with R
9th Russian Summer School in Information Retrieval Big Data Analytics with R Introduction to Time Series with R A. Karakitsiou A. Migdalas Industrial Logistics, ETS Institute Luleå University of Technology
Lecture 8: Gamma regression
Lecture 8: Gamma regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Models with constant coefficient of variation Gamma regression: estimation and testing
STATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:
Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:
Normality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. [email protected]
Time Series Analysis AMS 316
Time Series Analysis AMS 316 Programming language and software environment for data manipulation, calculation and graphical display. Originally created by Ross Ihaka and Robert Gentleman at University
Financial Time Series Analysis (FTSA) Lecture 1: Introduction
Financial Time Series Analysis (FTSA) Lecture 1: Introduction Brief History of Time Series Analysis Statistical analysis of time series data (Yule, 1927) v/s forecasting (even longer). Forecasting is often
OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS
OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS CLARKE, Stephen R. Swinburne University of Technology Australia One way of examining forecasting methods via assignments
Extreme Value Modeling for Detection and Attribution of Climate Extremes
Extreme Value Modeling for Detection and Attribution of Climate Extremes Jun Yan, Yujing Jiang Joint work with Zhuo Wang, Xuebin Zhang Department of Statistics, University of Connecticut February 2, 2016
Aggregate Loss Models
Aggregate Loss Models Chapter 9 Stat 477 - Loss Models Chapter 9 (Stat 477) Aggregate Loss Models Brian Hartman - BYU 1 / 22 Objectives Objectives Individual risk model Collective risk model Computing
PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION
PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION Chin-Diew Lai, Department of Statistics, Massey University, New Zealand John C W Rayner, School of Mathematics and Applied Statistics,
SIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS. J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID
SIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID Renewable Energy Laboratory Department of Mechanical and Industrial Engineering University of
Introduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
Chapter 7 Section 1 Homework Set A
Chapter 7 Section 1 Homework Set A 7.15 Finding the critical value t *. What critical value t * from Table D (use software, go to the web and type t distribution applet) should be used to calculate the
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
Recall this chart that showed how most of our course would be organized:
Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
An analysis of the dependence between crude oil price and ethanol price using bivariate extreme value copulas
The Empirical Econometrics and Quantitative Economics Letters ISSN 2286 7147 EEQEL all rights reserved Volume 3, Number 3 (September 2014), pp. 13-23. An analysis of the dependence between crude oil price
Approximation of Aggregate Losses Using Simulation
Journal of Mathematics and Statistics 6 (3): 233-239, 2010 ISSN 1549-3644 2010 Science Publications Approimation of Aggregate Losses Using Simulation Mohamed Amraja Mohamed, Ahmad Mahir Razali and Noriszura
1 Prior Probability and Posterior Probability
Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which
The Variability of P-Values. Summary
The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 [email protected] August 15, 2009 NC State Statistics Departement Tech Report
Psychology 205: Research Methods in Psychology
Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready
Inference on the parameters of the Weibull distribution using records
Statistics & Operations Research Transactions SORT 39 (1) January-June 2015, 3-18 ISSN: 1696-2281 eissn: 2013-8830 www.idescat.cat/sort/ Statistics & Operations Research c Institut d Estadstica de Catalunya
Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005
Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005 Philip J. Ramsey, Ph.D., Mia L. Stephens, MS, Marie Gaudard, Ph.D. North Haven Group, http://www.northhavengroup.com/
Tail-Dependence an Essential Factor for Correctly Measuring the Benefits of Diversification
Tail-Dependence an Essential Factor for Correctly Measuring the Benefits of Diversification Presented by Work done with Roland Bürgi and Roger Iles New Views on Extreme Events: Coupled Networks, Dragon
PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU
PITFALLS IN TIME SERIES ANALYSIS Cliff Hurvich Stern School, NYU The t -Test If x 1,..., x n are independent and identically distributed with mean 0, and n is not too small, then t = x 0 s n has a standard
Statistical Analysis of Life Insurance Policy Termination and Survivorship
Statistical Analysis of Life Insurance Policy Termination and Survivorship Emiliano A. Valdez, PhD, FSA Michigan State University joint work with J. Vadiveloo and U. Dias Session ES82 (Statistics in Actuarial
Climatography of the United States No. 20 1971-2000
Climate Division: CA 4 NWS Call Sign: Month (1) Min (2) Month(1) Extremes Lowest (2) Temperature ( F) Lowest Month(1) Degree s (1) Base Temp 65 Heating Cooling 1 Number of s (3) Jan 59.3 41.7 5.5 79 1962
R: A self-learn tutorial
R: A self-learn tutorial 1 Introduction R is a software language for carrying out complicated (and simple) statistical analyses. It includes routines for data summary and exploration, graphical presentation
Java Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
Climatography of the United States No. 20 1971-2000
Climate Division: CA 6 NWS Call Sign: SAN Month (1) Min (2) Month(1) Extremes Lowest (2) Temperature ( F) Lowest Month(1) Degree s (1) Base Temp 65 Heating Cooling 100 Number of s (3) Jan 65.8 49.7 57.8
Minitab Tutorials for Design and Analysis of Experiments. Table of Contents
Table of Contents Introduction to Minitab...2 Example 1 One-Way ANOVA...3 Determining Sample Size in One-way ANOVA...8 Example 2 Two-factor Factorial Design...9 Example 3: Randomized Complete Block Design...14
Least Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate
2+2 Just type and press enter and the answer comes up ans = 4
Demonstration Red text = commands entered in the command window Black text = Matlab responses Blue text = comments 2+2 Just type and press enter and the answer comes up 4 sin(4)^2.5728 The elementary functions
Climatography of the United States No. 20 1971-2000
Climate Division: CA 2 NWS Call Sign: SAC Month (1) Min (2) Month(1) Extremes Lowest (2) Temperature ( F) Lowest Month(1) Degree s (1) Base Temp 65 Heating Cooling 100 Number of s (3) Jan 53.8 38.8 46.3
Life Table Analysis using Weighted Survey Data
Life Table Analysis using Weighted Survey Data James G. Booth and Thomas A. Hirschl June 2005 Abstract Formulas for constructing valid pointwise confidence bands for survival distributions, estimated using
Booth School of Business, University of Chicago Business 41202, Spring Quarter 2015, Mr. Ruey S. Tsay. Solutions to Midterm
Booth School of Business, University of Chicago Business 41202, Spring Quarter 2015, Mr. Ruey S. Tsay Solutions to Midterm Problem A: (30 pts) Answer briefly the following questions. Each question has
Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics
STAT355 Chapter 6: Point Estimation Fall 2011 Chapter Fall 2011 6: Point1 Estimat / 18 Chap 6 - Point Estimation 1 6.1 Some general Concepts of Point Estimation Point Estimate Unbiasedness Principle of
Handling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
Permutation Tests for Comparing Two Populations
Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
Gamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
Probability Calculator
Chapter 95 Introduction Most statisticians have a set of probability tables that they refer to in doing their statistical wor. This procedure provides you with a set of electronic statistical tables that
Systat: Statistical Visualization Software
Systat: Statistical Visualization Software Hilary R. Hafner Jennifer L. DeWinter Steven G. Brown Theresa E. O Brien Sonoma Technology, Inc. Petaluma, CA Presented in Toledo, OH October 28, 2011 STI-910019-3946
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different
A Comparative Software Review for Extreme Value Analysis
A Comparative Software Review for Extreme Value Analysis Eric Gilleland Research Applications Laboratory, National Center for Atmospheric Research, U.S.A. Mathieu Ribatet Institute of Mathematics, University
MATH BOOK OF PROBLEMS SERIES. New from Pearson Custom Publishing!
MATH BOOK OF PROBLEMS SERIES New from Pearson Custom Publishing! The Math Book of Problems Series is a database of math problems for the following courses: Pre-algebra Algebra Pre-calculus Calculus Statistics
Arena 9.0 Basic Modules based on Arena Online Help
Arena 9.0 Basic Modules based on Arena Online Help Create This module is intended as the starting point for entities in a simulation model. Entities are created using a schedule or based on a time between
Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
