Local Quantile Regression




Wolfgang K. Härdle, Vladimir Spokoiny, Weining Wang
Ladislaus von Bortkiewicz Chair of Statistics
C.A.S.E. - Center for Applied Statistics and Economics
Humboldt-Universität zu Berlin
http://lvb.wiwi.hu-berlin.de
http://www.case.hu-berlin.de

Motivation 1-1

Figure 1: 221 observations from a light detection and ranging (LIDAR) experiment. X: range distance traveled before the light is reflected back to its source; Y: logarithm of the ratio of received light from two laser sources. 95% quantile curve.

Motivation 1-2 Quantile Estimation

Figure 2: Y: HK stock Cheung Kong returns; X: HK stock CLP Holdings returns; daily data, 01/01/2003 to 04/26/2010. 90% quantile.

Motivation 1-3 Electricity Demand Figure 3: Electricity Demand over Time

Motivation 1-4

Figure 4: Quantile plot for standardized weather residuals over 51 years at Berlin, 90% quantile.

Motivation 1-5 Weather Residuals

Figure 5: Estimated 90% quantile of variance functions, Berlin, averages over 1995-1998, 1999-2002, 2003-2006.

Motivation 1-6 Conditional Quantile Regression

Median regression = mean regression for symmetric error distributions.
Quantile regression describes the conditional behavior of a response Y over a broader spectrum than the conditional mean.

Example: financial markets and econometrics
- VaR (Value at Risk) tool
- Detecting conditional heteroskedasticity

Motivation 1-7 Example: Labor Market

To detect discrimination effects, one first needs to separate out the other effects:

log(Income) = A(year, age, education, etc.) + β B(gender, nationality, union status, etc.) + ε

Motivation 1-8

Parametric (e.g. polynomial) model:
- Interior point method: Koenker and Bassett (1978), Koenker and Park (1996), Portnoy and Koenker (1996)

Nonparametric model:
- Check function approach: Fan et al. (1994), Yu and Jones (1997, 1998)
- Double-kernel technique: Fan et al. (1996)
- Weighted Nadaraya-Watson estimation: Hall et al. (1999), Cai (2002)
- LCP method: Spokoiny (2009)
- LMS: Katkovnik and Spokoiny (2007)

Outline

1. Motivation
2. Basic Setup
3. Parametric Exponential Bounds
4. Calibration by "Propagation" Condition
5. "Small Modeling Bias" and Propagation Properties
6. "Stability" and "Oracle" Property
7. Simulation
8. Application
9. Further work

Basic Setup 2-1 Basic Setup

{(X_i, Y_i)}_{i=1}^n i.i.d. with joint pdf f(x, y) and conditional pdf f(y | x)
l(x) = F_{y|x}^{-1}(τ), the τ-quantile regression curve
l_n(x), the quantile smoother

Basic Setup 2-2 Asymmetric Laplace Distribution (ALD)

Figure 6: The cdf and pdf of an ALD with µ = 0, σ = 1.5, τ = 0.8. Purple lines mark the 5, 10, ..., 95 percentiles.

Basic Setup 2-3 Quantile & ALD Location Model

Y_i = θ* + ε_i,  ε_i ~ ALD_τ(0, 1)
l = F^{-1}(τ) = θ* = 0
f(u, θ*) = τ(1 - τ) exp{-ρ_τ(u - θ*)}
ℓ(u, θ*) = log{τ(1 - τ)} - ρ_τ(u - θ*)
ρ_τ(u) := τu 1{u ∈ (0, ∞)} - (1 - τ)u 1{u ∈ (-∞, 0)}
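The location model can be sanity-checked numerically. The sketch below (an illustration added here, not part of the slides) implements the ALD_τ(0, 1) density f(u) = τ(1 - τ) exp{-ρ_τ(u)} and verifies by quadrature that it integrates to one and puts mass τ below zero, so θ* = 0 is indeed the τ-quantile.

```python
import math

def rho(u, tau):
    """Check function: rho_tau(u) = tau*u for u > 0, (tau - 1)*u for u <= 0."""
    return tau * u if u > 0 else (tau - 1.0) * u

def ald_pdf(u, tau):
    """Density of the standardized asymmetric Laplace distribution ALD_tau(0, 1)."""
    return tau * (1.0 - tau) * math.exp(-rho(u, tau))

def trapezoid(f, a, b, n=200000):
    """Plain trapezoidal quadrature of f over [a, b]."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

tau = 0.8
# Both exponential tails are negligible beyond |u| = 60.
total = trapezoid(lambda u: ald_pdf(u, tau), -60.0, 60.0)            # ~ 1
mass_below_zero = trapezoid(lambda u: ald_pdf(u, tau), -60.0, 0.0)   # ~ tau
```

The second check says F(0) = τ, which is exactly the statement l = F^{-1}(τ) = θ* = 0 on the slide.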

Basic Setup 2-4 Check Function

Figure 7: Check function ρ_τ for τ = 0.9 and τ = 0.5, and the weight function in conditional mean regression.
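Minimizing the empirical check-function risk recovers sample quantiles. A minimal sketch (the sample `ys` is a hypothetical toy data set):

```python
def rho(u, tau):
    """Check function: rho_tau(u) = tau*u for u >= 0, (tau - 1)*u for u < 0."""
    return tau * u if u >= 0 else (tau - 1.0) * u

def check_loss(theta, ys, tau):
    """Empirical check-function risk: sum_i rho_tau(y_i - theta)."""
    return sum(rho(y - theta, tau) for y in ys)

ys = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0, 3.5, 7.0]  # toy sample

# The check loss is piecewise linear in theta, so it is minimized at a
# sample point; scanning the sample recovers the empirical quantiles.
best50 = min(ys, key=lambda t: check_loss(t, ys, 0.5))  # sample median
best80 = min(ys, key=lambda t: check_loss(t, ys, 0.8))  # sample 0.8-quantile
```

For τ = 0.5 the loss is half the absolute deviation, so the minimizer is the median; asymmetric τ tilts the V-shape and moves the minimizer to the corresponding quantile.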

Basic Setup 2-5 Quasi-likelihood Approach

θ* = argmin_θ E_{λ_0(.)} ρ_τ(Y - θ)
θ̃ = argmin_θ L(θ) := argmin_θ Σ_{i=1}^n {-log τ(1 - τ) + ρ_τ(Y_i - θ)}

Tail probabilities determine F:

λ_0^+(y) = (τ^{-1} y)^{-1} log{τ^{-1} P(Y_1 > y)},  y > 0
λ_0^-(y) = {(1 - τ)^{-1} y}^{-1} log{(1 - τ)^{-1} P(Y_1 < y)},  y < 0

Assume λ_0^+(y) → 0 as y → +∞ and λ_0^-(y) → 0 as y → -∞ (heavy tails).

Basic Setup 2-6 Conditional Quantile Regression

l(x) = argmin_θ E_{λ_0(.)}{ρ_τ(Y - θ) | X = x}
l_n(x) = argmin_θ n^{-1} Σ_{i=1}^n ρ_τ(Y_i - θ) K_{h(x)}(x - X_i)

K_{h(x)}(u) = h(x)^{-1} K{u/h(x)} is a kernel with bandwidth h(x).

L(θ, θ*) := L(Y, θ) - L(Y, θ*)

Basic Setup 2-7 Local Polynomial Quantile Regression

l(u) ≈ θ_0(u) + θ_1(u)(x - u) + θ_2(u)(x - u)^2 + ... + θ_p(u)(x - u)^p

θ̃(u) = {θ̃_0(u), θ̃_1(u), ..., θ̃_p(u)}
      = argmax_{θ ∈ Θ} Σ_{i=1}^n [log τ(1 - τ) - ρ_τ(Y_i - θ^T ψ_i)] w_i    (1)
      = argmin_{θ ∈ Θ} Σ_{i=1}^n ρ_τ(Y_i - θ^T ψ_i) w_i,

where ψ_i = {1, (X_i - u), (X_i - u)^2, ..., (X_i - u)^p}^T and w_i = K{(X_i - u)/h}, for i = 1, 2, ..., n.
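For p = 0 (the local constant fit), the minimization above reduces to a kernel-weighted sample quantile, which can be computed exactly by sorting. The sketch below works under that simplification; kernel choice and the test data are illustrative assumptions, and the general local polynomial case would instead need a linear-programming or interior-point solver.

```python
def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75*(1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def weighted_quantile(ys, ws, tau):
    """Minimizer of sum_i w_i * rho_tau(y_i - theta): the weighted tau-quantile.
    Returns the smallest y whose cumulative weight fraction reaches tau."""
    pairs = sorted((y, w) for y, w in zip(ys, ws) if w > 0)
    if not pairs:
        raise ValueError("no observation receives positive weight")
    total = sum(w for _, w in pairs)
    cum = 0.0
    for y, w in pairs:
        cum += w
        if cum >= tau * total:
            return y
    return pairs[-1][0]

def local_constant_quantile(x, xs, ys, tau, h):
    """Local constant (p = 0) quantile smoother l_n(x) with bandwidth h."""
    ws = [epanechnikov((xi - x) / h) for xi in xs]
    return weighted_quantile(ys, ws, tau)

# Toy check on the deterministic curve Y = X: the smoother should track x.
xs = [i / 10 for i in range(11)]
ys = list(xs)
est50 = local_constant_quantile(0.5, xs, ys, tau=0.5, h=0.35)
est90 = local_constant_quantile(0.5, xs, ys, tau=0.9, h=0.35)
```

The weighted-quantile step is exact because the weighted check loss is piecewise linear in θ with breakpoints at the observations.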

Parametric Exponential Bounds 3-1 First Step to Happiness

Parametric Exponential Bounds 3-2 Parametric Exponential Bounds

Theorem 1
Let V_0 = V(θ*). Let R satisfy λ* ≥ qR/µ for µ = a/(2ν_0 q^3). For any r < R, if R ≤ µλ*/(ν_0 q), then for any z > 0 with 2z/a ≤ R^2 it holds

P_{λ_0(.)}{L(θ̃, θ*) > z} ≤ exp{-za/(ν_0 q^3) + Q_0(p, q)} + exp{-g(r)},    (2)

where g(r) is defined as

g(r) := inf_{θ ∈ Θ} g(r, θ, θ*) = inf_{θ ∈ Θ} sup_{µ ∈ M} {µr + C(µ, θ, θ*)}

Parametric Exponential Bounds 3-3 Parametric Exponential Bounds

Theorem 2
Under technical assumptions,

E_{λ_0(.)} |L(θ̃, θ*)|^r ≤ R_r,

where R_r > 0 is a constant.

Calibration by "Propagation" Condition 4-1 Adaptation Scale

Fix x, and define a sequence of ordered weights

W^(k) = (w_1^(k), w_2^(k), ..., w_n^(k)),  w_i^(k) = K_{h_k}(x - X_i),  h_1 < h_2 < ... < h_K

LPA (local parametric approximation): l(X_i) ≈ l_θ(X_i)

θ̃_k = argmax_θ Σ_{i=1}^n ℓ{Y_i, l_θ(X_i)} w_i^(k) := argmax_θ L(W^(k), θ)

Calibration by "Propagation" Condition 4-2 LMS Procedure

Construct an estimate θ̂ = θ̂(x) based on θ̃_1, θ̃_2, ..., θ̃_K:

- Start with θ̂_1 = θ̃_1.
- For k ≥ 2, θ̃_k is accepted and θ̂_k = θ̃_k if θ̃_{k-1} was accepted and
  L(W^(l), θ̃_l, θ̃_k) ≤ z_l,  l = 1, ..., k - 1.
- θ̂_k is the latest accepted estimate after the first k steps.

L(W^(k), θ, θ') := L(W^(k), θ) - L(W^(k), θ')
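The acceptance loop can be sketched as follows. This is a schematic Python illustration, not the authors' implementation: the kernel, bandwidth grid, and critical values `zs` are placeholders, and each θ̃_k is computed as a local constant weighted quantile.

```python
def rho(u, tau):
    return tau * u if u >= 0 else (tau - 1.0) * u

def kernel(u):
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def weighted_quantile(ys, ws, tau):
    pairs = sorted((y, w) for y, w in zip(ys, ws) if w > 0)
    if not pairs:
        raise ValueError("no observation receives positive weight")
    total, cum = sum(w for _, w in pairs), 0.0
    for y, w in pairs:
        cum += w
        if cum >= tau * total:
            return y
    return pairs[-1][0]

def fitted_loss(theta, ys, ws, tau):
    """Weighted quasi-log-likelihood L(W, theta), dropping the additive
    constant log tau(1-tau) * sum(w), which cancels in the differences."""
    return -sum(w * rho(y - theta, tau) for y, w in zip(ys, ws))

def lms(x, xs, ys, tau, bandwidths, zs):
    """Local model selection: keep accepting theta_k (growing bandwidth)
    while every likelihood-ratio test against earlier scales passes."""
    weights = [[kernel((xi - x) / h) for xi in xs] for h in bandwidths]
    thetas = [weighted_quantile(ys, w, tau) for w in weights]
    accepted = thetas[0]
    for k in range(1, len(bandwidths)):
        ok = all(
            fitted_loss(thetas[l], ys, weights[l], tau)
            - fitted_loss(thetas[k], ys, weights[l], tau) <= zs[l]
            for l in range(k)
        )
        if not ok:
            break  # first rejection: stop and keep the last accepted estimate
        accepted = thetas[k]
    return accepted

# On locally homogeneous data every test passes and the largest bandwidth wins.
xs = [i / 20 for i in range(21)]
ys = [2.0] * 21
theta_hat = lms(0.5, xs, ys, tau=0.75, bandwidths=[0.1, 0.2, 0.4], zs=[5.0, 5.0, 5.0])
```

On homogeneous data the procedure propagates to the largest scale; near a structural change the likelihood-ratio statistic exceeds some z_l and the loop stops at the last bandwidth that was still consistent with the smaller scales.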

Calibration by "Propagation" Condition 4-3 LMS Procedure: Illustration

[Schematic: the estimates θ̃_1, θ̃_2, θ̃_3, ... are tested step by step along the bandwidth scale 1, 2, 3, ..., k*, k* + 1; the procedure stops at the first rejection.]

Calibration by "Propagation" Condition 4-4 "Propagation" Condition

The critical values are determined via a false alarm consideration:

E_{θ*} |L(W^(k), θ̃_k, θ̂_k)|^r ≤ α R_r(W^(k)),

where R_r(W^(k)) is a risk bound depending on W^(k).

Calibration by "Propagation" Condition 4-5 Critical Values

Consider first z_1, letting z_2 = ... = z_{K-1} = ∞. This leads to the estimates θ̂_k(z_1) for k = 2, ..., K. The value z_1 is selected as the minimal one for which

sup_θ E_{θ,λ(.)} |L{W^(k), θ̃_k, θ̂_k(z_1)}|^r ≤ α R_r(W^(k)) / (K - 1),  k = 2, ..., K.

Setting z_{k+1} = ... = z_{K-1} = ∞ with z_1, ..., z_k fixed gives the estimates θ̂_m(z_1, ..., z_k) for m = k + 1, ..., K. Select z_k s.t.

sup_θ E_{θ,λ(.)} |L{W^(m), θ̃_m, θ̂_m(z_1, z_2, ..., z_k)}|^r ≤ kα R_r(W^(m)) / (K - 1),  m = k + 1, ..., K.

Calibration by "Propagation" Condition 4-6

Figure 8: Critical values with λ(.) corresponding to the standard normal distribution, t(3) and Pareto(0.5, 1); ρ = 0.2, r = 0.5, τ = 0.75.

Calibration by "Propagation" Condition 4-7 Bounds for Critical Values

Theorem 3
Under the assumptions of the probability bounds, there are constants a_0, a_1, a_2 s.t. the propagation condition is fulfilled with the choice

z_k = a_0 r log(ρ^{-1}) + a_1 r log(h_K / h_k) + a_2 log(n h_k).
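The z_k formula of Theorem 3 is straightforward to tabulate once constants are fixed; the theorem only asserts that suitable a_0, a_1, a_2 exist, so the values below are purely illustrative placeholders.

```python
import math

def critical_values(n, bandwidths, p=0.2, r=0.5, a0=1.0, a1=1.0, a2=1.0):
    """z_k = a0*r*log(1/p) + a1*r*log(h_K/h_k) + a2*log(n*h_k), as in Theorem 3.
    The constants a0, a1, a2 are unspecified in the slides; the defaults here
    are placeholders for illustration only (p plays the role of rho)."""
    hK = bandwidths[-1]  # largest bandwidth on the (ordered) grid
    return [a0 * r * math.log(1.0 / p)
            + a1 * r * math.log(hK / h)
            + a2 * math.log(n * h)
            for h in bandwidths]

zs = critical_values(n=200, bandwidths=[0.05, 0.1, 0.2, 0.4])
```

The middle term penalizes small bandwidths relative to h_K, while the last grows with the effective local sample size n h_k.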

"Small Modeling Bias" and Propagation Properties 5-1 Small Modeling Bias

∆_{λ_0(.)}(W^(k), θ) = Σ_{i=1}^n K{P_{l(X_i), λ_0(.)}, P_{l_θ(X_i), λ_0(.)}} 1{w_i^(k) > 0}

"Small Modeling Bias" condition: ∆_{λ_0(.)}(W^(k), θ) ≤ ∆ for k ≤ k*. Then

E_{λ_0(.)} log{1 + |L(W^(k), θ̃_k, θ)|^r / R_r(W^(k))} ≤ ∆ + 1

"Small Modeling Bias" and Propagation Properties 5-2 Stability

The quality of estimation attained during "propagation" cannot be lost at further steps.

Theorem 4
L(W^(k*), θ̃_{k*}, θ̂_{k̂}) 1{k̂ > k*} ≤ z_{k*}

"Small Modeling Bias" and Propagation Properties 5-3 Oracle

Combining the "propagation" and "stability" results yields

Theorem 5
Let ∆_{λ_0(.)}(W^(k*), θ) ≤ ∆ for some θ ∈ Θ and k* ≤ K. Then

E_{λ_0(.)} log{1 + |L(W^(k*), θ̃_{k*}, θ)|^r / R_r(W^(k*))} ≤ ∆ + 1
E_{λ_0(.)} log{1 + |L(W^(k*), θ̃_{k*}, θ̂)|^r / R_r(W^(k*))} ≤ ∆ + α + log{1 + z_{k*}^r / R_r(W^(k*))}

Monte Carlo Simulation 6-1

Figure 9: The bandwidth sequence (upper panel); the data with exp(2) noise, the adaptive estimate of the 0.75 quantile, and the quantile smoother with fixed optimal bandwidth 0.08 (yellow).

Monte Carlo Simulation 6-2

Figure 10: The block residuals of the fixed-bandwidth estimation (upper panel) and the adaptive estimation (lower panel).

Monte Carlo Simulation 6-3

Figure 11: The bandwidth sequence (upper panel); the data with standard normal noise, the adaptive estimate of the 0.75 quantile, and the quantile smoother with fixed optimal bandwidth 0.089 (yellow).

Monte Carlo Simulation 6-4

Figure 12: The bandwidth sequence (upper panel); the data with t(3) noise, the adaptive estimate of the 0.75 quantile, and the quantile smoother with fixed optimal bandwidth 0.06 (yellow).

Monte Carlo Simulation 6-5 Local Linear

Figure 13: The bandwidth sequence (upper left panel); the data with t(3) noise, the adaptive estimate of the 0.80 quantile, and the quantile smoother with fixed optimal bandwidth 0.06 (yellow); the blocked errors, fixed vs. adaptive (right panel).

Monte Carlo Simulation 6-6

Figure 14: The bandwidth sequence (upper left panel); the true trend curve (lower left panel); the adaptive estimate of the 0.80 quantile and the quantile smoother with fixed optimal bandwidth 0.078 (yellow); the blocked errors, fixed vs. adaptive (right panel).

Application 7-1 Tail Dependency

Figure 15: Quantile curve for two normal random variables with ρ = 0.3204.

Application 7-2

Figure 16: Bandwidth sequence (upper left), quantile smoother (upper right), scatter plot on scaled X (lower left), original scale (lower right).

Application 7-3

Figure 17: 221 observations from a light detection and ranging (LIDAR) experiment. X: range distance travelled before the light is reflected back to its source; Y: logarithm of the ratio of received light from two laser sources.

Application 7-4

Figure 18: Estimated 90% quantile of variance functions, Kaohsiung, averages over 1995-1998, 1999-2002, 2003-2006.


Assumptions and Definitions

C(µ, θ, θ*) = q^{-1} M(qµ, θ, θ*) - (p + 1) log_+{ɛ^{-1} µ ||V(θ - θ*)||} - Q(p, q),

and

Q(p, q) = ν_0 q ɛ^2 / 2 + log(2 + 2p) - p c(q) log(1 - a^{(p+1)})

Appendix 8-7 Assumptions and Definitions

ζ(θ) := L(θ) - E L(θ)    (3)
ζ(θ, θ*) := ζ(θ) - ζ(θ*)
N(µ, θ, θ*) := log E exp{µ ζ(θ, θ*)}
M(µ, θ, θ*) := log E exp{µ L(θ, θ*)}

Appendix 8-8 Assumptions and Definitions

ζ(θ) satisfies the condition (ED): there exist λ* > 0 and a symmetric positive matrix V^2 s.t. for 0 < λ ≤ λ*,

log E exp{λ γ^T ∇ζ(θ)} ≤ ν_0 λ^2 ||Vγ||^2 / 2,    (4)

for some fixed ν_0 ≥ 1 and all γ ∈ R^{p+1}. Set V^2(θ) = Σ_{i=1}^n ψ_i ψ_i^T p_i(ψ_i^T θ) w_i.

(L) For R > 0, there exists a constant a = a(R) > 0 such that on the set Θ_0(R) = {θ : ||V(θ - θ*)|| ≤ R},

-E L(θ, θ*) ≥ a ||V(θ - θ*)||^2 / 2

Appendix 8-9 References

Z. W. Cai (2002). Regression quantiles for time series. Econometric Theory, 18:169-192.
J. Fan, T. C. Hu and Y. K. Truong (1994). Robust nonparametric function estimation. Scandinavian Journal of Statistics, 21:433-446.
Y. Golubev and V. Spokoiny. Exponential bounds for minimum contrast estimators. Electronic Journal of Statistics, ISSN 1935-7524.

Appendix 8-10 References

W. Härdle, P. Janssen and R. Serfling (1988). Strong uniform consistency rates for estimators of conditional functionals. Ann. Statist., 16:1428-1449.
X. He and H. Liang (2000). Quantile regression estimates for a class of linear and partially linear errors-in-variables models. Ann. Statist., 10:129-140.
K. Jeong and W. Härdle (2008). A Consistent Nonparametric Test for Causality in Quantile. SFB 649 Discussion Paper.

Appendix 8-11 References

V. Katkovnik and V. Spokoiny. Spatially adaptive estimation via fitted local likelihood (FLL) techniques.
R. Koenker and G. W. Bassett (1978). Regression quantiles. Econometrica, 46:33-50.

Appendix 8-12 References

R. Koenker and B. J. Park (1996). An interior point algorithm for nonlinear quantile regression. Journal of Econometrics, 71:265-283.
M. G. Lejeune and P. Sarda (1988). Quantile regression: A nonparametric approach. Computational Statistics & Data Analysis, 6:229-239.
K. Murphy and F. Welch (1990). Empirical age-earnings profiles. Journal of Labor Economics, 8(2):202-229.

Appendix 8-13 References

S. Portnoy and R. Koenker (1997). The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators (with discussion). Statistical Science, 12:279-300.
V. Spokoiny (2008). Local Parametric Estimation. Heidelberg: Springer Verlag.
K. Yu and M. C. Jones (1997). A comparison of local constant and local linear regression quantile estimation. Computational Statistics and Data Analysis, 25:159-166.

Appendix 8-14 References

K. Yu and M. C. Jones (1998). Local linear quantile regression. J. Amer. Statist. Assoc., 93:228-237.
K. Yu, Z. Lu and J. Stander (2003). Quantile regression: applications and current research areas. J. Roy. Statistical Society: Series D (The Statistician), 52:331-350.