ON LOCAL LIKELIHOOD DENSITY ESTIMATION WHEN THE BANDWIDTH IS LARGE


Byeong U. Park¹ and Young Kyung Lee²
Department of Statistics, Seoul National University, Seoul, Korea

Tae Yoon Kim³ and Cheolyong Park³
Department of Statistics, Keimyung University, Taegu, Korea

Shinto Eguchi³
Institute of Statistical Mathematics, Tokyo, Japan

August 31, 2004: Final version for JSPI

Abstract

In this paper, we provide a large bandwidth analysis for a class of local likelihood methods. This work complements the small bandwidth analysis of Park, Kim and Jones (2002). Our treatment is more general than the large bandwidth analysis of Eguchi and Copas (1998). We provide a higher order asymptotic analysis for the risk of the local likelihood density estimator, from which a direct comparison between various versions of local likelihood can be made. The present work, combined with the small bandwidth results of Park et al. (2002), gives an optimal size of the bandwidth which depends on the degree of departure of the underlying density from the proposed parametric model.

AMS 2000 subject classifications. Primary 62G07; secondary 62G20.
Key words and phrases. Density estimation, local likelihood, kernel function, bandwidth.

¹ Research of Byeong U. Park was supported by KOSEF through Statistical Research Center for Complex Systems at Seoul National University.
² Research of Young Kyung Lee was supported by the Brain Korea 21 project.
³ Research of Shinto Eguchi, Tae Yoon Kim and Cheolyong Park was supported by KOSEF and JSPS through Korea-Japan Joint Research Program.
1. Introduction

Local likelihood methods for density estimation are very promising, not least on account of their extraordinary flexibility and adaptivity. They afford efficient estimation if the proposed parametric family includes a good model for the data, as well as the usual good behaviour of nonparametric density estimation if it does not. Each formulation of local likelihood estimation is based on a locally weighted log-likelihood where the weights are determined by a kernel function $K$ and a bandwidth $h$. When $h$ is large, the resulting estimator is close to the fully parametric maximum likelihood estimator. On the other hand, when $h$ is small, the performance of the resulting estimator does not depend much on the proposed parametric model. For consistent estimation, the methods require a correction term added to the naive locally weighted log-likelihood. Precise details of the correction term allow scope for variation. See Copas (1995), Loader (1996), Hjort and Jones (1996), and Kim, Park and Kim (2001) for statistical analysis of different formulations of local likelihood which employ a specific choice of the correction term.

Recently, Eguchi and Copas (1998) made an important contribution to local likelihood density estimation. They provided a unified formulation of various local likelihood approaches by introducing an arbitrary function, denoted by $\xi$, for the additional correction term. This unified formulation includes as special cases the C-version due to Copas (1995), the U-version due to Loader (1996) and Hjort and Jones (1996), and the T-version due to Eguchi and Copas (1998). The definition of the general local likelihood function and those of the U-, C- and T-versions are given in the next section. Later, Park et al. (2002) presented small-$h$ asymptotics for the class of local likelihood estimators. It is based on a general condition on $\xi$ tailored for the small-$h$ analysis. Eguchi and Copas (1998) found some interesting large-$h$ properties of the class. However, their results are based on a rather stringent condition on $\xi$.
The condition excludes some important local likelihood methods such as that of Copas (1995). See, for more details, the paragraph including (2.4) in the current paper, or the discussion in Eguchi and Copas (1998) immediately below the proof of their Theorem 1. Furthermore, their results are restricted to the case where the underlying density is undetectably close to the parametric model.

In this paper, we give a more relevant large-$h$ analysis with a quite general condition on the function $\xi$. The condition is different from the one used in Park et al. (2002) for the small-$h$ analysis. The U-, C- and T-versions of local likelihood are seen to satisfy our new condition. We provide a useful approximation of the risk when the bandwidth $h$ tends to infinity as the sample size $n$ grows. We show that the mean integrated squared error takes the form

\[ C_{B,ML} + C_{V,ML}\,n^{-1} - C_{B,SM}\,h^{-2} - C_{V1,SM}\,n^{-1}h^{-2} + C_{V2,SM}\,n^{-1}h^{-4}. \tag{1.1} \]

The first term $C_{B,ML}$ represents the model misspecification error of the proposed parametric family. This and $C_{V1,SM}$ are zero if the true density actually belongs to the parametric model. In fact, $C_{B,ML}$ and $n^{-1}C_{V,ML}$ are the integrated squared bias and the integrated variance, respectively, of the parametric maximum likelihood estimator. The next three terms originate from local smoothing. The constants $C_{B,SM}$ and $C_{V1,SM}$ depend on the distance between the true density and the parametric model, while the last one, $C_{V2,SM}$, does not. Depending on how close the true density is to the parametric model, the last term $n^{-1}h^{-4}C_{V2,SM}$ may dominate $n^{-1}h^{-2}C_{V1,SM}$. We give explicit expressions for these constants. Our formula is more useful than the one given by Eguchi and Copas (1998) in the sense that its dependence on the choice of $\xi$ is more transparent, so that a direct comparison between various local likelihood methods can be made.

Let $\{f(\cdot,\theta): \theta\in\Theta\}$ be a parametric model proposed for the data, where $\theta$ is a $p$-dimensional parameter vector. Let $\theta^*$ denote the solution of the population version of the parametric likelihood equation. In fact, $f(\cdot,\theta^*)$ minimizes the Kullback-Leibler divergence from the true density among all members of the parametric model. It is the best parametric approximant to the true density in that sense. A precise definition of $\theta^*$ is given at (2.8) in Section 2. We evaluate the risk in a shrinking neighborhood of the parametric model. In particular, we assume that the true density $g \equiv g_n$ satisfies

\[ \int \{g(x) - f(x,\theta^*)\}^2\,dx \le c\,n^{-(1+\alpha)} \tag{1.2} \]

for some $\alpha > -1$ and $c > 0$. Note that $\alpha = \infty$ corresponds to the case where the true density belongs to the parametric model. We derive an approximation of the risk of the form given at (1.1) under this condition.
It is shown that the constants $C_{B,ML}$ and $C_{B,SM}$ in (1.1) are both $O(n^{-1-\alpha})$. The constant $C_{V1,SM}$ is seen to be $O(n^{-(1+\alpha)/2})$. Our large-$h$ asymptotics is valid for all $\alpha > -1$. Note that, when $\alpha = -1$, a risk analysis in the usual nonparametric context of the bandwidth tending to zero as $n$ grows is more appropriate. Eguchi and Copas (1998) gave some interesting large-$h$ properties of their class of local likelihood density estimators, but only in the case where $\alpha > 0$.
Thus, the present work is also an extension of Eguchi and Copas (1998) in this regard. Since features of the underlying density (such as location and scale) may be estimated at best to an accuracy of $O(n^{-1/2})$, treatment of the case $-1 < \alpha \le 0$, which is not dealt with in Eguchi and Copas (1998), is useful to describe the large-$h$ properties when the underlying density lies in a region where one can distinguish it from the parametric model. Our large-$h$ asymptotics and the small-$h$ results of Park et al. (2002), the latter being valid for all $\alpha \ge -1$, may be combined to give an optimal size of the bandwidth going to zero or infinity. It is argued that the value of $\alpha$ at which a transition from a large to a small bandwidth is desirable does exist in the range $-1 < \alpha \le 0$. We observe in an example that the optimal bandwidth indeed ranges from a very small to a very large value depending on how close the true density is to the parametric model.

Our treatment of large bandwidth asymptotics is more rigorous than that of Eguchi and Copas (1998). We obtain higher order expansions for the bias and the variance of the local likelihood estimator. We show that by choosing the bandwidth in an optimal way the local likelihood estimator may have smaller risk than the parametric maximum likelihood estimator, except in the case where $\alpha = \infty$. The risk considered in this paper is the mean integrated weighted squared error. An expansion of the relative entropy risk (expected Kullback-Leibler divergence) considered in Eguchi and Copas (1998) is readily obtained from our results by choosing proper weight functions.

This paper is organized as follows. Section 2 introduces the unified formulation of local likelihood density estimation and provides the new condition on $\xi$ for the large-$h$ analysis. It also contains some preliminary results for the risk analysis. In Section 3, we give a detailed account of large-$h$ asymptotics for all $\alpha > -1$. In Section 4, the risks of the U-, C- and T-versions of local likelihood estimation are compared with an example. Also discussed is the issue of optimal bandwidth size.
Some technical proofs are given in the appendix.

2. Preliminaries

Let $X_1,\dots,X_n$ be independent $d$-variate random vectors from a common density $g(\cdot)$ supported on $\mathcal{X}$. Let $f(\cdot,\theta)$, with $\theta$ being a $p$-dimensional parameter vector, be a parametric model proposed for the data. Define $u(t,\theta) = (\partial/\partial\theta)\log f(t,\theta)$. Let $K$ be a nonnegative symmetric kernel function on $\mathbb{R}^d$. For simplicity of presentation we take a scalar bandwidth $h$. For an arbitrary function $\xi(\cdot,\cdot)$, the general form of the local likelihood estimating equation considered by Eguchi and Copas (1998) and Park et al. (2002) is given by $\hat\Psi_h(x,\theta) = 0$, where

\[ \hat\Psi_h(x,\theta) = \frac1n\sum_{i=1}^n K\Big(\frac{x-X_i}{h}\Big)u(X_i,\theta) - \frac1n\sum_{i=1}^n \xi\Big\{K\Big(\frac{x-X_i}{h}\Big),\, E_\theta K\Big(\frac{x-X_1}{h}\Big)\Big\}\, E_\theta\Big\{K\Big(\frac{x-X_1}{h}\Big)u(X_1,\theta)\Big\}. \]

Here and below $E_\theta$ means expectation with respect to the parametric density $f(\cdot,\theta)$, while $E$ with no subscript means expectation with respect to the true density $g$. If we take $h = \infty$, this reduces to the parametric maximum likelihood estimating equation:

\[ \sum_{i=1}^n u(X_i,\theta) = 0. \]

Let $\hat\theta_h(x)$ denote the solution of the estimating equation $\hat\Psi_h(x,\theta) = 0$. A local likelihood density estimator may be defined by

\[ \hat f_h(x) = f(x,\hat\theta_h(x)). \tag{2.1} \]

Note that this estimator may not integrate to one since $\hat\theta_h(x)$ depends on $x$. A bona fide density estimator is given by

\[ \hat g_h(x) = \hat f_h(x)\Big/\int \hat f_h(x)\,dx. \tag{2.2} \]

The function $\xi$ is assumed to satisfy

\[ E_\theta\,\xi\Big\{K\Big(\frac{x-X_1}{h}\Big),\, E_\theta K\Big(\frac{x-X_1}{h}\Big)\Big\} = 1. \tag{2.3} \]

This condition is required for the estimating equation to be unbiased when $g(\cdot) = f(\cdot,\theta)$ for some $\theta$. In fact, it guarantees the first-order Bartlett identity: $E_\theta\hat\Psi_h(x,\theta) = 0$ for all $\theta$. Examples of $\xi$ that satisfy the condition (2.3) include the U-version $\xi(u,v) \equiv 1$ of Hjort and Jones (1996), the C-version $\xi(u,v) = (1-u)/(1-v)$ of Copas (1995), and the T-version $\xi(u,v) = u/v$ considered in Eguchi and Copas (1998).

Eguchi and Copas (1998) assumed $\alpha > 0$ and discussed the properties of $f(x,\hat\theta_h(x))$ when $h$ tends to infinity as $n$ grows. Their results are based on the following additional condition on $\xi$:

\[ E\,\xi\Big\{K\Big(\frac{x-X_1}{h}\Big),\, E_{\theta^*} K\Big(\frac{x-X_1}{h}\Big)\Big\} = 1 + O\Big(\frac{1}{h^2}\Big) \tag{2.4} \]

as $h$ tends to infinity. But this condition is too restrictive. The C-version $\xi(u,v) = (1-u)/(1-v)$, for example, does not satisfy this condition. To see this, assume as in Eguchi and Copas (1998) that $K(t) = 1 - \kappa_2\|t\|^2 + O(\|t\|^4)$ as $t\to0$. Then,

\[ E\,\xi\Big\{K\Big(\frac{x-X_1}{h}\Big),\, E_{\theta^*} K\Big(\frac{x-X_1}{h}\Big)\Big\} \approx 1 + \int \frac{\|x-y\|^2\,\{g(y) - f(y,\theta^*)\}}{E_{\theta^*}\|x-X_1\|^2}\,dy. \]

The second term in the above approximation is not $O(h^{-2})$, but equals $O\{n^{-(1+\alpha)/2}\}$ when the true density $g\equiv g_n$ satisfies the condition (1.2).

We consider here a more relevant condition on $\xi$ which replaces (2.4). We note that, under the condition $K(t) = 1 - \kappa_2\|t\|^2 + O(\|t\|^4)$ as $t\to0$, the large-$h$ properties of the local likelihood estimator depend on $\xi$ through the behaviour of $\xi(1-y, 1-z)$ when both $y$ and $z$ approach zero from above. This can be seen from the fact that both arguments of $\xi$ in the definition of $\hat\Psi_h$ converge to one as $h$ tends to infinity. The condition on $\xi$ should be different from the one in the small-$h$ setting, where performance of the estimator relies on the properties of $\xi(y,z)$ near $y = z = 0$ since both arguments of $\xi$ tend to zero as $h$ converges to zero. See Park et al. (2002) for a suitable condition on $\xi$ in the small-$h$ setting. The condition on $\xi$ for our large-$h$ analysis is given in (A1) below. The condition on the kernel $K$ is stated in (A2), where we specify the coefficient of the $O(\|t\|^4)$ term for a more detailed analysis.

Assumptions.

(A1) In addition to the consistency condition (2.3), $\xi$ satisfies

\[ \lim_{z\downarrow0}\ \sup_{0\le y\le c}\ \frac1z\,\big|\xi(1-yz,\,1-z) - \xi_0(y) - \xi_1(y)\,z\big| = 0 \tag{2.5} \]

for some functions $\xi_0$, $\xi_1$ and a constant $c > 0$. The function $\xi_0$ is continuously differentiable and $\xi_1$ is continuous. Also, $\xi(y,z)$ is twice continuously differentiable with respect to $z$ on $(0,1)$ for each $y\in[0,1]$.

(A2) The kernel function $K(t)$ is continuous at $t = 0$ and satisfies $K(t) = 1 - \kappa_2\|t\|^2 + \kappa_4\|t\|^4 + o(\|t\|^4)$ as $t\to0$.

The three versions of local likelihood mentioned above satisfy the condition (A1): for the U-version, $\xi_0(y)\equiv1$ and $\xi_1(y)\equiv0$; for the C-version, $\xi_0(y) = y$ and $\xi_1(y)\equiv0$; and for the T-version, $\xi_0(y)\equiv1$ and $\xi_1(y) = 1-y$. Under the condition (A1), we may differentiate the left hand side of (2.3) with respect to $\theta$. This yields

\[ E_\theta\,\xi^{(1)}\Big[K\Big(\frac{x-X_1}{h}\Big),\, E_\theta K\Big(\frac{x-X_1}{h}\Big)\Big]\; E_\theta\Big\{K\Big(\frac{x-X_1}{h}\Big)u(X_1,\theta)\Big\} = -E_\theta\Big\{\xi\Big[K\Big(\frac{x-X_1}{h}\Big),\, E_\theta K\Big(\frac{x-X_1}{h}\Big)\Big]u(X_1,\theta)\Big\} \tag{2.6} \]

for all $\theta$, where $\xi^{(1)}(y,z) = (\partial/\partial z)\xi(y,z)$.

Below, we give some preliminary results for the discussion in Section 3. Let $\theta_h(x)$ denote the solution of the equation

\[ E\,\hat\Psi_h(x,\theta) = 0. \tag{2.7} \]

Also, define a population version of $\hat f_h(x)$ by $f_h(x) = f(x,\theta_h(x))$. These two quantities are those to which $\hat\theta_h(x)$ and $\hat f_h(x)$, respectively, get closer as the sample size $n$ grows. Next, define $\theta^*$ to be the solution of

\[ E\,u(X,\theta) = 0. \tag{2.8} \]

Later, it will be seen that $\theta^*$ is the limit of $\theta_h(x)$ as $h$ tends to infinity. If the true density $g$ belongs to the parametric model, i.e., if it equals $f(\cdot,\theta)$ for some $\theta$, then $\theta^*$ equals that value of the parameter and $g = f(\cdot,\theta^*)$. Throughout the paper, we assume that the true density lies in an $n^{-(1+\alpha)/2}$ neighborhood of the parametric model in the following sense.

(A3) The true density $g\equiv g_n$ satisfies, for some $\alpha > -1$ and $c > 0$,

\[ \int\{g(x) - f(x,\theta^*)\}^2\,dx \le c\,n^{-(1+\alpha)}. \]

A relevant expansion of $\xi[K(h^{-1}(x-X_1)),\, E_{\theta^*}K(h^{-1}(x-X_1))]$ plays an important role in the asymptotic analysis in Section 3. Define

\[ Y(x) = \frac{\|x-X_1\|^2}{E_{\theta^*}\|x-X_1\|^2}. \]

This is the limit of $Y_h(x) = \{1 - K(h^{-1}(x-X_1))\}/\{1 - E_{\theta^*}K(h^{-1}(x-X_1))\}$ as $h$ tends to infinity. Also, let

\[ W(x) = \frac{\kappa_4}{\kappa_2}\,\xi_0'(Y(x))\,Y(x)\Big[E_{\theta^*}\{Y(x)\|x-X_1\|^2\} - \|x-X_1\|^2\Big] + \kappa_2\,\xi_1(Y(x))\,E_{\theta^*}\{\|x-X_1\|^2\}. \tag{2.9} \]
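The convergence of $Y_h(x)$ to $Y(x)$, and the $h^{-2}$ correction $(\kappa_4/\kappa_2)\,Y(x)[E_\theta\{Y(x)\|x-X_1\|^2\} - \|x-X_1\|^2]$ that appears in the first bracket of $W(x)$, can be checked numerically. The sketch below is only an illustration under assumptions that are not taken from the paper: a univariate $N(\theta,1)$ model and the kernel $K(t) = \exp(-t^2/2)$, for which $\kappa_2 = 1/2$ and $\kappa_4 = 1/8$. All function names are hypothetical.

```python
import math

# Assumed kernel K(t) = exp(-t^2/2) = 1 - t^2/2 + t^4/8 - ..., so kappa2 = 1/2, kappa4 = 1/8.
KAPPA2, KAPPA4 = 0.5, 0.125


def one_minus_kernel(t):
    """1 - K(t), computed with expm1 to stay accurate for tiny t."""
    return -math.expm1(-0.5 * t * t)


def one_minus_mean_kernel(m, h):
    """1 - E_theta K((x - X)/h) for X ~ N(theta, 1), with m = x - theta.

    Closed form: E_theta K = h / sqrt(1 + h^2) * exp(-m^2 / (2 (1 + h^2))).
    """
    log_ek = -0.5 * math.log1p(h ** -2) - m * m / (2.0 * (1.0 + h * h))
    return -math.expm1(log_ek)


def y_h(x, x1, theta, h):
    """Finite-h ratio Y_h(x) = {1 - K((x - x1)/h)} / {1 - E_theta K}."""
    return one_minus_kernel((x - x1) / h) / one_minus_mean_kernel(x - theta, h)


def y_limit(x, x1, theta):
    """Limit Y(x) = (x - x1)^2 / E_theta (x - X)^2, where E_theta (x - X)^2 = m^2 + 1."""
    return (x - x1) ** 2 / ((x - theta) ** 2 + 1.0)


def second_order_term(x, x1, theta):
    """(kappa4/kappa2) * Y * [E_theta{Y r^2} - r^2], with r = x - X_1.

    For r ~ N(m, 1): E r^2 = m^2 + 1, E r^4 = m^4 + 6 m^2 + 3,
    and E_theta{Y r^2} = E r^4 / E r^2.
    """
    m = x - theta
    r2 = (x - x1) ** 2
    e_r2 = m * m + 1.0
    e_r4 = m ** 4 + 6.0 * m * m + 3.0
    return (KAPPA4 / KAPPA2) * y_limit(x, x1, theta) * (e_r4 / e_r2 - r2)


if __name__ == "__main__":
    x, x1, theta = 0.7, -0.4, 0.2  # arbitrary illustrative values
    for h in (5.0, 20.0, 100.0):
        gap = y_h(x, x1, theta, h) - y_limit(x, x1, theta)
        print(h, gap, h * h * gap, second_order_term(x, x1, theta))
```

As $h$ grows, the printed value of $h^2\{Y_h(x) - Y(x)\}$ should settle near the second-order term, which is the behaviour the expansion relies on.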
In the following lemma we give a useful expansion for $\xi$.

Lemma 1. In addition to the conditions (A1) and (A2), assume that $g$ has compact support. Then, for any constant $c > 0$ there exists $\epsilon_h$ going down to zero as $h$ tends to infinity such that

\[ P\Big[\ \sup_{\|x\|<c}\ \Big|\,\xi\Big\{K\Big(\frac{x-X_1}{h}\Big),\, E_{\theta^*}K\Big(\frac{x-X_1}{h}\Big)\Big\} - \xi_0(Y(x)) - \frac{1}{h^2}W(x)\Big| \le \frac{\epsilon_h}{h^2}\ \Big] = 1. \]

Proof. Let $z_h(x) = 1 - E_{\theta^*}K(h^{-1}(x-X_1))$. Then, we can write

\[ \xi\Big[K\Big(\frac{x-X_1}{h}\Big),\, E_{\theta^*}K\Big(\frac{x-X_1}{h}\Big)\Big] = \xi\big[1 - Y_h(x)\,z_h(x),\ 1 - z_h(x)\big]. \tag{2.10} \]

From the condition (A2) and compactness of the support of $g$, it follows that for any constant $c > 0$ there exists $\epsilon_h$ going down to zero as $h$ tends to infinity such that, with probability one,

\[ \sup_{\|x\|<c}\ \Big|\,Y_h(x) - Y(x) - \frac{\kappa_4}{\kappa_2 h^2}\,Y(x)\Big(E_{\theta^*}\big[Y(x)\|x-X_1\|^2\big] - \|x-X_1\|^2\Big)\Big| \le \frac{\epsilon_h}{h^2}. \tag{2.11} \]

Applying (2.5) to (2.10) with the fact that $\sup_{\|x\|<c} z_h(x) \le c'/h^2$ for some $c' > 0$, and using (2.11), yields the lemma.

Remark 1. The condition that $g$ has compact support in Lemma 1 may be relaxed to a weaker one. In fact, it may be proved with more deliberate arguments that the lemma still holds when $g$ has exponentially decaying tails. However, in this case one needs some stronger conditions on $\xi$ instead. For example, in place of (2.5) one needs

\[ \lim_{z\downarrow0}\ \sup_{0\le y\le z^{-1+\beta}}\ \frac1z\,\big|\xi(1-yz,\,1-z) - \xi_0(y) - \xi_1(y)\,z\big| = 0 \]

for an arbitrarily small $\beta > 0$. In addition, to control its property at the tails one needs

\[ \sup_{0<z<\varepsilon}\ \sup_{z^{-1+\beta}<y\le z^{-1}}\ z^{\gamma}\,\big|\xi(1-yz,\,1-z) - \xi_0(y) - \xi_1(y)\,z\big| < c \]

for some $\varepsilon, c > 0$ and $\gamma \ge 0$. The three versions of the local likelihood estimation still satisfy these conditions on $\xi$: for the U-version, $c = 1$ and $\gamma = 0$; for the C-version, $c = 1$ and $\gamma = 1$; and for the T-version, $c = (1-\varepsilon)^{-1}$ and $\gamma = 0$.
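The expansion condition (2.5) behind Lemma 1 is easy to probe numerically for the three versions. The following sketch is not from the paper: the grid, the constant `c = 5` and the function names are illustrative assumptions. It evaluates $\sup_{0\le y\le c} z^{-1}|\xi(1-yz,1-z) - \xi_0(y) - \xi_1(y)z|$ on a grid of $y$ values for a given small $z$.

```python
# Each entry holds (xi, xi0, xi1) for one version, as stated in Section 2.
VERSIONS = {
    "U": (lambda u, v: 1.0, lambda y: 1.0, lambda y: 0.0),
    "C": (lambda u, v: (1.0 - u) / (1.0 - v), lambda y: y, lambda y: 0.0),
    "T": (lambda u, v: u / v, lambda y: 1.0, lambda y: 1.0 - y),
}


def a1_remainder(version, z, c=5.0, grid=200):
    """Grid approximation of sup_{0<=y<=c} |xi(1-yz, 1-z) - xi0(y) - xi1(y) z| / z.

    The constant c and the grid size are hypothetical choices for illustration.
    """
    xi, xi0, xi1 = VERSIONS[version]
    worst = 0.0
    for k in range(grid + 1):
        y = c * k / grid
        remainder = abs(xi(1.0 - y * z, 1.0 - z) - xi0(y) - xi1(y) * z) / z
        worst = max(worst, remainder)
    return worst


if __name__ == "__main__":
    for name in ("U", "C", "T"):
        print(name, [a1_remainder(name, z) for z in (1e-1, 1e-2, 1e-3)])
```

For the U- and C-versions the remainder is zero up to rounding, while for the T-version it equals $z|1-y|/(1-z)$ and so shrinks linearly in $z$, in line with (A1).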
The following two lemmas demonstrate the behaviour of $f_h(x)$, defined immediately below (2.7), and that of $\hat f_h(x)$ when $h$ tends to infinity. These are useful to quantify the asymptotic risk of the estimator in the next section. To state the lemmas, let $U_h(x,\theta) = \hat\Psi_h(x,\theta) - E\hat\Psi_h(x,\theta)$. It has mean zero and variance of order $O(n^{-1})$. Define

\[ I_h(x,\theta) = E\Big\{K\Big(\frac{x-X_1}{h}\Big)u(X_1,\theta)u(X_1,\theta)^T\Big\} - E\Big\{\frac{\ddot f}{f}(X_1,\theta)\Big\} - E\Big\{\xi\Big[K\Big(\frac{x-X_1}{h}\Big),\, E_\theta K\Big(\frac{x-X_1}{h}\Big)\Big]u(X_1,\theta)\Big\}\, E_\theta\Big\{K\Big(\frac{x-X_1}{h}\Big)u(X_1,\theta)^T\Big\}. \tag{2.12} \]

It will be seen that this is an approximation of $-(\partial/\partial\theta)E\hat\Psi_h(x,\theta)$. Above and in the subsequent arguments, we let $\dot p(x,\theta)$ and $\ddot p(x,\theta)$ for a function $p$ denote, respectively, the first and the second derivatives of $p(x,\theta)$ with respect to $\theta$. Write $f^*(x) = f(x,\theta^*)$, and define a $p\times p$ matrix

\[ I^*(\theta) = E\Big\{u(X_1,\theta)u(X_1,\theta)^T - \frac{\ddot f}{f}(X_1,\theta)\Big\}. \]

The two lemmas rely on some technical assumptions in addition to (A1)-(A3). They are stated in the appendix. Proofs of the lemmas are also deferred to the appendix.

Lemma 2. Assume (A1)-(A3) and the conditions listed in the appendix. Then, for all $\alpha \ge -1$ it follows that, uniformly for $x$ in any compact subset of $\mathcal{X}$,

\[ f_h(x) = f^*(x) + \dot f(x,\theta^*)^T I^*(\theta^*)^{-1} E\hat\Psi_h(x,\theta^*) + O(\rho) \tag{2.13} \]

as $n\to\infty$ and $h\to\infty$, where $\rho\equiv\rho(n,h) = n^{-(1+\alpha)/2}h^{-4}$.

Lemma 3. Assume the conditions of Lemma 2. Then, for all $\alpha \ge -1$ it follows that, uniformly for $x$ in any compact subset of $\mathcal{X}$,

\[ \hat f_h(x) = f_h(x) + \dot f(x,\theta^*)^T I_h(x,\theta^*)^{-1} U_h(x,\theta^*) + O_p(\delta) \]

as $n\to\infty$ and $h\to\infty$, where $\delta\equiv\delta(n,h) = n^{-1-(\alpha/2)}h^{-2}(\log n)^{1/2} + n^{-1}\log n$.

Remark 2. Since $U_h(x,\theta^*)$ in Lemma 3 is a sum of independent random vectors, asymptotic normality of $\hat f_h(x)$ follows immediately from the lemma.

3. Risk analysis
For the risk of an estimator $\hat g$ of $g$, we consider the mean integrated weighted squared error:

\[ E\int\{\hat g(x) - g(x)\}^2\,w(x)\,dx, \]

where $w(\cdot)$ is a weight function whose support is compact and contained in the support of $g$. In this section, we provide the asymptotic risks of the estimator $\hat f_h(x)$ and its scaled version $\hat g_h(x)$.

First, we consider the unscaled estimator $\hat f_h(x)$. The asymptotic risk of the estimator $\hat f_h(x)$ is given by the risk of its approximation $\tilde f_h(x)$, which is defined by

\[ \tilde f_h(x) = f_h(x) + \dot f(x,\theta^*)^T I_h(x,\theta^*)^{-1} U_h(x,\theta^*). \tag{3.1} \]

We decompose the risk of $\tilde f_h$ into two parts:

\[ b_U(n,h) = \int\{f_h(x) - g(x)\}^2 w(x)\,dx, \qquad v_U(n,h) = E\int\{\tilde f_h(x) - f_h(x)\}^2 w(x)\,dx. \]

The first term $b_U(n,h)$ represents the bias of the unscaled estimator $\tilde f_h$ due to model misspecification, while $v_U(n,h)$ measures its sampling variability. In the following theorem, we give approximations for these components. To state the theorem, let $I_{0,*} = E\,u(X_1,\theta^*)u(X_1,\theta^*)^T$ and $D = -E\{\ddot f(X_1,\theta^*)/f(X_1,\theta^*)\}$. Note that $D$ is not zero since we take the expectation with respect to the true density instead of $f(\cdot,\theta^*)$. It follows that

\[ I_{0,*} = I^*(\theta^*) - D. \tag{3.2} \]

Define

\[ \nu_0 = \int\{f^*(x) - g(x)\}^2 w(x)\,dx, \]
\[ \tau_0 = \int \dot f(x,\theta^*)^T I^*(\theta^*)^{-1} I_{0,*}\, I^*(\theta^*)^{-1}\dot f(x,\theta^*)\,w(x)\,dx, \]
\[ U(x) = \dot f(x,\theta^*)^T I_{0,*}^{-1} D\, I_{0,*}^{-1} u(X_1,\theta^*), \]
\[ Z_U(x) = \dot f(x,\theta^*)^T I_{0,*}^{-1}\Big\{\|x-X_1\|^2 u(X_1,\theta^*) - \xi_0(Y(x))\,E_{\theta^*}\|x-X_1\|^2 u(X_1,\theta^*)\Big\}, \]
\[ \nu_{1,U} = 2\int\{E_{g-f}Z_U(x)\}\{f^*(x) - g(x)\}\,w(x)\,dx, \]
\[ \tau_{1,U} = 2\int\{E\,Z_U(x)U(x)\}\,w(x)\,dx, \]
\[ \tau_{2,U} = \int\Big[E\,Z_U(x)^2 - \{E\,Z_U(x)u(X_1,\theta^*)^T\}\,I_{0,*}^{-1}\{E\,u(X_1,\theta^*)Z_U(x)\}\Big]w(x)\,dx. \]
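Here $E_{g-f}$ denotes $E - E_{\theta^*}$. Note that $\nu_0 = O(n^{-1-\alpha})$ and $\nu_{1,U} = O(n^{-1-\alpha})$ by the condition (A3). Also, $\tau_{1,U} = O(n^{-(1+\alpha)/2})$ since $D = O(n^{-(1+\alpha)/2})$ by the fact $E_{\theta^*}\{\ddot f(X_1,\theta^*)/f(X_1,\theta^*)\} = 0$. Now, $\tau_{2,U}$ is a constant which does not depend on $n$. It is strictly positive. This follows from the Cauchy-Schwarz inequality: for any $\mathbb{R}^p$-valued $\psi$ and real-valued $\phi$,

\[ \big(E\phi\psi^T\big)\big(E\psi\psi^T\big)^{-1}\big(E\psi\phi\big) \le E\phi^2, \]

with "=" holding if and only if $\phi = a^T\psi$ for some constant vector $a$. Taking $\psi = u(X_1,\theta^*)$ and $\phi = Z_U(x)$ shows that $\tau_{2,U} > 0$.

Theorem 1. Under the conditions of Lemma 2, we get as $n\to\infty$ and $h\to\infty$

\[ b_U(n,h) = \nu_0 - \kappa_2\frac{\nu_{1,U}}{h^2} + o\Big(\frac{1}{n^{\alpha+1}h^2}\Big), \]
\[ v_U(n,h) = \frac{\tau_0}{n} - \kappa_2\frac{\tau_{1,U}}{nh^2} + \kappa_2^2\frac{\tau_{2,U}}{nh^4} + o\Big(\frac{1}{n^{3/2+\alpha/2}h^2} + \frac{1}{nh^4}\Big). \]

Next, we consider the scaled estimator $\hat g_h(x)$. Define $g_h(x) = f_h(x)/\int f_h(x)\,dx$. Write

\[ v_h(x) = \dot f(x,\theta^*)^T I^*(\theta^*)^{-1}E\hat\Psi_h(x,\theta^*), \qquad V_h(x) = \dot f(x,\theta^*)^T I_h(x,\theta^*)^{-1}U_h(x,\theta^*). \]

We may obtain expansions for $g_h(x)$ and $\hat g_h(x)$, analogous to those given in Lemmas 2 and 3, as follows:

\[ g_h(x) = f^*(x) + v_h(x) - f^*(x)\int v_h(x)\,dx + O(\rho), \tag{3.3} \]
\[ \hat g_h(x) = g_h(x) + V_h(x) - f^*(x)\int V_h(x)\,dx + O_p(\delta). \tag{3.4} \]

In the next theorem, we give the risk of $\tilde g_h(x)$, an approximation of $\hat g_h(x)$ defined by

\[ \tilde g_h(x) = g_h(x) + V_h(x) - f^*(x)\int V_h(x)\,dx. \]

Similarly to the case of the unscaled $\tilde f_h(x)$, the risk of $\tilde g_h(x)$ is decomposed into two parts:

\[ b_S(n,h) = \int\{g_h(x) - g(x)\}^2 w(x)\,dx, \tag{3.5} \]
\[ v_S(n,h) = E\int\{\tilde g_h(x) - g_h(x)\}^2 w(x)\,dx. \tag{3.6} \]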
To state the theorem, write $Z_S(x) = Z_U(x) - f^*(x)\int Z_U(x)\,dx$. Define the following scaled versions of $\nu_{1,U}$, $\tau_{1,U}$ and $\tau_{2,U}$:

\[ \nu_{1,S} = 2\int\{E_{g-f}Z_S(x)\}\{f^*(x) - g(x)\}\,w(x)\,dx, \]
\[ \tau_{1,S} = 2\int\{E\,Z_S(x)U(x)\}\,w(x)\,dx, \]
\[ \tau_{2,S} = \int\Big[E\,Z_S(x)^2 - \{E\,Z_S(x)u(X_1,\theta^*)^T\}\,I_{0,*}^{-1}\{E\,u(X_1,\theta^*)Z_S(x)\}\Big]w(x)\,dx. \]

Theorem 2. Under the conditions of Lemma 2, we get as $n\to\infty$ and $h\to\infty$

\[ b_S(n,h) = \nu_0 - \kappa_2\frac{\nu_{1,S}}{h^2} + o\Big(\frac{1}{n^{\alpha+1}h^2}\Big), \]
\[ v_S(n,h) = \frac{\tau_0}{n} - \kappa_2\frac{\tau_{1,S}}{nh^2} + \kappa_2^2\frac{\tau_{2,S}}{nh^4} + o\Big(\frac{1}{n^{3/2+\alpha/2}h^2} + \frac{1}{nh^4}\Big). \]

The first term $\nu_0$ in the expansions of $b_U(n,h)$ and $b_S(n,h)$ in Theorems 1 and 2 represents the model misspecification error of the proposed parametric family. It is the integrated squared bias of the parametric maximum likelihood estimator $f(\cdot,\hat\theta_{MLE})$, where $\hat\theta_{MLE}$ is defined as the solution of the equation $\sum_{i=1}^n u(X_i,\theta) = 0$. It is zero if the true density actually belongs to the parametric model. Next, the first term $\tau_0/n$ in the expansions of $v_U(n,h)$ and $v_S(n,h)$ is the integrated variance of the parametric maximum likelihood estimator. Since $\nu_0$ and $\tau_0$ do not depend on $\xi$, the first order properties of all the members in the class of local likelihood estimators are the same. The other terms in the expansions depend on the bandwidth $h$. These terms also depend on $\xi$, but only through $\xi_0$. Thus, the U- and T-versions have the same second order properties, too, as they both have $\xi_0\equiv1$.

We note that Eguchi and Copas (1998) neglected the term $\kappa_2\tau_1/(nh^2)$ in their analysis of the asymptotic variance because their main concern was the case where $0 < \alpha < 1$. When $0 < \alpha < 1$, the optimal $h$ is of order $n^{\alpha/2}$ (see Section 4), and with this choice the term $\kappa_2\tau_1/(nh^2)$ is negligible. We may find an optimal size of the bandwidth by minimizing the sum of $b(n,h)$ and $v(n,h)$. This will be discussed in Section 4. The formulas given in Theorems 1 and 2 are more useful than the one given by Eguchi and Copas (1998) since the dependence of the risks on the function $\xi$ is more transparent. A direct risk comparison between various local likelihood methods can be made from the formulas, which will also be dealt with in the next section.

Below, we give a proof of Theorem 1. The proof of Theorem 2 is omitted as it may be proved in a similar fashion using (3.3) and (3.4) instead of Lemmas 2 and 3.

Proof of Theorem 1. From the consistency condition (2.3) on $\xi$,

\[ E\hat\Psi_h(x,\theta^*) = E_{g-f}\Big\{K\Big(\frac{x-X_1}{h}\Big)u(X_1,\theta^*) - \xi\Big[K\Big(\frac{x-X_1}{h}\Big),\, E_{\theta^*}K\Big(\frac{x-X_1}{h}\Big)\Big]\,E_{\theta^*}K\Big(\frac{x-X_1}{h}\Big)u(X_1,\theta^*)\Big\}. \]

Since $E_{\theta^*}u(X_1,\theta^*) = 0$, we obtain from the condition (A2) on the kernel that, uniformly for $x$ in any compact subset of $\mathcal{X}$,

\[ E_{\theta^*}K\Big(\frac{x-X_1}{h}\Big)u(X_1,\theta^*) = -\frac{\kappa_2}{h^2}E_{\theta^*}\|x-X_1\|^2u(X_1,\theta^*) + O\Big(\frac{1}{h^4}\Big). \]

Thus, from Lemma 1 and the fact $E_{g-f}u(X_1,\theta^*) = 0$, it follows that, uniformly for $x$ in any compact subset of $\mathcal{X}$,

\[ \dot f(x,\theta^*)^T I^*(\theta^*)^{-1}E\hat\Psi_h(x,\theta^*) = -\frac{\kappa_2}{h^2}E_{g-f}Z_U(x) + o\Big(\frac{1}{n^{(1+\alpha)/2}h^2}\Big). \tag{3.7} \]

The first part of the theorem then follows immediately from (2.13) in Lemma 2 and (3.7).

To find the formula for $v(n,h)$, we need to approximate $I_h(x,\theta^*)$, defined at (2.12), and $\mathrm{var}\{U_h(x,\theta^*)\}$. Define

\[ I_{k,*}(x) = E\{\|x-X_1\|^k u(X_1,\theta^*)u(X_1,\theta^*)^T\}, \]
\[ Z(x) = \|x-X_1\|^2u(X_1,\theta^*) - \xi_0(Y(x))\,E_{\theta^*}\|x-X_1\|^2u(X_1,\theta^*). \]

By arguments similar to those used in deriving (3.7), we get

\[ I_h(x,\theta^*) = I^*(\theta^*) - \frac{\kappa_2}{h^2}E\,Z(x)u(X_1,\theta^*)^T + \frac{1}{h^4}\Big[\kappa_4I_{4,*}(x) + \kappa_2E\,W(x)u(X_1,\theta^*)\{E_{\theta^*}\|x-X_1\|^2u(X_1,\theta^*)\}^T\Big] + o\Big(\frac{1}{h^4}\Big) \tag{3.8} \]

uniformly for $x$ in any compact subset of $\mathcal{X}$.

We compute $\mathrm{var}\{U_h(x,\theta^*)\}$. From Lemma 1 and the condition (A2) on the kernel $K$, we may verify that, uniformly for $x$ in any compact subset of $\mathcal{X}$,

\[ \mathrm{var}\{U_h(x,\theta^*)\} = \frac1n\,\mathrm{var}\Big[u(X_1,\theta^*) - \frac{\kappa_2}{h^2}Z(x) + \frac{1}{h^4}\Big\{\kappa_4\|x-X_1\|^4u(X_1,\theta^*) + \kappa_2W(x)\,E_{\theta^*}\|x-X_1\|^2u(X_1,\theta^*)\Big\}\Big] + o\Big(\frac{1}{nh^4}\Big). \tag{3.9} \]
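Next, write $H_2(x) = \kappa_2E\,Z(x)u(X_1,\theta^*)^T$ and $H_4(x) = \kappa_4I_{4,*}(x) + \kappa_2E\,W(x)u(X_1,\theta^*)\{E_{\theta^*}\|x-X_1\|^2u(X_1,\theta^*)\}^T$. Note that with these notations the equation (3.8) can be written as

\[ I_h(x,\theta^*) = I^*(\theta^*) - \frac{1}{h^2}H_2(x) + \frac{1}{h^4}H_4(x) + o\Big(\frac{1}{h^4}\Big). \tag{3.10} \]

Also, the equation (3.9) reduces to

\[ n\,\mathrm{var}\{U_h(x,\theta^*)\} = I_{0,*} - \frac{1}{h^2}\{H_2(x) + H_2(x)^T\} + \frac{1}{h^4}\{H_4(x) + H_4(x)^T\} + \frac{\kappa_2^2}{h^4}E\,Z(x)Z(x)^T + o\Big(\frac{1}{h^4}\Big). \tag{3.11} \]

From (3.10) it follows that, uniformly for $x$ in any compact subset of $\mathcal{X}$,

\[ I_h(x,\theta^*)^{-1} = I^*(\theta^*)^{-1} + \frac{1}{h^2}I^*(\theta^*)^{-1}H_2(x)I^*(\theta^*)^{-1} + \frac{1}{h^4}I^*(\theta^*)^{-1}\big[H_2(x)I^*(\theta^*)^{-1}H_2(x) - H_4(x)\big]I^*(\theta^*)^{-1} + o\Big(\frac{1}{h^4}\Big). \tag{3.12} \]

We plug (3.11) and (3.12) into

\[ n\,\dot f(x,\theta^*)^T I_h(x,\theta^*)^{-1}\,\mathrm{var}\{U_h(x,\theta^*)\}\,\{I_h(x,\theta^*)^{-1}\}^T\dot f(x,\theta^*), \]

and collect terms involving $h^{-2}$ and $h^{-4}$. We find that the $h^{-2}$ terms are

\[ \frac{1}{h^2}\dot f(x,\theta^*)^T I^*(\theta^*)^{-1}\Big\{H_2(x)I^*(\theta^*)^{-1}I_{0,*} + I_{0,*}I^*(\theta^*)^{-1}H_2(x)^T - H_2(x) - H_2(x)^T\Big\}I^*(\theta^*)^{-1}\dot f(x,\theta^*) \]
\[ = \frac{2}{h^2}\dot f(x,\theta^*)^T I^*(\theta^*)^{-1}\Big\{H_2(x)I^*(\theta^*)^{-1}E\frac{\ddot f}{f}(X_1,\theta^*)\Big\}I^*(\theta^*)^{-1}\dot f(x,\theta^*). \tag{3.13} \]

The equation (3.13) follows from (3.2). We can replace $I^*(\theta^*)^{-1}$ in (3.13) by $I_{0,*}^{-1}$ with an error $O(n^{-1-\alpha}h^{-2})$ since $E\{\ddot f(X_1,\theta^*)/f(X_1,\theta^*)\} = O(n^{-(1+\alpha)/2})$. This gives that (3.13) equals $-2\kappa_2h^{-2}E\{Z_U(x)U(x)\} + O(n^{-1-\alpha}h^{-2})$. Similarly, we find that the $h^{-4}$ terms reduce to

\[ \frac{1}{h^4}\dot f(x,\theta^*)^T I^*(\theta^*)^{-1}\Big[\kappa_2^2E\,Z(x)Z(x)^T - H_2(x)I^*(\theta^*)^{-1}H_2(x)^T\Big]I^*(\theta^*)^{-1}\dot f(x,\theta^*) + o\Big(\frac{1}{h^4}\Big) \]
\[ = \frac{\kappa_2^2}{h^4}\Big[E\,Z_U(x)^2 - \{E\,Z_U(x)u(X_1,\theta^*)^T\}\,I_{0,*}^{-1}\{E\,u(X_1,\theta^*)Z_U(x)\}\Big] + o\Big(\frac{1}{h^4}\Big) \]

uniformly for $x$ in any compact subset of $\mathcal{X}$. The second part of the theorem now follows.

Remark 3 (Kullback-Leibler risk). An expansion of the Kullback-Leibler risk

\[ E\int\log\{g(x)/\hat g_h(x)\}\,g(x)\,dx \]

may be derived from Theorem 2 with specific choices of the weight function $w$. In fact, under the condition that $f^*$ is bounded away from zero on the support of $g$, we may approximate the Kullback-Leibler risk by

\[ KL(g,f^*) - \int\Big\{\frac{g_h(x) - f^*(x)}{f^*(x)}\Big\}g(x)\,dx + \frac12E\int\Big\{\frac{\tilde g_h(x) - g_h(x)}{g_h(x)}\Big\}^2g(x)\,dx, \tag{3.14} \]

where $KL(g,f) = \int\log\{g(x)/f(x)\}\,g(x)\,dx$ denotes the Kullback-Leibler divergence of $f$ from $g$. The first term $\nu_0^{KL}\equiv KL(g,f^*)$ at (3.14) is the minimal Kullback-Leibler divergence from $g$ among all members in $\{f(\cdot,\theta):\theta\in\Theta\}$. The second term equals

\[ -\int\{g_h(x) - f^*(x)\}\{g(x) - f^*(x)\}\,f^*(x)^{-1}\,dx \tag{3.15} \]

since $\int\{g_h(x) - f^*(x)\}\,dx = 1 - 1 = 0$. From the definition of $b_S(n,h)$ at (3.5) and its expansion given in Theorem 2, we may get an approximation of (3.15). By applying the first part of Theorem 2 with $w(x) = 1/\{2f^*(x)\}$, we obtain that (3.15) equals $-\kappa_2\nu_1^{KL}h^{-2} + o(n^{-(1+\alpha)}h^{-2})$, where

\[ \nu_1^{KL} = \int\{E_{g-f}Z_S(x)\}\{f^*(x) - g(x)\}\,f^*(x)^{-1}\,dx. \]

When $\xi_0\equiv1$ (thus for the U- and T-versions), it may be proved that

\[ \nu_1^{KL} = 2\{E_{g-f}X_1u(X_1,\theta^*)\}^T I_{0,*}^{-1}\{E_{g-f}X_1u(X_1,\theta^*)\}. \tag{3.16} \]

This matches the bias results in Theorem 1 and Corollary 1 of Eguchi and Copas (1998). This means that the results of Eguchi and Copas (1998) are valid only for the case where $\xi_0\equiv1$. Note that the unscaled version $f_h$ does not integrate to one. Thus, the second term at (3.14) corresponding to the unscaled estimator $\hat f_h$ would be of order $n^{-(1+\alpha)/2}h^{-2}$, which is slower than the $n^{-(1+\alpha)}h^{-2}$ of the scaled estimator. Therefore, normalizing the local likelihood estimator by its integral is important for the Kullback-Leibler risk. The third term at (3.14) is the variance part. We may get an expansion of this from the second part of Theorem 2, now with $w(x) = g(x)/\{2f^*(x)^2\}$. It equals

\[ \frac{\tau_0^{KL}}{n} - \kappa_2\frac{\tau_1^{KL}}{nh^2} + \kappa_2^2\frac{\tau_2^{KL}}{nh^4} + o\Big(\frac{1}{n^{3/2+\alpha/2}h^2} + \frac{1}{nh^4}\Big), \]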
where $\tau_i^{KL}$ for $i = 0, 1, 2$ are defined as $\tau_0$, $\tau_{1,S}$ and $\tau_{2,S}$ with $g(x)/\{2f^*(x)^2\}$ replacing $w(x)$. For instance,

\[ \tau_0^{KL} = \frac12\int u(x,\theta^*)^T I^*(\theta^*)^{-1}I_{0,*}\,I^*(\theta^*)^{-1}u(x,\theta^*)\,g(x)\,dx. \]

4. Comparison and optimal bandwidth

In Theorems 1 and 2, $\nu_0 + (\tau_0/n)$ is the mean integrated squared error of the parametric maximum likelihood estimator. Thus, the risk improvement achieved by the local likelihood estimators upon the parametric maximum likelihood estimator is given by

\[ r_d(h) \equiv \kappa_2(\nu_1 + \tau_1n^{-1})h^{-2} - \kappa_2^2\tau_2n^{-1}h^{-4}. \tag{4.1} \]

Here and below in this section, we simply write $\nu_1$ and $\tau_i$ ($i = 1, 2$) for $\nu_{1,k}$ and $\tau_{i,k}$, respectively, where $k = U$ or $S$. As a function of $t = h^{-2}$, $r_d(h)$ is a concave parabola on $t\ge0$. It has the maximum value at $t_0 = (\nu_1n + \tau_1)/(2\kappa_2\tau_2)$ if $\nu_1n + \tau_1 > 0$. Thus, in this case the optimal bandwidth is given by

\[ h_{opt} = \Big(\frac{2\kappa_2\tau_2}{\nu_1n + \tau_1}\Big)^{1/2}, \tag{4.2} \]

and the maximum risk improvement equals $(\nu_1n + \tau_1)^2/(4n\tau_2)$. When $\nu_1n + \tau_1\le0$, the risk improvement $r_d$ is a strictly decreasing function of $t$ on $t\ge0$, thus it is maximized at $t = 0$, i.e., at $h = \infty$, with the maximum value being zero. Note that $h = \infty$ corresponds to the fully parametric maximum likelihood estimator. It is not clear to us whether $\nu_1n + \tau_1 > 0$ in general. However, we found in an example below that $\nu_1$ and $\tau_1$ are positive (see Figure 2). For the Kullback-Leibler risk, it can be seen from (3.16) that the U- and T-versions have $\nu_1^{KL}n + \tau_1^{KL} > 0$ for sufficiently large $n$.

In the subsequent discussion we assume $\nu_1n + \tau_1 > 0$. Now, recall that $\nu_1\sim n^{-(1+\alpha)}$ and $\tau_1\sim n^{-(1+\alpha)/2}$. Thus, the formula (4.2) is valid only when $\alpha > 0$ since it is derived in the large-$h$ setting where $h$ tends to infinity as $n$ grows. If $0 < \alpha < 1$, then $\nu_1n$ dominates $\tau_1$. Thus, in this case $h_{opt}\sim n^{\alpha/2}$ and the maximum risk improvement is of order $n^{-(1+2\alpha)}$. Next, when $\alpha\ge1$, the optimal bandwidth is asymptotic to $n^{(1+\alpha)/4}$ with the maximum risk improvement being of order $n^{-(2+\alpha)}$. In the remaining case where $-1 < \alpha\le0$, we see from (4.1) that letting $h$ tend to infinity at a slower rate makes $r_d$ larger. Thus, a bandwidth tending to infinity at an ultimately slow rate would be preferable in this case.
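The formulas (4.1) and (4.2) translate directly into a small helper. The sketch below is only an illustration: the function names and the numerical values of $\kappa_2$, $\nu_1$, $\tau_1$ and $\tau_2$ are hypothetical placeholders, not quantities computed in the paper.

```python
import math


def risk_improvement(h, n, kappa2, nu1, tau1, tau2):
    """r_d(h) of (4.1): risk gain of local likelihood over the parametric MLE."""
    t = h ** -2  # the parabola is written in t = h^{-2}; h = inf gives t = 0
    return kappa2 * (nu1 + tau1 / n) * t - kappa2 ** 2 * tau2 * t ** 2 / n


def optimal_bandwidth(n, kappa2, nu1, tau1, tau2):
    """Return (h_opt, maximum improvement) following (4.2).

    When nu1 * n + tau1 <= 0 the improvement is maximized at h = infinity,
    i.e. by the fully parametric fit, with maximum value zero.
    """
    s = nu1 * n + tau1
    if s <= 0:
        return math.inf, 0.0
    h_opt = math.sqrt(2.0 * kappa2 * tau2 / s)
    return h_opt, s ** 2 / (4.0 * n * tau2)


if __name__ == "__main__":
    # Hypothetical constants, chosen only to exercise the formulas.
    n, kappa2, nu1, tau1, tau2 = 100, 0.5, 0.01, 0.2, 3.0
    h_opt, gain = optimal_bandwidth(n, kappa2, nu1, tau1, tau2)
    print(h_opt, gain, risk_improvement(h_opt, n, kappa2, nu1, tau1, tau2))
```

Evaluating `risk_improvement` at the returned `h_opt` reproduces the closed-form maximum $(\nu_1n+\tau_1)^2/(4n\tau_2)$, which is a convenient sanity check when plugging in estimated constants.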
We combine the results of Park et al. (2002) with our large-$h$ analysis. Recall that $d$ is the dimension of $X_i$ and $p$ is the dimension of the parameter. When $d = 1$, it was shown that the optimal bandwidth in the small-$h$ setting is asymptotic to $n^{-1/\{1+4[(p+1)/2]\}}$ with the minimal risk being of order $n^{-q_1}$, where $q_1 = 4[(p+1)/2]/\{1 + 4[(p+1)/2]\}$ and $[(p+1)/2]$ denotes the greatest integer which is less than or equal to $(p+1)/2$. This can be generalized to an arbitrary $d$. Let $q_d = 4[(p+1)/2]/\{d + 4[(p+1)/2]\}$. It can be seen that in the $d$-variate case the minimum risk $n^{-q_d}$ is achieved by the optimal bandwidth of order $n^{-1/\{d+4[(p+1)/2]\}}$. Note that the first order in the risk expansion for large $h$ is $n^{-(1+\alpha)} + n^{-1}$. Comparing this with the small-$h$ optimal risk $n^{-q_d}$ and taking into account the discussion in the previous paragraph, we arrive at the following conclusion. We find that the value of $\alpha$ at which a transition from a small to a large bandwidth is desirable is $\alpha = q_d - 1$.

(i) $-1\le\alpha < q_d - 1$: $h_{opt}\sim n^{-1/\{d+4[(p+1)/2]\}}$;
(ii) $q_d - 1 < \alpha\le0$: $h$ tending to infinity at an ultimately slow rate is preferable;
(iii) $0 < \alpha < 1$: $h_{opt}\sim n^{\alpha/2}$;
(iv) $1\le\alpha$: $h_{opt}\sim n^{(1+\alpha)/4}$.

The large-$h$ asymptotic formula (4.2) and the small-$h$ results provided in Park et al. (2002) may be used to produce useful bandwidth selectors. For example, plug-in methods are immediate from the formula (4.2), where $\theta^*$ is replaced by the solution of the likelihood equation $\sum_{i=1}^n u(X_i,\theta) = 0$ and other unknown quantities by their obvious empirical versions. Least squares cross-validation is an alternative way of choosing a data-driven bandwidth selector, and is readily applicable to local likelihood density estimation. The latter is not so tied to asymptotics and does not depend on the knowledge of $\alpha$. Thus, it may be used for a goodness-of-fit test of a parametric model, where the parametric model is rejected for small values of the cross-validatory bandwidth selector. Determination of the cutoff values in this case requires the sampling distribution of the bandwidth selector.
Tis would be a callenging problem for future researc. 5. A skewed normal example We compare te large properties of te U and Cversions of te local likeliood estimation. Note tat te Tversion as te same first and second order properties wit 17
18 te Uversion as we pointed out in te paragrap immediately after te statement of Teorem 2. We consider N(θ, 1 as te parametric model. We take w(x 1 in te definition of te risk. Te true density is taken to be g(x g β (x = 2φ(xΦ(βx, (4.1 were φ and Φ are te standard normal density and its distribution function. Tis is te socalled skewed normal distribution of Azzalini (1985, and was also considered by Eguci and Copas (1998. Here, β acts as a discrepancy parameter. Wen β = 0, te density g is identical to φ. As β increases, it becomes increasingly skewed. In tis setting, we find θ = EX = 2 β π. ( β 2 Te integrated squared distance between te true density g and its best parametric approximant φ( θ, wic is ν 0 function of β. Figure 1 depicts ν 0 as a function of β. = {φ(x θ 2φ(xΦ(βx} 2 dx, is a symmetric (Insert Figure 1 about ere We calculate some ingredients to evaluate te risks given in Teorems 1 and 2. We find I (θ = 1, I 0, = 1 θ 2 and D = θ 2. For computing ν i and τ i, we use te formula for te odd moments of te skewed normal distribution given in Corollary 4 of Henze (1986. In particular, we find in addition to (4.2 E X 3 = ( 2 3 ( π β β β β 2 E X 5 = ( 2 5 ( 3 π 3 β β β β 2 ( β 1 + β 2. For te even moments we obtain E X 2k = (2k!/(2 k k!, and tus EX 2 = 1, EX 4 = 3, EX 6 = 15. Tese formula may be also obtained by applying Corollaries 3.2 and 5.3 of Aldersof et al. (1995. It may be seen tat all ν i and τ i are symmetric as functions of β. Furtermore, E Z U (y dy U(x = 2 θ2 C (1 θ 2 3 (x θ φ(x θ E(X 1 θ 3, were C = 1 for te Uversion and C = {y 2 /(1 + y 2 }φ(y dy for te Cversion. Tus, since τ 1,S = τ 1,U 2 f (x{e Z U (y dy U(x} dx and f (x = φ(x θ, we ave 18
$\tau_{1,S} = \tau_{1,U}$ for both the U- and C-versions. Similarly, we may find $\tau_{2,S} = \tau_{2,U}$, but in this case only for the U-version. Figure 2 shows $\nu_1$, $\tau_1$ and $\tau_2$. We find that $\nu_1$ and $\tau_1$ are positive and converge to zero as $\beta$ tends to zero. Also, from Figure 2(a) we find that the U-versions have less bias than the C-version, and that the scaling improves the bias. If one plugs the values of $\nu_1$, $\tau_1$ and $\tau_2$ into the formula (4.2), one may see how $h_{\rm opt}$ changes as $\beta$ increases. We found that for a sample of size 100 the optimal bandwidth for the U-version of $\hat f$ decreases from infinity to .31 as $\beta$ increases from zero to 10, and for the C-version it takes values from infinity to .30.

(Insert Figure 2 about here)

Now, we evaluate the maximum risk improvements $r_d(h_{\rm opt}) = (\nu_1 n + \tau_1)^2/(4n\tau_2)$ achieved by the U- and C-versions upon the parametric maximum likelihood estimator. Note that the maximum risk improvement does not depend on the choice of kernel. It is symmetric about zero as a function of $\beta$. Figure 3 depicts $r_d(h_{\rm opt})$ when $n = 100$ and $400$. Comparing the scaled estimator $\hat g$ with the unscaled $\hat f$, we find that both the U(T)- and C-versions of $\hat g$ outperform the corresponding versions of $\hat f$ for all $\beta$. Also, it is interesting to find that the U(T)-version is better than the C-version for all $\beta$ in the case of $\hat g$, but that the risks of the unscaled estimators $\hat f$ are indistinguishable, although the C-version now is slightly better. We found that this is true for other sample sizes, too.

(Insert Figure 3 about here)

We conducted a small simulation to check the validity of our discussion on the optimal bandwidth size. For this, we took the standard normal density as the kernel function. We calculated an optimal bandwidth which minimizes the sum of squared deviations $\sum_{i=1}^n \{\hat f_h(X_i) - g_\beta(X_i)\}^2$.
Our simulation consists of three steps: (i) for each $\beta = 0, 0.5, 1, 2, 5$, and $10$, generate a random sample of size $n = 100$ from the density $g_\beta$ at (4.1) by the rejection method; (ii) compute $\hat f_h$ for the U- and C-versions at each data point $X_1, X_2, \ldots, X_n$; (iii) find the optimal bandwidth over the interval $(0, 30)$ which minimizes the sum of squared deviations $\sum_{i=1}^n \{\hat f_h(X_i) - g_\beta(X_i)\}^2$. These steps were repeated 100 times. Table 1 shows the average of the 100 calculated optimal bandwidths for each value of $\beta$. We see that it clearly justifies our theoretical observation that the optimal bandwidth traverses from a large to a small value as the degree of discrepancy from the
parametric model (in this example, $\beta$) increases. We note that the theoretical values .31 and .30 at $\beta = 10$ discussed two paragraphs above do not match well with .13 in the table because the theoretical values are obtained from the formula that is valid in the near parametric case.

(Insert Table 1 about here)

Table 1: Average of the optimal bandwidth for the U-version and the C-version, by $\beta$.
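Step (i) of the simulation and the low-order moments of $g_\beta$ can be checked numerically. The sketch below is not the authors' code. It assumes a standard normal proposal for the rejection method: a candidate $x$ drawn from $\phi$ is accepted with probability $\Phi(\beta x)$, which is valid because $g_\beta(x)/\{2\phi(x)\} = \Phi(\beta x) \le 1$. The function names and the seed are hypothetical. The script compares the sample mean with $\sqrt{2/\pi}\,\beta/\sqrt{1+\beta^2}$ and the sample second moment with 1.

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sample_skew_normal(beta, n, rng):
    """Draw n values from g_beta(x) = 2 phi(x) Phi(beta x) by rejection sampling.

    Assumed proposal: standard normal. A candidate x is accepted
    with probability Phi(beta * x).
    """
    out = []
    while len(out) < n:
        x = rng.gauss(0.0, 1.0)
        if rng.random() < std_normal_cdf(beta * x):
            out.append(x)
    return out


def theoretical_mean(beta):
    """E X = sqrt(2/pi) * beta / sqrt(1 + beta^2) for the skewed normal."""
    return math.sqrt(2.0 / math.pi) * beta / math.sqrt(1.0 + beta * beta)


if __name__ == "__main__":
    rng = random.Random(2004)  # arbitrary seed, for reproducibility only
    for beta in (0, 0.5, 1, 2, 5, 10):
        xs = sample_skew_normal(beta, 20000, rng)
        mean = sum(xs) / len(xs)
        second = sum(x * x for x in xs) / len(xs)
        print(f"beta={beta}: mean={mean:.3f} (theory {theoretical_mean(beta):.3f}), "
              f"second moment={second:.3f} (theory 1)")
```

The simulation in the text uses $n = 100$ per sample; a larger sample is used here only so that the moment check is informative.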
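Appendix

A.1. Additional assumptions. In addition to the assumptions (A1)-(A3), we need the following assumptions for the lemmas and the theorem.

(A4) the solution $\theta_h(x)$ defined at (2.7) is unique;
(A5) the underlying density $g$ and its best parametric approximation $f^*$ have compact supports;
(A6) $f(x, \theta)$ is three times partially differentiable with respect to $\theta$ and all the partial derivatives are continuous in $x$ and $\theta$;
(A7) there exists a function $G$ which is continuous and satisfies, for all $x$,

$$\sup_{\theta \in \Theta} \left| \frac{\partial^2}{\partial\theta^2} u(x, \theta) \right| \le G(x).$$

A.2. Proof of Lemma 2. Here, all $O$ expressions are uniform for $x$ in any compact subset $S$ of $\mathcal{X}$, i.e. for a sequence of functions $Q_{n,h}$, we say simply $Q_{n,h}(x) = O\{r(n,h)\}$ instead of $\sup_{x \in S} |Q_{n,h}(x)| = O\{r(n,h)\}$. It follows from (3.7) that

$$E\hat\Psi_h(x, \theta^*) = O\!\left(\frac{1}{n^{(1+\alpha)/2} h^2}\right). \tag{A.1}$$

This implies

$$\theta_h(x) = \theta^* - \left[\frac{\partial}{\partial\theta} E\hat\Psi_h(x, \theta)\right]^{-1}_{\theta=\theta^*} E\hat\Psi_h(x, \theta^*) + O\!\left(\frac{1}{n^{1+\alpha} h^4}\right). \tag{A.2}$$

Using the conditions (A2) and (A3), the identity (2.6), and the fact $\int (\partial/\partial\theta) f(x, \theta)\,dx = 0$, we may verify

$$\left[\frac{\partial}{\partial\theta} E\hat\Psi_h(x, \theta)\right]_{\theta=\theta^*} = -I_h(x, \theta^*) + O\!\left(\frac{1}{n^{(1+\alpha)/2} h^2}\right) = -I_*(\theta^*) + O\!\left(\frac{1}{h^2}\right). \tag{A.3}$$

Plug the second approximation at (A.3) into (A.2) and use (A.1) to get

$$\theta_h(x) = \theta^* + I_*(\theta^*)^{-1} E\hat\Psi_h(x, \theta^*) + O(\rho). \tag{A.4}$$

The lemma follows immediately from (A.4).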
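A.3. Proof of Lemma 3. In this proof, all $O_p$ expressions are also uniform for $x$ in any compact subset $S$ of $\mathcal{X}$. First, we observe

$$\hat\theta_h(x) = \theta_h(x) - \left[\frac{\partial}{\partial\theta} E\hat\Psi_h(x, \theta)\right]^{-1}_{\theta=\theta_h(x)} \hat\Psi_h(x, \theta_h(x)) + O_p\!\left(\frac{\log n}{n}\right). \tag{A.5}$$

The proof of (A.5) is similar to that of (4.1) in Park et al. (2002). The only difference is that we let $h$ tend to infinity instead of zero and thus only have $O_p(n^{-1}\log n)$ instead of $O_p(n^{-1}h^{-1}\log n)$ for the remainder. Now, we can replace $\hat\Psi_h(x, \theta_h(x))$ by $U_h(x, \theta_h(x))$ in (A.5) since $E\hat\Psi_h(x, \theta_h(x)) = 0$ by definition of $\theta_h(x)$. Also, we can replace $[(\partial/\partial\theta)E\hat\Psi_h(x, \theta)]_{\theta=\theta_h(x)}$ by $-I_h(x, \theta^*)$ with an error $O_p\{n^{-1-(\alpha/2)}h^{-2}(\log n)^{1/2}\}$. This is due to the facts that $U_h(x, \theta_h(x)) = O_p\{n^{-1/2}(\log n)^{1/2}\}$ and that $[(\partial/\partial\theta)E\hat\Psi_h(x, \theta)]_{\theta=\theta_h(x)} - [(\partial/\partial\theta)E\hat\Psi_h(x, \theta)]_{\theta=\theta^*}$ has magnitude of order $O(n^{-(1+\alpha)/2}h^{-2})$ by (A.1) and (A.2). Also, $[(\partial/\partial\theta)E\hat\Psi_h(x, \theta)]_{\theta=\theta^*} + I_h(x, \theta^*) = O(n^{-(1+\alpha)/2}h^{-2})$ by the first approximation at (A.3). This yields

$$\hat\theta_h(x) = \theta_h(x) + I_h(x, \theta^*)^{-1} U_h(x, \theta_h(x)) + O_p(\delta).$$

The lemma then follows immediately from the facts $U_h(x, \theta^*) = O_p\{n^{-1/2}(\log n)^{1/2}\}$ and $\theta_h(x) - \theta^* = O(n^{-(1+\alpha)/2}h^{-2})$.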
Acknowledgments. We are grateful for the helpful and constructive comments of an associate editor and two reviewers.
References

Aldershof, B., Marron, J. S., Park, B. U. and Wand, M. P. (1995). Facts about the Gaussian probability density function. Applicable Analysis 59.
Azzalini, A. (1985). A class of distributions which includes the normal ones. Scand. J. Statist. 12.
Copas, J. B. (1995). Local likelihood based on kernel censoring. J. R. Statist. Soc. B 57.
Eguchi, S. and Copas, J. B. (1998). A class of local likelihood methods and near-parametric asymptotics. J. R. Statist. Soc. B 60.
Henze, N. (1986). A probabilistic representation of the skewed-normal distribution. Scand. J. Statist. 13.
Hjort, N. L. and Jones, M. C. (1996). Locally parametric nonparametric density estimation. Ann. Statist. 24.
Kim, W. C., Park, B. U. and Kim, Y. G. (2001). On Copas' local likelihood density estimator. J. Kor. Statist. Soc. 30.
Loader, C. R. (1996). Local likelihood density estimation. Ann. Statist. 24.
Park, B. U., Kim, W. C. and Jones, M. C. (2002). On local likelihood density estimation. Ann. Statist. 30.