Using Monte Carlo Method to Compare CUSUM and EWMA Statistics

Xiaoyu Shen
Zhen Zhang

Abstract: Since ordinary datasets usually contain change points of variance, CUSUM and EWMA statistics can be used to detect these change points. In this project, we use the Monte Carlo method to compare the efficiencies of these two statistics according to the average run length (ARL).

Key words: change point, CUSUM, EWMA, Monte Carlo Method, ARL

1. Introduction

Basic statistical theory tells us that the variance of a data series is a very useful criterion indicating the deviation of the original data from the mean. However, a series of practical data often does not have constant variance, so detecting the moment when the variance of the data changes has significant value of its own. In quality control theory, there exist two quality control charts based on two different statistics, the cumulative sum (CUSUM) and the exponentially weighted moving average (EWMA). From these two charts, one can easily notice the moment when the properties of the data become abnormal, which may help us to control the quality of the data. Our topic in this project focuses on the CUSUM and EWMA statistics. Many statistical papers have shown that, after some simple transformations, both CUSUM and EWMA statistics can be used to detect change points of variance.

2. Empirical Results

The expression of the cumulative sum of squares is

$$\text{CUSUM:}\quad C_k = \sum_{i=1}^{k} x_i^2 .$$

We can get the D statistic after a simple transformation of CUSUM:

$$D_k = \frac{C_k}{C_n} - \frac{k}{n}, \qquad k = 1, 2, \ldots, n .$$
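As a minimal sketch of the two definitions above, the following R snippet computes the cumulative sum of squares C_k and the centred statistic D_k = C_k / C_n - k / n for a simulated series. The helper name `d_statistic`, the seed, and the series length are assumptions made for illustration; they do not come from the paper.

```r
# Hypothetical helper (not from the paper): centred CUSUM of squares.
d_statistic <- function(x) {
  n <- length(x)
  C <- cumsum(x^2)             # C_k = x_1^2 + ... + x_k^2
  C / C[n] - seq_len(n) / n    # D_k = C_k / C_n - k / n
}

set.seed(1)                    # arbitrary seed, for reproducibility only
x <- rnorm(200)                # series with constant variance
D <- d_statistic(x)
```

By construction D_n = 0, and for a series with constant variance D_k is expected to stay close to zero for every k.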
Also, we can get the expression of EWMA.

$$\text{EWMA:}\quad W_i = (1 - r)\, W_{i-1} + r \ln(\hat\sigma_i^2), \qquad i \ge 1, \quad W_0 = 0,$$

where, as computed in the Appendix code,

$$\hat\sigma_i^2 = \frac{1}{i} \sum_{j=1}^{i} x_j^2$$

is the running variance estimate and r is the weight of the statistic. The standardized statistic is

$$\mathrm{EWMA}_n(r) = \max_{1 \le i \le n} \sqrt{\frac{2 - r}{r \left(1 - (1 - r)^{2i}\right)}} \; \left| \sum_{j=1}^{i} r (1 - r)^{i-j} \ln(\hat\sigma_j^2) \right| .$$

The underlying method, which can be found in many books on quality control, is not the topic of our project, so it is not included in this paper. Having given the expressions of CUSUM and EWMA, we need to compare these two statistics. To carry out this comparison, we introduce a new criterion, the average run length, which helps us compare the detection performance of the two statistics. The following is the definition of the average run length.

Definition: The average run length (ARL) of a sampling inspection scheme at a given level of quality is the average number of samples of n items taken in the period between the time when the process commences to run at the stated level and that at which the scheme indicates a change from acceptable to rejectable quality level is likely to have occurred.

In this project, we define ARL as the mean, over a large number of generated samples, of the first run length at which a change point is indicated. With the help of computers, we can carry out a simulation study of ARL to compare the detection performance of CUSUM and EWMA.

3. Practical Results

To test the effect of the CUSUM statistic, we first generate two series of random data, each containing 1000 observations. One of them has no change points of variance; the other has two change points of variance. It is easy to see from the first pair of plots that 400 and 700 are the change points of variance in the second series. We can see from the second pair of graphs that the plot of the CUSUM statistic without variance change points is close to a straight line, while the one with variance change points has different slopes. After a simple transformation, we get the D plot (third pair of plots), which shows the change points more clearly. The D plot of the data without change points is close to a line (bounded in [-0.015, 0.015]), while the D plot of the data with change points greatly exceeds that boundary and has its maximum at point 700.
Even if the D plot cannot convince us that point 400 is also a change point, the CUSUM statistic and its derived statistic D are clearly effective at detecting change points of variance.
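The experiment described above can be sketched in a few lines of R. This version assumes the same segment standard deviations as the Appendix code (1, 0.4 and 1.2) but generates the three segments directly; the seed and the variable names `b`, `D` and `k_hat` are illustrative choices, not part of the paper.

```r
set.seed(42)                        # arbitrary seed, for reproducibility only

# Series of length 1000 whose variance changes after points 400 and 700
b <- c(rnorm(400, 0, 1), rnorm(300, 0, 0.4), rnorm(300, 0, 1.2))

n <- length(b)
C <- cumsum(b^2)                    # cumulative sum of squares
D <- C / C[n] - seq_len(n) / n      # D statistic

k_hat <- which.max(abs(D))          # index where |D_k| is largest
```

Under these assumptions `k_hat` should fall near the second change point, in line with the observation that the D plot peaks around point 700 while the change at 400 is less visible.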
Figure 1: Effect of CUSUM

We use the same two series of data to test the effect of the EWMA statistic. The first column in Figure 2 shows the graphs for the data without variance change points and the second column shows those for the data with change points. Although different weights r may detect changes differently, we can clearly see the change points of variance in all the plots in the second column. These plots also show, however, that the values of the EWMA statistic seem to be abnormal when the length is close to zero, that is, near the start of the series. So it is difficult for EWMA to detect a change point located at the beginning of the data.
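One way to look at the behaviour near the start of the series is to write the EWMA statistic as a recursion and inspect its standardizing factor. The sketch below follows the Appendix listing in taking the logarithm of the squared running mean of squares; the function name `ewma_stat`, the start value W_0 = 0, the seed and the returned list are assumptions made for illustration.

```r
# Hypothetical helper: standardized EWMA of the log running variance.
ewma_stat <- function(x, r) {
  n <- length(x)
  v <- cumsum(x^2) / seq_len(n)     # running mean of squares
  y <- log(v^2)                     # log term, as in the Appendix code
  W <- numeric(n)
  w_prev <- 0                       # assumed start value W_0 = 0
  for (i in seq_len(n)) {
    w_prev <- (1 - r) * w_prev + r * y[i]   # W_i = (1 - r) W_{i-1} + r y_i
    W[i] <- w_prev
  }
  # Standardizing factor sqrt((2 - r) / (r (1 - (1 - r)^(2 i))))
  scale <- sqrt((2 - r) / (r * (1 - (1 - r)^(2 * seq_len(n)))))
  list(W = W, scale = scale, stat = scale * W)
}

set.seed(7)                         # arbitrary seed
x <- rnorm(100)
out <- ewma_stat(x, r = 0.2)
```

The factor `out$scale` equals 1/r at i = 1 and decreases towards sqrt((2 - r)/r) as i grows, and the running variance estimate rests on very few observations at the start. Both effects are plausible contributors to the unstable early values, although the paper does not analyse them separately.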
Figure 2: Effect of EWMA (r=0.2, 0.6 and 0.9)

We know from the previous discussion that CUSUM and EWMA can both help to detect a change point of variance, so we now apply the Monte Carlo method to study them further. In the tables below, S is the number of simulated samples and n is the length of each sample. According to the Monte Carlo simulation study of the two statistics, we find that the ARL of CUSUM is always smaller than that of EWMA, no matter how large the sample size n and the weight r are. If the sample size n is small, the ARL of EWMA decreases when the weight r increases from 0.2 to 0.9. But if the sample size is large, the ARL of EWMA increases when the weight r increases.

                n=50     n=100    n=200
CUSUM           24.23    52.24    48.61
EWMA (r=0.2)    30.95    57.87    112.86
EWMA (r=0.6)    29.16    58.65    114.29
EWMA (r=0.9)    29.05    56.66    118.24

Table 1: S=100
                n=50     n=100    n=200
CUSUM           24.82    50.21    94.45
EWMA (r=0.2)    29.64    58.37    111.84
EWMA (r=0.6)    29.41    57.31    115.36
EWMA (r=0.9)    29.27    59.19    119.07

Table 2: S=300

                n=50     n=100    n=200
CUSUM           25.56    49.83    100.74
EWMA (r=0.2)    29.89    58.5     114.21
EWMA (r=0.6)    29.7     58.65    116.96
EWMA (r=0.9)    29.09    57.41    118.09

Table 3: S=500

4. Conclusion

CUSUM and EWMA can both be used to detect a change point of variance. However, after comparing the ARL of these two statistics with the help of the Monte Carlo method, we find that change points of variance might be more easily found by means of CUSUM. In the case of EWMA, when the sample size is small, EWMA with a large weight detects change points better. But when the sample size is large, EWMA with a small weight works better than EWMA with a large weight. In addition, we notice from the plots that the EWMA statistic cannot detect a change point at the beginning of the dataset. The reason may be that when the length is small, EWMA will be too large to reflect the effect of detection. Also, the information included in the beginning part of the data may not be adequate for EWMA to detect the overall change points.

Appendix

R code for the project

# test the effect of CUSUM
a <- rnorm(1000)                       # series with constant variance
b <- a                                 # series with two variance change points
b[401:700] <- rnorm(300, 0, 0.4)
b[701:1000] <- rnorm(300, 0, 1.2)
c <- a; d <- a; e <- b; f <- b
c[1] <- a[1]^2; e[1] <- b[1]^2
for (i in 1:999) {                     # cumulative sums of squares
  c[i + 1] <- c[i] + a[i + 1]^2
  e[i + 1] <- e[i] + b[i + 1]^2
}
for (j in 1:1000) {                    # D statistic: C_k / C_n - k / n
  d[j] <- c[j] / c[1000] - j / 1000
  f[j] <- e[j] / e[1000] - j / 1000
}
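plot(a, type = "l", main = "var is constant")
plot(b, type = "l", main = "var is non-constant")
plot(c, type = "l", ylab = "c", main = "cusum")
plot(e, type = "l", ylab = "c", main = "cusum")
plot(d, type = "l", ylab = "d", main = "d plot", ylim = c(-0.3, 0.3))
plot(f, type = "l", ylab = "d", main = "d plot", ylim = c(-0.3, 0.3))

# test the effect of EWMA (r=0.2, 0.6 and 0.9)
ewma <- function(r) {
  c <- a; d <- a; e <- b; f <- b
  c[1] <- a[1]^2; e[1] <- b[1]^2
  for (i in 1:999) {                   # cumulative sums of squares
    c[i + 1] <- c[i] + a[i + 1]^2
    e[i + 1] <- e[i] + b[i + 1]^2
  }
  for (j in 1:1000) {                  # running means of squares
    d[j] <- c[j] / j
    f[j] <- e[j] / j
  }
  # standardized EWMA for the series with constant variance
  g <- a; h <- a
  h[1] <- sqrt(2 - r) / sqrt(r * (1 - (1 - r)^2)) * r * log(d[1]^2)
  for (i in 2:1000) {
    g[1] <- r * (1 - r)^(i - 1) * log(d[1]^2)
    for (j in 1:(i - 1)) {
      g[j + 1] <- r * (1 - r)^(i - j - 1) * log(d[j + 1]^2) + g[j]
    }
    h[i] <- sqrt(2 - r) / sqrt(r * (1 - (1 - r)^(2 * i))) * g[i]
  }
  # standardized EWMA for the series with change points
  x <- b; y <- b
  y[1] <- sqrt(2 - r) / sqrt(r * (1 - (1 - r)^2)) * r * log(f[1]^2)
  for (i in 2:1000) {
    x[1] <- r * (1 - r)^(i - 1) * log(f[1]^2)
    for (j in 1:(i - 1)) {
      x[j + 1] <- r * (1 - r)^(i - j - 1) * log(f[j + 1]^2) + x[j]
    }
    y[i] <- sqrt(2 - r) / sqrt(r * (1 - (1 - r)^(2 * i))) * x[i]
  }
  plot(h, type = "l", main = paste0("r=", r))
  plot(y, type = "l", main = paste0("r=", r))
}

# calculate the ARL of CUSUM (S=100,300,500, n=50,100,200)
arl.cusum <- function(S, n) {          # e.g. S=100, n=50
  z <- rep(0, S)                       # one run length per simulated sample
  for (k in 1:S) {
    a <- rnorm(n)
    c <- a; d <- a; c[1] <- a[1]^2
    for (i in 1:(n - 1)) c[i + 1] <- c[i] + a[i + 1]^2
    for (j in 1:n) d[j] <- c[j] / c[n] - j / n
    z[k] <- 1
    for (i in 1:(n - 1)) {             # track the position of the largest |D|
      if (abs(d[i + 1]) >= abs(d[i])) {
        z[k] <- i + 1
      } else {
        d[i + 1] <- d[i]
      }
    }
  }
  list(arl = mean(z))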
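}

# calculate the ARL of EWMA (S=100,300,500, n=50,100,200, r=0.2, 0.6, 0.9)
arl.ewma <- function(S, n, r) {
  z <- rep(0, S)                       # one run length per simulated sample
  for (k in 1:S) {
    a <- rnorm(n)
    c <- a; d <- a; c[1] <- a[1]^2
    for (i in 1:(n - 1)) c[i + 1] <- c[i] + a[i + 1]^2
    for (j in 1:n) d[j] <- c[j] / j    # running mean of squares
    g <- a; h <- a
    h[1] <- sqrt(2 - r) / sqrt(r * (1 - (1 - r)^2)) * r * log(d[1]^2)
    for (i in 2:n) {
      g[1] <- r * (1 - r)^(i - 1) * log(d[1]^2)
      for (j in 1:(i - 1)) {
        g[j + 1] <- r * (1 - r)^(i - j - 1) * log(d[j + 1]^2) + g[j]
      }
      h[i] <- sqrt(2 - r) / sqrt(r * (1 - (1 - r)^(2 * i))) * g[i]
    }
    z[k] <- round(n / 3)               # skip the first third of the series
    for (i in round(n / 3):(n - 1)) {  # track the position of the largest |h|
      if (abs(h[i + 1]) >= abs(h[i])) {
        z[k] <- i + 1
      } else {
        h[i + 1] <- h[i]
      }
    }
  }
  list(r = r, arl = mean(z))
}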
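The run-length loop in arl.cusum records the position at which |D_k| attains its largest value, so the same Monte Carlo estimate can be sketched more compactly with vectorized calls. The function name `arl_cusum_fast`, the seed and the chosen S and n are illustrative assumptions. `which.max` returns the first maximum where the loop keeps the last one, which only differs in the case of exact ties.

```r
# Hypothetical vectorized variant of arl.cusum (names are illustrative).
arl_cusum_fast <- function(S, n) {
  run_len <- replicate(S, {
    x <- rnorm(n)                   # one sample with constant variance
    C <- cumsum(x^2)                # cumulative sum of squares
    D <- C / C[n] - seq_len(n) / n  # D statistic
    which.max(abs(D))               # position of the largest |D_k|
  })
  mean(run_len)                     # Monte Carlo estimate of the ARL
}

set.seed(123)                       # arbitrary seed
arl <- arl_cusum_fast(S = 300, n = 50)
```

Because each replication is independent, increasing S reduces the Monte Carlo error of the estimate, at a cost that grows linearly in S.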