Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Size: px
Start display at page:

Download "Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation"

Transcription

1 Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The Mercatus Center George Mason Unversty Arlngton, VA 0 RPF Workng Paper No August 8, 008 RESEARCH PROGRAM ON FORECASTING Center of Economc Research Department of Economcs The George Washngton Unversty Washngton, DC 005 Research Program on Forecastng (RPF) Workng Papers represent prelmnary work crculated for comment and dscusson. Please contact the author(s) before ctng ths paper n any publcatons. The vews expressed n RPF Workng Papers are solely those of the author(s) and do not necessarly represent the vews of RPF or George Washngton Unversty.

2 Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The Mercatus Center George Mason Unversty Arlngton, VA 0

3 Regresson analyss s ntended to be used when the researcher seeks to test a gven hypothess aganst a data set. Unfortunately, n many applcatons t s ether not possble to specfy a hypothess, typcally because the research s n a very early stage, or t s not desrable to form a hypothess, typcally because the number of potental explanatory varables s very large. In these cases, researchers have resorted ether to overt data mnng technques such as stepwse regresson, or covert data mnng technques such as runnng varatons on regresson models pror to runnng the fnal model (also known as data peekng ). Whle data mnng sde-steps the need to form a hypothess, t s hghly susceptble to generatng spurous results. Ths paper draws on the known propertes of OLS estmators n the presence of omtted and extraneous varable models to propose a procedure for data mnng that attempts to dstngush between parameter estmates that are sgnfcant due to an underlyng structural relatonshp and those that are sgnfcant due to random chance.

4 . Background Regresson analyss s desgned to estmate the probablty of observng a gven data set gven that a pre-determned hypothess about the relatonshp between an outcome varable and a set of factors s assumed to be true. Unfortunately, n many applcatons t s not possble to specfy a hypothess. Two possble reasons why a researcher would want to perform analyss absent a hypothess are: Large Data Scope: Data sze refers to the number of observatons n a data set. Data scope refers to the number of potental explanatory factors ( canddate factors ) n the data set. In the case of large data scope (e.g., economc and fnancal data sets), the number of canddate factors s so large that the cost of formng a tractable hypothess s prohbtve. Early Stage Analyss: In the case of early stage analyss (e.g., clncal data), there has not been enough observaton to as yet for a hypothess. To date, stepwse regresson (SR) has been one of the more wdely used technques for performng these hypothess-less analyses. SR methods perform a smart samplng of regresson models n an attempt to fnd a regresson model that best fts the data. The procedure for smartly samplng s ad-hoc. Two typcally used procedures are backward (wheren the frst model ncludes all factors and factors are removed one at a tme) and forward (wheren the frst model ncludes no factors and factors are added one at a tme). SR procedures contan three major flaws: Samplng sze: The number of regresson models that can be constructed from a gven data set can be ncredbly large. For example, wth only 30 canddate 3

5 factors one could construct more than bllon regresson models ( 30 ). Stepwse procedures sample only a small number (typcally less than 00) of the set of possble regresson models. Whle stepwse methods can fnd models that ft the data reasonably well, as the number of factors rses, the probablty of stepwse methods fndng the best-ft model s vrtually zero. Ft crteron: In evaluatng competng models, stepwse methods typcally employ an F-statstc crteron. Ths crteron causes stepwse to methods to seek out the model that comes closest to explanng the data set. However, as the number of canddate factors ncreases, what also ncreases s the probablty of a gven factor addng explanatory power smply by random chance. Thus, the stepwse ft crteron cannot dstngush between factors that contrbute explanatory power to the outcome varable because of an underlyng relatonshp ( determnstc factors ) and those that contrbute by random chance only ( spurous factors ). Intal condton: Because SR only examnes a small subset of the space of possble models and because the space of fts of the models (potentally) contans many local optma, the soluton SR returns vares based on the startng pont of the search. For example, for the same data set, SR backward and SR forward can yeld dfferent results. As the startng ponts for SR backward and SR forward are arbtrary, there are many other potental startng ponts each of whch potentally yelds a dfferent result. For example, Fgure depcts the space of possble regresson models that can be formed usng K factors. Each block represents one regresson model. The shade of the block ndcates the qualty of the model (e.g., goodness of ft). A stepwse procedure that starts at model A 4

6 evaluates models n the vcnty of A, moves to the best model, then re-evaluates n the new vcnty. Ths contnues untl the procedure cannot fnd a better model n the vcnty. In ths example, stepwse would move along the ndcated path startng at pont A. Were stepwse to start at pont B, however, t would end up at a dfferent optmal model. Out of the four startng ponts shown, only startng pont D fnds the best model. Fgure. Results from stepwse procedures are dependent on the ntal condton from whch the search commences.. All Subsets Regresson All Subsets Regresson (ASR) s a procedure ntended to be used when a researcher wants to perform analyss n the absence of a hypothess and wants to avod the samplng sze problem nherent n stepwse procedures. For K potental explanatory factors, ASR examnes all K lnear models that can be constructed. Untl recently, ASR has been nfeasble due to the massve computng power requred. By employng grd-enabled super computaton, t s now 5

7 feasble to conduct ASR for moderately szed data sets. For example, as of today an average computer would requre nearly 00 years of contnuous work to perform ASR on a data set contanng 40 canddate factors (see Fgure ), whle a 0,000 node grd computer could complete the same analyss n less than a week. Whle ASR solves the samplng sze and ntal condton problems nherent n SR, ASR remans subject to the ft crteron problem. If anythng, the ft crteron s more of a problem for ASR as the procedure examnes a much larger space of models than does SR and therefore s more lkely to fnd spurous factors. Fgure. The tme a sngle computer requres to perform ASR rses exponentally wth the number of canddate factors. 6

8 3. Exhaustve Regresson Exhaustve Regresson (ER) utlzes the ASR procedure, but attempts to dentfy spurous factors va a cross-model ch-square statstc that tests for stablty n parameter estmates across models. The cross-model stablty test compares parameter estmates for each factor across all K models n whch the factor appears. Factors whose parameter estmates yeld sgnfcantly dfferent results across models are dentfed as spurous. A gven factor can exst n one of three types of models: omtted factor model, correctly specfed model, and extraneous factor model. A correctly specfed model contans all of the explanatory factors (.e., the factors contrbute to explanng the outcome varable because of some underlyng relatonshp between the factor and the outcome) and no other factors. An omtted varable model ncludes at least one, but not all explanatory factors and (possbly) other factors. An extraneous varable model ncludes all explanatory factors and at least one other factor. In the cases of the correctly specfed model and the extraneous varable model, estmates of parameters assocated wth the factors ( slope coeffcents ) are unbased. In the case of the omtted varable model, however, estmates of the slope coeffcents are based and the drecton of bas s a functon of the covarances of the omtted factor wth the outcome varable and the omtted factor wth the factors ncluded n the model. For example, consder a data set contanng k +k +k 3 factors, for each of whch there are N observatons. Let the factors be arranged nto three Nxk j, j={,,3} matrces X, X, and X 3, and let the sets of slope coeffcents assocated wth each set of factors be the k j x vectors β, β, and β 3, respectvely. Let Y be an Nx vector of observatons on the outcome varable. Suppose that, unknown to the researcher, the process that determnes the outcome varable s Assumng, of course, that the remanng classcal lnear model assumptons hold. 7

9 Y=Xβ +Xβ +u () where u s an Nx vector of ndependently and dentcally normally dstrbuted errors. The three cases are attaned when we apply the ordnary least squares (OLS) procedure to estmate the followng models: Omtted Varable Case: Y=Xβ + u O Correctly Specfed Case: Y=Xβ +Xβ +u Extraneous Varable Case: Y=Xβ +Xβ +Xβ 3 3 +u E In the correctly specfed case, the ordnary least squares regresson procedure yelds the slope estmate: βˆ - ' ' = ([ X X] [ X X] ) [ X X] Y ˆ β () where the square brackets ndcate a parttoned matrx. Substtutng the defnton for Y from () nto (), t can be shown that the expected values of the slope estmates equal the true slope values: βˆ β E = ˆ β β Ths s the unbasedness condton. Smlarly, n the extraneous varable case, we have: βˆ ' - ' β ˆ = ([ X X X3] [ X X X3] ) [ X X X3] Y ˆ β3 (3) Agan, t can be shown that the expected values of the slope estmates equal the true slope values: 8

10 By contrast, n the omtted varable case, we have Substtutng () nto (4), we have and the expected values of the slope estmates are: βˆ β E ˆ β = β ˆ β3 0 ' ( ) - ˆ ' β = X X X Y (4) ' - ' ( ) ( ) β ˆ = X X X X β +X β +u E ( ˆ ' ) ( ) - ' β = β + X X X X β β (5) From (5), we see that the expected value of the slope estmates n the omtted varable case are based and that the drecton of the bas depends on ' XXβ. We can construct a sequence of omtted varable cases as follows. Let { Z,Z,..., Z } be the set of all (column-wse unque) subsets of X. Let X % be the set of k regressors formed by mergng X and X and then removng Z so that, from the superset of regressors formed by mergng X and X, X % s the set of ncluded regressors and Z s the correspondng set of excluded regressors. Fnally, let β ˆ be the OLS estmate of β obtaned by regressng Y on X %. The expected value of the mean of the β ˆ k across all regresson models s: k k ˆ ' - ' E E k β = β + ( ) k X% X% X% Zβ (6) = = Snce Z s are subsets of X, each Z s Nxj where j k. 9

11 From (6), we see that the expected value of the mean of the β ˆ s β when any of the followng condtons are met:. For each set of ncluded regressors, there s no covarance between the ncluded regressors and the correspondng set of excluded regressors (.e., ' XZ % =0 );. For some sets of ncluded regressors, there s a non-zero covarance between the ncluded ' regressors and the correspondng set of excluded regressors (.e., XZ % 0 ), but the XZ % =0); ' expected value of the covarances s zero (.e., E ( ) 3. For some sets of ncluded regressors, there s a non-zero covarance between the ncluded ' regressors and the correspondng set of excluded regressors (.e., XZ % 0 ), and the ' expected value of the covarances s non-zero (.e., E ( ) XZ % 0), but the expected covarance of the excluded regressors wth the dependent varable s zero (.e., E ( β ) =0); 4. None of the above holds, but the expected value of the product of () the covarances between the ncluded and excluded regressors and () the covarance of the excluded E XZβ % = 0 ). ' regressors wth the dependent varable s zero (.e., ( ) The ER procedure reles on the reasonable assumpton that, as the data number of factors n the data set ncreases, condtons (), (3), and (4) wll hold asymptotcally. If true, ths enables us to construct the cross-model ch-square statstc. 0

12 4. The Cross-Model Ch-Square Statstc Let us assume that the th (n a set of K) factor, x (where x s an Nx vector) has no structural relatonshp wth an outcome varable y. Consder the equaton: y = α + β x + β x + + β x + u (7)... k k Under the null hypothess of β = 0, the square of the rato of ˆ β to ts standard error s dstrbuted χ wth one degree of freedom. By addng factors to and removng factors (other than x ) from (7), we can obtan other estmates of β. Let ˆj β be the j th such estmate of β. By lookng at all combnatons of factors from the superset of K factors, we can construct K estmates of β. Under the null hypothess that β = 0 and assumng that the ˆj β are ndependent, we have: K ˆ β j ~ χ K (8) j= s β From (8), we can construct c, the cross-model ch-square statstc for the factor x : c K ˆ β j ~ χ K (9) j= s β = Gven the (typcally) large number of degrees of freedom nherent n the ER procedure, t s worth notng that the measure n (8) s lkely to be subject to Type II errors. In an attempt to compensate, we dvde by the number of degrees of freedom to obtan c, a relatve ch-square statstc, a measure that s less dependent on sample sze. Carmnes and McIver (98) and Klne (998) clam that one should conclude that the data represent a good ft to the hypothess when the relatve ch-square measure s less than 3. Ullman (00) recommends usng a ch-square less than.

13 Because the ˆj β are obtaned by explorng all combnatons of factors from a sngle superset, one mght expect the ˆj β to be correlated (partcularly when factors are postvely correlated), and for the correlaton to ncrease n the presence of stronger multcollnearty among the factors. 5. Monte-Carlo Tests of c To test the ablty of the cross-model ch-square statstc to dentfy factors that mght show statstcally sgnfcant slope coeffcents smply by random chance, consder an outcome varable, Y, that s determned by three factors as follows Y = α + β X + β X + β X + u (0) 3 3 where u s an error term satsfyng the requrements for the classcal lnear regresson model. Let us randomly generate addtonal factors X 4 through X 5, run the ER procedure and calculate c for each of the factors. The followng fgures show the results of the ER runs. The frst set of bars show results for ER runs appled to the superset of factors X through X 4. The results are derved as follows:. Generate 500 observatons for X, randomly selectng observatons from the unform dstrbuton.. Generate 500 observatons each for X through X 5 such that X = γ X + v where the γ are randomly selected from the standard normal dstrbuton and dstrbuted, and v are normally dstrbuted wth mean zero and varance 0.. Ths step creates varyng multcollnearty among the factors. 3. Generate Y accordng to (0) where α = β = β = β 3 =, and u s normally dstrbuted wth a varance of.

14 4. Run all K = 5 regresson models to obtan the K estmates for each β: ˆ β,..., ˆ β, ˆ β,..., ˆ β, ˆ β,..., ˆ β, ˆ β,..., ˆ β.,,8,,8 3, 3,8 4, 4,8 5. Calculate c, c, c 3, and c 4 accordng to (9). 6. Repeat steps through 5 three-thousand tmes. 7. Repeat steps through 6, each tme ncreasng the varance of u by untl the varance reaches 0. 3 At the completon of the algorthm, there wll be 60,000 measures for each of c, c, c 3, and c 4 based on (60,000) (5) = 900,000 separate regressons. We then repeat the procedure usng a superset of factors X through X 5, then X through X 6, etc. up to X through X 5. 4 Step ntroduces random multcollnearty among the factors. On average, the multcollnearty of factors wth X follows the pattern shown n Fgure 3. Approxmately half of the correlatons wth X are postve and half are negatve. Whle the correlatons are constructed between X and the other factors, ths wll also result n the other factors beng par-wse correlated though to a lesser extent (on average) than they are correlated wth X. 3 Ths results n an average R for the estmate of equaton (0) of approxmately Ths fnal pass requres the estmaton of bllon separate regressons. The entre Monte-Carlo run requres computaton equvalent to almost,000 CPU-hours. 3

15 has less than ths correlaton wth X % 0% 0% 30% 40% 50% 60% 70% 80% 90% 00% Ths proporton of factors Fgure 3. Pattern of Squared Correlatons of Factors X through X K wth X Fgure 4 shows the results of the Monte-Carlo runs n whch the crtcal value for the c s set to 3. For example, when there are four factors n the data set, c, c, and c 3 pass the sgnfcance test slghtly over 50% of the tme versus 0% for c 4. In other words, for data sets n whch three out of four factors determne the outcome varable (and a crtcal value of 3), the ER procedure wll dentfy the three determnng factors 50% of the tme and dentfy the nondetermnng factor 0% of the tme. As the number of factors n the data set ncreases, the ER procedure better dscrmnates between the factors that determne the outcome varable and those that mght appear sgnfcant by random chance alone. The last set of bars n Fgure 4 shows the results for data sets n whch three out of ffteen factors determne the outcome varable. Here, 4

16 the ER procedure dentfes the determnng factors approxmately 85% of the tme and (erroneously) dentfes non-determnng factors only slghtly more than 0% of the tme. 00% 90% Cross-Model Ch-Square Statstc Exceeds % 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 4. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value = 3) These results are based on the somewhat arbtrary selecton of 3 for the relatve chsquare crtcal value. Reducng the crtcal value to produces the results shown n Fgure 5. As expected, reducng the crtcal value causes the ncdence of false postves (where postve means dentfcaton of a determnng factor ) to approxmately 5%, but the ncdence of false negatves falls to below 0%. Fgure 6 and Fgure 7, where the crtcal value s set to.5 and, respectvely, are shown for comparson. As expected, the margnal gans n the reducton n false negatves (versus Fgure 5) appear to be outweghed by the ncrease n the rate of false postves. 5

17 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 5. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value = ) 6

18 00% 90% Cross-Model Ch-Square Statstc Exceeds.5 80% 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 6. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value =.5) 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 7. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value = ) 7

19 To measure the effect of the determnstc model s goodness of ft on the cross-model ch-square statstc, we can arrange the Monte-Carlo results accordng to the varance of the error term n (0). The algorthm vares the error term from to 0 n ncrements of. Fgure 8 shows the proporton of tmes that c passes the sgnfcance threshold of 3 for factors X, X, and X 3 (combned) for varous numbers of factors n the data set and for varous levels of error varance. An error varance of corresponds to an R (for the estmate of equaton (0)) of approxmately 0.93 whle an error varance of 0 corresponds to an R of approxmately % 90% Cross-Model Ch-Square Statstc Exceeds % 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure 8. Monte-Carlo Tests of ER Procedure for Factors X through X 3 (combned) As expected, as the error varance rses n a 5-factor data set, the probablty of a false negatve rses from approxmately 5% (when var(u) = ) to 30% (when var(u) = 0). Results are markedly worse for data sets wth fewer factors. Fgure 9 shows results for factors X 4 through 8

20 X 5, combned. Here, we see that the probablty of a false postve rses from under 5% (when var(u) = ) to almost 0% (when var(u) = 0) for 5-factor data sets. Employng a crtcal value of.0 yelds the results n Fgure 0 and Fgure. Comparng Fgure 8 and Fgure 9 wth Fgure 0 and Fgure, we see that employng the crtcal value of 3 versus cuts n half (approxmately) the lkelhoods of false postves and negatves. 00% 90% Cross-Model Ch-Square Statstc Exceeds % 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure 9. Monte-Carlo Tests of ER Procedure for Factors X 4 through X 5, X 0, and X 5 (combned) 9

21 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure 0. Monte-Carlo Tests of ER Procedure for Factors X through X 3 (combned) 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure. Monte-Carlo Tests of ER Procedure for Factors X 4 through X 5, X 0, and X 5 (combned) 0

22 6. Estmated ER (EER) Even wth the applcaton of super computaton, large data sets can make ASR and ER mpractcal. For example, t would take a full year for a top-of-the-lne 00,000 node cluster computer to perform ASR/ER on a 50-factor data set. When one consders data mnng just smple non-lnear transformatons of factors (nverse, square, logarthm), the 50-factor data set becomes a 50-factor data set. If one then consders data mnng two-factor cross-products (e.g., X X, X X 3, etc.), the 50-factor data set balloons to an,75-factor data set. Ths suggests that super computaton alone sn t enough to make ER a unversal tool for data mnng. One possble approach to usng ER wth large data sets s to employ estmated ER. Estmated ER (EER) randomly selects J (out of a possble K ) models to estmate. Note that EER does not select the factors randomly, but selects the models randomly. Selectng factors randomly bases the model selecton toward models wth a total of K/ factors. Selectng models randomly gves each of the K models an equal probablty of beng chosen. Fgure and Fgure 3 show the results of EER for varous numbers of randomly selected models for 5-, 0-, and 5-Factor data sets. These tests were performed as follows:. Generate 500 observatons for each of the factors X, randomly selectng observatons from the unform dstrbuton.. Generate 500 observatons each for X through X K (where K s 5, 0, or 5) such that X = γ X + v where the γ are randomly selected from the standard normal dstrbuton and dstrbuted, and v are normally dstrbuted wth mean zero and varance 0.. Ths step creates varyng multcollnearty among the factors. 3. Generate Y accordng to (0) where α = β = β = β 3 =, and u s normally dstrbuted wth a varance of.

23 4. Randomly select J models out of the possble K regresson models to obtan J estmates for each β. 5. Calculate c, c, c 3,, c K accordng to (9). 6. Repeat steps through 5 three-hundred tmes. 7. Calculate the percentage of tmes that the cross-model test statstcs for each β exceed the crtcal value. 8. Repeat steps through 7, for J runnng from to % Cross-Model Ch-Square Statstc Exceeds.0 95% 90% 85% 80% 75% 70% 65% 60% 55% 50% Number of Randomly Selected Models 5-Factor Data Sets 0-Factor Data Sets 5-Factor Data Sets Fgure. Monte-Carlo Tests of EER Procedure for Factors X through X 3 (combned) Evdence suggests that smaller data sets are more senstve to a small number of randomly selected models. For 5-factor data sets, the rate of false negatves (Fgure ) does not stop fallng sgnfcantly untl approxmately J = 30, and the rate of false postves (Fgure 3)

24 does not stop rsng sgnfcantly untl approxmately J = 45. The rates of false negatves (Fgure ) and false postves (Fgure 3) for 0-factor and 5-factor data sets appear to settle for lesser values of J. 80% Cross-Model Ch-Square Statstc Exceeds.0 70% 60% 50% 40% 30% 0% 0% Number of Randomly Selected Models 5-Factor Data Sets 0-Factor Data Sets 5-Factor Data Sets Fgure 3. Monte-Carlo Tests of EER Procedure for Factors X 4 through X 5, X 0, and X 5 (combned) The number of factors n the data set that determne the outcome varable has a greater effect on the number of randomly selected models requred n EER. Fgure 4 compares results for 5-factor data sets when the outcome varable s a functon of only one factor versus beng a functon of three factors. The vertcal axs measures the cross-model ch-squared statstc for the ndcated number of randomly selected models dvded by the average cross-model ch-squared statstc over all 00 runs. Fgure 4 shows that, for 5-factor data sets, as the number of randomly selected models ncreases, the cross-model ch-squared statstc approaches ts mean value for 3

25 the 00 runs faster when the outcome varable s a functon of three factors versus beng a functon of only one factor. Fgure 5 shows smlar results for 0-factor data sets. Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 4. EER Procedure for Factor X and X through X 3 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 4

26 Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 5. EER Procedure for Factor X and X through X 3 (combned) for 0-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 6. EER Procedure for Factor X and X through X 3 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 5

27 Results n Fgure 6 are less compellng, but not contradctory. A second result, common to the large data sets (Fgure 5 and Fgure 6), s that the varaton n the cross-model chsquared estmates s less when the outcome varance s a functon of three versus one factor. Fgure 7, Fgure 8, and Fgure 9 show correspondng results for factors that do not determne the outcome varable. These results suggest that EER may be an adequate procedure for estmatng ER for larger data sets. Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 7. EER Procedure for Factors X through X 5 (combned) and X 4 through X 5 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 6

28 Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 8. EER Procedure for Factors X through X 5 (combned) and X 4 through X 5 (combned) for 0-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 9. EER Procedure for Factors X through X 5 (combned) and X 4 through X 5 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 7

29 7. Comparson to Stepwse and k-fold Holdout Kuk (984) demonstrated that stepwse procedures are nferor to all subsets procedures n data mnng proportonal hazard models. Logcally, the same argument apples to data mnng regresson models. As stepwse examnes only a subset of models and ASR examnes all models, the best that stepwse can do s to match ASR. Because stepwse smartly samples on the bass of margnal changes to a ft functon, multcollnearty among the factors can cause stepwse to return a soluton that s a local, but not global, optmum. What s of nterest s a comparson of stepwse to EER because EER, lke stepwse, samples the space of possble regresson models. K-fold holdout s less an alternatve data mnng method than t s an alternatve objectve. Data mnng methods typcally have the objectve of fndng the model that best fts the data (typcally measured by mprovements to the F-statstc). K-fold holdout offers the alternatve objectve of maxmzng the model s ablty to predct observatons that were not ncluded n the model estmaton (.e., held out observatons). The selecton of whch observatons to hold out vares dependng on the data set. For example, n the case of tme seres data, t makes more sense to hold out observatons at the end of the data set. The followng tests use the same data set and apply EER, estmated all subets regresson (EASR) where we conduct a samplng of models rather than examne all possble models, and stepwse. The procedure s as follows:. Generate 500 observatons for X, randomly selectng observatons from the unform dstrbuton.. Generate 500 observatons each for X through X 5 such that X = γ X + v where the γ are randomly selected from the standard normal dstrbuton and dstrbuted, and v are 8

30 normally dstrbuted wth mean zero and varance 0.. Ths step creates varyng multcollnearty among the factors. 3. Generate Y accordng to (0) where α = β = β = β 3 =, and u s normally dstrbuted wth a varance of. 4. Perform EER and EASR k-fold holdout: a. Randomly select 500 models out of the possble 30 regresson models to obtan J estmates for each β. b. Evaluate the k-fold holdout crteron: c. Calculate c, c, c 3,, c 30 accordng to (9). d. Mark factors for whch c > as beng selected by EER. 5. Evaluate the EER models usng the k-fold holdout crteron: a. For each randomly selected model n step 4a, randomly select 50 observatons to exclude. b. Estmate the model usng the remanng 450 observatons. c. Use the estmated model to predct the 50 excluded observatons. d. Calculate the MSE (mean squared error) where MSE = 50 observaton predcted observaton excluded observatons ( ) e. Over the 500 models randomly selected by EER, dentfy the one for whch the MSE s least. Mark the factors that produce that model as beng selected by the k-fold holdout crteron. 6. Peform backward stepwse: a. Let M be the set of ncluded factors, and N be the set of excluded factors such that M = m, N = n, and m+ n= 30. 9

31 b. Estmate the model β and calculate the estmated model s X M Y = α + X + u adjusted multple correlaton coeffcent, R 0. c. For each factor X, =,, m, move the factor from set M to set N, estmate the model Y = α + β X + u X M, calculate the estmated model s adjusted multple correlaton coeffcent, result n the set of measures d. Let R ( L mn R0,..., Rm ) measure R, and then return the factor X j to M from N. Ths wll R,..., R m. =. Identfy the factor whose removal resulted n the R L. Move that factor from set M to set N. e. Gven the new set M, for each factor X, =,, m, remove the factor from set M, estmate the model Y = α + β X + u X M, calculate the estmated model s adjusted multple correlaton coeffcent, from N. Ths wll result n the set of measures R, and then return the factor X j to M R,..., R m. f. For each factor X, =,, n, move the factor from N to M, estmate the model Y = α + β X + u X M, calculate the estmated model s adjusted multple correlaton coeffcent, R, and then return the factor X to M from N. Ths wll result n the set of measures R,..., m R. + m+ n g. Let R ( L = mn R,..., R m + n) and R ( H max R,..., R m + n) removal resulted n the measure =. Identfy the factor whose R L. Move that factor from set M to set N. If was attaned usng a factor from N, move that factor from set N to set M. R H 30

32 h. Repeat steps e through g untl there s no further mprovement n R H.. Mark the factors n the model attaned n step 5h as beng selected by stepwse. 7. Repeat steps through 6 sx-hundred tmes. 8. Calculate the percentage of tmes that each factor s selected by EER, k-fold, and stepwse. Fgure 0 shows the results of ths comparson. For 30 factors, where the outcome varable s determned by the frst three factors, stepwse correctly dentfed the frst factor slghtly less frequently than dd EER (43% of the tme versus 48% of the tme). Stepwse correctly dentfed the second and thrd factors slghtly more frequently than dd EER (87% and 9% of the tme versus 8% and 85% of the tme). The k-fold crteron when appled to EASR dentfed the frst three factors 53%, 63%, and 63% of the tme, respectvely. In ths, the false postve error rate was comparable for EER and stepwse, and sgnfcantly worse for k-fold. EER erroneously dentfed the remanng factors as beng sgnfcant, on average, 7% of the tme versus 34% of the tme for stepwse and 49% of the tme for k-fold. Ths suggests that EER may be more powerful as a tool for elmnatng factors from consderaton. 3

33 00% 90% 80% Proporton of Tmes Procedure Selects the Factor 70% 60% 50% 40% 30% 0% 0% 0% Factors EER Stepwse EASR Wth k-fold Crteron Fgure 0. Comparson of EER, Stepwse, and k-fold Crteron 8. Applcablty to Other Procedures and Drawbacks Ordnary least squares estmators belong to the class of maxmum lkelhood estmators. The estmates are unbased (.e., E( ˆ β ) arbtrarly small ε), and effcent (.e., var ( ˆ β ) var ( β ) = β ), consstent (.e., lm Pr ( ˆ β β > ε) = 0 for an N < % where β % s any lnear, unbased estmator of β. The ER procedure reles on the fact that parameter estmators are unbased n extraneous varable cases though based n varyng drectons across the omtted varable cases. 3

34 In the case of lmted dependent varable models (e.g., logt), parameter estmates are unbased and consstent when the model s correctly specfed. However, n the omtted varable case, logt slope coeffcents are based toward zero. For example, suppose the outcome varable, * Y, s determned by a latent varable Y * ff Y 0 such that Y = >. Suppose also that Y * s 0 otherwse determned by the equaton = α + β + β +. If we omt the factor X from the (logt) * Y X X u regresson model, we estmate * o Y α β X u = + +. Yatchew and Grlches (985) show that β β = < β o βσ X + σ u () As the denomnator n () s strctly greater than zero, the estmator s based toward zero. More mportantly, the based estmator has the same sgn as the unbased estmator. Ths volates all four of condtons n (6), any one of whch would valdate the ER procedure. However, whle estmates of determnstc parameters are based downward, estmates of nondetermnstc parameters are randomly based. For example, f Y * s determned by the equaton * Y α βx βx u = and we estmate * o o Y α β X β3 X3 u = + + +, we obtan β β = < β o βσ X + σ u () β o 3 β 3 = = βσ X + σ u 0 (3) Because X 3 s extraneous, β 3 s zero and so repeated estmates of ˆ β o 3 wll be randomly dstrbuted around zero. In short, parameter estmates for determnstc factors wll be: () unbased n the 33

35 correctly specfed case, () unbased n the extraneous varable case, and (3) based toward zero n the omtted varable case. But, parameter estmates for spurous factors wll be unbased toward zero n all three cases. Hence, we can expect the cross-model ch-square statstcs to work n the case of logt models, though lkely n an asymptotc sense. One drawback of the ER (and EER) procedure s that the procedure may select a set of factors that, whle ndvdually passng the cross-model ch-square test, are not statstcally sgnfcant when run together n a regresson model. One possble explanaton s that, wthn the confnes of a sngle regresson model, the error varance s large enough to drown out the explanatory power of a factor but, because the cross-model ch-square statstc s based on crossmodel nformaton that has estmated and fltered out the error term, the factor appears sgnfcant. Ths s analogous to the gan n nformaton obtaned from employng panel data versus tme seres data (cf., Daves, 006). A tme seres data set may measure the same phenomenon over the same tme perod as a panel data set, but because the panel data set also measures the phenomenon over multple cross-sectons, nformaton across cross-sectons can be used to mtgate the error varance wthn a gven tme perod. 9. Concluson The purpose of ths analyss s to buld on ASR by proposng a new procedure, ER, that combnes ASR wth a cross-model ch-square statstc that attempts to dstngush determnstc factors from spurous factors. Recognzng that, for large data sets, even ER s nfeasble even wth the applcaton of super computaton, ths analyss further proposes an adaptaton on ER, EER, whch samples the space of possble regresson models. 34

36 Ths analyss () proves that, under condtons less strngent than the classcal lnear model condtons, the cross-model ch-square statstc s a reasonable measure for dstngushng between determnstc and spurous factors, () demonstrates va Monte-Carlo studes the lkelhood of ER produce false postve and false negatve results under condtons of varyng numbers of factors n the data set, varyng varance of the regresson error, and varyng choce of crtcal value, (3) proposes a procedure, EER, that estmates ER results va samplng a subset of the space of possble regresson models, (4) demonstrates va Monte-Carlo studes the lkelhood of EER producng false postve and false negatve results under condtons of varyng sample sze, and varyng number of factors n the determnstc equaton, (5) compares EER model selecton wth backward stepwse and the k-fold crteron by va Monte-Carlo studes that apply the same data sets to all three procedures, and (6) demonstrates that EER avods false postves and false negatves better than the k-fold crteron, avods false postves approxmately as well as stepwse, and avods false negatves sgnfcantly better than stepwse. 35

37 0. References Carmnes, E.G., and J.P. McIver, 98. Analyzng models wth unobserved varables: Analyss of covarance structures, n Bohmstedt. In G.W. and E.F. Borgatta, eds., Socal Measurement. Sage Publcatons: Thousand Oaks, CA. pp Daves, A., 006. A framework for decomposng shocks and measurng volatltes derved from mult-dmensonal panel data of survey forecasts. Internatonal Journal of Forecastng, (): Klne, R.B., 998. Prncples and practce of structural equaton modelng. Gulford Press: New York. Kuk, A.C., 984. All subsets regresson n proportonal hazard models. Bometrka, 7(3): Ullman, J.B., 00. Structural equaton modelng. In Tabachnck, B.G. and L.S. Fdell, eds., Usng Multvarate Statstcs. Allyn and Bacon: Needham Heghts, MA. pp Yatchew, A. and Z. Grlches, 985. Specfcaton error n probt models. The Revew of Economcs and Statstcs, 67:

THE TITANIC SHIPWRECK: WHO WAS

THE TITANIC SHIPWRECK: WHO WAS THE TITANIC SHIPWRECK: WHO WAS MOST LIKELY TO SURVIVE? A STATISTICAL ANALYSIS Ths paper examnes the probablty of survvng the Ttanc shpwreck usng lmted dependent varable regresson analyss. Ths appled analyss

More information

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

What is Candidate Sampling

What is Candidate Sampling What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble

More information

Questions that we may have about the variables

Questions that we may have about the variables Antono Olmos, 01 Multple Regresson Problem: we want to determne the effect of Desre for control, Famly support, Number of frends, and Score on the BDI test on Perceved Support of Latno women. Dependent

More information

The Analysis of Outliers in Statistical Data

The Analysis of Outliers in Statistical Data THALES Project No. xxxx The Analyss of Outlers n Statstcal Data Research Team Chrysses Caron, Assocate Professor (P.I.) Vaslk Karot, Doctoral canddate Polychrons Economou, Chrstna Perrakou, Postgraduate

More information

9.1 The Cumulative Sum Control Chart

9.1 The Cumulative Sum Control Chart Learnng Objectves 9.1 The Cumulatve Sum Control Chart 9.1.1 Basc Prncples: Cusum Control Chart for Montorng the Process Mean If s the target for the process mean, then the cumulatve sum control chart s

More information

b) The mean of the fitted (predicted) values of Y is equal to the mean of the Y values: c) The residuals of the regression line sum up to zero: = ei

b) The mean of the fitted (predicted) values of Y is equal to the mean of the Y values: c) The residuals of the regression line sum up to zero: = ei Mathematcal Propertes of the Least Squares Regresson The least squares regresson lne obeys certan mathematcal propertes whch are useful to know n practce. The followng propertes can be establshed algebracally:

More information

The covariance is the two variable analog to the variance. The formula for the covariance between two variables is

The covariance is the two variable analog to the variance. The formula for the covariance between two variables is Regresson Lectures So far we have talked only about statstcs that descrbe one varable. What we are gong to be dscussng for much of the remander of the course s relatonshps between two or more varables.

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ). REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or

More information

CHAPTER 14 MORE ABOUT REGRESSION

CHAPTER 14 MORE ABOUT REGRESSION CHAPTER 14 MORE ABOUT REGRESSION We learned n Chapter 5 that often a straght lne descrbes the pattern of a relatonshp between two quanttatve varables. For nstance, n Example 5.1 we explored the relatonshp

More information

HYPOTHESIS TESTING OF PARAMETERS FOR ORDINARY LINEAR CIRCULAR REGRESSION

HYPOTHESIS TESTING OF PARAMETERS FOR ORDINARY LINEAR CIRCULAR REGRESSION HYPOTHESIS TESTING OF PARAMETERS FOR ORDINARY LINEAR CIRCULAR REGRESSION Abdul Ghapor Hussn Centre for Foundaton Studes n Scence Unversty of Malaya 563 KUALA LUMPUR E-mal: ghapor@umedumy Abstract Ths paper

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

Quality Adjustment of Second-hand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index

Quality Adjustment of Second-hand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index Qualty Adustment of Second-hand Motor Vehcle Applcaton of Hedonc Approach n Hong Kong s Consumer Prce Index Prepared for the 14 th Meetng of the Ottawa Group on Prce Indces 20 22 May 2015, Tokyo, Japan

More information

Introduction to Regression

Introduction to Regression Introducton to Regresson Regresson a means of predctng a dependent varable based one or more ndependent varables. -Ths s done by fttng a lne or surface to the data ponts that mnmzes the total error. -

More information

Binary Dependent Variables. In some cases the outcome of interest rather than one of the right hand side variables is discrete rather than continuous

Binary Dependent Variables. In some cases the outcome of interest rather than one of the right hand side variables is discrete rather than continuous Bnary Dependent Varables In some cases the outcome of nterest rather than one of the rght hand sde varables s dscrete rather than contnuous The smplest example of ths s when the Y varable s bnary so that

More information

x f(x) 1 0.25 1 0.75 x 1 0 1 1 0.04 0.01 0.20 1 0.12 0.03 0.60

x f(x) 1 0.25 1 0.75 x 1 0 1 1 0.04 0.01 0.20 1 0.12 0.03 0.60 BIVARIATE DISTRIBUTIONS Let be a varable that assumes the values { 1,,..., n }. Then, a functon that epresses the relatve frequenc of these values s called a unvarate frequenc functon. It must be true

More information

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6 PAR TESTS If a WEIGHT varable s specfed, t s used to replcate a case as many tmes as ndcated by the weght value rounded to the nearest nteger. If the workspace requrements are exceeded and samplng has

More information

Inequality and The Accounting Period. Quentin Wodon and Shlomo Yitzhaki. World Bank and Hebrew University. September 2001.

Inequality and The Accounting Period. Quentin Wodon and Shlomo Yitzhaki. World Bank and Hebrew University. September 2001. Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

I. SCOPE, APPLICABILITY AND PARAMETERS Scope

I. SCOPE, APPLICABILITY AND PARAMETERS Scope D Executve Board Annex 9 Page A/R ethodologcal Tool alculaton of the number of sample plots for measurements wthn A/R D project actvtes (Verson 0) I. SOPE, PIABIITY AD PARAETERS Scope. Ths tool s applcable

More information

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis The Development of Web Log Mnng Based on Improve-K-Means Clusterng Analyss TngZhong Wang * College of Informaton Technology, Luoyang Normal Unversty, Luoyang, 471022, Chna wangtngzhong2@sna.cn Abstract.

More information

Multivariate EWMA Control Chart

Multivariate EWMA Control Chart Multvarate EWMA Control Chart Summary The Multvarate EWMA Control Chart procedure creates control charts for two or more numerc varables. Examnng the varables n a multvarate sense s extremely mportant

More information

Statistical Methods to Develop Rating Models

Statistical Methods to Develop Rating Models Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Forecasting the Direction and Strength of Stock Market Movement

Forecasting the Direction and Strength of Stock Market Movement Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems

More information

SIMPLE LINEAR CORRELATION

SIMPLE LINEAR CORRELATION SIMPLE LINEAR CORRELATION Smple lnear correlaton s a measure of the degree to whch two varables vary together, or a measure of the ntensty of the assocaton between two varables. Correlaton often s abused.

More information

Regression Models for a Binary Response Using EXCEL and JMP

Regression Models for a Binary Response Using EXCEL and JMP SEMATECH 997 Statstcal Methods Symposum Austn Regresson Models for a Bnary Response Usng EXCEL and JMP Davd C. Trndade, Ph.D. STAT-TECH Consultng and Tranng n Appled Statstcs San Jose, CA Topcs Practcal

More information

The Analysis of Covariance. ERSH 8310 Keppel and Wickens Chapter 15

The Analysis of Covariance. ERSH 8310 Keppel and Wickens Chapter 15 The Analyss of Covarance ERSH 830 Keppel and Wckens Chapter 5 Today s Class Intal Consderatons Covarance and Lnear Regresson The Lnear Regresson Equaton TheAnalyss of Covarance Assumptons Underlyng the

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Study on CET4 Marks in China s Graded English Teaching

Study on CET4 Marks in China s Graded English Teaching Study on CET4 Marks n Chna s Graded Englsh Teachng CHE We College of Foregn Studes, Shandong Insttute of Busness and Technology, P.R.Chna, 264005 Abstract: Ths paper deploys Logt model, and decomposes

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University Tme Seres Analyss n Studes of AGN Varablty Bradley M. Peterson The Oho State Unversty 1 Lnear Correlaton Degree to whch two parameters are lnearly correlated can be expressed n terms of the lnear correlaton

More information

The Probit Model. Alexander Spermann. SoSe 2009

The Probit Model. Alexander Spermann. SoSe 2009 The Probt Model Aleander Spermann Unversty of Freburg SoSe 009 Course outlne. Notaton and statstcal foundatons. Introducton to the Probt model 3. Applcaton 4. Coeffcents and margnal effects 5. Goodness-of-ft

More information

Recurrence. 1 Definitions and main statements

Recurrence. 1 Definitions and main statements Recurrence 1 Defntons and man statements Let X n, n = 0, 1, 2,... be a MC wth the state space S = (1, 2,...), transton probabltes p j = P {X n+1 = j X n = }, and the transton matrx P = (p j ),j S def.

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

Evaluating credit risk models: A critique and a new proposal

Evaluating credit risk models: A critique and a new proposal Evaluatng credt rsk models: A crtque and a new proposal Hergen Frerchs* Gunter Löffler Unversty of Frankfurt (Man) February 14, 2001 Abstract Evaluatng the qualty of credt portfolo rsk models s an mportant

More information

Linear Regression, Regularization Bias-Variance Tradeoff

Linear Regression, Regularization Bias-Variance Tradeoff HTF: Ch3, 7 B: Ch3 Lnear Regresson, Regularzaton Bas-Varance Tradeoff Thanks to C Guestrn, T Detterch, R Parr, N Ray 1 Outlne Lnear Regresson MLE = Least Squares! Bass functons Evaluatng Predctors Tranng

More information

Solution of Algebraic and Transcendental Equations

Solution of Algebraic and Transcendental Equations CHAPTER Soluton of Algerac and Transcendental Equatons. INTRODUCTION One of the most common prolem encountered n engneerng analyss s that gven a functon f (, fnd the values of for whch f ( = 0. The soluton

More information

H 1 : at least one is not zero

H 1 : at least one is not zero Chapter 6 More Multple Regresson Model The F-test Jont Hypothess Tests Consder the lnear regresson equaton: () y = β + βx + βx + β4x4 + e for =,,..., N The t-statstc gve a test of sgnfcance of an ndvdual

More information

Portfolio Loss Distribution

Portfolio Loss Distribution Portfolo Loss Dstrbuton Rsky assets n loan ortfolo hghly llqud assets hold-to-maturty n the bank s balance sheet Outstandngs The orton of the bank asset that has already been extended to borrowers. Commtment

More information

Analysis of Premium Liabilities for Australian Lines of Business

Analysis of Premium Liabilities for Australian Lines of Business Summary of Analyss of Premum Labltes for Australan Lnes of Busness Emly Tao Honours Research Paper, The Unversty of Melbourne Emly Tao Acknowledgements I am grateful to the Australan Prudental Regulaton

More information

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification Lecture 4: More classfers and classes C4B Machne Learnng Hlary 20 A. Zsserman Logstc regresson Loss functons revsted Adaboost Loss functons revsted Optmzaton Multple class classfcaton Logstc Regresson

More information

CHAPTER 7 THE TWO-VARIABLE REGRESSION MODEL: HYPOTHESIS TESTING

CHAPTER 7 THE TWO-VARIABLE REGRESSION MODEL: HYPOTHESIS TESTING CHAPTER 7 THE TWO-VARIABLE REGRESSION MODEL: HYPOTHESIS TESTING QUESTIONS 7.1. (a) In the regresson contet, the method of least squares estmates the regresson parameters n such a way that the sum of the

More information

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES In ths chapter, we wll learn how to descrbe the relatonshp between two quanttatve varables. Remember (from Chapter 2) that the terms quanttatve varable

More information

+ + + - - This circuit than can be reduced to a planar circuit

+ + + - - This circuit than can be reduced to a planar circuit MeshCurrent Method The meshcurrent s analog of the nodeoltage method. We sole for a new set of arables, mesh currents, that automatcally satsfy KCLs. As such, meshcurrent method reduces crcut soluton to

More information

Moment of a force about a point and about an axis

Moment of a force about a point and about an axis 3. STATICS O RIGID BODIES In the precedng chapter t was assumed that each of the bodes consdered could be treated as a sngle partcle. Such a vew, however, s not always possble, and a body, n general, should

More information

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS Chrs Deeley* Last revsed: September 22, 200 * Chrs Deeley s a Senor Lecturer n the School of Accountng, Charles Sturt Unversty,

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

Logistic Regression. Steve Kroon

Logistic Regression. Steve Kroon Logstc Regresson Steve Kroon Course notes sectons: 24.3-24.4 Dsclamer: these notes do not explctly ndcate whether values are vectors or scalars, but expects the reader to dscern ths from the context. Scenaro

More information

Economic Interpretation of Regression. Theory and Applications

Economic Interpretation of Regression. Theory and Applications Economc Interpretaton of Regresson Theor and Applcatons Classcal and Baesan Econometrc Methods Applcaton of mathematcal statstcs to economc data for emprcal support Economc theor postulates a qualtatve

More information

MAPP. MERIS level 3 cloud and water vapour products. Issue: 1. Revision: 0. Date: 9.12.1998. Function Name Organisation Signature Date

MAPP. MERIS level 3 cloud and water vapour products. Issue: 1. Revision: 0. Date: 9.12.1998. Function Name Organisation Signature Date Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

ErrorPropagation.nb 1. Error Propagation

ErrorPropagation.nb 1. Error Propagation ErrorPropagaton.nb Error Propagaton Suppose that we make observatons of a quantty x that s subject to random fluctuatons or measurement errors. Our best estmate of the true value for ths quantty s then

More information

2.4 Bivariate distributions

2.4 Bivariate distributions page 28 2.4 Bvarate dstrbutons 2.4.1 Defntons Let X and Y be dscrete r.v.s defned on the same probablty space (S, F, P). Instead of treatng them separately, t s often necessary to thnk of them actng together

More information

The OC Curve of Attribute Acceptance Plans

The OC Curve of Attribute Acceptance Plans The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4

More information

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by 6 CHAPTER 8 COMPLEX VECTOR SPACES 5. Fnd the kernel of the lnear transformaton gven n Exercse 5. In Exercses 55 and 56, fnd the mage of v, for the ndcated composton, where and are gven by the followng

More information

CS 2750 Machine Learning. Lecture 17a. Clustering. CS 2750 Machine Learning. Clustering

CS 2750 Machine Learning. Lecture 17a. Clustering. CS 2750 Machine Learning. Clustering Lecture 7a Clusterng Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square Clusterng Groups together smlar nstances n the data sample Basc clusterng problem: dstrbute data nto k dfferent groups such that

More information

Chapter 7. Random-Variate Generation 7.1. Prof. Dr. Mesut Güneş Ch. 7 Random-Variate Generation

Chapter 7. Random-Variate Generation 7.1. Prof. Dr. Mesut Güneş Ch. 7 Random-Variate Generation Chapter 7 Random-Varate Generaton 7. Contents Inverse-transform Technque Acceptance-Rejecton Technque Specal Propertes 7. Purpose & Overvew Develop understandng of generatng samples from a specfed dstrbuton

More information

1 Approximation Algorithms

1 Approximation Algorithms CME 305: Dscrete Mathematcs and Algorthms 1 Approxmaton Algorthms In lght of the apparent ntractablty of the problems we beleve not to le n P, t makes sense to pursue deas other than complete solutons

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall SP 2005-02 August 2005 Staff Paper Department of Appled Economcs and Management Cornell Unversty, Ithaca, New York 14853-7801 USA Farm Savngs Accounts: Examnng Income Varablty, Elgblty, and Benefts Brent

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Mcroarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 6 Some Advanced Topcs Dr. Petr Nazarov 14-01-013 petr.nazarov@crp-sante.lu Statstcal data analyss n Ecel. 6. Some advanced topcs Correcton for

More information

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña Proceedngs of the 2008 Wnter Smulaton Conference S. J. Mason, R. R. Hll, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION

More information

Capital asset pricing model, arbitrage pricing theory and portfolio management

Capital asset pricing model, arbitrage pricing theory and portfolio management Captal asset prcng model, arbtrage prcng theory and portfolo management Vnod Kothar The captal asset prcng model (CAPM) s great n terms of ts understandng of rsk decomposton of rsk nto securty-specfc rsk

More information

Communication Networks II Contents

Communication Networks II Contents 8 / 1 -- Communcaton Networs II (Görg) -- www.comnets.un-bremen.de Communcaton Networs II Contents 1 Fundamentals of probablty theory 2 Traffc n communcaton networs 3 Stochastc & Marovan Processes (SP

More information

U.C. Berkeley CS270: Algorithms Lecture 4 Professor Vazirani and Professor Rao Jan 27,2011 Lecturer: Umesh Vazirani Last revised February 10, 2012

U.C. Berkeley CS270: Algorithms Lecture 4 Professor Vazirani and Professor Rao Jan 27,2011 Lecturer: Umesh Vazirani Last revised February 10, 2012 U.C. Berkeley CS270: Algorthms Lecture 4 Professor Vazran and Professor Rao Jan 27,2011 Lecturer: Umesh Vazran Last revsed February 10, 2012 Lecture 4 1 The multplcatve weghts update method The multplcatve

More information

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB.

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. INDEX 1. Load data usng the Edtor wndow and m-fle 2. Learnng to save results from the Edtor wndow. 3. Computng the Sharpe Rato 4. Obtanng the Treynor Rato

More information

Describing Communities. Species Diversity Concepts. Species Richness. Species Richness. Species-Area Curve. Species-Area Curve

Describing Communities. Species Diversity Concepts. Species Richness. Species Richness. Species-Area Curve. Species-Area Curve peces versty Concepts peces Rchness peces-area Curves versty Indces - mpson's Index - hannon-wener Index - rlloun Index peces Abundance Models escrbng Communtes There are two mportant descrptors of a communty:

More information

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo

More information

II. PROBABILITY OF AN EVENT

II. PROBABILITY OF AN EVENT II. PROBABILITY OF AN EVENT As ndcated above, probablty s a quantfcaton, or a mathematcal model, of a random experment. Ths quantfcaton s a measure of the lkelhood that a gven event wll occur when the

More information

A Probabilistic Theory of Coherence

A Probabilistic Theory of Coherence A Probablstc Theory of Coherence BRANDEN FITELSON. The Coherence Measure C Let E be a set of n propostons E,..., E n. We seek a probablstc measure C(E) of the degree of coherence of E. Intutvely, we want

More information

PROFIT RATIO AND MARKET STRUCTURE

PROFIT RATIO AND MARKET STRUCTURE POFIT ATIO AND MAKET STUCTUE By Yong Yun Introducton: Industral economsts followng from Mason and Ban have run nnumerable tests of the relaton between varous market structural varables and varous dmensons

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information

Diagnostic Tests of Cross Section Independence for Nonlinear Panel Data Models

Diagnostic Tests of Cross Section Independence for Nonlinear Panel Data Models DISCUSSION PAPER SERIES IZA DP No. 2756 Dagnostc ests of Cross Secton Independence for Nonlnear Panel Data Models Cheng Hsao M. Hashem Pesaran Andreas Pck Aprl 2007 Forschungsnsttut zur Zukunft der Arbet

More information

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services An Evaluaton of the Extended Logstc, Smple Logstc, and Gompertz Models for Forecastng Short Lfecycle Products and Servces Charles V. Trappey a,1, Hsn-yng Wu b a Professor (Management Scence), Natonal Chao

More information

Binomial Link Functions. Lori Murray, Phil Munz

Binomial Link Functions. Lori Murray, Phil Munz Bnomal Lnk Functons Lor Murray, Phl Munz Bnomal Lnk Functons Logt Lnk functon: ( p) p ln 1 p Probt Lnk functon: ( p) 1 ( p) Complentary Log Log functon: ( p) ln( ln(1 p)) Motvatng Example A researcher

More information

PRIVATE SCHOOL CHOICE: THE EFFECTS OF RELIGIOUS AFFILIATION AND PARTICIPATION

PRIVATE SCHOOL CHOICE: THE EFFECTS OF RELIGIOUS AFFILIATION AND PARTICIPATION PRIVATE SCHOOL CHOICE: THE EFFECTS OF RELIIOUS AFFILIATION AND PARTICIPATION Danny Cohen-Zada Department of Economcs, Ben-uron Unversty, Beer-Sheva 84105, Israel Wllam Sander Department of Economcs, DePaul

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

Lecture 10: Linear Regression Approach, Assumptions and Diagnostics

Lecture 10: Linear Regression Approach, Assumptions and Diagnostics Approach to Modelng I Lecture 1: Lnear Regresson Approach, Assumptons and Dagnostcs Sandy Eckel seckel@jhsph.edu 8 May 8 General approach for most statstcal modelng: Defne the populaton of nterest State

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM

GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM BARRIOT Jean-Perre, SARRAILH Mchel BGI/CNES 18.av.E.Beln 31401 TOULOUSE Cedex 4 (France) Emal: jean-perre.barrot@cnes.fr 1/Introducton The

More information

An Empirical Study of Search Engine Advertising Effectiveness

An Empirical Study of Search Engine Advertising Effectiveness An Emprcal Study of Search Engne Advertsng Effectveness Sanjog Msra, Smon School of Busness Unversty of Rochester Edeal Pnker, Smon School of Busness Unversty of Rochester Alan Rmm-Kaufman, Rmm-Kaufman

More information

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA*

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* Luísa Farnha** 1. INTRODUCTION The rapd growth n Portuguese households ndebtedness n the past few years ncreased the concerns that debt

More information

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001 Proceedngs of the Annual Meetng of the Amercan Statstcal Assocaton, August 5-9, 2001 LIST-ASSISTED SAMPLING: THE EFFECT OF TELEPHONE SYSTEM CHANGES ON DESIGN 1 Clyde Tucker, Bureau of Labor Statstcs James

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

Adaptive Clinical Trials Incorporating Treatment Selection and Evaluation: Methodology and Applications in Multiple Sclerosis

Adaptive Clinical Trials Incorporating Treatment Selection and Evaluation: Methodology and Applications in Multiple Sclerosis Adaptve Clncal Trals Incorporatng Treatment electon and Evaluaton: Methodology and Applcatons n Multple cleross usan Todd, Tm Frede, Ngel tallard, Ncholas Parsons, Elsa Valdés-Márquez, Jeremy Chataway

More information

BERNSTEIN POLYNOMIALS

BERNSTEIN POLYNOMIALS On-Lne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful

More information

An Analysis of Factors Influencing the Self-Rated Health of Elderly Chinese People

An Analysis of Factors Influencing the Self-Rated Health of Elderly Chinese People Open Journal of Socal Scences, 205, 3, 5-20 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/ss http://dx.do.org/0.4236/ss.205.35003 An Analyss of Factors Influencng the Self-Rated Health of

More information

v a 1 b 1 i, a 2 b 2 i,..., a n b n i.

v a 1 b 1 i, a 2 b 2 i,..., a n b n i. SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 455 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces we have studed thus far n the text are real vector spaces snce the scalars are

More information

Statistical algorithms in Review Manager 5

Statistical algorithms in Review Manager 5 Statstcal algorthms n Reve Manager 5 Jonathan J Deeks and Julan PT Hggns on behalf of the Statstcal Methods Group of The Cochrane Collaboraton August 00 Data structure Consder a meta-analyss of k studes

More information

Nonlinear data mapping by neural networks

Nonlinear data mapping by neural networks Nonlnear data mappng by neural networks R.P.W. Dun Delft Unversty of Technology, Netherlands Abstract A revew s gven of the use of neural networks for nonlnear mappng of hgh dmensonal data on lower dmensonal

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

Solution: Let i = 10% and d = 5%. By definition, the respective forces of interest on funds A and B are. i 1 + it. S A (t) = d (1 dt) 2 1. = d 1 dt.

Solution: Let i = 10% and d = 5%. By definition, the respective forces of interest on funds A and B are. i 1 + it. S A (t) = d (1 dt) 2 1. = d 1 dt. Chapter 9 Revew problems 9.1 Interest rate measurement Example 9.1. Fund A accumulates at a smple nterest rate of 10%. Fund B accumulates at a smple dscount rate of 5%. Fnd the pont n tme at whch the forces

More information

ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C White Emerson Process Management

ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C White Emerson Process Management ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C Whte Emerson Process Management Abstract Energy prces have exhbted sgnfcant volatlty n recent years. For example, natural gas prces

More information

An empirical study for credit card approvals in the Greek banking sector

An empirical study for credit card approvals in the Greek banking sector An emprcal study for credt card approvals n the Greek bankng sector Mara Mavr George Ioannou Bergamo, Italy 17-21 May 2004 Management Scences Laboratory Department of Management Scence & Technology Athens

More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Sldes Prepared JOHN S. LOUCKS St. Edward s Unverst Slde Chapter 4 Smple Lnear Regresson Smple Lnear Regresson Model Least Squares Method Coeffcent of Determnaton Model Assumptons Testng for Sgnfcance Usng

More information