Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Size: px
Start display at page:

Download "Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation"

Transcription

1 Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The Mercatus Center George Mason Unversty Arlngton, VA 0 antony@antoln-daves.com RPF Workng Paper No August 8, 008 RESEARCH PROGRAM ON FORECASTING Center of Economc Research Department of Economcs The George Washngton Unversty Washngton, DC Research Program on Forecastng (RPF) Workng Papers represent prelmnary work crculated for comment and dscusson. Please contact the author(s) before ctng ths paper n any publcatons. The vews expressed n RPF Workng Papers are solely those of the author(s) and do not necessarly represent the vews of RPF or George Washngton Unversty.

2 Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The Mercatus Center George Mason Unversty Arlngton, VA 0 antony@antoln-daves.com

3 Regresson analyss s ntended to be used when the researcher seeks to test a gven hypothess aganst a data set. Unfortunately, n many applcatons t s ether not possble to specfy a hypothess, typcally because the research s n a very early stage, or t s not desrable to form a hypothess, typcally because the number of potental explanatory varables s very large. In these cases, researchers have resorted ether to overt data mnng technques such as stepwse regresson, or covert data mnng technques such as runnng varatons on regresson models pror to runnng the fnal model (also known as data peekng ). Whle data mnng sde-steps the need to form a hypothess, t s hghly susceptble to generatng spurous results. Ths paper draws on the known propertes of OLS estmators n the presence of omtted and extraneous varable models to propose a procedure for data mnng that attempts to dstngush between parameter estmates that are sgnfcant due to an underlyng structural relatonshp and those that are sgnfcant due to random chance.

4 . Background Regresson analyss s desgned to estmate the probablty of observng a gven data set gven that a pre-determned hypothess about the relatonshp between an outcome varable and a set of factors s assumed to be true. Unfortunately, n many applcatons t s not possble to specfy a hypothess. Two possble reasons why a researcher would want to perform analyss absent a hypothess are: Large Data Scope: Data sze refers to the number of observatons n a data set. Data scope refers to the number of potental explanatory factors ( canddate factors ) n the data set. In the case of large data scope (e.g., economc and fnancal data sets), the number of canddate factors s so large that the cost of formng a tractable hypothess s prohbtve. Early Stage Analyss: In the case of early stage analyss (e.g., clncal data), there has not been enough observaton to as yet for a hypothess. To date, stepwse regresson (SR) has been one of the more wdely used technques for performng these hypothess-less analyses. SR methods perform a smart samplng of regresson models n an attempt to fnd a regresson model that best fts the data. The procedure for smartly samplng s ad-hoc. Two typcally used procedures are backward (wheren the frst model ncludes all factors and factors are removed one at a tme) and forward (wheren the frst model ncludes no factors and factors are added one at a tme). SR procedures contan three major flaws: Samplng sze: The number of regresson models that can be constructed from a gven data set can be ncredbly large. For example, wth only 30 canddate 3

5 factors one could construct more than bllon regresson models ( 30 ). Stepwse procedures sample only a small number (typcally less than 00) of the set of possble regresson models. Whle stepwse methods can fnd models that ft the data reasonably well, as the number of factors rses, the probablty of stepwse methods fndng the best-ft model s vrtually zero. Ft crteron: In evaluatng competng models, stepwse methods typcally employ an F-statstc crteron. Ths crteron causes stepwse to methods to seek out the model that comes closest to explanng the data set. However, as the number of canddate factors ncreases, what also ncreases s the probablty of a gven factor addng explanatory power smply by random chance. Thus, the stepwse ft crteron cannot dstngush between factors that contrbute explanatory power to the outcome varable because of an underlyng relatonshp ( determnstc factors ) and those that contrbute by random chance only ( spurous factors ). Intal condton: Because SR only examnes a small subset of the space of possble models and because the space of fts of the models (potentally) contans many local optma, the soluton SR returns vares based on the startng pont of the search. For example, for the same data set, SR backward and SR forward can yeld dfferent results. As the startng ponts for SR backward and SR forward are arbtrary, there are many other potental startng ponts each of whch potentally yelds a dfferent result. For example, Fgure depcts the space of possble regresson models that can be formed usng K factors. Each block represents one regresson model. The shade of the block ndcates the qualty of the model (e.g., goodness of ft). A stepwse procedure that starts at model A 4

6 evaluates models n the vcnty of A, moves to the best model, then re-evaluates n the new vcnty. Ths contnues untl the procedure cannot fnd a better model n the vcnty. In ths example, stepwse would move along the ndcated path startng at pont A. Were stepwse to start at pont B, however, t would end up at a dfferent optmal model. Out of the four startng ponts shown, only startng pont D fnds the best model. Fgure. Results from stepwse procedures are dependent on the ntal condton from whch the search commences.. All Subsets Regresson All Subsets Regresson (ASR) s a procedure ntended to be used when a researcher wants to perform analyss n the absence of a hypothess and wants to avod the samplng sze problem nherent n stepwse procedures. For K potental explanatory factors, ASR examnes all K lnear models that can be constructed. Untl recently, ASR has been nfeasble due to the massve computng power requred. By employng grd-enabled super computaton, t s now 5

7 feasble to conduct ASR for moderately szed data sets. For example, as of today an average computer would requre nearly 00 years of contnuous work to perform ASR on a data set contanng 40 canddate factors (see Fgure ), whle a 0,000 node grd computer could complete the same analyss n less than a week. Whle ASR solves the samplng sze and ntal condton problems nherent n SR, ASR remans subject to the ft crteron problem. If anythng, the ft crteron s more of a problem for ASR as the procedure examnes a much larger space of models than does SR and therefore s more lkely to fnd spurous factors. Fgure. The tme a sngle computer requres to perform ASR rses exponentally wth the number of canddate factors. 6

8 3. Exhaustve Regresson Exhaustve Regresson (ER) utlzes the ASR procedure, but attempts to dentfy spurous factors va a cross-model ch-square statstc that tests for stablty n parameter estmates across models. The cross-model stablty test compares parameter estmates for each factor across all K models n whch the factor appears. Factors whose parameter estmates yeld sgnfcantly dfferent results across models are dentfed as spurous. A gven factor can exst n one of three types of models: omtted factor model, correctly specfed model, and extraneous factor model. A correctly specfed model contans all of the explanatory factors (.e., the factors contrbute to explanng the outcome varable because of some underlyng relatonshp between the factor and the outcome) and no other factors. An omtted varable model ncludes at least one, but not all explanatory factors and (possbly) other factors. An extraneous varable model ncludes all explanatory factors and at least one other factor. In the cases of the correctly specfed model and the extraneous varable model, estmates of parameters assocated wth the factors ( slope coeffcents ) are unbased. In the case of the omtted varable model, however, estmates of the slope coeffcents are based and the drecton of bas s a functon of the covarances of the omtted factor wth the outcome varable and the omtted factor wth the factors ncluded n the model. For example, consder a data set contanng k +k +k 3 factors, for each of whch there are N observatons. Let the factors be arranged nto three Nxk j, j={,,3} matrces X, X, and X 3, and let the sets of slope coeffcents assocated wth each set of factors be the k j x vectors β, β, and β 3, respectvely. Let Y be an Nx vector of observatons on the outcome varable. Suppose that, unknown to the researcher, the process that determnes the outcome varable s Assumng, of course, that the remanng classcal lnear model assumptons hold. 7

9 Y=Xβ +Xβ +u () where u s an Nx vector of ndependently and dentcally normally dstrbuted errors. The three cases are attaned when we apply the ordnary least squares (OLS) procedure to estmate the followng models: Omtted Varable Case: Y=Xβ + u O Correctly Specfed Case: Y=Xβ +Xβ +u Extraneous Varable Case: Y=Xβ +Xβ +Xβ 3 3 +u E In the correctly specfed case, the ordnary least squares regresson procedure yelds the slope estmate: βˆ - ' ' = ([ X X] [ X X] ) [ X X] Y ˆ β () where the square brackets ndcate a parttoned matrx. Substtutng the defnton for Y from () nto (), t can be shown that the expected values of the slope estmates equal the true slope values: βˆ β E = ˆ β β Ths s the unbasedness condton. Smlarly, n the extraneous varable case, we have: βˆ ' - ' β ˆ = ([ X X X3] [ X X X3] ) [ X X X3] Y ˆ β3 (3) Agan, t can be shown that the expected values of the slope estmates equal the true slope values: 8

10 By contrast, n the omtted varable case, we have Substtutng () nto (4), we have and the expected values of the slope estmates are: βˆ β E ˆ β = β ˆ β3 0 ' ( ) - ˆ ' β = X X X Y (4) ' - ' ( ) ( ) β ˆ = X X X X β +X β +u E ( ˆ ' ) ( ) - ' β = β + X X X X β β (5) From (5), we see that the expected value of the slope estmates n the omtted varable case are based and that the drecton of the bas depends on ' XXβ. We can construct a sequence of omtted varable cases as follows. Let { Z,Z,..., Z } be the set of all (column-wse unque) subsets of X. Let X % be the set of k regressors formed by mergng X and X and then removng Z so that, from the superset of regressors formed by mergng X and X, X % s the set of ncluded regressors and Z s the correspondng set of excluded regressors. Fnally, let β ˆ be the OLS estmate of β obtaned by regressng Y on X %. The expected value of the mean of the β ˆ k across all regresson models s: k k ˆ ' - ' E E k β = β + ( ) k X% X% X% Zβ (6) = = Snce Z s are subsets of X, each Z s Nxj where j k. 9

11 From (6), we see that the expected value of the mean of the β ˆ s β when any of the followng condtons are met:. For each set of ncluded regressors, there s no covarance between the ncluded regressors and the correspondng set of excluded regressors (.e., ' XZ % =0 );. For some sets of ncluded regressors, there s a non-zero covarance between the ncluded ' regressors and the correspondng set of excluded regressors (.e., XZ % 0 ), but the XZ % =0); ' expected value of the covarances s zero (.e., E ( ) 3. For some sets of ncluded regressors, there s a non-zero covarance between the ncluded ' regressors and the correspondng set of excluded regressors (.e., XZ % 0 ), and the ' expected value of the covarances s non-zero (.e., E ( ) XZ % 0), but the expected covarance of the excluded regressors wth the dependent varable s zero (.e., E ( β ) =0); 4. None of the above holds, but the expected value of the product of () the covarances between the ncluded and excluded regressors and () the covarance of the excluded E XZβ % = 0 ). ' regressors wth the dependent varable s zero (.e., ( ) The ER procedure reles on the reasonable assumpton that, as the data number of factors n the data set ncreases, condtons (), (3), and (4) wll hold asymptotcally. If true, ths enables us to construct the cross-model ch-square statstc. 0

12 4. The Cross-Model Ch-Square Statstc Let us assume that the th (n a set of K) factor, x (where x s an Nx vector) has no structural relatonshp wth an outcome varable y. Consder the equaton: y = α + β x + β x + + β x + u (7)... k k Under the null hypothess of β = 0, the square of the rato of ˆ β to ts standard error s dstrbuted χ wth one degree of freedom. By addng factors to and removng factors (other than x ) from (7), we can obtan other estmates of β. Let ˆj β be the j th such estmate of β. By lookng at all combnatons of factors from the superset of K factors, we can construct K estmates of β. Under the null hypothess that β = 0 and assumng that the ˆj β are ndependent, we have: K ˆ β j ~ χ K (8) j= s β From (8), we can construct c, the cross-model ch-square statstc for the factor x : c K ˆ β j ~ χ K (9) j= s β = Gven the (typcally) large number of degrees of freedom nherent n the ER procedure, t s worth notng that the measure n (8) s lkely to be subject to Type II errors. In an attempt to compensate, we dvde by the number of degrees of freedom to obtan c, a relatve ch-square statstc, a measure that s less dependent on sample sze. Carmnes and McIver (98) and Klne (998) clam that one should conclude that the data represent a good ft to the hypothess when the relatve ch-square measure s less than 3. Ullman (00) recommends usng a ch-square less than.

13 Because the ˆj β are obtaned by explorng all combnatons of factors from a sngle superset, one mght expect the ˆj β to be correlated (partcularly when factors are postvely correlated), and for the correlaton to ncrease n the presence of stronger multcollnearty among the factors. 5. Monte-Carlo Tests of c To test the ablty of the cross-model ch-square statstc to dentfy factors that mght show statstcally sgnfcant slope coeffcents smply by random chance, consder an outcome varable, Y, that s determned by three factors as follows Y = α + β X + β X + β X + u (0) 3 3 where u s an error term satsfyng the requrements for the classcal lnear regresson model. Let us randomly generate addtonal factors X 4 through X 5, run the ER procedure and calculate c for each of the factors. The followng fgures show the results of the ER runs. The frst set of bars show results for ER runs appled to the superset of factors X through X 4. The results are derved as follows:. Generate 500 observatons for X, randomly selectng observatons from the unform dstrbuton.. Generate 500 observatons each for X through X 5 such that X = γ X + v where the γ are randomly selected from the standard normal dstrbuton and dstrbuted, and v are normally dstrbuted wth mean zero and varance 0.. Ths step creates varyng multcollnearty among the factors. 3. Generate Y accordng to (0) where α = β = β = β 3 =, and u s normally dstrbuted wth a varance of.

14 4. Run all K = 5 regresson models to obtan the K estmates for each β: ˆ β,..., ˆ β, ˆ β,..., ˆ β, ˆ β,..., ˆ β, ˆ β,..., ˆ β.,,8,,8 3, 3,8 4, 4,8 5. Calculate c, c, c 3, and c 4 accordng to (9). 6. Repeat steps through 5 three-thousand tmes. 7. Repeat steps through 6, each tme ncreasng the varance of u by untl the varance reaches 0. 3 At the completon of the algorthm, there wll be 60,000 measures for each of c, c, c 3, and c 4 based on (60,000) (5) = 900,000 separate regressons. We then repeat the procedure usng a superset of factors X through X 5, then X through X 6, etc. up to X through X 5. 4 Step ntroduces random multcollnearty among the factors. On average, the multcollnearty of factors wth X follows the pattern shown n Fgure 3. Approxmately half of the correlatons wth X are postve and half are negatve. Whle the correlatons are constructed between X and the other factors, ths wll also result n the other factors beng par-wse correlated though to a lesser extent (on average) than they are correlated wth X. 3 Ths results n an average R for the estmate of equaton (0) of approxmately Ths fnal pass requres the estmaton of bllon separate regressons. The entre Monte-Carlo run requres computaton equvalent to almost,000 CPU-hours. 3

15 has less than ths correlaton wth X % 0% 0% 30% 40% 50% 60% 70% 80% 90% 00% Ths proporton of factors Fgure 3. Pattern of Squared Correlatons of Factors X through X K wth X Fgure 4 shows the results of the Monte-Carlo runs n whch the crtcal value for the c s set to 3. For example, when there are four factors n the data set, c, c, and c 3 pass the sgnfcance test slghtly over 50% of the tme versus 0% for c 4. In other words, for data sets n whch three out of four factors determne the outcome varable (and a crtcal value of 3), the ER procedure wll dentfy the three determnng factors 50% of the tme and dentfy the nondetermnng factor 0% of the tme. As the number of factors n the data set ncreases, the ER procedure better dscrmnates between the factors that determne the outcome varable and those that mght appear sgnfcant by random chance alone. The last set of bars n Fgure 4 shows the results for data sets n whch three out of ffteen factors determne the outcome varable. Here, 4

16 the ER procedure dentfes the determnng factors approxmately 85% of the tme and (erroneously) dentfes non-determnng factors only slghtly more than 0% of the tme. 00% 90% Cross-Model Ch-Square Statstc Exceeds % 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 4. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value = 3) These results are based on the somewhat arbtrary selecton of 3 for the relatve chsquare crtcal value. Reducng the crtcal value to produces the results shown n Fgure 5. As expected, reducng the crtcal value causes the ncdence of false postves (where postve means dentfcaton of a determnng factor ) to approxmately 5%, but the ncdence of false negatves falls to below 0%. Fgure 6 and Fgure 7, where the crtcal value s set to.5 and, respectvely, are shown for comparson. As expected, the margnal gans n the reducton n false negatves (versus Fgure 5) appear to be outweghed by the ncrease n the rate of false postves. 5

17 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 5. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value = ) 6

18 00% 90% Cross-Model Ch-Square Statstc Exceeds.5 80% 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 6. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value =.5) 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Number of Factors n the Regresson Model Factor Factor Factor 3 All Other Factors Fgure 7. Monte-Carlo Tests of ER Procedure Usng Supersets of Data from 4 Through 5 Factors (crtcal value = ) 7

19 To measure the effect of the determnstc model s goodness of ft on the cross-model ch-square statstc, we can arrange the Monte-Carlo results accordng to the varance of the error term n (0). The algorthm vares the error term from to 0 n ncrements of. Fgure 8 shows the proporton of tmes that c passes the sgnfcance threshold of 3 for factors X, X, and X 3 (combned) for varous numbers of factors n the data set and for varous levels of error varance. An error varance of corresponds to an R (for the estmate of equaton (0)) of approxmately 0.93 whle an error varance of 0 corresponds to an R of approxmately % 90% Cross-Model Ch-Square Statstc Exceeds % 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure 8. Monte-Carlo Tests of ER Procedure for Factors X through X 3 (combned) As expected, as the error varance rses n a 5-factor data set, the probablty of a false negatve rses from approxmately 5% (when var(u) = ) to 30% (when var(u) = 0). Results are markedly worse for data sets wth fewer factors. Fgure 9 shows results for factors X 4 through 8

20 X 5, combned. Here, we see that the probablty of a false postve rses from under 5% (when var(u) = ) to almost 0% (when var(u) = 0) for 5-factor data sets. Employng a crtcal value of.0 yelds the results n Fgure 0 and Fgure. Comparng Fgure 8 and Fgure 9 wth Fgure 0 and Fgure, we see that employng the crtcal value of 3 versus cuts n half (approxmately) the lkelhoods of false postves and negatves. 00% 90% Cross-Model Ch-Square Statstc Exceeds % 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure 9. Monte-Carlo Tests of ER Procedure for Factors X 4 through X 5, X 0, and X 5 (combned) 9

21 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure 0. Monte-Carlo Tests of ER Procedure for Factors X through X 3 (combned) 00% 90% Cross-Model Ch-Square Statstc Exceeds.0 80% 70% 60% 50% 40% 30% 0% 0% 0% Varance of the Error Term 5-Factor Models Data 0-Factor Models Data 5-Factor Models Data Sets Sets Sets Fgure. Monte-Carlo Tests of ER Procedure for Factors X 4 through X 5, X 0, and X 5 (combned) 0

22 6. Estmated ER (EER) Even wth the applcaton of super computaton, large data sets can make ASR and ER mpractcal. For example, t would take a full year for a top-of-the-lne 00,000 node cluster computer to perform ASR/ER on a 50-factor data set. When one consders data mnng just smple non-lnear transformatons of factors (nverse, square, logarthm), the 50-factor data set becomes a 50-factor data set. If one then consders data mnng two-factor cross-products (e.g., X X, X X 3, etc.), the 50-factor data set balloons to an,75-factor data set. Ths suggests that super computaton alone sn t enough to make ER a unversal tool for data mnng. One possble approach to usng ER wth large data sets s to employ estmated ER. Estmated ER (EER) randomly selects J (out of a possble K ) models to estmate. Note that EER does not select the factors randomly, but selects the models randomly. Selectng factors randomly bases the model selecton toward models wth a total of K/ factors. Selectng models randomly gves each of the K models an equal probablty of beng chosen. Fgure and Fgure 3 show the results of EER for varous numbers of randomly selected models for 5-, 0-, and 5-Factor data sets. These tests were performed as follows:. Generate 500 observatons for each of the factors X, randomly selectng observatons from the unform dstrbuton.. Generate 500 observatons each for X through X K (where K s 5, 0, or 5) such that X = γ X + v where the γ are randomly selected from the standard normal dstrbuton and dstrbuted, and v are normally dstrbuted wth mean zero and varance 0.. Ths step creates varyng multcollnearty among the factors. 3. Generate Y accordng to (0) where α = β = β = β 3 =, and u s normally dstrbuted wth a varance of.

23 4. Randomly select J models out of the possble K regresson models to obtan J estmates for each β. 5. Calculate c, c, c 3,, c K accordng to (9). 6. Repeat steps through 5 three-hundred tmes. 7. Calculate the percentage of tmes that the cross-model test statstcs for each β exceed the crtcal value. 8. Repeat steps through 7, for J runnng from to % Cross-Model Ch-Square Statstc Exceeds.0 95% 90% 85% 80% 75% 70% 65% 60% 55% 50% Number of Randomly Selected Models 5-Factor Data Sets 0-Factor Data Sets 5-Factor Data Sets Fgure. Monte-Carlo Tests of EER Procedure for Factors X through X 3 (combned) Evdence suggests that smaller data sets are more senstve to a small number of randomly selected models. For 5-factor data sets, the rate of false negatves (Fgure ) does not stop fallng sgnfcantly untl approxmately J = 30, and the rate of false postves (Fgure 3)

24 does not stop rsng sgnfcantly untl approxmately J = 45. The rates of false negatves (Fgure ) and false postves (Fgure 3) for 0-factor and 5-factor data sets appear to settle for lesser values of J. 80% Cross-Model Ch-Square Statstc Exceeds.0 70% 60% 50% 40% 30% 0% 0% Number of Randomly Selected Models 5-Factor Data Sets 0-Factor Data Sets 5-Factor Data Sets Fgure 3. Monte-Carlo Tests of EER Procedure for Factors X 4 through X 5, X 0, and X 5 (combned) The number of factors n the data set that determne the outcome varable has a greater effect on the number of randomly selected models requred n EER. Fgure 4 compares results for 5-factor data sets when the outcome varable s a functon of only one factor versus beng a functon of three factors. The vertcal axs measures the cross-model ch-squared statstc for the ndcated number of randomly selected models dvded by the average cross-model ch-squared statstc over all 00 runs. Fgure 4 shows that, for 5-factor data sets, as the number of randomly selected models ncreases, the cross-model ch-squared statstc approaches ts mean value for 3

25 the 00 runs faster when the outcome varable s a functon of three factors versus beng a functon of only one factor. Fgure 5 shows smlar results for 0-factor data sets. Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 4. EER Procedure for Factor X and X through X 3 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 4

26 Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 5. EER Procedure for Factor X and X through X 3 (combned) for 0-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 6. EER Procedure for Factor X and X through X 3 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 5

27 Results n Fgure 6 are less compellng, but not contradctory. A second result, common to the large data sets (Fgure 5 and Fgure 6), s that the varaton n the cross-model chsquared estmates s less when the outcome varance s a functon of three versus one factor. Fgure 7, Fgure 8, and Fgure 9 show correspondng results for factors that do not determne the outcome varable. These results suggest that EER may be an adequate procedure for estmatng ER for larger data sets. Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 7. EER Procedure for Factors X through X 5 (combned) and X 4 through X 5 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 6

28 Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 8. EER Procedure for Factors X through X 5 (combned) and X 4 through X 5 (combned) for 0-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors Cross-Model Ch-Square Statstc Relatve to Mean Number of Randomly Selected Models Y = f (X) Y = f(x, X, X3) Fgure 9. EER Procedure for Factors X through X 5 (combned) and X 4 through X 5 (combned) for 5-Factor Data Sets when Outcome Varable s a Functon of One vs. Three Factors 7

29 7. Comparson to Stepwse and k-fold Holdout Kuk (984) demonstrated that stepwse procedures are nferor to all subsets procedures n data mnng proportonal hazard models. Logcally, the same argument apples to data mnng regresson models. As stepwse examnes only a subset of models and ASR examnes all models, the best that stepwse can do s to match ASR. Because stepwse smartly samples on the bass of margnal changes to a ft functon, multcollnearty among the factors can cause stepwse to return a soluton that s a local, but not global, optmum. What s of nterest s a comparson of stepwse to EER because EER, lke stepwse, samples the space of possble regresson models. K-fold holdout s less an alternatve data mnng method than t s an alternatve objectve. Data mnng methods typcally have the objectve of fndng the model that best fts the data (typcally measured by mprovements to the F-statstc). K-fold holdout offers the alternatve objectve of maxmzng the model s ablty to predct observatons that were not ncluded n the model estmaton (.e., held out observatons). The selecton of whch observatons to hold out vares dependng on the data set. For example, n the case of tme seres data, t makes more sense to hold out observatons at the end of the data set. The followng tests use the same data set and apply EER, estmated all subets regresson (EASR) where we conduct a samplng of models rather than examne all possble models, and stepwse. The procedure s as follows:. Generate 500 observatons for X, randomly selectng observatons from the unform dstrbuton.. Generate 500 observatons each for X through X 5 such that X = γ X + v where the γ are randomly selected from the standard normal dstrbuton and dstrbuted, and v are 8

30 normally dstrbuted wth mean zero and varance 0.. Ths step creates varyng multcollnearty among the factors. 3. Generate Y accordng to (0) where α = β = β = β 3 =, and u s normally dstrbuted wth a varance of. 4. Perform EER and EASR k-fold holdout: a. Randomly select 500 models out of the possble 30 regresson models to obtan J estmates for each β. b. Evaluate the k-fold holdout crteron: c. Calculate c, c, c 3,, c 30 accordng to (9). d. Mark factors for whch c > as beng selected by EER. 5. Evaluate the EER models usng the k-fold holdout crteron: a. For each randomly selected model n step 4a, randomly select 50 observatons to exclude. b. Estmate the model usng the remanng 450 observatons. c. Use the estmated model to predct the 50 excluded observatons. d. Calculate the MSE (mean squared error) where MSE = 50 observaton predcted observaton excluded observatons ( ) e. Over the 500 models randomly selected by EER, dentfy the one for whch the MSE s least. Mark the factors that produce that model as beng selected by the k-fold holdout crteron. 6. Peform backward stepwse: a. Let M be the set of ncluded factors, and N be the set of excluded factors such that M = m, N = n, and m+ n= 30. 9

31 b. Estmate the model β and calculate the estmated model s X M Y = α + X + u adjusted multple correlaton coeffcent, R 0. c. For each factor X, =,, m, move the factor from set M to set N, estmate the model Y = α + β X + u X M, calculate the estmated model s adjusted multple correlaton coeffcent, result n the set of measures d. Let R ( L mn R0,..., Rm ) measure R, and then return the factor X j to M from N. Ths wll R,..., R m. =. Identfy the factor whose removal resulted n the R L. Move that factor from set M to set N. e. Gven the new set M, for each factor X, =,, m, remove the factor from set M, estmate the model Y = α + β X + u X M, calculate the estmated model s adjusted multple correlaton coeffcent, from N. Ths wll result n the set of measures R, and then return the factor X j to M R,..., R m. f. For each factor X, =,, n, move the factor from N to M, estmate the model Y = α + β X + u X M, calculate the estmated model s adjusted multple correlaton coeffcent, R, and then return the factor X to M from N. Ths wll result n the set of measures R,..., m R. + m+ n g. Let R ( L = mn R,..., R m + n) and R ( H max R,..., R m + n) removal resulted n the measure =. Identfy the factor whose R L. Move that factor from set M to set N. If was attaned usng a factor from N, move that factor from set N to set M. R H 30

32 h. Repeat steps e through g untl there s no further mprovement n R H.. Mark the factors n the model attaned n step 5h as beng selected by stepwse. 7. Repeat steps through 6 sx-hundred tmes. 8. Calculate the percentage of tmes that each factor s selected by EER, k-fold, and stepwse. Fgure 0 shows the results of ths comparson. For 30 factors, where the outcome varable s determned by the frst three factors, stepwse correctly dentfed the frst factor slghtly less frequently than dd EER (43% of the tme versus 48% of the tme). Stepwse correctly dentfed the second and thrd factors slghtly more frequently than dd EER (87% and 9% of the tme versus 8% and 85% of the tme). The k-fold crteron when appled to EASR dentfed the frst three factors 53%, 63%, and 63% of the tme, respectvely. In ths, the false postve error rate was comparable for EER and stepwse, and sgnfcantly worse for k-fold. EER erroneously dentfed the remanng factors as beng sgnfcant, on average, 7% of the tme versus 34% of the tme for stepwse and 49% of the tme for k-fold. Ths suggests that EER may be more powerful as a tool for elmnatng factors from consderaton. 3

33 00% 90% 80% Proporton of Tmes Procedure Selects the Factor 70% 60% 50% 40% 30% 0% 0% 0% Factors EER Stepwse EASR Wth k-fold Crteron Fgure 0. Comparson of EER, Stepwse, and k-fold Crteron 8. Applcablty to Other Procedures and Drawbacks Ordnary least squares estmators belong to the class of maxmum lkelhood estmators. The estmates are unbased (.e., E( ˆ β ) arbtrarly small ε), and effcent (.e., var ( ˆ β ) var ( β ) = β ), consstent (.e., lm Pr ( ˆ β β > ε) = 0 for an N < % where β % s any lnear, unbased estmator of β. The ER procedure reles on the fact that parameter estmators are unbased n extraneous varable cases though based n varyng drectons across the omtted varable cases. 3

34 In the case of lmted dependent varable models (e.g., logt), parameter estmates are unbased and consstent when the model s correctly specfed. However, n the omtted varable case, logt slope coeffcents are based toward zero. For example, suppose the outcome varable, * Y, s determned by a latent varable Y * ff Y 0 such that Y = >. Suppose also that Y * s 0 otherwse determned by the equaton = α + β + β +. If we omt the factor X from the (logt) * Y X X u regresson model, we estmate * o Y α β X u = + +. Yatchew and Grlches (985) show that β β = < β o βσ X + σ u () As the denomnator n () s strctly greater than zero, the estmator s based toward zero. More mportantly, the based estmator has the same sgn as the unbased estmator. Ths volates all four of condtons n (6), any one of whch would valdate the ER procedure. However, whle estmates of determnstc parameters are based downward, estmates of nondetermnstc parameters are randomly based. For example, f Y * s determned by the equaton * Y α βx βx u = and we estmate * o o Y α β X β3 X3 u = + + +, we obtan β β = < β o βσ X + σ u () β o 3 β 3 = = βσ X + σ u 0 (3) Because X 3 s extraneous, β 3 s zero and so repeated estmates of ˆ β o 3 wll be randomly dstrbuted around zero. In short, parameter estmates for determnstc factors wll be: () unbased n the 33

35 correctly specfed case, () unbased n the extraneous varable case, and (3) based toward zero n the omtted varable case. But, parameter estmates for spurous factors wll be unbased toward zero n all three cases. Hence, we can expect the cross-model ch-square statstcs to work n the case of logt models, though lkely n an asymptotc sense. One drawback of the ER (and EER) procedure s that the procedure may select a set of factors that, whle ndvdually passng the cross-model ch-square test, are not statstcally sgnfcant when run together n a regresson model. One possble explanaton s that, wthn the confnes of a sngle regresson model, the error varance s large enough to drown out the explanatory power of a factor but, because the cross-model ch-square statstc s based on crossmodel nformaton that has estmated and fltered out the error term, the factor appears sgnfcant. Ths s analogous to the gan n nformaton obtaned from employng panel data versus tme seres data (cf., Daves, 006). A tme seres data set may measure the same phenomenon over the same tme perod as a panel data set, but because the panel data set also measures the phenomenon over multple cross-sectons, nformaton across cross-sectons can be used to mtgate the error varance wthn a gven tme perod. 9. Concluson The purpose of ths analyss s to buld on ASR by proposng a new procedure, ER, that combnes ASR wth a cross-model ch-square statstc that attempts to dstngush determnstc factors from spurous factors. Recognzng that, for large data sets, even ER s nfeasble even wth the applcaton of super computaton, ths analyss further proposes an adaptaton on ER, EER, whch samples the space of possble regresson models. 34

36 Ths analyss () proves that, under condtons less strngent than the classcal lnear model condtons, the cross-model ch-square statstc s a reasonable measure for dstngushng between determnstc and spurous factors, () demonstrates va Monte-Carlo studes the lkelhood of ER produce false postve and false negatve results under condtons of varyng numbers of factors n the data set, varyng varance of the regresson error, and varyng choce of crtcal value, (3) proposes a procedure, EER, that estmates ER results va samplng a subset of the space of possble regresson models, (4) demonstrates va Monte-Carlo studes the lkelhood of EER producng false postve and false negatve results under condtons of varyng sample sze, and varyng number of factors n the determnstc equaton, (5) compares EER model selecton wth backward stepwse and the k-fold crteron by va Monte-Carlo studes that apply the same data sets to all three procedures, and (6) demonstrates that EER avods false postves and false negatves better than the k-fold crteron, avods false postves approxmately as well as stepwse, and avods false negatves sgnfcantly better than stepwse. 35

37 0. References Carmnes, E.G., and J.P. McIver, 98. Analyzng models wth unobserved varables: Analyss of covarance structures, n Bohmstedt. In G.W. and E.F. Borgatta, eds., Socal Measurement. Sage Publcatons: Thousand Oaks, CA. pp Daves, A., 006. A framework for decomposng shocks and measurng volatltes derved from mult-dmensonal panel data of survey forecasts. Internatonal Journal of Forecastng, (): Klne, R.B., 998. Prncples and practce of structural equaton modelng. Gulford Press: New York. Kuk, A.C., 984. All subsets regresson n proportonal hazard models. Bometrka, 7(3): Ullman, J.B., 00. Structural equaton modelng. In Tabachnck, B.G. and L.S. Fdell, eds., Usng Multvarate Statstcs. Allyn and Bacon: Needham Heghts, MA. pp Yatchew, A. and Z. Grlches, 985. Specfcaton error n probt models. The Revew of Economcs and Statstcs, 67:

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

What is Candidate Sampling

What is Candidate Sampling What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ). REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or

More information

CHAPTER 14 MORE ABOUT REGRESSION

CHAPTER 14 MORE ABOUT REGRESSION CHAPTER 14 MORE ABOUT REGRESSION We learned n Chapter 5 that often a straght lne descrbes the pattern of a relatonshp between two quanttatve varables. For nstance, n Example 5.1 we explored the relatonshp

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6 PAR TESTS If a WEIGHT varable s specfed, t s used to replcate a case as many tmes as ndcated by the weght value rounded to the nearest nteger. If the workspace requrements are exceeded and samplng has

More information

How To Calculate The Accountng Perod Of Nequalty

How To Calculate The Accountng Perod Of Nequalty Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis The Development of Web Log Mnng Based on Improve-K-Means Clusterng Analyss TngZhong Wang * College of Informaton Technology, Luoyang Normal Unversty, Luoyang, 471022, Chna wangtngzhong2@sna.cn Abstract.

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

SIMPLE LINEAR CORRELATION

SIMPLE LINEAR CORRELATION SIMPLE LINEAR CORRELATION Smple lnear correlaton s a measure of the degree to whch two varables vary together, or a measure of the ntensty of the assocaton between two varables. Correlaton often s abused.

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Statistical Methods to Develop Rating Models

Statistical Methods to Develop Rating Models Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and

More information

Forecasting the Direction and Strength of Stock Market Movement

Forecasting the Direction and Strength of Stock Market Movement Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

Regression Models for a Binary Response Using EXCEL and JMP

Regression Models for a Binary Response Using EXCEL and JMP SEMATECH 997 Statstcal Methods Symposum Austn Regresson Models for a Bnary Response Usng EXCEL and JMP Davd C. Trndade, Ph.D. STAT-TECH Consultng and Tranng n Appled Statstcs San Jose, CA Topcs Practcal

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

Evaluating credit risk models: A critique and a new proposal

Evaluating credit risk models: A critique and a new proposal Evaluatng credt rsk models: A crtque and a new proposal Hergen Frerchs* Gunter Löffler Unversty of Frankfurt (Man) February 14, 2001 Abstract Evaluatng the qualty of credt portfolo rsk models s an mportant

More information

Analysis of Premium Liabilities for Australian Lines of Business

Analysis of Premium Liabilities for Australian Lines of Business Summary of Analyss of Premum Labltes for Australan Lnes of Busness Emly Tao Honours Research Paper, The Unversty of Melbourne Emly Tao Acknowledgements I am grateful to the Australan Prudental Regulaton

More information

Recurrence. 1 Definitions and main statements

Recurrence. 1 Definitions and main statements Recurrence 1 Defntons and man statements Let X n, n = 0, 1, 2,... be a MC wth the state space S = (1, 2,...), transton probabltes p j = P {X n+1 = j X n = }, and the transton matrx P = (p j ),j S def.

More information

Portfolio Loss Distribution

Portfolio Loss Distribution Portfolo Loss Dstrbuton Rsky assets n loan ortfolo hghly llqud assets hold-to-maturty n the bank s balance sheet Outstandngs The orton of the bank asset that has already been extended to borrowers. Commtment

More information

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification Lecture 4: More classfers and classes C4B Machne Learnng Hlary 20 A. Zsserman Logstc regresson Loss functons revsted Adaboost Loss functons revsted Optmzaton Multple class classfcaton Logstc Regresson

More information

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES In ths chapter, we wll learn how to descrbe the relatonshp between two quanttatve varables. Remember (from Chapter 2) that the terms quanttatve varable

More information

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS Chrs Deeley* Last revsed: September 22, 200 * Chrs Deeley s a Senor Lecturer n the School of Accountng, Charles Sturt Unversty,

More information

Logistic Regression. Steve Kroon

Logistic Regression. Steve Kroon Logstc Regresson Steve Kroon Course notes sectons: 24.3-24.4 Dsclamer: these notes do not explctly ndcate whether values are vectors or scalars, but expects the reader to dscern ths from the context. Scenaro

More information

+ + + - - This circuit than can be reduced to a planar circuit

+ + + - - This circuit than can be reduced to a planar circuit MeshCurrent Method The meshcurrent s analog of the nodeoltage method. We sole for a new set of arables, mesh currents, that automatcally satsfy KCLs. As such, meshcurrent method reduces crcut soluton to

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

Economic Interpretation of Regression. Theory and Applications

Economic Interpretation of Regression. Theory and Applications Economc Interpretaton of Regresson Theor and Applcatons Classcal and Baesan Econometrc Methods Applcaton of mathematcal statstcs to economc data for emprcal support Economc theor postulates a qualtatve

More information

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

How To Understand The Results Of The German Meris Cloud And Water Vapour Product Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

The OC Curve of Attribute Acceptance Plans

The OC Curve of Attribute Acceptance Plans The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4

More information

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall SP 2005-02 August 2005 Staff Paper Department of Appled Economcs and Management Cornell Unversty, Ithaca, New York 14853-7801 USA Farm Savngs Accounts: Examnng Income Varablty, Elgblty, and Benefts Brent

More information

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by 6 CHAPTER 8 COMPLEX VECTOR SPACES 5. Fnd the kernel of the lnear transformaton gven n Exercse 5. In Exercses 55 and 56, fnd the mage of v, for the ndcated composton, where and are gven by the followng

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Mcroarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 6 Some Advanced Topcs Dr. Petr Nazarov 14-01-013 petr.nazarov@crp-sante.lu Statstcal data analyss n Ecel. 6. Some advanced topcs Correcton for

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña Proceedngs of the 2008 Wnter Smulaton Conference S. J. Mason, R. R. Hll, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION

More information

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB.

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. INDEX 1. Load data usng the Edtor wndow and m-fle 2. Learnng to save results from the Edtor wndow. 3. Computng the Sharpe Rato 4. Obtanng the Treynor Rato

More information

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information

Diagnostic Tests of Cross Section Independence for Nonlinear Panel Data Models

Diagnostic Tests of Cross Section Independence for Nonlinear Panel Data Models DISCUSSION PAPER SERIES IZA DP No. 2756 Dagnostc ests of Cross Secton Independence for Nonlnear Panel Data Models Cheng Hsao M. Hashem Pesaran Andreas Pck Aprl 2007 Forschungsnsttut zur Zukunft der Arbet

More information

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services An Evaluaton of the Extended Logstc, Smple Logstc, and Gompertz Models for Forecastng Short Lfecycle Products and Servces Charles V. Trappey a,1, Hsn-yng Wu b a Professor (Management Scence), Natonal Chao

More information

A Probabilistic Theory of Coherence

A Probabilistic Theory of Coherence A Probablstc Theory of Coherence BRANDEN FITELSON. The Coherence Measure C Let E be a set of n propostons E,..., E n. We seek a probablstc measure C(E) of the degree of coherence of E. Intutvely, we want

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

Binomial Link Functions. Lori Murray, Phil Munz

Binomial Link Functions. Lori Murray, Phil Munz Bnomal Lnk Functons Lor Murray, Phl Munz Bnomal Lnk Functons Logt Lnk functon: ( p) p ln 1 p Probt Lnk functon: ( p) 1 ( p) Complentary Log Log functon: ( p) ln( ln(1 p)) Motvatng Example A researcher

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

An Empirical Study of Search Engine Advertising Effectiveness

An Empirical Study of Search Engine Advertising Effectiveness An Emprcal Study of Search Engne Advertsng Effectveness Sanjog Msra, Smon School of Busness Unversty of Rochester Edeal Pnker, Smon School of Busness Unversty of Rochester Alan Rmm-Kaufman, Rmm-Kaufman

More information

PRIVATE SCHOOL CHOICE: THE EFFECTS OF RELIGIOUS AFFILIATION AND PARTICIPATION

PRIVATE SCHOOL CHOICE: THE EFFECTS OF RELIGIOUS AFFILIATION AND PARTICIPATION PRIVATE SCHOOL CHOICE: THE EFFECTS OF RELIIOUS AFFILIATION AND PARTICIPATION Danny Cohen-Zada Department of Economcs, Ben-uron Unversty, Beer-Sheva 84105, Israel Wllam Sander Department of Economcs, DePaul

More information

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA*

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* Luísa Farnha** 1. INTRODUCTION The rapd growth n Portuguese households ndebtedness n the past few years ncreased the concerns that debt

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001 Proceedngs of the Annual Meetng of the Amercan Statstcal Assocaton, August 5-9, 2001 LIST-ASSISTED SAMPLING: THE EFFECT OF TELEPHONE SYSTEM CHANGES ON DESIGN 1 Clyde Tucker, Bureau of Labor Statstcs James

More information

BERNSTEIN POLYNOMIALS

BERNSTEIN POLYNOMIALS On-Lne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful

More information

Statistical algorithms in Review Manager 5

Statistical algorithms in Review Manager 5 Statstcal algorthms n Reve Manager 5 Jonathan J Deeks and Julan PT Hggns on behalf of the Statstcal Methods Group of The Cochrane Collaboraton August 00 Data structure Consder a meta-analyss of k studes

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

v a 1 b 1 i, a 2 b 2 i,..., a n b n i.

v a 1 b 1 i, a 2 b 2 i,..., a n b n i. SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 455 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces we have studed thus far n the text are real vector spaces snce the scalars are

More information

Adaptive Clinical Trials Incorporating Treatment Selection and Evaluation: Methodology and Applications in Multiple Sclerosis

Adaptive Clinical Trials Incorporating Treatment Selection and Evaluation: Methodology and Applications in Multiple Sclerosis Adaptve Clncal Trals Incorporatng Treatment electon and Evaluaton: Methodology and Applcatons n Multple cleross usan Todd, Tm Frede, Ngel tallard, Ncholas Parsons, Elsa Valdés-Márquez, Jeremy Chataway

More information

GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM

GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM BARRIOT Jean-Perre, SARRAILH Mchel BGI/CNES 18.av.E.Beln 31401 TOULOUSE Cedex 4 (France) Emal: jean-perre.barrot@cnes.fr 1/Introducton The

More information

Intra-year Cash Flow Patterns: A Simple Solution for an Unnecessary Appraisal Error

Intra-year Cash Flow Patterns: A Simple Solution for an Unnecessary Appraisal Error Intra-year Cash Flow Patterns: A Smple Soluton for an Unnecessary Apprasal Error By C. Donald Wggns (Professor of Accountng and Fnance, the Unversty of North Florda), B. Perry Woodsde (Assocate Professor

More information

Project Networks With Mixed-Time Constraints

Project Networks With Mixed-Time Constraints Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

Calculation of Sampling Weights

Calculation of Sampling Weights Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample

More information

Traffic-light a stress test for life insurance provisions

Traffic-light a stress test for life insurance provisions MEMORANDUM Date 006-09-7 Authors Bengt von Bahr, Göran Ronge Traffc-lght a stress test for lfe nsurance provsons Fnansnspetonen P.O. Box 6750 SE-113 85 Stocholm [Sveavägen 167] Tel +46 8 787 80 00 Fax

More information

Solution: Let i = 10% and d = 5%. By definition, the respective forces of interest on funds A and B are. i 1 + it. S A (t) = d (1 dt) 2 1. = d 1 dt.

Solution: Let i = 10% and d = 5%. By definition, the respective forces of interest on funds A and B are. i 1 + it. S A (t) = d (1 dt) 2 1. = d 1 dt. Chapter 9 Revew problems 9.1 Interest rate measurement Example 9.1. Fund A accumulates at a smple nterest rate of 10%. Fund B accumulates at a smple dscount rate of 5%. Fnd the pont n tme at whch the forces

More information

Answer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy

Answer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy 4.02 Quz Solutons Fall 2004 Multple-Choce Questons (30/00 ponts) Please, crcle the correct answer for each of the followng 0 multple-choce questons. For each queston, only one of the answers s correct.

More information

Rate-Based Daily Arrival Process Models with Application to Call Centers

Rate-Based Daily Arrival Process Models with Application to Call Centers Submtted to Operatons Research manuscrpt (Please, provde the manuscrpt number!) Authors are encouraged to submt new papers to INFORMS journals by means of a style fle template, whch ncludes the journal

More information

Realistic Image Synthesis

Realistic Image Synthesis Realstc Image Synthess - Combned Samplng and Path Tracng - Phlpp Slusallek Karol Myszkowsk Vncent Pegoraro Overvew: Today Combned Samplng (Multple Importance Samplng) Renderng and Measurng Equaton Random

More information

Transition Matrix Models of Consumer Credit Ratings

Transition Matrix Models of Consumer Credit Ratings Transton Matrx Models of Consumer Credt Ratngs Abstract Although the corporate credt rsk lterature has many studes modellng the change n the credt rsk of corporate bonds over tme, there s far less analyss

More information

Traditional versus Online Courses, Efforts, and Learning Performance

Traditional versus Online Courses, Efforts, and Learning Performance Tradtonal versus Onlne Courses, Efforts, and Learnng Performance Kuang-Cheng Tseng, Department of Internatonal Trade, Chung-Yuan Chrstan Unversty, Tawan Shan-Yng Chu, Department of Internatonal Trade,

More information

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network 700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School

More information

Chapter XX More advanced approaches to the analysis of survey data. Gad Nathan Hebrew University Jerusalem, Israel. Abstract

Chapter XX More advanced approaches to the analysis of survey data. Gad Nathan Hebrew University Jerusalem, Israel. Abstract Household Sample Surveys n Developng and Transton Countres Chapter More advanced approaches to the analyss of survey data Gad Nathan Hebrew Unversty Jerusalem, Israel Abstract In the present chapter, we

More information

1. Fundamentals of probability theory 2. Emergence of communication traffic 3. Stochastic & Markovian Processes (SP & MP)

1. Fundamentals of probability theory 2. Emergence of communication traffic 3. Stochastic & Markovian Processes (SP & MP) 6.3 / -- Communcaton Networks II (Görg) SS20 -- www.comnets.un-bremen.de Communcaton Networks II Contents. Fundamentals of probablty theory 2. Emergence of communcaton traffc 3. Stochastc & Markovan Processes

More information

Method for assessment of companies' credit rating (AJPES S.BON model) Short description of the methodology

Method for assessment of companies' credit rating (AJPES S.BON model) Short description of the methodology Method for assessment of companes' credt ratng (AJPES S.BON model) Short descrpton of the methodology Ljubljana, May 2011 ABSTRACT Assessng Slovenan companes' credt ratng scores usng the AJPES S.BON model

More information

Brigid Mullany, Ph.D University of North Carolina, Charlotte

Brigid Mullany, Ph.D University of North Carolina, Charlotte Evaluaton And Comparson Of The Dfferent Standards Used To Defne The Postonal Accuracy And Repeatablty Of Numercally Controlled Machnng Center Axes Brgd Mullany, Ph.D Unversty of North Carolna, Charlotte

More information

Quantization Effects in Digital Filters

Quantization Effects in Digital Filters Quantzaton Effects n Dgtal Flters Dstrbuton of Truncaton Errors In two's complement representaton an exact number would have nfntely many bts (n general). When we lmt the number of bts to some fnte value

More information

ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C White Emerson Process Management

ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C White Emerson Process Management ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C Whte Emerson Process Management Abstract Energy prces have exhbted sgnfcant volatlty n recent years. For example, natural gas prces

More information

7 ANALYSIS OF VARIANCE (ANOVA)

7 ANALYSIS OF VARIANCE (ANOVA) 7 ANALYSIS OF VARIANCE (ANOVA) Chapter 7 Analyss of Varance (Anova) Objectves After studyng ths chapter you should apprecate the need for analysng data from more than two samples; understand the underlyng

More information

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000 Problem Set 5 Solutons 1 MIT s consderng buldng a new car park near Kendall Square. o unversty funds are avalable (overhead rates are under pressure and the new faclty would have to pay for tself from

More information

Chapter 4 ECONOMIC DISPATCH AND UNIT COMMITMENT

Chapter 4 ECONOMIC DISPATCH AND UNIT COMMITMENT Chapter 4 ECOOMIC DISATCH AD UIT COMMITMET ITRODUCTIO A power system has several power plants. Each power plant has several generatng unts. At any pont of tme, the total load n the system s met by the

More information

ESTIMATING THE MARKET VALUE OF FRANKING CREDITS: EMPIRICAL EVIDENCE FROM AUSTRALIA

ESTIMATING THE MARKET VALUE OF FRANKING CREDITS: EMPIRICAL EVIDENCE FROM AUSTRALIA ESTIMATING THE MARKET VALUE OF FRANKING CREDITS: EMPIRICAL EVIDENCE FROM AUSTRALIA Duc Vo Beauden Gellard Stefan Mero Economc Regulaton Authorty 469 Wellngton Street, Perth, WA 6000, Australa Phone: (08)

More information

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Research Note APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES * Iranan Journal of Scence & Technology, Transacton B, Engneerng, ol. 30, No. B6, 789-794 rnted n The Islamc Republc of Iran, 006 Shraz Unversty "Research Note" ALICATION OF CHARGE SIMULATION METHOD TO ELECTRIC

More information

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy Fnancal Tme Seres Analyss Patrck McSharry patrck@mcsharry.net www.mcsharry.net Trnty Term 2014 Mathematcal Insttute Unversty of Oxford Course outlne 1. Data analyss, probablty, correlatons, vsualsaton

More information

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Max Wellng Department of Computer Scence Unversty of Toronto 10 Kng s College Road Toronto, M5S 3G5 Canada wellng@cs.toronto.edu Abstract Ths s a note to explan support vector machnes.

More information

Gender differences in revealed risk taking: evidence from mutual fund investors

Gender differences in revealed risk taking: evidence from mutual fund investors Economcs Letters 76 (2002) 151 158 www.elsever.com/ locate/ econbase Gender dfferences n revealed rsk takng: evdence from mutual fund nvestors a b c, * Peggy D. Dwyer, James H. Glkeson, John A. Lst a Unversty

More information

Activity Scheduling for Cost-Time Investment Optimization in Project Management

Activity Scheduling for Cost-Time Investment Optimization in Project Management PROJECT MANAGEMENT 4 th Internatonal Conference on Industral Engneerng and Industral Management XIV Congreso de Ingenería de Organzacón Donosta- San Sebastán, September 8 th -10 th 010 Actvty Schedulng

More information

Underwriting Risk. Glenn Meyers. Insurance Services Office, Inc.

Underwriting Risk. Glenn Meyers. Insurance Services Office, Inc. Underwrtng Rsk By Glenn Meyers Insurance Servces Offce, Inc. Abstract In a compettve nsurance market, nsurers have lmted nfluence on the premum charged for an nsurance contract. hey must decde whether

More information

1 Example 1: Axis-aligned rectangles

1 Example 1: Axis-aligned rectangles COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 6 Scrbe: Aaron Schld February 21, 2013 Last class, we dscussed an analogue for Occam s Razor for nfnte hypothess spaces that, n conjuncton

More information

Data Visualization by Pairwise Distortion Minimization

Data Visualization by Pairwise Distortion Minimization Communcatons n Statstcs, Theory and Methods 34 (6), 005 Data Vsualzaton by Parwse Dstorton Mnmzaton By Marc Sobel, and Longn Jan Lateck* Department of Statstcs and Department of Computer and Informaton

More information

A Model of Private Equity Fund Compensation

A Model of Private Equity Fund Compensation A Model of Prvate Equty Fund Compensaton Wonho Wlson Cho Andrew Metrck Ayako Yasuda KAIST Yale School of Management Unversty of Calforna at Davs June 26, 2011 Abstract: Ths paper analyzes the economcs

More information

1 De nitions and Censoring

1 De nitions and Censoring De ntons and Censorng. Survval Analyss We begn by consderng smple analyses but we wll lead up to and take a look at regresson on explanatory factors., as n lnear regresson part A. The mportant d erence

More information

Selection bias and econometric remedies in accounting and finance research

Selection bias and econometric remedies in accounting and finance research Selecton bas and econometrc remedes n accountng and fnance research Jennfer Wu Tucker Fsher School of Accountng Warrngton College of Busness Admnstraton Unversty of Florda (352) 273-0214 (Offce) (352)392-7962

More information

International University of Japan Public Management & Policy Analysis Program

International University of Japan Public Management & Policy Analysis Program Internatonal Unversty of Japan Publc Management & Polcy Analyss Program Practcal Gudes To Panel Data Modelng: A Step by Step Analyss Usng Stata * Hun Myoung Park, Ph.D. kucc65@uj.ac.jp 1. Introducton.

More information

Prediction of Disability Frequencies in Life Insurance

Prediction of Disability Frequencies in Life Insurance Predcton of Dsablty Frequences n Lfe Insurance Bernhard Köng Fran Weber Maro V. Wüthrch October 28, 2011 Abstract For the predcton of dsablty frequences, not only the observed, but also the ncurred but

More information

SUPPLIER FINANCING AND STOCK MANAGEMENT. A JOINT VIEW.

SUPPLIER FINANCING AND STOCK MANAGEMENT. A JOINT VIEW. SUPPLIER FINANCING AND STOCK MANAGEMENT. A JOINT VIEW. Lucía Isabel García Cebrán Departamento de Economía y Dreccón de Empresas Unversdad de Zaragoza Gran Vía, 2 50.005 Zaragoza (Span) Phone: 976-76-10-00

More information

THE EFFECT OF PREPAYMENT PENALTIES ON THE PRICING OF SUBPRIME MORTGAGES

THE EFFECT OF PREPAYMENT PENALTIES ON THE PRICING OF SUBPRIME MORTGAGES THE EFFECT OF PREPAYMENT PENALTIES ON THE PRICING OF SUBPRIME MORTGAGES Gregory Ellehausen, Fnancal Servces Research Program George Washngton Unversty Mchael E. Staten, Fnancal Servces Research Program

More information

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ Effcent Strpng Technques for Varable Bt Rate Contnuous Meda Fle Servers æ Prashant J. Shenoy Harrck M. Vn Department of Computer Scence, Department of Computer Scences, Unversty of Massachusetts at Amherst

More information