Variable Selection for Survival Data under Weibull Distribution


Article

Ujjwal Das 1

Calcutta Statistical Association Bulletin 68(1&2)
Calcutta Statistical Association, Kolkata
SAGE Publications
sagepub.in/home.nav
DOI: /

1 Indian Institute of Management Udaipur, Udaipur, Rajasthan, India.
Corresponding author: Ujjwal Das, Indian Institute of Management Udaipur, Udaipur, Rajasthan, India. E-mail:

Abstract

In a wide spectrum of natural and social sciences, one very often encounters a large number of predictors for time to event data. An important task is to select the right ones and thereafter carry out the analysis. The $l_1$ penalized regression, known as the "least absolute shrinkage and selection operator" (LASSO), has become a popular approach for predictor selection in the last two decades. The LASSO regression involves a penalizing parameter, commonly denoted by λ, that controls the extent of the penalty and hence plays a crucial role in identifying the right covariates. In this article, we propose an information theory-based method to determine the value of λ under the accelerated failure time (AFT) model with extreme value distribution. Furthermore, an efficient algorithm is discussed in the same context. We demonstrate the usefulness of our method through an extensive simulation study.

Keywords: Bhattacharya distance, index of resolvability, Kullback–Leibler measure, $l_1$ penalty, AFT model, time to event data, Weibull distribution

1. Introduction

The statistical analysis of time to event data is very common in several applied fields, such as biology, medicine, economics, engineering, and social sciences. Typical examples of such an event are the onset of a disease, the death of a subject under study, the default of a corporate bond, the malfunctioning of a system, and so on. It is very common to adjust the analysis of those event times by incorporating the information available from covariates. The accelerated failure time (AFT) model is one of the ways of modeling time to event data with predictor variables. It relates the log-transformed survival time linearly to the covariates. [1,2] Stated formally, an AFT model is expressed as:

$$X = \log Y = Z'\beta + \sigma W, \qquad (1)$$

where $Y$ denotes the time to an event, or survival time, with covariate information $Z$. To be specific, $\beta = (\beta_0, \beta_1, \ldots, \beta_p)$ is a vector of regression coefficients with $\beta_0$ as intercept, $p$ denotes the number of covariates, $\sigma$ is a scale parameter, and $W$ is the random error. A variety of distributions are available for modeling $W$, namely the normal, extreme value, or logistic distributions. The AFT model is an attractive alternative to the Cox [3] proportional hazards model for censored failure time data [4] because its physical interpretation is similar to that of standard regression. Hutton and Monaghan [5] showed that the AFT model is more robust to model misspecification and yields narrower confidence intervals for regression coefficients due to its log-linear transformation. Later, Kwong and Hutton [6] applied the AFT model to the analysis of epilepsy data and cerebral palsy data, and showed that the AFT model is superior under certain conditions. Matsushita, Hagiwara, Shiota, Shimada, Kuramoto, and Toyokura [7] applied the Weibull probability model to understand the life table and age patterns of disease in Japan. More recently, Swindell [8] showed that the AFT model with Weibull distribution is a valuable tool for aging research. Helsen and Schmittlein [9] used a Weibull regression model to analyze right-censored marketing time. Throughout this article, we assume that the time to event follows a Weibull distribution or, equivalently, that $W$ follows an extreme value distribution.

In some practical studies, such as genetics, researchers may have a large number of covariates $p$ from a smaller number of observations $n$, and they may need to select only a few of those many covariates. Examples include a typical microarray data set that consists of thousands of genes from a hundred subjects. Traditional selection methods such as stepwise deletion or best subset selection, although useful, may perform poorly in high-dimensional ($p \gg n$) situations. The limitations of the existing methods of model selection are mentioned in Breiman [10] and Fan and Li [11].

As a unified method of variable selection for both low and high dimension, the penalized approach has gained increasing popularity in recent years. The penalized methods, with some conditions on the penalty functions, not only retain the good properties of the old methods but also enjoy theoretical justifications. Among the convex penalty functions, the least absolute shrinkage and selection operator (LASSO), proposed by Tibshirani [12], has gained enormous attention from researchers. LASSO is defined as the $l_1$ norm of the parameters: $\lambda\|\beta\|_1$, where $\beta$ is the vector of regression coefficients and λ is the tuning parameter or penalizing parameter. The penalizing parameter plays an influential role in variable selection. A larger value of λ exerts a higher penalty on the regression coefficients, resulting in the inclusion of fewer variables in the model. Conversely, a small value of λ leads to less penalty and, hence, the inclusion of many variables. Commonly, a sequence of λ values is generated, and variables are detected for each value of the sequence. Thereafter, a value of λ is chosen by k-fold cross-validation, and the corresponding set of predictors is included in the model. Tibshirani [13] used generalized cross-validation for the Cox model. [3] More recently, Simon, Friedman, Hastie, and Tibshirani [14] developed an R package for variable selection in the Cox model [3] via LASSO with λ selected through cross-validation.

Barron and Luo [15] developed the concept of an information theoretically valid $l_1$ penalty by extending the work of Grünwald. [16] Using a similar risk analysis, Barron, Huang, Li, and Luo [17] and Barron and Luo [15] developed the concept of an information theoretically valid $l_1$ norm penalty function for linear models. They obtained a lower bound on the penalizing parameter which makes the LASSO penalty information theoretically valid. In this article, we introduce the information theory for time to event data under model (1) and obtain the bound for λ. We will use the lower bound as the value of the penalizing parameter.
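As a minimal illustration of how λ controls the number of selected variables, the sketch below assumes a linear model with an orthonormal design, where the LASSO solution for each coefficient is the soft-thresholded least-squares estimate. This is a simplifying assumption for illustration, not the AFT setting studied in this article, and the values in `ols_estimates` are hypothetical.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (z - b)**2 + lam * abs(b) over b."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical least-squares estimates for six covariates (illustration only).
ols_estimates = [2.5, -1.8, 0.9, 0.4, -0.2, 0.05]


def selected_count(lam):
    """Number of covariates whose penalized coefficient stays nonzero."""
    return sum(1 for z in ols_estimates if soft_threshold(z, lam) != 0.0)


# A larger penalty keeps fewer covariates in the model.
for lam in (0.0, 0.3, 1.0, 2.0):
    print(lam, selected_count(lam))
```

The count of retained covariates never increases as λ grows, which is the behavior described above for larger and smaller penalties.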
In addition, we propose an efficient algorithm for the AFT model under the assumption of Weibull distribution to select variables, following Barron, Cohen, Dahmen, and DeVore. [18] Any software that performs constrained optimization can be used to implement the proposed algorithm.

The article is organized as follows. A brief description of information theory along with related concepts, and the determination of the bound on the penalizing parameter for the AFT model, are given in Section 2. Section 3 deals with the algorithm and its accuracy. Section 4 demonstrates the usefulness of the proposed methodology through extensive simulation studies. The results are presented in a tabular format for different combinations of $n$ and $p$ with different censoring proportions. We also compare the performance of our proposed λ with that obtained by the Bayesian information criterion (BIC). Finally, some concluding remarks in Section 5 complete the article.

2. Method

Here, we develop the proposed bound on the penalizing parameter. As a measure of information discrepancy between two probability distribution functions $P$ and $Q$, we use the Kullback–Leibler (KL) divergence. [19] It is given by

$$D(P, Q) = E_p\left[\log\frac{p}{q}\right] = \int_S \log\frac{p(x)}{q(x)}\, dP(x),$$

provided $P$ is absolutely continuous with respect to $Q$ on the support $S$. It gives the total expected redundancy for the data described by $q$ but governed by $p$ or, in other words, the extent of data explained by the candidate $q$ when the true distribution is $p$. Throughout this article, the Bhattacharya–Rényi–Hellinger distance is used as the loss function to judge the accuracy of the estimate. It helps to discriminate between two distribution functions $P$ and $Q$, and is given by

$$d(P, Q) = -2\log\int \sqrt{p(x)\,q(x)}\, dx. \; [20]$$

For a thorough discussion on information measures, see Ebrahimi, Soofi, and Soyer. [21]

Index of Resolvability: Let $L_f$ be the likelihood characterized by $f$ and $f^*$ be the true value of $f$. Then, the index of resolvability is defined as:

$$R_n(f^*) = \min_{f \in F}\left\{\frac{1}{n} D(L_{f^*}, L_f) + \frac{1}{n}\,\mathrm{pen}(f)\right\}, \qquad (2)$$

where $f$ is a candidate to estimate the unknown $f^*$, $F$ is the set of all possible values of $f$, and $\mathrm{pen}(f)$ denotes some penalty function. We use this index to upper-bound the statistical risk associated with the estimates obtained from the following minimization:

$$\min_{f \in F}\left\{\frac{1}{n}\log\frac{1}{L_f} + \frac{1}{n}\,\mathrm{pen}(f)\right\}. \qquad (3)$$

The estimator obtained from (3) is called the minimal complexity estimator. It can be shown that the expression under minimization in (3) converges in probability to the index of resolvability plus a constant entropy, which ensures that the minimization in (3) is equivalent to the minimization of the resolvability index, $R_n(f^*)$, in (2).
For a detailed discussion on the index of resolvability and its connection with information measures, one may see Barron et al. [18] and Luo [22], and the references therein. From (1), $f$ is the linear predictor given by $Z'\beta$. Let $\hat f$ be the minimal complexity estimator of $f^*$. Then, we measure the associated risk of $\hat f$ by $E[d(L_{f^*}, L_{\hat f})]$. We choose the penalizing parameter of LASSO such that

$$E\,\bar d(L_{f^*}, L_{\hat f}) \le \min_{\beta \in R^p}\left\{\bar D(L_{f^*}, L_f) + \frac{\lambda}{n}\sum_{j=1}^{p}|\beta_j|\right\}, \qquad (4)$$

where $\bar d(L_{f^*}, L_{\hat f}) = d(L_{f^*}, L_{\hat f})/n$ and $\bar D(L_{f^*}, L_f) = D(L_{f^*}, L_f)/n$ are the average Bhattacharya–Rényi–Hellinger distance and KL measure, respectively, when averaged across the $n$ independent subjects. Luo [22] came up with the lower bound of λ for linear models. In the next subsection, we provide a lower bound of λ so that the risk bound in (4) is attained under the extreme value distribution.
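The two information measures used in this section can be checked numerically on a small discrete example. The sketch below assumes the conventions $D(P, Q) = \sum_x p(x)\log\{p(x)/q(x)\}$ and $d(P, Q) = -2\log\sum_x\sqrt{p(x)q(x)}$; the distributions `p` and `q` are hypothetical.

```python
import math


def kl_divergence(p, q):
    """D(P, Q) = sum_x p(x) * log(p(x) / q(x)); needs q(x) > 0 where p(x) > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def bhattacharyya_distance(p, q):
    """d(P, Q) = -2 * log(sum_x sqrt(p(x) * q(x)))."""
    affinity = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -2.0 * math.log(affinity)


p = [0.5, 0.3, 0.2]  # hypothetical "true" distribution
q = [0.4, 0.4, 0.2]  # hypothetical candidate distribution

D = kl_divergence(p, q)
d = bhattacharyya_distance(p, q)
print(D, d)
```

Under these conventions Jensen's inequality gives $d(P, Q) \le D(P, Q)$, which is why a risk measured by $d$ can be bounded by a quantity built from $D$.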

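2.1. Determination of the Bound on Penalizing Parameter

Let $X_1, X_2, \ldots, X_n$ be i.i.d. responses, and $C_1, C_2, \ldots, C_n$ be the points of censoring for $n$ subjects. For the $i$-th subject, we observe either the event time $X_i$, if the subject experiences the event, or some known time point $C_i$, whichever comes first. Here, $C_i$ is assumed to be noninformative. In short, for the $i$-th subject, we have $(V_i, \delta_i, Z_i)$, where $V_i = \min(X_i, C_i)$, $\delta_i$ is the censoring indicator taking value 1 or 0 depending on whether the subject experienced the event or was censored, respectively, and $Z_i$ is the covariate information. Then, the density function of $V_i$ is

$$p^C_{f_i}(V_i) = p_{f_i}(V_i) \text{ when } \delta_i = 1, \qquad P^C_{f_i}(V_i = C_i) = \bar P_{f_i}(C_i) \text{ when } \delta_i = 0,$$

where $p_{f_i}(\cdot)$ is the density function of $X_i$, $\bar P_{f_i}(C_i) = P_{f_i}(X_i > C_i)$ is the survival function, and $f_i = Z_i'\beta$. Hence, the joint likelihood function for $n$ subjects can be written as:

$$L_f(v_1, v_2, \ldots, v_n \mid Z) = \prod_{i=1}^{n}\left[p_{f_i}(v_i \mid Z_i)\right]^{\delta_i}\left[\bar P_{f_i}(C_i \mid Z_i)\right]^{1-\delta_i}, \qquad (5)$$

with $f = (f_1, f_2, \ldots, f_n)$. We assume an extreme value distribution with location $f_i = Z_i'\beta$ and scale $\sigma$. Then, $p^C_{f_i}(v_i) = p_{f_i}(v_i)$ is a log-Weibull or Gumbel density, and $P^C_{f_i}(V_i = C_i) = \bar P_{f_i}(C_i)$ is the survival function of the same distribution. Throughout the article, the covariates $Z_i$ are assumed to be fixed, and henceforth, for notational simplicity, we will drop the $Z_i$ from the likelihood. Under the assumption of known $\sigma$, we have the following result that gives the bound for the penalizing parameter.

Result 1: The $l_1$ penalized likelihood estimator $\hat f = f_{\hat\beta} = Z'\hat\beta$ obtained by

$$\min_{\beta}\left\{\sum_{i=1}^{n}\frac{\delta_i}{n}\left[-\frac{x_i - f_i}{\sigma} + \log\sigma + \exp\left(\frac{x_i - f_i}{\sigma}\right)\right] + \sum_{i=1}^{n}\frac{1-\delta_i}{n}\exp\left(\frac{C_i - f_i}{\sigma}\right) + \frac{\lambda}{n}\|\beta\|_1\right\}$$

attains the risk bound $E\,\bar d(L_{f^*}, L_{\hat f}) \le \min_{\beta}\{\bar D(L_{f^*}, L_f) + \frac{\lambda}{n}\sum_j|\beta_j|\}$ for every sample size provided that

λ [ {δ e v f δ e C f 1 } ] log p, (6)

In practice, $f$ is replaced by $\hat f$ obtained from (6), and $\sigma$ is known.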
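The penalized objective of Result 1 can be sketched in code. The version below assumes the standard minimum extreme value log-density, $\log p_f(x) = -\log\sigma + (x - f)/\sigma - \exp\{(x - f)/\sigma\}$, and log-survival function, $-\exp\{(C - f)/\sigma\}$. The function name and argument layout are hypothetical, and the arrangement of constants in the article's own objective may differ.

```python
import math


def penalized_neg_loglik(beta, Z, v, delta, sigma, lam):
    """Average censored negative log-likelihood plus (lam / n) * ||beta||_1.

    beta  : coefficients, one per column of Z
    Z     : list of covariate rows
    v     : observed log-times, v_i = min(x_i, c_i)
    delta : 1 if the event was observed, 0 if censored
    """
    n = len(v)
    total = 0.0
    for z_i, v_i, d_i in zip(Z, v, delta):
        f_i = sum(b * z for b, z in zip(beta, z_i))  # linear predictor Z_i' beta
        r_i = (v_i - f_i) / sigma                     # standardized residual
        if d_i == 1:
            # minus the log-density of the extreme value distribution
            total += -r_i + math.log(sigma) + math.exp(r_i)
        else:
            # minus the log-survival function at the censoring point
            total += math.exp(r_i)
    l1_norm = sum(abs(b) for b in beta)
    return total / n + lam * l1_norm / n


# Tiny hypothetical data set: three subjects, two covariates, one censored.
Z = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [0.5, -0.2, 1.0]
delta = [1, 0, 1]
value = penalized_neg_loglik([0.5, 0.0], Z, v, delta, sigma=1.0, lam=2.0)
print(value)
```

Any general-purpose optimizer can then be applied to this function of `beta` once `sigma` and `lam` are fixed.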

Proof: The proof is outlined in the appendix.

In general, $\sigma$ is unknown, and we state a similar theorem on the bound of the penalizing parameter.

Result 2: The $l_1$ penalized likelihood estimator $\hat f$ obtained by minimizing

[ δ x f + log x f + e ɛ 1 x f ] [ ] 1 δ C f + n n n + n log + λ n β 1 (7)

with respect to $\beta$, attains the risk bound

1 n Ed,,Lˆf,ˆ mn β, D,,L f, + λ β j + 1 δ n n x f + log p n + log4pn n

for every sample size provided that

λ [ { δ + 1 δ e C } ɛ f log e ɛ n] p

where ɛ is a small positive number chosen suitably, U f j β j and m 1 is given in the proof of the theorem. We choose ɛ 1 n, where $n$ is the sample size.

The proofs with all notations are outlined in the appendix.

Remark 1: All the lower bounds of λ are $o(n)$. Since we will use the lower bound as the value of λ, we note that $\lambda/n \to 0$ as $n \to \infty$, which may make it comparable with the λ proposed in the literature, for example, Knight and Fu [23] for the linear model, and Johnson [24] and Cai, Huang, and Tian [25] for censored data. From (6) and (8) we note that the proposed λ depends on the data likelihood.

Remark 2: We note that the lower bound of λ depends on the parameters. The nonlinear form of the likelihood function makes the lower bound dependent on unknown quantities. During computation, the lower bound and, hence, the λ will be computed in an iterative way. At a given iteration, the λ will be obtained by using the estimates of the previous iteration. This is different from existing methods like BIC or cross-validation, where a sequence of λ is generated that does not depend on the parameters, and for each value, a model is identified.

3. The Algorithm

We propose an algorithm for the detection of predictors in the AFT model under the assumption of extreme value distribution. A similar algorithm was proposed in Barron et al. [18] in the context of linear models. For $p < n$, we fit an AFT model with extreme value distribution to the data and use the point estimates as the initial estimate for the algorithm. For $p > n$, we begin with $\beta^0 = 0$.
The updating rule from the $(t-1)$-th step to the $t$-th step is:

$$\beta^{t} = \alpha\,\beta^{t-1} + \gamma\, I_l,$$

where the parameters are $\alpha \in [0, 1]$, $\gamma \in R$, and $I_l$, which is a vector of zeros except for the $l$-th component, which is 1. We minimize the objective function in (6) or (7) with respect to α and γ for each $l = 1, 2, \ldots, p$, for known or unknown scale parameter. The optimal $\alpha_t$, $\gamma_t$ and $I_{l_t}$ are those for which the value of the objective function is minimum. We update those coordinates and keep the others unchanged. We write the objective function as a function of α and γ for the $l$-th coordinate in the following. For known $\sigma$, from (6) we have:

$$L_t(\alpha, \gamma, l) = \frac{1}{n}\sum_{i=1}^{n}\delta_i\left[-\frac{x_i - \alpha Z_i'\beta^{t-1} - \gamma Z_{il}}{\sigma} + \log\sigma + \exp\left(\frac{x_i - \alpha Z_i'\beta^{t-1} - \gamma Z_{il}}{\sigma}\right)\right] + \frac{1}{n}\sum_{i=1}^{n}(1-\delta_i)\exp\left(\frac{C_i - \alpha Z_i'\beta^{t-1} - \gamma Z_{il}}{\sigma}\right) + \frac{\lambda}{n}\left[\alpha\sum_{j}|\beta_j^{t-1}| + |\gamma|\right], \qquad (8)$$

and for unknown $\sigma$, the corresponding objective (9) follows from (7) by the same substitution, with $f_i$ replaced by $\alpha Z_i'\beta^{t-1} + \gamma Z_{il}$ and $\|\beta\|_1$ replaced by $\alpha\sum_j|\beta_j^{t-1}| + |\gamma|$.

Using any standard software, one can minimize (8) and (9). We adopt the R routine constrOptim with the Nelder–Mead method for performing the constrained minimization of the non-smooth functions in (8) and (9). This method is particularly suitable for the optimization of non-smooth functions. We continue the updating procedure until some convergence criterion is satisfied or the process has been repeated a certain number of times. For unknown shape parameter, we estimate it for a given initial estimate of the regression coefficients. The methodology that we used to estimate $\sigma$ is discussed in detail in Section 4.2.

3.1. Accuracy of the Algorithm

Let $L_f$ be the likelihood function with unknown parameters or linear combination of parameters $f$, estimated by $\hat f_k$ at the $k$-th iteration. Then, we have the following result:

Result 3: Let $L_{\hat f}$ be the minimal complexity estimate of $L_{f^*}$ and $L_{\hat f_k}$ be the estimate from the $k$-th iteration obtained by our proposed algorithm. Then,

$$\frac{1}{n}\log\frac{1}{L_{\hat f_k}(x)} + \lambda v_k \le \inf_{f}\left\{\frac{1}{n}\log\frac{1}{L_f(x)} + \lambda V_f + \frac{4V_f^2}{k+1}\right\}, \qquad (10)$$

where $v_k = \sum_{j=1}^{p}|\hat\beta_{j,k}|$ and $V_f = \sum_{j=1}^{p}|\beta_j|$, with $\hat\beta_{j,k}$ the estimate of $\beta_j$ at the $k$-th iteration.

Proof: The proof is given in the appendix.

4. Numerical Studies

We investigate the performance of the proposed λ along with the algorithm through simulations. We will use the lower bound of λ as its value for all numerical investigations. First, we create a matrix of 100 rows and 1000 columns by randomly drawing 1000 observations from a 100-dimensional multivariate normal distribution with mean 0 and pairwise correlation 0.1. Throughout the simulation study, we keep this matrix fixed and use the appropriate number of rows and columns as the design matrix under four different scenarios: (a) $n = 100$, $p = 50$, (b) $n = 1000$, $p = 100$, for low dimension, and (c) $n = 50$, $p = 100$, (d) $n = 100$, $p = 1000$ for high dimension. For (a), we use the first 50 columns of the matrix; for (b), we transpose the matrix; and for (c), we consider the first 100 columns with their first 50 rows. Let $\beta$ denote the true vector of regression coefficients. So, $\beta$ is a vector of length 50 for (a), of length 100 for (b) and (c), and of length 1000 for (d). In each case, we randomly choose seven elements of $\beta$ and set them to unity, and the rest of the elements are all zero. Let $Z$ be the design matrix of appropriate order. We generate the log-transformed survival times from an extreme value distribution with the linear predictor $Z'\beta$ as location parameter. The value of the shape parameter is set to unity.
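A simulation design of this kind can be sketched with the Python standard library alone. The equicorrelated covariates below are built from one shared factor per row, and the censoring rule (an independent normal log-censoring time controlled by `censor_shift`) is an assumption made for illustration, since the censoring mechanism is not spelled out here. The function name and its defaults are hypothetical.

```python
import math
import random


def simulate_aft(n, p, n_active=7, rho=0.1, sigma=1.0, censor_shift=2.0, seed=0):
    """Simulate (Z, beta, v, delta) from an AFT model with extreme value errors."""
    rng = random.Random(seed)
    # Covariates with pairwise correlation rho: shared factor plus own noise.
    Z = []
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)
        Z.append([math.sqrt(rho) * g + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
                  for _ in range(p)])
    # Seven randomly chosen coefficients set to one, the rest zero.
    beta = [0.0] * p
    for j in rng.sample(range(p), n_active):
        beta[j] = 1.0
    v, delta = [], []
    for z_i in Z:
        f_i = sum(b * z for b, z in zip(beta, z_i))
        # The log of a unit exponential is a standard (minimum) extreme value draw.
        w_i = math.log(max(rng.expovariate(1.0), 1e-300))
        x_i = f_i + sigma * w_i
        # Assumed censoring rule: independent normal log-censoring time.
        c_i = rng.gauss(censor_shift, 1.0)
        v.append(min(x_i, c_i))
        delta.append(1 if x_i <= c_i else 0)
    return Z, beta, v, delta


Z, beta, v, delta = simulate_aft(n=100, p=50)
print(sum(beta), 100.0 * (1 - sum(delta) / len(delta)))  # active count, % censored
```

Lowering `censor_shift` raises the censoring percentage, which is one simple way to explore different censoring proportions.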
Next, the variables are selected through the algorithm discussed in Section 3. As we have developed the bound on λ separately for known and unknown $\sigma$, the simulation study is also performed separately, and the results are summarized in Tables 1 and 2, respectively. We compare our proposed method with the BIC, given by $-2\,\text{loglikelihood} + k\log n$, where $k$ denotes the number of predictors selected in the model. For this, a sequence of λ was generated, and for each of them, we determine the BIC. The optimal model is the one for which the BIC is minimum. In both tables, $n$ represents the number of subjects, $p$ is the number of candidate covariates, Cens. Pcnt. gives the percentage of censoring, Mthd. represents the method used to select variables, TMDR is the true model detection rate, which is the percentage of replications where the full model (all seven correct covariates) is detected, Median and Mean are the median and mean number of correct variables detected, and Avg. Incln. is the average model size, from the replications. Our proposed information theory-based method is represented by InfTh and BIC by BIC in the tables. For unknown $\sigma$, additionally, we estimate the scale parameter. The estimation of $\sigma$ in our context is discussed in Subsection 4.2.

4.1. Simulation for Known $\sigma$

Here, we use the lower bound from Result 1 as the value of λ. The simulation results are summarized in Table 1.

Table 1. Summary of Simulation Results for Known Scale Parameter
Columns: n, p, Cens. Pcnt., Mthd. (InfTh or BIC), TMDR, Median, Mean, Avg. Incln.
(The numeric entries of the table are not preserved in this transcription.)
Source: Author's own.
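The summary columns of the tables can be computed from the index sets selected in each replication. The sketch below is one reading of those definitions: it counts the true model as detected whenever all true covariates are among the selected ones, which is an assumption about how TMDR treats extra variables. The function name and the example selections are hypothetical.

```python
import statistics


def summarize_selection(selected_sets, true_set):
    """Summaries over replications of the covariate indices selected in each one."""
    true_set = set(true_set)
    correct_counts = [len(set(s) & true_set) for s in selected_sets]
    model_sizes = [len(set(s)) for s in selected_sets]
    full_model_hits = sum(1 for s in selected_sets if true_set <= set(s))
    return {
        "TMDR": 100.0 * full_model_hits / len(selected_sets),
        "Median": statistics.median(correct_counts),
        "Mean": statistics.mean(correct_counts),
        "Avg. Incln.": statistics.mean(model_sizes),
    }


true_set = range(7)              # seven truly active covariates
replications = [
    [0, 1, 2, 3, 4, 5, 6],       # exact recovery
    [0, 1, 2, 3, 4, 5, 6, 9],    # all correct plus one false inclusion
    [0, 1, 2, 3, 4, 5],          # one true covariate missed
    [0, 1, 2, 3, 4, 5, 6],
]
summary = summarize_selection(replications, true_set)
print(summary)
```

A mean number of correct variables close to the average model size indicates that few incorrect variables enter the selected models.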

We see that for $n > p$, and up to 40 percent censoring, the correct variable detection rate is almost 100 percent. Additionally, for $n = 1000$, $p = 100$, no false covariate is identified as nonzero. For $n = 50$, $p = 100$ with 5 percent censoring, the TMDR is 68 percent and gradually decreases as censoring increases. For $n = 100$, $p = 1000$, the true model identification rate is more than 90 percent up to 20 percent censoring. The performance sharply decreases for 30 percent or more censoring, though the median number of correct variables detected is not far from the true one. In general, the TMDR from our proposed method is higher than that from BIC. In all situations we consider here, the mean number of correct variables detected is close to the average model size. This indicates the inclusion of few incorrect variables in the model.

4.2. Simulation for Unknown $\sigma$

In reality, $\sigma$ is unknown. We start with the initial value of the regression coefficients $\beta^0 = 0$. We replace $\beta$ by $\beta^0$ in (7) and then optimize it as a function of $\sigma$. We call the R optimization routine optimize for this purpose. With this estimate of $\sigma$, we estimate λ and then update the regression coefficients from (9), following the algorithm. At the next iteration, we first update the estimate of λ and then estimate $\sigma$ by optimizing (7) with $\beta$ replaced by its updated estimate from the first iteration. This new estimate of $\sigma$ yields another estimate of λ, and using them, we update $\beta$. The whole process is repeated until some convergence criterion is attained or for some fixed number of times. It should be remarked that the initial estimate of $\sigma$ is a type of maximum likelihood estimate.

As before, the frequency of detecting correct nonzero variables along with the average number of nonzero detections are provided in Table 2. From Table 2, we get a clear picture of the performance of the proposed algorithm as well as the bound of the penalizing parameter for the extreme value distribution under different censoring proportions. For $n = 50$ and $p = 100$ with 5 percent censoring, the proposed method detects the entire model in 60 percent of cases. The same for percent and percent censoring are 73. percent and 6.8 percent, respectively. We also note that at least six correct variables are detected in more than 90 percent and 80 percent of cases, respectively. As the proportion of censoring increases, it detects the entire model less often. Around 30 percent censoring, the true model is detected around 0 percent times. For $n = 100$ and $p = 50$, up to 40 percent censoring, the full model is detected more than 90 percent of the time. Next, for $n = 100$ and $p = 1000$, around 5 percent censoring, the detection is as high as 93 percent. It remains almost the same if the censoring increases to near 10 percent. For 15.5 percent censoring, the detection is around 86 percent. And, for 20 percent censored data, the entire model is detected in nearly 80 percent of replications. Thereafter, the detection goes down to 72 percent for 25 percent censoring. But, for 30 percent censoring, the true model is detected less than 50 percent of the time. It should be mentioned that in the known shape parameter study, the correct model was detected 66 percent of the time for around 30 percent censoring. Here also, the TMDR from our proposed method is higher than that from BIC. We also notice that the TMDR as well as the average number of correct variables detected from BIC are slightly higher than those from our proposed method, at the cost of higher average model length, for higher censoring scenarios. Additionally, the computation time is longer for BIC, which may be due to considering a long sequence of λ.

Generally speaking, we find that the performance of correct detection of our proposed method is excellent irrespective of known or unknown variance up to 20 percent censoring. Around 25 percent censoring, it works well to detect the correct variables. But, we note that for 30 percent censoring, it detects at least six correct variables more than 70 percent of the time (not shown in the tables). Lastly, for $n = 1000$ and $p = 100$, the entire model is detected in all replications and for all percentages of censoring.

Table 2. Summary of Simulation Results for Unknown Scale Parameter
Columns: n, p, Cens. Pcnt., Mthd. (InfTh or BIC), TMDR, Median, Mean, Avg. Incln.
(The numeric entries of the table are not preserved in this transcription.)
Source: Author's own.

5. Discussion

The above results enable us to draw some conclusions about the overall performance of the proposed method when the event time is assumed to follow a Weibull distribution. First, we have come up with a justification for the penalizing parameter that makes the variable selection procedure information theoretically valid. The bound of λ depends on the probability model adopted to analyze the data, and hence, it reflects the uncertainty of the data through the model. Second, we have used a greedy algorithm for the log-Weibull model. Generally speaking, our proposed method is able to detect the correct model frequently for both known and unknown shape parameter in all four situations considered in the simulation study. As censoring increases, the detection of correct variables becomes worse. Because of having a closed-form expression of the survival function, the time to perform the numerical experiment was short. In the numerical experiment, the correlation among the predictors was 0.1. For highly correlated predictors, LASSO may not work properly as a variable selection tool, and one may need some other penalty function to account for the correlation. It might be interesting to determine the bound on the penalizing parameter for those penalty functions and study the performance. We leave it as our future research.

Acknowledgements

The author is thankful to the anonymous referee and the editor for their valuable comments and constructive suggestions that improved the article.

Appendix A

Here, we outline the proofs of Results 1 and 2, and show the convergence of the proposed algorithm.

Proof of Result 1: From Barron et al., [18] the condition on the penalty function is:

penf log Lf X log L f X L E f X X E L f X X + L f (11)

Proof: We start with the first term from (11). Using the fact that f p f and expanding f around f we get:

L f v 1,v,..., v n L f v 1,v,..., v n log Lf x L f x { f f + e v f { } δ e v e f e f } δ { } 1 δ e v f e C f e C f { } 1 δ e C e f e f { e v e f f f } δ { e C e f f f } 1 δ { [δ e v e f f f } { + 1 δ e C e f f f }] [ e f f f { } ] δ e v + 1 δ e C (12)

Consider the second term from the condition (11):

L f V 1,V,..., V n E V 1,V,..., V n Lf V E L f V δ Lf V 1 Pδ 1 + E V δ 0 Pδ 0 [ { e f f C e x } e f e f + x f e x f dx e C f e C f + e C f T 1 + T 2. (13)

The integral from (13), which is denoted as T 1, can be evaluated explicitly by changing the variable k e x w, say, where k e f +e f. For the $i$-th subject, we have:

L f v T 1 v PX C C L f v v v dv C Lf v v dv { C e f +f 1 e v e f e f +f f e +f e f + e f 1 C { ke v 1 e ke C } e v dv } + e f e v dv

Similarly, for our convenience, we write T 2 explicitly in the following.

T 2 { e C f e C e C f e f e C f + e f } (14)

So, the second term from (11) becomes:

L E f E L f e f +f e f +e f e f + f e f +e f { 1 e ke C + e C { 1 e ke C + e C e f e f } + e f } e f (15)

We use the approximation A+B C+D A. Thereafter, expanding by Taylor series up to first order and since C p f f, then by the fact that for x 0, e x 1 + x + x, (15) can be written as:

L E f E L f e f +f e f +e f e f + f e f +e f e f f e f e f 1 e ke C 1 e ke C + e f + e f 1 C e ke 1 e ke C { 1 + f f + f f } 8 1 e ke C + f f e ke C e f + e f e f + f f e f + e f 1 e ke C {1 + f f 8 }, (16)

where e ke C is the first order derivative of e ke C, and we have used the fact that f p f. Now, taking log on both sides of (16) we get

L E f log E E log E L f L f L f log {1 + f f } f f 8 8 f f 4 (17)

Together with (12), (17) and the fact that f f VV f K from Luo, [22] the condition (11) reduces to:

penf [ e f f f { δ e x [ e f f f { δ e x [ f f VV f K } + 1 δ e C f f 4 } + 1 δ e C f f {δ e x f δ ] + K log p f f ± δ 4 4 e C f 1 }] + K log p ] + K log p [{δ e x f δ e C f 1 }] + K log p. (18)

Following a route similar to that shown in Luo, [22] we differentiate the right-hand side of (18) with respect to K, and then equating this with zero we get

[{ VVf K δ e x f δ e C f 1 }] log p

Then, replacing the value of K and writing penf λv f in (18) with V V f, we get:

λv f VV f λ [ log p δ e x f δ e C f 1 ] [ log p δ e x f δ e C f 1 ] (19)

And this completes the proof of the theorem. For unknown $\sigma$, we have another result.

Proof of Result 2: We construct the representer of $f$ in the same way as in the known variance case, and the representer of $\sigma$ is also constructed, since $\sigma$ is unknown. Following Barron et al., [18] we consider the representers of $f$ and $\sigma$, and in (11), we have L f, instead of L f. From Barron et al., [18] we take L f, K log p + log(K + 1), where K will be determined later, and K + 1 4pn as suggested in Barron et al. [18] We choose the representer of $\sigma$ as a logarithmic discretization of $\sigma$, which makes their difference negligible in log scale. By choosing ɛ to be a very small positive number, the representer is constructed as e ɛ. Next, following a route similar to that of Theorem 1 and some algebra, one can show the lower bound:

λ [ { δ + 1 δ e C } ɛ f log e ɛ n] p

Remark 4: In practice, we need to estimate m 1. With our choice of ɛ, we have e n 1. Then, $\sigma$ and its representer are replaced by the estimate $\hat\sigma$, as discussed in Section 4, which yields the estimate of m 1 as e n. The above bound can be exactly the same as the unknown-$\sigma$ bound derived by Luo [22] for ɛ 1 ˆ n, where $n$ is the number of subjects involved in the study.

Proof of accuracy of the proposed algorithm:

Proof: Let e k 1 n log L ˆf x L ˆf k x + λv k V f. Then, using (20) we get:

e k 1 n log L ˆf x L ˆf k x + λv k V f 1 p log ˆf x δ P ˆf C 1 δ + λv k V f n p ˆf k x P ˆf k C { } 1 log ˆf n δ,k ˆf + e x ˆf,k e x ˆf + 1 δ log [ { }] 1 ˆf,k ˆf log δ + e x ˆf,k e x ˆf + λv k V f n + 1 e C ˆf n 1 δ log e C ˆf,k, e C ˆf e C ˆf,k + λv k V f

where $\hat f_i = Z_i'\hat\beta$ and $\hat f_{i,k} = Z_i'\hat\beta_k$, with $\hat\beta_k$, obtained at the $k$-th iteration, the estimate of $\beta$. To prove the theorem we need to show:

e k 1 αe k α V f. (20)

It is clear that to have the inequality (20), we only need to tackle the ratio of the survival functions from (20), since Luo [22] has provided a general proof of the theorem for any density in the absence of censoring. For the $i$-th subject, we rewrite the ratio of the survival functions from (20) in the following way:

log e C ˆf e C ˆf,k e C ˆf e C ˆf,k ᾱ log + log e C ˆf e C ˆf,k 1 e C ˆf ᾱ { e C ˆf,k 1 { } α { e C ˆf e C ˆf e C ˆf,k } α { e C ˆf,k e C ˆf,k 1 }ᾱ e C ˆf,k 1 }ᾱ

So, to prove (20) we need to show

{ e C ˆf } α { e C ˆf,k e C ˆf,k 1 }ᾱ 1. (21)

First we replace ˆf,k by the algorithm that ˆf,k Z β k ᾱ ˆf,k 1 + γz l, and then we pick some customary value for the optimum $\alpha_t$ and $\gamma_t$ in such a way that γz l Z β p. Using these, we rewrite the left-hand side of (21) in the following way:

{ } α { }ᾱ e C ˆf e C ˆf,k 1 e C ˆf,k { e C ˆf { }ᾱ { e C ˆf ᾱe C ˆf e C ˆf αe C ˆf,k 1 } { } e C ˆf,k 1 } e C ˆf α,k 1 e C ˆf,k e C ˆf,k 1 e C ᾱ ˆf,k 1 αf e C ˆf e C ˆf,k 1 ᾱe C ˆf αe C ˆf,k 1 e C ᾱ ˆf,k 1 αf (22)

Let $D_e$ denote the denominator of (22). We show that $D_e$ has a maximum for $0 < \alpha < 1$. We note that for $\alpha = 0$ or 1, the ratio in (22) reduces to 1. For simplicity, we work with $\log D_e$. It can be shown that

log D e α C ˆf,k 1 + C ˆf ˆf,k 1 ˆf C ᾱ ˆf,k 1 α ˆf (23)

and

log D e α ˆf,k 1 ˆf C ᾱ ˆf,k 1 α ˆf (24)

respectively. We see from (24) that $\partial^2\log D_e/\partial\alpha^2 < 0$ for $\alpha \in (0, 1)$. So, we conclude that $\log D_e$ and, hence, $D_e$ attains its maximum for α in the open interval (0, 1). Also, $D_e$ cannot have its maximum for $\alpha = 0$ or 1 since in that case, to be consistent with our finding in (24), $D_e$ has to be a constant. As a result, the ratio in (22) is less than or equal to 1. This completes the proof of the theorem.

References

1. Kalbfleisch JD, Prentice RL. Statistical Analysis of Failure Time Data (2nd edition). New York, USA: John Wiley & Sons; 2002.
2. Klein JP, Moeschberger ML. Survival Analysis: Techniques for Censored and Truncated Data. London, UK: Springer.
3. Cox DR. Regression Models and Life Tables (with Discussions). J R Stat Soc B. 1972; 34:
4. Wei LJ. The Accelerated Failure Time Model: A Useful Alternative to the Cox Regression Model in Survival Analysis. Stat Med. 1992; 11:
5. Hutton JL, Monaghan PF. Choice of Parametric Accelerated Life and Proportional Hazard Models for Survival Data: Asymptotic Results. Lifetime Data Anal. 2002; 8:
6. Kwong GPS, Hutton JL. Choice of Parametric Models in Survival Analysis: Applications to Monotherapy for Epilepsy and Cerebral Palsy. Appl Stat. 2003; 52:
7. Matsushita S, Hagiwara K, Shiota T, Shimada H, Kuramoto K, Toyokura Y. Lifetime Data Analysis of Disease and Aging by the Weibull Probability Distribution. J Clin Epidemiol. 1992; 45(10):
8. Swindell WR. Accelerated Failure Time Models Provide a Useful Statistical Framework for Aging Research. Exp Gerontol. 2009; 44(3):
9. Helsen K, Schmittlein DC. Analyzing Duration Times in Marketing: Evidence for the Effectiveness of Hazard Rate Models. Marketing Sci. 1993; 12(4):
10. Breiman L. Heuristics of Instability and Stabilization in Model Selection. Ann Stat. 1996; 24:
11. Fan J, Li R. Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties. J Amer Stat Assoc. 2001; 96(456):
12. Tibshirani R. Regression Shrinkage and Selection via the Lasso. J R Stat Soc B. 1996; 58:
13. Tibshirani R. The LASSO Method for Variable Selection in the Cox Model. Stat Med. 1997; 16:
14. Simon N, Friedman J, Hastie T, Tibshirani R. Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent. J Stat Softw. 2011; 39(5):
15. Barron AR, Luo X. MDL Procedures with l1 Penalty and Their Statistical Risk. Proceedings Workshop on Information Theoretic Methods in Science and Engineering. Tampere, Finland: Tampere University of Technology.
16. Grünwald P. The Minimum Description Length Principle. Cambridge, MA: MIT Press; 2007.
17. Barron AR, Huang C, Li JQ, Luo X. The MDL Principle, Penalized Likelihoods, and Statistical Risk. In: Grünwald P, Myllymäki P, Tabus I, Weinberger M, Yu B, editors. Festschrift for Jorma Rissanen. Tampere University Press; 2008a, pp.
18. Barron AR, Cohen A, Dahmen W, DeVore R. Approximations and Learning by Greedy Algorithms. Ann Stat. 2008; 36:
19. Kullback S. Information Theory and Statistics. New York: Wiley; 1959 (reprinted in 1968 by Dover).
20. Bhattacharya A. On a Measure of Divergence between Two Statistical Populations Defined by Probability Distributions. Bull Calcutta Math Soc. 1943; 35:
21. Ebrahimi N, Soofi ES, Soyer R. Information Measures in Perspective. Intern Stat Rev. 2010; 78:
22. Luo X. Penalized Likelihoods: Fast Algorithms and Risk Bounds. PhD Thesis, Statistics Department, Yale University.
23. Knight K, Fu W. Asymptotics for LASSO Type Estimator. Ann Stat. 2000; 28:
24. Johnson BA. On LASSO for Censored Data. Elec J Stat. 2009; 3:
25. Cai T, Huang J, Tian L. Regularized Estimation for the Accelerated Failure Time Model. Biometrics. 2009; 65:


More information

2.4 Bivariate distributions

2.4 Bivariate distributions page 28 2.4 Bvarate dstrbutons 2.4.1 Defntons Let X and Y be dscrete r.v.s defned on the same probablty space (S, F, P). Instead of treatng them separately, t s often necessary to thnk of them actng together

More information

A Probabilistic Theory of Coherence

A Probabilistic Theory of Coherence A Probablstc Theory of Coherence BRANDEN FITELSON. The Coherence Measure C Let E be a set of n propostons E,..., E n. We seek a probablstc measure C(E) of the degree of coherence of E. Intutvely, we want

More information

EE201 Circuit Theory I 2015 Spring. Dr. Yılmaz KALKAN

EE201 Circuit Theory I 2015 Spring. Dr. Yılmaz KALKAN EE201 Crcut Theory I 2015 Sprng Dr. Yılmaz KALKAN 1. Basc Concepts (Chapter 1 of Nlsson - 3 Hrs.) Introducton, Current and Voltage, Power and Energy 2. Basc Laws (Chapter 2&3 of Nlsson - 6 Hrs.) Voltage

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Sldes Prepared JOHN S. LOUCKS St. Edward s Unverst Slde Chapter 4 Smple Lnear Regresson Smple Lnear Regresson Model Least Squares Method Coeffcent of Determnaton Model Assumptons Testng for Sgnfcance Usng

More information

MAPP. MERIS level 3 cloud and water vapour products. Issue: 1. Revision: 0. Date: 9.12.1998. Function Name Organisation Signature Date

MAPP. MERIS level 3 cloud and water vapour products. Issue: 1. Revision: 0. Date: 9.12.1998. Function Name Organisation Signature Date Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts Power-of-wo Polces for Sngle- Warehouse Mult-Retaler Inventory Systems wth Order Frequency Dscounts José A. Ventura Pennsylvana State Unversty (USA) Yale. Herer echnon Israel Insttute of echnology (Israel)

More information