Evaluating the generalizability of an RCT using electronic health records data


 Brent Washington
Transcription
1 Evaluating the generalizability of an RCT using electronic health records data
2 3 interesting questions
- Is our RCT representative?
- How can we generalize RCT results?
- Can we use EHR* data as a control group?
*) Electronic Health Records
3
4 Difference in data collection
The RCT data:
- Patients included by criteria
- Regular visits
- Measurements according to protocol
- Extensive data cleaning
- Usually randomized
- Missing could be noninformative, e.g. LDL missing due to lost sample
The EHR data:
- Patients included if they go to the doctor
- Visits as needed
- Measurements as needed
- Data cleaning?
- Confounding
- Missing is almost never noninformative, e.g. LDL not needed and thus missing
- Placebo effect
5 An illustrative(?) example
One large global CV outcome study in primary prevention with liberal inclusion/exclusion criteria.
Three CPRD cohorts indicated for primary prevention of CV disease were created from a 6 year study window ( ):
- Cohort 1: patients meeting the trial's inclusion criteria, excluding those with prior CV history or a CRP value indicating severe bacterial infection
- Cohort 2: patients meeting NICE guidelines for recommended statin therapy in primary prevention of CV disease
6 Sample sizes and variables
Sample sizes (table; numeric values missing from the transcription). Columns: Data set, N, N complete cases, N after imputation. Rows: RCT; CPRD, realistic cohort; CPRD, very broad cohort.
The analysis was run on two different sets of variables:
- Framingham variables: AGE, BMI, TC, sex, smoking
- All* variables: AGE, ANTIHT_USE, ASA_USE, BMI, CRP, DBP, FPG, HDL, LDL, SBP, TC, TG, sex, smoking, weight
7 RWE Introduction, February 2013
8
9 If we have the RCT results, what can we say about the effect in a different population?
10 Is our trial representative? Compare patient characteristics:
- One variable at a time
- All variables at the same time
  - Convex hull
  - Cross matching
  - (Linear) discriminant analysis
11 Descriptive statistics per variable
Table (numeric values missing from the transcription). Columns: RCT mean, RCT std, EHR mean, EHR std. Rows: GENDER_MALE, AGE, WEIGHT, BMI, SBP, DBP, CIGS/DAY, SMOKER, FPG, GLUC, LDL, HDL, TG, TC, HBA1C, CRP, ASA USE, ANTIHT USE, MEDHIST DM, FAMHIST CHD.
12 Convex hull
Idea: construct the convex hull for the RCT patients and see how many RWE patients fall into it.
Definition: the convex hull of a set S is the smallest convex set that contains S.
- 1-d convex hull: the range
- 2-d convex hull:
- K-d convex hull: wrap it in (stiff) paper
Example: matchit(formula = trial ~ AGE + BMI + CRP + SBP + DBP + sex + LDL + FPG + CR_CL + TC + TG + HDL, data = anadata, method = "nearest", discard = "hull.control")
Sample sizes, EHR patients:
- All: 3933
- In the RCT convex hull: 254
- Discarded: 3679
Almost all of the EHR patients lie outside the convex hull of the RCT patients. It is enough to be extreme on one variable (or in one direction).
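The convex hull check can be illustrated in two dimensions with a short sketch. This is not the MatchIt implementation used on the slide; it is a plain Python illustration (written in Python rather than R so that it runs with the standard library only), and the (age, BMI) pairs are hypothetical values, not study data.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


# Hypothetical (age, BMI) pairs, for illustration only.
rct = [(55, 24), (60, 31), (70, 27), (65, 22), (58, 29), (62, 26)]
ehr = [(61, 26), (45, 25), (63, 40), (66, 25), (59, 28)]

hull = convex_hull(rct)
inside = [p for p in ehr if in_hull(hull, p)]
discarded = [p for p in ehr if not in_hull(hull, p)]
print(len(inside), "inside the RCT hull,", len(discarded), "discarded")
```

The patient (63, 40) has an unremarkable age but is discarded because of BMI alone, which is the "extreme on one variable" effect; with more covariates the hull becomes harder to fall into.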
13 Cross matching
Cross matching as a test comparing multivariate distributions, Rosenbaum (2005).
Idea:
- Merge all the data
- Create matched pairs using the Mahalanobis distance
- Count the number of cross matches (A matched to B)
- The number of cross matches has a known distribution under H0
Observed number of cross matches: 907
Expected number of cross matches under H0: 1183
P-value: 10**(-35)
This indicates that the RCT and RWE populations differ.
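As a sketch of why the count of cross matches can be tested, the exact null distribution can be tabulated for small groups. The function name and the group sizes below are illustrative assumptions; the pairing step itself (optimal matching on Mahalanobis distance) is not shown. Few cross matches count as evidence against H0, so the p-value is a lower tail.

```python
from fractions import Fraction
from math import comb, factorial


def crossmatch_null(n, m):
    """Exact null distribution of the number of cross-matched pairs.

    n and m are the two group sizes; n + m must be even so that every
    unit can be paired. Returns a dict {a1: probability}.
    """
    total = n + m
    if total % 2:
        raise ValueError("n + m must be even")
    pairs = total // 2
    dist = {}
    for a1 in range(min(n, m) + 1):
        if (n - a1) % 2:  # units left over must pair up within their own group
            continue
        within_n = (n - a1) // 2  # pairs with both units from the first group
        within_m = (m - a1) // 2  # pairs with both units from the second group
        num = 2 ** a1 * factorial(pairs)
        den = comb(total, n) * factorial(within_n) * factorial(a1) * factorial(within_m)
        dist[a1] = Fraction(num, den)
    return dist


# Hypothetical group sizes, for illustration only.
n, m = 6, 4
dist = crossmatch_null(n, m)
expected = sum(a1 * p for a1, p in dist.items())

observed = 0  # hypothetical observed number of cross matches
p_value = sum(p for a1, p in dist.items() if a1 <= observed)
print(expected, p_value)
```

The mean of this distribution equals n*m/(n + m - 1), which gives a quick reference value against which an observed count can be compared.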
14 Linear discriminant analysis
Try linear discriminant analysis to find a linear function that separates RCT and EHR patients.
Coefficients of linear discriminants (LD1) for AGE_I, BMI, CRP, SBP, DBP, LDL, FPG, CR_CL, TC, TG, HDL (values missing from the transcription).
Figure: Comparing distribution of posterior prediction probabilities, RCT (mean = 0.58) versus RWE (mean missing); x-axis: posterior prediction probabilities, y-axis: percent of total.
15 Next steps
- Propensity score (Stuart & Cole 2010, 2011)
- Cross design synthesis (Kaizar 2009)
- Hierarchical models (Prevost et al 2000)
16 The nice thing about the propensity score
$e(X) = P(T = 1 \mid X)$: the covariate vector $X$ is summarized by the single number $e(X)$.
17 For us
- $S$ indicates membership in the RCT sample
- $T$ indicates treatment assignment, $T \in \{1, 0\}$
- covariates $X$
- potential outcomes: $Y(1)$ would be observed under treatment, $Y(0)$ would be observed under control
The sample average treatment effect:
$$\mathrm{SATE} = \frac{1}{n} \sum_{i:\, S_i = 1} \bigl( Y_i(1) - Y_i(0) \bigr)$$
The population average treatment effect:
$$\mathrm{PATE} = \frac{1}{N} \sum_{i=1}^{N} \bigl( Y_i(1) - Y_i(0) \bigr)$$
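A toy calculation may help separate the two averages. The patients below are hypothetical, and both potential outcomes are listed only because this is an illustration; in real data just one of them is observed per patient.

```python
# Hypothetical patients: s = 1 marks membership in the RCT sample.
patients = [
    {"s": 1, "y1": 5.0, "y0": 3.0},
    {"s": 1, "y1": 4.0, "y0": 2.0},
    {"s": 1, "y1": 6.0, "y0": 4.0},
    {"s": 0, "y1": 4.0, "y0": 3.0},
    {"s": 0, "y1": 3.0, "y0": 2.0},
    {"s": 0, "y1": 5.0, "y0": 4.0},
]


def average_effect(units):
    """Mean of the individual effects y1 - y0 over the given patients."""
    return sum(u["y1"] - u["y0"] for u in units) / len(units)


sate = average_effect([u for u in patients if u["s"] == 1])  # trial sample only
pate = average_effect(patients)                              # whole population
print(sate, pate)
```

Here the trial patients respond more strongly than the rest, so the sample average overstates the population average; that gap is what the generalizability question is about.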
18 Key assumptions
- All patients in the population have some probability of being in the trial and no patients are always in the trial: $0 < P(S = 1 \mid X) < 1$
- Inclusion in the trial does not depend on the potential outcomes except through the covariates $X$: $S \perp (Y(0), Y(1)) \mid X$
- Treatment assignment does not depend on the inclusion into the trial or the potential outcomes except through $X$: $T \perp (S, Y(0), Y(1)) \mid X$
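19 Propensity score as a distance measure
$$\Delta_p = \frac{1}{n} \sum_{i:\, S_i = 1} e(x_i) - \frac{1}{N - n} \sum_{i:\, S_i = 0} e(x_i)$$
So, what constitutes a big difference?
Suggestion: big if $|\Delta_p| / \hat{\sigma}_p > c$, where $\hat{\sigma}_p$ is the standard deviation of the propensity scores; $c = 0.25$ or $0.1$ has been suggested.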
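A minimal sketch of the distance computation, assuming the propensity scores have already been estimated. The scores and memberships are hypothetical, and standardizing by the sample standard deviation of all scores is an assumption of this sketch rather than something the slide specifies.

```python
from statistics import mean, stdev


def propensity_distance(scores, in_trial):
    """Difference in mean propensity score, trial minus non-trial.

    Returns (raw difference, difference divided by the standard deviation
    of all scores). The choice of standard deviation is an assumption.
    """
    trial = [e for e, s in zip(scores, in_trial) if s == 1]
    rest = [e for e, s in zip(scores, in_trial) if s == 0]
    delta = mean(trial) - mean(rest)
    return delta, delta / stdev(scores)


# Hypothetical estimated propensity scores and trial membership.
scores = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
in_trial = [1, 1, 1, 0, 1, 0, 0, 0]

delta, standardized = propensity_distance(scores, in_trial)
is_big = abs(standardized) > 0.25  # 0.1 has also been suggested as a cutoff
print(delta, standardized, is_big)
```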
20 Propensity score, all variables
Figure: Comparing distribution of propensity scores (PLSglm), RCT (Jupiter) versus CPRD; x-axis: propensity score, y-axis: percent of total. Panel Cohort 1: Δ = 0.165. Panel Cohort 3: Δ = 0.478.
21 Propensity scores based on risk factors
Figure: Comparing distribution of propensity scores (PLSglm, FH), RCT (Jupiter) versus CPRD; x-axis: propensity score, y-axis: percent of total. Panel Cohort 1: Δ = 0.06. Panel Cohort 3: Δ = 0.38.
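22 Predict the results in the EHR population (PATE)
IPSW: inverse probability of selection weight.
Weight: $w_i = \dfrac{P(S_i = 1)}{P(S_i = 1 \mid x_i)}$
In our case the endpoint is a time to event, so we'll fit a weighted Cox proportional hazards model with the partial likelihood
$$L(\beta) = \prod_{i=1}^{n} \prod_{k=1}^{n_i} \left[ \frac{\exp(x_i \beta)\, w_i}{\sum_{j \in R(t_{ik})} \exp(x_j \beta)\, w_j} \right]^{Y_{ik}}$$
(Stuart & Cole 2011)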
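The weighting idea can be sketched with a simpler outcome than time to event. The example below replaces the weighted Cox model with a difference in weighted means, which is a deliberate simplification; the selection probabilities, outcomes, and function names are all hypothetical. Because the marginal factor in the weight is constant, it cancels in a weighted mean and defaults to 1 here.

```python
def ipsw_weights(p_select, marginal=1.0):
    """Inverse probability of selection weights: marginal / P(S = 1 | x_i)."""
    return [marginal / p for p in p_select]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Hypothetical trial patients: (estimated P(S = 1 | x), treated?, outcome).
trial = [(0.8, 1, 1.0), (0.8, 0, 2.0), (0.2, 1, 3.0), (0.2, 0, 6.0)]
weights = ipsw_weights([p for p, _, _ in trial])


def arm_mean(arm, use_weights):
    """Mean outcome in one arm, optionally reweighted to the population."""
    rows = [(y, w) for (_, t, y), w in zip(trial, weights) if t == arm]
    ys = [y for y, _ in rows]
    ws = [w for _, w in rows] if use_weights else [1.0] * len(rows)
    return weighted_mean(ys, ws)


effect_trial = arm_mean(1, False) - arm_mean(0, False)
effect_population = arm_mean(1, True) - arm_mean(0, True)
print(effect_trial, effect_population)
```

Patients who were unlikely to be selected into the trial get large weights, so the reweighted effect leans towards the kind of patient the trial under-represents.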
23 Predicted treatment effect in the example
Table (numeric values missing from the transcription). Columns: Cohort, Variables used, Hazard ratio, Lower, Upper, p-value. Rows: RCT; Cohort 1, All; Cohort 1, Framingham; Cohort 3, All; Cohort 3, Framingham. Each p-value is reported as "<" a bound.
- The predicted treatment effect is slightly closer to 1, which indicates an under-representation of low-risk patients in the Jupiter population
- The difference is smallest for cohort 1 and largest for cohort 3
- Risk factors don't account for all confounding(?)
24 What is next?
pacehr
Flow diagram: Tralo (n=453); EHR (n=38,000); N=662; at least 2 exacerbations 1 year pre-index; age; spirometry data; N=367; no COPD and 1 year of follow-up data; only GINA 4 and 5; placebo (n=151); N=193; iptw.
25 EHR data as a control group
Weight the EHR data by $\dfrac{P(S = 1 \mid X)}{P(S = 0 \mid X)}$.
Weight the EHR to estimate the treatment outcome that would have been observed if the EHR data had the same distribution of patient characteristics as the RCT.
Sort of like estimating the ATT in an observational study.
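A small sketch of this odds weighting, assuming the selection probabilities P(S = 1 | X) have already been estimated for each EHR patient. The probabilities and outcomes are hypothetical.

```python
def odds_weights(p_select):
    """Odds of trial membership, P(S = 1 | x) / P(S = 0 | x), per EHR patient."""
    return [p / (1.0 - p) for p in p_select]


# Hypothetical EHR patients: estimated P(S = 1 | x) and observed outcome.
p_select = [0.5, 0.2, 0.8]
outcomes = [10.0, 20.0, 30.0]

w = odds_weights(p_select)
control_mean = sum(wi * yi for wi, yi in zip(w, outcomes)) / sum(w)
print(control_mean)
```

EHR patients who resemble trial participants (high odds) dominate the weighted mean, which is what makes the reweighted EHR cohort usable as a stand-in control group.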
26 Doing without patient level RCT data
- Evaluate the generalizability using Pressler's method
- Use weights from the method of moments (Signorovitch 2012)
27 Doing without the RCT patient data 1
Idea: use the RCT inclusion/exclusion criteria to split a registry cohort into RCT eligible (I) and RCT non-eligible (E) patients. No actual RCT data needed!
Define the sub-population average treatment effect in each sub-population:
$$\mathrm{PATE} = \frac{1}{N} \sum_{i=1}^{N} \bigl( Y_i(1) - Y_i(0) \bigr)$$
$$\mathrm{SPATE}(I) = \frac{1}{N_I} \sum_{i \in I} \bigl( Y_i(1) - Y_i(0) \bigr)$$
$$\mathrm{SPATE}(E) = \frac{1}{N_E} \sum_{i \in E} \bigl( Y_i(1) - Y_i(0) \bigr)$$
Measure the generalization error by comparing outcomes for I and the whole population: $\widehat{\mathrm{SPATE}}(I) - \widehat{\mathrm{PATE}}$.
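One way to read this comparison, assuming the eligible and non-eligible patients partition the registry so that $N = N_I + N_E$, is the following short derivation. It is a worked identity under that assumption, not a formula taken from the slide.

```latex
\begin{align*}
\mathrm{PATE}
  &= \frac{N_I}{N}\,\mathrm{SPATE}(I) + \frac{N_E}{N}\,\mathrm{SPATE}(E), \\
\mathrm{SPATE}(I) - \mathrm{PATE}
  &= \Bigl(1 - \frac{N_I}{N}\Bigr)\mathrm{SPATE}(I) - \frac{N_E}{N}\,\mathrm{SPATE}(E) \\
  &= \frac{N_E}{N}\,\bigl(\mathrm{SPATE}(I) - \mathrm{SPATE}(E)\bigr).
\end{align*}
```

So the error is large only when many patients are excluded and the effect among them differs from the effect among the eligible.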
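28 Doing without patient level data 2
Data sources: the registry (R) provides patient level data $(x_i, y_i)$; the trial arms (Active, Placebo) provide aggregate data $(\bar{x}_C, \bar{y}_C)$.
Weight the registry on: Gender, Age, BMI, SBP, LDL, TG, Smoking. Outcome: Y.
Idea: use
$$\hat{y}_C = \frac{\sum_i w_i y_i}{\sum_i w_i}, \quad \text{where } w_i = \frac{P(S = 1 \mid x_i)}{P(S = 0 \mid x_i)}$$
Estimate the weights using logistic regression: $w_i \propto \exp(x_i^T \hat{\beta})$
Method of moments: solve $\sum_i (x_i - \bar{x}_C) \exp(x_i^T \hat{\beta}) = 0$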
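The method of moments step can be sketched for a single covariate. This is a minimal illustration under stated assumptions: one covariate, Newton's method as the solver, and hypothetical registry values; the function name is made up for the example. Centering the covariate at the trial mean changes the weights only by a constant factor, which cancels in the weighted mean.

```python
from math import exp


def moment_weights(x, target_mean, tol=1e-12, max_iter=100):
    """Weights exp(beta * (x_i - target_mean)) whose weighted mean of x
    equals target_mean. beta solves
    sum((x_i - target_mean) * exp(beta * (x_i - target_mean))) = 0.
    """
    c = [xi - target_mean for xi in x]
    if min(c) >= 0 or max(c) <= 0:
        raise ValueError("target mean must lie strictly inside the range of x")
    beta = 0.0
    for _ in range(max_iter):
        e = [exp(beta * ci) for ci in c]
        grad = sum(ci * ei for ci, ei in zip(c, e))       # the moment equation
        hess = sum(ci * ci * ei for ci, ei in zip(c, e))  # its derivative, > 0
        step = grad / hess
        beta -= step
        if abs(step) < tol:
            break
    return [exp(beta * ci) for ci in c]


# Hypothetical registry ages and outcomes, and a hypothetical trial mean age.
ages = [50.0, 55.0, 60.0, 65.0, 70.0]
outcomes = [1.0, 2.0, 3.0, 4.0, 5.0]
trial_mean_age = 58.0

w = moment_weights(ages, trial_mean_age)
weighted_age = sum(wi * a for wi, a in zip(w, ages)) / sum(w)
y_hat = sum(wi * y for wi, y in zip(w, outcomes)) / sum(w)
print(weighted_age, y_hat)
```

After weighting, the registry's mean age matches the trial's aggregate mean, and the weighted outcome is the registry-based prediction for a trial-like population. Only aggregate trial data are needed, which is the point of the slide.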
29 References
- Imai K, King G, Stuart EA. Misunderstandings between experimentalists and observationalists about causal inference. Journal of the Royal Statistical Society A. 2008; 171:
- Stuart EA, Cole SR, Bradshaw CP, Leaf PJ. The use of propensity scores to assess the generalizability of results from randomized trials. Journal of the Royal Statistical Society A. 2011; 174:
- Cole SR, Stuart EA. Generalizing evidence from randomized clinical trials to target populations. American Journal of Epidemiology. 2010; 172:
- Rosenbaum PR. An exact distribution-free test comparing two multivariate distributions based on adjacency. Journal of the Royal Statistical Society B. 2005; 67:
- Prevost TC, Abrams KR, Jones DR. Hierarchical models in generalized synthesis of evidence: an example based on studies of breast cancer screening. Statistics in Medicine. 2000; 19:
- Signorovitch et al. Matching-Adjusted Indirect Comparisons: A New Tool for Timely Comparative Effectiveness Research. Value in Health. 2012; 12:
- Pressler TR, Kaizar EE. The use of propensity scores and observational data to estimate randomized controlled trial generalizability bias. Statistics in Medicine. 2013; 32:
Proceedngs of the Annual Meetng of the Amercan Statstcal Assocaton, August 59, 2001 LISTASSISTED SAMPLING: THE EFFECT OF TELEPHONE SYSTEM CHANGES ON DESIGN 1 Clyde Tucker, Bureau of Labor Statstcs James
More informationRegression Models for a Binary Response Using EXCEL and JMP
SEMATECH 997 Statstcal Methods Symposum Austn Regresson Models for a Bnary Response Usng EXCEL and JMP Davd C. Trndade, Ph.D. STATTECH Consultng and Tranng n Appled Statstcs San Jose, CA Topcs Practcal
More informationControl Charts for Means (Simulation)
Chapter 290 Control Charts for Means (Smulaton) Introducton Ths procedure allows you to study the run length dstrbuton of Shewhart (Xbar), Cusum, FIR Cusum, and EWMA process control charts for means usng
More informationState function: eigenfunctions of hermitian operators> normalization, orthogonality completeness
Schroednger equaton Basc postulates of quantum mechancs. Operators: Hermtan operators, commutators State functon: egenfunctons of hermtan operators> normalzaton, orthogonalty completeness egenvalues and
More informationAn Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services
An Evaluaton of the Extended Logstc, Smple Logstc, and Gompertz Models for Forecastng Short Lfecycle Products and Servces Charles V. Trappey a,1, Hsnyng Wu b a Professor (Management Scence), Natonal Chao
More informationMODELBASED CALIBRATION OF A NONINVASIVE BLOOD GLUCOSE MONITOR. Yelena Shulga. A Project Report. Submitted to the Faculty. of the
MODELBASED CALIBRATION OF A NONINVASIVE BLOOD GLUCOSE MONITOR by Yelena Shulga A Project Report Submtted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE n partal fulfllment of the requrements for
More informationFINAL REPORT. City of Toronto. Contract 47016555. Project No: B0002033
Cty of Toronto SAFETY IMPACTS AD REGULATIOS OF ELECTROIC STATIC ROADSIDE ADVERTISIG SIGS TECHICAL MEMORADUM #2C BEFORE/AFTER COLLISIO AALYSIS AT SIGALIZED ITERSECTIO FIAL REPORT 3027 Harvester Road, Sute
More informationChapter XX More advanced approaches to the analysis of survey data. Gad Nathan Hebrew University Jerusalem, Israel. Abstract
Household Sample Surveys n Developng and Transton Countres Chapter More advanced approaches to the analyss of survey data Gad Nathan Hebrew Unversty Jerusalem, Israel Abstract In the present chapter, we
More informationDetection of Health Insurance Fraud with Discrete Choice Model: Evidence from Medical Expense Insurance in China
Detecton of Health Insurance Fraud wth Dscrete Choce Model: Evdence from Medcal Expense Insurance n Chna Abstract: Health nsurance fraud ncreases the neffcency and nequalty n our socety. To address the
More informationEvaluating the Effects of FUNDEF on Wages and Test Scores in Brazil *
Evaluatng the Effects of FUNDEF on Wages and Test Scores n Brazl * Naérco MenezesFlho Elane Pazello Unversty of São Paulo Abstract In ths paper we nvestgate the effects of the 1998 reform n the fundng
More informationLinear Regression, Regularization BiasVariance Tradeoff
HTF: Ch3, 7 B: Ch3 Lnear Regresson, Regularzaton BasVarance Tradeoff Thanks to C Guestrn, T Detterch, R Parr, N Ray 1 Outlne Lnear Regresson MLE = Least Squares! Bass functons Evaluatng Predctors Tranng
More information