The Importance of Reproducible Research


1 The Importance of Reproducible Research
Christian Kleiber, Universität Basel
Berne, Workshop Improving Data Access and Research Transparency (DART) in Switzerland

2 Outline
1 Introduction
2 Reproducibility (in economics)
3 Case studies in forensic econometrics
  - Confidence intervals for breakpoints in time series
  - Data problems
  - Complete separation in a binary response model
  - Complete separation in a regression model for count data
4 Some suggestions
5 References
Christian Kleiber (Universität Basel) The Importance of Reproducible Research DART Workshop, Berne, / 29

3 Introduction
"Computation-based science publication is currently a doubtful enterprise because there is not enough support for identifying and rooting out sources of error in computational work."
Donoho (Biostatistics 2010)
"We argue that, with some exceptions, anything less than the release of source programs is intolerable for results that depend on computation. The vagaries of hardware, software and natural language will always ensure that exact reproducibility remains uncertain, but withholding code increases the chances that efforts to reproduce results will fail."
Ince, Hatton, Graham-Cumming (Nature 2012)

4 Introduction
Q1: What is replication, reproduction, etc.?
- Old definition (replication in the wide sense): getting similar results using different data, different methods, ...
- New definition (replication in the narrow sense): getting exactly the same tables, figures, etc. as the original publication.
- Emerging terminology: computational reproducibility.
Q2: Why work reproducibly?
- more impact, citations, feedback, ...
- better computing environments
- more effective advising
An emerging community standard in various fields.

5 Introduction
Some recent papers from various fields:
- Donoho DL, Maleki A, Shahram M, Rahman I, Stodden V (2009). Reproducible research in computational harmonic analysis. Computing in Science & Engineering, 11(1),
- Donoho D (2010). An invitation to reproducible computational research. Biostatistics, 11(3),
- Ince DC, Hatton L, Graham-Cumming J (2012). The case for open computer programs. Nature, 482,
- Peng RD, Dominici F, Zeger SL (2006). Reproducible epidemiologic research. American J Epidemiology, 163,
- Vandewalle P, Kovacevic J, Vetterli M (2009). Reproducible research in signal processing. IEEE Signal Processing Magazine, 26(3),
And a recent book:
- Stodden V, Leisch F, Peng RD (eds) (2014). Implementing Reproducible Research. Chapman & Hall.

6 Introduction
Traditional issues: Why are some publications not reproducible?
- Data are not available.
- Data are available, but code is not.
- Data and code are available, but there are data problems, numerical problems, software problems, ...
Recent threats to reproducibility:
- Data explosion, big data.
- Rise of computational science (simulation-based inference, etc.).

7 Reproducibility in economics
- 1982: J Money, Credit and Banking (JMCB) Data Storage and Evaluation Project, funded by NSF.
- Dewald, Thursby and Anderson (AER 1986) find that only 2 out of 54 works are replicable:
  "Our findings suggest that inadvertent errors in published empirical articles are a commonplace rather than a rare occurrence. ... we recommend that journals require the submission of programs and data at the time empirical papers are submitted."
- Replication policy at American Economic Review: Data and code
- New JMCB study (McCullough, McGeary and Harrison, JMCB 2006): now 14 out of 62 replicable.
- McCullough and Vinod (AER 2003) attempt replication of an entire issue of AER.
- Since 2004: mandatory (?) data and code archives at American Economic Review, Econometrica, Review of Economic Studies, J Political Economy, Review of Economics and Statistics.

8 Reproducibility in economics
Neglected issue: Simulations (Kleiber and Zeileis 2013).
                                        JAE  JoE
Freq of manuscripts
  in total
  with simulation
Freq of data availability
  in archive                             31    0
  proprietary                             6    0
  not available                           0   12
  none used                               3    3
Freq of simulation types
  Monte Carlo
  Resampling                             15    3
  Simulation-based estimation            13    3
  Nonstandard distributions               2    0
Prop of all manuscripts with simulation
  indicating software used
  providing code
  with code available upon request
Prop of simulation manuscripts with replication files
  with random seed

9 Case studies in forensic econometrics
Examples (mainly taken from my own work):
- Evaluation of a nonstandard distribution in time series econometrics
- A classical panel data set with (too) many versions
- Non-existing estimates in a binary response model
- Non-existing estimates in a count data regression model

10 Confidence intervals for breaks in time series
Example: Breaks in the US real interest rate, Bai and Perron (J Applied Econometrics 2003):
- Regression on a constant, standard errors for break points via HAC methods with automated bandwidth selection.
- Point estimates of break dates are fully reproducible ... but only 2 out of 3 confidence intervals.
- Computational task: confidence intervals require quantiles from a non-standard distribution.
Issues:
- coding error
- software fault
Details: Zeileis and Kleiber (J Applied Econometrics 2005). Data and computational tools available in R package strucchange.

11 Confidence intervals for breaks in time series
[Figure: time series plot with axis labels RealInt and Time]

12 Confidence intervals for breaks in time series
Asymptotics of break points: the limiting distribution is the distribution of $\operatorname{argmax}_s V(s)$, where
\[
V(s) =
\begin{cases}
W_1(-s) - |s|/2 & \text{for } s \le 0, \\
\sqrt{\xi}\,(\phi_2/\phi_1)\, W_2(s) - \xi |s|/2 & \text{for } s > 0,
\end{cases}
\]
a two-sided Brownian motion with different scales and linear drifts.
Right branch of the limiting distribution:
\[
G(x) = 1 + \xi \sqrt{\frac{x}{2\pi\phi}} \exp\left\{ -\frac{\xi^2 x}{8\phi} \right\}
+ \left( -d + 2 - \frac{\xi^2 x}{2\phi} \right) \Phi\left( -\frac{\xi \sqrt{x}}{2\sqrt{\phi}} \right)
+ c \exp(ax)\, \Phi(-b\sqrt{x}) \qquad (x > 0)
\]
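Quantiles of the argmax of V(s) can also be approximated by brute-force simulation, which gives an independent cross-check on any closed-form implementation. The following is only a rough sketch and not the method used by Bai and Perron or by strucchange: it assumes the symmetric special case (ξ = 1 and φ1 = φ2), replaces the Brownian motions by Gaussian random walks on a grid truncated to [-20, 20], and uses hypothetical function names and an arbitrary seed.

```python
import random


def simulate_argmax(rng, half_width=20.0, step=0.05):
    """Draw argmax_s {W(s) - |s|/2} once, on a grid over [-half_width, half_width]."""
    n_steps = int(round(half_width / step))
    sd = step ** 0.5                      # std. dev. of one Brownian increment
    best_value, best_s = 0.0, 0.0         # V(0) = 0 is the starting candidate
    for sign in (-1.0, 1.0):              # independent left and right branches
        w = 0.0
        for k in range(1, n_steps + 1):
            w += rng.gauss(0.0, sd)       # random-walk approximation of W
            value = w - 0.5 * k * step    # linear drift -|s|/2
            if value > best_value:
                best_value, best_s = value, sign * k * step
    return best_s


def simulate_quantiles(seed, reps=1000, probs=(0.025, 0.975)):
    """Empirical quantiles of the simulated argmax; fully determined by the seed."""
    rng = random.Random(seed)             # explicit seed makes the run repeatable
    draws = sorted(simulate_argmax(rng) for _ in range(reps))
    return [draws[min(reps - 1, int(p * reps))] for p in probs]


lower, upper = simulate_quantiles(seed=2014)
print(f"approximate 95% interval for the argmax: [{lower:.2f}, {upper:.2f}]")
```

Grid spacing and truncation both bias the result, so the numbers are only a plausibility check. The point of the sketch is that a second, independently coded route to the same quantiles makes discrepancies between implementations visible.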

13 Confidence intervals for breaks in time series
[Figure: distribution function P(argmax V ≤ x) plotted against x, one curve per estimated break date]

14 Confidence intervals for breaks in time series
[Figure: P(argmax V ≤ x) plotted against x, comparing the GAUSS and R computations]

15 Data problems: Grunfeld data
Grunfeld Y (1958). The Determinants of Corporate Investment. Unpublished Ph.D. Dissertation, University of Chicago.
Originally an empirical study of corporate investment, with a panel of large US firms over a period of 20 years:
 [1] "General Motors"    "US Steel"          "General Electric"
 [4] "Chrysler"          "Atlantic Refining" "IBM"
 [7] "Union Oil"         "Westinghouse"      "Goodyear"
[10] "Diamond Match"
Later used for illustrations in econometric methodology, notably panel and SUR models. Used in numerous textbooks, including
- Maddala (1977): Econometrics (10 firms)
- Greene (2003): Econometric Analysis, 5e (5 firms)
- Greene (2008): Econometric Analysis, 6e (10 firms)
- Baltagi (2008): Econometric Analysis of Panel Data, 4e (10 firms)
In fact, there are 11 firms ... and also more data (years for some firms).
Complete and correct data available in R package AER, accompanying Kleiber and Zeileis (2008): Applied Econometrics with R.

16 Data problems: Grunfeld data
[Figure: genealogy of the Grunfeld data versions. Nodes: Grunfeld; Boot, de Wit; Theil; Maddala; Vinod, Ullah; Fomby, Hill, Johnson; Griffiths, Hill, Judge; Hill, Griffiths, Lim; Greene/1st; Greene/5th; Greene/6th; Baltagi/Econ; Baltagi/Panel; Kleiber, Zeileis. Edge labels: "select 10 (all but AS)", "selects 2 (GE, WH)", "select 3 (WH, GE, GM)", "2 errors", "2 errors + selects 5 (GM, US, GE, CH, WH)", "1 error?"]
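Differences between circulating versions of a panel data set can be located mechanically once each version is keyed by firm and year. A minimal sketch, assuming each version is held as a dictionary from (firm, year) to a value; the function name diff_versions and every number below are invented for illustration and are not Grunfeld's data:

```python
def diff_versions(reference, candidate, tol=1e-9):
    """Compare two versions of a panel keyed by (firm, year).

    Returns keys missing from the candidate, keys found only in the
    candidate, and shared keys whose values differ by more than tol.
    """
    missing = sorted(k for k in reference if k not in candidate)
    extra = sorted(k for k in candidate if k not in reference)
    changed = sorted(
        k for k in reference
        if k in candidate and abs(reference[k] - candidate[k]) > tol
    )
    return missing, extra, changed


# Toy values, invented for this example only.
version_a = {("Firm A", 1): 10.0, ("Firm A", 2): 12.5, ("Firm B", 1): 7.0}
version_b = {("Firm A", 1): 10.0, ("Firm A", 2): 15.2, ("Firm C", 1): 3.0}

missing, extra, changed = diff_versions(version_a, version_b)
print("missing:", missing)
print("extra:  ", extra)
print("changed:", changed)
```

A check of this kind separates deliberate subsetting (missing or extra firms) from transcription errors (changed values), the two kinds of discrepancy noted for the textbook versions.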

17 Complete separation
Example: Data from Maddala GS (2001). Introduction to Econometrics, 3rd ed, J. Wiley.
Data on 44 US states for
Variables are
  rate         Murder rate per 100,000 (FBI estimate, 1950).
  convictions  No. of convictions divided by no. of murders in
  executions   Average number of executions during divided by convictions in
  time         Median time served (in months) of convicted murderers released in
  income       Median family income in 1949 (in 1,000 USD).
  lfp          Labor force participation rate in 1950 (in percent).
  noncauc      Proportion of population that is non-caucasian in
  southern     Region (factor).
Stokes H (2004). On the advantage of using two or more econometric software systems to solve the same problem. J Economic and Social Measurement, 29,

18 Complete separation
Problem: Coefficient on southern somewhat unusual ...
Logit model estimated using defaults:
  Estimate Std. Error z value
Change of convergence controls:
  Estimate Std. Error z value
Reason: quasi-complete separation
          no yes
  FALSE    9   0
  TRUE
Hence MLE does not exist ...
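Why the MLE fails to exist under quasi-complete separation can be seen on a toy example: when one cell of the cross-tabulation is empty, the log-likelihood keeps increasing as the slope grows, so no finite maximizer exists and reported estimates depend on where the optimizer happens to stop. The sketch below uses invented counts (not the murder-rate data) and hypothetical helper names:

```python
import math


def loglik(b0, b1, data):
    """Log-likelihood of a logit model with intercept b0 and slope b1."""
    total = 0.0
    for x, y in data:
        eta = b0 + b1 * x
        # log P(y | x), written with log1p for numerical stability
        total -= math.log1p(math.exp(-eta if y == 1 else eta))
    return total


def empty_cells(data):
    """Return the (x, y) combinations that never occur in the data."""
    seen = set(data)
    return [(x, y) for x in (0, 1) for y in (0, 1) if (x, y) not in seen]


# Toy data, invented for illustration: y is always 1 when x == 1.
data = [(0, 0)] * 9 + [(0, 1)] * 6 + [(1, 1)] * 5

b0 = math.log(6 / 9)                      # fitted log-odds in the x == 0 group
values = [loglik(b0, b1, data) for b1 in (1, 5, 10, 20)]
print("empty cells:", empty_cells(data))
print("log-likelihood along increasing slope:", [round(v, 6) for v in values])
```

Cross-tabulating the response against each binary regressor before fitting, as empty_cells does here, is a cheap diagnostic; very large coefficients with very large standard errors after fitting point to the same problem.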

19 Count data regression
Example: Recreation demand
Cross-sectional data (n = 659) on the number of recreational boating trips to Lake Somerville, TX, in 1980, based on a survey administered to 2,000 registered leisure boat owners in 23 counties in eastern Texas.
  Variable  Description
  trips     Number of recreational boating trips.
  quality   Facility's subjective quality ranking on a scale of 1 to 5.
  ski       Was the individual engaged in water-skiing?
  income    Annual household income (in 1,000 USD).
  userfee   Did the owner pay an annual user fee at Lake Somerville?
  costc     Expenditure when visiting Lake Conroe (in USD).
  costs     Expenditure when visiting Lake Somerville (in USD).
  costh     Expenditure when visiting Lake Houston (in USD).
Data are used in various publications, among them
- Sellar, Stoll and Chavas, Land Economics 1985
- Ozuna and Gomez, Empirical Economics 1995
- Gurmu and Trivedi, J Business and Economic Statistics 1996
- Cameron and Trivedi, Regression Models for Count Data, CUP 2013

20 Count data regression
[Figure: two plots of the trips variable]

21 Count data regression
Source: Ozuna and Gomez, Specification and Testing of Count Data Recreation Demand Functions, Empirical Economics 1995.
Methodology: Poisson and negative binomial regression.
Footnote says:
"It should be noted that one of the anonymous referees re-estimated the models used in this study using the same data set and he obtained different parameter estimates. The referee and the authors of this article agreed that the problem was in the software used to estimate the models. The referee used LIMDEP 6.0 for the Poisson and MICROFIT 3.0 for the NLS models whereas the authors used GAUSS 3.0. This is an important observation since the parameter estimates affect consumer surplus. Researchers should thus be cautious of the software they use to estimate the models."

22 Summary
Summary of problems:
- Breakpoint estimation: closed source, coding errors
- Grunfeld: ancient data
- Complete separation in binary response: statistical and software issues
- Complete separation in count data: statistical and computational issues
Implicit issues:
- In econometrics (and other social sciences?), software development is often considered a subsidiary activity.
- By implication, the established econometrics journals currently do not publish papers on software development.

23 Some suggestions: Authors
How to improve on the current situation?
Authors:
- release data and code
- if the journal does not have an archive: use e.g. RePEc code archives
- publish case studies that document problems
Journals:
- mandatory archives for data and code (ideally, an editorial function)
- require data and code already at submission
- publish case studies, replications, etc.
Instructors:
- use archives in teaching
All of this also applies to computational economics and computational social science.

24 Some suggestions: Technology
Technology for econom(etr)ics:
- Version control system (svn, Dropbox, ...)
- Data in .txt/.csv (no proprietary formats please)
- LaTeX
- Statistical software ...
Most of these tools were unavailable 50 (or 40, 30, 20, 10 ...) years ago, but now the technology is available and we should use it.
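Plain-text data files also make it easy to record exactly which version of a file an analysis used. One possible approach, sketched here under assumptions that go beyond the slides (the helper names sha256_of and build_manifest and the example file are hypothetical), is to archive a checksum manifest alongside data and code:

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory, pattern="*.csv"):
    """Map each matching file name to its checksum."""
    return {p.name: sha256_of(p) for p in sorted(Path(directory).glob(pattern))}


# Demonstration on a throwaway directory with a made-up data file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "example.csv").write_bytes(b"firm,year,value\nA,1,10.0\n")
    manifest = build_manifest(tmp)
    for name, checksum in manifest.items():
        print(checksum, name)
```

If two researchers obtain different checksums for what is nominally the same file, they are not working with the same data, whatever the file name says.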

25 Some suggestions: Technology
Computational tools for reproducible research:
Desirable: more than data and code, namely fully replicable analyses.
One solution: literate programming.
Example: R function Sweave() combines R and LaTeX. See Leisch (2002) for more information.

26 Some suggestions: Technology
Sweave() example: (source code from Zeileis and Kleiber, 2005)
...
Confidence intervals for the breakpoints can be computed from the fitted
\texttt{bp.ri} object for any number of breaks (smaller than the maximal number
of breaks admissible) using the \texttt{confint} method from \texttt{strucchange}.
A function for estimating the covariance matrix, here \texttt{kernHAC}, may
again be supplied.

<<eval=TRUE, echo=FALSE, results=hide>>=
library("strucchange")
data("RealInt")
bp.ri <- breakpoints(RealInt ~ 1, h = 15)
cis <- confint(bp.ri, breaks = 3, vcov = kernHAC)
@

This returns the breakpoints and corresponding confidence intervals (at the
default 95\% level) coded by
...

27 Some suggestions: Computational tools
Computational tools section from Kleiber and Zeileis (2013):
"Our results were obtained using R with the packages strucchange 1.4-6, and lattice and were identical on various platforms including PCs running Debian GNU/Linux (with a amd64 kernel) and Mac OS X, version
Normal random variables were generated from uniform random numbers obtained by the Mersenne Twister (currently R's default generator) by means of the inversion method. The random seed and further technical details are available in the code supplementing this paper."
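The ingredients named in such a section (software version, platform, generator, seed, transformation method) can be recorded by the simulation script itself. A minimal Python sketch of the same idea follows; Python's random module is also based on the Mersenne Twister, but its seeding and inversion routines differ from R's, so the numbers will not match R output. The seed value and the function name normal_draws are arbitrary choices for illustration.

```python
import platform
import random
import sys
from statistics import NormalDist


def normal_draws(seed, n):
    """Standard normal draws by inversion of Mersenne Twister uniforms."""
    rng = random.Random(seed)             # Python's generator is a Mersenne Twister
    inverse_cdf = NormalDist().inv_cdf
    draws = []
    while len(draws) < n:
        u = rng.random()                  # uniform on [0, 1)
        if u > 0.0:                       # inv_cdf is defined on (0, 1) only
            draws.append(inverse_cdf(u))
    return draws


SEED = 20140101                           # arbitrary; report it with the results
print("Python", sys.version.split()[0], "on", platform.platform())
print("seed:", SEED)
print([round(z, 4) for z in normal_draws(SEED, 3)])
```

Printing the version, platform and seed into the output log means the information needed for a "computational tools" paragraph is produced by the run, not reconstructed afterwards.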

28 References
Econom(etr)ics:
- Anderson RD, Greene WH, McCullough BD, Vinod HD (2008). The role of data/code archives in the future of economic research. J Economic Methodology, 15(1),
- Kleiber C, Zeileis A (2010). The Grunfeld data at 50. German Economic Review, 11(4),
- Kleiber C, Zeileis A (2013). Reproducible econometric simulations. J Econometric Methods, 2(1),
- Lovell MC, Selover DD (1994). Econometric software accidents. Economic J, 104,
- McCullough BD, Vinod HD (1999). The numerical reliability of econometric software. J Economic Literature, 37,
- McCullough BD, Vinod HD (2003). Verifying the solution from a nonlinear solver: A case study. American Economic Review, 93,
- Newbold P, Agiakloglou C, Miller J (1994). Adventures with ARIMA software. International J Forecasting, 10,
- Zeileis A, Kleiber C (2005). Validating multiple structural change models: A case study. J Applied Econometrics, 20,

29 References
Other fields:
- Buckheit JB, Donoho DL (1995). WaveLab and reproducible research. Dept. of Statistics, Stanford University, Tech. Rep
- Donoho DL, Maleki A, Shahram M, Rahman I, Stodden V (2009). Reproducible research in computational harmonic analysis. Computing in Science & Engineering, 11(1),
- Donoho D (2010). An invitation to reproducible computational research. Biostatistics, 11(3),
- Ince DC, Hatton L, Graham-Cumming J (2012). The case for open computer programs. Nature, 482,
- Leisch F (2002). Sweave: Dynamic generation of statistical reports using literate data analysis. In Härdle W, Rönz B (eds.), Compstat 2002 Proc. in Computational Statistics, pp Physica Verlag, Heidelberg.
- Peng RD, Dominici F, Zeger SL (2006). Reproducible epidemiologic research. American J Epidemiology, 163,
- Vandewalle P, Kovacevic J, Vetterli M (2009). Reproducible research in signal processing. IEEE Signal Processing Magazine, 26(3),


More information

Chapter 1 Introduction. 1.1 Introduction

Chapter 1 Introduction. 1.1 Introduction Chapter 1 Introduction 1.1 Introduction 1 1.2 What Is a Monte Carlo Study? 2 1.2.1 Simulating the Rolling of Two Dice 2 1.3 Why Is Monte Carlo Simulation Often Necessary? 4 1.4 What Are Some Typical Situations

More information

Econometrics and Data Analysis I

Econometrics and Data Analysis I Econometrics and Data Analysis I Yale University ECON S131 (ONLINE) Summer Session A, 2014 June 2 July 4 Instructor: Doug McKee ([email protected]) Teaching Fellow: Yu Liu ([email protected]) Classroom:

More information

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050 PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050 Class Hours: 2.0 Credit Hours: 3.0 Laboratory Hours: 2.0 Date Revised: Fall 2013 Catalog Course Description: Descriptive

More information

INTERNATIONAL UNIVERSITY OF JAPAN Public Management and Policy Analysis Program Graduate School of International Relations

INTERNATIONAL UNIVERSITY OF JAPAN Public Management and Policy Analysis Program Graduate School of International Relations INTERNATIONAL UNIVERSITY OF JAPAN Public Management and Policy Analysis Program Graduate School of International Relations ADC6512 Topics in Data Analysis (Panel Data Models Using Stata) (2 Credits) Winter

More information

A spreadsheet Approach to Business Quantitative Methods

A spreadsheet Approach to Business Quantitative Methods A spreadsheet Approach to Business Quantitative Methods by John Flaherty Ric Lombardo Paul Morgan Basil desilva David Wilson with contributions by: William McCluskey Richard Borst Lloyd Williams Hugh Williams

More information

11. Time series and dynamic linear models

11. Time series and dynamic linear models 11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd

More information

Simulation and Risk Analysis

Simulation and Risk Analysis Simulation and Risk Analysis Using Analytic Solver Platform REVIEW BASED ON MANAGEMENT SCIENCE What We ll Cover Today Introduction Frontline Systems Session Ι Beta Training Program Goals Overview of Analytic

More information

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its

More information

Flood Risk Analysis considering 2 types of uncertainty

Flood Risk Analysis considering 2 types of uncertainty US Army Corps of Engineers Institute for Water Resources Hydrologic Engineering Center Flood Risk Analysis considering 2 types of uncertainty Beth Faber, PhD, PE Hydrologic Engineering Center (HEC) US

More information

From the help desk: Swamy s random-coefficients model

From the help desk: Swamy s random-coefficients model The Stata Journal (2003) 3, Number 3, pp. 302 308 From the help desk: Swamy s random-coefficients model Brian P. Poi Stata Corporation Abstract. This article discusses the Swamy (1970) random-coefficients

More information

REPORT DOCUMENTATION PAGE

REPORT DOCUMENTATION PAGE REPORT DOCUMENTATION PAGE Form Approved OMB NO. 0704-0188 Public Reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,

More information

Sample Size Designs to Assess Controls

Sample Size Designs to Assess Controls Sample Size Designs to Assess Controls B. Ricky Rambharat, PhD, PStat Lead Statistician Office of the Comptroller of the Currency U.S. Department of the Treasury Washington, DC FCSM Research Conference

More information

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes Yong Bao a, Aman Ullah b, Yun Wang c, and Jun Yu d a Purdue University, IN, USA b University of California, Riverside, CA, USA

More information

Note 2 to Computer class: Standard mis-specification tests

Note 2 to Computer class: Standard mis-specification tests Note 2 to Computer class: Standard mis-specification tests Ragnar Nymoen September 2, 2013 1 Why mis-specification testing of econometric models? As econometricians we must relate to the fact that the

More information

SOFTWARE PERFORMANCE EVALUATION ALGORITHM EXPERIMENT FOR IN-HOUSE SOFTWARE USING INTER-FAILURE DATA

SOFTWARE PERFORMANCE EVALUATION ALGORITHM EXPERIMENT FOR IN-HOUSE SOFTWARE USING INTER-FAILURE DATA I.J.E.M.S., VOL.3(2) 2012: 99-104 ISSN 2229-6425 SOFTWARE PERFORMANCE EVALUATION ALGORITHM EXPERIMENT FOR IN-HOUSE SOFTWARE USING INTER-FAILURE DATA *Jimoh, R. G. & Abikoye, O. C. Computer Science Department,

More information

health economics and policy

health economics and policy International doctoral courses and seminars in health economics and policy Advanced education in health economics and policy for PhD students Offered by theswiss School of Public Health+ University of

More information

An Application of the G-formula to Asbestos and Lung Cancer. Stephen R. Cole. Epidemiology, UNC Chapel Hill. Slides: www.unc.

An Application of the G-formula to Asbestos and Lung Cancer. Stephen R. Cole. Epidemiology, UNC Chapel Hill. Slides: www.unc. An Application of the G-formula to Asbestos and Lung Cancer Stephen R. Cole Epidemiology, UNC Chapel Hill Slides: www.unc.edu/~colesr/ 1 Acknowledgements Collaboration with David B. Richardson, Haitao

More information

Clustering in the Linear Model

Clustering in the Linear Model Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple

More information

Statistical Functions in Excel

Statistical Functions in Excel Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

More information

APPENDIX 15. Review of demand and energy forecasting methodologies Frontier Economics

APPENDIX 15. Review of demand and energy forecasting methodologies Frontier Economics APPENDIX 15 Review of demand and energy forecasting methodologies Frontier Economics Energex regulatory proposal October 2014 Assessment of Energex s energy consumption and system demand forecasting procedures

More information

R: A Free Software Project in Statistical Computing

R: A Free Software Project in Statistical Computing R: A Free Software Project in Statistical Computing Achim Zeileis Institut für Statistik & Wahrscheinlichkeitstheorie http://www.ci.tuwien.ac.at/~zeileis/ Acknowledgments Thanks: Alex Smola & Machine Learning

More information

Duration Analysis. Econometric Analysis. Dr. Keshab Bhattarai. April 4, 2011. Hull Univ. Business School

Duration Analysis. Econometric Analysis. Dr. Keshab Bhattarai. April 4, 2011. Hull Univ. Business School Duration Analysis Econometric Analysis Dr. Keshab Bhattarai Hull Univ. Business School April 4, 2011 Dr. Bhattarai (Hull Univ. Business School) Duration April 4, 2011 1 / 27 What is Duration Analysis?

More information

STATISTICS COURSES UNDERGRADUATE CERTIFICATE FACULTY. Explanation of Course Numbers. Bachelor's program. Master's programs.

STATISTICS COURSES UNDERGRADUATE CERTIFICATE FACULTY. Explanation of Course Numbers. Bachelor's program. Master's programs. STATISTICS Statistics is one of the natural, mathematical, and biomedical sciences programs in the Columbian College of Arts and Sciences. The curriculum emphasizes the important role of statistics as

More information

Local Government Information Security Risk in the Age of E-Government. Eunjung Shin Lauren N. Bowman PhD Students. Eric Welch Associate Professor

Local Government Information Security Risk in the Age of E-Government. Eunjung Shin Lauren N. Bowman PhD Students. Eric Welch Associate Professor Introduction Local Government Information Security Risk in the Age of E-Government Eunjung Shin Lauren N. Bowman PhD Students Eric Welch Associate Professor Department of Public Administration Science,

More information

The Impact of Release Management and Quality Improvement in Open Source Software Project Management

The Impact of Release Management and Quality Improvement in Open Source Software Project Management Applied Mathematical Sciences, Vol. 6, 2012, no. 62, 3051-3056 The Impact of Release Management and Quality Improvement in Open Source Software Project Management N. Arulkumar 1 and S. Chandra Kumramangalam

More information

How To Get A Degree In Economics At The University Of Houston

How To Get A Degree In Economics At The University Of Houston UNIVERSITY OF HOUSTON GRADUATE STUDY IN ECONOMICS The Department of Economics offers a program leading to the Ph.D. degree in Economics designed to provide students rigorous training in economic theory

More information

PROBABILITY AND STATISTICS. Ma 527. 1. To teach a knowledge of combinatorial reasoning.

PROBABILITY AND STATISTICS. Ma 527. 1. To teach a knowledge of combinatorial reasoning. PROBABILITY AND STATISTICS Ma 527 Course Description Prefaced by a study of the foundations of probability and statistics, this course is an extension of the elements of probability and statistics introduced

More information

CV of Dr. Joachim Schnurbus

CV of Dr. Joachim Schnurbus CV of Dr. Joachim Schnurbus June 21, 2016 1 Personal and contact Born on July 30, 1979 in Selb, Germany Email: [email protected] Fon: +49 851 509 2563 Fax: +49 851 509 2562 2 Education Aug.

More information

Online Appendix Assessing the Incidence and Efficiency of a Prominent Place Based Policy

Online Appendix Assessing the Incidence and Efficiency of a Prominent Place Based Policy Online Appendix Assessing the Incidence and Efficiency of a Prominent Place Based Policy By MATIAS BUSSO, JESSE GREGORY, AND PATRICK KLINE This document is a Supplemental Online Appendix of Assessing the

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Syllabus for the TEMPUS SEE PhD Course (Podgorica, April 4 29, 2011) Franz Kappel 1 Institute for Mathematics and Scientific Computing University of Graz Žaneta Popeska 2 Faculty

More information

QMB 3302 Business Analytics CRN 10251 Spring 2015 T R -- 11:00am - 12:15pm -- Lutgert Hall 2209

QMB 3302 Business Analytics CRN 10251 Spring 2015 T R -- 11:00am - 12:15pm -- Lutgert Hall 2209 QMB 3302 Business Analytics CRN 10251 Spring 2015 T R -- 11:00am - 12:15pm -- Lutgert Hall 2209 Elias T. Kirche, Ph.D. Associate Professor Department of Information Systems and Operations Management Lutgert

More information

Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

More information

Fixed Effects Bias in Panel Data Estimators

Fixed Effects Bias in Panel Data Estimators DISCUSSION PAPER SERIES IZA DP No. 3487 Fixed Effects Bias in Panel Data Estimators Hielke Buddelmeyer Paul H. Jensen Umut Oguzoglu Elizabeth Webster May 2008 Forschungsinstitut zur Zukunft der Arbeit

More information

Chapter 11 Introduction to Survey Sampling and Analysis Procedures

Chapter 11 Introduction to Survey Sampling and Analysis Procedures Chapter 11 Introduction to Survey Sampling and Analysis Procedures Chapter Table of Contents OVERVIEW...149 SurveySampling...150 SurveyDataAnalysis...151 DESIGN INFORMATION FOR SURVEY PROCEDURES...152

More information

MEU. INSTITUTE OF HEALTH SCIENCES COURSE SYLLABUS. Biostatistics

MEU. INSTITUTE OF HEALTH SCIENCES COURSE SYLLABUS. Biostatistics MEU. INSTITUTE OF HEALTH SCIENCES COURSE SYLLABUS title- course code: Program name: Contingency Tables and Log Linear Models Level Biostatistics Hours/week Ther. Recite. Lab. Others Total Master of Sci.

More information

Department of Epidemiology and Public Health Miller School of Medicine University of Miami

Department of Epidemiology and Public Health Miller School of Medicine University of Miami Department of Epidemiology and Public Health Miller School of Medicine University of Miami BST 630 (3 Credit Hours) Longitudinal and Multilevel Data Wednesday-Friday 9:00 10:15PM Course Location: CRB 995

More information

The Probit Link Function in Generalized Linear Models for Data Mining Applications

The Probit Link Function in Generalized Linear Models for Data Mining Applications Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Teaching Statistics with Fathom

Teaching Statistics with Fathom Teaching Statistics with Fathom UCB Extension X369.6 (2 semester units in Education) COURSE DESCRIPTION This is a professional-level, moderated online course in the use of Fathom Dynamic Data software

More information

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

More information

Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes

Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes Testing, Monitoring, and Dating Structural Changes in Exchange Rate Regimes Achim Zeileis http://eeecon.uibk.ac.at/~zeileis/ Overview Motivation Exchange rate regimes Exchange rate regression What is the

More information