Statistics 3202 Introduction to Statistical Inference for Data Analytics 4-semester-hour course



Similar documents
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

STAT 360 Probability and Statistics. Fall 2012

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

Truman College-Mathematics Department Math 125-CD: Introductory Statistics Course Syllabus Fall 2012

EDMS 769L: Statistical Analysis of Longitudinal Data 1809 PAC, Th 4:15-7:00pm 2009 Spring Semester

STAT 1403 College Algebra Dr. Myron Rigsby Fall 2013 Section 0V2 crn 457 MWF 9:00 am

RARITAN VALLEY COMMUNITY COLLEGE ACADEMIC COURSE OUTLINE MATH 111H STATISTICS II HONORS

Online Basic Statistics

Register for CONNECT using the code with your book and this course access information:

Course Syllabus MATH 110 Introduction to Statistics 3 credits

MATH 140 HYBRID INTRODUCTORY STATISTICS COURSE SYLLABUS

Statistics Graduate Courses

DSCI 3710 Syllabus: Spring 2015

STAT 2080/MATH 2080/ECON 2280 Statistical Methods for Data Analysis and Inference Fall 2015

BUAD : Corporate Finance - Spring

Introduction to Statistics (Online) Syllabus/Course Information

Executive Master of Public Administration. QUANTITATIVE TECHNIQUES I For Policy Making and Administration U6310, Sec. 03

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

SAS Certificate Applied Statistics and SAS Programming

MATHEMATICAL TOOLS FOR ECONOMICS ECON FALL 2011

SYLLABUS MAC 1105 COLLEGE ALGEBRA Spring 2011 Tuesday & Thursday 12:30 p.m. 1:45 p.m.

School of Business ACCT2105 (Subclasses G, H, I) Introduction to Management Accounting , Semester 2 Course Syllabus

Social Work Statistics Spring 2000

MATHEMATICAL TOOLS FOR ECONOMICS ECON SPRING 2012

Winter 2016 MATH 631 Online University of Waterloo

MATH Probability & Statistics - Fall Semester 2015 Dr. Brandon Samples - Department of Mathematics - Georgia College

Mathematics Department Course outline Statistics for Social Science DW

Fairfield Public Schools

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050

LOGOM 3300: Business Statistics Fall 2015

KENNESAW STATE UNIVERSITY GRADUATE COURSE PROPOSAL OR REVISION, Cover Sheet (10/02/2002)

ECE475 Control System Analysis ABET Syllabus. ECE 326. Prerequisite topics: knowledge of Laplace transforms and Bode plots.

The University of Texas at Austin School of Social Work SOCIAL WORK STATISTICS

University of Maine Department of Mathematics & Statistics MAT122, Precalculus, Section 0990 Online, Fall 2015

ENVS 298: Environmental Accountability and Data Visualization

Math College Algebra (Online)

West LA College Mathematics Department. Math 227 (4916) Introductory Statistics. Fall 2013 August December 9, 2013

Multimedia & the World Wide Web

MTH 110: Elementary Statistics (Online Course) Course Syllabus Fall 2012 Chatham University

QMB 3302 Business Analytics CRN Spring 2015 T R -- 11:00am - 12:15pm -- Lutgert Hall 2209

THE UNIVERSITY OF HONG KONG FACULTY OF BUSINESS AND ECONOMICS Course Template for the Learning Outcomes System

IVY TECH COMMUNITY COLLEGE REGION 03 SYLLABUS MATH 136: COLLEGE ALGEBRA SUMMER Instructor: Jack Caster Telephone: ext.

Canisius College Richard J. Wehle School of Business Department of Marketing & Information Systems Spring 2015

MCS5813 Cryptography Spring and select CRN 3850

Physics 21-Bio: University Physics I with Biological Applications Syllabus for Spring 2012

NORTHWESTERN UNIVERSITY Department of Statistics. Fall 2012 Statistics 210 Professor Savage INTRODUCTORY STATISTICS FOR THE SOCIAL SCIENCES

BERGEN COMMUNITY COLLEGE DIVISION OF BUSINESS, ARTS, and SOCIAL SCIENCE BUSINESS DEPARTMENT

Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions

MATH 1050, College Algebra, QL, 4 credits. Functions: graphs, transformations, combinations and

Class Syllabus. Department of Business Administration & Management Information Systems. Texas A&M University Commerce

1 Goals & Prerequisites

Lake-Sumter Community College Course Syllabus. STA 2023 Course Title: Elementary Statistics I. Contact Information: Office Hours:

COURSE PLAN BDA: Biomedical Data Analysis Master in Bioinformatics for Health Sciences Academic Year Qualification.

Computer Integrated Manufacturing Course Outline

Prairie View A&M University P.O. Box 519 Mail Stop 2510 Prairie View, TX 77446

Semester/Year: Spring, 2016

STAT3016 Introduction to Bayesian Data Analysis

ECON 523 Applied Econometrics I /Masters Level American University, Spring Description of the course

TEACHING OF STATISTICS IN KENYA. John W. Odhiambo University of Nairobi Nairobi

Business Statistics MATH 222. Eagle Vision Home. Course Syllabus

Governors State University College of Business and Public Administration. Course: STAT Statistics for Management I (Online Course)

Management Science and Business Information Systems

Developing a Course Syllabus: Steps for Syllabus Design

Psychology 314L (52510): Research Methods

MATH. ALGEBRA I HONORS 9 th Grade ALGEBRA I HONORS

BADM323: Information Systems for Business Professionals SU2016 Online Course

Truman College-Mathematics Department Math 140-ABD: College Algebra Course Syllabus Summer 2013

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences

PADM 596: Research Methods for Public Managers Spring 2016

Econometrics and Data Analysis I

Office: LSK 5045 Begin subject: [ISOM3360]...

Brown University Department of Economics Spring 2015 ECON 1620-S01 Introduction to Econometrics Course Syllabus

A Quality Assessment Initiative for Computer Science, Electrical Engineering, and Physics at Select Universities in Vietnam

COURSE OUTLINE - Marketing Research BUS , Fall 2015

RYERSON UNIVERSITY Ted Rogers School of Information Technology Management And G. Raymond Chang School of Continuing Education

Colorado School of Mines Fall 2015 Principles of Corporate Finance EBGN 345-A

BUSSTAT 207 Introduction to Business Statistics Fall 2015

Advanced Psychological Statistics I. PSYX 520 Autumn 2015

Faculty of Science School of Mathematics and Statistics

CSCI-599 DATA MINING AND STATISTICAL INFERENCE

Photojournalism Communication 3225 Tuesdays and Thursdays 8:00 am 9:20 am Room 281 Journalism Building

Moraine Valley Community College Course Syllabus

CLASS SESSIONS Wednesdays, 8:30 AM -11:20 AM, HSL LL204

CME403/603 Syllabus Page 1

MGT 5309 FALL 07 LOGISTICS AND SUPPLY CHAIN MANAGEMENT SYLLABUS

MASTER COURSE SYLLABUS-PROTOTYPE PSYCHOLOGY 2317 STATISTICAL METHODS FOR THE BEHAVIORAL SCIENCES

COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK DEPARTMENT OF INDUSTRIAL ENGINEERING AND OPERATIONS RESEARCH

Dr. Stephen K. Pollard. Online. Online. None

Political Science 1300: Global Politics Spring 2014

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS ADVANCED DATABASE MANAGEMENT SYSTEMS CSIT 2510

QMB Business Analytics CRN Fall 2015 T & R 9.30 to AM -- Lutgert Hall 2209

Transcription:

Statistics 3202 Introduction to Statistical Inference for Data Analytics 4-semester-hour course Prerequisite: Stat 3201 (Introduction to Probability for Data Analytics) Exclusions: Class distribution: Three 55-minute lectures and one 55-minute recitation per week Course Description and Learning Outcomes The course covers foundational inferential methods for learning about populations from samples, including point and interval estimation, and the formulation and testing of hypotheses. Statistical theory is introduced to justify the approaches. The course emphasizes challenges that arise when applying classical ideas to big-data, partially through the use of computational and simulation techniques. Upon successful completion of the course, students will be able to 1. Describe the role of a parameter in a statistical model and its relationship to observed data 2. Use data to estimate and describe uncertainty about the parameters of a statistical model 3. Translate scientific hypotheses about a population into mathematical statements about parameters in a statistical model 4. Formulate statistical procedures to test a hypothesis about parameters in a statistical model, and interpret the results in both statistical and applicationspecific terms 5. Explain the difference between statistical and practical significance in massive data settings 6. Appreciate the effect of missing data on statistical inference 7. Evaluate and compare different statistical procedures for answering the same question Required Text and Other Course Materials The required textbook for the course is Mathematical Statistics with Applications (7th edition) by Wackerly, Mendenhall and Sheaffer. The book is available for purchase at the official University bookstore (ohiostate.bkstore.com ) and elsewhere online. The book is available on reserve in the 18 th Avenue Library.

Students will be required to use the R 1 software environment for statistical computing and graphics. R can be downloaded for free at http://www.r-project.org. Instructions for using the software will be given in class. Many students prefer to use RStudio, an IDE designed for use with R. RStudio is available for free at http://www.rstudio.com. Recitation Weekly recitations will reinforce the material introduced in lecture by having students solve both analytical and computational problems. The recitations will focus on details of problem solving and computational implementation. Assignments Homework will be assigned (approximately) weekly, will be due on the dates announced in class and will be graded. Assignments will consist of a mix of several problems selected from the textbook, problems motivated by data analytics applications, and small computer simulation problems. Two group projects will be assigned during the semester, as described briefly below: Project 1: Small-group project on the topic of estimation. The instructor will formulate a question based on an interesting data set and will propose various estimators that might be used to help answer the question. Groups will be assigned to investigate and compare different aspects of the various estimators using simulation. Group presentations of the results will be given in class, and a written report of the results must be submitted. Each group will also provide written feedback for one other group. Project 2: Small-group project on the topic of testing. The instructor will formulate a question based on an interesting data set and will specify various hypotheses and tests that might be used to answer the question. Groups will be assigned to investigate one particular aspect of the testing problem using simulation. Group presentations of the results will be given in class, and a written report of the results must be submitted. Each group will also provide written feedback for one other group. 1 For information on the use of R in data analytics, see: http://www.revolutionanalytics.com/why-revolution-r/whitepapers/r-is-hot.php http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies/ http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html http://bits.blogs.nytimes.com/2009/01/08/r-you-ready-for-r/

Exams There will be two in-class midterms that cover material from lecture, the assigned readings and homework. A final examination will be given during the university s examination period. Grading Information The final course grade will be based on homework assignments, two projects, two midterms and a comprehensive final examination. The weights for each component of the grade are: Homework Project 1 Project 2 Midterm 1 Midterm 2 Final Exam 20% 10% 10% 20% 20% 20% Outline of topics 1. Introduction to inference a. General ideas: parameters, statistics, population, sample b. Classical inference (e.g., estimate a parameter) vs. modern inference from big data (e.g., infer relatedness on networks, infer evolution of network connections over time, estimate structures such as trees, etc.) c. Inference frameworks: frequentist, Bayesian 2. Sampling distributions a. Review of sampling distributions b. Asymptotic distributions of known form (e.g., CLT) c. Sampling distributions in complex settings (e.g., contingency tables, networks) d. Simulation activity: sampling distributions in complex settings 3. Estimation a. Topics in point estimation: bias, variance, sufficiency, completeness b. Interval estimation: estimation of variances, methods of constructing intervals c. Bootstrap estimation of variances, confidence intervals d. Simulation activity: Examining bootstrap approaches i. Parametric and nonparametric bootstrap ii. Comparison with existing approaches e. Missing data

4. Testing i. Effects on properties of estimators ii. Approaches to handling missing data f. Simulation activity: Exploring the effect of missing data i. Differences between patterns of missing data ii. Comparison of approaches to handling missing data a. Formulation of hypotheses b. Classical methods of forming tests (e.g., likelihood ratio tests) c. Concept of null distribution d. Methods of specifying null distribution: parametric form, nonparametric tests, permutation, simulation e. Type I and type II errors f. Statistical and practical significance g. Interpretation of p-values h. Multiple testing issues i. Simulation activities: i. Simulation-based evaluation of type I and type II error rates ii. Simulation of null distribution and comparison with other methods of specifying null iii. Examination of power under various alternatives iv. Comparison of multiple testing procedures 5. Introduction to Modeling a. What is a statistical model? b. Estimation in statistical models c. Model comparison 6. Case studies: estimation in complex settings a. Discussion of several examples of estimation for big data problems b. Examples involving estimation and/or testing for a complex data set address the issues discussed throughout the semester Statement on Academic Misconduct It is the responsibility of the Committee on Academic Misconduct to investigate or establish procedures for the investigation of all reported cases of student academic misconduct. The term academic misconduct includes all forms of student academic

misconduct wherever committed; illustrated by, but not limited to, cases of plagiarism and dishonest practices in connection with examinations. Instructors shall report all instances of alleged academic misconduct to the committee (Faculty Rule 3335-5-487). For additional information, see the Code of Student Conduct http://studentlife.osu.edu/csc/. Special Accommodations Students with disabilities that have been certified by the Office for Disability Services will be appropriately accommodated and should inform the instructor as soon as possible of their needs. The Office for Disability Services is located in 150 Pomerene Hall, 1760 Neil Avenue; telephone 292-3307, TDD 292-0901; http://www.ods.ohio-state.edu/.