MATH 4470/5470 EXPLORATORY DATA ANALYSIS ONLINE COURSE SYLLABUS



Similar documents
Math 1314 College Algebra Course Syllabus

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

MAT187 Precalculus Spring 2016 Section 27756

Linear Regression. use waist

Lecture 1: Review and Exploratory Data Analysis (EDA)

RANGER COLLEGE SYLLABUS

Math 1302 (College Algebra) Syllabus Fall 2015 (Online)

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

MATH 1342 Elementary Statistics

Math 131 College Algebra Fall 2015

1.3 Measuring Center & Spread, The Five Number Summary & Boxplots. Describing Quantitative Data with Numbers

Course Overview. Course Learning Objectives

MATH : Intermediate Algebra

Geostatistics Exploratory Analysis

Data Exploration Data Visualization

Precalculus Orientation and FAQ

SOUTHWEST COLLEGE Department of Mathematics COURSE SYLLABUS

Algebra I Credit Recovery

ACT Mathematics sub-score of 22 or greater or COMPASS Algebra score of 50 or greater or MATH 1005 or DSPM 0850

TITLE: Elementary Algebra and Geometry OFFICE LOCATION: M-106 COURSE REFERENCE NUMBER: see Website PHONE NUMBER: (619)

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY

MATH 1304 H COLLEGE ALGEBRA

Blackboard Development Checklist for Online Courses

SYLLABUS INSTRUCTOR NAME: DONNA MCGOVERN INSTRUCTOR

Requisite Approval must be attached

Fairfield Public Schools

Prerequisites: TSI Math Complete and high school Algebra II and geometry or MATH 0303.

Governors State University College of Business and Public Administration. Course: STAT Statistics for Management I (Online Course)

SYLLABUS INSTRUCTOR NAME: DONNA MCGOVERN INSTRUCTOR

CENTRAL COLLEGE Department of Mathematics COURSE SYLLABUS

CASPER COLLEGE COURSE SYLLABUS

Cover Page. MTT 1, MTT 2, MTT 3, MTT 4 Developmental Mathematics I, II, III, IV MTE 1, MTE 2, MTE 3, MTE 4, MTE 5, MTE 6, MTE 7, MTE 8, MTE 9

Manhattan Center for Science and Math High School Mathematics Department Curriculum

Mathematics Objective 6.) To recognize the limitations of mathematical and statistical models.

ANGELO STATE UNIVERSITY/GLEN ROSE HIGH SCHOOL DUAL CREDIT ALGEBRA II AND COLLEGE ALGEBRA/MATH

BUS Computer Concepts and Applications for Business Fall 2012

3: Summary Statistics

CHEMISTRY 109. Lecture 2, Fall Read This Syllabus Today Keep It for Future Reference

Introduction. The busy lives that people lead today have caused a demand for a more convenient method to

Kenneth Benoit Trinity College Dublin, Ireland

Springfield Technical Community College School of Mathematics, Sciences & Engineering Transfer

STA569 Applied (Nursing) Statistical Methods Instructor/ Facilitator:

Straightening Data in a Scatterplot Selecting a Good Re-Expression Model

Advanced Statistics & Data Analysis

Mathematics Spring Branch Campus

Module 4: Data Exploration

North Arkansas College Student Course Syllabus Spring 2015

MATH 2 Course Syllabus Spring Semester 2007 Instructor: Brian Rodas

MATH 1314 College Algebra Departmental Syllabus

Simple Linear Regression

Math 1280/1300, Pre-Calculus

A Correlation of. to the. South Carolina Data Analysis and Probability Standards

SOUTHWEST COLLEGE Department of Mathematics

General Procedures for Developing an Online Course

Descriptive Statistics

Clovis Community College Core Competencies Assessment Area II: Mathematics Algebra

MATH 1314 College Algebra Departmental Syllabus

Decision Sciences Data Analysis for Managers

MAT 151 College Algebra and MAT 182 Trigonometry Course Syllabus Spring 2014

Prerequisite: MATH 0302, or meet TSI standard for MATH 0305; or equivalent.

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

Prerequisite Math 115 with a grade of C or better, or appropriate skill level demonstrated through the Math assessment process, or by permit.

Oakland Community College MAT A1503 Calculus I Fall Semester, Instructor Jeremy JJ Mertz Office C-245

Development of a Fully Online Course in Engineering Economic Analysis

NEOSHO COUNTY COMMUNITY COLLEGE MASTER COURSE SYLLABUS

Functional Math II. Information CourseTitle. Types of Instruction

SAN DIEGO COMMUNITY COLLEGE DISTRICT CITY COLLEGE ASSOCIATE DEGREE COURSE OUTLINE

Street Address: 1111 Franklin Street Oakland, CA Mailing Address: 1111 Franklin Street Oakland, CA 94607

Alabama Department of Postsecondary Education. Representing The Alabama Community College System

MATH 101 E.S. COLLEGE ALGEBRA FALL 2011 SYLLABUS

2: Frequency Distributions

M In class: Played computer games for review of the story and vocabulary for Lob s Girl. Test will be Thursday! HW: Review for vocab:

IVY TECH COMMUNITY COLLEGE REGION 03 SYLLABUS MATH 136: COLLEGE ALGEBRA SUMMER Instructor: Jack Caster Telephone: ext.

MATH 1351 (3:3:0) Fundamentals of Mathematics II (Online Course) MATHEMATICS DEPARTMENT. Division of Arts & Sciences

Department of Mathematics Texas A&M University-Kingsville

UNIVERSITY of MASSACHUSETTS DARTMOUTH Charlton College of Business Decision and Information Sciences Fall 2010

College of Southern Maryland Fundamentals of Accounting Practice(ACC 1015) Course Syllabus Spring 2015

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS PROJECT SCHEDULING W/LAB ENGT 2021

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

Johnson County Community College

I. PREREQUISITES For information regarding prerequisites for this course, please refer to the Academic Course Catalog.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

This version of course will be available for Fall 2015 term. New York City College of Technology The City University of New York

MIS 426: Management Information Systems

MATH 0110 Developmental Math Skills Review, 1 Credit, 3 hours lab

DYERSBURG STATE COMMUNITY COLLEGE Course Syllabus

Exploratory Data Analysis

Technology and Online Computer Access Requirements: Lake-Sumter State College Course Syllabus

How To Complete The College Readiness Math Mooc

Administrative - Master Syllabus COVER SHEET

Assessing Blackboard: Improving Online Instructional Delivery

Del Mar College, Corpus Christi, TX Department of Mathematics. Course: ID: Instructor:

4 Other useful features on the course web page. 5 Accessing SAS

Social Work Statistics Spring 2000

How To Pass Onliner College Algebra 1314 Online Online Online Online Online

Access Code: RVAE4-EGKVN Financial Aid Code: 6A9DB-DEE3B-74F

Transcription:

MATH 4470/5470 EXPLORATORY DATA ANALYSIS ONLINE COURSE SYLLABUS COURSE DESCRIPTION Introduction to modern techniques in data analysis, including stem-and-leafs, box plots, resistant lines, smoothing and median polish. SCHEDULE Fall of odd years. 3 credit hours. PREREQUISITE C or better in MATH 2470, MATH 3410, or MATH 4410 or consent of instructor. COURSE PURPOSE This class is designed to give the student a basic understanding of the basic principles behind modern data analysis. One can think of data analysis as numerical detective work. Given a batch or several batches of data, we wish to discover the basic pattern or structure that is present. Also, by looking at residuals, we look for interesting structure that can only be viewed after one has removed the basic patterns from the data. The student will learn some novel techniques for exploring data. More importantly, he or she will learn a basic philosophy of data analysis that will guide any type of data exploration in the future. The course discusses the display and summary of one batch, the effective comparison of batches, relationships between two measurement variables, smoothing time series data, additive fits to two-way tables, and working with binned data. Although we discuss a variety of methods useful for exploring data, they all share different characteristics that underlie the philosophy of Exploratory Data Analysis (EDA). Revelation: In EDA, there is an emphasis on using graphs to find patterns or displaying fits. Effective displays of data can communicate information in way that is not possible by the use of numbers. Resistance: In EDA, we wish to describe the general pattern in the majority of the data. In this detective work, we don t want our search to be unusually 1

affected by a few unusual observations. So it is important that our exploratory method is resistant or insensitive to outliers. Reexpression: We will see that the natural scale that the data is presented is not always the best scale for displaying or summarizing the data. In many situations, we wish to reexpress the data to a new scale by taking a square root or a logarithm. In this class, we will talk about a useful class of reexpressions, called the power family, and give guidance on the best choice of power reexpression to simplify the data analysis. Residual: In a typical data analysis, we will find a general pattern, which we call the FIT. The description of the FIT may be very informative. But in EDA we wish to look for deviations in the data from the general pattern in the FIT. We look at the residuals, which are defined as the difference between the data and the FIT. Outline of Topics 1. Introduction to EDA 2. Working with a single batch 3. Comparing batches 4. Transformations 5. Reexpressing for symmetry 6. Resistant line 7. Straightening plots 8. Smoothing sequences 9. Two-way tables median polish 10. Binned data -- rootograms 11. Fraction data REQUIRED RESOURCES Text There is no text for this course. (The material is learned using the on-line lecture notes.) A copy of Tukey s text Exploratory Data Analysis will be kept on two-hour reserve in the Math-Science library. This book is similar in spirit to this course and contains a discussion of the exploratory methods that we will be using. Almanac Although there is no required textbook, I highly recommend that you purchase a copy of any current world almanac. This is relatively inexpensive and it provides a convenient source for data for this course. For many of the homework assignments, you will be asked to work on your own data. If you don t want to 2

purchase an almanac, then you will be able to get data from the Internet or the libraries on campus. Internet Access All of the content of the course can be found on the BGSU Blackboard site. This includes all of the lecture notes, tutorial material on the R and Fathom software, and all homework assignments and Fathom lab assignments. COURSE DELIVERY Computing using R and Fathom We will be using the statistical software R for performing the EDA methods that we ll use in this course. R is free software. You may download it on your own computer. Also, several labs will have R available on campus. In addition, there will be a number of labs that use the Fathom software these labs are designed to illustrate graphically and dynamically many of the concepts in the course. Fathom is also available on-campus. You can purchase a student copy of Fathom from Key College (www.keycollege.com) for your home computer. Organization of Blackboard material: * EDA Lectures Here you find all of the lecture notes for the course. * R STUFF Here I will put tutorials and demonstrations of data analysis commands using the R software. * FATHOM Here I will put all of the Fathom labs. * DATA Here I will place all of the datasets that we will use in this class (lectures and homework). * HOMEWORK All of the homework assignments are located here * CLASS SCHEDULE This tells you the pace at which you should be working through the material and the homework assignment dates. Distance Learning and Communication This is a non-traditional class there will be no regularly scheduled classes oncampus and the idea is to learn the material by reading the lecture notes and doing homework. The pace at which you work is shown in the timetable in the Class Schedule. The instructor will use various methods to communicate with 3

the students regarding any problems that arise. There will likely be weekly help sessions held on campus to go over software and homework problems. The best way of communicating with the instructor is through email and the instructor will respond to any email within 24 hours COURSE GOALS Participants who complete this course will acquire the knowledge to perform effective data analyses in his or her particular field of study. The student will understand the importance of effective graphical displays, the usefulness of using resistant methods in fitting models, when to use transformations to help to simplify data structures, and how to look at residuals from a model fit to look for finer structure in the data. STUDENT OUTCOMES Student evaluation The grade in the class will be based on the student s performance on homework and computer labs. A typical homework assignment will consist of several data analyses following the techniques described in the lecture notes. All homeworks are to be turned in electronically or by hard-copy to the instructor using a word processor (such as Microsoft Word) using pasted output (graphs and tables) from the statistical package R. One component of this class is a data analysis project. You choose an interesting dataset and do an extensive analysis using methods from the course. You turn in a 5-10 page report that describes the dataset and what you learned about the structure of the data in your analysis. Grading scale All homeworks will be graded on a point-system. The final grade is based on the percentage of the total points on the homework where a percentage of 90-100 is an A, a percentage of 80-89 is a B, a percentage of 70-79 is a C, a percentage of 60-69 is a D, and a percentage lower than 60 is an F. A student s participation in the course by email or postings will help determine grades in borderline cases. COURSE OF STUDY The calendar of events for the course, including the homework assignments, is shown below. Week 1 Introduction to EDA, working with a single batch displays HOMEWORK: HW0, Fathom introduction 4

Week 2 Week 3 Week 4 Week 5 Week 6 Week 7 Week 8 Week 9 Week 10 Week 11 Week 12 Week 13 Week 14 Week 15 Week 16 Working with a single batch, summaries HOMEWORK: HW 1, Fathom outliers Comparing batches HOMEWORK: Fathom bins Spread vs level plot HOMEWORK: HW 2, Fathom comparing batches Transformations HOMEWORK: Fathom spread vs. level Reexpressing for symmetry HOMEWORK: Fathom symmetry HW 3 Plotting/Resistant lines HOMEWORK: Fathom fitting lines Straightening HOMEWORK: HW 4, Fathom straightening Smoothing sequences HOMEWORK: HW 5 Two-way tables, median polish Multiplicative and extended fits HOMEWORK: HW 6 Working with binned data HOMEWORK: HW 7 Fraction data HOMEWORK: HW 8 5