DATA ANALYTICS USING R

Size: px
Start display at page:

Download "DATA ANALYTICS USING R"

Transcription

1 DATA ANALYTICS USING R Duration: 90 Hours Intended audience and scope: The course is targeted at fresh engineers, practicing engineers and scientists who are interested in learning and understanding data analytics in sufficient depth and breadth. The course will provide an overview of how to pose meaningful data analytic problems in a commercial setting. At the end of the course, participants will develop structured thinking approach to transition from data to problem definition. R, an open source tool for data analytics will be introduced in depth. Important and commonly used data analytic and machine learning algorithms will be described in detail. Laboratory sessions will be conducted as part of each module with the expectation that the participants will be able to apply these algorithms on their own data. A capstone case study tailored to the participants field of interest will be solved at the end of the course. Objectives: Introduce the participants to the field of data analytics background and key concepts Introduce the participants to problem types in the area of data analytics possible problem formulation framework Introduce the participants to R an easy to use tool for high level data analytics Introduce the participants to a comprehensive overview of linear algebra and statistics concepts critical concepts for the understanding of data analytic algorithms Introduce the participants to in-depth explanation of the most used data analytic algorithms supported by hands-on work in R from an application viewpoint Introduce the participants to a real life application of data analytics a case study approach. The case study can be chosen by the participants based on their field of interest Modules: Module 1: Data science Introduction Module 2: Data science Class of problems Module 3: R programming Module 4: Statistical modelling Module 5: Data inter-relationships Basics of linear algebra and a brief introduction to nonlinear equations Module 6: Data preparation Module 7: Predictive modelling Module 8: Machine learning techniques Module 9: Introduction to text mining and big data Module 10: Case study Optional modules Pre-requisite: Bachelor s degree with understanding of basic statistics, matrix algebra and probability

2 Module 1: Data science Introduction (3 hours) 1. Participants will get a bird s eye view of the field of data analytics Introduction to big data and data science (The Vs of big data and mathematical concepts that are building blocks for this field) Analytics as a pervasive solution approach cross-cutting disparate problem domains Module 2: Data science Class of problems (3 hours) 1. Participants will learn to apply structured thinking to unstructured problems 2. Participants will be able to categorize and understand various data types 3. Participants will be able to convert imprecise business relevant problem statements to precise data analytic problems 4. Participants will learn the importance of visualization in the data analytics solution process Structured thinking and how it can help Conceptual understanding of data types Importance of quality of data Conceptual understanding of solution typology Introduction to problem formulation framework Impact of visualization Business relevant problem statements Module 3: R programming (6 hours) 1. Participants will be introduced to basics of R programming 2. Participants will be able to write their own programs and will learn to use the already existing data analytics modules in R 3. Participants will be able to import data from Excel, SQL, etc to the R platform RStudio and its GUI Data types Importing and exporting data in R Data preprocessing Matrix algebra Built-in functions Programming in R Data visualization (ggplot)

3 Module 4: Statistical modelling (10 hours) 1. Participants will become well versed in basic probability and statistics concepts 2. Participants will be able to setup hypothesis testing protocols 3. Participants will be able to interpret hypothesis test results Probability Principle of counting Conditional probability, Bayes theorem, independent events Random variables, expectation Continuous and discrete random variables and their distributions (Poisson, Binomial, Normal and its derivatives), statistical intervals Descriptive statistics Hypothesis testing Introduction to elements of hypothesis testing General procedure for hypothesis testing One-sided and two-sided tests p-values, Type I and Type II errors Z, T, F and Chi-squared tests for hypothesis testing Introduction to Bayesian inference Hands-on session in R through examples Module 5: Data inter-relationships Basics of linear algebra and a brief introduction to nonlinear equations (12 hours) 1. Participants will be able to identify relationships between variables in large datasets 2. Participants will be able to identify information sufficiency in terms of both equations and variables 3. Participants will be able to understand basic linear algebra concepts that underlie the complicated data analytics algorithms 4. Participants will be able to understand and interpret solutions to simultaneous nonlinear equations Solving simultaneous linear equations o Independence of the given equations Redundant equations Inconsistent equations o Thinking about an ordered set of variables as vectors o Constructing matrices out of linear equations o Conditions for existence of solutions o Conditions for uniqueness of solutions o Connections between solutions when there are multiple solutions o Elimination approach to solving simultaneous equations

4 Introduction of the notion of distance Perpendicular vectors notion of orthogonality Converting vectors to an orthonormal basis Understanding solutions when the number of variables and equations are different Introduction to simultaneous nonlinear equations Newton-Raphson method for solving nonlinear equations Hands-on session in R through examples Module 6: Data preparation (6 hours) 1. Participants will be able to setup appropriate sampling techniques 2. Participants will be able to apply techniques to address outliers and missing values Sampling techniques o Probability sampling o Non probability sampling Stratified vs. Cluster sampling Treating outliers and missing values Design of Experiments o Single factor o Multiple factors Hands-on session in R through examples Module 7: Predictive modelling (16 hours) 1. Participants will be able to identify relationships between variables through correlation analysis 2. Participants will be able to develop predictive models between variables 3. Participants will be able to rationalize and assess the fidelity of models that are built Correlation o Pearson s correlation o Kendall rank correlation o Spearman rank correlation Regression o Types of regression o Fitting a function Criterion for best fit o Least squares

5 Correlation vs Regression Simple regression Multiple regression Diagnostics and ANOVA Model assessment and validation Non-parametric testing Hands-on session in R through examples Module 8: Machine learning techniques (24 hours) 1. Participants will be able to understand and develop algorithms for classification problems 2. Participants will be able to understand and develop algorithms for function approximation problems 3. Participants will be able to conceptualize novel algorithms Dimensionality reduction methods o Principal component analysis and its variants o Multidimensional scaling Multivariate regression o Ridge regression o Principal component regression o Logistic regression o LASSO Classification methods o Linear discriminant analysis o Quadratic discriminant analysis o K-neighborhood o Naïve Bayes classifier Clustering methods o K means clustering o Fuzzy C-means clustering o Hierarchical clustering Hands-on session in R through examples Module 9: Introduction to text mining and big data (3 hours)

6 Module 10: Case study (4 hours) 1. Participants will learn to solve data analytics problems from conceptualization to the final solution and concomitant visualization of the solution Participants can choose from one of the following domains Case Study 1: Marketing and sales Case Study 2: Accounting Case Study 3: Supply chain management Case Study 4: Financial Case Study 5: Process productivity improvement Case study format: Introduction to the problem (0.5 hours) Participants to identify the problem statement (0.5 hours) Presentation of the problem statement by the participants identification of variables, listing down key assumptions, understanding the data, data preparation requirements, contours of problem solution, visualization specifications (1 hour) Presentation of the problem statement by the instructor (1 hour) Solution for the problem statement in R, results and visualization (1 hour) Optional modules (10 hours for each module) 1. Time series 2. Neural networks 3. Decision trees 4. Natural language processing 5. Multivariate data analysis methods 6. Deep learning (pre-requisite neural networks)

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics. Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

More information

MSCA 31000 Introduction to Statistical Concepts

MSCA 31000 Introduction to Statistical Concepts MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced

More information

MSCA 31000 Introduction to Statistical Concepts

MSCA 31000 Introduction to Statistical Concepts MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced

More information

Graduate Programs in Statistics

Graduate Programs in Statistics Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate

More information

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19 PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations

More information

INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

More information

MEU. INSTITUTE OF HEALTH SCIENCES COURSE SYLLABUS. Biostatistics

MEU. INSTITUTE OF HEALTH SCIENCES COURSE SYLLABUS. Biostatistics MEU. INSTITUTE OF HEALTH SCIENCES COURSE SYLLABUS title- course code: Program name: Contingency Tables and Log Linear Models Level Biostatistics Hours/week Ther. Recite. Lab. Others Total Master of Sci.

More information

Semester 2 Statistics Short courses

Semester 2 Statistics Short courses Semester 2 Statistics Short courses Course: STAA0001 - Basic Statistics Blackboard Site: STAA0001 Dates: Sat 10 th Sept and 22 Oct 2016 (9 am 5 pm) Room EN409 Assumed Knowledge: None Day 1: Exploratory

More information

Learning outcomes. Knowledge and understanding. Competence and skills

Learning outcomes. Knowledge and understanding. Competence and skills Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges

More information

Audit Analytics. --An innovative course at Rutgers. Qi Liu. Roman Chinchila

Audit Analytics. --An innovative course at Rutgers. Qi Liu. Roman Chinchila Audit Analytics --An innovative course at Rutgers Qi Liu Roman Chinchila A new certificate in Analytic Auditing Tentative courses: Audit Analytics Special Topics in Audit Analytics Forensic Accounting

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

City University of Hong Kong. Information on a Course offered by Department of Computer Science with effect from Semester A in 2014 / 2015

City University of Hong Kong. Information on a Course offered by Department of Computer Science with effect from Semester A in 2014 / 2015 City University of Hong Kong Information on a Course offered by Department of Computer Science with effect from Semester A in 2014 / 2015 Part I Course Title: Fundamentals of Data Science Course Code:

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Microsoft Azure Machine learning Algorithms

Microsoft Azure Machine learning Algorithms Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql Tomaz.kastrun@gmail.com http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation

More information

MS1b Statistical Data Mining

MS1b Statistical Data Mining MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

More information

Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

More information

Semester 1 Statistics Short courses

Semester 1 Statistics Short courses Semester 1 Statistics Short courses Course: STAA0001 Basic Statistics Blackboard Site: STAA0001 Dates: Sat. March 12 th and Sat. April 30 th (9 am 5 pm) Assumed Knowledge: None Course Description Statistical

More information

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.7 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Linear Regression Other Regression Models References Introduction Introduction Numerical prediction is

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM KATE GLEASON COLLEGE OF ENGINEERING John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE (KGCOE- CQAS- 747- Principles of

More information

Applications of Intermediate/Advanced Statistics in Institutional Research

Applications of Intermediate/Advanced Statistics in Institutional Research Applications of Intermediate/Advanced Statistics in Institutional Research Edited by Mary Ann Coughlin THE ASSOCIATION FOR INSTITUTIONAL RESEARCH Number Sixteen Resources in Institional Research 2005 Association

More information

Chapter 14: Analyzing Relationships Between Variables

Chapter 14: Analyzing Relationships Between Variables Chapter Outlines for: Frey, L., Botan, C., & Kreps, G. (1999). Investigating communication: An introduction to research methods. (2nd ed.) Boston: Allyn & Bacon. Chapter 14: Analyzing Relationships Between

More information

C: LEVEL 800 {MASTERS OF ECONOMICS( ECONOMETRICS)}

C: LEVEL 800 {MASTERS OF ECONOMICS( ECONOMETRICS)} C: LEVEL 800 {MASTERS OF ECONOMICS( ECONOMETRICS)} 1. EES 800: Econometrics I Simple linear regression and correlation analysis. Specification and estimation of a regression model. Interpretation of regression

More information

CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen

CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major

More information

Practical Data Science with R

Practical Data Science with R Practical Data Science with R Instructor Matthew Renze Twitter: @matthewrenze Email: matthew@matthewrenze.com Web: http://www.matthewrenze.com Course Description Data science is the practice of transforming

More information

Statistical Models in Data Mining

Statistical Models in Data Mining Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

More information

COURSE SYLLABUS COURSE TITLE:

COURSE SYLLABUS COURSE TITLE: 1 COURSE SYLLABUS COURSE TITLE: FORMAT: CERTIFICATION EXAMS: 55040 Data Mining: Predictive Analytics with Microsoft SQL Server Analysis Services and Excel Using PowerPivot and the Data Mining Add-Ins Instructor-Led

More information

1. Students will demonstrate an understanding of the real number system as evidenced by classroom activities and objective tests

1. Students will demonstrate an understanding of the real number system as evidenced by classroom activities and objective tests MATH 102/102L Inter-Algebra/Lab Properties of the real number system, factoring, linear and quadratic equations polynomial and rational expressions, inequalities, systems of equations, exponents, radicals,

More information

Learning outcomes. Knowledge and understanding. Ability and Competences. Evaluation capability and scientific approach

Learning outcomes. Knowledge and understanding. Ability and Competences. Evaluation capability and scientific approach Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges

More information

Big Data Analytics and Optimization

Big Data Analytics and Optimization Big Data Analytics and Optimization C e r t i f i c a t e P r o g r a m i n E n g i n e e r i n g E x c e l l e n c e e.edu.in http://www.insof LIST OF COURSES Essential Business Skills for a Data Scientist...

More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

List of Ph.D. Courses

List of Ph.D. Courses Research Methods Courses (5 courses/15 hours) List of Ph.D. Courses The research methods set consists of five courses (15 hours) that discuss the process of research and key methodological issues encountered

More information

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Is a Data Scientist the New Quant? Stuart Kozola MathWorks Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by

More information

Big Data Analytics and Optimization

Big Data Analytics and Optimization Big Data Analytics and Optimization C e r t i f i c a t e P r o g r a m i n E n g i n e e r i n g E x c e l l e n c e C e r t i f i c a t e P r o g r a m s i n A c c e l e r a t e d E n g i n e e r i n

More information

Economic Order Quantity and Economic Production Quantity Models for Inventory Management

Economic Order Quantity and Economic Production Quantity Models for Inventory Management Economic Order Quantity and Economic Production Quantity Models for Inventory Management Inventory control is concerned with minimizing the total cost of inventory. In the U.K. the term often used is stock

More information

PROGRAM DIRECTOR: Arthur O Connor Email Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements

PROGRAM DIRECTOR: Arthur O Connor Email Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements Data Analytics (MS) PROGRAM DIRECTOR: Arthur O Connor CUNY School of Professional Studies 101 West 31 st Street, 7 th Floor New York, NY 10001 Email Contact: Arthur O Connor, arthur.oconnor@cuny.edu URL:

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. Introduction p. xvii Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. 9 State of the Practice in Analytics p. 11 BI Versus

More information

HT2015: SC4 Statistical Data Mining and Machine Learning

HT2015: SC4 Statistical Data Mining and Machine Learning HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric

More information

Exploring Practical Data Mining Techniques at Undergraduate Level

Exploring Practical Data Mining Techniques at Undergraduate Level Exploring Practical Data Mining Techniques at Undergraduate Level ERIC P. JIANG University of San Diego 5998 Alcala Park, San Diego, CA 92110 UNITED STATES OF AMERICA jiang@sandiego.edu Abstract: Data

More information

The Data Mining Process

The Data Mining Process Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

More information

Official SAS Curriculum Courses

Official SAS Curriculum Courses Certificate course in Predictive Business Analytics Official SAS Curriculum Courses SAS Programming Base SAS An overview of SAS foundation Working with SAS program syntax Examining SAS data sets Accessing

More information

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE: COS-STAT-747 Principles of Statistical Data Mining

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

Course Description. Learning Objectives

Course Description. Learning Objectives STAT X400 (2 semester units in Statistics) Business, Technology & Engineering Technology & Information Management Quantitative Analysis & Analytics Course Description This course introduces students to

More information

Diablo Valley College Catalog 2014-2015

Diablo Valley College Catalog 2014-2015 Mathematics MATH Michael Norris, Interim Dean Math and Computer Science Division Math Building, Room 267 Possible career opportunities Mathematicians work in a variety of fields, among them statistics,

More information

Graduate Certificate in Systems Engineering

Graduate Certificate in Systems Engineering Graduate Certificate in Systems Engineering Systems Engineering is a multi-disciplinary field that aims at integrating the engineering and management functions in the development and creation of a product,

More information

Data Mining mit der JMSL Numerical Library for Java Applications

Data Mining mit der JMSL Numerical Library for Java Applications Data Mining mit der JMSL Numerical Library for Java Applications Stefan Sineux 8. Java Forum Stuttgart 07.07.2005 Agenda Visual Numerics JMSL TM Numerical Library Neuronale Netze (Hintergrund) Demos Neuronale

More information

Collaborative Filtering. Radek Pelánek

Collaborative Filtering. Radek Pelánek Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains

More information

Predictive Data modeling for health care: Comparative performance study of different prediction models

Predictive Data modeling for health care: Comparative performance study of different prediction models Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath hiremat.nitie@gmail.com National Institute of Industrial Engineering (NITIE) Vihar

More information

AP Statistics: Syllabus 3

AP Statistics: Syllabus 3 AP Statistics: Syllabus 3 Scoring Components SC1 The course provides instruction in exploring data. 4 SC2 The course provides instruction in sampling. 5 SC3 The course provides instruction in experimentation.

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Neil H. Timm Applied Multivariate Analysis With 42 Figures Springer Contents Preface Acknowledgments List of Tables List of Figures vii ix xix xxiii 1 Introduction 1 1.1 Overview 1 1.2 Multivariate Models

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Master of Arts in Mathematics

Master of Arts in Mathematics Master of Arts in Mathematics Administrative Unit The program is administered by the Office of Graduate Studies and Research through the Faculty of Mathematics and Mathematics Education, Department of

More information

INTERNATIONAL MASTER IN BUSINESS ANALYTICS AND BIG DATA

INTERNATIONAL MASTER IN BUSINESS ANALYTICS AND BIG DATA POLITECNICO DI MILANO GRADUATE SCHOOL OF BUSINESS BABD INTERNATIONAL MASTER IN BUSINESS ANALYTICS AND BIG DATA Courses Description A JOINT PROGRAM WITH POLITECNICO DI MILANO SCHOOL OF MANAGEMENT PRE-COURSES

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

A Statistical Text Mining Method for Patent Analysis

A Statistical Text Mining Method for Patent Analysis A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, shjun@cju.ac.kr Abstract Most text data from diverse document databases are unsuitable for analytical

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376 Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

K2 Data Science WELCOME TO K2. Learning the Fundamental Skills. Creating Data Driven Applications. Portfolio and Career Support

K2 Data Science WELCOME TO K2. Learning the Fundamental Skills. Creating Data Driven Applications. Portfolio and Career Support K2 Data Science Become a data scientist with our mentor-led program. WELCOME TO K2 Learning the Fundamental Skills Spend the first half of the program learning fundamental skills through recorded lectures,

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

The Probit Link Function in Generalized Linear Models for Data Mining Applications

The Probit Link Function in Generalized Linear Models for Data Mining Applications Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications

More information

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

More information

STATISTICS COURSES UNDERGRADUATE CERTIFICATE FACULTY. Explanation of Course Numbers. Bachelor's program. Master's programs.

STATISTICS COURSES UNDERGRADUATE CERTIFICATE FACULTY. Explanation of Course Numbers. Bachelor's program. Master's programs. STATISTICS Statistics is one of the natural, mathematical, and biomedical sciences programs in the Columbian College of Arts and Sciences. The curriculum emphasizes the important role of statistics as

More information

Clustering and Data Mining in R

Clustering and Data Mining in R Clustering and Data Mining in R Workshop Supplement Thomas Girke December 10, 2011 Introduction Data Preprocessing Data Transformations Distance Methods Cluster Linkage Hierarchical Clustering Approaches

More information

PCHS ALGEBRA PLACEMENT TEST

PCHS ALGEBRA PLACEMENT TEST MATHEMATICS Students must pass all math courses with a C or better to advance to the next math level. Only classes passed with a C or better will count towards meeting college entrance requirements. If

More information

270107 - MD - Data Mining

270107 - MD - Data Mining Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 015 70 - FIB - Barcelona School of Informatics 715 - EIO - Department of Statistics and Operations Research 73 - CS - Department of

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

More information

Mathematics within the Psychology Curriculum

Mathematics within the Psychology Curriculum Mathematics within the Psychology Curriculum Statistical Theory and Data Handling Statistical theory and data handling as studied on the GCSE Mathematics syllabus You may have learnt about statistics and

More information

2014-2015 The Master s Degree with Thesis Course Descriptions in Industrial Engineering

2014-2015 The Master s Degree with Thesis Course Descriptions in Industrial Engineering 2014-2015 The Master s Degree with Thesis Course Descriptions in Industrial Engineering Compulsory Courses IENG540 Optimization Models and Algorithms In the course important deterministic optimization

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Prerequisites. Course Outline

Prerequisites. Course Outline MS-55040: Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot Description This three-day instructor-led course will introduce the students to the concepts of data mining,

More information

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

More information

QUALITY ENGINEERING PROGRAM

QUALITY ENGINEERING PROGRAM QUALITY ENGINEERING PROGRAM Production engineering deals with the practical engineering problems that occur in manufacturing planning, manufacturing processes and in the integration of the facilities and

More information

CLASSIFICATION AND CLUSTERING. Anveshi Charuvaka

CLASSIFICATION AND CLUSTERING. Anveshi Charuvaka CLASSIFICATION AND CLUSTERING Anveshi Charuvaka Learning from Data Classification Regression Clustering Anomaly Detection Contrast Set Mining Classification: Definition Given a collection of records (training

More information

Prerequisite: High School Chemistry.

Prerequisite: High School Chemistry. ACT 101 Financial Accounting The course will provide the student with a fundamental understanding of accounting as a means for decision making by integrating preparation of financial information and written

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information

Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests. The Applied Research Center Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } One-Sample Chi-Square Test

More information

Industrial and Systems Engineering Master of Science Program Logistics and Supply Chain Management

Industrial and Systems Engineering Master of Science Program Logistics and Supply Chain Management Industrial and Systems Engineering Master of Science Program Logistics and Supply Chain Management Department of Integrated Systems Engineering The Ohio State University Logistics is the science of design,

More information

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4. Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics

More information

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING

More information

A spreadsheet Approach to Business Quantitative Methods

A spreadsheet Approach to Business Quantitative Methods A spreadsheet Approach to Business Quantitative Methods by John Flaherty Ric Lombardo Paul Morgan Basil desilva David Wilson with contributions by: William McCluskey Richard Borst Lloyd Williams Hugh Williams

More information

Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program

Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program Department of Mathematics and Statistics Degree Level Expectations, Learning Outcomes, Indicators of

More information

MATH BOOK OF PROBLEMS SERIES. New from Pearson Custom Publishing!

MATH BOOK OF PROBLEMS SERIES. New from Pearson Custom Publishing! MATH BOOK OF PROBLEMS SERIES New from Pearson Custom Publishing! The Math Book of Problems Series is a database of math problems for the following courses: Pre-algebra Algebra Pre-calculus Calculus Statistics

More information

Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions

Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions SMA 50: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ()) Spring Term (Feb May 200) Faculty: Professor Roy Welsch Wed 0 Feb 7:00-8:0

More information

Proposal for Undergraduate Certificate in Large Data Analysis

Proposal for Undergraduate Certificate in Large Data Analysis Proposal for Undergraduate Certificate in Large Data Analysis To: Helena Dettmer, Associate Dean for Undergraduate Programs and Curriculum From: Suely Oliveira (Computer Science), Kate Cowles (Statistics),

More information

Course Syllabus For Operations Management. Management Information Systems

Course Syllabus For Operations Management. Management Information Systems For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third

More information

Government of Russian Federation. Faculty of Computer Science School of Data Analysis and Artificial Intelligence

Government of Russian Federation. Faculty of Computer Science School of Data Analysis and Artificial Intelligence Government of Russian Federation Federal State Autonomous Educational Institution of High Professional Education National Research University «Higher School of Economics» Faculty of Computer Science School

More information

Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal

Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Syllabus for the TEMPUS SEE PhD Course (Podgorica, April 4 29, 2011) Franz Kappel 1 Institute for Mathematics and Scientific Computing University of Graz Žaneta Popeska 2 Faculty

More information

Business Analytics. Methods, Models, and Decisions. James R. Evans : University of Cincinnati PEARSON

Business Analytics. Methods, Models, and Decisions. James R. Evans : University of Cincinnati PEARSON Business Analytics Methods, Models, and Decisions James R. Evans : University of Cincinnati PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London

More information