KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics


ROCHESTER INSTITUTE OF TECHNOLOGY
COURSE OUTLINE FORM

KATE GLEASON COLLEGE OF ENGINEERING
John D. Hromi Center for Quality and Applied Statistics

NEW (or REVISED) COURSE (KGCOE-CQAS Principles of Statistical Data Mining):

1.0 Course Designations and Approvals

Required course approvals | Approval request date | Approval granted date
Academic Unit Curriculum Committee
College Curriculum Committee

Optional designations | Is designation desired? | *Approval request date | **Approval granted date
General Education | Yes / No
Writing Intensive | Yes / No
Honors | Yes / No

2.0 Course information:

Course title: KGCOE-CQAS-747 Principles of Statistical Data Mining
Credit hours: 3
Prerequisite(s): One course in basic statistics
Co-requisite(s): None
Course proposed by: Ernest Fokoué
Effective date: August 2013

 | Contact hours | Maximum students/section
Classroom | 3 | 25
Lab | 0 |
Studio | 0 |
Other (specify) | 0 |

2.a Course Conversion Designation*** (Please check which applies to this course).
*For more information on Course Conversion Designations please see page four.

Semester Equivalent (SE). Please indicate which quarter course it is equivalent to:
Semester Replacement (SR). Please indicate the quarter course(s) this course is replacing:

Principles of Statistical Data Mining September 2010

2.b Semester(s) offered (check)
Fall (distance)   Spring (campus)   Summer   Other
All courses must be offered at least once every 2 years. If course will be offered on a bi-annual basis, please indicate here:

2.c Student Requirements
Students required to take this course (by program and year, as appropriate): None
Students who might elect to take the course: This is an elective for graduate students in the Advanced Certificate and MS programs in Applied Statistics. Graduate students in other programs who are interested in statistical data mining may also elect to take this class.

In the sections that follow, please use sub-numbering as appropriate (e.g., 3.1, 3.2, etc.)

3.0 Goals of the course (including rationale for the course, when appropriate):
For students:
3.1 To achieve a practical understanding of modern statistical data mining techniques.
3.2 To develop the ability to correctly apply modern data mining techniques to a variety of real-world case studies involving massive, high-dimensional, complex data.
3.3 To gain hands-on experience with data mining through case studies such as: describing website visitors, market basket analysis, describing customer satisfaction, predicting credit risk of small businesses, predicting e-learning student performance, predicting customer lifetime value, and operational risk management.

4.0 Course description (as it will appear in the RIT Catalog, including pre- and corequisites, and quarters offered).
Please use the following format:

Course: KGCOE-CQAS-846 Principles of Statistical Data Mining I

This course covers topics such as clustering, classification and regression trees, multiple linear regression under various conditions, logistic regression, PCA and kernel PCA, model-based clustering via mixtures of Gaussians, spectral clustering, text mining, neural networks, support vector machines, multidimensional scaling, variable selection, model selection, k-means clustering, k-nearest neighbors classifiers, statistical tools for modern machine learning and data mining, naïve Bayes classifiers, variance reduction methods (bagging), and ensemble methods for predictive optimality. This course is designed to provide the student with a solid, practical, hands-on introduction to the fundamental concepts and techniques of modern statistical data mining, with a strong emphasis on the wide applicability of these techniques to real-world problems. Throughout the course, many real-world case studies are used to motivate and explain the strengths and appropriateness of each method of interest. To ease the exploration of the techniques, SAS Enterprise Miner will be our main computing software. We will occasionally mention other notable software for data mining, such as Rattle in the R environment. Topics throughout this course include, among other things: Distance Measures in Data Mining, Hierarchical Clustering, Classification and Regression Trees, Multiple Linear Regression under various conditions, Logistic Regression for Pattern Recognition, Principal Components Analysis, Factor Analysis, Model-based Clustering via Mixture of Gaussians, Spectral Clustering Techniques, Text Mining, Neural Networks for Classification and Regression, Support Vector Machines for Classification and Regression, Multidimensional Scaling, Variable Selection, Model Selection, k-means Clustering, k-Nearest Neighbors Classifiers, Statistical Tools for modern machine learning and data mining, Bayes Classifiers, Fisher Linear Discriminant Analysis and Quadratic Discrimination, Variance Reduction Methods (Bagging), and Ensemble Methods for Predictive Optimality (Boosting and Random Forests).

Prerequisite(s): One course in basic statistics.
Class 3, Lab 0, Credit 3 (Fall-distance, Spring-campus)

5.0 Possible resources (texts, references, computer packages, etc.)

Required texts
5.1 Applied Data Mining for Business and Industry, 2nd ed., Paolo Giudici and Silvia Figini (2009), Wiley, ISBN

Recommended texts
5.2 Statistical Data Mining Using SAS Applications, 2nd ed., George Fernandez (2009), CRC Press, ISBN
5.3 Data Mining Using SAS Enterprise Miner, Randall Matignon (2009), Wiley
5.4 Getting Started with SAS Enterprise Miner (From SAS)
5.5 Applied Analytics Using SAS Enterprise Miner (From SAS)

6.0 Topics (outline):
6.1 Complex data structures and the emergence of Data Mining and Machine Learning
6.2 Measures of location and measures of variability
6.3 Distance measures, Similarity measures and Dependency measures
6.4 Multiple linear regression and its extensions to Radial Basis Function regression
6.5 Difference of focus between model identification and predictive optimality
6.6 Principles and applications of dimensionality reduction techniques
6.7 Principal Component Analysis and Singular Value Decomposition
6.8 Cluster analysis via Hierarchical and Non-Hierarchical Methods
6.9 Factor Analysis and Mixtures of Factor Analyzers
6.10 Multidimensional scaling and its relationship to other techniques
6.11 Model-Based Clustering via Mixtures of Gaussians
6.12 Logistic regression for Pattern Recognition
6.13 Linear and Quadratic Discriminant Analysis
6.14 Classification and Regression Trees
6.15 Neural networks: Multilayer Perceptron and Kohonen networks
6.16 Support Vector Machines for classification and regression
6.17 Nearest-neighbor models: k-means and k-Nearest Neighbors
6.18 Variance Reduction Techniques: Bagging Predictors
6.19 Non-parametric modeling and Bayesian modeling
6.20 Generalized linear models and Log-linear models
6.21 Graphical models and their applications
6.22 Model evaluation and model selection techniques
6.23 Ensemble Methods for Predictive Optimality: Boosting


7.0 Intended course learning outcomes and associated assessment methods of those outcomes (please include as many Course Learning Outcomes as appropriate, one outcome and assessment method per row).

Course Objectives | Assessment Method (Homework, Exams, Projects)

Level 2: Comprehension:
2.1 Understands the central role of model uncertainty in data mining, and maintains a keen awareness of the difference between accurate model identification and optimal prediction.
2.2 Appreciates and takes into account the ever-present bias/variance dilemma in model selection and model building, and strives to find solutions that achieve a bias/variance trade-off.
2.3 Knows when and how to combine unsupervised learning techniques (e.g., PCA for feature extraction) with supervised learning techniques (e.g., Neural Networks) to achieve optimality.
2.4 Recognizes when and how to use Ensemble methods rather than select a single model, and also knows when to use variance reduction techniques like Bagging.
2.5 Understands the profound meaning of the No Free Lunch theorem, refrains from relying solely on one single method of data mining, and always compares various methods before making recommendations.

Level 3: Application:
3.1 Identifies an interesting real-world engineering problem during the course of study and formulates it statistically.
3.2 Recognizes, for each real-world case study, which classes of data mining methods are more appropriate.
3.3 Uses statistical software like SAS Enterprise Miner to perform a thorough data mining analysis of real-world problems.

Level 4: Analysis:
4.1 Determines/decides which statistical model(s) appear to be most appropriate for the task at hand in light of the graphs and descriptive statistics obtained from exploratory data analysis.
4.2 Fits the chosen plausible model(s) using a statistical software package like SAS Enterprise Miner, then extracts and interprets the estimates of the parameters.
4.3 Performs additional statistical hypothesis tests wherever needed.
4.4 Checks all the assumptions underlying each method/technique used.
4.5 Interprets the statistical estimation and prediction results produced by the software package.

Level 5: Synthesis:
5.1 Selects the best model according to some of the usual model selection criteria.
5.2 Provides any needed/required formal prediction or estimation.
5.3 Uses an ensemble (aggregation) of methods wherever the need arises.
5.4 Draws conclusions and interpretations about the original engineering task based on sound formal analysis, such as confidence intervals and results of hypothesis testing.

Level 6: Evaluation:
6.1 Evaluates several potential statistical models and decides on the most appropriate one for a given purpose.
6.2 Provides any needed/required formal prediction or estimation.
6.3 Makes recommendations in clear and non-technical language based on a thorough assessment of the statistical findings.
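Outcomes 2.5 and 6.1 ask students to compare several candidate models before recommending one. A minimal sketch of such a comparison is given here, assuming synthetic data and two deliberately simple candidates (a mean-only baseline and a least-squares line) scored by 5-fold cross-validation. Python is used purely for illustration, since the course itself works in SAS Enterprise Miner; every function name and the data-generating process below are hypothetical.

```python
import random
from statistics import mean


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Simple least-squares line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def cv_mse(fit, xs, ys, k=5):
    """Estimate mean squared prediction error by k-fold cross-validation."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        # Every k-th observation (offset by the fold index) is held out.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        sq_errors.extend((ys[i] - model(xs[i])) ** 2 for i in test_idx)
    return mean(sq_errors)


# Synthetic data (an assumption for illustration): y = 1 + 2x + noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [1 + 2 * x + rng.gauss(0, 0.5) for x in xs]

candidates = [("mean baseline", fit_mean), ("linear regression", fit_line)]
scores = {name: cv_mse(fit, xs, ys) for name, fit in candidates}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

Scoring every candidate on held-out folds, rather than on the data used for fitting, is what makes the comparison reflect predictive performance; the same loop extends to any further candidate that exposes a fit function of the same shape.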

8.0 Program outcomes and/or goals supported by this course

Relationship to Program Outcomes (1 = slightly, 2 = moderately, 3 = significantly)

Program Outcomes and/or Goals for CQAS | Level of Support

8.1 Advanced Certificate in Lean Six Sigma
  Demonstrates a solid understanding of statistical thinking and Lean Six Sigma methodology in solving real-world problems.
  Leads Lean Six Sigma improvement projects.

Advanced Certificate and Masters of Science in Applied Statistics
  Demonstrates a solid understanding of statistical thinking and applied statistics methodology in solving real-world problems.
  Designs studies that are efficient and valid.
  Analyzes data using appropriate statistical methods.
  Communicates the results of statistical analysis with effective reports and presentations.
  Note: Students obtaining the Advanced Certificate in Applied Statistics will not be expected to perform at the same level as students obtaining a Master of Science degree.

Not Applicable

General Education Learning Outcome Supported by the Course, if appropriate | Assessment Method

Communication
  Express themselves effectively in common college-level written forms using standard American English
  Revise and improve written and visual content
  Express themselves effectively in presentations, either in spoken standard American English or sign language (American Sign Language or English-based Signing)
  Comprehend information accessed through reading and discussion

Intellectual Inquiry
  Review, assess, and draw conclusions about hypotheses and theories
  Analyze arguments, in relation to their premises, assumptions, contexts, and conclusions
  Construct logical and reasonable arguments that include anticipation of counterarguments
  Use relevant evidence gathered through accepted scholarly methods and properly acknowledge sources of information

Ethical, Social and Global Awareness
  Analyze similarities and differences in human experiences and consequent perspectives
  Examine connections among the world's populations
  Identify contemporary ethical questions and relevant stakeholder positions

Scientific, Mathematical and Technological Literacy
  Explain basic principles and concepts of one of the natural sciences
  Apply methods of scientific inquiry and problem solving to contemporary issues
  Comprehend and evaluate mathematical and statistical information
  Perform college-level mathematical operations on quantitative data
  Describe the potential and the limitations of technology
  Use appropriate technology to achieve desired outcomes

Creativity, Innovation and Artistic Literacy
  Demonstrate creative/innovative approaches to course-based assignments or projects
  Interpret and evaluate artistic expression considering the cultural context in which it was created

10.0 Other relevant information (such as special classroom, studio, or lab needs, special scheduling, media requirements, etc.)
None

*Optional course designation; approval request date: This is the date that the college curriculum committee forwards this course to the appropriate optional course designation curriculum committee for review. The chair of the college curriculum committee is responsible for filling in this date.

**Optional course designation; approval granted date: This is the date the optional course designation curriculum committee approves a course for the requested optional course designation. The chair of the appropriate optional course designation curriculum committee is responsible for filling in this date.

***Course Conversion Designations: Please use the following definitions to complete table 2.a on page one.
Semester Equivalent (SE): Closely corresponds to an existing quarter course (e.g., a 4 quarter credit hour (qch) course which becomes a 3 semester credit hour (sch) course). The semester course may develop material in greater depth or length.
Semester Replacement (SR): A semester course (or courses) taking the place of a previous quarter course(s) by rearranging or combining material from a previous quarter course(s) (e.g., a two-semester sequence that replaces a three-quarter sequence).
New (N): No corresponding quarter course(s).


Introduction to machine learning and pattern recognition Lecture 1 Coryn Bailer-Jones Introduction to machine learning and pattern recognition Lecture 1 Coryn Bailer-Jones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 What is machine learning? Data description and interpretation

More information

MSCA 31000 Introduction to Statistical Concepts

MSCA 31000 Introduction to Statistical Concepts MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced

More information

Exploring Practical Data Mining Techniques at Undergraduate Level

Exploring Practical Data Mining Techniques at Undergraduate Level Exploring Practical Data Mining Techniques at Undergraduate Level ERIC P. JIANG University of San Diego 5998 Alcala Park, San Diego, CA 92110 UNITED STATES OF AMERICA jiang@sandiego.edu Abstract: Data

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning.  CS 2750 Machine Learning. Lecture 1 Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x-5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

CIS 270. Systems Analysis and Design

CIS 270. Systems Analysis and Design CIS 270 Systems Analysis and Design Approved: May 6, 2011 EFFECTIVE DATE: Fall 2011 COURSE PACKAGE FORM Team Leader and Members Andra Goldberg, Matt Butcher, Steve Sorden, Dave White Date of proposal to

More information

ACADEMIC POLICY AND PLANNING COMMITTEE REQUEST FOR AHC GENERAL EDUCATION CONSIDERATION

ACADEMIC POLICY AND PLANNING COMMITTEE REQUEST FOR AHC GENERAL EDUCATION CONSIDERATION ACADEMIC POLICY AND PLANNING COMMITTEE REQUEST FOR AHC GENERAL EDUCATION CONSIDERATION Allan Hancock College General Education Philosophy General education is a pattern of courses designed to develop in

More information

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

More information

Audit Analytics. --An innovative course at Rutgers. Qi Liu. Roman Chinchila

Audit Analytics. --An innovative course at Rutgers. Qi Liu. Roman Chinchila Audit Analytics --An innovative course at Rutgers Qi Liu Roman Chinchila A new certificate in Analytic Auditing Tentative courses: Audit Analytics Special Topics in Audit Analytics Forensic Accounting

More information

An Introduction to Statistical Machine Learning - Overview -

An Introduction to Statistical Machine Learning - Overview - An Introduction to Statistical Machine Learning - Overview - Samy Bengio bengio@idiap.ch Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP) CP 592, rue du Simplon 4 1920 Martigny, Switzerland

More information

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre-

More information

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Models vs. Patterns Models A model is a high level, global description of a

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning CS 590 and STAT 598A, Spring 2010 Instructor: S.V. N. Vishwanathan (email: vishy) http://www.stat.purdue.edu/~vishy/introml/introml.html January 12, 2010 S.V. N. Vishwanathan

More information

DATA MINING FOR BUSINESS ANALYTICS

DATA MINING FOR BUSINESS ANALYTICS DATA MINING FOR BUSINESS ANALYTICS B20.3336.31: Spring 2012 *DRAFT* SYLLABUS Professor Foster Provost, Information, Operations & Management Sciences Department Office; Hours KMC 8-86; TBA, and by appt.

More information

CATALOG CHANGES - F13. The Department of Ocean and Mechanical Engineering offers programs of study leading to the following degrees:

CATALOG CHANGES - F13. The Department of Ocean and Mechanical Engineering offers programs of study leading to the following degrees: CATALOG CHANGES - F13 Ocean and Mechanical Engineering Department: The following changes are necessary to update the catalog. The first change is to include the combined BSME to MS degree program in the

More information

270107 - MD - Data Mining

270107 - MD - Data Mining Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 015 70 - FIB - Barcelona School of Informatics 715 - EIO - Department of Statistics and Operations Research 73 - CS - Department of

More information

Industrial and Systems Engineering Master of Science Program Logistics and Supply Chain Management

Industrial and Systems Engineering Master of Science Program Logistics and Supply Chain Management Industrial and Systems Engineering Master of Science Program Logistics and Supply Chain Management Department of Integrated Systems Engineering The Ohio State University Logistics is the science of design,

More information

Machine learning for algo trading

Machine learning for algo trading Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with

More information

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table

More information

Middle School Course Catalog

Middle School Course Catalog Middle School Course Catalog 2015-2016 1 P a g e Mater Academy of Nevada School Mission Statement The mission of Mater Academy of Nevada is to provide an innovative, challenging, multi-cultural education,

More information

Big Data Analytics and Optimization

Big Data Analytics and Optimization Big Data Analytics and Optimization C e r t i f i c a t e P r o g r a m i n E n g i n e e r i n g E x c e l l e n c e C e r t i f i c a t e P r o g r a m s i n A c c e l e r a t e d E n g i n e e r i n

More information

Statistics W4240: Data Mining Columbia University Spring, 2014

Statistics W4240: Data Mining Columbia University Spring, 2014 Statistics W4240: Data Mining Columbia University Spring, 2014 Version: January 30, 2014. The syllabus is subject to change, so look for the version with the most recent date. Course Description Massive

More information

Leveraging Ensemble Models in SAS Enterprise Miner

Leveraging Ensemble Models in SAS Enterprise Miner ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to

More information

Machine Learning for Data Science (CS4786) Lecture 1

Machine Learning for Data Science (CS4786) Lecture 1 Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:

More information

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING

More information

Meta-learning. Synonyms. Definition. Characteristics

Meta-learning. Synonyms. Definition. Characteristics Meta-learning Włodzisław Duch, Department of Informatics, Nicolaus Copernicus University, Poland, School of Computer Engineering, Nanyang Technological University, Singapore wduch@is.umk.pl (or search

More information

lop Building Machine Learning Systems with Python en source

lop Building Machine Learning Systems with Python en source Building Machine Learning Systems with Python Master the art of machine learning with Python and build effective machine learning systems with this intensive handson guide Willi Richert Luis Pedro Coelho

More information

SAS JOINT DATA MINING CERTIFICATION AT BRYANT UNIVERSITY

SAS JOINT DATA MINING CERTIFICATION AT BRYANT UNIVERSITY SAS JOINT DATA MINING CERTIFICATION AT BRYANT UNIVERSITY Billie Anderson Bryant University, 1150 Douglas Pike, Smithfield, RI 02917 Phone: (401) 232-6089, e-mail: banderson@bryant.edu Phyllis Schumacher

More information

Predictive Modeling Techniques in Insurance

Predictive Modeling Techniques in Insurance Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics

More information

Data Mining and Statistics for Decision Making. Wiley Series in Computational Statistics

Data Mining and Statistics for Decision Making. Wiley Series in Computational Statistics Brochure More information from http://www.researchandmarkets.com/reports/2171080/ Data Mining and Statistics for Decision Making. Wiley Series in Computational Statistics Description: Data Mining and Statistics

More information

CSCI-599 DATA MINING AND STATISTICAL INFERENCE

CSCI-599 DATA MINING AND STATISTICAL INFERENCE CSCI-599 DATA MINING AND STATISTICAL INFERENCE Course Information Course ID and title: CSCI-599 Data Mining and Statistical Inference Semester and day/time/location: Spring 2013/ Mon/Wed 3:30-4:50pm Instructor:

More information

TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP

TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

More information

Graduation Requirements

Graduation Requirements Graduation Requirements PROGRAMS OF INSTRUCTION The Lone Star College System offers courses and programs to suit the needs of individual students. In keeping with the mission of a community college, the

More information

MSCA 31000 Introduction to Statistical Concepts

MSCA 31000 Introduction to Statistical Concepts MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced

More information

Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition

Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition Brochure More information from http://www.researchandmarkets.com/reports/2171322/ Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition Description: This book reviews state-of-the-art methodologies

More information

AMIS 7640 Data Mining for Business Intelligence

AMIS 7640 Data Mining for Business Intelligence The Ohio State University The Max M. Fisher College of Business Department of Accounting and Management Information Systems AMIS 7640 Data Mining for Business Intelligence Autumn Semester 2013, Session

More information

Course Descriptions: Undergraduate/Graduate Certificate Program in Data Visualization and Analysis

Course Descriptions: Undergraduate/Graduate Certificate Program in Data Visualization and Analysis 9/3/2013 Course Descriptions: Undergraduate/Graduate Certificate Program in Data Visualization and Analysis Seton Hall University, South Orange, New Jersey http://www.shu.edu/go/dava Visualization and

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

The Partnership for the Assessment of College and Careers (PARCC) Acceptance Policy Adopted by the Illinois Council of Community College Presidents

The Partnership for the Assessment of College and Careers (PARCC) Acceptance Policy Adopted by the Illinois Council of Community College Presidents The Partnership for the Assessment of College and Careers (PARCC) Acceptance Policy Adopted by the Illinois Council of Community College Presidents This policy was developed with the support and endorsement

More information

Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

More information

New Course Proposal: ITEC-621 Predictive Analytics. Prerequisites: ITEC-610 Applied Managerial Statistics

New Course Proposal: ITEC-621 Predictive Analytics. Prerequisites: ITEC-610 Applied Managerial Statistics New Course Proposal: ITEC-621 Predictive Analytics Academic Unit: KSB Teaching Unit: Information Technology Course Title: Predictive Analytics Course Number : ITEC-621 Credit Hours: 3 Proposed effective

More information

Instructional Delivery Model Courses in the Ph.D. program are offered online.

Instructional Delivery Model Courses in the Ph.D. program are offered online. Doctor of Philosophy in Education Doctor of Philosophy Mission Statement The Doctor of Philosophy (Ph.D.) is designed to support the mission of the Fischler School of Education. The program prepares individuals

More information

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

More information

Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA

Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA Welcome Xindong Wu Data Mining: Updates in Technologies Dept of Math and Computer Science Colorado School of Mines Golden, Colorado 80401, USA Email: xwu@ mines.edu Home Page: http://kais.mines.edu/~xwu/

More information