Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions

SMA 5303: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ())
Spring Term (Feb May 200)
Faculty: Professor Roy Welsch
Accurate as of Jan 27, 200

Wed 0 Feb 7:00-8:0 PM Mon 08 Feb 7:00-8:0 PM Wed 0 Feb 7:00-8:0 PM Thurs 04 Feb Tues 09 Feb Thurs Feb
SMA 5303 L1 Sampling and statistical distributions
Homework # handed out
SMA 5303 L2 Estimation, confidence intervals, and the bootstrap
SMA 5303 L Hypothesis testing, likelihood ratios, goodness-of-fit tests, approximate methods
NTU References R 6, 7.-7., R 8., , 8.7, 8.9, R , 8.2,

Tues 9 Feb Thurs Feb 0:0-:0 AM
* Rec.: Computing: graphics and the bootstrap R 9.8, 0.2., 0.,

NO CLASS - Singapore Holiday Chinese New Year from 4 Feb to 6 Feb 200
NO CLASS - holiday Presidents Day on 5 Feb 200

Tues 6 Feb 7:00-8:0 PM Wed 7 Feb
SMA 5303 L4 Bayesian inference, Molecular biology fundamentals
Roy Welsch (Monday schedule of classes to be held) Jagath Rajapakse
R.5.2 (94,95), 8.6, BB 2.-2., CB.0-.7

Wed 7 Feb 7:00-8:0 PM Thur 8 Feb 7 Thurs 8 Feb Thurs 8 Feb 0:0-:0 AM
SMA 5303 L5 Die models of sequences, Markov models, pairwise sequence alignment
Tentative homework # due dates
Homework # 2 handed out
Rec.: Testing and Bayesian Inference
BB.; EG , ,

*: 1. Live video-casting from ; 2. Taped lecture from ; 3. Live video-casting from S'pore; 4. Taped lecture from S'pore; 5. Classroom lecture in S'pore; 6. faculty teaches in S'pore; 7. Recitation by faculty to students at and by NTU faculty to students at Singapore; 8. Other-Please specify

Mon 22 Feb 7:00-8:0 PM Wed 24 Feb 7:00-8:0 PM Tues 2 Feb Thurs 25 Feb 7 Tues 2 Feb Thurs 25 Feb 0:0-:0 AM
Mon 0 Mar 7:00-8:0 PM Wed 0 Mar 7:00-8:0 PM Tues 02 Mar Thurs 04 Mar 7 Tues 02 Mar Thurs 04 Mar 0:0-:0 AM
Mon 08 Mar 7:00-8:0 PM Wed 0 Mar 7:00-8:0 PM Tues 09 Mar Thurs Mar

SMA 5303 L6 Substitution matrices, multiple sequence alignment, Markov chain Monte Carlo, simulated annealing, Gibbs sampling, BLAST
SMA 5303 L7 Hidden Markov models: gene structure prediction, profile HMM, Expectation-Maximization algorithm
NTU References EG , ,.-.7 Brooks paper BB EG
Rec.: Die models; Markov modeling BB. EG ,.-.4

SMA 5303 L8 Linear regression and smoothing
SMA 5303 L9 Regression diagnostics, collinearity, and robust regression
Tentative homework # 2 due dates
Homework # handed out
Rec.: Computing: regression and Gene structure prediction with HMM

SMA 5303 L0 Comparing two samples; non-parametric methods and experimental design
SMA 5303 L Analysis of categorical data
R4.4.2, , 4.7 Notes, R , 4.8 R.-.5 R.,

Tues 09 Mar Thurs Mar 0:0-:0 AM
Rec.: Computing: diagnostics and two-sample

Daylight Savings Time Start
Note Time Change at NTU and (starts from 4 March 09-2 hours difference)

Mon 5 Mar Tues 6 Mar 7 Tues 6 Mar 0:0-:0 AM 8 (no beaming) Wed 7 Mar Thurs 8 Mar
SMA 5303 L2 Analysis of variance
Tentative homework # due dates
Rec.: Categorical Data and ANOVA
Midterm Examination (in-class)
NTU References R

Spring Vacation March, (Mon Sun)

Mon 29 Mar Wed Mar Tues 0 Mar Thurs 0 Apr 7 Tues 0 Mar Thurs 0 Apr 0:0-:0 AM Mon 05 Apr Tues 06 Apr
SMA 5303 L4 Learning from data
Homework # 4 handed out
SMA 5303 L5 Model Assessment
Rec.: Insightful Miner Basics
SMA 5303 L6 Regression Selection: Ridge, PCR, PLS, LAR
H, 2 H7.-7.7, H.-.6,.9

Wed 07 Apr Thurs 08 Apr
SMA 5303 L7 Discriminant Analysis; Logistic Regression
Tentative homework # 4 due dates
Homework # 5 handed out
H

7 Tues 06 Apr Thurs 08 Apr 0:0-:0 AM Mon 2 Apr Wed 4 Apr Tues Apr Thurs 5 Apr 7 Tues Apr Thurs 5 Apr 0:0-:0 AM
Rec.: Regression Selection
SMA 5303 L8 Generalized Additive Models and Trees: CART
SMA 5303 L9 Support Vector Machines; Support Vector Regression; Prediction of protein features: secondary structures, solvent accessibility, and accessibility area
Rec.: Classification, Logistic Reg., and SVM
NTU References H H4.5, 2.-2., (omit 2.., 2..5) Nguyen & Rajapakse 2005, 2006, 2007

No Class - holiday Patriots Day, 9-20 Apr (Mon, Tues)

Wed 2 Apr Thurs 22 Apr
SMA 5303 L20 Neural Networks, prediction of signal sites in genomic sequences
H.,.-.0 Rajapakse & Ho 2005

Mon 26 Apr Tues 27 Apr
SMA 5303 L2 Cluster analysis, k-means, hierarchical clustering, clustering of gene expressions, biclustering
H.-.2, 4. (omit 4..9)

Wed 28 Apr Thurs 29 Apr
Tentative homework # 5 due dates
Homework # 6 handed out
SMA 5303 L22 Bagging and Boosting, AdaBoost, Random Forests
H , ,

7 Tue 27 April Thurs 29 Apr 0:0 -:0 AM 7 Mon 0 May Tues 04 May 9:00 :00 AM
Rec.: Neural Nets, CART, Bagging and Boosting
SMA 5303 L2 Project consultation

7 Wed 05 May Thurs 06 May 9:00 :00 AM Tues 04 May Thurs 06 May 7 :00 AM -2:00 PM 7 Mon 0 May Tues May 9:00-:00 AM 7 Tues May Tues May :00 AM -2:00 PM 7 Wed 2 May Thurs May 9:00 :00 AM
SMA 5303 L24 Project consultation
Tentative homework # 6 due dates
Rec.: Clustering and Neural Nets
SMA 5303 L25 Project presentations
Rec.: Project help
SMA 5303 L26 Project presentations
Project report due

Texts:
1. John A. Rice, Mathematical Statistics and Data Analysis (Third Edition, 2007) [R]
An alternative to Rice, especially if you are interested in bioinformatics, might be:
2. Warren J. Ewens, Gregory R. Grant, Statistical Methods in Bioinformatics: An Introduction, Second Edition [EG]
3. Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction [H]
On reserve or portions handed out:

4. P. Baldi and S. Brunak, Bioinformatics: The Machine Learning Approach, Second Edition [BB]
5. P. Clote and R. Backofen, Computational Molecular Biology: An Introduction [CB]
6. R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids [DEKM]

*Note that lab sessions are held at SCE Multi-Media Lab at N4-0A-02


Microsoft Azure Machine learning Algorithms

Microsoft Azure Machine learning Algorithms Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql Tomaz.kastrun@gmail.com http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation

More information

Master of Science in Health Information Technology Degree Curriculum

Master of Science in Health Information Technology Degree Curriculum Master of Science in Health Information Technology Degree Curriculum Core courses: 8 courses Total Credit from Core Courses = 24 Core Courses Course Name HRS Pre-Req Choose MIS 525 or CIS 564: 1 MIS 525

More information

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19 PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations

More information

Statistics with Aviation Applications Math 211 Mode of Delivery Lecture Blended Course Syllabus

Statistics with Aviation Applications Math 211 Mode of Delivery Lecture Blended Course Syllabus Statistics with Aviation Applications Math 211 Mode of Delivery Lecture Blended Course Syllabus Credit Hours: 3 Credits Academic Term: Term 4: 23 March 2015 24 May 2015 Meetings: Thurs 18:00-22:00 26 Mar;

More information

Data Mining Practical Machine Learning Tools and Techniques

Data Mining Practical Machine Learning Tools and Techniques Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

More information

Multivariate Statistical Inference and Applications

Multivariate Statistical Inference and Applications Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim

More information

CSC 314: Operating Systems Spring 2005

CSC 314: Operating Systems Spring 2005 CSC 314: Operating Systems Spring 2005 Instructor: Lori Carter lcarter@ptloma.edu (619) 849-2352 Office hours: MWF TTh 11:00 a.m. 12:00 p.m. 1:15 2:15 p.m 10:00-11:30 a.m. Texts: Silbershatz et.al, Operating

More information

Data Mining and Machine Learning in Bioinformatics

Data Mining and Machine Learning in Bioinformatics Data Mining and Machine Learning in Bioinformatics PRINCIPAL METHODS AND SUCCESSFUL APPLICATIONS Ruben Armañanzas http://mason.gmu.edu/~rarmanan Adapted from Iñaki Inza slides http://www.sc.ehu.es/isg

More information

Data Mining. Dr. Saed Sayad. University of Toronto 2010 saed.sayad@utoronto.ca. http://chem-eng.utoronto.ca/~datamining/

Data Mining. Dr. Saed Sayad. University of Toronto 2010 saed.sayad@utoronto.ca. http://chem-eng.utoronto.ca/~datamining/ Data Mining Dr. Saed Sayad University of Toronto 2010 saed.sayad@utoronto.ca http://chem-eng.utoronto.ca/~datamining/ 1 Data Mining Data mining is about explaining the past and predicting the future by

More information

LAUREA MAGISTRALE - CURRICULUM IN INTERNATIONAL MANAGEMENT, LEGISLATION AND SOCIETY. 1st TERM (14 SEPT - 27 NOV)

LAUREA MAGISTRALE - CURRICULUM IN INTERNATIONAL MANAGEMENT, LEGISLATION AND SOCIETY. 1st TERM (14 SEPT - 27 NOV) LAUREA MAGISTRALE - CURRICULUM IN INTERNATIONAL MANAGEMENT, LEGISLATION AND SOCIETY 1st TERM (14 SEPT - 27 NOV) Week 1 9.30-10.30 10.30-11.30 11.30-12.30 12.30-13.30 13.30-14.30 14.30-15.30 15.30-16.30

More information

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05 Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

More information

Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

More information

CS570 Data Mining Classification: Ensemble Methods

CS570 Data Mining Classification: Ensemble Methods CS570 Data Mining Classification: Ensemble Methods Cengiz Günay Dept. Math & CS, Emory University Fall 2013 Some slides courtesy of Han-Kamber-Pei, Tan et al., and Li Xiong Günay (Emory) Classification:

More information

200630 - FBIO - Fundations of Bioinformatics

200630 - FBIO - Fundations of Bioinformatics Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 200 - FME - School of Mathematics and Statistics 1004 - UB - (ENG)Universitat de Barcelona MASTER'S DEGREE IN STATISTICS AND

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

Big Data Analytics and Optimization

Big Data Analytics and Optimization Big Data Analytics and Optimization C e r t i f i c a t e P r o g r a m i n E n g i n e e r i n g E x c e l l e n c e C e r t i f i c a t e P r o g r a m s i n A c c e l e r a t e d E n g i n e e r i n

More information

CLASS SESSIONS Wednesdays, 8:30 AM -11:20 AM, HSL LL204

CLASS SESSIONS Wednesdays, 8:30 AM -11:20 AM, HSL LL204 CLASS SESSIONS Wednesdays, 8:30 AM -11:20 AM, HSL LL204 Analytic Methods for Health Services Management P8529 INSTRUCTOR Nan Liu, Ph.D. 600 West 168 th Street, Room 603 nl2320@columbia.edu Office hour:

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209

QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209 QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209 Rajesh Srivastava, Ph.D. Professor and Chair, Department of Information Systems and Operations Management Lutgert

More information

List of Ph.D. Courses

List of Ph.D. Courses Research Methods Courses (5 courses/15 hours) List of Ph.D. Courses The research methods set consists of five courses (15 hours) that discuss the process of research and key methodological issues encountered

More information

Introduction to Big Data with Apache Spark UC BERKELEY

Introduction to Big Data with Apache Spark UC BERKELEY Introduction to Big Data with Apache Spark UC BERKELEY This Lecture Exploratory Data Analysis Some Important Distributions Spark mllib Machine Learning Library Descriptive vs. Inferential Statistics Descriptive:»

More information

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics

More information

CSE 6040 Computing for Data Analytics: Methods and Tools. Lecture 1 Course Overview

CSE 6040 Computing for Data Analytics: Methods and Tools. Lecture 1 Course Overview CSE 6040 Computing for Data Analytics: Methods and Tools Lecture 1 Course Overview DA KUANG, POLO CHAU GEORGIA TECH FALL 2014 Fall 2014 CSE 6040 COMPUTING FOR DATA ANALYSIS 1 Course Staff Instructor Da

More information

MATHEMATICAL TOOLS FOR ECONOMICS ECON 1078-001 SPRING 2012

MATHEMATICAL TOOLS FOR ECONOMICS ECON 1078-001 SPRING 2012 MATHEMATICAL TOOLS FOR ECONOMICS ECON 1078-001 SPRING 2012 Instructor: Hakon Skjenstad Class Time: M, W, F, 12:00-12:50pm Classroom: DUAN G125 Email: hakon.skjenstad@colorado.edu Course Website: CULearn

More information

Leveraging Ensemble Models in SAS Enterprise Miner

Leveraging Ensemble Models in SAS Enterprise Miner ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to

More information

School of Mathematics and Science MATH 153 Introduction to Statistical Methods Section: WE1 & WE2

School of Mathematics and Science MATH 153 Introduction to Statistical Methods Section: WE1 & WE2 CCBC Essex School of Mathematics and Science MATH 153 Introduction to Statistical Methods Section: WE1 & WE2 CLASSROOM LOCATION: SEMESTER: Fall 2009 INSTRUCTOR: DONNA TUPPER OFFICE LOCATION: F-413 (or

More information

Audit Analytics. --An innovative course at Rutgers. Qi Liu. Roman Chinchila

Audit Analytics. --An innovative course at Rutgers. Qi Liu. Roman Chinchila Audit Analytics --An innovative course at Rutgers Qi Liu Roman Chinchila A new certificate in Analytic Auditing Tentative courses: Audit Analytics Special Topics in Audit Analytics Forensic Accounting

More information

Predicting Student Persistence Using Data Mining and Statistical Analysis Methods

Predicting Student Persistence Using Data Mining and Statistical Analysis Methods Predicting Student Persistence Using Data Mining and Statistical Analysis Methods Koji Fujiwara Office of Institutional Research and Effectiveness Bemidji State University & Northwest Technical College

More information

Teaching Biostatistics to Postgraduate Students in Public Health

Teaching Biostatistics to Postgraduate Students in Public Health Teaching Biostatistics to Postgraduate Students in Public Health Peter A Lachenbruch - h s hgeles, California, USA 1. Introduction This paper describes how biostatistics is taught in US Schools of Public

More information

Unsupervised and supervised dimension reduction: Algorithms and connections

Unsupervised and supervised dimension reduction: Algorithms and connections Unsupervised and supervised dimension reduction: Algorithms and connections Jieping Ye Department of Computer Science and Engineering Evolutionary Functional Genomics Center The Biodesign Institute Arizona

More information