COURSE PLAN
BDA: Biomedical Data Analysis
Master in Bioinformatics for Health Sciences
2015-2016 Academic Year

Qualification: Master's Degree

1. Description of the subject

Subject name: Biomedical Data Analysis
Code: BDA
Total credits: 5 ECTS
Workload: 125 hrs
Year: 1st Year MSc
Term: 1st term
Type of subject: Optional
Centre: Department of Experimental and Health Sciences /
Teaching language(s): English

Teaching team/teaching staff:
Other references
Subject Coordinator: Arcadi Navarro
Teaching staff: Hafid Laayouni, Ferran Sanz, Laura Ines Furlong, Juan Antonio Rodriguez.

Groups: 1 single group
Timetables: Taught at various times, usually in the mornings
Building: Dr. Aiguader 80
Classrooms: TBD
Comments: Please bring your own laptop for the hands-on sessions

2. Teaching guide (Campus Global)

Introduction

This course focuses on how to use standard statistical methods to analyze biomedical data. After a general introduction to probability theory and statistical inference, emphasis is placed on the most common methods used to analyze multivariate data. Particular cases will be used as illustrative examples. The course focuses on frequentist statistics, but basic overviews of Bayesian and maximum likelihood methods are also provided.

The course comprises 5 ECTS credits, implying 20/25 hours of plenary lectures, 10 hours of exercises and hands-on computer classes, 13/18 hours of reading and personal study, and 2 hours of tests.

The subject is based on the understanding of key methodological concepts and tools and on the application of common software used in labs around the world. Because the subject is fully incremental, students are advised to interact closely with the lecturers and to keep up to date with the class material.

The subject focuses on the practical implementation of different types of tools for statistical inference. The methods covered therefore rely on a good understanding of the basic principles of probability and programming. The course includes lectures and hands-on exercises on the use of publicly available software packages.

The course is evaluated by means of an individual exam consisting of short questions and answers, some problems and some text questions. In addition, a percentage of the final grade is based on continuous evaluation of homework assignments and hands-on practical assignments.

- Prerequisites in order to follow the itinerary: Basic programming skills and notions of probability are required.

Associated competences

General competences

Instrumental:
1. Proficient reading/writing/listening in scientific English related to the subject.
2. Knowledge of office software to produce quality scientific presentations and reports related to the subject.

Interpersonal:
1. Group work.
2. Ability to solve a given problem by yourself.

Systemic:
1. Analysis and synthesis abilities.
2. Ability to search for information.

Specific competences

1. To understand the concept of probability.
2. To understand Bayes' theorem.
3. To distinguish statistical description from inference.

4. To understand the concept of random variable.
5. To become familiar with central tendency and dispersion measures.
6. To understand the concept of probability distribution.
7. To become familiar with the most common kinds of distributions.
8. To master the graphical representation of distributions and summary statistics.
9. To understand the concepts of confidence intervals and standard error.
10. To understand the concept of hypothesis testing.
11. To understand the concept of Type I and II errors.
12. To master the concept of ANOVA and its different designs.
13. To master the concept of contingency table and the relevant testing procedures.
14. To master the concept and procedures for Regression and Correlation Analysis.
15. To understand the concept of non-parametric tests.
16. To understand multivariate statistics and representation procedures.
17. To understand resampling methods.
18. To master the concept and procedures for Principal Components Analysis.
19. To understand the concept and procedures for Correspondence Analysis.
20. To understand the concepts of multiple regression and correlation.
21. To understand the concept and procedures for other non-linear regression methods.
22. To understand the concept of Generalized Linear Models.
23. To understand the concept of Bayesian Statistics.
24. To understand the concept of Maximum Likelihood.
25. To become familiar with the multiple testing problem and its solutions.
26. To master the R software package as a tool for the implementation of the procedures under study.

- Learning aims: To understand and apply algorithms and methods currently used in Biomedicine to perform statistical inference upon data.

Contents

Contents 1: Block 1: Introduction to Statistics and Probability Theory.
Topics: Description and inference. Probability. Independence. Conditional probability. Bayes' theorem. Uni- and bivariate statistics. Central tendency and dispersion measures. Experimental errors. Random variables. Expected values. Estimators and estimation. Confidence intervals. Common distributions. Central limit theorem. Descriptive statistics. Central tendency and spread measures. Graphical representations.
Aims: To be able to perform basic probability calculations. To be able to use the basic concepts of descriptive statistics to perform calculations with them. To be able to compute confidence intervals. To be able to represent data in a number of ways. To be able to estimate different moments of common distributions.

Contents 2: Block 2: Introduction to Inferential Statistics.
Topics: Hypothesis testing. Type I and II errors. Student's t-test. Two-tailed vs. one-tailed tests. Contingency tables. Overview of common statistical tests. One-factor ANOVA. SPSS and R.
Aims: To be familiar with hypothesis testing. To be able to perform basic two-group tests (Student's t). To be able to analyse basic contingency tables. To be able to perform ANOVAs and interpret their results. To become familiar with the SPSS and R packages.

Contents 3: Block 3: Comparing groups. ANOVA, Linear models.
Topics: Advanced ANOVA. ANOVA assumptions. Non-parametric alternatives to ANOVA. Assessing the required sample size. Introduction to Correlation and Regression. Multiple Linear Regression. Correlation coefficient. Partial correlation.
Aims: To be able to perform more advanced ANOVA designs and interpret their results. To be able to select alternatives to ANOVA when appropriate. To be able to perform linear regressions. To be able to perform correlations. To be able to perform multiple regression analysis. To understand the differences between simple and multiple regression. To understand the use of partial correlation.

Contents 4: Block 4: More on Linear models, Multivariate Analysis and Resampling methods.
Topics: ANCOVA: Analysis of Covariance. Logistic and other non-linear regressions. Multivariate statistics. Variance/covariance matrix. Correlation matrix. Principal Components Analysis. Generalized Linear Models. Resampling methods in statistics. Power analysis and assessing the required sample size.
Aims: To be able to perform non-linear regressions. To be able to understand the multivariate nature of biomedical data. To be able to perform PCAs. To be able to understand the linear nature of most tests. To be able to use a variety of resampling methods. To understand the concept and use of power analysis. To be able to assess the required sample size for basic statistical analysis.

Contents 5: Block 5: Overview of Advanced Statistics.
Topics: Bayesian Statistics. Maximum Likelihood. Non-parametric Statistics. Expression microarrays, whole-genome scans. The multiple testing problem. Text mining.
Aims: To understand the basis of Bayesian statistics. To use basic Bayesian analysis tools. To perform basic maximum likelihood calculations. To understand when and how to use non-parametric statistics, with examples of widely used non-parametric tests. To understand the statistical challenges posed by current biomedical data. To apply various algorithms for multiple testing correction. Recent examples of text mining methods.
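As an illustration of the multiple testing correction listed in Block 5, the following is a minimal sketch, not course material. The course works in R; Python is used here only as an assumption for the example. The function names `bonferroni` and `benjamini_hochberg` and the p-values are hypothetical, and the course plan does not state which correction procedures are actually taught. The sketch contrasts the Bonferroni correction (which controls the family-wise error rate) with the Benjamini-Hochberg step-up procedure (which controls the false discovery rate).

```python
def bonferroni(p_values, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m (family-wise error control)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (false discovery rate control).

    Returns a list of booleans in the original order of p_values:
    True where the corresponding null hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values from ten independent tests.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("Bonferroni rejections:        ", sum(bonferroni(p)))
    print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(p)))
```

With these hypothetical values Bonferroni rejects one hypothesis and Benjamini-Hochberg rejects two, which shows why FDR control is usually preferred when many tests are run at once, as in microarray or whole-genome analyses. In R, the built-in `p.adjust` function offers the same two corrections through `method = "bonferroni"` and `method = "BH"`.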

Assessment

General assessment criteria: The evaluation consists of a final exam at the end of the course (~50%), the evaluation of the exercises performed during the course (~30%), and the hands-on exercises performed at the end of the course (~20%).

Grading system: Grades are between 0 and 10, and an overall grade of 5 is needed to pass.

Competence evaluation

Instrumental
Competences:
1. Proficient reading/writing/listening in scientific English related to the subject.
2. Knowledge of office software to produce quality scientific presentations and reports.
Attainment indicators: Correct understanding of the proposed exercises and correct final presentation. High-quality presentation of the results of exercises.
Assessment procedure: Implicit in the exam and exercises.
Scheduling: End of term. Progressive.

Interpersonal and Systemic
Interpersonal competences:
1. Group work.
2. Ability to solve a given problem by yourself.
Systemic competences:
1. Analysis and synthesis abilities.
2. Ability to search for information.
Attainment indicators: Ability to do team work both in programming and in preparing the final presentation. Correct answers to a set of pencil-and-paper exercises and to the final examination. Development of algorithms for proposed problems. Complete final presentation.
Assessment procedure: Implicit in group-based exercises and the final exam.
Scheduling: End of term. Progressive.

Specific
Competences:
1. To understand the concept of probability.
2. To understand Bayes' theorem.
3. To distinguish statistical description from inference.
4. To understand the concept of random variable.
5. To become familiar with central tendency and dispersion measures.
6. To understand the concept of probability distribution.
7. To become familiar with the most common kinds of distributions.
8. To master the graphical representation of distributions and summary statistics.
9. To understand the concepts of confidence intervals and standard error.
10. To understand the concept of hypothesis testing.
11. To understand the concept of Type I and II errors.
12. To master the concept of ANOVA and its different designs.

13. To master the concept of contingency table and the relevant testing procedures.
14. To master the concept and procedures for Regression and Correlation Analysis.
15. To understand the concept of non-parametric tests.
16. To understand multivariate statistics and representation procedures.
17. To understand resampling methods.
18. To master the concept and procedures for Principal Components Analysis.
19. To understand the concept and procedures for Correspondence Analysis.
20. To understand the concepts of multiple regression and correlation.
21. To understand the concept and procedures for other non-linear regression methods.
22. To understand the concept of Generalized Linear Models.
23. To understand the concept of Bayesian Statistics.
24. To understand the concept of Maximum Likelihood.
25. To become familiar with the multiple testing problem and its solutions.
26. To master the R software package as a tool for the implementation of the procedures under study.

All specific competences are assessed by an exam at the end of the term, except competence 26, which is assessed by a specific hands-on exam.

Bibliography

Armitage P, Berry G. Statistical Methods in Medical Research. 3rd edition. Cambridge (UK): Blackwell Science, 1994.
Bernstein IH, Garbin CP, Teng GK. Applied Multivariate Analysis. New York: Springer-Verlag, 1988.
Conover WJ. Practical Nonparametric Statistics. Wiley, 1998.
Cox DR, Oakes D. Analysis of Survival Data. London: Chapman & Hall, 1984.
Dalgaard P. Introductory Statistics with R. Springer, 2004.
Edwards AL. An Introduction to Linear Regression and Correlation. 2nd edition. W. H. Freeman & Co, 1984.
Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.
Everitt B. Cluster Analysis. London: Heinemann Educational Books, 1974.
Ewens WJ, Grant G. Statistical Methods in Bioinformatics: An Introduction (Statistics for Biology and Health). Springer, 2006.
Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. Chapman & Hall/CRC, 2003.
Gentleman R. Bioinformatics and Computational Biology Solutions Using R and Bioconductor (Statistics for Biology and Health). Springer, 2005.
Harris RJ. A Primer of Multivariate Statistics. New York: Academic Press, 1975.
Hollander M, Wolfe DA. Nonparametric Statistical Methods. Wiley-Interscience, 1999.
Hosmer DW, Lemeshow S. Applied Logistic Regression. New York: John Wiley & Sons, 1989.
Kleinbaum DG, Kupper LL. Logistic Regression: A Self-Learning Text. New York: Springer-Verlag, 1998.
Kleinbaum DG, Kupper LL, Muller KE, Nizam A. Applied Regression Analysis and Other Multivariable Methods. Pacific Grove (USA): Duxbury Press, 1998.
Lee PM. Bayesian Statistics: An Introduction. Hodder Arnold, 2004.
Moore DS, Notz WI. Statistics: Concepts and Controversies. W. H. Freeman, 2005.
Sokal RR, Rohlf FJ. Biometry. 3rd edition. W. H. Freeman, 1994.