BIOS 6660: Analysis of Biomedical Big Data Using R and Bioconductor, Fall 2015 Computer Lab: Education 2 North Room 2201DE (TTh 10:30 to 11:50 am)




Course Instructor: Dr. Tzu L. Phang, Assistant Professor of Bioinformatics
Division of Pulmonary Sciences and Critical Care Medicine, University of Colorado Denver
Office: Room 9003, Research Center 2, AMC
tzu.phang@ucdenver.edu
(303) 724-6057

Course Description: BIOS 6660 (3 semester hours) - This course provides students with hands-on experience in solving real-life biological problems using the statistical software R and its packages from the Bioconductor consortium. Students will have an opportunity to work with participating researchers and clinicians to find practical solutions for case studies from both statistical and biological perspectives. Students will also learn to communicate with the scientists and interpret the results in the biological context.

Pre/corequisites: BIOS 6612 or BIOS 6602, or equivalent graduate-level statistics course with consent of instructor

Course Objectives: After completion of the course, students will be able to perform a complete data analysis project from start to finish, including communicating and understanding the important questions to be answered, exploring the nature of the dataset, generating hypotheses, interpreting the biological meaning of the results, and working with participating scientists to fulfill the research requirement. More importantly, students will learn to work with real researchers to perform biological discovery. These objectives will be accomplished using the free open-source statistical software R and Bioconductor.

Case Study class format: We will use 4 class periods to conduct each case study. There will be 6 case studies in this course. Students will be divided into groups and work together as a team.

General Case Study Class Period Format:
Day 1: The participating scientist will present the biological importance and question of their high-throughput dataset, to convey understanding of the data structure and analysis goals. Students will be briefed on potential R and Bioconductor analysis solution(s).
Day 2: Students will participate in exploring various analysis solutions for the case study problem(s) and decide on the analysis pipeline.
Day 3: After trying out potential analysis pipelines, students will bring forward various issues and together we will try to troubleshoot the problems.
Day 4: Students will interpret and present the solution to the participating scientists and be graded on both the presentation's merit and their written report, to be submitted the following class session.

Objectives: At the completion of BIOS 6660 students will be able to:
1. Practice a Reproducible Research pipeline using open-source tools.
2. Use the statistical software R to perform basic programming, data manipulation, and statistical tests.
3. Understand the basic organization of the Bioconductor consortium and find appropriate packages for the analysis purposes.
4. Navigate the S4 object-oriented data structure used in Bioconductor in order to perform basic modification for custom analysis.
5. Be competent in using the RStudio IDE for routine analysis and reproducible research practices.
6. Perform a complete high-throughput data analysis routine, from problem formulation, data processing, data filtering, and statistical analysis to result interpretation and presentation.
7. Apply clustering, comparative, and predictive algorithms to high-throughput data.
8. Understand the basic concepts of Next Generation Sequencing and develop analysis solutions for these large datasets.
9. Comprehend real-life biological problems and create analysis pipelines to fulfill the requirements of the project.
10. Communicate and interpret the analysis findings to biological scientists using non-statistical terms.

MPH Applied Biostatistics concentration competencies addressed by this course

Identifier    MPH Applied Biostatistics Concentration (CN) Competencies
CN-BIOS 1     Select and apply appropriate biostatistical methods to support research and evaluation in the core areas of public health research and practice, including: epidemiology, environmental and occupational health, community and behavioral health, and public health systems management, policy and outcomes research.
CN-BIOS 2     Translate a study's scientific question or aims into testable statistical hypotheses and propose and apply appropriate statistical methods to test those hypotheses.
CN-BIOS 3     Test and interpret models for continuous outcome data (normal linear model), categorical outcome data (logistic and Poisson regression), and time-to-event data (Cox regression).
CN-BIOS 4     Demonstrate knowledge of the issues of bias, error, confounding, effect modification, sampling, and generalizability and how they relate to interpretation of study results.
CN-BIOS 5     Carry out appropriate sample size and power calculations in basic situations to ensure that a study is sufficiently powered to achieve the scientific aims or address a specific research hypothesis.
CN-BIOS 6     Use computer software for data entry and database management and for summarizing, analyzing and displaying research results.
CN-BIOS 7     Demonstrate knowledge of the basic ethical issues involved in collection, management, use and dissemination of biomedical and public health data.
CN-BIOS 8     Critically review and interpret basic statistical methods presented in public health and medical literature to identify strengths, weaknesses, and potential biases in these studies.
CN-BIOS 9     Apply scientific and statistical principles and methods to design basic public health and biomedical studies.
CN-BIOS 10    Use the principles of hypothesis testing and estimation of population parameters to draw inferences from quantitative data and communicate verbally and in writing those inferences and their statistical and scientific interpretation to non-statistical scientists.
CN-BIOS 11    Address a biomedical, public health or statistical research question with a basic statistical analysis (e.g. linear or logistic regression).

MS in Biostatistics competencies addressed by this course

Identifier    MS Biostatistics Competencies

Study Development: Work collaboratively with biomedical or public health researchers and PhD biostatisticians, as necessary, to provide biostatistical expertise in the development and design of research studies.
MS-BIOS 1     Map study aims to testable statistical hypotheses.
MS-BIOS 3     Use probability and statistical theory to develop appropriate data analysis plans for study hypotheses.

Modeling and Analysis: Develop, carry out and report biostatistical modeling and analysis of biological science and public health studies.
MS-BIOS 4     Use advanced techniques for summary and visualization of complex data for exploratory analysis and presentation.
MS-BIOS 5     Use probability and statistical theory to identify appropriate modeling and analysis methods to address study hypotheses.
MS-BIOS 6     Determine and check modeling assumptions, and verify validity of proposed analyses.
MS-BIOS 7     Carry out valid and efficient modeling, estimation, and inference to address study hypotheses, using standard statistical methods including basic one and two sample methods, general linear models including regression and ANOVA, logistic regression, and clustered and longitudinal analysis.
MS-BIOS 9     Demonstrate statistical programming proficiency, good coding style and use of reproducible research principles in leading statistical software.

Biologic or Public Health Relevance: Show how biostatistical tools apply to and influence research and policy in the biomedical and public health arenas.
MS-BIOS 10    Read subject-specific biomedical or public health literature and synthesize issues that are important in the design, implementation, and analysis of research in the subject area.
MS-BIOS 11    Understand ethical aspects of public health policy and practice, ensure the quality and security of information used in a study and adhere to the principles of research ethics.
MS-BIOS 12    Develop and implement specialized study designs and analyses in biological (e.g. genetic association, genomics) or public health (e.g. epidemiological) settings.

Communication: Communicate orally and in writing biostatistical concepts and results to both biostatistical and non-biostatistical audiences.
MS-BIOS 13    Communicate orally and in writing simple and complex statistical ideas and methods to collaborators in non-technical terms, including preparation of the analysis section of grant proposals and the methods and results sections of manuscripts.

Integration of Biostatistics Course Content with other Core Areas: Every attempt will be made to include examples of biostatistics applications in the areas of epidemiology, health behavior, environmental health, and health care organization and policy throughout the course.

Textbook: Norman Matloff, The Art of R Programming. No Starch Press, 2011. ISBN: 1593273843.

Optional Textbook: Hadley Wickham. Advanced R. Chapman & Hall, 2014. ISBN: 1466586966. (http://adv-r.had.co.nz/)

Software Use: We will be using the open-source R software (www.r-project.org) and its corresponding IDE RStudio (http://www.rstudio.com). The software is platform independent and will run on most major computer operating systems. We will also be using the R packages from the Bioconductor consortium.

Course Requirements:
1. Participation and presentation (40% of course grade)
There are 6 case study presentations to be given by each team; each carries 7% of the course grade. Each student in the team must participate in the presentation in order to receive any grade. Both instructors and the participating scientist will grade the presentation based on the following criteria: teamwork, confidence, quality of information presented, level of clarity, and level of organization.
2. Homework (60% of course grade)
There are 6 case study reports to be generated by each student; each carries 10% of the course grade. Both instructors will grade the reports by the following criteria: the coherence of the report, the practice of reproducible research, and the biological insight of the analysis results.
3. Final Project
There is no final project.

Grading: The following minimal grading scale will be used:
96-100      A+
91-95       A
86-90       A-
81-85       B+
76-80       B
71-75       B-
66-70       C+
61-65       C
56-60       C-
Below 56    F

Course Policies:

Attendance Policy
Attendance is strongly encouraged and will be a significant part of the Participation contribution.

Academic Conduct Policy
All students are expected to abide by the Honor Code of the Colorado School of Public Health. Unless otherwise instructed, all of your work in this course should represent completely independent work. Students are expected to familiarize themselves with the Student Honor Code that can be found at: http://www.cudenver.edu/academics/colleges/publichealth/students/studentaffairs/studentresources/pages/index.aspx or the Student Resources Section of the CSPH website. Any student found to have committed acts of misconduct (including, but not limited to cheating, plagiarism, misconduct of research, breach of confidentiality, or illegal or unlawful acts) will be subject to the procedures outlined in the CSPH Honor Code.

Disability Policy
For students requesting accommodations, contact the Office of Disability Resources and Services. Their staff will assist in determining reasonable accommodations as well as coordinating the approved accommodations. Phone number: (303) 724-5640. Location: Building 500, Room W1103. The physical address is 13001 E. 17th Place.

BIOS 6660 Course Schedule Fall 2015
Dr. Tzu L. Phang, Associate Professor of Bioinformatics
Computer Lab: Education 2 North 2201DE (10:30 to 11:50 am)

Course Schedule

Date       Report      Topic
Sept 1                 Introduction to R, RStudio and Reproducible Research
Sept 3                 Review: R Basic
Sept 8     P28-2301    Review: R function and Control Structure
Sept 10                Review: R Graphics
Sept 15                Introduction to Genomic Object Structure
Sept 17                Dynamic Visualization using shiny
Sept 22                CS1-1: Microarray Data Analysis. Presenting Scientist: Dr. Andy Bradford
Sept 24                CS1-2: Exploratory Analysis
Sept 29                CS1-3: Troubleshooting
Oct 1                  CS1-4: Student Presentation
Oct 6      R1 Due      CS2-1: RNA-seq Data Analysis. Presenting Scientist: Dr. Eric Schmidt
Oct 8                  CS2-2: Exploratory Analysis
Oct 13                 CS2-3: Troubleshooting
Oct 15                 CS2-4: Student Presentation
Oct 20     R2 Due      CS3-1: ChIP-seq Analysis. Presenting Scientist: Dr. Anthony Gerber
Oct 22                 CS3-2: Exploratory Analysis
Oct 27                 CS3-3: Troubleshooting
Oct 29                 CS3-4: Student Presentation
Nov 3      R3 Due      CS4-1: Clinical Informatics Data Analysis. Presenting Scientist: Dr. John Welton
Nov 5                  CS4-2: Exploratory Analysis
Nov 10                 CS4-3: Troubleshooting
Nov 12                 CS4-4: Student Presentation
Nov 17     R4 Due      CS5-1: Public Health Informatics Data Analysis. Presenting Scientist: Dr. Aarti Munjal
Nov 19                 CS5-2: Exploratory Analysis
Nov 24                 Thanksgiving Break
Nov 26                 Thanksgiving Break
Dec 1                  CS5-3: Troubleshooting
Dec 3                  CS5-4: Student Presentation
Dec 8      R5 Due      CS6-1: Exome-seq Analysis. Presenting Scientist: Dr. Hung-Chun (James) Yu
Dec 10                 CS6-2: Exploratory Analysis
Dec 15                 CS6-3: Troubleshooting
Dec 17     P28-2307    CS6-4: Student Presentation

R6 due Dec 21

CS = Case Study
R = Report
Classroom Change Notification: Ed 2 North P28-2301/07***