Course plan
2015-2016 Academic Year

Qualification: MSc on Bioinformatics for Health Sciences

1. Description of the subject

Subject name: Information Extraction from Omics technologies
Code: 31032
Total credits: 5
Workload: 125 hours
Year: 1st
Term: 3rd
Type of subject: Theoretical and practical
Centre: Faculty of Health and Life Sciences
Teaching language(s): English

Teaching team/teaching staff:
Subject Coordinator: Robert Castelo
Teaching staff: Robert Castelo
Information Extraction from Omics technologies (IEO)

2. Teaching guide (Campus Global)

Introduction

This course provides an introduction to the computational analysis of data obtained from high-throughput experimental technologies in molecular biology, such as microarrays and next-generation sequencing, using the software platform R/Bioconductor.

Associated competences

General competences
1. Learning technical aspects related to the generation of microarray and high-throughput sequencing data.
2. Applying bioinformatics tools for the quantitative analysis of DNA and RNA measurements obtained from high-throughput platforms.
3. Basic skills in analysing microarray and next-generation sequencing data with the R/Bioconductor platform.
4. Communicating scientific research by means of a presentation and a short manuscript, including the use of computational tools that facilitate reproducing the obtained results.

Specific competences
1. Understanding the most relevant aspects of the experimental methodology.
2. Acquiring basic concepts of experimental design (replication, randomization, blocking).
3. Developing an intuition for critical aspects of the data obtained from high-throughput experimental platforms.
4. Learning raw data quality assessment and control procedures.
5. Performing basic analyses of differential expression.
6. Assessing functional enrichment of differentially expressed genes.
7. Performing basic steps to call, annotate and filter variants from DNA sequencing data.
8. Understanding the task of identifying the genetic loci associated with a quantitative trait and the concept of percentage of variance explained in a phenotype by one or more genetic variants.
9. Using R and markdown to generate reproducible documents describing the statistical analysis of high-throughput omics data.
Contents

Contents section 1:
1.1 Introduction to high-throughput genomics technologies
1.2 Introduction to R and Bioconductor
1.3 Quality assessment and normalization of microarray data
1.4 Batch effect identification and removal
1.5 Differential expression analysis of microarray data

Contents section 2:
2.1 Functional annotations
2.2 Functional enrichment analysis
2.3 GSEA and GSVA approaches to functional analysis
2.4 Reproducible research

Contents section 3:
3.1 Quality assessment, read mapping and summarization of NGS data
3.2 Visualization and transcript discovery with NGS data
3.3 Variant calling from whole-exome sequencing data
3.4 Variant annotation and filtering
3.5 Normalization and differential expression of RNA-Seq data
3.6 Quantitative trait loci search
3.7 Integrative genomics with eQTL networks

Teaching methodology

Approach and general organization of the subject

All sessions of this course, except for the one introducing high-throughput genomics technologies and the one on reproducibility in research, are hands-on and use a computer, either located in a UPF computer room or hosted in the Amazon cloud. The subject is roughly organized into the three blocks described above, usually with one lecture-free week between consecutive blocks.

Training activities*

Students are expected to follow the data analysis steps described in each hands-on lecture, answering on their own the questions interspersed throughout the slides. At the end of each hands-on session, one or more exercises will be proposed to help students consolidate the concepts illustrated during the session. Students will also develop a data analysis over the entire term, delivered in two parts: one at an earlier deadline in the middle of the term and the other at the end of the term.
Assessment

Assessment system
- Weekly quizzes
- Data Analysis
- Presentation of the data analysis
- Final exam

Grading system
- Weekly quizzes (10%)
- Data Analysis (60%)
- Presentation of the data analysis (10%)
- Final exam (20%)

A minimum performance of 50% on each item is required to pass the subject.
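As an illustration of how the weights and the per-item minimum in the grading system above combine, the following minimal Python sketch computes a weighted grade and a pass flag. The item names, the `final_grade` helper and the 0-100 percentage scale for each item are assumptions made for this example; the course plan does not specify how individual items are scored.

```python
# Weights taken from the grading system above (they sum to 100%).
WEIGHTS = {
    "weekly_quizzes": 0.10,
    "data_analysis": 0.60,
    "presentation": 0.10,
    "final_exam": 0.20,
}
MIN_ITEM_SCORE = 50.0  # minimum performance required on each item, in percent


def final_grade(scores):
    """Return (weighted grade in percent, passed flag).

    `scores` maps each item name to a percentage in [0, 100]; this scale
    is an assumption of the sketch, not something the course plan states.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Weighted sum of the four assessment items.
    grade = sum(WEIGHTS[item] * scores[item] for item in WEIGHTS)
    # Passing requires reaching the minimum on every single item.
    passed = all(scores[item] >= MIN_ITEM_SCORE for item in WEIGHTS)
    return grade, passed


# Hypothetical student scores, for illustration only.
example = {
    "weekly_quizzes": 80,
    "data_analysis": 70,
    "presentation": 90,
    "final_exam": 45,
}
grade, passed = final_grade(example)
print(f"{grade:.1f}% passed={passed}")
```

In this hypothetical case the weighted grade is 68.0%, yet the student does not pass, because the final exam score is below the 50% minimum required on each item.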
3. Programme of activities (Aula Global)*

Description of the subject: Information Extraction from Omics technologies (IEO)
Total credits: 5
Total number of hours: 125 hrs.
Estimated time spent on the subject:
- In the classroom: 36
- Outside the classroom: 89

Weekly timetable of learning and assessment activities

Week        Work in the classroom           Estimated  Activities outside the classroom    Estimated
(dates)     (plenary, seminar,              time       (time studying, preparing           time
            practical, etc.)                           activities, etc.)

1st week    Plenary, Practical, Practical              Studying/solving
2nd week    Practical, Practical, Practical            Weekly quiz, Studying/solving
3rd week                                               first part of the
4th week    Practical, Practical                       Weekly quiz, Studying/solving, first part of the
5th week    Plenary                                    first part of the
6th week    Practical, Practical, Practical            Weekly quiz, Studying/solving
7th week    Practical, Practical                       Weekly quiz
8th week    Practical, Practical                       Weekly quiz
9th week
10th week   Other                                      Project presentation, Final exam

Total hours                                 36 hrs                                         89 hrs