Unit specification template

1. GENERAL INFORMATION

Title: Bioinformatics, Interpretation, Statistics and Data Quality Assurance
Unit code:
Credit rating: 15
Level: 7
Contact hours: 30
Other scheduled teaching and learning activities*: Lectures, seminars, tutorials, case studies, PBL. Total contact time with students will be approximately 30 hours.
Pre-requisite units: None
Co-requisite units:
ECTS**: 7.5
Notional hours of learning***: 150
School responsible: Institute of Human Development
Member of staff responsible: Prof Andy Brass

2. AIMS

By the end of this compulsory module the student will be able to:
1. Analyse the principles applied to quality control of sequencing data, alignment of sequence to the reference genome, calling and annotating sequence variants, and filtering strategies to identify pathogenic mutations in sequencing data.
2. Interrogate major data sources (e.g. of genomic sequence, protein sequence, variation and pathways, such as EVS, dbSNP and ClinVar) and integrate them with clinical data to assess the pathogenic and clinical significance of the genome result.
3. Acquire relevant basic computational skills and an understanding of statistical methods for handling and analysing sequencing data for application in both diagnostic and research settings.
4. Gain practical experience of the bioinformatics pipeline through the Genomics England programme.
5. Justify and defend the place of Professional Best Practice Guidelines in the diagnostic setting for the reporting of genomic variation.

3. BRIEF DESCRIPTION OF THE UNIT

Genetics/Genomics
o Introduction to the history and scope of genomics
o The genome landscape
o Nucleic acid structure and function, including the structure and function of coding and non-coding DNA
o The central dogma: from DNA to RNA and proteins
o Non-coding regulatory sequence: promoters, transcription factor binding sites, splice site dinucleotides, enhancers, insulators
o Genetic variation and its role in health and disease
o Genomic technology and the role of the genome in the development and treatment of disease

Sequencing
o Types of sequencing, applications and limitations; Sanger versus short read
o Analysis, annotation and interpretation
o Panel versus exome versus whole genome resequencing
o Aligning genome data to reference sequence using up-to-date alignment programs (e.g. BWA)

Statistics
o Basic statistics applied to clinical genetics/genomics
o Hardy-Weinberg, Bayes' theorem, risks in pedigrees
o Assessment of data quality through application of quality control measures
o How to determine the analytical sensitivity and specificity of genomic tests

Bioinformatic Fundamentals
o Introduction to the history and scope of bioinformatics
o Primary biological sequence resources, including INSDC (GenBank, EMBL, DDBJ) and UniProt (Swiss-Prot and TrEMBL)
o Genome browsers and interfaces, including Ensembl, UCSC Genome Browser and Entrez
o Similarity/homology, theory of sequence analysis, scoring matrices, and search and alignment methods including BLAST, BLAT, dynamic-programming pairwise alignment (e.g. Smith-Waterman, Needleman-Wunsch) and multiple sequence alignment (e.g. ClustalW, T-Coffee, MUSCLE)
o Feature identification, including SNP analysis and transcription factor binding sites and their associated binding sequence motifs
o Ontologies, in particular GO and the Human Phenotype Ontology (HPO)

Clinical application of bioinformatics
Introduction to the clinical application of bioinformatic resources, including their role and use in a medical context in molecular genetics, cytogenetics and next generation sequencing for data manipulation and analysis, and genotyping microarrays (also used to predict CNVs).
Use of tools to call sequence variants (e.g. GATK), and annotation of variant-call files (.vcf) using established databases.
Filtering strategies for variants, in the context of clinical data, using publicly available control data sets.
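As an illustration of the filtering strategies listed above, the following is a minimal sketch rather than a prescribed method. It assumes a simplified, hypothetical Variant record instead of a parsed VCF file, a hypothetical lookup table of allele frequencies from a control data set, and arbitrary illustrative thresholds (max_af, min_qual) that a laboratory would replace with values from its own validated procedures.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Key identifying a variant: (chromosome, position, reference allele, alternate allele)
VariantKey = Tuple[str, int, str, str]


@dataclass(frozen=True)
class Variant:
    """A simplified, hypothetical variant record (not a full VCF parser)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float       # Phred-scaled call quality, as in the VCF QUAL column
    consequence: str  # illustrative annotation label, e.g. "missense"


def filter_variants(
    variants: List[Variant],
    control_af: Dict[VariantKey, float],
    max_af: float = 0.01,    # assumed threshold, for illustration only
    min_qual: float = 30.0,  # assumed threshold, for illustration only
) -> List[Variant]:
    """Keep well-supported variants that are rare or absent in the controls."""
    kept = []
    for v in variants:
        if v.qual < min_qual:
            continue  # drop low-confidence calls
        key = (v.chrom, v.pos, v.ref, v.alt)
        af = control_af.get(key, 0.0)  # not seen in controls: treat frequency as 0
        if af > max_af:
            continue  # too common in controls to be a likely rare-disease cause
        kept.append(v)
    return kept


# Made-up example data, not real variants.
variants = [
    Variant("1", 1000, "A", "G", 50.0, "missense"),
    Variant("1", 2000, "C", "T", 10.0, "missense"),    # low call quality
    Variant("2", 3000, "G", "A", 60.0, "synonymous"),  # common in controls
]
control_af = {("2", 3000, "G", "A"): 0.20}

rare = filter_variants(variants, control_af)
print([(v.chrom, v.pos) for v in rare])
```

In practice the candidate list produced by such a filter would then be reviewed against the clinical data and the specialist databases covered in this unit.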
Use of multiple database sources, in silico tools and literature for pathogenicity evaluation, and familiarity with the statistical programs that support this.

Background and application of specialist databases and browsers:
o dbSNP, DECIPHER, Orphanet, DMuDB / NGRL Universal Browser, ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/intro/), OMIM, ECARUCA, DGV, ExAC, NHLBI-GO
o LOVD/UMD database software and scientific literature
o HGMD
o Specific clinical analysis software
o CNV analysis
o Gene prioritisation (e.g. ToppGene, Endeavour, GeCCO)
o Missense analysis (e.g. Align GVGD, SIFT, PolyPhen, PANTHER, PhD-SNP, MAPP)
o Splicing analysis applications (e.g. GeneSplicer, MaxEntScan, NNSplice, SSFL, HSF, NetGene2)
o Commercially available software (e.g. NextGENe, Alamut, Cartegenia)

THE UNIVERSITY OF MANCHESTER

o Capture and representation of phenotype data
o Development of a simple application for clinical bioinformatic use

Standards and governance
o Data standards and formats
o IUPAC codes, FASTA, GenBank, FASTQ, SAM/BAM/CRAM, VCF
o HGVS variant nomenclature
o HGNC gene nomenclature
o RefSeq/RefSeqGene, LRG
o Role and development of Standard Operating Procedures
o Relevant standards (clinical, genetic, bioinformatic) for data representation and exchange
o Principles of integration of laboratory and clinical information, and the place of best-practice guidelines for indicating the clinical significance of results
o Principles of downstream functional analysis, e.g. knock-outs and other cellular models
o How to analyse genomic data to identify epigenetic and other variation that modifies phenotype
o Practice in examples of analysis of genomic data in the Training Embassy within the Genomics England Data Centre.

* To inform the Key Information Set. Defined as any activity that a student has to attend or undertake at a fixed point and that has no flexibility for when it is undertaken, and where the student also has access to an available staff member (Provision of Information about Higher Education: Outcomes of consultation and next steps, June 2011/18).

** ECTS (European Credit Transfer and Accumulation System): There are 2 UK credits for every 1 ECTS credit, in accordance with the Credit Framework (QAA). Therefore if a unit is worth 30 UK credits, this will equate to 15 ECTS credits.

*** Notional hours of learning: The number of hours which it is expected that a learner (at a particular level) will spend, on average, to achieve the specified learning outcomes at that level. It is expected that there will be 10 hours of notional study associated with every 1 credit achieved. Therefore if a unit is worth 30 credits, this will equate to 300 notional study hours, in accordance with the Credit Framework (QAA).
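The credit arithmetic in the ** and *** footnotes can be checked for this unit with a short sketch. The constant and function names are illustrative assumptions; the conversion factors are those stated in the footnotes.

```python
# Conversion factors taken from the ** and *** footnotes.
UK_CREDITS_PER_ECTS_CREDIT = 2
NOTIONAL_HOURS_PER_UK_CREDIT = 10


def ects_credits(uk_credits: float) -> float:
    """Convert UK credits to ECTS credits (2 UK credits per ECTS credit)."""
    return uk_credits / UK_CREDITS_PER_ECTS_CREDIT


def notional_hours(uk_credits: float) -> float:
    """Notional hours of learning (10 hours per UK credit)."""
    return uk_credits * NOTIONAL_HOURS_PER_UK_CREDIT


unit_credits = 15  # credit rating of this unit, from section 1
print(ects_credits(unit_credits))    # 7.5
print(notional_hours(unit_credits))  # 150
```

Both results agree with the ECTS value (7.5) and the notional hours of learning (150) given in the general information for this unit.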

4. INTENDED LEARNING OUTCOMES

For each category of outcome, students will be able to:

Knowledge and understanding
1. Critically evaluate the governance and ethical frameworks in place within the NHS and how they apply to bioinformatics.
2. Discuss and justify the importance of standards, best practice guidelines and standard operating procedures: how they are developed, improved and applied to clinical bioinformatics.
3. Describe the structure of DNA and the functions of coding and non-coding DNA.
4. Discuss the flow of information from DNA to RNA to protein in the cell.
5. Describe transcription of DNA to mRNA and the protein synthesis process.
6. Discuss the role of polymorphisms in Mendelian and complex disorders and give examples of polymorphisms involved in genetic disease.
7. Describe appropriate bioinformatics databases capturing information on DNA, RNA and protein sequences.
8. Explain the theory of sequence analysis and the use of genome analysis tools.
9. Describe secondary databases in bioinformatics and their use in generating metadata on gene function.
10. Explain fundamental bioinformatic principles, including the scope and aims of bioinformatics and its development.
11. Explain fundamental genomic principles, including the scope and aims of genomics and its development.
12. Discover resources linking polymorphism to disease processes, and discuss and evaluate the resources that are available to the bioinformatician and how these are categorised.
13. Discuss metadata and how it is captured in bioinformatics resources.
14. Interpret the metadata provided by the major bioinformatics resources.
15. Describe the use of ontologies in metadata capture and give examples of the use of ontologies for capturing information on gene function and phenotype.
16. Identify appropriate references where published data are to be reported.
17. Describe the biological background to diagnostic genetic testing and clinical genetics, and the role of bioinformatics.
18. Describe the partnership of Clinical Bioinformatics and Genetics with other clinical specialisms in the investigation and management of genetic disorders, and its contribution to safe and effective patient care.

Intellectual skills
1. Critically analyse scientific and clinical data.
2. Present scientific and clinical data appropriately.
3. Formulate a critical argument.
4. Evaluate scientific and clinical literature.
5. Critically evaluate knowledge of clinical bioinformatics in addressing specific clinical problems.

Practical skills
1. Present information clearly in the form of verbal and written reports.
2. Communicate complex ideas and arguments in a clear, concise and effective manner.
3. Work effectively as an individual or as part of a team.
4. Use conventional and electronic resources to collect, select and organise complex scientific information.
5. Perform analysis on DNA data and protein sequence data to infer function.
6. Perform sequence alignment tasks.
7. Select and apply appropriate bioinformatic tools and resources from a core subset to typical diagnostic laboratory cases, contextualised to the scope and practice of a clinical genetics laboratory.
8. Compare major bioinformatics resources for clinical diagnostics, and how their results can be summarised and integrated with other lines of evidence to produce clinically valid reports.
9. Interpret evidence from bioinformatic tools and resources and integrate this into the sum of genetic information for the interpretation and reporting of test results from patients.
10. Record the build or version numbers of resources used on a given date, including those of linked data sources, and understand the clinical relevance of this data.

Transferable skills and personal qualities
1. Present complex ideas in simple terms in both oral and written formats.
2. Consistently operate within sphere of personal competence and level of authority.
3. Manage personal workload and objectives to achieve quality of care.
4. Actively seek accurate and validated information from all available sources.
5. Select and apply appropriate analysis or assessment techniques and tools.
6. Evaluate a wide range of data to assist with judgements and decision making.
7. Interpret data and convert it into knowledge for use in the clinical context of individual patients and groups of patients.
8. Work in partnership with colleagues, other professionals, patients and their carers to maximise patient care.

5. LEARNING AND TEACHING PROCESSES (INCLUDING THE USE OF E-LEARNING)

1. Lectures, tutorials, case studies and PBL. In particular we will make extensive use of staged case studies to support students through the processes involved in:
   1) developing a clinical understanding of genomic variants through bioinformatics analyses;
   2) properly capturing a record of the analysis methodologies used; and
   3) presenting the results back to clinical and clinical scientist colleagues in an appropriate format and in the context of the current understanding of the disease processes.
2. E-learning:
   - evidence-based learning supported by course notes, audio lectures, case studies
   - online tutorials

6. ASSESSMENT (INCLUDING FORMATIVE ASSESSMENT, E-ASSESSMENT, and INFORMATION ABOUT FEEDBACK)

Assessment task: Group work. Students will be assessed on the group effectiveness, their contribution to the group work, and on presentations on the group work tasks.
How and when feedback is provided: Tutor feedback during sessions
Weighting within unit (if relevant): 30%

Assessment task: Project report based on the analysis of variants provided to students.
Length: 3000 words
How and when feedback is provided: Written feedback provided in GradeMark within 3 weeks of the assignment submission deadline
Weighting within unit (if relevant): 70%

7. INDICATIVE READING LIST

Suggested reading for introduction to genetics and genomics and DNA sequencing

Molecular Biology/Genetics textbooks (look for the latest edition)
1. Human Molecular Genetics, Tom Strachan and Andrew Read, Garland Science. Chapters 1, 2 and 13
2. New Clinical Genetics, Andrew Read and Dian Donnai, Scion Publishing
3. Essential Medical Genetics, JM Connor and MA Ferguson-Smith, Blackwell Science
4. Genomes, TA Brown, Bios Scientific Publishers
5. Human Genetics and Genomics, Bruce R Korf, Blackwell Publishing
6. Instant Notes in Bioinformatics, Hodgman, French and Westhead (Bios, 2009)

Journal papers
1. What is a gene, post-ENCODE? History and updated definition. Gerstein, MB et al (2007) Genome Research 17:p669
2. Non-coding RNAs: key regulators of mammalian transcription. Kugel, JF and Goodrich, JA (2012) Trends Biochem Sci 37(4):p144
3. Long non-coding RNAs and enhancers. Ørom, UA and Shiekhattar, R (2011) Curr Opin Genet Dev 21(2):p194
4. Human genetics and genomics a decade after the release of the draft sequence of the human genome. Naidoo, N et al (2011) Human Genomics 5(6):p577
5. Identifying disease mutations in genomic medicine settings: current challenges and how to accelerate progress. Lyon, GJ and Wang, K (2012) Genome Medicine 4:58
6. Implementing genomic medicine in the clinic: the future is here. Manolio, T et al (2013) Genetics in Medicine 15(4):p258

Genome Project websites
1. The Human Genome Project
   UK: http://www.sanger.ac.uk/about/history/hgp/
   USA: http://www.genome.gov/10001772
2. 1000 Genomes Project: http://www.1000genomes.org/
3. 10,000 Genomes Project (UK10K): http://www.uk10k.org/
4. Genomics England: http://www.genomicsengland.co.uk/

Professional Practice Guidelines
1. USA: American College of Medical Genetics and Genomics (ACMG): https://www.acmg.net/
2. UK: Association for Clinical Genetic Science: http://www.acgs.uk.com/quality-committee/best-practice-guidelines/

Nomenclature Guidelines
1. http://www.hgvs.org/mutnomen/recs.html#general

For information and advice on Link2Lists reading list software, see: http://www.library.manchester.ac.uk/academicsupport/informationandadviceonlink2listsreadinglistsoftware/

Date of current version: 25/6/15

Document control box
Policy / Procedure title: Template
Date approved: January 2009
Approving body: TLSO
Implementation date: January 2009
Version: 2.1, June 2012
Supersedes: 1.1
Previous review dates:
Next review date: tbc
Related Statutes, Ordinances, General Regulations: N/A
Related Policies: N/A
Related Procedures and Guidance: The Manual of Academic Procedures (MAP) - http://www.tlso.manchester.ac.uk/map/
Policy owner: Louise Walmsley, Head of Teaching and Learning Support Office
Lead contact: Miriam Graham, Teaching and Learning Adviser (Policies and Procedures)