Identifying sequence variants in whole genome sequencing data
- Agnes Bradley
- 7 years ago
1 Identifying sequence variants in whole genome sequencing data Saumya Kumar MRC-MGU
2 Outline of the talk. Genomic approaches at Harwell: methods to identify mouse models for diseases. Next Generation Sequencing (DNA-Seq): mutation detection (workflow analysis); mutation consequence of SNVs, INDELs and SVs; mutational effects on protein structure and function. Data resources.
3 Genomic approaches at Harwell. Genotype-driven screens: International Mouse Phenotyping Consortium (IMPC), to build a resource of knockout mice and an associated encyclopaedia of gene functions. Phenotype-driven screens: ENU mutagenesis, to identify novel mouse models of human disease; Harwell Ageing Screen, to identify mouse models for studying mutations that lead to early and late onset of diseases.
4 Harwell Ageing Mutant Screen: pedigree production. [Figure: breeding scheme showing ENU mutagenesis, C3H and C57BL/6 (G0) founders, heterozygous G1 offspring (+/m1, +/m2, +/m3), G2 female offspring mated with the G1 father, and G3 heterozygous and homozygous mutant offspring (+/+, +/m1, m1/m1).] 100 mice per pedigree; pedigrees over 5 years. Produce, age and screen 20,000-25,000 mice. Phenotyping.
5 Next Generation Sequencing. Massively parallel sequencing technologies that provide whole genome or exome sequence data. An efficient, low-cost and high-throughput sequencing method. Major sequencing platforms used are: Illumina (bp read length, reads are of the same size); Roche 454 (up to 1 kbp, reads of variable length); Ion Torrent (~200 bp read length).
6 Different Mutations detected using NGS. Single Nucleotide Variations (SNVs) or point mutations: substitution of a single nucleotide base. INDELs: small insertions/deletions, ranging in size from 1 bp to 100 bp. Structural Variations (SVs): genomic variations involving a large number of nucleotides, from 101 bp to kilobases and more.
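The size classes on this slide can be written as a small classifier. This is a minimal sketch and not part of the Harwell pipeline: the function name `classify_variant` and the representation of a variant as reference/alternate allele strings are assumptions made for illustration; only the 1-100 bp and 101 bp boundaries come from the slide.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant using the size classes quoted on the slide.

    ref and alt are reference and alternate allele strings; this
    representation is a hypothetical choice made for the sketch.
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNV" if ref != alt else "no change"
    size = abs(len(ref) - len(alt))  # net number of inserted or deleted bases
    if 1 <= size <= 100:
        return "INDEL"
    if size > 100:
        return "SV"
    return "other"  # e.g. an equal-length multi-base substitution


print(classify_variant("A", "G"))   # single-base substitution
print(classify_variant("AT", "A"))  # one-base deletion
```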
7 Analysis Workflow (SNVs and INDELs). DNA samples sent to Oxford for sequencing. Alignment and assembly: raw sequence data (Illumina paired-end) -> reference-guided assembly (BWA) -> aligned reads. SNP and INDEL detection (GATK): detect high-confidence SNPs and INDELs -> filter known SNPs and INDELs (MGP, dbSNP, Harwell inbred SNPs) -> identify possible ENU SNPs and INDELs. Annotation and dissemination: SNP and INDEL annotation (NGS-SNP, VEP) -> database -> visualisation in a genome browser (IGV).
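The "filter known SNPs and INDELs" step amounts to a set difference between the called variants and the known-variant collections. The sketch below is only an illustration of that idea: the `Variant` tuple, the `filter_known` function and the example coordinates are hypothetical, and it does not reproduce how GATK output or the MGP, dbSNP and Harwell inbred SNP sets are actually stored.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Minimal variant key (hypothetical layout for this sketch)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_known(candidates: Iterable[Variant],
                 *known_sets: Iterable[Variant]) -> List[Variant]:
    """Keep only candidates that are absent from every known-variant set."""
    known = set().union(*known_sets)
    return [v for v in candidates if v not in known]


# Made-up example data: one call is already in the "MGP" set.
calls = [Variant("1", 100, "A", "G"), Variant("2", 200, "C", "T")]
mgp = {Variant("1", 100, "A", "G")}
dbsnp = set()
print(filter_known(calls, mgp, dbsnp))
```

What remains after filtering is the list of candidate ENU-induced variants passed on to annotation.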
8 NGS: Alignment and sequence variation finding Single Nucleotide Variation (SNV) INDEL
9 Mutation Consequence: SNV. SNVs can be coding or non-coding. Coding mutations are usually deleterious to protein structure and function; these are missense, splice-site and nonsense mutations. Non-coding mutations are synonymous, intronic, intergenic, upstream, downstream, 5' UTR, 3' UTR, lincRNA, other RNA etc. [Figure: gene model labelled UPSTREAM, 5' UTR, INTRON, EXON, 3' UTR, DOWNSTREAM and INTERGENIC, marking coding and non-coding mutations.]
10 Harwell Ageing Screen NGS. [Figure: two pie charts. Coding: initiator_codon, missense, splice_acceptor, splice_donor, splice_region, stop_gained, stop_lost, with slice values 0%, 0%, 3%, 1%, 4%, 7% and 85%. Non-coding: Promoter Associated 47%, lincRNA 29%, other RNA (inc. miRNA, snRNA) 24%.] The average number of high-confidence coding changes is ~45 per genome. The coding mutation rate is 1.76 per Mb. Using ENSEMBL and ENCODE we found that approximately 198 SNVs are located in regulatory regions of the genome, with the greatest number found in promoter regions (47%) and lincRNAs (29%). The non-coding mutation rate is comparable at 1.78 per Mb. Cloned mutations [table: Total, Early, Late; Missense, Stop gained, Splice mutant; 19*/9, 10/1, 3/2].
11 Mutation Consequence: INDEL. Protein-coding DNA is divided into codons. INDELs can alter these codons in a gene, causing frameshift mutations that lead to nonsense and/or missense changes. E.g. a single base pair deletion in the gene Crb1 causes a frameshift mutation which leads to blindness in C57BL/6N mice (Simon et al., 2013).
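To make the frameshift effect concrete, the sketch below translates a short coding sequence before and after a single-base deletion using the standard genetic code. The 15-base sequence is a made-up toy example, not the Crb1 sequence, and the helper name `translate` is an assumption for this illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


wild_type = "ATGGCCAAGTTGTAA"            # toy sequence (hypothetical)
mutant = wild_type[:3] + wild_type[4:]   # delete one base after the start codon

print(translate(wild_type))
print(translate(mutant))
```

Every codon downstream of the deletion is read in a shifted frame, so the amino acids change and, in this toy case, the original stop codon is no longer read in frame.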
12 Mutational Consequence on protein. Coding SNVs (missense) and INDEL mutations can be damaging to, or tolerated by, the protein structure and function. Various tools can evaluate the effect of a mutation on the protein, e.g. SIFT, PROVEAN, PolyPhen-2 etc. These tools assign a score to missense/INDEL mutations indicating whether they are tolerated or damaging.
13 Mutational Consequence on protein: tools
Tool | Notes | Organism | Reference
CONSERVATION
SIFT | Predicts effect of SNVs. Predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids. Can be applied to naturally occurring and laboratory-induced missense mutations. | Human and known mouse SNPs (dbSNP) | (Kumar et al. 2009)
MutationAssessor | Predicts effect of SNVs | Human data: cancer studies | (Reva et al. 2011)
PROVEAN | Predicts effect of SNVs, insertions and deletions | Organism independent | (Choi et al. 2012)
STRUCTURE
SNPs3D | Sequence, 3-D structure, biological networks | Human, useful for association studies | (Yue et al. 2006)
MACHINE LEARNING / MULTIPLE DATASETS
PolyPhen-2 | Implements MSA, amino acid changes, evolutionary conservation, SNV site hypermutability. Uses a naïve Bayes classifier. | Human, can be adapted for mouse genome (standalone) | (Adzhubei et al. 2010)
MutationTaster2 | Machine learning on evolutionary conservation, splice site changes, gene expression and protein features. Uses a Bayes classifier. | Human, uses 1000G data | (Schwarz et al. 2014)
SNAP | Uses neural networks for evolutionary conservation, secondary structure, solvent accessibility | Human | (Bromberg and Rost 2007)
Site Directed Mutator (SDM) | Uses a potential free energy function for protein stability; algorithm uses environment-specific substitution tables to calculate stability, predicts disease association. | Organism independent | (Worth et al. 2011)
POST TRANSLATIONAL MODIFICATIONS
PhosSNP | Predicts SNV effect on PTM | Human | (Ren et al. 2010)
SNPeffect | Predicts SNV effect on PTM, structural features of proteins, subcellular localization and interactions | Human | (De Baets et al. 2012)
PROTEIN-PROTEIN INTERACTIONS
MuSiC | Predicts SNV effect on pathways (cancer studies). Segregates passenger mutations from truly significant mutations. | Human | (Dees et al. 2012)
14 Mutational Consequence on protein. Map3k1: a 1493 aa long protein. A Map3k1 knockout is viable on a mixed genetic background, but not viable on a C57BL/6J background. Mutation at amino acid 184, changing Leucine to Arginine. SIFT predicts that this amino acid change affects protein function; a score of <= 0.05 is considered damaging to the protein.
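The 0.05 cut-off quoted above can be applied programmatically once scores have been collected. This is a minimal sketch: the function name `sift_call` and the label "tolerated" for scores above the cut-off are choices made here, and the example scores are invented rather than taken from the Map3k1 result.

```python
SIFT_DAMAGING_THRESHOLD = 0.05  # cut-off quoted on the slide


def sift_call(score: float) -> str:
    """Label a SIFT score (range 0 to 1) using the slide's threshold."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SIFT scores lie between 0 and 1, got {score}")
    return "damaging" if score <= SIFT_DAMAGING_THRESHOLD else "tolerated"


# Invented example scores, for illustration only.
for s in (0.0, 0.05, 0.31):
    print(s, sift_call(s))
```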
15 Mutational Consequence on protein structure. Mapk1 mutation from Leucine to Arginine. [Figure panels: Wildtype, Mutant, Wildtype, Mutant.]
16 SIFT: Worked example. The worked example is on the SOD1 ALS mutation. Things required: Sequence of the gene:
>SOD1
MAMKAVCVLKGDGPVQGTIHFEQKASGEPVVLSGQITGLTE
GQHGFHVHQYGDNTQGCTSAGPHFNPHSKKHGGPADEER
HVGDLGNVTAGKDGVANVSIEDRVISLSGEHSIIGRTMVVH
EKQDDLGKGGNEESTKTGNAGSRLACGVIGIAQ
Substitution position and aa: D84G. SIFT link-
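Before submitting a substitution it is worth confirming that the stated reference residue really sits at the stated position of the pasted sequence. The sketch below does that check for D84G against the SOD1 sequence from this slide, counting positions from 1 at the initial methionine; the helper names `parse_substitution` and `check_substitution` are hypothetical and are not part of SIFT.

```python
import re

# SOD1 protein sequence exactly as given on the slide.
SOD1 = (
    "MAMKAVCVLKGDGPVQGTIHFEQKASGEPVVLSGQITGLTE"
    "GQHGFHVHQYGDNTQGCTSAGPHFNPHSKKHGGPADEER"
    "HVGDLGNVTAGKDGVANVSIEDRVISLSGEHSIIGRTMVVH"
    "EKQDDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def parse_substitution(sub: str):
    """Split a substitution such as 'D84G' into (ref, position, alt)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
    if m is None:
        raise ValueError(f"unrecognised substitution: {sub!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def check_substitution(seq: str, sub: str) -> bool:
    """True if the reference residue matches the sequence (1-based position)."""
    ref, pos, _alt = parse_substitution(sub)
    return 1 <= pos <= len(seq) and seq[pos - 1] == ref


print(check_substitution(SOD1, "D84G"))
```

A mismatch usually means the position was numbered against a different isoform or without the initial methionine.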
17 Structural Variations. Structural variations such as deletions can span anything from small segments to whole genes. Insertions are extra DNA segment(s) added relative to the reference genome. An inversion is a 180° flip of a DNA segment. A tandem duplication is multiple adjacent copies of a DNA segment.
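The four SV types can be illustrated on a toy reference string. This is a sketch for intuition only: the function names, the 0-based half-open coordinates and the 8-base sequence are all assumptions, and an inversion is modelled as the reverse complement of the affected segment, which is how a 180° flip reads on the reference strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def delete(seq: str, start: int, end: int) -> str:
    """Remove seq[start:end]."""
    return seq[:start] + seq[end:]


def insert(seq: str, pos: int, extra: str) -> str:
    """Add an extra segment before position pos."""
    return seq[:pos] + extra + seq[pos:]


def invert(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with its reverse complement."""
    flipped = seq[start:end].translate(COMPLEMENT)[::-1]
    return seq[:start] + flipped + seq[end:]


def tandem_duplicate(seq: str, start: int, end: int, copies: int = 2) -> str:
    """Repeat seq[start:end] so that it occurs `copies` times in a row."""
    return seq[:start] + seq[start:end] * copies + seq[end:]


reference = "AACCGGTT"  # toy sequence (hypothetical)
print(delete(reference, 2, 4))
print(insert(reference, 2, "TTT"))
print(invert(reference, 2, 5))
print(tandem_duplicate(reference, 2, 4))
```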
18 Analysis Workflow (Structural Variations). Alignment and assembly: raw sequence data (Illumina paired-end) -> reference-guided assembly (BWA) -> aligned reads. SV detection (PINDEL): realignment of reads -> detect SV breakpoints (deletions, insertions, inversions, tandem duplicates). Filtration: filter known SVs -> filter on range -> Harwell-specific SVs. Annotation: annotation -> manual curation. Other resources named on the slide: MGP, own script, IGV. [Figure: example of a deletion, a structural variation (SV).]
19 Mutation Consequence: SV. Because structural variations involve a large number of base pairs, those that span exons of a gene are very likely to be detrimental to the gene/protein. E.g. a 4 kb deletion in the Dclre1c gene removes exons 10 and 11 in C57BL/6J mice. This deletion leads to immune system, hematopoietic system and endocrine/exocrine gland phenotypes (Barthels et al., 2013).
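Whether a deletion removes exons is an interval-overlap question. The sketch below shows the check with invented coordinates: the exon positions and the deletion interval are hypothetical placeholders and are not the real Dclre1c annotation, and `overlapping_exons` is a name chosen for this example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive


def overlapping_exons(deletion: Interval,
                      exons: Dict[str, Interval]) -> List[str]:
    """Return the names of exons that share at least one base with the deletion."""
    del_start, del_end = deletion
    return [
        name
        for name, (start, end) in exons.items()
        if start <= del_end and del_start <= end
    ]


# Hypothetical gene model and a 4 kb deletion, for illustration only.
exons = {
    "exon9": (1000, 1100),
    "exon10": (3000, 3150),
    "exon11": (5000, 5200),
    "exon12": (9000, 9100),
}
deletion = (2500, 6499)
print(overlapping_exons(deletion, exons))
```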
20 Data resources Mouse Genome Project (17 genomes) MouseBook
21 Mouse Genome Project. Mouse Genome Project data at the Sanger Institute. Whole genome sequencing data of up to 36 inbred mouse strains to date. Provides information on inbred SNPs, INDELs and Structural Variations. Search Page:
22 MouseBook. MRC Harwell's portal for all the Ageing screen data. Provides phenotype data for mice from all the pedigrees over various parameters and time points, with graphs for each pedigree and advanced searches. Holds the complete range of ENU mutations identified in F1 mice (coding and non-coding) from whole genome sequencing. Functional annotations. Mutational consequence on protein.
23 Bioinformatic Resources
Genome Browsers: Ensembl, UCSC
Mutant Mouse Resources: MouseBook, Mouse Genome Informatics (MGI), NCBI
Functional Analysis: Gene Ontology, g:Profiler
EBI: BLAST, Aligners, InterProScan (Protein Domain Analysis)
Protein Families: Pfam, SMART, Superfamily, TreeFam
Protein-protein/gene interactions: Reactome, BRENDA, STRING, GeneMANIA, KEGG
Protein Structure Predictors: nFOLD3/IntFOLD, Phyre, GenTHREADER, PyMOL, I-TASSER
24 MGP: Gene Search Step1: Load anger/mouse_snpviewer/rel Step2: Search on Gene name or genomic location. Start typing Cops5 and select that gene. Step3: Select SNP/INDEL types, SV types and strains of interest (scroll below). Press Search
25 MGP: Gene Search. Select tabs for SNP, INDEL and SV information. Shows the allele in different strains and the SNP consequence in the gene. Data can be saved in text or CSV format. Reference SNP IDs from dbSNP are given for known SNPs.
26 MGP: Location Search. Step1: Search by Location: Enter 10: in the location box. Step2: Click on All. Press Search.
27 MGP: Location Search Shows intergenic SNPs in strains
28 SIFT: Sorting Intolerant From Tolerant. The worked example in the next slides is on the SOD1 ALS mutation. Things required: the sequence of the gene; the substitution position and amino acid.
29 Worked Example: SIFT. Step1: Load Step2: Click on SIFT Sequence
30 Worked Example: SIFT. Step3: Copy-paste the sequence in FASTA format from the handout. Step4: Copy-paste the substitution. Step5: Click on Submit Query.
31 Worked Example: SIFT. Step6: Click on Predictions of substitutions entered.
32 MouseBook: Phenotype Heatmap. Step1: Load Step2: Click on Phenotype Heatmap. Step 3: Scroll to locate MUTA-PED-C3PDE-49. Step 4: Click on Body weight parameter for this pedigree.
33 MouseBook: Phenotype Heatmap. This page shows the different time points at which the data was collected for the body weight parameter, along with other significant annotations for this pedigree. Step 5: Select any test month to view graphs. Step 6: Hover over the data points to see details of a mouse at any time point.
34 MouseBook: SNVs and small Indels Step1: Load Step2: Click on SNVs and small indels
35 MouseBook: SNVs and small indels. List of SNVs from the ageing database and APN. Selection can be made on any one of these: protein coding information, Gene, Chromosome, Genomic Position, Type of Mutation, Mutational Consequence and SNV source. Step 3: Click on the SNV.
Focusing on results not data comprehensive data analysis for targeted next generation sequencing
Focusing on results not data comprehensive data analysis for targeted next generation sequencing Daniel Swan, Jolyon Holdstock, Angela Matchan, Richard Stark, John Shovelton, Duarte Mohla and Simon Hughes
More informationBioinformatics Resources at a Glance
Bioinformatics Resources at a Glance A Note about FASTA Format There are MANY free bioinformatics tools available online. Bioinformaticists have developed a standard format for nucleotide and protein sequences
More informationRETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison
RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the
More informationNext Generation Sequencing: Technology, Mapping, and Analysis
Next Generation Sequencing: Technology, Mapping, and Analysis Gary Benson Computer Science, Biology, Bioinformatics Boston University gbenson@bu.edu http://tandem.bu.edu/ The Human Genome Project took
More informationUsing Illumina BaseSpace Apps to Analyze RNA Sequencing Data
Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data The Illumina TopHat Alignment and Cufflinks Assembly and Differential Expression apps make RNA data analysis accessible to any user, regardless
More informationLecture 3: Mutations
Lecture 3: Mutations Recall that the flow of information within a cell involves the transcription of DNA to mrna and the translation of mrna to protein. Recall also, that the flow of information between
More informationHuman Genome Organization: An Update. Genome Organization: An Update
Human Genome Organization: An Update Genome Organization: An Update Highlights of Human Genome Project Timetable Proposed in 1990 as 3 billion dollar joint venture between DOE and NIH with 15 year completion
More informationDelivering the power of the world s most successful genomics platform
Delivering the power of the world s most successful genomics platform NextCODE Health is bringing the full power of the world s largest and most successful genomics platform to everyday clinical care NextCODE
More informationIntroduction to NGS data analysis
Introduction to NGS data analysis Jeroen F. J. Laros Leiden Genome Technology Center Department of Human Genetics Center for Human and Clinical Genetics Sequencing Illumina platforms Characteristics: High
More informationGenomes and SNPs in Malaria and Sickle Cell Anemia
Genomes and SNPs in Malaria and Sickle Cell Anemia Introduction to Genome Browsing with Ensembl Ensembl The vast amount of information in biological databases today demands a way of organising and accessing
More informationSimplifying Data Interpretation with Nexus Copy Number
Simplifying Data Interpretation with Nexus Copy Number A WHITE PAPER FROM BIODISCOVERY, INC. Rapid technological advancements, such as high-density acgh and SNP arrays as well as next-generation sequencing
More informationGenBank, Entrez, & FASTA
GenBank, Entrez, & FASTA Nucleotide Sequence Databases First generation GenBank is a representative example started as sort of a museum to preserve knowledge of a sequence from first discovery great repositories,
More informationG E N OM I C S S E RV I C ES
GENOMICS SERVICES THE NEW YORK GENOME CENTER NYGC is an independent non-profit implementing advanced genomic research to improve diagnosis and treatment of serious diseases. capabilities. N E X T- G E
More information1 Mutation and Genetic Change
CHAPTER 14 1 Mutation and Genetic Change SECTION Genes in Action KEY IDEAS As you read this section, keep these questions in mind: What is the origin of genetic differences among organisms? What kinds
More informationBioBoot Camp Genetics
BioBoot Camp Genetics BIO.B.1.2.1 Describe how the process of DNA replication results in the transmission and/or conservation of genetic information DNA Replication is the process of DNA being copied before
More informationSeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications
Product Bulletin Sequencing Software SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications Comprehensive reference sequence handling Helps interpret the role of each
More informationMUTATION, DNA REPAIR AND CANCER
MUTATION, DNA REPAIR AND CANCER 1 Mutation A heritable change in the genetic material Essential to the continuity of life Source of variation for natural selection New mutations are more likely to be harmful
More informationTutorial for Windows and Macintosh. Preparing Your Data for NGS Alignment
Tutorial for Windows and Macintosh Preparing Your Data for NGS Alignment 2015 Gene Codes Corporation Gene Codes Corporation 775 Technology Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) 1.734.769.7249
More informationAn example of bioinformatics application on plant breeding projects in Rijk Zwaan
An example of bioinformatics application on plant breeding projects in Rijk Zwaan Xiangyu Rao 17-08-2012 Introduction of RZ Rijk Zwaan is active worldwide as a vegetable breeding company that focuses on
More informationJust the Facts: A Basic Introduction to the Science Underlying NCBI Resources
1 of 8 11/7/2004 11:00 AM National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools
More informationUmm AL Qura University MUTATIONS. Dr Neda M Bogari
Umm AL Qura University MUTATIONS Dr Neda M Bogari CONTACTS www.bogari.net http://web.me.com/bogari/bogari.net/ From DNA to Mutations MUTATION Definition: Permanent change in nucleotide sequence. It can
More informationHow-To: SNP and INDEL detection
How-To: SNP and INDEL detection April 23, 2014 Lumenogix NGS SNP and INDEL detection Mutation Analysis Identifying known, and discovering novel genomic mutations, has been one of the most popular applications
More informationINTERNATIONAL CONFERENCE ON HARMONISATION OF TECHNICAL REQUIREMENTS FOR REGISTRATION OF PHARMACEUTICALS FOR HUMAN USE Q5B
INTERNATIONAL CONFERENCE ON HARMONISATION OF TECHNICAL REQUIREMENTS FOR REGISTRATION OF PHARMACEUTICALS FOR HUMAN USE ICH HARMONISED TRIPARTITE GUIDELINE QUALITY OF BIOTECHNOLOGICAL PRODUCTS: ANALYSIS
More informationDatabase schema documentation for SNPdbe
Database schema documentation for SNPdbe Changes 02/27/12: seqs_containingsnps.taxid removed dbsnp_snp.tax_id renamed to dbsnp_snp.taxid General information: Data in SNPdbe is organized on several levels.
More informationThe Human Genome Project
The Human Genome Project Brief History of the Human Genome Project Physical Chromosome Maps Genetic (or Linkage) Maps DNA Markers Sequencing and Annotating Genomic DNA What Have We learned from the HGP?
More informationLeading Genomics. Diagnostic. Discove. Collab. harma. Shanghai Cambridge, MA Reykjavik
Leading Genomics Diagnostic harma Discove Collab Shanghai Cambridge, MA Reykjavik Global leadership for using the genome to create better medicine WuXi NextCODE provides a uniquely proven and integrated
More informationExercises for the UCSC Genome Browser Introduction
Exercises for the UCSC Genome Browser Introduction 1) Find out if the mouse Brca1 gene has non-synonymous SNPs, color them blue, and get external data about a codon-changing SNP. Skills: basic text search;
More informationGene mutation and molecular medicine Chapter 15
Gene mutation and molecular medicine Chapter 15 Lecture Objectives What Are Mutations? How Are DNA Molecules and Mutations Analyzed? How Do Defective Proteins Lead to Diseases? What DNA Changes Lead to
More informationescience and Post-Genome Biomedical Research
escience and Post-Genome Biomedical Research Thomas L. Casavant, Adam P. DeLuca Departments of Biomedical Engineering, Electrical Engineering and Ophthalmology Coordinated Laboratory for Computational
More informationThe sequence of bases on the mrna is a code that determines the sequence of amino acids in the polypeptide being synthesized:
Module 3F Protein Synthesis So far in this unit, we have examined: How genes are transmitted from one generation to the next Where genes are located What genes are made of How genes are replicated How
More informationWhen you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want
1 When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want to search other databases as well. There are very
More informationNext Generation Sequencing. mapping mutations in congenital heart disease
Next Generation Sequencing mapping mutations in congenital heart disease AV Postma PhD Academic Medical Center Amsterdam, the Netherlands Overview talk Congenital heart disease and genetics Next generation
More informationBreast cancer and the role of low penetrance alleles: a focus on ATM gene
Modena 18-19 novembre 2010 Breast cancer and the role of low penetrance alleles: a focus on ATM gene Dr. Laura La Paglia Breast Cancer genetic Other BC susceptibility genes TP53 PTEN STK11 CHEK2 BRCA1
More informationBioinformatics Grid - Enabled Tools For Biologists.
Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis
More informationLecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)
Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs) Single nucleotide polymorphisms or SNPs (pronounced "snips") are DNA sequence variations that occur
More informationText file One header line meta information lines One line : variant/position
Software Calling: GATK SAMTOOLS mpileup Varscan SOAP VCF format Text file One header line meta information lines One line : variant/position ##fileformat=vcfv4.1! ##filedate=20090805! ##source=myimputationprogramv3.1!
More informationmygenomatix - secure cloud for NGS analysis
mygenomatix Speed. Quality. Results. mygenomatix - secure cloud for NGS analysis background information & contents 2011 Genomatix Software GmbH Bayerstr. 85a 80335 Munich Germany info@genomatix.de www.genomatix.de
More informationBio 102 Practice Problems Genetic Code and Mutation
Bio 102 Practice Problems Genetic Code and Mutation Multiple choice: Unless otherwise directed, circle the one best answer: 1. Beadle and Tatum mutagenized Neurospora to find strains that required arginine
More informationAdvances in RainDance Sequence Enrichment Technology and Applications in Cancer Research. March 17, 2011 Rendez-Vous Séquençage
Advances in RainDance Sequence Enrichment Technology and Applications in Cancer Research March 17, 2011 Rendez-Vous Séquençage Presentation Overview Core Technology Review Sequence Enrichment Application
More informationBio-Informatics Lectures. A Short Introduction
Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively
More informationThe world of non-coding RNA. Espen Enerly
The world of non-coding RNA Espen Enerly ncrna in general Different groups Small RNAs Outline mirnas and sirnas Speculations Common for all ncrna Per def.: never translated Not spurious transcripts Always/often
More informationSearching Nucleotide Databases
Searching Nucleotide Databases 1 When we search a nucleic acid databases, Mascot always performs a 6 frame translation on the fly. That is, 3 reading frames from the forward strand and 3 reading frames
More informationEuropean Medicines Agency
European Medicines Agency July 1996 CPMP/ICH/139/95 ICH Topic Q 5 B Quality of Biotechnological Products: Analysis of the Expression Construct in Cell Lines Used for Production of r-dna Derived Protein
More informationGene and Chromosome Mutation Worksheet (reference pgs. 239-240 in Modern Biology textbook)
Name Date Per Look at the diagrams, then answer the questions. Gene Mutations affect a single gene by changing its base sequence, resulting in an incorrect, or nonfunctional, protein being made. (a) A
More informationTargeted. sequencing solutions. Accurate, scalable, fast TARGETED
Targeted TARGETED Sequencing sequencing solutions Accurate, scalable, fast Sequencing for every lab, every budget, every application Ion Torrent semiconductor sequencing Ion Torrent technology has pioneered
More informationLifeScope Genomic Analysis Software 2.5
USER GUIDE LifeScope Genomic Analysis Software 2.5 Graphical User Interface DATA ANALYSIS METHODS AND INTERPRETATION Publication Part Number 4471877 Rev. A Revision Date November 2011 For Research Use
More informationSingle-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation
PN 100-9879 A1 TECHNICAL NOTE Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation Introduction Cancer is a dynamic evolutionary process of which intratumor genetic and phenotypic
More informationAP BIOLOGY 2010 SCORING GUIDELINES (Form B)
AP BIOLOGY 2010 SCORING GUIDELINES (Form B) Question 2 Certain human genetic conditions, such as sickle cell anemia, result from single base-pair mutations in DNA. (a) Explain how a single base-pair mutation
More informationNext generation DNA sequencing technologies. theory & prac-ce
Next generation DNA sequencing technologies theory & prac-ce Outline Next- Genera-on sequencing (NGS) technologies overview NGS applica-ons NGS workflow: data collec-on and processing the exome sequencing
More informationA Primer of Genome Science THIRD
A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:
More informationSICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE
AP Biology Date SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE LEARNING OBJECTIVES Students will gain an appreciation of the physical effects of sickle cell anemia, its prevalence in the population,
More informationTHE UNIVERSITY OF MANCHESTER Unit Specification
1. GENERAL INFORMATION Title Unit code Credit rating 15 Level 7 Contact hours 30 Other Scheduled teaching and learning activities* Pre-requisite units Co-requisite units School responsible Member of staff
More informationAnalysis of ChIP-seq data in Galaxy
Analysis of ChIP-seq data in Galaxy November, 2012 Local copy: https://galaxy.wi.mit.edu/ Joint project between BaRC and IT Main site: http://main.g2.bx.psu.edu/ 1 Font Conventions Bold and blue refers
More informationSingle-Cell DNA Sequencing with the C 1. Single-Cell Auto Prep System. Reveal hidden populations and genetic diversity within complex samples
DATA Sheet Single-Cell DNA Sequencing with the C 1 Single-Cell Auto Prep System Reveal hidden populations and genetic diversity within complex samples Single-cell sensitivity Discover and detect SNPs,
More informationBIO 3352: BIOINFORMATICS II HYBRID COURSE SYLLABUS
BIO 3352: BIOINFORMATICS II HYBRID COURSE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title: Bioinformatics
More informationAS4.1 190509 Replaces 260806 Page 1 of 50 ATF. Software for. DNA Sequencing. Operators Manual. Assign-ATF is intended for Research Use Only (RUO):
Replaces 260806 Page 1 of 50 ATF Software for DNA Sequencing Operators Manual Replaces 260806 Page 2 of 50 1 About ATF...5 1.1 Compatibility...5 1.1.1 Computer Operator Systems...5 1.1.2 DNA Sequencing
More informationLECTURE 6 Gene Mutation (Chapter 16.1-16.2)
LECTURE 6 Gene Mutation (Chapter 16.1-16.2) 1 Mutation: A permanent change in the genetic material that can be passed from parent to offspring. Mutant (genotype): An organism whose DNA differs from the
More informationBIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS
BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title:
More informationChapter 2. imapper: A web server for the automated analysis and mapping of insertional mutagenesis sequence data against Ensembl genomes
Chapter 2. imapper: A web server for the automated analysis and mapping of insertional mutagenesis sequence data against Ensembl genomes 2.1 Introduction Large-scale insertional mutagenesis screening in
More informationGenome and DNA Sequence Databases. BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009
Genome and DNA Sequence Databases BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009 Admin Reading: Chapters 1 & 2 Notes available in PDF format on-line (see class calendar page): http://www.soe.ucsc.edu/classes/bme110/spring09/bme110-calendar.html
More informationGenetics Lecture Notes 7.03 2005. Lectures 1 2
Genetics Lecture Notes 7.03 2005 Lectures 1 2 Lecture 1 We will begin this course with the question: What is a gene? This question will take us four lectures to answer because there are actually several
More informationDNA Insertions and Deletions in the Human Genome. Philipp W. Messer
DNA Insertions and Deletions in the Human Genome Philipp W. Messer Genetic Variation CGACAATAGCGCTCTTACTACGTGTATCG : : CGACAATGGCGCT---ACTACGTGCATCG 1. Nucleotide mutations 2. Genomic rearrangements 3.
More informationNote: This document wh_informatics_practical.doc and supporting materials can be downloaded at
Woods Hole Zebrafish Genetics and Development Bioinformatics/Genomics Lab Ian Woods Note: This document wh_informatics_practical.doc and supporting materials can be downloaded at http://faculty.ithaca.edu/iwoods/docs/wh/
More informationModule 1. Sequence Formats and Retrieval. Charles Steward
The Open Door Workshop Module 1 Sequence Formats and Retrieval Charles Steward 1 Aims Acquaint you with different file formats and associated annotations. Introduce different nucleotide and protein databases.
More informationPairwise Sequence Alignment
Pairwise Sequence Alignment carolin.kosiol@vetmeduni.ac.at SS 2013 Outline Pairwise sequence alignment global - Needleman Wunsch Gotoh algorithm local - Smith Waterman algorithm BLAST - heuristics What
More informationGenetics Module B, Anchor 3