NIAID Genomics and Bioinformatics Programs Vivien G. Dugan, PhD Office of Genomics and Advanced Technologies (OGAT) Division of Microbiology and Infectious Diseases NIAID, NIH, DHHS duganv@niaid.nih.gov April 1, 2014 FDA Workshop
NIAID Mission NIAID conducts and supports basic and applied research to better understand, treat, and ultimately prevent infectious, immunologic, and allergic diseases Maintain and grow a robust basic and applied research portfolio in microbiology, immunology, and clinical research Respond rapidly to new infectious disease threats NIAID Infectious Disease Research: A Dual Mandate Slide Source: A. S. Fauci
Division of Microbiology and Infectious Diseases Office of the Director Dr. Carole Heilman, Director Dr. Irene Glowinski, Deputy Director Office of Clinical Research Coordination Dr. Richard Gorman Associate Director Office of International Research in Infectious Diseases Dr. Polly Sager Assistant Director Office of Regulatory Affairs Dr. Robert Johnson Director Office of Clinical Research Affairs Dr. Shy Shorer Director Office of Clinical Research Resources Dr. Robert Johnson Acting Director Office of Biodefense, Research Resources, and Translational Research Dr. Michael Kurilla Director Office of Genomics and Advanced Technologies Dr. Maria Giovanni Director Office of Scientific Coordination and Program Operations Dr. Barbara Mulach Director Bacteriology and Mycology Branch Dr. Dennis Dixon Chief Enteric and Hepatic Diseases Branch Dr. Fred Cassels Chief Sexually Transmitted Infections Branch Dr. Carolyn Deal Chief Respiratory Diseases Branch Dr. Linda Lambert Chief Parasitology and International Programs Branch Dr. Lee Hall Chief Virology Branch Dr. Catherine Laughlin Chief NIH/NIAID
MERS-CoV Slide Source: A. S. Fauci
Knowledge from Data Challenge Rapidly expanding volume, complexity, and range of data Genome and Annotation RNA Sequencing Protein Expression Structure Isolates Subcellular Location Phenotype Exploit data to understand, prevent, diagnose, and treat infectious diseases Slide Source: M.Y. Giovanni
NIAID Genomics Program Sequencing Functional Genomics Proteomics Structural Genomics Systems Biology Genomic Sequencing Centers Functional Genomic Research Centers Clinical Proteomics Centers Structural Genomics Centers Systems Biology Centers Bioinformatics Resource Centers Bioinformatics Genomic Research Resources Genomic/Omics Data Sets, Databases, Bioinformatics Tools, Biomarkers, 3D Structures, Protein Clones, Predictive Models To address key questions in microbiology and infectious disease Slide Source: M.Y. Giovanni
NIAID Genomic Sequencing Centers for Infectious Diseases Sample Processing Method Develop High Throughput Sequencing Pipelines Metagenomics Transcriptomics Bioinformatics Tools Data Analysis Pipelines Genomics Bioinformatics Training Slide Source: M.Y. Giovanni
NIAID Genome Sequencing Projects: 2001-2013 6,418 Completed Genomes: 5,889 Bacteria; 373 Fungi; 148 Protozoan Parasites; 4 Invertebrate Vectors; 1 Plant; 2 Worms; 1 Mammal (Ferret) 15,562 Completed Viral Genomes: 11,144 Influenza viruses; 2,523 Dengue viruses; 502 Hepatitis C viruses; 396 West Nile viruses; 263 Rotaviruses; 252 Rhinoviruses; 172 Coronaviruses; 104 Arboviruses; 77 Noroviruses; 28 Lassa virus; 28 Herpes simplex viruses; 16 Rubella; 16 Paramyxoviruses (HPIV3, HMPV); 14 Small RNA viruses; 10 Adenoviruses; 10 Measles; 6 RSV; 1 Varicella 102 Human Exomes NIAID GSC Web Site http://www3.niaid.nih.gov/labsandresources/resources/gsc/default.htm April 10, 2013 Slide Source: M.Y. Giovanni
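As a quick consistency check on the slide above, the per-category counts can be summed and compared with the stated totals. This is only a sketch: the numbers are transcribed from the slide, and counting the ferret genome with the non-viral genomes is an assumption (it is the grouping under which both totals add up).

```python
# Counts transcribed from the slide. Grouping "Mammal (ferret)" with the
# non-viral genomes is an assumption made for this check.
non_viral = {
    "Bacteria": 5889,
    "Fungi": 373,
    "Protozoan parasites": 148,
    "Invertebrate vectors": 4,
    "Plant": 1,
    "Worms": 2,
    "Mammal (ferret)": 1,
}

viral = {
    "Influenza viruses": 11144,
    "Rhinoviruses": 252,
    "Dengue viruses": 2523,
    "Lassa virus": 28,
    "Coronaviruses": 172,
    "Adenoviruses": 10,
    "West Nile viruses": 396,
    "Arboviruses": 104,
    "Hepatitis C viruses": 502,
    "Measles": 10,
    "Herpes simplex viruses": 28,
    "Rubella": 16,
    "Rotaviruses": 263,
    "Varicella": 1,
    "Noroviruses": 77,
    "RSV": 6,
    "Small RNA viruses": 14,
    "Paramyxoviruses (HPIV3, HMPV)": 16,
}

non_viral_total = sum(non_viral.values())
viral_total = sum(viral.values())

print(non_viral_total)  # 6418
print(viral_total)      # 15562
```

The 102 human exomes are listed separately on the slide and are not part of either total.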
Systems Biology Centers for Infectious Diseases Research Goal: To use computational and experimental approaches to identify, model and predict the pathways and networks derived from pathogen/host interactions Key Features: Combination of multiple omics technologies Genetic and biochemical networks of pathogenesis Data integration and computational / predictive modeling Public release of data and other resources FY13 Awards Influenza, Ebola, West Nile, M. tuberculosis http://www.niaid.nih.gov/labsandresources/resources/dmid/sb/pages/default.aspx Slide Source: M.Y. Giovanni
NIAID Bioinformatics Resource Centers (BRCs) Goal: Provide integrated bioinformatics resources in support of basic and applied infectious diseases research Data management and integration solutions Integrated data sets with easy access Computational analysis tools Workbenches and web interfaces Training and outreach activities Bioinformatics services https://www.niaid.nih.gov/labsandresources/resources/dmid/brc/pages/default.aspx Slide Source: A. Yao
OMICS/Systems Biology Approach to Infectious Diseases Cell Lines, Animal Models, Tissues, Clinical Samples, Genomics Transcriptomics Metagenomics/Metaomics Functional Genomics/Proteomics Metabolomics Glycomics Lipidomics MICROBE HOST MICROBIOME Slide Source: M.Y. Giovanni
Systems Biology Systems Approach to TB Clinical Proteomics Proteomics GSC 409 Mtb strains Integration of transcription data Prediction of gene expression & network perturbations Discovery of novel regulatory interactions TB RESIST: International Genome Partnership Increase Knowledge Base 1000s TB genomes for genomic analysis to study & understand drug resistance Disease progression & Active disease in HIV +/- individuals BRC Structural Genomics 27 Mtb & 139 Mycobacterium structures M. tuberculosis Ras-like transport protein TB Community Annotation Project Slide Source: M.Y. Giovanni
PATRIC BRC (www.patricbrc.org) Integrated Omics Data: ~14,000 bacterial genomes and standardized annotations Free genome annotation service (RAST) Integrated genomic and omics data, metadata and tools Comparative analyses and interactive visualizations Personal workspace TB Portal Slide Source: A. Yao Protein-protein interactions Structures Proteomics, ChIP-Seq data Transcriptomics (Microarray, RNA-Seq) Pathways
NIAID Antimicrobial Resistance Research High-Throughput Omics Technologies Big Data Analysis Basic Pathogenesis Data System-Wide Regulation of Resistance Factors Molecular Networks of Host/Pathogen Interactions NIAID AR Research Identifying Targets for Novel Drugs and Vaccines Computational Modeling Predicting Drug Toxicity and Resistance Potential Identifying Biomarkers of Disease/Resistance Slide Source: M.Y. Giovanni
OGAT Antimicrobial Resistance Projects ESKAPE Pathogens Enterococcus faecium Staphylococcus aureus Klebsiella pneumoniae Acinetobacter baumannii Pseudomonas aeruginosa Enterobacter Genomics Structural Genomics Evolution Transmission Resistance Sensitive & resistant strains Real-time sequencing Diverse strains Slide Source: M.Y. Giovanni
NIAID/DMID Data Sharing, Release, and Access Guiding Principles Projects/Data Types Data Sharing Data Release Guidelines/Policy Implementation Detailed Plans What, When, Where Slide Source: M.Y. Giovanni
Data Sharing Guiding Principles Commit to data sharing with timely (rapid) public release of data into international databases Commit to generating clear, concise, easy-to-understand Data Release Guidelines and Policies; review and update policies Implement guidelines fairly and consistently Partner with NCBI, NIH institutes, other government agencies, domestic and international organizations Commit to working with the scientific community Acknowledge data generators Promote fair use of pre-publication data Commit to responsible stewardship of data released from human clinical samples with respect to privacy, identity, consent and confidentiality Include research resources such as reagents, organisms, and bioinformatic tools Slide Source: M.Y. Giovanni
Data & Resource Sharing and Release Guidelines Sequence, Genomics and Transcriptomics Data: Release into GenBank, dbSNP, other international databases and NIAID Bioinformatics Resource Centers within 45 days of generation. Clinical and Meta Data: Release with genomic data to NCBI and NIAID Bioinformatics Resource Centers; some data release delayed by 9 months. Other Omics and related data: Release into international databases and NIAID Bioinformatics Resource Centers 9-12 months from generation. Issues to consider: Privacy protection, deposition in controlled access database, release and trial design http://www.niaid.nih.gov/labsandresources/resources/dmid/Pages/data.aspx Slide Source: M.Y. Giovanni
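The release windows in the guidelines above lend themselves to a simple date calculation. The following is a minimal sketch, not an official NIAID tool: the category keys, the function name, and the approximation of one month as 30 days are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption for this sketch: one month is approximated as 30 days.
DAYS_PER_MONTH = 30

# Latest release window per data category, in days after data generation.
# The category keys are hypothetical labels for the guideline's groupings.
RELEASE_WINDOW_DAYS = {
    "sequence": 45,                           # sequence, genomics, transcriptomics
    "clinical_delayed": 9 * DAYS_PER_MONTH,   # clinical/metadata with delayed release
    "other_omics": 12 * DAYS_PER_MONTH,       # upper bound of the 9-12 month window
}


def latest_release_date(data_type: str, generated_on: date) -> date:
    """Return the last date by which data of this type should be released."""
    return generated_on + timedelta(days=RELEASE_WINDOW_DAYS[data_type])


print(latest_release_date("sequence", date(2014, 1, 1)))     # 2014-02-15
print(latest_release_date("other_omics", date(2014, 1, 1)))  # 2014-12-27
```

A real implementation would need to follow the actual policy text for how months are counted and which clinical data qualify for delayed release.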
Data and Metadata Standards Issues Lack of standard data fields for infectious disease studies Sharing and access of metadata by scientific community Is available metadata enough to enable other people's research? Solutions Negotiating data release policy Collaboration between domain experts and bioinformaticians Establishing best practices for data collection and transfer to public databases Accomplishments Established data sharing/release policies Developed standardized data collection process for a variety of data types Implemented standards in BRC for data integration Slide Source: A. Yao
Standardized Metadata and Clinical Data for Infectious Diseases Research Rationale ~20,000 microbial genomes sequenced by GSCs between 2001 and 2013 Standards for capture of human pathogen and vector sequencing metadata designed to support epidemiologic and phenotype-genotype association studies Capture of clinical metadata associated with clinical isolates Processes Working groups composed of domain experts and bioinformaticians from GSCs and BRCs External Expert Advisory Committee to provide guidance on human subject privacy and confidentiality issues Seek feedback from the larger scientific community Slide Source: A. Yao
Developing GSCID/BRC Project and Sample Application Standard Support epidemiologic and phenotype-genotype association studies Working groups composed of domain experts and bioinformaticians, and monitored by NIAID Utilize existing standards and ontologies Map to terms from other metadata standards initiatives, including the Genomic Standards Consortium's minimal information (MIxS) and NCBI's BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). External Expert Advisory Committee to provide guidance on human subject privacy and re-identifiability issues Seek feedback from the larger scientific community Data collection templates tested/used by data generators and implemented in BRCs Slide Source: A. Yao
Standards for Clinical and Metadata Collection GSCID/BRC Project and Sample Application Standard Metadata about core project, core sample, sequencing assay, project specific metadata, and pathogen specific metadata. Common Clinical Data Elements General fields specific to human hosts (e.g., race/ethnicity), Infection, Physical examination, Diagnosis, Symptoms, Laboratory tests, Treatments, and Vaccination Slide Source: A. Yao
GSC/BRC Project and Sample Metadata Categories Application Standard core project, core sample, sequencing assay, project specific metadata, and pathogen specific metadata Data Fields organisms of specimen environmental source of the specimen spatial-temporal information about the specimen isolation event phenotypic characteristics of the pathogen/vector isolated sequencing and data processing methodologies used quality and coverage of the resulting sequences project leadership and support Slide Source: A. Yao
GSCID/BRC Core Sample v1.3 Specimen Source ID Specimen Category Specimen Source Species Common Name Source Gender, Age (units), Health Status Source Disease Collection Date, Latitude/Longitude, Location, Country Specimen Type Suspected Organism in specimen (species, subclass) Human pathogenicity Environmental material Detection Method Repository
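To make the Core Sample field list more concrete, here is a minimal sketch of how a submitter might hold one sample record and see which fields are still unfilled. The class name, the snake_case attribute names, and the example values are hypothetical; they paraphrase a subset of the slide's field labels and are not the identifiers used in the actual GSCID/BRC templates. The slide does not say which fields are mandatory, so the helper only reports empty fields.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CoreSample:
    """Hypothetical container for a subset of the Core Sample v1.3 fields."""
    specimen_source_id: Optional[str] = None
    specimen_category: Optional[str] = None
    specimen_source_species: Optional[str] = None
    specimen_source_common_name: Optional[str] = None
    specimen_source_gender: Optional[str] = None
    specimen_source_age: Optional[str] = None
    specimen_source_health_status: Optional[str] = None
    specimen_source_disease: Optional[str] = None
    collection_date: Optional[str] = None
    collection_latitude_longitude: Optional[str] = None
    collection_location: Optional[str] = None
    collection_country: Optional[str] = None
    specimen_type: Optional[str] = None
    suspected_organism: Optional[str] = None
    human_pathogenicity: Optional[str] = None
    environmental_material: Optional[str] = None
    detection_method: Optional[str] = None
    repository: Optional[str] = None


def missing_fields(sample: CoreSample) -> List[str]:
    """List the names of fields that have not been filled in yet."""
    return [f.name for f in fields(sample) if getattr(sample, f.name) is None]


# Illustrative values only; not taken from any real submission.
sample = CoreSample(
    specimen_source_id="EXAMPLE-001",
    specimen_source_species="Homo sapiens",
    collection_country="USA",
)
print(missing_fields(sample))
```

Keeping the record as a flat set of optional fields mirrors the template-style collection described in the slides; stricter validation (controlled vocabularies, date formats) would be layered on top.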
GSCID/BRC Sequencing Assay v1.3 Sequencing Facility Nucleic Acid Extraction Method Nucleic Acid Preparation Method Sequencing Technology Assembly Name Assembly Method Genome Coverage Annotation Provider Annotation Method GenBank Record ID
GSCID/BRC Bacteria Specific v1.3 Antibiotic Sensitivity Bacteria Biovar Chromosome Content Extra Chromosomal Elements Bacteria Pathovar Serotype Serotyping Method
Common Clinical Data Elements Supplement the GSCID/BRC Project and Sample Application Standard by covering the required information related to a human subject. Data Fields General fields specific to human hosts (e.g., race/ethnicity) Infection Physical examination Diagnosis Symptoms Laboratory tests Treatments Vaccination
Resources on NIAID Website DMID Data Sharing and Release Guidelines http://www.niaid.nih.gov/labsandresources/resources/dmid/pages/data.aspx Metadata Implementation Guidelines http://www.niaid.nih.gov/labsandresources/resources/dmid/pages/metadata.aspx DMID metadata standards http://www.niaid.nih.gov/labsandresources/resources/dmid/pages/metadatastandards.aspx
Acknowledgements Office of Genomics and Advanced Technologies (OGAT) DMID http://www.niaid.nih.gov/topics/pathogengenomics/pages/default.aspx Maria Giovanni Valentina Di Francesco Eun Mi Lee Punam Mathur Malu Polanski Julia Puzak Alison Yao