Human Genome and Human Genome Project. Louxin Zhang




A Primer to Genomics
Cells are the fundamental working units of every living system. DNA is made of 4 nucleotide bases. The DNA sequence is the particular side-by-side arrangement of bases along the DNA strand. This order spells out the exact instructions required to create a particular organism. The genome is an organism's complete set of DNA. Except for mature red blood cells, all human cells contain a complete genome arranged in 24 distinct chromosomes (22 autosomes plus the X and Y sex chromosomes).

A Primer to Genomics
Each chromosome contains many genes, the basic physical and functional units of heredity. Genes are specific sequences of bases that encode instructions on how to make proteins. Proteins perform most life functions and even make up the majority of cellular structures. Proteins are large, complex molecules made up of smaller subunits called amino acids. A protein folds into a specific three-dimensional structure that defines its particular function in the cell.

Human Genome Project: Background
The HGP arose from two key insights in the early 1980s.
1. The ability to take global views of genomes could greatly accelerate biomedical research by allowing researchers to attack problems in a comprehensive fashion.
2. The creation of such global views would require a communal effort in infrastructure research.

Human Genome Project: Background
Key projects helped to crystallize these insights, including:
i) The sequencing of some bacterial and animal viruses, as well as the human mitochondrion, between 1977 and 1982.
ii) The development of (random) shotgun sequencing of long DNA fragments for high-throughput gene discovery, later dubbed expressed sequence tags (ESTs), and of computer programs for assembly.

How does the human genome stack up?

Organism                             Genome Size (Bases)   Estimated Genes
Human (Homo sapiens)                 3 billion             30,000
Laboratory mouse (M. musculus)       2.6 billion           30,000
Mustard weed (A. thaliana)           100 million           25,000
Roundworm (C. elegans)               97 million            19,000
Fruit fly (D. melanogaster)          137 million           13,000
Yeast (S. cerevisiae)                12.1 million          6,000
Bacterium (E. coli)                  4.6 million           3,200
Human immunodeficiency virus (HIV)   9700                  9
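One way to read the comparison is as gene density: how many bases of genome there are, on average, per estimated gene. The short Python sketch below does that arithmetic using the round figures quoted on this slide. The dictionary and function names are illustrative only, and the results are no more precise than the slide's estimates.

```python
# Genome size (bases) and estimated gene count per organism,
# copied from the round figures quoted on the slide.
GENOMES = {
    "Human": (3_000_000_000, 30_000),
    "Mouse": (2_600_000_000, 30_000),
    "A. thaliana": (100_000_000, 25_000),
    "C. elegans": (97_000_000, 19_000),
    "D. melanogaster": (137_000_000, 13_000),
    "S. cerevisiae": (12_100_000, 6_000),
    "E. coli": (4_600_000, 3_200),
    "HIV": (9_700, 9),
}


def bases_per_gene(genome_size, gene_count):
    """Average number of genome bases per estimated gene."""
    return genome_size / gene_count


# Print one line per organism, largest spacing between genes first.
for name, (size, genes) in sorted(
    GENOMES.items(), key=lambda item: bases_per_gene(*item[1]), reverse=True
):
    print(f"{name:16s} {bases_per_gene(size, genes):>10,.0f} bases per gene")
```

With these figures the human genome works out to about 100,000 bases per gene, against roughly 1,400 for E. coli, which is consistent with the later point that most human DNA does not code for protein.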

Human Genome Project Goals
The idea of sequencing the entire human genome was first proposed in discussions at scientific meetings from 1984 to 1986. A broader programme was recommended in a 1988 report by the U.S. National Research Council (NRC):
Sequencing the human genome: creation of genetic, physical and sequence maps of the human genome.
Parallel efforts in key model organisms.
The development of technology in support of these objectives.
Research into the ethical, legal, and social issues raised by the programme.

Human Genome Project Milestones
1990: Project initiated as a joint effort of the U.S. Department of Energy and the National Institutes of Health.
June 2000: Completion of a working draft of the entire human genome.
February 2001: Analyses of the working draft are published.
April 2003: HGP sequencing is completed and the Project is declared finished, two years ahead of schedule.
Source: U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003

By the Numbers: What does the draft human genome sequence tell us?
The human genome contains 3 billion chemical nucleotide bases (A, C, T, and G). The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene being dystrophin at 2.4 million bases. The total number of genes is estimated at around 30,000, much lower than previous estimates of 80,000 to 140,000. Almost all (99.9%) nucleotide bases are exactly the same in all people. The functions are unknown for over 50% of discovered genes.
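Two of these figures are easier to appreciate as absolute numbers. The sketch below is simple arithmetic on the slide's round values, not a measurement: it converts the 99.9% figure into a count of bases and compares dystrophin with the average gene. All names are illustrative.

```python
GENOME_SIZE = 3_000_000_000   # bases in the human genome (slide figure)
AVERAGE_GENE = 3_000          # bases in an average gene (slide figure)
DYSTROPHIN = 2_400_000        # bases in the dystrophin gene (slide figure)
SHARED_PER_THOUSAND = 999     # 99.9% of bases are the same in all people


def differing_bases(genome_size, shared_per_thousand):
    """Number of bases outside the shared fraction, using integer arithmetic."""
    return genome_size * (1000 - shared_per_thousand) // 1000


print(differing_bases(GENOME_SIZE, SHARED_PER_THOUSAND))  # 3000000
print(DYSTROPHIN // AVERAGE_GENE)                         # 800
```

So the 0.1% that varies still amounts to about 3 million bases, and dystrophin is about 800 times the length of an average gene.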

What does the draft human genome sequence tell us? How It's Arranged
The human genome's gene-dense "urban centers" are rich in the DNA nucleotides G and C. In contrast, the gene-poor "deserts" are rich in the DNA nucleotides A and T.
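The contrast between GC-rich and AT-rich regions can be measured by computing GC content in windows along a sequence. The following is a minimal Python sketch under simple assumptions: the sequence is a plain string of A, C, G and T, windows do not overlap unless a step is given, and the function names are illustrative rather than taken from any particular toolkit.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step=None):
    """GC content of each full window; returns (start, fraction) pairs."""
    step = step or window  # default: non-overlapping windows
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
toy = "GGCCGCGC" + "ATATTAAT"
for start, fraction in windowed_gc(toy, window=8):
    print(start, fraction)
```

On the toy sequence the first window has a GC fraction of 1.0 and the second 0.0. Real genomic windows are far longer, typically thousands of bases, and show much smaller differences.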

What does the draft human genome sequence tell us?
Genes appear to be concentrated in random areas along the genome, with vast expanses of noncoding DNA between. Chromosome 1 has the most genes (2968), and the Y chromosome has the fewest (231).

The Wheat from the Chaff: What does the draft human genome sequence tell us?
Less than 2% of the genome codes for proteins. Repeated sequences that do not code for proteins ("junk DNA") make up at least 50% of the human genome. Repetitive sequences shed light on chromosome structure and dynamics. Over time, these repeats reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling existing genes.

What does the draft human genome sequence tell us?
The repeats fall into five classes:
i) Transposon-derived repeats, known as interspersed repeats.
ii) Inactive retroposed copies of cellular genes, known as processed pseudogenes. These are nonfunctional copies of the exon sequences of an active gene, thought to arise by integration into chromosomes of a natural cDNA sequence generated by reverse transcription.
iii) Repeats of short k-mers such as (A)n, (CA)n, (AAT)n. Since they show a high degree of length polymorphism in the human population, (CA)n repeats have been used as genetic markers in genetic mapping.

What does the draft human genome sequence tell us?
iv) Segmental duplications, consisting of blocks of 10-300 kb that have been copied from one region of the genome into another region. Such duplications often appear in the pericentromeres and subtelomeres of chromosomes. Recurrent structural rearrangements in duplication regions give rise to contiguous gene syndromes.
v) Tandemly repeated sequences, usually at centromeres, telomeres, the short arms of acrocentric chromosomes, and ribosomal gene clusters. These regions are underrepresented in the draft genome sequence.
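Class iii, the short k-mer repeats such as (CA)n, is the easiest to illustrate in code. The Python sketch below finds runs of a given repeat unit with a regular expression and reports the copy number of each run. It is an illustration only: it matches perfect repeats, whereas real repeat annotation must tolerate mismatches, and the function name and toy alleles are made up for the example.

```python
import re


def find_tandem_repeats(seq, unit, min_copies=3):
    """Return (start, copies) for each run of `unit` repeated >= min_copies times."""
    unit = unit.upper()
    # Builds a pattern such as (?:CA){3,} for unit="CA", min_copies=3.
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [
        (match.start(), len(match.group()) // len(unit))
        for match in pattern.finditer(seq.upper())
    ]


# Two toy alleles of the same locus that differ only in (CA)n copy number.
allele_a = "TTG" + "CA" * 5 + "GGT"
allele_b = "TTG" + "CA" * 8 + "GGT"
print(find_tandem_repeats(allele_a, "CA"))  # [(3, 5)]
print(find_tandem_repeats(allele_b, "CA"))  # [(3, 8)]
```

The differing copy numbers at the same position (5 versus 8 here) are the kind of length polymorphism that makes (CA)n repeats usable as genetic markers.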

What does the draft human genome sequence tell us? How the Human Compares with Other Organisms
Unlike the human's seemingly random distribution of gene-rich areas, many other organisms' genomes are more uniform, with genes evenly spaced throughout. Humans have on average three times as many kinds of proteins as the fly or worm because of mRNA transcript "alternative splicing" and chemical modifications to the proteins. Humans share most of the same protein families with worms, flies, and plants, but the number of gene family members has expanded in humans, especially in proteins involved in development and immunity. The human genome has a much greater portion (50%) of repeat sequences than the mustard weed (11%), the worm (7%), and the fly (3%).

What does the draft human genome sequence tell us? Variations and Mutations
Scientists have identified about 3 million locations where single-base DNA differences (SNPs) occur in humans. This information promises to revolutionize the processes of finding chromosomal locations for disease-associated sequences and tracing human history. The ratio of germline (sperm or egg cell) mutations is 2:1 in males vs females. Researchers point to several reasons for the higher mutation rate in the male germline, including the greater number of cell divisions required for sperm formation than for eggs.

Future Challenges: What We Still Don't Know
Gene number, exact locations, and functions
Noncoding DNA types, amount, distribution, information content, and functions
Functional genomics
Evolutionary conservation among organisms
Proteomes (total protein content and function) in organisms
Correlation of SNPs (single-base DNA variations among individuals) with health and disease
Genes involved in complex traits and multigene diseases

Anticipated Benefits of Genome Research
Molecular Medicine: improve diagnosis of disease; create drugs based on molecular information; design custom drugs (pharmacogenomics) based on individual genetic profiles.
Microbial Genomics: rapidly detect and treat pathogens (disease-causing microbes) in clinical practice; protect citizenry from biological and chemical warfare.

Anticipated Benefits of Genome Research (cont.)
Risk Assessment: evaluate the health risks faced by individuals who may be exposed to radiation (including low levels in industrial areas) and to cancer-causing chemicals and toxins.
Bioarchaeology, Anthropology, Evolution, and Human Migration: study evolution through germline mutations in lineages; study migration of different population groups based on maternal inheritance; study mutations on the Y chromosome to trace lineage and migration of males.

Anticipated Benefits of Genome Research (cont.)
DNA Identification (Forensics): identify potential suspects whose DNA may match evidence left at crime scenes; exonerate persons wrongly accused of crimes; identify crime and catastrophe victims; establish paternity and other family relationships.

Anticipated Benefits of Genome Research (cont.)
Agriculture, Livestock Breeding, and Bioprocessing: grow disease-, insect-, and drought-resistant crops; breed healthier, more productive, disease-resistant farm animals; grow more nutritious produce; develop biopesticides; incorporate edible vaccines into food products; develop new environmental cleanup uses for plants like tobacco.

Sequencing Strategy: Hierarchical shotgun sequencing