Lecture Outline. Introduction to Databases. Introduction. Data Formats Sample databases How to text search databases. Shifra Ben-Dor Irit Orr


Introduction to Databases
Shifra Ben-Dor, Irit Orr

Lecture Outline
  Introduction
  Data and Database types
  Database components
  Data Formats
  Sample databases
  How to text search databases

What units of information do we deal with in bioinformatics?
  SNPs
  Nucleotide sequence
  Genes
  mRNA
  AAGTGCCACTGCATAAATGACCATGAGTGGGCACCGGTAAGGGAGGGTGATGCTATCTGGTCTGAAG
  DNA, RNA, Protein
  Sequence, Structure, Evolution, Pathways, Interactions, Mutations
  Protein primary sequence
  Protein 3D structure
  Protein Function

Examples of protein function annotation:
  Acts as a tumor suppressor in many tumor types. Induces growth arrest or apoptosis depending on the physiological circumstances or cell type, but both activities are involved in tumor suppression.
  Involved in the transport of chloride ions. Defects in CFTR are the cause of cystic fibrosis. It is the most common genetic disease in the Caucasian population, with a prevalence of about 1 in 2000 live births. CF, an autosomal recessive disorder, is a common generalized disorder of exocrine gland function.
Slide provided by Dr. Vered Caspi

All of these have databases and tools that were created to work with them.

What do we want from databases?
Information retrieval from sequence databases
  Biological databases contain enormous amounts of data.
  Databases need to be well annotated.
  Databases need to be easily searched.
  Data found in databases should be easily retrieved.
  Data in databases should be in standard formats.

Integrated Information Retrieval
  Many databases contain logical relations between specific entries.
  One interface connecting many biological databases.
  For example: a database that connects protein sequence, protein domain, protein structure and reference databases (InterPro).
  Another example: a connection between references, protein sequence, DNA sequence, and structure databases (Entrez).
Slide provided by Dr. Vered Caspi

Core Data and Annotation
Databases generally have (at least) two types of data:
  Core data: the data the database was generated to organize
  Annotation: extra information that rounds out our picture of the core data
For example, in a genome database the sequence is the core data, and the location of genes is the annotation.

Database Issues
  Printed journals vs. databases
  Direct submission to databases (e.g. GenBank, GDB, PDB)
  Archival vs. curated databases
  Databases that publish experimental results of large genomic centers
  Public vs. private databases
  Database scope

For Example: Classification of Genomic Databases
  Scope: Many genomes; One Genome; One Subject; One Gene
  Information source: Direct submission from scientific community; Scientific literature; Genome center's experimental results; Other databases
  Information type: Mapping; Sequence & annotation; Protein structure & function; Variations; Comparative genomics; Gene networks
Slide provided by Dr. Vered Caspi

User Interface
  Database search: free text; field-specific; sequence-based
  Database output: text; graphics; dynamic
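The distinction between core data and annotation, and between free-text and field-specific search, can be sketched in a few lines of Python. This is only an illustrative model, not how any real database is implemented: the `Entry` class, the accession values `EX0001`/`EX0002`, the field names and both search functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    accession: str                 # hypothetical identifier
    sequence: str                  # core data
    annotation: Dict[str, str] = field(default_factory=dict)  # extra information


def free_text_search(entries: List[Entry], term: str) -> List[Entry]:
    """Match the term against every annotation field (case-insensitive)."""
    needle = term.lower()
    return [e for e in entries
            if any(needle in value.lower() for value in e.annotation.values())]


def field_search(entries: List[Entry], field_name: str, term: str) -> List[Entry]:
    """Match the term against one named annotation field only."""
    needle = term.lower()
    return [e for e in entries
            if needle in e.annotation.get(field_name, "").lower()]


# Two made-up entries; the second mentions "Homo sapiens" only in its definition.
entries = [
    Entry("EX0001", "ACACTGCGCT",
          {"organism": "Homo sapiens", "definition": "crystallin, alpha A"}),
    Entry("EX0002", "CCATTTAGAG",
          {"organism": "Mus musculus", "definition": "homolog of a Homo sapiens gene"}),
]

print([e.accession for e in free_text_search(entries, "homo sapiens")])
print([e.accession for e in field_search(entries, "organism", "homo sapiens")])
```

The free-text search returns both entries because the phrase occurs somewhere in each annotation, while the field-specific search on `organism` returns only the first. That difference is why field-specific queries tend to be more precise.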

Data Formats
There are many data formats used for sequences (both nucleic and amino acid):
  Fasta Format
  GenBank Format
  EMBL Format
  GCG Format

Fasta Format
  Simplest format
  Least information
  Starts with a > and sequence name on one line
  The sequence in plain text follows

>OB2T2
GTGACAACATGTACAGCTGTGAGCGGTGTAAGAAGCTGCGGAACGGAGTGAAGTACTGCA
AAGTCCTGCGGTTGCCCGAGATCCTGTGCATTCACCTAAAGCGCTTTCGGCACGAGGTGA
TGTACTCATTCAAGATCAACAGCCACGTCTCCTTGCCCTCGAGGGGCTCGACCTGCGCCC
CTTCCTTGCCAAGGAGTGCACATCCCAGATCACCACCTACGACCTCCTCTCGGTCATCTG
CCACCACGGCACGGCAGGCA
>TNRC_HUMAN P36941 (tumor necrosis factor c receptor)
MLLPWATSAPGLAWGPLVLGLFGLLAASQPQAVPPYASENQTCRDQEKEYYEPQHRICCS
RCPPGTYVSAKCSRIRDTVCATCAENSYNEHWNYLTICQLCRPCDPVMGLEEIAPCTSKR
KTQCRCQPGMFCAAWALECTHCELLSDCPPGTEAELKDEVGKGNNHCVPCKAGHFQNTSS
PSARCQPHTRCENQGLVEAAPGTAQSDTTCKNPLEPLPPEMSGTMLMLAVLLPLAFFLLL
ATVFSCIWKSHPSLCRKLGSLLKRRPQGEGPNPVAGSWEPPKAHPYFPDLVQPLLPISGD
VSPVSTGLPAAPVLEAGVPQQQSPLDLTREPQLEPGEQSQVAHGTNGIHVTGGSMTITGN
IYIYNGPVLGGPPGPGDLPATPEPPYPIPEEGDPGPPGLSTPHQEDGKAWHLAETEHCGA
TPSNRGPRNQFITHD
>TNRC_MOUSE P50284 lymphotoxin-beta receptor precursor
MRLPRASSPCGLAWGPLLLGLSGLLVASQPQLVPPYRIENQTCWDQDKEYYEPMHDVCCS
RCPPGEFVFAVCSRSQDTVCKTCPHNSYNEHWNHLSTCQLCRPCDIVLGFEEVAPCTSDR
KAECRCQPGMSCVYLDNECVHCEEERLVLCQPGTEAEVTDEIMDTDVNCVPCKPGHFQNT
SSPRARCQPHTRCEIQGLVEAAPGTSYSDTICKNPPEPGAMLLLAILLSLVLFLLFTTVL
ACAWMRHPSLCRKLGTLLKRHPEGEESPPCPAPRADPHFPDLAEPLLPMSGDLSPSPAGP
PTAPSLEEVVLQQQSPLVQARELEAEPGEHGQVAHGANGIHVTGGSVTVTGNIYIYNGPV
LGGTRGPGDPPAPPEPPYPTPEEGAPGPSELSTPYQEDGKAWHLAETETLGCQDL
>TNR1_RAT P22934 tumor necrosis factor receptor 1 precursor (p60)
MGLPIVPGLLLSLVLLALLMGIHPSGVTGLVPSLGDREKRDNLCPQGKYAHPKNNSICCT
KCHKGTYLVSDCPSPGQETVCEVCDKGTFTASQNHVRQCLSCKTCRKEMFQVEISPCKAD
MDTVCGCKKNQFQRYLSETHFQCVDCSPCFNGTVTIPCKEKQNTVCNCHAGFFLSGNECT
PCSHCKKNQECMKLCLPPVANVTNPQDSGTAVLLPLVIFLGLCLLFFICISLLCRYPQWR
PRVYSIICRDSAPVKEVEGEGIVTKPLTPASIPAFSPNPGFNPTLGFSTTPRFSHPVSST
PISPVFGPSNWHNFVPPVREVVPTQGADPLLYGSLNPVPIPAPVRKWEDVVAAQPQRLDT
ADPAMLYAVVDGVPPTRWKEFMRLLGLSEHEIERLELQNGRCLREAHYSMLEAWRRRTPR
HEATLDVVGRVLCDMNLRGCLENIRETLESPAHSSTTHLPR
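Because a FASTA record is just a ">" header line followed by sequence lines, it can be read with very little code. The following is a minimal sketch, assuming well-formed input. The function name `parse_fasta` is made up for this illustration, and the example sequences are shortened fragments of the records shown above.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, description, sequence) for each FASTA record."""
    name: Optional[str] = None
    description = ""
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if name is not None:
                # a new header closes the previous record
                yield name, description, "".join(chunks)
            header = line[1:].split(None, 1)
            name = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif name is not None:
            chunks.append(line)  # sequence lines are concatenated
    if name is not None:
        yield name, description, "".join(chunks)


# Shortened fragments of two records, for illustration only.
example = """>OB2T2
GTGACAACATGTACAGCTGTGAGC
GGTGTAAGAAGCTGCGG
>TNRC_HUMAN P36941 (tumor necrosis factor c receptor)
MLLPWATSAPGLAWGPLVLG
"""

records = list(parse_fasta(example.splitlines()))
for name, description, sequence in records:
    print(name, len(sequence), description)
```

The first word after ">" is treated as the sequence name and the rest of the line as a free-text description. This matches the headers above, but it is a convention rather than a strict rule of the format.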

Genbank sequence format

NM_ Homo sapiens crys...[gi: ]
LOCUS       NM_ bp mRNA PRI 15-MAY-2001
DEFINITION  Homo sapiens crystallin, alpha A (CRYAA), mRNA.
ACCESSION   NM_
VERSION     NM_ GI:
KEYWORDS    .
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1114)
  AUTHORS   Jaworski,C.J. and Piatigorsky,J.
  TITLE     A pseudo-exon in the functional human alpha A-crystallin gene
  JOURNAL   Nature 337 (6209), (1989)
  MEDLINE
  PUBMED
REFERENCE   2 (bases 1 to 1114)
  AUTHORS   Jaworski,C.J.
  TITLE     A reassessment of mammalian alpha A-crystallin sequences using
            DNA sequencing: implications for anthropoid affinities of tarsier
FEATURES             Location/Qualifiers
     source
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="21"
                     /map="21q22.3"
     gene
                     /gene="CRYAA"
                     /note="crya1"
                     /db_xref="locusid:1409"
                     /db_xref="mim:123580"
     misc_feature
                     /note="crystallin; Region: Alpha crystallin A chain"
     CDS
                     /gene="CRYAA"
                     /note="human alphaa-crystallin; crystallin, alpha-1"
                     /codon_start=1
                     /db_xref="locusid:1409"
                     /db_xref="mim:123580"
                     /product="crystallin, alpha A"
                     /protein_id="np_ "
                     /db_xref="gi: "
                     /translation="MDVTIQHPWFKRTLGPFYPSRLFDQFFGEGLFEYDLLPFL
                     SSTISPYYRQSLFRTVLDSGISEVRSDRDKFVIFLDVKHFSP
                     EDLTVKVQDDFVEIHGKHNERQDDHGYISREFHRRYRLPS
                     NVDQSALSCSLSADGMLTFCGPKIQTGLDATHAERAIPVSR
                     EEKPTSAPSS"
     misc_feature
                     /note="hsp20; Region: Hsp20/alpha crystallin family"
     polyA_signal
BASE COUNT      183 a    400 c    309 g    222 t
ORIGIN
        1 acactgcgct gcccagaggc cccgctgact cctgccagcc tccaggtccc cgtggtacca
       61 aagctgaaca tggacgtgac catccagcac ccctggttca agcgcaccct ggggcccttc
      121 taccccagcc ggctgttcga ccagtttttc ggcgagggcc tttttgagta tgacctgctg
      181 cccttcctgt cgtccaccat cagcccctac taccgccagt ccctcttccg caccgtgctg
      241 gactccggca tctctgaggt tcgatccgac cgggacaagt tcgtcatctt cctcgatgtg
      301 aagcacttct ccccggagga cctcaccgtg aaggtgcagg acgactttgt ggagatccac
      361 ggaaagcaca acgagcgcca ggacgaccac ggctacattt cccgtgagtt ccaccgccgc
      421 taccgcctgc cgtccaacgt ggaccagtcg gccctctctt gctccctgtc tgccgatggc
      481 atgctgacct tctgtggccc caagatccag actggcctgg atgccaccca cgccgagcga
      541 gccatccccg tgtcgcggga ggagaagccc acctcggctc cctcgtccta agcaggcatt
      601 gcctcggctg gctcccctgc agccctggcc catcatgggg ggagcaccct gagggcgggg
      661 tgtctgtctt cctttgcttc ccttttttcc tttccacctt ctcacatgga atgagggttt
      721 gagagagcag ccaggagagc ttagggtctc agggtgtccc agaccccgac accggccagt
      781 ggcggaagtg accgcacctc acactccttt agatagcagc ctggctcccc tggggtgcag
      841 gcgcctcaac tctgctgagg gtccagaagg agggggtgac ctccggccag gtgcctcctg
      901 acacacctgc agcctccctc cgcggcgggc cctgcccaca cctcctgggg cgcgtgaggc
      961 ccgtggggcc ggggcttctg tgcacctggg ctctcgcggc ctcttctctc agaccgtctt
     1021 cctccaaccc ctctatgtag tgccgctctt ggggacatgg gtcgcccatg agagcgcagc
     1081 ccgcggcaat caataaacag caggtgatac aagc
//
Revised: October 24,

EMBL sequence format

ID   A standard; DNA; FUN; 581 BP.
AC   AJ279484;
SV   AJ
DT   14-JAN-2000 (Rel. 62, Created)
DT   14-JAN-2000 (Rel. 62, Last updated, Version 2)
DE   Unidentified ascomycota sp. 4/ S rRNA gene and ITS 1 and 2
KW   5.8S ribosomal RNA; 5.8S rRNA gene; internal transcribed spacer 1;
KW   internal transcribed spacer 2; ITS1; ITS2.
OS   ascomycota sp. 4/97-9
OC   Eukaryota; Fungi; Ascomycota.
RN   [1]
RP
RA   Wirsel S.G.R.;
RT   ;
RL   Submitted (21-DEC-1999) to the EMBL/GenBank/DDBJ databases.
RL   Wirsel S.G.R., Fakultaet fuer Biologie, Universitaet Konstanz,
RL   Universitaetsstr. 10, Konstanz 78434, Germany.
RN   [2]
RA   Wirsel S.G.R., Leibinger W., Mendgen K.W.;
RT   "Genetic diversity of fungi associated with common reed (Phragmites
RT   australis)";
RL   Unpublished.
FH   Key             Location/Qualifiers
FH
FT   source
FT                   /db_xref="taxon:112223"
FT                   /organism="ascomycota sp. 4/97-9"
FT                   /isolate="4/97-9"
FT   misc_feature
FT                   /note="internal transcribed spacer 1, ITS1"
FT   rRNA
FT                   /gene="5.8S rRNA"
FT                   /product="5.8S ribosomal RNA"
FT   misc_feature
FT                   /note="internal transcribed spacer 2, ITS2"
SQ   Sequence 581 BP; 132 A; 164 C; 145 G; 140 T; 0 other;
     ccatttagag gaagtaaaag tcgtaacaag gtctccgttg gtgaaccagggagggatc 60
     ttacgagagt gtcaccactc ccaacccact gtttacctac ccgtccaccg tgcttcggca 120
     ggcagtcctg tgggacaggg cctcgccccc ctccgggggg tgcctgccgc

EMBL entry
Each line in the entry begins with a two-character line code, which indicates the type of information contained in the line. The currently used line types, along with their respective line codes, are listed below:
  ID - identification (begins each entry; 1 per entry)
  AC - accession number (>=1 per entry)
  SV - sequence version (1 per entry)
  DT - date (2 per entry)
  DE - description (>=1 per entry)
  KW - keyword (>=1 per entry)
  OS - organism species (>=1 per entry)
  OC - organism classification (>=1 per entry)
  OG - organelle (0 or 1 per entry)
  RN - reference number (>=1 per entry)
  RC - reference comment (>=0 per entry)
  RP - reference positions (>=1 per entry)
  RX - reference cross-reference (>=0 per entry)
  RA - reference author(s) (>=1 per entry)
  RT - reference title (>=1 per entry)
  RL - reference location (>=1 per entry)
  DR - database cross-reference (>=0 per entry)
  FH - feature table header (0 or 2 per entry)
  FT - feature table data (>=0 per entry)
  CC - comments or notes (>=0 per entry)
  XX - spacer line (many per entry)
  SQ - sequence header (1 per entry)
  bb - (blanks) sequence data (>=1 per entry)
  // - termination line (ends each entry; 1 per entry)
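Since every line of an EMBL entry starts with a two-character line code, one simple way to read an entry is to group its lines by that code. The sketch below assumes a single well-formed entry. The function name `group_embl_lines`, the `"sequence"` key and the abbreviated example entry are made up for illustration and are not part of the format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_embl_lines(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group the lines of one EMBL entry by their two-character line code."""
    fields: Dict[str, List[str]] = defaultdict(list)
    in_sequence = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):
            break  # termination line ends the entry
        if in_sequence:
            # sequence data lines start with blanks; keep letters only,
            # which drops the spacing and the trailing position count
            fields["sequence"].append("".join(ch for ch in line if ch.isalpha()))
            continue
        code, content = line[:2], line[2:].strip()
        if code == "XX" or not code.strip():
            continue  # spacer or empty line carries no information
        fields[code].append(content)
        if code == "SQ":
            in_sequence = True  # everything up to // is sequence data
    return dict(fields)


# Abbreviated, made-up entry in the layout described above.
entry = """ID   EXAMPLE standard; DNA; FUN; 30 BP.
XX
DE   example description line
KW   ITS1; ITS2.
XX
SQ   Sequence 30 BP;
     ccatttagag gaagtaaaag        20
     tcgtaacaag                   30
//
"""

fields = group_embl_lines(entry.splitlines())
print(fields["ID"])
print("".join(fields["sequence"]))
```

Codes that may occur several times (for example `DT`, `KW` or `RL`) simply accumulate as several list items, in the order they appear.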

GCG Format
  Has space for comments and space for data, separated by two dots (..)
  Can contain full sequence data like GenBank or EMBL
  Has a minimum of sequence name, length, date, type (nucleic or amino acid) and checksum

!!NA_SEQUENCE 1.0
5B3.seq Length: 744 March 18, :43 Type: N Check:
    1 TCTAGAGGAG AYATYGTWAT GACCCAGTCT CCATCCTCCC TGAGTGTGTC
   51 AGCAGGAGAG AAGGTCACTA TGAGCTGCAA GTCCAGTCAG AGTCTGTTAA
  101 ACAGTAGAAA TCAAAAGAAC TACTTGGCCT GGTACCAGCA GAAACCAGGA
  151 CAGCCTCCTA AACTTTTGAT CTACGGGGTA TTTATTAGGG ATTCTGGGGT
  201 CCCTGATCGC TTCACAGGCA GTGGATCTGG AACCGATTTC ACTCTTACCA
  251 TCAGCAGTGT GCAGGCTGAA GACCTGGCAG TTTATTACTG TCAGAATGAT
  301 CATATTTATC CGTACACGTT CGGAGGGGGC ACWAAGCTGG AAATTAAAGG
  351 GTCGACTTCC GGTAGCGGCA AATCCTCTGA AGGCAAAGGT SAGGTSCAGC
  401 TGCAGGAGTC TGGACCTGGC CTGGTGAAGC CTTCCCAGTC TCTGTCCCTC
  451 ACCTGCTCTG TCACTGGTTA CTCAATCACC AGTGGTTATG CCTGGAACTG
  501 GATCCGGCAG TTTCCAGGAA ACAAACTGGA GTGGATMGGC TACATAAGCT
  551 ACAGTGGTTT CACTAGCTAC AACCCATCTC TCAGAAGTCG AATCTCTTTC
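The "comments, two dots, data" layout suggests a simple way to read a GCG file: split at the divider, pull the named fields out of the header, and strip position numbers from the data. The sketch below makes several assumptions. It treats the first ".." in the text as the divider, which fails if a comment itself contains two dots. The function name `parse_gcg`, the file name `example.seq` and the checksum value `1234` are made up, and the checksum is not verified.

```python
import re
from typing import Dict, Tuple


def parse_gcg(text: str) -> Tuple[Dict[str, str], str]:
    """Split a GCG-style file into header fields and the bare sequence."""
    header, divider, body = text.partition("..")
    if not divider:
        raise ValueError("no '..' divider found")
    # pick out the "Name: value" pairs the header is expected to carry
    fields = dict(re.findall(r"(Length|Type|Check):\s*(\S+)", header))
    # keep letters only, which drops position numbers and spacing
    sequence = "".join(ch for ch in body if ch.isalpha())
    return fields, sequence


# Made-up file; the sequence is the first 20 bases of the example above.
example = """!!NA_SEQUENCE 1.0
example.seq  Length: 20  Type: N  Check: 1234  ..

       1  TCTAGAGGAG AYATYGTWAT
"""

fields, sequence = parse_gcg(example)
print(fields)
print(sequence, len(sequence) == int(fields["Length"]))
```

Comparing the parsed length with the `Length:` field is a cheap sanity check that the data section was read completely.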


Gene and Chromosome Mutation Worksheet (reference pgs. 239-240 in Modern Biology textbook) Name Date Per Look at the diagrams, then answer the questions. Gene Mutations affect a single gene by changing its base sequence, resulting in an incorrect, or nonfunctional, protein being made. (a) A

More information

G E N OM I C S S E RV I C ES

G E N OM I C S S E RV I C ES GENOMICS SERVICES THE NEW YORK GENOME CENTER NYGC is an independent non-profit implementing advanced genomic research to improve diagnosis and treatment of serious diseases. capabilities. N E X T- G E

More information

CCR Biology - Chapter 9 Practice Test - Summer 2012

CCR Biology - Chapter 9 Practice Test - Summer 2012 Name: Class: Date: CCR Biology - Chapter 9 Practice Test - Summer 2012 Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Genetic engineering is possible

More information

Processing Genome Data using Scalable Database Technology. My Background

Processing Genome Data using Scalable Database Technology. My Background Johann Christoph Freytag, Ph.D. freytag@dbis.informatik.hu-berlin.de http://www.dbis.informatik.hu-berlin.de Stanford University, February 2004 PhD @ Harvard Univ. Visiting Scientist, Microsoft Res. (2002)

More information

BioBoot Camp Genetics

BioBoot Camp Genetics BioBoot Camp Genetics BIO.B.1.2.1 Describe how the process of DNA replication results in the transmission and/or conservation of genetic information DNA Replication is the process of DNA being copied before

More information

DNA and the Cell. Version 2.3. English version. ELLS European Learning Laboratory for the Life Sciences

DNA and the Cell. Version 2.3. English version. ELLS European Learning Laboratory for the Life Sciences DNA and the Cell Anastasios Koutsos Alexandra Manaia Julia Willingale-Theune Version 2.3 English version ELLS European Learning Laboratory for the Life Sciences Anastasios Koutsos, Alexandra Manaia and

More information

Genomic Data at the British Oceanographic Data Centre

Genomic Data at the British Oceanographic Data Centre Genomic Data at the British Oceanographic Data Centre A data management project for the NERC Marine and Freshwater Microbial Biodiversity Thematic Programme Gwen Moncoiffé British Oceanographic Data Centre,

More information

BME 42-620 Engineering Molecular Cell Biology. Lecture 02: Structural and Functional Organization of

BME 42-620 Engineering Molecular Cell Biology. Lecture 02: Structural and Functional Organization of BME 42-620 Engineering Molecular Cell Biology Lecture 02: Structural and Functional Organization of Eukaryotic Cells BME42-620 Lecture 02, September 01, 2011 1 Outline A brief review of the previous lecture

More information

Genetics Test Biology I

Genetics Test Biology I Genetics Test Biology I Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Avery s experiments showed that bacteria are transformed by a. RNA. c. proteins.

More information

PRACTICE TEST QUESTIONS

PRACTICE TEST QUESTIONS PART A: MULTIPLE CHOICE QUESTIONS PRACTICE TEST QUESTIONS DNA & PROTEIN SYNTHESIS B 1. One of the functions of DNA is to A. secrete vacuoles. B. make copies of itself. C. join amino acids to each other.

More information

BIOINFORMATICS TUTORIAL

BIOINFORMATICS TUTORIAL Bio 242 BIOINFORMATICS TUTORIAL Bio 242 α Amylase Lab Sequence Sequence Searches: BLAST Sequence Alignment: Clustal Omega 3d Structure & 3d Alignments DO NOT REMOVE FROM LAB. DO NOT WRITE IN THIS DOCUMENT.

More information

Ms. Campbell Protein Synthesis Practice Questions Regents L.E.

Ms. Campbell Protein Synthesis Practice Questions Regents L.E. Name Student # Ms. Campbell Protein Synthesis Practice Questions Regents L.E. 1. A sequence of three nitrogenous bases in a messenger-rna molecule is known as a 1) codon 2) gene 3) polypeptide 4) nucleotide

More information

Molecular Cell Biology WS2011

Molecular Cell Biology WS2011 Molecular Cell Biology WS2011 Lecturer: Dr. Andreas Prokesch, Inst. for Genomics and Bioinformatics, TUG Purpose of this series of lectures: to offer you the basic knowledge required to know and understand

More information

Core Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1

Core Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1 Core Bioinformatics 2014/2015 Code: 42397 ECTS Credits: 12 Degree Type Year Semester 4313473 Bioinformàtica/Bioinformatics OB 0 1 Contact Name: Sònia Casillas Viladerrams Email: Sonia.Casillas@uab.cat

More information

Chapter 2. Using and Understanding RepeatMasker. Sébastien Tempel. Abstract. 1. Introduction

Chapter 2. Using and Understanding RepeatMasker. Sébastien Tempel. Abstract. 1. Introduction Chapter 2 Using and Understanding RepeatMasker Sébastien Tempel Abstract RepeatMasker is a program that screens DNA sequences for interspersed repeats and low-complexity DNA sequences. In this chapter,

More information

The sequence of bases on the mrna is a code that determines the sequence of amino acids in the polypeptide being synthesized:

The sequence of bases on the mrna is a code that determines the sequence of amino acids in the polypeptide being synthesized: Module 3F Protein Synthesis So far in this unit, we have examined: How genes are transmitted from one generation to the next Where genes are located What genes are made of How genes are replicated How

More information

Bioinformatics Grid - Enabled Tools For Biologists.

Bioinformatics Grid - Enabled Tools For Biologists. Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis

More information

escience and Post-Genome Biomedical Research

escience and Post-Genome Biomedical Research escience and Post-Genome Biomedical Research Thomas L. Casavant, Adam P. DeLuca Departments of Biomedical Engineering, Electrical Engineering and Ophthalmology Coordinated Laboratory for Computational

More information

DNA Replication & Protein Synthesis. This isn t a baaaaaaaddd chapter!!!

DNA Replication & Protein Synthesis. This isn t a baaaaaaaddd chapter!!! DNA Replication & Protein Synthesis This isn t a baaaaaaaddd chapter!!! The Discovery of DNA s Structure Watson and Crick s discovery of DNA s structure was based on almost fifty years of research by other

More information

Control of Gene Expression

Control of Gene Expression Control of Gene Expression What is Gene Expression? Gene expression is the process by which informa9on from a gene is used in the synthesis of a func9onal gene product. What is Gene Expression? Figure

More information

Lassen Community College Course Outline

Lassen Community College Course Outline Lassen Community College Course Outline BIOL-25 Human Anatomy and Physiology I 4.0 Units I. Catalog Description First semester of a two semester sequence covering structure and function, integration and

More information

Translation Study Guide

Translation Study Guide Translation Study Guide This study guide is a written version of the material you have seen presented in the replication unit. In translation, the cell uses the genetic information contained in mrna to

More information

Biomedical Big Data and Precision Medicine

Biomedical Big Data and Precision Medicine Biomedical Big Data and Precision Medicine Jie Yang Department of Mathematics, Statistics, and Computer Science University of Illinois at Chicago October 8, 2015 1 Explosion of Biomedical Data 2 Types

More information

SUBMITTING DNA SEQUENCES TO THE DATABASES

SUBMITTING DNA SEQUENCES TO THE DATABASES Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Second Edition Andreas D. Baxevanis, B.F. Francis Ouellette Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-38390-2 (Hardback);

More information

Pairwise Sequence Alignment

Pairwise Sequence Alignment Pairwise Sequence Alignment carolin.kosiol@vetmeduni.ac.at SS 2013 Outline Pairwise sequence alignment global - Needleman Wunsch Gotoh algorithm local - Smith Waterman algorithm BLAST - heuristics What

More information

RNA & Protein Synthesis

RNA & Protein Synthesis RNA & Protein Synthesis Genes send messages to cellular machinery RNA Plays a major role in process Process has three phases (Genetic) Transcription (Genetic) Translation Protein Synthesis RNA Synthesis

More information

Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility

Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility Report of a meeting organized by the Wellcome Trust and held on 14 15 January 2003 at Fort Lauderdale,

More information

11, Olomouc, 783 71, Czech Republic. Version of record first published: 24 Sep 2012.

11, Olomouc, 783 71, Czech Republic. Version of record first published: 24 Sep 2012. This article was downloaded by: [Knihovna Univerzity Palackeho], [Vladan Ondrej] On: 24 September 2012, At: 05:24 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954

More information

BI122 Introduction to Human Genetics, Fall 2014

BI122 Introduction to Human Genetics, Fall 2014 BI122 Introduction to Human Genetics, Fall 2014 Course Overview We will explore 1) the genetic and molecular basis of heredity and inherited traits, 2) how genetics & genomics reveals an understanding

More information

DNA, RNA, Protein synthesis, and Mutations. Chapters 12-13.3

DNA, RNA, Protein synthesis, and Mutations. Chapters 12-13.3 DNA, RNA, Protein synthesis, and Mutations Chapters 12-13.3 1A)Identify the components of DNA and explain its role in heredity. DNA s Role in heredity: Contains the genetic information of a cell that can

More information

Rules and Format for Taxonomic Nucleotide Sequence Annotation for Fungi: a proposal

Rules and Format for Taxonomic Nucleotide Sequence Annotation for Fungi: a proposal Rules and Format for Taxonomic Nucleotide Sequence Annotation for Fungi: a proposal The need for third-party sequence annotation Taxonomic names attached to nucleotide sequences occasionally need to be

More information

Linear Sequence Analysis. 3-D Structure Analysis

Linear Sequence Analysis. 3-D Structure Analysis Linear Sequence Analysis What can you learn from a (single) protein sequence? Calculate it s physical properties Molecular weight (MW), isoelectric point (pi), amino acid content, hydropathy (hydrophilic

More information

A Primer of Genome Science THIRD

A Primer of Genome Science THIRD A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:

More information

Central Dogma. Lecture 10. Discussing DNA replication. DNA Replication. DNA mutation and repair. Transcription

Central Dogma. Lecture 10. Discussing DNA replication. DNA Replication. DNA mutation and repair. Transcription Central Dogma transcription translation DNA RNA Protein replication Discussing DNA replication (Nucleus of eukaryote, cytoplasm of prokaryote) Recall Replication is semi-conservative and bidirectional

More information

A Practitioner's G uide to Data Management and Data Integration in Bioinformatics

A Practitioner's G uide to Data Management and Data Integration in Bioinformatics 3 CHAPTER A Practitioner's G uide to Data Management and Data Integration in Bioinformatics Barbara A. Eckman 3.1 INTRODUCTION Integration of a large and widely diverse set of data sources and analytical

More information

Data formats and file conversions

Data formats and file conversions Building Excellence in Genomics and Computational Bioscience s Richard Leggett (TGAC) John Walshaw (IFR) Common file formats FASTQ FASTA BAM SAM Raw sequence Alignments MSF EMBL UniProt BED WIG Databases

More information

Transcription and Translation of DNA

Transcription and Translation of DNA Transcription and Translation of DNA Genotype our genetic constitution ( makeup) is determined (controlled) by the sequence of bases in its genes Phenotype determined by the proteins synthesised when genes

More information

Bioinformatics: course introduction

Bioinformatics: course introduction Bioinformatics: course introduction Filip Železný Czech Technical University in Prague Faculty of Electrical Engineering Department of Cybernetics Intelligent Data Analysis lab http://ida.felk.cvut.cz

More information

GenBank: A Database of Genetic Sequence Data

GenBank: A Database of Genetic Sequence Data GenBank: A Database of Genetic Sequence Data Computer Science 105 Boston University David G. Sullivan, Ph.D. An Explosion of Scientific Data Scientists are generating ever increasing amounts of data. Relevant

More information