Genetomic Promototypes



Mirkó Palla and Dana Pe'er
Department of Mechanical Engineering, Clarkson University, Potsdam, New York
and
Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts

I. Introduction

Transcriptional regulation plays a vital role in all living organisms. It influences development, complexity, diversity, homeostasis and other important biological functions (Davidson, 2001). Transcription is the first stage in the universal flow of information from the genome, where all genetic programs are stored, to the proteome, through which these programs are executed. Understanding the complex mechanisms that control the transcription machinery is therefore one of the fundamental goals of quantitative biology.

At the most fundamental level, transcription is controlled by the combinatorial interplay of cis-regulatory elements (or motifs) present in the gene's promoter region¹ and the associated regulatory proteins (or transcription factors) present in the cell (Jacob and Monod, 1961). Because all transcription factors are themselves gene products, this mechanism is ultimately governed by the set of motifs present in each gene's promoter. The elementary principles governing transcription can therefore be understood through a quantitative description of how a motif's influence on gene expression depends on its promoter context.

In spite of major efforts to identify motifs in different species using a variety of approaches, and to analyze their precise influence on gene expression (McGuire et al., 2000), little is known about the principles by which a gene's motifs translate into an expression level. In other words, the quantitative effects of motifs on gene expression as a function of their promoter context are still poorly understood.

II. Background

Modern molecular biology has brought many new tools to research scientists, as well as an expanding database of genomes and new genes for study. Of particular use in the analysis of these genes is the synthetic promoter region, a 600-1000 base pair nucleotide sequence, designed to the specifications of the investigator, that controls the transcription machinery.
A synthetic promoter controls the same gene product as the native promoter of the gene of interest, but the bioengineered nucleotide sequence may express that protein differently under various environmental conditions. Designing synthetic promoters by hand is a time-consuming and error-prone process that may involve several computer programs. For this reason, an integrated bioengineering tool, a design program called BASHER, is under development. It combines many modules to provide a platform for high-throughput design of synthetic promoter regions for multi-kilobase sequences.

Of all sequenced genomes, that of the yeast Saccharomyces cerevisiae has gained the most attention, owing to the availability of multiple yeast genomes and high-quality mRNA data. For this reason, this yeast species was chosen as the core model for our genomic analysis.

¹ See Figure 1 for a hypothetical gene control mechanism.

III. Research methodology

The power and flexibility of oligonucleotide synthesis is increasingly being recognized in the bioengineering community. Traditional applications of promoter region synthesis include facilitating site-directed mutagenesis, structural analysis and the investigation of transcription regulation. The new theory of promoter variant design takes the combinatorial and spatial effects of cis-binding sites² (Beer and Tavazoie, 2004) into account and incorporates them into the modeling process. Binding sites can act as activators or inhibitors, and they can form modules (sets of cis-elements) whose interactions produce linear, epistatic, synergistic or switch effects, so a deep combinatorial analysis is needed to decipher the governing regulatory logic.

Previous studies show that the spatial organization of these regulatory elements has functional and mechanistic implications. The elements interact physically: certain transcription factor binding sites overlap, which implies the possibility of protein complex formation. In addition, in higher-order chromatin structure, some regions are occluded in three dimensions, which blocks protein binding to regulatory motif sequences.
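The observation that some transcription factor binding sites overlap can be checked mechanically once a site map is available. The following Python sketch is only an illustration: the `Site` record, the factor names and the coordinates are hypothetical and are not taken from BASHER or from a real promoter map.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Site:
    """One binding site on a promoter (hypothetical record layout)."""
    factor: str  # transcription factor name
    start: int   # 0-based start, inclusive
    end: int     # 0-based end, exclusive
    strand: str  # "+" or "-"


def overlapping_pairs(sites):
    """Return pairs of sites for different factors whose intervals overlap."""
    pairs = []
    ordered = sorted(sites, key=lambda s: s.start)
    for a, b in combinations(ordered, 2):
        # Two half-open intervals overlap when each starts before the other ends.
        if a.factor != b.factor and a.start < b.end and b.start < a.end:
            pairs.append((a, b))
    return pairs


# Illustrative coordinates only, not measured data.
sites = [
    Site("TF_A", 100, 112, "+"),
    Site("TF_B", 108, 118, "-"),
    Site("TF_C", 300, 310, "+"),
]

for a, b in overlapping_pairs(sites):
    shared = min(a.end, b.end) - max(a.start, b.start)
    print(a.factor, b.factor, shared)  # prints: TF_A TF_B 4
```

Pairs reported this way would only be candidates for protein complex formation; whether two factors actually interact is an experimental question.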
Motif positioning relative to the transcription start site plays a significant role in the transcription regulatory mechanism, so inserting synthetic DNA segments might reveal some functionality. Finally, the distance between cis-elements plays a major role in regulation: certain motif pairs occur only at a particular base pair distance from each other, and some pairs occur more frequently than others in the promoter. It has also been shown that motif orientation and order have regulatory effects, i.e., a regulatory module will influence gene expression only in the right spatial combination (orientation, order).

² For an example of cis-binding sites (motifs) of promoter YCL027W, see Figure 3.

To decipher the governing regulatory logic, combinations of elements must first be removed or replaced with new synthetic motif sequences, so that the resulting gene expression profile can be analyzed under various environmental conditions³. Additional logical design steps should include randomly moving a binding site to other locations, making small changes to cis-elements, and adding new motifs based on new statistical data. BASHER performs these design steps and produces a set of systematic promoter variants in a high-throughput manner.

In the past, researchers used many different programs to address the requirements of the separate steps of synthetic promoter design. Alternatively, they sent their requirements to a black box provided by a gene synthesis company and let it use its proprietary programs to design the nucleotide sequences of interest.

To facilitate the use of synthetic promoter regions in both traditional and high-throughput applications, new and more flexible solutions are required. BASHER is a useful tool for investigators who wish to optimize protein expression and/or redesign their promoter of interest for detailed structure/function studies (Giaever, 2002), e.g., mutagenesis. The objective of this research project is to create a Web-based program that performs all of the promoter design functions outlined above in a directed, step-wise manner.
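The design steps described above (removing or replacing combinations of elements, moving a binding site, making small changes to a cis-element) can be illustrated with a minimal Python sketch. It is not the BASHER implementation: the function names, the use of random replacement sequence and the toy promoter are all assumptions made for the example.

```python
import random
from itertools import combinations

BASES = "ACGT"


def random_dna(length, rng):
    """Random sequence used as neutral filler (an assumption of this sketch)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def replace_site(promoter, start, end, rng):
    """Overwrite promoter[start:end] with random sequence of equal length.
    Note: random filler can, by chance, recreate the original motif."""
    return promoter[:start] + random_dna(end - start, rng) + promoter[end:]


def knockout_variants(promoter, sites, rng, max_order=2):
    """Replace every combination of up to max_order sites.
    Lengths are preserved, so site coordinates stay valid throughout."""
    variants = {}
    for k in range(1, max_order + 1):
        for combo in combinations(sites, k):
            seq = promoter
            for start, end in combo:
                seq = replace_site(seq, start, end, rng)
            variants[combo] = seq
    return variants


def move_site(promoter, start, end, new_start, rng):
    """Copy a site to new_start and scramble its original location.
    Assumes the old and new locations do not overlap."""
    site = promoter[start:end]
    scrambled = replace_site(promoter, start, end, rng)
    return scrambled[:new_start] + site + scrambled[new_start + len(site):]


def point_mutate(promoter, start, end, rng):
    """Change exactly one base inside the site to a different base."""
    pos = rng.randrange(start, end)
    new_base = rng.choice([b for b in BASES if b != promoter[pos]])
    return promoter[:pos] + new_base + promoter[pos + 1:]


# Toy promoter with two hypothetical sites at (20, 26) and (36, 42).
rng = random.Random(0)
promoter = "A" * 20 + "TGACTC" + "A" * 10 + "CACGTG" + "A" * 18
sites = [(20, 26), (36, 42)]

variants = knockout_variants(promoter, sites, rng)
print(len(variants), "knockout variants")
print(move_site(promoter, 20, 26, 50, rng))
print(point_mutate(promoter, 36, 42, rng))
```

Keeping every variant the same length as the original promoter is a deliberate simplification: it leaves the distances between the untouched motifs and the transcription start unchanged, so any difference in expression can be attributed to the edited site.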
The program accepts as input both ortholog promoter sequences and global transcription factor binding site maps of the organism of interest, and it allows users to move through the design process in a series of modules that address practical issues surrounding oligonucleotide design. Users can follow the main "design a promoter" path or use the modules individually as needed.

³ See Figure 2 for a flow chart of the experimental steps.
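One common way to obtain a binding site map of the kind mentioned above is to scan a promoter sequence with a position-specific scoring matrix (PSSM), which appears among the labels in the flow chart of Figure 2. The Python sketch below is a generic illustration, not the method used by BASHER. It assumes a hypothetical count matrix, a uniform background, uppercase ACGT input and a forward-strand scan only.

```python
import math

# Hypothetical count matrix for a 4-base motif (one dict per position).
COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]


def log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2 odds against a uniform background."""
    pssm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pssm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return (position, window, score) for windows scoring >= threshold."""
    width = len(pssm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = sum(pssm[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, window, score))
    return hits


pssm = log_odds(COUNTS)
hits = scan("AATGACGGTGACTT", pssm, threshold=5.0)
print([position for position, _, _ in hits])  # prints: [2, 8]
```

The threshold of 5.0 is arbitrary and was chosen so that only exact matches to the toy motif pass. A real scan would also need to score the reverse complement and use a background model fitted to the organism.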

IV. References

1. Davidson EH (2001) Genomic Regulatory Systems: Development and Evolution. San Diego: Academic Press.
2. Giaever G et al. (2002) Functional profiling of the Saccharomyces cerevisiae genome. Nature 418: 387-391.
3. Jacob F, Monod J (1961) Genetic regulatory mechanisms in the synthesis of proteins. J Mol Biol 3: 318-356.
4. Beer MA, Tavazoie S (2004) Predicting gene expression from sequence. Cell 117: 185-198.
5. McGuire AM, Hughes JD, Church GM (2000) Conservation of DNA regulatory motifs and discovery of new motifs in microbial genomes. Genome Res 10: 744-757.

V. Figures and tables

Figure 1 - Gene control mechanism for gene X

[Diagram labels: Ortholog promoters; Expression data; PSSMs; YFG; Basher; Cis element map; Conditions; Promoter variants]

Figure 2 - Flow chart for experimental steps

Figure 3 - Transcription factor binding sites for promoter YCL027W [output example of the visualization software; see the Manual for more]