Identifying microRNA targets: computational and biochemical approaches




Identifying microRNA targets: computational and biochemical approaches
Iddo Ben-Dov, Nephrology and Hypertension, Hadassah Hebrew University Medical Center

T. Tuschl

A Short History of a Short RNA

The intellectual backdrop motivating our effort to clone lin-4 had nothing to do with questions about noncoding RNAs or antisense gene regulation. We were simply curious about an interesting worm mutant, and everything we found out about it was unexpected. We consider ourselves very lucky to happen to have chosen lin-4 to study. In fact, good fortune appeared at many steps before and during our lin-4 project, often through the contributions of other people.

Lee R et al. Cell 2004, S116:S89-S92
Ruvkun G et al. Cell 2004, S116:S93-S96

Most importantly, however, Gary Ruvkun's lab had identified evolutionally conserved sequences in the 3' UTR of lin-14 in a region of the mRNA responsible for the downregulation of LIN-14; we and Gary's lab knew that these sequences could contain the elements through which lin-4 acts. Gary shared his lin-14 UTR sequences with us, and we sent the lin-4 sequences to Gary. On precisely the same day in June of 1992, Victor and Gary independently noticed the antisense complementarity between lin-4 and lin-14.

Despite their clear importance as a class of regulatory molecules, pinpointing the relevance of individual miRNAs is challenging.

miRNAs are small noncoding RNAs that regulate protein output post-transcriptionally. Overwhelming evidence accumulated since their discovery leaves little doubt regarding their importance. They comprise 1-2% of all genes in worms, flies, and mammals, and because each miRNA is predicted to regulate hundreds of targets, the majority of protein-coding genes is thought to be under their control. (Vidigal JA, 2015)

In practice, this means that virtually every biological process is subject to miRNA-dependent regulation. As additional evidence of their functional relevance, miRNAs and their targets often display striking evolutionary conservation. Lastly, animals carrying mutations that impair miRNA processing are not viable, indicating that complete loss of miRNA activity is incompatible with life. (Vidigal JA, 2015)

miRNAs are transcribed as long primary transcripts (pri-miRNAs) and cleaved in the nucleus by the Drosha/DGCR8 microprocessor complex. The resulting 70-nucleotide-long hairpin-shaped molecule, the pre-miRNA, is exported into the cytoplasm, where it is further processed by Dicer, bound by an Argonaute protein, and incorporated into an RNA-induced silencing complex (RISC). Metazoan miRNAs typically direct the RISC to target mRNAs through imperfect base pairing to their 3' UTR, leading to post-transcriptional repression mainly through mRNA destabilization, although a minor component of translation inhibition has also been detected. (Vidigal JA, 2015)

Target recognition is primarily determined by the seed sequence, a stretch of 6 nucleotides spanning nucleotides 2-7 from the 5' end of the miRNA. Accordingly, targets can be predicted by searching for conserved matches to this sequence in the 3' UTR of messages. Prediction accuracy increases further when this search is restricted to seven-nucleotide-long motifs encompassing the seed, and when the sequence context within the 3' UTR is taken into consideration.

Noncanonical targeting through sites with mismatches to the seed has also been reported, but seems to be generally associated with lower levels of repression and its biological relevance remains unclear. Because targeting requires the presence of such short conserved sequences, individual miRNAs have the potential to regulate hundreds of targets. Computational predictions have been supported by experimental evidence showing that loss or overexpression of a miRNA in cultured cells results in the deregulation of hundreds of genes. In both cases, the deregulated messages are enriched in conserved miRNA binding sites and their expression in vivo tends to be anti-correlated with that of the miRNA.

Genetic inactivation of a miRNA results in very modest de-repression of its direct targets, typically less than two-fold even for highly abundant miRNAs. These differences are well within the range that could be attributed to fluctuations of gene expression between two genetically identical cells or between individuals. For most genes, such modest changes in expression can be well tolerated by the organism, which might explain why genetic inactivation of miRNAs often does not have obvious phenotypic consequences.

These observations sparked the idea that, rather than acting as genetic switches, where strong repression of one or a few targets results in a clear phenotypic outcome, most miRNAs act as rheostats, fine-tuning the expression of hundreds of genes to reinforce cell fate decisions brought about through other mechanisms.

It is important to note, however, that even a mild de-repression of many targets can have severe phenotypic consequences, especially if the targets are functionally linked. For example, in mice, deletion of miR-128 results in fatal epilepsy due to de-repression of several components of the MAPK pathway, leading to a significant increase in ERK2 phosphorylation. Similarly, loss of miR-205 results in neonatal lethality in mice with compromised epidermal and hair follicle growth, presumably through altered expression of multiple components of the PI(3)K signaling pathway. A recent study has implicated Drosophila miR-8 in CNS patterning and fly fertility through the regulation of genes whose products act together within a protein complex.

In contrast to loss-of-function studies, ectopic expression often leads to supra-physiologic levels of the miRNA and stronger repression of its targets. The magnitude of this repression can bring down to inconsequential levels the expression of genes that would otherwise remain functional even in the presence of the targeting miRNA. Often, these experiments also result in the expression of miRNAs in tissues or cells in which they would normally be absent, leading to repression of messages that might not be their biological targets.

Thus, despite their widespread use, the propensity to generate a high fraction of false-positive results constitutes a major caveat of miRNA overexpression experiments. This does not mean that such experiments are devoid of value. In fact, much of the knowledge we have accumulated over the years on miRNA biology has depended on them. In addition, regardless of the physiologic relevance of these studies, overexpression of miRNA mimics may serve a therapeutic purpose.

Several miRNA-based therapeutic strategies are currently under development, and these can broadly be divided into:
(1) miRNA replacement strategies, which attempt to mimic the activity of specific miRNAs by delivering small double-stranded RNA molecules that resemble miRNA duplexes. MRX34, for example, is a mimic of the miR-34 family, which is currently in Phase I clinical trials to test its safety for patients with primary liver cancer or liver metastasis.
(2) miRNA targeting strategies, which rely on antisense oligonucleotides (ASO) that bind to endogenous miRNAs to prevent their interaction with targets. One such molecule, designed to inhibit miR-122, has also reached clinical trials (Phase II) and is being evaluated for its long-term safety and efficacy in patients with chronic HCV infection.

One remarkable aspect of miRNA genes is that a large number of them have obvious paralogs in the genome. Paralog miRNAs arise from both tandem and nonlocal gene duplication events, which place the copies either within the same transcript, giving rise to miRNA clusters, or at distant loci, typically on different chromosomes. These miRNA copies not only retain a high degree of sequence homology, but also share the same seed sequence and are, thus, by convention, grouped into miRNA seed families.

Because paralog miRNAs share the same seed sequence, they are expected to have similar affinities to messages. When expressed in the same cells, these related miRNAs can co-regulate targets, leading to higher levels of repression than those that could be achieved by each miRNA individually.
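The grouping of paralogs into seed families can be sketched in a few lines of Python. This is an illustration only, not a method from the talk: the function names are hypothetical, the seed is taken as nucleotides 2-7 as defined earlier (some tools use 2-8), and the four mature sequences were typed in from memory of miRBase and should be checked before being relied on.

```python
from collections import defaultdict

def seed(mirna_seq, start=2, end=7):
    """Return nucleotides start..end (1-based, inclusive) of a mature miRNA."""
    rna = mirna_seq.upper().replace("T", "U")
    return rna[start - 1:end]

def seed_families(mirnas):
    """Group miRNA names by shared seed; returns {seed: sorted list of names}."""
    families = defaultdict(list)
    for name, seq in mirnas.items():
        families[seed(seq)].append(name)
    return {s: sorted(names) for s, names in families.items()}

# Illustrative mature sequences (assumed; verify against miRBase).
MATURE = {
    "let-7a": "UGAGGUAGUAGGUUGUAUAGUU",
    "let-7b": "UGAGGUAGUAGGUUGUGUGGUU",
    "miR-98": "UGAGGUAGUAAGUUGUAUUGUU",
    "miR-21": "UAGCUUAUCAGACUGAUGUUGA",
}

families = seed_families(MATURE)
for seed_seq, members in families.items():
    print(seed_seq, members)
```

Under these assumptions the three let-7-like sequences fall into one family and miR-21 stands alone, which is the sense in which family members are expected to compete for, and co-regulate, the same sites.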

The emerging picture suggests that miRNAs occupy a unique position in the hierarchy of gene regulators. In contrast to transcription factors, in most cases miRNAs do not appear to act as master regulators of gene expression. Rather, their mechanism of action allows them to act as fine tuners of transcriptional programs, as components of complex network motifs, and as post-transcriptional buffers to confer robustness to transcriptional programs in the face of environmental and genetic variability.

Although investigating individual miRNA-mRNA interactions can be, and has been, useful in some instances, moving forward it will be essential to resist the temptation to reduce the biological functions of individual miRNAs to repression of one or a few key targets. Rather, new computational and systems biology approaches will be needed to fully appreciate the intricacy and multifaceted roles of gene regulation by these small noncoding RNAs.

Thus, for metazoan miRNAs, the challenge has been to devise a genome-wide computational search that captures most of the regulatory targets without also bringing in too many false predictions. Initial attempts generated algorithms and sets of predictions that were difficult for experimentalists to evaluate, which was exacerbated by the poor overlap between sets of predictions from the same organism.

Nonetheless, some of these efforts have provided methods and insights that helped set the stage for our current understanding of metazoan miRNA recognition. A key methodological advance was the use of preferential evolutionary conservation to evaluate the ability of an algorithm to distinguish miRNA target sites from the multitude of 3'-UTR segments that otherwise would score equally well with regard to the quality of miRNA pairing. To the extent that sites are conserved more than would be expected by chance, they are judged to be under selective pressure and therefore biologically functional.

In this way, common features of target recognition can be distinguished from those that seem equally plausible but are rarely if ever used, thereby enabling the principles of target recognition to be elucidated and algorithms to be developed without resorting to training on a known set of targets. Developing the algorithm without consideration of known targets avoids biases from sites that are more easily found experimentally and was particularly useful for mammalian miRNAs, for which no targets were known.

Current prediction methods are diverse, both in approach and performance, and all have room for improvement. Nonetheless, agreement is emerging on three conclusions, which are each reassuringly consistent with a growing body of experimental data. And, as further relief for the noncomputational biologist, the most critical concepts for computational target prediction can be distilled down to a few simple guidelines.

The first major conclusion is that requiring conserved Watson-Crick pairing to the 5' region of the miRNA centered on nucleotides 2-7, which is called the miRNA seed, markedly reduces the occurrence of false-positive predictions.

The second conclusion is that conserved pairing to the seed region can also be sufficient on its own for predicting conserved targets above the noise of false-positive predictions. For example, mammalian targets can be predicted by simply searching for conserved 7 nt matches in aligned regions of vertebrate 3' UTRs. Prediction specificity increases when requiring an 8 nt match or multiple matches to the same miRNA.

(1) Identify the two 7 nt matches to the seed region. For example, miR-1 would recognize the CAUUCCA match and the ACAUUCC match.

    miR-1            5' UGGAAUGUAAAGAAGUAUGUA 3'
    ACAUUCC match    3'  CCUUACA 5'
    CAUUCCA match    3' ACCUUAC 5'

(2) Use available whole-genome alignments (Karolchik et al., 2008) to compile orthologous 3' UTRs.
(3) Search within the orthologous UTRs for conserved occurrence of either 7 nt match. These are predicted regulatory sites!
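The three steps above can be sketched as follows. This is a minimal illustration, not the code of any published predictor: the function names are hypothetical, the aligned UTR fragments are invented toy sequences, and "conserved" is simplified to mean that every species carries the identical 7-mer in the same alignment columns.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA string written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]

def seven_mer_sites(mirna):
    """Step 1: the two 7 nt target sites (5'->3') for a mature miRNA."""
    mirna = mirna.upper().replace("T", "U")
    m8 = reverse_complement(mirna[1:8])         # pairs with miRNA nt 2-8
    a1 = reverse_complement(mirna[1:7]) + "A"   # pairs with nt 2-7, A opposite nt 1
    return m8, a1

def conserved_sites(aligned_utrs, sites):
    """Step 3: alignment columns where all species share the same 7-mer site."""
    seqs = list(aligned_utrs.values())
    length = min(len(s) for s in seqs)
    hits = []
    for i in range(length - 6):
        window = seqs[0][i:i + 7]
        if window in sites and all(s[i:i + 7] == window for s in seqs):
            hits.append((i, window))
    return hits

MIR_1 = "UGGAAUGUAAAGAAGUAUGUA"   # sequence as given on the slide
sites = seven_mer_sites(MIR_1)

# Step 2 stand-in: hypothetical pre-aligned orthologous 3' UTR fragments.
ALIGNED = {
    "human": "AAGCACAUUCCAUUGCGGCAUUCCAGG",
    "mouse": "AAGCACAUUCCAUCGCGGCAUUACAGG",
    "chick": "AAGUACAUUCCAUU-CGGCAUUCCAGG",
}
hits = conserved_sites(ALIGNED, sites)
print(sites, hits)
```

In the toy alignment the two overlapping matches near the start are present in all three species and are reported, while the second CAUUCCA further downstream is disrupted in the mouse row and is therefore dropped.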

As might be expected, searching for conserved instances of either of two 7-mers yields many predicted targets: hundreds of messages for each miRNA family. The surprise is that after the number of sites expected to be conserved by chance is subtracted, the number of predicted targets remains very high. This leads to the third major conclusion: highly conserved miRNAs have very many conserved targets.
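The subtraction of chance conservation can be written as a small calculation. The sketch below assumes, as one common approach, that the background is estimated from the conserved-site counts of control 7-mers that match no known miRNA; the function name and every number are hypothetical and serve only to show the arithmetic.

```python
def excess_conserved_sites(observed, control_counts):
    """Return (sites above background, signal-to-background ratio).

    observed        -- conserved sites found for the real miRNA 7-mers
    control_counts  -- conserved sites found for each control 7-mer set
    """
    expected = sum(control_counts) / len(control_counts)  # chance estimate
    return observed - expected, observed / expected

# Hypothetical counts, for illustration only.
above_background, ratio = excess_conserved_sites(600, [180, 220, 200])
print(above_background, ratio)
```

A ratio well above 1 is what supports the claim that many of the conserved matches are under selection rather than conserved by chance.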

The current predictions by TargetScan, PicTar, EMBL, and ElMMo have a high degree of overlap because they now all require stringent seed pairing. However, they are not 100% identical. Some reasons for imperfect overlap can be traced to alignment artifacts, the use of slightly different UTR databases, or the use of different miRNA sequences. Other reasons are intrinsic to the prediction algorithms themselves, such as the treatment of the target nucleotide opposite the first miRNA nucleotide.

Stoichiometry is important (Nature Methods 2012)

Incorporation of a miRNA binding site into a gene expression vector makes the transcript a target for the endogenous miRNA. When expressed at a physiological level, such a vector can sense miRNA activity in the cell (and can provide a means for eliminating vector expression in unwanted cell types).

Five tandem miRNA binding sites were placed downstream of EGFP for each of 291 different miRNAs (291 sensor vectors). Cells were transfected with the sensor library at 1 vector per cell. FACS was used to sort cells according to EGFP fluorescence, and cells were collected into 4 bins.

After sorting, DNA was extracted, the portions of the vector encoding the target sites were amplified with barcoded primers for multiplexing, and the amplicons were deep sequenced.
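One simple way to turn the per-bin read counts into a readout is to ask where each sensor's reads concentrate across the four EGFP bins. The sketch below is an assumption about how such data could be summarized, not the analysis used in the cited study: the function name, the sensor names, and all counts are hypothetical, and bins are numbered 1 (lowest EGFP) to 4 (highest).

```python
def mean_bin(counts_by_sensor):
    """Read-weighted mean bin index (1 = lowest EGFP) for each sensor."""
    scores = {}
    for sensor, counts in counts_by_sensor.items():
        total = sum(counts)
        scores[sensor] = sum((b + 1) * c for b, c in enumerate(counts)) / total
    return scores

# Hypothetical read counts per bin, lowest to highest EGFP.
COUNTS = {
    "sensor_A": [700, 200, 80, 20],   # reads pile up in dim bins: suppressed
    "sensor_B": [20, 80, 200, 700],   # reads pile up in bright bins: not suppressed
}

scores = mean_bin(COUNTS)
print(scores)
```

A sensor whose cognate miRNA is active shifts toward the dim bins and gets a low score; a real analysis would also need to account for sequencing depth and for the number of cells collected in each bin.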

Our comparison of miRNA concentration and target suppression using Sensor-seq revealed that only a small number of miRNAs were expressed at a sufficient concentration to mediate sensor regulation. Almost 60% of the miRNAs detected by deep sequencing had no discernible suppressive activity. This supports the previous assertion that miRNAs expressed below ~100 copies per cell have little regulatory capacity.

Methods in RBP biochemistry

Throughout their lifetime, transcripts are associated with a plethora of RNA-binding proteins (RBPs). The combinatorial binding and spatial arrangement of these RBPs give rise to a diverse range of ribonucleoprotein particles that determine the cellular fate and function of each RNA. Recent advances towards more precise positional information on the binding sites of RBPs within RNAs have improved our understanding of the molecular mechanisms of post-transcriptional regulation.

Originally, protein-RNA interactions were studied using biochemical methods such as SELEX, electrophoretic mobility shift and RNA protection assays, or genetic methods such as the yeast three-hybrid system. These approaches, however, did not address RNA binding in its native cellular context. In a first step towards preserving the cellular context, RNA immunoprecipitation was combined with differential display or microarray analysis (RIP-Chip). These methods were of low resolution and prone to identifying indirect interactions. Furthermore, they were limited to studying stable RNPs since protein-RNA complexes can re-associate after cell lysis.

In order to increase the resolution and specificity, a strategy referred to as CLIP (UV cross-linking and immunoprecipitation) was developed. CLIP combines UV cross-linking of RBPs to their cognate RNA molecules with rigorous purification schemes. In combination with high-throughput sequencing, CLIP has proven to be a powerful tool to study protein-RNA interactions on a genome-wide scale (referred to as HITS-CLIP or CLIP-seq).

Identification of miRNA and miRNA targets in human podocyte culture: a deep sequencing approach

Podocyte miRNA and miRNA targets

BACKGROUND
Disruption of the miRNA processing machinery in mouse podocytes results in a lethal glomerulopathy. Identification of specific miRNAs and miRNA targets may thus generate new insights into podocyte pathophysiology.

RNA sequencing may allow transcriptome-wide and unbiased identification of both miRNAs and miRNA targets.

METHODS
miRNA expression profiling: We studied miRNA expression in conditionally immortalized cultured human podocytes using small RNA sequencing.
miRNA target identification: We sequenced RISC-bound RNA in cultured podocytes by immunoprecipitation of in vivo photo-crosslinked Argonaute-2 protein to identify engaged miRNAs and miRNA targets (PAR-CLIP).

[Figure: miRNA expression profiling methods, modified from Hafner M et al. RNA 2011]

RESULTS
[Figure: Small RNA annotation in podocytes. Labeled shares: miRNA 72%, spike 14%, none 9%. Other categories shown: doubt miRNA, rRNA, tRNA, snoRNA, piRNA, repeat, misc.]
[Figure: Absolute miRNA content in podocytes (fmol/mcg total RNA) for immature podocytes, mature podocytes, and HEK293 cells. P=0.06 and P<0.001 are marked on the plot.]

Top podocyte miRNA clusters
    cluster-hsa-mir-21(1)      43.8%
    cluster-hsa-mir-98(13)     11.5%
    cluster-hsa-mir-221(2)      4.7%
    cluster-hsa-mir-23a(6)      4.7%
    cluster-hsa-mir-30a(4)      4.3%
    cluster-hsa-mir-29a(4)      4.3%
    cluster-hsa-mir-17(12)      2.9%
    cluster-hsa-let-7i(1)       2.3%
    cluster-hsa-mir-25(3)       1.9%
    cluster-hsa-mir-200a(3)     1.5%

Top podocyte miRNA families
    sf-hsa-mir-21(1)           43.6%
    sf-hsa-let-7a-1(12)        14.6%
    sf-hsa-mir-30a(6)           4.3%
    sf-hsa-mir-29a(4)           4.3%
    sf-hsa-mir-221(1)           2.9%
    sf-hsa-mir-17(8)            2.8%
    sf-hsa-mir-27a(2)           2.6%
    sf-hsa-mir-222(1)           1.8%
    sf-hsa-mir-103-1(3)         1.6%
    sf-hsa-mir-141(5)           1.5%

Mature vs. immature podocytes: mir-29(4) and mir-30a(4) were expressed at higher levels in fully differentiated than in undifferentiated cells; mir-21(1) was less abundant in differentiated cells.

Podocytes vs. nonkidney cells: mir-29a(4), let-7i(1), and mir-10a(1)/10b(1) are higher in podocytes than in other epithelia.

Podocytes vs. other kidney cells: proximal tubular cells have a similar miRNA profile; mesangial cells express vascular smooth muscle miRNAs.

Normal vs. mutant or challenged podocytes: changes suggest loss of epithelial markers and gain of mesenchymal miRNAs.

[Figure: miRNA target identification by PAR-CLIP. The schematic shows 4sU-labeled, polyadenylated RNA crosslinked (X) to AGO/RISC and to other proteins. Modified from Hafner M et al. Cell 2010]

PAR-CLIP of mature podocytes yielded ~3,500 mRNA clusters, of which ~900 had high-confidence characteristics (showing a unique PAR-CLIP crosslinking signature). Because of the scale of the experiment, this is a lower bound on the true number of miRNA targets in podocytes.

[Figure: miRNA clusters and mRNA clusters]

89% of mRNA clusters mapped to exons. Most exonic clusters are in the 3' UTR (70%), followed by the CDS (28%).

An example of an mRNA target as picked up in vivo by PAR-CLIP. Shown is part of the TRIP13 3' UTR along with the PAR-CLIP reads mapped to it and their read counts. Above it is the mature sequence of hsa-let-7a in reverse orientation; the slide's alignment symbols to the target mRNA are not preserved here, and the slide draws one G of let-7a bulged out of the duplex.

    let-7a   3' ttgaTATGTTGGATGATGGAGt 5'
    TRIP13   GGTGAAACTTACATACAAATATTACCTCATTTGTTGT
    reads    ----AAACTTACATACAAATACTACCTCATTTG----   14
             ----AAACTTACATACAAACATTACCTCATTTG----    5
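In PAR-CLIP with 4sU labeling, crosslinked sites typically show up as T-to-C conversions in the sequenced reads, and this is the kind of signature that can be checked on the example above. The sketch below is illustrative only: the function names are hypothetical, the alignment is a naive ungapped scan, and the two reads are re-typed from the slide on the assumption that the spaced-out base in each read marks the converted position.

```python
# TRIP13 3' UTR segment and reads (with read counts) as shown on the slide.
REFERENCE = "GGTGAAACTTACATACAAATATTACCTCATTTGTTGT"
READS = {
    "AAACTTACATACAAATACTACCTCATTTG": 14,
    "AAACTTACATACAAACATTACCTCATTTG": 5,
}

def best_ungapped_offset(read, reference):
    """Reference offset where the read aligns with the fewest mismatches."""
    def mismatches(offset):
        window = reference[offset:offset + len(read)]
        return sum(a != b for a, b in zip(read, window))
    return min(range(len(reference) - len(read) + 1), key=mismatches)

def t_to_c_sites(read, reference):
    """0-based reference positions where the reference has T and the read has C."""
    offset = best_ungapped_offset(read, reference)
    return [
        offset + i
        for i, (ref_base, read_base) in enumerate(zip(reference[offset:], read))
        if ref_base == "T" and read_base == "C"
    ]

conversions = {read: t_to_c_sites(read, REFERENCE) for read in READS}
for read, positions in conversions.items():
    print(READS[read], positions)
```

Both reads align at the same offset and each carries a single T-to-C change a few bases upstream of, or adjacent to, the let-7 seed match (TACCTCA), which is consistent with a crosslink close to the miRNA binding site.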

CONCLUSIONS
We obtained miRNA profiles from human podocytes using barcoded small RNA sequencing. These profiles, complemented with biochemically confirmed in vivo mRNA target PAR-CLIP data, can serve to direct further research into miRNA involvement in podocyte development and disease.