MicroRNA Bioinformatics
Rickard Sandberg
Dept. of Cell and Molecular Biology (CMB), Karolinska Institutet

Outline
- Introduction
- microRNA target site prediction
- Useful resources

Short non-coding RNAs not considered in this lecture
- Piwi-interacting RNAs (piRNAs): transcriptional gene silencing of retrotransposons and other genetic elements in germ line cells, particularly those in spermatogenesis
- Endogenous short interfering RNAs (endo-siRNAs): generated from transposable elements or long hairpin-containing noncoding RNAs
- Nematode 21-U RNA

microRNA biogenesis
- Short 21-22 nt non-coding RNAs
- Transcription: RNA pol II
- Processing:
  - Drosha (nuclear)
  - Dicer (cytoplasm)
- Post-transcriptional repressors:
  - Translation inhibition
  - mRNA cleavage

[Figure: microRNA biogenesis pathway. Transcription gives a pri-miRNA (or a mirtron, processed by splicing); Drosha and DGCR8 process the pri-miRNA into a pre-miRNA in the nucleus; Exportin5 exports it to the cytoplasm; maturation by Dicer and TRBP; strand selection and miRNP assembly with AGO1-4. Outcomes: endonucleolytic cleavage (AGO2), or translational repression or deadenylation (AGO1-4, CCR4-NOT).]
Important roles in development and disease
- microRNAs as oncogenes: He et al. Nature 2005
- microRNAs as developmental regulators

Developmental regulators
Stark A. et al. Cell 2005

Developmental regulators II
[Figure 4: Two families of miRNAs, miR-10 and miR-196, are embedded in the mouse Hox clusters. The Hoxb8 3' UTR contains a site complementary to miR-196a. The panel shows the Hox A, B, C and D clusters with the embedded mir-10a, mir-10b, mir-196a-1, mir-196a-2 and mir-196b genes, the miR-196 target site, and the direction of transcription of the Hox genes.]

Identification of microRNA genes
- Cloning and sequencing of short RNAs
- Massive sequencing of short RNAs
- Bioinformatic approaches
microRNA gene features
- Characteristic conservation profile

[Figure: genome browser view of mouse chr14:115442600-115443350 showing mmu-mir-17, mmu-mir-18a and mmu-mir-19a (MicroRNAs from miRBase track) together with gene prediction, mRNA/EST, 30-way Multiz alignment and conservation, SNP (dbSNP build 128) and RepeatMasker tracks.]

microRNA gene registry
- Release 13.0, May 2009 (often updated!)
- Human: 706 microRNA genes
- Mouse: 547 microRNA genes
http://microrna.sanger.ac.uk/sequences/index.shtml
Conservation of microRNA genes
- microRNAs conserved throughout vertebrates (e.g. mir-9-1)
- microRNAs conserved throughout mammals
- microRNAs conserved through primates

microRNA discovery
- Early lin-4:lin-14 interaction in C. elegans
- Increased conservation of motifs in 3' UTRs that are complementary to the 5' end of microRNAs

Target Predictions: TargetScanS
- Presence of conserved target sites (7mer)
Lewis B. et al. Cell 2003
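The idea of motifs in 3' UTRs that are complementary to the 5' end of a microRNA can be sketched in a few lines of Python. This is a minimal illustration, not the TargetScanS implementation: it assumes that a "7mer" site means perfect pairing to microRNA nucleotides 2-8, and the function names and the example UTR string are hypothetical. The let-7a sequence is used only as a familiar example.

```python
# Minimal sketch: locate 7mer seed matches in a 3' UTR.
# Assumption: a 7mer site pairs perfectly with miRNA nucleotides 2-8.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_match_7mer(mirna):
    """UTR motif (5'->3') that pairs with miRNA nucleotides 2-8."""
    mirna = mirna.upper().replace("T", "U")
    return reverse_complement(mirna[1:8])


def find_7mer_sites(utr, mirna):
    """Return 0-based start positions of all 7mer seed matches in the UTR."""
    utr = utr.upper().replace("T", "U")
    motif = seed_match_7mer(mirna)
    positions = []
    start = utr.find(motif)
    while start != -1:
        positions.append(start)
        start = utr.find(motif, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"      # hsa-let-7a, written 5'->3'
    toy_utr = "AAACUACCUCAGGGCUACCUCA"    # hypothetical UTR with two sites
    print(seed_match_7mer(let7a))          # motif searched for in the UTR
    print(find_7mer_sites(toy_utr, let7a))
```

A real predictor would additionally require the site to be conserved in aligned orthologous UTRs, which this sketch does not attempt.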
Conserved Target Sites
Friedman R. et al. Genome Res 2008

What about nonconserved targets?
Farh, Science 2005

Additional sequence determinants in miRNA:mRNA interactions I
Nielsen C. et al. RNA 2007
Additional sequence determinants in miRNA:mRNA interactions II
1. Seed type
2. Adenine opposite base 1 of the microRNA (no complementarity)
3. Adenine or Uracil at base 9 (independent of the microRNA base at position 9)
4. AU-richness in upstream X basepairs
5. Position within 3' UTR

In short:
1. Seed hierarchy
2. t1A ("t1 anchor")
3. t9AU
4. Increased AU content downstream of site
5. Positioning within 3' UTR

Improved conservation scoring

Scope of microRNA regulation
- In total, >45,000 miRNA target sites within human 3' UTRs are conserved above background levels, and >60% of human protein-coding genes have been under selective pressure to maintain pairing to miRNAs.
Friedman R. et al. Genome Res 2008
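The sequence determinants listed above (seed type, t1A, t9AU, local AU content, position within the 3' UTR) can be turned into per-site features. The sketch below is only an illustration under stated assumptions: the site-type labels (8mer, 7mer-m8, 7mer-A1, 6mer) follow commonly used nomenclature rather than anything defined on the slides, the flank size `window=30` is a hypothetical stand-in for the unspecified "X basepairs", and `site_features` is an invented name. It does not compute any published score.

```python
# Minimal sketch: describe each seed match by the determinants listed above.
# Assumptions: site-type names and the 30 nt flank window are illustrative.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def au_fraction(seq):
    """Fraction of A/U nucleotides in seq (0.0 for an empty string)."""
    return sum(base in "AU" for base in seq) / len(seq) if seq else 0.0


def site_features(utr, mirna, window=30):
    """Return one feature dict per match to miRNA nucleotides 2-7."""
    utr = utr.upper().replace("T", "U")
    mirna = mirna.upper().replace("T", "U")
    core = mirna[1:7].translate(COMPLEMENT)[::-1]   # pairs with miRNA nt 2-7
    m8 = mirna[7].translate(COMPLEMENT)             # pairs with miRNA nt 8
    features = []
    start = utr.find(core)
    while start != -1:
        t8 = utr[start - 1] if start >= 1 else ""   # opposite miRNA nt 8
        t9 = utr[start - 2] if start >= 2 else ""   # opposite miRNA nt 9
        t1 = utr[start + 6] if start + 6 < len(utr) else ""  # opposite nt 1
        has_m8, has_t1a = (t8 == m8), (t1 == "A")
        if has_m8 and has_t1a:
            seed_type = "8mer"
        elif has_m8:
            seed_type = "7mer-m8"
        elif has_t1a:
            seed_type = "7mer-A1"
        else:
            seed_type = "6mer"
        features.append({
            "start": start,
            "seed_type": seed_type,
            "t1A": has_t1a,
            "t9AU": t9 in ("A", "U"),
            "au_upstream": au_fraction(utr[max(0, start - window):start]),
            "au_downstream": au_fraction(utr[start + 6:start + 6 + window]),
            "relative_position": start / len(utr),
        })
        start = utr.find(core, start + 1)
    return features


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"
    for site in site_features("GGGGACUACCUCAGGGG", let7a):
        print(site)
```

Because the slides mention AU content both upstream and downstream of the site, the sketch reports the two flanks separately and leaves any weighting to the reader.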
Site Efficacy
Bartel D. Cell 2009

Target Prediction Resources
Bartel D. Cell 2009

Table 1. Tools for Predicting Metazoan miRNA Targets
Columns: Tool (a) | Clades | Criteria for prediction and ranking | Website URL | Recent reference

Site conservation considered
TargetScan | m | Stringent seed pairing, site number, site type, site context ([...] accessibility); option of ranking by likelihood of preferential conservation rather than site context | http://targetscan.org | Friedman et al., 2008
TargetScan | f, w | Stringent seed pairing, site number, site type | http://targetscan.org | Ruby et al., 2007; Ruby et al., 2006
EMBL | f | Stringent seed pairing, site number, overall predicted pairing stability | http://russell.embl-heidelberg.de | Stark et al., 2005
PicTar | | Stringent seed pairing for at least one of the sites for the miRNA, site number, overall predicted pairing stability | http://pictar.mdc-berlin.de | Lall et al., 2006
ElMMo | | Stringent seed pairing, site number, likelihood of preferential conservation | http://www.mirz.unibas.ch/elmmo2 | Gaidatzis et al., 2007
miRanda | + | [...] pairing to most of the miRNA | http://www.microrna.org | Betel et al., 2008
miRBase Targets | + | [...] overall pairing | http://microrna.sanger.ac.uk |
PITA Top | | [...] overall predicted pairing stability, predicted site accessibility | http://genie.weizmann.ac.il/pubs/mir07/mir07_data.html | Kertesz et al., 2007
mirWIP | w | [...] overall predicted pairing stability, predicted site accessibility | http://146.189.76.171/query | Hammell et al., 2008

Site conservation not considered
TargetScan | m | Stringent seed pairing, site number, site type, site context ([...] accessibility) | http://targetscan.org | Grimson et al., 2007
PITA All | | [...] overall predicted pairing stability, predicted site accessibility | http://genie.weizmann.ac.il/pubs/mir07/mir07_data.html | Kertesz et al., 2007
RNA22 | | Moderately stringent seed pairing, matches to sequence patterns generated from miRNA set, overall predicted pairing and predicted pairing stability | http://cbcsrv.watson.ibm.com/rna22.html | Miranda et al., 2006

(a) Tools are listed according to criteria for prediction and ranking, which for those tools assessed with recent proteomics results generally correspond to their overall performance (Baek et al., 2008).

Abundance of conserved targets differs between microRNA families
- MicroRNA families conserved throughout vertebrates (n=87) have > 400 conserved target sites per family
- MicroRNA families conserved throughout mammals (n=40) have > 11 conserved target sites per family

Target Predictions: current
- http://www.targetscan.org/
- Looking for conserved and nonconserved targets
Predicting microRNA expression by target site avoidance

Which microRNAs regulate which pathway/process?
- Functional enrichment of predicted targets
Gaidatzis BMC Bioinformatics 2007

Modulating the 3' UTR affects microRNA-mediated gene regulation
[Figure: mRNA isoforms with a CDS followed by common and extended 3' UTR regions; microRNAs act as regulators, with effects on translation, regulation or degradation.]
Sandberg*, Neilson*, et al. Science 2008
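Functional enrichment of predicted targets is often assessed with a one-sided hypergeometric test: given a background of genes, how surprising is the overlap between a microRNA's predicted targets and a pathway's gene set? The slides do not state which test was used, so the choice of test here is an assumption, and the function name and gene identifiers are hypothetical.

```python
# Minimal sketch: one-sided hypergeometric enrichment of predicted targets
# in a pathway gene set. The test choice is an assumption, not the method
# of any cited paper.
from math import comb


def hypergeom_enrichment(targets, pathway, background):
    """Return (overlap, p) where p = P(overlap >= observed) under random draws."""
    background = set(background)
    targets = set(targets) & background   # only genes present in the background
    pathway = set(pathway) & background
    n_background = len(background)
    n_pathway = len(pathway)
    n_targets = len(targets)
    overlap = len(targets & pathway)
    total = comb(n_background, n_targets)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_targets - i)
        for i in range(overlap, min(n_pathway, n_targets) + 1)
    )
    return overlap, tail / total


if __name__ == "__main__":
    background = {f"g{i}" for i in range(10)}    # hypothetical gene universe
    pathway = {"g0", "g1", "g2", "g3", "g4"}     # hypothetical pathway
    targets = {"g0", "g1", "g2", "g3"}           # hypothetical predictions
    print(hypergeom_enrichment(targets, pathway, background))
```

When many pathways and many microRNAs are tested, the resulting p-values would need a multiple-testing correction, which is left out of this sketch.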
Target prediction summary
- Target predictions are useful, but all of them work only on average
- Finding many sites for the same microRNA increases the chance of regulation
- Conservation of a target site in many species increases the chance of regulation
- UTR regulation can affect microRNA-mediated gene regulation
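The two summary points about site number and conservation can be illustrated together. The sketch below takes a small alignment of orthologous 3' UTRs and reports, for every seed match in the reference species, which species carry the same motif in the same alignment columns. It is a deliberately naive illustration: it assumes the site is ungapped in the reference row, the species names and sequences are made up, and real tools score conservation far more carefully than by counting species.

```python
# Minimal sketch: count sites and the species in which each site is conserved.
# Assumptions: alignment rows have equal length and the site is ungapped in
# the reference row; all sequences below are hypothetical.


def conserved_sites(alignment, reference, motif):
    """Map each site start (alignment column) to the species that keep the motif."""
    ref_row = alignment[reference]
    sites = {}
    start = ref_row.find(motif)
    while start != -1:
        end = start + len(motif)
        sites[start] = [
            species for species, row in alignment.items()
            if row[start:end] == motif
        ]
        start = ref_row.find(motif, start + 1)
    return sites


if __name__ == "__main__":
    toy_alignment = {
        "human":   "AAACUACCUCAGG",
        "mouse":   "AAACUACCUCAGG",
        "chicken": "AA-CUACGUCAGG",   # motif disrupted by a substitution
    }
    sites = conserved_sites(toy_alignment, "human", "CUACCUC")
    print(len(sites), "site(s) in the reference UTR")
    for column, species in sites.items():
        print(column, species)
```

A gene with several such sites, each present in many species, would rank higher under the summary's reasoning than a gene with a single species-specific site.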