BIOINFORMATICS ANALYSIS OF SNPS IN MICRO-RNA AND ITS PROCESSING MACHINERY GENES

CHAPTER 6

6.1. INTRODUCTION

MicroRNAs (miRNAs) are encoded by post-transcriptional regulatory genes that are responsible for silencing mRNAs. They are a group of non-coding RNAs usually located in the intergenic non-coding regions of protein-coding genes (25%) or in the introns or exons of non-coding RNA genes (75%). They show ubiquitous expression in all cell types. Victor Ambros, Rhonda Feinbaum and Rosalind Lee discovered the first microRNA, lin-4, in the nematode C. elegans. lin-4 was reported to be a gene whose mutants showed aberrant cell lineages during development (Bartel 2004). Because it lacks an open reading frame (ORF), it could not generate a protein. Instead, the gene products were short RNA transcripts, called precursor miRNA and mature miRNA, of lengths 61 and 22 nucleotides, respectively (Moss and Tang 2003). The lin-4 miRNA (22 nt) silenced, or repressed, the translation of the lin-14 gene by hybridizing to the complementary 3' UTR region of the lin-14 transcript. miRNAs themselves arise from RNAs transcribed from the genome. These primary transcripts, called pri-miRNAs, have a stem-loop structure and act as a trigger for miRNA silencing pathways.

MicroRNAs are functionally important endogenous molecules that play a role in the regulation of biological processes such as development, cell proliferation, cell differentiation, apoptosis, transposon silencing, antiviral defence and DNA repair.

Most mammalian miRNAs do not appear to be primarily involved at the upper levels of the gene regulatory cascades but instead appear to operate at many levels to regulate the expression of a diverse set of genes, many of which do not go on to directly influence the expression of other genes (Lewis et al 2003).

miRNAs and siRNAs have a shared central biogenesis and can perform interchangeable biochemical functions. Hence, these two classes of silencing RNAs cannot be distinguished by either their chemical composition or their mechanism of action. Nonetheless, important distinctions can be made, particularly based upon their origin, evolutionary conservation and the genes they target. First, miRNAs are derived from genomic loci distinct from other genes, whereas siRNAs often derive from transposons, viruses, mRNAs or heterochromatic DNA. Second, miRNAs are processed from transcripts that can form local RNA hairpin structures, whereas siRNAs are processed from long bimolecular RNA duplexes or extended hairpins. Third, a single miRNA-miRNA* duplex is generated from each miRNA precursor molecule, whereas a multitude of siRNA duplexes are generated from each siRNA precursor molecule, leading to many siRNAs accumulating from both strands of this extended double-stranded RNA. Fourth, miRNA sequences are nearly always conserved in related organisms, unlike siRNAs, which are seldom conserved. Fifth, and strikingly, endogenous siRNAs typically specify auto-silencing, i.e. they specify the silencing of the same locus from which they originate, whereas miRNAs specify hetero-silencing in that they are produced from genes which silence very different genes. The fifth distinction explains the greater sequence conservation seen for miRNAs. To the extent that siRNAs come from the same loci that they target, a mutational event that changes the sequence of the siRNA would also change the sequence of its regulatory target, and the siRNA regulation would be preserved.
In contrast, a mutation in a miRNA is rarely accompanied by simultaneous compensatory changes at the loci of its targets, and thus selection pressure would preserve the miRNA sequence (Dykxhoorn et al 2003).

Single nucleotide polymorphisms

Identification of genetic variants underlying complex traits is a major goal in genetic studies. It is therefore critical to focus on the genetic variants that are most likely to exert functional impacts (Ye et al 2001). A single-nucleotide polymorphism (SNP) is a DNA sequence variation occurring when a single nucleotide (A, T, C or G) in the genome (or other shared sequence) differs between members of a species (or between paired chromosomes in an individual). For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, contain a difference in a single nucleotide. In this case there are two alleles: C and T. Almost all common SNPs have only two alleles. Within a population, SNPs can be assigned a minor allele frequency, the lowest allele frequency at a locus that is observed in a particular population. For single-nucleotide polymorphisms this is simply the lesser of the two allele frequencies. There are variations between human populations, so a SNP allele that is common in one geographical or ethnic group may be much rarer in another.

SNPs are important with regard to microRNA regulation, as single nucleotide changes occurring naturally in putative target sites are candidates for functional variation that may be of interest for biomedical applications and evolutionary studies (Saunders et al 2007). Beyond these functional aspects, SNPs are also associated with altering and modulating the processing of the primary miRNA.

A large-scale analysis of the occurrence of SNPs in specific functional and non-functional regions of the microRNA has been performed in this study. Results have been obtained for the presently known miRNAs in Homo sapiens to establish the functional and structural implications of the SNPs for structural stability, target binding and post-transcriptional regulation.
After obtaining the SNPs from the reference databases, an attempt was made to distinguish these SNPs based upon their locations within the five domains of the miRNA. Prime importance has been attributed to the SNPs located in the seed region of the miRNAs. Potential targets for these miRNAs have been predicted using the tools RNAhybrid and miRanda.
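The minor allele frequency defined earlier in this section can be illustrated with a minimal sketch. It assumes a biallelic SNP and a flat list of observed alleles; the function name and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Return the lesser of the two allele frequencies at a biallelic SNP.

    `alleles` is an iterable of observed alleles (one per chromosome sampled).
    Assumption: at most two distinct alleles are present.
    """
    counts = Counter(alleles)
    if not counts:
        raise ValueError("no alleles observed")
    if len(counts) > 2:
        raise ValueError("this sketch handles biallelic SNPs only")
    if len(counts) == 1:
        # Monomorphic in this sample: the minor allele was not observed.
        return 0.0
    return min(counts.values()) / sum(counts.values())


# Hypothetical sample of eight chromosomes carrying the C/T alleles
# from the AAGCCTA / AAGCTTA example.
observed = list("CCCCCCTT")
maf = minor_allele_frequency(observed)
```

Because allele frequencies differ between populations, such a calculation would be run separately for each population sample.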

MicroRNA target prediction

RNAhybrid is a target prediction tool for finding the minimum free energy hybridisation of a long and a short RNA. The hybridisation is performed in a kind of domain mode, i.e. the short sequence is hybridised to the best fitting part of the long one. The algorithmic core of RNAhybrid is a variation of classic RNA secondary structure prediction. Instead of folding a single sequence back onto itself in the energetically most favourable fashion, RNAhybrid determines the most favourable hybridization site between two sequences. Though in principle these two sequences can be arbitrarily long, for microRNA target prediction the target candidate will be rather long (hundreds to thousands of nucleotides) and the miRNA will be between 19 and 24 nt. Since microRNA/target interactions have not been reported to contain bifurcations (also called multi-loops), these are not considered by RNAhybrid, which considerably increases the speed of the algorithm. RNAhybrid does not use any RNA folding or pairwise sequence alignment code, but implements an algorithm that was specifically designed for RNA hybridization (Kruger and Rehmsmeier 2006).

miRanda is one of the earliest large-scale target prediction algorithms developed for vertebrates. The standard version of miRanda selects target genes based on three properties: sequence complementarity using a position-weighted local alignment algorithm, free energies of RNA-RNA duplexes using the Vienna RNA folding package, and conservation of targets in related genomes (John et al 2004). These features are weighted in decreasing order. Targets binding to the miRNA may fall into the category of True, Sifted, False or Coding. The relative binding positions of the different categories of targets observed for miRanda and RNAhybrid are shown in Fig 6.1.
miRanda has the highest specificity when tested on shuffled and coding sequences; RNAhybrid has the highest specificity when tested on the validated false target set. Moreover, miRanda and TargetScanS show similar patterns across data sets: their specificity on the coding set drops by around 10% and 40-50% compared with that on the shuffled and false ori sets, respectively. RNAhybrid, however, did not follow this pattern.

Fig. 6.1: Relative binding position for miRanda and RNAhybrid (Zhang and Verbeek 2010).

A possible reason for this is that miRanda and TargetScanS are sequence-based algorithms which respond similarly to different types of sequences, whereas RNAhybrid is energy-based. In general, all three exhibit a relatively low specificity and/or sensitivity, indicating that their prediction accuracy cannot yet be considered satisfactory (Kruger and Rehmsmeier 2006).

The net change in the secondary structural stability of the miRNAs due to the occurrence of variation within them has been assessed using Mfold, a tool for nucleic acid folding and hybridization prediction based upon thermodynamic methods. The core algorithm predicts a minimum free energy, ΔG, as well as minimum free energies for foldings that must contain any particular base pair. Any base pair, ri-rj, between the ith nucleotide and the jth nucleotide that is contained in a folding no more than ΔΔG from the minimum is plotted in a triangular plot called the energy dot plot. The base pair ri-rj is plotted in row i and column j of this matrix (Mathews et al 1999). The free energy increment, ΔΔG, is chosen a priori by the user, who selects a percent suboptimality, P. From this, ΔΔG is computed to be (P/100)|ΔG|. Base pairs within this free energy increment are chosen either automatically or by the user, and foldings that contain the chosen base pair are computed. They have minimum free energy conditional on containing the chosen base pair (Zuker 2003).
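The relation between the percent suboptimality P and the free energy increment can be made concrete with a short sketch. It is not Mfold code: the function names and the per-base-pair energies below are hypothetical, and the sketch only assumes that each base pair has been assigned the minimum free energy of the best folding that contains it.

```python
def energy_increment(min_free_energy, percent_suboptimality):
    """Free energy increment: (P / 100) * |minimum free energy| (kcal/mol)."""
    return (percent_suboptimality / 100.0) * abs(min_free_energy)


def pairs_in_dot_plot(pair_energies, min_free_energy, percent_suboptimality):
    """Return base pairs (i, j) whose best containing folding lies within
    the increment of the overall minimum free energy.

    `pair_energies` maps (i, j) to the minimum free energy of any folding
    that contains that base pair (hypothetical input for this sketch).
    """
    limit = min_free_energy + energy_increment(min_free_energy, percent_suboptimality)
    return sorted(pair for pair, energy in pair_energies.items() if energy <= limit)


# Hypothetical values: overall minimum of -40.0 kcal/mol and P = 5 percent,
# so foldings down to -38.0 kcal/mol are admitted.
example_pairs = {(1, 20): -40.0, (2, 19): -38.5, (3, 18): -37.0}
selected = pairs_in_dot_plot(example_pairs, -40.0, 5)
```

A larger P widens the window and therefore admits more suboptimal base pairs into the energy dot plot.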

6.2. MATERIALS AND METHODS

In silico analysis of SNPs in microRNA

Identification of SNPs in human microRNAs

The genomic co-ordinates (hg19; National Center for Biotechnology Information build 37.1) of all the available human pre-miRNAs (n=721) were taken from the miRBase database (Release 14) (Griffiths-Jones et al 2008). The miRBase database is subdivided into three parts, the miRBase Sequence, Registry and Targets databases, the details of which are explained below:

1. The miRBase Sequence Database is a searchable database of published miRNA sequences with annotation. The data were previously provided by the miRNA Registry.
2. The miRBase Registry continues to provide gene hunters with unique names for novel miRNA genes prior to publication of results.
3. The miRBase Targets database is a new resource of predicted miRNA targets in animals.

Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences are available for searching using BLAST (Altschul et al 1990) and SSEARCH (Smith-Waterman search algorithm) (Pearson 1990), and entries can also be retrieved by name, keyword, references and annotation. All sequence and annotation data are also available for download.

The SNPs within the miRNA genes were identified (dbSNP build 130) (Sherry et al 2001) using the application programming interface tool BioMart (Kasprzyk 2011) in the Ensembl database. SNPs in the flanking regions around each gene (upstream and downstream) were identified by querying the database in a similar manner. Ensembl is a joint project between EMBL-EBI and the Sanger Centre to develop a software system which produces and maintains automatic annotation of eukaryotic genomes.

Secondary structural analysis of microRNA using Mfold

The secondary structural analysis of the 180 microRNAs containing SNPs was performed by means of the Mfold web server (Zuker 2003). This analysis provided structural insight into the effects of variations on the secondary structure of microRNAs. These variations may eventually affect structural stability and target binding, or may have no specific consequences at all. The analysis is based purely upon the thermodynamic aspects of structural stability. The Gibbs free energy (ΔG) for each wild-type and mutant microRNA has been calculated. Various parameters are available, such as the temperature (37 °C), ionic conditions (1 M NaCl, no divalent ions), percent suboptimality (which controls the free energy increase), the upper bound on the number of foldings allowed (50), the window parameter and the maximum distance between paired bases. Default parameters have been used for miRNA structure prediction.

Domain classifications of pre-miRNA

The SNPs within the gene were classified into five well-defined regions or domains of the microRNA (Fig. 6.2). MicroRNAs act as adaptors that employ a silencing complex to target mRNAs by selective base-pairing, primarily in the 3' UTR region. Target interaction does not require perfect complementarity between microRNA and mRNA sequences, although near-perfect base-pairing in a small region at the 5' end (positions 2-7) of the microRNA (sometimes termed the "seed") appears to be one of the key determinants of target recognition. The SNPs which were confined to the seed region of the microRNA were specifically used for target prediction.
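Whether a SNP falls in the seed can be decided from genomic co-ordinates, as in the following minimal sketch. The function names, the co-ordinate convention (1-based, inclusive) and the example co-ordinates are assumptions made for illustration; the seed window defaults to positions 2-7 as stated above and can be widened to 2-8.

```python
def mature_position(snp_pos, mature_start, mature_end, strand):
    """Return the 1-based position of a SNP within the mature miRNA,
    counted from its 5' end, or None if the SNP lies outside it.

    Co-ordinates are genomic, 1-based and inclusive (assumed convention).
    """
    if not mature_start <= snp_pos <= mature_end:
        return None
    if strand == "+":
        return snp_pos - mature_start + 1
    # On the minus strand the 5' end of the miRNA is the higher co-ordinate.
    return mature_end - snp_pos + 1


def in_seed(snp_pos, mature_start, mature_end, strand, seed=(2, 7)):
    """True if the SNP falls within the seed window of the mature miRNA."""
    pos = mature_position(snp_pos, mature_start, mature_end, strand)
    return pos is not None and seed[0] <= pos <= seed[1]


# Hypothetical mature miRNA spanning genomic positions 100-121.
plus_strand_seed_snp = in_seed(101, 100, 121, "+")    # position 2 of the miRNA
minus_strand_seed_snp = in_seed(120, 100, 121, "-")   # position 2 of the miRNA
```

The same position helper could be extended to the other four domains once the co-ordinates of the star strand, stem and loop are known.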

Fig. 6.2: Five domain regions of microRNA

The 3' UTR sequences of the entire human genome have been retrieved from the UCSC Genome Browser. The UCSC Genome Browser is developed and maintained by the Genome Bioinformatics Group, a cross-departmental team within the Center for Biomolecular Science and Engineering (CBSE) at the University of California Santa Cruz (UCSC).

MicroRNA target prediction

Two tools, RNAhybrid and miRanda, have been used in standalone mode to predict the targets of each microRNA having a SNP within its seed region, as these variations are crucial factors affecting miRNA-mRNA binding.

miRanda

miRanda computes optimal sequence complementarity between a set of mature microRNAs and a given mRNA using a weighted dynamic programming algorithm. The key extension to the Smith-Waterman algorithm is that the alignment score is a weighted sum of match and mismatch scores for base pairs (including G:U wobbles) and gap penalties. Weights are position-dependent and reflect the relative importance of the 5' and 3' regions in a finely adjustable way. The weight of each position can be optimized to reflect experimental facts and physical principles. In addition, miRanda uses an estimate of the free energy of formation of the microRNA:mRNA duplex as a secondary filter. A natural consequence of the weighted alignment optimization is the inclusion of potential targets with some mismatches at the dominant 5' end of the microRNA, but with otherwise good complementarity to the target gene. To reflect the importance of the 5' region, base-pairing in positions 2-8 of the microRNA is given a higher weight when computing the microRNA:mRNA alignment score. This approach, as designed into the original miRanda algorithm, is congruent with experimentally validated targets that do not contain perfect seed matches, and includes what other approaches have subsequently introduced as 3'-compensatory matches or combinations of seed and 3' match rules (Betel et al 2008).

The miRNA sequence given as the query is searched against the file containing the human 3' UTR sequences using the default parameters:

    miranda mir15.fasta 3_UTR.fasta

The number of targets without the SNP and after incorporation of the SNP in the seed region has been analyzed, and the total number of hits and target sites has been listed in both cases. The results were analyzed with a program written in Perl, which calculates the total number of genes giving hits, the number of genes not giving hits and the number of target sites within the genes giving hits.

RNAhybrid

RNAhybrid performs a thorough statistical analysis of the MFEs (minimum free energies). It normalizes the MFEs with the sequence lengths of miRNAs and targets, and models such normalized MFEs as extreme value distributed. The parameters of these distributions are estimated specifically for every miRNA with a second program, RNAcalibrate, and are subsequently used to assign p-values to normalized MFEs. The significance of multiple binding sites in a single target is evaluated with Poisson statistics. For comparative studies on multiple organisms, such as different Drosophila species, it combines Poisson p-values from the orthologous targets using the effective number of sequences. This effective number respects the fact that related sequences cannot always be treated as statistically independent. Calculation of these effective numbers is miRNA and target specific and is accomplished by a third program, RNAeffective.

In this study, RNAhybrid has been run with the human 3' UTR file as the reference file. The miRNA containing the SNP within the seed region is given as the query, and the 3' UTRs of the entire human genome are given as the target. The command used for searching the query against the targets is:

    RNAhybrid -q <miRNA FASTA file> -t 3UTR_human > result

Here:
-q is the name of the file containing the FASTA sequence of the miRNA
-t is the name of the file containing the 3' UTR sequences of the human genome
> result writes the results into a file named result

The targets with the lowest p-value and minimum free energy have been considered the best hits. In this way the binding efficacy of the miRNA before and after the occurrence of variation has been assessed, and the SNPs showing a drastic effect on target binding have been highlighted.

The validated targets have been retrieved from the TarBase database, and the unknown chromosomal positions of the mRNA binding regions have been established. The NM reference number for these known targets was then obtained from the National Center for Biotechnology Information (NCBI) in order to identify the target genes, and their sequences were retrieved from the UCSC Table Browser for the respective miRNAs targeting them. As the targets for these microRNAs are already known, a specific program has been written to obtain the chromosomal positions which had not yet been determined.
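The selection of best hits by lowest p-value and minimum free energy can be sketched as a simple sort. This is an illustration only: it assumes the RNAhybrid output has already been parsed into records, and the record type, gene names and values below are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    target: str      # target gene name
    mfe: float       # minimum free energy in kcal/mol (more negative = more stable)
    p_value: float   # p-value assigned to the normalized MFE


def top_hits(hits, n=5):
    """Return the n best hits: lowest p-value first, ties broken by lowest MFE."""
    return sorted(hits, key=lambda h: (h.p_value, h.mfe))[:n]


# Hypothetical parsed hits (not values from this study).
parsed = [
    Hit("geneA", -25.1, 0.010),
    Hit("geneB", -30.2, 0.001),
    Hit("geneC", -28.0, 0.001),
    Hit("geneD", -22.4, 0.200),
]
best_two = [h.target for h in top_hits(parsed, n=2)]
```

Running the same ranking on the wild-type and SNP-containing miRNA gives two top-5 lists that can then be compared gene by gene.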

6.2.2. In silico analysis of SNPs in microRNA processing machinery genes

Data mining of SNPs

The various proteins involved in the microRNA biogenesis pathway were searched for in the literature database PubMed (NCBI). From 22 different scientific papers, 38 proteins, their domains and their roles in the biogenesis process were identified. Through extensive mining of the databases of the International HapMap Project and dbSNP, and using the Perl-interface program BioMart, the SNPs in the 38 genes of interest were retrieved and mapped into six different domains of the gene: (i) promoter region, (ii) 5' UTR, (iii) exon, (iv) intron, (v) 3' UTR, and (vi) 3' near-gene region. The SNP density in each of the six domains was calculated.

Sequence analysis

SNPs in the functionally important domains (promoter, exon and 3' UTR) of the microRNA biogenesis genes were analyzed using in silico tools.

SIFT

The exonic SNPs were categorized into non-synonymous and synonymous changes. The non-synonymous SNPs were submitted to a sequence homology-based tool called SIFT (Sorts Intolerant From Tolerant), which predicts whether an amino acid substitution in a protein will have a phenotypic effect. More details of SIFT are given elsewhere in this thesis.

MicroRNA target prediction

The potential target sites for the microRNAs in the 3' UTRs of the mRNAs were predicted using the stand-alone tool miRanda (version 3.3a). 3' UTR sequences were retrieved from the nucleotide database of NCBI, and human mature microRNA sequences were obtained from miRBase. The effect of polymorphisms in microRNA target sites was predicted using miRanda.
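The per-domain SNP density mentioned under data mining can be sketched as follows. The unit (SNPs per kilobase) is an assumption, since the text does not state one, and the region lengths and counts are hypothetical placeholders rather than results of the study.

```python
def snp_density_per_kb(snp_count, region_length_bp):
    """SNP density expressed as SNPs per 1000 bp (assumed unit)."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return 1000.0 * snp_count / region_length_bp


def density_by_domain(domains):
    """Map each gene domain to its SNP density.

    `domains` maps a domain name to (snp_count, region_length_bp).
    """
    return {name: snp_density_per_kb(count, length)
            for name, (count, length) in domains.items()}


# Hypothetical counts and lengths for one gene.
example_gene = {
    "promoter": (4, 2000),
    "exon": (5, 2500),
    "3' UTR": (3, 1500),
}
densities = density_by_domain(example_gene)
```

Normalizing by length in this way keeps long introns from dominating the comparison simply because they contain more bases.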

Transcription factor prediction

Flanking sequences of 100 bp for each SNP were obtained from dbSNP. The SNPs with a MAF of >1% were subjected to transcription factor analysis using Match TRANSFAC (BIOBASE).

Structural analysis

The functional role of the nsSNPs was further analyzed from the protein structure.

Homology modelling

The sequences of all 38 microRNA biogenesis related proteins were retrieved from UniProt (protein sequence database). Templates were searched for proteins whose crystal structures have not been solved completely or are not available in the Protein Data Bank. The BLASTP program was used to search for protein templates against the PDB. The PHYRE2 and I-TASSER protein modelling servers were used to model the proteins because the BLAST searches did not find sufficiently suitable templates. Results from both servers were subjected to structural validation, and the best validated structures were taken for further analysis.

Protein structure validation

The predicted protein structures were energy minimized using the SCHRODINGER Protein Preparation Wizard. To validate each modelled structure, a Ramachandran plot was drawn for the energy-minimized structure using PROCHECK.

Stability analysis

The PoPMuSiC program is a tool for evaluating the changes in stability of a given protein or peptide under single-site mutations, on the basis of the protein's structure. Three options are available:

Single: PoPMuSiC predicts the stability change resulting from a given mutation.
Systematic: PoPMuSiC evaluates the stability changes resulting from all possible mutations, and returns a report containing a list of the most stabilizing or destabilizing mutations, or of the mutations that do not affect stability.
File: PoPMuSiC predicts the stability changes resulting from a list of mutations specified by the user in an uploaded file.

In this study the File option was used: the PDB structure of the protein to be analyzed was uploaded together with a text file listing all the mutations of that protein. Results were analyzed using the predicted ΔΔG (in kcal/mol) of each mutation.

RESULTS AND DISCUSSION

General observations

To date, 721 microRNAs in Homo sapiens have been listed in miRBase. The number of microRNAs on each chromosome is shown in Fig 6.3. The maximum numbers of microRNAs are found on Chr 19 (81) and Chr X (82). The minimum number is found on chromosome 21, with only 5 miRNAs. Three mitochondrial miRNAs have been predicted; however, these have not yet been validated, and in the current version these three miRNAs were removed from the database. Chr Y is devoid of any microRNA, for a reason yet to be identified.

The microRNAs were classified based upon their location within exonic, intronic and intergenic regions. The bar diagram (Fig. 6.4) shows the number of intronic miRNAs (mirtrons) to be 344 and the number in intergenic regions to be 337, while the number within exons is the least, at only 56. This reinforces the fact that the majority of microRNAs are present in intronic and intergenic regions, as stated before.

Fig. 6.3: MicroRNA count within each chromosome

Fig. 6.4: Number of miRNAs in introns, intergenic and exonic regions.

Fig. 6.5: SNP count in validated human pre-miRNAs and flanking regions

Single nucleotide polymorphism in microRNA

SNPs in human microRNA genes were identified by querying the Single Nucleotide Polymorphism database (dbSNP) at the genomic co-ordinates of the 721 genes. A total of 257 SNPs (including in-del polymorphisms) were obtained within 180 miRNAs. For the purpose of comparison, dbSNP was also queried for the flanking regions around the pre-miRNAs. As the flanking regions around the pre-miRNAs are mostly intergenic in origin, these showed a higher density of SNPs, including 1250 within the downstream region of the miRNA genes in addition to those located in the upstream flanking region. Fig 6.5 shows the SNP count within and around the miRNA genes.

The pre-miRNA is composed of different domains with different functional significance. To gain insight into the potential functional importance of the identified polymorphisms, the SNPs were mapped to five different domains of the pre-miRNAs: (i) the seed region, (ii) the mature region excluding the seed region (MIR_seed), (iii) the stem region complementary to the MIR (MIR*), (iv) the stem region that is neither the MIR nor MIR*, and (v) the loop region (Fig 6.2). The 14 SNPs identified within the seed region were attributed prime importance in the present study. In addition, 22 SNPs in the loop region, 43 in the complementary or MIR* domain, 121 within the stem and 55 within the MIR_seed domain were located based upon their chromosomal positions. The bar diagram (Fig 6.6) displays the SNP distribution between the five domains of the miRNAs. The percentage of SNPs within each domain was also calculated and is shown in Fig 6.7.

Fig. 6.6: SNP distribution between 5 domains of miRNA.

Fig. 6.7: Percentage of SNPs within each domain

Overall, ~90% of human pre-miRNAs have no reported polymorphisms, and most observed polymorphisms were not present within the seed region (Fig 6.6, 6.7). This suggests a strong selective pressure on human pre-miRNAs.

Stability analysis of microRNA using Mfold

Each SNP was assessed for whether it brought about a favourable change in the secondary structural stability of the miRNA gene within which it occurs. The secondary structures (Appendix 6.1) show the miRNAs within which the polymorphisms occurred, along with the free energy ΔG before and after the occurrence of variation, with the position of the SNP underlined. The mature region is shown in red and the polymorphism position in green. 41 SNPs showed a favourable decrease in the ΔG of the miRNA structure and hence seem to stabilize these miRNAs. 30 SNPs showed no net change in ΔG and did not affect the stability of the miRNA in any manner. The remaining 196 SNPs, which showed a considerable difference in ΔG, may be responsible for a drastic change in the target binding specificity of these miRNAs, as the changed conformation of the structure may attract a different set of targets (Appendix 6.1).

miRNA target prediction of seed domain SNPs

RNAhybrid

The 14 microRNAs containing SNPs in the seed region were evaluated using the standalone version of RNAhybrid. The top 5 targets with the lowest P-values and lowest energies are reported for the miRNA genes before and after polymorphism (the latter indicated by SNP in brackets in Table 6.1). Tables for 5 miRNAs are shown as examples. As evident from Table 6.1, the top 5 targets binding to mir-499 and mir-513 differ in the presence and absence of the polymorphism. In the case of mir-124 and mir-219, only 1 target, i.e. C11orf74 and CCDC124, respectively, was retained in the presence of the SNP. These targets, however, bind with lower energies and lower P-values after polymorphism. This indicates that these SNPs are favourable for the hybridization of the miRNA to the target in terms of structural stability.

Table 6.1: miRNA targets before and after polymorphism (top 5 target genes per miRNA, ranked by MFE* and P-value)

miRNA | Target genes
mir-124 | NLRX, C11orf, ATPIF, ZNF, KRTAP
mir-124 (SNP) | C11orf, GMPS, DOM3Z, C4orf, MATN
mir-125a | DUX, LOC, LOC, LOC, LOC
mir-125a (SNP) | RETN, SHH, DUX, LOC, LOC
mir-219 | SPATC, SLC1A, MAN2B, FKSG, CCDC
mir-219 (SNP) | CCDC, C2orf, COX6A, DUX, LOC
mir-499 | EHMT, C9orf, CCS, ATP6AP1L, TAAR
mir-499 (SNP) | LOXHD, GALR, WDR, FAM128B, WNT9A
mir-513 | RTL, CRYBA, PGAM, C4orf, C1orf
mir-513 (SNP) | TUBGCP, RNF, TCEB, GABRQ, C9orf

*MFE values are in kcal/mol.

miRanda

An overall conclusion was drawn from the results obtained by using the miRanda algorithm for target prediction. A graph was generated for each of the 14 miRNAs containing SNPs in the seed region, in order to identify the difference in the number of targets binding before and after variation. The results are as follows:

Fig. 6.8: hsa-mir-124 target prediction. Number of mRNAs not binding to the miRNA (No Hits), number of mRNAs binding (Hits) and number of target sites within the binding mRNAs (Target Sites).
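The three quantities plotted in these bar diagrams (No Hits, Hits and Target Sites) can be tallied as in the sketch below. It is written in Python rather than the Perl used in the study and is not the author's program; it assumes the miRanda output has already been reduced to a count of predicted target sites per gene, and the gene names and counts are hypothetical.

```python
def summarize_hits(site_counts):
    """Tally genes with and without hits and the total number of target sites.

    `site_counts` maps each 3' UTR gene name to the number of predicted
    target sites for one miRNA (0 if the gene gave no hit).
    """
    hits = sum(1 for n in site_counts.values() if n > 0)
    return {
        "no_hits": len(site_counts) - hits,
        "hits": hits,
        "target_sites": sum(site_counts.values()),
    }


def hit_difference(before, after):
    """Change in each tally after incorporating the SNP (after minus before)."""
    return {key: after[key] - before[key] for key in before}


# Hypothetical per-gene site counts for a wild-type and a SNP-containing miRNA.
wild_type = summarize_hits({"geneA": 2, "geneB": 0, "geneC": 1, "geneD": 3})
with_snp = summarize_hits({"geneA": 1, "geneB": 0, "geneC": 0, "geneD": 3})
change = hit_difference(wild_type, with_snp)
```

A negative change in hits indicates that the seed SNP reduced the number of genes predicted to bind the miRNA.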

The bar diagram (Fig. 6.8) indicates that more mRNA genes bind to hsa-mir-124 before variation than after it: 2151 genes gave hits after variation, a clear difference of 4068 genes between the two cases. There was also a clear drop in the number of target sites after variation. Thus this SNP showed a drastic effect on the target binding of the miRNA.

The bar diagram (Fig. 6.9) shows only a very small difference between the number of hits before and after variation. Strikingly, however, the number of hits and target sites binding to hsa-mir-125a before mutation (16,802) was lower than the number after the occurrence of the SNP in the seed region (17,543).

Fig. 6.9: hsa-mir-125a target prediction

Fig. 6.10: hsa-mir-499 target analysis graph

The bar diagram (Fig 6.10) for hsa-mir-499 target analysis also shows a similar effect of the SNP on mRNA binding to the miRNA. The number of genes retrieved as hits before SNP occurrence was 7672, which increased to 8514 after the incorporation of the SNP in the miRNA. hsa-mir-219 displayed a similar trend: as seen in Fig 6.11, the number of target binding sites and the number of hits rose slightly after the incorporation of the SNP within the microRNA. These are therefore examples of single nucleotide polymorphisms which are favourable for target binding.

Fig. 6.11: Target analysis for hsa-mir-219

In the following figures (Fig 6.12, 6.13, 6.14, 6.15), hsa-mir-513, hsa-mir-518d, hsa-mir-585 and hsa-mir-627 showed the same pattern of reduction in the number of targets binding to them after the single nucleotide polymorphism in the seed region. This highlights the importance of the seed region (positions 2-8) for the hybridizing potential and specificity of mRNA silencing via the 3' UTR in human beings. While mir-513 and mir-518d showed only a small difference in the number of hits, mir-627 and mir-585 showed a considerable reduction in target binding.

Fig. 6.12: Target analysis for hsa-mir-513

Fig. 6.13: Target analysis for hsa-mir-518d

Fig. 6.14: Target analysis for hsa-mir-585

Fig. 6.15: Target analysis for hsa-mir-627

Fig. 6.16: Target analysis for hsa-mir-662

hsa-mir-662 (Fig 6.16) showed an interesting result, as there was no change in the number of targets binding to this miRNA even after the occurrence of the SNP in the seed region. There was no net change in the number of hits due to this variation. hsa-mir-1268 (Fig 6.19) likewise showed no net effect of the mutation on the number of targets, but differed in the number of target sites.

Fig. 6.17: Target analysis for hsa-mir-941

Fig. 6.18: Target analysis for hsa-mir-1236

Fig. 6.19: Target analysis for hsa-mir-1268

Fig. 6.20: Target analysis for hsa-mir-1276 (bar chart: hits and target sites, miRNA without and with the SNP)

Fig. 6.21: Target analysis for hsa-mir-1302 (bar chart: hits and target sites, miRNA without and with the SNP)

SNPs in validated target sites

The number of SNPs in the target sites was much lower than the number of SNPs in the seed region of the miRNAs. Only 7 SNPs were found within 139 target sites (843 bases), showing the low rate of polymorphism within the messenger RNAs (Fig. 6.22). We may therefore suggest that polymorphisms in the miRNA seed region are more important to the target silencing mechanism than polymorphisms in the target sites.

Fig. 6.22: SNPs in validated target sites (pie chart "Target site SNPs": target sites 95%, SNP count 5%)

SNPs in microRNA biogenesis proteins

Literature search

To date, analytical studies have covered approximately ten biogenesis proteins. Through a literature survey, we found 38 proteins that are involved directly or indirectly in the microRNA biogenesis process. For most of these proteins we were able to identify the domains and the role in microRNA biogenesis from the literature (Table 6.2).

Table 6.2 MicroRNA biogenesis proteins, their domains and function.

Protein        Domains                       Function
Dicer          RNase III, DEAD, PAZ, dsRBD   miRNA precursor processing
AGO3           PAZ, PIWI                     Short RNA binding
AGO4           PAZ, PIWI                     Short RNA binding
Gemin3         DEAD                          RNA helicase
Drosha         RNase III                     Processing of primary miRNA transcript
Exportin-5     NA                            Nuclear export of miRNA precursors
Gemin4         NA                            Not investigated
TGF-β, SMADs   R-Smad, Co-Smad               Induces SMAD signaling to the miR-21 precursor and enhances its efficient processing by Drosha
TRBP           NA                            Stabilizes Dicer
Importin-8     NA                            Required for cytoplasmic miRNA-guided gene silencing
ELAV1          AU-rich element               Inhibits miR-122 repression of target sites
AGO1           PAZ, PIWI                     Short RNA binding
AGO2           PAZ, PIWI                     Short RNA binding
Dnd1           U-rich region                 Inhibits miRNA access to target mRNA
TNRC6B         RRM, GW repeats               miRNA-guided cleavage
DCP1a          NA                            Not investigated
DCP2           Nudix                         Not investigated
MOV10          DExH box                      miRNA-guided cleavage
PRMT5          Methyl-transferase            Not investigated
TNRC6A/GW182   RRM, GW repeats               miRNA-guided cleavage, translational repression
TTP            Zn-finger                     AU-rich mRNA destabilization
eIF4E          NA                            Not investigated
Rck/p54        DEAD box                      miRNA-guided cleavage
PACT           dsRBD                         Small RNA processing, RISC activity
FMRp           KH domain, RGG box            Not investigated
FXR1           KH domain, RGG box            Not investigated
FXR2           KH domain, RGG box            Not investigated
KIF17b         Kinesin motor                 Not investigated
MVH            DEAD box                      Not investigated
MAEL           HMG box                       Not investigated

SNP density

The overall SNP density in the different domains of all the genes is shown in Fig. 6.23. It indicates that SNP density is usually highest in the 3' UTR region. We can hypothesize that, since these proteins are directly or indirectly involved in the biogenesis process, polymorphisms may create or alter target sites for some microRNAs within a gene, which could eventually lead to negative feedback on the synthesis of a microRNA.

Fig. 6.23: Overall SNP density in the six domains (bar chart over 5' near gene, 5' UTR, exon, intron, 3' UTR and 3' near gene)
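A per-region SNP density of this kind can be computed as SNP count divided by region length. The sketch below assumes that density is expressed as SNPs per kilobase, which the text does not state; the counts and lengths are invented placeholders, not data from this study, and the function names are our own.

```python
REGIONS = ["5' near gene", "5' UTR", "exon", "intron", "3' UTR", "3' near gene"]


def snp_density(snp_counts: dict, region_lengths_bp: dict) -> dict:
    """SNPs per kilobase for each of the six regions (assumed definition)."""
    density = {}
    for region in REGIONS:
        length = region_lengths_bp[region]
        density[region] = snp_counts[region] * 1000 / length if length else 0.0
    return density


def densest_region(density: dict) -> str:
    """Name of the region with the highest SNP density."""
    return max(density, key=density.get)


# Placeholder numbers for one hypothetical gene.
counts = {"5' near gene": 4, "5' UTR": 1, "exon": 6,
          "intron": 90, "3' UTR": 12, "3' near gene": 3}
lengths = {"5' near gene": 2000, "5' UTR": 500, "exon": 3000,
           "intron": 60000, "3' UTR": 4000, "3' near gene": 2000}

density = snp_density(counts, lengths)
print(density)
print("densest region:", densest_region(density))
```

Normalizing by length matters here: in the placeholder data the intron carries the most SNPs in absolute terms but has the lowest density.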

On comparing the overall SNP density with the SNP density of individual genes such as Ago1 (Fig. 6.24) and Importin-8 (Fig. 6.25), it is clear that SNP density can be high in any domain of a gene, not just the 3' UTR. Polymorphisms in all the regulatory domains could therefore play an important role and need to be studied further.

Fig. 6.24: SNP density in Ago1 (pie chart over 5' near gene, 5' UTR, exon, intron, 3' UTR and 3' near gene; slice labels 21%, 9%, 0%, 26%, 19%, 25%)

Fig. 6.25: SNP density in IPO8 (pie chart over 5' near gene, 5' UTR, exon, intron, 3' UTR and 3' near gene; slice labels 0%, 11%, 9%, 22%, 28%, 30%)

Non-synonymous SNPs

All the SNPs in the exonic region of each gene were categorized as synonymous or non-synonymous. The non-synonymous SNPs were further classified as missense or nonsense changes. A missense change alters an amino acid and hence the protein product formed, whereas a nonsense change results in a premature stop codon. Both kinds of change affect the function of the protein encoded by the gene. In this study we were interested to know how the mutation in the SNPs ...

SIFT

To identify the important non-synonymous SNPs in each gene, we used SIFT. Variations predicted to damage the protein were considered important SNPs (Table 6.3).

Table 6.3 SIFT results for DCP2 protein.

SNP        AA change   Prediction   Score   Prediction   Score
rs33555*   L16F        TOLERATED    1       DAMAGING     0
rs *       S71I        TOLERATED    0.62    DAMAGING     0.02
rs         T213A       TOLERATED    1       TOLERATED    0.34
rs         Q298K       TOLERATED    1       TOLERATED    0.77

Note: * important SNP which may damage the protein

Promoter region SNPs: Transfac-MATCH

To analyze the significance of SNPs in the promoter region, all SNPs with a reported minor allele frequency (MAF) of over 1% were selected and analyzed with the MATCH (Transfac) tool. The binding sites of various transcription factors were predicted in the promoter region of each gene, and MATCH was also used to analyze how polymorphisms in the promoter region alter transcription factor binding. Out of 38 genes, promoter SNPs in only 9 genes altered transcription factor binding (Fig. 6.26). Binding of 15 transcription factors was affected by SNPs; of these, PAX6 and Nkx2-5 were the two most frequently affected.

PAX6, a transcription factor, has recently been reported as a tumor suppressor in glioblastoma and acts as an early differentiation marker for neuroendocrine cells. Its role in prostate cancer is also being investigated.

Nkx2-5 has been associated with congenital heart defects, and its role in prostate and colon cancer is being investigated.

Fig. 6.26: MATCH-Transfac results (bar chart of total SNPs and significant SNPs for AGO1, ELAV1, GEMIN3, GEMIN4, MAEL, PACT, TNRC6A, SMAD2 and SMAD4)

The change in the transcription factor binding site of the PACT gene is shown in Fig. 6.27. The transcription factor Pax-6 binds to the wild-type sequence of the promoter region, whereas the polymorphism (highlighted in yellow) creates a binding site for Nkx2-5.

Fig. 6.27: Promoter sequence of PACT gene (wild-type and mutant)
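The wild-type versus variant comparison can be sketched as a set difference over predicted transcription factors. The sketch below does not call MATCH or Transfac; it assumes the factor names predicted for each allele have already been collected into sets by an earlier step, and the function name `binding_changes` is hypothetical. The PACT example further assumes, as the description above suggests, that the Pax-6 site is not predicted on the variant allele.

```python
def binding_changes(wild_type_tfs, variant_tfs) -> dict:
    """Compare the transcription factors predicted on two promoter alleles."""
    wild, variant = set(wild_type_tfs), set(variant_tfs)
    return {
        "lost": sorted(wild - variant),       # predicted only on the wild type
        "gained": sorted(variant - wild),     # predicted only on the variant
        "unchanged": sorted(wild & variant),  # predicted on both alleles
    }


# PACT promoter as described in the text (assumed complete for illustration).
pact = binding_changes({"Pax-6"}, {"Nkx2-5"})
print(pact)
```

A promoter SNP would count as significant in this scheme whenever either the `lost` or the `gained` list is non-empty.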

MicroRNA target prediction of 3' UTR SNPs

To predict microRNA target sites in the 3' UTR of the genes, the stand-alone tool miRanda was used. Here we show the results for the Ago4 gene, which encodes a key protein of the RISC. Fig. 6.28 shows the number of microRNA binding sites and the number of microRNAs binding in the 3' UTR of Ago4.

Fig. 6.28: miRanda results for Ago4 (bar chart: sites and hits)

We observed that the wild-type sequence has 606 binding sites, to which 488 microRNAs could bind. In the presence of the polymorphism (rs ) in the 3' UTR of Ago4, however, an additional microRNA binding site was created (Fig. 6.28).
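The two quantities in Fig. 6.28 differ because one microRNA can have several sites in the same 3' UTR. The following is a minimal sketch of that bookkeeping, assuming the predicted hits have already been parsed into (miRNA, start position) pairs; parsing of the miRanda output itself is not shown, and the example records (`miR-A`, `miR-B`, `miR-C`) are invented.

```python
def summarize_hits(hits) -> tuple:
    """Return (number of binding sites, number of distinct miRNAs)."""
    sites = set(hits)                      # one entry per (miRNA, start) pair
    mirnas = {name for name, _ in sites}   # distinct miRNA names
    return len(sites), len(mirnas)


def new_sites(wild_type_hits, variant_hits) -> list:
    """Sites predicted on the variant 3' UTR but not on the wild type."""
    return sorted(set(variant_hits) - set(wild_type_hits))


# Invented example: miR-A binds twice, and the SNP adds one site for miR-C.
wild_type_hits = [("miR-A", 120), ("miR-A", 845), ("miR-B", 300)]
variant_hits = wild_type_hits + [("miR-C", 512)]

print(summarize_hits(wild_type_hits))
print(summarize_hits(variant_hits))
print(new_sites(wild_type_hits, variant_hits))
```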

Fig. 6.29: miRanda output (columns: microRNA, SNP ID, score, energy, miRNA length; the row shown is for hsa-mir-663b)

In the output shown in Fig. 6.29, the region in green represents the seed sequence, and the polymorphism (G/C), highlighted in yellow, causes the binding of an additional microRNA.

Structural analysis of nsSNPs

BLAST results

Homology modelling, a template-based method, is one of the best computational methods for predicting protein structure. For the structural analysis of the effect of the nsSNPs, a BLASTP search was first carried out to retrieve suitable templates for homology modelling of the proteins. The following parameters were used to filter the BLAST results:

Identity greater than 35%
E-value up to E-10
Significant query coverage

The E-values for all the templates were insignificant, and the query coverage for most of the templates was not more than 70 amino acids. Similar results were obtained for all the other proteins (Table 6.4). Proteins with low homology were modelled with PHYRE2 and I-TASSER.

Table 6.4 BLAST results for Dicer1 protein.

Columns: PDB ID, identity, positive, match, mismatch, gap, query start, query end, E-value, score
PDB entries: 3c4b eb a ffl e ffl ffl z0m eaq i oyy o6b

PHYRE2

The amino acid sequences of all the proteins were submitted to the homology-based modelling tool PHYRE2. SNP (rs ) causes the change S73T (highlighted in yellow), in which a polar amino acid is replaced by another polar amino acid; this change is therefore not significant. However, SNP (rs ) causes a significant change, D147N (highlighted in red), in which a negatively charged amino acid is replaced by a polar one.

I-TASSER

I-TASSER is a protein modelling tool based on iterative threading, with ab initio modelling of regions that lack a template. All the protein sequences were submitted to I-TASSER. The structures with the higher C-score (confidence score) were taken for analysis. The C-score estimates the quality of the models predicted by I-TASSER; it is calculated from the significance of the threading template alignments and the convergence parameters of the structure assembly simulations.

Fig. 6.30: 3-D structure of eIF4E protein generated by PHYRE2 (the two SNP positions are labelled by rs ID)

The C-score is typically in the range [-5, 2]; a higher C-score signifies a model with higher confidence, and vice versa. TM-score and RMSD are standard measures of structural similarity between two structures and are usually used to measure the accuracy of structure modelling when the native structure is known. For the Gemin4 protein, five models with different C-scores were obtained (Fig. 6.31).

Model1: Model2: Model3: Model4: Model5:

Since the C-score of model 1 was the highest, it was selected for further analysis.
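The selection rule amounts to taking the model with the maximum C-score. A minimal sketch follows; the five scores are invented placeholders, because the actual Gemin4 values are not given here, and the helper names are our own.

```python
C_SCORE_TYPICAL_RANGE = (-5.0, 2.0)


def best_model(c_scores: dict) -> str:
    """Name of the model with the highest C-score."""
    if not c_scores:
        raise ValueError("no models supplied")
    return max(c_scores, key=c_scores.get)


def outside_typical_range(c_scores: dict) -> list:
    """Models whose C-score falls outside the typical [-5, 2] interval."""
    low, high = C_SCORE_TYPICAL_RANGE
    return sorted(name for name, score in c_scores.items()
                  if not low <= score <= high)


# Placeholder scores for five hypothetical models.
scores = {"model1": -1.2, "model2": -2.4, "model3": -3.1,
          "model4": -3.3, "model5": -4.0}

print("selected:", best_model(scores))
print("outside typical range:", outside_typical_range(scores))
```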

Fig. 6.31: 3D structure of Gemin4 generated by I-TASSER

Energy minimization & validation

The modelled structures were energy minimized using SCHRODINGER. A project table was created and the PDB structure was imported into the table. Each protein was energy minimized with the OPLS2005 force field for 500 iterations. To validate the energy-minimized structure, a Ramachandran plot was generated with PROCHECK (Fig. 6.32). This determines whether the alpha-helices, beta-sheets and turn regions of the protein are present in a stable orientation.

Fig. 6.32: Ramachandran plot for Gemin4

In a Ramachandran plot, a protein with more than 90% of its residues in the most favoured regions is considered a good structure. In Gemin4, however, 87.6% of the residues are in the most favoured regions. Energy minimization and validation by Ramachandran plot were carried out in the same way for each protein. Modelled structures with a good Ramachandran plot were selected for further analysis.
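The 90% criterion can be written as a small check. The sketch below uses the cutoff and the Gemin4 percentage quoted above; the residue counts are hypothetical and were chosen only so that they reproduce 87.6%.

```python
def most_favoured_percent(most_favoured: int, total_residues: int) -> float:
    """Percentage of residues that fall in the most favoured regions."""
    if total_residues <= 0:
        raise ValueError("total_residues must be positive")
    return most_favoured / total_residues * 100.0


def passes_quality_cutoff(percent: float, cutoff: float = 90.0) -> bool:
    """True when more than `cutoff` percent of residues are most favoured."""
    return percent > cutoff


# Hypothetical counts giving the 87.6% reported for Gemin4.
gemin4_percent = most_favoured_percent(438, 500)
print(f"{gemin4_percent:.1f}% -> passes: {passes_quality_cutoff(gemin4_percent)}")
```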

PoPMuSiC

PoPMuSiC predicts the change in stability of a given protein structure caused by a point mutation (Fig. 6.33).

Fig. 6.33: PoPMuSiC results for Gemin4 (ΔΔG per SNP)

It predicts the ΔΔG value in kcal/mol: the lower the value, the more stable the protein structure, and vice versa. The PoPMuSiC results for the Gemin4 gene are shown in the graph. Most of the SNPs in Gemin4 destabilize the protein. SNP (rs ) makes the protein more unstable than any other mutation.

CONCLUSION

For the past two decades, only 6 major proteins were known to be involved in microRNA biogenesis, but in recent years researchers have reported several other proteins that are involved directly or indirectly. In this study, we found through a literature survey that 38 proteins are involved in microRNA biogenesis. Expression of these proteins is also regulated in diseases such as cancer. Our interest was to know how microRNA expression is regulated through the biogenesis proteins and how mutations in them indirectly affect microRNA function. However, sufficient mutational information on these proteins is not available, leaving no choice but to analyze the effect of polymorphisms on these proteins computationally.

SNPs of these proteins were mapped to the 6 domains of each gene. Expression of these genes is regulated by SNPs in the promoter or 3' UTR domains. To study the mutational effect on these genes, non-synonymous SNPs in the exon domain were analyzed. In the promoter region, 15 SNPs gave significant results, and PAX6 and Nkx2-5 were found to be important regulators of microRNA biogenesis. In the 3' UTR analysis, a few new microRNA target sites were created; this result also suggests negative feedback on the microRNA. SIFT predicted that 28% of the missense mutations damage the protein. These sequence-based results suggested that most of the non-synonymous SNPs in these genes could affect the function of the protein, which led us to study the mutations at the structural level.

To carry out the mutational analysis on the biogenesis protein structures, we modelled 35 proteins, since only 3 proteins were fully crystallized and deposited in the PDB and the BLAST search did not find good templates. Very few templates had good sequence identity, and these failed to cover a significant part of the query. A PSI-BLAST search also did not find any significant template. The BLAST results therefore led us to model the proteins using PHYRE2 and I-TASSER. Structures from both modelling servers were subjected to energy minimization with the OPLS2005 force field for 500 iterations in SCHRODINGER. Ramachandran plots were generated to validate the modelled structures. The structures from both servers were validated, and those with the maximum number of residues in allowed regions were taken for the stability analysis.
PoPMuSiC was used to calculate the stability of the wild-type and mutant structures. Most of the mutations, it was observed, destabilized the proteins. Mutations with a ΔΔG higher than 1 kcal/mol were considered the most important mutants. The significant SNPs in each domain of a gene will require experimental validation in future.
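The ΔΔG cutoff can be sketched as a simple classifier. It assumes the sign convention implied by the text (a positive ΔΔG destabilizes, a negative one stabilizes) and uses the cutoff of 1 kcal/mol quoted above; the SNP labels and values in the example are placeholders, not results from this study.

```python
def classify_ddg(ddg: float, important_cutoff: float = 1.0) -> str:
    """Label a predicted stability change (ΔΔG in kcal/mol)."""
    if ddg > important_cutoff:
        return "destabilizing (most important)"
    if ddg > 0:
        return "destabilizing"
    if ddg < 0:
        return "stabilizing"
    return "neutral"


# Placeholder ΔΔG values for three hypothetical SNPs.
predictions = {"snp_a": 1.27, "snp_b": 0.17, "snp_c": -0.1}

for snp, ddg in predictions.items():
    print(snp, ddg, classify_ddg(ddg))

most_destabilizing = max(predictions, key=predictions.get)
print("most destabilizing:", most_destabilizing)
```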

MicroRNAs function as endogenous translational repressors of protein-coding genes in animals by binding to target sites in the 3' UTRs of mRNAs. Because a single nucleotide change in the sequence of a target site or seed region could affect miRNA regulation, naturally occurring SNPs in target sites are candidates for functional variation that may be of interest for biomedical applications and evolutionary studies. However, little is known to date about variation among humans in miRNAs and their target sites. In this study, we analyzed publicly available SNP data in the context of miRNAs throughout the human genome and found a relatively low level of variation in the functional regions of miRNAs. The stem and complementary strand of the miRNAs showed a very high count of SNPs. The seed region showed a very low polymorphism rate, but with a considerable effect on target binding.

Regulation of microRNAs through biogenesis proteins is not well understood. In the present study we attempted to investigate the role of SNPs in the biogenesis proteins, and the study produced some interesting inferences. SNPs in the promoter region altered the binding of two transcription factors, PAX6 and Nkx2-5, which are also associated with many types of cancer. We also found that negative feedback inhibition of the microRNA is possible owing to polymorphisms in the 3' UTR. Structural analysis suggested that most of the SNPs in the proteins could affect protein stability. Together, the sequence and structural analyses predicted that 5.7% of the SNPs significantly affect the microRNA biogenesis proteins.

Outline. MicroRNA Bioinformatics. microrna biogenesis. short non-coding RNAs not considered in this lecture. ! Introduction

Outline. MicroRNA Bioinformatics. microrna biogenesis. short non-coding RNAs not considered in this lecture. ! Introduction Outline MicroRNA Bioinformatics Rickard Sandberg Dept. of Cell and Molecular Biology (CMB) Karolinska Institutet! Introduction! microrna target site prediction! Useful resources 2 short non-coding RNAs

More information

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the

More information

The world of non-coding RNA. Espen Enerly

The world of non-coding RNA. Espen Enerly The world of non-coding RNA Espen Enerly ncrna in general Different groups Small RNAs Outline mirnas and sirnas Speculations Common for all ncrna Per def.: never translated Not spurious transcripts Always/often

More information

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources 1 of 8 11/7/2004 11:00 AM National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools

More information

Outline. interfering RNA - What is dat? Brief history of RNA interference. What does it do? How does it work?

Outline. interfering RNA - What is dat? Brief history of RNA interference. What does it do? How does it work? Outline Outline interfering RNA - What is dat? Brief history of RNA interference. What does it do? How does it work? What is RNA interference? Recently discovered regulatory level. Genome immune system.

More information

mirnaselect pep-mir Cloning and Expression Vector

mirnaselect pep-mir Cloning and Expression Vector Product Data Sheet mirnaselect pep-mir Cloning and Expression Vector CATALOG NUMBER: MIR-EXP-C STORAGE: -80ºC QUANTITY: 2 vectors; each contains 100 µl of bacterial glycerol stock Components 1. mirnaselect

More information

Micro RNAs: potentielle Biomarker für das. Blutspenderscreening

Micro RNAs: potentielle Biomarker für das. Blutspenderscreening Micro RNAs: potentielle Biomarker für das Blutspenderscreening micrornas - Background Types of RNA -Coding: messenger RNA (mrna) -Non-coding (examples): Ribosomal RNA (rrna) Transfer RNA (trna) Small nuclear

More information

GenBank, Entrez, & FASTA

GenBank, Entrez, & FASTA GenBank, Entrez, & FASTA Nucleotide Sequence Databases First generation GenBank is a representative example started as sort of a museum to preserve knowledge of a sequence from first discovery great repositories,

More information

Systematic discovery of regulatory motifs in human promoters and 30 UTRs by comparison of several mammals

Systematic discovery of regulatory motifs in human promoters and 30 UTRs by comparison of several mammals Systematic discovery of regulatory motifs in human promoters and 30 UTRs by comparison of several mammals Xiaohui Xie 1, Jun Lu 1, E. J. Kulbokas 1, Todd R. Golub 1, Vamsi Mootha 1, Kerstin Lindblad-Toh

More information

Genomes and SNPs in Malaria and Sickle Cell Anemia

Genomes and SNPs in Malaria and Sickle Cell Anemia Genomes and SNPs in Malaria and Sickle Cell Anemia Introduction to Genome Browsing with Ensembl Ensembl The vast amount of information in biological databases today demands a way of organising and accessing

More information

Bioinformatics Resources at a Glance

Bioinformatics Resources at a Glance Bioinformatics Resources at a Glance A Note about FASTA Format There are MANY free bioinformatics tools available online. Bioinformaticists have developed a standard format for nucleotide and protein sequences

More information

PART 3.3: MicroRNA and Cancer

PART 3.3: MicroRNA and Cancer BIBM 2010 Tutorial: Epigenomics and Cancer PART 3.3: MicroRNA and Cancer Dec 18, 2010 Sun Kim at Indiana University Outline of Part 3.3 Background on microrna Role of microrna in cancer MicroRNA pathway

More information

Focusing on results not data comprehensive data analysis for targeted next generation sequencing

Focusing on results not data comprehensive data analysis for targeted next generation sequencing Focusing on results not data comprehensive data analysis for targeted next generation sequencing Daniel Swan, Jolyon Holdstock, Angela Matchan, Richard Stark, John Shovelton, Duarte Mohla and Simon Hughes

More information

13.4 Gene Regulation and Expression

13.4 Gene Regulation and Expression 13.4 Gene Regulation and Expression Lesson Objectives Describe gene regulation in prokaryotes. Explain how most eukaryotic genes are regulated. Relate gene regulation to development in multicellular organisms.

More information

V22: involvement of micrornas in GRNs

V22: involvement of micrornas in GRNs What are micrornas? V22: involvement of micrornas in GRNs How can one identify micrornas? What is the function of micrornas? Elisa Izaurralde, MPI Tübingen Huntzinger, Izaurralde, Nat. Rev. Genet. 12,

More information

OriGene Technologies, Inc. MicroRNA analysis: Detection, Perturbation, and Target Validation

OriGene Technologies, Inc. MicroRNA analysis: Detection, Perturbation, and Target Validation OriGene Technologies, Inc. MicroRNA analysis: Detection, Perturbation, and Target Validation -Optimal strategies to a successful mirna research project Optimal strategies to a successful mirna research

More information

岑 祥 股 份 有 限 公 司 技 術 專 員 費 軫 尹 20100803

岑 祥 股 份 有 限 公 司 技 術 專 員 費 軫 尹 20100803 技 術 專 員 費 軫 尹 20100803 Overview of presentation Basic Biology of RNA interference Application of sirna for gene function? How to study mirna? How to deliver sirna and mirna? New prospects on RNAi research

More information

Lecture 1 MODULE 3 GENE EXPRESSION AND REGULATION OF GENE EXPRESSION. Professor Bharat Patel Office: Science 2, 2.36 Email: b.patel@griffith.edu.

Lecture 1 MODULE 3 GENE EXPRESSION AND REGULATION OF GENE EXPRESSION. Professor Bharat Patel Office: Science 2, 2.36 Email: b.patel@griffith.edu. Lecture 1 MODULE 3 GENE EXPRESSION AND REGULATION OF GENE EXPRESSION Professor Bharat Patel Office: Science 2, 2.36 Email: b.patel@griffith.edu.au What is Gene Expression & Gene Regulation? 1. Gene Expression

More information

BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS

BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title:

More information

Functional and Biomedical Aspects of Genome Research

Functional and Biomedical Aspects of Genome Research Functional and Biomedical Aspects of Genome Research 20 11 35 Vorlesung SS 04 Bartsch, Jockusch & Schmitt-John Mi. 9:15-10:00, in W7-135 13 Functional RNAs Thomas Schmitt-John micro RNAs small interfering

More information

SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE

SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE AP Biology Date SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE LEARNING OBJECTIVES Students will gain an appreciation of the physical effects of sickle cell anemia, its prevalence in the population,

More information

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism )

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Biology 1406 Exam 3 Notes Structure of DNA Ch. 10 Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Proteins

More information

Name Class Date. Figure 13 1. 2. Which nucleotide in Figure 13 1 indicates the nucleic acid above is RNA? a. uracil c. cytosine b. guanine d.

Name Class Date. Figure 13 1. 2. Which nucleotide in Figure 13 1 indicates the nucleic acid above is RNA? a. uracil c. cytosine b. guanine d. 13 Multiple Choice RNA and Protein Synthesis Chapter Test A Write the letter that best answers the question or completes the statement on the line provided. 1. Which of the following are found in both

More information

DNA Replication & Protein Synthesis. This isn t a baaaaaaaddd chapter!!!

DNA Replication & Protein Synthesis. This isn t a baaaaaaaddd chapter!!! DNA Replication & Protein Synthesis This isn t a baaaaaaaddd chapter!!! The Discovery of DNA s Structure Watson and Crick s discovery of DNA s structure was based on almost fifty years of research by other

More information

Chapter 18 Regulation of Gene Expression

Chapter 18 Regulation of Gene Expression Chapter 18 Regulation of Gene Expression 18.1. Gene Regulation Is Necessary By switching genes off when they are not needed, cells can prevent resources from being wasted. There should be natural selection

More information

MatureBayes: A Probabilistic Algorithm for Identifying the Mature mirna within Novel Precursors

MatureBayes: A Probabilistic Algorithm for Identifying the Mature mirna within Novel Precursors MatureBayes: A Probabilistic Algorithm for Identifying the Mature mirna within Novel Precursors Katerina Gkirtzou 1,2, Ioannis Tsamardinos 1,2, Panagiotis Tsakalides 1,2, Panayiota Poirazi 3 * 1 Computer

More information

Transcription and Translation of DNA

Transcription and Translation of DNA Transcription and Translation of DNA Genotype our genetic constitution ( makeup) is determined (controlled) by the sequence of bases in its genes Phenotype determined by the proteins synthesised when genes

More information

Dicer Substrate RNAi Design

Dicer Substrate RNAi Design INTEGRATED DNA TECHNOLOGIES, INC. Dicer Substrate RNAi Design How to design and order 27-mer Dicer-substrate Duplex RNAs for use as RNA interference reagents The following document provides a summary of

More information

1 Mutation and Genetic Change

1 Mutation and Genetic Change CHAPTER 14 1 Mutation and Genetic Change SECTION Genes in Action KEY IDEAS As you read this section, keep these questions in mind: What is the origin of genetic differences among organisms? What kinds

More information

Translation Study Guide

Translation Study Guide Translation Study Guide This study guide is a written version of the material you have seen presented in the replication unit. In translation, the cell uses the genetic information contained in mrna to

More information

Searching Nucleotide Databases

Searching Nucleotide Databases Searching Nucleotide Databases 1 When we search a nucleic acid databases, Mascot always performs a 6 frame translation on the fly. That is, 3 reading frames from the forward strand and 3 reading frames

More information

Activity 7.21 Transcription factors

Activity 7.21 Transcription factors Purpose To consolidate understanding of protein synthesis. To explain the role of transcription factors and hormones in switching genes on and off. Play the transcription initiation complex game Regulation

More information

Analyzing microrna Data and Integrating mirna with Gene Expression Data in Partek Genomics Suite 6.6

Analyzing microrna Data and Integrating mirna with Gene Expression Data in Partek Genomics Suite 6.6 Analyzing microrna Data and Integrating mirna with Gene Expression Data in Partek Genomics Suite 6.6 Overview This tutorial outlines how microrna data can be analyzed within Partek Genomics Suite. Additionally,

More information

Analytical Study of Hexapod mirnas using Phylogenetic Methods

Analytical Study of Hexapod mirnas using Phylogenetic Methods Analytical Study of Hexapod mirnas using Phylogenetic Methods A.K. Mishra and H.Chandrasekharan Unit of Simulation & Informatics, Indian Agricultural Research Institute, New Delhi, India akmishra@iari.res.in,

More information

School of Nursing. Presented by Yvette Conley, PhD

School of Nursing. Presented by Yvette Conley, PhD Presented by Yvette Conley, PhD What we will cover during this webcast: Briefly discuss the approaches introduced in the paper: Genome Sequencing Genome Wide Association Studies Epigenomics Gene Expression

More information

Biogenesis, Size and Function of Small RNAs

Biogenesis, Size and Function of Small RNAs srnas of Plants Small, Non-coding RNAs of Plants Regulatory RNAs that act through gene silencing Two classes of small RNAs (srnas) o microrna (mirnas) Encoded by genes in the genome o small interfering

More information

RNA & Protein Synthesis

RNA & Protein Synthesis RNA & Protein Synthesis Genes send messages to cellular machinery RNA Plays a major role in process Process has three phases (Genetic) Transcription (Genetic) Translation Protein Synthesis RNA Synthesis

More information

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs) Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs) Single nucleotide polymorphisms or SNPs (pronounced "snips") are DNA sequence variations that occur

More information

From DNA to Protein. Proteins. Chapter 13. Prokaryotes and Eukaryotes. The Path From Genes to Proteins. All proteins consist of polypeptide chains

From DNA to Protein. Proteins. Chapter 13. Prokaryotes and Eukaryotes. The Path From Genes to Proteins. All proteins consist of polypeptide chains Proteins From DNA to Protein Chapter 13 All proteins consist of polypeptide chains A linear sequence of amino acids Each chain corresponds to the nucleotide base sequence of a gene The Path From Genes

More information

Name Date Period. 2. When a molecule of double-stranded DNA undergoes replication, it results in

Name Date Period. 2. When a molecule of double-stranded DNA undergoes replication, it results in DNA, RNA, Protein Synthesis Keystone 1. During the process shown above, the two strands of one DNA molecule are unwound. Then, DNA polymerases add complementary nucleotides to each strand which results

More information
