DEVELOPMENT OF A WEBBASED APPLICATION TO DETECT PALINDROMES IN DNA SEQUENCES



Similar documents
Replication Study Guide

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

A Web Based Software for Synonymous Codon Usage Indices

PREDA S4-classes. Francesco Ferrari October 13, 2015

Translation Study Guide

Basic Concepts of DNA, Proteins, Genes and Genomes

Today you will extract DNA from some of your cells and learn more about DNA. Extracting DNA from Your Cells

Recombinant DNA & Genetic Engineering. Tools for Genetic Manipulation

restriction enzymes 350 Home R. Ward: Spring 2001

2. True or False? The sequence of nucleotides in the human genome is 90.9% identical from one person to the next. False (it s 99.

Pairwise Sequence Alignment

Biological Sequence Data Formats

Chapter 6 DNA Replication

Heuristics for the Sorting by Length-Weighted Inversions Problem on Signed Permutations

International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS)

How is genome sequencing done?

CCR Biology - Chapter 9 Practice Test - Summer 2012

DNA Replication & Protein Synthesis. This isn t a baaaaaaaddd chapter!!!

Algorithms in Computational Biology (236522) spring 2007 Lecture #1

MORPHEUS. Prediction of Transcription Factors Binding Sites based on Position Weight Matrix.

From DNA to Protein

Structure and Function of DNA

2. The number of different kinds of nucleotides present in any DNA molecule is A) four B) six C) two D) three

DNA, RNA, Protein synthesis, and Mutations. Chapters

Bob Jesberg. Boston, MA April 3, 2014

Having a BLAST: Analyzing Gene Sequence Data with BlastQuest

agucacaaacgcu agugcuaguuua uaugcagucuua

Name Class Date. Figure Which nucleotide in Figure 13 1 indicates the nucleic acid above is RNA? a. uracil c. cytosine b. guanine d.

Cardiff University MSc in Bioinformatics - MSc in Genetic Epidemiology & Bioinformatics

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure enzymes control cell chemistry ( metabolism )

RNA & Protein Synthesis

Bioinformatics Grid - Enabled Tools For Biologists.

DNA Technology Mapping a plasmid digesting How do restriction enzymes work?

MATCH Commun. Math. Comput. Chem. 61 (2009)

GenBank, Entrez, & FASTA

Genotyping by sequencing and data analysis. Ross Whetten North Carolina State University

DNA Sequence Analysis Software

MUTATION, DNA REPAIR AND CANCER

DNA Sequence Analysis

Guidelines for Establishment of Contract Areas Computer Science Department

Effective Secure Encryption Scheme [One Time Pad] Using Complement Approach Sharad Patil 1 Ajay Kumar 2

Protein Synthesis How Genes Become Constituent Molecules

Beginner s Guide to Real-Time PCR

The Structure, Replication, and Chromosomal Organization of DNA

HIGH DENSITY DATA STORAGE IN DNA USING AN EFFICIENT MESSAGE ENCODING SCHEME Rahul Vishwakarma 1 and Newsha Amiri 2

Interaktionen von Nukleinsäuren und Proteinen

Gene Expression Macro Version 1.1

1. Molecular computation uses molecules to represent information and molecular processes to implement information processing.

Genetics Lecture Notes Lectures 1 2

GENE REGULATION. Teacher Packet

DNA and the Cell. Version 2.3. English version. ELLS European Learning Laboratory for the Life Sciences

CCR Biology - Chapter 8 Practice Test - Summer 2012

On Covert Data Communication Channels Employing DNA Steganography with Application in Massive Data Storage

Name Date Period. 2. When a molecule of double-stranded DNA undergoes replication, it results in

International Language Character Code

Illumina TruSeq DNA Adapters De-Mystified James Schiemer

Distributed Bioinformatics Computing System for DNA Sequence Analysis

Annex 6: Nucleotide Sequence Information System BEETLE. Biological and Ecological Evaluation towards Long-Term Effects

An Overview of DNA Sequencing

Abdullah Mohammed Abdullah Khamis

Hidden Markov Models

Hadoop and Map-reduce computing

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

Genetomic Promototypes

MAKING AN EVOLUTIONARY TREE

Bioinformatics Resources at a Glance

Fluorescence in situ hybridisation (FISH)

Forensic DNA Testing Terminology

Genetic Technology. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

A greedy algorithm for the DNA sequencing by hybridization with positive and negative errors and information about repetitions

AS Biology Unit 2 Key Terms and Definitions. Make sure you use these terms when answering exam questions!

MiSeq: Imaging and Base Calling

Sanjeev Kumar. contribute

A demonstration of the use of Datagrid testbed and services for the biomedical community

Molecular Computing Athabasca Hall Sept. 30, 2013

HCS Exercise 1 Dr. Jones Spring Recombinant DNA (Molecular Cloning) exercise:

Sequence Formats and Sequence Database Searches. Gloria Rendon SC11 Education June, 2011

Human Genome and Human Genome Project. Louxin Zhang

Transcription and Translation of DNA

DNA Sequencing. Ben Langmead. Department of Computer Science

MACM 101 Discrete Mathematics I

Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Focusing on results not data comprehensive data analysis for targeted next generation sequencing

Genome Explorer For Comparative Genome Analysis

Efficient Parallel Execution of Sequence Similarity Analysis Via Dynamic Load Balancing

BioBoot Camp Genetics

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS

A NEW DNA BASED APPROACH OF GENERATING KEY-DEPENDENT SHIFTROWS TRANSFORMATION

A Multiple DNA Sequence Translation Tool Incorporating Web Robot and Intelligent Recommendation Techniques

Genetic Analysis. Phenotype analysis: biological-biochemical analysis. Genotype analysis: molecular and physical analysis

UF EDGE brings the classroom to you with online, worldwide course delivery!

Genetics Module B, Anchor 3

Introduction to bioinformatics

Teacher Guide: Have Your DNA and Eat It Too ACTIVITY OVERVIEW.

K'NEX DNA Models. Developed by Dr. Gary Benson Department of Biomathematical Sciences Mount Sinai School of Medicine

DNA Replication in Prokaryotes

Chapter 11: Molecular Structure of DNA and RNA

Page 1. Name:

a. Ribosomal RNA rrna a type ofrna that combines with proteins to form Ribosomes on which polypeptide chains of proteins are assembled

Transcription:

DEVELOPMENT OF A WEBBASED APPLICATION TO DETECT PALINDROMES IN DNA SEQUENCES Fatma Eltayeb 1, Muna Elbahir 2, Sahar Mohamed 3, Mohamed Ahmed and 4 Nazar Zaki 5 1, 2, 3,4 Faculty of Mathematical Science, University of Khartoum, Sudan 5 College of Information Technology, UAE University, Al Ain 17555, UAE nzaki@uaeu.ac.ae ABSTRACT Detecting palindromes in DNA sequence is a central problem in computational biology. Identifying palindromes could help scientists advance the understanding of genomic instability. DNA sequences containing long adjacent inverted repeats (palindromes) are inherently unstable and are associated with many types of chromosomal rearrangements. In this paper, we present a simple web-base tool to assist biologist detecting palindromes in DNA sequence. 1. Introduction A palindrome is a sequence of letters/words which reads the same in forward as well as backward directions. For example, the English word madam is a palindrome [1]. DNA palindromes are words from the nucleotide base alphabet A = {A, C, G, T} that are symmetrical in the sense that they read exactly the same as their complementary sequences in the reverse direction [2]. There are two types of DNA palindromes, Palindromes that occur on opposite strands of the same section of DNA helix and Inverted Repeats in which two different segments of the double helix read the same but in opposite directions [3]. DNA palindromes are crucial for gene regulation, DNA replication and initiation of gene amplification. For the last few decades, vast of algorithms have been developed to aid in solving the problem through dynamic programming and heuristics techniques, however, very little attention has been made on producing a convenient and user-friendly system to be used by an average biologist. Therefore, in this work, we present a simple tool to assist biologist detecting palindromes in DNA sequence using a simple web-base application. 2. Online palindromes detection The application is held in a dynamic website (www.bio-sdn.com) to be accessible 24 hours. For the development of the web site we used JSP technology, which is an exciting new technology that provides powerful and efficient creation of dynamic content [4]. JSP is a presentation layer technology that allows static Web content to be mixed with Java code [4]. MySql was used for creating database to hold user palindrome detection results, to be able to retrieve the saved results. The algorithm for detecting the palindrome was written in the Java programming language, which presents the dynamic content. This dynamic content is Web pages that are generated by an application on the server. To detect palindromes, the user should enter or upload a DNA sequence of interest in plain text format as input to the application as shown in Figure 1. A subsequence size should also be specified.

Figure 1. Input the DNA sequence. There are several steps in the algorithm of detecting palindromes, the algorithm will start by validating the input, which means ignoring numbers, spaces and all characters except the basic nucleotide acids (A, T, C, G) as shown in figure 2. Figure 2. Input the DNA sequence. The next step is to convert all A s to T s, T s to A s, G s to C s, and C s to G s. The deliverable of this step is the complemented sequence. The DNA sequence will then be compared with the reverse of the complemented sequence. This process is illustrated by the following example. Suppose the DNA sequence ATGACCAGGTCAT is a palindrome of length 6 ATGACCAGGTCAT, then the complement for this sequence will be TACTGGACCAGTA and the reverse of the complemented sequence will be ATGACCTGGTCAT. By comparing the two sequences TACTGGACCAGTA and ATGACCTGGTCAT we can detect the Palindrome as: 1 5 ATGACC with 8 13 GGTCAT. This we call symmetrical Palindromes. Another example is shown in Figure 3. Figure 3. Symmetrical Palindrome. The process of detecting Mirror Images in a DNA sequence is actually the same as mentioned above, except for the complementary step. An example is shown in Figure 4.

Figure 4. Mirror Image. The final output shown in figure 5 is the list of all possible palindromes in the DNA sequence of interest. Figure 5. Detection of the possible palindrome. 3. CONCLUSION DNA sequences containing palindromes are associated with a high degree of genomic instability. The instability associated with palindromic sequences also creates difficulties in their molecular analysis. The main objective of this work is to develop a tool to assist biologist to detect any palindrome exists in a given DNA sequence. In this tool, the input sequence is reversed and complemented and then subsequences of the inputted sequence are compared with the resultant sequence using linear search under specific conditions. The matches found are then recorded as a palindrome. The tool has been developed using Java and Java Server Pages (JSP). In the future, the performance of this tool will be evaluated. 4. References [1] Ravi Gupta, Ankush Mittal, Sumit Gupta, An efficient algorithm to detect palindromes in DNA sequences using periodicity transform, Elsevier, Signal Processing 86 (2006) 2067 2073. [2] Ming-Ying Leung, Kwok Pui Choi, Aihua Xia, and Louis H.Y. Chen, Nonrandom Clusters of Palindromes in Herpesvirus Genomes, paper, U.S.A., October 15, 2004. [3] John W. Kimball, Kimball's general biology, 2003. [4] Damon Hougland, and Aaron Tavistock, Core JSP, Prentice Hall PTR; Pap/Cdr edition, 5/10/ 2000. DEVELOPMENT OF A WEBBASED APPLICATION TO DETECT PALINDROMES IN DNA SEQUENCES Fatma Eltayeb 1, Muna Elbahir 2, Sahar Mohamed 3, Mohamed Ahmed and 4 Nazar Zaki 5 1, 2, 3,4 Faculty of Mathematical Science, University of Khartoum, Sudan 5 College of Information Technology, UAE University, Al Ain 17555, UAE nzaki@uaeu.ac.ae

ABSTRACT Detecting palindromes in DNA sequence is a central problem in computational biology. Identifying palindromes could help scientists advance the understanding of genomic instability. DNA sequences containing long adjacent inverted repeats (palindromes) are inherently unstable and are associated with many types of chromosomal rearrangements. In this paper, we present a simple web-base tool to assist biologist detecting palindromes in DNA sequence. 1. Introduction A palindrome is a sequence of letters/words which reads the same in forward as well as backward directions. For example, the English word madam is a palindrome [1]. DNA palindromes are words from the nucleotide base alphabet A = {A, C, G, T} that are symmetrical in the sense that they read exactly the same as their complementary sequences in the reverse direction [2]. There are two types of DNA palindromes, Palindromes that occur on opposite strands of the same section of DNA helix and Inverted Repeats in which two different segments of the double helix read the same but in opposite directions [3]. DNA palindromes are crucial for gene regulation, DNA replication and initiation of gene amplification. For the last few decades, vast of algorithms have been developed to aid in solving the problem through dynamic programming and heuristics techniques, however, very little attention has been made on producing a convenient and user-friendly system to be used by an average biologist. Therefore, in this work, we present a simple tool to assist biologist detecting palindromes in DNA sequence using a simple web-base application. 2. Online palindromes detection The application is held in a dynamic website (www.bio-sdn.com) to be accessible 24 hours. For the development of the web site we used JSP technology, which is an exciting new technology that provides powerful and efficient creation of dynamic content [4]. JSP is a presentation layer technology that allows static Web content to be mixed with Java code [4]. MySql was used for creating database to hold user palindrome detection results, to be able to retrieve the saved results. The algorithm for detecting the palindrome was written in the Java programming language, which presents the dynamic content. This dynamic content is Web pages that are generated by an application on the server. To detect palindromes, the user should enter or upload a DNA sequence of interest in plain text format as input to the application as shown in Figure 1. A subsequence size should also be specified. Figure 1. Input the DNA sequence. There are several steps in the algorithm of detecting palindromes, the algorithm will start by validating the input, which means ignoring numbers, spaces and all characters except the basic nucleotide acids (A, T, C, G) as shown in figure 2.

Figure 2. Input the DNA sequence. The next step is to convert all A s to T s, T s to A s, G s to C s, and C s to G s. The deliverable of this step is the complemented sequence. The DNA sequence will then be compared with the reverse of the complemented sequence. This process is illustrated by the following example. Suppose the DNA sequence ATGACCAGGTCAT is a palindrome of length 6 ATGACCAGGTCAT, then the complement for this sequence will be TACTGGACCAGTA and the reverse of the complemented sequence will be ATGACCTGGTCAT. By comparing the two sequences TACTGGACCAGTA and ATGACCTGGTCAT we can detect the Palindrome as: 1 5 ATGACC with 8 13 GGTCAT. This we call symmetrical Palindromes. Another example is shown in Figure 3. Figure 3. Symmetrical Palindrome. The process of detecting Mirror Images in a DNA sequence is actually the same as mentioned above, except for the complementary step. An example is shown in Figure 4. Figure 4. Mirror Image. The final output shown in figure 5 is the list of all possible palindromes in the DNA sequence of interest.

Figure 5. Detection of the possible palindrome. 3. CONCLUSION DNA sequences containing palindromes are associated with a high degree of genomic instability. The instability associated with palindromic sequences also creates difficulties in their molecular analysis. The main objective of this work is to develop a tool to assist biologist to detect any palindrome exists in a given DNA sequence. In this tool, the input sequence is reversed and complemented and then subsequences of the inputted sequence are compared with the resultant sequence using linear search under specific conditions. The matches found are then recorded as a palindrome. The tool has been developed using Java and Java Server Pages (JSP). In the future, the performance of this tool will be evaluated. 4. References [1] Ravi Gupta, Ankush Mittal, Sumit Gupta, An efficient algorithm to detect palindromes in DNA sequences using periodicity transform, Elsevier, Signal Processing 86 (2006) 2067 2073. [2] Ming-Ying Leung, Kwok Pui Choi, Aihua Xia, and Louis H.Y. Chen, Nonrandom Clusters of Palindromes in Herpesvirus Genomes, paper, U.S.A., October 15, 2004. [3] John W. Kimball, Kimball's general biology, 2003. [4] Damon Hougland, and Aaron Tavistock, Core JSP, Prentice Hall PTR; Pap/Cdr edition, 5/10/ 2000. DEVELOPMENT OF A WEBBASED APPLICATION TO DETECT PALINDROMES IN DNA SEQUENCES Fatma Eltayeb 1, Muna Elbahir 2, Sahar Mohamed 3, Mohamed Ahmed and 4 Nazar Zaki 5 1, 2, 3,4 Faculty of Mathematical Science, University of Khartoum, Sudan 5 College of Information Technology, UAE University, Al Ain 17555, UAE nzaki@uaeu.ac.ae ABSTRACT Detecting palindromes in DNA sequence is a central problem in computational biology. Identifying palindromes could help scientists advance the understanding of genomic instability. DNA sequences containing long adjacent inverted repeats (palindromes) are inherently unstable and are associated with many types of chromosomal rearrangements. In this paper, we present a simple web-base tool to assist biologist detecting palindromes in DNA sequence. 1. Introduction A palindrome is a sequence of letters/words which reads the same in forward as well as backward directions. For example, the English word madam is a palindrome [1]. DNA palindromes are words from the nucleotide base

alphabet A = {A, C, G, T} that are symmetrical in the sense that they read exactly the same as their complementary sequences in the reverse direction [2]. There are two types of DNA palindromes, Palindromes that occur on opposite strands of the same section of DNA helix and Inverted Repeats in which two different segments of the double helix read the same but in opposite directions [3]. DNA palindromes are crucial for gene regulation, DNA replication and initiation of gene amplification. For the last few decades, vast of algorithms have been developed to aid in solving the problem through dynamic programming and heuristics techniques, however, very little attention has been made on producing a convenient and user-friendly system to be used by an average biologist. Therefore, in this work, we present a simple tool to assist biologist detecting palindromes in DNA sequence using a simple web-base application. 2. Online palindromes detection The application is held in a dynamic website (www.bio-sdn.com) to be accessible 24 hours. For the development of the web site we used JSP technology, which is an exciting new technology that provides powerful and efficient creation of dynamic content [4]. JSP is a presentation layer technology that allows static Web content to be mixed with Java code [4]. MySql was used for creating database to hold user palindrome detection results, to be able to retrieve the saved results. The algorithm for detecting the palindrome was written in the Java programming language, which presents the dynamic content. This dynamic content is Web pages that are generated by an application on the server. To detect palindromes, the user should enter or upload a DNA sequence of interest in plain text format as input to the application as shown in Figure 1. A subsequence size should also be specified. Figure 2. Input the DNA sequence. The next step is to convert all A s to T s, T s to A s, G s to C s, and C s to G s. The deliverable of this step is the complemented sequence. The DNA sequence will then be compared with the reverse of the complemented sequence. This process is illustrated by the following example. Suppose the DNA sequence ATGACCAGGTCAT is a palindrome of length 6 ATGACCAGGTCAT, then the complement for this sequence will be TACTGGACCAGTA and the reverse of the complemented sequence will be ATGACCTGGTCAT. By comparing the two sequences TACTGGACCAGTA and ATGACCTGGTCAT we can detect the Palindrome as: 1 5 ATGACC with 8 13 GGTCAT. This we call symmetrical Palindromes. Another example is shown in Figure 3. Figure 3. Symmetrical Palindrome. The process of detecting Mirror Images in a DNA sequence is actually the same as mentioned above, except for the complementary step. An example is shown in Figure 4. Figure 1. Input the DNA sequence. There are several steps in the algorithm of detecting palindromes, the algorithm will start by validating the input, which means ignoring numbers, spaces and all characters except the basic nucleotide acids (A, T, C, G) as shown in figure 2. Figure 4. Mirror Image. The final output shown in figure 5 is the list of all possible palindromes in the DNA sequence of interest.

Figure 5. Detection of the possible palindrome. 3. CONCLUSION DNA sequences containing palindromes are associated with a high degree of genomic instability. The instability associated with palindromic sequences also creates difficulties in their molecular analysis. The main objective of this work is to develop a tool to assist biologist to detect any palindrome exists in a given DNA sequence. In this tool, the input sequence is reversed and complemented and then subsequences of the inputted sequence are compared with the resultant sequence using linear search under specific conditions. The matches found are then recorded as a palindrome. The tool has been developed using Java and Java Server Pages (JSP). In the future, the performance of this tool will be evaluated. 4. References [1] Ravi Gupta, Ankush Mittal, Sumit Gupta, An efficient algorithm to detect palindromes in DNA sequences using periodicity transform, Elsevier, Signal Processing 86 (2006) 2067 2073. [2] Ming-Ying Leung, Kwok Pui Choi, Aihua Xia, and Louis H.Y. Chen, Nonrandom Clusters of Palindromes in Herpesvirus Genomes, paper, U.S.A., October 15, 2004. [3] John W. Kimball, Kimball's general biology, 2003. [4] Damon Hougland, and Aaron Tavistock, Core JSP, Prentice Hall PTR; Pap/Cdr edition, 5/10/ 2000.