Bioinformatics for Informatics Vinicius Vielmo Cogo. Presented in UFSM, Santa Maria, RS, BR. April 08, 2014.
|
|
- Ashley Pitts
- 8 years ago
- Views:
Transcription
1 Bioinformatics for Informatics Vinicius Vielmo Cogo Presented in UFSM, Santa Maria, RS, BR. April 08, 2014.
2 Context The presenter: is not a bioinformatician is a computer scientist from Inf2006-UFSM is a PhD. student in the ULisboa is a researcher at the LaSIGE (Navigators group ) has as main research areas: Distributed systems Dependability Fault Tolerance Security But why are you going to present about bioinformatics? 2
3 Context Interesting area Many open problems Opportunities to acquire and apply knowledge Opportunities from projects Bachelor in Computer Science (UFSM) Master in Informatics (ULisboa) PhD. in Informatics (ULisboa) PET-CC FTH-GRID CloudFIT BiobankCloud PET-CC LSC Tclouds 3
4 Context is not an introduction to bioinformatics is not a study of is a different way to look to bioinformatics, from an informatics perspective 4
5 What is bioinformatics? 5
6 What is bioinformatics? 6
7 What is bioinformatics? is the study of life and living organisms 7
8 What is bioinformatics? 8
9 What is bioinformatics? Mathematics Astronomy Physics Biology Ethics Chemistry Informatics Statistics 9
10 What is bioinformatics? Mathematics Astronomy Physics Biology Ethics Chemistry Informatics Statistics 10
11 What is bioinformatics? Biology Informatics 11
12 What is bioinformatics? Biology Bioinformatics? Informatics 12
13 What is bioinformatics? Biology Bioinformatics Informatics Computational Biology Biological Computing Modeling biological virtual systems and organs. Searching the optimal path in a graph based on ants behavior 13
14 What is bioinformatics? Biology Bioinformatics Informatics Bioinformatics is an informatics discipline for storing, retrieving, organizing and analyzing biological data. 14
15 Motivation But what makes the biological data so special? 15
16 Motivation But what makes the biological data so special? Data complexity Size Quantity Meaning Incomplete and uncertain data 16
17 Outline 1. Size 2. Quantity 3. Meaning 17
18 1. Size
19 DNA TCTCATGCATACGCAGCAGGTCAGGCAT DNA is: A very large string Composed only by A, C, G, and T characters 19
20 Human DNA From human beings to DNA trillion pairs 46 2 molecules A human being Cells Nucleus Chromosomes DNA 210 types of cells 300 million die every minute 20
21 Human DNA TCTCATGCATACGCAGCAGGTCAGGCAT Human DNA: 3.2 B characters (bp, base-pairs) 21
22 Genome size Organism Genome size Picture Porcine circovirus type K Drosophila melanogaster 130 M Mus musculus 2.7 B Homo sapiens 3.2 B Zea mays 5 B Marbled lungfish 130 B 22
23 Printing a human genome Times New Roman 12 pt 2622 bp per page 1.2 M single-sided pages 2415 reams (500 sheets) 129 meters 129 m 93 m 30 m 23
24 Printing a human genome Leighton Pritchard 4.5 pt Double-sided pages 120 book volumes 1 bookshelf 24
25 Loading a human genome to memory Does it work? char[] humandna = char[ ]; 25
26 Loading a human genome to memory Does it work? char[] humandna = char[ ]; In most programming languages: No Most data structures are indexed by signed int 2.1 B chars = (2^31-1 = ) Solutions? 26
27 Loading a human genome to memory char[] humandna = char[ ]; Solutions: Unsigned int: (2^32-1) 64-bit indexes: (2^63-1) Matrices: humandna = char[100][ ] 27
28 Loading a human genome to memory char[] humandna = char[ ]; Solutions: Unsigned int: (2^32-1) 64-bit indexes: (2^63-1) Matrices: humandna = char[100][ ] Split in chromosomes: Map<Key, Value> Map<chrName, chrsequence> 28
29 Human chromosomes The human genome is not a contiguous DNA 29
30 Storing a human genome in a file Entire human genome 3.2 B chars 3 GB UTF-8 or ISO char = 1-byte (8-bits) 30
31 Storing a human genome in a file FASTA format Broadly used Accepts comments (>) Store also incomplete or small sequences > HG00339 chr1 ATGCATCGTACGCTCATCATGCAGC AGACTGCTAGCTGATGGTACGTCAT GCAGTCGTCAGTCAGTCAGTACATC > HG00339 chr2 TCGTACGCTCATCATGCAGCAGGTC AGTACCATGGTCAGACTCATTGCAA CTGCTAGCTGATGATCGGCATAGT > HG00339 chr22 CAGTACGTCAGTGCTAGCTAGCTAC GATCGATCGACTGATACGATC > HG00339 chry GGGTACTCATGCAGTCAGTCAGTTG CAGTCAGTCAGTCGCAGTCA 31
32 Storing a human genome in a file 1 char = 2-bits 770 MB Entire human genome 3.2 B chars A = 00 C = 01 G = 10 T = 11 32
33 Storing a human genome in a file 2-bit format Not human readable (normal humans) There are other characters than A, C, G and T E.g.: N for any. Nomenclature for incompletely specified bases in nucleic acid sequences IUPAC How to reduce even more?
34 Compressing a human genome file ZIP Compression 830 MB (in 10min) 34
35 Compressing a human genome file Referential Compression Humans are 99.9% genetically equals 7 MB (in 30s) 35
36 Sequencing a human genome Reads Genomes are not sequenced in one contiguous line Reads contains bp each Human genome reference Resulting aligned genome 36
37 Sequencing a human genome Sequenced Genome 180 GB > HG00339 read bp ATGCATCGTACGCTCATCATGCAGC +!''*((((***+))%%%++)(%%%%).1*** > HG00339 read bp TCGTACGCTCATCATGCAGCAGGTC + %%%).1****''))**55CCF>>>>>>CC data FASTQ format > HG00339 read bp CAGTACGTCAGTGCTAGCTAGCTAC + )(%%%%).1***''))**5CCCC?CCC65 37
38 Resuming Human genome = string of 3.2B chars Size of creatures does not determine genome size Normally in memory, a genome is divided into 23 chromosomes In a file: 3 GB in FASTA 770 MB in 2-bits 830 MB when compressed with ZIP 7 MB when using referential compression 180 GB in FASTQ 38
39 2. Quantity
40 Problem statement Huge Big data data 40
41 Quantity The number of sequenced genomes is exponentially increasing Caused by the decreasing on: time and cost 1 st Human Genome Currently Year Cost US$ 3 billion US$ Time 13 years (2001) < 1 hour 41
42 From first to next-generation sequencing Sanger-based chemistries and capillary-based instruments sequencing platforms Next-Generation sequencing platforms (NGS) 42
43 Cost per human genome 43
44 Cost x quantity 44
45 US$1000 (2014) HiSeq X Ten 45
46 NGS around the world 46
47 US$100 (?) 47
48 US$30 (?) 48
49 Projects gathering genomes 49
50 Physical storage in biobanks 50
51 Physical storage in biobanks 51
52 Digital storage in e-biobanks Biobanks store also health and behavior data Pressure to store also sequenced DNA Researchers need a lot of data (shared) 52
53 Digital storage in e-biobanks Biobanks will soon start using cloud computing Illumina Basespace BiobankCloud is creating a platform for biobanks efficiently and securely use cloud facilities 53
54 Neonatal heel prick Newborn screening Detect congenital diseases Hypotiroidism Cystic fibrosis Implemented all over the world Employed in demographics for birth rates What about newborn genome sequencing? 54
55 Genomes of a population How much disk space do we need to store all genomes of an entire population? 1-person 1 Population Santa Maria, RS Porto Alegre, RS Rio Grande do Sul São Paulo, SP Brazil Latin America World 270 K 1.5 M 10.7 M 11.3 M 201 M 589 M 7 B 55
56 Genomes of a population How much disk space do we need to store all genomes of an entire population? Population Compressed Size with 2-bits Size with 8-bits WGS 1-person 1 7 MB 770 MB 3 GB 180 GB Santa Maria, RS Porto Alegre, RS Rio Grande do Sul São Paulo, SP Brazil Latin America World 270 K 1.5 M 10.7 M 11.3 M 201 M 589 M 7 B B KB MB GB TB PB EB ZB YB 56
57 Genomes of a population How much disk space do we need to store all genomes of an entire population? Population Compressed Size with 2-bits Size with 8-bits WGS 1-person 1 7 MB 770 MB 3 GB 180 GB Santa Maria, RS 270 K 1.8 TB 200 TB 790 TB 46 PB Porto Alegre, RS 1.5 M 10 TB 1.1 PB 4.3 PB 260 PB Rio Grande do Sul 10.7 M 72 TB 7.7 PB 30 PB 1.8 EB São Paulo, SP 11.3 M 75 TB 8.1 PB 32 PB 1.9 EB Brazil 201 M 1.3 PB 144 PB 575 PB 34 EB Latin America 589 M 3.8 PB 422 PB 1.6 EB 100 EB World 7 B 45 PB 5 EB 20 EB 1.1 ZB B KB MB GB TB PB EB ZB YB 57
58 Genomes of a population Current storage capacity of a hard disk? 58
59 Genomes of a population Current storage capacity of a hard disk? 6 TB WD Ultrastar He6 (US$750) Compressed Size with 2-bits Size with 8-bits WGS 1-person 7 MB 770 MB 3 GB 180 GB Population 898 K x SM B KB MB GB TB PB EB ZB YB 59
60 Genomes of a population Current storage capacity of a data center? 60
61 Genomes of a population Current storage capacity of a data center? 1 PB 350 Seagate of 3TB (US$150 each) (US$50.000) Compressed Size with 2-bits Size with 8-bits WGS 1-person 7 MB 770 MB 3 GB 180 GB Population 153 M 1.3 M 350 K 5825 SM B KB MB GB TB PB EB ZB YB 61
62 Genomes of a population Current storage capacity of a data center? 500 PB BackBlaze, Inc. Compressed Size with 2-bits Size with 8-bits WGS 1-person 7 MB 770 MB 3 GB 180 GB Population 76 B 697 M 174 M 2.9 M 9x World Lat. Am. POA B KB MB GB TB PB EB ZB YB 62
63 Genomes of a population Current storage capacity of the entire world? 63
64 Genomes of a population Current storage capacity of the entire world? 8.2 ZB 295 EB in 2007 law Compressed Size with 2-bits Size with 8-bits WGS 1-person 7 MB 770 MB 3 GB 180 GB Population T 50 B 400x World 7x World B KB MB GB TB PB EB ZB YB 64
65 Genomes of a population Near future storage capacity of a data center? 50 ZB (or 1 YB?) NSA data center in Bluffdale, UT, US Compressed Size with 2-bits Size with 8-bits WGS 1-person 7 MB 770 MB 3 GB 180 GB Population B 40x World B KB MB GB TB PB EB ZB YB 65
66 Resuming Sequencing cost and time will continue to fall Huge data: size*quantity Storage capacity is increasing (not at the same pace as sequencing) 66
67 3. Meaning
68 Meaning GATTCGTACTGACTACGCAGCAGGT What does it do? What does it produces? How does it looks like? Is it related with a specific disease? Does it interact with other components? Does it interact with some medicine? 68
69 Meaning CATTCATCATGCATACGCAGCAGGT What does it mean individual? (Personalized medicine) population? (Public-health genomics) human kind? (Science) 69
70 DNA Lowest structure present in all living organisms High expectation for medicine DNA cannot tell everything about your future DNA is not the only variable causing diseases Behavior and environment interfere But, DNA still plays an important role 70
71 From DNA to meaning DNA 98% Noncoding 2% Coding Amino acids Genes Proteins 71
72 3.1. Comparing sequences Do you see any difference in these sequences? CATTCATCATGCATACGGACTGCAGCAGGT CATTCATTATGCATACGGACTGCAGCAGGT 72
73 3.1. Comparing sequences Do you see any difference in these sequences? CATTCATCATGCATACGGACTGCAGCAGGT CATTCATTATGCATACGGACTGCAGCAGGT 73
74 3.1. Comparing sequences Comparison of two or more sequences Similarity, similarity and similarity Similar sequences may have similar genes, proteins, structures and functions Percent identity as metric Considering small variations (SNPs, insertions, deletions) Search terms: Hamming distance, Edit distance (Levenshtein), sequence alignment, Needleman-Wunsch algorithm, Smith and Waterman algorithm, substitution matrices, BLAST algorithm, seedand-extend, suffix tree, homologue sequences 74
75 3.2. Assembling DNA fragments In which chromosome can we find this sequence? In which position? CATTCATCATGCATACGGACTGCAGCAGGT 75
76 NGS Reads 3.2. Assembling DNA fragments Human genome reference Resulting aligned genome 76
77 3.2. Assembling DNA fragments 77
78 3.2. Assembling DNA fragments Assembling DNA to create a contiguous genome It may be tricky DNA has many repetitive regions Orientation, repeats and sequencing errors Search terms: Alignment algorithms, de-novo assembly, sequence mapping, MAQ, SHRiMP, Bowtie, GNUMAP, MUMmer, BWA, Slider, SOAP3, SNAP 78
79 3.3. Constructing evolutionary trees The 1 st sequence is more similar to the 2 nd or the 3 rd? CATTACGGACGCATCATCATGCAGCAGGTG CATTACGGACGTATCATCATCCAGCAGGTG CATTACGGACGGATCATCATACAGCAGGTG 79
80 3.3. Constructing evolutionary trees 80
81 3.3. Constructing evolutionary trees All about evolution and again similarity Comparing sequences from different organisms Groups organisms by similarity They show how organisms evolve in time Multiple alignments may generate phylogenic trees Recursively compare sequences until find a common ancestor Search terms: Evolutionary similarity, multiple alignments, tree of life, phylogenetic tree, Clustal, T- coffee, Muscle, ProbCons, COBALT, UniProt database, JalView, Logos, PSI-BLAST 81
82 3.4. Detecting patterns in sequences Do you see any pattern in this sequence? CGTTACGGACGCATCATCATGCAGCAGGTG 82
83 3.4. Detecting patterns in sequences Do you see any pattern in this sequence? CGTTACGGACGCATCATCATGCAGCAGGTG 83
84 3.4. Detecting patterns in sequences Delimit parts of sequences that have a biological meaning Detect the begin and end of a gene Detect the begin and end of shapes Search terms: Conserved domains, motifs, regular expressions, Fuzzy regular expressions, Hidden Markov Models (HMMs), InterPro database, PROSITE, multiple alignments, grammars, PCR analysis 84
85 3.5. Determining structure from sequence Which is the structure of this sequence in real life? CATTACGGACGCATCATCATGCAGCAGGTG 85
86 3.5. Determining structure from sequence Which is the structure of this sequence in real life? Primary Secondary Tertiary 86
87 3.5. Determining structure from sequence Structure is related with function Structure: Primary Secondary Tertiary Quaternary Search terms: Conserved domains, motifs, folding, PDB, 3D transformations, protein alignment, intermolecular distances, intramolecular distance, RMSD, DALI, SSAP, graph theory, abinitio methods, fold recognition, threading, neural networks, machine learning, support vector machines, random forests, lowest energy algorithms, ROSETTA, CASP challenge 87
88 3.6. Inferring cell regulation Does the 1 st sequence interfere in the 2 nd? CATTACGGACGCATCATCATGCAGCAGGTG 88
89 3.6. Inferring cell regulation Genes interact with each other Non-coding regions of DNA are important for regulation Depending on geographic position of a cell and neighbors to turn on/off genes Search terms: Cell regulation, junk DNA, turn on/off genes, pathways, Bayesian networks, Support vector machines, clustering algorithms, reducing dimensionality, simulation and modeling, discrete and continuous models, epigenomics 89
90 3.7. Determining protein function What does this sequence? Does it define my hair color? CATTACGGACGCATCATCATGCAGCAGGTG 90
91 3.7. Determining protein function 91
92 3.7. Determining protein function Body measurement 92
93 3.7. Determining protein function Cardiovascular disorder 93
94 3.7. Determining protein function Most challenging topic Depend on all previous topics Human annotations for protein functions Search terms: Annotating DNA, ontologies, natural language processing (NLP), motifs, conserved domains, gene ontology, OBO, BioOntologies, UMLS, term-for-term analysis, data mining, directed acyclic graph, text mining, semantic web 94
95 3.8. Languages Workloads everywhere Step-by-step to achieve the final goal Optimization, parallel programming Python, Perl and Bash Java, C, C++ R Search terms: Workload, programming languages, script languages, dynamic programming, design pattersn, gang of four, regular expressions, performance evaluation 95
96 3.9. Other important topics Database design, development, management Natural language processing (NLP) Graphics and graphical interfaces Distributed systems Security 96
97 3.10. Privacy The individual right of maintaining private undisclosed information Genomes contain private information 97
98 3.10. Our approach Does it contains any private information? CATTCATCATGCATACGGACTGCAGCAGGT 98
99 3.10. Disclosure filter for genetic data 7M bp/s * Throughput with a single core. Scales until approximately 300 M bp/s in some cases. 99
100 3.11. Other problems ELSI Ethical Legal Social implications Genism Genetic discrimination Targeted attacks Loss of reputation Data leakage Privacy problems 100
101 Resuming Obtaining meaning from DNA is Difficult Complex Time consuming Painful Similarity is important Comparison with what is already known 101
102 Final remarks Bioinformatics: interesting area many open problems opportunities to acquire and apply knowledge opportunities from projects reachable for students from all semesters Informatics: improve biology area improved by working on biological data 102
103 CC and SI disciplines Disciplinas CC SI Cálculo A 1 1 Circuitos Digitais 1 1 Lógica e Algoritmo 1 1 * Laboratório de Programação I 1 1 * Introdução à CC e SI 1 1 Geometria Analítica 1 - Vectors, 3D representation, torsion angles Teoria Geral da Administração - 1 Estruturas de Dados 2 2 Trees, graphs, probabilistic data structures Organização de Computadores 2 2 Laboratório de Programação II 2 2 * Matemática Discreta 2 2 Álgebra Linear 2 - Língua Inglesa Instrumental I 2 - Teoria Econômica - 2 Introdução a Ciencia da Informação - 2 * 103
104 CC and SI disciplines Disciplinas CC SI Arquitetura de Computadores 3 3 Estatística 3 3 GWAS, microarray, HMMs Pesquisa e Ordenação de Dados 3 3 DAG, pairwise alignment Paradigmas de Programação 3 3 * Cálculo B 3 - Língua Inglesa Intrumental II 3 - Gestão de Pessoas A - 3 Fundamentos de Bancos de Dados 4 4 Designing DB for biological data Engenharia de Software 4 3 Gang of Four, Design patterns Sistemas Operacionais 4 4 Compression, running workloads, scheduling Métodos Numéricos e Computacionais 4 - Eletricidade e Magnetismo 4 - Projeto e Análise de Algoritmos 4 - Optimization Marketing - 4 Engenharia Econômica - 4 Gerência de Projetos de Software
105 CC and SI disciplines Disciplinas CC SI Qualidade de Software 5 5 Comunicação de Dados 5 - Optimizing data transfer Computação Gráfica 5-3D structure representation, torsion angles Linguagens Formais 5 - Lógica de Predicados 5 - Implementação de Bancos de Dados 5 - Database for biological data Projeto e Gerência de Banco de Dados - 5 Database for biological data Custo A - 5 Redes de Computadores 6 6 Computadores e Sociedade 6 4 ELSI of genomics, privacy Empreendedorismo 6 6 Implementação de Linguagens de Programação 6 - Teoria da Computação 6 - Projeto de Software I
106 CC and SI disciplines Disciplinas CC SI Sistemas Distribuídos 7 7 Distributed services and data access Metodologia de Pesquisa em Informática 7 7 Defining experiments and writing articles Inteligência Artificial 7 - Gene finding, secondary structure Compiladores 7 - Projeto de Software II - 7 Sistemas de Computação Móvel DCG DCG Mobile solutions for storage and processing Middleware para Sistemas Distribuídos DCG DCG Distributed protocols Interface Humano-Computador DCG 5 ELSI of genomics, interfaces for biologists Processamento Estruturado de Documentos Métodos Computacionais Aplicados à Educação DCG - DCG - Processamento de Imagens DCG - Microarrays, diagnosis annotation Empreendedorismo DCG - Fundamentos de Tolerância a Falhas DCG - 106
107 CC and SI disciplines Disciplinas CC SI Programação Paralela DCG - Parallelize bioinformatics algorithms Visão Computacional DCG - Programação de Jogos 3D DCG - Computação Gráfica Avançada DCG - Projeto de Sistemas Digitais DCG - Concepção de Circuitos Integrados DCG - Bases da Gestão Eletrônica de Documento - DCG LIMS, management of donor consents Desenvolvimento de Software Para A Web - DCG Web services and data access Linguagens de Marcação Extensíveis - DCG Ontologies Modelagens de Processos de Negócios - DCG Preparação para Maratona de Programação I - DCG Sistemas Colaborativos - DCG Mineração de Dados - DCG Gene finding, secondary struct., clustering Sistemas Inteligentes - DCG Análise e Projeto de Sistemas Orientados a Objetos - DCG * 107
108 Final remarks exciting problems to work on Donald Knuth,
109 Further reading Bioinformatics An introduction for Computer Scientists. Jacques Cohen / Bioinformatics: Genes, Proteins & Computers. Christine Orengo David Jones Janet Thornton Introduction to Bioinformatics. Arthur M. Lesk Bioinformatics for Dummies. Jean-Michel Claverie Cedric Notredame Thank you! Vinicius Vielmo Cogo 109
Agreement on Dual Degree Bachelor Program in Computer Science. Universidade Federal do Rio Grande do Sul. Technische Universität Berlin
Agreement on Dual Degree Bachelor Program in Computer Science between Universidade Federal do Rio Grande do Sul Instituto de Informática and Technische Universität Berlin School of Electrical Engineering
More informationMSc on Applied Computer Science
MSc on Applied Computer Science (Majors in Security and Computational Biology) Amílcar Sernadas, Ana Teresa Freitas, Arlindo Oliveira, Isabel Sá Correia May 18, 2006 1 Context The new model for high education
More informationCore Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1
Core Bioinformatics 2014/2015 Code: 42397 ECTS Credits: 12 Degree Type Year Semester 4313473 Bioinformàtica/Bioinformatics OB 0 1 Contact Name: Sònia Casillas Viladerrams Email: Sonia.Casillas@uab.cat
More informationAnalysis of NGS Data
Analysis of NGS Data Introduction and Basics Folie: 1 Overview of Analysis Workflow Images Basecalling Sequences denovo - Sequencing Assembly Annotation Resequencing Alignments Comparison to reference
More informationBIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS
BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title:
More informationBio-Informatics Lectures. A Short Introduction
Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively
More informationVersion 5.0 Release Notes
Version 5.0 Release Notes 2011 Gene Codes Corporation Gene Codes Corporation 775 Technology Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) +1.734.769.7249 (elsewhere) +1.734.769.7074 (fax) www.genecodes.com
More informationBioinformatics Grid - Enabled Tools For Biologists.
Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis
More informationGuidelines for Establishment of Contract Areas Computer Science Department
Guidelines for Establishment of Contract Areas Computer Science Department Current 07/01/07 Statement: The Contract Area is designed to allow a student, in cooperation with a member of the Computer Science
More informationCore Bioinformatics. Titulació Tipus Curs Semestre. 4313473 Bioinformàtica/Bioinformatics OB 0 1
Core Bioinformatics 2014/2015 Codi: 42397 Crèdits: 12 Titulació Tipus Curs Semestre 4313473 Bioinformàtica/Bioinformatics OB 0 1 Professor de contacte Nom: Sònia Casillas Viladerrams Correu electrònic:
More informationDoctor of Philosophy in Computer Science
Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects
More informationBIOINF 525 Winter 2016 Foundations of Bioinformatics and Systems Biology http://tinyurl.com/bioinf525-w16
Course Director: Dr. Barry Grant (DCM&B, bjgrant@med.umich.edu) Description: This is a three module course covering (1) Foundations of Bioinformatics, (2) Statistics in Bioinformatics, and (3) Systems
More informationGuide for Bioinformatics Project Module 3
Structure- Based Evidence and Multiple Sequence Alignment In this module we will revisit some topics we started to look at while performing our BLAST search and looking at the CDD database in the first
More informationSGI. High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems. January, 2012. Abstract. Haruna Cofer*, PhD
White Paper SGI High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems Haruna Cofer*, PhD January, 2012 Abstract The SGI High Throughput Computing (HTC) Wrapper
More informationREGULATIONS FOR THE DEGREE OF BACHELOR OF SCIENCE IN BIOINFORMATICS (BSc[BioInf])
820 REGULATIONS FOR THE DEGREE OF BACHELOR OF SCIENCE IN BIOINFORMATICS (BSc[BioInf]) (See also General Regulations) BMS1 Admission to the Degree To be eligible for admission to the degree of Bachelor
More information14.10.2014. Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)
Overview Kyrre Glette kyrrehg@ifi INF3490 Swarm Intelligence Particle Swarm Optimization Introduction to swarm intelligence principles Particle Swarm Optimization (PSO) 3 Swarms in nature Fish, birds,
More informationUGENE Quick Start Guide
Quick Start Guide This document contains a quick introduction to UGENE. For more detailed information, you can find the UGENE User Manual and other special manuals in project website: http://ugene.unipro.ru.
More informationLecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions
SMA 50: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ()) Spring Term (Feb May 200) Faculty: Professor Roy Welsch Wed 0 Feb 7:00-8:0
More informationPairwise Sequence Alignment
Pairwise Sequence Alignment carolin.kosiol@vetmeduni.ac.at SS 2013 Outline Pairwise sequence alignment global - Needleman Wunsch Gotoh algorithm local - Smith Waterman algorithm BLAST - heuristics What
More informationCurriculum Reform in Computing in Spain
Curriculum Reform in Computing in Spain Sergio Luján Mora Deparment of Software and Computing Systems Content Introduction Computing Disciplines i Computer Engineering Computer Science Information Systems
More informationMaster's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University
Master's projects at ITMO University Daniil Chivilikhin PhD Student @ ITMO University General information Guidance from our lab's researchers Publishable results 2 Research areas Research at ITMO Evolutionary
More informationPROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: msm_eng@k-space.org
BIOINFTool: Bioinformatics and sequence data analysis in molecular biology using Matlab Mai S. Mabrouk 1, Marwa Hamdy 2, Marwa Mamdouh 2, Marwa Aboelfotoh 2,Yasser M. Kadah 2 1 Biomedical Engineering Department,
More informationHuman Genome and Human Genome Project. Louxin Zhang
Human Genome and Human Genome Project Louxin Zhang A Primer to Genomics Cells are the fundamental working units of every living systems. DNA is made of 4 nucleotide bases. The DNA sequence is the particular
More informationProtein & DNA Sequence Analysis. Bobbie-Jo Webb-Robertson May 3, 2004
Protein & DNA Sequence Analysis Bobbie-Jo Webb-Robertson May 3, 2004 Sequence Analysis Anything connected to identifying higher biological meaning out of raw sequence data. 2 Genomic & Proteomic Data Sequence
More informationNext Generation Sequencing
Next Generation Sequencing Technology and applications 10/1/2015 Jeroen Van Houdt - Genomics Core - KU Leuven - UZ Leuven 1 Landmarks in DNA sequencing 1953 Discovery of DNA double helix structure 1977
More informationCore Bioinformatics. Degree Type Year Semester
Core Bioinformatics 2015/2016 Code: 42397 ECTS Credits: 12 Degree Type Year Semester 4313473 Bioinformatics OB 0 1 Contact Name: Sònia Casillas Viladerrams Email: Sonia.Casillas@uab.cat Teachers Use of
More informationProfessional Organization Checklist for the Computer Science Curriculum Updates. Association of Computing Machinery Computing Curricula 2008
Professional Organization Checklist for the Computer Science Curriculum Updates Association of Computing Machinery Computing Curricula 2008 The curriculum guidelines can be found in Appendix C of the report
More informationBachelor of Games and Virtual Worlds (Programming) Subject and Course Summaries
First Semester Development 1A On completion of this subject students will be able to apply basic programming and problem solving skills in a 3 rd generation object-oriented programming language (such as
More information01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours.
(International Program) 01219141 Object-Oriented Modeling and Programming 3 (3-0) Object concepts, object-oriented design and analysis, object-oriented analysis relating to developing conceptual models
More informationGenome Explorer For Comparative Genome Analysis
Genome Explorer For Comparative Genome Analysis Jenn Conn 1, Jo L. Dicks 1 and Ian N. Roberts 2 Abstract Genome Explorer brings together the tools required to build and compare phylogenies from both sequence
More informationCancer Genomics: What Does It Mean for You?
Cancer Genomics: What Does It Mean for You? The Connection Between Cancer and DNA One person dies from cancer each minute in the United States. That s 1,500 deaths each day. As the population ages, this
More informationDatabases and mapping BWA. Samtools
Databases and mapping BWA Samtools FASTQ, SFF, bax.h5 ACE, FASTG FASTA BAM/SAM GFF, BED GenBank/Embl/DDJB many more File formats FASTQ Output format from Illumina and IonTorrent sequencers. Quality scores:
More informationNext generation sequencing (NGS)
Next generation sequencing (NGS) Vijayachitra Modhukur BIIT modhukur@ut.ee 1 Bioinformatics course 11/13/12 Sequencing 2 Bioinformatics course 11/13/12 Microarrays vs NGS Sequences do not need to be known
More informationProtein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
More informationFocusing on results not data comprehensive data analysis for targeted next generation sequencing
Focusing on results not data comprehensive data analysis for targeted next generation sequencing Daniel Swan, Jolyon Holdstock, Angela Matchan, Richard Stark, John Shovelton, Duarte Mohla and Simon Hughes
More informationA Primer of Genome Science THIRD
A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:
More informationDoctor of Philosophy in Informatics
Doctor of Philosophy in Informatics 2014 Handbook Indiana University established the School of Informatics and Computing as a place where innovative multidisciplinary programs could thrive, a program where
More informationEoulsan Analyse du séquençage à haut débit dans le cloud et sur la grille
Eoulsan Analyse du séquençage à haut débit dans le cloud et sur la grille Journées SUCCES Stéphane Le Crom (UPMC IBENS) stephane.le_crom@upmc.fr Paris November 2013 The Sanger DNA sequencing method Sequencing
More informationIntroduction to Bioinformatics 3. DNA editing and contig assembly
Introduction to Bioinformatics 3. DNA editing and contig assembly Benjamin F. Matthews United States Department of Agriculture Soybean Genomics and Improvement Laboratory Beltsville, MD 20708 matthewb@ba.ars.usda.gov
More informationJust the Facts: A Basic Introduction to the Science Underlying NCBI Resources
1 of 8 11/7/2004 11:00 AM National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools
More informationNext Generation Sequencing: Technology, Mapping, and Analysis
Next Generation Sequencing: Technology, Mapping, and Analysis Gary Benson Computer Science, Biology, Bioinformatics Boston University gbenson@bu.edu http://tandem.bu.edu/ The Human Genome Project took
More informationIntroduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
More informationRETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison
RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the
More informationBachelor Curriculum in cooperation with
K 0/675 Bachelor Curriculum in cooperation with Přírodovědecká fakulta, PRF, Faculty of Science Jihočeská univerzita, University of South Bohemia in Budweis (České Budějovice) Česká republika, Czech Republic
More informationWhen you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want
1 When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want to search other databases as well. There are very
More informationBachelor Curriculum in cooperation with
K 0/675 Bachelor Curriculum in cooperation with Přírodovědecká fakulta, PRF, Faculty of Science Jihočeská univerzita, University of South Bohemia in Budweis (České Budějovice) Česká republika, Czech Republic
More informationSanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a
More informationAn Interdepartmental Ph.D. Program in Computational Biology and Bioinformatics:
An Interdepartmental Ph.D. Program in Computational Biology and Bioinformatics: The Yale Perspective Mark Gerstein, Ph.D. 1,2, Dov Greenbaum 1, Kei Cheung, Ph.D. 3,4,5, Perry L. Miller, M.D., Ph.D. 3,4,6
More informationSyllabus of B.Sc. (Bioinformatics) Subject- Bioinformatics (as one subject) B.Sc. I Year Semester I Paper I: Basic of Bioinformatics 85 marks
Syllabus of B.Sc. (Bioinformatics) Subject- Bioinformatics (as one subject) B.Sc. I Year Semester I Paper I: Basic of Bioinformatics 85 marks Semester II Paper II: Mathematics I 85 marks B.Sc. II Year
More informationIntroduction to NGS data analysis
Introduction to NGS data analysis Jeroen F. J. Laros Leiden Genome Technology Center Department of Human Genetics Center for Human and Clinical Genetics Sequencing Illumina platforms Characteristics: High
More informationEfficient Parallel Execution of Sequence Similarity Analysis Via Dynamic Load Balancing
Efficient Parallel Execution of Sequence Similarity Analysis Via Dynamic Load Balancing James D. Jackson Philip J. Hatcher Department of Computer Science Kingsbury Hall University of New Hampshire Durham,
More informationHadoopizer : a cloud environment for bioinformatics data analysis
Hadoopizer : a cloud environment for bioinformatics data analysis Anthony Bretaudeau (1), Olivier Sallou (2), Olivier Collin (3) (1) anthony.bretaudeau@irisa.fr, INRIA/Irisa, Campus de Beaulieu, 35042,
More informationT cell Epitope Prediction
Institute for Immunology and Informatics T cell Epitope Prediction EpiMatrix Eric Gustafson January 6, 2011 Overview Gathering raw data Popular sources Data Management Conservation Analysis Multiple Alignments
More informationBioinformatics: course introduction
Bioinformatics: course introduction Filip Železný Czech Technical University in Prague Faculty of Electrical Engineering Department of Cybernetics Intelligent Data Analysis lab http://ida.felk.cvut.cz
More informationREGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc])
244 REGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc]) (See also General Regulations) Any publication based on work approved for a higher degree should contain a reference
More informationCD-HIT User s Guide. Last updated: April 5, 2010. http://cd-hit.org http://bioinformatics.org/cd-hit/
CD-HIT User s Guide Last updated: April 5, 2010 http://cd-hit.org http://bioinformatics.org/cd-hit/ Program developed by Weizhong Li s lab at UCSD http://weizhong-lab.ucsd.edu liwz@sdsc.edu 1. Introduction
More informationNew solutions for Big Data Analysis and Visualization
New solutions for Big Data Analysis and Visualization From HPC to cloud-based solutions Barcelona, February 2013 Nacho Medina imedina@cipf.es http://bioinfo.cipf.es/imedina Head of the Computational Biology
More informationTutorial for Windows and Macintosh. Preparing Your Data for NGS Alignment
Tutorial for Windows and Macintosh Preparing Your Data for NGS Alignment 2015 Gene Codes Corporation Gene Codes Corporation 775 Technology Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) 1.734.769.7249
More informationUF EDGE brings the classroom to you with online, worldwide course delivery!
What is the University of Florida EDGE Program? EDGE enables engineering professional, military members, and students worldwide to participate in courses, certificates, and degree programs from the UF
More informationSimilarity Searches on Sequence Databases: BLAST, FASTA. Lorenza Bordoli Swiss Institute of Bioinformatics EMBnet Course, Basel, October 2003
Similarity Searches on Sequence Databases: BLAST, FASTA Lorenza Bordoli Swiss Institute of Bioinformatics EMBnet Course, Basel, October 2003 Outline Importance of Similarity Heuristic Sequence Alignment:
More informationHow To Get A Masters Degree In Biomedical Informatics
UNIVERSITY OF COLOMBO POSTGRADUATE INSTITUTE OF MEDICINE OF SRI LANKA Drafted and printed by the Board of Study in Multi Disciplinary Study Courses and the Postgraduate Institute of Medicine, University
More informationBiomedical Big Data and Precision Medicine
Biomedical Big Data and Precision Medicine Jie Yang Department of Mathematics, Statistics, and Computer Science University of Illinois at Chicago October 8, 2015 1 Explosion of Biomedical Data 2 Types
More informationNetwork Protocol Analysis using Bioinformatics Algorithms
Network Protocol Analysis using Bioinformatics Algorithms Marshall A. Beddoe Marshall_Beddoe@McAfee.com ABSTRACT Network protocol analysis is currently performed by hand using only intuition and a protocol
More informationREGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc])
305 REGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc]) (See also General Regulations) Any publication based on work approved for a higher degree should contain a reference
More informationFlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem
FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem Elsa Bernard Laurent Jacob Julien Mairal Jean-Philippe Vert September 24, 2013 Abstract FlipFlop implements a fast method for de novo transcript
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More informationREGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc])
299 REGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc]) (See also General Regulations) Any publication based on work approved for a higher degree should contain a reference
More informationAn example of bioinformatics application on plant breeding projects in Rijk Zwaan
An example of bioinformatics application on plant breeding projects in Rijk Zwaan Xiangyu Rao 17-08-2012 Introduction of RZ Rijk Zwaan is active worldwide as a vegetable breeding company that focuses on
More informationCCR Biology - Chapter 9 Practice Test - Summer 2012
Name: Class: Date: CCR Biology - Chapter 9 Practice Test - Summer 2012 Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Genetic engineering is possible
More informationUsability in bioinformatics mobile applications
Usability in bioinformatics mobile applications what we are working on Noura Chelbah, Sergio Díaz, Óscar Torreño, and myself Juan Falgueras App name Performs Advantajes Dissatvantajes Link The problem
More informationLearning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
More informationData Processing of Nextera Mate Pair Reads on Illumina Sequencing Platforms
Data Processing of Nextera Mate Pair Reads on Illumina Sequencing Platforms Introduction Mate pair sequencing enables the generation of libraries with insert sizes in the range of several kilobases (Kb).
More informationLecture Outline. Introduction to Databases. Introduction. Data Formats Sample databases How to text search databases. Shifra Ben-Dor Irit Orr
Introduction to Databases Shifra Ben-Dor Irit Orr Lecture Outline Introduction Data and Database types Database components Data Formats Sample databases How to text search databases What units of information
More informationCurrent Motif Discovery Tools and their Limitations
Current Motif Discovery Tools and their Limitations Philipp Bucher SIB / CIG Workshop 3 October 2006 Trendy Concepts and Hypotheses Transcription regulatory elements act in a context-dependent manner.
More informationAlgorithms in Computational Biology (236522) spring 2007 Lecture #1
Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: Tuesday 11:00-12:00/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office
More informationHigh Performance Compu2ng Facility
High Performance Compu2ng Facility Center for Health Informa2cs and Bioinforma2cs Accelera2ng Scien2fic Discovery and Innova2on in Biomedical Research at NYULMC through Advanced Compu2ng Efstra'os Efstathiadis,
More informationIntro to Bioinformatics
Intro to Bioinformatics Marylyn D Ritchie, PhD Professor, Biochemistry and Molecular Biology Director, Center for Systems Genomics The Pennsylvania State University Sarah A Pendergrass, PhD Research Associate
More informationNazneen Aziz, PhD. Director, Molecular Medicine Transformation Program Office
2013 Laboratory Accreditation Program Audioconferences and Webinars Implementing Next Generation Sequencing (NGS) as a Clinical Tool in the Laboratory Nazneen Aziz, PhD Director, Molecular Medicine Transformation
More informationAn Introduction to Data Mining
An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
More information200630 - FBIO - Fundations of Bioinformatics
Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 200 - FME - School of Mathematics and Statistics 1004 - UB - (ENG)Universitat de Barcelona MASTER'S DEGREE IN STATISTICS AND
More informationIC05 Introduction on Networks &Visualization Nov. 2009. <mathieu.bastian@gmail.com>
IC05 Introduction on Networks &Visualization Nov. 2009 Overview 1. Networks Introduction Networks across disciplines Properties Models 2. Visualization InfoVis Data exploration
More informationIntroduction to the Mathematics of Big Data. Philippe B. Laval
Introduction to the Mathematics of Big Data Philippe B. Laval Fall 2015 Introduction In recent years, Big Data has become more than just a buzz word. Every major field of science, engineering, business,
More informationShouguo Gao Ph. D Department of Physics and Comprehensive Diabetes Center
Computational Challenges in Storage, Analysis and Interpretation of Next-Generation Sequencing Data Shouguo Gao Ph. D Department of Physics and Comprehensive Diabetes Center Next Generation Sequencing
More informationBecker Muscular Dystrophy
Muscular Dystrophy A Case Study of Positional Cloning Described by Benjamin Duchenne (1868) X-linked recessive disease causing severe muscular degeneration. 100 % penetrance X d Y affected male Frequency
More informationPipeline Pilot Enterprise Server. Flexible Integration of Disparate Data and Applications. Capture and Deployment of Best Practices
overview Pipeline Pilot Enterprise Server Pipeline Pilot Enterprise Server (PPES) is a powerful client-server platform that streamlines the integration and analysis of the vast quantities of data flooding
More informationMaster s Program in Information Systems
The University of Jordan King Abdullah II School for Information Technology Department of Information Systems Master s Program in Information Systems 2006/2007 Study Plan Master Degree in Information Systems
More informationFinal Project Report
CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes
More informationSearching Nucleotide Databases
Searching Nucleotide Databases 1 When we search a nucleic acid databases, Mascot always performs a 6 frame translation on the fly. That is, 3 reading frames from the forward strand and 3 reading frames
More informationSequence Formats and Sequence Database Searches. Gloria Rendon SC11 Education June, 2011
Sequence Formats and Sequence Database Searches Gloria Rendon SC11 Education June, 2011 Sequence A is the primary structure of a biological molecule. It is a chain of residues that form a precise linear
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationMachine Learning. 01 - Introduction
Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge
More informationRemoving Sequential Bottlenecks in Analysis of Next-Generation Sequencing Data
Removing Sequential Bottlenecks in Analysis of Next-Generation Sequencing Data Yi Wang, Gagan Agrawal, Gulcin Ozer and Kun Huang The Ohio State University HiCOMB 2014 May 19 th, Phoenix, Arizona 1 Outline
More informationDelivering the power of the world s most successful genomics platform
Delivering the power of the world s most successful genomics platform NextCODE Health is bringing the full power of the world s largest and most successful genomics platform to everyday clinical care NextCODE
More informationGenBank: A Database of Genetic Sequence Data
GenBank: A Database of Genetic Sequence Data Computer Science 105 Boston University David G. Sullivan, Ph.D. An Explosion of Scientific Data Scientists are generating ever increasing amounts of data. Relevant
More informationUsing MATLAB: Bioinformatics Toolbox for Life Sciences
Using MATLAB: Bioinformatics Toolbox for Life Sciences MR. SARAWUT WONGPHAYAK BIOINFORMATICS PROGRAM, SCHOOL OF BIORESOURCES AND TECHNOLOGY, AND SCHOOL OF INFORMATION TECHNOLOGY, KING MONGKUT S UNIVERSITY
More informationOverview. Overarching observations
Overview Genomics and Health Information Technology Systems: Exploring the Issues April 27-28, 2011, Bethesda, MD Brief Meeting Summary, prepared by Greg Feero, M.D., Ph.D. (planning committee chair) The
More informationOrganization and analysis of NGS variations. Alireza Hadj Khodabakhshi Research Investigator
Organization and analysis of NGS variations. Alireza Hadj Khodabakhshi Research Investigator Why is the NGS data processing a big challenge? Computation cannot keep up with the Biology. Source: illumina
More informationLearning is a very general term denoting the way in which agents:
What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);
More informationImportance of Statistics in creating high dimensional data
Importance of Statistics in creating high dimensional data Hemant K. Tiwari, PhD Section on Statistical Genetics Department of Biostatistics University of Alabama at Birmingham History of Genomic Data
More informationEFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
More information