Bioinformatics: course introduction
|
|
|
- Rosaline Green
- 10 years ago
- Views:
Transcription
1 Bioinformatics: course introduction Filip Železný Czech Technical University in Prague Faculty of Electrical Engineering Department of Cybernetics Intelligent Data Analysis lab Filip Železný (ČVUT) Bioinformatics - intro 1 / 38
2 A6M33BIN Purpose of this course: Understand the computational problems in bioinformatics, the available types of data and databases, and the algorithms that solve the problems. Methods/Prerequisities mainly: probability and statistics, algorithms (complexity classes), programming skills also: discrete math topics (graphs, automata), relational databases Lectures by FZ may be held in English (pending your consensus) Purpose of this lecture Sneak informal preview of the major bioinformatics topics Filip Železný (ČVUT) Bioinformatics - intro 2 / 38
3 Teachers Doc. Filip Železný CTU Prague, Dept. of Cybernetics Dr. Jiří Kléma CTU Prague, Dept. of Cybernetics Prof. Zdeněk Sedláček Charles Univ., Dept. of Biology and Medical Genetics Ing. Ondřej Kuželka CTU Prague, Dept. of Cybernetics Filip Železný (ČVUT) Bioinformatics - intro 3 / 38
4 Course materials Main page find a6m33bin on department s courseware page Course largely based on Mark Craven s bioinformatics class page at UW Wisconsin Contains a lot of links to useful materials in English Links will be also continually added to our CW The only Czech bioinformatics book Fatima Cvrčková: Úvod do praktické bioinformatiky (Academia, 2006) user-oriented, for biologists/medics, not informaticians Filip Železný (ČVUT) Bioinformatics - intro 4 / 38
5 Bioinformatics Bioinformatics representation storage retrieval analysis of gene- and protein-centric biological data Not just bio databases! Also: computational biology Related: systems biology, structural biology Filip Železný (ČVUT) Bioinformatics - intro 5 / 38
6 Bioinformatics: Main sources of data Information processes inside each cell which govern the entire organism. Filip Železný (ČVUT) Bioinformatics - intro 6 / 38
7 Bioinformatics vs. Biomedical Informatics Biomedical informatics includes Bioinformatics but also other fields such as signal analysis image analysis healthcare informatics not usually associated with bioinformatics. Filip Železný (ČVUT) Bioinformatics - intro 7 / 38
8 Bioinformatics vs. Bio-Inspired Computing Artificial neural networks Swarm intelligence Genetic algorithms DNA computing Also computers + biology but not bioinformatics Filip Železný (ČVUT) Bioinformatics - intro 8 / 38
9 Bioinformatics vs. Bioinformatics mimosmyslova spionaz dalkove pozorovani i.htm Podle definičního třídění ruských vědců rozlišujeme dva obory paranormálních jevů: bioinformatika a bioenergetika. Bioinformatika (tzn. mimosmyslové vnímání, ESP) zahrnuje získávání a výměnu informací mimosmyslovou cestou (nikoli normálními smyslovými orgány). V podstatě rozlišujeme následující formy bioinformace: hypnózu (kontrolu vědomí), telepatii, dálkové vnímání, prekognici, retrokognici, mimotělní zkušenost, vidění rukama nebo jinými částmi těla, inspiraci a zjevení. not bioinformatics Filip Železný (ČVUT) Bioinformatics - intro 9 / 38
10 Bioinformatics: Impact Worldwide Basic biological research Personalized health care Gene-therapy Drug discovery etc. Czech landscape Small community (FEL, VSCHT, MMF,...) High demand (IKEM, IEM, IMB,...) come to see our projects Filip Železný (ČVUT) Bioinformatics - intro 10 / 38
11 Bioinformatics: origins 1950 s: Fred Sanger deciphers the sequence of letters (amino acids) in the insulin protein 51 letters Filip Železný (ČVUT) Bioinformatics - intro 11 / 38
12 Bioinformatics: origins 2004: Human Genome (DNA) deciphered billions of letters (nucleic acids) Filip Železný (ČVUT) Bioinformatics - intro 12 / 38
13 Progress in Sequencing Sequencing: reading the letters in the macromolecules of interest Work continues: population sequencing (not just 1 individual), variation analysis Extinct species (Neandertal genome sequenced in 2010) Filip Železný (ČVUT) Bioinformatics - intro 13 / 38
14 Shotgun sequencing DNA letters can be read only small sequences Shotgun approach: first shatter DNA into fragments Classical bioinformatics problem: assemble a genome from the read sequence fragments Shortest superstring problem Graph-theoretical formulations (Hamiltonian / Eulerian path finding) Filip Železný (ČVUT) Bioinformatics - intro 14 / 38
15 Databases Read bio sequences are stored in public databases Main umbrella institutes European Bioinformatics US National Center for Institute (EBI) Biotechnology Information (NCBI) Protein databases: Protein Data Bank (PDB), SWISS-PROT,... Gene databases: EMBL, GenBank, Entrez,... Many more Mutually interlinked Filip Železný (ČVUT) Bioinformatics - intro 15 / 38
16 Database Retrieval by Similarity Typical biologist s problem: retrieve sequences similar to one I have (protein, DNA fragment,..) Sequence similarity may imply homology (descent from a common ancestor) and similar functions Similarity is tricky: insertions and deletions must be considered Bioinformatics problem: find and score the best possible alignment Dynamic programming, heuristic methods,... Filip Železný (ČVUT) Bioinformatics - intro 16 / 38
17 Whole Genome Similarity Entire genomes (not just fragments) may be aligned Reveal relatedness between organisms Further complications come into play variations in repeat numbers inversions etc. Filip Železný (ČVUT) Bioinformatics - intro 17 / 38
18 Inference of Phylogenetic Trees Given a pairwise similarity function, and a set of genomes, infer the optimal phylogenetic tree of the corresponding organisms Application of hierarchical clustering A modern approach to replace phenotype-based taxonomy Filip Železný (ČVUT) Bioinformatics - intro 18 / 38
19 Multiple Sequence Alignment Aligning more than two sequences Reveal shared evolutionary origins (conserved domains) NP-complete problem (exp time in the number of aligned sequences) Filip Železný (ČVUT) Bioinformatics - intro 19 / 38
20 Probabilistic Sequence Models specific sites (substrings) on a sequence have specific roles e.g. genes or promoters on DNA, active sites on proteins How to tell them apart? Markov Chain Model Each type of site has a different probabilistic model Filip Železný (ČVUT) Bioinformatics - intro 20 / 38
21 Protein Spatial Structure From the DNA nucleic-acid sequence, the protein amino-acid sequence is constructed by cell machinery The protein folds into a complex spatial conformation Spatial conformation can be determined at high cost e.g. X-ray crystallography Determined structures are deposited in public protein data bases Filip Železný (ČVUT) Bioinformatics - intro 21 / 38
22 Protein Structure Prediction Can we compute protein structure from sequence? At least distinguish α-helices from β-sheets Very difficult, not yet solved problem Approches include machine learning Filip Železný (ČVUT) Bioinformatics - intro 22 / 38
23 Protein Function Prediction Protein function is given by its geometrical conformation E.g., ability to bind to DNA or to other proteins The active site (shown in purple) is most important Important machine-learning tasks: prediction of function from structure detection of active sites within structure Filip Železný (ČVUT) Bioinformatics - intro 23 / 38
24 Protein Docking Problem Proteins interact by docking Will a protein dock into another protein? Optimization problem in a geometrical setting Important for novel drug discovery e.g: green - receptor, red - drug the trouble is, the protein may dock also in many unwanted receptors immensely hard computational problems under uncertainty Filip Železný (ČVUT) Bioinformatics - intro 24 / 38
25 Gene Expression Analysis A gene is expressed if the cell produces proteins according to it Rate of expression can be measured for thousands of genes simultaneously by microarrays Can we predict phenotype (e.g. diseases) by gene expression profiling? Filip Železný (ČVUT) Bioinformatics - intro 25 / 38
26 High-throughput data analysis Gene expression data are called high-troughput since lots of measurements (thousands of genes) are produced in a single experiment Puts biologists in a new, difficult situation: how to interpret such data? Example problems: Too many suspects (genes), multiple hypothesis testing How to spot functional patterns among so many variables? How to construct multi-factorial predictive models? Wide opportunities for novel data analysis methods, incl. machine learning Filip Železný (ČVUT) Bioinformatics - intro 26 / 38
27 Other high-throughput technologies Methylation arrays (epigenetics) Chip-on-chip (protein X DNA interactions) mass spectrometry (presence of proteins)..and more Filip Železný (ČVUT) Bioinformatics - intro 27 / 38
28 Genome-wide association studies Correlates traits (e.g. susceptibility to disease) to genetic variations variations : single nucleotide polymorphisms (SNP) in DNA sequence involves a population of people X: SNP s, Y: level of association Filip Železný (ČVUT) Bioinformatics - intro 28 / 38
29 Gene Regulatory Networks Feedback loops in expression: (a protein coded by) a gene influences the expression of another gene positively (transcription factor) or negatively (inhibitor) Results in extremly complex networks with intricate dynamics Most of regulatory networks are unknown or only partially known. Can we infer such networks from time-stamped gene expression data? Filip Železný (ČVUT) Bioinformatics - intro 29 / 38
30 Metabolic Networks Capture metabolism (energy processing) in cells Involves gene/proteins but also other molecules Computational problems similar as in gene regulation networks Filip Železný (ČVUT) Bioinformatics - intro 30 / 38
31 Exploiting Background Knowledge The bioinformatics tasks exemplified so far followed the pattern Data Genomic knowledge A lot of relevant formal (computer-understandable) knowledge available so the equation should be Data + Current Genomic Knowledge New Genomic Knowledge for example: Gene expression data + Known functions of genes Phenotype linked to a gene function But how to represent backround knowledge and use it systematically in data analysis? Important bioinformatics problem Filip Železný (ČVUT) Bioinformatics - intro 31 / 38
32 Examples of Genomic Background Knowledge scientific abstracts gene ontology interaction networks and many other kinds Filip Železný (ČVUT) Bioinformatics - intro 32 / 38
33 Bioinformatics at the IDA lab Protein structure analysis with machine learning Prodigy Software Filip Železný (ČVUT) Bioinformatics - intro 33 / 38
34 Bioinformatics at the IDA lab Gene expression analysis with machine learning Filip Železný (ČVUT) Bioinformatics - intro 34 / 38
35 Bioinformatics at the IDA lab Gene expression analysis with machine learning Filip Železný (ČVUT) Bioinformatics - intro 35 / 38
36 Bioinformatics at the IDA lab Applications in medical studies Filip Železný (ČVUT) Bioinformatics - intro 36 / 38
37 Bioinformatics at the IDA lab If you find this course interesting, you can take part in IDA s research! Filip Železný (ČVUT) Bioinformatics - intro 37 / 38
RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison
RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the
BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS
BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title:
A Primer of Genome Science THIRD
A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:
BIOINF 525 Winter 2016 Foundations of Bioinformatics and Systems Biology http://tinyurl.com/bioinf525-w16
Course Director: Dr. Barry Grant (DCM&B, [email protected]) Description: This is a three module course covering (1) Foundations of Bioinformatics, (2) Statistics in Bioinformatics, and (3) Systems
Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources
1 of 8 11/7/2004 11:00 AM National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools
Core Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1
Core Bioinformatics 2014/2015 Code: 42397 ECTS Credits: 12 Degree Type Year Semester 4313473 Bioinformàtica/Bioinformatics OB 0 1 Contact Name: Sònia Casillas Viladerrams Email: [email protected]
Introduction to Bioinformatics 3. DNA editing and contig assembly
Introduction to Bioinformatics 3. DNA editing and contig assembly Benjamin F. Matthews United States Department of Agriculture Soybean Genomics and Improvement Laboratory Beltsville, MD 20708 [email protected]
Teaching Bioinformatics to Undergraduates
Teaching Bioinformatics to Undergraduates http://www.med.nyu.edu/rcr/asm Stuart M. Brown Research Computing, NYU School of Medicine I. What is Bioinformatics? II. Challenges of teaching bioinformatics
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
Integrating Bioinformatics, Medical Sciences and Drug Discovery
Integrating Bioinformatics, Medical Sciences and Drug Discovery M. Madan Babu Centre for Biotechnology, Anna University, Chennai - 600025 phone: 44-4332179 :: email: [email protected] Bioinformatics
A Multiple DNA Sequence Translation Tool Incorporating Web Robot and Intelligent Recommendation Techniques
Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 402 A Multiple DNA Sequence Translation Tool Incorporating Web
BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS
BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS 1. The Technology Strategy sets out six areas where technological developments are required to push the frontiers of knowledge
Data Integration. Lectures 16 & 17. ECS289A, WQ03, Filkov
Data Integration Lectures 16 & 17 Lectures Outline Goals for Data Integration Homogeneous data integration time series data (Filkov et al. 2002) Heterogeneous data integration microarray + sequence microarray
How many of you have checked out the web site on protein-dna interactions?
How many of you have checked out the web site on protein-dna interactions? Example of an approximately 40,000 probe spotted oligo microarray with enlarged inset to show detail. Find and be ready to discuss
Molecular Genetics: Challenges for Statistical Practice. J.K. Lindsey
Molecular Genetics: Challenges for Statistical Practice J.K. Lindsey 1. What is a Microarray? 2. Design Questions 3. Modelling Questions 4. Longitudinal Data 5. Conclusions 1. What is a microarray? A microarray
REGULATIONS FOR THE DEGREE OF BACHELOR OF SCIENCE IN BIOINFORMATICS (BSc[BioInf])
820 REGULATIONS FOR THE DEGREE OF BACHELOR OF SCIENCE IN BIOINFORMATICS (BSc[BioInf]) (See also General Regulations) BMS1 Admission to the Degree To be eligible for admission to the degree of Bachelor
Next Generation Sequencing: Technology, Mapping, and Analysis
Next Generation Sequencing: Technology, Mapping, and Analysis Gary Benson Computer Science, Biology, Bioinformatics Boston University [email protected] http://tandem.bu.edu/ The Human Genome Project took
Bioinformatics Grid - Enabled Tools For Biologists.
Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis
CCR Biology - Chapter 9 Practice Test - Summer 2012
Name: Class: Date: CCR Biology - Chapter 9 Practice Test - Summer 2012 Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Genetic engineering is possible
Module 1. Sequence Formats and Retrieval. Charles Steward
The Open Door Workshop Module 1 Sequence Formats and Retrieval Charles Steward 1 Aims Acquaint you with different file formats and associated annotations. Introduce different nucleotide and protein databases.
Algorithms in Computational Biology (236522) spring 2007 Lecture #1
Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: Tuesday 11:00-12:00/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office
Guide for Bioinformatics Project Module 3
Structure- Based Evidence and Multiple Sequence Alignment In this module we will revisit some topics we started to look at while performing our BLAST search and looking at the CDD database in the first
PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: [email protected]
BIOINFTool: Bioinformatics and sequence data analysis in molecular biology using Matlab Mai S. Mabrouk 1, Marwa Hamdy 2, Marwa Mamdouh 2, Marwa Aboelfotoh 2,Yasser M. Kadah 2 1 Biomedical Engineering Department,
Intro to Bioinformatics
Intro to Bioinformatics Marylyn D Ritchie, PhD Professor, Biochemistry and Molecular Biology Director, Center for Systems Genomics The Pennsylvania State University Sarah A Pendergrass, PhD Research Associate
Linear Sequence Analysis. 3-D Structure Analysis
Linear Sequence Analysis What can you learn from a (single) protein sequence? Calculate it s physical properties Molecular weight (MW), isoelectric point (pi), amino acid content, hydropathy (hydrophilic
Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism )
Biology 1406 Exam 3 Notes Structure of DNA Ch. 10 Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Proteins
Core Facility Genomics
Core Facility Genomics versatile genome or transcriptome analyses based on quantifiable highthroughput data ascertainment 1 Topics Collaboration with Harald Binder and Clemens Kreutz Project: Microarray
SGI. High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems. January, 2012. Abstract. Haruna Cofer*, PhD
White Paper SGI High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems Haruna Cofer*, PhD January, 2012 Abstract The SGI High Throughput Computing (HTC) Wrapper
On Covert Data Communication Channels Employing DNA Steganography with Application in Massive Data Storage
ARAB ACADEMY FOR SCIENCE, TECHNOLOGY AND MARITIME TRANSPORT COLLEGE OF ENGINEERING AND TECHNOLOGY COMPUTER ENGINEERING DEPARTMENT On Covert Data Communication Channels Employing DNA Steganography with
Translation Study Guide
Translation Study Guide This study guide is a written version of the material you have seen presented in the replication unit. In translation, the cell uses the genetic information contained in mrna to
Biological Sciences Initiative. Human Genome
Biological Sciences Initiative HHMI Human Genome Introduction In 2000, researchers from around the world published a draft sequence of the entire genome. 20 labs from 6 countries worked on the sequence.
Biological Sequence Data Formats
Biological Sequence Data Formats Here we present three standard formats in which biological sequence data (DNA, RNA and protein) can be stored and presented. Raw Sequence: Data without description. FASTA
When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want
1 When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want to search other databases as well. There are very
Syllabus of B.Sc. (Bioinformatics) Subject- Bioinformatics (as one subject) B.Sc. I Year Semester I Paper I: Basic of Bioinformatics 85 marks
Syllabus of B.Sc. (Bioinformatics) Subject- Bioinformatics (as one subject) B.Sc. I Year Semester I Paper I: Basic of Bioinformatics 85 marks Semester II Paper II: Mathematics I 85 marks B.Sc. II Year
Vad är bioinformatik och varför behöver vi det i vården? a bioinformatician's perspectives
Vad är bioinformatik och varför behöver vi det i vården? a bioinformatician's perspectives [email protected] 2015-05-21 Functional Bioinformatics, Örebro University Vad är bioinformatik och varför
BIO 3352: BIOINFORMATICS II HYBRID COURSE SYLLABUS
BIO 3352: BIOINFORMATICS II HYBRID COURSE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title: Bioinformatics
Dr Alexander Henzing
Horizon 2020 Health, Demographic Change & Wellbeing EU funding, research and collaboration opportunities for 2016/17 Innovate UK funding opportunities in omics, bridging health and life sciences Dr Alexander
Human Genome and Human Genome Project. Louxin Zhang
Human Genome and Human Genome Project Louxin Zhang A Primer to Genomics Cells are the fundamental working units of every living systems. DNA is made of 4 nucleotide bases. The DNA sequence is the particular
Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University
Master's projects at ITMO University Daniil Chivilikhin PhD Student @ ITMO University General information Guidance from our lab's researchers Publishable results 2 Research areas Research at ITMO Evolutionary
Pairwise Sequence Alignment
Pairwise Sequence Alignment [email protected] SS 2013 Outline Pairwise sequence alignment global - Needleman Wunsch Gotoh algorithm local - Smith Waterman algorithm BLAST - heuristics What
University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology
University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology Programme Structure - the MSc outcome will require 180 credits total (full-time only) - 60
Biological Databases and Protein Sequence Analysis
Biological Databases and Protein Sequence Analysis Introduction M. Madan Babu, Center for Biotechnology, Anna University, Chennai 25, India Bioinformatics is the application of Information technology to
Lab 2/Phylogenetics/September 16, 2002 1 PHYLOGENETICS
Lab 2/Phylogenetics/September 16, 2002 1 Read: Tudge Chapter 2 PHYLOGENETICS Objective of the Lab: To understand how DNA and protein sequence information can be used to make comparisons and assess evolutionary
Bioinformatics Resources at a Glance
Bioinformatics Resources at a Glance A Note about FASTA Format There are MANY free bioinformatics tools available online. Bioinformaticists have developed a standard format for nucleotide and protein sequences
Doctor of Philosophy in Computer Science
Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects
Focusing on results not data comprehensive data analysis for targeted next generation sequencing
Focusing on results not data comprehensive data analysis for targeted next generation sequencing Daniel Swan, Jolyon Holdstock, Angela Matchan, Richard Stark, John Shovelton, Duarte Mohla and Simon Hughes
582606 Introduction to bioinformatics
582606 Introduction to bioinformatics Autumn 2007 Esa Pitkänen Master's Degree Programme in Bioinformatics (MBI) Department of Computer Science, University of Helsinki http://www.cs.helsinki.fi/mbi/courses/07-08/itb/
Gene mutation and molecular medicine Chapter 15
Gene mutation and molecular medicine Chapter 15 Lecture Objectives What Are Mutations? How Are DNA Molecules and Mutations Analyzed? How Do Defective Proteins Lead to Diseases? What DNA Changes Lead to
Searching Nucleotide Databases
Searching Nucleotide Databases 1 When we search a nucleic acid databases, Mascot always performs a 6 frame translation on the fly. That is, 3 reading frames from the forward strand and 3 reading frames
Distributed Data Mining in Discovery Net. Dr. Moustafa Ghanem Department of Computing Imperial College London
Distributed Data Mining in Discovery Net Dr. Moustafa Ghanem Department of Computing Imperial College London 1. What is Discovery Net 2. Distributed Data Mining for Compute Intensive Tasks 3. Distributed
A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks
A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks Text Analytics World, Boston, 2013 Lars Hard, CTO Agenda Difficult text analytics tasks Feature extraction Bio-inspired
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
Biology & Big Data. Debasis Mitra Professor, Computer Science, FIT
Biology & Big Data Debasis Mitra Professor, Computer Science, FIT Cloud? Debasis Mitra, Florida Tech Data as Service Transparent to user Multiple locations Robustness Software as Service Software location
Human-Mouse Synteny in Functional Genomics Experiment
Human-Mouse Synteny in Functional Genomics Experiment Ksenia Krasheninnikova University of the Russian Academy of Sciences, JetBrains [email protected] September 18, 2012 Ksenia Krasheninnikova
Genetomic Promototypes
Genetomic Promototypes Mirkó Palla and Dana Pe er Department of Mechanical Engineering Clarkson University Potsdam, New York and Department of Genetics Harvard Medical School 77 Avenue Louis Pasteur Boston,
Healthcare Analytics. Aryya Gangopadhyay UMBC
Healthcare Analytics Aryya Gangopadhyay UMBC Two of many projects Integrated network approach to personalized medicine Multidimensional and multimodal Dynamic Analyze interactions HealthMask Need for sharing
Analysis of gene expression data. Ulf Leser and Philippe Thomas
Analysis of gene expression data Ulf Leser and Philippe Thomas This Lecture Protein synthesis Microarray Idea Technologies Applications Problems Quality control Normalization Analysis next week! Ulf Leser:
Basic Concepts of DNA, Proteins, Genes and Genomes
Basic Concepts of DNA, Proteins, Genes and Genomes Kun-Mao Chao 1,2,3 1 Graduate Institute of Biomedical Electronics and Bioinformatics 2 Department of Computer Science and Information Engineering 3 Graduate
LESSON 9. Analyzing DNA Sequences and DNA Barcoding. Introduction. Learning Objectives
9 Analyzing DNA Sequences and DNA Barcoding Introduction DNA sequencing is performed by scientists in many different fields of biology. Many bioinformatics programs are used during the process of analyzing
Genomes and SNPs in Malaria and Sickle Cell Anemia
Genomes and SNPs in Malaria and Sickle Cell Anemia Introduction to Genome Browsing with Ensembl Ensembl The vast amount of information in biological databases today demands a way of organising and accessing
Big Data Healthcare. Fei Wang Associate Professor Department of Computer Science and Engineering School of Engineering University of Connecticut
Big Data Healthcare Fei Wang Associate Professor Department of Computer Science and Engineering School of Engineering University of Connecticut Healthcare Is in Crisis Hersh, W., Jacko, J. A., Greenes,
Introduction to Bioinformatics 2. DNA Sequence Retrieval and comparison
Introduction to Bioinformatics 2. DNA Sequence Retrieval and comparison Benjamin F. Matthews United States Department of Agriculture Soybean Genomics and Improvement Laboratory Beltsville, MD 20708 [email protected]
Bio-Informatics Lectures. A Short Introduction
Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively
Appendix 2 Molecular Biology Core Curriculum. Websites and Other Resources
Appendix 2 Molecular Biology Core Curriculum Websites and Other Resources Chapter 1 - The Molecular Basis of Cancer 1. Inside Cancer http://www.insidecancer.org/ From the Dolan DNA Learning Center Cold
Final Project Report
CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes
THE UNIVERSITY OF MANCHESTER Unit Specification
1. GENERAL INFORMATION Title Unit code Credit rating 15 Level 7 Contact hours 30 Other Scheduled teaching and learning activities* Pre-requisite units Co-requisite units School responsible Member of staff
SNP Essentials The same SNP story
HOW SNPS HELP RESEARCHERS FIND THE GENETIC CAUSES OF DISEASE SNP Essentials One of the findings of the Human Genome Project is that the DNA of any two people, all 3.1 billion molecules of it, is more than
BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS
BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi-110 012 [email protected] Genomics A genome is an organism s
T cell Epitope Prediction
Institute for Immunology and Informatics T cell Epitope Prediction EpiMatrix Eric Gustafson January 6, 2011 Overview Gathering raw data Popular sources Data Management Conservation Analysis Multiple Alignments
An Introduction to Genomics and SAS Scientific Discovery Solutions
An Introduction to Genomics and SAS Scientific Discovery Solutions Dr Karen M Miller Product Manager Bioinformatics SAS EMEA 16.06.03 Copyright 2003, SAS Institute Inc. All rights reserved. 1 Overview!
AP Biology Essential Knowledge Student Diagnostic
AP Biology Essential Knowledge Student Diagnostic Background The Essential Knowledge statements provided in the AP Biology Curriculum Framework are scientific claims describing phenomenon occurring in
Analyzing A DNA Sequence Chromatogram
LESSON 9 HANDOUT Analyzing A DNA Sequence Chromatogram Student Researcher Background: DNA Analysis and FinchTV DNA sequence data can be used to answer many types of questions. Because DNA sequences differ
Genome Explorer For Comparative Genome Analysis
Genome Explorer For Comparative Genome Analysis Jenn Conn 1, Jo L. Dicks 1 and Ian N. Roberts 2 Abstract Genome Explorer brings together the tools required to build and compare phylogenies from both sequence
Big Data Visualization for Genomics. Luca Vezzadini Kairos3D
Big Data Visualization for Genomics Luca Vezzadini Kairos3D Why GenomeCruzer? The amount of data for DNA sequencing is growing Modern hardware produces billions of values per sample Scientists need to
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
BioBoot Camp Genetics
BioBoot Camp Genetics BIO.B.1.2.1 Describe how the process of DNA replication results in the transmission and/or conservation of genetic information DNA Replication is the process of DNA being copied before
escience and Post-Genome Biomedical Research
escience and Post-Genome Biomedical Research Thomas L. Casavant, Adam P. DeLuca Departments of Biomedical Engineering, Electrical Engineering and Ophthalmology Coordinated Laboratory for Computational
Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences
Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.2 Community Needs of
Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions
SMA 50: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ()) Spring Term (Feb May 200) Faculty: Professor Roy Welsch Wed 0 Feb 7:00-8:0
Protein Sequence Analysis - Overview -
Protein Sequence Analysis - Overview - UDEL Workshop Raja Mazumder Research Associate Professor, Department of Biochemistry and Molecular Biology Georgetown University Medical Center Topics Why do protein
1. Introduction Gene regulation Genomics and genome analyses Hidden markov model (HMM)
1. Introduction Gene regulation Genomics and genome analyses Hidden markov model (HMM) 2. Gene regulation tools and methods Regulatory sequences and motif discovery TF binding sites, microrna target prediction
How To Understand The Science Of Genomics
Curs Bioinformática. Grau Genética GENÓMICA INTRODUCTION TO GENOME SCIENCE Antonio Barbadilla Group Genomics, Bioinformatics & Evolution Institut Biotecnologia I Biomedicina Departament de Genètica i Microbiologia
Sequencing the Human Genome
Revised and Updated Edvo-Kit #339 Sequencing the Human Genome 339 Experiment Objective: In this experiment, students will read DNA sequences obtained from automated DNA sequencing techniques. The data
Integrating DNA Motif Discovery and Genome-Wide Expression Analysis. Erin M. Conlon
Integrating DNA Motif Discovery and Genome-Wide Expression Analysis Department of Mathematics and Statistics University of Massachusetts Amherst Statistics in Functional Genomics Workshop Ascona, Switzerland
Biomedical Big Data and Precision Medicine
Biomedical Big Data and Precision Medicine Jie Yang Department of Mathematics, Statistics, and Computer Science University of Illinois at Chicago October 8, 2015 1 Explosion of Biomedical Data 2 Types
Using MATLAB: Bioinformatics Toolbox for Life Sciences
Using MATLAB: Bioinformatics Toolbox for Life Sciences MR. SARAWUT WONGPHAYAK BIOINFORMATICS PROGRAM, SCHOOL OF BIORESOURCES AND TECHNOLOGY, AND SCHOOL OF INFORMATION TECHNOLOGY, KING MONGKUT S UNIVERSITY
Sequence Formats and Sequence Database Searches. Gloria Rendon SC11 Education June, 2011
Sequence Formats and Sequence Database Searches Gloria Rendon SC11 Education June, 2011 Sequence A is the primary structure of a biological molecule. It is a chain of residues that form a precise linear
InSyBio BioNets: Utmost efficiency in gene expression data and biological networks analysis
InSyBio BioNets: Utmost efficiency in gene expression data and biological networks analysis WHITE PAPER By InSyBio Ltd Konstantinos Theofilatos Bioinformatician, PhD InSyBio Technical Sales Manager August
UGENE Quick Start Guide
Quick Start Guide This document contains a quick introduction to UGENE. For more detailed information, you can find the UGENE User Manual and other special manuals in project website: http://ugene.unipro.ru.
SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE
AP Biology Date SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE LEARNING OBJECTIVES Students will gain an appreciation of the physical effects of sickle cell anemia, its prevalence in the population,
Databases and platforms for data analysis from NGS of MTB
Databases and platforms for data analysis from NGS of MTB Derrick Crook MMM Consortium MMM Consortium Linking Clinical record systems and NHS databases Translating next generation sequencing for patient
Unraveling protein networks with Power Graph Analysis
Unraveling protein networks with Power Graph Analysis PLoS Computational Biology, 2008 Loic Royer Matthias Reimann Bill Andreopoulos Michael Schroeder Schroeder Group Bioinformatics 1 Complex Networks
Factors for success in big data science
Factors for success in big data science Damjan Vukcevic Data Science Murdoch Childrens Research Institute 16 October 2014 Big Data Reading Group (Department of Mathematics & Statistics, University of Melbourne)
