DNA Mapping/Alignment. Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky
Transcription
1 DNA Mapping/Alignment Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky
2 Overview: Summary; Research Paper 1; Research Paper 2; Research Paper 3; Current Progress; Software Designs to Come
3 Summary Next generation sequencing allows genetic information to be sequenced and analyzed through rigorous computation. To obtain genetic data, a biological sample must first be prepared for sequencing. Once the sample has gone through chemical preparation, it can be run through one of the various NGS technologies to be sequenced.
4 Summary The data generated by the sequencer takes the form of reads. Reads are strings of nucleotides that are partial copies of the genetic material of interest. Reads can range from tens to thousands of nucleotides long. After a few quality control measures, the reads are ready to be analyzed. To be analyzed, the reads must be mapped and aligned to a reference of interest so that results such as differential expression can be computed.
5 Summary A common algorithm that has spawned many derivatives within the mapping/alignment program community is the Seed and Extend method.
6 Summary The Seed and Extend (reference-hashed) method breaks the problem down into these steps: (1) Index the reference sequence via a hash. (2) Break reads into seeds, or smaller portions. (3) For each seed from a read: find the most unique seed within the reference, then extend the seed outward to check for a more confident match. (4) Record the read's location with respect to the reference sequence.
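The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the team's implementation: the function names `build_index` and `map_read`, the non-overlapping seeds, and the `max_mismatches` threshold are all assumptions, and the extension step only counts substitutions (no insertions or deletions).

```python
from collections import defaultdict

def build_index(reference, k):
    """Hash every k-mer of the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index

def map_read(read, reference, index, k, max_mismatches=2):
    """Return the best (reference_start, mismatches) for a read, or None if nothing fits."""
    # Break the read into non-overlapping seeds, remembering each seed's offset in the read.
    seeds = [(off, read[off:off + k]) for off in range(0, len(read) - k + 1, k)]
    # Try the rarest seeds first: they have the fewest candidate positions to extend.
    seeds.sort(key=lambda s: len(index.get(s[1], ())))
    best = None
    for offset, seed in seeds:
        for hit in index.get(seed, ()):
            start = hit - offset  # where the whole read would begin on the reference
            if start < 0 or start + len(read) > len(reference):
                continue
            # Extend: compare the full read against the reference window (substitutions only).
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
                best = (start, mismatches)
    return best
```

Because the read is cut into several seeds, a single mutation spoils at most one seed; the remaining seeds still locate the read and the extension step reports the mismatch.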
9 A hybrid short read mapping accelerator Authors: Yupeng Chen, Bertil Schmidt & Douglas L. Maskell Publication: BMC Bioinformatics Date: February 2013 Doi: / Abstract: A hybrid of parallel software and special hardware, specifically a field programmable gate array, is used to provide faster processing of mapping-based sequence assembly while maintaining accuracy.
10 A hybrid short read mapping accelerator Problem caused by the ever-growing volume of short read sequence data: Fast methods do not accommodate much error. Approaches that do handle error well tend to be impractically slow. Goal of this technique: to improve both the speed and the accuracy of short read alignment (SRA). Most previous answers focus on software only.
11 A hybrid short read mapping accelerator This approach: Indexes the genomic template once. Done once and saved for future use. Index is generated as a separate process before program execution. Uses a fixed seed length. Required in order to always use the same genomic template index for a given species.
12 A hybrid short read mapping accelerator This approach uses a hybrid of both software and a type of hardware known as field programmable gate arrays, or FPGAs. FPGAs: Great potential for massively parallel computations. Require additional design work for implementation. Few attempts have been made to utilize an FPGA based approach.
13 A hybrid short read mapping accelerator Hardware: FPGA (one Virtex5 FPGA chip). Used for the generation of seeds and for the sequence alignment processes; both tasks demand large amounts of computational resources. Data in memory is divided between the host PC and the FPGA.
14 A hybrid short read mapping accelerator Software: Uses the seed-and-extend method commonly used in short read aligners. The 2 stages of the algorithm are each run in parallel: seed generation is done in parallel, and seed extension is done in parallel.
15 A hybrid short read mapping accelerator Seed extension process: Longest running time of any part of the algorithm. Well suited for FPGA parallelization. Primarily composed of repeated random access of a sizable lookup table.
16 A hybrid short read mapping accelerator Division of tasks: CPU (less demanding tasks): Converting reads into a binary representation (2 bits per nucleotide). Sending the encoded reads to the short read alignment process on the FPGA. Sending commands to the process on the FPGA. Accepting the results and writing them to disk.
17 A hybrid short read mapping accelerator Division of tasks: FPGA (Highly computationally intensive and parallelizable tasks): Generation of seeds. Extension of seed matches.
18 A hybrid short read mapping accelerator Results: Seed extension (previously the most time-consuming step) was made faster, to the point where it was no longer the bottleneck of the SRA process. Seed generation is now the bottleneck. The authors cite future plans to parallelize the initial step of encoding the reads for further speed-up.
19 A hybrid short read mapping accelerator What can we use from this? We are unlikely to be able to use FPGA hardware. However, we can use some of the software concepts used in this approach in our solution.
20 A hybrid short read mapping accelerator What can we use from this? Software concepts: More than one portion of the SRA process can be made parallel: Initial encoding of reads. Generation of seeds. Extension of initial seed matches.
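As a rough software-only illustration of the first point, seed generation parallelizes naturally because each read is independent of the others. The sketch below is an assumption-laden example, not the paper's design: `seeds_of` and `generate_seeds_parallel` are hypothetical names, and a thread pool is used only to keep the example self-contained. In CPython, CPU-bound work like this would normally need a process pool (or native code) to see a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

def seeds_of(read, k):
    """Split one read into non-overlapping seeds of length k (a trailing partial seed is dropped)."""
    return [read[i:i + k] for i in range(0, len(read) - k + 1, k)]

def generate_seeds_parallel(reads, k, workers=4):
    """Generate seeds for every read; each read is an independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so result i belongs to read i.
        return list(pool.map(lambda read: seeds_of(read, k), reads))
```

The same pattern applies to read encoding and seed extension, since neither stage needs information from any other read.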
21 A hybrid short read mapping accelerator What can we use from this? Software concepts: To further reduce execution time: The genomic template can be indexed prior to program execution. This index need only be generated once for a given species, and can be re-used many times.
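One simple way to get this "index once, reuse many times" behavior is to serialize the index to disk and load it on later runs. The sketch below assumes a plain Python dictionary index and uses the standard `pickle` module; the function names and the file path argument are illustrative assumptions, not something the paper or the slides specify.

```python
import os
import pickle

def build_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it occurs."""
    index = {}
    for pos in range(len(reference) - k + 1):
        index.setdefault(reference[pos:pos + k], []).append(pos)
    return index

def load_or_build_index(reference, k, path):
    """Load a previously saved index from path, or build it once and save it."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    index = build_index(reference, k)
    with open(path, "wb") as fh:
        pickle.dump(index, fh)
    return index
```

Note that the saved file is only valid for the seed length it was built with, which matches the slide's point that a fixed seed length is required in order to reuse the same index.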
22 Efficient storage of high throughput DNA sequencing data using reference-based compression Authors: Markus Hsi-Yang Fritz, Rasko Leinonen, Guy Cochrane, and Ewan Birney Publication: Genome Research Date: January 2011 Doi: /gr Abstract: Data storage costs have become an appreciable proportion of the total cost of creating and analyzing DNA sequence data, hence the need for efficient storage of high throughput DNA sequencing data using reference-based compression.
23 Efficient storage of high throughput DNA sequencing data using reference-based compression Problem: There are many challenges in handling the next generation of sequence data, from the highly fragmented nature of the shorter reads generated by the new technologies to the storage, analysis, and computational requirements of such large data volumes. The main concern is that the rate of increase in DNA sequencing is significantly outstripping the rate of increase in disk storage capacity.
24 Efficient storage of high throughput DNA sequencing data using reference-based compression Addressing the issue: New sequences are aligned to a reference genome, and the differences between each new sequence and the reference genome are encoded. Only these differences are stored, which requires relatively less storage.
25 Efficient storage of high throughput DNA sequencing data using reference-based compression The efficiency of the compression method increases exponentially with read length, i.e., the longer the read, the better the compression. The magnitude of this efficiency gain can be controlled by changing the amount of quality information stored.
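The core idea can be illustrated with a toy encoder that stores, for an already aligned read, only its start position, its length, and the bases that differ from the reference. This is a simplified sketch under stated assumptions: it handles substitutions only, ignores quality scores and unaligned reads, and the record layout and function names are hypothetical rather than the format used in the paper.

```python
def encode_read(read, reference, start):
    """Represent an aligned read as (start, length, substitutions relative to the reference)."""
    diffs = [(i, base) for i, base in enumerate(read) if reference[start + i] != base]
    return (start, len(read), diffs)

def decode_read(record, reference):
    """Rebuild the original read from the reference and the stored differences."""
    start, length, diffs = record
    bases = list(reference[start:start + length])
    for i, base in diffs:
        bases[i] = base
    return "".join(bases)
```

A read that matches the reference exactly costs only a position and a length, whatever its size, which gives some intuition for why longer reads compress better under this scheme.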
26 Efficient storage of high throughput DNA sequencing data using reference-based compression Prior to 2005 the rate of increase in sequencing capacity was close to the rate of increase in disk storage capacity on a per unit cost basis.
27 Efficient storage of high throughput DNA sequencing data using reference-based compression Given the potential memory demands of this project, this new concept of structuring our data may help us in the future. If we foresee this memory bottleneck within our program, we will incorporate this approach to read/reference storage and analysis.
28 Sense from sequence reads: methods for alignment and assembly Authors: Paul Flicek, Ewan Birney Publication: Nature Methods, Volume 6, No. 11s doi: /nmeth.1376 Date: November 2009 Abstract: Discussion of the current algorithms behind mapping/alignment and assembly programs and the future directions of these algorithms.
29 Sense from sequence reads: methods for alignment and assembly General overview of the importance of mapping/alignment and assembly within the scientific community. The alignment/mapping portion is split into two major algorithmic types: Seed and Extend (hash-based) and Burrows-Wheeler Transform (BWT). Explains the basic structures of the above algorithms. Our main interest is the Seed and Extend based methods.
30 Sense from sequence reads: methods for alignment and assembly Two types of hash indexes: reference-based and read-based. Reference-based hashes read the reference into a hash in sections and match reads to the hashed index. Pros: Fast look-up. Cons: High memory footprint. Read-based hashes read the reads into a hash as seeds, and the reference is used to search the hash. Pros: Small memory requirement. Cons: Increased processing time to scan the reference.
31 Sense from sequence reads: methods for alignment and assembly We will be implementing the reference-hashing algorithm for our project. Given that most of our mapping will be exact mapping with the possibility of mutations, and that our reference and reads will not reach the size of terabytes, hashing the reference is much easier to conceptualize. Intense memory usage will not be that large of an issue.
32 Current Progress Currently, we have begun generating test sets and are still within the design phase of our algorithm. For our first test set, we've taken the Escherichia coli isolate BL26A plasmid plmo226, which is ~2000 base pairs long, and have split it into reads of 100 base pairs for simple testing. We have also generated another test set using the same sample but with 10x coverage.
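For contrast with the reference-hashed approach, the read-based variant can be sketched as follows: the reads go into the hash, and the reference is scanned once against it. This is a minimal sketch assuming exact matching only and a single seed taken from the start of each read; the names `index_reads` and `scan_reference` are illustrative, not taken from the paper.

```python
from collections import defaultdict

def index_reads(reads, k):
    """Hash the first k bases of each read to the ids of the reads that start with that seed."""
    table = defaultdict(list)
    for read_id, read in enumerate(reads):
        if len(read) >= k:
            table[read[:k]].append(read_id)
    return table

def scan_reference(reference, reads, k):
    """Slide over the reference once and report exact placements as (read_id, position)."""
    table = index_reads(reads, k)
    placements = []
    for pos in range(len(reference) - k + 1):
        for read_id in table.get(reference[pos:pos + k], ()):
            read = reads[read_id]
            # Verify the whole read, not just its seed, against the reference.
            if reference[pos:pos + len(read)] == read:
                placements.append((read_id, pos))
    return placements
```

The table grows with the number of reads rather than with the reference, but every position of the reference has to be visited, which is the processing-time cost the slide refers to.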
34 Current Progress We've created a simple algorithm that reads a sequence in and splits it into reads of a specified length. We also amplify the number of reads by another specified amount, generating that many duplicates of each read to simulate coverage.
35 Software Designs to Come Designs we plan to implement include: Nucleotide representation conversion from strings to binary: A = 00, C = 01, G = 10, T = 11. Seed/hash key representation in the form of bit sets using long variables. For example: ATCG = 00110110, or 54. Efficient hash table storage for locations with the same nucleotide string. Reference-hashing Seed and Extend algorithm. And if possible: seed masks for handling mutations/insertions/deletions.
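A test-read generator along these lines might look like the sketch below. The function name `make_reads`, its parameter names, and the choice to keep a shorter final read when the sequence length is not a multiple of the read length are assumptions; the slides do not say how the team handles that case.

```python
def make_reads(sequence, read_length, copies=1):
    """Split a sequence into consecutive reads and duplicate each one to simulate coverage."""
    reads = []
    for start in range(0, len(sequence), read_length):
        read = sequence[start:start + read_length]  # the last read may be shorter
        reads.extend([read] * copies)
    return reads
```

For a ~2000 base pair plasmid, `make_reads(plasmid, 100)` would give about 20 reads, and `copies=10` would give the 10x set.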
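The 2-bit encoding on this slide can be sketched as below, using the slide's mapping (A = 00, C = 01, G = 10, T = 11). The function names are illustrative assumptions. Python integers are unbounded, so the 64-bit limit of a `long` in other languages (32 nucleotides per key) is not enforced here.

```python
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # position in this string equals the 2-bit code

def encode(seq):
    """Pack a nucleotide string into an integer, 2 bits per base, first base in the high bits."""
    key = 0
    for base in seq:
        key = (key << 2) | CODES[base]
    return key

def decode(key, length):
    """Unpack an integer key back into a nucleotide string of the given length."""
    bases = []
    for _ in range(length):
        bases.append(BASES[key & 0b11])
        key >>= 2
    return "".join(reversed(bases))
```

With this mapping, `encode("ATCG")` gives `0b00110110`, i.e. 54. The length must be stored or fixed separately, because leading A bases encode as zero bits and would otherwise be lost.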
36 Questions?
More informationIn Memory Accelerator for MongoDB
In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000
More informationThe Curious Case of Database Deduplication. PRESENTATION TITLE GOES HERE Gurmeet Goindi Oracle
The Curious Case of Database Deduplication PRESENTATION TITLE GOES HERE Gurmeet Goindi Oracle Agenda Introduction Deduplication Databases and Deduplication All Flash Arrays and Deduplication 2 Quick Show
More informationParallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage
Parallel Computing Benson Muite benson.muite@ut.ee http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework
More informationCommunicating with devices
Introduction to I/O Where does the data for our CPU and memory come from or go to? Computers communicate with the outside world via I/O devices. Input devices supply computers with data to operate on.
More informationNon-Data Aided Carrier Offset Compensation for SDR Implementation
Non-Data Aided Carrier Offset Compensation for SDR Implementation Anders Riis Jensen 1, Niels Terp Kjeldgaard Jørgensen 1 Kim Laugesen 1, Yannick Le Moullec 1,2 1 Department of Electronic Systems, 2 Center
More informationSSD Performance Tips: Avoid The Write Cliff
ebook 100% KBs/sec 12% GBs Written SSD Performance Tips: Avoid The Write Cliff An Inexpensive and Highly Effective Method to Keep SSD Performance at 100% Through Content Locality Caching Share this ebook
More informationKey Components of WAN Optimization Controller Functionality
Key Components of WAN Optimization Controller Functionality Introduction and Goals One of the key challenges facing IT organizations relative to application and service delivery is ensuring that the applications
More informationChapter 18: Database System Architectures. Centralized Systems
Chapter 18: Database System Architectures! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems! Network Types 18.1 Centralized Systems! Run on a single computer system and
More informationSTORAGE SOURCE DATA DEDUPLICATION PRODUCTS. Buying Guide: inside
Managing the information that drives the enterprise STORAGE Buying Guide: inside 2 Key features of source data deduplication products 5 Special considerations Source dedupe products can efficiently protect
More informationCRAC: An integrated approach to analyse RNA-seq reads Additional File 3 Results on simulated RNA-seq data.
: An integrated approach to analyse RNA-seq reads Additional File 3 Results on simulated RNA-seq data. Nicolas Philippe and Mikael Salson and Thérèse Commes and Eric Rivals February 13, 2013 1 Results
More informationIn-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps. Yu Su, Yi Wang, Gagan Agrawal The Ohio State University
In-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps Yu Su, Yi Wang, Gagan Agrawal The Ohio State University Motivation HPC Trends Huge performance gap CPU: extremely fast for generating
More informationSTORAGE. Buying Guide: TARGET DATA DEDUPLICATION BACKUP SYSTEMS. inside
Managing the information that drives the enterprise STORAGE Buying Guide: DEDUPLICATION inside What you need to know about target data deduplication Special factors to consider One key difference among
More informationNext Generation Sequencing
Next Generation Sequencing Technology and applications 10/1/2015 Jeroen Van Houdt - Genomics Core - KU Leuven - UZ Leuven 1 Landmarks in DNA sequencing 1953 Discovery of DNA double helix structure 1977
More informationMolecular typing of VTEC: from PFGE to NGS-based phylogeny
Molecular typing of VTEC: from PFGE to NGS-based phylogeny Valeria Michelacci 10th Annual Workshop of the National Reference Laboratories for E. coli in the EU Rome, November 5 th 2015 Molecular typing
More informationMoving Virtual Storage to the Cloud. Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage
Moving Virtual Storage to the Cloud Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage Table of Contents Overview... 1 Understanding the Storage Problem... 1 What Makes
More informationCOS 318: Operating Systems. Virtual Memory and Address Translation
COS 318: Operating Systems Virtual Memory and Address Translation Today s Topics Midterm Results Virtual Memory Virtualization Protection Address Translation Base and bound Segmentation Paging Translation
More informationA greedy algorithm for the DNA sequencing by hybridization with positive and negative errors and information about repetitions
BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES, Vol. 59, No. 1, 2011 DOI: 10.2478/v10175-011-0015-0 Varia A greedy algorithm for the DNA sequencing by hybridization with positive and negative
More informationFUSION iocontrol HYBRID STORAGE ARCHITECTURE 1 WWW.FUSIONIO.COM
1 WWW.FUSIONIO.COM FUSION iocontrol HYBRID STORAGE ARCHITECTURE Contents Contents... 2 1 The Storage I/O and Management Gap... 3 2 Closing the Gap with Fusion-io... 4 2.1 Flash storage, the Right Way...
More information