UNICORE as a Tool for Processing the Data from GS FLX Instrument




R. Kluszczyński (1), K. Skonieczna (3,4), T. Grzybowski (3), Piotr Bała (1,2)

(1) ICM, University of Warsaw
(2) Faculty of Mathematics and Computer Science, UMK, Toruń
(3) Collegium Medicum, UMK, Bydgoszcz
(4) Postgraduate School, Medical University of Warsaw

MOTIVATION
- Processing time
- Storage
- Technical support
- Automation
- Flexibility
- Security

PL-GRID
The goal of the PL-Grid project (Polish Infrastructure for Supporting Computational Science in the European Research Space) is to provide the Polish scientific community with an IT platform based on Grid computer clusters, enabling e-science research in various fields. PL-Grid aims to significantly extend the computing resources available to the Polish scientific community (by approximately 215 TFlops of computing power and 2500 TB of storage capacity) and to construct a Grid system that facilitates effective and innovative use of the available resources.
www.plgrid.pl


UNICORE
UNICORE (Uniform Interface to Computing Resources) is middleware that provides seamless and secure access to Grid resources. UNICORE is part of the Unified Middleware Distribution developed by the EMI project.
www.unicore.eu www.eu-emi.eu
Clients:
- UNICORE Rich Client (URC)
- UNICORE Commandline Client (UCC)
- High-Level API (HiLA)

UNICORE www.unicore.eu

UNICORE WORKFLOW www.unicore.eu

EXPERIMENT
- Determination of 18 complete mitochondrial genome sequences from tumor and matched non-tumor tissues obtained from 9 patients diagnosed with colorectal cancer
- Comparison of the mtDNA sequences with the reference sequence
- Identification of mtDNA mutations
- Ultra-high-speed processing of mtDNA sequence data
- High-throughput GS FLX Instrument (Roche Diagnostics): up to 1 million reads, each approximately 500 bp long, in a single experiment

WORKFLOW
- GSRunProcessor: data from GS FLX Instrument (Roche Diagnostics), SFF and CWF files
- GSReferenceMapper: SFF files
- GSReporter: CWF files
- GSAssembler: SFF files, FASTA file
- BLAST: FASTA file
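The tool and file listing on the WORKFLOW slide can be read as a dependency graph. The sketch below is only an illustration of that reading, not the authors' workflow definition: the slide does not state which files are inputs and which are outputs, so the `needs` / `makes` assignments (and the placeholder name `raw` for the instrument data) are assumptions. It derives one valid execution order with Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# ASSUMPTION: the direction of each file dependency is inferred from the slide,
# which only pairs tool names with file types. "raw" is a hypothetical label
# for the data coming from the GS FLX Instrument.
TOOLS = {
    "GSRunProcessor":    {"needs": {"raw"},   "makes": {"SFF", "CWF"}},
    "GSReferenceMapper": {"needs": {"SFF"},   "makes": set()},
    "GSReporter":        {"needs": {"CWF"},   "makes": set()},
    "GSAssembler":       {"needs": {"SFF"},   "makes": {"FASTA"}},
    "BLAST":             {"needs": {"FASTA"}, "makes": set()},
}


def build_dependencies(tools):
    """Map each tool to the set of tools that produce the files it needs."""
    producers = {
        file_type: name
        for name, spec in tools.items()
        for file_type in spec["makes"]
    }
    return {
        name: {producers[f] for f in spec["needs"] if f in producers}
        for name, spec in tools.items()
    }


def execution_order(tools):
    """Return one ordering in which every tool runs after its producers."""
    return list(TopologicalSorter(build_dependencies(tools)).static_order())


if __name__ == "__main__":
    for step, tool in enumerate(execution_order(TOOLS), start=1):
        print(step, tool)
```

Under these assumptions GSRunProcessor always runs first and BLAST always runs after GSAssembler; GSReferenceMapper, GSReporter and GSAssembler have no mutual dependencies, so a workflow engine could run them in parallel.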

DATA PROCESSING
- High-throughput GS FLX Instrument (Roche Diagnostics)
- UNICORE Commandline Client (UFTP)
- Target System Storage (PL-Grid)
- UNICORE Rich Client
- Batch System (PL-Grid): GS Run Processor, GS Reporter, GS Reference Mapper, GS Assembler, BLAST

STORAGE

UNICORE RICH CLIENT
Gridbeans are plug-ins that make it possible to run an application on the grid. They generate the job description and provide the user with a graphical interface for entering input data and viewing results.

WORKFLOW EDITOR
Gridbeans can be used to build simple jobs, or they can serve as building blocks for workflows consisting of various tasks and operations.

DETAILS
- Data: 17 Gb
- Images: 834 files
- File size: 33 Mb
- Transfer: 3 s / file

GSRunAnalysisPipe:
- Interlagos: AMD Opteron(TM) Processor 6272 @ 2.10GHz
- AMD: AMD Opteron(tm) Processor 6174 @ 2.20GHz
- Intel: Intel(R) Xeon(R) CPU X5660 @ 2.80GHz (InfiniBand)
- 1 cpu: 70.0 h
- 8x8 cpu (Intel, MPI): 2.5 h
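The figures on the DETAILS slide allow a quick back-of-the-envelope check of transfer time and parallel speedup. The sketch below is a worked calculation only. It assumes that "8x8 cpu" means 8 nodes with 8 cores each (64 cores in total) and that the image files are transferred one after another; neither assumption is stated on the slide.

```python
# Figures quoted on the DETAILS slide.
IMAGE_FILES = 834
SECONDS_PER_FILE = 3
SERIAL_HOURS = 70.0      # "1 cpu: 70.0h"
PARALLEL_HOURS = 2.5     # "8x8 cpu (Intel, MPI): 2.5h"

# ASSUMPTION: "8x8 cpu" is read as 8 nodes x 8 cores = 64 cores.
CORES = 8 * 8


def transfer_minutes(files, seconds_per_file):
    """Total transfer time in minutes, assuming strictly sequential transfers."""
    return files * seconds_per_file / 60


def speedup(serial_hours, parallel_hours):
    """Ratio of single-CPU run time to parallel run time."""
    return serial_hours / parallel_hours


def efficiency(serial_hours, parallel_hours, cores):
    """Speedup divided by the number of cores used."""
    return speedup(serial_hours, parallel_hours) / cores


if __name__ == "__main__":
    print(f"transfer:   {transfer_minutes(IMAGE_FILES, SECONDS_PER_FILE):.1f} min")
    print(f"speedup:    {speedup(SERIAL_HOURS, PARALLEL_HOURS):.1f}x")
    print(f"efficiency: {efficiency(SERIAL_HOURS, PARALLEL_HOURS, CORES):.1%}")
```

With these inputs the transfer takes about 42 minutes, and the MPI run is 28 times faster than the single-CPU run, which corresponds to roughly 44% parallel efficiency if 64 cores were used.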

SHORT DEMONSTRATION (1) SHORT DEMONSTRATION (2)

REFERENCES
- www.unicore.eu
- www.plgrid.pl
- www.eu-emi.eu
- www.roche.com
- Building a National Distributed e-Infrastructure - PL-Grid. Lecture Notes in Computer Science, Vol. 7136 (subseries: Information Systems and Applications, incl. Internet/Web, and HCI).