UNICORE as a Tool for Processing the Data from GS FLX Instrument 1,2 R. Kluszczyński 1 K. Skonieczna 3,4 T. Grzybowski 3 Piotr Bała 1,2 1 ICM University of Warsaw 2 Faculty of Mathematics and Computer Science, UMK, Toruń 3 Collegium Medicum, UMK, Bydgoszcz 4 Postgraduate School, Medical University of Warsaw
MOTIVATION Processing time; storage; technical support; automation; flexibility; security
PL-GRID The goal of the PL-Grid project (Polish Infrastructure for Supporting Computational Science in the European Research Space) is to provide the Polish scientific community with an IT platform based on Grid computer clusters, enabling e-science research in various fields. PL-Grid aims at significantly extending the amount of computing resources provided to the Polish scientific community (by approximately 215 TFlops of computing power and 2500 TB of storage capacity) and constructing a Grid system that will facilitate effective and innovative use of the available resources. www.plgrid.pl
UNICORE UNICORE (Uniform Interface to Computing Resources) is middleware that provides seamless and secure access to Grid resources. UNICORE is part of the Unified Middleware Distribution developed by the EMI project. www.unicore.eu www.eu-emi.eu UNICORE Rich Client (URC); UNICORE Commandline Client (UCC); High-Level API (HiLA)
UNICORE www.unicore.eu
UNICORE WORKFLOW www.unicore.eu
EXPERIMENT Determination of 18 complete mitochondrial genome sequences from tumor and matched non-tumor tissues obtained from 9 patients diagnosed with colorectal cancer. Comparison of the mtDNA sequences with the reference sequence. Identification of mtDNA mutations. Ultra-high-speed processing of mtDNA sequence data. High-throughput GS FLX Instrument (Roche Diagnostics): up to 1 million reads of approximately 500 bp each in a single experiment.
WORKFLOW GSRunProcessor: data from GS FLX Instrument (Roche Diagnostics), SFF and CWF files; GSReferenceMapper: SFF files; GSReporter: CWF files; GSAssembler: SFF files, FASTA file; BLAST: FASTA file
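The ordering implied by the workflow slide can be sketched as a small dependency graph. This is only an illustration: it assumes that GSRunProcessor turns the raw instrument data into SFF and CWF files, that GSAssembler turns SFF files into a FASTA file, and that the other steps only consume the listed files. The `STEPS` table and the helper functions are hypothetical names, not part of UNICORE or the Roche software.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical description of each step: (files consumed, files produced).
# The input/output split is an assumption read off the slide.
STEPS = {
    "GSRunProcessor": ({"raw"}, {"SFF", "CWF"}),
    "GSReferenceMapper": ({"SFF"}, set()),
    "GSReporter": ({"CWF"}, set()),
    "GSAssembler": ({"SFF"}, {"FASTA"}),
    "BLAST": ({"FASTA"}, set()),
}


def build_dependencies(steps):
    """Map each step to the set of steps that produce its input files."""
    producers = {}
    for name, (_, outputs) in steps.items():
        for file_type in outputs:
            producers[file_type] = name
    return {
        name: {producers[f] for f in inputs if f in producers}
        for name, (inputs, _) in steps.items()
    }


def execution_order(steps):
    """Return one valid order in which the steps could be run."""
    return list(TopologicalSorter(build_dependencies(steps)).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(STEPS), start=1):
        print(position, step)
```

Under these assumptions GSRunProcessor always runs first and BLAST can only start after GSAssembler; GSReferenceMapper, GSReporter and GSAssembler do not depend on one another and could run concurrently.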
DATA PROCESSING High-throughput GS FLX Instrument (Roche Diagnostics); UNICORE Commandline Client (UFTP); Target System Storage (PL-Grid); UNICORE Rich Client; Batch System (PL-Grid): GS Run Processor, GS Reporter, GS Reference Mapper, GS Assembler, BLAST
STORAGE
UNICORE RICH CLIENT Gridbeans are plug-ins that make it possible to run an application on the grid. They generate the job description and provide the user with a graphical interface for entering input data and viewing results.
WORKFLOW EDITOR Gridbeans can be used to build simple jobs or can be treated as building blocks for workflows consisting of various tasks and operations.
DETAILS Data: 17 GB; Images: 834 files; File size: 33 MB; Transfer: 3 s/file. GSRunAnalysisPipe: Interlagos: AMD Opteron(TM) Processor 6272 @ 2.10 GHz; AMD: AMD Opteron(tm) Processor 6174 @ 2.20 GHz; Intel: Intel(R) Xeon(R) CPU X5660 @ 2.80 GHz (InfiniBand). 1 cpu: 70.0 h; 8x8 cpu (Intel, MPI): 2.5 h
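The timing figures above can be turned into a speedup and a rough transfer estimate. The sketch below is illustrative arithmetic only. It assumes that "8x8 cpu" means 64 cores in total, that the single-cpu and MPI runs are comparable, and that the image files are transferred one after another at 3 s each; the variable names are hypothetical.

```python
# Figures taken from the slide.
serial_hours = 70.0      # GSRunAnalysisPipe on 1 cpu
parallel_hours = 2.5     # GSRunAnalysisPipe on 8x8 cpu (Intel, MPI)
cores = 8 * 8            # assumption: 8x8 means 64 cores in total
image_files = 834
seconds_per_file = 3.0   # transfer time per image file

# Speedup is serial time over parallel time; efficiency is speedup per core.
speedup = serial_hours / parallel_hours
efficiency = speedup / cores

# Assumption: files are transferred sequentially, one at a time.
transfer_minutes = image_files * seconds_per_file / 60.0

print(f"speedup: {speedup:.1f}x")
print(f"parallel efficiency: {efficiency:.1%}")
print(f"image transfer: {transfer_minutes:.1f} min")
```

With these assumptions the MPI run is 28 times faster than the single-cpu run, a parallel efficiency of about 44%, and transferring all images takes roughly 42 minutes.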
SHORT DEMONSTRATION (1) SHORT DEMONSTRATION (2)
REFERENCES www.unicore.eu; www.plgrid.pl; www.eu-emi.eu; www.roche.com; "Building a National Distributed e-Infrastructure - PL-Grid", Lecture Notes in Computer Science, Vol. 7136 (subseries: Information Systems and Applications, incl. Internet/Web, and HCI).