Writing & Running Pipelines on the Open Grid Engine using QMake. Wibowo Arindrarto DTLS Focus Meeting

Size: px
Start display at page:

Download "Writing & Running Pipelines on the Open Grid Engine using QMake. Wibowo Arindrarto DTLS Focus Meeting 15.04.2014"

Transcription

1 Writing & Running Pipelines on the Open Grid Engine using QMake Wibowo Arindrarto DTLS Focus Meeting

2 Makefile (re)introduction Atomic recipes / rules that define full pipelines Initially written for compiling source code files into executables Repurposed for data processing pipelines

3 Makefile Recipes Atomic recipes / rules that define full pipelines ## This is my Makefile all: sample.bam %.sam: %.fastq hg19.ref bowtie --sam $(word $^,2) $< > $@ %.bam: %.sam hg19.fa samtools view -bt $word $^,2) -o $@ $< Workflow are determined implicitly, depending on the main target Some useful aliases for common patterns, e.g.: %.sam: %.fastq '%' denote common part $^ list of all dependencies $< first dependency

4 Why Makefiles? Many alternatives already exist Ruffus, Snakemake, bpipe, etc. None parallelizes as easy as (q)make on our cluster Coming up with a defined approach for Makefile pipelines & writing helper scripts is the way to go for now This is not a solved problem!

5 Makefiles are neat Parallelization ~ for cores (make) and nodes (qmake) Resume runs from failure points Easy to define dependencies among steps Close to the shell environment Already used in some of our earlier internal pipelines, e.g. GAPSS3 Big upgrade from shell / python / perl scripts!

6 GAPSS3 Makefile-based pipeline for exome and genome alignment Designed to be run on multiple core machines (or a cluster) Ran as regular Makefile $ make -f Exome.mk $ qmake -cwd -inherit -- -j 5 -f Exome.mk Worked as intended, but highlights areas where we can improve...

7 Problems Encountered Bioinformatics pipelines (vs software build systems): More moving parts (aligners, variant callers) should be easier to swap parts in and out More experimental in nature should be easy to play with program option flags More investigative in nature should be easy to generate reports for diagnosis

8 Rig: Framework on Top of Make Core idea Modules: a single unit that perform useful function Each module are standalone but can also be combined to create another module Implementation Each module: recipe file + config file Two types of modules: tool wrappers and pipelines

9 Rig: Framework on Top of Make Logged by default Variables defined inside the config stdout, stderr, and job details (qmake only) Dynamic options Command-line flag change on the fly $ qmake -- -j 5 -f pipeline.mk OPT_BOWTIE_m=1

10 Module Structure sample.mk: recipes %.sam: %.1.fastq %.2.fastq $(BOWTIE) $(IDX) $^ > %.bam: %.sam $(SAMTOOLS) view -bt $(REF) -o $< sample.mkc: config INPUTS := mine.1.fastq mine.2.fastq IDX := /usr/local/indices/hg19 REF := /usr/local/genomes/hg19.fa BOWTIE := /usr/bin/bowtie SAMTOOLS := /usr/bin/samtools Cleaner separation of module logic & module components Easier setup of required variables (e.g. reporting variables)

11 Module Types Pipeline Multiple recipes (similar to GAPSS) Tool wrapper Single recipe, 'wraps' a single command line tool $ bowtie --m 1 /usr/local/indices/hg19 sample.fq > sample.sam $ qmake -- -j 5 -f pipeline.mk OPT_BOWTIE_m=1 %.sam: %.fq bowtie /usr/local/indices/hg19 $^ --m 1 $< > $@ # instead %.sam: %.fasta $(MAKE) $(MODULE_ALIGNER) $(OPT_ALIGNER)

12 Other Additions Python scripts to handle boilerplate code $ rig_gen.py tool bowtie2 # creates a template tool wrapper named bowtie2 Python module for exploring job logs >>> from rig import RigRun >>> run = RigRun('my_pipeline', '/path/to/log/directory') >>> for module in run:... for job in module:... print job.id, job.start my_module datetime.datetime(2013, 6, 26, 0, 54, 16, ) Nameset files for defined input patterns

13 Tool wrappers: 41 In Progress bowtie, cufflinks, sickle, etc. Pipelines: 14 Customizable QC pipeline (FastQC, sickle, cutadapt) Gentrap v2.0 (using the QC pipeline, 2 new aligners) GATK best practices pipeline (one module per phase) Deepsage pipeline using genome & 'transcriptome' alignment and more.. Identical logging for make and qmake Tests..?

14 Compromises File sync problem: not cleanly handled by qmake We had to hack into it and use a custom shell wrapper to ensure dependencies are available before each job.

15 Acknowledgements Jeroen Laros Leon Mei Martijn Vermaat Martin van den Kerkhoff Michiel van Galen Peter van 't Hof Zuotian Tatum Sander van der Zeeuw Wai Yi Leung

16 Initial Development Model Single git repository: core library, pipelines, tool wrappers Dependency problem: one tool, multiple pipelines? Challenge: How to version repo within a repo? Choice between git submodules and git subtree

17 subtree vs submodule git subtree: Add sub repository as a folder under the main repository (as a remote) Can push to sub-repository selectively Can pull entire sub-repository history git submodule: Add sub repository under the main repository, but not under git (?) Requires additional.gitsubmodule file (which is versioned) Cloning is messy..

18 Workflow Create tool wrappers, push to remote repo Create pipeline, and then add modules: git remote add mod_tool... git subtree add -P {path} --squash mod_tool/master Work on pipeline, work on tool Pull from tool repo: git subtree merge -P {path} --mod_tool/master Push to tool repo: git subtree push -P {path mod_tool/master

19 git subtree considerations Advantages: Everything is a regular file Makes releases easy Clone as usual Downsides: History becomes messy

20 Demo time! bioassisst: simple pipeline that processes maps FASTQ files into BAM files

Introduction to NGS data analysis

Introduction to NGS data analysis Introduction to NGS data analysis Jeroen F. J. Laros Leiden Genome Technology Center Department of Human Genetics Center for Human and Clinical Genetics Sequencing Illumina platforms Characteristics: High

More information

About the Princess Margaret Computational Biology Resource Centre (PMCBRC) cluster

About the Princess Margaret Computational Biology Resource Centre (PMCBRC) cluster Cluster Info Sheet About the Princess Margaret Computational Biology Resource Centre (PMCBRC) cluster Welcome to the PMCBRC cluster! We are happy to provide and manage this compute cluster as a resource

More information

Annoyances with our current source control Can it get more comfortable? Git Appendix. Git vs Subversion. Andrey Kotlarski 13.XII.

Annoyances with our current source control Can it get more comfortable? Git Appendix. Git vs Subversion. Andrey Kotlarski 13.XII. Git vs Subversion Andrey Kotlarski 13.XII.2011 Outline Annoyances with our current source control Can it get more comfortable? Git Appendix Rant Network traffic Hopefully we have good repository backup

More information

MATLAB & Git Versioning: The Very Basics

MATLAB & Git Versioning: The Very Basics 1 MATLAB & Git Versioning: The Very Basics basic guide for using git (command line) in the development of MATLAB code (windows) The information for this small guide was taken from the following websites:

More information

Practical Solutions for Big Data Analytics

Practical Solutions for Big Data Analytics Practical Solutions for Big Data Analytics Ravi Madduri Computation Institute (madduri@anl.gov) Paul Dave (pdave@uchicago.edu) Dinanath Sulakhe (sulakhe@uchicago.edu) Alex Rodriguez (arodri7@uchicago.edu)

More information

Using GitHub for Rally Apps (Mac Version)

Using GitHub for Rally Apps (Mac Version) Using GitHub for Rally Apps (Mac Version) SOURCE DOCUMENT (must have a rallydev.com email address to access and edit) Introduction Rally has a working relationship with GitHub to enable customer collaboration

More information

Streamline your drupal development workflow in a 3-tier-environment - A story about drush make and drush aliases

Streamline your drupal development workflow in a 3-tier-environment - A story about drush make and drush aliases Streamline your drupal development workflow in a 3-tier-environment - thomas.bussmeyer@init.de Berlin, 18.09.2011 1. Who we are 2. Scenario 3. Solution 4. Notes Who we are Have a look at http://www.init.de

More information

Globus Genomics Tutorial GlobusWorld 2014

Globus Genomics Tutorial GlobusWorld 2014 Globus Genomics Tutorial GlobusWorld 2014 Agenda Overview of Globus Genomics Example Collaborations Demonstration Globus Genomics interface Globus Online integration Scenario 1: Using Globus Genomics for

More information

PROGRAMMING FOR BIOLOGISTS. BIOL 6297 Monday, Wednesday 10 am -12 pm

PROGRAMMING FOR BIOLOGISTS. BIOL 6297 Monday, Wednesday 10 am -12 pm PROGRAMMING FOR BIOLOGISTS BIOL 6297 Monday, Wednesday 10 am -12 pm Tomorrow is Ada Lovelace Day Ada Lovelace was the first person to write a computer program Today s Lecture Overview of the course Philosophy

More information

MATLAB @ Work. MATLAB Source Control Using Git

MATLAB @ Work. MATLAB Source Control Using Git MATLAB @ Work MATLAB Source Control Using Git Richard Johnson Using source control is a key practice for professional programmers. If you have ever broken a program with a lot of editing changes, you can

More information

E. coli plasmid and gene profiling using Next Generation Sequencing

E. coli plasmid and gene profiling using Next Generation Sequencing E. coli plasmid and gene profiling using Next Generation Sequencing Jeroen F. J. Laros Leiden Genome Technology Center Department of Human Genetics Center for Human and Clinical Genetics Introduction General

More information

MOOSE-Based Application Development on GitLab

MOOSE-Based Application Development on GitLab MOOSE-Based Application Development on GitLab MOOSE Team Idaho National Laboratory September 9, 2014 Introduction The intended audience for this talk is developers of INL-hosted, MOOSE-based applications.

More information

Writing standalone Qt & Python applications for Android

Writing standalone Qt & Python applications for Android Writing standalone Qt & Python applications for Android Martin Kolman Red Hat & Faculty of Informatics, Masaryk University http://www.modrana.org/om2013 martin.kolman@gmail.com 1 Overview Writing Android

More information

Lab Exercise Part II: Git: A distributed version control system

Lab Exercise Part II: Git: A distributed version control system Lunds tekniska högskola Datavetenskap, Nov 25, 2013 EDA260 Programvaruutveckling i grupp projekt Labb 2 (part II: Git): Labbhandledning Checked on Git versions: 1.8.1.2 Lab Exercise Part II: Git: A distributed

More information

CPSC 491. Today: Source code control. Source Code (Version) Control. Exercise: g., no git, subversion, cvs, etc.)

CPSC 491. Today: Source code control. Source Code (Version) Control. Exercise: g., no git, subversion, cvs, etc.) Today: Source code control CPSC 491 Source Code (Version) Control Exercise: 1. Pretend like you don t have a version control system (e. g., no git, subversion, cvs, etc.) 2. How would you manage your source

More information

An Introduction to Mercurial Version Control Software

An Introduction to Mercurial Version Control Software An Introduction to Mercurial Version Control Software CS595, IIT [Doc Updated by H. Zhang] Oct, 2010 Satish Balay balay@mcs.anl.gov Outline Why use version control? Simple example of revisioning Mercurial

More information

Data Analysis & Management of High-throughput Sequencing Data. Quoclinh Nguyen Research Informatics Genomics Core / Medical Research Institute

Data Analysis & Management of High-throughput Sequencing Data. Quoclinh Nguyen Research Informatics Genomics Core / Medical Research Institute Data Analysis & Management of High-throughput Sequencing Data Quoclinh Nguyen Research Informatics Genomics Core / Medical Research Institute Current Issues Current Issues The QSEQ file Number files per

More information

Hadoop-BAM and SeqPig

Hadoop-BAM and SeqPig Hadoop-BAM and SeqPig Keijo Heljanko 1, André Schumacher 1,2, Ridvan Döngelci 1, Luca Pireddu 3, Matti Niemenmaa 1, Aleksi Kallio 4, Eija Korpelainen 4, and Gianluigi Zanetti 3 1 Department of Computer

More information

Our Puppet Story. Martin Schütte. May 5 2014

Our Puppet Story. Martin Schütte. May 5 2014 Our Puppet Story Martin Schütte May 5 2014 About DECK36 Small team of 7 engineers Longstanding expertise in designing, implementing and operating complex web systems Developing own data intelligence-focused

More information

Dry Dock Documentation

Dry Dock Documentation Dry Dock Documentation Release 0.6.11 Taylor "Nekroze" Lawson December 19, 2014 Contents 1 Features 3 2 TODO 5 2.1 Contents:................................................. 5 2.2 Feedback.................................................

More information

Version Control with Git. Kate Hedstrom ARSC, UAF

Version Control with Git. Kate Hedstrom ARSC, UAF 1 Version Control with Git Kate Hedstrom ARSC, UAF Linus Torvalds 3 Version Control Software System for managing source files For groups of people working on the same code When you need to get back last

More information

Version Control Your Jenkins Jobs with Jenkins Job Builder

Version Control Your Jenkins Jobs with Jenkins Job Builder Version Control Your Jenkins Jobs with Jenkins Job Builder Abstract Wayne Warren wayne@puppetlabs.com Puppet Labs uses Jenkins to automate building and testing software. While we do derive benefit from

More information

Git - Working with Remote Repositories

Git - Working with Remote Repositories Git - Working with Remote Repositories Handout New Concepts Working with remote Git repositories including setting up remote repositories, cloning remote repositories, and keeping local repositories in-sync

More information

Continuous Integration and Delivery at NSIDC

Continuous Integration and Delivery at NSIDC National Snow and Ice Data Center Supporting Cryospheric Research Since 1976 Continuous Integration and Delivery at NSIDC Julia Collins National Snow and Ice Data Center Cooperative Institute for Research

More information

UMass High Performance Computing Center

UMass High Performance Computing Center .. UMass High Performance Computing Center University of Massachusetts Medical School October, 2014 2 / 32. Challenges of Genomic Data It is getting easier and cheaper to produce bigger genomic data every

More information

SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop

SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop André Schumacher, Luca Pireddu, Matti Niemenmaa, Aleksi Kallio, Eija Korpelainen, Gianluigi Zanetti and Keijo Heljanko Abstract

More information

Version Control with. Ben Morgan

Version Control with. Ben Morgan Version Control with Ben Morgan Developer Workflow Log what we did: Add foo support Edit Sources Add Files Compile and Test Logbook ======= 1. Initial version Logbook ======= 1. Initial version 2. Remove

More information

Removing Sequential Bottlenecks in Analysis of Next-Generation Sequencing Data

Removing Sequential Bottlenecks in Analysis of Next-Generation Sequencing Data Removing Sequential Bottlenecks in Analysis of Next-Generation Sequencing Data Yi Wang, Gagan Agrawal, Gulcin Ozer and Kun Huang The Ohio State University HiCOMB 2014 May 19 th, Phoenix, Arizona 1 Outline

More information

EMC DOCUMENTUM xplore 1.1 DISASTER RECOVERY USING EMC NETWORKER

EMC DOCUMENTUM xplore 1.1 DISASTER RECOVERY USING EMC NETWORKER White Paper EMC DOCUMENTUM xplore 1.1 DISASTER RECOVERY USING EMC NETWORKER Abstract The objective of this white paper is to describe the architecture of and procedure for configuring EMC Documentum xplore

More information

Large-scale Research Data Management and Analysis Using Globus Services. Ravi Madduri Argonne National Lab University of Chicago @madduri

Large-scale Research Data Management and Analysis Using Globus Services. Ravi Madduri Argonne National Lab University of Chicago @madduri Large-scale Research Data Management and Analysis Using Globus Services Ravi Madduri Argonne National Lab University of Chicago @madduri Outline Who we are Challenges in Big Data Management and Analysis

More information

Extending Remote Desktop for Large Installations. Distributed Package Installs

Extending Remote Desktop for Large Installations. Distributed Package Installs Extending Remote Desktop for Large Installations This article describes four ways Remote Desktop can be extended for large installations. The four ways are: Distributed Package Installs, List Sharing,

More information

Improving your Drupal Development workflow with Continuous Integration

Improving your Drupal Development workflow with Continuous Integration Improving your Drupal Development workflow with Continuous Integration Peter Drake Sahana Murthy DREAM IT. DRUPAL IT. 1 Meet Us!!!! Peter Drake Cloud Software Engineer @Acquia Drupal Developer & sometimes

More information

Version Control using Git and Github. Joseph Rivera

Version Control using Git and Github. Joseph Rivera Version Control using Git and Github Joseph Rivera 1 What is Version Control? Powerful development tool! Management of additions, deletions, and modifications to software/source code or more generally

More information

FEEG6002 - Applied Programming 3 - Version Control and Git II

FEEG6002 - Applied Programming 3 - Version Control and Git II FEEG6002 - Applied Programming 3 - Version Control and Git II Sam Sinayoko 2015-10-16 1 / 26 Outline Learning outcomes Working with a single repository (review) Working with multiple versions of a repository

More information

Magento Search Extension TECHNICAL DOCUMENTATION

Magento Search Extension TECHNICAL DOCUMENTATION CHAPTER 1... 3 1. INSTALLING PREREQUISITES AND THE MODULE (APACHE SOLR)... 3 1.1 Installation of the search server... 3 1.2 Configure the search server for usage with the search module... 7 Deploy the

More information

OpenMake Dynamic DevOps Suite 7.5 Road Map. Feature review for Mojo, Meister, CloudBuilder and Deploy+

OpenMake Dynamic DevOps Suite 7.5 Road Map. Feature review for Mojo, Meister, CloudBuilder and Deploy+ OpenMake Dynamic DevOps Suite 7.5 Road Map Feature review for Mojo, Meister, CloudBuilder and Deploy+ Release Date: August 2012 Dated: May 21, 2012 Table of Contents OpenMake Dynamic DevOps Suite 7.5 Road

More information

Git Fusion Guide 2015.3. August 2015 Update

Git Fusion Guide 2015.3. August 2015 Update Git Fusion Guide 2015.3 August 2015 Update Git Fusion Guide 2015.3 August 2015 Update Copyright 1999-2015 Perforce Software. All rights reserved. Perforce software and documentation is available from http://www.perforce.com/.

More information

s@lm@n Oracle Exam 1z0-102 Oracle Weblogic Server 11g: System Administration I Version: 9.0 [ Total Questions: 111 ]

s@lm@n Oracle Exam 1z0-102 Oracle Weblogic Server 11g: System Administration I Version: 9.0 [ Total Questions: 111 ] s@lm@n Oracle Exam 1z0-102 Oracle Weblogic Server 11g: System Administration I Version: 9.0 [ Total Questions: 111 ] Oracle 1z0-102 : Practice Test Question No : 1 Which two statements are true about java

More information

Integrated Rule-based Data Management System for Genome Sequencing Data

Integrated Rule-based Data Management System for Genome Sequencing Data Integrated Rule-based Data Management System for Genome Sequencing Data A Research Data Management (RDM) Green Shoots Pilots Project Report by Michael Mueller, Simon Burbidge, Steven Lawlor and Jorge Ferrer

More information

UGENE Quick Start Guide

UGENE Quick Start Guide Quick Start Guide This document contains a quick introduction to UGENE. For more detailed information, you can find the UGENE User Manual and other special manuals in project website: http://ugene.unipro.ru.

More information

C Programming Review & Productivity Tools

C Programming Review & Productivity Tools Review & Productivity Tools Giovanni Agosta Piattaforme Software per la Rete Modulo 2 Outline Preliminaries 1 Preliminaries 2 Function Pointers Variadic Functions 3 Build Automation Code Versioning 4 Preliminaries

More information

Version Control with Git. Linux Users Group UT Arlington. Rohit Rawat rohitrawat@gmail.com

Version Control with Git. Linux Users Group UT Arlington. Rohit Rawat rohitrawat@gmail.com Version Control with Git Linux Users Group UT Arlington Rohit Rawat rohitrawat@gmail.com Need for Version Control Better than manually storing backups of older versions Easier to keep everyone updated

More information

HDFS Cluster Installation Automation for TupleWare

HDFS Cluster Installation Automation for TupleWare HDFS Cluster Installation Automation for TupleWare Xinyi Lu Department of Computer Science Brown University Providence, RI 02912 xinyi_lu@brown.edu March 26, 2014 Abstract TupleWare[1] is a C++ Framework

More information

Exam Name: IBM InfoSphere MDM Server v9.0

Exam Name: IBM InfoSphere MDM Server v9.0 Vendor: IBM Exam Code: 000-420 Exam Name: IBM InfoSphere MDM Server v9.0 Version: DEMO 1. As part of a maintenance team for an InfoSphere MDM Server implementation, you are investigating the "EndDate must

More information

CSE-E5430 Scalable Cloud Computing. Lecture 4

CSE-E5430 Scalable Cloud Computing. Lecture 4 Lecture 4 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 5.10-2015 1/23 Hadoop - Linux of Big Data Hadoop = Open Source Distributed Operating System

More information

Introduction. Created by Richard Bell 10/29/2014

Introduction. Created by Richard Bell 10/29/2014 Introduction GNU Radio is open source software that provides built in modules for standard tasks of a wireless communications system. Within the GNU Radio framework is gnuradio-companion, which is a GUI

More information

Version Control with Git. Dylan Nugent

Version Control with Git. Dylan Nugent Version Control with Git Dylan Nugent Agenda What is Version Control? (and why use it?) What is Git? (And why Git?) How Git Works (in theory) Setting up Git (surviving the CLI) The basics of Git (Just

More information

Introduction to the Git Version Control System

Introduction to the Git Version Control System Introduction to the Sebastian Rockel rockel@informatik.uni-hamburg.de University of Hamburg Faculty of Mathematics, Informatics and Natural Sciences Department of Informatics Technical Aspects of Multimodal

More information

An Introduction to Mercurial Version Control Software

An Introduction to Mercurial Version Control Software An Introduction to Mercurial Version Control Software LANS Weekly Seminar October 17, 2006 Satish Balay balay@mcs.anl.gov Outline Why use version control? Simple example of revisioning Mercurial introduction

More information

Version Control! Scenarios, Working with Git!

Version Control! Scenarios, Working with Git! Version Control! Scenarios, Working with Git!! Scenario 1! You finished the assignment at home! VC 2 Scenario 1b! You finished the assignment at home! You get to York to submit and realize you did not

More information

Introduction to Git. Markus Kötter koetter@rrzn.uni-hannover.de. Notes. Leinelab Workshop July 28, 2015

Introduction to Git. Markus Kötter koetter@rrzn.uni-hannover.de. Notes. Leinelab Workshop July 28, 2015 Introduction to Git Markus Kötter koetter@rrzn.uni-hannover.de Leinelab Workshop July 28, 2015 Motivation - Why use version control? Versions in file names: does this look familiar? $ ls file file.2 file.

More information

Bundler v0.5 Documentation

Bundler v0.5 Documentation Bundler v0.5 Documentation Prepared by the West Quad Computing Group October, 2008 1 Overview In the past, all development and computational activities took place on the (former) Roth lab cluster head-node,

More information

Data management on HPC platforms

Data management on HPC platforms Data management on HPC platforms Transferring data and handling code with Git scitas.epfl.ch September 10, 2015 http://bit.ly/1jkghz4 What kind of data Categorizing data to define a strategy Based on size?

More information

Version Control with Svn, Git and git-svn. Kate Hedstrom ARSC, UAF

Version Control with Svn, Git and git-svn. Kate Hedstrom ARSC, UAF 1 Version Control with Svn, Git and git-svn Kate Hedstrom ARSC, UAF 2 Version Control Software System for managing source files For groups of people working on the same code When you need to get back last

More information

monoseq Documentation

monoseq Documentation monoseq Documentation Release 1.2.1 Martijn Vermaat July 16, 2015 Contents 1 User documentation 3 1.1 Installation................................................ 3 1.2 User guide................................................

More information

Integrated version control with Fossil SCM

Integrated version control with Fossil SCM Integrated version control with Fossil SCM Tech Talk 2009-12-01 Arne Bachmann Folie 1 Overview Web address www.fossil-scm.org Author Dr. D.R. Hipp - Author of License GPL v2 Motto No information shall

More information

Gitflow process. Adapt Learning: Gitflow process. Document control

Gitflow process. Adapt Learning: Gitflow process. Document control Adapt Learning: Gitflow process Document control Abstract: Presents Totara Social s design goals to ensure subsequent design and development meets the needs of end- users. Author: Fabien O Carroll, Sven

More information

Continuous Integration and Delivery. manage development build deploy / release

Continuous Integration and Delivery. manage development build deploy / release Continuous Integration and Delivery manage development build deploy / release test About the new CI Tool Chain One of the biggest changes on the next releases of XDK, will be the adoption of the New CI

More information

StriderCD Book. Release 1.4. Niall O Higgins

StriderCD Book. Release 1.4. Niall O Higgins StriderCD Book Release 1.4 Niall O Higgins August 22, 2015 Contents 1 Introduction 3 1.1 What Is Strider.............................................. 3 1.2 What Is Continuous Integration.....................................

More information

000-420. IBM InfoSphere MDM Server v9.0. Version: Demo. Page <<1/11>>

000-420. IBM InfoSphere MDM Server v9.0. Version: Demo. Page <<1/11>> 000-420 IBM InfoSphere MDM Server v9.0 Version: Demo Page 1. As part of a maintenance team for an InfoSphere MDM Server implementation, you are investigating the "EndDate must be after StartDate"

More information

Streamline Computing Linux Cluster User Training. ( Nottingham University)

Streamline Computing Linux Cluster User Training. ( Nottingham University) 1 Streamline Computing Linux Cluster User Training ( Nottingham University) 3 User Training Agenda System Overview System Access Description of Cluster Environment Code Development Job Schedulers Running

More information

DevShop. Drupal Infrastructure in a Box. Jon Pugh CEO, Founder ThinkDrop Consulting Brooklyn NY

DevShop. Drupal Infrastructure in a Box. Jon Pugh CEO, Founder ThinkDrop Consulting Brooklyn NY DevShop Drupal Infrastructure in a Box Jon Pugh CEO, Founder ThinkDrop Consulting Brooklyn NY Who? Jon Pugh ThinkDrop Consulting Building the web since 1997. Founded in 2009 in Brooklyn NY. Building web

More information

A Complete Example of Next- Gen DNA Sequencing Read Alignment. Presentation Title Goes Here

A Complete Example of Next- Gen DNA Sequencing Read Alignment. Presentation Title Goes Here A Complete Example of Next- Gen DNA Sequencing Read Alignment Presentation Title Goes Here 1 FASTQ Format: The de- facto file format for sharing sequence read data Sequence and a per- base quality score

More information

PKI, Git and SVN. Adam Young. Presented by. Senior Software Engineer, Red Hat. License Licensed under http://creativecommons.org/licenses/by/3.

PKI, Git and SVN. Adam Young. Presented by. Senior Software Engineer, Red Hat. License Licensed under http://creativecommons.org/licenses/by/3. PKI, Git and SVN Presented by Adam Young Senior Software Engineer, Red Hat License Licensed under http://creativecommons.org/licenses/by/3.0/ Agenda Why git Getting started Branches Commits Why? Saved

More information

Processing NGS Data with Hadoop-BAM and SeqPig

Processing NGS Data with Hadoop-BAM and SeqPig Processing NGS Data with Hadoop-BAM and SeqPig Keijo Heljanko 1, André Schumacher 1,2, Ridvan Döngelci 1, Luca Pireddu 3, Matti Niemenmaa 1, Aleksi Kallio 4, Eija Korpelainen 4, and Gianluigi Zanetti 3

More information

Developer Workshop 2015. Marc Dumontier McMaster/OSCAR-EMR

Developer Workshop 2015. Marc Dumontier McMaster/OSCAR-EMR Developer Workshop 2015 Marc Dumontier McMaster/OSCAR-EMR Agenda Code Submission 101 Infrastructure Tools Developing OSCAR Code Submission: Process OSCAR EMR Sourceforge http://www.sourceforge.net/projects/oscarmcmaster

More information

Introduction to Version Control

Introduction to Version Control Research Institute for Symbolic Computation Johannes Kepler University Linz, Austria Winter semester 2014 Outline General Remarks about Version Control 1 General Remarks about Version Control 2 Outline

More information

Delivering the power of the world s most successful genomics platform

Delivering the power of the world s most successful genomics platform Delivering the power of the world s most successful genomics platform NextCODE Health is bringing the full power of the world s largest and most successful genomics platform to everyday clinical care NextCODE

More information

DEPLOYING EMC DOCUMENTUM BUSINESS ACTIVITY MONITOR SERVER ON IBM WEBSPHERE APPLICATION SERVER CLUSTER

DEPLOYING EMC DOCUMENTUM BUSINESS ACTIVITY MONITOR SERVER ON IBM WEBSPHERE APPLICATION SERVER CLUSTER White Paper DEPLOYING EMC DOCUMENTUM BUSINESS ACTIVITY MONITOR SERVER ON IBM WEBSPHERE APPLICATION SERVER CLUSTER Abstract This white paper describes the process of deploying EMC Documentum Business Activity

More information

Version control with GIT

Version control with GIT AGV, IIT Kharagpur September 13, 2012 Outline 1 Version control system What is version control Why version control 2 Introducing GIT What is GIT? 3 Using GIT Using GIT for AGV at IIT KGP Help and Tips

More information

Hadoopizer : a cloud environment for bioinformatics data analysis

Hadoopizer : a cloud environment for bioinformatics data analysis Hadoopizer : a cloud environment for bioinformatics data analysis Anthony Bretaudeau (1), Olivier Sallou (2), Olivier Collin (3) (1) anthony.bretaudeau@irisa.fr, INRIA/Irisa, Campus de Beaulieu, 35042,

More information

Developing tests for the KVM autotest framework

Developing tests for the KVM autotest framework Lucas Meneghel Rodrigues lmr@redhat.com KVM Forum 2010 August 9, 2010 1 Automated testing Autotest The wonders of virtualization testing 2 How KVM autotest solves the original problem? Features Test structure

More information

Web Developer Toolkit for IBM Digital Experience

Web Developer Toolkit for IBM Digital Experience Web Developer Toolkit for IBM Digital Experience Open source Node.js-based tools for web developers and designers using IBM Digital Experience Tools for working with: Applications: Script Portlets Site

More information

LifeScope Genomic Analysis Software 2.5

LifeScope Genomic Analysis Software 2.5 USER GUIDE LifeScope Genomic Analysis Software 2.5 Graphical User Interface DATA ANALYSIS METHODS AND INTERPRETATION Publication Part Number 4471877 Rev. A Revision Date November 2011 For Research Use

More information

Using Git for Centralized and Distributed Version Control Workflows - Day 3. 1 April, 2016 Presenter: Brian Vanderwende

Using Git for Centralized and Distributed Version Control Workflows - Day 3. 1 April, 2016 Presenter: Brian Vanderwende Using Git for Centralized and Distributed Version Control Workflows - Day 3 1 April, 2016 Presenter: Brian Vanderwende Git jargon from last time... Commit - a project snapshot in a repository Staging area

More information

Git Basics. Christian Hanser. Institute for Applied Information Processing and Communications Graz University of Technology. 6.

Git Basics. Christian Hanser. Institute for Applied Information Processing and Communications Graz University of Technology. 6. Git Basics Christian Hanser Institute for Applied Information Processing and Communications Graz University of Technology 6. March 2013 Christian Hanser 6. March 2013 Seite 1/39 Outline Learning Targets

More information

Analysis of NGS Data

Analysis of NGS Data Analysis of NGS Data Introduction and Basics Folie: 1 Overview of Analysis Workflow Images Basecalling Sequences denovo - Sequencing Assembly Annotation Resequencing Alignments Comparison to reference

More information

NGS Data Analysis: An Intro to RNA-Seq

NGS Data Analysis: An Intro to RNA-Seq NGS Data Analysis: An Intro to RNA-Seq March 25th, 2014 GST Colloquim: March 25th, 2014 1 / 1 Workshop Design Basics of NGS Sample Prep RNA-Seq Analysis GST Colloquim: March 25th, 2014 2 / 1 Experimental

More information

Next Generation Sequencing; Technologies, applications and data analysis

Next Generation Sequencing; Technologies, applications and data analysis ; Technologies, applications and data analysis Course 2542 Dr. Martie C.M. Verschuren Research group Analysis techniques in Life Science, Breda Prof. dr. Johan T. den Dunnen Leiden Genome Technology Center,

More information

Is This Your Pipe? Hijacking the Build Pipeline

Is This Your Pipe? Hijacking the Build Pipeline Is This Your Pipe? Hijacking the Build Pipeline $ whoami @rgbkrk OSS, Builds and Testing Protecting pipelines Need Want benefits of continuous delivery! Open source pathways to real, running infrastructure!

More information

Using Git for Project Management with µvision

Using Git for Project Management with µvision MDK Version 5 Tutorial AN279, Spring 2015, V 1.0 Abstract Teamwork is the basis of many modern microcontroller development projects. Often teams are distributed all over the world and over various time

More information

Putting It All Together. Vagrant Drush Version Control

Putting It All Together. Vagrant Drush Version Control Putting It All Together Vagrant Drush Version Control Vagrant Most Drupal developers now work on OSX. The Vagarant provisioning scripts may not work on Windows without subtle changes. If supplied, read

More information

Administration GUIDE. SharePoint Server idataagent. Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 201

Administration GUIDE. SharePoint Server idataagent. Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 201 Administration GUIDE SharePoint Server idataagent Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 201 Getting Started - SharePoint Server idataagent Overview Deployment Configuration Decision Table

More information

Version control. with git and GitHub. Karl Broman. Biostatistics & Medical Informatics, UW Madison

Version control. with git and GitHub. Karl Broman. Biostatistics & Medical Informatics, UW Madison Version control with git and GitHub Karl Broman Biostatistics & Medical Informatics, UW Madison kbroman.org github.com/kbroman @kwbroman Course web: kbroman.org/tools4rr Slides prepared with Sam Younkin

More information

Continuous Integration. CSC 440: Software Engineering Slide #1

Continuous Integration. CSC 440: Software Engineering Slide #1 Continuous Integration CSC 440: Software Engineering Slide #1 Topics 1. Continuous integration 2. Configuration management 3. Types of version control 1. None 2. Lock-Modify-Unlock 3. Copy-Modify-Merge

More information

SMRT Analysis Software Installation (v2.3.0)

SMRT Analysis Software Installation (v2.3.0) SMRT Analysis Software Installation (v2.3.0) Introduction This document describes the basic requirements for installing SMRT Analysis v2.3.0 on a customer system. SMRT Analysis is designed to be installed

More information

Automating Big Data Benchmarking for Different Architectures with ALOJA

Automating Big Data Benchmarking for Different Architectures with ALOJA www.bsc.es Jan 2016 Automating Big Data Benchmarking for Different Architectures with ALOJA Nicolas Poggi, Postdoc Researcher Agenda 1. Intro on Hadoop performance 1. Current scenario and problematic 2.

More information

Mobile Development with Git, Gerrit & Jenkins

Mobile Development with Git, Gerrit & Jenkins Mobile Development with Git, Gerrit & Jenkins Luca Milanesio luca@gerritforge.com June 2013 1 ENTERPRISE CLOUD DEVELOPMENT Copyright 2013 CollabNet, Inc. All Rights Reserved. About CollabNet Founded in

More information

[Handout for L6P2] How to Avoid a Big Bang: Integrating Software Components

[Handout for L6P2] How to Avoid a Big Bang: Integrating Software Components Integration [Handout for L6P2] How to Avoid a Big Bang: Integrating Software Components Timing and frequency: Late and one time vs early and frequent Integrating parts written by different team members

More information

Building a Python Plugin

Building a Python Plugin Building a Python Plugin QGIS Tutorials and Tips Author Ujaval Gandhi http://google.com/+ujavalgandhi This work is licensed under a Creative Commons Attribution 4.0 International License. Building a Python

More information

Introducing Xcode Source Control

Introducing Xcode Source Control APPENDIX A Introducing Xcode Source Control What You ll Learn in This Appendix: u The source control features offered in Xcode u The language of source control systems u How to connect to remote Subversion

More information

Text file One header line meta information lines One line : variant/position

Text file One header line meta information lines One line : variant/position Software Calling: GATK SAMTOOLS mpileup Varscan SOAP VCF format Text file One header line meta information lines One line : variant/position ##fileformat=vcfv4.1! ##filedate=20090805! ##source=myimputationprogramv3.1!

More information

Using the Yale HPC Clusters

Using the Yale HPC Clusters Using the Yale HPC Clusters Stephen Weston Robert Bjornson Yale Center for Research Computing Yale University Dec 2015 To get help Send an email to: hpc@yale.edu Read documentation at: http://research.computing.yale.edu/hpc-support

More information

Galaxy4Bioinformatics Développement et intégration d application sous Galaxy TOOL INTEGRATION

Galaxy4Bioinformatics Développement et intégration d application sous Galaxy TOOL INTEGRATION Galaxy4Bioinformatics Développement et intégration d application sous Galaxy Gildas Le Corguillé Gwendoline Andres Loraine Guéguen IFB-GT Galaxy Devteam March 4, 2015 9am 18am TOOL INTEGRATION Part I CONTEXT

More information

Handling next generation sequence data

Handling next generation sequence data Handling next generation sequence data a pilot to run data analysis on the Dutch Life Sciences Grid Barbera van Schaik Bioinformatics Laboratory - KEBB Academic Medical Center Amsterdam Very short intro

More information

The Global Rules set is evaluated first and contains the global access rules that apply to all NG firewalls using the shared service.

The Global Rules set is evaluated first and contains the global access rules that apply to all NG firewalls using the shared service. Distributed Firewall The distributed firewall (formerly Cascaded Firewall or cfirewall) is a firewall service distributed across multiple NG Firewalls. It is a variant of the regular firewall service,

More information

Module 11 Setting up Customization Environment

Module 11 Setting up Customization Environment Module 11 Setting up Customization Environment By Kitti Upariphutthiphong Technical Consultant, ecosoft kittiu@gmail.com ADempiere ERP 1 2 Module Objectives Downloading ADempiere Source Code Setup Development

More information

Surround SCM Best Practices

Surround SCM Best Practices Surround SCM Best Practices This document addresses some of the common activities in Surround SCM and offers best practices for each. These best practices are designed with Surround SCM users in mind,

More information

DevOps Course Content

DevOps Course Content DevOps Course Content INTRODUCTION TO DEVOPS What is DevOps? History of DevOps Dev and Ops DevOps definitions DevOps and Software Development Life Cycle DevOps main objectives Infrastructure As A Code

More information

Work Environment. David Tur HPC Expert. HPC Users Training September, 18th 2015

Work Environment. David Tur HPC Expert. HPC Users Training September, 18th 2015 Work Environment David Tur HPC Expert HPC Users Training September, 18th 2015 1. Atlas Cluster: Accessing and using resources 2. Software Overview 3. Job Scheduler 1. Accessing Resources DIPC technicians

More information