Writing & Running Pipelines on the Open Grid Engine using QMake. Wibowo Arindrarto DTLS Focus Meeting
1 Writing & Running Pipelines on the Open Grid Engine using QMake Wibowo Arindrarto DTLS Focus Meeting
2 Makefile (re)introduction
- Atomic recipes / rules that define full pipelines
- Initially written for compiling source code files into executables
- Repurposed for data processing pipelines
3 Makefile Recipes
Atomic recipes / rules that define full pipelines
    ## This is my Makefile
    all: sample.bam

    %.sam: %.fastq hg19.ref
            bowtie --sam $(word 2,$^) $< > $@

    %.bam: %.sam hg19.fa
            samtools view -bt $(word 2,$^) -o $@ $<
- Workflows are determined implicitly, depending on the main target
- Some useful aliases for common patterns, e.g. in %.sam: %.fastq
  - '%' denotes the common part of the target and dependency names
  - $^ is the list of all dependencies
  - $< is the first dependency
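To make the aliases concrete, here is a minimal Python sketch of how a pattern rule could be expanded for one target. It is an illustration only, not how make is implemented: `expand_rule` is a hypothetical helper, it handles just `$@`, `$<`, `$^` and the single call `$(word 2,$^)`, and it assumes GNU make's argument order for `word` (index first, then the list).

```python
def expand_rule(target, target_pattern, dep_patterns, recipe):
    """Expand a Make-style pattern rule for one concrete target (simplified)."""
    prefix, suffix = target_pattern.split("%")
    if not (target.startswith(prefix) and target.endswith(suffix)):
        raise ValueError(f"{target!r} does not match {target_pattern!r}")
    # The stem is whatever '%' matched in the target name.
    stem = target[len(prefix):len(target) - len(suffix)]
    deps = [dep.replace("%", stem) for dep in dep_patterns]
    first = deps[0] if deps else ""
    second = deps[1] if len(deps) > 1 else ""
    # Replace the longest placeholder first so the '$^' inside
    # '$(word 2,$^)' is not expanded on its own.
    command = (
        recipe.replace("$(word 2,$^)", second)
        .replace("$^", " ".join(deps))
        .replace("$<", first)
        .replace("$@", target)
    )
    return deps, command


deps, command = expand_rule(
    "sample.sam",
    "%.sam",
    ["%.fastq", "hg19.ref"],
    "bowtie --sam $(word 2,$^) $< > $@",
)
print(deps)     # ['sample.fastq', 'hg19.ref']
print(command)  # bowtie --sam hg19.ref sample.fastq > sample.sam
```

Asking for `sample.bam` would trigger the same expansion one level up, which is how the workflow follows implicitly from the main target.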
4 Why Makefiles?
- Many alternatives already exist: Ruffus, Snakemake, bpipe, etc.
- None parallelizes as easily as (q)make on our cluster
- Coming up with a defined approach for Makefile pipelines & writing helper scripts is the way to go for now
- This is not a solved problem!
5 Makefiles are neat
- Parallelization: across cores (make) and nodes (qmake)
- Resume runs from failure points
- Easy to define dependencies among steps
- Close to the shell environment
- Already used in some of our earlier internal pipelines, e.g. GAPSS3
- Big upgrade from shell / python / perl scripts!
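Resuming from a failure point works because make skips any target that already exists and is newer than its dependencies. The sketch below shows that timestamp check in Python. It is a simplification (real make also deals with phony targets, missing dependencies and more), and `needs_rebuild` is a name made up for this example.

```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


with tempfile.TemporaryDirectory() as workdir:
    fastq = os.path.join(workdir, "sample.fastq")
    sam = os.path.join(workdir, "sample.sam")
    open(fastq, "w").close()
    os.utime(fastq, (1000, 1000))

    # The target does not exist yet, so the step has to run.
    missing = needs_rebuild(sam, [fastq])

    # The target exists and is newer than its input, so the step is skipped.
    open(sam, "w").close()
    os.utime(sam, (2000, 2000))
    fresh = needs_rebuild(sam, [fastq])

    # The input changed after the target was built, so the step runs again.
    os.utime(fastq, (3000, 3000))
    stale = needs_rebuild(sam, [fastq])

print(missing, fresh, stale)  # True False True
```

After a crash, every step whose output is already in place falls into the second case, so only the failed step and the steps that depend on it are run again.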
6 GAPSS3
- Makefile-based pipeline for exome and genome alignment
- Designed to be run on multi-core machines (or a cluster)
- Run as a regular Makefile:
    $ make -f Exome.mk
    $ qmake -cwd -inherit -- -j 5 -f Exome.mk
- Worked as intended, but highlights areas where we can improve...
7 Problems Encountered
Bioinformatics pipelines (vs software build systems):
- More moving parts (aligners, variant callers): should be easier to swap parts in and out
- More experimental in nature: should be easy to play with program option flags
- More investigative in nature: should be easy to generate reports for diagnosis
8 Rig: Framework on Top of Make
- Core idea
  - Module: a single unit that performs a useful function
  - Each module is standalone, but modules can also be combined to create another module
- Implementation
  - Each module: recipe file + config file
  - Two types of modules: tool wrappers and pipelines
9 Rig: Framework on Top of Make
- Logged by default
  - Variables defined inside the config
  - stdout, stderr, and job details (qmake only)
- Dynamic options
  - Command-line flags change on the fly
    $ qmake -- -j 5 -f pipeline.mk OPT_BOWTIE_m=1
10 Module Structure
sample.mk: recipes
    %.sam: %.1.fastq %.2.fastq
            $(BOWTIE) $(IDX) $^ > $@

    %.bam: %.sam
            $(SAMTOOLS) view -bt $(REF) -o $@ $<
sample.mkc: config
    INPUTS := mine.1.fastq mine.2.fastq
    IDX := /usr/local/indices/hg19
    REF := /usr/local/genomes/hg19.fa
    BOWTIE := /usr/bin/bowtie
    SAMTOOLS := /usr/bin/samtools
- Cleaner separation of module logic & module components
- Easier setup of required variables (e.g. reporting variables)
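Keeping values in a config file works well with make because a variable given on the command line takes precedence over an ordinary assignment in the makefile. The Python sketch below mimics that precedence for a `.mkc`-style file. It is an illustration rather than Rig's implementation: `parse_config` and `apply_overrides` are hypothetical helpers, only plain `NAME := value` lines are handled, and the override path `/opt/bowtie/bowtie` is made up.

```python
def parse_config(text):
    """Parse 'NAME := value' lines, skipping blank lines and '#' comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":=")
        if sep:
            config[name.strip()] = value.strip()
    return config


def apply_overrides(config, arguments):
    """Let NAME=value arguments take precedence over config values."""
    merged = dict(config)
    for argument in arguments:
        name, sep, value = argument.partition("=")
        # Skip option flags such as '-j' and plain values such as '5'.
        if sep and not name.startswith("-"):
            merged[name] = value
    return merged


SAMPLE_MKC = """
IDX := /usr/local/indices/hg19
BOWTIE := /usr/bin/bowtie
"""

settings = apply_overrides(
    parse_config(SAMPLE_MKC),
    # Hypothetical command line; the BOWTIE path is an assumption.
    ["-j", "5", "-f", "pipeline.mk", "BOWTIE=/opt/bowtie/bowtie"],
)
print(settings)
# {'IDX': '/usr/local/indices/hg19', 'BOWTIE': '/opt/bowtie/bowtie'}
```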
11 Module Types
- Pipeline: multiple recipes (similar to GAPSS)
- Tool wrapper: single recipe, 'wraps' a single command-line tool
    $ bowtie --m 1 /usr/local/indices/hg19 sample.fq > sample.sam
    $ qmake -- -j 5 -f pipeline.mk OPT_BOWTIE_m=1

    %.sam: %.fq
            bowtie /usr/local/indices/hg19 $^ --m 1 $< > $@

    # instead
    %.sam: %.fasta
            $(MAKE) $(MODULE_ALIGNER) $(OPT_ALIGNER)
12 Other Additions
- Python scripts to handle boilerplate code
    $ rig_gen.py tool bowtie2
    # creates a template tool wrapper named bowtie2
- Python module for exploring job logs
    >>> from rig import RigRun
    >>> run = RigRun('my_pipeline', '/path/to/log/directory')
    >>> for module in run:
    ...     for job in module:
    ...         print job.id, job.start
    my_module datetime.datetime(2013, 6, 26, 0, 54, 16, )
- Nameset files for defined input patterns
13 In Progress
- Tool wrappers: 41 (bowtie, cufflinks, sickle, etc.)
- Pipelines: 14
  - Customizable QC pipeline (FastQC, sickle, cutadapt)
  - Gentrap v2.0 (using the QC pipeline, 2 new aligners)
  - GATK best practices pipeline (one module per phase)
  - Deepsage pipeline using genome & 'transcriptome' alignment
  - and more...
- Identical logging for make and qmake
- Tests..?
14 Compromises
- File sync problem: not cleanly handled by qmake
- We had to hack into it and use a custom shell wrapper to ensure dependencies are available before each job
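The talk gives no details of the wrapper, so the following Python sketch only illustrates the general idea of waiting until a job's input files are visible on the node before starting it. The function name `wait_for_files`, the polling approach and the timeout values are all assumptions, and the original wrapper was a shell script, not Python.

```python
import os
import tempfile
import time


def wait_for_files(paths, timeout=60.0, interval=1.0):
    """Poll until every path exists; return the paths still missing at timeout."""
    deadline = time.monotonic() + timeout
    while True:
        missing = [path for path in paths if not os.path.exists(path)]
        if not missing or time.monotonic() >= deadline:
            return missing
        time.sleep(interval)


with tempfile.TemporaryDirectory() as workdir:
    present = os.path.join(workdir, "sample.sam")
    absent = os.path.join(workdir, "sample.bam")
    open(present, "w").close()

    # All dependencies are visible: returns an empty list straight away.
    ready = wait_for_files([present], timeout=1.0, interval=0.1)
    # One dependency never appears: returns it once the timeout expires.
    not_ready = wait_for_files([present, absent], timeout=0.3, interval=0.1)

print(ready == [], [os.path.basename(path) for path in not_ready])
# True ['sample.bam']
```

A wrapper built this way would run the actual job command only when the returned list is empty, and fail the job otherwise.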
15 Acknowledgements
Jeroen Laros, Leon Mei, Martijn Vermaat, Martin van den Kerkhoff, Michiel van Galen, Peter van 't Hof, Zuotian Tatum, Sander van der Zeeuw, Wai Yi Leung
16 Initial Development Model
- Single git repository: core library, pipelines, tool wrappers
- Dependency problem: one tool, multiple pipelines?
- Challenge: how to version a repo within a repo?
- Choice between git submodules and git subtree
17 subtree vs submodule
- git subtree:
  - Adds the sub-repository as a folder under the main repository (as a remote)
  - Can push to the sub-repository selectively
  - Can pull the entire sub-repository history
- git submodule:
  - Adds the sub-repository under the main repository, but not under git (?)
  - Requires an additional .gitmodules file (which is versioned)
  - Cloning is messy..
18 Workflow
- Create tool wrappers, push to remote repo
- Create pipeline, and then add modules:
    git remote add mod_tool...
    git subtree add -P {path} --squash mod_tool/master
- Work on pipeline, work on tool
- Pull from tool repo:
    git subtree merge -P {path} --squash mod_tool/master
- Push to tool repo:
    git subtree push -P {path} mod_tool master
19 git subtree considerations
- Advantages:
  - Everything is a regular file
  - Makes releases easy
  - Clone as usual
- Downsides:
  - History becomes messy
20 Demo time!
bioassisst: a simple pipeline that maps FASTQ files into BAM files