Anar Manafov, GSI Darmstadt. GSI Palaver, 2010-03-09

HEP Data Analysis: implement the algorithm, run over the data set, make improvements. A typical HEP analysis needs a continuous algorithm refinement cycle.

PROOF (diagram): a PROOF query (data file list, myselector.c) goes to the master of the PROOF cluster; feedback and the merged final output come back. Other diagram labels: storage, file catalog, query, scheduler, CPUs.

PROOF: the PROOF cluster works as an extension of a local PC, with the same macro and syntax as in a local ROOT session, more dynamic use of resources, real-time feedback, and automatic splitting and merging.
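The "automatic splitting and merging" idea can be illustrated with a minimal Python sketch. It is not PROOF code: the `split`, `process`, `merge` and `run_query` helpers and the file names are hypothetical, and the static round-robin split is a simplification chosen only to keep the example short.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(files, n_workers):
    """Deal the file list round-robin into n_workers packets (simplified, static split)."""
    return [files[i::n_workers] for i in range(n_workers)]


def process(packet):
    """Hypothetical stand-in for a selector: count the files seen per run prefix."""
    return Counter(name.split("_")[0] for name in packet)


def merge(partials):
    """Combine the partial results of all workers into one final output."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_query(files, n_workers):
    """Split the input, process the packets concurrently, and merge the results."""
    packets = split(files, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process, packets))
    return merge(partials)


# Hypothetical data set: two runs with five files each.
files = [f"run{r}_{i}.root" for r in (1, 2) for i in range(5)]
result = run_query(files, 3)
print(dict(result))
```

The user code only calls `run_query`; how the input is divided among workers and how partial outputs are combined stays hidden, which is the point the slide makes about PROOF.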

Dynamic cluster: the user can control it entirely, can set it up and use it on demand, can reserve the desired number of workers, can select a preferred master host, doesn't need admins to take any action, and doesn't interact with other users.

PoD v2.1.x (component diagram): pod-console (PoD server management, PROOF workers monitor); PoDWorker; PoD utilities and command-line tools; job manager plug-in system (gLite, LSF, PBS, SSH); configuration files (PoD, xrootd, PROOF); pod-user-defaults; glite-api-wrapper (GAW); pod-agent; gLite Grid API; LSF API; PBS (Torque) API; ssh/scp; xrootd/xproof plug-in.

PoD workflow (diagram, built up step by step). The picture starts with the user workspace and the resource management system. A firewall separates the user workspace from the worker node workspace. A pod-agent server and xrootd run in the user workspace. PoDWorker job #1 goes through the resource management system to a worker node, where xrootd and pod-agent worker #1 run. PoDWorker job #2 follows with pod-agent worker #2. Finally, the PROOF master appears in the user workspace and PROOF workers #1 and #2 on the worker nodes.

A follow-up diagram lists the same components: worker node workspace, resource management system, PROOF workers #1 and #2, PoDWorker jobs #1 and #2, pod-agents, xrootd, user workspace, PROOF master, and pod-agent server.

Live demo

Key features: easy to use; GUI and command line; different job managers; multiuser/multicore environment; native PROOF connections; packet forwarding; user defaults (configuration).

PoD development: PoD is about 85% C++, 13% Bash, and 2% Python and Perl. The VCS is git: http://depc218.gsi.de:22222/git/ The continuous integration system is buildbot: http://depc218.gsi.de:22000/waterfall Development environment: Xcode and Eclipse.

PoD development: the build system is CMake. Documentation systems: DocBook (user manual and web site http://pod.gsi.de) and Doxygen (source code documentation). All documentation and the web site are generated automatically every night.

PoD at GSI: a dedicated, preemptive LSF queue with max. 120 jobs per user and max. 4 hours run-time per job. Data is located on the Lustre FS. Mainly used by the ALICE group (GSI, Heidelberg, Münster). On average we have 2-5 concurrent users with 20-120 workers each.
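The queue limits quoted above can be turned into a quick sanity check before requesting workers. This is only a sketch: the `check_request` helper is hypothetical and not part of PoD or LSF, and it assumes that each PoD worker occupies exactly one job slot.

```python
# Limits quoted for the dedicated LSF queue at GSI.
MAX_JOBS_PER_USER = 120
MAX_RUNTIME_HOURS = 4


def check_request(n_workers, hours):
    """Return a list of limit violations for a request (hypothetical helper).

    Assumes one PoD worker corresponds to one job in the queue.
    """
    problems = []
    if n_workers > MAX_JOBS_PER_USER:
        problems.append(f"{n_workers} workers exceed the {MAX_JOBS_PER_USER}-job limit")
    if hours > MAX_RUNTIME_HOURS:
        problems.append(f"{hours} h exceed the {MAX_RUNTIME_HOURS} h run-time limit")
    return problems


# Upper bound on worker-hours one user can get from a single submission round.
max_worker_hours = MAX_JOBS_PER_USER * MAX_RUNTIME_HOURS

print(check_request(100, 3))
print(check_request(150, 5))
print(max_worker_hours)
```

Under these assumptions a single user can obtain at most 120 x 4 = 480 worker-hours per submission round, so longer analyses have to be split into several sessions.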

To do: PBS and SSH plug-ins, out-of-server UI, native Mac OS X implementation of the UI.