GRID computing at LHC Science without Borders




GRID computing at LHC: Science without Borders. Kajari Mazumdar, Department of High Energy Physics, Tata Institute of Fundamental Research, Mumbai. Disclaimer: I am a physicist whose research field induces and utilizes cutting-edge technology in the fields of electronics, communication, etc. Dr. Paul's Eng. College, Velucherry, September 12, 2011.

Basic idea (G. Gilder): when the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special-purpose appliances.
Plan of talk:
- Requirements of today's scientific community
- The Grid concept in simple terms
- Evolution of the Grid
- The LHC Computing Grid and the CMS experiment
- The CMS Tier-2 Grid Computing Centre at TIFR, Mumbai
- Outlook

Computing requirements and challenges. Today's science is based on computation, data analysis, data visualization, etc.
1. Scientific and engineering problems are getting ever more complex.
2. Collaborations are becoming larger.
Computer simulation and modelling is more cost-effective than experimental methods in some cases (e.g. reactor safety, aircraft design). Users need more accurate and precise solutions to their problems in the shortest time possible (e.g. weather forecasts). Recent years have seen mammoth scientific projects where the data size is several petabytes per year (e.g. the LHC experiments), to be used by several thousand people. To work with a colleague, even across a campus, at the petabyte (10^15 bytes) scale, we need ultrafast networks. Even though CPU power, disk storage and communication speed continue to increase, computing resources are failing to satisfy users' demands!

Current trends in scientific communications.
1. Free, open-source software: GNU/Linux-based operating systems have been developed consciously with many applications. Research and academic institutes use cheaper PC clusters to achieve high performance, and it is easy to develop loosely coupled distributed applications. Software has to catch up with users' demands and expectations for high-end computing.
2. Parallel computing: multiple computers or processors work together on a common task -- each processor works on its own section of the problem, and processors are allowed to exchange information among themselves. The two big advantages of parallel computers are performance and memory.
3. Internet computing using idle PCs is becoming an important computing platform (LHC@home, SETI@home, Napster, ...). The web is a promising candidate for the core component of a wide-area distributed computing environment: efficient client/server models and protocols, transparent networking, navigation, GUIs with multimedia access, and dissemination for data visualization.
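As a rough illustration of the parallel-computing idea above (not part of the talk), here is a minimal Python sketch in which each worker process handles its own section of a problem and the partial results are combined at the end:

```python
# Minimal sketch: each worker process handles its own slice of a large
# problem, and the partial results are combined at the end.
from multiprocessing import Pool

def partial_sum(chunk):
    """Work on one section of the problem: here, just sum squares of a slice."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))            # the full problem
    n_workers = 4
    chunk_size = len(data) // n_workers
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    with Pool(n_workers) as pool:            # one process per section
        partials = pool.map(partial_sum, chunks)

    print("total =", sum(partials))          # combine the partial results
```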

Grid computing in simple words. The Grid is a utility or infrastructure for complex, huge computations, where remote resources are accessible through the web (internet) from a desktop, laptop or mobile phone. It is similar to the electrical power grid, where the user does not have to worry about the source of the power. Imagine millions of computers, owned by individuals and institutes in various countries across the world, connected to form a single, huge supercomputer! This technology, developed only over the last decade, is being used by
--- high energy physicists to store and analyze the data produced by the LHC experiments at CERN, Geneva, Switzerland,
--- Earth scientists to monitor ozone layer activity,
--- biologists to monitor the behaviour of bees,
--- ...
It is the natural evolution of the internet.

Going back: the World Wide Web and information sharing. Invented at CERN by Tim Berners-Lee (in the early 1990s) for use in high energy physics experiments, it quickly crossed over into public use, built on agreed protocols such as HTTP: anyone can access information and post their own. The GRID is changing the way science is being done, and high-speed networking over large distances has been its key aspect.

From Web to Grid computing: working together apart. Use of the internet as infrastructure, and advanced web services, for seamless integration.
1. Sharing more than just information: data, computing power and applications in dynamic, multi-institutional, virtual organizations (tools: email, video conference, webcast, white board).
2. Efficient use of major and minor resources at many institutes; people from many institutions working to solve a common problem; data accessible anywhere and anytime.
3. Interactions with the underlying layers need to be transparent and seamless to the user.
4. Harnessing the power of the internet to aggregate and share resources spread across the globe is both challenging and highly cost-effective, and can give practically unlimited capability. The system must grow rapidly, yet remain reliable for more than a decade.

Large Hadron Collider (LHC): the largest ever scientific project -- 20 years to plan and build, 20 years to work with. 27 km in circumference, operating at 1.9 K and ~10^-13 Torr, 50-175 m below the surface, with more than 10,000 magnets. Four big experiments, with about 10,000 scientists plus ~3,000 students and engineers. Operational since Q4 2009, with excellent performance and a fast reaping of science!

[Slide: timeline of the universe, from experiments in astrophysics and cosmology. LHC collisions probe conditions ~10^-12 s (p-p) and ~10^-6 s (Pb-Pb) after the Big Bang; COBE (1989) and WMAP (2001) observed the cosmic microwave background from ~300,000 years after the Big Bang.]

In hard numbers: the LHC produces 600-800 million proton-proton collisions per second, for several years. Only about 1 in 20 thousand collisions will have an important tale to tell, but we do not know which one, so we have to search through all of them! A huge task: about 15 PB (1 PB = 10^15 bytes) of data a year, and the analysis requires ~100,000 computers to get results in a reasonable time. GRID computing is essential.

Complexity of LHC experiments. When two very high energy protons collide at the LHC, the result is a very crowded situation. In a single experiment, several million electrical signals are recorded within a tiny fraction of a second, repeatedly, for a long time; there are 4 big experiments. Using computers, a digital image is created for each such instance; the image size can vary from 1 to 80 MB depending on the impact. Unfortunately, most of these pictures are not interesting: only one in a few thousand billion collisions will be really useful in providing a clue about the early conditions of the universe! So we store the data produced by colliding intense beams of energetic protons and statistically search for clues about the early universe, when it was much hotter.

Data volume rates for a typical experiment: presently the event size is ~1 MB and the data collection rate is ~400 Hz.
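A back-of-the-envelope check of these numbers (a sketch added for illustration; the assumed ~10^7 seconds of effective data-taking per year is a commonly used round figure, not stated on the slide):

```python
# Back-of-the-envelope data-volume estimate for one experiment.
event_size_bytes = 1e6        # ~1 MB per event (from the slide)
rate_hz = 400                 # ~400 events recorded per second (from the slide)
seconds_per_year = 1e7        # assumed effective data-taking time per year

bytes_per_second = event_size_bytes * rate_hz         # ~4e8 B/s = 400 MB/s
bytes_per_year = bytes_per_second * seconds_per_year  # ~4e15 B = ~4 PB

print(f"throughput: {bytes_per_second / 1e6:.0f} MB/s")
print(f"raw data per year: {bytes_per_year / 1e15:.1f} PB")
```

A few petabytes per experiment per year is consistent with the ~15 PB/year quoted earlier for the LHC experiments combined.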

Layered structure of the CMS GRID: connecting computers across the globe.
Tier 0: the experimental site and the CERN computer centre, Geneva.
Tier 1: national centres, e.g. Asia (Taiwan), USA, Germany, Italy, France.
Tier 2: regional groups in a continent/nation, e.g. India (Indiacms, T2_IN_TIFR), China, Korea, Taiwan, Pakistan.
Below that: different universities and institutes in a country (BARC, TIFR, Delhi Univ., Panjab Univ.) and, finally, the individual scientist's PC or laptop.

Overview of Grid components: a huge manpower is invisibly at work. [Slide: diagram of the Tier-2 components.]

Grid middleware. The grid relies on advanced software, the middleware, which interfaces between the resources and the applications linked by the internet: middleware mediates everything.
1. Secure and effective uniform access to a wide range of resources
2. Optimal use of resources
3. Authentication to the system by digital certificate, and then authorisation to groups and sites (rights to use the facility for the user's purpose)
4. Application-level management: job execution and monitoring during progress
5. Problem recovery
6. Collection of results after execution and delivery to the user
7. Addressing inter-domain issues of security, policy, etc.
Middleware components: user interface; resource broker / workload management system; information system; file and replica catalogues; logging and book-keeping; storage elements; compute elements.
From the user's point of view: 1. You submit a task to the grid. 2. The grid finds convenient places to execute the task, decomposing it if necessary. 3. It informs you when the task has finished.
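As a toy illustration of that submit/broker/collect cycle (a sketch only, with made-up site names and a trivial workload; this is not actual grid middleware):

```python
# Toy sketch: a "broker" decomposes a task into jobs, assigns each job to the
# least-loaded site, runs them, and collects the results for the user.
from concurrent.futures import ThreadPoolExecutor

SITES = {"site_A": 0, "site_B": 0, "site_C": 0}   # site -> assigned jobs

def pick_site():
    """Resource broker: choose the site with the fewest assigned jobs."""
    site = min(SITES, key=SITES.get)
    SITES[site] += 1
    return site

def run_job(job):
    """Compute element: process one chunk of events at the assigned site."""
    site, events = job
    return site, sum(events)              # stand-in for real event processing

task = list(range(1000))                                      # the user's task
chunks = [task[i:i + 250] for i in range(0, len(task), 250)]  # decomposition
jobs = [(pick_site(), chunk) for chunk in chunks]             # brokering

with ThreadPoolExecutor() as pool:        # execution, then collection/delivery
    for site, result in pool.map(run_job, jobs):
        print(f"{site}: partial result {result}")
```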

GRID portal / gateway. Event-level parallelism: events are processed one by one, so a large job can be split into many efficient, independent processes, each dealing with its own subset of events. Large memory is needed, though scalability is built-in.
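For example, a hypothetical job splitter (an illustrative sketch, not CMS software) might divide a run into event ranges like this:

```python
# Sketch: divide a run of n_events into n_jobs jobs by event range,
# since each event can be processed independently.
def split_events(n_events, n_jobs):
    """Return (first, last) event-index ranges, one per job."""
    base, extra = divmod(n_events, n_jobs)
    ranges, start = [], 0
    for j in range(n_jobs):
        size = base + (1 if j < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

print(split_events(1000, 3))   # [(0, 333), (334, 666), (667, 999)]
```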

Grid map for the CMS experiment at the LHC. CMS in total: 1 Tier-0 at CERN (Geneva), 7 Tier-1s on 3 continents, 50 Tier-2s on 4 continents. The CMS T2 in India is one of the 5 in the Asia-Pacific region. Today there are 6 collaborating institutes in CMS from India, with ~50 scientists and students: 2.1% of the signing authors on publications, contributing ~3% of the computing resources of CMS.

CMS Tier-2 site at TIFR: T2_IN_TIFR. Current resources: 450 TB of storage, 400 worker nodes, internet bandwidth > 1 Gbps. Continuous monitoring is essential to provide reliable service and 24x7 availability. About 60 users/scientists at present, still growing. The Grid facility has been functional at TIFR for the last few years; the CMS collaboration at LHC, CERN has been using the computing resources at Mumbai mainly to perform event simulation and to store physics data. The Indian contribution is noted as a collective service to the experiment.

Grid connectivity within India. Network connections: 1 Gbps to CERN, peered to GEANT; 2.5 Gbps over NKN + TEIN3; VECC (India ALICE T2) and TIFR (IndiaCMS T2); 100 Mbps to VECC, RRCAT, IPR.

Data transfers from/to TIFR (upload and download). Total data volume at present: ~250 TB; total transfers during the last 6 months: ~70 TB. Current CMS total CPU pledge at the T2s: 18k job slots, with a nominal analysis pledge of 50%. Slot utilization during Summer/Fall 2009 was reasonable, but we need to go into sustained analysis mode. TIFR hosts (i) centrally managed data (simulated, custodial) and (ii) collision data skims.

[Slide: network traffic plot for August 15-18, 2011. Maximum: 1.5 Gbps; average: 1 Gbps.]

Statistics and plots. [Slide: site summary table, site ranking and site history plots.]

Conclusion. Front-ranking science and engineering requires massive amounts of computing, including huge data collection, storage, access to data, databases, etc. The LHC grid is the largest serving grid in the world, with 200 sites in 40 countries, equipped with tens of thousands of Linux servers and tens of petabytes of storage. Seamless and transparent access is enabled by grid technology, without compromising on security and convenience. Challenges for the younger generation: conservation of network bandwidth, or use on an on-demand basis, is a challenge; the technology is still young and immature; good tools are required; portability and scalability will likely be resolved by virtualization. YOU ARE WELCOME TO GET STARTED WITH GRID ISSUES!