CMS Tier-3 cluster at NISER. Dr. Tania Moulik




What and why? Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. Grids tend to be more loosely coupled, heterogeneous, and geographically dispersed than conventional clusters. Grids are often constructed with the aid of general-purpose grid software libraries known as middleware (gLite and Globus, in our case).

Grid vs Supercomputer
The primary advantage: each node can be purchased as commodity hardware, which, when combined, can produce a computing resource similar to a multiprocessor supercomputer, but at a lower cost.
The primary performance disadvantage: the various processors and local storage areas do not have high-speed connections.
A grid is therefore well-suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors. It can be costly and difficult to write programs that run in the environment of a supercomputer, which may have a custom operating system or require the program to address concurrency issues. If a problem can be adequately parallelized, a thin layer of grid infrastructure allows conventional, standalone programs, each given a different part of the same problem, to run on multiple machines. This makes it possible to write and debug on a single conventional machine, and eliminates complications due to multiple instances of the same program running in the same shared memory and storage space at the same time.
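
This "many independent jobs" pattern is exactly what a batch system such as Condor (used later in these slides) handles. Below is a minimal sketch of an HTCondor submit description that runs one standalone program over 100 different input chunks; the executable and file names are placeholders, not part of the original slides.

# sketch of an HTCondor submit file: 100 independent jobs, no communication between them
# (executable name and chunk naming scheme are placeholders)
universe    = vanilla
executable  = analyze_events
arguments   = chunk_$(Process).root
output      = analyze.$(Process).out
error       = analyze.$(Process).err
log         = analyze.log
queue 100

Submitting this file with condor_submit lets the scheduler spread the 100 jobs over whatever compute nodes are free, which is the loosely coupled workload a Tier-3 site is built for.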

Grid computing versus supercomputer: [slide diagram contrasting the tiered grid hierarchy (a Tier-1 centre feeding Tier-2 sites, which feed Tier-3 sites) with a supercomputer's Processor 1, Processor 2, and Processor 3 attached to a single computer bus.]

Why distributed computing? Huge amount of data: collisions every 25 ns (40 MHz); event data recording out of the High Level Trigger farm input buffer at 200-300 Hz (roughly 5 ms per event); 1 event = 2-3 MB, so 2-3 billion events amount to 2-6 petabytes of data per year. Many institutions distributed over many countries need fast transportation of data. Hence the 3-tiered system:
Tier-0: raw data is collected and stored and then shipped to the Tier-1 centers. Data transfer from the HLT to Tier-0 must happen in real time (~225 MB/s). Minimal data processing.
Tier-1: six Tier-1 centers distributed in various countries, where the data is shipped for "custodial storage" (a live copy of the data). Processing and re-processing of data as well.
Tier-2: pulls the processed data from Tier-1. Could do further reprocessing depending on physics goals.
Tier-3: reduced datasets for individual analyses. Compute nodes house CMS-specific software (CMSSW), batch submission, etc.
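
As a rough consistency check (assuming about 10^7 seconds of data taking per year, a figure not stated on the slide): 300 Hz x 10^7 s gives roughly 3 x 10^9 events, and at ~2 MB per event that is about 6 PB, the upper end of the 2-6 PB range quoted above.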

The CMS data distribution design. Tier-1 centers: ASCC (Taipei), CCIN2P3 (Lyon), FNAL (Chicago), GridKA (Karlsruhe), INFN-CNAF (Bologna), PIC (Barcelona), RAL (Oxford).

Hardware configuration
2U rack-mountable server with 4 hot-pluggable compute nodes
Processor: 2 x Intel quad-core 2.4/2.5 GHz with 12 MB cache
Chipset: Intel 5000P series chipset with 1333 MHz FSB
Memory: 32 GB ECC registered DDR3 1066 MHz DIMMs
Hard disk: 2 x 250 GB Serial ATA disk drives, 7200 RPM, hot swappable
Interconnect: 2 x 1 TB SATA2 10k hot-plug HDD
Network: at least 4 Gigabit Ethernet ports
Storage space: XX (to be decided) TB HDD-based RAID array
4-6 user nodes (desktops/laptops, to be decided)
Network: NKN (?), Airtel
Network switches, UPS, cooling units, etc.

A typical Tier-3 cluster
Shared file system: NFS server.
Condor batch queue: nodes in a batch queue managed with Condor.
Interactive nodes: the user client machine is an interactive node that allows users to log in and provides them with software to access Grid services.
Compute Element: shares your local resources with other grid users.
Storage Element: e.g. a RAID array providing high-performance data transfers over the grid.
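
For the shared file system, a minimal NFS setup on the server could look like the export below; the exported path matches the /nfs/osg area used later in these slides, but the client subnet and mount options are assumptions.

# /etc/exports on the NFS server (subnet is a placeholder)
/nfs/osg    192.168.1.0/24(rw,sync,no_root_squash)

Each compute and interactive node would then mount the same path, e.g. mount -t nfs nfsserver:/nfs/osg /nfs/osg, so that OSG software and user areas look identical across the cluster.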

Software configuration
Basic OS: Scientific Linux 5.4.
Cluster installation: using a cluster management toolkit, e.g. ROCKS, or using a Kickstart file (a sketch follows below).
Job scheduling: Condor.
Cluster monitoring and job monitoring: Ganglia.
GUMS: an optional Grid Identity Mapping Service that provides automatic capabilities for managing user lists, as well as enabling users to have additional "group" and "role" privileges.
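
As a sketch of the Kickstart route, an unattended Scientific Linux 5.4 install for a compute node could be driven by a file along these lines; every value here (mirror URL, password, timezone, partitioning) is a placeholder, not the actual NISER configuration.

# minimal Kickstart sketch for a Scientific Linux 5.4 compute node (all values are placeholders)
install
url --url http://mirror.example.org/scientific/54/x86_64/
lang en_US.UTF-8
keyboard us
network --bootproto dhcp
rootpw changeme
firewall --enabled
timezone Asia/Kolkata
clearpart --all --initlabel
autopart
reboot
%packages
@ base

Nodes booted over PXE with such a file install themselves identically, which keeps a many-node cluster consistent without hand configuration.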

User, compute and storage elements
Most of the OSG software is installed with the pacman package management system:
source /nfs/osg/app/pacman/setup.sh
Grid client software: software to access Grid services.
Worker node client:
source /nfs/osg/app/pacman/setup.sh
mkdir /nfs/osg/wn
ln -s /nfs/osg/wn /osg/wn
cd /osg/wn
pacman -get http://software.grid.iu.edu/osg-1.2:wn-client
ln -s /etc/grid-security/certificates /osg/wn/globus/trusted_ca
Storage element: a Storage Resource Manager (SRM) plus a GridFTP server. GridFTP is the standard way of moving large datasets across the grid and is required for data subscription models. OSG BeStMan-gateway is used to install SRM + GridFTP.
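
Once the storage element is running, a GridFTP transfer can be exercised with a standard client such as globus-url-copy; the hostname and paths below are placeholders, and a valid grid proxy (e.g. from voms-proxy-init) is needed first.

# copy a local file to the site storage element over GridFTP (host and paths are placeholders)
globus-url-copy -vb file:///home/user/data.root gsiftp://se.example.org:2811/store/user/data.root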

Tentative timeline
Acknowledgement: many thanks to the TIFR group experts (Mr. P.V. Deshpande).
Budget preparation for the Tier-3 cluster is ready; submit to India-CMS for approval after internal review and approval of the director.
Acquire hardware by early 2011.
Software installation by mid 2011.