
Distributed Computing for CEPC. YAN Tian, on behalf of the Distributed Computing Group, CC, IHEP. 4th CEPC Collaboration Meeting, Sep. 12-13, 2014. 1

Outline: Introduction; Experience of BES-DIRAC Distributed Computing; Distributed Computing for CEPC; Summary. 2

Part I INTRODUCTION 3

Distributed Computing. Distributed computing played an important role in the discovery of the Higgs boson. Large HEP experiments need a large amount of computing resources, which may not be affordable for a single institution or university. Distributed computing allows heterogeneous resources (cluster, grid, cloud, volunteer computing) and geographically distributed resources from the collaboration to be organized together. 4

DIRAC. DIRAC (Distributed Infrastructure with Remote Agent Control) provides a framework and solution for experiments to set up their own distributed computing systems. It is widely used by many HEP experiments.

DIRAC Users  CPU Cores  No. of Sites
LHCb         40,000     110
Belle II     12,000     34
CTA          5,000      24
ILC          3,000      36
BESIII       1,800      8
etc.
5
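
To illustrate what using DIRAC looks like in practice, the following is a minimal sketch of submitting a job through DIRAC's generic Python API (not BESIII- or CEPC-specific); exact module paths and method names can vary between DIRAC releases, and the payload here is a placeholder.

    # Minimal DIRAC job submission sketch (generic DIRAC Python API,
    # not experiment-specific; details may differ between DIRAC releases).
    from DIRAC.Core.Base import Script
    Script.parseCommandLine()                      # initialize the DIRAC environment

    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    job = Job()
    job.setName("hello_dirac")
    job.setExecutable("/bin/echo", arguments="hello from a DIRAC pilot")  # placeholder payload
    job.setCPUTime(3600)                           # requested CPU time in seconds

    dirac = Dirac()
    result = dirac.submitJob(job)                  # returns an S_OK/S_ERROR dictionary
    print(result)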

DIRAC User: LHCb, the first user of DIRAC: 110 sites, 40,000 CPU cores. 6

DIRAC User: Belle II: 34 sites, 12,000 CPU cores, with plans to enlarge to ~100,000 CPU cores. 7

Part II EXPERIENCE OF BES-DIRAC DISTRIBUTED COMPUTING 8

BES-DIRAC: Computing Model. [Diagram: the detector feeds the IHEP Data Center, which hosts DIRAC and the central SE (Storage Element); raw data, dst & randomtrg data, MC dst, and all dst flow between the center and the cloud, cluster, and grid sites, which use their local CPU and storage resources for MC production and analysis] 9

BES-DIRAC: Computing Resources List

 #  Contributors         CE Type          CPU Cores    SE Type   SE Capacity  Status
 1  IHEP                 Cluster + Cloud  144          dCache    214 TB       Active
 2  Univ. of CAS         Cluster          152          -         -            Active
 3  USTC                 Cluster          200 ~ 1280   dCache    24 TB        Active
 4  Peking Univ.         Cluster          100          -         -            Active
 5  Wuhan Univ.          Cluster          100 ~ 300    StoRM     39 TB        Active
 6  Univ. of Minnesota   Cluster          768          BeStMan   50 TB        Active
 7  JINR                 gLite + Cloud    100 ~ 200    dCache    8 TB         Active
 8  INFN & Torino Univ.  gLite + Cloud    264          StoRM     50 TB        Active
    Total (active)                        1828 ~ 3208            385 TB
 9  Shandong Univ.       Cluster          100          -         -            In progress
10  BUAA                 Cluster          256          -         -            In progress
11  SJTU                 Cluster          192          -         144 TB       In progress
    Total (in progress)                   548                    144 TB
10

BES-DIRAC: Official MC Production

 #  Time             Task                        BOSS Ver.  Total Events  Jobs     Data Output
 1  2013.9           J/psi inclusive (round 05)  6.6.4      900.0 M       32,533   5.679 TB
 2  2013.11~2014.01  psi(3770) (round 03,04)     6.6.4.p01  1352.3 M      69,904   9.611 TB
    Total                                                   2253.3 M      102,437  15.290 TB

About 1,350 jobs were kept running for one week in the 2nd batch (Dec. 7~15). A physics validation check of the 1st production was performed. [Plot: jobs running during the 2nd batch of the 2nd production]
11

BES-DIRAC: Data Transfer System. Developed on top of the DIRAC framework to support transfers of BESIII randomtrg data for remote MC production and BESIII dst data for remote analysis. Features: user subscription with central control; integration with the central file catalog; dataset-based transfers; multi-threaded transfers. It can also be used by other HEP experiments that need massive remote transfers. (A conceptual sketch of the transfer idea follows after this slide.) 12
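
The transfer system itself is custom code built on DIRAC, so the following is only a conceptual sketch of the "dataset-based, multi-threaded transfer" idea: a dataset's files are fanned out over a pool of worker threads. The helper names (list_dataset_files, transfer_one) and the LFN/SE names are invented for illustration and are not part of DIRAC or the BES-DIRAC transfer system.

    # Conceptual sketch only: dataset-based, multi-threaded transfers.
    # list_dataset_files() and transfer_one() are hypothetical placeholders.
    from concurrent.futures import ThreadPoolExecutor, as_completed

    def list_dataset_files(dataset_name):
        """Pretend to query a central file catalog for the LFNs of a dataset."""
        return [f"/bes/randomtrg/round07/file_{i:05d}.dst" for i in range(100)]

    def transfer_one(lfn, source_se, dest_se):
        """Pretend to replicate one file from source_se to dest_se."""
        # A real implementation would invoke the grid transfer tooling here.
        return lfn, True

    def transfer_dataset(dataset_name, source_se, dest_se, threads=10):
        lfns = list_dataset_files(dataset_name)
        ok, failed = 0, []
        with ThreadPoolExecutor(max_workers=threads) as pool:
            futures = [pool.submit(transfer_one, lfn, source_se, dest_se) for lfn in lfns]
            for fut in as_completed(futures):
                lfn, success = fut.result()
                ok += success
                if not success:
                    failed.append(lfn)
        print(f"{ok}/{len(lfns)} files transferred; {len(failed)} failed (to be retried)")

    transfer_dataset("randomtrg_round07", source_se="IHEP-USER", dest_se="USTC-USER")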

BES-DIRAC: Data Transfer System

Data transferred from March to July 2014, 85.9 TB in total:

Data Type            Data       Data Size  Source SE  Destination SE
DST                  xyz        24.5 TB    IHEP       USTC
                     psippscan  2.5 TB     IHEP       UMN
Random trigger data  round 02   1.9 TB     IHEP       USTC, WHU, UMN, JINR
                     round 03   2.8 TB     IHEP       USTC, WHU, UMN
                     round 04   3.1 TB     IHEP       USTC, WHU, UMN
                     round 05   3.6 TB     IHEP       USTC, WHU, UMN
                     round 06   4.4 TB     IHEP       USTC, WHU, UMN, JINR
                     round 07   5.2 TB     IHEP       USTC, WHU

High quality (> 99% one-time success rate) and high transfer speed (~1 Gbps to USTC, WHU, UMN; 300 Mbps to JINR):

Data           Source SE  Destination SE  Peak Speed  Average Speed
randomtrg r04  USTC, WHU  UMN             96 MB/s     76.6 MB/s (6.6 TB/day)
randomtrg r07  IHEP       USTC, WHU       191 MB/s    115.9 MB/s (10.0 TB/day)
13
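
As a quick sanity check of the quoted rates, the average speeds do correspond to the per-day volumes in the table above (using decimal units, 1 TB = 10^6 MB):

    # Convert the quoted average transfer speeds into TB/day (decimal units).
    SECONDS_PER_DAY = 86400

    for label, mb_per_s in [("IHEP -> USTC, WHU", 115.9), ("USTC, WHU -> UMN", 76.6)]:
        tb_per_day = mb_per_s * SECONDS_PER_DAY / 1e6  # MB -> TB
        print(f"{label}: {mb_per_s} MB/s ~= {tb_per_day:.1f} TB/day")
    # Prints roughly 10.0 and 6.6 TB/day, matching the table above.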

[Transfer monitoring plots: IHEP -> USTC, WHU at 10.0 TB/day with > 99% one-time success; USTC, WHU -> UMN at 6.6 TB/day] 14

Cloud Computing. Cloud is a new type of resource to be added to BESIII distributed computing. Advantages: it makes sharing resources among different experiments much easier; deployment and maintenance are easy for sites; a site can easily support different experiments' requirements (OS, software, libraries, etc.); users can freely choose whatever OS they need; and the computing environment is the same at all sites. Recent testing shows that cloud resources are usable for BESIII, and cloud resources have also been used successfully in CEPC testing. 15

Recent Testing for Cloud

Cloud resources for the test:

Site                      Cloud Manager  CPU Cores  Memory
CLOUD.IHEP-OPENSTACK.cn   OpenStack      24         48 GB
CLOUD.IHEP-OPENNEBULA.cn  OpenNebula     24         48 GB
CLOUD.CERN.ch             OpenStack      20         40 GB
CLOUD.TORINO.it           OpenNebula     60         58.5 GB
CLOUD.JINR.ru             OpenNebula     5          10 GB

913 test BOSS jobs (simulation + reconstruction of psi(4260) hadron decays, 5,000 events each), 100% successful. [Plots: test jobs running on cloud sites; execution time (sim, rec, download) compared across CLOUD.IHEP-OPENSTACK.cn, CLOUD.IHEP-OPENNEBULA.cn, CLOUD.TORINO.it, CLOUD.JINR.ru, BES.IHEP-PBS.cn, BES.UCAS.cn, BES.USTC.cn, BES.WHU.cn, BES.UMN.us, BES.JINR.ru]
16

Part III DISTRIBUTED COMPUTING FOR CEPC 17

A Test Bed Established. [Diagram of software deployment and job flow: *.stdhep input data and *.slcio output data; IHEP Lustre; BES-DIRAC servers (CEPC software installed here); CVMFS server; IHEP DB and DB mirror; WHU SE; sites: BUAA (OS: SL 5.8, remote), WHU (OS: SL 6.4, remote), IHEP PBS (OS: SL 5.5), and IHEP cloud, on IHEP local resources] 18

Computing Resources & Software Deployment

Resources list of this test bed:

Contributors  CPU Cores  Storage
IHEP          144        -
WHU           100        20 TB
BUAA          20         -
Total         264        20 TB

264 CPU cores, shared with BESIII, and 20 TB of dedicated SE capacity: enough for tests, but not for production. CEPC detector simulation needs about 100k CPU days every year, so we need more contributors! (A rough estimate of this gap follows after this slide.)

CEPC software is deployed with CVMFS (CERN Virtual Machine File System), a network file system based on HTTP, optimized to deliver experiment software: the software is hosted on a web server, and the client side loads data only on access. CVMFS is also used in BESIII distributed computing. [Diagram: CVMFS server with repositories, web proxy, and worker-node cache; data loaded only on access]
19
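
A rough, back-of-the-envelope estimate of the resource gap mentioned above, assuming (optimistically) that the 264 test-bed cores were dedicated to CEPC full-time:

    # Back-of-the-envelope check of test-bed capacity vs. the stated CEPC need.
    cores = 264                 # total CPU cores on the test bed (shared with BESIII)
    days_per_year = 365
    needed_cpu_days = 100_000   # CEPC detector simulation need quoted on the slide

    available_cpu_days = cores * days_per_year          # ~96,000 CPU days
    print(f"available if fully dedicated: {available_cpu_days} CPU days/year")
    print(f"needed for simulation alone:  {needed_cpu_days} CPU days/year")
    # Even 100% dedicated use barely covers the simulation need, hence the call
    # for more contributing sites.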

CEPC Testing Job Workflow. Submitting a test job step by step: (1) upload input data to the SE; (2) prepare job.sh; (3) prepare a JDL file, job.jdl; (4) submit the job to DIRAC; (5) monitor the job status in the web portal; (6) download the output data to Lustre. For user jobs, a frontend will need to be developed in the future to hide these details, so that users only need to provide a few configuration parameters to submit jobs. (A sketch of these steps through the DIRAC API follows after this slide.) 20
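
As a sketch of these six steps, the same workflow can be expressed through DIRAC's generic Python API instead of a hand-written job.jdl. The LFNs, SE name, and file names below are placeholders, and method names may differ slightly between DIRAC releases.

    # Sketch of the six workflow steps via DIRAC's Python API (placeholders marked).
    from DIRAC.Core.Base import Script
    Script.parseCommandLine()                  # initialize the DIRAC environment

    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    dirac = Dirac()

    # (1) upload input data to an SE (placeholder LFN and SE name)
    dirac.addFile("/cepc/user/test/nnh_input.stdhep", "nnh_input.stdhep", "WHU-USER")

    # (2)-(3) describe the job; the API generates the JDL that job.jdl would contain
    job = Job()
    job.setName("cepc_nnh_test")
    job.setExecutable("job.sh")                # wrapper running full sim. + rec.
    job.setInputSandbox(["job.sh"])
    job.setInputData(["/cepc/user/test/nnh_input.stdhep"])
    job.setOutputData(["nnh_output.slcio"], outputSE="WHU-USER")

    # (4) submit the job to DIRAC
    result = dirac.submitJob(job)
    print("submitted job:", result.get("Value"))

    # (5) the job status can then be followed in the DIRAC web portal

    # (6) once finished, download the output data (placeholder LFN) to local storage
    dirac.getFile("/cepc/user/test/nnh_output.slcio")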

Testing Jobs Statistics (1/4): 3,063 jobs; process: nnh; 1,000 events per job; full simulation + reconstruction. 21

Testing Jobs Statistics (2/4): 2 cluster sites (IHEP-PBS, WHU) and 2 cloud sites (IHEP OpenStack, IHEP OpenNebula). 22

Testing Jobs Statistics (3/4): 96.8% of jobs succeeded; 3.2% stalled because of a PBS node going down and network maintenance. 23

Testing Jobs Statistics (4/4): 3.59 TB of output data uploaded to the WHU SE; about 1.1 GB of output per job, larger than a typical BESIII job. 24

To Do List: further physics validation on the current test bed; deploy a remote mirror of the MySQL database; develop frontend tools for physics users to handle massive job splitting, submission, monitoring, and data management (a conceptual sketch follows after this slide); provide multi-VO support to manage BESIII & CEPC shared resources if needed; support user analysis. 25
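
As a purely hypothetical illustration of what such a job-splitting frontend could look like, the sketch below divides a large simulation request into fixed-size chunks and submits one DIRAC job per chunk; the make_job helper, its arguments, and the chunking parameters are invented and not an existing tool.

    # Hypothetical sketch of a job-splitting frontend (not an existing tool).
    from DIRAC.Core.Base import Script
    Script.parseCommandLine()

    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    def make_job(process, first_event, n_events):
        """Build one DIRAC job for a chunk of events (arguments are illustrative)."""
        job = Job()
        job.setName(f"cepc_{process}_{first_event}")
        job.setExecutable("job.sh", arguments=f"{process} {first_event} {n_events}")
        job.setInputSandbox(["job.sh"])
        return job

    def submit_split(process, total_events, events_per_job=1000):
        dirac = Dirac()
        ids = []
        for first in range(0, total_events, events_per_job):
            n = min(events_per_job, total_events - first)
            result = dirac.submitJob(make_job(process, first, n))
            ids.append(result.get("Value"))
        return ids

    job_ids = submit_split("nnh", total_events=100_000, events_per_job=1000)
    print(f"submitted {len(job_ids)} jobs")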

Summary. BESIII distributed computing has become a supplement to BESIII computing. CEPC simulation has been run successfully on the CEPC-DIRAC test bed. These successful tests show that distributed computing can contribute resources to CEPC computing in the early stage and beyond. 26

Thanks Thank you for your attention! Q & A 27