(Possible) HEP Use Case for NDN. Phil DeMar; Wenji Wu. NDNComm (UCLA), Sept. 28, 2015




Outline: LHC Experiments; LHC Computing Models; CMS Data Federation & AAA; Evolving Computing Models & NDN; Summary.

Large Hadron Collider (LHC) 101. Circumference: ~17 miles. Two proton beams circulate at 99.9999991% of the speed of light; the beams cross and are brought to collision at 4 points, where the experiments (ATLAS, CMS, ALICE, LHCb) are built.

Compact Muon Solenoid (CMS) Experiment. The CMS detector is built around a collision point and records the flight path and energy of all particles produced in a collision, using ~100 million individual measurements (channels). All the measurements of a collision together are called an event.

LHC schedule [timeline figure, M. Girone (CERN)]: Run 1 (trigger rate ~500 Hz, Higgs discovered) through Runs 2-3 (trigger rate ~1 kHz), separated by long shutdowns (LS1-LS5), to the HL-LHC era of Runs 4-6 (trigger rate ~7.5 kHz). "You are here" marks the start of Run 2.

Projected LHC data volumes [figure, M. Girone (CERN)]. Raw data = data generated by the detector(s); derived data = reconstructed data, simulation data, summary data sets, etc. Derived data volumes ~= raw data volumes x 8, pushing the LHC into the exabyte era.

CMS Collaboration: 186 globally distributed institutions. High-bandwidth R&E networks support experiment data movement.

LHC Computing Models

Computing Lifecycle: the CMS tier structure for computing (MONARC): Tier 0 = CERN; Tier 1 = national data centers for event reconstruction & archiving; Tier 2 = computing facilities for Monte Carlo production & event analysis; Tier 3 = collaboration sites; Tier 4 = physicist desktops. [O. Gutsche (FNAL)]

CMS Computing GRID infrastructure: CERN (T0) at the center; 7 Tier-1 centers (UK, Germany, Italy, Spain, France, Russia, USA/FNAL) connected to the T0 by a dedicated optical private network (LHCOPN); 54 Tier-2 facilities connected to the T1s by general-purpose R&E networks (GPN). Aggregate resources: ~120,000 cores, ~75 PB disk, ~100 PB tape. [O. Gutsche (FNAL)]

Tier Model for Data Movement: the MONARC hierarchical model has been abandoned. It was based on the expectation of low bandwidth & modest storage at Tier-2 sites; CMS abandoned MONARC before the LHC even started, and ATLAS followed suit during Run I. Now any CMS T1/T2 site can be used as a data source, which encouraged more flexible data placement & replication and enabled more efficient utilization of available resources. [T. Wenaus (BNL)]

CMS Data Federation & AAA

Data Federation - XrootD. LHC experiments have implemented federated data storage, made possible by high-bandwidth WAN connectivity across all tiers and global data namespace(s). The federation is based on XrootD: it hides the local file storage systems (dCache, Lustre, Hadoop, etc.); it is hierarchical, with regional, global, & local redirectors; it maintains a catalog of known file locations (with a negative cache as well); and it uses tree-walk redirects to locate a file.
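To make the tree-walk idea concrete, here is a minimal sketch (in Python, not part of the original slides) of how a hierarchy of redirectors might locate a file; the class and method names are illustrative assumptions, not XrootD's actual API.

# Illustrative sketch only: a toy redirector hierarchy with tree-walk
# lookup and a negative cache, loosely modeled on the XrootD federation
# described above. All names are assumptions made for clarity.
class Redirector:
    def __init__(self, name, sites=None, children=None):
        self.name = name
        self.sites = sites or {}        # logical file name -> storage URL at this level
        self.children = children or []  # lower-level (regional/site) redirectors
        self.negative_cache = set()     # LFNs known NOT to exist below this redirector

    def locate(self, lfn):
        """Return a storage URL for lfn, or None if not found in this subtree."""
        if lfn in self.negative_cache:
            return None                 # skip subtrees that already answered "no"
        if lfn in self.sites:
            return self.sites[lfn]
        for child in self.children:     # tree-walk: ask each lower redirector in turn
            url = child.locate(lfn)
            if url is not None:
                return url
        self.negative_cache.add(lfn)    # remember the miss for future queries
        return None

# Example: global -> regional -> site redirectors (all names hypothetical)
site = Redirector("T2_Example",
                  sites={"/store/data/example.root": "root://se.example.org//store/data/example.root"})
regional = Redirector("regional-eu", children=[site])
global_rd = Redirector("global", children=[regional])
print(global_rd.locate("/store/data/example.root"))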

Any Data, Any Time, Anywhere (AAA). AAA is CMS's implementation of federated storage: it is based on XrootD, finds data based on the logical file name, and transfers the data to the application. The high-level philosophy is that remote storage ~= local storage; in practice, CPU efficiency is slightly lower with remote data. AAA is principally driven by (macro) economics: it maximizes the efficiency of collaboration computing resources and provides fallback data access & overflow job redistribution capabilities. A few numbers: nearly all (95%+) of CMS data is available via AAA; the projection is that 20%+ of CMS Run II data access will go through AAA; local storage access is not through AAA.
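As an illustration of "finds data based on the logical file name" (not from the slides): with PyROOT, an application can open a logical file name through an XrootD redirector URL. The redirector hostname and /store/... path below are placeholders, not actual CMS endpoints.

# Sketch: accessing a file through the AAA federation from an analysis job.
# The redirector hostname and the /store/... path are hypothetical placeholders.
import ROOT

lfn = "/store/data/Run2015/Example/example.root"          # hypothetical logical file name
url = "root://xrootd-redirector.example.org/" + lfn        # redirector resolves the LFN to a site

f = ROOT.TFile.Open(url)       # ROOT streams the file over the WAN via XrootD
if f and not f.IsZombie():
    f.ls()                     # inspect the file contents as if it were local
    f.Close()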

AAA's Two-domain Federation: a Production domain for AAA performance-certified sites and a Transition domain for sites not meeting the performance standards. All CMS T1s and most T2s are now Production-certified; T3s and non-qualifying T2s sit behind the Transitional redirectors. A global redirect into the Transitional domain happens only after the Production-domain tree-walk fails to locate the file. [M. Girone (CERN)]
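A minimal sketch (assumed structure, not CMS's actual configuration) of the lookup order implied by the two-domain federation: the production domain is tree-walked first, and only a miss there triggers the global redirect into the transitional domain.

# Toy model of the two-domain lookup order described above; all names are
# illustrative. Each "domain" is reduced to a dict of LFN -> storage URL.
def federated_locate(lfn, production_domain, transitional_domain):
    # 1. Tree-walk the production (performance-certified) domain first.
    url = production_domain.get(lfn)
    if url is not None:
        return url
    # 2. Only after a production-domain miss does the global redirector
    #    fall back to the transitional (T3s & non-qualifying sites) domain.
    return transitional_domain.get(lfn)

production = {"/store/data/a.root": "root://t1-site.example//store/data/a.root"}
transitional = {"/store/data/b.root": "root://t3-site.example//store/data/b.root"}
print(federated_locate("/store/data/b.root", production, transitional))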

AAA Fallback Mode: when a job is unable to access its local data, the AAA fallback capability locates a remote copy of the data, and the job is able to complete. This is also useful for redirecting jobs to other sites in overflow situations. A real-life example: a DB error resulted in missing local data at FNAL; AAA failover located a replica at CNAF (Italy), and jobs ran for 2 days using the CNAF data without anyone noticing.
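A sketch of the fallback pattern (my assumption, not the actual CMSSW configuration): the job tries its local storage first and only falls back to a federation-wide redirector URL when the local open fails; hostnames and paths are placeholders.

# Illustrative fallback logic for the AAA failover described above.
# Local and global URL prefixes are hypothetical placeholders.
import ROOT

def open_with_fallback(lfn):
    local_url = "root://local-se.example.org/" + lfn             # site-local storage
    global_url = "root://global-redirector.example.org/" + lfn   # AAA federation
    f = ROOT.TFile.Open(local_url)
    if not f or f.IsZombie():
        # Local data missing or unreadable: fall back to a remote replica
        # located through the federation, as in the FNAL -> CNAF example.
        f = ROOT.TFile.Open(global_url)
    return f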

Evolving Computing Models & NDN

Additional Trends in the CMS Computing Model. Dynamic data placement (ALICE/ATLAS): distributing and redistributing (abbreviated) data sets by popularity, a subset of the larger trend toward dynamic data management in general. Cloud & High Performance Computing (HPC) cycles: Amazon Web Services spot CPU cycles are already highly economical, and next-generation supercomputers will have massive computing power. [M. Ernst (BNL)]
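For illustration only (not any experiment's actual algorithm), a minimal popularity-based placement rule might look like the following; the thresholds and counters are assumptions.

# Toy popularity-driven replication: add replicas of hot data sets,
# reclaim replicas of cold ones. Thresholds are illustrative assumptions.
def plan_replication(access_counts, replicas, hot_threshold=1000, cold_threshold=10, max_replicas=5):
    actions = []
    for dataset, accesses in access_counts.items():
        n = replicas.get(dataset, 1)
        if accesses > hot_threshold and n < max_replicas:
            actions.append(("add_replica", dataset))      # popular: spread it out
        elif accesses < cold_threshold and n > 1:
            actions.append(("remove_replica", dataset))   # unpopular: free up disk
    return actions

counts = {"/dataset/hot": 5000, "/dataset/cold": 2}
print(plan_replication(counts, {"/dataset/hot": 2, "/dataset/cold": 3}))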

CMS Computing (today) vs NDN. (Warning!!! My interpretation only! Subject to large error bars on both ends.)
Namespace: CMS (today) = global logical file names; NDN = hierarchical data name space.
Content-based data retrieval: CMS (today) = middleware service; NDN = basic network service.
Routing optimization: CMS (today) = some architectural & middleware optimizations; NDN = basic network service.
Caching optimizations: CMS (today) = middleware optimizations, Open Science Grid StashCache (middleware) [?]; NDN = basic network service (?) (not clear how this would work with LHC-scale data volumes).
Scalable repository: NDN = Repo-Se (?).
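To illustrate the namespace row above (my sketch, not part of the original comparison): a CMS global logical file name is already hierarchical, so a straightforward, assumed mapping to NDN name components could reuse its path structure. Both the example LFN and the /ndn/cms prefix are hypothetical.

# Illustrative only: map a CMS logical file name (LFN) to an NDN-style
# hierarchical name. The LFN and the /ndn/cms prefix are assumptions.
def lfn_to_ndn_name(lfn, prefix="/ndn/cms"):
    components = [c for c in lfn.split("/") if c]   # reuse the LFN's own hierarchy
    return prefix + "/" + "/".join(components)

lfn = "/store/data/Run2015/SingleMuon/AOD/file0001.root"   # hypothetical LFN
print(lfn_to_ndn_name(lfn))
# -> /ndn/cms/store/data/Run2015/SingleMuon/AOD/file0001.root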

But Don't Confuse Us with NetFlix. NetFlix delivers streaming video content to ~20M users and is regarded as the largest content provider of internet traffic. CMS has a much smaller user base and generates only a fraction of NetFlix's traffic, but the aggregate amount of CMS data is 1000x NetFlix's. NetFlix deals with a much smaller amount of data, which is far easier to replicate or cache efficiently.
Users: NetFlix ~20M; CMS ~100K.
Total data: NetFlix ~20 TB; CMS ~20 PB.
[O. Gutsche (FNAL)]

NDN Activities in High Energy Physics (HEP). The Climate Data Sciences NDN test bed (C. Papadopoulos, et al.) has ties with the HEP community, and the Caltech Network Research group (H. Newman) is involved. Imperial College London (D. Rand, et al.) is evaluating NDN in a local test bed, at the application level (ROOT) and at the repository level. Caltech & FNAL have been funded to create a small NDN test bed for CMS application evaluations.

Summary. LHC experiments are heading toward exascale data volumes; terabit networks will be needed to handle that data. LHC computing models are becoming increasingly distributed in nature, for both data storage & CPU, which creates greater demands on network services beyond bandwidth. LHC computing is already implementing content-based data services at the middleware level. There seems to be a natural fit for NDN with LHC computing; performance optimizations within the exascale-data / terabit-network environment will be key.

Questions? (Phil DeMar, 9/28/2015)