Accelerating Experimental Elementary Particle Physics with the Gordon Supercomputer. Frank Würthwein, Rick Wagner. August 5th, 2013

The Universe is a strange place! About 67% of its energy content is dark energy; we have no clue what this is. About 29% is dark matter; we have some ideas, but no proof of what it is! All of what we do understand makes up only about 4% of the universe.

fkw's research is focused on the Higgs and Dark Matter. We have delivered the Higgs. Now it's time to search for Dark Matter.

Experimental Particle Physics: the Big Bang in the laboratory. We gain insight by colliding particles at the highest energies possible to measure production rates, masses & lifetimes, and decay rates. From these we derive both the spectroscopy and the dynamics of elementary particles. Progress is made by going to higher energies and/or brighter beams: higher energies get us closer to the Big Bang, while brighter beams allow the study of rare phenomena.

To study Dark Matter we need to create it in the laboratory. [Aerial view of the LHC site, with CERN, Lake Geneva, and the CMS detector marked.]

The Large Hadron Collider

The CMS Experiment

The CMS Experiment
80 million electronic channels x 4 bytes x 40 MHz ≈ 10 Petabytes/sec of information
x 1/1000 zero suppression
x 1/100,000 online event filtering
≈ 100-1000 Megabytes/sec of raw data to tape, i.e. 1 to 10 Petabytes of raw data per year
2000 scientists (1200 with a Ph.D. in physics), ~180 institutions, ~40 countries
Detector: 12,500 tons, 21 m long, 16 m diameter
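To make the reduction chain above concrete, here is a small back-of-the-envelope check in Python (not part of the original slides). The channel count, sample size, crossing rate, and reduction factors come from the slide; the number of live LHC seconds per year used for the annual estimate is an illustrative assumption.

# Back-of-the-envelope check of the data-reduction chain quoted on the slide above.
# Inputs come from the slide; live_seconds_per_year is an illustrative assumption.
channels = 80e6          # electronic channels
bytes_per_sample = 4     # bytes per channel per readout
crossing_rate = 40e6     # bunch crossings per second (40 MHz)

raw_rate = channels * bytes_per_sample * crossing_rate          # bytes/sec
print(f"raw detector output: ~{raw_rate / 1e15:.0f} PB/s")      # ~13 PB/s, i.e. order 10 PB/s

to_tape = raw_rate / 1_000 / 100_000                            # zero suppression, then online filtering
print(f"to tape: ~{to_tape / 1e6:.0f} MB/s")                    # ~128 MB/s, within 100-1000 MB/s

live_seconds_per_year = 1e7                                     # assumption: order of LHC live time
print(f"per year: ~{to_tape * live_seconds_per_year / 1e15:.1f} PB")  # ~1.3 PB, within 1-10 PB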

The Challenge: How do we organize the processing of 10s to 1000s of Petabytes of data by a globally distributed community of scientists, and do so with manageable change costs for the next 20 years?
The Solution: Choose technical solutions that allow computing resources to be as distributed as the human resources. Support distributed ownership and control within a global single-sign-on security context. Design for heterogeneity and adaptability.

The CMS global processing infrastructure depends on a federation of regional infrastructures:
Tier-1: archive & primary processing
Tier-2: simulation & science data analysis

The Open Science Grid: a consortium of universities and national labs that share resources and technologies to advance science. It is open to all of science, including biology, chemistry, computer science, engineering, mathematics, medicine, and physics, and it is the backbone of CMS processing in the US.

Vision going forward: this vision was implemented for the first time in Spring 2013 using the Gordon Supercomputer at SDSC.

Using Gordon to Accelerate LHC Science

Contributors: Brian Bockelman (UNL), Igor Sfiligoi (UCSD), Matevz Tadel (UCSD), James Letts (UCSD), Frank Würthwein (UCSD), Lothar A. Bauerdick (FNAL), Rick Wagner, Mahidhar Tatineni, Eva Hocks, Kenneth Yoshimoto, Scott Sakai, Michael L. Norman

When Grids Collide

Overview: 2012 LHC data collection rates were higher than first planned (1000 Hz vs. 150 Hz). The additional data was "parked", to be reduced during the two-year shutdown, which delays the science from that data until the end.

Linking the Grids: CMS Components
CMSSW: base software components, NFS-exported from the IO node
OSG worker node client: CA certificates, CRLs
Squid proxy: caches the calibration data needed for each job, running on the IO node
glideinWMS: worker node manager that pulls down CMS jobs
BOSCO: GSI-SSH capable batch job submission tool
PhEDEx: data transfer management
Tying the grids together: GSI authentication, GridFTP, SSH, a lot of shared knowledge, and common tools and connectors.
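To illustrate how these components meet on a compute node, here is a minimal, hypothetical preflight-check sketch in Python. The mount path, Squid host and port, and the use of voms-proxy-info to test the GSI proxy are assumptions made for the example; the actual integration performed this kind of validation through the glideinWMS and BOSCO tooling itself.

# Hypothetical preflight check for a CMS job landing on a Gordon compute node.
# All paths, host names, and ports are illustrative assumptions, not the real setup.
import os
import subprocess
import urllib.error
import urllib.request

CMSSW_BASE = "/oasis/projects/cmssw"     # assumed NFS export of CMSSW from the IO node
SQUID_URL = "http://io-node.local:3128"  # assumed Squid cache (3128 is the conventional Squid port)

def cmssw_visible() -> bool:
    # The CMSSW release area must be mounted on every compute node.
    return os.path.isdir(CMSSW_BASE)

def squid_reachable() -> bool:
    # Calibration requests go through the Squid cache on the IO node;
    # any HTTP response at all means the cache is up.
    try:
        urllib.request.urlopen(SQUID_URL, timeout=5)
        return True
    except urllib.error.HTTPError:
        return True
    except OSError:
        return False

def grid_proxy_valid() -> bool:
    # Jobs authenticate to CMS services via GSI; voms-proxy-info is assumed
    # to exit non-zero when no valid proxy is present.
    try:
        result = subprocess.run(["voms-proxy-info"], capture_output=True)
    except FileNotFoundError:
        return False
    return result.returncode == 0

if __name__ == "__main__":
    checks = [("CMSSW release area", cmssw_visible),
              ("Squid proxy", squid_reachable),
              ("GSI proxy", grid_proxy_valid)]
    for name, check in checks:
        print(f"{name}: {'OK' if check() else 'MISSING'}")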

Results: The work was completed between February and March 2013. 400 million collision events were processed, with 125 TB of input and ~150 TB of output, using ~2 million SUs. It was a good experience regarding OSG-XSEDE compatibility.
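For a rough sense of scale, the totals above can be turned into per-event figures; the short sketch below does the arithmetic, treating one SU as roughly one core-hour on Gordon (an assumption about the accounting).

# Per-event figures derived from the results above.
# Assumption: 1 SU corresponds to roughly one core-hour on Gordon.
events = 400e6
input_tb, output_tb = 125.0, 150.0
service_units = 2e6

core_seconds_per_event = service_units * 3600 / events
kb_in_per_event = input_tb * 1e9 / events    # 1 TB = 1e9 kB
kb_out_per_event = output_tb * 1e9 / events

print(f"~{core_seconds_per_event:.0f} core-seconds per event")                    # ~18
print(f"~{kb_in_per_event:.0f} kB in, ~{kb_out_per_event:.0f} kB out per event")  # ~313 in, ~375 out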