Accelerating Experimental Elementary Particle Physics with the Gordon Supercomputer
Frank Würthwein, Rick Wagner
August 5th, 2013
The Universe is a strange place! About 67% of its energy content is dark energy, and we have no clue what this is. Another 29% is dark matter; we have some ideas, but no proof of what it is! Everything we do understand makes up only about 4% of the universe.
Fkw's research is focused on the Higgs and Dark Matter. We have delivered the Higgs. Now it's time to search for Dark Matter.
Experimental Particle Physics: the Big Bang in the laboratory. We gain insight by colliding particles at the highest energies possible to measure production rates, masses & lifetimes, and decay rates. From this we derive the spectroscopy as well as the dynamics of elementary particles. Progress is made by going to higher energies and/or brighter beams: higher energies get us closer to the Big Bang, and brighter beams allow the study of rare phenomena.
To study Dark Matter we need to create it in the laboratory. [Aerial view of the LHC ring: CMS, Lake Geneva, CERN]
The Large Hadron Collider
The CMS Experiment
The CMS Experiment
80 million electronic channels x 4 bytes x 40 MHz
  ~ 10 Petabytes/sec of information
x 1/1,000 (zero suppression)
x 1/100,000 (online event filtering)
  ~ 100-1000 Megabytes/sec of raw data to tape
  = 1 to 10 Petabytes of raw data per year
2000 scientists (1200 Ph.D. in physics), ~180 institutions, ~40 countries
Detector: 12,500 tons, 21 m long, 16 m diameter
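For readers who want to check the arithmetic, here is a minimal Python sketch that reproduces the reduction chain using only the numbers on this slide; the ~10^7 seconds of live time per year is an assumption, not a figure from the slide.

```python
# Back-of-envelope check of the CMS data-reduction chain quoted above.
# All rates and factors come from the slide; this only does the arithmetic.

channels = 80e6          # electronic channels
bytes_per_channel = 4    # bytes read out per channel
collision_rate = 40e6    # Hz (LHC bunch-crossing rate)

raw_rate = channels * bytes_per_channel * collision_rate          # bytes/sec
print(f"Raw information rate: {raw_rate / 1e15:.1f} PB/s")        # ~12.8 PB/s, i.e. ~10 PB/s

after_zero_suppression = raw_rate / 1_000                         # keep ~1/1,000
after_online_filter = after_zero_suppression / 100_000            # keep ~1/100,000 of events
print(f"Rate to tape: {after_online_filter / 1e6:.0f} MB/s")      # ~128 MB/s, within 100-1000 MB/s

seconds_per_year = 1e7   # rough LHC live time per year (assumption, not from the slide)
print(f"Raw data per year: {after_online_filter * seconds_per_year / 1e15:.1f} PB")
```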
The Challenge: How do we organize the processing of tens to thousands of Petabytes of data by a globally distributed community of scientists, and do so with manageable change costs for the next 20 years? Solution to the Challenge: Choose technical solutions that allow computing resources to be as distributed as the human resources. Support distributed ownership and control within a global single sign-on security context. Design for heterogeneity and adaptability.
The CMS global processing infrastructure depends on a federation of regional infrastructures: Tier-1 centers handle archiving & primary processing; Tier-2 centers handle simulation & science data analysis.
The Open Science Grid: a consortium of universities and national labs that share resources and technologies to advance science. It is open to all of science, including biology, chemistry, computer science, engineering, mathematics, medicine, and physics, and is the backbone of CMS processing in the US.
Vision going forward: this vision was implemented for the first time in Spring 2013, using the Gordon Supercomputer at SDSC.
Using Gordon to Accelerate LHC Science
Contributors: Brian Bockelman (UNL), Igor Sfiligoi (UCSD), Matevz Tadel (UCSD), James Letts (UCSD), Frank Würthwein (UCSD), Lothar A. Bauerdick (FNAL), Rick Wagner, Mahidhar Tatineni, Eva Hocks, Kenneth Yoshimoto, Scott Sakai, Michael L. Norman
When Grids Collide
Overview: 2012 LHC data collection rates were higher than first planned (1000 Hz vs. 150 Hz). The additional data was "parked," to be reduced during the two-year shutdown, which delays the science from that data until the end of the shutdown.
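For a rough sense of the backlog, a tiny calculation using only the two rates quoted above, assuming they applied uniformly over the 2012 run (an assumption, not something the slide states):

```python
# What fraction of the recorded 2012 events ended up "parked"?

planned_rate = 150.0    # Hz: event rate originally planned for prompt processing
recorded_rate = 1000.0  # Hz: event rate actually collected in 2012

parked_fraction = (recorded_rate - planned_rate) / recorded_rate
print(f"Fraction of recorded events parked for later processing: {parked_fraction:.0%}")  # ~85%
```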
Linking the Grids: CMS components
CMSSW: base software components, NFS-exported from the I/O node
OSG worker node client: CA certificates and CRLs
Squid proxy: caches the calibration data needed by each job, running on the I/O node
glideinWMS: worker node manager that pulls down CMS jobs
BOSCO: GSI-SSH capable batch job submission tool (see the sketch below)
PhEDEx: data transfer management
Common tools and connectors tie the two grids together: GSI authentication, GridFTP, SSH, and a lot of shared knowledge.
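To make the glideinWMS/BOSCO hand-off concrete, here is an illustrative sketch of how a pilot job could be routed to a BOSCO-registered PBS resource through HTCondor's grid universe. The account, host name, and pilot script are hypothetical placeholders, and the production setup described above layered glideinWMS on top of BOSCO rather than submitting pilots by hand.

```python
#!/usr/bin/env python
# Illustrative sketch only: routing a pilot job to a BOSCO-registered PBS
# resource through HTCondor's grid universe. The account, host, and script
# names below are hypothetical placeholders.

import subprocess
import textwrap

remote = "cmsuser@gordon.sdsc.edu"  # assumption: cluster added beforehand with `bosco_cluster --add`
pilot_script = "glidein_pilot.sh"   # assumption: wrapper that bootstraps the glideinWMS pilot

submit_description = textwrap.dedent(f"""\
    universe      = grid
    grid_resource = batch pbs {remote}
    executable    = {pilot_script}
    output        = pilot.$(Cluster).out
    error         = pilot.$(Cluster).err
    log           = pilot.$(Cluster).log
    queue 1
    """)

with open("pilot.submit", "w") as handle:
    handle.write(submit_description)

# condor_submit ships with HTCondor (and the BOSCO bundle); this hands the
# description above to the local scheduler, which forwards the job to the
# remote batch system over (GSI-)SSH.
subprocess.check_call(["condor_submit", "pilot.submit"])
```

Once a pilot like this starts on a compute node, it is the glideinWMS layer that pulls down the actual CMS jobs, as described in the component list above.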
Results: Work was completed in February and March 2013: 400 million collision events processed, 125 TB of data in and ~150 TB out, using ~2 million SUs. Overall, a good experience regarding OSG-XSEDE compatibility.
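A back-of-envelope conversion of these totals into per-event numbers, assuming one SU corresponds to roughly one core-hour (an assumption about Gordon's accounting, not stated on the slide):

```python
# Per-event metrics derived from the totals quoted above.

events = 400e6        # collision events processed
data_in_tb = 125.0    # TB read
service_units = 2e6   # SUs charged; assumed ~1 core-hour each

print(f"Average input event size: {data_in_tb * 1e12 / events / 1e3:.0f} kB/event")  # ~313 kB
print(f"CPU per event: {service_units * 3600 / events:.0f} core-seconds/event")      # ~18 s
```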