Linux and the Higgs Particle
Dr. Bernd Panzer-Steindel, Computing Fabric Area Manager, CERN/IT
Linux World, Frankfurt, 27 October 2004
Outline
What is CERN
The Physics
The Physics Tools: The Accelerator, The Detectors
The Computing Tools: The Local Computing Fabric, The World Wide GRID
The Institute CERN
CERN: Conseil Européen pour la Recherche Nucléaire, the European Organisation for Particle Physics. A basic research laboratory and the world's largest particle physics centre. Founded in 1954, celebrating its 50th anniversary this year! Located astride the French-Swiss border near Geneva (Switzerland). 2700 staff members and fellows plus ~6500 visitors on site. ~1000 MCHF (~700 MEuro) annual budget.
CERN has some 6,500 visiting scientists from more than 500 institutes and 80 countries around the world. Europe: 267 institutes, 4663 users. Elsewhere: 238 institutes, 1832 users.
The Physics
Particle Physics: establish a periodic system of the fundamental building blocks of matter and understand the forces between them.
The Standard Model of particle physics: the unification of three of the four fundamental interactions. A great success, experimentally verified to a precision of 0.1%, driven by the constant interplay of theory and experiment. But it has too many free input parameters, and it makes nonsensical predictions at very high energies.
The Higgs Particle: the inclusion of the Higgs mechanism in the Standard Model fixes quite a few problems. The vacuum is not empty, but is filled with a Higgs particle condensate. All particles collide with the Higgs particles as they move through the vacuum. This acts like molasses, slows the particles down and gives them mass. It is one of the key elements in extending the Standard Model.
Open Questions
Why do the parameters have the values we observe?
What gives the particles their masses?
How can gravity be integrated into a unified theory?
Why is there only matter and no anti-matter in the universe?
Are there more space-time dimensions than the four we know of?
What are dark energy and dark matter, which make up 98% of the universe?
Finding the Higgs, and possibly new physics, with the LHC will give the answers!
The Physics Tools 1. The Accelerator
Methods of Particle Physics The most powerful microscope Creating conditions similar to the Big Bang
The principal accelerator machine components
The Large Hadron Collider LHC
View of the LHC Experiments
The LHC accelerator: the largest superconducting installation in the world. A 27-kilometre ring with two beam tubes; 15-metre-long dipole magnets operating at -271 °C; 1700 superconducting magnets; 7000 kilometres of superconducting niobium-titanium cable in a copper matrix; 13000 amps; 8.3 Tesla magnetic field.
Precision: the 27 km length of the ring is sensitive to changes of less than 1 mm, caused for example by tides, rainfall and stray currents.
The Physics Tools 2. The Detectors
The ATLAS Experiment: diameter 25 m, barrel toroid length 26 m, end-wall chamber span 46 m, overall weight 7000 tons.
The ATLAS Cavern: 140000 m3 of rock removed, 53000 m3 of concrete, 6000 tons of steel reinforcement; 53 metres long, 30 metres wide, 35 metres high (a 10-storey building).
The CMS Magnet
The Dataflow of an Experiment
Data Rates: the on-line system uses a multi-level trigger to filter out background and reduce the data volume, in 24 x 7 operation.
Collision rate: 40 MHz (1000 TB/sec)
Level 1, special hardware: 75 kHz (75 GB/sec)
Level 2, embedded processors: 5 kHz (5 GB/sec)
Level 3, farm of commodity CPUs: 100 Hz (100 MB/sec)
Data recording & offline analysis
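A small illustrative calculation (not taken from the talk) showing that the quoted trigger rates and bandwidths are consistent with an event size of about 1 MB after the Level-1 selection; the numbers come from this slide, the helper names are ours.

```python
# Check the data-reduction ladder of the multi-level trigger (illustration only).
levels = [
    # (trigger level, output rate in Hz, quoted bandwidth in MB/s)
    ("Level 1 (special hardware)",     75_000, 75_000),
    ("Level 2 (embedded processors)",   5_000,  5_000),
    ("Level 3 (commodity CPU farm)",      100,    100),
]

for name, rate_hz, bandwidth_mb_s in levels:
    event_size_mb = bandwidth_mb_s / rate_hz      # implied event size at this level
    print(f"{name}: {rate_hz:>6} Hz * {event_size_mb:.1f} MB/event = {bandwidth_mb_s} MB/s")
```

Each level keeps the per-event size roughly constant while cutting the rate, which is why the required bandwidth drops from tens of GB/s to ~100 MB/s at the recording stage.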
Particle physics data: from raw data to physics results. Raw data are essentially large arrays of numbers (e.g. 2037 2446 1733 1699 ...) which are converted into physics quantities. Reconstruction: take the detector response, apply calibration and alignment, then do pattern recognition and particle identification. Analysis: physics analysis of the reconstructed events yields the basic physics results, e.g. e+ e- -> Z0 -> f f̄. Simulation (Monte Carlo) runs the chain in the opposite direction: basic physics, fragmentation and decay, interaction with the detector material, detector response.
A Photo of a proton-proton collision (Event)
LHC data: 40 million collisions per second. After filtering, 100-200 collisions of interest per second, of which 1-10 are good events. 1-10 Megabytes of data are digitised for each collision, giving a recording rate of 0.1-1 Gigabytes/sec. About 10^10 collisions are recorded each year, roughly 15 Petabytes/year of data across the experiments (ALICE, ATLAS, CMS, LHCb).
1 Megabyte (1 MB): a digital photo
1 Gigabyte (1 GB) = 1000 MB: a DVD movie
1 Terabyte (1 TB) = 1000 GB: the world's annual book production
1 Petabyte (1 PB) = 1000 TB: the annual production of one LHC experiment
1 Exabyte (1 EB) = 1000 PB: the world's annual information production
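A back-of-the-envelope sketch of the arithmetic behind these figures. The rate and event size are taken from the slide; the number of seconds of data taking per year is our assumption for illustration.

```python
recording_rate_hz = 200      # 100-200 events of interest per second (upper end of the slide's range)
event_size_mb     = 5        # 1-10 MB per collision (assumed mid-range value)
seconds_per_year  = 1e7      # assumed typical accelerator year of data taking

rate_gb_s = recording_rate_hz * event_size_mb / 1000
volume_pb = recording_rate_hz * event_size_mb * seconds_per_year / 1e9

print(f"recording rate: {rate_gb_s:.1f} GB/s")     # within the quoted 0.1-1 GB/s
print(f"yearly volume : {volume_pb:.0f} PB/year")  # same order as the quoted ~15 PB/year
```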
The Computing Tools 1. The Local Computing Fabric
Challenge: a large, distributed community (ATLAS, CMS, LHCb). Offline software effort: 1000 person-years per experiment. Software life span: 20 years. ~5000 physicists around the world, working around the clock.
Data Handling and Computation for Physics Analysis: the detector feeds the event filter (selection & reconstruction); reconstruction turns raw data into event summary data; event reprocessing produces processed data; batch physics analysis extracts analysis objects by physics topic; event simulation feeds the same chain; interactive physics analysis works on the analysis objects.
Requirements and Boundaries (I). High Energy Physics applications need integer processor performance more than floating-point performance, which drives the choice of processor type and benchmark reference. A large amount of processing and storage is needed, but the optimisation is for aggregate performance, not for single tasks; since the events are independent units, the workload splits into many components with moderate demands on each single component, i.e. coarse-grain parallelism (see the sketch below). Basic infrastructure and environment: the availability of space, cooling and electricity is a heavy investment, don't underestimate it.
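A minimal toy sketch (our own example, not experiment software) of why independent events give coarse-grain parallelism: each event can be processed on its own, so a farm only has to maximise aggregate throughput, never speed up a single task. The dummy workload is integer-heavy, echoing the benchmark remark above.

```python
from multiprocessing import Pool

def reconstruct(event):
    """Stand-in for per-event reconstruction: a dummy integer-heavy calculation."""
    return sum(hit * hit for hit in event) % 2_000_003

if __name__ == "__main__":
    # Fake 'events': independent lists of detector hits (invented data).
    events = [list(range(i, i + 1000)) for i in range(2000)]
    with Pool() as pool:                      # one worker per core, a batch farm in miniature
        results = pool.map(reconstruct, events)
    print(len(results), "events processed independently")
```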
Requirements and Boundaries (II). The major boundary condition is cost: staying within the budget envelope while obtaining the maximum amount of resources leads to commodity equipment with the best price/performance value, not simply the cheapest. Reliability, functionality and performance have to be taken into account together, i.e. the total cost of ownership. The workload is chaotic: batch and interactive, a research environment in which physics analysis proceeds by collective, iterative discovery; data access is unpredictable and there is no practical limit to the requirements.
View of the different fabric areas:
Automation, operation, control: installation, configuration and monitoring, fault tolerance
Infrastructure: electricity, cooling, space
Storage system (AFS, CASTOR, disk server)
Batch system (LSF, CPU server)
Network
Benchmarks, R&D, architecture: prototypes, testbeds
GRID services!?
Purchase, hardware selection, resource planning
Coupling of components through hardware and software
The current CERN fabric architecture is based, in general, on commodity components: dual Intel processor PC hardware for CPU, disk and tape servers; a hierarchical Ethernet (100, 1000, 10000 Mbit/s) network topology; NAS disk servers with ATA/SATA disk arrays; the Red Hat Linux operating system; medium-end (linear) tape drive technology; open-source software for storage (CASTOR, OpenAFS) and cluster management (Quattor, Lemon, ELF); and commercial software packages (LSF, Oracle).
Levels of complexity:
Hardware: motherboard, backplane, bus, integrated devices (memory, power supply, controller, ...)
CPU and disk: PC, storage tray, NAS server, SAN element
Cluster: physical and logical coupling via the network (Ethernet, Fibre Channel, Myrinet, ...) with hubs, switches and routers
Software: operating system (Linux), drivers, applications; batch system (LSF), mass storage (CASTOR), filesystems (AFS), control software
World-wide cluster: Grid-fabric interfaces, wide area network (WAN), Grid middleware, monitoring, firewalls (services)
Building the farm, all using the Linux OS: desktop + processors == CPU server; CPU server + larger case + 6*2 disks == disk server; CPU server + Fibre Channel interface + tape drive == tape server.
Today's schematic network topology: CPU servers on Fast Ethernet (100 Mbit/s) and disk/tape servers on Gigabit Ethernet (1000 Mbit/s) connect to a backbone of multiple Gigabit Ethernet links (20 * 1000 Mbit/s) with a Gigabit Ethernet WAN connection. Tomorrow's schematic network topology: CPU servers on Gigabit Ethernet and disk/tape servers on 10 Gigabit Ethernet (10000 Mbit/s) connect to a backbone of multiple 10 Gigabit Ethernet links (200 * 10000 Mbit/s) with a 10 Gigabit Ethernet WAN connection.
General fabric layout:
Main fabric cluster: 2-3 hardware generations, 2-3 OS/software versions, 4 experiment environments (old, current, new)
Certification cluster: the main cluster en miniature, for new software and new hardware (purchases)
Development cluster: GRID testbeds
R&D cluster: new architectures and hardware
Benchmark and performance cluster: current architecture and hardware
Service control and management: e.g. stager, HSM, LSF master, repositories, GRID services, CA, etc.
Software glue: management of the basic hardware and software, i.e. the installation, configuration and monitoring system (from the European DataGrid project); management of the processor computing resources with a batch system (LSF from Platform Computing); management of the disk and tape storage with CASTOR, the CERN-developed Hierarchical Storage Management system.
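A hedged sketch of how the batch layer is typically driven from a user's point of view: submitting a job to LSF with the standard `bsub` command (the -q, -J and -o options are standard LSF). The queue name, job script path and wrapper function are invented for illustration; the actual CERN configuration is not shown here.

```python
import subprocess

def submit(script="/path/to/reconstruct.sh", queue="1nd", name="reco-job"):
    """Submit a job script to an LSF queue and return bsub's reply (illustrative wrapper)."""
    # -q: target queue, -J: job name, -o: file for stdout/stderr (%J expands to the job ID)
    cmd = ["bsub", "-q", queue, "-J", name, "-o", f"{name}.%J.out", script]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(submit())   # LSF normally replies with something like: Job <id> is submitted to queue <...>
```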
Linux is our choice as the OS for all LHC computing, using the Red Hat Enterprise version, with our own 4-person support team. Linux is deployed on ~2000 farm PCs and 1500 desktop nodes. We are still trying to sort out an efficient TCO (total cost of ownership) model: stability versus new features, problem tracking and bug fixes, community support versus licences and support contracts. Boundary conditions: support of old versions (the user community is heterogeneous and can't move to new versions easily), a long and complicated certification process for each new version, and several third-party products to be supported.
The CERN Computing Centre: ~4000 processors, ~400 TBytes of disk, ~12 PB of magnetic tape. Even with technology-driven improvements in performance and costs, CERN can provide nowhere near enough capacity for LHC!
Considerations: the current state of performance, functionality and reliability is good, and technology developments still look promising, so more of the same for the future!? How can we be sure that we are following the right path? How do we adapt to changes?
Strategy: continue and expand the current system, BUT in parallel pursue R&D activities (SAN versus NAS, iSCSI, IA64 processors, ...), technology evaluations (InfiniBand clusters, new filesystem technologies, ...), and Data Challenges to test scalability on larger scales, bringing the system to its limit and beyond. We are already very successful with this approach, especially the 'beyond' part. And watch the market trends carefully.
Challenges: 1. Status of the current system: is the stability of the equipment acceptable? Stress-test the equipment; where and what are the weak points and bottlenecks? 2. Physics Data Challenges: test the bookkeeping, organisation and management of the data processing. 3. Computing Data Challenges: test the scalability of software and hardware in the fabric, and try to verify whether the current architecture would survive the anticipated load in the LHC era (a toy throughput test is sketched below).
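A toy version (our own illustration, not the actual data-challenge code) of the kind of measurement a fabric stress test makes: write many streams in parallel and report the aggregate throughput, the quantity the challenges try to push to its limit. Stream count and block sizes are arbitrary choices for the sketch.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

STREAMS = 8
BLOCK   = b"\0" * (4 * 1024 * 1024)   # 4 MB blocks
BLOCKS  = 64                           # 256 MB per stream (kept small for the sketch)

def write_stream(path):
    """Write one data stream to disk and force it out of the page cache."""
    with open(path, "wb") as f:
        for _ in range(BLOCKS):
            f.write(BLOCK)
        f.flush()
        os.fsync(f.fileno())
    return BLOCKS * len(BLOCK)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"stream_{i}.dat") for i in range(STREAMS)]
        t0 = time.time()
        with ThreadPoolExecutor(max_workers=STREAMS) as pool:
            total_bytes = sum(pool.map(write_stream, paths))
        rate_mb_s = total_bytes / (time.time() - t0) / 1e6
        print(f"aggregate write throughput: {rate_mb_s:.0f} MB/s over {STREAMS} streams")
```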
Dataflow in the local CERN fabric in 2007: a complex organisation with high data rates (~10 GBytes/s) and ~100k streams in parallel. The online filter farm (HLT) delivers raw data and calibration data to permanent disk storage and tape storage; the calibration farm and the reconstruction farm read them back and produce ESD (event summary) and AOD (analysis object) data; raw, calibration, ESD and AOD data move between disk storage, tape storage, the analysis farm and the Tier-1 data export.
High Throughput Prototype (openlab + LCG prototype), specific layout, October 2004: 4 * Enterasys N7 10 GE switches plus 2 * Enterasys X-Series, with 4 * GE connections to the backbone and a 10 GE WAN connection; 36 disk servers (dual P4, IDE disks, ~1 TB disk space each; lxsharexxxd); 2 * 50 Itanium 2 nodes (dual 1.3/1.5 GHz, 2 GB memory; oplapro0xx, 10 GE per node); 80 IA32 CPU servers (dual 2.4 GHz P4, 1 GB memory; tbed00xx, 10 GE per node); 40 IA32 CPU servers (dual 2.4 GHz P4, 1 GB memory; lxs50xx, 1 GE per node); 80 IA32 CPU servers (dual 2.8 GHz P4, 2 GB memory); 20 TB of IBM StorageTank; 2 * 12 tape servers with STK 9940B drives.
IT Data Challenge (plot): CPU, disk and tape performance in GBytes/s versus time in minutes, running in parallel with an increasing production service; 920 MB/s average, with a dip during a daytime tape server intervention.
CERN computer centre 2008: hierarchical Ethernet network tree topology (280 GBytes/s); ~8000 mirrored disks (4 PB); ~4000 dual-CPU nodes (20 million SI2000); ~170 tape drives (4 GB/s); ~25 PB of tape storage; estimated investment in 2006-2008: ~50 million Euro. All numbers hold only IF the exponential growth rate continues (Moore's law)!
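A simple extrapolation sketch behind the "IF the exponential growth rate continues" caveat: assume capacity per unit cost doubles on a fixed timescale and project a 2004 baseline forward. The 2004 CPU capacity in SI2000 and the 18-month doubling time are our assumptions for illustration; the disk figure echoes the ~400 TB quoted on the earlier slide.

```python
def capacity(base, years, doubling_time_years=1.5):
    """Exponential growth at constant budget (a common reading of Moore's law)."""
    return base * 2 ** (years / doubling_time_years)

cpu_2004_si2k = 5e6    # assumed installed CPU capacity in 2004 (SI2000 units)
disk_2004_pb  = 0.4    # ~400 TB of disk in 2004

for years in (2, 3, 4):
    print(f"+{years}y: CPU ~{capacity(cpu_2004_si2k, years) / 1e6:.1f} MSI2000, "
          f"disk ~{capacity(disk_2004_pb, years):.1f} PB")
```

With these assumptions the 2008 targets (20 MSI2000, 4 PB of disk) are within reach, but only if the growth rate really does continue.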
The Computing Tools 2. The World Wide GRID
Why the GRID? The CERN computer centre can deliver only a fraction (~10%) of the CPU/disk capacity needed for the analysis of the huge amount of data delivered by the LHC experiments. We need a transparent mechanism for the physicists to run their analysis jobs anywhere in the world.
What is a Grid? Scavenging unused cycles has been going strong since 1986 (e.g. the Berkeley Open Infrastructure for Network Computing); but it is not so easy to scavenge unused storage.
What is the Grid? Resource sharing on a global scale, across labs and universities. Secure access, which needs a high level of trust. Resource use: load balancing, making the most efficient use. The death of distance: requires excellent networking (e.g. transfers of 5.44 Gbps, 1.1 TB in 30 minutes, and 6.25 Gbps on 20 April 2004). Open standards allow constructive distributed development. There is not (yet) a single Grid.
The GRID middleware: how will it work? It finds convenient places for the scientist's job (computing task) to run, optimises the use of the widely dispersed resources, organises efficient access to scientific data, deals with authentication to the different sites the scientist will be using, interfaces to local site authorisation and resource allocation policies, runs the jobs, monitors their progress, recovers from problems, and tells you when the work is complete and transfers the results back!
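A toy matchmaking loop, our own illustration of the ideas listed above rather than the actual middleware: the broker picks a site that has the required data and free CPU slots, schedules the job there, and polls until it finishes. All site names, datasets and numbers are invented for the sketch.

```python
import random
import time

# Invented site catalogue: free CPU slots and datasets held at each site.
SITES = {
    "cern": {"free_cpus": 120, "datasets": {"raw-2004", "esd-2004"}},
    "fnal": {"free_cpus": 300, "datasets": {"esd-2004"}},
    "ral":  {"free_cpus": 0,   "datasets": {"raw-2004"}},
}

def broker(job):
    """Pick a site that holds the job's dataset and has free CPUs (crude optimisation)."""
    candidates = [name for name, site in SITES.items()
                  if job["dataset"] in site["datasets"] and site["free_cpus"] > 0]
    if not candidates:
        raise RuntimeError("no matching site found")
    return max(candidates, key=lambda name: SITES[name]["free_cpus"])

def run(job):
    site = broker(job)
    print(f"job '{job['name']}' scheduled at {site}")
    while random.random() < 0.7:      # pretend to poll the job status until it completes
        time.sleep(0.1)
    print(f"job '{job['name']}' finished, output sandbox retrieved")

run({"name": "higgs-analysis", "dataset": "esd-2004"})
```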
Virtual Organizations for LHC and others (ATLAS VO, CMS VO, BioMed VO, ...): the coupling of computer centres.
A Job Submission Example: the user describes the job in JDL at the user interface (UI) and submits it, together with an input sandbox, to the Resource Broker. The broker consults the Information Service, the Data Management Services (LFN->PFN mapping) and the authentication & authorisation services, then hands the job to the Job Submission Service, which runs it on a suitable Compute Element close to a Storage Element. The Logging & Book-keeping service tracks the job status, which the user can query, and the output sandbox is returned to the user when the job finishes.
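A sketch of what the user-side input might look like: a small helper that writes a JDL-style job description for the UI to submit. The attribute names follow the classad/JDL convention of the EU DataGrid broker; the file names and the requirement expression are invented for illustration, not taken from any real configuration.

```python
def make_jdl(executable, args, sandbox_in, sandbox_out):
    """Return a JDL-style job description as a string (illustrative sketch)."""
    quote = lambda names: ", ".join('"%s"' % n for n in names)
    lines = [
        'Executable    = "%s";' % executable,
        'Arguments     = "%s";' % args,
        'StdOutput     = "job.out";',
        'StdError      = "job.err";',
        'InputSandbox  = {%s};' % quote(sandbox_in),
        'OutputSandbox = {%s};' % quote(sandbox_out),
        # An illustrative requirement; real attribute names depend on the
        # information schema published by the sites.
        'Requirements  = other.MaxCPUTime > 720;',
    ]
    return "\n".join(lines)

print(make_jdl("analysis.sh", "--dataset esd-2004",
               ["analysis.sh", "cuts.conf"],
               ["job.out", "job.err", "histos.root"]))
```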
High Energy Physics is leading and leveraging Grid technology: many national and regional Grid projects (GridPP (UK), INFN-Grid (I), NorduGrid, Dutch Grid, US projects) and European projects.
The LHC Computing Grid Project (LCG). Collaboration: the LHC experiments, Grid projects in Europe and the US, regional and national centres. Choices: adopt Grid technology; go for a Tier hierarchy; use Intel CPUs in standard PCs; use the Linux operating system. Goal: prepare and deploy the computing environment to help the experiments analyse the data from the LHC detectors. (Diagram: the Tier hierarchy from the CERN Tier-0 through Tier-1 centres such as USA, Italy, UK, France, Japan, Germany and Taipei, to Tier-2 regional groups, university and lab Tier-3 sites, and desktops, with grids serving regional groups and physics study groups.)
The LHC Computing Model (simplified!!).
Tier-0, the accelerator centre: filter the raw data, reconstruct the event summary data (ESD), record raw data and ESD, and distribute raw data and ESD to the Tier-1 centres.
Tier-1: permanent storage and management of raw, ESD, calibration data, metadata, analysis data and databases in a grid-enabled data service; data-heavy analysis; re-processing of raw data into ESD; national and regional support; online to the data acquisition process, requiring high availability, long-term commitment and managed mass storage.
Tier-2: well-managed, grid-enabled disk storage; simulation; data distribution; end-user analysis, batch and interactive; high-performance parallel analysis (PROOF).
Plus desktops, portables and small centres. (Sites shown in the diagram include TRIUMF, RAL, IN2P3, FNAL, CSCS, CNAF, FZK, PIC, BNL, Legnaro, Rome, CIEMAT, IFCA, UB, MSU, IC, Cambridge, Budapest, Prague, Taipei, Krakow, NIKHEF, ICEPP, USC.)
Data distribution between the tiers: ~70 Gbit/s.
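A rough arithmetic sketch, with assumed numbers of our own, of where a distribution figure of this order of magnitude can come from: raw data and ESD have to be streamed from the Tier-0 to the Tier-1 centres while they are being produced, with replicas and headroom on top of the raw recording rate.

```python
raw_rate_gb_s = 1.0    # aggregate raw-data recording rate of the experiments (assumed)
esd_fraction  = 0.5    # ESD assumed to be about half of the raw-data volume
copies        = 2      # raw data shared across Tier-1s plus ESD replicas (assumed factor)
headroom      = 2.5    # recovery from backlogs, peaks, reprocessing passes (assumed)

gbit_s = raw_rate_gb_s * (1 + esd_fraction) * copies * headroom * 8
print(f"sustained Tier-0 -> Tier-1 export: ~{gbit_s:.0f} Gbit/s")   # same order as the quoted ~70 Gbit/s
```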
Challenges: service quality (reliability, availability, scaling, performance); security, our biggest risk; management and operations, since a grid is a collaboration of computing centres. Maturity is some years away: a second (or third) generation of middleware will be needed before LHC starts. In the short term there will be many grids and middleware implementations for LCG, and inter-operability will be a major headache. How homogeneous does it need to be? Standards help to avoid adapters.
The Summary
The scientific collaborations are large, global, and already in place. There will be a lot of data, requiring complex data handling, a large amount of storage (tens of PB), and a lot of processing power (of the order of 100k processors). The vast majority of the PCs will use Linux as the operating system, a key element of the architecture. We need to pay attention to market developments; technology is of secondary concern. We need to have the computing facility in perfect operational shape by the end of 2006, so not much time is left for such a complex operation. A utility grid looks like a very good fit for LHC, and LHC looks like an ideal pilot application for a utility grid.