Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration

A CHEP 2010 presentation by: Sam Skipsey and the GridPP Storage Group

With particular acknowledgments to:
- Wahid Bhimji (go see his talk on wider storage performance matters)
- Dan van der Ster (HammerCloud awesomeness)
Itinerary
- Background
- Tests: Glasgow hardware; HammerCloud, blktrace
- Results
- Why not remote I/O?
- Conclusions
Background
- Particle physics analysis is I/O intensive.
- Particle physics analysis is single-threaded.
- Modern servers are very multicore; modern storage is not (usually) multiheaded.
- SSDs provide much faster seeks than HDDs, though...
- So: why not use SSDs for worker nodes (or servers)?
Glasgow Setup
- Conventional worker nodes: 8 cores (2 x 4-core Intel Xeon E5420, 2.5 GHz), 1 Gb/s networking to storage, single 7200 RPM SATA disk (partitioned).
- 24-core node (Magny-Cours test box).
Test Hardware
- 7200 RPM 500 GB SATA HDD: standard server-class hard disk.
- Kingston SSDNow V-series 128 GB SSD: value SSD option.
- Intel X25-M G2 160 GB SSD: commodity SSD option.
- Deliberately only affordable solutions!
blktrace
- SL5 and later only (needs kernel 2.6.10 or newer).
- Reads raw kernel I/O events from debugfs.
- Output must go to a device separate from the one being monitored.
- Process with: blkparse (statistics); seekwatcher (graphs). A driver sketch follows below.
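For concreteness, a capture of this kind could be driven by a small script along the lines of the sketch below (the device, duration, and output paths are placeholder assumptions; blktrace, blkparse, and seekwatcher are the tools named above):

    #!/usr/bin/env python
    # Sketch: capture block-layer I/O events with blktrace, then post-process.
    # DEVICE and OUTDIR are placeholders; OUTDIR must live on a different
    # device from the one being traced, per the caveat above.
    import subprocess

    DEVICE = "/dev/sdb"           # disk under test (placeholder)
    OUTDIR = "/mnt/spare/traces"  # separate device (placeholder)
    NAME = "hdd_run"

    # Record 60 seconds of raw kernel I/O events from debugfs.
    subprocess.check_call(
        ["blktrace", "-d", DEVICE, "-o", NAME, "-D", OUTDIR, "-w", "60"])

    # blkparse: turn the per-CPU binary traces into readable statistics.
    with open(NAME + ".txt", "w") as out:
        subprocess.check_call(
            ["blkparse", "-i", OUTDIR + "/" + NAME], stdout=out)

    # seekwatcher: render seek/throughput graphs from the same trace.
    subprocess.check_call(
        ["seekwatcher", "-t", OUTDIR + "/" + NAME, "-o", NAME + ".png"])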
HammerCloud
- ATLAS (also CMS, LHCb) automated testing framework.
- Uses Ganga to automatically load sites with test jobs and maintain statistics on them.
- http://hammercloud.cern.ch/
- HC tests are stored forever: tests 1332, 1334 and 1348 are the most relevant here.
The Test
- Use HammerCloud to send identical jobs to a subcluster at Glasgow.
- All jobs stage their input files to the worker node.
- Replace some of the working storage on workers with SSDs (and other configurations).
- Compare efficiencies and collect data (efficiency metric: see the sketch below).
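The efficiency figures below are read here as HammerCloud's usual CPU-time over wall-clock-time ratio (an assumption; the slides do not spell the definition out). In code terms:

    def job_efficiency(cpu_time, wall_time):
        """CPU/wall-time ratio: a job stalled waiting on disk accrues
        wall-clock time but no CPU time, so I/O contention drags this down."""
        return cpu_time / wall_time

    # e.g. 2700 s of CPU over a 3600 s job is 75% efficient
    print(job_efficiency(2700.0, 3600.0))  # 0.75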
Results (FileStager)

Standard node:
  Jobs:Cores  Storage               Efficiency  Throughput
  8:8         1 Kingston Value SSD  60%         4.5
  8:8         1 SATA HDD            75%         5.5
  8:8         1 Intel X25 SSD       80%         6
  8:8         2 SATA HDD (RAID 1)   83%         6.6
  8:8         2 SATA HDD (RAID 0)   90%         7

Magny-Cours node:
  24:24       1 Intel X25 SSD       50%         12
  24:24       2 SATA HDD (RAID 0)   86%         21

Single-occupancy efficiency (measured):
  1 job       1 SATA HDD            90%         0.9
Blktrace (HDD)
Blktrace (SSD)
Blktrace (RAID 0)
Price/Performance
- Intel X25-M G2: 340 for 160 GB (2.13 per GB); 4.25 per % efficiency.
- RAID-0 hard disks: 130 for 1 TB (0.13 per GB); 1.44 per % efficiency.
- For a worker node costing ~2000, the effective difference is on the order of 10 to 15% of the cost.
- ATLAS requires 50 GB of scratch per core!
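The "per % efficiency" figures appear to be the quoted price divided by the measured FileStager efficiency from the results table above; under that (assumed) reading, the slide's numbers reproduce exactly:

    # Check of the price/performance arithmetic (currency units as quoted;
    # efficiencies from the FileStager results above).
    intel_price, intel_gb, intel_eff = 340.0, 160.0, 80.0   # Intel X25-M G2
    raid0_price, raid0_gb, raid0_eff = 130.0, 1000.0, 90.0  # RAID-0 HDDs

    print(intel_price / intel_gb)   # 2.125  (the slide's ~2.13 per GB)
    print(intel_price / intel_eff)  # 4.25 per % efficiency
    print(raid0_price / raid0_gb)   # 0.13 per GB
    print(raid0_price / raid0_eff)  # 1.44... per % efficiency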
But What About Remote I/O?
- Why not access data with direct rfio against the disk server, if local I/O isn't sufficient? (See the sketch below.)
- We tested this too: HammerCloud against the same data, with DQ2_LOCAL remote-I/O jobs sent.
- Caveat: all nodes limited to 1 Gb/s links.
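For context, a remote-I/O job of this kind typically opens the file in place through ROOT's RFIO plugin rather than staging it first; a minimal PyROOT sketch (the hostname, path, and tree name are placeholders, not the actual test configuration):

    import ROOT  # PyROOT; needs a ROOT build with RFIO support

    # Open the file directly on the disk server: every read then crosses
    # the 1 Gb/s network link instead of hitting local scratch.
    f = ROOT.TFile.Open("rfio://disk.server.example//dpm/path/to/file.root")
    if f and not f.IsZombie():
        tree = f.Get("CollectionTree")  # placeholder tree name
        print("entries:", tree.GetEntries())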
Results (Remote I/O)

Standard node:
  Jobs:Cores  Storage               Efficiency  Throughput
  8:8         1 Kingston Value SSD  67%         5.35
  8:8         1 Intel X25 SSD       73%         5.86
  8:8         2 SATA HDD (RAID 1)   73%         5.88
  8:8         1 SATA HDD            78%         6.25

Magny-Cours node:
  24:24       2 SATA HDD (RAID 0)   73%         17.4
  24:24       3 SATA HDD (RAID 0)   76%         18.2
Real Analysis / pcache

  Node type            Efficiency  Test efficiency
  Intel X25 SSD        78%         80%
  2 SATA HDD (RAID 0)  88%         90%

- The cluster was retrofitted with RAID 0 for all but the Intel nodes.
- Efficiency was taken from a sample of ATLAS analysis pilots, 1 September to 3 September.
- Nodes now also have pcache installed... but we see no statistical improvement here.
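pcache's job is to keep a node-local cache of staged input files so that repeated stage-ins skip the network copy; a toy sketch of the idea (the layout and naming are illustrative only, not pcache's real implementation):

    import hashlib
    import os
    import shutil

    CACHE_DIR = "/scratch/pcache"  # illustrative location

    def cached_stage(source_url, dest_path, fetch):
        """Toy pcache-style stage-in: reuse a cached copy of the input
        file if present, otherwise fetch it once and cache it. `fetch`
        stands in for whatever copy tool the site uses."""
        key = hashlib.sha1(source_url.encode()).hexdigest()
        cached = os.path.join(CACHE_DIR, key)
        if not os.path.exists(cached):
            if not os.path.isdir(CACHE_DIR):
                os.makedirs(CACHE_DIR)
            fetch(source_url, cached)   # cache miss: one real transfer
        shutil.copy(cached, dest_path)  # cache hit: local disk copy only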
Metrics and Conclusions
- Metrics should be taken with respect to cost.
- There is a minimum effective IOPS per core; above that limit, bandwidth is more important (see the illustration below).
- Write-efficiency issues with SSDs show up as limited performance gains in some metrics.
- RAID 0 wins vs. current SSDs.
- Back in 2 years for a rematch?
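As a back-of-envelope illustration of the IOPS-per-core point (the device figures below are typical published ballparks, not measurements from these tests):

    # Toy comparison: random-I/O capacity available per running job.
    HDD_IOPS = 100    # ballpark for a single 7200 RPM SATA disk
    SSD_IOPS = 30000  # ballpark for an X25-M-class SSD (random reads)
    CORES = 8

    print(HDD_IOPS / CORES)  # ~12 IOPS/core: seek-bound with 8 jobs
    print(SSD_IOPS / CORES)  # ~3750 IOPS/core: seeks no longer limit,
                             # so sustained bandwidth dominates instead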
Addendum: Server-Class Machines
- Which front-end servers might benefit from SSDs?
- Database backends (SEs, LBs); CREAM CE sandbox directory creation.
- Note that all of these are also improved by better caching, buffering, etc.
- Tests are in progress at Glasgow for the former cases.