Juan Deaton, Ph.D. Research Scientist and Engineer jdeaton@aha.com
AHA Introduction
Moscow, ID
Experts in Source and Channel Coding
- Error Correction, Data Compression, Encryption
- Anything with a Galois Field
- Almost 30 years experience
Brief History
- Established 1988 on Forward Error Correction
- 1992 First CAM in Data Compression
- 2006 First GZIP Compression IC
- 2014 Fastest PCIe GZIP Accelerator

How do you Dominate?
Market Strategy
- Lower TCO?
- High Performance?
Need Both! Historically, computing technology has followed a trend of lower cost and higher performance.

Outline
Data Compression
- Why, What, and How?
GZIP Hardware
- What?
Performance Analysis
- How can GZIP accelerators help in applications?

Data Compression in a Nutshell
Applications
- Storage Arrays
- Packet Capture Systems
- Load Balancers
- L4+ switches
- WAN Optimization
- Web Servers
- Data Analytics Systems
Key Benefits
- Increased Storage Capacity
- Higher Disk and Network I/O
- Lower Disk Read Latency
- Longer Hardware Life
- Lower Energy Costs
Challenge: Achieving good compression ratios comes at the cost of compression rate and CPU resources.

Compression Tradeoffs
[Figure: Compression Ratio (1 to 4) versus Compression Rate (10 to 10000 Mbps, log scale), split into four quadrants: SLOW rate / HIGH ratio, FAST rate / HIGH ratio, SLOW rate / LOW ratio, FAST rate / LOW ratio.]

Strong Compression Slow Speed
[Figure: Compression Ratio versus Compression Rate (Mbps), four quadrants. Points plotted in the SLOW rate / HIGH ratio quadrant:]
- bzip2, 47 Mbps, 3.5:1
- zlib, 70 Mbps, 2.8:1

Weak Compression Fast Speed
[Figure: Compression Ratio versus Compression Rate (Mbps), four quadrants. Points plotted:]
- bzip2, 47 Mbps, 3.5:1
- zlib, 70 Mbps, 2.8:1
- lzo, 640 Mbps, 1.9:1
- lz4, 1.2 Gbps, 1.8:1

Data Compression Accelerators
[Figure: Compression Ratio versus Compression Rate (Mbps), four quadrants. Points plotted:]
- bzip2, 47 Mbps, 3.5:1
- zlib, 70 Mbps, 2.8:1
- lzo, 640 Mbps, 1.9:1
- lz4, 1.2 Gbps, 1.8:1
- aha-gzip, 3.3 Gbps, 2.7:1

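The ratio-versus-rate tradeoff in the plots can be reproduced in miniature with software codecs. The following is a minimal sketch using Python's standard `zlib` and `bz2` modules; the `measure` helper and the sample input are hypothetical, and the numbers it prints depend on the machine and the data, so they will not match the figures on the slides.

```python
import bz2
import time
import zlib


def measure(name, compress, data):
    """Return (name, compression ratio, compression rate in Mbps) for one codec."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(compressed)
    # Guard against a zero reading from a coarse timer.
    rate_mbps = (len(data) * 8 / 1e6) / max(elapsed, 1e-9)
    return name, ratio, rate_mbps


# Hypothetical sample input: repetitive text, not the corpus behind the slide figures.
sample = b"The quick brown fox jumps over the lazy dog. " * 20000

results = [
    measure("zlib", lambda d: zlib.compress(d, 6), sample),
    measure("bzip2", lambda d: bz2.compress(d, 9), sample),
]

for name, ratio, rate_mbps in results:
    print(f"{name}: {ratio:.1f}:1 at {rate_mbps:.0f} Mbps")
```

Running the same helper on representative application data, rather than synthetic text, gives a fairer picture of where each codec lands on the plot.
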
AHA37X PCIe Family Features
Interface
- PCIe 3.0 x8 interface
Algorithms
- GZIP, ZLIB, LZS
Board by Compression Speed
- AHA371 10 Gbps
- AHA372 20 Gbps
- AHA374 40 Gbps
- AHA378 80 Gbps
[Board photos: AHA371, AHA372; AHA374, AHA378]

20 Core CPU vs. AHA372
The AHA372 has almost 8x the throughput and a 10x energy efficiency advantage.

Choosing the Right Tool
[Images: Wenger Giant Knife with 141 functions and Phillips screwdriver with 1 function.]

Experiment Setup
[Diagram: Client Emulator sends Page Requests to the Apache Web Server over 10 Gig-E and receives Page Responses; Power Meter on the 120 VAC supply.]
PowerEdge R720 Motherboard
- Intel Xeon E5-2643 3.30GHz (Total 8 cores)
- 24 x 4GB RDIMM, 1600MT/s RAM
- Western Digital WD800JD-75MSA2
- Intel 82598EB 10-Gigabit AT2 Server Adapter
Watts up? PRO ES
- Accuracy: +/- 1.5%
40 Gbps Compression
- 2 x AHA372 20Gbps GZIP Accelerators
Supermicro X8DTL-6F Motherboard
- Intel Xeon E5620 CPU
- 3 x 2GB Kingston KVR13R9S8K3/6I
- Seagate Barracuda ST3160815AS
- Intel 82598EB 10-Gigabit AT2 Server Adapter
Client Emulator sends HTTP requests
- Requests maximize throughput of 10 G link
Webserver responds with GZIP'd page
- Requests are run continuously

Scenarios and Measurements
Observed Metrics
- Effective throughput of 10Gb Link
  Better compression ratio, more effective throughput
  Results are averaged over compression ratios between 1:2 and 1:3
- CPU Utilization
- Power/Throughput (Watt/Gbps) or Energy/Bit (Joule/Gb)
Compression Scenarios Run
- No Compression
- CPU Compression: mod_deflate
- AHA Compression: 2 x AHA372 20Gbps

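The two derived metrics above follow from simple arithmetic, sketched below. The function names and the input values are illustrative assumptions, not measurements from the experiment. The first function assumes the link is saturated with compressed data and is the only bottleneck. The second shows why Watt/Gbps and Joule/Gb are the same quantity: (J/s) / (Gb/s) = J/Gb.

```python
def effective_throughput_gbps(link_gbps, compression_ratio):
    """Uncompressed data delivered per second over a link carrying compressed data."""
    return link_gbps * compression_ratio


def energy_per_gigabit(power_watts, throughput_gbps):
    """Watts divided by Gbps: (J/s) / (Gb/s) = J/Gb."""
    return power_watts / throughput_gbps


# Illustrative inputs only; these are not measurements from the experiment.
print(effective_throughput_gbps(10, 3))  # 10 Gb link at a 3:1 ratio -> 30
print(energy_per_gigabit(200, 10))       # 200 W while serving 10 Gbps -> 20.0
```
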
CPUs are Inefficient at GZIP
CPU is Crippled
- Throughput for serving web pages drops from 9 Gbps to 2 Gbps.
CPU is Inefficient
- The 8-core CPU runs at 100% load and consumes 7x more energy per Gb.

                     No GZIP     CPU GZIP
Throughput           9 Gbps      2 Gbps
CPU Load             15%         100%
Energy Efficiency    16 J/Gb     116 J/Gb

CPUs are not optimal for performing GZIP and other tasks simultaneously.

GZIP Accelerator Removes Bottlenecks
18x Throughput over CPU
- At full load with 8 cores, CPU GZIP reaches only 6% of the throughput of GZIP hardware.
Available CPU cycles
- Freed from compression, the CPU can perform the tasks it is most efficient at.
17x energy efficiency
- Less energy cost

                     CPU GZIP    GZIP hardware
Throughput           2 Gbps      35 Gbps
CPU Load             100%        59%
Energy Efficiency    116 J/Gb    6.8 J/Gb

AHA372 is an order of magnitude more efficient at GZIP than CPUs.

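The headline multipliers can be checked directly against the measured values. This short sketch copies the numbers from the slide; only the variable names are assumptions. The slide rounds the results to 18x, 6%, and 17x.

```python
# Values copied from the slide's comparison of CPU GZIP and GZIP hardware.
cpu_gzip = {"throughput_gbps": 2, "cpu_load_pct": 100, "energy_j_per_gb": 116}
hw_gzip = {"throughput_gbps": 35, "cpu_load_pct": 59, "energy_j_per_gb": 6.8}

throughput_gain = hw_gzip["throughput_gbps"] / cpu_gzip["throughput_gbps"]
cpu_share = cpu_gzip["throughput_gbps"] / hw_gzip["throughput_gbps"]
energy_gain = cpu_gzip["energy_j_per_gb"] / hw_gzip["energy_j_per_gb"]

print(f"throughput gain: {throughput_gain:.1f}x")        # 17.5x
print(f"CPU share of hardware rate: {cpu_share:.1%}")    # 5.7%
print(f"energy efficiency gain: {energy_gain:.1f}x")     # 17.1x
```
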
Energy Break Even Point (ebep)
Assumptions
- Card Price at Low Volumes
- Continuous Data Stream
- Power Usage Effectiveness 1.7
- Normalized Linear Performance Adjustment for Power Consumption
Hardware Pays Off FAST!
- 22 Days at Lowest Electricity Cost

Cost of Electricity (cents/kWh)    ebep of GZIP Hardware vs. CPU GZIP
15                                 22 days
20                                 17 days
25                                 14 days
30                                 11 days
35                                 10 days

GZIP hardware reduces OpEx and CapEx by creating optimally performing systems that use less hardware.

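A simplified break-even model helps show how the assumptions interact. The sketch below is not the calculation behind the table: the function, its parameters, and the card price and power figures are hypothetical placeholders. Here `power_saved_watts` stands for the power a CPU-only setup would need to match the hardware's throughput, minus the power of the accelerated system.

```python
def break_even_days(card_price_usd, power_saved_watts, cents_per_kwh, pue=1.7):
    """Days until electricity savings equal the card price (simplified model)."""
    # Facility-level power saved in kW, scaled by Power Usage Effectiveness
    # to include cooling and power distribution overhead.
    facility_kw_saved = power_saved_watts / 1000 * pue
    usd_saved_per_day = facility_kw_saved * 24 * cents_per_kwh / 100
    return card_price_usd / usd_saved_per_day


# Hypothetical card price and power saving; only the electricity prices
# are taken from the slide.
for cents in (15, 20, 25, 30, 35):
    days = break_even_days(card_price_usd=1000, power_saved_watts=2000,
                           cents_per_kwh=cents)
    print(f"{cents} cents/kWh: {days:.0f} days")
```

In this model the break-even time is inversely proportional to the electricity price, which is the same trend the table shows between 15 and 35 cents/kWh.
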
GZIP Accelerators
Avoid Compression/Resource Tradeoff
- GZIP accelerators deliver both high compression and high speed
Provide Efficient Computing
- GZIP accelerators offer a performance advantage and lower power consumption
Offer a Competitive Advantage
- GZIP accelerators increase system performance and offer end customers lower TCO

Sales and Contact Information
Website
- www.aha.com
Sales Contact
- sales@aha.com
White Paper for this Presentation
- http://www.aha.com/uploads/gzip_benefits_whitepaper9.pdf