Mike Ault - Oracle FlashSystem Consulting Manager, IBM January 2014 Accelerating Oracle with IBM FlashSystem: The Need for Speed 2014 IBM Corporation
Smarter Computing Demands Flash The more we have flash for consumer devices......the more we need flash for our data centers... 2014 IBM Corporation
Why Flash Storage.. Timing is Perfect!! In the last 10 years CPU Speed: Performance increase roughly 8-10x DRAM Speed: Performance increase roughly 7-9x Network Speed: Performance increase of 100x Bus Speed: Performance increased roughly 20x Disk speed: Performance increased 1.2x IBM FlashSystem 50+x 300K+ RPM Disk 2014 IBM Corporation
Datacenter s Response to Bridge Disk Performance Gap Most Costly & Volatile Add More Memory HDD Performance Enhancement Wasteful, Expensive & Ineffective with Storage Latency Issues Typical Performance Mitigation Tactics Expensive & Ineffective for Storage Performance Issues Add CPUs Tune & Modify Application Time Consuming, Very Expensive & Risky 2014 IBM Corporation
What if we only reduced Latency?? Consider Little s Law of mathematical queue theory as it applies to Application Performance Q = Number of parallel threads running in the application t = Time it takes for an IO request to be serviced (Latency) R = Result, typically measured in IOPS or Bandwidth Let s Assign Values to this equation: Now let s see how FlashSystem alters this equation A 50X improvement in application response time by only installing Flash! 2014 IBM Corporation
Economics of All Flash, Performance is Really Just a BONUS! 38% Lower software license costs Due to fewer cores Lower software maintenance More Efficient Infrastructure 13% lower infrastructure software costs 35% lower operational support costs Server / Storage Admin Much better storage utilization As much as 50% Lower maintenance Ease management by 50% 17% Fewer Servers Fewer cores Lower Memory Fewer network connections Lower maintenance Environmentals 74% Lower Cost Lower power / cooling Less floor space All Flash is 31% Less Expensive Overall 2014 IBM Corporation
Microsecond latency maximizes Application CPU utilization I/O Serviced by Disk 1. Issue I/O request ~ 100 μs 2. Wait for I/O to be serviced ~ 5,000 μs 3. Process I/O ~ 100 μs Processing CPU State Waiting ~100 µs ~100 µs ~5,000 µs Time to process 1 I/O request = 200 μs + 5,000 μs = 5,200 μs CPU Utilization = Wait time / Processing time = 200 / 5,200 = ~4% Time 1 I/O Request I/O Serviced by IBM FlashSystem 1. Issue I/O request ~ 100 μs 2. Wait for I/O to be serviced ~ 200 μs 3. Process I/O ~ 100 μs Processing CPU State Waiting ~100 µs ~100 µs ~200 µs 12X Application benefit by only changing storage latency! Time to process 1 I/O request = 200 μs + 200 μs = 400 μs CPU Utilization = Wait time / Processing time = 200 / 400 = 50% Time 1 I/O Request 2014 IBM Corporation
Introduction Important applications require: High Performance Queries, reports, and screens must return quickly Scale to high user loads Reliability 100% uptime Single system fault can not be fatal Loss of processing impacts bottom line Cost Effectiveness Effective use of resources Leverage tech to achieve accelerated performance gains for the cost Reliability can not be compromised 2013 IBM Corporation
But Where Does Oracle Need Speed? 2014 IBM Corporation
Oracle and Queries -Where does latency matter? READS User s Query Memory SGA & PGA Oracle Processes Reads - Cache miss Foreground Waits: DB file sequential read DB file scattered read 3-5 ms Storage latency Tables & Indexes Logs 2013 IBM Corporation
Why Don t Writes Matter? For data and index block writes: Uses delayed block cleanout Writes when it can t find clean blocks Writes every 3 seconds Writes on checkpoints 2013 IBM Corporation
Oracle and Insert/update/delete- Where does latency matter? LOG WRITES Memory SGA & PGA DBWR (background) Users Insert Commit Oracle Processes Tables & Indexes LGWR (foreground) Logs 2013 IBM Corporation
Where Else? Temporary Activity Sorts Hashes Bitmaps Global Temporary Tables Non-memory Undo activity Flash Cache 2013 IBM Corporation
FlashSystem 840 2014 IBM Corporation
IBM FlashSystem 840: Hardware View Improved RAS features Front/Back accessible Hot-swap Flash Modules, Power Supplies, Batteries, Fans, Controllers w/ interface cards and Canisters Non-disruptive maintenance and firmware updates (concurrent code load) Flash Modules (12) Battery Modules (2) RAID Controllers (2) Interface Modules (4) Fan Modules (4) Canisters (2) Management Modules (2) Power Supplies (2) 2014 IBM Corporation
IBM FlashSystem 840: Reliability Ingredients Superior Durability: Using the Best Flash Superior Protection: Beyond Disk RAID Chip/Plane/Die level protection SLC Market demand decreasing. emlc data protection techniques delivering more wear life than what market demands emlc delivers best Price/Performance 3X Variable Stripe Sizes Read Disturb Mitigation Automatic Read Sweeper High-Speed Clock Recovery Advanced Engineering = Less Maintenance 10X Protection Within And Across Flash Modules Self-Recovering Flash Modules Avoid system rebuilds 2014 IBM Corporation
Storage Performance Council (SPC-1/e) 2014 IBM Corporation
FlashSystem Result Details: IBM MicroLatency Leadership minimum reported latency (SPC-1 LRT ): 0.18 ms Single-system latency leadership up to about 85K IOPS, scalable with multiple FlashSystem units Nearest latency competitor (HDS) uses 2 racks of equipment, over 2x the flash for storage, plus a massive 1 TB DRAM cache and an additional 1 TB flash cache Nearest standard SSDs are ~2x the latency! 2014 IBM Corporation
FlashSystem Result Details: Extreme Performance System 8 Gbit FC Ext Ports Max IOPS/ Ext Port $/ASU GB Huawei Dorado5100 8 75K $76 Huawei Dorado2100 G2 IBM FlashSystem 820 HDS HUS 150 (SSDs) HDS VSP with HAF HP StorServ 7400 (SSDs) 8 50K $60 4 49K $25 4 31K $116 32 19K $148 20 13K $130 IOPS per external port normalizes aggregate performance across large and small-scale results. SLC + servers = more speed, higher price! We can do similar with 7xx products but do clients need it? Our result shows a single 1U building block, not a highly scaled out design like other results Maximum aggregate performance of 195,021.70 SPC-1 IOPS from a single 1U FlashSystem 820 Scale IOPS linearly by stacking FlashSystem units Strong performance efficiency : ~200K IOPS per rack unit ~50K IOPS per 8 Gbit FC port ~250 IOPS/Watt - more than 5x better than last SPC-1/E leader 2014 IBM Corporation
OPERA (Preferred read) IBM FlashSystem Accelerating Disks 2014 IBM Corporation
OPERA Example DB Servers Boost Performance Boost Redundancy - Without Disruption - Without Risk -Without Feature Loss READS ASM WRITE S SAN SAN SAN SAN ASM FG2 IBM Flash System ACTIVE DATA 20 TB Mirror ASM FG1 20 TB ACTIVE DATA 100 TB ARCHIVE DATA 21 TRANSITIONAL DATA5 TB TRANSITIONAL DATA 5 TB 2014 IBM Corporation
Optimal Performance Enhancing Real FlashSystem Architecture Storwize V7000: 36x 300GB 10k disks Brocade SAN switch: SAN40B-4 8Gbit ports Power server(lpar1&2): Power 750 8233-E8B Each has: 8 CPU 100 GB memory AIX 7.1 TL2 SP2 2x 8Gbit FC ports FlashSystem 820 20TB 4x 8Gbit FC ports 8 x 500Gib Luns 2014 IBM Corporation
Preferred Read Acceleration Example System performance @10,000 IOPS for a given app Read/Write Ratio @ 80% Reads / 20% Writes Reads: 8,000 / Sec Writes: 2,000 / Sec Introduce IBM FlashSystem as Primary Copy of new mirror 8,000 Reads / Sec now at extremely low latency System was 10,000 IOPS Now 10,000+ Writes / Sec R/W ratio does not change; No change in the app System does 10,000 Writes & IBM FlashSystem does 10,000 Writes & 40,000 Reads = System Accelerated 5x 2014 IBM Corporation
Swingbench OLTP Results IBM FlashSystem verses V7000 2014 IBM Corporation
V7000 results 2014 IBM Corporation
Acceleration with IBM FlashSystem 820 V7000 FlashSystem 820 X Increase 8.06 2.29 2.52 28.43 2.15 12.22 15.26 2.04 6.48 82.42 7.3 10.29 2014 IBM Corporation
DWH Swingbench Results V7000 and FlashSystem 820 2014 IBM Corporation
V7000 results 2014 IBM Corporation
FlashSystem 820 Acceleration V7000 Flash System820 X Increase 1453703 621184 2.34 2885604 718334 4.02 255656 64721 3.95 3705331 1344883 2.76 1660603 857933 1.94 2014 IBM Corporation
SLOB (Silly Little Oracle Benchmark) Testing Scenario SLOB generates the IO requests via PL/SQL, thus exercising full Oracle IO machinery along with SGA etc. SLOB is capable of testing random single block reads, writes and extreme REDO logging. It does all this with no application contention, thus allowing one to measure true maximum IO that can be achieved on a system. Workload consisted of 56 users from both RAC nodes generating db file sequential reads (Random Physical single block Reads). The test started with preferred read set to the disk failure group and continued after preferred read changed to FlashSystem failure group - ONLINE. 2013 IBM Corporation
Acceleration of Database Creation with IBM Flash System After swithcing to FlashSystem (05:27 PM) Disk IO wait disappears and waiting is now on host CPU. This graph shows the effect of the low latency of FlashSystem and how it increases the host CPU utilization. 2013 IBM Corporation
SLOB TEST PR Online Acceleration of Disk by FlashSystem 2013 IBM Corporation
SLOB Test Results Just by Adding a Single FlashSystem Box to an Oracle environment you can get: 36x acceleration of IOPS! 30x acceleration of throughput! 40x Acceleration of Latency! (assuming 0.43 ms, Oracle reports <0.5 as 0) 2013 IBM Corporation
Preferred Read Acceleration Comparing AWR logs Before Read From Disk After Acceleration Read From FlashSystem The average read response time for the first instance accelerated from 15.33 ms to 0.43 ms and average IOPS accelerated from 3644 to 111831 2013 IBM Corporation
System Configuration Linux X86 RHEL Server Gen2 XIV disk array IBM FlashSystem 820 ASM used to mirror between XIV and FlashSystem Switched preferred read mirror at Instance level Gen2/Flash means the Gen2 was PRM Flash/Gen2 means FlashSystem was PRM With PRM to FlashSystem achieved 75/25 ratio of FS to XIV reads 2013 IBM Corporation
Gen2 XIV and FlashSystem PRM Tests 2013 IBM Corporation
Gen2 XIV and FlashSystem PRM Tests 2013 IBM Corporation
Gen2 XIV and FlashSystem PRM Tests 2013 IBM Corporation
Conclusions Using ASM PRM achieved near Flash-only levels of performance Preferred read mirror using IBM FlashSystem provides dramatic performance boost in read heavy environments. 2013 IBM Corporation
How About Some Real World Tests 2014 IBM Corporation
ABB As Is Environment: ABB US has ~ 15 major manufacturing plants & 7,500 users All plants depend on SAP; The SAP Oracle DB = ~ 3.2 TB The PR1 DB is hosted on HACMP-clustered AIX LPARs; The LPARs are clustered between 2x p570 P6+ frames The DB resides on 2x DS8700 arrays front-ended by SVCs DB LUNs are mirrored between the 2 arrays at the host level 2013 IBM Corporation
Business Challenge User dissatisfied with SAP performance Slow month-end batch reads and reporting Dialog response times approaching business SLAs Performance concerns causing hesitation to invest in growth 100K (USD) per month in SLA fines 2013 IBM Corporation
Fixes: Many alternate solutions tried / considered but limited success: Additional CPU => Limited improvements; Additional cost SAP dedicated SAN => Too expensive DS8700 SSD => Too expensive; Configuration limitations SAP / Oracle tuning => Limited changes helped a little; Extensive changes too labor intensive SAP HANA => Un-proven; insufficient app team cycles 2013 IBM Corporation
Proposed Fix Install (2) IBM FlashSystem 810 units, one to accelerate each Pod Attach behind SVC; powered by IBM P Series and AIX Migrate the database and logs to the SSD and mirror across the Pods via AIX-level mirroring (standard mirroring used today) 2013 IBM Corporation
Predicted Acceleration 250.00% 200.00% 150.00% 100.00% 50.00% 0.00% 1 Current wait time total 70.84% New project wait time 2.78% Total Improvement 213.10% Predicted CPU Utilization Percent 140 120 100 80 60 40 20 0 CPU% Corrected CPU% % Increase Before and After 2013 IBM Corporation
Results Actual CPU Utilization Percent 180.00 160.00 140.00 120.00 100.00 80.00 60.00 40.00 20.00 - CPU% Corrected CPU% Percent Increase Before and After 2013 IBM Corporation
Proof points: Stuttgarter Straßenbahnen AG SAP ERP Finance Oracle AIX SVC stretched cluster Several other customer cases show batch run time reduction by factor 5 to 10! 2013 IBM Corporation 47
Non-Oracle Examples 2014 IBM Corporation
Tipping Point Demonstration IBM FlashSystem, IBM Power Systems and DB2 Highly Scalable & I/O Intensive OLTP Database Workload Compelling Economics Significantly improved workload efficiency Extreme Capacity Buy only what you need; add capacity as needed Application Transparency Avoid risk and cost of change as you grow or subside Continuous Availability Uninterrupted access to data with consistent performance 10 GbE Networking Fibre Channel Networking IBM FlashSystem 820 (4-1U units, 20TB Each) IBM Power 780 (4 nodes, 128 Cores, 2TB Memory) IBM DB2 v10.5 (10-8 core cluster members) 2013 IBM Corporation
Tipping Point Demonstration Results IBM FlashSystem, IBM Power Systems and DB2 1.3 Million IOPS 43K+ Transactions per second 13K Updates per second Normalized $ / IOPS 11x Less IBM FlashSystem 2,500Spindles + 128 SSDs 5,000 Spindles Energy Space 26x Less 80x Less IBM FlashSystem 2,500Spindles + 128 SSDs 5,000 Spindles IBM FlashSystem 2,500Spindles + 128 SSDs 5,000 Spindles 2013 IBM Corporation
All flash Case Study: Life sciences Client SQL cluster IBM 3650 IBM 3650 10 TB Flash System 820 Problem Experiencing pain with JDE BD loads / backups / restores Needed better system performance for the end user Solution Installed IBM FlashSystem 820 into a a SQL DB, clustered, running Oracle JDE Included Oracle OLAP processes Benefit Backup Time improved from 5 hours to 42 minutes Restore Time improved from 6.5 hours to 1.2 hours Batch times went from 7:30 hours to 2:37 and 17:47 to 7:07 2013 IBM Corporation
Questions? Mike Ault mrault@us.ibm.com Thanks to : hakany@tr.ibm.com Ali Fığ ığlalı alif@tr.ibm.com Orçun Budak orcunb@tr.ibm.com STG Turkey for current results! 2013 IBM Corporation