Business in a Flash. Getting the Most Out of Flash Storage
Introduction, Usability, Optimization
May 2015
David Lin, Solutions Architect
dlin@vmem.com
The copyright for the images, icons, and logos used belongs to their respective owners. They have been used to educate the viewer on general flash storage concepts.
Agenda
- Brief Introduction to Violin Memory
- NAND Flash Refresher
- Flash Storage Categories
- Compromises
- Tricks of the Trade
- Stack of Layers
- Integration Pitfalls
- The Potential of Flash Storage, Success Stories
- Windows Storage I/O Performance Tools
Violin Memory: Accelerating business
- Founded in 2005
- Team of Flash industry pioneers
- Valuation: $800+ million (Q1 2012)
- Flash Memory Arrays since 2009
- 350+ employees; doubled employee count over last 18 months
- HW & SW innovation; 200+ engineering team
- Next Gen Tier 1: designed for Tier 1, worldwide support capability, proven industry-leading performance
- 300+ customers: Fortune 500 leaders; aggressively adding logos in Web, Telco, Banking, Retail, Govt., Healthcare, etc.
- Toshiba: NAND Flash inventor with close R&D collaboration
- Leading strategic investors: Toshiba, SAP, Juniper
- Global presence; worldwide headquarters in Silicon Valley, CA
- SV Startup of the Year (2011)
What is NAND Flash?
- Silicon-based storage medium
- Has no moving parts
- Uses electrons to store bits, not magnetic particles
- Is non-volatile; will not lose data if power is lost
Why flash instead of disk?
- The combination of data efficiency and the introduction of 3D NAND is making flash effectively free on a TCO basis.
- Over the next 18-24 months the market will explode into a multi-billion-dollar market, resulting in a complete shake-up of industry positions.
[Chart: $/Gb over time (today to +18-24 months), with curves for disk TCO/Gb, disk Capex/Gb, and flash Capex/Gb]
NAND Flash 101
- Erases at block level
- Writes at page level
- 1 package
- 8 dies
- 2 planes
- 1000s of blocks per die
- 256 pages per block
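The page-versus-block asymmetry above can be made concrete with a toy model. This is only an illustrative sketch, not how any real controller is implemented: the `FlashBlock` class and its method names are hypothetical, and the only figure taken from the slide is 256 pages per block.

```python
class FlashBlock:
    """Toy model of one NAND block: program per page, erase per block."""

    def __init__(self, pages_per_block=256):
        # None marks a page that is erased and therefore writable.
        self.pages = [None] * pages_per_block
        self.erase_count = 0

    def program(self, page_index, data):
        # A page can only be programmed while it is in the erased state.
        if self.pages[page_index] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.pages[page_index] = data

    def erase(self):
        # Erase clears every page in the block at once and adds wear.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1


block = FlashBlock()
block.program(0, b"v1")
try:
    block.program(0, b"v2")  # an in-place overwrite is rejected
except ValueError as err:
    print(err)
block.erase()                # the whole block must be erased first
block.program(0, b"v2")
print(block.erase_count)
```

Updating a single page costs an erase of the entire block in this model, which is why real devices redirect writes to already-erased pages and clean up later.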
Visualizing impact of storage latency on performance
I/O-bound apps: Oracle, DB2, SQL Server, etc.
- HDD storage: 8 ms latency (total latency = seek time + rotational latency)
- Flash storage: 0.5 ms (500 microseconds) latency
- Flash completes 16 I/Os in the time an HDD completes one (8 ms / 0.5 ms)
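The arithmetic behind this comparison can be sketched in a few lines. The helper name `ios_per_window` is hypothetical, and the model assumes strictly serial I/O (one request outstanding at a time), which is a simplification.

```python
def ios_per_window(latency_ms, window_ms):
    """Serial I/Os that complete in window_ms when each takes latency_ms."""
    return int(window_ms // latency_ms)


HDD_LATENCY_MS = 8.0    # from the slide
FLASH_LATENCY_MS = 0.5  # from the slide

hdd_ios = ios_per_window(HDD_LATENCY_MS, window_ms=8.0)
flash_ios = ios_per_window(FLASH_LATENCY_MS, window_ms=8.0)
print(hdd_ios, flash_ios)

# The same latencies expressed as serial I/Os per second.
print(1000.0 / HDD_LATENCY_MS, 1000.0 / FLASH_LATENCY_MS)
```

With more than one request in flight the absolute numbers change, but the ratio between the two latencies stays the same.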
Flash Storage Categories
- SD cards
- USB flash drives
- SAS or SATA SSDs
- FC, IB, 10 GbE connected All Flash Arrays
Flash Is Not A Panacea
The Good
- Non-volatile
- Extremely fast
- Low power
- Great at random I/O
The Bad
- Writes block reads
- Cannot over-write pages
- Erases block reads & writes
The Ugly
- Erases are painfully slow (1-5 ms)
- Erases wear flash out
- Gets slower with each generation
Flash must be managed properly to extract its true potential!
Why Purpose Built Arrays?
- Legacy Disk (10K RPM HDD): avg latency 3-5 ms
- Disk Plus (SSD): avg latency 1-2 ms
- Insanely Fast (NAND Flash): avg latency 250 µs (µs = 1 millionth of a second), 20x faster than disk!
Why Purpose Built Arrays?
- HDD: Storage O/S, Device layer
- SSD: Storage O/S, Device layer
- NAND Flash: Violin vmos
You must own the entire stack all the way down to the chip level to extract the true potential of flash.
New Solid-State Storage Tier
- Very fast
- No seek times
- Non-volatile
- Green
[Chart: capacity (1 GB to 1 PB) versus access delay (ns, 1 µs, 150 µs, 3 ms, 8 ms, 20 ms) for multi-core CPU processor cache, DRAM, Violin Memory Arrays, SSDs emulating HDDs, 15K disk array, and SATA array. Annotation: 8,000 µs (2 orders of magnitude).]
Compromises
Why is there such a wide range of flash storage products?
Adjusting for higher performance may
- Reduce capacity
- Reduce endurance
- Increase the price
Adjusting for higher capacity may
- Reduce performance
- Reduce endurance
- Reduce the price per GB
Adjusting for higher endurance may
- Reduce performance
- Reduce capacity
- Increase the price
Tricks of the trade
TRIM
- May help reduce wear on SSDs
- Must be supported by SSD and OS
Garbage Collection
- Required to reclaim deleted space
- May compete with primary writes
Overprovisioning
- Working space utilized in the process of garbage collection
Wear Leveling
- Algorithms to distribute writes across all flash evenly
Buffering and I/O management to bias for sequential vs random and read vs write
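As one illustration of the wear-leveling idea above, the sketch below always hands out the block with the fewest erases. It is a deliberately simplified toy: the function names are hypothetical, each allocation is assumed to cost exactly one erase, and real controllers also account for data temperature, garbage collection, and bad blocks.

```python
import heapq


def make_pool(num_blocks):
    # Heap of (erase_count, block_id); the least-worn block is always first.
    pool = [(0, block_id) for block_id in range(num_blocks)]
    heapq.heapify(pool)
    return pool


def allocate_and_recycle(pool):
    """Take the least-worn block, then return it with one more erase."""
    erase_count, block_id = heapq.heappop(pool)
    heapq.heappush(pool, (erase_count + 1, block_id))
    return block_id


pool = make_pool(num_blocks=4)
used = [allocate_and_recycle(pool) for _ in range(8)]
wear = sorted(count for count, _ in pool)
print(used)  # blocks are used in rotation
print(wear)  # erase counts stay even across the pool
```

Choosing the least-worn block each time keeps erase counts within one of each other, which is the "distribute writes evenly" goal in its simplest form.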
Stack of Layers
- Application I/O: How does the application read and write data?
- File system I/O: How does the operating system present the storage to applications?
- Operating system I/O: How does the system make the raw storage usable?
- Hardware I/O interfaces: Connecting storage devices to a system
- Physical integration: How is the storage attached, cabled, powered?
- Storage Device Settings: Overprovisioning, TRIM, bias for reads or writes, random or sequential
Integration Pitfalls
- Application I/O: Legacy I/O patterns: 4K, sequential, single threaded
- File system I/O: File system capabilities and/or defaults are for HDDs
- Operating system I/O: OS I/O defaults are to compensate for slowness of HDDs
- Hardware I/O interfaces: SSDs are plugged into RAID controllers for HDDs, which are designed for much lower aggregate throughput
- Physical integration: SSDs are plugged into HDD shelves, which are cabled for much lower aggregate throughput
- Storage Device Settings: Settings or product chosen is mismatched to workload
Application & Business Impact
Use case category; measurement; impact:
- Billing (Keenan); monthly billing run; 72 hrs to 22 hrs, 330% improvement
- Supply Chain Management (SAP Sales Distribution, Materials Management); ERP calculation; 7 hrs to 2 hrs, 350% improvement, plus consolidating multiple streets into one
- VDI (XenServer, Intel Westmere); desktop support; 615 users per core; fewer SAN ports (112) and servers (56) for 3000 users
Accelerated Return on Investment
- 50% reduction in total cost of ownership.
- Reduced the cost per IOP by nearly 9X (from 8.88 to 1.03). Mary Reeder, CTO
- $1.2M savings on legal fees and $775,000 savings in CAPEX.
- $400,000 savings in operational costs.
- $150,000 savings annually in operation costs.
- Up to 80% reduction in cooling, power, and space costs, and YoY savings of 200 per VDI unit.
- $500,000 in savings with an accelerated storage platform.
- ROI in >1 year with 90% reduction in report time for its billing application.
- ROI in 11 months; reduced nightly report times by 3X for Oracle E-Business Suite.
Accelerated Application Performance
- 5X improvement by reducing quarter-end report times from 123 hours down to 25 hours.
- 8X faster creation of virtual machine instances.
- ERP system is running 4X faster on the Violin platform.
- Video streaming application runs 10X faster on Violin Memory flash storage.
- 85% improvement in report turnaround time in NCCD's Oracle RAC database environment.
- >2X improvement: jobs that took 40 hours now take 18 hours. Chris Aiden, Head of Engineering
- Reduced processing time for its mission-critical billing app by 76%.
- Database productivity increases by 40%, allowing for more reports in less time.
- Reduced 400M patient scans down to 1 minute from 60 minutes.
Logistics company WFA testing results
Technical details
- SQL Server with 1.3 billion rows
- Utilizes Microsoft's SMB Direct networking protocol with RDMA-capable 10Gb Ethernet cards
- Host server: HP DL 580 (4 CPUs, 96 logical cores, 4 rack U)
- Storage: Violin Memory WFA (3 rack U, Windows Server embedded inside array for native SMB Direct)
Results from Violin WFA storage
Existing platform with Disk+Flash and SQL2012
- Test environment: 0.455 mill rows/sec
- Prod environment: 0.621 mill rows/sec
WFA+SMB Direct and SQL2012
- Test environment: 13.2 mill rows/sec (improved 30x)
- Prod environment: 13.2 mill rows/sec (improved 21x)
WFA+SMB Direct and SQL2014 (updatable in-memory columnstore index)
- Test environment: 211 mill rows/sec (improved 464x)
- Prod environment: 211 mill rows/sec (improved 340x)
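The improvement factors quoted on this slide can be recomputed from the throughput figures. This is a small sketch for checking the arithmetic; the dictionary and function names are hypothetical, and the numbers are copied from the slide.

```python
# Throughput in million rows/sec, as quoted on the slide.
baseline = {"test": 0.455, "prod": 0.621}     # Disk+Flash, SQL2012
wfa_sql2012 = {"test": 13.2, "prod": 13.2}    # WFA + SMB Direct, SQL2012
wfa_sql2014 = {"test": 211.0, "prod": 211.0}  # WFA + SMB Direct, SQL2014


def speedups(new, old):
    """Ratio of new to old throughput for each environment."""
    return {env: new[env] / old[env] for env in old}


s2012 = speedups(wfa_sql2012, baseline)
s2014 = speedups(wfa_sql2014, baseline)
for env in baseline:
    print(f"{env}: SQL2012 {s2012[env]:.1f}x, SQL2014 {s2014[env]:.1f}x")
```

The computed ratios agree with the slide after rounding, with the SQL2012 test-environment ratio coming out just over 29x against the quoted 30x.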
Australian Department of Defense
Challenges
- Latency-sensitive applications
- Massive IOPS requirement
- Data center space limitations
- Unable to take advantage of HA and virtualization
Solution
- Windows Flash Array
- 4-CPU host servers
Applications
- Network monitoring / dashboard reporting
- MSSQL
- Hyper-V
Results
- Sustained 500-800K IOPS at <1 ms latency
- Server consolidation: 45 servers to 2
- Core / licensing consolidation: 90 CPUs to 8
- 90% reduction in DC space & power requirements
- Simplified management and support processes on MS stack
Windows Storage I/O Performance Tools
Let's take a look at several Microsoft tools that show us how applications utilize hardware resources in general, and (flash) storage specifically. The first three are included with every modern Windows OS. The basic Task Manager (available any time you press Ctrl-Alt-Del) gives you general indications of how your system resources are utilized.
Windows Storage I/O Performance Tools
The Resource Monitor (press the button at the bottom of the Performance tab of the Task Manager) gives us a lot more detail, such as the number of threads an application is using or, as shown here, the files used by each process.
Windows Storage I/O Performance Tools
The Performance Monitor (search for it from the Start menu) can show us very detailed system resource metrics. You can also record these metrics and then use other tools (spreadsheets) to look for trends and correlations of these metrics with each other and with application performance to determine where bottlenecks may be.
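Besides a spreadsheet, a short script can do the same kind of correlation check on recorded metrics. The sketch below is only an illustration: the column names `disk_latency_ms` and `app_response_ms` are hypothetical (real counter exports use much longer counter paths as headers), and the sample rows are made up purely to show the mechanics.

```python
import csv
import io
import math

# Made-up sample rows; a real file would come from a recorded counter log.
SAMPLE_CSV = """disk_latency_ms,app_response_ms
1.0,12.0
2.0,14.0
4.0,18.0
8.0,26.0
"""


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
latency = [float(r["disk_latency_ms"]) for r in rows]
response = [float(r["app_response_ms"]) for r in rows]
r_value = pearson(latency, response)
print(round(r_value, 3))
```

A coefficient near 1 suggests the application slows down as storage latency rises, which points at storage as a likely bottleneck; a value near 0 suggests looking elsewhere.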
Windows Storage I/O Performance Tools
The Process Monitor is an advanced monitoring tool for Windows that shows real-time file system, Registry, and process/thread activity. It is a very powerful tool that shows everything your system is doing at the highest level of detail, such as I/O down to the transaction level. It is a free download from Microsoft. (https://technet.microsoft.com/en-us/library/bb896645.aspx)
In closing
- Flash storage of many kinds is widely available and can be simple to use.
- However, a specific capacity and type of flash storage may vary widely in price, performance, and endurance between product lines and vendors.
- Even more so, weak system integration of flash storage may deliver only a small fraction of the performance potential, sometimes less performance than the hard disk drives it is replacing.
- And poor application I/O may utilize only a fraction of the performance potential of any flash storage product.
Visit www.violin-memory.com or contact us if you would like to learn more.
Business in a Flash. Thank You