89 Fifth Avenue, 7th Floor, New York, NY 10003
www.theedison.com
212.367.7400

Capitalizing on Smarter and Faster Insight with Flash
IBM FlashSystem and IBM InfoSphere Identity Insight
Printed in the United States of America
Copyright 2014 Edison Group, Inc. New York.
Edison Group offers no warranty either expressed or implied on the information contained herein and shall be held harmless for errors resulting from its use.
All products are trademarks of their respective owners.
First Publication: March 2014
Produced by: Chris M. Evans, Senior Analyst; Manny Frishberg, Editor; Barry Cohen, Editor-in-Chief
Table of Contents

Executive Summary
Identity Insight Overview
The Problem
Flash to the Rescue
Flash Options
Proof Points
Conclusion
Executive Summary

Keeping pace with rapid technological change is a challenge for many organizations, especially as more and more business is conducted online. By implementing analytics solutions, organizations can improve their customer service, detect and pre-empt fraud and other criminal activity, and avoid threats more effectively. Operating a business today demands that decisions be made with the most current information. That requires timely processing of enormous amounts of available data in order to make smarter decisions in shorter time frames, increase protection, improve services and anticipate potential threats.

IBM InfoSphere Identity Insight enables businesses to use their own data to create meaningful and actionable information that can be used to discover opportunities or detect and prevent business threats. Identity Insight is part of IBM's Smarter Planet initiatives, including Smarter Public Safety and Smarter Counter Fraud.

To best exploit the value of high-velocity data analytics, Identity Insight can be combined with IBM FlashSystem technologies, notably the IBM Flash Adapter 90 and FlashSystem arrays, to analyze data over 100 times faster than with disk-based solutions. By comparison, to reach the level of throughput achieved by FlashSystem, a customer would need over 1,000 disks, filling two racks and costing over five times as much.

The need for technologies like Identity Insight will continue to grow, creating the requirement to analyze ever-larger amounts of data with increasing speed and accuracy. Identity Insight and IBM's flash portfolio are a perfect match for delivering the requirements of fast, accurate analytics, both today and into the future.

Edison: Capitalizing on Smarter and Faster Insight with IBM FlashSystem Page 1
Identity Insight Overview

IBM InfoSphere Identity Insight is a software solution for analyzing data sources to produce meaningful and actionable information in areas such as customer service, fraud detection and threat avoidance. Along with the rest of IBM's Analytics portfolio of Contextual Computing offerings, Identity Insight is focused on helping organizations parse the data available to them, and do so quickly enough that threats can be addressed while they are still happening.

The Identity Insight software takes data (in batch or streams) from multiple sources and formats, bringing the information together to identify individuals and groups, as well as their activities. From this data, the system is able to infer and create actionable information that can be used to discover opportunities or risks in areas such as customer service, fraud detection, identity management and law enforcement.

Five Things to Know about Contextual Computing: A Fun and Easy to Understand Six-Minute Video
The Problem

Because business is now conducted through multiple channels, enterprises benefit in many ways from being increasingly diligent about monitoring for and detecting unusual and potentially fraudulent activity or threats to their data security. Many financial organizations have breached compliance requirements by failing to properly detect money-laundering activity, resulting in large financial penalties. In one example, Identity Insight, integrated into MoneyGram's fraud detection system, was able to detect and halt a transfer of funds by a 100-year-old grandmother who was sending $2,500 in bail for her grandson's purported arrest. The request for money turned out to be a scam, and it was money the woman could scarcely afford to lose.

Identity and activity tracking information of this type has the highest value when delivered as quickly as possible. The faster the answers are delivered, the greater the potential benefit in both security and revenue. However, producing results with a high degree of accuracy requires the processing of ever-increasing amounts of data.

Identity Insight ingests an organization's proprietary data records, including personal details such as names, addresses, and phone numbers, through a variety of methods, including files and database extracts, as well as through web-based protocols. These information sources (whether arriving in batch or real time) are streamed into an analytics engine, updating multiple tables of data in a relational database. Data fragments are small, with typical systems holding billions of records in only 5-10TB of storage. Processing by Identity Insight results in a high I/O density, that is, a high number of IOPS (I/O requests per second) per TB of data stored.

Over time, hard drive capacities have increased. However, hard drive performance in terms of responsiveness and latency has remained relatively static. As a result, Identity Insight deployments using spinning disk media have needed to increase the number of physical disks (the spindle count) in order to achieve the required system performance, which results in provisioning more storage capacity than can be utilized. Although storage systems with large numbers of hard disks can deliver high IOPS, they cannot deliver the low latency needed to scale analytics-intensive workloads such as Identity Insight. This will continue to be a problem as data volumes and performance requirements grow.
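The overprovisioning effect described above can be illustrated with a rough sizing calculation. The sketch below is not taken from IBM's testing: the 50,000 IOPS target, the 150 random IOPS per drive, and the 900GB drive size are illustrative assumptions, and the function name is hypothetical. Only the 10TB data size reflects the range quoted in the text. RAID overhead is ignored.

```python
import math


def size_disk_array(target_iops, iops_per_drive, drive_capacity_tb, data_tb):
    """Size a hard disk array that must meet both an IOPS target and a
    capacity target. Returns (drive_count, provisioned_tb, utilization)."""
    drives_for_iops = math.ceil(target_iops / iops_per_drive)
    drives_for_capacity = math.ceil(data_tb / drive_capacity_tb)
    # The larger of the two requirements determines the spindle count.
    drives = max(drives_for_iops, drives_for_capacity)
    provisioned_tb = drives * drive_capacity_tb
    utilization = data_tb / provisioned_tb
    return drives, provisioned_tb, utilization


# All inputs except data_tb are assumptions chosen for illustration.
drives, provisioned_tb, utilization = size_disk_array(
    target_iops=50_000,      # assumed workload requirement
    iops_per_drive=150,      # assumed random IOPS for one 10K drive
    drive_capacity_tb=0.9,   # assumed 900GB drives
    data_tb=10,              # upper end of the 5-10TB range in the text
)

print(f"Drives needed: {drives}")
print(f"Capacity provisioned: {provisioned_tb:.1f} TB")
print(f"Capacity actually used: {utilization:.1%}")
```

Under these assumptions the IOPS requirement, not the data size, sets the drive count, so only a small fraction of the provisioned capacity holds data.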
Flash to the Rescue

Flash storage provides much greater I/O density than can be achieved with traditional spinning media, typically hundreds or thousands of times better than today's hard drives. Flash performs as well with random I/O workloads as with sequential ones, whereas random read I/O is particularly challenging for spinning media to handle. In addition, flash storage provides much lower latency (or response time) than hard drives, which enables more queries to be completed concurrently and results to be available for analysis more quickly.

IBM flash-based products such as the IBM Flash Adapter 90 and IBM FlashSystem arrays are perfectly suited to the demands of Identity Insight. IBM FlashSystem is not only capable of processing the data at high velocity; it can also handle a much larger workload than traditional storage. FlashSystem includes IBM's MicroLatency technology, which services read and write I/O in around 135 microseconds (as much as 25x faster than a storage array relying on hard disk drives). Because flash meets the I/O profile demands of Identity Insight, there is no need to overprovision resources in order to deliver the throughput required, resulting in a lower TCO, with savings in hardware, data center space, power and cooling.

Flash Options

IBM offers two flash solutions best suited to power Identity Insight. They are delivered in two very different form factors to address the broadest range of system requirements.

The IBM Flash Adapter 90 is a PCIe adapter card for IBM Power Systems that uses eMLC flash to deliver 900GB of storage capacity and up to 325,000 4KB random read IOPS. The Flash Adapter 90 delivers up to 1.2GB/s of read performance and 0.7GB/s of sequential write throughput. IBM recommends the Flash Adapter 90 for Identity Insight deployments up to 4TB, with the ability to deploy multiple Flash Adapter 90 adapters in each Power Systems server.

For larger capacity systems, IBM offers FlashSystem arrays: external storage that uses eMLC flash. FlashSystem 840 scales up to 40TB of protected capacity in only two rack units and delivers throughput of up to 8GB/s for 100% sequential reads and 4GB/s for 100% sequential writes. Latency values are as low as 90 microseconds for writes and 135 microseconds for reads, with each FlashSystem 840 capable of providing up to 1,100,000 random read IOPS while drawing less than 625 watts of power. In comparison, hard-disk-based solutions would occupy dramatically more rack space and consume more power in order to achieve these IOPS objectives, and would still be unable to achieve the MicroLatency that IBM FlashSystem delivers.
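The product figures quoted in this section can be turned into the I/O density measure (IOPS per TB) used earlier in the paper. The following is a minimal sketch that uses only the peak numbers stated above; the function and variable names are illustrative. Because the specifications are "up to" values and the power draw is an upper bound, the results are best-case indications rather than measured figures.

```python
def io_density(iops, capacity_tb):
    """I/O density: I/O requests per second per TB of capacity."""
    return iops / capacity_tb


# Peak figures as quoted in the text.
ADAPTER_90_IOPS = 325_000        # 4KB random read IOPS
ADAPTER_90_CAPACITY_TB = 0.9     # 900GB
FS840_IOPS = 1_100_000           # random read IOPS
FS840_CAPACITY_TB = 40           # protected capacity
FS840_MAX_WATTS = 625            # stated upper bound on power draw

adapter_density = io_density(ADAPTER_90_IOPS, ADAPTER_90_CAPACITY_TB)
fs840_density = io_density(FS840_IOPS, FS840_CAPACITY_TB)
# Power is quoted as "less than 625 watts", so this is a lower bound at peak IOPS.
fs840_iops_per_watt = FS840_IOPS / FS840_MAX_WATTS

print(f"Flash Adapter 90: {adapter_density:,.0f} IOPS per TB")
print(f"FlashSystem 840:  {fs840_density:,.0f} IOPS per TB")
print(f"FlashSystem 840:  at least {fs840_iops_per_watt:,.0f} IOPS per watt")
```

The adapter shows the higher density because its capacity is small; the array trades some density for a much larger capacity in one enclosure.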
Proof Points

IBM has performed extensive testing of Identity Insight against both traditional storage arrays and FlashSystem. In one test, IBM compared the performance of FlashSystem 840 configured with 40TB (RAID-5), Flash Adapter 90 storage configured with 2TB (mirrored), and a hard drive array configured with forty 900GB 10K SAS hard drives (RAID-5). In the test, IBM measured the Identity Insight transaction throughput at 100M records over a sustained period of 120 minutes.

Figure 1: Identity Insight Transaction Throughput at 100M Records

Figure 1 shows the throughput per minute achieved by FlashSystem 840, the Flash Adapter 90, and a high-performance hard disk drive array. Compared to disk, Flash Adapter 90 performed 37 times faster and FlashSystem performed 110 times faster. To reach the level of throughput achieved by FlashSystem (297K transactions per minute, or TPM), an Identity Insight user would need over 1,000 disks, filling two racks and costing over five times as much.

In this test the Identity Insight workload generated a mix of read and write IOPS in a three-to-two ratio. FlashSystem 840 used only half of its available IOPS and achieved a response latency of only 0.3 ms.

A business implementing Identity Insight is looking to detect and stop fraud before it impacts the business and its reputation. Using FlashSystem for data ingest provides business results in a fraction of the time required by a performance-optimized HDD array. This translates directly into greater impact from Identity Insight and major business value, as a security risk can be stopped in its tracks.
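The test results above state the FlashSystem throughput and the two speed-up ratios, but not the disk or Flash Adapter 90 throughput directly. The sketch below derives those implied values, along with the read share of the workload. The derived numbers are approximations, since the ratios in the text are presumably rounded, and the variable names are illustrative.

```python
# Figures stated in the text.
FLASHSYSTEM_TPM = 297_000        # FlashSystem 840 transactions per minute
FLASHSYSTEM_SPEEDUP = 110        # FlashSystem vs. disk
ADAPTER_SPEEDUP = 37             # Flash Adapter 90 vs. disk
READ_PARTS, WRITE_PARTS = 3, 2   # read-to-write ratio of the workload

# Implied (not reported) throughput of the forty-drive disk array.
disk_tpm = FLASHSYSTEM_TPM / FLASHSYSTEM_SPEEDUP
# Implied (not reported) throughput of the Flash Adapter 90 configuration.
adapter_tpm = disk_tpm * ADAPTER_SPEEDUP
# Share of I/O requests that are reads in a three-to-two mix.
read_share = READ_PARTS / (READ_PARTS + WRITE_PARTS)

print(f"Implied disk array throughput:       {disk_tpm:,.0f} TPM")
print(f"Implied Flash Adapter 90 throughput: {adapter_tpm:,.0f} TPM")
print(f"Read share of the I/O mix:           {read_share:.0%}")
```

These values only restate the published ratios in absolute terms; they do not reproduce the paper's 1,000-disk estimate, whose sizing assumptions are not given.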
Conclusion

InfoSphere Identity Insight integrates diverse data sources to provide customers with real, actionable information based on the high-velocity processing of multiple data streams. The results of Identity Insight analytics are of highest value when they are produced on the lowest-latency, highest-throughput storage. Hard disk technology cannot keep up with the required I/O and latency demands and still deliver a solution with an acceptable TCO. When Identity Insight is combined with IBM flash storage, using either Flash Adapter 90 or, for more scalable solutions, FlashSystem arrays, the result is a system that delivers superior performance and analytics that translate directly into business value for IBM customers.