Big Data Beyond the Hype



About the Authors

Paul Zikopoulos is the Vice President of Technical Professionals for IBM's Information Management division and leads its World-Wide Competitive Database and Big Data teams. Paul is an award-winning writer and speaker with more than 20 years of experience in information management and is seen as a global expert in Big Data and Analytic technologies. Independent groups often recognize Paul as a thought leader with nominations to SAP's Top 50 Big Data Twitter Influencers, Big Data Republic's Most Influential, Onalytica's Top 100, and Analytics Week's Thought Leader in Big Data and Analytics lists. Technopedia listed him as a Big Data Expert to Follow, and he was consulted on the topic by the popular TV show 60 Minutes. Paul has written more than 350 magazine articles and 18 books, some of which include Hadoop for Dummies, Harness the Power of Big Data, Understanding Big Data, DB2 for Dummies, and more. Paul has earned an undergraduate degree in Economics and an MBA. In his spare time, he enjoys all sorts of sporting activities, including, apparently, hot yoga. Ultimately, Paul is trying to figure out the world according to Chloë, his daughter. Follow him on

Dirk deRoos is IBM's World-Wide Technical Sales Leader for IBM's Big Data technologies. Dirk has spent the past four years helping customers build Big Data solutions featuring InfoSphere BigInsights for Hadoop and Apache Hadoop, along with other components in IBM's Big Data and Analytics platform. Dirk has co-authored three books on this subject area: Hadoop for Dummies, Harness the Power of Big Data, and Understanding Big Data. Dirk earned two degrees from the University of New Brunswick in Canada: a bachelor of computer science and a bachelor of arts (honors English). You can reach him on

Christopher Bienko covers the Information Management Cloud Solutions portfolio for IBM's World-Wide Technical Sales organization.
Over the better part of two years, Christopher has navigated the cloud computing domain, enabling IBM customers and sellers to stay abreast of these rapidly evolving technologies. Prior to this, Christopher was polishing off his freshly minted bachelor of computer science degree from Dalhousie University. Christopher also holds a bachelor of science degree from Dalhousie University, with a double major in biology and English. You can follow his musings on and see the world through his lens at

Rick Buglio has been at IBM for more than eight years and is currently a product manager responsible for managing IBM's InfoSphere Optim solutions, which are an integral part of IBM's InfoSphere lifecycle management portfolio. He specializes in Optim's Data Privacy and Test Data Management services, which are used extensively to build right-sized, privatized, and trusted nonproduction environments. Prior to joining the Optim team at IBM, he was a product manager for IBM's Data Studio solution and was instrumental in bringing the product to market. Rick has more than 35 years of experience in the information technology and commercial software industry across a vast number of roles as an application programmer, business analyst, database administrator, and consultant, and has spent the last 18 years as a product manager specializing in the design, management, and delivery of successful and effective database management solutions for numerous industry-leading database management software companies.

Marc Andrews is an IBM Vice President who leads a worldwide team of industry consultants and solution architects who are helping organizations use increasing amounts of information and advanced analytics to develop industry-specific Big Data and Analytics solutions. Marc meets with business and technology leaders from multiple industries across the globe to share best practices and identify how they can take advantage of emerging Big Data and Analytics capabilities to drive new business value. Marc is a member of the IBM Industry Academy and has been involved in several multibillion-dollar acquisitions in the information management and analytics space. He has a bachelor of economics degree from the Wharton Business School at the University of Pennsylvania and holds three patents related to information integration. You can reach him at

About the Technical Editor

Roman B. Melnyk is a senior member of the DB2 Information Development team. Roman edited DB2 10.5 with BLU Acceleration: New Dynamic In-Memory Analytics for the Era of Big Data; Harness the Power of Big Data: The IBM Big Data Platform; Warp Speed, Time Travel, Big Data, and More: DB2 10 for Linux, UNIX, and Windows New Features; and Apache Derby: Off to the Races. Roman co-authored Hadoop for Dummies, DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, and DB2 for Dummies.

Big Data Beyond the Hype
A Guide to Conversations for Today's Data Center

Paul Zikopoulos
Dirk deRoos
Christopher Bienko
Rick Buglio
Marc Andrews

New York Chicago San Francisco Athens London Madrid Mexico City Milan New Delhi Singapore Sydney Toronto

Copyright 2015 by McGraw-Hill Education. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher, with the exception that the program listings may be entered, stored, and executed in a computer system, but they may not be reproduced for publication.

ISBN: MHID: X

The material in this ebook also appears in the print version of this title: ISBN: , MHID:

ebook conversion by codemantra
Version 1.0

All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps.

McGraw-Hill Education ebooks are available at special quantity discounts to use as premiums and sales promotions or for use in corporate training programs. To contact a representative, please visit the Contact Us page at

The contents of this book represent those features that may or may not be available in the current release of any on premise or off premise products or services mentioned within this book despite what the book may say. IBM reserves the right to include or exclude any functionality mentioned in this book for the current or subsequent releases of any IBM cloud services or products mentioned in this book. Decisions to purchase any IBM software should not be made based on the features said to be available in this book. In addition, any performance claims made in this book aren't official communications by IBM; rather, they are the results observed by the authors in unaudited testing.
The views expressed in this book are ultimately those of the authors and not necessarily those of IBM Corporation.

Information has been obtained by McGraw-Hill Education from sources believed to be reliable. However, because of the possibility of human or mechanical error by our sources, McGraw-Hill Education, or others, McGraw-Hill Education does not guarantee the accuracy, adequacy, or completeness of any information and is not responsible for any errors or omissions or the results obtained from the use of such information.

TERMS OF USE

This is a copyrighted work and McGraw-Hill Education and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill Education's prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms.

THE WORK IS PROVIDED "AS IS." McGRAW-HILL EDUCATION AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill Education and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free.
Neither McGraw-Hill Education nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill Education has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill Education and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.

Book number 19: One day I'll come to my senses. The time and energy needed to write a book in your spare time are daunting, but if the end result is the telling of a great story (in this case, the topic of Big Data in general and the IBM Big Data and Analytics platform), I feel that it is well worth the sacrifice.

Speaking of sacrifice, it's not just the authors who pay a price to tell a story; it's also their loved ones, and I am lucky to have plenty of those people. My wife is unwavering in her support for my career and takes a backseat more often than not; she's the behind-the-scenes person who makes it all work: Thank you. And my sparkling daughter, Chloë, sheer inspiration behind a simple smile, who so empowers me with endless energy and excitement each and every day. Over the course of the summer, we started to talk about Hadoop. She hears so much about it and wanted to know what it is. We discussed massively parallel processing (MPP), and I gave her a challenge: "Create for me an idea how MPP could help in a kid's world." After some thought, she brought a grass rake with marshmallows on each tine, put it on the BBQ, and asked, "Is this MPP, BigDaDa?" in her witty voice. That's an example of the sparkling energy I'm talking about. I love you for this, Chloë, and for so much more.

I want to thank all the IBMers who I interact with daily in my quest for knowledge. I'm not born with it; I learn it, and from some of the most talented people in the world: IBMers.

Finally, to Debbie Smith, Kelly McCoy, Brad Strickert, Brant Hurren, Lindsay Hogg, Brittany Fletcher, and the rest of the Canada Power Yoga crew in Oshawa: In the half-year leading up to this book, I badly injured my back twice. I don't have the words to describe the frustration and despair and how my life had taken a wrong turn. After meeting Debbie and her crew, I found that, indeed, my life had taken a turn, but in the opposite direction from what I had originally thought. Thanks to this studio for teaching me a new practice, for your caring and your acceptance, and for restoring me to a place of well-being (at least until Chloë is a teenager). Namaste.
Paul Zikopoulos

To Sandra, Erik, and Anna: Yet again, I've taken on a book project, and yet again, you've given me the support that I've needed.
Dirk deRoos

To my family and friends who patiently put up with the late nights, short weekends, and the long hours it takes to be a part of this incredible team: Thank you. I would also like to thank those who first domesticated Coffee arabica for making the aforementioned possible.
Christopher Bienko

I would like to thank my immediate family for supporting my contribution to this book by putting up with my early and late hours and, at times, my stressed-out attitude. I especially want to thank my wife, Lyndy, my in-house English major, who helped edit and proofread my first draft and this dedication. I would also like to thank Paul Zikopoulos for providing me with the opportunity to be part of this book, even though there were times when I thought I was crazy for volunteering. Finally, I would like to thank my entire Optim extended family for having my back at work and for allowing me to go stealth as needed so that I could focus on making this project a reality.
Rick Buglio

I would like to thank all of our clients for taking the time to tell us about their challenges and giving us the opportunity to demonstrate how we can help them. I would also like to thank the entire IBM Big Data Industry Team for their continued motivation and passion working with our clients to understand their business needs and helping them to find ways of delivering new value through information and analytics. By listening to our clients and sharing their experiences, we are able to continuously learn new ways to help transform industries and businesses with data. And thank you to my family, Amy, Ayla, and Ethan, for their patience and support even when I am constantly away from home to spend time with companies across the world in my personal pursuit to make an impact.
Marc Andrews

CONTENTS AT A GLANCE

PART I Opening Conversations About Big Data
1 Getting Hype out of the Way: Big Data and Beyond
2 To SQL or Not to SQL: That's Not the Question, It's the Era of Polyglot Persistence
3 Composing Cloud Applications: Why We Love the Bluemix and the IBM Cloud
4 The Data Zones Model: A New Approach to Managing Data

PART II Watson Foundations
5 Starting Out with a Solid Base: A Tour of Watson Foundations
6 Landing Your Data in Style with Blue Suit Hadoop: InfoSphere BigInsights
7 In the Moment Analytics: InfoSphere Streams
8 Million Times Faster Than the Blink of an Eye: BLU Acceleration
9 An Expert Integrated System for Deep Analytics
10 Build More, Grow More, Sleep More: IBM Cloudant

PART III Calming the Waters: Big Data Governance
11 Guiding Principles for Data Governance
12 Security Is NOT an Afterthought
13 Big Data Lifecycle Management
14 Matching at Scale: Big Match

CONTENTS

Foreword
Acknowledgments
Introduction

PART I Opening Conversations About Big Data

1 Getting Hype out of the Way: Big Data and Beyond
    There's Gold in Them There Hills!
    Why Is Big Data Important?
    Brought to You by the Letter V: How We Define Big Data
    Cognitive Computing
    Why Does the Big Data World Need Cognitive Computing?
    A Big Data and Analytics Platform Manifesto
    Discover, Explore, and Navigate Big Data Sources
    Land, Manage, and Store Huge Volumes of Any Data
    Structured and Controlled Data
    Manage and Analyze Unstructured Data
    Analyze Data in Real Time
    A Rich Library of Analytical Functions and Tools
    Integrate and Govern All Data Sources
    Cognitive Computing Systems
    Of Cloud and Manifestos
    Wrapping It Up

2 To SQL or Not to SQL: That's Not the Question, It's the Era of Polyglot Persistence
    Core Value Systems: What Makes a NoSQL Practitioner Tick
    What Is NoSQL?
    Is Hadoop a NoSQL Database?
    Different Strokes for Different Folks: The NoSQL Classification System
    Give Me a Key, I'll Give You a Value: The Key/Value Store
    The Grand-Daddy of Them All: The Document Store
    Column Family, Columnar Store, or BigTable Derivatives: What Do We Call You?
    Don't Underestimate the Underdog: The Graph Store
    From ACID to CAP
    CAP Theorem and a Meatloaf Song: Two Out of Three Ain't Bad
    Let Me Get This Straight: There Is SQL, NoSQL, and Now NewSQL?
    Wrapping It Up

3 Composing Cloud Applications: Why We Love the Bluemix and the IBM Cloud
    At Your Service: Explaining Cloud Provisioning Models
    Setting a Foundation for the Cloud: Infrastructure as a Service
    IaaS for Tomorrow Available Today: IBM SoftLayer Powers the IBM Cloud
    Noisy Neighbors Can Be Bad Neighbors: The Multitenant Cloud
    Building the Developer's Sandbox with Platform as a Service
    If You Have Only a Couple of Minutes: PaaS and IBM Bluemix in a Nutshell
    Digging Deeper into PaaS
    Being Social on the Cloud: How Bluemix Integrates Platforms and Architectures
    Understanding the Hybrid Cloud: Playing Frankenstein Without the Horror
    Tried and Tested: How Deployable Patterns Simplify PaaS
    Composing the Fabric of Cloud Services: IBM Bluemix
    Parting Words on Platform as a Service
    Consuming Functionality Without the Stress: Software as a Service
    The Cloud Bazaar: SaaS and the API Economy
    Demolishing the Barrier to Entry for Cloud-Ready Analytics: IBM's dashDB
    Build More, Grow More, Know More: dashDB's Cloud SaaS
    Refinery as a Service
    Wrapping It Up

4 The Data Zones Model: A New Approach to Managing Data
    Challenges with the Traditional Approach
    Agility
    Cost
    Depth of Insight
    Next-Generation Information Management Architectures
    Prepare for Touchdown: The Landing Zone
    Into the Unknown: The Exploration Zone
    Into the Deep: The Deep Analytic Zone
    Curtain Call: The New Staging Zone
    You Have Questions? We Have Answers! The Queryable Archive Zone
    In Big Data We Trust: The Trusted Data Zone
    A Zone for Business Reporting
    From Forecast to Nowcast: The Real-Time Processing and Analytics Zone
    Ladies and Gentlemen, Presenting The Data Zones Model

PART II Watson Foundations

5 Starting Out with a Solid Base: A Tour of Watson Foundations
    Overview of Watson Foundations
    A Continuum of Analytics Capabilities: Foundations for Watson

6 Landing Your Data in Style with Blue Suit Hadoop: InfoSphere BigInsights
    Where Do Elephants Come From: What Is Hadoop?
    A Brief History of Hadoop
    Components of Hadoop and Related Projects
    Open Source and Proud of It
    Making Analytics on Hadoop Easy
    The Real Deal for SQL on Hadoop: Big SQL
    Machine Learning for the Masses: Big R and SystemML
    The Advanced Text Analytics Toolkit
    Data Discovery and Visualization: BigSheets
    Spatiotemporal Analytics
    Finding Needles in Haystacks of Needles: Indexing and Search in BigInsights
    Cradle-to-Grave Application Development Support
    The BigInsights Integrated Development Environment
    The BigInsights Application Lifecycle
    An App Store for Hadoop: Easy Deployment and Execution of Custom Applications
    Keeping the Sandbox Tidy: Sharing and Managing Hadoop
    The BigInsights Web Console
    Monitoring the Aspects of Your Cluster
    Securing the BigInsights for Hadoop Cluster
    Adaptive MapReduce
    A Flexible File System for Hadoop: GPFS-FPO
    Playing Nice: Integration with Other Data Center Systems
    IBM InfoSphere System z Connector for Hadoop
    IBM PureData System for Analytics
    InfoSphere Streams for Data in Motion
    InfoSphere Information Server for Data Integration
    Matching at Scale with Big Match
    Securing Hadoop with Guardium and Optim
    Broad Integration Support
    Deployment Flexibility
    BigInsights Editions: Free, Low-Cost, and Premium Offerings
    A Low-Cost Way to Get Started: Running BigInsights on the Cloud
    Higher-Class Hardware: Power and System z Support
    Get Started Quickly!
    Wrapping It Up

7 In the Moment Analytics: InfoSphere Streams
    Introducing Streaming Data Analysis
    How InfoSphere Streams Works
    A Simple Streams Application
    Recommended Uses for Streams
    How Is Streams Different from CEP Systems?
    Stream Processing Modes: Preserve Currency or Preserve Each Record
    High Availability
    Dynamically Distributed Processing
    InfoSphere Streams Platform Components
    The Streams Console
    An Integrated Development Environment for Streams: Streams Studio
    The Streams Processing Language
    Source and Sink Adapters
    Analytical Operators
    Streams Toolkits
    Solution Accelerators
    Use Cases
    Get Started Quickly!
    Wrapping It Up

8 Million Times Faster Than the Blink of an Eye: BLU Acceleration
    What Is BLU Acceleration?
    What Does a Next Generation Database Service for Analytics Look Like?
    Seamlessly Integrated
    Hardware Optimized
    Convince Me to Take BLU Acceleration for a Test Drive
    Pedal to the Floor: How Fast Is BLU Acceleration?
    From Minimized to Minuscule: BLU Acceleration Compression Ratios
    Where Will I Use BLU Acceleration?
    How BLU Acceleration Came to Be: Seven Big Ideas
    Big Idea #1: KISS It!
    Big Idea #2: Actionable Compression and Computer-Friendly Encoding
    Big Idea #3: Multiplying the Power of the CPU
    Big Idea #4: Parallel Vector Processing
    Big Idea #5: Get Organized by Column
    Big Idea #6: Dynamic In-Memory Processing
    Big Idea #7: Data Skipping
    How Seven Big Ideas Optimize the Hardware Stack
    The Sum of All Big Ideas: BLU Acceleration in Action
    DB2 with BLU Acceleration Shadow Tables: When OLTP + OLAP = 1 DB
    What Lurks in These Shadows Isn't Anything to Be Scared of: Operational Reporting
    Wrapping It Up

9 An Expert Integrated System for Deep Analytics
    Before We Begin: Bursting into the Cloud
    Starting on the Whiteboard: Netezza's Design Principles
    Appliance Simplicity: Minimize the Human Effort
    Process Analytics Closer to the Data Store
    Balanced + MPP = Linear Scalability
    Modular Design: Support Flexible Configurations and Extreme Scalability
    What's in the Box? The Netezza Appliance Architecture Overview
    A Look Inside a Netezza Box
    How a Query Runs in Netezza
    How Netezza Is a Platform for Analytics
    Wrapping It Up

10 Build More, Grow More, Sleep More: IBM Cloudant
    Cloudant: White Glove Database as a Service
    Where Did Cloudant Roll in From?
    Cloudant or Hadoop?
    Being Flexible: Schemas with JSON
    Cloudant Clustering: Scaling for the Cloud
    Avoiding Mongo-Size Outages: Sleep Soundly with Cloudant Replication
    Cloudant Sync Brings Data to a Mobile World
    Make Data, Not War: Cloudant Versioning and Conflict Resolution
    Unlocking GIS Data with Cloudant Geospatial
    Cloudant Local
    Here on In: For Techies
    For Techies: Leveraging the Cloudant Primary Index
    Exploring Data with Cloudant's Secondary Index Views
    Performing Ad Hoc Queries with the Cloudant Search Index
    Parameters That Govern a Logical Cloudant Database
    Remember! Cloudant Is DBaaS
    Wrapping It Up

PART III Calming the Waters: Big Data Governance

11 Guiding Principles for Data Governance
    The IBM Data Governance Council Maturity Model
    Wrapping It Up

12 Security Is NOT an Afterthought
    Security Big Data: How It's Different
    Securing Big Data in Hadoop
    Culture, Definition, Charter, Foundation, and Data Governance
    What Is Sensitive Data?
    The Masquerade Gala: Masking Sensitive Data
    Don't Break the DAM: Monitoring and Controlling Access to Data
    Protecting Data at Rest
    Wrapping It Up

13 Big Data Lifecycle Management
    A Foundation for Data Governance: The Information Governance Catalog
    Data on Demand: Data Click
    Data Integration
    Data Quality
    Veracity as a Service: IBM DataWorks
    Managing Your Test Data: Optim Test Data Management
    A Retirement Home for Your Data: Optim Data Archive
    Wrapping It Up

14 Matching at Scale: Big Match
    What Is Matching Anyway?
    A Teaser: Where Are You Going to Use Big Match?
    Matching on Hadoop
    Matching Approaches
    Big Match Architecture
    Big Match Algorithm Configuration Files
    Big Match Applications
    HBase Tables
    Probabilistic Matching Engine
    How It Works
    Extract
    Search
    Applications for Big Match
    Enabling the Landing Zone
    Enhanced 360-Degree View of Your Customers
    More Reliable Data Exploration
    Large-Scale Searches for Matching Records
    Wrapping It Up

FOREWORD

Through thousands of client engagements, we've learned organizations that outperform are using data for competitive advantage. What sets these leaders apart? It's their ability to get three things right. First, they drive business outcomes by applying more sophisticated analytics across more disparate data sources in more parts of their organization: they infuse analytics everywhere. Second, they capture the time value of data by developing speed of insight and speed of action as core differentiators: they don't wait. Third, they look to change the game in their industry or profession: they shift the status quo. These game changers could be in how they apply Big Data and Analytics to attract, grow, and retain customers; manage risk or transform financial processes; optimize operations; or indeed create new business models. At their very core, they directly or indirectly seek to capitalize on the value of information.

If you selected this book, you likely play a key role in the transformation of your organization through Big Data and Analytics. You are tired of the hype and are ready to have a conversation about seizing the value of data. This book provides a practical introduction to the next generation of data architectures; introduces the role of the cloud and NoSQL technologies; and discusses the practicalities of security, privacy, and governance. Whether you are new to this topic or an expert looking for the latest information, this book provides a solid foundation on which to grow.

Our writers are accomplished scientists, developers, architects, and mathematicians. They are passionate about helping our clients turn hype into reality. They understand the complexities of rapidly shifting technology and the practicalities of evolving and expanding a data architecture already in place. They have worked side-by-side with clients to help them transform their organization while keeping them on a winning path.

I'd like to acknowledge Paul, Dirk, Chris, Rick, and Marc for sharing their deep knowledge of this complex topic in a style that makes it accessible to all of us ready to seize our moment on this Big Data journey.

Beth Smith
General Manager, IBM Information Management


ACKNOWLEDGMENTS

Collectively, we want to thank the following people, without whom this book would not have been possible: Anjul Bhambhri, Rob Thomas, Roger Rea, Steven Sit, Rob Utzschneider, Joe DiPietro, Nagui Halim, Shivakumar Vaithyanathan, Shankar Venkataraman, Dwaine Snow, Andrew Buckler, Glen Sheffield, Klaus Roder, Ritika Gunnar, Tim Vincent, Jennifer McGinn, Anand Ranganathan, Jennifer Chen, and Robert Uleman. Thanks also to all the other people in our business who make personal sacrifices day in and day out to bring you the IBM Big Data and Analytics platform. IBM is an amazing place to work and is unparalleled when you get to work beside this kind of brain power every day.

Roman Melnyk, our technical editor, has been working with us for a long time, sometimes as a coauthor, sometimes as an editor, but always as an integral part of the project. We also want to thank Xiamei (May) Li, who bailed us out on Chapter 14 and brought some common sense to our Big Match chapter. Bob Harbus helped us a lot with Chapter 8 (the shadow tables technology), and we wanted to thank him here too.

We want to thank (although at times we cursed) Susan Visser, Frances Wells, Melissa Reilly, and Linda Currie for getting the book in place; an idea is an idea, but it takes people like this to help get that idea up and running. Our editing team (Janet Walden, Kim Wimpsett, and Lisa McCoy) all played a key role behind the scenes, and we want to extend our thanks for that. It's also hard not to give a special sentence of thanks to Hardik Popli at Cenveo Publisher Services; the guy's personal effort to perfection is beyond apparent. Finally, to our McGraw-Hill guru, Paul Carlstroem: there is a reason why we specifically want to work with you. You did more magic for this book than any other before it. Thank you.


INTRODUCTION

The poet A.R. Ammons once wrote, "A word too much repeated falls out of being." Well, kudos to the term Big Data, because it's hanging in there, and it's hard to imagine a term with more hype than Big Data. Indeed, perhaps it is repeated too much. Big Data Beyond the Hype: A Guide to Conversations for Today's Data Center is a collection of discussions that take an overused term and break it down into a confluence of technologies, some that have been around for a while, some that are relatively new, and others that are just coming down the pipe or are not even a market reality yet.

The book is organized into three parts. Part I, "Opening Conversations About Big Data," gives you a framework so that you can engage in Big Data conversations in social forums, at keynotes, in architectural reviews, during marketing mix planning, at the office watercooler, or even with your spouse (nothing like a Big Data discussion to inject romance into an evening). Although we talk a bit about what IBM does in this space, the aim of this part is to give you a grounding in cloud service delivery models, NoSQL, Big Data, cognitive computing, what a modern data information architecture looks like, and more. This part gives you the constructs and foundations that you will need to engage in conversations that can indeed hype Big Data, but that also allow you to extend those conversations beyond the hype.

In Chapter 1, we briefly tackle, define, and illustrate the term Big Data. Although its use is ubiquitous, we think that many people have used it irresponsibly. For example, some people think Big Data just means Hadoop, and although Hadoop is indeed a critical repository and execution engine in the Big Data world, Hadoop is not solely Big Data. In fact, without analytics, Big Data is, well, just a bunch of data. Others think Big Data just means more data, and although that could be a characteristic, you certainly can engage in Big Data without lots of data. Big Data certainly doesn't replace the RDBMS either, and admittedly we do find it ironic that the biggest trend in the NoSQL world is SQL. We also included in this chapter a discussion of cognitive computing, the next epoch of data analytics. IBM Watson represents a whole new class of industry-specific solutions called Cognitive Systems. It builds upon but is


More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

SAP HANA One Platform for All Applications

SAP HANA One Platform for All Applications SAP HANA SAP HANA One Platform for All Applications Table of Contents 2 SAP HANA Platform 4 SAP Product Road Map 7 User Experience with SAP Fiori 8 Unified Extensibility 10 Superior Customer Value Note:

More information

Why DBMSs Matter More than Ever in the Big Data Era

Why DBMSs Matter More than Ever in the Big Data Era E-PAPER FEBRUARY 2014 Why DBMSs Matter More than Ever in the Big Data Era Having the right database infrastructure can make or break big data analytics projects. TW_1401138 Big data has become big news

More information

HP Vertica OnDemand. Vertica OnDemand. Enterprise-class Big Data analytics in the cloud. Enterprise-class Big Data analytics for any size organization

HP Vertica OnDemand. Vertica OnDemand. Enterprise-class Big Data analytics in the cloud. Enterprise-class Big Data analytics for any size organization Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater

More information

Welcome to The Future of Analytics In Action

Welcome to The Future of Analytics In Action Welcome to The Future of Analytics In Action IBM Cloud Data Services Goals for Today Share the cloud-based data management and analytics technologies that are enabling rapid development of new mobile and

More information

IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse

IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse IBM Analytics Just the facts: Four critical concepts for planning the logical data warehouse 1 2 3 4 5 6 Introduction Complexity Speed is businessfriendly Cost reduction is crucial Analytics: The key to

More information

IBM Big Data Platform

IBM Big Data Platform IBM Big Data Platform Turning big data into smarter decisions Stefan Söderlund. IBM kundarkitekt, Försvarsmakten Sesam vår-seminarie Big Data, Bigga byte kräver Pigga Hertz! May 16, 2013 By 2015, 80% of

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Oracle Big Data Discovery The Visual Face of Hadoop

Oracle Big Data Discovery The Visual Face of Hadoop Disclaimer: This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development,

More information

WELCOME TO The Future of Analytics in Action: The Art of the Possible

WELCOME TO The Future of Analytics in Action: The Art of the Possible WELCOME TO The Future of Analytics in Action: The Art of the Possible Goals for Today Share the cloud-based data management and analytics technologies that are enabling rapid development of new mobile

More information

IBM Software Hadoop in the cloud

IBM Software Hadoop in the cloud IBM Software Hadoop in the cloud Leverage big data analytics easily and cost-effectively with IBM InfoSphere 1 2 3 4 5 Introduction Cloud and analytics: The new growth engine Enhancing Hadoop in the cloud

More information

Key Attributes for Analytics in an IBM i environment

Key Attributes for Analytics in an IBM i environment Key Attributes for Analytics in an IBM i environment Companies worldwide invest millions of dollars in operational applications to improve the way they conduct business. While these systems provide significant

More information

IBM Netezza High Capacity Appliance

IBM Netezza High Capacity Appliance IBM Netezza High Capacity Appliance Petascale Data Archival, Analysis and Disaster Recovery Solutions IBM Netezza High Capacity Appliance Highlights: Allows querying and analysis of deep archival data

More information

of DATA FUTURE The WAREHOUSING Best Practices Series IBM Syncsort PAGE 4 PAGE 6 WHY CLOUD IS THE FUTURE OF DATA WAREHOUSING

of DATA FUTURE The WAREHOUSING Best Practices Series IBM Syncsort PAGE 4 PAGE 6 WHY CLOUD IS THE FUTURE OF DATA WAREHOUSING IBM PAGE 4 WHY CLOUD IS THE FUTURE OF DATA WAREHOUSING Syncsort PAGE 6 A THOUGHTFUL APPROACH TO OPTIMIZING THE DATA WAREHOUSE WITH HADOOP The FUTURE of DATA WAREHOUSING Best Practices Series 2 APRIL/MAY

More information

Tap into Big Data at the Speed of Business

Tap into Big Data at the Speed of Business SAP Brief SAP Technology SAP Sybase IQ Objectives Tap into Big Data at the Speed of Business A simpler, more affordable approach to Big Data analytics A simpler, more affordable approach to Big Data analytics

More information

CREATING PACKAGED IP FOR BUSINESS ANALYTICS PROJECTS

CREATING PACKAGED IP FOR BUSINESS ANALYTICS PROJECTS CREATING PACKAGED IP FOR BUSINESS ANALYTICS PROJECTS A PERSPECTIVE FOR SYSTEMS INTEGRATORS Sponsored by Microsoft Corporation 1/ What is Packaged IP? Categorizing the Options 2/ Why Offer Packaged IP?

More information

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3

More information

IBM InfoSphere BigInsights Enterprise Edition

IBM InfoSphere BigInsights Enterprise Edition IBM InfoSphere BigInsights Enterprise Edition Efficiently manage and mine big data for valuable insights Highlights Advanced analytics for structured, semi-structured and unstructured data Professional-grade

More information

Integrate and Deliver Trusted Data and Enable Deep Insights

Integrate and Deliver Trusted Data and Enable Deep Insights SAP Technical Brief SAP s for Enterprise Information Management SAP Data Services Objectives Integrate and Deliver Trusted Data and Enable Deep Insights Provide a wide-ranging view of enterprise information

More information

Turning Big Data into Big Insights

Turning Big Data into Big Insights mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed

More information

Modern Data Integration

Modern Data Integration Modern Data Integration Whitepaper Table of contents Preface(by Jonathan Wu)... 3 The Pardigm Shift... 4 The Shift in Data... 5 The Shift in Complexity... 6 New Challenges Require New Approaches... 6 Big

More information

THE ENTERPRISE GAMING COOKBOOK

THE ENTERPRISE GAMING COOKBOOK THE ENTERPRISE GAMING COOKBOOK Learn how game studios in our Ecosystem are using Bluemix to build the world s most advanced serious games We break down the web services needed to develop a variety of experiences

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Understanding the Value of In-Memory in the IT Landscape

Understanding the Value of In-Memory in the IT Landscape February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1

Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1 Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots

More information

Adopting a service-centric approach to backup & recovery

Adopting a service-centric approach to backup & recovery Adopting a service-centric approach to backup & recovery Written by John Maxwell, VP, Data Protection Products Abstract This solution brief explores the business challenges driving the need to move beyond

More information

Developing a Backup Strategy for Hybrid Physical and Virtual Infrastructures

Developing a Backup Strategy for Hybrid Physical and Virtual Infrastructures Virtualization Backup and Recovery Solutions for the SMB Market The Essentials Series Developing a Backup Strategy for Hybrid Physical and Virtual Infrastructures sponsored by Introduction to Realtime

More information

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment

More information

Big Data & Analytics for Semiconductor Manufacturing

Big Data & Analytics for Semiconductor Manufacturing Big Data & Analytics for Semiconductor Manufacturing 半 導 体 生 産 におけるビッグデータ 活 用 Ryuichiro Hattori 服 部 隆 一 郎 Intelligent SCM and MFG solution Leader Global CoC (Center of Competence) Electronics team General

More information

Evolution to Revolution: Big Data 2.0

Evolution to Revolution: Big Data 2.0 Evolution to Revolution: Big Data 2.0 An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) White Paper Prepared for Actian March 2014 IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Table of Contents

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

Five Technology Trends for Improved Business Intelligence Performance

Five Technology Trends for Improved Business Intelligence Performance TechTarget Enterprise Applications Media E-Book Five Technology Trends for Improved Business Intelligence Performance The demand for business intelligence data only continues to increase, putting BI vendors

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

IBM Software Delivering trusted information for the modern data warehouse

IBM Software Delivering trusted information for the modern data warehouse Delivering trusted information for the modern data warehouse Make information integration and governance a best practice in the big data era Contents 2 Introduction In ever-changing business environments,

More information

Harnessing the Power of the Microsoft Cloud for Deep Data Analytics

Harnessing the Power of the Microsoft Cloud for Deep Data Analytics 1 Harnessing the Power of the Microsoft Cloud for Deep Data Analytics Today's Focus How you can operate your business more efficiently and effectively by tapping into Cloud based data analytics solutions

More information

E-Guide THE CHALLENGES BEHIND DATA INTEGRATION IN A BIG DATA WORLD

E-Guide THE CHALLENGES BEHIND DATA INTEGRATION IN A BIG DATA WORLD E-Guide THE CHALLENGES BEHIND DATA INTEGRATION IN A BIG DATA WORLD O n one hand, while big data applications have eliminated the rigidity of the data integration process, they don t take responsibility

More information

Big Data and Big Data Modeling

Big Data and Big Data Modeling Big Data and Big Data Modeling The Age of Disruption Robin Bloor The Bloor Group March 19, 2015 TP02 Presenter Bio Robin Bloor, Ph.D. Robin Bloor is Chief Analyst at The Bloor Group. He has been an industry

More information

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

More information

Apache Hadoop: The Big Data Refinery

Apache Hadoop: The Big Data Refinery Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

More information

Making Sense of the Madness

Making Sense of the Madness Making Sense of the Madness Deploying Big Data techniques to deal with real world Bigish Data issues Copyright James Mitchell 2014 1 Introduction Warning! Parental Guidance Recommended Please read the

More information

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances Highlights IBM Netezza and SAS together provide appliances and analytic software solutions that help organizations improve

More information

Protecting Big Data Data Protection Solutions for the Business Data Lake

Protecting Big Data Data Protection Solutions for the Business Data Lake White Paper Protecting Big Data Data Protection Solutions for the Business Data Lake Abstract Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With

More information

A financial software company

A financial software company A financial software company Projecting USD10 million revenue lift with the IBM Netezza data warehouse appliance Overview The need A financial software company sought to analyze customer engagements to

More information

WINDOWS AZURE DATA MANAGEMENT

WINDOWS AZURE DATA MANAGEMENT David Chappell October 2012 WINDOWS AZURE DATA MANAGEMENT CHOOSING THE RIGHT TECHNOLOGY Sponsored by Microsoft Corporation Copyright 2012 Chappell & Associates Contents Windows Azure Data Management: A

More information

WHY IT ORGANIZATIONS CAN T LIVE WITHOUT QLIKVIEW

WHY IT ORGANIZATIONS CAN T LIVE WITHOUT QLIKVIEW WHY IT ORGANIZATIONS CAN T LIVE WITHOUT QLIKVIEW A QlikView White Paper November 2012 qlikview.com Table of Contents Unlocking The Value Within Your Data Warehouse 3 Champions to the Business Again: Controlled

More information

Gain Contextual Awareness for a Smarter Digital Enterprise with SAP HANA Vora

Gain Contextual Awareness for a Smarter Digital Enterprise with SAP HANA Vora SAP Brief SAP Technology SAP HANA Vora Objectives Gain Contextual Awareness for a Smarter Digital Enterprise with SAP HANA Vora Bridge the divide between enterprise data and Big Data Bridge the divide

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS Oracle Fusion editions of Oracle's Hyperion performance management products are currently available only on Microsoft Windows server platforms. The following is intended to outline our general product

More information

IBM Software Integrating and governing big data

IBM Software Integrating and governing big data IBM Software big data Does big data spell big trouble for integration? Not if you follow these best practices 1 2 3 4 5 Introduction Integration and governance requirements Best practices: Integrating

More information

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform: Creating an Integrated, Optimized, and Secure Enterprise Data Platform: IBM PureData System for Transactions with SafeNet s ProtectDB and DataSecure Table of contents 1. Data, Data, Everywhere... 3 2.

More information

Protecting Data with a Unified Platform

Protecting Data with a Unified Platform Protecting Data with a Unified Platform The Essentials Series sponsored by Introduction to Realtime Publishers by Don Jones, Series Editor For several years now, Realtime has produced dozens and dozens

More information

Analytics In the Cloud

Analytics In the Cloud Analytics In the Cloud 9 th September Presented by: Simon Porter Vice President MidMarket Sales Europe Disruptors are reinventing business processes and leading their industries with digital transformations

More information

Big data: Unlocking strategic dimensions

Big data: Unlocking strategic dimensions Big data: Unlocking strategic dimensions By Teresa de Onis and Lisa Waddell Dell Inc. New technologies help decision makers gain insights from all types of data from traditional databases to high-visibility

More information

Best Practices for Log File Management (Compliance, Security, Troubleshooting)

Best Practices for Log File Management (Compliance, Security, Troubleshooting) Log Management: Best Practices for Security and Compliance The Essentials Series Best Practices for Log File Management (Compliance, Security, Troubleshooting) sponsored by Introduction to Realtime Publishers

More information

How Traditional Physical Backup Imaging Technology Fits Into a Virtual Backup Solution

How Traditional Physical Backup Imaging Technology Fits Into a Virtual Backup Solution Virtualization Backup and Recovery Solutions for the SMB Market The Essentials Series How Traditional Physical Backup Imaging Technology Fits Into a Virtual Backup Solution sponsored by Introduction to

More information

Data virtualization: Delivering on-demand access to information throughout the enterprise

Data virtualization: Delivering on-demand access to information throughout the enterprise IBM Software Thought Leadership White Paper April 2013 Data virtualization: Delivering on-demand access to information throughout the enterprise 2 Data virtualization: Delivering on-demand access to information

More information

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale WHITE PAPER Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale Sponsored by: IBM Carl W. Olofson December 2014 IN THIS WHITE PAPER This white paper discusses the concept

More information

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems Proactively address regulatory compliance requirements and protect sensitive data in real time Highlights Monitor and audit data activity

More information

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

TE's Analytics on Hadoop and SAP HANA Using SAP Vora TE's Analytics on Hadoop and SAP HANA Using SAP Vora Naveen Narra Senior Manager TE Connectivity Santha Kumar Rajendran Enterprise Data Architect TE Balaji Krishna - Director, SAP HANA Product Mgmt. -

More information

Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance

Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

More information

Redefining Infrastructure Management for Today s Application Economy

Redefining Infrastructure Management for Today s Application Economy WHITE PAPER APRIL 2015 Redefining Infrastructure Management for Today s Application Economy Boost Operational Agility by Gaining a Holistic View of the Data Center, Cloud, Systems, Networks and Capacity

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

More information

Welcome to The Future of Analytics In Action. 2015 IBM Corporation

Welcome to The Future of Analytics In Action. 2015 IBM Corporation Welcome to The Future of Analytics In Action Goals for Today Share the cloud-based data management and analytics technologies that are enabling rapid development of new mobile applications Discuss examples

More information

Virtual Data Warehouse Appliances

Virtual Data Warehouse Appliances infrastructure (WX 2 and blade server Kognitio provides solutions to business problems that require acquisition, rationalization and analysis of large and/or complex data The Kognitio Technology and Data

More information

Big Data and Hadoop for the Executive A Reference Guide

Big Data and Hadoop for the Executive A Reference Guide Big Data and Hadoop for the Executive A Reference Guide Overview The amount of information being collected by companies today is incredible. Wal- Mart has 460 terabytes of data, which, according to the

More information

IBM PureFlex System. The infrastructure system with integrated expertise

IBM PureFlex System. The infrastructure system with integrated expertise IBM PureFlex System The infrastructure system with integrated expertise 2 IBM PureFlex System IT is moving to the strategic center of business Over the last 100 years information technology has moved from

More information

Big Data must become a first class citizen in the enterprise

Big Data must become a first class citizen in the enterprise Big Data must become a first class citizen in the enterprise An Ovum white paper for Cloudera Publication Date: 14 January 2014 Author: Tony Baer SUMMARY Catalyst Ovum view Big Data analytics have caught

More information

The 4 Pillars of Technosoft s Big Data Practice

The 4 Pillars of Technosoft s Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed

More information

Management Consulting Systems Integration Managed Services WHITE PAPER DATA DISCOVERY VS ENTERPRISE BUSINESS INTELLIGENCE

Management Consulting Systems Integration Managed Services WHITE PAPER DATA DISCOVERY VS ENTERPRISE BUSINESS INTELLIGENCE Management Consulting Systems Integration Managed Services WHITE PAPER DATA DISCOVERY VS ENTERPRISE BUSINESS INTELLIGENCE INTRODUCTION Over the past several years a new category of Business Intelligence

More information

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS PRODUCT FACTS & FEATURES KEY FEATURES Comprehensive, best-of-breed capabilities 100 percent thin client interface Intelligence across multiple

More information

The Liaison ALLOY Platform

The Liaison ALLOY Platform PRODUCT OVERVIEW The Liaison ALLOY Platform WELCOME TO YOUR DATA-INSPIRED FUTURE Data is a core enterprise asset. Extracting insights from data is a fundamental business need. As the volume, velocity,

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Big Data and Natural Language: Extracting Insight From Text

Big Data and Natural Language: Extracting Insight From Text An Oracle White Paper October 2012 Big Data and Natural Language: Extracting Insight From Text Table of Contents Executive Overview... 3 Introduction... 3 Oracle Big Data Appliance... 4 Synthesys... 5

More information

Fast, Low-Overhead Encryption for Apache Hadoop*

Fast, Low-Overhead Encryption for Apache Hadoop* Fast, Low-Overhead Encryption for Apache Hadoop* Solution Brief Intel Xeon Processors Intel Advanced Encryption Standard New Instructions (Intel AES-NI) The Intel Distribution for Apache Hadoop* software

More information

Data Warehousing in the Age of Big Data

Data Warehousing in the Age of Big Data Data Warehousing in the Age of Big Data Krish Krishnan AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD * PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Morgan Kaufmann is an imprint of Elsevier

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

BIG DATA-AS-A-SERVICE

BIG DATA-AS-A-SERVICE White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers

More information

IBM Analytics The fluid data layer: The future of data management

IBM Analytics The fluid data layer: The future of data management IBM Analytics The fluid data layer: The future of data management Why flexibility and adaptability are crucial in the hybrid cloud world 1 2 3 4 5 6 The new world vision for data architects Why the fluid

More information

AGILE ANALYTICS IN THE CLOUD 93% ORACLE BUSINESS INTELLIGENCE CLOUD SERVICE EXECUTIVE SUMMARY

AGILE ANALYTICS IN THE CLOUD 93% ORACLE BUSINESS INTELLIGENCE CLOUD SERVICE EXECUTIVE SUMMARY AGILE ANALYTICS IN THE CLOUD ORACLE BUSINESS INTELLIGENCE CLOUD SERVICE EXECUTIVE SUMMARY Your business is changing. Are you prepared for it? Can you quickly access all the information you need to analyze,

More information