Why Big Data in the Cloud?


Colin White, BI Research, January 2014. Sponsored by Treasure Data.

TABLE OF CONTENTS

Introduction
The Importance of Big Data
The Role of Cloud Computing
Using Big Data in the Cloud
Use Cases for Big Data in the Cloud
The Challenges of Big Data in the Cloud
Vendor Example: Treasure Data
    Data Acquisition
    Data Storage
    Data Analysis
    Treasure Data Value Proposition
Conclusion

Copyright 2014 BI Research, All Rights Reserved.

INTRODUCTION

Big data and cloud computing are top initiatives for IT, and when used together they promise significant benefits for both the business and IT. Big data helps create competitive advantage, increase revenues and identify new business opportunities, while cloud computing offers the potential to reduce IT costs and provide faster time to value for IT investments. Both technologies are evolving rapidly, and an increasing number of vendors are developing and delivering products and services for enabling big data solutions in a cloud-computing environment.

Although all organizations should evaluate the use of cloud computing for their big data projects, it is also important to realize that big data in the cloud is not a one-size-fits-all approach. There are many different cloud services on the market and it is essential that you match business and technology requirements to the most appropriate offering. Also, not all big data projects are ideally suited to a cloud computing approach, and it is important to clearly identify those projects that do and do not lend themselves to a cloud environment.

The objectives of this paper are to provide an overview of key industry trends in the use of big data in the cloud and to help you identify the use cases that best fit a big data cloud computing environment. Along the way, as an example, it will also take a look at Treasure Data (the sponsor of this paper) and its end-to-end cloud solution for big data.

THE IMPORTANCE OF BIG DATA

Big data projects initially focused on processing business information extracted from Internet and Web data sources, for example, Web pages and logs, and social computing sites (Facebook, Twitter, etc.).
More recently, the use of big data has grown to include other sources, especially data generated by sensors installed on a wide range of equipment such as mobile devices, smart utility meters, motor vehicles, aircraft engines, security equipment, telephone switches, RFID readers, and so forth. In fact, machine-generated data is one of the fastest growing sources of big data.

As companies began to exploit big data, it quickly became apparent that traditional approaches to data warehousing and analytic processing could handle neither the data volumes and data rates involved nor the diverse set of data sources and varieties of data required by big data projects. Clearly, a more efficient and cost-effective infrastructure was required. Solutions here vary from reducing the cost and improving the capabilities of relational database products to providing alternative data management technologies such as the Hadoop distributed computing environment.

The value of big data, however, is not in the raw data itself, but in the business insights that can be gained by extracting and analyzing the business information embedded in the data. This is why vendors are focusing not only on providing products that help manage big data, but also on solutions that can extract and analyze the business information in that data. The result is that several vendors now offer end-to-end solutions that provide data acquisition and integration, data management, data analysis and data visualization capabilities for the processing of big data. To speed deployment and improve time to value, these solutions are frequently offered as prepackaged hardware and software appliances and/or as a set of services for rapid deployment in a cloud-computing environment.

THE ROLE OF CLOUD COMPUTING

Cloud computing services promise pay-as-you-go pricing, on-demand provisioning and elastic scalability for developing and deploying many IT projects. Compared with an on-premises IT environment, cloud computing reduces upfront IT costs and enables organizations to scale their IT resources as required, while paying only for the resources they use. The cloud is therefore an ideal environment for big data projects, given the large data volumes and unpredictable nature of the analytic workloads involved. This is one of the reasons why the industry is seeing a sudden and significant jump in the use of cloud computing. Another reason for this sudden growth is that cloud technologies are maturing and organizations are overcoming their data security issues and concerns.

Barriers still remain to successful cloud adoption, however. Chief among these are the complexity of integrating cloud and on-premises data and the inability of many cloud services to move data into and out of the cloud environment efficiently and rapidly. This topic is discussed in more detail below.

USING BIG DATA IN THE CLOUD

Most traditional data warehousing and business analytics projects to date have involved managing and analyzing data extracted from on-premises business transaction systems.[1] In some situations, cloud services have been used for developing analytics on business transaction data stored in a cloud computing system such as salesforce.com, but these have been piecemeal and standalone projects. In fact, one of the risks of cloud computing is that it has made it easier for business groups and business users to bypass IT and purchase their own cloud-based IT services.
This is why it is important for IT to partner and collaborate with the business in deploying and using cloud services to reduce risk, avoid poor technology selection, and manage data governance and data security issues.

For the foreseeable future, it is unlikely that many organizations will move their existing business transaction systems or sensitive transaction data for analysis purposes to a public cloud environment.[2] However, cloud adoption for business transaction processing is increasing, especially for new projects and projects involving packaged application solutions, and so in the longer term this will lead to more traditional business transaction processing and associated analytic processing being done in the cloud.

The biggest potential for cloud computing is the processing of data that already exists in the cloud. This data includes the large volumes of data on internal and public web servers, as well as data generated by third-party providers. It also includes externally generated data (certain types of machine sensor data, for example) that can as easily be delivered to a cloud environment as it can to an in-house environment. These large volumes of Web and sensor data can be captured, filtered and transformed in the cloud and then delivered to an in-house system for analysis. In many cases, the data can also be analyzed in the cloud and the results delivered to internal and external business users. One of the key requirements here is to keep data movement to a minimum and to process data where it resides.

As noted at the beginning of this paper, it is important to realize that big data in the cloud is not a one-size-fits-all solution. It pays to make use of cloud services where it makes sense from the perspective of satisfying business needs, reducing costs, achieving faster time to value, and providing flexibility and scalability.

[1] The exceptions are newer and Web-focused companies whose sole business is oriented towards Internet commerce. These companies have few legacy systems and it is therefore easier for them to move to a cloud-computing environment.
[2] Many companies are, however, beginning to deploy private clouds and virtualized environments for in-house use, but this topic is beyond the scope of this paper.

USE CASES FOR BIG DATA IN THE CLOUD

When examining how organizations use cloud computing for big data projects, three main use cases become apparent. These are outlined below.

Standalone reporting and analysis of Web, social media or sensor data: A cloud-based reporting and analysis system is a cost-effective way of capturing, storing and analyzing high-volume web log/clickstream, social media (from Twitter, for example) or sensor (from telemetric devices, for example) data. In this use case, data from each source is uploaded in small batches or streamed directly to the cloud service for reporting and analysis.

Data analysis and visualization of e-commerce data: Many organizations (web retailers, online gaming companies, etc.) run their entire businesses on the web. For these companies, monitoring business operations, analyzing customer and user behavior and tracking marketing programs are top priorities.
The applications involved in e-commerce are often deployed on hundreds of servers and handle requests from millions of users and a variety of devices. They also generate terabytes of data every day. A cloud-based system is ideally suited to collecting, analyzing and visualizing all of this data to help business managers track and analyze overall business operations and performance.

Data warehouse augmentation: A cloud-based data refinery or data hub is a cost-effective way of capturing, storing, transforming and archiving high-volume data while also providing connectivity to a data warehouse for transferring data. In this use case, the data warehouse remains the primary source of analytics for business users, but direct reporting and analysis of cloud-based data may also be provided as required.

THE CHALLENGES OF BIG DATA IN THE CLOUD

The main tasks in any big data project involve acquiring and integrating the raw source data, managing that data, processing the data, and finally delivering the results to the systems and users that require the processed data. Processing may involve transforming and filtering the data and also possibly analyzing the data.

As in an on-premises environment, cloud users have the choice of integrating various cloud products and services themselves or using an integrated end-to-end solution. In the same way that an integrated hardware and software appliance simplifies development, deployment and administration for on-premises projects, an integrated end-to-end cloud solution for big data offers similar benefits to an appliance approach. A cloud solution also provides flexible scalability and a pay-as-you-go pricing model.

As mentioned earlier, one of the biggest barriers to cloud deployment is data integration and data movement. Ideally, the data should be processed where it resides, but even when the source data already resides in the cloud it may still have to be moved to a different cloud system for processing, in the same way that data is moved from business transaction systems to a data warehouse in an on-premises environment.

An added complication with big data is that the project may also involve a mixture of data in the cloud and on-premises data. In this case, the on-premises data may be accessed dynamically by a cloud application or staged from the on-premises environment to the cloud for use by the cloud application. Again, this is the same as in an on-premises environment where big data projects are increasingly using data from a variety of data sources in addition to a data warehouse. The key difference in a cloud environment is that data movement occurs across an Internet connection, which has security, performance and cost implications. It is therefore very important in a cloud environment to look for big data solutions that not only simplify development, deployment and administration, but that also provide solid, well-performing data integration and data movement capabilities.
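To give a feel for the performance implication of moving data across an Internet connection, the following back-of-the-envelope sketch estimates transfer time from data volume and link speed. The function name, the link speeds and the efficiency factor are illustrative assumptions, not figures from this paper; real throughput also depends on latency, protocol overhead and compression.

```python
def transfer_hours(data_terabytes: float, link_mbps: float,
                   efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move a data set over a network link.

    data_terabytes: volume in decimal terabytes (10**12 bytes).
    link_mbps:      nominal link speed in megabits per second (10**6 bits/s).
    efficiency:     assumed fraction of the nominal speed actually achieved.
    """
    if data_terabytes < 0:
        raise ValueError("data_terabytes must not be negative")
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    bits_to_move = data_terabytes * 1e12 * 8
    bits_per_second = link_mbps * 1e6 * efficiency
    return bits_to_move / bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical link speeds, chosen only to show the scale of the problem.
    for mbps in (100.0, 1000.0):
        hours = transfer_hours(1.0, mbps)
        print(f"1 TB over a {mbps:.0f} Mbit/s link: about {hours:.1f} hours")
```

Under these assumptions, one terabyte takes roughly 22 hours on a fully used 100 Mbit/s link, which illustrates why it pays to keep data movement to a minimum and to process data where it resides.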
VENDOR EXAMPLE: TREASURE DATA

Treasure Data was founded in 2011 and is based in Mountain View, California. The company offers a managed cloud service that provides an end-to-end solution for the acquisition, storage and analysis of big data. At the time of writing, Treasure Data had some 90 customers, including several Fortune 500 companies. These customers come from a variety of industries but most of their big data projects fit into one of the three use cases outlined earlier.

Data Acquisition

Data is uploaded to the Treasure Data service using a parallel bulk data import tool or real-time data collection agents that run on the customer's local systems. The bulk data import tool is typically used to import data from relational databases, flat files (Microsoft Excel, comma delimited, etc.) and application systems (ERP, CRM, etc.). Data collection agents are designed to capture data in real time from Web and application logs, sensors, mobile systems, and so forth. Since near-real-time data is critical for the majority of customers, most data comes into the Treasure Data system through data collection agents.
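As a rough illustration of what a real-time data collection agent does on the customer's side, the sketch below buffers records locally and uploads them as a compressed batch once a size or age threshold is reached. It is not Treasure Data's or Fluentd's implementation: the `BufferedAgent` class, its parameters and the `send` callback are hypothetical, and JSON plus gzip stand in for whatever wire format a real agent uses.

```python
import gzip
import json
import time
from typing import Callable, Dict, List, Optional


class BufferedAgent:
    """Hypothetical agent that batches records before sending them."""

    def __init__(self, send: Callable[[bytes], None],
                 max_records: int = 1000,
                 max_age_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send                  # callback that uploads one payload
        self._max_records = max_records    # size-based flush threshold
        self._max_age = max_age_seconds    # time-based flush threshold
        self._clock = clock
        self._buffer: List[Dict] = []
        self._oldest: Optional[float] = None  # arrival time of oldest record

    def emit(self, record: Dict) -> None:
        """Buffer one record; flush if the batch is full or too old.

        The age check only runs when a record arrives. A real agent would
        also flush from a background timer.
        """
        if self._oldest is None:
            self._oldest = self._clock()
        self._buffer.append(record)
        too_big = len(self._buffer) >= self._max_records
        too_old = self._clock() - self._oldest >= self._max_age
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Compress the buffered records and hand them to the sender."""
        if not self._buffer:
            return
        payload = gzip.compress(json.dumps(self._buffer).encode("utf-8"))
        self._send(payload)
        self._buffer = []
        self._oldest = None


def demo() -> List[List[Dict]]:
    """Send seven records with a batch size of three; return decoded batches."""
    sent: List[bytes] = []
    agent = BufferedAgent(send=sent.append, max_records=3, max_age_seconds=60.0)
    for n in range(7):
        agent.emit({"event": "click", "n": n})
    agent.flush()  # push out the final partial batch
    return [json.loads(gzip.decompress(payload)) for payload in sent]


if __name__ == "__main__":
    for batch in demo():
        print(len(batch), "records:", batch)
```

Batching trades a little latency for fewer, larger and better-compressed uploads, which is why the two thresholds are the natural tuning knobs in a design like this.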

Data collection agents filter, transform and/or aggregate data before it is transmitted to the cloud service. All data is transmitted in a binary format known as MessagePack.[3] The agent technology has been designed to be lightweight, extensible and reliable. It also employs parallelization, buffering and compression mechanisms to maximize performance, minimize network traffic, and ensure no data loss or duplication during transmission. Buffer sizes can be tuned based on timing and data size. One of Treasure Data's customers, for example, uses data collection agents to transmit over a terabyte of compressed log data per day to the service for customer billing purposes. Another collects and transmits real-time gaming data from 3,500 servers for analysis on the Treasure Data service.

The agent technology comes in two versions: an open source version (Fluentd) and an enhanced version supported by Treasure Data (Treasure Agent). The Fluentd open source community has some 2,000 members who have developed and contributed over 200 data capture plug-ins for use on-premises and in the cloud (including the Treasure Data cloud service). Treasure Data supplies a range of enhanced enterprise-ready plug-ins that provide improved compatibility and performance. Treasure Data also offers a monitoring and alerting service for Treasure Agent users.

Data Storage

The Treasure Data cloud service currently employs Amazon Web Services and the Amazon S3 object storage layer, but Treasure Data claims it can easily port to other platforms as customer needs dictate. All data flowing into the Treasure Data cloud service is time stamped, transformed into a compressed columnar MessagePack format, and stored in a proprietary columnar database known as Plazma. This database can then be queried using an enhanced Hadoop processing environment. Data is first kept in real-time files and then moved into archive files at regular intervals.
This latter process enables time-based partitioning and larger data files for more efficient processing. The process is completely transparent to applications.

A Web-based management console is provided for monitoring resources, managing access controls, and raising support tickets. Treasure Data is working on expanding this console to provide Treasure Viewer, an interface to query and visualize data. This interface is currently in beta testing.

Treasure Data uses a flat-rate tiered pricing model that is based on the number of data rows imported into the service annually, guaranteed processing capacity, service-level requirements, and the level of support needed. The Treasure Data service provides a multi-tenant environment where additional machine resources expand to meet customer needs and where customers can use up to four times the guaranteed capacity at no extra cost if that capacity is available.

[3] MessagePack is an efficient binary serialization format used for exchanging data between systems. It is similar to the JSON data format, but is faster and more compact than JSON. MessagePack encodes small integers into a single byte, and short strings typically incur only one byte of overhead when encoded.

Data Analysis

Applications access and analyze data in a Treasure Data environment using Hadoop Hive or Treasure Query Accelerator queries coded in SQL syntax. Some Treasure Data customers are happy with Hive, while others require a more interactive and high-performance interface than that supported by the MapReduce batch jobs generated by Hive. To meet this customer need, Treasure Data offers the Treasure Query Accelerator, which extends the SQL interface to support an enhanced version of Cloudera Impala. The Treasure Data platform separates the query engine from the storage layer to make it easier to add other SQL interfaces as other open source products mature.

Both ODBC and JDBC drivers are available for the query interface, which enables many popular BI and analytics tools to be used with the service. Several customers, for example, use Tableau to access and analyze data managed by Treasure Data.

Treasure Data Value Proposition

The objective of Treasure Data is to provide an end-to-end cloud service for big data projects that is fast and easy to deploy, economical, and well supported. Its managed service model makes it attractive to companies with limited technical resources. The company receives high marks from its customers for fast implementation times and the support it provides. Another objective of Treasure Data's cloud service is to overcome the data integration and data movement issues outlined in this paper by providing optimized data collection agents that support a wide range of data sources.
The Treasure Data service is especially well suited to those organizations that have data in the cloud and/or externally generated sensor data, are open to a cloud-based approach, wish to use a managed big data service rather than a set of complex platform services, and do not have the skills or desire to manage a big data platform.

CONCLUSION

Big data is currently receiving significant industry attention and there is considerable hype associated with this topic. At the same time, however, more and more companies are beginning to see the business value of big data projects, and as this field matures the rate of adoption will accelerate. There is also considerable interest in cloud computing for reducing up-front IT costs, providing elastic scalability and enabling the rapid deployment of new projects. As a result, an increasing number of companies will deploy their big data projects in the cloud.

Both big data and cloud computing require a new set of skills, and organizations need to ensure these skills are in place before embarking on big data in the cloud. Vendors can help organizations get up to speed in this area, which is why choosing the right cloud vendor is important. A companion paper to this one looks at how organizations should prepare and get started on big data projects in the cloud and also offers suggestions for the features an organization should look for in selecting a cloud services vendor.

Using and Choosing a Cloud Solution for Data Warehousing

Using and Choosing a Cloud Solution for Data Warehousing TDWI RESEARCH TDWI CHECKLIST REPORT Using and Choosing a Cloud Solution for Data Warehousing By Colin White Sponsored by: tdwi.org JULY 2015 TDWI CHECKLIST REPORT Using and Choosing a Cloud Solution for

More information

Big Data for the Rest of Us Technical White Paper

Big Data for the Rest of Us Technical White Paper Big Data for the Rest of Us Technical White Paper Treasure Data - Big Data for the Rest of Us 1 Introduction The importance of data warehousing and analytics has increased as companies seek to gain competitive

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

WHITE PAPER. Data Migration and Access in a Cloud Computing Environment INTELLIGENT BUSINESS STRATEGIES

WHITE PAPER. Data Migration and Access in a Cloud Computing Environment INTELLIGENT BUSINESS STRATEGIES INTELLIGENT BUSINESS STRATEGIES WHITE PAPER Data Migration and Access in a Cloud Computing Environment By Mike Ferguson Intelligent Business Strategies March 2014 Prepared for: Table of Contents Introduction...

More information

An Oracle White Paper September 2014. Oracle: Big Data for the Enterprise

An Oracle White Paper September 2014. Oracle: Big Data for the Enterprise An Oracle White Paper September 2014 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform...

More information

Big Data at Cloud Scale

Big Data at Cloud Scale Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

Delivering Real-World Total Cost of Ownership and Operational Benefits

Delivering Real-World Total Cost of Ownership and Operational Benefits Delivering Real-World Total Cost of Ownership and Operational Benefits Treasure Data - Delivering Real-World Total Cost of Ownership and Operational Benefits 1 Background Big Data is traditionally thought

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

Big Data and Market Surveillance. April 28, 2014

Big Data and Market Surveillance. April 28, 2014 Big Data and Market Surveillance April 28, 2014 Copyright 2014 Scila AB. All rights reserved. Scila AB reserves the right to make changes to the information contained herein without prior notice. No part

More information

How To Make Data Streaming A Real Time Intelligence

How To Make Data Streaming A Real Time Intelligence REAL-TIME OPERATIONAL INTELLIGENCE Competitive advantage from unstructured, high-velocity log and machine Big Data 2 SQLstream: Our s-streaming products unlock the value of high-velocity unstructured log

More information

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Data Integration Checklist

Data Integration Checklist The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

How To Use Hp Vertica Ondemand

How To Use Hp Vertica Ondemand Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater

More information

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION EXECUTIVE SUMMARY Oracle business intelligence solutions are complete, open, and integrated. Key components of Oracle business intelligence

More information

Oracle Data Integration: CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration

Oracle Data Integration: CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration Oracle Data Integration: CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration Julien Testut Principal Product Manager, Oracle Data Integration Sumit Sarkar Principal Systems Engineer,

More information

Modernizing Your Data Warehouse for Hadoop

Modernizing Your Data Warehouse for Hadoop Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database
