WHITE PAPER LOWER COSTS, INCREASE PRODUCTIVITY, AND ACCELERATE VALUE, WITH ENTERPRISE-READY HADOOP
CLOUDERA WHITE PAPER

Table of Contents
Introduction
Hadoop's Role in the Big Data Challenge
Cloudera: The Leading Hadoop Distribution
Informatica: Discover Insights and Innovate Faster on Hadoop
Data Warehouse and ETL Optimization with Cloudera and Informatica
eHarmony Embraces Big Data with Cloudera and Informatica
The Cloudera/Informatica Advantage
Conclusion
Introduction

Organizations increasingly recognize the potential of big data to transform their business: improving customer retention and acquisition, increasing operational efficiencies, enabling better products and service delivery, and generating new business insights. Cost-effectively harnessing terabytes or petabytes of big data requires a new approach that extends current technologies. The limitations of traditional data infrastructures render them unsuitable for the extreme scale of big data processing and storage. The open source Hadoop framework and advanced data integration technology are critical components in a growing number of big data initiatives, both for processing and for storing data in Hadoop at dramatically lower costs.

This white paper outlines how organizations can realize big data's promise by combining Cloudera Enterprise, an open source Hadoop distribution with associated tools and services, and the Informatica Platform. The Informatica Platform can access all types of data; move up to terabytes per hour into Hadoop; parse, cleanse, and transform data on Hadoop; and deliver insights from Hadoop at any latency across the enterprise. Over several years, Cloudera and Informatica have collaborated at a technological level to optimize interoperability between their solutions. As the respective leaders in Hadoop products and services and in enterprise data integration, Cloudera and Informatica can together equip your organization with proven technology and services expertise to maximize your return on big data.

Hadoop's Role in the Big Data Challenge

Growth in data volumes, variety, and velocity is hitting the limits of existing information management infrastructures, forcing companies to invest in more hardware and costly upgrades of databases and data warehouses.
In many cases, adding traditional data infrastructure is impractical because of high costs, scalability limitations when dealing with hundreds of terabytes, and the incompatibility of relational systems with unstructured big data. Organizations are implementing innovative approaches to handling growth in both big transaction data (data warehouses, ERP applications, and OLTP systems) and big interaction data (from social media, web clickstreams, call detail records [CDRs], sensors and devices, and more). Beyond handling growth, they seek a solution capable of integrating traditional structured, multistructured, and unstructured data to gain insights not otherwise possible.

Enter Hadoop. Cloudera chief architect Doug Cutting founded the Apache Hadoop project to address the inability of traditional systems to handle the explosion of data on the Web. Hadoop enables distributed, fault-tolerant, parallel storage, processing, and analysis of huge amounts of multistructured data across highly available clusters of inexpensive industry-standard servers. Hadoop is ideally suited for complex data analytics and large-scale data storage and processing, often at a cost 10 to 100 times lower than that of traditional systems. Given these strengths, many organizations are offloading between 20 percent and 50 percent of processing and storage to Hadoop systems.
Cloudera: The Leading Hadoop Distribution

With customers including eBay, Samsung, Chevron, Nokia, and JPMorgan Chase & Co., Cloudera supplies the industry's leading Hadoop distribution as well as a comprehensive set of tools and services to effectively operate Hadoop as a critical part of a technology infrastructure. Its Cloudera Enterprise offering includes:

> CDH: Cloudera's 100 percent open source platform based on Apache Hadoop delivers the core elements of Hadoop (scalable storage and distributed computing) plus capabilities for security, high availability, fault tolerance, load balancing, compression, and integration with software and hardware solutions from partners such as Informatica. The CDH distribution is strengthened by a bundle of more than a dozen open source projects, including a nonrelational database, workflow orchestration, cloud integration, and machine learning libraries, to help maximize the performance and value of a Hadoop deployment.

> Cloudera Impala: As the industry's first native real-time SQL query engine for Apache Hadoop, Impala is the newest component of CDH. Impala changes the way organizations can benefit from Hadoop in several ways:
  > Data processing workload acceleration, with data pipelines that last seconds instead of minutes or hours, to meet tighter service-level agreement (SLA) specifications.
  > Interactive business intelligence with popular tools. This opens up real-time access to big data to every analyst in the organization without requiring any special Hadoop training, significantly lowering the adoption risk of a big data project and accelerating return on investment (ROI).
  > Reduced overall cost of data management. Instead of replicating large amounts of data to a relational database to get interactive SQL performance, Cloudera customers can obtain the same experience without added cost or complexity.

> Cloudera Manager: Cloudera's Hadoop management platform supplies a central point for administration across a CDH cluster. The application automates installation to reduce deployment time from weeks to minutes, provides a cluster-wide, real-time view of running nodes and services, enables configuration changes from a single control console, and delivers reporting and diagnostic tools for troubleshooting and optimization.

> Cloudera Support: Cloudera offers the industry's highest quality technical support for Hadoop, with a team of support engineers composed of contributors and committers for every component of CDH. No one knows the Hadoop stack better or has more experience supporting large-scale clusters in production. With Cloudera Support, customers experience more uptime, faster issue resolution, and better performance.
Informatica: Discover Insights and Innovate Faster on Hadoop

For all its advantages in data processing and storage, Hadoop stands to become another data silo without data integration or other complementary technology to unlock the business value of big data. In a number of early deployments, some enterprises resorted to time-consuming hand coding for a range of data processing requirements, despite high costs and downstream maintenance headaches. Informatica addresses the need for a codeless environment for extract, transform, and load (ETL) workloads on Hadoop, with a range of Informatica Platform technologies that enable organizations to use their existing Informatica-trained professionals or find the requisite skills from a global pool of more than 100,000 developers trained on Informatica technology. Informatica capabilities for Hadoop include:

> GUI-based development: Most Hadoop development today is performed by hand, in a manner very similar to the way ETL code was developed a decade ago, before ETL tools such as Informatica PowerCenter were created. Graphical codeless development has already been shown to reduce development time by as much as fivefold while identifying data errors not caught when hand coding for Hadoop.

> Universal data access: Organizations use Hadoop to store and process a variety of diverse data sources and often face challenges in combining and processing all relevant data from their legacy data sources and new types of data. The Informatica Platform helps organizations achieve ease and reliability of pre- and postprocessing of data into and out of Hadoop.

> High-speed data ingestion: Access, load, transform, and extract big data between source and target systems or directly into Hadoop or your data warehouse. Replicate hundreds of gigabytes to terabytes per hour from source systems to Hadoop.
> Data archiving: Archive data directly to Hadoop. Informatica helps to automate complex partitioning based on related tables or entities, not just individual tables, using the underlying database partitioning capabilities. Archive inactive data from production databases and data warehouses to extend their capacity and avoid costly upgrades.

> Data parsing and exchange: Hadoop excels at storing a diversity of data, but deriving meaning from it across all relevant datatypes is a major challenge. Informatica technology helps improve productivity in extracting greater value from unstructured data sources, including images, texts, binaries, and industry standards.

> Comprehensive data transformations: The Informatica Platform provides an extensive library of prebuilt transformation capabilities on Hadoop, including basic datatype conversions and string manipulations, high-performance caching-enabled lookups, joiners, sorters, routers, aggregations, and many more. Perform natural language processing to extract entities from unstructured data, such as emails, social data, and documents, to enrich master data.

> Metadata management: Informatica supplies full metadata management capabilities, with data lineage and auditability, and promotes standardization across heterogeneous data environments.
> Data quality and data governance: Many organizations use Hadoop for end-user reporting and analytics that require high data quality. Informatica technology furnishes capabilities to profile, cleanse, and manage data to better understand what data means, increase trust, and manage data growth effectively and securely.

> Data profiling: Profile data directly on Hadoop through both the Informatica developer tool and a browser-based analyst tool. This makes profiling data faster and more scalable, and makes it easier for developers, analysts, and data scientists to collaborate on data flow specifications and validate mapping transformation and rules logic.

> Data virtualization: Use data virtualization to provide a fine-grained secure access layer that combines data on Hadoop with other information management systems such as your data warehouse, MDM, or application databases.

Data Warehouse and ETL Optimization with Cloudera and Informatica

Through technology and professional services, Cloudera and Informatica offer enterprises a fast, repeatable process to optimize data warehouse and ETL processing and storage, one that maximizes both the ROI of existing information management infrastructure and the high-performance, cost-effective benefits of Hadoop. The challenges that motivate shifting data processing and data volumes to Hadoop include the following four:

> As data volumes and business complexity grow, ETL and ELT processing on conventional relational database technology is unable to keep up. Critical business windows are missed.

> Databases are designed primarily to load and query data, not transform it. Transforming data in the database consumes valuable CPU, making queries run slower, which degrades the BI users' experience.

> Conventional databases are expensive to scale as data volumes grow. Therefore, most organizations are unable to keep all the data they would like to analyze directly in the data warehouse. As a result, they end up throwing away the data or moving it to more affordable offline systems, such as a storage grid or tape backup. It's very common to hear: "We want to analyze three years of data but can only afford three months."

> Traditional data management infrastructure does not adapt easily as data volumes grow and new datatypes emerge (e.g., machine data, documents, and social media). Change requests to schemas and reports can take weeks or even months, leaving the business to fend for itself.

Hadoop provides the flexibility to cost-effectively work with more data and more types of data and to perform more flexible analysis, enabling the business and IT to be more agile.
Consulting and tools such as Informatica's Data Warehouse Advisor, software that monitors how businesses use data, can help organizations evaluate their current cost of data storage, processing capacity, and performance bottlenecks, and identify raw or dormant data that could be more cost-effectively managed in Hadoop. The PowerCenter Big Data Edition supplies a visual no-code development environment to build and execute ETL transformations on Hadoop. It also enables developers to do complex file parsing (e.g., Web logs, JSON, and XML), data profiling, and entity extraction for unstructured text (e.g., natural language processing) on Hadoop. The PowerCenter Big Data Edition includes connectivity to traditional relational databases and to social data from Facebook, Twitter, and LinkedIn, along with many other capabilities.

The Cloudera/Informatica solution helps organizations address the challenges of traditional environments through unlimited scalability, cost-effective performance, costs that are 10 to 100 times lower, and productivity gains of up to 5 times. Informatica technology enables developers to build and deploy data transformations and data flows on Hadoop without hand coding and offers a variety of data movement capabilities, including data replication, batch, trickle feed, and streaming, with scalability to move up to terabytes per hour into and out of Hadoop. Cloudera consultants provide expertise in configuring, managing, and tuning a CDH cluster, with knowledge transfer to ensure sustainability and extensibility in the years to come.

eHarmony, the popular online dating site, is a good example of an enterprise capitalizing on the capabilities of a joint Cloudera/Informatica solution.
eHarmony Embraces Big Data with Cloudera and Informatica

eHarmony, founded in 2000 and now responsible for an average of 542 marriages a day in the United States, deployed the Cloudera CDH Hadoop distribution as the analytics platform to run proprietary algorithms that process data to generate compatibility matches. The company's problem was that its reliance on Ruby scripting to transform hierarchical JSON data in Hadoop for use by its data warehouse was time-consuming for both script development and processing; it also could not scale to an expected fivefold increase in data volumes.

eHarmony turned to HParser, Informatica's data transformation environment optimized for Hadoop, to take full advantage of Cloudera CDH and cut data processing time by a factor of four. Replacing the Ruby scripts that processed JSON data held in Hadoop, HParser introduced advanced data parsing capabilities into the CDH environment, eliminating tedious script development while cutting big data processing time from 40 minutes to 10 minutes.

With the move, eHarmony extended its existing investment in Informatica PowerCenter, which loaded up to 7 TB a day into the data warehouse from conventional sources, by adding HParser's capabilities to handle JSON, XML, Omniture Web analytics data, log files, Word, Excel, PDF, and other files, as well as industry-standard file formats (e.g., SWIFT, NACHA, and HIPAA). The joint Cloudera/Informatica solution gives eHarmony greater speed and agility in embracing big data to meet business demands, for instance, generating compatible matches almost immediately after a new member joins.
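The transformation at the heart of this case study is turning hierarchical JSON into flat records that a data warehouse can load. The following is a minimal sketch of that idea in plain Python. It is not HParser, it is not eHarmony's Ruby scripting, and it does not run on Hadoop; the `flatten` helper, the dotted-key convention, and the sample member record are all hypothetical and chosen only for illustration.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts/lists into a single-level dict with dotted keys.

    Lists are expanded by position (e.g. "interests.0"). This is only one
    possible convention; a real pipeline might emit child rows instead.
    """
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = ((str(i), v) for i, v in enumerate(record))
    else:
        # A scalar has nothing to flatten.
        return {parent_key: record}

    flat = {}
    for key, value in items:
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, (dict, list)):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical member record, invented for this example.
raw = (
    '{"member_id": 42, '
    '"profile": {"age": 31, "location": {"city": "Austin"}}, '
    '"interests": ["hiking", "jazz"]}'
)
row = flatten(json.loads(raw))
print(row)
```

Each key in `row` can then be treated as a column name. Hand-written logic like this has to be maintained for every schema change, which is the development and maintenance burden the case study describes.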
The Cloudera/Informatica Advantage

A joint Cloudera/Informatica solution offers distinct advantages in enabling organizations to realize the promise of big data:

> Accelerates adoption of Hadoop by leveraging existing Informatica skill sets, letting customers design in Informatica, reuse existing work, and run on CDH
> Expands Hadoop's connectivity and processing capabilities through a rich set of prepackaged data integration functionality
> Lowers costs of data processing and storage by allowing Informatica tasks best suited for Hadoop to run on CDH
> Increases developer productivity with a metadata-driven graphical environment on a flexible and scalable data platform
> Enables unified monitoring and management of data integration across Hadoop and other systems using Informatica's unified administration and Cloudera Manager
> Allows data governance across all data assets, including data on Hadoop
Conclusion

Effectively harnessing big data promises quantifiable benefits to organizations. Beyond offloading data storage and preprocessing from expensive database and data warehouse platforms to Hadoop for staging and ETL, financial services companies can improve fraud detection processes and risk and portfolio analysis. Telcos can process massive volumes of CDRs to improve customer support and provide new location-based services. Manufacturers can leverage big data from machine and device sensors to improve product quality and predictive maintenance. Retailers can use big data to make next-best-offer recommendations that increase customer up-sell and cross-sell.

An analytics-ready Hadoop platform and advanced data integration are critical technologies for taking full advantage of big data. With Cloudera and Informatica, enterprises have proven solutions and services to maximize their big data returns by successfully leveraging Hadoop as one part of their overall data integration infrastructure.

About Cloudera

Cloudera, the leader in Apache Hadoop-based software and services, enables data-driven enterprises to easily derive business value from all their structured and unstructured data. As the top contributor to the Apache open source community, and with tens of thousands of nodes under management across customers in financial services, government, telecommunications, media, web, advertising, retail, energy, bioinformatics, pharma/healthcare, university research, oil and gas, and gaming, Cloudera's depth of experience and commitment to sharing expertise are unrivaled.

Cloudera provides no representations or warranties regarding the accuracy, reliability, or serviceability of any information or recommendations provided in this publication, or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein.
The information in this document is distributed AS IS, and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment.

Cloudera, Inc., 220 Portage Avenue, Palo Alto, CA, USA. cloudera.com

© 2013 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.
More informationGanzheitliches Datenmanagement
Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist
More informationActian SQL in Hadoop Buyer s Guide
Actian SQL in Hadoop Buyer s Guide Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop
More informationPreparing for the Big Data Journey
Preparing for the Big Data Journey A Strategic Roadmap to Maximizing Your Return from Big Data WHITE PAPER This document contains Confidential, Proprietary and Trade Secret Information ( Confidential Information
More informationProtecting Big Data Data Protection Solutions for the Business Data Lake
White Paper Protecting Big Data Data Protection Solutions for the Business Data Lake Abstract Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With
More informationAligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap
Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed
More informationTap into Big Data at the Speed of Business
SAP Brief SAP Technology SAP Sybase IQ Objectives Tap into Big Data at the Speed of Business A simpler, more affordable approach to Big Data analytics A simpler, more affordable approach to Big Data analytics
More informationA TECHNICAL WHITE PAPER ATTUNITY VISIBILITY
A TECHNICAL WHITE PAPER ATTUNITY VISIBILITY Analytics for Enterprise Data Warehouse Management and Optimization Executive Summary Successful enterprise data management is an important initiative for growing
More informationInformatica PowerCenter Data Virtualization Edition
Data Sheet Informatica PowerCenter Data Virtualization Edition Benefits Rapidly deliver new critical data and reports across applications and warehouses Access, merge, profile, transform, cleanse data
More informationMicrosoft Big Data. Solution Brief
Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,
More informationHow to Enhance Traditional BI Architecture to Leverage Big Data
B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...
More informationDatabricks. A Primer
Databricks A Primer Who is Databricks? Databricks vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful
More informationDell Cloudera Syncsort Data Warehouse Optimization ETL Offload
Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Drive operational efficiency and lower data transformation costs with a Reference Architecture for an end-to-end optimization and offload
More informationAGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this
More informationWhat s New with Informatica Data Services & PowerCenter Data Virtualization Edition
1 What s New with Informatica Data Services & PowerCenter Data Virtualization Edition Kevin Brady, Integration Team Lead Bonneville Power Wei Zheng, Product Management Informatica Ash Parikh, Product Marketing
More informationAn Enterprise Data Hub, the Next Gen Operational Data Store
An Enterprise Data Hub, the Next Gen Operational Data Store Version: 101 Table of Contents Summary 3 The ODS in Practice 4 Drawbacks of the ODS Today 5 The Case for ODS on an EDH 5 Conclusion 6 About the
More informationORACLE DATA INTEGRATOR ENTERPRISE EDITION
ORACLE DATA INTEGRATOR ENTERPRISE EDITION Oracle Data Integrator Enterprise Edition 12c delivers high-performance data movement and transformation among enterprise platforms with its open and integrated
More informationHow the oil and gas industry can gain value from Big Data?
How the oil and gas industry can gain value from Big Data? Arild Kristensen Nordic Sales Manager, Big Data Analytics arild.kristensen@no.ibm.com, tlf. +4790532591 April 25, 2013 2013 IBM Corporation Dilbert
More informationWhite Paper. Unified Data Integration Across Big Data Platforms
White Paper Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using
More informationHarnessing the Power of Big Data for Real-Time IT: Sumo Logic Log Management and Analytics Service
Harnessing the Power of Big Data for Real-Time IT: Sumo Logic Log Management and Analytics Service A Sumo Logic White Paper Introduction Managing and analyzing today s huge volume of machine data has never
More informationUnified Data Integration Across Big Data Platforms
Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using ELT... 6 Diyotta
More informationCitusDB Architecture for Real-Time Big Data
CitusDB Architecture for Real-Time Big Data CitusDB Highlights Empowers real-time Big Data using PostgreSQL Scales out PostgreSQL to support up to hundreds of terabytes of data Fast parallel processing
More informationQLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM
QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment
More informationSimplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!
Simplifying Big Data Analytics: Unifying Batch and Stream Processing John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Streaming Analy.cs S S S Scale- up Database Data And Compute Grid
More informationVirtualizing Apache Hadoop. June, 2012
June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING
More informationSQLstream Blaze and Apache Storm A BENCHMARK COMPARISON
SQLstream Blaze and Apache Storm A BENCHMARK COMPARISON 2 The V of Big Data Velocity means both how fast data is being produced and how fast the data must be processed to meet demand. Gartner The emergence
More informationThe 4 Pillars of Technosoft s Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
More informationSQL Server 2012 Performance White Paper
Published: April 2012 Applies to: SQL Server 2012 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.
More informationHadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop
Hadoop Data Hubs and BI Supporting the migration from siloed reporting and BI to centralized services with Hadoop John Allen October 2014 Introduction John Allen; computer scientist Background in data
More informationQuickly Deploy Microsoft Private Cloud and SQL Server 2012 Data Warehouse on Hitachi Converged Solutions. September 25, 2013
Quickly Deploy Microsoft Private Cloud and SQL Server 2012 Data Warehouse on Hitachi Converged Solutions September 25, 2013 1 WEBTECH EDUCATIONAL SERIES QUICKLY DEPLOY MICROSOFT PRIVATE CLOUD AND SQL SERVER
More informationIBM InfoSphere BigInsights Enterprise Edition
IBM InfoSphere BigInsights Enterprise Edition Efficiently manage and mine big data for valuable insights Highlights Advanced analytics for structured, semi-structured and unstructured data Professional-grade
More informationIBM Software Integrating and governing big data
IBM Software big data Does big data spell big trouble for integration? Not if you follow these best practices 1 2 3 4 5 Introduction Integration and governance requirements Best practices: Integrating
More informationWhite Paper: Datameer s User-Focused Big Data Solutions
CTOlabs.com White Paper: Datameer s User-Focused Big Data Solutions May 2012 A White Paper providing context and guidance you can use Inside: Overview of the Big Data Framework Datameer s Approach Consideration
More informationAn Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics
An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,
More informationBig Data at Cloud Scale
Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For
More informationThe Rise of Industrial Big Data
GE Intelligent Platforms The Rise of Industrial Big Data Leveraging large time-series data sets to drive innovation, competitiveness and growth capitalizing on the big data opportunity The Rise of Industrial
More informationTHE PLATFORM FOR BIG DATA
WHITE PAPER THE PLATFORM FOR BIG DATA CLOUDERA WHITE PAPER Table of Contents Introduction Data in Crisis The Data Brain Anatomy of the Platform Essentials of Success 7 A Data Platform 9 The Road Ahead
More informationHadoop Trends and Practical Use Cases. April 2014
Hadoop Trends and Practical Use Cases John Howey Cloudera jhowey@cloudera.com Kevin Lewis Cloudera klewis@cloudera.com April 2014 1 Agenda Hadoop Overview Latest Trends in Hadoop Enterprise Ready Beyond
More informationInformatica Application Information Lifecycle Management
Informatica Application Information Lifecycle Management Cost-Effectively Manage Every Phase of the Information Lifecycle brochure Controlling Explosive Data Growth The era of big data presents today s
More informationOracle Big Data Building A Big Data Management System
Oracle Big Building A Big Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Product Director May, 2015 Safe Harbor Statement The following
More informationBeyond the Single View with IBM InfoSphere
Ian Bowring MDM & Information Integration Sales Leader, NE Europe Beyond the Single View with IBM InfoSphere We are at a pivotal point with our information intensive projects 10-40% of each initiative
More informationBig Data, Big Traffic. And the WAN
Big Data, Big Traffic And the WAN Internet Research Group January, 2012 About The Internet Research Group www.irg-intl.com The Internet Research Group (IRG) provides market research and market strategy
More informationBig Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum
Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All
More informationThe Business Analyst s Guide to Hadoop
White Paper The Business Analyst s Guide to Hadoop Get Ready, Get Set, and Go: A Three-Step Guide to Implementing Hadoop-based Analytics By Alteryx and Hortonworks (T)here is considerable evidence that
More informationIBM System x reference architecture solutions for big data
IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,
More informationBig Data Comes of Age: Shifting to a Real-time Data Platform
An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) White Paper Prepared for SAP April 2013 IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Table of Contents Introduction... 1 Drivers of Change...
More informationBUSINESSOBJECTS DATA INTEGRATOR
PRODUCTS BUSINESSOBJECTS DATA INTEGRATOR IT Benefits Correlate and integrate data from any source Efficiently design a bulletproof data integration process Accelerate time to market Move data in real time
More informationCloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service
Cloudera Enterprise Data Hub GCloud Service Definition Lot 3: Software as a Service December 2014 1 SERVICE OVERVIEW & SOLUTION... 4 1.1 Service Overview... 4 1.2 Introduction to Cloudera... 5 1.3 Cloudera
More informationWhitepaper: Solution Overview - Breakthrough Insight. Published: March 7, 2012. Applies to: Microsoft SQL Server 2012. Summary:
Whitepaper: Solution Overview - Breakthrough Insight Published: March 7, 2012 Applies to: Microsoft SQL Server 2012 Summary: Today s Business Intelligence (BI) platform must adapt to a whole new scope,
More information