Next-Generation Cloud Analytics with Amazon Redshift

What's inside
- Introduction
- Why Amazon Redshift is Great for Analytics
- Cloud Data Warehousing Strategies for Relational Databases
- Analyzing Fast, Transactional Data Generated by NoSQL
- Maximizing Your Return on Hadoop Projects
- Case in Point: Global Media Giant UBM Tech
- Next Steps

Introduction

We live in an era when data is exploding from every device and application that we interact with. This data tsunami presents a significant opportunity for ambitious enterprises. On a daily basis, virtually every one of our interactions is captured as a data point, but behind every interaction is a customer. To truly understand their customers' motivations and desires, companies need to measure and analyze these interactions by correlating, aggregating, and synthesizing data from numerous systems, including back-office applications, databases, and cloud applications. Often, this disparate data is pulled together in a purpose-built, on-premises data warehouse system.

Traditional data warehouses and on-premises applications take time to configure and provision. Amazon Redshift provides a petabyte-scale cloud data warehousing service that's fully managed, fast, and cost-effective, and that serves as a powerful launch pad for analytics applications such as Birst, MicroStrategy, and Tableau, among many others. However, getting data into Amazon Redshift can often be a last-mile roadblock: the number of queries you need to write against all your tables can result in significant delays. Combining a cloud integration solution with Amazon Redshift can add more value to your analytics efforts.

Why Amazon Redshift is Great for Analytics

Amazon Redshift changes the dynamics of data warehousing by making it easy to provision nodes, scale on demand, and query datasets securely. Amazon Redshift's architecture combines massively parallel processing (MPP) with columnar storage and data compression to enable timely execution of even the most complex queries. Clusters can be resized dynamically without downtime, and distributing workloads across compute nodes can reduce I/O time. As a consequence, companies can employ business intelligence tools to quickly analyze different segments in an Amazon Redshift dataset and benefit from the resulting insights.
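To make the columnar storage and compression point more concrete, here is a toy sketch in Python. It is not how Amazon Redshift is implemented; it only illustrates, under simplified assumptions, why storing values column by column tends to compress well. The sample rows and the function names `to_columns` and `run_length_encode` are made up for this illustration.

```python
from itertools import groupby

# Hypothetical sample rows: (order_id, region, status)
rows = [
    (1, "EU", "shipped"),
    (2, "EU", "shipped"),
    (3, "EU", "pending"),
    (4, "US", "pending"),
    (5, "US", "shipped"),
    (6, "US", "shipped"),
]


def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return [list(column) for column in zip(*rows)]


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


order_ids, regions, statuses = to_columns(rows)

# Values of the same column sit next to each other, so repeats collapse.
encoded_regions = run_length_encode(regions)
encoded_statuses = run_length_encode(statuses)

print(encoded_regions)
print(encoded_statuses)
```

A query that only needs the region column reads `regions` alone instead of every full row, and the six region values shrink to two pairs. Run-length encoding is just one simple scheme chosen here for readability.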

Cloud Data Warehousing Strategies for Relational Databases

Traditional relational databases such as MySQL, Oracle, Microsoft SQL Server, IBM DB2, and PostgreSQL have been around for many years and are extremely reliable for structured data. Some are better suited for OLTP workloads, while others are a better fit for OLAP workloads and data warehousing.

Increasingly, large enterprises recognize that traditional on-premises data warehousing techniques are inherently slow because of the time required to size and set up databases. As a result, more and more companies offload a subset of the data contained in on-premises databases to cloud-based data warehousing services, such as Amazon Redshift, for more agile data warehousing and analytics. When moving to a cloud data warehousing solution, companies also want their integration solution to be equally agile, and the only way this can be accomplished is through cloud integration.

If you're looking to move data to Amazon Redshift, look for a cloud integration solution with:
- Native connectors to as many relational databases, on-premises applications, and cloud applications as possible
- Automated functionality that eliminates the hand-coding associated with transforming and aggregating data into Redshift nodes
- The ability to bulk load data into Redshift
- Data cleansing based on data quality rules
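As one illustration of the bulk-load requirement in the list above, the sketch below assembles a Redshift COPY statement for delimited files staged in Amazon S3. It is a minimal example rather than a description of what any particular integration product does internally. The table name, bucket path, and IAM role ARN are placeholders, and the helper `build_copy_statement` is hypothetical.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, delimiter=",", gzip=True):
    """Return a Redshift COPY statement for delimited files staged in Amazon S3."""
    if len(delimiter) != 1 or delimiter == "'":
        raise ValueError("delimiter must be a single character other than a quote")
    parts = [
        f"COPY {table}",
        f"FROM '{s3_uri}'",
        f"IAM_ROLE '{iam_role_arn}'",
        f"DELIMITER '{delimiter}'",
    ]
    if gzip:
        # Tells COPY that the staged files are gzip-compressed.
        parts.append("GZIP")
    return "\n".join(parts) + ";"


# All three identifiers below are placeholders, not real resources.
statement = build_copy_statement(
    table="analytics.page_views",
    s3_uri="s3://example-bucket/exports/page_views/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
)
print(statement)
```

The resulting statement would be run through whatever SQL client is connected to the cluster. Loading many staged files with a single COPY is generally much faster than issuing row-by-row INSERT statements, which is the kind of hand-coding the list above suggests avoiding.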

Analyzing Fast, Transactional Data Generated by NoSQL

Companies in the social, mobile, gaming, or advertising space have sub-millisecond latency requirements when it comes to processing their data. Social media usage spikes considerably during high-profile events, while consumer mobile apps need to deal with a continuous stream of user-generated data around location, preference, and other criteria. Advertising providers constantly change their bidding rates for ad space based on elastic supply and demand, and gaming apps need to modify their in-app purchase recommendations on the fly by trying to predict users' intentions.

Capturing this rapidly changing data requires an agile delivery system, and a cloud-based NoSQL database service such as Amazon DynamoDB excels here. Amazon DynamoDB supports both document and key-value data models and does not require adherence to the traditional RDBMS structure. However, while NoSQL databases such as Amazon DynamoDB are great at ingesting data at low latencies, their structure does not make them suitable for analytics-related workloads. To understand the profile of their users, companies using NoSQL data need to analyze multiple datasets across several dimensions. This is where Amazon Redshift's data warehouse-as-a-service comes into play.

If you're using a NoSQL database and need to analyze hundreds of millions of ad impressions, game metrics, or social media hashtags, then use Amazon Redshift as a data warehouse, and look for a cloud integration solution with:
- Batch processing capabilities that can use pushdown optimization at runtime to quickly move data into Redshift
- Native connectivity to DynamoDB so that you don't have to manually transform key-value data models into Redshift's row-and-column structure
- Data quality rules that allow only correct, validated data to be analyzed
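The manual transformation mentioned in the list above can be sketched as follows. The example takes items in DynamoDB's typed JSON form (for instance `{"N": "3"}` for a number), projects them onto a fixed set of columns, and applies a simple quality rule before the rows would be loaded. The column names, sample items, and the rule itself are assumptions made for illustration, and only a few DynamoDB type tags are handled.

```python
def unwrap(attribute):
    """Convert one DynamoDB-typed attribute, e.g. {"N": "3"}, to a Python value."""
    (type_tag, value), = attribute.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        # DynamoDB transmits numbers as strings.
        try:
            return int(value)
        except ValueError:
            return float(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    raise ValueError(f"unsupported DynamoDB type tag in this sketch: {type_tag}")


# Hypothetical target columns for a warehouse table.
COLUMNS = ("impression_id", "campaign", "clicks")


def to_row(item, columns=COLUMNS):
    """Project a schemaless item onto fixed columns; absent attributes become None."""
    return tuple(unwrap(item[name]) if name in item else None for name in columns)


def passes_quality_rules(row):
    """Hypothetical rule: the key is present and clicks is a non-negative integer."""
    impression_id, _campaign, clicks = row
    is_integer = isinstance(clicks, int) and not isinstance(clicks, bool)
    return impression_id is not None and is_integer and clicks >= 0


items = [
    {"impression_id": {"S": "imp-1"}, "campaign": {"S": "spring"}, "clicks": {"N": "3"}},
    {"impression_id": {"S": "imp-2"}, "clicks": {"N": "-1"}},
    {"impression_id": {"S": "imp-3"}, "campaign": {"S": "summer"},
     "clicks": {"N": "0"}, "device": {"S": "mobile"}},
]

rows = [to_row(item) for item in items]
clean_rows = [row for row in rows if passes_quality_rules(row)]
print(clean_rows)
```

Note how the second item lacks a campaign and the third carries an extra attribute: a key-value store accepts both, while a row-and-column table needs every row to fit the same columns. The second row is rejected by the quality rule because of its negative click count.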

Maximizing Your Return on Hadoop Projects

An essential element in the growth and utilization of big data, Hadoop offers a remarkable, proven technology for storing files and processing data in a highly distributed manner. MapReduce, the programming model that's used on top of HDFS, delivers ample power for processing large amounts of unstructured or semi-structured data across several nodes in single or multiple clusters.

When faced with several different types of data (both unstructured and structured) from a multitude of data sources, most organizations run into bottlenecks with a traditional data warehouse. The answer is to create a data lake with Hadoop to analyze this data. However, while MapReduce is a great model for breaking up tasks into smaller pieces, it's not well suited for high-performance analytics. Considering that most Hadoop clusters hold several terabytes of data, Amazon Redshift's compression capabilities can help make the enormous volume of data in these clusters manageable for analysis.
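For readers unfamiliar with the model, the following single-process Python sketch shows the map, shuffle, and reduce phases that MapReduce uses to break a task into smaller pieces. It is only an illustration of the programming model: a real Hadoop job distributes these phases across nodes and reads its input from HDFS. The sample posts and function names are invented for this example.

```python
from collections import defaultdict


def map_phase(post):
    """Emit one (hashtag, 1) pair for every hashtag found in a post."""
    for word in post.split():
        if word.startswith("#"):
            yield word.lower(), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input records.
posts = [
    "Loving the #Keynote today",
    "#keynote was packed",
    "New #cloud launch at the #Keynote",
]

pairs = [pair for post in posts for pair in map_phase(post)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

Each post can be mapped independently, which is what makes the model easy to parallelize. Interactive, ad hoc questions over the results are a different workload, which is where a warehouse such as Amazon Redshift fits.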

If you're using Hadoop with Amazon Redshift, look for an integration solution that:
- Offers native connectors to all major Hadoop distributions, such as Cloudera, Hortonworks, and MapR, as well as Amazon's own cloud-based Hadoop distribution, Amazon Elastic MapReduce (Amazon EMR)
- Features advanced mapping functionality for aggregations, joins, and other transformations

Case in Point: Global Media Giant UBM Tech

UBM Tech, a global media business for the world's technology communities, wanted to target its users better. Its customers' interactions range from attending in-person events and online webcasts to reading articles, and customers access this content on a variety of devices. The data from these interactions is stored in a fragmented constellation of cloud apps, on-premises apps, databases, and other systems. To begin moving toward a more holistic customer view, UBM Tech identified a need to consolidate customer demographic and company information with engagement preferences, content download history, influencer status, and event registration preferences.

UBM Tech used Informatica Cloud Data Integration to merge and load data from several marketing automation, content taxonomy, online registration, and onsite registration systems, such as Eloqua, OpenCalais, EV2, and NextGen, into Amazon Redshift. Then, Informatica Cloud Data Quality helped create an integrated view of the customer by using data cleansing and data quality rules.

By using the Informatica Cloud solution portfolio and Amazon Redshift, UBM Tech gained significant insights into the entire content consumption lifecycle of its user base, and leveraged the power of Informatica Cloud to ready its user data for interactive visualization and dashboarding using Tableau and for predictive modeling and simulation using R. UBM Tech's efforts quadrupled client-delivered leads and drove a 27% uplift in the open rate for event marketing efforts.

"UBM Tech has been able to achieve double-digit improvements in uplift for our marketing results, by using customer behavioral data and predictive analytics to drive the majority of our online and event marketing activities."
Amandeep Sandhu, Vice President, Customer Insight, UBM Tech

Next Steps

The Informatica Cloud Trial allows you to quickly and reliably import data into Amazon Redshift from on-premises enterprise systems or other cloud applications. Leverage the power of this award-winning platform free for 60 days and explore the newest frontier in enterprise data warehousing.

About Informatica

We help companies manage their data so they can gain measurable business value from it. And we're helping some of the biggest companies in the world navigate the most common data management mistakes to succeed at scalable, repeatable big data projects. Let's talk.