Advanced In-Database Analytics
Transcription
1 Advanced In-Database Analytics. Tallinn, Sept. 25th, 2012. Mikko-Pekka Bertling, BDM Greenplum EMEA
2 That sounds complicated?
3 Who can tell me how best to solve this?
4 What are the main mathematical functions? MULTIPLICATION, DIVISION, ADDITION
5 So What's the Problem? How many of you have tried to run complex queries that cannot complete? How many of you would love for your IT or other execs to just understand the basic maths? How many of you would like your analytics to be part of every business process?
6 The Platform. Data Science. Application Development.
7 THE PLATFORM: Greenplum UAP, the Unified Analytics Platform for Big Data. Greenplum Database for structured data; Greenplum HD, enterprise-ready Hadoop, for unstructured data; Greenplum Chorus, the social platform for data science.
8 Introducing EMC Greenplum Data Computing Appliance. DATA IN. DECISIONS OUT. Delivering the fastest data loading and best price/performance ratio in the data warehousing industry.
9 EMC Greenplum Data Computing Appliance: performance, scalability, reliability, and reduced TCO for data warehousing/business intelligence environments
- Extreme performance: optimized for fast query execution and unmatched data loading
- Rapidly deployable: purpose-built data warehousing appliance
- Reduced TCO: consolidate data marts for lower costs
- Private cloud-ready: data and computing are automatically optimized and distributed
- Highly available: self-healing and fully redundant
- Elastic scalability: expand capacity and performance online
- Advanced backup and disaster recovery: leverage industry-leading Data Domain backup and recovery
10 Benefits of an Appliance Approach
EMC Greenplum Data Computing Appliance: compute, storage, database, network
- Rapidly deployable in days, not weeks or months
- Appliance packaging and pre-tuning assure predictable performance
- Dramatically simplifies data warehouse and analytics infrastructure
- Reduces administration overhead
- Scale-out architecture: simply expand capacity and performance as needed
- Designed for rapid analysis of data volumes from less than a terabyte, scaling into the petabytes
- One support structure
11 EMC Greenplum Database: fastest data loading, advanced analytics
DATA IN:
- Scatter/Gather Streaming technology for the world's fastest data loading
- Eliminate data load bottlenecks
- Clean and integrate new data
- Several loading options, ranging from bulk load updates to microbatching for near real-time processing
IN-DATABASE ANALYTICS:
- Optimized for fast query execution and linear scalability
- Move processing closer to data
- Shared-nothing, massively parallel processing (MPP) scale-out architecture
- Computing is automatically optimized and distributed across resources
- Provides the best concurrent multiworkload performance
DECISIONS OUT:
- Unified data access for greater insight and value from data
- Enable parallel analysis across the enterprise
- Open platform with broad language support
- Certified enterprise connectivity and integration with most business intelligence; extract, transform, and load (ETL); and management products
12 Industry's Fastest Data-Loading Rate
Scatter/Gather Streaming technology for the industry's fastest data loading
- Eliminate data-load bottlenecks
- Remove additional loading tiers
- Parallel everywhere
[Bar chart: data-loading rate in TB/hour for Netezza TwinFin, Teradata, Oracle Exadata, and EMC Greenplum Data Computing Appliance; callouts 5X and 2X]
13 EMC Greenplum Data Computing Appliance Architecture
Flexible framework for processing large datasets
[Diagram labels: SQL, MapReduce, two master servers, five segment servers]
- Massively parallel processing (MPP) architecture
- Shared-nothing architecture
- No single coordinator or performance bottleneck
- MPP everywhere
- Query optimization across segment servers
- Automated failover
- High reliability and availability
- Linear scalability
- I/O optimized
14 Shared-Nothing Architecture: Massively Parallel Processing (MPP)
[Diagram labels: Interconnect, Loading]
Most scalable database architecture:
- Optimized for business intelligence and analytics
- Provides automatic parallelization
- No need for manual partitioning or tuning
- Just load and query like any database
- Tables are distributed across segments
- Each segment holds a subset of each table's rows
Extremely scalable and I/O optimized:
- All nodes can scan and process in parallel
- No I/O contention among segments
- Linear scalability by adding nodes
- Each node adds storage, query performance, and loading performance
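The distribution idea on slide 14 can be illustrated with a small sketch. It is not Greenplum code: the CRC32 hash, the `customer_id` distribution key, the segment count of 4, and all function names are assumptions chosen for illustration. It shows rows being placed on segments by hashing a key, each segment aggregating only its own rows, and the partial results being combined at the end.

```python
import zlib
from collections import defaultdict


def segment_for(key: str, num_segments: int) -> int:
    """Map a distribution-key value to a segment index with a stable hash.

    CRC32 is an illustrative choice, not the hash Greenplum uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, key_field, num_segments):
    """Place each row on exactly one segment, chosen by hashing its key."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(str(row[key_field]), num_segments)].append(row)
    return segments


def parallel_sum(segments, value_field):
    """Each segment sums its own rows; the partial sums are then combined."""
    partials = [sum(r[value_field] for r in rows) for rows in segments.values()]
    return sum(partials)


# Hypothetical table: 100 rows keyed by customer_id.
rows = [{"customer_id": i, "amount": i * 10} for i in range(1, 101)]
segments = distribute(rows, "customer_id", num_segments=4)
total = parallel_sum(segments, "amount")
```

Because every row lands on exactly one segment, the combined result equals the result of scanning the whole table in one place, while each segment only touches its own subset of the rows.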
15 High Availability: self-healing and rapid recovery
[Diagram labels: two master servers, four segment servers]
Master server data protection:
- RAID protection for drive failures
- Replicated transaction logs for server failure
- On server failure: standby server activated, administrator alerted
Segment server data protection:
- RAID protection for drive failures
- Mirrored segments for server failures
- On server failure: mirrored segments take over with no loss of service
- Fast online differential recovery
16 Self-Healing Automatic Failover
[Diagram labels: master servers, network interconnect, segment servers]
Greenplum provides automatic failover using a self-healing physical block replication architecture. Key benefits of this architecture:
- Automatic failure detection and failover to mirror segments
- Fast differential recovery and sync (while fully online/read-write)
- Improved write performance and reduced network load
17 Integrated EMC Data Domain Backup
[Diagram labels: EMC Greenplum Data Computing Appliance, segment server, NFS shares, Twinax/Fibre Channel cables, two 10 Gb IP links, EMC Data Domain DD880]
- Backup and recovery with EMC Data Domain/Greenplum native utility
- Reduces storage backup requirements: deduplicates data
- Fast, reliable data recovery: reduced recovery time
- Flexible and efficient: designate backup intervals, point-in-time copies
18 Proven Deployments of EMC Greenplum Database
Sample use cases across industries with Greenplum:
- Telecommunications, Media, and Entertainment: understand customer behaviors to reduce customer churn rates and develop customer loyalty programs
- Retail: analyze supply chain to optimize and cut costs
- Internet: clickstream analytics for ad targeting and market research
- Financial Services: detect and prevent fraud; credit scoring to reduce credit risk
- Pharmaceutical: analytics for drug discovery and development
19 Greenplum Data Computing Appliance Is Complementary to Enterprise Data Warehouse
Enterprise Data Warehouse:
- Single source of truth
- One logical model
- Heavy data governance and quality
- Operational reporting
- Financial consolidation
Greenplum Data Computing Appliance:
- Source of all the raw data (often 10 times the size of the enterprise data warehouse)
- Self-service infrastructure to support multiple data marts and sandboxes
- Rapid analytic iteration and business-led solutions
20 The Need for Consolidation: Data in a Typical Enterprise
- Enterprise data warehouse: ~10% of data
- Data marts and personal databases: ~90% of data
- Data is everywhere: corporate enterprise data warehouse, hundreds of data marts, shadow databases, and spreadsheets
- The goal of centralizing all data in a single enterprise data warehouse has proven untenable
21 GREENPLUM DATABASE: MADlib In-Database Analytical Functions
Descriptive Statistics: Quantile; Profile; CountMin (Cormode-Muthukrishnan) Sketch-based Estimator; FM (Flajolet-Martin) Sketch-based Estimator; MFV (Most Frequent Values) Sketch-based Estimator; Frequency; Histogram; Bar Chart; Box Plot Chart; Latent Dirichlet Allocation Topic Modeling
Modeling: Correlation Matrix; Association Rule Mining; K-Means Clustering; Naïve Bayes Classification; Linear Regression; Logistic Regression; Support Vector Machines; SVD Matrix Factorisation; Decision Trees/CART
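To give a feel for one of the sketch-based estimators named on slide 21, here is a minimal Count-Min sketch. It is not MADlib's implementation or API: the class name, the default width and depth, and the SHA-256-based row hashes are assumptions for illustration. The `merge` method hints at why such estimators suit a parallel database: sketches built independently on separate data partitions can be combined by adding their counters.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter that can overestimate but never underestimate."""

    def __init__(self, width=1000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting SHA-256 with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the tightest estimate.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

    def merge(self, other):
        """Combine two sketches of the same shape by adding their counters."""
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must share width and depth")
        merged = CountMinSketch(self.width, self.depth)
        for r in range(self.depth):
            merged.table[r] = [a + b for a, b in zip(self.table[r], other.table[r])]
        return merged


# Two hypothetical partitions, each summarized independently.
part_a = CountMinSketch()
for value in ["x", "x", "x", "y"]:
    part_a.add(value)

part_b = CountMinSketch()
for value in ["x", "x"]:
    part_b.add(value)

combined = part_a.merge(part_b)
```

After the merge, `combined.estimate("x")` is at least 5, the true count across both partitions; it can exceed the true count only when hash collisions occur.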
22 GREENPLUM HD: Mahout Analytical Functions for Hadoop
Sampling of algorithms in Mahout today:
- Collaborative Filtering
- User-based, Item-based recommenders
- K-Means Clustering
- Fuzzy K-Means Clustering
- Mean Shift Clustering
- Dirichlet Process Clustering
- Latent Dirichlet Allocation
- Singular Value Decomposition
- Parallel Frequent Pattern mining
- Complementary Naïve Bayes Classifier
- Random Forest Decision Tree-Based Classifier
- Java collections (previously Colt)
Many more are included or are in development. Plus, a robust and growing user community.
23 Powerful Partner Ecosystem
Partner categories: Analytics, Business Intelligence, Data Integration, Industry, Technology
[Partner logos; the only name recoverable from the transcript is Discovix]
24 So What's the Problem? How many of you have tried to run complex queries that cannot complete? How many of you would love for your IT or other execs to just understand the basic maths? How many of you would like your analytics to be part of every business process?
25 Greenplum Analytics Lab: Data Science. Leverage the expertise of Greenplum's Data Scientists
26 So What's the Problem? How many of you have tried to run complex queries that cannot complete? How many of you would love for your IT or other execs to just understand the basic maths? How many of you would like your analytics to be part of every business process?
27 Application Development: Pivotal Labs, the execution engine to quickly create and deploy Big Data applications
28 GREENPLUM DELIVERS THE PREDICTIVE ENTERPRISE
29 The Predictive Enterprise: Data-Driven Decisions
- Deliver maximum business value from all the available data
- Predict outcomes using advanced analytics
- Leverage data science to gain deep insight about the business
- Turn insight into action with new applications
30 LET'S GET STARTED
