Copyr i g ht 2012, SAS Ins titut e Inc. All rights res er ve d. DATA MANAGEMENT FOR ANALYTICS
|
|
- Marjorie Pearson
- 8 years ago
- Views:
Transcription
1 DATA MANAGEMENT FOR ANALYTICS
2 WHAT IS ANALYTICS? A VERY BROAD TERM OFTEN CONFUSED Descriptive What happened? When? Why? Advanced What will happen? When? Why? How do we benefit? What actions should I take? ANALYTICS 2
3 THE DISCONNECT TOH-MAY-TOH TOH-MAH-TOH I need data. Can you be more specific? Nope, not yet.???? 3
4 PARADIGM SHIFT IT S ABOUT THE DATA S DESTINATION Design Extract Transform Load Validate Refresh 4
5 PARADIGM SHIFT IT S ABOUT THE DATA S JOURNEY Access Explore Clean Transform Analytic method 5
6 PREPARING DATA FOR ANALYTICS Data Access Understand Cleansing Reshape Leverage Metadata Access to multiple sources of data. New sources combined with existing and legacy sources: Validate data movement and verify consistency and completeness: Perform cleansing functions on joined data to increase value Data is rarely in the format needed and different methods of analytics require different shapes of data Metadata is a valuable asset to assist in the collaboration between business and IT Data Types Data Movement Combine Filter Statistical Analysis Distributions Associations De-duplication Enrichment Standardization Missing values Wide Long Transposition Understand how models are built Collaborate on the data TRADITIONALLY OPERATIONALIZED MANUAL AUTOMATED PROCESS PROCESS 80% 20% 80% 20% 6
7 ACCESS SO MANY DATA TYPES AND SOURCES Access Excel SQLServer Oracle MySQL boolean Yes/No Bit Byte N/A Boolean integer Number Int Number Int Int float Number (single) Float Number Float Numeric currency Currency Money NA NA Money string NA Char Char Char Char string Text VarChar VarChar VarChar VarChar binary OLE Obj Memo Binary Varbinary Image Long Raw Blob Text Binary Varbinary 7
8 ACCESS SO MUCH DATA MOVEMENT Data Data Data SAS Server Push some, or ALL processing to the data 8
9 UNDERSTAND WHAT DO I HAVE AND HOW USEFUL IS IT? Is my data consistent? Is my data complete? Is my data highly unique? 9
10 UNDERSTAND Is my data normal? WHAT DO I HAVE AND HOW USEFUL IS IT? Is my data linear? What are the associations in the data? 10
11 CLEAN FILLING IN THE GAPS AND STANDARDIZING Standardizing Text Standardizing Numeric De-duplication 11
12 CLEAN FILLING IN THE GAPS AND STANDARDIZING Dropping outliers Grouping or binning data 12
13 RESHAPE PURPOSE BUILT DATA STRUCTURE Efficient storage Fast retrieval Defined schema WIDE tables / Time series data Iteration (build, test, repeat) Schema-less 13
14 RESHAPE TURNING DATA AROUND Add up all the quantities for each product purchased in each product category. 14
15 RESHAPE TURNING DATA AROUND Each product category will become its own row, with each product purchased its own distinct category column. 15
16 PARADIGM SHIFT IT S ABOUT THE DATA S JOURNEY Access Explore Clean Transform Analytic method How can we do this better? 16
17 METADATA LINEAGE & TRACEABILITY A view into existing data sources/targets, jobs and the associated owners 17
18 METADATA COLLABORATION AND REPEATABILITY Managed, collaborative environments with shared content, data sources and personal development space 18
19 LEVERAGING A FRAMEWORK FOR SUCCESS SOURCES DATA MANAGEMENT DATA GOVERANCE CONSUMERS EVENT STREAM PROCESSING DATA INTEGRATION XML Cloud DATA ACCESS DATA QUALITY MQ DATA VIRTUALIZATION MASTER DATA MGMT RDBMS 19
20 GROWTH OF THE INTERNET OF THINGS TRENDS TODAY
21
22
23
24
25 Publish Subscribe ENGINEERED FOR FAST AND ADAPTIVE ACTION Event Stream Processing Model Streaming Events Event Actions Continuous Query SAS In-Memory SAS-generated Insights Enrichment Data Analytic Models Busines s Rules Copyr i g ht 2015, SAS Ins titut e Inc. All rights res er ve d.
26 Publish Subscribe ENGINEERED FOR FAST AND ADAPTIVE ACTION Event Stream Processing Model Streaming Events Event Actions Continuous Query SAS In-Memory SAS-generated Insights Enrichment Data Analytic Models Busines s Rules Copyr i g ht 2015, SAS Ins titut e Inc. All rights res er ve d.
27 Publish Subscribe ENGINEERED FOR FAST AND ADAPTIVE ACTION Event Stream Processing Model Streaming Events Event Actions Continuous Query SAS In-Memory Low-latency assessment of high-volume, high-velocity data streams to detect, filter, aggregate & analyze SAS-generated Insights Enrichment Data Analytic Models Busines s Rules Copyr i g ht 2015, SAS Ins titut e Inc. All rights res er ve d.
28 STREAMING DATA TAKE REAL TIME ACTION APPLY MULTI-PHASE ANALYTICS FOCUS ON RELEVANT DATA Detect and monitor events of interest and trigger appropriate realtime actions & alerts Apply multi-phase analytics to determine events that can benefit from deeper and more complex analysis Continuous loading of relevant streaming data for in-depth analytics 28
29 HADOOP TRENDS WHY HADOOP? $ 1. Store data for less 2. Process data more quickly (for less $ ) 29
30 HADOOP TRENDS ROLES IT S PLAYING Stage structured data. Process structured data. Archive any data. Process any data. Access any data. (via data warehouse) Access any data. (via Hadoop) 30
31 TERMINOLOGY TRADITIONAL Primary Key RDBMS Relationship Index Normalize Primary Key Database Constraint Table Foreign Key SQL Schema 31
32 TERMINOLOGY HADOOP Hadoop Cluster NameNode Pig Hive DataNode HDFS YARN Block Cloudera JobTracker MapReduce 32
33 TERMINOLOGY Παραδεισένι ο νησί. IT S ALL GREEK TO ME (MOST)! Αρχαίοι ναοί. Είναι όλα τα ελληνικά μου. Σαλάτα. Ο Θεός της βροντής. Γιαούρτι. Ολυμπιακοί Αγώνες. Ελληνορωμ αϊκή. Όμορφη αρχιτεκτονικ ή. Μεγάλοι της λογοτεχνίας και της φιλοσοφίας. Τραγωδία. Μεσογείου. 33
34 SAS & INTEL STUDY Results & Key Findings HADOOP ADOPTION & CHALLENGES 60% - cited advanced analytics, data discovery, or as an analytical lab Research summary: SAS and Intel asked more than 300 IT-managers from the largest companies in Denmark, Finland, Norway and Sweden about the adoption of Big Data analytics and Hadoop. Primary reason for considering Hadoop 22% - would like to speed up processing Adoption / Obstacles 35% - cited Resources and Competencies 34
35 HADOOP BIG DATA CHALLENGES 35
36 CHALLENGES HADOOP SKILLS SHORTAGE CURRENT USER TOOLS ARE NOT BIG DATA ENABLED 1) Performing even the simplest tasks in Hadoop typically requires mastering disparate tools and writing hundreds of lines of code MapReduce Pig Latin HiveQL HDFS Sqoop and Oozie 2) User tools are not engineered to process data inside Hadoop. Tools are not optimized for Hadoop Users move data out of Hadoop to do data management and data quality This requires more processing time Data is duplicated and more storage is required Users do not use the Hadoop platform as it was designed 36
37 SELF-SERVICE DATA PREPARATION FOR HADOOP Manage Data inside Hadoop Reduce Complexity of Hadoop Accelerate User adoption 37
38 SAS DATA LOADER FOR HADOOP SELF-SERVICE DATA PREPARATION FOR HADOOP Reduce Complexity of Hadoop Manage Data inside Hadoop Accelerate User adoption Query, Join and Filter Transform and Integrate Analytics Profile Hadoop Load into and memory Cleanse Empower Business Users Unburden IT - Harness the Power of Big Data 38
39 SAS DATA MANAGEMENT THE DATA MANAGEMENT JOURNEY GETTING STARTED What does Data Governance mean to us? How do we implement and sustain a program? You can get there from here! How do we even get started? 39
40 REVERSE IT BE MORE PRODUCTIVE 20% 80% 40
41 MERCI BEAUCOUP!
Safe Harbor Statement
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment
More informationSQL. Short introduction
SQL Short introduction 1 Overview SQL, which stands for Structured Query Language, is used to communicate with a database. Through SQL one can create, manipulate, query and delete tables and contents.
More informationOracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
More informationConstructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
More informationHadoop Job Oriented Training Agenda
1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module
More informationWhite Paper. Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices.
White Paper Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices. Contents Data Management: Why It s So Essential... 1 The Basics of Data Preparation... 1 1: Simplify Access
More informationHow To Create A Table In Sql 2.5.2.2 (Ahem)
Database Systems Unit 5 Database Implementation: SQL Data Definition Language Learning Goals In this unit you will learn how to transfer a logical data model into a physical database, how to extend or
More informationData Governance in the Hadoop Data Lake. Michael Lang May 2015
Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales
More informationIntroduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.
Big Data Hadoop Administration and Developer Course This course is designed to understand and implement the concepts of Big data and Hadoop. This will cover right from setting up Hadoop environment in
More informationEnd to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ
End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,
More informationCisco Data Preparation
Data Sheet Cisco Data Preparation Unleash your business analysts to develop the insights that drive better business outcomes, sooner, from all your data. As self-service business intelligence (BI) and
More informationOracle Big Data Building A Big Data Management System
Oracle Big Building A Big Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Product Director May, 2015 Safe Harbor Statement The following
More information<Insert Picture Here> Big Data
Big Data Kevin Kalmbach Principal Sales Consultant, Public Sector Engineered Systems Program Agenda What is Big Data and why it is important? What is your Big
More informationANALYTICS IN BIG DATA ERA
ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyr i g ht 2012, SAS Ins titut
More informationHadoop & SAS Data Loader for Hadoop
Turning Data into Value Hadoop & SAS Data Loader for Hadoop Sebastiaan Schaap Frederik Vandenberghe Agenda What s Hadoop SAS Data management: Traditional In-Database In-Memory The Hadoop analytics lifecycle
More informationSpring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE
Spring,2015 Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE Contents: Briefly About Big Data Management What is hive? Hive Architecture Working
More informationLecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop
Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social
More informationSQL Server An Overview
SQL Server An Overview SQL Server Microsoft SQL Server is designed to work effectively in a number of environments: As a two-tier or multi-tier client/server database system As a desktop database system
More informationMicrosoft SQL Server Connector for Apache Hadoop Version 1.0. User Guide
Microsoft SQL Server Connector for Apache Hadoop Version 1.0 User Guide October 3, 2011 Contents Legal Notice... 3 Introduction... 4 What is SQL Server-Hadoop Connector?... 4 What is Sqoop?... 4 Supported
More informationIntroduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
More informationNew Modeling Challenges: Big Data, Hadoop, Cloud
New Modeling Challenges: Big Data, Hadoop, Cloud Karen López @datachick www.datamodel.com Karen Lopez Love Your Data InfoAdvisors.com @datachick Senior Project Manager & Architect 1 Disclosure I m a Data
More informationOracle Big Data Discovery Unlock Potential in Big Data Reservoir
Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All
More informationImplement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
More informationQsoft Inc www.qsoft-inc.com
Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:
More informationBIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata
BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING
More informationHDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
More informationCloudera Certified Developer for Apache Hadoop
Cloudera CCD-333 Cloudera Certified Developer for Apache Hadoop Version: 5.6 QUESTION NO: 1 Cloudera CCD-333 Exam What is a SequenceFile? A. A SequenceFile contains a binary encoding of an arbitrary number
More informationQLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM
QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment
More informationGAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.
More informationBringing the Power of SAS to Hadoop. White Paper
White Paper Bringing the Power of SAS to Hadoop Combine SAS World-Class Analytic Strength with Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities Contents Introduction... 1 What
More informationThe Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
More informationHadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
More informationMySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering
MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation
More informationArchitecting for the Internet of Things & Big Data
Architecting for the Internet of Things & Big Data Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to
More informationINTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe
More informationAn Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics
An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,
More informationGør dine big data klar til analyse på en nem måde med Hadoop og SAS Data Loader for Hadoop. Jens Dahl Mikkelsen SAS Institute
Gør dine big data klar til analyse på en nem måde med Hadoop og SAS Data Loader for Hadoop Jens Dahl Mikkelsen SAS Institute Indhold Udfordringer for analytikeren Hvordan kan SAS Data Loader for Hadoop
More informationLuncheon Webinar Series May 13, 2013
Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration
More informationA Tour of the Zoo the Hadoop Ecosystem Prafulla Wani
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to
More informationHadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh
1 Hadoop: A Framework for Data- Intensive Distributed Computing CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 2 What is Hadoop? Hadoop is a software framework for distributed processing of large datasets
More informationData Lake In Action: Real-time, Closed Looped Analytics On Hadoop
1 Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop 2 Pivotal s Full Approach It s More Than Just Hadoop Pivotal Data Labs 3 Why Pivotal Exists First Movers Solve the Big Data Utility Gap
More informationCapitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
More informationMove Data from Oracle to Hadoop and Gain New Business Insights
Move Data from Oracle to Hadoop and Gain New Business Insights Written by Lenka Vanek, senior director of engineering, Dell Software Abstract Today, the majority of data for transaction processing resides
More informationBig Data and Advanced Analytics Technologies for the Smart Grid
1 Big Data and Advanced Analytics Technologies for the Smart Grid Arnie de Castro, PhD SAS Institute IEEE PES 2014 General Meeting July 27-31, 2014 Panel Session: Using Smart Grid Data to Improve Planning,
More informationA Scalable Data Transformation Framework using the Hadoop Ecosystem
A Scalable Data Transformation Framework using the Hadoop Ecosystem Raj Nair Director Data Platform Kiru Pakkirisamy CTO AGENDA About Penton and Serendio Inc Data Processing at Penton PoC Use Case Functional
More informationLofan Abrams Data Services for Big Data Session # 2987
Lofan Abrams Data Services for Big Data Session # 2987 Big Data Are you ready for blast-off? Big Data, for better or worse: 90% of world s data generated over last two years. ScienceDaily, ScienceDaily
More informationCIO Guide How to Use Hadoop with Your SAP Software Landscape
SAP Solutions CIO Guide How to Use with Your SAP Software Landscape February 2013 Table of Contents 3 Executive Summary 4 Introduction and Scope 6 Big Data: A Definition A Conventional Disk-Based RDBMs
More informationInternals of Hadoop Application Framework and Distributed File System
International Journal of Scientific and Research Publications, Volume 5, Issue 7, July 2015 1 Internals of Hadoop Application Framework and Distributed File System Saminath.V, Sangeetha.M.S Abstract- Hadoop
More informationProgramming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview
Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce
More informationMySQL and Hadoop. Percona Live 2014 Chris Schneider
MySQL and Hadoop Percona Live 2014 Chris Schneider About Me Chris Schneider, Database Architect @ Groupon Spent the last 10 years building MySQL architecture for multiple companies Worked with Hadoop for
More informationQUEST meeting Big Data Analytics
QUEST meeting Big Data Analytics Peter Hughes Business Solutions Consultant SAS Australia/New Zealand Copyright 2015, SAS Institute Inc. All rights reserved. Big Data Analytics WHERE WE ARE NOW 2005 2007
More informationOracle Big Data Discovery The Visual Face of Hadoop
Disclaimer: This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development,
More informationData processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
More informationOracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
More informationBIG DATA TECHNOLOGY. Hadoop Ecosystem
BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big
More informationHadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science
A Seminar report On Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science SUBMITTED TO: www.studymafia.org SUBMITTED BY: www.studymafia.org
More informationOracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
More informationWhat's New in SAS Data Management
Paper SAS034-2014 What's New in SAS Data Management Nancy Rausch, SAS Institute Inc., Cary, NC; Mike Frost, SAS Institute Inc., Cary, NC, Mike Ames, SAS Institute Inc., Cary ABSTRACT The latest releases
More informationWhy Big Data in the Cloud?
Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data
More informationbrief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 PART 2 PART 3 BIG DATA PATTERNS...253 PART 4 BEYOND MAPREDUCE...385
brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 1 Hadoop in a heartbeat 3 2 Introduction to YARN 22 PART 2 DATA LOGISTICS...59 3 Data serialization working with text and beyond 61 4 Organizing and
More informationHadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
More informationTeradata s Big Data Technology Strategy & Roadmap
Teradata s Big Data Technology Strategy & Roadmap Artur Borycki, Director International Solutions Marketing 18 March 2014 Agenda > Introduction and level-set > Enabling the Logical Data Warehouse > Any
More informationHadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
More informationBig Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth
MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager steve.gonzales@thinkbiganalytics.com
More informationIntegrating VoltDB with Hadoop
The NewSQL database you ll never outgrow Integrating with Hadoop Hadoop is an open source framework for managing and manipulating massive volumes of data. is an database for handling high velocity data.
More informationHarnessing big data with Hortonworks Data Platform and Red Hat JBoss Data Virtualization
Harnessing big data with Hortonworks Data Platform and Red Hat JBoss Data Virtualization Kimberly Palko, Product Manager Red Hat JBoss Doug Reid, Director Partner Product Management Hortonworks Cojan van
More informationBringing Big Data to People
Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process
More informationBig Data: Using ArcGIS with Apache Hadoop. Erik Hoel and Mike Park
Big Data: Using ArcGIS with Apache Hadoop Erik Hoel and Mike Park Outline Overview of Hadoop Adding GIS capabilities to Hadoop Integrating Hadoop with ArcGIS Apache Hadoop What is Hadoop? Hadoop is a scalable
More informationLecture 10: HBase! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl
Big Data Processing, 2014/15 Lecture 10: HBase!! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind the
More informationISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies
ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
More informationDRIVING THE CHANGE ENABLING TECHNOLOGY FOR FINANCE 15 TH FINANCE TECH FORUM SOFIA, BULGARIA APRIL 25 2013
DRIVING THE CHANGE ENABLING TECHNOLOGY FOR FINANCE 15 TH FINANCE TECH FORUM SOFIA, BULGARIA APRIL 25 2013 BRAD HATHAWAY REGIONAL LEADER FOR INFORMATION MANAGEMENT AGENDA Major Technology Trends Focus on
More informationInternational Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763
International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 A Discussion on Testing Hadoop Applications Sevuga Perumal Chidambaram ABSTRACT The purpose of analysing
More informationCOURSE CONTENT Big Data and Hadoop Training
COURSE CONTENT Big Data and Hadoop Training 1. Meet Hadoop Data! Data Storage and Analysis Comparison with Other Systems RDBMS Grid Computing Volunteer Computing A Brief History of Hadoop Apache Hadoop
More informationMoving From Hadoop to Spark
+ Moving From Hadoop to Spark Sujee Maniyam Founder / Principal @ www.elephantscale.com sujee@elephantscale.com Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee
More informationCOSC 6397 Big Data Analytics. 2 nd homework assignment Pig and Hive. Edgar Gabriel Spring 2015
COSC 6397 Big Data Analytics 2 nd homework assignment Pig and Hive Edgar Gabriel Spring 2015 2 nd Homework Rules Each student should deliver Source code (.java files) Documentation (.pdf,.doc,.tex or.txt
More informationBig Data Introduction
Big Data Introduction Ralf Lange Global ISV & OEM Sales 1 Copyright 2012, Oracle and/or its affiliates. All rights Conventional infrastructure 2 Copyright 2012, Oracle and/or its affiliates. All rights
More informationAre You Big Data Ready?
ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain
More informationCollaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!
More informationFact Sheet In-Memory Analysis
Fact Sheet In-Memory Analysis 1 Copyright Yellowfin International 2010 Contents In Memory Overview...3 Benefits...3 Agile development & rapid delivery...3 Data types supported by the In-Memory Database...4
More informationHortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved
Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment
More informationBig Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
More informationGetting Started with Hadoop. Raanan Dagan Paul Tibaldi
Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop
More informationWorkshop on Hadoop with Big Data
Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly
More informationBig Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management
Big Data and New Paradigms in Information Management Vladimir Videnovic Institute for Information Management 2 "I am certainly not an advocate for frequent and untried changes laws and institutions must
More informationLarge scale processing using Hadoop. Ján Vaňo
Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine
More informationApache Hadoop: The Big Data Refinery
Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data
More informationAssociate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
More informationCertified Big Data and Apache Hadoop Developer VS-1221
Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer Certification Code VS-1221 Vskills certification for Big Data and Apache Hadoop Developer Certification
More informationData Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015
Data Governance in the Hadoop Data Lake Kiran Kamreddy May 2015 One Data Lake: Many Definitions A centralized repository of raw data into which many data-producing streams flow and from which downstream
More informationThe Future of Data Management with Hadoop and the Enterprise Data Hub
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
More informationOracle Data Integrator for Big Data. Alex Kotopoulis Senior Principal Product Manager
Oracle Data Integrator for Big Data Alex Kotopoulis Senior Principal Product Manager Hands on Lab - Oracle Data Integrator for Big Data Abstract: This lab will highlight to Developers, DBAs and Architects
More informationVelociData Solving the Need for Speed in DataOps. Inside Analysis / Bloor Group Briefing June 13, 2014
VelociData Solving the Need for Speed in DataOps Inside Analysis / Bloor Group Briefing June 13, 2014 1 Transforming Speed and Economics of Data Operations to Achieve Time-Bound Service Levels, Gain Wire-Speed
More informationMaking Sense of Big Data in Insurance
Making Sense of Big Data in Insurance Amir Halfon, CTO, Financial Services, MarkLogic Corporation BIG DATA?.. SLIDE: 2 The Evolution of Data Management For your application data! Application- and hardware-specific
More informationBig Data: What You Should Know. Mark Child Research Manager - Software IDC CEMA
Big Data: What You Should Know Mark Child Research Manager - Software IDC CEMA Agenda Market Dynamics Defining Big Data Technology Trends Information and Intelligence Market Realities Future Applications
More informationInformation Builders Mission & Value Proposition
Value 10/06/2015 2015 MapR Technologies 2015 MapR Technologies 1 Information Builders Mission & Value Proposition Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns
More informationHas been into training Big Data Hadoop and MongoDB from more than a year now
NAME NAMIT EXECUTIVE SUMMARY EXPERTISE DELIVERIES Around 10+ years of experience on Big Data Technologies such as Hadoop and MongoDB, Java, Python, Big Data Analytics, System Integration and Consulting
More informationData. Data and database. Aniel Nieves-González. Fall 2015
Data and database Aniel Nieves-González Fall 2015 Data I In the context of information systems, the following definitions are important: 1 Data refers simply to raw facts, i.e., facts obtained by measuring
More informationNew Modeling Challenges: Big Data, Hadoop, Cloud
New Modeling Challenges: Big Data, Hadoop, Cloud Karen López @datachick Karen Lopez Love Your Data InfoAdvisors.com @datachick Senior Project Manager & Architect 1 Abstract We data architects have done
More informationAdvanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
More informationDell In-Memory Appliance for Cloudera Enterprise
Dell In-Memory Appliance for Cloudera Enterprise Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert Armando_Acosta@Dell.com/
More information