Oracle Big Data Spatial & Graph Social Network Analysis - Case Study
1 Oracle Big Data Spatial & Graph Social Network Analysis - Case Study Mark Rittman, CTO, Rittman Mead OTN EMEA Tour, May 2016 [email protected]
3 About the Speaker
- Mark Rittman, Co-Founder of Rittman Mead
- Oracle ACE Director, specialising in Oracle BI&DW
- 14 years' experience with Oracle technology
- Regular columnist for Oracle Magazine
- Author of two Oracle Press Oracle BI books: Oracle Business Intelligence Developers Guide and Oracle Exalytics Revealed
- Writer for the Rittman Mead Blog
- Contact: [email protected], Twitter: [email protected]
4 About Rittman Mead
- Oracle Gold Partner with offices in the UK and USA (Atlanta)
- 70+ staff delivering Oracle BI, DW, Big Data and Advanced Analytics projects
- Oracle ACE Director (Mark Rittman, CTO) + 2 Oracle ACEs
- Significant web presence with the Rittman Mead Blog
- Regular users of social media (Facebook, Twitter, Slideshare etc)
- Regular column in Oracle Magazine and other publications
- Hadoop R&D lab for dogfooding solutions developed for customers
5 Business Scenario
- Rittman Mead want to understand drivers and audience for their website
- What is our most popular content?
- Who are the most in-demand blog authors?
- Who are the influencers?
- What communities exist around our web presence?
- Three data sources in scope: RM Website Logs; Twitter Stream; Website Posts, Comments etc
6 Real-Time & Batch Log & Event Ingestion : ODI12c
- ODI provides an excellent framework for running Hadoop ETL jobs
- ELT approach pushes transformations down to Hadoop, leveraging the power of the cluster
- Hive, HBase, Sqoop and OLH/ODCH KMs provide native Hadoop loading / transformation
- Whilst still preserving RDBMS push-down
- Extensible to cover Pig, Spark etc
- Process orchestration
- Data quality / error handling
- Metadata and model-driven
- New: ability to generate Pig and Spark jobs too
7 Overall Project Architecture - Phase 1
- Initial iteration of project focused on capturing and ingesting web + social media activity
- Apache Flume used for capturing website hits, page views
- Twitter Streaming API used to capture tweets referring to RM website or RM staff
- Activity landed into Hadoop (HDFS), processed and enriched, and presented using Hive
8 Real-Time Metrics around Site Activity - What?
- Provided real-time counts of page views, correlated with Twitter activity stored in Hive tables
- Accessed using Oracle Big Data SQL + joined to Oracle RDBMS reference data
- Delivered using OBIEE reports and dashboards
- Data Warehousing, but cheaper + real-time
- Answered questions such as: What are our most popular site pages? Which pages attracted the most attention on Twitter, Facebook? What topics are popular?
[Diagram labels: Combine with Oracle Big Data SQL for structured OBIEE dashboard analysis; What pages are people visiting? Who is referring to us on Twitter? What content has the most reach?]
9 Oracle BDD for Data Wrangling + Data Enrichment
- Oracle Big Data Discovery used to go back to the raw event data and add more meaning
- Enrich data, extract nouns + terms, add reference data from file, RDBMS etc
- Understand sentiment + meaning of tweets, link disparate + loosely coupled events
- Faceted search dashboards
10 Answered the What and Why Questions
- Counts of page views, tweets, mentions etc helped us understand what content was popular
- Analysis of tweet sentiment, meaning and correlation with content answered why
- Combine with Oracle Big Data SQL for structured OBIEE dashboard analysis
- Combine with site content, semantics, text enrichment
- Catalog and explore using Oracle Big Data Discovery
- What pages are people visiting? Who is referring to us on Twitter? What content has the most reach?
- Why is some content more popular? Does sentiment affect viewership? What content is popular, where?
11 But Who Are The Influencers In Our Community?
- Previous counts assumed that all tweet references are equally important
- But some Twitter users are far more influential than others
- They sit at the centre of a community and have 1000s of followers
- A reference by them has massive impact on page views
- Positive or negative comments from them drive perception
- Can we identify them?
- Potentially reach out with analyst program
- Study what website posts go viral
- Understand our audience, and the conversation, better
[Diagram: Influencer Identification from a communication stream (e.g. tweets): find the people that are central in the given network, e.g. for influencer marketing]
12 What Communities and Networks Are Our Audience?
- Rittman Mead website features many types of content
- Blogs on BI, data integration, big data, data warehousing
- Op-Eds ("OBIEE12c - Three Months In, What's the Verdict?")
- Articles on a theme, e.g. performance tuning
- Details of new courses, new promotions
- Different communities likely to form around these content types
- Different influencers and patterns of recommendation, discovery
- Can we identify some of the communities, segment our audience?
[Diagram: Community Detection: identify groups of people that are close to each other, e.g. for target group marketing]
13 Tabular (SQL) Query Tools Aimed at Counts + Aggs
14 Graph Example : RM Blog Post Referenced on Twitter
[Diagram labels: Follows; "Lifting the Lid on OBIEE Internals with Linux Diagnostics Tools"; Page Views]
15 Network Effect Magnified by Extent of Social Graph
[Diagram labels: "Lifting the Lid on OBIEE Internals with Linux Diagnostics Tools"; Page Views]
16 Retweets by Influential Twitter Users Drive Visits
[Diagram labels: "RT: Lifting the Lid on OBIEE Internals with Linux Diagnostics Tools"; Retweet; Page Views]
17 Retweets, Mentions and Replies Create Communities
[Diagram labels: Retweet; Mention; Reply; #bigdatasql; #thatswhatshesaid]
18 Property Graph Terminology
[Diagram labels: Node, or Vertex; Vertex Properties; "Lifting the Lid on OBIEE Internals with Linux Diagnostics Tools"; Edge Type; Directed Connection, or Edge; Mentions]
19 Property Graph Terminology
[Diagram labels: Directed Connection, or Edge; Retweets; Mentions; Node, or Vertex; "Lifting the Lid on OBIEE Internals with Linux Diagnostics Tools"]
20 Determining Influencers - Factors to Consider
- Different types of Twitter interaction could imply more or less influence
- Retweet of another user's tweet implies that person is worth quoting or you endorse their opinion
- Reply to another user's tweet could be a weaker recognition of that person's opinion or view
- Mention of a user in a tweet is a weaker recognition that they are part of a community / debate
21 Relative Importance of Edge Types Added via Weights
[Diagram labels: Edge Property: Retweet, Weight = 100; Edge Property: Mentions, Weight = 30; "Lifting the Lid on OBIEE Internals with Linux Diagnostics Tools"]
22 Oracle Big Data Spatial & Graph
- Graph, spatial and raster data processing for big data
- Primarily documented + tested against Oracle BDA
- Installable on commodity cluster using CDH
- Data stored in Apache HBase or Oracle NoSQL DB
- Complements Spatial & Graph in Oracle Database
- Designed for trillions of nodes, edges etc
- Out-of-the-box spatial enrichment services
- Over 35 of the most popular graph analysis functions
- Graph traversal, recommendations
- Finding communities and influencers
- Pattern matching
23 Oracle Big Data Graph and Spatial Architecture
- Data loaded from files or through Java API into HBase
- In-Memory Analytics layer runs common graph and spatial algorithms on data
- Visualised using R or other graphics packages
[Diagram labels: Lightning-Fast In-Memory Analytics (YARN Container, Standalone Server, Embedded); Massively Scalable Graph Store (Oracle NoSQL, HBase)]
24 Preparing Vertices and Edges for Ingestion
- ODI12c used to prepare two files in Oracle Flat File Format
- Extracted vertices and edges from existing data in Hive
- Wrote vertices (Twitter users) to .opv file, edges (RTs, replies etc) to .ope file
- For exercise, only considered 2-3 days of tweets
- Did not include follows (user A followed user B) as not reported by Twitter Streaming API
- Could approximate larger follower networks through multiplying weight of edge by follower scale
- Useful for Page Rank, but does it skew actual detection of influencers in exercise?
25 Oracle Flat File Format Vertices and Edge Files
Vertex File (.opv)
- Unique ID for the vertex
- Property name ("name")
- Property value datatype (1 = String)
- Property value ("markrittman")
Edge File (.ope)
- Unique ID for the edge
- Leading edge vertex ID
- Trailing edge vertex ID
- Edge Type ("mentions")
- Edge Property ("weight")
- Edge Property datatype and value
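As a rough illustration of the field order listed above, the following Java sketch formats one vertex record and one edge record. It is a sketch under stated assumptions, not the loader's specification: it assumes comma-separated fields with separate string, numeric and date value slots after the datatype code, and it assumes datatype code 4 for Double (the slide only confirms 1 = String). The class name, method names and the "retweets" label are hypothetical; the weights are the two values shown on the earlier slide.

```java
import java.util.Locale;

/** Hypothetical helper: formats records in an assumed .opv/.ope layout. */
final class FlatFileSketch {
    // 1 = String comes from the slide; 4 = Double is an assumption.
    static final int TYPE_STRING = 1;
    static final int TYPE_DOUBLE = 4;

    /** Vertex record: id, property name, datatype, string value, numeric value, date value. */
    static String vertexLine(long id, String key, String value) {
        return id + "," + key + "," + TYPE_STRING + "," + value + ",,";
    }

    /** Edge record: id, leading vertex, trailing vertex, edge type, property name, datatype, string, numeric, date. */
    static String edgeLine(long id, long from, long to, String label, String key, double value) {
        return id + "," + from + "," + to + "," + label + "," + key + ","
                + TYPE_DOUBLE + ",," + String.format(Locale.ROOT, "%.1f", value) + ",";
    }

    /** Weight per interaction type; only the two weights shown on the slides are defined. */
    static double weightFor(String edgeType) {
        switch (edgeType) {
            case "retweets": return 100.0;
            case "mentions": return 30.0;
            default: throw new IllegalArgumentException("no weight given for " + edgeType);
        }
    }

    public static void main(String[] args) {
        // Vertex 1 is the user "markrittman"; edge 1 is vertex 2 mentioning vertex 1.
        System.out.println(vertexLine(1, "name", "markrittman"));
        System.out.println(edgeLine(1, 2, 1, "mentions", "weight", weightFor("mentions")));
    }
}
```

Property values containing commas or spaces would need escaping before being written this way; the sketch leaves that out.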
26 Loading Edges and Vertices into HBase
    cfg = GraphConfigBuilder.forPropertyGraphHbase() \
      .setName("connectionshbase") \
      .setZkQuorum("bigdatalite").setZkClientPort(2181) \
      .setZkSessionTimeout(120000).setInitialEdgeNumRegions(3) \
      .setInitialVertexNumRegions(3).setSplitsPerRegion(1) \
      .addEdgeProperty("weight", PropertyType.DOUBLE, " ") \
      .build();
    opg = OraclePropertyGraph.getInstance(cfg);
    opg.clearRepository();
    vfile="../../data/biwa_connections.opv"
    efile="../../data/biwa_connections.ope"
    opgdl=OraclePropertyGraphDataLoader.getInstance();
    opgdl.loadData(opg, vfile, efile, 2);
    // read through the vertices
    opg.getVertices();
    // read through the edges
    opg.getEdges();
- Uses Gremlin Shell for HBase
- Creates connection to HBase
- Sets initial configuration for database
- Builds the database ready for load
- Defines location of Vertex and Edge files
- Creates instance of OraclePropertyGraphDataLoader
- Loads data from files
- Prepares the property graph for use
- Loads in Edges and Vertices
- Now ready for in-memory processing
27 Calculating Most Influential Tweeters Using Page Rank
    voutput="/tmp/mygraph.opv"
    eoutput="/tmp/mygraph.ope"
    OraclePropertyGraphUtils.exportFlatFiles(opg, voutput, eoutput, 2, false);
    session = Pgx.createSession("session-id-1");
    analyst = session.createAnalyst();
    graph = session.readGraphWithProperties(opg.getConfig());
    rank = analyst.pagerank(graph, 0.001, 0.85, 100);
    rank.getTopKValues(10);
- Initiates an in-memory analytics session
- Runs Page Rank algorithm to determine influencers
- Outputs top ten vertices (users)
Top 10 vertices
    ==>PgxVertex with ID 1=
    ==>PgxVertex with ID 3=
    ==>PgxVertex with ID 101=
    ==>PgxVertex with ID 6=
    ==>PgxVertex with ID 37=
    ==>PgxVertex with ID 17=
    ==>PgxVertex with ID 29=
    ==>PgxVertex with ID 65=
    ==>PgxVertex with ID 15=
    ==>PgxVertex with ID 93=
28 Calculating Most Influential Tweeters Using Page Rank
    v1=opg.getVertex(1l); v2=opg.getVertex(3l); v3=opg.getVertex(101l); \
    v4=opg.getVertex(6l); v5=opg.getVertex(37l); v6=opg.getVertex(17l); \
    v7=opg.getVertex(29l); v8=opg.getVertex(65l); v9=opg.getVertex(15l); \
    v10=opg.getVertex(93l);
    System.out.println("Top 10 influencers: \n " + v1.getProperty("name") + \
      "\n " + v2.getProperty("name") + \
      "\n " + v3.getProperty("name") + \
      "\n " + v4.getProperty("name") + \
      "\n " + v5.getProperty("name") + \
      "\n " + v6.getProperty("name") + \
      "\n " + v7.getProperty("name") + \
      "\n " + v8.getProperty("name") + \
      "\n " + v9.getProperty("name") + \
      "\n " + v10.getProperty("name"));
- Note: over a 3-day period in May 2015
- Twitter users referencing RM website + staff accounts
Top 10 influencers:
    markrittman
    rmoff
    rittmanmead
    mrainey
    JeromeFr
    Nephentur
    borkur
    BIExperte
    i_m_dave
    dw_pete
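To make the Page Rank step less of a black box, here is a minimal plain-Java sketch of the iteration, using the same three parameters passed to analyst.pagerank on the slide (tolerance 0.001, damping factor 0.85, at most 100 iterations). It is not the PGX implementation. The class and method names are hypothetical, vertices are numbered 0..n-1 for simplicity, and splitting a vertex's rank across its outgoing edges in proportion to edge weight is an assumption about how the retweet and mention weights could be applied.

```java
import java.util.Arrays;

/** Illustrative weighted PageRank; a sketch, not the PGX algorithm. */
final class PageRankSketch {
    /**
     * @param n       number of vertices, identified as 0..n-1
     * @param edges   rows of {source, target}; a retweet or mention points at the user referenced
     * @param weights weight of each edge, in the same order as edges
     */
    static double[] pageRank(int n, int[][] edges, double[] weights,
                             double tolerance, double damping, int maxIterations) {
        double[] outWeight = new double[n];
        for (int i = 0; i < edges.length; i++) {
            outWeight[edges[i][0]] += weights[i];
        }
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        for (int iter = 0; iter < maxIterations; iter++) {
            // Rank held by vertices with no outgoing edges is spread evenly over all vertices.
            double dangling = 0.0;
            for (int v = 0; v < n; v++) {
                if (outWeight[v] == 0.0) dangling += rank[v];
            }
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - damping) / n + damping * dangling / n);
            for (int i = 0; i < edges.length; i++) {
                int from = edges[i][0];
                int to = edges[i][1];
                next[to] += damping * rank[from] * weights[i] / outWeight[from];
            }
            // Stop once the total change between iterations falls below the tolerance.
            double change = 0.0;
            for (int v = 0; v < n; v++) change += Math.abs(next[v] - rank[v]);
            rank = next;
            if (change < tolerance) break;
        }
        return rank;
    }

    /** Vertex ids of the k highest-ranked vertices, best first. */
    static int[] topK(double[] rank, int k) {
        Integer[] ids = new Integer[rank.length];
        for (int i = 0; i < ids.length; i++) ids[i] = i;
        Arrays.sort(ids, (a, b) -> Double.compare(rank[b], rank[a]));
        int[] top = new int[Math.min(k, ids.length)];
        for (int i = 0; i < top.length; i++) top[i] = ids[i];
        return top;
    }

    public static void main(String[] args) {
        // Toy graph: users 1, 2 and 3 reference user 0; user 0 mentions user 1.
        int[][] edges = { {1, 0}, {2, 0}, {2, 1}, {3, 0}, {0, 1} };
        double[] weights = { 100, 100, 30, 30, 30 };
        double[] rank = pageRank(4, edges, weights, 0.001, 0.85, 100);
        System.out.println(Arrays.toString(topK(rank, 2)));
    }
}
```

Because each vertex's rank is divided by its total outgoing weight, the absolute scale of the weights does not matter, only the ratio between edge types leaving the same user.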
29 Visualising Property Graphs with Cytoscape
- Open source graph analysis tool with Oracle Big Data Graph and Spatial plug-in
- Available shortly from Oracle, connects to Oracle NoSQL or HBase and runs Page Rank etc
- Alternative to command-line for In-Memory Analytics once base graph created
30 Calculating Top 10 Users using Page Rank Algorithm
31 Visualising the Social Graph Around Particular Users
32 Detecting Clusters (Communities)
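The slide does not say which clustering algorithm was run, so the following Java sketch uses the simplest possible stand-in: grouping users into weakly connected components with a union-find structure. Real community detection usually goes further than this (for example, splitting one large component into denser sub-groups). The class and method names are hypothetical, and vertices are numbered 0..n-1.

```java
/** Illustrative community grouping via connected components; a stand-in, not the method used in the talk. */
final class CommunitySketch {
    /** Finds the representative of v's group, shortening the chain as it goes. */
    private static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]]; // path halving
            v = parent[v];
        }
        return v;
    }

    /**
     * Returns one label per vertex. Vertices with the same label are linked by some
     * chain of retweets, replies or mentions, ignoring edge direction. The label is
     * the smallest vertex id in the group.
     */
    static int[] components(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int v = 0; v < n; v++) parent[v] = v;
        for (int[] e : edges) {
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            // Always keep the smaller id as the representative of the merged group.
            if (a != b) parent[Math.max(a, b)] = Math.min(a, b);
        }
        int[] label = new int[n];
        for (int v = 0; v < n; v++) label[v] = find(parent, v);
        return label;
    }

    public static void main(String[] args) {
        // Users 0-2 interact with each other, users 3-4 interact, user 5 is isolated.
        int[][] edges = { {0, 1}, {1, 2}, {3, 4} };
        System.out.println(java.util.Arrays.toString(components(6, edges)));
    }
}
```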
33 Calculating Shortest Path Between Users
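For readers unfamiliar with the idea, a fewest-hops path between two users can be found with a breadth-first search. The Java sketch below is a generic illustration and not the tool's own algorithm. It ignores edge weights and direction (both assumptions), numbers vertices 0..n-1, and uses hypothetical class and method names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative breadth-first shortest path; a sketch, not the plug-in's implementation. */
final class ShortestPathSketch {
    /** Fewest-hops path from source to target, treating edges as undirected; empty if unreachable. */
    static List<Integer> shortestPath(int n, int[][] edges, int source, int target) {
        List<List<Integer>> adjacent = new ArrayList<>();
        for (int v = 0; v < n; v++) adjacent.add(new ArrayList<>());
        for (int[] e : edges) {
            adjacent.get(e[0]).add(e[1]);
            adjacent.get(e[1]).add(e[0]);
        }
        int[] previous = new int[n];
        Arrays.fill(previous, -1);
        boolean[] seen = new boolean[n];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        seen[source] = true;
        queue.add(source);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            if (v == target) break;
            for (int next : adjacent.get(v)) {
                if (!seen[next]) {
                    seen[next] = true;
                    previous[next] = v; // remember how we reached this vertex
                    queue.add(next);
                }
            }
        }
        List<Integer> path = new ArrayList<>();
        if (!seen[target]) return path;
        // Walk back from the target to the source, then reverse.
        for (int v = target; v != -1; v = previous[v]) path.add(v);
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {1, 2}, {2, 3}, {0, 3} };
        System.out.println(shortestPath(5, edges, 0, 2));
    }
}
```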
34 Conclusions, and Further Reading
- Tools such as OBIEE are great for understanding what (counts, page views, popular items)
- Oracle Big Data Discovery can be useful for understanding why (sentiment, terms etc)
- Graph Analysis can help answer who
- Who are our audience? What are our communities? Who are their important influencers?
- Oracle Big Data Graph and Spatial can answer these questions at big data scale
- Articles on the Rittman Mead Blog
- Rittman Mead offer consulting, training and managed services for Oracle Big Data
Big Data Visualization using Apache Spark and Zeppelin Prajod Vettiyattil, Software Architect, Wipro Agenda Big Data and Ecosystem tools Apache Spark Apache Zeppelin Data Visualization Combining Spark
W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract
W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the
Oracle Data Integrator for Big Data. Alex Kotopoulis Senior Principal Product Manager
Oracle Data Integrator for Big Data Alex Kotopoulis Senior Principal Product Manager Hands on Lab - Oracle Data Integrator for Big Data Abstract: This lab will highlight to Developers, DBAs and Architects
Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
HDP Enabling the Modern Data Architecture
HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,
Oracle Big Data Strategy Simplified Infrastructure
Big Data Oracle Big Data Strategy Simplified Infrastructure Selim Burduroğlu Global Innovation Evangelist & Architect Education & Research Industry Business Unit Oracle Confidential Internal/Restricted/Highly
ITG Software Engineering
Introduction to Apache Hadoop Course ID: Page 1 Last Updated 12/15/2014 Introduction to Apache Hadoop Course Overview: This 5 day course introduces the student to the Hadoop architecture, file system,
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
Implement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
Qsoft Inc www.qsoft-inc.com
Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:
BIG DATA - HADOOP PROFESSIONAL amron
Training Details Course Duration: 30-35 hours training + assignments + actual project based case studies Training Materials: All attendees will receive: Assignment after each module, video recording
Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved
Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment
An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise
An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to outline our
Automated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer
Automated Data Ingestion Bernhard Disselhoff Enterprise Sales Engineer Agenda Pentaho Overview Templated dynamic ETL workflows Pentaho Data Integration (PDI) Use Cases Pentaho Overview Overview What we
Building Your Big Data Team
Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.
An Oracle BI and EPM Development Roadmap
An Oracle BI and EPM Development Roadmap Mark Rittman, Director, Rittman Mead UKOUG Financials SIG, September 2009 1 Who Am I? Oracle BI&W Architecture and Development Specialist Co-Founder of Rittman
How To Use Big Data For Business
Big Data Maturity - The Photo and The Movie Mike Ferguson Managing Director, Intelligent Business Strategies BA4ALL Big Data & Analytics Insight Conference Stockholm, May 2015 About Mike Ferguson Mike
Big Data Too Big To Ignore
Big Data Too Big To Ignore Geert: Big Data Consultant and Manager; currently finishing a 3rd Big Data project; IBM & Cloudera Certified; IBM & Microsoft Big Data Partner. Agenda: Defining Big Data; Introduction
