Oracle Big Data Handbook

ORACLE Oracle Press. Oracle Big Data Handbook. Tom Plunkett, Brian Macdonald, Bruce Nelson, Helen Sun, Khader Mohiuddin, Debra L. Harding, David Segleau, Gokula Mishra, Mark F. Hornick, Robert Stackowiak, Keith Laker. McGraw-Hill Education. New York, Chicago, San Francisco, Athens, London, Madrid, Mexico City, Milan, New Delhi, Singapore, Sydney, Toronto.

Contents

Acknowledgments
Introduction

PART I Introduction

1 Introduction to Big Data
Big Data; Google's MapReduce Algorithm and Apache Hadoop; Oracle's Big Data Platform; Summary

2 The Value of Big Data
Am I Big Data, or Is Big Data Me?; Big Data, Little Data It's Still Me; What Happened?; Now What?; Reality, Check Please!; What Do You Make of It?; Information Chain Reaction (ICR); Big Data, Big Numbers, Big Business?; Twitter; Facebook; Internal Source; ICR: Connect; ICR: Change; Wanted: Big Data Value; Big Data Example 1: Clinical Trial Research Within the Healthcare Industry; Example 2: Improvements in Car Design for Driver Safety Within the Automotive Industry; Summary

PART II Big Data Platform

3 The Apache Hadoop Platform
Software vs. Hardware; The Hadoop Software Platform; Hadoop Distributions and Versions; The Hadoop Distributed File System (HDFS); Scheduling, Compute, and Processing; Operating System Choices; I/O and the Linux Kernel; The Hadoop Hardware Platform; CPU and Memory; Network; Disk; Putting It All Together

4 Why an Appliance?
Why Would Oracle Create a Big Data Appliance?; What Is an Appliance?; What Are the Goals of Oracle Big Data Appliance?; Optimizing an Appliance; Oracle Big Data Appliance Version 2 Software; Oracle Big Data Appliance X3-2 Hardware; Where Did Oracle Get Hadoop Expertise?; Configuring a Hadoop Cluster; Choosing the Core Cluster Components; Assembling the Cluster; What About a Do-It-Yourself Cluster?; Total Costs of a Cluster; Time to Value; How to Build Out Larger Clusters; Can I Add Other Software to Oracle Big Data Appliance?; Drawbacks of an Appliance

5 BDA Configurations, Deployment Architectures, and Monitoring
Introduction; Big Data Appliance X3-2 Full Rack (Eighteen Nodes); Big Data Appliance X3-2 Starter Rack (Six Nodes); Big Data Appliance X3-2 In-Rack Expansion (Six Nodes); Hardware Modifications to BDA; Software Supported on Big Data Appliance X; BDA Install and Configuration Process; Critical and Noncritical Nodes; Automatic Failover of the NameNode; BDA Disk Storage Layout; Adding Storage to a Hadoop Cluster; Hadoop-Only Config and Hadoop+NoSQL DB; Hadoop-Only Appliance; Hadoop and NoSQL DB; Memory Options; Deployment Architectures; Multitenancy and Hadoop in the Cloud; Scalability; Multirack BDA Considerations; Installing Other Software on the BDA; BDA in the Data Center; Administrative Network; Client Access Network; InfiniBand Private Network; Network Requirements; Connecting to Data Center LAN; Example Connectivity Architecture; Oracle Big Data Appliance Restrictions on Use; BDA Management and Monitoring; Enterprise Manager; Cloudera Manager; Hadoop Monitoring Utilities: Web GUI; Oracle ILOM; Hue; DCLI Utility

6 Integrating the Data Warehouse and Analytics Infrastructure to Big Data
The Data Warehouse as a Historic Database of Record; The Oracle Database as a Data Warehouse; Why the Data Warehouse and Hadoop Are Deployed Together; Completing the Footprint: Business Analyst Tools; Building Out the Infrastructure

7 BDA Connectors
Oracle Big Data Connectors; Oracle Loader for Hadoop; Online Mode; Oracle OCI Direct Path Output; JDBC Output; Offline Mode; Oracle Data Pump Output; Delimited Text Output; Installation of Oracle Loader for Hadoop; Invoking Oracle Loader for Hadoop; Input Formats; DelimitedTextInputFormat; RegexInputFormat; AvroInputFormat; HiveToAvroInputFormat; KVAvroInputFormat; Custom Input Formats; Oracle Loader for Hadoop Configuration Files; Loader Maps; Additional Optimizations; Leveraging InfiniBand; Comparison to Apache Sqoop; Oracle SQL Connector for HDFS; Installation of Oracle SQL Connector for HDFS; HIVE Installation; Creating External Tables Using Oracle SQL Connector for HDFS; ExternalTable Configuration Tool; Data Source Types; Configuration Tool Syntax; Required Properties; Optional Properties; ExternalTable Tool for Delimited Text Files; Testing DDL with -noexecute; Adding a New HDFS File to the Location File; Manual External Table Configuration; Hive Sources; ExternalTable Example; Oracle Data Pump Sources; Configuration Files; Querying with Oracle SQL Connector for HDFS; Oracle R Connector for Hadoop; Oracle Data Integrator Application Adapter for Hadoop

8 Oracle NoSQL Database
What Is a NoSQL Database System?; NoSQL Applications; Oracle NoSQL Database; A Sample Use Case; Architecture; Client Driver; Key-Value Pairs; Storage Nodes; Replication; Smart Topology; Online Elasticity; No Single Point of Failure; Data Management; APIs; CRUD Operations; Multiple Update Operations; Lookup Operations; Transactions; Predictable Performance; Integration; Installation and Administration; Simple Installation; Administration; How Oracle NoSQL Database Stacks Up; Useful Links

PART III Analyzing Information and Making Decisions

9 In-Database Analytics: Delivering Faster Time to Value
Introduction; Oracle's In-Database Analytics; Why Running In-Database Is So Important; Introduction to Oracle Data Mining and Statistical Analysis; Oracle's In-Database Advanced Analytics; Oracle Data Mining; Introduction to R; Text Mining; In-Database Statistical Functions; Making BI Tools Smarter; Spatial Analytics; Understanding the Spatial Data Model; Querying the Spatial Data Model; Using Spatial Analytics; Making BI Tools Smarter; Graph-Based Analytics; Graph Data Model; Querying Graph Data; Multidimensional Analytics; Making BI Tools Smarter and Faster; In-Database Analytics: Bringing It All Together; Integrating Analytics into Extract-Load-Transform Processing; Delivering Guided Exploration; Delivering Analytical Mash-ups; Conclusion

10 Analyzing Data with R
Introduction to Open Source R; CRAN, Packages, and Task Views; GUIs and IDEs; Traditional R and Database Interaction vs. Oracle R Enterprise; Oracle's Strategic R Offerings; Oracle R Enterprise; Oracle R Distribution; ROracle; Oracle R Connector for Hadoop; Oracle R Enterprise: Next-Level View; Oracle R Enterprise Installation and Configuration; Using Oracle R Enterprise; Transparency Layer; Embedded R Execution; Predictive Analytics; Oracle R Connector for Hadoop; Invoking MapReduce Jobs; Testing ORCH R Scripts Without the Hadoop Cluster; Interacting with HDFS from R; HDFS Metadata Discovery; Working with Hadoop Using the ORCH Framework; ORCH Predictive Analytics on Hadoop; ORCHhive; Oracle R Connector for Hadoop and Oracle R Enterprise Interaction; Summary

11 Endeca Information Discovery
Why Did Oracle Select Endeca?; Product Suites Overview; Endeca Information Discovery Platform; Major Functional Areas; Key Features; Endeca Information Discovery and Business Intelligence; Difference in Roles and Functions; BI Development Process vs. Information Discovery Approach; Complementary But Not Exclusive; Architecture; Oracle Endeca Server; Oracle Endeca Studio; Oracle Endeca Integration Suite; Endeca on Exalytics; Scalability and Load Balancing; Unifying Diverse Content Sets; Endeca Differentiator; Industry Use Cases; Hands-On with Endeca; Installation and Configuration; Developing an Endeca Application

12 Big Data Governance
Key Elements of Enterprise Data Governance; Business Outcome; Information Lifecycle Management; Regulatory Compliance and Risk Management; Metadata Management; Data Quality Management; Master and Reference Data Management; Data Security and Privacy Management; Business Process Alignment; How Does Big Data Impact Enterprise Data Governance?; Modeled Data vs. Raw Data; Types of Big Data; Applying Data Governance to Big Data; Leveraging Big Data Governance; Industry-Specific Use Cases; Utilities; Healthcare; Financial Services; Retail; Consumer Packaged Goods (CPG); Telecommunications; Oil and Gas; How Does Big Data Impact Data Governance Roles?; Governance Roles and Organization; An Approach to Implementing Big Data Governance

13 Developing Architecture and Roadmap for Big Data
Architecture Capabilities for Big Data; New Characteristics of Big Data; Conceptual Architecture Capabilities of Big Data; Product Capabilities and Tools; Making Big Data Architecture Decisions; Architecture Development Process for Realizing Incremental Values; Overview of Oracle Information Architecture Framework; Overview of Applied OADP for Information Architecture; Big Data Architecture Development Process; Impact on Data Management and BI Processes; Traditional BI Development Process; Big Data and Analytics Development Process; Big Data Governance; Traditional Data Governance Focus; New Focus for Governance in Big Data; Developing Skills and Talent; Data Scientist; Big Data Developer; Big Data Administrator; Big Data Best Practices; Align Big Data Initiative with Specific Business Goals; Ensure a Centralized IT Strategy for Standards and Governance; Use a Center of Excellence to Minimize Training and Risk; Correlate Big Data with Structured Data; Provide High-Performance and Scalable Analytical Sandboxes; Reshape the IT Operating Model

Index

Oracle's Big Data solutions. Roger Wullschleger.

Oracle's Big Data solutions. Roger Wullschleger. DBTA Workshop on Big Data, Cloud Data Management and NoSQL, 10 October 2012, Stade de Suisse, Berne. The following is intended to outline

Introducing Oracle Exalytics In-Memory Machine

Introducing Oracle Exalytics In-Memory Machine Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to outline our

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data Fundamentals Ed 1 NEW Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big

Big Data Big Deal for Public Sector Organizations

Big Data Big Deal for Public Sector Organizations Big Data Big Deal for Public Sector Organizations Hoàng Xuân Hiếu Director, FAB & Government Business Indochina & Myanmar 1 Copyright 2013, Oracle and/or its affiliates. All rights reserved. The following

Oracle Fusion Middleware 11g Architecture and Management

ORACLE Oracle Press Oracle Fusion Middleware 11g Architecture and Management Reza Shafii Stephen Lee Gangadhar Konduri McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

Oracle Big Data Essentials

Oracle Big Data Essentials Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 40291196 Oracle Big Data Essentials Duration: 3 Days What you will learn This Oracle Big Data Essentials training deep dives into using the

An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture

An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP ESG Data Systems Architecture Big Data & Analytics as a Service Components Unstructured Data / Sparse Data of Value

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data: The Datafication Of Everything Thoughts Devices Processes Thoughts Things Processes Run the Business Organize data to do something

Architecting for the Internet of Things & Big Data

Architecting for the Internet of Things & Big Data Architecting for the Internet of Things & Big Data Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to

An Oracle White Paper June 2013. Oracle: Big Data for the Enterprise

An Oracle White Paper June 2013. Oracle: Big Data for the Enterprise An Oracle White Paper June 2013 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure

TUT NoSQL Seminar (Oracle) Big Data

TUT NoSQL Seminar (Oracle) Big Data Timo Raitalaakso +358 40 848 0148 rafu@solita.fi TUT NoSQL Seminar (Oracle) Big Data 11.12.2012 Timo Raitalaakso MSc 2000 Work: Solita since 2001 Senior Database Specialist Oracle ACE 2012 Blog: http://rafudb.blogspot.com

Data Warehousing in the Age of Big Data

Data Warehousing in the Age of Big Data Data Warehousing in the Age of Big Data Krish Krishnan AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD * PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Morgan Kaufmann is an imprint of Elsevier

Tuning Tips & Techniques

Tuning Tips & Techniques ORACLE Oracle Press Oracle E-Business Suite 12 Tuning Tips & Techniques Richard Bingham McGraw-Hill Education New York Chicago San Francisco Athens London Madrid Mexico City Milan New Delhi Singapore

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

Oracle Big Data Strategy Simplified Infrastructure

Oracle Big Data Strategy Simplified Infrastructure Big Data Oracle Big Data Strategy Simplified Infrastructure Selim Burduroğlu Global Innovation Evangelist & Architect Education & Research Industry Business Unit Oracle Confidential Internal/Restricted/Highly

Big Data: Are You Ready? Kevin Lancaster

Big Data: Are You Ready? Kevin Lancaster Big Data: Are You Ready? Kevin Lancaster Director, Engineered Systems Oracle Europe, Middle East & Africa 1 A Data Explosion... Traditional Data Sources Billing engines Custom developed New, Non-Traditional

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom (TechCrunch). Total data: bigger than big data (451 Group). Findings: Big Data Is More Extreme Than Volume (Gartner).

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All-Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

Oracle Essentials, Fifth Edition. Rick Greenwald, Robert Stackowiak, and Jonathan Stern. O'Reilly.

FIFTH EDITION Oracle Essentials Rick Greenwald, Robert Stackowiak, and Jonathan Stern O'REILLY Beijing Cambridge Farnham Köln Sebastopol Tokyo Table of Contents Preface xiii 1. Introducing Oracle 1

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database - Engineered for Innovation Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database 11g Release 2 Shipping since September 2009 11.2.0.3 Patch Set now

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zettabytes =.08

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Oracle Big Data Appliance Releases 2.5 and 3.0 Ralf Lange Global ISV & OEM Sales Agenda Quick Overview on BDA and its Positioning Product Details and Updates Security and Encryption New Hadoop Versions

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct

Connecting Hadoop with Oracle Database

Connecting Hadoop with Oracle Database Connecting Hadoop with Oracle Database Sharon Stephen Senior Curriculum Developer Server Technologies Curriculum The following is intended to outline our general product direction.

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

High Performance Data Management Use of Standards in Commercial Product Development

High Performance Data Management Use of Standards in Commercial Product Development v2 High Performance Data Management Use of Standards in Commercial Product Development Jay Hollingsworth: Director Oil & Gas Business Unit Standards Leadership Council Forum 28 June 2012 1 The following

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise An Oracle White Paper October 2011 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5

Oracle Big Data Building A Big Data Management System

Oracle Big Data Building A Big Data Management System Oracle Big Data Building A Big Data Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Data Product Director May, 2015 Safe Harbor Statement The following

Oracle Big Data Fundamentals Ed 1

Oracle Big Data Fundamentals Ed 1 Oracle University Contact Us: 001-855-844-3881 Oracle Big Data Fundamentals Ed 1 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big Data

WebLogic Server 11g Administration Handbook

WebLogic Server 11g Administration Handbook ORACLE Oracle Press Oracle WebLogic Server 11g Administration Handbook Sam R. Alapati McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools.

Hadoop vs Apache Spark

Hadoop vs Apache Spark Innovate, Integrate, Transform Hadoop vs Apache Spark www.altencalsoftlabs.com Introduction "Any sufficiently advanced technology is indistinguishable from magic," said Arthur C. Clarke. Big data technologies

An Oracle White Paper September 2014. Oracle: Big Data for the Enterprise

An Oracle White Paper September 2014. Oracle: Big Data for the Enterprise An Oracle White Paper September 2014 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform...

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

Architecting your Business for Big Data Your Bridge to a Modern Information Architecture

Architecting your Business for Big Data Your Bridge to a Modern Information Architecture Architecting your Business for Big Data Your Bridge to a Modern Information Architecture Robert Stackowiak Vice President, Information Architecture & Big Data Oracle Safe Harbor Statement The following

ORACLE BIG DATA APPLIANCE X4-2

ORACLE BIG DATA APPLIANCE X4-2 ORACLE BIG DATA APPLIANCE X4-2 BIG DATA FOR THE ENTERPRISE OPEN, SECURE AND INTEGRATED KEY FEATURES Massively scalable, open infrastructure to store and manage big data Industry-leading security, performance

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd's 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83-84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd's 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83-84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21-22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104-105 Application architecture, big data analytics

Big Data Analytics From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph

Big Data Analytics From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph Big Data Analytics From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph David Loshin ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN

Hadoop Meets Exadata. Presented by: Kerry Osborne. DW Global Leaders Program December, 2012

Hadoop Meets Exadata. Presented by: Kerry Osborne. DW Global Leaders Program December, 2012 Hi Hadoop Meets Exadata Presented by: Kerry Osborne DW Global Leaders Program December, 2012 whoami Never Worked for Oracle Worked with Oracle DB Since 1982 (V2) Working with Exadata since early 2010

TRAINING PROGRAM ON BIGDATA/HADOOP

TRAINING PROGRAM ON BIGDATA/HADOOP Course: Training on Bigdata/Hadoop with Hands-on Course Duration / Dates / Time: 4 Days / 24th - 27th June 2015 / 9:30-17:30 Hrs Venue: Eagle Photonics Pvt Ltd First Floor, Plot No 31, Sector 19C, Vashi,

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All

Qsoft Inc www.qsoft-inc.com

Qsoft Inc www.qsoft-inc.com Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:

Oracle Big Data Management System

Oracle Big Data Management System Oracle Big Data Management System A Statement of Direction for Big Data and Data Warehousing Platforms ORACLE STATEMENT OF DIRECTION, APRIL 2015 Disclaimer The following is

Big Data

Big Data Kevin Kalmbach Principal Sales Consultant, Public Sector Engineered Systems Program Agenda What is Big Data and why it is important? What is your Big

Using OBIEE for Location-Aware Predictive Analytics

Using OBIEE for Location-Aware Predictive Analytics Using OBIEE for Location-Aware Predictive Analytics Jean Ihm, Principal Product Manager, Oracle Spatial and Graph Jayant Sharma, Director, Product Management, Oracle Spatial and Graph, MapViewer Oracle

ORACLE BIG DATA APPLIANCE X3-2

ORACLE BIG DATA APPLIANCE X3-2 ORACLE BIG DATA APPLIANCE X3-2 BIG DATA FOR THE ENTERPRISE KEY FEATURES Massively scalable infrastructure to store and manage big data Big Data Connectors delivers load rates of up to 12TB per hour between

Disrupt or be disrupted IT Driving Business Transformation

Disrupt or be disrupted IT Driving Business Transformation Disrupt or be disrupted IT Driving Business Transformation Gokula Mishra VP, Big Data & Advanced Analytics Business Analytics Product Group Copyright 2014 Oracle and/or its affiliates. All rights reserved.

A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principal Product Director, Storage Oracle

A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principal Product Director, Storage Oracle A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principal Product Director, Storage Oracle Growth in Data Diversity and Usage 1.8 Zettabytes of Data in 2011, 20x Growth by 2020

III Big Data Technologies

III Big Data Technologies III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

Big Data Analytics Scaling R to Enterprise Data useR! 2013 Albacete Spain #user2013

Big Data Analytics Scaling R to Enterprise Data useR! 2013 Albacete Spain #user2013 Big Data Analytics Scaling R to Enterprise Data useR! 2013 Albacete Spain #user2013 Luis Campos Mark Hornick 1 Big Data Solutions Lead, Oracle EMEA Director, Oracle Database Advanced Analytics @luigicampos @MarkHornick 2

Introduction to Big Data Training

Introduction to Big Data Training Introduction to Big Data Training The quickest way to be introduced to NoSQL/Big Data offerings Learn and experience Big Data Solutions including Hadoop HDFS, MapReduce, NoSQL DBs: Document Based DB

SAS and Oracle: Big Data and Cloud Partnering Innovation Targets the Third Platform

SAS and Oracle: Big Data and Cloud Partnering Innovation Targets the Third Platform SAS and Oracle: Big Data and Cloud Partnering Innovation Targets the Third Platform David Lawler, Oracle Senior Vice President, Product Management and Strategy Paul Kent, SAS Vice President, Big Data What

Master Data Management and Data Governance Second Edition

Master Data Management and Data Governance Second Edition Master Data Management and Data Governance Second Edition Alex Berson Larry Dubov McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore

Big Data in a Relational World Presented by: Kerry Osborne JPMorgan Chase December, 2012

Big Data in a Relational World Presented by: Kerry Osborne JPMorgan Chase December, 2012 Big Data in a Relational World Presented by: Kerry Osborne JPMorgan Chase December, 2012 whoami Never Worked for Oracle Worked with Oracle DB Since 1982 (V2) Working with Exadata since early 2010 Work

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies

Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics with EMC Greenplum and Hadoop. Ofir Manor, Pre Sales Technical Architect, EMC Greenplum. Big Data and the Data Warehouse Potential All

Quick Deployment Step-by-step instructions to deploy Oracle Big Data Lite Virtual Machine

Version 3.0. Please note: This appliance is for testing and educational purposes only; it is unsupported and not

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services

Agenda: What is Hadoop; Why Hadoop?; The Net Generation is here; Sizing the

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

Executive summary: Oracle business intelligence solutions are complete, open, and integrated. Key components of Oracle business intelligence

Implement Hadoop jobs to extract business value from large and varied data sets

Hadoop Development for Big Data Solutions: Hands-On. You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets; write, customize and deploy MapReduce jobs to

Building and Managing

Oracle Press. Building and Managing a Cloud Using Oracle Enterprise Manager 12c. Madhup Gulati, Adeesh Fulay, Sudip Datta. McGraw Hill Education. New York Chicago San Francisco Lisbon London Madrid

Managing Data in Motion

Data Integration Best Practice Techniques and Technologies. April Reeve. Elsevier. Amsterdam Boston Heidelberg London New York Oxford Paris San Diego San Francisco Singapore Sydney

Cost-Effective Business Intelligence with Red Hat and Open Source

Sherman Wood, Director, Business Intelligence, Jaspersoft. September 3, 2009. Agenda: Introductions; Quick survey; What is BI?: reporting,

Oracle Big Data Appliance X5-2

Oracle Big Data Appliance is a high-performance, secure platform for running diverse workloads on Hadoop and NoSQL systems. With Oracle Big Data SQL, Oracle Big Data Appliance

Successfully Deploying Alternative Storage Architectures for Hadoop Gus Horn Iyer Venkatesan NetApp

Agenda: Hadoop and storage; Alternative storage architecture for Hadoop; Use cases and customer examples

Apache Hadoop: Past, Present, and Future

The 4th China Cloud Computing Conference, May 25th, 2012. Dr. Amr Awadallah, Founder, Chief Technical Officer, aaa@cloudera.com, twitter: @awadallah. Hadoop Past

Big Data Use Cases Update

Sanat Joshi, Industry Solutions, Manufacturing Industries Business Unit. Data Explosion: Web & social networks experienced it first (infographic by Go-gulf.com). Number Of Connected

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

Introduction: New applications such as web searches, recommendation engines,

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

About me: Head of Unit Web and BI Technologies, IT Directorate of Istat. Project manager and technical coordinator of Web

Big Data Primer. Andrew Mendelsohn, SVP Database April 12, 2012

Safe Harbor Statement: Statements in this presentation relating to Oracle's future plans, expectations, beliefs, intentions

Hadoop & SAS Data Loader for Hadoop

Turning Data into Value. Sebastiaan Schaap, Frederik Vandenberghe. Agenda: What's Hadoop; SAS Data management: Traditional, In-Database, In-Memory; The Hadoop analytics lifecycle

Big Data Too Big To Ignore

Geert: Big Data Consultant and Manager; currently finishing a 3rd Big Data project; IBM & Cloudera Certified; IBM & Microsoft Big Data Partner. Agenda: Defining Big Data; Introduction

Safe Harbor Statement

Defining a Roadmap to Big Data Success. Robert Stackowiak, Oracle Vice President, Big Data. 17 November 2015. Safe Harbor Statement: The following is intended to outline our general product direction. It is

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:

IBM PureData System for Transactions with SafeNet's ProtectDB and DataSecure. Table of contents: 1. Data, Data, Everywhere... 2.

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

Syed Rasheed, Solution Manager, Red Hat Corp.; Kenny Peeples, Technical Manager, Red Hat Corp.; Kimberly Palko, Product Manager, Red Hat Corp.

HDP Hadoop From concept to deployment.

Ankur Gupta, Senior Solutions Engineer, Rackspace. 27th Jan 2015. Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

Hadoop Big Data for Processing Data and Performing Workload

Girish T B [1], Shadik Mohammed Ghouse [2], Dr. B. R. Prasad Babu [3]. [1] M Tech Student, [2] Associate Professor, [3] Professor & Head (PG), of Computer

Hadoop Ecosystem by Rahim A.

History of Hadoop: Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

The Future of Data Management with Hadoop and the Enterprise Data Hub

Amr Awadallah, Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah. Cloudera Snapshot: Founded 2008, by former employees of Employees

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

Cisco: A Beginner's Guide, Fifth Edition. Anthony T. Velte, Toby J. Velte. McGraw Hill Education

New York Chicago San Francisco Athens London Madrid Mexico City Milan New Delhi Singapore Sydney Toronto. Contents

Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science

A Seminar Report on Hadoop. Submitted to: www.studymafia.org. Submitted by: www.studymafia.org

Data Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com

Overview: Challenges in a Fast Growing & Dynamic Environment; Data Flow Architecture,

Pro Apache Hadoop. Second Edition. Sameer Wadkar. Madhu Siddalingaiah

Contents: About the Authors; About the Technical Reviewer; Acknowledgments; Introduction; Chapter 1: Motivation for Big

Big Data Big Data/Data Analytics & Software Development

Danairat T., danairat@gmail.com, 081-559-1446. Agenda: Big Data Overview; Business Cases and Benefits; Hadoop Technology Architecture; Big Data Development

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload

Drive operational efficiency and lower data transformation costs with a Reference Architecture for an end-to-end optimization and offload

Oracle R zum Anfassen: Die Themen

09:30 Welcome. 09:45 Hands-on R: introduction. 10:15 Mini-course in the R language: language constructs, help facilities, GUIs for writing scripts. Creating attractive graphics quickly and easily

Copyright 2012 EMC Corporation. All rights reserved.

Greenplum UAP: Enabling Big Data Analytics. Brendon Moran, Data Scientist. Agenda: Background on Greenplum and Big Data Analytics; Greenplum UAP; Greenplum: Not Just Infrastructure; Pivotal Labs; Customers

Extend your analytic capabilities with SAP Predictive Analysis

September 9-11, 2013, Anaheim, California. Charles Gadalla. Learning Points: Advanced analytics strategy at SAP; Simplifying predictive analytics

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

General agenda: Driving Factors behind Big Data; NoSQL Database; 2014 Database Landscape; Hadoop Architecture; Map/Reduce; Hadoop Eco-system; Hadoop

Oracle Big Data Spatial & Graph Social Network Analysis - Case Study

Mark Rittman, CTO, Rittman Mead. OTN EMEA Tour, May 2016. info@rittmanmead.com, www.rittmanmead.com, @rittmanmead. About the Speaker: Mark

Oracle Business Analytics Overview

Markus Päivinen, Business Analytics Country Leader, Finland. May 2014. Presentation content: What are the requirements for modern BI; Trend in Business Analytics; Big Data

Networking: A Beginner's Guide, Sixth Edition. Bruce Hallberg

McGraw Hill Education. New York Chicago San Francisco Athens London Madrid Mexico City Milan New Delhi Singapore Sydney Toronto. Contents: Acknowledgments