Cost-Effective Business Intelligence with Red Hat and Open Source

Similar documents
Main Memory Data Warehouses

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

Oracle Big Data SQL Technical Update

HDP Hadoop From concept to deployment.

Real Life Performance of In-Memory Database Systems for BI

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Big Data Technologies Compared June 2014

Introducing Oracle Exalytics In-Memory Machine

HDP Enabling the Modern Data Architecture

Hadoop: Distributed Data Processing. Amr Awadallah Founder/CTO, Cloudera, Inc. ACM Data Mining SIG Thursday, January 25 th, 2010

Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

Microsoft Analytics Platform System. Solution Brief

Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7

hmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau

Big Data and Its Impact on the Data Warehousing Architecture

How To Scale Out Of A Nosql Database

BIG DATA APPLIANCES. July 23, TDWI. R Sathyanarayana. Enterprise Information Management & Analytics Practice EMC Consulting

Netezza and Business Analytics Synergy

Data processing goes big

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

Hadoop and Map-Reduce. Swati Gore

Please give me your feedback

Parallel Data Warehouse

Big Data Explained. An introduction to Big Data Science.

Oracle Exalytics Briefing

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

Testing Big data is one of the biggest

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

Hadoop and its Usage at Facebook. Dhruba Borthakur June 22 rd, 2009

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

CSE-E5430 Scalable Cloud Computing Lecture 2

WHAT S NEW IN SAS 9.4

Hadoop & SAS Data Loader for Hadoop

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Big Data Are You Ready? Thomas Kyte

Implement Hadoop jobs to extract business value from large and varied data sets

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

Bringing Big Data into the Enterprise

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

Introduction to Cloud Computing

Big data blue print for cloud architecture

Big Data Too Big To Ignore

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

Building Your Big Data Team

Cisco IT Hadoop Journey

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

Hadoop & its Usage at Facebook

Tiber Solutions. Understanding the Current & Future Landscape of BI and Data Storage. Jim Hadley

Il mondo dei DB Cambia : Tecnologie e opportunita`

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

Mike Boyarski Jaspersoft Product Marketing Business Intelligence in the Cloud

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.

BIG DATA What it is and how to use?

Testing 3Vs (Volume, Variety and Velocity) of Big Data

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

How Companies are! Using Spark

<Insert Picture Here> Big Data

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

The Future of Data Management

Big Data Architectures. Tom Cahill, Vice President Worldwide Channels, Jaspersoft

SQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse

Hadoop & its Usage at Facebook

BIG DATA SOLUTION DATA SHEET

Analytics on Spark &

Dell In-Memory Appliance for Cloudera Enterprise

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Tableau Visual Intelligence Platform Rapid Fire Analytics for Everyone Everywhere

Safe Harbor Statement

Trafodion Operational SQL-on-Hadoop

Hadoop Distributed File System. Dhruba Borthakur Apache Hadoop Project Management Committee

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

Hadoop: Embracing future hardware

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

In-memory computing with SAP HANA

2009 Oracle Corporation 1

QlikView Business Discovery Platform. Algol Consulting Srl

Exadata Database Machine

TRANSFORM BIG DATA INTO ACTIONABLE INFORMATION

Has been into training Big Data Hadoop and MongoDB from more than a year now

Data Warehouse design

Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012

Cisco Data Preparation

Big Data Big Data/Data Analytics & Software Development

Modern Data Warehousing

How To Create A Data Visualization With Apache Spark And Zeppelin

Zynga Analytics Leveraging Big Data to Make Games More Fun and Social

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April

Open source Google-style large scale data analysis with Hadoop

Transcription:

Cost-Effective Business Intelligence with Red Hat and Open Source Sherman Wood Director, Business Intelligence, Jaspersoft September 3, 2009 1

Agenda Introductions Quick survey What is BI?: reporting, analysis, data integration BI is important for IT! BI Architectures with Red Hat: use cases, demonstrations, state of the market, open source options Database: Marts and Warehouses Data Integration and Map Reduce Visualization: analytics, end user interactivity, in-memory, dashboards Embedding BI in applications Q&A 2

Introductions and a Quick Survey Me, Jaspersoft Jaspersoft and Red Hat Platform provider (Ops, Sys admin, Uses RHEL) vs Software Vendors (Developer on RHEL) vs End Users Using enterprise software suites: SAP, Oracle Please ask questions as we go Or wait for the Q&A at the end Anything you specifically want covered? 3

Priority of Business Intelligence 4

All Kinds of Reporting and Analysis 5 5

Demonstration 6

Traditional Business Intelligence Reporting Analytics BI Data Sources: Anything! 7 Data Data Integration, Integration Enhancement Clean Data (ODS) Database Data Mart/ Warehouse

Embedded Business Intelligence Add BI to an existing application Aiming for seamless integration 8 Single sign on Often a single data source Typical for application developers and software vendors Examples: Red Hat Satellite, Virgin Money

Stand Alone BI Within an organization, across groups/departments The analytic application Production reporting Often multiple data sources People come directly to the BI application Data warehouses and marts: star schema Typical for consultants/system integrators, internal IT groups Examples: 9

Red Hat and Open Source in BI 10 Open source alternatives to the commercial vendors at all levels in the BI stack Linux is the OS of choice in non-microsoft environments and RHEL is the leading commercial distribution Often see complete open source solutions: operating system, database, data integration, reporting and analysis

Open Source, Free, Low Cost BI Databases Optimized for data load and query 11

An Data Warehouse Alternative: JBoss Teiid 12

Data Integration 13 Commercial Informatica IBM - Ascential (DataStage) Open Source Custom Scripting Talend/Jaspersoft Kettle And Hadoop!

Talend/Jaspersoft Data Integration 14

Map Reduce and Distributed File Systems Map Reduce algorithm: dealing with massive data Google, Yahoo: how do I index the Web? Batch processing: data analyst workbench to production Different mind set to traditional data integration addressing a different class of problem Originally the big web companies, now moving to broader use and embedding in databases: Asterdata, Greenplum, HadoopDB Hadoop provides open source frameworks and eco-system for Map Reduce Hadoop file system (HDFS): extreme clusters with redundancy, failover 15 Kosmos File System too Base shell scripting and Java for transformations Higher level interfaces: Hive (SQL), Pig, Cascading Hbase: BigTable implementation

Map Reduce Examples Yahoo Hadoop, April 09: Sort 1 TB in 62 secs Approx. 3800 nodes (in such a large cluster, some nodes are always down) 2 quad core Xeons @ 2.0ghz, 8GB RAM per node 4 SATA disks per a node Red Hat Enterprise Linux Server Release 5.1 (kernel 2.6.18) Sun Java JDK 1.6.0 13-b03 (32 and 64 bit) Google does same in 63 secs with 1000 nodes, November 08 16 1 gigabit ethernet on each node, 40 nodes per a rack, 8 gigabit ethernet uplinks from each rack to the core Google File System, not open source

BI and Map Reduce BI tools can access Map Reduce in several ways Push Map Reduce results into a traditional relational database 17 No different from any traditional BI/data warehouse process Avoids Map Reduce performance issues for users Limits flexibility of analysis Access Map Reduce directly Access raw results on the file system or via APIs Not useful for interactivity Data analysts have more flexibility

BI Market and Vendors 18

What drives cost of BI? Complexity of BI and DW technologies Complicated BI tools drive high end user support costs Most DWs require large complex/specialized hardware and constant tuning and configuration Complex ETL processes prevent IT from keeping up with increasing data access needs Cost of software, hardware and per-user licenses Difficulty supporting new applications or decisions with limited resources 19 RESULT: Only 15% of today s enterprise users have access to Business Intelligence tools

Costs Reporting and Analysis 20 Commercial vendors tend to: Charge per seat Limit the number of users per server or CPU: 20 users in one case High up front license cost Pay for completely new licenses to add functionality 22% annual maintenance fee! Lock in, particularly now with all the industry consolidation Don t play well with partners, which are often needed in BI

Reporting and Analysis License Cost Comparison Customer Comparison: Jaspersoft vs. Business Objects 21 3-year TCO comparison from BIScorecard

Costs Data Integration Example ETL project: daily load of a data warehouse with data contained in three different RDBMS and a CRM application, and data provided by Web Services Data integration sucks $s: 50-90% of budget and time for implementation 22 Can you use standard interfaces for applications and data?

Costs Database 30TB warehouse for a telco. List prices. Vertica performed a magnitude faster than the Oracle configuration. 23

Bottom Line Business Intelligence remains the highest priority for technology groups Vendor consolidation is making costs rise for commercial BI 24 The data warehouse market has many new entrants and approaches for dealing with large data volumes, like: JBoss Teiid Compressed column data stores Cloud and virtualized configurations Map Reduce, massively distributed file systems Open source alternatives Open source business intelligence, data integration and databases running on Red Hat provide excellent TCO compared with commercial alternatives Open source business intelligence and data Integration rely heavily on Java,

25

26

Agenda Business Intelligence (reporting and data analysis) has been at the top of IT's to do list for over 10 years, but has remained expensive to purchase and deliver. This session will show configurations of Red Hat, Jaspersoft's commercial open source BI suite and other open source technologies to provide business intelligence services for organizations and individual applications at a fraction of the cost of proprietary options 27

Reporting vs Analysis 29 Reporting Tends to be static, WYSIWYG/print focused Same view over time: Production and scheduled reporting Analysis Statistical and aggregate operations Data exploration: interactivity Data mining: predictive analysis, finding patterns Convert analysis insights into actions with reports Process focused BI Business/Corporate Performance Management: budget vs actuals Business Activity Monitoring: real time analysis of processes