Data Warehouse design

1 Data Warehouse design Design of Enterprise Systems University of Pavia 10/12/2013 2h for the first; 2h for hadoop - 1-

2 Table of Contents Big Data Overview Big Data DW & BI Big Data Market Hadoop & Mahout - 2-

3 Data Warehouse design BIG DATA OVERVIEW - 3-

4 Big Data Overview: Table of Contents Big Data Overview Data Growth Definition Big Data v.s. Relational Data Its Value Big Data Benefit Big Data Usage Challenges - 4-

5 Big Data Overview: Data Growth [Figure: two "Data Storage Growth" charts, storage in exabytes by year] Storage capacity increases 23% on average annually. Exponential growth during the decade starting from 2010. End of the ability to store all the available information. - 5-
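The 23% average annual increase quoted on the Data Growth slide is a compound rate. The following is a minimal sketch of that arithmetic; the function name and the starting value of 1.0 are illustrative assumptions, not figures from the slide.

```python
def projected_capacity(initial: float, annual_growth: float, years: int) -> float:
    """Capacity after `years` of compound growth (annual_growth=0.23 means 23%)."""
    return initial * (1.0 + annual_growth) ** years


# Growth factor over one decade at 23% per year.
# The starting value 1.0 is a placeholder, so the result is a multiplier.
decade_factor = projected_capacity(1.0, 0.23, 10)
print(f"Capacity multiplies by about {decade_factor:.2f}x in 10 years")
```

At this rate capacity grows a little under eightfold per decade, which is the gap the slide points to when data volume grows faster than storage.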

6 Big Data Overview: Definition Gartner Definition(2012): "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." - 6-

7 Big Data Overview: Big Data vs. Relational Data
Data processing
- Relation-based data: single-computer platform that scales with better CPUs; centralized processing.
- Big Data: cluster platforms that scale to thousands of nodes; distributed processing.
Data management
- Relation-based data: relational database (SQL); centralized storage.
- Big Data: non-relational databases that manage varied data types and formats (NoSQL); distributed storage.
Analytics
- Relation-based data: batched, descriptive, centralized.
- Big Data: real-time, predictive and prescriptive; distributed analytics.
- 7-

8 Big Data Overview: Its Value 1/3 Several classes of company head the revenue chart ($11.59 billion): broad-portfolio tech giants (IBM, HP, Oracle, EMC), leading software houses (Teradata, SAP, Microsoft), and professional services companies (PwC, Accenture). Source: Wikibon, Big Data Vendor Revenue and Market Forecast

9 Big Data Overview: Its Value 2/3 Pure play: vendors who derive 100 percent of their revenue from this market. Source: Wikibon, Big Data Vendor Revenue and Market Forecast

10 Big Data Overview: Its Value 3/3 IDC: Big data will become a $17 billion business by 2015 ($23.8 billion by 2016). Big data storage will account for 6.8% of the entire worldwide storage market by 2015. Source: Worldwide Big Data Technologies and Services: Forecast (IDC, 2012)

11 Big Data Overview: Big Data Benefits Business benefits received by implementing an effective Big Data methodology. The survey is based on 1153 responses from 325 respondents - 11-

12 Big Data Overview: Big Data Usage 1/2
E-Commerce and Market Intelligence
- Recommender systems
- Social media monitoring and analysis
- Crowd-sourcing systems
- Social and virtual games
E-Government and Politics 2.0
- Ubiquitous government services
- Equal access and public services
- Citizen engagement
Science & Technology
- S&T innovation
- Hypothesis testing
- Knowledge discovery
Smart Health and Wellbeing
- Human and plant genomics
- Healthcare decision support
- Patient community analysis
Security and Public Safety
- Crime analysis
- Computational criminology
- Terrorism informatics
- Open-source intelligence
- Cyber security
- 12-

13 Big Data Overview: Big Data Usage 2/2 Survey of European companies from Steria's Business Intelligence Maturity Audit (bima) - 13-

14 Big Data Overview: Challenges 1/2 Main challenges between Big Data and companies. The survey is based on 1153 responses from 325 respondents - 14-

15 Big Data Overview: Challenges 2/2 A survey of European companies from Steria's Business Intelligence Maturity Audit (bima). Technical: 38% have data quality problems; 38% lack data governance and have no master data management system. Organizational: 72% have no BI strategy; 70% have no BI governance; only 7% rate big data as very relevant. Source: of-european-companies-rate-big-data-as-very-relevant-to-their-business/ - 15-

16 Data Warehouse design BIG DATA, DW & BI - 16-

17 Big Data, DW & BI: Table of Contents Big Data, DW & BI Evolution Techniques Cost Best Practices - 17-

18 BI Evolution
BI and Analytics: evolution and characteristics
BI&A 1.0
- Key characteristics: DBMS-based, structured content; RDBMS & data warehousing; ETL & OLAP; dashboards & scorecards; data mining & statistical analysis.
- Gartner BI Platforms Core Capabilities: ad hoc query & search-based BI; reporting, dashboards & scorecards; OLAP; interactive visualization; predictive modeling & data mining.
- Gartner Hype Cycle: column-based DBMS; in-memory DBMS; real-time decision; data mining workbenches.
BI&A 2.0
- Key characteristics: Web-based, unstructured content; information retrieval and extraction; opinion mining; question answering; Web analytics and Web intelligence; social media analytics; social network analysis; spatial-temporal analysis.
- Gartner Hype Cycle: information semantic services; natural language question answering; content & text analytics.
BI&A 3.0
- Key characteristics: mobile and sensor-based content; location-aware analysis; person-centered analysis; context-relevant analysis; mobile visualization & HCI.
- Gartner Hype Cycle: mobile BI.
- 18-

19 Big Data Overview: Techniques 1/2
In 2011 the McKinsey Global Institute provided a list of the top 10 common techniques applicable across a range of industries, particularly in response to the need to analyze new amounts of data and their combination.
List of the top 10 techniques which require Big Data (1/2)
- A/B Testing: a technique in which a control group is compared with a variety of test groups in order to determine what treatments will improve a given objective. An example application is determining what copy text, layouts, images, or colors will improve conversion rates on an e-commerce Web site. Big Data enables huge numbers of tests to be executed and analyzed.
- Cluster Analysis: a statistical method for grouping the objects of a huge data set, in particular to identify common behavior.
- Classification: a set of techniques to identify the categories to which new data points belong, based on a training set containing data points that have already been categorized.
- Data Mining: a set of techniques and technologies for extracting patterns from large datasets by combining methods from statistics and algorithms. These techniques include association rule learning, cluster analysis, classification, and regression.
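The A/B testing technique described above compares a control group with a test group on an objective such as conversion rate. One common way to judge the difference is a two-proportion z-test; the sketch below assumes that choice (the slide does not name a test), and the function name and sample counts are hypothetical.

```python
import math


def ab_test_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-proportion z-test for variant B against control A.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    Assumes the pooled rate is strictly between 0 and 1.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: 100 of 1000 visitors convert on layout A, 150 of 1000 on layout B.
rate_a, rate_b, z, p = ab_test_z(100, 1000, 150, 1000)
print(f"A={rate_a:.1%}  B={rate_b:.1%}  z={z:.2f}  p={p:.4f}")
```

With many variants running at once, as the slide suggests Big Data makes possible, the same comparison would be repeated per variant, and a correction for multiple testing would normally be needed.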

20 Big Data Overview: Techniques 2/2
In 2011 the McKinsey Global Institute provided a list of the top 10 common techniques applicable across a range of industries, particularly in response to the need to analyze new amounts of data and their combination.
List of the top 10 techniques which require Big Data (2/2)
- Network analysis: a set of techniques used to characterize relationships among discrete nodes in a graph or a network. In social network analysis, connections between individuals in a community or organization are analyzed.
- Predictive modeling: a set of techniques in which a mathematical model is created or chosen to best predict the probability of an outcome.
- Sentiment analysis: application of natural language processing and other analytic techniques to identify and extract subjective information from source text material.
- Statistics: the science of the collection, organization, and interpretation of data, including the design of surveys and experiments. Statistical techniques are often used to understand the relationships between variables.
- Visualization: techniques used to create images, diagrams or animations, usually integrated in more complex dashboards.

21 Big Data: Cost 1/2 ESG (Enterprise Strategy Group) provides an analysis on the costs of Big Data, in particular a comparison between a build and buy solution. Item Cost each; enterprise class with dual Servers $400,000 power supplies, 36TB of serial attached SCSI (SAS) storage, gigabytes memory, 1 rack Server support of server cost Switches $15,000 Distribution/systems management software $5k for InfiniBand; in older network switches will run at least 3x the costs of InfiniBand $90,000 Cloudera: 18 $5k each Integration $100,000 Licenses and dedicated hardware Information Management Tools Node Configuration and Implementation $20, $100/hour human cost $16,000 Build Project Costs $733,000 8 hours/node, 20 nodes = 160 hours, $100/hour Those project items where a "buy" option exists Build Versus Buy Elements (Using Build Pricing) - 21-

22 Big Data: Cost 2/2
ESG (Enterprise Strategy Group) provides an analysis of the costs of Big Data, in particular a comparison between a build and a buy solution.
Build Versus Buy Elements (Using Buy Pricing)
- Build Total: $733,000
- Buy (Oracle Big Data Appliance): $450,000. Note: cost of Oracle Big Data Appliance for the same infrastructure and tasks (list).
- Buy (Oracle Big Data Appliance) Savings: $283,000. Note: not lifecycle costs, just for the initial project.
- ESG Estimated Savings: ~39%. Note: Oracle Big Data Appliance lowers costs versus do-it-yourself.
- 22-
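The savings figures in the ESG comparison follow directly from the two totals. A small sketch of that arithmetic, using only the $733,000 and $450,000 figures from the slide (the function name is an illustrative assumption):

```python
def build_vs_buy(build_total: int, buy_total: int):
    """Return (absolute savings, savings as a percentage of the build cost)."""
    savings = build_total - buy_total
    return savings, 100 * savings / build_total


savings, pct = build_vs_buy(733_000, 450_000)
print(f"Savings: ${savings:,} ({pct:.1f}% of build cost)")
# prints "Savings: $283,000 (38.6% of build cost)", which the slide rounds to ~39%
```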

23 Big Data: Best Practices 1/3
First of all, we need to consider when it is suitable to use Big Data technologies:
- Analyzing a huge quantity of data, not only structured but also semi-structured and unstructured, from a wide variety of sources;
- All of the gathered data must be analyzed rather than a sample, or sampling is not as effective as analysis of the full data set;
- Iterative and exploratory analysis, when business measures on the data are not determined a priori;
- Solving information and business challenges that are not properly addressed by a traditional relational database approach.

24 Big Data: Best Practices 2/3
The best practices described here concern management as well as organizational and technological aspects.
- Muting the HiPPOs: the highest-paid person's opinions are those on which the most important decisions about how to retrieve and analyze data depend. Today these people rely too much on intuition and experience rather than on the evidence in the data, so this behavior needs to change;
- Start with initiatives that lead to customer-centric outcomes. It is very important for customer-oriented organizations to begin with customer analytics that enable better services through a deep understanding of customers' needs and future behaviors;
- Develop an enterprise schema that includes the vision, the strategies and the requirements for Big Data and helps align business users' needs with the implementation roadmap of information technologies;
- To achieve near-term results it is crucial to adopt a pragmatic approach, starting from the most logical and cost-effective place to look for insight, which is within the enterprise;
- 24-

25 Big Data: Best Practices 3/3
- The effectiveness of Big Data analytics strictly depends on analytical skills and analytics tools, so enterprises should invest in acquiring both;
- The Big Data strategy and the business analytics should encompass an evaluation of the organization's decision-making processes as well as of the groups and types of decision makers;
- Try to uncover new metrics, key performance indicators and analytics techniques to look at new and existing data in a different way in order to find new opportunities. This could require setting up a separate Big Data team whose purpose is to experiment and innovate;
- The final goal of a Big Data project is not to collect as much data as possible but to support concrete business needs and provide new, reliable information to decision makers;
- No single technology can meet all Big Data requirements. Different workloads, data types, and user types should each be served by the most suitable technology. For example, Hadoop could be the best choice for large-scale Web log analysis but is not suitable for real-time streaming at all. Multiple Big Data technologies must coexist and address the use cases for which they are optimized.

26 Data Warehouse design BIG DATA MARKET - 26-

27 Big Data Market Definition IDC(2012) defines the big data market as an aggregation of storage, server, networking, software, and services market segments, each with several subsegments. Big Data Technology Stack - 27-

28 Big Data Market Segments
Infrastructure
- External storage systems
- Servers (including internal storage, memory, network cards) and supporting system software, as well as spending for self-built servers by large cloud service providers
- Datacenter networking infrastructure used in support of Big Data server and storage infrastructure
Services
- Business consulting, business process outsourcing, IT project-based services, IT outsourcing, IT support, and training services related to Big Data implementations
Software
- Data organization and management software, including parallel and distributed file systems and others
- Analytics and discovery software, including search engines used for Big Data applications, data mining, text mining, rich media analysis, data visualization, and others
- 28-

29 Big Data Market Analysis Marketsandmarkets Big Data Market By Types (Hardware; Software; Services; BDaaS - HaaS; Analytics; Visualization as Service); By Software (Hadoop, Big Data Analytics and Databases, System Software (IMDB, IMC): Worldwide Forecasts & Analysis ( ) - 29-

30 Data Warehouse design HADOOP & MAHOUT - 30-

31 Hadoop & Mahout: Table of Contents
Hadoop
- Overview
- HDFS: Structure, File Write, File Read
- Map Reduce: Structure, Job Submission, Job Execution
- Hadoop Ecosystem: HBase, Pig, Hive
Mahout
- Overview
- Algorithms
- 31-

32 Hadoop: Overview [Diagram: one Master Node and Slave Nodes 1, ..., K, ..., N on top of HDFS; each slave node provides storage and computing] The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Open source; scalable; distributed; Map-Reduce. The Master Node controls everything! - 32-

34 Hadoop: HDFS Structure [Diagram: a Name Node holding metadata and Data Nodes 1, ..., K, ..., N holding the file] The Name Node controls almost everything about storage. Large files are partitioned into chunks and stored across multiple nodes. File chunks are replicated to mitigate node failures. - 34-
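The two ideas on the HDFS Structure slide, partitioning a large file into chunks and replicating each chunk on several nodes, can be sketched in a few lines. This is a toy model, not HDFS code: the 64 MB chunk size, the replication factor of 3, the node names and the round-robin placement are all assumptions for illustration (real HDFS placement also takes racks into account).

```python
def split_into_chunks(file_size: int, chunk_size: int):
    """Return (offset, length) pairs covering a file of `file_size` bytes."""
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def place_replicas(num_chunks: int, data_nodes: list, replication: int = 3):
    """Assign each chunk to `replication` distinct data nodes, round-robin.

    The returned dict plays the role of the name node's metadata:
    chunk index -> list of data nodes holding a copy.
    """
    if replication > len(data_nodes):
        raise ValueError("need at least as many data nodes as replicas")
    return {
        chunk: [data_nodes[(chunk + r) % len(data_nodes)] for r in range(replication)]
        for chunk in range(num_chunks)
    }


MB = 1024 * 1024
chunks = split_into_chunks(200 * MB, 64 * MB)          # assumed 64 MB chunks
metadata = place_replicas(len(chunks), ["dn1", "dn2", "dn3", "dn4"])
for index, nodes in metadata.items():
    print(f"chunk {index} ({chunks[index][1] // MB} MB) -> {nodes}")
```

Because every chunk lives on three different nodes in this model, losing any single data node still leaves two copies of each chunk it held.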

35 Hadoop: HDFS write Operation series when writing a file - 35-

36 Hadoop: HDFS Read Operation series when reading a file - 36-

38 Hadoop: Map-Reduce Structure [Diagram: a Job Tracker coordinating Task Trackers 1, ..., K, ..., N, each running Mappers and Reducers] The Job Tracker controls almost everything about computing. Key concept of Map-Reduce: computation goes with data. - 38-
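The mapper and reducer roles in the structure above can be illustrated with the classic word count, written here as a single-process sketch. It only mimics the map, shuffle and reduce phases in plain Python; it does not use the Hadoop API, and the function names are illustrative assumptions.

```python
from collections import defaultdict


def mapper(line: str):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key: str, values: list):
    """Reduce phase: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce over an iterable of input lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())


print(run_job(["big data", "big cluster"]))
# prints {'big': 2, 'data': 1, 'cluster': 1}
```

On a real cluster the mappers run on the nodes that already hold the input chunks, which is what "computation goes with data" refers to; only the grouped intermediate pairs travel over the network to the reducers.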

39 Hadoop: Job submission The initialization takes some time Job execution is monitored by Job tracker through heartbeat - 39-

40 Hadoop: Map-Reduce Execution Bandwidth required in the copy process - 40-

42 Hadoop Ecosystem: HBase HDFS: structured/semi-structured/unstructured data; write once, read many. HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. As a column-based database, it supports insert, delete and update. - 42-

43 Hadoop Ecosystem: HBase Storage model 1/3 HBase is a column-oriented database - 43-

44 Hadoop Ecosystem: HBase Storage model 2/3 HBase storage system - 44-

45 Hadoop Ecosystem: HBase Storage model 3/3 HBase storage system - 45-
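A versioned, column-oriented store of the kind described for HBase is often pictured as a nested map: row key, then column, then timestamp, then value. The sketch below is a toy in-memory model under that assumption; the class and method names are hypothetical and are not the HBase client API, and deletion is simplified (HBase itself marks deleted cells rather than removing them immediately).

```python
from collections import defaultdict


class TinyColumnStore:
    """Toy model: row key -> "family:qualifier" -> timestamp -> value."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row: str, column: str, value, timestamp: int):
        """Insert or update: a new timestamp adds a new version of the cell."""
        self._rows[row][column][timestamp] = value

    def get(self, row: str, column: str, timestamp: int = None):
        """Return the newest version, or the newest one at or before `timestamp`."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def delete(self, row: str, column: str):
        """Remove all versions of one cell (simplified)."""
        self._rows.get(row, {}).pop(column, None)


store = TinyColumnStore()
store.put("user42", "info:city", "Pavia", timestamp=1)
store.put("user42", "info:city", "Milan", timestamp=2)
print(store.get("user42", "info:city"))               # newest version
print(store.get("user42", "info:city", timestamp=1))  # older version
```

Rows only store the columns they actually have, which is why this layout suits sparse and semi-structured data.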

46 Hadoop Ecosystem: Pig Plain Hadoop: analysis requires a lot of Java code; no scripting. Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs. Pig generates and compiles Map/Reduce program(s) on the fly.

47 Hadoop Ecosystem: Pig Sample Scripts
-- Load raw records with the custom loader and its schema file.
RawInput = LOAD '$INPUT' USING com.contextweb.pig.cwheaderloader('$resources/schema/wide.xml');
-- Keep only the fields needed for the report.
input = FOREACH RawInput GENERATE ContextCategoryId AS Category, DefLevelId, TagId, URL, Impressions;
-- Keep rows whose DefLevelId is 8 or 12.
deffilter = FILTER input BY (DefLevelId == 8) OR (DefLevelId == 12);
GroupedInput = GROUP deffilter BY (Category, TagId, URL);
-- After GROUP, the bag in each group is named after the grouped relation (deffilter).
result = FOREACH GroupedInput GENERATE group, SUM(deffilter.Impressions) AS Impressions;
STORE result INTO '$OUTPUT' USING com.contextweb.pig.cwheaderstore();
- 47-

48 Hadoop Ecosystem: Hive Hive is a data warehouse infrastructure built on top of Hadoop. It supports analysis of large datasets stored in Hadoop-compatible file systems such as HDFS and the Amazon S3 file system. It provides an SQL-like query language called HiveQL. It provides indexes to accelerate queries. - 48-

49 Hadoop Ecosystem: HiveQL DML: SELECT. DDL: SHOW TABLES, CREATE TABLE, ALTER TABLE, DROP TABLE. - 49-

50 Mahout Hadoop Mahout Overview HDFS Map Reduce Hadoop Ecosystem Overview Algorithms Structure Structure HBase File Write Job Submission Pig File Read Job Execution Hive - 50-

51 Mahout: Overview A scalable machine learning library built on Hadoop, written in Java. Driven by Ng et al.'s paper "Map-Reduce for Machine Learning on Multicore". - 51-

52 Mahout: Algorithms
- Classification: Logistic Regression, Bayesian, SVM, NN, Hidden Markov Models
- Clustering: K-means, Mean Shift Clustering, Spectral Clustering, Top Down Clustering
- Pattern Mining: Parallel FP Growth Algorithm
- Regression: Locally Weighted Linear Regression
- Dimension reduction: SVD, PCA, GDA
- Collaborative filtering: Non-distributed recommenders, Distributed Item-Based Collaborative Filtering
- 52-
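K-means, listed under Clustering above, fits the map/reduce pattern naturally: the map step assigns each point to its nearest centroid, and the reduce step averages the points assigned to each centroid. The following is a single-process sketch of that idea, not Mahout's implementation; the function names, sample points and starting centroids are illustrative assumptions.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def kmeans_iteration(points, centroids):
    """One iteration: map points to centroids, then reduce each group to its mean."""
    groups = defaultdict(list)
    for point in points:                      # map step
        groups[nearest(point, centroids)].append(point)
    updated = list(centroids)
    for index, members in groups.items():     # reduce step
        updated[index] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return updated


def kmeans(points, centroids, max_iterations=10):
    """Repeat iterations until the centroids stop changing."""
    for _ in range(max_iterations):
        updated = kmeans_iteration(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
print(kmeans(points, [(0, 0), (10, 10)]))
# prints [(0.0, 1.0), (10.0, 11.0)]
```

A centroid that attracts no points keeps its previous position in this sketch; production implementations handle that case in various ways.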

53 Data Warehouse design EXERCISE - 53-

54 Mobility Analyzer: A Show Case Site Data Flow Modules HANA DB Local ExportTweetsInfo CSV Files CSVConverter Hadoop Sequence Files Mahout Run.sh Cluster Info. Clusterdump Cluster Info. ImportClusterInfo Local HANA DB - 54-

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here> s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MSTR 10 Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

More information

Achieving Business Value through Big Data Analytics Philip Russom

Achieving Business Value through Big Data Analytics Philip Russom Achieving Business Value through Big Data Analytics Philip Russom TDWI Research Director for Data Management October 3, 2012 Sponsor 2 Speakers Philip Russom Research Director, Data Management, TDWI Brian

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Big Data Analytics DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Tom Haughey InfoModel, LLC 868 Woodfield Road Franklin Lakes, NJ 07417 201 755 3350 tom.haughey@infomodelusa.com

More information

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem: Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

How To Scale Out Of A Nosql Database

How To Scale Out Of A Nosql Database Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI

More information

How to Enhance Traditional BI Architecture to Leverage Big Data

How to Enhance Traditional BI Architecture to Leverage Big Data B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

Tap into Hadoop and Other No SQL Sources
