Changing the face of Business Intelligence & Information Management


GPO Box 589 Melbourne VIC 3001 Australia ABN

White Paper: Big Data
Changing the face of Business Intelligence & Information Management

Everyone is talking about Big Data; it's the tech buzzword du jour. There is certainly merit in the discussion: Big Data will change the information landscape of the future and, for those who embrace it, will provide strong competitive advantage and insight like never before. The successful Business Intelligence projects of tomorrow will need to consider Big Data as part of their data landscape for the value that it delivers.

C3 Business Solutions is an award-winning Business Intelligence and Information Management company. We are strictly vendor independent and provide impartial advice and innovative solutions to our clients.

Prepared by Sharon Hobart, Matthew Connock and Robert Postill
C3 Business Solutions, July 2011
Melbourne, Sydney, Canberra, Brisbane, Perth

Contents

What is Big Data?
BI & Big Data
Who Will Benefit?
The Tech
The Advantages
What It Means
Steps to Success
Beware...
Examples
In Summary

What is Big Data?

Big Data is exactly what it says it is: extremely large data sets that need to be distributed across different servers due to their size. Every day, organisations produce enormous amounts of structured and unstructured information. Structured data includes trading floor financial information, line of business system data and relational database contents, while unstructured data refers to less tangible sources such as web logs, RFID tags, sensor networks, voice, video and images. Today, much of this is either backed up to tape, never to be used again, or thrown away. Big Data allows you to keep these huge data sets usable for analysis and, ultimately, strategic direction.

Data is growing...

Walmart is processing one million customer transactions per hour, which equates to around 2,500 Terabytes of data each year.

Most websites generate Gigabytes of data every day, much of which is thrown away.

Scientists need to manage enormous data sets in a variety of research areas such as genomics, climate science and meteorology, life sciences, and high energy physics. CERN's Large Hadron Collider, for example, is churning out up to 40 Terabytes per second and is thus amassing data in the tens of Petabytes range.
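The retail figures quoted above can be sanity-checked with a little arithmetic. The sketch below assumes round-the-clock trading over a 365-day year and decimal units (1 Terabyte = 1000^4 bytes); both are assumptions made for illustration, not figures from the paper. It derives the average data volume per transaction that the two quoted numbers would imply.

```python
# Back-of-envelope check of the quoted retail figures.
# Assumptions (not from the paper): 24 hours a day, 365 days a year,
# and decimal units where 1 Terabyte = 1000**4 bytes.

TRANSACTIONS_PER_HOUR = 1_000_000   # figure quoted in the text
TERABYTES_PER_YEAR = 2_500          # figure quoted in the text
BYTES_PER_TERABYTE = 1000 ** 4


def transactions_per_year(per_hour: int, hours_per_day: int = 24, days: int = 365) -> int:
    """Total transactions in a year at a constant hourly rate."""
    return per_hour * hours_per_day * days


def implied_bytes_per_transaction(tb_per_year: float, per_hour: int) -> float:
    """Average bytes stored per transaction implied by the two figures."""
    total_bytes = tb_per_year * BYTES_PER_TERABYTE
    return total_bytes / transactions_per_year(per_hour)


if __name__ == "__main__":
    yearly = transactions_per_year(TRANSACTIONS_PER_HOUR)
    per_txn = implied_bytes_per_transaction(TERABYTES_PER_YEAR, TRANSACTIONS_PER_HOUR)
    print(f"Transactions per year: {yearly:,}")
    print(f"Implied kilobytes per transaction: {per_txn / 1000:.0f}")
```

Under these assumptions the implied volume is a few hundred kilobytes per transaction, which suggests the quoted total covers far more than the bare sales record.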

BI & Big Data

Big Data is likely to change the face of Business Intelligence in the future. Expect Hadoop clusters (see The Tech) to be the front line of much business intelligence work. They will enable the analysis of information that we've never thought to look at before, especially unstructured data that can't feed into most data warehouses.

Even sophisticated data warehouses often store only about three years of information; after that, there is simply too much for current enterprise systems to manage. Organisations that actively seek out data in their environment will use Big Data to control the volumes passing into their data warehouses.

Big Data removes the need for archiving, which not only saves the cost of tapes or paid storage but also keeps all information live relatively cheaply. Sampling is dead; assumptions can be removed from algorithms and real information can be used for analysis.

The open source software is free and is supported both by the open source community and, increasingly, by major IT vendors. Coupled with commodity hardware, it lets you scale up as much as you like for a fraction of the cost (relatively speaking, of course).

Big Data does mean, however, that we need to think differently about the questions we ask and the data we query. The value of unstructured data such as web logs, clickstream paths, web content, video and images as a strategic tool is significant.

Who Will Benefit?

Organisations with truly enormous datasets in highly competitive markets will benefit the most from Big Data technology.

Telcos: Can bring together web logs and call detail records (mobile and fixed) and perform behavioural analysis across them to reduce churn and predict consumer behaviour in support of long-term strategic thinking.

Finance: Global credit card companies (consumer behaviour, fraud analysis) and international trading exchanges (risk analysis, and detection of fraud or money laundering through pattern analysis across all trades).

Government Agencies: Analysis of long-term statistics and other information to support policy decisions (call, web, transport, and intelligence data).

Global Retailers: Multi-channel consumer behaviour analysis and web content analysis.

Energy / Manufacturing: Analysis of sensor network data to forecast demand and identify areas of concern such as process inefficiency or potential fraud.

Science: Genomic analysis, high energy physics, astronomy, geology, climate science.

What if your car insurance premium was based on your specific driving habits? With GPS data feeding directly from your car, an insurer could tailor premiums to specific usage or even adjust premiums over fixed time periods based on usage (e.g. your premium varies each quarter based on where you go and how far you drive).

How about the government charging road tax based on your particular road usage? Again, GPS information could feed directly from your car into a data lake where it could be analysed.

The Tech

The technology behind Big Data gives businesses the opportunity to access and use this valuable large-scale information rather than storing it to tape and forgetting about it forever. It complements your existing investment in Data Warehouse technology.

Big Data is characterised by Open Source technologies, often originating from the large web players such as Facebook, Yahoo or Google. There are many of these NoSQL (Not Only SQL) databases in circulation (see for a detailed list) but by far the most widely accepted is Hadoop.

Hadoop is an open source project that grew out of the Nutch web search project and was developed largely at Yahoo. It is intended to ease the complexities of performing large-scale batch operations on data and is managed within the Apache project framework. The Hadoop project now has contributions from Yahoo, Google, Apple and Facebook, which arguably employ some of the smartest minds in the business.

Hadoop is now being embraced by the mainstream, with Cloudera having been the main provider of distributions to date. Things are heating up, however, with some of the major BI vendors announcing support for, or solutions using, Big Data technology:

EMC has just announced the EMC Greenplum Hadoop distribution.

IBM's BigInsights offering uses a Hadoop base for storage and processing and IBM InfoSphere for integration, provides specific analytic solutions, and integrates with Netezza.

Teradata has partnered with Cloudera to provide integration of data from the Hadoop HDFS into Teradata.

In addition, key vendors such as Informatica (via EMC), MicroStrategy and Pentaho are now providing support.
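The large-scale batch operations Hadoop is built for follow the MapReduce model: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step aggregates each group. The sketch below imitates that flow in a single Python process to show its shape. It is not Hadoop's API, and the log format ("<client> <path>") and all function names are hypothetical choices for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Pair = Tuple[str, int]


def map_phase(records: Iterable[str], mapper: Callable[[str], Iterator[Pair]]) -> Iterator[Pair]:
    """Apply the mapper to every input record and stream out key/value pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs: Iterable[Pair]) -> Dict[str, List[int]]:
    """Group values by key, as the framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]], reducer: Callable[[str, List[int]], int]) -> Dict[str, int]:
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def page_mapper(log_line: str) -> Iterator[Pair]:
    """Emit (path, 1) for each well-formed line of the hypothetical log format."""
    parts = log_line.split()
    if len(parts) >= 2:
        yield parts[1], 1


def count_reducer(key: str, values: List[int]) -> int:
    """Sum the hit counts for one page."""
    return sum(values)


def run_job(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input records."""
    return reduce_phase(shuffle(map_phase(records, page_mapper)), count_reducer)


if __name__ == "__main__":
    sample = ["10.0.0.1 /home", "10.0.0.2 /home", "10.0.0.3 /about", "malformed"]
    print(run_job(sample))
```

On a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data across the network; the division of work is the same.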

The Advantages

Scale: We can now build data lakes that enable a broader analysis of information. Big Data also keeps the information live (rather than archiving to tape, which usually results in dead data).

Budget: Aside from the hardware, the toolset cost of building a petabyte-sized data lake is minimal. Storage is now so cheap that it's almost free. Hadoop also lets us add inexpensive hardware to build scale as we need it.

New Information: New types of information, in particular unstructured data, can be analysed to provide value to the business. For example, web content can be analysed to determine sentiment, which would prove very useful for military intelligence or for large ecommerce vendors.

Never Lose Data Again: Hadoop is redundant and reliable; it doesn't stop or lose data even in the event of hardware failure, as the data is replicated in multiple locations.
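The replication claim can be illustrated with a toy model. The sketch below assumes three copies of each data block (the default replication factor in HDFS) and a simple round-robin placement across nodes. The placement rule and node names are hypothetical simplifications; a real cluster chooses replica locations with rack awareness.

```python
from typing import Dict, List, Set


def place_blocks(num_blocks: int, nodes: List[str], replication: int = 3) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement: Dict[int, List[str]], failed: Set[str]) -> Set[int]:
    """Blocks that still have at least one replica on a healthy node."""
    return {
        block
        for block, replicas in placement.items()
        if any(node not in failed for node in replicas)
    }


if __name__ == "__main__":
    cluster = ["n1", "n2", "n3", "n4", "n5"]
    placement = place_blocks(10, cluster)
    survivors = readable_blocks(placement, failed={"n1"})
    print(f"{len(survivors)} of {len(placement)} blocks readable after one node failure")
```

With three replicas, any one or two node failures leave every block readable in this model; data is only at risk when all nodes holding a given block fail together.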

What It Means

Different Skills: Organisations will need people who can manage and analyse data on a huge scale. Data Scientist roles are now appearing on job sites as organisations look for individuals who can help them understand their data. Look to the big web companies such as Google, Facebook and Amazon, who are leading the way.

"I keep saying that the sexy job in the next 10 years will be statisticians, and I'm not kidding." (Hal Varian, Google Chief Economist)

Different Toolsets: We need to think differently about our toolsets and get comfortable with open source quickly. Most Big Data tools are open source.

Democratisation of Algorithms: Most algorithms you need have been written and open sourced. The benefit is in the data and the business problems you apply them to.

Architecture Will Change: Big Data works in data lakes (Hadoop clusters), which not only run analysis on this mass-scale information but also become a source for data warehouses. We can now run machine learning on 100 GB+ of image data and clickstream analysis on 100 TB of data on the same platform.

Hardware Will Change: We will need to rely less on small numbers of large machines and look more at large numbers of commodity machines (long term, perhaps even cloud resources).

The Temporal Nature of Data Is Changing: Because volumes are increasing so rapidly, batch operations are back, and they are feeding more traditional BI technologies.

Steps to Success

The following steps will help you incorporate Big Data into your BI program successfully.

1. Gain Executive Support
Support should be based on an acceptance of the value of evidence-based strategy (i.e. executives will already be using the data warehouse, and probably data mining, extensively within the organisation). Find a commercial problem, not a technical problem, to apply this to.

2. Get the Right People
You will need people who can manage large, distributed data sets and the hardware that comes with them. Next are the people who can make sense of all the data and put it into a business context. Think data scientists as opposed to existing data analysts and data miners.

3. Embrace Open Source
Traditional vendors are not the answer here. You'll need to get comfortable quickly with open source. The innovators here are communities made up of the smartest people from the smartest companies around: Google, Yahoo, Apple, Facebook.

4. Buy Capacity from Small Standard Units
Infrastructure as a service (IaaS) vendors and cloud resources provide massive time-to-market and timeliness advantages to those organisations capable of taking advantage of them.

5. Find a Data Source You Don't Use
For example, many organisations don't derive value from their websites. What happens to web logs? Ask questions like "what's the least popular web page?" or "what's the busiest time of day for your website?" You should be able to work out which ISP your customers use. Could you use that information for joint marketing?

6. Visualisation
Think about new ways of presenting data, as some analysis simply won't make sense using tables or graphics.
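The web log questions in step 5 can be answered with very little code once the logs are kept. The sketch below assumes the server writes the widely used Common Log Format; the sample lines, addresses and function names are made up for illustration. It reports the least popular page and the busiest hour of the day.

```python
import re
from collections import Counter
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Common Log Format: host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Made-up sample lines using documentation-reserved IP addresses.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Jul/2011:09:15:02 +1000] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Jul/2011:09:47:10 +1000] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.7 - - [10/Jul/2011:14:03:55 +1000] "GET /contact.html HTTP/1.1" 200 2048',
]


def parse(line: str) -> Optional[Tuple[str, int]]:
    """Return (path, hour of day in the log's own timezone), or None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("path"), timestamp.hour


def summarise(lines: Iterable[str]) -> Tuple[Optional[str], Optional[int]]:
    """Return (least popular page, busiest hour); (None, None) if nothing parsed."""
    pages: Counter = Counter()
    hours: Counter = Counter()
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue  # skip lines that do not match the expected format
        path, hour = parsed
        pages[path] += 1
        hours[hour] += 1
    if not pages:
        return None, None
    # Ties are broken alphabetically for pages and by earliest hour.
    least_popular = min(pages.items(), key=lambda kv: (kv[1], kv[0]))[0]
    busiest_hour = max(hours.items(), key=lambda kv: (kv[1], -kv[0]))[0]
    return least_popular, busiest_hour


if __name__ == "__main__":
    page, hour = summarise(SAMPLE_LINES)
    print(f"Least popular page: {page}; busiest hour: {hour}:00")
```

A single machine handles this comfortably for small sites; the same counting logic is what a Hadoop job would distribute once the logs run to hundreds of Gigabytes.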

Beware...

Big Data is definitely here to stay; it will change the BI landscape and provide a valuable data resource for organisations. However, as with any new technology, there are a number of things to be aware of.

Skills Are Critical: Big Data is in its infancy and requires a different skill set from that of your existing data warehouse team. You need the right people to query the data: data scientists (data quants) rather than traditional SQL query writers.

Don't Start from Scratch: Get a Hadoop distribution from Cloudera or EMC. This will provide the basic tools you need.

PoC Only at Scale: Benefits will kick in at the hundreds of Gigabytes range, not on a couple of laptops!

Manage Expectations: Big Data is good for large-scale analytics and long-term strategic direction. Don't think it will deliver monthly management reporting or that you can use it for ad-hoc queries over structured data.

Examples

There is a growing number of examples of Big Data being used to make strategic decisions at some of the world's more forward-thinking organisations. A simple Google search reveals many such examples (http://wiki.apache.org/hadoop/poweredby).

eBay: Enables search optimisation and research on the eBay network with a 532-node cluster handling 5.3 Petabytes.

Facebook: Enables reporting/analytics and machine learning for Facebook advertising. Uses a 1,100-machine cluster (8,800 cores) with about 12 Petabytes of raw storage.

LinkedIn: Uses Hadoop to power People You May Know using the Graph algorithm provided in MapReduce.

CERN: Uses Hadoop to manage data from ATLAS and other components of the LHC, via the University of Nebraska, in the search for the Higgs boson. 54 Petabytes under storage (this year alone).

Yahoo: Over 100,000 CPUs in more than 40,000 machines are running Hadoop. The biggest cluster is used to support research for Ad Systems and Web Search (total over 16 Petabytes of storage). Yahoo homepage personalisation is based on Hadoop analysis, and they have seen twice the uptake. Yahoo Mail anti-spam analysis sees 40% less spam than Hotmail (Yahoo figures).

VISA: Replaced a legacy ETL subsystem with a Hadoop-based alternative that is more flexible, faster and cheaper.

Chase: Analyses long-term historical trade data to identify fraudulent activity and build real-time fraud prevention.

The British Library: The Library is working with IBM, using their Hadoop-based BigSheets solution, to preserve and analyse all the websites in the .uk top-level domain to provide a unique view of British online activity over time; something that was simply not possible in the past. (http://news.cnet.com/ _ html)

Data Storage Byte Table

1000 Megabytes   = 1 Gigabyte
1000 Gigabytes   = 1 Terabyte
1000 Terabytes   = 1 Petabyte
1000 Petabytes   = 1 Exabyte
1000 Exabytes    = 1 Zettabyte
1000 Zettabytes  = 1 Yottabyte
1000 Yottabytes  = 1 Brontobyte
1000 Brontobytes = 1 Geopbyte
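The byte table above amounts to a ladder of powers of 1000, so converting between units is a matter of counting steps. The sketch below is one way to express that; the function name is a hypothetical choice. It follows the paper's table, including Brontobyte and Geopbyte, which are informal names rather than standardised units.

```python
# Units in ascending order; each is 1000 times the one before it,
# exactly as in the table. "Brontobyte" and "Geopbyte" are informal
# names taken from the table, not standardised units.
UNITS = [
    "Megabyte", "Gigabyte", "Terabyte", "Petabyte", "Exabyte",
    "Zettabyte", "Yottabyte", "Brontobyte", "Geopbyte",
]


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a quantity between two units of the table (decimal steps of 1000)."""
    try:
        steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    except ValueError as exc:
        raise ValueError(f"unknown unit: {exc}") from exc
    if steps >= 0:
        return value * 1000 ** steps      # moving down to a smaller unit
    return value / 1000 ** (-steps)       # moving up to a larger unit


if __name__ == "__main__":
    print(convert(2_500, "Terabyte", "Petabyte"), "Petabytes")
    print(convert(1, "Zettabyte", "Exabyte"), "Exabytes")
```

Note that these are decimal units; tools that report sizes in powers of 1024 will show slightly different numbers for the same data.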

In Summary

There is an estimated one Zettabyte (or 1,000 Exabytes) of information currently stored worldwide, and by 2030 this is predicted to increase to as much as 700 Zettabytes [1]. The sheer volume of information that organisations now store, and therefore want to access for competitive advantage and strategic decision making, means that we must rethink the way we store information. Parking data with strategic value on twenty Gigabyte tapes is not the answer.

As the amount of data available continues to grow rapidly, businesses that fail to develop the skills to manage and analyse it will find themselves at a competitive disadvantage. As more and more organisations move into statistics and data mining to set strategic direction, greater insight is required to stay ahead of the pack. Used properly, Big Data will help organisations manage risk better and improve the customer experience, fundamentally changing the way information management operates.

[1] During the 2010 Hadoop World Conference, Abhishek Mehta, then a managing director at Bank of America and now founder of Tresata, cited a Cisco Systems estimate that by zettabytes (ZBs) of data would be flowing across the Internet. A zettabyte represents 1,000 exabytes (EBs), or 1 million petabytes (PBs).
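The growth forecast in the summary can be restated as a compound annual rate. The sketch below assumes the "currently stored" figure refers to 2011, the paper's publication year, which is an assumption rather than something the paper states. It solves end = start * (1 + r)^years for r.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate r such that start * (1 + r) ** years == end."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    START_YEAR = 2011        # assumption: the paper's publication year
    END_YEAR = 2030          # year quoted in the text
    START_ZETTABYTES = 1     # figure quoted in the text
    END_ZETTABYTES = 700     # figure quoted in the text

    rate = implied_annual_growth(START_ZETTABYTES, END_ZETTABYTES, END_YEAR - START_YEAR)
    print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the forecast implies growth on the order of 40% a year, sustained for almost two decades.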

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

How Big Is Big Data Adoption? Survey Results. Survey Results... 4. Big Data Company Strategy... 6

How Big Is Big Data Adoption? Survey Results. Survey Results... 4. Big Data Company Strategy... 6 Survey Results Table of Contents Survey Results... 4 Big Data Company Strategy... 6 Big Data Business Drivers and Benefits Received... 8 Big Data Integration... 10 Big Data Implementation Challenges...

More information

Getting Started Practical Input For Your Roadmap

Getting Started Practical Input For Your Roadmap Getting Started Practical Input For Your Roadmap Mike Ferguson Managing Director, Intelligent Business Strategies BA4ALL Big Data & Analytics Insight Conference Stockholm, May 2015 About Mike Ferguson

More information

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica So What s the market s definition of Big Data? Datasets whose volume, velocity, variety

More information

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

The Potential of Big Data in the Cloud. Juan Madera Technology Consultant juan.madera.jimenez@accenture.com

The Potential of Big Data in the Cloud. Juan Madera Technology Consultant juan.madera.jimenez@accenture.com The Potential of Big Data in the Cloud Juan Madera Technology Consultant juan.madera.jimenez@accenture.com Agenda How to apply Big Data & Analytics What is it? Definitions, Technology and Data Science

More information

III Big Data Technologies

III Big Data Technologies III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Open source Google-style large scale data analysis with Hadoop

Open source Google-style large scale data analysis with Hadoop Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

What happens when Big Data and Master Data come together?

What happens when Big Data and Master Data come together? What happens when Big Data and Master Data come together? Jeremy Pritchard Master Data Management fgdd 1 What is Master Data? Master data is data that is shared by multiple computer systems. The Information

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

Bringing Big Data into the Enterprise

Bringing Big Data into the Enterprise Bringing Big Data into the Enterprise Overview When evaluating Big Data applications in enterprise computing, one often-asked question is how does Big Data compare to the Enterprise Data Warehouse (EDW)?

More information

Hadoop implementation of MapReduce computational model. Ján Vaňo

Hadoop implementation of MapReduce computational model. Ján Vaňo Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed

More information

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization

More information

Doing Multidisciplinary Research in Data Science

Doing Multidisciplinary Research in Data Science Doing Multidisciplinary Research in Data Science Assoc.Prof. Abzetdin ADAMOV CeDAWI - Center for Data Analytics and Web Insights Qafqaz University aadamov@qu.edu.az http://ce.qu.edu.az/~aadamov 16 May

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

While a number of technologies fall under the Big Data label, Hadoop is the Big Data mascot.

While a number of technologies fall under the Big Data label, Hadoop is the Big Data mascot. While a number of technologies fall under the Big Data label, Hadoop is the Big Data mascot. Remember it stands front and center in the discussion of how to implement a big data strategy. Early adopters

More information

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

The Future of Data Management with Hadoop and the Enterprise Data Hub

The Future of Data Management with Hadoop and the Enterprise Data Hub The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees

More information

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12 Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using

More information

THE AGE OF BIG DATA. Chula DataScience

THE AGE OF BIG DATA. Chula DataScience THE AGE OF BIG DATA Asst. Prof. Natawut Nupairoj, Ph.D. Mobile Application and System Services Research Group Department of Computing Engineering Chulalongkorn University natawut.n@chula.ac.th Data is

More information

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily

More information

Large scale processing using Hadoop. Ján Vaňo

Large scale processing using Hadoop. Ján Vaňo Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine

More information

A Survey on Big Data Concepts and Tools

A Survey on Big Data Concepts and Tools A Survey on Big Data Concepts and Tools D. Rajasekar 1, C. Dhanamani 2, S. K. Sandhya 3 1,3 PG Scholar, 2 Assistant Professor, Department of Computer Science and Engineering, Sri Krishna College of Engineering

More information

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»

More information

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM David Chappell SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM A PERSPECTIVE FOR SYSTEMS INTEGRATORS Sponsored by Microsoft Corporation Copyright 2014 Chappell & Associates Contents Business

More information

Big Data and Hadoop for the Executive A Reference Guide

Big Data and Hadoop for the Executive A Reference Guide Big Data and Hadoop for the Executive A Reference Guide Overview The amount of information being collected by companies today is incredible. Wal- Mart has 460 terabytes of data, which, according to the

More information

Expert Reference Series of White Papers. Ten Common Hadoopable Problems. info@globalknowledge.net www.globalknowledge.net

Expert Reference Series of White Papers. Ten Common Hadoopable Problems. info@globalknowledge.net www.globalknowledge.net Expert Reference Series of White Papers Ten Common Hadoopable Problems info@globalknowledge.net www.globalknowledge.net Ten Common Hadoopable Problems What is Hadoop? Hadoop is a data storage and processing

More information

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Big Data Analytics DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Tom Haughey InfoModel, LLC 868 Woodfield Road Franklin Lakes, NJ 07417 201 755 3350 tom.haughey@infomodelusa.com

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D. Big Data Technology ดร.ช ชาต หฤไชยะศ กด Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology

More information

Big Data a threat or a chance?

Big Data a threat or a chance? Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out

Big Data Challenges and Success Factors Deloitte Analytics Your data, inside out Big Data refers to the set of problems and subsequent technologies developed to solve them that are hard or expensive to

DATA MINING AND WAREHOUSING CONCEPTS

CHAPTER 1 DATA MINING AND WAREHOUSING CONCEPTS 1.1 INTRODUCTION The past couple of decades have seen a dramatic increase in the amount of information or data being stored in electronic format. This accumulation

Oracle Big Data for Dummies

Oracle Big Data for Dummies Sai Janakiram Penumuru WW Product Expert Cloud Platforms The Father of Microbiology First Microbiologist Antonie Philips van Leeuwenhoek 2 Sai Janakiram Penumuru

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.

Beyond Web Application Log Analysis using Apache TM Hadoop A Whitepaper by Orzota, Inc. 1 Web Applications As more and more software moves to a Software as a Service (SaaS) model, the web application has

Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014

White Paper Big Data Executive Overview WP-BD-10312014-01 By Jafar Shunnar & Dan Raver Page 1 Last Updated 11-10-2014 Table of Contents Section 01 Big Data Facts Page 3-4 Section 02 What is Big Data? Page

Big Data, Big Traffic. And the WAN

Big Data, Big Traffic And the WAN Internet Research Group January, 2012 About The Internet Research Group www.irg-intl.com The Internet Research Group (IRG) provides market research and market strategy

The HP IT Transformation Story

The HP IT Transformation Story Continued consolidation and infrastructure transformation impacts to the physical data center Dave Rotheroe, October, 2015 Why do data centers exist? Business Problem Application

Big Data and Industrial Internet

Big Data and Industrial Internet Keijo Heljanko Department of Computer Science and Helsinki Institute for Information Technology HIIT School of Science, Aalto University keijo.heljanko@aalto.fi 16.6-2015

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

BIG DATA CHALLENGES AND PERSPECTIVES

BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,

Ten common Hadoopable Problems Real-World Hadoop Use Cases WHITE PAPER

Ten common Hadoopable Problems WHITE PAPER TABLE OF CONTENTS Introduction... 1 What is Hadoop?... 1 Recognizing Hadoopable Problems... 3 Ten Common Hadoopable Problems... 4 1 Risk modeling... 4 2 Customer

Big Data Zurich, November 23. September 2011

Institute of Technology Management Big Data Projektskizze «Competence Center Automotive Intelligence» Zurich, November 11th 23. September 2011 Felix Wortmann Assistant Professor Technology Management,

Big data for the Masses The Unique Challenge of Big Data Integration

Big data for the Masses The Unique Challenge of Big Data Integration White Paper Table of contents Executive Summary... 4 1. Big Data: a Big Term... 4 1.1. The Big Data... 4 1.2. The Big Technology...

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

BIRT in the World of Big Data

BIRT in the World of Big Data David Rosenbacher VP Sales Engineering Actuate Corporation 2013 Actuate Customer Days Today's Agenda and Goals Introduction to Big Data Compare with Regular Data Common Approaches

Intro to Big Data and Business Intelligence

Intro to Big Data and Business Intelligence Anjana Susarla Eli Broad College of Business What is Business Intelligence A Simple Definition: The applications and technologies transforming Business Data

Exploiting Data at Rest and Data in Motion with a Big Data Platform

Exploiting Data at Rest and Data in Motion with a Big Data Platform Sarah Brader, sarah_brader@uk.ibm.com What is Big Data? Where does it come from? 12+ TBs of tweet data every day 30 billion RFID tags

Unico Enterprise Big Data

Unico Enterprise Big Data Managing and scaling Big Data to gain big insights 5 Queens Road, Melbourne Victoria 3004, Australia Phone +61 3 9866 5688 email unico@unico.com.au www.unico.com.au Big Data opportunities

Introduction to Predictive Analytics. Dr. Ronen Meiri ronen@dmway.com

Introduction to Predictive Analytics Dr. Ronen Meiri Outline From big data to predictive analytics Predictive Analytics vs. BI Intelligent platforms What can we do with it. The modeling process. Example

Here comes the flood Tools for Big Data analytics. Guy Chesnot -June, 2012

Here comes the flood Tools for Big Data analytics Guy Chesnot -June, 2012 Agenda Data flood Implementations Hadoop Not Hadoop 2 Agenda Data flood Implementations Hadoop Not Hadoop 3 Forecast Data Growth

Big Data. Fast Forward. Putting data to productive use

Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize

Ten Common Hadoopable Problems

XML Impacting the Enterprise Tapping into the Power of XML: Five Success Stories Ten Common Hadoopable Problems Real-World Hadoop Use Cases Ten Common Hadoopable Problems Real-World Hadoop Use Cases Table

The 3 questions to ask yourself about BIG DATA

The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

Outline. What is Big data and where they come from? How we deal with Big data?

What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline

Hadoop Market - Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2012 2018

Transparency Market Research Hadoop Market - Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2012 2018 Buy Now Request Sample Published Date: July 2013 Single User License: US $ 4595

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait

CSC590: Selected Topics BIG DATA & DATA MINING Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems

22 SMARTENTERPRISEMAG.COM

22 SMARTENTERPRISEMAG.COM Smart Strategies BIG DATA, Big Innovation Smart CIOs are mining their organizations' huge data stores for insights that lead to business innovation. By Tom Farre ILLUSTRATION:

Mind Commerce. http://www.marketresearch.com/mind Commerce Publishing v3122/ Publisher Sample

Mind Commerce http://www.marketresearch.com/mind Commerce Publishing v3122/ Publisher Sample Phone: 800.298.5699 (US) or +1.240.747.3093 or +1.240.747.3093 (Int'l) Hours: Monday - Thursday: 5:30am - 6:30pm

The Enterprise Data Hub and The Modern Information Architecture

The Enterprise Data Hub and The Modern Information Architecture Dr. Amr Awadallah CTO & Co-Founder, Cloudera Twitter: @awadallah 1 2013 Cloudera, Inc. All rights reserved. Cloudera Overview The Leader

Are You Ready for Big Data?

Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

The Next Wave of Data Management. Is Big Data The New Normal?

The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

WHITE PAPER. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

WHITE PAPER Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

Empower Your organization with

Empower Your organization with Big Data Predictive Analytics Solutions AUTOMOBILES MACHINE DATA POINT SALE SOCIAL NET WORK RFID CUSTOMER BASED TEXT DATA SMART METER MOBILE DATA LOCATION BASED STRUCTURED

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

Chapter 1. Contrasting traditional and visual analytics approaches

Chapter 1 Understanding Big Data Analytics In This Chapter Defining Big Data Understanding Big Data Analytics Contrasting traditional and visual analytics approaches The era of Big Data is upon us. The

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle's Big Data

Analyzing Big Data with AWS

Analyzing Big Data with AWS Peter Sirota, General Manager, Amazon Elastic MapReduce @petersirota What is Big Data? Computer generated data Application server logs (web sites, games) Sensor data (weather,

Microsoft SQL Server 2012 with Hadoop

Microsoft SQL Server 2012 with Hadoop Debarchan Sarkar Chapter No. 1 "Introduction to Big Data and Hadoop" In this package, you will find: A Biography of the author of the book A preview chapter from the

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

Ubuntu and Hadoop: the perfect match

WHITE PAPER Ubuntu and Hadoop: the perfect match February 2012 Copyright Canonical 2012 www.canonical.com Executive introduction In many fields of IT, there are always stand-out technologies. This is definitely

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:

Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

Ten Mistakes to Avoid

EXCLUSIVELY FOR TDWI PREMIUM MEMBERS TDWI RESEARCH SECOND QUARTER 2014 Ten Mistakes to Avoid In Big Data Analytics Projects By Fern Halper tdwi.org Ten Mistakes to Avoid In Big Data Analytics Projects

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment

The 4 Pillars of Technosoft's Big Data Practice

beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft's Big Practice Overview Businesses have long managed

Big Data Analytics Best Practices

1 Big Data Analytics Best Practices Marshall Presser Federal Field CTO Greenplum 2 Big Data Makes the Mainstream 3 WHAT DOES IT TAKE? 4 1. New Applications MADlib 5 2. New Skill Sets -- Data Science 6

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A REVIEW ON BIG DATA MANAGEMENT AND ITS SECURITY PRUTHVIKA S. KADU 1, DR. H. R.