Chapter 6: Big Data and Analytics

Size: px
Start display at page:

Download "Chapter 6: Big Data and Analytics"

Transcription

1 Learning Objectives Dr. Chen, Data Base Management Chapter 6: Big Data and Analytics Jason C. H. Chen, Ph.D. Professor of MIS School of Business Administration Gonzaga University Spokane, WA Learn what Big Data is and how it is changing the world of analytics Understand the motivation for and business drivers of Big Data analytics Become familiar with the wide range of enabling technologies for Big Data analytics Learn about Hadoop, MapReduce, and NoSQL as they relate to Big Data analytics Understand the role of and capabilities/skills for data scientist as a new analytics profession (Continued ) 2 Learning Objectives Opening Vignette Compare and contrast the complementary uses of data warehousing and Big Data Become familiar with the vendors of Big Data tools and services Understand the need for and appreciate the capabilities of stream analytics Learn about the applications of stream analytics Big Data Meets Big Science at CERN Situation Problem Solution Results Answer & discuss the case questions. 4 Questions for the Opening Vignette What is CERN, and why is it important to the world of science? How does the Large Hadron Collider work? What does it produce? What is the essence of the data challenge at CERN? How significant is it? What was the solution? How were the Big Data challenges addressed with this solution? What were the results? Do you think the current solution is sufficient? 1. What is CERN, and why is it important to the world of science? CERN is the European Organization for Nuclear Research. It plays a leading role in fundamental studies of physics. It has been instrumental in many key global innovations and breakthrough discoveries in theoretical physics and today operates the world s largest particle physics laboratory, home to the Large Hadron Collider (LHC)

2 2. How does Large Hadron Collider work? What does it produce? The LHS houses the world s largest and the most sophisticated scientific instruments to study the basic constituents of matter the fundamental particles. These instruments include purpose-built particle accelerators and detectors. Accelerators boost the beams of particles to very high energies before the beams are forced to collide with each other or with stationary targets. Detectors observe and record the results of these collisions, which are happening at or near the speed of light. This process provides the physicists with clues about how the particles interact, and provides insights into the fundamental laws of nature 7. What is the essence of the data challenge at CERN? How significant is it? Collision events in LHC occur 40 million times per second, resulting in 15 petabytes of data produced annually at the CERN Data Centre. CERN does not have the capacity to process all of the data that it generates, and therefore relies on numerous other research centers all around the world to access and process the data. Processing all the data of their experiments requires an enormously complex distributed and heterogeneous computing and data management system. With such vast quantities of data, both structured and unstructured, information discovery is a big challenge for CERN What was the solution? How were the Big Data challenges addressed with this solution? CMS s data management and workflow management (DMWM) created the Data Aggregation System (DAS), built on MongoDB (a Big Data management infrastructure) to provide the ability to search and aggregate information across this complex data landscape. MongoDB s non-schema structure allowed flexible data structures to be stored and indexed What were the results? Do you think the current solution is sufficient? DAS is used 24 hours a day, seven days a week, by CMS physicists, data operators, and data managers at research facilities around the world. The performance of MongoDB has been outstanding, with an ability to offer a free text query system that is fast and scalable. Without help from DAS, information lookup would have taken orders of magnitude longer. The current solution is outstanding, but more improvements can be made and CERN is looking to apply big data approached beyond CMS. 10 Big Data - Definition and Concepts Big Data means different things to people with different backgrounds and interests Traditionally, Big Data = massive volumes of data E.g., volume of data at CERN, NASA, Google, Where does the Big Data come from? Everywhere! Web logs, RFID, GPS systems, sensor networks, social networks, Internet-based text documents, Internet search indexes, detail call records, astronomy, atmospheric science, biology, genomics, nuclear physics, biochemical experiments, medical records, scientific research, military surveillance, multimedia archives, Technology Insights 6.1 The Data Size Is Getting Big, Bigger, Hadron Collider - 1 PB/sec Boeing jet - 20 TB/hr Facebook TB/day YouTube 1 TB/4 min The proposed Square Kilometer Array telescope (the world s proposed biggest telescope) 1 EB/day

3 Big Data - Definition and Concepts Big Data is a misnomer! Big Data is more than just big The Vs that define Big Data Volume Variety Velocity Veracity Variability Value ERP SCM CRM Images Audio and Video Machine Logs Text Web and Social A High-Level Conceptual Architecture for Big Data Solutions (by AsterData / Teradata) MOVE UNIFIED DATA ARCHITECTURE System Conceptual View DATA PLATFORM EVENT PROCESSING MANAGE INTEGRATED DATA WAREHOUSE ACCESS DISCOVERY PLATFORM Marketing Applications Business Intelligence Data Mining Math and Stats Languages Marketing Executives Operational Systems Customers Partners Frontline Workers Business Analysts Data Scientists Engineers BIG DATA SOURCES ANALYTIC TOOLS & APPS USERS 1 14 Fundamentals of Big Data Analytics Big Data by itself, regardless of the size, type, or speed, is worthless Big Data + big analytics = value With the value proposition, Big Data also brought about big challenges Effectively and efficiently capturing, storing, and analyzing Big Data New breed of technologies needed (developed or purchased or hired or outsourced ) 15 Big Data Considerations You can t process the amount of data that you want to because of the limitations of your current platform. You can t include new/contemporary data sources (e.g., social media, RFID, Sensory, Web, GPS, textual data) because it does not comply with the data storage schema You need to (or want to) integrate data as quickly as possible to be current on your analysis. You want to work with a schema-on-demand data storage paradigm because of the variety of data types involved. The data is arriving so fast at your organization s doorstep that your traditional analytics platform cannot handle it. 16 Critical Success Factors for Big Data Analytics A clear business need (alignment with the vision and the strategy) Strong, committed sponsorship (executive champion) Alignment between the business and IT strategy A fact-based decision-making culture A strong data infrastructure The right analytics tools Right people with right skills Critical Success Factors for Big Data Analytics Personnel with advanced analytical skills The right analytics tools A strong data infrastructure A Clear business need Keys to Success with Big Data Analytics A fact - based decision - making culture Strong, committed sponsorship Alignment between the business and IT strategy 17 18

4 Enablers of Big Data Analytics In-memory analytics Storing and processing the complete data set in RAM In-database analytics Placing analytic procedures close to where data is stored Grid computing & MPP Use of many machines and processors in parallel (MPP - massively parallel processing) Appliances Combining hardware, software, and storage in a single unit for performance and scalability 19 Challenges of Big Data Analytics Data volume The ability to capture, store, and process the huge volume of data in a timely manner Data integration The ability to combine data quickly and at reasonable cost Processing capabilities The ability to process the data quickly, as it is captured (i.e., stream analytics) Data governance ( security, privacy, access) Skill availability ( data scientist) Solution cost (ROI) 20 Business Problems Addressed by Big Data Analytics Process efficiency and cost reduction Brand management Revenue maximization (e.g., cross-selling/ up-selling/ bundled-selling) Enhanced customer experience Churn identification, customer recruiting Improved customer service Identifying new products and market opportunities Risk management Regulatory compliance Enhanced security capabilities 21 Application Case 6.2 Top 5 Investment Bank Achieves Single Source of the Truth Questions for Discussion 1. How can Big Data benefit large-scale trading banks? 2. How did MarkLogic s infrastructure help ease the leveraging of Big Data?. What were the challenges, the proposed solution, and the obtained results? 22 Application Case 6.2 Moving from many old systems to a unified new system Before After 1. How can Big Data benefit large-scale trading banks? Big Data can potentially handle the high volume, high variability, continuously streaming data that trading banks need to deal with. Traditional relational databases are often unable to keep up with the data demand. Before it was difficult to identify financial exposure across many systems (separate copies of derivatives trade store) After it was possible to analyze all contracts in single database (MarkLogic Server eliminates the need for 20 database copies)

5 2. How did MarkLogic infrastructure help ease the leveraging of Big Data? MarkLogic was able to meet two needs: (1) upgrading existing Oracle and Sybase platforms, and (2) compliance with regulatory requirements. Their solution provided better performance, scalability, and faster development for future requirements than competitors. Most importantly, MarkLogic was able to eliminate the need for replicated database servers by providing a single server providing timely access to the data.. What were the challenges, the proposed solution, and the obtained results? The Bank s legacy system was built on relational database technology. As data volumes and variability increased, the legacy system was not fast enough to respond to growing business needs and requirements. It was unable to deliver real-time alerts to manage market and counterparty credit positions in the desired timeframe. Big data offered the scalability to address this problem. The benefits included a new alert feature, less downtime for maintenance, much faster capacity to process complex changes, and reduced operations costs Big Data Technologies There are a number of technologies for processing and analyzing Big Data, but most have some common characteristics. Namely: take advantage of commodity hardware to enable scale-out and parallel-processing techniques; employ non-relational data storage capabilities to process unstructured and semi-structured data; apply advanced analytics and data visualization technology to Big data to convey insights to end users. Three major technologies will transform the business analytics and data management markets: MapReduce, Hadoop, and NoSQL 27 Big Data Technologies MapReduce Hadoop Hive Pig Hbase Flume Oozie Ambari Avro Mahout, Sqoop, Hcatalog,. 28 Big Data Technologies - MapReduce MapReduce distributes the processing of very large multi-structured data files across a large cluster of ordinary machines/processors by breaking the processing into small units of work is a programming model not a programming language Goal achieving high performance with many simple computers processing and analyzing large volumes of multi-structured data in a timely manner Developed and popularized by Google Example tasks: indexing the Web for search, graph analysis, text analysis, machine learning, 29 MapReduce Processes Objective: to count the number of squares of each color Input: a set of colored squares Responsibility of programmers: coding the map and reducing programs; the remainder of the processing is handled by the software system implementing the MapReduce programming model Processes: the MapReduce system first reads the input file and splits it into multiple pieces. In this example, there are two splits, but in a real-life scenario, the number of splits would typically be much higher. These splits are then processed by multiple map programs running in parallel on the nodes of the clusters. Role of each map program: groups the data in a split by color, the MapReduce system then takes the output from each map program and merges (shuffle/sort) the results for input to the reduce program, which calculates the sum of the number of squares of each color. In this example, only one copy of the reduce program is used 0 5

6 Big Data Technologies MapReduce Big Data Technologies Hadoop How does MapReduce work? 4 Raw Data Map Function Reduce Function Fig. 6.4 A Graphical Depiction of the MapReduce Process 1 Hadoop is an open source framework for storing and analyzing massive amounts of distributed, unstructured data in a cost- and time-effective manner. Originally created by Doug Cutting at Yahoo! Hadoop clusters run on inexpensive commodity hardware so projects can scale-out inexpensively Hadoop is now part of Apache Software Foundation Open source - hundreds of contributors continuously improve the core technology MapReduce + Hadoop = Big Data core technology 2 Big Data Technologies Hadoop How Does Hadoop Work? Access unstructured and semi-structured data (e.g., log files, social media feeds, other data sources) Break the data up into parts, which are then loaded into a file system made up of multiple nodes running on commodity hardware using Hadoop Distributed File System (HDFS). Each part is replicated multiple times and loaded into the file system for replication and failsafe processing A node acts as the Facilitator and another as Job Tracker Facilitator: communicate back to the client information such aws which nodes are available, where in the cluster certain data resides, and which node have failed Job Tracker: determine which data it needs to access to complete the job and where in the cluster that data is located. The job Tracker submits the query to the relevant nodes rather than bringing all the data back into a central location for processing. Processing then occurs at each node simultaneously, or in parallel an essential characteristics of Hadoop. Big Data Technologies Hadoop How Does Hadoop Work? (cont.) - Jobs are distributed to the clients, and once completed, the results are collected and aggregated using MapReduce When each node has finished processing its given job, it stores the results. The client initiates a Reduce job through the Job Tracker in which results of the map phase stored locally on individual nodes are aggregated to determine the answer to the original query, and then are loaded onto another node in the cluster. The client accesses these results, which can then be loaded into one of a number of analytic environments for analysis. The MapReduce job has now been completed. Once the MapReduce phase is complete, the processed data is ready for further analysis by data scientists and others with advanced data analytics skills. 4 Big Data Technologies Hadoop Big Data Technologies Hadoop - Demystifying Facts Hadoop Technical Components Hadoop Distributed File System (HDFS) Name Node (primary facilitator) Secondary Node (backup to Name Node) Job Tracker Slave Nodes (the grunts of any Hadoop cluster) Additionally, Hadoop ecosystem is made up of a number of complementary sub-projects: NoSQL (Cassandra, Hbase), DW (Hive), NoSQL = not only SQL 5 Hadoop consists of multiple products Hadoop is open source but available from vendors too Hadoop is an ecosystem, not a single product HDFS is a file system, not a DBMS Hive resembles SQL but is not standard SQL Hadoop and MapReduce are related but not the same MapReduce provides control for analytics, not analytics Hadoop is about data diversity, not just data volume Hadoop complements a DW; it s rarely a replacement Hadoop enables many types of analytics, not just Web analytics 6 6

7 Data Scientist Skills That Define a Data Scientist The Sexiest Job of the 21st Century Thomas H. Davenport and D. J. Patil Harvard Business Review, October 2012 Data Scientist = Big Data guru One with skills to investigate Big Data Very high salaries, very high expectations Where do Data Scientists come from? M.S./Ph.D. in MIS, CS, IE, and/or Analytics There is not a specific degree program for DS! PE, PML, DSP (Data Science Professional) Communication and Interpersonal Curiosity and Creativity Domain Expertise, Problem Definition and Decision Modeling DATA SCIENTIST Internet and Social Media / Social Networking Technologies Data Access and Management ( both traditional and new data systems ) Programming, Scripting and Hacking 7 8 A Typical Job Post for Data Scientist Application Case 6.4 Big Data and Analytics in Politics Questions for Discussion 1. What is the role of analytics and Big Data in modern-day politics? 2. Do you think Big Data analytics could change the outcome of an election?. What do you think are the challenges, the potential solution, and the probable results of the use of Big Data analytics in politics? 9 40 Application Case 6.4 Big Data and Analytics in Politics INPUT: Data Sources Census data Population specifics, age, race, sex, income, etc. Election Databases Party affiliations, previous election outcomes, trends and distributions Market research Polls, recent trends and movements Social media Facebook, Twitter, LinkedIn, Newsgroups, Blogs, etc. Web (in general) Web pages, posts and replies, search trends, etc. Other data sources Big Data & Analytics (Data Mining, Web Mining, Text Mining, Multi-media Mining) Predicting outcomes and trends Identifying associations between events and outcomes Assessing and measuring the sentiments Profiling (clustering) groups with similar behavioral patterns Other knowledge nuggets OUTPUT: Goals Raise money contributions Increase number of volunteers Organize movements Mobilize voters to get out and vote Other goals and objectives What is the role of analytics and Big Data in modern day politics? Big Data and analytics have a lot to offer to modern-day politics. The main characteristics of Big Data, namely volume, variety, and velocity (the three Vs), readily apply to the kind of data that is used for political campaigns. Big Data Analytics can help predict election outcomes as well as targeting potential voters and donors, and have become a critical part of political campaigns

8 2. Do you think Big Data analytics could change the outcome of an election? It may well have changed the outcome of the 2008 and 2012 elections. For example, in 2012, Barack Obama had a data advantage and the depth and breadth of his campaign s digital operation, from political and demographic data mining to voter sentiment and behavioral analysis, reached beyond anything politics had ever seen. So there was a significant difference in expertise and comfort level with modern technology between the two parties. The usage and expertise gap between the party lines may disappear over time, and this will even the playing field. But new data regarding voter sympathies may itself change the thinking and ideological directions of the major parties. This in itself could shift election outcomes.. What do you think are the challenges, the potential solution, and the probable results of the use of Big Data analytics in politics? One of the big challenges is to ensure some sort of reliability in news media coverage of election issues. For example, most people, including the so-called political experts (who often rely on gut feelings and experiences), thought the 2012 presidential election would be very close. They were wrong. On the other hand, data analysts, using data-driven analytical models, predicted that Barack Obama would win easily with close to 99 percent certainty. The results will be an ever-increasing use of Big Data analytics in politics, both by the parties and candidates themselves and by the news media and analysts who cover them Big Data And Data Warehousing Two paradigms in BI: Data Warehouse and Big. Data Both are competing each other for turning data into actionable information. However, in recent years, the variety and complexity of data made data warehouse incapable of keeping up the changing needs. Big Data A new paradigm that the world of IT was forced to develop, not because the volume of the structured data but the variety and the velocity. Big Data And Data Warehousing What is the impact of Big Data on DW? Big Data and RDBMS do not go nicely together Will Hadoop replace data warehousing/rdbms? Use Cases for Hadoop Hadoop is the repository and refinery for raw data Hadoop is a powerful, economical, and active archive Use Cases for Data Warehousing Data warehouse performance Integrating data that provides business value Interactive BI tools Hadoop versus Data Warehouse When to Use Which Platform Coexistence of Hadoop and DW The following scenarios support an environment where the Hadoop and RDBMS are separate from each other and connectivity software is used to exchange data between the two systems: 1. Use Hadoop for storing and archiving multi-structured data 2. Use Hadoop for filtering, transforming, and/or consolidating multi-structured data. Use Hadoop to analyze large volumes of multi-structured data and publish the analytical results 4. Use a relational DBMS that provides MapReduce capabilities as an investigative computing platform 5. Use a front-end query tool to access and analyze data

9 Coexistence of Hadoop and DW Big Data Vendors connectivity Big Data vendor landscape is developing very rapidly A representative list would include Claudera - claudera.com Software, MapR mapr.com Hardware, Service, Hortonworks - hortonworks.com Also, IBM (Netezza, InfoSphere), Oracle (Exadata, Exalogic), Microsoft, Amazon, Google, Source: Teradata Top 10 Big Data Vendors with Primary Focus on Hadoop Technology Insights 6.4 How to Succeed with Big Data $70 $60 $50 $40 $0 $20 $10 $0 1. Simplify 2. Coexist. Visualize 4. Empower 5. Integrate 6. Govern 7. Evangelize Big Data And Stream Analytics Data-in-motion analytics and real-time data analytics One of the Vs in Big Data = Velocity Analytic process of extracting actionable information from continuously flowing/streaming data Why Stream Analytics? It may not be feasible to store the data, or may lose its value Stream Analytics vs. Perpetual Analytics Stream analytics involves applying transaction-level logic to realtime observations. Used when transaction volumes are high and the time-to-decision is too short, Perpetual analytics is the term describing the process of performing (near) real-time analytics on data streams. Used when the mission is critical and transaction volumes can managed in real time. 5 Q: Why is stream analytics becoming more popular? Stream analytics is becoming increasingly more popular for two main reasons. First, time-to-action has become an ever decreasing value, and second, we have the technological means to capture and process the data while it is being created. Application Case 6.7: Turning Machine-Generated Streaming Data into Valuable Business Insights 54 9

10 Stream Analytics A Use Case in Energy Industry Stream Analytics Applications Sensor Data (Energy Production System Status) Meteorological Data (Wind, Light, Temperature, etc.) Usage Data (Smart Meters, Smart Grid Devices) Energy Production System (Traditional and/or Renewable) Data Integration and Temporary Staging Permanent Storage Area Capacity Decisions Streaming Analytics (Predicting Usage, Production and Anomalies) e-commerce Telecommunication Law Enforcement and Cyber Security Power Industry Financial Services Health Services Government Energy Consumption System (Residential and Commercial) Pricing Decisions End-of-Chapter Application Case Discovery Health Turns Big Data into Better Healthcare Questions for Discussion 1. How big is big data for Discovery Health? 2. What big data sources did Discovery Health use for their analytic solutions?. What were the main data/analytics challenges Discovery Health was facing? 4. What were the main solutions they have produced? 5. What were the initial results/benefits? What do you think will be the future of big data analytics at Discovery? How big is big data for Discovery Health? Discovery Health serves 2.7 million members, and its claims system generates a million new rows of data each day. For analytics purposes, they want to use three years of data, which comes to over a billion rows of claims data alone. 2. What big data sources did Discovery Health use for their analytic solutions? The data sources include members claims data, pathology results, members questionnaires, data from pharmacies, hospital admissions, and billing data, to name a few. Looking to the future, Discovery is also starting to analyze unstructured data, such as text-based surveys and comments from online feedback forms. 58. What were the main data/analytics challenges Discovery Health was facing? The challenge faced by Discovery Health was similar to that faced by any company operating in a Big Data environment. Given the huge amount of data that Discovery Health was generating, how could they identify the key insights that their business and their members health depend on? To find the needles of vital information in the big data haystack, the company not only needed a sophisticated data-mining and predictive modeling solution, but also an analytics infrastructure with the power to deliver results at the speed of business What were the main solutions they have produced? To achieve this transformation in its analytics capabilities, Discovery worked with BITanium, an IBM Business Partner with deep expertise in operational deployments of advanced analytics technologies. The solution included use of IBM s SSPS and PureData System for Analytics

11 5. What were the initial results/benefits? What do you think will be the future of big data analytics at Discovery? Discovery Health is now able to run three years worth of data for their 2.7 million members through complex statistical models to deliver actionable insights in a matter of minutes. This helps them to better predict and prevent health risks, and to identify and eliminate fraud. The results have been transformative. From an analytics perspective, the speed of the solution gives Discovery more scope to experiment and optimize its models. From a broader business perspective, the combination of SPSS and PureData technologies gives Discovery the ability to put actionable data in the hands of its decision makers faster. End of the Chapter Questions, comments

Big Data. What is Big Data? Over the past years. Big Data. Big Data: Introduction and Applications

Big Data. What is Big Data? Over the past years. Big Data. Big Data: Introduction and Applications Big Data Big Data: Introduction and Applications August 20, 2015 HKU-HKJC ExCEL3 Seminar Michael Chau, Associate Professor School of Business, The University of Hong Kong Ample opportunities for business

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

The Future of Data Management with Hadoop and the Enterprise Data Hub

The Future of Data Management with Hadoop and the Enterprise Data Hub The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

WHITE PAPER. Four Key Pillars To A Big Data Management Solution

WHITE PAPER. Four Key Pillars To A Big Data Management Solution WHITE PAPER Four Key Pillars To A Big Data Management Solution EXECUTIVE SUMMARY... 4 1. Big Data: a Big Term... 4 EVOLVING BIG DATA USE CASES... 7 Recommendation Engines... 7 Marketing Campaign Analysis...

More information

Modernizing Your Data Warehouse for Hadoop

Modernizing Your Data Warehouse for Hadoop Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Apache Hadoop: The Big Data Refinery

Apache Hadoop: The Big Data Refinery Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

More information

Enable your Modern Data Architecture by delivering Enterprise Apache Hadoop

Enable your Modern Data Architecture by delivering Enterprise Apache Hadoop Modern Data Architecture with Enterprise Apache Hadoop Hortonworks. We do Hadoop. Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Our Mission: Enable your Modern Data Architecture

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Big Data Analytics DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Tom Haughey InfoModel, LLC 868 Woodfield Road Franklin Lakes, NJ 07417 201 755 3350 tom.haughey@infomodelusa.com

More information

Getting Started Practical Input For Your Roadmap

Getting Started Practical Input For Your Roadmap Getting Started Practical Input For Your Roadmap Mike Ferguson Managing Director, Intelligent Business Strategies BA4ALL Big Data & Analytics Insight Conference Stockholm, May 2015 About Mike Ferguson

More information

How Big Is Big Data Adoption? Survey Results. Survey Results... 4. Big Data Company Strategy... 6

How Big Is Big Data Adoption? Survey Results. Survey Results... 4. Big Data Company Strategy... 6 Survey Results Table of Contents Survey Results... 4 Big Data Company Strategy... 6 Big Data Business Drivers and Benefits Received... 8 Big Data Integration... 10 Big Data Implementation Challenges...

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization

More information

Bringing Big Data into the Enterprise

Bringing Big Data into the Enterprise Bringing Big Data into the Enterprise Overview When evaluating Big Data applications in enterprise computing, one often-asked question is how does Big Data compare to the Enterprise Data Warehouse (EDW)?

More information

Big data for the Masses The Unique Challenge of Big Data Integration

Big data for the Masses The Unique Challenge of Big Data Integration Big data for the Masses The Unique Challenge of Big Data Integration White Paper Table of contents Executive Summary... 4 1. Big Data: a Big Term... 4 1.1. The Big Data... 4 1.2. The Big Technology...

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

III Big Data Technologies

III Big Data Technologies III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:

More information

How the oil and gas industry can gain value from Big Data?

How the oil and gas industry can gain value from Big Data? How the oil and gas industry can gain value from Big Data? Arild Kristensen Nordic Sales Manager, Big Data Analytics arild.kristensen@no.ibm.com, tlf. +4790532591 April 25, 2013 2013 IBM Corporation Dilbert

More information

Firebird meets NoSQL (Apache HBase) Case Study

Firebird meets NoSQL (Apache HBase) Case Study Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI

More information

So What s the Big Deal?

So What s the Big Deal? So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data

More information

Investor Presentation. Second Quarter 2015

Investor Presentation. Second Quarter 2015 Investor Presentation Second Quarter 2015 Note to Investors Certain non-gaap financial information regarding operating results may be discussed during this presentation. Reconciliations of the differences

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

More information

Integrating a Big Data Platform into Government:

Integrating a Big Data Platform into Government: Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government

More information

Sunnie Chung. Cleveland State University

Sunnie Chung. Cleveland State University Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:

More information

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise An Oracle White Paper October 2011 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Navigating Big Data business analytics

Navigating Big Data business analytics mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what

More information

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Bringing Big Data to People

Bringing Big Data to People Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process

More information

SQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse

SQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse SQL Server 2012 PDW Ryan Simpson Technical Solution Professional PDW Microsoft Microsoft SQL Server 2012 Parallel Data Warehouse Massively Parallel Processing Platform Delivers Big Data HDFS Delivers Scale

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

The Next Wave of Data Management. Is Big Data The New Normal?

The Next Wave of Data Management. Is Big Data The New Normal? The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013 Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the

More information

BIRT in the World of Big Data

BIRT in the World of Big Data BIRT in the World of Big Data David Rosenbacher VP Sales Engineering Actuate Corporation 2013 Actuate Customer Days Today s Agenda and Goals Introduction to Big Data Compare with Regular Data Common Approaches

More information

How to Leverage Big Data in the Cloud to Gain Competitive Advantage

How to Leverage Big Data in the Cloud to Gain Competitive Advantage How to Leverage Big Data in the Cloud to Gain Competitive Advantage James Kobielus, IBM Big Data Evangelist Editor-in-Chief, IBM Data Magazine Senior Program Director, Product Marketing, Big Data Analytics

More information

White Paper: What You Need To Know About Hadoop

White Paper: What You Need To Know About Hadoop CTOlabs.com White Paper: What You Need To Know About Hadoop June 2011 A White Paper providing succinct information for the enterprise technologist. Inside: What is Hadoop, really? Issues the Hadoop stack

More information

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to outline our

More information

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS! The Bloor Group IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS VENDOR PROFILE The IBM Big Data Landscape IBM can legitimately claim to have been involved in Big Data and to have a much broader

More information

Big Data: Tools and Technologies in Big Data

Big Data: Tools and Technologies in Big Data Big Data: Tools and Technologies in Big Data Jaskaran Singh Student Lovely Professional University, Punjab Varun Singla Assistant Professor Lovely Professional University, Punjab ABSTRACT Big data can

More information

Navigating the Big Data infrastructure layer Helena Schwenk

Navigating the Big Data infrastructure layer Helena Schwenk mwd a d v i s o r s Navigating the Big Data infrastructure layer Helena Schwenk A special report prepared for Actuate May 2013 This report is the second in a series of four and focuses principally on explaining

More information

Chapter 7. Using Hadoop Cluster and MapReduce

Chapter 7. Using Hadoop Cluster and MapReduce Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in

More information

Hadoop implementation of MapReduce computational model. Ján Vaňo

Hadoop implementation of MapReduce computational model. Ján Vaňo Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed

More information

Evolution to Revolution: Big Data 2.0

Evolution to Revolution: Big Data 2.0 Evolution to Revolution: Big Data 2.0 An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) White Paper Prepared for Actian March 2014 IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Table of Contents

More information

BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand?

BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? The Big Data Buzz big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database

More information

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.

More information

Big Data Big Data/Data Analytics & Software Development

Big Data Big Data/Data Analytics & Software Development Big Data Big Data/Data Analytics & Software Development Danairat T. danairat@gmail.com, 081-559-1446 1 Agenda Big Data Overview Business Cases and Benefits Hadoop Technology Architecture Big Data Development

More information

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.

Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved. Big Data Analytics 1 Priority Discussion Topics What are the most compelling business drivers behind big data analytics? Do you have or expect to have data scientists on your staff, and what will be their

More information

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Addressing Open Source Big Data, Hadoop, and MapReduce limitations Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing hadoop distributions Going enterprise with Hadoop 2 How Big are Data?

More information

Big Data and Trusted Information

Big Data and Trusted Information Dr. Oliver Adamczak Big Data and Trusted Information CAS Single Point of Truth 7. Mai 2012 The Hype Big Data: The next frontier for innovation, competition and productivity McKinsey Global Institute 2012

More information

Buyer s Guide to Big Data Integration

Buyer s Guide to Big Data Integration SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out Big Data Challenges and Success Factors Deloitte Analytics Your data, inside out Big Data refers to the set of problems and subsequent technologies developed to solve them that are hard or expensive to

More information

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization

More information

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW Roger Breu PDW Solution Specialist Microsoft Western Europe Marcus Gullberg PDW Partner Account Manager Microsoft Sweden

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required. What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees

More information

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE Current technology for Big Data allows organizations to dramatically improve return on investment (ROI) from their existing data warehouse environment.

More information

IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop IBM BigInsights for Apache Hadoop Efficiently manage and mine big data for valuable insights Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics Advanced

More information

A New Era Of Analytic

A New Era Of Analytic Penang egovernment Seminar 2014 A New Era Of Analytic Megat Anuar Idris Head, Project Delivery, Business Analytics & Big Data Agenda Overview of Big Data Case Studies on Big Data Big Data Technology Readiness

More information

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12 Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using

More information

IBM Big Data Platform

IBM Big Data Platform IBM Big Data Platform Turning big data into smarter decisions Stefan Söderlund. IBM kundarkitekt, Försvarsmakten Sesam vår-seminarie Big Data, Bigga byte kräver Pigga Hertz! May 16, 2013 By 2015, 80% of

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

Tap into Big Data at the Speed of Business

Tap into Big Data at the Speed of Business SAP Brief SAP Technology SAP Sybase IQ Objectives Tap into Big Data at the Speed of Business A simpler, more affordable approach to Big Data analytics A simpler, more affordable approach to Big Data analytics

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

Advanced In-Database Analytics

Advanced In-Database Analytics Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information