Ten Mistakes to Avoid


EXCLUSIVELY FOR TDWI PREMIUM MEMBERS
FIRST QUARTER 2015
Ten Mistakes to Avoid In Hadoop Implementations
By Krish Krishnan
tdwi.org

FOREWORD

Data management and analytics are foundational requirements for creating, managing, and executing a successful business. From an infrastructure perspective, however, it is a struggle to build an integrated data platform that can support the information architecture required by an enterprise data repository and analytics hub.

In the past decade, successful distributed processing architectures, including those from Google and the Nutch project, inspired distributed data processing with Hadoop and its ecosystem of projects. Enterprises have explored Hadoop since 2009, and many start-ups are now focusing on that ecosystem. Today, this infrastructure distribution is being implemented as the enterprise hub for all data; some implementations are successful, but many others are abysmal failures.

Why do so many fail? Where did they go wrong? How do we identify and avoid the mistakes? When inspecting failures and listening to companies and teams, we see that fundamental steps have been missed or ignored, including end-user management, data security, performance tuning, infrastructure configuration, and sizing. From the Hadoop infrastructure perspective, simply applying workarounds to implementations doesn't work.

In this Ten Mistakes to Avoid, we identify the mistakes with the most negative impact on Hadoop implementations and recommend solutions you can apply to your own environment.

ABOUT THE AUTHOR

Krish Krishnan is the founder and president of Sixth Sense Advisors Inc., a management consulting and independent technology analyst firm based in Chicago. He is a recognized expert worldwide in the strategy, architecture, and implementation of big data, text analytics, and high-performance data warehousing, and a recognized author of books, articles, white papers, and e-books. Krish teaches at TDWI events and speaks at many technology conferences.

Copyright TDWI, a division of 1105 Media, Inc. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. E-mail requests or feedback to info@tdwi.org. Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

MISTAKE ONE: THINKING TECHNOLOGY IS A SILVER BULLET

The biggest blunder enterprises commit is assuming technology is a silver bullet. What must we understand to avoid this thinking? Let's examine the enterprise information and data management landscape.

Integrating all data from internal and external data sets drives our need for data storage and analysis platforms that can handle many data formats delivered at different times and speeds. Organizations have not understood the maturity of the platform and therefore have not been able to align their business requirements to its processing capability.

From an enterprise perspective, data management includes acquisition, quality, integration, and analytics, but these are not the requirements that drive the successful Internet-based companies that contribute to the Hadoop ecosystem. Their foundational requirement is to collect data of all types, sizes, and states into a platform that can be used to discover and tag relationships between data elements, analyze them, and store them in a structured database for analytics.

To successfully adopt Hadoop, consider:

- Hadoop is a foundational platform for data integration, analysis, and processing, and is a good enterprise option, but do not get carried away by others' success.
- Your enterprise must understand its maturity as a data processor, the maturity of its Hadoop ecosystem, and your road map of initiatives from a Hadoop perspective.
- To implement a program with Hadoop, change how you execute an initiative. Business teams must own the program, manage the data governance experts, and drive the change. Those with data processing skills should form an independent team that will align with business teams to create the processing rules and architecture. IT will own the execution process and assist business and data processing teams.

By determining the business requirements that will drive your project, deciding how you'll align those requirements across business and technology groups, and understanding your cost, governance, and skill goals, you will get better at implementing and integrating the technology.

MISTAKE TWO: ADOPTING A "STORE FIRST, THEN ANALYZE" WORKFLOW

Hadoop's "store first" feature should be avoided in any implementation. Organizations tend to listen to the claims that an enterprise should first acquire or store data and then worry about processing it. Internet-based companies have made this a successful entry/exit criterion because they need to capture data quickly, tag it, and provide it for end-user consumption in search, mashup reports, and analytical dashboards. However, enterprises cannot afford to use the same strategy because their business goals differ.

"Store first" offers a shaky foundation because we have not yet done a good job of using existing data from analytics or BI initiatives. We don't understand how to focus on a data-driven model; acquiring data without understanding how to align it with business goals and insights is foolhardy.

Take, for example, integrated data analysis from social media forums and campaign success analytics (such as a storyboard). Imagine a large retail company conducting campaigns for a new laundry detergent. The company varies its campaign strategies based on climate, socioeconomic background, age, and ethnicity. The campaign analytics show a slower-than-expected move by consumers to try the new product, but the analysis doesn't explain the reasons or how to make the campaign more effective.

To learn more, the enterprise considers data from social media and creates a crawler to monitor the product or the brand (or both). The crawler collects data from social media and stores and tags the data in a Hadoop platform. Business subject matter experts (SMEs) and data analysts examine the collected and tagged data and discover that their new product has not matched price expectations in the East, Northeast, and Southwest regions. Sales have been higher in the upper Midwest and Northwest regions but are far below expectations in the lower Midwest and Southeast.

This data provides some initial insights, and the team decides to conduct full text analytics while adjusting the campaign by region to better match consumer needs. Repeated attempts start producing improvements, and a few months later the campaign analytics dashboard shows positive readings. Such behavior is the change we need when implementing a Hadoop platform.

MISTAKE THREE: FAILURE TO PREVENT TRANSFORMATION OVERLOAD

When I wrote my first programs on the Hadoop platform as a developer, I was confronted with an issue many people face over time: data transformation. How do we apply the rules of data processing on a file-server-based system where data consistency is not an issue, considering that we can process the same data in different data nodes (storage areas) using different algorithms and programs for discovery and analysis?

Why is transformation an issue on this platform? When does it become an issue, and how can we identify the symptoms and rectify the problem? These questions immediately occurred to the architect side of my brain.

In many of today's Hadoop deployments, transformation overload and its associated problems with data processing during discovery and analysis persist. From a Hadoop and technology perspective, our goal has been to create an infrastructure that can scale up and out on demand, whether that means acquiring and storing data or processing the data during discovery and analysis.

As an example, "#acme #airlines long line, #noupgd platinum_mbr 14hr flt #disgusting #fail #nofly" is a tweet a business user wants to transform to:

Brand: Acme Airlines
User: John Doe
Status: Platinum Member
Issue: Long Travel, No Upgrade, Long Lines to Check in
Sentiment: Negative
Outcome: Will not fly again with Acme
Date Posted, Airport, Retweets
SM_Status: Influencer

In these exercises, there is no mention of needing a schema or any structured storage architecture model, even though the data must be converted at the end of discovery and analysis from unstructured or semi-structured to a structured form. Without establishing this need at data acquisition time, we will lose our way in the discovery phase, with incorrect transformations and business semantics applied incorrectly. The result is cycles of processing that yield no meaningful insights. Eventually, we stop processing the data because we have no idea what is needed. This is a mistake we must avoid.
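The tweet example can be made concrete with a small sketch. It is not the author's implementation; it only illustrates fixing a target schema before transformation begins. The `TweetRecord` fields, the lookup tables, and the `transform` function are hypothetical names introduced here, and a real deployment would draw its mappings from governed taxonomies rather than hard-coded dictionaries.

```python
import re
from dataclasses import dataclass, field

# Hypothetical lookup tables (assumptions for illustration only).
BRANDS = {("acme", "airlines"): "Acme Airlines"}
STATUS_TOKENS = {"platinum_mbr": "Platinum Member"}
ISSUE_TAGS = {"noupgd": "No Upgrade"}
NEGATIVE_TAGS = {"disgusting", "fail", "nofly"}


@dataclass
class TweetRecord:
    """Target schema, agreed on before any data is acquired."""
    brand: str = "Unknown"
    status: str = "Unknown"
    issues: list = field(default_factory=list)
    sentiment: str = "Neutral"


def transform(tweet: str) -> TweetRecord:
    """Map a raw tweet onto the fixed schema; unmatched fields keep defaults."""
    hashtags = [t.lower() for t in re.findall(r"#(\w+)", tweet)]
    tokens = [t.lower() for t in re.findall(r"\w+", tweet)]
    record = TweetRecord()
    # A brand matches only when all of its hashtag parts are present.
    for parts, name in BRANDS.items():
        if all(p in hashtags for p in parts):
            record.brand = name
    for token in tokens:
        if token in STATUS_TOKENS:
            record.status = STATUS_TOKENS[token]
    record.issues = [ISSUE_TAGS[t] for t in hashtags if t in ISSUE_TAGS]
    if "long line" in tweet.lower():
        record.issues.append("Long Lines")
    if any(t in NEGATIVE_TAGS for t in hashtags):
        record.sentiment = "Negative"
    return record


if __name__ == "__main__":
    raw = ("#acme #airlines long line, #noupgd platinum_mbr 14hr flt "
           "#disgusting #fail #nofly")
    print(transform(raw))
```

Because the schema exists before the first record arrives, every transformation either fills a known field or leaves its default in place, which makes incorrect or missing mappings visible early instead of after several processing cycles.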

MISTAKE FOUR: FAILURE TO SECURE EXECUTIVE SPONSORSHIP

Executive sponsorship is a sensitive subject but must be addressed, especially in the context of Hadoop implementation. We are not talking about the 1980s, when CxOs attended conferences and returned with specific thoughts on the competition, the marketplace, enterprise alignment, and customer life cycle management. Today, we have evolved to social and collaborative data-driven operations where the entire enterprise is transparent in all of its activities. Competitive research has taken on a whole new meaning and involves direct access to any prospects and customers over all channels, including social media. This evolution has created opportunities for innovation to be driven both top-down and bottom-up.

For this new mode of operation to be successful, especially with our need for transparency and innovation, we need new attitudes at the executive levels. The enterprise must become transparent to the appropriate managers and business users; they must understand the business strategy and know the enterprise's competitors. This provides focus for the programs in the areas of competitive research, social media analytics, and campaign management. Changes in these focus areas can be measured and presented to executives for discussion with colleagues at their level and above.

Another change in executive support is related to processing data (analysis) in the Hadoop initiative. In the new world of data discovery with large data sets, business users as well as business and data analysts must create a discovery road map with search and semantic workflows. IT must support their implementation and manage the necessary technology integration and usage. When IT is the system integrator, not the solution provider, executive support and direction will need to be clear to avoid potential issues.

How do organizations succeed? It is a rough path for anyone to tread. We recommend that organizations wishing to engage in a Hadoop program create a set of measurable executive sponsorship goals. These goals are not ROI or financial goals, but rather goals to gain competitive insights, understand market alignment, visualize social media analytics, and create a change in business strategy to emerge as a visionary leader.

MISTAKE FIVE: FAILURE TO ESTABLISH GOALS

In today's world, we often execute on an outcomes-based approach. For example, we start our day with a quick look at e-mail and our calendar, then decide which efforts take priority. This is easy to manage for an individual, but a large enterprise must consider payroll processing, for example, with all the inputs being updated until the last minute before beginning to process the cycle for each month. This approach is not easy to implement across all aspects of data processing. However, if we do not establish a key set of goals, our program will fail. For payroll processing, the goals are to ensure all employees are paid and that their earned and used hours of vacation, 401k, and other benefits are calculated accurately.

The new world of data management in any enterprise will evolve from its current architecture of the data warehouse to a new model that will include Hadoop, NoSQL, in-memory, and cloud technologies along with the data warehouse, master data management, and metadata programs. This new world has many names: logical data warehouse, unstructured data warehouse, and next-generation data warehouse. The overall ecosystem will process data through acquisition, discovery, analysis, and transformation; integrate structured data from the current-state data warehouse, data marts, and other enterprise systems; and then present an integrated set of outcomes for analytics and BI processing. Analytics can also be executed on any part of this ecosystem as needed.

When implementing Hadoop as a platform, we fail in the integration process for two reasons. The first is that we do not understand the entire data life cycle as we start the acquisition process; as a result, we end up with transformation overload (as discussed in Mistake Three). The second is that we fail to establish desired outcomes from the data discovery and analysis processes, which leaves the data transformation in a mismatched state, resulting in chaos and failure.

How can we avoid this situation? There are two distinct steps. The first is in the proof-of-concept stage, where we can establish outcomes from the discovery and analysis phase. The second is where we start the Hadoop program and get the business SMEs and data analysts together to create a high-level set of outcomes from the discovery to analytics phases. Associate KPIs with each outcome as desired.

MISTAKE SIX: IMPROPER INFRASTRUCTURE PLANNING

Why does infrastructure planning become a mess? One of the most important reasons is the lack of a standard configuration guide, which is partially due to the newness of the infrastructure. Also, one size does not fit all. The problems we have seen with Hadoop are primarily those of memory, CPU, MapReduce, and YARN configuration and management.

When configuring Hadoop, check several infrastructure issues that tend to overload memory management, especially with data and computing. When programs slow down and start to fail, we should fix the underlying error rather than simply throw more resources at the problem. We make configuration mistakes with MapReduce and YARN that we need to revisit multiple times in order to strengthen the processing of data through the infrastructure. Unfortunately, to make memory specifications feasible, we overload the CPU cycles and harm performance.

For each proof of concept or implementation, the project team must work with the vendor to configure the selection of infrastructure for Hadoop. The team must outline the performance and scalability expectations for the infrastructure for the next five years. This will result in some initial over-configuration of the infrastructure, but once data acquisition and discovery start, the configuration can be fine-tuned to sustain performance with simultaneous loading and analysis of data.

Improper configuration can be handled, and memory crashes with YARN and MapReduce can be avoided. Processor overloading can be prevented by configuring the CPU cycles per job, which can be managed by focusing on workflow configuration for the nodes and the task trackers associated with the actual job execution. Although this sounds like a complex scenario, job execution is easy to manage, and we anticipate that processor cycles will be tuned more than once in a Hadoop cluster.
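The tension between memory and CPU settings can be checked with simple arithmetic before a cluster is configured. The sketch below is a rough sizing aid under stated assumptions, not a vendor formula: the function name `containers_per_node` and all example numbers are hypothetical. The inputs loosely correspond to per-node memory and vcore limits and per-task memory and vcore requests (for example, YARN's `yarn.nodemanager.resource.memory-mb` and `yarn.nodemanager.resource.cpu-vcores`, and MapReduce's `mapreduce.map.memory.mb` and `mapreduce.map.cpu.vcores`); verify the exact property names and defaults against your distribution's documentation.

```python
def containers_per_node(node_memory_mb, node_vcores,
                        reserved_memory_mb, reserved_vcores,
                        container_memory_mb, container_vcores):
    """Estimate how many task containers fit on one worker node.

    The count is limited by whichever resource runs out first, so the
    result also names the bottleneck resource.
    """
    usable_memory = node_memory_mb - reserved_memory_mb
    usable_vcores = node_vcores - reserved_vcores
    if usable_memory <= 0 or usable_vcores <= 0:
        raise ValueError("reservations leave no resources for containers")

    by_memory = usable_memory // container_memory_mb
    by_cpu = usable_vcores // container_vcores
    if by_memory < by_cpu:
        bottleneck = "memory"
    elif by_cpu < by_memory:
        bottleneck = "cpu"
    else:
        bottleneck = "balanced"
    return {
        "by_memory": by_memory,
        "by_cpu": by_cpu,
        "containers": min(by_memory, by_cpu),
        "bottleneck": bottleneck,
    }


if __name__ == "__main__":
    # Hypothetical node: 64 GB RAM and 16 vcores, with 8 GB and 2 vcores
    # reserved for the operating system and Hadoop daemons.
    print(containers_per_node(65536, 16, 8192, 2, 2048, 1))
```

In this hypothetical case, memory alone would allow 28 containers, but only 14 vcores are available; scheduling by memory alone would oversubscribe the CPU, which is the overload described above.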

MISTAKE SEVEN: FAILURE TO PROPERLY CONFIGURE AND MANAGE SEMANTIC DATA

Many organizations forget semantic data and its emergence as a metadata and taxonomy integrator as they begin implementing Hadoop projects. Data discovery is an important step for every Hadoop initiative because it is the first step in the next generation of data processing. In the data discovery process, the goal is to identify the data, select tags, and create a tag index that includes the metadata, the line of occurrence, the context of its occurrence, and the number of occurrences in the current process.

To complete this step successfully, we must use a combination of metadata, taxonomies, semantic workflows, databases, and business-rule engines. This extended set of metadata elements is needed to create a robust data map as we discover the data without losing sight of any piece of the information or leading the team into chaos with bad data maps. We must configure the metadata and semantic libraries with the appropriate set of business rules and context rules applied, which will process the data with the closest and most correct matches.

Another area of semantic technology is machine learning. By applying semantic libraries to aid in machine learning:

- Data mapping and linking are clean with the right metadata, and data lineage is clearer to process
- Data duplication is tagged and removed
- Stop-words processing for text analytics is easier
- Stemming of any text data can use semantic libraries
- Tagging is accomplished easily

These advantages make automating machine learning simple, and we can develop a smart algorithm from simple blocks into a large module for execution. Today we see machine learning in healthcare, insurance, and credit card fraud analytics.

To choose the best strategy in this architecture and avoid the chaos of incorrect semantic integration:

1. Create a data-discovery strategy road map for your Hadoop implementation
2. Conduct a proof-of-concept exercise of data discovery using semantic libraries
3. Ask business user teams to identify the most appropriate semantic libraries to add to the platform
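A tag index of the kind described in this section (tag, line of occurrence, context, and number of occurrences) can be sketched in a few lines. This is an illustration only: the `TAXONOMY` and `STOP_WORDS` tables and the `naive_stem` helper are hypothetical stand-ins for the governed semantic libraries the text recommends, and the suffix-stripping stemmer is deliberately crude.

```python
import re
from collections import defaultdict

# Hypothetical semantic library: stemmed term -> taxonomy tag.
TAXONOMY = {
    "upgrade": "Service/Upgrade",
    "delay": "Operations/Delay",
    "refund": "Billing/Refund",
}
# Hypothetical stop-word list.
STOP_WORDS = {"the", "a", "an", "was", "my", "no", "and", "for"}


def naive_stem(word):
    """Very rough stemming: strip one common suffix if enough stem remains."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[:-len(suffix)]
    return word


def build_tag_index(lines):
    """Return {tag: {"count": n, "occurrences": [{"line", "context"}, ...]}}."""
    index = defaultdict(lambda: {"count": 0, "occurrences": []})
    for line_no, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z]+", line.lower()):
            if word in STOP_WORDS:
                continue
            tag = TAXONOMY.get(naive_stem(word))
            if tag is None:
                continue  # term is not covered by the taxonomy
            entry = index[tag]
            entry["count"] += 1
            entry["occurrences"].append(
                {"line": line_no, "context": line.strip()}
            )
    return dict(index)


if __name__ == "__main__":
    sample = ["The flight was delayed and no upgrade", "Refunds delayed"]
    for tag, entry in build_tag_index(sample).items():
        print(tag, entry["count"], [o["line"] for o in entry["occurrences"]])
```

Terms that fall outside the taxonomy are silently skipped here; in a proof of concept it is worth logging them instead, since they show business users where the semantic libraries need to be extended.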

MISTAKE EIGHT: MAPREDUCE IMPLEMENTATION

A fundamental issue that has hampered many Hadoop implementations is related to MapReduce programming. In the world of Hadoop systems, MapReduce is the lowest grain of execution of programs, whether you execute Hive queries, Pig scripts, or Java or Python code. The confusion starts with how to configure the MapReduce processing ecosystem. Let us understand the flow of MapReduce processing in Hadoop.

MapReduce programming model

- Map: Process a key/value pair to generate intermediate key/value pairs
- Reduce: Merge all intermediate values associated with the same key

Users implement an interface of two primary methods:

1. Map: (key1, val1) → (key2, val2)
2. Reduce: (key2, [val2]) → [val3]

Developers first determine the number of maps and reducers to create. For example, a text document will need tagging and then a set of maps based on the number of tags. An Excel spreadsheet will need a set of maps matching the number of columns.

Execution process

- The input phase generates a number of FileSplits from input files (one per Map task)
- The Map phase executes a user function to transform input key/value pairs into a new set of key/value pairs
- The Hadoop framework sorts and shuffles the key/value pairs to output nodes
- The Reduce phase combines all key/value pairs with the same key into new key/value pairs
- The output phase writes the resulting pairs to files
- The Hadoop framework handles scheduling of tasks on the cluster
- The Hadoop framework handles recovery when a node fails
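The map, shuffle, and reduce phases can be traced with a single-process simulation. The sketch below is plain Python, not Hadoop code; `map_fn`, `reduce_fn`, and `run_job` are hypothetical names, and word count is used only because it is the customary minimal example. It shows the two user-supplied functions and the grouping step the framework performs between them.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map: (key1, val1) -> zero or more (key2, val2) pairs."""
    for word in value.lower().split():
        yield word, 1


def reduce_fn(key, values):
    """Reduce: (key2, [val2]) -> [val3]."""
    return [sum(values)]


def run_job(records, map_fn, reduce_fn):
    """Simulate the input, map, sort/shuffle, and reduce phases in memory."""
    # Map phase: apply the user function to every input record.
    intermediate = []
    for key, value in records:
        intermediate.extend(map_fn(key, value))

    # Sort and shuffle: bring all values for the same key together.
    groups = defaultdict(list)
    for key, value in sorted(intermediate, key=lambda kv: kv[0]):
        groups[key].append(value)

    # Reduce phase: merge the values collected for each key.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


if __name__ == "__main__":
    # Each record stands in for one line of a file split: (offset, text).
    records = [(0, "big data big"), (1, "data hub")]
    print(run_job(records, map_fn, reduce_fn))
```

On a real cluster the framework also handles splitting, scheduling, and recovery; none of that is modeled here, so the sketch is useful for reasoning about key design and data flow, not for performance testing.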

This is where we as developers believe the framework exists to handle all the infrastructure management, memory allocation, parallelism, and distributed computing. The framework can certainly handle the tasks, but not all of them all the time. The question is: How do we know what to do and when to implement each of them? Here are several tips:

- Do not create a distributed processing approach where the objects in the Map and Reduce programs need to communicate repeatedly. This will create issues with object state preservation and its associated memory allocation.
- Try to process aggregations locally rather than distributed over the network. This step will provide greater scalability and improve performance, especially if you use partitions and combiners effectively to manage the MapReduce process.
- Try to use in-memory combiner mapping where possible to create a scalable process flow.
- Pairs and Stripes are useful in MapReduce to process text and semi-structured data. Mahout algorithms use these effectively. Look at the Apache Mahout website for additional design details.
- Design your MapReduce programs with a test data set that contains all the types, formats, and structures of the data, and identify tuning requirements. Start applying the steps as discussed to arrive at an optimal execution state.
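Two of these tips, local aggregation and the pairs and stripes patterns, are easier to weigh when the volume of intermediate data is counted. The sketch below is a single-process illustration in plain Python, not Hadoop or Mahout code; all function names are hypothetical, and co-occurrence is defined here, as an assumption, as two words appearing at different positions on the same line.

```python
from collections import Counter, defaultdict


def naive_map(split):
    """Emit one (word, 1) pair per token; every pair crosses the network."""
    return [(word, 1) for line in split for word in line.split()]


def in_mapper_combining_map(split):
    """Aggregate counts in memory and emit one pair per distinct word."""
    counts = Counter()
    for line in split:
        counts.update(line.split())
    return list(counts.items())


def pairs_map(line):
    """Pairs pattern: emit ((a, b), 1) for every co-occurring word pair."""
    words = line.split()
    return [((a, b), 1)
            for i, a in enumerate(words)
            for j, b in enumerate(words) if i != j]


def stripes_map(line):
    """Stripes pattern: emit one (a, {b: count}) record per distinct word."""
    words = line.split()
    stripes = defaultdict(Counter)
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j:
                stripes[a][b] += 1
    return [(a, dict(neighbors)) for a, neighbors in stripes.items()]


if __name__ == "__main__":
    split = ["a b a", "b a"]
    print(len(naive_map(split)), "pairs without local aggregation")
    print(len(in_mapper_combining_map(split)), "pairs with in-mapper combining")
    print(len(pairs_map("a b a")), "records with pairs")
    print(len(stripes_map("a b a")), "records with stripes")
```

The trade-off is memory for network traffic: in-mapper combining and stripes hold more state in each mapper, so they suit cases where the set of distinct keys per split fits comfortably in memory.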

MISTAKE NINE: FAILING TO PROVIDE BIG DATA GOVERNANCE

Data governance has always been a sensitive subject in the world of data management. The first question is often whether data governance is appropriate for big data. We are still talking about data, so yes, we need a governance program no matter what kind of data we have: big, small, fast, wide, or deep; it doesn't matter. We need governance for this program because:

- Data and user security are still evolving in Hadoop.
- The data must be discovered and tagged before analysis; this can require several types of governance rules to be processed and applied to the data to get the correct results and eventually the appropriate analytic outcomes.
- The data is free form, and acquisition of this data requires strong governance on HCatalog and associated metadata processed and applied to the data.
- Executive sponsorship must be managed to successfully implement Hadoop. This can be executed as a part of your enterprise's governance program.
- The use of taxonomies and ontologies, if not governed and managed, can cause a data processing overrun from acquisition to analytics on the Hadoop platform.

In current Hadoop implementations, we see issues and mistakes arising with governance. Further analysis of these issues reveals consistent ignorance about security, compression, metadata management, and integration of taxonomies and ontologies. When presented with a road map for big data implementations and the maturity models available today (TDWI, EMC) where one core area is governance, teams agreed that paying attention to data governance could have made the difference between success and failure.

Governance is critical for the overall success of all data-driven and data-oriented programs, including Hadoop implementations. For your implementation to succeed, we recommend that you create a road map and a governance maturity model that will guide you from today into the future.

MISTAKE TEN: USING HADOOP AS AN ENTERPRISE DATA REPOSITORY

A classic mistake made by enterprises invested in Hadoop technology is treating the enterprise data lake, data hub, or data repository as the desired end state. Although it is common to store all enterprise data centrally to be consumed by the enterprise, this cannot be accomplished by one technology platform; it needs an ecosystem of components that can integrate and perform the desired functions. In this respect, Hadoop by itself is not a technology platform but a high-performing storage architecture with provisions for computing on data at the same layers.

There are several reasons to resist this desire. First, enterprises have a considerable amount of data in different formats. The data has different grains of accuracy, different velocity, and different veracity. The data needs multiple metadata and master-data sets to process. Although Hadoop has been publicly acknowledged as a platform that can handle and satisfy these requirements, in real-life implementations questions remain about linking and securing the data and executing analytics on the data, none of which has been completely implementable in Hadoop. This is primarily because Hadoop is an ever-evolving environment and some of these features have yet to be developed in Hadoop.

Hadoop is a technology platform and not a solution. Before you jump into the enterprise data hub or data lake hype, understand the requirements of the data and how it will be used, or you will be implementing a version of the enterprise swamp. To validate your approach, you do not need a team of data scientists but rather a team of enterprise architects, business SMEs, and data leaders across your organization to establish and validate your Hadoop requirements and detail the outcomes you expect.

When you document all your viewpoints, outcomes, and expectations, you can decide how you will bring Hadoop in as an enterprise platform and use it to answer your business questions.

ABOUT TDWI

TDWI is your source for in-depth education and research on all things data. For 20 years, TDWI has been helping data professionals get smarter so the companies they work for can innovate and grow faster.

TDWI provides individuals and teams with a comprehensive portfolio of business and technical education and research to acquire the knowledge and skills they need, when and where they need them. The in-depth, best-practices-based information TDWI offers can be quickly applied to develop world-class talent across your organization's business and IT functions to enhance analytical, data-driven decision making and performance.

TDWI advances the art and science of realizing business value from data by providing an objective forum where industry experts, solution providers, and practitioners can explore and enhance data competencies, practices, and technologies.

TDWI offers five major conferences, topical seminars, onsite education, a worldwide membership program, business intelligence certification, live Webinars, resourceful publications, industry news, an in-depth research program, and a comprehensive website: tdwi.org.

555 S Renton Village Place, Ste. 700
Renton, WA
info@tdwi.org
tdwi.org


More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

How To Make Sense Of Data With Altilia

How To Make Sense Of Data With Altilia HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to

More information

DATA VISUALIZATION AND DISCOVERY FOR BETTER BUSINESS DECISIONS

DATA VISUALIZATION AND DISCOVERY FOR BETTER BUSINESS DECISIONS TDWI research TDWI BEST PRACTICES REPORT THIRD QUARTER 2013 EXECUTIVE SUMMARY DATA VISUALIZATION AND DISCOVERY FOR BETTER BUSINESS DECISIONS By David Stodder tdwi.org EXECUTIVE SUMMARY Data Visualization

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013 Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software SC13, November, 2013 Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache

More information

Big Data must become a first class citizen in the enterprise

Big Data must become a first class citizen in the enterprise Big Data must become a first class citizen in the enterprise An Ovum white paper for Cloudera Publication Date: 14 January 2014 Author: Tony Baer SUMMARY Catalyst Ovum view Big Data analytics have caught

More information

How To Turn Big Data Into An Insight

How To Turn Big Data Into An Insight mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed

More information

The 3 questions to ask yourself about BIG DATA

The 3 questions to ask yourself about BIG DATA The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.

More information

From Lab to Factory: The Big Data Management Workbook

From Lab to Factory: The Big Data Management Workbook Executive Summary From Lab to Factory: The Big Data Management Workbook How to Operationalize Big Data Experiments in a Repeatable Way and Avoid Failures Executive Summary Businesses looking to uncover

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

CONNECTING DATA WITH BUSINESS

CONNECTING DATA WITH BUSINESS CONNECTING DATA WITH BUSINESS Big Data and Data Science consulting Business Value through Data Knowledge Synergic Partners is a specialized Big Data, Data Science and Data Engineering consultancy firm

More information

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013 Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the

More information

Architected Blended Big Data with Pentaho

Architected Blended Big Data with Pentaho Architected Blended Big Data with Pentaho A Solution Brief Copyright 2013 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information,

More information

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated

More information

BUSINESS RULES AND GAP ANALYSIS

BUSINESS RULES AND GAP ANALYSIS Leading the Evolution WHITE PAPER BUSINESS RULES AND GAP ANALYSIS Discovery and management of business rules avoids business disruptions WHITE PAPER BUSINESS RULES AND GAP ANALYSIS Business Situation More

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

Extend your analytic capabilities with SAP Predictive Analysis

Extend your analytic capabilities with SAP Predictive Analysis September 9 11, 2013 Anaheim, California Extend your analytic capabilities with SAP Predictive Analysis Charles Gadalla Learning Points Advanced analytics strategy at SAP Simplifying predictive analytics

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Addressing Open Source Big Data, Hadoop, and MapReduce limitations Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing hadoop distributions Going enterprise with Hadoop 2 How Big are Data?

More information

DATAMEER WHITE PAPER. Beyond BI. Big Data Analytic Use Cases

DATAMEER WHITE PAPER. Beyond BI. Big Data Analytic Use Cases DATAMEER WHITE PAPER Beyond BI Big Data Analytic Use Cases This white paper discusses the types and characteristics of big data analytics use cases, how they differ from traditional business intelligence

More information

Data Warehousing in the Age of Big Data

Data Warehousing in the Age of Big Data Data Warehousing in the Age of Big Data Krish Krishnan AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD * PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Morgan Kaufmann is an imprint of Elsevier

More information

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data: The Datafication Of Everything Thoughts Devices Processes Thoughts Things Processes Run the Business Organize data to do something

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Data Virtualization A Potential Antidote for Big Data Growing Pains

Data Virtualization A Potential Antidote for Big Data Growing Pains perspective Data Virtualization A Potential Antidote for Big Data Growing Pains Atul Shrivastava Abstract Enterprises are already facing challenges around data consolidation, heterogeneity, quality, and

More information

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Part I By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014 You may contact Sam Poozhikala at spoozhikala@stratapps.com.

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

The 4 Pillars of Technosoft s Big Data Practice

The 4 Pillars of Technosoft s Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed

More information

ROME, 17-10-2013 BIG DATA ANALYTICS

ROME, 17-10-2013 BIG DATA ANALYTICS ROME, 17-10-2013 BIG DATA ANALYTICS BIG DATA FOUNDATIONS Big Data is #1 on the 2012 and the 2013 list of most ambiguous terms - Global language monitor 2 BIG DATA FOUNDATIONS Big Data refers to data sets

More information

Big Data: Beyond the Hype

Big Data: Beyond the Hype Big Data: Beyond the Hype Why Big Data Matters to You WHITE PAPER Big Data: Beyond the Hype Why Big Data Matters to You By DataStax Corporation October 2011 Table of Contents Introduction...4 Big Data

More information

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling

More information

Achieving Business Value through Big Data Analytics Philip Russom

Achieving Business Value through Big Data Analytics Philip Russom Achieving Business Value through Big Data Analytics Philip Russom TDWI Research Director for Data Management October 3, 2012 Sponsor 2 Speakers Philip Russom Research Director, Data Management, TDWI Brian

More information

HadoopTM Analytics DDN

HadoopTM Analytics DDN DDN Solution Brief Accelerate> HadoopTM Analytics with the SFA Big Data Platform Organizations that need to extract value from all data can leverage the award winning SFA platform to really accelerate

More information

Big Data Comes of Age: Shifting to a Real-time Data Platform

Big Data Comes of Age: Shifting to a Real-time Data Platform An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) White Paper Prepared for SAP April 2013 IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Table of Contents Introduction... 1 Drivers of Change...

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

G-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions

G-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions G-Cloud Big Data Suite Powered by Pivotal December 2014 G-Cloud service definitions TABLE OF CONTENTS Service Overview... 3 Business Need... 6 Our Approach... 7 Service Management... 7 Vendor Accreditations/Awards...

More information

Analytics With Hadoop. SAS and Cloudera Starter Services: Visual Analytics and Visual Statistics

Analytics With Hadoop. SAS and Cloudera Starter Services: Visual Analytics and Visual Statistics Analytics With Hadoop SAS and Cloudera Starter Services: Visual Analytics and Visual Statistics Everything You Need to Get Started on Your First Hadoop Project SAS and Cloudera have identified the essential

More information

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2 Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue

More information

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform

More information

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE Current technology for Big Data allows organizations to dramatically improve return on investment (ROI) from their existing data warehouse environment.

More information

Ten Mistakes to Avoid

Ten Mistakes to Avoid EXCLUSIVELY FOR TDWI PREMIUM MEMBERS TDWI RESEARCH FOURTH QUARTER 2014 Ten Mistakes to Avoid When Democratizing BI and Analytics By David Stodder tdwi.org Ten Mistakes to Avoid When Democratizing BI and

More information

Business Intelligence for Big Data

Business Intelligence for Big Data Business Intelligence for Big Data Will Gorman, Vice President, Engineering May, 2011 2010, Pentaho. All Rights Reserved. www.pentaho.com. What is BI? Business Intelligence = reports, dashboards, analysis,

More information

Before You Buy: A Checklist for Evaluating Your Analytics Vendor

Before You Buy: A Checklist for Evaluating Your Analytics Vendor Executive Report Before You Buy: A Checklist for Evaluating Your Analytics Vendor By Dale Sanders Sr. Vice President Health Catalyst Embarking on an assessment with the knowledge of key, general criteria

More information

Big Data Big Data/Data Analytics & Software Development

Big Data Big Data/Data Analytics & Software Development Big Data Big Data/Data Analytics & Software Development Danairat T. danairat@gmail.com, 081-559-1446 1 Agenda Big Data Overview Business Cases and Benefits Hadoop Technology Architecture Big Data Development

More information

BEYOND BI: Big Data Analytic Use Cases

BEYOND BI: Big Data Analytic Use Cases BEYOND BI: Big Data Analytic Use Cases Big Data Analytics Use Cases This white paper discusses the types and characteristics of big data analytics use cases, how they differ from traditional business intelligence

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

Data Modeling in the Age of Big Data

Data Modeling in the Age of Big Data Data Modeling in the Age of Big Data Pete Stiglich Pete Stiglich is a principal at Clarity Solution Group. pstiglich@clarity-us.com Abstract With big data adoption accelerating and strong interest in NoSQL

More information

MicroStrategy Course Catalog

MicroStrategy Course Catalog MicroStrategy Course Catalog 1 microstrategy.com/education 3 MicroStrategy course matrix 4 MicroStrategy 9 8 MicroStrategy 10 table of contents MicroStrategy course matrix MICROSTRATEGY 9 MICROSTRATEGY

More information

FINANCIAL SERVICES: FRAUD MANAGEMENT A solution showcase

FINANCIAL SERVICES: FRAUD MANAGEMENT A solution showcase FINANCIAL SERVICES: FRAUD MANAGEMENT A solution showcase TECHNOLOGY OVERVIEW FRAUD MANAGE- MENT REFERENCE ARCHITECTURE This technology overview describes a complete infrastructure and application re-architecture

More information

PRIME DIMENSIONS. Revealing insights. Shaping the future.

PRIME DIMENSIONS. Revealing insights. Shaping the future. PRIME DIMENSIONS Revealing insights. Shaping the future. Service Offering Prime Dimensions offers expertise in the processes, tools, and techniques associated with: Data Management Business Intelligence

More information

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise An Oracle White Paper October 2011 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5

More information

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the

More information

Hadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop

Hadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop Hadoop Data Hubs and BI Supporting the migration from siloed reporting and BI to centralized services with Hadoop John Allen October 2014 Introduction John Allen; computer scientist Background in data

More information

Three Reasons Why Visual Data Discovery Falls Short

Three Reasons Why Visual Data Discovery Falls Short Three Reasons Why Visual Data Discovery Falls Short Vijay Anand, Director, Product Marketing Agenda Introduction to Self-Service Analytics and Concepts MicroStrategy Self-Service Analytics Product Offerings

More information

PDF PREVIEW EMERGING TECHNOLOGIES. Applying Technologies for Social Media Data Analysis

PDF PREVIEW EMERGING TECHNOLOGIES. Applying Technologies for Social Media Data Analysis VOLUME 34 BEST PRACTICES IN BUSINESS INTELLIGENCE AND DATA WAREHOUSING FROM LEADING SOLUTION PROVIDERS AND EXPERTS PDF PREVIEW IN EMERGING TECHNOLOGIES POWERFUL CASE STUDIES AND LESSONS LEARNED FOCUSING

More information

UNIFY YOUR (BIG) DATA

UNIFY YOUR (BIG) DATA UNIFY YOUR (BIG) DATA ANALYTIC STRATEGY GIVE ANY USER ANY ANALYTIC ON ANY DATA Scott Gnau President, Teradata Labs scott.gnau@teradata.com t Unify Your (Big) Data Analytic Strategy Technology excitement:

More information

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization

More information

The IBM Cognos Platform

The IBM Cognos Platform The IBM Cognos Platform Deliver complete, consistent, timely information to all your users, with cost-effective scale Highlights Reach all your information reliably and quickly Deliver a complete, consistent

More information

Scalable Enterprise Data Integration Your business agility depends on how fast you can access your complex data

Scalable Enterprise Data Integration Your business agility depends on how fast you can access your complex data Transforming Data into Intelligence Scalable Enterprise Data Integration Your business agility depends on how fast you can access your complex data Big Data Data Warehousing Data Governance and Quality

More information

Bringing the Power of SAS to Hadoop. White Paper

Bringing the Power of SAS to Hadoop. White Paper White Paper Bringing the Power of SAS to Hadoop Combine SAS World-Class Analytic Strength with Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities Contents Introduction... 1 What

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

Architecting your Business for Big Data Your Bridge to a Modern Information Architecture

Architecting your Business for Big Data Your Bridge to a Modern Information Architecture Architecting your Business for Big Data Your Bridge to a Modern Information Architecture Robert Stackowiak Vice President, Information Architecture & Big Data Oracle Safe Harbor Statement The following

More information

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out Big Data Challenges and Success Factors Deloitte Analytics Your data, inside out Big Data refers to the set of problems and subsequent technologies developed to solve them that are hard or expensive to

More information

Testing 3Vs (Volume, Variety and Velocity) of Big Data

Testing 3Vs (Volume, Variety and Velocity) of Big Data Testing 3Vs (Volume, Variety and Velocity) of Big Data 1 A lot happens in the Digital World in 60 seconds 2 What is Big Data Big Data refers to data sets whose size is beyond the ability of commonly used

More information

How to Run a Successful Big Data POC in 6 Weeks

How to Run a Successful Big Data POC in 6 Weeks Executive Summary How to Run a Successful Big Data POC in 6 Weeks A Practical Workbook to Deploy Your First Proof of Concept and Avoid Early Failure Executive Summary As big data technologies move into

More information

Understanding the Value of In-Memory in the IT Landscape

Understanding the Value of In-Memory in the IT Landscape February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to

More information

Discovering Business Insights in Big Data Using SQL-MapReduce

Discovering Business Insights in Big Data Using SQL-MapReduce Discovering Business Insights in Big Data Using SQL-MapReduce A Technical Whitepaper Rick F. van der Lans Independent Business Intelligence Analyst R20/Consultancy July 2013 Sponsored by Copyright 2013

More information

Predictive Analytics: Revolutionizing Business Decision Making

Predictive Analytics: Revolutionizing Business Decision Making NOVEMBER 2014 TDWI E-Book Predictive Analytics: Revolutionizing Business Decision Making 1 Q&A: Predictive Analytics 101 3 Who Should Be Building Predictive Models? 5 Exploratory Predictive Analytics 8

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

How To Learn To Use Big Data

How To Learn To Use Big Data Information Technologies Programs Big Data Specialized Studies Accelerate Your Career extension.uci.edu/bigdata Offered in partnership with University of California, Irvine Extension s professional certificate

More information

Big Data for Investment Research Management

Big Data for Investment Research Management IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment Management firms turn big data into actionable

More information