NAVIGATING THE BIG DATA JOURNEY



Making big data come alive

Big Data and Hadoop: Moving from Strategy to Production

London, Dublin, Mumbai, Boston, New York, Atlanta, Chicago, Salt Lake City, Silicon Valley
(650) 949-2350
thinkbiganalytics.com

While Hadoop is a maturing technology, many companies are still experimenting with big data initiatives, and others don't know exactly where or how to begin. As technology continuously changes, charting a successful course for big data investments is no easy task, even for organizations that have moved beyond the pilot phase.

How far along are you in your big data journey?

Wherever you are in the journey, any number of challenges can stand in the way of successfully harnessing the power of big data and maximizing its return, such as:
- Moving past big data strategy and into production
- Avoiding common stumbling points in the development cycle
- Determining which big data skills are needed now and in the future
- Aligning business and IT groups around common big data use cases
- Focusing on one particular use case to the detriment of others

Where to begin?

Any big data journey begins with developing a big data strategy that is paired with a business vision, with appropriate sponsorship from both IT and business. A best practice is to first have a strategy in place and then proceed to application development, followed by solution deployment.

As Hadoop and big data become more pervasive within the enterprise, organizations are developing their own ideas of how to move forward. Many companies have researched how they will use big data and even identified low-hanging fruit where they can derive quick value from one use case. The challenge with this approach is that they've developed a solution for a single use case, but not multiple initiatives with a long-term roadmap. This narrow focus can keep big data projects from realizing their full potential.
A strategy for success

A big data strategy should provide three core components:
- An analysis of how Hadoop can be used to drive business value
- A roadmap built on a collaborative vision from business and technology stakeholders
- A comprehensive architecture definition supporting differentiated use cases

A roadmap helps identify how both business and IT teams can drive value out of using Hadoop. It also helps determine architecture investments, decide which datasets will be landed first, and identify pilot stakeholders who will carry over to implementation. A roadmap should serve as a guide for moving the big data initiative forward over the next 12 months.

An architecture definition is not a detailed design, but rather a high-level architecture that identifies the core functions of a big data solution that will support the multiple use cases on the roadmap, use cases driven by the joint vision of business and technology teams.

When identifying use cases, dozens might surface. The goal is to pick the most valuable use cases in which to invest over the next 12 months. Also, keep in mind that your big data architecture and design will have to support all of the functions and features that will be needed, such as: real-time, long-term storage; data access management; datasets exported to Hadoop for end users; various tools and distributions; and metadata strategy. The system should be built for long-term value, not just a few use cases. Often, when your big data solution is built with top use cases in mind, the next set of use cases naturally follows down the road.
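The prioritization step described above (surfacing dozens of use cases and picking the most valuable ones for the next 12 months) could be sketched as a simple weighted-scoring exercise. This is only an illustration: the criteria names, weights, 1-5 scale, and example use cases are all hypothetical assumptions, not a scoring method taken from this paper.

```python
from dataclasses import dataclass

# Hypothetical criteria and weights, assumed for illustration only.
WEIGHTS = {"business_value": 0.5, "feasibility": 0.3, "data_readiness": 0.2}


@dataclass
class UseCase:
    name: str
    scores: dict  # criterion name -> score on an assumed 1-5 scale


def weighted_score(use_case: UseCase) -> float:
    """Return the weighted sum of a use case's criterion scores."""
    return sum(WEIGHTS[c] * use_case.scores[c] for c in WEIGHTS)


def prioritize(use_cases, top_n):
    """Rank use cases by weighted score, highest first, and keep the top N."""
    ranked = sorted(use_cases, key=weighted_score, reverse=True)
    return ranked[:top_n]


# Made-up candidates; a real workshop would supply its own list and scores.
candidates = [
    UseCase("yield analytics", {"business_value": 5, "feasibility": 3, "data_readiness": 4}),
    UseCase("log search", {"business_value": 3, "feasibility": 5, "data_readiness": 5}),
    UseCase("churn model", {"business_value": 4, "feasibility": 2, "data_readiness": 2}),
]

shortlist = prioritize(candidates, top_n=2)
for uc in shortlist:
    print(f"{uc.name}: {weighted_score(uc):.2f}")
```

Keeping the weights in one place makes it easy for business and technology stakeholders to debate them separately from the individual scores.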

From strategy to development

With the strategy created and guiding the way, the next phase is development. Targeting an initial use case that can be quickly implemented is a proven way to accelerate business value. One best practice is to start an initiative that will get end users involved early while also supporting long-term investment. Aiming too high and setting lofty goals can delay results. It's not acceptable to ask business users to wait 10 months or more for an application to be deployed.

Defining an operations support plan is also an important step, because your applications and operations teams will need to work closely together to build skills internally. Cluster setup, configuration, and ongoing support are critical to assist application development. In big data projects, the systems administrator will often be tapped to be the Hadoop administrator and will face a steep learning curve, which will also require ongoing support.

For many organizations, scaling activities around Hadoop and trying to develop production data flows can be an area of significant challenge. That's why checks and balances on data quality are essential: plan ahead for ways to ensure the data is accurate and reliable. It's important to consider what your end user will see and where data will be stored, and to examine application needs for how frequently business users require data to be pushed.

From development to production

The best practices outlined below are not new or revolutionary, but they are critical to achieving success when moving from big data development to production. There can be an explosion of data when the solution is deployed, which is why capacity planning is a must.
Start by creating test runs in the pre-production environment to estimate the footprint (disk space and memory) and how much data will be processed (ingesting, storing, processing in-memory). Using a few months of data can be a good start and may provide an idea of how much capacity a few years of data would consume.

Performance testing is another must. Performance on Hadoop can be estimated by running prepackaged jobs, such as TeraSort. Doing so gives the big data team a simple baseline for the expected throughput of the cluster. Tuning the cluster based on observations from these job runs is much easier than tuning the cluster with target applications. Cluster tuning that is specific to the hosted applications is still important, but should come after baseline cluster configuration and optimization.

Success in moving from big data development to production goes back to a strong partnership between application and operations teams, because daily workshops may be needed when beginning the data ingestion to analyze log data and troubleshoot issues.

Moving past big data dreaming and pilots and into full production requires firmly grasping the right analytic priorities, architecture, infrastructure, skills, and support. For more details, check out this big data and Hadoop webinar with Mike Portell, director of client services at Think Big.

Big Data Strategy and Roadmap: Think Big Engagement Model

Creating a big data strategy and roadmap with Think Big involves four phases.

Discovery
- Collaborative business and technology workshops: discuss data challenges, data opportunities, the value of being able to ask new questions of new datasets, etc.
- Identification and prioritization of high-value use cases: typically identify 50 to 60 use cases and determine which ones could deliver the most use and value.

Architecture Definition
- Architecture recommendations
- Finalize the use case list and perform criteria scoring

Readiness Analysis
- Development of capability definitions, including organization and training: identify skill gaps to develop training to properly use the new technology and tools, and maintain solutions after implementation.
- Analysis of use cases against current and future technology, organizational strategy, and data: identify access patterns by use case, which will then drive tools and applications on the cluster.

Roadmap & Recommendations
- Creation of a 12-month roadmap based on priorities: piecing together all components in a sequence plan with business and technology stakeholders, layering use cases and the length of time it will take to implement with the required investments.
- Identifying the different datasets that will land in a Hadoop cluster and understanding business milestones and how they align with the overall big data plan.
- Executive presentation: securing the investment needed to go forward with your big data initiatives.

Avoiding Big Data Pitfalls

Strategy
- Lack of business sponsorship for implementation: a big data strategy could be delayed or even dropped if support from the business has not been identified.
- Investing too early in a Hadoop cluster: experimenting is fine, but too many hands in a cluster will lead to misconfigurations and stall the move to implementation. It's also important to consider what the environment will be used for in the future and whether any remediation will be needed to move forward with application development.

Development
- The Hadoop ecosystem evolves quickly: new, major capabilities are introduced frequently, and it can be important to take advantage of them. Be aware of how changes in the environment will affect existing applications and whether they could lead to significant code remediation.
- Change management for analysts: it's important to get end users (data scientists and analysts) involved early by pushing new threads to them quickly and asking for feedback, so that when deployment happens, there will be fewer issues.

Production
- Data multiplication: duplicate data and metadata can quickly increase data size. In the beginning, storing as much data as possible will deliver the most value, but data multiplication will happen when the data is eventually ingested into a cluster. It's important to know how data will be multiplied once it is ingested and how to create checks and balances on the data so it remains manageable.
- Historical data load and performance: significant processing on ingest could lead to an unexpected historical data load timeline. This stage offers an opportunity to configure and tune the cluster and application to make the timeline more reasonable.

Making big data come alive

Think Big, a Teradata company, provides data science and engineering services that enable organizations to accelerate their time to value from big data. As the first big data services firm, Think Big's data scientists, data engineers, and project managers are trusted advisors to the world's most innovative companies. Visit thinkbig.teradata.com.
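The capacity-planning and baseline-testing advice in this section (measure a few months of data, extrapolate to a few years, account for data multiplication, and derive a throughput baseline from a job such as TeraSort) comes down to simple arithmetic, which can be sketched as follows. The replication factor of 3 is the HDFS default and TeraGen rows are 100 bytes each. Every other number here (monthly volume, growth rate, share of derived data, free-space headroom, benchmark runtime) is a hypothetical assumption chosen for illustration, and the function names are made up for this sketch.

```python
def projected_raw_tb(monthly_tb: float, months: int, monthly_growth: float = 0.0) -> float:
    """Total raw data ingested over `months`, with compound monthly growth."""
    total = 0.0
    volume = monthly_tb
    for _ in range(months):
        total += volume
        volume *= 1.0 + monthly_growth
    return total


def required_disk_tb(raw_tb: float, replication: int = 3,
                     derived_factor: float = 0.25, headroom: float = 0.2) -> float:
    """Disk needed once data multiplication and free-space headroom are included.

    replication: HDFS block replication (3 is the HDFS default).
    derived_factor: assumed extra space for duplicated or derived datasets,
        expressed as a fraction of the raw data.
    headroom: assumed fraction of cluster disk that should stay free.
    """
    stored = raw_tb * (1.0 + derived_factor) * replication
    return stored / (1.0 - headroom)


def throughput_mb_per_s(bytes_processed: int, elapsed_seconds: float) -> float:
    """Baseline throughput from a benchmark run such as TeraSort."""
    return bytes_processed / elapsed_seconds / 1_000_000


# Hypothetical observation: test runs averaged 2 TB of new data per month.
raw_three_years = projected_raw_tb(monthly_tb=2.0, months=36)
disk_three_years = required_disk_tb(raw_three_years)
print(f"raw data after 36 months: {raw_three_years:.1f} TB")
print(f"disk to provision: {disk_three_years:.1f} TB")

# Hypothetical benchmark: one billion 100-byte TeraGen rows sorted in 1,000 seconds.
baseline = throughput_mb_per_s(1_000_000_000 * 100, 1000.0)
print(f"baseline sort throughput: {baseline:.1f} MB/s")
```

Re-running the same arithmetic after each tuning change, or after real ingest volumes are known, shows whether the original estimate still holds.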

High Tech Manufacturer Implements Big Data Strategy to Improve Yields

A large manufacturer of external storage products had a gold mine of historical log data from its drives, but needed a way to unlock the tremendous value hidden in this drive DNA. Historical log data would drive new analytics from information that wasn't previously used.

The storage-technology manufacturer turned to Think Big to embark on a program to build a big data platform to reduce the amount of time engineers spend searching for data, facilitate large-scale analytics for yield improvement, and work with customers to identify problems before they happen. Think Big has worked with the manufacturer on big data strategy, architecture design, data lake implementation, and analytic solution development, helping deliver rapid value while also stabilizing the environment for production.

Over the course of two years, Think Big has helped the manufacturer move from strategy to development to production for two major big data applications. Through the platform, the manufacturer has uncovered opportunities to reduce scrap waste, speed time to market, and gain more timely analytics and insights. This unique end-to-end data analysis provides significant benefit to the bottom line through reduced development time, improved manufacturing yield, and increased customer satisfaction.

Think big. Start smart. Scale fast.

Contact us to learn more about how we can help you make big data come alive to deliver true business value.

EB-7095 0615