KNOWLEDGENT REPORT
2015 Big Data Survey: Current Implementation Challenges
INTRODUCTION

The amount of data in both the private and public domains is growing exponentially. Mobile devices, sensors, audio and video feeds, social media, and what has become known as the Internet of Things are all contributing to this increase in information variety, volume, and velocity. The significant increase in data in recent years, coupled with the development of new techniques and technologies to analyze it ("Big Data"), has enabled disruptive business models to flourish and is now spreading into more traditional corporate models and activities.

To better understand the challenges faced by organizations trying to leverage Big Data, Knowledgent recently conducted a survey designed to gauge the level of difficulty experienced in key areas that, in Knowledgent's view, are potential pain points. In this survey, we asked questions about the status of Big Data initiatives and projects and the value being received from these efforts.

The survey found that:
- Big Data continues to grow in importance despite significant obstacles.
- The combination of traditional and more unstructured data sources, together with advanced analytics, is contributing to the development of new business insights.
- Big Data initiatives are transitioning from proofs of concept to production.
- Over 60% of respondents indicated that Big Data initiatives were either very or extremely important to their organizations.

However, even with Big Data's growth and benefits, there are still significant challenges to organizational adoption:
- Resources, both human and other, continue to be a major constraint.
- Putting together an overall production-grade program, particularly those aspects related to standardizing processes, is a notable challenge.
- The Data Lake architecture needs to evolve and mature to better support end users.

2015 Knowledgent Group Inc.
IMPLEMENTATION STATUS

Based on the survey results, it is clear that Big Data is moving out of the experimental stage. The results indicate that by the end of 2015, the majority of respondents expect to be using Big Data in a production environment.

- Over 60% of respondents indicated that Big Data initiatives were either very or extremely important to their organizations.
- On an industry basis, the Financial Services and Healthcare sectors attributed more importance to Big Data than other sectors, while Insurance trailed.
- 25% of respondents reported having already implemented a Big Data solution, while most other respondents indicated they were within six months of doing so.
- Over 75% of respondents view Big Data as having gone from proof-of-concept to production.

Figure 1: Priority of Big Data initiatives. Question: "What is the priority of Big Data initiatives in your organization?" Series: Overall, Financial Services, Healthcare, Insurance. Responses: Extremely Unimportant, Very Unimportant, Somewhat Important, Very Important, Extremely Important.

Figure 2: Big Data Implementation Timeline. Question: "What is your organization's timeframe for implementing Big Data initiatives?" Responses: Already Implemented, Within 3 months, 3-6 months, 6-9 months, 9-12 months.

Figure 3: Implementation Status of Big Data Technology. Statement: "Big Data technology has moved from a proof-of-concept to a production capability." Responses: Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree.
WHAT IS THE VALUE BEING REALIZED?

In the early stages of Big Data evolution, there was an emphasis on the cost savings to be realized from open-source software and commodity hardware. However, it has always been Knowledgent's observation that the biggest value comes from gaining new analytical insights, particularly those gained by combining traditional, structured data with newer, non-tabular data formats.

- Most respondents agreed that Big Data is effectively enabling the combination of structured and unstructured data.
- Most respondents agreed that Big Data is driving the use of advanced analytics and leading to new analytical insights.
- Opinion was split on the value of Big Data as a data processing hardware/software cost-reduction strategy.

Figure 4: Realization of Big Data Value. Responses: Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree. Statements rated:
- Big Data is enabling the combination of unstructured and structured data
- Big Data initiatives are leading to analytical insights
- Big Data provides a cost-effective mechanism to process large volumes of information
- Big Data's primary benefit has been in the cost reduction for new data processing hardware
- Big Data is enabling the broader use of advanced analytics (for example, predictive...)
WHAT ARE THE CHALLENGES?

The survey showed very broad agreement that many aspects of implementing a Big Data solution remain at least somewhat challenging. The biggest challenge continues to be finding experienced resources.

- Over 55% of respondents identified finding resources with the required Big Data skills as either very or extremely challenging.
- Gaining buy-in from business stakeholders scored as the least challenging in this category.

Figure 5: Big Data Challenges. Responses: Not Challenging At All, Not Challenging, Somewhat Challenging, Very Challenging, Extremely Challenging. Items rated:
- Getting the appropriate infrastructure (hardware and software) installed and operational
- Finding qualified resources with the necessary Big Data skills
- Establishing the necessary processes to go from an experimental to a production grade environment
- Implementing the required data compliance policies
- Getting buy-in from internal business stakeholders
BIG DATA IT MANAGEMENT PERSPECTIVE

Respondents, perhaps unsurprisingly given the maturity of the domain, are finding the greatest challenges in the overall program development and management of Big Data initiatives. Under this umbrella, it seems that standard processes for data ingestion and transformation are still evolving.

- On average, at least 75% of respondents noted that many aspects of managing and operating a Big Data environment remain at least somewhat challenging.
- The most challenging aspect noted was developing the overall program.
- The least challenging was controlling access and privileges.

Figure 6: Challenges Faced by IT Managers. Responses: Not Challenging At All, Not Challenging, Somewhat Challenging, Very Challenging, Extremely Challenging. Items rated:
- Integrating your Big Data platform with the other data platforms in your environment (for...)
- Developing your Big Data management program
- Documenting your Big Data governance operating model
- Having standard processes for ingesting data into your Big Data environment
- Having standard processes for moving and transforming the data within your Big Data...
- Managing, monitoring, and logging who does what and when in your environment
BIG DATA END-USER SUPPORT PERSPECTIVE

Knowledgent crafted the questions in this section based on field observations across multiple projects, particularly those with some flavor of the Data Lake architecture. We have noted end-user challenges with locating data, understanding data, and requisitioning data for analytical use. We wanted to gauge whether our observations were being experienced more broadly.

- In this category, all aspects questioned remain at least somewhat challenging for at least 75% of respondents.
- This is entirely consistent with Knowledgent's experience and is one of the reasons we developed Kariba, our data and analytics platform.

Figure 7: End User Support Challenges. Responses: Not Challenging At All, Not Challenging, Somewhat Challenging, Very Challenging, Extremely Challenging. Items rated:
- Enabling end users to locate the data they need when they need it
- Providing end users with a self-service capability
- Providing the necessary metadata so that end users can understand where...
- Providing data profiling and quality metadata to inform end users on...
- Providing business-level context for end users to understand the data...
SURVEY METHOD AND DEMOGRAPHICS

The Big Data survey was a one-time survey conducted by Knowledgent from March 12 to April 9, 2015. The survey was targeted at IT practitioners with some exposure to Big Data technology. Survey candidates were asked to complete an online questionnaire of 27 questions hosted on Knowledgent's website. The questions were closed-ended, with answer options along a Likert scale.

A broad range of respondents took the survey. They represented many industry sectors and sizes of organization.

- Almost 100 people responded to the survey.
- Over one-third of the respondents came from the Financial Services sector.
- The Healthcare, Insurance, and Life Sciences sectors made up a further 30%.
- More than 50% of those taking the survey came from companies with $1B or more in revenue.
- Over 50% of respondents hold a position of manager or above.

KARIBA

Kariba is Knowledgent's software product that provides a Data and Analytic Self-Service platform. Kariba's keyword, faceted, and semantic search capabilities help users quickly and accurately find not only structured data like files and tables but also unstructured data like social media posts, chat logs, and web logs. In addition, users learn to trust and understand the data by viewing its quality and lineage and through additional information obtained from the broader user community, such as reviews and common use cases.

From an IT management point of view, Kariba provides utilities to rapidly ingest new data into the Data Lake. Typically, developing a new ingestion process can be a lengthy, customized development task. Kariba's configuration-driven ingestion mechanism provides a high level of automation and reuse, enabling easy and secure content ingestion from new sources.
For more on Kariba, visit: http://knowledgent.com/kariba/

New York, New York
Warren, New Jersey
Boston, Massachusetts
Toronto, Canada

www.knowledgent.com
2015 Knowledgent Group Inc. All rights reserved.