Skills shortage, training present pitfalls for big data analytics




In this Tip Guide, readers will get tips on how to avoid these big data pitfalls.

By: Beth Stackpole, Contributor

The biggest challenges related to big data analytics, according to consultants and IT managers, boil down to a simple one-two punch: The technology is still fairly raw and user-unfriendly, and there aren't enough skilled experts to go around.

A lot of big data technologies -- like Hadoop and MapReduce -- hail from the open source world, developed by Internet pioneers such as Google and Yahoo to take on the problem of cost-effectively processing large volumes of information, including both structured and unstructured data. As a result of this orientation, most of the technologies lack the maturity and accessibility of traditional databases and data management suites, and there is still a limited selection of complementary analytics tools available to make these environments feel familiar to many data warehousing and analytics professionals.

"There's a steep learning curve to all this, with a lot of new technologies and unwritten lore as to how to make things work," said Ron Bodkin, CEO of Think Big Analytics, a Mountain View, Calif.-based consulting firm specializing in big data analytics. "The majority of people are used to working with relational database management systems, which have a different model of storing and processing data."

While data management teams typically have a well-defined set of expertise around managing and organizing highly structured data and modeling and creating reports in SQL, those conventional skill sets don't translate well to the unstructured, flat-file part of the big data world, where command lines and NoSQL database technologies are the core building blocks of most of the emerging platforms.

"You have to be willing to get your hands dirty," said Will Duckworth, vice president of software engineering at comScore Inc., a Reston, Va.-based provider of Web analytics and marketing intelligence services that has developed and implemented a big data analytics strategy in recent years. "This isn't a fully shrink-wrapped product where you open the box, install it on servers and it runs fine. You need a good set of system administrators and solid practices around how to build out these environments."

Bring on the Ph.D.s

Much of what big data analytics brings to the table is based on predictive modeling or a look into future trends. But the discipline of developing the models for predictive analytics applications isn't within the skill set of the average business user or even the traditional business intelligence (BI) data analyst. In addition, much of the data is in a raw form, from sources such as Web activity logs or sensors. Thus, companies need access to a cadre of experts who are versed in statistical and mathematical principles to build advanced analytical models that can uncover trends and hidden patterns and actually make big data useful.

"Not only do you need the IT operational skills to be able to realize value, the biggest shortage we see around big data is data scientists -- people with Ph.D.s in statistics," said Brian Hopkins, a principal analyst at Forrester Research Inc. in Cambridge, Mass. "Most of the data is raw -- it's not something you can read and get value out of. There will always be a need for a skill set of people who know what to do with the raw information, and you have to build the acquisition of talent into the business case."

At comScore, where the company's business model is predicated on crunching through volumes of Web data to unearth trends for customers, many analytics users are trained in predictive modeling and are also technically savvy enough to understand the impact of a particular query on overall system performance. Others, however, didn't possess that level of expertise, Duckworth said. So comScore has invested time and money in reeducation efforts to orient them to think about the scale of the data and to spend time considering such details as data partitioning and load size when they're building models and queries.

At the same time, the company has designed its big data system with checks and balances. For example, if someone tries to run a query that could potentially crash the cluster, the system pops up a note to ensure that the user is fully aware of the ramifications of the planned job. "At scale, things break pretty fast," Duckworth said. ComScore has also brought in a packaged application that adds a SQL-like environment to its Hadoop big data analytics environment, so it feels more familiar to mainstream users.

Training was also an integral part of the big data analytics strategy for Zions Bancorporation, a commercial bank holding company based in Salt Lake City that has deployed big data technology to help it do modeling and risk management for various loan portfolios. Yet the training wasn't just about learning Hadoop skills or serving as a crash course in statistical science. Rather, a considerable amount of time and energy went into acclimating members of the technical team so they were able to comfortably transition to a totally new way of managing data.

"This is new technology that traditional and very conservative IT shops may be reluctant to implement," said Clint Johnson, who until recently was senior vice president of data warehousing, BI and analytics at Zions. "You have systems administrators or database administrators who've built an entire career around a particular skill set, and then you thrust some new technology at them and say they have to learn it. There are cultural challenges you have to deal with in terms of supporting the new model."
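
For readers who want a concrete picture of the kind of safeguard Duckworth describes, where a user is warned before a job that could overwhelm the cluster is allowed to run, the following is a minimal sketch in Python. It is not comScore's implementation, which the article does not detail. The `QueryPlan` class, the `preflight` function and the 5 TiB threshold are all hypothetical names and values chosen for illustration, and the size estimate (partitions times average partition size) is an assumed stand-in for whatever a real system would measure.

```python
from dataclasses import dataclass

# Hypothetical threshold: jobs estimated to read more than this
# must be explicitly confirmed. The figure is illustrative only.
WARN_BYTES = 5 * 1024**4  # 5 TiB


@dataclass
class QueryPlan:
    """Hypothetical summary of a planned job (not a real Hadoop API)."""
    partitions: int           # number of data partitions the job will read
    bytes_per_partition: int  # assumed average partition size, in bytes

    def estimated_bytes(self) -> int:
        # Rough load size: partitions read times average partition size.
        return self.partitions * self.bytes_per_partition


def preflight(plan: QueryPlan, confirmed: bool = False) -> str:
    """Return 'needs_confirmation' for a heavy, unconfirmed job, else 'run'."""
    if plan.estimated_bytes() > WARN_BYTES and not confirmed:
        return "needs_confirmation"
    return "run"


if __name__ == "__main__":
    small = QueryPlan(partitions=10, bytes_per_partition=1024**3)
    large = QueryPlan(partitions=10_000, bytes_per_partition=1024**3)
    print(preflight(small))                  # light job proceeds
    print(preflight(large))                  # heavy job is held for a warning
    print(preflight(large, confirmed=True))  # user acknowledged the warning
```

The point of the sketch is the design choice rather than the numbers: the check does not block the user outright, it only makes the cost of the planned job visible before it runs, which matches the "note" the article describes.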

ABOUT THE AUTHOR

Beth Stackpole is a freelance writer who has been covering the intersection of technology and business for 25-plus years for a variety of trade and business publications and websites.
