Gamification Meets Analytics With Kaggle




G00228640
Published: 1 June 2012
Analyst(s): Rita L. Sallam

This note describes how Kaggle is bringing "the collective" to "the predictive" to help companies overcome their advanced analytic skills gaps.

Key Findings

Pervasive and advanced analytics will become necessary for leading and analytically mature organizations that want to gain competitive advantage.
A lack of skills is a critical inhibitor to adopting, and deriving value from, advanced analytics.

Recommendations

Develop a plan to acquire the new roles and skills needed to support an advanced analytics strategy.
Identify analytic service providers that offer business process and industry-specific algorithms, analytic applications and/or data as a service to supplement internal skills, or as a completely outsourced solution to address the gap.
Leverage analytic contest platforms, such as Kaggle, to enlist the assistance of data scientists from around the globe in developing specialized analytics models.
Use a contest to test and prove out the value of analytics to your organization, before building a team and process to service these needs.
Weigh the costs and benefits of training existing staff in analytics against hiring consultants with the necessary expertise and experience.

Table of Contents

Analysis
The Advanced Analytics Trend
Kaggle's Approach
Potential Challenges
Who Should Care?
Recommended Reading

Analysis

The Advanced Analytics Trend

Widespread use of advanced analytics on increasingly large and diverse data will become necessary for leading and analytically mature organizations that want to differentiate, innovate and gain competitive advantage (see "ITScore for Business Intelligence and Performance Management"). Gartner defines advanced analytics as the analysis of structured data and content (such as text, images, video and audio), using sophisticated quantitative methods (for example, statistics, descriptive and predictive data mining, simulation and optimization) to produce insights that traditional approaches to business intelligence (BI), such as query and reporting, are unlikely to discover.

According to the customer reference survey for Gartner's "Magic Quadrant for Business Intelligence Platforms," organizations predominantly use BI technologies that measure the past, such as reporting, ad hoc analysis and dashboards. Around one-third of organizations report extensive use of diagnostic capabilities for interactive visualization and online analytical processing (OLAP). Only a small percentage of organizations (13%) currently report extensive use of predictive analytics.

This pattern will have to change as organizations seek to increase their use of advanced styles of analytics, which have significant potential to create business value and competitive advantage. However, in most organizations a lack of skills is the biggest barrier to success. One solution to this problem is to build and nurture data scientist skills and an advanced analytics core competency internally. Another is to use external service providers such as IBM's Business Analytics and Optimization (BAO), Deloitte Analytics and the newly funded vendor Mu Sigma.
These offer business process and industry-specific algorithms, analytics applications and data as a service to supplement internal skills, or as completely outsourced solutions to address the gap. A new option, however, is to use a crowdsourcing competition platform, such as Kaggle, to find the best predictive analytics models.

Aspiring to be the "PGA Tour of analytics," Kaggle has turned advanced analytics on big data into a sport. It has done this by using gamification concepts to devise a competition platform that connects companies that have big data and advanced analytics problems with the brainpower of 38,000 (and growing) data scientists. These data scientists come from all around the globe and compete for top rankings based on their competition participation. In this way, Kaggle is making scarce and often expensive expertise accessible and affordable to companies of any size.

Kaggle's Approach

After Netflix successfully held a $1 million contest to improve its algorithm for recommending movies based on someone's movie preferences, Kaggle was founded to help companies of any size run Netflix-like competitions. The customer supplies a dataset, tells Kaggle the question it wants answered, and decides how much prize money it is willing to offer. Kaggle shapes these inputs into a contest in which the data geeks of the world battle for intellectual supremacy, bragging rights and, of course, the prize money. In exchange, Kaggle charges a fixed fee plus a monthly fee, or it takes a percentage of the prize money, to set up and run the competition.

To date, its network of data scientists from around the globe has competed in over 100 contests sponsored by companies such as Allstate, Deloitte, Ford, Dunnhumby, Microsoft and NASA. The company says that its competitions have resulted in a 40% average improvement in benchmarks compared to existing algorithms.

Kaggle is an example of the trend to engage people using game mechanics. In this approach, commonly referred to as gamification, companies use design features found in games, such as competition, ratings, and extrinsic and intrinsic rewards, to engage a target audience. There is strong evidence that this approach can work. For example, NASA and the Royal Astronomical Society sponsored a cosmological image analysis competition to develop an algorithm that can mathematically detect dark matter in the universe. Dark matter is unobservable matter that doesn't reflect or emit light, but scientists believe it must be measured along with observable matter (for example, stars and planets) in order to accurately calculate the mass of the entire universe. After only one week of competition, Martin O'Leary, a PhD student in glaciology, created an algorithm that outperformed the algorithms most commonly used in astronomy for mapping dark matter, algorithms that NASA had been working on for 30 years (see "Competition Shines Light on Dark Matter" on the U.S. government's Office of Science and Technology Policy website). Had Martin been a consultant, he would have had little incentive to improve his algorithm any further. However, because of the gaming mechanics of Kaggle's competitions, within days someone else surpassed the performance of Martin's algorithm, forcing him to refine it even further.
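The leaderboard dynamic described above can be illustrated with a minimal sketch. It is not Kaggle's implementation: the `Leaderboard` class, its method names and the choice of root-mean-square error as the scoring metric are all assumptions made for illustration. The sketch scores each submission against answers held back by the organizer, keeps each entrant's best score, and reports the leader's relative improvement over the sponsor's existing benchmark.

```python
from dataclasses import dataclass, field
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences (lower is better)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


@dataclass
class Leaderboard:
    """Hypothetical contest leaderboard; names and metric are illustrative assumptions."""
    holdout: list            # hidden answers, known only to the organizer
    benchmark_error: float   # error of the sponsor's existing algorithm
    best: dict = field(default_factory=dict)  # entrant -> best (lowest) error so far

    def submit(self, entrant, predictions):
        """Score a submission and keep it only if it beats the entrant's previous best."""
        score = rmse(predictions, self.holdout)
        if entrant not in self.best or score < self.best[entrant]:
            self.best[entrant] = score
        return score

    def ranking(self):
        """Entrants ordered from lowest error (leader) to highest."""
        return sorted(self.best.items(), key=lambda item: item[1])

    def improvement_over_benchmark(self):
        """Relative error reduction of the current leader versus the benchmark."""
        leader_error = self.ranking()[0][1]
        return (self.benchmark_error - leader_error) / self.benchmark_error


if __name__ == "__main__":
    board = Leaderboard(holdout=[3.0, 5.0, 2.0, 4.0], benchmark_error=1.0)
    board.submit("first_entrant", [3.5, 4.5, 2.5, 3.5])
    board.submit("second_entrant", [3.2, 4.8, 2.2, 3.8])  # overtakes the leader
    board.submit("first_entrant", [3.1, 4.9, 2.1, 3.9])   # leader refines and retakes first place
    for name, error in board.ranking():
        print(f"{name}: {error:.3f}")
    print(f"improvement over benchmark: {board.improvement_over_benchmark():.0%}")
```

Because every entrant can see where they stand, each new best score gives the others a visible target to beat, which is the incentive a one-off consulting engagement lacks.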
This continual leapfrogging continues until every last drop of value has been squeezed out of the data, enabling the competition sponsors to achieve the best results from their data.

Other competitions include one run by Allstate, which offered a $10,000 prize to improve the way it priced automobile insurance policies. The goal of the competition was to identify attributes of a car that would make it more likely to be involved in an accident resulting in a bodily injury claim. Kaggle's largest competition to date, with a $3 million prize, is being run by Heritage Provider Network. The winner will be the person or team that, based on past insurance claim data, can most accurately forecast which patients will be admitted to a hospital within the following year. As an expansion of its services as an analytics outsourcer, Kaggle also plans to begin running recruiting contests for companies in need of permanent data scientist expertise.

The idea of crowdsourcing tough problems through competitions is not new. The X Prize Foundation has been holding competitions for everything from a $30 million prize for the team that can safely land a robot on the surface of the moon (Google Lunar X Prize), to a $10 million prize for the team that can create the most accurate consumer-based mobile health diagnosis application (Qualcomm Tricorder X Prize). Challenge.gov is another crowdsourcing contest platform, run by the U.S. government to find innovative solutions to government problems.

Potential Challenges

Kaggle's approach to applying the concept to analytics is unique but poses some challenges. First, to participate in contests, organizations have to supply the data, which is often very sensitive in nature. Providing open access to this data as part of a competition is not an option for many companies. Kaggle addresses this problem by offering companies the option of a private competition, to which it invites 10 to 15 of its most successful data scientists (those with the highest rankings based on other competitions), who must sign nondisclosure agreements with the company to participate. The option of private competitions protects sensitive data while at the same time helping Kaggle create barriers to entry against other analytics competition platforms. Data scientists are not exclusive to Kaggle, but only those who have competed successfully in Kaggle's open competitions (data scientists are scored and ranked as in any other sport or game) are invited to participate in the selective and prized private competitions. Unlike in the open competitions, data scientists selected for private competitions are compensated regardless of whether they win.

Second, data scientists must get permission to work outside their primary employment in order to participate in Kaggle competitions. This will likely be less of an issue for consultants and academics. In fact, this group might consider evolving their business models to supply resources to Kaggle as a profit center.

Kaggle believes that by providing incentives to participate actively in competitions, it can create a robust market for analytics competitions. This will attract the leading data scientists, which will attract the most competitions, which will give data scientists the best opportunities to make money and earn bragging rights, which will in turn attract more data scientists, and so on. The company hopes that by creating the largest and most active forum for competitions and data scientists, it will remain the de facto analytics competition platform of choice.

Who Should Care?
Whether you are a small organization without in-house advanced analytics expertise, or a large organization with internal skills that is looking for ways to improve on existing algorithms, a competition may be an attractive option for building and/or enhancing your company's advanced analytical capabilities. A competition may be less expensive than hiring a consulting firm, may allow you to flexibly enhance or add to internal staff as needed, and may get results quickly. Kaggle competitions have a track record of achieving benchmark-breaking results rapidly, often in weeks rather than months.

Recommended Reading

Some documents may not be available as part of your current Gartner subscription.

"Advanced Analytics: Predictive, Collaborative and Pervasive"
"Seek Information Patterns With Data Mining and Predictive Analytics"
"Predicts 2012: Skills Gap is the Biggest Challenge in Leveraging New Opportunities in Analytics and Performance Management"
"Decision Support Capabilities in Gartner's Business Analytics Framework"
"Ten Reasons to Reach Beyond Basic Business Intelligence"
"Best Practices in Analytics: Integrating Analytical Capabilities and Process Flows"
"Better Data and Analytics Saves UPS Millions of Dollars a Year"
"ITScore for Business Intelligence and Performance Management"

Regional Headquarters

Corporate Headquarters
56 Top Gallant Road
Stamford, CT 06902-7700
USA
+1 203 964 0096

European Headquarters
Tamesis
The Glanty
Egham
Surrey, TW20 9AW
UNITED KINGDOM
+44 1784 431611

Japan Headquarters
Gartner Japan Ltd.
Atago Green Hills MORI Tower 5F
2-5-1 Atago, Minato-ku
Tokyo 105-6205
JAPAN
+81 3 6430 1800

Latin America Headquarters
Gartner do Brazil
Av. das Nações Unidas, 12551
9 andar World Trade Center
04578-903 São Paulo SP
BRAZIL
+55 11 3443 1509

Asia/Pacific Headquarters
Gartner Australasia Pty. Ltd.
Level 9, 141 Walker Street
North Sydney
New South Wales 2060
AUSTRALIA
+61 2 9459 4600

© 2012 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. This publication may not be reproduced or distributed in any form without Gartner's prior written permission. The information contained in this publication has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information and shall have no liability for errors, omissions or inadequacies in such information. This publication consists of the opinions of Gartner's research organization and should not be construed as statements of fact. The opinions expressed herein are subject to change without notice. Although Gartner research may include a discussion of related legal issues, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a public company, and its shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner's Board of Directors may include senior managers of these firms or funds. Gartner research is produced independently by its research organization without input or influence from these firms, funds or their managers. For further information on the independence and integrity of Gartner research, see "Guiding Principles on Independence and Objectivity" on its website, http://www.gartner.com/technology/about/ombudsman/omb_guide2.jsp.