1 - Applying data analytics to actuarial and insurance challenges Thursday 25 June
2 Actuaries' core competence
Analyse the financial consequences of uncertain future events
How we do it: Happy principals!; Maths & Statistics; "I've seen this before"; Unique perspective; Solutions to complex financial problems; Finance & Economics; Exams/training; Core insurance/pensions work; Experience; Support from other disciplines
3 Common problems solved
Model uncertainty: we develop models to help prioritise and make business choices
Make data talk: we use data to derive business insights and drive evidence-based change
Quantify complex risks and uncertainty: we help quantify and manage volatile business risks and uncertainty
4 Big data and analytics are impacting insurance
The insurance sector has a long history of creating and leveraging mostly domain-specific, data-oriented models.
"Insurance is the most obvious industry about to explode with uses for big data" (Eric Schmidt, Google Chairman)
How is big data impacting the insurance market?
Shaping the competition landscape: the data explosion has created a new ecosystem of data sources incorporating a diverse range of new players, from telecommunications, retail and automotive companies to domain-specific data providers and aggregators.
Opening new market opportunities: applying analytical techniques to available data gives visibility of more forward-looking business opportunities. Examples include capital modelling, economic scenario simulations and catastrophe simulations.
Rethinking the products: organisations are looking at how big data can drive the development of new products, such as those based on longer-term policies (e.g. life assurance, consumer healthcare) and smart technology provision (e.g. car insurance).
5 Actuaries and the data science community
There is a vibrant data science/analytics community.
The data science community thrives on exchange and challenge, e.g. Kaggle as the main platform for data science competitions posted by Fortune 500 companies.
Actuaries are still focused on our own community.
Recent actuarial graduates are flexible and better trained to actively participate and benefit.
How can we get our members interested and involved in these communities?
7 The Titanic Competition
An ideal starting point for those interested in data analytics.
In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy.
No cash prize, but knowledge.
Tutorials are available in data analytics, R and Python.
The competition runs from 1st July to 31st December.
Contact Emily O'Gara if you would like to join.
8 Next Steps
Join the competition!
September: Titanic workshop
End-of-competition event: share your experience!
9 R is a language and environment for statistical computing and graphics. (www.r-project.org)
Why R? R is free, portable, flexible, widely used and fit for purpose. There are many libraries and tutorials freely available online.
What are the alternatives?
Free: RapidMiner, SAS University Edition, Julia...
Not so free: SAS, SPSS, Mathematica, Stata...
10 Applying data analytics to actuarial and insurance challenges Gábor Stikkel Society of Actuaries in Ireland, Featured Event
11 Who am I?
Results of the 1994/95 National Secondary School Academic Competition (Országos Középiskolai Tanulmányi Verseny) in mathematics:
1st prize: Németh Sándor Géza, year 4, Vác, Boronkay György Műszaki Szakközépiskola; preparing teacher: Benedek Ilona
2nd prize: Zaupper Bence, year 3, Győr, Krúdy Gyula Gimnázium és Vendéglátóipari Szki.; preparing teacher: Babarczi Imréné
3rd prize: Bányai Attila, year 4, Kaposvár, Eötvös Loránd Műszaki Középiskola; preparing teacher: Demeter László
4th: Kiss Béla, year 2, Vác, Boronkay György Műszaki Szakközépiskola; preparing teacher: Újvári István
5th: Kiss Olivér, year 4, Debrecen, Mechwart András Gépipari Műszaki Szki.; preparing teacher: dr. Rutovszky Ede
6th: Stikkel Gábor, year 3, Eger, Neumann János Közgazdasági Szki. és Gimnázium; preparing teacher: Máté Mihályné
12 Agenda
AXA Driver Telematics competition on Kaggle
Insurance-related data science competitions and use cases
Importance of data analytics in other industries
Basic skills required to be a data scientist
Profile of a typical data analyst
Software tools for applying data science
RStudio
A basic solution for the Titanic competition
Summary
13 Data science competitions
Netflix prize: USD to improve movie recommendations by 10%, 2009
Foundation of Kaggle, 2010
  Crowdsourcing predictive modeling
  Has members
  Sells the data analytics services of the best performers
  Hosted 179 competitions
14 AXA Driver Telematics Analysis
The intent of this competition is to develop an algorithmic signature of driving type: a "telematic fingerprint" capable of distinguishing when a trip was driven by a given driver.
Data available:
  2736 folders with 200 trips in each
  The majority of the trips in one folder belong to one driver
  Each trip is a sequence of x,y positions (all centered to origin, randomly rotated and flipped)
15 Evaluation
Submissions (probabilities of each trip belonging to a particular driver) are judged on the area under the receiver operating characteristic (ROC) curve (AUC).
The ROC area is calculated in a global manner (all predictions together), so you should aim to submit probabilities that are calibrated across drivers.
Interpretation of global AUC:
    count = 0
    for i = 1 to m
        for j = 1 to n
            if (p[i] > p[j]) count = count + 1
            else if (p[i] < p[j]) count = count - 1
    AUC = 0.5 + count / (2*m*n)
Here i runs over all m data points with true label 1, and j runs over all n data points with true label 0; p[i] and p[j] denote the probability score assigned by the classifier to data points i and j, respectively. Since count ranges from -m*n to m*n, the offset of 0.5 maps a perfect ranking to AUC = 1 and a random ranking to AUC = 0.5.
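The pairwise interpretation of AUC can be sketched in a few lines of Python. This is a minimal illustration, not the competition's scoring code; the function name `pairwise_auc` is hypothetical. It counts a win as 1 and a tie as 0.5, which is algebraically the same as 0.5 + (wins - losses) / (2*m*n).

```python
def pairwise_auc(scores, labels):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0
    ties = 0
    for p in pos:          # i runs over the m points with true label 1
        for q in neg:      # j runs over the n points with true label 0
            if p > q:
                wins += 1
            elif p == q:
                ties += 1
    # A tie counts as half a win, so constant scores give 0.5.
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

The double loop is O(m*n), which is fine for illustration but slow for a full submission; sorting-based implementations give the same number faster.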
16 Main strategy
Assume that all 200 trips in a folder were driven by the same driver and label them as 1
Select trips from other drivers and label them as 0 (the selection strategy can vary)
Run supervised learning on this dataset and gather the predicted probabilities for the first 200 trips
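The labelling step of this strategy might look like the following Python sketch. It assumes trips have already been turned into feature vectors and stored per driver in a dictionary; `build_training_set`, `features_by_driver` and `n_negatives` are hypothetical names, and uniform random sampling is only one of the possible selection strategies. The classifier itself is left out.

```python
import random


def build_training_set(features_by_driver, target_driver, n_negatives=200, seed=0):
    """Label the target driver's trips 1 and a random sample of other trips 0."""
    rng = random.Random(seed)
    positives = list(features_by_driver[target_driver])
    # Pool every trip that belongs to some other driver.
    pool = [
        trip
        for driver, trips in features_by_driver.items()
        if driver != target_driver
        for trip in trips
    ]
    negatives = rng.sample(pool, min(n_negatives, len(pool)))
    X = positives + negatives
    y = [1] * len(positives) + [0] * len(negatives)
    return X, y
```

A classifier fitted on `X, y` would then be asked for probabilities on the first `len(positives)` rows only, which are the target driver's own trips.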
17 Features: trip matching
[Figure: one example where the heuristic works and one where it does not]
Heuristic: rotate the trip so that its endpoint lies on the y-axis and check which rectangle it fits into
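The rotation step of this heuristic can be sketched as follows. This is an illustrative Python version under the assumption that a trip is a list of (x, y) tuples starting at the origin; the function names are hypothetical, and the rectangle sizes used on the slide are not given, so only the extents of the rotated trip are computed.

```python
import math


def rotate_endpoint_to_y_axis(trip):
    """Rotate a trip about the origin so its last point lies on the positive y-axis."""
    x_end, y_end = trip[-1]
    # Angle that turns the direction of the endpoint onto the +y direction.
    theta = math.pi / 2 - math.atan2(y_end, x_end)
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c - y * s, x * s + y * c) for x, y in trip]


def extents(trip):
    """Width and height of the axis-aligned bounding box of a trip."""
    xs = [p[0] for p in trip]
    ys = [p[1] for p in trip]
    return max(xs) - min(xs), max(ys) - min(ys)
```

Because the trips were also randomly flipped, a fuller version would probably mirror each rotated trip to a canonical side before comparing bounding boxes; that step is omitted here.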
18 Features - cont.
Distribution of
  acceleration (discretized: 0, 0.5, ..., 10 m/s²)
  deceleration (-10, -9.5, ..., 0 m/s²)
  speed (0, 2, ..., 50 m/s)
  angles (0, 0.025, ..., 0.5 radian)
Length of trip in seconds
Distance of trip in meters
Quantiles of speed (0, 10%, ..., 100%)
Horizontal extension of rotated trip
Vertical extension of rotated trip
Sum of angle changes
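Several of these features can be derived directly from the x,y positions. The Python sketch below assumes one position per second (so `dt=1.0` is an assumption, as are all function and key names) and uses the bin edges listed above; out-of-range values are clamped into the first or last bin, and quantiles use a simple nearest-rank rule.

```python
import math


def histogram(values, lo, hi, step):
    """Normalised histogram over [lo, hi) with bins of width step."""
    n_bins = int(round((hi - lo) / step))
    counts = [0] * n_bins
    for v in values:
        idx = int((v - lo) // step)
        counts[max(0, min(n_bins - 1, idx))] += 1  # clamp outliers
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * n_bins


def quantiles(values, probs):
    """Nearest-rank quantiles of a list of numbers."""
    ordered = sorted(values)
    if not ordered:
        return [0.0 for _ in probs]
    last = len(ordered) - 1
    return [ordered[int(p * last + 0.5)] for p in probs]


def trip_features(trip, dt=1.0):
    """Speed, acceleration and size features for one trip of (x, y) points."""
    steps = [math.hypot(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(trip, trip[1:])]
    speed = [d / dt for d in steps]
    accel = [(s1 - s0) / dt for s0, s1 in zip(speed, speed[1:])]
    return {
        "duration_s": len(steps) * dt,
        "distance_m": sum(steps),
        "speed_hist": histogram(speed, 0.0, 50.0, 2.0),
        "acc_hist": histogram([a for a in accel if a >= 0], 0.0, 10.0, 0.5),
        "dec_hist": histogram([a for a in accel if a < 0], -10.0, 0.0, 0.5),
        "speed_quantiles": quantiles(speed, [i / 10 for i in range(11)]),
    }
```

Concatenating the histogram and quantile lists with the scalar features gives one fixed-length vector per trip, which is the shape a Random Forest expects.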
19 Major milestones
In terms of AUC score:
  1st submission: heuristics of trip length
  Still heuristics: trip length and horizontal extension
  z-score on heuristics of trip shape, speed, acceleration
  Random Forest on acceleration and deceleration histograms
  RF on acceleration, deceleration, speed and trip length (2x200 random trips)
  RF on full feature set (20x40 random trips)
  Blending 10 different RFs on full feature set (20x40 random trips)
20 Winning features
AUC (7th place) can be achieved by using:
  The derivative of acceleration: jerk
  Centripetal acceleration
  Tangential acceleration
  Heading change distribution
  Stopping time
  Number of stops
  Ensembling with Lasso regression instead of just averaging
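The kinematic quantities in this list can be approximated with finite differences. The Python sketch below is an illustration under stated assumptions, not the 7th-place solution: it assumes one (x, y) point per second (`dt=1.0`) and treats speeds below a hypothetical `stop_speed` threshold as "stopped". Centripetal acceleration is taken as speed times turn rate. The Lasso ensembling step is not covered.

```python
import math


def motion_features(trip, dt=1.0, stop_speed=0.5):
    """Finite-difference kinematics for one trip of (x, y) points."""
    vel = [((x1 - x0) / dt, (y1 - y0) / dt)
           for (x0, y0), (x1, y1) in zip(trip, trip[1:])]
    speed = [math.hypot(vx, vy) for vx, vy in vel]
    # Tangential acceleration: change of speed along the path.
    tangential = [(s1 - s0) / dt for s0, s1 in zip(speed, speed[1:])]
    # Jerk: derivative of (tangential) acceleration.
    jerk = [(a1 - a0) / dt for a0, a1 in zip(tangential, tangential[1:])]
    # Signed heading change between consecutive velocity vectors.
    heading_change = [
        math.atan2(vx0 * vy1 - vy0 * vx1, vx0 * vx1 + vy0 * vy1)
        for (vx0, vy0), (vx1, vy1) in zip(vel, vel[1:])
    ]
    # Centripetal acceleration: v * omega, with omega = |heading change| / dt.
    centripetal = [s * abs(dth) / dt for s, dth in zip(speed[1:], heading_change)]
    stopped = [s < stop_speed for s in speed]
    # A stop starts wherever a stopped sample follows a moving one (or the start).
    n_stops = sum(1 for i, st in enumerate(stopped)
                  if st and (i == 0 or not stopped[i - 1]))
    return {
        "tangential": tangential,
        "jerk": jerk,
        "heading_change": heading_change,
        "centripetal": centripetal,
        "stopping_time_s": sum(stopped) * dt,
        "n_stops": n_stops,
    }
```

The per-sample lists would normally be summarised (for example as histograms or quantiles) before being passed to a model.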
21 Winning features and ideas Trip matching does count! Ramer-Douglas-Peucker algorithm:
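For reference, the Ramer-Douglas-Peucker algorithm simplifies a polyline by keeping only the points that deviate from a straight chord by more than a tolerance. The Python sketch below is a textbook recursive version; how exactly it was used for trip matching is not stated on the slide, and the name `rdp` and the tolerance `epsilon` (in the same units as the coordinates) are illustrative.

```python
import math


def _perp_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg = math.hypot(dx, dy)
    if seg == 0.0:
        # a and b coincide, so fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / seg


def rdp(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perp_distance(points[i], a, b)
        if d > dmax:
            idx, dmax = i, d
    if dmax > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = rdp(points[: idx + 1], epsilon)
        right = rdp(points[idx:], epsilon)
        return left[:-1] + right
    return [a, b]
```

Two trips over the same road should reduce to similar short polylines, which makes them cheaper to compare. For very long trips an iterative, stack-based variant avoids Python's recursion limit.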
22 Insurance-related competitions
23 Insurance-related big data use cases
PwC's approach to curing churn:
Smartcasco: cellphone-assisted car insurance (30% cheaper than normal)
Liberty Mutual Insurance and Nest partner to reward customers for protecting their homes with innovative technology
24 A possible application of a driver's signature
25 Importance of data analytics in other industries
"Data is the new oil"; "Data is the new gold"
The McKinsey Global Institute report on Big Data (2011) claims it is the next frontier for innovation, competition, and productivity
The trend is visible everywhere: from telecom through agriculture to psychology
Connected vehicle cloud platform for pay-as-you-go car insurance policies
Even following blogs on big data is time-consuming (Paco Nathan, Bernard Marr)
26 Data is the new gold
[Figure: Big Data, Gartner, August 2014]
27 Agenda
AXA Driver Telematics competition on Kaggle
Insurance-related data science competitions
Importance of data analytics in other industries
Basic skills required to be a data scientist
Profile of a typical data analyst
Software tools for applying data science
RStudio
A basic solution for the Titanic competition
Summary
28 Data science Venn diagram Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.
29 Skills required for data science
4 types of data science jobs with a breakdown of the 8 skills you need
The sexiest job of the 21st century
Coursera learnings
MIT edX course on data analytics with Spark
31 Framework for data science
32 Summary
It is fun to compete, especially when discussing ideas with your colleagues at the coffee machine
One will always find something new to learn in R
Running the AXA models on multicore: link to my code on GitHub
Feature engineering tests creativity
Data is the new gold
33 References
The competition's webpage: https://www.kaggle.com/c/axa-driver-telematics-analysis
My blog post about the AXA telematics competition
Andrei Olariu's blog
Connected vehicle cloud concept of Ericsson: https://www.youtube.com/watch?v=nk12lhw_siw
RStudio
34 "Prediction is very difficult, especially about the future" (Niels Bohr)