Grameen Foundation's Savings Seminar Data Analytics: Answering business questions with data Oct 22nd, 2013 Washington DC
Speakers: Tanaya Kilara, Financial Sector Analyst at CGAP; Jacobo Menajovsky, Senior Data Analyst at Grameen Foundation
The Role of Data Grameen Foundation Savings Seminar October 22, 2013
Warm-up Quiz How long does it take Google to get 2 million queries? How much do consumers spend on web shopping in an hour? How many emails are sent in a minute?
More Data with Every Passing Day Big Data Analytics Modelling Data Mining
Significantly Better Analytical Capacity
Implications More Data Capacity to Analyze Gleaning Customer Insights Fitting Products to Needs Managing Risk Designing Customer Experience Optimizing Channel
Challenges in Financial Inclusion Banks Have customer data, need to build analytical capacity MFIs Need to build systems to capture and analyze data Telcos Have the capacity, need to use it to generate insights relevant to financial services
Asking the Right Questions What is the problem I am looking to solve? What types of data do I need to answer my question? How do I get the mix of data right (quant vs qual, internal vs external)? Data gives me the how. What methods to answer the why?
Advancing financial access for the world's poor www.cgap.org
Agenda Some guiding principles for doing Analytics Data is everywhere. Why? Applied statistics 101, concepts and most common problems and mistakes Using, mixing, benchmarking, visualizing and testing data to support decisions and respond to business questions A few guidelines to hypothesis testing using Excel
A few guiding principles Not all products are created equal. Not all customers have the same needs. Discovering customer profiles and usage patterns can support product and service (re)design. Understanding big trends and patterns in the portfolio can help organizations drive change and make decisions.
Data is everywhere
Data is everywhere Start small Think of data as signs and indicators, not as numbers in an Excel file All of us are using and modelling data all the time to make even the simplest decisions Put your questions first and then go to the data Don't overcomplicate things, but be careful because it is really easy to lie to yourself with statistics
It's really easy to lie to yourself with statistics
Statistical lies? Are you sure? The average annual salary of a Lakeside school graduate is around 2,000,000 per year.
What a class! Disclaimer: all names and annual salary figures are fake.
Outliers
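The effect behind the Lakeside example can be reproduced in a few lines. The sketch below uses invented salaries (hypothetical, like the figures on the slide) to show how a single extreme value pulls the mean far away from the median, which is why the median is often the safer summary when outliers are present.

```python
from statistics import mean, median

# Hypothetical annual salaries for a small graduating class (illustrative only).
salaries = [40_000, 45_000, 50_000, 55_000, 60_000]

# The same class with one extreme earner added: a single outlier.
with_outlier = salaries + [20_000_000]

mean_before = mean(salaries)
median_before = median(salaries)
mean_after = mean(with_outlier)
median_after = median(with_outlier)

print(f"Without outlier: mean={mean_before:,.0f}, median={median_before:,.0f}")
print(f"With outlier:    mean={mean_after:,.0f}, median={median_after:,.0f}")
```

With these made-up numbers the mean jumps from 50,000 to 3,375,000, while the median only moves from 50,000 to 52,500.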
How many households below the poverty line does your organization reach? Find out with the Progress out of Poverty Index (PPI) What is the PPI? A poverty measurement tool for organizations with a mission to serve the poor 10 easy-to-answer questions and a scoring system Provides the likelihood that the survey respondent's household is living below the poverty line Country-specific; there are PPIs for 45 countries Why use the PPI? With the PPI, your organization can: To download the PPI and learn more, visit: www.progressoutofpoverty.org
PPI as a segmentation tool - Survey for the Philippines Segmentation Family size Schooling Educational level Employment For the complete survey and look up tables go to: progressoutofpoverty.org
About the data we used From partners and public sources Financial, demographic and poverty data Transactional level Customer level Aggregated level Data comes in different formats, dirty and dispersed Requires a great amount of data manipulation and transformation
What are we doing with the data? Measuring poverty outreach and benchmarking against national figures. Tracking main trends like product performance, penetration, uptake, and dormancy levels. Discovering behavioral patterns and interactions in the data. Running models to discover main drivers of certain events. M&E, program and milestones tracking, etc.
Partners' overview and poverty outreach benchmarking India Cashpor 100K+ active savers R.232 (US$3.50) average savings balances <1% PAR 30 Philippines CARD Bank 750K+ active savers Php 2900 (US$65) average savings balances <3% PAR 30 96% of Cashpor's customers are living below the $2 line 48% of CARD Bank's customers are living below the $2.50 line
Scaling up savings - Some initial questions (CARD Bank) What did the savings business look like when the project started (and after)? a) What was their product offering and cross selling product penetration? b) What was CARD s strategy for scaling up savings? I. Customer base expansion? II. Product deepening and cross selling? III. Both?
Product penetration mapping at CARD Bank Before and after a) Before I. 300K accounts II. 97% monoproduct, only 2.5% cross sold into just one savings product b) After I. 750K accounts II. 84% monoproduct, 15% cross sold into 4 different savings products targeting 4 different customer segments Kids savings, Convenient access, Increased returns, Regular savings
A few business and social questions we wanted to answer with data
Which should be the main target segment when introducing a new savings product at CARD Bank and when? Cross sold profiling and customer lifecycle analysis Average savings by tenure (in years) and poverty level PPI Much higher cross sell penetration PPI Profile data Financial data
Is it possible to launch an aggressive customer expansion strategy without affecting poverty outreach?
Is ATM technology a barrier for the poorest customers? Transactional savings volume by channel and poverty level N=2,244
Is it possible that transactional fees had an effect on saving behaviors at Cashpor? How much are they saving? (average amount) Pay as you go Yearly fee: Unlimited transactions Last 12 months of activity N=21,731 N=64,841
Hypotheses can be rejected or supported, never proven Putting your data to the test Why is it important to test hypotheses and assumptions? What are the data and tools required to do so? What are the most common methods? Your questions and data will help you identify which tests you should apply. Use correlations to look at whether changes in one variable are accompanied by changes in another variable. Use the chi-square test to look at whether actual data differ from a random distribution. T tests can be used to compare two groups or treatments.
Is tenure correlated with the historic total number of loans disbursed? Correlation Correlation refers to any of a broad class of statistical relationships involving dependence. Dependence refers to any statistical relationship between two random variables or two sets of data. Number of loans disbursed Tenure (length as a customer in months) Pearson's correlation = 0.789, R² = 62%
Is tenure correlated with the historic total number of loans disbursed? Correlation Number of loans disbursed Above average loan takers Below average loan takers Tenure (length as a customer in months) Pearson's correlation = 0.789, R² = 62%
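As a rough illustration of how a figure like the one on the slide is obtained, the sketch below computes Pearson's correlation from first principles. The tenure and loan values are hypothetical and are not the data behind the slide; the function name `pearson_r` is also just an illustrative choice. In a spreadsheet, a function such as Excel's CORREL gives the same statistic.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical customers: tenure in months and total loans disbursed.
tenure_months = [6, 12, 18, 24, 36, 48, 60]
loans_disbursed = [1, 1, 2, 2, 4, 3, 6]

r = pearson_r(tenure_months, loans_disbursed)
r_squared = r ** 2  # share of variance explained by a linear fit
print(f"r = {r:.3f}, R^2 = {r_squared:.1%}")
```

R² is simply the square of r, which is how a correlation of 0.789 corresponds to roughly 62% on the slide. A high correlation does not by itself show that longer tenure causes more loans.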
Hypothesis: Are women in my portfolio poorer than men? Chi-Square test The Chi Square test tests a null hypothesis stating that the frequency distribution of certain events observed in a sample is consistent with a particular theoretical distribution.
Hypothesis: Are women in my portfolio poorer than men? Chi-Square test Hypothesis supported Pearson's Chi-Square= 0.0000000063
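A minimal sketch of a chi-square test of independence on a 2x2 table follows, assuming customers are cross-tabulated by gender and by whether they fall below a poverty line. The counts are hypothetical and the function name `chi_square_2x2` is illustrative; no continuity correction is applied. For one degree of freedom the p-value can be computed with the complementary error function, so only the standard library is needed.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts. Rows: women, men. Columns: below line, above line.
observed = [[300, 200],
            [150, 250]]

chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.2e}")
```

A small p-value only indicates that gender and poverty status are associated in the sample; the direction (which group is poorer) has to be read from the proportions in each row.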
Is there a significant difference on declared assets across poverty segments? T-tests T tests can be used to compare two groups or treatments.
Is there a significant difference on declared assets across poverty segments? T-tests Hypothesis supported Student's T
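The comparison of two groups can be sketched as follows. This example uses Welch's version of the t-test, which does not assume equal variances; the slides do not say which variant was used, so that choice is an assumption, as are the asset values and the function name `welch_t`. The code returns the t statistic and degrees of freedom; the p-value would then come from a t distribution, for example through a spreadsheet function such as Excel's T.TEST.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least 2 observations")
    # Sample variances (n - 1 denominator) scaled by group size.
    va_n = variance(group_a) / na
    vb_n = variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / sqrt(va_n + vb_n)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va_n + vb_n) ** 2 / (va_n ** 2 / (na - 1) + vb_n ** 2 / (nb - 1))
    return t, df

# Hypothetical declared assets (in local currency) for two poverty segments.
poorer_segment = [120, 150, 130, 160, 140]
less_poor_segment = [220, 250, 210, 260, 240]

t_stat, dof = welch_t(poorer_segment, less_poor_segment)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.2f}")
```

A t statistic far from zero relative to its degrees of freedom suggests the difference in means is unlikely to be due to sampling noise alone.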
Closing remarks Why is data analytics becoming critical for financial inclusion and development?
Q&A