Data Analytics for Campaigns Assignment 1: Jan 6th, 2015 Due: Jan 13th, 2015




These are sample questions from a hiring exam that was developed for the OFA 2012 Analytics team. Plan on spending no more than 4 hours on this assignment, and feel free to use any online resources and tools/software you want. You are not expected to know everything in this exam. This will help me evaluate where everyone is in the beginning of the course and adapt the class to fit the needs and level of the students. Have fun!

Question 1:

You are an Elections Analyst for a major political campaign. You have the following information available to you:

- A national database with a record for each individual registered voter in the country including their name, address, registered party, past vote history, and demographic information (age, gender, ethnicity).
- A set of individual-level (voter-specific) propensity scores that your campaign's statistical modeling team has built using the database above.

Each voter in your database is given three distinct scores from your modeling team:

- Democratic support score (0-100): probability this individual will vote for the Democratic candidate rather than the Republican candidate, given that s/he casts a ballot for one of these two candidates.
- 2012 turnout score (0-100): probability this individual will cast a ballot in the next election.
- Persuasion score (0-10): probability this individual will switch his or her vote from supporting the Republican candidate to supporting the Democratic candidate in response to a single contact from the Democratic campaign. The score distribution from 0-10 indicates that the people most likely to switch from supporting the Republican to supporting the Democrat have a 10% probability of making that switch in response to a single contact from the Democratic campaign.

A. How would you use these scores and other information in the database to construct a universe of targets to contact for persuasion? How would you rank or tier persuasion targets in priority order? Who would you NOT want to contact for persuasion?

B. How would you use these scores and other information in the database to construct a universe of targets to contact for GOTV? How would you rank or tier GOTV targets in priority order? Who would you NOT want to contact for GOTV? (Note: GOTV stands for Get Out the Vote and its purpose is to increase turnout among targeted voters.)

C. One day, you receive an email from a senior campaign staffer who asks the following question:

"According to your support model, my friend has a support score of 50 (out of 100), but I know she always votes for Democratic candidates. Why has your model assigned her a score of only 50?"

How would you explain this to the campaign staffer? Assume this campaign staffer is a smart, educated person with extensive political experience and little or no background in statistics. Then suggest a different way that the campaign staffer can confirm the accuracy of the predictive model created by your colleagues.

Question 2:

It's early 2011, and you are an Elections Analyst for an off-year special election. You are asked to design a controlled experiment to measure the impact of a volunteer telephone call or a volunteer door-knock on voter turnout, that is, the likelihood that a contacted individual will vote in a given election.

You have available to you a voter file database of public information from the state Secretary of State. This database includes the name and address of every registered voter in the state. It also includes each individual's past vote history (which elections they did/did not vote in), the party with which they are registered to vote, their birthdate and gender. After the election, you will obtain an updated voter file with all of the same information and one additional field: whether they voted in this special election.

A. How would you design this experiment? What data would you collect? How would you supervise the conduct of this experiment? How would you use the post-election voter file from the state Secretary of State to determine the impact of voter contact on turnout in the 2011 election?

After winning the special election in 2011, the same candidate has to run again for the same office in 2012. In the 2012 election, you do NOT conduct an experiment. But after the election, the campaign manager asks you the same question: what was the impact of a phone call or door knock in 2012 on 2012 turnout? You obtain a new voter file including an additional field indicating whether each individual did or did not vote in the 2012 election. You also have additional fields that indicate whether each person was called or knocked in 2012, the date of attempted contact and the result of contact (not home, canvassed, etc.).

B. What is the single most important difference between the experiment in part A and this analysis? Is this analysis easier or harder? Why? How would you analyze this data? What caveats would you include with your analysis?

Question 3:

It's August 2012, and you're working as a Statistical Modeling Analyst for a state Democratic campaign. Your campaign has access to a state database with a record for each individual registered voter in the state including their name, address, party registration, past vote history, demographic information and more. This information has been combined with a recent telephone poll of 5K random constituents where each person was asked what candidate they planned on supporting: Democrat Herman Madison or Republican Martha Whistler. There are no other candidates.

Using this combined data set, one of your fellow modeling analysts has built a logistic regression model that predicts the probability an individual voter will support the Democrat. Below are coefficients from this model. The definitions of the variables are below.

Variable          Coefficient   Standard error   z score
Democrat            1.45          0.09            15.81
Republican         -2.11          0.1            -21.95
Ln_Income          -0.109         0.041           -2.63
Age                -0.013         0.0096          -1.4
Age_Sq              0.0001        0.00009          1.52
Census_College      1.77          0.33             5.37
AfAm                2.07          0.399            5.18
AfAm_Democrat      -0.872         0.437           -2.01
Constant            1.2           0.484            2.48

Democrat: Coded as 1 if the voter is a registered Democrat, 0 if he/she is not
Republican: Coded as 1 if the voter is a registered Republican, 0 if he/she is not
Ln_Income: The natural logarithm of the voter's income (in dollars)
Age: The voter's age (in years)
Age_Sq: The voter's age (in years) squared
Census_College: The percentage of residents in the voter's neighborhood who have a college degree (scaled from 0 to 100)
AfAm: Coded as 1 if the voter is African American, 0 if he/she is not
AfAm_Democrat: Coded as 1 if the voter is both African American and a registered Democrat, 0 if he/she is not
Constant: The constant term

A. Consider 4 voters, Adam, Bob, Chris and David. Adam and Chris share identical characteristics except for their incomes. Bob and David also share identical characteristics (with each other, not necessarily Adam and Chris), except for their incomes.
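As background for the parts of this question, a logistic regression turns coefficients into a probability by summing the constant and each coefficient times its variable to get the log-odds, then applying the inverse-logit function, p = 1 / (1 + exp(-log-odds)). The sketch below shows that mechanic with the coefficients from the table. The function names and the example voter are hypothetical, chosen only to keep the arithmetic easy to follow, and are not part of the exam.

```python
import math

# Coefficients copied from the table above.
COEFFICIENTS = {
    "Democrat": 1.45,
    "Republican": -2.11,
    "Ln_Income": -0.109,
    "Age": -0.013,
    "Age_Sq": 0.0001,
    "Census_College": 1.77,
    "AfAm": 2.07,
    "AfAm_Democrat": -0.872,
}
CONSTANT = 1.2


def linear_predictor(features):
    """Return the log-odds: the constant plus coefficient * value for each variable.

    Variables missing from `features` are treated as 0.
    """
    unknown = set(features) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    return CONSTANT + sum(
        coef * features.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )


def support_probability(features):
    """Map the log-odds to a probability with the inverse-logit function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(features)))


# Hypothetical voter (illustrative values only): a 40-year-old registered
# Republican with a $50,000 income; all other variables are left at 0.
example_voter = {
    "Republican": 1,
    "Ln_Income": math.log(50_000),
    "Age": 40,
    "Age_Sq": 40 ** 2,
}

print(f"log-odds:    {linear_predictor(example_voter):.3f}")
print(f"probability: {support_probability(example_voter):.3f}")
```

Because the inverse-logit is nonlinear, the same change in log-odds moves the probability by different amounts depending on where a voter starts, which is worth keeping in mind when reading the coefficients.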

Name     Income      Modeled Support
Adam     $50,000     50%
Bob      $200,000    50%
Chris    $40,000     ?
David    $190,000    ?

Based on the coefficients above, who would you think has a higher probability of supporting Herman Madison?

- Chris
- David
- They have the same probability
- Cannot tell based on the information provided

What is your reasoning? (You need not calculate an exact probability to answer this question. Just explain your reasoning in general terms.)

B. The coefficient for AfAm_Democrat is negative. How do you interpret this? Does this mean that African-American registered Democrats support Herman Madison at lower rates than African-American independents? What about relative to white registered Democrats?

C. How do we interpret the difference in support between voters of different ages? How do the variables in the model estimate such support?

D. Are there any variables in this model that you would choose to drop? Why or why not? Would you need more information in order to make this decision?

Question 4:

You are asked to predict the probability each individual registered voter will turn out to vote in the next election. You have a database that includes the following information for each registered voter:

- Name, street address, city, state, zip, phone
- Past vote history (whether or not the individual voted in each past election)
- Registered party (in states with party registration)
- Birthdate, gender
- Additional fields (such as education, income, ethnicity)

In addition to this database, you also have survey data for a small sample (ten thousand voters) indicating stated likelihood of voting in the next election and strength of support for the Democratic candidate. This survey sample has been matched to the larger database. You also have similar data from past elections.

How would you use this information to assign each individual registered voter a probability of voting in the next election? Would you use the first data set by itself or would you also use the second data set? How would you combine them if you decide to use both? How would you validate your model prior to the next election?

Question 5:

It's October, and you've been asked to build a statistical model to help identify likely supporters for the campaign's Get Out the Vote operation. The campaign's data team has assembled the attached dataset and your task is to use it to build this support score model. There are two steps:

PART A: Model building. You will build a model using some or all of the attached data (consider part B before starting part A).
PART B: Validation. You will validate this model using some or all of the attached data.

PART A:

For your convenience, we have put the file into Excel (attached). You may import or copy and paste this data into any statistics package of your choice (Stata, R, SAS, SPSS) to build your model. Your job is to produce a simple model that predicts the probability of identifying for the Democratic candidate based on the attached data. We have also included a data dictionary which defines each variable for your reference. Feel free to use not only the variables included in the attached data set, but also other variables built upon these (such as interactions or transformations).

The data may have some missing values. Please keep this in mind, and explain how you will deal with this missing data and missing data in general.

Please tell us what kind of models and algorithms you would consider and explain your choice of the model you decided to build. Once you have selected a single model type (regression, decision tree, support vector machine, etc.), please build at least two different variations of that model. For example, you may want to vary which variable(s) are included, or you may want to try a variable transformation or interaction. Please copy and paste the results of each variation into your MS Word document. Discuss why your final model is superior to other models you tried. For your final model, please explain what variables are most important and how the results should be interpreted.
Additionally, using your final model please create a column on the MS Excel spreadsheet that gives a probability that each voter will support the Democratic candidate (return the spreadsheet with your exam). Please note that we would like scores for those voters for whom the dependent variable (support_democrat) is missing.

Please describe one or more graphics you could generate to use as a diagnostic tool to evaluate the quality of the model or as a visual tool to demonstrate the effectiveness of the model. (You may create one or more of these graphics if you have extra time, but it is not necessary.)
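One diagnostic of the kind asked for here is a calibration plot: voters are grouped into score bins, and the mean predicted probability in each bin is compared with the observed support rate. The sketch below computes the numbers behind such a plot. It is only an illustration: the function name `calibration_table` is hypothetical, and the scores and outcomes are simulated rather than taken from the attached dataset.

```python
import random


def calibration_table(scores, outcomes, n_bins=10):
    """Group voters into equal-width score bins and compare the mean
    predicted probability with the observed support rate in each bin.

    Returns a list of (bin_index, count, mean_score, observed_rate) tuples.
    """
    bins = [[] for _ in range(n_bins)]
    for score, outcome in zip(scores, outcomes):
        index = min(int(score * n_bins), n_bins - 1)  # a score of 1.0 goes in the top bin
        bins[index].append((score, outcome))
    rows = []
    for index, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than divide by zero
        mean_score = sum(s for s, _ in members) / len(members)
        observed = sum(o for _, o in members) / len(members)
        rows.append((index, len(members), mean_score, observed))
    return rows


# Simulated (hypothetical) data: outcomes are drawn so that the scores
# are well calibrated by construction.
rng = random.Random(0)
scores = [rng.random() for _ in range(5000)]
outcomes = [1 if rng.random() < s else 0 for s in scores]

for index, count, mean_score, observed in calibration_table(scores, outcomes):
    print(f"bin {index}: n={count:4d} predicted={mean_score:.2f} observed={observed:.2f}")
```

Plotting `mean_score` against `observed` for each bin gives the calibration curve; a well-calibrated model stays close to the diagonal.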

If you had more time, what else would you do? What other variables would you ask for / construct? What other model specifications would you explore and why (briefly)?

PART B:

Use some or all of the attached data to validate your model. How well does your model validate? Why do you say that? Now suggest another way you could validate your model using external data rather than the attached data. What additional value would this validation provide?
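One common way to validate with the attached data alone is to hold out part of it before fitting, score the held-out voters with the fitted model, and summarize how well the scores rank supporters above non-supporters (the AUC). The sketch below shows the split and the AUC calculation only; the fitting step is left out. The function names and the small example records are hypothetical, and the pairwise AUC loop is written for clarity, not speed.

```python
import random


def split_holdout(records, holdout_fraction=0.3, seed=0):
    """Shuffle a copy of the records and return (training, holdout) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    cut = len(shuffled) - n_holdout
    return shuffled[:cut], shuffled[cut:]


def auc(scores, outcomes):
    """Probability that a randomly chosen supporter has a higher score than
    a randomly chosen non-supporter (ties count one half)."""
    positives = [s for s, o in zip(scores, outcomes) if o == 1]
    negatives = [s for s, o in zip(scores, outcomes) if o == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one supporter and one non-supporter")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical records: (voter_id, model_score, supported_democrat).
records = [
    (1, 0.91, 1), (2, 0.15, 0), (3, 0.72, 1), (4, 0.40, 0), (5, 0.66, 0),
    (6, 0.83, 1), (7, 0.22, 0), (8, 0.58, 1), (9, 0.35, 0), (10, 0.49, 1),
]

training, holdout = split_holdout(records)
print(f"{len(training)} training records, {len(holdout)} holdout records")
print(f"AUC on all example records: "
      f"{auc([r[1] for r in records], [r[2] for r in records]):.2f}")
```

In practice the model would be fit on `training` only and the AUC computed on `holdout`, so that the reported number reflects voters the model has not seen.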