Predictive Analytics Workshop With IBM SPSS Modeler
Introduction What Makes a Smarter City?
Objectives Smarter Public Safety with IBM
The Power of Predictive Analytics
What IBM Strives to Accomplish in Public Safety: Efficient Outcomes, Smarter Decisions, Actionable Intel
Give departments the ability to accommodate large volumes of data and put evidence-based input into operational decisions to proactively deploy limited resources: do more with what you have.
IBM Predictive Analytics The Process
Our Challenge: Common Scenario
The Problem
Public Safety: Property Crimes & Quality of Life
Analyst: Data Rich yet Information Poor
o Difficult to detect actionable patterns when those patterns often change or stop
o Forecasting provides the what, but not the why or how to affect the action
The Solution? What Predictive Analytics is NOT Crystal Ball? Crap Shoot? Predictive Policing? D.O.A.M.? Historic Dashboards? Minority Report?
Predictive Analytics is a Process
o Detect & Capture: Crime Data, RMS, Police Reports, Causal Factors
o Analyze & Predict: Predictive Software, Actionable Intel
o Engage & Act: Timely Intervention, Force Deployment, Save Time, FAST Decisions
A More Detailed View
Phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Final Report
Business Understanding
o Determine Business Objectives: Background; Business Objectives; Business Success Criteria
o Situation Assessment: Inventory of Resources; Requirements, Assumptions, and Constraints; Risks and Contingencies; Terminology; Costs and Benefits
o Determine Data Mining Goal: Data Mining Goals; Data Mining Success Criteria
o Produce Project Plan: Project Plan; Initial Assessment of Tools and Techniques
Data Understanding
o Collect Initial Data: Initial Data Collection Report
o Describe Data: Data Description Report
o Explore Data: Data Exploration Report
o Verify Data Quality: Data Quality Report
Data Preparation
o Data Set: Data Set Description
o Select Data: Rationale for Inclusion / Exclusion
o Clean Data: Data Cleaning Report
o Construct Data: Derived Attributes; Generated Records
o Integrate Data: Merged Data
o Format Data: Reformatted Data
Modeling
o Select Modelling Technique: Modelling Technique; Modelling Assumptions
o Generate Test Design: Test Design
o Build Model: Parameter Settings; Models; Model Description
o Assess Model: Model Assessment; Revised Parameter Settings
Evaluation
o Evaluate Results: Assessment of Data Mining Results w.r.t. Business Success Criteria; Approved Models
o Review Process: Review of Process
o Determine Next Steps: List of Possible Actions; Decision
Final Report
o Produce Final Report: Final Report; Final Presentation
o Review Project: Experience Documentation
Effort percentages shown on the slide: 15%, 40%, 25%, 10%, 5%, 5%
Sense & Respond* Predict & Act*
Use Cases One Analytical Workbench Endless Applications
Use Case 1: C.P.A.S.
Crime Prediction & Analytics Solution: Force Deployment
o Decrease the number of criminal incidents (property crimes, robberies, quality of life) through dynamic manpower deployment.
o Deter crime, decrease time to response, and arrest more criminals by deploying resources in the right place and at the right time.
o Discover new patterns in the data that are associated with specific crime types and trends.
o Incorporate ALL disparate data sources into one workbench:
  o Criminal incidents (RMS)
  o Calls for service
  o Officer case notes/narratives
  o Weather
  o Sporting events
  o City events, e.g. festivals, block parties, concerts
  o Paydays, Social Security checks, Disability
  o Holidays, 3-day weekends
  o Demographic data
  o Socio-economic data
  o Lunar cycle
Crime Prediction & Analytics Solution: Officer Safety
o Improve officer safety, and the efficiency and effectiveness of incident response
o Discover when and where to proactively place officers and limited resources to prevent ANY crime
o Generate Actionable Intel
Use Case 1: Example C.P.A.S. Deployment
Cycle: Capture, Predict, Act, with Feedback
Causal Factors/Triggers: City events; Paydays; Holiday, Marital Status
Crime and Offense data: Incidents; Offense; Date / Time; Dispatch zone
Enabling factors: Weather conditions; Police presence
Analyses: Risk analysis; Propensity/Likelihood; Scoring models; Forecasting; Recidivism...
CENTRAL COLLABORATION REPOSITORY
Operational Planning Process
Some Example Output**.
Use Case 1: Model Results Vehicle Burglary Predictors Shift Day of Week District Days in Prior Week with Vehicle Burglaries Expected Subtle Prior Week incidents with these Crime types Weekend (Friday to Sunday) Christmas Eve
Use Case 1: Model Results POC Model Results: Some inputs NOT useful for Vehicle Burglary Prediction Month Temperature is not predictive for Vehicle Burglary Social Security Check Distribution Prior week incidents with these crime types Valentine's Day
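Several of the vehicle burglary predictors named above (days in the prior week with vehicle burglaries, the Friday to Sunday weekend flag, Christmas Eve) are derived from incident dates. The sketch below shows one way such fields could be computed in plain Python. It is illustrative only: the function names, the field names, and the sample dates are assumptions, and it does not represent how SPSS Modeler or the POC built these inputs.

```python
from datetime import date, timedelta


def days_in_prior_week_with_incidents(incident_dates, target_day):
    """Count distinct days with at least one incident in the 7 days before target_day."""
    window = {target_day - timedelta(days=n) for n in range(1, 8)}
    return len(window & set(incident_dates))


def is_weekend(day):
    """Weekend as defined on the slide: Friday through Sunday."""
    return day.weekday() >= 4  # Monday=0, Friday=4, Sunday=6


def is_christmas_eve(day):
    return day.month == 12 and day.day == 24


# Hypothetical vehicle burglary dates for one district (not workshop data).
incidents = [date(2024, 3, 4), date(2024, 3, 4), date(2024, 3, 6), date(2024, 2, 20)]
target = date(2024, 3, 8)  # a Friday

features = {
    "days_prior_week": days_in_prior_week_with_incidents(incidents, target),
    "weekend": is_weekend(target),
    "christmas_eve": is_christmas_eve(target),
}
print(features)
```

Counting distinct days rather than raw incidents matches the wording of the predictor; two burglaries on the same day count once.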
Use Case 2: Text Analytics Turn unstructured officer notes and narratives into useable and searchable context-rich content with Text Analytics.
The Evolution of Text Analytics Technology
Example narrative: Mr. Smith aka Mr. Ahmed was seen on the corner of Church St. and Magnolia Ave. on Nov 13th
Timeline: '70s, '80s, '90s, Now
Stages: Bag of «Words» extraction; Expressions extraction; Named Entities extraction; Events/Sentiment Extraction
Extraction outputs shown on the slide:
o Mr. Smith (Person) -> aka (Alias) -> Mr. Ahmed (Person) was seen (location) -> Church and Magnolia (address) -> November 13 (Date)
o Mr. Smith aka was seen with Ahmed on the corner of Church Etc.
o Mr. Smith was seen Mr. Ahmed corner Church St. Magnolia Ave. Nov 13th
o Named Entities extraction: Mr. Smith -> Person; Mr. Ahmed -> Person; aka -> Alias; was seen -> location; Church St. -> Address; Magnolia Ave. -> Address; Nov 13th -> Date
Combined with structured data: Mr. Ahmed in database wanted for questioning; Suspect -> send agent to this location
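To make the named entity labels above concrete, here is a toy pattern-based sketch that tags the example narrative with Person, Alias, Address, and Date. It assumes a handful of hand-written regular expressions that fit this one sentence; it is not the NLP library behind IBM's Text Analytics, and real narratives would need far richer rules or a trained model.

```python
import re

NARRATIVE = ("Mr. Smith aka Mr. Ahmed was seen on the corner of "
             "Church St. and Magnolia Ave. on Nov 13th")

# Hypothetical patterns, written only for this example sentence.
PATTERNS = {
    "Person": r"\bMr\. [A-Z][a-z]+",
    "Alias": r"\baka\b",
    "Address": r"\b[A-Z][a-z]+ (?:St|Ave)\.",
    "Date": r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}(?:st|nd|rd|th)?\b",
}


def extract_entities(text):
    """Return (matched text, label) pairs in the order they appear in the text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            hits.append((match.start(), match.group(), label))
    hits.sort()  # order by position in the narrative
    return [(value, label) for _, value, label in hits]


entities = extract_entities(NARRATIVE)
for value, label in entities:
    print(f"{value} -> {label}")
```

The tagged pairs can then be joined against structured records, for example looking up each Person value in a suspect database.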
The IBM Solution
Push a button and learn from your narratives. For example:
o How many copper wire thefts have there been in the last year? Where have they occurred? Is there a consistent time of day or day of week? Do we have any suspect descriptions? What type of locations are being hit?
o List all burglaries in the past 9 months involving a blue Honda Civic as the getaway vehicle. How many incidents are there? Are the incidents in the same general location or did they travel during the 9 months? Do we have any suspect descriptions? What type of buildings are being targeted? Do we have any descriptions of property stolen?
Way beyond keyword searches. Use NLP to extract concepts and key intelligence.
Will require IBM consultants or your analysts to customize the NLP library to your data and build the web reports. One size fits all does not work with narratives.
Use Case 3: Entity Analytics
3634 Suspects
Fields Used in EA Resolution    % Missing in PD Database
LAST                            0.7
FIRST                           0.4
MIDDLE                          63.9
RACE                            0.1
SEX                             0.1
DOB                             2.4
ADDR                            0.9
DRLIC                           63.8
PHONE                           59.1
Results
o Found 100% of the duplicate suspects.
o 100% accurate when declaring a suspect a duplicate.
(Based on a 10% verification sample.)
Why IBM?
The IBM Difference
Achieve Multiple Public Safety Initiatives with 1 Solution for Predictive & Advanced Analytics From Seasoned Software Partner IBM SPSS
o Predict and reduce crime rates, improve officer safety and increase quality of life with custom Force Deployment Models. FOR ANY CRIME TYPE.
o Improve closure rates on unsolved cases with Investigative Aid Models.
o Resolve duplicate identities with Entity Analytics technology.
o Turn unstructured officer notes and narratives into useable and searchable context-rich intelligence with Text Analytics. NOT A WORD SEARCH. MUCH SMARTER.
o Make informed data-based decisions with Decision Models.
o Perform traditional reporting, summarizing, mapping and forecasting tasks quickly and confidently, ad hoc or automated on a schedule.
Cost Benefit Analysis
Cost: How IBM Helps
o Officer Safety and Crime Rates: What if you could keep officers safe while decreasing time to response and lowering crime rates?
o Quality of Life: What if you could create safer, more desirable neighborhoods?
o Wasted Overtime: What if you could reduce current Overtime costs by 5%?
o Crime Prevention & Protection: What if you could predict and prevent ANY crime type?
o New Annual Costs: What if you could cut crime without adding annual costs?
Every New Year's Eve, a PD on the East Coast would experience an increase in random gunfire. Police began looking at data gathered over the years, and based on that information, they were able to anticipate the time, location and nature of future incidents. One year on New Year's Eve, this PD placed officers at those locations to prevent crime and respond more rapidly. The result was a 47 percent decrease in random gunfire and a 246 percent increase in weapons seized. The department saved $15,000 in personnel costs in a single day.
Nucleus Research: The Real ROI from IBM
o 94% of clients achieved a positive ROI, with an average payback period of 10.7 months
o Key benefits achieved include reduced costs, increased productivity, improved citizen & employee satisfaction and safety.
o 81% of projects deployed on time, 75% on or under budget
"This is one of the highest ROI scores Nucleus has ever seen in its Real ROI series of research reports." Rebecca Wettemann, Vice President of Research, Nucleus Research
Appendix
Use Case 2: Investigative Aid
Improve closure rates on unsolved cases with Investigative Aid Models.
o Analyst feeds robbery details and suspect information into the model.
o Model produces a report that includes a suspect list based on historic crime patterns.
o Investigator uses the report to guide the investigation.
o Model assigns the incident to a robbery signature.
o Model uses historical data to find the most likely suspect signature given the crime signature.
Test Run and Results Rank = Rank of real perpetrator in the ordered suspect list. In 18 of the 40 cases the perpetrator was in the top 5 of the predictive list! (45% hit rate for top 5) In 29 of the 40 cases the perpetrator was in the top 20 of the predictive list! (73% hit rate for top 20)
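The hit rates quoted above are simple proportions: the share of test cases in which the real perpetrator's rank falls within the top k of the ordered suspect list. The sketch below shows that calculation. The helper name and the five example ranks are hypothetical; only the counts 18, 29, and 40 come from the slide.

```python
def top_k_hit_rate(ranks, k):
    """Fraction of cases whose true perpetrator is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks for five test cases (not the workshop's data).
example_ranks = [1, 3, 7, 18, 42]
top5 = top_k_hit_rate(example_ranks, 5)    # 2 of 5 cases
top20 = top_k_hit_rate(example_ranks, 20)  # 4 of 5 cases

# Counts reported on the slide: 18 and 29 hits out of 40 cases.
slide_top5 = 18 / 40    # 0.45, reported as 45%
slide_top20 = 29 / 40   # 0.725, reported as 73% after rounding
print(top5, top20, slide_top5, slide_top20)
```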
IBM SPSS gradually builds a wide range of benefits to the agency and the community.
Our research suggests that for a typical city there is $185M to $231M* per year of potential value waiting to be seized:
o 12% Societal benefits
o 10% Criminal justice cost savings
o 46% Victim cost avoidance
o 8% Improved resource deployment
o 23% Employee efficiency/productivity
o 1% Agency cost savings
*Source: IBM Center For Applied Insights. November 2011. The potential benefits above are modeled using publicly and privately available data. These potential benefits reflect a relative result based on a specific set of data and assumptions. Therefore, potential benefits will vary by organization and are not guaranteed.
In our example, a five-year program lifecycle breaks even in year two and provides a steady-state net cash flow of $174M in year five.
Example 5-Year Phased Implementation: Annual Investments and Value Profile
Illustrative U.S. City police department* ($350M op costs)
[Chart: Annual Investments, Annual Value, and Net Cash Flow for Year 1 through Year 5; net cash flow reaches $173.7 in Year 5. Data labels: $77M, $99.4M, $20.1M, $67.7M, ($3.8M), $0, ($1.3M), $10.6M, $30.9M, ($3.8M), ($8.2M), ($9.5M), ($10.8M), ($2.7M)]
ROI: 94% (3 years); 498% (5 years)
Payback period: Year 3
*Source: IBM Center For Applied Insights. November 2011.
IBM SPSS also delivers many other important benefits that are not monetized.
o 21% improved officer safety
o 22% improved organizational image
o 5% improved employee satisfaction
o 15% faster response
o 10% more arrests
o 8% improved community engagement
*Source: IBM Center For Applied Insights. November 2011. These results are from modeling an Illustrative U.S. City Police department ($350M op costs, 4250 employees).
Appendix Additional Use Cases
Use Case 3: Corrections
Combat Recidivism
o Decrease recidivism rates by predicting which inmates are highly likely to become repeat offenders.
o Make optimal prescriptive rehabilitation decisions, individualized for each offender, through the use of historically sound, evidence-based recidivism risk scores.
Optimize Pathway Program Effectiveness
o Determine which rehabilitation (pathway) programs and decisions are leading to success by analyzing the catalogue of past rehabilitation decisions against success factors.
o Automatically recommend to officers, case managers, and prosecuting attorneys the optimal prescriptive action for each offender using modern decision management technology.
Update Offender Forecast
o Develop reliable forecasts quickly and in a smart way regardless of the size of the dataset or number of variables
o Update and manage ALL forecasts and models efficiently by reducing errors and automating appropriate model selection and parameters
o Deliver standard and ad hoc reports in high-resolution graphs and communicate results effectively
DATA BASED OUTCOME MEASUREMENT AND DECISIONS
Use Case: Entity Analytics
Resolve identity conflicts with Entity Analytics technology.
Key benefits for law enforcement
o Close cases quicker
o Improve officer safety
o Leverage the complete suspect profile: correct name, address, phone, etc.
o Assist crime analysts in background searches
Entity Analytics functionality is robust and developed over many years
o GNR (Global Name Recognition)
o Define matching aggressiveness
o Pre-mapped features based on years of experience: SSN, phone, birthdate, etc.
Use Case: Entity Analytics
Suppose that you have the following records from two different sources, and are not sure whether they refer to the same person or different people.
Source 1: Record no.: 70001; Name: Jon Smith; Address: 123 Main Street; Driv. License: 0001133107
Source 2: Record no.: 9103; Name: JOHNATHAN Smith; Date of Birth: 06/17/1934; Telephone: 555-1212; Email: jls@mail.com; IP address: 9.50.18.77
No exact matches between the two records. However, if we introduce a third source, we find some common attributes.
Source 3: Record no.: 6251; Name: Jon Smith; Telephone: 555-1212; Driv. License: 0001133107
Source 3 matches Source 1 on driver's license (DL) and Source 2 on telephone.
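The three-source example can be sketched as a small record-linkage exercise: records that share an identifying value are merged, and merges chain transitively, so Source 1 and Source 2 end up together once Source 3 links to both. The code below uses a union-find structure over exact matches on driver's license and telephone. The record ids, the choice of match keys, and the function name are assumptions for illustration; the actual Entity Analytics engine (name recognition, tunable matching aggressiveness) is not shown here.

```python
from collections import defaultdict

records = [
    {"id": "S1-70001", "name": "Jon Smith", "address": "123 Main Street",
     "dl": "0001133107"},
    {"id": "S2-9103", "name": "JOHNATHAN Smith", "dob": "06/17/1934",
     "phone": "555-1212", "email": "jls@mail.com", "ip": "9.50.18.77"},
    {"id": "S3-6251", "name": "Jon Smith", "phone": "555-1212",
     "dl": "0001133107"},
]

MATCH_KEYS = ("dl", "phone")  # assumption: exact match on these fields links records


def resolve(records, keys=MATCH_KEYS):
    """Group record ids that are linked, directly or transitively, by a shared key value."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}  # (key, value) -> id of the first record carrying that value
    for r in records:
        for key in keys:
            value = r.get(key)
            if value is None:
                continue
            if (key, value) in first_seen:
                parent[find(r["id"])] = find(first_seen[(key, value)])
            else:
                first_seen[(key, value)] = r["id"]

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r["id"])
    return sorted(sorted(group) for group in groups.values())


entities_two = resolve(records[:2])  # Sources 1 and 2 only: no shared key value
entities_three = resolve(records)    # Source 3 bridges the other two
print(entities_two)
print(entities_three)
```

With only the first two sources the records stay separate; adding the third collapses all three into one entity.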
Use Case: Entity Analytics
Example: Entity Analytics POC
3634 Suspects
Fields Used in EA Resolution    % Missing in PD Database
LAST                            0.7
FIRST                           0.4
MIDDLE                          63.9
RACE                            0.1
SEX                             0.1
DOB                             2.4
ADDR                            0.9
DRLIC                           63.8
PHONE                           59.1
Use Case: Text Analytics Turn unstructured officer notes and narratives into useable and searchable context-rich content with Text Analytics.
Use Case: Reporting, Mapping, Forecasting, Analysis Perform traditional reporting, summarizing, mapping and forecasting tasks quickly and confidently ad hoc or automated on a schedule.
Summary
What you get:
o Leverage all department data and outside data sources too
o Automatic Pattern Discovery (not manual)
o Operationalize Predictive Models (not reactive or based on counts of crime alone)
Where you apply it:
o Targeted force deployment
o Suspect lead generation
o Resolving multiple entities
o Turn open-ended text into searchable, usable intelligence
o Decision models
o Traditional reporting, summarizing, forecasting and mapping