1 03: What You Need to Know About Big Data: Understanding and Better Utilizing Data Analytics. Trainer(s): Mike Holland, NYU Center for Urban Science and Progress; Timothy Savage, NYU Center for Urban Science and Progress; Alan Mitchell, KPMG; Stephen C. Beatty, KPMG
2 What You Need to Know about Big Data: Understanding and Better Utilizing Data Analytics. Mike Holland, Tim Savage. March 7, 2015
3 Applied Sciences NYC. Applied Sciences NYC is the City's unparalleled opportunity to build or expand world-class applied sciences and engineering campuses in New York City. We are seeking to dramatically expand our capacity in the applied sciences to maintain our global competitiveness and create jobs. These campuses would not only enrich the City's existing research capabilities, but also lead to innovative ideas that can be commercialized, catalyzing hundreds of spinoff companies and increasing the probability that the next high-growth company (a Google, Amazon, or Facebook) will emerge in New York City. (New York City Economic Development Corporation) The NYU-led Center for Urban Science and Progress, a multi-sector research and education collaborative, was announced on April 23, 2012.
4 Big Cities + Big Data. The world is urbanizing: cities are the loci of consumption, economic activity, and innovation; cities are the cause of our problems and the source of the solutions. Global network traffic: 30% CAGR. Informatics capabilities are exploding: storage, transmission, analysis. Proliferation of static and mobile sensors: Internet of things.
5 Graduate Programs in Applied Urban Science and Informatics. Degree: Master of Science. Length: One Year, 3-semester (Full-time). Class size: Approx. 60 students.
6 Projects for the City & State: City Lights; Building Informatics; Urban Soundscape; Neuroeconomics of Decision Making; Economic Mapping; Greener Greater Buildings Plan; MTA Bus Driver Optimization; MTA Origin/Destination Study; New York City Police Department 911/311; Trash Informatics; Parks Attendance & Utilization; Property Ownership Records Assessment; School Property Use Assessment; Taxi Visualization; Transit Operations
7 What does it mean to instrument a city? Infrastructure: condition, operations. Environment: meteorology, pollution, noise, flora, fauna. People: relationships, location, economic/communications activities, health, nutrition, opinions, ... Properly acquired, integrated, and analyzed, data can: take government beyond imperfect understanding (better and more efficient operations, better planning, better policy); improve governance and citizen engagement; enable the private sector to develop new services for citizens, governments, firms; enable a revolution in the social sciences.
8 Data Types
9 Urban Data Sources: Acquire, Integrate, Use. Organic Data Flows: administrative records (census, permits, ...); transactions (sales, communications, ...); operational (traffic, transit, utilities, health system, ...); social media (Twitter, Facebook, blogs, ...). Sensors: personal (location, activity, physiological); fixed in situ sensors; crowd sourcing (mobile phones, ...); choke points (people, vehicles). Novel Technologies: visible, infrared and spectral imagery; RADAR, LIDAR; gravity and magnetic; seismic, acoustic; ionizing radiation, biological, chemical.
10 Privacy, Big Data, and the Public Good: Frameworks for Engagement. The book identifies ways in which vast new sets of data on human beings can be collected, integrated, and analyzed to improve urban systems and quality of life while protecting confidentiality. Sponsored by CUSP, the American Statistical Association, its Privacy and Confidentiality subcommittee, and the Research Data Centre of the German Federal Employment Agency. Editors: Julia Lane, American Institutes for Research; Victoria Stodden, Columbia; Stefan Bender, The German Federal Employment Agency; Helen Nissenbaum, NYU. Chapter Authors: Alessandro Acquisti, Carnegie Mellon University; Cynthia Dwork, Microsoft; Peter Elias, University of Warwick; Robert Goerge, UChicago; Alan Karr, National Institute of Statistical Sciences, and Jerry Reiter, Duke University; Steve Koonin and Michael Holland, CUSP; Frauke Kreuter, U-MD, and Roger Peng, Johns Hopkins; Carl Landwehr, George Washington University; Helen Nissenbaum and Solon Barocas, NYU; Paul Ohm, Colorado; Alexander Pentland, et al., MIT; Kathy Strandburg, NYU; Victoria Stodden, Columbia; John Wilbanks, Sage Bionetworks/Kauffman Foundation. Visit dataprivacybook.org.
12 Analysis of Massive Taxi GPS Data. Overview: Data from yellow cabs covers almost 800 million trips, which is nearly impossible to manage, explore, visualize, and analyze with existing tools. Objective & Goal: Build scalable, usable tools that can be used by experts and non-experts; work with relevant city agencies on development and deployment of the technology. Status: Initial deployment of TaxiVis at the NYC Taxi & Limousine Commission and Department of Transportation. Source: Freire, Silva, Vo, et al.
13 Taxis as Sensors for Manhattan. Taxis are sensors that can provide unprecedented insight into city life: economic activity, human behavior, mobility patterns, ... April 2011: Taxi drivers petitioning TLC for higher fares to compensate for rising gasoline prices. August 2011: Hurricane Irene. October 2012: Hurricane Sandy.
14 Urban Observatory: Persistent and Synoptic Analytics for Urban Science
15 Manhattan in the Thermal IR. 199 Water Street: built 1993; 998,000 sq ft; electricity, natural gas, steam; LEED Certified. Photo by Tyrone Turner/National Geographic. Other synoptic modalities: hyperspectral, RADAR, LIDAR, gravity, magnetic, ...
17 Plumes of Opportunity. Background subtraction: (1) registration to reference image; (2) form 10 absolute difference images from surrounding frames; (3) construct the minimum difference image pixel by pixel. Plume identification and tracking: (1) denoise background-subtracted image; (2) identify excess/deficit in luminosity space; (3) cross-check object location in color space; (4) localization and probability-weighted tracking of centroids. Upcoming use cases: plume rate; urban winds; carbon vs. steam emissions; TOO (triggered) observations. Image panels: raw image; background subtracted. Source: Dobler, et al.
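The background-subtraction steps on this slide can be sketched in a few lines. This is a minimal illustration only, not the authors' pipeline: it assumes the frames are already registered, uses tiny grayscale frames stored as nested lists, and uses 3 surrounding frames instead of 10. The function names (`abs_diff`, `minimum_difference_image`, `with_plume`) and the toy data are hypothetical.

```python
def abs_diff(frame_a, frame_b):
    """Pixel-by-pixel absolute difference of two equally sized grayscale frames."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def minimum_difference_image(target, neighbors):
    """Per-pixel minimum of |target - neighbor| over all surrounding frames.

    Assumes every frame is already registered to a common reference image.
    """
    if not neighbors:
        raise ValueError("need at least one surrounding frame")
    diffs = [abs_diff(target, frame) for frame in neighbors]
    height, width = len(target), len(target[0])
    return [[min(d[y][x] for d in diffs) for x in range(width)]
            for y in range(height)]


# Hypothetical toy scene: a static 3x3 background with one bright "plume"
# pixel that sits in a different place in each frame.
background = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]


def with_plume(y, x, brightness=50):
    """Copy of the background with a single bright pixel at (y, x)."""
    frame = [row[:] for row in background]
    frame[y][x] = brightness
    return frame


target = with_plume(1, 1)
neighbors = [with_plume(0, 0), with_plume(2, 2), with_plume(0, 2)]
result = minimum_difference_image(target, neighbors)

for row in result:
    print(row)
```

In this toy case only the pixel where the target frame differs from every surrounding frame survives; a bright pixel that appears in just one neighbor is suppressed because some other neighbor matches the target there.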
18 Street Environment: Attention, Distraction, and Interaction Dynamics P. Glimcher, M. Grubb, M. Ghandehari, G. Dobler, M. Sharma, A. Chiang
19 Hyperspectral Imaging of Manhattan Bridge Lights Source: Dobler, et al.
20 Open Data
21 Federal Open Data Policies: https://project-open-data.cio.gov/
22 State & Local Open Data: its.github.io/open data handbook/opendatahandbook.pdf
23 *Seattle's 911 dispatches, with 438,000 downloads, is the table with the highest number of downloads. Source: Barbosa, Luciano, et al. "Structured open urban data: understanding the landscape." Big Data 2.3 (2014).
24 Cities and States with Chief Data Officers. Blue signifies a state-level officer, green signifies a local-level officer, and yellow signifies an officer in education. Source: Steve Towns, "Which States and Cities Have Chief Data Officers?", govtech.com, June 13, 2014
25 Open Data Can Lead to Open Innovation. A consortium of public sector transit agencies, commercial firms, nonprofits, academic researchers, and interested individuals. Real-time arrival predictions: 94% reported increased or greatly increased satisfaction with public transit; significant decrease in actual wait time per user, and an even greater decrease in perceived wait time; 78% of riders reported increased walking, a significant public health benefit.
26 Core City Services Include: Streets ($397B): Sanitation, Utilities, Parks, Roads. Safety ($180B): Emergency Mgmt, Courts, Jails, Police, Fire. General Government ($245B): Planning, Public Buildings, Financial Admin, Community Development. Human Services ($826B): Health, Education, Social Services. We need to understand: How does data flow within agencies? How interoperable can data be? What data can be shared, and how is it shared to support delivery of city services? Local Gov't. Expenditures: U.S. Census Bureau, 2012 Census of Governments: Surveys of State and Local Government Finances.
27 Tools and Uses of Big Data
28 Tools: Data acquisition and synthesis; Exploration and data mining; Formulation of meaningful policy questions; Formal modeling and interpretation
30 Picture merges an image captured from video, a 3D LIDAR map of NYC, the PLUTO (Primary Land Use Tax Lot Output) database, and LL84 Energy Benchmarking data. Source: Dobler, et al.
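As a rough illustration of the tabular side of such a merge, the sketch below joins lot records with energy benchmarking records on a shared lot identifier. The slide does not say how the datasets were linked; using a borough-block-lot style key (`bbl`) is an assumption here, and every record, field name, and value is made up for the example.

```python
# Hypothetical records: field names and values are illustrative only,
# not taken from the real PLUTO or LL84 datasets.
pluto_lots = [
    {"bbl": "1000010001", "land_use": "office", "year_built": 1990},
    {"bbl": "1000010002", "land_use": "residential", "year_built": 1965},
]
ll84_benchmarks = [
    {"bbl": "1000010001", "site_eui": 95.0},
    {"bbl": "1000010003", "site_eui": 120.5},
]


def join_on_key(left, right, key):
    """Left join: keep every left record and attach fields from a matching right record."""
    right_index = {rec[key]: rec for rec in right}
    joined = []
    for rec in left:
        match = right_index.get(rec[key], {})
        merged = dict(rec)
        merged.update({k: v for k, v in match.items() if k != key})
        joined.append(merged)
    return joined


buildings = join_on_key(pluto_lots, ll84_benchmarks, "bbl")

for b in buildings:
    print(b)
```

A left join keeps lots that have no benchmarking record, which makes gaps in coverage visible instead of silently dropping buildings.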
32 TaxiVis: Interactive Visual Exploration of NYC Taxi Records Source: Freire, Silva, Vo, et al.
33 Source: Freire, Silva, Vo, et al.
36 Uses of Data Analytics: Regulatory compliance; Targeted enforcement; Improved understanding of municipal ecosystems via crowd sourcing
37 Some Examples: Regulatory compliance; Targeted enforcement; Improved understanding of municipal ecosystems via crowd sourcing
39 Apartment Fires in the Bronx and Brooklyn. 20,000+ complaints/year of unsafe illegal conversions. Department of Buildings: 200 building inspectors for 900,000 buildings; relied on expert judgment to prioritize. Historically, only 8% of inspections found serious violations. Strongest predictors of unsafe illegal conversion: whether the building is current on its property taxes (data at Department of Finance); whether banks have filed any mortgage foreclosures (data at Office of Court Administration). Teaming Fire Marshals up with Building Inspectors: firefighters are 15X more likely to die responding to a fire in an illegal conversion than other fires; vacate orders jumped to more than 70%. Source: Mike Flowers, "Beyond Open Data: The Data-Driven City," in Beyond Transparency: Open Data and the Future of Civic Innovation, Brett Goldstein, Lauren Dyson, Eds.; San Francisco, CA: Code for America Press (2013).
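One way to picture how the two predictors on this slide could reorder an inspection queue is the sketch below. It is not the City's actual model: the equal weighting of the two flags, the `Complaint` record, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Complaint:
    """Hypothetical complaint record joined with two external data sources."""
    building_id: str
    tax_delinquent: bool      # would come from Department of Finance data
    foreclosure_filed: bool   # would come from Office of Court Administration data


def risk_score(c):
    """Count how many of the two risk flags are set (0, 1, or 2). Equal weights are an assumption."""
    return int(c.tax_delinquent) + int(c.foreclosure_filed)


def prioritize(complaints):
    """Highest-risk complaints first; sorted() is stable, so ties keep their input order."""
    return sorted(complaints, key=risk_score, reverse=True)


# Made-up complaints for illustration
complaints = [
    Complaint("A", tax_delinquent=False, foreclosure_filed=False),
    Complaint("B", tax_delinquent=True, foreclosure_filed=True),
    Complaint("C", tax_delinquent=True, foreclosure_filed=False),
    Complaint("D", tax_delinquent=False, foreclosure_filed=True),
]

queue = [c.building_id for c in prioritize(complaints)]
print(queue)
```

With far fewer inspectors than complaints, even a simple ordering like this changes which buildings get visited first; a real system would fit the weights to past inspection outcomes.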