The Complexities of Creating Small Big-Data: Using Public Survey Data to Explore Unfolding Social and Economic Change. Emily Gray, Stephen Farrall (University of Sheffield), Will Jennings (University of Southampton) and Colin Hay (Sciences Po). Data Power Conference, Sheffield, June 2015.
Overview: Methodological reflections on our use of amalgamated long-term social survey data. Practical challenges of managing large datasets. Big data makes for uncomfortable comparisons with bread and butter of traditional data sources (Housley 2014: 2). Many big social science questions cannot be met by big or transactional data.
Big Data: Big changes? Big data: explosive entrance into the social sciences (Burrows & Savage, 2014). The internet has brought with it vast amounts of data and fundamental methodological questions about the nature of data. National, episodic surveys remain crucial. Dynamic, historical, longitudinal processes. Big-data tendency towards chronocentrism.
Thatcher's legacy
Project Outline (ESRC grant award no. ES/K006398/1)
Crime became a key issue in UK political and social agendas in four ways:
1: long-term social and economic trends led to increases in crime rates from the 1960s.
2: the economic and social policies (neo-liberalism) pursued from the 1970s/80s accentuated these trajectories, adding to rises in crime.
3: competition between political parties on the issue of crime raised the profile of crime as an issue and added to levels of public concern over crime.
4: feedback loops between these operated to foster the circumstances which produced crime and in turn led to the rise of crime as an object of political concern (neo-conservatism).
Methodological Demands. This meant: reviewing the existing datasets (ESRC-funded project, 2008); working out how to interrogate the datasets; building the dataset needed (current ESRC award).
Methodological/Theoretical challenges: Complex relationship between crime and economic factors. Need to embed models of the crime-economy link in a wider understanding of social, economic and political changes. Temporal processes: age, period and cohort effects. Similar long-term research in housing (Dorling, 2014); opiate drug use (Morgan, 2014); education policy (Betteridge et al, 2001); social attitudes (NatCen, 2014).
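To illustrate why age, period and cohort effects are hard to separate in repeated cross-sectional surveys, the sketch below derives a birth cohort from survey year and age. It is a minimal illustration, not the project's own procedure: the function names and the ten-year cohort width are assumptions made here.

```python
def birth_year(survey_year, age):
    """Approximate year of birth: period (survey year) minus age."""
    return survey_year - age


def cohort_label(survey_year, age, width=10):
    """Group respondents into birth cohorts of `width` years (assumed width)."""
    year = birth_year(survey_year, age)
    start = (year // width) * width
    return f"{start}-{start + width - 1}"


# Two respondents of the same age interviewed in different survey years
# belong to different cohorts, so any two of age, period and cohort
# fully determine the third.
same_age_cohorts = [cohort_label(1983, 30), cohort_label(2012, 30)]
```

Because cohort is an exact function of period and age, models that include all three as linear terms need an additional identifying assumption.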
Datasets include... Fear of crime, victimisation, local ASB, CJS effectiveness, crime prevention measures, crime rates, interviewer assessments. Social/political attitudes, voting, political engagement, trust, newspaper readership. Usual socio-demographics (age, tenure, gender, region, on benefits). Range of official data (recorded crime).
Original Sources of Data: Individual level
BCS/CSEW. Key/sample questions: victimisation (multiple categories); fear of crime; common problems; confidence in police/criminal justice system; attitudes on sentencing; burglar/car alarm. N of variables (including demographics): 109. N of respondents: 599,517. Period: 1981-2013.
BSA. Key/sample questions: role of government; unemployment vs. inflation; punitiveness and authoritarianism; likelihood of riots; attitudes on welfare state; trust in government. N of variables (including demographics): 80. N of respondents: 89,466. Period: 1983-2012.
BES-CMS. Key/sample questions: crime situation; government/opposition handling of crime; emotions towards crime; sought crime assistance; satisfied with assistance; importance of crime as an issue; people trustworthy. N of variables (including demographics): 63. N of respondents: 124,110. Period: 2004-2013.
Original Sources of Data: Aggregate level (selected data series)
Crime and criminal justice: official recorded statistics (total/violent/property); convictions (total/as % of recorded crimes); prison population; police force strength.
Employment: unemployment rate (national/by region/males 16-17; 18-24); economic activity rate; claimant count (national/by region); average weekly earnings; labour disputes (days lost).
Macroeconomics: interest rates; public spending; GDP; inflation; inequality; poverty; child poverty.
Welfare/Other: total benefit expenditure (real/nominal terms/% of GDP); unemployment/incapacity/housing benefit (real/nominal terms/caseload); suicide rates; children in care; council house sales; truancy and school expulsions; drug addicts.
Politics/Policy: Queen's Speech; Acts of Parliament; parliamentary questions (e.g. referring to crime rate, burglary, anti-social behaviour).
Methodological challenges: Data preparation took over 15 months (and is on-going!). Need to code variable names and values, and check question wordings and consistency over time. Survey designs changed over time too. Use of historical data (in effect).
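The harmonisation work described above (recoding variable names and values so that waves can be pooled) might be sketched as follows. This is an illustration only: the codebooks, variable names (`q12`, `wburgl`, `worry_burglary`) and value codes are invented for the example and are not taken from the BCS/CSEW, BSA or BES-CMS files.

```python
# Hypothetical codebooks: for each survey year, map that wave's variable
# names and value codes onto one harmonised scheme. All names are invented.
CODEBOOKS = {
    1983: {
        "names": {"q12": "worry_burglary", "sex": "gender"},
        "values": {
            "worry_burglary": {1: "very worried", 2: "fairly worried",
                               3: "not very worried", 4: "not at all worried"},
            "gender": {1: "male", 2: "female"},
        },
    },
    2012: {
        "names": {"wburgl": "worry_burglary", "rsex": "gender"},
        "values": {
            # Invented example of a scale whose direction was reversed.
            "worry_burglary": {4: "very worried", 3: "fairly worried",
                               2: "not very worried", 1: "not at all worried"},
            "gender": {1: "male", 2: "female"},
        },
    },
}


def harmonise(record, year):
    """Rename and recode one respondent record from a given survey year."""
    book = CODEBOOKS[year]
    out = {"year": year}
    for raw_name, raw_value in record.items():
        name = book["names"].get(raw_name)
        if name is None:
            continue  # variable is not part of the harmonised set
        # Unknown codes become None so they can be checked by hand later.
        out[name] = book["values"][name].get(raw_value)
    return out


def pool(waves):
    """Stack harmonised records from several waves into one list."""
    return [harmonise(rec, year) for year, recs in waves.items() for rec in recs]
```

Keeping the mappings in data rather than in code means each change in question wording or coding can be documented in one place and reviewed against the original questionnaires.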
Possible extensions: Look at rare populations (male DV). Extend to future years by appending them. Include variables we didn't collect (other attitudes in BSA, for example). Add other aggregate-level variables from other datasets by year (NHS data on wounding, for example). Create new variables by collapsing/combining existing variables (e.g. single people with a car).
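Two of these extensions, attaching an aggregate series by year and deriving a new variable from existing ones, can be sketched as below. The field names (`year`, `marital_status`, `has_car`) and the helper functions are hypothetical, and any series values passed in would come from the relevant official source rather than from this example.

```python
def add_aggregate(respondents, series, name):
    """Attach a year-level value to every respondent record.

    `series` maps survey year -> value; years missing from it yield None.
    """
    return [{**r, name: series.get(r["year"])} for r in respondents]


def single_with_car(record):
    """Derived flag combining two existing (hypothetical) variables."""
    return record["marital_status"] == "single" and bool(record["has_car"])


def add_derived(respondents, name, rule):
    """Add a new variable computed from each record by `rule`."""
    return [{**r, name: rule(r)} for r in respondents]
```

For instance, `add_derived(rows, "single_with_car", single_with_car)` returns new records carrying the combined flag and leaves the input records unchanged.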
Reflections: Attitudinal/behavioural patterns whose roots lie in the pre-21st century cannot easily be captured by big data. Current data may be disparate and disordered (more so than national data, which is also resource intensive to prepare). Secondary data limited. Some historical topics remain beyond our limits. Next steps: how can big data and social survey data be amalgamated?
Resources: Data sets to be made available, lodged with UK Data Archive in late 2015/early 2016. Free to download. Documentary film. Dissemination events and publications. Email: emily.gray@sheffield.ac.uk Twitter: @thatchers_legacy