Social Science Data: A Data Archive Perspective



Similar documents
Data-PASS: Data Preservation Alliance for the Social Sciences

WHAT SHOULD NSF DATA MANAGEMENT PLANS LOOK LIKE

INTRODUCTION TO THE DATAVERSE NETWORK

Building community clouds to support access to scholarship. Michele Kimpton CEO, DuraSpace Jonathan Markow CSO, DuraSpace

Research Support for Evidence Based Education Policy: The North Carolina Education Research Data Center at Duke University

Introduction to the Research Data Center PsychData

Encrypting*a*Windows*7*Hard*Disk* with%bitlocker%disk%encryption!

ODUM INSTITUTE ARCHIVE SERVICES OVERVIEW IASSIST 2015

Using ICPSR Database: Finding Social Science Data

The Data Management Plan with. Dataverse. Mercè Crosas, Ph.D. Director of Product Development

Documenting the research life cycle: one data model, many products

Data Management Resources at UNC: The Carolina Digital Repository and Dataverse Network

Graduate School Rankings By U.S. News & World Report: MARKETING PROGRAMS

DDI Lifecycle: Moving Forward Status of the Development of DDI 4. Joachim Wackerow Technical Committee, DDI Alliance

University Induction Clear Credential Program

Data Wrangling: From the Wild to the Lake

Data Driven Discovery In the Social, Behavioral, and Economic Sciences

The 10 Best Graduate Programs In Urban And Regional Planning

AP Cambridge Capstone Pilot Program. Cathy Brigham, Ph.D.

PSG College of Technology, Coimbatore Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

Using Dataverse Virtual Archive Technology for Research Data Management. Jonathan Crabtree Thu-Mai Christian Amanda Gooch

Preparing your students for FETCH!

Data Lifecycle Management

Policy Development for Big Data at the ITS JPO

Data Acquisition, Management, Security and Retention. Dustin Tingley

Research Data Services at London s Global University. UCL Research Data Services RECODE Meeting 14 th Jan 2015

EASy. Performance Dashboard. Report Bank. Quick Click Reports. You can drill into a Performance Lozenge to. Teacher, Class or Student.

BENCHMARKING UNIVERSITY ADVANCEMENT PERFORMANCE

FGDC, Meet the DDI. Adding Geospatial Metadata to a Numeric Data Catalog. Julie Linden Yale University

Research Data Archival Guidelines

Graduate School Rankings By U.S. News & World Report: ACCOUNTING PROGRAMS

West Virginia. Next Steps. Life After Special Education

Introduction to Research Data Management. Tom Melvin, Anita Schwartz, and Jessica Cote April 13, 2016

The ISPS Data Archive: Mission, Work, and Some Reflections

SowiDataNet. Bringing Social and Economic Research Data Together

THE UNIVERSITY OF LEEDS. Vice Chancellor s Executive Group Funding for Research Data Management: Interim

Making Data Citable The German DOI Registration Agency for Social and Economic Data: da ra

Harvard Library Preparing for a Trustworthy Repository Certification of Harvard Library s DRS.

Best Practices for Data Management. RMACC HPC Symposium, 8/13/2014

ANONYMISATION VEERLE VAN DEN EYNDEN & BETHANY MORGAN BRETT... UK DATA ARCHIVE UNIVERSITY OF ESSEX...

Survey of Canadian and International Data Management Initiatives. By Diego Argáez and Kathleen Shearer

Automated and Scalable Data Management System for Genome Sequencing Data

Graduate School Rankings By U.S. News & World Report: MEDICAL SCHOOLS - PRIMARY CARE

HARNESSING DATA CENTRE EXPERTISE TO DRIVE FORWARD INSTITUTIONAL RESEARCH DATA MANAGEMENT

BREAKFAST CHANGES LIVES ENSURING NO KID GOES HUNGRY IN THE CLASSROOM

Professional Development Resources for International Education

Policy Position Charter and Cyber Charter School Law March 2013

Johns Hopkins University Data Management Services

How To Get A Special Collections Certificate

A Plan for Supporting Research, Teaching, and Service. The Library. and the Mission of the University

Big Data to Knowledge (BD2K)

LOS ANGELES UNIFIED SCHOOL DISTRICT REFERENCE GUIDE

* 2: District Number: Please write your answer here: * 3: School Name: Please write your answer here:

The REAL Big Data Actually using smart sensors and other time sensitive data

Benefits of managing and sharing your data

Competency-Based Learning: Definitions, Policies, and Implementation

MATRIX and H-Net Backup and Archival Storage: Practices and Suggested Improvements. Preservation of the H-Net Lists Supplemental Report

Public Health and Law Schools Reported by Survey Respondents

The Metadata in the Data Catalogue DBK

IB students & first-year university performance: The UBC undergraduate admissions model

Discover Viterbi: Engineering Management

Global Scientific Data Infrastructures: The Big Data Challenges. Capri, May, 2011

The Importance of Information Governance and Risk Management

Transportation Secure Data Center (TSDC) Overview

NATIONAL HISTORY DAY IN N.C ONLINE TEACHER WORKSHOP

Math TLC. MSP LNC Conference Handout. The Mathematics Teacher Leadership Center. MSP LNC Conference Handout. !!! Math TLC

Data Management Planning for KU Social Scientists drafting Grant Applications

INTRODUCING THE STATEMENT OF BEST PRACTICES IN FAIR USE OF COLLECTIONS CONTAINING ORPHAN WORKS FOR LIBRARIES, ARCHIVES, AND OTHER MEMORY INSTITUTIONS

Dental School Additional Required Courses Job Shadowing/ # of hours Alabama

KIPLINGER'S Top 10 Values in Public Colleges In-State

The National Consortium for Data Science (NCDS)

ODIN ORCID and DATACITE Interoperability Network Title

FRONTIER REGIONAL/UNION#38 SCHOOL DISTRICTS. Records Retention Policy for Electronic Correspondence

Frequently Asked Questions. Spring 2013 Smarter Balanced Pilot

Higher Education, Higher Expectations The State of U.S. University and College Websites

Easy Surefire Guaranteed System To Win Every Time At Online Blackjack (only system in the world based on human greed)

Social Networking Policy of the Milton Public Schools. I. Internet Acceptable Use Policy still in force

Research Data Management Guide

Data Management. Facility Access Challenges: Rudi Eigenmann NEES Operations Headquarters NEEScomm Center Purdue University

Data Publishing Workflows with Dataverse

Differentiated learning. Professional Development. Documentary

Frequently Asked Questions

Improving Access to Documentation on Restricted Labor Market Data

Douglas County School District. Human Resources. Strategic Plan

Organization: CPEP Connecticut Pre-Engineering Program. Program Contact: Mr. Noah Ratzan Senior Programs Manager

Response from Oxford University Press, USA

HathiTrust Digital Assets Agreement

Canadian National Research Data Repository Service. CC and CARL Partnership for a national platform for Research Data Management

Drugs & Driving. Being smart, safe and supportive. Guide for Community Coalitions. Drugs & Driving: Guide for Community Coalitions

Checklist and guidance for a Data Management Plan

Discover Viterbi: Engineering Management

Illinois State Board of Education

INDIVIDUAL MASTERY for: St#: Test: CH 9 Acceleration Test on 29/07/2015 Grade: B Score: % (35.00 of 41.00)

INDIVIDUAL MASTERY for: St#: Test: CH 9 Acceleration Test on 09/06/2015 Grade: A Score: % (38.00 of 41.00)

Data Management Plan Template Guidelines

DATA CITATION. what you need to know

DRAFT IDAHO STATE UNIVERSITY POLICIES AND PROCEDURES (ISUPP) Asset Management Policy #2430

The Disparate Impact of Disciplinary Exclusion From School in America

Copyright 2012 by the Inter-university Consortium for Political and Social Research (ICPSR)

Working with the Paraprofessional in Your Classroom

Transcription:

Social Science Data: A Data Archive Perspective George Alter Director ICPSR, Institute for Social Research University of Michigan

About ICPSR Founded in 1962 as a consortium of 21 universities to share the National Election Survey Today: 700+ members around the world Data dissemination for more than 20 federal and non government sponsors 600,000+ visitors per year

What we do Acquire and archive social science data Distribute data to researchers Preserve data for future generations Provide training in quantitative methods Archive size 8,000 data collections, over 60,000 data sets Grows by 200+ collections a year

Changes in the information landscape: Improved metadata and data discovery Linking data and publications Emerging sources and types of data Sharing confidential data

Improved metadata and data discovery Social science data tends to be wide(>400 columns/variables) but not deep (<10,000 rows) Data discovery requires variable level metadata Data Documentation Initiative (DDI) XML standard for micro data in the social sciences

Improved metadata and data discovery New search tools Variable level searching Question banks Harmonization tools

Search/Compare Variables examines 2.1 million variables in 4,000 data collections

Improved metadata and data discovery Federated search tools Data Preservation Alliance for the Social Sciences (Data PASS) combined catalog searches: Institute for Quantitative Social Science, Harvard University Odum Institute, University of North Carolina Chapel Hill ICPSR, University of Michigan Electronic and Special Media Records Service Division, National Archives and Records Administration Roper Center, University of Connecticut Social Science Data Archive, University of California, Los Angeles (UCLA)

Linking data and publications Benefits of linking data and publications Encourages research transparency Promotes re use of existing data Links researchers across disciplines ICPSR Bibliography of Data related Literature 60,000+ citations Challenge of poor citation practices DataCite: DOIs for data Data Citation Index

Emerging sources and types of data Geo spatial Video Administrative data Online text Transactions Clicks Sensors

Safety Pilot Project University of Michigan Transportation Research Institute under a USDOT contract Vehicle to Vehicle and Vehicle to Infrastructure devices in volunteer s cars 2800 cars, trucks, and buses with Vehicle Awareness Devices, sensors, and video ~ 1 Petabyte of data per year

Adaptive education software records every click.

Casinos capture data too SECTION 97., a gaming establishment shall supply the Massachusetts gaming Commission with customer tracking data collected or generated by loyalty programs, player tracking software, player card systems, online gambling transactions or any other information system.... The Commission shall convey the anonymized data to a research facility which shall make the data available to qualified researchers Chapter 194 of the Acts of 2011: An Act Establishing Expanded Gaming in the Commonwealth of Massachusetts, signed into law November 22, 2011. http://www.malegislature.gov/laws/sessi onlaws/acts/2011/chapter194

MET Longitudinal Database Variety of indicators 5 instruments for scoring classroom observation Student surveys teacher evaluations effort enjoyment Value added computed from test scores State math and English language tests Other standardized tests Other instruments and surveys

MET Longitudinal Database Scale 2 academic years 6 large school districts 6 grade levels 3,000 teachers 44,500 students 24,000 videos 22,500 observation sessions 900+ observers trained by ETS to score videos ~12GB of quantitative data ~10TB of video

Metadata: MET LDB Sample Codebook 864 variables 423 pages 1 of 30+ data files in the MET LDB collection ICPSR codebooks are generated from DDI.

Sharing confidential data Safe data: Modify the data to reduce the risk of re identification Safe places: Physical isolation and secure technologies Safe people: Training and Data use agreements

Confidentiality: Harm, Risk & Restriction High Risk Enclosed Data Center Harm to subject Tiny Risk Web Access Some Risk Data Use Agreement Moderate Risk - Strong DUA - Virtual Data Enclave - Encrypted streaming Risk of re-identification

Challenges Secure access to confidential information Partnerships between public/private data centers and research communities New access models Providing accurate and complete metadata, especially provenance Automated metadata capture Improving data management and citation practices

Thank you. George Alter altergc@umich.edu