Towards a National Data Infrastructure
  Sharon Neal
  Chemistry Division, National Science Foundation
  shneal@nsf.gov
Data is on everyone's mind
National Science Foundation (NSF)
  NSF is an independent federal agency created by Congress in 1950 (NSF Act of 1950) to promote the progress of science; to advance the national health, prosperity, and welfare; to secure the national defense.
  NSF celebrated its 60th anniversary in 2010.
  Organization chart: Executive Branch; Executive Office of the President; Departments (e.g. Defense) and Independent Establishments, Agencies, and Government Corporations (e.g. Securities & Exchange Commission); National Science Foundation
  www.nsf.gov
The 24-member, presidentially appointed National Science Board (NSB) establishes overall policies for NSF - www.nsf.gov/nsb
Current NSF Funding for Data
  DataNet: long-term preservation and access of data
  Data Infrastructure Building Blocks (DIBBs)
  Cyber-Enabled Discovery and Innovation (CDI): data-enabled science and engineering
  Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA)
  EarthCube: Towards a National Data Infrastructure for Earth System Science
What is the NSF DataWay?
  The NSF DataWay: Towards a National Data Infrastructure is an NSF-wide project under the framework of CIF21. It contributes to the collaborative creation of a digital access infrastructure for sharing scientific data from multiple sources and for facilitating the transfer of data from individual researchers to widely accessible data systems.
What is the Goal of DataWay?
  To promote the conduct of research by supporting community-based cyberinfrastructure that integrates data and information for knowledge management, and by building strategies for developing infrastructure that will:
What is the DataWay Charrette?
  The DataWay Charrette will provide a forum for stakeholders to discuss and vet various strategies for collaboratively creating a National Data Infrastructure (NDI) that connects the broad spectrum of data sources, both current and future.
Preparing for the Charrette
  Charrette logistics: February 2013*, Washington, DC area
  Pre-Charrette contributions by the community are essential
  Short (<6 pages) position papers in any of four focus areas will be welcome
Tentative Position Paper Topics
  Data generation and acquisition
  Data storage, curation and management
  Analysis, modeling and visualization
  Governance, economics and culture of data storage and sharing
RDM Group Input: Vision
  An NSF-wide project within CIF21 to contribute to the collaborative creation of a digital access infrastructure for sharing scientific data from multiple sources and for facilitating the transfer of data from individual researchers to widely accessible data systems.
  Create a framework that supports the data lifecycle for varied disciplines, is sustainable, promotes the integrated use of different data sources, and relies on reuse and sharing of tools, policies and processes across disciplines and organizations.
RDM Group Input: Goals
  Promote the conduct of research by supporting community-based cyberinfrastructure that supports integration of data and information for knowledge management
  Facilitate the emergence of broadly useful tools that can be used by investigators in many fields
  Support the evolution of collaborative communities around the use of data infrastructure tools by promoting better communication, exchange and cross-education
  Create a governance structure for managing the data lifecycle of varied disciplines
  Build communities of data sponsorship that promote the development of standards and the sharing of data and best practices
  Create or promote the development of policies, structures and tools to support access to research data while protecting intellectual property rights
  Create models and incentives for institutional adoption of data lifecycle management
RDM Group Input: Position Papers
  Access and discoverability of distributed data sources, including rights management
  Methods for inclusion and validation of data provenance and data quality
  Development of incentives/drivers for use and adoption of lifecycle management, including use of standards, access and preservation
  Governance and economic models for sustainable curation, including distribution of effort across local through global communities
  Communities, methods and processes for the definition of metadata and ontologies (vocabulary)
  Methods and tools for mapping between ontologies across fields and disciplines
  Mechanisms for evaluating the DataWay process and progress
  Survey of organizational and business process models that institutions, collaborations and federations are currently using
  Case studies involving public/private partnerships to improve the management of research data
  Models for building multi-institutional, cross-disciplinary collaborations to improve the management of research data
  Identification of logical/potential divisions between institutional, regional and national responsibilities for various DataWay (data lifecycle) components
What's Next?
  A Dear Colleague Letter describing the evolution of the Charrette and defining the position paper topics will appear shortly.
  The DataWay working group will work to identify Charrette participants (web and on site) based on submitted position papers.
  The Charrette will be held in February 2013 to discuss and vet the strategies described in the position papers.
Where discoveries begin