DASISH Digital Services Infrastructure for Social Sciences and Humanities
WP4 Data Archiving
Vigdis Kvalheim, Norwegian Social Science Data Services (NSD)
IASSIST, Toronto, 2014
DASISH PM Distribution and Partners' PM
[Chart: PM distribution. Values shown: 199.5, 171, 44, 83, 67, 68, 56, 34]
CESSDA
  NSD, Norwegian Social Science Data Services (15 PM)
  FSD, Finnish Social Science Data Archive (2 PM)
  SND, Swedish National Data Services (5 PM)
  GESIS - Leibniz Institute for the Social Sciences (6 PM)
CLARIN
  MPG, Max Planck Institute for Psycholinguistics (6 PM)
  UiB, University of Bergen (7 PM)
DARIAH
  OEAW, Austrian Academy of Sciences (5 PM)
  DANS, Data Archiving and Networked Services (5 PM)
  UGOE, Goettingen University (6 PM)
ESS
  CITY, City University, London (2 PM)
SHARE
  CentERdata, The Netherlands (7 PM)
Archiving and Curation - Access and Sharing
DASISH will rely on common data services offered by a network of strong data centres with national backing.
Purpose:
  Assess and discuss the state of data and deposit services in the SSH domain and identify gaps, bottlenecks and requirements.
  Develop and recommend requirements for deposit services that handle various types of data.
  Work out and suggest policy rules and guidelines for proper data management that can be taken up by data infrastructures providing long-term preservation and curation services.
WP4 Sub-tasks
Task 4.1: State-of-the-art of data preservation and curation
Task 4.2: Assessment of deposit services
Task 4.3: Deposit service convergence
Task 4.4: Recommendation of a set of policy rules
[Diagram, NSD 2012. Box labels:]
  Current state of data preservation and curation: analyze and describe
  Investigate existing deposit offers
  Assess the scope of policy rules and their requirements
  Policies and guidelines; Recommendations; Service Level Agreements
  Establish policy rules
  Requirements specification; Service level agreements; PR and training material
  Implement and test the policy framework
D4.1 and D4.2: Fact Sheets First Year
http://dasish.eu/publications/projectreports/d4.1_-_roadmap_for_preservation_and_curation_in_the_ssh.pdf
http://dasish.eu/publications/projectreports/d4.2_-_report_about_preservation_service_offers.pdf
Five Level Trust Maturity Model (D4.1)
Columns: Trust Maturity Level; Key Guideline; Guideline Source

1. OAIS Core Conformance
   Key guideline: Support OAIS Information Model. Acknowledge OAIS Archive responsibilities.
   Guideline source: OAIS Information Model: Section 2.2 of CCSDS 650.0-M-2 / ISO 14721:2012. OAIS Archive Responsibilities: Section 3.1 of CCSDS 650.0-M-2 / ISO 14721:2012.

2. Initial self-assessment, PLATTER/DRAMBORA
   Key guideline: Self-assessment through PLATTER and DRAMBORA.
   Guideline source: PLATTER Key Self-assessment questions. DRAMBORA Key Self-assessment questions.

3. Peer-reviewed self-assessment I, DSA
   Key guideline: Peer-reviewed self-assessment I, DSA.
   Guideline source: Data Seal of Approval Guidelines.

4. Peer-reviewed self-assessment II, ISO 16363/DIN 31644
   Key guideline: Conformance to the OAIS Detailed Functional Model. Self-audit with the ISO 16363. Alternatively, self-audit with DIN 31644. Support: NESTOR criteria.
   Guideline source: OAIS Detailed Functional Model: Section 4.1 of CCSDS 650.0-M-2 / ISO 14721:2012. CCSDS 652.0-M-1 / ISO 16363:2012. DIN 31644.

5. Certification and Optimization
   Key guideline: External review and formal certification in conformance with the ISO 16363. Alternatively, with DIN 31644.
   Guideline source: CCSDS 652.0-M-1 / ISO 16363:2012. DIN 31644.
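The five levels can be read as an ordered scale, so a service at one level can look up which guidelines still lie between it and a target level. The following is a minimal sketch of that reading, not part of D4.1: the enum member names, the `steps_to_reach` helper, and the assumption that levels are reached strictly in order are all illustrative choices made here.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    """Trust maturity levels; the member names are hypothetical short labels."""
    OAIS_CORE_CONFORMANCE = 1
    INITIAL_SELF_ASSESSMENT = 2      # PLATTER / DRAMBORA
    PEER_REVIEWED_DSA = 3            # Data Seal of Approval
    PEER_REVIEWED_ISO_DIN = 4        # ISO 16363 / DIN 31644 self-audit
    CERTIFICATION_AND_OPTIMIZATION = 5


# Key guideline per level, abridged from the slide.
KEY_GUIDELINES = {
    TrustLevel.OAIS_CORE_CONFORMANCE:
        "Support OAIS Information Model; acknowledge OAIS Archive responsibilities",
    TrustLevel.INITIAL_SELF_ASSESSMENT:
        "Self-assessment through PLATTER and DRAMBORA",
    TrustLevel.PEER_REVIEWED_DSA:
        "Peer-reviewed self-assessment I, DSA",
    TrustLevel.PEER_REVIEWED_ISO_DIN:
        "Conformance to OAIS Detailed Functional Model; self-audit with ISO 16363 or DIN 31644",
    TrustLevel.CERTIFICATION_AND_OPTIMIZATION:
        "External review and formal certification (ISO 16363 or DIN 31644)",
}


def steps_to_reach(current: int, target: TrustLevel) -> list[tuple[TrustLevel, str]]:
    """Return (level, key guideline) pairs above `current`, up to and including `target`.

    Assumes levels are cumulative and reached in order (an assumption of
    this sketch). Pass current=0 for a service with no level yet.
    """
    return [(lvl, KEY_GUIDELINES[lvl]) for lvl in TrustLevel if current < lvl <= target]


if __name__ == "__main__":
    for level, guideline in steps_to_reach(2, TrustLevel.PEER_REVIEWED_ISO_DIN):
        print(f"Level {level.value}: {guideline}")
```

For a service that has completed an initial self-assessment (level 2) and aims for level 4, the helper lists the level 3 and level 4 guidelines.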
DASISH Data Archive Description Sheet (DADS)
Columns: Nr; Functionality

Administrative context
  1  Funding
  2  Depositor Agreements
  3  Usage Agreements, Code of Conduct to be signed
  4  Policies in place
  5  Rights on data claimed by the archive
  6  Data Curation strategy
Pre-Ingest
  7  Primary community in focus for deposits
  8  Secondary communities accepted for deposits
Ingest
  9  Formats accepted and curated
  10 Formats accepted and not curated
  11 Metadata formats accepted
  12 User-based ingest
Archival storage and preservation
  13 Size of current archive in TB
  14 Size of current archive in other means (collections, files, etc.)
  15 Maximal deposit size in TB
  16 Long term guarantees / standards of trust
  17 Checks on quality / quality control
Dissemination
  18 Costs / Conditions for Access
  19 Tools / Interfaces used for Access
Survey on data deposit service arrangements
The questionnaire: based on the results and recommendations of D4.1, D4.2 and the DADS.
The purpose: to gain broader and more detailed insights into the organization and state of data archive solutions, and the degree to which they exist, across Europe and across scientific fields.
Point of departure for the next steps: in-depth interviews with selected data archive services.
Survey key findings - Background
[Chart: Archive service level. Categories: None/No plans; Plan to launch; Functioning data archive. Axis: 0 to 45]
Survey key findings - Organizational context
Key requirement compliance indicators: documentation on deposit agreements, usage agreements and preservation policies; Data Seal of Approval (DSA), Service Provider requirements, among others.
Overall, 75 % of the services have a licence or depositor agreement.
In North-Western Europe the percentage of respondents confirming the existence of a deposit/licence agreement is somewhat higher (85 %) than in the South and East (53 %).
Code of conduct / usage agreements are in place among 82 % of the North-Western Europe respondents; 41 % among South and East.
Preservation policies are in place among 62 % of the North-Western Europe respondents; for South and East it is 29 %.
Survey key findings - Level of Trust
25 of 46 respondents indicate that their services have undertaken activities to determine their trustworthiness.
15 respondents from existing data archive services indicate that these services have not yet undertaken any action in this respect.
Among the respondents from North-Western Europe, 65 % mentioned certification activities (half of them at the level of peer-reviewed DSA assessment or higher); among those from Southern and Eastern Europe, 27 %.
Survey findings - Self-reported maturity level of Data Archive Services
We asked the respondents whether they are satisfied with the maturity level of several aspects of their data archive service. We split this item into 5 sub-items (related to the OAIS reference model).
[Chart: "No measures needed" responses. Aspects shown: Archival storage and preservation; Data archive administration; Ingest facilities; Dissemination facilities. Axis: 0 to 30]
The way ahead: some suggestions
Further steps: the selection and recommendation of appropriate data services depend on further analysis of the survey results.
The next step is to complete the DADS for all, or the most promising, data services (except those already included), based on the completed survey and with the help of the data infrastructure/deposit service itself.
D4.3: List of recommended data services (trusted centres), which will be based on the completed and verified DADS.
First step: feed into a worldwide registry.
Updated version of the Survey Report, including information on the less mature, emerging/aspiring data archives with institutional/national backing that to varying extents meet the requirements recommended in 4.1, 4.2 and 4.3.
Policy Rules for Data Management
Deliverable in Month 33: A Comprehensive Set of Policy Rules for Data Management
Partners: NSD, UGOE, FSD, MPG, UiB, GESIS
Procedure:
  Data Policy Description Sheet (DPDS)
  Assess the scope of policy rules and their requirements in collaboration with initiatives in Europe and the US
  Establish policy rules in close collaboration with experts and emerging collaborative data services infrastructure
IFDO Survey on Research Funders' Data Policies
Country-by-country information on current institutional research data policies
Main focus on formal data policies
Existence, contents and quality of data sharing requirements
Type of linkage to funding
IFDO Data Policy Description Sheet
Columns: Topic; Nr.; Topic Item

Background information
  1  Name of funder
  2  Homepage
General policy
  3  General conditions
  4  Data Management Plan (DMP) for Proposal
  5  Data Timeframe
  6  Guidance
  7  Compliance/Monitoring
  8  Funding / Costs
  9  Scope of policy
Standards/Documentation
  10 Documentation Requirements
  11 Data Standards
  12 Metadata Standards
Access and preservation
  13 Data Preservation
  14 Scope of preservation provisions
  15 Data Access / Sharing
  16 Data Access / Sharing incentives
  17 Data Sharing Rights (IPR)
  18 Data Embargo / Data Retention
  19 Data Sharing requirements / timeframe
  20 Designated Data Repository
  21 Data Repository Supported
  22 Institutional (data repository) Requirements
Publications
  23 Open Access to Publications
  24 Publication Repository Specified
  25 Publication Repository Supported
Resources/References
  26 Date of policy
  27 Policy link
  28 Policy link
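The description sheet is in effect a fixed template of 28 numbered items grouped under six topics. As a hedged illustration only, it could be represented as a small data structure that also reports how much of a sheet has been filled in per topic. The names `DPDS_TOPICS`, `numbered_items` and `completeness` are hypothetical and are not taken from the IFDO survey materials.

```python
# Topic -> ordered topic items, copied from the sheet; numbering follows list order.
DPDS_TOPICS = {
    "Background information": ["Name of funder", "Homepage"],
    "General policy": [
        "General conditions", "Data Management Plan (DMP) for Proposal",
        "Data Timeframe", "Guidance", "Compliance/Monitoring",
        "Funding / Costs", "Scope of policy",
    ],
    "Standards/Documentation": [
        "Documentation Requirements", "Data Standards", "Metadata Standards",
    ],
    "Access and preservation": [
        "Data Preservation", "Scope of preservation provisions",
        "Data Access / Sharing", "Data Access / Sharing incentives",
        "Data Sharing Rights (IPR)", "Data Embargo / Data Retention",
        "Data Sharing requirements / timeframe", "Designated Data Repository",
        "Data Repository Supported", "Institutional (data repository) Requirements",
    ],
    "Publications": [
        "Open Access to Publications", "Publication Repository Specified",
        "Publication Repository Supported",
    ],
    "Resources/References": ["Date of policy", "Policy link", "Policy link"],
}


def numbered_items() -> list[tuple[int, str, str]]:
    """Flatten the template into (nr, topic, item) rows, numbered from 1."""
    rows = []
    nr = 1
    for topic, items in DPDS_TOPICS.items():
        for item in items:
            rows.append((nr, topic, item))
            nr += 1
    return rows


def completeness(filled: dict[int, str]) -> dict[str, tuple[int, int]]:
    """Per topic, return (items with a non-empty input, total items)."""
    result = {topic: [0, len(items)] for topic, items in DPDS_TOPICS.items()}
    for nr, topic, _item in numbered_items():
        if filled.get(nr, "").strip():
            result[topic][0] += 1
    return {topic: (done, total) for topic, (done, total) in result.items()}


if __name__ == "__main__":
    # Hypothetical partially filled sheet, keyed by item number.
    sheet = {1: "Research Council of Norway", 2: "http://www.forskningsradet.no/en/"}
    for topic, (done, total) in completeness(sheet).items():
        print(f"{topic}: {done}/{total}")
```

Keying inputs by item number mirrors how the example sheet cross-references rows ("See quote in input nr 13").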
Data Policy Description Sheet - example
Columns: Input, short; Input, free text (elaborate from previous column); Direct quotes / paraphrased information from policy; Links to documents containing quote(s) / paraphrase(s)

Well described / Required - 2009, 2012

Name of funder: Research Council of Norway
Homepage: http://www.forskningsradet.no/en/

Input, short: Well described
Input, free text: Applies to all projects funded totally or partly by the Norwegian Research Council

Input, short: Suggested /
Input, free text: Refers to 'progress report', not data management plan.
Quote: "With regard to the use of research infrastructure for research involving the processing of large amounts of data (time series, registries, scientific collections, etc.), the progress report shall also show how the data generated are safeguarded through large-scale storage resources, data handling tools and dedicated point-to-point network connections for particularly demanding applications."
Link: R&D Project Agreement Document

Input, short: Suggested
Input, free text: Applies to all research data
Quote: "As a general rule, the formal applicant to the Research Council is to be a Norwegian institution/enterprise with a specific individual designated as the project administrator".
Link: General application requirements

Input, short: Suggested / Recommended
Input, free text: All data and documentation to be deposited at designated data centre
Quote: "Unless otherwise agreed with the Research Council, copies of all research-generated data, including requisite documentation, shall be transferred from the Project Owner to the Norwegian Social Science Data Services. This shall be carried out as soon as possible and at the latest two years following the conclusion of the project period."
Link: R&D Project Agreement Document

Input, short: Suggested
Input, free text: Applies to all research data
Quote: See quote in input nr 13

Input, short: /Suggested
Input, free text: Indirectly and externally, through NSD licence/deposit form.

Input, short: Required
Input, free text: All research-generated data; as soon as possible, max. two years.
Quote: See quote in input nr 13

Input, short: Required
Input, free text: Norwegian Social Science Data Services (NSD)
Quote: See quote in input nr 13

Input, short: /Suggested
Input, free text: Indirectly, NSD (financial support)

Quote: "Scientific publications based on R&D projects funded wholly or partially by the Research Council must be made openly accessible to all interested parties".
Link: The Research Council's Principles for Open Access to Scientific Publications
Common Challenges and needs
Looking at the overall picture:
In many countries, high-level policy recommendations have not yet led to specified national policies by key research funders.
Where SSH funders have formulated open access policies, these are likely to be soft recommendations without well-defined requirements or guidance on follow-up and implementation.
Common Challenges and needs
Looking at the overall picture:
It is still unusual to require projects to open their data; we need to move from policy statements to policy enforcement and monitoring.
Too many countries lack sufficient data sharing infrastructures (trusted centres).
We need to move from short-term funding to long-term funding and business models that build trust, confidence and incentives to contribute to the data infrastructure.
Moving towards policy-based data archiving!
Thank you for listening!