Facilitate Open Science Training for European Research
Martin Donnelly, Digital Curation Centre, University of Edinburgh
2020 Vision: Making Your Research Output Compliant
Keble College, University of Oxford, 21 April 2015

Open Access and Open Data in Horizon 2020
OVERVIEW
1. Open Access, Open Data, Open Science
2. Open Access to research publications
   a. OA in FP7 and H2020
   b. Publishing particulars
   c. Possible pathways and summary points
3. Data Management in H2020
   a. Data sharing and publication
   b. Data management plans and planning
   c. Data-related policies and expectations (UK, US, Australia)
   d. The Horizon 2020 Data Management Pilot
4. Useful resources
5. About the FOSTER project / Q&A

1. OPEN ACCESS, OPEN DATA, OPEN SCIENCE
- Open Access (OA) was born in the 1980s with free-to-access Listserv journals, but it really took off with the popularisation of the Internet in the mid-1990s and the subsequent boom in online journals.
- The Internet lowered (physical) barriers to accessing knowledge, but financial barriers remained. Indeed, the cost of online journals tended to increase much faster than inflation, and scholars and libraries faced a cost crisis.
- OA is part of a broader trend in research, sometimes termed "Science 2.0".
- As Open Access to publications became normal (if not ubiquitous), the scholarly community turned its attention to the data which underpins the research outputs, and eventually came to consider it a first-class output in its own right.
- In fact, the OA and research data management (RDM) agendas are closely linked in their development.

Timeline: Open Access and Data Sharing
- 1987: New Horizons in Adult Education launched by the Syracuse University Kellogg Project. (An early free online peer-reviewed journal.)
- 1991: The "Bromley Principles" Regarding Full and Open Access to "Global Change" Data, in Policy Statements on Data Management for Global Change Research, U.S. Office of Science and Technology Policy.
- 2001: The term "Open Access" (OA), the free online availability of research literature, is first coined at an Open Society sponsored meeting in Budapest, Hungary.
- 2004: Ministerial representatives from 34 nations to the Organisation for Economic Co-operation and Development (OECD) issue the Declaration on Access to Research Data From Public Funding.
- 2006: The Scientific Council of the European Research Council (ERC) pledges to adopt an OA mandate for ERC-funded research "as soon as pertinent repositories become operational".
- 2012: The European Commission recognises that research data is as important as publications. It announces in July 2012 that it will experiment with open access to research data (see IP/12/790): http://europa.eu/rapid/press-release_ip-12-790_en.htm
(Derived from, inter alia, Peter Suber (2009) Timeline of the open access movement, http://www.earlham.edu/~peters/fos/timeline.htm)

2. OPEN ACCESS TO RESEARCH PUBLICATIONS
- OA publication is past the tipping point in several fields (e.g. biology, biomedical research, mathematics, and general science & technology), whereas the social sciences, humanities, applied sciences, engineering and technology are the least engaged. (Archambault et al. (2013) Proportion of Open Access Peer-Reviewed Papers at the European and World Levels 2004-2011)
- The EC sees a real economic benefit in OA, which supports SMEs and NGOs that can't afford subscriptions to the latest research.
- Houghton, Swan and Brown offer quantifiable evidence of how much a lack of OA costs SMEs, both in the time lost accessing documents and in delays to producing new products.
- The EC's OA pilot in FP7 is now a requirement in Horizon 2020. A pilot for open data has also been introduced, with the intention of developing policy in the same way.

2a. Recap: Open Access in FP7
- The EC's Open Access pilot ran from August 2008 until the end of the Seventh Research Framework Programme (FP7) in 2013. It required grant recipients in certain areas to deposit peer-reviewed research articles or final manuscripts resulting from their FP7 projects in an online repository, and to make their best efforts to ensure open access to these articles. Both green and gold OA were catered for.
- Rationale: to improve and promote the dissemination of knowledge, thereby improving the efficiency of scientific discovery and maximising the return on investment in R&D by public research funding bodies.
- Coverage: peer-reviewed research articles in the following areas:
  - Energy
  - Environment (including Climate Change)
  - Health
  - Information and Communication Technologies (Cognitive Systems, Interaction, Robotics)
  - Research Infrastructures (e-infrastructures)
  - Science in society *
  - Socio-economic sciences and the humanities *
- Timing: open access to these publications is to be ensured within six months after publication (* twelve months in the two areas marked with an asterisk).
- Place of deposit: an institutional repository was the first choice; failing that, an appropriate subject-based/thematic repository, or the EC's open repository for papers that would otherwise be homeless.
- Full guidelines: ftp://ftp.cordis.europa.eu/pub/fp7/docs/open-access-pilot_en.pdf
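
The FP7 timing rule above (six months after publication, twelve for the two asterisked areas) can be sketched as a small date calculation. This is only an illustration: the function names and the idea of clamping to the end of a shorter month are assumptions made here, not part of the EC guidelines.

```python
from datetime import date
import calendar

# Areas the slide marks with an asterisk (twelve-month window).
# All other listed areas get six months.
TWELVE_MONTH_AREAS = {
    "Science in society",
    "Socio-economic sciences and the humanities",
}


def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamped to the end of shorter months.

    The clamping behaviour is an assumption of this sketch.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def open_access_deadline(published: date, area: str) -> date:
    """Latest date by which open access should be ensured under the FP7 pilot."""
    months = 12 if area in TWELVE_MONTH_AREAS else 6
    return add_months(published, months)
```

For example, `open_access_deadline(date(2012, 3, 15), "Science in society")` gives 15 March 2013, while the same publication date in "Health" gives 15 September 2012.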

2b. Publishing particulars
- The EC view is that the new H2020 OA mandate does not restrict publishing in any way; researchers can publish where they choose. The only requirement is that they ensure the publication is made openly available via a repository. This can be done by:
  - publishing in an OA journal, which may or may not charge an article processing charge (APC);
  - publishing in a subscription-based journal and depositing a copy in a repository (with open access usually delayed by an embargo period imposed by the publisher); or
  - (if the publisher offers the option) paying an APC to make the article open access immediately.
- In Horizon 2020, a copy of the article must always be deposited in a repository, even if the gold (or hybrid) option is chosen.
- When researchers are deciding where to publish, it's useful to consult a service like SHERPA RoMEO to see what open access options are available. Researchers could start with a list of target journals and prioritise, or use a mix-and-match approach.
- Although over 60% of publishers don't charge APCs, fees can be quite steep. The average rate is €1,020 per article for open access publishers and €1,980 for hybrid journals. (Ref: Björk & Solomon)
- It could be very costly always to choose the gold route and pay many APCs, so a mixture of gold and green approaches is likely to be best.
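
The cost argument can be made concrete with a rough budget estimate based on the two average APC figures quoted above. This is a back-of-the-envelope sketch: the function name and the article counts are hypothetical, and real APCs vary widely by journal.

```python
# Average APCs quoted on the slide (same currency as on the slide).
AVG_APC_OA_JOURNAL = 1020
AVG_APC_HYBRID = 1980


def estimated_apc_budget(n_oa_paid: int, n_hybrid: int, n_green: int = 0) -> int:
    """Estimate total APCs for a set of planned articles.

    n_oa_paid: articles in OA journals that charge an APC
    n_hybrid:  articles made open via a hybrid journal's paid option
    n_green:   articles self-archived in a repository (no APC)
    """
    if min(n_oa_paid, n_hybrid, n_green) < 0:
        raise ValueError("article counts must be non-negative")
    return n_oa_paid * AVG_APC_OA_JOURNAL + n_hybrid * AVG_APC_HYBRID


# Hypothetical project with ten articles.
all_gold = estimated_apc_budget(n_oa_paid=4, n_hybrid=6)
mixed = estimated_apc_budget(n_oa_paid=2, n_hybrid=1, n_green=7)
print(all_gold, mixed)  # 15960 4020
```

An estimate like this could feed into the dissemination budget when APCs are written into a proposal.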
2c. Possible OA pathways

Summary points
Main points of the Horizon 2020 Open Access requirements:
- The researcher chooses where to publish.
- The requirements apply to peer-reviewed articles rather than monographs, technical reports and conference proceedings, though these can be included as desired.
- All peer-reviewed publications should be made OA via the green or gold routes.
- It is no longer sufficient to make publications available on the project website. Deposit in repositories is required in all cases (even under gold OA), so that the bibliographic data is open and can be harvested by services like OpenAIRE.
- The EC does not currently impose any price cap on fees for publication costs. Researchers should plan OA from the proposal stage, and write any APCs into the proposal under the dissemination budget.
- The EC recommends how its funding should be acknowledged in publications. This style should be followed in order to facilitate indexing.
- The primary document to consult is Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020 (2013).

(From Open Access to Open Science)
- All projects receiving Horizon 2020 funding are obliged to make sure that any peer-reviewed journal article they publish is openly accessible, free of charge.
- Some disciplines have committed to sharing data and are reaping the benefits. The research process is now fastest in High Energy Physics, due to the community practice of immediate data publication.
- The European Commission is now moving beyond open access towards the more inclusive area of open science. Elements of open science will gradually feed into the shaping of a policy for Responsible Research and Innovation, and will contribute to the realisation of the European Research Area and the Innovation Union, the two main flagship initiatives for research and innovation.
http://ec.europa.eu/research/swafs/index.cfm?pg=policy&lib=science

3. DATA MANAGEMENT IN H2020
- H2020 features an Open Research Data pilot, and it seems likely that it will become an across-the-board requirement in FP9.
- The main goals of these developments are to lower barriers to accessing the products of publicly funded research ("science"), and to strengthen the integrity and longevity of the scholarly record.
- This section of the presentation focuses on the data management (planning) aspects of the Open Research Data pilot.

3a. Recap: Data sharing and publication
Benefits of sharing / publishing data:
- TRANSPARENCY and QUALITY: The evidence that underpins research can be made open for anyone to scrutinise and to attempt to replicate findings. This leads to a more robust scholarly record.
- EFFICIENCY: Data collection can be funded once, and used many times for a variety of purposes.
- ACCESSIBILITY: Interested third parties can (where appropriate) access and build upon publicly funded research resources with minimal barriers to access.

3b. Data management plans and planning
- Data management planning (DMP) underpins and pulls together different strands of data management activities.
- DMP is the process of planning, describing and communicating the activities carried out during the research lifecycle in order to:
  - keep sensitive data safe;
  - maximise data's re-use potential;
  - support longer-term preservation.
- Research funders (and other bodies) often ask for a short statement/plan to be submitted alongside grant applications. HEIs are increasingly asking their researchers to do this too.

What does a data management plan look like?
A brief statement defining:
- how data will be captured/created;
- how it will be documented;
- who will be able to access it;
- where it will be stored;
- how it will be backed up; and
- whether (and how) it will be shared and preserved long-term;
- etc.
DMPs are often submitted as part of funding applications, but will be useful whenever researchers are creating (or reusing) data, especially where the research involves multiple partners, countries, etc.
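
The checklist above maps naturally onto a simple record with one field per question. The sketch below is purely illustrative: the class, field names and example answers are hypothetical, and it does not reflect the schema used by DMPonline or required by any funder.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text answer per question from the checklist (names are illustrative)."""
    capture: str = ""                    # how data will be captured/created
    documentation: str = ""              # how it will be documented
    access: str = ""                     # who will be able to access it
    storage: str = ""                    # where it will be stored
    backup: str = ""                     # how it will be backed up
    sharing_and_preservation: str = ""   # whether/how it will be shared and preserved


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of sections that are still blank, in checklist order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


draft = DataManagementPlan(
    capture="Interviews recorded as WAV files",
    storage="University networked file store",
)
print(missing_sections(draft))
# ['documentation', 'access', 'backup', 'sharing_and_preservation']
```

A check like this only confirms that each question has been answered, not that the answers are adequate; it may still help when several partners are drafting sections of a plan.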

Roles and responsibilities
It's worth bearing in mind that RDM is a hybrid activity, involving multiple stakeholder groups:
- The principal investigator (usually ultimately responsible for data)
- Research assistants (may be more involved in day-to-day data management)
- The institution's funding office (may have a compliance role)
- Library/IT/Legal (the library may issue PIDs, or liaise with an external service that does this, e.g. DataCite)
- Partners based in other institutions
- Commercial partners
- etc.

Benefits of data management planning
- It is intuitive that planned activities stand a better chance of meeting their goals than unplanned ones.
- The process of planning is also a process of communication, increasingly important in interdisciplinary / multi-partner research. Collaboration will be more harmonious if project partners (in industry, other universities, other countries...) are in accord.
- In terms of data security, if there are good reasons not to publish/share data, in whole or in part, you will be on more solid ground if you flag these up early in the process.
- DMP also provides an ideal opportunity to engender good practice with regard to (e.g.) file formats, metadata standards, storage and risk management practices, leading to greater longevity of data and higher quality standards.

3c. Data-related policies (UK)
- Seven "Common Principles on Data Policy": data as a public good; preservation; discovery; confidentiality; right of first use; recognition; public funding for RDM.
- Six of the seven RCUK funders require data management plans, or equivalent, at the application stage, as do Wellcome and CRUK.
- The other council (EPSRC) requires nothing short of an institutional data infrastructure (by May 2015), and expects that DMP will be a key component of this.

3c. Data-related policies (USA)
- The National Science Foundation (NSF) announced a DMP requirement in 2010, taking effect early in 2011.
- The White House Office of Science and Technology Policy announced a requirement for DMPs in March 2013 (programmes awarding >$100m annually).
- The White House requirements include mechanisms covering compliance with plans and policies, and also cover the costs of implementing plans.

3c. Data-related policies (Australia)
- In 2014, the Australian Research Council (ARC) released new instructions for applications for Laureate Fellowships (http://www.arc.gov.au/ncgp/laureate/fl_instructions.htm) and Discovery Grants (http://www.arc.gov.au/ncgp/dp/dp_instructions.htm).
- Both include the following requirements when describing a proposal:
  - COMMUNICATION OF RESULTS: Outline plans for communicating the research results to other researchers and the broader community, including scholarly and public communication and dissemination.
  - MANAGEMENT OF DATA: Outline plans for the management of data produced as a result of the proposed research, including but not limited to storage, access and re-use arrangements.

3d. DMP in Europe
- The Horizon 2020 Open Research Data pilot covers "Innovation actions" and "Research and Innovation actions".
- It involves three iterations of the Data Management Plan (DMP): 6 months after the start of the project, at the mid-project review, and at the end of the project (final review).
- DMP contents: data types; standards used; sharing/making available; curation and preservation.
- There are opt-out conditions. A detailed description and scope of the Open Research Data Pilot requirements is provided on the Participants Portal.

Open Research Data Pilot: specifics (i)
AIM
The Open Research Data Pilot aims to improve and maximise access to and re-use of research data generated by projects. It will be monitored throughout Horizon 2020 with a view to further developing EC policy on open research.
SCOPE
For the 2014-2015 Work Programme, the areas of Horizon 2020 participating in the Open Research Data Pilot are:
- Future and Emerging Technologies
- Research infrastructures (part e-Infrastructures)
- Leadership in enabling and industrial technologies: Information and Communication Technologies
- Societal Challenge: 'Secure, Clean and Efficient Energy' (part Smart cities and communities)
- Societal Challenge: 'Climate Action, Environment, Resource Efficiency and Raw materials' (except raw materials)
- Societal Challenge: 'Europe in a changing world - inclusive, innovative and reflective Societies'
- Science with and for Society
This corresponds to about €3 billion, or 20% of the overall Horizon 2020 budget in 2014-2015.
COVERAGE
The Open Research Data Pilot applies to two types of data:
1. the data, including associated metadata, needed to validate the results presented in scientific publications, as soon as possible;
2. other data, including associated metadata, as specified and within the deadlines laid down in the data management plan.

Open Research Data Pilot: specifics (ii)
STEP 1
- The data should be deposited, preferably in a dedicated research data repository. These may be subject-based/thematic, institutional or centralised.
- The EC suggests the Registry of Research Data Repositories (www.re3data.org) and Databib (http://databib.org) for researchers looking to identify an appropriate repository.
- Open Access Infrastructure for Research in Europe (OpenAIRE) will also become an entry point for linking publications to data.
STEP 2
- So far as possible, projects must then take measures to enable third parties to access, mine, exploit, reproduce and disseminate this research data (free of charge for any user).
- The EC suggests attaching a Creative Commons licence (CC-BY or CC0) to the data deposited (http://creativecommons.org/licenses/, http://creativecommons.org/about/cc0).
- At the same time, projects should provide information via the chosen repository about tools and instruments at the disposal of the beneficiaries and necessary for validating the results, for instance specialised software or software code, algorithms, analysis protocols, etc. Where possible, they should provide the tools and instruments themselves.

Open Research Data Pilot: specifics (iii)
COSTS
- Costs relating to the implementation of the pilot will be eligible.
- Specific technical and professional support services will also be provided (e-Infrastructures WP), e.g. EUDAT and OpenAIRE, alongside support measures such as FOSTER.
OPT-OUTS
Opt-outs are possible, either totally or partially. Projects may opt out of the Pilot at any stage, for a variety of reasons, e.g.:
- if participation in the Pilot on Open Research Data is incompatible with the Horizon 2020 obligation to protect results if they can reasonably be expected to be commercially or industrially exploited;
- confidentiality (e.g. security issues, protection of personal data);
- if participation in the Pilot on Open Research Data would jeopardise the achievement of the main aim of the action;
- if the project will not generate / collect any research data;
- if there are other legitimate reasons not to take part in the Pilot (to be declared at proposal stage).

4. Useful resources (DCC)
- Book chapter: Donnelly, M. (2012) "Data Management Plans and Planning", in Pryor (ed.) Managing Research Data, London: Facet
- Guidance, e.g. "How to Develop a Data Management and Sharing Plan"
- DCC Checklist for a Data Management Plan: http://www.dcc.ac.uk/resources/data-management-plans/checklist
- Links to all DCC DMP resources via http://www.dcc.ac.uk/resources/data-management-plans
- DMPonline: https://dmponline.dcc.ac.uk/

DMPonline: overview
- Helps researchers write DMPs
- Provides funder questions and guidance
- Provides help from universities
- Examples and suggested answers
- Free to use
- Mature (v1 launched April 2010)
- Code is Open Source (on GitHub)
https://dmponline.dcc.ac.uk

Non-DCC tools and resources
- Book chapter: Sallans, A. and Lake, S. (2014) "Data Management Assessment and Planning Tools", in Ray (ed.) Research Data Management, Purdue University Press
- DMPTool
- UKDA guidance
- NERC guidance
- European Union resources
- Resources from other universities, inc. Oxford (http://researchdata.ox.ac.uk/)
5. ABOUT THE FOSTER PROJECT Facilitate Open Science Training for European Research

OBJECTIVES
- To support different stakeholders, especially young researchers, in adopting open access in the context of the European Research Area (ERA) and in complying with the open access policies and rules of participation set out for Horizon 2020
- To integrate open access principles and practice in the current research workflow by targeting the young researcher training environment
- To strengthen institutional training capacity to foster compliance with the open access policies of the ERA and Horizon 2020 (beyond the FOSTER project)
- To facilitate the adoption, reinforcement and implementation of open access policies from other European funders, in line with the EC's recommendation, in partnership with the PASTEUR4OA project

METHODS
- Identifying existing content that can be reused in the training activities, repackaging and reformatting it for use within FOSTER, and developing, creating or enhancing content as required
- Developing the FOSTER Portal to support e-learning, blended learning, self-learning, dissemination of training materials/content, and a Helpdesk
- Delivering face-to-face training, especially training trainers/multipliers who can deliver further training and dissemination activities within institutions, nations or disciplinary communities

THANK YOU - any questions?
Martin Donnelly, Digital Curation Centre, University of Edinburgh
martin.donnelly@ed.ac.uk
Twitter: @mkddcc
www.dcc.ac.uk
www.fosteropenscience.eu
This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.