Investigating Data Sustainability of Citizen Science Projects Elena Roglia (with contributions from Hildegard Gerlach, Chrisa Tsinaraki, Sven Schade and Max Craglia) Frascati, Italy 14 October 2015
Motivation Citizens' contributions to scientific processes, i.e. Citizen Science initiatives, are blossoming across all geographic scales and disciplines. A Citizen Science and Smart City Summit in 2014 identified the management of citizen-collected data as a major barrier to the re-usability and integration of these contributions across borders. Whilst funding Citizen Science projects, the European Commission (EC) could also provide additional contributions (win-win situations).
Part I Data Management Survey
Intention
Develop an understanding of the state of play with regard to data management practices on the local, national and continental scales
Initiate discussion with Citizen Science communities world-wide
Establish a baseline for prioritising subsequent actions and for measuring progress
Set-up
Focusing on data discoverability, access, re-use and preservation
Following newly proposed GEOSS data management principles and reports of the Belmont Forum
After consulting colleagues and representatives of the Citizen Science community
Using EUSurvey as a tool
And the results?
[https://pixabay.com]
Question 1: Which topic areas does your project cover? Others including (amongst more): Arts (2) Biology (3) Cultural heritage (2) Economics Humanities (3) Mapping and GIS (2) Smart cities (1)
Question 2: Which topic areas does your project cover? Others including (amongst more): Species observations, habitats and phenology (10) Light pollution (3) Litter (2) Environmental burdens (2) Environmental change (1)
Question 5: Which geographic extent does the project cover? (multiple choice)
Extent          Inside EU   Outside EU
Neighborhood    22          24
City level      29          30
Regional        31          39
Country         37          30
Continental     27          26
Total coverage  146         149
Countries in which data is stored:
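The totals reported for the geographic extent question can be cross-checked with a few lines of Python. This is only a sketch: the counts are copied from the slide, while the variable names are chosen here for illustration. Because the question is multiple choice, the totals count selections, not distinct projects.

```python
# Counts per geographic extent as reported on the slide: (inside EU, outside EU).
extent_counts = {
    "Neighborhood": (22, 24),
    "City level": (29, 30),
    "Regional": (31, 39),
    "Country": (37, 30),
    "Continental": (27, 26),
}

# Sum each column separately.
total_inside = sum(inside for inside, _ in extent_counts.values())
total_outside = sum(outside for _, outside in extent_counts.values())

print(total_inside, total_outside)  # 146 149
```

Both sums match the "Total coverage" row, so the per-extent counts and the totals on the slide are consistent with each other.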
Question 5: How is this citizen science project financed? Question 6: Does your project include an explicit data management plan? >> Yes (72), No (49)
Question 7: Is the data and all associated metadata from your project discoverable through catalogues and search engines? Question 8: Does the data contain persistent, unique, and resolvable identifiers?
Question 11: Is the data collected by the project accessible via an online service for visualization? Question 12: If the data collected by the project is accessible via an online download service, how can it be accessed?
Question 9: Do you provide access to raw data sets or aggregated values?
Question 13: Do you make the data available for re-use? Question 13.1: Which are the conditions for re-use? 45 46 30
Question 13.2: Which license do you use?
Only 35 projects indicated that they had decided on a particular license
In 6 cases, we identified a mismatch between the indicated license and the intended re-use conditions
10 indicated that they are in the decision process
>10 consider that data licensing is not applicable to their project
Question 14: Why did you take this decision?
Replies included (amongst others):
By institutional policy
Promoting the idea of Open Science
To stimulate reuse by citizens or start-ups
To give back to the community
Question 20: For how long do you ensure access to the data from your project? Question 21: How do you preserve the data from your project?
If you have any additional information to share with us... Specific comments included (amongst others): We've had great difficulty working with IT professionals to ensure the data is made available according to our needs. I am in the process now of finding a place to host the data and I am having a very hard time finding a free online data portal. Our project has been in operation for about 6 years. We are currently in the process of formalizing many of the data storage, documentation and access arrangements. Thank you for doing this survey, common databases are truly needed.
Part II Repository of EU-funded projects
Intention Collecting additional evidence about current data management practices. Design a possible node for managing citizen science data (observations, deliverables, publications, web pages, etc.). Discuss how this node might fit into the bigger ICT (eco)system. Experiment with the possibility to curate and provide access to the results of EU-funded Citizen Science projects once projects finish.
Approach
Provide a storage facility for (static) data that are abandoned after projects end
Organise (projects, organisations, data) via metadata
Status
Prototype for static storage (FTP) and cataloguing (CKAN) available inside JRC
Filled with test metadata about projects and organisations (taken from http://www.citizen-obs.eu/ and http://www.everyaware.eu/)
Initial test with sample data set
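As an illustration of what cataloguing a project in CKAN could look like, the sketch below builds a request for CKAN's `package_create` action. The catalogue URL, API token, dataset name and field values are all placeholders assumed here; they do not describe the actual JRC prototype or its metadata schema. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical values: the prototype's URL, token and metadata schema
# are not given in the slides.
CKAN_URL = "https://ckan.example.org"
API_TOKEN = "replace-with-a-real-token"


def build_package_create_request(name, title, notes, project_url):
    """Build (but do not send) a CKAN package_create request."""
    payload = {
        "name": name,        # URL-safe, lowercase dataset identifier
        "title": title,      # human-readable title
        "notes": notes,      # free-text description
        "url": project_url,  # project home page
    }
    return urllib.request.Request(
        url=f"{CKAN_URL}/api/3/action/package_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": API_TOKEN, "Content-Type": "application/json"},
        method="POST",
    )


request = build_package_create_request(
    name="example-citizen-science-project",
    title="Example Citizen Science Project",
    notes="Placeholder metadata record for a finished project.",
    project_url="https://project.example.org",
)
print(request.full_url)
```

Sending the request would require passing it to `urllib.request.urlopen` against a real CKAN instance with a valid token. Depending on how the instance is configured, CKAN may also require an owning organisation (`owner_org`) for each dataset.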
Current status
Discussions with the Publications Office of the European Union in order to access project and organisation metadata from the CORDIS database
Investigating how to include metadata for data in third-party repositories
Investigating how to store Citizen Science data at JRC premises
Collaborations with similar efforts of others (incl. ECSA, ACSA and CSA)
[flickr.com, joelogon]
Next steps
Analyzing survey responses for correlations, e.g. with geographic extent, funding mechanisms, project duration
Publication of survey data and results
Further testing and development of the repository
Continued collaboration with the European Citizen Science Association (ECSA) and their American and Australian counterparts
[pixabay.com]
Thank you for your attention! Come and see the results... online or in person. For further information, check out http://digitalearthlab.jrc.ec.europa.eu/citizensscience/ @innovatearth sven.schade@jrc.ec.europa.eu