Service Road Map for ANDS Core and Applications Programs

Version 1.0 public exposure draft 31-March 2010

Document Target Audience

This is a high-level reference guide designed to communicate to ANDS external stakeholders the directions of services development over the next 18 months in two ANDS programs: ARDC Core and ARDC Applications.

Context

The strategic context for these two programs is set by the Australian Research Data Commons (ARDC) Project Plan: http://ands.org.au/infrastructure.html

This roadmap does not revisit the strategic framework for ANDS and its programs; it simply refers to that pre-existing rationale.

A needs analysis has been conducted to survey the needs of a sample of ANDS stakeholders in identified areas: http://ands.org.au/ands-needs-analysis.html

That needs analysis has been used to determine stakeholder requirements and prioritise services to address those needs. Some of the needs described in the analysis were out of scope for these particular programs and will be addressed elsewhere in ANDS, NCRIS or the research and innovation system.

Strategic Directions

The products identified in this roadmap support the ANDS mission to enable Australian researchers to manage, use and re-use data more effectively. The products respond to a number of strategic priorities within that mission:

1. To support the emergence of a coherent view of Australian research data assets (Australian Research Data Commons)
2. To improve the usability of data relevant to Australian research
3. To provide support and incentive for the publication of data
4. To provide broad fundamental data services for NCRIS, Super Science facilities, research groups and organisations, and the innovations sector broadly
5. To commission data exploitation s in high profile research areas of national significance

Market Niche

These strategic directions pursue three markets:

- infrastructure services that can be most efficiently and effectively provided centrally by ANDS rather than re-invented at every NCRIS and Super Science facility
- fundamental services that align with and extend the existing service agendas of government agencies and research organisations
- domain specific services and s customised to the needs of specific high-profile research groups

Time Frame

The products and services mentioned here are to be established between March 2010 and June 2011, the end date of the ANDS ARDC project, which is funded through the Education Investment Fund. Work has started or will start immediately with some s. Where the product is well defined, partners will be sought as soon as possible for work to begin by mid-year. Where customisation may be required for specific domain areas, or community consultation may be necessary to clarify needs or relative priorities, work may not begin until Q4 2010.

Categories

The needs analysis was structured using a number of categories of data-related activity associated with the publication and re-use of data: Create, Store, Identify,, Register,, Access, Exploit. [1] Services and products will be commissioned to meet relevant needs in these areas.

A variety of operating and development models are envisaged. Some services are to be operated by ANDS, some by government agencies, some by research or research support organisations. ANDS will develop some of the underlying systems and will commission the majority from third parties.

The Services

Suitable products and services identified through the needs analysis process are listed in the table below.
[1] For details on this approach see: http://www.dcc.ac.uk/sites/default/files/documents/events/dcc-2009/papers/AndrewTreloar.pdf
Table 1. Identified Services
(Service/Product; Description; Category / Strategic Priority)

A. Digitised content management
Description: Systems for managing digitised textual and image content allowing distributed correction and annotation.
Category / Strategic Priority: Create; Data Publication

B. Identifier for Dataset (DOI)
Description: Service enabling data facilities to allocate internationally recognised identifiers for published data sets, enabling citation, tracking, and acknowledgement for reference data sets.
Category / Strategic Priority: Identify; Data publication

C. Terminology Support Service
Description: Service allowing a community to create, manage, and publish (in machine actionable formats) standardised vocabularies which act as a basis for specific data aggregation and linking activities.
Category / Strategic Priority: Create; Exploit

D. Metadata Schema and Ontology Support
Description: Infrastructure allowing a community to manage and publish metadata schemata and ontologies to facilitate the creation of intra- and inter-disciplinary semantic equivalences.
Category / Strategic Priority: Create; Exploit

E. Machine to Machine search
Description: Systems and protocols to allow subject portals (eg ALA) and ANDS to reciprocally support "See also" type features.

F. Alerting/Notification
Description: Service to notify users when particular datasets are deposited with a data facility or when their descriptions are registered with ANDS.
Category / Strategic Priority (E and F): Data publication

G. Recommender Services
Description: Services to recommend datasets on the basis of user feedback or user behaviour.

H. Cross Domain Search/Equivalence service
Description: Vocabulary and ontology equivalence service to enable cross domain search and data integration.

I. Data and Publication Linkage Service/Tool
Description: Coordinated implementation of infrastructure, systems and processes at national data facilities for linking datasets with publications, including DOI identifier systems, citation standards, and citation indexing.
Category / Strategic Priority: Data publication; Improve Usability
J. Access policy infrastructure
Description: Coordinated implementation of infrastructure, systems and processes at national data facilities supporting common access policies for datasets (potential collaboration with AAF).

K. Data Visualisation Applications
Description: Applications developed in partnership with high profile research groups for the visualisation of data from the Australian Research Data Commons.

L. Data Linking / Merging Applications
Description: Applications developed in partnership with high profile research groups for linking and merging data from the Australian Research Data Commons. One example might be a service to tag the contents of humanities data sources with information about people, places, times, and things as an aid to linking between and searching across these sources.

M. Data Mining Applications
Description: Applications developed in partnership with high profile research groups for the mining/analysis of data from the Australian Research Data Commons.
Category / Strategic Priority (J to M, in order): Access; Exploit; Exploit; Exploit

N. Annotation
Description: Service to allow research groups to annotate web-based research materials and collaborate around those annotations.

O. Quality Assurance
Description: Systems to enable automated quality assurance measures for data facilities (metadata quality, data integrity, service uptime), including data facility auditing tool sets.
Category / Strategic Priority: Store; Access

P. Dataset Preservation Curation
Description: Service to enable format migration of archived datasets and databases to allow preservation and continuity of access.
Category / Strategic Priority: Store; Access

Q. Publish my Model
Description: Service to allow scientific models to be published in ways analogous to the publication of journal articles or datasets.
Category / Strategic Priority: Register; Identify

R. Party
Description: To provide more precise and richer identification of parties (people and organisations) involved in the Australian Research Data Commons; includes equivalences and disambiguation.
Category / Strategic Priority: Coherent View; Fundamental Service

S. Research Activity Description
Description: To provide more precise and richer identification of research activities (projects, programs, etc) involved in the Australian Research Data Commons.

T. Location
Description: To provide more precise and richer identification of locations referred to in the Australian Research Data Commons.

Scheduled Timeline

It is considered unrealistic to develop the full set of services above simultaneously and within the current ANDS project timeline. A prioritised timeline has been developed to allow a pragmatic approach: commissioning an initial set of services in the first instance, followed by a second and third round of commissioning as time and resources allow.

The table below represents the proposed infrastructure establishment schedule leading to the commissioning of services:
Table 2: Service Commissioning Schedule