Science Enabling Applications Development Work Package
Software Requirements Specification (WP970)

prepared by: X. Luri, P.M. Marrese, F. Julbe, H. Enke, N. Walton, G. Gracia, G. Comoretto
approved by: X. Luri, P.M. Marrese
reference:
issue:
revision: 1
date: 2014-11-18
status: Draft. Pending formal approval in DPAC CU9
Abstract This document provides the list of requirements applicable to the Gaia science enabling applications work package (WP970). It also covers deliverable D4.1 for the GENIUS project. Software Requirement Specifications 2
Document History

Issue  Revision  Date        Author  Comment
D      0         2014-11-17  XL      Code document change
D      0         2014-09-19  FJL     First version of the CU9 science enabling applications SRS
D      0         2014-11-10  PM      Separate table for requirements with no parents added
Contents

1 Introduction
    1.1 Objectives
    1.2 Scope
    1.3 Assumptions
    1.4 Applicable Documents
    1.5 Requirement Definition
2 List of requirements
    2.1 General requirements
    2.2 Advanced data access tools
    2.3 Data Mining
    2.4 Cross-Matching
    2.5 Science Alerts
    2.6 Documentation
    2.7 Help Desk
    2.8 Public Outreach
3 Missing Parent requirements
References
A Requirements traceability
Acronym List

The following table has been generated from the on-line Gaia acronym list:

Acronym  Description
ASDC     ASI Science Data Centre
AUT      AUTomated
BP       Blue Photometer
CU       Coordination Unit (in DPAC)
DPAC     Data Processing and Analysis Consortium
ESA      European Space Agency
ESAC     European Space Astronomy Centre (VilSpa)
GAIA     Global Astrometric Interferometer for Astrophysics (obsolete; now spelled as Gaia)
GENIUS   Gaia European Network for Improved User Services
GPDB     Gaia Parameter DataBase
HDFS     Hadoop Distributed File System
MAN      MANual
PM       Polarisation Maintaining
RP       Red Photometer
RVS      Radial Velocity Spectrometer
SAT      Satellite Archive Team
SRS      Software Requirements Specification
SSS      System Software Specification
TAP      Table Access Protocol
TBD      To Be Defined (Determined)
VO       Virtual Observatory
WP       Work Package
1 Introduction

This document sets out the software requirements pertaining to the Gaia science enabling applications.

1.1 Objectives

The objective of this document is to define a set of requirements for the applications developed for the scientific exploitation of the Gaia catalogue. Some of the listed requirements can be divided into smaller ones in order to improve their development and monitoring. The requirements will cover all functional and technological aspects of the science enabling applications in order to produce fully functional, state-of-the-art products for the scientific exploitation of the Gaia catalogue.

1.2 Scope

This work package (WP970) comprises five sub-work packages covering different functional domains; together they address all aspects of the software applications needed for a full exploitation of the Gaia catalogue.

1.3 Assumptions

Some requirements derive from higher-level CU9 requirements. These dependencies are recorded but not repeated in this document; the relevant SRS (see Section 1.4) should be referred to for further details. The top-level requirements specification should be considered as level 0 requirements that are unlikely to change during the development and implementation iterations, while the derived and more detailed requirements herein may be subject to significant revision, consistent with a pragmatic and agile development process.

1.4 Applicable Documents

When an applicable document changes, a change may be required in this document. The applicable documents are listed here for clarity; a full reference list is provided at the end of the document.

(WOM-086)  Software Development Plan for CU9
(WOM-033)  Gaia Catalogue and Archive SRS
(AB-026)   Gaia data access scenarios summary

For convenience only, the structure of this document follows that of the level 0 SRS (WOM-033).
1.5 Requirement Definition

The requirements set out in this SRS follow the labelling scheme:

    CU9-WP97x-X-SCOPE-xxx

where:

    WP97x is the (sub-)WP number, as follows:
        WP971: Management
        WP972: Advanced data access tools
        WP973: Data Mining
        WP974: Cross-Matching
        WP975: Science Alerts
    X is either S (for Scientific), T (for Technical), Q (for Quality Assurance), or M (for Management)
    SCOPE is a four-letter scope specification of the requirement, taken from the list of possible values shown below
    xxx is a monotonically increasing counter for every unique combination

In this document the following scopes have been used:

    GLOB: for top-level global requirements;
    ALGO: for requirements on the scientific algorithm to be applied in the data processing (this should be for the detail on how the functionality is to be achieved);
    CODE: for requirements on the coding activities (this should just be for how the code is written, e.g. the GPDB should be used);
    COOR: for requirements on the coordination activities (this should not be used for functionality requested by other WPs, but that you coordinate with them in the design etc.);
    DATA: for requirements on data stream handling (this can include descriptions of input and output data);
    FUNC: concerning functional requirements (this describes the functionality that is required by the system);
    HARD: concerning hardware requirements;
    PERF: for requirements on performances;
    PLAN: for requirements on the planning activities;
    QUAL: for requirements on the quality assurance (both scientific and software; robustness and quality of data could go here);
    RESO: for requirements on the resource management;
    DOCU: for requirements on documentation.

Each requirement is presented with its own unique label and a number of attributes in the following form:

    CU9-WP97x-X-SCOPE-020   C.v   Verification   Status
    Description
    Parent: Parent

The attributes are:

    CU9-WP97x-X-SCOPE-xxx  The unique identifier of the requirement (see above).
    C.v                    Version number of the requirement, composed of a major part (C) corresponding to the cycle (1, 2 and 3 corresponding to A, B and C respectively in WOM-086) in which the requirement was created and a minor part (v) corresponding to the version of the requirement.
    Verification           Envisaged validation method of the requirement: either AUT for automated or MAN for manual.
    Status                 Status identifier.
    Parent                 Higher level requirement or requirements in a comma separated list.
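The labelling scheme above can be checked mechanically. The following is a minimal sketch, assuming the sub-WP numbers, type letters and scope codes listed in this section are the only valid values and that the counter always has three digits; the names `parse_label` and `Label` are illustrative and not part of any DPAC tool.

```python
import re
from typing import NamedTuple, Optional

# Allowed values, copied from Section 1.5 of this SRS.
SUB_WPS = {"WP971", "WP972", "WP973", "WP974", "WP975"}
TYPES = {"S", "T", "Q", "M"}
SCOPES = {"GLOB", "ALGO", "CODE", "COOR", "DATA", "FUNC",
          "HARD", "PERF", "PLAN", "QUAL", "RESO", "DOCU"}

# Assumption: the counter is always written with exactly three digits.
LABEL_RE = re.compile(r"^CU9-(WP97\d)-([A-Z])-([A-Z]{4})-(\d{3})$")


class Label(NamedTuple):
    sub_wp: str
    req_type: str
    scope: str
    counter: int


def parse_label(text: str) -> Optional[Label]:
    """Return the parsed label, or None if it does not follow the scheme."""
    match = LABEL_RE.match(text.strip())
    if match is None:
        return None
    sub_wp, req_type, scope, counter = match.groups()
    if sub_wp not in SUB_WPS or req_type not in TYPES or scope not in SCOPES:
        return None
    return Label(sub_wp, req_type, scope, int(counter))
```

Labels of parent requirements from other documents (for example `CU9-ARC-M-PLN-020`) follow a different pattern and are deliberately rejected by this sketch.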
2 List of requirements

2.1 General requirements

CU9-WP971-M-PLAN-020   16.0   MAN   Draft
CU9 WP970 shall operate within the agreed management structures of the CU (which in turn operates within the existing DPAC management structure).
Parent: CU9-ARC-M-PLN-020

CU9-WP971-M-PLAN-040   1.1   MAN   Draft
CU9 WP970 shall develop all code and documentation within the DPAC code repository where it shall be visible to all DPAC members but modifiable only by CU9 members.
Parent: CU9-ARC-M-PLN-040

CU9-WP971-M-PLAN-060   1.1   MAN   Draft
CU9 WP970 shall follow the engineering guidelines laid down in WOM-086.
Parent: CU9-ARC-M-PLN-060

CU9-WP971-M-PLAN-080   1.1   MAN   Draft
CU9 WP970 developments shall be overseen by a System Engineering coordination group.
Parent: CU9-ARC-M-PLN-060

CU9-WP971-M-COOR-020   1.1   MAN   Draft
CU9 WP970 shall coordinate with other CU9 WPs, in particular WP930 and WP950.
Parent:

2.2 Advanced data access tools

CU9-WP972-T-FUNC-020   1.0   MAN   Draft
The Gaia Archive shall provide a TAP access to the Gaia Catalogue.
Parent: CU9-ITG-T-FUN-140, CU9-ITG-T-FUN-020

CU9-WP972-T-FUNC-040   1.0   MAN   Draft
The Gaia Archive shall provide a SSAP access delivering BP, RP and RVS spectra calibrated in wavelength and in flux. The delivered spectra shall be compliant with the Spectrum datamodel (VO standard). The spectra resolution shall be specified for each spectrum and standard units shall be used.
Parent: CU9-ITG-T-FUN-140
CU9-WP972-T-FUNC-060   1.0   MAN   Draft
WP972 shall adapt at least one VO-tool able to display the data provided through the TAP access of the Gaia Archive.
Parent:

CU9-WP972-T-FUNC-080   1.0   MAN   Draft
WP972 shall adapt at least one VO-tool in order to select and display Gaia spectra.
Parent:

CU9-WP972-T-FUNC-100   1.0   MAN   Draft
WP972 shall adapt at least one VO-tool in order to visualise, at least in 2D, positional/astrometric Gaia data.
Parent: CU9-ADV-T-FUN-040

CU9-WP972-T-FUNC-120   1.0   MAN   Draft
All VO-tools adapted by WP972 shall be able to access the Gaia Catalogue through the Gaia Archive using one of the implemented VO protocols.
Parent: CU9-ITG-T-FUN-140

CU9-WP972-T-FUNC-140   1.0   MAN   Draft
All VO-tools adapted by WP972 shall be able to interact with each other using SAMP, in order to process different types of Gaia data in the adequate tools.
Parent: CU9-ADV-T-FUN-060
For instance: selecting a Gaia object inside Aladin and sending the position to SPLAT in order to display its spectra.

CU9-WP972-T-FUNC-160   1.0   MAN   Draft
WP972 shall deliver at least one spectra matching tool/service able to work on the spectra provided by the Gaia Archive.
Parent: CU9-ITG-T-FUN-140, CU9-ITG-T-FUN-040

CU9-WP972-T-FUNC-180   1.0   MAN   Draft
All spectra matching tools/services provided by WP972 shall be able to match Gaia spectra to users' spectra. A user's spectrum may be provided in different formats, but at least a Spectrum datamodel (VO standard) compliant input must be supported.
Parent: CU9-ITG-T-FUN-140, CU9-ITG-T-FUN-040

CU9-WP972-T-FUNC-200   1.0   MAN   Draft
All spectra matching tools/services provided by WP972 shall be able to access Gaia spectra through the Gaia Archive.
Parent: CU9-ITG-T-FUN-140, CU9-ITG-T-FUN-040
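To illustrate what a client of the TAP access required by CU9-WP972-T-FUNC-020 might send, the sketch below builds (but does not send) a synchronous TAP 1.0 query request using only the Python standard library. The service URL, the table name `gaia_source` and the column names are placeholders assumed for the example; this SRS does not define them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical service root: the real Gaia Archive TAP URL is not given in this SRS.
TAP_BASE_URL = "https://archive.example.org/tap"


def build_sync_query(adql: str, base_url: str = TAP_BASE_URL) -> Request:
    """Build a POST request for the TAP synchronous query endpoint."""
    # REQUEST, LANG and QUERY are the standard TAP 1.0 query parameters.
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "QUERY": adql,
    }
    data = urlencode(params).encode("ascii")
    return Request(base_url.rstrip("/") + "/sync", data=data, method="POST")


# Hypothetical table and column names, for illustration only.
request = build_sync_query("SELECT TOP 10 ra, dec FROM gaia_source")
```

Passing `request` to `urllib.request.urlopen` would submit the query to a real service; a VO-tool adapted under CU9-WP972-T-FUNC-060 would issue an equivalent request internally.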
2.3 Data Mining

CU9-WP973-T-FUNC-020   1.0   MAN   Draft
The data mining framework will allow the community to perform complex queries through an easy-to-use interface. This User Interface must be based on a web technology, preferably either fully integrated into the rest of the Gaia archive access and querying tools or compatible with them.
Parent: CU9-CIF-T-MAN-020

CU9-WP973-T-FUNC-040   1.0   MAN   Draft
The data mining framework must implement a basic set of MlLib (Machine Learning Libs) as basic building blocks on top of which more complex use cases can be built. The most common of these complex use cases will also be implemented in the framework.
Parent: CU9-CIF-T-MAN-020

CU9-WP973-T-FUNC-060   1.0   MAN   Draft
The framework must integrate a job management platform to handle task execution. This job management system must implement the necessary job submission policies, such as synchronous execution or batch execution under a prioritized queue policy, with the necessary cluster resource management. The usage of the cluster and its resources should be monitored by the appropriate tools.
Parent: CU9-CIF-T-MAN-140

CU9-WP973-T-FUNC-080   1.0   MAN   Draft
Advanced users will also be allowed to submit their own custom applications to the cluster through a proper entry point to the framework. The submitted job must also be integrated into the job management system for its execution.
Parent: CU9-CIF-T-MAN-040, CU9-ITG-T-FUNC-200

CU9-WP973-T-FUNC-100   1.0   MAN   Draft
The data mining framework must implement the security policies required by the SAT (Science Archive Team) and ESA security standards. It should also be integrated into the common security framework (single sign-on service) shared with the rest of the Gaia Archive web services and interfaces.
Parent: CU9-CIF-T-FUN-080
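The two submission policies named in CU9-WP973-T-FUNC-060 (synchronous execution and batch execution in a prioritized queue) can be sketched as follows. This is a single-process illustration under stated assumptions, not the framework's design: the `JobManager` class and its method names are hypothetical, lower numbers are assumed to mean higher priority, and cluster resource management and monitoring are left out entirely.

```python
import heapq
import itertools
from typing import Any, Callable, List, Tuple


class JobManager:
    """Toy job manager: synchronous runs plus a prioritized batch queue."""

    def __init__(self) -> None:
        # Heap entries are (priority, submission order, task).
        self._queue: List[Tuple[int, int, Callable[[], Any]]] = []
        self._order = itertools.count()

    def run_sync(self, task: Callable[[], Any]) -> Any:
        """Synchronous policy: execute immediately and return the result."""
        return task()

    def submit_batch(self, task: Callable[[], Any], priority: int = 10) -> None:
        """Batch policy: queue the task; lower number = higher priority (assumed)."""
        # The submission counter breaks ties, so equal priorities run first-in, first-out.
        heapq.heappush(self._queue, (priority, next(self._order), task))

    def run_next(self) -> Any:
        """Run the highest-priority queued task, or return None if the queue is empty."""
        if not self._queue:
            return None
        _, _, task = heapq.heappop(self._queue)
        return task()


manager = JobManager()
manager.submit_batch(lambda: "survey statistics", priority=5)
manager.submit_batch(lambda: "urgent classification", priority=1)
manager.submit_batch(lambda: "clustering run", priority=5)
execution_order = [manager.run_next() for _ in range(3)]
```

A custom application submitted by an advanced user (CU9-WP973-T-FUNC-080) would enter the same queue through `submit_batch` in this simplified picture.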
CU9-WP973-T-FUNC-120   1.0   MAN   Draft
Data mining task results must be displayed through the client using advanced data displaying tools (including 2D/3D graphical features). Interaction with the visualization Work Package (WP980) must be established.
Parent: CU9-ADV-T-FUN-020, CU9-ADV-T-FUN-040

CU9-WP973-T-DATA-020   1.0   MAN   Draft
Gaia archive data must be provided in HDFS (Hadoop Distributed File System) or a similar distributed file system, together with its relational version, and the two must be kept in sync. Archive metadata must also be provided so that it can be used by the data mining framework and its query system. A preferred file format for the data mining framework must be evaluated and proposed by this work package.
Parent: CU9-CAT-M-PLN-080

2.4 Cross-Matching

In the following, External Catalogues means the catalogues for which the cross-match will be pre-computed and will be part of the Gaia releases. These will be publicly available catalogues containing more than a few hundred million stars, observed in the optical/near infrared. The External Catalogues, the cross-match algorithm and the output will be release specific, so the following requirements will need to be fulfilled for each release.

In order to calculate the cross-match of the Gaia Catalogue with External surveys, the latter must be homogenized so that a single algorithm can run on several different catalogues. It is thus necessary to have both the original catalogues (either in full or as a sub-set of the fields needed by cross-match and validation) and a cross-match specific version of the same catalogues.

CU9-WP974-S-DATA-020   16.0   MAN   Draft
Define External Catalogues (original catalogues and cross-match specific catalogues) input through a definition of the data models.

CU9-WP974-S-DATA-040   16.0   MAN   Draft
Define cross-match output through a definition of the data models (full output will be used for the cross-match validation, sub-set will be made available to final users).
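The reason for homogenizing the External Catalogues can be shown with a small sketch: once every catalogue row is mapped onto one common record, a single matching routine serves all of them. The matching rule used here (nearest neighbour within a fixed radius, with the angular separation from the haversine formula) is a deliberately simple stand-in and is not the WP974 algorithm, which this document leaves to be specified. The `Source` record, the function names and the column keys are all assumptions made for the example.

```python
import math
from typing import Any, Dict, Iterable, NamedTuple, Optional


class Source(NamedTuple):
    """Hypothetical cross-match specific record shared by all catalogues."""
    source_id: str
    ra_deg: float
    dec_deg: float


def homogenize(row: Dict[str, Any], id_key: str, ra_key: str, dec_key: str) -> Source:
    """Map a catalogue-specific row onto the common record."""
    return Source(str(row[id_key]), float(row[ra_key]), float(row[dec_key]))


def angular_separation_arcsec(a: Source, b: Source) -> float:
    """Great-circle separation between two sources (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (a.ra_deg, a.dec_deg, b.ra_deg, b.dec_deg))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def nearest_match(target: Source, candidates: Iterable[Source],
                  radius_arcsec: float) -> Optional[Source]:
    """Return the closest candidate within the radius, or None if there is none."""
    best, best_sep = None, radius_arcsec
    for candidate in candidates:
        sep = angular_separation_arcsec(target, candidate)
        if sep <= best_sep:
            best, best_sep = candidate, sep
    return best


# Two external catalogues with different (made-up) column names and made-up values.
cat_a = [{"objID": 1, "RAJ2000": 10.0, "DEJ2000": 20.0 + 0.5 / 3600}]
cat_b = [{"name": "B-7", "ra": 10.0, "dec": 20.0 + 5.0 / 3600}]
externals = ([homogenize(r, "objID", "RAJ2000", "DEJ2000") for r in cat_a]
             + [homogenize(r, "name", "ra", "dec") for r in cat_b])
gaia_source = Source("G-1", 10.0, 20.0)
match = nearest_match(gaia_source, externals, radius_arcsec=1.0)
```

A linear scan over candidates is only workable for toy inputs; catalogues of several hundred million stars would need spatial indexing, which is outside the scope of this illustration.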
CU9-WP974-T-DATA-020   16.0   MAN   Draft
Prepare and deliver to ESAC the gbin files of the original External Catalogues using the CU9 data models.
CU9-WP974-T-DATA-040   16.0   MAN   Draft
Prepare and deliver to ESAC the gbin files of the cross-match specific External Catalogues using the CU9 data models.

CU9-WP974-S-ALGO-020   16.0   MAN   Draft
Detailed description and specifications of the procedure that calculates the cross-match of the Gaia Catalogue with the External Catalogues starting from the defined input and obtaining the defined output.

CU9-WP974-T-FUNC-020   16.0   MAN   Draft
A cross-match software compliant with the specified algorithm must be available in ASDC for cross-match validation or directly to calculate the cross-match.

CU9-WP974-S-ALGO-020   16.0   MAN   Draft
Definition of the validation process of the cross-match algorithm.

CU9-WP974-T-FUNC-020   16.0   MAN   Draft
The cross-match validation must be performed following the cross-match validation requirements.

CU9-WP974-T-DATA-020   16.0   MAN   Draft
Definition of the data set needed for the cross-match validation.

CU9-WP974-T-DOCU-020   16.0   MAN   Draft
Provide documentation on original External Catalogues and a detailed description of the corresponding data models.
Parent: CU9-DOC-S-FUN-040

CU9-WP974-T-DOCU-040   16.0   MAN   Draft
Provide documentation on cross-match specific External Catalogues and generate a detailed description of the corresponding data models.
Parent: CU9-DOC-S-FUN-040
CU9-WP974-T-DOCU-060   16.0   MAN   Draft
Provide documentation on cross-match tables (output of cross-match activities) and generate a detailed description of the corresponding data models.
Parent: CU9-DOC-S-FUN-040

CU9-WP974-T-DOCU-080   16.0   MAN   Draft
Provide documentation on the cross-match algorithm and on the tests performed to develop it.
Parent: CU9-DOC-S-FUN-040

CU9-WP974-T-DOCU-100   16.0   MAN   Draft
Provide documentation on the validation of the cross-match algorithm.
Parent: CU9-DOC-S-FUN-040

2.5 Science Alerts

CU9-WP975-T-COOR-020   1.1   MAN   Draft
...TBD...

2.6 Documentation

CU9-WP975-T-DOCU-020   1.1   MAN   Draft
...TBD...
Parent: CU9-DOC-S-FUN-040, CU9-DOC-S-PLN-060, ??

This section may be deleted. Generation of the documentation should be added as requirements in each sub-WP section; in addition, a requirement on the publication of the documentation should be sent to WP920 for inclusion in their SRS document.

2.7 Help Desk

See elsewhere for the SRS of the relevant work package (WP953).

2.8 Public Outreach

See elsewhere for the SRS of the relevant work package (WP960).

3 Missing Parent requirements

Each requirement in this document must have a parent requirement at a higher level. However, for some of the requirements defined in this document no parent requirement could be found in the CU9 SRS WOM-033. We consider that WOM-033 needs to be updated to include the following top level CU9 requirements:
WP970 Requirement      Missing Top level CU9 requirement
CU9-WP971-M-COOR-020   No requirement on coordination between CU9 WPs exists in the CU9 SRS.
References

[AB-026] Brown, A., Arenou, F., Hambly, N., et al., 2012, Gaia data access scenarios summary, GAIA-C9-TN-LEI-AB-026, URL http://www.rssd.esa.int/cs/livelink/open/3125400

[WOM-033] O'Mullane, W., 2009, Gaia Catalogue and Archive Software Requirements and Specification, GAIA-C9-SD-ESAC-WOM-033, URL http://www.rssd.esa.int/cs/livelink/open/2907710

[WOM-086] O'Mullane, W., Luri, X., Gracia, G., 2014, CU9 Software Development Plan, GAIA-C9-PL-ESAC-WOM-086, URL http://www.rssd.esa.int/cs/livelink/open/3237698
A Requirements traceability

The traceability between this SRS and parent requirements such as the SSS should be given here. A script makerequirementstraceparents.rb is provided in CU1/docs/common/scripts to create this from the higher level requirements optionally specified in the req TeX macro, i.e. the PARENT requirement. If the requirements contain TeX labels which start with req: then they will become clickable links in the table. If you do not have labels you may use the script addreqlabels.rb, which attempts to add labels to all requirements.

The following table provides traceability for derived requirements within this requirements specification, and also to level 0 requirements in WOM-033.

Parent Requirement    Requirements in this document
CU9-ADV-T-FUN-020     CU9-WP973-T-DATA-020
CU9-ADV-T-FUN-040     CU9-WP973-T-DATA-020
CU9-ADV-T-FUN-080     CU9-WP974-S-DATA-020, CU9-WP974-S-DATA-040, CU9-WP974-T-DATA-020, CU9-WP974-T-DATA-040, CU9-WP974-S-ALGO-020, CU9-WP974-T-FUNC-020, CU9-WP974-S-ALGO-020, CU9-WP974-T-FUNC-020, CU9-WP974-T-DATA-020, CU9-WP975-T-COOR-020
CU9-ARC-M-PLN-020     CU9-WP971-M-PLAN-020
CU9-CAT-M-PLN-080     CU9-WP973-T-DATA-020
CU9-CIF-T-FUN-080     CU9-WP973-T-DATA-020
CU9-CIF-T-MAN-020     CU9-WP972-T-FUNC-200, CU9-WP973-T-DATA-020, CU9-WP973-T-DATA-020
CU9-CIF-T-MAN-040     CU9-WP973-T-DATA-020
CU9-CIF-T-MAN-140     CU9-WP973-T-DATA-020
CU9-DOC-S-FUN-040     CU9-WP974-T-DOCU-020, CU9-WP974-T-DOCU-040, CU9-WP974-T-DOCU-060, CU9-WP974-T-DOCU-080, CU9-WP974-T-DOCU-100, CU9-WP975-T-DOCU-020
CU9-DOC-S-PLN-060     CU9-WP975-T-DOCU-020
CU9-ITG-T-FUNC-200    CU9-WP973-T-DATA-020
??                    CU9-WP975-T-DOCU-020
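A traceability table of this kind is the inverse of the Parent attribute: each parent is listed with every requirement that names it. The sketch below shows that inversion in Python. It is independent of makerequirementstraceparents.rb, whose implementation is not reproduced in this document, and the function names are illustrative. The sample entries are copied from Sections 2.1 and 2.2.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Requirement = Tuple[str, List[str]]  # (requirement label, list of parent labels)


def build_traceability(requirements: Sequence[Requirement]) -> Dict[str, List[str]]:
    """Invert requirement -> parents into parent -> requirements, sorted by parent."""
    table: Dict[str, List[str]] = defaultdict(list)
    for req_id, parents in requirements:
        for parent in parents:
            if req_id not in table[parent]:  # avoid listing a requirement twice
                table[parent].append(req_id)
    return {parent: table[parent] for parent in sorted(table)}


def missing_parents(requirements: Sequence[Requirement]) -> List[str]:
    """List the requirements that name no parent at all."""
    return [req_id for req_id, parents in requirements if not parents]


# Sample entries taken from Sections 2.1 and 2.2 of this SRS.
sample = [
    ("CU9-WP971-M-PLAN-020", ["CU9-ARC-M-PLN-020"]),
    ("CU9-WP971-M-COOR-020", []),
    ("CU9-WP972-T-FUNC-020", ["CU9-ITG-T-FUN-140", "CU9-ITG-T-FUN-020"]),
    ("CU9-WP972-T-FUNC-040", ["CU9-ITG-T-FUN-140"]),
]
trace = build_traceability(sample)
orphans = missing_parents(sample)
```

The second function corresponds to the check behind Section 3: any requirement it returns needs either a parent in WOM-033 or an entry in the missing-parent table.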