Archival of raw and analysed radar data at EISCAT and worldwide
Carl-Fredrik Enell, EISCAT Scientific Association
COOPEUS workshop and EGI-CC kickoff, 11 March 2015
C-F Enell, EISCAT Radar data archival Coopeus/EGI workshop 1 / 23
Outline
1 Data levels in incoherent scatter radars
2 Archival of raw data at EISCAT
3 Archival of and access to analysed data: Madrigal and data portals
4 Discussion
1 Data levels in incoherent scatter radars
Data from antenna to plots
Signal processing chain
* EISCAT as example
* Most (one-dimensional, monostatic) radars are similar
[Figure: block diagram of the signal processing chain. Labels recoverable from the diagram:]
* RF, analog receiver chain, ADC: level 0 data
* Channel boards (filtering): level 1 processing; VME bus, crate computer, network
* Site server (decoding, integration): level 2 processing; lag profiles (integrated filtered samples), level 2 (+ 1) data
* Analysis computer (parameter fitting): level 3 processing; physical parameters, level 3 data
* Power meter and auxiliary sampler (level 0 data), computer (level 1 data): user-defined level 0 processing, user-defined level 1 storage
Analog signal
Antenna
* Receiving element
* Preamplifiers
Analog receiver chain
* Polarization combiner or separate chains
* Local oscillators and mixers
* Analogue filters
The signal is not yet data at this stage, so the analogue signal can be thought of as level -1.
Digital signal
A/D converted voltage domain signal
* Level 0: Stream of raw samples
* Level 1: Signal after digital filtering and decimation. First accessible level of data.
Correlated/decoded ACF domain signal
* Level 2: Signal after forming lag profiles and decoding. Reduces data rates significantly but retains spectral information.
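The step from level 1 to level 2 can be illustrated with a minimal sketch. It assumes an uncoded pulse, one complex voltage sample per range gate per pulse, and no decoding step; the function name `lag_profiles` and the data layout are illustrative assumptions, not the EISCAT implementation.

```python
import random


def lag_profiles(pulses, max_lag):
    """Integrate lag products over pulses.

    pulses: list of pulses, each a list of complex voltage samples.
    Returns profiles[lag][gate], the mean over pulses of
    v[gate] * conj(v[gate + lag]).
    """
    n_gates = len(pulses[0])
    profiles = []
    for lag in range(max_lag + 1):
        n_out = n_gates - lag
        acc = [0j] * n_out
        for v in pulses:
            for g in range(n_out):
                acc[g] += v[g] * v[g + lag].conjugate()
        profiles.append([a / len(pulses) for a in acc])
    return profiles


# Synthetic input (not radar data): 1000 pulses of 64 noise samples each.
random.seed(0)
pulses = [
    [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(64)]
    for _ in range(1000)
]
profiles = lag_profiles(pulses, max_lag=3)

n_in = len(pulses) * len(pulses[0])      # level 1 samples going in
n_out = sum(len(p) for p in profiles)    # level 2 values coming out
print(n_in, n_out)
```

Integration over pulses is what reduces the data rate: the number of stored values depends on the number of lags and gates, not on the number of pulses integrated.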
Digital signal
Level 1 vs level 2, two philosophies:
1 Analysis in the waveform domain can be easier with present computer capacity. Needed for interferometry etc.
2 Correlated data have essentially the same information content while reducing data volumes. Stored at EISCAT.
Analysed data
* Level 3: Parameters estimated by fitting a theoretical function to calibrated data (at level 1 or 2)
* Level 4: Visualisations and published articles
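The idea behind level 3 (fitting a theoretical function to the data) can be sketched with a toy model. Real incoherent scatter analysis fits a plasma-theory function with several parameters; the Gaussian-shaped model, the parameter names and the grid search below are simplifying assumptions chosen only to keep the example short and self-contained.

```python
import math


def model(tau, amplitude, tau_c):
    """Toy ACF model: Gaussian decay with correlation time tau_c."""
    return amplitude * math.exp(-(tau / tau_c) ** 2)


def fit(lags, acf, tau_c_grid):
    """Least-squares fit: grid search over tau_c, closed-form amplitude."""
    best = None
    for tau_c in tau_c_grid:
        shape = [model(t, 1.0, tau_c) for t in lags]
        # For a fixed tau_c the best amplitude is a linear least-squares solution.
        amplitude = sum(s * y for s, y in zip(shape, acf)) / sum(s * s for s in shape)
        cost = sum((amplitude * s - y) ** 2 for s, y in zip(shape, acf))
        if best is None or cost < best[0]:
            best = (cost, amplitude, tau_c)
    return best[1], best[2]


lags = [i * 10e-6 for i in range(20)]        # lag times in seconds
truth = (3.0, 80e-6)                         # synthetic "true" parameters
acf = [model(t, *truth) for t in lags]       # noise-free synthetic data
grid = [i * 1e-6 for i in range(20, 201)]    # candidate tau_c values
amplitude, tau_c = fit(lags, acf, grid)
print(amplitude, tau_c)
```

With noise-free synthetic input the fit recovers the parameters used to generate the data; with measured data the residual cost would also indicate how well the model describes the observation.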
2 Archival of raw data at EISCAT
Data formats
* Level 1: Stored only for certain needs, e.g. meteors and space debris (coherent echoes), together with level 2
* Level 2: Matlab v4 compatible files with metadata and data blocks (EISCAT specific)
* Level 3: EISCAT specific Matlab and Madrigal NCAR format
* Level 4: Published articles etc.: EISCAT annual reports. Investigate best practice for data citation standards and persistent object identifiers
Storage at EISCAT
Level 2 data archive
* Level 2 is the lowest data level routinely archived at EISCAT, thus representing raw data
* Some users store level 1 data
* File storage, 2 copies on RAID units
* Data 1981 to present: some 40 TB
Storage at EISCAT
Level 2 data search and retrieval
* Matlab file archive on RAID
* Index in MySQL database (file catalogue)
* Apache-MySQL-Python interface to data
  * in-house development at EISCAT
  * part of EISCAT schedule pages
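A file catalogue of this kind can be sketched as one table with a time-overlap query. The slide names MySQL; the sketch uses Python's built-in `sqlite3` only so that it runs without a server. The table name, column names, paths and site labels are all hypothetical, not the EISCAT schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE files (
           path TEXT PRIMARY KEY,     -- location of the data file in the archive
           site TEXT NOT NULL,        -- hypothetical site label
           experiment TEXT NOT NULL,  -- hypothetical experiment name
           start_utc TEXT NOT NULL,   -- ISO 8601 text sorts chronologically
           end_utc TEXT NOT NULL
       )"""
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
    [
        ("/raid/a.mat", "siteA", "exp1", "2015-03-10T22:00:00", "2015-03-10T23:00:00"),
        ("/raid/b.mat", "siteA", "exp1", "2015-03-10T23:00:00", "2015-03-11T00:00:00"),
        ("/raid/c.mat", "siteA", "exp1", "2015-03-11T01:00:00", "2015-03-11T02:00:00"),
        ("/raid/d.mat", "siteB", "exp2", "2015-03-10T22:00:00", "2015-03-10T23:00:00"),
    ],
)


def find_files(conn, site, start_utc, end_utc):
    """Return paths of files at `site` that overlap [start_utc, end_utc)."""
    rows = conn.execute(
        "SELECT path FROM files "
        "WHERE site = ? AND start_utc < ? AND end_utc > ? "
        "ORDER BY start_utc",
        (site, end_utc, start_utc),
    )
    return [row[0] for row in rows]


hits = find_files(conn, "siteA", "2015-03-10T22:30:00", "2015-03-11T00:30:00")
print(hits)
```

Keeping only paths and time ranges in the database leaves the files themselves untouched on disk, which matches the split between file archive and index described on the slide.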
Storage at EISCAT
Level 2 data retrieval functions
* Download (http)
  * data transfer from RAID server to web server by TCP socket (Python)
* Online analysis
  * CGI web page writes a text file, processed by cron job on analysis computer
  * Runs Matlab
  * Emails results to user
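The online analysis pattern (a web page writes a request file, a periodic job picks it up) can be sketched as a small file spool. The file naming, the key=value format and the `handler` callback, which stands in for running the analysis and emailing the result, are assumptions for illustration; the slides do not describe the actual file format.

```python
import os
import tempfile


def write_job(spool_dir, job_id, params):
    """What the web page would do: write one request as key=value lines."""
    tmp = os.path.join(spool_dir, job_id + ".tmp")
    with open(tmp, "w") as f:
        for key, value in params.items():
            f.write(f"{key}={value}\n")
    # Rename last so the periodic job never sees a half-written file.
    final = os.path.join(spool_dir, job_id + ".job")
    os.rename(tmp, final)
    return final


def process_spool(spool_dir, handler):
    """What the periodic job would do: handle each pending request once."""
    handled = []
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".job"):
            continue
        path = os.path.join(spool_dir, name)
        with open(path) as f:
            params = dict(
                line.rstrip("\n").split("=", 1) for line in f if "=" in line
            )
        handler(params)  # stand-in for running the analysis and emailing results
        os.rename(path, path[:-4] + ".done")  # mark as processed
        handled.append(name)
    return handled


spool = tempfile.mkdtemp()
write_job(spool, "0001", {"email": "user@example.org", "date": "2015-03-11"})
seen = []
first = process_spool(spool, seen.append)
second = process_spool(spool, seen.append)
print(first, second, seen)
```

Decoupling the web server from the analysis computer through files keeps long-running analysis off the web server, at the cost of a delay of up to one polling interval.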
SQL databases
Master database server: Data on RAID
* Catalogue of raw (level 2) incoherent scatter data files
* Dynasonde web portal
* Analysed dynasonde data
* Archived historical ionosonde data
Slave database server
* radar data catalogue
* dynasonde portal
* dynasonde data
* EISCAT 3D web site
Level 2 data retrieval and online analysis
Demo: http://www.eiscat.se/schedule/schedule.cgi
3 Archival of and access to analysed data: Madrigal and data portals
Physical parameters retrieved
All incoherent scatter radars retrieve fundamental ionospheric parameters
* electron density
* ion temperature
* electron temperature
* ion drift velocity
* etc.
Harmonisation of level 3 data access: Madrigal
Madrigal
* URL: http://www.openmadrigal.org
* Developer: Bill Rideout, Millstone Hill (MIT), with assistance from EISCAT and several other sites
* Use: The de facto standard for distributing incoherent scatter parameters. Also used by several other instruments
Harmonisation of level 3 data access: Madrigal
Design principles of Madrigal
* NCAR format file backend holds 1-dimensional and 2-dimensional time series data
* Per-experiment directories (e.g. one per day)
* File index and other metadata: plain text files
Madrigal data access
Methods of access
* Web GUI
* Web services with APIs for Matlab, Python...
* Download: ASCII text, NCAR file formats, HDF5
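Access through the Python web services API might look like the sketch below. It assumes the `madrigalWeb` package is installed and a Madrigal server is reachable, so the remote call is defined but not executed here. The helper names `window_args` and `list_experiment_files` are illustrative; `MadrigalData`, `getExperiments` and `getExperimentFiles` are used as I understand the package's interface and should be checked against its documentation.

```python
from datetime import datetime


def window_args(start, end):
    """Flatten two datetimes into the twelve integers getExperiments takes."""
    return (
        start.year, start.month, start.day, start.hour, start.minute, start.second,
        end.year, end.month, end.day, end.hour, end.minute, end.second,
    )


def list_experiment_files(url, instrument_code, start, end):
    """Return (experiment name, file name) pairs from a Madrigal server."""
    # Imported here so the helper above works without the package installed.
    import madrigalWeb.madrigalWeb as mw

    server = mw.MadrigalData(url)
    pairs = []
    for exp in server.getExperiments(instrument_code, *window_args(start, end)):
        for exp_file in server.getExperimentFiles(exp.id):
            pairs.append((exp.name, exp_file.name))
    return pairs


args = window_args(datetime(2015, 3, 10), datetime(2015, 3, 11, 23, 59, 59))
print(args)
```

A call would take the form `list_experiment_files(server_url, code, start, end)`, where `server_url` is the address of a Madrigal site and `code` is that site's numeric instrument code; both must be looked up for the server in question.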
Madrigal data access
EISCAT results
* 1981 onwards: around 110 GB
* http://www.eiscat.se/madrigal
Realtime Madrigal experiments
Madrigal experiments can be inserted in real time
* no conceptual difference from normal experiments, but useful to distinguish preliminary from calibrated data
Realtime plot of electron density profiles from selected sites
* Using Madrigal Python API for retrieval and plotting
* URL: http://www.eiscat.se/realtime
Data portals
Data portals are search engines for data. They hold metadata only; data are downloaded from the individual providers.
EISCAT participation
* ENVRI: see talk by Ingemar
* ESPAS: http://www.espas-fp7.eu
ESPAS and ENVRI metadata are XML files generated from Madrigal (routines implemented with the Python API).
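Generating a metadata record of this kind can be sketched with the standard library. The element names, the dictionary keys and the example URL are hypothetical; they do not follow the ESPAS or ENVRI data models, which the slides do not specify.

```python
import xml.etree.ElementTree as ET


def metadata_record(experiment):
    """Build a small XML metadata record; all element names are hypothetical."""
    root = ET.Element("observation")
    ET.SubElement(root, "instrument").text = experiment["instrument"]
    period = ET.SubElement(root, "period")
    ET.SubElement(period, "start").text = experiment["start"]
    ET.SubElement(period, "end").text = experiment["end"]
    params = ET.SubElement(root, "parameters")
    for name in experiment["parameters"]:
        ET.SubElement(params, "parameter").text = name
    # The portal holds only this record; the data stay with the provider.
    ET.SubElement(root, "dataUrl").text = experiment["url"]
    return ET.tostring(root, encoding="unicode")


record = metadata_record(
    {
        "instrument": "example incoherent scatter radar",
        "start": "2015-03-10T22:00:00Z",
        "end": "2015-03-11T02:00:00Z",
        "parameters": [
            "electron density",
            "ion temperature",
            "electron temperature",
            "ion drift velocity",
        ],
        "url": "http://example.org/madrigal/experiment",
    }
)
parsed = ET.fromstring(record)
print(record)
```

The record carries enough to search by instrument, time and parameter, plus a link back to the provider, which is the division of labour between portal and data holder described above.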
4 Discussion
Pros and cons of Madrigal
+ Well established
+ Easy-to-use web GUI
+ Has programming APIs for many common languages, e.g. Python
- Present storage file format only 2-D
- Only analysed data (no interface to levels 1 or 2)
- Scaling to much larger data volumes may be difficult
- No persistent data identifiers (numbering of experiments changes each time a new experiment is added!)
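The identifier problem can be illustrated with a small sketch. It assumes, for illustration only, that experiments are numbered by their position in a time-sorted catalogue; the slides do not describe the actual numbering scheme. The hash-based `stable_id` shows one way an identifier can stay fixed, and is not a registered persistent identifier such as a DOI.

```python
import hashlib


def positional_ids(experiments):
    """Number experiments by position in the time-sorted catalogue."""
    ordered = sorted(experiments, key=lambda e: e["start"])
    return {e["name"]: i + 1 for i, e in enumerate(ordered)}


def stable_id(experiment):
    """Derive an identifier from fields that never change for an experiment."""
    key = "|".join(
        [experiment["instrument"], experiment["start"], experiment["name"]]
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


catalogue = [
    {"name": "expB", "instrument": "radar1", "start": "2015-03-11T00:00:00"},
    {"name": "expC", "instrument": "radar1", "start": "2015-03-12T00:00:00"},
]
before = positional_ids(catalogue)
stable_before = {e["name"]: stable_id(e) for e in catalogue}

# Adding an older experiment shifts every positional number after it.
catalogue.append(
    {"name": "expA", "instrument": "radar1", "start": "2015-03-10T00:00:00"}
)
after = positional_ids(catalogue)
stable_after = {e["name"]: stable_id(e) for e in catalogue}

print(before["expB"], after["expB"])
print(stable_before["expB"] == stable_after["expB"])
```

A citation that records a positional number stops pointing at the same experiment once the catalogue grows, whereas an identifier derived from the experiment itself does not change.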
Pros and cons of data portals and virtual observatories
+ Online search expected by contemporary users
+ Provide means to search for coincident data from several heterogeneous data sets in a consistent way
+ Consistent metadata (data models / ontologies)
- Multitude of virtual observatories, data portals and GUIs confusing to data users
- Usually store only metadata, not addressing needs for archival and backup
- Only suitable for analysed parameters, not raw data
- How to secure operation after the end of projects?
Harmonization of data and access at different levels
Level 1+2
* Highly system-specific, but should aim for open formats (that is, not proprietary formats such as Matlab). See http://www.openradar.org
* Open questions: data access, data embargos
Level 3+4
* General (same parameters everywhere)
* To most users, level 3 is the incoherent scatter radar data
* Existing system (Madrigal) serves existing radars well
* New scalable systems needed to archive and access 3-D data
* Must address data citation: data provenance, persistent object identifiers
Thanks for your attention! Questions?
* Data volumes: temporary PBs, archives 100s of TB up to EBs
* Data transfer protocols
* Storage file formats
* Home-brewed vs standards
* Data provenance, digital object identifiers
* Big data in other research communities (CERN, astronomy...)?