Data Management and So-ware Centre Mark Hagen Head of DMSC Mark.Hagen@esss.se www.europeanspalla:onsource.se Lithuanian Academy of Sciences, March 4 th 2014
What is DMSC? o Data Management and So-ware Centre (DMSC) o A Division of ESS Science Directorate Just like Neutron Technologies, Neutron Instruments etc. Two campuses: ESS Lund & ESS Copenhagen (Universitetsparken, Københavns Universitet) DMSC building to be constructed in Copenhagen o Responsibility: design, develop & implement for the ESS instruments: So-ware (user control interfaces, data acquisi:on, reduc:on & analysis) Hardware (servers, networks, worksta:ons, clusters, disks, pfs etc.) 2
ESS Organiza:on ESS BOARD OF DIRECTORS DIRECTOR GENERAL/CEO INFRASTRUCTURE DIRECTORATE MACHINE DIRECTORATE SCIENCE DIRECTORATE PROJECT SUPPORT & ADMINISTRATION DIRECTORATE CONVENTIONAL FACILITIES ACCELERATOR NEUTRON INSTRUMENTS GENERAL SERVICES SAFETY, HEALTH & ENVITONMENT TARGET NEUTRON TECHNOLOGIES HUMAN RESOURCES ENERGY SYSTEMS ENGINEERING SCIENTIFIC ACTIVITIES FINANCE INTEGRATED CONTROL SYSTEM SCIENTIFIC PROJECTS COMMUNICATIONS & EXTERNAL RELATIONS INTEGRATION & DESIGN SUPPORT DATA MANAGEMENT SOFTWARE CENTRE INFORMATION TECHNOLOGY LEGAL SUPPLY, PROCUREMENT & LOGISTICS
What is the ESS? DMSC ESS
What will DMSC do for ESS? o DMSC is involved. From idea to publica0on o So-ware and servers for the user proposals & the review o Coordina:on (scheduling) of experiments and equipment o The opera:on of the instruments/experiments Controlling the components of the instrument Acquiring the data from detectors & meta- data from instrument/samp. env. etc. Streaming the data (publish/subscribe) Recording/archiving/cataloguing the data Carrying out the data reduc:on o So-ware for Data Analysis Modeling & Simula:on o Finally a Publica:ons Database track the results from ESS
DMSC Organiza:on DMSC Data Systems & Technologies Inst. Control, Data Acq. & Reduc:on Data Management Data Analysis & Modeling User Office So-ware Copenhagen Data Centre DMSC servers in Lund Clusters, Worksta:ons Disks, Parallel File System Networks (inc. Lund CPH) Data transfer & Back- Up External Servers Instrument Control User Interfaces EPICS read/write Streaming data (ADARA) Data reduc:on (MANTID) File writers (ADARA) Data Catalogues Workflow Management Post- Processing. - - - - Reduc:on - - - - Analysis Messaging Services Web Interfaces MCSTAS support + dev. Instrument Integrators Analysis codes (e.g. SANSview, Rietveld, ) MD + DFT Framework User Database Proposal System Training Database Publica:ons Database Ini:ally one work package (2 work units) 6
General Framework + Customiza:on Generic Data Framework o Event mode data for later reprocessing/filtering o Stream the data è on the fly data processing (live view) è on the fly file crea:on è to a loca:on where appropriate post- acq. resources are available o Create standard HDF5 data files (NeXus or other) o (Where possible) Automate data reduc:on & analysis o Catalogue the data & meta- data for fast processing Don t re- invent the wheel è Exis:ng projects: ICAT, MANTID, ADARA Join these Customize for ESS Instruments o User Control Interface, detector type, IOC s for sample environment etc. 7
Data Acquisi:on, Reduc:on & Control NEUTRON TECHNOLOGIES Generic Framework o Instrument Control o Data Acquisi:on o Data Reduc:on & Visualiza:on Choppers Mo:on Control Sample Environment Fast Sample Environment Timing Bulk Data Readout (Detector Group) Control Box INTEGRATED CONTROL SYSTEMS User Control Interface Live, Local & Remote Data Reduc:on Data Analysis Interfaces Instrument Control Room Detectors & Monitors DATA MANAGEMENT & SOFTWARE CENTRE Automated Data Reduc:on Lund Server Room Data Aggregator & Streamer ( ) Automated Data Reduc:on Copenhagen Server Room 8
Control & DAQ Components Data AcquisiMon, Streaming & ReducMon Used by ESS accelerator/target, SLS, Diamond, US light sources, to be used by ISIS & SNS Publish/subscribe so-ware & protocol for streaming data (neutron + meta) Data reduc:on framework in Python & C++ developed by ISIS & SNS Data Management ICAT data cataloguing so-ware developed under NMI3 by PanData collabora:on of 19 European facili:es (+ SNS in US) 9
Data Analysis & Modeling o Data on disk is useless! It is published results from the data that makes progress o Need to ensure that ESS users have access to appropriate so-ware packages for data analysis the necessary computa:onal resources to exploit the so-ware to obtain those results analysis so-ware during experiment to influence the data taking strategies o Roll out in- sync with instruments Structure of Nanomaterials K. L. Krycka et al. Core Shell MagneMc Morphology of Structurally Uniform MagneMte NanoparMcles PRL 104, 207203 (2010) Polarized SANS demonstrated that these nanopar:cles have uniform nuclear structure but core- shell magne:c structure. Required development of both data reduc:on and data analysis methods and tools. 10
Data Analysis Development Original force- field IntegraMon of Analysis with Advanced Modeling Techniques Molecular Dynamics & Density Func:onal Theory (DFT) Techniques Jose Borreguero (SNS/NDAV) Mike Crawford (Dupont) & Niina Jalarvo (Julich): BASIS experiment / MD simula:on studies of Methyl rota:ons in methyl- Polyhedral oligomeric silsesquioxanes (POSS) Op:miza:on of dihedral poten:al governing the rota:onal jumps in methyl hydrogens. Preliminary simula:ons indicate that the dihedral poten:al barrier must be decreased ~37% in order to match experimental data OpMmized force- field
Data Systems & Technologies o DMSC will not be a supercomputer centre o Data (disk) storage: Back of the envelope è 2 3 PBytes/yr Spectrum of file sizes: ~100MByte - ~10 s GByte ~1TByte Fast disk (200MByte/s) & Parallel File System (10GByte/s) o Cluster(s) for data reduc:on & (modest) data analysis - ~2048 cores Architecture CPU, GPU visualiza:on cluster/server o Data download servers s-p & grid-p o Remote login capability for ESS users: Re- reduce data using cluster Data analysis so-ware available for users o So-ware development servers repositories, bug trackers, build servers 12
During the Construc:on Years o Primary goal: To be ready for hot commissioning of first ESS instruments in 2020. o Includes significant amount of NRE (Non- Recurrent Engineering) for subsequent ESS instruments. o Establish the basic facili:es at both campuses. o Front loaded with instrument- centric work + scope to grow analysis in an integrated way 13
Conclusion o DMSC s mission is the computa:onal (so-ware/hardware) infrastructure for the ESS data chain - instrument control, data acquisi:on, reduc:on & analysis o Generic framework customized for each of the instruments u:lizes data streaming to account for large data files/live processing o Don t re- invent the wheel work with collaborators ISIS, SNS, PanData (+ others?) to develop/customize exis:ng so-ware EPICS, ADARA, MANTID, ICAT o Customize for each of the ESS instruments in- sync as they roll out o In- sync with instrument roll out work with instrument teams to ensure analysis so-ware is available (doesn t necessarily mean develop) o Longer term goal to develop new data analysis methods for ESS instruments 14
Ques:ons QUESTIONS