BIG data big problems big opportunities Rudolf Dimper Head of Technical Infrastructure Division ESRF



Similar documents
Control Software at ESRF beamlines

Protect, Share, and Distribute Healthcare Data with ehealth Managed Services

MODELING AND IMPLEMENTATION OF THE MECHANICAL SYSTEM AND CONTROL OF A CT WITH LOW ENERGY PROTON BEAM

With DDN Big Data Storage

From Big Data to Smart Data Thomas Hahn

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

ICT 10: Software Technologies

Status of Radiation Safety System at

SURFsara Data Services

Siemens Computed Tomography

ICT 10: Software Technologies

Radiation therapy involves using many terms you may have never heard before. Below is a list of words you could hear during your treatment.

Image Area. View Point. Medical Imaging. Advanced Imaging Solutions for Diagnosis, Localization, Treatment Planning and Monitoring.

MI Software. Innovation with Integrity. High Performance Image Analysis and Publication Tools. Preclinical Imaging

GE Healthcare. Centricity PACS and PACS-IW with Universal Viewer* Where it all comes together

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC

CT for the production floor: Recent developments in Hard- and Software for industrial CT Systems

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

COMPARISON OF FOUR DATA ANALYSIS SOFTWARE FOR COMBINED X-RAY REFLECTIVITY AND GRAZING INCIDENCE X-RAY FLUORESCENCE MEASUREMENTS

Cryo SAXS sample environment Project update 02/2007

Analysis Programs DPDAK and DAWN

Precise Treatment System Clinically Flexible Digital Linear Accelerator. Personalized radiotherapy solutions for everyday treatment care

White paper: Unlocking the potential of load testing to maximise ROI and reduce risk.

ESRF Upgrade Phase II: le nuove opportunitá per le linee da magnete curvante

RiMONITOR. Monitoring Software. for RIEGL VZ-Line Laser Scanners. Ri Software. visit our website Preliminary Data Sheet

ULTRAFAST LASERS: Free electron lasers thrive from synergy with ultrafast laser systems

Embedded Systems in Healthcare. Pierre America Healthcare Systems Architecture Philips Research, Eindhoven, the Netherlands November 12, 2008

Beamline Automation at the APS

ICT Perspectives on Big Data: Well Sorted Materials

Business Intelligence and Big Data Analytics: Speeding the Cycle from Insights to Action Four Steps to More Profitable Customer Engagement

Development of on line monitor detectors used for clinical routine in proton and ion therapy

Medical Applications of radiation physics. Riccardo Faccini Universita di Roma La Sapienza

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Predictive Analytics

Automated file management with IBM Active Cloud Engine

Advanced Computing. for large DESY. Volker Guelzow IT-Gruppe Hamburg, Oct 29th, 2014

Session 2. The economics of Cloud Computing

case study Institut Gustave Roussy (IGR) The Institute Gustave-Roussy Advances Cancer Treatment Summary ORGANIZATION: PROJECT NAME:

Health Management Information Systems: Medical Imaging Systems. Slide 1 Welcome to Health Management Information Systems, Medical Imaging Systems.

FCC JGU WBS_v0034.xlsm

Leveraging Big Data Technologies to Support Research in Unstructured Data Analytics

Data Mining for Data Cloud and Compute Cloud

Data-analysis scheme and infrastructure at the X-ray free electron laser facility, SACLA

BlueArc unified network storage systems 7th TF-Storage Meeting. Scale Bigger, Store Smarter, Accelerate Everything

Beam Instrumentation Group, CERN ACAS, Australian Collaboration for Accelerator Science 3. School of Physics, University of Melbourne 4

The Key Elements of Digital Asset Management

Make the Most of Big Data to Drive Innovation Through Reseach

The Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT): A Vision for Large-Scale Climate Data

Hitachi Healthcare Company - Our Strategy and Opportunities -

Towards large dynamic range beam diagnostics and beam dynamics studies. Pavel Evtushenko

Shielding and Radiation Measurements at ESRF

Guide to Understanding X-ray Crystallography

G-Cloud II Services Service Definition Accenture Cloud Infrastructure Implementation Services

Intelligent Assets. Manufacturing Analytics Institute for Manufacturing, University of Cambridge. 1 st February Daniel Keely

Chapter 6. The stacking ensemble approach

The consistent quality of connected radiology

Web-based pre-analysis Tools

Orthogonal ray imaging: from dose monitoring in external beam therapy to low-dose morphologic imaging with scanned megavoltage X-rays

CHAPTER 7 SUMMARY AND CONCLUSION

Challenge of FUJIFILM in Medical ICT

Contract: FIGM-CT Measures for Optimising Radiological Information and Dose in Digital Imaging and Interventional Radiology (DIMOND III)

Workprogramme

Two years ANR Post Doctorate fellowship Mathematics and Numerics of Dynamic Cone Beam CT and ROI Reconstructions

Radiation Oncology Department Solution

ADROTERAPIA A MEDAUSTRON

Radiation Protection in Radiotherapy

FireScope + ServiceNow: CMDB Integration Use Cases

Intelligent Government From Data to Decision. Robert Lindsley Oracle, Public Sector Technology Group

Big Data in the context of Preservation and Value Adding

CROSS PLATFORM AUTOMATIC FILE REPLICATION AND SERVER TO SERVER FILE SYNCHRONIZATION

Proton beam. Medical Technical Complex Joint Institute for Nuclear Research, Dubna, Russia

5G Network Infrastructure for the Future Internet

Transcription:

BIG data big problems big opportunities Rudolf Dimper Head of Technical Infrastructure Division ESRF Slide: 1

! 6 GeV, 850m circonference Storage Ring! 42 public and CRG beamlines! 6000+ user visits/y! ~1000 experiments/y! ~1.5-2 PB/y! ~2000 publications/y Slide: 2

What is Big data? A volume of data that is impossible to process by simple means, hence requiring significant investments in IT infrastructure to capture, store, transfer, analyse and visualise datasets. Slide: 3

Big data one buzzword different challenges Industry Data mining Business intelligence Get additional information from (often) already existing data Data aggregation New field to make money Science Handling huge amounts of data Data transport Distributed data sources and/or storage (Global) data management Data preservation Analysing huge amounts of data Complexity of code Parallel architectures Different software environments Scientific results must be verifiable Slide: 4

Data " BIG data More photons " ESRF upgrade " new lattice " lower emittance " more photons " Shorter experiments Optimised experiments " More automation " Multiple detectors Better detectors " Higher resolution " Faster readout, less dead-time " New experimental methods become possible " Single experiments now TBs and 100 000 s of files Slide: 5

Example: Computed Tomography X-ray computed tomography (CT) is an imagine technique to produce cross sectional images (previously also known as CAT scans (Computed Axial Tomography) Used of diagnostics and therapy purposes Many slices form a volume CT is known as a moderate- to high-radiation diagnostic technique The Grenoble team in the control room of the medical research beamline. From left to right: Paola Coan, Alberto Bravin and Emmanuel Brun. Credit ESRF/Blascha Faust. Slide: 6

CT scans Breast Cancer 3D diagnostic computed tomography The typical dual view digital mammography is limited and does not detect 10-20% of breast tumors Hospital CT scans can not be used radiation dose too high Synchrotron CT scans: High energy X-rays Phase contrast imaging Novel mathematical algorithms: equally sloped tomography Spatial resolution 2-3 times higher than in a hospital Radiation dose 25 times lower In the US alone an estimated 40 000 persons/year die of breast cancer! Slide: 7

CT scans Breast Cancer Slide: 8

Wouldn t it be cool if we would have unlimited, easy to use IT in our facilities " Fast feedback loops = optimised experiments = more science I can leave the raw data at the RI no need to carry loads of disks User Data analysis easy and fast I am helped with my data analysis I can visualise data remotely I don t need to be a specialist to use RI s I can combine data sets from different RI s Slide: 9

IT a barrier? Difficult to scale up " Required bandwidth for detector read-out " On-line or near on-line data analysis while taking data Data analysis requires low latency IT " Automated metadata capture required " Data management a challenge Large number of individual files Slide: 10

IT a barrier? Difficult to scale up " Data sets too big to take home " Software environment a challenge Complexity, heterogeneity Users usually not affiliated to the facility Less IT literate scientists than e.g. in HEP Users require support to install software and analyse data " Scientific collaborations require distributed infrastructures and networks Slide: 11

Data Analysis THE barrier! We must match data analysis capabilities with advancements in detectors and sources! Since resources are scarce, we have to do this collaboratively Create a homogeneous and compatible data analysis environment Provide data analysis services to our users Share best practices and solutions Pool our resources between Research Infrastructures to create critical mass Build sustainable solutions Slide: 12

Moving the barrier Analytical Analytical Courtesy: T. Proffen Slide: 13

Data acquisition! Fight against latency! Guarantee bandwidth Experiment Control Hutch Example: Fast buffer storage:! fast preview / on-line data analysis! 2 days capacity! central storage push! multiple 10 Gbps! NFS (V4+V3) and CIFS! write >1 GB/s! read write speed Slide: 14

Reaching for the clouds Why Cloud?! Easy to use! Economies of scale through standardisation! Powerful software models! Flexibile resource allocation Complexity for simplicity! Virtualisation! Cloud platform to encapsulate services! An all-in-one environment to implement Data Analysis as a Service! Federation of local clouds to span RIs in Europe! Open to public clouds, allowing to choose in the future Slide: 15

Integrate Data Analysis Code! Make data analysis software available in an optimised environment! Data analysis without installing software on client computers Slide: 16

Data visualisation # Browsing and searching through meta-data # Browsing through experimental data # Multi dimensional web-based (remote) visualisation of large data sets High-resolution simulations of beam dynamics in electron linacs, particle density in physical space. LBNL visualization group X-ray diffraction pattern of a single Mimivirus particle, imaged at LCLS Slide: 17

Photon and Neutron Data Analysis As A Service Slide: 18

Thank you for your attention Slide: 19