Data Quality Monitoring. DAQ@LHC workshop



Similar documents
The Data Quality Monitoring Software for the CMS experiment at the LHC

First-year experience with the ATLAS online monitoring framework

World-wide online monitoring interface of the ATLAS experiment

ATLAS TDAQ Monitoring Questionnaire

TDAQ Analytics Dashboard

Online CMS Web-Based Monitoring. Zongru Wan Kansas State University & Fermilab (On behalf of the CMS Collaboration)

Web based monitoring in the CMS experiment at CERN

Database Monitoring Requirements. Salvatore Di Guida (CERN) On behalf of the CMS DB group

The Compact Muon Solenoid Experiment. CMS Note. Mailing address: CMS CERN, CH-1211 GENEVA 23, Switzerland

ATLAS Muon System Data Quality: Tools and Organization

(Possible) HEP Use Case for NDN. Phil DeMar; Wenji Wu NDNComm (UCLA) Sept. 28, 2015

Data Quality Monitoring for High Energy Physics (DQM4HEP) Module interfaces

Distributed Database Access in the LHC Computing Grid with CORAL

Database Services for CERN

Managing your Red Hat Enterprise Linux guests with RHN Satellite

DSS. Diskpool and cloud storage benchmarks used in IT-DSS. Data & Storage Services. Geoffray ADDE

Realization of Inventory Databases and Object-Relational Mapping for the Common Information Model

COMPASS Database Work in 2014/15

Das HappyFace Meta-Monitoring Framework

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group

Online Data Monitoring Framework Based on Histogram Packaging in Network Distributed Data Acquisition Systems

Data Quality Monitoring for the CMS Electromagnetic Calorimeter. Giuseppe Della Ricca, Emanuele Di Marco, Giovanni Franzoni, Benigno Gobbo

SPD Calibration: Status and Plans

Responsive, resilient, elastic and message driven system

CMS data quality monitoring web service

Glance Project: a database retrieval mechanism for the ATLAS detector

Web Based Monitoring in the CMS Experiment at CERN

Web application for detailed realtime database transaction monitoring

Improved metrics collection and correlation for the CERN cloud storage test framework

Challenge 10 - Attack Visualization The Honeynet Project / Forensic Challenge 2011 /

Test Run Analysis Interpretation (AI) Made Easy with OpenLoad

Computing at the HL-LHC

LHC Databases on the Grid: Achievements and Open Issues * A.V. Vaniachine. Argonne National Laboratory 9700 S Cass Ave, Argonne, IL, 60439, USA

Evolution of Database Replication Technologies for WLCG

A Tool for Evaluation and Optimization of Web Application Performance

CLAS12 Offline Software Tools. G.Gavalian (Jlab)

ATLAS job monitoring in the Dashboard Framework

Dashboard applications to monitor experiment activities at sites

Middleware- Driven Mobile Applications

IBM Digital Experience. Using Modern Web Development Tools and Technology with IBM Digital Experience

CMS Remote Monitoring at Fermilab

Proactive and Reactive Monitoring

Decoding DNS data. Using DNS traffic analysis to identify cyber security threats, server misconfigurations and software bugs

A J2EE based server for Muon Spectrometer Alignment monitoring in the ATLAS detector Journal of Physics: Conference Series

Responsiveness. Edith Law & Mike Terry

CMS Dashboard of Grid Activity

50 shades of Siebel mobile

Content Management Systems: Drupal Vs Jahia

design coding monitoring deployment Java Web Framework for the Efficient Development of Enterprise Web Applications

How To Develop A Mobile Application On An Android Device

Testing the In-Memory Column Store for in-database physics analysis. Dr. Maaike Limper

Adding Web 2.0 features to a Fleet Monitoring Dashboard

Online Performance Monitoring of the Third ALICE Data Challenge (ADC III)

MOBILE ARCHITECTURE FOR DYNAMIC GENERATION AND SCALABLE DISTRIBUTION OF SENSOR-BASED APPLICATIONS

Calibrations, alignment, online data monitoring

DEPLOYMENT GUIDE DEPLOYING F5 WITH MICROSOFT WINDOWS SERVER 2008

Oracle JRockit Mission Control Overview

The CMS analysis chain in a distributed environment

ATLAS Software and Computing Week April 4-8, 2011 General News

Open Source DBMS CUBRID 2008 & Community Activities. Byung Joo Chung bjchung@cubrid.com

Web Application Deployment in the Cloud Using Amazon Web Services From Infancy to Maturity

Database Performance Monitoring and Tuning Using Intelligent Agent Assistants

How To Build A Web App

Category: Business Process and Integration Solution for Small Business and the Enterprise

Software, Computing and Analysis Models at CDF and D0

Getting Started with Android Programming (5 days) with Android 4.3 Jelly Bean

Web Cloud Architecture

The Status of Astronomical Control Systems. Steve Scott OVRO

Managing Complexity in Mobile Application Deployment Using the OSGi Service Platform

DEPLOYMENT GUIDE. Deploying the BIG-IP LTM v9.x with Microsoft Windows Server 2008 Terminal Services

DEVELOPMENT OF AN ANALYSIS AND REPORTING TOOL FOR ORACLE FORMS SOURCE CODES

DiskPulse DISK CHANGE MONITOR

ZABBIX. An Enterprise-Class Open Source Distributed Monitoring Solution. Takanori Suzuki MIRACLE LINUX CORPORATION October 22, 2009

4/25/2016 C. M. Boyd, Practical Data Visualization with JavaScript Talk Handout

The next generation of ATLAS PanDA Monitoring

Content Management System (CMS)

Documentation of open source GIS/RS software projects

Sisense. Product Highlights.

Site specific monitoring of multiple information systems the HappyFace Project

Web based data visualization and processing tools for ASEC and SEVAN particle detector networks (DVIN 5.0)

HappyFace for CMS Tier-1 local job monitoring

MySQL Administration and Management Essentials

Status and Integration of AP2 Monitoring and Online Steering

Application Performance Testing Basics

Easy Setup Guide 1&1 CLOUD SERVER. Creating Backups. for Linux

TimePictra Release 10.0

Aspera Connect User Guide

Distributed Realtime Systems Framework for Sustainable Industry 4.0 applications

Dynamic Web Programming BUILDING WEB APPLICATIONS USING ASP.NET, AJAX AND JAVASCRIPT

ALICE Trigger and Event Selection QA

The full setup includes the server itself, the server control panel, Firebird Database Server, and three sample applications with source code.

ARDA Experiment Dashboard

Track Trigger and Modules For the HLT

MIGRATING DESKTOP AND ROAMING ACCESS. Migrating Desktop and Roaming Access Whitepaper

W3Perl A free logfile analyzer

A simple object storage system for web applications Dan Pollack AOL

Beyond The Web Drupal Meets The Desktop (And Mobile) Justin Miller Code Sorcery Workshop, LLC

ATLAS Petascale Data Processing on the Grid: Facilitating Physics Discoveries at the LHC

DEPLOYMENT GUIDE Version 2.1. Deploying F5 with Microsoft SharePoint 2010

Transcription:

Data Quality Monitoring DAQ@LHC workshop

Introduction What this presentation is not What it is and how it is organized Definition of DQM Overview of systems and frameworks Specific chosen aspects o Data collection, Storage, Visualization, analyses Operations Qualitative assessment & discussion Future 12/03/2013 Barthélémy von Haller, ALICE, CERN 1/26

Data Quality Monitoring (1) Online feedback on the quality of data Make sure to take and record high quality data Use the data taking time and the precious bandwidth in an optimal way Identify and solve problem(s) early 12/03/2013 Barthélémy von Haller, ALICE, CERN 2/26

Data Quality Monitoring (2) Data Quality Monitoring (DQM) involves Online gathering of data Analysis by user-defined algorithm(s) Storage of monitoring results Visualization Dataflow Data samples Analysis Results (histos, values ) Storage Vizualization 12/03/2013 Barthélémy von Haller, ALICE, CERN 3/26

Data Quality Monitoring (3) Requirements Range from formal documents to oral tradition Changing difficulties No interference with data taking Low latency update during runs Centralized results Modular, fast update of users code 12/03/2013 Barthélémy von Haller, ALICE, CERN 4/26

Overview: ATLAS «Generic, yet flexible and nicely scalable» Shared online/offline and by ~10 independent communities C++ Qt ROOT XML JS CGI - Python Dataflow Events Histos EMON 10-15% 1Hz/MT MT MT MT 150MTs 50MTs Histos 70000 DQParameter An histo DQRegion A subset of detector DQAlgorithm A specific analysis DQResult A color assessment Information Services 12/03/2013 Barthélémy von Haller, ALICE, CERN DQParam DQRegion DQResults Configura tion DB DQMF Web app Display GUI 5/26

Collector Quantities Via DIP Monitoring Elements Overview: CMS DQM framework part of reconstruction framework SMPS allows for trigger path selection 1 DQM app per subsystem Re-run reco according to needs Run certain analysis C++, Python, Java, Oracle DB Dataflow Events Histos 10% SMPS DQM DQM DQM 1 app/subsys TCP/IP 70000 Monitoring element Metadata Reference histo Quality Web based HTTP for data exchange Web UI only Web Server WEB GUI Memory & Storage Shifter Experts Summary WS WBM Condition DB 12/03/2013 Barthélémy von Haller, ALICE, CERN 6/26

20x500 Alarms ~100 Histos Overview: LHCb DIM Network RPC 3 distinct histogramming places C++ - ROOT Monitoring on 70-100% of data Auto. analysis every 15 minutes Automatic actions! Dataflows 5kHz From HLT Event data 3-5kHz (>70%) Event data 50Hz Aggregated histos ~100 Reco Monitoring Farm Histogram Presenter Histo Savesets Alarms Automatic analysis Alarms Control System 12/03/2013 Barthélémy von Haller, ALICE, CERN 7/26

Overview: ALICE AMORE framework C++ - ROOT Plugin architecture Notifications via DIM MySQL database Plugins developed in 20 institutes worldwide Histograms or values also retrieved in other systems MonitorObject Metadata (run, quality, expertshifter flag) ROOT base type encapsulated (TH1, TObject, scalar, ) Dataflow Events Monito -ring ~40 agents Agent Agent Agent MonitorObjects ~11500 /min Data Pool elogbook Expert GUIs HLT DCS Trigger Histos Values Alarms Orthos 12/03/2013 Barthélémy von Haller, ALICE, CERN Generic GUI 8/26

Data samples collection Main input to DQM Events (all) and sub-events (ALICE, ATLAS) Histograms or values coming from other systems, esp. HLT. Challenges Bandwidth issues: monitoring processes do events selection themselves instead of using a proxy (ATLAS, ALICE) Merge of all histograms from HLT (LHCb, CMS) 12/03/2013 Barthélémy von Haller, ALICE, CERN 9/26

Data analysis (1) Most typical analysis include Check for dead channels (holes in distribution) Check for noisy channels (peaks in distribution) Check for differences with a reference Check for non-empty error histograms Also Check for infrastructure issues Check for reconstruction issues Check for trigger/dead time issues 12/03/2013 Barthélémy von Haller, ALICE, CERN 10/26

Data analysis (2) Offline - Online Same framework? Same analysis? ATLAS CMS LHCb ALICE 12/03/2013 Barthélémy von Haller, ALICE, CERN 11/26

Storage Online Storage In-memory (ATLAS, CMS, LHCb) Database (ALICE) Online recent history Short-term, detailed (all) Archiving Long-term per run/per time interval (all) Does long-term archiving make sense? Is it still online DQM? Rather offline QA? 12/03/2013 Barthélémy von Haller, ALICE, CERN 12/26

Visualization (Desktop) Desktop GUI (all but CMS) C++, Qt or ROOT Very similar display showing online Also past objects (LHCb) Summary panel (ATLAS) Easy customization of the layout Quality and alarms Objects Tree Menu bar Tool bar Object(s) display Extra information on the object(s) 12/03/2013 Barthélémy von Haller, ALICE, CERN 13/26

Visualization (Desktop) 12/03/2013 Barthélémy von Haller, ALICE, CERN 14/26

Remote data access How can experts access DQM results? Remote ssh, interactive access to control room LHCb: open to collaboration ATLAS: provides tokens for a few hours access + replica CMS: closed but has a collaboration replica ALICE: closed Web GUI Images and ROOT objects download (ALICE) Generated static set of pages (ATLAS) Full-fledged web application (CMS) 12/03/2013 Barthélémy von Haller, ALICE, CERN 15/26

Visualization (Web, ALICE) 12/03/2013 Barthélémy von Haller, ALICE, CERN 16/26

Visualization (Web, ATLAS) 12/03/2013 Barthélémy von Haller, ALICE, CERN 17/26

Visualization (Web, CMS) Remote monitoring Layouts and render plugins Dynamic zoom and drawing options Caching Home-made DB for quick retrieval histograms 12/03/2013 Barthélémy von Haller, ALICE, CERN 18/26

Operations (1) A typical DQM shifter Check alarms Go through plots layouts (LHCb, ALICE, ATLAS) Find information in a Wiki or directly the GUI Contact experts / shifters (if present) Inform the shift leader 12/03/2013 Barthélémy von Haller, ALICE, CERN 19/26

Operations (2) Is it worth it? Yes! ATLAS Transition Radiation Tracker timing problem spotted CMS Wrong DAQ configuration of SST detector used during beam "ramp" audio alarm triggered by DQM fix before stable beam was declared. LHCb Bad mass plot -> anomaly -> inverse value of the magnetic field used by the HLT ALICE HLT started compressing the TPC laser events reconstruction problems DQM shifter spotted anomalies fix 12/03/2013 Barthélémy von Haller, ALICE, CERN 20/26

Qualitative review (1) Personal hit-list ATLAS: scalability, service-oriented CMS: interactive and dynamic web interface LHCb: automatic actions ALICE: versatility General: o Flexibility and reusability 12/03/2013 Barthélémy von Haller, ALICE, CERN 21/26

Qualitative review (2) Biggest challenges ATLAS: initial display was in Java and couldn t cope with the number of histograms (70k) CMS: handling huge quantities of histograms in the web gui and yet be responsive LHCb: nothing dramatic, summing up HLT histograms with data rates of the order of a LEP experiment ALICE: ensure stability despite of the multiple contributions 12/03/2013 Barthélémy von Haller, ALICE, CERN 22/26

Qualitative review (3) Could we have used the same system? 4 efficient and scalable DQM Similar architecture and technologies I believe the answer is: Yes up to a certain extent If not the framework, maybe have the same (Web) UI? Could we at least share more? (Regular) common meetings? 12/03/2013 Barthélémy von Haller, ALICE, CERN 23/26

Future LS1 Algorithm and thresholds adjusting on run conditions (ALICE, ATLAS) Adapt SW to dependencies changes (LHCb, CMS) Adapt framework for multi-core multi-thread environment (CMS, ALICE) Web GUI and tools (ALICE, [LHCb]) Full archiving (ATLAS) Proxy monitoring (ALICE) LS2 Complete rewrite along with online-offline framework (ALICE) 12/03/2013 Barthélémy von Haller, ALICE, CERN 24/26

Conclusion Much similarities General LHC DQM framework? Trans-experiments DQM meetings? DQM in all 4 experiments Work well and up to the demand Proved to be of great importance Future DQM is crucial, still often considered late Today is an opportunity to attack it early! 12/03/2013 Barthélémy von Haller, ALICE, CERN 25/26

A big thanks to: Markus Frank Clara Gaspar Serguei Kolos Marco Rovere Questions and discussion 12/03/2013 Barthélémy von Haller, ALICE, CERN 26/26

Backup 12/03/2013 Barthélémy von Haller, ALICE, CERN 27/26

Archiv e Live Visualization (Web, CMS) Remote monitoring Layouts and render plugins Dynamic zoom and drawing options Caching Web Browser Javascript 6:parsing 7:HTML+CSS generation 1:Async Request 5:JSON Python & C++ 4:Image & JSON generation? Shared memory Web Server 2:Ask data 3:Monit oring elemen t Collector Homemade DB (ROOT files) Histo renderer 12/03/2013 Barthélémy von Haller, ALICE, CERN 28/26

Visualization (Web, ATLAS) ATLAS Web Server Web Browser JS parsing Request XML CGI Ask XML Information Service 12/03/2013 Barthélémy von Haller, ALICE, CERN 29/26

Overview ATLAS Services approach C++ Qt ROOT XML JS Framework DQMF based on DQMCore DQParameter An histo DQRegion A subset of detector DQAlgorithm A specific analysis DQResult A color assessment [New histo] 12/03/2013 Barthélémy von Haller, ALICE, CERN 30/26

Overview: ATLAS Web app Dataflow Data samples EMON MT MT MT Histos Histos Histos Online Histogramming Information Services Display GUI DQResults DQParam DQMF Configura tion DB DQAlgo 12/03/2013 Barthélémy von Haller, ALICE, CERN 31/26