Optimisation of the ATLAS track reconstruction software for Run-2. A. Salzburger, CERN


ATLAS Inner Detector (ID) track reconstruction
Track reconstruction is the most challenging step in event reconstruction
- classical pattern recognition problem, solved in steps:
  (1) track seeding using groups of 3D space points
  (2) track candidate building using a combinatorial filter
  (3) ambiguity solving of tracks in the silicon tracker
  (4) track extension into the transition radiation tracker (TRT)

The Run-1 data taking period
The LHC was performing outstandingly well
- this came with a price: event pile-up, i.e. simultaneous collisions per bunch crossing
[Figure: ATLAS Online Luminosity. Peak interactions per crossing versus month in 2010 (√s = 7 TeV), 2011 (√s = 7 TeV) and 2012 (√s = 8 TeV), with an annotation marking "what we initially designed for".]

Boundaries and projections
Run-2 start-up brings the excitement of new challenges
- increase to 13 TeV (from 7 TeV): more particles per collision
- increase of HLT rate to 1 kHz (from 400 Hz): more collisions to process per unit time
- increase of pile-up to <μ> ~ 40 (from ~20): more collisions per bunch crossing
Funding profile is likely to stay flat at best
- extrapolation: 3x speed-up of event reconstruction needed
- ID track reconstruction is the dominant part
Achieved a factor 4
- reduced relative fraction of ID in total reconstruction time
[Figure: ATLAS Simulation Preliminary. Reconstruction time per event [s] (RDO to ESD), full reconstruction and Inner Detector only, for software releases 17.2 (32bit), 19.0 (64bit), 19.1 (64bit) and 20.1 (64bit); √s = 14 TeV, <µ> = 40, 25 ns bunch spacing, Run 1 geometry, pp → tt̄, HS06 = 13.08.]

Changing the algebra and math libraries
Event Data Model (EDM) and algorithmic code were based on CLHEP
Track reconstruction makes heavy use of matrix manipulations
- usually N x M matrices (with N, M in [1,5]) and inversions
Identified that CLHEP was one of the bottlenecks in our software
- simple testbeds implemented to mimic Kalman filtering or Jacobian transport
- Eigen algebra library chosen (supports SIMD instructions)
Massive reworking of the entire ATLAS code
- more than 1000 packages changed
- Eigen/ATLAS interface via typedefs and plugins
http://eigen.tuxfamily.org/
[Figure: achieved speed-up w.r.t. CLHEP in the 5x5 matrix multiplication testbed, compared for CLHEP, MKL, SMatrix and Eigen.]
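The Jacobian-transport testbed mentioned on this slide can be pictured with a minimal sketch in plain C++. It is not the ATLAS or Eigen code: `Matrix5`, `multiply`, `transpose`, `isSymmetric` and `transportCovariance` are hypothetical names, and fixed-size `std::array` storage stands in for an algebra library. It only shows the 5x5 operation C' = J C J^T that such a testbed would time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-size 5x5 matrix; 5 is the dimension of a track parameter vector.
// Hypothetical type, standing in for an algebra library's matrix class.
constexpr std::size_t kDim = 5;
using Matrix5 = std::array<std::array<double, kDim>, kDim>;

// Plain product a * b. The loop bounds are compile-time constants, which is
// what lets a compiler (or a library such as Eigen) unroll and vectorise.
Matrix5 multiply(const Matrix5& a, const Matrix5& b) {
    Matrix5 c{};  // value-initialised: all elements are zero
    for (std::size_t i = 0; i < kDim; ++i)
        for (std::size_t k = 0; k < kDim; ++k)
            for (std::size_t j = 0; j < kDim; ++j)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

Matrix5 transpose(const Matrix5& m) {
    Matrix5 t{};
    for (std::size_t i = 0; i < kDim; ++i)
        for (std::size_t j = 0; j < kDim; ++j)
            t[j][i] = m[i][j];
    return t;
}

bool isSymmetric(const Matrix5& m) {
    for (std::size_t i = 0; i < kDim; ++i)
        for (std::size_t j = 0; j < i; ++j)
            if (m[i][j] != m[j][i]) return false;
    return true;
}

// Covariance transport with a Jacobian J: C' = J * C * J^T.
Matrix5 transportCovariance(const Matrix5& jacobian, const Matrix5& covariance) {
    assert(isSymmetric(covariance));  // a covariance matrix must be symmetric
    return multiply(multiply(jacobian, covariance), transpose(jacobian));
}
```

A timing testbed in this spirit would call `transportCovariance` many times in a loop and compare the wall-clock time between library back-ends.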

Cleaning up the Event Data Model (EDM)
Flattening the structures of the track reconstruction EDM
A trajectory
- needs to be expressed on different surfaces of the detector
- exists as charged / neutral representation
- may exist with covariance or without
- may be a 5-dimensional or 6-dimensional representation (when adding mass)
Surfaces: PlaneSurface, CylinderSurface, ConeSurface, DiskSurface, PerigeeSurface, StraightLineSurface
Run-1 EDM (x charge, x DIM):
- AtaPlane, AtaCylinder, AtaCone, AtaDisk, Perigee, AtaStraightLine
- MeasuredAtaPlane, MeasuredAtaCylinder, MeasuredAtaCone, MeasuredAtaDisk, MeasuredPerigee, MeasuredAtaStraightLine
Run-2 EDM:
- template <class Surface, class Charge, size_t DIM> AtaSurface;
A. Salzburger - ATLAS Track Reconstruction Optimisation during LS1 - CHEP April 14, 13, 2015
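How a single class template can replace the per-surface Run-1 classes can be sketched as follows. Only the template signature comes from the slide; the tag types, the member functions and the alias `NeutralAtaCylinder` are assumptions made for illustration, and covariance handling is left out.

```cpp
#include <array>
#include <cstddef>

// Hypothetical tag types standing in for the real surface and charge classes.
struct PlaneSurface {};
struct CylinderSurface {};
struct Charged { static constexpr bool isCharged = true; };
struct Neutral { static constexpr bool isCharged = false; };

// One template covers every surface / charge / dimension combination.
template <class Surface, class Charge, std::size_t DIM>
class AtaSurface {
    static_assert(DIM == 5 || DIM == 6,
                  "5 parameters, or 6 when the mass is added");
public:
    explicit AtaSurface(const std::array<double, DIM>& parameters)
        : m_parameters(parameters) {}

    const std::array<double, DIM>& parameters() const { return m_parameters; }
    static constexpr std::size_t dimension() { return DIM; }
    static constexpr bool charged() { return Charge::isCharged; }

private:
    std::array<double, DIM> m_parameters;
};

// Run-1 style names can be kept as aliases of the single template.
using AtaPlane = AtaSurface<PlaneSurface, Charged, 5>;
using NeutralAtaCylinder = AtaSurface<CylinderSurface, Neutral, 5>;  // hypothetical name
```

With this layout, adding a new surface type needs one new tag type and one alias rather than a new family of classes.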

Enormous reduction of code lines in tracking EDM
- while even extending the functionality

Package                        Run-1 C++  Run-1 C/C++ Header  Run-2 C++  Run-2 C/C++ Header
TrkParameterBase                      63                 561         11                 214
TrkParameters                       1715                 602          0                  52
TrkNeutralParameters                1425                 663          0                  48
ExtendedTrkParameterBase               0                 295          0                   0
ExtendedTrkParameters               1412                 514          0                   0
ExtendedTrkNeutralParameters        1416                 514          0                   0
Total                               6031                3149         11                 266

- nice consequence: simplification of the object persistency service (only one converter needed)
Additional cleanup campaigns
- removed lazy initialisation and dynamic memory allocation where possible (led to memory fragmentation)
- implemented type identifiers to avoid dynamic_cast testing
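The last point, type identifiers instead of dynamic_cast testing, can be illustrated with a small sketch. The class layout, the `SurfaceType` enum and `characteristicSize` are hypothetical and not taken from the ATLAS code; the sketch only shows the general technique of checking a cheap identifier and then using `static_cast`.

```cpp
#include <cassert>

// Hypothetical type identifier, returned by a virtual function.
enum class SurfaceType { Plane, Cylinder };

class Surface {
public:
    virtual ~Surface() = default;
    virtual SurfaceType type() const = 0;
};

class PlaneSurface : public Surface {
public:
    explicit PlaneSurface(double halfWidth) : m_halfWidth(halfWidth) {}
    SurfaceType type() const override { return SurfaceType::Plane; }
    double halfWidth() const { return m_halfWidth; }
private:
    double m_halfWidth;
};

class CylinderSurface : public Surface {
public:
    explicit CylinderSurface(double radius) : m_radius(radius) {}
    SurfaceType type() const override { return SurfaceType::Cylinder; }
    double radius() const { return m_radius; }
private:
    double m_radius;
};

// One virtual call plus a static_cast replaces a chain of dynamic_cast tests.
double characteristicSize(const Surface& surface) {
    switch (surface.type()) {
        case SurfaceType::Plane:
            return static_cast<const PlaneSurface&>(surface).halfWidth();
        case SurfaceType::Cylinder:
            return static_cast<const CylinderSurface&>(surface).radius();
    }
    assert(false && "unknown surface type");
    return 0.0;
}
```

The `static_cast` is safe here only because each class returns its own identifier; that invariant has to be maintained by hand when new surface types are added.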

Optimising the software
Example: magnetic field access
- numerical (Runge-Kutta) field integration is one of the big CPU consumers
- ATLAS adaptive Runge-Kutta propagator has been highly optimised
  - dedicated version was back-ported into Geant4
- field access was not yet optimised
  - deep caller chain
  - field data needed conversion
  - was written in FORTRAN90
- new field service implemented
  - simplified caller chain
  - use native units
  - use cell caching to store value of field -> minimised cache misses
  - speed-up of 20% in simulation, a few % in reconstruction
[Figure: magnetic field map in memory as 3D grid; field look-up in Runge-Kutta integration.]
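The idea of cell caching can be sketched as below. This is not the ATLAS field service: `FieldMap`, `CachedField`, the unit grid spacing, the single field component and the trilinear interpolation are all assumptions chosen to keep the example small. The point it illustrates is that successive look-ups inside the same grid cell, as happen within a Runge-Kutta step, reuse eight cached corner values instead of reading the large map again.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical field map: one field component on a regular 3D grid with unit
// spacing; nodes sit at integer coordinates 0..n-1 along each axis.
class FieldMap {
public:
    FieldMap(std::size_t nx, std::size_t ny, std::size_t nz)
        : m_nx(nx), m_ny(ny), m_nz(nz), m_values(nx * ny * nz, 0.0) {
        assert(nx >= 2 && ny >= 2 && nz >= 2);
    }
    double& at(std::size_t i, std::size_t j, std::size_t k) {
        return m_values[(i * m_ny + j) * m_nz + k];
    }
    double at(std::size_t i, std::size_t j, std::size_t k) const {
        return m_values[(i * m_ny + j) * m_nz + k];
    }
    std::size_t nx() const { return m_nx; }
    std::size_t ny() const { return m_ny; }
    std::size_t nz() const { return m_nz; }
private:
    std::size_t m_nx, m_ny, m_nz;
    std::vector<double> m_values;
};

// Keeps the eight corner values of the cell visited last.
class CachedField {
public:
    explicit CachedField(const FieldMap& map) : m_map(map) {}

    // Trilinear interpolation; (x, y, z) must lie inside the grid.
    double value(double x, double y, double z) {
        if (!m_valid || !inCell(x, m_i) || !inCell(y, m_j) || !inCell(z, m_k)) {
            fill(x, y, z);
        }
        const double fx = x - m_i, fy = y - m_j, fz = z - m_k;
        const double c00 = m_c[0][0][0] * (1 - fx) + m_c[1][0][0] * fx;
        const double c01 = m_c[0][0][1] * (1 - fx) + m_c[1][0][1] * fx;
        const double c10 = m_c[0][1][0] * (1 - fx) + m_c[1][1][0] * fx;
        const double c11 = m_c[0][1][1] * (1 - fx) + m_c[1][1][1] * fx;
        const double c0 = c00 * (1 - fy) + c10 * fy;
        const double c1 = c01 * (1 - fy) + c11 * fy;
        return c0 * (1 - fz) + c1 * fz;
    }

    // Number of times the cache had to be refilled from the map.
    std::size_t cellFills() const { return m_fills; }

private:
    static bool inCell(double x, std::size_t i) {
        return x >= static_cast<double>(i) && x <= static_cast<double>(i + 1);
    }
    static std::size_t cellIndex(double x, std::size_t n) {
        assert(x >= 0.0 && x <= static_cast<double>(n - 1));
        return std::min(static_cast<std::size_t>(std::floor(x)), n - 2);
    }
    void fill(double x, double y, double z) {
        m_i = cellIndex(x, m_map.nx());
        m_j = cellIndex(y, m_map.ny());
        m_k = cellIndex(z, m_map.nz());
        for (std::size_t di = 0; di < 2; ++di)
            for (std::size_t dj = 0; dj < 2; ++dj)
                for (std::size_t dk = 0; dk < 2; ++dk)
                    m_c[di][dj][dk] = m_map.at(m_i + di, m_j + dj, m_k + dk);
        m_valid = true;
        ++m_fills;
    }

    const FieldMap& m_map;
    bool m_valid = false;
    std::size_t m_i = 0, m_j = 0, m_k = 0;
    std::size_t m_fills = 0;
    double m_c[2][2][2] = {};
};
```

In this sketch two look-ups in the same cell trigger a single refill, which `cellFills()` makes visible.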

Centralise tasks
Track to calorimeter cluster association is a frequent process in event reconstruction:
- clients throughout the combined reconstruction, e.g.: muon/tau/electron reconstruction, particle flow, photon reconstruction, missing track Et, etc.
- analysis showed that this was done up to six times per track in our factory design
  - switched to a service design where all tracks are dressed with their calorimeter cell associations
  - the time saved on multiple calls frees CPU cycles that can be invested into a more precise job
- neatly ties in with the new ATLAS analysis event data format (xAOD)
Track prediction through an example calorimeter cell
- Run-1/2: intersection with cell center
- Run-2: additionally entry/exit position and path length in cell
[Figure: excerpt from a paper shown on the slide, with labels "parameters at vertex (defining)", "parameters at ID exit" and "parameters at Calorimeter exit". Excerpt text: "... class TrackParticle that marks the analysis representation of tracking. The constructor of the TrackParticleBase shows the new philosophy that allows multiple representations of the underlying track within the detector, while keeping one ParametersBase object specifically outstanding to identify the track state where the four-momentum is defined." Caption: "Figure 6: The new ParticleBase object illustrated in an example based on the ATLANTIS [5] event display. The track is hereby represented with one single TrackParticleBase object at three different stages in the detector: as a MeasuredPerigee expression close to the interaction point (defining parameters), through TrackParameters at the exit of the Inner Detector and the Calorimeter, respectively."]
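The difference between each client redoing the association and a central service doing it once can be sketched as follows. The class `CaloAssociationService`, the integer track identifiers and the dummy `extrapolate` function are hypothetical; in particular, this sketch computes the association lazily on first request, whereas the slide describes tracks being dressed with their associations up front.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical result of a track-to-calorimeter association.
struct CaloAssociation {
    std::vector<int> cellIds;
};

// Central service: the expensive extrapolation runs once per track, and every
// client (muon, tau, electron, ... reconstruction) reads the stored result.
class CaloAssociationService {
public:
    const CaloAssociation& associate(int trackId) {
        auto it = m_cache.find(trackId);
        if (it == m_cache.end()) {
            it = m_cache.emplace(trackId, extrapolate(trackId)).first;
        }
        return it->second;
    }

    // Number of expensive extrapolations actually performed.
    std::size_t extrapolations() const { return m_extrapolations; }

private:
    // Stand-in for the costly extrapolation through the calorimeter;
    // the returned cell identifiers are dummy values.
    CaloAssociation extrapolate(int trackId) {
        ++m_extrapolations;
        return CaloAssociation{{trackId * 10, trackId * 10 + 1}};
    }

    std::unordered_map<int, CaloAssociation> m_cache;
    std::size_t m_extrapolations = 0;
};
```

Six clients asking for the same track then cost one extrapolation instead of six, which is the saving the slide refers to.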

Being smarter
Track reconstruction software for Run-1 was designed with a lot of redundancy and safety margin
Run-1 performance convinced us that our system was understood
- and extremely well modelled by MC
Re-investigation of track seeding
- taking high purity seeds from strips
- make optimal use of the new innermost Pixel layer (IBL)
Greatly improved the seed purity
- managed to be more efficient in less time: ~25 % saving
Triple seeds can be built from:
- pixel space points only (PPP)
- strip space points only (SSS)
- a combination of both, e.g. (PSS)
Seeding iterations: Step 1, then Step 2: remove space points from Step 1

Efficiency of a seed with 3 space points resulting in a successful track candidate
<μ>    PPP    PPS    PSS    SSS
0      57%    26%    29%    66%
40     17%     6%     5%    35%

When requiring confirmation by another space point (Run-2 strategy)
<μ>    PPP+I  PPS+I  PSS+I  SSS+I
0      79%    53%    52%    86%
40     39%     8%    16%    70%
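The confirmation requirement in the second table can be pictured with a deliberately simplified sketch. It assumes a straight line in the r-z plane through the first and last space point of the seed and a fixed tolerance; the real seeding uses a proper track model, and `SpacePoint`, `predictZ` and `isConfirmed` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical space point in cylindrical coordinates (radius, z).
struct SpacePoint {
    double r;
    double z;
};

using TripletSeed = std::array<SpacePoint, 3>;

// Straight line in the r-z plane through the first and last seed point,
// evaluated at radius r. Simplifying assumption: no curvature correction.
double predictZ(const TripletSeed& seed, double r) {
    assert(seed[2].r != seed[0].r);  // the seed points must sit on different radii
    const double slope = (seed[2].z - seed[0].z) / (seed[2].r - seed[0].r);
    return seed[0].z + slope * (r - seed[0].r);
}

// A seed is confirmed if at least one additional space point lies within
// the tolerance of the extrapolated position.
bool isConfirmed(const TripletSeed& seed,
                 const std::vector<SpacePoint>& candidates,
                 double tolerance) {
    for (const SpacePoint& sp : candidates) {
        if (std::abs(sp.z - predictZ(seed, sp.r)) < tolerance) {
            return true;
        }
    }
    return false;
}
```

Seeds that fail `isConfirmed` would be dropped before the more expensive track candidate building, which is where the purity gain in the table pays off.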

Doing better
During Run-1 we developed a neural network based cluster splitting
- aimed at identifying merged clusters stemming from multiple particles
- was run by default before the pattern recognition (ran over all clusters by default)
Second iteration
- update of the ambiguity solving method
- only clusters on track candidates are further tested for splitting (fewer hits into the pattern recognition)
- at the same time use this information to allow for more shared hits on track
- about 10% CPU gain
[Figure: ATLAS Preliminary Simulation, ATL-PHYS-PUB-2015-006. Algorithmic efficiency for τ → ν_τ 3π± versus τ pT [GeV] (0 to 1000), Baseline compared with TIDE; 2 shared SCT clusters, no secondaries.]

Free lunch
Some CPU saving came for free by updates
- new kernel in Scientific Linux 6 gave approximately 10% saving for total event reconstruction
- switching from 32bit to 64bit architecture (did increase memory footprint slightly)
- newer compiler version
- some vectorisation benefits that came into place via the Eigen library
  nota bene: track reconstruction is mainly operating on
  - local coordinate systems: DIM 1,2
  - global coordinate systems: DIM 3
  - full track representation: DIM 5
  this is not optimal for a DIM 4 based vector register
  [ we are currently revisiting a potential DIM 4 based Runge-Kutta method ]
- switch to Intel math library (pre-loaded as a plug-in)

The factor 4
[Figure: ATLAS Simulation Preliminary. Reconstruction time per event [s] (RDO to ESD), full reconstruction and Inner Detector only, for software releases 17.2 (32bit), 19.0 (64bit), 19.1 (64bit) and 20.1 (64bit); √s = 14 TeV, <µ> = 40, 25 ns bunch spacing, Run 1 geometry, pp → tt̄, HS06 = 13.08.]
Timeline from LHC Run-1 through LS1 (planning & deployment) to LHC Run-2:
- Nov 2012: Tracking SW workshop, Run-2 planning
- Jan 2013: Eigen/algebra tests
- > 1 year w/o working head release
- Oct 2013: Tracking SW workshop, LS1 mid-term
- integration, mag field, pattern updates, TIDE changes
- March/April 2015: Run-2 release frozen

Today and Tomorrow
LS1 gave a unique opportunity to clean up the ATLAS track reconstruction software
- massive campaign with a rework of almost the entire repository
  - mixture of technology improvements, algorithmic improvements and simple code cleanup
- disentangling the impact of the different projects is almost impossible
  - due to the time constraints of LS1 we had to develop and deploy in parallel
We achieved a factor 4 speed-up of the overall event reconstruction time
- at the same time improving physics performance on all ends
- mainly achieved by the Inner Detector track reconstruction
- ready for Run-2 data taking
Currently reviewing the tracking software for the future ATLAS framework
- anticipate extensive use of concurrency

Thank you for your kind attention.