Advancements in Big Data Processing


1 Advancements in Big Data Processing Alexandre Vaniachine V International Conference "Distributed Computing and Grid-technologies in Science and Education" (Grid2012), July 16-21, 2012, JINR, Dubna, Russia

2 Big Data and the Grid The forthcoming conference will be the fifth one organized by the JINR Laboratory of Information Technologies. Distributed computing and grid technologies have gone beyond purely academic interest since the series started. Present-day grid technologies make it possible to integrate computing centers located anywhere worldwide and to provide these computing resources to the users. Present-day high technologies call for the development of high-performance computing. Activities involving huge information throughputs now encompass not only various areas of natural sciences but also business and industry. During the last few years the deployed distributed computing grid infrastructure has enabled the solution of a wide class of tasks that require the processing of enormous amounts of data. 2

3 Impressive Conference Series 3

4 Fourth Paradigm In a speech given just a few weeks before he was lost at sea off the California coast in January 2007, Jim Gray, a database software pioneer and a Microsoft researcher, sketched out an argument that an exaflood of scientific data is fundamentally transforming the practice of science. Dr. Gray called the shift a fourth paradigm. Invited talk at Grid2010 Conference, Dubna, Russia. A. Vaniachine 4

5 Big Data The ever-increasing volumes of scientific data present new challenges to Distributed Computing and Grid technologies. The emerging Big Data revolution drives new discoveries in scientific fields including nanotechnology, astrophysics, high-energy physics, biology and medicine. New initiatives are transforming data-driven scientific fields, enabling massive data analysis in new ways. To address the Big Data processing challenge, the LHC experiments rely on the computational grid infrastructure deployed in the framework of the Worldwide LHC Computing Grid. Following Big Data processing on the Grid, more than 8000 scientists analyze LHC data in search of discoveries. 5

6 Historic Milestone Global Effort → Global Success. Results today only possible due to extraordinary performance of accelerators, experiments and Grid computing. Observation of a new particle consistent with a Higgs Boson (but which one?). Historic Milestone, but only the beginning. Global implications for the future! 6

7 Big Data Processing for Discoveries J. Incandela for the CMS Collaboration, "The Status of the Higgs Search", July 4th 2012: "It would have been impossible to release physics results so quickly without the outstanding performance of the Grid (including the CERN Tier-0)." [Figure: CMS GEN-SIM/AODSIM production per month; number of concurrent ATLAS jobs, Jan-July 2012.] Includes MC production, user and group analysis at CERN, 10 Tier-1s, ~70 Tier-2 federations, >80 sites; >1500 distinct ATLAS users do analysis on the GRID. Available resources fully used/stressed (beyond pledges in some cases). Massive production of 8 TeV Monte Carlo samples. Very effective and flexible Computing Model and Operation team: accommodates high trigger rates and pile-up, intense MC simulation, and analysis demands from worldwide users (through e.g. dynamic data placement). Photo: Denis Balibouse, Pool/AP 7

8 Genesis Big Data experiments at the LHC study the quark-hadron transition (ALICE) and the electroweak phase transition. Baryon number was violated earlier; the mechanism: CP violation (A.D. Sakharov, 1967). CP violation in the Standard Model is too small for baryogenesis: this indicates new physics. CP-violation/baryogenesis studies beyond the LHC reach require Big Data. 8

9 Future Big Data Experiments High rate scenario for the Belle II DAQ, compared with the LCG TDR (2005)* event rates:

Experiment     Event Size [kB]   Rate [Hz]   Rate [MB/s]
Belle II                   300       6,000         1,800
ALICE (HI)              12,500         100         1,250
ALICE (pp)               1,000           -             -
ATLAS                    1,600           -             -
CMS                      1,500           -             -
LHCb                        25       2,000            50

* The LHC experiments are running at a factor of two or higher event rates. 5 July 2012, Martin Sevior, ICHEP, Melbourne 2012. 9
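As a sanity check of the table above, the raw DAQ throughput is simply the event size multiplied by the event rate. A minimal sketch in Python, with the recoverable table values hard-coded (rows whose rates are not listed are skipped):

```python
# Sanity check: DAQ throughput [MB/s] = event size [kB] x rate [Hz] / 1000.
# Values are taken from the table above; rows with missing rates are omitted.
DAQ_TABLE = {
    # experiment: (event size [kB], event rate [Hz])
    "Belle II (high-rate scenario)": (300, 6_000),
    "ALICE (HI)": (12_500, 100),
    "LHCb": (25, 2_000),
}

for experiment, (size_kb, rate_hz) in DAQ_TABLE.items():
    throughput_mb_s = size_kb * rate_hz / 1000.0
    print(f"{experiment:30s} {throughput_mb_s:8.0f} MB/s")
    # Belle II: 1800 MB/s, ALICE (HI): 1250 MB/s, LHCb: 50 MB/s,
    # matching the last column of the table.
```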

10 Big Data Processing in Future Experiments Belle II Computing Model: Grid-based Distributed Computing; common framework for DAQ and offline (basf2, based on ROOT I/O). [Figure: Belle II computing model — Belle II detector, Raw Data Storage and Processing, MC Production and Ntuple Production, MC Production (optional), Ntuple Analysis.] Zdeněk Doležal, SuperB Collaboration Meeting. 10

11 Six Sigma Quality of Industrial Production In industry, Six Sigma analysis improves the quality of production by identifying and removing the causes of defects. A Six Sigma process is one in which products are free of defects at the 3.4-defects-per-million level. In practice, an industrial Six Sigma process corresponds to 4.5 sigma, taking into account the 1.5-sigma shift from variations in the production process. In contrast, Big Data processing requires six sigma quality in a true mathematical sense: a 10^-8 level of defects. 11
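The defect levels quoted above can be checked directly from normal-distribution tail probabilities. A minimal sketch using scipy; the 1.5-sigma shift is the industry convention described on this slide:

```python
# Tail probability beyond k standard deviations of a normal distribution.
from scipy.stats import norm

industrial = norm.sf(6.0 - 1.5)   # "Six Sigma" with the 1.5-sigma shift -> ~3.4e-6
true_6sigma = norm.sf(6.0)        # six sigma in the true mathematical sense -> ~1e-9

print(f"Industrial Six Sigma (4.5 sigma): {industrial:.2e} defects per opportunity")
print(f"True six sigma:                   {true_6sigma:.2e} defects per opportunity")
# The industrial figure is the familiar 3.4 defects per million;
# the true six-sigma tail is well below the 10^-8 defect level
# quoted above for Big Data processing.
```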

12 Why Six Sigma? A 10^-5 event selection in online data processing yields >1 PB per year; a 10^-9 event selection follows in offline data processing. [Figure: cross-sections and event rates for various processes in (anti)proton-proton collisions as a function of √s (TeV), from σ_tot down to σ_jet, σ_W, σ_Z, σ_t and σ_Higgs (M_H = 150 and 500 GeV), spanning the Tevatron and LHC energies.] Physics discovery requires six sigma quality during Big Data processing. 12

13 Golden H → 4ℓ Channel: One Event per Billion LHC physics discovery requires Big Data processing errors below the 10^-8 level. 13

14 Six Sigma Quality Recovery of failures by re-tries achieves production quality in Big Data processing on the Grid at the six sigma level. No events were lost during the main ATLAS reprocessing campaign of the 2010 data, which reconstructed more than 1 PB of data on the Grid. In the last 2011 data reprocessing, only two collision events of the total could not be reconstructed; these events were reprocessed later in a dedicated data recovery step. Later, silent data corruption was detected in six events from the reprocessed 2010 data and in one case of five adjacent events from the 2011 reprocessed data. Corresponding to event losses below the 10^-8 level, this demonstrates sustained six sigma quality performance in Big Data processing. CMS keeps event losses below the 10^-8 level in the first-pass processing (not on the Grid). 14

15 Task Processing: ATLAS vs. CMS In Big Data processing scientists deal with datasets, not individual files. Thus a task, not a job, is the major unit of Big Data processing on the Grid. [Figure 1: Workload management architecture — DEfT: Dynamic Evolution for Tasks; request definition, task tagging, task evolution (recovery, speed-up and force-finish, forking and extensions, transient and intermediate tasks, autosplitting).] 15
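A minimal sketch of the task-as-unit idea: a dataset-level request is represented as a task that owns many jobs. The class names, state names and the dataset string are illustrative assumptions, not the DEfT, ProdSys or WMAgent interfaces:

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class JobState(Enum):
    DEFINED = auto()
    RUNNING = auto()
    FAILED = auto()      # candidate for an automatic re-try
    FINISHED = auto()

@dataclass
class Job:
    input_file: str
    state: JobState = JobState.DEFINED
    attempt: int = 0

@dataclass
class Task:
    """The unit of Big Data processing: a dataset-level request split into jobs."""
    dataset: str
    jobs: list[Job] = field(default_factory=list)

    def progress(self) -> float:
        done = sum(job.state is JobState.FINISHED for job in self.jobs)
        return done / len(self.jobs) if self.jobs else 0.0

# Hypothetical dataset name and file list, for illustration only.
task = Task("data.periodA.RAW", jobs=[Job(f"file_{i:05d}") for i in range(10)])
print(f"{task.dataset}: {task.progress():.0%} complete")
```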

16 A Powerful Analogy Splitting of a large data processing task into jobs is similar to the splitting of a large file into smaller TCP/IP packets during an FTP data transfer. Splitting data into smaller pieces achieves reliability through re-sending of the dropped TCP/IP packets. Likewise, in Big Data processing transient job failures are recovered by re-tries (see the sketch below). In file transfer the TCP/IP packet is the unit of checkpointing; in HEP Big Data processing the checkpointing unit is a job (PanDA) or a file (DIRAC). Unlike many tightly-coupled high-performance computing problems, which require inter-process communications to be parallelized, high energy physics Big Data processing is often called embarrassingly parallel, since each event is independent. However, event-level checkpointing is rarely used today, as its granularity is too small for Big Data processing in high energy physics. Next generation systems need event-level checkpointing for Big Data (see PanDA talk by K. De). 16
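A minimal sketch of the analogy: a task is split into per-file jobs, and transient job failures are recovered by re-tries, much as dropped TCP/IP packets are re-sent. The function names, the simulated failure probability and the retry limit are illustrative assumptions, not the PanDA or DIRAC interfaces:

```python
import random

MAX_ATTEMPTS = 3  # illustrative retry limit, analogous to TCP retransmission

def run_job(input_file: str) -> bool:
    """Placeholder for running one Grid job on one input file.
    A transient failure is simulated here with a 10% probability."""
    return random.random() > 0.10

def process_task(input_files: list[str]) -> list[str]:
    """Split a task into per-file jobs; re-try failed jobs up to MAX_ATTEMPTS."""
    failed = []
    for input_file in input_files:
        for attempt in range(1, MAX_ATTEMPTS + 1):
            if run_job(input_file):
                break  # the job/file-level "checkpoint": this piece is done
        else:
            failed.append(input_file)  # retries exhausted (a likely permanent fault)
    return failed

if __name__ == "__main__":
    task = [f"data_{i:04d}.root" for i in range(1000)]
    lost = process_task(task)
    print(f"{len(lost)} of {len(task)} files could not be processed")
```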

17 Reliability Engineering for Big Data on the Grid LHC experience has shown that Grid failures can occur for a variety of reasons. Grid heterogeneity makes failures hard to diagnose and repair quickly. Big Data processing on the Grid must tolerate a continuous stream of failures, errors and faults. While many fault-tolerance mechanisms improve the reliability of Big Data processing on the Grid, their benefits come at a cost. Reliability Engineering provides a framework for a fundamental understanding of Big Data processing on the Grid, which is not a desirable enhancement but a necessary requirement. 17

18 Jim Gray Hypothesis Heisenbugs are bugs which may or may not cause a fault for a given operation. If a Heisenbug is present in the system, the error can be removed by retrying the operation. Heisenbugs are also called transient or intermittent faults. Bohrbugs are bugs which always cause a failure when a particular operation is performed. In the presence of Bohrbugs, there will always be a failure on retrying the operation which caused the failure. Bohrbugs are also called permanent faults. 18
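The operational consequence of the hypothesis is that a re-try only helps against Heisenbugs. A minimal sketch; the two exception classes and the simulated failure rates are illustrative, not part of any Grid middleware:

```python
import random

class HeisenbugError(Exception):
    """Transient (intermittent) fault: may disappear on retry."""

class BohrbugError(Exception):
    """Permanent fault: the same operation always fails."""

def flaky_operation() -> str:
    if random.random() < 0.3:
        raise HeisenbugError("transient fault, e.g. a temporary storage glitch")
    return "ok"

def broken_operation() -> str:
    raise BohrbugError("permanent fault, e.g. a corrupted input file")

def retry(operation, attempts: int = 5) -> str:
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:  # sketch only: retry on any failure
            last_error = error
    raise last_error

print(retry(flaky_operation))   # almost always succeeds after a few retries
# retry(broken_operation)       # would still raise BohrbugError after 5 attempts
```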

19 CMS Failure Recovery Cost [Slide reproduces excerpts from the CMS conference report CMS CR-2012/070, "A new era for central processing and production in CMS", Edgar Mauricio Fajardo Hernandez for the CMS Collaboration, CERN, 10 May 2012.]

From the abstract: The goal for CMS computing is to maximise the throughput of simulated event generation while also processing the real data events as quickly and reliably as possible. To maintain this achievement as the quantity of events increases, since the beginning of 2011 CMS computing has migrated at the Tier-1 level from its old production framework, ProdAgent, to a new one, WMAgent, which offers improved processing efficiency and increased resource usage as well as a reduction in manpower. The largest operational challenges were in the usage and monitoring of resources, mainly a result of a change in the way work is allocated: instead of work being assigned to operators, all work is centrally injected and managed in the Request Manager system, and the task of the operators has changed from running individual workflows to monitoring the global workload.

Keeping up the infrastructure: if a component is marked as down, the operator logs in to the machine, checks the logs for the given component, restarts it, and makes an entry in the team e-log about the incident. If the problem persists, one of the team leaders checks and diagnoses the situation and, if needed, contacts the WMAgent support (developers). Keeping up the infrastructure is not limited to restarting components: it also involves checking the load on the machines and, if necessary, cleaning up disk space. An alert system is being developed to further automatize this task and will soon be commissioned for production [9].

Debugging site problems: given the distributed nature of the CMS computing model, debugging and monitoring of site-specific problems becomes a communication challenge. Site problems divide into two types: jobs cannot enter a site, or jobs can run but fail. The first is solved by communicating the problem to the GlideIn factory support and debugging it with them; the second involves monitoring of failed jobs in the Global Monitor together with the Dashboard information. Once the nature of the problem is found (a site problem or a configuration problem), it is propagated to the site through the Savannah and GGUS ticket systems [11] or to the requestor, respectively. The request is then either aborted (user configuration problems) or the submission to the site is stopped until the problem is understood and solved.

Closing-out requests: once a request is finished, all jobs have either succeeded or failed. The operator checks that the event count of the request is higher than 95% of the requested events for MC production and reprocessing and 100% for real data. If it is lower, another request must be created to recover the failed jobs; this normally involves communication with the coordinators. Once the event count is correct, the transfer subscription to the corresponding custodial Tier-1 site is monitored to be at 100%. Once the operator agrees that the event count and transfers are fine, he sets the status of the request on the ReqMgr as closed-out.

Other operator tasks consist mainly of working tightly with the Integration team to commission new sites for MC production and to test new versions of ReqMgr, Global Workqueue and WMAgent.

Invited talk, Grid2012 Conference, Dubna, Russia. 19

20 ATLAS Failure Recovery Cost CPU-hours used to recover transient failures: job resubmission avoids data loss at the expense of the CPU time used by the failed jobs. The distribution of tasks ordered by the CPU time used to recover transient failures is not uniform: most of the CPU time required for recovery was used in a small fraction of tasks. In the 2010 reprocessing, the CPU time used to recover transient failures was 6% of the CPU time used for reconstruction. In the 2011 reprocessing, the CPU time used to recover transient failures was reduced to 4% of the CPU time used for reconstruction. [Figure: recovery CPU-hours vs. task order.] ATLAS Production System, Ljubljana - June 20, 2012. 20
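The recovery cost quoted above is the CPU time consumed by failed job attempts expressed as a fraction of the CPU time of the successful reconstruction. A minimal sketch over hypothetical per-attempt accounting records (the record layout and values are illustrative assumptions):

```python
# Each record: (task_id, cpu_hours, succeeded). Records are hypothetical.
attempts = [
    ("task-001", 12.0, True),
    ("task-001", 3.5, False),   # transient failure, recovered by a re-try
    ("task-002", 40.0, True),
    ("task-003", 8.0, False),
    ("task-003", 8.2, True),
]

good_cpu = sum(cpu for _, cpu, ok in attempts if ok)
recovery_cpu = sum(cpu for _, cpu, ok in attempts if not ok)
print(f"Recovery overhead: {100.0 * recovery_cpu / good_cpu:.1f}% "
      "of the CPU time used for reconstruction")
# In the ATLAS reprocessing campaigns this ratio was 6% (2010) and 4% (2011).
```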

21 Waloddi Weibull 1939: Weibull published his paper on the Weibull distribution in probability theory and statistics. 1941: Weibull received a personal research professorship in Technical Physics at the Royal Institute of Technology in Stockholm from the arms producer Bofors. 1951: Weibull presented his most famous paper on the Weibull distribution, using seven case studies, to the American Society of Mechanical Engineers (ASME). 1972: the American Society of Mechanical Engineers awarded Weibull the Gold Medal. 1978: the Royal Swedish Academy of Engineering Sciences awarded Weibull the Great Gold Medal. The Weibull distribution is by far the world's most popular statistical model for production data. 21

22 Bi-modal Weibull Distribution: CPU-time Used to Recover Job Failures in ATLAS Data Reprocessing [Figure: number of tasks vs. Log10(CPU-hours) used to recover job failures.] The majority of costs are from failures at the end of a job; the 2011 improvements reduced the major costs. 22
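A minimal sketch of fitting a Weibull distribution to per-task recovery cost with scipy; the data array is synthetic, and a real bi-modal spectrum like the one on this slide would need a two-component mixture rather than the single-mode fit shown here:

```python
import numpy as np
from scipy.stats import weibull_min

# Synthetic per-task CPU-hours spent on recovering transient job failures.
cpu_hours = weibull_min.rvs(c=0.8, scale=50.0, size=500, random_state=42)

# Fit a single-mode Weibull distribution; the location is fixed at zero.
shape, loc, scale = weibull_min.fit(cpu_hours, floc=0)
print(f"Weibull shape={shape:.2f}, scale={scale:.1f} CPU-hours")

# A quick look at the spectrum on a logarithmic axis, as in the figure above.
hist, edges = np.histogram(np.log10(cpu_hours), bins=20)
print(hist)
```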

23 Time Overhead It takes about three million core-hours to process one petabyte of ATLAS data. Transient job failures and retries delay the reprocessing duration. Optimization of the ATLAS Big Data processing workflow and other improvements cut the tails and halved the duration of the petabyte-scale reprocessing on the Grid, from almost two months in 2010 to less than four weeks in 2011. Optimization of fault-tolerance techniques vs. the time overhead ("tails") induced in task completion is an active area of research in Big Data processing on the Grid. See the presentation by D. Golubkov at the ATLAS Computing Workshop session today. 23
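A back-of-the-envelope sketch of the duration scaling quoted above; the number of concurrently available cores, the data volume and the retry overhead are illustrative assumptions, not ATLAS accounting numbers:

```python
CORE_HOURS_PER_PB = 3_000_000   # ~3 million core-hours to process 1 PB (see above)
DATA_VOLUME_PB = 1.0            # assumed size of the reprocessing campaign
CONCURRENT_CORES = 10_000       # assumed cores available to the campaign
RETRY_OVERHEAD = 0.04           # CPU fraction spent recovering transient failures

cpu_hours = CORE_HOURS_PER_PB * DATA_VOLUME_PB * (1.0 + RETRY_OVERHEAD)
ideal_days = cpu_hours / CONCURRENT_CORES / 24.0
print(f"Ideal duration: {ideal_days:.1f} days")
# ~13 days with these assumptions; real campaigns take longer because the
# "tails" (late re-tries of failed jobs) keep only a shrinking number of cores busy.
```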

24 Conclusions The emerging Big Data revolution drives new discoveries in scientific fields including nanotechnology, astrophysics, high-energy physics, biology and medicine. In Big Data processing scientists deal with datasets, not individual files; a task (comprised of many jobs) became the unit of Big Data processing on the Grid. Splitting of a Big Data processing task into jobs enables fine-granularity checkpointing at the level of a single job/file. Fault-tolerance achieved through automatic re-tries of failed jobs induces a time overhead in task completion, which is difficult to predict. Reliability Engineering provides a framework for a fundamental understanding of Big Data processing on the Grid, which is not a desirable enhancement but a necessary requirement. The inter-relationship of fault-tolerance techniques and performance prediction is an active area of research in Big Data processing on the Grid. 24

25 Extra

26 Slide from ICHEP: Which Experiments use Big Data? Progress: Scale of the Solution. Progress in distributed computing and evolution of computing capacity: WLCG processes 1.5M jobs on the grid per day; more than 150 PB of both disk and tape. [Figure: monthly evolution of WLCG usage.] Ian Fisk, FNAL/CD. 26

27 Big Data Processing in LHC Experiments Big Data processing on the Grid requires a balance between CPU performance and storage. Pushing Big Data limits, LHC computing evolves from the hierarchy to the mesh. IEEE NSS/MIC - October 24. 27

28 Tier 0 First Pass Data Processing Minimizing event losses, LHC experiments implemented multi-pass processing for Big Data: first-pass data processing at Tier-0, reprocessing at Tier-1 on the Grid. CMS (right) reported first-pass event losses at the 10^-8 level; ATLAS (left) tolerates event losses at the 10^-5 level during first-pass processing. Lost events are recovered during reprocessing. 28
