Einbindung von Cloud-Ressourcen in Workflows der Teilchenphysik und Messung des Underlying Event in Proton-Proton-Kollisionen am LHC.


IEKP-KA/

Einbindung von Cloud-Ressourcen in Workflows der Teilchenphysik und Messung des Underlying Event in Proton-Proton-Kollisionen am LHC

Stephan Riedel

DIPLOMA THESIS at the Faculty of Physics of the Karlsruhe Institute of Technology

Assessor: Prof. Dr. G. Quast, Institute for Experimental Particle Physics
Co-assessor: Prof. Dr. M. Feindt, Institute for Experimental Particle Physics

October 31, 2011


German Summary

The search for the elementary building blocks of the matter surrounding us has occupied scientists and philosophers for centuries. Already in ancient Greece, the existence of fundamental constituents was postulated, which were named átomos, the indivisible. With the arrangement of the atoms according to their mass numbers in the periodic table of the elements by Mendeleev and Meyer, a first order was brought into the until then seemingly unordered matter. With the discovery of the electron by Thomson and of the atomic nucleus by Rutherford, this picture was refined further. Later it turned out that the atomic nucleus and also its constituents are composite objects, whose structure can be resolved by high-energy scattering experiments. For this purpose, particle accelerators with ever increasing center-of-mass energies have been built over the last decades. The current knowledge about the elementary building blocks and their interactions (apart from gravity) is described by the Standard Model of particle physics: there are six different quarks (up, down, strange, charm, top, bottom) and six different leptons (the electron, the muon and the tau lepton together with their associated neutrinos), which can each be grouped into three families. For each of these particles there exists an antiparticle with opposite charge and weak isospin but otherwise identical properties. The interactions between these particles are mediated by so-called gauge bosons: photons for the electromagnetic interaction, the electrically charged W± and the neutral Z⁰ bosons for the weak interaction, and a total of eight color-charged gluons for the strong interaction. Nevertheless, some questions remain unanswered so far, for example the origin of the masses of the elementary particles, the nature of dark matter and dark energy, and the unification of all three interactions into one theory.

The Large Hadron Collider (LHC) at the European research center CERN¹ in Geneva was built to answer some of these questions. It is a 27 km long storage ring for proton-proton collisions, running underground from Lake Geneva to the French Jura mountains, with several particle detectors located along the ring. With a center-of-mass energy that can be extended up to √s = 14 TeV, the LHC is the most powerful particle accelerator built so far. One of the experiments at the LHC is the Compact Muon Solenoid (CMS) detector, which can be used to measure a large variety of physical processes; it is particularly well suited for the identification of muons and the measurement of their momenta. Protons consist of three quarks, which carry color charge and are bound together by the strong interaction. Due to the so-called color confinement, it is not possible to observe free quarks: the energy stored in the color field between two separating quarks grows with increasing distance and leads to the creation of a new quark-antiquark pair as soon as enough energy is available. The two quarks taking part in the actual hard interaction form new particles through hadronization processes. The remnants of the protons consist of unstable, color-charged particles which radiate gluons and form new objects that pollute the measurement of the hard interaction; together with particles originating from multiple parton scattering, these form the so-called Underlying Event. In total, about 10⁹ collisions per second are observed at the CMS detector. A sophisticated trigger system separating interesting from uninteresting events reduces the incoming data rate, but with an average size of 1.5 MB per event, an enormous amount of data still has to be processed and stored. To compare theoretical models with the measurements, simulated events are additionally produced with so-called Monte Carlo methods. Simulated and measured data have to be distributed to research groups worldwide and analyzed; for this purpose, the Worldwide LHC Computing Grid (WLCG) was designed.

¹ Conseil Européen pour la Recherche Nucléaire

Dynamic Extension of Local Batch Systems with Cloud Resources

Besides the Worldwide LHC Computing Grid, computing clusters at local institutes and research groups are used to analyze their data. To guarantee an effective and even utilization of the computing nodes, load-balancing systems are employed. These batch systems typically have a static configuration: computing nodes are removed from the system only for maintenance work and added only when the resources are extended. With the advent of cloud computing, however, entirely new concepts can be realized: both computing time and the infrastructure required for it can be rented from cloud providers, and only the resources actually used are charged. The foundation of cloud computing is virtualization, which allows several operating systems to be run simultaneously on a single machine. The framework ROCED (Responsive On-demand Cloud Enabled Deployment), developed at the Institute for Experimental Particle Physics at the Karlsruhe Institute of Technology, makes it possible to react to an increased demand for computing capacity and to automatically add nodes started in a computing cloud to a batch system. With this method, peak loads can be absorbed and the local hardware, which would otherwise remain unused during times of low demand, can be reduced (cf. figure 1).

Figure 1.: Comparison of the utilization of a batch system without (top) and with (bottom) cloud extension; the panels sketch the available capacity, the demand and the resulting over- and under-utilization of the computing resources over time.

So far, the cloud interfaces of the Amazon Elastic Computing Cloud (EC2) and Eucalyptus as well as the batch system TORQUE have been supported. The focus of this thesis lies on extending ROCED with interfaces to the batch system Oracle Grid Engine, the cloud interface OpenNebula and the VPN tool OpenVPN.
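The core idea behind ROCED (monitor the queue, boot cloud nodes on demand, retire them when idle) can be sketched as a simple control loop. The snippet below is an illustration only: the batch and cloud objects and all of their methods are hypothetical stand-ins, not the actual ROCED interfaces.

    import time

    def scaling_loop(batch, cloud, poll_interval=60):
        """Boot cloud nodes while jobs are queued; drain nodes that fall idle."""
        while True:
            queued = batch.queued_jobs()       # jobs waiting in the batch system
            if queued > 0:
                # request one virtual machine per waiting job, up to the quota
                for _ in range(min(queued, cloud.free_slots())):
                    node = cloud.boot_node()   # start VM, contextualize, join VPN
                    batch.add_node(node)       # register it with the scheduler
            for node in batch.idle_nodes():    # cloud nodes with no work left
                batch.remove_node(node)        # take it out of scheduling first
                cloud.shutdown_node(node)      # then release the cloud resource
            time.sleep(poll_interval)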

The latter is necessary to integrate cloud nodes into the local network and to enable access to local storage elements.

Measurement of the Underlying Event Using Jet Areas

The classical approach to measuring the Underlying Event is to study the region transverse to the object with the highest transverse momentum p_T; the activity in this part of the detector is then a measure of the Underlying Event. The relatively new jet area/median method determines the Underlying Event with the help of a new observable ρ', the median of the distribution of p_T,j/A_j over all jets in an event, where p_T,j is the transverse momentum and A_j the active area of a jet:

    ρ' = C · median_{j ∈ physical jets} [ p_T,j / A_j ]        (1)

In a first, already published study of ρ', data taken in 2009 at a center-of-mass energy of √s = 900 GeV were analyzed [1]. The work presented here continues the studies of ρ' with 2010 data at center-of-mass energies of √s = 900 GeV and √s = 7 TeV. For this purpose, the systematic uncertainties were determined anew, and the study was refined by examining the dependence of ρ' on the event scale, which indicates the momentum range of the leading object in an event. It turned out that all predictions from Monte Carlo samples show deviations from the measured data. For the study of the distributions as a function of the event scale, the mean of each distribution was considered (cf. figure 2). Although the predicted means lie relatively close to the data within certain fluctuations, none of the Monte Carlo tunes used describes the measurements perfectly. Together with the unfolding results presented in [2], the jet area/median method can be used to produce new Monte Carlo samples that describe the Underlying Event better. This is necessary in particular for searches for processes in which signal and background are hard to separate; an improved model of the Underlying Event therefore increases the significance in the identification of new phenomena.
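A minimal sketch of how the observable of Eq. (1) is computed per event (toy jet values, with the constant C set to 1):

    import statistics

    def rho_prime(jets, C=1.0):
        """Median of pT/A over the physical jets of one event, cf. Eq. (1).
        jets: list of (pt, area) tuples."""
        ratios = [pt / area for (pt, area) in jets if area > 0.0]
        return C * statistics.median(ratios)

    # toy event: three jets given as (pT in GeV, active area)
    print(rho_prime([(25.0, 0.9), (4.0, 0.7), (2.5, 0.8)]))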

Figure 2.: Means of the distributions of ρ' as a function of the event scale, i.e. the transverse momentum of the leading track jet (track jets clustered with the k_T algorithm, R = 0.6), for √s = 900 GeV (left) and √s = 7 TeV (right). Measured values are shown as black data points with systematic and systematic-plus-statistical uncertainty bands; Monte Carlo predictions (Pythia 6 Z1, Z2 and D6T, and Pythia 8 4C) are drawn as colored lines.


IEKP-KA/ Integration of Cloud Resources into Workflows of Particle Physics Analyses and Measurement of the Underlying Event in Proton-Proton-Collisions at the LHC Stephan Riedel DIPLOMA THESIS Physics Faculty, Karlsruhe Institute of Technology Assessor: Prof. Dr. G. Quast Institute for Experimental Particle Physics Co-assessor: Prof. Dr. M. Feindt Institute for Experimental Particle Physics October 31st, 2011


Introduction

The current knowledge about the fundamental building blocks of nature and their interactions is combined in the Standard Model of particle physics. This framework describes measured data with remarkable success. Chapter 1 gives an introduction to it together with its mathematical formalism. Over the last decades, the construction of particle accelerators with steadily increasing center-of-mass energies and the measurements of scattering experiments contributed to this picture to a great extent. But there are still some unanswered questions, like the origin of the masses of the elementary particles, the nature of dark matter and dark energy, as well as the unification of all three fundamental forces into one theory. To provide an answer to some of these issues, the Large Hadron Collider (LHC) was built, residing in a 27 km long underground tunnel reaching from Lake Geneva to the French Jura mountains. Colliding protons at center-of-mass energies of up to √s = 14 TeV, it is the most powerful particle accelerator in the world. One of the particle detectors located at the superconducting storage ring keeping the particles on track is the Compact Muon Solenoid (CMS) experiment. Being a multi-purpose particle detector, it is able to reconstruct numerous physical processes taking place during the collisions. The LHC and the CMS experiment are described in chapter 2. In order to compare measured with simulated data, the CMS software framework (CMSSW) was developed. It contains tools for event reconstruction as well as event generators for the simulation of artificial events based on the predictions of the Standard Model. Furthermore, both simulated and measured data have to be stored in a redundant manner and distributed to research groups located all over the world. For this purpose, the Worldwide LHC Computing Grid (WLCG) was established. Chapter 3 provides an introduction to both CMSSW and the WLCG. Besides the WLCG, local computing clusters are employed to analyse the data, spreading the load over numerous computing nodes by using batch systems.

During times of high demand, the newly emerging sector of cloud computing allows one to respond dynamically to peak loads and to add additional cloud resources to the local batch system. The scheduling of cloud resources is described in chapter 4. Besides the hard interaction, additional soft contributions arise during a proton-proton collision due to QCD background activity, polluting the detector and affecting all measurements. These additional components are referred to as the Underlying Event (UE). In chapter 5, an advanced study of measuring the UE using the new jet area/median approach is presented.

Contents

1. The Standard Model of Particle Physics
   1.1. Elementary Particles
   1.2. Mathematical Formalism
   1.3. The Fundamental Forces of Particle Physics
        1.3.1. The Electromagnetic Interaction
        1.3.2. The Weak Interaction and the Electroweak Unification
        1.3.3. The Strong Interaction
   1.4. Cross Section and Luminosity

2. The CMS Experiment at the Large Hadron Collider
   2.1. The Large Hadron Collider
   2.2. The CMS Experiment
        2.2.1. The Coordinate System
        2.2.2. The Solenoid Magnet
        2.2.3. The Tracking System
        2.2.4. The Electromagnetic Calorimeter
        2.2.5. The Hadronic Calorimeter
        2.2.6. The Muon System
        2.2.7. The Trigger System

3. CMS Computing
   3.1. Software Used in HEP Analyses
        3.1.1. The CMS Software Framework CMSSW
        3.1.2. ROOT
        3.1.3. Monte Carlo Event Generation
        3.1.4. Event Reconstruction
   3.2. The Worldwide LHC Computing Grid
        3.2.1. Grid User Management
        3.2.2. Grid Usage

4. Extending Batch Systems with Cloud Resources
   4.1. Cloud Computing
        4.1.1. The Layers of Cloud Computing
        4.1.2. Virtualization
        4.1.3. The OpenNebula Cloud Interface
        4.1.4. The CERN Virtual Software Appliance Project CernVM
   4.2. Distributed Resource Management by Batch Systems
   4.3. Virtual Private Networks
        4.3.1. Introduction to OpenVPN
   4.4. Scheduling Cloud Resources
        4.4.1. ROCED
        4.4.2. Integration of OpenNebula Cloud Nodes into the IEKP HEP Computing Workflow
   4.5. Conclusions and Outlook

5. Studies of the Underlying Event
   5.1. The Traditional Approach of Measuring the Underlying Event
   5.2. The Jet Area/Median Method
   5.3. Data Samples
   5.4. Event Selection
   5.5. Reconstruction
        5.5.1. Track Selection
        5.5.2. Jet Reconstruction
   5.6. Systematic Uncertainties
        5.6.1. Track Efficiency and Fake Rate Uncertainty
        5.6.2. Systematic Effects due to Track Selection Variations
        5.6.3. Track-Jet Response Uncertainty
        5.6.4. Trigger Efficiency Bias
        5.6.5. Malfunctioning Tracker Components and Tracker Alignment
        5.6.6. Overall Systematic Uncertainty
   5.7. Event Scale Dependence of ρ'
        5.7.1. Inclusive Measurement
        5.7.2. Event Scale Dependency
   5.8. Conclusions and Outlook

A. Cloud Computing
   A.1. OpenNebula Examples
        A.1.1. Image Template File
        A.1.2. Machine Template File
        A.1.3. Script to Trigger the Contextualization Process
        A.1.4. Contextualization Script
        A.1.5. XML-RPC Interface

B. Underlying Event
   B.1. PYTHIA 6 Tune Parameters
   B.2. Used Data Sets
        B.2.1. Data Sets and Monte Carlo Samples for √s = 7 TeV
        B.2.2. Data Sets and Monte Carlo Samples for √s = 900 GeV
        B.2.3. Event Scale Dependency of ρ' at √s = 7 TeV
        B.2.4. Event Scale Dependency of ρ' at √s = 900 GeV

List of Figures

List of Tables

Bibliography


Chapter 1

The Standard Model of Particle Physics

Mankind has always been in pursuit of a better understanding of nature and its structures. The idea of fundamental building blocks of matter goes back to ancient Greece, which claimed the existence of indivisible units called átomos. Over the course of time, scientists were able to complement and refine this picture: in the 19th century John Dalton found out that every chemical element consists of characteristic atoms. Later, J.J. Thomson discovered the electron, which confirmed that atoms are composite objects themselves. In the early 20th century, Ernest Rutherford was able to show with his scattering experiments that atoms consist of a very small nucleus surrounded by electrons, as opposed to the plum pudding model proposed until then. A few years later, the development of quantum mechanics drastically increased the insight into sub-atomic structures and processes. Today, the Standard Model of particle physics describes the current knowledge of the building blocks of nature with great success. All elementary particles can be grouped into the two categories of fermions and bosons, depending on their intrinsic spin: while leptons and quarks carry an intrinsic spin of ℏ/2 and are thus fermions, the mediators of the fundamental forces carry integer spin and belong to the group of bosons. An overview of the elementary particles is given in chapter 1.1, followed by an introduction to the mathematical concepts used in particle physics in chapter 1.2. The fundamental forces are then described in chapter 1.3.

1.1. Elementary Particles

Leptons and quarks are further divided into three generations, where leptons and quarks in higher generations possess higher masses. The family of the leptons consists of the electron e, the muon µ and the tau lepton τ together with their corresponding neutrinos.

While e, µ and τ carry one negative elementary charge as well as masses, the neutrinos, as described by the Standard Model, are neutral and massless particles. Similar to the family of leptons, the quark family consists of six different quark flavors: up, down, charm, strange, top and bottom, ordered by increasing mass. As opposed to leptons, quarks carry fractional electric charges: while up, charm and top quarks carry a charge of +2/3 e, down, strange and bottom quarks possess an electric charge of −1/3 e. For each particle there exists an antiparticle, resulting in a total number of twelve leptons and twelve quarks. The quarks form composite particles, so-called hadrons, which are divided into two groups depending on their spin:

- Mesons consist of a quark-antiquark pair, have integer spin and are thus bosons.

- Baryons carry spin 1/2 or spin 3/2, are fermions and consist of three quarks or three antiquarks, respectively.

With the discovery of the Δ++ resonance, which consists of three up quarks resulting in a total spin of 3/2, a new quantum number had to be introduced to the quark model, since Pauli's exclusion principle forbids two identical fermions to occupy the same quantum state. Inspired by the additive color model, the new quantum number was named color, with the three color charges red, green and blue carried by quarks and the respective anti-colors carried by antiquarks. The main properties of leptons and quarks are summarized in table 1.1.

    Fermions   Generation          El. charge        Spin              Color
               1      2      3     (in units of e)   (in units of ℏ)
    Leptons    ν_e    ν_µ    ν_τ    0                1/2               -
               e      µ      τ     -1                1/2               -
    Quarks     u      c      t     +2/3              1/2               r, g, b
               d      s      b     -1/3              1/2               r, g, b

Table 1.1.: The main properties of quarks and leptons.

1.2. Mathematical Formalism

Quantum field theory is the mathematical framework of the Standard Model and can be expressed according to the Lagrangian formulation of classical mechanics.

According to Lagrange, the trajectory of a particle is derived by solving the Euler-Lagrange equations

    d/dt ( ∂L/∂q̇_i ) = ∂L/∂q_i        (1.1)

where L is the Lagrangian, which depends on the generalized coordinates q_i and their time derivatives q̇_i. L equals the difference of the kinetic energy T and the scalar potential U:

    L(q_i, q̇_i, t) = T − U    (i = 1, 2, 3).        (1.2)

For quantum field theory the Lagrangian has to be modified in order to describe fields occupying regions in space instead of localized particles:

    L(q_i, q̇_i, t) → L(φ_i, ∂_µφ_i, x^µ)    with    ∂_µφ_i ≡ ∂φ_i/∂x^µ.        (1.3)

The Lagrange density L depends on the functions φ_i of the space-time coordinates x^µ. This leads to the generalized Euler-Lagrange equation

    ∂_µ ( ∂L/∂(∂_µφ_i) ) = ∂L/∂φ_i    (i = 1, 2, 3, ...).        (1.4)

In contrast to classical mechanics, the Lagrange densities cannot be derived from fundamental assumptions. They are constructed in such a way that they produce the desired field equations after solving the Euler-Lagrange equation. The following Lagrangians for spin-0, spin-1/2 and spin-1 fields are taken from [3] (without derivation of the final field equations).

- The Klein-Gordon Lagrangian for a scalar (spin-0) field ψ

      L = ½ (∂_µψ)(∂^µψ) − ½ (mc/ℏ)² ψ²        (1.5)

  produces after substitution into Eq. 1.4 (using ∂L/∂(∂_µψ) = ∂^µψ and ∂L/∂ψ = −(mc/ℏ)²ψ) the Klein-Gordon equation

      ∂_µ∂^µψ + (mc/ℏ)² ψ = 0        (1.6)

  describing a particle with spin 0 and mass m.

- The Dirac Lagrangian for a spinor (spin-1/2) field ψ

      L = i(ℏc) ψ̄γ^µ∂_µψ − (mc²) ψ̄ψ        (1.7)

  results in the well-known Dirac equation describing a particle of spin 1/2 and mass m, e.g. quarks,

      iℏγ^µ∂_µψ − mcψ = 0        (1.8)

  where γ^µ are the Dirac matrices γ⁰, γ¹, γ² and γ³.

- The Proca Lagrangian for a vector (spin-1) field A: solving Eq. 1.4 with the Lagrangian

      L = −(1/16π) F^µνF_µν + (1/8π) (mc/ℏ)² A^νA_ν    with    F^µν ≡ ∂^µA^ν − ∂^νA^µ        (1.9)

  results in

      ∂_µF^µν + (mc/ℏ)² A^ν = 0,        (1.10)

  which is called the Proca equation and describes a particle of spin 1 and mass m, e.g. the gauge bosons of the weak interaction (see below).

With the requirement of local invariance under gauge transformations, the Lagrangians given above have to be complemented with an additional term, a gauge field. In the case of the Dirac Lagrangian this is a massless vector field A_µ, which can be identified with the electromagnetic potential, together with the electromagnetic flux produced between Dirac particles. In this case the gauge boson is the photon, which acts as the mediator of the electromagnetic force. The constraint of local gauge invariance can be imposed on all Lagrangians and results in a total of 12 gauge bosons: the photon, three weak bosons and eight gluons. In chapter 1.3, the forces mediated between the fundamental particles are further explained.

1.3. The Fundamental Forces of Particle Physics

The fundamental forces as described by the Standard Model comprise three fundamental interactions: the electromagnetic interaction mediated by the photon, the weak interaction with the W± and Z⁰ as its mediators, and the strong interaction mediated by gluons (cf. table 1.2).

    Interaction       Exchanged charge   Gauge boson     Spin   Mass (GeV/c²)
    strong            color              8 gluons (g)    1      0
    electromagnetic   electric charge    photon (γ)      1      0
    weak              weak charge        W±, Z⁰          1      80.4, 91.2

Table 1.2.: Main properties of the fundamental forces and the respective gauge bosons. Masses of the W± and Z⁰ taken from [4].

1.3.1. The Electromagnetic Interaction

The electromagnetic interaction is described by Quantum Electrodynamics (QED), a quantum field theory based on the U(1) symmetry group [3] obeying the rules of a Lie algebra. The U(1) symmetry group has one generator, which acts as the mediator of the electromagnetic force: the photon γ, a massless spin-1 particle with an infinite range of interaction, coupling to all electrically charged particles.

1.3.2. The Weak Interaction and the Electroweak Unification

The weak interaction is described by the SU(2) symmetry group, which has three generators mediating the weak force. Based on the theory of Glashow, Weinberg and Salam (GWS), the mediators of the electromagnetic and the weak interaction have a common origin: the electroweak symmetry group SU(2) × U(1), where the SU(2) group has the three generators W⁰, W¹ and W² and the U(1) group has one generator B⁰. According to the GWS theory there is a spontaneous symmetry breaking, caused by the Higgs mechanism [5, 6], which makes the W⁰ and B⁰ mix into two different bosons that can be identified with the photon γ and the Z⁰. Mathematically this can be expressed using the weak mixing angle Θ_W:

    ( γ  )   (  cos Θ_W    sin Θ_W ) ( B⁰ )
    ( Z⁰ ) = ( −sin Θ_W    cos Θ_W ) ( W⁰ )        (1.11)

The masses of the Z⁰ and W± bosons can now be related to each other using the weak mixing angle:

    M_Z = M_W / cos Θ_W        (1.12)

Inserting the measured masses from table 1.2 gives cos Θ_W = M_W/M_Z ≈ 80.4/91.2 ≈ 0.88, corresponding to sin² Θ_W ≈ 0.22. Due to their masses and Heisenberg's uncertainty principle, the weak bosons have a short lifetime, which limits their range of interaction.

1.3.3. The Strong Interaction

Inspired by Quantum Electrodynamics, the theory of the strong interaction was named Quantum Chromodynamics (QCD); it describes the interaction between color-charged objects. The strong force is mediated by so-called gluons, which glue the quarks of hadrons together. The SU(3) symmetry group of the strong interaction implies the existence of a total number of nine gluons, each carrying a combination of a color and an anti-color. This results in a gluon singlet and a gluon octet, for which one possible representation is given in table 1.3. However, the singlet state has to be excluded, since its colors are cancelled out by the respective anti-colors, forming a color-neutral state (cf. figure 1.1).

    Singlet:  √(1/3) (rr̄ + gḡ + bb̄)
    Octet:    rb̄,  rḡ,  bḡ,  br̄,  gr̄,  gb̄,  √(1/2) (rr̄ − bb̄),  √(1/6) (rr̄ + bb̄ − 2gḡ)

Table 1.3.: Gluon color representation.

Figure 1.1.: Additive color scheme (red, green and blue mixing with their anti-colors to a color-neutral state). Taken from [7].

Gluons do not only interact with quarks but with every color-charged object, including themselves. Only three different tree-level QCD processes are observable: the radiation of a gluon from a quark, and vertices with three as well as four gluons. The Feynman graphs of all three possible processes are depicted in figure 1.2.

Figure 1.2.: Feynman graphs of the basic QCD processes (quark-gluon vertex, three-gluon vertex and four-gluon vertex). Taken from [8].

The gluon self-coupling accounts for the energy stored in the color field, which increases as quarks are separated from each other. Once enough energy is stored, the gluon string formed between two quarks breaks apart and forms a new quark-antiquark pair from the vacuum. This happens at a distance of about 1 fm, is called color confinement, and explains why color-charged objects cannot be observed. However, for large momentum transfers in collider experiments, quarks can be treated as quasi-free particles. This feature is called asymptotic freedom and allows theoretical calculations with perturbation theory. Compared to electromagnetic interactions, this behaviour can be interpreted as an anti-screening of the charge.

While the coupling between electric charges becomes larger at smaller distances, in QCD it is the other way round: in the low-energy regime the coupling diverges and makes calculations with perturbation theory impossible. A summary of all measurements and the average value of the strong coupling constant α_S(M_Z) is shown in figure 1.3.

Figure 1.3.: Left: summary of all measurements of α_S(M_Z) used as input for the world average value. Right: summary of measurements of α_S as a function of the respective energy scale Q; here, the mass of the Z⁰ boson (M_Z = 91.2 GeV/c²) was taken as reference point. Both plots are taken from [4].

An overview of all fundamental interactions and particles together with all possible couplings is depicted in figure 1.4.

Figure 1.4.: Summary of the interactions between the fundamental particles: leptons, quarks, the photon, the W± and Z⁰ bosons, the gluons and the Higgs boson. Taken from [9].

1.4. Cross Section and Luminosity

The cross section σ is a measure of the probability that a certain process occurs in a particle collision. Since the geometrical interpretation of the cross section is an area, it is measured in m²; a more commonly used unit is the barn (1 barn = 10⁻²⁸ m²). The reaction rate W of a certain process is related to σ by the luminosity L:

    W = L σ        (1.13)

where L can be interpreted as the particle flux and is defined as the number of particles per unit area and unit time:

    L = number of particles / (unit area · unit time)        (1.14)

With the definition of the integrated luminosity L_int,

    L_int = ∫ L dt,        (1.15)

the cross section σ can also be expressed in the following way:

    σ = N_inter / L_int        (1.16)

Here, N_inter is the number of observed interactions. In figure 1.5 some cross sections for proton-(anti-)proton collisions are shown in dependence on the center-of-mass energy √s.

Figure 1.5.: Cross sections for various processes in proton-(anti-)proton collisions as a function of the center-of-mass energy √s, covering the Tevatron and LHC energy ranges. Taken from [10].
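As a toy numerical illustration of Eq. (1.16), rearranged to N_inter = σ · L_int (the cross section and integrated luminosity below are made-up example values, not results from this thesis):

    # Expected event count N = sigma * L_int, from Eq. (1.16).
    # Illustrative numbers only.
    sigma_nb = 1.0                # cross section of some process, in nanobarn
    sigma_fb = sigma_nb * 1.0e6   # 1 nb = 10^6 fb
    lumi_int_fb = 1.0             # integrated luminosity, in fb^-1

    n_expected = sigma_fb * lumi_int_fb
    print(f"expected events: {n_expected:.0f}")   # 1000000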


Chapter 2

The CMS Experiment at the Large Hadron Collider

In order to test the Standard Model of particle physics, particle accelerators with steadily increasing center-of-mass energies and sophisticated particle detectors have been built over the last decades. This was necessary to observe rare processes and to enhance the accuracy of existing measurements. Still, there are some unanswered questions, like the origin of mass, whether there is a Grand Unification [11] of all three fundamental forces, and the nature of dark matter and dark energy. The Large Hadron Collider (LHC) at the European Organization for Nuclear Research CERN¹, located at the Franco-Swiss border near Geneva, was built to answer some of these questions. The basic properties of the LHC and an introduction to the CMS experiment are the subject of chapters 2.1 and 2.2.

2.1. The Large Hadron Collider

The LHC is located in the former tunnel of LEP, the Large Electron-Positron Collider, which was in operation at CERN from 1989 to 2000 and had a peak center-of-mass energy of √s = 209 GeV. LEP was mainly limited by the loss of energy due to synchrotron radiation of the particle beam. Since heavier particles emit far less synchrotron radiation, the LHC utilizes protons, which allow for a higher center-of-mass energy. These protons are accelerated in opposite directions in two different beam pipes using superconducting magnets operating at temperatures below 2 K, cooled with liquid helium. In order to keep the particles on track, a magnetic field of 8.33 T has to be provided.

¹ Conseil Européen pour la Recherche Nucléaire

The beams cross at four points, the so-called intersection points, in order to collide the protons. The LHC is currently operating at √s = 7 TeV. After a technical shutdown in 2012, this will eventually be raised to the design center-of-mass energy of √s = 14 TeV, making the LHC the world's most powerful particle accelerator². As depicted in figure 2.1, the LHC is the last stage of the CERN accelerator complex, in which the protons are successively accelerated to higher energies: LINAC 2 delivers protons with an energy of 50 MeV, which is increased to 1.4 GeV in the Booster. Subsequently, the energy is further raised in the Proton Synchrotron (PS) to 25 GeV and in the Super Proton Synchrotron (SPS) to 450 GeV before the beam is injected into the LHC. Once both accelerator rings are filled, the protons are once again accelerated to their final energy (7 TeV per beam at design operation) and circulate for many hours.

Figure 2.1.: The LHC accelerator complex. Taken from [12].

In addition, it is also possible to accelerate heavy ions in the LHC [13]. For this purpose, lead ions are accelerated by LINAC 3 to 4.2 MeV per nucleon before passing through an electron-stripping carbon foil. In the Low Energy Ion Ring (LEIR) the energy is increased to 72 MeV/u. Subsequently, the energy is raised to 5.9 GeV/u in the PS before the ions pass a second foil removing all remaining electrons. Finally, the SPS accelerates the ions to 177 GeV/u before they are injected into the LHC, where they reach their nominal energy of 2.76 TeV/u.

² In comparison, the Tevatron, which was a proton-antiproton collider at the Fermi National Accelerator Laboratory (Fermilab), reached a center-of-mass energy of √s = 1.96 TeV.

To observe the particle collisions, several detectors are installed at the intersection points. The ALICE³ experiment is specifically designed for heavy-ion collisions, while the LHCb⁴ experiment is dedicated to b-physics. The other two experiments, ATLAS⁵ and CMS⁶, are multi-purpose detectors intended to explore new physics. In addition to these four main experiments, two more detectors are installed at the LHC: LHCf⁷ measures particles created in the forward direction at the ATLAS experiment and simulates cosmic rays under laboratory conditions, and the TOTEM [19] experiment, using extreme forward detectors, is designed to measure the total proton-proton cross section. In chapter 2.2, the CMS experiment is introduced in detail.

The cavity resonators used for accelerating the protons employ electromagnetic fields alternating at high frequency. A direct result of this method is the arrangement of the protons in so-called bunches, each containing about 10¹¹ protons. In the current operation mode, the time spacing between two of these bunches amounts to 50 ns, but it will eventually be reduced to 25 ns. An important characteristic value of a particle accelerator is the luminosity, since it is directly connected to the rate of collisions that occur. According to Eq. (1.16), the event rate Ṅ of a physical process with cross section σ is proportional to the luminosity L:

    Ṅ = L σ        (2.1)

For a proton-proton collider, L can be expressed as

    L = (f γ N_B N_P²) / (4π ε_n β*) · F        (2.2)

where f is the circulation frequency of the bunches, γ the Lorentz factor, N_B the number of bunches consisting of N_P protons each, ε_n the normalized transverse emittance (a measure of the phase-space area of the beam), β* the amplitude function at the interaction point and F the reduction function which accounts for the crossing angle of the beams. Currently the LHC is operated at a luminosity of L ≈ 10³³ cm⁻² s⁻¹, which will be further increased up to the design luminosity of L = 10³⁴ cm⁻² s⁻¹. This high luminosity is essential to observe rare processes with very small cross sections.

³ A Large Ion Collider Experiment [14]
⁴ Large Hadron Collider beauty experiment [15]
⁵ A Toroidal LHC ApparatuS [16]
⁶ Compact Muon Solenoid [17]
⁷ Large Hadron Collider forward [18]
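As a cross-check of Eq. (2.2), inserting the nominal LHC design parameters (the published design values, quoted here only for illustration) reproduces the design luminosity:

    import math

    # Nominal LHC design parameters (published design values):
    f = 11245.0          # revolution frequency of a bunch [Hz]
    gamma = 7461.0       # Lorentz factor of a 7 TeV proton
    N_B = 2808           # number of bunches per beam
    N_P = 1.15e11        # protons per bunch
    eps_n = 3.75e-6      # normalized transverse emittance [m]
    beta_star = 0.55     # amplitude function at the interaction point [m]
    F = 0.84             # geometric reduction factor for the crossing angle

    L = f * gamma * N_B * N_P**2 / (4.0 * math.pi * eps_n * beta_star) * F
    print(f"L = {L * 1e-4:.1e} cm^-2 s^-1")   # ~1e34 cm^-2 s^-1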

Dealing with such high event rates is a challenging task: at design luminosity, about 10⁹ inelastic events occur every second at every intersection point, and one event of interest will be superimposed by about 20 inelastic events [20].

2.2. The CMS Experiment

The CMS experiment is one of the two multi-purpose particle detectors at the Large Hadron Collider. CMS stands for Compact Muon Solenoid, which already reflects the main design considerations of this detector. With its 21.5 m length and 15 m diameter it is smaller than ATLAS, yet heavier, with a total weight of about 12 500 t. The latter attribute is caused by the huge iron yoke, which guides the magnetic field outside the superconducting solenoid coil. The momentum of charged particles can be precisely reconstructed from the curvature of their tracks in the tracking system; for this purpose the coil creates a strong magnetic field, so that even the momentum of extremely boosted particles can be measured. One of the main challenges of the CMS detector is the precise identification and reconstruction of muons. It is designed to provide accurate measurements of the transverse momentum of muons with energies up to 1 TeV. An assembly drawing of the CMS detector is shown in figure 2.2, and a slice of the detector is shown in figure 2.3. The different components of the detector are explained in more detail in the following sections.

2.2.1. The Coordinate System

The Cartesian coordinate system employed at the CMS experiment has its reference point at the nominal interaction point in the center of the CMS detector. While the z-axis points in the direction of the beam pipe, the x-axis points towards the center of the LHC ring; the y-axis is perpendicular to the xz-plane and points upwards. Coordinates better suited to the barrel-shaped layout of the detector can be defined by introducing the two angles Θ and Φ, forming a polar coordinate system: the azimuthal angle Φ is measured from the x-axis within the xy-plane, and the polar angle Θ is measured from the z-axis. The coordinate r denotes the distance to the beam pipe. The arrangement of the two coordinate systems is depicted in figure 2.4. Another commonly used quantity in particle physics is the rapidity y, defined as

    y = ½ ln( (E + p_z) / (E − p_z) ).        (2.3)
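The key property of the rapidity, used below, is that rapidity differences are unchanged by Lorentz boosts along the beam axis. A small numerical check with arbitrary example four-momenta:

    import math

    def rapidity(E, pz):
        """Rapidity as defined in Eq. (2.3)."""
        return 0.5 * math.log((E + pz) / (E - pz))

    def boost_z(E, pz, beta):
        """Lorentz boost of (E, pz) along the beam axis."""
        gamma = 1.0 / math.sqrt(1.0 - beta**2)
        return gamma * (E + beta * pz), gamma * (pz + beta * E)

    # two arbitrary particles, (E, pz) in GeV
    a, b = (10.0, 6.0), (20.0, -3.0)
    dy = rapidity(*b) - rapidity(*a)

    a2, b2 = boost_z(*a, 0.5), boost_z(*b, 0.5)
    dy_boosted = rapidity(*b2) - rapidity(*a2)
    print(dy, dy_boosted)   # identical up to rounding: differences are invariant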

Figure 2.2.: Exploded assembly drawing of the CMS experiment (21.5 m length, 15 m diameter), showing all major sub-systems of the detector: the silicon pixel and strip trackers, the electromagnetic calorimeter (ECAL), the hadronic calorimeter (HCAL), the forward calorimeter, the superconducting solenoid, the return yoke and the muon detectors. Adopted from [21].

Figure 2.3.: Slice of the CMS detector; several particle trajectories together with the corresponding energy deposits are drawn exemplarily. Taken from [21].

Figure 2.4.: Coordinate systems of the CMS experiment: the z-axis points along the beam pipe, the x-axis towards the center of the LHC ring and the y-axis upwards.

y is a dimensionless quantity, and differences in y are invariant under boosts in the z-direction. Instead of the rapidity, one can use the purely geometrical quantity pseudorapidity η, which is directly connected to the polar angle Θ:

    η = −ln [ tan(Θ/2) ]        (2.4)

2.2.2. The Solenoid Magnet

The magnetic field of the CMS experiment is produced by a superconducting solenoid magnet which is 12.5 m long and has an inner diameter of 6.3 m. It generates a magnetic field of up to 4 T, which is guided by an iron yoke outside the coil. The return yoke is interleaved with the muon detectors (cf. chapter 2.2.6) and consists of five 12-sided barrel wheels and six endcap disks with a total weight of about 10 000 t. Operating at full power, the magnetic field stores a total energy of 2.6 GJ.

2.2.3. The Tracking System

The innermost part of the detector is the tracking system, which is responsible for precisely measuring the trajectories of charged particles as well as for reconstructing secondary vertices. This requires a fast readout of the single tracker channels as well as a sufficiently high granularity and durability of the tracker material, which faces high collision rates and enormous radiation exposure. For this task, the CMS experiment utilizes a tracking system consisting of a silicon pixel detector and a silicon strip detector. It has a total length of 5.8 m and a diameter of 2.5 m. The solenoid coil provides a homogeneous magnetic field of up to 4 T in the tracker. Its design allows the measurement of tracks up to pseudorapidities of |η| = 2.5. Figure 2.5 shows a schematic view of the tracking system. The silicon pixel detector forms the inner part of the tracking system; each pixel measures 100 × 150 µm² in size. The pixel layers are arranged in concentric rings around the beam pipe, enclosed by two endcaps on each side. The silicon pixel tracker covers a total area of about 1 m² and features a total of 66 million pixels. With increasing distance from the beam pipe the particle flux decreases, allowing for lower-priced silicon strip detectors. They are divided into several components⁸ providing different degrees of granularity and covering different regions of the tracker.

⁸ Tracker Inner Barrel (TIB), Tracker Outer Barrel (TOB), Tracker Inner Disk (TID) and Tracker EndCap (TEC)

Figure 2.5.: Cross-sectional view of the CMS tracking system in the r-z plane, showing the pixel detector, the Tracker Inner Barrel (TIB), Tracker Inner Disks (TID), Tracker Outer Barrel (TOB) and Tracker EndCaps (TEC). Taken from [22].

2.2.4. The Electromagnetic Calorimeter

The electromagnetic calorimeter (ECAL) is used to measure the energy of electrons, positrons and photons, which produce electromagnetic showers in the ECAL material through bremsstrahlung and pair production. The kinetic energy of these particles is directly proportional to the energy deposits in the ECAL, which are transformed into electrical signals by photodetectors. As calorimeter material, lead tungstate (PbWO₄) crystals are used: 61 200 crystals in the barrel region and 7 324 crystals in each of the two endcaps. Lead tungstate is chosen for its outstanding properties regarding the fast decay times of excited states: the scintillation decay time is of the same order of magnitude as the LHC bunch crossing time, with about 80 % of the light emitted within 25 ns [23]. Like the tracking system, the ECAL features different components for different sections. While the pseudorapidity region up to |η| = 1.479 is covered by the ECAL barrel (EB), the ranges 1.653 < |η| < 2.6 and 1.479 < |η| < 3.0 are covered by the preshower detector (ES) and the ECAL endcaps (EE), respectively. A schematic view of the ECAL system is given in figure 2.6.

2.2.5. The Hadronic Calorimeter

The hadronic calorimeter (HCAL) forms the next layer of the CMS detector, enclosing the ECAL. As a result of their larger interaction length, most hadrons produced in a collision pass through the ECAL without losing much of their energy. The HCAL was particularly designed for the measurement of hadron jets as well as for the indirect detection of neutrinos. For this task the HCAL employs steel and brass absorber plates in which hadronic showers are produced; the emitted light is guided to hybrid photodiodes by interleaved plastic scintillator tiles.

Figure 2.6.: Transverse section through the ECAL, showing the geometrical configuration: the barrel (EB) covers |η| < 1.479, the preshower (ES) the range 1.653 < |η| < 2.6, and the endcaps (EE) extend up to |η| = 3.0. Taken from [23].

Due to momentum conservation, the vector sum of all transverse momenta in each event must equal the initial transverse momentum, which is nominally zero; any imbalance is called missing energy. The occurrence of missing energy is an indicator for unobserved particles such as neutrinos, or for exotic particles which do not interact with any detector material. The HCAL is divided into four subsystems: the hadron barrel calorimeter (HB), the hadron endcap calorimeter (HE), the hadron outer calorimeter (HO) and the hadron forward calorimeter (HF). A schematic view of the hadron subsystems is depicted in figure 2.7. As in the other parts of the CMS detector, the granularity of the HCAL varies between the different subsystems.

2.2.6. The Muon System

One of the core capabilities of the CMS detector is the detection of muons and the measurement of their momentum. For a precise measurement, the muon detection chambers are interleaved with the return yoke, which provides a magnetic field of 1.8 T outside the coil. The muon system employs three different kinds of gaseous detectors with different characteristics: while the Resistive Plate Chambers (RPC) have a fast response time and are thus suited to seed the Level-1 trigger (see chapter 2.2.7), the Drift Tubes (DT) and Cathode Strip Chambers (CSC) are slower but provide a good momentum and position resolution. The muon system features a barrel-like design, and different subsections cover different regions of the detector: the DT and RPC in the barrel region provide independent measurements up to pseudorapidities of |η| = 1.2; for the endcap region, CSC detectors cover the range 1.2 < |η| < 2.4, and additional RPC detectors cover the region up to |η| = 1.6.

Figure 2.7.: Longitudinal view of the CMS detector showing the locations of the hadron barrel (HB), endcap (HE), outer (HO) and forward (HF) calorimeters. Taken from [22].

The composition of the muon system allows detection efficiencies of more than 90 % for muons with a momentum above 100 GeV [24]. The arrangement of the subsystems is shown in figure 2.8.

2.2.7. The Trigger System

As described in chapter 2.1, a total of about 10⁹ events occur each second at the design luminosity of 10³⁴ cm⁻² s⁻¹, resulting in a huge amount of data to be stored and processed. To reduce this bulk of information, a two-level trigger system is used within the CMS experiment. The Level-1 trigger (L1) is directly implemented in the CMS hardware and uses programmable processors which are able to reject or accept events on a very fast time scale; the decisions are based on relatively simple event patterns in the detector cells. The L1 trigger reduces the event rate to about 100 kHz. Accepted events are passed to the High Level Trigger (HLT), a software-driven trigger implemented on the CMS filter farm, which consists of about a thousand CPUs. The HLT is able to combine information from all detector components; its algorithms scan for further interesting signatures and reduce the event rate to about 300 Hz. Assuming an event size of about 1.5 MB, a stream of about 450 MB/s has to be processed by the storage elements.

Figure 2.8.: Layout of the CMS muon system in the r-z plane, showing the drift tube (DT) stations MB1-MB4 in the barrel, the cathode strip chamber (CSC) stations ME1-ME4 in the endcaps and the resistive plate chambers (RPC). Taken from [23].


Chapter 3

CMS Computing

One of the key components of the CMS High Energy Physics (HEP) computing model is the CMS Software (CMSSW) framework, which includes methods for event reconstruction and simulation. Event reconstruction is the operation of producing abstract objects from the raw data delivered by the detector in order to describe the physical processes that took place during the collision. To compare the measured quantities to theoretical predictions based on the Standard Model, a comparable amount of Monte Carlo datasets has to be produced. For this task, artificial collisions and the expected response of the detector are computed in a full simulation. This takes place in three steps, beginning with the calculation of all particles resulting from a proton-proton collision according to the current knowledge of particle physics. In a second step, the interaction of the simulated particles with the detector material as well as the response of the detector read-out electronics is computed. In a last step, the simulated events are reconstructed by the same algorithms used for real events. In the following, chapter 3.1 gives an introduction to the CMSSW framework.

Since the commissioning of the LHC, large amounts of data have been produced by the various particle detectors. Even though the event rate is reduced by applying various trigger algorithms (cf. chapter 2.2.7), several hundred megabytes still have to be stored every second. Subsequently, preprocessed datasets as well as Monte Carlo simulations have to be reconstructed and delivered to research groups located all over the world for evaluation. To cope with this task, an efficient way of storing and distributing the datasets has been developed, forming the Worldwide LHC Computing Grid (WLCG). An introduction to the CMS WLCG computing concept is given in chapter 3.2.

3.1. Software Used in HEP Analyses

3.1.1. The CMS Software Framework CMSSW

The CMSSW framework stores all collision events in the so-called Event Data Model (EDM), which is based on the concept of C++ object containers. These containers are referred to as Event Containers, and all operations on them are performed by CMSSW modules. Additional information on the detector conditions, like the magnetic field, the alignment of the components or the calibration, is provided by the Event Setup. The data processing is initialized by calling a single executable called cmsRun; the execution of the various modules is invoked by the respective statements in the CMSSW configuration files. These modules are dynamically loaded and successively process a given event. The available modules are divided into the following categories:

- Source: reads data from the Event Container and the Event Setup, or generates empty Event Containers which are filled by other modules.

- EDProducer: processes data contained in the Event Container to add additional information to the event, e.g. by clustering tracks into jets (cf. chapter 3.1.4).

- EDFilter: stops the processing of an event if specified requirements are not fulfilled, e.g. if the momentum of a particle does not exceed a certain threshold.

- EDAnalyzer: reads the Event Container but does not change or add any properties. This module can be used to create histograms or ntuples based on the information from the Event Containers or a previously executed EDProducer.

- Output: writes the data accumulated during the processing back to disk into a file.
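For illustration, a minimal cmsRun configuration wiring such modules together could look as follows (a sketch only: the input file name and the analyzer label are hypothetical, while cms.Process, cms.Source, cms.EDAnalyzer and cms.Path are the standard configuration primitives):

    # minimal_cfg.py -- sketch of a CMSSW configuration file
    import FWCore.ParameterSet.Config as cms

    process = cms.Process("Demo")

    # Source module: reads events from an EDM file into Event Containers
    process.source = cms.Source("PoolSource",
        fileNames = cms.untracked.vstring("file:myEvents.root"))  # hypothetical input
    process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(100))

    # EDAnalyzer module: reads the Event Containers without modifying them
    process.demoAnalyzer = cms.EDAnalyzer("DemoAnalyzer")  # hypothetical analyzer

    # The path defines the order in which the modules process each event
    process.p = cms.Path(process.demoAnalyzer)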

3.1.2. ROOT

The ROOT [25] software project was started in 1994 at CERN in order to analyse the huge amounts of data produced in particle collisions. It is the successor of the Physics Analysis Workstation (PAW) program library [26], whose development was discontinued in favour of ROOT. Published under the GNU Lesser General Public License (LGPL) [27], it is freely available for a wide range of operating systems. ROOT is mostly developed in C++ and utilizes the possibilities of an object-oriented programming language. For rapid prototyping, ROOT provides an interactive C++ interpreter, CINT [28], which is able to load and process macros, which are in general C++ functions. Using the GNU Compiler Collection (GCC) [29], ROOT programs can also be compiled by linking against the respective libraries, increasing the speed of execution. For use in High Energy Physics, ROOT provides extensive functionality such as histograms, curve fitting, visualization in 2D and 3D, mathematical functions, statistics tools, four-vectors and matrix algebra, geometry packages and much more. To provide the needed data throughput, ROOT implements a high-performance input/output system. The work presented in chapter 5 uses ROOT for histogramming, plotting, curve fitting and storing results in the ROOT file format.
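A small example of this typical workflow (filling a histogram, fitting a curve and writing the result to a ROOT file) is sketched below via the PyROOT bindings; the histogram name and parameters are arbitrary:

    # Fill a histogram with Gaussian random numbers, fit it and save it.
    import ROOT

    h = ROOT.TH1F("h_toy", "toy distribution;x;entries", 100, 0.0, 10.0)
    rng = ROOT.TRandom3(42)
    for _ in range(10000):
        h.Fill(rng.Gaus(5.0, 1.0))   # mean 5, width 1

    h.Fit("gaus")                    # built-in Gaussian curve fit

    out = ROOT.TFile("toy.root", "RECREATE")   # ROOT's native file format
    h.Write()
    out.Close()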

3.1.3. Monte Carlo Event Generation

In order to compare measured events to the predictions of theoretical models, datasets with simulated events have to be produced using Monte Carlo techniques, i.e. by repeated random sampling of the underlying physical probability distributions. For this task, so-called Monte Carlo event generators are used; available event generators include PYTHIA [30], Herwig++ [31], MadGraph [32], Alpgen [33] and Sherpa [34]. The Geant4 toolkit [35] is used for the simulation of the interaction between the generated particles and the detector material.

The PYTHIA Event Generator

PYTHIA [30] is one of the most popular event generators used for the simulation of hypothetical CMS events. Its development was started in 1978 at Lund University by the local theoretical physics group. Today there are two main releases, namely PYTHIA 6, which is written in Fortran, and PYTHIA 8, a rewrite in C++ aiming to replace the older version in the near future. Although PYTHIA 8 implements some new features, it is not yet fully mature; thus, PYTHIA 6 is the default version for the CMS dataset production. The simulation of events starts with the hard process between two partons of the colliding protons. PYTHIA currently implements about 300 different physical processes, calculated at leading-order perturbation theory by evaluating the matrix element of the corresponding process. To account for soft QCD contributions which cannot be calculated using perturbation theory, phenomenological models are implemented to reproduce the observed particle distributions. One of these models is the Lund string fragmentation model, which is motivated by the picture of the linear color field spanned by a quark-antiquark pair moving apart. Due to color confinement, the string in between rips apart once enough energy is stored in the field, producing a new quark-antiquark pair. This scheme is applied until only hadrons with no net color charge are left.

Figure 3.1.: Sketch of the Lund string fragmentation model, here with meson production. The lines at the top of the figure depict the strings spanned by the color field. Taken from [36].

The phenomenological models applied offer numerous parameters whose values cannot be theoretically predicted but have to be tuned to describe the respective event topologies. A fixed set of parameters is referred to as a tune. Depending on the physical process to be described, these tunes invoke, amongst others, different values of the coupling constants α_S and α_W as well as different parton density functions. The default tune used in the CMS collaboration is tune Z2 of the PYTHIA 6 release. Table B.1 lists a subset of the various parameters employed by the PYTHIA 6 event generator; these parameters are the most important ones for the generation of Underlying Event contributions (cf. chapter 5). Notably, the main difference between tune Z1 and tune Z2 arises from the use of different parton density functions, with otherwise almost identical parameters. While for PYTHIA 6 there are numerous different tunes, the tuning efforts for PYTHIA 8 are still in their beginning stages. The default tune 4C is loosely based on tune P0 of PYTHIA 6.
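As an illustration of how such a generator is steered, a few hard QCD events can be generated with PYTHIA 8 roughly as follows (a sketch assuming the optional PYTHIA 8 Python bindings are available; the settings strings follow PYTHIA's documented Beams:/HardQCD: scheme):

    # Generate a few QCD events with PYTHIA 8 (requires its Python bindings).
    import pythia8

    pythia = pythia8.Pythia()
    pythia.readString("Beams:eCM = 7000.")          # 7 TeV proton-proton collisions
    pythia.readString("HardQCD:all = on")           # hard QCD 2->2 processes
    pythia.readString("PhaseSpace:pTHatMin = 20.")  # minimum pT of the hard process
    pythia.init()

    for _ in range(10):
        if not pythia.next():                       # generate one event
            continue
        # count charged final-state particles as a simple event summary
        n_charged = sum(1 for i in range(pythia.event.size())
                        if pythia.event[i].isFinal() and pythia.event[i].isCharged())
        print("charged multiplicity:", n_charged)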

Detector Simulation

In order to compare simulated with measured events, one has to calculate the interaction between the particles produced by the event generator and the detector material, as well as the response of the detector electronics. For this task a complete model of the CMS detector was built using the simulation toolkit Geant4 [35], including precise geometrical dimensions, the magnetic field configuration and the different detector materials. In a first step, Geant4 calculates the passage of the particles through the detector subsystems, accounting for energy deposits due to ionisation, multiple scattering and bremsstrahlung as well as for the electromagnetic and hadronic showering in the respective subsystems. In the next step, these detector hits are digitized by simulating the readout electronics. To account for additional collisions during the bunch crossing, the so-called pile-up events, the detector simulation is afterwards overlaid with additional signals. This allows the same detector simulation to be used for different ranges of luminosity, resulting in different amounts of pile-up.

3.1.4. Event Reconstruction

The data collected by the detector are referred to as raw data. Up to that point, the only information obtained is the position and the amount of numerous energy deposits. The process of reducing the raw data to abstract physical objects with properties like four-vector, charge or particle type is referred to as event reconstruction. This event interpretation is done in several steps which are implemented as CMSSW modules. The reconstruction of jets and their physical properties is referred to as jet clustering, which is introduced in the following section.

Jet Reconstruction

As already described in chapter 1.3.3, color-charged particles originating from proton-proton collisions are confined in their color field and cannot be observed by the detector. Instead, numerous new particles are created through gluon radiation, forming stable color-neutral final-state particles. Collimated into narrow streams, these particles are referred to as jets. For the clustering process, several different jet algorithms are available which make use of different techniques and therefore possess different characteristics. To be robust against soft QCD emissions, jet algorithms are required to be collinear and infrared safe, since otherwise slight changes in the input particles can severely influence the outcome. The behaviour of infrared- and collinear-unsafe jet algorithms is illustrated in figures 3.2 and 3.3, respectively. The work presented in chapter 5 relies substantially on both of these characteristics. In the following, an introduction to the different jet algorithms used within the CMS collaboration is given.

Figure 3.2.: Illustration of infrared-unsafe behaviour of jet clustering algorithms: particles normally clustered into two jets are merged into one by the addition of a soft parton. Adopted from [8].

Figure 3.3.: Illustration of collinear-unsafe behaviour of jet clustering algorithms: a collinear splitting of the input particle can lead to the rejection of the particle as a seed and the non-consideration of the jet altogether. Adopted from [8].

- SISCone
  The Seedless Infrared-Safe Cone (SISCone) algorithm [37] was designed to be infrared and collinear safe, choosing as starting points two input particles whose distance is larger than twice the defined jet radius. In a next step, the circle is rotated around one of the points until another particle lies on the circle. By comparing the four-vectors of the particles encircled by the cone before and after the rotation, stable jets are found. In a final split-and-merge procedure, possibly overlapping jets are separated. The SISCone algorithm is no longer used within the LHC experiments due to its huge computational demands.

- Generalized k_T Algorithm
  In contrast to cone-type algorithms using fixed jet shapes, the generalized k_T algorithm clusters particles depending on the direction of their four-vectors. The distance measure employed is defined as follows:

    d_{ij} = \min\left(p_{T,i}^{2p},\, p_{T,j}^{2p}\right) \frac{R_{ij}^2}{R^2}, \qquad R_{ij}^2 = (y_i - y_j)^2 + (\phi_i - \phi_j)^2    (3.1)

    d_{iB} = p_{T,i}^{2p}    (3.2)

  where d_{ij} is the distance between two particles in the four-vector space, d_{iB} the distance between a particle and the beam line and p_{T,i} the transverse momentum of an object. Depending on the input parameter p, the algorithm is referred to by different names: the anti-k_T algorithm (p = -1), the Cambridge/Aachen algorithm (p = 0) and the k_T algorithm (p = 1). The generalized k_T algorithms are infrared safe by design. Although the calculation of the distances is an exhaustive task, by employing an efficient implementation as well as sophisticated computational methods like Voronoi diagrams [38], the generalized k_T algorithms perform excellently in terms of computing time. While for the cone-type and anti-k_T jet algorithms the areas of the jets are always shaped like circles, the k_T and Cambridge/Aachen algorithms allow for irregular shapes as well (cf. figure 3.4).
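To make the recombination procedure concrete, the following is a minimal, unoptimized Python sketch of this clustering based on equations (3.1) and (3.2). It is purely illustrative -- not the FastJet implementation used in practice -- and the recombination of two merged objects is simplified to a pT-weighted average instead of full four-vector addition; all function names are chosen for this sketch.

import math

def dist_ij(a, b, p, R):
    """Pairwise distance d_ij of eq. (3.1); a and b are (pT, y, phi) tuples."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:                       # wrap the phi difference into [0, pi]
        dphi = 2.0 * math.pi - dphi
    r2 = (a[1] - b[1]) ** 2 + dphi ** 2      # R_ij^2 in the y-phi plane
    return min(a[0] ** (2 * p), b[0] ** (2 * p)) * r2 / R ** 2

def dist_ib(a, p):
    """Beam distance d_iB of eq. (3.2)."""
    return a[0] ** (2 * p)

def cluster(particles, p, R):
    """Repeatedly merge the closest pair until only jets are left.
    p = -1, 0, 1 selects anti-kT, Cambridge/Aachen or kT behaviour."""
    objs = [tuple(x) for x in particles]
    jets = []
    while objs:
        best = ("beam", 0, dist_ib(objs[0], p))
        for i in range(len(objs)):
            if dist_ib(objs[i], p) < best[2]:
                best = ("beam", i, dist_ib(objs[i], p))
            for j in range(i + 1, len(objs)):
                d = dist_ij(objs[i], objs[j], p, R)
                if d < best[2]:
                    best = ("pair", (i, j), d)
        if best[0] == "beam":                # beam distance is smallest:
            jets.append(objs.pop(best[1]))   # declare the object a jet
        else:                                # otherwise merge the closest pair
            i, j = best[1]
            a, b = objs[i], objs[j]
            w = a[0] + b[0]                  # simplified pT-weighted recombination
            objs[i] = (w, (a[0] * a[1] + b[0] * b[1]) / w,
                          (a[0] * a[2] + b[0] * b[2]) / w)
            objs.pop(j)
    return jets

Switching p between -1, 0 and 1 reproduces the three named variants without any other change, which is the point of the generalized formulation.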

Figure 3.4.: Comparison of the jet area shapes calculated with the different jet algorithms. Taken from [39].

The Concept of Jet Areas

Since jets consist of point-like particles, they do not possess a definite area. However, by overlaying each event with a grid of extremely soft and uniformly distributed pseudo-particles in the η-φ plane, the active area A_j of a jet can be determined [40]. For this purpose, these ghost particles with a vanishingly small transverse momentum are clustered together with the tracks of the given event. Since these particles are extremely soft, the kinematic properties of the event remain unchanged. The number of ghost particles ending up in a jet is then a measure of its area:

    A_j = \frac{N_j^{\mathrm{ghosts}}}{\rho_{\mathrm{ghosts}}} = \frac{N_j^{\mathrm{ghosts}}}{N_{\mathrm{tot}}^{\mathrm{ghosts}}}\, A_{\mathrm{tot}}    (3.3)

Here, N_j^{ghosts} is the number of ghosts ending up in one jet and N_tot^{ghosts} the total number of ghost particles, distributed with a density of ρ_{ghosts}. The total available area A_tot = 8π equals the maximum available area of the detector in the η-φ plane. Since the output of the clustering process must not depend on the additional ghost particles, an infrared and collinear safe algorithm has to be employed. Empty regions in an event, covered solely with ghost particles, are clustered into so-called ghost jets. In figure 3.5 an example of the determined jet areas is shown.
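Expressed in code, equation (3.3) is a simple counting exercise. A minimal sketch, assuming the clustering (e.g. the sketch above) has already been run on the event including the ghosts and the ghosts per jet have been counted; all numbers below are made up for illustration:

import math

A_TOT = 8.0 * math.pi    # total eta-phi area of the detector, cf. eq. (3.3)

def jet_area(n_ghosts_in_jet, n_ghosts_total, a_tot=A_TOT):
    """Active area of a jet from ghost counting, eq. (3.3)."""
    return float(n_ghosts_in_jet) / n_ghosts_total * a_tot

# toy example: 100000 ghosts distributed over the whole plane,
# 1432 of which were clustered into one particular jet
print(jet_area(1432, 100000))    # about 0.36 area units in the eta-phi plane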

Figure 3.5.: Jet areas obtained by active area clustering using the k_T algorithm. A_tot = 8π is the total available area of the η-φ plane of the detector. While the areas of physical jets are coloured, empty regions are clustered into ghost jets which are not drawn in this figure. Taken from [41].

3.2. The Worldwide LHC Computing Grid

The distribution and processing of the enormous amounts of data produced by the particle detectors at the LHC places high demands on the computing infrastructure: Newly measured data as well as Monte Carlo datasets have to be made available to hundreds of research groups worldwide. Simultaneously, they have to be stored in a redundant manner to prevent data loss. For an estimated size of 1.5 MB/event, the annual amount of accumulated data adds up to 15 petabytes, posing a challenging task. To meet these requirements, the Worldwide LHC Computing Grid (WLCG) [42] has been established, featuring a tiered structure (cf. figure 3.6). By combining decentralized computing clusters into a grid, a virtual supercomputer was created. In the CMS computing model [43] each tier provides a dedicated service:

- Tier-0: Located directly at CERN, the only Tier-0 centre performs the first reconstruction step based on the raw data provided by the detectors. Both reconstructed (RECO) and raw (RAW) data are transferred to a mass storage system as well as to the different Tier-1 centres.

- Tier-1: Currently, there are seven Tier-1 centres available at the WLCG, located in Europe, the United States and Taiwan. Their main purpose is to replicate the datasets and provide long-term storage as well as to produce a first version of the Analysis Object Data (AOD), which is a subset of the RECO data format. Since Tier-1 centres offer vast storage and computing resources, large Monte Carlo production jobs as well as re-reconstructions of data with improved algorithms and detector alignment and calibration constants are run here as well.

- Tier-2: The datasets produced at the Tier-1 centres are transferred to the Tier-2 centres, which are located all over the world. While user access to Tier-1 centres is limited, Tier-2 centres are meant to run the jobs of regular grid users. Although the capacities of Tier-2s are smaller than those of Tier-1s, they offer substantial computing and storage resources.

- Tier-3: This layer mostly consists of university clusters connected to the grid and is intended for local user analyses.

Figure 3.6.: Schematic overview of the tiered structure of the CMS Worldwide LHC Computing Grid. Taken from [44].

Grid User Management

The authentication and authorization within the WLCG is realized via certificates based on a Public Key Infrastructure (PKI). It requires generating a public and private key pair which has to be signed by a Certification Authority (CA) in order to prove the user's authenticity. For German grid users, the CA is hosted by the Grid Computing Centre Karlsruhe GridKa [45]. Once the authenticity is confirmed, membership in a Virtual Organization (VO) has to be requested, providing the user with access privileges to computing and storage resources.

The amount of resources available for disposal depends on the user's VO. For grid access, a temporary certificate based on the original certificate has to be created, which is referred to as a proxy and is usually valid for several hours or days. On submission, the proxy is packaged together with the user's job so that the job is able to identify itself. This is necessary, for example, when the job tries to write its output back to a storage element.

Grid Usage

Since the hardware and software resources of the WLCG are very heterogeneous, a common interface has to be provided. This is achieved by employing a grid middleware. Two of the middlewares used within the WLCG are the gLite [46] framework developed by the Enabling Grids for E-sciencE (EGEE) project [47] and the Virtual Data Toolkit (VDT) middleware [48] maintained by the Open Science Grid (OSG) [49]. The former is mainly deployed at European tier centres, the latter at tier centres in the United States.

In order to submit jobs to the gLite middleware, the so-called Job Description Language (JDL) [50] is used, which specifies the requirements of the job, e.g. input and output files or a list of computing sites to be used or blocked. In figure 3.7 a representative overview of the CMS WLCG workflow is given. In a first step, the user, authenticated by a proxy, submits his job, specified by a JDL file, to the Workload Management System (WMS), which queries the CMS Dataset Bookkeeping System (DBS) as well as the Information Service for additional information on the location of the datasets and the available Computing Elements (CE) and Storage Elements (SE). Subsequently, the job is transmitted to the computing site closest to the datasets to reduce network overhead. While the job is executed, a logging service keeps track of the progress. After the job has finished, the output is transferred back to the WMS and stored until it is retrieved by the user.
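A minimal JDL file along these lines might look as follows; the attribute names follow the gLite JDL conventions, while the file names and the blocked-site pattern are made up for this sketch:

Executable    = "analysis.sh";
Arguments     = "dataset.cfg";
StdOutput     = "job.out";
StdError      = "job.err";
InputSandbox  = {"analysis.sh", "dataset.cfg"};
OutputSandbox = {"job.out", "job.err"};
// hypothetical requirement blocking one misbehaving computing element
Requirements  = !RegExp("badsite.example.org", other.GlueCEUniqueID);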

Figure 3.7.: Schematic overview of the CMS WLCG workflow. Taken from [51].

Chapter 4

Extending Batch Systems with Cloud Resources

As described in chapter 3, particle physics analyses require the extensive usage of hardware and software resources. On the one hand, the data acquired by the particle detectors at the LHC has to be analyzed and compared to theoretical predictions. On the other hand, these so-called Monte Carlo simulations have to be computed in the first place. For this purpose, a vast amount of storage has to be accessible to the scientific community to guarantee availability and redundancy. The processing of datasets and the generation of simulated events require a lot of CPUs and memory. To meet these needs, a grid of large computing clusters, the WLCG (1), has been established, featuring a tiered architecture with computing centres located all over the world. While the Tier 0 and Tier 1 centres are exclusively intended to store and replicate datasets, Tier 2 centres are meant to be used for analyses. Additionally, most research groups have locally deployed computing clusters which are mainly used for processing ntuples (2) and private Monte Carlo productions. For an effective usage of the available hardware, all computing sites feature batch systems scheduling the users' jobs to be processed on computing nodes. Commonly used batch systems in High Energy Physics (HEP) computing are Condor [52], LSF [53], TORQUE [54] and Oracle Grid Engine [55].

Even though prices for computer hardware are constantly decreasing, research groups intending to operate their own cluster not only have to consider the costs of acquisition but also the costs of maintenance as well as the costs of the human resources necessary for this task. In recent years, a new commercial sector has emerged which sells computing infrastructure as a service: Cloud computing.

This part of the thesis deals with the dynamic extension of local batch systems with Cloud resources in the field of HEP computing.

(1) Worldwide LHC Computing Grid
(2) Event container with user-defined content

In chapter 4.1, an introduction to Cloud computing with its various layers and techniques is given, followed by a presentation of batch systems in general and the Oracle Grid Engine in chapter 4.2. Chapter 4.3 is concerned with virtual private networks, which offer further opportunities in conjunction with Cloud computing and batch processing. Subsequently, chapter 4.4 deals with the scheduling of Cloud resources to handle peak loads in local batch systems by using the framework ROCED, followed by the prospects of this work in chapter 4.5.

4.1. Cloud Computing

Cloud computing has been one of the most promising topics in High Performance Computing (HPC) in the last few years. In general, Cloud computing gives the illusion of infinite computing resources and total elasticity: There are as many resources available as required, for the time needed, and only the resources which are actually used are charged. This opens all sorts of new possibilities for all types of user groups. For example, a start-up enterprise can begin with a small-scale version and test whether there is demand for such a service. Later on, it can dynamically add or remove resources depending on its success. It does not have to take the risks of high investments into hardware, which pays its revenue only over a longer period of time. This makes it possible to launch all kinds of new services which were up to now not considered profitable [56].

The Layers of Cloud Computing

According to [57], Cloud computing can roughly be divided into the following three categories:

- Software as a Service (SaaS)
  This service provides a ready-to-use software interface which is used within the Cloud and mostly needs no further installations on the customer's computer. Examples for this service are the image hosting service Flickr [58], the multi-purpose tool collection Google Apps [59] or the storage provider Dropbox [60].

- Platform as a Service (PaaS)
  PaaS provides software developers with a runtime environment in which the development and execution of programs is possible. Most of the time, there are restrictions to certain programming languages. Examples for this service are the web application infrastructures Google App Engine [61] and Windows Azure [62].

- Infrastructure as a Service (IaaS)
  Infrastructure as a Service provides access to a whole virtual machine including networking and storage and allows performing almost the same actions as on a local machine. Examples for this service are the Cloud management solutions by Amazon Elastic Compute Cloud (EC2) [63], Eucalyptus [64] and OpenNebula [65].

Figure 4.1.: The main Cloud categories: Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). Adapted from [44].

The hierarchy of the different layers is shown in figure 4.1. While it would be possible to run, for example, Software as a Service on a machine that is provided by Infrastructure as a Service, the other way round would not be possible. All three categories have in common that it is not obvious to the user where files are stored or on which physical computer programs are running. This is why it is called the Cloud. This decoupling of software and physical hardware is achieved by virtualization. A brief introduction to this technique is given below.

Furthermore, Clouds are separated into Public Clouds and Private Clouds. While Public Clouds (e.g. the Amazon EC2 web services) are mostly commercial and offer their service to everyone, Private Clouds are non-commercial and have a restricted group of users. This does not mean that Private Clouds are only employed by non-profit organizations. The significant difference between the two models resides in the fact that the service of a Private Cloud is of non-commercial purpose. Most Public Cloud providers offer a pay-as-you-go cost model which charges for used resources only. While storage, memory and CPU are rather cheap, network bandwidth is charged with higher fees.

Since physics analysis includes transferring large datasets, Public Clouds are not yet interesting for I/O-intensive (3) usage in HEP computing. This gap can be closed by Private Clouds, which can be run on local clusters. For this thesis, only Infrastructure as a Service has been used, in a local Cloud installation. From now on, the term Cloud is used as a synonym for the Infrastructure as a Service paradigm.

Virtualization

Virtualization is a computing technique which allows running several operating systems on the same physical hardware by encapsulating each in a so-called virtual machine (VM). First implementations of virtualization reach back to the 1960s, when many users shared one IBM mainframe [66]. These mainframes were able to cope with multiple operating systems running in parallel. However, with the rise of the personal computer, the trend changed from many users per system to one user per system. Nowadays, the setups of computing clusters are very heterogeneous, using various hardware and operating systems. To consolidate these setups, the technique of virtualization is reappearing in High Performance Computing systems. The virtualization of operating systems offers many advantages over running them natively:

- Reduction of over-provisioning of hardware resources
  Today's computing clusters are often designed to handle peak loads. This leads to under-usage of the existing hardware. Sharing the available resources among several operating systems results in a better overall utilization.

- Flexibility in the choice of the operating system
  Many applications require a specific operating system. Virtualization can provide every user with the operating system best suited for the specific needs.

- Availability
  The decoupling of software and hardware makes it possible to migrate virtual machines between hosts even during runtime. The possibility of copying whole machine templates makes it easy to set up additional machines and to create backups. This minimizes downtimes due to maintenance or hardware failures.

- Increased safety
  The encapsulation of each machine limits the risk of the host being affected by malfunctioning guest systems. The additional virtualization layer is an effective measure against compromises of the host system as well.

(3) Computations with a high data throughput.

To run multiple operating systems on one single host, several different virtualization techniques exist, which are briefly described in the following. A more detailed introduction can be found in [67, 68]. To understand the different approaches, a look at x86-based operating systems is helpful. x86 systems are designed to run directly on the hardware and are divided into privilege levels known as Rings 0, 1, 2 and 3, as depicted in figure 4.2. While user applications run in the unprivileged Ring 3, the operating system has to run in Ring 0 to guarantee direct access to the hardware. In order to virtualize the x86 architecture and to manage the hardware access of several operating systems, an additional layer has to be introduced on the level of Ring 0 [69].

Figure 4.2.: On the left hand side, a schematic overview of the x86 privilege level architecture without virtualization is shown. The operating system runs in the most privileged Ring 0 whereas user applications are executed in Ring 3. On the right hand side, a virtual machine monitor (VMM) is executed in Ring 0. The guest operating systems communicate with the physical hardware through the VMM by binary translation, whereas user applications can still be executed directly.

Full Virtualization

For full virtualization, a complete simulation of the underlying hardware is provided by executing a Virtual Machine Monitor (VMM) in Ring 0 for each started virtual machine. This layer provides all system services of the physical machine to the guest operating system (see figure 4.2). This is achieved by using binary translation, converting one instruction set into another. Meanwhile, applications running on the user level are directly executed on the processor. The guest operating system is not aware of this abstraction layer and needs no additional modifications. Since full virtualization completely decouples the operating system from the hardware, it is the most flexible of the three approaches, allowing to back up, migrate or port virtual machines easily.

Paravirtualization

Just like in the full virtualization approach, an abstraction of the underlying computer system is provided by the VMM. To improve performance and efficiency, paravirtualization allows the guest OS to access the system hardware directly. This is realized by hooks in the hypervisor system which are passed to the virtual machine. To handle these hooks, the guest OS has to be modified. Paravirtualization is depicted schematically in figure 4.3.

Hardware-Assisted Virtualization

With the success of virtualization, hardware vendors like Intel and AMD created new processor extensions (Intel VT-x [70] and AMD-V [71]) for the x86 architecture to simplify virtualization techniques. These extensions allow the VMM to run in a new root mode below Ring 0, as shown in figure 4.3. With hardware-assisted virtualization, certain privileged calls are directly passed to the hypervisor.

Figure 4.3.: On the left hand side, a schematic overview of the paravirtualization technique is depicted. Here, dedicated calls can be passed directly to the system hardware without binary translation. On the right hand side, the approach of hardware-assisted virtualization is shown, which uses special extensions introduced to simplify processor calls.

The OpenNebula Cloud Interface

There are several open source software projects which allow setting up a Cloud interface on top of a computing cluster, for example Eucalyptus (the open source implementation of the Amazon EC2 interface) [64], OpenStack [72] or OpenNebula [65]. Most Cloud interfaces offer a common set of basic services and feature similar designs but differ in some extended functionalities. Images of virtual machines can be uploaded to a Cloud Controller which acts as the Cloud manager as well as the access point to it. The different actions which can be performed on the virtual machines are executed on Cloud nodes which are usually part of a local cluster. A schematic overview of a typical Cloud setup is depicted in figure 4.4.

Figure 4.4.: Schematic overview of a typical Cloud setup. Taken from [73].

While [44] shows that it is possible to automatically extend a batch system with nodes from an Amazon EC2 and Eucalyptus Cloud, the work of this thesis extends the aforementioned concept with the possibility to use the OpenNebula Cloud interface.

OpenNebula (ONE) is an open source Cloud interface published under the Apache License v2. It allows establishing a private, public or hybrid Cloud and contains a collection of tools which unite different technologies for storage, networking, virtualization and monitoring. The CERN IT department uses OpenNebula in its lxcloud and was able to manage a large number of VMs simultaneously, demonstrating that ONE is a robust tool to handle large quantities of Cloud nodes. In the following, an outline of the main functionalities of OpenNebula is given [74].

Virtual Machine Image Management

In order to manage virtual machine images, OpenNebula offers the possibility to register user-defined images in a repository. Once registered, images can be shared with other users. An example of the usage of the image management is given in appendix A.1.1.

Virtual Network Management

The ONE network manager allows defining several virtual subnets with different IP ranges and/or subnet masks. This way, it is possible to isolate nodes from each other and to control network access. The network settings are applied during the contextualization process, which allows adding further settings and customizations to the virtual machine during the start-up process.

This includes adding default users as well as registering SSH (4) keys. To use this feature, the virtual machine has to be adjusted by a script that mounts an ISO image (5) which is attached to the machine by OpenNebula on start-up. The ISO image contains a file (context.sh) with all necessary information and a script (init.sh) which takes care of the rest of the contextualization. Examples of both scripts are given in appendix A.1.3 and A.1.4, respectively. The workflow is depicted in figure 4.5 and proceeds as follows:

1. Get a free IP address and all corresponding configurations from the OpenNebula network manager and write them to context.sh.
2. Copy the configurations together with init.sh (and other files) to an ISO image.
3. Start up the virtual machine and attach the ISO image.
4. Mount the ISO image and start the init.sh script.

Figure 4.5.: Schematic view of the OpenNebula contextualization process.

After an IP address has been assigned to a node, the ONE network manager marks it as used and only releases it after successful termination of the virtual machine.

Virtualization

OpenNebula supports several hypervisors for the virtualization, such as KVM, Xen and VMware. Images can be prepared locally and uploaded to the Cloud.

(4) The Secure Shell is a network protocol which is used to open secure connections to remote shells.
(5) An ISO (International Organization for Standardization) image is an archive file with an ISO 9660 file system which is used with CD-ROM or DVD media.

This allows easy configuration and maintenance of user-defined machines. On creation of a virtual machine, an overlay image is copied by OpenNebula to one of its nodes with spare resources and started by the corresponding hypervisor. Changes to the image which are made during runtime are lost after the termination of the machine. However, besides pausing and restarting a running machine, OpenNebula offers a feature which allows saving it in its current state as a newly created image. Another powerful feature is the migration of a virtual machine from one OpenNebula node to another. This can even be done live, which means that neither does the machine have to be shut down nor is a user aware of the change. To ease the monitoring of started instances, OpenNebula keeps track of the current state of all virtual machines. For debugging purposes, ONE also offers a VNC (6) interface through which it is possible to connect to machines directly.

Storage Management

In contrast to other Cloud interfaces, where it is possible to attach additional storage during runtime, OpenNebula only creates space for a virtual machine on its start-up. The amount of storage has to be declared within the virtual machine image template file together with other properties such as CPU or memory requirements. An example of such a template file is given in appendix A.1.2; a short illustrative sketch is shown below.

Interfaces

To perform actions on virtual machines (e.g. creating, pausing, migrating, etc.), OpenNebula can be addressed via several interfaces:

- OpenNebula command-line tools
- Open Cloud Computing Interface (OCCI)
- Distributed Resource Management Application API (DRMAA)
- Extensible Markup Language Remote Procedure Call (XML-RPC)

The OpenNebula user management only allows performing actions on machines which are owned (i.e. started) by the user. The command-line tools offer a maximum of control and functionality and can be used directly on the Cloud Controller via SSH. They are especially useful for testing, debugging or starting single machines. The Open Cloud Computing Interface is a remote management API for IaaS-based services, allowing for the development of interoperable tools for common tasks including deployment, autonomous scaling and monitoring.

(6) The Virtual Network Computing interface is a graphical desktop sharing tool to remotely control another computer.
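To illustrate the structure of the virtual machine template described under Storage Management above, a minimal sketch could look like the following, where all names and values are placeholders for this example (the templates actually used are those given in appendix A.1.2):

NAME    = "cloud-worker"
CPU     = 1
MEMORY  = 2048                                  # main memory in MB
DISK    = [ IMAGE_ID = 42 ]                     # registered image to boot from
NIC     = [ NETWORK = "cloud-net" ]             # virtual network defined in ONE
CONTEXT = [ FILES = "init.sh", TARGET = "hdc" ] # contextualization ISO image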

Similar to OCCI, the DRMAA interface was developed to provide a common interface to remotely manage and control Cloud nodes. The OpenNebula DRMAA implementation only allows performing actions on machines started during the same session, which does not suit the requirements for automated Cloud deployment.

The set of commands of the XML-RPC interface is almost identical to the command-line tools. Its usage requires the OpenNebula RPC server to be started. The authentication takes place through the OpenNebula user management. The user password is only hashed with the less secure SHA1 (7) algorithm and transferred as plain text; by establishing an SSH tunnel on the RPC port to the Cloud Controller, it is ensured that the password is nevertheless transmitted in a safe way. Since most programming languages already feature XML-RPC support, no additional libraries have to be included. Although the other APIs might be more sophisticated or portable, this thesis uses the XML-RPC API because of its simplicity and the advantage of using it without external libraries.

The CERN Virtual Software Appliance Project CernVM

The analysis of HEP data requires experiment-specific software which usually runs only on particular operating systems. The installation and maintenance of these tools is a rather time-consuming task: New releases have to be installed and previous releases have to be kept for backwards compatibility. Since the experiment software is fairly large (typically several gigabytes per release), it is difficult to store it together with the operating system in a single image of moderate size. A larger image increases the network overhead caused by the transfer of the image to the host system and results in a longer start-up time of the virtual machine.

The CernVM project [76] provides a subtle solution to these problems. It already provides nearly all analysis software versions of all main experiments at the LHC but still keeps a baseline image size of about 1 GB. This is achieved by decoupling the operating system from the experiment software. Using the rbuilder [77] toolchain, a minimal operating system that is sufficient to run a given application is constructed. To include the experiment-specific software, a new read-only file system (CernVM-FS) has been developed. CernVM-FS exploits the fact that a typical analysis job only requires a small fraction of the files of a specific release. To the user, it seems as if all files were stored in the virtual machine. Needed files are downloaded on demand from a proxy using HTTP and cached in the virtual machine.

(7) Secure Hash Algorithm 1, developed by the National Institute of Standards and Technology [75].

CernVM-FS is specifically designed to deal with network failures and is implemented into the system using the FUSE (8) kernel module. To reduce the network overhead, CernVM-FS uses file compression. Duplicate files are detected and downloaded only once. To increase speed and availability, local proxies can be installed and added to CernVM-FS. The key building blocks of CernVM are shown in figure 4.6.

Figure 4.6.: The key building blocks of a CernVM image: minimal operating system, the CernVM-FS read-only network file system using the HTTP protocol, and the contextualization and configuration interfaces. Taken from [79].

CernVM is available for several different hypervisor architectures and use cases. The Batch Node distribution can be used directly within OpenNebula by simply adapting the image to the contextualization described above. Furthermore, CernVM-FS can also be installed on an already existing machine. After start-up, the experiment-specific software is made available in CernVM by executing the following lines with root rights:

$> /etc/init.d/cvmfs stop                             # stop the CernVM-FS service
$> /etc/cernvm/config -c site CERNVM_ORGANISATION=CMS # configure the appliance for the CMS experiment
$> /usr/sbin/groupadd -r cms                          # create a local cms group and user account
$> /usr/sbin/adduser -g cms <username>
$> /bin/passwd <username>
$> /etc/init.d/cvmfs start                            # start CernVM-FS again with the new configuration

(8) The Filesystem in Userspace kernel module [78] allows non-privileged users to create their own file systems. This is achieved by running the file system in user space while FUSE provides a bridge to the actual kernel modules.

4.2. Distributed Resource Management by Batch Systems

The primary task of a batch system is to balance the load of many jobs among a given set of available computing resources. These resources are commonly organized in clusters which consist of worker nodes. A user can submit his job to a queue on a central batch server. The job is held in the queue until a worker node matching all requirements has spare resources. Subsequently, the job is handed over to the worker node for the actual processing. Batch systems usually feature a static setup: New nodes are added only if new hardware resources become available, and worker nodes are taken offline only due to failure or maintenance.

There are several popular software projects for Distributed Resource Management (DRM). Two well-established batch systems are TORQUE (9)/PBS (10) [54] and Grid Engine [55]. The IEKP (11) originally deployed the TORQUE resource manager on its local computing cluster but recently switched to Grid Engine. While [44] deals with the dynamic extension of the TORQUE/PBS batch system, this work deals with Grid Engine.

Grid Engine is an open source job scheduler originally developed by Sun Microsystems. With the acquisition of Sun by Oracle, the Oracle Grid Engine is closed source beginning with version 6.2u6. Thanks to its popularity, several open source projects ([80, 81, 82]) have been established by user communities which continue the development based on version 6.2u5.

For operating Grid Engine, a so-called master daemon has to be installed on the central server, which provides an interface for job submission and manages the balancing of the load. On the nodes, a so-called execution daemon (execd) has to be invoked. It is responsible for the connection to the master daemon and the execution of incoming jobs. For the communication between the two, the TCP/IP protocol is used.

As part of Grid Engine, the tool qconf can be utilized for managing the setup and adding/removing compute nodes. It allows assigning various roles to different user groups as well as to the computing elements. For example, new nodes can only be added by administration hosts.

Jobs are submitted by calling qsub <job script>. The job script is usually a shell script which calls further executables or exports necessary shell variables.

(9) Terascale Open-Source Resource and QUEue Manager
(10) Portable Batch System
(11) The Institute of Experimental Nuclear Physics (IEKP) of the Karlsruhe Institute of Technology (KIT)

qsub offers numerous parameters for specifying the behaviour of the job. For example, jobs are moved to predefined queues by passing -q <queue name>. This allows sorting jobs with similar requirements, such as the estimated execution time, into the same queues. To monitor the progress of submitted jobs, the tool qstat is used, which returns a list of all jobs and their current status.

4.3. Virtual Private Networks

For a seamless integration of Cloud resources into a local batch system, some further requirements have to be met to guarantee a transparent transition for users between native cluster nodes and Cloud nodes. A user who submits jobs to a queue expects the same working environment on both kinds of nodes. This presumes that the same storage elements and home folders are attached as on native machines for accessing datasets and storing the output of the jobs. Furthermore, no extra authentication or user management is expected. For this purpose, the LDAP [83] server of the local cluster has to be queried. Since Cloud nodes usually reside in networks other than the local one, they generally have no access to these elements for security reasons. By using a virtual private network (VPN), they can be integrated into the local network. The popular VPN client and server tool OpenVPN incorporates various measures to provide a high degree of security, which are explained in the following section.

Introduction to OpenVPN

A virtual private network is often used to connect remote computers (typically over the internet) with a local network. OpenVPN can be operated in two different modes:

- Routing mode (point-to-point connection)
  This mode creates an IP tunnel between two peers based on layer 3 of the OSI reference model (12). It only allows communication over IP packets, for which every peer is assigned a virtual IP address.

- Bridging mode (multi-client connection)
  As opposed to routing, the bridging mode is based on layer 2 of the OSI reference model and can carry any type of Ethernet traffic. Communication is not only possible between two points; the network behind a peer can also be accessed.

(12) Open Systems Interconnection Reference Model [84]

Since Cloud nodes have to be able to access storage elements and the LDAP server, only the bridging mode is of interest here. For this purpose, a dedicated gateway server for VPN connections has been established which acts like a software switch and has to be reachable via a public IP. The complete network traffic is passed through the VPN server, which also deals with the encryption and authentication of the nodes. For user authentication and encryption, OpenVPN utilizes the TLS (Transport Layer Security) protocol, a cryptographic standard for secure communication over the internet. It can be used in two different ways:

- Static key
  In static key mode, symmetric key cryptography is used: A pre-shared key is generated and handed to both OpenVPN peers before the tunnel is started. The same key is used to lock and unlock the data. Static keys are easy to generate, but since all peers share the same key, all keys have to be replaced if one is exposed.

- Certificates
  Certificates make use of asymmetric key cryptography, which requires two separate keys (a public encryption key and a private decryption key) to lock and unlock segments of the network traffic. Certificates can be issued either by a commercial or by a private Certification Authority (CA).

While the static key mode comes in handy for single VPNs or test connections, certificates are favorable for multi-client connections because they can be revoked individually, have a limited time of validity and can be created an arbitrary number of times. These are important features when it comes to integrating Cloud nodes into a local network. The management and authentication of the certificates is done by the Certification Authority.

For setting up a private CA and generating certificates, Easy-RSA is used, a simple-to-use key management tool contained in the OpenVPN software package. It makes use of the OpenSSL command-line tool, which has to be installed as well. Easy-RSA features the creation of certificates in batch mode: The certificate is generated together with a private key and signed by the CA in one step. Since the batch mode does not provide password protection, both files have to be copied to the destination host over a secure channel. By using the --pkcs12 flag, all files are stored in the PKCS #12 [85] format (13). The resulting file can then be copied to a client in a convenient way.

(13) PKCS is the Public-Key Cryptography Standard. The standard #12, also known as Personal Information Exchange Syntax Standard, specifies a format for storing or transporting a user's private keys, certificates, etc. in a single file.

Once the OpenVPN daemon on the VPN server is started (with proper configuration), clients can connect to it using a valid certificate. The revocation of valid certificates can also be done with Easy-RSA. Clients with revoked certificates are not able to connect again but stay connected for the time being. Further information on OpenVPN, its configuration and usage can be found in [86].

4.4. Scheduling Cloud Resources

The Institute of Experimental Nuclear Physics (IEKP) [87] at the Karlsruhe Institute of Technology (KIT) [88] takes an active role in the analysis of the data taken at the CMS experiment [17]. As already pointed out in chapter 3, this task involves the processing of huge amounts of data, requiring large computing resources. For the task of evaluating this data, the IEKP maintains a private cluster using the Oracle Grid Engine to spread the computing load over its nodes. Furthermore, members of the institute are granted access to the Instituts-Cluster IC1 [89], which is maintained by the Steinbuch Centre for Computing (SCC) [90]. (14)

A typical scenario during HEP analysis is shown in figure 4.7: During periods of high demand (e.g. before conferences, when finishing publications, etc.), local clusters are often completely occupied. Buying a new cluster is expensive, and most of the time it will be under-utilized.

Figure 4.7.: Typical utilization of a local batch system. The red line illustrates the available resources in a computing cluster. In times of high demand the cluster is over-utilized while it stays idle in times of low demand.

One possible solution for this problem is to expand into the Cloud. The elasticity of IaaS providers allows dynamically adding or removing Cloud resources and thus handling these peak loads, as shown in figure 4.8. For the task of monitoring the available resources and administrating the expansion with Cloud nodes, a special scheduler has been developed at the IEKP: the ROCED framework [44].

(14) The utilization of the IC1 cluster is part of several other diploma and PhD theses [67, 68, 2].

Figure 4.8.: Workload of a batch system. There are times with high and times with low demand.

ROCED

The ROCED Design Baseline

ROCED stands for Responsive On-demand Cloud Enabled Deployment and is a meta-scheduler for integrating Cloud resources into local batch systems. It is written in Python 2.6 [91] and features a modular design which is schematically depicted in figure 4.9. Every major part is implemented as an Adapter; the Adapters are connected by the ROCED Core, which orchestrates the dedicated tasks of each Adapter (a minimal sketch of this interface is given below):

- Requirement Adapter
  The Requirement Adapter supplies information on the status of a DRM system. The size of a certain queue is an indicator of how many machines are needed.

- Site Adapter
  The Site Adapter delivers the interface to the Cloud site. It handles the requests for booting new or stopping spare machines.

- Integration Adapter
  The Integration Adapter is assigned to register new nodes in the batch system. Machines which are going to be shut down have to be removed properly.

At the current state of development, there are Adapters available for the DRM systems TORQUE and Grid Engine and for the Cloud interfaces Amazon EC2, Eucalyptus and OpenNebula. Furthermore, all components can be tested locally by using dummy Adapters which are able to simulate a real Cloud site or a batch system. The modular design of ROCED makes it possible to add new Adapters in an easy and convenient way. The development and testing of the Grid Engine and OpenNebula Adapters was part of this thesis.
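The Adapter concept can be sketched in a few lines of Python. This is a simplified illustration of the modular design, not the actual ROCED class layout; all names and the cost attribute are chosen for this sketch, and the method names loosely follow the Adapter functions discussed later in this chapter:

class BaseAdapter(object):
    """Common interface; the Core calls manage() once per management cycle."""
    def manage(self):
        pass
    def onevent(self, machine, new_state):
        """Called whenever the state of a virtual machine changes."""
        pass

class RequirementAdapter(BaseAdapter):
    """Reports how many machines are needed, e.g. from a batch queue size."""
    def needed_machines(self):
        raise NotImplementedError

class SiteAdapter(BaseAdapter):
    """Boots and terminates machines on one Cloud site."""
    cost = 0.0                              # used by the Broker to rank sites
    def spawn_machines(self, count):
        raise NotImplementedError
    def terminate_machines(self, count):
        raise NotImplementedError

class IntegrationAdapter(BaseAdapter):
    """Registers new nodes in the batch system and removes drained ones."""
    def integrate_node(self, host):
        raise NotImplementedError

class Broker(object):
    """Decides how many machines to start, preferring cheap sites first."""
    def __init__(self, requirements, sites):
        self.requirements = requirements
        self.sites = sites
    def decide(self):
        needed = sum(r.needed_machines() for r in self.requirements)
        for site in sorted(self.sites, key=lambda s: s.cost):
            site.spawn_machines(needed)     # naive: hand everything to the
            break                           # cheapest site in this sketch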

The ROCED Core is the central unit of the framework and contains the ROCED Broker: Based on the information supplied by the other Adapters, the Broker decides how many machines have to be started or shut down. A cost model of the different possible sites allows ROCED to use cheap sites first in order to avoid expensive ones.

Figure 4.9.: The design baseline of ROCED (Responsive On-demand Cloud Enabled Deployment).

The Workflow

In the ROCED workflow, the repeated scanning of available and required resources plays a crucial role. The various actions performed by ROCED can be summed up in the following five steps (cf. figure 4.10):

1. Monitor queue
   In a first step, one or more batch server queues are passively monitored by the Requirement Adapter. This monitoring is performed periodically within the ROCED management cycle.

2. Boot VM
   Depending on the queue size (the number of jobs in a queue), the ROCED Broker decides how many machines are actually needed at that time. The Site Adapter contacts the cheapest Cloud provider available and starts the needed number of virtual machines.

3. Add node
   After start-up, the new hosts are added to the batch system by the Integration Adapter and become available for job processing.

4. Execute job
   The job is submitted to the newly started node and is executed.

5. Remove & shutdown
   Provided that there are no new job submissions, the Cloud nodes are automatically removed from the batch system and subsequently shut down.

Figure 4.10.: Lifecycle of a virtual machine started by ROCED. Here, cloud.q is the name of the monitored queue. In order to communicate with the master daemon on the central batch server, every node has an execution daemon execd installed.

The ROCED State Machine

To manage the progress of the different Cloud nodes, ROCED features a strictly linear state machine (cf. figure 4.11). Transitions between two states of a virtual machine can only be triggered by well-defined Adapters. The decision of the ROCED Broker only affects machines in the state down or working. This means, for example, that booting machines have to reach the state working first before ROCED can shut them down.

The possible states of a Cloud node are:

- booting: The machine is booting but is not available yet.
- up: The machine was fully deployed and is ready to be integrated.
- integrating: The batch system is in the process of integrating the machine.
- working: The machine has successfully been registered in the batch system and is ready to execute jobs.
- pending disintegration: The batch system has set the node to draining mode: no new jobs are passed to it, but running ones are waited for. The machine is marked to be shut down.
- disintegrating: All assigned jobs have finished and the node is unregistering from the batch system.
- disintegrated: The node has successfully been disintegrated and is ready to be shut down.
- down: The machine has been shut down.
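The linear state machine can be expressed compactly; the following Python fragment is an illustration of the concept, not the ROCED implementation:

# the strictly linear sequence of node states listed above
STATES = ["booting", "up", "integrating", "working",
          "pending disintegration", "disintegrating", "disintegrated", "down"]

class NodeState(object):
    def __init__(self):
        self.index = 0            # every freshly requested node starts "booting"

    @property
    def state(self):
        return STATES[self.index]

    def advance(self):
        """Only the transition to the next state in the sequence is allowed."""
        if self.index + 1 >= len(STATES):
            raise ValueError("machine is already down")
        self.index += 1

# a booting machine must pass through "working" before it can be drained:
node = NodeState()
while node.state != "working":
    node.advance()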

Figure 4.11.: Overview of the ROCED state machine. The decision on how many nodes have to be deployed or shut down is made by the ROCED Broker. Only machines in the state working can be removed from the system. While most Cloud interfaces provide the possibility to stop (or pause) and restart virtual machines, this feature is not yet implemented in the current version of ROCED. User jobs will always find a clean machine, and all user data will be wiped after the job has been executed.

Integration of OpenNebula Cloud Nodes into the IEKP HEP Computing Workflow

The work of [44] already provides Adapters for the Amazon EC2 and Eucalyptus Cloud interfaces as well as an Adapter for the TORQUE batch system. Dynamically integrating Cloud nodes from the OpenNebula Cloud interface at the Steinbuch Centre for Computing into the local institute cluster at the IEKP, which is running Oracle Grid Engine, requires the corresponding Adapters for ROCED. As described in chapter 4.3, the seamless integration requires that the nodes reside in the local network to enable access to storage elements and user management. For this purpose, the OpenVPN tool was used to establish a bridge. However, the utilization of a VPN carries further security risks, and in case of a security breach, unauthorized access to the local network has to be prevented. Since a VM is stored as a virtual machine image, its contents can be accessed without knowing user accounts or passwords or even starting it. This is the reason why safety-relevant parts (like SSH keys or certificates) should always be initiated only after start-up. The development and testing of the Adapters for OpenNebula and Grid Engine as well as the interface to OpenVPN was part of this thesis and is described in the following.

OpenNebula Site Adapter

The OpenNebula Site Adapter is responsible for starting and stopping virtual machines as well as for monitoring their current status. For every class of Adapters, ROCED provides a Base Adapter including a set of well-defined functions which are inherited by the specific implementations, providing a unified interface to each of the Adapters. For example, starting or stopping a machine is initiated by calling the functions spawnmachines() and terminatemachines(), respectively. Additionally, there are two functions which are encountered in the Requirement Adapter as well, namely manage() and onevent(). The first one is called in every management cycle and is supposed to contain tasks which have to be executed regularly. The Site Adapter uses this, for example, for monitoring the booting status of a virtual machine. The latter is called when the state of a virtual machine is altered. For example, changing the status from disintegrating to disintegrated triggers the Site Adapter to shut down the machine.

The OpenNebula Site Adapter makes use of the XML-RPC interface of the Cloud Controller. As already described in chapter 4.1, it provides the same functionalities as the command-line tool on the Cloud Controller. An example of how to start and stop machines is given in appendix A; a short sketch follows below. More information on this interface can be found under [92].
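A minimal Python 2 sketch of such XML-RPC calls is shown below. The template file name and credentials are placeholders, and the exact method set and the layout of the returned values depend on the OpenNebula version, so this should be read as an outline rather than working production code (cf. [92] and appendix A):

import hashlib
import xmlrpclib

# session string: depending on the server setup, the plain or the
# SHA1-hashed password is expected after the user name
session = "oneuser:" + hashlib.sha1("secret").hexdigest()

# default ONE RPC port, here reached through an SSH tunnel to the Cloud Controller
server = xmlrpclib.ServerProxy("http://localhost:2633/RPC2")

# boot a machine from a template file (placeholder name)
template = open("worker.template").read()
response = server.one.vm.allocate(session, template)

# the response typically carries a success flag and the new VM id
if response[0]:
    vm_id = response[1]
    # ... and later, shut the machine down again
    server.one.vm.action(session, "shutdown", vm_id)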

Oracle Grid Engine Requirement and Integration Adapters

For the usage of the Oracle Grid Engine, two Adapters had to be developed: the Requirement Adapter for monitoring the current number of jobs in a batch queue and the Integration Adapter for registering new compute nodes and removing unneeded ones.

In order to estimate the current number of jobs residing in a queue (here named cloud.q), a connection to the batch server is established via SSH during each management cycle. Calling qstat -q cloud.q -u "*" | wc -l returns the number of jobs of all users (plus a two-line offset due to the qstat list header) by piping the output of qstat to the built-in shell line counter.

Registering newly spawned Cloud nodes in Grid Engine is performed by adding each one to the list of administrative hosts. This allows the use of the automatic installation feature of the inst_sge installation utility by calling ./inst_sge -x -auto <config file> on the node. As soon as the installation has finished successfully, the execution daemon registers itself with the master daemon and starts operating. To detach a node, it is set to draining mode, which allows already started jobs to finish but declines new ones. Subsequently, ./inst_sge -ux -auto <config file> is executed to remove and uninstall the execution daemon automatically.

Integration of OpenVPN into ROCED

As described in chapter 4.3, the use of OpenVPN requires a VPN server which is reachable from the outside world on a public IP address. This is realized by installing a dedicated virtual machine on the IEKP cluster operating as the gateway between the Cloud nodes and the local network. For the integration of OpenVPN, some additional steps are performed by ROCED, aiming to minimize the risk of unauthorized access to the local network. The following scheme is applied for each machine (cf. figure 4.12):

1. ROCED initiates the boot sequence of a VM.
2. After start-up, ROCED creates a certificate and delivers it through the VPN server to the Cloud node via SSH.
3. The Cloud node connects to the VPN server.
4. The certificate is revoked right after the connection is established to prevent abuse by possible attackers.
5. The node connects to the LDAP server to retrieve user information and mounts home directories and storage elements via NFS.
6. ROCED retrieves the VPN IP address from the node and integrates it into the batch system.

After the integration, the complete communication between the batch system and the Cloud node is performed via the VPN tunnel.

Figure 4.12.: Integration of OpenVPN into the ROCED workflow.

4.5. Conclusions and Outlook

The work of this thesis shows that it is possible to dynamically integrate OpenNebula Cloud nodes into the local batch system Grid Engine using the ROCED framework. For this purpose, the corresponding Adapters were developed for ROCED. The testing setup has been operating for several weeks and allowed access to a limited group of users. During this time, several thousand test jobs were computed and several hundred Cloud nodes were automatically deployed. As already shown in [44], ROCED exhibits an excellent scaling behavior: Handling up to sixty machines simultaneously was no problem at all when employing the newly developed Adapters for Grid Engine and OpenNebula. The limiting factor here was mainly the maximum number of virtual machines which could be started on the SCC Cloud installation without disrupting other users.

Using OpenVPN, ROCED was able to integrate the running Cloud nodes into the local network of the IEKP. The necessary OpenVPN server operated without any problems during the testing time. The automated creation and revocation of certificates worked flawlessly. After establishing the VPN connection, the Cloud nodes were able to automatically query the local LDAP server for user management and to mount the available storage elements located at the IEKP. Running jobs were thus able to read input from these storage elements and to write their output back.

The performance of integrating Cloud nodes into the local network is part of current investigations. Sporadically appearing LDAP query malfunctions could be countered in the future by imposing a timeout on the integration time, after which a Cloud node is shut down and a new one is started. These failures cause Grid Engine to throw errors when trying to integrate the node, resulting in stuck virtual machines.

Since ROCED exhibits an excellent performance, there are considerations to publish it under the General Public License v3 (GPLv3) [93]. Further developments of the ROCED framework will include the introduction of configuration files, to which hard-coded preferences will be moved and which will be parsed at start-up. Another interesting project could be the development of new Adapters allowing the use of the OpenStack Cloud interface or the LSF batch system. The development of OpenStack, which is already praised as the new rising star among Cloud interfaces, is maintained by the Cloud hosting service Rackspace [94] and the National Aeronautics and Space Administration (NASA) [95] and backed by major IT enterprises like Dell [96], Citrix [97], AMD [98], Intel and Hewlett Packard [99].

Another interesting topic is the combination of dynamic worker node virtualization and responsive extension to the Cloud. For this task, the virtual queues of ViBatch [100] could be expanded with Cloud nodes spawned by ROCED.

Chapter 5

Studies of the Underlying Event

Besides the hard partonic scattering process in proton-proton collisions, additional soft contributions originating from QCD background processes arise. These soft contributions, referred to as the Underlying Event (UE), pollute the detector and thus influence all measurements. The origin of the UE is a direct consequence of the compositeness of hadrons: remnants of the destroyed protons are unstable color-charged objects radiating gluons that form new particles. Additionally, semi-hard parton processes called multiple parton interactions (MPI) take place and contribute to the UE. Unfortunately, the UE cannot be unambiguously distinguished from the hard interaction (cf. figure 5.1), and especially jet-based observables suffer from the additional soft contributions. Since the use of perturbative methods is not possible in the low-energy regime, Monte Carlo event generators employ phenomenologically motivated models to account for the UE (cf. chapter 3.1.3).

Since the UE is supposed to dominate phase space regions different from those of the hard process, all available Monte Carlo tunes were obtained by slicing the event topology geometrically into different regions with respect to the hardest object in the given event. Chapter 5.1 gives a short introduction to this so-called traditional approach of measuring the Underlying Event.

In 2009, a new method for quantifying the UE was introduced [101], called the jet area/median approach. It introduced a new variable ρ which takes into account the catchment area of the jets in the η-φ plane as well as their transverse momentum on an event-by-event basis. First analyses published in 2010 and 2011 [1, 8, 102] presented detailed studies using an adjusted observable ρ for the data collected in 2009 at a center-of-mass energy of √s = 900 GeV. This thesis presents a refined study of ρ (see chapter 5.7), investigating the influence of the event scale on ρ. In a hard partonic scattering, the event scale is defined by the momentum transfer of the hard interaction process. In the traditional approach, this scale is usually defined by the transverse momentum of the leading object in the given event.

usually defined by the transverse momentum of the leading object in the given event. For this study, the jet with the highest transverse momentum was chosen. Chapter 5.6 provides detailed studies of systematic uncertainties and their influence on ρ. For this analysis, data collected in 2010 at both center-of-mass energies √s = 900 GeV and √s = 7 TeV was used.

Figure 5.1.: Schematic partitioning of hadron collisions into hard and soft contributions (labels: hard scattering, ISR, FSR, outgoing partons, protons, Underlying Event). Taken from [102].

5.1. The Traditional Approach of Measuring the Underlying Event

The traditional approach of measuring the Underlying Event assumes that the particles originating from the hard and the soft scattering process populate different regions of geometrical space. For this purpose, events with a suitable topology, such as di-jet events or Drell-Yan muon pairs, are selected, and the direction of the leading object, i.e. the object with the highest transverse momentum p_T, is determined. In a next step, the region transverse to the leading object is examined (cf. figure 5.2), where the bulk of the particles originating from the Underlying Event is expected to appear. The activity due to the UE is then quantified by the momentum sum Σp_T as well as the track multiplicity in the transverse region. This method has been successfully applied at the Tevatron [103], the Relativistic Heavy Ion Collider (RHIC) [104, 105] and the LHC [106].
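To make the procedure concrete, the following Python sketch computes the two UE observables in the transverse region; it assumes the conventional definition 60° < |Δφ| < 120° relative to the leading object and a simplified track representation as (p_T, φ) tuples:

    import math

    def transverse_region_activity(tracks, phi_leading):
        """Return (sum p_T, track multiplicity) in the transverse region.

        tracks: list of (pt, phi) tuples -- a simplified representation.
        phi_leading: azimuthal angle of the leading object.
        """
        sum_pt, n_trk = 0.0, 0
        for pt, phi in tracks:
            dphi = abs(phi - phi_leading)
            if dphi > math.pi:                      # fold Delta phi into [0, pi]
                dphi = 2.0 * math.pi - dphi
            if math.pi / 3.0 < dphi < 2.0 * math.pi / 3.0:
                sum_pt += pt
                n_trk += 1
        return sum_pt, n_trk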

Figure 5.2.: Slicing of the event topology in the traditional approach of measuring the Underlying Event. Taken from [23].

5.2. The Jet Area/Median Method

The theoretical reasoning behind the jet area/median approach is given in [101], which introduces a new observable ρ to estimate the overall background activity in an event. It is defined as the median of the distribution of p_T,j / A_j of all jets in an event,

    ρ = median_{j ∈ jets} [ { p_T,j / A_j } ],    (5.1)

where p_T,j is the transverse momentum and A_j the area of jet j. The area is determined using the active area clustering technique described in chapter. Since cone-type jet algorithms as well as the anti-k_T jet algorithm enforce a fixed shape of the jet area, the k_T clustering algorithm, which allows variable jet shapes, was chosen. Figure 5.3 depicts a representative distribution of p_T,j / A_j. The median of this distribution was chosen because it is much less susceptible to hard outliers in an event (e.g. hard jets) than the mean of the distribution. Assuming that the majority of the event is dominated by soft contributions, the observable ρ can be interpreted as a measure of the average activity due to the Underlying Event. In contrast to the traditional approach, the jet area/median approach is suited for all kinds of event topologies. Using this method, areas containing no physical objects are covered with ghost jets, and ρ becomes zero if the number of these ghost jets exceeds the number of physical jets. Since the events used in this analysis showed a tendency towards
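As a concrete illustration of equation (5.1), the following Python sketch computes ρ from a list of jets; the (p_T, area) tuple layout is a simplification, and the jet list is assumed to come from a k_T clustering with active areas, including the pure ghost jets with p_T ≈ 0 that drive ρ to zero when they outnumber the physical jets:

    def rho_median(jets):
        """Compute rho, the median of p_T / A over all jets of an event.

        jets: list of (pt, area) tuples from a k_T clustering with active
        areas; ghost jets enter with pt close to zero.
        """
        ratios = sorted(pt / area for pt, area in jets if area > 0.0)
        if not ratios:
            return 0.0
        n = len(ratios)
        mid = n // 2
        if n % 2 == 1:
            return ratios[mid]
        return 0.5 * (ratios[mid - 1] + ratios[mid])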
