Large-scale performance monitoring framework for cloud monitoring. Live Trace Reading and Processing



Similar documents
Virtual machine CPU monitoring with Kernel Tracing

Efficient and Large-Scale Infrastructure Monitoring with Tracing

LinuxCon Europe Cloud Monitoring and Distribution Bug Reporting with Live Streaming and Snapshots.

Thomas Fahrig Senior Developer Hypervisor Team. Hypervisor Architecture Terminology Goals Basics Details

Nested Virtualization

Nested Virtualization

Full and Para Virtualization

Tracing Kernel Virtual Machines (KVM) and Linux Containers (LXC)

Cloud Operating Systems for Servers

Nested Virtualization

The QEMU/KVM Hypervisor

Virtual Computing and VMWare. Module 4

Intro to Virtualization

Architecture of the Kernel-based Virtual Machine (KVM)

Virtualization in Linux KVM + QEMU

Enabling Technologies for Distributed Computing

EFFICIENT ANALYSIS OF APPLICATION SERVERS IN THE CLOUD

Perf Tool: Performance Analysis Tool for Linux

ServerPronto Cloud User Guide

Chapter 5 Cloud Resource Virtualization

Virtualization. Pradipta De

Distributed Systems. Virtualization. Paul Krzyzanowski

Rackspace Cloud Databases and Container-based Virtualization

Enabling Technologies for Distributed and Cloud Computing

Chapter 14 Virtual Machines

Virtualization Technologies

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR

HRG Assessment: Stratus everrun Enterprise

Software Tracing of Embedded Linux Systems using LTTng and Tracealyzer. Dr. Johan Kraft, Percepio AB

Performance Profiling in a Virtualized Environment

KVM: Kernel-based Virtualization Driver

Performance Testing of a Cloud Service

CFS-v: I/O Demand-driven VM Scheduler in KVM

Real-time KVM from the ground up

Performance Management for Cloudbased STC 2012

Rebuild Perfume With Python and PyPI

Hypervisors. Introduction. Introduction. Introduction. Introduction. Introduction. Credits:

Development of Type-2 Hypervisor for MIPS64 Based Systems

IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version Fix Pack 2.

Using Linux as Hypervisor with KVM

International Journal of Computer & Organization Trends Volume20 Number1 May 2015

MODULE 3 VIRTUALIZED DATA CENTER COMPUTE

Virtualization: Know your options on Ubuntu. Nick Barcet. Ubuntu Server Product Manager

Virtualization for Cloud Computing

Platform as a Service and Container Clouds

CS 695 Topics in Virtualization and Cloud Computing. More Introduction + Processor Virtualization

Real-Time KVM for the Masses Unrestricted Siemens AG All rights reserved

Utilization-based Scheduling in OpenStack* Compute (Nova)

Dheeraj K. Rathore 1, Dr. Vibhakar Pathak 2

Database Virtualization

Basics of Virtualisation

KVM Architecture Overview

JOB ORIENTED VMWARE TRAINING INSTITUTE IN CHENNAI

HPC performance applications on Virtual Clusters

x86 ISA Modifications to support Virtual Machines

Module I-7410 Advanced Linux FS-11 Part1: Virtualization with KVM

A hypervisor approach with real-time support to the MIPS M5150 processor

Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies

9/26/2011. What is Virtualization? What are the different types of virtualization.

Monitoring Elastic Cloud Services

SUSE Cloud Installation: Best Practices Using an Existing SMT and KVM Environment

PARALLELS CLOUD SERVER

Virtualization. Types of Interfaces

Small is Better: Avoiding Latency Traps in Virtualized DataCenters

Resource usage monitoring for KVM based virtual machines

Cloud Computing for Control Systems CERN Openlab Summer Student Program 9/9/2011 ARSALAAN AHMED SHAIKH

Brian Walters VMware Virtual Platform. Linux J. 1999, 63es, Article 6 (July 1999).

HyperV_Mon 3.0. Hyper-V Overhead. Introduction. A Free tool from TMurgent Technologies. Version 3.0

Affinity Aware VM Colocation Mechanism for Cloud

Big Data Use Case. How Rackspace is using Private Cloud for Big Data. Bryan Thompson. May 8th, 2013

An Introduction to Service Containers

Performance tuning Xen

Microsoft Hyper-V chose a Primary Server Virtualization Platform

An Experimental Study of Load Balancing of OpenNebula Open-Source Cloud Computing Platform

Introduction to Virtualization & KVM

An Oracle Technical White Paper June Oracle VM Windows Paravirtual (PV) Drivers 2.0: New Features

Hypervisors and Virtual Machines

Marvell DragonFly Virtual Storage Accelerator Performance Benchmarks

A quantitative comparison between xen and kvm

Dynamic Resource allocation in Cloud

SQL Server Virtualization

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build

RED HAT ENTERPRISE VIRTUALIZATION & CLOUD COMPUTING

Eloquence Training What s new in Eloquence B.08.00

A Network Marketing Model For Different Host Machines

VMware Server 2.0 Essentials. Virtualization Deployment and Management

Virtualization with Windows

KVM on S390x. Revolutionizing the Mainframe

Cloud^H^H^H^H^H Virtualization Technology. Andrew Jones May 2011

KVM: A Hypervisor for All Seasons. Avi Kivity avi@qumranet.com

SUSE Linux Enterprise 10 SP2: Virtualization Technology Support

M6422A Implementing and Managing Windows Server 2008 Hyper-V

SERVICE SCHEDULE PULSANT ENTERPRISE CLOUD SERVICES

Enterprise-Class Virtualization with Open Source Technologies

Enhancing Hypervisor and Cloud Solutions Using Embedded Linux Iisko Lappalainen MontaVista

SAIVMM: SELF ADAPTIVE INTELLIGENT VMM SCHEDULER FOR SERVER CONSOLIDATION IN CLOUD ENVIRONMENT

<Insert Picture Here> Tracing on Linux

Transcription:

Large-scale performance monitoring framework for cloud monitoring Live Trace Reading and Processing Julien Desfossez Michel Dagenais May 2014 École Polytechnique de Montreal

Live Trace Reading Read the trace while it is being recorded Local or remote session Configurable flush period (live-timer) Merged into LTTng 2.4.0 Supported by Babeltrace 1.2 and LTTngTop Work in progress in TMF 2

Infrastructure integration Server (lttng-sessiond) Server (lttng-sessiond) Server (lttng-sessiond) TCP lttng-relayd TCP Viewer 3

Live streaming session On the server to trace : $ lttng create - live 2000000 -U net://10.0.0.1 $ lttng enable-event -k sched_switch $ lttng enable-event -k -syscall -a $ lttng start On the receiving server (10.0.0.1) : $ lttng-relayd -d On the viewer machine : $ lttngtop -r 10.0.0.1 Or $ babeltrace -i lttng-live net://10.0.0.1 4

What has been done since the last progress report meeting Bugfixing and release of LTTng 2.4.1 Graphite integration tests Stress/performance testing Started Zipkin/Tomograph integration to trace OpenStack (Python) Working with an GSoC intern on Babeltrace to Zipkin Sysadmin-oriented analyses prototypes (Python) Writing the paper about live tracing 5

Graphite Integration 6

Stress-testing setup 48 AMD Opteron(tm) Processor 6348 512GB RAM 4x1TB SSD (1 for the OS, 1 for the VMs, 1 for the traces) Ubuntu 14.04 LTS Linux Kernel 3.13.0-16 LTTng Tools 2.4+ (git HEAD on March 10 th) 7

Stress-testing 100 Ubuntu 12.04 VMs with 1GB RAM and 1 vcpu Streaming their traces to the host lttng-relayd with the live-timer of 5 seconds Tracing syscalls + sched_switch Running Sysbench OLTP (MySQL stress test) Measure overall impact on the system 8

100 Sysbench 9

Python analyses demo 10

Next steps Finish writing the paper Work on the architecture to process traces and extract metrics from large group of machines Studying the large-scale infrastructures monitoring systems Studying HTTP analytics on large-scale web infrastructures Look at Facebook Scribe and integration with Hadoop HDFS Continue prototyping with the Python libraries 11

Install it Packages for your distro (lttng-modules, lttng-ust, lttng-tools, userspacercu, babeltrace) For Ubuntu : PPA for daily build (lttngtop) Or from the source, see http://git.lttng.org 12

LTTng 2.5 features Save/Restore sessions lttng save lttng restore Configuration file (lttng.conf) System-wide : /etc/lttng/lttng.conf User-specific : $HOME/.lttng/lttng.conf Run-time Perf UST User-defined modules on lttng-sessiond startup lttng --version with git commit id 13

Questions? 14

Virtual machine CPU monitoring with Kernel Tracing Mohamad Gebai Michel Dagenais 15 May, 2014 École Polytechnique de Montreal

Content General objectives Current approaches Kernel tracing Trace synchronization Virtual Machine Analysis Execution flow recovery

General objectives Getting the state of a virtual machine at a certain point in time Quantifying the overhead added by virtualization Track the execution of processes inside a VM Aggregate information from host and guests Monitoring multiple VMs on a single host OS Finding performance setbacks due to resource sharing among VMs

Current approaches Top Steal time: percentage of vcpu preemption for the last second Does not reflect the effective load on the host 0% for idle VMs even if the physical CPU is busy Not enough information

Current approaches Perf kvm Information about VM exits, performance counters No information from inside the VM No information about VM interactions

Kernel tracing Trace scheduling events sched_switch for context switches sched_migrate_task for thread migration between CPUs (optional) sched_process_fork, sched_process_exit Trace VMENTRY and VMEXIT on the hypervisor (hardware virtualization) kvm_entry kvm_exit

Tracing virtual machines Each VM is a process Each vcpu is 1 thread Per-thread state can be rebuilt A vcpu can be in VMX root mode or VMX non-root mode A vcpu can be preempted on the host The VM can't know when it is preempted or in VMX root mode Processes in the VM seem to take more time Trace host and guests simultaneously

Trace synchronization Time difference between host and an idle VM

Trace synchronization Time difference between host and an active VM

Trace synchronization Based on the fully incremental convex hull synchronization algorithm 1-to-1 relation required between events from guest and host Tracepoint is added to the guest kernel Executed on the system timer interrupt softirq Triggers a hypercall which is traced on the host Resistant to vcpu migrations and time drifts

Trace synchronization Kernel module added to LTTng as an addon In the guest: Trigger a hypercall (event a) On the host: Acknowledge the hypercall (event b) Give control back to the guest (event c) In the guest: Acknowledge the control (event d)

Trace synchronization Host and guest threads, as seen before....and after synchronization

Trace synchronization Time difference between host and VM after synchronization

TMF Virtual Machine View Shows the state of each vcpu of a VM Aggregation of traces from the host and the guests 2 VM: Debian and Ubuntu vcpu 0 and vcpu 1 are complementary; fighting over the same pcpu

TMF Virtual Machine View Detailed information of execution inside the VM Process burnp6 (TID 2635) is deprived from the pcpu while the CPU time is still accounted for

TMF Virtual Machine View Shows latency introduced by the hypervisor (ie. emulation in KVM) to the nanosecond scale

Use case Periodic critical task Inexplicably takes longer on some executions 100% CPU usage from the guest's point of view

Use case VCPU is preempted on the host Invisible to the VM Duration of preemption is easily measurable

Execution flow recovery Build the execution flow centered around a certain task A List of execution intervals affecting the completion time of A Find the source of preemption across systems Example:

Execution flow recovery Previous example: Execution flow centered around task 3525:

Acknowledgements Ericsson CRSNG Professor Michel Dagenais Geneviève Bastien Francis Giraldeau DORSAL Lab