Scaling Networking Applications to Multiple Cores


1 Scaling Networking Applications to Multiple Cores
Greg Seibert, Sr. Technical Marketing Engineer, Cavium Networks

2 Challenges with multi-core application performance
- Amdahl's Law
  - Evaluates application performance from the perspective of running time
  - Overall application performance scaling is limited to the proportion of processing that can be done in parallel
  - Scaling limitation is intrinsically related to the type of processing being done
- Evaluating System Performance of Networking Applications
  - How much data can it pass?
  - How many packets per second?
- Scaling Parallelization
  - Networking applications provide a convenient quantum of work: the packet
  - Flows are mostly independent
  - Critical regions
    - Per-flow data structures
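As a rough illustration of the Amdahl's Law point, the sketch below computes the ideal speedup for a given parallel fraction and core count. The function name `amdahl_speedup` and the sample values (95% parallel work, up to 16 cores) are assumptions chosen for illustration; they do not come from the slides.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when `parallel_fraction` of the run time scales across `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time on any number of cores;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the processing can run in parallel.
    for n in (1, 2, 4, 8, 16):
        print(n, "cores ->", round(amdahl_speedup(0.95, n), 2), "x")
```

With a 5% serial share the speedup can never reach 20x, however many cores are added, which is why the later slides concentrate on shrinking critical regions.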

3 Multi-core Programming Techniques
- Independent processes on each core
  - Each process can maintain state in local storage and avoid shared-memory contention
  - Processes are tightly coupled via in-memory IPC mechanisms
- Pipelined
  - Divide the application into stages
  - Each stage can be limited to fit completely into the instruction cache
  - Application performance is limited to the throughput of the slowest stage
  - Entire application requires a priori division of operations
- Symmetric Multi-Processing (SMP)
  - Same program/image running on multiple cores
  - All instances are identical and can load-balance organically
  - Classic implementations are difficult to scale
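The claim that a pipeline is bounded by a single stage can be made concrete with a back-of-the-envelope model. This is a minimal sketch under stated assumptions: one core per stage, a fixed cycle cost per packet per stage, and no cache or lock effects. The function names and the cycle counts are hypothetical, not measurements.

```python
def pipeline_pps(stage_cycles, core_hz):
    """Packets/s of a pipeline with one core per stage: bounded by the slowest stage."""
    return core_hz / max(stage_cycles)


def run_to_completion_pps(stage_cycles, core_hz, cores):
    """Packets/s when every core runs all stages on its own packets (ideal, no contention)."""
    return cores * core_hz / sum(stage_cycles)


if __name__ == "__main__":
    # Hypothetical per-packet cost of three stages, in CPU cycles, on 1 GHz cores.
    stages = [200, 500, 300]
    print("pipeline:         ", pipeline_pps(stages, 1_000_000_000))
    print("run-to-completion:", run_to_completion_pps(stages, 1_000_000_000, 3))
```

The model deliberately ignores the effects the slides name on each side: instruction-cache locality, which favors the pipeline, and critical-region contention, which penalizes SMP. It only shows why an unbalanced stage split leaves cores idle.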

4 Independent Processes on each core
- Communication between cores requires inter-processor communication (IPC) mechanisms
  - Shared memory
  - Inter-CPU interrupts
  - Message queues
- Familiar implementation
  - A multiprogramming OS enables this paradigm on single- or multi-CPU systems
- Processing overhead from the IPC mechanism can be significant
  - Context switching and messaging consume CPU cycles that do not contribute to implementing the application's features

5 Dividing applications into pipeline stages
- Parallelism can be implemented by having the first stage identify the traffic and queue it to multiple second-stage instances
  - Each instance of the second stage can be assigned all the packets of a flow
  - Balancing flows between second-stage instances requires careful work in the first stage
- Each stage's code size can be limited to fit into the L1 instruction cache
  - Performance impact of instruction cache misses can be reduced
- Static assignment of operations can lead to variance in dynamic system performance
  - Dynamic allocation of operations or of the number of stage instances can partly mitigate this effect
  - Can require complex software
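One common way for a first stage to keep all packets of a flow on the same second-stage instance is to hash the flow identifier. The sketch below assumes a 5-tuple flow key and a CRC32 hash; the `FiveTuple` type and the `stage2_instance` and `dispatch` helpers are hypothetical names introduced here, and the slides do not say which classification method is used.

```python
import zlib
from collections import namedtuple

# Hypothetical flow key; real classifiers may use other header fields.
FiveTuple = namedtuple("FiveTuple", "src_ip dst_ip src_port dst_port proto")


def stage2_instance(flow, instances):
    """Map a flow to one second-stage instance; the same flow always maps to the same index."""
    key = "|".join(str(field) for field in flow).encode("ascii")
    return zlib.crc32(key) % instances


def dispatch(packets, instances):
    """First stage: place each (flow, payload) on the queue of its second-stage instance."""
    queues = [[] for _ in range(instances)]
    for flow, payload in packets:
        queues[stage2_instance(flow, instances)].append((flow, payload))
    return queues


if __name__ == "__main__":
    a = FiveTuple("10.0.0.1", "10.0.0.2", 1234, 80, 6)
    b = FiveTuple("10.0.0.3", "10.0.0.2", 5678, 80, 6)
    for index, queue in enumerate(dispatch([(a, 1), (b, 1), (a, 2)], 4)):
        print(index, queue)
```

A static hash keeps per-flow order within each queue but does nothing to even out load: a few heavy flows can land on the same instance, which is the balancing problem the slide refers to.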

6 SMP - All cores able to do all things
- Different traffic profiles require a different balance of processing
  - With all application instances able to perform all processing, a dynamic balance occurs organically
- A single code set (image) can be developed by integrating multiple separately designed and unit-tested modules
  - Testing can verify that each modular component meets its performance expectations and interface requirements
  - System testing and verification only need to put a single image through its paces
- Need to ensure critical regions are minimized
  - Mutual exclusion mechanisms (mutexes) protecting these regions can reduce overall application scaling

7 Designing for Optimal Performance
- Goal is to keep the CPUs busy executing the application's instructions
- Minimize, if not eliminate, the need for handling interrupts and context switches
  - System calls, interrupt exceptions, and context switching take CPU cycles away from applications
  - Highest performance: design a single process per CPU and use polling for I/O
- Maximize, through design, independent and parallel operations
  - Keep critical regions to a minimum, if not eliminate them altogether
  - Protecting critical regions is the single largest impediment to efficient scaling

8 Which method to choose?
- No one method is intrinsically better than the others
  - Each has its own application space
- Pipelines benefit
  - Single high-bandwidth flows that require processing phases to be done atomically
- Symmetric Multiprocessing benefits
  - Multiple flows that can be processed in parallel
  - Low-latency traffic that can be processed in parallel while preserving ingress order on egress
  - A wider range of traffic profiles can maintain performance
- Multiple independent process/thread applications benefit
  - Existing multi-threaded or multi-process implementations wishing to gain performance without significant redesign
  - Applications that rely on operating system services

9 How can hardware help?
- Perform some triage on the incoming packet traffic and give each packet a rough priority
  - Then hand packets off to the software in a prioritized fashion
- Provide some sort of evaluation of the packet
  - E.g. flow identification
- Maintain the packet arrival order throughout processing
- Execute menial tasks such as buffer management
  - Recycle buffers that have been sent, making them available for new incoming packets
- Reduce, if not completely eliminate, the need to protect shared data structures
  - Access to shared data structures is usually per-flow
  - Hardware can ensure that

10 Spinlocks when high-contention locks
- Multi-CPU synchronization requires a memory-based contention primitive
- Spinlocks are based upon MIPS-defined instructions: Load-Linked and Store-Conditional
  - Statistical nature of the operation is inherently unfair
- OCTEON's SSO can be used to implement fair locks
  - Locking can be done non-blocking
  - Acquire the lock while I do something useful
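To show what "fair" and "non-blocking" can mean for a lock, here is a single-threaded model of a ticket (FIFO) lock. This is a swapped-in textbook technique, not the OCTEON SSO mechanism, and it is not a usable multi-core primitive: on real hardware `request` would need an atomic fetch-and-add, for example one built from Load-Linked/Store-Conditional. The class and method names are hypothetical.

```python
class TicketLock:
    """Single-threaded model of a FIFO (ticket) lock; illustrative only."""

    def __init__(self):
        self._next_ticket = 0   # next ticket to hand out
        self._now_serving = 0   # ticket that currently owns the lock

    def request(self):
        """Join the queue without waiting; returns the caller's ticket."""
        ticket = self._next_ticket
        self._next_ticket += 1
        return ticket

    def is_granted(self, ticket):
        """Poll: True once every earlier ticket has been released."""
        return ticket == self._now_serving

    def release(self, ticket):
        """Hand the lock to the next ticket in arrival order."""
        if ticket != self._now_serving:
            raise RuntimeError("release by a non-owner")
        self._now_serving += 1


if __name__ == "__main__":
    lock = TicketLock()
    a, b = lock.request(), lock.request()
    print("A granted:", lock.is_granted(a))   # A arrived first
    print("B granted:", lock.is_granted(b))   # B can do other work and poll later
    lock.release(a)
    print("B granted:", lock.is_granted(b))
```

Because requesting and polling are separate steps, a caller can queue for the lock, continue with unrelated work, and check back later, while the ticket order guarantees that no requester is starved.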

11 How OCTEON enables high-performance
- Applications running as Linux processes have direct access to hardware blocks via the Simple Exec API
  - Send and receive packets directly
- Integrated Packet Input and Output processors with knowledge of common network protocols offload laborious header validations from software
  - PIP provides the results of these tests in the form of a set of flags
  - Packets get flow classification on ingress
  - PKO computes and inserts the transport-layer checksum on egress
- Hardware buffer management
  - Processors can allocate and free buffers without software intervention
- Many operations execute in parallel with the dual-issue cores
  - Software can continue to execute instructions while time-consuming operations run to completion
  - I/O units can DMA results to the core's local memory
  - Crypto instructions execute asynchronously to the pipeline

12 How OCTEON enables high-performance
- Introduces a Work Flow Paradigm
  - SSO offloads from software the task of scheduling which operations get executed on the cores
  - PIP works in conjunction with SSO to prioritize ingress packets as instances of work
  - PIP classifies and tags packets so that the SSO can ensure the software on the cores can work on packets without interference
  - Polling for work alleviates the overhead of interrupt handling
  - Completion results from application-specific coprocessors are submitted as instances of work
  - Timer events can be processed as instances of work
- Software can be optimized to significantly increase the application's performance
  - Hardware work scheduling independent of the CPUs can eliminate the need for critical regions
  - Utilizing Atomic Tags allows software to operate knowing it has sole access to resources
  - Flow-based network traffic will have per-flow data structures requiring exclusive access, e.g. a state machine
  - Hardware ensures only a single packet per flow is being worked upon
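The atomic-tag idea, that at most one packet per flow is being worked on at any time, can be modeled in a few lines. The sketch below is a toy software model under stated assumptions, not the SSO hardware interface or the Simple Exec API: the class `AtomicTagScheduler` and its methods `submit`, `get_work`, and `done` are hypothetical names, and priorities and core groups are left out.

```python
class AtomicTagScheduler:
    """Toy model: at most one work item per tag is in flight at any time."""

    def __init__(self):
        self._pending = []        # (tag, work) entries in arrival order
        self._in_flight = set()   # tags currently held by some core

    def submit(self, tag, work):
        """Queue a work item, e.g. a packet tagged with its flow identifier."""
        self._pending.append((tag, work))

    def get_work(self):
        """Return the oldest (tag, work) whose tag is not in flight, or None."""
        for index, (tag, work) in enumerate(self._pending):
            if tag not in self._in_flight:
                self._pending.pop(index)
                self._in_flight.add(tag)
                return tag, work
        return None

    def done(self, tag):
        """A core finished its item; the next item with this tag becomes schedulable."""
        self._in_flight.remove(tag)


if __name__ == "__main__":
    sched = AtomicTagScheduler()
    sched.submit("flowA", "pkt1")
    sched.submit("flowA", "pkt2")
    sched.submit("flowB", "pkt3")
    print(sched.get_work())   # first flowA packet
    print(sched.get_work())   # flowB packet; second flowA packet is held back
    print(sched.get_work())   # nothing schedulable until flowA is released
    sched.done("flowA")
    print(sched.get_work())   # second flowA packet
```

Since the scheduler never hands out two items with the same tag at once, the code that updates per-flow state needs no mutex, and packets of one flow are processed in arrival order.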

13 OCTEON does it all
- OCTEON's cnMIPS cores can operate independently
  - All cores share the same physical memory space, so shared-memory IPCs are easy to implement
  - Each core has its own mailbox interrupts
- Using the SSO, OCTEON can efficiently implement a pipeline
  - Each group of cores represents a single stage in the pipeline
  - Group switch operation passes work to the next stage
  - Data/state is passed via the Work Queue Entry structure
- Using the traffic classification and tagging from the PIP, the SSO can arbitrate which packets get worked on
  - Can obviate the need to protect per-flow data structures (e.g. TCP Control Block)


An Implementation Of Multiprocessor Linux

An Implementation Of Multiprocessor Linux An Implementation Of Multiprocessor Linux This document describes the implementation of a simple SMP Linux kernel extension and how to use this to develop SMP Linux kernels for architectures other than

More information

HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring

HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring CESNET Technical Report 2/2014 HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring VIKTOR PUš, LUKÁš KEKELY, MARTIN ŠPINLER, VÁCLAV HUMMEL, JAN PALIČKA Received 3. 10. 2014 Abstract

More information

KVM Architecture Overview

KVM Architecture Overview KVM Architecture Overview 2015 Edition Stefan Hajnoczi 1 Introducing KVM virtualization KVM hypervisor runs virtual machines on Linux hosts Mature on x86, recent progress on ARM and

More information

Operating Systems 4 th Class

Operating Systems 4 th Class Operating Systems 4 th Class Lecture 1 Operating Systems Operating systems are essential part of any computer system. Therefore, a course in operating systems is an essential part of any computer science

More information

Processes and Non-Preemptive Scheduling. Otto J. Anshus

Processes and Non-Preemptive Scheduling. Otto J. Anshus Processes and Non-Preemptive Scheduling Otto J. Anshus 1 Concurrency and Process Challenge: Physical reality is Concurrent Smart to do concurrent software instead of sequential? At least we want to have

More information

Multicore Programming with LabVIEW Technical Resource Guide

Multicore Programming with LabVIEW Technical Resource Guide Multicore Programming with LabVIEW Technical Resource Guide 2 INTRODUCTORY TOPICS UNDERSTANDING PARALLEL HARDWARE: MULTIPROCESSORS, HYPERTHREADING, DUAL- CORE, MULTICORE AND FPGAS... 5 DIFFERENCES BETWEEN

More information

Cisco Wide Area Application Services (WAAS) Network Module

Cisco Wide Area Application Services (WAAS) Network Module Cisco Wide Area Application Services (WAAS) Network Module The Cisco Wide Area Application Services (WAAS) Network Module for the Cisco Integrated Services Routers (ISR) is a powerful WAN optimization

More information

A Dell Technical White Paper Dell PowerConnect Team

A Dell Technical White Paper Dell PowerConnect Team Flow Control and Network Performance A Dell Technical White Paper Dell PowerConnect Team THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES.

More information

Efficient Implementation of the bare-metal Hypervisor MetalSVM for the SCC

Efficient Implementation of the bare-metal Hypervisor MetalSVM for the SCC Efficient Implementation of the bare-metal Hypervisor MetalSVM for the SCC Public Release of MetalSVM 0.1 Pablo Reble, Jacek Galowicz, Stefan Lankes and Thomas Bemmerl MARC Symposium, Toulouse CHAIR FOR

More information

OPERATING SYSTEMS SCHEDULING

OPERATING SYSTEMS SCHEDULING OPERATING SYSTEMS SCHEDULING Jerry Breecher 5: CPU- 1 CPU What Is In This Chapter? This chapter is about how to get a process attached to a processor. It centers around efficient algorithms that perform

More information

Page 1 of 5. IS 335: Information Technology in Business Lecture Outline Operating Systems

Page 1 of 5. IS 335: Information Technology in Business Lecture Outline Operating Systems Lecture Outline Operating Systems Objectives Describe the functions and layers of an operating system List the resources allocated by the operating system and describe the allocation process Explain how

More information

Low Latency Market Data and Ticker Plant Technology. SpryWare.

Low Latency Market Data and Ticker Plant Technology. SpryWare. Low Latency Market Data and Ticker Plant Technology. SpryWare. Direct Feeds Ultra Low Latency Extreme Capacity High Throughput Fully Scalable SpryWare s state-of-the-art Ticker Plant technology enables

More information

Chapter 16 Distributed Processing, Client/Server, and Clusters

Chapter 16 Distributed Processing, Client/Server, and Clusters Operating Systems: Internals and Design Principles Chapter 16 Distributed Processing, Client/Server, and Clusters Eighth Edition By William Stallings Table 16.1 Client/Server Terminology Applications Programming

More information

Operating Systems OBJECTIVES 7.1 DEFINITION. Chapter 7. Note:

Operating Systems OBJECTIVES 7.1 DEFINITION. Chapter 7. Note: Chapter 7 OBJECTIVES Operating Systems Define the purpose and functions of an operating system. Understand the components of an operating system. Understand the concept of virtual memory. Understand the

More information

Interconnection Networks

Interconnection Networks Advanced Computer Architecture (0630561) Lecture 15 Interconnection Networks Prof. Kasim M. Al-Aubidy Computer Eng. Dept. Interconnection Networks: Multiprocessors INs can be classified based on: 1. Mode

More information

Processing of Flow Accounting Data in Java: Framework Design and Performance Evaluation

Processing of Flow Accounting Data in Java: Framework Design and Performance Evaluation Processing of Flow Accounting Data in Java: Framework Design and Performance Evaluation Jochen Kögel and Sebastian Scholz Institute of Communication Networks and Computer Engineering (IKR) University of

More information

Enabling Practical SDN Security Applications with OFX (The OpenFlow extension Framework)

Enabling Practical SDN Security Applications with OFX (The OpenFlow extension Framework) Enabling Practical SDN Security Applications with OFX (The OpenFlow extension Framework) John Sonchack, Adam J. Aviv, Eric Keller, and Jonathan M. Smith Outline Introduction Overview of OFX Using OFX Benchmarks

More information

CHAPTER 3 STATIC ROUTING

CHAPTER 3 STATIC ROUTING CHAPTER 3 STATIC ROUTING This chapter addresses the end-to-end delivery service of IP and explains how IP routers and hosts handle IP datagrams. The first section discusses how datagrams are forwarded

More information

STEPPING TOWARDS A NOISELESS LINUX ENVIRONMENT

STEPPING TOWARDS A NOISELESS LINUX ENVIRONMENT ROSS 2012 June 29 2012 Venice, Italy STEPPING TOWARDS A NOISELESS LINUX ENVIRONMENT Hakan Akkan*, Michael Lang, Lorie Liebrock* Presented by: Abhishek Kulkarni * New Mexico Tech Ultrascale Systems Research

More information

Exploiting Task-level Concurrency in a Programmable Network Interface

Exploiting Task-level Concurrency in a Programmable Network Interface Exploiting Task-level Concurrency in a Programmable Network Interface Hyong-youb Kim, Vijay S. Pai, and Scott Rixner Rice University hykim, vijaypai, rixner @rice.edu ABSTRACT Programmable network interfaces

More information

Next Generation Operating Systems

Next Generation Operating Systems Next Generation Operating Systems Zeljko Susnjar, Cisco CTG June 2015 The end of CPU scaling Future computing challenges Power efficiency Performance == parallelism Cisco Confidential 2 Paradox of the

More information

Embedded Systems: map to FPGA, GPU, CPU?

Embedded Systems: map to FPGA, GPU, CPU? Embedded Systems: map to FPGA, GPU, CPU? Jos van Eijndhoven jos@vectorfabrics.com Bits&Chips Embedded systems Nov 7, 2013 # of transistors Moore s law versus Amdahl s law Computational Capacity Hardware

More information

Real-Time Scheduling 1 / 39

Real-Time Scheduling 1 / 39 Real-Time Scheduling 1 / 39 Multiple Real-Time Processes A runs every 30 msec; each time it needs 10 msec of CPU time B runs 25 times/sec for 15 msec C runs 20 times/sec for 5 msec For our equation, A

More information

Wireshark in a Multi-Core Environment Using Hardware Acceleration Presenter: Pete Sanders, Napatech Inc. Sharkfest 2009 Stanford University

Wireshark in a Multi-Core Environment Using Hardware Acceleration Presenter: Pete Sanders, Napatech Inc. Sharkfest 2009 Stanford University Wireshark in a Multi-Core Environment Using Hardware Acceleration Presenter: Pete Sanders, Napatech Inc. Sharkfest 2009 Stanford University Napatech - Sharkfest 2009 1 Presentation Overview About Napatech

More information

A Comparative Study on Vega-HTTP & Popular Open-source Web-servers

A Comparative Study on Vega-HTTP & Popular Open-source Web-servers A Comparative Study on Vega-HTTP & Popular Open-source Web-servers Happiest People. Happiest Customers Contents Abstract... 3 Introduction... 3 Performance Comparison... 4 Architecture... 5 Diagram...

More information

Open Flow Controller and Switch Datasheet

Open Flow Controller and Switch Datasheet Open Flow Controller and Switch Datasheet California State University Chico Alan Braithwaite Spring 2013 Block Diagram Figure 1. High Level Block Diagram The project will consist of a network development

More information

Chapter 3 Operating-System Structures

Chapter 3 Operating-System Structures Contents 1. Introduction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threads 6. CPU Scheduling 7. Process Synchronization 8. Deadlocks 9. Memory Management 10. Virtual

More information