Scaling Networking Applications to Multiple Cores
1 Scaling Networking Applications to Multiple Cores
Greg Seibert, Sr. Technical Marketing Engineer, Cavium Networks
2 Challenges with multi-core application performance
- Amdahl's Law
  - Evaluates application performance from the perspective of running time
  - Overall application performance scaling is limited by the proportion of processing that can be done in parallel
  - Scaling limitation is intrinsically related to the type of processing being done
- Evaluating System Performance of Networking Applications
  - How much data can it pass?
  - How many packets per second?
- Scaling Parallelization
  - Networking applications provide a convenient quantum of work: the packet
  - Flows are mostly independent
  - Critical regions
  - Per-flow data structures
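The Amdahl's Law bullet above can be made concrete with a small calculation. The following is a minimal sketch using the standard textbook form of the law; the function name and the 90% parallel fraction are illustrative assumptions, not figures from the slides.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when `parallel_fraction` of the
    running time can be executed in parallel (standard Amdahl's Law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 90% of the processing is parallelizable.
    for cores in (1, 2, 4, 8, 16):
        print(cores, round(amdahl_speedup(0.9, cores), 2))
```

Even with 90% of the work parallelizable, 16 cores give well under a 16x speedup, which is why the later slides focus on shrinking the serial part (critical regions).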
3 Multi-core Programming Techniques
- Independent processes on each core
  - Each process can maintain state in local storage and avoid shared memory contention
  - Processes snugly-coupled via in-memory IPC mechanisms
- Pipelined
  - Divide application into stages
  - Each stage can be limited to completely fit into the instruction cache
  - Application performance is limited to the throughput of the slowest single stage
  - Entire application requires a-priori division of operations
- Symmetric Multi-Processing (SMP)
  - Same program/image running on multiple cores
  - All instances are identical and can load balance organically
  - Classic implementations find it tricky to scale
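The pipelined bullet says performance is bounded by a single stage. A rough sketch of that bound follows, assuming each stage has a fixed per-instance packet rate and that replicated instances of a stage share load perfectly; the function name and all rates are hypothetical.

```python
def pipeline_throughput(stage_rates_pps, instances=None):
    """Upper bound on pipeline throughput in packets per second.

    stage_rates_pps: per-instance rate of each stage (hypothetical figures).
    instances: number of parallel instances per stage (defaults to 1 each).
    Assumes perfect load balance across the instances of a stage.
    """
    if instances is None:
        instances = [1] * len(stage_rates_pps)
    if len(instances) != len(stage_rates_pps):
        raise ValueError("one instance count is needed per stage")
    # The slowest stage (after replication) limits the whole pipeline.
    return min(rate * n for rate, n in zip(stage_rates_pps, instances))


if __name__ == "__main__":
    rates = [3_000_000, 1_000_000, 2_000_000]
    print(pipeline_throughput(rates))             # limited by the middle stage
    print(pipeline_throughput(rates, [1, 3, 1]))  # replicate the slow stage
```

Replicating the slow stage moves the bottleneck to the next-slowest stage, which is the a-priori balancing problem the slide refers to.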
4 Independent Processes on each core
- Communication between cores requires the use of Inter-processor Communication (IPC) mechanisms
  - Shared memory
  - Inter-CPU interrupts
  - Message queues
- Familiar implementation
  - Multi-programming OS enables this paradigm on single- or multi-CPU systems
- Processing overhead from the IPC mechanism can be significant
  - Context switching and messaging consume CPU cycles that do not contribute towards implementing the application's features
5 Dividing applications into pipeline stages
- Parallelism can be implemented as the first stage identifies the traffic and queues it to multiple second stages
  - Each instance of the second stage can be assigned all the packets of a flow
  - Balancing flows between second stage instances requires some tricky footwork from the first stage
- Each stage code size can be limited to fit into the L1 instruction cache
  - Performance impact due to instruction cache misses can be reduced
- Static assignment of operations can lead to a variance in dynamic system performance
  - Dynamic allocation of operations or number of stage instances can somewhat mitigate this effect
  - Can require complex software
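One common way for a first stage to keep all packets of a flow on the same second-stage instance is to hash the flow identifier. The sketch below assumes a 5-tuple flow key and a CRC32 hash; the `Packet` type, field names, and function names are hypothetical and are not taken from the slides.

```python
import zlib
from collections import namedtuple

# Hypothetical packet representation; only the 5-tuple matters here.
Packet = namedtuple("Packet", "src dst sport dport proto payload")


def flow_key(pkt):
    """Serialize the 5-tuple that identifies a flow."""
    return f"{pkt.src}|{pkt.dst}|{pkt.sport}|{pkt.dport}|{pkt.proto}".encode()


def classify(pkt, num_workers):
    """Map a packet to a second-stage instance; same flow, same instance."""
    return zlib.crc32(flow_key(pkt)) % num_workers


def first_stage(packets, num_workers):
    """Queue each packet to its second-stage instance in arrival order."""
    queues = [[] for _ in range(num_workers)]
    for pkt in packets:
        queues[classify(pkt, num_workers)].append(pkt)
    return queues


if __name__ == "__main__":
    pkts = [
        Packet("10.0.0.1", "10.0.0.9", 1234, 80, "tcp", b"a"),
        Packet("10.0.0.2", "10.0.0.9", 5555, 80, "tcp", b"b"),
        Packet("10.0.0.1", "10.0.0.9", 1234, 80, "tcp", b"c"),
    ]
    for i, q in enumerate(first_stage(pkts, 4)):
        print(i, [p.payload for p in q])
```

A static hash like this preserves per-flow order but does nothing about load: a few heavy flows can land on the same instance, which is the balancing difficulty the slide mentions.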
6 SMP - All cores able to do all things
- Different traffic profiles require a different balance of processing
  - With all application instances able to perform all processing, a dynamic balance will occur organically
- A single code set (image) can be developed integrating multiple separately designed and unit-tested modules
  - Testing can verify each modular component performs to expectations and interface requirements
  - System testing and verification only needs to ensure a single image is put through its paces
- Need to ensure critical regions are minimized
  - Mutual exclusion mechanisms (mutexes) for protecting these regions can reduce overall application scaling
7 Designing for Optimal Performance
- Goal is to keep the CPUs busy executing the application's instructions
- Minimize, if not eliminate, the need for handling interrupts and context switches
  - System calls, interrupt exceptions, and context switching take CPU cycles away from applications
  - Highest performance: design a single process per CPU and use polling for I/O
- Maximize, through design, independent and parallel operations
  - Keep critical regions to a minimum, if not eliminate them altogether
  - Protecting critical regions is the single largest impediment to efficient scaling
8 Which method to choose?
- No one method is intrinsically better than the others
- Each has its own application space
- Pipelines benefit
  - Single high-bandwidth flows that require processing phases to be done atomically
- Symmetric Multiprocessing benefits
  - Multiple flows that can be processed in parallel
  - Low-latency traffic that can be processed in parallel while preserving ingress order on egress
  - Wider range of traffic profiles can maintain performance
- Multiple independent process/thread applications benefit
  - Existing multi-threaded or multi-process implementations wishing to gain performance without significant re-design
  - Applications that rely on Operating System services
9 How can hardware help?
- Perform some triage on the incoming packet traffic and give the packets a rough priority
  - Then hand them off to the software in a prioritized fashion
- Provide some sort of evaluation of the packet
  - E.g. flow identification
- Maintain the packet arrival order throughout its processing
- Execute menial tasks such as buffer management
  - Recycling buffers that have been sent, making them available for new incoming packets
- Reduce, if not completely eliminate, the need to protect shared data structures
  - Access to shared data structures is usually per-flow
  - Hardware can ensure that
10 Spinlocks when high-contention locks
- Multi-CPU synchronization requires a memory-based contention primitive
- Spinlocks based upon MIPS-defined instructions: Load-Linked and Store-Conditional
  - Statistical nature of operation is inherently unfair
- OCTEON's SSO can be used to implement fair locks
  - Locking can be done non-blocking
  - Acquire the lock while I do something useful
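The idea of acquiring a lock while doing something useful can be illustrated in software with a try-lock loop. This is only an analogy using Python's `threading.Lock`; it does not model the SSO hardware or LL/SC, and the helper name and the background tasks are hypothetical.

```python
import threading


def run_with_trylock(lock, critical, background_tasks):
    """Attempt the lock without blocking; while it is held elsewhere,
    run background tasks instead of spinning idly.

    Returns (result of critical(), number of background tasks run).
    Note: if the tasks run out while the lock is still held, this
    degenerates into a plain spin.
    """
    done_in_background = 0
    pending = iter(background_tasks)
    while not lock.acquire(blocking=False):
        task = next(pending, None)
        if task is not None:
            task()
            done_in_background += 1
    try:
        return critical(), done_in_background
    finally:
        lock.release()


if __name__ == "__main__":
    lock = threading.Lock()
    lock.acquire()  # pretend another core currently holds the lock
    log = []
    tasks = [
        lambda: log.append("useful work"),
        lambda: lock.release(),  # simulates the other core finishing
    ]
    result, n = run_with_trylock(lock, lambda: "updated", tasks)
    print(result, n, log)
```

The demo is single-threaded on purpose so that its behavior is deterministic; in a real system the release would come from another core.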
11 How OCTEON enables high-performance
- Applications running as Linux processes have direct access to hardware blocks via the Simple Exec API
  - Send and receive packets directly
- Integrated Packet Input and Output processors with knowledge of common network protocols offload laborious header validations from software
  - PIP provides the results of these tests in the form of a set of flags
  - Packets get flow classification on ingress
  - PKO computes and inserts the transport layer checksum on egress
- Hardware buffer management
  - Processors can allocate and free buffers without software intervention
- Many operations executing in parallel with the dual-issue cores
  - Software can continue to execute instructions while time-consuming operations run to completion
  - I/O units can DMA results to the core's local memory
  - Crypto instructions execute asynchronously to the pipeline
12 How OCTEON enables high-performance
- Introduces a Work Flow Paradigm
  - SSO off-loads software from the task of scheduling what operations get executed on the cores
  - PIP works in conjunction with SSO to prioritize ingress packets as instances of work
  - PIP classifies packets and tags them, so the SSO can ensure the software on the cores can work on packets without interference
  - Polling for work alleviates the overhead of interrupt handling
  - Completion results from application-specific coprocessors are submitted as instances of work
  - Timer events can be processed as instances of work
- Software can be optimized to significantly increase the application's performance
  - Hardware work scheduling independent of CPUs can eliminate the need for critical regions
  - Utilizing Atomic Tags allows software to operate knowing it has sole access to resources
  - Flow-based network traffic will have per-flow data structures requiring exclusive access, e.g. a state machine
  - Hardware ensures only a single packet per flow is being worked upon
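The atomic-tag behavior described above (only one packet per flow in flight at a time) can be modeled in a few lines. The following is a toy software model for illustration only; the class and method names are hypothetical and do not correspond to the OCTEON SSO interface.

```python
class AtomicTagScheduler:
    """Toy model of tag-based work scheduling: at most one work item
    per tag (flow) is handed out at any time, in arrival order."""

    def __init__(self):
        self._pending = []       # (tag, work) pairs in arrival order
        self._in_flight = set()  # tags currently held by some core

    def submit(self, tag, work):
        """Add a unit of work (e.g. a packet) tagged with its flow."""
        self._pending.append((tag, work))

    def get_work(self):
        """Return the oldest (tag, work) whose tag is not in flight,
        or None if every pending item belongs to a busy flow."""
        for i, (tag, _work) in enumerate(self._pending):
            if tag not in self._in_flight:
                self._in_flight.add(tag)
                return self._pending.pop(i)
        return None

    def release(self, tag):
        """Called by a core when it has finished work for this tag."""
        self._in_flight.discard(tag)


if __name__ == "__main__":
    sso = AtomicTagScheduler()
    sso.submit("flow-A", "pkt1")
    sso.submit("flow-A", "pkt2")
    sso.submit("flow-B", "pkt1")
    print(sso.get_work())  # core 0 gets flow-A pkt1
    print(sso.get_work())  # core 1 gets flow-B pkt1; flow-A pkt2 must wait
    print(sso.get_work())  # nothing schedulable right now
    sso.release("flow-A")
    print(sso.get_work())  # flow-A pkt2 becomes available
```

Because a core holding a tag is the only one touching that flow, per-flow state can be updated without a software lock, which is the point the slide makes about eliminating critical regions.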
13 OCTEON does it all
- OCTEON's cnMIPS cores can operate independently
  - All cores share the same physical memory space, so shared memory IPCs are easy to implement
  - Each core has its own mailbox interrupts
- Using the SSO, OCTEON can efficiently implement a pipeline
  - Each group of cores represents a single stage in the pipeline
  - Group switch operation passes work to the next stage
  - Data/state is passed via the Work Queue Entry structure
- Using the traffic classification and tagging from the PIP, the SSO can arbitrate what packets get worked on
  - Can obviate the need to protect per-flow data structures (e.g. TCP Control Block)
Exploiting Task-level Concurrency in a Programmable Network Interface Hyong-youb Kim, Vijay S. Pai, and Scott Rixner Rice University hykim, vijaypai, rixner @rice.edu ABSTRACT Programmable network interfaces
More informationNext Generation Operating Systems
Next Generation Operating Systems Zeljko Susnjar, Cisco CTG June 2015 The end of CPU scaling Future computing challenges Power efficiency Performance == parallelism Cisco Confidential 2 Paradox of the
More informationEmbedded Systems: map to FPGA, GPU, CPU?
Embedded Systems: map to FPGA, GPU, CPU? Jos van Eijndhoven jos@vectorfabrics.com Bits&Chips Embedded systems Nov 7, 2013 # of transistors Moore s law versus Amdahl s law Computational Capacity Hardware
More informationReal-Time Scheduling 1 / 39
Real-Time Scheduling 1 / 39 Multiple Real-Time Processes A runs every 30 msec; each time it needs 10 msec of CPU time B runs 25 times/sec for 15 msec C runs 20 times/sec for 5 msec For our equation, A
More informationWireshark in a Multi-Core Environment Using Hardware Acceleration Presenter: Pete Sanders, Napatech Inc. Sharkfest 2009 Stanford University
Wireshark in a Multi-Core Environment Using Hardware Acceleration Presenter: Pete Sanders, Napatech Inc. Sharkfest 2009 Stanford University Napatech - Sharkfest 2009 1 Presentation Overview About Napatech
More informationA Comparative Study on Vega-HTTP & Popular Open-source Web-servers
A Comparative Study on Vega-HTTP & Popular Open-source Web-servers Happiest People. Happiest Customers Contents Abstract... 3 Introduction... 3 Performance Comparison... 4 Architecture... 5 Diagram...
More informationOpen Flow Controller and Switch Datasheet
Open Flow Controller and Switch Datasheet California State University Chico Alan Braithwaite Spring 2013 Block Diagram Figure 1. High Level Block Diagram The project will consist of a network development
More informationChapter 3 Operating-System Structures
Contents 1. Introduction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threads 6. CPU Scheduling 7. Process Synchronization 8. Deadlocks 9. Memory Management 10. Virtual
More information