Open-source routing at 10Gb/s
Olof Hagsand and Bengt Gördén, Royal Institute of Technology (KTH), Sweden
Robert Olsson, Uppsala University, Uppsala, Sweden

I. ABSTRACT

We present throughput measurements using the Bifrost Linux open source router on selected PC hardware. The hardware consists of eight CPU cores, a NUMA architecture, dual PCIe buses, and Intel and SUN 10 Gb/s interface cards. These cards are equipped with hardware classifiers that dispatch packets to multiple DMA queues, which enables parallel processing of packet forwarding and load-balancing between the multiple CPUs. In our experiments, we send a multi-flow, simplex packet stream through an open-source router. We measure the throughput while varying packet size, the number of active CPUs, and the router configuration. In the experiments, we use an IP flow and packet-length distribution that we claim is realistic for many network scenarios. Using these realistic traffic streams, we show how speeds close to 10 Gb/s are achievable for normal Internet traffic. In particular, we show how the use of multiple CPUs increases the throughput up to a breakpoint, which in our setting is at four CPUs. Further, we show that adding filters and full BGP tables has only minor effects on performance. With these results, we claim that open source routers using new PC architectures are a viable option for use in 10 Gb/s networks for many network scenarios.

II. INTRODUCTION

Although the first IP routers were software-based, the forwarding in modern commercial routers is primarily hardware-based, employing application-specific integrated circuits (ASICs), high-performance switching backplanes (e.g. cross-bars) and advanced memory systems (including TCAMs). This enables current routers to perform wire-speed routing up to Terabit speeds. The commercial high-end routers of today have little in common with a standard desktop. However, the complexity of the forwarding and routing protocols has increased, resulting in more hardware and more complex software modules, up to a point where hardware cost, power consumption and protocol complexity are important limiting factors of network deployment.

Simultaneously, routers on general-purpose computer platforms (such as PCs) have continued to develop. In particular, general-purpose hardware combined with open-source software [], [], [] has the advantage of offering a low-cost and flexible solution that is tractable for several niches of networking deployment. Such a platform is inexpensive, since it uses off-the-shelf commodity hardware, and flexible, in the sense of the openness of the source software and a potentially large development community. However, many previous efforts have been hindered by performance requirements. While it has been possible to deploy open source routers as packet filters on medium-bandwidth networks, it has been difficult to connect them to high-bandwidth uplinks. In particular, the PCI bus was a limiting factor for several years, but with the advent of PCI Express, performance has increased through the use of parallel lanes and a new generation of bus technology with respect to DMA and bus arbitration. One important advantage of PCIe is that interrupts are transferred in-line as Message Signaled Interrupts (MSI) instead of out-of-band, which enables better interrupt handling since it allows for multiple queueable interrupts. Memory cache behaviour is also important and is a crucial issue with the introduction of multi-core architectures.
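To make the classifier mechanism concrete, the following minimal Python sketch shows how hashing packet header fields selects a receive queue, and thus a CPU. The hash function here (CRC32 over the flow 5-tuple) is a stand-in chosen for illustration; the actual cards compute a hardware hash (cf. the Receive Side Scaling proposal cited in the references), and the queue count of eight simply mirrors the eight-core platform described later.

```python
import struct
import zlib

NUM_RX_QUEUES = 8  # one DMA queue per CPU core, mirroring the platform below

def select_rx_queue(src_ip, dst_ip, src_port, dst_port, proto):
    """Map a flow 5-tuple to a receive queue by hashing the header fields.

    Illustrative only: real NICs compute a hardware hash (e.g. Toeplitz)
    over the same fields; CRC32 is used here as a stand-in.
    """
    key = struct.pack("!4s4sHHB",
                      bytes(map(int, src_ip.split("."))),
                      bytes(map(int, dst_ip.split("."))),
                      src_port, dst_port, proto)
    return zlib.crc32(key) % NUM_RX_QUEUES

# All packets of one flow hash to the same queue/CPU, preserving
# packet ordering, while distinct flows spread over all queues:
print(select_rx_queue("10.0.0.1", "192.168.1.1", 12345, 80, 6))
```

Because the hash is computed over flow-identifying fields, packets of one flow stay on one CPU, which avoids reordering while still spreading distinct flows across cores.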
With the advances of efficient forwarding algorithms [] and small memory footprints [], IPv4 forwarding itself is seldom a limiting factor. We believe that several current trends combined speak for a renewed attempt at using general-purpose hardware, and we have shown an approach that we think has the potential for success on a larger scale and in new application areas. In particular, with 10GE speeds and low cost, we believe that open source routers can be used by enterprises, small ISPs, and in other scenarios where cost-efficiency and clean network design are important. But maybe the most important issue is the ability to participate in the development of new services, which can increase knowledge and may give competitive advantages.

In previous work [], we showed how multi-core CPUs with a NUMA architecture and parallel PCIe buses, combined with 10G Ethernet interface cards with classifiers and multiple interrupt and DMA channels, could be used to speed up the packet forwarding of a Linux-based router. We identified several bottlenecks that had to do with locking mechanisms in different parts of the code that caused cache misses. This was particularly evident in the so-called qdisc part of the kernel code, where output queueing was managed.
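As background for the forwarding engine discussed in the next section, here is a minimal longest-prefix-match lookup in Python. It uses a plain binary trie; the LC-trie used by the Linux kernel is a level- and path-compressed refinement of the same idea, so this sketch illustrates the principle, not the kernel's implementation.

```python
import ipaddress

class BinaryTrie:
    """Minimal longest-prefix-match table: a plain binary trie over IPv4
    prefixes. The kernel's LC-trie is a compressed variant of this."""

    def __init__(self):
        self.root = {}

    def insert(self, prefix, nexthop):
        net = ipaddress.ip_network(prefix)
        bits, node = int(net.network_address), self.root
        for i in range(net.prefixlen):
            node = node.setdefault((bits >> (31 - i)) & 1, {})
        node["nh"] = nexthop

    def lookup(self, addr):
        bits = int(ipaddress.ip_address(addr))
        node, best = self.root, None
        for i in range(32):
            if "nh" in node:          # remember the longest match so far
                best = node["nh"]
            node = node.get((bits >> (31 - i)) & 1)
            if node is None:
                return best
        return node.get("nh", best)   # exact /32 match

fib = BinaryTrie()
fib.insert("10.0.0.0/8", "eth1")
fib.insert("10.1.0.0/16", "eth2")
print(fib.lookup("10.1.2.3"))  # eth2: the longer prefix wins
```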
[Figure: Experimental setup for forwarding. Traffic was generated, forwarded and terminated using three computers.]

[Figure: Simplified block structure of the Tyan board with two AMD CPUs and a single PCIe bus.]

During the last year, many of these bottlenecks have been removed in the Linux kernel, thus paving the way for full usage of the parallelism provided by the new hardware.

III. EXPERIMENTAL PLATFORM AND SETUP

The experiments use a prerelease of Bifrost [], based on a 64-bit Linux kernel release candidate with NUMA support. Bifrost is a Linux release aimed at providing an open source routing and packet filtering platform. Bifrost includes routing daemons, packet filtering, packet generators, etc. The Linux kernel used the LC-trie forwarding engine [], and traffic was generated using a modified version of pktgen [], the Linux packet generator.

The motherboard is a TYAN Thunder with two quad-core AMD Opteron processors, eight CPU cores in total; see the block-structure figure above. The eight CPUs are arranged in two quad-cores, each having access to local memory, thus forming a simple NUMA architecture. Internal buses are HyperTransport (HT).

We used two network interface cards:

1) An Intel 10 Gigabit XF SR Dual Port Server Adapter, a multi-lane PCI Express card based on a chipset with multiple interrupt and DMA channels. The cards have multiple RX and TX queues. In our experiments we use the cards in both dual- and single-port configurations. The kernel driver is ixgbe. The card unfortunately has fixed opto-modules.

2) A SUN Neptune (niu), a dual-port 10 Gbit/s PCIe-based network card. The NIC has several receive and transmit DMA channels per port. The board supports different physical connections via XFP modules and can do programmable receive packet classification in hardware using TCAMs. The Linux driver is niu, named after the board (Network Interface Unit). The niu driver has additional patches to improve multi-queue handling.

Both network interface cards have a hardware classifier that computes a hash value of the packet header. The hash value is then used to select the receive queue, and thus selects which CPU receives the packet. Load-balancing between CPUs therefore requires the traffic to consist of several flows, enough to make the hash function distribute the traffic evenly over the different CPUs. The SUN niu card also has a TCAM classifier for finer-grained classification, but it was not used in these experiments.

IV. DESCRIPTION OF EXPERIMENTS

Three experiments were made, where in each experiment throughput was measured as packets per second (pps) and bits per second (bps). The first measure is an indication of per-packet performance and usually reveals latencies and bottlenecks in CPU and software, while the second is often associated with limits in bus and memory bandwidth.

1) Throughput vs packet length: packet lengths were varied across a range of sizes.
2) Throughput vs number of CPU cores: the number of active CPU cores was varied from one to eight.
3) Throughput vs functionality: router functionality was varied in terms of filtering and routing table size.

The experiments used the setup shown in the figure above. Traffic was sent from the generator via the router to the sink device. The tested device was the experimental platform described in Section III, the test generator and sink device being similar systems. In all cases, pktgen was used to generate traffic, and interface counters were used to verify their reception.
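To illustrate how such a pktgen-driven setup is typically scripted, the sketch below drives the kernel pktgen module through its standard /proc/net/pktgen interface. The device name, MAC address and flow parameters are illustrative placeholders, not the values used in the paper.

```python
# Sketch: driving the kernel pktgen module through its /proc interface,
# roughly as a setup like the one in Sections III and IV might be scripted.

def pgset(path, cmd):
    """Write one pktgen command to a /proc/net/pktgen file."""
    with open(path, "w") as f:
        f.write(cmd + "\n")

# Bind the transmit device to pktgen kernel thread 0.
pgset("/proc/net/pktgen/kpktgend_0", "rem_device_all")
pgset("/proc/net/pktgen/kpktgend_0", "add_device eth1")

# Configure a multi-flow stream: random destinations in a range and a
# bounded set of concurrent flows, each lasting a fixed number of
# packets (cf. Section IV-A). Values below are placeholders.
dev = "/proc/net/pktgen/eth1"
pgset(dev, "count 0")                       # run until stopped
pgset(dev, "pkt_size 60")
pgset(dev, "dst_min 10.0.0.0")
pgset(dev, "dst_max 10.255.255.255")
pgset(dev, "flows 8192")                    # concurrent flows
pgset(dev, "flowlen 30")                    # packets per flow
pgset(dev, "dst_mac 00:11:22:33:44:55")

# Start transmission on all configured threads.
pgset("/proc/net/pktgen/pgctrl", "start")
```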
On the sink device, received packets were only registered by the receiving port and not actually transferred over the bus to main memory. On this particular NIC, interface counters could still be read after configuring the interface as down. A reference measurement was made with an IXIA traffic analyzer, which showed that our figures are within a few percent of the IXIA measurements. A weakness of our setup is that the traffic generator cannot generate mixed-flow traffic above a certain rate. This means that the generator is saturated at this point, and that the measured router forwarding is also limited at this level. However, separate tests with an IXIA, not presented here, show that the router can forward packets at rates above this limit, which means that the router under-performs in these experiments, at least for small packet sizes.
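Interface counters of the kind used here can be read from sysfs on a Linux system. The sketch below samples them to derive packet and bit rates; the interface name and sampling interval are assumptions for illustration.

```python
import time

def read_counter(ifname, counter):
    """Read one interface statistic from sysfs (e.g. rx_packets, rx_bytes)."""
    with open(f"/sys/class/net/{ifname}/statistics/{counter}") as f:
        return int(f.read())

def measure(ifname="eth2", interval=10.0):
    """Sample receive counters over an interval and report Mpps and Gb/s."""
    p0, b0 = read_counter(ifname, "rx_packets"), read_counter(ifname, "rx_bytes")
    time.sleep(interval)
    p1, b1 = read_counter(ifname, "rx_packets"), read_counter(ifname, "rx_bytes")
    pps = (p1 - p0) / interval
    gbps = (b1 - b0) * 8 / interval / 1e9
    print(f"{ifname}: {pps / 1e6:.2f} Mpps, {gbps:.2f} Gb/s")

measure()
```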
[Figure: Packet length distribution. Cumulative fraction [%] as a function of packet size [bytes], comparing the experiment distribution with the Caida, Chalmers student network and Uppsala university uplink traces.]

[Figure: Throughput in million packets per second as a function of packet length in bytes. Note that the x-axis is logarithmic.]

[Table I: Packet length distribution used.]

A. Traffic distribution

All traffic used an even random distribution of destination IPv4 addresses within a single prefix. A fixed number of such flows was active at any given time, and every flow had a fixed duration in packets. The flows were scheduled using a round-robin scheduler. This resulted in thousands of new flows per second. The Linux forwarding cache performs well as long as a limited set of new flows arrive per second; this rate of new flows corresponds to a destination cache miss rate of a few percent.

There were two packet length distributions:
1) Fixed. In the first experiment, the packet length was varied across a range of sizes.
2) Mixed. In the other two experiments, the packet length distribution shown in Table I was used. The packet-length-distribution figure above compares this distribution with three realistic distributions: the WIDE MAWI transpacific link in March [], one taken from the Chalmers student network in December [], and one from the Uppsala university uplink router [].

B. Router functionality

There were four router functionality configurations:
1) Basic. A small routing table and no modules loaded.
2) The Netfilter module loaded, but without performing actual filtering.
3) Netfilter and connection tracking modules loaded.
4) Netfilter loaded and a full BGP table (hundreds of thousands of routes).

[Figure: Throughput bandwidth in Gb/s as a function of packet length. The x-axis is logarithmic.]

V. RESULTS

A. Experiment 1: Throughput vs packet length

In the first experiment, the traffic flow mix was sent from the source, via the router, and received by the sink. The router had basic functionality and used all eight CPUs. The packet length was varied across the range of sizes. The results are shown in the two throughput-vs-packet-length figures. It can be seen that the maximum packet rate is achieved at small packet lengths. It can also be seen that the router shows close to wire-speed performance at large packet lengths. The figures also show that in these experiments, one card performs better than the other. Unfortunately, due to the limitation of the source, higher packet rates cannot be generated; the limits shown in the figures for small packets are therefore probably a limitation of the sender, not of the router itself.
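The traffic distribution in Section IV-A matters because the hardware hash only balances load if enough concurrent flows are present. The following toy simulation (with CRC32 standing in for the NIC's hash and arbitrary flow counts) shows how the per-queue load evens out as the number of flows grows.

```python
import random
import zlib

def queue_loads(num_flows, num_queues=8, pkts_per_flow=30):
    """Simulate hashing flows onto receive queues; return per-queue packet counts."""
    loads = [0] * num_queues
    for _ in range(num_flows):
        dst = random.getrandbits(32)  # random destination, cf. Section IV-A
        q = zlib.crc32(dst.to_bytes(4, "big")) % num_queues
        loads[q] += pkts_per_flow
    return loads

# With few flows the spread is uneven (some CPUs idle, some overloaded);
# with thousands of flows the max/average ratio approaches 1.
for flows in (8, 64, 8192):
    loads = queue_loads(flows)
    spread = max(loads) / (sum(loads) / len(loads))
    print(f"{flows:5d} flows: max/avg queue load = {spread:.2f}")
```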
[Figure: Throughput in million packets per second as a function of active CPUs.]

[Figure: Throughput bandwidth in Gb/s as a function of the number of active CPUs.]

[Figure: Throughput in million packets per second with different routing configurations.]

[Figure: Throughput bandwidth in Gb/s with different routing configurations, as specified in Section IV-B.]

B. Experiment 2: Throughput vs number of CPU cores

In the second experiment, the traffic flow mix was sent using the packet length distribution shown in Table I. The functionality of the router was basic, but the number of CPU cores was varied. The results are shown in the per-CPU figures above. It can be seen that the performance starts at a few Gb/s and then increases up to a point, at around four CPUs, where the performance levels off. One could see this as an indication of deteriorating utilization of the CPUs. However, this probably depends on the traffic mix, and the results may change with a changing traffic distribution. There is also the question of what hinders closing the last gap to 10 Gb/s wire speed. This is an important question that requires further study. Preliminary tests with an IXIA packet generator indicate that here, again, the traffic generator may under-perform. It may also be that packets are lost in transit, even at lower utilization, which would be a negative result. It should be noted that with this traffic distribution, the Intel card performs better than the SUN niu.

C. Experiment 3: Throughput vs functionality

In the third experiment, the functionality of the router was varied. The netfilter and connection tracking modules were loaded, and a full BGP routing table with hundreds of thousands of routes was installed. The results are shown in the configuration figures above. Although the performance is reduced somewhat, the results indicate that performance is largely maintained, even in the presence of large routing tables and filter modules. The SUN and Intel cards behave somewhat differently; it seems that the ixgbe card, for example, is affected more when the modules are loaded.
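Returning to Experiment 2, the paper does not state how the number of active cores was varied; one standard Linux mechanism that fits this kind of setup, assumed here purely for illustration, is the CPU hotplug interface in sysfs.

```python
def set_online_cpus(n_active, n_total=8):
    """Keep CPUs 0..n_active-1 online and take the rest offline via the
    Linux CPU hotplug interface. CPU 0 typically cannot be taken offline,
    so it is always kept active."""
    for cpu in range(1, n_total):
        state = "1" if cpu < n_active else "0"
        with open(f"/sys/devices/system/cpu/cpu{cpu}/online", "w") as f:
            f.write(state)

# e.g. rerun the forwarding test with four active cores:
set_online_cpus(4)
```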
VI. CONCLUSIONS

There is no doubt that open source routing has entered the 10 Gb/s arena. Network interface cards, CPUs and buses can do 10 Gb/s with relatively small packet sizes. Forwarding is possible at 10 Gb/s with large packets, and very close to 10 Gb/s with packet distributions close to what is seen on many Internet links. In particular, our results show realistic Internet traffic being forwarded at close to wire speed by load-balancing the traffic over multiple CPU cores. New advances in network interface cards, multi-queue technology, multi-core CPUs and high-speed buses, combined with new software, have made this development possible. Software advances include the parallelization of the forwarding code for use on multi-core CPUs, as well as support for interrupt affinity and queue handling. When obtaining such results, it is important to tune the software, configuring interrupt affinity to avoid cache misses and allocating DMA channels adequately (an example is sketched at the end of this section). It is also important to carefully select CPU cores, interface NICs, memory and buses to obtain a system with suitable performance. A large issue has to do with avoiding cache misses by constructing highly local code that is independent of shared data structures.

During the course of this project, the Linux kernel has improved its performance with respect to parallelized forwarding quite drastically, both in general forwarding and in driver-specific code. This effort has been made by the Linux community as a whole, where we have contributed partly by code patches, partly by measurements and profiling of the source code. In particular, support for the new multi-queue capable cards has been implemented in the Linux networking stack. We are now at a point where additional CPUs visibly improve performance, which is a new property of forwarding, since additional CPUs used to result in decreased network performance. This is a continuing process, with hardware vendors constantly developing new and more efficient classifiers.

With this paper we also see new challenges, including closing the last gap between the measured throughput and full 10 Gb/s wire-speed forwarding. With more detailed experiments, we plan to identify the reasons for this gap and try to remove them if possible. We also hope that the performance breakpoint that we identified at four CPU cores can be raised, maybe by allocating cores in a more fine-grained manner, so that, for example, a subset of CPU cores can be allocated for traffic in one direction, and another subset for traffic in the other.

A key issue in achieving high performance is how to distribute input load over several memories and CPU cores. The cards used in this paper implement the Receive Side Scaling (RSS) proposal [], where input is distributed via separate DMA channels to different memories and CPU cores using hashing of the packet headers. This provides basic support for virtualization, but is also essential for forwarding performance since it provides a mechanism to parallelize the packet processing. With more powerful hardware classification, it is possible to go a step further in functionality and performance by off-loading parts of the forwarding decisions to hardware. Some cards provide more powerful classification mechanisms, such as the TCAM on the SUN niu. This shows an interesting path for future work, and we plan to continue by looking at classification using such more powerful classification devices, in particular how the Linux kernel and its associated software can benefit from them. By using a TCAM, packets can be dropped or tagged by hardware, thus enabling wire-speed filtering and prioritization, for example.

A specific point is how to best serialise several transmit queues while causing minimal lock contention, cache misses and packet reordering. The receive side is relatively simple in this respect, as the classifier splits the incoming traffic. The transmit side is by design a congestion point, since the packets come from several interfaces and may potentially cause blocking.
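As an example of the interrupt affinity tuning referred to above, the sketch below pins each per-queue interrupt of a multi-queue NIC to its own core by writing CPU masks under /proc/irq. The /proc files are the standard Linux interface; the interface naming pattern and the queue-to-core mapping are assumptions for illustration.

```python
import re

def pin_queue_irqs(ifname="eth1", n_cpus=8):
    """Pin each per-queue interrupt of a multi-queue NIC to its own CPU core
    by writing a one-bit CPU mask to /proc/irq/<n>/smp_affinity."""
    with open("/proc/interrupts") as f:
        for line in f:
            # Match lines such as " 58: ... PCI-MSI-edge  eth1-rx-3"
            # (the "<ifname>-...-<queue>" pattern is driver-dependent).
            m = re.match(r"\s*(\d+):.*\b" + re.escape(ifname) + r"\S*?(\d+)\s*$",
                         line)
            if not m:
                continue
            irq, queue = int(m.group(1)), int(m.group(2))
            mask = 1 << (queue % n_cpus)  # queue i -> CPU i, cf. Section III
            with open(f"/proc/irq/{irq}/smp_affinity", "w") as aff:
                aff.write(f"{mask:x}")

pin_queue_irqs()
```

Pinning one queue per core keeps each queue's packet state in a single CPU's cache, which is exactly the locality the text above argues for.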
ACKNOWLEDGEMENTS

This paper is a result of a study sponsored by IIS. Interface boards were provided by Intel and SUN, and CPUs by AMD. We thank Uppsala University and KTH for providing experimental networking infrastructure.

REFERENCES

[] O. Hagsand, R. Olsson and B. Gördén, "Towards 10Gb/s open-source routing", in Proceedings of the Linux Kongress, Hamburg, October.
[] R. Olsson, "pktgen: the Linux packet generator", in Proceedings of the Linux Symposium, Ottawa, Canada.
[] S. Nilsson and G. Karlsson, "Fast address look-up for Internet routers", in Proc. IFIP International Conference on Broadband Communications.
[] M. Degermark et al., "Small Forwarding Tables for Fast Routing Lookups", in Proc. ACM SIGCOMM Conference.
[] Microsoft Corporation, "Scalable Networking: Eliminating the Receive Processing Bottleneck - Introducing RSS", WinHEC, April.
[] R. Olsson, H. Wassen and E. Pedersen, "Open Source Routing in High-Speed Production Use", Linux Kongress, October.
[] K. Ishiguro et al., "Quagga: a routing software package for TCP/IP networks", July.
[] M. Handley, E. Kohler, A. Ghosh, O. Hodson and P. Radoslavov, "Designing Extensible IP Router Software", in Proc. of the 2nd USENIX Symposium on Networked Systems Design and Implementation (NSDI).
[] "Packet size distribution comparison between Internet links: WIDE MAWI", Caida.
[] Milnert, "Packet size distribution in Chalmers student network", Bifrost mailing list, December.
[] Pedersen, "Packet size distribution of Uppsala university uplink", Bifrost mailing list, December.