NARC: Network-Attached Reconfigurable Computing for High-performance, Network-based Applications
Slide 1: NARC: Network-Attached Reconfigurable Computing for High-performance, Network-based Applications

Chris Conger, Ian Troxel, Daniel Espinosa, Vikas Aggarwal, and Alan D. George
High-performance Computing and Simulation (HCS) Research Lab
Department of Electrical and Computer Engineering, University of Florida
Slide 2: Outline

- Introduction
- NARC Board Architecture, Protocols
- Case Study Applications
- Experimental Setup
- Results and Analysis
- Pitfalls and Lessons Learned
- Conclusions
- Future Work
Slide 3: Introduction

- Network-Attached Reconfigurable Computer (NARC) project
  - Inspiration: network-attached storage (NAS) devices
  - Core concept: investigate challenges and alternatives for enabling direct network access to, and control over, reconfigurable computing (RC) devices
  - Method: prototype a hardware interface and software infrastructure; demonstrate a proof of concept for the benefits of network-attached RC resources
- Motivations for the NARC project include (but are not limited to) applications such as:
  - Network-accessible processing resources
    - Generic networked RC resource; a viable alternative to server and supercomputer solutions
    - Power and cost savings over server-based FPGA cards are key benefits; no server is needed to host the RC device
    - Infrastructure provided for robust operation and interfacing with users
    - A performance increase over existing RC solutions is not a primary goal of this approach
  - Network monitoring and packet analysis
    - Easy attachment; unobtrusive, fast traffic gathering and processing
    - Network intrusion and attack detection, performance monitoring, active traffic injection
    - Direct network connection of the FPGA can enable wire-speed processing of network traffic
  - Aircraft and advanced munitions systems
    - A standard Ethernet interface eases addition and integration of RC devices in aircraft and munitions systems
    - Low weight and power are also attractive characteristics of the NARC device for such applications
Slide 4: Envisioned Applications

- Aerospace & military applications
  - Modular, low-power design lends itself well to deployment in military craft and munitions
  - FPGAs providing high-performance radar, sonar, and other computational capabilities
- Scientific field operations
  - Quickly provide first-level estimations in the field for geologists, biologists, etc.
- Cost-effective intelligent sensor networks
  - Use FPGAs in close conjunction with sensors to provide pre-processing functions before network transmission
- Field-deployable covert operations
  - Completely wireless device enabled through battery and WLAN
  - Passive network monitoring applications
  - Active network traffic injection
- Distributed computing
  - Cost-effective, RC-enabled clusters or cluster resources
  - Cluster NARC devices at a fraction of the cost, power, and cooling
- High-performance network technologies
  - Fast Ethernet may be replaced by any network technology: Gig-E, InfiniBand, RapidIO, or proprietary communication protocols
Slide 5: NARC Board Architecture: Hardware

- ARM9 network control with FPGA processing power (see Figure 1)
- Prototype design consists of two boards, connected via cable:
  - Network interface board (ARM9 processor + peripherals)
  - Xilinx development board(s) (FPGA)
- Network interface peripherals include:
  - Layer-2 network connection (hardware PHY + MAC)
  - External memory: SDRAM and Flash
  - Serial port (debug communication link)
  - FPGA control and data lines
- NARC hardware specifications:
  - ARM-core microcontroller, 1.8V core, 3.3V peripherals
    - 32-bit RISC, 5-stage pipeline, in-order execution
    - 16KB data cache, 16KB instruction cache
    - Core clock speed 180MHz, peripheral clock 60MHz
    - On-chip Ethernet MAC layer with DMA
  - External memory, 3.3V
    - 32MB SDRAM, 32-bit data bus
    - 2MB Flash, 16-bit data bus
    - Port available for additional 16-bit SRAM devices
  - Ethernet transceiver, 3.3V
    - DM9161 PHY-layer transceiver
    - 100Mbps, full-duplex capable
    - RMII interface to MAC

Figure 1: Block diagram of NARC device
Slide 6: NARC Board Architecture: Software

- ARM processor runs the Linux kernel
  - Provides TCP(UDP)/IP stack, resource management, threaded execution, and a Berkeley Sockets interface for applications
  - Configured and compiled with drivers specifically for our board
- Applications written in C, compiled using the GCC cross-compiler for ARM (arm-linux-gcc; see Figure 2)
- NARC API: low-level driver function library for basic services
  - Initialize and configure on-chip peripherals of the ARM-core processor
  - Configure the FPGA (SelectMAP protocol)
  - Transfer data to/from the FPGA; manipulate control lines
  - Monitor and initiate network traffic
- NARC protocol for job exchange (from a remote workstation)
  - NARC board application and client application must follow standard rules and procedures for responding to requests from a user
  - User appends a small header onto the data (if any) containing information about the request before sending it over the network (see Figure 3; a client-side sketch follows below)
- Bootstrap software in on-board Flash automatically loads and executes on power-up
  - Configures clocks, memory controllers, I/O pins, etc.
  - Contacts a TFTP server running on the network; downloads Linux and a ramdisk
  - Boots Linux, then automatically executes the NARC board software contained in the ramdisk
- Optional serial interface through HyperTerminal for debugging/development

Figure 2: Software development process (client.c, main.c, util.c, narc.h, and a Makefile, built with gcc/arm-linux-gcc into the client application for the user workstation and the NARC board application on the ramdisk)
Figure 3: Request header field definitions
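The transcription does not preserve Figure 3's field layout, so the header below is a hypothetical sketch: a packed struct with an RTYPE code (the only field named in the text) and a payload length, sent ahead of the data over a Berkeley socket as the protocol describes. Names such as narc_request and narc_send_request are illustrative, not part of the actual API.

```c
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Hypothetical request header -- the real layout is given in Figure 3.
 * RTYPE is the request-type code (e.g. 01 = configure FPGA). */
struct narc_request {
    uint8_t  rtype;      /* request type code                     */
    uint32_t data_len;   /* length of payload that follows, bytes */
} __attribute__((packed));

/* Connect to the NARC board and send a header plus optional payload. */
int narc_send_request(const char *board_ip, uint16_t port,
                      uint8_t rtype, const void *data, uint32_t len)
{
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port   = htons(port) };
    if (inet_pton(AF_INET, board_ip, &addr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }

    struct narc_request hdr = { .rtype = rtype, .data_len = htonl(len) };
    write(fd, &hdr, sizeof hdr);        /* small header first...    */
    if (data && len)
        write(fd, data, len);           /* ...then the data, if any */
    return fd;                          /* caller reads the reply   */
}
```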
Slide 7: NARC Board Architecture: FPGA Interface

- Data is communicated to/from the FPGA by means of unidirectional data paths
  - 8-bit input port, 8-bit output port, 8 control lines (Figure 4)
  - Control lines manage data transfer and also drive configuration signals
  - Data is transferred one byte at a time; full-duplex communication is possible
- Control lines include the following signals (a byte-transfer sketch follows below):
  - Clock: software-generated signal to clock data on the data ports
  - Reset: reset signal for the interface logic in the FPGA
  - Ready: signal indicating the device is ready to accept another byte of data
  - Valid: signal indicating the device has placed valid data on the port
  - SelectMAP: all signals necessary to drive SelectMAP configuration (PROG, INIT, CS, WRITE, DONE)
- FPGA configuration through the SelectMAP protocol
  - Fastest configuration option for Xilinx FPGAs; protocol emulated using GPIO pins of the ARM
- NARC board enables remote configuration and management of the FPGA
  - User submits a configuration request (RTYPE = 01), along with a bitfile and a function descriptor
  - Function descriptor is an ASCII string: a formatted list of functions with associated RTYPE definitions
  - ARM halts and configures the FPGA, then stores the descriptor in a dedicated RAM buffer for user queries
- All FPGA designs must restrict use of the SelectMAP pins after configuration
  - Some signals are shared between the SelectMAP port and the FPGA-ARM link
  - Once configured, SelectMAP pins must remain tri-stated and unused

Figure 4: FPGA interface signal diagram (ARM and FPGA joined by In[0:7]/Out[0:7] data ports, the f_ready/a_valid and f_valid/a_ready handshake pairs, clock, reset, and the shared SelectMAP signals PROG, INIT, CS, WRITE, DONE on D[0:7])
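As a concrete illustration of the handshake, here is a minimal sketch of the ARM-side routine for sending one byte to the FPGA. The GPIO helpers and pin identifiers are hypothetical stand-ins for the board's driver layer; only the ready/valid/clock behavior is taken from the description above.

```c
#include <stdint.h>

/* Hypothetical GPIO wrappers for the ARM's general-purpose I/O pins. */
extern void gpio_write_out_port(uint8_t byte);   /* drive Out[0:7]    */
extern void gpio_set_pin(int pin, int level);    /* drive one signal  */
extern int  gpio_read_pin(int pin);              /* sample one signal */

enum { PIN_CLOCK, PIN_A_VALID, PIN_F_READY };    /* illustrative IDs  */

/* Send one byte over the 8-bit output port using the software-
 * generated clock and the ready/valid handshake. */
void fpga_send_byte(uint8_t byte)
{
    while (!gpio_read_pin(PIN_F_READY))
        ;                                /* wait until FPGA can accept */
    gpio_write_out_port(byte);           /* place data on Out[0:7]     */
    gpio_set_pin(PIN_A_VALID, 1);        /* ARM signals valid data     */
    gpio_set_pin(PIN_CLOCK, 1);          /* one software-generated     */
    gpio_set_pin(PIN_CLOCK, 0);          /*   clock edge latches it    */
    gpio_set_pin(PIN_A_VALID, 0);        /* end of transfer            */
}
```

Toggling the clock and handshake lines one GPIO write at a time is what bounds the link throughput measured later; widening the port and tightening this loop are the improvements discussed in the results and lessons-learned slides.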
Slide 8: Results and Analysis: Raw Performance

- FPGA interface I/O throughput (Table 1)
  - 1KB of data transferred over the link and timed
  - Measured using hardware methods: a logic analyzer captures the raw link activity; data sent is divided by the time from first clock to last clock (see Figure 9)
  - Performance is lower than desired for the prototype; the handshake protocol may add unnecessary overhead
  - Widening the data paths and optimizing the software routine will significantly improve FPGA I/O performance
- Network throughput (Table 2)
  - Measured using the Linux network benchmark IPerf; a sketch of the underlying measurement loop follows below
  - NARC board located on an arbitrary switch within the network; the application partner is a user workstation
  - Transfers as much data as possible in 10 seconds; throughput is calculated as data sent divided by 10 seconds
  - Performed two experiments, with the NARC board serving as client in one run and server in the other
  - Both local and remote IPerf partners (the remote partner ~400 miles away, at Florida State University)
  - Network interface achieves reasonably good bandwidth efficiency
- External memory throughput (Table 3)
  - 4KB transferred to external SDRAM, both read and write; measurements again taken with the logic analyzer
  - Memory throughput is sufficient to provide wire-speed buffering of network traffic
  - On-chip Ethernet MAC has DMA to this SDRAM, which should help alleviate the I/O bottleneck between ARM and FPGA

Figure 9: Logic analyzer timing
Table 1: FPGA interface I/O performance (Mb/s, input and output)
Table 2: Network throughput (Mb/s; local network and remote WAN; NARC-server and server-server)
Table 3: External SDRAM throughput (Mb/s, read and write)
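The IPerf measurement reduces to streaming data over a connected socket for a fixed window and dividing bits sent by elapsed time. A minimal sketch of that client-side loop, assuming fd is an already-connected TCP socket (this is an illustration of the arithmetic, not IPerf's actual code):

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Stream data over an already-connected socket for ~10 s and report
 * throughput as bits sent / elapsed time. */
double measure_throughput_mbps(int fd)
{
    static char buf[8192];
    memset(buf, 0xA5, sizeof buf);       /* arbitrary payload pattern */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    uint64_t sent = 0;
    double elapsed = 0.0;
    while (elapsed < 10.0) {
        ssize_t n = write(fd, buf, sizeof buf);
        if (n <= 0)
            break;                       /* connection closed or error */
        sent += (uint64_t)n;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        elapsed = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    }
    return elapsed > 0.0 ? (double)sent * 8.0 / (elapsed * 1e6) : 0.0;
}
```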
Slide 9: Results and Analysis: Raw Performance

- Reconfiguration speed
  - Includes time to transfer the bitfile over the network, plus time to configure the device (transfer the bitfile from ARM to FPGA), plus time to receive an acknowledgement
  - Our design currently completes a user-initiated reconfiguration request with a 1.2MB bitfile in 2.35 s (an effective end-to-end rate of roughly (1.2MB x 8) / 2.35 s ≈ 4.1 Mb/s)
- Area/resource usage of a minimal wrapper for the Virtex-II Pro FPGA
  - Resource requirements for a minimal design providing the required link control and data transfer in an application wrapper are presented below
  - Design implemented on an older Virtex-II Pro FPGA; the numbers indicate requirements for the wrapper only, leaving unused resources available for user applications
  - Extremely small footprint! The relative footprint will be even smaller on a larger FPGA

Device utilization summary (selected device: 2vp20ff):
- Slices: 143
- Slice flip-flops: 120
- 4-input LUTs: 238
- Bonded IOBs: 24 out of 564 (4%)
- BRAMs: 8 out of 88 (9%)
- GCLKs: 1 out of 16 (6%)
Slide 10: Case Study Applications — Clustered RC Devices: N-Queens

- HPC application demonstrating the NARC board's role as a generic compute resource
- Application characterized by minimal communication and heavy computation within the FPGA
- NARC version of N-Queens adapted from a previously implemented application for the PCI-based Celoxica RC1000 board housed in a conventional server
- The N-Queens algorithm is part of the DoD high-performance computing benchmark suite and is representative of select military and intelligence processing algorithms
- Exercises the functionality of the various developed mechanisms and protocols for job submission, data transfer, etc. on NARC
- User specifies a single parameter N; upon completion, the algorithm returns the total number of possible solutions
- The purpose of the algorithm is to determine how many possible arrangements of N queens there are on an NxN chess board such that no queen may attack another (see Figure 5); a software reference version is sketched below
- Results are presented from both NARC-based and RC1000-based execution for comparison

Figure 5: A possible 8x8 solution (figure c/o Jeff Somers)
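The FPGA design itself is not shown in the slides; for reference, a compact software version of the same counting problem, using standard bitmask backtracking, illustrates what the hardware computes for a given N:

```c
#include <stdio.h>

/* Count all placements of n queens on an n x n board such that no
 * queen attacks another -- the count the NARC/RC1000 designs return. */
static unsigned long long count;

static void place(int n, int row, unsigned cols,
                  unsigned diag1, unsigned diag2)
{
    if (row == n) { count++; return; }   /* all rows filled: a solution */
    for (int c = 0; c < n; c++) {
        unsigned col = 1u << c;
        unsigned d1  = 1u << (row + c);          /* "/" diagonal index  */
        unsigned d2  = 1u << (row - c + n - 1);  /* "\" diagonal index  */
        if ((cols & col) || (diag1 & d1) || (diag2 & d2))
            continue;                    /* square is attacked, skip   */
        place(n, row + 1, cols | col, diag1 | d1, diag2 | d2);
    }
}

int main(void)
{
    int n = 8;
    place(n, 0, 0, 0, 0);
    printf("N=%d: %llu solutions\n", n, count);  /* N=8 -> 92 */
    return 0;
}
```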
Slide 11: Case Study Applications — Network Processing: Bloom Filter

- This application performs passive packet analysis using a classification algorithm known as a Bloom filter
- Application characterized by constant, bursty communication patterns; most communication is Rx over the network plus transmission to the FPGA
- The filter may be programmed or queried
  - NARC device copies all received network frames to memory; the ARM parses the TCP/IP header and sends it to the Bloom filter for classification
  - User can send programming requests, which include a header and a string to be programmed into the filter
  - User can also send result-collection requests, which cause a formatted results packet to be sent back to the user
  - Otherwise the application runs continuously, querying each header against the current Bloom filter and recording match/header pair information
- A Bloom filter works by applying multiple hash functions to a given bit string, each hash function producing indices into a separate bit vector (see Figure 6); a software model is sketched below
  - To program: hash the input string and set the resulting bit positions to 1
  - To query: hash the input string; if all resulting bit positions are 1, the string matches
- Implemented on a Virtex-II Pro FPGA
  - Uses a slightly larger, but ultimately more effective, application wrapper (see Figure 7)
  - Larger FPGA selected to demonstrate interoperability with any FPGA

Figure 6: Bloom filter algorithmic architecture
Figure 7: Bloom filter implementation architecture
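A minimal software model of the program/query operations described above. The hardware design uses a separate bit vector per hash function; for brevity this sketch uses a single shared vector, and the vector size, hash count, and FNV-based hash family are assumptions rather than details of the actual design.

```c
#include <stddef.h>
#include <stdint.h>

#define VECTOR_BITS 4096   /* bit-vector size (assumption)       */
#define NUM_HASHES  4      /* number of hash functions (assumption) */

static uint8_t bitvec[VECTOR_BITS / 8];

/* FNV-1a base hash; the k hashes are derived as h1 + i*h2, a common
 * software stand-in for k independent hardware hash units. */
static uint32_t fnv1a(const uint8_t *s, size_t len, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (size_t i = 0; i < len; i++) { h ^= s[i]; h *= 16777619u; }
    return h;
}

static uint32_t bit_index(const uint8_t *s, size_t len, int i)
{
    uint32_t h1 = fnv1a(s, len, 0);
    uint32_t h2 = fnv1a(s, len, 0x9e3779b9u);
    return (h1 + (uint32_t)i * h2) % VECTOR_BITS;
}

/* Program: hash the string and set the resulting bit positions to 1. */
void bloom_program(const uint8_t *s, size_t len)
{
    for (int i = 0; i < NUM_HASHES; i++) {
        uint32_t b = bit_index(s, len, i);
        bitvec[b / 8] |= (uint8_t)(1u << (b % 8));
    }
}

/* Query: the string matches only if every resulting bit is 1. */
int bloom_query(const uint8_t *s, size_t len)
{
    for (int i = 0; i < NUM_HASHES; i++) {
        uint32_t b = bit_index(s, len, i);
        if (!(bitvec[b / 8] & (1u << (b % 8))))
            return 0;                 /* definitely not programmed   */
    }
    return 1;                         /* match (false positives possible) */
}
```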
Slide 12: Experimental Setup

- N-Queens: clustered RC devices
  - NARC device located on an arbitrary switch in the network; Figure 8 illustrates the experimental environment
  - User interfaces through a client application on a workstation and requests the N-Queens procedure
  - Client application records the time required to satisfy the request
  - A power supply measures the current draw of the active NARC device
  - N-Queens also implemented on an RC-enabled server equipped with a Celoxica RC1000 board
    - Client-side function call to the NARC board replaced with a function call to the RC1000 board in the local workstation, with the same timing measurement
  - Comparison offered in terms of performance, power, and cost
- Bloom Filter: network processing
  - Same experimental setup as the N-Queens case study
  - Software on the ARM co-processor captures all Ethernet frames; only packet headers (TCP/IP) are passed to the FPGA (a capture sketch follows below)
  - Data is continuously sent to the FPGA as packets arrive over the network
  - With the NARC device attached to a switch, only limited packets can be captured: broadcast packets and packets destined for the NARC device
  - A dual-port device could be inserted in-line with a network link to monitor all flow-through traffic

Figure 8: Experimental environment (user workstation, NARC devices, and RC-enabled servers on an Ethernet network)
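On Linux, the ARM-side capture path described here maps naturally onto an AF_PACKET raw socket: receive each frame, keep only IPv4/TCP, and forward the headers. A sketch under that assumption, with fpga_send_header() as a hypothetical stand-in for the FPGA link routine:

```c
#include <stddef.h>
#include <sys/socket.h>
#include <linux/if_ether.h>     /* ETH_P_ALL           */
#include <net/ethernet.h>       /* struct ether_header */
#include <netinet/ip.h>         /* struct iphdr        */
#include <netinet/tcp.h>        /* struct tcphdr       */
#include <arpa/inet.h>
#include <unistd.h>

extern void fpga_send_header(const void *hdr, size_t len); /* hypothetical */

int main(void)
{
    /* A raw socket sees every frame delivered to this interface; on a
       switched network that means broadcast traffic and frames addressed
       to the NARC device, as noted above.  Requires root privileges. */
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return 1;

    unsigned char frame[2048];
    for (;;) {
        ssize_t n = recv(fd, frame, sizeof frame, 0);
        if (n < (ssize_t)(sizeof(struct ether_header) + sizeof(struct iphdr)))
            continue;                    /* too short to carry IP     */

        struct ether_header *eth = (struct ether_header *)frame;
        if (ntohs(eth->ether_type) != ETHERTYPE_IP)
            continue;                    /* keep IPv4 only            */

        struct iphdr *ip = (struct iphdr *)(frame + sizeof *eth);
        if (ip->protocol != IPPROTO_TCP)
            continue;                    /* keep TCP only             */

        /* Queue the IP + TCP headers for classification on the FPGA. */
        fpga_send_header(ip, (size_t)ip->ihl * 4 + sizeof(struct tcphdr));
    }
}
```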
Slide 13: Results and Analysis: N-Queens Case Study

- First, consider an execution-time comparison between our NARC board and a PCI-based RC card (see Figures 10a and 10b)
  - Both FPGA designs clocked at 50MHz
  - The performance difference between the devices is minimal
  - Being able to match the performance of a PCI-based card is a resounding success!
- Power consumption and cost of NARC devices are drastically lower than those of server + RC-card combos
  - Multiple users may share a NARC device; PCI-based cards are somewhat fixed in an individual server
- Power consumption calculated using the following method:
  - Three regulated power supplies exist in the complete NARC device (network interface + FPGA board): 5V, 3.3V, and 2.5V
  - Current draw from each supply was measured
  - Power consumption is calculated as the sum of the V x I products of all three supplies

Figure 10: Performance comparison between the NARC board and a PCI-based RC card in a server (execution time vs. algorithm parameter N, for small and large board sizes)
Slide 14: Results and Analysis: N-Queens Case Study

- Figure 11 summarizes the performance ratio of N-Queens between the NARC and RC1000 platforms
- Consider Table 4 for a summary of cost and power statistics
  - Unit price shown excludes the cost of the FPGA; FPGA costs offset each other when comparing against another FPGA-based device
  - Price shown includes PCB fabrication and component costs
  - Approximate power consumption is drastically less than a server + RC-card combo
  - Power consumption of a server varies depending on the particular hardware; typical servers operate off of power supplies rated at hundreds of watts
  - See Figure 12 for the approximate power consumption calculation:
    P = (5 V)(I5) + (3.3 V)(I3.3) + (2.5 V)(I2.5)
    with I5 ≈ 0.2 A, I3.3 ≈ 0.49 A, I2.5 ≈ 0.27 A
    P = (5)(0.2) + (3.3)(0.49) + (2.5)(0.27) = 3.28 W

Figure 11: NARC/RC1000 performance ratio vs. algorithm parameter N
Table 4: Price and power figures for the NARC device (cost per prototype unit, approximate power consumption)
Figure 12: Power consumption calculation
Slide 15: Results and Analysis: Bloom Filter

- Passive, continuous network traffic analysis
- Wrapper design is slightly larger than the minimal wrapper used with N-Queens
  - Still a small footprint on the chip; the majority of the FPGA remains available for the application
  - Maximum wrapper clock frequency of 183MHz should not limit the application clock if both are in the same clock domain
- Packets received over the network link are parsed by the ARM, with the TCP/IP header saved in a buffer
  - Headers are sent one at a time as query requests to the Bloom filter (FPGA); when a query finishes, another header is de-queued if available
  - User may query the NARC device at any time for a results update or to program a new pattern
- Figure 13 shows resource usage on the Virtex-II Pro FPGA
  - Maximum clock frequency of 113MHz, not limited by the wrapper constraint
  - Computation speed is significantly faster than the FPGA-ARM link communication speed
    - FPGA-side buffer will not fill up; each header is processed before the next header is transmitted to the FPGA
    - ARM-side buffer may fill up under heavy traffic loads; the 32MB ARM-side RAM gives a large buffer

Figure 13: Device utilization statistics for the Bloom Filter design (selected device: 2vp20ff)
- Slices: 1174
- Slice flip-flops: 1706
- 4-input LUTs: 2032
- Bonded IOBs: 24 out of 564 (4%)
- BRAMs: 9 out of 88 (10%)
- GCLKs: 1 out of 16 (6%)
Slide 16: Pitfalls and Lessons Learned

- FPGA I/O throughput capacity remains a persistent problem
  - One motivation for designing custom hardware is to remove the typical PCI bottleneck and provide wire-speed network connectivity for the FPGA
  - The under-provisioned data path between FPGA and network interface restricts the performance benefits of our prototype design
  - Luckily, this problem may be solved through a variety of approaches:
    - Wider data paths (16-bit, 32-bit) double or quadruple throughput, at the expense of a higher pin count
    - Use of a higher-performance co-processor capable of faster I/O switching frequencies
    - An optimized data transfer protocol
- Having a co-processor in addition to the FPGA to handle the network interface is vital to the success of our approach
  - Required to permit initial remote configuration of the FPGA, as well as additional reconfigurations upon user request
  - Offloading the network stack, basic request handling, and other maintenance-type tasks from the FPGA saves a significant number of valuable slices for user designs
  - Drastically eases interfacing with the user application on a networked workstation
  - Serves as an active co-processor for FPGA applications, e.g. parsing network packets as in the Bloom Filter application
Slide 17: Conclusions

- A novel approach to providing FPGAs with standalone network connectivity has been prototyped and successfully demonstrated
  - Investigated issues critical to providing remote management of standalone NARC resources
  - Proposed and demonstrated solutions to the challenges discovered
  - Performed a pair of case studies with two distinct, representative applications for a NARC device
- Network-attached RC devices offer potential benefits for a variety of applications
  - Impressive cost and power savings over server-based RC processing
  - Independent NARC devices may be shared by multiple users without being moved
  - Tightly coupled network interface enables the FPGA to be used directly in the path of network traffic for real-time analysis and monitoring
- Two issues that are proving to be a challenge to our approach:
  - Data latency in FPGA communication
  - The software infrastructure required to achieve a robust standalone RC unit
- While the prototype design achieves relatively good performance in some areas and limited performance in others, this is acceptable for a concept demonstration
  - Fairly complex board design; architecture and software enhancements are in development
- As proof of the NARC concept, an important goal of the project was achieved in demonstrating an effective and efficient infrastructure for managing NARC devices
Slide 18: Future Work

- Expansion of network processing capabilities
  - Further development of the packet filtering application; seek more specific and practical activity or behavior in network traffic
  - Analyze streaming packets at or near wire-speed rates
  - Expansion of the Ethernet link to a 2-port hub
    - Permits transparent insertion of the device into a network path
    - Provides easier access to all packets in a switched IP network
- Merging the FPGA with the ARM co-processor and network interface into one device
  - The ultimate vision for the NARC device
  - Will restrict the number of different FPGAs which may be supported, according to the FPGA socket/footprint chosen for the board
  - Increased difficulty in PCB design
- Expansion to Gig-E and other network technologies
  - Fast Ethernet was targeted for the prototyping effort and concept demonstration
  - A true high-performance device should support Gigabit Ethernet
  - Other potential technologies include (but are not limited to) InfiniBand and RapidIO
- Further development of the management infrastructure
  - Need for more robust control/decision-making middleware
  - Automatic device discovery, concurrent job execution, fault-tolerant operation