Low-latency data acquisition to GPUs using FPGA-based 3rd party devices. Denis Perret, LESIA / Observatoire de Paris


1 Low-latency data acquisition to GPUs using FPGA-based 3rd party devices. Denis Perret, LESIA / Observatoire de Paris. RTC_4_AO workshop, Paris, 2016

2 RTC for ELT AO
[Diagram; labels: Sensors, Deformable Mirror, 40GbE network (high bandwidth, low latency), Hard Real Time, telemetry, 40GbE network (high bandwidth), Soft Real Time]

3 Using FPGA to reduce acquisition latency
~ Interface: 10 GbE, low-latency acquisition interface
~ Without Direct Memory Access to the GPU RAM: multiple data copies = added latency
[Diagram; labels: GPU RAM, 10GbE NIC, PCIe, CPU, RAM]

4 Using FPGA to reduce acquisition latency
~ Interface: 10 GbE, low-latency acquisition interface
~ With Direct Memory Access to the GPU RAM: single data transfer = low latency, maximum bandwidth
[Diagram; labels: GPU RAM, 10GbE NIC, PCIe, CPU, RAM]

5 No PCIe P2P at all
[Diagram; labels: encapsulated data, NIC, GPU RAM, PCIe, host RAM, CPU app (handles the interrupts and the data decapsulation)]
~ The NIC DMA engine writes data to host memory and sends an interrupt to the CPU when the buffer is full (or half full).
~ The CPU suspends its tasks and saves its current state (context switch).
~ The CPU decapsulates the data (Ethernet, TCP/UDP, GigE Vision).
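The decapsulation step that the CPU performs on this path can be sketched as follows. This is a minimal illustration, not the code used in the talk: it assumes an untagged Ethernet frame carrying IPv4 and UDP, ignores checksums and the GigE Vision layer, and the names `decapsulate_udp` and `build_test_frame` are hypothetical.

```python
import struct

ETH_HEADER_LEN = 14       # destination MAC (6) + source MAC (6) + EtherType (2)
UDP_HEADER_LEN = 8        # source port, destination port, length, checksum
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17


def decapsulate_udp(frame: bytes) -> bytes:
    """Return the UDP payload of an untagged Ethernet/IPv4/UDP frame."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("not an IPv4 frame")
    # Low nibble of the first IPv4 byte: header length in 32-bit words.
    ip_header_len = (frame[ETH_HEADER_LEN] & 0x0F) * 4
    # Byte 9 of the IPv4 header is the protocol field.
    if frame[ETH_HEADER_LEN + 9] != IPPROTO_UDP:
        raise ValueError("not a UDP datagram")
    udp_offset = ETH_HEADER_LEN + ip_header_len
    # The UDP length field covers the UDP header plus the payload.
    (udp_length,) = struct.unpack_from("!H", frame, udp_offset + 4)
    return frame[udp_offset + UDP_HEADER_LEN:udp_offset + udp_length]


def build_test_frame(payload: bytes) -> bytes:
    """Build a minimal frame for the demo (addresses and checksums left at zero)."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", ETHERTYPE_IPV4)
    total_len = 20 + UDP_HEADER_LEN + len(payload)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, 0, 64,
                     IPPROTO_UDP, 0, bytes(4), bytes(4))
    udp = struct.pack("!HHHH", 5000, 5001, UDP_HEADER_LEN + len(payload), 0)
    return eth + ip + udp + payload


pixels = bytes(range(16))
assert decapsulate_udp(build_test_frame(pixels)) == pixels
```

Every frame goes through this header parsing on the CPU before any pixel reaches the GPU, which is part of the latency that the later slides remove by moving decapsulation into the FPGA.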

6 No PCIe P2P at all
[Diagram; labels: NIC, GPU RAM, PCIe, host RAM, decapsulated data, CPU app (initiates the transfer of the data and of the operation stream)]
~ The CPU builds a CUDA stream containing operations such as kernel launches and memory transfer orders.
~ The stream is sent to the GPU.
~ The pixels are copied to the GPU RAM (GPU DMA, read process over PCIe).

7 No PCIe P2P at all
[Diagram; labels: NIC, GPU RAM, PCIe, host RAM, computation, results, CPU app (does something else; gets interrupted periodically)]
~ The CPU waits for the computation to end (or does something else).
~ The results are sent to the CPU RAM.
~ The synchronization mechanisms between the GPU and the CPU are hidden (interrupts).

8 No PCIe P2P at all
[Diagram; labels: encapsulated results, NIC, GPU RAM, PCIe, host RAM, CPU app (encapsulates the data; initiates the data transfer)]
~ The CPU encapsulates the results (Ethernet, TCP/UDP).
~ The CPU initiates the transfer by configuring and launching the NIC DMA engine (read process).

9 With PCIe P2P and custom NIC
[Diagram; labels: data, custom NIC (TOE, data decapsulation), GPU RAM (polling kernel, decapsulated data), PCIe, host RAM, CPU app (minding its own business)]
~ The CPU gets the address and size of a dedicated GPU buffer and uses them to configure the NIC DMA engine.
~ The data are written directly to the GPU memory.
~ The GPU detects new data by polling its memory in an endless busy loop, performs the computation and fills a local buffer with the results.
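The polling scheme in the last bullet can be illustrated with a CPU-only simulation. This is a sketch under stated assumptions, not GPU code: a Python thread stands in for the polling kernel, a plain list stands in for the dedicated GPU buffer, and the ring size, frame count and the names `producer` and `polling_consumer` are all hypothetical.

```python
import threading

RING_SLOTS = 4
FRAMES = 8

ring = [None] * RING_SLOTS   # stands in for the dedicated GPU buffer
write_count = 0              # frames published so far; updated last by the producer
results = []                 # stands in for the local result buffer


def producer():
    """Plays the custom NIC: write a frame, then publish it by bumping the counter."""
    global write_count
    for seq in range(FRAMES):
        while seq - len(results) >= RING_SLOTS:
            pass                                # ring full: wait for the consumer
        ring[seq % RING_SLOTS] = [seq] * 4      # the frame data is written first
        write_count = seq + 1                   # then the frame is made visible


def polling_consumer():
    """Plays the polling kernel: spin on the counter, then process the new frame."""
    read_count = 0
    while read_count < FRAMES:
        if write_count == read_count:
            continue                            # busy loop: nothing new yet
        frame = ring[read_count % RING_SLOTS]
        results.append(sum(frame))              # placeholder for the real computation
        read_count += 1


consumer = threading.Thread(target=polling_consumer)
consumer.start()
producer()
consumer.join()
print(results)
```

The ordering matters: the producer writes the data before it updates the counter, so the consumer never reads a slot that is only partly filled. On real hardware the same guarantee has to come from the ordering of PCIe writes and from memory fences on the GPU side, which this simulation does not model.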

10 Getting back the results: 1st way
[Diagram; labels: results, custom NIC (TOE, data decapsulation), GPU RAM (polling kernel, results), PCIe, host RAM, CPU app (still minding its own business)]
~ The CPU gets notified in a hidden way that the computation has ended.
~ The CPU initiates the transfer from GPU to FPGA (FPGA DMA engine -> read process over PCIe).

11 Getting back the results: 2nd way
[Diagram; labels: results, custom NIC (TOE, data decapsulation), GPU RAM (polling kernel, results), PCIe, host RAM, CPU app (still minding its own business)]
~ The GPU sends the data directly by writing to the NIC addressable space (GPU DMA -> write process over PCIe).

12 Getting back the results: 3rd way
[Diagram; labels: results, custom NIC (TOE, data decapsulation), GPU RAM (polling kernel, results), PCIe, host RAM, CPU app (still minding its own business)]
~ The GPU notifies the FPGA by writing a flag in the FPGA addressable space.
~ The FPGA gets the data back (FPGA DMA engine -> read process).

13 Using FPGA to reduce acquisition latency: first tests
~ Stratix V PCIe development board from PLDA (+ QuickPCIe, QuickUDP IP cores): 42 Gb/s demonstrated from board to GPU; 8.8 Gb/s per 10GbE link in loopback mode
~ 10 GbE camera from Emergent Vision Technologies (8.9 Gb/s to memory), GigE Vision protocol, 1.5 kfps at 240x240 pixels coded on 10 bits (360 fps at 2k x 1k on 8 bits or 1k x 1k on 10 bits)
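As a rough cross-check of these figures, the raw pixel payload of each camera mode follows from width x height x bits per pixel x frame rate. The sketch below assumes that "2k x 1k" means 2048 x 1024 and "1k x 1k" means 1024 x 1024 (the slide does not give exact dimensions) and ignores protocol overhead and any padding of 10-bit pixels; `pixel_rate_gbps` is a hypothetical helper.

```python
def pixel_rate_gbps(width, height, bits_per_pixel, fps):
    """Raw pixel payload in Gb/s, excluding any Ethernet/UDP/GigE Vision overhead."""
    return width * height * bits_per_pixel * fps / 1e9


# Mode quoted on the slide: 240x240 pixels, 10 bits, 1.5 kfps.
small = pixel_rate_gbps(240, 240, 10, 1500)

# Assumed dimensions for the two 360 fps modes (not stated exactly on the slide).
large_8bit = pixel_rate_gbps(2048, 1024, 8, 360)
large_10bit = pixel_rate_gbps(1024, 1024, 10, 360)

print(f"240x240, 10 bits, 1500 fps: {small:.3f} Gb/s")
print(f"2048x1024, 8 bits, 360 fps: {large_8bit:.2f} Gb/s")
print(f"1024x1024, 10 bits, 360 fps: {large_10bit:.2f} Gb/s")
```

Under these assumptions the 240x240 mode carries about 0.86 Gb/s of pixel data, well below the 8.8 Gb/s measured per 10GbE link, so in that mode the link itself is far from saturated.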

14 Using FPGA to reduce acquisition latency: latency measurement
[Block diagram; labels: measurements, commands, DMA, RAM, CPU app, camera control, DMC, PHY (loopback), UDP, DECAPS, CUSTOM_NIC, DEMUX, Vision_protocol_handling, DATA GENERATOR, answers, pixels, deformable mirror commands, PCIe 3.0, gpu_ram, polling kernel, ring buffer, DM com buffer, computation kernels]

15 P2P from FPGA to GPU (way back still launched by the CPU)
[Plot; legend: no P2P & computation, P2P & computation, no P2P & no computation, P2P & no computation]

16 Interrupts vs polling (on the CPU)
[Test-setup diagram; labels: PC A, PC B, PLDA "Quickplay" board, PLDA "non-QP" board, 10GbE, DDR, hdl-kernel Image Gen, c-kernel CoG (or not), UDP0, UDP1, QPCIE, DMA, PCIe, hdl-kernel Latency Meas, BAR2, computing kernel(s), polling kernel, GPU board, RAM]

17 Interrupts vs polling (on the CPU)
[Plot; legend: memory polling; optimized interrupt + "stress -c 8"; non-optimized interrupt; non-optimized interrupt + "stress -c 8"]

18 Interrupts vs polling (on the CPU)
[Plot; legend: polling + isolated CPU; optimized interrupt; polling + non-isolated CPU]

19 Using FPGA to reduce the load on GPU/PCIe/network
[Diagram; labels: read direction, double-port DPRAM, CoG, FIFO, GPU RAM, DMA over PCIe, N*N subimages]
~ FPGAs are well suited for on-the-fly decapsulation, data rearrangement and highly parallelized computation.
~ Less load on the PCIe, smaller buffers in the GPU RAM, less latency.
~ Could be done in the WFS, which would also lower the load on the network.
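The centre-of-gravity (CoG) reduction over N*N subimages can be sketched in software to show why it shrinks the data sent over PCIe. This is an illustrative pure-Python version, not the FPGA implementation: it assumes a square frame whose side is a multiple of N, and `split_into_subimages` and `centre_of_gravity` are hypothetical names.

```python
def split_into_subimages(frame, n):
    """Cut a square frame into n x n square subimages, in row-major order."""
    size = len(frame) // n
    return [[row[c * size:(c + 1) * size] for row in frame[r * size:(r + 1) * size]]
            for r in range(n) for c in range(n)]


def centre_of_gravity(subimage):
    """Intensity-weighted mean (x, y) position; None if the subimage is all zero."""
    total = sx = sy = 0
    for y, row in enumerate(subimage):
        for x, value in enumerate(row):
            total += value
            sx += x * value
            sy += y * value
    if total == 0:
        return None
    return (sx / total, sy / total)


frame = [
    [0, 0, 3, 0],
    [0, 9, 0, 0],
    [0, 0, 5, 5],
    [2, 0, 0, 0],
]
slopes = [centre_of_gravity(s) for s in split_into_subimages(frame, 2)]
print(slopes)  # [(1.0, 1.0), (0.0, 0.0), (0.0, 1.0), (0.5, 0.0)]
```

Each subimage is reduced to two numbers regardless of how many pixels it holds, which is the source of the lower PCIe load and smaller GPU buffers mentioned above. On the FPGA the subimages would be processed in parallel as the pixels stream in, rather than one after another as in this loop.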

20 ... or even do everything with FPGAs
~ Exploring the possibilities with high-level development environments:
~ Vendor-specific HLS, e.g. Xilinx Vivado.
~ Quickplay: based on KPN. Has its own HLS and will be able to use other HLS tools. PCIe P2P integration is planned.
~ OpenCL: available for FPGAs (Altera), CPUs and GPUs. Can now handle network protocols: Altera introduced I/O channels that allow kernels to read and write network streams defined by the board designer. P2P over PCIe should be possible (synchronization?). Integration of custom HDL blocks?
~ Matlab to HDL: has anyone tried it? It may help when developing IPs.
-> Interactions with HPC tools such as MPI, DDS and CORBA are quite challenging.


More information

Multiple Public IPs (virtual service IPs) are supported either to cover multiple network segments or to increase network performance.

Multiple Public IPs (virtual service IPs) are supported either to cover multiple network segments or to increase network performance. EliteNAS Cluster Mirroring Option - Introduction Real Time NAS-to-NAS Mirroring & Auto-Failover Cluster Mirroring High-Availability & Data Redundancy Option for Business Continueity Typical Cluster Mirroring

More information

Welcome to Pericom s PCIe and USB3 ReDriver/Repeater Product Training Module.

Welcome to Pericom s PCIe and USB3 ReDriver/Repeater Product Training Module. Welcome to Pericom s PCIe and USB3 ReDriver/Repeater Product Training Module. 1 Pericom has been a leader in providing Signal Integrity Solutions since 2005, with over 60 million units shipped Platforms

More information

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces White Paper Introduction The DDR3 SDRAM memory architectures support higher bandwidths with bus rates of 600 Mbps to 1.6 Gbps (300 to 800 MHz), 1.5V operation for lower power, and higher densities of 2

More information

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

COMPUTER HARDWARE. Input- Output and Communication Memory Systems COMPUTER HARDWARE Input- Output and Communication Memory Systems Computer I/O I/O devices commonly found in Computer systems Keyboards Displays Printers Magnetic Drives Compact disk read only memory (CD-ROM)

More information

USB 3.0 Connectivity using the Cypress EZ-USB FX3 Controller

USB 3.0 Connectivity using the Cypress EZ-USB FX3 Controller USB 3.0 Connectivity using the Cypress EZ-USB FX3 Controller PLC2 FPGA Days June 20, 2012 Stuttgart Martin Heimlicher FPGA Solution Center Content Enclustra Company Profile USB 3.0 Overview What is new?

More information

AFDX networks. Computers and Real-Time Group, University of Cantabria

AFDX networks. Computers and Real-Time Group, University of Cantabria AFDX networks By: J. Javier Gutiérrez (gutierjj@unican.es) Computers and Real-Time Group, University of Cantabria ArtistDesign Workshop on Real-Time System Models for Schedulability Analysis Santander,

More information

HDBaseT Camera. For CCTV / Surveillance. July 2011

HDBaseT Camera. For CCTV / Surveillance. July 2011 Camera For CCTV / Surveillance July 2011 About Alliance The Alliance promotes and standardizes technology for distribution of uncompressed HD multimedia content The Alliance was founded on June 2010, by

More information

3D modeling in PCI Express Gen1 and Gen2 high speed SI simulation

3D modeling in PCI Express Gen1 and Gen2 high speed SI simulation 3D modeling in PCI Express Gen1 and Gen2 high speed SI simulation Runjing Zhou Inner Mongolia University E mail: auzhourj@163.com Jinsong Hu Cadence Design Systems E mail: jshu@cadence.com 17th IEEE Workshop

More information

Packet-based Network Traffic Monitoring and Analysis with GPUs

Packet-based Network Traffic Monitoring and Analysis with GPUs Packet-based Network Traffic Monitoring and Analysis with GPUs Wenji Wu, Phil DeMar wenji@fnal.gov, demar@fnal.gov GPU Technology Conference 2014 March 24-27, 2014 SAN JOSE, CALIFORNIA Background Main

More information

Client and Server System Requirements

Client and Server System Requirements Client and Server System Requirements M inimum Server Requirements Motherboard Asus P4P800, Intel 915G, Intel D865PERL CPU Processor Celeron 2.6/Intel P4 2.0 or greater for DVR s with 4-8 channels Intel

More information

Configure Windows 2012/Windows 2012 R2 with SMB Direct using Emulex OneConnect OCe14000 Series Adapters

Configure Windows 2012/Windows 2012 R2 with SMB Direct using Emulex OneConnect OCe14000 Series Adapters Configure Windows 2012/Windows 2012 R2 with SMB Direct using Emulex OneConnect OCe14000 Series Adapters Emulex OneConnect Ethernet Network Adapters Introduction This document gives an overview of how to

More information

An Oracle Technical White Paper November 2011. Oracle Solaris 11 Network Virtualization and Network Resource Management

An Oracle Technical White Paper November 2011. Oracle Solaris 11 Network Virtualization and Network Resource Management An Oracle Technical White Paper November 2011 Oracle Solaris 11 Network Virtualization and Network Resource Management Executive Overview... 2 Introduction... 2 Network Virtualization... 2 Network Resource

More information

Have both hardware and software. Want to hide the details from the programmer (user).

Have both hardware and software. Want to hide the details from the programmer (user). Input/Output Devices Chapter 5 of Tanenbaum. Have both hardware and software. Want to hide the details from the programmer (user). Ideally have the same interface to all devices (device independence).

More information

ebus Player Quick Start Guide

ebus Player Quick Start Guide ebus Player Quick Start Guide This guide provides you with the information you need to efficiently set up and start using the ebus Player software application to control your GigE Vision or USB3 Vision

More information

Broadcom Ethernet Network Controller Enhanced Virtualization Functionality

Broadcom Ethernet Network Controller Enhanced Virtualization Functionality White Paper Broadcom Ethernet Network Controller Enhanced Virtualization Functionality Advancements in VMware virtualization technology coupled with the increasing processing capability of hardware platforms

More information

High-Density Network Flow Monitoring

High-Density Network Flow Monitoring High-Density Network Flow Monitoring Petr Velan CESNET, z.s.p.o. Zikova 4, 160 00 Praha 6, Czech Republic petr.velan@cesnet.cz Viktor Puš CESNET, z.s.p.o. Zikova 4, 160 00 Praha 6, Czech Republic pus@cesnet.cz

More information

MSITel provides real time telemetry up to 4.8 kbps (2xIridium modem) for balloons/experiments

MSITel provides real time telemetry up to 4.8 kbps (2xIridium modem) for balloons/experiments The MSITel module family allows your ground console to be everywhere while balloon experiments run everywhere MSITel provides real time telemetry up to 4.8 kbps (2xIridium modem) for balloons/experiments

More information

Avoiding pitfalls in PROFINET RT and IRT Node Implementation

Avoiding pitfalls in PROFINET RT and IRT Node Implementation Avoiding pitfalls in PROFINET RT and IRT Node Implementation Prof. Hans D. Doran ZHAW / Institute of Embedded Systems Technikumstrasse 9, 8400 Winterthur, Switzerland E-Mail: hans.doran@zhaw.ch Lukas Itin

More information

Lessons learned from Run2 C-RORC/Clusterfinder Development

Lessons learned from Run2 C-RORC/Clusterfinder Development Lessons learned from Run2 C-RORC/Clusterfinder Development 13.01.2014 Heiko Engel hengel@cern.ch C-RORC Hardware Timeline Kickoff Meeting Start PCB Layout Purchase Preparations Pre-Series Boards done Schematics

More information

FPGAs for Trusted Cloud Computing

FPGAs for Trusted Cloud Computing FPGAs for Trusted Cloud Computing Traditional Servers Datacenter Cloud Servers Datacenter Cloud Manager Client Client Control Client Client Control 2 Existing cloud systems cannot offer strong security

More information

[Download Tech Notes TN-11, TN-18 and TN-25 for more information on D-TA s Record & Playback solution] SENSOR PROCESSING FOR DEMANDING APPLICATIONS 29

[Download Tech Notes TN-11, TN-18 and TN-25 for more information on D-TA s Record & Playback solution] SENSOR PROCESSING FOR DEMANDING APPLICATIONS 29 is an extremely scalable and ultra-fast 10 Gigabit record and playback system. It is designed to work with D-TA sensor signal acquisition products that are 10 Gigabit (10GbE) network attached. The can

More information

Next Generation GPU Architecture Code-named Fermi

Next Generation GPU Architecture Code-named Fermi Next Generation GPU Architecture Code-named Fermi The Soul of a Supercomputer in the Body of a GPU Why is NVIDIA at Super Computing? Graphics is a throughput problem paint every pixel within frame time

More information