Tyche: An efficient Ethernet-based protocol for converged networked storage
Transcription
1 Tyche: An efficient Ethernet-based protocol for converged networked storage
Pilar González-Férez and Angelos Bilas
30th International Conference on Massive Storage Systems and Technology (MSST 2014)
June 6, Santa Clara, California
2 Outline
1. Introduction
2. Design
3. Results
4. Conclusions and Future Directions
4 Efficient access to networked storage
- Public clouds use shared storage: lower cost
- Easier to support migration and other operations
- Converged storage places low-latency storage devices in all servers
- Storage requests are exchanged between all compute servers
- The network protocol is important for achieving high I/O throughput
- Modern servers increase the number of cores and NICs
- The cost of accessing storage is a concern as well
- Cannot use custom NICs or controllers in all servers
- Ethernet is the dominant technology for datacenters
- Lower cost and complexity: a single Ethernet network for storage and network data
- How to reduce protocol overheads for accessing remote storage over Ethernet?
5 Efficient access to networked storage (ii)
Challenges
- Synchronization from 10s of cores to a single link
- Link bundling for spatial parallelism
- NUMA affinity
- Dynamic assignment of links to cores
Our goal
- Design a networked storage access protocol that dynamically manages cores, NICs, and NUMA affinity
7 Our Proposal
Tyche: a network storage protocol that efficiently shares remote resources by transparently using several NICs and connections
Design goals
- Connection-oriented protocol
- Edge-based communication subsystem
- Use Ethernet
- Provide RDMA-type operations without any hardware support
- Can be deployed in existing infrastructures
- Create a block device: a local view of a remote storage device
- Support any existing file system
8 Overview
[Figure: Tyche stack in kernel space, spanning the network layer and physical devices. Send path (initiator): VFS, file system, block device, Tyche block layer, Tyche network layer, Ethernet driver. Receive path (target): Ethernet driver, Tyche network layer, Tyche block layer, storage device.]
9 Design Challenges
- Efficiently map I/O requests to network messages
- Memory management
- NUMA affinity
- Synchronization
- Allow high concurrency to saturate many NICs
10 Map I/O Requests to Network Messages
Network messages
- Request/completion messages: I/O requests and completions
- A request message corresponds to a single request packet
- Request packets are transferred as small Ethernet frames (< 100 bytes)
- Data messages: data pages
- RDMA operations: scatter-gather list of memory pages
- Data packets are transferred as Jumbo Ethernet frames
- Zero copy: avoid data copy in the receive path
- For writes, interchange pages with Tyche pages
- For reads, interchange cannot be applied
Ethernet header: information about packets/messages
- Provide end-to-end flow control
- Facilitate communication between block layer and network layer
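As a rough illustration of how a data message (a scatter-gather list of pages) might be cut into Jumbo-frame data packets, here is a minimal Python sketch. The page size, frame payload size, header size, and both function names are assumptions made for the example; the slides do not give these values or describe Tyche's actual packet layout.

```python
# Sketch only: PAGE_SIZE, JUMBO_MTU, and HEADER_BYTES are assumed values,
# not figures from the slides.
PAGE_SIZE = 4096      # bytes per memory page (assumption)
JUMBO_MTU = 9000      # payload bytes of a jumbo Ethernet frame (assumption)
HEADER_BYTES = 64     # hypothetical per-packet protocol header


def pages_per_packet(mtu=JUMBO_MTU, header=HEADER_BYTES, page=PAGE_SIZE):
    """Number of whole pages that fit in one data packet."""
    return (mtu - header) // page


def split_data_message(num_pages, per_packet=None):
    """Split a scatter-gather list of num_pages pages into data packets.

    Returns one (first_page_index, page_count) tuple per packet.
    """
    if per_packet is None:
        per_packet = pages_per_packet()
    if per_packet < 1:
        raise ValueError("a data packet must carry at least one page")
    return [(start, min(per_packet, num_pages - start))
            for start in range(0, num_pages, per_packet)]
```

Under these assumed sizes, two pages fit in each frame, so a 1 MB request (256 pages) would be carried by 128 data packets, alongside a single small request packet.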
11 Memory Management Overhead
Block layer
- remq: queue of pre-allocated request messages
- Request and completion use the same message buffers
- damq: queue of pre-allocated descriptors for data messages
- Target uses pre-allocated pages: avoids alloc/free
- Initiator uses the pages of regular I/O requests
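The idea behind remq, a queue of message buffers allocated once and then reused for both a request and its completion, can be sketched as a small free-list pool. The class name `MessagePool`, its methods, and `MSG_BYTES` are hypothetical; the sketch shows the pre-allocation pattern in general, not Tyche's kernel data structures.

```python
from collections import deque

MSG_BYTES = 100  # assumed buffer size; the slides only say request frames are < 100 bytes


class MessagePool:
    """Pool of message buffers allocated once, at start-up."""

    def __init__(self, size):
        # All buffers are created here; nothing is allocated per request.
        self._slots = [bytearray(MSG_BYTES) for _ in range(size)]
        self._free = deque(range(size))

    def acquire(self):
        """Return the index of a free slot, or None if the pool is exhausted."""
        return self._free.popleft() if self._free else None

    def buffer(self, slot):
        """Buffer for a slot; the completion reuses the request's buffer."""
        return self._slots[slot]

    def release(self, slot):
        """Hand the slot back once the completion has been processed."""
        self._free.append(slot)
```

A request would acquire a slot, keep it while the I/O is in flight, let the completion be written into the same buffer, and release the slot afterwards.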
12 NUMA Affinity
- Maximum throughput only with right placement
- One logical connection per NIC
- Resources are allocated on the NUMA node where the NIC is attached: remq, damq, tx_ring, rx_ring, not_ring
- Private rings
- The connection is selected depending on the location of the buffers of the user's I/O requests
[Figure: two-processor NUMA server with Memory 0 and Memory 1, Processor 0 and Processor 1 (Cores 0 to 7), QPI links, I/O hubs, and PCIe x8 slots.]
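One way to read "connection selected depending on location of buffers" is sketched below: prefer a connection whose NIC sits on the same NUMA node as the request's buffers, and rotate among the candidates. The `ConnectionSelector` class, the round-robin rotation, and the fallback to any connection are assumptions for illustration; the slides do not specify the selection policy.

```python
class ConnectionSelector:
    """Pick a connection whose NIC is on the NUMA node holding the buffers."""

    def __init__(self, connections):
        # connections: list of (connection_name, numa_node) pairs (hypothetical shape)
        self._by_node = {}
        for name, node in connections:
            self._by_node.setdefault(node, []).append(name)
        self._all = [name for name, _ in connections]
        self._next = {}  # per-node round-robin counter

    def select(self, buffer_node):
        # Fall back to every connection if no NIC is attached to that node.
        candidates = self._by_node.get(buffer_node) or self._all
        i = self._next.get(buffer_node, 0)
        self._next[buffer_node] = i + 1
        return candidates[i % len(candidates)]
```

For example, with two NICs on node 0 and one on node 1, requests whose buffers live on node 0 would alternate between the two node-0 connections and never cross the QPI link.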
13 Tyche Overview
[Figure: the overview diagram annotated with per-layer structures. Send path (initiator): Tyche block layer (damq, remq), Tyche network layer (tx_ring_small, tx_ring_big), Ethernet driver. Receive path (target): Ethernet driver, Tyche network layer (not_ring_req, not_ring_data, rx_ring_small, rx_ring_big), Tyche block layer (damq, remq).]
14 Synchronization Overhead
- Context synchronization is reduced for shared structures
- Each connection has its own private resources
Network layer
- Three logical rings
- tx_ring: transmission ring
- rx_ring: receive ring
- not_ring: notification ring
- For each logical ring, 2 different physical rings
- A small ring: request packets
- A large ring: data packets
- Each physical ring has only two sync variables: head and tail
- Initiator specifies fixed positions at remq and damq
- For each packet, the sender specifies its position in rx_ring
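The ring layout described above, one logical ring backed by a small physical ring for request packets and a large one for data packets, each tracked only by head and tail, can be sketched as follows. This is a single-threaded Python illustration with hypothetical class names; it shows the indexing scheme only and omits the locking or atomic updates a kernel implementation would need.

```python
class PhysicalRing:
    """Fixed-size ring whose only bookkeeping is head and tail."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self.head = 0   # index of the next packet to consume
        self.tail = 0   # index of the next free slot

    def put(self, packet):
        if self.tail - self.head == len(self._slots):
            return False            # ring is full
        self._slots[self.tail % len(self._slots)] = packet
        self.tail += 1
        return True

    def get(self):
        if self.head == self.tail:
            return None             # ring is empty
        packet = self._slots[self.head % len(self._slots)]
        self.head += 1
        return packet


class LogicalRing:
    """One logical ring (e.g. tx_ring) made of a small and a big physical ring."""

    def __init__(self, small_capacity, big_capacity):
        self.small = PhysicalRing(small_capacity)   # request packets
        self.big = PhysicalRing(big_capacity)       # data packets

    def put(self, packet, is_data):
        return (self.big if is_data else self.small).put(packet)
```

Keeping request and data packets in separate physical rings means a burst of large data packets cannot fill the slots needed by small request packets, and each ring needs only its own head and tail to be kept consistent.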
15 Synchronization Overhead (ii)
[Figure: send path and receive path across the block layer, network layer, and Ethernet driver, showing I/O request data pages, remq and damq, request and data messages, not_ring_req, not_ring_data, tx_ring_small, tx_ring_big, rx_ring_small, and rx_ring_big, with "L" and "A" marks on individual structures.]
16 Synchronization Overhead (iii)
- Many threads simultaneously issuing write requests cause lock synchronization overhead and lock contention at the NIC level
Two modes of operation
- Inline mode: the application context issues requests with no context switch
- Queue mode: applications insert I/O requests in a Tyche queue
- Several threads submit network requests
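The difference between the two modes can be sketched in user-space Python: in inline mode the caller's own thread runs the send routine, while in queue mode callers only enqueue and a fixed number of worker threads perform the sends. The `Submitter` class, its `send` callback, and the worker count are hypothetical stand-ins; Tyche itself does this inside the kernel.

```python
import queue
import threading


class Submitter:
    """Issue requests inline or through a queue drained by worker threads."""

    def __init__(self, send, mode="inline", workers=2):
        if mode not in ("inline", "queue"):
            raise ValueError("mode must be 'inline' or 'queue'")
        self._send = send          # hypothetical callback that transmits one request
        self._mode = mode
        self._queue = queue.Queue()
        if mode == "queue":
            for _ in range(workers):
                threading.Thread(target=self._drain, daemon=True).start()

    def submit(self, request):
        if self._mode == "inline":
            self._send(request)        # runs in the caller's context, no hand-off
        else:
            self._queue.put(request)   # hand-off to a worker thread

    def _drain(self):
        while True:
            request = self._queue.get()
            self._send(request)
            self._queue.task_done()

    def wait(self):
        """Block until every queued request has been sent."""
        self._queue.join()
```

Queue mode bounds how many threads contend for the send path at the price of a hand-off per request, which matches the trade-off the slides report between small and large request sizes.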
17 Allow High Concurrency to Saturate Many NICs
Tyche scales with load at initiator and target
Send path
- Initiator uses queue mode
- Multiple threads place requests in a queue
- Tyche controls the number of threads accessing each link
- Target uses work queues to send I/O completions back
- One work queue thread per physical core
Receive path
- Network layer: one thread per NIC processes incoming data
- Block layer: several threads per NIC issue/complete requests
Tested up to 6 x 10 Gbits/s
19 Experimental Testbed
Hardware & Software
- Two nodes: 4-core Intel Xeon
- Initiator: 12 GB DDR-III DRAM
- Target: 48 GB DDR-III DRAM, 36 GB used as ramdisk
- 6 Myri10ge cards in each node, connected back to back
- CentOS 6.3, Linux kernel
- Benchmarks: zmio, FIO, HBase+YCSB, Psearchy, Blast, ...
Tyche compared to:
- NBD: Linux Network Block Device (today)
- TSockets: Tyche block layer using the TCP/IP protocol
20 Baseline Performance
zmio, 32 threads, raw device (no file system), 1 MB request size
- Tyche throughput scales with the number of NICs
- Tyche achieves between 82% and 92% of throughput
- Tyche improves throughput around 10x over NBD
[Figure: throughput (GB/s) vs. number of NICs for Tyche, TSockets, and NBD; one panel for read requests and one for write requests.]
21 Impact of Affinity
zmio, 32 threads, raw device (no file system), 1 MB request size
Tyche achieves maximum throughput only with right placement:
- Full-mem placement improves no-affinity performance by up to 97%
- Kmem-NIC placement improves no-affinity performance by up to 54%
[Figure: throughput (GB/s) vs. number of NICs for No affinity, Kmem-NIC, and Full-mem; one panel for read requests and one for write requests.]
22 Receive Path Scaling
zmio, 32 threads, raw device, 4 kB, 64 kB, and 1 MB request sizes
- A single thread can process requests for three NICs: 30 Gbits/s
- By using a thread per NIC:
- Can achieve maximum throughput
- Reduce receive path synchronization
[Figure: throughput (GB/s) vs. number of NICs for 4k-SinTh, 4k-MulTh, 64k-SinTh, 64k-MulTh, 1M-SinTh, and 1M-MulTh; one panel for read requests and one for write requests.]
23 Send Path Scaling
FIO, XFS, 256 MB file size, several threads, each one with its own file
- 4 kB requests: queue mode incurs a context switch
- Inline mode outperforms queue mode by up to 31%
- 512 kB requests: inline mode suffers synchronization overhead and lock contention
- Writes: queue mode outperforms inline mode by up to 45%
[Figure: throughput (GB/s) vs. number of threads for Read-queue, Read-inline, Write-queue, and Write-inline; one panel for 4 kB and one for 512 kB request size.]
24 Queue vs. Inline Mode Overhead: 4 kB
Queue mode pays context switch overhead
- Initiator: CPU utilization increases by up to 29%
- Target: lower throughput; CPU utilization drops by up to 19%
[Figure: CPU utilization (sys + user) vs. number of threads for Read-queue, Read-inline, Write-queue, and Write-inline; one panel for the initiator and one for the target, 4 kB request size.]
25 Queue vs. Inline Mode Overhead: 512 kB
Writes: inline mode suffers synchronization overhead and lock contention
- Initiator: CPU utilization increases by up to 30%
- Target: lower throughput; CPU utilization drops by up to 40%
[Figure: CPU utilization (sys + user) vs. number of threads for Read-queue, Read-inline, Write-queue, and Write-inline; one panel for the initiator and one for the target, 512 kB request size.]
26 Other benchmarks
Tyche always performs better than NBD and TSockets
[Table: throughput (MB/s) for Tyche, NBD, and TSockets by number of NICs. Rows: Psearchy, Blast, IOR-R 512k, IOR-W 512k, HBase-Read, HBase-Insert. Surviving values: Psearchy 1,154 4, ,724; IOR-R 512k 573 1,; IOR-W 512k 603 1,]
27 Conclusions and Future Work
Conclusions
- Tyche: a networked storage protocol
- Transparently uses multiple NICs and multiple connections
- Addresses contention, memory management, and network ordering
- Addresses NUMA affinity issues
- Achieves scalable throughput
- Reads: up to 6.4 GBytes/s (7 max)
- Writes: up to 6.7 GBytes/s (7 max)
- Significantly outperforms NBD and TSockets
Future Directions
- Consider how Tyche can co-exist with other network protocols over Ethernet
28 Tyche: An efficient Ethernet-based protocol for converged networked storage
Pilar González-Férez and Angelos Bilas
FP7-ICT
29 Send Path Overview
[Figure: send path across the block layer, network layer, and Ethernet driver. Write requests: I/O request data pages, remq and damq, request msg and data msg, tx_ring_small and tx_ring_big, tx ring. Read requests: I/O request data pages, remq and damq, request msg, tx_ring_small, tx ring.]
30 Receive Path Overview
[Figure: receive path across the block layer, network layer, and Ethernet driver. Write requests: I/O request data pages, remq and damq, request msg and data msg, not_ring_req and not_ring_data, rx_ring_small and rx_ring_big, rx ring. Read requests: I/O request data pages, remq and damq, request msg, not_ring_req, rx_ring_small, rx ring.]
D1.2 Network Load Balancing
D1. Network Load Balancing Ronald van der Pol, Freek Dijkstra, Igor Idziejczak, and Mark Meijerink SARA Computing and Networking Services, Science Park 11, 9 XG Amsterdam, The Netherlands June [email protected],[email protected],
Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck
Sockets vs. RDMA Interface over 1-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck Pavan Balaji Hemal V. Shah D. K. Panda Network Based Computing Lab Computer Science and Engineering
SMB Direct for SQL Server and Private Cloud
SMB Direct for SQL Server and Private Cloud Increased Performance, Higher Scalability and Extreme Resiliency June, 2014 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server
Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build 164009
Performance Study Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build 164009 Introduction With more and more mission critical networking intensive workloads being virtualized
Performance of Software Switching
Performance of Software Switching Based on papers in IEEE HPSR 2011 and IFIP/ACM Performance 2011 Nuutti Varis, Jukka Manner Department of Communications and Networking (COMNET) Agenda Motivation Performance
Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering
Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays Red Hat Performance Engineering Version 1.0 August 2013 1801 Varsity Drive Raleigh NC
A Packet Forwarding Method for the ISCSI Virtualization Switch
Fourth International Workshop on Storage Network Architecture and Parallel I/Os A Packet Forwarding Method for the ISCSI Virtualization Switch Yi-Cheng Chung a, Stanley Lee b Network & Communications Technology,
Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014
Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet Anand Rangaswamy September 2014 Storage Developer Conference Mellanox Overview Ticker: MLNX Leading provider of high-throughput,
ECLIPSE Performance Benchmarks and Profiling. January 2009
ECLIPSE Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox, Schlumberger HPC Advisory Council Cluster
Intel DPDK Boosts Server Appliance Performance White Paper
Intel DPDK Boosts Server Appliance Performance Intel DPDK Boosts Server Appliance Performance Introduction As network speeds increase to 40G and above, both in the enterprise and data center, the bottlenecks
Network Virtualization Technologies and their Effect on Performance
Network Virtualization Technologies and their Effect on Performance Dror Goldenberg VP Software Architecture TCE NFV Winter School 2015 Cloud Computing and NFV Cloud - scalable computing resources (CPU,
Oracle Database Scalability in VMware ESX VMware ESX 3.5
Performance Study Oracle Database Scalability in VMware ESX VMware ESX 3.5 Database applications running on individual physical servers represent a large consolidation opportunity. However enterprises
Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand
Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand P. Balaji, K. Vaidyanathan, S. Narravula, K. Savitha, H. W. Jin D. K. Panda Network Based
LS DYNA Performance Benchmarks and Profiling. January 2009
LS DYNA Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center The
High-performance vswitch of the user, by the user, for the user
A bird in cloud High-performance vswitch of the user, by the user, for the user Yoshihiro Nakajima, Wataru Ishida, Tomonori Fujita, Takahashi Hirokazu, Tomoya Hibi, Hitoshi Matsutahi, Katsuhiro Shimano
High-Density Network Flow Monitoring
Petr Velan [email protected] High-Density Network Flow Monitoring IM2015 12 May 2015, Ottawa Motivation What is high-density flow monitoring? Monitor high traffic in as little rack units as possible
Big Data Technologies for Ultra-High-Speed Data Transfer and Processing
White Paper Intel Xeon Processor E5 Family Big Data Analytics Cloud Computing Solutions Big Data Technologies for Ultra-High-Speed Data Transfer and Processing Using Technologies from Aspera and Intel
The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology
3. The Lagopus SDN Software Switch Here we explain the capabilities of the new Lagopus software switch in detail, starting with the basics of SDN and OpenFlow. 3.1 SDN and OpenFlow Those engaged in network-related
AIX NFS Client Performance Improvements for Databases on NAS
AIX NFS Client Performance Improvements for Databases on NAS October 20, 2005 Sanjay Gulabani Sr. Performance Engineer Network Appliance, Inc. [email protected] Diane Flemming Advisory Software Engineer
Can High-Performance Interconnects Benefit Memcached and Hadoop?
Can High-Performance Interconnects Benefit Memcached and Hadoop? D. K. Panda and Sayantan Sur Network-Based Computing Laboratory Department of Computer Science and Engineering The Ohio State University,
Chronicle: Capture and Analysis of NFS Workloads at Line Rate
Chronicle: Capture and Analysis of NFS Workloads at Line Rate Ardalan Kangarlou, Sandip Shete, and John Strunk Advanced Technology Group 1 Motivation Goal: To gather insights from customer workloads via
COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service
COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Eddie Dong, Yunhong Jiang 1 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,
Windows 8 SMB 2.2 File Sharing Performance
Windows 8 SMB 2.2 File Sharing Performance Abstract This paper provides a preliminary analysis of the performance capabilities of the Server Message Block (SMB) 2.2 file sharing protocol with 10 gigabit
A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks
A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks Xiaoyi Lu, Md. Wasi- ur- Rahman, Nusrat Islam, and Dhabaleswar K. (DK) Panda Network- Based Compu2ng Laboratory Department
Linux NIC and iscsi Performance over 40GbE
Linux NIC and iscsi Performance over 4GbE Chelsio T8-CR vs. Intel Fortville XL71 Executive Summary This paper presents NIC and iscsi performance results comparing Chelsio s T8-CR and Intel s latest XL71
Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1
Performance Study Performance Characteristics of and RDM VMware ESX Server 3.0.1 VMware ESX Server offers three choices for managing disk access in a virtual machine VMware Virtual Machine File System
Packet-based Network Traffic Monitoring and Analysis with GPUs
Packet-based Network Traffic Monitoring and Analysis with GPUs Wenji Wu, Phil DeMar [email protected], [email protected] GPU Technology Conference 2014 March 24-27, 2014 SAN JOSE, CALIFORNIA Background Main
Assessing the Performance of Virtualization Technologies for NFV: a Preliminary Benchmarking
Assessing the Performance of Virtualization Technologies for NFV: a Preliminary Benchmarking Roberto Bonafiglia, Ivano Cerrato, Francesco Ciaccia, Mario Nemirovsky, Fulvio Risso Politecnico di Torino,
Evaluation Report: Emulex OCe14102 10GbE and OCe14401 40GbE Adapter Comparison with Intel X710 10GbE and XL710 40GbE Adapters
Evaluation Report: Emulex OCe14102 10GbE and OCe14401 40GbE Adapter Comparison with Intel X710 10GbE and XL710 40GbE Adapters Evaluation report prepared under contract with Emulex Executive Summary As
Performance Comparison of Fujitsu PRIMERGY and PRIMEPOWER Servers
WHITE PAPER FUJITSU PRIMERGY AND PRIMEPOWER SERVERS Performance Comparison of Fujitsu PRIMERGY and PRIMEPOWER Servers CHALLENGE Replace a Fujitsu PRIMEPOWER 2500 partition with a lower cost solution that
Benchmarking Cassandra on Violin
Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
Building Enterprise-Class Storage Using 40GbE
Building Enterprise-Class Storage Using 40GbE Unified Storage Hardware Solution using T5 Executive Summary This white paper focuses on providing benchmarking results that highlight the Chelsio T5 performance
Where IT perceptions are reality. Test Report. OCe14000 Performance. Featuring Emulex OCe14102 Network Adapters Emulex XE100 Offload Engine
Where IT perceptions are reality Test Report OCe14000 Performance Featuring Emulex OCe14102 Network Adapters Emulex XE100 Offload Engine Document # TEST2014001 v9, October 2014 Copyright 2014 IT Brand
Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 SMB Direct
Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 Direct Increased Performance, Scaling and Resiliency July 2012 Motti Beck, Director, Enterprise Market Development [email protected]
Demartek June 2012. Broadcom FCoE/iSCSI and IP Networking Adapter Evaluation. Introduction. Evaluation Environment
June 212 FCoE/iSCSI and IP Networking Adapter Evaluation Evaluation report prepared under contract with Corporation Introduction Enterprises are moving towards 1 Gigabit networking infrastructures and
Performance Characteristics of Large SMP Machines
Performance Characteristics of Large SMP Machines Dirk Schmidl, Dieter an Mey, Matthias S. Müller [email protected] Rechen- und Kommunikationszentrum (RZ) Agenda Investigated Hardware Kernel Benchmark
Cray DVS: Data Virtualization Service
Cray : Data Virtualization Service Stephen Sugiyama and David Wallace, Cray Inc. ABSTRACT: Cray, the Cray Data Virtualization Service, is a new capability being added to the XT software environment with
The Transition to PCI Express* for Client SSDs
The Transition to PCI Express* for Client SSDs Amber Huffman Senior Principal Engineer Intel Santa Clara, CA 1 *Other names and brands may be claimed as the property of others. Legal Notices and Disclaimers
The proliferation of the raw processing
TECHNOLOGY CONNECTED Advances with System Area Network Speeds Data Transfer between Servers with A new network switch technology is targeted to answer the phenomenal demands on intercommunication transfer
Intel Data Direct I/O Technology (Intel DDIO): A Primer >
Intel Data Direct I/O Technology (Intel DDIO): A Primer > Technical Brief February 2012 Revision 1.0 Legal Statements INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,
An Analysis of 8 Gigabit Fibre Channel & 10 Gigabit iscsi in Terms of Performance, CPU Utilization & Power Consumption
An Analysis of 8 Gigabit Fibre Channel & 1 Gigabit iscsi in Terms of Performance, CPU Utilization & Power Consumption An Analysis of 8 Gigabit Fibre Channel & 1 Gigabit iscsi 1 Key Findings Third I/O found
HP ProLiant BL660c Gen9 and Microsoft SQL Server 2014 technical brief
Technical white paper HP ProLiant BL660c Gen9 and Microsoft SQL Server 2014 technical brief Scale-up your Microsoft SQL Server environment to new heights Table of contents Executive summary... 2 Introduction...
RoCE vs. iwarp Competitive Analysis
WHITE PAPER August 21 RoCE vs. iwarp Competitive Analysis Executive Summary...1 RoCE s Advantages over iwarp...1 Performance and Benchmark Examples...3 Best Performance for Virtualization...4 Summary...
SMB Advanced Networking for Fault Tolerance and Performance. Jose Barreto Principal Program Managers Microsoft Corporation
SMB Advanced Networking for Fault Tolerance and Performance Jose Barreto Principal Program Managers Microsoft Corporation Agenda SMB Remote File Storage for Server Apps SMB Direct (SMB over RDMA) SMB Multichannel
Datacenter Operating Systems
Datacenter Operating Systems CSE451 Simon Peter With thanks to Timothy Roscoe (ETH Zurich) Autumn 2015 This Lecture What s a datacenter Why datacenters Types of datacenters Hyperscale datacenters Major
Virtualization Performance on SGI UV 2000 using Red Hat Enterprise Linux 6.3 KVM
White Paper Virtualization Performance on SGI UV 2000 using Red Hat Enterprise Linux 6.3 KVM September, 2013 Author Sanhita Sarkar, Director of Engineering, SGI Abstract This paper describes how to implement
Variations in Performance and Scalability when Migrating n-tier Applications to Different Clouds
Variations in Performance and Scalability when Migrating n-tier Applications to Different Clouds Deepal Jayasinghe, Simon Malkowski, Qingyang Wang, Jack Li, Pengcheng Xiong, Calton Pu Outline Motivation
Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage
White Paper Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage A Benchmark Report August 211 Background Objectivity/DB uses a powerful distributed processing architecture to manage
Memory Channel Storage ( M C S ) Demystified. Jerome McFarland
ory nel Storage ( M C S ) Demystified Jerome McFarland Principal Product Marketer AGENDA + INTRO AND ARCHITECTURE + PRODUCT DETAILS + APPLICATIONS THE COMPUTE-STORAGE DISCONNECT + Compute And Data Have
Broadcom Ethernet Network Controller Enhanced Virtualization Functionality
White Paper Broadcom Ethernet Network Controller Enhanced Virtualization Functionality Advancements in VMware virtualization technology coupled with the increasing processing capability of hardware platforms
Enabling Technologies for Distributed and Cloud Computing
Enabling Technologies for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Multi-core CPUs and Multithreading
Network Attached Storage. Jinfeng Yang Oct/19/2015
Network Attached Storage Jinfeng Yang Oct/19/2015 Outline Part A 1. What is the Network Attached Storage (NAS)? 2. What are the applications of NAS? 3. The benefits of NAS. 4. NAS s performance (Reliability
Enabling Technologies for Distributed Computing
Enabling Technologies for Distributed Computing Dr. Sanjay P. Ahuja, Ph.D. Fidelity National Financial Distinguished Professor of CIS School of Computing, UNF Multi-core CPUs and Multithreading Technologies
A Comparative Study on Vega-HTTP & Popular Open-source Web-servers
A Comparative Study on Vega-HTTP & Popular Open-source Web-servers Happiest People. Happiest Customers Contents Abstract... 3 Introduction... 3 Performance Comparison... 4 Architecture... 5 Diagram...
Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software
WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications
Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago
Globus Striped GridFTP Framework and Server Raj Kettimuthu, ANL and U. Chicago Outline Introduction Features Motivation Architecture Globus XIO Experimental Results 3 August 2005 The Ohio State University
Networking Virtualization Using FPGAs
Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Massachusetts,
Virtualised MikroTik
Virtualised MikroTik MikroTik in a Virtualised Hardware Environment Speaker: Tom Smyth CTO Wireless Connect Ltd. Event: MUM Krackow Feb 2008 http://wirelessconnect.eu/ Copyright 2008 1 Objectives Understand
Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks
WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance
Storage at a Distance; Using RoCE as a WAN Transport
Storage at a Distance; Using RoCE as a WAN Transport Paul Grun Chief Scientist, System Fabric Works, Inc. (503) 620-8757 [email protected] Why Storage at a Distance the Storage Cloud Following
Solving I/O Bottlenecks to Enable Superior Cloud Efficiency
WHITE PAPER Solving I/O Bottlenecks to Enable Superior Cloud Efficiency Overview...1 Mellanox I/O Virtualization Features and Benefits...2 Summary...6 Overview We already have 8 or even 16 cores on one
HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring
CESNET Technical Report 2/2014 HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring VIKTOR PUš, LUKÁš KEKELY, MARTIN ŠPINLER, VÁCLAV HUMMEL, JAN PALIČKA Received 3. 10. 2014 Abstract
- An Essential Building Block for Stable and Reliable Compute Clusters
Ferdinand Geier ParTec Cluster Competence Center GmbH, V. 1.4, March 2005 Cluster Middleware - An Essential Building Block for Stable and Reliable Compute Clusters Contents: Compute Clusters a Real Alternative
POSIX and Object Distributed Storage Systems
1 POSIX and Object Distributed Storage Systems Performance Comparison Studies With Real-Life Scenarios in an Experimental Data Taking Context Leveraging OpenStack Swift & Ceph by Michael Poat, Dr. Jerome
RDMA over Ethernet - A Preliminary Study
RDMA over Ethernet - A Preliminary Study Hari Subramoni, Miao Luo, Ping Lai and Dhabaleswar. K. Panda Computer Science & Engineering Department The Ohio State University Outline Introduction Problem Statement
Accelerate SQL Server 2014 AlwaysOn Availability Groups with Seagate. Nytro Flash Accelerator Cards
Accelerate SQL Server 2014 AlwaysOn Availability Groups with Seagate Nytro Flash Accelerator Cards Technology Paper Authored by: Mark Pokorny, Database Engineer, Seagate Overview SQL Server 2014 provides
Next Generation Operating Systems
Next Generation Operating Systems Zeljko Susnjar, Cisco CTG June 2015 The end of CPU scaling Future computing challenges Power efficiency Performance == parallelism Cisco Confidential 2 Paradox of the
How System Settings Impact PCIe SSD Performance
How System Settings Impact PCIe SSD Performance Suzanne Ferreira R&D Engineer Micron Technology, Inc. July, 2012 As solid state drives (SSDs) continue to gain ground in the enterprise server and storage
Shared Parallel File System
Shared Parallel File System Fangbin Liu [email protected] System and Network Engineering University of Amsterdam Shared Parallel File System Introduction of the project The PVFS2 parallel file system
THE EXPAND PARALLEL FILE SYSTEM A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING. José Daniel García Sánchez ARCOS Group University Carlos III of Madrid
THE EXPAND PARALLEL FILE SYSTEM A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING José Daniel García Sánchez ARCOS Group University Carlos III of Madrid Contents 2 The ARCOS Group. Expand motivation. Expand
Benchmarking Hadoop & HBase on Violin
Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages
GigE Vision cameras and network performance
GigE Vision cameras and network performance by Jan Becvar - Leutron Vision http://www.leutron.com 1 Table of content Abstract...2 Basic terms...2 From trigger to the processed image...4 Usual system configurations...4
10G Ethernet: The Foundation for Low-Latency, Real-Time Financial Services Applications and Other, Future Cloud Applications
10G Ethernet: The Foundation for Low-Latency, Real-Time Financial Services Applications and Other, Future Cloud Applications Testing conducted by Solarflare Communications and Arista Networks shows that
Microsoft SQL Server 2012 on Cisco UCS with iscsi-based Storage Access in VMware ESX Virtualization Environment: Performance Study
White Paper Microsoft SQL Server 2012 on Cisco UCS with iscsi-based Storage Access in VMware ESX Virtualization Environment: Performance Study 2012 Cisco and/or its affiliates. All rights reserved. This
1-Gigabit TCP Offload Engine
White Paper 1-Gigabit TCP Offload Engine Achieving greater data center efficiencies by providing Green conscious and cost-effective reductions in power consumption. June 2009 Background Broadcom is a recognized
Windows Server 2008 R2 Hyper-V Live Migration
Windows Server 2008 R2 Hyper-V Live Migration Table of Contents Overview of Windows Server 2008 R2 Hyper-V Features... 3 Dynamic VM storage... 3 Enhanced Processor Support... 3 Enhanced Networking Support...
RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University
RAMCloud and the Low- Latency Datacenter John Ousterhout Stanford University Most important driver for innovation in computer systems: Rise of the datacenter Phase 1: large scale Phase 2: low latency Introduction
COS 318: Operating Systems. Virtual Machine Monitors
COS 318: Operating Systems Virtual Machine Monitors Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Introduction Have been around
DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION
DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION A DIABLO WHITE PAPER AUGUST 2014 Ricky Trigalo Director of Business Development Virtualization, Diablo Technologies
Building High-Performance iSCSI SAN Configurations. An Alacritech and McDATA Technical Note
Building High-Performance iSCSI SAN Configurations An Alacritech and McDATA Technical Note Internet SCSI (iSCSI)
SALSA Flash-Optimized Software-Defined Storage
Flash-Optimized Software-Defined Storage Nikolas Ioannou, Ioannis Koltsidas, Roman Pletka, Sasa Tomic, Thomas Weigold IBM Research Zurich 1 New Market Category of Big Data Flash Multiple workloads don't
A Performance Analysis of the iSCSI Protocol
A Performance Analysis of the iSCSI Protocol Stephen Aiken [email protected] Dirk Grunwald [email protected] Jesse Willeke [email protected] Andrew R. Pleszkun [email protected]
High Performance Data-Transfers in Grid Environment using GridFTP over InfiniBand
High Performance Data-Transfers in Grid Environment using GridFTP over InfiniBand Hari Subramoni *, Ping Lai *, Raj Kettimuthu **, Dhabaleswar. K. (DK) Panda * * Computer Science and Engineering Department
Boosting Data Transfer with TCP Offload Engine Technology
Boosting Data Transfer with TCP Offload Engine Technology on Ninth-Generation Dell PowerEdge Servers TCP/IP Offload Engine (TOE) technology makes its debut in the ninth generation of Dell PowerEdge servers,
Performance Guideline for syslog-ng Premium Edition 5 LTS
Performance Guideline for syslog-ng Premium Edition 5 LTS May 08, 2015 Abstract Performance analysis of syslog-ng Premium Edition Copyright 1996-2015 BalaBit S.a.r.l. Table of Contents 1. Preface... 3
Performance Analysis of Large Receive Offload in a Xen Virtualized System
Performance Analysis of Large Receive Offload in a Xen Virtualized System Hitoshi Oi and Fumio Nakajima The University of Aizu, Aizu Wakamatsu, JAPAN {oi,f.nkjm}@oslab.biz Abstract System-level virtualization
Network Traffic Monitoring & Analysis with GPUs
Network Traffic Monitoring & Analysis with GPUs Wenji Wu, Phil DeMar [email protected], [email protected] GPU Technology Conference 2013 March 18-21, 2013 SAN JOSE, CALIFORNIA Background Main uses for network
Dell Microsoft SQL Server 2008 Fast Track Data Warehouse Performance Characterization
Dell Microsoft SQL Server 2008 Fast Track Data Warehouse Performance Characterization A Dell Technical White Paper Database Solutions Engineering Dell Product Group Anthony Fernandez Jisha J Executive
Introduction to TCP Offload Engines. As network interconnect speeds advance to Gigabit
Introduction to TCP Offload Engines By implementing a TCP Offload Engine (TOE) in high-speed computing environments, administrators can help relieve network bottlenecks and improve application performance.
Network Performance Optimisation and Load Balancing. Wulf Thannhaeuser
Network Performance Optimisation and Load Balancing Wulf Thannhaeuser 1 Network Performance Optimisation 2 Network Optimisation: Where? Fixed latency 4.0 µs Variable latency
Oracle Exadata: The World's Fastest Database Machine Exadata Database Machine Architecture
Oracle Exadata: The World's Fastest Database Machine Exadata Database Machine Architecture Ron Weiss, Exadata Product Management Exadata Database Machine Best Platform to Run the
Windows Server 2008 R2 Hyper-V Live Migration
Windows Server 2008 R2 Hyper-V Live Migration White Paper Published: August 09 This is a preliminary document and may be changed substantially prior to final commercial release of the software described
MIDeA: A Multi-Parallel Intrusion Detection Architecture
MIDeA: A Multi-Parallel Intrusion Detection Architecture Giorgos Vasiliadis, FORTH-ICS, Greece Michalis Polychronakis, Columbia U., USA Sotiris Ioannidis, FORTH-ICS, Greece CCS 2011, 19 October 2011 Network
Performance Analysis of IPv4 v/s IPv6 in Virtual Environment Using UBUNTU
Performance Analysis of IPv4 v/s IPv6 in Virtual Environment Using UBUNTU Savita Shiwani Computer Science, Gyan Vihar University, Rajasthan, India G.N. Purohit AIM & ACT, Banasthali University, Banasthali,
Infrastructure Matters: POWER8 vs. Xeon x86
Advisory Infrastructure Matters: POWER8 vs. Xeon x86 Executive Summary This report compares IBM's new POWER8-based scale-out Power System to Intel E5 v2 x86-based scale-out systems. A follow-on report
