PCIe Over Cable Provides Greater Performance at Lower Cost for High Performance Computing (HPC) Clusters from One Stop Systems (OSS)
PCIe Over Cable
PCIe provides greater performance
[Chart: bus bandwidth in GBytes/s for EISA, PCI 32/33, PCI 64/66, PCI-X 64/133, AGP 8X, Gb Ethernet, 10Gb Ethernet, and PCIe x1, x4, x8, x16]
PCIe over Cable vs. Ethernet
- Performance: PCIe over cable spans 2.5Gb/s to 80Gb/s, 3 to 80 times faster than 1Gb Ethernet
- Cost (source: OSS): adapters $100 to $700; cables $30 to $300; switches $600 to $1,200
- Cables: heavy-duty, well-shielded; all cables are cross-over style
- PCIe is best suited for small, local networks
[Chart: price vs. performance for 1Gb Ethernet, 10Gb Ethernet, and PCIe over cable]
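The "3 to 80 times faster" range follows directly from the raw link rates the deck quotes; a quick sketch of the arithmetic, with 1Gb Ethernet as the baseline:

```python
# Speedup of PCIe-over-cable links relative to 1Gb Ethernet,
# using the raw link rates quoted on the slides.
ETHERNET_1G_GBPS = 1.0

# PCIe over cable spans x1 Gen 1 (2.5 Gb/s) up to x16 Gen 2 (80 Gb/s).
pcie_links_gbps = {
    "x1 Gen 1": 2.5,
    "x4 Gen 2": 20.0,
    "x8 Gen 2": 40.0,
    "x16 Gen 2": 80.0,
}

for link, rate in pcie_links_gbps.items():
    speedup = rate / ETHERNET_1G_GBPS
    print(f"{link}: {speedup:g}x the throughput of 1Gb Ethernet")
```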
PCIe vs. InfiniBand
- 40Gb/s InfiniBand bundle: 36-port 40Gb/s InfiniBand switch; eight single-port 40Gb/s PCI Express 2.0 InfiniBand HCA cards; eight 2-meter copper cables. Total price: $10K
- 80Gb/s PCIe bundle: 10-port 80Gb/s PCIe switch with 80Gb host adapter and 2m cable; eight 20Gb/s PCI Express 2.0 host adapters with 2m cables. Total price: $6K
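On the bundle totals quoted above, the saving works out as follows; a minimal sketch using only the prices from the slide:

```python
# Side-by-side bundle pricing from the slide: a 40Gb/s InfiniBand bundle
# vs. an 80Gb/s PCIe-over-cable bundle, both connecting eight hosts.
infiniband_bundle_usd = 10_000   # 36-port switch + 8 HCA cards + 8 cables
pcie_bundle_usd = 6_000          # 10-port switch + 8 host adapters + cables

savings = infiniband_bundle_usd - pcie_bundle_usd
print(f"PCIe bundle saves ${savings:,} "
      f"({savings / infiniband_bundle_usd:.0%} of the InfiniBand price)")
```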
PCI Express Basics: Two Architectures
- Tree: one CPU and multiple I/O boards
- Network: multiple CPUs and multiple I/O boards; requires special hardware and software
[Diagram: CPUs and I/O boards interconnected through a switch]
PCI Express Basics: Lanes, the Key to Performance
- Point-to-point connections, no arbitration
- Each lane consists of two differential pairs: separate transmit (Tx) and receive (Rx) pairs
- 2.5 or 5.0 Gb/s rate per pair; components auto-detect the maximum clock rate
- Multiple lanes are used to increase performance: x1 (pronounced "by one") at 5 Gb/s, x4 at 20 Gb/s, x8 at 40 Gb/s, x16 at 80 Gb/s
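The per-width figures above are just the 5 Gb/s Gen 2 per-lane rate multiplied by the lane count; a quick sketch of that arithmetic (raw signaling rate, before 8b/10b encoding overhead):

```python
# Aggregate PCIe link bandwidth from lane count (raw signaling rate;
# Gen 2 per-lane rate of 5 Gb/s assumed, as quoted on the slide).
GEN2_GBPS_PER_LANE = 5.0

def link_bandwidth_gbps(lanes: int) -> float:
    """Raw bandwidth of a PCIe link with the given lane count."""
    return lanes * GEN2_GBPS_PER_LANE

for width in (1, 4, 8, 16):
    print(f"x{width}: {link_bandwidth_gbps(width):.0f} Gb/s")
```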
Tree Architecture: I/O Expansion
[Diagram: host system connected via a PCIe switch to a RAID array, a PCIe I/O expansion system, and CPCI/CPCIe I/O expansion]
Host Cable Adapters
- PCIe host and cable adapters in x4, x8, and x16 widths
- PCIe cables
Upstream Adapters: PC, laptop, and industrial form factors
Downstream Adapters and Devices: creating downstream PCIe endpoints
- PCIe board adapters
- Backplane interface boards
- Subsystems with PCIe cable inputs
- Backplanes with PCIe cable inputs
Direct Attached Expansion Kits
Direct Attached Multi-port Switches
- Extends the PCIe bus to multiple downstream sub-systems
- One upstream link to multiple downstream links
- Gen 1 and Gen 2 versions
HPC requires substantial infrastructure with:
- Long-life, redundant servers
- GPU accelerators for math co-processing
- High-speed storage or Solid State Disk (SSD) appliances
- High-speed connectivity
GPU Server Architecture
- GPU server: AMD-based motherboard; eight GPUs/SSDs; could also be used as a NAS SSD appliance
- 1U PCIe switch: one x16 Gen 2 uplink, nine x4 Gen 2 downlinks; connects two to eight servers
- Server-to-server communication: 20Gb PCIe, 10Gb Ethernet, or 20Gb InfiniBand
- Multiple 1U or 2U GPU/SSD appliances: two to eight GPUs/SSDs per appliance; 80Gb/s connectivity to the server
- Network connectivity: server to switch at 80Gb/s; server to server at 20Gb/s
[Diagram: GPU server, 1U switch, 1U servers, and GPU/SSD appliances; server-to-appliance connectivity at 80Gb/s]
Latest Server Technology
- Longer life cycles from rugged servers reduce overall cost and downtime
- Reduced depth allows a better fit in shallow racks
- Superior cooling and power
- Latest-technology motherboards and processors provide a wide range of processing options: dual 5500-series Nehalem quad-core or six-core processors, up to 96GB DRAM, and 2TB to 5TB disk drive capacity
Redundant Servers with ExpressNet
- 1U PCIe switch: one 80Gb upstream interface, nine 20Gb downstream interfaces
- Server-to-server communication under Windows or Linux
- Server redundancy
- Network connectivity at 10-20Gb/s
GPU Appliance
- Multiple GPUs support many users simultaneously in virtual networks
- The appliance provides cooling and power not found in servers, for optimal operation and a significant reduction in downtime
- Hot-swappable appliances provide redundancy
[Diagram: 1U switch and 1U servers with GPU/SSD appliances; server to server at 20Gb/s]
Typical GPUs
- GPUs off-load high-end graphics and rendering from the system processors
- GPUs provide rich media and 3D graphics to virtual desktops
- AMD FireStream 9270: 1.2 TFLOPS single precision, 240 GFLOPS double precision
- AMD FireStream 9250: 1.2 TFLOPS single precision, 240 GFLOPS double precision
- ATI Radeon HD 5870: 2.72 TFLOPS single precision, 544 GFLOPS double precision
Solid State Disk (SSD) Appliance
- 1U appliance: one appliance per server; up to four 640GB SSD boards; the server can access 2.5TB of storage
- 2U appliance: one appliance per two servers; up to eight 640GB SSD boards; each server can access 2.5TB of storage
- 1U or 2U GPU/SSD appliance: up to 4 GPUs and 4 SSD boards per server; 80Gb connectivity
- Each appliance can employ four to eight Fusion-io ioDrive Duo boards (for example)
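The 2.5TB-per-server figure is the capacity arithmetic for four 640GB boards; a quick check (decimal GB/TB assumed, as on the slide):

```python
# Per-server SSD capacity: four 640GB SSD boards per server,
# using decimal units (1 TB = 1000 GB) as the slide does.
BOARD_GB = 640
BOARDS_PER_SERVER = 4

total_tb = BOARD_GB * BOARDS_PER_SERVER / 1000
print(f"{total_tb:.2f} TB per server")  # 2.56 TB, quoted as "2.5TB"
```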
Combined GPU and SSD Appliance
- Each 1U GPU/SSD appliance supports up to 2 GPUs and up to 1.25TB of SSD storage per server
- Each 2U GPU/SSD appliance, with a 1U server, supports up to 2 GPUs and up to 1.25TB of SSD storage; with a 3U server, up to 4 GPUs and up to 2.5TB of SSD storage
2U Integrated Server
A 10TFLOP server integrates:
- Istanbul-based motherboard with dual AMD six-core processors
- Up to four double-wide GPUs (AMD 9270 or HD 5870) or eight single-wide GPUs: 2.72TFLOPS each, or 10TFLOPS total processing power
- Dual 1500-watt power supplies
- Four SATA/SAS hot-swappable disk drives
- Superior cooling: 12 chassis fans and 4 power supply fans
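The 10TFLOPS total follows from four double-wide GPUs at the HD 5870's 2.72 TFLOPS single-precision rating; a quick check:

```python
# Aggregate single-precision processing power of the 2U integrated
# server: four double-wide GPUs at 2.72 TFLOPS each (HD 5870 figure).
GPU_TFLOPS = 2.72
GPU_COUNT = 4

total_tflops = GPU_TFLOPS * GPU_COUNT
print(f"{total_tflops:.2f} TFLOPS")  # 10.88, rounded to "10TFLOPS" on the slide
```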
PCIe Connectivity
- Lower cost, lower latency, and less overhead than InfiniBand
- Greater throughput and lower cost than 10Gb Ethernet
- Server-to-server communication over 20Gb PCI Express
- 80Gb connectivity between GPU appliance and server
[Diagram: GPU server/1U switch and 1U servers; server to switch at 80Gb/s, server to server at 20Gb/s, server to appliance at 80Gb/s]
The Future of HPC Clusters
- Latest-technology server supporting up to eight GPUs and/or SSDs
- Long-life, latest-technology, redundant 1U servers
- 20Gb/s PCI Express connectivity between servers
- 1U or 2U GPU/SSD appliances with an 80Gb/s connection to servers
[Diagram: GPU server, 1U switch, 1U servers, and GPU/SSD appliances; server to switch at 80Gb/s, server to server at 20Gb/s, server to appliance at 80Gb/s]
Direct Attached RAID Arrays
- 4-drive RAID: PCIe x4 (10Gb/s) to the RAID controller
- 12-drive RAID: PCIe x8 (20Gb/s) to the RAID controller
- 16-drive RAID: PCIe x8 (20Gb/s) to the RAID controller
Possible Storage Configurations
- Host cable adapter
- PCIe expansion kit with RAID board
- Downstream cable adapter and backplane
Possible Storage Configurations
- PCIe backplane with RAID board
- 3 PCIe x4 slots
- PCIe x4 and x8 cable connectors
GPU Computing/RAID Sub-System
[Diagram: 3U server connected over 80Gb PCIe over cable to a 1U PCIe switch; the switch links a 1U GPU accelerator (with two GPUs) and eight RAID arrays, each over a 20Gb PCIe link]
Global GPU/RAID System
- 10Gb Ethernet switches to the outside world
- 1Gb Ethernet connections between nodes and to redundant switches
[Diagram: Nodes 1 through 4]
Summary
- PCI Express over cable operates from 10Gb/s to 80Gb/s
- A wide assortment of PCIe adapters, switches, and modules is available
- The PCI Express bus can be expanded from PC to I/O or from PC to PC
- The expanding HPC market requires redundant servers, multiple GPUs, high-speed storage, and high-speed connectivity
- Servers with multiple high-speed I/O slots provide the required bandwidth for GPUs and high-speed storage
- GPUs and high-speed storage can be attached to existing servers
- PCIe over cable provides the most economical high-speed connectivity