Experiences with TCP Acceleration Services. Dave Hartzell CSC / NASA Advanced Supercomputing David.hartzell@nasa.gov

1 Experiences with TCP Acceleration Services Dave Hartzell CSC / NASA Advanced Supercomputing David.hartzell@nasa.gov

2 Goal: Understand whether enterprise WAN acceleration products can assist with user file transfers. Understand the products and their integration into the network. Test the acceleration solution with real users and real applications.

3 Background: We spent several years attempting to improve user transfer performance to and from the Columbia and Pleiades supercomputers at NASA Ames (CA). Users are dispersed at other NASA centers, mostly on the East Coast and in the southern US.

4 Background (2): NASA users reach the systems over either the NREN or NISN networks. Linux and Mac are the prevalent desktop systems for HEC users. With a team of people supporting users, we did see improvements in TB/month. TCP tuning improved performance for some users; for others, gains remained elusive. Could TCP acceleration assist further?

5 The Challenges: The Wizard Gap continues. Linux OS support: sometimes upgrading to the latest OS isn't an option (e.g. stuck with RHEL 3 or 4); Linux can be tuned, but parameters get blown away with an update; not all users have root, and some admins don't care or help; some versions of Linux come with 4 MB autotuning enabled, but tcp_rmem and tcp_wmem set at 128K! Field center network performance issues: no jumbo frames and sometimes no GigE; packet loss and congestion; firewall CPU congestion.
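A rough way to see why a 128K buffer limit matters is to compare the maximum field of tcp_rmem / tcp_wmem with the bandwidth-delay product of the path. The sketch below is illustrative only: the helper names are hypothetical, the 300 Mbps target and 170 ms RTT are taken from other slides in this deck, and the check ignores kernel buffer overhead, so it is optimistic.

```python
from pathlib import Path


def parse_tcp_mem(text):
    """Parse the 'min default max' triple used by tcp_rmem and tcp_wmem."""
    lo, default, hi = (int(field) for field in text.split())
    return lo, default, hi


def max_buffer_too_small(text, rtt_s, target_bps):
    """Return True if the max buffer cannot hold one bandwidth-delay product.

    Simplification (assumption): treats the whole buffer as usable window.
    """
    _, _, hi = parse_tcp_mem(text)
    bdp_bytes = target_bps * rtt_s / 8
    return hi < bdp_bytes


def read_sysctl(name):
    """Read a setting such as 'tcp_rmem' from /proc on a Linux host."""
    return (Path("/proc/sys/net/ipv4") / name).read_text()


# Hypothetical example value with a 128K maximum, as described on the slide.
EXAMPLE_128K = "4096 87380 131072"
```

On a Linux host, `max_buffer_too_small(read_sysctl("tcp_rmem"), 0.170, 300e6)` would report whether the current receive limit falls short of that target.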

6 WAN / TCP Acceleration: Target users were getting 30 Mbps and needed more; 300 Mbps should be the minimum! Other users get 300 Mbps easily. Solution? Enterprise WAN acceleration: Cisco WAAS, Riverbed, Juniper WX / WXC.

7 Cisco WAAS: Wide Area Application Services. Accelerates many different applications and protocols over the WAN while optimizing WAN bandwidth. Aimed mostly at distributed enterprises that want to increase application performance and reduce WAN costs.

8 Cisco WAAS (2): Basic TCP acceleration is a key function of WAAS. TFO (Transport Flow Optimization): TCP proxy and windowing tricks. DRE (Data Redundancy Elimination): blocks are replaced with small signatures for transmission over the WAN. LZ (Lempel-Ziv): real-time data compression.
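The DRE and LZ ideas can be illustrated with a toy encoder. This is a minimal sketch and not Cisco's implementation: the fixed 4096-byte block size, the SHA-256 signatures, and all function names are assumptions chosen for illustration, with zlib standing in for the LZ stage.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed block size; real products chunk differently


def dre_encode(data, cache):
    """Replace blocks already in the sender cache with a 32-byte signature."""
    tokens = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        sig = hashlib.sha256(block).digest()
        if sig in cache:
            tokens.append(("ref", sig))  # redundant block: send signature only
        else:
            cache[sig] = block
            tokens.append(("raw", zlib.compress(block)))  # new block: LZ-compress
    return tokens


def dre_decode(tokens, cache):
    """Rebuild the data, learning new blocks into the receiver cache."""
    parts = []
    for kind, payload in tokens:
        if kind == "ref":
            parts.append(cache[payload])
        else:
            block = zlib.decompress(payload)
            cache[hashlib.sha256(block).digest()] = block
            parts.append(block)
    return b"".join(parts)


def wire_bytes(tokens):
    """Payload bytes that would cross the WAN (ignores framing overhead)."""
    return sum(len(payload) for _, payload in tokens)
```

Sending the same data twice through `dre_encode` with a shared sender cache shrinks the second transfer to one signature per block, which is the effect the repeated-transfer tests later in the deck are designed to expose.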

9 How Does WAAS Work? (2): TFO (Transport Flow Optimization), aka TCP acceleration, uses a TCP proxy, adaptive buffering, large initial windows and other black magic. Does WAAS violate the end-to-end model? Probably. But we didn't notice any issues...

10 How WAAS Helps: Hosts see each other with a LAN-like RTT, even over a large WAN. TCP ramp-up and recovery times improve. Packet loss has less impact. WAAS takes advantage of BIC (even if clients run other congestion control algorithms).
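One way to see why a LAN-like RTT makes packet loss less painful is the well-known Mathis et al. approximation, which bounds steady-state throughput by MSS / (RTT * sqrt(loss)). The sketch below assumes that Reno-style model, a 1460-byte MSS, and a 1 ms LAN RTT, none of which come from the slides; BIC and CUBIC behave differently, so treat the numbers as an illustration of the RTT dependence only.

```python
import math


def mathis_ceiling_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate loss-limited TCP throughput (Mathis et al. model).

    throughput <= C * MSS / (RTT * sqrt(p)), with C = sqrt(3/2).
    Assumption: Reno-style congestion avoidance with random loss.
    """
    c = math.sqrt(3 / 2)
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))


# Same 0.1% loss, seen over the full WAN RTT versus a hypothetical LAN-like RTT.
wan = mathis_ceiling_bps(1460, 0.170, 0.001)
lan = mathis_ceiling_bps(1460, 0.001, 0.001)
print(f"WAN-RTT ceiling: {wan / 1e6:.1f} Mbps, LAN-RTT ceiling: {lan / 1e6:.1f} Mbps")
```

Because the ceiling scales with 1/RTT, a host that recovers from loss over a short local segment to the proxy is far less constrained than one recovering across the whole 170 ms path.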

11 WAAS Testing: Phase 1: tested in the lab, but with a WAN loopback from CA to Washington, DC (inline mode). Phase 2: deployed on NREN and tested with the lab (inline mode and policy routed). Phase 3: limited user testing (policy routed).

12 Deployment Methods: Inline (Phase 1): WAAS sits inline on the network as a Layer 1 bump. It appeared very transparent; hosts were unaware of the inline WAAS appliance. Works for single hosts or servers, whole subnets, or whole networks.

13 Test Hosts: Test hosts were stock, untuned Ubuntu 8.10 systems with GigE. No TCP optimization (e.g. windows, BIC, etc.) was done on the hosts. Stock Ubuntu 8.10 runs a Linux kernel with TCP CUBIC on by default and auto-tuning enabled.

14 Packet Loss: Packet loss was introduced via another Linux box running NetEm, simulating a lossy LAN or a poorly performing center firewall. With WAAS acceleration, hosts saw improved throughput even with significant packet loss (between 0.1% and 1%).
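For readers who want to reproduce this kind of loss injection, NetEm is driven through the Linux `tc` command. The sketch below only builds the argument list; the interface name is a placeholder and the exact NetEm settings used in these tests are not given in the slides, so this is an assumption about a typical setup rather than the configuration that was run.

```python
def netem_loss_command(interface, loss_percent):
    """Build a tc command that adds random packet loss on an interface.

    Equivalent shell form: tc qdisc add dev <interface> root netem loss <N>%
    """
    if not 0 <= loss_percent <= 100:
        raise ValueError("loss_percent must be between 0 and 100")
    return ["tc", "qdisc", "add", "dev", interface, "root",
            "netem", "loss", f"{loss_percent}%"]


# "eth0" is a placeholder interface name (assumption).
print(" ".join(netem_loss_command("eth0", 0.1)))
```

Applying the command requires root on the NetEm box, for example via `subprocess.run(netem_loss_command("eth0", 0.1), check=True)`.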

15 nuttcp Lab Test Results: Saw significant improvement with WAAS. On a 170 ms RTT WAN link, default nuttcp (TCP memory-to-memory) throughput was about 59 Mbps (~1 MB window). With WAAS inline, rates jumped to around 500 Mbps (with TFO, LZ and DRE enabled on WAAS).
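The untuned result is roughly what window-limited TCP predicts: a single stream cannot exceed one window per round trip. The sketch below works through that arithmetic for the 170 ms link; the function names are illustrative, and treating "~1 MB" as exactly 2^20 bytes is an assumption.

```python
def window_limited_bps(window_bytes, rtt_s):
    """Upper bound on single-stream throughput: one window per round trip."""
    return window_bytes * 8 / rtt_s


def window_needed_bytes(target_bps, rtt_s):
    """Bandwidth-delay product: the window needed to sustain target_bps."""
    return target_bps * rtt_s / 8


rtt = 0.170  # seconds, from the slide
print(f"1 MiB window ceiling: {window_limited_bps(2**20, rtt) / 1e6:.0f} Mbps")
print(f"Window for 500 Mbps: {window_needed_bytes(500e6, rtt) / 1e6:.1f} MB")
```

A window of about 1 MB caps the stream at roughly 50 Mbps, in line with the measured 59 Mbps for a window slightly above 1 MB, while sustaining 500 Mbps end to end would need a window of more than 10 MB.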

16 WAAS Lab nuttcp Results [Chart: nuttcp throughput in Mbps.] Note: nuttcp test data is highly compressible and easily cached.
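The caveat about compressible test data can be demonstrated with any LZ-family compressor. In this sketch a buffer of repeated bytes stands in for a synthetic benchmark payload (an assumption; the exact bytes nuttcp sends are not described in the slides) and random bytes stand in for data that does not compress.

```python
import os
import zlib


def compression_ratio(data):
    """Original size divided by zlib-compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


synthetic = b"\x00" * 1_000_000        # stand-in for a repetitive test payload
incompressible = os.urandom(1_000_000)  # stand-in for already-dense data

print(f"synthetic:      {compression_ratio(synthetic):.0f}x")
print(f"incompressible: {compression_ratio(incompressible):.2f}x")
```

The gap between the two ratios is why memory-to-memory benchmark results through a compressing, caching appliance should be checked against transfers of real datasets.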

17 bbftp Testing: bbftp is commonly used to transfer data on and off the NASA systems and supports multi-stream file transfers. Multiple tests were run to see the effects of caching (DRE) over repeated transfers. Real scientific datasets were used to better understand the impact. Tests were run with single and multiple streams.

18 bbftp Lab Test [Chart: bbftp transfer, 1 stream, 256K per stream, 200 MB dataset.]

19 Impact on Tuned Hosts? Testing was done to verify that WAAS would not adversely affect well-tuned hosts. With bbftp, WAAS did not hurt performance, and it still helps (through DRE caching and LZ compression).

20 WAAS bbftp Test (tuned host) [Chart, results per run: tuned host, 8 MB per stream, 8 streams, 4 GB data, full WAAS optimizations (TFO, LZ and DRE).]

21 User Testing: WAAS looked good in a lab environment with a real WAN and injected packet loss. Next step: testing with real users. WAAS was deployed, but not inline; this required the use of Web Cache Communication Protocol (WCCP) or policy routing on existing NREN routers/switches (6500s).

22 NREN Deployment [Diagram: Langley Research Center and NASA Advanced Supercomputing (Ames, CA) connected across NREN, with PBR at each site.] Utilized policy-based routing (PBR) on 6500s in one-armed routing mode.

23 User Experience/Feedback: Tested with a small subset of users at NASA Langley. Users experienced a significant increase in bbftp throughput, on the order of 3 to 5 times their pre-WAAS rates (e.g. from 120 Mbps to well over 450 Mbps). Considered a successful trial deployment.

24 Results: WAAS works, helping users get better rates than without WAAS (on tuned or untuned hosts). WAAS can help overcome LAN packet loss issues (in my case the WAN wasn't the problem). While it won't do GigE rates, it can still help users get half that rate. With DRE/LZ, WAN utilization can decrease while hosts see LAN utilization increase!

25 But What About SSH? Tested SCP (v4.7 and 5.0) and HPN-SCP. WAAS did NOT improve these transfers in most cases, BUT with packet loss WAAS did help improve the transfer rates. More testing would be useful now that SSH 5.1 is out.

26 Conclusion: TCP acceleration solutions may help users stuck at sub-optimal rates when OS tuning does not help. WAAS appeared transparent to users. Integrating WAAS into an existing network is feasible and painless, if you own the network. This is a gap-filler solution until admins and OSes are fixed, which will take years...

27 Thanks: Dave Hartzell

28 Backup Slides

29 Optimization Concept

30 TCP Optimization [Diagram: without WAAS vs. with WAAS.]