Performance measurements of STANAG 5066 and Applications Running over STANAG 5066 Steve Kille CEO September 14th 2009
Why Measure? The End Goal is to run Applications and Services over HF Radio Maximize Throughput Minimize Latency Given the poor and awkward characteristics of HF: Optimization is important Best approach may not be obvious Measurements will help understand what is going on and guide best choices
Talk Summarizes Results Presented in Three White Papers STANAG 5066 Performance Measurements over HF Radio http:///whitepapers/stanag-5066-performance.html Messaging & Directory Servers Subnet performance is key to understanding application performance Performance Measurements of Messaging Protocols over HF Radio http:///whitepapers/performance-of-messaging-protocols-for-hf-radio.html Measurement of applications optimized for HF Radio Performance Measurements of Applications using IP over HF Radio http:///whitepapers/performance-of-ip-applications-over-hf-radio.html Compares standard protocols over IP over HF
Acknowledgements RapidM For loan of RM6 Modems & RC66 STANAG 5066 Server software Help and advice, in particular from Markus van der Riet NATO NC3A For the IPClient STANAG 5066 Client Software Help and advice from Donald Kallgren & Maarten Gerbrands
STANAG 5066 STANAG 5066 Subnet Interface Service is top of HF House The Key Interface between Application (layers) and HF Isode provides STANAG 5066 Clients (and layers above) We look to partners to provide STANAG 5066 Servers (with HF Modems) Measurements done with STANAG 5066 Data Link Protocols Results for STANAG 4538 Data Link expected to be broadly similar
Isode's HF Radio Network RM6 Military HF Modems All tests done with STANAG 4539 3G Waveform Audio cable interconnect between the modems Simulates perfect HF Radio Simple approach that allows focus on data link and higher layers
STANAG 5066 Test Setup Isode's STANAG 5066 Console is an operator GUI tool Operates directly over STANAG 5066 Provides Throughput and Ping (latency) tests Peer to Peer and Multicast Test results reported in GUI
Throughput: 9600 bits/sec Sender view of throughput Lower line started after STANAG 5066 Server buffers filled Oscillation at 2 minute intervals (in line with STANAG 5066 max transmission intervals) Clear convergence Good utilization
Throughput: 75 bits/sec Clear convergence & good utilization Oscillation at 5 minute interval (time to transmit one 2048 byte APDU unit data) Queue growth controlled by STANAG 5066 flow control
Throughput: ARQ & Non-ARQ

Modem Speed      ARQ Utilization   Non-ARQ Utilization
9600 bits/sec    83%               94%
1200 bits/sec    90%               89%
75 bits/sec      87%               81%

Sender side view and identical test setup
Non-ARQ utilization would be expected to be higher, as there are no data link acks or retransmissions
This is seen at 9600 bits/sec, but not at slower speeds
Non-ARQ needs application level retransmission on loss
Overall good utilization at all speeds for ARQ and non-ARQ
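The utilization figures can be read as payload bits delivered divided by the nominal capacity of the modem over the same period. The sketch below is a minimal illustration of that arithmetic; the function name and the example numbers are assumptions for illustration and are not taken from the white paper.

```python
def link_utilization(payload_bytes: int, elapsed_secs: float, modem_bps: int) -> float:
    """Fraction of the nominal modem capacity used by application payload."""
    if elapsed_secs <= 0 or modem_bps <= 0:
        raise ValueError("elapsed_secs and modem_bps must be positive")
    return (payload_bytes * 8) / (modem_bps * elapsed_secs)


# Hypothetical example: 100,000 payload bytes delivered in 100 seconds at 9600 bits/sec.
example = link_utilization(100_000, 100.0, 9600)
print(f"{example:.0%}")
```

Measuring on the sender side, as the slide notes, means the elapsed time should be taken only once the STANAG 5066 Server buffers have filled, otherwise the figure is flattered by data that has been queued but not yet sent.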
Effect of MTU Size

MTU Size (bytes)   Utilization
2048               90%
512                88%
128                68%
64                 33%

Throughput drops as MTU size is reduced
Minor impact down to 512 bytes
Dramatic for very small MTU values
Applications should generally keep MTU large

Effect of Interleaver

Interleaver        Utilization
Short (0.6 secs)   90%
Long (4.8 secs)    82%

Long Interleaver desirable for data transfer to increase resilience against burst errors
Reduction in performance in line with theoretical analysis
For real HF, trade off against improvement in performance due to reduced data loss
Effect of Transmission Interval Good throughput needs long transmission times Key for application layers Theoretical analysis Expect effect to be stronger in operation Degradation stronger for long interleaver
Ping Tests: How They Work Sender sends data Receiver sends back data immediately Sender measures elapsed time (since initial send) on reception On reception sender starts another ping So result is a series of times
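The ping procedure described above can be sketched as a simple loop. This is an illustrative sketch only: `ping_once` is a hypothetical callable standing in for "send data and block until the echo is received", and it is not the STANAG 5066 Console implementation.

```python
import time
from typing import Callable, List


def run_ping_series(ping_once: Callable[[], None], count: int,
                    clock: Callable[[], float] = time.monotonic) -> List[float]:
    """Run `count` back-to-back pings and return each elapsed time in seconds.

    `ping_once` is a hypothetical callable that sends the data and blocks
    until the echoed data has been received.
    """
    times = []
    for _ in range(count):
        started = clock()                  # time of the initial send
        ping_once()                        # returns when the echo arrives
        times.append(clock() - started)    # elapsed time measured on reception
        # the next ping starts immediately on reception
    return times
```

Because each ping starts only when the previous one completes, the first entry in the series includes any link establishment delay and later entries do not.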
ARQ Ping Tests At 1200 bits/sec First ping takes 17 seconds Second ping (and subsequent ones) take 8 seconds ARQ Soft Link establishment delays first ping Turnaround time from throughput tests estimated at 6 seconds Interesting variations with modem speed (see white paper) Ping tests do not give accurate measure of turnaround time
Non-ARQ Ping Tests
First ping takes 46 secs (ARQ 17 secs)
Subsequent pings take 86 secs (ARQ 8 secs)
Consequence of RapidM (slotted) collision avoidance
Collision avoidance is VERY important for HF
Sender continues to transmit if it has data
All nodes wait 30 secs after end of transmission
Each node has a 5 second slot to start transmission
This has significant impact on performance of applications running over non-ARQ
STANAG 5066 ed 3 token ring collision avoidance expected to give significant improvement
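One way to read the slotted scheme is that a node with slot number k may start transmitting 30 seconds plus k slots after the previous transmission ends. The sketch below encodes that reading; the zero-based `slot_index` and both function names are assumptions, and the model is not a description of the RapidM implementation.

```python
WAIT_AFTER_TX_SECS = 30   # from the slide: all nodes wait 30 secs after end of transmission
SLOT_SECS = 5             # from the slide: each node has a 5 second slot


def earliest_start(end_of_last_tx: float, slot_index: int) -> float:
    """Earliest time a node may begin transmitting (assumed model).

    slot_index is a hypothetical zero-based slot number assigned to the node.
    """
    if slot_index < 0:
        raise ValueError("slot_index must be non-negative")
    return end_of_last_tx + WAIT_AFTER_TX_SECS + slot_index * SLOT_SECS


def ping_wait_lower_bound(sender_slot: int, receiver_slot: int) -> float:
    """Collision-avoidance wait accumulated over one ping, which has two turnarounds."""
    return earliest_start(0.0, receiver_slot) + earliest_start(0.0, sender_slot)


print(ping_wait_lower_bound(sender_slot=0, receiver_slot=1))
```

Under this reading every ping pays at least two 30 second waits before any airtime is counted, which is the same order of magnitude as the 86 seconds measured for subsequent pings.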
What Happens with Real HF Radios? We would expect: Very similar results in good conditions Slower link establishment due to ALE and other negotiation Throughput loss in line with levels of noise Measurements and comparisons would be very interesting Isode looking to others to do this (we don't have appropriate setups)
Implications for System Deployment Make tests at STANAG 5066 level to verify performance prior to testing applications STANAG 5066 Console could be used for this Application performance only makes sense in context of STANAG 5066 performance
STANAG 5066 Summary Does its job well and gives good utilization ARQ and non-ARQ for all speeds (81% - 94%) Application will need to use STANAG 5066 in right way to get this good performance STANAG 5066 ed 3 will be very important for non-ARQ
HF Messaging Protocols
Measurements of four messaging protocols designed for HF Radio
HMTP (specified in STANAG 5066 Annex F). Poor performance (see white paper)
CFTP (sometimes called Battle Force Email) (specified in STANAG 5066 Annex F)
ACP 142. CCEB (OZ/CA/NZ/UK/US) standard for multicast used by STANAG 4406 Annex E
CO-ACP 142 (Connection Oriented ACP 142). Variant of ACP 142 optimized for ARQ. Specified by Isode
Message Testing Framework Test Process Send Messages Measure delivery time Batches of 1, 10, or 100 Payload of 0, 1, 10 or 100 kbytes 75, 1200 and 9600 bits/sec Internet mail used as CFTP and HMTP do not support X.400/STANAG 4406 X.400 performance for ACP 142 & CO-ACP 142 slightly better than Internet mail due to more compact encoding
CO-ACP 142 As the Reference Protocol We use CO-ACP 142 as reference protocol Best performance Shares characteristics with both the protocols it gets compared to (ACP 142 and CFTP) We have access to internals so can make better analysis CO-ACP 142 Described in White paper Messaging Protocols for HF Radio http:///whitepapers/messagingprotocols-for-hf-radio.html
CO-ACP 142 Testing Setup
Single Message Transfer Times

Modem Speed (bits/sec)   Payload    Time
9600                     -          21 secs
9600                     10 kbyte   30 secs
1200                     -          18 secs
1200                     1 kbyte    24 secs
75                       -          91 secs
75                       1 kbyte    245 secs

At higher speeds single (small) message transfer time dominated by STANAG 5066 soft link setup time
At 75 bits/sec, data transfer time dominates

Link Utilization

Modem Speed (bits/sec)   Payload      Number Msgs   Utilization
9600                     -            1             2%
9600                     -            100           53%
9600                     100 kbytes   10            76%
1200                     -            1             11%
1200                     -            100           72%
1200                     10 kbytes    10            89%
75                       -            1             41%
75                       1 kbyte      10            74%

Shows best, worst and selected utilization at each modem speed
Data volume (large messages or many messages) improves utilization
Utilization Analysis: Large Payload (100 kbyte) CO-ACP 142 provides efficient mechanism to carry large payloads 12% for STANAG 5066 is reasonable 4% message overhead due to base64 MIME encoding and compression. Can be eliminated for ACP-142/CO-ACP 142 by binary payload, but would not work for CFTP or HMTP Other overheads are negligible
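The combined effect of base64 encoding followed by compression can be explored with a short experiment. The sketch below is an assumption-laden illustration: it uses pseudo-random bytes to stand in for an incompressible attachment and Python's `zlib` for DEFLATE, so its result will not match the measured 4% exactly, and the function name is hypothetical.

```python
import base64
import random
import zlib


def encoded_overhead(payload: bytes) -> float:
    """Relative size increase after MIME-style base64 encoding plus DEFLATE."""
    encoded = base64.encodebytes(payload)     # base64 with line breaks, as in MIME bodies
    compressed = zlib.compress(encoded, 9)    # zlib wrapper around DEFLATE
    return len(compressed) / len(payload) - 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable
incompressible = bytes(rng.getrandbits(8) for _ in range(100_000))

base64_only = len(base64.encodebytes(incompressible)) / len(incompressible) - 1.0
print(f"base64 alone:     {base64_only:.1%}")
print(f"base64 + DEFLATE: {encoded_overhead(incompressible):.1%}")
```

Base64 alone expands data by roughly a third; compression recovers most of that because each base64 character carries only six bits of information, leaving a small residual overhead that a binary payload would avoid.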
Utilization Analysis: Small Payload (1 kbyte) STANAG 5066 overhead remains reasonable where multiple messages transferred (13%) Messaging protocol overhead (26%) discussed in next slide Other protocol overheads small and reasonable
Message Overhead Analysis
For 1 kbyte, 26% overhead is 260 bytes
Uncompressed CO-ACP 142 BSMTP shown here
BSMTP (first two lines) is compact (66 bytes)
Gains could be made by stripping headers and trace (Received)
For data levels at 100 bytes XMPP may be better than email

<pop.user@dhcp-122.isode.net>
<pop.user@dhcp-136.isode.net>
Received: from dhcp-122.isode.net ([172.16.1.122]) by dhcp-122.isode.net (smtp internal via TCP with ESMTP id <Smg=MAAJqXcB@dhcp-122.isode.net> for <pop.user@dhcp-236.isode.net>; Thu, 23 Jul 2009 11:45:04 +0100
Content-Type: multipart/mixed; boundary="===============1760449971=="
MIME-Version: 1.0
Date: Thu, 23 Jul 2009 10:45:04 -0000
Subject: test 0
From: pop.user@dhcp-122.isode.net
To: pop.user@dhcp-236.isode.net

--===============1760449971==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

This is a text body
--===============1760449971==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

F1xsK6AmS0CZUe6nWAtxfYK8iss9AmgvVnBTqqXMo6 8O+ RF2yDGleSdj/JogKpQ6U8LGyZL311SbbLShrSKWC5Y VjY7 6GTrepKUrymezYZfCDONsh57NM9aEZIylQ64LBvurW rwe
ACP 142 Transfer of messages using non-ARQ supporting Multicast and EMCON See Military Messaging over HF Radio and Satellite using STANAG 4406 A Tests as for CO-ACP 142
ACP 142 Message Transfer Times

Payload     ACP 142    CO-ACP 142
-           6 secs     21 secs
1 kbyte     14 secs    30 secs
10 kbyte    101 secs   97 secs
100 kbyte   840 secs   796 secs

ACP 142 slower for larger messages, reflecting STANAG 5066 measurements
ACP 142 is faster for short messages as no soft link establishment
For multicast all recipients get message at same time
Low latency (multicast) delivery of short messages highly desirable characteristic in some situations

ACP 142 Throughput (messages per hour)

Payload     ACP 142   CO-ACP 142
-           730       1108
1 kbyte     244       311
10 kbyte    43        44
100 kbyte   4         5

ACP 142 analysis allows for RapidM turnarounds every 15 minutes
STANAG 5066 ed 3 would be expected to give better results
CO-ACP 142 better, especially for small messages
For two or more recipients ACP 142 multicast would give better performance
CFTP CFTP (Compressed File Transfer Protocol) is specified for message transfer in STANAG 5066 Annex F Technically very similar to CO-ACP 142 Both use STANAG 5066 RCOP (Reliable Connection Oriented Protocol) Both use BSMTP (Batch SMTP) approach Both use DEFLATE compression Similar performance expected Theoretical analysis suggests CO-ACP 142 will be slightly faster
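The benefit of combining a batch approach with DEFLATE can be illustrated with a small experiment: headers repeat from message to message, so compressing a batch as one stream does better than compressing each message alone. The sketch below is illustrative only. The message layout is a hypothetical batch of SMTP-style commands with placeholder addresses, and it is not the CFTP or CO-ACP 142 wire format.

```python
import zlib


def make_message(n: int) -> bytes:
    """Build a small, hypothetical test message (addresses are placeholders)."""
    return (
        "MAIL FROM:<sender@example.org>\r\n"
        "RCPT TO:<recipient@example.org>\r\n"
        "DATA\r\n"
        "From: sender@example.org\r\n"
        "To: recipient@example.org\r\n"
        f"Subject: test {n}\r\n"
        "\r\n"
        "This is a text body\r\n"
        ".\r\n"
    ).encode("ascii")


messages = [make_message(n) for n in range(10)]

# Compress each message on its own, then compress the whole batch as one stream.
separate = sum(len(zlib.compress(m, 9)) for m in messages)
batched = len(zlib.compress(b"".join(messages), 9))

print(f"compressed separately: {separate} bytes")
print(f"compressed as a batch: {batched} bytes")
```

The gain from batching is largest for small messages, where envelope and header text makes up most of what is sent.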
CFTP Performance (relative to CO-ACP 142) Measurements show very similar performance For single message transfers, difference within experimental variance, but suggest CO-ACP 142 slightly faster (1-5%) For transfers of many messages CO-ACP 142 faster (5-15%) Performance difference technically interesting, but not an overriding factor in protocol choice
Which HF Message Protocol? If you need X.400 Use ACP 142 or CO-ACP 142 If you need Multicast or EMCON Use ACP 142 If you need only Internet Messaging over ARQ (point to point) CFTP and CO-ACP 142 both reasonable choices
CO-ACP 142 vs CFTP CFTP is a standard (STANAG 5066 Annex F) This is a good thing CO-ACP 142 is a technically better protocol Slightly better performance Operates over IP (e.g. SATCOM) as well as STANAG 5066 Supports requests for Delivery Status Notifications (important for reliable messaging) Supports binary transfer (BINARYMIME or 8BITMIME) Desirable for modern email and 4% performance improvement
HF Messaging Summary Three messaging protocols (CFTP; ACP 142; CO-ACP 142) make efficient use of underlying STANAG 5066 services STANAG 5066 ed 3 would be desirable for ACP 142 deployment
Deployment of HF Optimized Protocols HF Optimized protocols (like the ones just reviewed) would be deployed as above Can be made transparent to user, but deployment configuration needs to ensure correct application level routing
The pure IP Approach Classic IP deployment makes HF Radio transparent to user and avoids application relay HF is simply an IP Subnet Transparent Switching between HF and other networks NATO target architecture
What is the Performance Cost of IP? There are various reasons to use application relaying as opposed to HF as an IP Subnet See: HF Radio & Network Centric Warfare. The rest of this talk (and associated white paper) look at the performance implications of running standard applications over IP over HF in contrast with optimized applications
IP over STANAG 5066 STANAG 5066 Annex F specifies how to run IP over STANAG 5066. A simple mapping onto STANAG 5066 Unit Data is used
The Application Picture When IP is used, interface to STANAG 5066 is through an IP router, and hidden from the application
UDP Tests User Datagram Protocol (UDP) gives a simple application interface onto IP (unreliable datagram service) Tests send out UDP datagrams of configurable size and rate Latency and throughput measured by receiver
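A test of this kind needs each datagram to carry enough information for the receiver to compute latency. The sketch below shows one possible probe format; the header layout, the function names and the assumption that sender and receiver clocks are synchronized are all illustrative and are not taken from the test tools used for these measurements.

```python
import struct

# Hypothetical probe header: 4-byte sequence number, 8-byte send timestamp (network byte order).
HEADER = struct.Struct("!Id")


def make_probe(seq: int, size: int, now: float) -> bytes:
    """Build a probe datagram of exactly `size` bytes."""
    if size < HEADER.size:
        raise ValueError(f"size must be at least {HEADER.size} bytes")
    return HEADER.pack(seq, now) + b"\x00" * (size - HEADER.size)


def read_probe(datagram: bytes, now: float):
    """Return (sequence number, latency in seconds) for a received probe.

    Assumes the sender and receiver clocks are synchronized.
    """
    seq, sent_at = HEADER.unpack_from(datagram)
    return seq, now - sent_at


def send_interval(size: int, target_bps: float) -> float:
    """Seconds between datagrams so that data is offered at `target_bps`."""
    return size * 8 / target_bps
```

For example, `send_interval(500, 1000)` gives 4.0 seconds between 500 byte datagrams for an offered rate of 1000 bits/sec. Gaps in the received sequence numbers show loss, and the received byte count over time gives throughput.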
UDP Performance Summary UDP is a thin layer over IP, and so these tests measure IP performance Theoretical analysis suggests that UDP will give a protocol overhead (UDP and IP headers) and otherwise give very similar performance to direct use of STANAG 5066 Tests broadly showed that this happened, although there was a small (around 5%) additional overhead which we could not explain See white paper for details
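The expected header overhead is easy to estimate. The sketch below assumes IPv4 with no options (20 byte header) and the standard 8 byte UDP header; the payload sizes are arbitrary examples rather than the sizes used in the tests.

```python
IPV4_HEADER_BYTES = 20   # minimum IPv4 header, assuming no options
UDP_HEADER_BYTES = 8


def udp_ip_overhead(payload_bytes: int) -> float:
    """Fraction of each IP packet taken up by the IPv4 and UDP headers."""
    headers = IPV4_HEADER_BYTES + UDP_HEADER_BYTES
    return headers / (headers + payload_bytes)


for size in (64, 512, 1472):
    print(f"{size:5d} byte payload: {udp_ip_overhead(size):.1%} header overhead")
```

Header overhead calculated this way is the part predicted by theory; it does not account for the additional unexplained overhead of around 5% seen in the tests.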
UDP Latency: 70% Load Load sent at 70% of max load, so no data loss expected Small oscillations 20-30 secs, reflecting STANAG 5066 retransmission 20-30 second typical latency Big spikes occur when data is lost (and retransmitted) Subsequent packets delayed because of in order transmission
UDP Latency: 20% Load At 20% load similar pattern Normal latency drops to 10-20 secs
UDP Latency: 700% Load At 700% load high percentage of packets get dropped (off back of IP queue) Sawtooth spikes at two minute intervals (in line with max STANAG 5066 retransmission time) Latency varies 150-450 seconds (2.5-7.5 minutes)
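The high drop rate follows directly from the offered load: a link cannot carry more than its capacity, so in steady state the excess must be discarded. The sketch below is a minimal illustration of that bound; the function name is hypothetical.

```python
def min_drop_fraction(load_factor: float) -> float:
    """Smallest fraction of offered packets that must be dropped in steady state.

    load_factor is offered load divided by link capacity (7.0 for 700% load).
    """
    if load_factor <= 0:
        raise ValueError("load_factor must be positive")
    return max(0.0, 1.0 - 1.0 / load_factor)


print(f"{min_drop_fraction(7.0):.0%}")   # at least six packets in every seven
```

At 70% or 20% load the bound is zero, which matches the expectation on the earlier slides that no data is lost through overload.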
Why TCP is Important TCP provides reliable data stream It is the basis for the vast majority of Internet applications and most of those that are likely to be used over HF See white paper for discussion of UDP and RTP applications (the other two protocols layered over IP in common use) TCP makes use of IP to carry data and acknowledgements Synchronous startup Data packets acknowledged with a window mechanism
TCP Testing TCP Tester sends data as fast as it can (single one way data stream) Analyser measures delay of data (from start of test and from data send)
TCP: 1200 bits/sec (MTU=500) Tests use MTU of 500 and 1500 bytes (standard Internet value) Red line shows throughput (cumulative since start) Blue line shows current latency Varying latency usually associated with bursty activity Over an hour to reach stability
TCP: 1200 bits/sec (MTU=1500) This one seems to stabilize more quickly Then after two hours destabilizes for some reason
TCP: 9600 bits/sec (MTU=500) At faster modem speed takes an hour to reach good throughput Due to TCP slow start and slow rate to open window. Connection remains in unstable mode after this point
TCP: 300 bits/sec (MTU=1500) This one works very badly (but amazingly does work) Latency of order 2 hours 2% network utilization
What is going on under the hood Optimal use of STANAG 5066 needs the application to use it in the right way This is often not going on when TCP is the application layer TCP windowing mechanism and retransmission calculations have a complex and often inefficient interaction with STANAG 5066
TCP Performance Summary

Speed/MTU   Utilization   Time to 80%   First Data
9600/500    44%           65 mins       60 secs
9600/1500   53%           27 mins       12 secs
1200/500    62%           75 mins       47 secs
1200/1500   67%           1.5 mins      20 secs
300/500     46%           3.5 mins      120 secs
300/1500    2%            n/a           136 secs

Would not work at 75 bits/sec
Time to 80% is the time to reach 80% of the max utilization figure noted
First data is when receiver gets its first data
TCP Analysis Utilization at best (67%) is OK Time to reach good utilization is often very long The overall system often appears quite unstable Time to get first data through is often very high
SMTP Testing SMTP tests are same as for previous messaging protocols Diagram shows how tests and STANAG 5066 all fit together
SMTP Comparison to CO-ACP 142

Modem Speed (bits/sec)   Number x Payload   CO-ACP 142   SMTP
9600                     1 x 0              18 secs      2.3 mins
9600                     10 x 0             21 secs      18 mins
9600                     10 x 100 kbyte     20 mins      103 mins
1200                     1 x 0              21 secs      89 secs
1200                     100 x 0            8 mins       73 mins
1200                     10 x 100 kbyte     2.2 hours    3.8 hours
300                      1 x 0              30 secs      5.9 mins
300                      1 x 10 kbyte       7.4 mins     16 mins

Shows (relative) best and worst at each speed
Best relative performance is for very large messages at 1200 bits/sec
Application DoS at 300 bits/sec
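The relative cost of SMTP is easier to see as a ratio of transfer times. The sketch below computes that ratio from the figures in the table above, converted to seconds; the variable and function names are illustrative.

```python
# (modem speed, number x payload, CO-ACP 142 seconds, SMTP seconds), from the table above
ROWS = [
    (9600, "1 x 0", 18, 2.3 * 60),
    (9600, "10 x 0", 21, 18 * 60),
    (9600, "10 x 100 kbyte", 20 * 60, 103 * 60),
    (1200, "1 x 0", 21, 89),
    (1200, "100 x 0", 8 * 60, 73 * 60),
    (1200, "10 x 100 kbyte", 2.2 * 3600, 3.8 * 3600),
    (300, "1 x 0", 30, 5.9 * 60),
    (300, "1 x 10 kbyte", 7.4 * 60, 16 * 60),
]


def slowdown(co_acp_secs: float, smtp_secs: float) -> float:
    """How many times longer SMTP took than CO-ACP 142."""
    return smtp_secs / co_acp_secs


for speed, batch, co_acp, smtp in ROWS:
    print(f"{speed:5d}  {batch:<15} SMTP took {slowdown(co_acp, smtp):4.1f}x as long")

best = min(ROWS, key=lambda row: slowdown(row[2], row[3]))
worst = max(ROWS, key=lambda row: slowdown(row[2], row[3]))
```

On these figures the smallest ratio is for ten 100 kbyte messages at 1200 bits/sec and the largest is for ten empty messages at 9600 bits/sec.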
Why does SMTP over IP over HF perform so badly? Application (SMTP) protocol exchanges interact with TCP and STANAG 5066 in a poor way TCP is not usually given time to reach (reasonably) efficient utilization of STANAG 5066 Particularly significant at faster modem speeds To a lesser extent, inefficient encoding adds overhead
What do these results mean for other applications? For Messaging, SMTP over IP over HF is much worse than specially designed protocols. Does this protocol specific analysis apply to other applications? SMTP is a typical Internet protocol Reasonably compact text encodings No compression Synchronous at start and (fairly) asynchronous later Results will vary by protocol, but it seems likely that for most applications an optimized protocol will work substantially better Protocols needing low latency (e.g., Instant Messaging) of particular concern
Summary on Applications over IP over HF Specially designed protocols have much better performance over HF Specific concerns from applications measured: Will not work at slowest HF speeds Very high additional delays on initial transfer Often very significant throughput reduction TCP connections often not stable Bad interaction with application timers (SMTP DoS) Anticipate significant knock-on effects from data errors
Recommendations for Deployment Mission Critical Applications should generally use optimized protocols over HF and not IP This is not the conclusion that many would like, but it seems hard to draw any other conclusion from the numbers
Standardization Recommendation Shift effort from IP over HF to Applications over HF Separate STANAG 5066 SIS Specification as Independent Standard Specify full STANAG 5066 SIS Interface to STANAG 4538 Define key HF Applications as independent standards Review standards for mail over HF Review new application standards over HF and in particular XMPP Look at how best to integrate applications into IP architecture
Questions? Presentation online: blog.isode.com http://blog.isode.com/2009/09/hfia-presentation.html Includes Links to the Three Performance Measurement White Papers Steve.Kille@isode.com