Open Source File Transfers A comparison of recent open source file transfer projects By: John Tkaczewski
Contents

Introduction
Recent Open Source Projects
UDT: UDP-based Data Transfer
Tsunami UDP Protocol
UFTP
GridFTP
Conclusion
Author Biography
About FileCatalyst

2012 Unlimi-Tech Software, Inc. [1] Accelerating File Transfers
Introduction

A number of open source projects try to tackle accelerated file transfer via UDP. Some are more mature than others, and they use different technologies to solve the same problem: transferring large amounts of data over a WAN. Some commercial managed file transfer solutions that claim UDP acceleration have integrated one of these open source projects into their core file transfer technology. Such solutions inherit the weaknesses of the open source project along with its strengths. FileCatalyst has developed its own UDP-based protocol, written from the ground up, and does not include any code from open source UDP technology.

Recent Open Source Projects

The following four open source projects will be reviewed:

- UDT
- Tsunami
- UFTP
- GridFTP

An issue apparent in all four solutions is the lack of a graphical user interface. Two (UDT and Tsunami) provide only bare-bones sender and receiver source code, meaning that the end user has to compile it, while the other two (UFTP and GridFTP) come only with a command line interface (CLI).

Another common problem is poor support for firewall traversal. While this is not an issue for internal transfers, most organizations are interested in sending files over the WAN, which will almost certainly have at least one firewall somewhere on the route.

None of these solutions fares well in the worst network conditions, where packet loss, bandwidth, or latency is very high. Finally, the congestion control in the UDP projects lacks the flexibility to adapt to ever-changing network conditions during the data transfer.
Below is a quick reference table comparing the four projects:

| Feature | UDT | Tsunami | UFTP | GridFTP |
|---|---|---|---|---|
| Multi-threaded | No | No | No | No |
| Protocol overhead | 10% | 20% | ~10% | 6-8% (same as TCP) |
| Encryption | No | No | Yes | Yes |
| C++ source code | Yes | Yes | Yes | Yes |
| Java source code | Partial | No | No | No |
| Command line | No | No | Yes | Yes |
| Binaries | No (source code only) | No (source code only) | Yes (CLI only) | Yes (CLI only) |
| UDP-based point-to-point | Yes | Yes | Yes | No |
| Firewall friendly | Partial (no autodetection) | No | Partial (no autodetection) | No |
| GUI client | No | No | No | No |
| Server with secure user accounts | No | No | No | Yes |
| Congestion control | Yes (UDP blast mode preferred) | Yes (limited) | Yes (congestion control file has to be specified before the transfer starts) | Yes (using TCP) |
| Automatic retry and resume | No | No | No (manual resume yes) | Yes |
| Jumbo packets | Yes | No | Yes (up to 8800 bytes) | Yes |
| IPv6 | Yes | No | No | Yes |
| Support for low bandwidth, high packet loss (e.g., satellite) | No | No | No | No |
| Optimized for medium bandwidth (<155 Mbps), high latency | Yes | Yes | Yes | Yes |
| Optimized for high bandwidth (500 Mbps or more), high latency | No | No | No | No |
| Memory footprint | Medium | Medium | Medium | High |
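The protocol overhead figures in the table translate directly into usable bandwidth. The following is a minimal sketch of that arithmetic, assuming overhead is a fixed fraction of link capacity and ignoring packet loss, latency, and disk speed. The function names and the choice of the upper bound (8%) for GridFTP are illustrative assumptions, not part of any of the projects.

```python
# Protocol overhead percentages taken from the comparison table.
# UFTP is listed as "~10%" and GridFTP as "6-8%"; the GridFTP upper
# bound is used here as an assumption.
OVERHEAD_PERCENT = {"UDT": 10, "Tsunami": 20, "UFTP": 10, "GridFTP": 8}


def effective_throughput_mbps(link_mbps: float, overhead_percent: float) -> float:
    """Return the payload rate left after protocol overhead is subtracted."""
    if not 0 <= overhead_percent < 100:
        raise ValueError("overhead_percent must be in [0, 100)")
    return link_mbps * (100 - overhead_percent) / 100


def transfer_time_seconds(file_gigabytes: float, link_mbps: float,
                          overhead_percent: float) -> float:
    """Best-case time to move a file, ignoring loss, latency and disk speed."""
    file_megabits = file_gigabytes * 8 * 1000  # decimal units: 1 GB = 8000 Mb
    return file_megabits / effective_throughput_mbps(link_mbps, overhead_percent)


if __name__ == "__main__":
    for name, pct in OVERHEAD_PERCENT.items():
        rate = effective_throughput_mbps(100, pct)
        print(f"{name}: {rate:.0f} Mbps payload on a 100 Mbps link")
```

With a 20% overhead, a 100 Mbps link carries 80 Mbps of payload, so a 1 GB file needs at least 100 seconds under these idealized assumptions.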
UDT: UDP-based Data Transfer

Functionality Issues

- No installer and no binaries are available; both client and server have to be built from source.
- This is only a bare-bones source code implementation of the sender and receiver; all the functionality around user authentication, reporting, monitoring, and file management has to be implemented by the programmer.
- This project could only be used where two back-office servers send files to each other with no firewalls in between and without any user interaction.

Core

- No multithreading, meaning that only a single CPU core can do the work of receiving, processing, decrypting, decompressing, and writing to disk. This may also limit the number of concurrent connections that can be serviced at once.
- Poor performance on high packet loss, low bandwidth links; the default configuration is very sensitive to packet loss. In fact, a single dropped packet could force a failed transfer.
- Inflexible congestion control that adapts poorly to quickly changing network metrics. CUDPBlast is the workaround, but it does not actually provide much congestion control.
- High CPU and memory usage on very fast links (300 Mbps or higher).
- The C++ library is relatively mature, while the Java port is still in its infancy, with many reported bugs.
- No graphical client interface for point-to-point transfers.
- Limited support for firewall traversal; no auto-detection of UDP is possible.
- No built-in automatic retry/resume (although it could be built by the programmer).

Tsunami UDP Protocol

This open source project has not been developed in 2 years (unchanged since May 2010).

Functionality Issues

- Must be built from source (no binaries).
- This is only a source code implementation of the sender and receiver; all the functionality around user authentication, reporting, monitoring, and file management must be implemented by the programmer.
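Because these bare sender and receiver implementations leave retry and resume to the programmer, it may help to see roughly what that extra work involves. The sketch below is one possible approach, not code from UDT or Tsunami: `send_from` is a hypothetical callback that wraps whatever sender is in use, transmits one chunk starting at the given byte offset, and returns the new confirmed offset or raises `OSError` when the link fails.

```python
import time
from typing import Callable


def send_with_resume(
    send_from: Callable[[int], int],
    total_bytes: int,
    max_retries: int = 5,
    base_delay_s: float = 1.0,
) -> int:
    """Drive a transfer to completion, resuming from the last confirmed offset.

    send_from(offset) is a hypothetical callback around the underlying sender.
    It must return the new confirmed offset, or raise OSError on failure.
    """
    offset = 0
    failures = 0
    while offset < total_bytes:
        try:
            new_offset = send_from(offset)
        except OSError:
            failures += 1
            if failures > max_retries:
                raise  # retry budget exhausted, surface the last error
            # Exponential backoff, then retry from the same offset.
            time.sleep(base_delay_s * 2 ** (failures - 1))
            continue
        if new_offset <= offset:
            raise RuntimeError("sender made no progress")
        offset = new_offset
        failures = 0  # any progress resets the retry budget
    return offset
```

Resetting the failure counter after each successful chunk is a design choice: it tolerates many short outages over a long transfer while still giving up on a link that stays down.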
Core

- Only C++ source code.
- 20% protocol overhead; e.g., a 100 Mbps link will only be able to send at 80 Mbps.
- No jumbo packet support.
- No multi-threading, meaning that only a single CPU core does the work of receiving, processing, decrypting, decompressing, and writing to disk. This may also limit the number of concurrent connections that can be serviced at once.
- Not optimized for very high bandwidth (100 Mbps or more).
- Not optimized for low bandwidth, high packet loss links (e.g., satellite).
- No graphical client interface for point-to-point transfers.
- No support for firewall traversal.
- No resume and retry (although it could be built by the programmer).

UFTP

The UFTP protocol was based on the Starburst MFTP protocol.

Functionality Issues

- Comes with command line tools only.
- No firewall auto-detection, meaning that UDP is always forced. There is no fallback to TCP/HTTP.
- Congestion control can only be enabled ahead of the transfer via a pre-populated config file.
- No user account management on the server.

Core

- Protocol designed predominantly for multicast. Point-to-point file transfer is not the core of the technology.
- Poor performance in high packet loss environments (satellite or wireless).
- No multi-threading, meaning that only a single CPU core can do the work of receiving, processing, decrypting, decompressing, and writing to disk. This may also limit the number of concurrent connections that can be serviced at once.
- Not optimized for high bandwidth (500 Mbps or more).
- No graphical client interface for point-to-point transfers.
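The firewall auto-detection that UFTP lacks can be pictured as a short UDP probe followed by a fallback decision. The sketch below is a simplified illustration under stated assumptions: the remote side is assumed to run a hypothetical echo responder on the UDP port, only IPv4 is handled, and the function names are invented for this example. It does not reflect how UFTP or any commercial product implements detection.

```python
import socket


def udp_reachable(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Send a probe datagram and report whether an echo comes back in time.

    Assumes the far end runs a (hypothetical) echo responder on the UDP port.
    """
    probe = b"probe"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        try:
            sock.sendto(probe, (host, port))
            reply, _ = sock.recvfrom(64)
        except OSError:
            # Covers timeouts, ICMP unreachable errors and filtered traffic.
            return False
    return reply == probe


def choose_transport(host: str, udp_port: int, timeout_s: float = 1.0) -> str:
    """Prefer UDP when the probe succeeds, otherwise fall back to TCP."""
    return "udp" if udp_reachable(host, udp_port, timeout_s) else "tcp"
```

A client would call `choose_transport` once before starting a transfer and select its sender accordingly. A real implementation would also need to cope with firewalls that pass small probes but drop sustained UDP traffic.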
GridFTP

GridFTP is an implementation for use with Grid computing.

Functionality Issues

- Complicated install of the framework to allow multiple streams; doesn't directly address point-to-point file transfers.
- No firewall traversal.

Core

- GridFTP requires a much larger framework called Globus, which is steered by the Global Grid Forum.
- For optimized transfers, multiple nodes or TCP streams must be used.
- Optimized transfer of a single large file with a single stream between 2 nodes is not possible.
- Command line client interface only (no GUI).
- The TCP buffer size and block size must be known before the transfer begins: -tcp-bs and -tcp-buffer-size.
- The server and client must be part of a much larger network of Globus nodes.
- Not optimized for very high bandwidth (500 Mbps or more).
- Not optimized for low bandwidth, high packet loss links (e.g., satellite).

Conclusion

Although UDT seems to be pulling ahead for now, none of these projects is currently a viable replacement for a commercial solution at the enterprise level. All of them lack the functionality and ease of use of commercial applications. One exception is GridFTP, which could be used if the organization plans to use Globus and develop a file transfer workflow based on the CLI.

A commercial solution such as FileCatalyst addresses each of the weak points, including flexible congestion control, firewall friendliness, GUI client apps, and automatic resume/retry. This provides real cost savings and an efficiency boost compared to piecing together a custom solution using a bare-bones API.
Author Biography

In 2000, John Tkaczewski co-founded Unlimi-Tech Software, creator of the FileCatalyst suite of file transfer solutions. During John's continued tenure as president, Unlimi-Tech has recorded double-digit growth annually. Along with the management team, John is jointly responsible for overall vision and strategy as well as finance and administration for the company. John remains active in software development, overseeing creative and technical vision for the FileCatalyst Webmail and FileCatalyst Workflow products.

John graduated from Bishop's University in 1999 with a BSc. in Computer Science and a Business Diploma. Prior to founding Unlimi-Tech, John worked as a programmer analyst for PWGSC, a department of the Canadian Federal Government.

John speaks at various global events on topics related to network acceleration and security, big data transfer, open source file transfer, and high tech start-ups. John is also a main contributor to the FileCatalyst blog, discussing topics related to file transfer, data security, and software development.

About FileCatalyst

FileCatalyst software solutions are developed by Unlimi-Tech Software, Inc. Founded in 2000, Unlimi-Tech Software is a privately held corporation based in Ottawa, Canada, operating with a global reseller network. Unlimi-Tech products are focused on solving file transfer challenges in diverse environments, from end user desktops to sophisticated WAN and satellite-based multicast systems.

Learn more about Unlimi-Tech Software and FileCatalyst:
info@filecatalyst.com
tf: +1.877.327.9387
t: +1.613.667.2439
f: +1.613.667.2439