SOBCZYK Andrzej 1, MARCINIAK Marian 2, 3

Transferring large amounts of data over the Internet

INTRODUCTION

Nowadays it is common for even very large amounts of data to be transmitted over the Internet. For instance, in television production systems the widespread media for transferring video material used to be magnetic tape, optical media and real-time transmission systems (e.g. DSNG, Digital Satellite News Gathering) or complex transmultiplexing systems [1]. Today, transmitting such material over the Internet is increasingly common. The development of telecommunication technologies for high-speed data transmission provides access to the Internet with ever higher bandwidth. However, the protocols used in the network were designed when transmission speeds were relatively low [2]. It is therefore important to examine and compare the efficiency of network transmission protocols for high-speed transfer of large amounts of data over the public Internet.

1 RESEARCH METHOD

For the investigation, we selected application-layer network protocols (layer 7 of the OSI /Open Systems Interconnection/ model) that use both TCP /Transmission Control Protocol/ and UDP /User Datagram Protocol/ at the transport layer, as follows:
a) HTTP (HyperText Transfer Protocol) - TCP based, mostly used for downloading files. The Apache HTTPD server [10] and the wget application [5] as a client are used.
b) FTP (File Transfer Protocol) - TCP based, allows both downloading and uploading files. The ProFTPD server [11] and the wget application as a client are used.
c) UDT (UDP-based Data Transfer) - UDP based, designed for high-speed file transfer. The UDT software [12] is used.
d) FASP - UDP based, designed for high-speed file transfer. The Aspera Connect software [3] is used.
The choice of protocols was not arbitrary. Both HTTP and FTP have been in use for a long time; their transport layer is based on TCP, and they provide reliable data transmission.
However, TCP features such as the connection-initiation procedure (the so-called three-way handshake), the slow-start algorithm used to prevent congestion, and the acknowledgement of every single data packet make the protocol inefficient for transmitting large amounts of data over the Internet. The UDT and FASP protocols have a transport layer based on UDP and are designed for high-speed data transmission. The primary difference between UDT and FASP is the licensing of their implementation libraries: the UDT protocol is available under the BSD license and was published as open-source software, while FASP is a commercial product with closed source. It is known that transmission paths, available bandwidth and many other conditions in the public Internet are not homogeneous. Such a changeable environment may affect transmission speed regardless of the protocol; in the case of the present research, however, it is an intended factor.

1 Politechnika Świętokrzyska, Wydział Elektrotechniki, Automatyki i Informatyki, Katedra Telekomunikacji, Fotoniki i Nanomateriałów, Zakład Telekomunikacji, 25-314 Kielce, al. Tysiąclecia Państwa Polskiego 7, Tel: +48 501 069 852, Fax: +48 41 243 60 55, andrzej@sobczyk.it
2 Instytut Łączności, Zakład Teletransmisji i Technik Optycznych, 04-894 Warszawa, ul. Szachowa 1, Tel: +48 22 51 28 715, Fax: +48 22 51 28 347, M.Marciniak@itl.waw.pl
3 Politechnika Świętokrzyska, Wydział Elektrotechniki, Automatyki i Informatyki, Katedra Telekomunikacji, Fotoniki i Nanomateriałów, Zakład Telekomunikacji, 25-314 Kielce, al. Tysiąclecia Państwa Polskiego 7, Tel: +48 41 34 24 147, Fax: +48 41 41 34 42 997, marianm@tu.kielce.pl

In laboratory conditions, it would be hard to build a test environment capable of fully simulating the effects that appear in a real, global network, especially for long-distance transmission. Factors independent of the researchers that may affect the results include: the occupancy of the Internet service providers' links used along the packet routing paths, the occupancy and load of routers and other network hardware, and the occupancy and load of the servers.
The test procedure consisted of transmitting a 500 MB file of random data, for each of the examined protocols, between the following servers:
Server A - location: Roubaix near Paris, France. Operating system: Linux 3.2.13. Internet link: symmetrical 100 Mbps.
Server B - location: Warsaw, Poland. Operating system: Linux 2.6.32. Internet link: symmetrical 100 Mbps.
Server C - location: Kielce, Poland. Operating system: Linux 2.6.32. Internet link: asymmetrical 3.30 Mbps.
Server D - location: Warsaw, Poland. Operating system: Linux 3.5.0. Link to server B: Ethernet 1000 Mbps.
The directions of transfer were as follows:
Server A => Server B
Server B => Server D
Server A => Server C
Server B => Server C
In order to minimize the impact of the aforementioned independent factors on the research results, the procedure was repeated five times and the results were averaged. Data transmission speed was measured with the IPaudit software. Checksums were calculated with the md5sum application.

2 TEST RESULTS

2.1 Quality of transmission
After every transmission, the MD5 checksum of the received file was calculated. In all cases the sum was equal to the sum calculated for the source file. We conclude that all protocols ensure reliable data transmission.

2.2 Data link bandwidth utilization
Typical characteristics of bandwidth utilization for each protocol are as follows.
Fig. 1 Typical characteristics of link saturation (example for the connection Server A => Server B)
Protocols based on UDP transport saturate the link in a more even, steady way.

2.3 Amount of excess data
Each protocol, in addition to the useful data (payload), transmits some extra data, for example packet headers, retransmissions and checksums. During the research, the average amount of transferred data was calculated in both the transmit and the receive direction.
Fig. 2 Average excess amount of transferred data classified by protocol and location
The UDT and FASP protocols in some cases generate much more redundant data than those based on TCP transport. One possible reason for this behaviour is the way acknowledgements of packet reception are sent. TCP is a connection-oriented transport protocol in which every sent packet must be acknowledged.
By contrast, UDP is a connectionless protocol, so the mechanisms that provide reliability of the data link are moved to higher-layer protocols. In practice, acknowledgements are often sent in groups, which may result in a higher amount of excess data, especially when the number of undelivered packets is high.

2.4 Transmission speed
The chart below shows the average transmission speed for each protocol.
Fig. 3 Average transmission speed classified by protocol and location
Due to the large disparities in transmission speed between particular locations, individual charts are shown below.
Fig. 4 Average individual transmission speed classified by protocol and location
In most cases we noticed an advantage of the newer protocols; the only exception was the transmission over the LAN, where the standard FTP and HTTP protocols were more efficient.

2.5 Time of transmission of a single file
The chart below shows the average time required to transmit the test file for each protocol.
Fig. 5 Average time of file transfer classified by protocol and location
The FASP protocol shows an advantage: it is more efficient on links of lower quality. For the transmission over the LAN (the B => D direction), it should be mentioned that the configuration (license conditions) of the software using the FASP protocol artificially limited the transmission speed to a maximum of 100 Mbps.

CONCLUSIONS

The aim of the presented research was to compare the efficiency of different data transmission protocols for transferring large amounts of data over the real public Internet. Efforts were made to achieve the best possible compromise between preserving the original conditions of the global network and minimizing factors that might affect the results and are independent of the protocol. The results can be divided into two groups: research carried out under real conditions, and research carried out in a local area network, where the influence of external factors was the lowest. The research carried out under real conditions shows a slight advantage of the FASP protocol; the advantage is inversely proportional to the available link speed. When averaging the results, the dispersion was at the level of 10%. During testing over the local area network, the FASP protocol was clearly less effective than the others; this apparently strange behaviour was caused by the specific configuration of the software (due to license limitations) that artificially limits transfer to 100 Mbps, while the original link was 1 Gbps. Interestingly, the dispersion of results for the local area network was much larger than under real conditions, reaching a level of over 150%.
The authors believe they have clarified essential issues related to the transmission of large amounts of data and the modern protocols designed for this purpose. Looking to the future, telecommunication techniques and protocols are constantly evolving, and the speed of data links available to the average user keeps increasing. On the other hand, the size of transmitted files has increased by several orders of magnitude during the last few years. For example, video files are increasingly sent in high-resolution HDTV, ultra-high-resolution 4K UHDTV and recently even 8K UHDTV formats. Given the above, the continuous development and upgrading of protocols designed for transporting very large amounts of data at very high speed over public networks seem to be natural and inevitable processes.

Abstract
With the development of high-speed data transmission techniques, the Internet has begun to displace the traditional methods of transferring very large amounts of data. Until now, a common way to transport large amounts of data has been to record them on optical or magnetic media and transport them physically to the destination. Multimedia technologies, especially high- and ultra-high-resolution Internet television, also require techniques for transferring large amounts of data. The progress in telecommunications has enabled the development of network protocols, which need to be adapted to transmit such significant amounts of data. The authors' aim was to examine the efficiency of different network protocols, both those used for a long time and relatively new ones whose transport layer is based on UDP datagrams. Both the speed and the efficiency of the different protocols are compared.

Transferring very large amounts of data over the Internet
Summary
With the development of high-speed telecommunication techniques, the Internet is displacing the traditional methods of transferring very large amounts of data. Until now, a frequently used way of transporting large amounts of data was to record them on optical media or magnetic tapes and then transport them physically to the destination. Various multimedia technologies, especially high- and ultra-high-resolution Internet television, also require techniques that enable the transfer of significant amounts of data.
The development of telecommunication techniques entails the development of network protocols, which must be adapted to the transmission of such large amounts of data. The authors set themselves the goal of examining the efficiency of various network protocols, both those used for a long time and relatively new ones based on UDP datagrams at the transport layer. The research compared both the transmission speed achieved by the particular protocols and their efficiency.

REFERENCES
1. Ciosmak J., Efficient Algorithm to Calculate Nonseparable Two-Dimensional Filter Banks for Transmultiplexer Systems, Przegląd Elektrotechniczny, 2011, 87, 217-220.
2. Chodorek A., Chodorek R. R., A simple and effective TCP-friendly layered multicast content distribution, Polish Journal of Environmental Studies, 2009, 4B.
3. Asperasoft, http://asperasoft.com/technology/transport/fasp/, accessed 01-09-2014.
4. Fielding R., Gettys J., Mogul J., Frystyk H., Masinter L., Leach P., Berners-Lee T., "Hypertext Transfer Protocol - HTTP/1.1", RFC 2616, The Internet Society, June 1999.
5. Free Software Foundation, GNU Wget, http://www.gnu.org/software/wget/, accessed 01-09-2014.
6. ISO Standard 7498-1:1994, Information technology - Open Systems Interconnection - Basic Reference Model: The Basic Model.
7. Postel J., "Transmission Control Protocol", RFC 761, USC / Information Sciences Institute, January 1980.
8. Postel J., "User Datagram Protocol", RFC 768, USC / Information Sciences Institute, August 1980.
9. Postel J., Reynolds J., "File Transfer Protocol", RFC 959, USC / Information Sciences Institute, October 1985.
10. The Apache Software Foundation, HTTP Server Documentation, http://httpd.apache.org/docs/, accessed 01-09-2014.
11. The ProFTPD Project, Project Documentation, http://www.proftpd.org/docs/, accessed 01-09-2014.
12. Yunhong Gu, UDT: UDP-based Data Transfer, http://udt.sourceforge.net/, accessed 01-09-2014.