Website Analysis
Hanieh Hajighasemi Dehaghi
Computer Networks: Information Transfer (LINGI2141)
Ecole Polytechnique de Louvain (EPL), Université Catholique de Louvain

Abstract
The behavior of HTTP, TCP and DNS for the candidate website (foxnews.com) is investigated on the Ubuntu 14.04 software platform.

I. INTRODUCTION
Analysing a website helps to understand how much everyday use of the Internet depends on the network. Many users pay no attention to what happens after they press the Enter button: which web servers are contacted and how the protocols establish the connections between them. In this report foxnews.com, the website of an American news channel, is investigated. The site ranks 214th worldwide, and most of its visitors are in English-speaking countries such as the USA, Canada, the UK and Australia. Several websites, such as YouTube, Amazon, Yahoo, eBay and Wikipedia, link to foxnews.com. Among well-known news websites, CNN and USA Today also have links inside foxnews.com.

A. Domain name
foxnews.com is served from many web servers. For European countries it uses a single IPv4 address (88.221.92.15), located in Switzerland, from which host servers in other countries can be reached. In total, 16 IPv4 addresses are attributed to this domain name. It is worth mentioning that foxnews.com does not support IPv6. It also has 7 name servers (NS). All these servers are provided by Akamai Technologies, and they expire on June 21, 2020. Its parent name server for the com zone is k.gtld-servers.net (192.52.178.30), which is located in the United States of America. All name servers of foxnews.com are listed in Table I. The primary name server is DNS.TPA.FOXNEWS.COM. The name servers in Table I are those remaining after CNAME resolution, CDN analysis and de-duplication. The domain names discussed here have only A records (IPv4).
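The IPv4-only observation can be checked from a script. The sketch below is a minimal illustration and not the method used for this report: it asks the system resolver for IPv4 and IPv6 addresses through Python's standard socket.getaddrinfo. The function name address_families and its resolver parameter are assumptions introduced here, the latter only so that the logic can be tried without network access.

```python
import socket


def address_families(host, resolver=socket.getaddrinfo):
    """Return the sorted IPv4 and IPv6 addresses that `host` resolves to.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    the function can be exercised offline with a stand-in resolver.
    """
    found = {"IPv4": set(), "IPv6": set()}
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            infos = resolver(host, 80, family, socket.SOCK_STREAM)
        except socket.gaierror:
            # No address of this family (or the lookup failed): leave it empty.
            continue
        for _family, _type, _proto, _canonname, sockaddr in infos:
            found[label].add(sockaddr[0])
    return {label: sorted(addresses) for label, addresses in found.items()}
```

With network access, address_families("foxnews.com") would list whatever the local resolver returns; the addresses depend on the vantage point and on the date, so they need not match those quoted above.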
TABLE I
NAME SERVERS

Delegated name servers
USW3.AKAM.NET       ASIA3.AKAM.NET
NS1-157.AKAM.NET    NS1-253.AKAM.NET
USC2.AKAM.NET       USC4.AKAM.NET
USW1.AKAM.NET       DNS.TPA.FOXNEWS.COM

foxnews.com has only one mail server (foxnewscommail.protection.outlook.com), which is located in North America and has 4 IPv4 addresses.

TABLE II
IP NUMBERS THAT HAVE PTR OF FOXNEWS.COM

66.54.2.137
66.54.3.18
70.32.45.166

TABLE III
THE PTR RECORDS OF THE IP NUMBERS OF THE NAME SERVERS OF FOXNEWS.COM

A1-157.AKAM.NET     A1-253.AKAM.NET
A12-64.AKAM.NET     A12-65.AKAM.NET
A14-66.AKAM.NET     A22-64.AKAM.NET
A7-66.AKAM.NET      ASIA3.AKAM.NET
NS1-157.AKAM.NET    NS1-253.AKAM.NET

A comprehensive visual presentation of the different records associated with this domain is provided in Appendix A. The CNAME record information of the host is given in Table IV. The SOA default TTL (time-to-live) is set to 5 minutes. The TTL is the length of time for which DNS information may be kept before it has to be refreshed.

TABLE IV
CNAME RECORDS

CNAME                          Name             Type   Class  TTL
www.foxnews.com.edgesuite.net  www.foxnews.com  CNAME  IN     154 sec

The name server usw1.akam.net returns the following DNS record for foxnews.com: 77.67.28.146 (A). The other name servers, however, return different DNS records for foxnews.com.

B. Type of resources
Many types of resources are used on foxnews.com. Figure 1 shows the resource distribution with a primed cache in terms of type, number of resources and size. JavaScript files account for the largest volume, followed by images and HTML sources.
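A breakdown of this kind can be tallied from any list of fetched resources. The sketch below is illustrative only: the function name summarize_resources and the sample entries are hypothetical and are not the measurements behind Figure 1.

```python
from collections import defaultdict


def summarize_resources(resources):
    """Group (resource_type, size_in_bytes) pairs by type.

    Returns {type: (count, total_bytes)}, ordered by total size, largest first.
    """
    totals = defaultdict(lambda: [0, 0])
    for resource_type, size in resources:
        totals[resource_type][0] += 1
        totals[resource_type][1] += size
    ranked = sorted(totals.items(), key=lambda item: item[1][1], reverse=True)
    return {rtype: (count, size) for rtype, (count, size) in ranked}


# Hypothetical sample, NOT data measured on foxnews.com.
sample = [
    ("javascript", 120_000),
    ("image", 45_000),
    ("javascript", 80_000),
    ("html", 30_000),
    ("image", 15_000),
]

for rtype, (count, size) in summarize_resources(sample).items():
    print(f"{rtype:<12}{count:>3} files {size:>8} bytes")
```

Sorting by total size rather than by count mirrors the way the text ranks the resource types by volume.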
Fig. 1. Resource distribution

C. TCP port
All TCP ports are obtained using the nmapsi4 V.0.4.1 software package [1]. Most of the domain names reach the web server via port 80, except for 4 cases that use port 443; all of these also serve JavaScript resources.

TABLE V
TCP PORTS

Port      State   Service   Description
80/tcp    open    http      AkamaiGHOST
443/tcp   open    ssl/http  AkamaiGHOST
8000/tcp  closed  http-alt

D. Request headers
1) Analysis: Ten request headers are common to all domain names. Some domain names send additional request headers such as Cookie, Content-Type, If-Modified-Since, If-None-Match and X-Requested-With. All these request fields are permanent.
2) Non-standard: According to the results, no non-standard request headers were observed.
3) Cookies: The different types of cookies are listed below.
-Session cookies are cleared when the browser is closed. They allow the website to identify the user's state, for example whether the user is logged in. They are mostly considered harmless because they cannot be used for long-term user tracking. This site sets 4 session cookies.
-Third-party cookies are written on the machine by a website other than the one the user is actually visiting. These cookies may be set for various purposes, such as tracking the ads displayed on the website or collecting statistics. This website allows 19 other websites to track the user's activity.
-Persistent cookies are preserved across browser shut-downs. This means that even if the user closes the page today and returns in the future, the website will know that this is a returning visitor. This may be used for "remember me" features as well as for persistent user tracking. These cookies, especially if set by third-party organizations, are a powerful tool for monitoring user activity across all the websites the user visits. foxnews.com sets 67 persistent cookies, with an average lifetime of 476 days and a longest lifetime of 8419 days.

E. Response headers
1) HTTP version: The web server of foxnews.com uses HTTP version 1.1. HTTP/1.1 extends the previous version (HTTP/1.0) and defines many status codes. The status codes observed with two different browsers are listed here.
Chrome: 200 (success), 204 (no content), 301 (moved permanently), 302 (moved temporarily).
In Chrome, almost all requests sent by the browser are answered successfully by the server (code 200). There are 29 cases of code 302. The status "moved temporarily" means that one web address is temporarily redirected to another while the primary address remains valid. Code 301 represents a permanent redirection of a web address; there are 5 cases of this code. The "no content" status is returned when the request is received and processed correctly by the server but the response carries no content. This code occurs in just one case, for a plain text document.
Firefox: 200 (success), 204 (no content), 301 (moved permanently), 304 (not modified).
In Firefox, there is only one case of code 304. The browser asks the server to send the file only if it has changed; if the file has not been modified, the server replies with code 304. This code occurs for an image resource.
2) Analysis: On the request side, all connections have keep-alive status, but on the response side some of the connections are closed. The only unusual observation concerns the Expires field: most of the responses carry an expiration time, but some expired many years ago.

F. TCP analysis
After refreshing the homepage of foxnews.com in Firefox, 2769 total packets were received (2769 incoming packets delivered, 2704 requests sent out). There are 8 packets dropped because of a missing route. Detailed TCP information can be obtained with netstat -s -t. Table VI summarizes the TCP connection information.
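The counters that netstat prints can also be read programmatically. The following sketch is an illustration under assumptions: it parses lines of the form "<number> <description>", which is how the counters of Table VI appear, and derives a retransmission ratio from them. The helper names parse_counters and retransmission_ratio are hypothetical, and the sample text simply reuses figures reported in Table VI.

```python
import re


def parse_counters(text):
    """Map each '<number> <description>' line to {description: number}."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def retransmission_ratio(counters):
    """Fraction of sent segments that were retransmissions."""
    sent = counters["segments send out"]
    return counters["segments retransmited"] / sent if sent else 0.0


# Sample text built from the values in Table VI (spelling as reported there).
sample = """
    87 active connections openings
    66 connections established
    2305 segments received
    2213 segments send out
    4 segments retransmited
"""

counters = parse_counters(sample)
print(f"retransmission ratio: {retransmission_ratio(counters):.4%}")
```

With 4 retransmitted segments out of 2213 sent, the ratio is roughly 0.18%, which gives a quick sense of how lightly loaded the path was during the measurement.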
TABLE VI
TCP CONNECTION REPORT

  87  active connections openings
   0  passive connection openings
   0  failed connection attempts
   0  connection resets received
  66  connections established
2305  segments received
2213  segments send out
   4  segments retransmited
   5  bad segments received
   6  resets sent

It is also worth noting that one TCP socket finished time wait in the fast timer. Twenty delayed acknowledgements (acks) were sent. Quick ack mode was activated 14 times. 1248 packet headers were predicted. 262 acknowledgements not containing a data payload were received. There were 65 predicted acknowledgements. Two packets were recovered using selective acknowledgements. Two congestion windows were fully recovered without slow start, and one congestion window was recovered without slow start after a partial ack. The other TCP statistics obtained are reported in Table VII.

TABLE VII
TCPEXT REPORT

TCPLossProbes                          12
TCPLossProbeRecovery                   10
DSACKs sent for old packets            14
DSACKs sent for out of order packets    1
DSACKs received                        11
TCPDSACKIgnoredNoUndo                   8
TCPSackShiftFallback                    2
TCPRcvCoalesce                       1245
TCPOFOQueue                           289
TCPOFOMerge                             1
TCPChallengeACK                         5
TCPSYNChallenge                         5

G. Initial congestion window value
In older Linux kernel versions the initial congestion window size (initcwnd) was as low as 2 (2*MSS, or about 3 KB); since version 3.0 the default is 10 (about 14 KB). In a Linux terminal, the current default route can be obtained using

$ ip route | grep default
default via 192.168.0.1 dev wlan0 proto static

As explained above, the TCP initial congestion window in newer Linux kernel versions is set to 10, as per draft-hkchu-tcpm-initcwnd-01. This value can be found in the tcp.h file and reassigned with a command of the following form, using the gateway and device of the local default route:

$ sudo ip route change default via 192.168.1.1 dev eth0 proto static initcwnd 10

It is worth mentioning that Akamai uses a higher initcwnd value for performance reasons, as do Limelight and Level3. Increasing the initial window value has several advantages, which are discussed below:
1) Reducing latency: the total transfer time is reduced for a range of data sizes (the range depends on the value).
2) Keeping up with the growth of web object size.
3) Recovering faster from loss on under-utilized or wireless links.
However, from the point of view of the network as a whole (rather than an individual connection), increasing the initial window may increase congestion. The increase happens only once, at the start of a connection, and the rest of TCP's congestion back-off mechanism remains in place. Numerous studies in the literature propose solutions to speed up short transfers [2].
A test is performed on the number of packets the server sends in the first round-trip after the initial GET request on a fresh connection. The results are listed below.
Packets received in first round-trip: 16
Bytes received in first round-trip: 23232
Total bytes received: 74403
IP address: 23.218.135.93

H. Traceroute
Traceroute is a utility that traces a packet from the source computer to an Internet host. It shows how many hops the packet needs to reach the host and how long each hop takes. Traceroute is performed for both active ports (80 http, 443 https).
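The hop-by-hop view comes from probing with increasing time-to-live (TTL) values: each router on the path decrements the TTL, and the router that brings it to zero reports back, revealing itself. The sketch below simulates that idea on an in-memory list of hops instead of sending packets. The function name simulate_traceroute and the hop names are made up for illustration, and the code does not reproduce what Nmap does on the wire.

```python
def simulate_traceroute(path, max_ttl=30):
    """Simulate TTL-limited probing along `path`.

    `path` is a list of hop names whose last entry is the destination.
    Returns one (ttl, responder) pair per probe, stopping as soon as
    the destination itself answers or max_ttl probes have been sent.
    """
    if not path:
        return []
    last = len(path) - 1
    results = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        for index, hop in enumerate(path):
            remaining -= 1  # every hop on the way decrements the TTL
            if remaining == 0 or index == last:
                # The TTL ran out here, or the probe reached the destination.
                results.append((ttl, hop))
                break
        if index == last:
            break
    return results


# Hypothetical four-hop path, not the route measured in this report.
for ttl, responder in simulate_traceroute(["gateway", "isp-router", "exchange", "server"]):
    print(ttl, responder)
```

Real traceroute tools additionally time each probe, which is where the per-hop RTT column of the result tables comes from.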
The results of the Nmap scanning test are provided in Tables VIII and IX.

TABLE VIII
USING PORT 80/TCP

HOP  RTT       ADDRESS
1    7.97 ms   bb0-vlan50-th1.59-bdp.teentelecom.net (86.107.58.33)
2    8.01 ms   interlink-routers.use.teentelecom.net (172.16.0.177)
3    8.04 ms   interlink-routers.use.teentelecom.net (172.16.0.165)
4    9.02 ms   bb3-v505-cb.nxdt.teentelecom.net (193.138.193.69)
5    23.03 ms  81.183.1.102
6    35.03 ms  decix-fra9.netarch.akamai.com (80.81.192.28)
7    33.07 ms  a23-14-93-64.deploy.static.akamaitechnologies.com

Nmap done (port 80/tcp): 1 IP address (1 host up) scanned in 25.13 seconds. Raw packets sent: 108 (9.042KB), Rcvd: 77 (5.244KB).

TABLE IX
USING PORT 443/TCP

HOP  RTT       ADDRESS
1    0.71 ms   bb0-vlan50-th1.59-bdp.teentelecom.net (86.107.58.33)
2    1.72 ms   interlink-routers.use.teentelecom.net (172.16.0.177)
3    1.75 ms   interlink-routers.use.teentelecom.net (172.16.0.165)
4    1.78 ms   bb3-v505-cb.nxdt.teentelecom.net (193.138.193.69)
5    15.75 ms  81.183.1.96
6    33.76 ms  decix-fra9.netarch.akamai.com (80.81.192.28)
7    32.79 ms  a23-14-93-64.deploy.static.akamaitechnologies.com

Nmap done (port 443/tcp): 1 IP address (1 host up) scanned in 22.98 seconds. Raw packets sent: 74 (6.084KB), Rcvd: 75 (5.040KB).

The Wireshark software package [3] is also used to perform tests on the domain foxnews.com. The desired information can be extracted by applying the filters provided in Wireshark. To obtain the number of retransmitted packets on TCP port 80, the following filter is applied in Wireshark:
tcp.analysis.retransmission and tcp.port == 80

23 packets were retransmitted. The same filter was used for TCP port 443 (this time with tcp.port set to 443); the results show that 3 packets were retransmitted.

Fig. 2. Wireshark-TCP 443

Figure 2 shows a screenshot of Wireshark filtering on TCP port 443 and retransmitted packets. There is good agreement between the results obtained with Wireshark and with the Nmap test. The size of the receive window, regardless of the number of unacknowledged bytes, is illustrated in Figure 3 (TCP Stream Graph -> Window Scaling Graph), for which the TCP port is set to 80. The maximum window size obtained with TCP port 443 is 423 bytes, which is less than the one observed on TCP port 80.

Fig. 3. Window scaling graph-TCP 80

I. TCP connection termination
In TCP, connection termination is implemented so that each device terminates its end of the connection separately. In other words, when one side closes the connection, that device will no longer send data but can still receive data until the other device has decided to stop sending. This allows all data pending on both sides to be sent before the connection is ended. It also explains why, in the test, a TCP Reset is sent by the client after it receives the [FIN, ACK] packet from the server.

J. Test
Using the Load Impact online tool [4], a load test is performed on the foxnews.com domain. A total of 50 active virtual users is used in this test. Figure 4 shows the variation of the user load time and the total number of active TCP connections over time. User scenarios are generated in three different categories using Load Impact:
NonMemberSimple: This user only browses the web pages to read news.
NonMember: This user explores the news pages, uses the search engine embedded in the website and also watches the videos available on the website.
Member: This user does everything a NonMember user can do, plus commenting and downloading data.
Overall, 142 pages and 948 URLs are tested in 5 minutes. During the test, 160.81 MB of data in total were received and 8931 requests were sent.

Fig. 4. Load test

II. CONCLUSION
In conclusion, the behaviour of the Fox News channel website on the World Wide Web is analysed. The main TCP ports are found to be 80 and 443. Using different filters provided in the Wireshark software package, the packets transmitted by the server have been analysed. The effect of the initial congestion window size has been discussed. For more details, the reader may refer to [2].

REFERENCES
[1] Gordon Lyon. Free security scanner for network exploration. http://nmap.org, 2014. [Online; accessed 16-December-2014].
[2] N. Dukkipati, T. Refice, Y. Cheng, J. Chu, T. Herbert, A. Agarwal, A. Jain, and N. Sutin. An argument for increasing TCP's initial congestion window. ACM SIGCOMM Computer Communications Review, 40:27-33, 2010.
[3] The Wireshark team. Troubleshooting with Wireshark. https://www.wireshark.org, 2014. [Online; accessed 16-December-2014].
[4] The Load impact team. On demand load testing for developers. http://www.loadimpact.com, 2014. [Online; accessed 16-December-2014].

APPENDIX
DOMAIN RECORDS
Fig. 5. Domain records