The Leader in Cloud Security

RESEARCH REPORT

Botnet Analysis Leveraging Domain Ratio Analysis
Uncovering malicious activity through statistical analysis of web log traffic

ABSTRACT

Zscaler is a cloud-computing, Security-as-a-Service (SaaS) vendor with Internet gateways distributed around the globe. Zscaler provides policy- and security-based blocking and logging with a focus on HTTP(S) transactions for enterprise web traffic. In processing millions of global web transactions daily, Zscaler is in a unique position to conduct data mining to uncover emerging web-based threats. During the course of compiling and analyzing statistics for the fourth quarter of 2009, an interesting and widespread compromise was discovered. Zscaler's NanoLog technology, providing immediate access to billions of web log transactions, made this possible. The following report describes Zscaler's NanoLog technology and how it was used to find the threat.
Approach
NanoLog Technology
Detection Methodology
Detection Results
Incident Analysis
Additional Information
Conclusion

Copyright 2009-2010 Zscaler
Approach

Zscaler regularly mines global log data in an effort to uncover previously unidentified threats. Global log data provides a unique view of end-user traffic, and statistical analysis can be leveraged to highlight anomalous traffic requiring further investigation. Using domain ratio analysis, Zscaler was able to uncover previously unknown traffic for a botnet operating in the .nu domain.

NanoLog Technology

Given the volume of web transactions traversing Zscaler's global cloud every second, efficient storage and retrieval of log data is essential. To handle this challenge, Zscaler designed and developed a binary data format that is highly scalable from a storage standpoint and far more efficient than a traditional relational database. Data is written and retrieved using temporal information to limit disk I/O, specifically the number of times that the hard drive has to pick up and move its head. The NanoLogs are highly optimized to prevent duplicate writes of variable-sized information, such as URL strings.

Detection Methodology

One of the broad statistics monitored by Zscaler is top-level domain (TLD) usage. Within this data, ratios are investigated by computing the number of transactions per unique domain for each TLD. A low ratio means that the transactions were broadly distributed across the many domains visited. A ratio of 1:1, for example, means that there was approximately one web transaction per unique domain visited. A high ratio indicates that there were many more transactions than unique domains visited, suggesting that one or more popular domains dominated the usage of that particular TLD. Popular domains like Google, Facebook, Amazon, Yahoo, Microsoft, MySpace, Twitter, etc., increase the ratio within well-utilized generic TLDs (gTLDs) such as .com, because a few popular domains account for a large number of the transactions for that TLD.
At the same time, there are many domains within these gTLDs, which lowers the ratio slightly, though it still remains high overall. For example, .com had ratios of 726:1, 702:1, and 799:1 in October, November, and December 2009, respectively.

It is interesting to further analyze domain results for less popular TLDs, specifically those that had a higher ratio than the gTLDs, from both a statistical and a security perspective. Criminals frequently register domains under TLDs that are less in demand because they are cheaper and because, in some cases, the domain registry (maintainer of the TLD) and/or registrar (maintainer of the domain record) has poor abuse-handling procedures. Additionally, the registry and/or registrar may either be complicit in the illegal activity or be in a jurisdiction with a legal system that protects the domain from being de-registered or from having its registration information shared with law enforcement.

A TLD with a high ratio of transactions per unique domain has one or more domains with a large number of transactions. It can be valuable to sift through the records to explain the high-ratio TLDs. They may represent malicious command and control (C&C) traffic, or perhaps an information drop server that has a large number of transactions beaconing to the domain's server. Such a ratio could also represent benign traffic, as would be the case with a popular social networking site in a particular country.
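The ratio described in this section can be sketched in a few lines of Python. This is an illustrative sketch only, not Zscaler's NanoLog query: the function names (`tld_of`, `domain_ratios`, `high_ratio_tlds`), the use of .com as the comparison baseline, and the sample URLs are assumptions made for the example. The full hostname is treated as the "domain" because the report does not say how domains are normalized.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def tld_of(host):
    """Return the last DNS label of a hostname, lowercased (e.g. 'nu')."""
    return host.rstrip(".").rsplit(".", 1)[-1].lower()


def domain_ratios(urls):
    """Compute transactions per unique domain for every TLD.

    `urls` holds one request URL per logged transaction.
    Returns {tld: (transactions, unique_domains, ratio)}.
    """
    hits = Counter()            # transactions seen per TLD
    domains = defaultdict(set)  # unique hostnames seen per TLD
    for url in urls:
        host = urlsplit(url).hostname
        if not host:
            continue            # malformed record, no hostname
        tld = tld_of(host)
        if tld.isdigit():
            continue            # IPv4 literal, not a domain name
        hits[tld] += 1
        domains[tld].add(host)
    return {t: (hits[t], len(domains[t]), hits[t] / len(domains[t])) for t in hits}


def high_ratio_tlds(ratios, baseline="com"):
    """List (tld, ratio) pairs whose ratio exceeds the baseline TLD's, highest first."""
    floor = ratios[baseline][2] if baseline in ratios else 0.0
    flagged = [(t, r[2]) for t, r in ratios.items() if t != baseline and r[2] > floor]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Synthetic sample data (assumption): a handful of .com requests spread over
# several hosts, plus repeated requests to a single .nu host.
SAMPLE_URLS = (
    ["http://www.example.com/", "http://www.example.com/a",
     "http://mail.example.com/", "http://shop.example.com/"]
    + ["http://cvnxus.mine.nu:53/30080000"] * 6
    + ["http://other.example.nu/"]
)

if __name__ == "__main__":
    for tld, ratio in high_ratio_tlds(domain_ratios(SAMPLE_URLS)):
        print(f".{tld}: {ratio:.1f} transactions per unique domain")
```

Comparing against .com mirrors the report's observation that TLDs with a higher ratio than the gTLDs deserve a closer look. A flagged TLD is only a lead for further review, since the high ratio may come from a popular benign site.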
Detection Results

One example of a TLD that bubbled to the top because of a benign domain was .ly. This TLD had ratios of 2140:1, 1792:1, and 1699:1 from October to December 2009, more than double the .com ratios for the same months. The high ratio is explained by the TLD having relatively few unique domains but a large number of transactions to one popular domain: bit.ly, a URL shortening service.

The .nu TLD had even higher ratios: 5063:1, 8083:1, and 2824:1 from October to December 2009. The .nu TLD is assigned to the island state of Niue in the South Pacific Ocean. Wikipedia states that the TLD is "particularly popular in Sweden, Denmark, the Netherlands and Belgium, as nu is the word for 'now' in Swedish, Danish, and Dutch" [1]. While the TLD may be popular in these countries, the ratio shows that one or more domains are dominating the transactions for this TLD. Running a query in Zscaler's NanoLogs for the .nu domains to obtain the count of transactions per domain revealed a large percentage of the transactions going to the domain cvnxus.mine.nu. The URLs to the domain appear as:

hxxp://cvnxus.mine.nu:53/30080000

Incident Analysis

Eighteen hosts were identified as beaconing to the cvnxus.mine.nu domain on port 53/TCP using HTTP. In some cases the beaconing was as frequent as once every 2 minutes. The beaconing activity and the port used were suspicious, as 53/TCP is generally used for DNS traffic. The traffic is present in Zscaler's NanoLogs from late 2009 to January 2010. Analysis of the traffic indicates that each transaction is a "connection finished" transaction. Further analysis indicates that these transactions were TCP acknowledgement (ACK) packets sent to the cvnxus.mine.nu server without any response coming back from cvnxus.mine.nu. Upon detection of this incident, cvnxus.mine.nu was added to the Zscaler block list.
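The per-domain count and the beaconing interval discussed above can be approximated with a short Python sketch. It does not reflect the NanoLog query interface, which the report does not describe. The record layout (timestamp in seconds, client identifier, URL), the function names, and the sample records are all assumptions for illustration, and URLs are assumed to be in their original `http://` form rather than defanged as `hxxp://`.

```python
from collections import Counter, defaultdict
from statistics import median
from urllib.parse import urlsplit


def domain_counts(records, tld):
    """Count transactions per hostname within a single TLD."""
    counts = Counter()
    for _ts, _client, url in records:
        host = urlsplit(url).hostname or ""
        if host.endswith("." + tld):
            counts[host] += 1
    return counts


def beacon_intervals(records, host):
    """Return the median gap in seconds between requests to `host`, per client.

    Clients with fewer than two requests are left out because no gap
    can be measured for them.
    """
    times = defaultdict(list)
    for ts, client, url in records:
        if urlsplit(url).hostname == host:
            times[client].append(ts)
    result = {}
    for client, stamps in times.items():
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        if gaps:
            result[client] = median(gaps)
    return result


# Synthetic records (assumption): one client contacting the suspicious host
# every 120 seconds, and one unrelated request to another .nu host.
SAMPLE_RECORDS = [
    (1000 + 120 * i, "host-a", "http://cvnxus.mine.nu:53/30080000")
    for i in range(5)
]
SAMPLE_RECORDS.append((1300, "host-b", "http://www.example.nu/"))

if __name__ == "__main__":
    print(domain_counts(SAMPLE_RECORDS, "nu").most_common(3))
    print(beacon_intervals(SAMPLE_RECORDS, "cvnxus.mine.nu"))
```

The median is used rather than the mean so that a single long pause, for example a machine that was powered off overnight, does not hide an otherwise regular beacon.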
The mine.nu domain is a DynDNS domain. Dynamic DNS is a service used for rapidly updating fully qualified domain names (FQDNs), that is, hostname plus domain name, for hosts that have dynamic IP addresses. Running dig on the FQDN yields the IP address to which it currently resolves:

cvnxus.mine.nu. 60 IN A 119.167.225.12

(The 60 is the time-to-live, in seconds, for which the record may be held in a DNS server cache. Resolvers therefore return to the DynDNS server for a fresh answer after at most 60 seconds, so a change of IP address propagates quickly.)

The IP belongs to a very small Chinese netblock described as QingdaoWantuoWangluoJishuYouxianGongsi.

[1] http://en.wikipedia.org/wiki/niue
Figure 1: 119.167.225.12 whois

Google results for the rather unique string QingdaoWantuoWangluoJishuYouxianGongsi show a French blog detailing an interesting incident involving this netblock as recently as December 23, 2009 [2]. The blogged incident involves an alleged 0-day PDF exploit using the Missile Defense Agency name to spread malware.

Attempts to connect to the 119.167.225.12 server failed. A wget request to http://cvnxus.mine.nu:53/30080099 returned "failed: Connection refused." Attempts to connect on ports 80 and 443 likewise failed. An Nmap port scan revealed little useful information. It is possible that the server is handling the TCP ACK traffic but not responding to the host sending the packets.

[2] http://sid.rstack.org/blog/index.php/378-av-caesar-infectori-te-salutant
Figure 2: Nmap scan results

The NanoLog records indicate that the transactions had a user agent of "Internet Explorer (Unknown Version)" and issued an HTTP/1.0 GET request of 210 bytes.

Leveraging data partners, Zscaler researchers were able to identify the following malicious artifacts and the ports used to host them on 119.167.225.12 in 2009:

Date        MD5                               Port     VirusTotal
2009-09-23  9f670a220ef58bd445d134fa0f650a62  53/TCP   34/41, Backdoor/Win32.PcClient [3]
2009-09-24  1df16e3bec6f7fead9794a006f405513  443/TCP  6/41, Trojan.CryptRedol
2009-09-24  a01c82b8f52835a108098e4a54e33022  53/TCP   21/41, Trojan.CryptRedol
2009-10-07  0f22d787456e2ca9d9c7b5ad990f5ac4  443/TCP  19/41, Backdoor.Win32.PoisonIvy
2009-10-14  94843482178038b999a07fc61b10227e  443/TCP  14/41, Backdoor.Win32.PoisonIvy
2009-11-30  81e5312aed973655006d57aa3a83233a  443/TCP  32/41, Trojan-Dropper.Win32.Agent.bhxt
2009-11-30  27e326e40c6949b0c22489af61a6816d  443/TCP  31/41, Trojan-Dropper.Win32.Agent.bhxt

[3] http://www.microsoft.com/security/portal/threat/encyclopedia/entry.aspx?name=backdoor%3awin32%2fpcclient

Upon further analysis of the infected hosts, it was determined that the customers were infected with a variant of the Backdoor/PcClient malware family. The specific variant impacting these customers was undetected by antivirus vendors. Upon execution, the malware loads three components onto the system:

- Backdoor component, e.g., <system folder>\yelgcgmh.d1l
- Keylogger component, e.g., <system folder>\yelgcgmh.dll
- Rootkit / driver component, e.g., <system folder>\drivers\yelgcgmh.sys

(Note: the precise filenames may vary, and the rootkit piece may hide these files from view on the system.)

The backdoor then beacons to a remote website using a specific port, in this case 119.167.225.12:53. It can then receive and execute commands from a remote attacker. The keylogger logs keystrokes and saves its gathered data
to a log file usually located in the Windows system folder, for example <system folder>\log.txt. The rootkit may be added as a service and is capable of hiding processes, files, registry entries, and network traffic.

Listed below are some of the FQDNs that have resolved to the command and control IP, 119.167.225.12 (many or all of these domains have been identified in malware incidents). Current resolution for these domains is largely the same, with one domain no longer resolving. None of the domains are listed in the SURBL.org blacklist.

Domain             IP                SURBL Blacklisted
amos.2288.org      119.167.225.12    NO
cvnxus.mine.nu     119.167.225.12    NO
fuckdd.8800.org    119.167.225.12    NO
ngcc.8800.org      119.167.225.12    NO
nodns2.qipian.org  119.167.225.12    NO
packer.8800.org    119.167.225.12    NO
tcw8.com           119.167.225.12    NO
voov.2288.org      119.167.225.12    NO
ewms.6600.org      119.167.225.12    NO
cvnxus.ath.cx      Does not resolve

119.167.225.12 belongs to the 119.164.0.0/14 netblock, part of the AS4837 autonomous system for the China Network Communications Group (Shandong Province).

Note that one of the above domains, tcw8.com, was not handled through a DynDNS domain. The domain registration information for that domain is as follows:
Additional Information

Leveraging a data-sharing partner, historical records of netflow traffic were pulled for the IP in question (119.167.225.12). Numerous records confirmed a large number of hosts on the Internet beaconing back to this IP over 53/TCP. Netflow data also revealed traffic from 119.167.225.12 being forwarded to 222.35.137.169. Below are some of the domains that have been identified as resolving to this IP:

Domain            IP                SURBL Blacklisted
a27278a.8800.org  222.35.137.169    NO
cyhk.3322.org     222.35.137.169    NO
tgyeqp.3322.org   222.35.137.169    NO

Note: the above are all DynDNS domains.

222.35.137.169 belongs to the 222.35.137.0/24 netblock, part of the AS38356 autonomous system for TimeNet Beijing Sincerity-times Network Technology Project Ltd. The Google Safe Browsing report for AS38356 (http://www.google.com/safebrowsing/diagnostic?site=as:38356) states at this time that over the past 90 days:
- 1467 sites on this network served content that resulted in malicious software being downloaded and installed without user consent
- 36 sites on this network functioned as intermediaries for the infection of 141 other sites
- 79 sites on this network infected 2884 other sites

The best guess with the information at hand is that the beaconing is an "I'm alive and infected" notification sent to 119.167.225.12. 119.167.225.12 then periodically notifies the next-tier command and control (C&C) server, 222.35.137.169, to provide a list of hosts that can be contacted and sent commands through the installed backdoors.

Conclusion

The analysis detailed in this report demonstrates a successful methodology utilizing Zscaler's logging capabilities to detect previously undetected infected hosts. Domain ratio analysis can be leveraged to quickly identify instances where there is a disproportionate number of transactions per site, indicating a popular site or, in this case, recurring transactions to a command and control host. It is not enough to simply have good content inspection and URL filtering technology in place, as the malware had poor anti-virus detection and the URLs did not exist in data feeds or block lists. Organizations and vendor partners must have adequate logging and conduct regular analysis of these logs. Zscaler is in a unique position to conduct threat analysis across customer organizations worldwide and provide detailed threat analysis with the necessary protections. Once the incident was detected, Zscaler was able to quickly identify all of the infected hosts and push a rule into the cloud to immediately block any communication to the C&C hosts. This analysis was then shared with the impacted customers, and further analysis was conducted to isolate the related malware artifacts. The anti-virus products running on these customer hosts did not have detection signatures available for the particular malware variant.
The malware sample was shared and the needed anti-virus signatures were written and pushed into production. Subsequent sharing of this analysis with other data partners revealed others with previously undetected infected hosts beaconing to the command and control. Zscaler's customer and partner relationships allowed for thorough and professional incident response for its impacted customers as well as a broader notification to others impacted by this threat. This further demonstrates the benefits of good logging and analysis, and how Zscaler's NanoLog technology can be leveraged to detect new, previously undetected malware incidents.