Botnet Detection by Monitoring Group Activities in DNS Traffic




Hyunsang Choi, Hanwoo Lee, Heejo Lee, Hyogon Kim
Korea University
{realchs, hanwoo, heejo, hyogon}@korea.ac.kr

Abstract

Recent malicious attempts are intended to obtain financial benefits through a large pool of compromised hosts, which are called software robots or simply bots. A group of bots, referred to as a botnet, is remotely controllable by a server and can be used for sending spam mails, stealing personal information, and launching DDoS attacks. The growing popularity of botnets compels us to find proper countermeasures, but existing defense mechanisms can hardly catch up with the speed of botnet technologies. In this paper, we propose a botnet detection mechanism that monitors DNS traffic to detect botnets, which form a group activity in DNS queries simultaneously sent by distributed bots. A few works have been proposed based on particular DNS information generated by a botnet, but they are easily evaded by changing bot programs. Our anomaly-based botnet detection mechanism is more robust than the previous approaches, so that variants of bots can be detected by looking at their group activities in DNS traffic. Experiments on a campus network show that the proposed mechanism can detect botnets effectively while bots are connecting to their server or migrating to another server.

I. INTRODUCTION

The explosive growth of the Internet provides much improved accessibility to a huge amount of valuable data. However, numerous vulnerabilities are exposed and the number of incidents is increasing over time. In particular, recent malicious attempts differ from old-fashioned threats in that they are intended to obtain financial benefits through a large pool of compromised hosts. This horrifying new type of threat endangers millions of people and the network infrastructure around the world. For example, compromised hosts steal personal information, which can lead to significant financial losses, and are simultaneously used for delivering spam mails and launching DDoS (Distributed Denial of Service) attacks.
A large pool of compromised hosts, called bots, communicate with a bot controller to coordinate the network of bots. Such a network is commonly referred to as a botnet. An attacker, called a botmaster, controls a botnet to perform various malicious activities. Recent attacks show that the intention is to gain financial benefits from the attacks. Most bots can perform a hybrid of previous threats combined with a communication system. They can propagate like Internet worms, hide themselves from detection systems, and launch DDoS attacks like DDoS attack toolkits. These crossbreed techniques make the botnet intelligent and hard to handle through a security mechanism.

One prominent characteristic of botnets is the use of command and control (C&C) channels. The main purpose of the channels is to deliver the commands of a botmaster. Today's botnets use the Internet Relay Chat (IRC) protocol [1], which was mainly designed for group communication in discussion forums called channels. These channels are now used for the communication of a botnet among distributed bots and their controller.

Defending against botnets is a pressing problem that is still not well understood, though botnets first appeared several years ago. Former defense mechanisms focused on a particular symptom of bots or a signature of bot programs. Even though these studies were meaningful for developing better defense mechanisms, their approaches have intrinsic limits, such as ineffectiveness in detecting unknown bot programs, which are either slight modifications of an existing bot program or newly generated bot programs. Recent studies such as [2] on botnet measurements and detection have the same weakness against variants of bot programs.

(This work was supported in part by the ITRC program of the Korea Ministry of Information, and Basic Research Program of the Korea Science & Engineering Foundation. This work was partially supported by Defense Acquisition Program Administration and Agency for Defense Development under the contract (UD060048AD).)
The main contribution of this study is the development of an anomaly-based botnet detection mechanism that monitors group activities in DNS traffic. A botmaster constructs and manages his botnet in several steps, and bots rally to the C&C server at an early stage. Most bots use DNS in the rallying process, and this DNS traffic has unique features, which we define as group activity. Such DNS traffic also appears in other stages; therefore, by using the group activity property of botnet DNS traffic, we can detect botnets. A few studies use DNS to detect botnets, and some of them use DNS redirection to monitor botnets. However, they are easily evaded once a botmaster knows about them. In contrast, our approach does not need any DNS redirection or communication with any component of the botnet.

We developed the botnet detection mechanism in the following four steps. First, we found several features of botnet DNS traffic that are distinguishable from legitimate DNS traffic. Second, we defined the key feature of botnet DNS traffic, called group activity. Third, we developed an algorithm that differentiates botnet DNS queries by using the group activity feature. Last, we analyzed the algorithm to show the feasibility of our mechanism.

The mechanism is anomaly-based, so it can detect botnets regardless of the type of bot and botnet. The mechanism uses the information in IP headers, which enables it to detect botnets even if they use SSH (Secure Shell) or any other channel encryption method. Moreover, the mechanism can detect botnets irrespective of the protocol they use. We also developed a mechanism that detects C&C server migration. A botnet frequently changes its C&C server by migrating to a candidate C&C server. Our algorithm can find the botnet even when bots are migrating to another candidate C&C server.

Section 2 reviews related work on botnets. Section 3 describes the main features of botnets, including the unique pattern of botnet DNS traffic, the rallying problem, and botnet migration. We then introduce a botnet detection mechanism in Section 4 and evaluate the feasibility and effectiveness of the mechanism in Section 5.

II. RELATED WORK

The existence of botnets was recognized several years ago, but studies on defending against botnets are still at an early stage. Some security companies and institutions have analyzed botnet traffic and propagation methods, and have furthermore proposed botnet detection and response mechanisms. However, their defense mechanisms focus on the symptoms of abnormal network traffic and on detecting bot binaries by matching them against the signatures of known bot code. Even though these are useful in many cases, they have the inevitable limitation that they are unable to detect new or modified bots.

There has been some research on the methodological analysis of bots and botnets, such as their behaviors, statistics, and traffic measurements. Jones [3] provided botnet background and recommendations so that network and system security administrators can recognize and defend against botnet activity. Cooke et al. [4] outlined the origins and structure of bots and botnets, presented data from the operator community, studied the effectiveness of detecting botnets by directly monitoring IRC communication or other command and control activity, and showed that a more comprehensive approach is required. Barford et al. presented a perspective based on an in-depth analysis of bot software source code that reveals the complexity of botnet software, and discussed implications for defense strategies based on the analysis [5]. Rajab et al. [2] constructed a multifaceted infrastructure to capture and concurrently track multiple botnets in the wild, and achieved a comprehensive analysis of measurements reflecting several important structural and behavioral aspects of botnets. They studied botnet behavior, botnet prevalence on the Internet, and modeling of the botnet life cycle.

Recently, a few attempts have been made to cope with botnet problems, and most of them focus on botnet detection. Bots send DNS queries in order to access the C&C channel server. If the domain name of the C&C channel server is known, it can be blacklisted and sinkholed to capture and measure the botnet traffic. Dagon et al. [6] identified key metrics for measuring the utility of a botnet and described various topological structures that botnets may use to coordinate attacks. Using these performance metrics, they considered the ability of different response techniques to degrade or disrupt botnets. Their study used DNS redirection to monitor botnets. However, our approach does not need any DNS redirection or communication with any component of the botnet. Dagon also presented a botnet detection and response approach [7] that analyzes peculiarities of botnet rallying DNS traffic (particularly, measuring the canonical DNS request rate and comparing DNS density). However, the detection technique could easily be evaded when botmasters know the mechanism, and it could be poisoned by fake DNS queries. Kristoff [8] also suggested a similar approach, but the mechanism has the same weakness. Binkley [9] proposed an anomaly-based algorithm for detecting IRC-based botnet meshes. The algorithm combines an IRC mesh detection component with a TCP scan detection heuristic called the TCP work weight. It can detect IRC channels containing hosts with a high work weight, but some of these hosts may not be botnet members (false positives), and many borderline cases need additional analysis, as mentioned in the paper.
Ramachandran [10] developed techniques and heuristics for detecting DNSBL reconnaissance activity, whereby botmasters perform lookups against the DNSBL to determine whether their spamming bots have been blacklisted. This approach to botnet detection is derived from the novel idea of detecting the DNSBL reconnaissance activity of botmasters, but it also has false positives and some defects that are noted in their paper.

Botnets are constructed and managed in several stages, such as bot infection, C&C server rallying, and other types of malicious activities. Defense against botnet attacks seems to be a very complicated task. Only a few works have been done in this area, and further improvements are needed for practical use. Moreover, previous works are difficult to use for finding all types of botnets because botnets have complex behavior patterns.

III. BOTNET

A. Growth of Botnet

A botnet is a large pool of compromised hosts that are controlled by a botmaster. Recent botnets use an Internet Relay Chat (IRC) server as their C&C server for controlling the botnet. A botmaster can issue commands to his botnet through the IRC C&C channel. It has been shown that most botnets use IRC for the C&C process [11]; however, the traffic among the bots, the C&C server, and the botmaster can be mistaken for legitimate traffic because it is hard to distinguish from normal traffic.

Regarding the size and prevalence of botnets, as many as 172,000 new bots are recruited every day according to CipherTrust [12], which means that about 5 million new bots appear every month. Symantec [13] recently reported that the number of bots observed in a day is 30,000 on average. The total number of bot-infected systems has been measured to be between 800,000 and 900,000. A single botnet comprising more than 140,000 hosts was found in the wild, and botnet-driven attacks have been responsible for single DDoS attacks of more than 10 Gbps capacity [14].

B. Rally Problem and IRC Server

Since vulnerable hosts are infected through self-propagating worms, email messages, messengers, and other random spreading processes, the key problem for a botmaster is how to rally the infected hosts. Botmasters want their botnets to be invisible and portable, and therefore they use DNS for rallying. It is possible to use other methods for rallying the bots; however, most of them cannot provide both mobility and invisibility at the same time. For example, if the bot binary contains the IP address of the C&C server as a hard-coded string, the C&C server is exposed to reverse engineering. Even if the IP address of the C&C server is obfuscated to prevent trivial reverse engineering analysis, the hard-coded IP address is unchangeable, so it cannot provide any mobility. If the C&C server is not secure or mobile, it is easily cleaned up and rendered ineffective. A single alarm or misuse report can cause the C&C server to be quarantined or the botnet to be suspended.

C. C&C Server Migration

If a botnet uses only a single C&C server, the botnet can easily be detected and disarmed. Thus, a botmaster wants to arrange several C&C servers, which can be listed in the bot binary, for the stability of the botnet. To keep the botnet portable, the botmaster also uses dynamic DNS (DDNS) [15], a resolution service that automatically detects a change in a server's IP address and updates the DNS record accordingly through frequent updates. Even if the root C&C server does not operate well or a link failure occurs, candidate C&C servers can substitute for the root C&C server. It has been observed that botnets migrate their C&C servers frequently [6], either by being instructed to move to a new IRC channel/server or by downloading replacement software that points them to a different C&C server. There is some captured evidence of such migration, in which bots simultaneously participate in two separate botnets. The botmaster moves his botnet by changing the C&C server in order to evade capture. In the wild, most C&C servers (65%) were observed to be up for only 1 day [16].
Even if the previous domain name of a botnet C&C server is blocked, the botmaster can simply move his botnet to another candidate C&C server.

D. Features of Botnet DNS

As mentioned above, infected hosts automatically access the C&C server by its domain name. Therefore, a DNS RR (resource record) query is used, and such a query also appears in other situations. The following five cases show the situations in which DNS queries are used in a botnet.

(1) At the rallying procedure: If the host infection succeeds, the infected hosts must be gathered, and as described in Section 3.B, DNS is used.
(2) At the malicious behaviors of a botnet: Several types of malicious activities, such as DDoS attacks and spam mailing, are accompanied by DNS transmissions.
(3) At C&C server link failures: If the network or link of the C&C server fails, bots cannot access the C&C server. After a while (after the TCP 3-way handshake has failed), they begin to send DNS queries to the DNS server.
(4) At C&C server migration: As mentioned in Section 3.C, the botnet migrates from one C&C server to another candidate C&C server. At that moment, DNS queries are also used.
(5) At C&C server IP address changes: If a C&C server uses a dynamically allocated IP address (DHCP), the corresponding IP address can change at any time, and a botmaster can also change the IP address of the C&C server intentionally. If the IP address of the C&C server changes, the bots cannot connect to the old IP address of the server, so they send DNS queries to access the new C&C server.

                                      Botnet DNS                          Legitimate DNS
Source IPs accessing the domain name  Fixed-size group (botnet members)   Anonymous (legitimate users)
Activity and appearance patterns      Group activity; appears             Non-group activity; appears randomly
                                      intermittently (specific            and continuously (usually)
                                      situations)
DNS type                              Usually DDNS                        Usually DNS

Fig. 1. Differences between Botnet and Legitimate DNS

DNS queries of botnets can be distinguished from legitimate DNS queries by the unique features of botnet DNS queries. Figure 1 shows some differences between botnet DNS queries and legitimate DNS queries.
First, only botnet members send queries for the domain name of the C&C server (fixed size); legitimate users never query the C&C server domain name. Therefore, the number of distinct IP addresses that query a botnet domain is normally fixed. On the other hand, legitimate sites are usually queried by anonymous users (random). Second, the fixed members of a botnet act and migrate together at the same time. The group activity of a botnet derives from this property. DNS queries from a botnet occur temporarily and simultaneously, whereas most legitimate DNS queries occur continuously and do not occur simultaneously. Botnet queries appear in the specific situations mentioned above, so they appear intermittently. Third, a botnet usually uses DDNS for its C&C server, but legitimate sites do not commonly use DDNS.

IV. DNS-BASED BOTNET DETECTION MECHANISM

A. Botnet DNS Query Detection Algorithm

We developed a botnet DNS query detection algorithm by using the different features of botnet DNS and legitimate DNS described in Section 3.D. The algorithm is separated into three parts: (1) Insert-DNS-Query, (2) Delete-DNS-Query, and (3) Detect-BotDNS-Query. Figure 2 shows the Insert-DNS-Query stage of the algorithm. There is a database for storing DNS query data, which includes the source IP address of the query, the domain name of the query, and the timestamp at which the query was received. We group the DNS query data by domain name and timestamp. Figures 3, 4, and 5 present the algorithm in pseudocode.

[Fig. 2. Grouping queries by query domain name and timestamp. Incoming (D)DNS queries are stored in a database with the columns source IP address, query domain name, and timestamp (for example, Query1: 216.152.36.165, www.xxx.net, t1; Query2: 213.15.36.178, ccs.korea.ac.kr, t2) and are then grouped so that each domain name holds a list of source IP addresses.]

    Insert-DNS-Query(Q_t)          Q_t = DNS queries between time t-1 and t
      A_t = array for DNS queries
      FOR each query Q in Q_t
        DN = request domain name of Q
        IF DN is not in A
          insert(DN, A)
          IP, IPList = IP address of Q, IP list of DN
          insert(IP, IPList)
        ELSE IF IP is not in IPList
          cnt = size of IPList
          cnt++
          insert(IP, IPList)
        ENDIF
      ENDFOR
    End of Insert-DNS-Query

Fig. 3. Insert-DNS-Query

    Delete-DNS-Query(A)
      FOR each of the n domain names DN in A
        W, T = whitelist, size threshold
        IF (DN is in W) OR (DN => cnt < T)
          delete(DN, A)
          delete(IPList), delete(cnt)
        ENDIF
      ENDFOR
    End of Delete-DNS-Query

Fig. 4. Delete-DNS-Query

First, an array A is prepared for storing the DNS queries. We insert the domain name and source IP address of each query into A. When a new query comes in, we check whether its domain name already exists in A. If it is a new domain name, we insert the data. Otherwise, we check whether the IP address already exists in the IP list of the domain name and insert the IP address if it does not. In this step, the data of the DNS queries (domain name, source IP addresses, and timestamps) are arranged by the requested domain name. Second, we execute the Delete-DNS-Query step to remove redundant DNS queries. If the size of the IP list does not exceed the size threshold, or the domain name is a legitimate one that already exists in a whitelist, the queries for that domain name do not have to be processed. Therefore, the domain name is removed from array A to reduce the processing overhead and save memory. Finally, we find the botnet DNS queries in the Detect-BotDNS-Query step. We define and compute a numerical value for the group activity of botnet DNS, called similarity.
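The Insert-DNS-Query and Delete-DNS-Query steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of a dictionary of sets in place of array A, and the example domain names are assumptions made for illustration. The deletion condition follows the pseudocode (drop a domain if it is whitelisted or its IP count is below the size threshold).

```python
from collections import defaultdict


def insert_dns_queries(queries):
    """Group one time unit of DNS queries by requested domain name.

    `queries` is an iterable of (source_ip, domain_name) pairs (assumed input
    format). Returns a dict mapping each domain name to the set of distinct
    source IPs that queried it during the time unit.
    """
    table = defaultdict(set)
    for source_ip, domain_name in queries:
        # A set stores each source IP once, like the "IP not in IPList" check.
        table[domain_name].add(source_ip)
    return dict(table)


def delete_dns_queries(table, whitelist, size_threshold):
    """Drop whitelisted domains and domains with fewer than size_threshold IPs."""
    return {
        domain: ips
        for domain, ips in table.items()
        if domain not in whitelist and len(ips) >= size_threshold
    }


if __name__ == "__main__":
    # Hypothetical traffic: five hosts query the same (made-up) C&C domain.
    queries = [(f"10.0.0.{i}", "cc.example.net") for i in range(1, 6)]
    queries += [("10.0.0.1", "cc.example.net")]          # repeated query
    queries += [("10.0.0.7", "rare.example.org")]        # single-host domain
    table = insert_dns_queries(queries)
    kept = delete_dns_queries(table, whitelist=set(), size_threshold=5)
    print(sorted(kept))
```

With the default parameters reported in the evaluation (size threshold 5), only domains queried by at least five distinct hosts in the time unit survive this filter, which is what keeps the later comparison step small.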
Suppose there are two IP lists that were requested at times t1 and t2 for the same domain name, and let the sizes of the two IP lists be A and B. If some IP addresses appear in both IP lists, let the number of duplicated IP addresses be C. We let S denote the similarity such that

$$S = \frac{1}{2}\left(\frac{C}{A} + \frac{C}{B}\right) \quad (A \neq 0,\ B \neq 0).$$

If A = 0 or B = 0, we define the similarity as -1. If the similarity is approximately 0, we whitelist the domain name and delete the IP list of the domain. Assume that there is a domain name DN that is requested by multiple source IP addresses at a certain time; we measure how many of those source IP addresses request DN again in each subsequent time slot. Due to the features of botnet DNS described in Section 3.D, the similarity of botnet DNS is close to 1, unlike that of legitimate DNS. A suspicious domain name that has occurred just once and could occur again later has a similarity of -1; we insert such a domain name into the blacklist so that it is monitored from then on.

    Detect-BotDNS-Query(A)
      FOR each of the n domain names DN
        IF (A_t1 => DN) is equal to (A_t2 => DN)
          similarity(A_t1 => IPList, A_t2 => IPList)
          S = computed similarity
          IF S > a                           a = similarity threshold
            DN is a botnet domain name
          ELSE IF S = -1 THEN insert(BL, DN)   BL = blacklist
          ELSE insert(W, DN)
          ENDIF
    End of Detect-BotDNS-Query

Fig. 5. Detect-BotDNS-Query

B. Migrating Botnet Detection Algorithm

The botnet DNS query detection algorithm enables us to identify a botnet. However, the algorithm cannot detect botnets migrating to another C&C server. Therefore, we developed the migrating botnet detection algorithm by modifying the botnet DNS query detection algorithm. The first and second stages (Insert-DNS-Query and Delete-DNS-Query) are the same, but the third step of the algorithm is different. During the migration of a botnet, bots use two different C&C server domain names; therefore, we compare the IP lists of different domain names that have IP lists of similar size. Here, "similar size" is determined on the basis of experiment. As mentioned in Section 3.C, botmasters frequently move their botnets to change the C&C server, and most C&C servers (65%) are up for only 1 day in the wild. Therefore, the detection algorithm for migration activity is a significant part of the botnet detection system.

C. Botnet Detection System

The botnet detection system, which combines botnet query detection and migrating botnet detection, requires DNS traffic data. Ideally, large-scale DNS traffic data from deployed sensors is provided as input, because botnets are usually dispersed across different networks. Therefore, if the detection system is applied to a small network, detection accuracy can decrease. Moreover, the system is sensitive to the threshold values, so they must be chosen carefully.

[Fig. 6. Distribution of IP List Size. x-axis: size of IP list (1 to 100); y-axis: distribution (0 to 1).]

V. EVALUATION

In order to evaluate the effectiveness of the proposed mechanism, we measured its detection performance in our testbed network. The proposed mechanism is implemented as a botnet detection system, and the system is executed on a campus network with botnets. We built a bot from Agobot code, one of the most famous bots, and secured the IRC C&C server and its channels. Over 50 machines are used in the botnet, and the testbed network is linked with the campus network; therefore, we carefully made our botnet invisible and secure to prevent it from being exposed. We wrote a scenario script for verifying the algorithms. The scenario includes botnet construction, rallying to the C&C server, and command and control for spam mailing, DDoS attacks, C&C server migration, and so on. The scenario contains the situations described in Section 3.D in order to validate the botnet DNS query detection algorithm. We also migrate our botnet from the root IRC C&C server to a candidate IRC server to verify the migrating botnet detection algorithm. We use Pentium 4 processor PCs running Windows XP. The default values of the parameters are as follows.
A time unit is 1 hour, the size threshold for the detection algorithm is 5 (size of the IP list), and the similarity threshold is 0.8, because this is an adequate value lying between the similarity of the botnet domain and the maximum similarity of legitimate domains. We ran our botnet for evaluation and captured the traffic for 10 hours.

A. Botnet DNS Query Detection

The botnet in our testbed performs several kinds of activities, which include spam mailing, DDoS, C&C server migration, and so on. Some of them provoke DNS traffic, and consequently our algorithm can detect the botnet. The sizes of the IP address lists are distributed as shown in Fig. 6. The size of an IP list is the number of distinct source IP addresses that queried the same domain name during 1 hour, and Fig. 6 shows that over 80% of the IP lists have size 1. This means that most DNS queries for a given domain are sent from only 1 host during 1 hour. The size threshold of the IP list is set to 5, which eliminates 92.5% of the DNS queries and makes the botnet DNS query detection algorithm very efficient. In conclusion, our algorithm can detect the botnet properly if over 5 botnet members exist in a class C sized network (the size of our experimental campus network).

[Fig. 7. Similarity of Each Domain Name. x-axis: domain name (0 to 2000); y-axis: similarity (0 to 1); threshold = 0.8; labeled points: (a) pdbox, (b) botnet, (c) idisk, (d) pruna, (e) soribada.]

The algorithm checks all domain names that were not eliminated in the previous step. The similarities at a certain time are shown in Fig. 7. There are about 2300 different domain names, including the botnet domain name (if domain names are affiliated with each other, we plot the highest similarity value). Most of the similarities are equal to 0 or -1 (90%). Suppose that domain name DN is queried by source IP list A during time t and by IP list B during time t+1. In that case, a computed similarity of 0 means that IP list A is totally different from B.
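The similarity values interpreted here follow from the definition in Section 4.A. As a minimal sketch (the function names `similarity` and `classify` and the example addresses are hypothetical, and the decision order mirrors the Detect-BotDNS-Query pseudocode rather than any published implementation), the computation could look like this in Python:

```python
def similarity(ips_t1, ips_t2):
    """Similarity S = (C/A + C/B) / 2 of two IP lists for the same domain.

    A and B are the list sizes and C is the number of IP addresses that
    appear in both lists. Returns -1 if either list is empty, as in the text.
    """
    set_a, set_b = set(ips_t1), set(ips_t2)
    a, b = len(set_a), len(set_b)
    if a == 0 or b == 0:
        return -1.0
    c = len(set_a & set_b)
    return 0.5 * (c / a + c / b)


def classify(s, threshold=0.8):
    """Map a similarity value to the action taken in Detect-BotDNS-Query."""
    if s > threshold:
        return "botnet"       # group activity: nearly the same hosts reappear
    if s == -1:
        return "blacklist"    # seen in only one time slot; keep monitoring
    return "whitelist"        # little or no overlap between the two slots


if __name__ == "__main__":
    slot_1 = [f"10.0.0.{i}" for i in range(1, 11)]   # 10 hosts at time t
    slot_2 = [f"10.0.0.{i}" for i in range(2, 12)]   # 9 of them again at t+1
    s = similarity(slot_1, slot_2)
    print(s, classify(s))
```

In the example, C = 9 and A = B = 10, so S = (9/10 + 9/10)/2 = 0.9, which exceeds the 0.8 threshold. Two disjoint lists give S = 0, matching the case where IP list A is totally different from B.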
If the similarity of DN is -1, DN was requested in only one time slot (t or t+1), and it is added to the blacklist of the algorithm because it is suspected of being a botnet domain. The other domain names mostly range from 0 to 0.2 (7.4%). This implies that a host that queried such a domain (with similarity from 0 to 0.2) in timeslot t1 would send a query to the same domain in t1+1 with a probability from 0% to 20%. Only the similarity of the botnet domain exceeds the threshold of 0.8, so the botnet domain name can be detected. Some interesting domain names with a similarity larger than 0.2 are shown in Fig. 7 ((a) to (e)), and all of them were identified as P2P sites or sites that transfer enormous files. (a) is the domain name of pdbox [17] and (c) is the domain name of idisk [18]; both sites provide a service for uploading and downloading large personal files such as movies, games, and mp3 files. (d) is the domain name of pruna [19] and (e) is the domain name of soribada [20]; both provide P2P service. We conjecture that users who access a P2P or file transfer site tend to keep the connection up and to access the same site more continuously than other sites. Therefore, these domains have a higher similarity than other domains.

B. Migrating Botnet Detection

We also ran the migrating bot detection algorithm with the scenario script. In the worst case, the algorithm runs in O(n^2). Nevertheless, our algorithm operates in a reasonable time (about 5 minutes for a 1-hour DNS trace) because the algorithm removes the IP lists that do not exceed the size threshold (92.5% of DNS queries are removed). Here, the similar size is set to within 10% of the size of the IP list. For example, if the size of an IP list is 100, we compare the IP list with all other IP lists that have a size within 95 to 105. One of the results that include botnet migration is shown in Fig. 8, and the algorithm detects the migrating bots correctly. For most IP lists, the similarity gets lower as the size of the IP list increases, because as the IP lists get larger, the probability that source IP addresses are duplicated between two IP lists of similar size gets lower.

[Fig. 8. Similarity of Each Size of IP List. x-axis: IP list size (50 to 450); y-axis: similarity (max) (0 to 1); threshold = 0.8; the bot migration point is marked as detected.]

VI. DISCUSSION

Our algorithm worked properly within a reasonable processing time, but if the system had to monitor a very large network, the processing time could become a big problem.
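To make the source of this cost concrete, the migrating botnet comparison of Section 5.B can be sketched as a pairwise scan over the IP lists that survive the size threshold. The Python sketch below is an illustration under assumptions, not the authors' code: the function names are hypothetical, the 5% tolerance mirrors the "95 to 105 for a list of size 100" example, the domain names are made up, and Python sets (which are hash-based) stand in for whatever data structure the original system used.

```python
from itertools import combinations


def similarity(ips_a, ips_b):
    """S = (C/A + C/B) / 2, or -1 if either IP set is empty."""
    a, b = len(ips_a), len(ips_b)
    if a == 0 or b == 0:
        return -1.0
    c = len(ips_a & ips_b)
    return 0.5 * (c / a + c / b)


def similar_size(n, m, tolerance=0.05):
    """True if the two list sizes differ by at most `tolerance` (assumed 5%)."""
    return abs(n - m) <= tolerance * max(n, m)


def detect_migration(table, threshold=0.8, tolerance=0.05):
    """Compare IP sets of *different* domains that have similar sizes.

    `table` maps domain name -> set of source IPs. Returns a list of
    (domain_1, domain_2, S) for pairs whose similarity exceeds the threshold.
    The pairwise loop is the O(n^2) worst case mentioned in the text.
    """
    hits = []
    for (d1, ips1), (d2, ips2) in combinations(table.items(), 2):
        if not similar_size(len(ips1), len(ips2), tolerance):
            continue  # skip pairs whose sizes are too different
        s = similarity(ips1, ips2)
        if s > threshold:
            hits.append((d1, d2, s))
    return hits


if __name__ == "__main__":
    bots = {f"10.0.0.{i}" for i in range(1, 21)}
    table = {
        "old-cc.example.net": bots,
        "new-cc.example.net": (bots - {"10.0.0.1"}) | {"10.0.9.9"},
        "busy.example.com": {f"10.0.1.{i}" for i in range(1, 21)},
    }
    print(detect_migration(table))
```

Because the size filter runs before any set intersection, most pairs are rejected cheaply; the intersection itself is linear in the smaller set when hash-based sets are used.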
Hash tables are a good solution for IP address lookup, and we consider them for our future work. A botnet can evade our algorithms if it uses DNS only at initialization and never uses it again (and, moreover, does not migrate). If we could obtain the IP group list of IRC traffic in the C&C process, or of attack traffic such as spam mailing or DDoS attacks, we could compare those IP lists with each other. Here, the provider of the IP lists can be an IDS, an IPS, or another attack detection system.

It is possible to paralyze our algorithm with intentionally generated DNS queries that spoof their sources. With such fabricated packets, our algorithm could be poisoned. In this research we do not address poisoning, but a simple preprocessing step can be a solution: if we check the 3-way handshake of TCP traffic and record the IP addresses that complete the handshake in a list, we can eliminate the fake IP addresses in the DNS traffic that do not complete the handshake.

VII. CONCLUSION

It is necessary to provide appropriate countermeasures against botnets, which have become one of the biggest threats to network security and a major contributor to unwanted network traffic. Therefore, we studied a simple mechanism to detect a botnet by using the DNS queries that the botnet uses. We found significant features of botnet DNS queries that distinguish them from legitimate DNS queries. Two different algorithms for botnet detection are proposed, and both can detect the specific activity of a botnet. With our suggested system, network administrators are able to detect bot agents and dispose of them.

REFERENCES

[1] J. Oikarinen and D. Reed, "Internet relay chat protocol," RFC 1459, 1993.
[2] M. A. Rajab, J. Zarfoss, F. Monrose, and A. Terzis, "A multifaceted approach to understanding the botnet phenomenon," in Internet Measurement Conference (IMC '06), Oct 2006.
[3] J. Jones, "Botnets: Detection and mitigation," Feb 2003. FEDCIRC.
[4] E. Cooke, F. Jahanian, and D. McPherson, "The zombie roundup: Understanding, detecting, and disturbing botnets," in The 1st Workshop on Steps to Reducing Unwanted Traffic on the Internet (SRUTI '05), July 2005.
[5] P. Barford and V. Yegneswaran, "An inside look at botnets," 2006. Special Workshop on Malware Detection, Advances in Information Security, Springer Verlag.
[6] D. Dagon, C. Zou, and W. Lee, "Modeling botnet propagation using time zones," in NDSS 2006, Feb 2006.
[7] D. Dagon, "Botnet detection and response," in OARC Workshop, 2005.
[8] J. Kristoff, "Botnets," Oct 2004. 32nd Meeting of the North American Network Operators Group.
[9] J. R. Binkley and S. Singh, "An algorithm for anomaly-based botnet detection," in The 2nd Workshop on Steps to Reducing Unwanted Traffic on the Internet (SRUTI '06), 2006.
[10] A. Ramachandran, N. Feamster, and D. Dagon, "Revealing botnet membership using DNSBL counter-intelligence," in The 2nd Workshop on Steps to Reducing Unwanted Traffic on the Internet (SRUTI '06), 2006.
[11] M. Overton, "Bots and botnets," in Virus Bulletin 2005, Oct 2005.
[12] Ciphertrust, Secure Computing. http://www.ciphertrust.com/.
[13] Symantec Co., http://www.symantec.com/.
[14] D. McPherson, "Fingerprint sharing: The need for automation of inter-domain information sharing," May 2005. http://www.arbornetworks.com/.
[15] P. Vixie, S. Thomson, Y. Rekhter, and J. Bound, "Dynamic updates in the domain name system (DNS UPDATE)," 1997. http://www.faqs.org/rfcs/rfc2136.html/.
[16] D. Song, personal communication, Oct 2006. Korea University Security Seminar.
[17] NOWCOM Co., pdbox, http://www.pdbox.co.kr/.
[18] KTH Co., idisk, http://idisk.paran.com/.
[19] MEDIAPORT Co., pruna P2P, http://www.pruna.com/.
[20] SORIBADA Inc., soribada P2P, http://www.soribada.com/.