RequIn, a tool for fast web traffic inference




Olivier Paul, Jean Etienne Kiba
GET/INT, LOR Department
9 rue Charles Fourier, 91011 Evry, France
Olivier.Paul@int-evry.fr, Jean-Etienne.Kiba@int-evry.fr

This work is funded through European Commission IST FP6 DIADEM FIREWALL and GET DDOS projects.

Abstract

As networked attacks grow in complexity and more and more Internet users get broadband Internet access, application level traffic analysis in operator networks becomes more difficult. In this paper, we describe a tool allowing web communications to be analyzed in such an environment. Instead of relying on the extraction of application level parameters and on pattern matching algorithms, which are usually considered bottlenecks for such activity, we look at simple network and transport level parameters to infer what happens at the application level. Our approach provides the ability to perform a trade-off between analysis speed and precision that in our opinion could be useful for some traffic analysis applications like the detection of denial of service attacks.

Keywords: Monitoring, HTTP, performance, DDoS.

I. INTRODUCTION

Over the last ten years, a part of the security functions that were previously implemented within companies has been delegated or outsourced to external organizations. The appearance of new threats (e.g. worms, DDoS attacks) has led network operators to provide intrusion detection services to their customers. In this paper we consider one of the challenges implied by this new activity: the ability to monitor user communications within operator networks. This task is challenging for several reasons:

- Operators' network internal devices usually have basic traffic analysis abilities. Most devices are currently limited to operations applied to packet headers through capture, aggregation, filtering, sampling and counting operations.
- Operators' network internal links usually carry large amounts of traffic. As a result the time that a monitor can devote to each user request is usually very short (a few hundreds of nanoseconds), and complex operations such as the algorithms employed in end-host monitoring systems are usually unusable. For example the snort intrusion detection tool uses a pattern matching algorithm for which the best known solution in terms of temporal complexity [2] is in O(n+m), where n is the size of the string to be searched in and m the size of the pattern. Such algorithms would clearly be unable to handle strings longer than a few words in very high speed environments.
- Operator networks are usually constrained in terms of introducing new mechanisms or tools by two parameters, one being the reliability of their network, the other one being the management cost. As a result new techniques should as far as possible take advantage of existing monitoring mechanisms in order to limit the modification of existing elements.

In this paper we focus on HTTP communications. According to ISPs [3], HTTP traffic constitutes between 35 and 50% of the Internet traffic. The goal of this paper is to present a tool that allows such communications to be analyzed in the middle of a network while complying with the aforementioned limitations. We first introduce the measurement information our analysis is based on. Section IV shows how such information can later be used to deduce users' requests. Section V presents RequIn, an implementation of our techniques as an extension to IPFilter on FreeBSD. We then test our analysis technique and implementation by considering models and traffic originating from our web site.

II. MEASUREMENT INFORMATION

Our goal is to permit the monitoring of web communications when application level information cannot be used. In order to do so our plan is to use network and transport level information to infer application level behaviors. In this section we first introduce network and transport level measurement capabilities.

A. The HTTP Protocol

HTTP exchanges can be viewed at several levels. At the lowest level, the HTTP protocol is based on a request-response protocol where each request attempts to perform an HTTP operation on an object at the server. We later call this level the micro-session level. Information in HTTP 1.1 messages [4] is organized into information elements called headers. Although HTTP 1.1 defines more than 40 different headers, requests and responses usually only use a few of them. Requests usually include some of the following headers: the version of the protocol, a method indicating the action to be performed, a URI indicating the object the action is to be performed on, a destination identifying the targeted web server, the date at which the request was performed...

Similarly, a response usually includes similar headers (version, encoding, date, server) and some specific headers like a status indicating the result of the request, the content length or information targeting caches (expiration date, cache directives).

B. Measurement Information Selection

As mentioned earlier, measurement operations aim at understanding application level operations through the analysis of network and transport level information. Although the application level protocol has an impact on this information, this impact also depends on intermediate protocols. Additionally some application level parameters might be more difficult to infer than others. As a result a first step is to try and map application level parameters to transport and network level parameters. The strength of this mapping is examined in the following sections.

TABLE I. PARAMETERS MAPPING.

Parameter     Network/Transport level parameter
Method        Request data size, Response data size
URI           Response data size
Source        Source IP address, Port
Destination   Destination IP address, Port
Time/Date     External: Time/Date
Compress.     External: Server configuration
Caching       Response data size
Status        Response data size

As indicated in table I, sizes are expected to be a significant source of information in order to infer several application level parameters. More specifically our assumption is that object sizes and object identifiers are closely connected and that object sizes and transport/network level measured sizes are also connected. While this last relation is obviously true with HTTP 1.0, where a connection is used for each object, HTTP 1.1 uses several improvements that can render this relation weaker.

C. HTTP/TCP Relationship

HTTP 1.1 [4] provides the ability for web clients and servers to multiplex several HTTP request-response exchanges over a single TCP connection. Among persistent connections we can also distinguish connections using pipelined requests from regular connections. Pipelined connections are used by the client to perform several requests without waiting for an answer from the server.
This ability is however usually limited by the structure of HTML objects, where imported objects can only be requested after the HTML document linking to them is received by the client. Therefore request-response sessions can be distinguished at the network level by looking at either:

- Connection set-up and ending in the case of non-persistent connections (whether pipelined or not).
- Request-response session patterns [6] in the case of non-pipelined persistent connections. These patterns can be found at the network level by considering the evolution of TCP sequence numbers. As the sequence number from the client only increases when a new request is sent to the server, we can set the beginning of each new session when a client sequence number increase occurs. Since several requests cannot be served simultaneously over the same connection, this also represents the end of the previous session.
- Request-response session patterns in the case of persistent pipelined connections. In the case of pipelined requests, only the first request-response session can be distinguished from other exchanges. The next session may include one or several request-response sessions.

As a result pipelined, persistent connections can make the relation between network/transport level information and application level information so weak that it can hardly be used. An interesting question is therefore whether pipelined connections are used in real life. Reference [5] shows that most browsers (MS IE) are not able to use pipelined connections. Moreover browsers that do support pipelining (Firefox, Netscape) are usually configured to avoid using it. Being able to distinguish micro-sessions allows us to measure the amount of data transported by TCP for a request or a response by looking at the evolution of TCP sequence numbers during a micro-session.

III. METHOD AND OBJECTS SIZE INFERENCE

As mentioned earlier, our assumption is that object sizes can be inferred from network or transport level measurements. As a result being able to perform that operation as correctly as possible is critical to our scheme.
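
The delimitation of micro-sessions on a non-pipelined persistent connection, and the measurement of request and response sizes from sequence numbers, can be sketched as follows. This is only an illustrative sketch and not RequIn's kernel code: the packet tuple layout, the `MicroSession` class and the `split_micro_sessions` function are hypothetical names, sequence number wrap-around is ignored, and consecutive client segments with no server data in between are assumed to belong to the same request.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class MicroSession:
    request_bytes: int = 0   # new TCP payload bytes sent by the client
    response_bytes: int = 0  # new TCP payload bytes sent by the server


def split_micro_sessions(
    packets: Iterable[Tuple[str, int, int]]
) -> List[MicroSession]:
    """Split one TCP connection into request-response micro-sessions.

    Each packet is (direction, seq, payload_len), in capture order, where
    direction is "c" (client to server) or "s" (server to client) and seq
    is the sequence number of the first payload byte.
    """
    sessions: List[MicroSession] = []
    high = {"c": None, "s": None}  # highest sequence number seen per direction

    for direction, seq, length in packets:
        if length == 0:
            continue  # pure ACK, SYN or FIN: no data transported
        end = seq + length
        prev = high[direction]
        new_bytes = length if prev is None else max(0, end - max(prev, seq))
        if new_bytes == 0:
            continue  # retransmission: the sequence number did not advance
        high[direction] = end

        if direction == "c":
            # A client sequence number increase that follows server data
            # marks the start of a new micro-session (and the end of the
            # previous one).
            if not sessions or sessions[-1].response_bytes > 0:
                sessions.append(MicroSession())
            sessions[-1].request_bytes += new_bytes
        elif sessions:
            sessions[-1].response_bytes += new_bytes

    return sessions
```

Only per-direction counters are kept, which matches the constraint that the monitor should work on packet headers alone.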
Several factors like HTTP headers can play a role in making this process more difficult. Our assumption is that the size of HTTP headers can take a limited number of values. For a given server these values depend on the server configuration.

A. Response type and method inference

Fig. 1 provides the relation between header sizes, types of response and total sizes in the case of our web server. These values were obtained by capturing response packets from the server over 24 hours. Six types of responses (identified by code numbers) were captured. Fig. 1 shows that 200 ("OK") responses can be distinguished from other responses by looking at the total size (total size > 570 bytes). As shown in fig. 1, some 200 responses have a size that collides with other types of responses. However objects carried by these responses constitute less than 1% of existing objects. 304 ("Not modified") responses can also be distinguished from other responses by looking at the total size (total size < 250 bytes). Other responses cannot be distinguished as they carry objects whose size can vary widely. Additionally, our tests showed that two types of HTTP headers (and thus two header sizes) were found in transactions with our server: headers for non-persistent connections, and headers for persistent connections, which included additional headers with a fixed size.
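
As an illustration, the size-based classification of responses on persistent connections can be written as a simple threshold function. The sketch below uses the size ranges reported for the authors' server (table II); the function name and the "unknown" fallback for sizes outside every range are assumptions, and the ranges would have to be measured again for any other server configuration.

```python
def classify_response(rs: int) -> str:
    """Guess the HTTP status class from a response size in bytes.

    The thresholds are those reported for one specific server and for
    persistent connections; they are not general constants.
    """
    if rs > 570 or 250 < rs < 460:
        return "200"
    if 240 < rs < 250:
        return "304"
    if 460 < rs < 570:
        return "301/400/403/404"
    return "unknown"  # boundary values and small sizes are not covered


for size in (1200, 300, 245, 500, 100):
    print(size, classify_response(size))
```

Because the test is a handful of integer comparisons, its cost per response is constant, which is the property the approach relies on.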

[Figure 1. Response Type/Size Relationship: header size and total size (bytes) for response types 200, 301, 304, 400, 403 and 404.]

As a result knowing whether a connection is persistent is sufficient to deduce the influence of persistence on the HTTP header size. This knowledge can be obtained using the session delimitation scheme described in section II. Non-persistent connections are distinguished by looking for multiple connection establishments and teardowns over short periods of time. Table II provides the relation between response size and response codes for persistent connections.

TABLE II. RESPONSE CLASSIFICATION USING RESPONSE SIZE (RS).

Result               Response size (bytes)
200                  RS > 570, 250 < RS < 460
304                  240 < RS < 250
301, 400, 403, 404   460 < RS < 570

Using a similar methodology, we define a set of classification criteria in order to infer the method used in HTTP requests. However, we found that the method type could not be determined efficiently using the response size alone. We therefore use the combination of request and response sizes.

B. Object Size Inference

Fig. 1 shows that 200 responses can carry HTTP headers whose sizes are not fixed. As a result using an average HTTP header size value to estimate object sizes in the case of GET requests can lead to errors. By looking more closely at header fields we can classify them according to their behavior:

- Some headers never change (e.g. response code, server identifier, accept range).
- Some header values change but have a fixed size (e.g. last modified, date and ETag).
- Some header values change depending on the associated object (e.g. content type and length).

As a result for a given object, the response size should remain constant. This means that by keeping the relation between response sizes and object sizes, we can get an exact estimate of object sizes.

Popular HTTP servers support the compression of objects prior to sending them to the client. This can cause a difference between the number of bytes measured in the network and the size of the object.
The compression option is used when an appropriate configuration is performed on the server side and when the client supports compression. However as most clients support compression, knowing if compression is used is only a matter of knowing if the server is configured to use it. In this case HTTP servers provide the ability to log both compressed and original sizes for each requested object. The inference process in the case of compressed objects therefore remains the same.

IV. URI INFERENCE

Our assumption for URI inference is that network and transport level measurement parameters can be used to infer object identifiers for GET requests:

- Each given object has a single size. As a result knowing a URI can help us explain object sizes and reciprocally.
- Users originating from different locations have different interests. For example local students are usually more interested in schedules and course related information while users connecting from remote research institutions are more interested in research related objects.
- For the same reasons people residing in different timezones use different parts of the server.

In order to understand the relations between measurement parameters, we use access logs available on web servers. These logs are usually made of a set of entries, each of them describing an action performed on the server. Because each entry links IP addresses, time and date information, object sizes and object identifiers, we can use log entries to build a model that will later be used to infer object identifiers when provided with the other parameter values.

A. Inference Model

The model we selected to perform inference operations is a Bayesian network. Bayesian networks are graphical models that can be used to represent causal relationships between variables. A Bayesian network is usually defined as:

- An acyclic directed graph $G = (V, E)$, where $V$ is a set of nodes and $E$ a set of edges.
- A finite probability space $(\Omega, Z, P)$.
- A set of variables defined on $(\Omega, Z, P)$ such that:

$$P(V_1, V_2, \ldots, V_n) = \prod_{i=1}^{n} P(V_i \mid C(V_i))$$

where $C(V_i)$ is the set of causes for $V_i$ in the graph.
The inference in a causal network consists in propagating one or more pieces of certain information (evidence) within the network, in order to deduce how beliefs concerning the other nodes are modified.
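
To make the model-building step concrete, here is a minimal sketch of how access log entries could be turned into probability tables and then queried. It is not the authors' implementation: it assumes a network in which the object identifier is the only parent of the object size and of the country code, estimates the conditional probabilities by counting log entries, and applies additive smoothing to the country code only. The names `build_model` and `infer_uri` and the `alpha` parameter are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Iterable, Optional, Tuple


def build_model(log_entries: Iterable[Tuple[str, int, str]]):
    """Count (uri, size, country) log entries into probability tables."""
    uri_count = Counter()
    size_given_uri = defaultdict(Counter)
    country_given_uri = defaultdict(Counter)
    countries = set()
    for uri, size, country in log_entries:
        uri_count[uri] += 1
        size_given_uri[uri][size] += 1
        country_given_uri[uri][country] += 1
        countries.add(country)
    return uri_count, size_given_uri, country_given_uri, len(countries)


def infer_uri(model, size: int, country: str,
              alpha: float = 1.0) -> Optional[Tuple[str, float]]:
    """Return the most probable URI and its posterior, or None.

    Assumed factorization: P(uri, size, country) =
    P(uri) * P(size | uri) * P(country | uri).
    """
    uri_count, size_given_uri, country_given_uri, n_countries = model
    total = sum(uri_count.values())
    scores = {}
    for uri, n in uri_count.items():
        p_uri = n / total
        p_size = size_given_uri[uri][size] / n  # exact size match required
        # Additive smoothing so that an unseen country does not veto a URI.
        p_country = (country_given_uri[uri][country] + alpha) / (
            n + alpha * n_countries
        )
        scores[uri] = p_uri * p_size * p_country
    evidence = sum(scores.values())
    if evidence == 0:
        return None  # this size was never observed in the log
    best = max(scores, key=scores.get)
    return best, scores[best] / evidence
```

With such tables, two objects that share the same size can still be told apart when their requesters come from different countries, which is the intuition behind adding location to the model.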

If node $X_j$ is located downstream from node $X_i$, we can write:

$$P(X_j \mid X_i) = \sum_{X_{j-1}} P(X_j \mid X_{j-1})\, P(X_{j-1} \mid X_i)$$

If node $X_j$ is a direct descendant of $X_i$, the computation is over. In the other case we can break up the computation until we reach a direct descendant of $X_i$.

If node $X_j$ is located upstream from node $X_i$, it is necessary to propagate the information starting from the beginning of the chain, to know the unconditional probability for each node, $P(X_k)$, $P(X_{k+1})$. In order to do so, we can use the property of inversion of the conditional probability:

$$P(X_k \mid X_{k+1}) = \frac{P(X_{k+1} \mid X_k)\, P(X_k)}{P(X_{k+1})}$$

As with the downward propagation, if $X_j$ is a direct ascendant of $X_i$ the computation stops here. In the other case we can perform the same operation on ascendants.

B. Variables Selection

In order to obtain an efficient model, we first performed some aggregation on variables. IP addresses were aggregated into country codes. Seconds, minutes and hours information was aggregated into a single hour variable. Day of the week, month and year information was aggregated into a single day of the week variable. As the cost of inference in a Bayesian network increases exponentially with the number of variables in the network, it is essential to limit that number. In order to do so, we evaluate the ability of each parameter (size, country, time and date) to explain object identifiers. For each couple of variables $(\mathrm{URI}, X_i)$, we do so by computing $P(\mathrm{URI} \cap X_i)$ and comparing it with $P(\mathrm{URI}) \cdot P(X_i)$ by computing:

$$I(\mathrm{URI}, X_i) = \sum_{j=1}^{N} \left[ P(\mathrm{URI}_j \cap X_i) - P(\mathrm{URI}_j) \cdot P(X_i) \right]$$

The ranking between the $I(\mathrm{URI}, X_i)$ values leads us to the simple Bayesian network presented in fig. 2.

[Figure 2. Resulting Bayesian Network, with nodes Identifier, Object Size and Country Code.]

V. IMPLEMENTATION AND TESTS

A traffic analyzer was implemented as an extension to IPFilter [7]. The HTTP session handling function is implemented as a part of the TCP state maintenance function. This function extends the TCP connection data structures by allowing multiple HTTP sessions to coexist within a TCP connection. Sessions are delimited as specified in section II and specified using the IPFilter filtering policy. When a session ends, the corresponding information (source IP address and port, destination IP address and port, timestamps, number of TCP bytes transported in both directions, number of packets and bytes transported, type of connection) is handed to the kernel syslog part. This information is later exported to user space and retrieved by RequIn. RequIn is first used to transform timestamps into time and date values as well as IP addresses into country codes. To do so we use a static IP address database for performance reasons. When started, RequIn first uses logs from the server to be monitored in order to build the corresponding Bayesian network and the method and response code classes. Once such models are built, the classification and inference models can be used to infer users' actions.

A. Validation Tests

Our validation tests were performed using our departmental web server. This server runs Apache 1.3, includes roughly 5k objects, most of them being static pages, and receives 7k requests a day. In order to perform consistent tests over a long period of time, a copy of this server was made on a similar computer. This copy was later used for the tests. In order to check that our server did not have a structure that would have tainted our tests, we performed a comparison between the sizes of requests to our server and the ones usually found on the Internet [6]. Fig. 3 shows both cumulative distribution functions. Sizes smaller than 500 bytes have been ignored since the distribution of header sizes is unknown in [6]. Overall there is little difference between the two distributions except in the [10^5; 10^6] range, where the difference should not have a large impact on our scheme.

[Figure 3. Response sizes cumulative distribution.]

Using the model defined in section IV, we built a Bayesian network for this web server using a 309k entries log file gathered over 43 days from the original server. In order to test the ability of the model to predict future requests, we first investigated the influence of time on the estimation accuracy. Fig. 4 provides the evolution of the correct estimate rate over three weeks when using a three weeks log to build the model. As shown in fig. 4, the percentage of correct estimates remains around 75% during roughly 10 days (records 1 to 70k). It then slowly falls to 71% over the next 12 days as new objects are stored in the server.

[Figure 4. Percentage of correct estimates over time: prediction accuracy (0.69 to 0.76) against entry number (in thousands).]

The validation of the method and response code inference methods was performed using a similar process. Estimation results are provided in table III.

TABLE III. RESULT AND METHOD INFERENCE.

Estimated parameter   % correct estimation
Method                95
Operation result      96

This first estimation does not take into account the perturbation that might be introduced by the measurement part of RequIn. In order to validate the whole software we generated sequential requests for each object identifier found in the full log file. Requests were analyzed by RequIn, which produced the inferred user actions. These actions were later compared to the original requests. Results are provided in table IV. This test is however biased by two parameters:

- The inference part is unable to take advantage of the country code information. This should decrease the accuracy of the inference.
- The inference process is not affected by aging as the server configuration is static. This should increase the accuracy of the inference.

Given the variation of accuracy over time (fig. 4) we however believe that this last parameter should have a small effect over the first ten days. Consequently we expect to get slightly better results with real life traffic.

TABLE IV. VALIDATION OF WHOLE SOFTWARE.

Scenario           % correct estimation
URI                74
Method             90
Operation result   94

B. Performance Tests

RequIn was tested on FreeBSD 5.2 on a 2.4 GHz Pentium Xeon processor with a 512 KBytes cache.
During our tests we benchmarked several aspects of the inference process including the time required to build the models, the size of the models and the time required to infer a request once the models are built. For the test we used an access log file including 77k entries to build the inference model. We then used (IP address, object size) couples from a 232k entries log file to perform the performance test. We performed 50 series of tests and averaged the results.

TABLE V. PERFORMANCE RESULTS.

Parameter                  Value
Time to build the models   4 s
Size of the model          .5 Mbytes
Time per request           0.9 µs

These results (table V) show that our inference process, when used independently from the request-response measurement mechanism, should be able to analyze roughly 1.1M requests per second. Assuming an average Internet HTTP traffic this would allow us to treat a 20 Gb/s full duplex link.

VI. CONCLUSION

In this paper, we introduce a new technique to analyze traffic between clients and web servers. Unlike existing analysis techniques, this proposal provides the ability to trade some accuracy (in terms of what information can be retrieved and the precision of such information) against an increased analysis speed. We think that such analysis speed might be useful against some threats like denial of service attacks, where speed is the major concern. This would allow the usage of application level resources to be controlled at the network level. Although our technique is not applicable to every web server (HTTP servers that are large or contain mostly dynamic content), our feeling is that it would work for a large proportion of existing servers, making it useful in practice. We believe our technique could be further improved by looking at HTTP communications at levels other than the micro-session level. We are currently working on DDoS detection methods based on the information inferred by RequIn.

REFERENCES

[1] H. Nielsen et al. Network Performance Effects of HTTP/1.1, CSS1, and PNG. In Proceedings of SIGCOMM 1997, August-September 1997.
[2] G. Navarro and M. Raffinot. Flexible Pattern Matching in Strings. Cambridge Univ. Press, 2002.
[3] Sprint IP Monitoring project, available at: pmon.sprint.com/, 2004.
[4] R. Fielding et al. HTTP 1.1, RFC 2616. Internet Engineering Task Force, June 1999.
[5] Balachander Krishnamurthy, Martin Arlitt. PRO-COW: Protocol Compliance on the Web, A Longitudinal Study. USITS '01, March 26-28, 2001.
[6] F. Donelson Smith, F. Hernandez, K. Jeffay, and D. Ott. What TCP/IP Protocol Headers Can Tell Us About the Web. In Proceedings of ACM SIGMETRICS 2001, June 2001.
[7] Darren Reed, IPFilter, available at coombs.anu.edu.au/~avalon/, 2004.