Detecting Misbehaving CDN Nodes via Peer Surveillance
Nikolaos Michalakis, Robert Soule, Gaurav Arora, Robert Grimm
New York University
Outline
Protect content served through an untrusted Content Distribution Network / edge-side platform.
- Motivation: dynamic content authenticity
- Related work
- Approach: peer surveillance
- Experimental setup and evaluation
- Future directions
- Conclusions
Background: Large-Scale Web-Based Collaboration
- Web-based collaborative services for large-scale societal and educational problems.
- Example: NYU Surgical Interactive Multimedia Modules (SIMMs)
  - Content-demanding, personalized, large-scale collaboration across multiple institutions.
- => Edge-side content processing and generation.
Na Kika Content Delivery Network (CDN)
- Static content: nodes join a DHT for caching (a la Coral).
- Dynamic content: execute server-defined scripts locally.
- Service through DNS redirection.
- Controls who enters and leaves the CDN; assumes trusted nodes.
- => In reality: a scalability vs. trust trade-off.
Content Authenticity Problem
- Can trust my domain, but not others.
- [diagram: content provider, DNS redirection, CDN nodes performing content processing/generation; an honest node returns the authentic response "Brazil 3, Greece 0" while a misbehaving node returns "Brazil 0, Greece 1"]
- ...and more serious attacks!
Outline
- Motivation: dynamic content authenticity
- Related work
- Peer Surveillance
  - Trusted and untrusted system bases
  - Surveillance problem
  - Monitoring: direct, espionage, informers
  - Response equivalence
- Experimental setup and evaluation
- Future directions
- Conclusions
Related Work (dynamic content authenticity)
- Sitegrity: detect tampering at the server.
  - Router signs all server executables; a server agent sends signatures to the router.
  - The reverse of our problem in a CDN.
- SSL: does not scale.
- Voting / fault tolerance (a la LOCKSS): majority of replies.
  - Slow; assumes an honest majority.
- What if the adversary has a strong foothold in the CDN? Must detect and remove.
Protecting Dynamic Content
- Re-execute the script locally; compare with the remote result.
- Clients cannot verify, but Na Kika nodes can!
- Monitoring channels between Na Kika nodes.
- In other words, we need peer surveillance.
TCB: The Trusted Computing Base
- TCB: the set of nodes trusted by the system.
- The rest of the system is untrusted (non-TCB).
- The TCB has a publicly known interface.
- Assume a small TCB that controls client redirection.
ACB: The Adversarial Computing Base
- ACB: the set of nodes trusted by the adversary.
- Assumptions: rational, rich, plays multiple roles, colluding.
- But: cannot break cryptography; cannot observe or block traffic from non-ACB network interfaces.
The Verification Base
- Verification base: nodes that can verify honest behavior, i.e., Na Kika nodes.
- The TCB must verify, but alone it does not scale.
- Need help from untrusted nodes.
- Clients are unprotected.
The Na Kika Base
[diagram: TCB, ACB, and verification base within the CDN; DNS redirection between web servers and unprotected clients; clients receive authentic or misbehaving responses, content processing optional]
Monitoring Channels and Surveillance
- Surveillance = effective and trusted channels.
- Effective: can relay evidence of misbehavior.
  - E.g., a wiretap on Mallory. If discovered, Mallory will act honest.
- Trusted: the evidence is correct.
  - E.g., "Kids, who broke the vase?" "He did!" "No, she did!"
The Surveillance Problem
- Surveillance problem: how can the TCB prove that a node A is in the ACB?
- Challenge: T needs an effective and trusted channel to prove A is in the ACB.
- V, C, and A cannot be trusted.
Direct Monitoring
- Direct monitoring channel: direct node-to-node monitoring requests.
- Trusted, but the TCB is public and Na Kika nodes are revealed by DNS.
- Ineffective channel: the ACB knows it is being monitored.
Effective Monitoring Channels
- Effective channel requirement: indistinguishable from regular communication channels.
- Only two methods left: espionage or informers.
- Espionage: V hides its identity to pass as a client.
- Informers: C forwards A's messages to T or V.
Effective but Not Trusted
- Man-in-the-middle attack: V or C can modify A's messages.
- Leads to the "he said, she said" problem:
  - Is A in the ACB, or do V and C merely say A is in the ACB?
Effective and Trusted
- Paranoid assertion: if something is wrong, then A, V, or C is in the ACB.
- Problem: A can frame V; a large ACB can take over.
Effective and Trusted (cont.)
- Accountability: link Na Kika nodes to generated content using signatures.
- C drops unverifiable responses; otherwise "he said, she said" returns:
  - Does A corrupt, or does C merely say A corrupts?
- Protects honest nodes and clients, at the cost of total client buy-in.
Espionage
- The TCB hides the identity of some Na Kika nodes = spies.
- Spies act like regular clients, but report to the TCB.
- Rationally patient adversary:
  - Honest until steady state; record all clients during that time.
  - Corrupt all clients appearing after.
  - The adversary is safe with high probability: spies must be in his records.
Informers
1. C forwards a response to the TCB with some probability.
2. Repeat.
3. Report.
4. The TCB verifies honesty.
5. Remove the node from the system.
6. Caught.
[diagram: ACB node, web servers, TCB, DNS; request, authentic response, misbehaving response, optional flows]
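The informer loop on this slide can be sketched as a small simulation. This is an illustration only: verification is abstracted into a predicate, and all names (`run_surveillance`, `is_authentic`) are assumptions, not the system's actual interfaces.

```python
import random

def run_surveillance(stream, inform_prob, is_authentic, rng):
    """Informer loop: each (node, response) pair is forwarded to the
    TCB with probability inform_prob (step 1); forwarded responses are
    verified (step 4) and the node that signed a corrupt response is
    removed from the system (steps 5-6).
    Returns (caught nodes, corrupt responses served before removal)."""
    caught, damage = set(), 0
    for node, response in stream:
        if node in caught:
            continue                      # removed nodes serve no traffic
        if not is_authentic(response):
            damage += 1                   # a corrupt response reached a client
        if rng.random() < inform_prob and not is_authentic(response):
            caught.add(node)              # reported, verified, and removed
    return caught, damage

# Aggressive adversary "mallory" corrupting every response, zealous informers:
rng = random.Random(7)
stream = [("mallory", "corrupt")] * 100 + [("alice", "ok")] * 100
caught, damage = run_surveillance(stream, 1.0, lambda r: r == "ok", rng)
```

With p = 1.0 the adversary is removed after a single corrupt response; lowering p trades detection speed for client effort, as the parameter-testing slides quantify.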
Informers Evaluation
- Scales. Always makes progress.
- Can a client or verifier in the ACB frame honest nodes? No: accountability.
- Can a content provider in the ACB frame Na Kika nodes? No. Let's see why...
Proof of Adversarial Behavior
- Script signed by the content provider: {S}_cp; S_cp includes the script URL.
- Input content signed by the content provider: {I}_cp.
- Generated response signed by the node: {G}_n; {G}_n includes {S}_cp and {I}_cp.
- The verifier repeats the recipe using verifiable ingredients and checks whether the responses are equivalent.
- How do we define equivalence?
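One way to make the recipe concrete is the sketch below. It is not the system's implementation: keyed HMACs stand in for the public-key signatures the slide describes, and all keys and function names are assumptions. The structure, though, follows the slide: the provider signs script and input, the node signs its output together with both provenance signatures, and the verifier repeats the recipe and compares.

```python
import hashlib
import hmac

# Hypothetical keys; the real system would use asymmetric signatures.
CP_KEY = b"content-provider-key"
NODE_KEY = b"node-key"

def sign(key, data):
    """Stand-in for a digital signature over data."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def node_response(script, script_sig, inp, inp_sig, generate):
    """Node runs the provider-signed script on provider-signed input and
    signs the result together with both signatures ({G}_n includes
    {S}_cp and {I}_cp)."""
    out = generate(script, inp)
    tag = sign(NODE_KEY, out + script_sig.encode() + inp_sig.encode())
    return out, tag

def verify_response(script, inp, out, tag, generate):
    """Verifier repeats the recipe with the verifiable ingredients:
    first check the node's signature, then check equivalence."""
    script_sig = sign(CP_KEY, script)
    inp_sig = sign(CP_KEY, inp)
    expected = sign(NODE_KEY, out + script_sig.encode() + inp_sig.encode())
    if not hmac.compare_digest(tag, expected):
        return False                      # unverifiable: drop the response
    return generate(script, inp) == out   # idempotent equivalence
```

Because {G}_n binds the output to the exact signed script and input, a content provider in the ACB cannot later swap ingredients to frame an honest node.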
Response Equivalence Requirement
- If equivalent to the trusted response, then accept.
- Equivalence relations:
  - Idempotent equivalence
  - Rollback-idempotent equivalence
  - Application-specific equivalence
- Idempotence -> absolute correctness.
- Application-specific -> false positives, but no false negatives.
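The two ends of the equivalence spectrum can be illustrated as comparison predicates. A minimal sketch, assuming responses are byte strings; the Date-header normalizer is a hypothetical example of an application-specific relation, not one the system defines.

```python
import re

def idempotent_equal(trusted, remote):
    """Byte-for-byte equality: absolute correctness, but only sound
    when re-executing the script is fully deterministic."""
    return trusted == remote

def app_specific_equal(trusted, remote, normalize):
    """Compare after an application-supplied normalization; may accept
    a corrupt response (false positive) but never rejects an honest
    deterministic one (no false negatives)."""
    return normalize(trusted) == normalize(remote)

# Hypothetical normalizer: ignore a Date header that legitimately differs
# between the node's execution and the verifier's re-execution.
def strip_date(response):
    return re.sub(rb"Date:[^\n]*\n", b"", response)
```

Two honest responses that differ only in their Date header are application-equivalent but not idempotent-equivalent, while a corrupted body fails both relations.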
Outline
- Motivation: edge-side content processing/generation in untrusted environments.
  - Static content authenticity. Dynamic content authenticity.
- Related work
- Approach: peer surveillance
- Experimental setup and evaluation
- Future directions
- Conclusions
Experimental Setup
- Simulated surveillance via informers using the Narses simulator.
- Abstracted: servers, content, crypto, topology.
- Request statistics based on Na Kika microbenchmarks and Coral daily usage.
- Adversary: memoryless.
- Questions asked:
  - How much damage (corrupt responses) before removal from the system?
  - How fast can we detect and remove the adversary?
Results (parameter testing)
- Setup: PlanetLab (630 nodes), Princeton adversarial (3%), 1 out of 10 clients informs (p = 0.1).
- Tested damage before caught:
  - Aggressive vs. silent adversary
  - Normal vs. loaded client traffic
  - Zealous vs. unconcerned informers
Aggressive vs. Silent Adversary
- Using normal traffic (600 reqs/s):
  - Aggressive (corrupt all) -> 788 responses; 80 if p = 1.0.
  - Silent (corrupt 1 out of 100) -> 482 responses; 78 if p = 1.0.
- Detection-removal interval is the same for both.
- Aggressive is the better strategy for the adversary: it packs more damage into that interval.
Normal vs. Loaded Traffic
- Using an aggressive adversary:
  - Normal (600 req/s) -> 881 responses.
  - Semi-loaded (50x = 30K req/s) -> 1736 responses.
  - Loaded (100x = 60K req/s) -> 2817 responses (saturates).
- High traffic has an insignificant effect.
Zealous vs. Unconcerned Informers
- Aggressive adversary, normal traffic:
  - Zealous (p = 100%) -> 94.
  - Concerned (p = 10%) -> 761 (about 1 order of magnitude from p = 100%).
  - Unconcerned (p = 1%) -> 7147 (about 2 orders of magnitude from p = 100%).
- Inform probability has proportional effects.
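The proportional effect can be checked against a back-of-the-envelope model (my simplification, not the talk's simulator): for an aggressive adversary, the number of corrupt responses served until the first one is forwarded is geometric with mean 1/p, so each order-of-magnitude drop in the inform probability costs roughly an order of magnitude more damage.

```python
import random

def damage_before_report(p, rng):
    """Corrupt responses served until the first one is forwarded to a
    verifier: a geometric trial with success probability p."""
    served = 1
    while rng.random() >= p:
        served += 1
    return served

# Estimate the mean damage for each informer profile by Monte Carlo.
rng = random.Random(0)
trials = 20000
means = {p: sum(damage_before_report(p, rng) for _ in range(trials)) / trials
         for p in (1.0, 0.1, 0.01)}
```

The estimated means sit near 1, 10, and 100 for p = 100%, 10%, and 1%, mirroring the roughly order-of-magnitude gaps between 94, 761, and 7147 measured in the simulation.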
Results (scenarios)
- Setup: normal traffic, aggressive adversary, p = 10%.
- PlanetLab, conflicting political interests (50%): cleaned in 1-1.5 minutes (24K corrupted).
- Akamai (10K nodes), one ISP gone bad (3%): cleaned in about 5-8 seconds (15K corrupted).
- PlanetLab, attack by a very large botnet (10K nodes): cleaned in about 2 hours (4.3M corrupted).
Future Directions
- Implement in Na Kika; may need verification fault tolerance in practice.
- Response equivalence through statistical similarity (like spam filtering).
- Ensuring authenticity does not ensure correctness: how do we detect passive-aggressive behavior?
  - E.g., too many 404 errors, denying access to URLs...
Future Directions (cont.)
- Working with a semi-trusted client environment.
- Some stats: Coral serves up to 2 million clients/day; botnet reports mention armies of no more than 50K hosts.
- Use > 5% of clients as informers:
  - Verify by voting.
  - Avoid total client buy-in.
  - Simulate.
Conclusions
- The surveillance problem can be solved with a small TCB plus verifiers and informers.
- Using informers requires proving non-adversarial behavior under the watchful eye of your peers.
- Protect honest -> accountability -> total client buy-in (limitation).
- Scales: even with a small ratio of informers, the adversary is caught very fast.
THE END