Capture of Mission Assets
  Vern Paxson
  University of California
  Berkeley, California USA
  vern@eecs.berkeley.edu
  June 2009
  Presented by Giovanni Vigna
Outline
  Motivational context
  Tracking network system assets
    Passively
    Actively
    Vantage-point considerations
  Tying assets to mission needs
    Leveraging multiple examples
    Inferring dependencies
    Fault injection
The Role of Asset-Capture
  Assets & relationship to mission functioning crucial for reasoning about:
    Vulnerability to induced faults / subversion
    Robustness to incidental failure
  Assets include network/system components, human actors, required services and service sessions (may include multiple components), accompanying configuration info:
    Topology, access control
    Particularly significant: version information
      Points up attacker exploit opportunities, common failure modes
    Also, presence of data caching
      Can mask/induce errors
  Modeling includes producer/consumer, trust relationships, preconditions
Identifying Assets
  Simple approach: it's the system/network operator's job to track them
    But: done manually, this is tedious and error-prone due to churn
  Note: some asset information necessarily tracked
    E.g., users present in system (defined by authentication mechanisms)
    E.g., devices potentially connected to network (if MAC address registration is employed)
  Thus, part of research effort: tools & procedures to automate asset tracking
    Using manually provided information as ground truth
Passive Capture
  Idea: learn assets by surveillance, i.e., watching what happens
  Requires widespread visibility + suite of network analysis modules
  Link/network layer
    Track MAC-to-IP address bindings to resolve aliasing
    ARP reveals broadcast domains, routers, switches (to limited degree)
    TTLs reveal IP hop counts from systems to monitors
    RTTs suggestive of subnet structure, underlying link technology
    Packet pair dispersion reveals link capacities
    Jitter can shed light on cross traffic, implicit dependencies
    Deterministic loss pattern often rooted in internal network buffer sizes
    ICMPs reveal failure conditions (e.g., Host Unreachable)
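Two of the link/network-layer inferences above reduce to simple arithmetic, sketched here in Python. This is a minimal illustration, not the tooling from the talk: the function names and the list of common initial TTL values are assumptions, and a real monitor would need to handle hosts with unusual initial TTLs and noisy timing measurements.

```python
# Illustrative sketch only. Function names and the initial-TTL list are
# assumptions made for this example, not part of the original slides.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def estimate_hop_count(observed_ttl: int) -> int:
    """Guess hops between sender and monitor.

    Assumes the sender used the smallest common initial TTL that is
    greater than or equal to the TTL seen at the monitor.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be in 1..255")
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial - observed_ttl


def estimate_capacity_bps(packet_size_bytes: int, dispersion_seconds: float) -> float:
    """Packet-pair estimate: capacity is roughly packet size / arrival gap."""
    if dispersion_seconds <= 0:
        raise ValueError("dispersion must be positive")
    return packet_size_bytes * 8 / dispersion_seconds
```

For example, a packet observed with TTL 57 is estimated to be 7 hops away under the assumption that it started at 64, and two back-to-back 1500-byte packets arriving 120 microseconds apart suggest a bottleneck of roughly 100 Mbit/s.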
Passive Capture - Net/Trans. Layer
  Inference of OS type / version
  Absent servers manifesting as TCP SYN/RST exchanges
  Firewalls (some forms) manifesting as SYN/SYN-ACK/RST
  Unanswered SYNs reveal expected-but-missing services
  Baseline of network path characteristics
    Range of bandwidth/loss/delay conditions over which the mission has successfully run
  Behavior of TCP senders indicates their congestion-control specifics, allowing prediction of expected performance in alternative/backup environments
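The handshake signatures listed above can be expressed as a small classifier over the TCP flags observed for a connection attempt. This is a hedged sketch: the function name, the string labels, and the representation of a connection as a list of flag names are assumptions for illustration, and the labels are heuristics (a RST after a SYN-ACK, for instance, can also be a client abort).

```python
from typing import Sequence


def classify_handshake(flags: Sequence[str]) -> str:
    """Label a connection attempt from its observed TCP flag sequence.

    `flags` is a hypothetical representation: the ordered flag names seen
    for one attempt, e.g. ["SYN", "SYN-ACK", "ACK"].
    """
    seq = tuple(flags)
    if seq and all(f == "SYN" for f in seq):
        return "unanswered"          # expected-but-missing service
    if seq[:2] == ("SYN", "RST"):
        return "absent-server"       # host is up, nothing is listening
    if seq[:3] == ("SYN", "SYN-ACK", "RST"):
        return "possible-firewall"   # some firewalls reset after the SYN-ACK
    if seq[:3] == ("SYN", "SYN-ACK", "ACK"):
        return "established"
    return "unknown"
```

Counting these labels per server address and port over a trace gives a first cut at which services exist, which are expected but missing, and where filtering may be in place.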
Passive Capture - Application Layer
  Requires rich set of app-protocol parsers
    Built on top of Bro system and BinPAC
  Extract app structure: request/response, error codes, data xfer
  Determine services provided, version
    Our previous work on Dynamic Protocol Detection unambiguously determines services via app-level parsing
  Our previous work on Discovery of Session Structure finds sets of interrelated connections (requires extension to > 2 hosts) related to single instance of app activity
    Based on observation: independent events arrive according to Poisson process
    Thus, those arriving quicker than Poisson are with high probability nonindependent
    Technique works in general for discovery of causal structure
    But requires numerous observations of mission in action to build up statistics
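The Poisson argument above can be made concrete with a short Python sketch. If independent connections arrive at rate λ, the probability that the next one arrives within a gap of t seconds is 1 - exp(-λt); a gap for which this probability is very small is unlikely under independence. The code below is a simplified illustration of that idea, not the algorithm from the Discovery of Session Structure work: the function names, the significance threshold `alpha`, and the assumption that the background rate is already known are all choices made for this example.

```python
import math


def prob_arrival_within(rate_per_sec: float, gap_sec: float) -> float:
    """P(an independent Poisson arrival occurs within gap_sec)."""
    return 1.0 - math.exp(-rate_per_sec * gap_sec)


def likely_dependent(rate_per_sec: float, gap_sec: float, alpha: float = 0.05) -> bool:
    """True if a gap this short would be improbable for independent arrivals."""
    return prob_arrival_within(rate_per_sec, gap_sec) < alpha


def group_sessions(timestamps, rate_per_sec: float, alpha: float = 0.05):
    """Group connection start times into candidate sessions.

    A connection joins the current session when its gap from the previous
    connection is too short to be plausible under the assumed Poisson rate.
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and likely_dependent(rate_per_sec, t - sessions[-1][-1], alpha):
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions
```

With an assumed background rate of one connection per 100 seconds, connections half a second apart are grouped together, while one arriving minutes later starts a new session. Estimating the rate itself is what requires the many observations mentioned above.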
Passive Capture - Caching
  Data items
    If some sessions lack data-transfer connections present in others, suggestive of possible caching
    If, when we observe the transfer, it's generally the same data item
  Two-edged sword:
    Presence of cache may mean mission can continue even if server is down
      Depending on the caching policy
    But: presence of cache means we may miss the server's presence during passive observation and overlook a dependency
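The first heuristic above (some sessions lack a connection that others have) can be sketched as a comparison across observed sessions. This Python fragment is an assumption-laden illustration: it represents each session as a set of service labels, which is a hypothetical encoding, and it flags candidates only; confirming caching would still require checking that the transferred item is generally the same.

```python
from collections import Counter


def possibly_cached(sessions):
    """Return {service: fraction of sessions containing it} for services
    that appear in some sessions but not all of them.

    `sessions` is a hypothetical input: an iterable of collections of
    service labels, one collection per observed session.
    """
    sessions = [set(s) for s in sessions]
    counts = Counter(svc for s in sessions for svc in s)
    n = len(sessions)
    return {svc: c / n for svc, c in counts.items() if 0 < c < n}
```

A service that shows up in only half of otherwise similar sessions is a candidate for a cached dependency that passive observation alone could overlook.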
Passive Capture - Pros & Cons
  Pro: non-invasive
    Can apply to actual networks with minimal disruption
  Pro: can work retrospectively
    If we gather traces from a network-of-interest, they can later be analyzed/reanalyzed using passive techniques
  Con: things change (churn)
    Need to understand change time scales & reanalyze / incorporate possible change into mission planning
  Con: can only characterize what you see
    Can miss capabilities, quirks, and dependencies that happened not to manifest
  Option for increasing visibility: deploy simple agents on end hosts
    Especially for gaining insight into causal links difficult to infer externally
Active Probing
  By injecting traffic, can potentially address previous issues of missing
    capabilities: probe for services/hosts present but not active
      Because they are backups for the observed missions
      Tricky because revealing their presence may require correct authentication
    quirks: how in particular will a given end system respond to corner-case/ambiguous packets used for evasion?
    dependencies: can potentially inject faults
      Degradation or interruption of network/service functionality
      Can identify both backup components and failover strategy/delay
      Can capture ensuing degradation, which may or may not affect mission
Extracting Mission Models
  First, analyze existing missions to develop task-based model of mission workflow
    Includes capturing dependencies
  Some tasks generic: e.g., day-to-day support for network services
  Others detailed: specific to particular mission's structure & objectives
  Models captured in mission-model database
Tying Mission Activity to Assets
  Establish mapping between tasks in cyber-missions and their corresponding required assets
    Historically, a difficult/error-prone manual step
  Starting point: estimated mapping supplied by domain experts
    Likely incomplete and subject to some spurious errors
  Next: develop tools to automate discovery of relationships
    Based on observing event history over a number of runs
    Capture dependencies in a form that supports automated reasoning
    Ideally, observe numerous runs, including some with failures
    Need to also recognize and factor out routine background traffic
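One very simple way to picture the automated step above is set arithmetic over observed runs: keep the assets seen in every run of a task, drop those attributable to routine background traffic, and compare the result with the expert-supplied mapping. The Python sketch below is only that picture, under stated assumptions: the function names and the set-of-labels representation are hypothetical, and subtracting background assets is crude, since an asset seen in background traffic (DNS, say) may still be genuinely required.

```python
def infer_required_assets(runs, background):
    """Assets observed in every run of a task, minus routine background assets.

    `runs` is a hypothetical input: one collection of asset labels per
    observed run. `background` holds assets seen when the task is not running.
    """
    run_sets = [set(r) for r in runs]
    if not run_sets:
        return set()
    return set.intersection(*run_sets) - set(background)


def compare_with_expert(inferred, expert):
    """Return (assets the expert mapping lacks, expert assets never observed)."""
    inferred, expert = set(inferred), set(expert)
    return inferred - expert, expert - inferred
```

The first element of the comparison points at gaps in the expert mapping; the second points at entries that are either spurious or tied to behavior (such as failover) that the observed runs did not exercise.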
Inferring Types of Dependencies
  Some dependencies are indirect
    E.g., use of a URL for coordination results in a dependency on DNS
    Caching can hide the DNS lookup in example runs, but we can infer it when we see use of the URL
  Some dependencies have complex failover behavior
    E.g., a videoconference might fail over to audio-only
    Manifests as use of quite different service elements
    Requires hierarchical modeling to identify subtasks that differ while maintaining the same overall task structure
Abstracting Types of Dependencies
  Any member of a pool of services can provide the service with the same performance/cost
  An asset relies on a partition of services (of different types) when it requires an instance of each service to achieve full performance
    However, may operate at a degraded level if it can access only a subset
  Assets rely on alternate services when multiple elements provide the service, but at varying performance/cost levels
  An asset requires a composition of services if each is vital for the asset being available
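The four dependency types above differ in what happens when only some of the underlying services are reachable. The following Python sketch encodes one reading of those definitions; it is an illustration, not the formalism from the talk. The names `Kind` and `status`, the three status strings, and the convention that an alternate dependency lists its services best-first are all assumptions introduced here.

```python
from enum import Enum


class Kind(Enum):
    POOL = "pool"
    PARTITION = "partition"
    ALTERNATE = "alternate"
    COMPOSITION = "composition"


def status(kind: Kind, services, available) -> str:
    """Return "full", "degraded", or "unavailable" for one dependency.

    `services` lists the (distinct) services the asset depends on; for
    ALTERNATE it is assumed to be ordered from best to worst.
    """
    up = [s for s in services if s in available]
    if not up:
        return "unavailable"
    if kind is Kind.POOL:
        return "full"                      # any member is equally good
    if kind is Kind.PARTITION:
        return "full" if len(up) == len(services) else "degraded"
    if kind is Kind.ALTERNATE:
        return "full" if up[0] == services[0] else "degraded"
    if kind is Kind.COMPOSITION:
        return "full" if len(up) == len(services) else "unavailable"
    raise ValueError(f"unknown dependency kind: {kind!r}")
```

With the same single service outage, a pool stays at full service, a partition degrades, and a composition becomes unavailable, which is why the dependency type matters for reasoning about mission availability.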
Classes of Dependencies, cont'd
  Each dependency type (pool, alternate, etc.) has different implications for mission availability
  Research target: develop algorithms for inferring type of dependency
    Based on observing failures during in situ mission runs
  Challenge #1: robustly recognizing service failure
    Failed network connections
    Application-specific error messages
    Failure codes
    Alteration of task structure
  Challenge #2: failures are rare
    If we don't observe them in situ, we may need to inject faults in a controlled fashion
    Requires highly cooperative mission operators
Summary
  We require a methodical approach to
    (1) understanding what assets are available
      including issues of churn and rarely-seen backup services
      for which we can apply both passive and active techniques
    and then (2) extracting dependencies among assets and their ties to mission progression
      based on observing multiple mission runs
      across which we apply inference techniques
  Given these dependencies, we then abstract them based on how they play sole and/or critical roles, capturing both:
    models of missions, and
    formalism to reason across the structure of the different types of dependencies