Secure Network Provenance




Wenchao Zhou*, Qiong Fei*, Arjun Narayan*, Andreas Haeberlen*, Boon Thau Loo*, Micah Sherr+
* University of Pennsylvania
+ Georgetown University
http://snp.cis.upenn.edu/

Motivation
[Figure: Alice at router A, routers B, C, D, E, and foo.com, with routes r1 and r2. Alice asks: "Why did my route to foo.com change?!" Innocent reason? Malicious attack?]
- An example scenario: network routing
- System administrator observes strange behavior
- Example: the route to foo.com has suddenly changed
- What exactly happened (innocent reason or malicious attack)?

We Need Secure Forensics
- For network routing
  - Example: incident in March 2010, when traffic from Capitol Hill got redirected
- ... but also for other application scenarios
  - Distributed hash table: Eclipse attack
  - Cloud computing: misbehaving machines
  - Online multiplayer gaming: cheating
- Goal: secure forensics in adversarial scenarios

Ideal Solution
[Figure: Alice queries "the Network" (routers A to E, foo.com, route r2).]
- Q: Explain why the route to foo.com changed to r2.
- A: Because someone accessed Router D and changed the configuration from X to Y.
- Not realistic: adversary can tell lies

Challenge: Adversaries Can Lie
[Figure: Alice queries "the Network" (routers A to E, foo.com, route r2).]
- Q: Explain why the route to foo.com changed to r2.
- A: "Everything is fine. Router E advertised a new route."
- Problem: adversary can...
  - fabricate a plausible (yet incorrect) response
  - point accusation towards innocent nodes

Existing Solutions
- Existing systems assume trusted components
  - Trusted OS kernel, monitor, or hardware
  - E.g. Backtracker [OSDI 06], PASS [USENIX ATC 06], ReVirt [OSDI 02], A2M [SOSP 07]
  - These components may have bugs or be compromised
  - Are there alternatives that do not require such trust?
- Our solution:
  - We assume no trusted components
  - Adversary has full control over an arbitrary subset of the network (Byzantine faults)

Ideal Guarantees
- Ideally: explanation is always complete and accurate
  - Fundamentally impossible
- Fundamental limitations
  - E.g. faulty nodes secretly exchange messages
  - E.g. faulty nodes communicate outside the system
- What guarantees can we provide?

Realistic Guarantees
[Figure: Alice queries "the Network" (routers A to E, foo.com, route r2).]
- Q: Why did my route to foo.com change to r2?
- A: Because someone accessed Router D and changed its router configuration from X to Y.
- Alice: "Aha, at least I know which node is compromised."
- No faults: explanation is complete and accurate
- Byzantine fault: explanation identifies at least one faulty node
- Formal definitions and proofs in the paper

Outline
- Goal: A secure forensics system that works in an adversarial environment
  - Explains unexpected behavior
  - No faults: explanation is complete and accurate
  - Byzantine fault: exposes at least one faulty node with evidence
- Model: Secure Network Provenance
- Tamper-evident Maintenance and Processing
- Evaluation
- Conclusion

Provenance as Explanations
[Figure: provenance graph spread over nodes A to E, with vertices route(A, foo.com), link(A, B), route(B, foo.com), link(B, C), route(C, foo.com), link(C, foo.com), route(D, foo.com), link(D, E), route(E, foo.com), link(E, B).]
- Origin: data provenance in databases
  - Explains the derivation of tuples (ExSPAN [SIGMOD 10])
  - Captures the dependencies between tuples as a graph
  - Explanation of a tuple is a tree rooted at the tuple
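The idea that an explanation is a tree rooted at the queried tuple can be sketched in a few lines of Python. This is only an illustration: the `Vertex` class and `explain` function are hypothetical names, and the dependency edges (each route depends on a local link and the next hop's route) are an assumption read off the figure, not something the slides spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """One tuple in the provenance graph (hypothetical structure)."""
    tuple_name: str
    causes: list = field(default_factory=list)  # tuples this one was derived from


def explain(vertex, depth=0):
    """Return the explanation tree rooted at `vertex` as indented lines."""
    lines = ["  " * depth + vertex.tuple_name]
    for cause in vertex.causes:
        lines.extend(explain(cause, depth + 1))
    return lines


# Assumed dependencies for the A -> B -> C -> foo.com path.
link_c = Vertex("link(C, foo.com)")
route_c = Vertex("route(C, foo.com)", [link_c])
link_b = Vertex("link(B, C)")
route_b = Vertex("route(B, foo.com)", [link_b, route_c])
link_a = Vertex("link(A, B)")
route_a = Vertex("route(A, foo.com)", [link_a, route_b])

tree = explain(route_a)
print("\n".join(tree))
```

Printing `tree` gives route(A, foo.com) at the root, with the links and downstream routes nested beneath it.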

Secure Network Provenance
Challenge #1. Handle past and transient behavior
[Figure: timeline with provenance graph snapshots at Time = t1, t2, and t3. One snapshot holds route(C, foo.com) and link(C, foo.com); one adds route(B, foo.com) and link(B, C); one adds route(A, foo.com) and link(A, B).]
- Traditional data provenance targets current, stable state
- What if the system never converges?
- What if the state no longer exists?
- Solution: Add a temporal dimension
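One way to picture the temporal dimension is to attach a validity interval to every tuple, so that state which no longer exists can still be queried. The sketch below assumes integer timestamps and a hypothetical `Interval` record; the concrete times are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    """A tuple together with the time span during which it existed."""
    tuple_name: str
    start: int
    end: Optional[int] = None  # None means the tuple still exists

    def holds_at(self, t):
        return self.start <= t and (self.end is None or t < self.end)


# Hypothetical history: the route at A appears at t=3 and vanishes at t=5.
history = [
    Interval("link(C, foo.com)", 1),
    Interval("route(C, foo.com)", 1),
    Interval("link(B, C)", 2),
    Interval("route(B, foo.com)", 2),
    Interval("link(A, B)", 3, 5),
    Interval("route(A, foo.com)", 3, 5),
]


def state_at(t):
    """Return the tuples that existed at time t, including past state."""
    return sorted(i.tuple_name for i in history if i.holds_at(t))


print(state_at(1))
print(state_at(4))
```

With intervals in place, a query about a past or transient tuple is answered from the history rather than from the current state.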

Secure Network Provenance
Challenge #2. Explain changes, not just state
[Figure: the timeline (Time = t1, t2, t3) with delta vertices +link(C, foo.com), +route(C, foo.com), +route(B, foo.com), and +route(A, foo.com) alongside the existing tuples.]
- Traditional data provenance targets system state
- Often more useful to ask why a tuple (dis)appeared
- Solution: Include deltas in provenance
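Delta vertices can be sketched as records that carry a sign, a timestamp, and the deltas that caused them. The `Delta` class, the `why` helper, and the timestamps below are assumptions made for this example; the slides only state that deltas become part of the provenance.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    """A change to a tuple: '+' for appearance, '-' for disappearance."""
    sign: str
    tuple_name: str
    time: int
    causes: list = field(default_factory=list)  # deltas that triggered this one

    def label(self):
        return f"{self.sign}{self.tuple_name}@t{self.time}"


def why(delta):
    """Collect the chain of deltas that led to `delta`, root first."""
    chain = [delta.label()]
    for cause in delta.causes:
        chain.extend(why(cause))
    return chain


# Hypothetical sequence: the link at C appears, and routes propagate upstream.
add_link_c = Delta("+", "link(C, foo.com)", 1)
add_route_c = Delta("+", "route(C, foo.com)", 1, [add_link_c])
add_route_b = Delta("+", "route(B, foo.com)", 2, [add_route_c])
add_route_a = Delta("+", "route(A, foo.com)", 3, [add_route_b])

chain = why(add_route_a)
print(chain)
```

Asking why `+route(A, foo.com)` happened walks back through the earlier appearances instead of describing a static snapshot.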

Secure Network Provenance
Challenge #3. Partition and secure provenance
[Figure: provenance graph with vertices route(D, foo.com), link(D, E), route(E, foo.com), link(E, B), route(A, foo.com), link(A, B), route(B, foo.com), link(B, C), route(C, foo.com), link(C, foo.com).]
- A trusted node would be ideal, but we don't have one
- Need to partition the graph among the nodes themselves
- Prevent nodes from altering the graph

Partitioning the Provenance Graph
- Step 1: Each node keeps vertices about local actions
  - Split cross-node communications (SEND and RECV vertices)
- Step 2: Make the graph tamper-evident

Securing Cross-Node Edges
[Figure: Router B's SEND vertex and Router A's RECV vertex, connected by a signed commitment from B and a signed ACK from A.]
- Step 2: Make the graph tamper-evident
  - Secure cross-node edges (evidence of omissions)
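The exchange of a signed commitment and a signed ACK can be sketched as follows. As a loud assumption, this example uses HMAC with per-node secret keys as a stand-in for real digital signatures; an actual deployment would need public-key signatures so that third parties can verify the evidence. The key values, message layout, and function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical per-node keys; a stand-in for each node's private signing key.
KEYS = {"A": b"key-of-A", "B": b"key-of-B"}


def sign(node, payload):
    """Produce a MAC over the payload (stand-in for a digital signature)."""
    return hmac.new(KEYS[node], payload, hashlib.sha256).hexdigest()


def verify(node, payload, signature):
    return hmac.compare_digest(sign(node, payload), signature)


def send(sender, receiver, message):
    """Sender commits to having sent `message` (the SEND vertex)."""
    payload = f"SEND|{sender}|{receiver}|{message}".encode()
    return {"payload": payload, "sig": sign(sender, payload)}


def ack(receiver, commitment):
    """Receiver acknowledges the commitment (the RECV vertex)."""
    payload = b"ACK|" + commitment["payload"]
    return {"payload": payload, "sig": sign(receiver, payload)}


# B advertises a route to A; A acknowledges it.
commitment = send("B", "A", "+route(B, foo.com)")
receipt = ack("A", commitment)

print(verify("B", commitment["payload"], commitment["sig"]))
print(verify("A", receipt["payload"], receipt["sig"]))
```

After the exchange, A holds evidence that B sent the message and B holds evidence that A received it, so neither side can later omit its half of the cross-node edge without contradicting the other's evidence.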

System Overview
[Figure: the primary system (users, application, network) next to the provenance system (operator, query engine, maintenance engine). Steps shown: extract provenance, maintain provenance, query provenance.]
- Stand-alone provenance system
- On-demand provenance reconstruction
  - Provenance graph can be huge (with temporal dimension)
  - Rebuild only the parts needed to answer a query

Extracting Dependencies
- Option 1: Inferred provenance
  - Declarative specifications explicitly capture provenance
  - E.g. declarative networking, SQL queries, etc.
- Option 2: Reported provenance
  - Modified source code reports provenance
- Option 3: External specification
  - Defined on observed I/Os of a black-box system
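To illustrate Option 1, the sketch below evaluates two Datalog-style rules, route(X, D) :- link(X, D) and route(X, D) :- link(X, Y), route(Y, D), and records which body tuples produced each head tuple. The rules and the `derive_routes` function are assumptions chosen to match the running example; they are not taken from the SNP implementation.

```python
def derive_routes(links, dest):
    """Evaluate the two routing rules and record provenance as a side effect.

    Returns a map from each derived route tuple to the tuples it came from.
    """
    provenance = {}
    # Rule 1: route(X, D) :- link(X, D)
    for x, y in links:
        if y == dest:
            provenance[f"route({x}, {dest})"] = [f"link({x}, {y})"]
    # Rule 2: route(X, D) :- link(X, Y), route(Y, D), applied to a fixpoint
    changed = True
    while changed:
        changed = False
        for x, y in links:
            head = f"route({x}, {dest})"
            body = f"route({y}, {dest})"
            if body in provenance and head not in provenance:
                provenance[head] = [f"link({x}, {y})", body]
                changed = True
    return provenance


links = [("A", "B"), ("B", "C"), ("C", "foo.com")]
prov = derive_routes(links, "foo.com")
for head, body in prov.items():
    print(head, "<-", body)
```

Because the rule body names its inputs explicitly, the dependencies fall out of evaluation without modifying the application, which is the appeal of inferred provenance.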

Secure Provenance Maintenance
[Figure: Alice, routers A to E, and foo.com exchanging SEND, RECV, and ACK messages.]
- Maintain sufficient information for reconstruction
  - I/O and nondeterministic events are sufficient
- Logs are maintained using tamper-evident logging
  - Based on ideas from PeerReview [SOSP 07]
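Tamper-evident logging is commonly built on a hash chain, in which each entry's hash covers the previous entry's hash. The sketch below shows that general technique under simplifying assumptions: the entry layout and the names `Log`, `entry_hash`, and `verify_chain` are hypothetical, and signatures on the chain heads are left out.

```python
import hashlib

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash, seq, kind, content):
    """Hash an entry together with the hash of its predecessor."""
    data = f"{prev_hash}|{seq}|{kind}|{content}".encode()
    return hashlib.sha256(data).hexdigest()


class Log:
    """Append-only log; each entry is (seq, kind, content, hash)."""

    def __init__(self):
        self.entries = []

    def append(self, kind, content):
        prev = self.entries[-1][3] if self.entries else GENESIS
        seq = len(self.entries)
        digest = entry_hash(prev, seq, kind, content)
        self.entries.append((seq, kind, content, digest))
        return digest


def verify_chain(entries):
    """Recompute every hash; any edited or dropped entry breaks the chain."""
    prev = GENESIS
    for seq, kind, content, digest in entries:
        if entry_hash(prev, seq, kind, content) != digest:
            return False
        prev = digest
    return True


log = Log()
log.append("RECV", "+route(C, foo.com) from C")
log.append("SEND", "+route(B, foo.com) to A")
log.append("ACK", "from A")
print(verify_chain(log.entries))
```

Once a node has committed to the hash of its latest entry, it cannot rewrite earlier entries without the recomputed chain disagreeing with that commitment.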

Secure Provenance Querying
- Recursively construct the provenance graph
  - Retrieve secure logs from remote nodes
  - Check for tampering, omission, and equivocation
  - Replay the log to regenerate the provenance graph
[Figure, shown in three steps. Alice asks: "Explain the route from A to foo.com."
 Step 1: the graph contains route(A, foo.com), link(A, B), and a RECV (from B).
 Step 2: the graph adds route(B, foo.com), link(B, C), and a RECV (from C).
 Step 3: the graph adds route(C, foo.com) and link(C, foo.com). Alice: "OK. Now I know how the route was derived."]
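The recursive query can be sketched as a function that replays one node's log, emits the local vertices, and follows each RECV to the sending node. The log format and the names `LOGS`, `replay`, and `query` are hypothetical, and the checks for tampering, omission, and equivocation are assumed to have passed and are omitted here.

```python
# Hypothetical per-node logs, already fetched and verified.
LOGS = {
    "A": [
        ("base", "link(A, B)"),
        ("recv", "B", "route(B, foo.com)"),
        ("derive", "route(A, foo.com)", ["link(A, B)", "route(B, foo.com)"]),
    ],
    "B": [
        ("base", "link(B, C)"),
        ("recv", "C", "route(C, foo.com)"),
        ("derive", "route(B, foo.com)", ["link(B, C)", "route(C, foo.com)"]),
    ],
    "C": [
        ("base", "link(C, foo.com)"),
        ("derive", "route(C, foo.com)", ["link(C, foo.com)"]),
    ],
}


def replay(node):
    """Replay a node's log into a map: tuple -> (local causes, remote sender)."""
    graph = {}
    for entry in LOGS[node]:
        kind = entry[0]
        if kind == "base":
            graph[entry[1]] = ([], None)
        elif kind == "recv":
            graph[entry[2]] = ([], entry[1])
        elif kind == "derive":
            graph[entry[1]] = (entry[2], None)
    return graph


def query(node, tup, depth=0):
    """Explain `tup` at `node`, crossing to the sender at each RECV."""
    causes, sender = replay(node)[tup]
    lines = ["  " * depth + f"{tup} @ {node}"]
    if sender is not None:
        lines.extend(query(sender, tup, depth + 1))  # cross-node edge
    for cause in causes:
        lines.extend(query(node, cause, depth + 1))
    return lines


result = query("A", "route(A, foo.com)")
print("\n".join(result))
```

Only the logs of nodes that appear in the answer are replayed, which reflects the on-demand reconstruction described in the system overview.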

Evaluation Results
- Prototype implementation (SNooPy)
- How useful is SNP? Is it applicable to different systems?
- How expensive is SNP at runtime?
  - Traffic overhead, storage cost, additional CPU overhead?
- Does SNP affect scalability?
- What is the querying performance?
  - Per-query traffic overhead?
  - Turnaround time for each query?

Usability: Applications
- We evaluated SNooPy with
  - Quagga BGP: RouteView (external specification)
    - Explains oscillation caused by router misconfiguration
  - Hadoop MapReduce (reported provenance)
    - Detects a tampered Mapper that returns inaccurate results
  - Declarative Chord DHT (inferred provenance)
    - Detects an Eclipse attacker that always returns its own ID
- SNooPy solves problems reported in existing work

Runtime Overhead: Storage
[Figure: storage overhead chart. Annotation: over 50% of the overhead was due to signatures and acks; batching messages would help.]
- Manageable storage overhead
  - One week of data: e.g. Quagga 7.3 GB; Chord 665 MB

Query Latency
[Figure: query latency chart, broken down into verification, replay, and download. Annotations: "largely due to replaying logs" and "dominated by verifying logs and snapshots".]
- Query latency varies from application to application
- Reasonable overhead

Summary
- Secure network provenance in untrusted environments
  - Requires no trusted components
  - Strong guarantees even in the presence of Byzantine faults
  - Formal proof in a technical report
- Significantly extends traditional provenance model
  - Past and transient state, provenance of change
- Efficient storage: reconstructs provenance graph on demand
- Application-independent (Quagga, Hadoop, and Chord)
- Questions? Project website: http://snp.cis.upenn.edu/