Trustworthy Software Systems

Similar documents
CONTINUOUS DIAGNOSTICS BEGINS WITH REDSEAL

NERC CIP VERSION 5 COMPLIANCE

SAFECode Security Development Lifecycle (SDL)

PCI-DSS Penetration Testing

High Level Cyber Security Assessment 2/1/2012. Assessor: J. Doe

elearning for Secure Application Development

WIND RIVER SECURE ANDROID CAPABILITY

The Road from Software Testing to Theorem Proving

90% of data breaches are caused by software vulnerabilities.

I Control Your Code Attack Vectors Through the Eyes of Software-based Fault Isolation. Mathias Payer, ETH Zurich

KASPERSKY SECURITY INTELLIGENCE SERVICES. EXPERT SERVICES.

DeltaV System Cyber-Security

Protecting Your Organisation from Targeted Cyber Intrusion

Advances in Programming Languages

eguide: Designing a Continuous Response Architecture Executive s Guide to Windows Server 2003 End of Life

CS 356 Lecture 25 and 26 Operating System Security. Spring 2013

Homeland Security Red Teaming

QuickBooks Online: Security & Infrastructure

Secure Software Programming and Vulnerability Analysis

Start building a trusted environment now... (before it s too late) IT Decision Makers

Defense-in-Depth Strategies for Secure, Open Remote Access to Control System Networks

Criteria for web application security check. Version

What is Really Needed to Secure the Internet of Things?

DEFENSE THROUGHOUT THE VULNERABILITY LIFE CYCLE WITH ALERT LOGIC THREAT AND LOG MANAGER

Web application security: automated scanning versus manual penetration testing.

UNCLASSIFIED Version 1.0 May 2012

The Protection Mission a constant endeavor

Juniper Networks Secure

Building A Secure Microsoft Exchange Continuity Appliance

Protect Your Organization With the Certification That Maps to a Master s-level Education in Software Assurance

Adobe Flash Player and Adobe AIR security

ensure prompt restart of critical applications and business activities in a timely manner following an emergency or disaster

Course Content Summary ITN 261 Network Attacks, Computer Crime and Hacking (4 Credits)

Machine vs. Machine: Lessons from the First Year of Cyber Grand Challenge. Approved for Public Release, Distribution Unlimited

Information Security Services

Web Application Security

Designing and Coding Secure Systems

(General purpose) Program security. What does it mean for a pgm to be secure? Depends whom you ask. Takes a long time to break its security controls.

The Cyber Threat Profiler

Three Ways to Secure Virtual Applications

Developing Secure Software in the Age of Advanced Persistent Threats

Next-Generation Firewalls: Critical to SMB Network Security

Top five strategies for combating modern threats Is anti-virus dead?

North Dakota 2013 IT Security Audit Vulnerability Assessment & Penetration Test Project Briefing

Bringing Security Testing to Development. How to Enable Developers to Act as Security Experts

Improvements Needed With Host-Based Intrusion Detection Systems

Foundstone ERS remediation System

Sandbox Roulette: Are you ready for the gamble?

Thick Client Application Security

Access Control Fundamentals

CS 356 Lecture 17 and 18 Intrusion Detection. Spring 2013

What Do You Mean My Cloud Data Isn t Secure?

Web Application Security. Vulnerabilities, Weakness and Countermeasures. Massimo Cotelli CISSP. Secure

INTRODUCING isheriff CLOUD SECURITY

Seven Things To Consider When Evaluating Privileged Account Security Solutions

ADDING NETWORK INTELLIGENCE TO VULNERABILITY MANAGEMENT

Redhawk Network Security, LLC Layton Ave., Suite One, Bend, OR

Background. How much does EMET cost? What is the license fee? EMET is freely available from Microsoft without material cost.

X05. An Overview of Source Code Scanning Tools. Loulwa Salem. Las Vegas, NV. IBM Corporation IBM System p, AIX 5L & Linux Technical University

Adversary Modelling 1

Threat Modeling. Frank Piessens ) KATHOLIEKE UNIVERSITEIT LEUVEN

2015 Vulnerability Statistics Report

The True Story of Data-At-Rest Encryption & the Cloud

Network Mission Assurance

Analyzing Security for Retailers An analysis of what retailers can do to improve their network security

ICTN Enterprise Database Security Issues and Solutions

How to Build a Trusted Application. John Dickson, CISSP

CyberNEXS Global Services

CompTIA Security+ (Exam SY0-410)

Software security specification and verification

HTTPS Inspection with Cisco CWS

Data Storage Security in Cloud Computing

Why should I care about PDF application security?

NETWORK SECURITY. 3 Key Elements

QUIRE: : Lightweight Provenance for Smart Phone Operating Systems

RSA Enterprise Compromise Assessment Tool (ECAT) Date: January 2014 Authors: Jon Oltsik, Senior Principal Analyst and Tony Palmer, Senior Lab Analyst

Enterprise Cybersecurity Best Practices Part Number MAN Revision 006

Secure Software Programming and Vulnerability Analysis

CDM Software Asset Management (SWAM) Capability

Security Assessment of Waratek AppSecurity for Java. Executive Summary

Cutting Edge Practices for Secure Software Engineering

CORE INSIGHT ENTERPRISE: CSO USE CASES FOR ENTERPRISE SECURITY TESTING AND MEASUREMENT

SY system so that an unauthorized individual can take over an authorized session, or to disrupt service to authorized users.

The CompCert verified C compiler

PREVENTING ZERO-DAY ATTACKS IN MOBILE DEVICES

Complete Web Application Security. Phase1-Building Web Application Security into Your Development Process

ASL IT SECURITY XTREME XPLOIT DEVELOPMENT

White Paper. Guide to PCI Application Security Compliance for Merchants and Service Providers

The Seven Deadly Myths of Software Security Busting the Myths

Sophistication of attacks will keep improving, especially APT and zero-day exploits

TOP SECRETS OF CLOUD SECURITY

Transcription:

Trustworthy Software Systems Greg Morrisett Cutting Professor of Computer Science School of Engineering & Applied Sciences Harvard University

Little about me Research & Teaching Compilers, Languages, Formal Methods Software Security Harvard Center for Research on Computation & Society Number of security-oriented advisory boards Microsoft Trustworthy Computing Board (& MSR TAB) Intel-Berkeley SCRUB Lab Fortify (bought by HP) DARPA ISAT National Academy Study on Science of Cybersecurity

All too familiar headlines

From DARPA s Cyber Analytic Framework

Attackers penetrate the architecture easily Goal Demonstrate asymmetric ease of exploitation of DoD computer versus efforts to defend. Hijacked web page Infected.pdf document Result Multiple remote compromises of fully security compliant and patched HBSS computer within days: 2 remote exploits 25+ local privilege escalation exploits Undetected by defenses HBSS Workstation Penetration Demonstration Total Effort: 2 people, 3 days, Total cost = $18K HBSS Costs: Millions of dollars a year for software and 5 licenses alone (not including man hours) Approved for public release; distribution is unlimited = Host Based Security System (HBSS)

Ground truth DoD Reported Incidents of Malicious Cyber Activity, 2000 2009 10.0 8.0 6.0 4.0 2.0 0.0 Federal Defensive Cyber Spending ($B) [1] 2010 REPORT TO CONGRESS of the U.S.-CHINA ECONOMIC AND SECURITY REVIEW COMMISSION [1] INPUT reports 2004 2009 6 Approved for public release; distribution is unlimited

We are divergent with the threat 10,000,000 x Unified Threat Management Lines of Code 8,000,000 6,000,000 4,000,000 2,000,000 Milky Way Network Flight Recorder Security software Malware: 125 lines of code* DEC Seal x Stalker Snort x x x 0 1985 1990 1995 2000 2005 2010 x * Malware lines of code averaged over 9,000 samples Approved for public release; distribution is unlimited 7

Many Things Need to be Fixed User interfaces (and users) Underlying Architecture Underlying Protocols Configuration & Operation tools But one huge issue dominates right now: The code we depend upon is full of bugs.

What s going wrong? Development processes are ineffective. Human code review doesn t work. Certification processes are ineffective. Based on who authored, not the code itself. Current automated defenses are worse than ineffective. Based on syntax or provenance, not semantics. Introduce new classes of vulnerabilities.

Additional security layers often create vulnerabilities Current vulnerability watchlist Vulnerability Title Fix Avail? Date Added XXXXXXXXXXXX XXXXXXXXXXXX Local Privilege Escalation Vulnerability No 8/25/2010 XXXXXXXXXXXX XXXXXXXXXXXX Denial of Service Vulnerability Yes 8/24/2010 XXXXXXXXXXXX XXXXXXXXXXXX Buffer Overflow Vulnerability No 8/20/2010 XXXXXXXXXXXX XXXXXXXXXXXX Sanitization Bypass Weakness No 8/18/2010 XXXXXXXXXXXX XXXXXXXXXXXX Security Bypass Vulnerability No 8/17/2010 XXXXXXXXXXXX XXXXXXXXXXXX Multiple Security Vulnerabilities Yes 8/16/2010 XXXXXXXXXXXX XXXXXXXXXXXX Remote Code Execution Vulnerability No 8/16/2010 XXXXXXXXXXXX XXXXXXXXXXXX Use-After-Free Memory Corruption Vulnerability No 8/12/2010 XXXXXXXXXXXX XXXXXXXXXXXX Remote Code Execution Vulnerability No 8/10/2010 XXXXXXXXXXXX XXXXXXXXXXXX Multiple Buffer Overflow Vulnerabilities No 8/10/2010 XXXXXXXXXXXX XXXXXXXXXXXX Stack Buffer Overflow Vulnerability Yes 8/09/2010 XXXXXXXXXXXX XXXXXXXXXXXX Security-Bypass Vulnerability No 8/06/2010 XXXXXXXXXXXX XXXXXXXXXXXX Multiple Security Vulnerabilities No 8/05/2010 XXXXXXXXXXXX XXXXXXXXXXXX Buffer Overflow Vulnerability No 7/29/2010 XXXXXXXXXXXX XXXXXXXXXXXX Remote Privilege Escalation Vulnerability No 7/28/2010 XXXXXXXXXXXX XXXXXXXXXXXX Cross Site Request Forgery Vulnerability No 7/26/2010 XXXXXXXXXXXX XXXXXXXXXXXX Multiple Denial Of Service Vulnerabilities No 7/22/2010 6 of the vulnerabilities are in security software Color Code Key: Vendor Replied Fix in development Awaiting Vendor Reply/Confirmation Awaiting CC/S/A use validation 10 Approved for public release; distribution is unlimited

Market Failures This problem won t be solved by startups: Developers are stingy. Developers make money/fame by adding features, not by doing security audits or fixes. Contrast with attackers. They make money by doing careful audits

So how do we dig ourselves out of this mess?

Ideal Architecture: untrusted code Policy Checker policy Policies capture behavior. The checker automatically rules out any code that will violate the policy. The checker is small, simple, trustworthy, and automatic. 13

Unfortunately Even simple policies are undecidable. e.g., Does the code have a buffer overflow? So any checker is either incomplete or unsound. Incomplete: rules out programs that meet the policy Unsound: allows a program that fails to meet the policy Analyzing machine code is hard. It s hard enough to analyze real source code for simple policies. Any machine-level analysis requires a big, complicated checker. So how can we trust that it s doing its job correctly? So shift the burden.! 14

Key Observation Finding a proof is hard. Checking a proof can be easy. 15

Proof-Carrying Code [Necula & Lee 97] untrusted code proof Proof Checker policy Code comes with a proof that it satisfies the policy. The proof checker ensures that: a) the proof is valid b) the conclusion says this code respects the policy 16

Good Properties of PCC We can build a trustworthy proof checker. ~1K lines of code. The coupling is tamper-proof. Change code: verifier will discover that the proof no longer talks about the same code. Change proof: verifier will discover if it s no longer valid. No secrets to have stolen. Relative completeness. Any policy that can be formalized. Any code that provably respects the policy. Enables integration No longer matters who produced the code. 17

PCC is No Silver Bullet How to get proofs? Policies of interest are hard to prove. Manual proof construction is order(s) of magnitude harder to write than code. What policies should we enforce? How do you formalize nothing bad? Proofs are relative to models. How do we model the real world? 18

How to get proofs? 1. Use high-level code & a certifying compiler. 2. Use automatic analysis. 3. Insert checks that make it easier to build proof. 4. Change the policy so it s easier to build proof. 5. Get the programmer to help. In reality, we have to do all of these 19

Rest of this Talk Research investments that can help us build proofs of safety & security for real code. Compiler verification Static analysis In-lined reference monitors Proof engineering & automation Domain-specific languages & logics

Reasoning About Machine Code Building proofs about machine code is hard. Prefer to construct proofs about a high-level language. But then there is a gap A bug in the compiler can lead to an exploit. Most browser vulnerabilities are due to bugs in Java or Javascript implementations. See Yang et al. s 2011 PLDI paper.

Proven Correct Compilers CompCert [http://compcert.inria.fr] Optimizing C compiler Back-ends for x86, Arm, PPC Competitive with gcc O1! Proof of correctness: C code has same I/O behavior of generated machine code. Means we can reason about the source code instead of the machine code for most policies. See for instance, Andrew Appel s program logic.

Proof Engineering

Still Many Challenges From -01 to -03; From WAT to JIT. Higher-level languages than C. c.f., core-ml compiler out of Cambridge. Issues reasoning about multi-core programs. c.f., Sewell & Batty s work on C++11. Proof of correctness ~10x the size of code. Some policies not preserved by refinement.

Proved Correct Compilers Shift reasoning from machine to source code. But we still need to produce a proof for the source code

Static Analysis Static analysis tools are now viable for detecting a wide class of common bugs in source code. Prefast, Coverity, HP/Fortify, Based on foundational research in program analysis from the 1980 s-2000 s. However, for legacy code: Generate too many false positives. For that matter, too many true positives as well. Today s commercial tools are (purposefully) unsound.

An alternative: In-lined Reference Monitors (IRMs) Formulate a safety policy. e.g., will not access the network. Insert run-time checks into the code to enforce the policy. Needed at security-critical events. But also must insert checks to protect the monitor! Makes it easy to prove that the compiler respects the policy. Importantly: avoids false positives of static analyses. However, we can use static analysis, to optimize checks. Only eliminate a check if you can prove it s safe to do so. So the role of analysis is purely for optimization.

Some Example Policies SFI: Software Fault Isolation [Wahbe et al.] CFI: Control-Flow Isolation [Abadi et al.] XFI: Extended Flow Isolation [Erlingsson et al.] SafeCode [Dhurjati et al.] These policies attempt to stop various forms of control-flow hijacking and/or data corruption attacks in legacy C/C++ code.

Policy Tradeoffs 1. What vulnerabilities are mitigated? 2. How much legacy code do we break? 3. How much overhead do we incur? 4. How hard is it to get the implementation right?

Some Example Policies SFI: Software Fault Isolation [Wahbe et al.] Forces code to execute in a sandbox. Low overhead (~5% on 32-bit x86), easy to enforce. But doesn t stop hijacking code or data within the sandbox. CFI: Control-Flow Isolation [Abadi et al.] Forces code to follow a control-flow graph. Not as lightweight as SFI (~10-20%?), harder to implement. Stops most (but not all) control hijacks such as ROP attacks; no data. XFI: Extended Flow Isolation [Erlingsson et al.] Extends CFI with stack-protection, even more expensive. SafeCode [Dhurjati et al.] Enforces a type-safety discipline (code + data). Overheads range from 20-150%, very hard to implement well. Stops all control hijacking, many data integrity attacks.

Zooming in on one of these Google wanted to use SFI to provide a sandbox for their Native Client extension to the Chrome Browser. We built a checker that allows Google to verify that a binary will respect the policy. Specialized to this policy: 80 lines of code! We proved that the checker is correct. [See Morrisett et al., PLDI 2012].

Another IRM Example [Adve] Based on the SAFEcode compiler [PLDI 06]: Compiles C, C++, Java, Haskell, etc. Enforces a much stronger policy than SFI. Works by instrumenting the LLVM intermediate representation + some runtime support. Has been used to compile Linux 2.4.22 & NetBSD For Linux, prevented 4 of 5 known vulnerabilities ~20% - 50% overhead

Certification for SAFEcode C/C++ Clang LLVM IR SAFEcode analysis & transforms LLVM IR Proof witness LLVM IR LLVM optimizer LLVM IR proof checker equiv checker LLVM code generator Binary

Summary Thus Far Formulate safety policy as an in-lined reference monitor. Automates policy enforcement; simplifies proof. Technologies from analysis used to cut overheads of monitoring. Technologies from proof-preserving compilation eliminate the need to trust the tools.

Modeling A major challenge for both the Google and SAFECode efforts was constructing formal models of the underlying machines. x86 model has thousands of instructions. Building & validating such models is crucial for any real-world application of formal methods. UK researchers have taken the lead here: C++, x86, ARM, TCP, Javascript,

Richer Policies IRM s make it possible to automate basic safety properties. But for safety & security-critical software, we need policies that cover confidentiality, availability, and functional correctness. Much harder to get proofs. Example: SEL4 micro-kernel + proof of correctness ~20 person years of effort.

Proof Automation Some of this is alleviated by advances in automatic theorem proving technology. SAT solvers; SMT provers. Both have seen dramatic improvements. Still have many hard challenges here. Much more is alleviated by using domainspecific languages, logic, & decision procedures.

Examples: Confidentiality Jif, LIO: Information flow tracking through types Ensure information from private fields in data do not flow to public channels. Dually, public data cannot influence integrity. EasyCrypt, FCF: Domain-specific languages & logics for reasoning about cryptographic schemes (e.g., TLS.) Connect cryptographer-level proofs to actual code.

Wrapping it up: Proof-carrying code (PCC) enables trust. Doesn t matter who wrote the code. Can verify with small trusted computing base. Important for scaling software, where components are brought in from 3 rd parties, open source, etc. Certifying compilers help produce PCC: prove properties at the source level. no need to trust compiler or reveal the source. 39

Getting Proofs: Today: Safety policies enforced by in-lined reference monitors. Stop a wide range of common attacks. Tomorrow: New languages let us capture a range of policies: integrity, confidentiality, availability, correctness. New analysis techniques & decision procedures help automate proof construction. 40

Thanks! Questions? Comments?