Securing Web Applications with Information Flow Tracking

with Michael Martin, Benjamin Livshits, John Whaley, Michael Carbin, Dzin Avots, Chris Unkel

Web Application Vulnerabilities
- 50% of databases had a security breach [Computer Crime & Security Survey, 2002]
- 92% of Web applications are vulnerable [Application Defense Center, 2004]
- 48% of all vulnerabilities, Q3-Q4 2004 [Symantec, May 2005]

Top Ten Security Flaws in Web Applications [OWASP]
1. Unvalidated Input
2. Broken Access Control
3. Broken Authentication and Session Management
4. Cross Site Scripting (XSS) Flaws
5. Buffer Overflows
6. Injection Flaws
7. Improper Error Handling
8. Insecure Storage
9. Denial of Service
10. Insecure Configuration Management

Web Applications
[Diagram: Hacker, Browser, Web App, Database. Evil input goes in; confidential information leaks out.]

SQL Injection Errors
[Diagram: Hacker, Browser, Web App, Database. Requests: "Give me Bob's credit card #", "Delete all records".]

Happy-go-lucky SQL Query
User supplies: name, password
Java program:

    String query = "SELECT UserID, Creditcard FROM CCRec WHERE Name = "
                 + name + " AND PW = " + password;

Fun with SQL
"--": the rest are comments in Oracle SQL
SELECT UserID, CreditCard FROM CCRec WHERE:

    Name = bob AND PW = foo
    Name = bob-- AND PW = x
    Name = bob or 1=1-- AND PW = x
    Name = bob; DROP CCRec-- AND PW = x

Vulnerabilities in Web Applications
Inject:
- Parameters
- Hidden fields
- Headers
- Cookie poisoning
× Exploit:
- SQL injection
- Cross-site scripting
- HTTP splitting
- Path traversal
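
The "Fun with SQL" inputs can be replayed against the concatenated query. This is a minimal sketch, assuming the injected names end with Oracle's `--` comment marker; the class `QueryDemo` and the method `buildQuery` are hypothetical names, and only the query string comes from the slides.

```java
// Hypothetical demo class; only the query text is taken from the slides.
class QueryDemo {
    // Builds the query the way the slide's Java program does: plain concatenation.
    static String buildQuery(String name, String password) {
        return "SELECT UserID, Creditcard FROM CCRec WHERE Name = " + name
             + " AND PW = " + password;
    }

    public static void main(String[] args) {
        // Each row is {name, password} as a user might type them.
        String[][] inputs = {
            {"bob", "foo"},
            {"bob--", "x"},
            {"bob or 1=1--", "x"},
            {"bob; DROP CCRec--", "x"},
        };
        for (String[] in : inputs) {
            // Everything after "--" is a comment, so the password check is ignored.
            System.out.println(buildQuery(in[0], in[1]));
        }
    }
}
```

No validation happens between the user's input and the query text, so the user controls the structure of the statement, not just a value in it.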

Key: Information Flow

A Simple SQL Injection Pattern

    o = req.getParameter( );
    stmt.executeQuery( o );

In Practice

ParameterParser.java:586
String session.ParameterParser.getRawParameter(String name)

    public String getRawParameter(String name)
            throws ParameterNotFoundException {
        // Source: the value comes straight from the HTTP request.
        String[] values = request.getParameterValues(name);
        if (values == null) {
            throw new ParameterNotFoundException(name + " not found");
        } else if (values[0].length() == 0) {
            throw new ParameterNotFoundException(name + " was empty");
        }
        return (values[0]);
    }

ParameterParser.java:570
String session.ParameterParser.getRawParameter(String name, String def)

    public String getRawParameter(String name, String def) {
        try {
            return getRawParameter(name);
        } catch (Exception e) {
            return def;
        }
    }

In Practice (II)

ChallengeScreen.java:194
Element lessons.ChallengeScreen.doStage2(WebSession s)

    String user = s.getParser().getRawParameter( USER, "" );
    StringBuffer tmp = new StringBuffer();
    tmp.append("select cc_type, cc_number from user_data WHERE userid = '");
    tmp.append(user);    // unchecked user input becomes part of the query
    tmp.append("'");
    query = tmp.toString();
    Vector v = new Vector();
    try {
        // Sink: the query built from user input reaches the database.
        ResultSet results = statement3.executeQuery( query );
        ...

PQL: Program Query Language

    o = req.getParameter( );
    stmt.executeQuery( o );

- Query on the dynamic behavior, based on object entities
- Abstracts away information flow

Dynamic vs. Static Pattern
Dynamically:

    o = req.getParameter( );
    stmt.executeQuery( o );

Statically:

    p1 = req.getParameter( );
    stmt.executeQuery( p2 );

Do p1 and p2 point to the same object? This requires pointer alias analysis.
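
The dynamic reading of the pattern can be sketched as a runtime monitor that matches on object identity. This is an illustrative sketch only, not the PQL implementation: `FlowMonitor` and its two methods are hypothetical stand-ins for `req.getParameter` and `stmt.executeQuery`.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical runtime monitor; not how PQL itself is implemented.
class FlowMonitor {
    // Identity-based set: membership means "this exact object", not "an equal string".
    private final Set<Object> fromUser =
        Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    // Stand-in for req.getParameter(...): remembers the object it returns.
    String getParameter(String rawValue) {
        String o = new String(rawValue); // fresh object, so identity is meaningful
        fromUser.add(o);
        return o;
    }

    // Stand-in for stmt.executeQuery(o): true if the pattern matched on this object.
    boolean executeQuery(String query) {
        return fromUser.contains(query);
    }

    public static void main(String[] args) {
        FlowMonitor m = new FlowMonitor();
        String p1 = m.getParameter("bob");
        String p2 = p1;                                           // alias of the same object
        System.out.println(m.executeQuery(p2));                   // true
        System.out.println(m.executeQuery(new String("bob")));    // false: equal, but another object
    }
}
```

At run time the question "do p1 and p2 point to the same object?" is answered by the object itself. A static checker has no such object to inspect, which is why it needs pointer alias analysis.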

Flow-Insensitive Pointer Analysis
Objects allocated by the same line of code are given the same name.

    Program                      Datalog
    o1: p = new Object();        pts(p, o1)
    o2: q = new Object();        pts(q, o2)
    p.f = q;                     hpts(o1, f, o2)
    r = p.f;                     pts(r, o2)

[Diagram: p points to o1; q and r point to o2; field f of o1 points to o2.]

Inference Rule in Datalog
Assignments:

    pts(v1, h1) :- "v1 = v2;" & pts(v2, h1).

[Diagram: v2 points to h1, so v1 points to h1.]

Inference Rule in Datalog
Stores:

    hpts(h1, f, h2) :- "v1.f = v2;" & pts(v1, h1) & pts(v2, h2).

[Diagram: v1 points to h1, v2 points to h2, so field f of h1 points to h2.]

Inference Rule in Datalog
Loads:

    pts(v2, h2) :- "v2 = v1.f;" & pts(v1, h1) & hpts(h1, f, h2).

[Diagram: v1 points to h1, field f of h1 points to h2, so v2 points to h2.]

Pointer Analysis Rules

    pts(v, h)       :- "h: T v = new T();".
    pts(v1, h1)     :- "v1 = v2;" & pts(v2, h1).
    hpts(h1, f, h2) :- "v1.f = v2;" & pts(v1, h1) & pts(v2, h2).
    pts(v2, h2)     :- "v2 = v1.f;" & pts(v1, h1) & hpts(h1, f, h2).

Pointer Alias Analysis
- Specified by a few Datalog rules
  - Creation sites
  - Assignments
  - Stores
  - Loads
- Apply rules until they converge
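
"Apply rules until they converge" can be sketched as a naive fixpoint loop over the four rules. This is an illustration under stated assumptions: the class `PointsTo` and its field names are hypothetical, and the loop is a plain set-based evaluation, not the BDD-based evaluation that bddbddb performs. The input is the four-statement program from the flow-insensitive slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical naive solver for the four pointer-analysis rules.
class PointsTo {
    // Input relations, one entry per program statement.
    final List<String[]> news = new ArrayList<>();    // {v, h}:      h: v = new T();
    final List<String[]> assigns = new ArrayList<>(); // {v1, v2}:    v1 = v2;
    final List<String[]> stores = new ArrayList<>();  // {v1, f, v2}: v1.f = v2;
    final List<String[]> loads = new ArrayList<>();   // {v2, v1, f}: v2 = v1.f;

    // Derived relations: pts(v, h) and hpts(h1, f, h2).
    final Set<List<String>> pts = new HashSet<>();
    final Set<List<String>> hpts = new HashSet<>();

    void solve() {
        // Creation sites seed the pts relation.
        for (String[] n : news) pts.add(Arrays.asList(n[0], n[1]));
        boolean changed = true;
        while (changed) { // repeat until no rule derives a new fact
            changed = false;
            // Snapshots let us add facts while iterating over the old ones.
            List<List<String>> p = new ArrayList<>(pts);
            List<List<String>> hp = new ArrayList<>(hpts);
            for (String[] a : assigns)          // assignments
                for (List<String> f : p)
                    if (f.get(0).equals(a[1]))
                        changed |= pts.add(Arrays.asList(a[0], f.get(1)));
            for (String[] s : stores)           // stores
                for (List<String> f1 : p)
                    if (f1.get(0).equals(s[0]))
                        for (List<String> f2 : p)
                            if (f2.get(0).equals(s[2]))
                                changed |= hpts.add(Arrays.asList(f1.get(1), s[1], f2.get(1)));
            for (String[] l : loads)            // loads
                for (List<String> f1 : p)
                    if (f1.get(0).equals(l[1]))
                        for (List<String> h : hp)
                            if (h.get(0).equals(f1.get(1)) && h.get(1).equals(l[2]))
                                changed |= pts.add(Arrays.asList(l[0], h.get(2)));
        }
    }

    boolean pointsTo(String v, String h) {
        return pts.contains(Arrays.asList(v, h));
    }

    // The example program: o1: p = new; o2: q = new; p.f = q; r = p.f;
    static PointsTo example() {
        PointsTo a = new PointsTo();
        a.news.add(new String[] {"p", "o1"});
        a.news.add(new String[] {"q", "o2"});
        a.stores.add(new String[] {"p", "f", "q"});
        a.loads.add(new String[] {"r", "p", "f"});
        a.solve();
        return a;
    }
}
```

On the example, the store rule first derives hpts(o1, f, o2), and the next pass of the load rule derives pts(r, o2), matching the facts listed on the slide.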

Context-Sensitive Pointer Analysis

    L1: a = malloc();
        a = id(a);
    L2: b = malloc();
        b = id(b);

    id(x) { return x; }

[Diagram: context-sensitive vs. context-insensitive views of id(x), relating a, b, and x to the allocation sites L1 and L2.]

Even without recursion, the number of contexts is exponential!
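
The difference between the two views of id(x) can be sketched with a small copy-propagation solver. This is a hypothetical illustration: `ContextDemo`, the variable names `x@1` and `x@2`, and the modeling of a call as two copies (argument in, return value out) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical solver used only to compare the two analyses of id(x).
class ContextDemo {
    final Map<String, Set<String>> pts = new HashMap<>();
    final List<String[]> copies = new ArrayList<>(); // {dst, src}: dst = src

    void alloc(String v, String site) {
        pts.computeIfAbsent(v, k -> new TreeSet<>()).add(site);
    }

    void copy(String dst, String src) {
        copies.add(new String[] {dst, src});
    }

    // Propagates points-to sets along copies until nothing changes.
    Set<String> solve(String query) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] c : copies) {
                Set<String> src = pts.getOrDefault(c[1], new TreeSet<>());
                changed |= pts.computeIfAbsent(c[0], k -> new TreeSet<>()).addAll(src);
            }
        }
        return pts.getOrDefault(query, new TreeSet<>());
    }

    // Context-insensitive: one shared x for both calls to id.
    static ContextDemo insensitive() {
        ContextDemo d = new ContextDemo();
        d.alloc("a", "L1"); d.copy("x", "a"); d.copy("a", "x"); // a = id(a)
        d.alloc("b", "L2"); d.copy("x", "b"); d.copy("b", "x"); // b = id(b)
        return d;
    }

    // Context-sensitive by cloning: one copy of x per call site.
    static ContextDemo cloned() {
        ContextDemo d = new ContextDemo();
        d.alloc("a", "L1"); d.copy("x@1", "a"); d.copy("a", "x@1");
        d.alloc("b", "L2"); d.copy("x@2", "b"); d.copy("b", "x@2");
        return d;
    }

    public static void main(String[] args) {
        System.out.println(insensitive().solve("a")); // [L1, L2]
        System.out.println(cloned().solve("a"));      // [L1]
    }
}
```

With a shared x, both allocation sites flow into id and back out to both callers, so a appears to alias b. Cloning id per call site keeps them apart, at the cost of one copy per context.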

Top 20 SourceForge Java Apps
[Chart: log-scale axis from 10^0 to 10^16.]

Costs of Context Sensitivity
- A typical large program has ~10^14 paths
- If you need 1 byte to represent a context:
  - 100 terabytes of storage
  - More than 12 times the size of the Library of Congress
  - Memory: $1.2 million
  - Hard drive: $47,500
  - Time to read sequentially: 20 days
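
The growth rate and the storage figure can be checked with a few lines of arithmetic. This is a sketch under a made-up assumption: a layered, recursion-free call graph in which each method calls the next layer from a fixed number of call sites. `PathCount` and both method names are hypothetical, and real call graphs are not this regular.

```java
import java.math.BigInteger;

// Hypothetical call-graph shape, chosen only to show the growth rate.
class PathCount {
    // Layered call graph without recursion: the method in layer i calls the
    // method in layer i+1 from `callSites` distinct call sites.
    // Returns the number of distinct call paths (contexts) reaching the last layer.
    static BigInteger contexts(int layers, int callSites) {
        BigInteger paths = BigInteger.ONE; // one way to reach the entry method
        for (int i = 1; i < layers; i++) {
            paths = paths.multiply(BigInteger.valueOf(callSites));
        }
        return paths;
    }

    // Storage in decimal terabytes (10^12 bytes) at one byte per context.
    static BigInteger terabytes(BigInteger contexts) {
        return contexts.divide(BigInteger.TEN.pow(12));
    }

    public static void main(String[] args) {
        BigInteger c = contexts(48, 2);   // 2^47 paths
        System.out.println(c);            // 140737488355328, about 1.4 * 10^14
        System.out.println(terabytes(BigInteger.TEN.pow(14))); // 100
    }
}
```

Under this assumption, 48 layers with two call sites each already exceed 10^14 contexts, and 10^14 contexts at one byte apiece come to the 100 terabytes quoted on the slide.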

Cloning-Based Algorithm
Whaley & Lam, PLDI 2004 (best paper award)
- Create a clone for every context
- Apply the context-insensitive algorithm to the cloned call graph
- Lots of redundancy in the result
- Exploit the redundancy by clever use of BDDs (binary decision diagrams)

Automatic Analysis Generation
[Diagram, top to bottom:]
- PQL
- Datalog: pointer analysis in 10 lines
- bddbddb (BDD-based deductive database) with Active Machine Learning
- BDD operations: 1000s of lines, 1 year of tuning
- BDD: 10,000s-lines library

Benchmarks
- 9 large, widely used applications
- Blogging/bulletin board applications
- Used at a variety of sites
- Open-source Java J2EE apps
- Available from SourceForge.net

Vulnerabilities Found

    Source      SQL injection  HTTP splitting  Cross-site scripting  Path traversal  Total
    Header            0              6                  5                   0          11
    Parameter         6              5                  0                   2          13
    Cookie            1              0                  0                   0           1
    Non-Web           2              0                  0                   3           5
    Total             9             11                  5                   5          30