Building and breaking a Python sandbox


Director Organizer @jessicamckellar http://jesstess.com

Why?
- Learning a language
- Providing a hosted scratch pad
- Distributed computation
- Inspecting running processes safely

Examples in the wild
- Seattle's peer-to-peer computing network
- Google App Engine's Python shell
- Codecademy's empythoned
- CheckIO.org's online coding game

Building a sandbox
- Language-level sandboxing (pysandbox)
- OS-level sandboxing (PyPy's sandbox)

Question: How do we execute arbitrary code?

eval: compiles and evaluates expressions

    >>> eval("1 + 2")
    3

exec: compiles and evaluates statements

    >>> exec "print 'Hello world'"
    Hello world
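The slides use Python 2, where exec is a statement. As a rough Python 3 counterpart (an assumption about the reader's interpreter, not part of the talk), the sketch below shows the same split: eval returns the value of one expression, while exec runs statements and returns None. The `namespace` dictionary is a hypothetical scratch space for the executed code.

```python
# Python 3 sketch: exec is a function here, not a statement as in the slides.
value = eval("1 + 2")  # eval handles a single expression and returns its value

namespace = {}  # hypothetical scratch namespace that receives the assignments
result = exec("x = 1 + 2", namespace)  # exec runs statements and returns None

print(value)           # 3
print(namespace["x"])  # 3
print(result)          # None
```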

sandbox.py

    class Sandbox(object):
        def execute(self, code_string):
            exec code_string

test_sandbox.py

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    print "Hello world!"
    """
    s.execute(code)

    $ python test_sandbox.py
    Hello world!

What should we disallow?
- Resource exhaustion
- Information disclosure
- Running unexpected services
- Disabling/quitting/erroring out of the sandbox

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    file("test.txt", "w").write("kaboom!\\n")
    """
    s.execute(code)
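To see why the unrestricted sandbox is a problem, here is a minimal Python 3 sketch of the same experiment. It assumes `open` in place of the Python 2 `file` builtin and writes into a temporary directory so nothing is left behind. The names `scratch_dir` and `target` are illustrative only.

```python
import os
import tempfile


class Sandbox:
    """Python 3 version of the slide's naive sandbox: it runs anything."""

    def execute(self, code_string):
        exec(code_string, {})


# Work in a throwaway directory so the demonstration leaves nothing behind.
with tempfile.TemporaryDirectory() as scratch_dir:
    target = os.path.join(scratch_dir, "test.txt")
    # %r embeds the path as a quoted literal inside the untrusted source.
    code = 'with open(%r, "w") as handle:\n    handle.write("kaboom!\\n")\n' % target
    Sandbox().execute(code)
    with open(target) as handle:
        written = handle.read()

print(repr(written))
```

The untrusted string reached the filesystem without the sandbox objecting, which is the behavior the rest of the talk tries to prevent.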

    >>> __builtins__.__dict__.keys()
    ['bytearray', 'IndexError', 'all', 'help', 'vars', 'SyntaxError',
     'unicode', 'UnicodeDecodeError', 'memoryview', 'isinstance',
     'copyright', 'NameError', 'BytesWarning', 'dict', 'input', 'oct',
     'bin', 'SystemExit', 'StandardError', 'format', 'repr', 'sorted',
     'False', 'RuntimeWarning', 'list', 'iter', 'reload', 'Warning',
     '__package__', 'round', 'dir', 'cmp', 'set', 'bytes', 'reduce',
     'intern', 'issubclass', 'Ellipsis', 'EOFError', 'locals',
     'BufferError', 'slice', 'FloatingPointError', 'sum', 'getattr',
     'abs', 'exit', 'print', 'True', 'FutureWarning', 'ImportWarning',
     'None', 'hash', 'ReferenceError', 'len', 'credits', 'frozenset',
     '__name__', 'ord', 'super', '_', 'TypeError', 'license',
     'KeyboardInterrupt', 'UserWarning', 'filter', 'range',
     'staticmethod', 'SystemError', 'BaseException', 'pow',
     'RuntimeError', 'float', 'MemoryError', 'StopIteration', 'globals',
     'divmod', 'enumerate', 'apply', 'LookupError', 'open', 'quit',
     'basestring', 'UnicodeError', 'zip', 'hex', 'long', 'next',
     'ImportError', 'chr', 'xrange', 'type', '__doc__', 'Exception',
     'tuple', 'UnicodeTranslateError', 'reversed', 'UnicodeEncodeError',
     'IOError', 'hasattr', 'delattr', 'setattr', 'raw_input',
     'SyntaxWarning', 'compile', 'ArithmeticError', 'str', 'property',
     'GeneratorExit', 'int', '__import__', 'KeyError', 'coerce',
     'PendingDeprecationWarning', 'file', 'EnvironmentError', 'unichr',
     'id', 'OSError', 'DeprecationWarning', 'min', 'UnicodeWarning',
     'execfile', 'any', 'complex', 'bool', 'ValueError',
     'NotImplemented', 'map', 'buffer', 'max', 'object', 'TabError',
     'callable', 'ZeroDivisionError', 'eval', '__debug__',
     'IndentationError', 'AssertionError', 'classmethod',
     'UnboundLocalError', 'NotImplementedError', 'AttributeError',
     'OverflowError']
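For readers on Python 3, the same inventory can be taken from the `builtins` module. The sketch below is an assumption-laden equivalent of the Python 2 listing, not the talk's code. It picks out a handful of names that a sandbox would want to think about. The `risky` selection is illustrative, not exhaustive.

```python
import builtins

# Every name a program can reach without importing anything.
names = sorted(dir(builtins))

# A few entries that make a naive sandbox dangerous (illustrative selection).
candidates = ("open", "eval", "exec", "compile", "__import__")
risky = [name for name in candidates if name in names]

print(len(names), "builtins; risky ones present:", risky)
```

Python 3 no longer has the `file` builtin that the slides target. `open` plays that role instead.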

How do we disallow execution of problematic builtins?

Idea: keyword blacklist

    class Sandbox(object):
        def execute(self, code_string):
            keyword_blacklist = ["file", "open", "eval", "exec"]
            for keyword in keyword_blacklist:
                if keyword in code_string:
                    raise ValueError("Blacklisted")
            exec code_string

Testing: keyword blacklist

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    file("test.txt", "w").write("kaboom!\\n")
    """
    s.execute(code)

    $ python test_sandbox.py
    Traceback (most recent call last):
      File "test_sandbox.py", line 11, in <module>
        s.execute(code)
      File "/Users/jesstess/Desktop/sandbox/sandbox.py", line 86, in execute
        raise ValueError("Blacklisted")
    ValueError: Blacklisted
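A Python 3 rendering of the keyword blacklist might look like the sketch below. The helper `is_blocked` is hypothetical and exists only to make the outcome easy to inspect. Note that a substring check also rejects harmless code such as a variable named `profile`, because it contains "file".

```python
class Sandbox:
    """Python 3 sketch of the slide's keyword blacklist (substring match only)."""

    KEYWORD_BLACKLIST = ("file", "open", "eval", "exec")

    def execute(self, code_string):
        for keyword in self.KEYWORD_BLACKLIST:
            if keyword in code_string:
                raise ValueError("Blacklisted")
        exec(code_string, {})


def is_blocked(code_string):
    """Hypothetical helper: True if the sandbox raises ValueError for the code."""
    try:
        Sandbox().execute(code_string)
    except ValueError:
        return True
    return False


blocked_direct = is_blocked('open("test.txt", "w")')  # literal use of a banned name
blocked_innocent = is_blocked('profile = "x"')        # false positive on "file"
allowed_plain = not is_blocked("x = 1")               # nothing suspicious
```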

How can we get around a keyword blacklist?

Circumvention idea: encryption

    func = __builtins__["file"]
    func("test.txt", "w").write("kaboom!\n")

    func = __builtins__["svyr".decode("rot13")]
    func("test.txt", "w").write("kaboom!\n")

Testing: keyword blacklist

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    func = __builtins__["svyr".decode("rot13")]
    func("test.txt", "w").write("kaboom!\\n")
    """
    s.execute(code)

Kaboom
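The same trick carries over to Python 3 with small changes: strings have no `.decode`, so this sketch assumes `codecs.decode` for ROT13 and targets `open` (ROT13 "bcra") instead of `file`. It only fetches the reference and checks it, rather than writing a file.

```python
import builtins
import codecs

KEYWORD_BLACKLIST = ("file", "open", "eval", "exec")

# Untrusted code: the forbidden name never appears literally in the source.
code = '''
import builtins, codecs
name = codecs.decode("bcra", "rot13")
func = getattr(builtins, name)
'''

# The substring check from the slides finds nothing to object to...
passes_blacklist = not any(keyword in code for keyword in KEYWORD_BLACKLIST)

# ...yet the code ends up holding a reference to the real builtin.
namespace = {}
exec(code, namespace)
got_open = namespace["func"] is builtins.open
```

This is the point of the observation that follows: the blacklist inspects spelling, not references.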

Observation: if I can get a reference to something bad, I can invoke it.

How can we remove all references to problematic builtins?

Idea: builtins whitelist

    builtins_whitelist = set((
        # exceptions
        'ArithmeticError', 'AssertionError', 'AttributeError', ...
        # constants
        'False', 'None', 'True', ...
        # types
        'basestring', 'bytearray', 'bytes', 'complex', 'dict', ...
        # functions
        '__import__', 'abs', 'all', 'any', 'apply', 'bin', 'bool', ...
        # block: eval, execfile, file, quit, exit, reload, etc.
    ))

    import sys

    main = sys.modules["__main__"].__dict__
    orig_builtins = main["__builtins__"].__dict__

    builtins_whitelist = set((...))

    for builtin in orig_builtins.keys():
        if builtin not in builtins_whitelist:
            del orig_builtins[builtin]

Testing: builtins whitelist

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    file("test.txt", "w").write("kaboom!\\n")
    """
    s.execute(code)

    $ python test_sandbox.py
    Traceback (most recent call last):
      File "test_sandbox.py", line 9, in <module>
        s.execute(code)
      ...
      File "<string>", line 2, in <module>
    NameError: name 'file' is not defined
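The slides delete names from the interpreter's real builtins dictionary. A gentler way to try the idea in Python 3 is to pass a restricted `__builtins__` mapping to `exec`. That is a different technique from the talk's, used here only because it does not damage the running interpreter. The whitelist contents and the `run_restricted` helper are hypothetical, and this sketch is not a secure sandbox on its own.

```python
import builtins

# Hypothetical, deliberately tiny whitelist for illustration.
BUILTINS_WHITELIST = ("abs", "len", "range", "sum", "print", "NameError")

safe_builtins = {name: getattr(builtins, name) for name in BUILTINS_WHITELIST}


def run_restricted(code_string):
    """Execute code with only the whitelisted builtins visible."""
    namespace = {"__builtins__": safe_builtins}
    exec(code_string, namespace)
    return namespace


# Whitelisted names work as usual.
allowed = run_restricted("total = sum(range(4))")["total"]

# Anything else is simply not defined inside the executed code.
try:
    run_restricted('open("test.txt", "w")')
    outcome = "ran"
except NameError as exc:
    outcome = str(exc)
```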

Circumvention idea: import something dangerous

Testing: builtins whitelist

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    import os
    fd = os.open("test.txt", os.O_CREAT | os.O_WRONLY)
    os.write(fd, "Kaboom!\\n")
    """
    s.execute(code)

Kaboom

How do we disallow problematic imports?

Idea: import whitelist

Idea: import whitelist

How does importing a module work in Python?

    >>> importer = __builtins__.__dict__.get("__import__")
    >>> os = importer("os")
    >>> os
    <module 'os' from '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/os.pyc'>
    >>> os.getcwd()
    '/Users/jesstess/Desktop/sandbox'

Idea: import whitelist

What is the expected function signature for the importer?

    >>> help(__builtins__.__dict__["__import__"])
    __import__(...)
        __import__(name, globals={}, locals={}, fromlist=[], level=-1) -> module

Idea: import whitelist

Cool, let's write our own importer.

    >>> def my_importer(module_name, globals={},
    ...                 locals={}, fromlist=[],
    ...                 level=-1):
    ...     print "Using my importer!"
    ...     return __import__(module_name, globals,
    ...                       locals, fromlist, level)
    ...
    >>> os = my_importer("os")
    Using my importer!
    >>> os.getcwd()
    '/Users/jesstess/Desktop/sandbox'

    def _safe_import(__import__, module_whitelist):
        def safe_import(module_name, globals={}, locals={},
                        fromlist=[], level=-1):
            if module_name in module_whitelist:
                return __import__(module_name,
                                  globals, locals, fromlist, level)
            else:
                raise ImportError(
                    "Blocked import of %s" % (module_name,))
        return safe_import

    import sys

    main = sys.modules["__main__"].__dict__
    orig_builtins = main["__builtins__"].__dict__

    for builtin in orig_builtins.keys():
        if builtin not in builtins_whitelist:
            del orig_builtins[builtin]

    safe_modules = ["string", "re"]
    orig_builtins["__import__"] = _safe_import(__import__, safe_modules)

Testing: import whitelist

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    import os
    fd = os.open("test.txt", os.O_CREAT | os.O_WRONLY)
    os.write(fd, "Kaboom!\\n")
    """
    s.execute(code)

    $ python test_sandbox.py
    Traceback (most recent call last):
      File "test_sandbox.py", line 11, in <module>
      ...
        raise ImportError("Blocked import of %s" % (module_name,))
    ImportError: Blocked import of os
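A Python 3 sketch of the import whitelist follows. It mirrors the slide's wrapper but assumes Python 3's default of `level=0`, and it installs the wrapper through a private `__builtins__` mapping passed to `exec` instead of patching the real builtins. The names `make_safe_import` and `run_restricted` are hypothetical.

```python
import builtins


def make_safe_import(real_import, module_whitelist):
    """Wrap the real importer so only whitelisted modules can be loaded."""
    def safe_import(name, globals=None, locals=None, fromlist=(), level=0):
        if name in module_whitelist:
            return real_import(name, globals, locals, fromlist, level)
        raise ImportError("Blocked import of %s" % (name,))
    return safe_import


# The import statement looks up __import__ in the frame's builtins.
safe_builtins = {
    "__import__": make_safe_import(builtins.__import__, {"string", "re"}),
}


def run_restricted(code_string):
    """Execute code whose only builtin is the filtered importer."""
    namespace = {"__builtins__": safe_builtins}
    exec(code_string, namespace)
    return namespace


re_module = run_restricted("import re")["re"]  # on the whitelist

try:
    run_restricted("import os")                # not on the whitelist
    outcome = "ran"
except ImportError as exc:
    outcome = str(exc)
```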

Circumvention idea: modifying builtins

Idea: make builtins read-only

How can we make an object read-only in Python?

    class ReadOnlyBuiltins(dict):
        def __delitem__(self, key):
            raise ValueError("Read-only!")
        def pop(self, key, default=None):
            raise ValueError("Read-only!")
        def popitem(self):
            raise ValueError("Read-only!")
        ...
        def setdefault(self, key, value):
            raise ValueError("Read-only!")
        def __setitem__(self, key, value):
            raise ValueError("Read-only!")
        def update(self, dict, **kw):
            raise ValueError("Read-only!")

    main = sys.modules["__main__"].__dict__
    orig_builtins = main["__builtins__"].__dict__

    for builtin in orig_builtins.keys():
        if builtin not in builtins_whitelist:
            del orig_builtins[builtin]

    safe_modules = ["string", "re"]
    orig_builtins["__import__"] = _safe_import(__import__, safe_modules)

    safe_builtins = ReadOnlyBuiltins(orig_builtins)
    main["__builtins__"] = safe_builtins
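As a rough illustration of the read-only idea in Python 3, the sketch below uses a dict subclass whose mutating methods all raise. The class name `ReadOnlyDict` and the `attempt` helper are hypothetical. The last line shows one limitation: calling `dict.__setitem__` directly skips the Python-level overrides, so this is a speed bump rather than a guarantee.

```python
class ReadOnlyDict(dict):
    """dict subclass whose mutating methods all raise ValueError."""

    def _blocked(self, *args, **kwargs):
        raise ValueError("Read-only!")

    __setitem__ = __delitem__ = _blocked
    clear = pop = popitem = setdefault = update = _blocked
    __ior__ = _blocked  # Python 3.9+: |= would otherwise mutate in place


frozen = ReadOnlyDict({"len": len})


def attempt(action):
    """Hypothetical helper: report whether a mutation was stopped."""
    try:
        action()
    except ValueError:
        return "blocked"
    return "allowed"


results = [
    attempt(lambda: frozen.update(open=open)),
    attempt(lambda: frozen.pop("len")),
    attempt(lambda: frozen.__setitem__("open", open)),
]

# Going through the base class bypasses the overrides entirely.
bypassed = attempt(lambda: dict.__setitem__(frozen, "open", open))
```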

Observation redux: if I can get a reference to something bad, I can invoke it.

Circumvention idea: exploiting the inheritance hierarchy

What can we find out about an object's base classes?

    >>> dir([])
    ['__add__', '__class__', '__contains__', '__delattr__',
     '__delitem__', '__delslice__', '__doc__', '__eq__',
     '__format__', '__ge__', ...]
    >>> [].__class__
    <type 'list'>
    >>> [].__class__.__bases__
    (<type 'object'>,)
    >>> [].__class__.__bases__[0]
    <type 'object'>

list subclasses object.

What can we find out about an object's subclasses?

    >>> [].__class__.__subclasses__()
    []
    >>> int.__subclasses__()
    [<type 'bool'>]
    >>> basestring.__subclasses__()
    [<type 'str'>, <type 'unicode'>]

The last result lists the subclasses of basestring.

    >>> [].__class__.__bases__
    (<type 'object'>,)
    >>> [].__class__.__bases__[0]
    <type 'object'>
    >>> [].__class__.__bases__[0].__subclasses__()
    [<type 'type'>, <type 'weakref'>, <type 'weakcallableproxy'>,
     <type 'weakproxy'>, <type 'int'>, <type 'basestring'>,
     <type 'bytearray'>, <type 'list'>, <type 'NoneType'>,
     <type 'NotImplementedType'>, <type 'traceback'>, <type 'super'>,
     <type 'xrange'>, <type 'dict'>, <type 'set'>, <type 'slice'>,
     <type 'staticmethod'>, <type 'complex'>, <type 'float'>,
     <type 'buffer'>, <type 'long'>, <type 'frozenset'>,
     <type 'property'>, <type 'memoryview'>, <type 'tuple'>,
     <type 'enumerate'>, <type 'reversed'>, <type 'code'>,
     <type 'frame'>, <type 'builtin_function_or_method'>,
     <type 'instancemethod'>, <type 'function'>, <type 'classobj'>,
     <type 'dictproxy'>, <type 'generator'>, <type 'getset_descriptor'>,
     <type 'wrapper_descriptor'>, <type 'instance'>, <type 'ellipsis'>,
     <type 'member_descriptor'>, <type 'file'>, <type 'PyCapsule'>,
     <type 'cell'>, <type 'callable-iterator'>, <type 'iterator'>,
     <type 'sys.long_info'>, <type 'sys.float_info'>,
     <type 'EncodingMap'>, <type 'fieldnameiterator'>,
     <type 'formatteriterator'>, <type 'sys.version_info'>,
     <type 'sys.flags'>, <type 'exceptions.BaseException'>,
     <type 'module'>, <type 'imp.NullImporter'>,
     <type 'zipimport.zipimporter'>, <type 'posix.stat_result'>,
     <type 'posix.statvfs_result'>, <class 'warnings.WarningMessage'>,
     <class 'warnings.catch_warnings'>,
     <class '_weakrefset._IterationGuard'>,
     <class '_weakrefset.WeakSet'>, <class '_abcoll.Hashable'>,
     <type 'classmethod'>, <class '_abcoll.Iterable'>,
     <class '_abcoll.Sized'>, <class '_abcoll.Container'>,
     <class '_abcoll.Callable'>, <class 'site._Printer'>,
     <class 'site._Helper'>, <type '_sre.SRE_Pattern'>,
     <type '_sre.SRE_Match'>, <type '_sre.SRE_Scanner'>,
     <class 'site.Quitter'>, <class 'codecs.IncrementalEncoder'>,
     <class 'codecs.IncrementalDecoder'>, <class 'string.Template'>,
     <class 'string.Formatter'>, <type 'operator.itemgetter'>,
     <type 'operator.attrgetter'>, <type 'operator.methodcaller'>,
     <type 'collections.deque'>, <type 'deque_iterator'>,
     <type 'deque_reverse_iterator'>, <type 'itertools.combinations'>,
     <type 'itertools.combinations_with_replacement'>,
     <type 'itertools.cycle'>, <type 'itertools.dropwhile'>,
     <type 'itertools.takewhile'>, <type 'itertools.islice'>,
     <type 'itertools.starmap'>, <type 'itertools.imap'>,
     <type 'itertools.chain'>, <type 'itertools.compress'>,
     <type 'itertools.ifilter'>, <type 'itertools.ifilterfalse'>,
     <type 'itertools.count'>, <type 'itertools.izip'>,
     <type 'itertools.izip_longest'>, <type 'itertools.permutations'>,
     <type 'itertools.product'>, <type 'itertools.repeat'>,
     <type 'itertools.groupby'>, <type 'itertools.tee_dataobject'>,
     <type 'itertools.tee'>, <type 'itertools._grouper'>,
     <type '_thread._localdummy'>, <type 'thread._local'>,
     <type 'thread.lock'>, <class 'sandbox.Protection'>,
     <type 'resource.struct_rusage'>,
     <class 'sandbox.config.SandboxConfig'>,
     <class 'sandbox.proxy.ReadOnlySequence'>,
     <class 'sandbox.sandbox_class.Sandbox'>,
     <class 'sandbox.restorable_dict.RestorableDict'>]

All of the subclasses of object!

    >>> [].__class__.__bases__
    (<type 'object'>,)
    >>> [].__class__.__bases__[0]
    <type 'object'>
    >>> obj_class = [].__class__.__bases__[0]
    >>> for c in obj_class.__subclasses__():
    ...     print c.__name__
    ...
    wrapper_descriptor
    instance
    ellipsis
    member_descriptor
    file
    PyCapsule
    cell
    callable-iterator
    iterator
    ...

Testing: read-only builtins

    from sandbox import Sandbox
    s = Sandbox()
    code = """
    obj_class = [].__class__.__bases__[0]
    obj_subclasses = dict((elt.__name__, elt) for \
                          elt in obj_class.__subclasses__())
    func = obj_subclasses["file"]
    func("text.txt", "w").write("kaboom!\\n")
    """
    s.execute(code)

Kaboom
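The walk up to `object` and back down needs no builtins at all, which is why the previous defenses do not stop it. The Python 3 sketch below runs the walk with an empty `__builtins__` mapping. Python 3 has no `file` type, so it only checks that familiar classes turn up in the result. The variable names are illustrative.

```python
# Untrusted code: only a list literal and attribute access are used.
code = (
    "root = [].__class__.__bases__[0]\n"
    "found = root.__subclasses__()\n"
)

# Run it with no builtins available at all.
namespace = {"__builtins__": {}}
exec(code, namespace)

root = namespace["root"]
found = namespace["found"]

print(root)        # <class 'object'>
print(len(found))  # number of classes reachable this way (varies by interpreter)
```

Every class the interpreter has loaded that derives directly from `object` is now reachable from the sandboxed code, whatever the builtins mapping contains.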

Idea: don't expose dangerous implementation details

Let's delete __bases__ and __subclasses__

    >>> type.__bases__
    (<type 'object'>,)
    >>> del type.__bases__
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can't set attributes of builtin/extension type 'type'

Imposed by the underlying C implementation!

Let's delete __bases__ and __subclasses__

cpython.py

    from ctypes import pythonapi, POINTER, py_object

    _get_dict = pythonapi._PyObject_GetDictPtr
    _get_dict.restype = POINTER(py_object)
    _get_dict.argtypes = [py_object]
    del pythonapi, POINTER, py_object

    def dictionary_of(ob):
        dptr = _get_dict(ob)
        return dptr.contents.value

    from cpython import dictionary_of

    main = sys.modules["__main__"].__dict__
    ...
    safe_builtins = ReadOnlyBuiltins(orig_builtins)
    main["__builtins__"] = safe_builtins

    type_dict = dictionary_of(type)
    del type_dict["__bases__"]
    del type_dict["__subclasses__"]

Circumvention idea: would a function by any other name smell as sweet?

    >>> def foo():
    ...     print "Meow"
    ...
    >>> dir(foo)
    ['__call__', '__class__', '__closure__', '__code__',
     '__defaults__', '__delattr__', '__dict__', '__doc__',
     '__format__', '__get__', '__getattribute__', '__globals__',
     '__hash__', '__init__', '__module__', '__name__', '__new__',
     '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
     '__sizeof__', '__str__', '__subclasshook__', 'func_closure',
     'func_code', 'func_defaults', 'func_dict', 'func_doc',
     'func_globals', 'func_name']

    >>> foo.func_code
    <code object foo at 0x100509d30, file "<stdin>", line 1>
    >>> dir(foo.func_code)
    ['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__',
     '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__',
     '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__',
     '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
     '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars',
     'co_code', 'co_consts', 'co_filename', 'co_firstlineno',
     'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names',
     'co_nlocals', 'co_stacksize', 'co_varnames']
    >>> foo.func_code.co_code
    'd\x01\x00GHd\x00\x00S'

    >>> def foo():
    ...     print "Meow"
    ...
    >>> def evil_function():
    ...     print "Kaboom!"
    ...
    >>> foo()
    Meow
    >>> foo.__setattr__("func_code", evil_function.func_code)
    >>> foo()
    Kaboom!

Kaboom
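Python 3 renames `func_code` to `__code__`, but the swap works the same way. The sketch below is a minimal Python 3 version of the slide, returning strings instead of printing so the effect is easy to check.

```python
def foo():
    return "Meow"


def evil_function():
    return "Kaboom!"


before = foo()

# Python 3 spells the slide's func_code attribute as __code__.
foo.__code__ = evil_function.__code__

after = foo()

print(before)  # Meow
print(after)   # Kaboom!
```

The function object and its name are unchanged, yet calling it now runs the other function's bytecode.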

Idea redux: don't expose dangerous implementation details

Delete func_code

    from cpython import dictionary_of
    from types import FunctionType
    ...
    type_dict = dictionary_of(type)
    del type_dict["__bases__"]
    del type_dict["__subclasses__"]

    function_dict = dictionary_of(FunctionType)
    del function_dict["func_code"]

Whew. Let's recap tactics:
- Keyword blacklist
- Builtins whitelist
- Import whitelist
- Making important objects read-only (builtins)
- Deleting problematic implementation details (__bases__, __subclasses__, func_code)
- Deleting the ability to construct arbitrary code objects

We have run out of tricks! We've implemented 80% of a full-fledged Python sandbox.

    # builtins whitelist
    builtins_whitelist = set((
        # exceptions
        'ArithmeticError', 'AssertionError', 'AttributeError',
        'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError',
        'EnvironmentError', 'Exception', 'FloatingPointError',
        'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError',
        'ImportWarning', 'IndentationError', 'IndexError', 'KeyError',
        'LookupError', 'MemoryError', 'NameError', 'NotImplemented',
        'NotImplementedError', 'OSError', 'OverflowError',
        'PendingDeprecationWarning', 'ReferenceError', 'RuntimeError',
        'RuntimeWarning', 'StandardError', 'StopIteration', 'SyntaxError',
        'SyntaxWarning', 'SystemError', 'TabError', 'TypeError',
        'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError',
        'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning',
        # constants
        'False', 'None', 'True', '__doc__', '__name__', '__package__',
        'copyright', 'license', 'credits',
        # types
        'basestring', 'bytearray', 'bytes', 'complex', 'dict', 'float',
        'frozenset', 'int', 'list', 'long', 'object', 'set', 'str',
        'tuple', 'unicode',
        # functions
        '__import__', 'abs', 'all', 'any', 'apply', 'bin', 'bool',
        'buffer', 'callable', 'chr', 'classmethod', 'cmp', 'coerce',
        'compile', 'delattr', 'dir', 'divmod', 'enumerate', 'filter',
        'format', 'getattr', 'globals', 'hasattr', 'hash', 'hex', 'id',
        'isinstance', 'issubclass', 'iter', 'len', 'locals', 'map', 'max',
        'min', 'next', 'oct', 'ord', 'pow', 'print', 'property', 'range',
        'reduce', 'repr', 'reversed', 'round', 'setattr', 'slice',
        'sorted', 'staticmethod', 'sum', 'super', 'type', 'unichr',
        'vars', 'xrange', 'zip',
    ))

    # import whitelist
    def _safe_import(__import__, module_whitelist):
        def safe_import(module_name, globals={}, locals={},
                        fromlist=[], level=-1):
            if module_name in module_whitelist:
                return __import__(module_name, globals, locals,
                                  fromlist, level)
            else:
                raise ImportError("Blocked import of %s" % (module_name,))
        return safe_import

    # read-only builtins
    class ReadOnlyBuiltins(dict):
        def clear(self):
            raise ValueError("Read-only!")
        def __delitem__(self, key):
            raise ValueError("Read-only!")
        def pop(self, key, default=None):
            raise ValueError("Read-only!")
        def popitem(self):
            raise ValueError("Read-only!")
        def setdefault(self, key, value):
            raise ValueError("Read-only!")
        def __setitem__(self, key, value):
            raise ValueError("Read-only!")
        def update(self, dict, **kw):
            raise ValueError("Read-only!")

    class Sandbox(object):
        def __init__(self):
            import sys
            from types import FunctionType
            from cpython import dictionary_of

            original_builtins = sys.modules["__main__"].__dict__["__builtins__"].__dict__

            for builtin in original_builtins.keys():
                if builtin not in builtins_whitelist:
                    del sys.modules["__main__"].__dict__["__builtins__"].__dict__[builtin]

            original_builtins["__import__"] = _safe_import(__import__, ["string", "re"])
            safe_builtins = ReadOnlyBuiltins(original_builtins)
            sys.modules["__main__"].__dict__["__builtins__"] = safe_builtins

            # deleting __bases__, __subclasses__, and func_code
            type_dict = dictionary_of(type)
            del type_dict["__bases__"]
            del type_dict["__subclasses__"]

            function_dict = dictionary_of(FunctionType)
            del function_dict["func_code"]

        def execute(self, code_string):
            exec code_string

Building a sandbox
- Language-level sandboxing (pysandbox)
- OS-level sandboxing (PyPy's sandbox)

What should we disallow?
- Resource exhaustion
- Information disclosure
- Running unexpected services
- Disabling/quitting/erroring out of the sandbox

Food for thought

Is this level of reflectiveness good or bad?

Do other languages have these sandboxing concerns?

If you were designing a new language, how would you do this?

Experiments
- How does an alternative Python implementation like PyPy handle these issues?
- How does the CPython interpreter compile and run bytecode?
- What does the Python stack look like?
- How do ctypes work?
- How can the operating system help provide a secure environment?

Bedtime reading
- The full pysandbox implementation: https://github.com/haypo/pysandbox/
- A retrospective on pysandbox's challenges: https://lwn.net/Articles/574215/
- PyPy's sandbox implementation: http://pypy.readthedocs.org/en/latest/sandbox.html
- How PythonAnywhere's sandbox works: http://blog.pythonanywhere.com/83/

Thank you! Let's talk! O'Reilly booth, 3pm