Rethink Package Components on De-Duplication: From Logical Sharing to Physical Sharing



Similar documents
Effects of Memory Randomization, Sanitization and Page Cache on Memory Deduplication

Effects of Memory Randomization, Sanitization and Page Cache on Memory Deduplication

OS-Circular : A Framework of Internet Client with Xen

Analysis of Disk Access Patterns on File Systems for Content Addressable Storage

HTTP-FUSE PS3 Linux: an internet boot framework with kboot

Virtual Switching Without a Hypervisor for a More Secure Cloud

EXPLORING LINUX KERNEL: THE EASY WAY!

The future is in the management tools. Profoss 22/01/2008

Virtualization System Vulnerability Discovery Framework. Speaker: Qinghao Tang Title:360 Marvel Team Leader

Automated deployment of virtualization-based research models of distributed computer systems

OPERATING SYSTEM - MEMORY MANAGEMENT

Cloud^H^H^H^H^H Virtualization Technology. Andrew Jones May 2011

Virtualization. Explain how today s virtualization movement is actually a reinvention

Virtualization: Know your options on Ubuntu. Nick Barcet. Ubuntu Server Product Manager

Cloud Computing #6 - Virtualization

Microkernels, virtualization, exokernels. Tutorial 1 CSC469

COS 318: Operating Systems. Virtual Machine Monitors

Installation Guide for FTMS and Node Manager 1.6.0

Week Overview. Installing Linux Linux on your Desktop Virtualization Basic Linux system administration

Computing in High- Energy-Physics: How Virtualization meets the Grid

Pete s All Things Sun: Comparing Solaris to RedHat Enterprise and AIX Virtualization Features

FAST 11. Yongseok Oh University of Seoul. Mobile Embedded System Laboratory

nanohub.org An Overview of Virtualization Techniques

Full and Para Virtualization

Virtualization. Jukka K. Nurminen

Windows Server 2012 授 權 說 明

Virtualization Technologies (ENCS 691K Chapter 3)

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR

Technical Investigation of Computational Resource Interdependencies

VMware Server 2.0 Essentials. Virtualization Deployment and Management

ESX System Analyzer Version 1.0 Installation Guide

Memory Resource Management in VMware ESX Server

LiveCD Howto (CentOS 4) - Step by step - 01/12/ Ver 1.1

RED HAT ENTERPRISE VIRTUALIZATION

CPET 581 Cloud Computing: Technologies and Enterprise IT Strategies. Virtualization of Clusters and Data Centers

Building Docker Cloud Services with Virtuozzo

Virtual Machine Monitors. Dr. Marc E. Fiuczynski Research Scholar Princeton University

Virtualization and Other Tricks.

Virtualization with Windows

Are Cache Attacks on Public Clouds Practical?

Practical Experiences With Software Crash Analysis in TV. Wim Decroix/Yves Martens, TPVision

WHITE PAPER. Permabit Albireo Data Optimization Software. Benefits of Albireo for Virtual Servers. January Permabit Technology Corporation

Two-Level Metadata Management for Data Deduplication System

CS5460: Operating Systems

Chapter 2 Addendum (More on Virtualization)

Virtual Machines. Virtualization

Best Practices on monitoring Solaris Global/Local Zones using IBM Tivoli Monitoring

Basics in Energy Information (& Communication) Systems Virtualization / Virtual Machines

An analysis of how static and shared libraries affect memory usage on an IP-STB

IOS110. Virtualization 5/27/2014 1

Virtualization for Cloud Computing

Maximize Your Virtual Environment Investment with EMC Avamar. Rob Emsley Senior Director, Product Marketing

09'Linux Plumbers Conference

I Control Your Code Attack Vectors Through the Eyes of Software-based Fault Isolation. Mathias Payer, ETH Zurich

RED HAT DEVELOPER TOOLSET Build, Run, & Analyze Applications On Multiple Versions of Red Hat Enterprise Linux

Acronis Backup & Recovery 11.5 Quick Start Guide

System Structures. Services Interface Structure

Acronis Backup & Recovery 10 Server for Linux. Update 5. Installation Guide

Before we can talk about virtualization security, we need to delineate the differences between the

CS 377: Operating Systems. Outline. A review of what you ve learned, and how it applies to a real operating system. Lecture 25 - Linux Case Study

Operating Systems Virtualization mechanisms

The Monitis Monitoring Agent ver. 1.2

Monitoring Databases on VMware

The Tor VM Project. Installing the Build Environment & Building Tor VM. Copyright The Tor Project, Inc. Authors: Martin Peck and Kyle Williams

Memory Management Outline. Background Swapping Contiguous Memory Allocation Paging Segmentation Segmented Paging

How To Backup Your Computer With A File Copy Engine

Solution Guide Parallels Virtualization for Linux

Acronis Backup & Recovery 10 Server for Linux. Installation Guide

Using NixOS for declarative deployment and testing

Virtualization. Dr. Yingwu Zhu

Technical Properties. Mobile Operating Systems. Overview Concepts of Mobile. Functions Processes. Lecture 11. Memory Management.

Deploying Business Virtual Appliances on Open Source Cloud Computing

RPM Brotherhood: KVM VIRTUALIZATION TECHNOLOGY

Virtualization. Pradipta De

Virtualization. Types of Interfaces

Multi-level Metadata Management Scheme for Cloud Storage System

Virtualization in Linux KVM + QEMU

PARALLELS CLOUD SERVER

IBM. Kernel Virtual Machine (KVM) Best practices for KVM

Module I-7410 Advanced Linux FS-11 Part1: Virtualization with KVM

Virtual machines and operating systems

Redefining Backup for VMware Environment. Copyright 2009 EMC Corporation. All rights reserved.

CS 695 Topics in Virtualization and Cloud Computing. Introduction

Distributed System Monitoring and Failure Diagnosis using Cooperative Virtual Backdoors

Comparison of Memory Balloon Controllers

Decentralized Deduplication in SAN Cluster File Systems

Distributed and Cloud Computing

PERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE

How To Stop A Malicious Process From Running On A Hypervisor

Using SUSE Cloud to Orchestrate Multiple Hypervisors and Storage at ADP

Transcription:

Rethink Package Components on De-Duplication: From Logical Sharing to Physical Sharing Leaflet is wrong Kuniyashu Kuniyasu Suzaki, Toshiki Yagi, Kengo Iijima, Nguyen Anh Quynh, Cyrille Artho Research Center of Information Security National Institute of Advanced Industrial Science and Technology & Yoshihito Watanebe Alpha Systems Inc.

Contents Rethink Package Components Vulnerability of logical sharing (Dynamic-Link Shared Library and Symbolic Link) Propose replacement of logical sharing by physical sharing Physical sharing Deduplication on Memory and Storage Self-contained binary It is NOT static-link binary. Experimental results Conclusions with discussing topics

Rethink Package Components Current Packages depend on Logical Sharing Logical sharing is OS technique to reduce consumption of memory and storage. Dynamic-Link Shared Library for memory and storage Symbolic Link for storage Unfortunately, they include vulnerability caused by dynamic management. Search Path Replacement Attack GOT (Global Offset Table) overwrite attack Dependency Hell Etc.

Search Path Replacement Attack Dynamic-link searches a shared library at run time using a search path. Search path is defined by environment variables. Example: LD_LIBRARY_PATH It allows us to change shared libraries for each process. Unfortunately, the search path is easily replaced by an attacker and leads to malicious shared libraries. Caller program has no methods to certify libraries. Static-link solves this problem but it wastes memory and storage.

GOT Overwrite Attack ELF format has GOT (Global Offset Table) to locate position-independent function address of shared library. The value of GOT is assigned at run time. GOT is created on Data Segment and vulnerable for overwrite attack. Static link solves this problem but it wastes memory and storage. Program Library Code Segment Call PLT Routine PLT Code Segment Data Segment GOT GOT Data Segment Attack

Dependency Hell (DLL Hell in Windows) Dependency Hell is a management problem of shared libraries. Package manager maintains versions of libraries. However, the version mismatch may occur, when a user updates a library without package manager. Dependency Hell llis escalated dby symbolic-link, because most shared libraries use symbolic-link to manage minor updates. /lib/libc.so.6 -> libc-2.10.1.so # ln s libc-2.11.1.so libc.so.6 Static link solves this problem but it wastes memory and storage.

Solution, and further problems The problems are solved by static-link, but it increases consumption of memory and storage. Fortunately, the increased consumption is mitigated by new technique, deduplication. SLINY[USENIX 05] developed deduplication in Linux kernel. It looks the problems are solved Two trends Current applications assume dynamic-link and are not re-compiled as static-link easily. Current virtualization offers us deduplication. SLINKY uses special Linux kernel. It is not applied on any Linux. Using virtualization, guest OS only has to solve the security problems without regard to physical consumption.

Static-Link is not easy [BIG Problem] Current applications depend on dynamic-link shared libraries for flexibility and avoiding license contamination. Some Licenses conflict and the code can not be mixed. Static link is not created with compile option, because dlopen() which calls shared library explicitly is used in main source codes. glibc, lb lb libx11, openssl, gcc, glib, lb pam Difficult to remove! We tried to re-compile /bin, /sbin, /usr/bin, and /usr/sbin dynamic-linked binaries (1,162) with static-link on Gentoo. 185 (15.9%) binaries are re-compiled with static-link. Binary packages make it difficult to re-compile, because they are not easy to get all source code. Commercial applications make problem more difficult.

Self-Contained Binaries Self-contained binary translator It is developed to bring a binary to another machine. It integrates shared libraries into an ELF binary file. Tools Statifier, Autopacage, Ermine for Linux Advantage Prevent Search Path Replacement Attack and Dependency Hell, because it integrates all libraries. Mitigate GOT Overwrite Attack, because the addresses are prefixed for each execution. Disadvantage Consume more memory and storage than static-link

Min: /usr/bin/qmake (3,426,340 -> 6,094,848 (x1.78) ) 6 shared libraries # ldd /usr/bin/qmake linux-gate.so.1 => (0xb78d5000) libstdc++.so.6 => /usr/lib/gcc/i686-pc-linux-gnu/ 4.3.4/libstdc++.so.6 (0xb77da000) libm.so.6 => /lib/libm.so.6 (0xb77b4000) libgcc_s.so.1 => /usr/lib/gcc/i686-pc-linux-gnu/ 4.3.4/libgcc_s.so.1 (0xb77a6000) libc.so.6 => /lib/libc.so.6 (0xb765e000) /lib/ld-linux.so.2 (0xb78d6000) Max: /usr/bin/gnome-open (5,400 -> 8,732,672 (x1617.16)) 22shared libraries # ldd /usr/bin/gnome-open l inux-gate.so.1 => (0xb772d000) libgnome-2.so.0 => /usr/lib/libgnome-2.so.0 (0xb7709000) libgnomevfs-2.so.0 => /usr/lib/libgnomevfs-2.so.0 (0xb76ae000) libxml2.so.2 => /usr/lib/libxml2.so.2 (0xb7578000) libm.so.6 => /lib/libm.so.6 (0xb7552000) libssl.so.0.9.8 => /usr/lib/libssl.so.0.9.8 (0xb7509000) libcrypto.so.0.9.8 => /usr/lib/libcrypto.so.0.9.8 (0xb73bc000) libz.so.1 => /lib/libz.so.1 (0xb73a8000) libutil.so.1 => /lib/libutil.so.1 (0xb73a4000) libbonobo-2.so.0 => /usr/lib/libbonobo-2.so.0 (0xb7348000) libbonobo-activation.so.4 => /usr/lib/libbonobo-activation.so.4 (0xb7332000) liborbitcosnaming-2 2.so.0 => /usr/lib/liborbitcosnaming-2 2.so.0 (0xb732c000) libgconf-2.so.4 => /usr/lib/libgconf-2.so.4 (0xb72f7000) liborbit-2.so.0 => /usr/lib/liborbit-2.so.0 (0xb72a2000) libgthread-2.0.so.0 => /usr/lib/libgthread-2.0.so.0 (0xb729c000) libpthread.so.0 => /lib/libpthread.so.0 (0xb7283000) librt.so.1 => /lib/librt.so.1 (0xb727a000) libdbus-glib-1.so.2 => /usr/lib/libdbus-glib-1.so.2 (0xb725c000) libnsl.so.1 => /lib/libnsl.so.1 (0xb7244000) libdbus-1.so.3 => /usr/lib/libdbus-1.so.3 (0xb720b000) libgio-2.0.so.0 => /usr/lib/libgio-2.0.so.0 (0xb7178000) libresolv.so.2 => /lib/libresolv.so.2 (0xb7163000) libgobject-2.0.so.0 => /usr/lib/libgobject-2.0.so.0 (0xb7126000) libgmodule-2.0.so.0 => /usr/lib/libgmodule-2.0.so.0 (0xb7120000) libdl.so.2 => /lib/libdl.so.2 (0xb711c000) libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0xb7044000) libpopt.so.0 => /usr/lib/libpopt.so.0 (0xb7039000) libc.so.6 => /lib/libc.so.6 (0xb6ef1000) /lib/ld-linux.so.2 (0xb772e000)

Statifier (1/2) Creation of self-contained binary by Statifier Statifier emulates normal loader and gets allocation information Take snapshot before _dl_start_user() and analyze allocation information of functions of libraries from /proc/pid/maps. The libraries and allocation information are embedded into the binary.

Self-Contained Binary Statifier (2/2) Allocation information and shared libraries are loaded by the starter which is embedded ELF binary by statifier. Includes special libraries: linux-gate.so, ld-linux.so The ELF binary has no INTERP segment to call ld-linux.so ldd command shows no dynamic-link shared libraries Statifier makes a larger binary than static link. Increased memory and storage are mitigated by deduplication.

Deduplication Technique to share same-content chunks at block level (memory and storage). Same-content chunks are shared by indirect link. It is easy to implement when a virtual layer exists to access a block device. Some virtualizations include deduplication mechanism.

Storage Deduplication Used by CAS (Content addressable Storage) data is not addressed by its physical location. Data is addressed by a unique name derived from the content (a secure hash is used as a unique name usually) Same contents are expressed by one original content (same hash) and addressed by indirected link. Plan9 has Venti [USENIX FAST02] Data Domain (EMC) Deduplication [USENIX FAST08] LBCAS (Loopback Content Addressable Storage) [LinuxSymp09] Virtual Disk Indexing CAS Storage Archive Address SHA-1 0000000-0003FFF 4ad36ffe8 0004000-0007FFF 974daf34a 0008000-000BFFF 2d34ff3e1 000C000-000FFFF 974daf34a Deduplication sharing New block is created with new SHA-1

Memory Deduplication Memory deduplication is mainly used for virtual machines. Very effective when same guest OS runs on several virtual machines. On Virtual Machine Monitor Disco[OSDI97] has Transparent Page Sharing VMWare ESX has Content-Based Page Sharing [SOSP02] Xen has Satori[USENIX09] and Differential Engine[OSDI08] On Kernel Linux has KSM (Kernel Samepage Merging) from 2.6.32 [LinuxSymp09] Memory of Process(es) are deduplicated KVM uses this mechanism These targets are virtual machines, but our proposal uses memory deduplication on single OS image, which increase same pages with copy of libraries (self-contained binary). Guest Physical Memory VM1 VM2 VM(n) Real Physical Memory

Evaluation Evaluate the effect of moving form logical sharing to physical sharing. Effect of dynamic link shared library (logical sharing) Gentoo installed on 32GB virtual disk for KVM virtual machine Effect of Statifier (remove logical sharing and increase consumption on memory and storage) Applied on binaries under /bin,/sbin,/usr/bin,/usr/sbin Memory Deduplication KSM (Kernel Samepage merging) of Linux with KVM virtual machine (758MB). Storage Deduplication LBCAS (Loopback Content Addressable Storage)

Effect of Dynamic Link Shared Library 42 normal processes were running at the end of login. Shared libraries overlapped on physical memory (total Virtual memory used by library) / (Physical memory Used by library) = ((B) - (C) )/ ((A) - (C)) = 86,184KB / 13,016KB = 6.62 (A) Consumed physical memory 54,760 KB (B) Summation of consumed memory by each process (Size on /proc/[pid]/smaps) 127,928 KB (C) Summation of process own memory 41,744KB The overlap was 6.62 times. Information obtained by /proc/[pid]smaps PID RSS SHARED 1 672 536 (79%) 792 1676 448 (26%) 1401 508 400 (78%) 1412 712 580 (81%) 1709 544 392 (72%) 1718 408 332 (81%) 1728 628 524 (83%) 1737 764 360 (47%) 1749 1428 1040 (72%) 1750 468 372 (79%) Number of libraries used by 42 processes Usued Name 42 linux-gate.so.1 42 /lib/ld-linux.so.2 42 libc.so.6 32 libdl.so.2 29 libpthread.so.0 27 libpcre.so.3 27 libnsl.so.1 26 libglib-2.0.so.0 25 libgobject-2.0.so.0 24 libm.so.6 Included by all processes

Static Analysis of Statifier Gentoo was customized by statifier. The ELF (1,162) binaries under /bin (82 files), /sbin (74), /usr/bin (912), /usr/sbin (94) were customized by statifier. Original (Dynamic-link) Statifier Increase Total 87,865,480 3,572,936,704 40.66 Average 75,615 3,074,816 40.66 Max (gnome-open) 5,400 8,732,672 1617.16 Min (qmake) 3,426,340 6,094,848 1.78 The disk image (includes non-statifiered files) was expnaded from 3.75GB to 7.08GB (1.88 times).

Effect of Memory Deduplication Memory usage at the end of login Statifier expanded memory consumption from the view of GuestOS, but Deduplication reduced physical memory consumption. 4KB page 80000 34.4% 70000 GuestOS View 60000 50000 40000 30000 20000 10000 GuestOS View 93.0% 2296 481 29929 17.3% 32706 30410 physical memory 86056 45332 4441 25291 8.9% 29732 physical memory Duplicated Deduplicate Unique 0 Normal Gentoo Statifier Gentoo

Trace of memory consumption Normal Gentoo Statifer Gentoo Loopback GuestOS View Physical Mem View GuestOS View Physical Mem View Auto login Auto login LBCAS (256KB) GuestOS View Physical Mem View GuestOS View Physical Mem View Auto login Auto login

Effect of Storage Deduplication Storage usage (static) and total read data at boot (dynamic). Statifier expanded storage consumption from the view of GuestOS on both cases, but Deduplication reduced physical storage consumption in static and dynamic. Smaller chunk is easy to be deduplicated but time overhead is large. Static Dynamic (boot) normal statifier normal statifier On Loopback (Guest OS View) 3,754MB 7,075MB (1.88) 151.7MB 341.0MB (2.25) LBCAS 16KB 268,454 [4195MB] 4352MB [278,499] (1.04) --- ---- LBCAS 64KB 74,679 [4667MB] 83,863 [5241MB] (1.12) 218MB [3,481] 304MB [4,866] (1.40) LBCAS 256KB 22,806 [5701MB] 6723MB [26,892] (1.18) 390MB [1,560 ] 505MB [2,019] (1.29)

Time overhead at boot Statifier reduced the boot time, because it eliminated dynamic reallocation overhead. Deduplication increased the boot time. The overhead of KSM and LBCAS was less than 37%. The overhead is a penalty to remove the vulnerabilities of logical sharing. Without KSM With KSM Normal Statifier Normal Statifier Loopback 95s 84s 95s 105s Reduced LBCAS (256KB) 107s 108s 115s 130s

Conclusion & Discussion Self-Contained binaries strengthen OS security. Prevent Search Path Replacement Attack, GOT (Global Offset Table) overwrite attack, Dependency Hell Easy to apply on normal OS. It does not require source code and re-compile. Increase consumption of memory and storage. Deduplication mitigates the consumption of memory and storage caused by self-contained binary. Encourage moving from Logical sharing to Physical Sharing Deduplication is utilized to increase security on single OS. *Technical detail is written in USENIX HotSec2010 Paper, Moving from Logical Sharing of Guest OS to Physical Sharing of Deduplication on Virtual Machine

Conclusion & Discussion Re-think the packages of Linux distributions Self-Contained binaries is temporary solution. We should rethink to remove logical sharing which cause security problems, because same contents are shared on memory and storage dedeuplcation. Deduplication will be mainly used on IaaS type (multi-tenants) Cloud Computing. *Technical detail is written in USENIX HotSec2010 Paper, Moving from Logical Sharing of Guest OS to Physical Sharing of Deduplication on Virtual Machine