Digital forensic implications of ZFS
Nicole Lang Beebe*, Sonia D. Stacy, Dane Stuckey
Dept. of Information Systems & Technology Management, The University of Texas at San Antonio, One UTSA Circle, San Antonio, TX 78249, USA

Keywords: ZFS; file system; forensics; data recovery; copy on write

Abstract

ZFS is a relatively new, open source file system designed and developed by Sun Microsystems.1 The stated intent was to develop "a new kind of file system that provides simple administration, transactional semantics, end-to-end data integrity, and immense scalability" (OpenSolaris community). Its functionality, architecture, and disk layout take a relatively radical departure from many commonly used file systems (e.g. FAT, NTFS, EXT2/3, UFS, HFS+, etc.). Since file systems play a very important role in how and where data are stored, as well as the likelihood of their retrieval during digital forensic investigations, it is important that forensics researchers and practitioners understand ZFS and its forensic implications. That is the goal of this article. We first provide the reader with a primer of sorts about ZFS, which lays the foundation for our discussion of ZFS forensics. We then present the results of our analysis of ZFS functionality, architecture, and disk layout, identifying and discussing several digital forensic artifacts and challenges unique to ZFS. © 2009 Digital Forensic Research Workshop. Published by Elsevier Ltd. All rights reserved.

1. Introduction

Large scale storage, information security, and ease of administration are key operational and business enablers in many organizations today. Organizations are ever increasing the amount of data they store, warehouse, retrieve, and mine. As is often the case in computing, IT solutions providers are hurriedly adapting current technologies to meet new storage, security, and administrative needs. Perhaps one of the best examples of adapting current technologies to meet new needs is in the context of managing storage arrays and the advent of logical volume managers (LVMs). Storage arrays, in their size, configuration, and management, quickly outgrew the capabilities of many existing file systems; hence the advent of LVMs. The problem is that LVMs can be difficult to manage; the addition or removal of block devices in the array is not as simple as inserting or removing physical drives. So, while current technologies arguably do scale to meet current storage needs, they are not trivial to administer as storage needs and capacity frequently change. Furthermore, they have a finite capacity limit. Given the exponential growth in data stores in recent years, we contend that even 64 bit file systems will be a limiting factor someday. Another growing concern is the issue of information security, particularly data integrity, of critical data stores. Organizations remain vulnerable to silent data corruption. We have relatively robust mechanisms in place to ensure data integrity of network-based data (e.g. message digests, message authentication codes in SSL, etc.), but we do not have robust, widespread means to scrub on-disk data. We have hashing, but that is certainly not self-healing.

* Corresponding author. Address: [email protected] (N.L. Beebe).
1 As of the writing of this article, Sun Microsystems publicly acknowledged a plan for Oracle to acquire Sun. ZFS was released as open source software under a CDDL license. It is unknown what impact the not yet finalized acquisition will have on ZFS.
We have RAID implementations that will rebuild a disk, should one fail, but typical RAID implementations do not protect against silent data corruption and write-hole vulnerabilities. They do not calculate and verify checksums at an object (i.e. file) level, and self-heal when errors are detected. Such mechanisms are needed to validate the integrity of data stored on-disk, as well as when data is being read/written, which would detect and self-correct errors. Today, such errors remain largely undetected and uncorrected, except via user detection, which is often too late to correct the errors. For these reasons, and others, Sun Microsystems designed and developed ZFS (OpenSolaris community). ZFS is "a new kind of file system that provides simple administration, transactional semantics, end-to-end data integrity, and immense scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management" (OpenSolaris community). Digital forensics researchers and practitioners know all too well that a different file system often necessitates fundamentally different approaches to search, retrieval/extraction, and analysis. As Carrier points out, a thorough understanding of the file system is critical to one's knowledge of where digital artifacts will be found during an investigation (Carrier, 2005). If ZFS represents a paradigm shift in file system design from commonly used file systems (e.g. FAT, NTFS, EXT2/3, UFS, HFS+, etc.), significant digital forensic implications follow.

ZFS is currently the native file system for OpenSolaris and Solaris 10. Kernel-level ports have been developed for FreeBSD, NetBSD, and Apple's OS X 10.5 Leopard (read-only). User-level ports have been developed for Linux and Apple's OS X 10.5 Leopard via FUSE. A kernel-level port is currently under development for Apple's OS X 10.6 Snow Leopard Server. These implementations and ports have been publicly announced, but there are also rumors of ports to desktop Snow Leopard and Linux via a kernel-level port provided by Sun. Furthermore, we contend that even if ZFS's prominence in the file system market does not show marked increase in years to come, next-generation file systems will tackle modern storage array, security, and administration issues in similar ways. The purpose of this article is to introduce the digital forensics research and practitioner community to ZFS and the digital forensic implications thereof. The remainder of the article is structured as follows. First, we familiarize the reader with ZFS in general: its functionality, its architecture, and its disk layout. Second, and more importantly, we discuss significant digital forensic implications of this relatively new and different file system over commonly used file systems. Finally, we conclude with a discussion of limitations, future research needed, contributions, and a few concluding remarks.

2. Background

This section will introduce the reader to ZFS and show how it fundamentally diverges from commonly used file systems. It is a primer of sorts, which is intended to serve the digital forensics community from an educational standpoint, but does not claim a knowledge contribution in the traditional research sense. Such background will help readers understand the subsequent digital forensic implication discussions, the primary contribution of this article. The background should also serve the community by lessening their burden in understanding the architecture of ZFS.
We aim to coalesce what we have learned from ZFS documentation, conference presentations and proceedings, source code reviews, ZFS developer blog posts, direct observation, and other related sources into an instructive introduction to ZFS for the digital forensics community.

2.1. ZFS functionality

ZFS integrates traditional file system and logical volume manager functionality and uses a pooled storage model, facilitating a highly dynamic file system that supports flexible and large storage arrays. Block devices can be added to the storage array, into what ZFS calls zpools. ZFS facilitates the integration automatically and autonomously. The devices are immediately added to the pool and immediately available for use/storage, and all of this occurs transparently to the user. ZFS eliminates the need for managing and resizing volumes. Storage arrays, their constituent zpools, and their constituent datasets (i.e. file systems, snapshots, clones, and volumes) can be exceptionally large, as ZFS is the first 128 bit file system. Also, it is endian adaptive,2 making it architecture independent. Data can be created by any architecture and read by any architecture, and the ordering can be intermixed within a zpool. ZFS is designed to be impervious to silent data corruption, because of its extensive use of checksumming. A checksum3 is calculated and stored for all ZFS objects (e.g. content and metadata). The checksum is stored in the data's metadata and verified incident to any and all data transactions. ZFS then uses the checksums to self-heal whenever errors are detected, thereby facilitating automatic, ongoing live data scrubbing, as well as on-demand data scrubbing. To ensure the validity of the on-disk state, ZFS implements a copy-on-write (COW) transactional object model. When block modifications are prescribed, the modified block(s) are written to newly allocated blocks. When the write transactions complete successfully, metadata are updated, also using the COW model. Following the transactional object model, synchronous operations are grouped and tracked in ZFS intent logs (ZILs). Should part of a transaction complete successfully and another part fail, all completed transactions are rolled back to ensure transaction synchronicity requirements are achieved. The model also improves I/O performance by batching write transactions and committing them every few (5-10) seconds, or earlier in the event of a forced sync. Both the COW model and the transactional object model ensure that unexpected events, such as a system failure due to power loss, do not result in data corruption.

2 Endian ordering of data structures is a function of the architecture used to write them. A flag is set within objects' block pointers to indicate the endian ordering, which ZFS then uses to read the data in the appropriate order. The manipulation required is entirely transparent to the user.
3 Supported checksums include the SHA-256 (default), fletcher2, and fletcher4 algorithms.
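To make the checksum mechanism concrete, the following is a minimal sketch of the fletcher4 running-sum computation, mirroring the structure used in ZFS's zfs_fletcher.c. The sample buffer, and the idea of comparing the result against the checksum recorded in the parent block pointer, are our illustration of the concept, not ZFS code.

#include <stdint.h>
#include <stdio.h>

/* Sketch of the fletcher4 running-sum checksum over a buffer of
 * 32-bit words. In ZFS the result is stored in the PARENT block
 * pointer, so every read down the block tree verifies the child
 * block it is about to use. */
typedef struct { uint64_t a, b, c, d; } cksum_t;

static cksum_t fletcher4(const uint32_t *buf, size_t nwords)
{
    cksum_t ck = { 0, 0, 0, 0 };
    for (size_t i = 0; i < nwords; i++) {
        ck.a += buf[i];   /* running sum of the data words        */
        ck.b += ck.a;     /* running sums of the running sums ... */
        ck.c += ck.b;
        ck.d += ck.c;
    }
    return ck;
}

int main(void)
{
    uint32_t block[128] = { 0xdeadbeef, 42 };  /* stand-in for an FSB */
    cksum_t ck = fletcher4(block, 128);
    /* Self-healing: if ck mismatches the checksum recorded in the
     * parent block pointer, ZFS reads a ditto copy and repairs. */
    printf("%016llx %016llx %016llx %016llx\n",
           (unsigned long long)ck.a, (unsigned long long)ck.b,
           (unsigned long long)ck.c, (unsigned long long)ck.d);
    return 0;
}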
ZFS also facilitates system resiliency and data redundancy through its native support of snapshots, clones, and ditto blocks (multiple, automatic, on-disk copies of metadata and/or content category data). A snapshot is a read-only, point-in-time version of a file system. A clone is a writeable, fully operational file system, created from a snapshot. Ditto blocks are allocated copies of file metadata and potentially file content, and will be discussed in greater detail later due to their digital forensic implications. Compression (LZJB) is built in and implemented by default in ZFS for metadata. Content category data is not compressed by default, but can be compressed. Its implementation gains return on the processing (CPU load) investment by significantly lessening disk I/O time.

2.2. ZFS architecture and disk layout4

4 Primary resources for this section are (Anonymous, 2006; Bruning, 2008a,b,c).

Like most file systems, ZFS starts with a superblock: a block of data reserved for key file system category data, often statically located at the beginning of a physical disk. ZFS calls this the uberblock (big-endian magic number 0x00bab10c; note its phonetics). The uberblock is a 1 KB data structure (element) contained within a 128 KB uberblock array. Even the uberblock follows the COW model, in the sense that updates to the active uberblock are accomplished by writing to a different uberblock in the array than that which contains the active uberblock, and then updating the transaction group (TXG) number. The uberblock with the highest TXG and valid checksum is deemed the active uberblock. The uberblock array starts at offset 128 KB within the label. A vdev is a virtual device; it may be a physical or logical device. Vdev types include: disk, file, mirror, clone, RAID-Z (similar to RAID-5), replacing, and root. (See Anonymous, 2006 for further explanation of these types.) A label is a 256 KB data structure located in quadruplicate on each vdev (two consecutive copies at the beginning of the vdev and two consecutive copies at the end of the vdev). The label contains information that facilitates access to zpool contents and verifies the zpool's integrity and availability. See Fig. 1 for illustrations of the above concepts. This is the full extent of statically located data structures in ZFS.
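Because the labels and the uberblock array are the only statically located structures, locating the active uberblock is the natural first step for a ZFS forensic tool. Below is a hedged sketch of that step; the field offsets (magic, version, and TXG as the first three 8-byte words of each 1 KB slot) follow our reading of the on-disk specification, and checksum validation and endianness handling are omitted for brevity.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UB_MAGIC      0x00bab10cULL  /* uberblock magic number      */
#define UB_SIZE       1024           /* one uberblock array element */
#define UB_COUNT      128            /* 128 KB array / 1 KB slots   */
#define UB_ARRAY_OFF  (128 * 1024)   /* array offset within a label */

/* Given one 256 KB label read from disk, return the label-relative
 * offset of the slot with the highest TXG and a valid magic: the
 * active uberblock. */
static long find_active_uberblock(const uint8_t *label)
{
    uint64_t best_txg = 0;
    long best_off = -1;
    for (int i = 0; i < UB_COUNT; i++) {
        const uint8_t *slot = label + UB_ARRAY_OFF + (long)i * UB_SIZE;
        uint64_t magic, txg;
        memcpy(&magic, slot, 8);     /* ub_magic: first 8-byte word */
        memcpy(&txg, slot + 16, 8);  /* ub_txg: third 8-byte word   */
        if (magic == UB_MAGIC && txg >= best_txg) {
            best_txg = txg;
            best_off = UB_ARRAY_OFF + (long)i * UB_SIZE;
        }
    }
    return best_off;  /* -1 if no valid uberblock was found */
}

int main(void)
{
    static uint8_t label[256 * 1024];  /* would be read from the vdev */
    printf("active uberblock at label offset: %ld\n",
           find_active_uberblock(label));
    return 0;
}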
To put this point into perspective, we compare ZFS to EXT2/3. EXT2/3 divides the file system into sections, called block groups, and subsequently stores backup superblocks, group descriptor tables, block bitmaps, inode bitmaps, and inode tables at specific locations within each block group. File content is stored in equally sized logical allocation units within the block group. Finally, EXT2/3 intentionally co-locates file metadata and file content to minimize disk latency and seek time. ZFS behaves oppositely by design. The analogous data listed above may be stored anywhere on a top-level vdev, which includes intentionally spreading it across multiple physical/leaf vdevs to reduce the risk of catastrophic (non-recoverable) data loss. While metaslabs subdivide vdevs in a manner similar to block groups, the corresponding data stored within the metaslabs is not stored in specific locations, as with EXT2/3 block groups. Unlike many commonly used file systems, including NTFS, EXT2/3, and HFS+, ZFS does not store file content in equally sized logical allocation units (what ZFS calls file system blocks, or FSBs). It uses neither block-based allocation, nor extent-based allocation in the traditional sense. Block-based allocation means files are allocated space at the block (i.e. sector) level, as is the case with UFS. When a file grows, more blocks are allocated to meet the need. Extent-based allocation usually means a file system's data storage space is broken up into equally sized, contiguous groups of blocks upon file system creation, and files are allocated space at an extent level. (Note: extents are also referred to as clusters, logical allocation units, and data units.) When a file grows, more extents (groups of blocks) are allocated to meet the need. While the extent size can be variably determined upon file system creation, all extents are usually equally sized, and the extent size does not vary after install, nor between files. In contrast, and conceptually similar to JFS5 (Eckstein, 2004), ZFS dynamically and variably sizes its extents according to the needs of individual files. ZFS right-sizes extents up to the maximum FSB record size set for the file system (default = 128 KB). So, allocated extents are not equally sized in a single ZFS file system. Files smaller than the FSB record size will be allocated to extents that are smaller than the FSB record size. Files larger than the FSB record size will be allocated to multiple extents equal in size to the FSB record size. This functionality means that ZFS maintains sector-level allocation awareness. This is accomplished via per-metaslab allocation/free log files called space maps. Space maps are loaded into memory and converted to space-efficient, offset-sorted AVL trees of free space. Since space maps merely track free space and provide no other data location information, ZFS relies heavily on block pointers, direct and indirect, similar to EXT2/3. In fact, everything after the uberblock is located via block pointers. Unlike other file systems, ZFS does not use statically located inode tables, or file system category files, to store data location information (e.g. NTFS's Master File Table $MFT). ZFS supports six levels of indirection, and several instances of such indirection can exist as one traverses from the uberblock, to the meta-object set, to the dnode array for the file system objects (directories, files, etc.), and to the file content itself. This traversal is particularly important to digital forensic investigators, as it relates to how data is stored, located, and retrieved.

On-disk data walk

To provide the big picture of how content and metadata are stored, located, and retrieved, the following discussion walks through how to locate a target file starting from the uberblock. First, however, we must discuss a few key data structures. All, or nearly all, objects in ZFS are described by dnodes. As everything is a file in NTFS, everything is an object in ZFS, thus the list of object types is long. See Anonymous (2006) for a full listing, but a few example object types include: files, directory listings, space maps, intent logs, attribute lists, access control lists, and dnodes.

5 JFS and ZFS implement what Eckstein (2004) calls variable-sized allocation units in very different ways.

[Fig. 1. Vdev, label, and uberblock layouts (figure adapted from ZFS On-Disk Specification illustrations 2, 3 & 7). A vdev comprises Label 0, Label 1, reserved space (boot block), allocatable storage space, and Labels 2 and 3 at the end of the vdev. Each 256 KB label comprises blank space (0-8 KB), a reserved boot header (8-16 KB), name/value pairs describing vdev relationships (16-128 KB), and the uberblock array beginning at offset 128 KB.]
Dnodes (conceptually similar to inodes in UFS) are 512B data structures that contain, among other things, block pointers (data structure: blkptr_t). Block pointers store information about what the object is, where it is, and how big it is. Dnodes are stored on-disk, usually in dnode arrays, but in-memory copies exist as objects are accessed. Dnode arrays are often fragmented (stored in non-contiguous space on-disk) (Bruning, 2008c). Objects of similar type are grouped in ZFS and called object sets. Whereas individual objects are described by dnodes (dnode_phys_t data structures), object sets are described by metadnodes (objset_phys_t data structures; 1 KB in size). Objset_phys_t data structures contain: 1) a dnode that often points to a dnode array for that object set, 2) a ZIL header, and 3) the object set type. One object set is of particular importance: the meta-object set (MOS). The MOS is the superset of all objects in the zpool. With an understanding of these key concepts and data structures, we can move forward with the on-disk data walk discussion. (See Fig. 2 for an illustrative view of the data walk.) At the zpool level, the active uberblock's block pointer points to the meta-object set's (MOS) metadnode. Its dnode's block pointer points to the MOS dnode array. The second element (index = 1) of the MOS dnode array points to the Object Directory ZAP object (ZFS Attribute Processor; an object that contains name-value pairs). One name-value pair within the Object Directory ZAP is the root_dataset pair, whose value is the object ID (index number) within the MOS dnode array for the DSL (Dataset and Snapshot Layer) directory. The DSL directory provides a mapping of object IDs (indices) of dnodes within the MOS dnode array for the various DSL datasets (e.g. a specific file system within the zpool). Respective DSL dataset dnodes then point to the metadnodes for the DSL datasets, which in turn point to each dataset's dnode array. At the dataset level (e.g. a single file system), each dnode within the dataset's dnode array pertains to a specific file system object (e.g. files and directories). The block pointers within directory dnodes point to ZAP objects, which provide the object IDs for the directory's child objects (i.e. files and subdirectories in the case of a directory object). The block pointers within file dnodes point to file content (or block pointer arrays in the case of indirection).
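The full traversal, summarized as code, looks roughly like the skeleton below. read_objset_metadnode(), dnode_at(), and zap_lookup() are hypothetical helpers a forensic tool would have to implement (they are not ZFS APIs), and indirection handling is elided at every step.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers a forensic tool would supply; these are NOT
 * ZFS APIs. Stubbed here so the skeleton compiles and runs. */
typedef struct dnode { int unused; } dnode_t;
static dnode_t g_stub;
static dnode_t *read_objset_metadnode(uint64_t bp) { (void)bp; return &g_stub; }
static dnode_t *dnode_at(dnode_t *meta, uint64_t objid) { (void)meta; (void)objid; return &g_stub; }
static uint64_t zap_lookup(dnode_t *zap, const char *name) { (void)zap; (void)name; return 0; }

static void data_walk(uint64_t uberblock_rootbp, const char *fname)
{
    /* 1. Active uberblock's block pointer -> MOS metadnode. */
    dnode_t *mos = read_objset_metadnode(uberblock_rootbp);

    /* 2. MOS dnode array, always index 1 -> Object Directory ZAP. */
    dnode_t *objdir = dnode_at(mos, 1);

    /* 3. "root_dataset" pair -> DSL directory object ID in the MOS. */
    uint64_t dsl_dir = zap_lookup(objdir, "root_dataset");

    /* 4. DSL directory -> DSL dataset dnode -> dataset metadnode.
     *    Indirection (block pointer arrays) can occur at any step. */
    dnode_t *ds = read_objset_metadnode(dsl_dir);

    /* 5. Master node ZAP (an object in the dataset's dnode array; a
     *    fixed index is assumed here) -> root directory object ID. */
    uint64_t rootdir = zap_lookup(dnode_at(ds, 1), "ROOT");

    /* 6. Directory ZAP objects map names to object IDs; repeat per
     *    path component until the target file's dnode is reached. */
    uint64_t fileobj = zap_lookup(dnode_at(ds, rootdir), fname);

    /* 7. The file dnode's block pointers -> file content, possibly
     *    through up to six levels of block pointer arrays. */
    printf("object ID of %s: %llu\n", fname, (unsigned long long)fileobj);
}

int main(void) { data_walk(0, "evidence.txt"); return 0; }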
Files, directories and their metadata

The discussion above centers on file and directory traversal, walking through the steps and data structures involved in ultimately locating file content. Some ambiguity may remain, however, with regard to data typically of forensic interest, such as filenames, directory structures, and file metadata. Filenames and directory names are stored in ZAP objects (name-value pair data structures). The dnode within the file system's dataset dnode array that pertains to the file system's root directory points to the root directory ZAP object. Its constituent name-value pairs consist of the named directories and files within the root directory and their associated object IDs (dataset dnode array indices). Each file and directory has a dataset dnode, which points either to another directory ZAP object, or to the file content, for directories and files respectively. File and directory metadata is stored in its respective dnode within variably sized bonus buffer space (referred to as a znode when storing such metadata). This metadata includes MAC date/time stamps, creation transaction group number (relevant for chronologically ordering different versions of data), read/write/execute permissions (e.g. 755), file type (e.g. regular, symbolic link, socket, etc.), file size, object ID of the parent directory, number of hard links, owner ID, group ID, and access control lists. See Anonymous (2006) for specific data structure information.

3. Forensic implications

Now that readers are familiar with ZFS, its functions, its architecture, and its on-disk layout, we can shift our discussion to its digital forensics implications, the primary focus of this article. The discussion will outline several important artifacts left behind by normal ZFS operation that may be of investigative interest and recoverable during analysis. We also discuss several analytical challenges ZFS introduces.

Artifact: copies of data

Sun's design goals of protecting against data corruption, live data scrubbing, instantaneous snapshots and clones, fast native backup and restore, and file system survivability resulted in several features that create and leave behind copies of metadata and content. Such data is forensically recoverable.
[Fig. 2. On-disk data walk (figure adapted from Bruning, 2008a). At the zpool level, the active uberblock leads to the meta-object set (MOS) 1 KB metadnode (containing a dnode, the object set type, and a ZIL header) and thence to the MOS dnode array, whose Object Directory ZAP object (always at index 1) contains the root_dataset, config, and sync_bplist name-value pairs. The root_dataset pair identifies the DSL directory, which identifies each dataset's object ID (corresponding to its respective MOS dnode array index). At the dataset level (e.g. a single file system), the dataset's metadnode leads to the dataset dnode array, containing the master node ZAP object (Root, Version, Delete_Queue pairs), the root directory ZAP object, subdirectory ZAP objects (each mapping file and subdirectory names to object IDs), and file dnodes, which point, via block pointers or block pointer arrays (indirection can occur at any time), to allocated extents of file content.]

These features include the COW transactional object model, ditto blocks, snapshotting, and cloning.

Copies in unallocated space

Several unallocated copies of metadata and content will likely exist, because of the COW model. COW is designed to prevent data corruption by not overwriting in-place data. Instead, when data are to be modified, the modified blocks are written elsewhere on disk. When the modified blocks are written correctly, associated metadata are updated, also via the COW model. Because disk I/O is such a critical performance issue, ZFS (like most file systems) does not securely delete the outdated blocks, and the data remains in unallocated space until overwritten. The blocks are, of course, discoverable and recoverable. The impact of this is that forensic examiners will likely find numerous copies of metadata and content throughout the zpool in unallocated space. COW is implemented at the FSB level. If one sector within a file is modified, the respective FSB for that file will be copied on write. If the file consists of several FSBs, it appears that ZFS trades future increased seek time associated with possible file fragmentation for reduced disk IOPs associated with the COW transaction. The implication of FSB-level COW is that the duplicate metadata and content will likely be whole files (if smaller than the FSB record size), or significant chunks of files (if larger than the FSB record size). Additional unallocated copies of metadata and content may exist due to the transactional object model ZFS uses and its resultant ZFS intent log (ZIL) objects. "The ZFS intent log (ZIL) saves transaction records of system calls that change the file system in memory with enough information to be able to replay them. There is one ZIL per file system." (Anonymous, 2006, p. 51) (Note: the same text remains in the current zil.c source code (OpenSolaris release), located at cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/zil.c.) Transactions are recorded in ZIL log records in memory. When the ZFS Data Management Unit commits a transaction group, the data are written to permanent storage (the writes are committed, e.g. new content is written to disk for a file and metadata are updated), and the respective ZIL records are discarded. When files and processes require synchronicity (e.g. an fsync or O_DSYNC call), the ZIL records are flushed to the stable, on-disk ZIL, to facilitate replay and roll-back, and then the data are written to permanent storage.
In either case, ZIL records must be committed in order of TXG number. If
a commit/sync is forced for a higher TXG than the lowest one in the ZIL, then all previous records will be committed, in order, before the requested TXG is committed. Research is needed to verify what happens to the data contained in log blocks and log records representing successful commits. It seems reasonable to assume, based on how ZFS uses TXG numbering to maintain state awareness, that committed records and blocks in the stable, on-disk log will remain until overwritten. An important question is whether ZFS allocates new ZIL blocks, or reuses old ones. The issue of the ZIL, both the in-memory and on-disk versions, requires further research. They have forensic implications for live response, collection, and acquisition, as well as for dead/static device analysis. With respect to live response, in-memory ZIL records might provide useful information regarding very recent processes and user actions. With respect to collection and acquisition, pulling the plug will represent a power failure, and the stable ZIL blocks/records may help investigators paint a more accurate picture of the current state of the system upon seizure. Finally, depending on how ZFS allocates new, or reuses old, ZIL blocks, there may be numerous unallocated ZIL blocks that can provide a much longer chronology of key file and process actions.

Copies in allocated space

In addition to the multiple copies of content and metadata category data that likely exist in unallocated space due to the COW transactional object model, multiple allocated copies of metadata and/or content will likely exist via ZFS's ditto block functionality. For redundancy purposes, ZFS replicates metadata one, two, or three times, depending on the criticality of the metadata. ZFS also permits automatic copies of user data blocks upon request (administrative tuning of the file system). The number of ditto blocks created for a given object is determined by the number of Data Virtual Addresses (DVAs) populated within the object's dnode's 128B block pointer. Three DVAs are permissible. The number used equates to the number of ditto blocks for the object, and corresponds to the block pointer's wideness. By default:

- Three copies of global metadata (zpool-level metadata) are maintained (one original and two ditto blocks); a triple-wide block pointer is stored.
- Two copies of dataset-level metadata (e.g. file system objects, such as files and directories) are maintained (one original and one ditto block); a double-wide block pointer is stored.
- One copy of user data (i.e. content) is maintained (Ditto blocks, 2006); a single-wide block pointer is stored.

When the zpool is tuned to increase the number of ditto blocks for a specific type of data (e.g. user content), this only affects future writes. Also, ZFS automatically increments the number of ditto blocks maintained for higher-level data (e.g. that content's metadata). For example, two copies of user data content would result in three copies of file system metadata pertaining to that content (Elling, 2007). The location of ditto blocks is easily determined using the DVAs located in the object's dnode's block pointers. It is important to understand the structure of block pointers and their constituent DVAs, however. Each DVA consists of a 32b vdev ID and a 63b sector offset. The offset is relative to the first two copies of the label (L0 and L1; 256 KB each) and the boot block (3.5 MB). Thus, the sector offsets listed are relative to byte offset 4,194,304 (4 MB), or sector 8192, from absolute sector zero on the vdev (physical disk, slice, etc.) (see Fig. 1). The byte offset of the object within the vdev thus equals the DVA offset value times 512, plus 4,194,304.
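A minimal sketch of this arithmetic follows, with the 4 MB of front matter (two 256 KB labels plus the 3.5 MB boot block) folded into a constant; the example offset is invented for illustration.

#include <stdint.h>
#include <stdio.h>

/* 2 x 256 KB labels + 3.5 MB boot block = 4 MB of front matter. */
#define VDEV_FRONT_MATTER 0x400000ULL

/* Translate a DVA's sector offset into an absolute byte offset on
 * the vdev, per the arithmetic derived above. Counting how many of
 * the (up to three) DVAs in a block pointer are populated reveals
 * its wideness, i.e. how many ditto copies of the block exist. */
static uint64_t dva_to_byte_offset(uint64_t dva_sector_offset)
{
    return dva_sector_offset * 512 + VDEV_FRONT_MATTER;
}

int main(void)
{
    /* Invented example: DVA offset 0x20 lands 16 KB into the
     * allocatable storage area, i.e. absolute byte 0x404000. */
    printf("0x%llx\n", (unsigned long long)dva_to_byte_offset(0x20));
    return 0;
}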
Each block pointer also includes asize (allocated size, stored as the number of 512B sectors minus one), indicating the length of the block to which the DVA points, and a G bit value. When set, the block pointer's G bit indicates that the pointer points to a gang block: a block of block pointers. This occurs when files become fragmented. It is important to note, though, that fragmentation is arguably less frequent in ZFS than in commonly used file systems, which frequently use a next-available allocation strategy (e.g. NTFS). This is because ZFS utilizes a first-fit block allocation strategy (per the metaslab.c source code). This is a bit of a hybrid approach between next available and best fit, and results in less object (i.e. file) fragmentation. This may make traditional carving techniques even more successful. The digital forensic implication of the ditto block functionality is clear: multiple allocated copies of metadata and/or content will likely exist. Proper understanding of block pointer and DVA data structures is important when examining DVAs in both the ditto block context, as well as in the more basic on-disk data walk context. Native support for instantaneous snapshots and clones is also important in this context, but warrants little discussion. It is presumed that readers understand that instances of snapshots and clones will intuitively result in additional allocated copies of metadata and content. In fact, ZFS's implementation of snapshots and clones means that the previously discussed implications of the COW model (leaving copies of metadata and content in unallocated space) will result in the same data now being allocated, and thus persistent.

Artifact: increased state awareness

Multiple copies of data in allocated and unallocated space result in increased state awareness. Typically, when forensic investigators seize computers, they only gain insight into the current state of data. Investigators are sometimes able to cull limited temporal, or state, information from application-created temp files, crash files, file system journals, and log files, but such information is limited at best. As a result, investigators eagerly seek logical level file system backups (e.g. tape backups, restore points, etc.). Such backups provide snapshots of the file system at various points in time. Analyzing those backups chronologically provides investigators insight into the progression of intrusions and changes to user-level data over time. This type of temporal information and knowledge is often critical to investigations. Digital forensics investigators will benefit from the increasingly frequent inclusion of snapshotting capabilities now built in to operating and file systems, such as ZFS. While snapshotting is certainly not new and dates back ten or more years (e.g. Veritas's VxFS, AIX's JFS/JFS2, and UFS as far back as Solaris 8), its implementation and use are becoming
increasingly mainstream. For example, Windows Vista now stores point-in-time copies of user-level data by default in its Ultimate, Business, and Enterprise editions via its Shadow Copy service. This service has been available since 2005 for Windows XP users via the Volume Shadow Copy service in SDK 7.2, but it was not installed or running by default; now it is. Similar functionality exists by default in the current Windows 7 release candidate. Apple's Time Machine, implemented in Mac OS X 10.5, creates automated, on-disk snapshots via user-configured incremental backup scheduling. ZFS implements snapshotting via the zfs snapshot command. The benefit of such file system snapshotting capabilities is that copies of data are now stored on-disk in allocated space. The snapshots include both metadata and content. They are often created automatically and transparently to the user. They represent a treasure trove of temporal, state information for the investigator (as do clones, of course). Now, combine ZFS's native support for snapshotting and cloning with its COW transactional object model, de-referenced ditto blocks, and the potential for ZIL blocks to be scattered all over the pool. The digital forensic investigator will have unprecedented temporal, state information available. This data is both allocated and unallocated (or de-referenced, if you will). We contend that the digital forensic investigator will be able to trace system and user activity chronologically much more successfully than in the past.

Challenge: compression

Compression has always been a challenge for traditional digital forensic indexing and searching techniques, resulting in additional preprocessing burden due to decompression activity. As such, it is important to note that compression plays a major role in ZFS's static, on-disk data storage. Analysis of ZFS systems will not be as simple as decompressing known compressed file types, or decompressing entire volumes during the preprocessing stage. ZFS implements compression on a massive scale, but does so at the object, and sometimes even data structure, level. Prevailing digital forensic techniques are currently unable to deal with such pervasive metadata compression and datasets with such a significant mix of data in various compression states: not compressed, compressed, and compressed with a variety of algorithms. Metadata compression is enabled by default. It can be disabled at any time, although it only appears to disable compression on indirect block pointers; direct pointers remain compressed (ZFS evil tuning guide). Additionally, ZAP objects do not appear to be compressed, even when metadata compression is turned on. Content category data is not compressed by default, but this setting can easily be tuned (Princeton University, 2007). These settings are globally set on a zpool basis, but tuning only affects new data storage. The compression state of currently saved metadata and content remains as is upon tuning. Thus, a forensic analyst will not be able to treat all objects of the same type, or even different instances of the same object, identically with respect to compression. Object compression states will have to be determined at the individual object level, as each object's metadata indicates the object's compression state in its respective block pointer. Digital forensic tools developed to analyze ZFS evidence will have to analyze data structures for compression settings and subsequently decompress the objects accordingly, prior to search, extraction, and analysis processes.
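As a sketch of what that per-object handling implies, consider the dispatch below. The enumeration values follow our reading of the OpenSolaris zio.h of this era and should be verified against the source tree; lzjb_decompress() is a hypothetical stand-in for a real LZJB decompressor.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed compression identifiers; verify against zio.h. */
enum zio_compress {
    COMPRESS_INHERIT = 0,
    COMPRESS_ON      = 1,   /* the default algorithm, i.e. LZJB */
    COMPRESS_OFF     = 2,
    COMPRESS_LZJB    = 3
};

/* Hypothetical stand-in for a real LZJB decompressor. psize is the
 * physical (on-disk) size, lsize the logical (decompressed) size. */
static int lzjb_decompress(const void *src, void *dst,
                           size_t psize, size_t lsize)
{
    (void)src; (void)dst; (void)psize; (void)lsize;
    return 0;
}

/* Each block must be handled according to the compression field in
 * its OWN parent block pointer; no zpool-wide setting can be assumed. */
static int read_object_block(uint8_t comp, const void *src, void *dst,
                             size_t psize, size_t lsize)
{
    switch (comp) {
    case COMPRESS_OFF:
        memcpy(dst, src, lsize);  /* stored uncompressed */
        return 0;
    case COMPRESS_ON:
    case COMPRESS_LZJB:
        return lzjb_decompress(src, dst, psize, lsize);
    default:
        fprintf(stderr, "unhandled compression value %u\n", comp);
        return -1;
    }
}

int main(void)
{
    uint8_t src[16] = { 0 }, dst[16];
    return read_object_block(COMPRESS_OFF, src, dst, sizeof src, sizeof dst);
}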
Challenge: dynamically sized extents

As stated earlier, ZFS uses neither block-based allocation, nor extent-based allocation in the traditional sense. While block-based allocation results in maximum space utilization, it suffers from expensive disk I/O (unless augmented with supplemental read/write and grouping algorithms, as UFS did). Extent-based allocation improves disk I/O, but can result in wasted disk space. The solution to this problem has typically been variably sized block functionality, where the extent (AKA cluster, logical allocation unit, data unit) size is defined upon file system creation and is optimally set based on intended use. In other words, the file system creator selects a small extent size if she knows the file system will be used to store many small files. Regardless of what extent size is selected during file system install, this extent size remains fixed for the entire file system, and all subsequent files will be allocated to these equally sized extents. ZFS uses a bit of a hybrid between block-based and extent-based allocation. It is more aptly described as extent-based, but the extents are dynamically sized according to individual file requirements. The record size of an FSB has a consistent maximum size (default = 128 KB), but upon file allocation, ZFS dynamically sizes the FSB to just fit the data allocated. In other words, a 700B file would be allocated to a two-sector FSB (presuming 512B sectors), or 1024B. A 4000B file would be allocated to an eight-sector FSB, or 4096B. A 4097B file would be allocated to a nine-sector FSB, or 4608B. Files larger than the FSB record size (i.e. greater than 128 KB if the record size remains set to the default 128 KB) will be allocated to multiple FSBs. On the surface, this might appear to be block-based allocation, but it is not. The metadata stored is still stored at the extent level, similar to data runs in NTFS or extent records in HFS+ (e.g. first fragment starts at cluster 140 for a length of 10 clusters, second fragment starts at cluster 800 for a length of 5 clusters), rather than lists of allocated blocks (e.g. listing all 15 cluster numbers). Two key differences exist between ZFS and traditional extent-level allocation implementations, however. First, ZFS extents vary in size between files within a file system. Second, the metadata that specify the location and length of the file fragments refer to the fragments' sector offsets within the vdev (relative to the labels and boot block, as described above) and fragment lengths (in sectors). Additionally, the FSB record size can be repeatedly tuned after install (though this is not advised). Thus, new files and copies of old files may exhibit very different FSB sizes from each other and from old files (Bourbonnais, 2006). The FSB size for a specific object (e.g. file) is stored in the object's dnode, within the 16b dn_datablkszsec data structure (size in sectors).
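A hedged sketch of this sizing rule follows, reproducing the worked examples above and adding the resulting file slack discussed in the next sections; it assumes the default 128 KB record size and 512B sectors, and ignores compression and per-dnode tuning.

#include <stdint.h>
#include <stdio.h>

#define SECTOR      512
#define RECORDSIZE  (128 * 1024)  /* default maximum FSB record size */

/* Allocated size of a file's FSB(s), per the worked examples above:
 * round up to a whole number of sectors, capped at the record size.
 * Files at or above the record size use multiple full-size FSBs. */
static uint64_t fsb_size(uint64_t file_size)
{
    if (file_size >= RECORDSIZE)
        return RECORDSIZE;
    return (file_size + SECTOR - 1) / SECTOR * SECTOR;
}

/* File slack in the final FSB: allocated bytes minus used bytes.
 * For small files this is sub-sector remainder only (presumed
 * zeroed); for large files the final 128 KB FSB can hold a lot. */
static uint64_t last_fsb_slack(uint64_t file_size)
{
    if (file_size < RECORDSIZE)
        return fsb_size(file_size) - file_size;
    uint64_t tail = file_size % RECORDSIZE;
    return tail ? RECORDSIZE - tail : 0;
}

int main(void)
{
    uint64_t sizes[] = { 700, 4000, 4097, 200 * 1024 };
    for (int i = 0; i < 4; i++)
        printf("%7llu B file -> FSB %6llu B, last-FSB slack %6llu B\n",
               (unsigned long long)sizes[i],
               (unsigned long long)fsb_size(sizes[i]),
               (unsigned long long)last_fsb_slack(sizes[i]));
    return 0;  /* prints 1024/324, 4096/96, 4608/511, 131072/57344 */
}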
Dynamically sized extents negatively impact the ease with which analytical inferences can be made. With file systems that use traditional extent-based allocation (i.e. fixed-size extents, variably determined during file system install), forensic examiners can map extents from the start of the volume, data area, etc. as appropriate, and know that all data in a specific block consists of that file's content and its file slack. Since the file system will be full of differently sized FSBs, the investigator cannot use simple arithmetic to identify extent boundaries. The investigator will have to consult ZFS metadata pertaining to individual files, as well as the space maps for each metaslab.

Artifact: dynamically sized extents

In addition to the analytical challenge that dynamically sized extents present, such variability also greatly impacts a critical digital forensics concept: file slack. Many investigators have benefited from finding old file content in file slack. Since extents are variably and dynamically sized to best fit the file content, less file slack should exist. Files smaller than the FSB record size should exhibit no useful file slack, presuming the sector slack is zeroed out as with many other file systems. Files larger than the FSB record size, on the other hand, will exhibit a significant amount of file slack in many cases. This amount will be unusually large relative to many commonly used file systems, since the default maximum FSB record size is 128 KB. In NTFS, HFS+, and EXT2/3, the extent size is typically set to 512, 1024, 2048, 4096, or 8192 bytes, with 4096B being the norm. So, when files are not exact increments of the fixed extent size (e.g. 4 KB), additional extents are allocated, and the last extent may contain useful file slack. If the last extent is larger than 4 KB, as is the case with ZFS and its default maximum extent size of 128 KB, much more file slack will likely exist. JFS is one file system that is capable of even larger extent sizes (up to 16 GB), but it is designed to right-size the last extent to minimize file slack (Ray, 2004). ZFS is not currently designed to right-size the last extent, although conversations with ZFS developers suggest this may change in future releases.

4. Concluding remarks

4.1. Contribution

To the best of our knowledge, this is the first article that examines and discusses ZFS in the digital forensic context. We identified digital forensic artifacts that exist due to ZFS functionality and design: specifically, the increased incidence of copies of metadata and content in both allocated and unallocated space, due to ditto blocks, COW, the ZIL, etc. We further discussed the implication of harvesting and analyzing such data to provide a greater chronological sight picture than has been previously possible. We outlined key data structures and how to locate them, which is important for forensic tool development and manual data-walking by forensic investigators. Finally, we alerted the community to the unique challenges that compression and dynamically sized extents represent for the forensic investigator. Though not discussed explicitly, the challenge of finding data through massive amounts of indirection and in the absence of statically located metadata and/or file system category files should also be noted. Although the forensic implication discussions summarized above represent the primary research contribution of this article, we also hope readers find the background informative and instructive. So, to a lesser extent, the ZFS primer is also a contribution of this article.

4.2. Limitations and future research

The primary limitation of this article is that we have yet to verify all statements through our own direct observation and reverse engineering of on-disk behavior. We worked very hard to leverage authoritative sources of information and corroborate statements regarding ZFS functionality, architecture, and disk layout. We analyzed ZFS source code to some extent, although not exhaustively. We acknowledge that our implications and challenges discussions are academic and need empirical verification. Aside from the overarching need for empirical verification stated above, there are a few specific areas where future research is needed. One such area is the ZIL. Documentation for the ZIL is limited. Experimentation and more thorough source code analyses are needed to fully understand its in-memory and on-disk formats, its similarities and differences with existing journaling file systems, and the digital forensic implications thereof. Another area ripe for future research regards the increased state awareness of system and user level activity provided by the greater incidence of both allocated and unallocated snapshot data. Without a doubt, more state data will be available on-disk. Several unknowns remain, however. We do not fully understand what the scope of such data will be in practice. Research is needed respecting the recovery and analysis of such data, particularly as it relates to relating the data, transforming it into chronological information, and deriving investigative knowledge thereof.

5. Conclusion

ZFS is different. Its data structures are unique. Its on-disk behavior and artifacts are not what digital forensic investigators are used to seeing. Very little is statically located in devices managed by ZFS. A great deal of data is compressed, and soon (rumor has it) much will be encrypted. File slack is arguably non-existent in small files and unusually abundant in large files. Numerous extra copies of metadata and content will be literally scattered about devices. Such data can be used to increase temporal, state information in support of investigations. The list goes on. Mac forensics used to be just a niche area of forensics, reflective of the user market. It is no more. We predict that ZFS will follow a similar adoption curve amongst consumers, and, perhaps more importantly, gain a much greater share in enterprise systems where end-to-end data integrity and large scale storage are necessary. We predict that next-generation file systems will adopt and integrate many of the solutions offered by ZFS to combat today's IT challenges. As such, it is important for researchers and practitioners to understand ZFS
and its forensic implications. We hope this article has made strides in both areas.

Acknowledgements

The authors wish to thank HEI Systems & Solutions of San Antonio, Texas. It was discussions with them regarding their innovative use of ZFS that sparked our interest in this subject. They also proved to be very knowledgeable and helpful during the course of writing this article. We also wish to thank the DFRWS reviewers for their insightful comments and suggestions that improved this paper greatly.

References

Anonymous. ZFS on-disk specification (DRAFT).
Anonymous (Unknown). OpenSolaris community: ZFS, opensolaris.org/os/community/zfs [Date accessed: March 2009].
Anonymous (Unknown). ZFS evil tuning guide, solarisinternals.com/wiki/index.php/zfs_evil_tuning_guide [Date accessed: March 2009].
Bourbonnais R. The dynamics of ZFS, entry/the_dynamics_of_zfs; 2006 [Date accessed: March 2009].
Bruning M. ZFS on-disk data walk (or: where's my data), slides from presentation at the OpenSolaris developer conference. Prague, Czech Republic, osdevcon2008-max.pdf; June 25-27, 2008a.
Bruning M. ZFS on-disk data walk (or: where's my data), video of presentation at the OpenSolaris developer conference. Prague, Czech Republic, videoplay?docid= ; June 25-27, 2008b.
Bruning M. ZFS on-disk data walk (or: where's my data?) (DRAFT). In: Proceedings of the OpenSolaris developer conference. Prague, Czech Republic, June 25-27; 2008c.
Carrier B. File system forensic analysis. Upper Saddle River, NJ: Addison-Wesley; 2005.
Eckstein K. Forensics for advanced UNIX file systems. In: Proceedings of the fifth annual IEEE SMC information assurance workshop. West Point, NY, June 2004.
Elling R. ZFS, copies, and data protection. URL: com/relling/entry/zfs_copies_and_data_protection; 2007.
Ray M. JFS tuning and performance. Hewlett Packard Company.
Unknown. Ditto blocks: the amazing tape repellent. URL: tape; 2006.
Princeton University. ZFS management. URL: princeton.edu/~unix/solaris/troubleshoot/zfs.html; 2007 [Date accessed: March 2009].

Nicole Lang Beebe, Ph.D. is an Assistant Professor in the Department of Information Systems & Technology Management at the University of Texas at San Antonio (UTSA). UTSA is a National Center of Academic Excellence in Information Assurance for both education (CAE-IAE) and research (CAE-R). Dr. Beebe received her Ph.D. in Information Technology from UTSA. She has over ten years' experience in information security and digital forensics, in both the commercial and government sectors. She was a computer crime investigator for the Air Force Office of Special Investigations beginning in 1998. She is a Certified Information Systems Security Professional (CISSP). She has published several journal articles related to information security and digital forensics in The DATABASE for Advances in Information Systems, Digital Investigation, and the Journal of Information System Security (JISSEC). Her research interests include digital forensics, information security, and data mining.

Sonia D. Stacy recently graduated (May 2009) from The University of Texas at San Antonio with a master of science degree in Information Technology and a concentration in Infrastructure Assurance. She is now working for the U.S. government in the computer security field.
Dane Stuckey is an undergraduate student at The University of Texas at San Antonio, pursuing a Bachelor of Business Administration degree in both Infrastructure Assurance and Information Systems. Dane currently serves as the lead laboratory administrator for UTSA's Advances Laboratories for Infrastructure Assurance and Security (ALIAS). He also holds an internship with the Air Force Office of Special Investigations, where he focuses on computer crime and digital forensic investigations.
The Linux Virtual Filesystem
Lecture Overview Linux filesystem Linux virtual filesystem (VFS) overview Common file model Superblock, inode, file, dentry Object-oriented Ext2 filesystem Disk data structures Superblock, block group,
ZFS Backup Platform. ZFS Backup Platform. Senior Systems Analyst TalkTalk Group. http://milek.blogspot.com. Robert Milkowski.
ZFS Backup Platform Senior Systems Analyst TalkTalk Group http://milek.blogspot.com The Problem Needed to add 100's new clients to backup But already run out of client licenses No spare capacity left (tapes,
Web-Based Data Backup Solutions
"IMAGINE LOSING ALL YOUR IMPORTANT FILES, IS NOT OF WHAT FILES YOU LOSS BUT THE LOSS IN TIME, MONEY AND EFFORT YOU ARE INVESTED IN" The fact Based on statistics gathered from various sources: 1. 6% of
Network Attached Storage. Jinfeng Yang Oct/19/2015
Network Attached Storage Jinfeng Yang Oct/19/2015 Outline Part A 1. What is the Network Attached Storage (NAS)? 2. What are the applications of NAS? 3. The benefits of NAS. 4. NAS s performance (Reliability
Database Virtualization Technologies
Database Virtualization Technologies Database Virtualization Comes of age CloneDB : 3 talks @ OOW Clone Online in Seconds with CloneDB (EMC) CloneDB with the Latest Generation of Database (Oracle) Efficient
Algorithms and Methods for Distributed Storage Networks 7 File Systems Christian Schindelhauer
Algorithms and Methods for Distributed Storage Networks 7 File Systems Institut für Informatik Wintersemester 2007/08 Literature Storage Virtualization, Technologies for Simplifying Data Storage and Management,
Chapter 11: File System Implementation. Operating System Concepts with Java 8 th Edition
Chapter 11: File System Implementation 11.1 Silberschatz, Galvin and Gagne 2009 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation
A Deduplication File System & Course Review
A Deduplication File System & Course Review Kai Li 12/13/12 Topics A Deduplication File System Review 12/13/12 2 Traditional Data Center Storage Hierarchy Clients Network Server SAN Storage Remote mirror
Raima Database Manager Version 14.0 In-memory Database Engine
+ Raima Database Manager Version 14.0 In-memory Database Engine By Jeffrey R. Parsons, Senior Engineer January 2016 Abstract Raima Database Manager (RDM) v14.0 contains an all new data storage engine optimized
Installing a Second Operating System
Installing a Second Operating System Click a link below to view one of the following sections: Overview Key Terms and Information Operating Systems and File Systems Managing Multiple Operating Systems
CS2510 Computer Operating Systems
CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction
CS2510 Computer Operating Systems
CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction
Deploying De-Duplication on Ext4 File System
Deploying De-Duplication on Ext4 File System Usha A. Joglekar 1, Bhushan M. Jagtap 2, Koninika B. Patil 3, 1. Asst. Prof., 2, 3 Students Department of Computer Engineering Smt. Kashibai Navale College
Distributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
Best Practices for Architecting Storage in Virtualized Environments
Best Practices for Architecting Storage in Virtualized Environments Leverage Advances in Storage Technology to Accelerate Performance, Simplify Management, and Save Money in Your Virtual Server Environment
Data Deduplication and Corporate PC Backup
A Druva White Paper Data Deduplication and Corporate PC Backup This Whitepaper explains source based deduplication technology and how it is used by Druva s insync product to save storage bandwidth and
Designing a Cloud Storage System
Designing a Cloud Storage System End to End Cloud Storage When designing a cloud storage system, there is value in decoupling the system s archival capacity (its ability to persistently store large volumes
File System Design and Implementation
Transactions and Reliability Sarah Diesburg Operating Systems CS 3430 Motivation File systems have lots of metadata: Free blocks, directories, file headers, indirect blocks Metadata is heavily cached for
Near-Instant Oracle Cloning with Syncsort AdvancedClient Technologies White Paper
Near-Instant Oracle Cloning with Syncsort AdvancedClient Technologies White Paper bex30102507wpor Near-Instant Oracle Cloning with Syncsort AdvancedClient Technologies Introduction Are you a database administrator
FLASH IMPLICATIONS IN ENTERPRISE STORAGE ARRAY DESIGNS
FLASH IMPLICATIONS IN ENTERPRISE STORAGE ARRAY DESIGNS ABSTRACT This white paper examines some common practices in enterprise storage array design and their resulting trade-offs and limitations. The goal
OPTIMIZING EXCHANGE SERVER IN A TIERED STORAGE ENVIRONMENT WHITE PAPER NOVEMBER 2006
OPTIMIZING EXCHANGE SERVER IN A TIERED STORAGE ENVIRONMENT WHITE PAPER NOVEMBER 2006 EXECUTIVE SUMMARY Microsoft Exchange Server is a disk-intensive application that requires high speed storage to deliver
Library Recovery Center
Library Recovery Center Ever since libraries began storing bibliographic information on magnetic disks back in the 70 s, the challenge of creating useful back-ups and preparing for a disaster recovery
An Oracle White Paper July 2014. Oracle ACFS
An Oracle White Paper July 2014 Oracle ACFS 1 Executive Overview As storage requirements double every 18 months, Oracle customers continue to deal with complex storage management challenges in their data
Siebel Installation Guide for Microsoft Windows. Siebel Innovation Pack 2013 Version 8.1/8.2, Rev. A April 2014
Siebel Installation Guide for Microsoft Windows Siebel Innovation Pack 2013 Version 8.1/8.2, Rev. A April 2014 Copyright 2005, 2014 Oracle and/or its affiliates. All rights reserved. This software and
NSS Volume Data Recovery
NSS Volume Data Recovery Preliminary Document September 8, 2010 Version 1.0 Copyright 2000-2010 Portlock Corporation Copyright 2000-2010 Portlock Corporation Page 1 of 20 The Portlock storage management
File Management. Chapter 12
Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution
Moving Virtual Storage to the Cloud. Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage
Moving Virtual Storage to the Cloud Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage Table of Contents Overview... 1 Understanding the Storage Problem... 1 What Makes
Outline. Failure Types
Outline Database Management and Tuning Johann Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 11 1 2 Conclusion Acknowledgements: The slides are provided by Nikolaus Augsten
Filesystems Performance in GNU/Linux Multi-Disk Data Storage
JOURNAL OF APPLIED COMPUTER SCIENCE Vol. 22 No. 2 (2014), pp. 65-80 Filesystems Performance in GNU/Linux Multi-Disk Data Storage Mateusz Smoliński 1 1 Lodz University of Technology Faculty of Technical
Развертывание сервера приложений Oracle GlassFish Server на OpenSolaris: мониторинг, подготовка к работе и резервное копирование
Развертывание сервера приложений Oracle GlassFish Server на OpenSolaris: мониторинг, подготовка к работе и резервное копирование Филипп Торчинский Sun Microsystems 1 Agenda Introduction What is OpenSolaris,
Introduction to Gluster. Versions 3.0.x
Introduction to Gluster Versions 3.0.x Table of Contents Table of Contents... 2 Overview... 3 Gluster File System... 3 Gluster Storage Platform... 3 No metadata with the Elastic Hash Algorithm... 4 A Gluster
AppSense Environment Manager. Enterprise Design Guide
Enterprise Design Guide Contents Introduction... 3 Document Purpose... 3 Basic Architecture... 3 Common Components and Terminology... 4 Best Practices... 5 Scalability Designs... 6 Management Server Scalability...
Optimizing Ext4 for Low Memory Environments
Optimizing Ext4 for Low Memory Environments Theodore Ts'o November 7, 2012 Agenda Status of Ext4 Why do we care about Low Memory Environments: Cloud Computing Optimizing Ext4 for Low Memory Environments
Workload Dependent Performance Evaluation of the Btrfs and ZFS Filesystems
Workload Dependent Performance Evaluation of the Btrfs and ZFS Dominique A. Heger DHTechnologies, Austin, Texas [email protected]. Introduction The UNIX and Linux operating systems alike already provide
RECOVERING FROM SHAMOON
Executive Summary Fidelis Threat Advisory #1007 RECOVERING FROM SHAMOON November 1, 2012 Document Status: FINAL Last Revised: 2012-11-01 The Shamoon malware has received considerable coverage in the past
Hypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
Product Brief. DC-Protect. Content based backup and recovery solution. By DATACENTERTECHNOLOGIES
Product Brief DC-Protect Content based backup and recovery solution By DATACENTERTECHNOLOGIES 2002 DATACENTERTECHNOLOGIES N.V. All rights reserved. This document contains information proprietary and confidential
Concepts of digital forensics
Chapter 3 Concepts of digital forensics Digital forensics is a branch of forensic science concerned with the use of digital information (produced, stored and transmitted by computers) as source of evidence
by Scott Moulton @MyHardDriveDied.com Recover your P0RN from your RAID Array!
by Scott Moulton @MyHardDriveDied.com Recover your P0RN from your RAID Array! WHAT IS THIS ABOUT? BRIEF Coverage ;) Unusual Arrays Intro to RAID About RAID 0 Sight Samples Sound Samples About RAID 5 Demo
Big data management with IBM General Parallel File System
Big data management with IBM General Parallel File System Optimize storage management and boost your return on investment Highlights Handles the explosive growth of structured and unstructured data Offers
WHITE PAPER: ENTERPRISE SOLUTIONS. Veritas NetBackup Bare Metal Restore by Symantec Best-of-Breed Server Recovery Using Veritas NetBackup Version 6.
WHITE PAPER: ENTERPRISE SOLUTIONS Veritas NetBackup Bare Metal Restore by Symantec Best-of-Breed Server Recovery Using Veritas NetBackup Version 6.0 White Paper: Enterprise Solutions Veritas NetBackup
NetApp Data Compression and Deduplication Deployment and Implementation Guide
Technical Report NetApp Data Compression and Deduplication Deployment and Implementation Guide Clustered Data ONTAP Sandra Moulton, NetApp April 2013 TR-3966 Abstract This technical report focuses on clustered
Complete Storage and Data Protection Architecture for VMware vsphere
Complete Storage and Data Protection Architecture for VMware vsphere Executive Summary The cost savings and agility benefits of server virtualization are well proven, accounting for its rapid adoption.
On Benchmarking Popular File Systems
On Benchmarking Popular File Systems Matti Vanninen James Z. Wang Department of Computer Science Clemson University, Clemson, SC 2963 Emails: {mvannin, jzwang}@cs.clemson.edu Abstract In recent years,
Perforce Backup Strategy & Disaster Recovery at National Instruments
Perforce Backup Strategy & Disaster Recovery at National Instruments Steven Lysohir National Instruments Perforce User Conference April 2005-1 - Contents 1. Introduction 2. Development Environment 3. Architecture
Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University
Operating Systems CSE 410, Spring 2004 File Management Stephen Wagner Michigan State University File Management File management system has traditionally been considered part of the operating system. Applications
A Dell Technical White Paper Dell Compellent
The Architectural Advantages of Dell Compellent Automated Tiered Storage A Dell Technical White Paper Dell Compellent THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL
Using SmartOS as a Hypervisor
Using SmartOS as a Hypervisor SCALE 10x Robert Mustacchi [email protected] (@rmustacc) Software Engineer What is SmartOS? Solaris heritage Zones - OS level virtualization Crossbow - virtual NICs ZFS - pooled
Siebel Installation Guide for UNIX. Siebel Innovation Pack 2013 Version 8.1/8.2, Rev. A April 2014
Siebel Installation Guide for UNIX Siebel Innovation Pack 2013 Version 8.1/8.2, Rev. A April 2014 Copyright 2005, 2014 Oracle and/or its affiliates. All rights reserved. This software and related documentation
Microsoft Vista: Serious Challenges for Digital Investigations
Proceedings of Student-Faculty Research Day, CSIS, Pace University, May 2 nd, 2008 Microsoft Vista: Serious Challenges for Digital Investigations Darren R. Hayes and Shareq Qureshi Seidenberg School of
Microsoft SMB File Sharing Best Practices Guide
Technical White Paper Microsoft SMB File Sharing Best Practices Guide Tintri VMstore, Microsoft SMB 3.0 Protocol, and VMware 6.x Author: Neil Glick Version 1.0 06/15/2016 @tintri www.tintri.com Contents
Data De-duplication Methodologies: Comparing ExaGrid s Byte-level Data De-duplication To Block Level Data De-duplication
Data De-duplication Methodologies: Comparing ExaGrid s Byte-level Data De-duplication To Block Level Data De-duplication Table of Contents Introduction... 3 Shortest Possible Backup Window... 3 Instant
