Introduction to File Carving




By Christiaan Beek Principal Security Consultant McAfee Foundstone Professional Services

Table of Contents
Overview
File Recovery Versus Carving
Fragmentation
Tooling
An example of using PhotoRec
Mobile Phones
Development by the forensic community
Conclusion
References
About the Author
About McAfee Foundstone Education
McAfee Foundstone Security Training Classes

Overview

File carving, or simply carving, is the process of extracting a collection of data from a larger data set. Carving techniques are frequently used during a digital investigation, when the unallocated space of a file system is analyzed to extract files. The files are carved from the unallocated space using file type-specific header and footer values; file system structures are not used during the process. This makes file carving a powerful technique for recovering files and fragments of files when directory entries are corrupt or missing. The raw data is searched block by block for residual data matching the file type-specific header and footer values.

Carving is especially useful in criminal cases, where it can recover evidence. In certain cases related to child pornography, law enforcement agents are often able to recover more images from the suspect's hard disks by using carving techniques. Another example is the hard disks and removable storage media that US Navy SEALs took from Osama Bin Laden's compound during their raid. Forensic experts used file carving techniques to squeeze every bit of information out of this media.

As long as data has not been overwritten or wiped, deleted data on all storage devices can be restored using carving techniques, including multifunctional devices and even mobile phones. Depending on the conditions, it is even possible to restore data from formatted disks. Given the large capacities of drives sold since 2006, there is a good chance that deleted data has not been overwritten. For example, let's say you have a two-terabyte drive, and you delete a document from that drive. The disk space reserved for that document will be marked as available, but it could take a long time before this address space on the disk is overwritten. In some forensic cases, we have discovered files that were stored on the disk years earlier.

This paper explains the basic techniques of file carving tools such as Foremost and PhotoRec for recovering data from several media types.

File Recovery Versus Carving

There is a big difference between file recovery techniques and carving. File recovery techniques make use of the file system information that remains after a file is deleted. By using this information, many files can be recovered. For this technique to work, the file system information needs to be correct. If it is not, the files can't be recovered. If a system has been formatted, file recovery techniques will not work either.

Carving deals with the raw data on the media and doesn't use the file system structure during its process. A file system (such as FAT16, FAT32, NTFS, EXT, and others) is a structure for storing and organizing computer files and the data they contain. Although carving doesn't care which file system is used to store the files, it can be very helpful to understand how a specific file system works. In the FAT file system, for example, when a file is deleted, the file's directory entry is changed to show that the file is no longer needed (unallocated). The first character of the filename is replaced with a marker, but the file data itself is left unchanged. Until it is overwritten, the data is still present. A detailed yet accessible book on file system forensics is Brian Carrier's File System Forensic Analysis (http://www.digital-evidence.org/fsfa/).

Carving makes use of the internal structure of a file. A file is a block of stored information, like an image in a JPEG file. A computer uses file name extensions to identify a file's content. Let's have a look at the internal structure of a JPEG file.

Short Name   Bytes                Payload         Name
SOI          0xFF D8              none            Start of Image
SOF0         0xFF C0              variable size   Start of Frame (Baseline DCT)
SOF2         0xFF C2              variable size   Start of Frame (Progressive DCT)
DHT          0xFF C4              variable size   Define Huffman Table(s)
DQT          0xFF DB              variable size   Define Quantization Table(s)
DRI          0xFF DD              2 bytes         Define Restart Interval
SOS          0xFF DA              variable size   Start of Scan
RSTn         0xFF D0 to 0xFF D7   none            Restart
APPn         0xFF En              variable size   Application-Specific
COM          0xFF FE              variable size   Comment (text)
EOI          0xFF D9              none            End of Image

Figure 1. File structure of a JPEG file.

In a JPEG file, there are certain structures that can help the carving software distinguish this type of file from the rest of the raw data. First of all, there is the header. The header is an identification string that is characteristic of a file type, which makes it very useful for identifying the beginning of a file. In our example of the JPEG file structure, a JPEG file starts with the Start of Image (SOI) marker, the byte values 0xFF D8 (header). Following the SOI is a series of marker blocks that hold information about the file. Each of these markers begins with a signature FF XX, where XX identifies the type of marker. For markers that carry a payload, the two bytes following the marker hold the size of the marker data. The marker data immediately follows the size, and the next marker FF XX immediately follows the previous marker data. The number of markers in a file is not fixed. The signature FF DA after these markers is the Start of Scan (SOS) marker. The SOS marker is followed by a two-byte value giving the size of the SOS data, which is immediately followed by the image stream that makes up the graphic. A JPEG file ends with the bytes 0xFF D9 (footer). The constant values 0xFF D8 and 0xFF D9 are also called magic numbers. In some cases, a thumbnail graphic exists within the file.
The thumbnail graphic has exactly the same components as the full-size graphic, starting with the byte values FF D8 and ending with the byte values FF D9. A thumbnail graphic is smaller and therefore less likely to be fragmented than its full-size parent. If manual visual review of the carved graphic is required, the thumbnail can be used as a comparison to evaluate what the entire JPEG graphic should look like. The header and footer of the JPEG file, viewed in the PSPad hex editor [1], are shown below:

Figure 2. JPEG header.

Figure 3. JPEG footer.
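The marker layout described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular carving tool works: the function name list_jpeg_segments is hypothetical, the size field is read as a big-endian value that counts its own two bytes, and optional fill bytes between segments are ignored.

```python
import struct

# Markers that stand alone and carry no size field: SOI, EOI, RST0-RST7.
STANDALONE = {0xD8, 0xD9} | set(range(0xD0, 0xD8))

def list_jpeg_segments(data):
    """Return (marker, offset, payload_length) tuples up to and including SOS.

    Hypothetical helper for illustration only.
    """
    if data[:2] != b"\xFF\xD8":
        raise ValueError("missing SOI header (FF D8)")
    segments = [(0xD8, 0, 0)]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker in STANDALONE:
            segments.append((marker, pos, 0))
            pos += 2
            continue
        # The two-byte size counts itself but not the FF XX marker bytes.
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker, pos, size - 2))
        if marker == 0xDA:
            # SOS: the compressed image stream follows, so stop walking.
            break
        pos += 2 + size
    return segments
```

A carver that validates this chain of markers, rather than looking only for FF D8, can reject many byte sequences that merely resemble a JPEG header.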

Header-footer carving is one of the simplest ways of carving. It searches through the raw data for the signatures of the file types you wish to carve. This kind of carving assumes that:
- The files searched for are not fragmented
- The beginning of the file is still present
- The signature being searched for is not a common string, which could cause numerous false positives

An example of a common string is the header of an MP3 file. An MP3 frame begins with a short sync pattern (the byte 0xFF followed by a byte such as 0xFB). Such a short pattern can occur in many places in the raw data and does not necessarily point to the beginning of an MP3 file.

Fragmentation

Modern operating systems try to write files without fragmentation because such files are faster to write and to read. But there are three conditions under which an operating system must write a file in two or more fragments:
1. There is no contiguous region of sectors on the media large enough to hold the file without fragmentation. This is likely if a drive has been in use for a long time, is filled to near capacity, and has had many files added and deleted in more-or-less random order over time.
2. If data is appended to an existing file, there may not be sufficient unallocated sectors at the end of the file to accommodate the new data. In this case, some file systems may relocate the original file, but mostly they will simply write the appended data to another location.
3. The file system itself may not support writing files of a certain size in a contiguous manner. For example, the Unix file system (UFS) will fragment files that are long or have bytes at the end of the file that will not fit into an even number of sectors.

Simson Garfinkel [2] researched fragmentation statistics by investigating 350 disks containing NTFS, FAT, and UFS file systems. He showed that the fragmentation rate of user files (email, JPEG, Microsoft Word, and Microsoft Excel) is high.
Microsoft Word's fragmentation rate was found to be 17 percent; for JPEG files, it was 16 percent; and for Microsoft Outlook's PST files, it was 58 percent.

For carving fragmented files, files whose beginning is missing, or files whose signature is a common string, we use advanced carving techniques based on file structure. One of the many techniques that can be used is file structure carving, which makes use of recognizable structures beyond the header and footer signatures. In the example of the JPEG layout (Figure 1), not only the header and footer values are used, but also the marker identifiers and sizes, to search block by block for JPEG files.

Tooling

There are different carving tools available. Many of them are open source; others are commercial solutions offered by companies. Because carving is a developing technique, more and more tools are becoming available. Some of the most commonly used carving tools are:
- Foremost [3]: Originally designed by the US Air Force, it is a carver designed for recovering files based on their headers, footers, and internal data structures.
- Scalpel [4]: Scalpel is a rewrite of Foremost focused on performance and reduced memory usage. It uses a database of header and footer definitions and extracts matching files from a set of image files or raw device files. Scalpel is file system independent and will carve files from FATx, NTFS, EXT2/3, or raw partitions. Scalpel will not allow you to output to the same directory you're carving from.
- PhotoRec [5]: PhotoRec is a data recovery tool designed to recover lost files from digital camera storage media (CompactFlash, Memory Stick, Secure Digital, SmartMedia, Microdrive, MMC, USB flash drives, and others), hard disks, and CD-ROMs. It recovers most common photo formats, audio files, document formats such as Microsoft Office, PDF, and HTML, and archive/compression formats. A complete list of supported file formats can be found on the PhotoRec website. PhotoRec does not attempt to write to the damaged media from which recovery is being performed. Recovered files are instead written to the directory from which you are running PhotoRec or to any other directory you choose.

More information about data carving tools and recovery tools can be found on the Forensics Wiki [6].

An example of using PhotoRec

Let's consider the scenario where you have to recover data from a mistakenly formatted USB stick. First, make an image of the device and open it in read-only mode in FTK Imager [7] to look for any data.

Figure 4. Preview content with FTK Imager.

The FAT partition has a root and an unallocated space container. In the next steps, we will use the combination of Cygwin and PhotoRec to recover the data from the USB stick. Cygwin provides a Unix-like shell under Windows. Many forensic investigators use Cygwin in combination with TSK (http://www.sleuthkit.org). In the Cygwin command prompt, go to the directory where PhotoRec is located and start it:

    # ./photorec_win.exe

1. Press Enter to proceed.
2. Choose the option None for the partition table type.
3. In the File Opt section of PhotoRec, it is possible to search for specific file types only. Since we don't know what to expect on the USB stick, search for all file types supported by PhotoRec.
4. Choose Whole disk and then Search.
5. Choose Other as the file system type, since FTK Imager showed that the file system is FAT.
6. PhotoRec will ask you where to store the carved files. Select your destination (not on the same media that you are searching) and press Enter.

PhotoRec is now running and searching for the file types. In our example, it quickly found two headers. After a couple of minutes, the scan finished and PhotoRec showed the results: it carved five files out of the USB stick. Don't forget to check whether the restored files are correct. Due to fragmentation, files will not always be recovered as they should be.

Mobile Phones

In the previous sections, we discussed carving files out of raw data and file systems. For people working in forensics, or interested in forensics, mobile phones are also very interesting sources of data. On a file system, a deleted file is only permanently gone once it has been overwritten by other data. For mobile phones, it's the same. If an SMS message is deleted, it will still be in the flash memory of the phone or on the removable storage media until that memory space is overwritten. To be more specific, mobile phones use a type of solid-state, nonvolatile memory known as flash memory to store SMS messages, call records, pictures, videos, and more. There are two types of flash memory that are commonly used: NOR and NAND. NOR memory is used for code storage; examples of items stored in NOR memory are the phone's OS and default applications. NAND memory is used for storing user data such as pictures and music.

Recovering data from a mobile phone is different. All phone models have an operating system, such as Microsoft Windows CE, Symbian, Android, or Apple iOS. These operating systems also store their files in the memory of the phone. Samsung, for example, makes use of the FAT file system. Every mobile phone vendor has its own way of storing data in the phone memory. Some vendors store the IMSI code (subscriber identification) in a certain field in the natural digit order, but other vendors use reverse nibbling to store this code in the phone memory. In that case, you need to swap the two nibbles of each byte to properly decode the data. For instance, 12h becomes 21h.
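The nibble swap can be illustrated with a short Python sketch. Both function names are hypothetical, and the handling of 0xF as a filler nibble for odd-length digit strings is an assumption; the exact field layout differs per vendor and must be checked against the dump at hand.

```python
def swap_nibbles(raw):
    """Swap the high and low nibble of every byte, so 0x12 becomes 0x21."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in raw)

def decode_swapped_digits(raw):
    """Read decimal digits stored low nibble first.

    Assumption: any nibble above 9 (such as an 0xF filler) is padding
    and is skipped.
    """
    digits = []
    for b in raw:
        for nibble in (b & 0x0F, b >> 4):
            if nibble <= 9:
                digits.append(str(nibble))
    return "".join(digits)
```

For example, the bytes 21 43 65 F7 found in a dump would decode to the digit string 1234567 under these assumptions.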
But how is it possible to recover data from a mobile phone? You need to understand the principles behind how the data is stored on the mobile phone. Photos and music are usually stored on the memory card. Once the card is available, data can easily be carved using a card reader and PhotoRec. For phone flash memory, a different procedure is required. For example, the content of an SMS message is compressed by the PDU format, which packs eight 7-bit characters into seven bytes. Alphabets may differ, and there are several encoding alternatives when displaying an SMS message.
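The packing of eight 7-bit characters into seven bytes can be reversed with a small bit accumulator. The following Python sketch is illustrative only: the function name unpack_gsm7 is hypothetical, it assumes the septets are packed least significant bits first with no header in front of the text, and it returns raw 7-bit values that still have to be mapped through the alphabet used by the message.

```python
def unpack_gsm7(packed, count):
    """Unpack `count` 7-bit values from packed bytes (low bits first)."""
    septets = []
    acc = 0    # bit accumulator
    bits = 0   # number of valid bits currently held in acc
    for octet in packed:
        acc |= octet << bits
        bits += 8
        # Emit as many complete 7-bit values as the accumulator holds.
        while bits >= 7 and len(septets) < count:
            septets.append(acc & 0x7F)
            acc >>= 7
            bits -= 7
    return septets
```

For plain lowercase letters, the default SMS alphabet uses the same values as ASCII, so the five bytes E8 32 9B FD 06 unpack to the text "hello". Other characters need a lookup table for the alphabet in use.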

More information about the SMS PDU format, an online encoder/decoder, and examples can be found on the twit88.com website [8].

There is no standard solution for recovering data from the flash memory of mobile phones. For computers, images of the disk and memory can be made using the tool dd. For mobile phones, a flasher is used to dump the physical file system of a mobile unit. An example of a flasher device is Cellebrite's UFED Ultimate, used by many law enforcement and forensic investigators. A hex dump is a snapshot of the entire contents of a handset's memory. Forensic examiners need to grab this data, preserve it, and analyze it in the hope of finding information hidden from view or deleted data.

Most mobile phone forensic examination applications are a progression of backup software that concentrates on the user's data. Some of the applications can decode the stored data, but many of them do not support the recovery of deleted items. To overcome this problem, manual investigation of the dump is often the only solution. Mobile phones can contain file types such as JPEG, MP3, MPEG, MOV, and others. Before manually searching, you need to define the file structure of each file type. For example, if you want to search for JPEG files in a dump from a cell phone, you can use the header and footer characteristics for JPEG discussed above. These values are 0xFF D8 for the header and 0xFF D9 for the footer. Open the dump in your favorite hex editor, and start searching for the string FF D8.

Figure 5. Examining a raw phone dump for JPEG.

After discovering a possible JPEG file, mark this beginning position and start looking for the values of the footer. When you have discovered the footer, select the block of data and save it to disk as a JPEG file. When the file is opened in a file viewer, the following image appears:

Figure 6. Carved image from cell phone dump.
In this case, we were lucky: the header and footer belonged to the same JPEG file. Often you will notice that the images you retrieve by hand are incomplete. As stated previously, the way mobile phones store their data depends on the manufacturer or operating system, so the files you are looking for could be heavily fragmented.
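The manual search in a hex editor can also be automated. The Python sketch below is a minimal header-footer carver and not a replacement for Foremost, Scalpel, or PhotoRec. The function name carve_jpegs and the max_size limit are hypothetical, and the header is tightened to FF D8 FF (an assumption based on the fact that the SOI is followed by another marker) to reduce false positives. Like the manual method, it assumes unfragmented files and will cut a file short at the footer of an embedded thumbnail.

```python
def carve_jpegs(dump, max_size=10 * 1024 * 1024):
    """Yield (offset, blob) for each header-to-footer span found in dump."""
    header = b"\xFF\xD8\xFF"   # SOI followed by the start of the next marker
    footer = b"\xFF\xD9"       # EOI
    pos = 0
    while True:
        start = dump.find(header, pos)
        if start == -1:
            return
        # Look for the first footer within max_size bytes of the header.
        end = dump.find(footer, start + len(header), start + max_size)
        if end == -1:
            pos = start + 1    # no footer in range; try the next header
            continue
        yield start, dump[start:end + len(footer)]
        pos = end + len(footer)
```

Each blob can be written to its own file, for example named after its offset in the dump, and then reviewed in an image viewer as described above.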

Development by the forensic community

Every year, the DFRWS [9] organizes a forensics challenge. A group of specialists decides which problems from the field to address. By organizing a challenge, they stimulate the forensic community to develop tools and procedures to tackle the problems digital investigators face in their daily practice. This year, 2011, the challenge was to investigate the flash memory and removable media of an Android smartphone. The top three submissions consisted of toolkits to carve data out of the Android OS. One of the great tools, created by Apurva Rustagi, was the SMS-carver [10]. Rustagi observed that SMS records started with a 10-digit phone number followed by a six-byte Unix time stamp. The tool dumps the records in delimited format. For the challenge, he developed two SMS-carver tools for the specific phone number mentioned in the challenge. He also provided the program code (in C) so you can adapt it to the data you are looking for. Another one of his tools carved the Internet history out of the image.

Conclusion

This white paper is an introduction to file carving for disk images and touches on the new area of carving files out of phone dumps. Memory forensics and carving files out of memory (used in malware incident response) is an example of a growing area of research. As long as people are prepared to share knowledge, we have the opportunity to improve the forensic process in many ways.

About the Author

Christiaan Beek, Principal Consultant EMEA, Incident Response and Forensics, McAfee

As a principal consultant on the McAfee Strategic Security team, Christiaan is responsible for the Incident Response and Forensics Services team in EMEA. He has 11 years of experience in information security, performing information security assessments, penetration testing, reverse engineering of malware, risk assessments, and forensics and incident response.
He is the developer and lead instructor for the Malware Forensics and Incident Response class. In 2010, Christiaan spoke at the Black Hat conferences in Barcelona and Las Vegas. In 2011, he will speak and instruct a class at Black Hat in Abu Dhabi. He has spoken at several international conferences and writes for several media outlets.

About McAfee Foundstone Education

Empowering students with the knowledge and skill to protect the most important assets from the most critical threats is McAfee Foundstone's primary educational goal. Utilizing industry-recognized experts, McAfee Foundstone security courses bring real-world experiences to the classroom. Our instructors have performed hundreds of web, mobile, e-commerce, and application security assessments and have managed security programs for government and corporate environments. Each hands-on class relies heavily on student labs, exercises, and extensive student-instructor interaction to reinforce critical security issues with real-world scenarios.

McAfee Foundstone Security Training Classes

Ultimate Hacking Mobile Building Secure Software Ultimate Web Hacking Writing Secure Code: ASP.NET Writing Secure Code: Java Ultimate Hacking Ultimate Hacking Expert Windows Security Malware Incident Response and Forensics (MFIRE)

1. http://www.pspad.com
2. S. Garfinkel, Carving contiguous and fragmented files with fast object validation, in Proc. 2007 Digital Forensics Research Workshop (DFRWS), Pittsburgh, PA, Aug. 2007, pp. 4S:2-12.
3. http://sourceforge.net/foremeost
4. http://www.digitalforensicssolutions.com/scalpel/
5. http://www.cgsecurity.org/wiki/photorec
6. http://www.forensicswiki.org/wiki/tools:data_recovery
7. http://accessdata.com/support/adownloads
8. http://twit88.com/home/utility/sms-pdu-encode-decode
9. http://www.dfrws.org
10. http://sandbox.dfrws.org/2011/rustagi

McAfee 2821 Mission College Boulevard Santa Clara, CA 95054 888 847 8766 www.mcafee.com McAfee, the McAfee logo, and McAfee Foundstone are registered trademarks or trademarks of McAfee, Inc. or its subsidiaries in the United States and other countries. Other marks and brands may be claimed as the property of others. The product plans, specifications and descriptions herein are provided for information only and subject to change without notice, and are provided without warranty of any kind, express or implied. Copyright 2011 McAfee, Inc. 39301wp_file-carving_1111_fnl_ETMG