File Formats and Carving

October 28, 2016
Categorised in: Computer Forensic & Cyber Applications
Many kinds of files have a distinctive structure, designed by software developers or standards bodies, that can be useful for classifying and salvaging (retrieving or preserving) data fragments.
For instance, a graphics file format such as JPEG has a completely different structure from Microsoft Word documents, starting with the first few bytes at the beginning of the file (the “header”), continuing into the locations where data are stored in the main body of the file, and terminating with a few distinctive bytes at the end of the file (the “footer”).
The headers and footers for some common file types are listed in Table 15.4.
The common headers in a JPEG image, Word document, and other file types are often referred to as file signatures and can be used to locate and salvage portions of deleted files.
The process of searching for a certain file signature and attempting to extract the associated data is called “carving” because it conceptually involves cutting a specific piece of data out of a larger dataset.
Carving in the context of digital forensics uses characteristics of a given class of files to locate those files in a raw data stream such as unallocated clusters on a hard drive.
For instance, the beginning and end of a Web (HTML) page are demarcated by “<html>” and “</html>,” respectively.
Figure 15.3 shows another example of digital evidence that is commonly found in child exploitation investigations: digital camera photographs.
The characteristic “FF D8 FF” hexadecimal values indicate that this is the beginning of a JPEG-encoded file and the characteristic “Exif” indicates that it is an Exchangeable Image File Format file common on digital cameras.
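As a rough illustration of the signature search described above, the following Python sketch scans a raw byte buffer for the “FF D8 FF” header and checks whether “Exif” appears shortly after it. The function name find_jpeg_headers and the 32-byte exif_window are assumptions made for this example, not part of any forensic tool.

```python
JPEG_HEADER = bytes.fromhex("FFD8FF")  # JPEG header signature from the text
EXIF_TAG = b"Exif"                     # marker of an Exif-formatted JPEG


def find_jpeg_headers(data: bytes, exif_window: int = 32):
    """Return a list of (offset, has_exif) for every JPEG header in data.

    exif_window is an assumed search distance after the header in which
    the "Exif" tag is expected to appear.
    """
    hits = []
    pos = data.find(JPEG_HEADER)
    while pos != -1:
        has_exif = EXIF_TAG in data[pos:pos + exif_window]
        hits.append((pos, has_exif))
        pos = data.find(JPEG_HEADER, pos + 1)
    return hits


# Synthetic buffer: one Exif-style header at offset 10, one JFIF-style at 42.
sample = (
    b"\x00" * 10
    + bytes.fromhex("FFD8FFE1001C") + b"Exif\x00\x00"
    + b"\x00" * 20
    + bytes.fromhex("FFD8FFE00010") + b"JFIF\x00"
)
print(find_jpeg_headers(sample))
```

A signature as short as three bytes can occur by chance inside unrelated data, so each offset reported this way is only a candidate that still needs to be examined.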
Once the beginning and end of the file are located, the intermediate data can be extracted into a file.
This carving process can be achieved by simply copying the data and pasting them into a file.
Alternatively, the data can be extracted using dd by specifying the byte offset where the file begins (skip) and the number of bytes to copy (count), with the block size set to one byte (bs=1), as shown here:
D:\>dd if=g:\Case1435\Prepare\unallocated-raw\memory-card-03424-unalloc of=g:\Case1435\Review\unallocated-processed\memorycard-03424-image1.jpg bs=1 skip=100934 count=652730
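The same extraction can be sketched in Python for readers who do not have dd available. This is a minimal equivalent of the skip and count options with a one-byte block size; the function name carve_range and the chunk_size parameter are hypothetical and chosen only for this example.

```python
def carve_range(src_path, dst_path, skip, count, chunk_size=1 << 20):
    """Copy `count` bytes starting at byte offset `skip` from src to dst.

    Returns the number of bytes actually written, which is smaller than
    `count` if the source ends early.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(skip)  # same role as dd's skip= when bs=1
        while written < count:
            chunk = src.read(min(chunk_size, count - written))
            if not chunk:
                break  # reached end of the source before `count` bytes
            dst.write(chunk)
            written += len(chunk)
    return written
```

With the values from the dd command above, the call would be carve_range(source, destination, skip=100934, count=652730), where source and destination are the input and output paths. Reading in chunks rather than one byte at a time gives the same output while avoiding the slowness of bs=1.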
To make this process more efficient, tools have been developed to automate carving for various file types, including Foremost, Scalpel, and DataLifter.
Specialized forensic tools like EnCase, FTK, and X-Ways also have some carving capabilities.
These tools can be useful for recovering digital video segments created using webcams, which are often in AVI, MPEG, or QuickTime format and may be deleted frequently.
This carving technique also works for extracting files from physical memory dumps from mobile devices and from raw network traffic.
Additionally, mobile devices can contain deleted data that may be recoverable using specialized tools (van der Knijff, 2008).
There are a number of limitations to this approach to salvaging data from storage media.
First, the file name and date-time stamps that were associated with a file when it was referenced by the file system are not salvaged along with the data.
Second, the size of the original file may not be known, making it necessary to guess how much data to carve out.
Third, when the original file was fragmented, a simple carving process that assumes all portions of the file were stored contiguously on the disk will fail, salvaging fragments of multiple files and incorrectly combining them into a single container.
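The second and third limitations can be seen in a simple header-to-footer carver. The sketch below is an assumption-laden illustration rather than how any named tool works: it uses the JPEG end-of-image marker “FF D9” as the footer, a hypothetical max_size cap for cases where no footer is found, and it assumes every file is stored contiguously.

```python
JPEG_HEADER = bytes.fromhex("FFD8FF")  # header signature from the text
JPEG_FOOTER = bytes.fromhex("FFD9")    # JPEG end-of-image marker


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024):
    """Return a list of (offset, carved_bytes, footer_found) tuples.

    Assumes contiguous storage. When no footer is found within max_size
    bytes of a header, the carve is cut off at the cap, which is only a
    guess at the original file size.
    """
    results = []
    pos = data.find(JPEG_HEADER)
    while pos != -1:
        limit = min(len(data), pos + max_size)
        end = data.find(JPEG_FOOTER, pos + len(JPEG_HEADER), limit)
        if end != -1:
            stop = end + len(JPEG_FOOTER)
            results.append((pos, data[pos:stop], True))
        else:
            stop = limit  # size unknown: fall back to the assumed cap
            results.append((pos, data[pos:stop], False))
        pos = data.find(JPEG_HEADER, stop)
    return results


# Synthetic buffer: one complete "file" and one with no footer.
demo = (
    b"junk"
    + JPEG_HEADER + b"AAAA" + JPEG_FOOTER
    + b"pad"
    + JPEG_HEADER + b"BBBBBBBB"
)
for offset, blob, complete in carve_jpegs(demo, max_size=9):
    print(offset, len(blob), complete)
```

If the original file was fragmented, the bytes between the header and the first footer found may belong to several different files, which is exactly the failure described above. Stopping at the first “FF D9” can also truncate a photograph that contains an embedded thumbnail with its own end marker.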
Research and development are under way on carving tools that address some of these limitations for certain file types.
For instance, Adroit (http://digital-assembly.com) is a tool designed to recover fragmented JPEG files.