Ripping Volume Shadow Copies – Introduction

Sunday, January 29, 2012 Posted by Corey Harrell
Windows XP is the operating system I mostly encounter during my digital forensic work. Over the past year I’ve been seeing more and more systems running Windows 7. 2011 brought with it my first few cases where the corporate systems I examined (at my day job) were all running Windows 7. The change was even more drastic for the home users I assisted with cleaning malware infections because towards the end of the year all my cases involved Windows 7 systems. I foresee Windows XP slowly becoming a relic as the corporate environments I face start upgrading the clients on their networks to Windows 7. One artifact that will be encountered more frequently in Windows 7 is Volume Shadow Copies (VSCs). VSCs can be a potential gold mine but for them to be useful one must know how to access and parse the data inside them. The Ripping Volume Shadow Copies series discusses another approach to examining VSCs and the data they contain.

What Are Volume Shadow Copies

VSCs are not new to Windows 7 and have actually been around since Windows Server 2003. Others in the DFIR community have published a wealth of information on what VSCs are, their forensic significance, and approaches to examine them. I’m only providing a quick explanation since Troy Larson’s presentation slides and Lee Whitfield’s Into the Shadows blog post both provide an excellent overview of what VSCs are. Basically, the Volume Shadow Copy Service (VSS) can back up data on a Windows system. VSS monitors a volume for any changes to the data stored on it and will create backups only containing those changes. These backups are referred to as shadow copies. According to Microsoft, the following activities will create shadow copies on Windows 7 and Vista systems:

        -  Manually (Vista & 7)
        -  Every 24 Hours (Vista)
        -  Every 7 Days (7)
        -  Before a Windows Update (Vista & 7)
        -  Unsigned Driver Installation (Vista & 7)
        -  A program that calls the Snapshot API (Vista & 7)

Importance of VSCs

The data inside VSCs may have a significant impact on an examination for a couple of reasons. The obvious benefit is the ability to recover files that may have been deleted or encrypted on the system. This rang true for me on the few cases involving corporate systems; if it wasn’t for VSCs then I wouldn’t have been able to recover the data of interest. The second, and possibly even more significant, benefit is the ability to see how systems and/or files evolved over time. I briefly touched on this in the post Ripping Volume Shadow Copies Sneak Peek. I mentioned how parsing the configuration information helped me know what file types to search for based on the installed software. Another example was how the user account information helped me verify a user account existed on the system and narrow down the timeframe when it was deleted. A system’s configuration information is just the beginning; documents, user activity, and programs launched are all great candidates to see how they changed over time.

To illustrate I’ll use a document as an example. When a document is located on a system without VSCs - for the most part - the only data that can be viewed in the document is what is currently there. Previous data inside the document might be recoverable from copies of the document or temporary files but won’t completely show how the document changed over time. To see how the document evolved would require trying to recover it at different points in time from system backups (if they were available). Now take that same document located on a system with VSCs. The document can be recovered from every VSC and each one can be examined to see its data. The data will only be what was inside the document when each VSC was created but it could cover a time period of weeks to months. Examining each document from the VSCs will shed light on how the document evolved. Another possibility is the potential to recover data that was in the document at some point in the past but isn't in the document that was located on the system. If system backups were available then they could provide additional information since more copies of the document could be obtained at other points in time.

Accessing VSCs

The Ripping Volume Shadow Copies approach works against mounted volumes. This means a forensic image or hard drive has to be mounted to a Windows system (Vista or 7) in order for the VSCs in the target volume to be ripped. There are different ways to see a hard drive or image’s VSCs and some options are highlighted below:

        -  Mount the hard drive by installing it inside a workstation (option will alter data on the hard drive)

        -  Mount the hard drive by using an external hard drive enclosure (option will alter data on the hard drive)

        -  Mount the hard drive by using a hardware writeblocker

        -  Mount the forensic image using Harlan Carvey’s method documented here, here, and the slide deck referenced here

        -  Mount the forensic image using Guidance Software’s Encase with the PDE module (option is well documented in the QCCIS white paper Reliably recovering evidential data from Volume Shadow Copies)

Regardless of the option used to mount the hard drive or image, the Windows vssadmin command or the Shadow Explorer program can show whether any VSCs are available for a given mounted volume. The pictures below show the Shadow Explorer program and the vssadmin command displaying some of the VSCs for the mounted volume with drive letter C.

Shadow Explorer Displaying C Volume VSCs

VSSAdmin Displaying C Volume VSCs
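
For readers who would rather work with the vssadmin listing as data than as a screenshot, the following is a minimal Python sketch (not part of the original series) that pulls the creation time and device path of each shadow copy out of text shaped like `vssadmin list shadows` output. The sample text, its shortened IDs, and the function name `list_shadow_copies` are illustrative assumptions; on a live system the text would come from running vssadmin in an elevated prompt.

```python
import re

# Made-up sample shaped like `vssadmin list shadows /for=C:` output.
# The IDs are shortened placeholders, not real values.
SAMPLE = r"""
Contents of shadow copy set ID: {aaaa}
   Contained 1 shadow copies at creation time: 1/20/2012 9:15:02 AM
      Shadow Copy ID: {bbbb}
         Original Volume: (C:)\\?\Volume{cccc}\
         Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy19

Contents of shadow copy set ID: {dddd}
   Contained 1 shadow copies at creation time: 1/27/2012 9:15:02 AM
      Shadow Copy ID: {eeee}
         Original Volume: (C:)\\?\Volume{cccc}\
         Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy20
"""


def list_shadow_copies(vssadmin_output):
    """Return (creation_time, device_path) pairs in the order they appear."""
    copies = []
    created = None
    for line in vssadmin_output.splitlines():
        match = re.search(r"creation time:\s*(.+)$", line)
        if match:
            # Remember the time until the matching volume line shows up.
            created = match.group(1).strip()
            continue
        match = re.search(r"Shadow Copy Volume:\s*(\S+)", line)
        if match:
            copies.append((created, match.group(1)))
    return copies


for created, device in list_shadow_copies(SAMPLE):
    print(created, device)
```

The creation times are what make it possible to pick only the VSCs that cover a period of interest.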

Picking VSCs to examine is dependent on the examination goals and what data is needed to accomplish those goals. However, time will be a major consideration. Does the examination need to review an event, document, or user activity for specific times or for all available times on a computer? Answering that question will help determine if certain VSCs covering specific times are picked or if every available VSC should be examined. Once the VSCs are selected then they can be examined to extract the information of interest.

Another Approach to Examine VSCs

Before discussing another approach to examining VSCs it’s appropriate to reflect on the approaches practitioners are currently using. The first approach is to forensically image each VSC and then examine the data inside each image. Troy’s slide deck referenced earlier has a slide showing how to image a VSC and Richard Drinkwater's Volume Shadow Copy Forensics post from a few years ago shows imaging VSCs as well. The second popular approach doesn’t use imaging; it copies data out of each VSC and then examines that data. The QCCIS white paper referenced earlier outlines this approach using the robocopy program, as does Richard Drinkwater in his posts here and here. Both approaches are feasible for examining VSCs but another approach is to examine the data directly inside VSCs, bypassing the need for imaging and copying. The Ripping VSCs approach examines data directly inside VSCs and there are two methods to implement it: the Practitioner Method and the Developer Method.

Ripping VSCs: Practitioner Method

The Practitioner Method uses one’s existing tools to parse data inside VSCs. This means someone doesn’t have to learn a new tool or learn a programming language to write their own tools. All that’s required is for the tool to be command line and for the practitioner to be willing to execute the tool multiple times against the same data. The picture below shows how the Practitioner Method works.

Practitioner Method Process

Troy Larson demonstrated how a symbolic link can be used to provide access to VSCs. The mklink command can create a symbolic link to a VSC which then provides access to the data stored in the VSC. The Practitioner Method uses the access provided by the symbolic link to execute one’s tools directly against the data. The picture above illustrates a tool executing against the data inside Volume Shadow Copy 19 by traversing through a symbolic link. One could quickly determine the differences between VSCs, parse registry keys in VSCs, examine the same document at different points in time, or track a user’s activity to see what files were accessed. Examining VSCs can become tedious when one has to run the same command against multiple symbolic links to VSCs; this is especially true when dealing with 10, 20, or 30 VSCs. A faster and more efficient way is to use batch scripting to automate the process. With only a basic understanding of batch scripting (knowing how a For loop works), one can create powerful tools to examine VSCs. In future posts I’ll cover how simple batch scripts can be leveraged to rip data from any VSC within seconds.
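
The loop described above can be sketched in a few lines. The series itself uses batch scripting; the Python below is only an illustration of the same idea and is not the author's script. It builds, for each VSC number, a mklink command that creates the symbolic link, a tool command that runs through that link, and a command that removes the link afterwards. The link location `C:\vsc<N>`, the function names, and the example `dir` command are assumptions chosen for illustration.

```python
# Device prefix that Windows exposes for shadow copies; the number is appended.
VSC_DEVICE = "\\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy"


def mklink_command(link_dir, vsc_number):
    """Build the cmd.exe command that links link_dir to a shadow copy."""
    # The trailing backslash on the target is needed for the link to resolve.
    return "mklink /d {} {}{}\\".format(link_dir, VSC_DEVICE, vsc_number)


def practitioner_commands(tool_template, vsc_numbers, link_root="C:\\vsc"):
    """Per VSC: create the link, run the tool through it, remove the link."""
    commands = []
    for number in vsc_numbers:
        link = "{}{}".format(link_root, number)
        commands.append(mklink_command(link, number))
        commands.append(tool_template.format(link=link, n=number))
        # rmdir on a directory symbolic link removes only the link itself.
        commands.append("rmdir {}".format(link))
    return commands


# Example: list the Users folder inside VSCs 19 and 20.
for command in practitioner_commands("dir {link}\\Users > users-vsc{n}.txt", [19, 20]):
    print(command)
```

Because mklink is built into cmd.exe, each generated line would be run from an elevated command prompt or through `cmd /c`; the sketch only prints the commands so they can be reviewed first.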

Ripping VSCs: Developer Method

I’ve been using the Practitioner Method for some time now against VSCs on live systems and forensic images. The method has enabled me to see data in different ways which was vital for some of my work involving Windows 7 systems. Recently I figured out a more efficient way to examine data inside VSCs. The Developer Method can examine data inside VSCs directly which bypasses the need to go through a symbolic link. The picture below shows how the Developer Method works.

Developer Method Process

The Developer Method programmatically accesses the data directly inside of VSCs. The majority of existing tools cannot do this natively so one must modify existing tools or develop their own. I used the Perl programming language to demonstrate that the Developer Method for ripping VSCs is possible. I created simple Perl scripts to read files inside a VSC and I modified Harlan’s lslnk.pl to parse Windows shortcut files inside a VSC. Unlike the Practitioner Method, at the time of this post I have not extensively tested the Developer Method. I’m discussing the Developer Method not only for completeness when explaining the Ripping VSCs approach but also in the hope that releasing my research early will help spur the development of DFIR tools for examining VSCs.
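
To make the idea concrete, here is a small sketch of direct access written in Python rather than the Perl used in the series. It is not the author's code. It assumes that a file inside a VSC can be opened by prefixing its path with the shadow copy's device path, and it reads the first four bytes of a file (a Windows shortcut file begins with the bytes 4C 00 00 00). The function names are hypothetical, and the demonstration at the bottom uses a temporary folder as a stand-in for the VSC so that it runs anywhere.

```python
import os
import tempfile


def vsc_root(vsc_number):
    """Device path of a shadow copy, usable as a path prefix on Windows."""
    return "\\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy{}".format(vsc_number)


def read_file_header(root, relative_path, n_bytes=4):
    """Read the first n_bytes of a file located under root."""
    with open(os.path.join(root, relative_path), "rb") as handle:
        return handle.read(n_bytes)


# Stand-in demonstration: a temporary folder plays the role of the VSC root.
with tempfile.TemporaryDirectory() as fake_root:
    with open(os.path.join(fake_root, "example.lnk"), "wb") as handle:
        handle.write(b"L\x00\x00\x00payload")
    print(read_file_header(fake_root, "example.lnk"))

# Against a real shadow copy the call would look like this (untested here):
# read_file_header(vsc_root(19), "Users\\someuser\\Desktop\\example.lnk")
```

No symbolic link is involved; the program names the VSC in the path it opens, which is the difference from the Practitioner Method.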

What’s Up Next?

Volume Shadow Copies have been a gold mine for me on the couple of corporate cases where they were available. The VSCs enabled me to successfully process the cases and that experience is what pushed me towards a different approach to examining VSCs. This approach was to parse the data while it is still stored inside the VSCs. I’m not the only DFIR practitioner looking at examining VSCs in this manner. Stacey Edwards shared in her post Volume Shadow Copies and LogParser how she runs the program logparser against VSCs by traversing through a symbolic link. Rob Lee shared his work on Shadow Timelines where he creates timelines and lists deleted files in VSCs by executing the Sleuthkit directly against VSCs. Accessing VSCs’ data directly can reduce examination time while enabling a DFIR practitioner to see data temporally. Ripping Volume Shadow Copies is a six part series and the remaining five posts will explain the Practitioner and Developer methods in depth.

        Part 1: Ripping Volume Shadow Copies - Introduction
        Part 2: Ripping VSCs - Practitioner Method
        Part 3: Ripping VSCs - Practitioner Examples
        Part 4: Ripping VSCs - Developer Method
        Part 5: Ripping VSCs - Developer Example
        Part 6: Examining VSCs with GUI Tools

Dual Purpose Volatile Data Collection Script

Monday, January 2, 2012 Posted by Corey Harrell
When responding to a potential security incident a capability is needed to quickly triage the system to see what's going on. Is a rogue process running on the system? Who's currently logged onto the system? What other systems are trying to connect over the network? How do I document the actions I took on the system? These are valid questions during incident response whether the response is for an actual event or a simulation. One area to examine to get answers is the system's volatile data. Automating the collection of volatile data can save valuable time which in turn helps analysts examine the data faster in order to get answers. This post briefly describes (and releases) the Tr3Secure volatile data collection script I wrote.

Tr3Secure needed a toolset for responding to systems during attack simulations and one of the tools had to quickly collect volatile data on a system (I previously discussed what Tr3Secure is here). However, the volatile data collection tool had to provide dual functions. First and foremost it had to properly preserve and acquire data from live systems. The toolset is initially being used in a training environment but the tools and processes we are learning need to be able to translate over to actual security incidents. What good is mastering a collection tool that can’t be used during live incident response activities? The second required function was that the tool had to help with training people on examining volatile data. Tr3Secure members come from different information security backgrounds so not every member will be knowledgeable about volatile data. Collecting data is one thing but people will eventually need to understand what the data means. The DFIR community has a few volatile data collection scripts but none of the scripts I found provided the dual functionality for practical and training usage. So I went ahead and wrote a script to meet our needs.

Practical Usage

These were some considerations taken into account to ensure the script is scalable to meet the needs for volatile data collection during actual incident response activities.

        Flexibility

Different responses will have different requirements on where to store the volatile data that’s collected. At times the data may be stored on the same drive where the DFIR toolset is located while at other times the data may be stored to a different drive. I took this into consideration and the volatile data collection script allows for the output data to be stored on a drive of choice. If someone prefers to run their tools from a CD-ROM while someone else works with a large USB removable drive then the script can be used by both of them.

        Organize Output
 
Troy Larson posted a few lines of code from his collection script to the Win4n6 group some time ago. One thing I noticed about his script was that he organized the output data based on a case number. I incorporated his idea into my script; a case number needs to be entered when the script is run on a system. A case folder enables data collected from numerous systems to be stored in the same folder (folder is named Data-Case#). In addition to organizing data into a case folder, the actual volatile data is stored in a sub-folder named after the system the data came from (system's computer name is used to name the folder). To prevent overwriting data by running the script multiple times on the same system I incorporated a timestamp into the folder name (two digit month, day, year, hour, and minute). Appending a timestamp to the folder name means the script can execute against the same system numerous times and all of the volatile data is stored in separate folders. Lastly, the data collected from the system is stored in separate sub-folders for easier access. The screenshot below shows the data collected for Case Number 100 from the system OWNING-U on 01/01/2012 at 15:46.
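
The folder layout described above can be expressed as a short sketch. The script itself is a batch file; the Python below only illustrates the naming scheme. The case folder name (Data-Case#), the computer name, and the two digit month, day, year, hour, and minute come from the description above, while the hyphen between the computer name and the timestamp, the lack of separators inside the timestamp, and the function name `collection_folder` are assumptions.

```python
from datetime import datetime
from pathlib import PureWindowsPath


def collection_folder(data_drive, case_number, computer_name, when):
    """Build <drive>\\Data-<case>\\<computer>-<MMDDYYHHMM> as a Windows path."""
    # Two digit month, day, year, hour, and minute, with no separators (assumed).
    stamp = when.strftime("%m%d%y%H%M")
    case_folder = PureWindowsPath(data_drive + "\\") / "Data-{}".format(case_number)
    return case_folder / "{}-{}".format(computer_name, stamp)


# Case 100, system OWNING-U, collected on 01/01/2012 at 15:46.
print(collection_folder("E:", 100, "OWNING-U", datetime(2012, 1, 1, 15, 46)))
```

Running the collection again a minute later yields a different folder name, which is what keeps repeated runs against the same system from overwriting each other.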


        Documentation

Automating data collection means that documentation can be automated as well. The script documents everything in a collection log. Each case has one collection log so regardless of whether data is collected from one or ten systems an analyst will only have to worry about reviewing one log.


The following information is documented both to the screen for an analyst to see and to a collection log file: case number, examiner name, target system, user account used to collect data, drives for tools and data storage, time skew, and program execution. The script prompts the analyst for the case number, their name, and the drive to store data on. This information is automatically stored in the collection log so the analyst doesn’t have to worry about maintaining documentation elsewhere. In addition, the script prompts the analyst for the current date and time which is used to record the time difference between the system and the actual time. Every program executed by the script is recorded in the collection log along with a timestamp of when the program executed. This will make it easier to account for artifacts left on a system if the system is examined after the script is executed. The screenshot below shows part of the collection log for the data collected from the system OWNING-U.
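
Two of the documented items, the time skew and the timestamped record of each program executed, are easy to sketch. The Python below is an illustration only; the log line layout and the function names `time_skew` and `log_entry` are assumptions and do not reflect the script's actual log format.

```python
from datetime import datetime


def time_skew(system_time, actual_time):
    """Difference between the system clock and the time the analyst entered."""
    return system_time - actual_time


def log_entry(when, message):
    """One collection log line: timestamp first, then what happened."""
    return "{:%m/%d/%Y %H:%M:%S}  {}".format(when, message)


system_clock = datetime(2012, 1, 1, 15, 46, 0)
analyst_clock = datetime(2012, 1, 1, 15, 44, 30)

skew = time_skew(system_clock, analyst_clock)
print(log_entry(system_clock, "Time skew in seconds: {:.0f}".format(skew.total_seconds())))
print(log_entry(datetime(2012, 1, 1, 15, 46, 5), "Executing: tasklist /v"))
```

Recording the skew once at the start lets every later timestamp in the log be corrected to actual time during analysis.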


        Preservation

RFC 3227’s Order of Volatility outlines that evidence should be collected starting with the most volatile then proceeding to the less volatile. The script takes into account the order of volatility during data collection. When all data is selected for collection, the memory is first imaged then volatile data is collected followed by collecting non-volatile data. The volatile data collected is: process information, network information, logged on users, open files, clipboard, and then system information. The non-volatile data collected is installed software, security settings, configured users/groups, system's devices, auto-runs locations, and applied group policies. Another item the script incorporated from Troy Larson’s comment in the Win4n6 group is preserving the prefetch files before volatile data is collected. I never thought about this before I read his comment but it makes sense. Volatile data gets collected by executing numerous programs on a system and these actions can overwrite the existing prefetch files with new information or files. Preserving the prefetch files upfront ensures analysts will have access to most of the prefetch files that were on the system before the collection occurred (four prefetch files may be overwritten before the script preserves them). The script uses robocopy to copy the prefetch files so the file system metadata (timestamps, NTFS permissions, and file ownership) is collected along with the files themselves. The screenshot below shows the preserved files for system OWNING-U.
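
The collection order described above can be written down as a simple ordered list, with the prefetch preservation placed first. This Python sketch is illustrative and is not taken from the script. The step names paraphrase the paragraph above, the function names are hypothetical, and the robocopy switches shown (/COPY:DATSO to copy data, attributes, timestamps, NTFS security, and owner; /R:0 to skip retries) are an assumption about how such a copy could be done rather than the switches the script actually uses.

```python
# Most volatile first; prefetch files are preserved before anything else runs.
COLLECTION_ORDER = [
    "prefetch files",
    "memory image",
    "process information",
    "network information",
    "logged on users",
    "open files",
    "clipboard",
    "system information",
    "installed software",
    "security settings",
    "configured users/groups",
    "system devices",
    "auto-run locations",
    "applied group policies",
]


def plan(selected):
    """Return the selected steps in order-of-volatility sequence."""
    return [step for step in COLLECTION_ORDER if step in selected]


def prefetch_copy_command(destination, windows_dir="C:\\Windows"):
    """Build a robocopy command that keeps file system metadata (assumed flags)."""
    return ["robocopy", windows_dir + "\\Prefetch", destination, "/COPY:DATSO", "/R:0"]


print(plan({"system information", "memory image", "installed software"}))
print(" ".join(prefetch_copy_command("E:\\Data-100\\prefetch")))
```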


        Tools Executed

The readme file accompanying the script outlines the various programs used to collect data. The programs include built-in Windows commands and third party utilities. The screenshot below shows the tools folder where the third party utilities are stored.




I’m not going to discuss every program but I at least wanted to highlight a few. The Windows diskpart command allows for disks, partitions, and volumes to be managed through the command line. The script leverages diskpart to make it easy for an analyst to see what drives and volumes are attached to a system. Hopefully, the analyst won’t need to open up Windows Explorer to see what the removable media drive mappings are since the script displays the information automatically as shown below. Note, to make diskpart work a text file named diskpart_commands.txt needs to be created in the tools folder and the file needs to contain these two commands on separate lines: list disk and list volume.
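
The diskpart_commands.txt file mentioned in the note can be created by hand in a text editor. As an illustration only, this Python sketch writes the same two lines and builds the diskpart invocation that reads them through diskpart's /s switch. The function names are hypothetical, and the temporary folder stands in for the tools folder.

```python
import os
import tempfile


def write_diskpart_script(tools_dir):
    """Create diskpart_commands.txt with the two commands on separate lines."""
    path = os.path.join(tools_dir, "diskpart_commands.txt")
    with open(path, "w") as handle:
        handle.write("list disk\nlist volume\n")
    return path


def diskpart_command(script_path):
    """diskpart runs the commands in a text file when given the /s switch."""
    return ["diskpart", "/s", script_path]


# A temporary folder stands in for the tools folder in this demonstration.
with tempfile.TemporaryDirectory() as tools_dir:
    script_path = write_diskpart_script(tools_dir)
    print(" ".join(diskpart_command(script_path)))
```

diskpart requires administrative rights, so the command is only printed here rather than executed.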


Mandiant’s Memoryze is used to obtain a forensic image of the system’s memory. Memoryze supports a wide range of Windows operating systems which makes the script more versatile for dumping RAM. The key reason the script uses Memoryze is that it’s the only free memory imaging program I found that allows an image to be stored in a folder of your choice. Most programs will place the memory image in the same folder where the command line is opened. This wouldn’t work because the image would be dropped in the folder where the script is located instead of the drive the analyst wants. Memoryze uses an xml configuration file to image RAM so I borrowed a few lines of code from the MemoryDD.bat batch file to create the xml file for the script. Note, the script only needs memoryze.exe; to obtain the exe install Memoryze on a computer then just copy memoryze.exe to the Tools folder.

PXServer’s Winaudit program obtains the configuration information from a system and I first became acquainted with the program during my time performing vulnerability assessments. The script uses Winaudit to collect some non-volatile data including the installed software, configured users/groups, and computer devices. Winaudit is capable of collecting a lot more information so it wouldn’t be that hard to incorporate the additional information by modifying the script.

Training Usage

These were the two items put into the script to assist with training members on performing incident response system triage.

        Ordered Output Reports

The script collects a wealth of information about a system and this may be overwhelming to analysts new to examining volatile data. For example, the script produces six different reports about the processes running on a system. A common question when faced with so many reports is how they should be reviewed. The script’s output reports are numbered in the suggested order for them to be reviewed. This provides a little assistance to analysts until they develop their own process for examining the data. The screenshots below show the process reports in the output folder and those reports opened in Notepad++.
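
If the numbered reports need to be processed by a program rather than read by eye, sorting on the leading number keeps the suggested review order (a plain alphabetical sort would put 10 before 2). The Python below is a small illustration; the report file names are invented for the example and are not the script's actual output names.

```python
import re


def review_order(report_names):
    """Sort report names by their leading number; unnumbered names go last."""
    def key(name):
        match = re.match(r"(\d+)", name)
        number = int(match.group(1)) if match else float("inf")
        return (number, name)
    return sorted(report_names, key=key)


# Hypothetical report names, used only to show the ordering.
reports = ["10_extra.txt", "2_process_tree.txt", "1_process_list.txt", "notes.txt"]
print(review_order(reports))
```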




        Understanding Tool Functionality and Volatile Data

The script needs to help people better understand what the collected data means about the system where it came from. Two great references for collecting, examining, and understanding volatile data are Windows Forensic Analysis, 2nd edition and Malware Forensics: Investigating and Analyzing Malicious Code. I used both books when researching and selecting the script’s tools to collect volatile data. What better way to help someone understand the tools or data than by directing them to references that explain it? I placed comments in the script containing the page numbers in both books where a specific tool is discussed and the data is explained. The screenshot below shows the portion of the script that collects process information and the references are highlighted in red.



Releasing the Tr3Secure Volatile Data Collection Script

There are very few things I do forensically that I think are cool; this script happens to be one of them. There are not many tools or scripts that work as intended while at the same time providing training. People who have more knowledge about volatile data can hit the ground running with the script investigating systems. The script automates imaging memory, collecting volatile/non-volatile data, and documenting every action taken on the system. People with less knowledge can leverage the tool to learn how to investigate systems. The script collects data then the ordered output and references in the comments can be used to interpret the data. Talk about killing two birds with one stone.

The following is the location to the zip file containing the script and the readme file (download link is here). Please be advised, a few programs the script uses require administrative rights to run properly.

Enjoy and Happy Hunting ...

***** Update *****

The Tr3Secure script has been updated and you can read about the changes in the post Tr3Secure Data Collection Script Reloaded

***** Update *****