Finding Fraudulent Documents Preview

Wednesday, May 16, 2012 Posted by Corey Harrell 2 comments
Anyone who looks at the topics I discuss on my blog may not easily see the kind of cases I frequently work at my day job. For the most part my blog is a reflection of my interests, the topics I’m trying to learn more about, and what I do outside of my employer. As a result, I don’t blog much about the fraud cases I support but I’m ready to share a technique I’ve been working on for some time.

Next month I’m presenting at the SANS Forensic and Incident Response Summit being held in Austin, Texas. The summit dates are June 26 and 27. I’m one of the speakers in the SANS 360 slot and the title of my talk is “Finding Fraudulent Word Documents in 360 Seconds” (here is the agenda). My talk is going to be quick and dirty, covering a technique I honed last year to find fraudulent documents. I’m writing a more detailed paper on the technique as well as a query script to automate finding these documents but my presentation will cover the fundamentals. Specifically, what I mean by fraudulent documents, types of frauds, Microsoft Word metadata, Corey’s guidelines, and the technique in action. Here’s a preview of what I hope to cover in my six minutes (subject to change once I put together the slides and figure out my timing).

What exactly are fraudulent documents? You need to look at the two words separately to see what I’m referring to. One definition for fraudulent is “engaging in fraud; deceitful” while a definition for document is “a piece of written, printed, or electronic matter that provides information or evidence or that serves as an official record”. What I’m talking about is electronic matter that provides information or serves as an official record while engaging in fraud. In easier terms and the way I describe it: electronic documents providing fake financial information. There are different types of fraud which means there are different types of fraudulent documents. However, my technique is geared towards finding the electronic documents used to commit purchasing fraud and bid rigging.

There are a few different ways these frauds can be committed but there are times when Microsoft Word documents are used to provide fake information. One example is an invoice for a product that was never purchased, created to conceal misappropriated money. As most of us know, electronic files contain metadata and Word documents are no different. There are values within Word documents’ metadata that provide strong indicators of whether a document is questionable. I did extensive testing to determine how these values change based on different actions taken against a document (Word versions 2000, 2003, and 2007). My testing showed the changes in the metadata are consistent based on the action. For example, if a Word document is modified then specific values in the metadata change while other values remain the same.
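
To make the idea of Word metadata a little more concrete, here is a minimal Python sketch that dumps some of the built-in properties from a Word 2007 (.docx) file. It assumes the standard Office Open XML layout, where a .docx is a zip container holding docProps/core.xml and docProps/app.xml. The function name and the choice of fields are my own for illustration; this is not the guidelines or the query script mentioned in this post, and it does not handle the older binary .doc format used by Word 2000 and 2003.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

# Friendly name -> element inside docProps/core.xml (illustrative selection).
CORE_FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "last_printed": "cp:lastPrinted",
}

# Friendly name -> element inside docProps/app.xml (illustrative selection).
APP_FIELDS = {
    "total_edit_minutes": "ep:TotalTime",
    "application": "ep:Application",
    "company": "ep:Company",
}


def _read_fields(archive, part, fields):
    """Return the requested fields from one XML part, or None for each if absent."""
    if part not in archive.namelist():
        return {name: None for name in fields}
    root = ET.fromstring(archive.read(part))
    return {
        name: root.findtext(path, default=None, namespaces=NS)
        for name, path in fields.items()
    }


def docx_metadata(docx):
    """Collect core and extended properties from a .docx path or file object."""
    with zipfile.ZipFile(docx) as archive:
        meta = _read_fields(archive, "docProps/core.xml", CORE_FIELDS)
        meta.update(_read_fields(archive, "docProps/app.xml", APP_FIELDS))
    return meta
```

Calling docx_metadata("some-invoice.docx") (a made-up file name) returns a dictionary of values; any property missing from the file comes back as None.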

I combined the information I learned from my testing with all the different fraudulent documents I’ve examined and I noticed distinct patterns. These patterns can be leveraged to identify potential fraudulent documents among electronic information. I’ve developed some guidelines to find these patterns in Word documents’ metadata. I’m not discussing the guidelines in this post since I’m saving them for my #DFIRSummit presentation and my paper. The last piece is tying everything together by doing a quick run-through of how the technique can quickly find fraudulent documents for a purchasing fraud. Something I’m hoping to include is my current work on automating the technique using a query script I’m writing and someone else’s work (I’m not mentioning who since it's not my place).

I’m pretty excited to finally have the chance to go to my first summit and there’s a great lineup of speakers. I was half joking on Twitter when I said it seems like the summit is the DFIR Mecca. I said half because it’s pretty amazing to see who else will be attending.


More About Volume Shadow Copies

Tuesday, May 8, 2012 Posted by Corey Harrell 0 comments

CyberSpeak Podcast About Volume Shadow Copies

I recently had the opportunity to talk with Ovie about Volume Shadow Copies (VSCs) on his CyberSpeak podcast. It was a great experience to meet Ovie and see what it’s like behind the scenes. (I’ve never been on a podcast before and I found out quickly how tough it is to explain something technical without visuals.) The CyberSpeak episode May 7 Volume Shadow Copies is online and in it we talk about examining VSCs. In the interview I mentioned a few different things about VSCs and I wanted to elaborate on a few of them. Specifically, I wanted to discuss running the RegRipper plugins to identify volumes with VSCs, using the Sift to access VSCs, comparing a user profile across VSCs, and narrowing down the VSC comparison reports with Grep.

Determining Volumes with VSCs and What Files Are Excluded from VSCs

One of my initial steps on an examination is to profile a system so I can get a better idea about what I’m facing. The information I look at includes: basic operating system info, user accounts, installed software, networking information, and data storage locations. I do this by running RegRipper in a batch script to generate a custom report containing the information I want. I blogged about this previously in the post Obtaining Information about the Operating System and I even released my RegRipper batch script (general-info.bat). I made some changes to the batch script; specifically, I added two VSC plugins. The first plugin obtains the volumes monitored by the Volume Shadow Copy service, which indicates what volumes may have VSCs available. The second plugin gets a list of files/folders that are not included in the VSCs (snapshots). The information the VSC plugins provide is extremely valuable to know early in an examination since it impacts how I may do things.

While I’m talking about RegRipper, Harlan released RegRipper version 2.5 in his post RegRipper: Update, Road Map and further explained how to use the new RegRipper to extract info from VSCs in the excellent post Approximating Program Execution via VSC Analysis with RegRipper. RegRipper is an awesome tool and is one of the few tools I use on every single case. The new update lets RR run directly against VSCs, making it even better. That’s like putting bacon on top of bacon.

Using the Sift to Access VSCs

There are different ways to access VSCs stored within an image. Two potential ways are using Encase with the PDE module or the VHD method. Some time ago Gerald Parsons contacted me about another way to access VSCs; he refers to it as the iSCSI Initiator Method. The method uses a combination of the Windows 7 iSCSI Initiator and the Sift workstation. I encouraged Gerald to do a write-up about the method but he was unable to because of time constraints. However, he said I could share the approach and his work with others. In this section of my post I’m only a ghost writer for Gerald Parsons and I’m only conveying the detailed information he provided me, including his screenshots. I only made one minor tweak, which is to provide additional information about how to access a raw image besides the e01 format.

The iSCSI Initiator Method requires a virtual machine running an iSCSI service (I used the Sift workstation inside VMware) and a host operating system running Windows 7. The method involves the following steps:

Sift Workstation Steps

1. Provide access to image in raw format
2. Enable the SIFT iSCSI service
3. Edit the iSCSI configuration file
4. Restart the iscsitarget service

Windows 7 Host Steps

5. Search for iSCSI to locate the iSCSI Initiator program
6. Launch the iSCSI Initiator
7. Enter the Sift IP Address and connect to image
8. Examine VSCs

Sift Workstation Steps

1. Provide access to image in raw format

A raw image needs to be available within the Sift workstation. If the forensic image is already in the raw format and is not split then nothing else needs to be done. However, if the image is a split raw image or is in the e01 format then one of the following commands needs to be used so a single raw image is available.

Split raw image:

sudo affuse path-to-image mount_point

E01 Format use:

sudo path-to-image mount_point

2. Enable the SIFT iSCSI service

By default, the iSCSI service is turned off in Sift 2.1 so it needs to be turned on. The false value in the /etc/default/iscsitarget configuration file needs to be changed to true. The command below uses the Gedit text editor to accomplish this.

sudo gedit /etc/default/iscsitarget

(Change “false” to “true”)

3. Edit the iSCSI configuration file

The iSCSI configuration file needs to be edited so it points to your raw image. Edit the /etc/ietd.conf configuration file by performing the following (the first command opens the config file in the text editor Gedit):

sudo gedit /etc/ietd.conf

Comment out the following line by adding the # symbol in front of it:


Add the following two lines. The date (2011-04 here) can be whatever you want, but make sure the image path points to your raw image:

Target iqn.2011-04.sift:storage.disk
Lun 0 Path=/media/path-to-raw-image,Type=fileio,IOMode=ro
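
If you set this up often, the two lines above can be generated instead of typed by hand. The following is only a small Python sketch of that idea; the function name and its parameters are made up for illustration, and it simply reproduces the Target and Lun lines shown above with a different image path or date.

```python
def ietd_target_stanza(image_path, date="2011-04", name="sift:storage.disk"):
    """Build the two ietd.conf lines for a read-only, file-backed target.

    image_path is the single raw image inside the Sift workstation.
    date and name only affect the target's iqn label.
    """
    if not image_path.startswith("/"):
        raise ValueError("image_path should be an absolute path")
    target_line = "Target iqn.{0}.{1}".format(date, name)
    # IOMode=ro exports the image read-only, as in the lines above.
    lun_line = "Lun 0 Path={0},Type=fileio,IOMode=ro".format(image_path)
    return target_line + "\n" + lun_line + "\n"
```

Printing ietd_target_stanza("/media/path-to-raw-image") gives the same two lines as above, ready to paste into /etc/ietd.conf.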

4. Restart the iscsitarget service

Restart the iSCSI service with the following command:

sudo service iscsitarget restart

Windows 7 Host Steps

5. Search for iSCSI to locate the iSCSI Initiator program

Search for the Windows 7 built-in iSCSI Initiator program

6. Launch the iSCSI Initiator

Run the iSCSI Initiator program

7. Enter the Sift IP Address and connect to image

The Sift workstation will need a valid IP address and the Windows 7 host must be able to connect to the Sift using it. Enter the Sift’s IP address then select Quick Connect.

A status window should appear showing a successful connection.
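
If the connection does not succeed, it can help to confirm the host can reach the iSCSI service at all before troubleshooting anything else. Below is a rough Python sketch of such a check. It assumes the service is listening on TCP port 3260, the standard iSCSI port (the Sift could be configured differently), and the function name is my own. It only tests that the port accepts a connection; it does not log in to the target.

```python
import socket

ISCSI_PORT = 3260  # standard iSCSI target port; an assumption about the Sift setup


def target_reachable(host, port=ISCSI_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, target_reachable("192.168.1.50") would check a Sift workstation at that (hypothetical) address from the Windows 7 host.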

8. Examine VSCs

Windows automatically mounts the forensic image’s volumes to the host after a successful iSCSI connection to the Sift. In my testing it took about 30 seconds for the volumes to appear once the connection was established. The picture below shows Gerald’s host system with two volumes from the forensic image mounted.

If there are any VSCs on the mounted volumes then they can be examined with your method of choice (cough cough Ripping VSCs). Gerald provided additional information about how he leverages Dave Hull’s Plotting photo location data with Bing and Cheeky4n6Monkey Diving in to Perl with GeoTags and GoogleMaps to extract metadata from all the VSCs images to create maps. He extracts the metadata by running the programs from the Sift against the VSCs.

Another cool thing about the iSCSI Initiator Method (besides being another free solution to access VSCs) is the ability to access the Sift iSCSI service from multiple computers. In my test I connected a second system on my network to the Sift iSCSI service while my Windows 7 host system was connected to it. I was able to browse the image’s volumes and access the VSCs at the same time from my host and the other system on the network. Really cool. When you are finished examining the volumes and VSCs, you can disconnect the iSCSI connection (in my testing it took about a minute to completely disconnect).

Comparing User Profile Across VSCs

I won’t repeat everything I said in the CyberSpeak podcast about my process to examine VSCs and how I focus on the user profile of interest. Focusing on the user profile of interest within VSCs is very powerful because it can quickly identify interesting files and highlight which files/folders a user accessed. Comparing a user profile or any folder across VSCs is pretty simple to do with my vsc-parser script and I wanted to explain how to do this.

The vsc-parser is written to compare the differences between entire VSCs. In some instances this may be needed. However, if I’m interested in what specific users were doing on a computer then the better option is to only compare the user profiles across VSCs since it’s faster and provides me with everything I need to know. You can do this by making two edits to the batch script that does the comparison. Locate the batch file named file-info-vsc.bat inside the vsc-parser folder as shown below.

Open the file with a text editor and find the function named :files-diff. The function executes diff.exe to identify the differences between VSCs. There are two lines (lines 122 and 129) that need to be modified so the file path reflects the user profile. As can be seen in the picture below the script is written to use the root of the mounted image (%mount-point%:\) and VSCs (c:\vsc%%f and c:\vsc!f!).

These paths need to be changed so they reflect the user profile location. For example, let's say we are interested in the user profile named harrell. Both lines just need to be changed to point to the harrell user profile. The screenshot below now shows the updated script.

When the script executes diff.exe, the comparison reports are placed into the Output folder. The picture below shows the reports for comparing the harrell user profile across 25 VSCs.
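
To illustrate what comparing one folder across two VSCs boils down to, here is a short Python sketch. It is not how vsc-parser works (the script calls diff.exe); it is just a stand-in that walks two folder trees and reports files that exist in only one of them or whose size or modified time differs. The function names are made up, and the example paths in the note after the code are hypothetical.

```python
import os


def snapshot(root):
    """Map each file's path relative to root to its (size, modified time)."""
    entries = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            try:
                info = os.stat(full)
            except OSError:
                continue  # skip files that cannot be read
            rel = os.path.relpath(full, root)
            entries[rel] = (info.st_size, int(info.st_mtime))
    return entries


def compare_profiles(older_root, newer_root):
    """Compare the same folder as seen in an older and a newer VSC."""
    older = snapshot(older_root)
    newer = snapshot(newer_root)
    common = set(older) & set(newer)
    return {
        "only_in_newer": sorted(set(newer) - set(older)),
        "only_in_older": sorted(set(older) - set(newer)),
        "changed": sorted(p for p in common if older[p] != newer[p]),
    }
```

Assuming two VSCs are exposed as c:\vsc11 and c:\vsc12, a call such as compare_profiles(r"c:\vsc11\Users\harrell", r"c:\vsc12\Users\harrell") would list what changed in the harrell profile between the two.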

Reducing the VSCs Comparison Reports

When comparing a folder such as a user profile across VSCs there will be numerous differences that are not relevant to your case. One example could be the activity associated with Internet browsing. The picture below illustrates this by showing the report comparing VSC12 to VSC11.

The report showing the differences between VSC12 and VSC11 had 720 lines. Looking at the report you can see there are a lot of lines that are not important. A quick way to remove them is to use grep.exe with the -v switch to only display non-matching lines. I wanted to remove the lines in my report involving the Internet activity. The folders I wanted to get rid of were: Temporary Internet Files, Cookies, Internet Explorer, and History.IE5. I also wanted to get rid of the activity involving the AppData\LocalLow\CryptnetUrlCache folder. The command below shows how I stacked my grep commands to remove these lines and I saved the output into a text file named reduced_files-diff_vsc12-2-vsc11.txt.

grep.exe -v "Temporary Internet Files" files-diff_vsc12-2-vsc11.txt | grep.exe -v Cookies | grep.exe -v "Internet Explorer" | grep.exe -v History.IE5 | grep.exe -v CryptnetUrlCache > reduced_files-diff_vsc12-2-vsc11.txt

I reduced the report from 720 lines to 35. It’s good practice to look at the report again to make sure no obvious lines were missed before running the same command against the other VSC comparison reports. Stacking grep commands to reduce the amount of data to look at makes it easier to spot items of potential interest such as documents or Windows link files. It’s pretty easy to see that the harrell user account was accessing a Word document template, an image named staples, and a document named Invoice-#233-Staples-Office-Supplies in the reduced_files-diff_vsc12-2-vsc11.txt report shown below.

I compare user profiles across VSCs because it’s a quick way to identify data of interest inside VSCs. This holds regardless of whether the data is images, documents, user activity artifacts, email files, or anything else that may be stored inside a user profile or that a user account accessed.
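
For anyone who would rather not chain grep.exe by hand for every report, the stacked grep -v filtering from the previous section can be approximated in a few lines of Python. This is only a sketch: the function names are my own, and unlike grep it does plain case-sensitive substring matching rather than regular expressions, which is enough for the folder names used here.

```python
# Folder names filtered out with the stacked `grep -v` commands.
NOISE_PATTERNS = (
    "Temporary Internet Files",
    "Cookies",
    "Internet Explorer",
    "History.IE5",
    "CryptnetUrlCache",
)


def reduce_report(lines, patterns=NOISE_PATTERNS):
    """Keep only the lines that contain none of the given substrings."""
    return [line for line in lines if not any(p in line for p in patterns)]


def reduce_report_file(src_path, dst_path, patterns=NOISE_PATTERNS):
    """Filter one comparison report into a reduced copy; return lines kept."""
    with open(src_path, "r", errors="replace") as src:
        kept = reduce_report(src, patterns)
    with open(dst_path, "w") as dst:
        dst.writelines(kept)
    return len(kept)
```

Running reduce_report_file("files-diff_vsc12-2-vsc11.txt", "reduced_files-diff_vsc12-2-vsc11.txt") would mirror the command above, and the same call can be looped over the other comparison reports in the Output folder.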

Practical Malware Analysis Book Review

Thursday, May 3, 2012 Posted by Corey Harrell 0 comments
There are times when I come across malware on systems. It happens in everything from helping someone with computer troubles to processing a DFIR case to providing assistance on a security incident. It seems as if malware is frequently lurking beneath the surface. Occasionally I thought it might be helpful to know not only what the malware on those systems was up to but also what the malware was incapable of doing. Practical Malware Analysis breaks down the art of analyzing malware so you can better understand how it works and what its capabilities are. PMA is an excellent book and I highly recommend it for the following reasons: understanding malware better, training, and extending test capabilities.

Understanding Malware Better

A very telling quote from the book’s opening is “when analyzing suspected malware, your goal will typically be to determine exactly what a particular suspect binary can do, how to detect it on your network, and how to measure and contain its damage”. Practical Malware Analysis shows how to meet that goal by outlining a process to follow and the tools to use. Part 1 covers basic analysis, demonstrating how to better understand a program’s functionality by using basic static and dynamic analysis. Part 2 builds on the basic analysis by diving deeper into static analysis and analyzing the malware’s assembly code. Part 3 continues by discussing an advanced dynamic analysis technique: debugging. The book is written in a way that makes it fairly easy to follow along and understand the content about the analysis techniques. The later sections of the book (Part 4 Malware Functionality, Part 5 Anti-Reverse-Engineering, and Part 6 Special Topics) provide a wealth of information about malware and what someone may encounter during their analysis.

I don’t foresee myself becoming a malware reverse engineer. That wasn’t what I had in mind when I started reading PMA. My intentions were to learn the techniques in PMA so I could be better at my DFIR job: to quickly get intelligence when I’m examining an infected system to help explain what occurred, and to be able to rule out malware found on a system as the reason why certain actions happened on it. PMA went beyond my expectations and I can honestly say I’m better at my job because I read it.

Training

Practical Malware Analysis follows the No Starch Press practical approach, which is to reinforce content by providing data the reader can analyze as they follow along. The book provides a wealth of information about analyzing malware then follows it up with about 57 labs. The authors indicated they wrote custom programs for the book and this means there are a lot of samples to practice the malware analysis techniques on. The labs are designed so the reader has to answer specific questions by analyzing a sample and afterwards the solutions can be referenced to see the answers. A cool thing about the solutions is that there are short and long versions. The short versions only provide the answers while the long versions walk the reader through the analysis, demonstrating how the answers were obtained. The combination of the content, labs, samples, and solutions makes PMA a great self-training resource.

PMA contains so much information it’s one of those books where people can keep going back to review specific chapters. I can see myself going back over numerous chapters and redoing the labs as a way to train myself on malware analysis techniques. PMA is not only a great reference to have available when faced with malware but it’s an even greater training resource to have regular access to.

Extending Test Capabilities

The process and techniques described in PMA can be used for other analysis besides understanding malware. A friend of mine who was also reading the book (when I was working my way through it) had to take a look at a program someone in his organization was considering using. Part of his research into the program was to treat it like malware and he tried out some of the techniques described in PMA. The information he learned about the program by incorporating malware analysis techniques into his software testing process was very enlightening. I borrowed his idea and started using some PMA techniques as part of my process when evaluating software or software components. I already used it on one project and it helped us identify the networking information we were looking for. The process and tools discussed in the book helped my friend and me extend our software testing capabilities so it stands to reason it could do the same for others.

Five Star Review

PMA is another book that should be within reaching distance in anyone’s DFIR shop. I went ahead and purchased PMA hoping the book would improve my knowledge and skills when faced with malware. What I ended up with was knowledge, a process and tools I can use to analyze any program I encounter. PMA gets a five star review (5 out of 5).

One area I thought could be improved with PMA was providing more real life examples. It would have been helpful if the authors shared more of their real life experiences about analyzing malware or how the information obtained from malware analysis helped when responding to an incident. I think sharing past experiences is a great way to provide more context since it lets people see how someone else approached something.