Review of Digital Forensics with Open Source Tools

Monday, June 27, 2011 Posted by Corey Harrell 2 comments
I became involved in the digital forensics (DF) field when I had to establish and manage a DF process to support financial investigations and fraud audits. When I got to the point of identifying tools I first looked to see what resources I had at my disposal. Lo and behold my security lab had a dongle to a commercial forensic product. In the beginning I exclusively used a few commercial products to perform forensics but over time I added additional tools to my arsenal to expand my capability. I’m bringing up my background since the intended audience for Digital Forensics with Open Source Tools (DFwOST) is new forensic practitioners and experienced DF practitioners new to open source tools. My review of DFwOST is coming from the perspective of an experienced DF practitioner who may rely on a few (or single) commercial tools during examinations.

Before diving into the world of open source tools, DFwOST starts out by defining digital forensics and explaining the goal of any examination, which is for an examiner to locate artifacts indicating whether a hypothesis is true or false. DFwOST then covers the three analysis types used during an examination: system, application, and file. DFwOST explains how to perform each type of analysis by describing the data, the potential artifacts of interest located in the data, and the open source tools to use against the data. The system analysis covers partitioning and disk layouts of physical storage devices. In addition, DFwOST discusses the different file types and artifacts specific to the Windows, Linux, and Mac operating systems. The application analysis explains the artifacts associated with different web browsers and mail applications. Rounding out the discussion, the file analysis covers the activities for examining the content of individual files and their metadata. The authors provide a listing of references at the end of each chapter that the reader can use to learn more about the topics DFwOST doesn't cover in great detail.

I think DFwOST will be beneficial to anyone who reads it, whether they are new to the field or an experienced practitioner. However, I think the book is a particularly great resource for experienced DF practitioners who are not familiar with open source and free digital forensic tools. My reasoning is that DFwOST can help them expand their capabilities in DF examinations, understand how commercial tools work, and identify additional tools.

Expand Capabilities in DF Examinations

Every tool has its strengths and weaknesses, and commercial tools are no different. There is not a single commercial product that has the ability to examine every possible type of data or artifact encountered during exams. This issue is one of the reasons why DF practitioners have multiple tools at their disposal. How does DFwOST fit into the picture?

First, DFwOST discusses tools and techniques that have capabilities not present in the current crop of commercial tools. The additional capability provided by open source tools can be used to complement the functionality of commercial tools. For example, chapter 9 discusses the timeline analysis technique and mentions a few tools to create timelines that include the metadata from the file system and various artifacts. In my experience, timeline analysis is a powerful technique and it has helped me on a range of different examinations, from financial investigations to human resource policy violation investigations to security incidents. The ability to generate timelines would be lost by relying solely on one or a few commercial products.
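To make the timeline idea concrete, the core of the technique can be sketched in a few lines of Python. This is an illustrative stand-in I wrote for this review, not the fls/mactime-style tooling the book actually covers: it collects each file's timestamps and sorts every event into one chronological list.

```python
import os


def build_timeline(root):
    """Collect the modified/accessed/changed (MAC) times for every file
    under root and return them as one chronologically sorted event list."""
    events = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            for label, ts in (("modified", st.st_mtime),
                              ("accessed", st.st_atime),
                              ("changed", st.st_ctime)):
                events.append((ts, label, path))
    events.sort()  # oldest activity first, so gaps and bursts stand out
    return events

# Usage: for ts, label, path in build_timeline("/mnt/evidence"): ...
```

A real timeline tool would also fold in registry, log, and browser artifacts; the point here is only that merging timestamped events into one sorted view is what makes the technique powerful.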

Understand How Commercial Tools Work

Some commercial tools automatically extract information from data, and this functionality can help reduce the time needed to complete an examination. On the downside, automation provides a layer of abstraction that may result in examiners not completely understanding the data they are seeing or how the tool works. The tools (open source and free ones in Appendix A) highlighted in DFwOST can be a great educational benefit to examiners by helping them better understand the data and how their commercial tools work, thus removing the layer of abstraction caused by automation. Not only can open source tools be run against data to see how the output differs, but the tools' various options can be tested and their code read to better understand how each tool functions. The educational benefit provided by open source tools will be helpful on any examination even if the tools are not actually used on a case.

Identify Additional Tools

DFwOST points out numerous tools to use during a digital forensic examination. Using additional tools can provide flexibility and additional resources for validation testing. At times there may be a need to conduct only a few activities, and using a multipurpose commercial tool may be overkill for the task at hand. A multipurpose tool also requires extra time to load and configure even when the task is just extracting specific information from data. The tools in DFwOST provide this kind of flexibility.

In addition to flexibility, open source tools can be used in the validation testing of commercial tools. Does XYZ commercial software extract the information from a certain type of data properly? Does XYZ commercial tool work as advertised? Both questions can be quickly verified by reproducing the results with the open source tools discussed in DFwOST.
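As a simple illustration of what I mean by reproducing results, consider file hashing: a commercial tool's reported hashes can be re-checked with nothing more than Python's standard library. This is my own minimal sketch, not something from the book:

```python
import hashlib


def md5_of(path, chunk_size=65536):
    """Hash a file in chunks so large evidence files don't exhaust memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def validate(reported):
    """Given {path: hash_reported_by_commercial_tool}, return the paths
    where an independently computed hash disagrees with the tool."""
    return [p for p, expected in reported.items()
            if md5_of(p) != expected.lower()]
```

If `validate` returns an empty list, the two tools agree; any path it returns is worth a closer look before trusting either result.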

Five Star Review

Overall DFwOST will be a welcome addition to anyone’s DFIR library. The one topic I thought was missing from the book (or I overlooked) is a mention of the process or methods to validate digital forensic tools before they are used during an examination. I don't think the authors had to go into great detail on the subject, but pointing the reader (especially people new to the field) to a few references could be helpful. Despite this, if I were posting my review on Amazon then DFwOST would get another five star rating.


Wednesday, June 22, 2011 Posted by Corey Harrell 0 comments
The links discussed include a triage model, mapping tweets, and an incident analysis write-up.

A Triage Model

Last week I attended my graduation at Norwich University. Not only was it great to finally be done with college but I had the opportunity to sit through presentations including a few on digital forensics. One of the DF presentations was given by Marc Rogers of Purdue University on the Computer Forensics Field Triage Process Model (CFFTPM). Marc’s presentation was informative and afterwards I wanted to learn more about the model so I read a whitepaper on it. For people unfamiliar with the model, the image below shows the different CFFTPM phases.

CFFTPM appears not to be technology dependent, which means the model can be used against different platforms in various types of cases. Even knowing this couldn't stop me from thinking about what impact technology may have when trying to implement the model. The majority of the systems I come across run some version of Windows, and there has been an increase in the number of Windows 7 systems I’m seeing. If I were to use the model then I would take into consideration the volume shadow copies (VSCs) on the newer Windows operating systems. For example, one of the phases in CFFTPM is the examination of the User Profiles by reviewing the following: the home directory, file properties, and registry. If the system contains VSCs then there may be user profile data at different points in time, and just triaging the data in its current state might not show an accurate picture of the system. What happens if data is deleted from the user profile prior to the triage? The examiner might not notice this occurred without taking a look at the data in VSCs.

The example holds true for the other triage phases as well. On a few recent cases VSCs contained pertinent data and if I was triaging a system then the VSCs need to be considered. A script and a few tools could parse all of the data - including data in the volume shadow copies – in a short timeframe and still allow me to review all of the information onsite.
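Once shadow copies are exposed as ordinary directories (for example, via symbolic links to their GLOBALROOT device paths), spotting files that existed in a snapshot but are gone from the live profile is just a set difference. A rough sketch of my own, assuming the current profile and a mounted VSC view of it are both readable as normal paths:

```python
import os


def relative_files(root):
    """Set of all file paths under root, expressed relative to root
    so the same file compares equal across two different mount points."""
    found = set()
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def missing_from_current(current_profile, vsc_profile):
    """Files present in the shadow-copy view of a profile but deleted
    from (or moved out of) the current, live profile."""
    return sorted(relative_files(vsc_profile) - relative_files(current_profile))
```

Running this against each available snapshot would flag exactly the situation described above: data removed from the user profile before the triage took place.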

Mapping Tweets

A few months ago the article This is What A Tweet Looks Like discussed the metadata stored in a tweet. The metadata contains a wealth of information such as the author's name, account creation date, and the author’s location. I thought about the article when I was reading Creepy, the Geolocation Information Aggregator. Creepy is a python script (still in beta form) that allows people to gather publicly available geolocation information from social networking platforms and image hosting services. The article showed how Creepy can harvest “geolocation information from Twitter in the form of geotagged tweets, foursquare check-in’s, geotagged photos posted on twitter via image hosting services such as yfrog, twitpic, plixi and others, and Flickr”.

I didn’t test the python script and my judgment is solely based on the content of the article. Creepy appears to make it extremely easy to map the geolocation information from Twitter and I can see the two sides of how this ability can be used. For investigations it might be helpful to confirm the whereabouts of the person tweeting. On the other side, this ability can make it pretty easy to stalk someone and help identify people’s patterns.

Incident Analysis Write-up

The Carnal0wange blog posted Incident Analysis: Lost Million Dollars in a Minute. The post has a link to their write-up about an incident where a victim’s online banking account was compromised and a sum of money was transferred to Eastern Europe. The analysis involved examining a forensic image of the victim’s machine and a network packet capture. Over the past few years, I’ve seen numerous articles about Trojans being leveraged to steal money, but until now I haven’t seen any public write-ups about the examination of systems infected with a banker Trojan. The write-up is interesting and I’m thankful the authors shared the information.

One of the conclusions was the “victim's machine is infected via an email by executing a malicious executable file” but the write-up didn’t cover the artifacts to point to this type of delivery mechanism. The next area I’m looking into is the artifacts left by using email as the delivery mechanism so it would have been nice to see how the artifacts looked in an actual incident. Despite this, the write-up still provides a glimpse about the analysis of a system infected with a banker Trojan.

***** Update *****
The Carnal0wange blog had a link to the incident analysis report but has since removed the link and is providing the report by email.
***** Update *****

Mass Malware Still a Threat

Speaking of attack vector artifacts… The ThreatPost article Forget APT, Mass Malware is Still the Big Threat said the “type of attacks that most enterprises see today still come from mass malware that defenders haven't yet figured out a good way to stop”. Unfortunately, there are a lot of enterprises that don’t react well to the mass malware problem. The standard operating procedure is to wipe, reimage, update, and redeploy. The one step missing is the investigation to determine how the malware got onto the system in the first place. Did an exploit target one of the following vulnerable applications: Java, Adobe Flash, Adobe Reader, QuickTime, or Internet Explorer? If so, which one, and how did the exploit get delivered to the system? Did a social engineering trick make an end user infect the system themselves? Did the malware spread through email?

The answer to the how question is crucial for an organization to decide what measures to put in place to better protect itself in the future. Do resources need to be used to improve a breakdown in the patch management process, offer security awareness refresher training to employees, implement a new security control, etc.? The attack vector artifacts left on the infected system can help an examiner determine the how of a mass malware infection.

Information Gleaned from Stolen Router

This isn't directly related to DFIR, but part of my background includes networking so articles about network security pique my interest. When I was checking out the InfoSec Resources newsletter I saw the article The Case of the Great Router Robbery, which discusses the different information that can be obtained from a stolen router and ways to reduce your exposure if a router is stolen from your organization. I don’t know how many routers are actually being stolen from organizations, but it’s good to be informed about the potential risk.

Why Is It What It Is?

Wednesday, June 8, 2011 Posted by Corey Harrell 1 comments
….. Or more specifically why is Microsoft Office metadata what it is?

Microsoft Office documents contain metadata that may be relevant to a digital forensic examination. The metadata may show when a file was created, modified, printed, or what user accounts were used to perform those actions. Others have already researched the metadata in Microsoft Office 2003 and 2007 documents including providing programs to parse the metadata. A few of the write-ups are: Kristinn Gudjonsson’s Office 2007 metadata post, and Lance Muller’s Office Metadata EnScript & Updated Office 2007 Metadata EnScript posts. I was interested in understanding how different actions taken against a Microsoft Office document affect its metadata.

To help show the relevance of Office documents’ metadata I included the metadata from one of my test Word 2007 documents. Usually I manually examine metadata on an as-needed basis, but I thought for this post it would be cleaner to show the information in report format. The text below is the output from Kristinn’s script run against a test Word document. The File Metadata section shows when the file was created, modified, and printed, and what user accounts were used to perform those actions. Did you notice the last print date/time (2011-05-27T19:09:00Z) occurred one minute before the file was even created (2011-05-27T19:10:00Z)? I pointed it out because I’ll come back to it later in the post.

Document name: E:\office metadata testing\word 2007\Xp-2007-1sp.docx
Application Metadata
Template = Normal.dotm
TotalTime = 2
Pages = 1
Words = 1
Characters = 9
Application = Microsoft Office Word
DocSecurity = 0
Lines = 1
Paragraphs = 1
ScaleCrop = false
Company = Test-lab
LinksUpToDate = false
CharactersWithSpaces = 9
SharedDoc = false
HyperlinksChanged = false
AppVersion = 12.0000
File Metadata
title =
subject =
creator = test-2007
keywords =
description =
lastModifiedBy = test-2007
revision = 2
lastPrinted = 2011-05-27T19:09:00Z
created = 2011-05-27T19:10:00Z
modified = 2011-05-27T19:10:00Z
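For readers curious where values like creator and lastModifiedBy actually live, an Office 2007 document is just a ZIP archive, and the File Metadata section above comes from the docProps/core.xml entry inside it. A bare-bones sketch of my own (this is not Kristinn's script, only an illustration of the file format):

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used in docProps/core.xml of Office Open XML documents.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def core_properties(docx_path):
    """Extract the core (File Metadata) properties from an Office 2007 file."""
    with zipfile.ZipFile(docx_path) as z:
        root = ET.fromstring(z.read("docProps/core.xml"))

    def text(tag):
        el = root.find(tag, NS)
        return el.text if el is not None else None

    return {
        "creator": text("dc:creator"),
        "lastModifiedBy": text("cp:lastModifiedBy"),
        "lastPrinted": text("cp:lastPrinted"),
        "created": text("dcterms:created"),
        "modified": text("dcterms:modified"),
    }
```

The application-level attributes (Template, TotalTime, Pages, and so on) come from a sibling entry, docProps/app.xml, and can be read the same way.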

Usernames in Microsoft Office Metadata

Before looking into how different actions against a Microsoft Office document affect its metadata, I think it is useful to know more about the usernames reflected in the creator and lastModifiedBy attributes. The usernames are not populated with the name of the user account that performed the action; instead, they come from a value in the Windows registry containing the name to use.

When Office 2003 or 2007 is installed there is a prompt asking for a user name and company. Those fields are prepopulated with the information entered when the operating system was installed, which is located in the registry key HKLM\Software\Microsoft\Windows NT\CurrentVersion (thanks Greg Kelly for this info). The prompt gives the user an opportunity to either change the user name or company or to leave the fields with the information entered during OS installation.

There is a registry key containing what user name and company was used when Microsoft Office was installed. To locate the registry key the Office program’s GUID must first be determined and this Microsoft article explains how to locate the GUID. The GUID of Microsoft Office programs I tested were 9040110900063D11C8EF10054038389C for Microsoft Office Professional Edition 2003 and 00002109110000000000000000F01FEC for Microsoft Office Professional Plus 2007. The registry key containing the information about the Office installation is: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Products\{GUID}\InstallProperties. The regowner and regcompany values in the registry key contain the user name and company entered during the Office installation.

The usernames and company information reflected in Microsoft Office documents’ metadata are pulled from the UserInfo registry key of the user account’s NTUSER.DAT hive performing the actions. The names of the two values containing the data in the registry key are UserName and Company. The location of the registry key varies depending on the version of Microsoft Office but the paths below show where the key is located for Office 2007 and 2003.

Microsoft Office 2007: HCU\Software\Microsoft\Office\Common\UserInfo
Microsoft Office 2003: HCU\Software\Microsoft\Office\11.0\Common\UserInfo

Now the question I found myself asking is: how are the UserName and Company values initially populated in the UserInfo key? I previously explained the user name and company prompt during Office installation because the entered information is used to populate the UserInfo registry key of the user account that installed Microsoft Office. For the user accounts on the system that use Microsoft Office but didn’t install it, the values are populated a little differently. The first time the user launches an Office application a dialog box appears asking for the user name and initials. The dialog box is prepopulated with the name of the currently logged on user. The information entered in the dialog box becomes the UserName value in the user's UserInfo key, while the Company value comes from the information entered when Office was installed.

The metadata shown above now has a little more meaning. The username test-2007 is not the name of the user account that created and modified the document but is the name listed in the UserInfo registry key. The name in the UserInfo registry key can be changed at any point, but any change will alter the last write time on the registry key. This means the last write time of the user account’s UserInfo key should be taken into consideration when examining metadata. If the registry key last write time is before the dates/times in the metadata (create, modify, or print), then the metadata reflects what is currently in the user account’s NTUSER.DAT hive. On the other hand, if the registry key last write time is after the metadata timestamps, then what is currently in the user account’s NTUSER.DAT hive may not be what was there when the action was taken against the Office document (did I just hear someone whisper “check the restore points or volume shadow copies for registry files”?).
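The comparison described above boils down to one timestamp check. A hypothetical helper (the function name and its Boolean framing are mine, purely to make the reasoning explicit):

```python
from datetime import datetime


def userinfo_reliable(key_last_write, metadata_times):
    """True if the UserInfo key's last write time predates every metadata
    timestamp, meaning the UserName/Company values currently in the
    NTUSER.DAT hive are the same ones stamped into the document.
    False means the key was modified after the document actions, so the
    current hive values may not be what was there at the time."""
    return all(key_last_write <= ts for ts in metadata_times)
```

When it returns False, that is the cue to go hunting for an older copy of the hive in restore points or volume shadow copies.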

How Actions Change Microsoft Office Metadata

The testing I conducted consisted of creating one document then performing different actions against copies of the document to see how the metadata changed. I only ran the tests against documents created by the following programs: Word 2007, Word 2003, Excel 2007, and Excel 2003. If I test other Office Programs in the future then I’ll update the post to reflect it. The observed changes in the documents’ metadata were consistent across all of the different versions of Office but there were some minor differences between the different file types. The Excel metadata differed from Word in the following ways: there was no revision number, some timestamps contained seconds, and the Save As function didn’t change the documents’ creation date.

I’m providing charts of how the metadata was affected by the different actions taken against the documents. The charts have information in parentheses to show what the metadata values were for one set of documents (the timestamps don’t include the date since it was the same for all of the documents).

Here’s the Microsoft Office Word Metadata Changes chart and Microsoft Office Excel Metadata Changes chart.

The charts show how different actions against a Microsoft Office document affect its metadata. There are quite a few takeaways but I’m only going to highlight a few.

* The metadata create date/time reflects when the Office program was opened as opposed to the first time the document was saved
* Copying an Office document doesn’t change the metadata
* The metadata print date/time only changes when the document is saved after it is printed
* The Save As function results in the Word metadata create and modification date/times being the same while the modification date/time only changes in Excel metadata

Now let’s go back to the metadata I posted above. Do you remember that the last print date (2011-05-27T19:09:00Z) occurred one minute before the file was even created (2011-05-27T19:10:00Z)? There was one action taken against a Microsoft Word document that produced this pattern in the metadata. The action was printing a document then using the Save As function to create a new document. The metadata shown above is from the newly created document.

Hopefully, the sharing of my test results can help others who are pondering the question “why is Microsoft Office metadata what it is”.

How Do You Use Your Skillz

Sunday, June 5, 2011 Posted by Corey Harrell 2 comments
At different times in my personal life I come across everyday people who are experiencing or know of someone having a security issue: random emails being sent from their email accounts, a link they clicked posting something to their friends' Facebook walls, or some rogue program saying their computers are infected. I expanded jIIr by setting up a Facebook page where I intend to provide security tips to help everyday people protect themselves and be safer, smarter users of the Internet. "Everyday Cyber Security" is meant to be informational and helpful to the "everyday" person, so the content is drastically different from my blog. In setting up Everyday Cyber Security I kept reflecting on how I choose to use my DFIR skillz and whether I can use my skillz to benefit others. My hope is my personal reflection will encourage you to question how you use your DFIR skillz and whether you could be doing more....

I have a certain skillset that the general public does not have. The same is true of the readers of my blog, whether they are seasoned forensicators, students studying the field, or people transitioning into the InfoSec and DFIR fields. I attained my skillset through various means: professional training, self training, researching, and learning from others who share their experience and knowledge. At times I wonder if I can use my skillset outside of my professional obligations. More importantly, I ask myself: can I use my skillz to help others in the DFIR community, the Internet community, and the communities in which I live?

I've come across some great people in the DFIR community who are more than willing to share their knowledge and tools; some I have had the pleasure to meet in person while the majority I have not. With that said, there are also people on the other end of the spectrum...those who do not share any information at all. This lack of sharing (whatever the reason) not only inhibits discussion and offers nothing to the larger DFIR community, but at times it's very discouraging to the people on the receiving end. Some time ago I asked a question about a DFIR technique. What the question was and where I asked it isn't important. What is important is the response I got to my question, which was along the lines of "with experience you'll know." There was no explanation of a process, no suggested method to carry out the technique, no discussion on how to understand the data, and not even a mention of the possible tools to use. This response left me without any references to help me answer my own question, and the other people who witnessed my question didn't have an opportunity for a discussion on the topic. Is this the example I should follow with how to use my skillz?

I attended a service this morning that is relevant to the question of "how do you use your skillz?" The message was about not being dormant and taking the opportunities to help others. How does this apply to DFIR...? It's very easy to say to myself "someone else will step up to share the information, someone else will ask a question sooner or later, someone else will answer the question, or eventually you will know with experience." All of these excuses enable me to be dormant instead of taking the opportunity to share my knowledge and experiences.

The decision I've made with how to use my skillz is to try to give back to the community that has given so much to me. I started the jIIr blog to share my research, experience, and thoughts with the DFIR community since there was a chance others would benefit. Now I'm taking the next step of using my skillz and knowledge to help the Internet community and the community where I live. Everyday Cyber Security is a means to empower people to protect themselves from malicious cyber activities. There are a million different reasons why I shouldn't use my DFIR skills outside of my professional obligations, but I only need one reason to do it anyway. How about you?

Meet the jIIr Symbol

Posted by Corey Harrell 0 comments
About a month ago I was talking with a co-worker when the conversation turned to graphic design. I was thinking about trying to get a few graphics designed for my blog but I had no clue how to go about it. My co-worker knew Julia Hoffman (his sister) who happened to be a graphic design artist and he offered to send her a sketch of my idea. In less than two hours I had two draft pages of my idea and eventually that led to the jIIr symbol. Julia is a great person to work with and her work is impressive. I look forward to working with her on my next graphic project. She is in the process of setting up a website but her college project site is still up if anyone is interested in her work.