Finding Fraudulent Documents Preview
Wednesday, May 16, 2012
Anyone who looks at the topics I discuss on my blog may not easily see the kind of cases I frequently work at my day job. For the most part my blog is a reflection of my interests, the topics I’m trying to learn more about, and what I do outside of my day job. As a result, I don’t blog much about the fraud cases I support, but I’m ready to share a technique I’ve been working on for some time.
Next month I’m presenting at the SANS Forensic and Incident Response Summit being held in Austin, Texas. The summit dates are June 26 and 27. I’m one of the speakers in the SANS 360 slot and the title of my talk is “Finding Fraudulent Word Documents in 360 Seconds” (here is the agenda). My talk is going to be a quick and dirty look at a technique I honed last year to find fraudulent documents. I’m writing a more detailed paper on the technique as well as a query script to automate finding these documents, but my presentation will cover the fundamentals. Specifically, it will cover what I mean by fraudulent documents, types of frauds, Microsoft Word metadata, Corey’s guidelines, and the technique in action. Here’s a preview of what I hope to cover in my six minutes (subject to change once I put together the slides and figure out my timing).
What exactly are fraudulent documents? You need to look at the two words separately to see what I’m referring to. One definition for fraudulent is “engaging in fraud; deceitful” while a definition for document is “a piece of written, printed, or electronic matter that provides information or evidence or that serves as an official record”. What I’m talking about is electronic matter that provides information or serves as an official record while engaging in fraud. In simpler terms, and the way I describe it: electronic documents providing fake financial information. There are different types of fraud, which means there are different types of fraudulent documents. However, my technique is geared towards finding the electronic documents used to commit purchasing fraud and bid rigging.
There are a few different ways these frauds can be committed, but there are times when Microsoft Word documents are used to provide fake information. One example is an invoice for a product that was never purchased, created to conceal misappropriated money. As most of us know, electronic files contain metadata and Word documents are no different. There are values within Word documents’ metadata that provide strong indicators of whether a document is questionable. I did extensive testing to determine how these values change based on different actions taken against a document (Word versions 2000, 2003, and 2007). My testing showed the changes in the metadata are consistent based on the action. For example, if a Word document is modified then specific values in the metadata change while other values remain the same.
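As a rough illustration of where this kind of metadata lives, here is a minimal Python sketch that pulls the built-in document properties out of a Word 2007 (.docx) file. It is not the query script or the guidelines mentioned in this post. It only assumes the standard .docx layout, where the file is a ZIP archive with the properties stored in `docProps/core.xml` and `docProps/app.xml`. The function name and the dictionary keys are made up for the example, and it does not handle the older binary .doc format that Word 2000 and 2003 save by default.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the document-property parts of a .docx file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

# Output key (hypothetical name) -> element inside docProps/core.xml.
CORE_FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "last_printed": "cp:lastPrinted",
    "revision": "cp:revision",
}

# Output key (hypothetical name) -> element inside docProps/app.xml.
APP_FIELDS = {
    "total_edit_minutes": "ep:TotalTime",
    "application": "ep:Application",
    "app_version": "ep:AppVersion",
    "company": "ep:Company",
}


def _read_part(archive, part_name, fields):
    """Return the requested fields from one XML part, or None where missing."""
    try:
        root = ET.fromstring(archive.read(part_name))
    except KeyError:
        # The part is not in the archive at all.
        return {key: None for key in fields}
    return {key: root.findtext(path, namespaces=NS) for key, path in fields.items()}


def read_docx_metadata(source):
    """Read built-in properties from a .docx path or binary file-like object."""
    with zipfile.ZipFile(source) as archive:
        meta = _read_part(archive, "docProps/core.xml", CORE_FIELDS)
        meta.update(_read_part(archive, "docProps/app.xml", APP_FIELDS))
    return meta
```

Calling `read_docx_metadata("invoice.docx")` (a placeholder file name) returns a plain dictionary, so the values for a set of documents can be printed or compared side by side. Which values matter, and how they change after a given action, is what the testing described above was meant to establish.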
I combined the information I learned from my testing with all the different fraudulent documents I’ve examined and I noticed distinct patterns. These patterns can be leveraged to identify potentially fraudulent documents among electronic information. I’ve developed some guidelines to find these patterns in Word documents’ metadata. I’m not discussing the guidelines in this post since I’m saving them for my #DFIRSummit presentation and my paper. The last piece is tying everything together by doing a quick run-through of how the technique can quickly find fraudulent documents in a purchasing fraud. Something I’m hoping to include is my current work on automating the technique using a query script I’m writing and someone else’s work (I’m not mentioning who since it’s not my place).
I’m pretty excited to finally have the chance to go to my first summit and there’s a great lineup of speakers. I was half joking on Twitter when I said it seems like the summit is the DFIR Mecca. I said half because it’s pretty amazing to see who else will be attending.
It'll be great to see you there, Corey!
Good stuff Corey, looking forward to hearing more about your process...