Tr3Secure Collection Script Updated

Tuesday, October 28, 2014 Posted by Corey Harrell 10 comments
Adding support back into the Tr3Secure collection script to obtain the NTFS Change Journal ($UsnJrnl) has been on my to-do list for some time. This is a quick post about that functionality being added back to the collection script.

The issue I faced was the following. There are very few tools capable of collecting NTFS artifacts from live systems, and even fewer that can collect the $UsnJrnl. The Tr3Secure script uses Joakim Schicht's tool RawCopy to collect files off of live systems. It is one of the few and - as far as I know - the only open source option. RawCopy pulls files either by their $MFT record number or by their file path. Pulling NTFS artifacts requires the $MFT record number. The challenge is that the $UsnJrnl does not have a consistent $MFT record number like the other NTFS artifacts. For most scripting languages this wouldn't be an issue, but Tr3Secure is a batch script and batch scripting doesn't make it easy to store a command's output in a variable. Translation: there is not an easy way in batch scripting to query the $UsnJrnl's $MFT record number, store it in a variable, and then use that variable with RawCopy to collect it. This is why adding the functionality back into the script remained on my to-do list until now.

Joakim Schicht's ExtractUsnJrnl


Joakim Schicht does outstanding work producing DFIR tools and releasing them as open source. His GitHub site contains a wealth of tools, including a collection for collecting and parsing NTFS artifacts. If you aren't familiar with his work, I highly advise taking the time to explore it (also his Google Code wiki page). He recently released a new tool called ExtractUsnJrnl. The tool - in Joakim's words - does the following:

"$J may be sparse, which would mean parts of the data is just 00's. This may be a significant portion of the total data, and most tools will extract this data stream to its full size (which is annoying and a huge waste of disk space). This is where this tools comes in, as it only extract the actual data for the change journal. That way extraction obviously also goes faster. Why extract 20 GB when you might only need 200 MB?"

The tool collects the $UsnJrnl $J alternate data stream, and it extracts only the portion containing data. This saves space and makes the collection faster, especially when pulling it over the wire. The tool is command-line, which makes it easy to script. I updated the Tr3Secure collection script to use ExtractUsnJrnl for grabbing the $UsnJrnl.
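
To make the sparse-file idea concrete, here is a minimal Python sketch. It is not how ExtractUsnJrnl works internally (I'm assuming nothing about its implementation); it only illustrates the concept of skipping zero-filled data and keeping the real records, under the assumption that the zeros sit at the front of the stream. The function name `copy_from_first_data` and the 4096-byte chunk size are made up for illustration.

```python
import io

CHUNK = 4096  # hypothetical read size for this sketch


def copy_from_first_data(src, dst, chunk_size=CHUNK):
    """Copy src to dst, skipping the leading run of all-zero chunks.

    Returns the number of bytes skipped.
    """
    skipped = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return skipped  # stream was empty or entirely zeros
        if any(chunk):  # at least one non-zero byte in this chunk
            dst.write(chunk)
            break
        skipped += len(chunk)
    # Copy everything after the first chunk that held data, unchanged.
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
    return skipped


if __name__ == "__main__":
    # Stand-in for a sparse stream: 8 KB of zeros followed by some data.
    source = io.BytesIO(b"\x00" * 8192 + b"change journal records")
    target = io.BytesIO()
    print(copy_from_first_data(source, target), len(target.getvalue()))
```

The sketch works at chunk granularity, so a chunk that mixes zeros and data is kept whole. That is good enough to show why the extracted file can be a fraction of the logical file size.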

ExtractUsnJrnl in Action


ExtractUsnJrnl is a really cool tool so I wanted to take the time to highlight it. I performed a simple test: collect the $UsnJrnl $J file with one tool (FTK Imager) to see how long it takes and what the file size is, then do the same with ExtractUsnJrnl.

The image below shows the $UsnJrnl from a 1TB solid state drive. The file size difference is significant; one file is 4.6GB while the other is 36MB. Both tools were run locally but ExtractUsnJrnl completed within seconds.


The image below shows the $UsnJrnl from a 300GB removable drive. Again, notice the difference between the file sizes.


Some may be wondering why I am so focused on the resulting file size. The reason is that trying to pull a 4.6GB file over the wire from a remote system takes time. A lot of time if that remote system is in a location with a slow network link (think VPN users). Reducing the file size (e.g. to 36MB) makes it easier to collect the $UsnJrnl both remotely and locally to an attached storage device.
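
A quick back-of-the-envelope calculation shows how much the file size matters. The sketch below assumes a 5 Mbps link, which is a number I picked purely for illustration (it is not a measurement), and it ignores protocol overhead.

```python
def transfer_seconds(size_bytes, link_mbps):
    """Idealized transfer time: payload bits divided by link speed."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


LINK_MBPS = 5  # assumed link speed, not a measured value

if __name__ == "__main__":
    full = transfer_seconds(4.6e9, LINK_MBPS)    # 4.6GB file
    trimmed = transfer_seconds(36e6, LINK_MBPS)  # 36MB file
    print(f"4.6GB: about {full / 60:.0f} minutes")
    print(f"36MB: about {trimmed:.0f} seconds")
```

Under that assumption the 4.6GB file takes roughly two hours while the 36MB file takes about a minute.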

The next test I ran was to parse both $UsnJrnl $J files to see if they both contain approximately the same number of records. I say approximately because the hard drives were not write protected and changes may have been made between the collections. Because of this I evaluated the removable storage device's NTFS Change Journal since that drive had less activity than the solid state drive.

The image below shows UsnJrnl2Csv successfully parsing the $UsnJrnl $J extracted with FTK Imager.



The image below shows UsnJrnl2Csv successfully parsing the $UsnJrnl $J extracted with ExtractUsnJrnl. Notice how this $J file had significantly fewer records.


Lastly, the image below shows the comparison of the two parsed $UsnJrnl $J files from the removable media. Both outputs start at the same time with the same file and end at the same time with the same file.
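
The same comparison can be scripted instead of eyeballed. The Python sketch below is only an illustration: the column names `Timestamp` and `FileName` are hypothetical placeholders (check the header of the actual parsed CSV before using them), and the sample rows in the demo are made up.

```python
import csv
import io


def summarize(csv_text, time_col="Timestamp", name_col="FileName"):
    """Return the record count plus the first and last (time, name) pairs."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {"count": 0, "first": None, "last": None}
    first, last = rows[0], rows[-1]
    return {
        "count": len(rows),
        "first": (first[time_col], first[name_col]),
        "last": (last[time_col], last[name_col]),
    }


def same_span(a, b):
    """True when both outputs start and end on the same record."""
    return a["first"] == b["first"] and a["last"] == b["last"]


if __name__ == "__main__":
    # Made-up sample data standing in for the two parsed outputs.
    sample = "Timestamp,FileName\n2014-10-01 10:00:00,a.txt\n2014-10-02 11:00:00,b.txt\n"
    one, two = summarize(sample), summarize(sample)
    print(one["count"], two["count"], same_span(one, two))
```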


Another Tr3Secure Collection Update


Adding support to collect the $UsnJrnl is not the only update. The change log lists all of them but I did want to highlight another one. An additional menu option was added to only collect the NTFS artifacts. There are times where I want to create a quick timeline with the NTFS artifacts to get more information about something. For example, an antivirus alert may have flagged a file but I'm interested in whether anything else was dropped onto the system. In an instance like this, creating a timeline with both the $MFT and $UsnJrnl can quickly answer this question. I've been using a different collection script to grab just the NTFS artifacts but I decided to incorporate the functionality into the Tr3Secure script. The menu option now appears as the following:


Selecting option 5 will preserve only select files and then collect the $MFT, $LogFile, and $UsnJrnl.

You can download the Tr3Secure Data Collection Script from the following download site. The link is also posted along the right hand side of this blog towards the top.


In the future I plan on doing a post or two illustrating how targeted collections using scripts - such as the Tr3Secure collection script - can significantly speed up the time it takes to triage an alert or system.

Timeline Analysis by Categories

Tuesday, October 21, 2014 Posted by Corey Harrell 5 comments
Organizing is what you do before you do something, so that when you do it, it is not all mixed up.

~ A. A. Milne

"Corey, at times our auditors find fraud and when they do sometimes they need help collecting and analyzing the data on the computers and network. Could you look into this digital forensic thing just in case if something comes up?" This simple request - about seven years ago - is what led me into the digital forensic and incident response field. One of the biggest challenges I faced was getting a handle on the process one should use. At the time there was a wealth of information about tools and artifacts but there was not as much outlining a process one could use. In hindsight, my approach to the problem was simplistic but very effective. I first documented the examination steps and then organized every artifact and piece of information I learned about beneath the appropriate step. Organizing my process in this manner enabled me to carry it out without getting lost in the data. In this post I'm highlighting how this type of organization is applied to timeline analysis leveraging Plaso.

Examinations Leveraging Categories


Organizing the digital forensic process by documenting the examination steps and organizing artifacts beneath them ensured I didn't get "all mixed up" when working cases. To do examination step "X", examine artifacts A, B, C, and D. Approaching cases this way is more effective. Not only did it prevent jumping around, but it helped to minimize overlooking or forgetting about artifacts of interest. Contrast this to a popular cheat sheet at the time I came into the field (links to cryptome). The cheat sheet was a great reference but the listed items are not in an order one can use to complete an examination. What ends up happening is that you jump around the document based on the examination step being completed. This is what tends to happen without any organization: jumping around in various references depending on what step is being completed. It is not an effective method.

Contrast this with the approach I took: organizing every artifact and piece of information I learned about beneath the appropriate step. I have provided glimpses of this approach in my posts Obtaining Information about the Operating System and It Is All About Program Execution. I wrote the "information about the operating system" post within the first year of starting this blog. In the post I outlined a few different examination steps and then listed some of the items beneath them. I did a similar post for program execution; beneath this step I listed various program execution artifacts. I was able to write the auto_rip program due to this organization, where every registry artifact was organized beneath the appropriate examination step.

Examination Steps + Artifacts = Categories


What exactly are categories? I see them as the following: Examination Steps + Artifacts = Categories. I outlined my examination steps on the jIIr methodology webpage and below are some of the steps listed for system examinations.

        * Profile the System
              - General Operating System Information
              - User Account Information
              - Software Information
              - Networking Information
              - Storage Locations Information
        * Examine the Programs Ran on the System
        * Examine the Auto-start Locations
        * Examine Host Based Logs for Activity of Interest
        * Examine Web Browsing
        * Examine User Profiles of Interest
              - User Account Configuration Information
              - User Account General Activity
              - User Account Network Activity
              - User Account File/Folder Access Activity
              - User Account Virtualization Access Activity
        * Examine Communications


In a previous post, I said "taking a closer look at the above examination steps it’s easier to see how artifacts can be organized beneath them. Take for example the step Examine the programs ran on the system. Beneath this step you can organize different artifacts such as: application compatibility cache, userassist, and muicache. The same concept applies to every step and artifact." In essence, each examination step becomes a category containing artifacts. In the same post I continued by saying "when you start looking at all the artifacts within a category you get a more accurate picture and avoid overlooking artifacts when processing a case."

This level of organization is how categories can be leveraged during examinations.
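
One simple way to picture a category is as a mapping from an examination step to the artifacts beneath it. The Python sketch below uses only the step and artifacts named above; the structure and the `category_of` helper are illustrative, not taken from any tool.

```python
# Examination step -> artifacts organized beneath it (one step shown).
CATEGORIES = {
    "Examine the Programs Ran on the System": [
        "application compatibility cache",
        "userassist",
        "muicache",
    ],
}


def category_of(artifact):
    """Return the examination step an artifact belongs to, or None."""
    wanted = artifact.lower()
    for step, artifacts in CATEGORIES.items():
        if wanted in artifacts:
            return step
    return None


if __name__ == "__main__":
    print(category_of("UserAssist"))
```

Adding a new artifact then becomes a matter of deciding which step it answers and listing it there.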

Timeline Analysis Leveraging Categories


Organizing the digital forensic process by documenting the examination steps and organizing artifacts is not limited to completing the examination steps or registry analysis. The same approach works for timeline analysis. If I'm looking to build a timeline of a system I don't want everything (aka the kitchen sink approach). I only want certain types of artifacts that I layer together into a timeline.

To illustrate, let's use a system infected with commodity malware. The last thing I want to do is to create a supertimeline using the kitchen sink approach. First, it takes way too long to generate (I'd rather start analyzing than wait). Second, the end result has a ton of data that is really not needed. The better route is to select the categories of artifacts I want, such as file system metadata, program execution, browser history, and Windows event logs. Approaching timelines in this manner makes them faster to create and easier to analyze.

The way I created timelines in this manner was not as efficient as I wanted it to be. Basically, I looked at the timeline tools I used and what artifacts they supported. Then I organized the supported artifacts beneath the examination steps to make them into categories. When I created a timeline I would use different tools to parse the categorized artifacts I wanted and then combine the output into a single timeline in the same format. It sounds like a lot but it didn't take that long to create it. Hell, it was even a lot faster than doing the kitchen sink approach.
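
The combining step boils down to normalizing each tool's output into the same shape and sorting by time. The Python sketch below is a minimal illustration of that idea; the `(timestamp, source, description)` tuple layout and the sample events are assumptions for the example, not the format of any particular tool.

```python
from datetime import datetime
from itertools import chain


def merge_timelines(*sources):
    """Combine event lists from several tools into one time-ordered list.

    Each event is a (timestamp, source, description) tuple.
    """
    return sorted(chain.from_iterable(sources), key=lambda event: event[0])


if __name__ == "__main__":
    # Made-up events standing in for the output of two different parsers.
    filesystem = [(datetime(2014, 1, 16, 12, 31), "filesystem", "file created")]
    execution = [
        (datetime(2014, 1, 16, 12, 30), "program execution", "program ran"),
        (datetime(2014, 1, 16, 12, 33), "program execution", "program ran"),
    ]
    for event in merge_timelines(filesystem, execution):
        print(event[0].isoformat(), event[1], event[2])
```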

This is where Plaso comes into the picture: it makes leveraging categories in timeline analysis a lot easier.

Plasm


Plaso is a collection of timeline analysis tools; one of which is plasm. Plasm is capable of "tagging events inside the plaso storage file." Translation: plasm organizes the artifacts into categories by tagging the event in the plaso storage file. At the time of this post the tagging occurs after the timeline data has already been parsed (as opposed to specifying the categories and only parsing those during timeline generation.) The plasm user guide webpage provides a wealth of information about the tool and how to use it. I won't rehash the basics since I couldn't do justice to what is already written. Instead I'll jump right in to how plasm makes leveraging categories in timeline analysis easier.

The plasm switch "--tagfile" enables a tag file to be used to define the events to tag. Plaso provides an example tag file named tag_windows.txt. This is the feature in Plaso that makes it easier to leverage categories in timeline analysis. The artifacts supported by Plaso are organized beneath the examination steps in the tag file. The image below is a portion of the tag_jiir.txt tag file I created showing the organization:


tag_jiir.txt is still a work in progress. As can be seen in the above image, the Plaso supported artifacts are organized beneath the "Examine the Programs Ran on the System" (program execution) and "Examine the Auto-start Locations" (autostarts info) examination steps. The rest of the tag file is not shown but the same organization was completed for the remaining examination steps. After the tag_jiir.txt tag file is applied to the Plaso storage file, timelines can be created that contain only the artifacts within select categories.
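
Conceptually, tagging is just rule matching: each category has a rule, and every event that matches gets that category's label. The Python sketch below illustrates the concept only. It does not reproduce plasm's implementation or the tag file syntax, and the `artifact` field, the rule contents, and the "run key" example are hypothetical.

```python
def tag_events(events, rules):
    """Attach every matching category label to each event (in place)."""
    for event in events:
        event["tags"] = [label for label, matches in rules.items() if matches(event)]
    return events


def with_tag(events, wanted):
    """Keep only the events carrying at least one of the wanted labels."""
    wanted = set(wanted)
    return [event for event in events if wanted & set(event["tags"])]


# Hypothetical rules: category label -> predicate over an event dict.
RULES = {
    "Program Execution": lambda e: e["artifact"] in {"prefetch", "userassist", "muicache"},
    "Autostarts Info": lambda e: e["artifact"] == "run key",
}

if __name__ == "__main__":
    events = [{"artifact": "prefetch"}, {"artifact": "run key"}, {"artifact": "lnk"}]
    tag_events(events, RULES)
    print(with_tag(events, ["Program Execution"]))
```

Once events carry category labels, building a timeline for select categories is a simple filter, which is the workflow the tag file enables.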


Plasm in Action


It's easier to see plasm in action in order to fully explore how it helps with using categories in timeline analysis. The test system for this demonstration is one found lying around on my portable drive; it might be new material or something I previously blogged about. For demonstration purposes I'm running log2timeline against a forensic image instead of a mounted drive. Log2timeline works against a mounted drive but the timeline will lack creation dates (and at the time of this post there is not a way to bring $MFT data into log2timeline).

The first step is taking a quick look at the system for any indications that it may be compromised. When triaging a potentially malware-infected system, the program execution artifacts excel, and here the prefetch files revealed a suspicious file as shown below.


The file stands out for a few reasons. It's a program executing from a temp folder inside a user profile. The file handle to the Zone.Identifier file indicates the file came from the Internet. Lastly, the program last ran on 1/16/14 12:32:14 UTC.

Now we move on to creating the timeline with the command below (the -o switch specifies the partition I want to parse). FYI, the command below creates a kitchen sink timeline with everything being parsed. To avoid the kitchen sink, use Plaso filters. Creating my filter is on my to-do list after I complete the tag_jiir.txt file.

log2timeline.exe -o 2048 C:\Atad\test-data\test.dump C:\Atad\test-data\test.E01

The image below shows the "information that have been collected and stored inside the storage container." The Plaso tool pinfo was used to show this information.


Now it is time to see plasm in action by tagging the events in the storage container. The command below shows how to do this using my tag_jiir.txt file.

plasm.exe tag --tagfile=C:\Atad\test-data\tag_jiir.txt C:\Atad\test-data\test.dump

The storage container information now shows the tagged events as shown below. Notice how the events are actually the categories for my examination steps (keep in mind some categories are not present due to the image not containing the artifacts.)


Now a timeline can be exported from the storage container based on the categories I want. The program execution indicator revealed there may be malware and it came from the internet. The categories I would want for a quick look are the following:

        - Filesystem: to see what files were created and/or modified
        - Web browsing: to correlate web browsing activity with malware
        - Program execution: to see what programs executed (includes files such as prefetch as well as specific event logs)

The command below creates my timeline based on the categories I want (the -o switch sets the output to l2tcsv format, the -w switch writes the output to a file, and the filter uses "tag contains" expressions). FYI, a timeslice was not used since I wanted to focus on the tagging capability.

psort.exe -o L2tcsv -w C:\Atad\test-data\test-timeline.csv C:\Atad\test-data\test.dump "tag contains 'Program Execution' or tag contains 'Filesystem Info' or tag contains 'Web Browsing Info'"

The image below shows the portion of the timeline around the time of interest, which was 1/16/14 12:32:14 UTC. The timeline shows the lab user account accessing their Yahoo web email, then a file named invoice.83842.zip being downloaded from the internet. The user then proceeded to execute the archive's contents by clicking the invoice.83842.exe executable. This infection vector is clear as day since the timeline does not contain a lot of unneeded data.
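
If the exported CSV is still large, the rows around the time of interest can be pulled out with a few lines of Python. This is a minimal sketch that assumes the CSV has `date` and `time` columns formatted as MM/DD/YYYY and HH:MM:SS; verify that against the header of the actual file. The function name `rows_near`, the five-minute window, and the reduced sample header in the demo are all made up for illustration.

```python
import csv
import io
from datetime import datetime, timedelta


def rows_near(csv_text, pivot, minutes=5):
    """Return the rows whose timestamp falls within +/- minutes of pivot."""
    window = timedelta(minutes=minutes)
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        stamp = datetime.strptime(
            f"{row['date']} {row['time']}", "%m/%d/%Y %H:%M:%S"
        )
        if abs(stamp - pivot) <= window:
            kept.append(row)
    return kept


if __name__ == "__main__":
    # Made-up rows with a reduced header, standing in for the real export.
    sample = (
        "date,time,desc\n"
        "01/16/2014,12:00:00,unrelated activity\n"
        "01/16/2014,12:31:50,file downloaded\n"
        "01/16/2014,12:32:14,program executed\n"
    )
    for row in rows_near(sample, datetime(2014, 1, 16, 12, 32, 14)):
        print(row["date"], row["time"], row["desc"])
```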


Conclusion


Plaso's tagging capability makes it easier to leverage categories in timeline analysis by producing a timeline that contains only the categories one wants, so the timeline data can be viewed for select artifacts. This type of organization helps minimize getting "all mixed up" or lost in the data when conducting timeline analysis.

Plaso is great and the tagging feature rocks but, as I mentioned before, I used a combination of tools to create timelines. Not every tool has the capabilities I need so combining them provides better results. In the past I had excellent results leveraging the Perl log2timeline and RegRipper to create timelines. At this point the tools are not compatible: Plaso doesn't have a TLN parser (so it can't read the output of RegRipper's TLN plug-ins) and RegRipper only outputs to TLN. Based on Harlan's upcoming presentation my statement may not be valid for long.

In the end, leveraging categories in timeline analysis is very powerful. This train of thought is not unique to me. Others (who happen to be tool developers) are looking into this as well. Kristinn is, as you can see in Plaso's support for tagging, and Harlan wrote about this in his latest Windows forensic book.


Side note: the purpose of this post was to highlight Plaso's tagging capability. However, for the best results the filtering capability should be used to reduce what items get parsed in the first place. The kitchen sink approach just takes way too long; why wait when you can analyze?