Timeline Analysis by Categories

Tuesday, October 21, 2014 Posted by Corey Harrell
Organizing is what you do before you do something, so that when you do it, it is not all mixed up.

~ A. A. Milne

"Corey, at times our auditors find fraud and when they do sometimes they need help collecting and analyzing the data on the computers and network. Could you look into this digital forensic thing just in case if something comes up?" This simple request - about seven years ago - is what lead me into the digital forensic and incident response field. One of the biggest challenges I faced was getting a handle on the process one should use. At the time there was a wealth of information about tools and artifacts but there was not as much outlining a process one could use. In hindsight, my approach to the problem was simplistic but very effective. I first documented the examination steps and then organized every artifact and piece of information I learned about beneath the appropriate step. Organizing my process in this manner enabled me to carry it out without getting lost in the data. In this post I'm highlighting how this type of organization is applied to timeline analysis leveraging Plaso.

Examinations Leveraging Categories


Organizing the digital forensic process by documenting the examination steps and organizing artifacts beneath them ensured I didn't get "all mixed up" when working cases. To do examination step "X" examine artifacts A, B, C, and D. Approaching cases this way is a more effective approach. Not only did it prevent jumping around but it helped to minimize overlooking or forgetting about artifacts of interest. Contrast this to a popular cheat sheet at the time I came into the field (links to cryptome). The cheat sheet was a great reference but the listed items are not in an order one can use to complete an examination. What ends up happening is that you jump around the document based on the examination step being completed. This is what tends to happen without any organization. Jumping around in various references depending on what step is being completed. Not an effective method.

Contrast this to the approach I took. Organizing every artifact and piece of information I learned about beneath the appropriate step. I have provided glimpses about this approach in my posts: Obtaining Information about the Operating System and It Is All About Program Execution. The "information about the operating system" post I wrote within the first year of starting this blog. In the post I outlined a few different examination steps and then listed some of the items beneath it. I did a similar post for program execution; beneath this step I listed various program execution artifacts. I was able to write the auto_rip program due to this organization where every registry artifact was organized beneath the appropriate examination step.

Examination Steps + Artifacts = Categories


What exactly are categories? I see them as the following: Examination Steps + Artifacts = Categories. I outlined my examination steps on the jIIr methodology webpage and below are some of the steps listed for system examinations.

        * Profile the System
              - General Operating System Information
              - User Account Information
              - Software Information
              - Networking Information
              - Storage Locations Information
        * Examine the Programs Ran on the System
        * Examine the Auto-start Locations
        * Examine Host Based Logs for Activity of Interest
        * Examine Web Browsing
        * Examine User Profiles of Interest
              - User Account Configuration Information
              - User Account General Activity
              - User Account Network Activity
              - User Account File/Folder Access Activity
              - User Account Virtualization Access Activity
        * Examine Communications


In a previous post, I said "taking a closer look at the above examination steps it’s easier to see how artifacts can be organized beneath them. Take for example the step Examine the programs ran on the system. Beneath this step you can organize different artifacts such as: application compatibility cache, userassist, and muicache. The same concept applies to every step and artifact." In essence, each examination step becomes a category containing artifacts. In the same post I continued by saying "when you start looking at all the artifacts within a category you get a more accurate picture and avoid overlooking artifacts when processing a case."

This level of organization is how categories can be leveraged during examinations.

Timeline Analysis Leveraging Categories


Organizing the digital forensic process by documenting the examination steps and organizing artifacts is not limited to completing the examination steps or registry analysis. The same approach works for timeline analysis. If I'm looking to build a timeline of a system I don't want everything (aka the kitchen sink approach.) I only want certain types of artifacts that I layer together into a timeline.

To illustrate let's use a system infected with commodity malware. The last thing I want to do is to create a supertimeline using the kitchen sink approach. First, it takes way too long to generate (I'd rather start analyzing than waiting.) Second, the end result has a ton of data that is really not needed. The better route is to select the categories of artifacts I want such as: file system metadata, program execution, browser history, and windows event logs. Approaching timelines in this manner makes them faster to create and easier to analyze.

The way I created timelines in this manner was not efficient as I wanted it to be. Basically, I looked at the timeline tools I used and what artifacts they supported. Then I organized the supported artifacts beneath the examination steps to make them into categories. When I created a timeline I would use different tools to parse the categorized artifacts I wanted and then combine the output into a single timeline in the same format. It sounds like a lot but it didn't take that long to create it. Hell, it was even a lot faster than doing the kitchen sink approach.

This is where Plaso comes into the picture by making things a lot easier to leverage categories in timeline analysis.

Plasm


Plaso is a collection of timeline analysis tools; one of which is plasm. Plasm is capable of "tagging events inside the plaso storage file." Translation: plasm organizes the artifacts into categories by tagging the event in the plaso storage file. At the time of this post the tagging occurs after the timeline data has already been parsed (as opposed to specifying the categories and only parsing those during timeline generation.) The plasm user guide webpage provides a wealth of information about the tool and how to use it. I won't rehash the basics since I couldn't do justice to what is already written. Instead I'll jump right in to how plasm makes leveraging categories in timeline analysis easier.

The plasm switch " --tagfile" enables a tag file to be used to define the events to tag. Plaso provides an example tag file named tag_windows.txt. This is the feature in Plaso that makes it easier to leverage categories in timeline analysis. The artifacts supported by Plaso are organized beneath the examination steps in the tag file. The image below is a portion of the tag_jiir.txt tag file I created showing the organization:


tag_jiir.txt is still a work in progress. As can be seen in the above image, the Plaso supported artifacts are organize beneath the " Examine the Programs Ran on the System" (program execution) and " Examine the Auto-start Locations" (autostarts info) examination steps. The rest of the tag file is not shown but the same organization was completed for the remaining examination steps. After the tag_jiir.txt tag file is applied to plaso storage file then timelines can be created only containing the artifacts within select categories.


Plasm in Action


It's easier to see plasm in action in order to fully explore how it helps with using categories in timeline analysis. The test system for this demonstration is one found laying around on my portable drive; it might be new material or something I previously blogged about. For demonstration purposes I'm running log2timeline against a forensic image instead of a mounted drive. Log2timeline works against a mounted drive but the timeline will lack creation dates (and at the time of this post there is not a way to bring in $MFT data into log2timeline.)

The first step is taking a quick look at the system for any indications that the system may be compromised.  When triaging for potential malware infected system the activity in the program execution artifacts excel and the prefetch files revealed a suspicious file as shown below.


The file stands out for a few reasons. It's a program executing from a temp folder inside a user profile. The file handle to the zone.identifier file indicates the file came from the Internet. Lastly, the program last ran on 1/16/14 12:32:14 UTC.

Now we move on to creating the timeline with the command below (the -o switch specifies the partition I want to parse.) FYI, the command below creates a kitchen sink timeline with everything being parsed. To avoid the kitchen sink use  plaso filters. Creating my filter is on my to-do list after I complete the tag_jiir.txt file.

log2timeline.exe -o 2048 C:\Atad\test-data\test.dump C:\Atad\test-data\test.E01

The image below shows the "information that have been collected and stored inside the storage container." The Plaso tool pinfo was used to show this information.


Now it is time to see plasm in action by tagging the events in the storage container. The command below shows how to do this using my tag_jiir.txt file.

plasm.exe tag --tagfile=C:\Atad\test-data\tag_jiir.txt C:\Atad\test-data\test.dump

The storage container information now shows the tagged events as shown below. Notice how the events are actually the categories for my examination steps (keep in mind some categories are not present due to the image not containing the artifacts.)


Now a timeline can be exported from the storage container based on the categories I want. The program execution indicator revealed there may be malware and it came from the internet. The categories I would want for a quick look are the following:

        - Filesystem: to see what files were created and/or modified
        - Web browsing: to correlate web browsing activity with malware
        - Program execution: to see what programs executed (includes files such as prefetch as well as specific event logs

The command below creates my timeline based on the categories I want (-o switch outputs in l2tcsv format, -w switch outputs to a file, and filter uses the tag contains.) FYI, a timeslice was not used since I wanted to focus on the tagging capability.

psort.exe -o L2tcsv -w C:\Atad\test-data\test-timeline.csv C:\Atad\test-data\test.dump "tag contains 'Program Execution' or tag contains 'Filesystem Info' or tag contains 'Web Browsing Info'"

The image below shows the portion of the timeline around the time of interest, which was 1/16/14 12:32:14 UTC. The timeline shows the lab user account accessing their Yahoo web email then a file named invoice.83842.zip being downloaded from the internet. The user then proceeded to execute the archive's contents by clicking the invoice.83842.exe executable. This infection vector is clear as day since the timeline does not contain a lot of un-needed data.


Conclusion


Plaso's tagging capabilities makes things easier to leverage categories in timeline analysis. By producing a timeline containing only the categories one wants in order to view the timeline data for select artifacts. This type of organization helps minimize getting "all mixed up" when conducting timeline analysis by getting lost in the data.

Plaso is great and the tagging feature rocks but as I mentioned before I used a combination of tools to create timelines. Not every tool has the capabilities I need so combining them provides better results. In past I had excellent results leveraging the Perl log2timeline and Regripper to create timelines. At this point the tools are not compatible. Plaso doesn't have a TLN parser (can't read RegRipper's TLN plug-ins) and RegRipper only outputs to TLN. Based on Harlan's upcoming presentation my statement may not be valid for long.

In the end, leveraging categories in timeline analysis is very powerful. This train of thought is not unique to me. Others (who happen to be tool developers) are looking into this as well. Kristinn is as you can see in Plaso's support for tagging and Harlan wrote about this in his latest Windows forensic book.


Side note: the purpose of this post was to highlight Plaso's tagging capability. However, for the best results the filtering capability should be used to reduce what items get parsed in the first place. The kitchen sink approach just takes way too long; why wait when you can analyze.  
Labels: ,
  1. Hi corey,

    It's possible for us to share the tag_jirr.txt ?

    Thx

  2. Anonymous

    Any chance we can get your work in progress tag file. I think this is what i'm looking for as it seems similar to the SANS508 excel filters for trying to sift through the mounds of data.

  3. The tag_jiir.txt is still a work in progress so I'm not releasing it yet. I have a ton of work to do for it to be completed and be ready for public consumption. In addition, I still need to create a filter file to go along with it. This is to reduce the time it takes to create timelines.

  4. Thierry_Fr

    Thanks for this interesting post and for pointing to this tagging functionality

  5. A great visual tool that works on top of plaso is 4n6time http://plaso.kiddaland.net/usage/4n6time

Post a Comment