Dual Purpose Volatile Data Collection Script
Monday, January 2, 2012
When responding to a potential security incident, a capability is needed to quickly triage the system and see what's going on. Is a rogue process running on the system? Who's currently logged on? What other systems are trying to connect over the network? How do I document the actions I took on the system? These are valid questions during incident response, whether the response is for an actual event or a simulation. One area to examine for answers is the system's volatile data. Automating the collection of volatile data saves valuable time, which in turn helps analysts examine the data faster in order to get answers. This post briefly describes (and releases) the Tr3Secure volatile data collection script I wrote.
Tr3Secure needed a toolset for responding to systems during attack simulations, and one of the tools had to quickly collect volatile data on a system (I previously discussed what Tr3Secure is here). However, the volatile data collection tool had to serve dual functions. First and foremost, it had to properly preserve and acquire data from live systems. The toolset is initially being used in a training environment, but the tools and processes we are learning need to translate over to actual security incidents. What good is mastering a collection tool that can’t be used during live incident response activities? The second required function was that the tool had to help with training people on examining volatile data. Tr3Secure members come from different information security backgrounds, so not every member will be knowledgeable about volatile data. Collecting data is one thing, but people will eventually need to understand what the data means. The DFIR community has a few volatile data collection scripts, but none of the scripts I found provided the dual functionality for practical and training usage. So I went ahead and wrote a script to meet our needs.
Practical Usage
These were some considerations taken into account to ensure the script is scalable to meet the needs for volatile data collection during actual incident response activities.
Flexibility
Different responses will have different requirements on where to store the volatile data that’s collected. At times the data may be stored on the same drive where the DFIR toolset is located, while at other times the data may be stored on a different drive. I took this into consideration, and the volatile data collection script allows the output data to be stored on a drive of choice. If someone prefers to run their tools from a CD-ROM while someone else works with a large USB removable drive, the script can be used by both of them.
Organize Output
Troy Larson posted a few lines of code from his collection script to the Win4n6 group some time ago. One thing I noticed about his script was that he organized the output data based on a case number. I incorporated his idea into my script; a case number needs to be entered when the script is run on a system. A case folder enables data collected from numerous systems to be stored in the same folder (the folder is named Data-Case#). In addition to organizing data into a case folder, the actual volatile data is stored in a sub-folder named after the system the data came from (the system's computer name is used to name the folder). To prevent overwriting data by running the script multiple times on the same system, I incorporated a timestamp into the folder name (two digit month, day, year, hour, and minute). Appending a timestamp to the folder name means the script can execute against the same system numerous times and all of the volatile data is stored in separate folders. Lastly, the data collected from the system is stored in separate sub-folders for easier access. The screenshot below shows the data collected for Case Number 100 from the system OWNING-U on 01/01/2012 at 15:46, and the sketch after it illustrates the folder naming.
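To illustrate the naming convention, here is a minimal batch sketch of how the case folder and the timestamped, per-system sub-folder could be built. The variable names and the locale-dependent %date% parsing are assumptions for illustration, not the script's exact code.

@echo off
REM Minimal sketch: build the Data-Case#\ComputerName-Timestamp output path.
REM The %date% substring offsets assume a US-style "Ddd MM/DD/YYYY" format
REM and need adjusting for other locales; %time% may have a leading space
REM before 10:00.
set /p case=Enter the case number:
set /p data_drive=Enter the drive letter for data storage:
set timestamp=%date:~4,2%%date:~7,2%%date:~12,2%_%time:~0,2%%time:~3,2%
set outpath=%data_drive%:\Data-%case%\%computername%-%timestamp%
mkdir "%outpath%"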
Documentation
Automating data collection means that documentation can be automated as well. The script documents everything in a collection log. Each case has one collection log, so regardless of whether data is collected from one system or ten, an analyst only has to review one log.
The following information is documented both on screen for the analyst to see and in a collection log file: case number, examiner name, target system, user account used to collect data, drives for tools and data storage, time skew, and program execution. The script prompts the analyst for the case number, their name, and the drive to store data on. This information is automatically stored in the collection log so the analyst doesn’t have to worry about maintaining documentation elsewhere. In addition, the script prompts the analyst for the current date and time, which is used to record the time difference between the system and the actual time. Every program executed by the script is recorded in the collection log along with a timestamp of when the program executed. This will make it easier to account for artifacts left on a system if the system is examined after the script is executed. The screenshot below shows part of the collection log for the data collected from the system OWNING-U.
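As a rough illustration of the logging approach (the log file name and the commands shown here are assumptions in the spirit of the script, reusing the variables from the earlier sketch), each tool run is preceded by a timestamped entry in a single case-wide collection log:

set log=%data_drive%:\Data-%case%\Collection-Log.txt
echo %date% %time% - %computername% - running: netstat -anob >> "%log%"
netstat -anob > "%outpath%\NetworkInfo_netstat.txt"
echo %date% %time% - %computername% - running: ipconfig /displaydns >> "%log%"
ipconfig /displaydns > "%outpath%\NetworkInfo_dns-cache.txt"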
Preservation
RFC 3227’s Order of Volatility outlines that evidence should be collected starting with the most volatile then proceeding to the less volatile. The script takes into account the order of volatility during data collection. When all data is selected for collection, the memory is first imaged then volatile data is collected followed by collecting non-volatile data. The volatile data collected is: process information, network information, logged on users, open files, clipboard, and then system information. The non-volatile data collected is installed software, security settings, configured users/groups, system's devices, auto-runs locations, and applied group policies. Another item the script incorporated from Troy Larson’s comment in the Win4n6 group is preserving the prefetch files before volatile data is collected. I never thought about this before I read his comment but it makes sense. Volatile data gets collected by executing numerous programs on a system and these actions can overwrite the existing prefetch files with new information or files. Preserving the prefetch files upfront ensures analysts will have access to most of the prefetch files that were on the system before the collection occurred (four prefetch files may be overwritten before the script preserves them). The script uses robocopy to copy the prefetch files so the file system metadata (timestamps, NTFS permissions, and file ownership) is collected along with the files themselves. The screenshot below shows the preserved files for system OWNING-U.
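A sketch of the preservation step, based on the robocopy command quoted in the comments at the end of this post (the output path variable is the same assumption as in the earlier sketches; /COPY:DATSOU copies the data, attributes, timestamps, security, owner, and auditing information):

tools\robocopy.exe %WINDIR%\Prefetch "%outpath%\preserved-prefetch-files\Prefetch" /ZB /COPY:DATSOU /R:4 /W:1 /TS /FP /NP /LOG:"%outpath%\preserved-prefetch-files\prefetch-robocopy-log.txt"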
Tools Executed
The readme file accompanying the script outlines the various programs used to collect data. The programs include built-in Windows commands and third party utilities. The screenshot below shows the tools folder where the third party utilities are stored.
I’m not going to discuss every program, but I at least wanted to highlight a few. The Windows diskpart command allows disks, partitions, and volumes to be managed through the command line. The script leverages diskpart to make it easy for an analyst to see what drives and volumes are attached to a system. Hopefully, the analyst won’t need to open up Windows Explorer to see what the removable media drive mappings are since the script displays the information automatically as shown below. Note: for diskpart to work, a text file named diskpart_commands.txt needs to be created in the tools folder, and the file needs to contain these two commands on separate lines: list disk and list volume (see the reference below).
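For reference, this is what the helper file and the scripted diskpart call look like (diskpart's /s switch runs the commands listed in a text file; the output file name below is an assumption, not the script's exact name):

Contents of tools\diskpart_commands.txt:
list disk
list volume

Scripted call:
diskpart /s tools\diskpart_commands.txt > "%outpath%\disk-and-volume-listing.txt"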
Mandiant’s Memoryze is used to obtain a forensic image of the system’s memory. Memoryze supports a wide range of Windows operating systems, which makes the script more versatile for dumping RAM. The key reason the script uses Memoryze is that it’s the only free memory imaging program I found that allows an image to be stored in a folder of your choice. Most programs place the memory image in the same folder where the command line is opened. This wouldn’t work because the image would be dropped in the folder where the script is located instead of on the drive the analyst wants. Memoryze uses an XML configuration file to image RAM, so I borrowed a few lines of code from the MemoryDD.bat batch file to create the XML file for the script. Note: the script only needs memoryze.exe; to obtain it, install Memoryze on a computer and then copy memoryze.exe to the Tools folder.
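For anyone recreating this part, the general batch pattern for writing a configuration file on the fly is shown below; the content is a placeholder only, not the real Memoryze schema, which should be copied from the lines in MemoryDD.bat:

echo ^<?xml version="1.0" encoding="utf-8"?^> > "%outpath%\memory-acquisition.xml"
echo ^<!-- placeholder: copy the real configuration elements from MemoryDD.bat --^> >> "%outpath%\memory-acquisition.xml"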
PXServer’s Winaudit program obtains the configuration information from a system and I first became acquainted with the program during my time performing vulnerability assessments. The script uses Winaudit to collect some non-volatile data including the installed software, configured users/groups, and computer devices. Winaudit is capable of collecting a lot more information so it wouldn’t be that hard to incorporate the additional information by modifying the script.
Training Usage
These were the two items put into the script to assist with training members on performing incident response system triage.
Ordered Output Reports
The script collects a wealth of information about a system, and this may be overwhelming to analysts new to examining volatile data. For example, the script produces six different reports about the processes running on a system. A common question when faced with so many reports is how they should be reviewed. The script’s output reports are numbered to suggest the order in which they should be reviewed. This provides a little assistance to analysts until they develop their own process for examining the data. The screenshots below show the process reports in the output folder and those reports opened in Notepad++.
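As an illustration of the numbering scheme, a few of the process reports look along these lines (pv.exe and its report name come from the script as quoted in the comments below; the tasklist lines and their file names are examples, not the script's full set of process reports):

tasklist /v > "%vol_outpath%\ProcessInfo_1_tasklist-verbose.txt"
tools\pv.exe -e >> "%vol_outpath%\ProcessInfo_2_process-to-exe-mapping.txt"
tasklist /svc > "%vol_outpath%\ProcessInfo_3_process-to-service-mapping.txt"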
Understanding Tool Functionality and Volatile Data
The script needs to help people better understand what the collected data means about the system it came from. Two great references for collecting, examining, and understanding volatile data are Windows Forensic Analysis, 2nd edition and Malware Forensics: Investigating and Analyzing Malicious Code. I used both books when researching and selecting the script’s tools to collect volatile data. What better way to help someone understand the tools or data than by directing them to references that explain it? I placed comments in the script containing the page numbers where a specific tool and its data are discussed in both books. The screenshot below shows the portion of the script that collects process information, with the references highlighted in red.
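The in-script references follow a pattern along these lines (the page numbers here are placeholders for illustration, not the actual citations in the script):

:: Process-to-exe mapping: see Windows Forensic Analysis 2/e p. xx and Malware Forensics p. xx
tools\pv.exe -e >> "%vol_outpath%\ProcessInfo_2_process-to-exe-mapping.txt"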
Releasing the Tr3Secure Volatile Data Collection Script
There are very few things I do forensically that I think are cool; this script happens to be one of them. There are not many tools or scripts that work as intended while at the same time providing training. People who have more knowledge about volatile data can hit the ground running with the script when investigating systems. The script automates imaging memory, collecting volatile/non-volatile data, and documenting every action taken on the system. People with less knowledge can leverage the tool to learn how to investigate systems. The script collects the data; then the ordered output and the references in the comments can be used to interpret it. Talk about killing two birds with one stone.
The following is the location of the zip file containing the script and the readme file (download link is here). Please be advised, a few programs the script uses require administrative rights to run properly.
Enjoy and Happy Hunting ...
***** Update *****
The Tr3Secure script has been updated and you can read about the changes in the post Tr3Secure Data Collection Script Reloaded.
***** Update *****
Labels: memory analysis, script, triage
@Tamer,
Thanks for the comment and feedback.
> suggest in the readme file to order the tools by link or location
Good point. The tools' order is how I grouped them when I was writing the script to help me stay organized. After reading your suggestion I can see it's not the best way to present the information. If I update the script then I'll also change the readme file order.
> I noticed that UnxUtils cannot be downloaded from SF
I just tested the link (http://unxutils.sourceforge.net/) and it brings me to the download page for UnxUtils
> suggestion of using systeminfo instead of ver and uptime
I looked at systeminfo but I can't remember why I decided not to use it. I'll take another look at it.
Again, thanks for the suggestions.
Hi Corey,
Thanks for considering my suggestions
> The tools' order is how I grouped them: the grouping is excellent for understanding the purpose, so it is better to keep it and, in addition, have another section for the downloads.
> systeminfo can also be replaced by winaudit, where a nice report can be generated.
> Since memory holds all information about processes, ports, open files, etc., why do we need separate tools, considering that malware is able to hide this information?
Thanks
Tamer
@Tamer,
The script is not only meant for malware. The other information is presented in a different way which may be easier to review, e.g. the report for the caches (DNS, ARP, etc.). Also, the output reports can be grepped, which may be faster for finding certain information than processing an image.
The script does have a menu and one option is to only image the memory. The option can be used if someone only wants an image.
Hi Corey,
thanks a lot for sharing. This looks like a very useful tool.
Just for the record, I'm sure you're aware of tools like WFT (http://www.foolmoon.net/security/wft/index.html), or Kludge (http://theinterw3bs.com/?p=503), both of which perform similar steps but seem to be much more bulky.
Cheers,
Stefan.
@Stefan,
I knew about Kludge but I didn't know about WFT. The main reason I went about writing my own was the training aspect. I would have preferred to use an existing script but I wasn't able to find a dual purpose one. Thanks for the tip about WFT; I'm going to look into the program.
Hi Corey, great from you to share.
I tried multiple times to download tr3secure_data-collection-script.zip and it won't open
@Anon,
I just downloaded the zip file and it opened fine using IE 8. If you are still having issues hit me up by email and I'll send it to you.
Great script Corey, any plans on making it work over a network to grab stuff from a remote system?
@anon,
The script meets our current needs so I have no plans to make it work remotely (as of right now). It wouldn't be too hard to do though.
If you are looking for a script that works remotely check out Kludge. Stefan provided a link to it in his comment.
How would you incorporate DumpIt, vice using Memoryze?
Thanks!
@anon
I looked at DumpIt when putting the script together; I actually checked it out first. The one issue with DumpIt - at the time - was not being able to redirect the memory image to a folder of my choice. For a tool to work with this script you need to be able to send the output to the output folder. DumpIt couldn't do this so I went with Memoryze.
@Corey
Thanks for your reply. Just recently I had to perform IR on a box of interest. I utilized your script and the programs ran perfectly, with the exception of Memoryze. The issue I encountered with Memoryze was a dependency issue--can't remember exactly what it was, but it didn't run correctly and gave an error.
It would be nice if Moonsols could write an optional program that could be incorporated with your collection tool. :)
In any case, thanks for putting together a comprehensive IR tool!
Really like your tool and the small foot print of it all.
Why wasn't anything included that will allow the tool to dump the registry?
@anon
There were a few things not included in the script with registry files being one of them. I have plans to update the script and grabbing reg hives is on the to do list. If you need the functionality sooner you could always modify the batch script by adding it. Glad to hear you like the script and thanks for letting me know.
Congratulations with your script. It is very useful for forensic examiners. I have some minor corrections for you:
1) In line 179 (tools\robocopy.exe %WINDIR%\Prefetch %c_drive%:\Data-%case%\%computername%-%timestamp%\preserved-prefetch-files\Prefetch\ /ZB /copy:DTSOU /r:4 /w:1 /ts /FP /np /log:%c_drive%:\Data-%case%\%computername%-%timestamp%\preserved-prefetch-files\pretch-robocopy-log.txt)) you are missing a robocopy copy parameter and you have an unneeded parenthesis at the end of the command. It should be: tools\robocopy.exe %WINDIR%\Prefetch %c_drive%:\Data-%case%\%computername%-%timestamp%\preserved-prefetch-files\Prefetch\ /ZB /copy:DATSOU /r:4 /w:1 /ts /FP /np /log:%c_drive%:\Data-%case%\%computername%-%timestamp%\preserved-prefetch-files\pretch-robocopy-log.txt
2) In line 271 it should be: tools\pv.exe -e >> %vol_outpath%\ProcessInfo_2_process-to-exe-mapping.txt not tools\pvc.exe -e >> %vol_outpath%\ProcessInfo_2_process-to-exe-mapping.txt
3) In lines 273-281, the CurrProcess tool runs as CProcess.exe (when downloaded), not currprocess.exe
One thing missing is the registry, and a second is correct language redirection when the system has a foreign codepage (e.g. Greek), as in that case the results show correctly in DOS but not in the created text files.
Keep up the good work,
A
@anon
Thanks for the corrections; I will include them as I update the script. This version of the script was more focused on collecting volatile data. As a result, non-volatile data was not collected such as registry files. I am updating the non-volatile part so it will collect most items people will want for performing triage.
It was my "duty" to provide you with corrections since I found these errors. I am working on your script a few days now and I have added functionality to collect log and registry files, scheduled tasks, peripherals, installed printers, usb devices history, user logons and internet activity forensic artifacts (by using mostly Nirsoft.net tools). My goal was to upgrade the toolkit to ensure that it contains as much digital forensics artifacts as possible.
Since you are updating the non-volatile part maybe it would be a good idea to exchange ideas and publish together a new version of the script with both our changes combined. If you find it a good idea also you can contact me by e-mail. :)
A