Linkz for Tools

Wednesday, August 15, 2012 Posted by Corey Harrell 2 comments
In this Linkz edition I’m mentioning write-ups discussing tools. A range of items is covered, from the registry to malware to jump lists to timelines to processes.

RegRipper Updates


Harlan has been pretty busy updating RegRipper. First RegRipper version 2.5 was released, then there were some changes to where RegRipper is hosted, along with some nice new plugins. Check out Harlan’s posts for all the information. I wanted to touch on a few of the updates, though. The updates to RegRipper include the ability to run directly against volume shadow copies and to parse big data. The significance of parsing big data is apparent in his new plugin that parses the shim cache, which is an awesome artifact (link up next). Another excellent addition to RegRipper is the shellbags plugin, since it parses Windows 7 shell bags. Harlan’s latest post Shellbags Analysis highlights the forensic significance of shell bags and why one may want to look at the information they contain. I think these are awesome updates; now one tool can be used to parse registry data where it used to take three separate tools.

Not to be left out, the community has been submitting some plugins as well. To mention only a few, Hal Pomeranz provided some plugins to extract PuTTY and WinSCP information and Elizabeth Schweinsberg added plugins to parse different Run keys. The latest RR plugin download has the plugins submitted by the community. Seriously, if you use RegRipper and haven’t checked out any of these updates then what are you waiting for?

Shim Cache


Mandiant’s post Leveraging the Application Compatibility Cache in Forensic Investigations explained the forensic significance of the Windows Application Compatibility Database. Furthermore, Mandiant released the Shim Cache Parser script to parse the AppCompatCache registry key in the System hive. The post, script, and information Mandiant released speak for themselves. Plain and simple, it rocks. So far the shim cache has been valuable for me on fraud and malware cases. Case in point: on malware cases, programs sometimes execute on a system without the usual program execution artifacts (such as prefetch files) showing it. I see this pretty frequently with downloaders, which are programs whose sole purpose is to download and execute additional malware. The usual program execution artifacts may not show these programs running, but the shim cache has been a gold mine. Not only did it reflect the downloaders executing, but the information provided more context to the activity I saw in my timelines. What’s even cooler than the shim cache? Well, there are now two different programs that can extract the information from the registry.

Searching Virus Total


Continuing on with the malware topic, Didier Stevens released a virustotal-search program. The program searches for VirusTotal reports using a file’s hash (MD5, SHA1, or SHA256) and produces a CSV file showing the results. One cool thing about the program is that it only performs hash searches against VirusTotal, so a file never gets uploaded. I see numerous uses for this program since it accepts a file containing a list of hashes as input. One way I’m going to start using virustotal-search is for malware detection. One area I tend to look at for malware and exploits is the temporary folders in user profiles. It wouldn’t take much to search those folders for any files with an executable, Java archive, or PDF file signature, then perform a search on each file’s hash to determine if VirusTotal detects it as malicious. Best of all, this entire process could be automated and run in the background as you perform your examination.
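As a sketch of that automation, the hash-list half could look something like this. The starting folder and the signature list are my own illustrative assumptions, not part of virustotal-search; the script itself just wants one hash per line as input.

```python
import hashlib
from pathlib import Path

# Magic bytes for the file types mentioned above (MZ executables, zip-based
# Java archives, PDFs). This short list is an assumption for illustration.
SIGNATURES = (b"MZ", b"PK", b"%PDF")

def matches_signature(path: Path) -> bool:
    """True if the file starts with one of the magic byte sequences."""
    try:
        with path.open("rb") as f:
            header = f.read(4)
    except OSError:
        return False
    return any(header.startswith(magic) for magic in SIGNATURES)

def hash_candidate_files(folder: str) -> list:
    """MD5 hashes of matching files, one per line for virustotal-search input."""
    return [
        hashlib.md5(p.read_bytes()).hexdigest()
        for p in sorted(Path(folder).rglob("*"))
        if p.is_file() and matches_signature(p)
    ]

if __name__ == "__main__":
    # Hypothetical starting point; in practice point this at the temporary
    # folders in the user profiles of interest.
    start = Path(r"C:\Users")
    if start.exists():
        Path("hashes.txt").write_text("\n".join(hash_candidate_files(str(start))))
```

The resulting hashes.txt can then be handed to virustotal-search while the rest of the examination continues.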

Malware Strings


Rounding out my linkz about malware related tools comes from the Hexacorn blog. Adam released Hexdive version 0.3. In Adam’s own words the concept behind Hexdive is to “extract a subset of all strings from a given file/sample in order to reduce time needed for finding ‘juicy’ stuff – meaning: any string that can be associated with a) malware b) any other category”. Using Hexdive makes reviewing strings so much easier. You can think of it as applying a filter across the strings to initially see only the relevant ones typically associated with malware. Afterwards, all of the strings can be viewed using something like BinText or Strings. It’s a nice data reduction technique and is now my go-to tool when looking at strings in a suspected malicious file.
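The data-reduction idea is easy to approximate in a few lines. The keyword list below is a made-up stand-in, not Hexdive's actual dictionary, which is far larger and better curated:

```python
# A few strings typically associated with malware. Hexdive ships its own
# keyword sets; this short tuple is only an illustration of the concept.
JUICY = ("http://", "createremotethread", "cmd.exe", "regsetvalue", "\\temp\\")

def filter_strings(strings):
    """Data reduction: keep only strings containing a 'juicy' keyword."""
    return [s for s in strings if any(k in s.lower() for k in JUICY)]

extracted = [
    "GetProcAddress",             # common, boring import
    "http://bad.example/dl.php",  # a URL worth a second look
    "cmd.exe /c del %0",          # self-deletion one-liner
]
print(filter_strings(extracted))
# → ['http://bad.example/dl.php', 'cmd.exe /c del %0']
```

The full strings output is still there when you need it; the filtered view just decides where to look first.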

Log2timeline Updates


Log2timeline has been updated a few times since I last spoke about it on the blog. The latest release is version 0.64. There have been quite a few updates ranging from small bug fixes to new input modules to changing the output of some modules. To see all the updates check out the changelog.

Most of the time when I see people reference log2timeline they are creating timelines using either the default module lists (such as winxp) or log2timeline-sift. Everyone does things differently and there is nothing wrong with these approaches. Personally, though, neither approach exactly meets my needs. The majority of the systems I encounter have numerous user profiles stored on them, which means these profiles contain files with timestamps log2timeline extracts. Running a default module list (such as winxp) or log2timeline-sift against all the user profiles is an issue for me. Why should I include timeline data for all user accounts instead of the one or two user profiles of interest? Why include the Internet history for 10 accounts when I only care about one user? Not only does it take additional time for timeline creation but it results in a lot more data than I need, thus slowing down my analysis. I take a different approach; an approach that better meets my needs for all types of cases.

I narrow my focus down to specific user accounts. Either I confirm who the person of interest is, which tells me which user profiles to examine, or I check the user profile timestamps to determine which ones to focus on. What exactly does this have to do with log2timeline? The answer lies in the -e switch, which excludes files or folders. The -e switch can be used to exclude all the user profiles I don’t care about. Ten user profiles, only two of interest, and only one log2timeline command to run? No problem with the -e switch. To illustrate, let’s say I’m looking at the Internet Explorer history on a Windows 7 system with five user profiles: corey, sam, mike, sally b, and alice. I only need to see the browser history for the corey user account but I don’t want to run multiple log2timeline commands. This is where the -e switch comes into play, as shown below:

log2timeline.pl -z local -f iehistory -r -e Users\\sam,Users\\mike,"Users\\sally b",Users\\alice,"Users\\All Users" -w timeline.csv C:\

The exclusion switch eliminates anything whose path contains the text passed to it. I could have used sam instead of Users\\sam, but then I might skip some important files, such as anything containing the text “sam”. Using a file path limits the amount of data that is skipped but will still eliminate any file or folder that falls within those user profiles (actually, anything under the C root directory whose path contains the text Users\username). Notice the use of the double backslashes (\\) and the quotes; these are needed for the command to work properly. What’s the command’s end result? The Internet history from every profile stored in the Users folder except for the sam, mike, sally b, alice, and All Users profiles is parsed. I know most people don’t run multiple log2timeline commands when generating timelines since they only pick one of the default module lists. Taking the same scenario where I’m only interested in the corey user account on a Windows 7 box, check out the command below. It will parse every Windows 7 artifact except for the excluded user profiles (note the exclusion also affects the filesystem metadata for those accounts if the MFT is parsed).

log2timeline.pl -z local -f win7 -r -e Users\\sam,Users\\mike,"Users\\sally b",Users\\alice,"Users\\All Users" -w timeline.csv C:\

The end result is a timeline focused only on the user accounts of interest. Personally, I don't use the default module lists in log2timeline but I wanted to show different ways to use the -e switch.
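Since -e matches on substrings of the full path, its effect can be mimicked in a few lines of Python. The profile names below echo the hypothetical five-user scenario above; this is just a model of the matching behavior, not log2timeline's code:

```python
# Exclusions from the five-profile example; matching is plain substring search.
EXCLUDE = ("Users\\sam", "Users\\mike", "Users\\sally b",
           "Users\\alice", "Users\\All Users")

def keep(path: str) -> bool:
    """True if no exclusion string appears anywhere in the path."""
    return not any(ex in path for ex in EXCLUDE)

paths = [
    r"C:\Users\corey\AppData\Local\Microsoft\Windows\History\index.dat",
    r"C:\Users\corey\Desktop\samples.txt",  # note: contains the text "sam"
    r"C:\Users\sam\AppData\Local\Microsoft\Windows\History\index.dat",
]
# Excluding "Users\sam" keeps corey's samples.txt; excluding the bare
# string "sam" would have wrongly dropped it as well.
print([p for p in paths if keep(p)])
```

This is why the longer Users\\username form is the safer exclusion string.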

Time and Date Website


Daylight Savings Time does not start on the same day each year. One day I was looking around the Internet for a website showing the exact dates when previous Daylight Savings Time changes occurred, and I came across the timeanddate.com website. The site has some cool things. There’s a converter to change the date and time from one timezone to another. There’s a timezone map showing where the various timezones are located. A portion of the site even explains what Daylight Savings Time is. The icing on the cake is the world clock, where you can select any timezone to get additional information including the historical dates when Daylight Savings Time took effect. Here is the historical information for the Eastern Timezone for the period from 2000 to 2009. This is a useful site when you need to make sure your timestamps properly take Daylight Savings Time into consideration.
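Python's bundled timezone database carries the same historical DST rules, so a quick sanity check is easy. This uses US Eastern to match the example above and assumes Python 3.9+ for the zoneinfo module:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library as of Python 3.9

eastern = ZoneInfo("America/New_York")

# Same wall-clock time in different halves of the year: the UTC offset moves
# from -5 hours (EST) to -4 hours (EDT) while DST is in effect.
winter = datetime(2005, 1, 15, 12, 0, tzinfo=eastern)
summer = datetime(2005, 7, 15, 12, 0, tzinfo=eastern)

print(winter.tzname(), winter.utcoffset() == timedelta(hours=-5))  # EST True
print(summer.tzname(), summer.utcoffset() == timedelta(hours=-4))  # EDT True
```

A timestamp converted with the wrong assumed offset is silently an hour off, which is exactly the kind of error the historical tables on timeanddate.com help catch.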

Jump Lists


The day has finally arrived; over the past few months I’ve been seeing more Windows 7 systems than Windows XP. This means the artifacts available in the Windows 7 operating system are playing a greater role in my cases. One of those artifacts is jump lists, and Woanware released a new version of JumpLister, which parses them. The new version has the ability to parse the DestList data and performs a lookup on the AppID.

Process, Process, Process


Despite all the awesome tools people release, they won’t be much use if there isn’t a process in place to use them. I could buy the best saws and hammers but they would be worthless to me for building a house since I don’t know the process one uses to build a house. I see digital forensics tools in the same light, and in hindsight maybe I should have put these links first.

Lance is back blogging over at ForensicKB and he posted a draft of the Forensic Process Lifecycle. The lifecycle covers the entire digital forensic process from the preparation steps to triage to imaging to analysis to report writing. I think this one is a gem and it’s great to see others outlining a digital forensic process to follow.

If you live under a rock then this next link may be a surprise, but a few months back SANS released their Digital Forensics and Incident Response poster. The poster has two sides; one outlines various Windows artifacts while the other outlines the SANS process to find malware. The artifact side is great and makes a good reference hanging on the wall. However, I really liked seeing and reading about the SANS malware detection process since I’ve never had the opportunity to attend their courses or read their training materials. I highly recommend anyone get a copy of the poster (paper and/or electronic versions).

I’ve been slacking on updating my methodology page, but over the weekend I updated a few things. The most obvious is adding links to my relevant blog posts. The other, maybe less obvious, change is that I moved around some examination steps so they are more efficient for malware cases. The steps reflect the fastest process I’ve found yet to not only find malware on a system but to determine how it got there. Just an FYI, the methodology is not limited to malware cases since I use the same process for fraud and acceptable use policy violations.

Welcome to Year 2

Sunday, August 12, 2012 Posted by Corey Harrell 1 comments
This past week I was vacationing with my family when my blog surpassed another milestone. It has been around for two years and counting. Around my blog’s anniversary I like to reflect back on the previous year and look ahead at the upcoming one. Last year I set out to write about various topics including: investigating security incidents, attack vector artifacts, and my methodology. It shouldn’t be much of a surprise then when you look at the topics in my most read posts from the past year:

1. Dual Purpose Volatile Data Collection Script
2. Finding the Initial Infection Vector
3. Ripping Volume Shadow Copies – Introduction
4. Malware Root Cause Analysis
5. More About Volume Shadow Copies
6. Ripping VSCs – Practitioner Method

Looking at the upcoming year there’s a professional change impacting a topic I’ve been discussing lately. I’m not talking about a job change but an additional responsibility in my current position. My casework will now include a steady dose of malware cases. I’ve been hunting malware for the past few years so now I get to do it on a regular basis as part of my day job. I won’t directly discuss any cases (malware, fraud, or anything else) that I do for my employer. However, I plan to share the techniques, tools, or processes I use. Malware is going to continue to be a topic I frequently discuss from multiple angles in the upcoming year.

Besides malware and any other InfoSec or DFIR topics that have my interest, there are a few research projects on my to-do list. First and foremost is to complete my finding fraudulent documents whitepaper and scripts. The second project is to expand on my current research about the impact virtual desktop infrastructure will have on digital forensics. There are a couple of other projects I’m working on and in time I’ll mention what those are. Just a heads up, at times I’m going to be focusing on these projects so expect some time periods when there isn’t much activity with the blog. As usual, my research will be shared either through my blog or another freely available resource to the DFIR community.

Again, thanks to everyone who links back to my blog and/or publicly discusses any of my write-ups. Each time I come across someone who says that something I wrote helped them in some way, it makes all the time and work I put into the blog worth the effort. Without people forwarding along my posts, others might never learn about information that could help them. For this I’m truly grateful. I couldn’t end a reflection post without thanking all the readers who stop by jIIr. Thank you, and you won’t be disappointed with what I’m gearing up to release over the next year.

Malware Root Cause Analysis

Sunday, July 29, 2012 Posted by Corey Harrell 6 comments
The purpose of performing root cause analysis is to find the cause of a problem. Knowing a problem’s origin makes it easier to take steps to either resolve the problem or lessen the impact the next time it happens. Root cause analysis can be conducted on a number of issues; one happens to be malware infections. Finding the cause of an infection reveals what security controls broke down to let the malware infect the computer in the first place. In this post I’m expanding on my Compromise Root Cause Analysis Model by showing how a malware infection can be modeled with it.

Compromise Root Cause Analysis Revisited


The Compromise Root Cause Analysis Model is a way to organize information and artifacts to make it easier to answer questions about a compromise. The attack artifacts left on a network and/or computer fall into one of these categories: source, delivery mechanism, exploit, payload, and indicators. The relationship between the categories is shown in the image below.


I’m only providing a brief summary of the model; for more detailed information see the post Compromise Root Cause Analysis Model. At the model’s core is the source of the attack; this is where the attack came from. The delivery mechanisms are for the artifacts associated with the exploit and payload being sent to the target. Lastly, the indicators category is for the post-compromise activity artifacts. The only thing that needs to be done to use the model during an examination is to organize any relevant artifacts into these categories. I typically categorize every artifact I discover as an indicator until additional information makes me move it to a different category.
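As a sketch, that bookkeeping can be captured in a few lines. The category names come from the model; the CaseModel class and the sample artifact name are my own illustrative stand-ins, not anything from an actual tool:

```python
# The five categories from the Compromise Root Cause Analysis Model.
CATEGORIES = ("source", "delivery mechanism", "exploit", "payload", "indicators")

class CaseModel:
    def __init__(self):
        self.artifacts = {c: [] for c in CATEGORIES}

    def add(self, artifact, category="indicators"):
        """New artifacts default to indicators until more is known."""
        self.artifacts[category].append(artifact)

    def move(self, artifact, to_category):
        """Re-categorize an artifact once new information firms up its role."""
        for items in self.artifacts.values():
            if artifact in items:
                items.remove(artifact)
        self.artifacts[to_category].append(artifact)

case = CaseModel()
case.add("suspicious.exe")              # first seen: treat as an indicator
case.move("suspicious.exe", "payload")  # later identified as dropped malware
print(case.artifacts["payload"])        # → ['suspicious.exe']
```

Whether it lives in a script or a notes file, the point is the same: every artifact sits in exactly one category, and moving one is a deliberate act backed by evidence.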

Another Day Another Java Exploit


I completed this examination earlier in the year but I thought it made a great case to demonstrate how to determine a malware infection’s origin by using the Root Cause Analysis Model. The examination was kicked off when someone saw visual indicators on their screen that their computer was infected. My antivirus scan against the powered down computer confirmed there was an infection as shown below.


The antivirus scan flagged four files as being malicious. Two binaries (fuo.exe and 329991.exe) were identified as the threat: Win32:MalOb-GR[Cryp]. One cached webpage (srun[1].htm) was flagged as JS:Agent-PK[Trj] while the ad_track[1].htm file was identified as HTML:RedirME-inf[Trj]. A VirusTotal search on the fuo.exe file’s MD5 hash provided more information about the malware.

I mentally categorized the four files as indicators of the infection until proven otherwise. The next examination step that identified additional artifacts was timeline analysis, because it revealed what activity was occurring on the system around the time the malware appeared. A search for the files fuo.exe and 329991.exe brought me to the portion of the timeline shown below.


The timeline showed the fuo.exe file was created on the computer after the 329991.exe file. There were also indications that Java executed; the hsperfdata_username file was modified, which is one artifact I highlighted in my Java exploit artifact research. Next I looked at the activity on the system before the fuo.exe file appeared, which is shown below.


The timeline confirmed Java did in fact execute, as can be seen by the modification made to its prefetch file. The file 329991.exe was created on the system at 1/15/2012 16:06:22, which was one second after a jar file appeared in the anon user profile’s temporary folder. This activity resembled exactly how an infection looks when a Java exploit is used to download malware onto a system. However, additional legwork was needed to confirm my theory. Taking one look at the jar_cache8544679787799132517.tmp file in JD-GUI was all I needed. The picture below highlights three separate areas in the jar file.


The first area, labeled 1, shows a string being built where the temp folder (str1) is prepended to 329991.exe. The second area, labeled 2, shows the InputStream function setting the URL to read from while the FileOutputStream function writes the data to a file, which happens to be str3. Remember that str3 contains the string 329991.exe located inside the temp folder. The last highlighted area, labeled 3, is where the Runtime function runs the newly created 329991.exe file. The analysis of the jar file confirmed it was responsible for downloading the first piece of malware onto the system. VirusTotal showed that only 8 out of 43 scanners identified the file as a CVE-2010-0840 Java exploit. (For another write-up about how to examine a Java exploit, refer to the post Finding the Initial Infection Vector.) At this point I mentally categorized all of the artifacts associated with Java executing and the Java exploit under the exploit category. The new information made me move 329991.exe from the indicator to the payload category since it was the payload of the attack.

I continued working the timeline by looking at the activity on the system before the Java exploit (jar_cache8544679787799132517.tmp) appeared. I noticed there was a PrivacIE entry for a URL ending in ad_track.php. PrivacIE entries are for third-party content on websites, and this particular URL was interesting because Avast flagged the cached webpage ad_track[1].htm. I started tracking the URLs in an attempt to identify the website that served up the third-party content. I didn’t need to identify the website per se since I had already reached my examination goal, but it was something I personally wanted to know. I gave up after spending about 10 minutes working my way through a ton of Internet Explorer entries and temporary Internet files for advertisements.


I answered the “how” question, but I wanted to make sure the attack only downloaded the two pieces of malware I had already identified. I went back in the timeline to when the fuo.exe file was created on the system and looked to see if any other files were created, but the only activity I really saw involved the antivirus software installed on the system.


Modeling Artifacts


The examination identified numerous artifacts and information about how the system was compromised. The diagram below shows how the artifacts are organized under the Compromise Root Cause Analysis Model.


As can be seen in the picture above, the examination did not confirm all of the information about the attack. However, categorizing the artifacts helped make it possible to answer the question of how the system became infected. It was a patching issue that resulted in an extremely vulnerable Java version running on the system. In the end not only did another person get their computer cleaned but they also learned about the importance of installing updates on their computer.


Usual Disclaimer: I received permission from the person I assisted to discuss this case publicly.

Combining Techniques

Wednesday, July 11, 2012 Posted by Corey Harrell 3 comments
“You do intrusion and malware investigations, we do CP and fraud cases” is a phrase I’ve seen Harlan mention a few times on his blog. To me the phrase is more about how different the casework is; about how different the techniques are for each type of case. Having worked both fraud and malware cases, I prefer to focus on what each technique has to offer as opposed to their differences: how parts of one technique can be beneficial to different types of cases, and how learning just a little bit about a different technique can pay big dividends by improving the knowledge, skills, and process you use on your current cases. To illustrate, I wanted to contrast techniques for malware and fraud cases to show how they help one another.

Program Execution


Malware eventually has to run, and when it executes there may be traces left on the system. Understanding program execution and where these artifacts are located is a valuable technique for malware cases. Examining artifacts containing program execution information is a quick way to find suspicious programs. One such artifact is prefetch files. To show their significance I’m parsing them with Harlan’s updated pref.pl script. Typically I start examining prefetch files by first looking at what executables ran on a system and where they executed from. I saw the following suspicious programs in the output from the command “pref.exe -e -d Prefetch-Folder”.

TMP77E.EXE-02781D7C.pf Last run: Fri Mar 12 16:29:05 2010 (1)
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\TMP77E.EXE

TMPDC7.EXE-2240CBB3.pf Last run: Fri Mar 12 16:29:07 2010 (1)
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\TMPDC7.EXE

UPDATE.EXE-0825DC41.pf Last run: Fri Mar 12 16:28:57 2010 (1)
\DEVICE\HARDDISKVOLUME1\DOCUMENTS AND SETTINGS\ADMINISTRATOR\DESKTOP\UPDATE.EXE

ASD3.TMP.EXE-26CA54B1.pf Last run: Fri Mar 12 16:34:49 2010 (1)
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\ASD3.TMP.EXE

ASD4.TMP.EXE-2740C04A.pf Last run: Fri Mar 12 16:34:50 2010 (1)
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\ASD4.TMP.EXE

DRGUARD.EXE-23A7FB3B.pf Last run: Fri Mar 12 16:35:26 2010 (2)
\DEVICE\HARDDISKVOLUME1\PROGRAM FILES\DR. GUARD\DRGUARD.EXE

ASD2.TMP.EXE-2653E918.pf Last run: Fri Mar 12 16:34:27 2010 (1)
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\ASD2.TMP.EXE

ASR64_LDM.EXE-3944C1CE.pf Last run: Fri Mar 12 16:29:06 2010 (1)
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\ASR64_LDM.EXE

These programs stood out for a few reasons. First, the path they executed from was a temporary folder in a user’s profile; unusual file paths are one way to spot malware on a system. Second, some of the programs only executed once. This behavior resembles how downloaders and droppers work; their sole purpose is to execute once to either download additional malware or install malware. The last dead giveaway is they all executed within a few minutes of each other. The first sweep across the prefetch files netted some interesting programs that appeared to be malicious. The next thing to look at is the individual prefetch files to see what file handles were open when each program ran. The TMP77E.EXE-02781D7C.pf prefetch file showed something interesting, as shown below (the command used was “pref.pl -p -i -f TMP77E.EXE-02781D7C.pf”).

EXE Name : TMP77E.EXE
Volume Path : \DEVICE\HARDDISKVOLUME1
Volume Creation Date: Fri Nov 2 08:56:57 2007 Z
Volume Serial Number: 6456-B1FD

\DEVICE\HARDDISKVOLUME1\WINDOWS\SYSTEM32\NTDLL.DLL
\DEVICE\HARDDISKVOLUME1\WINDOWS\SYSTEM32\KERNEL32.DLL
\DEVICE\HARDDISKVOLUME1\WINDOWS\SYSTEM32\UNICODE.NLS
*****snippet*****


EXEs found:

\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\TMP77E.EXE
\DEVICE\HARDDISKVOLUME1\WINDOWS\SYSTEM32\NET.EXE
\DEVICE\HARDDISKVOLUME1\WINDOWS\SYSTEM32\SC.EXE
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\ASR64_LDM.EXE

DAT files found:

\DEVICE\HARDDISKVOLUME1\DOCUMENTS AND SETTINGS\ADMINISTRATOR\LOCAL SETTINGS\TEMPORARY INTERNET FILES\CONTENT.IE5\INDEX.DAT
\DEVICE\HARDDISKVOLUME1\DOCUMENTS AND SETTINGS\ADMINISTRATOR\COOKIES\INDEX.DAT
\DEVICE\HARDDISKVOLUME1\DOCUMENTS AND SETTINGS\ADMINISTRATOR\LOCAL SETTINGS\HISTORY\HISTORY.IE5\INDEX.DAT

Temp paths found:

\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\TMP77E.EXE
\DEVICE\HARDDISKVOLUME1\DOCUMENTS AND SETTINGS\ADMINISTRATOR\LOCAL SETTINGS\TEMPORARY INTERNET FILES\CONTENT.IE5\INDEX.DAT
\DEVICE\HARDDISKVOLUME1\DOCUMENTS AND SETTINGS\ADMINISTRATOR\LOCAL SETTINGS\TEMPORARY INTERNET FILES\CONTENT.IE5\M20M2OXX\READDATAGATEWAY[1].HTM
\DEVICE\HARDDISKVOLUME1\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\ASR64_LDM.EXE

The file handle portion of the pref.pl output was trimmed to make it easier to read, but I left in the intelligence provided by the script through its filters. The filters highlight file handles involving EXEs, DAT files, and temporary folders. The EXE filter shows, in addition to a handle to the TMP77E.EXE file, handles to SC.EXE, NET.EXE, and ASR64_LDM.EXE. SC.EXE is a Windows program for managing services, including creating new services, while NET.EXE is a Windows program for various tasks including starting services. ASR64_LDM.EXE was another suspicious program that ran on the system after TMP77E.EXE. The file handles inside each prefetch file of interest provided additional information, which is useful during a malware case.
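The triage tells described above (a temp-folder launch path, a run count of one, executions clustered in time) are straightforward to encode. The tuple format below is a hypothetical stand-in for parsed prefetch data, not pref.pl's actual output:

```python
from datetime import datetime, timedelta

def flag_prefetch(records, window=timedelta(minutes=10)):
    """Flag (path, run_count, last_run) records matching any heuristic above."""
    flagged = set()
    last_runs = [r[2] for r in records]
    for path, run_count, last_run in records:
        if "\\TEMP\\" in path.upper():    # launched from a temp folder
            flagged.add(path)
        elif run_count == 1:              # single run: downloader/dropper behavior
            flagged.add(path)
        elif any(t is not last_run and abs(last_run - t) <= window
                 for t in last_runs):     # clustered with another execution
            flagged.add(path)
    return flagged

records = [
    (r"\DOCUME~1\ADMINI~1\LOCALS~1\TEMP\TMP77E.EXE", 1,
     datetime(2010, 3, 12, 16, 29, 5)),
    (r"\DOCUMENTS AND SETTINGS\ADMINISTRATOR\DESKTOP\UPDATE.EXE", 1,
     datetime(2010, 3, 12, 16, 28, 57)),
    (r"\WINDOWS\SYSTEM32\NOTEPAD.EXE", 40,
     datetime(2010, 3, 10, 9, 0, 0)),
]
print(flag_prefetch(records))  # NOTEPAD.EXE is the only record not flagged
```

None of these heuristics proves anything on its own; they just rank which prefetch files deserve the closer file-handle review shown earlier.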

Program execution is vital for malware cases, and the same technique can apply to fraud cases. On fraud cases a typical activity is to identify and locate financial data. At times this can be done by running keyword searches, but most of the time (at least for me) the actual financial data is unknown. What I mean is that a system is provided and it’s up to you to determine what data is financial. This is where the program execution technique comes into play. The programs that ran on the system can be reviewed to provide leads about what kind of financial data may be present. Using the first sweep across the prefetch files, I located these interesting programs (the command used was “pref.exe -e -d Prefetch-Folder”). Note: the system being looked at is not from a fraud case but it still demonstrates how the data appears.

WINWORD.EXE-C91725A1.pf Last run: Tue Jul 10 16:42:26 2012 (42)
\DEVICE\HARDDISKVOLUME2\PROGRAM FILES\MICROSOFT OFFICE\OFFICE12\WINWORD.EXE

ACROBAT_SL.EXE-DC4293F2.pf Last run: Fri Jun 22 18:14:12 2012 (1)
\DEVICE\HARDDISKVOLUME2\PROGRAM FILES\ADOBE\ACROBAT 9.0\ACROBAT\ACROBAT_SL.EXE

EXCEL.EXE-C6BEF51C.pf Last run: Tue Jul 10 16:30:18 2012 (22)
\DEVICE\HARDDISKVOLUME2\PROGRAM FILES\MICROSOFT OFFICE\OFFICE12\EXCEL.EXE

POWERPNT.EXE-1404AEAA.pf Last run: Thu Jun 21 20:14:52 2012 (22)
\DEVICE\HARDDISKVOLUME2\PROGRAM FILES\MICROSOFT OFFICE\OFFICE12\POWERPNT.EXE

When I look at program execution for fraud cases, I look for financial applications, applications that can create financial data, and programs associated with data spoliation. This system didn’t have any financial or data spoliation programs, but there were office productivity applications capable of creating financial documents such as invoices, receipts, proposals, etc. These programs were Microsoft Office and Adobe Acrobat, which means the data created on the system is most likely Word, Excel, PowerPoint, or PDF files. The number of executions for each program is also interesting. I look to see which applications are heavily used since it’s a strong indication of what programs the subject uses. Notice Adobe only ran once while Word ran 42 times. Looking at the file handles inside individual prefetch files also reveals information relevant to a fraud case. Below are a few sanitized handles I found in the WINWORD.EXE-C91725A1.pf prefetch file (the command used was “pref.pl -p -i -f WINWORD.EXE-C91725A1.pf”).

EXE Name : WINWORD.EXE
Volume Path : \DEVICE\HARDDISKVOLUME2
Volume Creation Date: Fri Aug 26 18:13:26 2011 Z
Volume Serial Number: E4DD-S23A

\DEVICE\HARDDISKVOLUME2\WINDOWS\SYSTEM32\NTDLL.DLL
\DEVICE\HARDDISKVOLUME2\WINDOWS\SYSTEM32\KERNEL32.DLL
\DEVICE\HARDDISKVOLUME2\WINDOWS\SYSTEM32\APISETSCHEMA.DLL
*****snippet*****
\DEVICE\HARDDISKVOLUME4\FORENSICS\RESEARCH\Folder\SANTIZED.DOCX
\DEVICE\HARDDISKVOLUME4\FORENSICS\RESEARCH\Folder\~$SANTIZED.DOCX
\DEVICE\HARDDISKVOLUME2\USERS\USERNAME\APPDATA\LOCAL\TEMP\MSO4F30.TMP
\DEVICE\HARDDISKVOLUME4\FORENSICS\RESEARCH\Folder\SANTIZED 2.DOCX
\DEVICE\HARDDISKVOLUME4\FORENSICS\RESEARCH\Folder\~$SANTIZED 2.DOCX
\DEVICE\HARDDISKVOLUME2\USERS\USERNAME\APPDATA\LOCAL\TEMP\MSO7CF3.TMP
\DEVICE\HARDDISKVOLUME2\USERS\USERNAME\APPDATA\LOCAL\TEMP\20517251.OD
\DEVICE\HARDDISKVOLUME4\FORENSICS\RESEARCH\Folder\SANTIZED 3.DOCX
\DEVICE\HARDDISKVOLUME4\FORENSICS\RESEARCH\Folder\~$SANTIZED 3.DOCX
\DEVICE\HARDDISKVOLUME4\$MFT
\DEVICE\HARDDISKVOLUME2\USERS\USERNAME\APPDATA\LOCAL\MICROSOFT\WINDOWS\TEMPORARY INTERNET FILES\CONTENT.MSO\SANTIZED.JPEG
\DEVICE\HARDDISKVOLUME4\FORENSICS\RESEARCH\Folder\SANTIZED 3.DOCX:ZONE.IDENTIFIER
*****snippet*****

The file handles show that documents stored in the folder FORENSICS\RESEARCH\Folder\ on a different volume were accessed with Word. I think this is significant because not only does it provide filenames to look for, but it also shows another storage location the subject may have used. Wherever there is storage accessible to the subject, there’s a chance that is where they are storing some financial data. Also, notice in the output how the last line shows one of the documents was downloaded from the Internet (the Zone.Identifier alternate data stream).

User Activity


The program execution section showed how fraud cases benefit from a technique used in malware cases. Now let’s turn the tables to see how malware cases can benefit from fraud. As I mentioned before, most of the time I have to find where financial data is located, whether on the system or in network shares. The best approach I’ve found is to look at artifacts associated with user activity; specifically file, folder, and network share access. My reasoning is that if someone is suspected of using the computer to commit a fraud, then they will be accessing financial data from the computer to carry it out. Basically, I let their user activity show me where the financial data is located, and this approach works regardless of whether the data is in a hidden folder or stored on a network. There are numerous artifacts containing file, folder, and network share access, and one of them is link files. To show their significance I’m parsing them with TZWorks’ LNK Parser Utility. When I examine link files I parse both the Recent and Office/Recent folders. This results in some duplicates, but it catches link files found in one folder and not the other; I’ve seen people delete everything in the Recent folder while not realizing the Office\Recent folder exists. I saw some interesting target files, folders, and network shares by running the command “dir C:\Users\Username\AppData\Roaming\Microsoft\Windows\Recent\*.lnk /b /s | lp -pipe -csv > fraud_recent.txt”.

{CLSID_MyComputer}\E:\Forensics\Research\Folder\sanatized.doc
{CLSID_MyComputer}\E:\Forensics\Research\Folder\sanitized 2.doc
{CLSID_MyComputer}\E:\Forensics\Research\Folder\sanitized 3.doc
{CLSID_MyComputer}\C:\Atad\material\sanitized 1.pdf
{CLSID_MyComputer}\F:\Book1.xls
\\192.168.200.55\share\TR3Secure

The output has been trimmed (only the target file column is shown) and sanitized since it’s from one of my systems. The link files show that files and folders on removable media and a network share were accessed, in addition to a folder outside the user profile. I’ve used this same technique on fraud cases to figure out where financial data was stored. One time it was some obscure folder on the system while the next time it was a share on a server located on the network.
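The dedupe-across-both-Recent-folders step is easy to script. Below is a minimal Python sketch (the helper name and profile layout are my own illustration, not part of any tool) that gathers .lnk files from both the Recent and Office\Recent folders and drops duplicates by shortcut name before they are handed to a parser:

```python
from pathlib import Path

def collect_lnk_files(profile_dir):
    """Collect .lnk files from both Recent folders, deduplicated by name."""
    recent_dirs = [
        Path(profile_dir) / "AppData/Roaming/Microsoft/Windows/Recent",
        Path(profile_dir) / "AppData/Roaming/Microsoft/Office/Recent",
    ]
    seen, results = set(), []
    for d in recent_dirs:
        if not d.is_dir():
            continue
        for lnk in sorted(d.glob("*.lnk")):
            # dedupe: the same shortcut often exists in both folders
            if lnk.name.lower() not in seen:
                seen.add(lnk.name.lower())
                results.append(lnk)
    return results
```

The surviving list can then be fed to whatever LNK parser you prefer, so each shortcut is only parsed once even when it exists in both folders.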

Tracking user activity is a great way to locate financial data on fraud cases, and I saw how this same technique can apply to malware cases. On malware cases it can help answer the question of how the computer became infected. Looking at the user activity around the time of the initial infection can help shed light on what attack vector was used to compromise the system. Did the user access a network share, malicious website, removable media, email attachment, or peer to peer application? The user activity provides indications about what the account was doing that contributed to the infection. On the malware infected system there were only two link files in the Recent folder; shown below are the target create time and target name (the command used was “dir "F:\Malware_Recent\*.lnk" /b /s | lp -pipe -csv > malware_recent.txt”).

3/12/2010 16:17:04.640 {CLSID_MyComputer}\C:\downloads
3/12/2010 16:18:59.609 {CLSID_MyComputer}\C:\downloads\link.txt

These link files show the user account accessed the downloads folder and the link text file just before the suspicious programs started executing on the system. Looking at this user activity jogged my memory about how the infection occurred. I was researching a link from a SPAM email and purposely clicked the link on that system; I just never got around to actually examining it. Even though this system was infected on purpose, examining the user activity on the malware cases I’ve worked has helped answer the question of how the system became infected.
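Pivoting on the infection time like this is also easy to script. Here is a small Python sketch (the function is my own illustration; the timestamps are taken from the output above) that filters timeline records down to a window around a pivot time:

```python
from datetime import datetime, timedelta

def activity_near(events, pivot, window_hours=24):
    """Return (timestamp, target) records within +/- window of the pivot."""
    delta = timedelta(hours=window_hours)
    return [e for e in events if abs(e[0] - pivot) <= delta]

events = [
    (datetime(2010, 3, 12, 16, 17, 4), r"{CLSID_MyComputer}\C:\downloads"),
    (datetime(2010, 3, 12, 16, 18, 59), r"{CLSID_MyComputer}\C:\downloads\link.txt"),
    (datetime(2009, 1, 1, 9, 0, 0), "unrelated older shortcut"),
]
pivot = datetime(2010, 3, 12, 16, 20)  # approximate time suspicious programs ran
for ts, target in activity_near(events, pivot):
    print(ts.isoformat(), target)
```

Only the two downloads-folder records survive the filter; the unrelated 2009 activity falls outside the window.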

Closing Thoughts


DFIR has a lot of different techniques to deal with the casework we face. Too many times we tend to focus on the differences: the different tools, different processes, and different meanings of artifacts. Focusing on the differences distracts from seeing what the techniques have to offer, the parts that can strengthen our processes and make us better regardless of what case we are up against. If I hadn’t focused on what the techniques had to offer, I would have missed an opportunity: a chance to develop a better DFIR process by combining malware and fraud techniques; a process that I think is far better than if each technique stood on its own.

Metasploit The Penetration Testers Guide Book Review

Wednesday, July 4, 2012 Posted by Corey Harrell 0 comments

A penetration test is a method to locate weaknesses in an organization’s network by simulating how an attacker may circumvent the security controls. The Preface indicated Metasploit The Penetration Tester’s Guide was written specifically so “readers can become competent penetration testers”. The book goes on to describe a penetration tester as someone who is able to find ways in which a “hacker might be able to compromise an organization’s security and damage the organization as a whole”. I’ve occasionally seen people talk about the book favorably, but their comments related to penetration testing. I wanted to review Metasploit The Penetration Tester’s Guide from a different angle: the Digital Forensic and Incident Response (DFIR) perspective. As a DFIR professional it is important to not only understand the latest attack techniques but also to be aware of what artifacts are left by those techniques. This is the perspective I used when reviewing the book, and I walked away thinking: if you want to bring your Digital Forensic and Incident Response skills to the next level, then throw Metasploit in your toolbox and work your way through this book.

From Methodology to Basics to Exploitation


The book starts out discussing the various phases of a penetration test: pre-engagement interactions, intelligence gathering, threat modeling, exploitation, and post exploitation. After covering the methodology, an entire chapter is dedicated to Metasploit basics. I liked how the basics were covered before diving into the different ways to perform intelligence gathering with the Metasploit framework. Not only did the intelligence gathering chapter cover running scans using Metasploit’s built-in scanners, but it also discussed running scans with nmap and then building a database with Metasploit to store the nmap results. Before getting into exploitation, an entire chapter was dedicated to using vulnerability scanners (Nessus and Nexpose) to identify vulnerabilities in systems. After Chapter 4 the remainder of the book addresses exploitation and post-exploitation techniques. I liked how the book discussed simpler attacks before leading up to more advanced ones such as client-side, spear phishing, web, and SQL injection attacks. The book even covered advanced topics such as building your own Metasploit module and creating your own exploits. I think the book fulfilled the purpose for which it was designed: to “teach you everything from the fundamentals of the Framework to advanced techniques in exploitation”.

Prepare, Prepare, and Prepare


Appendix A in the book walks you through setting up some target machines, one of which is a vulnerable Windows XP box running a web server, a SQL server, and a vulnerable web application. Setting up the target machines means you can try out the attacks as you work your way through the book. I found it a better learning experience to try things out as I read about them. One additional benefit is that it provides you with a system to analyze: you can attack the system and then examine it afterwards to see what artifacts were left behind. I think this is a great way to prepare and improve your skills for investigating different kinds of compromises. Start out with simple attacks before proceeding to the more advanced ones.

This is where I think this book, along with Metasploit, can bring your skills to the next level. There are numerous articles about how certain organizations were compromised, but the articles never mention what artifacts were found indicating how the compromise occurred. Does the following story sound familiar? Media reports said a certain organization was compromised due to a targeted email that contained a malicious attachment. The reports never mentioned what incident responders should keep an eye out for, nor did they provide anything about how to spot this attack vector on a system. To fill in these gaps we can simulate the attack against a system to see for ourselves how it looks from a digital forensic perspective. The spear-phishing attack vector is covered on page 137, and the steps to conduct the attack are very similar to how those organizations were compromised. The simulated attacks don’t have to stop at spear phishing either: Java social engineering (page 142), client-side web exploits also known as drive-bys (page 146), web jacking (page 151), a multipronged attack (page 153), or pivoting onto other machines (page 89) are a few possibilities one could simulate against the target machines. It’s better to prepare ourselves to see these attacks in advance than to wait until we are tasked with analyzing a compromised system.

Where’s the Vulnerability Research


Metasploit The Penetration Tester’s Guide is an outstanding book and a great resource for anyone wanting to better understand how attacks work. However, there was one thing I felt the book was missing: the process to identify and research what vulnerabilities are present in the specific software you want to exploit. The book mentions how Metasploit exploits can be located by keyword searches, but it doesn’t go into detail about how to leverage online resources to help figure out which exploits to use. A search can be done online for a program/service name and version to list all discovered vulnerabilities in that program. There is also additional information explaining what a successful exploit may result in, such as remote code execution or a denial of service. This approach has helped me when picking which vulnerabilities to go after, and I thought a book trying to produce competent penetration testers would have at least mentioned it.

Four Star Review


If someone wants to know how to better secure a system, then they need to understand how the system can be attacked. If someone wants to know how to investigate a compromised system, then they need to understand how attacks work and what those attacks look like on a system. As DFIR professionals it is extremely important for us to be knowledgeable about different attacks and the artifacts those attacks leave behind. This way, when we are looking at a system or network, it’s easier to see what caused the compromise: a spear phish, drive-by, SQL injection, or some other attack vector. I think the following should be a standard activity for anyone wanting to investigate compromises: pick up Metasploit The Penetration Tester’s Guide, add Metasploit to your toolbox, and work your way through the material getting shells on test systems. You will not only gain a solid understanding of how attacks work but also pick up some pen testing skills along the way. Overall I gave Metasploit The Penetration Tester’s Guide four stars (4 out of 5).

Detect Fraud Documents 360 Slides

Friday, June 29, 2012 Posted by Corey Harrell 2 comments
I recently had the opportunity to attend the SANS Digital Forensics and Incident Response Summit in Austin, Texas. The summit was a great con, from the outstanding presentations to networking with others from the field. I gave a SANS 360 talk about my technique for finding fraudulent Word documents (I previously gave a preview of my talk). I wanted to release my slide deck for anyone who wants to use it as a reference before my paper is completed. You can grab it from my Google Sites page, listed as “SANs 360 Detect Frauduelent Word Documents.pdf”.

Those who were unable to attend the summit can still read all of the presentations. SANS updated their Community Summit Archives to include the Forensics and Incident Response Summit 2012. I highly recommend checking out the work shared by others; a lot of it was pretty amazing.

Computers Don’t Get Sick – They Get Compromised

Sunday, June 10, 2012 Posted by Corey Harrell 3 comments
Security awareness campaigns have done an effective job of educating people about malware. The campaigns have even reached the point where certain words trigger images in people’s minds. Say viruses, and a picture of a sick computer pops into their heads. Say worms, and there’s an image of a computer with critters and germs crawling all over it. People from all walks of life have been convinced malware is similar to real-life viruses, the kind that make things sick. This point of view can be seen at all levels: within organizations, among family members, from the average Joe who walks into the local computer repair shop to the person responsible for dealing with an infected computer. It’s no wonder that when malware ends up on a computer, people are more likely to think about the ER than CSI; more likely to do what they can to make the “sickness” go away than to figure out what happened. To me, people’s expectations about what to do and the actions most people take resemble how we deal with the common cold. The issue with this is that computers don’t get sick – they get compromised.

Security awareness campaigns need to move beyond imagery and associations showing malware as something that affects health. Malware should instead be called what it is: a tool. A tool someone is using in an effort to take something from us, our organizations, our families, and our communities; taking anything they can, whether it’s money, information, or computer resources. The burglar picture is a more accurate illustration of what malware is than any of the images showing a “sick” computer. It’s a tool in the hands of a thief, and this is the image we need people to picture when they hear the words malware, viruses, or worms. Those words need to be associated with tools used by criminals and hostile entities. Maybe then people’s expectations will change about malware on their computer or within their network. Malware is not something that needs to be “made better” with a trip to the ER but something we need to get to the bottom of to better protect ourselves. It’s not something that should be made to go away so things can go on as normal but something we need to get answers and intelligence from before moving forward. People need to associate malware with going home one day and finding burglary tools sitting in their living room. Seeing the tools in their house should make them ask what happened, how this occurred, and what was taken before they ask when they can continue on as normal.

Those entrusted with dealing with malware on computers and networks need to picture malware the same way. It’s not some cold where we keep throwing medicine at it (aka antivirus scan after antivirus scan). It’s a tool someone placed there to do a specific thing. Making the malware go away is not the answer; the same way that making the tools in the living room disappear doesn’t address the issue. Someone figured out a way to put a tool on the computer and/or network and it’s up to us to figure out how. The tool is a symptom of a larger issue and the intelligence we can learn from answering how the malware got onto a system can go a long way in better protecting the ones relying on our expertise. We need to perform analysis on the systems in order to get to the compromise’s root cause.

The approach going forward should not be to continue with the status quo: doing what it takes to make the “sickness” go away without any thought about answering the questions “how”, “when”, and “what”. Those tasked with dealing with a malware infected computer should no longer accept the status quo either, removing the malware without any investigative actions to determine the “how”, “when”, and “what”. The status quo views malware on a computer as if the computer is somehow “sick”. It’s time to change the status quo to reflect what malware on a computer actually is. The computer isn’t “sick” but compromised.

Compromise Root Cause Analysis Model

Sunday, June 3, 2012 Posted by Corey Harrell 3 comments
A common question runs through my mind every time I read an article about another targeted attack, a mass SQL injection attack, a new exploit being rolled into exploit packs, or a new malware campaign: how would the attack look on a system/network from a forensic perspective? What do the artifacts look like that indicate the attack method used? Unfortunately, most security literature doesn’t include this kind of information for people to use when responding to or investigating attacks. This makes it harder to see what evidence is needed to answer the “how” and “when” of the compromise. I started researching Attack Vector Artifacts so I could get a better understanding of the way different attacks appear on a system/network. What started out as a research project to fill the void I saw in security literature grew into an investigative model one can use to perform compromise root cause analysis; a model I’ve been successfully using for some time to determine how systems became infected. It’s been a while since I blogged about attack vector artifacts, so I thought what better way to revisit the subject than sharing my model.

DFIR Concepts


Before discussing the model I think it’s important to first touch on two key concepts: Locard’s Exchange Principle and Temporal Context. Both are central to understanding why attack vector artifacts exist and how to interpret them.

Locard’s Exchange Principle states that when two objects come into contact, something is exchanged from one to the other. Harlan’s writing about Locard’s Exchange Principle is what first exposed me to how this concept applies to the digital world. The principle is alive and well when an attacker goes after another system or network; the exchange will leave remnants of the attack on the systems involved. There is a transfer between the attacker’s systems and the targeted network/systems. An excellent example of Locard’s Exchange Principle occurring during an attack can be seen in my write-up Finding the Initial Infection Vector. The target system contained a wealth of information about the attack: there was malware, indications Java executed, Java exploits, and information about the website that served up the exploits. All this information was present because the targeted system came into contact with the web server used in the attack. The same will hold true for any attack; there will be a transfer of some sort between the systems involved, and the transfer will result in attack vector artifacts being present on the targeted network.

Temporal Context is the age/date of an object and its temporal relation to other items in the archaeological record (as defined by my Google search). As it relates to the digital world, temporal context means comparing a file or artifact to other files and artifacts in a timeline of system activity. When there is an exchange between the attacking and targeted systems, the information left on the targeted network will be related, and very closely related within a short timeframe. Temporal Context can also be seen in my post Finding the Initial Infection Vector. The attack activity I first identified was at 10/16/2011 6:50:09 PM and I traced it back to 10/8/2011 23:34:10. This was an eight-day timeframe in which all the attack vector artifacts were present on the system; a short timeframe when you consider the system had been in use for years. The timeframe grows even smaller when you only look at the attack vector artifacts (exploit, payload, and delivery mechanisms). Temporal Context will hold true for any attack, with the attack vector artifacts all occurring within a short timeframe.
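One way to apply Temporal Context programmatically is to cluster timestamped artifacts by proximity, so related attack activity groups together while unrelated background activity falls into separate clusters. A minimal Python sketch (the function, gap value, and third timestamp are my own illustration, not from any tool):

```python
from datetime import datetime, timedelta

def cluster(timestamps, gap=timedelta(hours=1)):
    """Group timestamps so that neighbours within `gap` land in one cluster."""
    timestamps = sorted(timestamps)
    groups, current = [], [timestamps[0]]
    for ts in timestamps[1:]:
        if ts - current[-1] <= gap:
            current.append(ts)
        else:
            groups.append(current)
            current = [ts]
    groups.append(current)
    return groups

# Two timestamps from the write-up plus one hypothetical related artifact;
# the attack artifacts sit close together, far from the later activity.
artifact_times = [
    datetime(2011, 10, 16, 18, 50, 9),   # first identified attack activity
    datetime(2011, 10, 8, 23, 34, 10),   # earliest attack artifact
    datetime(2011, 10, 8, 23, 40, 0),    # hypothetical related artifact
]
for group in cluster(artifact_times):
    print(len(group), "artifact(s) starting at", group[0])
```

The two tight clusters fall out immediately, which is exactly the temporal relationship an examiner looks for when tracing activity back to the initial infection.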

What Are Attack Vector Artifacts?


In my previous post (Attack Vector Artifacts) I explained what an attack vector is and the three components it consists of. I think it’s important to briefly revisit the definition and components. An attack vector is a path or means somebody can use to gain access to a network/system in order to deliver a payload or malicious outcome. Based on that definition, an attack vector can be broken down into the following components: delivery mechanisms, exploit, and payload. The diagram below shows their relationship.

Attack Vector Model

At the core is the inner delivery mechanism, which sends the exploit to the network or system. The mechanism can include email, removable media, network services, physical access, or the Internet. Each one of those delivery mechanisms will leave specific artifacts on a network and/or system indicating what initiated the attack.

The purpose of the inner delivery mechanism is to send the exploit. An exploit is something that takes advantage of a vulnerability. Vulnerabilities can be present in a range of items: from operating systems to applications to databases to network services. When vulnerabilities are exploited they leave specific artifacts on a network/system, and those artifacts can identify the weakness targeted. To see what I mean about exploit artifacts you can refer to the ones I documented for Java (CVE-2010-0840 Trusted Methods and CVE-2010-0094 RMIConnectionImpl), Adobe Reader (CVE-2010-2883 PDF Cooltype), Windows (Autoplay and Autorun and CVE 2010-1885 Help Center), and social engineering (Java Signed Applet Exploit Artifacts).

A successful exploit may result in a payload being sent to the network or system. This is what the outer delivery mechanism is for. If the payload has to be sent, then there may be artifacts showing this activity. One example can be seen in the system I examined in the post Finding the Initial Infection Vector: the Java exploit used the Internet to download the payload, and there could have been indications of this in logs on the network. I said “if the payload has to be sent” because there may be instances where the payload is part of the exploit. In those cases there won’t be any outer delivery artifacts.

The last component in the attack vector is the desired end result in any attack; to deliver a payload or malicious outcome to the network/system. The payload can include a number of actions ranging from unauthorized access to denial of service to remote code execution to escalation of privileges. The payload artifacts left behind will be dependent on what action was taken.
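The components described above can be summed up in a simple structure. This is only an illustrative sketch, populated with a hypothetical drive-by scenario built from examples mentioned elsewhere in this post:

```python
from dataclasses import dataclass

@dataclass
class AttackVector:
    inner_delivery: str  # sends the exploit (email, removable media, Internet...)
    exploit: str         # takes advantage of a vulnerability
    outer_delivery: str  # sends the payload, if the payload isn't part of the exploit
    payload: str         # the malicious outcome delivered

# Hypothetical drive-by example for illustration only
drive_by = AttackVector(
    inner_delivery="Internet (compromised web page)",
    exploit="Java CVE-2010-0840 (Trusted Methods)",
    outer_delivery="HTTP download initiated by the exploit",
    payload="downloader executable",
)
print(drive_by)
```

Filling in one of these per case is a quick way to check whether each component has supporting artifacts or is still an open question.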

Compromise Root Cause Analysis Model


To identify the root cause of an attack you need to understand the exploit, payload, and delivery mechanism artifacts uncovered during an examination. The Attack Vector Model does an outstanding job making sense of the artifacts by categorizing them. The model makes it easier to understand the different aspects of an attack and the artifacts created by it. I’ve been using the model for the past few years to get a better understanding of different types of attacks and their artifacts. Despite the model’s benefits for training and research purposes, I had to extend it for investigative purposes by adding two additional layers (source and indicators). The end result is the Compromise Root Cause Analysis Model shown below.


Compromise Root Cause Analysis Model

At the core of the model is the source of the attack: where the attack originated from. Attacks can originate from outside or within a network; it all depends on who the attacker is. An external source is anything residing outside the control of an organization or person; a few examples are malicious websites, malicious advertisements on websites, or email. On the opposite end are internal sources, which are anything within the control of an organization or person; a few examples include infected removable media and malicious insiders. The artifacts left on a network and its systems can be used to tell where the attack came from.

The second layer added was indicators. This layer is not only where the information and artifacts about how the attack was detected go, but it also encompasses all of the artifacts showing post compromise activity. A few examples of post compromise activity include downloading files, malware executing, network traversal, and data exfiltration. The layer is pretty broad because it groups the post compromise artifacts to make it easier to spot the attack vector artifacts (exploit, payload, and delivery mechanisms).
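Put together, the layers give you a worksheet for organizing what an examination turns up. A minimal Python sketch of that structure (the layer names come from the model; the artifact strings are hypothetical examples):

```python
LAYERS = ("source", "delivery", "exploit", "payload", "indicators")

def new_case():
    """An empty Compromise Root Cause Analysis worksheet."""
    return {layer: [] for layer in LAYERS}

case = new_case()
# Modeling is ongoing: artifacts usually land in indicators first and are
# promoted to a deeper layer once their role in the attack becomes clear.
case["indicators"].append("AV alert on downloader.exe, 10/16/2011")
case["payload"].append("downloader.exe created 10/8/2011 23:34:10")
case["exploit"].append("Java exploit .class file in browser cache")
```

Even a plain dictionary like this makes it obvious which layers still lack supporting artifacts as the examination proceeds.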

Compromise Root Cause Analysis Model In Action


The Compromise Root Cause Analysis Model is a way to organize information and artifacts to make it easier to answer questions about an attack; more specifically, to answer how and when the compromise occurred. Information and artifacts about the compromise are discovered by completing examination steps against any relevant data sources; the model is only used to organize what is identified. Modeling discovered information is an ongoing process during an examination. The approach is to start at the indicators layer and then proceed inward until the source is identified.

The first information to go in the indicators layer is what prompted the suspicion that an attack occurred. Was there an IDS alert, malware indications, or intruder activity? The next step is to start looking for post compromise activity. There will be more artifacts associated with the post compromise activity than there will be for the attack vector. While looking at the post compromise artifacts, keep an eye out for any attack vector artifacts; the key is to look for anything resembling a payload or exploit. Identifying the payload can be tricky because it may resemble an indicator. Take a malware infection as an example. At times an infection only drops one piece of malware on the system, while at other times a downloader is dropped which then downloads additional malware. In the first scenario the single malware is the payload, and any artifacts of the malware executing are also included in the payload layer. In the second scenario the additional malware is an indicator that an attack occurred; the downloader is the payload of the attack. See what I mean; tricky, right? Usually I put any identified artifacts in the indicators layer until I see payload and exploit information that makes me move them into the payload layer.

After the payload is identified, it’s time to start looking for the actual payload and its artifacts. The focus needs to be on the initial artifacts created by the payload; any additional artifacts created by the payload should be lumped into the indicators layer. To illustrate, let’s continue with the malware scenario. The single malware and any indications of it first executing on the system would be organized in the payload layer. However, if the malware continued to run on the system creating additional artifacts, such as modifying numerous files, then that activity goes into the indicators layer.

The outer delivery mechanism is pretty easy to spot once the payload has been identified: just look at the system activity around the time the payload appeared. This is the reason the payload layer only contains the initial payload artifacts; it makes it easier to see the layers beneath it.

Similar to the delivery mechanism, seeing the exploit artifacts is also done by looking at the system activity around the payload and what delivered the payload to the system. The exploit artifacts may include: the exploit itself, indications an exploit was used, and traces a vulnerable application was running on the system.

The inner delivery mechanism is a little tougher to see once the exploit has been identified. You may have a general idea about where the exploit came from, but actually finding evidence to confirm your theory requires some legwork. The process to find the artifacts is the same: look at the system activity around the exploit and inspect anything of interest.

The last thing to do is identify the source. Finding the source is done a little differently than finding the artifacts in the other layers: the previous layers require looking for artifacts, but the source layer involves spotting where the attack artifacts stop. Usually there is a flurry of activity showing the delivery mechanisms, exploit, and payload, and at some point the activity just stops. In my experience, this is where the source of the attack is located. As soon as the activity stops, everything before the last identified artifact should be inspected closely to see if the source can be found. In some instances the source can be spotted, while in others it’s not as clear. Even if the source cannot be identified, at least some avenues of attack can be ruled out based on the lack of supporting evidence. Lastly, if the source points to an internal system, then the whole root cause analysis process starts all over again.

Closing Thoughts


I’ve spent my personal time over the past few years helping combat malware. One thing I always do is determine the attack vector so I can at least provide sound security recommendations. I used the compromise root cause analysis process to answer the questions “how” and “when”, which enabled me to figure out the attack vector. Besides identifying the root of a compromise, there are a few other benefits to the process. It’s scalable, since it can be used against a single system or a network with numerous systems; artifacts located in different data sources across a network are organized to show the attack vector more clearly. Another benefit is that the process uses layers. Certain layers may not have any artifacts or could be missing artifacts, but the remaining ones can still be used to identify how the compromise occurred. This is one of the reasons I’ve been able to figure out how a system was infected even after people attempted to remove the malware, in the process destroying the most important artifacts showing how the malware got there in the first place.

Finding Fraudulent Documents Preview

Wednesday, May 16, 2012 Posted by Corey Harrell 2 comments
Anyone who looks at the topics I discuss on my blog may not easily see the kind of cases I frequently work at my day job. For the most part my blog is a reflection of my interests, the topics I’m trying to learn more about, and what I do outside of my employer. As a result, I don’t blog much about the fraud cases I support, but I’m ready to share a technique I’ve been working on for some time.

Next month I’m presenting at the SANS Forensics and Incident Response Summit being held in Austin, Texas. The summit dates are June 26 and 27. I’m one of the speakers in the SANS 360 slot, and the title of my talk is “Finding Fraudulent Word Documents in 360 Seconds” (here is the agenda). My talk is going to be a quick and dirty look at a technique I honed last year to find fraudulent documents. I’m writing a more detailed paper on the technique as well as a query script to automate finding these documents, but my presentation will cover the fundamentals: what I mean by fraudulent documents, types of frauds, Microsoft Word metadata, Corey’s guidelines, and the technique in action. Here’s a preview of what I hope to cover in my six minutes (subject to change once I put together the slides and figure out my timing).

What exactly are fraudulent documents? You need to look at the two words separately to see what I’m referring to. One definition for fraudulent is “engaging in fraud; deceitful”, while a definition for document is “a piece of written, printed, or electronic matter that provides information or evidence or that serves as an official record”. What I’m talking about is electronic matter that provides information or serves as an official record while engaging in fraud; in easier terms, electronic documents providing fake financial information. There are different types of fraud, which means there are different types of fraudulent documents. However, my technique is geared towards finding the electronic documents used to commit purchasing fraud and bid rigging.

There are a few different ways these frauds can be committed, but there are times when Microsoft Word documents are used to provide fake information. One example is an invoice for a product that was never purchased, used to conceal misappropriated money. As most of us know, electronic files contain metadata and Word documents are no different. There are values within a Word document’s metadata that provide strong indicators of whether the document is questionable. I did extensive testing to determine how these values change based on different actions taken against a document (Word versions 2000, 2003, and 2007). My testing showed the changes in the metadata are consistent for a given action. For example, if a Word document is modified, then specific values in the metadata change while other values remain the same.

I combined the information I learned from my testing with all the different fraudulent documents I’ve examined, and I noticed distinct patterns. These patterns can be leveraged to identify potentially fraudulent documents among electronic information. I’ve developed some guidelines to find these patterns in Word documents’ metadata. I’m not discussing the guidelines in this post since I’m saving them for my #DFIRSummit presentation and my paper. The last piece is tying everything together by doing a quick run-through of how the technique can quickly find fraudulent documents in a purchasing fraud. Something I’m hoping to include is my current work on automating the technique using a query script I’m writing and someone else’s work (I’m not mentioning who since it’s not my place).
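As a taste of the raw material such guidelines work from, here is a stdlib-only Python sketch that pulls the core metadata fields from a Word 2007 (.docx) file. Note this only covers the OOXML format; Word 2000/2003 binary .doc files store the same kinds of fields in OLE SummaryInformation streams, which need a different parser. The function name and field selection are my own illustration, not the guidelines themselves:

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml of an OOXML document
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

def core_metadata(docx_path):
    """Pull the core metadata fields most useful for vetting a document."""
    with zipfile.ZipFile(docx_path) as z:
        root = ET.fromstring(z.read("docProps/core.xml"))
    def text(tag):
        node = root.find(tag, NS)
        return node.text if node is not None else None
    return {
        "author": text("dc:creator"),
        "last_modified_by": text("cp:lastModifiedBy"),
        "created": text("dcterms:created"),
        "modified": text("dcterms:modified"),
        "revision": text("cp:revision"),
    }
```

Comparing fields like created versus modified and the revision counter across a batch of documents is where the kinds of patterns described above start to show.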

I’m pretty excited to finally have the chance to go to my first summit and there’s a great lineup of speakers. I was half joking on Twitter when I said it seems like the summit is the DFIR Mecca. Only half, because it’s pretty amazing to see who else will be attending.



More About Volume Shadow Copies

Tuesday, May 8, 2012 Posted by Corey Harrell 0 comments

CyberSpeak Podcast About Volume Shadow Copies


I recently had the opportunity to talk with Ovie about Volume Shadow Copies (VSCs) on his CyberSpeak podcast. It was a great experience to meet Ovie and see what it’s like behind the scenes. (I’ve never been on a podcast before, and I found out quickly how tough it is to explain something technical without visuals.) The CyberSpeak episode May 7 Volume Shadow Copies is online and in it we talk about examining VSCs. In the interview I mentioned a few different things about VSCs and I wanted to elaborate on a few of them. Specifically, I wanted to discuss running the RegRipper plugins to identify volumes with VSCs, using the Sift to access VSCs, comparing a user profile across VSCs, and narrowing down the VSC comparison reports with grep.

Determining Volumes with VSCs and What Files Are Excluded from VSCs


One of my initial steps on an examination is to profile a system so I can get a better idea about what I’m facing. The information I look at includes: basic operating system info, user accounts, installed software, networking information, and data storage locations. I do this by running RegRipper in a batch script to generate a custom report containing the information I want. I blogged about this previously in the post Obtaining Information about the Operating System and I even released my RegRipper batch script (general-info.bat). I made some changes to the batch script; specifically, I added the VSC plugins spp_clients.pl and filesnottosnapshot.pl. The spp_clients.pl plugin obtains the volumes monitored by the Volume Shadow Copy service, which is an indication of what volumes may have VSCs available. The filesnottosnapshot.pl plugin gets a list of files/folders that are not included in the VSCs (snapshots). The information these plugins provide is extremely valuable to know early in an examination since it impacts how I may do things.
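As a rough illustration (this is my sketch, not the actual general-info.bat), the same profiling run can be scripted around RegRipper's command-line tool, which takes a hive with -r and a plugin name with -p. The hive path and the compname/timezone plugins here are just example choices:

```python
def rip_commands(rip_path, system_hive, plugins):
    """Build one RegRipper command line per plugin (rip's -r/-p syntax)."""
    return [[rip_path, "-r", system_hive, "-p", plugin] for plugin in plugins]

# A couple of basic profiling plugins plus the two VSC plugins
# added to the batch script
plugins = ["compname", "timezone", "spp_clients", "filesnottosnapshot"]
commands = rip_commands("rip.exe", r"F:\Windows\System32\config\SYSTEM", plugins)
# Each entry can then be handed to subprocess.run() and the output
# appended to the custom report.
```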

While I’m talking about RegRipper, Harlan released RegRipper version 2.5 in his post RegRipper: Update, Road Map and further explained how to use the new RegRipper to extract info from VSCs in the excellent post Approximating Program Execution via VSC Analysis with RegRipper. RegRipper is an awesome tool and is one of the few tools I use on every single case. The new update lets RR run directly against VSCs, making it even better. That’s like putting bacon on top of bacon.

Using the Sift to Access VSCs


There are different ways to access VSCs stored within an image. Two potential ways are using EnCase with the PDE module or the VHD method. Some time ago Gerald Parsons contacted me about another way to access VSCs; he refers to it as the iSCSI Initiator Method. The method uses a combination of the Windows 7 iSCSI Initiator and the Sift workstation. I encouraged Gerald to do a write-up about the method but he was unable to due to time constraints. However, he said I could share the approach and his work with others. In this section of the post I’m only a ghost writer for Gerald Parsons, conveying the detailed information he provided me including his screenshots. I made one minor tweak, which is to provide additional information about how to access a raw image besides the E01 format.

Using the iSCSI Initiator Method requires a virtual machine running an iSCSI service (I used the Sift workstation inside VMware) and a host operating system running Windows 7. The method involves the following steps:

Sift Workstation Steps

1. Provide access to image in raw format
2. Enable the SIFT iSCSI service
3. Edit the iSCSI configuration file
4. Restart the iscsitarget service

Windows 7 Host Steps

5. Search for iSCSI to locate the iSCSI Initiator program
6. Launch the iSCSI Initiator
7. Enter the Sift IP Address and connect to image
8. Examine VSCs

Sift Workstation Steps


1. Provide access to image in raw format

A raw image needs to be available within the Sift workstation. If the forensic image is already in raw format and is not split then nothing else needs to be done. However, if the image is a split raw image or is in the E01 format then one of the following commands needs to be used so a single raw image is available.

Split raw image:

sudo affuse path-to-image mount_point

E01 format:

sudo mount_ewf.py path-to-image mount_point

2. Enable the SIFT iSCSI service

By default in Sift 2.1 the iSCSI service is turned off, so it needs to be turned on. The false value in the /etc/default/iscsitarget configuration file needs to be changed to true. The command below uses the Gedit text editor to accomplish this.

sudo gedit /etc/default/iscsitarget

(Change “false” to “true”)


3. Edit the iSCSI configuration file

The iSCSI configuration file needs to be edited so it points to your raw image. Edit the /etc/ietd.conf configuration file by performing the following (the command opens the config file in the Gedit text editor):

sudo gedit /etc/ietd.conf

Comment out the following line by adding the # symbol in front of it:

Target iqn.2001-04.com.example:storage.disk2.sys1.xyz

Add the following two lines (the date portion, 2011-04, can be whatever you want, but make sure the Path value points to your raw image):

Target iqn.2011-04.sift:storage.disk
Lun 0 Path=/media/path-to-raw-image,Type=fileio,IOMode=ro


4. Restart the iscsitarget service

Restart the iSCSI service with the following command:

sudo service iscsitarget restart


Windows 7 Host Steps


5. Search for iSCSI to locate the iSCSI Initiator program

Search for the Windows 7 built-in iSCSI Initiator program


6. Launch the iSCSI Initiator

Run the iSCSI Initiator program

7. Enter the Sift IP Address and connect to image

The Sift workstation will need a valid IP address and the Windows 7 host must be able to reach the Sift using it. Enter the Sift’s IP address then select Quick Connect.


A status window should appear showing a successful connection.


8. Examine VSCs

Windows automatically mounts the forensic image’s volumes to the host after a successful iSCSI connection to the Sift. In my testing it took about 30 seconds for the volumes to appear once the connection was established. The picture below shows Gerald’s host system with two volumes from the forensic image mounted.


If there are any VSCs on the mounted volumes then they can be examined with your method of choice (cough cough Ripping VSCs). Gerald provided additional information about how he leverages Dave Hull’s Plotting photo location data with Bing and Cheeky4n6Monkey’s Diving in to Perl with GeoTags and GoogleMaps to extract metadata from the images in all the VSCs to create maps. He extracts the metadata by running the programs from the Sift against the VSCs.

Another cool thing about the iSCSI Initiator Method (besides being another free solution to access VSCs) is the ability to access the Sift iSCSI service from multiple computers. In my test I connected a second system on my network to the Sift iSCSI service while my Windows 7 host was connected to it. I was able to browse the image’s volumes and access the VSCs at the same time from my host and the other system on the network. Really cool. When finished examining the volumes and VSCs you can disconnect the iSCSI connection (in my testing it took about a minute to completely disconnect).


Comparing User Profile Across VSCs


I won’t repeat everything I said in the CyberSpeak podcast about my process to examine VSCs and how I focus on the user profile of interest. Focusing on the user profile of interest within VSCs is very powerful because it can quickly identify interesting files and highlight which files and folders a user accessed. Comparing a user profile, or any folder, across VSCs is pretty simple to do with my vsc-parser script and I wanted to explain how to do this.

The vsc-parser script is written to compare the differences between entire VSCs. In some instances this may be needed. However, if I’m interested in what specific users were doing on a computer then the better option is to only compare their user profiles across VSCs, since it’s faster and provides me with everything I need to know. You can do this by making two edits to the batch script that does the comparison. Locate the batch file named file-info-vsc.bat inside the vsc-parser folder as shown below.


Open the file with a text editor and find the function named :files-diff. The function executes diff.exe to identify the differences between VSCs. There are two lines (lines 122 and 129) that need to be modified so the file path reflects the user profile. As can be seen in the picture below the script is written to use the root of the mounted image (%mount-point%:\) and VSCs (c:\vsc%%f and c:\vsc!f!).


These paths need to be changed so they reflect the user profile location. For example, let's say we are interested in the user profile named harrell. Both lines just need to be changed to point to the harrell user profile. The screenshot below now shows the updated script.


When the script executes diff.exe, the comparison reports are placed into the Output folder. The picture below shows the reports for comparing the harrell user profile across 25 VSCs.
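For a rough sense of what the comparison produces, here is a Python stand-in of my own (vsc-parser itself uses diff.exe, not this code) that recursively diffs two copies of the same profile, such as a folder on the live volume and the same folder in a mounted VSC:

```python
import filecmp

def profile_changes(old_profile, new_profile):
    """Recursively collect files added, removed, or changed between two
    copies of the same user profile (e.g. two mounted VSCs).
    Note: filecmp's default comparison is shallow (size/mtime based)."""
    added, removed, changed = [], [], []

    def walk(cmp, rel=""):
        added.extend(rel + name for name in cmp.right_only)
        removed.extend(rel + name for name in cmp.left_only)
        changed.extend(rel + name for name in cmp.diff_files)
        for name, sub in cmp.subdirs.items():
            walk(sub, rel + name + "/")

    walk(filecmp.dircmp(old_profile, new_profile))
    return added, removed, changed
```

Pointing old_profile at the VSC copy and new_profile at the current volume (or at a newer VSC) gives the same added/removed/changed picture the diff.exe reports provide.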


Reducing the VSCs Comparison Reports


When comparing a folder such as a user profile across VSCs there will be numerous differences that are not relevant to your case. One example could be the activity associated with Internet browsing. The picture below illustrates this by showing the report comparing VSC12 to VSC11.


The report showing the differences between VSC12 and VSC11 had 720 lines. Looking at the report you can see there are a lot of lines that are not important. A quick way to remove them is to use grep.exe with the -v switch to only display non-matching lines. I wanted to remove the lines in my report involving Internet activity. The folders I wanted to get rid of were: Temporary Internet Files, Cookies, Internet Explorer, and History.IE5. I also wanted to get rid of the activity involving the AppData\LocalLow\CryptnetUrlCache folder. The command below shows how I stacked my grep commands to remove these lines and saved the output into a text file named reduced_files-diff_vsc12-2-vsc11.txt.

grep.exe -v "Temporary Internet Files" files-diff_vsc12-2-vsc11.txt | grep.exe -v Cookies | grep.exe -v "Internet Explorer" | grep.exe -v History.IE5 | grep.exe -v CryptnetUrlCache > reduced_files-diff_vsc12-2-vsc11.txt

I reduced the report from 720 lines to 35. It’s good practice to look at the report again to make sure no obvious lines were missed before running the same command against the other VSC comparison reports. Stacking grep commands to reduce the amount of data to look at makes it easier to spot items of potential interest such as documents or Windows link files. It’s pretty easy to see that the harrell user account was accessing a Word document template, an image named staples, and a document named Invoice-#233-Staples-Office-Supplies in the reduced_files-diff_vsc12-2-vsc11.txt report shown below.
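To apply the same cleanup to all 25 comparison reports in one go, the stacked grep -v filters can also be expressed as a small Python loop. This is a sketch of my own that assumes the reports sit in an Output folder:

```python
import glob
import os

# The same patterns the stacked grep -v commands remove
NOISE = ["Temporary Internet Files", "Cookies", "Internet Explorer",
         "History.IE5", "CryptnetUrlCache"]

def reduce_report(lines, noise=NOISE):
    """Keep only the lines that match none of the noise patterns
    (the equivalent of chaining grep -v)."""
    return [line for line in lines if not any(p in line for p in noise)]

# Write a reduced_ copy of every comparison report in the Output folder
for report in glob.glob(os.path.join("Output", "files-diff_*.txt")):
    with open(report) as f:
        kept = reduce_report(f.readlines())
    with open("reduced_" + os.path.basename(report), "w") as f:
        f.writelines(kept)
```

The noise list is easy to grow as new irrelevant folders show up while eyeballing the reduced reports.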


I compare user profiles across VSCs because it’s a quick way to identify data of interest inside VSCs, regardless of whether that data is images, documents, user activity artifacts, email files, or anything else that may be stored inside a user profile or that a user account accessed.