Thursday, July 28, 2011

WFA 3/e

I've mentioned a couple of times, in this blog as well as in a couple of lists, that I'm working on completing Windows Forensic Analysis 3/e, and I thought it would be a good time to give a little bit of information regarding the book. 

First off, while the title includes "third edition", this edition is not one where if you purchased the second edition, you're out of luck.  Rather, the third edition is a companion book to the second edition, so you'll want to have both of them on your shelf (or Kindle).  Where a great deal of 2/e was focused on Windows XP, in 3/e I'm focusing primarily on Windows 7.

The third edition has 8 chapters, as follows:

1. Analysis Concepts - Seeing comments from those who've read DFwOST thus far, and seeing the mileage that Chris Pogue is getting from his Sniper Forensics presentations, it appears that there are a lot of analysts out there who like to hear about the concepts that drive analysis.  It's one thing to tell someone to do timeline analysis and talk about how, but I think that it's something else entirely to discuss why we do timeline analysis, as that's the difference between an analyst who creates a timeline, and one who has a reason, justification and analysis goal for creating a timeline.

2.  Live Response - With this chapter, I wanted to take something of a different approach; rather than writing yet another chapter that gives consultants hints on doing IR, I wanted to provide some thoughts as to how organizations can better prepare for those inevitable DFIR activities.  If 2011 thus far hasn't been enough of an example, maybe it's worth saying again...it's not a matter of if your organization will face a compromise, but when.  I would take that a step further and suggest that if you don't have visibility into your systems and infrastructure, you may have already been compromised.  As a consultant, the biggest issue I've seen during IR is the level of preparedness...there's a huge difference between companies that accept that incidents will occur and take steps to prepare, and those who have a "not me" culture/attitude; the latter usually ends up paying out much more, in terms of fees, fines, and court costs.  This is something consultants talk about with their customers, but it's a whole new world when you actually see it in action.

3.  Volume Shadow Copies - This chapter is somewhat self-explanatory.  I was doing some research that involved accessing VSCs, and found that pretty much the only way to do what I wanted required significant resources (i.e., $$).  What I did with this chapter is show how VSCs can be accessed within an acquired image without using expensive solutions, as well as provide some insight into how accessing the VSCs can really provide some very valuable information to an analyst.
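As a rough illustration of the sort of no-cost approach I'm talking about (this is just a sketch, not the procedure from the chapter...it assumes you've already mounted the image read-only as a volume, say F:, with a tool such as FTK Imager, and that you're working from an elevated prompt on a Windows 7 analysis system):

import re, subprocess

MOUNTED = "F:"   # hypothetical drive letter where the image is mounted read-only

# Ask Windows what shadow copies it sees on the mounted volume
out = subprocess.check_output("vssadmin list shadows /for=" + MOUNTED, shell=True)
targets = re.findall(r"Shadow Copy Volume: (\S+)", out.decode("ascii", "ignore"))

# Link each VSC to a folder so it can be examined like a regular file system
for i, target in enumerate(targets):
    link = r"C:\vsc%d" % i
    # mklink is a cmd.exe builtin; the trailing backslash on the target matters
    subprocess.call("mklink /d %s %s\\" % (link, target), shell=True)
    print("VSC %d linked at %s" % (i, link))

From there, each C:\vscN folder can be scanned, RegRipped, or diffed against the mounted volume just like any other directory tree.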

4.  File Analysis - This chapter is very similar to the corresponding chapter in WFA 2/e, but focuses on some of the files you're likely to see on Windows 7 systems.  I also reference some of the files that you'll find on Windows XP systems, but that are different on Windows 7 systems (i.e., format, content).  I cover Jump Lists in this chapter, and not just the LNK-like streams, but also the DestList streams (which appear to be some sort of MRU listing for shortcuts).

5.  Registry Analysis - I know, a lot's been said about Registry analysis, particularly in WRF, but this time, instead of doing a break down of what can be found in the Registry, on a hive-by-hive basis, I'm taking more of a solutions-based approach. For example, I see a LOT of folks in the forums and lists who don't understand the role that the USBStor subkeys play in USB device analysis, so what I've done is take the more common analysis processes (that I see, based on questions asked in lists...) and I'm trying to provide solutions across all hives.

One of the things I face with writing chapters such as this one is that folks will say things like, "I want to know about blah...", and very often, there's already information out there on the subject.  One great example is the Registry ShellBags...Chad Tilbury recently posted on this topic to the SANS Forensic Blog, so given that, I have to wonder, "what do you want to know?" and "how much are you willing to support the effort to present/share that topic?"  Now, by "support", I mean through such efforts as providing example hives, or just taking a few minutes to elaborate on your thoughts or questions.

6. Malware Detection - I have had a good number of "here's a hard drive that we think is infected with malware..." exams, and given that there are a number of folks out there who likely get similar cases (LE gets CP cases that evolve into the "Trojan Defense", etc.), I wanted to put together a good resource to help address this issue.  This is not a malware analysis chapter...MHL et al did a fantastic job with this topic in the Malware Analyst's Cookbook, and I'm not about to try to parrot what they've done.  Instead, this chapter addresses the topic of detecting malware within an acquired image, and I even provide a checklist of steps you can use.

Note: Many of the tools mentioned in the book are available online, and those items that are not specifically available now (the malware detection checklist, etc.) will be provided online, as well.  I really don't like the idea of providing a DVD with the book, because there are simply too many issues with getting the materials to people who purchase only the ebook, or leave their DVD at home when they go to work...

7.  Timeline Analysis - In this chapter, I not only present how to create a timeline, but I also discuss the concepts behind why we'd want to create a timeline, as well as some of the uses of timelines folks may not be too familiar with.  I presented these concepts and use case scenarios during a course I taught recently, and they seemed to be very well received.

8.  Application Analysis - Another class of question I see a lot of in the lists has to do with application artifacts; when you think about it, there isn't too terribly much difference between some classes of dynamic malware analysis, and what you'd do to analyze an application for artifacts.

Now, there are some things that I don't cover in the book, in part because they're covered or addressed through other media or resources.  One example is memory analysis...there are a number of resources already available that cover how to capture physical memory, as well as perform analysis of a Windows memory dump, using the freely available tools. 

I wanted to provide something of a preview, because I do get a lot of those, "...does it cover...??" questions, most often from people at conferences, who are holding a copy of the book while they're asking the question.  The simple fact is that no book can cover everything, and it's especially difficult when analysts don't communicate their needs or desires beforehand.  I've done the best I can to collect up those sorts of things from lists, forums, as well as people I've talked to at conferences...but I know that the question is still going to come up, even after the book is printed.

One thing I would like to add is that, as with my other books, the focus is almost exclusively on free and open source tools to get the job done.  Like I said earlier, many of the tools are already available online, and those other items I've developed and mentioned in the book will be posted to the web when the book goes final.

From the lists and forums, I see a lot of questions regarding Windows 7, specifically, "What has changed from Windows XP?"  Truthfully, this is the WRONG question to ask, albeit a popular one.  But if you really want to know what's changed from an analyst's perspective, that's a good part of what WFA 3/e is about...the manuscript goes final (i.e., gets submitted) in October, so the book should be available around the beginning of 2012.

HTH

Monday, July 25, 2011

Updates

WRF Review
Andrew Hay posted a glowing review of Windows Registry Forensics on Amazon recently.  I greatly appreciate those who have purchased the book and taken the time to read it, and I especially appreciate those who have taken the time to write a review.

DFwOST
Speaking of books, it looks as if DFwOST has been picked up as a textbook!  Pretty cool, eh?  I certainly hope Cory's proud of this...this is a great testament to the efforts that he put into the book, as he was the lead on this...I was just along for the ride.

One of the interesting things about this is that I've heard that other courses may be picking this book up as a resource, in part due to the focus on open source...many of the digital forensics courses out there are held at community colleges that simply cannot afford to purchase any of the commercial forensic analysis applications.  Also, I do appreciate the "tool monkey" comment from the blog post linked above...let's start folks out with an understanding of what's going on under the hood, and progress from there.  The age of Nintendo forensics is over, folks!

If that's the case for you, either as an instructor or individual practitioner, consider my other books, as well...I focus on free and open source tools almost exclusively, because...well...I simply don't have access to the commercial tools.

NoVA Forensics Meetup
Just a reminder...our next meetup is Wed, 3 Aug, starting at 7pm.  One of our members who attended our last meetup has offered to facilitate a discussion regarding some recent cyber activity and how it affects what we do.  I'm really looking forward to this, as I think that it's a great way for everyone to engage.

For location information, be sure to check out the NoVA Forensics Meetup page on the right-hand side of this blog.

PFIC
The agenda for PFIC 2011 has been posted, and I'll be presenting on Tuesday afternoon.  My presentation will (hopefully) be taking the "Extending RegRipper" presentation a bit further.  It works as it is now, but one of the things I want to do is provide a means for the analyst (via both the UI and CLI) to select which user profiles to include in scans.

Bank Fraud
Yet another bank is being sued by a small business following online banking fraud.  Brian Krebs has done considerable work in blogging about other victims (most recently, the town of Eliot, ME).  What should concern folks about this is that once the victim is breached and the money transfers complete, a battle ensues between the victim and the bank.  What's missing from this equation is that even with all the press surrounding these cases, there continue to be victims, and instead of focusing on better security up front, efforts are expended toward suing the bank for "inadequate security measures". Should the bank have had some sort of anomaly detection in place that said, "hey, this connection isn't from an IP address we recognize..."?  Sure.  Should there be some other sort of authentication mechanism that isn't as easily subverted?  Sure.  There are a lot of things that should have been in place...just ask anyone who does PCI forensic assessments, or even just IR work.

One of the things Brian has recommended in his blog is to do all online transactions via a bootable live CD.  I think that this is a great idea...say your Windows system gets infected with something...if you boot the system to a live Linux distribution, that environment won't even "see" the malware.  Conduct your transactions and shut the system down, and you're done.

Another measure to consider is something like Carbon Black.  Seriously.  Give the guys at Kyrus a call and ask them about their price point.

Cell Phones As Evidence
Christa Miller recently had a Cops2.0 article published regarding how LEOs should approach cell phones/smart phones.  Reading the article, I think that all of it is excellent advice...but you're probably wondering, "what does this have to do with Windows IR or DF work?"  Well, something for analysts to consider is this...if you're analyzing a Windows computer (i.e., laptop) confiscated as part of a search warrant, be sure to look to see if a phone has been sync'd to the system.  Did the user install iTunes, download music, and then load the music on their iPhone?  If so, the phone was likely synced/backed up, as well.  Is the Blackberry Desktop Manager installed?  Did the user back their phone up?  If so, the backup files may prove to be significant and valuable resources during an investigation.

Did you map all of the USB removable storage devices that had been connected to the system?  You don't need to have the management software installed to copy images and videos (hint, hint) off of a phone...just connect it via a USB cable and copy the images (which will likely have some very useful EXIF data available).
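For example, pulling that EXIF data doesn't require anything special...here's a quick sketch using the Python Imaging Library (the file name is made up):

from PIL import Image
from PIL.ExifTags import TAGS

# Hypothetical image copied off a phone over a USB connection
exif = Image.open(r"D:\extracted\IMG_0042.jpg")._getexif() or {}

for tag_id, value in exif.items():
    name = TAGS.get(tag_id, tag_id)
    # Items of interest: the make/model of the phone, and when the photo was taken
    if name in ("Make", "Model", "DateTimeOriginal", "Software", "GPSInfo"):
        print("%-20s %s" % (name, value))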

analyzemft 2.0 Released!
Matt Sabourin updated David Kovar's analyzemft.py to make it completely object-oriented (OO)!  David has done some great work putting the tool together, and Matt's extended it a bit so that it can be called from other Python scripts.

The project is now hosted on Google Code.

Thursday, July 21, 2011

Evading Investigators and Analysts

I recently had an opportunity to spend two days with some great folks in the Pacific Northwest, talking about timeline creation and analysis.  During this time, we talked about a couple of ancillary topics, such as critical thinking, questioning assumptions, looking at what might be unusual on a system, and we even touched briefly on anti-forensics.  If you've read my blog for any considerable length of time, you know that my thoughts on anti-forensics are that most tactics are meant to target the analyst, or the analyst's training.

Along the lines of anti-forensics, a good deal of reading can be found online.  For example, in 2005, the Grugq gave a presentation at BlackHat that addresses anti-forensics.  James Foster and Vincent Liu gave a presentation on a similar subject.  Also, I recently ran across a very interesting article regarding evading the forensic investigator.  One of the most important statements in the article is:

"Here it is important to note that the software makes it possible to store..."

Why is this statement important?  Well, if you consider the statement critically for a moment, you'll see that it points out that this technique for hiding data from an analyst requires additional software to be added to the system.  That's right...this isn't something that can be done using just tools native to the system...something must be added to the system for this technique to be used.  What this means is that while the analyst may not necessarily be able to immediately determine that data may have been hidden using these techniques, they would likely be able to determine the existence of this additional software.  After all, what would be the purpose of hiding data with no way for the person hiding it to access it?  If you wanted to deny the owner access to the data, why not simply wipe it to a level at which it is prohibitively expensive to retrieve?

What are some ways that an analyst might go about finding the additional software?  Well, because many times I deal solely with acquired images, I use a malware detection checklist for those times where my analysis calls for it, and one of the things I check is the MUICache keys for the users, which can provide me with an indication of software that may have been executed, but perhaps not via the shell.  This is just one item on the checklist, however...there are others, and because the checklist is a living document (I can add to it, and note when some things work better than others), there may be additional items added in the future.
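Just as an example of how accessible this sort of data is, here's a sketch using Willi Ballenthin's python-registry module (the hive path is made up; on Windows 7 the MUICache key lives in the user's USRCLASS.DAT, while on XP it's in NTUSER.DAT under Software\Microsoft\Windows\ShellNoRoam\MUICache):

from Registry import Registry

# Hypothetical copy of a Windows 7 user's UsrClass.dat, exported from the image
reg = Registry.Registry(r"D:\case\UsrClass.dat")
key = reg.open("Local Settings\\Software\\Microsoft\\Windows\\Shell\\MuiCache")

# Each value name is typically the full path to something the user has run
for value in key.values():
    print(value.name())

RegRipper will pull the same data, of course...this is just to show that the data itself is trivial to get to.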

Another way to address this is through incident preparation.  For example, if the audit capability for the Windows system includes Process Tracking (and the Event Logs are of sufficient size), then you may find an indication of the software being executed there (using evtrpt.pl against Windows XP and 2003 .evt files works like a champ!).  Another possible proactive approach is to use something like Carbon Black from Kyrus Technology; installing something like this before an incident occurs will provide you (as well as your responders) with considerable, definitive data.  To see how effective a proactive approach involving Carbon Black can be, take a look at this video.

Beyond this, however, it's critical that analysts have the training and knowledge to do their jobs in a quick, efficient, and accurate manner, even in the face of dedicated attempts to obfuscate or hinder that analysis.  One of the things we talked a great deal about in the class is the use of multiple data sources to add context to the data, as well as to increase the relative level of confidence in that data.  Chris Pogue talks about this a bit in one of his recent blog posts.  Rather than looking at a single data point and wondering (or, as is done in many cases, speculating) "...what could have caused this?", it's better to surround that data point with additional data from other sources in order to see the context in which the event took place; we also have to keep in mind that some data is more mutable (more easily changed) than other data.

When I teach a class like this, I learn a lot, not just in putting the course materials together, but also from engaging with the attendees.  At one point, I was asked about how many of my cases involve creating a timeline...my response was, "all of them", and that's fairly accurate.  Now, timeline analysis may not be my primary analysis technique...sometimes, I may have something available (an Event Log record, a file name, etc.) to get me started, and the goals of my analysis simply dictate that a timeline be created.  Looking back over my recent examinations, I've created a timeline in just about every instance...either to have a starting point for my analysis, or to provide context or validation to my findings, or even to look for secondary or tertiary artifacts to support my findings.  However, the important thing to keep in mind here is that I'm not letting the technique or tool drive my analysis...quite the opposite, in fact.  I'm using the technique for a specific purpose and because it makes sense.


Another analyst told me not long ago, "...I've been doing timelines for years."  I'm sure that this is the case, as the concepts behind timeline analysis have been around for some time and used in other areas of analysis, as well.  However, I'm willing to bet that most of the analysts that have created timelines have done so by manually entering events that they discover into a spreadsheet, and that the events that they've added are based on limited knowledge of the available data sources.  Also, it has been clear for some time that the value of timelines as an analysis tool isn't completely recognized or understood, based in part on the questions asked in online forums; many of these questions could be answered by creating a timeline.  So, while many are talking about timeline analysis, I think that it's imperative that more of us do it.

Another thing I have learned through engaging with other analysts over the years is that a lot of this stuff that some of us talk about (timeline or Registry analysis) is great, but in some cases, someone will go to training and then not use what they learned for 6 - 9 (or even more) months.  By then, what they learned has become a fog.  These techniques are clearly perishable skill sets that need to be exercised and developed, or they will just fade away.

An example of this that I've seen a great deal of recently in online forums has to do with tracking USB devices on Windows systems.  In the spring of 2005, Cory Altheide and I published some research that we'd conducted regarding USB device artifacts on Windows systems.  Since then, more has been written about this subject, and Rob Lee has posted (and provided worksheets) to the SANS Forensics blog, not only covering thumb drives, but also drive enclosures.  USB artifacts have been discussed in books such as Windows Forensic Analysis (1/e, 2/e) and Windows Registry Forensics, and I'm including yet another discussion in the upcoming third edition of WFA.  I think that this is important, because with all of the information available (including this page on the Forensics Wiki), there continues to be a misunderstanding of the artifacts and analysis process regarding these devices.  For example, variations on the same question have appeared in multiple forums recently, specifically asking why all (in one case, 20 or more) device keys listed beneath the USBStor subkey have the same LastWrite time, and how the user could have plugged all of the devices into the system at the same time.  The problem with this line of analysis is that the LastWrite times for the subkeys beneath the USBStor key are NOT used to determine when the devices were last connected to the system!  What I normally suggest is that analysts engage with the various resources available, and if they want to know what could be responsible for the keys all having the same LastWrite times, generate a timeline.  Seriously.  Timelines aren't just for analysis anymore...they're a great testing tool, as well.
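If nothing else, pulling the data and looking at it yourself is quick; here's a sketch using the python-registry module (the hive path and ControlSet are assumptions...check the Select key's Current value on the real hive to see which ControlSet was in use):

from Registry import Registry

# Hypothetical SYSTEM hive exported from the image
reg = Registry.Registry(r"D:\case\SYSTEM")

# ControlSet001 is an assumption; confirm via the Select\Current value
usbstor = reg.open("ControlSet001\\Enum\\USBSTOR")

for device in usbstor.subkeys():
    # Remember: the key LastWrite time is NOT when the device was last connected
    print("%s  (LastWrite: %s)" % (device.name(), device.timestamp()))
    for instance in device.subkeys():
        print("  s/n: %s" % instance.name())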

As a side note, RegRipper has all of the necessary plugins to make USB device analysis pretty straightforward and simple.  I've been working on a chapter on Registry Analysis for my upcoming Windows Forensic Analysis 3/e, and I haven't had to produce any new plugins.  So, even if you're using Rob's SANS checklists, you can get the data itself using RegRipper.

Resources
IronGeek: Malicious USB Devices

Matthieu recently released DumpIt, which is a fusion of the 32- and 64-bit Windows memory dumping utilities that will create a memory dump in the current working directory (which makes this a great utility to run from a thumb or wallet drive).

Wednesday, July 20, 2011

Carbon Black

I attended a Carbon Black (Cb) demo recently, at the invitation of the great folks of Kyrus.  The demo was intended to show some of the improvements to Cb, in particular the GUI available to quickly and easily mine through the available logs.

For those of you who haven’t heard of Cb…where’ve ya been???  Cb is a sensor that monitors process execution on Windows systems, and reports on processes and file writes.  A coming update to Cb will also report on writes to the Registry, as well as network connections (source/dest IPs and ports, with a time stamp).

In the demo, Mike Viscuso demonstrated how the GUI can be used to quickly track down a FakeRean installation, and even track it back to a Java “issue” (more analysis would be needed to determine if it were an exploit) delivered via Firefox.  Mike went through this slowly; had he gone through it at full speed, it would have only taken a couple of seconds.  In the demo, Mike identified three stages of the infection, and identified the executable files associated with each.  The first stage executable was identified by 14 of 42 AV engines on VirusTotal (for each stage, Mike submitted the hash of the file, not the file itself).  The second stage executable was not identified by any of the AV engines, and in fact was reported to not have been previously submitted.  Finally, the third stage executable was identified by 5 of 42 AV engines.

Now, compare that to the “normal” IR approach, and how long it would take to dump memory from that system…and this is assuming that you’ve got your IR toolkit prepped and ready to go, and you have personnel trained in the proper collection and analysis techniques.  How about obtaining a copy of the executable?  Cb does this for you; the traditional approach to doing this could take several hours, under the best conditions.

Finally, Mike could have issued a query to determine if the particular files in question had been “seen” on any other systems in the enterprise, an answer to which would be available within seconds.

Deployment, or “How this beats the current IR model”
The current model used for IR is that someone gets hacked or hit with malware of some kind and calls for help, or someone gets notified by an external third party that their data has been compromised (see annual reports from Verizon, TrustWave, or Mandiant) and calls for help.  Sometime after that, personnel and/or equipment are sent on-site, and all the while, data may continue to be exfiltrated from the infrastructure.  At some point after the call for help, network and host data, and possibly the contents of memory from hosts may be collected and analyzed…all of which takes considerable time.

Now, what if you deploy Cb before an incident?  If you were to do this, starting with a testbed of systems, and possibly some non-production systems, you could monitor that subset of your infrastructure, and once you become familiar and comfortable with the tool (check with Kyrus for licensing to get the log collection server within your infrastructure), progressively roll the sensor out to more systems.  Once you get Cb rolled out and the server installed, it’s simply a matter of reviewing the data.  Should you suspect that something has occurred, you now have considerable (albeit easily managed and viewed) data available.  You can even set up a scheduled task on the server that queries for new executables having been launched, and have this task run every day (or even every six hours).  You may initially get a lot of data, but over time, you’ll notice that the set that you receive back should be reduced.  You can even have the task email the list of new executables to you.

Now, even if you were to query the logs every 24 hrs (via a scheduled task or manually), the fact is that you’d know about the incident within 24 hrs (at the most), rather than hearing about it 3 months later from someone else.  Since many of these notifications come well after the actual data theft occurred, when deployed proactively, Cb is capable of providing a level of context that simply isn’t evident or available via more traditional means of IR, such as memory or even disk analysis.  Further, once something is “seen”, you can query the infrastructure for other affected systems, quickly scoping your incident.  Again, through traditional means of IR, scoping the incident can often take considerable time and be very expensive (in time, money, resources, etc.) to the already-compromised environment.

And Cb answers more than just questions related to security and IR.  One of the use cases that the Kyrus guys like to cite involves addressing budgeting issues, and going out across the enterprise to determine how many employees were running all components of an office suite of tools.  With the returned information, the organization was able to drastically reduce licensing costs.  Cb can also be used to enforce acceptable use policies, among other things.

If you haven't done so, I'd recommend taking a look at Cb...it doesn't have huge overhead or a "big" footprint, and what it can save you in terms of much more than just IR and security is immense.  

Saturday, July 09, 2011

More Links, Updates

Meetup
Our 6 July NoVA Forensic Meetup went very well, and I wanted to thank Tom Harper for putting in the time and effort that it takes to provide his presentation, and to also thank the ReverseSpace folks for hosting our meetings!  This time, we had about 20 or so folks show up...some new...and I wanted to thank everyone who took the time out to come by and take part in our event.

What I liked about Tom's presentation was the fact that Tom's a practitioner, and he approaches solutions from that perspective.  Tom was at OSDFC, and after the conference mentioned that some of the presentations we saw that day were from folks from the academic side of DFIR, and not so much the practitioner side.  I agree with that sentiment...and Tom's approach is practical and all about gettin' the job done.

I should also note that the ReverseSpace location doesn't say "ReverseSpace" on the outside, nor is there a sign by the road.  The address is 13505 Dulles Technology Drive, Suite 3, in Herndon, VA, and the facility says "Cortona Academy" on the outside.  Don't worry about writing this address down...simply check out the "NoVA Forensics Meetup" page associated with this blog.

Our next meetup is scheduled for 3 August (same time, 7-8:30pm) and we're looking for presentation or discussion ideas, as well as presenters.  @sampenaiii suggested via Twitter that we have a discussion on LulzSec and the impact of their activities on security and forensics; this sounds like a great idea, and I think that we'll need to have a slide or two with some background for those who may not be familiar with what happened. 

Cory-TV
Cory Altheide recently provided a link to the video of a presentation he gave in Italy on 2 May 2011.  The presentation is titled, "The death of computer forensics: digital forensics after the singularity", and is very interesting to watch.  Cory is a very smart guy, and as Chris Pogue recently tweeted, Cory is the bacon of the DFIR community, because he makes everything better!

Cory presents some very interesting thoughts regarding what were once thought to be "forensics killers", and the future of digital forensic analysis. One of the things that Cory mentioned was the existence of metadata, which is not often removed, simply because the user doesn't know about it.  There have been some pretty interesting instances where metadata has played a very important role (here, and here), and I agree with Cory that it will continue to do so, particularly because we have new formats, devices, and applications coming out all the time.

Clearly, Cory talked about much more than just metadata in his presentation, and I simply can't do justice to it through any sort of concise description.  Instead, I highly recommend that you take the hour+ to sit down and watch it.  I think that the overall point that Cory makes is that as available drive space increased (every time it did) and decreased in price, and as platforms to be analyzed have become more complex and varied, there have been those who've claimed that these things would be the "death of digital forensics"; instead, analysts have adapted. 

MBR Analysis
Speaking of Chris Pogue, he recently posted to his blog regarding MBR Analysis, referring to a discussion he and I had had not long ago.  What I like about Chris's post is that he's talking about it, and by him doing so, my hope is that the things we talked about reach more analysts.  Chris is the author of the Sniper Forensics presentations (here's the one he gave at DefCon18), and he's been getting a lot of mileage from the series, and presents to a lot of people.  As such, my hope is that more people will hear about MBR analysis, how it can be used, and start looking at this as a viable part of a malware detection process/checklist.

When you read the post, don't get caught up in the terminology...what I mean by that is, to clarify a couple of things, the TSK tool mmls reads the partition table within an image of a physical disk (you don't have one of these if you acquire a logical image of the C:\ volume, for example), and provides you the offsets to the partitions.  The offset to the active partition is often 63 sectors (indexed at 0), not "0x63".
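If you want to see where those offsets come from, the partition table itself is trivial to read by hand; here's a minimal sketch (assuming a raw/dd image of a physical disk...the path is made up):

import struct

# Hypothetical raw image of a physical disk (not a logical volume image)
with open(r"D:\case\disk.dd", "rb") as f:
    mbr = f.read(512)

assert mbr[510:512] == b"\x55\xaa", "missing MBR signature"

# Four 16-byte partition table entries start at offset 446 (0x1BE)
for i in range(4):
    entry = mbr[446 + i * 16:446 + (i + 1) * 16]
    boot_flag, _, ptype, _, start_lba, num_sectors = struct.unpack("<B3sB3sII", entry)
    if ptype == 0:
        continue
    print("Partition %d: type 0x%02x  start sector %d  length %d %s"
          % (i, ptype, start_lba, num_sectors,
             "(active)" if boot_flag == 0x80 else ""))

On a typical XP-era disk, the first entry will show a start sector of 63...which is exactly the offset mmls reports.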

Speaking of the Sniper Forensics presentations, Chris also has another post up on the SpiderLabs Anterior blog that's worth a read.

Additional Thoughts
I recently posted some thoughts regarding how the structure that data is stored in (as well as where within that structure) can provide context to the data that you're looking at, and an email exchange with James, who commented on that post, led to some thoughts regarding timelines.

James (I hope you don't mind me pointing this out...) mentioned in a comment that log2timeline doesn't provide enough context.  I was curious about this, and in the ensuing email exchange, at least part of what James was referring to was things like, if an image file was created on the system, provide a link to that file that could then be opened in a viewer.  In this way, the analyst would have access to additional context.

I thought that this was an interesting idea...and I still do.  Having my own process for creating timelines (taking nothing away from Kristinn's efforts with log2timeline, of course), I thought about how you could implement something like this...and got stuck.  Here's why...let's say that you completely restructured timeline creation and had the timeline itself available with either the image itself mounted as a volume, or files extracted from the image...would you provide links to all files that were created on the system?  I'm thinking not.  Also, if a file is modified, what do you link to?  What value or context is there if you don't know what was modified in the file?  The same would be true for Registry keys.

However, I do think that there are some benefits to this idea, but it really depends upon how you implement it.  For example, let's say that you found some time stamped data in the pagefile or in unallocated space, and wanted to include it in your timeline.  You could do that, and then include a link to the data...but not to the offset within the image; instead, extract the data (plus 100+ bytes on either side of the data) into a separate file, and provide a link to that file.  The same might be true for other files...for example, if your timeline analysis leads you to determine that the infection vector was spear phishing via a PDF file, rather than copying the PDF file out of the image and linking to it, maybe what you could do is copy the PDF file out of the image, and parse the offending portions of that file out, emasculate them (i.e., extract them to a text format, etc.), and then link to that.  You might extract images, and link to those...it all depends on how you want to present the data.  But my point is, this may not be something that you would want to completely automate; instead, use the timeline to perform your analysis, and once you've isolated the appropriate entries in your timeline, provide links to relevant data (either the file, a portion of a file, or to your analysis of the file) in order to add context and value to your timeline.

Wednesday, July 06, 2011

Structure Adds Context

A while ago, I was talking to Cory Altheide and he mentioned something about timeline analysis that sort of clarified an aspect of the analysis technique for me...he said that creating a timeline from multiple data sources added context to the data that you were looking at.  This made a lot of sense to me, because rather than just using file system metadata and displaying just the MACB times of the files and directories, if we added Event Log records, Prefetch file metadata, Registry data, etc., we'd suddenly see more than just that a file was created or accessed.  We'd start to see things like, user A had logged in, launched an application, and the result of those actions was the file creation or modification in which we were interested.
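Stripped down to its essence, that's all a timeline is...events from different sources, normalized into a common format and sorted together.  Here's a minimal sketch of the idea, using a five-field time|source|host|user|description layout similar to the TLN format (the events themselves are made up):

import time

events = [
    (1311852613, "FILE", "HOST1", "-",     "MACB  C:\\Windows\\system32\\bad.dll"),
    (1311852610, "EVT",  "HOST1", "userA", "Security/528 - successful logon"),
    (1311852612, "PREF", "HOST1", "-",     "EVIL.EXE-1A2B3C4D.pf last run"),
    (1311852611, "REG",  "HOST1", "userA", "UserAssist - EVIL.EXE executed"),
]

# Sorting by the epoch time puts the events from all four sources in sequence
for ts, source, host, user, desc in sorted(events):
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(ts))
    print("%s  %-4s %-6s %-6s %s" % (stamp, source, host, user, desc))

Taken individually, each of those entries is just a data point; sorted together, they read like a short story.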

Lately, I've been looking at a number of data structures used by Windows systems...for example, the DestList stream within Windows 7 jump lists.  What this got me thinking about is this...as analysts, we have to understand the structure in which data is stored, and correspondingly, how it's used by the application.  We need to understand this because the structure of the data can provide context to that data.
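Jump Lists are a handy example...the automaticDestinations-ms files are OLE/structured storage files, so you can poke at the structure yourself with the Python OleFileIO_PL module (a sketch; the file name is made up):

import OleFileIO_PL

# Hypothetical Windows 7 jump list copied out of a user profile
jl = OleFileIO_PL.OleFileIO(r"D:\case\1b4dd67f29cb1962.automaticDestinations-ms")

# The numbered streams are essentially LNK files; DestList ties them together
for stream in jl.listdir():
    name = "/".join(stream)
    data = jl.openstream(stream).read()
    print("%-12s %6d bytes" % (name, len(data)))

Knowing that the numbered streams follow the LNK format, and that DestList appears to act as an MRU list for them, is exactly the kind of structural knowledge that gives those bytes meaning.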

Let's look at an example...once, in a galaxy far, far away, I was working on a PCI forensic assessment, which included scanning every acquired image for potential credit card numbers (CCNs).  When the scan had completed, I found that I had a good number of hits in two Registry hive files.  So my analysis can't stop there, can it?  After all, what does that mean, that I found CCNs in the Registry?  In and of itself, that statement is lacking context.  So, I need to ask:

Are the hits key names?  Value names?  Value data, or embedded in value data?  Or, are the hits located in unallocated space within the hive files?

The answers to any of these questions would significantly impact my analysis and the findings that I report.
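For illustration, the scanning part of the process is conceptually simple...something along these lines (a rough sketch, not the actual tool used on that engagement):

import re

def luhn_ok(digits):
    # Standard Luhn check; weeds out most random digit sequences
    total, parity = 0, len(digits) % 2
    for i, d in enumerate(int(c) for c in digits):
        if i % 2 == parity:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def scan_for_ccns(path, chunk_size=1024 * 1024):
    pattern = re.compile(br"\d{13,16}")
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Note: this naive version will miss hits that span chunk boundaries
            for m in pattern.finditer(chunk):
                if luhn_ok(m.group().decode()):
                    yield offset + m.start(), m.group().decode()
            offset += len(chunk)

# Hypothetical hive file exported from the image
for off, hit in scan_for_ccns(r"D:\case\NTUSER.DAT"):
    print("possible CCN at offset %d: %s" % (off, hit))

The easy part is getting the hits and their offsets; the hard part...and the point of the questions above...is figuring out what structure each offset actually falls within.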

Here's another example...I remember talking with someone a while back who'd "analyzed" a Windows PE file by running strings on it, and found the name of a DLL.  I don't remember the exact conclusions that they'd drawn from this, but what I do remember is thinking that had they done some further analysis, they might have had different conclusions.  After all, finding a string in a 200+ KB file is one thing...but what if that DLL had been in the import table of the PE header?  Wouldn't that have a different impact on the analysis than if the DLL was instead the name of the file where stolen data was stored before being exfil'd?
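Checking something like that takes all of a minute or two with Ero Carrera's pefile module (a sketch; the file name is made up):

import pefile

# Hypothetical executable exported from the image
pe = pefile.PE(r"D:\case\suspect.exe")

# If the DLL shows up here, the program actually links against it...a string
# hit somewhere in the file doesn't tell you that by itself
for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
    print(entry.dll)
    for imp in entry.imports:
        print("    %s" % (imp.name,))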

So, much like timeline analysis, understanding the structure in which data is stored, and how that data is used by an application or program, can provide context to the data that will significantly impact your analysis and findings.

Addendum, 7 July
I've been noodling this over a bit more and another thought that I had was that this concept applies not just to DF analysis, but also to the work that often goes on beyond just analysis, particularly in the LE field, and that is developing intelligence.

In many cases, and particularly for law enforcement, there's more to DF analysis than simply running keyword searches or finding an image.  In many instances, the information found in one examination is used to develop intelligence for a larger investigation, either directly or indirectly.  So, it's not just about, "hey, I found an IP address in the web logs", but what verb was used (GET, POST, etc.), what were the contents of the request, who "owns" the IP address, etc.

So how is something like this implemented?  Well, let's say you're using Simson's bulk_extractor, and you find that a particular email address that's popped up in your overall investigation was located in an acquired image.  Just the fact that this email address exists within the image may be a significant finding, but at this point, you don't have much in the way of context, beyond the fact that you found it in the image.  It could be in an executable, or part of a chat transcript, or in another file.  Regardless, where the email address is located within the image (i.e., which file it's located in) will significantly impact your analysis, your findings, and the intel you derive from these.
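One way to get that "which file" answer is to take the byte offset from the bulk_extractor feature file and hand it to The Sleuth Kit (a sketch, assuming the hit falls within an allocated file in an NTFS volume; the offsets are made up):

import subprocess

IMAGE       = r"D:\case\disk.dd"   # hypothetical image
PART_OFFSET = 63                   # partition start in sectors, from mmls
FEATURE_OFF = 2617245696           # byte offset from bulk_extractor's email.txt
SECTOR      = 512
CLUSTER     = 4096                 # bytes per cluster, from fsstat

# Convert the image-wide byte offset to a data unit (cluster) within the volume
block = (FEATURE_OFF - PART_OFFSET * SECTOR) // CLUSTER

# ifind maps the data unit to an MFT entry; ffind maps that entry to a file name
inode = subprocess.check_output(
    ["ifind", "-o", str(PART_OFFSET), "-d", str(block), IMAGE]).strip().decode()
name = subprocess.check_output(
    ["ffind", "-o", str(PART_OFFSET), IMAGE, inode]).strip().decode()
print("hit falls in %s (MFT entry %s)" % (name, inode))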

Now, let's say you take this a step further and determine, based on the offset within the image where the email address was located, that the file that it is located in is an email.  Now, this provides you with a bit more context, but if you really think about it, you're not done yet...how is the email-address-of-interest used in the file?  Is it in the To:, CC:, or From: fields?  Is it in the body of the message?  Again, where that data is within the structure in which it's stored can significantly impact your analysis, and your intel.

Consider how your examination might be impacted if the email address were found in unallocated space or within the pagefile, as opposed to within an email.

More Links

Meetup
Just a reminder about tonight's meetup:

Location: ReverseSpace (this is our location, unless stated otherwise)
Time: 7-8:30pm (this will be the time that we'll meet, unless stated otherwise)

Tonight, Tom Harper will be presenting...you can get a copy of his slides here.

Also, please notice that I've created a "NoVA Forensics Meetup" page, linked on the right-hand side of this blog. 

Mobius
I ran across the Mobius Forensic Framework this morning (because it had been updated), and found it very interesting.  Mobius is a Python-based framework "...that manages cases and case items, providing an abstract interface for developing extensions. Cases and item categories are defined using XML files for easy integration with other tools."  It seems that this framework has been around for some time...the main link indicates that the last update was near the end of 2009.  The framework appears to have a Hive Report capability, as well.

This appears to be very different in function from the Digital Forensics Framework, now at version 1.1.0, and is definitely worth a look.

For/Sec LinkFest
Klaus has updated his blog again, and posted an expansive set of links regarding forensic and security tools.

I'm always looking to improve the work that I do, and I very often find some interesting links in what Klaus provides.  One was a reference to the eScan AV toolkit, from TinyApps.org, in Klaus' RSS feed.  If you work cases that involve detecting suspected malware ("Trojan defense"), this may be a tool that you'll want to employ as part of your malware detection process/checklist.

Xanda
Speaking of links, I ran across this page at Xanda, and found a number of very interesting links, such as an emulator for the PDP-11.  As with many other sites that provide lists of free/open-source/(some commercial) forensics tools, there will be a considerable amount of overlap, but there are also some links on this page that I haven't seen before, and I'm not about to discount anything at this point.  I mean, while I haven't been asked to analyze an Atari system, when I was at IBM our team was asked to perform analysis of mainframe systems more than once.  The Xanda page also has an entire section on steg tools.

Reading
I ran across this interesting bit of reading on the CERIAS blog, authored by Gene Spafford.  Beyond the mention of historically famous names in the DFIR community (from before there was really a DFIR community....) were the statements in the first paragraph regarding deployment of DFIR countermeasures.

As interesting (and immensely helpful) as these countermeasures may be, having performed a number of incident response engagements and analyzed even more drives and images, I think that the reality is that we have to just file this under "ain't gonna happen".  Now, don't get me wrong...I do believe that such measures are good security and would prove to be immensely useful; however, who's going to implement and monitor them, given the state of security to begin with?  What good is any of this going to do when the bad guys have already been through your infrastructure?

Now, would countermeasures such as those Gene describes be useful?  Sure...if they were properly deployed.

Monday, July 04, 2011

Links

Independence Day
Before anything else, Happy 4th!  I hope that everyone takes a moment to remember those who have fought and sacrificed for our freedoms...that includes not only those who have given the ultimate sacrifice, but those who have lost loved ones in the fight for freedom.  Also remember our public servants (cops, firefighters, EMTs), as well as our service members who are fighting to give others freedom.  May God bless them all.

e-Evidence
There's been an update over at the e-Evidence web site, with the addition of some good reading...take a look.

APT
I posted recently regarding an article Jason Andress had written for ISSA, regarding APT.  Shortly thereafter, my friend Russ contacted me to let me know that he'd co-authored a similar paper, and that I might want to take a look at it.  The blog post is here, and the paper can be found here.  The paper was written as a requirement for the SANS Technology Institute MSISE program, and while it touches on some of the same themes as Jason's paper, this one takes a bit more of a tactical approach...and that's one of the things I really like about this paper.  The approach taken in the paper is not just tactical...it's "here are some of the things that are seen on the network, and here's a cheap or free way to go about detecting it."

The paper also points out some interesting aspects of tactics used by the threat actors, particularly getting into the infrastructure via some method (spear phishing), gaining a foothold with PI-RAT, and then moving laterally within the infrastructure.

Another aspect of this paper is that it provides additional insight into the threat itself; anyone unfamiliar with the threat should read this paper, Jason's article, and others in order to develop a better understanding of the threat.  Much of what I've read out there covers the general flow of these threats, and this paper provides some insight into a specific implementation, and should be considered as such.  Not every incident of this type is going to include the same persistence mechanism, use of the same RAT, or the same network traffic.  However, the paper does a very good job of pointing out some of what can be done in response to this threat, both in initial detection and then response.

So, again...some great information in the paper, and it is easy to follow; if you're trying to get a better understanding of the threat overall, be sure to include this in your reading, along with additional credible, authoritative sources.

Malware
In the past, I've talked about the four malware characteristics I'd developed to help DFIR folks understand and explain malware, and over time, those characteristics have served me pretty well.  One of those characteristics is the initial infection vector...how the malware gets on the system.  Well, I ran across this InformationWeek article this morning that talks about Facebook being the "new" malware vector.  Okay, the meaning of "new" aside, I think that this is interesting, in part because it makes complete sense.  Look at the statistics in the article regarding users and the clients they use to access Facebook...pretty telling, if you ask me.

As an analyst, I'd like to hear from other analysts...have you seen incidents where Facebook was the delivery mechanism for malware?  If so, what are the artifacts on a PC or laptop, as opposed to a smartphone?

Also, Cory started a drinking game at OSDFC, because apparently, I pronounce malware "mall-ware"...so for every time I wrote "malware", you need to drink!



WFA 2/e Review
Mike Ahrendt posted a review of WFA 2/e recently; it's great to see that this book is still active and making its rounds, and that people who are reading it are finding something useful.  I tend to reference it myself now and again for my own needs, and sometimes will make notes of new, additional information that I've found with respect to a particular topic.  I think it's great that folks are still picking it up for the first time and finding it useful.

Bootkits
There was a post over on the SANS ISC site recently regarding the resurgence of bootkits, in which MS's Win32/Popureb.E (which is still short of any information useful to analysts) was specifically mentioned.  The post goes on to take a look at AV products that detect and/or clean MBR infectors, and indicates which are more successful than others.  I still think that one of the biggest issues surrounding this sort of thing is that most analysts I've spoken with appear not to look for it when it comes to determining if there is malware (drink!) in an image acquired from an infected system.  I'm not sure if this is an awareness issue, or a training/understanding issue; I have a checklist that I use (and try to keep up to date) for engagements such as this, so when I receive an image and the statement that, "we think it was infected with malware", I run through this process, which includes checking for indications of MBR infectors.
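One simple check along those lines (a sketch...it assumes a raw image of a physical disk, and that the first partition starts at sector 63, which you'd confirm with mmls) is to look at the sectors between the MBR and the first partition; several MBR infectors stash code or a copy of the original MBR in that space, which on many systems is otherwise all zeros:

import hashlib

IMAGE = r"D:\case\disk.dd"    # hypothetical raw image of a physical disk
FIRST_PART_SECTOR = 63        # from mmls; 2048 is common on Vista/Win7 installs

with open(IMAGE, "rb") as f:
    f.seek(512)               # skip the MBR itself
    for sector in range(1, FIRST_PART_SECTOR):
        data = f.read(512)
        if data.strip(b"\x00"):
            # Non-zero data where there's normally nothing...worth a closer look
            print("sector %d is not empty (MD5: %s)" %
                  (sector, hashlib.md5(data).hexdigest()))

This obviously isn't the whole process...it's just one indicator that's quick to check and easy to overlook.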

My thoughts on this subject aren't so much that I think that MBR infectors are more pervasive than most analysts think; not at all.  I think that it's more of a knowledge or "engaging with your peers" issue than anything else.  I don't think that available courses (whether for training, or ultimately ending in a certification) are necessarily going to cover the topic of malware detection within an acquired image, but I do think that the issue is one that needs to be understood (i.e., the "Trojan defense").  As such, where do analysts go to get this sort of information or education?

What are your thoughts?