Wednesday, February 29, 2012

Counter-Forensics

A recent SANS Forensics Blog 'Case Leads' post pointed me to a Hexacorn blog post that "...presents several techniques used to prevent forensic analysis."  That statement is a bit too succinct and just a little bit misleading, as the techniques presented in the Hexacorn blog post don't do anything to prevent analysis...rather, they're counter-forensics techniques intended to make analysis a bit more difficult.  However, many analysts are aware that deleting cookies and clearing Event Logs really doesn't do a whole lot more than just alter where you go to get the information you're looking for.  I didn't bring this up simply to say that, and I'm not bashing the Hexacorn guys.  Nor am I suggesting in any way that the information they've provided isn't useful...for reasons I've already mentioned, it's very valuable information.  But there are two very important principles to keep in mind when it comes to digital analysis that play an important role when discussing such techniques...Locard's Exchange Principle, and "the absence of an artifact where you expect to find one is itself an artifact."

Sidebar:
I should point out that at the Blackhat conference in 2005, James Foster and Vincent Liu gave a presentation that covered "log manipulation and bypassing forensics".  Attendees were invited to join the speakers in "a lively discussion of the 'Top 10 Ways to Exploit a Forensic Examiner'", so the approach of targeting the analyst and their training via counter-forensics techniques is nothing new.
/end sidebar

So what am I talking about?  We know from Locard that any interaction between two physical objects is going to have an effect on both of those objects (including a transfer of material between them, if they come into contact), and we know that the same is true in the digital realm.  Therefore, it stands to reason that attempts at counter-forensics are going to themselves constitute an interaction (something will have to execute in order to perform some of these techniques), either between malware and the ecosystem in which it's running, or between an intruder and the system they've compromised.  As such, the execution of the counter-forensic technique will have an effect on the environment in which it's being executed.

I've analyzed systems on which a user had no RecentDocs key in their NTUSER.DAT...not that it wasn't populated, it simply wasn't there.  This told me something...and I found out how it was deleted, what was used to delete it (via RegRipper and timeline analysis), when it was deleted, and what the deleted content was (via regslack).  However, had I not known that not having a RecentDocs key was an artifact, I might never have looked.  The same is true for malware found to be using the WinInet API for off-system communications...no artifacts were found in the index.dat file for the user profile that the malware was run under, and that in itself was an artifact.  This led to further analysis that determined why that was the case.  In other situations, cleared Event Logs from Windows XP and 2003 systems have been easily retrieved from unallocated space.
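
As an aside, a quick check for the presence (or absence) of the RecentDocs key is easy to script.  Below is a minimal sketch using the Parse::Win32Registry Perl module (the same module that RegRipper is built on); the hive file name is whatever you've exported from the image:

#!/usr/bin/perl
# Minimal sketch: flag a missing RecentDocs key in an NTUSER.DAT hive.
# Assumes the Parse::Win32Registry CPAN module is installed.
use strict;
use warnings;
use Parse::Win32Registry;

my $hive = shift || "NTUSER.DAT";
my $reg  = Parse::Win32Registry->new($hive)
    or die "Could not parse $hive as a Registry hive\n";
my $root = $reg->get_root_key;

my $path = "Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\RecentDocs";
if (my $key = $root->get_subkey($path)) {
    printf "RecentDocs key found; LastWrite time: %s UTC\n",
        scalar gmtime($key->get_timestamp);
}
else {
    # The absence of the key is itself an artifact...someone or
    # something removed it.  Time for regslack and a timeline.
    print "RecentDocs key NOT found.\n";
}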

Now, data recovery such as this isn't necessarily going to be of value, or even viable, for all counter-forensic techniques.  Securely wiping a file will make it gone-gone...you may find little bits and pieces of the file, small portions of it as it existed at one time, or you may find indications that a user accessed the file, but if your analysis goals center around the actual contents of the file itself, some counter-forensic techniques may defeat that analysis.

One method that we've seen in recent years to perform counter-forensics is that intruders have minimized their interactions with compromised systems; however, this does not completely obviate that interaction, it simply minimizes it.  Want to do some stuff on the system or the network to which it's attached, and minimize your footprint?  Use native tools.  Want to hide your activities?  Make them look as much like normal system activity as possible, and delete your tools and the data repositories that you create...Windows systems are very "noisy" when it comes to system activity, and given everything that occurs under the hood, your deleted files may very well be overwritten and obfuscated in fairly short order.  (Note: this is why I wrote an entire chapter on Immediate Response in WFAT3e...the sooner you respond to an incident, the better data you're likely to get.)

Sidebar:
One thing I do disagree with, with respect to the Hexacorn post, is the reason given for not discussing NTFS alternate data streams (ADSs); "...because it's quite common and hard to find good keywords".  While the use of ADSs in malware (PoisonIvy) may be 'quite common', the value of analysts knowing about them and understanding how they're used is a matter of perspective.  I've been in rooms full of analysts and responders, and not been able to get more than maybe one or two of them to admit that they've even heard of ADSs.  If you're a malware analyst who's used to seeing the use of ADSs all the time in the samples that you're looking at, and you think that they're nothing new and pretty trivial, consider the number of organizations that don't do any sort of root cause analysis on supposedly-infected systems, where the routine is to wipe and re-install the OS and data.  In these cases, no one is looking for ADSs, in part because they aren't even aware that they could exist.  As such, intrusions or malware that use ADSs may go unnoticed for quite some time.

So, if we're saying that ADSs are 'common' and therefore not worth addressing, I think that we're really missing an opportunity here to not only raise awareness, but to also start developing some trending information with respect to the use of such artifacts.  If a malware analyst sees "a lot" of samples that use ADSs in some capacity, how many is that out of the total amount of malware they're examining?  Is that more or less than last year, or last month?  Instead of samples, look at families...how pervasive is the use of ADSs across families?

As to the keywords comment...forgive the poetic license, but to paraphrase the Bard, "...there are more things in heaven and earth than are dreamt of in your keyword lists."  Keyword searches are a useful tool, don't get me wrong...but they are not the be-all, end-all of digital analysis.  As with any other tool, keyword searches should be employed thoughtfully, and they don't work for everything.
/end sidebar

One thing to keep in mind when analyzing Windows systems is that Windows has its own, built-in counter-forensics "measures"...I hesitate to call them that because I don't believe that it's intentional, but the fact is that a lot of what makes Windows useful to a user can also serve as counter-forensics measures.  What I mean by that is this...let's say an intruder accesses a system remotely and "does some things".  If the intruder were to delete some files or indications of their actions, well, we know that "deleted" isn't really gone, it's just hidden until it's overwritten.  Have you ever looked at what a Windows system does when it just sits there, with nothing happening on the monitor?  Stuff goes on...the OS and applications may be updated, Restore Points are created and deleted, limited defrags are run on a scheduled basis, etc.  Now, let's say that the intruder deletes a key file, but the user's on their system, going about their regular work.  Open a Word doc, an Excel spreadsheet, or open that PowerPoint presentation to update some of your slides, and the Office applications will create a temporary file...which consumes unallocated sectors.  NTFS MFT records are reused.  With all of this activity, all that it may take to make a deleted file completely unrecoverable is to wait a day or two.  This is really more of an argument for immediate response, which is an entirely different discussion (and blog post), but it fits in here as a counter-forensics measure.

So, in short, the intentional use of counter-forensics techniques will leave traces on the system, which may appear in some cases as a glaring absence of data.  Some counter-forensics techniques are specifically tailored toward subverting the training and knowledge of the analyst.  A knowledgeable analyst will employ an analysis process, and as such be able to recognize and adapt to the use of such techniques.  But nothing that we do is ever absolute...securely wiping a file will mean that you likely won't be able to recover that file, if that's the goal of your analysis.  You may be able to locate remnants of temporary versions of the file (some applications create temporary copies of files while they're being edited), or you may find indications that a user interacted with a file of that name (through Registry or Jump List analysis).  Some techniques may truly make specific data unrecoverable...but there will very likely be traces of that activity, and that fact can be used in cases of spoliation.

Tuesday, February 28, 2012

More Win

A good friend of mine reached out to me last week, in order to ask for my assistance with something.  They had an issue that they were working on, and it turned out that Facebook chat artifacts played an important role in that issue.  Fortunately, I had taken a personal day, and had some time to devote to this interesting issue; interesting because I'd never had to parse FB chat messages before.

So, armed with a description of what my friend was looking at, and what they wanted to see with respect to the output, I started coding.  I know that Facebook chat messages are saved in JSON format, which is easily parsed via Perl.  I did have a little bit of an issue parsing the time stamp associated with the chat messages, so I reached out to Andrew Case, who provided me with some valuable insight and resources.  Apparently, the chat time stamps are in Unix epoch format, with milliseconds...the standard 10-digit epoch time is multiplied by 1000, making a 13-digit value, so dividing by 1000 gets you back to a 10.3 floating point format (ten digits, followed by a decimal point and three more digits of milliseconds).  As such, the conversion of the time stamp into something readable was pretty straightforward.
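
For anyone curious, the conversion itself amounts to just a few lines of Perl; a minimal sketch, assuming a 13-digit millisecond value like the ones described above:

#!/usr/bin/perl
# Minimal sketch of the time stamp conversion: the 13-digit value is the
# Unix epoch time multiplied by 1000, so divide by 1000 to get something
# gmtime() understands, and keep the remainder as milliseconds.
use strict;
use warnings;

my $fb_time = 1330441337123;            # example 13-digit value
my $epoch   = int($fb_time / 1000);     # seconds
my $msec    = $fb_time % 1000;          # milliseconds

printf "%s.%03d UTC\n", scalar gmtime($epoch), $msec;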

I had an initial script working pretty quickly, and got that script out the door to my friend in just over an hour...I then cleaned up the code a bit, added some comments and documentation, and had a cleaner version sent off about 90 minutes after I started writing code.  The script (fb.pl) is posted here.

The script was run against over 1400 Facebook chat messages that had been exported via EnCase.  The script accessed each file, with each file containing one chat message, and parsed out the pertinent data into CSV format.  The results were available in minutes...or more accurately, "about a minute".  How long would it have taken to do this by hand?  More importantly, how boring would that have been, and how much else could you have accomplished instead of doing all of that by hand?

So how is this useful?  Well, for me, I had an opportunity to work on something interesting that helped someone else out.  I didn't have to view all of the data...nor did I want to, to be honest.  I just saw enough to allow me to write code that would parse the data.  And I have a neat little piece of code that I can reuse in the future.

My friend got to work on something else while I was writing the code, and then once they had the code, what could have been hours of tedious, boring manual work was done in "about a minute".  And they have code that they can reuse when similar issues pop up in the future.

Lessons
If you're faced with something that's a lot of tedious, manual work, it's likely to be boring and result in mistakes...so automate it if you can.  Computers are really good at doing stuff...particularly boring, repetitive stuff...really fast.  People aren't.  Oh, and the code can be reused later.

If you don't know how to automate something, or if it would take you a while to figure it out, try reaching out to another analyst.  You'd be surprised at what some folks have already done, or what they're willing to do.

Get more done faster...by asking for assistance, you can offload something; in this case, that 'something' was developing an automated approach to processing large (potentially massive) amounts of data.  Getting someone's help means that you can often focus on something else while they work on that problem.

Banging your head against the wall on a problem, or "chewing on it" in an attempt to force that problem into submission, is a really good way to waste a lot of time.  Sure, you get the joy of knowing you accomplished something yourself, but what's the likelihood that another analyst, perhaps even one you know, has already gone down that road, or something close to it?  Or maybe has an idea that helps you get from point A to point B faster?  After all, what do you have to lose?

Hopefully what this means is that at some point in the near future, my friend and I will be able to sit down and share a moment over a nice micro-brew.

By the way, in case you were wondering...the image I included in this blog post has absolutely nothing to do with anything...I just thought it was funny!  ;-)

Sunday, February 26, 2012

Stuff

WFAT3e
So far, there are two reviews of WFAT3e posted to Amazon.  That was pretty quick, and I greatly appreciate the effort that went into making those reviews available.

If you've got a copy of WFAT3e and are considering writing a review, here's what I'd like to humbly ask...please consider doing more than just reiterating the table of contents.   After all, this book isn't just the second edition with some new info added...this is a companion book that should be on your bookshelf right next to the second edition.  In writing this edition, I took out stuff that hasn't really changed (PE file format, etc.) and instead included stuff specific to Windows 7, such as Volume Shadow Copies, and a bunch of new file formats.  I added an entire chapter just on Timeline Analysis, as well as one on Malware Detection.  In the first chapter, I went into more detail regarding what I consider to be core analysis concepts...so getting your thoughts and feedback on those would not only be interesting for me, but also be very valuable for other potential readers who look to your review to help them make up their minds.

Addendum: Keith's review is posted here.  Thanks so much for sharing your review!  Corey's review is posted here.

Registry Analysis
Speaking of books, I've received a couple of emails asking me when a second edition of Windows Registry Forensics would be available.  While it's much too early to have a second edition at the moment, it's not too early to start thinking about it.  So, I thought I'd throw this out to the community...is a second edition something that would be of interest?

If so, what would you want to see added or modified in this second edition?

DFIRSummit
The agenda for the SANS Forensic Summit 2012 has been posted and it looks like this year's summit is going to be another outstanding event for DFIR folks.  I've attended all of the summits, which started in 2008, with the exception of last year's event.  These are always really great events, combining quality material and presenters with some excellent networking opportunities.

This time around, I've been selected to give the second keynote presentation on Wed, 27 June, an honor that I greatly appreciate and am humbled by.

For my presentation, I plan to use my CfP material, something I put together that I call, "Intro to Windows 7 Forensic Analysis".  I'll try to pack as much practical, 'you can use this right now' knowledge into the hour that I have available.  Keep in mind that the keynote will be at 8am...for me, that's almost lunch time, but I'm aware that many are still deep in REM (get it...R.E.M.??) sleep at that time of the day.

This year's event has a number of notable DFIR luminaries on the agenda, and the presentations cover a range of topics, including Windows 8, Mac OSX, Android memory analysis, effects of "The Cloud" on the industry, and there's even a SANS 360 event on the agenda.  Chris Pogue (@cpbeefcake) will be presenting "Sniper Forensics v3".

So, keep your eyes open for new developments regarding this event...if you're on Twitter, the hashtag is "#DFIRSummit".

IOCs
I've had a couple of posts recently on IOCs (most recent one here, earlier post here).  Unfortunately, it doesn't look like they've gone over too well...there hasn't been much in the way of discussion.  That's too bad...I can see how properly developed IOCs can be extremely useful, particularly when it comes to sharing threat intelligence.

Good IOCs can be passed around by folks within the digital analysis community as a means of sharing information.  Do we really think that things like RAM scrapers and keystroke loggers are isolated only to PCI engagements?  I hope not.  In some cases, I don't think that keystroke loggers are even looked for, because they aren't "on the radar" for the particular analyst.  This may be due to a number of factors...lack of experience, fatigue, too many cases piled up, etc.

Also, I can see some significant value in sharing IOCs across security specialties.  Let's say that you're a malware analyst, and you find something interesting, and decide to share the file name, hash, and interesting output from 'strings'.  Okay, that's great, but you've got a sample on your analysis system and an opportunity to share the data you've already collected with others, such as host- and network-based analysts.  Don't think one sample may be too terribly interesting?  How about if, by adding that one sample to an aggregate across a number of samples, trends begin to develop?

Something that may be a little bit more concrete for folks, and may turn into an actual IOC, popped up on ForensicArtifacts.com recently thanks to John Lukach.  Apparently, you can run the iCloud service that you find on iPhones and iPads on Windows, as well...so, if you're working an issue that involves data exfiltration, this may be something that you'll want to look into.

Reporting
I read this post from the HackerAcademy this morning, and I have to say, I agree wholeheartedly!  I started out in infosec performing vulnerability assessments and pen tests, and across the board, I'd rather work with someone who was 80-85% technical, but could produce a deliverable, than someone who was 100% technical, but couldn't (or wouldn't) write.

By itself, writing helps (or should help) us organize our thoughts.  We can start out by writing up a quick analysis plan, and use that as a guide, adding reasons for why we deviated from that plan.  From there, we should be keeping case notes, which have our analysis goals written out right at the top of the page, to keep us on track and on point throughout the analysis.  Finally, when we get to the actual reporting phase, we should have a template that we use, so that by the time that we're finishing up our actual analysis, everything we've written thus far pretty much lets the report write itself.

When performing your analysis, if you don't document something, it didn't happen.  When reporting, you have to keep in mind why you're being paid all the money by a customer...if they could do this work themselves, they wouldn't be paying you.  So...do you give them a three page report (one of which is the cover sheet) that basically says, "we didn't find anything", or do you show them what you did?

Another aspect of reporting is to archive information about the work that's been done by your team.  I was once on a team where our manager told us to put our reports up on a protected file server.  Over time, I found that I was the only one doing this...after every engagement, part of my close-out procedures checklist was to put a copy of the report on the server in the prescribed manner, and securely wipe the data and report off of my analysis system.  Since I wasn't the only one doing analysis work, I believe that a great deal of valuable corporate knowledge was lost because other analysts refused to share their experiences and findings.  We don't all look at or tackle problems in the same way, and someone may have experiences that would greatly benefit your current efforts...if they'd shared them.  Sharing those experiences by posting the reports to the server meant that you could view the final product without having the other analysts do any additional work.

A while (over 2 yrs) ago, I posted a sample report to the Files section of the Win4n6 group.  This report was based on analysis of a sample image downloaded from the Internet.  Of course, I can't be expected to post actual customer reports, and this was the best way I found to go from concept to practice.  I hope that someone who reads it finds it useful.

Thursday, February 23, 2012

A Little Something More Regarding IOCs...

I recently posted a little something on IOCs, and I wanted to take a moment to extend that, along the vein of what I was discussing, particularly the part where I said, "Engage with those with adjacent skill sets".  I think that this is a particularly important undertaking when we're talking about IOCs, because we can get better, more valuable IOCs when we engage with those with adjacent skill sets, and understand what their needs are.  This is true for responders, DF analysts, malware analysts, and we can even include pen testers, as well.

First, a word on specialization.  Many of us in the infosec world recognize the need and importance of understanding a general body of knowledge, but at the same time, many of us also gravitate toward a particular specialization, such as digital forensic analysis of mobile devices or Windows systems, web app assessments/pen testing, etc.  Very often, we find this specialization to be extremely beneficial to us, not only in what we do, but in our careers, and often our hobbies, as well.  What happens when we try to do too much...say, IR, DF, and pen testing?  Well, we really don't become an expert at anything in particular, do we?  Several years ago, I sat on a panel at the SANS Forensic Summit, and the question was asked about who you'd want to show up to respond to an incident...my answer was someone who responded to incidents, not someone who was doing a pen test last week, and pulling cable the week before that.  Think about it...who do you want doing your brain surgery?  A trained, experienced neurosurgeon, or someone who, up until 6 months ago, was driving trucks?  My point is that due to how complex systems have become, who can really be an expert in everything? 

Okay, so to tie that together, while it's good to specialize in something, it's also a really good idea to engage with those with other, ancillary, adjacent specializations.  For example, as a digital forensic analyst, I've had some really outstanding successes when I've worked closely with an expert malware analyst (you know who you are, Sean!!).  By working together with the malware analyst, and providing what they need to do their job better (i.e., context, specific information such as paths, etc.), and getting detailed information back from them, we were able to develop a great deal more (and more valuable) information for our customer, in less time, than if we'd each worked separately on the case, and even more so than if we'd worked solo, without the benefit of the other's skills.

Part of that success had to do with sharing IOCs between us.  Well, we started by sharing the specific indicators associated with what was found...I had not found the indication of the infection until running the fourth AV scanner (a relatively obscure free AV scanner named 'a-squared').  From there, I referred to my timeline, and was able to determine why there seemed to be no Registry-based auto-start mechanisms for the malware.  Then, I provided the information I had available to the malware analyst, including information about the target platform, where the malware was found, a copy of the malware itself, etc.  After disabling WFP on his analysis system, the malware guy got the infected files up and running, and pulled information out of memory that I was then able to use to further my analysis within the acquired image (i.e., domain name, etc.).  Using this information, I targeted specific artifacts...for example, the malware analyst shared that once the malware DLL was in memory, it was no longer obfuscated, so he was able to parse out the PE headers of the file, including the import table.  As the malware DLL imported functions from the WinInet API, it was reasonable to assume (and was verified through analysis) that the malware DLL itself was capable of off-system communications.  This made it clear not only that additional malware was not required, but also told us exactly where we should be looking for artifacts.  As such, in my analysis of the acquired image, I focused on the IE history for the user in question and the pagefile.  I ran strings across the pagefile, looking for indications of the domain name we had, and then wrote a Perl script that would go into the pagefile at that offset for each hit of interest, and grab 100 bytes on either side of the hit.  What we ended up getting was several web server responses to the malware's HTTP off-system communications, which included time stamps from the web server, as well as server directives telling the client to not cache anything.
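
For those interested, the offset-and-context approach is pretty simple to implement.  Here's a minimal sketch of that sort of script; the hit offsets are hypothetical, and in practice would come from your strings/grep output:

#!/usr/bin/perl
# Minimal sketch: given a binary file (e.g., a pagefile) and a list of
# byte offsets where search hits occurred, grab 100 bytes on either side
# of each hit for context.
use strict;
use warnings;

my $file    = shift || "pagefile.sys";
my $context = 100;
my @offsets = (0x1a2b3c, 0x4d5e6f);     # hypothetical hit offsets

open(my $fh, '<:raw', $file) or die "Cannot open $file: $!\n";
foreach my $off (@offsets) {
    my $start = $off > $context ? $off - $context : 0;
    seek($fh, $start, 0);
    read($fh, my $buf, 2 * $context);
    $buf =~ s/[^[:print:]]/./g;         # make non-printable bytes readable
    printf "Offset 0x%x:\n%s\n\n", $off, $buf;
}
close($fh);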

Now, each of us focused on our specialized areas of analysis, but treated the other like a "customer", and that worked extremely well.  We each had a wealth of information available to us, and worked closely to determine what the other needed, and what was most valuable.

Addendum: While driving home from work in traffic today, I thought of something that might reveal a bit more about this exchange...

At the early stage of my analysis, I had an acquired image from a system, from which I'd created a timeline, and against which I had run four AV scans.  The result of the last scan was where one file was identified as "malicious".  Looking at that file within the image, via FTK Imager, I could see that the file was obfuscated (by the contents of the import table), although I didn't know how it was obfuscated (packed, encrypted, both, etc.).  And I only had the acquired image...I didn't have a memory dump.  So before I sent it to the malware guy, I had to figure out how this thing was launched...after all, I couldn't just send him the file and ask him, "what can you tell me about this file?", because his answer would be "not much", or not much that I couldn't already figure out for myself (MD5, strings, etc.).

So, I developed some context around the file, by figuring out how it had been launched on the system...which turned out to be that a regular system DLL had been modified to load the malicious DLL.  Ah, there's the context!  So I could send both files to the malware guy.  When I did, it turned out that the regular system DLL on the malware guy's analysis system was 'protected' by Windows File Protection (WFP), so he couldn't just copy the two files onto the system and reboot to get everything to work.  So I told him which Registry key to use to just disable WFP for the installation process, and everything worked.

I should point out that many of the EXE samples of malware are run through dynamic analysis by the analyst copying the EXE to the desktop and double-clicking it.  However, this was a DLL, so that wouldn't work.  Using rundll32.exe might not work, either...figuring out any needed arguments would just take up a lot of time.  So, by putting in a few more minutes (because that's all it really amounted to) of analysis, I was able to provide the malware guy with enough information to get the malware sample up and running on the test system.

Once the malware guy got things going on his end, he was able to provide me with information about Registry keys accessed by the malware (in this case, none), as well as more details regarding off-system communications.  Because he saw HTTP traffic leaving the test system, this gave us some information as to where to look with respect to how it was doing this, which was confirmed by checking the import table of the DLL once it had been loaded into memory.  The malware guy provided me with this information, which told me pretty much where to look for artifacts of the off-system communications.

Yes, 'strings' run against the now un-obfuscated DLL would have provided some information, but by parsing the import table, that information now had the benefit of additional context, which was further added to through the use of more detailed analysis of the malware.  For example, the data stolen by the malware was never written to disk...once it was captured in memory, it was sent off the system via the HTTP communications.  This was extremely valuable information, as it really narrowed down where we should expect to find artifacts of the data collected/collection process.

The point of all this is that rather than providing simply what was easiest, by collecting additional data/context about particular artifacts, we can provide useful information to someone else, which in turn increases the quality of the information (or intelligence) that we receive back, and very often, in a much quicker, more timely manner.  In short, I help you to help me...

Now, back to our regularly scheduled program, already in progress...

Okay...so what?  Consider this recent blog post...while it is very cool that 7Zip can be used to export the various sections of a PE file into different files, and you can then run 'strings' on those exported sections, one would think that parsing the PE headers would also be beneficial, and provide a modicum of context to the strings extracted from the .text section.  Why does that matter?  Well, consider this post...from my perspective as a host-based analyst, knowing that the string 'ClearEventLog' was found in the import table versus in the code section makes a huge difference in how I view and value that artifact.

Consider an email address that you found somewhere in an acquired image...wouldn't the relative value of that artifact depend on the context, such as whether or not the email address was associated with emails?  And if the address were found in an email, wouldn't its context...whether it was found in the To or From block, or within the body of the email...potentially have a significant impact on your exam?

My point is that sometimes what we may find to be interesting from our perspective may not be entirely useful to someone else, particularly someone with an adjacent skill set.  Other times, something that we think is completely and totally trivial and not interesting at all may be enormously useful to someone else.  A malware analyst may not find a particular bit of malware that uses the Run key as a persistence mechanism very interesting, but as a Windows analyst, that piece of information is extremely valuable to me, not only with respect to that particular sample, but also when correlated with other samples, as well as other threats.

So, some thoughts regarding IOCs:
1.  Specific, targeted IOCs only really work well for specific samples of malware or specific incidents.  By themselves, they may have very limited value.  However, when combined/aggregated with other IOCs, they have much greater overall value, in that they will show trends from which more general IOCs can be derived.

2.  You don't always have to create a new IOC...sometimes, updating one you already have works much better.  Rather than having an IOC that looks for a specific file in a specific directory path (e.g., C:\Windows\ntshrui.dll), why not look for all files of a particular type in that directory, and potentially whitelist the good ones?  (See the sketch after this list.)

3.  Working together with analysts with adjacent skill sets, and understanding what artifacts or indicators are valuable to them, produces better overall IOCs.  A malware analyst may find an MD5 hash to be useful, but a DF analyst may find that the full path to the file itself, as well as PE headers from the file, are more useful.  A malware analyst may find the output of 'strings' to be useful, but a DF analyst may need some sort of context around or associated with those strings.

4.  No man, or analyst, is an island.  As smart as each of us may be, we're not as smart as several of us working together.  Tired of the clichés?  Good.  Working with other specialists...malware and memory analysts, host-based DF folks, network analysis folks...is a much better solution and produces a much better overall result than one guy trying to do it all himself.
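
Going back to point 2 above, here's a minimal sketch of what that broader IOC might look like in Perl; the mount point and whitelist entries are hypothetical:

#!/usr/bin/perl
# Minimal sketch: rather than looking for one specific file name, sweep
# a directory for all files of a given type and flag anything not on a
# known-good list.
use strict;
use warnings;

my $dir = "F:/Windows";                 # mounted image path, hypothetical
my %whitelist = map { lc($_) => 1 }
    qw(twain.dll twain_32.dll);         # hypothetical known-good DLLs

foreach my $file (glob("$dir/*.dll")) {
    my ($name) = $file =~ m{([^/]+)$};
    next if $whitelist{lc($name)};
    printf "Not whitelisted: %s (%d bytes)\n", $file, -s $file;
}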

Monday, February 20, 2012

A Little Something on IOCs...

Not long ago, I read Greg's blog post on using HBGary's DDNA to detect APT attackers in memory, and ran across an interesting statement near the end of the post, specifically, the first paragraph in the Using IOC’s effectively subsection (it's kind of long so I won't copy/paste it here...though I highly recommend that you take a look).  I have to say, I agree with Greg's statements...IOCs can be effective tools, but like any tool, they are only as effective as how they are used.  I think that IOCs have great potential in data reduction and providing an automated facility for removing the "low-hanging fruit" from analysis, thereby leaving the analyst to...you know...analyze.  But like Greg says, if your IOCs are just lists of file names and MD5 hashes, then they really aren't all that effective.

Back when I was performing PCI forensic exams, one of the most frustrating things was the lists of "IOCs" we got from one of the major card brands; we'd get these lists of mostly file names and MD5 hashes, two of the most mutable aspects of IOCs, and we were told that we had to include these ever-expanding lists in searches across all acquired images.  Okay, so let's say we found bp0.exe during an exam, but the hash was different from the one provided in the issued list...we noted that and sent it back in, and then the next list we saw would have the new hash added to it.  However, the lack of sharing any real intel was frustrating, as having a more comprehensive intel model would have allowed all analyst firms to move beyond the relatively simple, "yes, you were compromised" and more toward, "this is how you were compromised, this is your window of compromise, and this is what you can do to prevent it in the future."  For example, we'd get file names, but in most cases, not paths.  We'd get MD5 hashes, but not fuzzy hashes, and Chris Pogue would push for fuzzy hashes.  In most cases, we wouldn't even get things like Registry artifacts that would let us see when something had been installed and run, but then deleted.

So, per Greg's statements, one way to make IOCs more effective is to create them so that they're not just a simple blacklist.  Rather than looking for a specific Registry value used for persistence (because file names and Registry value names can be random and change...), have the IOC either dump all of the values in the key for manual review, or correlate multiple keys and values in order to narrow the focus a bit.  Again referring back to past PCI engagements, when looking for credit card numbers (CCNs) within an image, we had three checks...length, BIN, and Luhn check...that we would use, but we got a lot of false positives (such as MS GUIDs).  However, adding additional checks for track 1 and 2 data not only narrowed down our criteria, but increased our confidence that what we'd found was actual useful data and NOT false positives.  So, as Greg suggests, looking for more general things, or adding additional items to your IOCs, can be very effective.
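
For reference, the Luhn check itself is only a few lines of code; a minimal sketch in Perl (the length, BIN, and track data checks would be layered on top of this):

#!/usr/bin/perl
# Minimal sketch of the Luhn check: starting from the right, double
# every second digit (subtracting 9 if the result exceeds 9), sum
# everything, and check that the total is divisible by 10.
use strict;
use warnings;

sub luhn_ok {
    my @digits = reverse split //, shift;
    my $sum = 0;
    for my $i (0 .. $#digits) {
        my $d = $digits[$i];
        if ($i % 2) {
            $d *= 2;
            $d -= 9 if $d > 9;
        }
        $sum += $d;
    }
    return ($sum % 10) == 0;
}

# 4111111111111111 is a well-known test number that passes the check
print luhn_ok("4111111111111111") ? "passes Luhn\n" : "fails Luhn\n";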

The effectiveness of IOCs is increased through open source intel sharing, by having someone who discovers something develop an effective (and well-documented) IOC and share it with their team.  These IOCs can be even more effective if they're shared openly (even if it's within a relatively closed community), because then they're not only subject to a more open review, but they can also be expanded further.

The effectiveness of IOCs is completely lost when the folks using IOCs do nothing but consume them; that is, download the latest IOCs and run them without any review or understanding.  Tools that use a plugin-based approach (Nessus comes to mind...) are only as effective as the effort put into the plugins, and what's actually reported.  A recent InformationWeek article describes eight lessons learned from the recent Nortel revelation...number 6 is "conduct a thorough forensic analysis".  If you suspect that you have something untoward going on in your infrastructure, and you run a list of IOCs and don't get anything definitive...maybe it's time to break out the toolkit and take things a step further.  You may find something entirely new...or you may simply get peace of mind.

Let's take a look at a couple of tandem blog posts that addressed some malware analysis recently.  I caught wind of this on Twitter...we all know how difficult it is to share anything on Twitter, because you have to squeeze things down into 140 characters, or make multiple, abbreviated posts...and a lot gets lost that way.  First, we have Keith's post on the Digital4rensics blog, and then there's the Digitalsec4u post on the New Age Security blog.  Both address different components of the analysis of something that hit an infrastructure and was shared, in order to complete some analysis.  You have to look at both of these blog posts together to pull out the IOCs; read the Digitalsec4u post first to see what hit the infrastructure, and what it "looked like" when it came in, and get the Initial Infection Vector.  Then read Keith's post to begin getting a view of what happened once the original bit of malware was executed.

Now, I know from some perspectives, neither the malware that was analyzed nor the actual results are particularly "earth shattering", particularly from the perspective of these two gents who very graciously shared their analysis.  However, there is considerable value in the data that has been posted and shared.  For example, this malware came in as an email attachment, disguised to look like a plane ticket.  We can see hashes on the original payload, and we know that these artifacts are easily mutable.  We can also see that the malware is packed...that's something we can latch on to (IOC: search email attachment directories for packed executables).  Moving over to Keith's post, we begin to see what happens when the original malware is executed.  We can use this information to see if our IR (and DF) processes are sufficient to deal with this type of issue; assume that it wasn't detected and blocked by your in-place security mechanisms, how would you go about responding to this?  Or, better yet, if part of it was blocked, would your current response and analysis process be suitable to determine that the follow-on stages hadn't actually executed?  This is an extremely important question, and one that not a lot of responders and analysts have been trained to consider, or answer.
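
As a minimal sketch of that "packed executables" IOC, here's one way to approach it, using Shannon entropy as a rough heuristic (packed or encrypted content tends toward 8 bits/byte).  The 7.0 threshold is an assumption, and real triage would also parse the PE headers:

#!/usr/bin/perl
# Minimal sketch: compute the Shannon entropy of a file and flag high
# values as possibly packed/encrypted.
use strict;
use warnings;

sub entropy {
    my $data = shift;
    my %freq;
    $freq{$_}++ for unpack("C*", $data);
    my ($len, $h) = (length($data), 0);
    foreach my $count (values %freq) {
        my $p = $count / $len;
        $h -= $p * log($p) / log(2);
    }
    return $h;
}

my $file = shift or die "Usage: $0 <file>\n";
open(my $fh, '<:raw', $file) or die "Cannot open $file: $!\n";
my $data = do { local $/; <$fh> };      # slurp the whole file
close($fh);

my $h = entropy($data);
printf "%s: %.2f bits/byte%s\n", $file, $h,
    ($h > 7.0 ? " - possibly packed" : "");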

Some lessons from the two blog posts:
Engage with those with adjacent skill sets; the above two blog posts are great examples of malware analysts doing their thing, but there are a number of artifacts and other bits of information that would be extremely useful for responders and analysts.  Keith's blog has some request URLs that network analysts can search for in proxy and firewall logs, but there's a question as to how these requests are made - does the malware use the WinInet APIs?  If so, we would expect to see other artifacts on the system, wouldn't we?  So, it's not unusual for an analyst to highlight or share what makes sense to them, but they often have information right there in front of them that would be useful to someone else with an adjacent skill set (responder, DF analyst, network analyst, etc.)...sharing that information would reduce the amount of time others spend looking for these things.

When sharing that information, specificity of language is important.  Does the malware create a Registry key...or does it add a value to an existing key?  If it adds a value to the Run key...which one?  The same is true for file and directory paths, as well as identifying the platform you're infecting during your analysis.

In short, being clear and concise in presenting your findings makes it much easier for someone to not only replicate your findings (taking variations in platform...WinXP vs Win7...into account), but to also use that information in an immediate response mode, and begin seeing if their systems are infected.

Okay, so why are these lessons important?  There's a bit more to this than was mentioned in the blog posts; Keith mentioned that one of the bits of malware was identified by Microsoft (via VirusTotal) as PWS:Win32/Sypak.A...which is not just a password stealer, but a keystroke logger.  This means that it doesn't just capture passwords...it can potentially capture everything, and this can be extremely detrimental to an organization.

Saturday, February 18, 2012

Sharing and Case Studies

There's been some discussion of late in various corners of the DFIR community regarding "sharing" amongst those of us in the community.  Some of it has centered on sharing of threat intelligence and IOCs, and that's sort of spawned off into sharing "case studies".  I ran across a couple of threads on one online forum (like this one...), where an interest in case studies was expressed, but as is usually the case, few are actually stepping forward to provide them.

My hope is that these discussions have formed a crack in the wall of silence, and that some of us can squeeze our fingers into that crack, and with the help of others in the community, pry it apart.

Andrew Case posted an excellent case study on the DFS blog, one that is truly inspiring.  Andrew looked at what he wanted to do and used what was available to him to get it done.

Corey Harrell has posted a number of times to his blog regarding exploit artifacts, and has taken up a series of posts on accessing Volume Shadow Copies.  Corey provided some excellent material which he graciously allowed me to incorporate into chapter 3 of WFAT3e, and he has since taken that several steps further.  While not specifically related to cases Corey has worked, this is some excellent information that is of significant value to anyone who encounters Vista and Win7 systems.

Recently, Melia posted a "case experience" to her blog, where she had to address an issue of spoliation.  I've had similar cases, and I've had issues where such a thing was just part of the case...and I think what Melia's post really points out is that Windows does a great job of illustrating Locard's Exchange Principle; that is, when the user or an application interacts with the operating system, there are very often artifacts of that interaction that survive attempts by the user to cover their tracks.

From a link in Melia's blog post, I found out about the Cheeky4n6Monkey blog, which has some excellent posts, including this one on creating a RegRipper plugin for CCleaner, which actually covers two topic areas: artifacts associated with running CCleaner, and writing a RegRipper plugin.  The author correctly points out that I covered how to write RegRipper plugins in WRF.

There are even more examples available of how analysts have pursued real-world cases.  Take a look at this post to the Mandiant blog by Nick Harbour, from July, 2010.

I know from experience (in the industry, writing books, etc.) that many of us really enjoy reading or hearing case studies, but the fact is that few of us actually share case studies, or just some portion of our experiences (I won't get into the "why"...).  From recent experience, this is disheartening when you go to a conference where many of the presentation titles lead you to believe that case studies or experiences will be shared, but all you get during the presentations, and even out in the common areas between presentations, are blank stares.

No one of us is as smart as all of us.  There are some of us who've seen a lot of things, but no one has seen or done everything.  Sharing what we've seen through presentations and blog posts is a great way for us to learn, without having to have had the actual experiences.

Saturday, February 04, 2012

HowTo: USB Thumb Drives

Now and again, I get some interesting questions from folks, usually posing a previously-addressed question with a slightly different twist on it.  I received one of these types of questions recently and wanted to post a HowTo for others to review, and provide something to which they can add comments.

The question involved a thumb drive, and mapping the use of the thumb drive to a Windows shortcut/LNK file.  The items that we have are the device serial number (pulled from the device descriptor of the thumb drive...remember, this is NOT located in the memory area) and the volume serial number (VSN) from the formatted volume on the device.  These definitions should help you understand what we're referring to.

By now, most of us are familiar with how to go about doing USB device analysis on Windows systems.  This has been covered extensively by Rob Lee of SANS (see the Resources section below), as well as in Windows Registry Forensics and Windows Forensic Analysis 3/e.

The key to answering the question of mapping volume serial numbers (VSNs) to specific devices on Vista and Windows 7 can be found in the EMDMgmt key.  This key is associated with ReadyBoost, and lists some details of the device that was connected to the system, including the unique device descriptor, the volume name, and the VSN.  Remember that the VSN can be changed simply by reformatting the device; however, this key should provide valuable information for mapping devices and VSNs (pulled from LNK files).
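
As an illustration, pulling the VSN out of an EMDMgmt subkey name and converting it to the hex format you'd see from a parsed LNK file is straightforward.  A minimal sketch follows, with a made-up subkey name; the general format is the device ID, followed by the volume name, an underscore, and the VSN in decimal:

#!/usr/bin/perl
# Minimal sketch: extract the decimal VSN from the end of an EMDMgmt
# subkey name and convert it to the XXXX-XXXX hex form.
use strict;
use warnings;

my $key_name = '_??_USBSTOR#Disk&Ven_Test&Prod_Drive&Rev_1.00' .
               '#0123456789&0#{GUID}MYVOL_1234567890';   # hypothetical

if ($key_name =~ /_(\d+)$/) {
    my $vsn = $1;
    printf "VSN (decimal): %s\n", $vsn;
    printf "VSN (hex)    : %04X-%04X\n",
        ($vsn >> 16) & 0xFFFF, $vsn & 0xFFFF;
}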

The EMDMgmt key and its usefulness are discussed in detail in chapter 5 of Windows Forensic Analysis 3/e.

Oftentimes, the Windows Portable Devices key (mentioned on pg 115 of Windows Registry Forensics; the data can be extracted from the Software hive using the port_dev.pl RegRipper plugin) will contain some very useful information, such as historical drive mappings.  Beneath this key is the Devices key, and beneath that key are subkeys that refer to devices that have been connected to the system.  The "FriendlyName" value will often contain the drive letter to which the device was mounted; in one instance, I had a device that was connected and had the volume name ("TEST") in that particular value data, rather than the drive letter.  The subkey name will usually contain the unique device identifier (very often, the device serial number) within the name...simply parse the key name apart, using "#" as your separator, and you'll see it at or near the end.
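
A minimal sketch of that parsing, with a made-up key name for illustration:

#!/usr/bin/perl
# Minimal sketch: split a Windows Portable Devices subkey name on '#'
# and look at the elements for the unique device ID at or near the end.
use strict;
use warnings;

my $key_name = 'DISK&VEN_TEST&PROD_DRIVE&REV_1.00' .
               '#0123456789&0#{GUID}';                   # hypothetical

my @parts = split /#/, $key_name;
# The unique device ID (very often the serial number) is at or near
# the end of the list of elements
foreach my $i (0 .. $#parts) {
    printf "Element %d: %s\n", $i, $parts[$i];
}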

Addendum
I came up with a graphic to illustrate the relationship between various Registry hives and keys (and some values) with respect to this analysis:


Resources
Blog Post: Windows Portable Devices (Vista)
SANS Forensic Guide for profiling thumb drives on Windows systems
WindowsIR: Mapping USB devices via LNK (2007)