Sunday, March 15, 2015

Perspectives on Threat Intel

Source: detect-respond.blogspot.com
A while back, I tweeted, saying that "threat intel has it's own order of volatility".  That tweet got one RT and 2 favorites, and at the time, not much of a response beyond that.  Along the way, someone did disagree with me on that, stating that rather than an "order of volatility", threat intel instead has a "shelf life".

Thinking about it, I can see where both are true.

To begin with, let's consider this "order of volatility"...what am I referring to?  Essentially, what I'm talking about was detailed in 2002, in RFC 3227, Guidelines for Evidence Collection and Archiving.  In short, the RFC states that when collecting evidence, "you should proceed from volatile to the less volatile".  What this means is that when collecting "evidence", you should collect that evidence that is most likely to change first, or soonest.  This is a guiding principle that should be used to direct collection methodologies.

As such, the "order of volatility" itself is guidance, as the definition appears in section 2.1 of the RFC, whereas the discussion of actual collection does not begin until section 3.

The term "shelf life" refers to the fact that threat intel indicators have a time period during which they are useful, or have value.  Notice that I don't say "specified time period", because that can vary.  Those who conduct these types of investigations have seen where a single file, with the same name and stored in the same file system location on two different endpoints, placed in those locations minutes apart, may have different hashes.  Or, during an IR investigation, you may find RATs on two different endpoints that are essentially the same version, but with different C2 domain names.  When doing historical analysis of indicators such as collected hashes and domain names, we see how these are useful for a limited amounted of time; they have a "shelf life".

Several years ago when I was doing PCI investigations, we ran across a file named "bp0.exe".  We'd seen it before, and happened to have a copy of the file that we'd seen 8 months prior to the current investigation.  Both files have the same path within the file system (on two completely different investigations and endpoints, of course), and they had different MD5 hashes.  Using fuzzy hashing, we found that they were 98% similar (using the phraseology somewhat loosely).  This is a good example of how MD5 hashes have a "shelf life" and tend to remain valid for a limited time.

When you consider David Bianco's Pyramid of Pain from a network- or malware RE-perspective, the technical indicators at the lower half of the pyramid (hash values, C2 IP addresses, domain names) definitely have a shelf life; they are only 'good' (valid) for a specified period of time.  These indicators can change between campaigns, or as is often the case, during a campaign.  Those who perform threat intel collection and analysis are also aware that C2 IP addresses and domain names can also change quickly (often within hours or even minutes), and there are organizations that continually monitor this sort of thing and track them (i.e., what IP addresses various domain names resolve to, etc.) over time.

So, when you look at "threat intel" from a perspective that is external to a compromised infrastructure, using open source intel collection (and including analysis of samples from sites such as VirusTotal...), then the indicators do, indeed, have a shelf life.

However, when considering these same threat intel indicators from the perspective of the endpoint systems, we can see how they have an order of volatility, as well, and that delays in detection and response will lead to some of these endpoint indicators becoming more difficult to recover, and even unavailable.  If an installer is run on a system, and deletes itself after infecting the endpoint, then the file (along with the MD5 hash of that file) is gone.  In some cases, this happens so quickly that the installer file itself may not be written to physical disk, so there is literally nothing that can be recovered, or "carved".  When malware reports out to a C2 domain, how long does that persist on the endpoint?  Well, it may depend on the particular API used by the malware to perform that off-system communications.  If the malware was written to create it's own network sockets, the domain name may exist in memory for only a very short time, and persist on the network (this does not include any logs on other endpoints) for an even shorter period of time.  The domain name may be found in the pagefile, but again, this does not mean that the domain name is recorded on the endpoint indefinitely...even a C2 domain name or URL will only persist in the pagefile for so long.

If you consider the Pyramid of Pain from the perspective of endpoints on a compromised infrastructure, the indicators begin to have an "order of volatility", in the manner described in RFC 3227,  Some indicators can persist for varying periods of time on those endpoints; some will persist for only a short time, while others will persist for quite some time.

For example, some indicators may be present in the Security Event Log, and in most cases (that I've dealt with), event records of value have been obviated by the normal operation of the system, simply due to the fact that the Security Event Logs have "rolled over", as older event records have been overwritten by newer ones.  I've received images of systems and the Security Event Log was 22MB in size, and when parsed, contained maybe...maybe...2 days worth of events.  The specific event I was interested in occurred weeks (or in some cases, months) prior to when the image as acquired.

Some data sources...RecentFileCache.bcf file, AppCompatCache data, Prefetch files, for example...can be obviated by the passage of time and the normal operation of the system.   I've seen IR team response procedures that included running multiple tools on a live system, which caused the Prefetch files of interest to be deleted, as new programs were run and new Prefetch files created by the operating system.

Ultimately, it seems that the discussion of "shelf life" versus "order of volatility" depends upon your perspective.  If you're not considering endpoints at all, and only looking at the "threat intel" that we usually see collected, discussed and shared (MD5 hashes, C2 IP address and domains) through open sources, then yes, I believe that it is well understood that the indicators do have a "shelf life"; that is, a C2 IP address or domain may be valid at the time that it was found, but there's nothing to say that it will continue to be valid 6 weeks or 6 months later.

However, from the perspective of the endpoint, it's pretty clear that indicators have an order of volatility all their own, and that order can be impacted (in some cases, significantly) by external stimulus, such as IR data collection procedures, or actions taken by an adversary (or an admin).

Full Disclosure
Like many other organizations, my employer provides threat intelligence to clients, some of which is network- and malware-based, and collected through open sources.  In my role within the company, as an incident responder and digital forensic analyst, I tend to be both a consumer and producer of threat intel that is based on analysis of endpoints.  What I wanted to point out in this post is that there are different perspectives on the issue, and that doesn't mean that any one is wrong.

Addendum, 17 Mar: Looking back on this post, it occurs to me that the "shelf life" description applies to indicators in the bottom half of the pyramid, from a malware RE perspective, and the "order of volatility" description applies to indicators at all levels of the pyramid, from a host or endpoint perspective.

Tuesday, March 10, 2015

Links

Revisiting Macros
Kahu Security posted this recent blog article that was pretty interesting.  I thought that the trick that was used was pretty interesting, and yes, "sneaky"...but part of me was wondering what this sort of thing would "look like" within a system image.  What I mean is, if you're tasked with looking at an image of a system that may have been infected via this sort of trick, what would you look for?

The first thing that jumps out at me is the warning displayed in Word, in the second figure in the post.  Once the user clicks on the "Enable Content" button, a record is created within the user's MSOffice TrustRecords key.  This information can be extract using the RegRipper trustrecords.pl plugin, or added to a timeline using the trustrecords_tln.pl plugin.  These plugins were mentioned as part of the HowTo: Determine User Access to Files blog post from July, 2013.

Once the user clicks on the button and the macros are enabled, you'll see the other files described in the blog post created and launched within the file system.  Because the command execution does not persist in memory once the process completes, this is an excellent argument for the use of tools such as Sysmon and Carbon Black.

If some of the files that are part of the infection process are deleted or time stomped, AND you can get to the system relatively quickly, a great resource for analysis is the USN change journal.

As far as recovering/extracting the actual macro itself, take a look at this blog post for some helpful hints and tools.  Also, the folks at OpenDNS posted about Investigating a Malicious Attachment without Reversing.

USN Change Journal
Speaking of files being created on a system (see what I did there?), Mari has a new blog post up where she shares her experience using the USN change journal during analysis.  As you can see in her excellent blog post, the USN change journal remains an excellent resource for obtaining extremely transitory information regarding system activity; while we don't know exactly which process is responsible for various files being created, we can see within a timeline when the various file system activities occurred, and this can not only provide additional context to a timeline, but it can also fill in some gaps in that timeline.

It's great that Mari is sharing her analysis experience with others, as it really helps those of us who may not have the same types of cases as she does, or those who may not approach (or "do") analysis in the same way.   It's the sharing of those experiences, as well as asking questions, that builds a stronger community.

ADSs 
I ran across Matt's recent blog post...what attracted my attention to it were the references ADSs and Powershell.  ADSs are something I've been interested in for quite a long time, and I've included sections in my books that have discussed creating them, running code from within them, and what they "look like" to tools such as Carbon Black.

As to the post, there was an exchange on Twitter regarding the original content of the post, which centered around the use of the phrase "without touching disk"...Matt took the initiative to correct this, as you cannot create an ADS, and at the same time say that you aren't "touching disk".  This is an interesting approach, and Matt's right...under most normal circumstances, at least the circumstances I encounter as an incident responder, something like this would go undetected by sysadmins.  However, this is pretty trivial to address for an incident responder, using various artifacts, some that don't hang around as long (see this blog post), and others that persist for a while longer.

Registry Stuff
I recently ran across this post over on the System Forensics blog...unfortunately, as is the case with many blogs (it seems), comments are turned off, so I have to comment here...

Early on in the post, Patrick says:

Then I ran a few more well known tools and in one case didn’t see some of the entries at all, and in another case saw the entries, but no context was provided.

Interesting...which tools?  Was anything done to contact the author(s)?  Was any data shared so that the author(s) could make the appropriate updates?

At the end of the post, Patrick says:

If you’re simply relying on the output of a tool you’re possibly missing some good information.

He's absolutely right, which is why I strongly recommend that analysts get into the Registry when performing analysis, looking to see what's there.  This is particularly true if there are any issues with tools that don't show you what you expect to see in the output.

This particular MRU isn't something I've seen before, and it is interesting...I can clearly see the value of this data.  If Patrick is willing to share some test data, I'd be more than happy to update the appropriate RegRipper plugin(s).  As the author of RegRipper, I'm fully aware that I don't see everything that is possible to see in the Registry, and as such, I rely on the good will of the DFIR community when it comes to sharing data so that RegRipper plugins can be created or updated.  For example, Eric Zimmerman recently shared some USRCLASS.DAT hive files with me so that I could update the RegRipper shellbags.pl plugin to be able to address shell items particular to Windows 8.1; I've updated some of the new folder shell items and I'm working on the MTP device shell items.  I have also taken an opportunity to run the updated plugin against a USRCLASS.DAT hive from Windows 10 TP, but the content is limited.  Therefore, so is the testing.

Also, if Patrick were willing to share what it is he doesn't like about the output of some of the available tools, I'd be happy to consider making changes to RegRipper output, as appropriate.

Speaking
I submitted two responses...titles and abstracts...to the HTCIA2015 conference "call for papers", and both were accepted.  The conference is in Orlando, FL, at the beginning of September.  Submitting for this conference was an interesting experience, as I started by asking what the attendees might be interested in hearing or seeing, and was told, in short, "yes!"  My presentation titles are "Registry Analysis" and "Lateral Movement".

For the "Lateral Movement" presentation, I plan to discuss various methods of lateral movement within an infrastructure and what they "look like", with respect to the source and destination systems.  I chose this topic as an example, and was told "yes!!"  Also, I've been to conferences before where topics such as this are discussed, but there's been no real discussion or presentation of what the artifacts look like on systems.

I have something of an idea at this point regarding what I'm going to talk about during the "Registry Analysis" presentation, and I'm working on crystallizing it a bit.  At this point, this is going to be an advanced presentation,

My question to you is, if you were to see a 1 hr presentation entitled, "Registry Analysis", what would you hope to get out of it?  What would you look for to be discussed?

Monday, March 02, 2015

How do you "do" analysis?

Everybody remembers "The Matrix", right?  So, you're probably wondering what the image to the right has to do with this article, particularly given the title.  Well, that's easy...this post is about employing various data sources and analysis techniques, and pivoting in order to add context and achieve a greater level of detail in your analysis.  Sticking with just one analysis technique or process, much like simply trying to walk straight through the building lobby to rescue Morpheus, would not have worked.  In order to succeed, Neo and Trinity had to pivot and mutually support each other in order to achieve their collective goal.  So...quite the metaphor for a blog post that involves pivoting, eh?

Timeline Analysis
Timeline analysis is a great technique for answering a wide range of questions.  For malware infections and compromises, timeline analysis can provide the necessary context to illustrate things like the initial infection (or compromise) vector, the window of compromise (i.e., based on when the system was really infected or compromised, if anti-forensics techniques were used), what actions may have been taken following the infection/compromise, the hours during which the intruder tends to operate, and other systems an intruder may have reached to (in the case of a compromise).

Let's say that I have an image of a system thought to be infected with malware.  All I know at this point is that a NIDS alert identified the system as being infected with a particular malware variant based on C2 communications that were detected on the wire, so I can assume that the system must have been infected on or before the date and time that the alert was generated.  Let's also say that based on the NIDS alert, we know that the malware (at least, some variants of it) persists via a Windows service.  Given this little bit of information, here's an analysis process that I might follow, including pivot points:
  1. Load the timeline into Notepad++, scroll all the way to the bottom, and do a search (going up from the bottom) to look for "Service Control Manager/7045" records.
  2. Locate the file referenced by the event record by searching for it in the timeline.  PIVOT to the MFT: parse the MFT, extract the parsed record contents for the file in question in order to determine if there was any time stomping involved.
  3. PIVOT within the timeline; start by looking "near" when the malware file was first created on the system to determine what other activity occurred prior to that event (i.e., what user was logged in, were there indications of web browsing activity, was the user checking their email, etc.)
  4. PIVOT to the file itself: parse the PE headers to get things like compile time, section names, section sizes, strings embedded in the file, etc.  These can all provide greater insight into the file itself.  Extract the malware file and any supporting files (DLLs, etc.) for analysis.
  5. If the malware makes use of DLL side loading, note the persistent application name, in relation to applications used on the system, as well as within the rest of the infrastructure.  
  6. If your timeline doesn't include AV log entries, and there are AV logs on the system, PIVOT to those in order to potentially get some additional detail or context.  Were there any previous attempts to install malware with the same or a similar name or location?  McAfee AV will flag on behaviors...was the malware installed from a Temp directory, or some other location?  
  7. If the system has a hibernation file that was created or modified after the system became infected, PIVOT to that file to conduct analysis regarding the malicious process.
  8. If the malware is known to utilize the WinInet API for off-system/C2 communications, see if the Local Service or Network Service profiles have a populated IE web history (location depends upon the version of Windows being examined). 
  9. If the system you're analyzing has Prefetch files available, were there any specific to the malware?  If so, PIVOT to those, parsing the modules and looking for anything unusual.  
Again, this is simply a notional analysis, meant to illustrate some steps that you could take during analysis.  Of course, it will all depend on the data that you have available, and the goals of your analysis.

Web Shell Analysis
Web shells are a lot of fun.  Most of us are familiar with web shells, at least to some extent, and recognize that there are a lot of different ways that a web shell can be crafted, based on the web server that's running (Apache, IIS, etc.), other applications and content management systems that are installed, etc.  Rather than going into detail regarding different types of web shells, I'll focus just on what an analyst might be looking for (or find) on a Windows server running the IIS web server.  CrowdStrike has a very good blog post that illustrates some web shell artifacts that you might find if an .aspx web shell is created on such a system.

In this example, let's say that you have received an image of a Windows system, running the IIS web server.  You've created a timeline and found artifacts similar to what's described in the CrowdStrike blog post, and now you're read to start pivoting in your analysis.

  1. You find indications of a web shell via timeline analysis; you now have a file name.
  2. PIVOT to the web server logs (if they're available), searching for requests for that page.  As a result of your search, you will know have (a) IP address(es) from where the requests originated, and (b) request contents illustrating the commands that the intruder ran via the web shell.
  3. Using the IP address(es) you found in step 2, PIVOT within the web server logs, this time using the class C or class B range for the IP address(es), to cast the net a bit wider.  This can give you additional information regarding the intruder's early attempts to fingerprint and compromise the web server, as you may find indications of web server vulnerability scans originating from the IP address range.  You may also find indications of additional activity originating from the IP address range(s).
  4. PIVOT back into your timeline, using the date/time stamps of the requests that you're seeing in the web server logs as pivot points, in order to see what events occurred on the systems as a result of requests that were sent via the web shell.  Of course, where the artifacts can be found may depend a great deal upon the type of web shell and the contents of the request.
  5. If tools were uploaded to the system and run, PIVOT to any available Prefetch files, and parse out the embedded strings that point to module loaded by the application, in order to see if there are any additional files that you should be looking to.
Once again, this is simply a notional example of how you might create and use pivot points in your analysis.  This sort of process works not just for web shells, but it's also very similar to the process I used on the IBM ISS ERS team when Chris and I were analyzing SQL injection attacks via IIS web servers; conceptually, there is a lot of overlap between the two types of attacks.

Additional Resources (Web Shells)
Security Disclosures blog post
Shell-Detector
Yara rules - 1aN0rmus, Loki

Memory Analysis
This blog post from Contextis provides a very good example of pivoting during analysis; in this case, the primary data source for analysis was system memory in the form of a hibernation file.  The case stated with disk forensics, and a hit for a particular item was found in a crash dump file, and then the analyst pivoted to the hibernation file.

Adam did a great job with the analysis, and in writing up the post.  Given that this post started with disk forensics, some additional pivot points for the analysis are available:

  1. Pivoting within the memory dump, the analyst could have identified any mutex utilized by the malware.  
  2. Pivoting into a timeline, the analyst may have been able to identify when the service itself was first installed (i.e., "Service Control Manager" record with event ID 7045).
  3. Determining when the malicious service was installed can lead the analyst to the initial infection vector (IIV), and will be extremely valuable if the bad guys used anti-forensic techniques such as time stomping the malware files to try to obfuscate the creation date.
  4. Pivot to the MFT and extract records for the malicious DLL files, as well as the keystroke log file.  Many of us have seen malware that includes a keylogger component that will continually time stomp the keystroke log file as new key strokes are added to it.  

"Doing" Analysis
I received an interesting question a while back, asking for tips on how I "do analysis".  I got to thinking about it, and it made sense to add my thoughts to this blog post.

Most times, when I receive an image, I have some sort of artifact or indicator to work with..a file name or path, a date/time, perhaps a notice from AV that something was detected.  That is the reason why I'm looking at the image in the first place.  And as a result, producing a timeline is obviated by the questions I need to answer; that is to say, I do not create a timeline simply because I received an image.  Instead, I create a timeline because that's often the best way to address the goals of my exam.

When I do create a timeline, I most often have something to look for, to use as an initial starting or pivot point for my analysis.  Let's say that I have a file that I'm interested in; the client received a notification or alert, and that led them to determine that the system was infected.  As such, they want to know what the malware is, how it got on the system, and what may have occurred after the malware infected the system.  After creating the timeline, I can start by searching the timeline for the file listing.  I will usually look for other events "around" the times where I find the file listed...Windows Event Log records, Registry keys being created/modified, etc.

Knowing that most tools (TSK fls.exe, FTK Imager "Export Directory Listing..." functionality) used to populate a timeline will only retrieve the $STANDARD_INFORMATION attributes for the file, I will often extract and parse the $MFT, and then check to see if there are indications of the file being time stomped.  If it does appear that the file was time stomped, I will go into the timeline and look "near" the $FILE_NAME attribute time stamps for further indications of activity.

One of the things I use to help me with my analysis is that I will apply things I learned from previous engagements to my current analysis.  One of the ways I do this is to use the wevtx.bat tool to parse the Windows Event Logs that I've extracted from the image.  This batch file will first run MS's LogParser tool against the *.evtx files I'm interested in, and then parse the output into the appropriate timeline format, while incorporating header tags from the eventmap.txt event mapping file.  If you open the eventmap.txt file in Notepad (or any other editor) you'll see that it includes not only the mappings, but also URLs that are references for the tags.  So, if I have a timeline from a case where malware is suspected, I'll search for the "[MalDetect]" tag.  I do this even though most of the malware I see on a regular basis isn't detected by AV, because often times, AV will have detected previous malware infection attempts, or it will detect malicious software downloaded after the initial infection (credential dumping tools, etc.).

Note: This approach of extracting Windows Event Logs from an acquired image is necessitated by two factors.  First, I most often do not want all of the records from all of the logs.  On my Windows 7 Ultimate system, there are 141 *.evtx files.  Now, not all of them are populated, but most of them do not contain records that would do much more than fill up my timeline.  To avoid that, there are a list of less than a dozen *.evtx files that I will extract from an image and incorporate into a timeline.

Second, I often work without the benefit of a full image.  When assisting other analysts or clients, it's often too cumbersome to have a copy of the image produced and shipped, when it will take just a few minutes for them to send me an archive containing the *.evtx files of interest, and for me to return my findings.  This is not a "speed over accuracy" issue; instead, it's a Sniper Forensics approach that lets me get to the answers I need much quicker.

Another thing I do during timeline analysis is that I keep the image (if available) open in FTK Imager for easy pivoting, so that I can refer to file contents quickly.  Sometimes it's not so much that a file was modified, as much as it is what content was added to the file.  Other times, contents of batch files can lead to additional pivot points that need to be explored.

Several folks have asked me about doing timeline analysis when asked to "find bad stuff".  Like many of you reading this blog post, I do get those types of requests.  I have to remember that sometimes, "bad stuff" leaves a wake.  For example, there is malware that will create Registry keys (or values) that are not associated with persistence; while they do not lead directly to the malware itself (the persistence mechanism will usually point directly to the malware files), they do help in other ways.  One way is that the presence of the key (or value, as the case may be) lets us know that the malware is (or was) installed on the system.  This can be helpful with timeline analysis in general, but also during instances when the bad guy uses the malware to gain access to the system, dump credentials, and then comes back and removes the malware files and persistence mechanism (yeah, I've seen that happen more than a few times).

Another is that the LastWrite time of the key will tell us when the malware was installed.  Files can be time stomped, copied and moved around the file system, etc., all of which will have an effect on the time stamps recorded in the $MFT.  Depending on the $MFT record metadata alone can be misleading, but having additional artifacts (spurious Registry keys created/modified, Windows services installed and started, etc.) can do a great deal to increase our level of confidence in the file system metadata.

So, I like to collect all of those little telltale IOCs, so that when I do get a case of "find the bad stuff", I can check for those indicators quickly.  Do you know where I get the vast majority of the IOCs I use for my current analysis?  From all of my prior analysis.  Like I said earlier in this post, I take what I've learned from previous analysis and apply it to my current analysis, as appropriate.

Sometimes I get indicators from others.  For example, Jamie/@gleeda from Volatility shared with me (it's also in the book) that when the gsecdump credential theft tool is run to extract LSA secrets, the HKLM/Security/Policy/Secrets key LastWrite time is updated.  So I wrote a RegRipper plugin to extract the information and include it in a timeline (without including all of the LastWrite times from all of the keys in the Security hive, which just adds unnecessary volume to my timeline), and since then, I've used it often enough that I'm comfortable with the fidelity of the data.  This indicator serves as a great pivot point in a timeline.

A couple of things I generally don't do during analysis:
I don't include EVERYTHING into the timeline.  Some times, I don't have everything...I don't have access to the entire image.  Someone may send me a few files ($MFT, Registry hives, Windows Event Logs, etc.) because it's faster to do that than ship the image.  However, when I do have an image, I very often don't want everything, as getting everything can lead to a great deal of information being put into the timeline that simply adds noise.  For example, if I'm interested in remote access to a system, I generally do not include Windows Event Logs that focus on hardware monitoring events in my timeline.

I have a script that will parse the $MFT and display the $STANDARD_INFORMATION and $FILE_TIME metadata in a timeline...but I don't use it very often.  In fact, I can honestly say that after creating it, I haven't once used it during my own analysis.  If I'm concerned with time stomping, it's most often only for a handful of files, and I don't see that as a reason for doubling the size of my timeline and making it harder to analyze.  Instead, I will run a script that will display various metadata from each record, and then search the output for just the files that I'm interested in.

I don't color code my timeline.  I have been specifically asked about this...for me, with the analysis process I use, color coding doesn't add any value.  That doesn't mean that if it works for you, you shouldn't do it...not at all.  All I'm saying is that it doesn't add any significant value for me, nor does it facilitate my analysis.  What I do instead is start off with my text-based timeline (see ch. 7 of Windows Forensic Analysis) and I'll create an additional file for that system called "notes"; I'll copy-and-paste relevant extracts from the full timeline into the notes file, annotating various things along the way, such as adding links to relevant web sites, making notes of specific findings, etc.  All of this makes it much easier for me to write my final report, share findings with other team members, and consolidate my findings.