In my recent webcast for SANS Levelup (http://sans.org/u/ZAX) I suggest that all of you out there who want to become security professionals or want to improve their skills join forces to build DFIR labs (Hackmes) together.
The idea was, that you find like-minded people using the hashtag #leveluplabs on twitter. In the last weeks I had many people ask me how I’d set up something like that. In this and the next few episodes I’ll show how I set up my training lab.
The upcoming tutorials not only show how to set up a small fake company network but also how to set up a basic attacker infrastructure. The fake company network will consist of a domain controller and a few clients joined to the domain and maybe later a web server to increase the attack surface a bit. On the attacker site, I’ll use something like Metasploit, powershell empire or covenant. If you are lucky, you can team up with a penetration tester if you are a DFIR person or the other way round.
There will be at least 7 video tutorials to address the following topics. Today I’ll release the introduction part. Next week I’ll continue with building the fake company infrastructure.
Many security products monitor process trees very carefully to detect when for instance office applications spawn Powershell, cmd or other suspicious subprocesses. But is that enough?
Still many organisations are unable to deactivate macros in office documents as they are still widely used. Hence they introduce compensating controls to detect malicious macros as soon as they execute. Many security products mainly focus on process trees to detect when macros execute Powershell code or behave strangely in other ways. At first, that looks like a good idea and it actually is. But is it enough. So let’s see what it takes to get around pure process tree monitoring and how to still detect what’s going on.
I’m really bad in writing Powershell code so please bear with me. So my target was to write a simple Powershell backdoor that calls back to an even simpler php based C2 server. The reason I choose php for the C2 is, that you could run the C2 on any compromised shared webspace and don’t need to set up a full server.
So essentially every victim gets a unique id. As soon as the malware is running, it asks the C2 server for new commands every 4 seconds. If the command is anything other than “wait” it executes the command in cmd.exe. The results will then be transferred to the C2 server. Note that I used GET as method and transferred the results in a get parameter. I did to test a network based detection mechanism I’m working on. In a real attack I’d use POST requests as usually the full URI including GET parameters ends up in proxy logs while POST parameters in the body don’t.
Ok that part was easy. Now I needed a crappy backend to control the malware. It’s not beautiful but It get’s the job done.
So at that point it’s all about getting that into a production system that’s set up with all the usual security measures and more. Under the assumption, that you will always be able to get an employee to execute a macro as long as it’s not blocked, I created a fake word document. It looks something like that – you know the drill, right?
There are two things I want to prevent when my code fires:
I don’t want Word to spawn a sub process
I don’t want to load System.Management.Automation.dll with any process other than powershell.exe (Some monitor that as well)
Honouring the points above, I can load a WScript.Shell object but I can’t use it’s .Run or .Exec methods as that would spawn a cmd.exe process as child of the Word process. So what I did instead was creating a batch script loading my Powershell backdoor. I just placed that script in the startup folder. That will not give me a shell right away, but as soon as the user logs off and on again I’m in. So this is the macro I ended up using.
Today I had the pleasure of dissecting Shadow Hammer for together with our top malware analyst at InfoGuard(@InfoGuardAG) Stefan Rothenbuehler (@creative83).
ShadowHammer is a piece of malware that was distributed in a supply chain attack mimicking ASUS security updates. Once the malicious update explodes on the target system it loads various libraries it uses to determine the mac addresses of the machine. Only if the MAC address matches a set of predefined addresses it will actually load the next stage and infect the machine.
While some of the functionality is already well documented we feel like it would make sense to show how you can get the hashes out of the malware. Stefan and myself also want to make sure to do this writeup in an educational fashion. That might lead to some “overdocumenting” at some points. So skilled malware analysts out there, please bear with us.
Finding the malicious part
This sample as a little bit more complex as it contains benign code as well as the malware code.
When starting the analysis using static methods like Ghidra, the functionality of ShadowHammer is not immediately observable as it only executes once the benign part of the software update finished executing. The malware author placed the entry point of the malware just before the Process exit into the __crtExitProcess function. We tagged the function accordingly in win32dbg.
So how do we get there. Using the Symbols tab in win32dbg, we see that the file uses the VirtualAlloc and VirtualProtect function from kernel32.dll. Whenever we find those functions used by a binary we want to look at them more closely. Stefan pointed out a nice way to set a useful breakpoint at the end of the function. This way the eax register shows the target address of the memory allocation. When found jump into the code for Virtual Alloc and set a breakpoint right before the function returns. There are more of those but number 2 is the one you want to look at (just to save you some time).
If you start single stepping through the code from there entering all functions that are not API functions you will eventually end up at a place where you see two calls to GetAdaptersAddresses. The first of the two compares the eax register to 0x6f which is 111 in decimal. This return code would indicate a bufffer overflow. We believe that this fires, if the number of interfaces the machine maps is too high to fit the provided buffer.
The second call is followed by test eax, eax which is true when eaxis 0x0. This is the desired return coda as it indicates that we now got our interface structures successfully loaded into the buffer.
After that we are in a loop that creates the md5 sum of every interfaces mac address and stores it to the memory using memcpy. In my case when I dump the source for the memcpy operation the first 16 bytes should the md5 sum of one of my adapters.
The bytes I get are:
93 16 DF 71 02 90 78 CB 6E 33 9C BE 12 0B 1F 49
The mac address for my only adapter is:
Checking that using Cyberchef proves that we got the right data there.
So now that we see that the malware stores the md5 of our mac addresses, the question is, what does it compare it to?
To figure that we need to keep on stepping. I put a breakpoint shortly before my current function returns. and single-step from there. Eventually you will find a set of memcmp instructions in several loops. they compare the memory space filled with your adapter’s macs to the macs the attacker is looking for. It will compare 0x10 or 16 dec bytes in two memory locations. That’s the size of an md5 sum.
So argument one and two are eax and esi. They are on top of the stack so we can easily load them into the data dump window. esi contains our values and eax the value hardcoded in the malware.
In this post I’ll describe an approach on how to leverage Excel to dump dynamically created Shellcode from a Macro.
I’m always looking for new challenges for our team that they can solve in slow times. During my research I stumbled upon a nice sample in @0xffff0800 malware archive (Find the current link to the archive at 0day.coffee0). The sample itself was not that complex, getting the potential shellcode out required a technique I never used before. So let’s cut to the chase.
The sample is a Word document with a Macro. According to 0xffff0800 directory structure it’s out of Lazarus group’s tool chest (Wikipedia). The Thor APT scanner by BFK Consulting supports that assumption as it flags the Document with the yara rule “APT_MalDoc_SharpShooter_Lazarus_Campaign_Dec18_1“
That shows me a Macro that is slightly obfuscated. The first five declarations look interesting though.
Attribute VB_Name = "NewMacros"
Private Declare PtrSafe Function SharpShooter Lib "msvcrt" Alias "_beginthread" (ByVal StartAddress As LongPtr, StackSize As Long, ByVal ArgList As LongPtr) As Long
Private Declare PtrSafe Function efasdv Lib "kernel32" Alias "VirtualAlloc" (ByVal address As Long, ByVal size As Long, ByVal aloctype As Long, ByVal fprot As Long) As LongPtr
Private Declare PtrSafe Function gzsdfasd Lib "kernel32" Alias "RtlMoveMemory" (ByVal dest As LongPtr, ByRef src As Any, ByVal dlen As Long) As LongPtr
Private Declare PtrSafe Function ennfiaje Lib "kernel32" Alias "LoadLibraryA" (ByVal libname As String) As LongPtr
Private Declare PtrSafe Function dnnaigej Lib "kernel32" Alias "GetProcAddress" (ByVal module As LongPtr, ByVal pname As String) As LongPtr
The Macro seems to define some strange variable names for well known functions leveraged by malware, VirtualAllocA only being one of them. We also see that there is a 2 dimensional array called llsodiplo.
To make reading easier I went through the code and gave the variables and functions more meaningful names. The result below shows a clearer picture of what’s going on. You can also download the full deobfuscated code here and the original macro here.
As this blog post is not about detailing how that macro works exactly I’ll just point out some key points. So the Macro stores bytes in a 2d byte array which I called shellcode for now. It then allocates 3224 bytes using VirtualAlloc. The allocated sections carry the 0x40 protection flag which according to Microsoft’s documentation refers to PAGE_EXECUTE_READWRITE. So whatever the macro puts there will be executable. It then flattens out the 2d array into a 1d byte array using two nested loops. I called that variable binbuffer. Now comes the tricky part and the reason why I post about this sample. The macro in-memory replaces two sections of the flat bytearray with the memory addresses of LoadLibraryA and GetProcAddr. I assume the resulting in memory-code will use these library calls and needs the addresses of these functions. This gives the code the ability to address the library more easily even if ASLR is activated (which it usually is for Office Products). Unfortunately it makes our job more difficult as well. We can’t just dump the byte array and treat it as runnable shellcode as we will be missing the actual addresses of the mentioned functions. That requires us to use another frequently used technique to extract stuff – alter the code until it spits out whatever we need.
VBA ≠ VBS
There is one very important thing you need to know about Macros. VBA is not VBS. While you can run VBS code using cscript.exe, VBA code will not run. For this particular code it fails. VBA is exclusive for Microsoft applications and a few third party vendors who licensed VBA for their products, one of them being AutoCAD. For us, that means that our best bet is to use Word to execute the Macro. What I’m interested in is the exact byte array it loads int o memory in the last for loop.
For eIndex1 = 0 To size_count - 1
eValue = binbuffer(eIndex1)
Result = RtlMoveMemory(vAddress + eIndex1, eValue, 1)
So let’s fire up Excel and don’t allow the Macros to run. We first need to make running that save. Obviously everything from now on happens on a save Lab VM.
So before I allow macros to run I edit the code a little bit. Generally commenting out one line will be enough. Line 75 would execute the in-memory code. Sharpshooter was declared to be msvcrt._beginthread().
Private Declare PtrSafe Function SharpShooter Lib "msvcrt" Alias "_beginthread" (ByVal StartAddress As LongPtr, StackSize As Long, ByVal ArgList As LongPtr) As Long
LMCooperator = SharpShooter(vAddress, 0, 0)
So I just comment that line out. In addition to that I want to dump the resulting byte array to a file. The easiest way to do that for me was to use the StrConv Function and the Open command to open a file. The resulting code section looks something like this.
For eIndex1 = 0 To yefawfq - 1
eValue = grqwasf(eIndex1)
Result = gzsdfasd(vAddress + eIndex1, eValue, 1)
Open "C:\Users\abc\Desktop\shellcode.txt" For Output As #1
hexstr = StrConv(grqwasf, vbUnicode)
Print #1, hexstr
Dim LMCooperator As Long
'LMCooperator = SharpShooter(vAddress, 0, 0)
This is a fast and easy way to get the final version of the binary code out leveraging Word.
Another story on how you might discover new artifacts to help your investigation – MFT Carving.
It’ been some time since I wrote my last blog post. Like every year, the last quarter is very busy. Still I got something new I want to share. This week I have been teaching SANS FOR 508 with Francesco Picasso @dfirfpi in Paris. When Francesco talked about $MFT entries, I was curious where on a drive single $MFT entries or groups of MFT entries might end up other than the currently active $MFT. I briefly googled if there are solutions that support carving single, potentially corrupted $MFT entries and couldn’t find any. There are many solutions which parse a complete and active $MFT and solutions which carve files, but that’s not was I was looking for. So back in my room I started to do some research.
As the $MFT is literally filled up with timestamps I figured it might come handy to have a MFT-Carver-Parser that also handles half corrupted MFT entries. If you just want to get the tool I wrote to do that and not read the whole blogpost, feel free to download it at https://github.com/cyb3rfox/MFTEntryCarver/
As pointed out, I did not expect to find many entries in unallocated space, but I gave it a try anyhow. First of all, I dumped the unallocated space of a test Windows 7 image using sleuthkit’s blkls.
blkls image.ewf > unallocated.blocks
This produced a file around 7Gb big. To see if it would even make sense to start writing something, I just did a strings search on the file.
strings -a unallocated.blocks | grep FILE0 | wc -l
Note that basically FILE (\x46\x49\x4C\x45) is the header for an entry, but not using it will give you many strings that are part of textfiles and script. Still most not to say all entries start like FILE0 (\x46\x49\x4C\x45\x30). \0x30 declares the offset to the fixup and that is usually the same for many of the files. And as it represents 0 in ASCII it’s easy to grep. Anyway, the above command leads to the following output.
So it looks like we have 5026 potential $MFT entries. Next, I wanted to understand, how those artifacts are distributed across the dump file. Hence, I searched for the magic bytes again, only this time, I also got the offset of the hits.
strings -a -t d unallocated.blocks | grep FILE0
I then grouped the hit offsets into a manageable number of buckets and plotted a histogram.
The histogram shows, that not all of those entries are clustered in one place but in at least 3 separate locations (there are some less populated sections that don’t show on the graph).
That implies, that the entries might be coming from different sources. Interesting enough for me to start writing a little tool. The $MFT is very well documented. I used two resources to understand what I needed to do.
I guess this is the time to point out, that I’m obviously not a full time coder and python is quite new to me. If you don’t believe me, look at the code and you see what I mean. So please excuse my spaghetti code. In the end, it works and is even reasonably fast.
I wanted the tool to be able to do the following things:
Find potential file entries
parse as many of their $FN attributes (long and short) as possible
parse $STDInfo and $FN Timestamps
For resident $data attributes, recover the data as well
still work if half of the information is corrupted
output all in csv format
So, first of all, I needed to find all potential $MFT entries. As the input files for this kind of tool can get quite big, using methods that need to put the whole file into memory might fail. In the end, I decided to use the python mmap library. It’s fast and you can open really big files and move a pointer over the data. Also searching for hex patterns is supported.
The pattern search for FILE (\x46\x49\x4C\x45) returned 54583 potential entries. So how do I decide which are legitimate ones and which are false positives? My approach was to try and parse further attributes and sanity check the results as good as possible. So, for example, I use size checks a lot. All attributes in $MFT entries store their size. I parse that and if the parsed value is smaller than the minimum size of the attribute or bigger than the size of the whole entry, the bytes I parsed are probably not part of a real $MFT entry. I don’t want to go into that too deeply, please look at the code if you want to understand my approach. Suggestions and contributions are highly welcome.
So after a couple of hours of work, I get to the following results.
python MFTEntryCarver.py -s unallocated.blocks
So essentially. MFTEntryCarver.py will give you all the artifacts mentioned above if it can find them. It only keeps on parsing when it finds at least one valid $FN attribute. If certain artifacts are not there, it will put in “corrupted” in the respective field. Below are the statistics the tools shows after processing the test dataset.
The good news here, in the end, there were not only 5026 entries to parse but a total of 54583. Out of those MFTEntryCarver.py recovered 14975 entries. So let’s take a look at one of the entries.
So, in this case, we found a .URL file called “Get Windows Live.url” and the data seems to resident. Some timestamps couldn’t be parsed, MFTEntryCarver.py still gets you as much as it can. I usually transform and analyze raw hex data using CyberChef. Throwing the hex contents of the resident data attribute at it produces readable results (at least for ascii characters).
I’ll try the tool some more in investigations, but it looks promising. offers a viable way to get older $MFT entries, including timestamps. Artifacts like these usually help an investigation when the attacker cleaned up and/or is long gone. If you use it, please give some feedback. You can easily reach out to me on twitter @mathias_fuchs. You can get the code at https://github.com/cyb3rfox/MFTEntryCarver