In doing some dynamic analysis on the latest Emotet samples, I was running into an issue getting the persistent copy of the malware to execute. In its normal flow of execution, Emotet will drop itself into C:\Users\<username>\AppData\Local and make a call to CreateProcessW to continue execution. I was unable to break on CreateProcessW. Instead, the code would terminate.
It turns out Emotet was making a call to SHFileOperationW to do the file drop. SHFileOperationW takes a pointer to a structure named _SHFILEOPSTRUCTW, the wide-character counterpart of _SHFILEOPSTRUCTA.
As you can see, the wFunc member is set to 0x1, or FO_MOVE. The result was C0000043 (STATUS_SHARING_VIOLATION). The move failed because my debugger held a handle on the file. I needed to change the value to 0x2, or FO_COPY, so the file would be copied instead. With this change, the code continued execution as expected.
If you take a look at an unpacked Emotet sample, something will certainly stick out to you. The sample will have few to no imports. Yet the malware is clearly making heavy use of the Windows API to achieve its objectives. So how is it doing this? It is making use of a well-known shellcoding technique to resolve APIs dynamically at runtime without the use of strings.
Emotet locates loaded modules through the Process Environment Block. The malware searches the loader's module list for the desired module name and then reads that module's image base. From the image base, it walks the export table to collect function pointers for its own import table. In 32-bit Windows, the FS segment register points to the Thread Environment Block, or TEB. Offset 0x30 of the TEB contains a pointer to the Process Environment Block, or PEB. From there the malware follows the PEB to _PEB_LDR_DATA, which contains the head of a doubly-linked list called InMemoryOrderModuleList. Each entry in this list carries BaseDllName and DllBase. Once DllBase, the address where the module's PE header begins, is located, the PE structure is walked to find the export table. Emotet then loops over the exported names to find the API function it is searching for.
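The walk described above can be sketched in Python against a toy memory image. This is only an illustration, not Emotet's code: the offsets are the commonly documented 32-bit layouts (they differ on 64-bit), and build_fake_memory, the module base addresses, and the buffer layout are all hypothetical values invented for the example.

```python
import struct

# Commonly documented 32-bit offsets (assumed here; 64-bit layouts differ).
TEB_PEB = 0x30          # TEB.ProcessEnvironmentBlock
PEB_LDR = 0x0C          # PEB.Ldr
LDR_IN_MEMORY = 0x14    # PEB_LDR_DATA.InMemoryOrderModuleList
# The offsets below are relative to LDR_DATA_TABLE_ENTRY.InMemoryOrderLinks,
# because that is the field the list pointers land on.
LINK_DLL_BASE = 0x10    # DllBase
LINK_NAME_LEN = 0x24    # BaseDllName.Length (in bytes)
LINK_NAME_BUF = 0x28    # BaseDllName.Buffer


def r16(mem, addr):
    return struct.unpack_from("<H", mem, addr)[0]


def r32(mem, addr):
    return struct.unpack_from("<I", mem, addr)[0]


def walk_modules(mem, teb):
    """Return [(BaseDllName, DllBase), ...] in list order."""
    peb = r32(mem, teb + TEB_PEB)
    ldr = r32(mem, peb + PEB_LDR)
    head = ldr + LDR_IN_MEMORY
    modules = []
    link = r32(mem, head)                 # head.Flink
    while link != head:                   # the list is circular
        base = r32(mem, link + LINK_DLL_BASE)
        length = r16(mem, link + LINK_NAME_LEN)
        buf = r32(mem, link + LINK_NAME_BUF)
        name = bytes(mem[buf:buf + length]).decode("utf-16-le")
        modules.append((name, base))
        link = r32(mem, link)             # follow Flink to the next entry
    return modules


def build_fake_memory():
    """Hypothetical toy image with two modules; addresses are buffer offsets."""
    mem = bytearray(0x300)
    teb, peb, ldr = 0x000, 0x040, 0x080
    struct.pack_into("<I", mem, teb + TEB_PEB, peb)
    struct.pack_into("<I", mem, peb + PEB_LDR, ldr)
    head = ldr + LDR_IN_MEMORY
    entries = [(0x100, 0x200, "ntdll.dll", 0x77000000),
               (0x180, 0x240, "kernel32.dll", 0x76000000)]
    links = [entry + 0x08 for entry, _, _, _ in entries]
    for i, (_, buf, name, base) in enumerate(entries):
        raw = name.encode("utf-16-le")
        mem[buf:buf + len(raw)] = raw
        nxt = links[i + 1] if i + 1 < len(links) else head
        struct.pack_into("<I", mem, links[i], nxt)                   # Flink
        struct.pack_into("<I", mem, links[i] + LINK_DLL_BASE, base)
        struct.pack_into("<H", mem, links[i] + LINK_NAME_LEN, len(raw))
        struct.pack_into("<I", mem, links[i] + LINK_NAME_BUF, buf)
    struct.pack_into("<I", mem, head, links[0])
    return mem, teb
```

Calling walk_modules(*build_fake_memory()) yields the two fake modules in list order. Against a real 32-bit process dump the same chain of reads should apply, with mem replaced by a reader over the dumped address space.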
Let’s take a look at this in action. The sample I am looking at is 0b96754a84bc2c01e4e8d64a534c03b5636fb6e958f7c381f9c27e646466cd32.
sub_403640 starts off by grabbing a reference to the TEB and then grabbing a pointer to the PEB at offset 0x30. From there, a pointer to the _PEB_LDR_DATA structure is saved off along with the module hash that was stored in ecx prior to the function call.
The two function calls sub_403640 (GetModuleAddress) and sub_4037C0 (GetAPIAddress) are paired together throughout the code. Once Emotet has resolved the API in a given module, the parameters are pushed onto the stack before the call eax instruction that invokes it.
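Resolving by hash rather than by string can be illustrated with a short Python sketch. The hash below is a placeholder I made up for the example (rotate left by 7, then add each byte); it is not the algorithm Emotet uses, which would have to be recovered from the sample. The exports dictionary stands in for a module's export name table.

```python
from typing import Dict, Optional


def example_hash(name: str) -> int:
    """Placeholder 32-bit hash, NOT Emotet's real algorithm."""
    h = 0
    for ch in name.encode("ascii"):
        h = ((h << 7) | (h >> 25)) & 0xFFFFFFFF   # rotate left by 7 bits
        h = (h + ch) & 0xFFFFFFFF                 # mix in the next byte
    return h


def resolve_by_hash(exports: Dict[str, int], target: int) -> Optional[int]:
    """Return the address of the first export whose hash equals target."""
    for name, address in exports.items():
        if example_hash(name) == target:
            return address
    return None


# Hypothetical export table: name -> address.
exports = {"VirtualAlloc": 0x2000}
wanted = example_hash("VirtualAlloc")   # the malware would embed only this value
address = resolve_by_hash(exports, wanted)
```

The point of the design is that the binary only needs to carry the 32-bit constant, so neither the module name nor the API name ever appears as a string. For analysis, the same loop can be run over the export names of common DLLs to map each embedded hash back to an API name once the real hash routine is known.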
In this post I take a look at unpacking Emotet to discover hard coded command and control servers. The MD5 for the sample I’m looking at is 8db38c7f70214ee08e166cde8b9163c6.
This sample of Emotet uses a customized packer. Instead of trying to reverse the algorithm to unpack the next stage, we can use dynamic analysis. I’ll let the malware do the unpacking for me and grab the next stage out of memory. The process will need to allocate memory for the next stage, so it’s a good assumption that we will see a call to VirtualAlloc. Open up the sample in x32dbg and set a breakpoint on VirtualAlloc. When the breakpoint is hit, we can note the parameters that were passed to the function.
As you can see, the sample is letting the system determine where to allocate the memory, with read/write permissions. After the call to VirtualAlloc returns, the eax register will contain the base address of the allocated region. I followed that address in the dump window to keep an eye on it. After letting the code run to the next breakpoint, I can see the memory has been populated. Next, I jump to the memory map and dump the memory to a file.
Opening the dump in a hex editor displays the following.
The dump clearly contains a PE file, but there is some extra code at the beginning. I’m unclear as to what this extra code is at this time; I may come back to look at it later. To move things along, I just looked for the MZ magic number and trimmed everything before it, so that the file starts at the PE image.
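The trimming step can also be scripted. The following is a minimal sketch, assuming the dump holds a single PE image after the extra bytes; carve_pe is a name I chose for illustration. It checks that the e_lfanew field at offset 0x3C of a candidate MZ header points to a PE signature, which avoids cutting at a stray "MZ" byte pair.

```python
import struct


def carve_pe(dump: bytes) -> bytes:
    """Return the dump starting at the first MZ header that leads to a PE signature."""
    pos = dump.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(dump):
            # e_lfanew: offset of the PE header, relative to the MZ header.
            e_lfanew = struct.unpack_from("<I", dump, pos + 0x3C)[0]
            sig = pos + e_lfanew
            if dump[sig:sig + 4] == b"PE\x00\x00":
                return dump[pos:]
        pos = dump.find(b"MZ", pos + 1)
    raise ValueError("no PE image found in dump")
```

A typical use would be to read the dumped region from disk, pass the bytes to carve_pe, and write the result to a new file for IDA. The carved file is still in its memory layout, so section alignment may need fixing before some tools will load it cleanly.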
Opening the file in IDA just to take a peek, I noticed that there were zero imports. Jumping to sub_403530, I noticed some signs of an import table being built dynamically. Looking at some of the instructions, I can see that it is grabbing a pointer to the process environment block. From there it appears to be walking the structure to locate function pointers for its own import table. There is certainly more to research here, but I’ll save that for another post.
After unpacking, we can see the list of C2 servers is hard coded in the binary. This information gets written to a buffer in the .data section of the binary. If we are looking to grab the IOCs from the sample, we want to let the malware run until it has written the C2 servers to the buffer and then examine the memory. In the following screenshot we can see the first IP address and port.
The list will contain a series of IP addresses. The highlighted IP address and port correspond to 220.127.116.11:80. The first four bytes are the IP address, followed by two bytes for the port. The next two bytes can be ignored. I extracted the following C2 servers from the sample.
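Decoding the buffer by hand gets tedious, so here is a minimal Python sketch of a parser for the 8-byte record layout described above. Two things are assumptions on my part rather than facts from the sample: the byte order (the sketch defaults to little-endian values and lets you flip it) and the end-of-list marker (it stops at a record whose IP field is zero). The test bytes use a documentation-range address, not data from the sample.

```python
import ipaddress
import struct


def parse_c2_list(buf: bytes, little_endian: bool = True):
    """Parse 8-byte records: 4-byte IP, 2-byte port, 2 ignored bytes."""
    # Byte order is an assumption; check it against a known entry first.
    fmt = "<IHH" if little_endian else ">IHH"
    servers = []
    for offset in range(0, len(buf) - 7, 8):
        ip, port, _unused = struct.unpack_from(fmt, buf, offset)
        if ip == 0:          # assumed terminator: an all-zero IP field
            break
        servers.append((str(ipaddress.IPv4Address(ip)), port))
    return servers


# Hypothetical buffer: 192.0.2.1:80 stored little-endian, then a zero record.
sample = bytes.fromhex("010200c050000000") + bytes(8)
servers = parse_c2_list(sample)
```

If the first decoded entry does not match the address seen in the debugger, rerun with little_endian=False before trusting the rest of the list.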