Last updated at Tue, 09 May 2023 14:07:06 GMT
In this blog, you will learn a practical methodology to extract configuration data from recent Qakbot samples. I will provide some background on Qakbot, then walk through decode themes in an easy to visualize manner. Additionally, I'll share a Velociraptor artifact to detect and automate the decode process at scale.
QakBot or QBot, is a modular malware first observed in 2007 that has been historically known as a banking Trojan. Qakbot is used to steal credentials, financial, or other endpoint data, and in recent years, regularly a loader for other malware leading to hands-on-keyboard ransomware.
Qakbot has some notable defense evasion capabilities including:
- Checking for Windows Defender sandbox and terminating on discovery.
- Checking for the presence of running anti-virus or analysis tools, then modifying its later stage behavior for evasion.
- Dynamic corruption of payload on startup and rewrite on system shutdown.
Due to the commodity nature of Qakbot delivery, capabilities, and end game, it is worth extracting configuration from observed samples to scope impact from a given campaign. Hunting enterprise-wide and finding a previously missed machine or discovering an ineffective control can prevent a domain-wide ransomware event or similar cyber attacks.
Qakbot has an RC4 encoded configuration, located inside two resources of the unpacked payload binary. The decryption process has not changed significantly in recent times, but for some minor key changes. It uses a SHA1 of a hard coded key that can typically be extracted as an encoded string in the .data section of the payload binary. This key often remains static across campaigns, which can speed up analysis if we maintain a recent key list.
Current samples undergo two rounds of RC4 decryption with validation built in. The validation bytes dropped from the data for the second round.
After the first round:
- The first 20 bytes in hex is for validation and is compared with the SHA1 of the remaining decoded data
- Bytes [20:40] is the key used for the second round of decoding.
- The Data to decode is byte [40:] onwards
- The same validation process occurs for the second round decoded data
- Verification = data[:20]
- DecodedData = data[20:]
Campaign information is located inside the smaller resource where, after this decoding and verification process, data is clear text.
The larger resource stores Command and Control configuration. This is typically stored in netaddress format with varying separators. A common technique for finding the correct method is searching for common ports and separator patterns in the decoded data.
Qakbot stores blobs of xor encoded strings inside the .data section of its payload binary. The current methodology is to extract blobs of key and data from the referenced key offset which similarly is reused across samples.
Current samples start at offset 0x50, with an xor key, followed by a separator of 0x0000 before encoded data. In recent samples, we have observed more than one string blob and these have occurred in the same format after the separator.
Next steps are splitting on separators, decode expected blob pairs and drop any non printable. Results are fairly obvious when decoding is successful as Qakbot produces clean strings. I typically have seen two well defined groups with strings aligning to Qakbot capabilities.
Qakbot samples are typically packed and need execution or manual unpacking to retrieve the payload for analysis. It's very difficult to obtain this payload remotely at scale, in practice the easiest way is to execute the sample in a VM or sandbox that enables extracting the payload with correct PE offsets.
When executing locally Qakbot typically injects its payload into a Windows process, and can be detected with yara targeting the process for an unbacked section with `PAGE_EXECUTE_READWRITE` protections.
Below, we have an example of running PE-Sieve / Hollows Hunter tool from Hasherezade. This helpful tool enables detection of several types of process injection, and the dumping of injected sections with appropriately aligned headers. In this case, the injected process is `wermgr.exe` but it's worth to note, depending on variant and process footprint, your injected process may vary.
Automation at scale
Now I have explained the decode process, time to enable both detection and decode automation in Velociraptor.
I have recently released Windows.Carving.Qakbot which leverages a PE dump capability in Velociraptor 0.6.8 to enable live memory analysis. The goal of the artifact was to automate my decoding workflow for a generic Qakbot parser and save time for a common analysis. I also wanted an easy to update parser to add additional keys or decode nuances when changes are discovered.
This artifact uses Yara to detect an injected Qakbot payload, then attempts to parse the payload configuration and strings. Some of the features in the artifact cover changes observed in the past in the decryption process to allow a simplified extraction workflow:
- Automatic PE extraction and offset alignment for memory detections.
- StringOffset: the offset of the string xor key and encoded strings is reused regularly.
- PE resource type: the RC4 encoded configuration is typically inside 2 resources, I’ve observed BITMAP and RCDATA
- Unescaped key string: this field is typically reused over samples.
- Type of encoding: single or double, double being the more recent.
- Hidden TargetBytes parameter to enable piping payload in for analysis.
- Worker threads: for bulk analysis / research use cases.
The Qakbot parser can also be leveraged for research and run bulk analysis. One caveat is the content requires payload files that have been dumped with offsets intact. This typically requires some post collection filtering or PE offset realignment but enables Velociraptor notebook to manipulate post processed data.
Some techniques I have used to bulk collect samples:
- Sandbox with PE dumping features: api based collection
- Virustotal search: crowdsourced_yara_rule:0083a00b09|win_qakbot_auto AND tag:pedll AND NOT tag:corrupt (not this will collect some broken payloads)
Some findings from a small data set ~60 samples:
- Named campaigns are typically short and not longer than a few samples over a few days.
- IP addresses are regularly reused and shared across campaigns
- Most prevalent campaigns are “BB” and “obama” prefixed
- Minor campaigns observed: “azd”, “tok” and “rds” with only one or two observed payload samples each.
Strings analysis can also provide insights to sample behavior over time to assist analysis. A great example is the adding to process name list for anti-analysis checks.
PE dumping, which is not available in expensive paid tools, is a useful capability and enables advanced capability at enterprise scale. For widespread threats like Qakbot, this kind of content can significantly improve response for blue teams, or even provide insights into threats when analyzed in bulk. In the coming months, we will be publishing a series of similar blog posts, offering a sneak peek at some of the types of memory analysis enabled by Velociraptor and incorporated into our training courses.
I also would like to thank Jakob Denlinger and James Dunne for their assistance in writing this post.
- Malpedia, QakBot
- Elastic, QBOT Malware Analysis
- @hasherezade. Hollows Hunter, https://github.com/hasherezade/hollows_hunter