Thursday, July 26, 2012

Introducing GHAST & Finding PnkBstrK.sys when "lm" fails

So I'm pretty sick of doing stuff manually in WinDBG. Now that I have a decent understanding of how to use pykd (well, mostly anyways...) I'm going to start writing and releasing various scripts that help me automate some of my analysis. I figured I'd make a github git repo for all this code, which I've dubbed GHAST; Game Hacking Adventures Scripts & Tools.

One problem I've been having is that PnkBstrK.sys doesn't show up in the 'lm' output. Not exactly sure why this is but at first I suspected it was removing itself from the PsLoadedModuleList doubly linked list. This is a common rootkit behaviour and I pretty much consider PnkBstrK.sys to be a rootkit at this point. To confirm whether this was true, I wrote a pykd script to walk the PsLoadedModuleList and print out the name, entry point and base address of all modules. Turns out PnkBstrK.sys hasn't removed itself, but for some reason WinDBG isn't listing it.
PnkBstrK.sys in the PsLoadedModuleList, but not 'lm' output.
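For anyone who hasn't walked one of these lists before, here is a minimal sketch of the idea in plain Python. It does not use pykd or touch a debugger: the "memory" is a made-up dictionary, and every address, module name and field name in it is hypothetical. The only thing it is meant to show is the traversal itself: start at the list head, follow Flink pointers, and stop when you arrive back at the head.

```python
# Simulated kernel memory. Every address and value here is invented for
# illustration; a real script would read these fields through the debugger.
LIST_HEAD = 0x1000

FAKE_MEMORY = {
    0x1000: {"Flink": 0x2000, "Blink": 0x4000},  # list head, carries no module data
    0x2000: {"Flink": 0x3000, "Blink": 0x1000,
             "name": "ntoskrnl.exe", "base": 0x804D7000, "entry": 0x806A1000},
    0x3000: {"Flink": 0x4000, "Blink": 0x2000,
             "name": "hal.dll", "base": 0x806E4000, "entry": 0x806F0000},
    0x4000: {"Flink": 0x1000, "Blink": 0x3000,
             "name": "PnkBstrK.sys", "base": 0xEE010000, "entry": 0xEE012000},
}


def walk_module_list(memory, head):
    """Follow Flink pointers from the head until the list wraps around."""
    modules = []
    seen = set()  # guards against a corrupted list that never returns to head
    current = memory[head]["Flink"]
    while current != head and current not in seen:
        seen.add(current)
        entry = memory[current]
        modules.append((entry["name"], entry["base"], entry["entry"]))
        current = entry["Flink"]
    return modules


for name, base, entry in walk_module_list(FAKE_MEMORY, LIST_HEAD):
    print("%-16s base: 0x%08x entry: 0x%08x" % (name, base, entry))
```

A module that had unlinked itself would simply never be reached by this loop, which is why finding PnkBstrK.sys in the walk rules that theory out.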
Now that I have at least the base address and driver entry point, I can start to automate setting breakpoints and dumping out argument values. My goal is to be able to trace and record all of my 'interesting' functions that I've RE'd from my static analysis. I'm halfway done but the above issue is affecting pykd as well so I needed an alternative way to find the base/entry addresses. The above code can be found here. Hopefully now I can get my other script to work. Anyways, keep your eye on my github repo as I'll update whenever I finish any scripts.

Tuesday, July 17, 2012

Constant Love

If you've been following my journey thus far, you may remember a while back when I first started I identified third party libraries in use in the target I was RE'ing due to the values of constants. Well today is my lucky day it appears. Well, technically yesterday, but whatever. While poking through some more functions in IDA I noticed a very long string of instructions that appeared to be doing some sort of hashing/crypto. I deduced this due to the fact that there were a lot of shr, shl, or, and, and not instructions in a pretty specific pattern.
Hmm, this looks crypto'y
I took a few of the above constants; 28955B88h 173848AAh 242070DBh 3E423112h and threw them in to Google to see what I could find. 
We have a winner! :>
After seeing the reference to MD5 I quickly remembered my GUID related post where I found two values that appeared to be the stringified versions of two md5 hashes. I highly suspect this function is used to generate that GUID. I'm still doing static analysis at this point; my next post will probably clarify what is going on by setting bp's in the debugger while it runs. So we have md5, awesome. No one codes their own MD5, so I bet myself that I could find the source they used... For that I headed over to koders code search. I selected C as the language and converted one of the hex values (242070DBh) to its decimal form, which is 606105819d.
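The constant lookup can also be done offline. The sketch below rebuilds MD5's table of 64 additive constants from the formula in RFC 1321 and checks each value against it. The function name is made up for this example. The "negated" check is an assumption on my part about why some of the values don't appear in md5.c as written: a compiler is free to emit a subtraction of the two's complement instead of an addition.

```python
import math

# MD5's additive constants: T[i] = floor(abs(sin(i + 1)) * 2**32), per RFC 1321.
MD5_T = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]


def match_md5_constant(value):
    """Return (table index, form) if value is an MD5 constant, else None.

    'negated' means the 32-bit two's complement of the value is in the table.
    """
    value &= 0xFFFFFFFF
    if value in MD5_T:
        return MD5_T.index(value), "direct"
    negated = (-value) & 0xFFFFFFFF
    if negated in MD5_T:
        return MD5_T.index(negated), "negated"
    return None


for constant in (0x28955B88, 0x173848AA, 0x242070DB, 0x3E423112):
    print("%08Xh = %dd -> %s" % (constant, constant,
                                 match_md5_constant(constant)))
```

All four constants from the disassembly land on the first four entries of the table, which is about as strong a fingerprint as you can ask for.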

md5.c from RSA. Can't get any more obvious than that. By finding the direct source, I was able to re-label four functions that are the assembly versions of the C code. Here are snippets of the asm alongside the C source:
MD5Init
MD5Update
MD5Transform
MD5Final
So we now have four of the md5 functions accounted for and labeled. But continuing my search, I also noticed another function with some pretty unique looking values. Using the same technique above, I took the 9D2C5680h and 0EFC60000h values and koders searched them. To my surprise it turned out to be the Mersenne Twister algorithm, also known as rand() in some languages :>. So now I have two more functions re-labeled to sgenrand(seed) and genrand().
rand()? Why thank you, don't mind if I do!
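To show where those two values live in the algorithm, here is a small pure-Python sketch of MT19937. The two constants from the disassembly are the tempering masks. One loud assumption: the seeding routine below is the 2002 reference initialisation (multiplier 1812433253), which may well differ from the older sgenrand(seed) the driver seems to use, so don't expect identical output from the binary. The class and method names are mine.

```python
N, M = 624, 397
MATRIX_A = 0x9908B0DF
UPPER_MASK, LOWER_MASK = 0x80000000, 0x7FFFFFFF
TEMPER_B = 0x9D2C5680  # first constant spotted in the disassembly
TEMPER_C = 0xEFC60000  # second constant spotted in the disassembly


def temper(y):
    """Output tempering step; this is where both masks are applied."""
    y ^= y >> 11
    y ^= (y << 7) & TEMPER_B
    y ^= (y << 15) & TEMPER_C
    y ^= y >> 18
    return y & 0xFFFFFFFF


class MT19937(object):
    def __init__(self, seed):
        # Assumption: 2002 reference seeding, not the 1997 sgenrand().
        self.mt = [seed & 0xFFFFFFFF]
        for i in range(1, N):
            prev = self.mt[i - 1]
            self.mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF)
        self.index = N

    def _twist(self):
        # Regenerate the whole state array in place.
        for k in range(N):
            y = (self.mt[k] & UPPER_MASK) | (self.mt[(k + 1) % N] & LOWER_MASK)
            self.mt[k] = (self.mt[(k + M) % N] ^ (y >> 1)
                          ^ (MATRIX_A if y & 1 else 0))
        self.index = 0

    def genrand(self):
        if self.index >= N:
            self._twist()
        y = self.mt[self.index]
        self.index += 1
        return temper(y)


rng = MT19937(5489)
print(rng.genrand())
```

Seeing a shift-by-7 and a shift-by-15 each followed by an AND with one of these masks is usually enough to label the function without reading any further.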
Overall, a pretty successful find, took longer to write this post than it did to get everything discovered and re-labeled, but hey that's the price you pay for documentation!

Sunday, July 15, 2012

Scripts, Tools and the IDT.

So from my last post I had a few people reach out to me about fixing up dumped modules. Unfortunately, I subscribe heavily to the NIH attitude and ended up writing my own quick python script using the pefile module (note you can pip install pefile as well). All my script really does is set the PointerToRawData of each section to its VirtualAddress value and write out the changes to a new file (prefixed by new_).


import os
import sys
import glob
import argparse
import pefile


def dump_directory(path):
    for filename in glob.glob(os.path.join(path, "*.sys")):
        dump_file(filename)

def dump_file(filename):
    print("Fixing up: %s" % filename)
    try:
        pe = pefile.PE(filename)
        for section in pe.sections:
            print("Updating: %s PointerToRawData 0x%x to"
                  " VirtualAddress: 0x%x" % (section.Name,
                                             section.PointerToRawData,
                                             section.VirtualAddress))
            # Update the section.PointerToRawData to be equal to
            # the VirtualAddress.
            section.PointerToRawData = section.VirtualAddress
        # Write the changes to new_<name> in the current directory.
        # basename() also copes with a filename that has no separator in it.
        new_name = 'new_' + os.path.basename(filename)
        pe.write(filename=new_name)
        print("%s written to disk." % new_name)
    except pefile.PEFormatError as msg:
        print("Error %s file is not a PE file? msg: %s" % (filename, msg))
            
def main():
    parser = argparse.ArgumentParser(
        description='Fixes up the VirtualAddress of drivers dumped from memory.')
    parser.add_argument('--directory',
                        '-d',
                        action='store',
                        help='Directory with *.sys driver files.')
    parser.add_argument('--file',
                        '-f',
                        action='store',
                        help='Single file to fix up.')
    args = parser.parse_args()
    
    if args.directory is not None:
        dump_directory(args.directory)
    elif args.file is not None:
        dump_file(args.file)
    else:
        parser.print_help()
        
if __name__ == '__main__':
    main()

If you are curious about the recommendations I got: @skier_t recommended his tool rreat, and the tool from @iMHLv2 was a pretty interesting looking framework/toolset for memory analysis of malware called volatility. I'll definitely play around with their tools more, but for now i'mma write my own junk :>.

So besides fixing up the image once dumped, I've also been working on looking at the various functions of the driver after it's loaded. I came across two very curious blocks of code. At first, IDA didn't flag them as being functions.
IDA listing just the code as is
But by selecting the start of the function and hitting P, IDA will define it for us.
woo, we have functions! :>
You'll notice in the above code the two comments I added. If you are not familiar with the SIDT and LIDT x86 instructions, they store and load the Interrupt Descriptor Table register. If you want to learn more about the IDT and how it is used for hooking, I suggest two articles, both from phrack: "Handling Interrupt Descriptor Table for fun and profit" by kad for a deep technical dive into the IDT, and the IDT hooking article by mxatone and ivanlef0u for a more 'windowsy' look.

Anyways, it appears that the above disassembly stores the IDT register value in memory, does a modification (*notice the mov eax, dword_EE01033C...) then reloads the modified version back into the IDT register. When doing run-time analysis I didn't see anything at that address except nulls, so I'm not really sure what the point of it is yet. Keep in mind I'm pretty new to this whole IDT business as well. I tried setting a breakpoint on the two functions which modify the IDT and I can't see them being called at any point yet. I think I will need to do more work in this area to get a better understanding of it all.
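As a reminder of what SIDT actually writes out on 32-bit x86, here is a tiny sketch that unpacks the 6-byte value: a 16-bit limit followed by a 32-bit base. The function name and the sample bytes are hypothetical; the sample is only there so the snippet has something to chew on.

```python
import struct


def parse_idtr(raw):
    """Unpack the 6 bytes stored by SIDT on 32-bit x86.

    Returns (base, limit, entry_count). Each IDT entry is 8 bytes and
    the limit is the table size in bytes minus one.
    """
    limit, base = struct.unpack("<HI", raw[:6])
    return base, limit, (limit + 1) // 8


# Made-up sample: limit 0x7FF (256 entries of 8 bytes), base 0x8003F400.
sample = struct.pack("<HI", 0x07FF, 0x8003F400)
base, limit, count = parse_idtr(sample)
print("base: 0x%08x limit: 0x%04x entries: %d" % (base, limit, count))
```

Code that does SIDT, tweaks the stored base or limit, and then does LIDT is effectively pointing the CPU at a different (or differently sized) table, which is why those two little functions caught my eye.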

One thing I did notice however is that I'd like an automated way of inspecting the various interrupt entries. If you love python and you use windbg, you should really take a look at pykd, it's pretty damn awesome. After a few minutes of poking through its samples I found an old (non-working) script which read the IDT entries. I had to rewrite most of the sample script to run in the latest version, but it works now. It's pretty simple in that it just loops through the IDT entries, extracts the dispatch address, dispatch code and the symbol name and displays it. Here's the code:

from pykd import *
import sys

if __name__ == "__main__":
    if isKernelDebugging():
        dprintln( "check interrupt handlers...\n" ) 
        idtr = reg( "idtr" )
        nt = loadModule( "nt" )
        ErrorCount = 0
        dprintln("idtr is: %08x"%idtr)
        for i in xrange(0, 256):  # the IDT holds 256 entries (0x00-0xff)
            idtEntry = nt.typedVar("_KIDTENTRY", idtr+i*8)
            if idtEntry.Selector == 8:
                offset = ( idtEntry.ExtendedOffset * 0x10000 ) + idtEntry.Offset
                InterruptHandler = offset
                kinterrupt = nt.typedVar("_KINTERRUPT",InterruptHandler)
                if InterruptHandler != 0x00:
                    try:
                        dprintln("IDT [%02x] InterruptHandler: 0x%08x "\
                                 "DispatchAddress: 0x%08x "\
                                 "KINTERRUPT.DispatchCode 0x%08x"\
                                 " (symbol: %s)"%(i,InterruptHandler,
                                                  kinterrupt.DispatchAddress,
                                                  kinterrupt.DispatchCode,
                                                  findSymbol(InterruptHandler)))
                    except Exception, msg:
                        dprintln("IDT [%02x] empty"%i)
    else:
        dprintln( "we are not debugging the kernel..." )

And here's some output from it being run from WinDBG:
idt_dump.py pykd script, dumpin' some interrupt tables baby!
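The offset arithmetic in the script (ExtendedOffset * 0x10000 + Offset) is easier to see on raw bytes. Below is a debugger-free sketch that decodes one 8-byte 32-bit gate descriptor with struct. The function name, dictionary keys and sample values are all made up for illustration; only the byte layout comes from the x86 architecture.

```python
import struct


def decode_idt_gate(raw):
    """Decode one 8-byte IDT gate descriptor (32-bit protected mode)."""
    offset_low, selector, _reserved, type_attr, offset_high = \
        struct.unpack("<HHBBH", raw[:8])
    return {
        # Same arithmetic as ExtendedOffset * 0x10000 + Offset.
        "handler": (offset_high << 16) | offset_low,
        "selector": selector,
        "present": bool(type_attr & 0x80),
        "dpl": (type_attr >> 5) & 0x3,
        "gate_type": type_attr & 0x0F,  # 0xE = 32-bit interrupt gate
    }


# Made-up entry: handler 0x80531234, kernel code selector 8, present, DPL 0.
sample_gate = struct.pack("<HHBBH", 0x1234, 0x0008, 0, 0x8E, 0x8053)
gate = decode_idt_gate(sample_gate)
print("handler: 0x%08x selector: 0x%x present: %s type: 0x%x" % (
    gate["handler"], gate["selector"], gate["present"], gate["gate_type"]))
```

The Selector == 8 test in the pykd script corresponds to the "selector" field here, so only gates that point into the kernel code segment get printed.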
I think I'm going to become very well acquainted with pykd, because well, doing this kind of stuff manually kinda sucks.

Tuesday, July 10, 2012

Dumping PnkBstrK.sys Part 2: Fixing it up!

You may remember from my last post that I was able to dump PnkBstrK.sys from memory but it "looked weird". As in the addresses, even after I rebased the image in IDA to make them look right, were showing up incorrectly. After a bit of work I've figured out not only why but also how to get a module/driver/dll that was dumped from memory to "look right" in IDA.

You may remember from the PECOFF specification that the sections of a PE file have some meta-data associated with them, in particular the IMAGE_SECTION_HEADER. This structure holds the section's name, its VirtualSize, the offset of the section's data in the file (PointerToRawData) as well as the address where it will be when the image loads, also known as the VirtualAddress. This is the important part, because what is on disk versus what is loaded into memory is quite different due to section alignment. Here's what the files look like side-by-side in 010editor.
The difference between the file from disk (left) and the one dumped from memory (right)
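If you'd rather see the header as bytes than as a hex editor template, here is a short sketch that unpacks a single 40-byte IMAGE_SECTION_HEADER with struct. The field order follows the PE/COFF layout; the function name and every value in the sample header are made up.

```python
import struct

# Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics: 40 bytes in total.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def parse_section_header(data):
    """Return the fields of one IMAGE_SECTION_HEADER that matter here."""
    (name, virtual_size, virtual_address, size_of_raw_data,
     pointer_to_raw_data, _relocs, _linenums, _nrelocs, _nlinenums,
     characteristics) = SECTION_HEADER.unpack(data[:SECTION_HEADER.size])
    return {
        "Name": name.rstrip(b"\x00").decode("ascii", "replace"),
        "VirtualSize": virtual_size,
        "VirtualAddress": virtual_address,
        "SizeOfRawData": size_of_raw_data,
        "PointerToRawData": pointer_to_raw_data,
        "Characteristics": characteristics,
    }


# Hypothetical .text header: data at file offset 0x400, loaded at RVA 0x1000.
sample_header = SECTION_HEADER.pack(b".text", 0x1F00, 0x1000, 0x2000, 0x400,
                                    0, 0, 0, 0, 0x68000020)
header = parse_section_header(sample_header)
print("%s raw: 0x%x virtual: 0x%x" % (header["Name"],
                                      header["PointerToRawData"],
                                      header["VirtualAddress"]))
```

The gap between PointerToRawData and VirtualAddress in a header like this is exactly the block of extra null bytes visible in the memory dump.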
 You'll notice in the above image there's a large section of null bytes that doesn't exist in the file from disk. This is due to the VirtualAddress value in the header of each section. In memory each section is aligned at 0x1000, so the loader pads the gaps between sections with null bytes. If you attempt to load the file as is into IDA Pro you'll get something that looks like the below image.
The file that was dumped from memory, rebased, but the DriverEntry is totally wrong.
 When I first loaded it, the DriverEntry point was totally off. That's because IDA Pro is reading the PointerToRawData value of the PE file metadata struct and assuming that the section data is where it says it is: at 0x400 in the case of the .text section. This of course is wrong because the dump from memory was aligned differently (adding about 0xc00 of null bytes). With that mystery finally solved (after much head banging, I assure you), I fixed the PointerToRawData values for each section in 010editor.
Fixing the values for each section (updating PointerToRawData values)
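To make the 0x400 versus 0x1000 mismatch concrete, here is a small sketch of the usual RVA-to-file-offset calculation. The 0x400 and 0x1000 values for .text are the ones from this post; the section size, the sample RVA and the function name are made up.

```python
def rva_to_file_offset(rva, sections):
    """Map an RVA to a file offset using (name, va, size, raw_ptr) tuples."""
    for _name, virtual_address, virtual_size, pointer_to_raw_data in sections:
        if virtual_address <= rva < virtual_address + virtual_size:
            return rva - virtual_address + pointer_to_raw_data
    return None  # RVA is not inside any section


# Header as found in the file on disk: .text data starts at file offset 0x400.
on_disk = [(".text", 0x1000, 0x2000, 0x400)]
# Header after the fix-up: PointerToRawData set equal to VirtualAddress.
fixed_dump = [(".text", 0x1000, 0x2000, 0x1000)]

entry_rva = 0x1234  # hypothetical entry point RVA
print("on disk:    0x%x" % rva_to_file_offset(entry_rva, on_disk))
print("fixed dump: 0x%x" % rva_to_file_offset(entry_rva, fixed_dump))
```

In a memory dump the bytes already sit at their RVAs, so applying the on-disk header to it lands 0xc00 bytes short of the real code. Once PointerToRawData equals VirtualAddress the mapping becomes the identity and the entry point lines up.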
I then attempted to reload the dumped driver into IDA for analysis again. At first I tried to rebase on load (again selecting the "Manual Load" and "Load resources" check boxes) but that turned out to be incorrect as well, as you can see below.
Looks "ok" but some offsets are pointing to the wrong place!
This was annoying as I wasn't really sure how to fix this problem. After mucking around for a while in IDA Pro's rebasing abilities (Edit -> Segments -> Rebase program...) I found that if I unchecked 'Rebase the whole program' I'd get a proper load with all offsets pointing to the right part of the PE file.

 After all that, I finally got the driver dumped from memory to look like the one extracted from the disk.
The image with all sections correctly resolved and displayed.
And that's pretty much it! Now I know how to dump drivers directly from memory and fix them up so they're easier to analyze in IDA Pro. Hope this saves someone the headache!