Let us check that with a simple example - the well-known "Hello World" application in C, built with Microsoft Visual C++ 2010 Express. The size of the executable is 27 kilobytes. The image of the executable has 7 sections, while it could well be implemented with only two (one for code and one for data).
The import directory looks even more exciting. Well, MSVCRT.dll is unavoidable, as it is the C language interface to the Windows operating system. But there are also 28 APIs imported from KERNEL32.dll, and most of them seem to have been placed there by mistake. Take GetTickCount, for example. Do we care about timing when we only want to output a single string and leave? No, we do not.
Anyway, the compiler's heuristics are outside the scope of this article. Let's concentrate on the API functions. In general, the API is a great thing: it lets us develop applications without implementing every single interaction with the operating system ourselves, and it saves us a lot of time. Good on one hand, but bad on the other. The list of API functions a piece of software imports gives a clear picture of what that software does and, even more important, how it does it. This is good when you do malware research, but not so good when you are trying to protect your legitimate software from being hacked.
Unfortunately, there are thousands of software products that use the IsDebuggerPresent API as their only protection mechanism. Isn't this ridiculous? No, it is not. It is rather sad, I'd say. Of course, there are numerous packers/cryptors/protectors out there, but the problem is that the better known your solution is, the more vulnerable it gets. There are some linkers that can obfuscate the import section for you, but again, the problem is that they are known.
One of the possible solutions to this problem is the Stealth Import of APIs. This is a simple and powerful, but underestimated, technique. Many developers, most of them, I should say, believe that it is impossible to create and, even more important, launch a Windows executable without any imports at all. "You need to import KERNEL32.dll at least!" - they would say. Unfortunately, not all of us are aware of the fact that both NTDLL.dll and KERNEL32.dll are automatically mapped into the process's address space regardless of the executable's import table. Obviously, having them loaded in memory makes it possible to locate any API function and load any library should there be a need for it. We may not know it, but the operating system itself provides us with all the tools we need for that.
Get Handle of KERNEL32.DLL or NTDLL.DLL
All we need to do is get the address of the first exception handler in the chain of handlers. This chain is accessible through the first entry in the TIB (Thread Information Block), which is located at [FS:0]. This is as simple as

    ;Get the initial exception handler
    mov eax,[fs:0]
We now have the pointer to the last added EXCEPTION_REGISTRATION record and only need to iterate through the rest of the records to reach the first one, whose handler normally points into either KERNEL32.DLL or NTDLL.DLL (on Windows Vista and 7). The following code does exactly this:
    .search_default_handler:
        cmp dword [eax],0xFFFFFFFF
        jz .found_default_handler
        ;go to the previous handler
        mov eax,[eax]
        jmp .search_default_handler
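For readers who think more easily in C, the walk can be modelled as a linked-list traversal. This is only a sketch over a mock chain: the `seh_record` type, the `SEH_CHAIN_END` macro and the `first_handler` function are names made up for this illustration, and nothing here touches the real FS segment.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative stand-in for EXCEPTION_REGISTRATION.
   On 32-bit x86 the fields sit at offsets +0 (prev) and +4 (handler). */
typedef struct seh_record {
    struct seh_record *prev;  /* all bits set (-1) marks the end of the chain */
    void *handler;
} seh_record;

#define SEH_CHAIN_END ((seh_record *)(uintptr_t)-1)

/* Follow the prev links until the terminating record and return its handler. */
void *first_handler(const seh_record *rec)
{
    while (rec->prev != SEH_CHAIN_END)
        rec = rec->prev;
    return rec->handler;
}
```

The loop condition is the C counterpart of the comparison with 0xFFFFFFFF, and `rec = rec->prev` corresponds to `mov eax,[eax]`.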
The last (or, to say it right, the first) record has its prev field equal to -1, and its handler field ([eax+4] in our case) contains the address of the default exception handler located in one of the DLLs mentioned above. What's next?
Things are really easy if we are on Windows XP, as we have an address inside KERNEL32.DLL and all we have to do is align it down to a 64 KB boundary (the granularity at which modules are mapped)

    mov eax,[eax+4]
    and eax,0xFFFF0000
then "scroll" towards lower addresses, 64 KB at a time, and check each candidate for the 'MZ' signature

    .look_for_mz:
        cmp word [eax],'MZ'
        jz .got_mz
        sub eax,0x10000
        jmp .look_for_mz
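The same backward scan can be sketched in C over an ordinary buffer instead of live process memory. The function name `find_mz` and the `ALLOC_GRANULARITY` constant are assumptions made for this example. The buffer start is treated as if it were 64 KB aligned, and, unlike the assembly loop, the sketch stops when it reaches the beginning of the buffer.

```c
#include <stdint.h>
#include <stddef.h>

#define ALLOC_GRANULARITY 0x10000u  /* 64 KB, matching the 0xFFFF0000 mask */

/* Returns the offset of the 'MZ' header at or below inner_offset,
   or (size_t)-1 if the start of the buffer is reached without finding one. */
size_t find_mz(const uint8_t *mem, size_t inner_offset)
{
    /* Align down to a 64 KB boundary, like "and eax,0xFFFF0000". */
    size_t off = inner_offset & ~(size_t)(ALLOC_GRANULARITY - 1);
    for (;;) {
        if (mem[off] == 'M' && mem[off + 1] == 'Z')
            return off;
        if (off == 0)
            return (size_t)-1;
        off -= ALLOC_GRANULARITY;  /* like "sub eax,0x10000" */
    }
}
```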
Once we find the 'MZ' signature, we have the handle to the library. My advice - save it somewhere. The problem is that we still do not know which library this is (KERNEL32.DLL or NTDLL.DLL). In a normal situation we would call GetModuleFileName, but, again, we are not in a normal situation. The solution is easy. Having the base address of the module, we already have everything we need. Does offset 0x3C look familiar? If not, then you should probably read this document. At this offset from the base address we have a DWORD (e_lfanew) holding the offset of the PE signature ('PE\0\0'), which is followed by the COFF header. We should check it anyway

    mov ebx,[eax+0x3C]
    add eax,ebx
    cmp dword [eax],'PE'
    jz .found_pe
There is not much we can do if the zero flag is not set after the comparison, which means that the PE signature has not been found. We just need to restore the stack to the state it was in when the process started, zero the eax register and execute a ret instruction. This terminates our process and, basically, means that we have to review our previous code. On the other hand, if the zero flag is set, we have successfully reached the PE signature, with the COFF header right behind it.
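As a cross-check, the signature test can be written in C against an image held in a plain byte buffer. This is a minimal sketch, not the article's code: `read_u32` and `pe_signature_offset` are hypothetical helper names, and the bounds checks are additions that the assembly version does not perform.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a little-endian DWORD without assuming alignment. */
uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns the offset of "PE\0\0" inside the image, or 0 if the checks fail. */
uint32_t pe_signature_offset(const uint8_t *image, size_t size)
{
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return 0;
    uint32_t e_lfanew = read_u32(image + 0x3C);  /* DWORD at offset 0x3C */
    if (e_lfanew > size - 4)
        return 0;
    if (memcmp(image + e_lfanew, "PE\0\0", 4) != 0)
        return 0;
    return e_lfanew;
}
```

A return value of 0 plays the role of the "zero flag not set" branch: there is nothing sensible left to do with the image.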
Now we need to locate the export directory. Its RVA (relative virtual address) and size are stored in the first data directory entry at the end of the Optional Header, which, in turn, appears right after the COFF header. We may skip the headers themselves and get straight to the export IMAGE_DATA_DIRECTORY entry (0x78 is its distance from the PE signature in a 32-bit image)

    add eax,0x78
Yes, as simple as that. Now [eax] holds the RVA of the export directory and [eax+4] holds its size

    typedef struct _IMAGE_DATA_DIRECTORY {
        DWORD VirtualAddress; //RVA, EAX points here
        DWORD Size;
    } IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
The next step is to read the RVA of the export table and add it to the image base address (the handle we obtained earlier)

    mov eax,[eax]
    add eax,[image_base_address]
Congratulations! We are finally at the Export Directory Table!
Right now, we are interested in one particular field of this table, namely the "Name RVA", which is located at [eax+0x0C] and points to a NULL-terminated ASCII string containing the name of this very library. The procedure is almost identical to the previous one

    mov eax,[eax+0x0C]
    add eax,[image_base_address]
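Put together, the steps from the base address to the library name look roughly like this in C. The sketch assumes a 32-bit (PE32) image that is already mapped, so that RVAs can be added directly to the base pointer. `rd32` and `export_dll_name` are names invented for the example, and no bounds checking is done.

```c
#include <stdint.h>
#include <stddef.h>

/* Read a little-endian DWORD without assuming alignment. */
uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns the DLL name recorded in the export directory,
   or NULL when the image exports nothing. */
const char *export_dll_name(const uint8_t *base)
{
    uint32_t pe = rd32(base + 0x3C);               /* offset of "PE\0\0" */
    uint32_t export_rva = rd32(base + pe + 0x78);  /* export data directory RVA */
    if (export_rva == 0)
        return NULL;
    uint32_t name_rva = rd32(base + export_rva + 0x0C);  /* "Name RVA" field */
    return (const char *)(base + name_rva);
}
```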
We are one step away from knowing what library this is. However, we have to implement a simple strcasecmp function. Why strcasecmp instead of strcmp? Simply because strcasecmp is case-insensitive, so we do not have to guess whether the library name is upper, lower or even mixed case (like "KERNEL32.dll"). By comparing the library name with the strings 'kernel32.dll' and 'ntdll.dll' in turn, we identify the library.
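A case-insensitive comparison of this kind takes only a few lines. The version below is a sketch in C rather than the assembly routine the article relies on, and the name `names_equal_nocase` is made up. It returns 1 on a match, which is the convention the custom GetProcAddress at the end of the article expects from its own comparison routine.

```c
#include <ctype.h>

/* Returns 1 when both strings are equal ignoring ASCII case, 0 otherwise. */
int names_equal_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;  /* equal only if both strings end at the same position */
}
```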
We are in KERNEL32.DLL!
If this is the case, then we are lucky, as we only have to locate the address of the GetProcAddress API by parsing the export table (this deserves a separate article), or we may keep using our custom version of GetProcAddress. We are able to obtain the address of any API exported by KERNEL32.DLL, as we have its handle. More than that, we are able to load additional libraries by first locating the address of LoadLibraryA or LoadLibraryW. Basically, we are done.
We are in NTDLL.DLL...
This case is less desirable, but it may occur if our software runs on Windows Vista or higher. Needless to say, we have no access to GetProcAddress or LoadLibraryA (yet!). Instead we have the LdrLoadDll and LdrGetDllHandle API functions exported by NTDLL.DLL. Here are the prototypes of these functions, starting with LdrLoadDll:
    NTSYSAPI NTSTATUS NTAPI LdrLoadDll(
        IN PWCHAR PathToFile OPTIONAL,
        IN ULONG Flags OPTIONAL,
        IN PUNICODE_STRING ModuleFileName,
        OUT PHANDLE ModuleHandle);
Let's skip the optional values, as we may safely set them to 0. The first non-optional parameter is ModuleFileName, but what does PUNICODE_STRING mean? It is a pointer to a structure that describes a UNICODE string. This structure may easily be built on the stack. Here is its declaration:

    typedef struct _LSA_UNICODE_STRING {
        USHORT Length;
        USHORT MaximumLength;
        PWSTR Buffer;
    } LSA_UNICODE_STRING, *PLSA_UNICODE_STRING, UNICODE_STRING, *PUNICODE_STRING;
Length - the length, in bytes, of the string pointed to by Buffer, not including the terminating NULL;
MaximumLength - the total size, in bytes, of the memory allocated for Buffer;
Buffer - a pointer to a wide-character string (like 'K', 0, 'E', 0, 'R', 0, 'N', 0, 'E', 0, 'L', 0, '3', 0, '2', 0, '.', 0, 'D', 0, 'L', 0, 'L', 0, 0, 0).
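To make the three fields concrete, here is a small C sketch that fills such a structure for an ASCII name. It uses a portable stand-in type (`unicode_string`, with `uint16_t` in place of `USHORT` and `WCHAR`) so that it compiles without the Windows headers. The type and the `make_unicode_string` helper are assumptions for illustration, not part of any Windows API.

```c
#include <stdint.h>
#include <stddef.h>

/* Portable stand-in for UNICODE_STRING, for illustration only. */
typedef struct {
    uint16_t  Length;         /* bytes used, without the terminating NULL */
    uint16_t  MaximumLength;  /* bytes allocated for Buffer */
    uint16_t *Buffer;         /* UTF-16 code units */
} unicode_string;

/* Widen an ASCII name into storage (capacity counts 16-bit units and
   must be at least 1) and describe the result with a unicode_string. */
unicode_string make_unicode_string(const char *ascii, uint16_t *storage,
                                   size_t capacity)
{
    size_t n = 0;
    while (ascii[n] != '\0' && n + 1 < capacity) {
        storage[n] = (uint16_t)(unsigned char)ascii[n];
        n++;
    }
    storage[n] = 0;  /* terminating NULL, not counted in Length */

    unicode_string us;
    us.Length = (uint16_t)(n * 2);
    us.MaximumLength = (uint16_t)(capacity * 2);
    us.Buffer = storage;
    return us;
}
```

For "KERNEL32.DLL" stored in a 16-unit array this gives a Length of 24 bytes and a MaximumLength of 32 bytes.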
The PHANDLE ModuleHandle parameter is a pointer to a location in memory where the function stores the handle of the loaded library.
Now, let's turn to LdrGetDllHandle

    NTSYSAPI NTSTATUS NTAPI LdrGetDllHandle(
        IN PWORD pwPath OPTIONAL,
        IN PVOID Unused OPTIONAL,
        IN PUNICODE_STRING ModuleFileName,
        OUT PHANDLE pHModule);
Let's skip the optional parameters again, especially the "Unused" one.
ModuleFileName - a pointer to a UNICODE_STRING structure which describes the name of the DLL;
pHModule - a pointer to a location in memory where the function should store the result (the handle of the DLL).
We still have to implement a custom GetProcAddress function in order to retrieve the addresses of these two functions. The sample code is located at the end of this article.
Once we have the addresses of these functions, we should first try to get the module handle of 'KERNEL32.dll' by calling LdrGetDllHandle and, if that fails, try to load it with LdrLoadDll. If both functions fail - restore the stack, execute ret and check your code.
Once we have the module handle of KERNEL32.DLL, we are free to use the API functions it exports (e.g. GetProcAddress, LoadLibrary, etc.).
As you can see, this technique is simple indeed. More than that, it allows you to implement additional protection mechanisms, like code obfuscation, SEH usage and many more, in order to protect one of the most hack-sensitive parts of your software - the import section.
Hope this post was helpful. See you at the next post!
P.S. Custom GetProcAddress function. It is far from perfect, but it is enough for what we need.
    ;This is our custom GetProcAddress
    ;get_proc_address(HMODULE hModule, PCSTR procName)
    if used _get_proc_address
        baseAddress          = -4
        numberNamePointers   = -8
        exportAddressTableVA = -12
        namePointerVA        = -16
        ordinalTableVA       = -20
        ordinalBase          = -24
    _get_proc_address:
        push ebp
        mov ebp,esp
        sub esp,24
        push ebx ecx edx esi edi
        mov esi,[ebp+8]                    ;ESI -> base address
        mov [ebp+baseAddress],esi
        mov ebx,[esi+0x3C]                 ;offset of the PE signature
        add ebx,esi
        add ebx,0x78                       ;EBX -> export table directory entry
        mov ebx,[ebx]
        add ebx,esi                        ;EBX -> export directory table
        mov eax,[ebx+24]                   ;number of name pointers
        dec eax                            ;so that we compare a 0 based index
        mov [ebp+numberNamePointers],eax
        mov eax,[ebx+16]
        mov [ebp+ordinalBase],eax          ;ordinal base
        mov eax,[ebx+28]
        add eax,esi
        mov [ebp+exportAddressTableVA],eax ;VA of address table
        mov eax,[ebx+32]
        add eax,esi
        mov [ebp+namePointerVA],eax        ;VA of name pointers table
        mov eax,[ebx+36]
        add eax,esi
        mov [ebp+ordinalTableVA],eax       ;VA of ordinal table
        xor ecx,ecx                        ;reset the index counter
    .search_loop:
        push ecx
        shl ecx,2                          ;name pointers are 4 bytes each
        mov ebx,[ebp+namePointerVA]
        add ebx,ecx                        ;EBX -> name pointer of an exported API
        mov ebx,[ebx]
        add ebx,[ebp+baseAddress]
        push ebx dword [ebp+12]
        call _strcmp                       ;must return 1 on match and remove its arguments
        test eax,1
        jnz .found_api_name
        pop ecx
        cmp ecx,[ebp+numberNamePointers]
        jz .not_found_a_thing
        inc ecx
        jmp .search_loop
    .found_api_name:
        pop ecx                            ;index of the API name
        mov ebx,[ebp+ordinalTableVA]
        shl ecx,1                          ;ordinals are 2 bytes each
        add ebx,ecx                        ;EBX -> the matching ordinal value
        movzx ebx,word [ebx]               ;index into the export address table
        shl ebx,2                          ;multiply it by 4
        add ebx,[ebp+exportAddressTableVA]
        mov eax,[ebx]
        add eax,[ebp+baseAddress]          ;EAX = address of the exported function
    .out:
        pop edi esi edx ecx ebx
        mov esp,ebp
        pop ebp
        ret 8
    .not_found_a_thing:
        xor eax,eax
        jmp .out
    end if
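For comparison, here is the same lookup as a C sketch operating on a mapped PE32 image held in memory. It follows the structure of the routine above (name pointer table, then ordinal table, then export address table), but it is an illustration under assumptions rather than a drop-in replacement: `ld16`, `ld32` and `find_export` are made-up names, the names are compared with a case-sensitive strcmp, there is no bounds checking, and forwarded exports are not handled.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Little-endian reads that do not assume alignment. */
uint16_t ld16(const uint8_t *p) { return (uint16_t)(p[0] | p[1] << 8); }
uint32_t ld32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns the address of the named export inside the mapped image,
   or NULL when the image has no export directory or no such name. */
const uint8_t *find_export(const uint8_t *base, const char *proc_name)
{
    uint32_t pe = ld32(base + 0x3C);              /* offset of "PE\0\0" */
    uint32_t dir_rva = ld32(base + pe + 0x78);    /* export directory RVA (PE32) */
    if (dir_rva == 0)
        return NULL;

    const uint8_t *dir = base + dir_rva;
    uint32_t name_count      = ld32(dir + 24);         /* number of name pointers */
    const uint8_t *functions = base + ld32(dir + 28);  /* export address table */
    const uint8_t *names     = base + ld32(dir + 32);  /* name pointer table */
    const uint8_t *ordinals  = base + ld32(dir + 36);  /* ordinal table */

    for (uint32_t i = 0; i < name_count; i++) {
        const char *name = (const char *)(base + ld32(names + 4 * i));
        if (strcmp(name, proc_name) == 0) {
            uint16_t index = ld16(ordinals + 2 * i);   /* index into address table */
            return base + ld32(functions + 4 * index);
        }
    }
    return NULL;
}
```

As in the assembly version, the ordinal base is not needed here, because the values in the ordinal table are already zero-based indexes into the export address table.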