Archive for the ‘Debugging’ Category

Useful WinDbg commands: .formats

Monday, October 23rd, 2006

One of the many things that you end up having to do while debugging a program is displaying data types. While you probably know many of the basic commands like db, da, du, and soforth, one perhaps little-used command is useful for displaying a four or eight byte quantity in a number of different data types: the “.formats” command. This command is useful for viewing various primative/built-in data types, where you cannot display as a structure via the “dt” command.

In particular, you can use .formats to translate a number of different data types into readable values, including floating point or various time formats (time_t if you provide a 32-bit value, or FILETIME if you give a 64-bit value). For instance:

0:001> .formats 41414141
Evaluate expression:
  Hex:     41414141
  Decimal: 1094795585
  Octal:   10120240501
  Binary:  01000001 01000001 01000001 01000001
  Chars:   AAAA
  Time:    Fri Sep 10 01:53:05 2004
  Float:   low 12.0784 high 0
  Double:  5.40901e-315

The command also supports 64-bit filetime quantities:

0:001> .formats 01010101`01010101
Evaluate expression:
  Hex:     01010101`01010101
  Decimal: 72340172838076673
  Octal:   0004010020040100200401
  Binary:  00000001 00000001 00000001 00000001
           00000001 00000001 00000001 00000001
  Chars:   ........
  Time:    Sun Mar 28 21:14:43.807 1830 (GMT-4)
  Float:   low 2.36943e-038 high 2.36943e-038
  Double:  7.7486e-304

.formats is primarily useful for saving you a bit of time poking around in a calculator to translate times, or convert perhaps an overwritten eip into text if you are examining a stack buffer string overflow. In conjunction with db and dt, you should be able to format most any data you’ll come across in a debugging session into a readable format (provided you have symbols, of course, in the case of complex user-defined data types).

Win32 calling conventions: __stdcall in assembler

Friday, October 20th, 2006

It’s been awhile since my last post, unfortunately, primarily due to my being a bit swamped with work and a couple of other things as of late. With that said, I’m going to start by picking up where I had previously left off with the Win32 calling conventions series. Without further ado, here’s the stuff on __stdcall as you’ll see it in assembler…

Like __cdecl, __stdcall is completely stack-based.  The semantics of __stdcall are very similar to __cdecl, except that the arguments are cleaned off the stack by the callee instead of the caller.  Because the number of arguments removed from the stack is burned into the target function at compile time, there is no support for variadic functions (functions that take a variable number of arguments, such as printf) that use the __stdcall calling convention.  The rules for register usage and return values are otherwise identical to __cdecl.

In practice, this typically means that an __stdcall function call will look much like a __cdecl function call until you examine the ret instruction that returns transfer to the caller at the end of the __stdcall function in question.  (Alternatively, you can look to see if it appears as if stack arguments are cleaned after the function call.  However, the compiler/optimizer sometimes likes to be tricky with __cdecl functions, and defer argument removal until several function calls later, so this method is less reliable.)

Because the callee cleans the arguments off the stack in an __stdcall function, you will always[1] see a ret instruction terminating a __stdcall function.  For most functions, this count is four times the number of arguments to the function, but this can vary if arguments that are larger than 32-bits are passed.  On Win32, this argument count in bytes value is virtually always[2] a multiple of four, as the compiler will always generate code that aligns the stack to at least four bytes for x86 targets.

Given this information, it is usually fairly easy to distinguish an __stdcall function from a __cdecl function, as a __cdecl function will never use an argument to ret.  Note that this does imply, however, that it is generally not possible to disinguish between an __stdcall function and a __cdecl function in the case that both take zero arguments (without any other outside information other than disassembly); in this special case, the calling conventions have the same semantics.  This also means that if you have a function that does not clean any bytes off the stack with ret, you’ll technically have to examine any callers of the function to see if any pass more than zero arguments (or the actual function implementation itself, to see if it ever expects more than zero arguments) in order to be absolutely sure if the function is __cdecl or __stdcall.

Here’s an example of a simple __stdcall function call for the following C function:
 

__declspec(noinline)
int __stdcall StdcallFunction1(int a, int b, int c)
{
 return (a + b) * c;
}

If we call the function like this:

StdcallFunction1(1, 2, 3);

… we can expect to see something like so, for the call:

push    3
push    2
push    1
call    StdcallFunction1

(There will be no add esp instruction after the call.)

This is quite similar to a __cdecl declared function with the same implementation.  The only difference is the lack of an add esp instruction following the call.

Looking at the function implementation, we can see that unlike the __cdecl version of this function, StdcallFunction1 removes the arguments from the stack:

StdcallFunction1 proc near

a= dword ptr  4 b= dword ptr  8 c= dword ptr  0Ch mov     eax, [esp+8] ; eax = b mov     ecx, [esp+4] ; ecx = a add     eax, ecx     ; eax = eax + ecx imul    eax, [esp+c] ; eax = eax * c retn    0Ch          ; (return value = eax) StdcallFunction1 endp

As expected, the only difference here is that the __stdcall version of the function cleans the three arguments off the stack.  The function is otherwise identical to the __cdecl version, with the return value stored in eax.

With all of this information, you should be able to rather reliably identify most __stdcall functions.  The key things to look out for are:

  • All arguments are on the stack.
  • The ret instruction terminating the function has a non-zero argument count if the number of arguments for the function is non-zero.
  • The ret instruction terminating the function has an argument count that is at least four times the number of arguments for the function.  (If the count is less than four, then the function might be a __fastcall function with three or more arguments.  The __fastcall calling convention passes the first two 32-bit or smaller arguments in registers.)
  • The function does not depend on the state of the ecx and edx volatile variables.  (If the function expects these registers to have a meaningful value initially, then the function is probably a __fastcall or __thiscall function, as those calling conventions pass arguments in the ecx and edx registers.) 

In the next post in this series, I’ll cover the __fastcall calling convention (and hopefully it won’t be such a long wait this time).  Stay tuned…

 

[1]: For functions declared as __declspec(noreturn) or that otherwise never normally return execution control directly to the caller (i.e. a function that always throws an exception), the ret instruction is typically omitted.  There are a couple of other rare cases where you may see no terminating ret, such as if there are two functions, where one function calls the second, and both have very similar prototypes (such as argument ordering or an additional defaulted argument).  In this case, the compiler may combine two functions by having one perform minor adjustments to the stack and then “falling through” directly to the second function.

[2]: If you see a function with a ret instruction that does not take a multiple of four as its argument, then the function was most likely hand-written in assembler.  The Microsoft compiler will never, to my knowledge, generate code like this (and neither should any sane Win32 compiler).

The system call dispatcher on x86

Wednesday, August 23rd, 2006

The system call dispatcher on x86 NT has undergone several revisions over the years.

Until recently, the primary method used to make system calls was the int 2e instruction (software interrupt, vector 0x2e). This is a fairly quick way to enter CPL 0 (kernel mode), and it is backwards compatible with all 32-bit capable x86 processors.

With Windows XP, the mainstream mechanism used to do system calls changed; From this point forward, the operating system selects a more optimized kernel transition mechanism based on your processor type. Pentium II and later processors will instead use the sysenter instruction, which is a more efficient mechanism of switching to CPL 0 (kernel mode), as it dispenses with some needless (in this case) overhead of usual interrupt dispatching.

How is this switch accomplished? Well, starting with Windows XP, the system service call stubs do not hardcode a particular instruction (say, int 2e) anymore. Instead, they indirect through a field in the KUSER_SHARED_DATA block (“SystemCall”). The meaning of this field changed in Windows XP SP2 and Windows Server 2003 SP1; in prior versions, the SystemCall field held the actual code used to make the system call (and was filled in at runtime with the proper values). In XP SP2 and Srv03 SP1, in the interests of reducing system attack surface, the KUSER_SHARED_DATA region was marked non-executable, and SystemCall becomes a pointer to a stub residing in NTDLL (with the pointer value being adjusted at runtime based on the processor type, to refer to an appropriate system call stub).

What this means for you today is that on modern systems, you can expect to see a sequence like so for system calls:

0:001> u ntdll!NtClose
ntdll!ZwClose:
7c821138 b81b000000       mov     eax,0x1b
7c82113d ba0003fe7f       mov     edx,0x7ffe0300
7c821142 ff12             call    dword ptr [edx]
7c821144 c20400           ret     0x4
7c821147 90               nop

0x7ffe0300 is +0x300 bytes into KUSER_SHARED_DATA. Looking at the structure definition, we can see that this is “SystemCall”:

0:001> dt ntdll!_KUSER_SHARED_DATA
   +0x000 TickCountLowDeprecated : Uint4B
   +0x004 TickCountMultiplier : Uint4B
   +0x008 InterruptTime    : _KSYSTEM_TIME
   [...]
   +0x300 SystemCall       : Uint4B
   +0x304 SystemCallReturn : Uint4B
   +0x308 SystemCallPad    : [3] Uint8B
   [...]

Since my system is Srv03 SP1, SystemCall is a pointer to a stub in NTDLL.

0:001> u poi(0x7ffe0300)
ntdll!KiFastSystemCall:
7c82ed50 8bd4             mov     edx,esp
7c82ed52 0f34             sysenter
ntdll!KiFastSystemCallRet:
7c82ed54 c3               ret

On my system, the system call dispatcher is using sysenter. You can look at the old int 2e dispatcher if you wish, as it is still supported for compatibility with older processors:

0:001> u ntdll!KiIntsystemCall
ntdll!KiIntSystemCall:
7c82ed60 8d542408         lea     edx,[esp+0x8]
7c82ed64 cd2e             int     2e
7c82ed66 c3               ret

The actual calling convention used by the system call dispatcher is thus:

  • eax contains the system call ordinal.
  • edx points to either the argument array of the system call on the stack (for int 2e), or the return address plus argument array (for sysenter).

For most of the time, though, you’ll probably not be dealing directly with the system call dispatching mechanism itself. If you are, however, now you know how it works.

Win32 calling conventions: __cdecl in assembler

Thursday, August 17th, 2006

Continuing on the series about Win32 calling conventions, the next topic of discussion is how the various calling conventions look from an assembler level.

This is useful to know for a variety of reasons; if you are reverse engineering (or debugging) something, one of the first steps is figuring out the calling convention for a function you are working with, so that you know how to find the arguments for it, how it deals with the stack, and soforth.

For this post, I’ll concentrate primarily on __cdecl.  Future posts will cover the other major calling conventions.

As I have previously described, __cdecl is an entirely stack based calling convention, in which arguments are cleaned off the stack by the caller.  Given this, you can expect to see all of the arguments for a function placed onto the stack before a function call is main.  If you are using CL, then this is almost always done by using the “push” instruction to place arguments on the stack.

Consider the following simple example function:

__declspec(noinline)
int __cdecl CdeclFunction1(int a, int b, int c)
{
 return (a + b) * c;
}

First, we’ll take a look at what calls to a __cdecl function look like. For example, if we look at a call to the function described above like so:

CdeclFunction1(1, 2, 3);

… we’ll see something like this:

; 119  : 	int v = CdeclFunction1(1, 2, 3);

  00000	6a 03		 push	 3
  00002	6a 02		 push	 2
  00004	6a 01		 push	 1
  00006	e8 00 00 00 00	 call	 CdeclFunction1
  0000b	83 c4 0c	 add	 esp, 12

There are basically three different things going on here.

  1. Setting up arguments for the target function. This is what the three different “push” instructions do. Note that the arguments are pushed in reverse order – you’ll always see them in reverse order if they are placed on the stack via push.
  2. Making the actual function call itself. After all the arguments are in place, the “call” instruction is used to transfer execution to the target. Remember that on x86, the call instruction implicitly pushes the return address on the stack. After the function call returns, the return value of the function is stored in the eax (or edx:eax) registers, typically.
  3. Cleaning arguments off the stack after the function returns. This is the purpose of the “add esp, 0xc” instruction following the “call” instruction. Since the target function does not adjust the stack to remove arguments after the call, this is up to the calller. Sometimes, you may see multiple __cdecl function calls be made in rapid succession, with the compiler only cleaning arguments from the stack after all of the function calls have been made (turning many different “add esp” instructions into just one “add esp” instruction).

It is also worth looking at the implementation of the function to see what it does with the arguments passed in and how it sets up a return value. The assembler for CdeclFunction1 is as so:

CdeclFunction1 proc near

a= dword ptr  4
b= dword ptr  8
c= dword ptr  0Ch

mov     eax, [esp+8]    ; eax = b
mov     ecx, [esp+4]    ; ecx = a
add     eax, ecx        ; eax = eax + ecx
imul    eax, [esp+0Ch]  ; eax = eax * c
retn                    ; (return value = eax)
CdeclFunction1 endp

This function is fairly straightforward. Since __cdecl is stack based for argument passing, all of the parameters are on the stack. Recall that the “call” instruction pushes the return address onto the stack, so the stack will begin with the return value at [esp+0] and have the first argument at [esp+4]. A graphical view of the stack layout (relative to “esp”) of this function is thus:

+00000000  r              db 4 dup(?)      ; (Return address)
+00000004 a               dd ?
+00000008 b               dd ?
+0000000C c               dd ?

In this case, there is no frame pointer in use, so the function accesses all of the arguments directly relative to “esp”. The steps taken are:

  1. The function fills eax with the value of the second argument (b), located at [esp+8] according to our stack layout.
  2. Next, the function loads ecx with the value of the first argument (a), which is located at [esp+4].
  3. Next, the function adds to eax the value of the first argument (a), now stored in the ecx register.
  4. Finally, the function multiplies eax by the value of the third argument (c), located at [esp+c].
  5. After finishing with all of the computations needed to implement the function, it simply returns with a “retn” instruction. Since the caller cleans the stack, the “retn” intruction (with a stack adjustment) is not used here; __cdecl functions never use “retn <displacement>“, only “retn”. Additionally, because the result of the “mul” instruction happened to be stored in the eax register here, no extra instructions are needed to set up the return value, as it is already stored in the return value register (eax) at the end of the function.

Most __cdecl function calls are very similar to the one discussed above, although there will typically be much more code to the actual function and the function call (if there are many arguments), and the compiler may play some optimization tricks (such as deferring cleaning the stack across several function calls). The basic things to look for with a __cdecl function are:

  • All arguments are on the stack.
  • The return instruction is “retn” and not “retn <displacement>“, even when there are a non-zero number of arguments.
  • Shortly after the function call returns, the caller cleans the stack of arguments pushed. This may be deferred later, depending on how the compiler assembled the caller.

Note that if you have been paying attention, given the above criteria, you’ve probably noticed that a __cdecl function with zero arguments will look identical to an __stdcall function with zero arguments. If you don’t have symbols or decorated function names, there is no way to tell the two apart when there are no arguments, as the semantics are the same in that special case.

That’s all for a basic overview of __cdecl from an assembler perspective. Next time: more on the other calling conventions at an assembly level.

Win32 calling conventions: Usage cases

Tuesday, August 15th, 2006

Last time, I talked about some of the general concepts behind the varying calling conventions in use on Win32 (x86).  This posting focuses on the implications and usage cases behind each of the calling conventions, in an effort to provide a better understanding as to when you’ll see them used.

When looking at the different calling conventions, we can see that there are a number of differences between them.  Stack usage vs register parameters, caller vs callee cleans stack, member function calls vs “plain C” function calls, and soforth.  These differences lend each calling convention to specific cases where they are best suited.

To begin with, consider the __cdecl calling convention.  We know that it is a stack based calling convention where the caller cleans the stack.  Furthermore, we know that it is the default calling convention for CL (big hint as to when you’ll see this being used!).  These attributes make it well suited for a couple of cases:

  • Variadic functions, or functions with an ellipsis (…) terminating the argument list.  These functions have a variable number of arguments, which is not known at compile time of the callee.  __cdecl is useful for these functions because the compiler needs to implement a stack displacement to clean arguments off of the stack, but it doesn’t know how many arguments there are.  By leaving the argument-disposal up to the caller, who does know the number of arguments at compile time, the compiler doesn’t need special help from the programmer to correctly adjust the stack when the variadic function is going to return – it “just works”.  __cdecl is the only calling convention on Win32 x86 that supports variadic functions when used with CL.
  • Old-style C functions without prototypes.  For compatibility with legacy C code, the C compiler needs to support making function calls to unprototyped functions.  These must be treated as if they were variadic functions, because the compiler doesn’t know whether the function takes a fixed number of arguments or not (because there is no prototyped argument list).
  • Any other case where the programmer does not explicitly override the calling convention.  The default Visual Studio build environment will use the compiler default calling convention if you do not explicitly tell it otherwise, and this goes to __cdecl.  Some build environments (the DDK/build.exe platform in particular) default to different calling conventions, but Visual Studio built programs will always default to __cdecl if you are using CL.

Next, we’ll take a look at __stdcall.  This calling convention is the standard for Win32 APIs; virtually all system APIs are __stdcall (typically decorated as “WINAPI”, “NTAPI”, or “CALLBACK” in the headers, which are macros that expand to __stdcall).  Here are the typcial usage cases for __stdcall (and the “why” behind them):

  • Library functions.  Excepting the C runtime libraries, virtually all Microsoft-shipped Win32 libraries use __stdcall.  The main reason for this (if you discount the “that’s the way it has always been”) is that you save some instruction code space by using __stdcall and not __cdecl for library functions.  The reason for this is that for __cdecl functions, the caller typically needs to adjust the stack pointer after every “call” instruction to a __cdecl function (which takes up instruction code space – typically an “add esp, imm8” opcode).  For __stdcall functions, you only pay this penality once, in the “retn imm16” opcode at the end of the function (as opposed to once for every caller).  For frequently called functions (say, ReadFile), this begins to add up.  You also theoretically save a bit of processor time and cache space, as there is one less instruction to be executed per “call”.
  • COM functions.  COM uses __stdcall with the “this” pointer being the first argument, which is a required part of the COM API contract for publicly accessible functions.
  • Functions that need to be called from a language other than C/C++.  This also ties back into the COM and library function purposes, but of all of the calling conventions discussed here, only __stdcall has practically universal support among non-Microsoft or non-C/C++ compilers for x86 Win32 (such as Visual Basic, or Delphi).  As a result, it is advantageous to use __stdcall if you are expecting to be called from other languages.
  • Microsoft-built programs.  Microsoft defaults their programs to __stdcall and not __cdecl virtually everywhere, even in images that don’t export functions, or in internally, non-exported funtions within a system library.  This also applies to Microsoft kernel mode code, such as the HAL and the kernel itself.
  • Programs built with the DDK.  The DDK defaults to __stdcall and not __cdecl.
  • NT kernel drivers.  These are always (or at least should always be!) built with the DDK, which again, defaults to __stdcall.

There yet remains __fastcall to discuss.  This calling convention is not used as extensively as the other two (no Microsoft build environment that I am aware of defaults to it), so most of the cases for it being used are the result of a programmer explicitly requesting it.

  • Functions that do not call other functions (“leaf functions”).  These are good candidates for __fastcall because the register arguments are passed in volatile registers, so there is a penalty associated with __fastcall functions that call subfunctions and need to use their arguments across those function calls, as this requires the __fastcall function to save arguments to somewhere nonvolatile (i.e. the stack), and that defeats the whole purpose of __fastcall entirely.
  • Functions that do not use their arguments after the first subfunction call.  These can still benefit from __fastcall without the penalty mentioned above relating to preserving arguments across function calls.
  • Short functions that call other functions and then return.  If you can make both functions __fastcall, then sometimes the compiler can be clever and not need to re-load the argument registers when a __fastcall function calls a __fastcall subfunction.  This can be useful for “wrapper” functions in some cases.
  • Functions that interface with assembly code.  Sometimes it can be more convenient to make a C function called by assembler code __fastcall, because this can save you the work of manually tracking stack displacements.  It can also sometimes be more convenient to make assembler functions called by C code __fastcall as well, for similar reasons.

In general, __fastcall is relatively rare.  There are a couple of kernel functions that use it (KfRaiseIrql, for instance).  A couple of software vendors (such as Blizzard Entertainment) seem to like to ship things compiled with __fastcall as the default calling convention, but this not the common case, and usually not a good idea.

Finally, there is __thiscall.  This calling convention is only used if you are using the default calling convention for member functions.  Note that for member functions that are not accessible cross-program (e.g. not exported somehow), the compiler will sometimes replace ecx with ebx for the “this” pointer as a custom calling convention, depending on your optimization settings.

That’s all for this installment.  In the next posting, I’ll discuss what the various calling conventions look like at a low level (assembly), and what this means to you.

Win32 calling conventions: Concepts

Monday, August 14th, 2006

If you have done debugging work on Win32 (x86) for any length of time, you probably know that there are many different calling conventions.  While I have already covered the Win64 (X64) calling convention, I haven’t yet gone into details about how to work with the various calling conventions present on x86 Windows.  This miniseries is going to go into some detail relating to how the calling conventions work from the perspective of debugging and reverse engineering.

There are three major calling conventions in use in modern Win32 (on x86, anyway): __stdcall, __cdecl, and __fastcall.  All three have different characteristics that are important if you are debugging or reverse engineering something, and it is important to be able to recognize and work with functions written against all three calling conventions – even if you don’t have access to symbols.  To start off, it’s probably best to get a feel for what all of the different calling conventions do, how they work, and soforth.

Most of the calling conventions have some things in common.  All of the calling conventions have the same set of volatile registers, which are not required to remain the same across a call site: eax, ecx, and edx.  Additionally, all three calling conventions use eax for 32-bit return values, and eax:edx for 64-bit return values (high 32 bits in edx).  For large return values (e.g. functions that return a structure, and not a pointer to a structure), a hidden argument is passed to a hidden local variable in the caller’s stack frame that represents the large return value, and the return value is filled in using the hidden argument (that is, large return values are actually implemented as a hidden pointer parameter).

  • __cdecl is the default calling convention for the Microsoft C/C++ compiler.  This is an entirely stack based calling convention for parameter passing; the caller cleans the arguments off of the stack when the call returns.
  • __stdcall has the same semantics as __cdecl, except that the callee (target function) cleans the arguments off of the stack instead of the caller.
  • __fastcall passes the first two register-sized arguments in ecx and edx, with the remaining arguments passed on the stack like __stdcall.  If any stack based arguments were present, the callee cleans them off of the stack.

Additionally, there is a fourth calling convention implemented by CL, known as __thiscall, which is like __stdcall except that there is a hidden pointer argument representing a class object (“this” in C++) that is passed in ecx.  As you might imagine, __thiscall is used for non-static class member function calls (by default).  If you explicitly override the calling convention of member functions, then “this” becomes a hidden (first) argument that is passed according to the conventions of the overriding calling convention.  In general, __thiscall is not really used in the Windows API.

That’s a general overview of the basic concepts behind all of the major Win32 x86 calling convention.  In some upcoming posts, I’ll explore the implications behind the different attributes of these calling conventions, their usage cases, and how they apply to you when you are debugging or reverse engineering something.

The power of dumpbin.exe with symbols

Friday, August 11th, 2006

Many of the compiler utilities shipped with Visual Studio and the DDK in recent times actually support symbols under the hood, but this support is not well documented.

For example, “dumpbin.exe” (actually “link.exe /dump”) supports this, and so does “dumpbin.exe /disasm”.  All you need to do to activate this support is set the default symbol path with _NT_SYMBOL_PATH.  Henceforth, you will be able to see symbol names for exported functions with dumpbin (if you have symbols, of course) – even functions that are exported by ordinal only.

Additionally, when combined with symbol support, you can use “dumpbin.exe /disasm” as a quick-n-dirty x64/IA-64 disassembler (a cheap replacement for IDA Pro Advanced, for instance).  While certainly not as pleasant as a full project-based disassembler, it can get the job done in a pinch and it won’t cost you an arm and a leg either (not that I don’t love IDA, but they make it excessively difficult to get a copy of the 64-bit capable versions of their disassembler).   Skape and myself used this technique when performing research for our paper on x64’s “PatchGuard”.

Setting a default symbol path for all DbgHelp programs

Thursday, August 10th, 2006

If you’re like me, then you have lots of programs you use on a regular basis which utilize DbgHelp to manage symbol access.  Unfortunately, each of these programs typically has its own convention for managing a persisted symbol path, and it can get to be very annoying to update the paths for every single program you use when you need to change your symbol path.

Fortunately, there is a solution:  A little-used (nowadays) environment variable called “_NT_SYMBOL_PATH”.  DbgHelp will add this to the list of symbol paths searched for symbols automatically, without any special configuration needed by programs using it (in most cases; it is possible to programmatically suppress this behavior, but most programs do not do this as it is a bad idea to block this behavior).

How does this environment variable help you consolidate your symbol paths?  Well, there is an option in the System property sheet (under Control Panel) to set default environment variables for your user or the local machine.  You can use this to set a default value for _NT_SYMBOL_PATH for your user, which means that all of your programs using DbgHelp will now have a default symbol search path.

You can give _NT_SYMBOL_PATH any value that you could pass to “.sympath” or save in a WinDbg workspace, including symbol server references.

Note that if you give _NT_SYMBOL_PATH a symbol server reference, you need to make sure that programs using DbgHelp can find symsrv.dll, which means placing it in your path if those programs do not already have a copy in their directories.

Checking if a binary exists on a symbol repository

Wednesday, August 9th, 2006

This question came up on the microsoft.public.windbg newsgroup and turned out to be more complicated to solve than it might appear, so I’m posting about it here.

Someone asked if there is a way to use the DTW tools to detect if a binary is present on a symbol repository.  Now, you might naively assume that you can just use symchk.exe on the binary and it will work, and indeed if you try this, you might be fooled into thinking that it does work.  However, it really doesn’t – if you do this, all that symchk will do is verify that the symbols for the binary are on the symbol repository.

If you need to make sure that the binary is there (i.e. so you can do dump file debugging using the symbol repository), then you need to use the “/id filename” command line parameter with symchk.  This tells symchk that you want to verify that symbols for a dump file exist on the symbol server.  Because DbgEng lets you load a single PE image (.dll/.exe/.sys/etc) as a dump file, and because dump files require loading the actual binary itself from the symbol path, this forces symchk to verify that the binary (and not just the symbols) exist on the symbol repository.

Debugging a custom unhandled exception filter

Tuesday, August 8th, 2006

If you are working on a custom unhandled exception filter (perhaps to implement your own crash dump / error reporting mechanism), then you have probably run into the frustrating situation where you can’t debug the filter itself in case there is a crash bug in it.

The symtoms are that when a crash occurs, your crash reporting logic doesn’t work at all and the process just silently disappears.  When you attach a debugger and reproduce the crash to figure out what went wrong, the debugger keeps getting the exceptions (even if you continue after the first chance exception), and your unhandled exception filter is just never called.

Well, the reason for this is that UnhandledExceptionFilter tries to be clever and hides any last chance exception filter handling when a debugger is active.  Here’s why: the default implementation of the unhandled exception filter launches the JIT debugger and then restarts the exception.  When the JIT debugger attaches itself and the exception is restarted, you clearly want the exception to go to the debugger and not the unhandled exception filter, which would try to launch the JIT debugger again.

If, however, you have a custom unhandled exception filter that is crashing, then you probably don’t want this behavior so that you can debug the problem with the exception filter.  To disable this behavior and let the exception filter be called even if there is a debugger attached, you will need to patch kernel32!UnhandledExceptionFilter in a debugger.  If you unassemble it, you should eventually see a call to NtQueryInformationProcess that looks like this:

NtQueryInformationProcess(GetCurrentProcess(), 7, &var, 4, 0)

…followed by a comparison of that local variable passed in to NtQueryInformationProcess against 0.  You will need to patch the comparison to treat the local set by NtQueryInformationProcess as if it were zero, perhaps turning it into an unconditional jmp with the “eb” or “a” commands.

This comparison is checking for something called the “process debug port”, which is a handle to an LPC port that is used to communicate with the debugger attached to the current process.  The kernel returns a null handle value if there is no debugger attached, which is how UnhandledExceptionFilter knows whether to work its magic and forward the exception on to the debugger or call the registered unhandled exception filter.

After patching out the check, then exceptions should be forwarded to your unhandled exception filter as if no debugger were there, giving you a chance to inspect the operation of your crash handling code.