If you’ve debugged crash dumps for a while, you’ve probably run into a situation where the initial dump context provided by the debugger corresponds to a secondary exception: one that happened while processing an earlier exception, which is likely closer to the underlying problem you’re investigating.
This can be annoying, as the “.ecxr” command will point you at the site of the secondary failure exception, and not the original exception context itself. However, in most cases the original, primary exception context is still there on the stack; one just needs to know how to find it.
There are a couple of ways to go about this:
- For hardware-generated exceptions (such as access violations), one can look for ntdll!KiUserExceptionDispatcher on the stack, which takes a PCONTEXT and a PEXCEPTION_RECORD as arguments.
- For software-generated exceptions (such as C++ exceptions), things get a bit dicier. One can look for ntdll!RtlDispatchException being called on the stack, and from there, grab the PCONTEXT parameter.
This can be a bit tedious if stack unwinding fails, or if you’re dealing with one of those dumps where exceptions occur on multiple threads at the same time, typically due to crash dump writing going haywire (I’m looking at you, Outlook…). It would be nice if the debugger could automate this process a little bit.
Fortunately, it’s actually not hard to do this with a little bit of a brute-force approach. Specifically, just a plain old “dumb” memory scan for something common to almost all CONTEXT records. It’s not exactly a finesse approach, but it’s usually a lot faster than manually groveling through the stack, especially if multiple threads or multiple nested exceptions are involved. While there may be false positives, it’s usually immediately obvious which results plausibly belong to a live exception and which don’t. Sometimes a quick-and-dirty brute-force solution really does do the trick.
In order to find CONTEXT records based on a memory search, though, we need some common data points that are typically the same for all CONTEXT structures, and, preferably, contiguous (for ease of use with the “s” command, the debugger’s memory search support). Fortunately, it turns out that this exists in the form of the segment registers of a CONTEXT structure:
0:000> dt ntdll!_CONTEXT
+0x000 ContextFlags : Uint4B
[…]
+0x08c SegGs : Uint4B
+0x090 SegFs : Uint4B
+0x094 SegEs : Uint4B
+0x098 SegDs : Uint4B
[…]
Now, it turns out that all threads in a given process will almost always have the same segment selector values, excluding exotic and highly out-of-the-ordinary cases like VDM processes. (The same goes for the segment selector values on x64 as well.) Four non-zero 32-bit values (actually, 16-bit values zero-padded to 32 bits) are enough to reasonably pull a search off without being buried in false positives. Here’s how to do it with the infamous WinDbg debugger script (also applicable to other DbgEng-enabled programs, such as kd):
.foreach ( CxrPtr { s -[w1]d 0 l?ffffffff @gs @fs @es @ds } ) { .cxr CxrPtr - 8c }
This is a bit of a long-winded command, so let’s break it down into the individual components. First, we have a “.foreach” construct, which according to the debugger documentation, follows this convention:
.foreach [Options] ( Variable { InCommands } ) { OutCommands }
The .foreach command (actually one of the more versatile debugger-scripting commands, once one gets used to using it) basically takes a series of input strings generated by an input command (InCommands) and invokes a command to process that output (OutCommands), with the results of the input command being substituted in as a macro specified by the Variable argument. It’s ugly and operates based on text parsing (and there’s support for skipping every X inputs, among other things; see the debugger documentation), but it gets the job done.
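To make the substitution behavior concrete, here is a rough Python model of it. This is only a sketch of the text-replacement idea described above, not how DbgEng implements it; the `foreach` function and its parameter names are hypothetical, and the sample addresses are made up for illustration.

```python
def foreach(variable, in_output, out_command):
    """Model of .foreach: expand out_command once per whitespace-separated token."""
    expanded = []
    for token in in_output.split():
        # Only stand-alone (space-delimited) occurrences of the variable
        # are replaced; "CxrPtr-8c" is left untouched in this model.
        words = [token if word == variable else word
                 for word in out_command.split(" ")]
        expanded.append(" ".join(words))
    return expanded


# Pretend the input command printed two addresses, one per line.
commands = foreach("CxrPtr", "0x0229fdb8\n0x03f2fdbc", ".cxr CxrPtr - 8c")
for command in commands:
    print(command)
```

Each token of the input command's output yields one expanded output command, which is why restricting the input command to printing bare addresses matters so much.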
The next part of this operation is the s command, which instructs the debugger to search for a pattern in memory in the target. The arguments supplied here instruct the debugger to search only writable memory (w), output only the address of each match (1), and scan for DWORD (4-byte) sized quantities (d) in the lower 4GB of the address space (0 l?ffffffff); in this case, we’re assuming that the target is a 32-bit process (which might be hosted on Wow64, hence 4GB instead of 3GB). The remainder of the command specifies the search pattern to look for: the segment register values of the current thread. The “s” command sports a plethora of other options (with a rather unwieldy and convoluted syntax, unfortunately); the debugger documentation runs through the gamut of the other capabilities.
The final part of this command string is the output command, which simply directs the debugger to set the current context to the CONTEXT record that starts 0x8c bytes before the address substituted for the replacement macro. (If one recalls, 0x8c is the offset from the start of a struct _CONTEXT to the SegGs member, which is the first value we search for; as a result, the addresses returned by the “s” command will be the address of the SegGs member.) Remember that we restricted the output of the “s” command to just the address itself, which lets us easily pass that on to a different command (which might give one the idea that the “s” and “.foreach” commands were made to work together).
Putting the command string together, it directs the debugger to search for a sequence of four 32-bit values (the gs, fs, es, and ds segment selector values for the current thread) in contiguous memory, and display the containing CONTEXT structure for each match.
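The same idea can be sketched outside the debugger. The Python below is a minimal illustration, assuming a flat snapshot of a memory region is available as bytes; the `find_context_candidates` helper, the synthetic region, and the selector values (0x2b/0x53, typical of a Wow64 thread) are all assumptions made for the example, not something the debugger exposes in this form.

```python
import struct

SEG_GS_OFFSET = 0x8C  # offset of SegGs within the x86 _CONTEXT, per the dt output


def find_context_candidates(memory, base, gs, fs, es, ds):
    """Return start addresses of possible CONTEXT records in a memory snapshot."""
    # Four contiguous little-endian DWORDs: SegGs, SegFs, SegEs, SegDs.
    pattern = struct.pack("<4I", gs, fs, es, ds)
    hits = []
    pos = memory.find(pattern)
    while pos != -1:
        addr = base + pos
        # Assumption of this sketch: CONTEXT records are DWORD-aligned,
        # so unaligned hits are discarded.
        if addr % 4 == 0:
            hits.append(addr - SEG_GS_OFFSET)  # back up to ContextFlags
        pos = memory.find(pattern, pos + 1)
    return hits


# Synthetic "stack" region with one fake CONTEXT placed at offset 0x40.
base = 0x0012F000
region = bytearray(0x400)
struct.pack_into("<I", region, 0x40, 0x0001003F)  # ContextFlags
struct.pack_into("<4I", region, 0x40 + SEG_GS_OFFSET, 0x2B, 0x53, 0x2B, 0x2B)

candidates = find_context_candidates(bytes(region), base, 0x2B, 0x53, 0x2B, 0x2B)
print([hex(address) for address in candidates])
```

As with the debugger command, every hit is only a candidate; whether it is a real exception context still has to be judged from the register values it contains.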
You may find some other CONTEXT records aside from exception-related ones while executing this command (in particular, initial thread contexts are common), but the ones related to a fault are usually pretty obvious. Of course, this method isn’t foolproof, but it lets the debugger do some of the hard work for you (which beats manually groveling in corrupted stacks across multiple threads just to pull out some CONTEXT records).
Naturally, there are a number of other uses for both the “.foreach” and “s” commands; don’t be afraid to experiment with them. There are other helpers for automating certain tasks (!for_each_frame, !for_each_local, !for_each_module, !for_each_process, and !for_each_thread, to name a few) too, aside from the general “.foreach”. The debugger scripting support might not be the prettiest to look at, but it can be quite handy at speeding up common, repetitive tasks.
One parting tip with “.foreach” (well, actually two parting tips): The variable replacement macro only works if you separate it from other symbols with a space. This can be a problem in some cases (in particular, where you need to perform some arithmetic on the expanded macro, such as subtracting 0x8c in this case), as the spaces remain when the macro symbol is expanded. Some commands, such as “dt”, don’t follow the standard expression parsing rules (much to my eternal irritation), and choke if they’re given arguments with spaces.
All is not lost for these commands, however; one way to work around this issue is to store the macro replacement into a pseudo-register (say, “r @$t0 = ReplacementVariableMacro - 0x8c”) and use that pseudo-register in the actual output command, as you can issue multiple, semicolon-delimited commands in the output commands section.
Tags: Debugging, Debugging tips
Thanks for this info and congratulations on your new position at MS (http://blogs.msdn.com/michael_howard/archive/2009/03/24/ken-johnson-skywing-joins-microsoft.aspx)
I am kind of mixed on the fact that both skape and Skywing are now at MS, just like I was kind of mixed on MS buying Winternals. For example, both skape and Skywing wrote the papers on PatchGuard, and I was not able to find anything about PatchGuard in Windows 7 x64.
Hi
Thank you for your post, however maybe it could be improved.
I have for years used the “~*e s -d poi(@$teb+8) poi(@$teb+4) 1003f” to find the real stack “behind” the exception. (posted by Ivan Brugiolo).
When I tried your method on a recent dump, it did not find the correct CONTEXT structure because the FS in the correct CONTEXT was 3b instead of 38. after ntdll!DbgUiRemoteBreakin.
I don’t understand the difference between ContextFlags 1003f and 10017; can you explain?
Thank you !
My output:
0:010> .foreach ( CxrPtr { s -[w1]d 0 l?ffffffff @gs @fs @es @ds } ) {.echo "=== CxrPtr: ";.echo CxrPtr; dd CxrPtr - 8c l1;.cxr CxrPtr - 8c }
=== CxrPtr:
0x0229fdb8
0229fd2c 00010017
eax=00000000 ebx=00000001 ecx=00000002 edx=00000003 esi=00000004 edi=00000005
eip=7c95077b esp=0229fff8 ebp=00000000 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000200
ntdll!DbgUiRemoteBreakin:
7c95077b 6a08 push 8
=== CxrPtr:
0x03f2fdbc
03f2fd30 00010017
eax=7c0040e2 ebx=016c5388 ecx=7c910970 edx=7c90ee18 esi=00000000 edi=0012f440
eip=7c810856 esp=03f2fffc ebp=7c913e6f iopl=0 nv up ei pl nz na po nc
cs=05e0 ss=0010 ds=0023 es=0023 fs=0038 gs=0000 efl=00000200
kernel32!BaseThreadStartThunk:
05e0:7c810856 33ed xor ebp,ebp
=== CxrPtr:
0x043cfdbc
043cfd30 00010017
eax=791d24e3 ebx=04201eb0 ecx=0000ce91 edx=00000002 esi=00000000 edi=00150178
eip=7c810856 esp=043cfffc ebp=7c91664e iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000200
kernel32!BaseThreadStartThunk:
7c810856 33ed xor ebp,ebp
====> Correct fault here !
0:010> ~*e s -d poi(@$teb+8) poi(@$teb+4) 1003f
0012f87c 0001003f 00000000 00000000 00000000 ?...............
0:010> .cxr 0012f87c
eax=616e614d ebx=07dea796 ecx=09647dc8 edx=00000001 esi=09647dc8 edi=06e82408
eip=0b5a145e esp=0012fc08 ebp=0012fc18 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
0b5a145e 8b5804 mov ebx,dword ptr [eax+4] ds:0023:616e6151=????????
See:
http://groups.google.com/group/microsoft.public.windbg/tree/browse_frm/thread/e0270232f2560e5e/4938d2d8b2e4edec?hl=en&rnum=1&q=real+stack+%22behind%22+the+exception&_done=%2Fgroup%2Fmicrosoft.public.windbg%2Fbrowse_frm%2Fthread%2Fe0270232f2560e5e%2F4938d2d8b2e4edec%3Fhl%3Den%26tvc%3D1%26q%3Dreal%2Bstack%2B%2522behind%2522%2Bthe%2Bexception%26#doc_08b20827422f3d42
hi dear, I need some source code for DDK, some good sample for driver programming, if you can send me some stuff related to this and kernel level programming and how to use some important kernel functions it’s appreciated.
thanks man …
if it’s no problem send them to me through E-mail .
thanks .
You don’t have to use the pseudo-register trick to work around the spaces on symbol replacement. You can use the alias interpreter macro to handle it.
For example:
.foreach (CxrPtr {}){.cxr ${CxrPtr}-8c}
The alias interpreter version does not require spaces around it.