Previously, I provided a brief overview of what each of the core APIs relating to x64’s extensive data-driven unwind support were, and when you might find them useful.
This post focuses on discussing the interface-level details of RtlUnwindEx, and how they relate to procedure unwinding on Windows (x64 versions, specifically, though most of the concepts apply to other architecture in principle).
The main workhorse of unwind support on x64 Windows is RtlUnwindEx. As previously described, this routine encapsulates all of the work necessary to restore execution context to a prior point in the call stack (relying on RtlVirtualUnwind for this task). RtlUnwindEx also implements all of the logic relating to interactions with unwind/exception handlers during the unwind process (which is essentially the value added by RtlUnwindEx on top of what RtlVirtualUnwind implements).
In order to understand the inner workings of how unwinding works, it is first necessary to understand the high level theory behind how RtlUnwindEx is used (as RtlUnwindEx is at the heart of unwind support on Windows). Although there have been previously posted articles that touch briefly on how unwind is implemented, none that I have seen include all of the details, which is something that this segment of the x64 exception handling series shall attempt to correct.
For the moment, it is simpler to just consider the unwind half of exception handling. The nitty-gritty, exhaustive details of how exceptions are handled and dispatched will be discussed in a future posting; for now, assume that we are only interested in the unwind code path.
When a procedure unwind is requested, by any place within the system, the first order of business is a call to RtlUnwindEx. The prototype for RtlUnwindEx was provided in a previous posting, but in an effort to ensure that everyone is on the same page with this discussion, here’s what it looks like for x64:
VOID NTAPI RtlUnwindEx( __in_opt ULONG64 TargetFrame, __in_opt ULONG64 TargetIp, __in_opt PEXCEPTION_RECORD ExceptionRecord, __in PVOID ReturnValue, __out PCONTEXT OriginalContext, __in_opt PUNWIND_HISTORY_TABLE HistoryTable );
These parameters deserve perhaps a bit more explanation.
- TargetFrame describes the stack pointer (rsp) value for the target of the unwind operation. In normal circumstances, this is always the EstablisherFrame argument to an exception handler that is handling an exception. In the context of an exception handler, EstablisherFrame refers to the stack pointer of the caller of the function that caused the exception being inspected. Likewise, in this context, TargetFrame refers to the stack pointer of the function that the call stack should be unwound to. Although given the fact that with data-driven unwind semantics, one might initially think that this argument is unnecessary (after all, one might assume that RtlUnwindEx could simply invoke RtlVirtualUnwind in order to determine the expected stack pointer value for the next function on the call stack), this argument is actually required. The reason is that RtlUnwindEx supports unwinding past multiple procedure frames; that is, RtlUnwindEx can be used to unwind to a function that is several levels down in the call stack, instead of the immediately lower function in the call stack. Note that the TargetFrame argument must match exactly the expected stack pointer value of the target function in the call stack.
Observant readers may pick up on the SAL annotation describing the TargetFrame argument and notice that it is marked as optional. In general, TargetFrame is always supplied; it can be omitted in one specific circumstance, which is known as an exit unwind; more on that later.
- TargetIp serves a similar purpose as TargetFrame; it describes the instruction pointer value that execution should be unwound to. TargetIp must be an instruction in the same function on the call stack that corresponds to the target stack frame described by TargetFrame. This argument is supplied as a particular function may have multiple points that could be resumed in response to an exception (this typically the case if there are multiple try/except clauses).
Like TargetFrame, the TargetIp argument is also optional (though in most cases, it will be present). Specifically, if a frame consolidation unwind operation is being executed, then the TargetIp argument will be ignored by RtlUnwindEx and may be set to zero if desired (it will, however, still be passed to unwind handlers for use as they see fit). This specialized unwind operation will be discussed later, along with C++ exception support.
- ExceptionRecord is an optional argument describing the reason for an unwind operation. This is typically the same exception record that was indicated as the cause of an exception (if the caller is an exception handler), although it does not strictly have to be as such. If no exception record is supplied, RtlUnwindEx constructs a default exception record to pass on to unwind handlers, with an exception code of STATUS_UNWIND and an exception address referring to an instruction within RtlUnwindEx itself.
- ReturnValue describes a pointer-sized value that is to be placed in the return value register at the completion of an unwind operation, just before control is transferred to the newly unwound context. The interpretation of this value is entirely up to the routine being unwound into. In practice, the Microsoft C/C++ compiler does not use the return value at all in typical cases. Usually, the Microsoft C/C++ compiler will indicate the exception code that caused the exception as the return value, but due to how unwinding across functions works with try/except, there is no language-level support for retrieving the return value of a function that has been unwound due to an exception. As a result, in most circumstances, the return value placed in the unwound execution context based on this argument is ignored.
- OriginalContext describes an out-only pointer to a context record that is updated with the execution context as procedure call frames are unwound. In practice, as RtlUnwindEx does not ever “return” to its caller, this value is typically only provided as a way for a caller to supply its own storage to be used as scratch space by RtlUnwindEx during the intermediate unwind operations comprimising an unwind to the target call frame. Typically, the context record passed in to an exception handler from the exception dispatcher is supplied. Because the initial contents of the OriginalContext argument are not used, however, this argument need not necessarily be the context record passed in from the exception dispatcher.
- HistoryTable describes a cache used to improve the performance of repeated function entry lookups via RtlLookupFunctionEntry. Under normal circumstances, this is the same history table passed in from the exception dispatcher to an exception handler, although it could also be a caller-allocated structure as well. This argument can also be safely omitted entirely, although if a non-trivial set of call frames are being unwound, passing in even a newly-initialized history table may improve performance.
Given all of the above information, RtlUnwindEx performs a procedure call unwind by performing a successive sequence of RtlVirtualUnwind calls (to determine the execution context of the next call frame in the call stack), followed by a call to the registered language handler for the call frame (if one exists and is marked for unwinding support). In most cases where there is a language unwind handler, it will point to _C_specific_handler, which internally searches all of the internal exception handling scopes (e.g. try/except or try/finally constructs), calling “finally” handlers as need be. There may also be internal unwind handlers that are present in the scope table for a particular function, such as for C++ destructor support (assuming asynchronous C++ exception handling has been enabled). Most users will thus interact with unwind handlers in the form of a “finally” handler in a try/finally construct in a function whose language handler refers to _C_specific_handler.
If RtlUnwindEx encounters a “leaf function” during the unwind process (a leaf function is a function that does not use the stack and calls no subfunctions), then it is possible that there will be no matching RUNTIME_FUNCTION entry for the current call frame returned by RtlLookupFunctionEntry. In this case, RtlUnwindEx assumes that the return address of the current call frame is at the current value of Rsp (and that the current call frame has no unwind or exception handlers). Because the x64 calling convention enforces hard rules as to what functions without RUNTIME_FUNCTION registrations can do with the stack, this is a valid assumption for RtlUnwindEx to make (and a necessary assumption, as there is no way to call RtlVirtualUnwind on a function with no matching RUNTIME_FUNCTION entry). The current call frame’s value of Rsp (in the context record describing the current call frame, not the register value of rsp itself within RtlUnwindEx) is dereferenced to locate the call frame’s return address (Rip value), and the saved Rsp value is then adjusted accordingly (increased by 8 bytes).
When RtlUnwindEx locates the endpoint frame of the unwind, a special flag (EXCEPTION_TARGET_UNWIND) is set in the ExceptionFlags member of the EXCEPTION_RECORD passed to the language handler. This flag indicates to the language handler (and possibly any C-language scope handlers) that the handler is being called as the “final destination” of the unwind operation. The Microsoft C/C++ compiler does not expose functionality to detect whether a “finally” handler is being called in the context of a target unwind or if the “finally” handler is simply being called as an intermediate step towards the unwind target.
After the last unwind handler (if applicable) has been called, RtlUnwindEx restores the execution context that has been continually updated by successive calls to RtlVirtualUnwind. This restoration is performed by a call to RtlRestoreContext (a documented, exported function), which simply transfers a given context record to the thread’s execution context (thus “realizing” it).
RtlUnwindEx does not return a value to its caller. In fact, it typically does not return to its caller at all; the only “return” path for RtlUnwindEx is in the case where the passed-in execution context is corrupted (typically due to a bogus stack pointer), or if an exception handler does something illegal (such as returning an unrecognized EXCEPTION_DISPOSITION) value. In these cases, RtlUnwindEx will raise a noncontinuable exception describing the problem (via RtlRaiseStatus). These error conditions are usually fatal (and are indicative of something being seriously corrupted in the process), and virtually always result in the process being terminated. As a result, it is atypical for a caller of RtlUnwindEx to attempt to handle these error cases with an exception handler block.
In the case where RtlUnwindEx performs the requested unwind successfully, a new execution context describing the state at the requested (unwound) call frame is directly realized, and as such RtlUnwindEx does not ever truly return in the success case.
Although RtlUnwindEx is principally used in conjunction with exception handling, there are other use cases implemented by the Microsoft C/C++ compiler which internally rely upon RtlUnwindEx in unrelated capacities. Specifically, RtlUnwindEx implements the core of the standard setjmp and longjmp routines (assuming the exception safe versions of these are enabled by use of the <setjmpex.h> header file) provided by the C runtime library in the Microsoft CRT.
In the exception-safe setjmp/longjmp case, the jmp_buf argument essentially contains an abridged version of the execution context (specifically, volatile register values are omitted). When longjmp is called, the routine constructs an EXCEPTION_RECORD with STATUS_LONGJUMP as the exception code, sets up one exception information parameter (which is a pointer to the jmp_buf), and passes control to RtlUnwindEx (for the curious, the x64 version of the jmp_buf structure is described as _JUMP_BUFFER in setjmp.h under the _M_AMD64_ section). In this particular instance, the ReturnValue argument of RtlUnwindEx is significant; it corresponds to the value that is seemingly returned by setjmp when control is being transferred to the saved setjmp context as part of a longjmp call (somewhat similar in principal as to how the UNIX fork system call indicates whether it is returning to the child process or the parent process). The internal operations of RtlUnwindEx are identical whether it is being used for the implementation of setjmp/longjmp, or for conventional exception-handler-based triggering of procedure call frame unwinding.
However, there are differences that appear when RtlUnwindEx restores the execution context via RtlRestoreContext. There is special support inside RtlRestoreContext for STATUS_LONGJUMP exceptions with one exception information parameter; if this situation is detected, then RtlRestoreContext internally reinitializes portions of the passed-in context record based on the jmp_buf pointer stored in the exception information parameter block of the exception record provided to RtlRestoreContext by RtlUnwindEx. After this special-case partial reinitialization of the context record is complete, RtlRestoreContext realizes the context record as normal (causing execution control to be transferred to the stored Rip value). This can be seen as a hack (and a violation of abstraction layers; there is intended to be a logical separation between operating system level SEH support, and language level SEH support; this special support in RtlRestoreContext blurs the distinction between the two for C language support with the Microsoft C/C++ compiler). This layering violation is not the most egregious in the x64 exception handling scene, however.
This concludes the basic overview of the interface provided by RtlUnwindEx. There are some things that I have not yet covered, such as exit unwinds, collided unwinds, or the deep integration and support for C++ try/catch, and some of the highly unsavory things done in the name of C++ exception support. Next time: A walkthrough of the complete internal implementation of RtlUnwindEx, including undocumented, never-before-seen (or barely documented) corner cases like exit unwinds or collided unwinds (the internals of C++ exception support from the perspective of RtlUnwindEx are reserved for a future posting, due to size considerations).