Yesterday, I expounded on the basics of how assemblies for scripts are structured, and how variables, subroutines, and IR instructions are managed throughout this process.
Nothing beats a good concrete example, though, so let’s examine a sample subroutine, first in NWScript source text form, then in MSIL form, and finally in JIT’d amd64 form.
Example subroutine
For the purposes of this example, we’ll take the following simple NWScript subroutine:
int g_randseed = 0;

int rand()
{
    return g_randseed = (g_randseed * 214013 + 2531101) >> 16;
}
Here, we have a global variable, g_randseed, that is used by our random number generator. Because this is a global variable, it will be stored as an instance variable on the main program class of the script program, as we’ll see when we crack open the underlying IL for this subroutine:
MSIL version
.method private instance int32 NWScriptSubroutine_rand() cil managed
{
  // Code size 110 (0x6e)
  .maxstack 8
  .locals init (int32 V_0, uint32 V_1, int32 V_2, int32 V_3, int32 V_4)
  IL_0000: ldarg.0
  IL_0001: ldarg.0
  IL_0002: ldfld uint32 m_CallDepth
  IL_0007: ldc.i4.1
  IL_0008: add
  IL_0009: dup
  IL_000a: stloc.1
  IL_000b: stfld uint32 m_CallDepth
  IL_0010: ldloc.1
  IL_0011: ldc.i4 0x80
  IL_0016: clt.un
  IL_0018: brtrue.s IL_0025
  IL_001a: ldstr "Maximum call depth exceeded."
  IL_001f: newobj instance void System.Exception::.ctor(string)
  IL_0024: throw
  IL_0025: ldarg.0
  IL_0026: ldfld int32 m__NWScriptGlobal4
  IL_002b: stloc.2
  IL_002c: ldc.i4 0x343fd
  IL_0031: stloc.3
  IL_0032: ldloc.2
  IL_0033: ldloc.3
  IL_0034: mul
  IL_0035: stloc.s V_4
  IL_0037: ldc.i4 0x269f1d
  IL_003c: stloc.2
  IL_003d: ldloc.s V_4
  IL_003f: ldloc.2
  IL_0040: add
  IL_0041: stloc.3
  IL_0042: ldc.i4 0x10
  IL_0047: stloc.s V_4
  IL_0049: ldloc.3
  IL_004a: ldloc.s V_4
  IL_004c: shr
  IL_004d: stloc.2
  IL_004e: ldloc.2
  IL_004f: stloc.3
  IL_0050: ldarg.0
  IL_0051: ldloc.3
  IL_0052: stfld int32 m__NWScriptGlobal4
  IL_0057: ldloc.2
  IL_0058: stloc.0
  IL_0059: br IL_005e
  IL_005e: ldarg.0
  IL_005f: ldarg.0
  IL_0060: ldfld uint32 m_CallDepth
  IL_0065: ldc.i4.m1
  IL_0066: add
  IL_0067: stfld uint32 m_CallDepth
  IL_006c: ldloc.0
  IL_006d: ret
} // end of method // ScriptProgram::NWScriptSubroutine_rand
That’s a lot of code! (Actually, it turns out to be not that much when the IL is JIT’d, as we’ll see.)
Right away, you’ll probably notice some additional instrumentation in the generated subroutine; there is an instance variable on the main program class, m_CallDepth, that is being used. This is part of the best-effort instrumentation that the JIT backend inserts into JIT’d programs so as to catch obvious programming mistakes before they take down the script host completely.
In this particular case, the JIT’d code is instrumented to keep track of the current call depth in m_CallDepth. Should the current call depth exceed a maximum limit (which, incidentally, is the same limit that the interpretive VM imposes), a System.Exception is raised to abort the script program.
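To make the guard logic easier to follow, here is a rough Python model of what the instrumented prologue and epilogue do. This is only a sketch: the ScriptProgram class and the recurse subroutine are hypothetical stand-ins, and only m_CallDepth, the 0x80 constant, and the exception message come from the IL above.

```python
MAX_CALL_DEPTH = 0x80  # constant loaded by "ldc.i4 0x80" in the IL


class ScriptProgram:
    """Hypothetical stand-in for the generated main program class."""

    def __init__(self):
        self.m_CallDepth = 0

    def recurse(self, remaining):
        """Hypothetical subroutine that calls itself `remaining` more times."""
        # Prologue: bump the depth, then compare it against the limit
        # (add / clt.un / brtrue.s in the IL).
        self.m_CallDepth += 1
        if not (self.m_CallDepth < MAX_CALL_DEPTH):
            raise Exception("Maximum call depth exceeded.")

        result = 0 if remaining == 0 else 1 + self.recurse(remaining - 1)

        # Epilogue: undo the increment on the normal return path only,
        # mirroring the IL, which has no finally block around the body.
        self.m_CallDepth -= 1
        return result
```

In this model, a chain of 127 nested calls completes normally and leaves m_CallDepth back at zero, while the 128th nested call raises the exception.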
This brings up a notable point, in that the generated IL code is designed to be safely aborted at any time by raising a System.Exception. An exception handler wrapping the entry point catches the exception, and the default return code for the script is returned up to the caller if a script is aborted in this way.
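The abort-and-recover pattern can be sketched in a few lines of Python. Everything named here (run_script, DEFAULT_RETURN_CODE and its value of 0, and the two sample script bodies) is an assumption made for illustration, not the actual entry point code that the JIT backend generates.

```python
DEFAULT_RETURN_CODE = 0  # assumed value, for illustration only


def run_script(script_body, default_return_code=DEFAULT_RETURN_CODE):
    """Hypothetical entry-point wrapper: run the body, absorb any abort."""
    try:
        return script_body()
    except Exception:
        # The script was aborted part-way through; hand the default
        # return code back up to the caller instead of propagating.
        return default_return_code


def well_behaved():
    """Hypothetical script body that finishes normally."""
    return 7


def aborted():
    """Hypothetical script body that trips the call depth guard."""
    raise Exception("Maximum call depth exceeded.")
```

Here run_script(well_behaved) passes the script’s own return value through, while run_script(aborted) yields the default return code, so the script host only ever sees an ordinary return value.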
Looking back to the generated code, we can see that the basic operations that we would expect are all there; there is code to load the current value of g_randseed (m__NWScriptGlobal4 in this case), multiply it by a fixed constant (0x343fd, or 214013 as we see in the NWScript source text), then perform the addition (of 0x269f1d, or 2531101) and the right shift, before finally storing the result back to g_randseed (m__NWScriptGlobal4 again) and returning. (Whew, that’s it!)
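As a sanity check on that reading of the IL, the same sequence can be modelled in Python. This is a sketch under a couple of assumptions: the to_int32 helper is my own addition to imitate 32-bit wraparound (Python integers do not overflow), and the ScriptProgram class is a hypothetical stand-in that omits the call depth guard.

```python
def to_int32(value):
    """Wrap a Python int to a signed 32-bit value, like IL int32 arithmetic."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


class ScriptProgram:
    """Hypothetical stand-in for the generated main program class."""

    def __init__(self):
        self.m__NWScriptGlobal4 = 0  # g_randseed in the NWScript source

    def rand(self):
        seed = self.m__NWScriptGlobal4        # ldfld m__NWScriptGlobal4
        product = to_int32(seed * 0x343FD)    # mul by 214013
        total = to_int32(product + 0x269F1D)  # add 2531101
        result = total >> 0x10                # shr (arithmetic right shift)
        self.m__NWScriptGlobal4 = result      # stfld m__NWScriptGlobal4
        return result
```

Starting from a seed of 0, the first three calls in this model return 38, 162, and 567, with each result also becoming the new seed.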
Even though there are still a lot of loads and stores here, most of them disappear once the CLR JIT compiles the MSIL to native code. To see this in action, let’s look at the same code, now translated into amd64 instructions by the CLR JIT. Here, I used !sos.u from the sos.dll debugger extensions (the instructions are colored using the same coloring scheme as I used above):
0:007> !u 000007ff`001cbac0
Normal JIT generated code
NWScriptSubroutine_rand()
Begin 000007ff001cbac0, size 7e
push rbx
push rdi
sub rsp,28h
mov rdx,rcx
mov eax,dword ptr [rdx+1Ch]
lea ecx,[rax+1]
mov dword ptr [rdx+1Ch],ecx
xor eax,eax
cmp ecx,80h
setb al
test eax,eax
je 000007ff`001cbb07
mov eax,dword ptr [rdx+34h]
imul eax,eax,343FDh
lea ecx,[rax+269F1Dh]
sar ecx,10h
mov dword ptr [rdx+34h],ecx
mov eax,dword ptr [rdx+1Ch]
dec eax
mov dword ptr [rdx+1Ch],eax
mov eax,ecx
add rsp,28h
pop rdi
pop rbx
ret
lea rdx,[000007ff`001f3fd8]
mov ecx,70000005h
call clr!JIT_StrCns
mov rbx,rax
lea rcx,[mscorlib_ni+0x4c6d28]
call clr!JIT_TrialAllocSFastMP_InlineGetThread
mov rdi,rax
mov rdx,rbx
mov rcx,rdi
call mscorlib_ni+0x376e20 (System.Exception..ctor(System.String)
mov rcx,rdi
call clr!IL_Throw
nop
(If you’re curious, this was generated with the .NET 4 JIT.)
Essentially every one of the fundamental operations was turned into just a single amd64 instruction by the JIT compiler. Not bad at all! (Aside from the usual function prologue and epilogue, the rest of the code you see here is the recursion guard.)