Yesterday’s article described how VMKD currently communicates with DbgEng.dll in order to complete the high-speed connection between a local kernel debugger and the KD stub code running in a VM. At this point, VMKD is essentially operational, with significant improvements over conventional virtual serial port kernel debugging.
That is not to say, however, that VMKD leaves no room for improvement. There are a number of areas where significant steps forward could be taken with respect to either performance or end user experience, given a native (in the OS and in VMMs) implementation of the basic concepts behind VMKD. For example, despite the greatly accelerated data rate of VMKD-style kernel debugging, the 1394 kernel debugger transport still outpaces it for writing dump files. (Practically speaking, all operations except writing dump files are much faster on VMKD when compared to 1394.)
This is because the 1394 KD transport can “cheat” when it comes to physical memory reads. As the reader may or may not be aware, 1394 essentially provides an interface to directly access the target’s raw physical memory. DbgEng takes advantage of this capability, and overrides the normal functionality for reading physical memory on the target. Where all other transports send a multitude of DbgKdReadPhysicalMemoryApi packets to the target computer, requesting chunks of physical memory 4000 bytes at a time (4000 bytes is the maximum size of a KD packet across any transport), the 1394 KD client in DbgEng simply pulls the target computer’s physical memory directly “off the wire”, without needing to invoke the DbgKdReadPhysicalMemoryApi request for every 4000 bytes.
This optimization turns out to yield very large performance improvements when reading physical memory, as a request to write a dump file is at heart just a large memcpy request, asking to copy the entire contents of the target computer’s physical memory to the debugger so that the data can be written to a file. The 1394 KD client approach greatly reduces the amount of code that needs to run for every 4000 bytes of memory, especially in the VM case, where every KD request and response pair involves separate VM exits and all the code that such operations entail, on top of all the guest-side processing logic for handling the DbgKdReadPhysicalMemoryApi request and sending the response data.
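To make the cost difference concrete, here is a minimal sketch that models both approaches against a simulated target. It is not how DbgEng is implemented; `SimulatedTarget`, `read_physical_memory_api`, and the two `dump_via_*` functions are hypothetical names invented for illustration, and the only figure taken from the text is the 4000-byte packet limit.

```python
KD_MAX_CHUNK = 4000  # maximum KD packet payload, as cited in the text


class SimulatedTarget:
    """Hypothetical stand-in for a debug target that counts KD round trips."""

    def __init__(self, physical_memory: bytes):
        self.physical_memory = physical_memory
        self.round_trips = 0

    def read_physical_memory_api(self, address: int, length: int) -> bytes:
        # Models one DbgKdReadPhysicalMemoryApi request/response pair.
        if length > KD_MAX_CHUNK:
            raise ValueError("request exceeds the KD packet size")
        self.round_trips += 1
        return self.physical_memory[address:address + length]


def dump_via_kd_packets(target: SimulatedTarget, total_bytes: int) -> bytes:
    """Packet-based transports: one round trip per 4000-byte chunk."""
    out = bytearray()
    address = 0
    while address < total_bytes:
        length = min(KD_MAX_CHUNK, total_bytes - address)
        out += target.read_physical_memory_api(address, length)
        address += length
    return bytes(out)


def dump_via_direct_access(target: SimulatedTarget, total_bytes: int) -> bytes:
    """1394-style direct access: memory is copied with no per-chunk KD request."""
    return bytes(target.physical_memory[:total_bytes])


def round_trips_needed(total_bytes: int, chunk: int = KD_MAX_CHUNK) -> int:
    """Ceiling division: number of KD requests needed to cover total_bytes."""
    return -(-total_bytes // chunk)
```

Under this model, dumping a guest with 512 MiB of physical memory takes `round_trips_needed(512 * 1024 * 1024)`, or 134,218 request/response pairs over a packet-based transport, each of which implies VM exits in the virtual machine case. The direct-access path takes none.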
The same sort of optimization can of course be done in principle for virtual machine kernel debugging, but DbgEng lacks a pluggable interface to perform the highly optimized transfer of raw physical memory contents across the wire. One optimization that could be done without the assistance of DbgEng would be to locally interpret the DbgKdReadPhysicalMemoryApi request VMM-side and handle it without ever passing the request on to the guest-side code, but even this is suboptimal, as it still requires a round trip (admittedly a short one for a local KD process) for every 4000 bytes of physical memory. If the DbgEng team stepped up to the plate and provided such an extensible interface, it would be much easier to provide the sort of speeds that one sees with writing dumps based on local KD.
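The VMM-side interception idea can be sketched roughly as follows. This is an illustration under stated assumptions, not VMKD's code: `KdRequest`, `VmmKdFilter`, and the `forward_to_guest` callback are hypothetical, and the API is identified by a symbolic string rather than the real numeric packet code.

```python
from dataclasses import dataclass
from typing import Callable

READ_PHYSICAL_MEMORY = "DbgKdReadPhysicalMemoryApi"  # symbolic name, not the wire value
KD_MAX_CHUNK = 4000  # maximum KD packet payload, as cited in the text


@dataclass
class KdRequest:
    """Hypothetical, simplified view of a KD request from the debugger."""
    api: str
    address: int = 0
    length: int = 0


class VmmKdFilter:
    """Answers physical memory reads VMM-side; forwards everything else."""

    def __init__(self, guest_physical_memory: bytes,
                 forward_to_guest: Callable[[KdRequest], bytes]):
        self.guest_physical_memory = guest_physical_memory
        self.forward_to_guest = forward_to_guest
        self.handled_locally = 0
        self.forwarded = 0

    def handle(self, request: KdRequest) -> bytes:
        if request.api == READ_PHYSICAL_MEMORY:
            # Served from the VMM's own view of guest memory: no VM entry,
            # and no guest-side KD stub code runs for this request.
            self.handled_locally += 1
            length = min(request.length, KD_MAX_CHUNK)
            start = request.address
            return self.guest_physical_memory[start:start + length]
        # Any other request still has to reach the guest-side KD code.
        self.forwarded += 1
        return self.forward_to_guest(request)
```

Note that the debugger still issues one request per 4000 bytes in this scheme. The filter removes the guest-side work, not the round trips themselves, which is why a bulk-transfer interface in DbgEng would still be the better fix.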
Another enhancement that could be done Microsoft-side would be a better interface for replacing KD transport modules. Right now, because ntoskrnl is statically linked to KDCOM.DLL, the OS loader has a hardcoded hack that interprets the KD type in the OS loader options, loads one of the modules “kdcom.dll”, “kd1394.dll”, or “kdusb2.dll” (the filenames are hardcoded), and inserts it into the loaded module list under the name “kdcom.dll”. Additionally, the KD transport module appears to be guarded by PatchGuard on Windows x64 editions (at least from the standpoint of PatchGuard 3), and on Windows Vista, Winload.exe enforces a signature check on the KD transport module. These checks are, unfortunately, not particularly conducive to allowing a third party to easily plug themselves into the KD transport path. (Unless virtualization vendors standardize on a way to signal the VMM that the guest wants attention, each virtualization platform is likely to need some slightly different code to effect a VM exit on each KdSendPacket and KdReceivePacket operation.)
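The loader behavior described above amounts to a fixed lookup followed by a rename. A rough model of that logic, where the three filenames come from the text but the option keys (`"com"`, `"1394"`, `"usb"`) and the function name are assumptions made for illustration:

```python
# Filenames are the ones named in the text; the dictionary keys are
# hypothetical stand-ins for however the loader encodes the KD type.
KD_TRANSPORT_MODULES = {
    "com": "kdcom.dll",
    "1394": "kd1394.dll",
    "usb": "kdusb2.dll",
}

# Whatever file is chosen, it is registered under this name so that
# ntoskrnl's static import of kdcom.dll resolves against it.
LOADED_MODULE_NAME = "kdcom.dll"


def select_kd_transport(kd_type: str) -> dict:
    """Model of the hardcoded selection: no extension point for other modules."""
    try:
        file_on_disk = KD_TRANSPORT_MODULES[kd_type.lower()]
    except KeyError:
        raise ValueError(f"no built-in KD transport for {kd_type!r}") from None
    return {"file_on_disk": file_on_disk, "loaded_module_name": LOADED_MODULE_NAME}
```

The point of the sketch is the `ValueError` branch: with a closed table like this, a third-party transport has no supported way to get itself selected, before the signature and PatchGuard checks even come into play.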
Similarly, there are a number of enhancements that virtualization platform vendors could make VMM-side to make the VMKD-style approach more performant. For example, documented pluggable interfaces for communicating with the guest would be a huge step forward (although the virtualization vendor could just implement the whole KD transport replacement themselves instead of relying on a third party solution). VMware appears to be exploring this approach with VMCI, although this interface is unfortunately not supported on VMware Server or any other platforms besides VMware Workstation 6 to the best of my knowledge. Additionally, VMM authors are in the best position to provide documented and supported interfaces to allow pluggable code designed to interface with a VMM to directly access the register, physical, and virtual memory contexts of a given VM.
Virtualization vendors are also in a better position to integrate the installation and activation process for VMM plugins than a third party operating with no support or documentation. For example, the clumsy vmxinject.exe approach that VMKD takes to load its plugin code into the VMware VMM could be completely eliminated by a native architecture for installing, configuring, and loading VMM plugins (VMCI promises to take care of some of this, though not entirely to the extent that I’d hope).
I would strongly encourage Microsoft and virtualization vendors to work together on this front, as at least from the standpoint of the debugging experience (debugging being a non-trivial, popular use of virtual machines), there’s significant potential for a better customer experience in the VM kernel debugging arena with a little cooperation here and there. VMKD is essentially a proof of concept showing that fast kernel debugging is absolutely technically possible for Windows virtual machines. Furthermore, with “inside knowledge” of either the kernel or the VMM, it would likely be trivial to implement the sort of pluggable interfaces that would have made the development and testing of VMKD a virtual walk in the park. In other words, if VMKD can be done without help from either Microsoft or VMware, it should be simple for virtualization vendors and Microsoft to implement similar functionality if they work together.
Next time: Parting shots, and thoughts on other improvements beyond simply fast kernel debugging in the virtualization space.