Process-level security is not a particularly great way to enforce DRM when users own their own hardware.

May 8th, 2007

Recently, I discussed the basics of the new “process-level security” mechanism introduced with Windows Vista (integrity levels, otherwise known as “mandatory integrity control”, or MIC for short).

Although MIC, when combined with more conventional user-level access control, has the potential to improve security for users to an extent, it is ultimately not a mechanism for locking users out of their own computers.

As you might have guessed by this point, I am speaking of the rather less savory topic of DRM. MIC might appear to be attractive to developers that wish to deploy a DRM system, but it really doesn’t provide a particularly effective way to stop a computer owner (administrator) from, well, administering their system.

MIC (and process-level security in general) may, on the surface, appear to be a good way to accomplish this goal. After all, the process-level security model allows securable objects (such as processes) to be guarded against other processes, even those running under the same user SID, which is typically the kind of restriction that a software-based DRM system tries to enforce (e.g. preventing you from debugging a program).

However, it is important to consider that the sort of restrictions imposed by process-level security mechanisms are designed to protect programs from other programs. They are not supposed to protect programs from the user that controls the computer on which they run (in other words, the computer administrator or whatever you wish to call it).

Windows Vista attempts to implement such a (DRM) protection scheme, loosely based on the principles of process-level security, in the form of something called “protected processes”.

If you look through the Vista SDK headers (specifically, winnt.h), you may come across a particularly telling comment that would seem to indicate that protected processes were originally intended to be implemented via the MIC scheme for process-level security in Vista:

#define SECURITY_MANDATORY_LABEL_AUTHORITY       {0,0,0,0,0,16}
#define SECURITY_MANDATORY_UNTRUSTED_RID         (0x00000000L)
#define SECURITY_MANDATORY_LOW_RID               (0x00001000L)
#define SECURITY_MANDATORY_MEDIUM_RID            (0x00002000L)
#define SECURITY_MANDATORY_HIGH_RID              (0x00003000L)
#define SECURITY_MANDATORY_SYSTEM_RID            (0x00004000L)
#define SECURITY_MANDATORY_PROTECTED_PROCESS_RID (0x00005000L)

//
// SECURITY_MANDATORY_MAXIMUM_USER_RID is the highest RID
// that can be set by a usermode caller.
//

#define SECURITY_MANDATORY_MAXIMUM_USER_RID \
   SECURITY_MANDATORY_SYSTEM_RID

As it turns out, protected processes are not actually implemented using the integrity level/MIC mechanism on Vista; instead, a separate mechanism marks protected processes as “untouchable” by “normal” processes. The likely reason is the lack of flexibility in the integrity level ACE scheme as far as specifying which access rights are permitted. If you read the linked article and the paper it includes, there is a new set of access rights defined specially for dealing with protected processes, which are deemed “safe”; these access rights are requestable for such processes, unlike the standard access rights, and there isn’t a good way to convey this with the set of “allow/deny read/write/execute” options available with an integrity level ACE on Vista.

The end result is, however, for the most part the same: “protected processes” are essentially to high integrity (or lower) processes as high (or medium) integrity processes are to low integrity processes; that is, they cannot be adversely affected by a lesser-trusted process.
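
To make the distinction concrete, here’s a quick test program (my own illustration, not something from the linked paper) that tries to open a target process first with one of the Vista-era “safe” rights, and then with a conventional right that protected processes do not grant to normal callers:

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

// Try to open the process identified by the PID on the command line,
// and report whether the requested access was granted.
static void TryOpen(DWORD Pid, DWORD Access, const char *Label)
{
 HANDLE Process = OpenProcess(Access, FALSE, Pid);

 if (Process)
 {
  printf("%s: granted\n", Label);
  CloseHandle(Process);
 }
 else
 {
  printf("%s: denied (error %lu)\n", Label, GetLastError());
 }
}

int main(int argc, char **argv)
{
 DWORD Pid = (argc > 1) ? (DWORD)atoi(argv[1]) : 0;

 //
 // PROCESS_QUERY_LIMITED_INFORMATION is one of the Vista-era "safe"
 // rights; it should be grantable even for a protected process.
 //

 TryOpen(Pid, PROCESS_QUERY_LIMITED_INFORMATION, "limited query");

 //
 // PROCESS_VM_READ is not on the "safe" list; for a protected process
 // this open is expected to fail with access denied, even when the
 // caller is an administrator.
 //

 TryOpen(Pid, PROCESS_VM_READ, "vm read");

 return 0;
}

Run against a protected process, the expectation is that the limited query succeeds while the PROCESS_VM_READ open fails with access denied, even from an elevated command prompt.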

This is where the system begins to break down, though. Process integrity is an interesting way to attempt to curtail malware and exploits because the human at the computer (presumably) does not wish such activity to occur. On the other hand, DRM attempts to prevent the human at their computer from performing an action that they (ostensibly) do in fact wish to perform, with their own computer.

This is a fundamental distinction. The difference is that the malware or exploit code that process level security is designed to defend against doesn’t have the benefit of a human with physical (or administrative) access to the computer in question. That little detail turns out to make a world of difference, as we humans aren’t necessarily constrained by the security system like a program would be. For instance, if some evil exploit code running as a low integrity process on a computer wants to gain administrative access to the box, it just can’t do so (excepting the possibility of local privilege escalation exploits or trying to social-engineer the user into giving the program said access – for the moment, ignore those attack vectors, though they are certainly real ones that must be dealt with at some point).

However, if I am a human sitting at my computer, and I am logged on as a “plain user” and wish to perform an administrative task, I am not so constrained. Instead, I simply either log out and log back in as an administrative user (using my administrative account password), or type my password into an elevation prompt. Problem solved!

Now, of course, the protected process mechanism in Vista isn’t quite that dumb. It does try to block administrators from gaining access to protected processes; direct attempts will return STATUS_ACCESS_DENIED. However, again, humans can be a bit more clever here. For one, a user (and by user, I mean a person with full control over their computer) who is intent on bypassing the protected process mechanism could simply load a driver designed to subvert it.

The DRM system might counter that attack by requiring kernel mode code to be signed, on the theory that for wide-scale violations of the DRM system in such a manner, a “cracker” would need to obtain a code-signing certificate, making them more easily identifiable and vulnerable to legal attack.

However, people are clever (and, more specifically, people with physical or administrative access to a computer are not necessarily constrained by the basic “rules” of the operating system). One could imagine somebody patching out the driver signing checks on disk, or any number of other approaches. The theoretical counter to attacks like that would be some sort of hardware support to verify the boot process and ensure that only trusted, signed (and thus unmodified by a “cracker”) code can boot the system. Even that is not necessarily foolproof, though; what’s to say that nobody has compromised the task-offload engine on the system’s NIC to run custom code with full physical memory access, outside the confines of the operating system entirely? Free rein over something capable of performing DMA to physical memory means that kernel code and data can be freely rewritten.

Now, where am I going with all of this? I suppose that I am just frustrated that certain people seem to want to continue to invest significant resources into systems that try to wrest control of a computer from the end user, systems that are simply doomed to fail by the very nature of the diverse and uncontrolled systems upon which that code will run (and which sometimes compromise the security of customer systems in the process). I don’t think the people behind the protected process system at Microsoft are stupid, not by any means. However, I can’t help but feel that they know they’re fighting a losing battle, and that their knowledge and expertise would be better spent on more productive things (like working to improve the next release of Windows, or what-have-you).

Now, a couple of parting shots in an effort to quell several potential misconceptions before they begin:

  • I am not advocating that people bypass DRM. This is probably less than legal in many places. I am, however, trying to make a case for the fact that trying to use security models originally designed to protect users from malware as a DRM mechanism is at best a bad idea.
  • I’m also not trying to downplay the negative impact of theft of copyrighted materials, or anything of that sort. As a programmer myself, I’m well aware that if nobody will buy your product because it’s pirated all over the world, then it’s hard to eke out a living. However, I do believe that it is a fallacy to say that it’s impossible to make money out of software or content in the Internet age without layer after layer of customer-unfriendly DRM.
  • I’m not trying to knock the rest of the improvements in Vista (or the start of process-level security being deployed to joe end user, even though it’s probably not yet perfect). There’s a lot of good work that’s been done with Vista, and despite the (ill-conceived, some might say) DRM mechanisms, there is real value that has been added with this release.
  • I’m also not trying to say that Microsoft is devoting so much of its time to DRM that it isn’t paying any attention to adding real value to its products. However, in my view, most of the time spent on DRM is time that could be better spent adding that “real value”, instead of playing the game of security by obscurity (with today’s systems, that is really all you can do, when it comes down to it) against some enigmatic idea of a “cracker” out there intent on stealing every piece of software or content they get their hands on and redistributing it to every person in the world for free.
  • I’m also not trying to state that the kernel mode code signing requirements for x64 Vista are entirely motivated by DRM (or that all it’s good for is an attempt to enforce DRM), but I doubt that anyone could truthfully say that DRM played no part in the decision to require signed drivers on x64 Vista either. Regardless, there remain other reasons for ostensibly requiring signed code besides trying to block (or at least hold accountable) attempts to bypass the protected process system.

Tricks for getting the most out of your minidumps: Including specific memory regions in a dump

May 4th, 2007

If you’ve ever worked on any sort of crash reporting mechanism, one of the constraints that you are probably familiar with is the size of the dump file it creates. Obviously, as developers, we’d really love to write a full dump including the entire memory image of the process, full data about all threads and handles (and the like), but this is often impractical in the real world (particularly if you are dealing with some sort of automated crash submission system, which needs to be as unintrusive as possible, including not requiring the transfer of 50MB .dmp files).

One way you can improve the quality of the dumps your program creates, without making the resulting .dmp unacceptably large, is to use a bit of intelligence as to which parts of memory you’re interested in. After all, while the entire address space of the process at the time of a crash is certainly potentially useful, chances are you won’t really need it all to track down the issue. Often, a stack trace (plus a listing of threads) is enough, which is more along the lines of what you see when you make a fairly minimalistic minidump.

However, there are lots of times when the little piece of state information that might explain how your program got into its crashed state isn’t on the stack, leaving you stuck without additional information. An approach that can sometimes help is to include specific, “high-value” regions of memory in the dump. For example, something that can often be helpful (especially in these days of custom calling conventions that try to avoid using the stack wherever possible) is to include a small portion of the memory around each location that a register in the faulting context points to.

The idea here is that when you’re going to write a dump, you check each register in the faulting context to see if it points to a valid location in the address space of the crashed process. If so, you can include a bit of memory (say, +/- 128 bytes, or some other small amount, from the register’s value) in the dump. On x86, you can optimize this a bit further and typically leave out eip/esp/ebp, as well as any register that points into an executable section of an image, on the assumption that you’ll probably be able to grab any relevant images from the symbol server (you are using a symbol repository with your own binaries included, aren’t you?) and thus don’t need to waste space with that code in the dump.

One class of problem that this can be rather helpful in debugging is a crash where you have some sort of structure or class that is getting used in a partially valid state, and you need the contents of the struct/class to figure out just what happened. In many cases, you can probably infer the state of your mystery class/struct from what other threads in the program were doing, but sometimes this isn’t possible. In those cases, having access to the class/struct that was being ‘operated upon’ is a great help, and oftentimes you’ll find code where there is a `this' pointer to an address on the heap that is tantalizingly present in the current register context. If you were using a typical minimalistic dump, then you would probably not have access to heap memory (due to size constraints) and might find yourself out of luck. If you included a bit of memory around each register when the crash occurred, however, that just might give you the extra data points needed to figure out the problem. (Determining which registers “look” like a pointer is easily accomplished with several calls to VirtualQueryEx on the target, taking each crash-context register value as an address in the target process and checking whether it refers to a committed region.)
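
As a rough sketch of that register test (my own helper, not code from any particular crash reporter; the +/- 128 byte window matches the example figure above):

#include <windows.h>

// Given a register value captured from the crashed process's context,
// decide whether it looks like a pointer into committed memory, and if
// so, return a small region around it to include in the dump.
// "Process" is a handle to the (external) crashed process.
BOOL RegionAroundRegister(
 HANDLE Process,
 ULONG_PTR RegisterValue,
 PVOID *RegionBase,  // out: base of region to save
 SIZE_T *RegionSize  // out: length of region to save
 )
{
 MEMORY_BASIC_INFORMATION mbi;
 const ULONG_PTR Slack = 128; // +/- 128 bytes around the register
 ULONG_PTR start, end, regionStart, regionEnd;

 if (!VirtualQueryEx(Process, (PVOID)RegisterValue, &mbi, sizeof(mbi)))
  return FALSE;

 //
 // Only committed, accessible memory is interesting.
 //

 if (mbi.State != MEM_COMMIT ||
     (mbi.Protect & (PAGE_NOACCESS | PAGE_GUARD)))
  return FALSE;

 //
 // Skip executable image sections; those binaries should be
 // recoverable from the symbol/binary repository instead.
 //

 if (mbi.Type == MEM_IMAGE &&
     (mbi.Protect & (PAGE_EXECUTE | PAGE_EXECUTE_READ |
                     PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY)))
  return FALSE;

 //
 // Clamp the +/- window to the containing region, for simplicity.
 //

 regionStart = (ULONG_PTR)mbi.BaseAddress;
 regionEnd   = regionStart + mbi.RegionSize;

 start = (RegisterValue > Slack) ? RegisterValue - Slack : 0;
 if (start < regionStart)
  start = regionStart;

 end = RegisterValue + Slack;
 if (end > regionEnd)
  end = regionEnd;

 *RegionBase = (PVOID)start;
 *RegionSize = (SIZE_T)(end - start);
 return TRUE;
}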

Another good use case for this technique is to include information about your program state in the form of including the contents of various key heap- (or global-) based objects that wouldn’t normally be included in the dump. In that case, you probably need to set up some mechanism to convey “interesting” addresses to the crash reporting mechanism before a crash occurs, so that it can simply include them in the dump without having to worry about trying to grovel around in the target’s memory trying to pick out interesting things after-the-fact (something that is generally not practical in an automated fashion, especially without symbols). For example, if you’ve got some kind of server application, you could include pointers to particularly useful per-session-state data (or portions thereof, size constraints considered). The need for this can be reduced somewhat by including useful verbose logging data, but you might not always want to have verbose logging on all the time (for various reasons), which might result in an initial repro of a problem being less than useful for uncovering a root cause.

Assuming that you are following the recommended approach of not writing dumps in-process, the easiest way to handle this sort of communication between the program and the (hopefully isolated in a different process) crash reporting mechanism is to use something like a file mapping that contains a list (or fixed-size array) of pointers and sizes to record in the dump. This makes adding or removing “interesting” pointers from the to-be-included list as simple as adding or removing an entry in a flat array, as the sketch below illustrates.
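
One way that shared list might be laid out (a hypothetical sketch of mine; the structure, names, and section name are not prescribed by anything):

#include <windows.h>

// Hypothetical shared layout: the monitored program fills in entries;
// the out-of-process dump writer reads them at crash time and feeds
// them to its MemoryCallback handler.
#define MAX_SAVED_REGIONS 64

typedef struct _SAVED_REGION
{
 ULONG64 Address; // address in the monitored process
 ULONG   Length;  // bytes to include in the dump
} SAVED_REGION;

typedef struct _SAVED_REGION_LIST
{
 LONG         Count;
 SAVED_REGION Regions[MAX_SAVED_REGIONS];
} SAVED_REGION_LIST;

// In the monitored program: create (or open) the shared section. A
// real implementation would include the process ID in the name so that
// the crash reporter can locate the right list.
SAVED_REGION_LIST *OpenRegionList(void)
{
 HANDLE Section = CreateFileMappingW(INVALID_HANDLE_VALUE,
                                     NULL,
                                     PAGE_READWRITE,
                                     0,
                                     sizeof(SAVED_REGION_LIST),
                                     L"Local\\CrashRegionList");

 if (!Section)
  return NULL;

 return (SAVED_REGION_LIST *)MapViewOfFile(Section,
                                           FILE_MAP_ALL_ACCESS,
                                           0, 0, 0);
}

// Register an "interesting" object so it lands in any future dump.
void RegisterRegion(SAVED_REGION_LIST *List, const void *Address, ULONG Length)
{
 LONG Index = InterlockedIncrement(&List->Count) - 1;

 if (Index < MAX_SAVED_REGIONS)
 {
  List->Regions[Index].Address = (ULONG64)(ULONG_PTR)Address;
  List->Regions[Index].Length  = Length;
 }
}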

As far as including additional memory regions in a minidump goes, this is accomplished by including a MiniDumpCallback function in your call to MiniDumpWriteDump (via the CallbackParam parameter). The minidump callback is essentially a way to perform advanced customizations on how your dump is processed, beyond a set of general behaviors supplied by the DumpType parameter. Specifically, the minidump callback lets you do things like include/exclude all sorts of things from the dump – threads, handle data, PEB/TEB data, memory locations, and more – in a programmatic fashion. The way it works is that as MiniDumpWriteDump is writing the dump, it will call the callback function you supply a number of times to query you for any data you want to add or subtract from the dump. There’s a huge amount of customization you can do with the minidump callback; too much for just this post, so I’ll simply describe how to use it to include specific memory regions.

As far as including memory regions goes, you need to wait for the MemoryCallback event to be passed to your minidump callback. The way the MemoryCallback event works is that you are called back repeatedly, until your callback returns FALSE. Each time you are called back (and return TRUE), you are expected to have updated the CallbackOutput->MemoryBase and CallbackOutput->MemorySize output parameter fields with the base address and length of a region that is to be included in the dump. When your callback finally returns FALSE, MiniDumpWriteDump assumes that you’re done specifying additional memory regions and continues on to the rest of the steps involved in writing the dump.

So, to provide a quick example, assuming you had a DumpWriter class containing an array of address / length pairs, you might use a minidump callback that looks something like this to include those addresses in the dump:

BOOL CALLBACK DumpWriter::MiniDumpCallback(
 PVOID CallbackParam,
 const PMINIDUMP_CALLBACK_INPUT CallbackInput,
 PMINIDUMP_CALLBACK_OUTPUT CallbackOutput
 )
{
 DumpWriter *Writer;
 BOOL        Status;

 Status = FALSE;

 Writer = reinterpret_cast<DumpWriter*>(CallbackParam);

 switch (CallbackInput->CallbackType)
 {

/*
 ... handle other events ...
 */

 case MemoryCallback:
  //
  // If we have some memory regions left to include then
  // store the next. Otherwise, indicate that we're finished.
  //

  if (Writer->Index == Writer->Count)
   Status = FALSE;
  else
  {
   CallbackOutput->MemoryBase =
     Writer->Addresses[ Writer->Index ].Base;
   CallbackOutput->MemorySize =
     Writer->Addresses[ Writer->Index ].Length;

   Writer->Index += 1;
   Status = TRUE;
  }
  break;

/*
 ... handle other events ...
 */
 }

 return Status;
}
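
For completeness, here’s roughly how such a callback might be wired into MiniDumpWriteDump; this sketch assumes MiniDumpCallback is a static member of DumpWriter, and that the dump is being written from a separate watchdog process:

#include <windows.h>
#include <dbghelp.h>

BOOL DumpWriter::WriteDump(HANDLE Process, DWORD ProcessId, HANDLE File)
{
 MINIDUMP_CALLBACK_INFORMATION CallbackInfo;

 //
 // Start from the first recorded region each time a dump is written.
 //

 Index = 0;

 CallbackInfo.CallbackRoutine = &DumpWriter::MiniDumpCallback;
 CallbackInfo.CallbackParam   = this;

 //
 // MiniDumpNormal keeps the dump small; the extra memory regions are
 // added by the MemoryCallback handler shown above.
 //

 return MiniDumpWriteDump(Process,
                          ProcessId,
                          File,
                          MiniDumpNormal,
                          NULL,  // exception information, if any
                          NULL,  // user streams
                          &CallbackInfo);
}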

In a future posting, I’ll likely revisit some of the other neat things that you can do with the minidump callback function (as well as other things you can do to make your minidumps more useful to work with). In the meantime, Oleg Starodumov also has some great documentation (beyond that in MSDN) about just what all the other minidump callback events do, so if you’re finding MSDN a little bit lacking in that department, I’d encourage you to check his article out.

WordPress can be an incredible pain in the ‘p’ tag…

April 30th, 2007

I decided to go check that the front page of the blog still passed the W3C XML validator today (something I try to remember to do at least semi-regularly) and, to my dismay, there were validation failures all over the place.

It seems that WordPress is not particularly intelligent about determining where it should and should not place <p> tags. Depending on how you place your whitespace around the beginning of a list (ol/li) or tt (or another tag whose contents tend to span multiple lines), it seems to have quite an affinity for spewing completely bogus open or close p tags (or closing tags in the wrong order, such as opening a p tag before a user-specified li tag, and then “helpfully” automagically closing the p tag before the closing li tag the user writes in the post).

The worst part of the whole thing is that the breaking tags are autogenerated, and they’re controlled by what sort of whitespace (e.g. blank lines) you have near the opening and closing tags. Because the “helpful” autogenerated tags aren’t visible at “design time” for a post, you’re all but limited to blind trial and error to get it working right. Sometimes, you even need to put seemingly bogus </p> tags in at “design time” to match unbalanced tags emitted by WordPress automagically at display time.

I love how this turns writing blog posts that render properly into debugging something that is influenced by how I use whitespace. Argh!

Excuse me, while I go back to figuring out the right combination of blank lines to fix the rest of the blog’s tag close mismatch failures…

Blog move finished…

April 29th, 2007

The blog’s been (finally) moved to a reliable box, at a reliable location (including all dependent services), so that should be the last of the intermittent downtime.

So, same url, but minus the random downtime.

New WinDbg (6.7.5.0) released

April 27th, 2007

It’s finally here – WinDbg 6.7.5.0.

I haven’t gotten around to trying out all of the new goodies yet, but there are some nice additions. For one, .fnent now decodes unwind information in a more meaningful way on x64 (although it still doesn’t understand C scope table entries, making it less useful than SDbgExt’s !fnseh if that is what you were interested in).

Looks like they’ve finally gotten around to signing WinDbg.exe too (though, curiously, not the .msi that the installer extracts), so the elevation prompts for WinDbg are now of the friendlier sort instead of the “this program will destroy your computer” sort.

There is also reportedly source server support for CVS included; I imagine that I’ll be taking a stab at that again, now that it is supposedly fully baked.

In other news, the blog (and DNS) will be moving to a more ideal hosting location (read: not my apartment) as early as this weekend (if all goes according to plan, that is). It’ll be moving to a yummy new quad core Xeon box (with a real connection), a nice step up from the original hardware that it had been running on until a short while ago (good riddance). Crossing my fingers, but hopefully the random unavailability caused by hardware dying on me and Road Runner sucking should be going away Real Soon Now(tm).

A brief discussion of Windows Vista’s IE Protected Mode (and user/process level security)

April 25th, 2007

I was discussing the recent QuickTime bug on Matasano Chargen, and the question of whether it would work in the presence of IE7 + Vista and protected mode came up. I figured a more in-depth explanation as to just what IE7’s Protected Mode actually does might be in order; hence this posting.

One of the new features introduced with Internet Explorer 7 on Windows Vista is something called “Protected Mode” (or “IE Protected Mode”). It’s an on-by-default security feature that is sold as something that greatly hardens Internet Explorer against meaningful exploitation, even if an exploitable hole in IE (or a component of IE, such as an ActiveX control) is found.

Let’s dig in a little bit deeper as to what IE Protected Mode is (and isn’t), and what it means for you.

First things first: protected mode is not related in any way to the “enhanced security configuration” introduced in Windows Server 2003. The “enhanced security configuration” IE included with Srv03 is, at its core, just a set of more restrictive (i.e. locked down) default settings with regard to things like scripting, downloading files, and so forth. Protected mode does not rely on locking down security zone settings to the point where you cannot download files or run any scripts by default, and is completely unrelated to the IE hardening work done in the default Srv03 configuration. I’d imagine that protected mode will probably be included in Longhorn Server, but the underlying technologies are very different and are designed to address different “market segments” (“enhanced security configuration” being just a set of more restrictive defaults, whereas protected mode is a fundamental rethink of how the browser interacts with the rest of the operating system).

Protected mode is a feature that is designed to make “surfing the web a safer experience” for end users. Unlike Srv03, where a locked down IE might fly because you are ostensibly not supposed to be doing lots of fancy web-browser-ish things from a server box, end users are clearly not going to take kindly towards not being permitted to download files, run JavaScript, and soforth in the default configuration.

The way protected mode takes a stab at making things better for the end users of the world is to build upon the new “integrity level” security mechanism that has been introduced into the Windows NT security model starting with Vista, with the goal of making the web browser an “untrusted” process that cannot perform “dangerous” things.

To understand what this means, it’s necessary to know what these new-fangled “integrity levels” in Vista are all about. Integrity levels are assigned to a token representing a user, and tokens are assigned to a process (and can be impersonated by a thread, typically something done by something like an IPC server process that needs to perform an action on behalf of a lesser-privileged caller). What’s meaningful about integrity levels is that they allow you to partition what we know of as a “user” into something with multiple different “trust levels” (low, medium, high, with several other infrequently-used levels), such that a thread or a process running at a certain integrity level (or “trust level”) cannot “interfere” with something running at a higher integrity level.

The way this is implemented is by an additional level of security check that is performed when an access rights check occurs. This additional check compares the integrity level of the caller (i.e. the thread or process token’s integrity level) with a new type of field in the security descriptor of the target object (called the “mandatory label”) that specifies what sorts of access a caller of a given integrity level is allowed to request. The mandatory label allows an integrity level to be associated with an object for security checks, and allows three basic policies to be set (lower integrity levels cannot request read access, lower integrity levels cannot request write access, lower integrity levels cannot request execute access), comparing the integrity level of a caller against the integrity level specified with an object’s security descriptor. (Only these three generic access rights may be “guarded” by the integrity level in this way; there is no granularity to allow object-specific access rights to be given specific minimum caller integrity levels.)
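
To make this a bit more concrete, here’s a small sketch of mine that queries the integrity level RID of the current process’s token using the documented token information class:

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

// Query the integrity level RID (e.g. SECURITY_MANDATORY_LOW_RID,
// SECURITY_MANDATORY_MEDIUM_RID, ...) of the current process's token.
int main(void)
{
 HANDLE Token;
 DWORD  Length = 0;
 PTOKEN_MANDATORY_LABEL Label;

 if (!OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &Token))
  return 1;

 //
 // The first call fetches the required buffer size.
 //

 GetTokenInformation(Token, TokenIntegrityLevel, NULL, 0, &Length);
 Label = (PTOKEN_MANDATORY_LABEL)malloc(Length);

 if (Label &&
     GetTokenInformation(Token, TokenIntegrityLevel, Label, Length,
                         &Length))
 {
  //
  // The integrity level is the last subauthority of the label SID.
  //

  DWORD Count = *GetSidSubAuthorityCount(Label->Label.Sid);
  DWORD Rid   = *GetSidSubAuthority(Label->Label.Sid, Count - 1);

  printf("Integrity level RID: 0x%08lx\n", Rid);
 }

 free(Label);
 CloseHandle(Token);
 return 0;
}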

The default settings in most places do not allow write access to be granted to processes of a lower integrity level, and the default minimum integrity level is usually “medium”. The new label/integrity level access check is performed before conventional ACL-based checks.
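
Similarly, a mandatory label can be applied to an object with the documented SDDL “ML” ACE syntax; this sketch of mine (the file path is hypothetical) marks a file as low integrity with the “no write up” policy:

#include <windows.h>
#include <sddl.h>
#include <aclapi.h>

// Apply a low-integrity "no write up" mandatory label to a file; after
// this, callers below low integrity cannot open it for write access.
DWORD ApplyLowLabel(void)
{
 PSECURITY_DESCRIPTOR Sd = NULL;
 PACL  Sacl = NULL;
 BOOL  Present = FALSE, Defaulted = FALSE;
 DWORD Error;

 //
 // S:(ML;;NW;;;LW): a SACL containing a mandatory label ACE;
 // NW = no write up, LW = low integrity level.
 //

 if (!ConvertStringSecurityDescriptorToSecurityDescriptorW(
      L"S:(ML;;NW;;;LW)", SDDL_REVISION_1, &Sd, NULL))
  return GetLastError();

 if (!GetSecurityDescriptorSacl(Sd, &Present, &Sacl, &Defaulted))
 {
  Error = GetLastError();
  LocalFree(Sd);
  return Error;
 }

 //
 // LABEL_SECURITY_INFORMATION writes just the mandatory label portion
 // of the security descriptor.
 //

 Error = SetNamedSecurityInfoW((LPWSTR)L"C:\\some\\file.txt", // hypothetical
                               SE_FILE_OBJECT,
                               LABEL_SECURITY_INFORMATION,
                               NULL, NULL, NULL, Sacl);

 LocalFree(Sd);
 return Error;
}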

In this respect, integrity levels are an attempt to inject something of a sort of process-level security into the NT security model.

If you’re at all familiar with how NT security works, this may be a bit new to you. NT is based upon user-level security, where processes (and threads, in the case of impersonation) run under the context of a user, and derive from that user context their security rights (i.e. what securable objects they have access to – files, directories, registry keys, and so forth) and privileges (i.e. the ability to shut down the system, the ability to load a driver, the ability to bypass ACL checks for backup/restore, and so forth). The thinking behind this sort of model is that each distinct user on a system will run as, well, a different user. Processes from one user cannot interfere with processes (or files, directories, and so forth) belonging to a different user, without being granted access to do so (i.e. via an ACL, or by special, administrator-level privileges). The “operating system” (i.e. the services and programs that support the system) conceptually runs as yet another “user”, and is thus ostensibly protected from adverse modifications by malicious users on the system. Each user thus exists in a sort of sandbox, unable to interfere with any other user. Conversely, any process running as a particular user can do anything to any other process (or file or directory) owned by that same user; there is no protection within a user security context.

Obviously, this is a gross oversimplification of the NT security model, but it gets the point across (or so I hope!): the security system in NT revolves around the user as the means to control access in a meaningful fashion. This does make sense in environments like large corporate settings, where many users share the same computer (or where computers are centrally managed), such that users cannot interfere with each other, and ostensibly cannot attack their computers (i.e. the operating system) because they are running as “plain users” without administrator access and cannot perform “dangerous” tasks.

Unfortunately, in the era of the internet, exploitable software bugs, and computers with end users that run code they do not entirely trust, this model isn’t quite as good as we would like. Because the user is the security boundary, here, if an attacker can run code under your user account, they have full access to all of the processes, files, directories (and soforth) that are accessible to that user. And if that user account happened to be a computer administrator account, then things are even worse; now the attacker has free reign over the entire computer, and everything on it (including all other users present on the box).

Clearly, this isn’t such a great situation, especially given the reality that many users run untrusted code (or more generally, buggy or exploitable code) on a frequent basis. In this Internet-enabled age, user-level security as it has been traditionally implemented isn’t really enough.

There are still ways to make things “work” with user-level security; namely, to give each human several user accounts, specific to the task that they are doing. For example, I might have one user account that I use for browsing and games, and another user account that I use for accessing top secret corporate documents. If the user account that I use to browse the Internet gets compromised somehow, such as by my running an exploitable program and getting “owned”, then my top secret corporate documents are still safe; the malicious code running under the Internet-browsing-and-games account doesn’t have access to do anything to my secret documents, since they are owned by a different account and the default ACL protects them from other users.

Of course, this is a tough pill to expect end users to swallow; having to switch user accounts as they switch between tasks of differing importance is at best inconvenient, and at worst confusing and problematic. (For example, if I want to download a file from the Internet for use with my top secret corporate documents, I have to go to comparatively a lot of trouble to give it to my other user, and doing so opens an implicit trust relationship between my secret-documents user and my less-trusted Internet-browsing user: that the program I just downloaded is 1) not inherently malicious, 2) not tampered with or compromised, and 3) not full of exploitable holes that would put my documents at risk anyway the moment my secret-documents user runs it.) Clearly, while you could theoretically still get by with user-level security in today’s world, as a single user, doing so as it is implemented in Windows today is a major pain (and even with everyone’s best intentions, few people I have seen really follow through completely with the concept and avoid sharing programs or files between their users entirely).

(Note that I am not suggesting that things like running as nonadmin or breaking tasks up into different users are a lost cause, just that getting things truly right and secure is a much more difficult job than one might initially expect, so much so that most “joe users” will not stand a chance at doing it perfectly. I’m also not trying to knock user-level security as outright flawed, useless, or broken, but the fact remains that there are problems in today’s world that merit additional consideration.)

Whew, that’s a rather long segue into user-level security. Anyways, protected mode is Microsoft’s first real attempt to tackle this problem – the fact that user-level security does not always provide fine enough granularity in the face of untrusted or buggy programs – in a consumer-level system, in a way that is palatable to “joe users”. The way that it works is to leverage the integrity level mechanism to create an additional security barrier between the user’s web browser (i.e. Internet Explorer in protected mode) and the rest of the user’s files and programs. This is done by assigning the IE process a low integrity level. Following from what we know of integrity levels above, this means that the IE process will be denied access (by the security manager in the kernel) to do things like overwrite your documents, place malicious programs in your “startup” Start Menu directory, overwrite executables in your system directory (of course, if you were already running as a plain user, it wouldn’t be able to do this anyway…), and so forth. This is clearly a good thing. In case the implications haven’t fully sunk in yet:

If an attacker compromises a low integrity process, they should not be able to destroy your data or install a trojan (or other malicious code) on your system*.

(*: This is, of course, barring implementation errors, design oversights, and local privilege escalation holes. The latter may prove to be an especially important sticking point, as many companies (Microsoft included) have often “looked down” upon local privilege escalation bugs as relatively unimportant to fix in a timely fashion. Perhaps the introduction of process-level security will help shatter the idea that leaving local privilege escalation holes sitting around is okay.)
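
As an aside, the basic mechanics of running a program at low integrity are documented and fairly simple: duplicate your own token, drop its integrity level, and create the new process with the modified token. A rough sketch (mine, with error handling pared down):

#include <windows.h>
#include <sddl.h>

// Launch a program with a low integrity token, roughly the same trick
// protected mode relies on to sandbox the IE process.
BOOL LaunchLowIntegrity(LPWSTR CommandLine)
{
 HANDLE Token = NULL, LowToken = NULL;
 PSID   LowSid = NULL;
 TOKEN_MANDATORY_LABEL Label = { 0 };
 STARTUPINFOW Si = { sizeof(Si) };
 PROCESS_INFORMATION Pi = { 0 };
 BOOL Ok = FALSE;

 if (!OpenProcessToken(GetCurrentProcess(),
                       TOKEN_DUPLICATE | TOKEN_QUERY |
                       TOKEN_ADJUST_DEFAULT | TOKEN_ASSIGN_PRIMARY,
                       &Token))
  return FALSE;

 //
 // Make a primary token we can modify, then lower its integrity.
 // S-1-16-4096 is the low mandatory level SID.
 //

 if (DuplicateTokenEx(Token, 0, NULL, SecurityImpersonation,
                      TokenPrimary, &LowToken) &&
     ConvertStringSidToSidW(L"S-1-16-4096", &LowSid))
 {
  Label.Label.Attributes = SE_GROUP_INTEGRITY;
  Label.Label.Sid = LowSid;

  if (SetTokenInformation(LowToken, TokenIntegrityLevel, &Label,
                          sizeof(Label) + GetLengthSid(LowSid)))
  {
   Ok = CreateProcessAsUserW(LowToken, NULL, CommandLine,
                             NULL, NULL, FALSE, 0, NULL, NULL,
                             &Si, &Pi);
  }
 }

 if (Ok)
 {
  CloseHandle(Pi.hProcess);
  CloseHandle(Pi.hThread);
 }
 if (LowSid) LocalFree(LowSid);
 if (LowToken) CloseHandle(LowToken);
 CloseHandle(Token);
 return Ok;
}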

Now, this is a very important departure from where we have been traditionally with user level access control. Combining per process access control with per user access control allows us to do a lot more to protect users from malicious code and buggy software (in other words, protecting users from themselves), in a fashion that is much easier to deal with from a user perspective.

However, I think it would be premature to say that we’re “all the way there” yet. Protected mode and low integrity processes are definitely a great step in the right direction, but there still remain issues to be solved. For example, as I alluded to previously, the default configuration allows medium integrity objects to still be opened for read access by low integrity callers. This means that, for example, if an attacker compromises an IE process running in protected mode, they still have a chance at doing some damage. Even though an attacker might not be able to destroy your data, per se, he or she can still read it (and thus steal it). So, to continue with our previous example of someone who works with top secret corporate documents, an attacker might not be able to wipe out the company financial records, or joe user’s credit card numbers, but he or she could still steal them (and post them on the Internet for anyone to see, or what have you). In other words, an attacker who compromises a low integrity process can’t destroy all your data (as would be the case if there were no process-level security and we were dealing with just one user account), but he or she can still read it and steal it.

There are other things to watch out for, too, with protected mode. Don’t get into the habit of clicking “OK” on that “are you sure you want this program to run outside of IE Protected Mode” dialog box, or you’re setting yourself up to be burned by clever malware. And certainly never click the “don’t ask me again” check box on the consent dialog, or you’re just begging for some piece of malware to abuse your implicit consent without you even realizing that something’s gone wrong. (In case you’re wondering, the mechanism in IE that allows processes to elevate to medium integrity involves an appcompat hook on CreateProcess that asks a medium integrity process (ieuser.exe) to prompt the user for consent, with the medium integrity process creating the new process if the user agrees. So user interaction is still required there, though we know how much users love to click “OK” on all those pesky security warnings. There is also some hardening in win32k.sys that prevents lower integrity processes from sending “dangerous” window messages to higher integrity processes (even WM_USER and friends are disabled by default across an integrity level boundary), so “shatter attacks” ought not to work against the consent dialog. Note that if you bypass the appcompat hook, the new process is also low integrity, and won’t be able to write anywhere outside of the “low integrity” sandbox of writable files and directories.)

So, while running IE in protected mode does, in some respects, limit the damage that can be done if you get compromised, I would still recommend not running untrusted programs under the same user account as your important documents (if you really care that much). Perhaps in a future release we’ll see a solution that addresses the idea of not allowing untrusted programs to read arbitrary user data as well (such should be possible with proper use of the new integrity level mechanisms, although I suspect the true difficulty shall be in getting third party applications to play nicely as we continue to place control of the user’s documents more firmly in the hands of the actual user instead of in any arbitrary application that runs on the box).

Sorry about the downtime…

April 25th, 2007

The box I have been hosting the blog on has been on its last proverbial leg as of late, hanging about every 12 hours or less. So, I decided to move the whole thing over to another box. I’m already in the middle of another project to eventually host the blog at a real location, but that’s still in the works and not ready yet. Not wanting to go to the trouble of configuring WordPress and everything on another box just temporarily, I decided to give VMware Converter a try and move the entire box into a VM image on a more stable computer at my apartment.

So far, I’m pretty pleased with the results; it took only a couple of minutes to install the converter and start the conversion process (which took about 1.5 hrs total to complete). Surprisingly enough, the thing mostly just worked out of the box after being VM-ized; I had to install VMware Tools, and reconfigure network settings a bit, but other than that, the whole experience was rather seamless.

For now, the blog should be a bit more stable (now living as a VM on a relatively new and, ah, higher quality tower server that I recently acquired to run most of the services for my apartment).

And as far as VMware Converter goes, color me impressed; I expected to run into a lot of snags, but the user experience was quite good. Definitely a great timesaver for retiring old, failing hardware without having to go through the trouble of reconstructing a new install to perform the tasks of the old one.

Sometimes, a cheap hack does the trick after all (or how I worked around some mysterious USB/Bluetooth issues)

April 16th, 2007

My relatively new Dell XPS M1710 laptop (running Vista x64) has an annoying problem that I’ve recently tracked down.

It appears that when you connect a USB 1.1 device (such as a USB keyboard) to it, and then disconnect that device, Bluetooth and the built-in smart card reader tend to break the next time the computer goes through a suspend/resume cycle. This is fairly annoying for me, as I use a USB keyboard at work (and I also use Bluetooth extensively); it got to the point where pretty much every other time I went to the office and came home, Bluetooth and the smart card reader would break.

The internal Bluetooth transceiver and the built-in smart card reader are both connected to the system via an internal USB hub. It seems like this is becoming a popular thing nowadays among laptop manufacturers, connecting “internal” peripherals on laptops via USB instead of having them on-board and hardwired to the PCI bus or the like. Anyways, when the Bluetooth transceiver and smart card reader get into the broken state, I’ll typically get a “A USB device attached to the system is not functioning” notification, and the internal USB hub shows up as not started in device manager (with a problem code of CM_PROB_FAILED_START – code 10). Occasionally, the Bluetooth transceiver itself shows up with a CM_PROB_FAILED_START error, but the vast majority of the time it is the USB hub that fails.

I’ve done a bit of searching for a real fix for this problem, and the closest I’ve found is this KB article, which describes a problem that does sound like mine – Bluetooth breaks after suspending – but the hotfix isn’t publicly available yet. I suppose I could have called PSS and tried to talk them into getting me a copy of that hotfix to try out, but I tend to try and avoid muddling through the various tiers of technical support until I really have no other options. Furthermore, there’s no guarantee that the hotfix would actually solve the underlying problem; the article doesn’t make any mention of internal USB hubs acting flaky in conjunction with a Bluetooth transceiver, only the Bluetooth module itself.

Until recently, to get out of this state, I typically had to either suspend/resume the laptop again (although this is dangerous*), reboot entirely, or remove and reinstall the devnode associated with the USB hub in device manager. (*Trying to suspend/resume in this case doesn’t always work. Sometimes it won’t fix the problem, and more than a couple of times it has resulted in Vista hanging while trying to suspend, forcing a hard reboot.)

None of these solutions are particularly desirable; nobody likes having to shut everything down and reboot all the time with a laptop (kind of defeats the point of that nice sleep feature, doesn’t it?). Removing and reinstalling the devnode is less painful than a reboot, but only slightly; waiting for everything on the internal USB hub to get reinstalled after doing that takes up to a minute or two of disk thrashing while INFs are searched, and if things are going to take that long then it would almost be faster just to shut down and reboot anyway.

Eventually, though, I discovered that simply disabling and reenabling the devnode corresponding to the USB hub would resolve the problem. This is definitely a much nicer solution than any of the above; it’s (relatively) quick and less painful, and it doesn’t involve me sitting around and waiting for either a shutdown or reboot, or for a bunch of devices to get reinstalled by PnP.

Unfortunately, that workaround is still a rather manual process. It’s not too bad if I just keep an elevated Device Manager window open for easy access, but among other things this prevents me from, say, unlocking my computer with a smart card after an unsuspend (as the Bluetooth transceiver and smart card reader share the same USB hub that tends to flake out after a suspend/resume, after a USB 1.1 device is removed).

So, I set about seeing whether I could try and automate this process. I could have tried to track down the root cause a bit more, but as far as I can tell, the problem is either 1) a bug in Vista’s USB support (perhaps or perhaps not an x64 specific bug), or 2) a bug in firmware/hardware relating to the internal USB hub itself, or 3) a bug in firmware/hardware relating to the Bluetooth transceiver, or 4) a bug in the Bluetooth drivers for my laptop. None of those possibilities would be something that I could easily solve (as far as the root cause goes), even if I managed to track down the originating cause of the problem. As a result, I decided that the most time-effective solution would be to just try and automate the process of disabling and reenabling the devnode associated with the USB hub (if it breaks), or the Bluetooth transceiver (if it breaks instead). Restarting the devnode is comparatively fast when viewed in light of the other workarounds, and in any case it is fast enough to be a viable way to at least alleviate the symptoms of this problem.

After doing some brief research, it looked like the way to go here was a combination of the CM_Xxx APIs and the SetupDi APIs. Although there’s a relatively large amount of indirection you have to go through to restart an individual devnode (and the CM_Xxx APIs / setupapi are not particularly easy to use in the first place), there happens to be a WDK sample that has the capability to do just what I want – DevCon. DevCon is a console equivalent of the Device Manager MMC snap-in; it’s capable of enumerating device nodes, installing/removing them, updating drivers, disabling/reenabling devnodes, and so forth.

Sure enough, I verified that DevCon’s `restart’ command was sufficient to restart the broken devnode (in a fashion similar to disabling and reenabling it in Device Manager), with the end result of causing the USB hub to start working again.
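
For the curious, the core of what a per-devnode restart boils down to is the documented SetupDi “property change” flow; a trimmed sketch (error handling elided):

#include <windows.h>
#include <setupapi.h>

// Restart (stop and re-start in one step) a devnode identified by a
// device information set and element, via the DICS_PROPCHANGE state
// change. This mirrors what a per-device "restart" does.
BOOL RestartDevNode(HDEVINFO DeviceInfoSet, PSP_DEVINFO_DATA DeviceInfoData)
{
 SP_PROPCHANGE_PARAMS Params;

 Params.ClassInstallHeader.cbSize = sizeof(SP_CLASSINSTALL_HEADER);
 Params.ClassInstallHeader.InstallFunction = DIF_PROPERTYCHANGE;
 Params.StateChange = DICS_PROPCHANGE;          // stop + restart
 Params.Scope       = DICS_FLAG_CONFIGSPECIFIC; // this hw profile only
 Params.HwProfile   = 0;                        // current profile

 if (!SetupDiSetClassInstallParams(DeviceInfoSet,
                                   DeviceInfoData,
                                   &Params.ClassInstallHeader,
                                   sizeof(Params)))
  return FALSE;

 //
 // Ask the class installer (and the PnP manager) to carry out the
 // property change.
 //

 return SetupDiCallClassInstaller(DIF_PROPERTYCHANGE,
                                  DeviceInfoSet,
                                  DeviceInfoData);
}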

At this point, all I had to do was come up with a good way to locate the broken devnodes, a good way to know when the computer went in/out of sleep (as a sleep cycle causes the problem to occur), and then inject the DevCon code responsible for restarting a device into my program. To make a long story short, I ended up keying the program off the hardware IDs of the internal USB hub and Bluetooth transceiver, such that it would check for all devnodes that 1) matched a hardware ID in that list, and 2) were in a disabled state with a problem code of CM_PROB_FAILED_START. Detecting a sleep transition is fairly easy for a service (services receive notifications of PnP/power events through the callback registered via RegisterServiceCtrlHandlerEx), and as I wanted the program to function continuously, a service seemed like the logical way to run it anyway.
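
The service side of that is the usual HandlerEx plumbing; a rough sketch (names are mine, and the service must also report SERVICE_ACCEPT_POWEREVENT in its status for the power notifications to be delivered):

#include <windows.h>

// Hypothetical routine that polls for broken devnodes and restarts
// them, per the description above.
void ScheduleDevNodeScan(void);

static SERVICE_STATUS_HANDLE g_StatusHandle;

// Service control handler: watch for resume-from-suspend events and
// kick off a scan for failed devnodes when one arrives.
DWORD WINAPI ServiceCtrlHandler(
 DWORD Control,
 DWORD EventType,
 LPVOID EventData,
 LPVOID Context
 )
{
 UNREFERENCED_PARAMETER(EventData);
 UNREFERENCED_PARAMETER(Context);

 switch (Control)
 {
 case SERVICE_CONTROL_POWEREVENT:
  //
  // PBT_APMRESUMEAUTOMATIC covers resumes that happen without user
  // interaction; PBT_APMRESUMESUSPEND follows user input.
  //
  if (EventType == PBT_APMRESUMEAUTOMATIC ||
      EventType == PBT_APMRESUMESUSPEND)
  {
   ScheduleDevNodeScan();
  }
  return NO_ERROR;

 case SERVICE_CONTROL_STOP:
  // ... report SERVICE_STOP_PENDING, signal the worker to exit ...
  return NO_ERROR;

 default:
  return ERROR_CALL_NOT_IMPLEMENTED;
 }
}

// In ServiceMain: register with the *Ex API so power events are
// delivered to the handler above.
void RegisterHandler(void)
{
 g_StatusHandle = RegisterServiceCtrlHandlerExW(L"DevBouncer",
                                                ServiceCtrlHandler,
                                                NULL);
}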

An hour or two later, and I had my workaround program done. It’s certainly not pretty, and it doesn’t do anything to fix the root cause of the problem, but as far as treating the symptoms goes, it gets the job done. The basic way the program works is that every time the system resumes from suspend, it polls all devnodes every second for 30 seconds, looking for a devnode that failed to start and is in the known list of problematic hardware IDs. If it finds such a device, it attempts to restart it (thereby working around the root cause of the problem and alleviating the symptoms). Polling for a fixed time after resume isn’t pretty by any means, but it can occasionally take a bit for one of the devnodes to show as broken when it’s not working, so this works around that as well.

While you could safely call it a giant hack, the program does get the job done; now, I can unlock my laptop via smart card or use Bluetooth almost immediately after a resume, even if the breakage-after-USB1-device-is-unplugged problem strikes, and all without having to manually futz around in device manager every time.

In case anyone’s interested, I’ve put the source code for the program (“Broken Device Bouncer”, or DevBouncer for short) up. It’s fairly hardcoded to be specific to my machine, so if you wanted to use it for some reason, you’d need to rebuild it.

How I ended up in the kernel debugger while trying to get PHP and Cacti working…

April 14th, 2007

Some days, nothing seems to work properly. This is the sad story of how something as innocent as trying to install a statistics graphing Web application culminated in my breaking out the kernel debugger in an attempt to get things working. (I don’t seem to have a lot of luck with web applications. So much for the way of the future being “easy to develop/deploy/use” web-based applications…)

Recently, I decided to try installing Cacti in order to get some nice, pretty graphs describing resource utilization on several boxes at my apartment. Cacti is a PHP program that queries SNMP data and, with the help of a program called RRDTool, creates friendly historical graphs for you. It’s commonly used for monitoring things like network or processor usage over time.

In this particular instance, I was attempting to get Cacti working on a Windows Server 2003 x64 SP2 box. Running an amalgam of unix-ish programs on Windows is certainly “fun”, and doing it on native x64 is even more “interesting”. I didn’t expect to find myself in the kernel debugger while trying to get Cacti working, though…

To start out, the first thing I had to do was switch IIS6’s worker processes to 32-bit instead of 64-bit, as the standard PHP 5 distribution doesn’t support x64. (No, I don’t consider spending who knows how many hours getting PHP to build natively on x64 a viable solution here, so I just decided to stick with the 32-bit release. I don’t particularly want to be in the habit of having to maintain and rebuild my own PHP distribution from a custom build environment each time security updates come out, either…)

This wasn’t too bad (at least not at first); a bit of searching revealed this KB article that documented an IIS metabase flag that you can set to turn on 32-bit worker processes (with the help of the adsutil.vbs script included in the IIS Adminscripts directory).

One small snag here was that I happened to be running a symbol proxy in native x64 mode on this system already. Since the 32-bit vs 64-bit IIS worker process flag is an all-or-nothing option, I had to go install the 32-bit WinDbg distribution on this system and copy over the 32-bit symproxy.dll and symsrv.dll into %systemroot%\system32\inetsrv. Additionally, the registry settings used by the 64-bit symproxy weren’t directly accessible to the 32-bit version (due to a compatibility feature in 64-bit versions of Windows known as Registry Reflection), so I had to manually copy over the registry settings describing which symbol paths the symbol proxy used to the Wow64 version of HKLM\Software. No big deal so far, just a minor annoyance.

The first unexpected problem that cropped up happened after I had configured the 32-bit symbol proxy ISAPI filter and installed PHP; after I enabled 32-bit worker processes, IIS started tossing HTTP 500 Internal Server Error statuses whenever I tried to browse any site on the box. Hmm, not good…

After determining that everything was still completely broken even after disabling the symbol proxy and PHP ISAPI modules, I discovered some rather unpleasant-looking event log messages:

ISAPI Filter ‘%SystemRoot%\Microsoft.NET\Framework64\v2.0.50727\aspnet_filter.dll’ could not be loaded due to a configuration problem. The current configuration only supports loading images built for a x86 processor architecture. The data field contains the error number. To learn more about this issue, including how to troubleshooting this kind of processor architecture mismatch error, see http://go.microsoft.com/fwlink/?LinkId=29349.

It seemed that the problem was the wrong version of ASP.NET being loaded (still the x64 version). The link in the event message wasn’t all that helpful, but a bit of searching located yet another knowledge base article – this time, about how to switch back and forth between 32-bit and 64-bit versions of ASP.NET. After running aspnet_regiis as described in that article, IIS was once again in a more or less working state. Another problem down, but the worst was yet to come…

With IIS working again, I turned towards configuring Cacti in IIS. Although at first it appeared as though everything might actually go as planned (after configuring Cacti’s database settings, I soon found myself at its PHP-based initial configuration page), such things were not meant to be. The first sign of trouble appeared after I completed the initial configuration page and attempted to log on with the default username and password. Doing so resulted in my being thrown back to the log on page, without any error messages. A username and password combination not matching the defaults did result in a logon failure error message, so something besides a credential failure was up.

After some digging around in the Cacti sources, it appeared that the way that Cacti tracks whether a user is logged in or not is via setting some values in the standard PHP session mechanism. Since Cacti was apparently pushing me back to the log on page as soon as I logged on, I guessed that there was probably some sort of failure with PHP’s session state management.

Rewind a bit to back when I installed PHP. In the interest of expediency (hah!), I decided to try out the Win32 installer package (as opposed to just the zip distribution for a manual install) for PHP. Typically, I’ve just installed PHP for IIS the manual way, but I figured that if they had an installer nowadays, it might be worth giving it a shot and save myself some of the tedium.

Unfortunately, it appears that PHP’s installer is not all that intelligent. It turns out that in the IIS ISAPI mode, PHP configures the system-wide PHP settings to point the session state file directory to the user-specific temp directory (i.e. pointing to a location under %userprofile%). This, obviously, isn’t going to work; anonymous users logged on to IIS aren’t going to have access to the temp directory of the account I used to install PHP with.

After manually setting up a proper location for PHP’s session state with the right security permissions (and reconfiguring php.ini to match), I tried logging in to Cacti again. This time, I actually got to the main screen after changing the password (hooray, progress!).
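
For reference, the relevant php.ini directive is along these lines (the directory name here is just an example):

; Point PHP's session state at a directory that the IIS anonymous user
; can write to, instead of the installing user's temp directory.
session.save_path = "C:\PHP\sessions"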

From here, all that I had left to do was some minor reconfiguring of the Windows SNMP service in order to allow Cacti to query it, set up the Cacti poller task job (Cacti is designed to poll data from its configured data sources at regular intervals), and configure my graphs in Cacti.

Configuring SNMP wasn’t all that difficult (though something I hadn’t done before with the Windows SNMP service), and I soon had Cacti successfully querying data over SNMP. All that was left to do was graph it, and I was home free…

Unfortunately, getting Cacti to actually graph the data turned out to be rather troublesome. In fact, I still haven’t even got it working, though I’ve at least learned a bit more about just why it isn’t working…

When I attempted to create graphs in Cacti, everything would appear to work okay, but no RRDTool datafiles would ever appear. No amount of messing with filesystem permissions resolved the problem, and the Cacti log files were not particularly helpful (even on debug severity). Additionally, attempting to edit graph properties in Cacti would result in that HTTP session mysteriously hanging forever more (definitely not a good sign). After searching around (unsuccessfully) for any possible solutions, I decided to try and take a closer look at what exactly was going on when my requests to Cacti got stuck.

Checking the process list after repeating the sequence that caused a particular Cacti session to hang several times, I found that there appeared to be a pair of cmd.exe and rrdtool.exe instances corresponding to each hung session. Hmm, it would appear that something RRDTool was doing was freezing, and PHP was waiting for it… (PHP uses cmd.exe to call RRDTool, so I guessed that PHP would be waiting for cmd.exe, which would in turn be waiting for RRDTool.)

At first, I attempted to attach to one of the cmd processes with WinDbg. (Incidentally, it would appear that there are currently no symbols for the Wow64 versions of the Srv03SP2 ntdll, kernel32, user32, and a large number of other core DLLs with Wow64 builds available on the Microsoft symbol server for some reason. If any Microsoft people are reading this, it would be greaaaat if you could fix the public symbol server for Srv03 SP2 x64 Wow64 DLLs …) However, symbols for cmd.exe were fortunately available, so it was relatively easy to figure out what it was up to, and prove my earlier hypothesis that it was simply waiting on an rrdtool instance:

0:001:x86> ~1k
ChildEBP RetAddr
0012fac4 7d4d8bf1 ntdll_7d600000!NtWaitForSingleObject+0x15
0012fad8 4ad018ea KERNEL32!WaitForSingleObject+0x12
0012faec 4ad02611 cmd!WaitProc+0x18
0012fc24 4ad01a2b cmd!ExecPgm+0x3e2
0012fc58 4ad019b3 cmd!ECWork+0x84
0012fc70 4ad03c58 cmd!ExtCom+0x40
0012fe9c 4ad01447 cmd!FindFixAndRun+0xa9
0012fee0 4ad06cf6 cmd!Dispatch+0x137
0012ff44 4ad07786 cmd!main+0x108
0012ffc0 7d4e7d2a cmd!mainCRTStartup+0x12f
0012fff0 00000000 KERNEL32!BaseProcessInitPostImport+0x8d
0:001:x86> !peb
[...]
CommandLine: 'cmd.exe /c c:/progra~2/rrdtool/rrdtool.exe -'
[...]

Given this, the next logical step to investigate would be the RRDTool.exe process. Unfortunately, something really weird seemed to be going on with all the RRDTool.exe processes (naturally). WinDbg would give me an access denied error for all of the RRDTool PIDs in the F6 process list, despite my being a local machine administrator.

Attempting to attach to these processes failed as well:

Microsoft (R) Windows Debugger Version 6.6.0007.5
Copyright (c) Microsoft Corporation. All rights reserved.

Cannot debug pid 4904, NTSTATUS 0xC000010A
“An attempt was made to duplicate an object handle into or out of an exiting process.”
Debuggee initialization failed, NTSTATUS 0xC000010A
“An attempt was made to duplicate an object handle into or out of an exiting process.”

This is not something that you want to be seeing on a server box. This particular error means that the process in question is in the middle of being terminated, which prevents a debugger from successfully attaching. Processes typically terminate in a timely fashion, though; it’s almost unheard of to actually catch a process in the terminating state, since it happens so quickly. In this particular instance, however, the RRDTool processes were remaining in this half-dead state for what appeared to be an indefinite interval.

There are two things that commonly cause this kind of problem, both related to the kernel:

  1. The disk hardware is not properly responding to I/O requests, and they are hanging indefinitely. This can block a process from exiting while the operating system waits for an I/O to finish canceling or completing. Since this particular box was brand new (and with respectable, high-quality server hardware), I didn’t think that failing hardware was the cause here (or at least, I certainly hoped not!). Given that there were no errors in the event log about I/Os failing, and that I was still able to access files on my disks without issue, I decided to rule this possibility out.
  2. A driver (or other kernel mode code in the I/O stack) is buggy and is not allowing I/O requests to be canceled or completed, or has deadlocked itself and is unable to complete an I/O request. (You might be familiar with the latter if you’ve tried to use the 1394 mass storage support in Windows for a non-trivial length of time.) Given that I had tentatively ruled out bad hardware, this seemed the more likely cause here.

In either case, the frozen process would be stuck in kernel mode, so to proceed any further I would need to use the kernel debugger. I decided to start out with local kd, as that is sufficient for retrieving thread stacks and doing basic passive analysis of potential deadlock issues when the system is at least mostly still functional.
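
(For reference, “local kd” here is nothing exotic; it is simply kd from Debugging Tools for Windows, run as an administrator on the machine itself with the local kernel debugging switch. The lkd> prompts below come from such a session.)

kd -kl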

Sure enough, the stuck RRDTool process I had unsuccessfully tried to attach to was blocked in kernel mode:


lkd> !process 0n4904
Searching for Process with Cid == 1328
PROCESS fffffadfcc712040
SessionId: 0 Cid: 1328 Peb: 7efdf000 ParentCid: 1354
DirBase: 5ea6c000 ObjectTable: fffffa80041d19d0 HandleCount: 68.
Image: rrdtool.exe
[...]
THREAD fffffadfcca9a040 Cid 1328.1348 Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Unknown) KernelMode Non-Alertable
fffffadfccf732d0 SynchronizationEvent
Impersonation token: fffffa80041db980 (Level Impersonation)
DeviceMap fffffa8001228140
Owning Process fffffadfcc712040 Image: rrdtool.exe
Wait Start TickCount 6545162 Ticks: 367515 (0:01:35:42.421)
Context Switch Count 445 LargeStack
UserTime 00:00:00.0000
KernelTime 00:00:00.0015
Win32 Start Address windbg!_imp_RegCreateKeyExW (0x0000000000401000)
Start Address 0x000000007d4d1510
Stack Init fffffadfc4a95e00 Current fffffadfc4a953b0
Base fffffadfc4a96000 Limit fffffadfc4a8f000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 1
RetAddr Call Site
fffff800`01027752 nt!KiSwapContext+0x85
fffff800`0102835e nt!KiSwapThread+0x3c9
fffff800`013187ac nt!KeWaitForSingleObject+0x5a6
fffff800`012b2853 nt!IopAcquireFileObjectLock+0x6d
fffff800`01288dff nt!IopCloseFile+0xad
fffff800`01288f0e nt!ObpDecrementHandleCount+0x175
fffff800`0126ceb0 nt!ObpCloseHandleTableEntry+0x242
fffff800`0128d7a6 nt!ExSweepHandleTable+0xf1
fffff800`012899b6 nt!ObKillProcess+0x109
fffff800`01289d3b nt!PspExitThread+0xa3a
fffff800`0102e3fd nt!NtTerminateProcess+0x362
00000000`77ef0caa nt!KiSystemServiceCopyEnd+0x3
0202c9fc`0202c9fb ntdll!NtTerminateProcess+0xa

Hmm… not quite what I expected. If a buggy driver were involved, it should have at least appeared somewhere on the call stack, but in this particular instance all we have is ntoskrnl code, straight from the system call down to the wait that isn’t coming back. Something was definitely wrong in kernel mode, but it wasn’t immediately clear what. The thread appeared to be blocked on the file object lock (which, to my knowledge, is used to serialize synchronous I/Os issued against a particular file object), but, as the file object lock is built upon KEVENTs, the usual lock diagnostic extensions (like `!locks’) would not be particularly helpful. What appeared to be happening was that the process rundown logic in the kernel was attempting to release all still-open handles in the exiting RRDTool process, and it was (for some reason) getting stuck while trying to close a handle to a particular file object.

I could at least figure out what file was “broken”, though, by poking around in the stack of IopCloseFile:

lkd> !fileobj fffffadf`ccf73250
\Temp\php\session\sess_bkcavai8fak8antv9coq46at95
LockOperation Set Device Object: 0xfffffadfce423370 \Driver\dmio
Vpb: 0xfffffadfce864840
Access: Read Write SharedRead SharedWrite
Flags: 0x40042
Synchronous IO
Cache Supported
Handle Created
File Object is currently busy and has 1 waiters.
FsContext: 0xfffffa800390e110 FsContext2: 0xfffffa8000106a10
CurrentByteOffset: 0
Cache Data:
Section Object Pointers: fffffadfcd601c20
Shared Cache Map: fffffadfccfdebb0 File Offset: 0 in VACB number 0
Vacb: fffffadfce97fb08
Your data is at: fffff98070e80000

From here, there are a couple of options:

  1. We could look for processes with an open handle to that file and check their stacks.
  2. We could look for an IRP associated with that file object and try to trace our way back from there.

Initially, I tried the first option, but it didn’t work out particularly well. I attempted to use Process Explorer to locate all processes with an open handle to that file, but this failed rather miserably, as Process Explorer itself became deadlocked after it opened a handle to the file. This was actually rather curious; it turned out that processes could open a handle to this “broken” file just fine, but when they tried to close that handle, they would become blocked in kernel mode indefinitely.

That avenue having proven unsuccessful, I tried the second option, which is made easier by `!irpfind’. Normally, this extension is very slow to operate (over a serial cable), but local kd makes it quite usable. It revealed something of value:


lkd> !irpfind -v 0 0 fileobject fffffadf`ccf73250
Looking for IRPs with file object == fffffadfccf73250
Scanning large pool allocation table for Tag: Irp? (fffffadfccdf6000 : fffffadfcce56000)
Searching NonPaged pool (fffffadfcac00000 : fffffae000000000) for Tag: Irp?
Irp [ Thread ] irpStack: (Mj,Mn) DevObj [Driver] MDL Process
fffffadfcc225380: Irp is active with 7 stacks 7 is current (= 0xfffffadfcc225600)
No Mdl: No System Buffer: Thread fffffadfccea27d0: Irp stack trace.
cmd flg cl Device File Completion-Context
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[...]
>[ 11, 1] 2 1 fffffadfce7b6040 fffffadfccf73250 00000000-00000000 pending
\FileSystem\Ntfs
Args: fffffadfcd70e0a0 00000000 00000000 00000000

There was an active IRP for this file object; with any luck, it would be related to whatever was holding the file object lock. (The stack location marked current is already suggestive: major function 0x11 is IRP_MJ_LOCK_CONTROL and minor function 0x1 is IRP_MN_LOCK, i.e. a pending byte-range lock request; note also the FsRtl lock package cancel routine in the output below.) Digging a bit deeper, it’s possible to determine which thread is associated with the IRP (if it is a thread IRP), and from there we can grab a stack, which might just give us the smoking gun we’re looking for:

lkd> !irp fffffadfcc225380 1
Irp is active with 7 stacks 7 is current (= 0xfffffadfcc225600)
No Mdl: No System Buffer: Thread fffffadfccea27d0: Irp stack trace.
Flags = 00000000
ThreadListEntry.Flink = fffffadfcc2253a0
ThreadListEntry.Blink = fffffadfcc2253a0
[...]
CancelRoutine = fffff800010ba930 nt!FsRtlPrivateResetLowestLockOffset
[...]
lkd> !thread fffffadfccea27d0
THREAD fffffadfccea27d0 Cid 10f8.138c Teb: 00000000fffa1000 Win32Thread: fffffa80023cd860 WAIT: (Unknown) UserMode Non-Alertable
fffffadfccf732e8 NotificationEvent
Impersonation token: fffffa8002c62060 (Level Impersonation)
DeviceMap fffffa8002f3b7b0
Owning Process fffffadfcc202c20 Image: w3wp.exe
Wait Start TickCount 6966187 Ticks: 952 (0:00:00:14.875)
Context Switch Count 1401 LargeStack
UserTime 00:00:00.0000
KernelTime 00:00:00.0000
Win32 Start Address 0x00000000003d87d8
Start Address 0x000000007d4d1504
Stack Init fffffadfc4fbee00 Current fffffadfc4fbe860
Base fffffadfc4fbf000 Limit fffffadfc4fb8000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 0
RetAddr : Call Site
fffff800`01027752 : nt!KiSwapContext+0x85
fffff800`0102835e : nt!KiSwapThread+0x3c9
fffff800`012afb38 : nt!KeWaitForSingleObject+0x5a6
fffff800`0102e3fd : nt!NtLockFile+0x634
00000000`77ef14da : nt!KiSystemServiceCopyEnd+0x3
00000000`00000000 : ntdll!NtLockFile+0xa

This might just be what we’re looking for. There’s a thread in w3wp.exe (the IIS worker process) blocked on a synchronous NtLockFile call for the very same file object that is in the “broken” state. Since I’m running PHP in ISAPI mode, this makes sense; if PHP is doing something to that file (which it certainly could be, given that it’s a PHP session state file, as we saw above), then it would be doing so in the context of w3wp.exe.

To get a better picture of what was going on from user mode, I attached a user mode debugger to w3wp.exe:


0:006> .effmach x86
Effective machine: x86 compatible (x86)
0:006:x86> ~6s
ntdll_7d600000!ZwLockFile+0x12:
00000000`7d61d82e c22800 ret 28h
0:006:x86> k
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
014edf84 023915a2 ntdll_7d600000!ZwLockFile+0x12
014edfbc 0241d886 php5ts!flock+0x82
00000000 00000000 php5ts!zend_reflection_class_factory+0xb576

It looks like that thread is indeed related to PHP; PHP is trying to acquire a file lock on the session state file. With a bit of work, we can figure out just what kind of lock it was trying to acquire.

The prototype for NtLockFile is as follows:

// NtLockFile locks a region of a file.
NTSYSAPI
NTSTATUS
NTAPI
NtLockFile(
    IN HANDLE FileHandle,
    IN HANDLE Event OPTIONAL,
    IN PIO_APC_ROUTINE ApcRoutine OPTIONAL,
    IN PVOID ApcContext OPTIONAL,
    OUT PIO_STATUS_BLOCK IoStatusBlock,
    IN PULARGE_INTEGER LockOffset,
    IN PULARGE_INTEGER LockLength,
    IN ULONG Key,
    IN BOOLEAN FailImmediately,
    IN BOOLEAN ExclusiveLock
    );

Given this, we can easily deduce the arguments from a stack dump (on x86, the arguments sit on the stack just above the return address, at @esp+4):

0:006:x86> dd @esp+4 l0n10
00000000`014edf48 000002b0 00000000 00000000 014edfac
00000000`014edf58 014edfac 014edf74 014edf7c 00000000
00000000`014edf68 00000000 00000001
0:006:x86> dq 014edf74 l1
00000000`014edf74 00000000`00000000
0:006:x86> dq 014edf7c l1
00000000`014edf7c 00000000`00000001

It seems that PHP is trying to acquire an exclusive lock for a range of 1 byte starting at offset 0 in this file, with NtLockFile configured to wait until it acquires the lock.
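
In Win32 terms, this is roughly equivalent to the following call (a sketch on my part; I am assuming that PHP’s flock implementation bottoms out in something like LockFileEx):

#include <windows.h>

/* Roughly the Win32 equivalent of the observed NtLockFile call: an
   exclusive, blocking lock of one byte at offset 0. */
BOOL LockSessionFile(HANDLE File)
{
    OVERLAPPED Ov;

    ZeroMemory(&Ov, sizeof(Ov));
    Ov.Offset     = 0; /* LockOffset = 0 */
    Ov.OffsetHigh = 0;

    /* LOCKFILE_EXCLUSIVE_LOCK without LOCKFILE_FAIL_IMMEDIATELY maps
       to ExclusiveLock = TRUE, FailImmediately = FALSE: wait as long
       as it takes for the lock to be granted. */
    return LockFileEx(File,
                      LOCKFILE_EXCLUSIVE_LOCK,
                      0,  /* reserved */
                      1,  /* LockLength (low):  1 byte */
                      0,  /* LockLength (high) */
                      &Ov);
}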

Putting this information together, it’s now possible to surmise what is going on here:

  1. The child processes created by PHP inherit a file handle to the session state file (most likely via handle inheritance at process creation).
  2. PHP tries to acquire an exclusive lock on part of the session state file. This takes the file object lock for that file and waits for the region to become exclusively available.
  3. The child process exits. As part of its handle rundown, it tries to acquire the file object lock so that it can close its inherited file handle. But that lock is held by PHP’s in-progress NtLockFile call, which itself will not complete while the child still has its handle open.
  4. Deadlock! Neither side can make progress, and PHP appears to hang instead of configuring my graphs properly. (A sketch of this sequence of events follows below.)
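
To make the shape of the problem concrete, here is a hypothetical repro sketch of that sequence in plain Win32 code. Everything in it is my own construction: the file name, the child command line, and the second handle standing in for whichever other party holds the conflicting byte-range lock are all invented for illustration, error handling is elided, and I make no promises that this wedges any given kernel build in exactly the way the PHP/RRDTool combination did:

#include <windows.h>

int main(void)
{
    SECURITY_ATTRIBUTES Sa = { sizeof(Sa), NULL, TRUE /* inheritable */ };
    STARTUPINFOA Si = { sizeof(Si) };
    PROCESS_INFORMATION Pi;
    OVERLAPPED Ov = { 0 };
    HANDLE File, Other;
    char Cmd[] = "cmd.exe /c ping -n 3 127.0.0.1 >nul";

    /* 1. Open the "session file" with an inheritable handle, opened
          for synchronous I/O (no FILE_FLAG_OVERLAPPED). */
    File = CreateFileA("sess_test", GENERIC_READ | GENERIC_WRITE,
                       FILE_SHARE_READ | FILE_SHARE_WRITE, &Sa,
                       CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

    /* Stand-in for whoever holds the conflicting byte-range lock: a
       second file object for the same file, with byte 0 locked. */
    Other = CreateFileA("sess_test", GENERIC_READ | GENERIC_WRITE,
                        FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    LockFileEx(Other, LOCKFILE_EXCLUSIVE_LOCK | LOCKFILE_FAIL_IMMEDIATELY,
               0, 1, 0, &Ov);

    /* 2. Spawn a short-lived child that inherits File's handle. */
    CreateProcessA(NULL, Cmd, NULL, NULL, TRUE /* bInheritHandles */,
                   0, NULL, NULL, &Si, &Pi);

    /* 3. Block in NtLockFile waiting for the conflicting lock; the
          file object lock is held for the duration of the wait. When
          the child exits a couple of seconds later, its handle rundown
          must close the inherited handle, which needs that same file
          object lock. */
    LockFileEx(File, LOCKFILE_EXCLUSIVE_LOCK, 0, 1, 0, &Ov);

    /* 4. Never reached while the lock taken via Other is outstanding. */
    return 0;
}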

In this particular instance, it was actually possible to “recover” from the deadlock without rebooting; the IIS worker process’s wait in NtLockFile is marked as a UserMode wait, so it is possible to terminate the w3wp.exe process, which releases the file object lock and ultimately allows all the frozen processes that are trying to close a handle to the PHP session state file to finish the close handle operation and exit.

This is actually a nasty little problem; it looks like it’s possible for one user mode process to indefinitely freeze another user mode process in kernel mode via a deadlock. Although you can break the deadlock by terminating the second user mode process, the fact that a user mode process can cause the kernel to deadlock during process exit at all (“breakable” or not) does not strike me as a good thing.

Meanwhile, knowing all of this doesn’t really solve my problem. Furthermore, I suspect that there’s a different problem here as well, as the command line that was given to RRDTool (simply “-“) doesn’t look all that valid to me. I’ll see if I can come up with some way to work around the deadlock, but it definitely looks like an unpleasant one. If the root cause really is a file handle being incorrectly inherited by a child process, then it might be possible to un-mark that handle for inheritance with some work (sketched below). That I am having to consider patching PHP to work around this is definitely not a happy thought, though…
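
The un-marking itself would be trivial, assuming one could find the right spot in PHP’s Win32 code to do it; a sketch (the function name is mine):

#include <windows.h>

/* Clear HANDLE_FLAG_INHERIT on the session file handle, so that
   CreateProcess(..., bInheritHandles = TRUE, ...) no longer duplicates
   it into children such as cmd.exe/rrdtool. */
BOOL StopInheritingSessionHandle(HANDLE SessionFile)
{
    return SetHandleInformation(SessionFile, HANDLE_FLAG_INHERIT, 0);
}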

Silly me for thinking that it would just take a kernel debugger to get a web application running…

Debugger internals: Why do ntoskrnl and ntdll have type information?

February 14th, 2007

Johan Johansson sent me a mail asking why nt and ntdll have partial type information included. (If you’re at all experienced with debugging on Windows, you’ll know that public symbols, such as those Microsoft ships on the public symbol server at http://msdl.microsoft.com/download/symbols, don’t typically include type information; instead, one typically needs access to private symbols in order to view types.)

However, nt and ntdll are an exception to this rule on Windows XP and later. Unlike all the other PDBs shipped by Microsoft, the ones corresponding to ntdll and ntoskrnl do include type information for a seemingly arbitrary mix of types, some publicly documented and some undocumented. There is, however, a method to the madness with respect to which symbols are included in the public nt/ntdll PDBs.

To understand what symbols are chosen and why, though, it’s necessary to know a bit of history.

Way back in the days when Windows NT was still called Windows NT (and not Windows 2000 or Windows XP), the debugger scene was a much less friendly place. In those days, “remote debugging” involved dialing up the remote machine with a modem, and if you wanted to kernel debug a target, you had to run a different kernel debugger program specific to the architecture of the target computer.

Additionally, you had to use a debugger version that was newer than the operating system on the target computer, or things wouldn’t work out very well. Furthermore, one had to load architecture-specific extension DLLs for many kernel debugging tasks. One reason for these restrictions is that, across architectures (and OS releases), the size and layout of many internal structures used by the debugger (and debugger extension modules) to do their work varied. In other words, the size and layout of, say, EPROCESS might not be the same on Windows NT 3.1 for x86 as on Windows NT 4.0 for the DEC Alpha.

When Windows 2000 was released, things became a little bit better: Windows 2000 only publicly supported x86, which reduced the number of architectures that WinDbg needed to support going forward. However, Windows XP and Windows Server 2003 reintroduced mainstream support for non-x86 architectures (first IA64, and then x64).

At some point on the road to Windows XP and Windows Server 2003, a decision was made to clean things up from the debugger perspective and introduce a more manageable way of supporting a large matrix of operating systems and target architectures.

Part of the solution devised involved providing a unified, future-compatible method for accessing data on the remote system (where possible, anyway; obviously, new or radically redesigned OS functionality would require debugger extension changes, but simple structure size changes shouldn’t require such drastic measures). In the common case, all that a debugger extension does is quickly reformat data from the target system into an easily human-readable format (such as the process list returned by !process). A unified way to communicate structure sizes and layouts to debugger extensions (and to the debugger engine itself) would therefore greatly reduce the headache of supporting the debugger on an ever-expanding set of platforms and operating systems. The solution involved putting the structures used by the debugger engine itself and by many debugger extensions into the symbol files shipped with ntoskrnl and ntdll, and then providing a well-defined API for retrieving that type information from the perspective of a debugger extension.
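
For instance, with the classic wdbgexts-style extension interface, an extension can resolve structure offsets at run time instead of hardcoding them per OS build. A minimal sketch (assuming ExtensionApis has already been initialized by the extension’s WinDbgExtensionDllInit callback, with error handling elided):

#define KDEXT_64BIT
#include <windows.h>
#include <wdbgexts.h>

WINDBG_EXTENSION_APIS ExtensionApis;

/* Ask the debugger for a field offset using the type information in
   ntoskrnl's PDB, rather than baking in the EPROCESS layout for every
   build and architecture. */
VOID ShowImageFileNameOffset(VOID)
{
    ULONG Offset;

    /* GetFieldOffset returns 0 on success. */
    if (GetFieldOffset("nt!_EPROCESS", "ImageFileName", &Offset) == 0)
        dprintf("EPROCESS::ImageFileName is at offset +0x%x\n", Offset);
}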

Fast-forward to 2007. Now, there is a single, unified debugger engine for all target platforms and architectures (the i386kd.exe and ia64kd.exe that ship with the DTW distribution are essentially the same as the plain kd.exe; they are vestigial remains of a bygone era, retained simply for backwards compatibility with scripts and programs that drive the debugger), real remote debugging exists, and your debugger doesn’t break every time a service pack is released. All of this is made possible in part by the symbol information available in the ntoskrnl and ntdll PDB files. This is why you can use WinDbg 6.6.7.5 to debug Windows Vista RTM, despite the fact that WinDbg 6.6.7.5 was released months before the final RTM build shipped.

Symbol support is also the reason why there are no ‘srv03ext’, ‘srv03fre’, or ‘srv03chk’ extension directories under your Debugging Tools for Windows folder; the nt4chk/nt4fre/w2kchk/w2kfre directories contain debugger extensions specific to those Windows builds. Due to the new unified debugger architecture, there is no longer a need to tie an extension binary to a particular operating system build going forward. Because Windows 2000 and Windows NT 4.0 don’t include type data, however, the old directories still remain for backwards compatibility with those platforms.

So, to answer Johan’s question: every symbol in the ntoskrnl and ntdll PDBs should be used by the debugger engine itself, or by some debugger extension, somewhere. That, to my knowledge, is the final determining factor in which types are exposed via those PDBs: whether a public debugger extension DLL (or the debugger engine itself) uses them.