Archive for the 'WDK' Category

Windows Hotpatching: A Walkthrough

As discussed in the last post, Windows 2003 SP1 introduced a technology known as Hotpatching. An integral part of this technology is the hotpatching step proper, which refers to the process of applying an update on the fly by using runtime code modification techniques.

Although Hotpatching has caught a bit of attention, surprisingly little information has been published about its inner workings. As the technology is patented, however, there is quite a bit of information that can be obtained by reading the patent description. Moreover, there is this (admittedly very terse) discussion about the actual implementation of hotpatching.

Armed with this information, it is possible to get into more detail by looking at what is actually happening under the hood when a hotfix is applied: I did so and chose KB911897 as an example, which fixes some flaw in mrxsmb.sys and rdbss.sys. I have also gone through the hassle of translating key parts of the respective assembly code back to C.

Preparing the machine

First, we need a proper machine image which can be used for the experiment. Unfortunately, KB911897 is an SP1 package, so we have to use an old Win 2003 Server SP1 system to apply this update. Once we have the machine running, we can attach the kernel debugger and see what is happening when the hotfix is installed.

Observing the update

When launched with /hotpatch:enable, after some initialization work, the updater calls NtSetSystemInformation (which delegates to ExApplyCodePatch) to apply the hotpatch. A hotpatch-enabled hotfix comprises a coldpatch, which I do not care about here, and the actual hotpatch. The first two calls to NtSetSystemInformation (and thus to ExApplyCodePatch) are coldpatching-related and I will thus ignore them here. The third call, however, is made to apply the actual hotpatch, so let’s observe this one further.

Since a kernel-mode patch is required, ExApplyCodePatch then calls MmHotPatchRoutine, which is where the fun starts. Expressed in C, MmHotPatchRoutine roughly looks like this (reverse engineered from assembly, might be slightly incorrect):

NTSTATUS MmHotPatchRoutine(
  __in PSYSTEM_HOTPATCH_CODE_INFORMATION RemoteInfo
  )
{
  UNICODE_STRING ImageFileName;
  DWORD Flags = RemoteInfo->Flags;
  PVOID ImageBaseAddress;
  PVOID ImageHandle;
  NTSTATUS Status, LoadStatus;
  PKTHREAD CurrentThread;

  ImageFileName.Length = RemoteInfo->KernelInfo.NameLength;
  ImageFileName.MaximumLength = RemoteInfo->KernelInfo.NameLength;
  ImageFileName.Buffer = ( PWSTR ) ( ( PBYTE ) RemoteInfo + RemoteInfo->KernelInfo.NameOffset );

  CurrentThread = KeGetCurrentThread();
  KeEnterCriticalRegion( CurrentThread );

  KeWaitForSingleObject(
    MmSystemLoadLock,
    WrVirtualMemory,
    0,
    0,
    0 );

  LoadStatus = MmLoadSystemImage(
    &ImageFileName,
    0,
    0,
    0,
    &ImageHandle,
    &ImageBaseAddress );
  if ( NT_SUCCESS( LoadStatus ) || LoadStatus == STATUS_IMAGE_ALREADY_LOADED )
  {

    Status = MiPerformHotPatch(
      ImageHandle,
      ImageBaseAddress,
      Flags );
    
    if ( NT_SUCCESS( Status ) || LoadStatus == STATUS_IMAGE_ALREADY_LOADED )
    {
      NOTHING;
    }
    else
    {
      MmUnloadSystemImage( ImageHandle );
    }
    
    LoadStatus = Status;
  }


  KeReleaseMutant(
    MmSystemLoadLock,
    1,  // increment
    FALSE,
    FALSE );

  KeLeaveCriticalRegion( CurrentThread );

  return LoadStatus;
}

As you see in the code, MmHotPatchRoutine will try to load the hotpatch image, which we can verify in the debugger:

kd> bp nt!MmLoadSystemImage

kd> g
Breakpoint 3 hit
nt!MmLoadSystemImage:
808ec4b5 6878010000      push    178h

kd> k
ChildEBP RetAddr  
f6acbb28 80990c9e nt!MmLoadSystemImage
f6acbb68 809b2d67 nt!MmHotPatchRoutine+0x59
f6acbba8 808caeff nt!ExApplyCodePatch+0x191
f6acbd50 8082337b nt!NtSetSystemInformation+0xa1e
f6acbd50 7c82ed54 nt!KiFastCallEntry+0xf8
0006bc50 7c821f24 ntdll!KiFastSystemCallRet
0006bd44 7c8304c9 ntdll!ZwSetSystemInformation+0xc
[...]

kd> dt _UNICODE_STRING poi(@esp+4)
ntdll!_UNICODE_STRING
 "\??\c:\windows\system32\drivers\hpf3.tmp"
   +0x000 Length           : 0x50
   +0x002 MaximumLength    : 0x50
   +0x004 Buffer           : 0x81623fa8  "\??\c:\windows\system32\drivers\hpf3.tmp"
   
kd> gu

kd> lm
start    end        module name
[...]           
f6ba4000 f6bad000   hpf3       (deferred)  
[...]
f95cb000 f9641000   mrxsmb     (deferred)  
f9641000 f9671000   rdbss      (deferred)      
[...]
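
The Length and MaximumLength values in the _UNICODE_STRING dump above are byte counts rather than character counts, which is consistent with MmHotPatchRoutine copying NameLength into both fields. A minimal user-mode sketch of that arithmetic, assuming 16-bit characters as in the kernel's UNICODE_STRING (the helper name is made up for illustration):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: the byte length a counted UTF-16 string would
 * report for an ASCII path, i.e. two bytes per character with no
 * terminating NUL included.
 */
static uint16_t CountedStringByteLength( const char *AsciiPath )
{
  return ( uint16_t ) ( strlen( AsciiPath ) * sizeof( uint16_t ) );
}
```

For the 40-character path of the hotpatch image, this yields 0x50, the value shown by the debugger.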

Having loaded the hotpatch image, MmHotPatchRoutine proceeds by calling MiPerformHotPatch, which looks roughly like this:

NTSTATUS
MiPerformHotPatch(
  IN PLDR_DATA_TABLE_ENTRY ImageHandle,
  IN PVOID ImageBaseAddress,
  IN DWORD Flags
  )
{
  PHOTPATCH_HEADER SectionData ;
  PRTL_PATCH_HEADER Header;    
  NTSTATUS Status;
  PVOID LockVariable;
  PVOID LockedBuffer;
  BOOLEAN f;
  PLDR_DATA_TABLE_ENTRY LdrEntry;

  SectionData = RtlGetHotpatchHeader( ImageBaseAddress );
  if ( ! SectionData  )
  {
    return STATUS_INVALID_PARAMETER;
  }
  
  //
  // Try to get header from MiHotPatchList
  //
  Header = RtlFindRtlPatchHeader(
    MiHotPatchList,
    ImageHandle );

  if ( ! Header )
  {
    PLIST_ENTRY Entry;

    if ( Flags & FLG_HOTPATCH_ACTIVE )
    {
      return STATUS_NOT_SUPPORTED;
    }

    Status = RtlCreateHotPatch(
      &Header,
      SectionData,
      ImageHandle,
      Flags
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      return Status;
    }

    ExAcquireResourceExclusiveLite(
      PsLoadedModuleResource,
      TRUE
      );

    Entry = PsLoadedModuleList->Flink;
    while ( Entry != PsLoadedModuleList )
    {
      LdrEntry = CONTAINING_RECORD( Entry,
                                    KLDR_DATA_TABLE_ENTRY,
                                    InLoadOrderLinks );
      Entry = Entry->Flink;

      //
      // Only consider images outside the session image range.
      //
      if ( LdrEntry->DllBase < MiSessionImageStart ||
           LdrEntry->DllBase >= MiSessionImageEnd )
      {
        if ( RtlpIsSameImage( Header, LdrEntry ) )
        {
          break;
        }
      }
    }

    ExReleaseResourceLite( PsLoadedModuleResource );

    if ( ! Header->TargetDllBase )
    {
      RtlFreeHotPatchData( Header );
      return STATUS_DLL_NOT_FOUND;
    }

    Status = ExLockUserBuffer(
      ImageHandle->DllBase,
      ImageHandle->SizeOfImage,
      KernelMode,
      IoWriteAccess,
      &LockedBuffer,
      &LockVariable
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      RtlFreeHotPatchData( Header );
      return Status;
    }


    Status = RtlInitializeHotPatch(
      ( PRTL_PATCH_HEADER ) Header,
      ( PBYTE ) LockedBuffer - ( PBYTE ) ImageHandle->DllBase
      );

    ExUnlockUserBuffer( LockVariable );

    if ( ! NT_SUCCESS( Status ) )
    {
      RtlFreeHotPatchData( Header );
      return Status;
    }

    f = 1;
  }
  else
  {
    if ( ( Flags ^ Header->CodeInfo->Flags ) & FLG_HOTPATCH_ACTIVE )
    {
      return STATUS_NOT_SUPPORTED;
    }

    if ( ! ( Header->CodeInfo->Flags & FLG_HOTPATCH_ACTIVE ) )
    {
      Status = RtlReadHookInformation( Header );
      if ( ! NT_SUCCESS( Status ) )
      {
        return Status;
      }
    }

    f = 0;
  }
  
  Status = MmLockAndCopyMemory(
    Header->CodeInfo,
    KernelMode
    );
  if ( NT_SUCCESS( Status ) )
  {
    if ( ! f  )
    {
      return Status;
    }

    LdrEntry->EntryPointActivationContext = Header;  // ???
    InsertTailList( MiHotPatchList, LdrEntry->PatchList );
  }
  else
  {
    if ( f ) 
    {
      RtlFreeHotPatchData( Header );
    }
  }

  return Status;
}

So MiPerformHotPatch inspects the hotpatch information stored in the hotpatch image. This data includes information about which code regions need to be updated. After the necessary information has been gathered, it applies the code changes.

Two basic problems have to be overcome now: On the one hand, all code sections of drivers are mapped read/execute only, so simply overwriting the instructions does not work. On the other hand, the system has to properly synchronize the patching process, i.e. it has to make sure no CPU is currently executing the code that is about to be patched.

To overcome the memory protection problem, Windows employs a trick I previously only knew from malware: It creates a memory descriptor list (MDL) for the affected code region, maps the MDL, and updates the code through this mapped region. The memory protection is thus circumvented. As it turns out, there is even a handy, undocumented helper routine for this purpose: ExLockUserBuffer, which is used by MiPerformHotPatch.

To proceed, MiPerformHotPatch calls MmLockAndCopyMemory to do the actual patching. So how does Windows synchronize the update process? Again, it uses a technique I assumed was a malware trick: It schedules CPU-specific DPCs on all CPUs but the current one and keeps those DPCs busy while the current thread is updating the code. Again, Windows provides a neat routine for that: KeGenericCallDpc. In addition to this, Windows raises the IRQL to clock level in order to mask all interrupts at or below that level.

Here is the pseudo-code for MmLockAndCopyMemory and its helper, MiDoCopyMemory:

NTSTATUS
MmLockAndCopyMemory (
    IN PSYSTEM_HOTPATCH_CODE_INFORMATION PatchInfo,
    IN KPROCESSOR_MODE ProbeMode
    )
{
  PVOID *Buffer;
  NTSTATUS Status;
  UINT Index;

  if ( 0 == PatchInfo->CodeInfo.DescriptorsCount )
  {
    return STATUS_SUCCESS;
  }

  Buffer = ExAllocatePoolWithQuotaTag( 
    9,
    PatchInfo->CodeInfo.DescriptorsCount * 2,
    'PtoH' );
  if ( ! Buffer )
  {
    return STATUS_INSUFFICIENT_RESOURCES;
  }
  RtlZeroMemory( Buffer, PatchInfo->CodeInfo.DescriptorsCount * 2 );

  if ( 0 == PatchInfo->CodeInfo.DescriptorsCount )
  {
    Status = STATUS_INVALID_PARAMETER;
    goto Cleanup;
  }

  for ( Index = 0; Index < PatchInfo->CodeInfo.DescriptorsCount; Index++ )
  {
    if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset > PatchInfo->InfoSize ||
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize > PatchInfo->InfoSize ||
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset +
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize > PatchInfo->InfoSize || 
       /* other checks... */ )
    {
      Status = STATUS_INVALID_PARAMETER;
      goto Cleanup;
    }

    Status = ExLockUserBuffer(
      TargetAddress,
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize,
      ProbeMode,
      IoWriteAccess,
      &PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
      &Buffer[ Index ]
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      goto Cleanup;
    }
  }

  PatchInfo->Flags |= FLG_HOTPATCH_ACTIVE;

  KeGenericCallDpc(
    MiDoCopyMemory,
    PatchInfo );

  if ( PatchInfo->Flags & FLG_HOTPATCH_VERIFICATION_ERROR )
  {
    PatchInfo->Flags &= ~FLG_HOTPATCH_ACTIVE;
    PatchInfo->Flags &= ~FLG_HOTPATCH_VERIFICATION_ERROR;
    Status = STATUS_DATA_ERROR;
  }

Cleanup:
  if ( PatchInfo->CodeInfo.DescriptorsCount > 0 )
  {
    for ( Index = 0; Index < PatchInfo->CodeInfo.DescriptorsCount; Index++ )
    {
      ExUnlockUserBuffer( Buffer[ Index ] );
    }
  }

  ExFreePoolWithTag( Buffer, 0 );
  return Status;
}

VOID MiDoCopyMemory(
  IN PKDPC Dpc,
  IN PSYSTEM_HOTPATCH_CODE_INFORMATION PatchInfo,
  IN ULONG NumberCpus,
  IN DEFERRED_REVERSE_BARRIER ReverseBarrier
  )
{
  KIRQL OldIrql;
  NTSTATUS Status;
  ULONG Index;

  UNREFERENCED_PARAMETER( Dpc );

  OldIrql = KfRaiseIrql( CLOCK1_LEVEL );

  //
  // Decrement reverse barrier count.
  //
  Status = KeSignalCallDpcSynchronize( ReverseBarrier );
  if ( ! NT_SUCCESS( Status ) )
  {
    goto Cleanup;
  }

  PatchInfo->Flags &= ~FLG_HOTPATCH_VERIFICATION_ERROR;
    
  for ( Index = 0; Index < PatchInfo->CodeInfo.DescriptorsCount; Index++ )
  {
    if ( PatchInfo->Flags & FLG_HOTPATCH_ACTIVE )
    {
      if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationSize != 
        RtlCompareMemory(
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
          ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationOffset,
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationSize ) )
      {

        if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize != 
          RtlCompareMemory(
            PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
            ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].OrigCodeOffset,
            PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize ) )
        {
          PatchInfo->Flags |= FLG_HOTPATCH_VERIFICATION_ERROR;
          break;
        }
      }
    }
    else
    {
      if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize !=
        RtlCompareMemory(
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
          ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset,
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize ) )
      {
        PatchInfo->Flags |= FLG_HOTPATCH_VERIFICATION_ERROR;
        break;
      }
    }
  }

  //loc_479533
  if ( PatchInfo->Flags & FLG_HOTPATCH_VERIFICATION_ERROR ||
     PatchInfo->CodeInfo.DescriptorsCount <= 0 )
  {
    goto Cleanup;
  }

  for ( Index = 0; Index < PatchInfo->CodeInfo.DescriptorsCount; Index++ )
  {
    PVOID Source;
    if ( PatchInfo->Flags & FLG_HOTPATCH_ACTIVE )
    {
      Source = ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset;
    }
    else
    {
      Source = ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].OrigCodeOffset;
    }

    RtlCopyMemory(
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
      Source,
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize
      );
  }


Cleanup:
   KeSignalCallDpcSynchronize( ReverseBarrier );
   KfLowerIrql( OldIrql );
   KeSignalCallDpcDone( NumberCpus );
}
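
Stripped of the DPC and IRQL machinery, the core of MiDoCopyMemory is a check-then-write step: compare the bytes currently at the target with what the patch expects, and only then overwrite them. A minimal user-mode sketch of that idea, assuming a single descriptor and plain memory (the function and parameter names are illustrative and not taken from the kernel):

```c
#include <stddef.h>
#include <string.h>

/*
 * Overwrite Size bytes at Target with NewCode, but only if Target
 * currently holds Expected. Returns 1 on success, 0 if validation
 * failed, in which case nothing has been written.
 */
static int PatchIfUnchanged(
  unsigned char *Target,
  const unsigned char *Expected,
  const unsigned char *NewCode,
  size_t Size
  )
{
  if ( memcmp( Target, Expected, Size ) != 0 )
  {
    return 0;
  }

  memcpy( Target, NewCode, Size );
  return 1;
}
```

Undoing a patch is the same call with Expected and NewCode swapped, which mirrors how the pseudo-code above picks CodeOffset or OrigCodeOffset depending on FLG_HOTPATCH_ACTIVE.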

To see the code in action, we set a breakpoint on nt!MiDoCopyMemory:

kd> k
ChildEBP RetAddr  
f6acbac0 8087622f nt!MiDoCopyMemory
f6acbae8 80990a10 nt!KeGenericCallDpc+0x3d
f6acbb0c 80990bea nt!MmLockAndCopyMemory+0xf1
f6acbb34 80990cba nt!MiPerformHotPatch+0x143
f6acbb68 809b2d67 nt!MmHotPatchRoutine+0x75
f6acbba8 808caeff nt!ExApplyCodePatch+0x191
f6acbd50 8082337b nt!NtSetSystemInformation+0xa1e

Before letting MiDoCopyMemory do its work, let’s see what it is about to do. No modifications have yet been done to mrxsmb:

kd> !chkimg mrxsmb
0 errors : mrxsmb 

kd> !chkimg rdbss
0 errors : rdbss

The second argument is a structure holding the information gathered previously; peeking into it reveals:

kd> dd /c 1 poi(esp+8) l 4
81583008  00000001
8158300c  00000149
81583010  00000008   <-- # of code patches
81583014  f9648b1f   <-- hmm...

As it turns out, address 81583014 refers to a variable length array of size 8. Poking around with dd, the following listing suggests that each element is 28 bytes in size:

kd> dd /c 7 81583014
81583014  f9648b1f fa2afb1f 000000ec 00000005 000000f1 000000f6 00000005
81583030  f9648b24 fa2b2b24 000000fb 00000002 000000fd 000000ff 00000002
8158304c  f96585ef fa2b15ef 00000101 00000005 00000106 0000010b 00000005
81583068  f96585f4 fa2b45f4 00000110 00000002 00000112 00000114 00000002
81583084  f9658569 fa2b3569 00000116 00000005 0000011b 00000120 00000005
815830a0  f965856e fa2b656e 00000125 00000002 00000127 00000129 00000002
815830bc  f9653378 fa2b5378 0000012b 00000005 00000130 00000135 00000005
815830d8  f965337d fa2b837d 0000013a 00000005 0000013f 00000144 00000005

Given that rdbss was loaded to address range f9641000-f9671000, it is obvious that the first 2 columns refer to code addresses. The third, fifth and sixth columns look like offsets, the fourth and seventh like the length of the code change. First, let’s see where the first column points to:

kd> u f9648b1f
rdbss!RxInitiateOrContinueThrottling+0x6b:
f9648b1f 90              nop
f9648b20 90              nop
f9648b21 90              nop
f9648b22 90              nop
f9648b23 90              nop
rdbss!RxpCancelRoutine:
f9648b24 8bff            mov     edi,edi
f9648b26 55              push    ebp
f9648b27 8bec            mov     ebp,esp

Now that looks promising, especially since the fourth column holds the value 5. Let’s look at the second row:

kd> u f9648b24
rdbss!RxpCancelRoutine:
f9648b24 8bff            mov     edi,edi

No doubt, the first and second row define the two patches necessary to redirect RxpCancelRoutine. But what to replace this code with? As it turns out, the offsets in column three are relative to the structure and point to the code that is to be written:

kd> u poi(esp+8)+000000ec
815830f4 e9dcc455fd      jmp     7eadf5d5          mov     edi,edi

kd> u poi(esp+8)+000000fb
81583103 ebf9            jmp     815830fe

That makes perfect sense: the five nops are to be overwritten by a near jump, and the mov edi, edi will be replaced by a short jump.
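
Putting these observations together, each 28-byte row can be pictured as the following structure. This is a sketch inferred from the dump: the field names are borrowed from the pseudo-code above, their order is merely one reading of the columns, and fixed-width types are used so that the layout holds regardless of the compiler's pointer size:

```c
#include <stdint.h>

/* Assumed layout of one row in the dump (7 x 4 bytes on x86). */
typedef struct _PATCH_DESCRIPTOR_SKETCH
{
  uint32_t TargetAddress;     /* column 1: address inside the patched driver */
  uint32_t MappedAddress;     /* column 2: presumably the mapped alias used for writing */
  uint32_t CodeOffset;        /* column 3: offset of the new code */
  uint32_t CodeSize;          /* column 4: length of the new code */
  uint32_t OrigCodeOffset;    /* column 5: offset of the original code */
  uint32_t ValidationOffset;  /* column 6: offset of the validation bytes */
  uint32_t ValidationSize;    /* column 7: length of the validation bytes */
} PATCH_DESCRIPTOR_SKETCH;

/* First row of the dump, for reference. */
static const PATCH_DESCRIPTOR_SKETCH FirstRow =
{
  0xf9648b1f, 0xfa2afb1f, 0x000000ec, 0x00000005,
  0x000000f1, 0x000000f6, 0x00000005
};
```

Read this way, the three offsets of the first row follow each other back to back: 0xec plus 5 bytes of code gives 0xf1, and 0xf1 plus 5 gives 0xf6.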

So let’s run MiDoCopyMemory and have a look at the results. Back in MmLockAndCopyMemory, the code referred to by the first two rows looks like this:

kd> u f9648b1f
rdbss!RxInitiateOrContinueThrottling+0x6b:
f9648b1f e9dcc455fd      jmp     hpf3!RxpCancelRoutine (f6ba5000)

kd> u f9648b24
rdbss!RxpCancelRoutine:
f9648b24 ebf9            jmp     rdbss!RxInitiateOrContinueThrottling+0x6b (f9648b1f)
f9648b26 55              push    ebp
f9648b27 8bec            mov     ebp,esp

Voilà, RxpCancelRoutine has been patched and calls are redirected to hpf3!RxpCancelRoutine, the new routine located in the auxiliary ‘hpf3’ driver. All that remains to be done is cleanup (unlocking the memory etc).
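
The bytes written by the patch can be reproduced by hand. On x86, a near jump is opcode E9 followed by a 32-bit displacement, and a short jump is opcode EB followed by an 8-bit displacement, both relative to the address of the next instruction. A small sketch of that arithmetic (the helper names are illustrative):

```c
#include <stdint.h>

/* Encode "jmp rel32" (5 bytes) located at From and targeting To. */
static void EncodeNearJmp( uint32_t From, uint32_t To, uint8_t Out[ 5 ] )
{
  uint32_t Rel = To - ( From + 5 );   /* wraps modulo 2^32 as intended */

  Out[ 0 ] = 0xE9;
  Out[ 1 ] = ( uint8_t ) ( Rel );
  Out[ 2 ] = ( uint8_t ) ( Rel >> 8 );
  Out[ 3 ] = ( uint8_t ) ( Rel >> 16 );
  Out[ 4 ] = ( uint8_t ) ( Rel >> 24 );
}

/*
 * Encode "jmp rel8" (2 bytes) located at From and targeting To.
 * Returns 0 if the target is out of range for a short jump.
 */
static int EncodeShortJmp( uint32_t From, uint32_t To, uint8_t Out[ 2 ] )
{
  int32_t Rel = ( int32_t ) ( To - ( From + 2 ) );

  if ( Rel < -128 || Rel > 127 )
  {
    return 0;
  }

  Out[ 0 ] = 0xEB;
  Out[ 1 ] = ( uint8_t ) ( int8_t ) Rel;
  return 1;
}
```

Fed with the addresses from the listing (f9648b1f to f6ba5000 for the near jump, f9648b24 to f9648b1f for the short jump), these helpers produce e9 dc c4 55 fd and eb f9, the bytes seen in the debugger.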

That’s it — that’s how Windows applies patches on the fly using hotpatching. Too bad that the technology is so rarely used in practice.


Windows Hotpatching

Several years ago, with Windows Server 2003 SP1, Microsoft introduced a technology and infrastructure called Hotpatching. The basic intent of this infrastructure is to provide a means to apply hotfixes on the fly, i.e. without having to reboot the system, even if the hotfix contains changes to critical system components such as the kernel itself, important drivers, or user mode libraries such as shell32.dll.

Trying to apply hotfixes on the fly introduces a variety of problems, the most important being:

  • Patching code that is currently in use
  • Atomically replacing files on disk that are currently in use and therefore locked
  • Making sure that all changes take effect for both processes currently running and processes which are yet to be started (i.e. before the next reboot)
  • Allowing further hotfixes to be applied on a system that has not been rebooted since the last hotfix was applied in an on-the-fly fashion

The Windows Hotpatching infrastructure is capable of handling all these cases — it is, however, not applicable to all kinds of code fixes. Generally speaking, it can only be used for fixes that merely comprise smallish code changes but do not affect layout or semantics of data structures. A fix for a buffer overflow caused by an off-by-one error, however, is a perfect example for a fix that could certainly be applied using the Hotpatching infrastructure.
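
To make the distinction concrete, here is a hypothetical example of the kind of fix that qualifies: the bug is a single wrong comparison, and the fix changes code only while leaving the structure layout untouched. The type and function names are invented for illustration:

```c
#include <stddef.h>
#include <string.h>

#define NAME_CAPACITY 8

typedef struct _RECORD
{
  char Name[ NAME_CAPACITY ];
  int  Id;
} RECORD;

/*
 * Copy Len characters of Name into Rec and NUL-terminate the result.
 * Returns 1 on success, 0 if the name does not fit.
 */
static int SetName( RECORD *Rec, const char *Name, size_t Len )
{
  /*
   * The flawed version of this check read "Len > NAME_CAPACITY",
   * which let a name of exactly NAME_CAPACITY characters through and
   * wrote the terminating NUL one byte past the array.
   */
  if ( Len >= NAME_CAPACITY )
  {
    return 0;
  }

  memcpy( Rec->Name, Name, Len );
  Rec->Name[ Len ] = '\0';
  return 1;
}
```

The layout of RECORD is identical before and after the fix, so code that was compiled against the old version keeps working with the patched routine.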

That all sounds good and nice, but reality is that we still reboot our machines for just about every update Microsoft provides us, right?

Right. The answer to this is threefold. First, as indicated, some hotfixes can be expected to make changes that cannot be safely applied using the Hotpatching system. Secondly, Hotpatching is used on an opt-in basis, so you will not benefit from it automatically: When a hotpatch-enabled hotfix is applied through Windows Update or by launching the corresponding exe file, hotpatching is not used and a reboot will be required. The user has to explicitly specify the /hotpatch:enable switch in order to have the hotfix applied on the fly.

In the months after the release of SP1, a certain fraction of the hotfixes issued by Microsoft were indeed hotpatch-enabled and could be applied without a reboot. Interestingly, however, I am not aware of a single hotfix issued since Server 2003 SP2 that supported hotpatching!

And thirdly: Whether Microsoft has lost faith in their hotpatching facility, whether the effort to test such hotfixes turned out to be too high, or whether there were other reasons speaking against issuing hotpatch-enabled hotfixes, I do not know.

Notwithstanding this observation, Hotpatching is an interesting technology that deserves to be looked at in more detail. Although I will not cover the entire infrastructure, I will spend at least one more blog post on the mechanisms implemented in Windows that allow code modifications to be performed on the fly. That is, I will focus on the hotpatching part of the infrastructure and will ignore coldpatching and other, smaller aspects of the infrastructure.

What a weirdo: How the /analyze switch changes its behavior depending on its environment

In Visual Studio 2005 Team System (VSTS), the “ultimate” SKU of Visual Studio 2005, Microsoft introduced the /analyze compiler switch. When the /analyze switch is used, the cl compiler not only does its regular checks, but performs a much more thorough static code analysis.

While /analyze is very useful indeed, it was only available in the top SKU — the Standard and Professional versions of Visual Studio lacked support for this compiler switch (this has changed by now, Professional now also supports this feature). As some smart people quickly figured out though, the compilers shipped as part of the Windows SDK did support /analyze, too.

So given that some compilers do support /analyze while others do not, you may well expect that there are two slightly different types of binaries, one that the SDK and VSTS use, and one that is shipped with other Visual Studio SKUs.

At least this was what I expected. As it turns out though, this is not quite the case.

Where’s /analyze?

For the past two years, I have been developing using Visual Studio 2005 Team System along with Windows SDK 6.0 and WDK 6000 on a Vista x64 machine. Using this setup, I was able to use the /analyze switch in both “regular” Visual Studio projects and WDK (build.exe-driven) projects. That led me to the conclusion that the WDK 6000 compilers, like the SDK compilers, were in fact /analyze-enabled binaries as well.

Switching to a Windows 7 machine with VSTS 2005 and 2008, SDK 7.0, and WDK 6000 did not change this — /analyze kept working fine in all environments.

Then I set up a build server, installed WDK 6000 and Windows SDK 7.0 and attempted to perform a build. To my surprise, though, I got plenty of complaints about the /analyze switch not being supported.

I verified that the right compilers (WDK 6000) were used and compared cl versions between the build machine and my development machine. Both were 14.00.50727.220, so everything seemed right. Running cl.exe /? on both machines, however, I noticed that despite the versions being the same, this Code Analysis section was missing in the output on the build machine:

                         -CODE ANALYSIS-

/analyze[:WX-] enable code analysis
    WX- - code analysis warnings should not be treated as errors even if /WX is invoked

So obviously, Code Analysis support is enabled or disabled depending on external factors — not the binary itself, but the environment somehow determines whether the /analyze switch is supported or not.

Observing cl.exe /? with Process Monitor on my development machine resulted in the following output:

Process Monitor tracing the search for c1xxast

This trace leaves little room for interpretation: The code analysis features must (mainly) be implemented in c1xxast.dll. c1xxast.dll, however, is not shipped with the WDK itself, nor is it shipped with the non-VSTS SKUs of Visual Studio. So by default, the WDK’s cl will fail to locate the DLL and will revert to “/analyze-disabled mode”.

If, however, you have VSTS or the Windows SDK installed on your machine and your %PATH% happens to include the right directories, cl’s search for c1xxast.dll will succeed and (tada) /analyze suddenly works. On my development machine, this obviously was the case, whilst on the build machine, it was not.

Compiler version mish-mash

I added the Windows SDK’s bin directory to the build machine’s %PATH% and reran the build. As I expected, /analyze now worked fine. What I did not quite expect, though, was that I was now getting dozens of compilation warnings like:

warning C6309: Argument '1' is null: this does not adhere to function 
specification of 'CfixCreateThread'

The reason for this was simple: The WDK cl.exe (remember, version 14.00.50727.220), thanks to a proper %PATH%, now used c1xxast.dll from SDK 7 to perform code analysis — despite the fact that c1xxast.dll actually “belonged” to cl version 15.00.30729.01. So the c1xxast.dll was one generation ahead of the WDK I was using.

The really, really cool thing about cl being able to work with a newer c1xxast.dll is that you can continue using WDK 6000 or 6001 (with W2K support!) and still benefit from the latest-and-greatest static code analysis features.

The reason for getting several warnings on the build machine while not getting similar warnings on my development machine was simply that on my development machine, the VS 2005 directory preceded the SDK directory in my %PATH%. Once I switched the order, I got the same warnings on both machines. This leads me to:

The ugly thing about this, however, is that a tiny change in the order of directories in %PATH% can suddenly make a huge difference w.r.t. code analysis. This is not quite what you’d normally expect.

(The additional compiler warnings, by the way, were a result of the improved analysis checks in cl 15: cl 14 routinely failed to verify the usage of __in vs. __in_opt parameters; cl 15 has become much more precise here and found several mis-attributed function signatures.)
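
As an illustration of what such a mis-attributed signature looks like, consider the following hypothetical pair of routines (not taken from cfix). The first annotates its parameter as __in although the implementation tolerates NULL, so a caller passing NULL would be flagged along the lines of the C6309 warning quoted above; the second uses __in_opt and thereby states the actual contract. The fallback macro definitions are an assumption made only so that the snippet also compiles with compilers that lack these annotations:

```c
#include <stddef.h>
#include <string.h>

/* Fallbacks for compilers without the annotation macros. */
#ifndef __in
#define __in
#endif
#ifndef __in_opt
#define __in_opt
#endif

/* Mis-attributed: claims Name must not be NULL, yet handles NULL. */
size_t NameLengthStrict( __in const char *Name )
{
  return Name ? strlen( Name ) : 0;
}

/* Corrected: the annotation matches what the code accepts. */
size_t NameLengthOptional( __in_opt const char *Name )
{
  return Name ? strlen( Name ) : 0;
}
```

Both routines behave identically at run time; the difference only matters to the static analyzer, which is why such mismatches can go unnoticed until a more precise checker comes along.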

LTCG issues with the WIN7/amd64 environment of WDK 7600

Now that Windows 7 is out, we all sooner or later have to upgrade to WDK 7600. I am still reluctant to move away from WDK 6000/6001 because of the dropped W2K support, but this is a different issue.

However, as one cfix user who has obviously already adopted WDK 7600 kindly pointed out to me, linking a kernel mode unit test against cfix using WDK 7600 and the WIN7/amd64 environment fails reproducibly with the following error message:

error fatal error C1047: The object or library file ‘…\lib\amd64\cfixkdrv.lib’ was created with an older compiler than other objects; rebuild old objects and libraries

In contrast, building the same driver for WIN7/x86 works fine.

As the documentation for C1047 indicates, this error is usually related to inconsistent usage of Link Time Code Generation (LTCG): As soon as you use LTCG, all objects and libraries must be compiled with /GL. This normally is not a big deal, but as this WDK page rightfully explains, it means that libraries built this way are not suitable for redistribution because of their dependency on a specific compiler/linker version. But of course, it also means that a library not built using /GL cannot be used easily when you build your program using LTCG.

Before Windows 7, all WDK build environment configurations I am aware of did not use LTCG. Neither did cfix, so everything worked fine even if your compiler/linker versions did not match the ones used for building cfix.

With WDK 7600, this situation changes: While WLH and other downlevel build environments still do not seem to use LTCG, the WIN7 environment, at least for amd64, enables LTCG by default.

What this means is that as soon as you link against a library which is not part of the WDK and therefore likely to be built using a different compiler version, you’ll get C1047. It is thus no surprise that attempting to link against cfixkdrv.lib, which is the library all kernel mode unit tests have to link against and which itself has been built using WDK 6000, leads to the error quoted above.

However, once you have figured this out, the workaround for this issue is trivial: Disable LTCG for your test driver by adding the following line to your SOURCES file:


USER_C_FLAGS=/GL-

Link time code generation is a very powerful optimization technique and I encourage everybody to make use of it if possible. However, for the compatibility reasons outlined above, I consider it a rather stupid idea to have the WDK enable LTCG by default. Rather, I would have much preferred to see an opt-in switch for LTCG.

Anyway, for the next version, I will consider shipping both a 7600-compatible LTCG-enabled and a non-LTCG-enabled version of cfixkdrv.lib. Until then, the workaround described above will do the trick.

AuxKlibGetImageExportDirectory and forwarders

One of the newer additions to the DDK is the aux_klib library, which, among others, offers the routine AuxKlibGetImageExportDirectory. As its name suggests, AuxKlibGetImageExportDirectory offers a handy way to obtain a pointer to the export directory of a kernel module.

There is, however, one issue that, at least in my opinion, renders AuxKlibGetImageExportDirectory pretty much useless in most scenarios: Dealing with forwarders.

The primary motivation to call AuxKlibGetImageExportDirectory is to either enumerate the exports of a module or to find a specific export. In both cases, the code is likely to call at least one of the exported routines. To maintain binary compatibility, it would be risky for such code to rely on the fact that all exports that it aims to call are in fact ‘real’ exports and not forwarders. Rather, it is crucial to be prepared to find both types, exports and forwarders, in the export directory and handle each of them appropriately.

So we need to tell an export from a forwarder. As it turns out, this is not quite as easy as checking some flag. Quoting the Microsoft Portable Executable and Common Object File Format Specification on the content of the export address table:

Each entry in the export address table is a field that uses one of two formats in the following table. If the address specified is not within the export section (as defined by the address and length that are indicated in the optional header), the field is an export RVA, which is an actual address in code or data. Otherwise, the field is a forwarder RVA, which names a symbol in another DLL.

And this exactly is the problem — only being provided the PIMAGE_EXPORT_DIRECTORY pointer, we do not know the start and end RVA of the export section. As a consequence, identifying forwarders is infeasible when using AuxKlibGetImageExportDirectory — which in turn makes it a pretty much useless function.

Workaround

Although AuxKlibGetImageExportDirectory is handy, the work it performs is rather trivial. Therefore, it is not hard to come up with code that, given the load address of a module, finds the export directory and properly checks for the existence of forwarders. The following code shows how:

PIMAGE_DATA_DIRECTORY ExportDataDir;
PIMAGE_EXPORT_DIRECTORY ExportDirectory;
PIMAGE_DOS_HEADER DosHeader = ( PIMAGE_DOS_HEADER ) LoadAddress;
PIMAGE_NT_HEADERS NtHeader; 

PULONG FunctionRvaArray;
PUSHORT OrdinalsArray;

ULONG Index;

//
// Peek into PE image to obtain exports.
//
NtHeader = ( PIMAGE_NT_HEADERS ) 
  PtrFromRva( DosHeader, DosHeader->e_lfanew );
if( IMAGE_NT_SIGNATURE != NtHeader->Signature )
{
  //
  // Unrecognized image format.
  //
  return ...;
}

ExportDataDir = &NtHeader->OptionalHeader.DataDirectory
    [ IMAGE_DIRECTORY_ENTRY_EXPORT ];

ExportDirectory = ( PIMAGE_EXPORT_DIRECTORY ) PtrFromRva( 
  LoadAddress, 
  ExportDataDir->VirtualAddress );
  

if ( ExportDirectory->AddressOfNames == 0 ||
   ExportDirectory->AddressOfFunctions == 0 ||
   ExportDirectory->AddressOfNameOrdinals == 0 )
{
  //
  // This module does not have any exports.
  //
  return ...;
}

FunctionRvaArray = ( PULONG ) PtrFromRva(
  LoadAddress,
  ExportDirectory->AddressOfFunctions );

OrdinalsArray = ( PUSHORT ) PtrFromRva(
  LoadAddress,
  ExportDirectory->AddressOfNameOrdinals );

for ( Index = 0; Index < 
      ExportDirectory->NumberOfNames; Index++ )
{
  //
  // Get corresponding export ordinal.
  //
  USHORT Ordinal = ( USHORT ) OrdinalsArray[ Index ] 
    + ( USHORT ) ExportDirectory->Base;

  //
  // Get corresponding function RVA.
  //
  ULONG FuncRva = 
    FunctionRvaArray[ Ordinal - ExportDirectory->Base ];

  if ( FuncRva >= ExportDataDir->VirtualAddress && 
     FuncRva < ExportDataDir->VirtualAddress 
       + ExportDataDir->Size )
  {
    //
    // It is a forwarder.
    //
  }
  else 
  {
    //
    // It is an export.
    //
  }
}
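
The snippet relies on a PtrFromRva helper that is not shown, and it stops at the point where a forwarder has been detected. As a rough sketch that also compiles outside the kernel, the missing pieces could look like the following. The helper bodies and the names IsForwarderRva and SplitForwarder are illustrative assumptions and not part of any Windows header. A forwarder RVA points to an ASCII string of the form MODULE.Symbol (for example NTDLL.RtlAllocateHeap); splitting at the last dot is an assumption about how to treat module names that themselves contain dots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the PtrFromRva helper used above:
   adds an RVA to the load address of the module. */
static void *PtrFromRva(void *base, uint32_t rva)
{
    assert(base != NULL);
    return (unsigned char *)base + rva;
}

/* Same range test as in the loop above, written so that
   dir_rva + dir_size cannot overflow. */
static int IsForwarderRva(uint32_t rva, uint32_t dir_rva, uint32_t dir_size)
{
    return rva >= dir_rva && rva - dir_rva < dir_size;
}

/* Splits a forwarder string such as "NTDLL.RtlAllocateHeap" at its
   last dot. The module part is copied to 'module'; the return value
   points to the symbol part inside 'forwarder'. Returns NULL if the
   string is malformed or the buffer is too small. */
static const char *SplitForwarder(const char *forwarder,
                                  char *module, size_t module_len)
{
    const char *dot = strrchr(forwarder, '.');
    size_t len;

    if (dot == NULL || dot == forwarder || dot[1] == '\0')
        return NULL;

    len = (size_t)(dot - forwarder);
    if (len >= module_len)
        return NULL;

    memcpy(module, forwarder, len);
    module[len] = '\0';
    return dot + 1;
}
```

In the loop above, the string to split would be obtained with PtrFromRva( LoadAddress, FuncRva ) in the forwarder branch.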

Creating and embedding message tables with the WDK/build.exe

Although message tables play an important role in Windows, their tool support has always been somewhat limited, at least compared to string tables, for which Visual Studio even provides a graphical editor.

When it comes to creating and embedding message tables into a binary built with the WDK, documentation is light. However, the WDK tool chain provides support for mc files, and using it requires only a few steps.

1. Create a message file

Unsurprisingly, the first step is to write a message file — I will name it foobarmsg.mc. Here is an example file:

;//
;// The default is NTSTATUS -- but HRESULT works just as well.
;//
MessageIdTypedef=HRESULT

SeverityNames=(
  Success=0x0
  Informational=0x1
  Warning=0x2
  Error=0x3
)

FacilityNames=(
  Interface=4
)

LanguageNames=(English=0x409:MSG00409)

;//--------------------------------------------------------------------
MessageId		= 0x9000
Severity		= Warning
Facility		= Interface
SymbolicName	= FOOBAR_E_WEIRDFAILURE
Language		= English
Some weird failure has occurred.
.

2. Update the SOURCES file

The message file must be compiled by mc.exe. If we include the mc file in the SOURCES macro, build.exe will arrange this for us:

SOURCES=\
	foobar.c \
	foobarmsg.mc

To tell mc where to place the result files (i.e. the header and the resources), the following two macros can be used in the SOURCES file:

PASS0_HEADERDIR=..\..\include
PASS0_SOURCEDIR=obj$(BUILD_ALT_DIR)\$(TARGET_DIRECTORY)

As the names of the macros suggest, mc.exe is run during pass 0 (i.e. before any sources are compiled). Therefore, it is no problem to include the generated header file (foobarmsg.h) in the source files.
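
To get a feeling for what the generated header will contain, it helps to know how mc combines the fields of a message definition into a 32-bit value. The following is a minimal sketch, assuming the layout that mc describes in the comment block of the generated header (severity in bits 31-30, facility in bits 27-16, code in bits 15-0). MakeMessageId is a hypothetical name used only for illustration; mc emits a plain #define rather than a function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs severity, facility and code the way
   the message definition above is assumed to be encoded.
   Bits 29 and 28 (customer and reserved) are left at zero. */
static uint32_t MakeMessageId(uint32_t severity,
                              uint32_t facility,
                              uint32_t code)
{
    assert(severity <= 0x3u);    /* 2 bits  */
    assert(facility <= 0xFFFu);  /* 12 bits */
    assert(code <= 0xFFFFu);     /* 16 bits */

    return (severity << 30) | (facility << 16) | code;
}
```

With the values from the example message file (Severity Warning = 0x2, Facility Interface = 4, MessageId 0x9000), this yields 0x80049000. The top bit is set, so code testing the value with FAILED() would treat it as a failure.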

3. Update the rc file

Assuming the project already includes a .rc file for versioning information, we can use this file and refer to the generated message table resources. At the end of your project’s rc file, include the following line:

#include "foobarmsg.rc"

That’s it. The resulting binary will contain a proper message table.

How to use manifests with build.exe

As of Windows Vista, basically all applications require a manifest, if only to declare UAC compliance. Visual Studio has built-in support for creating and embedding manifests, so using manifests is straightforward when building applications with VS. However, when building a user mode application with the WDK and build.exe, things are a little different. The WDK documentation does not mention manifests at all, neither in the context of UAC nor of SXS. Judging from the documentation, it seems that the WDK does not provide any support for embedding manifests, which would mean that you have to tinker with the makefiles in order to invoke mt somewhere.

A look at makefile.new, however, reveals plenty of manifest-related rules, and as it turns out, there indeed is support for manifests. On lines 1866 to 1925 (WDK 6000), makefile.new even contains brief documentation on the usage of manifests, so it is not quite clear whether the omission from the official docs is intentional. In any case, using the information in makefile.new, it is straightforward to get build.exe to embed manifests in a binary.

To embed a manifest, first create the manifest file and name it myapp.manifest. In the SOURCES file, include these lines:

SXS_APPLICATION_MANIFEST=myapp.manifest
SXS_ASSEMBLY_VERSION=1.0
SXS_ASSEMBLY_NAME=MyApp
SXS_ASSEMBLY_LANGUAGE=0000

That’s it.

In order not to have to provide a separate manifest file for each processor architecture, the manifest is preprocessed before it is embedded. The following macros are available for use in the manifest:

  • SXS_ASSEMBLY_NAME (set in SOURCES)
  • SXS_ASSEMBLY_VERSION (set in SOURCES, defaults to 5.1.0.0)
  • SXS_ASSEMBLY_LANGUAGE (set in SOURCES, LCID or 0000 for neutral)
  • SXS_PROCESSOR_ARCHITECTURE (set automatically)

When using these macros, note that they will be replaced by quoted text. As a consequence, you have to use them, a bit unintuitively, without surrounding quotes:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly 
     xmlns="urn:schemas-microsoft-com:asm.v1" 
     manifestVersion="1.0"> 
  <assemblyIdentity version=SXS_ASSEMBLY_VERSION
     processorArchitecture=SXS_PROCESSOR_ARCHITECTURE
     name=SXS_ASSEMBLY_NAME/> 
  <description>Example</description>
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level="asInvoker" 
          uiAccess="false"/>
      </requestedPrivileges>
    </security>
  </trustInfo>
</assembly>

After preprocessing, the file is embedded into the binary as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!--  Copyright (c) Microsoft Corporation --> 
<assembly 
     xmlns="urn:schemas-microsoft-com:asm.v1" 
     manifestVersion="1.0">
  <assemblyIdentity version="1.0.0.0" 
     processorArchitecture="AMD64" 
     name="MyApp" /> 
  <description>Example</description> 
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
  <security>
    <requestedPrivileges>
      <requestedExecutionLevel level="asInvoker" 
        uiAccess="false" /> 
    </requestedPrivileges>
  </security>
  </trustInfo>
</assembly>
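
Why the quotes have to be left out becomes clearer when the substitution is replayed with the C preprocessor. The following sketch assumes that the macro values already carry their quotes, as the output above suggests. The actual mechanism in makefile.new is not shown here, and STRINGIZE and ExpandedAttribute are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumed stand-in for the value the build supplies:
   the quotes are part of the macro value itself. */
#define SXS_ASSEMBLY_NAME "MyApp"

/* Two-level stringizing so that the argument is macro-expanded
   before it is turned into a string. */
#define STRINGIZE_(x) #x
#define STRINGIZE(x) STRINGIZE_(x)

/* Returns the attribute text as it looks after macro replacement. */
static const char *ExpandedAttribute(void)
{
    const char *text = STRINGIZE(name=SXS_ASSEMBLY_NAME);

    /* The quotes in the result stem from the macro value. */
    assert(strchr(text, '"') != NULL);
    return text;
}
```

ExpandedAttribute() returns the text name="MyApp". Had the manifest been written as name="SXS_ASSEMBLY_NAME", the macro would sit inside a string literal and would not be replaced at all under these assumptions.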

About me

Johannes Passing, M.Sc., living in Berlin, Germany.

Besides his consulting work, Johannes mainly focusses on Win32, COM, and NT kernel mode development, along with Java and .Net. He also is the author of cfix, a C/C++ unit testing framework for Win32 and NT kernel mode, Visual Assert, a Visual Studio Unit Testing-AddIn, and NTrace, a dynamic function boundary tracing toolkit for Windows NT/x86 kernel/user mode code.

Contact Johannes: jpassing (at) acm org

Johannes' GPG fingerprint is BBB1 1769 B82D CD07 D90A 57E8 9FE1 D441 F7A0 1BB1.
