Archive for the 'Debugging' Category

Windows Hotpatching: A Walkthrough

As discussed in the last post, Windows 2003 SP1 introduced a technology known as Hotpatching. An integral part of this technology is Hotpatching, which refers to the process of applying an updated on the fly by using runtime code modification techniques.

Although Hotpatching has caught a bit of attention, suprisingly little information has been published about its inner workings. As the technology is patented, however, there is quite a bit of information that can be obtained by reading the patent description. Moreover, there is this (admittedly very terse) discussion about the actual implementation of hotpatching.

Armed with this information, it is possible to get into more detail by looking what is actually happening under the hood when a hoftix is applied: I did so and chose KB911897 as an example, which fixes some flaw in mrxsmb.sys and rdbss.sys. I have also gone through the hassle of translating key parts of the respective assembly code back to C.

Preparing the machine

First, we need a proper machine image which can be used for the experiment. Unfortunately, KB911897 is an SP1 package, so we have to use an old Win 2003 Server SP1 system to apply this update. Once we have the machine running, we can attach the kernel debugger and see what is happening when the hotfix is installed.

Observing the update

When launched with /hotpatch:enable, after some initialization work, the updater calls NtSetSystemInformation (which delegates to ExApplyCodePatch) to apply the hotpatch. Hotpatching includes a coldpatch, which I do not care about here and the actual hotpatch. The first two calls to NtSetSystemInformation (and thus to ExApplyCodePatch) are coldpatching-related and I will thus ignore them here. The third call, however, is made to apply the actual hotpatch, so let’s observe this one further.

Requiring a kernel mode-patch, ExApplyCodePatch then calls MmHotPatchRoutine, which is where the fun starts. Expressed in C, MmHotPatchRoutine, MmHotPatchRoutine roughly looks like this (reverse engineered from assembly, might be slightly incorrect):

NTSTATUS MmHotPatchRoutine(
  __in PSYSTEM_HOTPATCH_CODE_INFORMATION RemoteInfo
  )
{
  UNICODE_STRING ImageFileName;
  DWORD Flags = RemoteInfo->Flags;
  PVOID ImageBaseAddress;
  PVOID ImageHandle;
  NTSTATUS Status, LoadStatus;
  KTHREAD CurrentThread;

  ImageFileName.Length = RemoteInfo->KernelInfo.NameLength;
  ImageFileName.MaximumLength = RemoteInfo->KernelInfo.NameLength;
  ImageFileName.Buffer = ( PBYTE ) RemoteInfo + NameOffset;

  CurrentThread = KeGetCurrentThread();
  KeEnterCriticalRegion( CurrentThread );

  KeWaitForSingleObject(
    MmSystemLoadLock,
    WrVirtualMemory,
    0,
    0,
    0 );

  LoadStatus = MmLoadSystemImage(
    &ImageFileName,
    0,
    0,
    0,
    &ImageHandle,
    &ImageBaseAddress );
  if ( NT_SUCCESS( Status ) || Status == STATUS_IMAGE_ALREADY_LOADED )
  {

    Status = MiPerformHotPatch(
      ImageHandle,
      ImageBaseAddress,
      Flags );
    
    if ( NT_SUCCESS( Status ) || LoadStatus == STATUS_IMAGE_ALREADY_LOADED )
    {
      NOTHING;
    }
    else
    {
      MmUnloadSystemImage( ImageHandle );
    }
    
    LoadStatus = Status;
  }


  KeReleaseMutant(
    MmSystemLoadLock,
    1,  // increment
    FALSE,
    FALSE );

  KeLeaveCriticalRegion( CurrentThread );

  return LoadStatus;
}

As you see in the code, MmHotPatchRoutine will try load the hotpatch image — we can verify this in the debugger:

kd> bp nt!MmLoadSystemImage

kd> g
Breakpoint 3 hit
nt!MmLoadSystemImage:
808ec4b5 6878010000      push    178h

kd> k
ChildEBP RetAddr  
f6acbb28 80990c9e nt!MmLoadSystemImage
f6acbb68 809b2d67 nt!MmHotPatchRoutine+0x59
f6acbba8 808caeff nt!ExApplyCodePatch+0x191
f6acbd50 8082337b nt!NtSetSystemInformation+0xa1e
f6acbd50 7c82ed54 nt!KiFastCallEntry+0xf8
0006bc50 7c821f24 ntdll!KiFastSystemCallRet
0006bd44 7c8304c9 ntdll!ZwSetSystemInformation+0xc
[...]

kd> dt _UNICODE_STRING poi(@esp+4)
ntdll!_UNICODE_STRING
 "\??\c:\windows\system32\drivers\hpf3.tmp"
   +0x000 Length           : 0x50
   +0x002 MaximumLength    : 0x50
   +0x004 Buffer           : 0x81623fa8  "\??\c:\windows\system32\drivers\hpf3.tmp"
   
kd> gu

kd> lm
start    end        module name
[...]           
f6ba4000 f6bad000   hpf3       (deferred)  
[...]
f95cb000 f9641000   mrxsmb     (deferred)  
f9641000 f9671000   rdbss      (deferred)      
[...]

Having loaded the hotpatch image, MmHotPatchRoutine proceeds be calling MiPerformHotPatch, which looks about like this:

NTSTATUS
MiPerformHotPatch(
  IN PLDR_DATA_TABLE_ENTRY ImageHandle,
  IN PVOID ImageBaseAddress,
  IN DWORD Flags
  )
{
  PHOTPATCH_HEADER SectionData ;
  PRTL_PATCH_HEADER Header;    
  NTSTATUS Status;
  PVOID LockVariable;
  PVOID LockedBuffer;
  BOOLEAN f;
  PLDR_DATA_TABLE_ENTRY LdrEntry;

  SectionData = RtlGetHotpatchHeader( ImageBaseAddress );
  if ( ! SectionData  )
  {
    return STATUS_INVALID_PARAMETER;
  }
  
  //
  // Try to get header from MiHotPatchList
  //
  Header = RtlFindRtlPatchHeader(
    MiHotPatchList,
    ImageHandle );

  if ( ! Header )
  {
    PLIST_ENTRY Entry;

    if ( Flags & FLG_HOTPATCH_ACTIVE )
    {
      return STATUS_NOT_SUPPORTED;
    }

    Status = RtlCreateHotPatch(
      &Header,
      SectionData,
      ImageHandle,
      Flags
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      return Status;
    }

    ExAcquireResourceExclusiveLite(
      PsLoadedModuleResource,
      TRUE
      );

    Entry =  PsLoadedModuleList;
    while ( Entry != PsLoadedModuleList )
    {
      LdrEntry = DataTableEntry = CONTAINING_RECORD( Entry,
                                            KLDR_DATA_TABLE_ENTRY,
                                            InLoadOrderLinks )
      if ( LdrEntry->DllBase DllBase >= MiSessionImageEnd )
      {
        if ( RtlpIsSameImage( Header, LdrEntry ) )
        {
          break;
        }
      }
    }

    ExReleaseResourceLite( PsLoadedModuleResource );

    if ( ! PatchHeader->TargetDllBase )
    {
      Status = STATUS_DLL_NOT_FOUND ;
    }

    Status = ExLockUserBuffer(
      ImageHandle->DllBase,
      ImageHandle->SizeOfImage,
      KernelMode,
      IoWriteAccess,
      LockedBuffer,
      LockVariable
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      FreeHotPatchData( Header );
      return Status;
    }


    Status = RtlInitializeHotPatch(
      ( PRTL_PATCH_HEADER ) Header,
      ( PBYTE ) LockedBuffer - ImageHandle->DllBase
      );

    ExUnlockUserBuffer( LockVariable );

    if ( ! NT_SUCCESS( Status ) )
    {
      FreeHotPatchData( ImageHandle );
      return Status;
    }

    f = 1;
  }
  else
  {
    if ( ( Flags ^ ImageHandle->CodeInfo->Flags ) & FLG_HOTPATCH_ACTIVE )
    {
      return STATUS_NOT_SUPPORTED;
    }

    if ( ! ( ImageHandle->CodeInfo->Flags & FLG_HOTPATCH_ACTIVE ) )
    {
      Status = RtlReadHookInformation( Header );
      if ( ! NT_SUCCESS( Status ) )
      {
        return Status;
      }
    }

    f = 0;
  }
  
  Status = MmLockAndCopyMemory(
    ImageHandle->CodeInfo,
    KernelMode
    );
  if ( NT_SUCCESS( Status ) )
  {
    if ( ! f  )
    {
      return Status;
    }

    LdrEntry->EntryPointActivationContext = Header;  // ???
    InsertTailList( MiHotPatchList, LdrEntry->PatchList );
  }
  else
  {
    if ( f ) 
    {
      RtlFreeHotPatchData( Header );
    }
  }

  return Status;
}

So MiPerformHotPatch inspects the hotpatch information stored in the hotpatch image. This data includes information about which code regions need to be updated. After the neccessary information has been gathered, it applies the code changes.

Two basic problems have to be overcome now: On the one hand, all code sections of drivers are mapped read/execute only. Overwring the instructions thus does not work. On the other hand, the system has to properly synchronize the patching process, i.e. it has to make sure no CPU is currently executing the code that is about to be patched.

To overcome the memory protection problems, Windows facilitates a trick I previously only knew from malware: It creates a memory descriptor list (MDL) for the affected code region, maps the MDL, and updates the code through this mapped region. The memory protection is thus circumvented. As it turns, out, there is even a handy, undocumented helper routine for this purpose: ExLockUserBuffer, which is used by MiPerformHotPatch.

To proceed, MiPerformHotPatch calls MmLockAndCopyMemory to do the actual patching. So how does Windows synchronize the update process? Again, it uses a technique I assumed was a malware trick: It schedules CPU-specific DPCs on all CPUs but the current and keeps those DPCs busy while the current thread is uddating the code. Again, Windows provides a neat routine for that: KeGenericCallDpc. In addition to this, Windows raises the IRQL to clock level in order to mask all interrupts.

Here is the pseudo-code for MmLockAndCopyMemory and its helper, MiDoCopyMemory:

NTSTATUS
MmLockAndCopyMemory (
    IN PSYSTEM_HOTPATCH_CODE_INFORMATION PatchInfo,
    IN KPROCESSOR_MODE ProbeMode
    )
{
  PVOID Buffer;
  NTSTATUS Status;
  UINT Index;

  if ( 0 == PatchInfo->CodeInfo.DescriptorsCount )
  {
    return STATUS_SUCCESS;
  }

  Buffer = ExAllocatePoolWithQuotaTag( 
    9,
    PatchInfo->CodeInfo.DescriptorsCount * 2,
    'PtoH' );
  if ( ! Buffer )
  {
    return STATUS_INSUFFICIENT_RESOURCES;
  }
  RtlZeroMemory( Buffer, PatchInfo->CodeInfo.DescriptorsCount * 2 );

  if ( 0 == PatchInfo->CodeInfo.DescriptorsCount )
  {
    Status = STATUS_INVALID_PARAMETER;
    goto Cleanup;
  }

  for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
  {
    if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset > PatchInfo->InfoSize ||
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize > PatchInfo->InfoSize ||
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset +
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize > PatchInfo->InfoSize || 
       /* other checks... */ )
    {
      Status = STATUS_INVALID_PARAMETER;
      goto Cleanup;
    }

    Status = ExLockUserBuffer(
      TargetAddress,
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize
      ProbeMode,
      IoWriteAccess,
      &PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
      Buffer[ Index ]
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      goto Cleanup;
    }
  }

  PatchInfo->Flags |= FLG_HOTPATCH_ACTIVE;

  KeGenericCallDpc(
    MiDoCopyMemory,
    PatchInfo );

  if ( PatchInfo->Flags & FLG_HOTPATCH_VERIFICATION_ERROR )
  {
    PatchInfo->Flags &= ~FLG_HOTPATCH_ACTIVE;
    PatchInfo->Flags &= ~FLG_HOTPATCH_VERIFICATION_ERROR;
    Status = STATUS_DATA_ERROR;
  }

Cleanup:
  if ( PatchInfo->CodeInfo.DescriptorsCount > 0 )
  {
    for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
    {
      ExUnlockUserBuffer( Buffer[ Index ] );
    }
  }

  ExFreePoolWithTag( Buffer, 0 );
  return Status;
}

VOID MiDoCopyMemory(
  IN PKDPC Dpc,
  IN PSYSTEM_HOTPATCH_CODE_INFORMATION PatchInfo,
  IN ULONG NumberCpus,
  IN DEFERRED_REVERSE_BARRIER ReverseBarrier
  )
{
  KIRQL OldIrql;
  UNREFERENCED_PARAMETER( Dpc );
  NTSTATUS Status;
  ULONG Index;

  OldIrql = KfRaiseIrql( CLOCK1_LEVEL );

  //
  // Decrement reverse barrier count.
  //
  Status = KeSignalCallDpcSynchronize( ReverseBarrier );
  if ( ! NT_SUCCESS( Status ) )
  {
    goto Cleanup;
  }

  PatchInfo->Flags &= ~FLG_HOTPATCH_VERIFICATION_ERROR;
    
  for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
  {
    if ( PatchInfo->Flags & FLG_HOTPATCH_ACTIVE )
    {
      if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationSize != 
        RtlCompareMemory(
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
          ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationOffset,
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationSize ) )
      {

        if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize != 
          RtlCompareMemory(
            PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
            ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].OrigCodeOffset,
            PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize ) )
        {
          PatchInfo->Flags &= FLG_HOTPATCH_VERIFICATION_ERROR;
          break;
        }
      }
    }
    else
    {
      if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize !=
        RtlComparememory(
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
          ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset,
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize ) )
      {
        PatchInfo->Flags &= FLG_HOTPATCH_VERIFICATION_ERROR;
        break;
      }
    }
  }

  //loc_479533
  if ( PatchInfo->Flags & FLG_HOTPATCH_VERIFICATION_ERROR ||
     PatchInfo->CodeInfo.DescriptorsCount <= 0 )
  {
    goto Cleanup;
  }

  for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
  {
    PVOID Source;
    if ( PatchInfo->Flags & FLG_HOTPATCH_ACTIVE )
    {
      Source = ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset;
    }
    else
    {
      Source = ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].OrigCodeOffset;
    }

    RtlCopyMemory(
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
      Source,
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize
      );
  }


Cleanup:
   KeSignalCallDpcSynchronize( ReverseBarrier );
   KfLowerIrql( OldIrql );
   KeSignalCallDpcDone( NumberCpus );
}

To see the code, in action, we set a breakpoint on nt!MiDoCopyMemory:

kd> k
ChildEBP RetAddr  
f6acbac0 8087622f nt!MiDoCopyMemory
f6acbae8 80990a10 nt!KeGenericCallDpc+0x3d
f6acbb0c 80990bea nt!MmLockAndCopyMemory+0xf1
f6acbb34 80990cba nt!MiPerformHotPatch+0x143
f6acbb68 809b2d67 nt!MmHotPatchRoutine+0x75
f6acbba8 808caeff nt!ExApplyCodePatch+0x191
f6acbd50 8082337b nt!NtSetSystemInformation+0xa1e

Before letting MiDoCopyMemory do its work, let’s see what it is about to do. No modifications have yet been done to mrxsmb:

kd> !chkimg mrxsmb
0 errors : mrxsmb 

kd> !chkimg rdbss
0 errors : rdbss

The second argument is a structure holding the information garthered previously, peeking into it reveals:

kd> dd /c 1 poi(esp+8) l 4
81583008  00000001
8158300c  00000149
81583010  00000008   <-- # of code patches
81583014  f9648b1f   <-- hmm...

As it turns out, address 81583014 refers to a variable length array of size 8. Poking aroud with dd, the following listing suggests that the structure is of size 28 bytes:

kd> dd /c 7 81583014
81583014  f9648b1f fa2afb1f 000000ec 00000005 000000f1 000000f6 00000005
81583030  f9648b24 fa2b2b24 000000fb 00000002 000000fd 000000ff 00000002
8158304c  f96585ef fa2b15ef 00000101 00000005 00000106 0000010b 00000005
81583068  f96585f4 fa2b45f4 00000110 00000002 00000112 00000114 00000002
81583084  f9658569 fa2b3569 00000116 00000005 0000011b 00000120 00000005
815830a0  f965856e fa2b656e 00000125 00000002 00000127 00000129 00000002
815830bc  f9653378 fa2b5378 0000012b 00000005 00000130 00000135 00000005
815830d8  f965337d fa2b837d 0000013a 00000005 0000013f 00000144 00000005

Given that rdbss was loaded to address range f9641000-f9671000, it is obvious that the first 2 columns refer to code addresses. The third, fifth and sixth column looks like an offset, the fourth and seventh like the length of the code change. First, let’s see where the first column points to:

kd> u f9648b1f
rdbss!RxInitiateOrContinueThrottling+0x6b:
f9648b1f 90              nop
f9648b20 90              nop
f9648b21 90              nop
f9648b22 90              nop
f9648b23 90              nop
rdbss!RxpCancelRoutine:
f9648b24 8bff            mov     edi,edi
f9648b26 55              push    ebp
f9648b27 8bec            mov     ebp,esp

Now that looks promising, especially since the fourth column holds the value 5. Let’s look at the second row:

kd> u f9648b24
rdbss!RxpCancelRoutine:
f9648b24 8bff            mov     edi,edi

No doubt, the first and second row define the two patches necessary to redirect RxpCancelRoutine. But what to replace this code with? As it turns out, the offsets in column three are relative to the structure and point to the code that is to be written:

kd> u poi(esp+8)+000000ec
815830f4 e9dcc455fd      jmp     7eadf5d5          mov     edi,edi

kd> u poi(esp+8)+000000fb
81583103 ebf9            jmp     815830fe

That makes perfectly sense — the five nops are to be overwritten by a near jump, the mov edi, edi will be replaced by a short jump.

So let’s run MiDoCopyMemory and have a look at the results. Back in MmLockAndCopyMemory, the code referred to by the first to rows look like this:

kd> u f9648b1f
rdbss!RxInitiateOrContinueThrottling+0x6b:
f9648b1f e9dcc455fd      jmp     hpf3!RxpCancelRoutine (f6ba5000)

kd> u f9648b24
rdbss!RxpCancelRoutine:
f9648b24 ebf9            jmp     rdbss!RxInitiateOrContinueThrottling+0x6b (f9648b1f)
f9648b26 55              push    ebp
f9648b27 8bec            mov     ebp,esp

Voilà, RxpCancelRoutine has been patched and calls are redirected to hpf3!RxpCancelRoutine, the new routine located in the auxiliarry ‘hpf3’ driver. All that remains to be done is cleanup (unlocking the memory etc).

That’s it — that’s how Windows applies patches on the fly using hotpatching. Too bad that the technology is so rarely used in practice.

Windows Hotpatching

Several years ago, with Windows Server 2003 SP1, Microsoft introduced a technology and infrastructure called Hotpatching. The basic intent of this infrastructure is to provide a means to apply hotfixes on the fly, i.e. without having to reboot the system — even if the hotfix contains changes on critical system components such as the kernel iteself, important drivers, or user mode libraries such as shell32.dll.

Trying to applying hotfixes on the fly introduces a variety of problems — the most important being:

  • Patching code that is currently in use
  • Atomically replacing files on disk that are currently in use and therefore locked
  • Making sure that all changes take effect for both, processes currently running and processes which are yet to be started (i.e. before the next reboot)
  • Allowing further hotfixes to be applied on system that has not been rebooted since the last hotfix has been applied in an on-the-fly fashion

The Windows Hotpatching infrastructure is capable of handling all these cases — it is, however, not applicable to all kinds of code fixes. Generally speaking, it can only be used for fixes that merely comprise smallish code changes but do not affect layout or semantics of data structures. A fix for a buffer overflow caused by an off-by-one error, however, is a perfect example for a fix that could certainly be applied using the Hotpatching infrastructure.

That all sounds good and nice, but reality is that we still reboot our machines for just about every update Microsoft provides us, right?

Right. The answer for this is threefold. First, as indicated, some hotfixes can be expected to make changes that cannot be safely applied using the Hotpatching system. Secondly, Hotpatching is used on an opt-in basis, so you will not benefit from it automatically: When a hotpatch-enabled hotfix is applied through Windows Update or by launching the corresponding exe file, it is not used and a reboot will be required. The user has to explicitly specify the /hotpatch:enable switch in order to have the hotfix to be applied on the fly.

In the months after the release of SP1, a certain fraction of the hotfixes issued by Microsoft were indeed hotpatch-enabled and could be applied without a reboot. Interestingly, however, I am not aware of a single hotfix issued since Server 2003 SP2 that supported hotpatching!

And thirdly: Whether Microsoft has lost faith in their hotpatching facility, whether the effort to test such hotfixes turned out to be too high or whether there were other reasons speaking against issueing hotpatch-enabled hotfixes — I do not know.

Notwithstanding this observation, Hotpatching is an interesting technology that deserves to be looked at in more detail. Although I will not cover the entire infrastructure, I will spend at least one more blog post on the mechanisms implemented in Windows that allow code modifications to be performed on the fly. That is, I will focus on the hotpatching part of the infrastructure and will ignore coldpatching and other, smaller aspects of the infrastructre.

The hidden danger of forgetting to specify %SystemRoot% in a custom environment block

When spawning a process using CreateProcess and friends, the child process usually inherits the environment (i.e. all environment variables) of the spawning process. Of course, this behavior can be overridden by creating a custom environment block and passing it to the lpEnvironment parameter of CreateProcess.

While the MSDN documentation on CreateProcess does contain a remark saying that current directory information (=C: and friends) should be included in such a custom environment block, it does not mention the importance of SystemRoot.

The SystemRoot environment variable usually contains the path c:\windows — the path that is also accessible using the GetWindowsDirectory function. This environment variable, as it turns out, is not only handy for scripting purposes — it is, in fact, essential for the proper operation of many libraries.

For very simple programs, forgetting to include SystemRoot in a custom environment block usually goes unnoticed — even an empty environment block works just fine. In case of more complex applications, however, the omission of this variable can quickly lead to errors — on Vista, the most common error that can be tracked back to a missing SystemRoot variable is SXS failing to find/load basic system libraries.

Now that we have Windows 7, SystemRoot seems to have become even more important: Now it is not only SXS that requires SystemRoot to be specified properly, but also CryptoAPI.

In my particular case, I was experiencing a 0x80090006 (“Invalid Signature”, NTE_BAD_SIGNATURE) error whenever the child process attempted to call CoGetObject to retrieve a pointer to a DCOM object. While this error occured on Windows 7, the same code worked fine on Windows Vista and XP.

Given this more than general error message, it seemed anything but clear to me what the problem was, so I attached a debugger to the child process (using gflags/Image File Execution Options). Once I did that, I got the following messages in my debug output output:

CryptAcquireContext: CheckSignatureInFile failed at cryptapi.c line 5198
CryptAcquireContext: Failed to read registry signature value at cryptapi.c line 873

I set a breakpoint on CryptAcquireContextW and looked at the stack trace:


0:000> k
ChildEBP RetAddr  
0008f8a4 75760a4f ole32!CRandomNumberGenerator::Initialize+0x2e
0008f8b0 75760769 ole32!CRandomNumberGenerator::GenerateRandomNumber+0xd
0008f8e8 757609cf ole32!CStdMarshal::AddIPIDEntry+0x48
0008f93c 75766aae ole32!CStdMarshal::MarshalServerIPID+0x5a
0008f994 75767519 ole32!CStdMarshal::MarshalObjRef+0xb9
0008f9c8 7576778e ole32!MarshalInternalObjRef+0x8c
0008fa4c 757676ba ole32!CRemoteUnknown::CRemoteUnknown+0x3b
0008fa8c 7576754a ole32!CComApartment::InitRemoting+0x19c
0008fa98 7586d83e ole32!CComApartment::StartServer+0x13
0008faa8 757652b3 ole32!InitChannelIfNecessary+0x1e
0008fb20 757fc046 ole32!CoUnmarshalInterface+0x38
0008fb34 757fd3d5 ole32!CObjrefMoniker::Load+0x26
0008fb70 7573cb7f ole32!CObjrefMonikerFactory::ParseDisplayName+0x16f
0008fbbc 7573caae ole32!FindClassMoniker+0x8b
0008fbf4 75789dc7 ole32!MkParseDisplayName+0xbb
0008fc3c 6954ce84 ole32!CoGetObject+0x82
...

Quite obviously, COM, trying to unmarshal an interface, needed a random number and attempted to use CryptoAPI for this purpose. Looking at the paramters of CryptAcquireContext, I saw that the Microsoft Strong Cryptographic Provider was attempted to be loaded — one of the standard Windows CSPs — so everything seemed normal.

Guided by the message Failed to read registry signature, I switched to Process Monitor to see which registry key was being queried: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Cryptography\Defaults\Provider\Microsoft Strong Cryptographic Provider.

Taking a look at this key in regedit, it did not take long before spotting SystemRoot as the culprit:

Looking at file system activity in Process Monitor proved this:

Process Monitor

Interestingly, on Windows Vista, all Image Path values in the CSP registry keys do not use SystemRoot — they just contain the file name and rely on the library path in order to locate the libraries at runtime. This explains why my code worked fine on Vista.

(While making this change, the developer seemed to forget to change the value’s type from REG_SZ to REG_EXPAND_SZ though :) )

Bottom Line 1: Always, always include SystemRoot when passing a custom environment block to CreateProcess.

Bottom Line 2: The case also shows how a seemingly trivial change in Windows (using an absolute path rather then just a file name in the CSP registry key) can lead to an application incompatibility.

Introducing cfix studio, the Visual Studio AddIn for C/C++ Unit Testing

N.B. cfix studio was the code name of what has become Visual Assert

There is little doubt that native code, and C and C++ in particular, is here to stay. And still, it is pretty obvious that when it comes to tools and IDEs, it is the managed world that has gotten most attention from tool vendors over the past years.

While there are lots and lots of useful tools for native development, many of them probably even better than their managed counterparts, there are some areas where the managed language fraction is far ahead: One of these areas certainly is IDE support for unit testing.

JUnit for Eclipse, TestDriven.Net for Visual Studio and MS Test make test-driven development so much more convenient and efficient that it is almost ridiculous that using command line tools is still state of the art for C/C++ development.

That said, there is great news: With cfix studio, there finally is a solution filling in this gap! cfix studio, based on the cfix unit testing framework, is a Visual Studio-AddIn that allows you to easily write, manage, run, and debug your unit tests from within Visual Studio. No fiddling with command line tools, complex configuration, or boilerplate code required!

Among lots of other features, cfix studio also has first-class support for multi-architecture development – you can easily switch back and forth between 32-bit and 64-bit and can even mix tests of different architectures in a single test run. Needless to say, cfix studio, like cfix, is also fully compatible to WinUnit.

If that has caught your interest, you are invited to check out the first beta version of cfix studio:

Download cfix studio Beta 1
Version 1.0.0.3458

It is free, quick to install and comes with a set of example projects. Give it a try — and please let me know about all your crticism, suggestions and other feedback!

Here are some screenshots of cfix studio in action:

Test Explorer
The Test Explorer allows you to start a single or set of tests, the Run Window shows the results

Run Window
Run Window: Viewing test progress

Failed Assertion
Run Window: Viewing test results and details of a failed assertion

Debgging a failed assertion
When running in the debugger, a failed assertion will hit a breakpoint and the Run Window will show additional details

cfix 1.4 released

Today, a new version of cfix, the open source unit testing framework for user and kernel mode C and C++, has been released. cfix 1.4, in addition to the existing feature of allowing test runs to be restricted to specific fixtures, now also allows single testcases to be run in isolation, which can be a great aid in debugging. Besides several minor fixes, the cfix API has been slightly enhanced and cfix now degrades more gracefully in case of dbghelp-issues.

Updated cfix binaries and source code are now available for download

Uniquely Identifying a Module’s Build

It is common practice to embed a version resource (VS_VERSIONINFO) into PE images such as DLL and EXE files. While this resource mainly serves informational purposes, the version information is occasionaly used to perform certain checks, such as verifying the module’s suitability for a particular purpose.

Under certain circumstances, however, this versioning information may be too imprecise: Versions are not necessarily incremented after each build, so it is possible that two copies of a module carry the same versioning information, yet differ significantly in their implementation. In such situations, identifying the actual build of the module might become neccessary.

The most common, but by no means the only situation in which this applies in practice concerns debugging — to identify the PDB file exactly matching a given module, the debugger must be able to recognize the specific build of a module. It thus does not come as a surprise that all images for which debugging information has been generated contain a dedicated identifier for this purpose: The CodeView signature GUID.

Summarizing what Oleg Starodumov has covered in more detail, cl, when directed to generate a PDB file, implicitly creates this GUID and, along with the path to the PDB file, embeds this data into the PE image. For current versions, the relevant structure used to encode this information is CV_INFO_PDB70, which seems to have been documented once, but not any more:

typedef struct _CV_INFO_PDB70
{
  ULONG CvSignature;
  GUID Signature;
  ULONG Age;
  UCHAR PdbFileName[ ANYSIZE_ARRAY ];
} CV_INFO_PDB70, *PCV_INFO_PDB70;

In order to be able to locate the structure within the PE image, a directory entry of type IMAGE_DEBUG_TYPE_CODEVIEW is written to the image’s debug directory. The following code listing demonstrates how to obtain the signature GUID of an image:

#define PtrFromRva( base, rva ) ( ( ( PUCHAR ) base ) + rva )

static PIMAGE_DATA_DIRECTORY GetDebugDataDirectory(
  __in ULONG_PTR LoadAddress
  )
{
  PIMAGE_DOS_HEADER DosHeader = 
    ( PIMAGE_DOS_HEADER ) ( PVOID ) LoadAddress;
  PIMAGE_NT_HEADERS NtHeader = ( PIMAGE_NT_HEADERS ) 
    PtrFromRva( DosHeader, DosHeader->e_lfanew );
  ASSERT ( IMAGE_NT_SIGNATURE == NtHeader->Signature );

  return &NtHeader->OptionalHeader.DataDirectory
      [ IMAGE_DIRECTORY_ENTRY_DEBUG ];
}

NTSTATUS GetDebugGuid(
  __in ULONG_PTR ModuleBaseAddress,
  __out GUID *Guid
  )
{
  PIMAGE_DATA_DIRECTORY DebugDataDirectory;
  PIMAGE_DEBUG_DIRECTORY DebugHeaders;
  ULONG Index;
  ULONG NumberOfDebugDirs;
  ULONG_PTR ModuleBaseAddress;
  NTSTATUS Status;

  DebugDataDirectory  = DebugDataDirectory( ModuleBaseAddress );
  DebugHeaders    = ( PIMAGE_DEBUG_DIRECTORY ) PtrFromRva( 
    ModuleBaseAddress, 
    DebugDataDirectory->VirtualAddress );

  ASSERT( ( DebugDataDirectory->Size % sizeof( IMAGE_DEBUG_DIRECTORY ) ) == 0 );
  NumberOfDebugDirs = DebugDataDirectory->Size / sizeof( IMAGE_DEBUG_DIRECTORY );

  //
  // Lookup CodeView record.
  //
  for ( Index = 0; Index < NumberOfDebugDirs; Index++ )
  {
    PCV_INFO_PDB70 CvInfo;
    if ( DebugHeaders[ Index ].Type != IMAGE_DEBUG_TYPE_CODEVIEW )
    {
      continue;
    }

    CvInfo = ( PCV_INFO_PDB70 ) PtrFromRva( 
      ModuleBaseAddress, 
      DebugHeaders[ Index ].AddressOfRawData );

    if ( CvInfo->CvSignature != 'SDSR' )
    {
      //
      // Weird, old PDB format maybe.
      //
      return STATUS_xxx_UNRECOGNIZED_CV_HEADER;
    }

    *Guid = CvInfo->Signature;
    return STATUS_SUCCESS;  
  }

  return STATUS_xxx_CV_GUID_LOOKUP_FAILED;
}

cfix 1.2 Installer Fixed for AMD64

The cfix 1.2 package as released last week contained a rather stupid bug that the new build, 1.2.0.3244, now fixes: the amd64 binaries cfix64.exe and cfixkr64.sys were wrongly installed as cfix32.exe and cfixkr32.sys, respectively. Not only did this stand in contrast to what the documenation stated, it also resulted in cfix being unable to load the cfixkr driver on AMD64 platforms.

The new MSI package is now available for download on Sourceforge.


Categories




About me

Johannes Passing, M.Sc., living in Berlin, Germany.

Besides his consulting work, Johannes mainly focusses on Win32, COM, and NT kernel mode development, along with Java and .Net. He also is the author of cfix, a C/C++ unit testing framework for Win32 and NT kernel mode, Visual Assert, a Visual Studio Unit Testing-AddIn, and NTrace, a dynamic function boundary tracing toolkit for Windows NT/x86 kernel/user mode code.

Contact Johannes: jpassing (at) acm org

Johannes' GPG fingerprint is BBB1 1769 B82D CD07 D90A 57E8 9FE1 D441 F7A0 1BB1.

LinkedIn Profile
Xing Profile
Github Profile