Debugging Windows Hotpatching: A Walkthrough

Posted on

As discussed in the last post, Windows 2003 SP1 introduced a technology known as Hotpatching. An integral part of this technology is Hotpatching, which refers to the process of applying an updated on the fly by using runtime code modification techniques.

Although Hotpatching has caught a bit of attention, suprisingly little information has been published about its inner workings. As the technology is patented, however, there is quite a bit of information that can be obtained by reading the patent description. Moreover, there is this (admittedly very terse) discussion about the actual implementation of hotpatching.

Armed with this information, it is possible to get into more detail by looking what is actually happening under the hood when a hoftix is applied: I did so and chose KB911897 as an example, which fixes some flaw in mrxsmb.sys and rdbss.sys. I have also gone through the hassle of translating key parts of the respective assembly code back to C.

Preparing the machine

First, we need a proper machine image which can be used for the experiment. Unfortunately, KB911897 is an SP1 package, so we have to use an old Win 2003 Server SP1 system to apply this update. Once we have the machine running, we can attach the kernel debugger and see what is happening when the hotfix is installed.

Observing the update

When launched with /hotpatch:enable, after some initialization work, the updater calls NtSetSystemInformation (which delegates to ExApplyCodePatch) to apply the hotpatch. Hotpatching includes a coldpatch, which I do not care about here and the actual hotpatch. The first two calls to NtSetSystemInformation (and thus to ExApplyCodePatch) are coldpatching-related and I will thus ignore them here. The third call, however, is made to apply the actual hotpatch, so let’s observe this one further.

Requiring a kernel mode-patch, ExApplyCodePatch then calls MmHotPatchRoutine, which is where the fun starts. Expressed in C, MmHotPatchRoutine, MmHotPatchRoutine roughly looks like this (reverse engineered from assembly, might be slightly incorrect):

NTSTATUS MmHotPatchRoutine(
  __in PSYSTEM_HOTPATCH_CODE_INFORMATION RemoteInfo
  )
{
  UNICODE_STRING ImageFileName;
  DWORD Flags = RemoteInfo->Flags;
  PVOID ImageBaseAddress;
  PVOID ImageHandle;
  NTSTATUS Status, LoadStatus;
  KTHREAD CurrentThread;

  ImageFileName.Length = RemoteInfo->KernelInfo.NameLength;
  ImageFileName.MaximumLength = RemoteInfo->KernelInfo.NameLength;
  ImageFileName.Buffer = ( PBYTE ) RemoteInfo + NameOffset;

  CurrentThread = KeGetCurrentThread();
  KeEnterCriticalRegion( CurrentThread );

  KeWaitForSingleObject(
    MmSystemLoadLock,
    WrVirtualMemory,
    0,
    0,
    0 );

  LoadStatus = MmLoadSystemImage(
    &ImageFileName,
    0,
    0,
    0,
    &ImageHandle,
    &ImageBaseAddress );
  if ( NT_SUCCESS( Status ) || Status == STATUS_IMAGE_ALREADY_LOADED )
  {

    Status = MiPerformHotPatch(
      ImageHandle,
      ImageBaseAddress,
      Flags );

    if ( NT_SUCCESS( Status ) || LoadStatus == STATUS_IMAGE_ALREADY_LOADED )
    {
      NOTHING;
    }
    else
    {
      MmUnloadSystemImage( ImageHandle );
    }

    LoadStatus = Status;
  }


  KeReleaseMutant(
    MmSystemLoadLock,
    1,  // increment
    FALSE,
    FALSE );

  KeLeaveCriticalRegion( CurrentThread );

  return LoadStatus;
}

As you see in the code, MmHotPatchRoutine will try load the hotpatch image – we can verify this in the debugger:

    kdbp nt!MmLoadSystemImage
    
    kdg
    Breakpoint 3 hit
    nt!MmLoadSystemImage:
    808ec4b5 6878010000      push    178h
    
    kdk
    ChildEBP RetAddr  
    f6acbb28 80990c9e nt!MmLoadSystemImage
    f6acbb68 809b2d67 nt!MmHotPatchRoutine+0x59
    f6acbba8 808caeff nt!ExApplyCodePatch+0x191
    f6acbd50 8082337b nt!NtSetSystemInformation+0xa1e
    f6acbd50 7c82ed54 nt!KiFastCallEntry+0xf8
    0006bc50 7c821f24 ntdll!KiFastSystemCallRet
    0006bd44 7c8304c9 ntdll!ZwSetSystemInformation+0xc
    [...]
    
    kddt _UNICODE_STRING poi(@esp+4)
    ntdll!_UNICODE_STRING
     "\??\c:\windows\system32\drivers\hpf3.tmp"
       +0x000 Length           : 0x50
       +0x002 MaximumLength    : 0x50
       +0x004 Buffer           : 0x81623fa8  "\??\c:\windows\system32\drivers\hpf3.tmp"
       
    kdgu
    
    kdlm
    start    end        module name
    [...]           
    f6ba4000 f6bad000   hpf3       (deferred)  
    [...]
    f95cb000 f9641000   mrxsmb     (deferred)  
    f9641000 f9671000   rdbss      (deferred)      
    [...]
    

Having loaded the hotpatch image, MmHotPatchRoutine proceeds be calling MiPerformHotPatch, which looks about like this:

NTSTATUS
MiPerformHotPatch(
  IN PLDR_DATA_TABLE_ENTRY ImageHandle,
  IN PVOID ImageBaseAddress,
  IN DWORD Flags
  )
{
  PHOTPATCH_HEADER SectionData ;
  PRTL_PATCH_HEADER Header;    
  NTSTATUS Status;
  PVOID LockVariable;
  PVOID LockedBuffer;
  BOOLEAN f;
  PLDR_DATA_TABLE_ENTRY LdrEntry;

  SectionData = RtlGetHotpatchHeader( ImageBaseAddress );
  if ( ! SectionData  )
  {
    return STATUS_INVALID_PARAMETER;
  }

  //
  // Try to get header from MiHotPatchList
  //
  Header = RtlFindRtlPatchHeader(
    MiHotPatchList,
    ImageHandle );

  if ( ! Header )
  {
    PLIST_ENTRY Entry;

    if ( Flags & FLG_HOTPATCH_ACTIVE )
    {
      return STATUS_NOT_SUPPORTED;
    }

    Status = RtlCreateHotPatch(
      &Header,
      SectionData,
      ImageHandle,
      Flags
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      return Status;
    }

    ExAcquireResourceExclusiveLite(
      PsLoadedModuleResource,
      TRUE
      );

    Entry =  PsLoadedModuleList;
    while ( Entry != PsLoadedModuleList )
    {
      LdrEntry = DataTableEntry = CONTAINING_RECORD( Entry,
                                            KLDR_DATA_TABLE_ENTRY,
                                            InLoadOrderLinks )
      if ( LdrEntry->DllBase DllBase >= MiSessionImageEnd )
      {
        if ( RtlpIsSameImage( Header, LdrEntry ) )
        {
          break;
        }
      }
    }

    ExReleaseResourceLite( PsLoadedModuleResource );

    if ( ! PatchHeader->TargetDllBase )
    {
      Status = STATUS_DLL_NOT_FOUND ;
    }

    Status = ExLockUserBuffer(
      ImageHandle->DllBase,
      ImageHandle->SizeOfImage,
      KernelMode,
      IoWriteAccess,
      LockedBuffer,
      LockVariable
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      FreeHotPatchData( Header );
      return Status;
    }


    Status = RtlInitializeHotPatch(
      ( PRTL_PATCH_HEADER ) Header,
      ( PBYTE ) LockedBuffer - ImageHandle->DllBase
      );

    ExUnlockUserBuffer( LockVariable );

    if ( ! NT_SUCCESS( Status ) )
    {
      FreeHotPatchData( ImageHandle );
      return Status;
    }

    f = 1;
  }
  else
  {
    if ( ( Flags ^ ImageHandle->CodeInfo->Flags ) & FLG_HOTPATCH_ACTIVE )
    {
      return STATUS_NOT_SUPPORTED;
    }

    if ( ! ( ImageHandle->CodeInfo->Flags & FLG_HOTPATCH_ACTIVE ) )
    {
      Status = RtlReadHookInformation( Header );
      if ( ! NT_SUCCESS( Status ) )
      {
        return Status;
      }
    }

    f = 0;
  }

  Status = MmLockAndCopyMemory(
    ImageHandle->CodeInfo,
    KernelMode
    );
  if ( NT_SUCCESS( Status ) )
  {
    if ( ! f  )
    {
      return Status;
    }

    LdrEntry->EntryPointActivationContext = Header;  // ???
    InsertTailList( MiHotPatchList, LdrEntry->PatchList );
  }
  else
  {
    if ( f ) 
    {
      RtlFreeHotPatchData( Header );
    }
  }

  return Status;
}

So MiPerformHotPatch inspects the hotpatch information stored in the hotpatch image. This data includes information about which code regions need to be updated. After the neccessary information has been gathered, it applies the code changes.

Two basic problems have to be overcome now: On the one hand, all code sections of drivers are mapped read/execute only. Overwring the instructions thus does not work. On the other hand, the system has to properly synchronize the patching process, i.e. it has to make sure no CPU is currently executing the code that is about to be patched.

To overcome the memory protection problems, Windows facilitates a trick I previously only knew from malware: It creates a memory descriptor list (MDL) for the affected code region, maps the MDL, and updates the code through this mapped region. The memory protection is thus circumvented. As it turns, out, there is even a handy, undocumented helper routine for this purpose: ExLockUserBuffer, which is used by MiPerformHotPatch.

To proceed, MiPerformHotPatch calls MmLockAndCopyMemory to do the actual patching. So how does Windows synchronize the update process? Again, it uses a technique I assumed was a malware trick: It schedules CPU-specific DPCs on all CPUs but the current and keeps those DPCs busy while the current thread is uddating the code. Again, Windows provides a neat routine for that: KeGenericCallDpc. In addition to this, Windows raises the IRQL to clock level in order to mask all interrupts.

Here is the pseudo-code for MmLockAndCopyMemory and its helper, MiDoCopyMemory:

NTSTATUS
MmLockAndCopyMemory (
    IN PSYSTEM_HOTPATCH_CODE_INFORMATION PatchInfo,
    IN KPROCESSOR_MODE ProbeMode
    )
{
  PVOID Buffer;
  NTSTATUS Status;
  UINT Index;

  if ( 0 == PatchInfo->CodeInfo.DescriptorsCount )
  {
    return STATUS_SUCCESS;
  }

  Buffer = ExAllocatePoolWithQuotaTag( 
    9,
    PatchInfo->CodeInfo.DescriptorsCount * 2,
    'PtoH' );
  if ( ! Buffer )
  {
    return STATUS_INSUFFICIENT_RESOURCES;
  }
  RtlZeroMemory( Buffer, PatchInfo->CodeInfo.DescriptorsCount * 2 );

  if ( 0 == PatchInfo->CodeInfo.DescriptorsCount )
  {
    Status = STATUS_INVALID_PARAMETER;
    goto Cleanup;
  }

  for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
  {
    if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset PatchInfo->InfoSize ||
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize PatchInfo->InfoSize ||
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset +
       PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize PatchInfo->InfoSize || 
       /* other checks... */ )
    {
      Status = STATUS_INVALID_PARAMETER;
      goto Cleanup;
    }

    Status = ExLockUserBuffer(
      TargetAddress,
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize
      ProbeMode,
      IoWriteAccess,
      &PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
      Buffer[ Index ]
      );
    if ( ! NT_SUCCESS( Status ) )
    {
      goto Cleanup;
    }
  }

  PatchInfo->Flags |= FLG_HOTPATCH_ACTIVE;

  KeGenericCallDpc(
    MiDoCopyMemory,
    PatchInfo );

  if ( PatchInfo->Flags & FLG_HOTPATCH_VERIFICATION_ERROR )
  {
    PatchInfo->Flags &= ~FLG_HOTPATCH_ACTIVE;
    PatchInfo->Flags &= ~FLG_HOTPATCH_VERIFICATION_ERROR;
    Status = STATUS_DATA_ERROR;
  }

Cleanup:
  if ( PatchInfo->CodeInfo.DescriptorsCount 0 )
  {
    for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
    {
      ExUnlockUserBuffer( Buffer[ Index ] );
    }
  }

  ExFreePoolWithTag( Buffer, 0 );
  return Status;
}

VOID MiDoCopyMemory(
  IN PKDPC Dpc,
  IN PSYSTEM_HOTPATCH_CODE_INFORMATION PatchInfo,
  IN ULONG NumberCpus,
  IN DEFERRED_REVERSE_BARRIER ReverseBarrier
  )
{
  KIRQL OldIrql;
  UNREFERENCED_PARAMETER( Dpc );
  NTSTATUS Status;
  ULONG Index;

  OldIrql = KfRaiseIrql( CLOCK1_LEVEL );

  //
  // Decrement reverse barrier count.
  //
  Status = KeSignalCallDpcSynchronize( ReverseBarrier );
  if ( ! NT_SUCCESS( Status ) )
  {
    goto Cleanup;
  }

  PatchInfo->Flags &= ~FLG_HOTPATCH_VERIFICATION_ERROR;

  for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
  {
    if ( PatchInfo->Flags & FLG_HOTPATCH_ACTIVE )
    {
      if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationSize != 
        RtlCompareMemory(
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
          ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationOffset,
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].ValidationSize ) )
      {

        if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize != 
          RtlCompareMemory(
            PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
            ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].OrigCodeOffset,
            PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize ) )
        {
          PatchInfo->Flags &= FLG_HOTPATCH_VERIFICATION_ERROR;
          break;
        }
      }
    }
    else
    {
      if ( PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize !=
        RtlComparememory(
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
          ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset,
          PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize ) )
      {
        PatchInfo->Flags &= FLG_HOTPATCH_VERIFICATION_ERROR;
        break;
      }
    }
  }

  //loc_479533
  if ( PatchInfo->Flags & FLG_HOTPATCH_VERIFICATION_ERROR ||
     PatchInfo->CodeInfo.DescriptorsCount <= 0 )
  {
    goto Cleanup;
  }

  for ( Index = 0; Index CodeInfo.DescriptorsCount; Index++ )
  {
    PVOID Source;
    if ( PatchInfo->Flags & FLG_HOTPATCH_ACTIVE )
    {
      Source = ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeOffset;
    }
    else
    {
      Source = ( PBYTE ) PatchInfo + PatchInfo->CodeInfo.CodeDescriptors[ Index ].OrigCodeOffset;
    }

    RtlCopyMemory(
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].MappedAddress,
      Source,
      PatchInfo->CodeInfo.CodeDescriptors[ Index ].CodeSize
      );
  }


Cleanup:
   KeSignalCallDpcSynchronize( ReverseBarrier );
   KfLowerIrql( OldIrql );
   KeSignalCallDpcDone( NumberCpus );
}

To see the code, in action, we set a breakpoint on nt!MiDoCopyMemory:

    kdk
    ChildEBP RetAddr  
    f6acbac0 8087622f nt!MiDoCopyMemory
    f6acbae8 80990a10 nt!KeGenericCallDpc+0x3d
    f6acbb0c 80990bea nt!MmLockAndCopyMemory+0xf1
    f6acbb34 80990cba nt!MiPerformHotPatch+0x143
    f6acbb68 809b2d67 nt!MmHotPatchRoutine+0x75
    f6acbba8 808caeff nt!ExApplyCodePatch+0x191
    f6acbd50 8082337b nt!NtSetSystemInformation+0xa1e
    

Before letting MiDoCopyMemory do its work, let’s see what it is about to do. No modifications have yet been done to mrxsmb:

    kd!chkimg mrxsmb
    0 errors : mrxsmb 
    
    kd!chkimg rdbss
    0 errors : rdbss
    

The second argument is a structure holding the information garthered previously, peeking into it reveals:

    kddd /c 1 poi(esp+8) l 4
    81583008  00000001
    8158300c  00000149
    81583010  00000008   <-- # of code patches
    81583014  f9648b1f   <-- hmm...
    

As it turns out, address 81583014 refers to a variable length array of size 8. Poking aroud with dd, the following listing suggests that the structure is of size 28 bytes:

    kddd /c 7 81583014
    81583014  f9648b1f fa2afb1f 000000ec 00000005 000000f1 000000f6 00000005
    81583030  f9648b24 fa2b2b24 000000fb 00000002 000000fd 000000ff 00000002
    8158304c  f96585ef fa2b15ef 00000101 00000005 00000106 0000010b 00000005
    81583068  f96585f4 fa2b45f4 00000110 00000002 00000112 00000114 00000002
    81583084  f9658569 fa2b3569 00000116 00000005 0000011b 00000120 00000005
    815830a0  f965856e fa2b656e 00000125 00000002 00000127 00000129 00000002
    815830bc  f9653378 fa2b5378 0000012b 00000005 00000130 00000135 00000005
    815830d8  f965337d fa2b837d 0000013a 00000005 0000013f 00000144 00000005
    

Given that rdbss was loaded to address range f9641000-f9671000, it is obvious that the first 2 columns refer to code addresses. The third, fifth and sixth column looks like an offset, the fourth and seventh like the length of the code change. First, let’s see where the first column points to:

    kdu f9648b1f
    rdbss!RxInitiateOrContinueThrottling+0x6b:
    f9648b1f 90              nop
    f9648b20 90              nop
    f9648b21 90              nop
    f9648b22 90              nop
    f9648b23 90              nop
    rdbss!RxpCancelRoutine:
    f9648b24 8bff            mov     edi,edi
    f9648b26 55              push    ebp
    f9648b27 8bec            mov     ebp,esp
    

Now that looks promising, especially since the fourth column holds the value 5. Let’s look at the second row:

    kdu f9648b24
    rdbss!RxpCancelRoutine:
    f9648b24 8bff            mov     edi,edi
    

No doubt, the first and second row define the two patches necessary to redirect RxpCancelRoutine. But what to replace this code with? As it turns out, the offsets in column three are relative to the structure and point to the code that is to be written:

    kdu poi(esp+8)+000000ec
    815830f4 e9dcc455fd      jmp     7eadf5d5          mov     edi,edi
    
    kdu poi(esp+8)+000000fb
    81583103 ebf9            jmp     815830fe
    

That makes perfectly sense – the five nops are to be overwritten by a near jump, the mov edi, edi will be replaced by a short jump.

So let’s run MiDoCopyMemory and have a look at the results. Back in MmLockAndCopyMemory, the code referred to by the first to rows look like this:

    kdu f9648b1f
    rdbss!RxInitiateOrContinueThrottling+0x6b:
    f9648b1f e9dcc455fd      jmp     hpf3!RxpCancelRoutine (f6ba5000)
    
    kdu f9648b24
    rdbss!RxpCancelRoutine:
    f9648b24 ebf9            jmp     rdbss!RxInitiateOrContinueThrottling+0x6b (f9648b1f)
    f9648b26 55              push    ebp
    f9648b27 8bec            mov     ebp,esp
    
    

Voilà, RxpCancelRoutine has been patched and calls are redirected to hpf3!RxpCancelRoutine, the new routine located in the auxiliarry ‘hpf3’ driver. All that remains to be done is cleanup (unlocking the memory etc).

That’s it – that’s how Windows applies patches on the fly using hotpatching. Too bad that the technology is so rarely used in practice.

Any opinions expressed on this blog are Johannes' own. Refer to the respective vendor’s product documentation for authoritative information.
« Back to home