Posts Tagged 'amd64'

Runtime Code Modification Explained, Part 1: Dealing With Memory

Runtime code modification, or self-modifying code as it is often referred to, has been used for decades: to implement JITters, to write highly optimized algorithms, or to do all kinds of interesting stuff. Using runtime code modification has never been really easy. It requires a solid understanding of machine code, and it is easy to screw up. What’s not so well known, however, is that writing such code has actually become harder over the last years, at least on the IA-32 platform: Comparing the 486 and current Core architectures, it becomes obvious that Intel, in order to allow more advanced CPU-internal optimizations, has actually weakened certain guarantees made by the CPU, which in turn requires the programmer to pay more attention to certain details.

Looking around on the web, there are plenty of code snippets and example projects that make use of self-modifying code. Without pointing fingers at specific resources, it is, however, safe to assume that a significant (and I mean significant!) fraction of these examples fail to address all potential problems related to runtime code modification. As I have shown a while ago, even Detours, a well-done, widely recognized, and widely used library that relies on runtime code modification, has its issues.

Adopting the nomenclature suggested by the Intel processor manuals, code writing data to memory with the intent of having the same processor execute this data as code is referred to as self-modifying code. On SMP machines, it is possible for one processor to write data to memory with the intent of having a different processor execute this data as code. This process is referred to as cross-modifying code. I will jointly refer to both practices as runtime code modification.

Memory Model

The easiest part of runtime code modification is dealing with the memory model. In order to implement self-modifying or cross-modifying code, a program must be able to address the regions of memory containing the code to be modified. Moreover, due to memory protection mechanisms, overwriting code may not be trivially possible.

The IA-32 architecture offers three memory models: the flat, the segmented, and the real mode memory model. Current operating systems such as Windows and Linux rely on the flat memory model, so I will ignore the other two.

Whenever the CPU fetches code, it addresses memory relative to the segment mapped by the CS segment register. In the flat memory model, the CS segment register, which refers to the current code segment, is always set up to map to linear address 0. In the same manner, the data and stack segment registers (DS, SS) are set up to refer to linear address 0.

It is worth mentioning that AMD64 has largely retired segmentation in 64-bit mode: the segment bases of the code, data, and stack segments (CS, DS, ES, SS) are always treated as 0.

Given this setup, code can be accessed and modified on IA-32 as well as on AMD64 in the same manner as data. Easy-peasy.
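As a rough illustration of code being addressable like data, the following sketch copies the first bytes of a function's machine code into a buffer with an ordinary memcpy. The names add_one and read_code_bytes are made up for this example, and the pointer conversion relies on flat-model behavior that ISO C itself does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical function whose machine code gets inspected. */
int add_one(int x)
{
    return x + 1;
}

/* Copies the first n bytes of fn's machine code into out using plain
 * data reads. The caller must make sure that n does not run past
 * mapped memory. */
void read_code_bytes(int (*fn)(int), unsigned char *out, size_t n)
{
    assert(fn != NULL && out != NULL);

    /* ISO C does not define function-to-object pointer conversions;
     * going through uintptr_t is an assumption that holds on
     * flat-model IA-32 and AMD64 systems. */
    const unsigned char *code = (const unsigned char *)(uintptr_t)fn;
    memcpy(out, code, n);
}
```

Calling read_code_bytes(add_one, buffer, 8) fills buffer with whatever instruction bytes the compiler emitted for add_one; the exact bytes depend on compiler and flags.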

Memory Protection

One of the features enabled by the use of paging is the ability to enforce memory protection. Each page can specify restrictions to which operations are allowed to be performed on memory of the respective page.

In the context of runtime code modification, memory protection is of special importance as memory containing code usually does not permit write access, but rather read and execute access only. A prospective solution thus has to provide a means to either circumvent such write protection or to temporarily grant write access to the required memory areas.
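Temporarily granting write access can be sketched roughly as follows. On Windows, the relevant call is VirtualProtect; the non-Windows branch using mprotect is my own addition so that the sketch can be tried elsewhere. The helper names alloc_readonly_page and patch_readonly are hypothetical, and the example patches a read-only data page rather than real code.

```c
#if !defined(_WIN32) && !defined(_DEFAULT_SOURCE)
#define _DEFAULT_SOURCE /* glibc: expose MAP_ANONYMOUS */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#include <unistd.h>
#endif

/* Hypothetical helper: returns one zero-filled, read-only page (or NULL),
 * standing in for a write-protected part of an image. */
void *alloc_readonly_page(void)
{
#ifdef _WIN32
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return VirtualAlloc(NULL, si.dwPageSize, MEM_COMMIT | MEM_RESERVE,
                        PAGE_READONLY);
#else
    void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
#endif
}

/* Hypothetical helper: overwrites len bytes at target inside a read-only
 * region. Returns 0 on success, -1 on failure. */
int patch_readonly(void *target, const void *bytes, size_t len)
{
    assert(target != NULL && bytes != NULL && len > 0);
#ifdef _WIN32
    DWORD old_protect, unused;
    if (!VirtualProtect(target, len, PAGE_READWRITE, &old_protect))
        return -1;
    memcpy(target, bytes, len);
    /* Put back whatever protection the pages had before. */
    return VirtualProtect(target, len, old_protect, &unused) ? 0 : -1;
#else
    /* mprotect() wants a page-aligned start address. */
    uintptr_t page  = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)target & ~(page - 1);
    size_t    span  = (size_t)((uintptr_t)target + len - start);
    if (mprotect((void *)start, span, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(target, bytes, len);
    /* POSIX cannot report the previous protection; this sketch simply
     * assumes the region was read-only. */
    return mprotect((void *)start, span, PROT_READ) == 0 ? 0 : -1;
#endif
}
```

When the target is actual code rather than data, the Windows branch would presumably request PAGE_EXECUTE_READWRITE instead and call FlushInstructionCache after the write.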

As other parts of the image are write-protected as well, memory protection equally applies to approaches that modify non-code parts of the image such as the Import Address Table. That’s why the call to VirtualProtect is necessary when patching the IAT.

Programs using runtime code modification often do not restrict themselves to changing existing code but rather generate additional code. Assuming Data Execution Prevention has been enabled, it is thus vital for such approaches that any generated code is placed into memory regions that grant execute access. While user mode implementations can rely on a feature of the RTL heap (i.e. passing the HEAP_CREATE_ENABLE_EXECUTE flag to RtlCreateHeap) for allocating executable memory, no comparable facility exists for kernel mode, so a potential instrumentation solution has to come up with a custom allocation strategy.
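For user mode, allocating executable memory might look roughly like this. The sketch uses the Win32 wrapper HeapCreate rather than calling RtlCreateHeap directly, and the non-Windows branch based on mmap is my own addition for trying the code on other systems. The name alloc_executable is hypothetical.

```c
#if !defined(_WIN32) && !defined(_DEFAULT_SOURCE)
#define _DEFAULT_SOURCE /* glibc: expose MAP_ANONYMOUS */
#endif

#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

/* Hypothetical helper: allocates size bytes of readable, writable, and
 * executable memory. Returns NULL on failure. */
void *alloc_executable(size_t size)
{
    assert(size > 0);
#ifdef _WIN32
    /* One growable heap whose pages are executable, created on first
     * use. This lazy initialization is not thread-safe. */
    static HANDLE exec_heap;
    if (exec_heap == NULL)
        exec_heap = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, 0, 0);
    return exec_heap != NULL ? HeapAlloc(exec_heap, 0, size) : NULL;
#else
    /* Hardened systems may refuse mappings that are both writable and
     * executable, in which case this returns NULL. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
#endif
}
```

Generated code would then be written into the returned block before control is transferred to it.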

Jump distances

Whenever code is being generated, odds are that branch instructions are involved. Depending on where memory for the new code has been allocated and where the branch target falls, the offset between the branch instruction itself and the jump target may be of significant size. In such cases, the software has to make sure that the branch instruction chosen does in fact support offsets at least as large as required. This sounds trivial, but it is not: Software that overwrites existing code with a branch may face severe limitations w.r.t. how many bytes the branch instruction may occupy. If, for example, there are fewer than 5 bytes of space (assuming IA-32), a jump with a 32-bit displacement (JMP rel32, which Intel calls a near jump) cannot be used. That leaves the 2-byte short jump (JMP rel8), which only reaches targets within -128 to +127 bytes of the following instruction, so the newly allocated code better be near.
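The size constraint can be sketched as a small encoder that picks between the 2-byte JMP rel8 (opcode EB) and the 5-byte JMP rel32 (opcode E9), depending on the distance and on how many bytes may be overwritten. The function name encode_rel_jmp is made up for this example, and it treats addresses as 64-bit values, so it models AMD64-style range limits rather than the wrap-around behavior of a 32-bit address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes a relative JMP located at address `from`
 * that transfers control to `to`, using at most `avail` bytes of `out`.
 * Returns the number of bytes written (2 or 5), or 0 if nothing fits. */
size_t encode_rel_jmp(uint8_t *out, size_t avail, uint64_t from, uint64_t to)
{
    assert(out != NULL);

    /* Displacements are relative to the end of the jump instruction. */
    int64_t disp8  = (int64_t)(to - (from + 2));
    int64_t disp32 = (int64_t)(to - (from + 5));

    if (avail >= 2 && disp8 >= INT8_MIN && disp8 <= INT8_MAX) {
        out[0] = 0xEB;                      /* JMP rel8 */
        out[1] = (uint8_t)disp8;
        return 2;
    }
    if (avail >= 5 && disp32 >= INT32_MIN && disp32 <= INT32_MAX) {
        out[0] = 0xE9;                      /* JMP rel32 */
        for (int i = 0; i < 4; i++)         /* little-endian displacement */
            out[1 + i] = (uint8_t)((uint32_t)disp32 >> (8 * i));
        return 5;
    }
    return 0;                               /* target out of reach */
}
```

A return value of 0 is the interesting case: the caller either has to find more bytes to overwrite or allocate the new code closer to the patch site.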

Further safety concerns will be discussed in Part 2 of this series of posts.


Mixing 32 and 64-bit components in a single MSI

Definitely one of my pet peeves about Windows Installer is how it deals with instruction set architectures (ISAs). Looking at Windows NT history, supported ISAs have come (amd64, IA-64) and gone (Alpha, PowerPC, MIPS), yet most of the time, more than one ISA was officially supported. Having to ship binaries for multiple ISAs has therefore always been on the agenda for many ISVs.

Needless to say, supporting multiple ISAs requires special consideration when developing setup packages, and providing separate packages, one for each ISA, has become common practice. This approach makes perfect sense: Given the incompatibility of most ISAs, nobody needs Alpha binaries on a MIPS system or amd64 binaries on an IA-64 machine, so there seems little reason to mix ISAs within a single package.

Unsurprisingly, Windows Installer, which was created somewhere around 2000, also goes this route and encourages developers to provide separate packages for each ISA.

However, with the advent of amd64/x64/IA-32e/Intel 64/whateveryoucallit, the situation has changed: Because i386 and amd64 are so closely related and compatible, there are now plenty of situations where combining binaries of differing ISAs (i.e. amd64 and i386) in a single installer package makes perfect sense. Examples of this include:

  • A package comprises a shell extension as well as a standalone App. For certain reasons (maybe the use of VB6), there is only a 32-bit version of the App. The shell extension, in contrast, is available for both i386 and amd64. Whether you put everything into one package or provide separate packages for each ISA, one of them will comprise a mixture of ISAs.
  • SDKs for unmanaged code usually include .lib and .dll files for multiple architectures. Shipping separate packages for i386 and amd64 (containing different binaries but the same headers, docs, etc.) may please the Windows Installer gods, but seems redundant, a waste of disk space, and user-unfriendly.

Thanks to the msidbComponentAttributes64bit flag, mixing architectures in a single MSI package is technically possible: You mark the package as being 32 bit and set said flag for all 64-bit components. Rather than splitting your setup into multiple packages, you can conveniently combine everything into one.

When reading the documentation (and ICE requirements, more on this later) carefully though, it turns out that this is not quite what the Windows Installer team invented this flag for. Anyway, it works fine, problem solved.


If only there was not ICE80.

ICE80, alas, is critical if you intend to conform to the Requirements for the Windows Vista Logo Program for Software:

Applications must use the Windows Installer (MSI) or ClickOnce for installation. Windows Installation packages must not receive any errors from the Internal Consistency Evaluators (ICEs) listed here:

1-24, 27-31, 33-36, 38, 40-57, 59, 61-63, 65, 67-72, 74-84, 86-87, 89-94, 96-99

ICE80 mainly states that (1) you should not install 64 bit components to 32 bit directories (e.g. Program Files vs. Program Files (x86)) and (2) you should not use 64 bit components in a 32 bit package.

(1) is fair enough, although it raises the question of where you should install your software without splitting it in two or violating other ICE rules. Worse yet, (2) effectively means that said way to create multi-ISA packages, creating 32-bit packages with some components marked with msidbComponentAttributes64bit, is illegal altogether.

So to be logo’ed, there seems to be no other way than providing separate packages, maybe along with (urgh!) a meta-package that installs the other two.

If there are more important things on your schedule than getting a Vista logo, ICE80 seems like something that can safely be ignored. Indeed, this is what I have done several times, including in case of the cfix installer.

Anyway, let’s ignore ICE80 once more and hold on to the plan of building a 32-bit package containing both 32-bit and 64-bit components.


For an SDK that is installed on 64-bit Windows, it will usually make sense to install both 32-bit and 64-bit .lib and .dll files etc. On 32-bit Windows, installing 64-bit components may seem odd, but due to the existence of amd64 compilers for i386, it still makes sense to install them or at least offer them as an optional feature.

So far, so good. Things get interesting, though, when COM registration comes into play. Naturally, a 32-bit installer package sees the system like any other 32-bit application does. Most importantly, this means that Registry Reflection and File System Redirection apply.

Now consider a package that contains both a 32-bit and a 64-bit version of some COM server, each installed to a separate directory. COM registration can be performed either through the Class table or the Registry table. Provided that the msidbComponentAttributes64bit flag has been used properly, such a package will work great on 64-bit systems thanks to Registry Reflection: The registry entries will be written to the proper (reflected) locations and both COM servers will work properly.

Now think about what happens on 32-bit Windows: (1) There is no Registry Reflection and (2) Windows Installer silently ignores msidbComponentAttributes64bit flags. Result: The installation will run just as smoothly as on the 64-bit system. However, while installing the files continues to work flawlessly, the registry will be left in a less-than-optimal state: Due to the nonexistence of Registry Reflection, the registration entries of both COM servers will have been written to the same location!

Needless to say, the server whose registration entries were written first will now be unusable.
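The collision can be made concrete with a deliberately simplified model of where a CLSID registration ends up. This is not how Windows Installer is implemented: the function names are made up, the key paths only approximate the separate 32-bit view that 64-bit Windows keeps under Wow6432Node, and the CLSID in the usage note is a placeholder. Only the flag value 0x100 is taken from the Windows Installer SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Flag value as defined by the Windows Installer SDK. */
#define msidbComponentAttributes64bit 0x100

/* Simplified model: writes the registry key that would receive the
 * registration of `clsid` for a component with the given attributes. */
void registration_key(bool os_is_64bit, unsigned attributes,
                      const char *clsid, char *out, size_t out_size)
{
    assert(clsid != NULL && out != NULL && out_size > 0);

    bool is_64bit_component =
        (attributes & msidbComponentAttributes64bit) != 0;

    /* Only 64-bit Windows keeps a separate view for 32-bit components.
     * On 32-bit Windows the flag is ignored and one view exists. */
    bool separate_view = os_is_64bit && !is_64bit_component;

    snprintf(out, out_size, "HKLM\\Software\\Classes\\%sCLSID\\%s",
             separate_view ? "Wow6432Node\\" : "", clsid);
}

/* Returns true if the 32-bit and the 64-bit COM server would register
 * under the very same key. */
bool registrations_collide(bool os_is_64bit, const char *clsid)
{
    char key32[160], key64[160];
    registration_key(os_is_64bit, 0, clsid, key32, sizeof key32);
    registration_key(os_is_64bit, msidbComponentAttributes64bit, clsid,
                     key64, sizeof key64);
    return strcmp(key32, key64) == 0;
}
```

With a placeholder CLSID such as "{00000000-0000-0000-0000-000000000001}", registrations_collide(true, ...) yields false while registrations_collide(false, ...) yields true, which is the situation described above.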

In a way, Windows Installer has taken its revenge for breaking the rules.

Bottom line: Mixing 32 and 64-bit components in a single MSI works fine in many cases, but is against the MSI rules and can lead to further problems. And while I am still convinced that providing separate, ISA-specific packages is wrong or at least inconvenient in certain situations, it is definitely the safer and “right” way to go.

(Note: Windows Installer 4.5 introduced multi-package transactions, which allow reliable and transactional multi-package setups to be built so that splitting a setup into multiple packages can be implemented without much pain. However, very few users already have Windows Installer 4.5 installed and Windows 2000 is not even supported by this release. For many of us, relying on this feature therefore is not really an option.)

cfix 1.2 Installer Fixed for AMD64

The cfix 1.2 package as released last week contained a rather stupid bug that the new build now fixes: the amd64 binaries cfix64.exe and cfixkr64.sys were wrongly installed as cfix32.exe and cfixkr32.sys, respectively. Not only did this contradict what the documentation stated, it also resulted in cfix being unable to load the cfixkr driver on AMD64 platforms.

The new MSI package is now available for download on Sourceforge.


About me

Johannes Passing lives in Berlin, Germany and works as a Solutions Architect at Google Cloud.

While mostly focusing on Cloud-related stuff these days, Johannes still enjoys the occasional dose of Win32, COM, and NT kernel mode development.

He also is the author of cfix, a C/C++ unit testing framework for Win32 and NT kernel mode, Visual Assert, a Visual Studio Unit Testing-AddIn, and NTrace, a dynamic function boundary tracing toolkit for Windows NT/x86 kernel/user mode code.

Contact Johannes: jpassing (at) hotmail com
