Concurrent Execution A typical user mode process on a Windows system can be expected to have more than one thread. In addition to user threads, the Windows kernel employs a number of system threads. Given the presence of multiple threads, it is likely that whenever a code modification is performed, more than one thread is affected, i.e. more than one thread is sooner or later going to execute the modified code sequence.
Performing modifications on existing code is a technique commonly encountered among instrumentation solutions such as DTrace. Assuming a multiprocessor machine, altering code brings up the challenge of properly synchronizing such activity among processors. As stated before, IA-32/Intel64 allows code to be modified in the same manner as data. Whether modifying data is an atomic operation or not, depends on the size of the operand. If the total number of bytes to be modified is less than 8 and the target address adheres to certain alignment requirements, current IA-32 processors guarantee atomicity of the write operation.
Instrumentation of a routine may comprise multiple steps. As an example, a trampoline may need to be generated or updated, followed by a modification on the original routine, which may include updatating or replacing a branch instruction to point to the trampoline. In such cases, it is essential for maintaining consistency that the code changes take effect in a specific order. Otherwise, if the branch was written before the trampoline code has been stored, the branch would temporarily point to uninitialized memory.
Runtime code modification, of self modifying code as it is often referred to, has been used for decades – to implement JITters, writing highly optimized algorithms, or to do all kinds of interesting stuff. Using runtime code modification code has never been really easy – it requires a solid understanding of machine code and it is straightforward to screw up. What’s not so well known, however, is that writing such code has actually become harder over the last years, at least on the IA-32 platform: Comparing the 486 and current Core architectures, it becomes obvious that Intel, in order to allow more advanced CPU-interal optimizations, has actually lessened certain gauarantees made by the CPU, which in turn requires the programmer to pay more attection to certain details.
Definetely one my pet peeves about Windows Installer is how it deals with instruction set architectures (ISAs). Looking at Windows NT history, supported ISAs have come (amd64, IA-64) and gone (Alpha, PowerPC, MIPS) – yet most of the time, there was more than one ISA being officially supported. Having to ship binaries for multiple ISAs therefore always has been on the agenda for many ISVs. Needless to say, supporting multiple ISAs requires special consideration when developing setup packages and providing separate packages – one for each ISA – has become common practice to approach this.
The cfix 1.2 package as released last week contained a rather stupid bug that the new build, 220.127.116.1144, now fixes: the amd64 binaries cfix64.exe and cfixkr64.sys were wrongly installed as cfix32.exe and cfixkr32.sys, respectively. Not only did this stand in contrast to what the documenation stated, it also resulted in cfix being unable to load the cfixkr driver on AMD64 platforms. The new MSI package is now available for download on Sourceforge.