What a weirdo: How the /analyze switch changes its behavior depending on its environment

In Visual Studio 2005 Team System (VSTS), the “ultimate” SKU of Visual Studio 2005, Microsoft introduced the /analyze compiler switch. When the /analyze switch is used, the cl compiler not only does its regular checks, but performs a much more thorough static code analysis.

While /analyze is very useful indeed, it was only available in the top SKU — the Standard and Professional versions of Visual Studio lacked support for this compiler switch (this has changed by now, Professional now also supports this feature). As some smart people quickly figured out though, the compilers shipped as part of the Windows SDK did support /analyze, too.

So given that some compilers do support /analyze while other do not, you may well expect that there are two slightly different types of binaries, one that the SDK and VSTS uses, and one that is shipped with other Visual Studio SKUs.

At least this was what I expected. As it turns out though, this is not quite the case.

Where’s /analyze?

For the past two years, I have been developing using Visual Studio 2005 Team System along with Windows SDK 6.0 and WDK 6000 on a Vista x64 machine. Using this setup, I was able to use the /analyze switch in both, “regular” Visual Studio projects and WDK (build.exe-driven) projects. That led me to the conclusion that the WDK 6000 compilers, like the SDK compilers were in fact /analyze-enabled binaries as well.

Switching to a Windows 7 machine with VSTS 2005 and 2008, SDK 7.0, and WDK 6000 did not change this — /analyze kept working fine in all environments.

Then I set up a build server, installed WDK 6000 and Windows SDK 7.0 and attempted to perform a build — to my surprise, though, I got plenty of complaints about the /anayze switch not being supported.

I verified that the right compilers (WDK 6000) were used and compared cl versions between the build machine and my development machine — both were 14.00.50727.220, so everything seemed right. Running cl.exe /? on both machines, however, I noticed that despite versions being the same, this Code Analsis section was missing in the output on the build machine:

                         -CODE ANALYSIS-

/analyze[:WX-] enable code analysis
    WX- - code analysis warnings should not be treated as errors even if /WX is invoked

So obviously, Code Analysis support is enabled or disabled depending on external factors — not the binary itself, but the environment somehow determines whether the /analyze switch is supported or not.

Observing cl.exe /? with Process Monitor on my development machine resulted in the following output:

Process Monitor tracing the search for c1xxast

This trace leaves little room for interpretation: The code analysis features must (mainly) be implemented in c1xxast.dll. c1xxast.dll, however, is not shipped with the WDK itself, nor is it shipped with the non-VSTS SKUs of Visual Studio. So by default, the WDK’s cl will fail to locate the DLL and will revert to “/analyze-disabled mode”.

If, however, you have VSTS or the Windows SDK installed on your machine and your %PATH% happens to include the right directories, cl’s search for c1xxast.dll will succeeded and — tada — /analyze suddenly works. On my development machine, this obviously was the case, whilst on the build machine, it was not.

Compiler version mish-mash

I added the Windows SDK’s bin directory to the build machine’s %PATH% and rerun the build. As I expected, /analyze now worked fine — what I did not quite expect though was that I was now getting dozens of compilation warnings like:

warning C6309: Argument '1' is null: this does not adhere to function
specification of 'CfixCreateThread'

The reason for this was simple: The WDK cl.exe (remember, version 14.00.50727.220), thanks to a proper %PATH%, now used c1xxast.dll from SDK 7 to perform code analysis — despite the fact that c1xxast.dll actually “belonged” to cl version 15.00.30729.01. So the c1xxast.dll was one generation ahead of the WDK I was using.

The really, really cool thing about cl being able to work with a newer c1xxast.dll is that you can continue using WDK 6000 or 6001 (with W2K support!) and still benefit from the latest-and-greatest static code analysis features.

The reason for getting several warnings on the build machine while not getting similar warnings on my development machine was simply that on my development machine, the VS 2005 directory preceded the SDK directory in my %PATH%. Once I switched the order, I got the same wanings on both machines. This leads me to:

The ugly thing about this, however, is that a tiny change in the order of directories in %PATH% can suddenly make a huge difference w.r.t. code analysis. This is not quite what you’d normally expect.

(The additional compiler warnings, by the way, were a result of the improved analysis checks in cl 15: cl 14 routinely failed to verify the usage of __in vs. __in_opt parameters; cl 15 has become much more precise here and found several mis-attributed function signatures.)

Visual Assert hits RTM, now available for free

Visual Assert, the unit testing Add-In for Visual Studio/Visual C++ has finally left its beta status and — better yet — is now available for free, both for commercial and non-commercial use.

Visual Assert, based on the cfix 1.6 unit testing framework, allows you to easily write, manage, run, and debug your C/C++ unit tests -– without ever leaving the Visual Studio® IDE. No fiddling with command line tools, no complex configuration, and no boilerplate code required.

Sounds good? Then go straight to the Visual Assert homepage and download the installer!

cfix 1.6 released, simplifies authoring of multi-threaded tests

A new release of cfix, the unit testing framework for C and C++, is now available for download. Besides some minor enhancements like extending the maximum permitted fixture name, cfix 1.6 introduces a major new feature, Anonymous Thread Auto-Registration.

Since its very first release, cfix has supported multi-threaded test cases, i.e. test cases that spawn child threads, each of which potentially making use of the various assertion statements like CFIX_ASSERT. To make this work and ensure that failing assertions are handled properly, however, usage of CfixCreateThread (rather than the native Win32 CreateThread) was mandatory when spawning such threads.

Although using CfixCreateThread (or CfixCreateSystemThread in case of kernel mode code) remains the preferred way to create child threads, there are situations where usage of this API is not possible — for example, when the child threads are created by libraries such as boost.

To support such scenarios, cfix now allows you to annotate a fixture to enable Anonymous Thread Auto-Registration, which means that newly spawned threads are automatically registered with cfix — no usage of CfixCreateThread required. Thanks to this feature, writing multi-threaded tests becomes straightforward — and integrating with libraries such as boost does not pose a problem any more.

For more details about this feature, please refer to the respective section in the cfix documentation.

As always, updated binaries and source code are available on Sourceforge.

How to customize test run execution to make your cfix test runs more effective

One of the features introduced back in cfix 1.2 was the ability to customize test execution with the command line parameters -fsr and -fsf. Using these switches can make your test runs more effective and can help simplify debugging — so it is worth spending a minute on this topic.

Assume our test run comprises two fixtures, Fixture A and Fixture B. As fixtures are always run in alphabetic order, and tests run in the order they are defined, the first test to be executed is Test 1 of Fixture A, followed by Test 2 of Fixture A, and so on. Assuming all tests of Fixture A succeed, execution then proceeds with the first test of Fixture B. The following figure illustrates this:

Successful execution of a unit testing suite

Things get interesting when one of the tests fails: Let us assume Test 2 of Fixture A leads to a failed assertion. The default behavior of cfix, which equals the behavior of JUnit and NUnit, is to report the failure and proceed with Test 3 of the same fixture:

Execution resumes after a failed unit test

In many cases, this is the appropriate thing to do — letting the test run to completion and figure out the reasons for the failure afterwards. But in certain scenarios, resuming with Test 3 might not be the smartest thing to do — rather, as soon as a test fails, execution should proceed with the next fixture. This is what -fsf (short-circuit fixture”) does:

Short-circuiting the fixture

Once a test case has failed, all remaining tests of the same fixture are skipped.

When debugging, even resuming at the next fixture is often not desired. Rather, as soon as one test fails, you may immediately want to take a closer a look at the failure, so resuming the run is mute. -fsr (short-circuit run”) does right that: It aborts the run immediately.

Short-circuiting the run

By the way — in Visual Assert, you can quickly access these options in the Options menu in the Test Explorer window:

Visual Assert Test Explorer

The hidden danger of forgetting to specify %SystemRoot% in a custom environment block

When spawning a process using CreateProcess and friends, the child process usually inherits the environment (i.e. all environment variables) of the spawning process. Of course, this behavior can be overridden by creating a custom environment block and passing it to the lpEnvironment parameter of CreateProcess.

While the MSDN documentation on CreateProcess does contain a remark saying that current directory information (=C: and friends) should be included in such a custom environment block, it does not mention the importance of SystemRoot.

The SystemRoot environment variable usually contains the path c:\windows — the path that is also accessible using the GetWindowsDirectory function. This environment variable, as it turns out, is not only handy for scripting purposes — it is, in fact, essential for the proper operation of many libraries.

For very simple programs, forgetting to include SystemRoot in a custom environment block usually goes unnoticed — even an empty environment block works just fine. In case of more complex applications, however, the omission of this variable can quickly lead to errors — on Vista, the most common error that can be tracked back to a missing SystemRoot variable is SXS failing to find/load basic system libraries.

Now that we have Windows 7, SystemRoot seems to have become even more important: Now it is not only SXS that requires SystemRoot to be specified properly, but also CryptoAPI.

In my particular case, I was experiencing a 0×80090006 (“Invalid Signature”, NTE_BAD_SIGNATURE) error whenever the child process attempted to call CoGetObject to retrieve a pointer to a DCOM object. While this error occured on Windows 7, the same code worked fine on Windows Vista and XP.

Given this more than general error message, it seemed anything but clear to me what the problem was, so I attached a debugger to the child process (using gflags/Image File Execution Options). Once I did that, I got the following messages in my debug output output:

CryptAcquireContext: CheckSignatureInFile failed at cryptapi.c line 5198
CryptAcquireContext: Failed to read registry signature value at cryptapi.c line 873

I set a breakpoint on CryptAcquireContextW and looked at the stack trace:


0:000> k
ChildEBP RetAddr
0008f8a4 75760a4f ole32!CRandomNumberGenerator::Initialize+0x2e
0008f8b0 75760769 ole32!CRandomNumberGenerator::GenerateRandomNumber+0xd
0008f8e8 757609cf ole32!CStdMarshal::AddIPIDEntry+0x48
0008f93c 75766aae ole32!CStdMarshal::MarshalServerIPID+0x5a
0008f994 75767519 ole32!CStdMarshal::MarshalObjRef+0xb9
0008f9c8 7576778e ole32!MarshalInternalObjRef+0x8c
0008fa4c 757676ba ole32!CRemoteUnknown::CRemoteUnknown+0x3b
0008fa8c 7576754a ole32!CComApartment::InitRemoting+0x19c
0008fa98 7586d83e ole32!CComApartment::StartServer+0x13
0008faa8 757652b3 ole32!InitChannelIfNecessary+0x1e
0008fb20 757fc046 ole32!CoUnmarshalInterface+0x38
0008fb34 757fd3d5 ole32!CObjrefMoniker::Load+0x26
0008fb70 7573cb7f ole32!CObjrefMonikerFactory::ParseDisplayName+0x16f
0008fbbc 7573caae ole32!FindClassMoniker+0x8b
0008fbf4 75789dc7 ole32!MkParseDisplayName+0xbb
0008fc3c 6954ce84 ole32!CoGetObject+0x82
...

Quite obviously, COM, trying to unmarshal an interface, needed a random number and attempted to use CryptoAPI for this purpose. Looking at the paramters of CryptAcquireContext, I saw that the Microsoft Strong Cryptographic Provider was attempted to be loaded — one of the standard Windows CSPs — so everything seemed normal.

Guided by the message Failed to read registry signature, I switched to Process Monitor to see which registry key was being queried: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Cryptography\Defaults\Provider\Microsoft Strong Cryptographic Provider.

Taking a look at this key in regedit, it did not take long before spotting SystemRoot as the culprit:

Looking at file system activity in Process Monitor proved this:

Process Monitor

Interestingly, on Windows Vista, all Image Path values in the CSP registry keys do not use SystemRoot — they just contain the file name and rely on the library path in order to locate the libraries at runtime. This explains why my code worked fine on Vista.

(While making this change, the developer seemed to forget to change the value’s type from REG_SZ to REG_EXPAND_SZ though :) )

Bottom Line 1: Always, always include SystemRoot when passing a custom environment block to CreateProcess.

Bottom Line 2: The case also shows how a seemingly trivial change in Windows (using an absolute path rather then just a file name in the CSP registry key) can lead to an application incompatibility.

cfix finished 2nd in ATI’s Automation Honors Awards, surpassed only by JUnit

Along with JUnit, JWebUnit, NUnit, and SimpleTest, cfix was one of the nominees for the Automated Testing Institute’s Automation Honors Award 2009 in the category Best Open Source Unit Automated Test Tool. A few days ago, the results were published and cfix finished second — surpassed only by JUnit, which finished 1st (No real surprise here). If you are interested, there is a video in which the results are presented.

NTrace paper published on computer.org

Our paper NTrace: Function Boundary Tracing for Windows on IA-32 from WCRE 2009 has now been published on computer.org:

Abstract:

For a long time, dynamic tracing has been an enabling technique for reverse engineering tools. Tracing can not only be used to record the control flow of a particular component such as a piece of malware itself, it is also a way to analyze the interactions of a component and their impact on the rest of the system. Unlike Unix-based systems, for which several dynamic tracing tools are available, Windows has been lacking appropriate tools. From a reverse engineering perspective, however, Windows may be considered the most relevant OS, particularly with respect to malware analysis. In this paper, we present NTrace, a dynamic tracing tool for the Windows kernel, drivers, system libraries, and applications that supports function boundary tracing. NTrace incorporates 2 novel approaches: (1) a way to integrate with Windows Structured Exception Handling and (2) a technique to instrument binary code on IA-32 architectures that is both safe and more efficient than DTrace.

http://www.computer.org/portal/web/csdl/doi/10.1109/WCRE.2009.12

If you do not feel like reading the paper, you can also take a look at the screencasts:


Part 1. Kernel Mode NTrace:
Tracing NTFS and the I/O manager

Part 2: User Mode NTrace

Part 2. User Mode NTrace:
Tracing COM loading a DLL

Visual Assert Beta 3 released

A third beta release of Visual Assert is now available for download on www.visualassert.com.

Visual Assert, in case you have not tried it yet, is an Add-In for Visual Studio that adds unit testing capabilities to the Visual C++ IDE: Based on the cfix unit testing framework, Visual Assert allows unit tests to be written, run, and debugged from within the IDE. Pretty much like Junit/Eclipse, TestDriven.Net or MSTest, but for real, native code — code written in C or C++.

Bugs, bugs, bugs, bugs

Alas, there were still a few of them in the previous two beta releases. Luckily though, almost all I received from users or found by internal testing were relatively minor in nature. Still, I want Visual Assert to be as high quality as possible and decided to add this third beta release into the schedule and take the time to focus on — you guessed it — bugfixing, bugfixing, bugfixing, and bugfixing.

Speaking of bug reports, I have to thank all users of Visual Assert and cfix who reported bugs, suggested new features or provided general feedback. Your input has been, and still is highly appreciated. Although I had to postpone any feature suggestions to a later release, I tried hard to resolve all bugs and have them fixed in this new release.

Download, Try it, Share Your Opinion

Of course, using the new Beta version is free. So whether you have already used the previous beta or not, whether you are a unit testing newbie or write unit tests on a daily basis, be sure to give the new version a try. And of course, do not forget to let me know about your feedback, suggestions, found bugs, etc.!

cfix 1.5.1 released

A new version of cfix, the unit testing framework for C and C++ on Windows, is now available on Sourceforge. Despite fixing several minor issues, the new version resolves the following two issues that were reported by users:

  • Definiting multiple WinUnit fixtures with setup/teardown routines in a single .cpp file leads to a compilation error
  • A thread handle is leaked during execution of a test (#2889511)

Updated binaries and source code are available for download on Sourceforge.

Btw, in case you use cfix for kernel mode testing and are using WDK 7600, please have a look at my previous post: LTCG issues with the WIN7/amd64 environment of WDK 7600

LTCG issues with the WIN7/amd64 environment of WDK 7600

Now that Windows 7 is out, we all sooner or later have to upgrade to WDK 7600. I am still reluctant to move away from WDK 6000/6001 because of the dropped W2K support, but this is a different issue.

However, as one cfix user who has obviously already adopted WDK 7600 kindly pointed out to me, linking a kernel mode unit test against cfix using WDK 7600 and the WIN7/amd64 environment fails reproducibly with the following error message:

error fatal error C1047: The object or library file ‘…\lib\amd64\cfixkdrv.lib’ was created with an older compiler than other objects; rebuild old objects and libraries

In contrast, building the same driver for WIN7/x86 works fine.

As the documentation for C1047 indicates, this error is usually related to inconsistent usage of Link Time Code Generation (LTCG): As soon as you use LTCG, all objects and libraries must be compiled with /GL — this normally is not a big deal, but as this WDK page rightfully explains, it means that libraries built this way are not suitable for redistribution because of their dependency on a specific compiler/linker version. But of couse, it also means that a library not built using /GL cannot be used easily when you build your program using LTCG.

Before Windows 7, all WDK build environment configurations I am aware of did not use LTCG. Neither did cfix, so everything worked fine even if your compiler/linker versions did not match the ones used for building cfix.

With WDK 7600, this situation changes: While WLH and other downlevel build environments still do not seem to use LTCG, the WIN7 environment, at least for amd64, enables LTCG by default.

What this means is that as soon as you link against a library which is not part of the WDK and therefore likely to be built using a different compiler version, you’ll get C1047. It is thus no surprise that attempting to link against cfixkdrv.lib, which is the library all kernel mode unit tests have to link against and which itself has been built using WDK 6000, leads to the error quoted above.

However, once you have figured this out, the workaround for this issue is trivial: Disable LTCG for your test driver by adding the following line to your SOURCES file:


USER_C_FLAGS=/GL-

Link time code generation is a very powerful optimization technique and I encourage everybody to make use of it if possible. However, for the compatibility reasons outlined above, I consider it a rather stupid idea to have the WDK enable LTCG by default. Rather, I had much preferred to see an opt-in switch for LTCG.

Anyway, for the next version, I will consider shipping both, a 7600-compatible LTCG-enabled and a non-LTCG-enabled version of cfixkdrv.lib. Until then, the workaround described above will do the trick.

Next Page »


Categories

Try Visual Assert, the unit testing add-in for Visual Studio (R)


NTrace: Function Boundary Tracing for Windows on IA-32

About me

Johannes Passing, M.Sc., living in Berlin, Germany.

Johannes mainly focusses on Win32, COM, and NT kernel mode development, along with Java and .Net. He also is the author of cfix, a C/C++ unit testing framework for Win32 and NT kernel mode, Visual Assert, a Visual Studio Unit Testing-AddIn, and NTrace, a dynamic function boundary tracing toolkit for Windows NT/x86 kernel/user mode code.

Contact Johannes: jpassing (at) acm org

Johannes' GPG fingerprint is BBB1 1769 B82D CD07 D90A 57E8 9FE1 D441 F7A0 1BB1.

LinkedIn LinkedIn Profile
Xing Xing Profile
Twitter Follow me on Twitter (new)