Archive for the 'COM' Category

The hidden danger of forgetting to specify %SystemRoot% in a custom environment block

When spawning a process using CreateProcess and friends, the child process usually inherits the environment (i.e. all environment variables) of the spawning process. Of course, this behavior can be overridden by creating a custom environment block and passing it to the lpEnvironment parameter of CreateProcess.

While the MSDN documentation on CreateProcess does contain a remark saying that current directory information (=C: and friends) should be included in such a custom environment block, it does not mention the importance of SystemRoot.

The SystemRoot environment variable usually contains the path c:\windows — the path that is also accessible using the GetWindowsDirectory function. This environment variable, as it turns out, is not only handy for scripting purposes — it is, in fact, essential for the proper operation of many libraries.

For very simple programs, forgetting to include SystemRoot in a custom environment block usually goes unnoticed; even an empty environment block works just fine. In the case of more complex applications, however, the omission of this variable can quickly lead to errors. On Vista, the most common error that can be traced back to a missing SystemRoot variable is SXS failing to find or load basic system libraries.

Now that we have Windows 7, SystemRoot seems to have become even more important: Now it is not only SXS that requires SystemRoot to be specified properly, but also CryptoAPI.

In my particular case, I was experiencing a 0x80090006 (“Invalid Signature”, NTE_BAD_SIGNATURE) error whenever the child process attempted to call CoGetObject to retrieve a pointer to a DCOM object. While this error occurred on Windows 7, the same code worked fine on Windows Vista and XP.

Given this rather generic error message, it was anything but clear to me what the problem was, so I attached a debugger to the child process (using gflags/Image File Execution Options). Once I did that, I got the following messages in my debug output:

CryptAcquireContext: CheckSignatureInFile failed at cryptapi.c line 5198
CryptAcquireContext: Failed to read registry signature value at cryptapi.c line 873

I set a breakpoint on CryptAcquireContextW and looked at the stack trace:


0:000> k
ChildEBP RetAddr  
0008f8a4 75760a4f ole32!CRandomNumberGenerator::Initialize+0x2e
0008f8b0 75760769 ole32!CRandomNumberGenerator::GenerateRandomNumber+0xd
0008f8e8 757609cf ole32!CStdMarshal::AddIPIDEntry+0x48
0008f93c 75766aae ole32!CStdMarshal::MarshalServerIPID+0x5a
0008f994 75767519 ole32!CStdMarshal::MarshalObjRef+0xb9
0008f9c8 7576778e ole32!MarshalInternalObjRef+0x8c
0008fa4c 757676ba ole32!CRemoteUnknown::CRemoteUnknown+0x3b
0008fa8c 7576754a ole32!CComApartment::InitRemoting+0x19c
0008fa98 7586d83e ole32!CComApartment::StartServer+0x13
0008faa8 757652b3 ole32!InitChannelIfNecessary+0x1e
0008fb20 757fc046 ole32!CoUnmarshalInterface+0x38
0008fb34 757fd3d5 ole32!CObjrefMoniker::Load+0x26
0008fb70 7573cb7f ole32!CObjrefMonikerFactory::ParseDisplayName+0x16f
0008fbbc 7573caae ole32!FindClassMoniker+0x8b
0008fbf4 75789dc7 ole32!MkParseDisplayName+0xbb
0008fc3c 6954ce84 ole32!CoGetObject+0x82
...

Quite obviously, COM, trying to unmarshal an interface, needed a random number and attempted to use CryptoAPI for this purpose. Looking at the parameters of CryptAcquireContext, I saw that it was trying to load the Microsoft Strong Cryptographic Provider, one of the standard Windows CSPs, so everything seemed normal.

Guided by the message Failed to read registry signature, I switched to Process Monitor to see which registry key was being queried: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Cryptography\Defaults\Provider\Microsoft Strong Cryptographic Provider.

Taking a look at this key in regedit, it did not take long before spotting SystemRoot as the culprit:

Looking at file system activity in Process Monitor proved this:

[Screenshot: Process Monitor]

Interestingly, on Windows Vista, none of the Image Path values in the CSP registry keys use SystemRoot; they just contain the file name and rely on the library search path to locate the libraries at runtime. This explains why my code worked fine on Vista.

(While making this change, the developer seems to have forgotten to change the value’s type from REG_SZ to REG_EXPAND_SZ, though :) )

Bottom Line 1: Always, always include SystemRoot when passing a custom environment block to CreateProcess.

Bottom Line 2: The case also shows how a seemingly trivial change in Windows (using an absolute path rather than just a file name in the CSP registry key) can lead to an application incompatibility.
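
To make Bottom Line 1 concrete, here is a minimal sketch of how a custom ANSI environment block could be assembled so that SystemRoot is never left out. It is written in portable C and does not call any Windows API; the names build_env_block and env_entry_has_name are my own, and the system_root argument is assumed to have been obtained elsewhere (for example from the parent's own environment or from GetWindowsDirectory). The block layout it produces is the usual one: a sequence of NUL-terminated "NAME=VALUE" strings followed by one extra NUL.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if entry ("NAME=VALUE") defines the variable `name`.
   The comparison ignores case, as Windows does for variable names. */
static int env_entry_has_name(const char *entry, const char *name)
{
    size_t i;
    for (i = 0; name[i] != '\0'; i++) {
        if (tolower((unsigned char)entry[i]) != tolower((unsigned char)name[i]))
            return 0;
    }
    return entry[i] == '=';
}

/* Hypothetical helper: builds "A=1\0B=2\0\0" from `vars`. If no entry
   defines SystemRoot, "SystemRoot=<system_root>" is appended.
   Returns a malloc'ed buffer (caller frees) and stores its size in
   *out_len; returns NULL if the allocation fails. */
char *build_env_block(const char *const *vars, size_t count,
                      const char *system_root, size_t *out_len)
{
    static const char root_name[] = "SystemRoot";
    size_t i, len = 1;              /* 1 for the final terminating NUL */
    int has_root = 0;
    char *block, *p;

    assert(system_root != NULL && out_len != NULL);

    /* First pass: compute the required size and look for SystemRoot. */
    for (i = 0; i < count; i++) {
        len += strlen(vars[i]) + 1;
        if (env_entry_has_name(vars[i], root_name))
            has_root = 1;
    }
    if (!has_root)
        len += strlen(root_name) + 1 + strlen(system_root) + 1;

    block = (char *)malloc(len);
    if (block == NULL)
        return NULL;

    /* Second pass: copy each entry including its NUL terminator. */
    p = block;
    for (i = 0; i < count; i++) {
        size_t n = strlen(vars[i]) + 1;
        memcpy(p, vars[i], n);
        p += n;
    }
    if (!has_root) {
        size_t n = strlen(root_name);
        memcpy(p, root_name, n);
        p += n;
        *p++ = '=';
        n = strlen(system_root) + 1;
        memcpy(p, system_root, n);
        p += n;
    }
    *p = '\0';                      /* extra NUL that ends the block */
    *out_len = len;
    return block;
}
```

The resulting buffer is what would be handed to the lpEnvironment parameter. Note that this sketch produces an ANSI block; a Unicode block would need wide characters and the CREATE_UNICODE_ENVIRONMENT flag.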


RCW Reference Counting Rules != COM Reference Counting Rules

Avoiding COM object leaks in managed applications that make use of COM Interop can be a daunting task. While diligent tracking of COM object references and appropriate usage of Marshal.ReleaseComObject usually works fine, COM Interop is always good for surprises.

While recently tracking down a COM object leak in a COM/.Net-Interop-centric application, I noticed that the CLR did not manage the reference count on my COM object quite as I expected: it incremented the reference count of a COM object when it was passed (from COM) as a method parameter to a callback implemented in .Net, which, of course, contradicts the rules of COM. So while RCWs indeed mostly follow the rules of COM reference counting, they obviously do not follow them in their entirety. Once I spotted this difference, it was easy to find an explanation of this very topic by Ian Griffiths, which is worth quoting [reformatted by me]:

[…]
And by the way, the reference counting is kind of similarish to COM, in that, as you point out, things get addrefed when they are passed to you. But they’re actually not the same. Consider this C# class that implements a COM interface:

public class Foo : ISomeComInterface
{
  public void Spong(ISomeOtherComInterface bar)
  {
    bar.Quux();
  }
}

Suppose that Spong is the only member of ISomeComInterface. (Other than the basic IUnknown members, obviously.) This Spong method is passed another COM interface as a parameter. And let’s suppose that some non-.NET client is going to call this Spong method on our .NET object via COM interop.

The reference counting rules for COM are not the same as those for the RCW in this case.

For COM, the rule here is that the interface is AddRefed for you before it gets passed in, and is Released for you after you return. In other words, you are not required to do any AddRefing or Releasing on a COM object passed to you in this way *unless* you want to keep hold of a reference to it after the call returns. In that case you would AddRef it.

Compare this with the RCW reference count. As with COM, the RCW’s reference count will be incremented for you when the parameter is passed in. But unlike in COM, it won’t be decremented for you automatically when you return.

You could sum up the difference like this:

  • COM assumes you won’t be holding onto the object reference when the method returns
  • The RCW assumes you *will* be holding onto the object reference when the method returns.

So if you don’t plan to keep hold of the object reference, then the method should really look like this:

public void Spong(ISomeOtherComInterface bar)
{
  bar.Quux();
  Marshal.ReleaseComObject(bar);
}

According to the COM rules of reference counting, this would be a programming error. But with RCWs, it’s how you tell the system you’re not holding onto the object after the method returns.

Pretty counter-intuitive… Plus, I am not aware of any official documentation on this topic.
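
The two rules from the quote can be illustrated with a toy model. The following sketch is plain C with a bare counter; it is neither real COM nor real CLR code, and all names in it (TrackedObject, call_with_com_rules, call_with_rcw_rules, and the two callbacks) are made up for illustration. It only mirrors the bookkeeping described above: under the COM rule the count is balanced around the call, under the RCW rule the increment is left for the callee to undo.

```c
#include <assert.h>

/* Hypothetical stand-in for an object whose references are being counted. */
typedef struct {
    int refs;
} TrackedObject;

typedef void (*Callback)(TrackedObject *obj);

/* COM rule: the reference is added before the call and released
   again after the callee returns. */
void call_with_com_rules(TrackedObject *obj, Callback cb)
{
    assert(obj != 0 && cb != 0);
    obj->refs++;
    cb(obj);
    obj->refs--;
}

/* RCW rule as described in the quote: the count is incremented on the
   way in, but nothing decrements it when the callee returns. */
void call_with_rcw_rules(TrackedObject *obj, Callback cb)
{
    assert(obj != 0 && cb != 0);
    obj->refs++;
    cb(obj);
}

/* A callee that just uses the object. */
void callback_plain(TrackedObject *obj)
{
    (void)obj;
}

/* A callee that gives up its reference before returning; this models
   the Marshal.ReleaseComObject call in the second Spong variant. */
void callback_releasing(TrackedObject *obj)
{
    obj->refs--;
}
```

In this model, call_with_rcw_rules combined with callback_plain leaves the count one higher after every call, which is the leak pattern described above; combined with callback_releasing, the count is balanced again.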

Working Around TlbImp’s Cleverness

TlbImp, the .Net tool to create Interop assemblies from COM type libraries, contains an optimization that presumably aims at making the consumption of the Interop assembly easier, but ultimately is a nuisance. Consider the following IDL code:

import "oaidl.idl";
import "ocidl.idl";

[
  uuid( a657ef35-fea1-40ad-86d8-bb7b6085a0a3 ),
  version( 1.0 )
]
library Test
{
  
  [
    object,
    uuid( 84b2f017-b8fe-4c2c-87b8-0587b4bf5507 ),
    version( 1.0 ),
    oleautomation
  ]
  interface IFoo : IUnknown 
  {
    HRESULT Foo();
  }

  [
    object,
    uuid( 13d950d6-beb3-4dd3-957b-88b0e5eb5e3f ),
    version( 1.0 ),
    oleautomation
  ]
  interface IBar : IUnknown 
  {
    HRESULT CreateFoo( 
      [out, retval] IFoo **Foo
      );
  }

  [
    uuid( e01ea769-410c-4915-a48c-3522a8087a52 ),
    noncreatable 
  ]
  coclass Foo
  {
    interface IFoo;
  }

  [
    uuid( dca66832-fe3b-4658-a975-442b5678a9ec )
  ]
  coclass Bar
  {
    interface IBar;
  }
}

Two things are worth noting about this IDL code: First, IBar::CreateFoo is declared to “return” an IFoo*, and second, there is only one coclass implementing IFoo, namely Foo. As a consequence, TlbImp attempts to be clever and assumes that if IBar::CreateFoo “returns” an IFoo*, it really must be a Foo* that is returned. So in the resulting Interop assembly, the IBar interface will look as follows:

[
  ComImport, InterfaceType((short) 1), 
  Guid("13D950D6-BEB3-4DD3-957B-88B0E5EB5E3F"), 
  TypeLibType((short) 0x100)
]
public interface IBar
{
  [return: MarshalAs(UnmanagedType.Interface)]
  [MethodImpl(MethodImplOptions.InternalCall, 
   MethodCodeType=MethodCodeType.Runtime)]
  Foo CreateFoo();
}

Contrary to what the IDL defines, IBar::CreateFoo returns Foo, i.e. a concrete class.

First of all, the assumption underlying this optimization certainly is somewhat flaky as it is not quite in accord with the rules of COM — after all, the implementation of Bar is free to return whatever coclass implementing IFoo seems appropriate and is not limited to returning Foo coclass instances.

While this might not be a real problem in practice, the optimization has another unpleasant effect on the testability of the library consuming the interface: To properly test this library, it might be a good idea to implement mock or stub implementations of IFoo and IBar. Unfortunately, due to the “optimized” return type of IBar::CreateFoo(), this turns out to be not quite easy, as the following code suggests:

class FooStub : IFoo
{
  public void Foo() 
  {}
}

class BarStub : IBar
{
  public Foo CreateFoo()
  {
    // XXX: FooStub implements IFoo, but Foo is required!
    return new FooStub();
  }
}

Workarounds

As pointless as TlbImp’s behavior in this regard might be, it is quite easy to work around this issue.

The first workaround is to define a dummy coclass in the IDL and declare it to implement IFoo as well:

[
  uuid( ed93b3e6-104b-43d6-be34-972d7519bc62 )
]
coclass Dummy
{
  interface IFoo;
}

Now that there are two candidate coclasses, TlbImp will not be able to infer the coclass from the interface and will not be able to apply its optimization. The drawback of this approach is, of course, that the additional Dummy coclass constitutes additional baggage in the IDL and the Interop assembly.

The second option is to forego declaring the interfaces of Foo in the coclass. These declarations mainly serve informative purposes and are not mandatory, so we can omit them. To satisfy MIDL’s requirement of naming at least one interface, we can just put in IUnknown:

[
  uuid( e01ea769-410c-4915-a48c-3522a8087a52 ),
  noncreatable 
]
coclass Foo
{
  interface IUnknown;
}

Again, TlbImp will not be able to apply its optimization and the resulting Interop assembly will contain a proper, mockable, interface IBar.

Needless to say, this approach has its own drawbacks. As a consequence of the missing interface implementation declarations, class Foo (in the Interop assembly) will not contain any methods and is basically useless — you are obliged to use the interface.

More importantly, however, the .Net class Foo will not implement IFoo — so although QueryInterface’ing IFoo from Foo would work, a statement like IBar bar = new BarClass() will now lead to a compiler error:

Cannot implicitly convert type ‘Test.BarClass’ to ‘Test.IBar’. An explicit conversion exists (are you missing a cast?)

Although I consider the second option to be the cleaner approach, it is therefore best used for noncreatable coclasses only.

Error Codes: Win32 vs. HRESULT vs. NTSTATUS

There are three common error code formats used throughout Windows. In the kernel and native part, NTSTATUS is used exclusively. The Win32 API uses its own error codes (they do not really have a name, so I will refer to them as Win32 error codes) and COM uses HRESULTs — though the separation is not always so sharp, e.g. the safe string functions (StringCch* and friends) also return HRESULTs although they do not belong to COM.

HRESULT (From winerror.h)

//
//  HRESULTs are 32 bit values layed out as follows:
//
//   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
//   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
//  +-+-+-+-+-+---------------------+-------------------------------+
//  |S|R|C|N|r|    Facility         |               Code            |
//  +-+-+-+-+-+---------------------+-------------------------------+
//
//  where
//
//      S - Severity - indicates success/fail
//
//          0 - Success
//          1 - Fail (COERROR)
//
//      R - reserved portion of the facility code, corresponds to NT's
//              second severity bit.
//
//      C - reserved portion of the facility code, corresponds to NT's
//              C field.
//
//      N - reserved portion of the facility code. Used to indicate a
//              mapped NT status value.
//
//      r - reserved portion of the facility code. Reserved for internal
//              use. Used to indicate HRESULT values that are not status
//              values, but are instead message ids for display strings.
//
//      Facility - is the facility code
//
//      Code - is the facility's status code
//
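
To make the layout above tangible, here is a small sketch that splits a 32-bit value into the fields of the diagram. It is portable C and deliberately does not use the macros from winerror.h; the type HresultFields and the function decode_hresult are names I made up, and the masks are taken directly from the bit positions shown in the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an HRESULT, following the diagram above. */
typedef struct {
    unsigned severity;  /* S, bit 31: 0 = success, 1 = fail   */
    unsigned reserved;  /* R, C, N, r: bits 30..27 as a group */
    unsigned facility;  /* bits 26..16                        */
    unsigned code;      /* bits 15..0                         */
} HresultFields;

/* Hypothetical helper: splits a raw 32-bit HRESULT into its fields. */
HresultFields decode_hresult(uint32_t hr)
{
    HresultFields f;
    f.severity = (hr >> 31) & 0x1u;
    f.reserved = (hr >> 27) & 0xFu;
    f.facility = (hr >> 16) & 0x7FFu;
    f.code     = hr & 0xFFFFu;
    assert(f.severity <= 1u);
    return f;
}
```

Applied to the 0x80090006 value from the SystemRoot post above, this yields severity 1 (fail), facility 9, and code 6, with all reserved bits clear.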

NTSTATUS and Win32 error codes (From Winerror.h or ntstatus.h)

NTSTATUS* and Win32 error codes share the same definition:

//
//  Values are 32 bit values layed out as follows:
//
//   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
//   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
//  +---+-+-+-----------------------+-------------------------------+
//  |Sev|C|R|     Facility          |               Code            |
//  +---+-+-+-----------------------+-------------------------------+
//
//  where
//
//      Sev - is the severity code
//
//          00 - Success
//          01 - Informational
//          10 - Warning
//          11 - Error
//
//      C - is the Customer code flag
//
//      R - is a reserved bit
//
//      Facility - is the facility code
//
//      Code - is the facility's status code
//

In user mode, these codes are primarily encountered as SEH exception codes (e.g. EXCEPTION_ACCESS_VIOLATION, 0xC0000005) or return values. However, for compatibility reasons, the common error codes defined in winerror.h (such as ERROR_FILE_NOT_FOUND, 0x2) do not quite adhere to this definition: they neither set Severity to 0y11 nor set their facility code to FACILITY_WIN32. Unsurprisingly, they are the same as in OS/2 (see DosExecPgm as an example).
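
The following sketch decodes a value according to the NTSTATUS/Win32 layout shown above and makes the compatibility problem visible. It is portable C; NtStatusFields and decode_ntstatus are names I made up, and the masks simply follow the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an NTSTATUS-style value, following the diagram above. */
typedef struct {
    unsigned severity;  /* Sev, bits 31..30: 0 success, 1 informational,
                           2 warning, 3 error  */
    unsigned customer;  /* C, bit 29           */
    unsigned reserved;  /* R, bit 28           */
    unsigned facility;  /* bits 27..16         */
    unsigned code;      /* bits 15..0          */
} NtStatusFields;

/* Hypothetical helper: splits a raw 32-bit value into its fields. */
NtStatusFields decode_ntstatus(uint32_t value)
{
    NtStatusFields f;
    f.severity = (value >> 30) & 0x3u;
    f.customer = (value >> 29) & 0x1u;
    f.reserved = (value >> 28) & 0x1u;
    f.facility = (value >> 16) & 0xFFFu;
    f.code     = value & 0xFFFFu;
    assert(f.severity <= 3u);
    return f;
}
```

Decoding 0xC0000005 gives severity 3 (error), facility 0, and code 5, as expected. Decoding the plain value 2 (ERROR_FILE_NOT_FOUND) gives severity 0, i.e. formally a "success" value with facility 0, which is exactly why these compatibility codes cannot be interpreted by the layout alone.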

Another unfortunate property of these Win32 error codes is that no typedef for them exists. In fact, some APIs such as RegOpenKeyEx treat them as signed (LONG), while others such as GetLastError treat them as unsigned (DWORD). Again, this is probably for compatibility reasons.

So how compatible are those codes? Comparing the structure of HRESULTs and NTSTATUS/Win32 error codes, it is worth noting that HRESULTs explicitly allow for holding NTSTATUS values (Informational NTSTATUS values become success HRESULTs, Warning NTSTATUS values become failure HRESULTs). Even the other way round, assigning HRESULT values to NTSTATUS variables seems to be ok, given that the R, C, N and r bits of HRESULTs are usually 0.

So on a syntactic level, assigning NTSTATUS values to HRESULTs and vice versa seems to be correct. But let us have a look at the facility codes:

NTSTATUS (ntstatus.h):

#define FACILITY_DEBUGGER                0x1
#define FACILITY_RPC_RUNTIME             0x2
#define FACILITY_RPC_STUBS               0x3
#define FACILITY_IO_ERROR_CODE           0x4
[...]

Win32 error codes and HRESULT(winerror.h):

#define FACILITY_RPC                     1
#define FACILITY_DISPATCH                2
#define FACILITY_STORAGE                 3
#define FACILITY_ITF                     4
[...]

Having the same format, NTSTATUS and Win32 error codes could be expected to use the same facility codes. However, this is not the case. Instead, Win32 error codes (according to winerror.h) use the facility values of HRESULTs! As a consequence, interchanging NTSTATUS and Win32 error codes is syntactically ok but changes their semantics due to non-matching facility codes.

With this background in mind, it is now possible to define a conversion matrix:

To \ From   NTSTATUS                                                                    Win32        HRESULT
NTSTATUS    -                                                                           Yes [1, 2]   Yes [1]
Win32       LsaNtStatusToWinError() or HRESULT_FROM_NT() [1, 4]                         -            Yes [3]
HRESULT     HRESULT_FROM_WIN32( LsaNtStatusToWinError() ) or HRESULT_FROM_NT() [1, 4]   Yes [2]      -

[1] Facility may need to be adapted
[2] Holds for ‘real’ Win32 error codes. For compatibility error codes, use HRESULT_FROM_WIN32
[3] As long as you have a ‘real’ HRESULT (i.e. not one from HRESULT_FROM_WIN32) and want to get a ‘real’ Win32 error code (i.e. not a compatibility one); otherwise it can get tricky
[4] Note that HRESULT_FROM_NT does not take the NT Status to Win32 Error Code conversion table into account, thus the result may not be what one would expect. Using LsaNtStatusToWinError takes this table into account, but yields a ‘compatibility’ Win32 error code.
* It turns out that the NTSTATUS documentation in the DDK contradicts the definition in ntstatus.h (3790): According to winerror.h, bit 28 is reserved whereas the DDK counts it as part of the facility field (Which, I guess, is wrong).
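
For reference, the two conversion macros mentioned in the matrix boil down to simple bit manipulation. The sketch below re-implements them in portable C as I understand their definitions in winerror.h; the sketch_ prefixed names are mine, so treat this as an illustration of what the macros do to the bits rather than a replacement for them.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FACILITY_WIN32   7u           /* value of FACILITY_WIN32  */
#define SKETCH_FACILITY_NT_BIT  0x10000000u  /* the N bit of an HRESULT  */

/* Mirrors HRESULT_FROM_WIN32: zero and values that already have the
   failure bit set pass through unchanged; everything else is wrapped
   as a failure HRESULT with facility FACILITY_WIN32. */
uint32_t sketch_hresult_from_win32(uint32_t err)
{
    if (err == 0u || (err & 0x80000000u) != 0u)
        return err;
    return (err & 0xFFFFu) | (SKETCH_FACILITY_WIN32 << 16) | 0x80000000u;
}

/* Mirrors HRESULT_FROM_NT: the status is kept as is and only the N bit
   is set to mark the value as a mapped NT status. */
uint32_t sketch_hresult_from_nt(uint32_t status)
{
    uint32_t hr = status | SKETCH_FACILITY_NT_BIT;
    assert((hr & SKETCH_FACILITY_NT_BIT) != 0u);
    return hr;
}
```

With these definitions, ERROR_FILE_NOT_FOUND (2) becomes 0x80070002, and the NTSTATUS 0xC0000005 becomes 0xD0000005. The second result illustrates footnote 4: no translation table is consulted, the status value is merely tagged.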

Determining the apartment of a thread

There are situations in which it would be convenient to list which apartment the threads of a process belong to. In case of managed debugging, the !threads command provided by SOS gives this info:

PreEmptive   GC Alloc               Lock
ID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
0   688 00149528      6020 Enabled  00000000:00000000 00159e68     0 STA
1   f70 00165548      b220 Enabled  00000000:00000000 00159e68     0 MTA (Finalizer)

In case of unmanaged debugging, however, no such command exists (at least to my knowledge). So the first question is how the apartment-information can be retrieved for a given thread.

Knowing that calling CoInitializeEx( NULL, COINIT_APARTMENTTHREADED ) followed by a CoInitializeEx( NULL, COINIT_MULTITHREADED ) yields an error (which implies that code checking which apartment the thread is currently in is executed), I decided to write up a test program and step through the second CoInitializeEx-call.

I would have expected to find the information stored in some TLS slot; however, this is not the case. Instead, the TEB structure contains a field dedicated to OLE:

typedef struct _TEB
{
	/*...*/
	PVOID           ReservedForOle;
	/*...*/
} TEB, *PTEB;

As a side note: while dedicating a separate field to OLE may have its advantages, it actually violates the idea of layering. OLE/COM is layered above NT; NT should not even know about COM/OLE’s existence and thus should not reserve a field for COM/OLE. As such, using TLS would have been the cleaner choice. But I digress…

While identifying this field within the TEB is straightforward, it is totally undocumented which structure this field points to. From the disassembly, it is visible that the apartment type is stored in a flag field at offset 0xC. Fortunately, others have written about this before and have worked out the flag values of this field. Of course, there is no guarantee that the values and the offset will not change in future releases of Windows. All I can currently say is that the implementation works fine on WinXP x86. Given this information, I was able to code up a WinDBG debugging extension that offers me the information I was looking for:

0:008> ~*e !apt
Thread 0x0000057C Apartment: STA
Thread 0x0000053C Apartment: Not a COM thread
Thread 0x0000056C Apartment: Not a COM thread
Thread 0x00000538 Apartment: Unknown (Unrecognized flags)
Thread 0x00000568 Apartment: Not a COM thread
Thread 0x00000524 Apartment: STA
Thread 0x00000558 Apartment: MTA
Thread 0x00000550 Apartment: MTA

Threads for which the ReservedForOle pointer is NULL are reported as ‘Not a COM thread’. There are, however, threads for which the pointer is non-NULL, yet the aforementioned flag field contains the value 0x00000001, which cannot be identified as STA, MTA, or TNA. They are thus reported as ‘Unknown’.

The following listing shows the code for retrieving the information I used within the debugging extension.

#define OLE_STA_MASK   0x080    // Bugslayer, MSJ 10/99
#define OLE_MTA_MASK   0x140    // Bugslayer, MSJ 10/99
#define OLE_TNA_MASK   0x800    // http://members.tripod.com/IUnknwn

// No trailing semicolons here: the macros are used inside return statements.
#define JPDBGEXT_E_DEBUGEE_ERROR MAKE_HRESULT( 1, FACILITY_ITF, 0x200 )
#define JPDBGEXT_E_UNKNOWN_APT   MAKE_HRESULT( 1, FACILITY_ITF, 0x201 )

typedef struct _OLE_INFORMATION
{
    CHAR Padding[ 0xC ];
    DWORD Apartment;
} OLE_INFORMATION;

HRESULT JpDbgExtpGetThreadTebBaseAddress(
    __in HANDLE hThread,
    __out DWORD *pdwBaseAddress
    )
{
    THREAD_BASIC_INFORMATION threadInfo;
    DWORD retLen;
    NTSTATUS status;

    _ASSERTE( hThread );
    _ASSERTE( pdwBaseAddress );

    status = NtQueryInformationThread(
        hThread,
        ThreadBasicInformation,
        &threadInfo,
        sizeof( THREAD_BASIC_INFORMATION ),
        &retLen );
    if ( STATUS_SUCCESS != status )
    {
        return HRESULT_FROM_NT( status );
    }

    *pdwBaseAddress = * ( DWORD* ) &threadInfo.TebBaseAddress;
    return S_OK;
}

HRESULT JpDbgExtpGetApartmentType(
    __in HANDLE hThread,
    __out APARTMENT_TYPE *pApt
    )
{
    DWORD dwTebBaseAddress = 0;
    PVOID pOleAddress = 0;
    OLE_INFORMATION oleInfo;
    HRESULT hr = E_UNEXPECTED;
    TEB debugeeTeb;

    _ASSERTE( hThread );
    _ASSERTE( pApt );

    //
    // Get the debugee thread's TEB.
    //
    hr = JpDbgExtpGetThreadTebBaseAddress( hThread, &dwTebBaseAddress );
    if ( FAILED( hr ) )
    {
        return hr;
    }

    if ( ! ReadMemory(
        dwTebBaseAddress,
        &debugeeTeb,
        sizeof( TEB ),
        NULL ) )
    {
        return JPDBGEXT_E_DEBUGEE_ERROR;
    }

    //
    // Reach into the TEB and read OLE information.
    //
    pOleAddress = debugeeTeb.ReservedForOle;

    if ( pOleAddress == NULL )
    {
        //
        // Not a COM thread.
        //
        *pApt = APARTMENT_TYPE_NONE;
        hr = S_OK;
    }
    else
    {
        DWORD dwOleAddress = * ( DWORD* ) &pOleAddress;

        //
        // COM thread, get apartment
        //
        if ( ! ReadMemory(
            dwOleAddress,
            &oleInfo,
            sizeof( OLE_INFORMATION ),
            NULL ) )
        {
            return JPDBGEXT_E_DEBUGEE_ERROR;
        }

        if ( oleInfo.Apartment & OLE_STA_MASK )
        {
            *pApt = APARTMENT_TYPE_STA;
            hr = S_OK;
        }
        else if ( oleInfo.Apartment & OLE_MTA_MASK )
        {
            *pApt = APARTMENT_TYPE_MTA;
            hr = S_OK;
        }
        else if ( oleInfo.Apartment & OLE_TNA_MASK )
        {
            *pApt = APARTMENT_TYPE_TNA;
            hr = S_OK;
        }
        else
        {
            *pApt = APARTMENT_TYPE_UNKNOWN;
            hr = S_OK;
        }
    }

    return hr;
}






About me

Johannes Passing, M.Sc., living in Berlin, Germany.

Besides his consulting work, Johannes mainly focusses on Win32, COM, and NT kernel mode development, along with Java and .Net. He also is the author of cfix, a C/C++ unit testing framework for Win32 and NT kernel mode, Visual Assert, a Visual Studio Unit Testing-AddIn, and NTrace, a dynamic function boundary tracing toolkit for Windows NT/x86 kernel/user mode code.

Contact Johannes: jpassing (at) acm org

Johannes' GPG fingerprint is BBB1 1769 B82D CD07 D90A 57E8 9FE1 D441 F7A0 1BB1.
