.NET Fusion

13. Unmanaged Assemblies

Isn't this a misnomer? We have learned all the way through this workshop that managed code is distributed in assemblies, so this means that assemblies must be managed, right? Fusion was created to solve the problem of DLL Hell, but much of the operating system is still unmanaged, so there must be some solution for DLL Hell for unmanaged libraries. This is the reason for unmanaged assemblies. They are not .NET assemblies, but they have similarities: they can be made up of more than one file, they are deployed as a single unit, and they have information about the files and resources in the assemblies. Unmanaged assemblies can be used to allow you to share multiple versions of DLLs between applications and to do this these 'assemblies' reside in a folder that is the native code equivalent of the .NET GAC. Finally, this cache can also contain information about version redirects, so the request for a DLL in one assembly can be directed to a different version of the DLL in another unmanaged assembly. Since this means that more than one version of an assembly can co-exist, they are also known as side-by-side assemblies.

Unmanaged side-by-side assemblies are important in this workshop because version 14 of the managed C++ compiler (supplied with Visual Studio 2005) will create .NET assemblies that use native libraries and by default, the Microsoft unmanaged libraries will be loaded from the side-by-side cache. On this page I will first describe unmanaged side-by-side assemblies, and then I will explain how this affects managed assemblies created by the C++ compiler.

13.1 Manifests and Side-by-Side Assemblies

First we need to go through a bit of terminology. The first term is an isolated application. This is a process that uses libraries, and the description isolated means that the application is unaffected by the installation or uninstallation of other assemblies. To do this the isolated application must use shared assemblies and private assemblies. In most cases an isolated application is just a normal process; the only difference is that it has a file called an application manifest that contains the dependency information about the assemblies it uses. A shared side-by-side assembly is one or more DLLs that can be used by any application on the machine. These are stored under the WinSxS folder in separate folders so that multiple versions of an assembly can co-exist, hence the term side-by-side. An administrator or a publisher can use policy files to redirect an application's request for an assembly to another version of the same assembly. A private assembly is a collection of DLLs that is deployed with an application. Such an assembly resides in the application folder and so is not available to any other application, and it will not be replaced when another application installs newer versions of the libraries.

Side-by-side assemblies and private assemblies are described by a manifest. There is one manifest for each assembly and it is made up of XML data that contains information about the DLLs, COM objects and other resources that are used by the assembly. (Note that when you use a manifest to specify COM objects and interfaces it means that you will not need to register the COM server with the system. Indeed, it is recommended that such COM servers do not have self-registration code.) The manifest for a shared assembly must be in an XML file; the manifest for an application can be in an XML file or bound as an unmanaged resource. The manifest for a private assembly must be bound as an unmanaged resource to a DLL in the assembly that has the same name as the assembly.

The documentation says that the manifest for a private assembly can be in an XML manifest file. However, this is not correct: a manifest for a private assembly must be bound as an unmanaged resource.

Private assemblies must be installed in the application's folder; however, as in .NET, they can be in a subfolder that has the name of the assembly or the locale of the assembly. A shared assembly can only be installed in the side-by-side cache and it must be installed by Windows Installer. Furthermore, so that there is no possibility of a name collision with another assembly, a shared assembly must have a publicKeyToken and it must be signed so that the DLL loader can verify that it has not been altered since installation.

The manifest contains the versioning information of the resources it uses, and this versioning can be redirected using a publisher configuration file or an application configuration file.

When an application loads a library the request goes to the side-by-side manager. This will always search the side-by-side cache first for the shared assembly. If an appropriate shared assembly cannot be found, the side-by-side manager searches for a private assembly in the application's folder. To do this, the manager must get information about the assembly through the manifest and, contrary to the documentation, it will only search for manifests bound to libraries. Specifically, it will search for a DLL with the same name as the assembly and it will look for an RT_MANIFEST resource with an ID of 2.

The search mechanism works like this. First the manager searches for a subfolder with the culture name. If the assembly DLL does not exist there, the manager will search in a subfolder of the culture folder which has the name of the assembly. If there is no locale subfolder, the manager will search the application's folder for the assembly DLL, and if this cannot be found it will search for a subfolder of the application folder which has the assembly name. The side-by-side manager will use fallback when it searches for localized assemblies. That is, the search will first be for a culture folder (e.g. en-GB) and if that does not exist, it will search for a language folder (e.g. en) and finally it will perform the search for a language-neutral assembly. You will recognise these steps as precisely the way that Fusion looks for assemblies.
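The probing order just described can be sketched in portable C++. This is only an illustration of the sequence as this page describes it, not the loader's real algorithm. The function names cultureFallback and probePaths, and the folder used in the usage note, are made up for the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the fallback chain from the text: the specific culture (en-GB),
// then the language (en), then language neutral (the empty string).
std::vector<std::string> cultureFallback(const std::string& culture)
{
    std::vector<std::string> chain;
    if (!culture.empty()) {
        chain.push_back(culture);
        const std::string::size_type dash = culture.find('-');
        if (dash != std::string::npos)
            chain.push_back(culture.substr(0, dash));
    }
    chain.push_back("");   // language neutral: the application folder itself
    return chain;
}

// Lists, in order, the places where the text says the assembly DLL is sought.
// appFolder, assembly and culture are hypothetical inputs for illustration.
std::vector<std::string> probePaths(const std::string& appFolder,
                                    const std::string& assembly,
                                    const std::string& culture)
{
    assert(!assembly.empty());
    const std::vector<std::string> chain = cultureFallback(culture);
    std::vector<std::string> paths;
    for (std::vector<std::string>::size_type i = 0; i < chain.size(); ++i) {
        const std::string base =
            chain[i].empty() ? appFolder : appFolder + "\\" + chain[i];
        // First the folder itself, then a subfolder named after the assembly.
        paths.push_back(base + "\\" + assembly + ".dll");
        paths.push_back(base + "\\" + assembly + "\\" + assembly + ".dll");
    }
    return paths;
}
```

Calling probePaths("C:\\app", "lib", "en-GB") yields six candidates, starting with C:\app\en-GB\lib.dll and ending with C:\app\lib\lib.dll.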

Windows describes the concept of application context. This contains information about the versioning of the resources in an assembly and the context API allows you to redirect binding. When a process is started with CreateProcess Windows looks for an application manifest and uses this to initialize the application context. Any objects used by the application are mapped to versioned objects as specified by this manifest.

As mentioned before, there are various configuration files used in isolated applications. The default configuration is held in the application's manifest (an XML file with the application's full name and the extension .manifest, for example, app.exe.manifest) and in the manifests of the assemblies the application uses (bound as a resource to a DLL with the assembly name). This configuration can be overridden by publisher policy files, and by an application configuration file. The publisher policy file is for a shared assembly and it redirects versioning of an assembly to another version. The name of a policy file is similar to the .NET policy files, that is policy.<major>.<minor>.<assembly>, where <major> and <minor> refer to the version being redirected, for example policy.1.0.lib. However, unlike .NET, such versioning redirects can only be for small versioning changes: you cannot redirect to a version that differs by the major or minor value, you can only change the build or revision version. You can override the versioning redirects given by a publisher policy for a specific application using an application configuration file. This file has the name of the application with the extension .config (for example app.exe.config) and is installed in the application's folder.
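As a rough sketch of the two rules in the paragraph above (the policy file naming pattern, and the restriction that a redirect may only change the build or revision), the following portable C++ may help. The Version struct and the three function names are invented for this illustration and are not part of any Windows API.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// A four-part version number, as used in manifests (major.minor.build.revision).
struct Version {
    unsigned majorPart, minorPart, build, revision;
};

// Parses "a.b.c.d"; returns false if the text is not exactly four numeric parts.
bool parseVersion(const std::string& text, Version& out)
{
    char extra;
    return std::sscanf(text.c_str(), "%u.%u.%u.%u%c",
                       &out.majorPart, &out.minorPart,
                       &out.build, &out.revision, &extra) == 4;
}

// Builds policy.<major>.<minor>.<assembly> for the version being redirected.
std::string policyFileName(const Version& redirected, const std::string& assembly)
{
    assert(!assembly.empty());
    std::ostringstream name;
    name << "policy." << redirected.majorPart << '.' << redirected.minorPart
         << '.' << assembly;
    return name.str();
}

// A redirect is acceptable only when major and minor are unchanged.
bool redirectAllowed(const Version& from, const Version& to)
{
    return from.majorPart == to.majorPart && from.minorPart == to.minorPart;
}
```

For version 1.0.0.0 of an assembly called lib, policyFileName gives policy.1.0.lib, matching the example in the text. A redirect from 1.0.0.0 to 1.0.2.5 passes redirectAllowed, while one to 1.1.0.0 does not.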

13.2 Unmanaged Assembly Versioning

To see how this works, here is an unmanaged DLL:

#include <stdio.h>

extern "C" __declspec(dllexport) void MyFunction()
{
   printf("called library\n");
}

Compile this to use the DLL version of the C runtime library:

cl /LD /MD lib.cpp

When you use a system shared assembly, like the C runtime library, the unmanaged C++ compiler will automatically create a manifest for the DLL. List the contents of the folder and here you'll see a file called lib.dll.manifest.

<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<assembly xmlns='urn:schemas-microsoft-com:asm.v1' manifestVersion='1.0'>
 <dependency>
  <dependentAssembly>
   <assemblyIdentity type='win32' name='Microsoft.VC80.CRT' version='8.0.50608.0'
     processorArchitecture='x86' publicKeyToken='1fc8b3b9a1e18e3b' />
  </dependentAssembly>
 </dependency>
</assembly>

In fact, this is created by the linker which is invoked by the compiler. You can suppress the manifest generation using the /manifest:no linker switch and you can specify the name of the manifest file with the /manifestfile switch. The linker finds information about system side-by-side assemblies from information in the static import libraries for the files in the assembly. This means that a particular static .lib file is fixed to a specific version of a DLL, which is a good thing. If you use a side-by-side assembly other than the system assemblies you will have to write the entries for those assemblies.  You can tell the linker to add an <assemblyIdentity> element using the /manifestdependency linker switch. However, as you'll find out later, it might be a good idea to eschew the automatic generation of a manifest file altogether and write your own.

The schema of the manifest file is similar to, but not the same as, the configuration file schema used in .NET. The difference is the top two elements. The root element is called <assembly> and is similar to the .NET <assemblyBinding> element; however, it has a manifestVersion attribute that is mandatory and must be set to 1.0 (this is the version of the schema). Under that element is another element called <dependency> which is a collection of one or more <dependentAssembly> elements. This is similar to .NET configuration files in that it describes an assembly. Each <dependentAssembly> must contain an <assemblyIdentity> element. This element has various attributes, which are similar to their equivalents in the .NET configuration file schema. The type attribute is mandatory and must be win32.

A private library's manifest must be bound to the DLL as an unmanaged resource. Since the manifest file is created by the linker, you have to embed the manifest as a post-link step. To do this you use the mt.exe tool to insert the manifest as an RT_MANIFEST resource with an ID of 2. Note that it must be a value of 2 (whereas an application's manifest resource must have an ID of 1). The mt.exe tool is poorly documented, and the MSDN entry does not even list the command line switches to perform this action (although mt /? will list them). To embed a resource you need to pass the name of the manifest file to mt using the /manifest switch and use the /outputresource switch with the name of the PE file and the resource ID in the form: <pe file name>;#<resource id>. This will alter the library, and so you should bear this in mind when we talk about managed C++ later on this page.

For example, embed the manifest using the following:

mt /manifest lib.dll.manifest /outputresource:lib.dll;#2

Be careful when you type this because there is no colon between /manifest and its parameter, but there has to be a colon between /outputresource and its parameter. It looks as though this tool has been developed rather sloppily, bit by bit, by different teams. While I can understand this happens, I cannot understand how it gets past code review.
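Because the switch syntax is easy to get wrong, a build script might generate the command line rather than have someone type it. The following is a minimal sketch in portable C++ that only builds the string shown above. The ImageKind enum and the mtEmbedCommand function are hypothetical names, and the sketch assumes the manifest file is named after the PE file with .manifest appended, as the linker does in this walkthrough.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Resource IDs given in the text: 1 for an application, 2 for a library.
enum ImageKind { Application = 1, Library = 2 };

// Builds the mt.exe command line that embeds <peFile>.manifest into peFile.
// Note the asymmetry: no colon after /manifest, a colon after /outputresource.
std::string mtEmbedCommand(const std::string& peFile, ImageKind kind)
{
    assert(!peFile.empty());
    std::ostringstream cmd;
    cmd << "mt /manifest " << peFile << ".manifest"
        << " /outputresource:" << peFile << ";#" << static_cast<int>(kind);
    return cmd.str();
}
```

For example, mtEmbedCommand("lib.dll", Library) produces exactly the command used above, and mtEmbedCommand("app.exe", Application) produces the equivalent command with a resource ID of 1.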

Now you can load the library in VS.NET 2005. Use File, Open, File and select lib.dll; this will automatically open the file in resource view. Open the RT_MANIFEST node and double click on the single item, 2. You will see the contents of the manifest.

There is another way to do the same thing. Delete the library (del lib.dll) and the manifest file (del lib.dll.manifest) so that you have a clean slate. Now rebuild the library (cl /LD /MD lib.cpp), which will create the manifest file. Now create a resource script, lib.rc, that uses the library manifest:

#include <winuser.h>
2 RT_MANIFEST lib.dll.manifest

Compile this file to create the resource:

rc lib.rc

Now link the library again, this time using the resource. The .obj file should still be in the build folder.

link lib.obj lib.res /DLL /manifest:no /out:lib.dll

The /manifest:no switch tells the linker not to generate the manifest file. It seems a bit odd to link the file twice but in the absence of a tool that will generate manifest files this is your only option. Just to show how messed up the whole process is, you can use the /manifestdependency switch on the linker command line to tell the linker to add the supplied parameter to the <dependency> of the manifest file it will create. This switch, of course, has no effect whatsoever on the output target of the linker, which is, of course, the reason why you invoked the linker. Crazy.

The whole process of manifests and embedding them is topsy-turvy and is a complete mess. This is yet another example of Microsoft losing the plot. For a start, the MSDN library does not even document the switches to use; you have to use mt /? to get a list of the switches. Then, there is the inconsistency in the switch format: the /manifest switch must not have a colon between it and its parameter, but /outputresource must have the colon. Furthermore, you have to link the output to get the manifest. The output from the linker is not usable because you have to perform extra steps after the output has been linked. Invoking the linker twice, once to generate the manifest file, and a second time to embed a resource, is perverse. The alternative, of altering the output of the linker with an extra tool (mt.exe), is even worse, because it is an admission from Microsoft that their linker is inadequate. Much as I like the changes to the managed C++ language in VS.NET 2005, I do think that they should have devoted less time to changing the C++ language and spent more time getting the tools right.

Here's a simple user of the library:

#include <stdio.h>

#pragma comment(lib, "lib.lib")

extern "C" void MyFunction();

int main()
{
   printf("calling library\n");
   MyFunction();
}

Here, I have used a pragma to indicate the import library for the DLL that the process will use; this means that I don't have to call the linker explicitly. You can compile this code with

cl app.cpp

(Note that this will statically link the C runtime library.) Run this process and confirm that it works. It works because the code indicates that it uses lib.dll and the DLL loader will load this library from the application folder. Then the DLL loader will see that the DLL uses msvcr80.dll. The DLL loader will try to load this DLL from the side-by-side cache, using the assembly manifest information to pick the right version. Let's prove that this is the case.

Rename the library so that we can use it later (rename lib.dll lib.withmanifest). Now compile the library without an embedded manifest:

cl /LD /MD lib.cpp

Now run the application again. This time it fails to start and Windows shows an error dialog.

The dialog indicates that the DLL loader has not found the DLL and so has given up. Now, you know that you have installed Visual C++ correctly on your machine, so msvcr80.dll must be on your machine. This seems to indicate that your application has broken Windows.

Even though the documentation says that the DLL loader will look for a manifest file if it cannot find an embedded resource, this example has shown that it does not do this. You must embed the manifest in the DLL.

I think it is appalling that simple code like this will not work straight from the compiler. At least the compiler/linker could warn you. For example, it would be helpful if the linker issued a warning if there isn't an RT_MANIFEST resource. Or maybe the compiler could add a function in the DLL entry point that tests for the resource and throws an exception if one does not exist. This is definitely a bug.

Now return to the library with the embedded manifest. To do this, rename the library without a manifest (rename lib.dll lib.withoutmanifest) and restore the library you made before (rename lib.withmanifest lib.dll). Now compile the application to use the DLL version of the CRT library:

cl /MD app.cpp

You will find that this time the linker will create a manifest file for the application with the same contents as the file generated for the library. Run the application. You will find that it will work. The reason is that the DLL loader will use the application manifest file to get information about the version of the CRT assembly; it does not require that you embed this manifest as an unmanaged resource, as is the case with DLLs. If you do want to embed the manifest file you can use the same procedures that I specified above, but the resource ID should be 1.

Finally, go back to the library without a manifest (delete lib.dll then rename lib.withoutmanifest lib.dll). So now you will have a process and a library that both use the DLL version of the CRT, but neither of these will have an embedded manifest. However, there is an application manifest file. Run the application. This time you will find that the application will run, which means that when the DLL loader loads the library it uses information provided by the application's manifest file.
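The four experiments in this section can be condensed into one small rule. The sketch below, in portable C++, merely records the outcomes observed in this walkthrough; it is not a model of the real loader, and the Deployment struct and the two function names are invented for the illustration.

```cpp
#include <cassert>

// The three facts that varied between the experiments above.
struct Deployment {
    bool libraryUsesCrtDll;           // library built with /MD
    bool libraryHasEmbeddedManifest;  // RT_MANIFEST resource with ID 2 in lib.dll
    bool applicationHasManifest;      // app.exe.manifest file (or RT_MANIFEST ID 1)
};

// True when, going by the experiments, the loader finds the CRT for the library.
// A loose lib.dll.manifest file next to the DLL did not count in these tests.
bool crtResolves(const Deployment& d)
{
    if (!d.libraryUsesCrtDll)
        return true;  // nothing to resolve
    return d.libraryHasEmbeddedManifest || d.applicationHasManifest;
}

// Replays the four experiments in the order they were run.
void verifyAgainstWalkthrough()
{
    const Deployment embeddedOnly     = { true, true,  false };  // ran
    const Deployment noManifestAtAll  = { true, false, false };  // error dialog
    const Deployment embeddedPlusApp  = { true, true,  true  };  // ran
    const Deployment appManifestOnly  = { true, false, true  };  // ran
    assert(crtResolves(embeddedOnly));
    assert(!crtResolves(noManifestAtAll));
    assert(crtResolves(embeddedPlusApp));
    assert(crtResolves(appManifestOnly));
}
```

The point of writing it down this way is that the application manifest acts as a default for every library the process loads, whereas a library's own information only counts when it is embedded.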

13.3 The Side-By-Side Cache

So where is the side-by-side cache located? Move to the windows folder (cd %windir%) and list the contents. Here you'll find a folder called WinSxS. This is the side-by-side cache. List the contents of this folder. You'll find that there are lots of folders with long names and just three that have short names. These latter three are: InstallTemp, Manifests and Policies. The contents of these folders are altered by Windows Installer, but you are able to have a poke around. InstallTemp is just a temporary folder used by the installer. Manifests is far more interesting. Move to this folder and list its contents. This contains binary files (called verification catalogues) which have the extension .cat and text files with the extension .manifest. Searching through these you'll find a file that seems to correspond to the CRT library, so print it to the console:

type x86_Microsoft.VC80.CRT_1fc8b3b9a1e18e3b_8.0.50727.42_x-ww_0de06acd.manifest

The contents of this file are similar to the manifest file you created for the library and the application. However, this is a manifest file for a shared assembly. Here is an edited version:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
   <noInheritable/>
   <assemblyIdentity type="win32" name="Microsoft.VC80.CRT" version="8.0.50727.42"
      processorArchitecture="x86" publicKeyToken="1fc8b3b9a1e18e3b"/>
   <file name="msvcr80.dll"/>
   <file name="msvcp80.dll"/>
   <file name="msvcm80.dll"/>
</assembly>

This manifest file is not valid because I have edited it to make it more readable. The important point is that there is no <dependency> element; instead the <assemblyIdentity> element is directly under the root. This element is used to give the identity of the assembly. Note that it gives the version of the assembly as 8.0.50727.42. The assembly also has a publicKeyToken, which is used to prevent name collisions. The <file> elements are the DLLs that make up the assembly, so if unmanaged code requests any of these three DLLs the DLL loader will use this assembly manifest. The data I have deleted is information about code signing, which is performed so that a manifest can be validated before it is used. The assembly and manifest are signed by hashing the files and then creating a digital signature by encrypting the hash with the private key from the publisher's certificate. The public key from the certificate is also used to create the publicKeyToken in the assembly name. The operating system will have access to the public key in the certificate and the digital signature in the security catalog (the .cat file), so it can decrypt the digital signature and generate the hash over the file. The operating system can then compare the two hashes to validate the assembly and the manifest file.

Notice that the version of the CRT assembly is 8.0.50727.42, however, the manifest for the library and process we built specified that they required version 8.0.50608.0. There is a version mismatch!

I think this is pretty shoddy. I have Visual Studio 2005 that has just been released. The machine has never had a beta of Visual Studio 2005 on it, so there is no possibility of an earlier build of the CRT version 8.0 being installed on it. Yet the C++ tools (through the static import libraries) think that they are using an earlier version of the C runtime library than the Visual Studio installer has put on my machine.

So why did the application work?

Move up a directory and move to the Policies folder (cd ..\Policies). Here you will find a folder which has the name and publicKeyToken for the C runtime library, but not the complete version:

x86_policy.8.0.Microsoft.VC80.CRT_1fc8b3b9a1e18e3b_x-ww_77c24773

Move to this folder and list the contents. Within here is a binary verification catalog (8.0.50727.42.cat) and a policy file (8.0.50727.42.policy), with the version of the assembly that is installed on the machine. Type this file to the console:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright © 1981-2001 Microsoft Corporation -->
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
   <assemblyIdentity type="win32-policy" name="policy.8.0.Microsoft.VC80.CRT"
      version="8.0.50727.42" processorArchitecture="x86"
      publicKeyToken="1fc8b3b9a1e18e3b"/>
   <dependency>
      <dependentAssembly>
         <assemblyIdentity type="win32" name="Microsoft.VC80.CRT"
            processorArchitecture="x86" publicKeyToken="1fc8b3b9a1e18e3b"/>
         <bindingRedirect oldVersion="8.0.41204.256-8.0.50608.0" newVersion="8.0.50727.42"/>
      </dependentAssembly>
   </dependency>
</assembly>

This gives complete information identifying the assembly installed on this machine (version 8.0.50727.42) and then, confusingly, it identifies 'dependent assemblies', with rebinding information to direct requests for all versions of the CRT library between 8.0.41204.256 and 8.0.50608.0 so that the DLL loader uses version 8.0.50727.42 instead. This file is for a specific version of an assembly which says that any requests for any other version should be directed to this version. This is subtly different (and in my opinion, confusingly so) to rebinding in .NET where the configuration file gives information for any version of an assembly and indicates that requests for one version should be redirected to another version.

This is why the application worked. The application requests the CRT library, the manifest indicates that it wants version 8.0.50608.0 and the policy file redirects this request to 8.0.50727.42.
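To make the redirect concrete, here is a small portable C++ sketch that applies a bindingRedirect of the kind shown in the policy file. It is an illustration rather than the loader's actual logic: the names Version, compareVersions and applyRedirect are invented, and treating both ends of the oldVersion range as inclusive is an assumption. The upper end at least must be inclusive, since 8.0.50608.0 was redirected in the example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// The four numeric parts of a version such as 8.0.50727.42.
struct Version { unsigned part[4]; };

// Parses "a.b.c.d"; returns false if the text is not exactly four numeric parts.
bool parseVersion(const std::string& text, Version& out)
{
    char extra;
    return std::sscanf(text.c_str(), "%u.%u.%u.%u%c",
                       &out.part[0], &out.part[1],
                       &out.part[2], &out.part[3], &extra) == 4;
}

// Orders two versions part by part; returns -1, 0 or 1.
int compareVersions(const Version& a, const Version& b)
{
    for (int i = 0; i < 4; ++i) {
        if (a.part[i] != b.part[i])
            return a.part[i] < b.part[i] ? -1 : 1;
    }
    return 0;
}

// Returns newVersion if requested lies in oldVersion ("low-high" or a single
// version, assumed inclusive); otherwise returns requested unchanged.
std::string applyRedirect(const std::string& requested,
                          const std::string& oldVersion,
                          const std::string& newVersion)
{
    assert(!newVersion.empty());
    const std::string::size_type dash = oldVersion.find('-');
    const std::string lowText = oldVersion.substr(0, dash);
    const std::string highText =
        dash == std::string::npos ? lowText : oldVersion.substr(dash + 1);

    Version req, low, high;
    if (!parseVersion(requested, req) || !parseVersion(lowText, low) ||
        !parseVersion(highText, high))
        return requested;  // malformed input: leave the request alone

    const bool inRange = compareVersions(low, req) <= 0 &&
                         compareVersions(req, high) <= 0;
    return inRange ? newVersion : requested;
}
```

With the values from the policy file, applyRedirect("8.0.50608.0", "8.0.41204.256-8.0.50608.0", "8.0.50727.42") returns 8.0.50727.42, which is the version the application actually loaded.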

Note that version 8.0.50608.0 of the CRT exists nowhere on my machine, and yet this redirect is necessary because the import libraries provided with Visual C++ say that my code will use it. This is plain daft; proper quality control would have ensured that the correct import libraries were shipped with the product.

13.4 Side-by-Side Execution and Managed C++ Assemblies

OK, now create a managed assembly; this code is trivial, and contrived:

using namespace System;

public ref class LibraryCode
{
public:
   String^ GetData()
   {
      return "Data";
   }
};

This uses the new C++ syntax, where the ^ symbol indicates a handle to an object on the managed heap. Compiling this code is simple:

cl /clr /LD lib.cpp

Now list the contents of the build folder, where you'll see that a manifest was created for the assembly. This is exactly the same as the manifest created for the unmanaged library assembly. You can get a list of the DLLs that this library depends upon using the new /dependents switch on dumpbin (strictly, they are dependencies rather than dependents, because our library depends on them). I get the following:

MSVCR80.dll
KERNEL32.dll
msvcm80.dll
mscoree.dll

This indicates that the unmanaged and managed versions of the C runtime library are used. Visual Studio 2005 contains a managed version of various C++ standard library classes and C runtime functions in msvcm80.dll. Running dumpbin /imports on the lib assembly shows that the library uses managed methods in msvcm80.dll to initialize the library. The library also uses msvcr80.dll, which is the unmanaged C runtime library. dumpbin shows that our library calls functions to initialize and clean up the library and to handle exceptions. Such initialization code is required to 'prepare' the library to use the C runtime library, and to provide code to handle the initialization of global, unmanaged C++ objects. This initialization code has been added to the assembly even though we have no code that will throw an unmanaged C++ exception and we do not use any unmanaged code. We will return to this issue later, but for the time being, it serves our purpose.

Create a process that calls this library:

#using "lib.dll"

using namespace System;

int main()
{
   LibraryCode^ lib = gcnew LibraryCode;
   Console::WriteLine(lib->GetData());
}

Again, notice the new syntax. The object is accessed using a handle, and it is created using the gcnew operator. Compile this code with:

cl /clr app.cpp

Finally, run the process and confirm that it works. You know the reason why. The process you created also uses the managed and unmanaged C runtime library and so the linker has created an application manifest, app.exe.manifest. The operating system uses this file to identify the side-by-side assembly that contains the two CRT DLLs that the process uses. In addition, the values in this file are used when the operating system loads the library assembly, lib.

View this assembly with ILDASM and briefly scroll through all of the types that have been declared, then close it down before it gets too frightening (did you know that that much initialization had to be provided so that you can use the CRT and the C++ standard library?). The compiler has added many types to give you support for the CRT and in this example you do not use them! Delete the manifest file and run the application again. You'll see the error dialog that was shown earlier.

Now remove the dependence of the process assembly on the CRT. To do this, compile with the /clr:safe switch:

cl /clr:safe app.cpp

This indicates that the compiler should create verifiable code, and hence it will not use the CRT. The compiler has another new switch /clr:pure which will create an assembly that only contains IL (native code using IJW is not supported) but it can use platform invoke. A pure assembly can use the C runtime library function as long as it is a function exported from the managed version of the CRT DLL, so such an assembly will have a dependence upon msvcm80.dll. We want no dependence on any form of the CRT, managed or unmanaged.

Open the application with ILDASM and confirm that the assembly now only contains the entry point, main. Close ILDASM and list the contents of the build folder. You will find that a manifest has not been created for the application because the process assembly you just built uses no unmanaged file.

Run the application. You will find that a FileNotFoundException is thrown. To work out why, run fuslogvw. The default binding errors give little information, so take a look at the list of native image binding errors. Here, there appear to be some promising entries. The entry for the attempt to load the lib library indicates that Fusion attempted to load the native image for the assembly and could not find it.

This must be a bug. The reason for this error is that the system cannot find the CRT libraries that are required by the library. It has nothing to do with the native image of lib.dll.

Now add the manifest of the library to the library:

mt /manifest lib.dll.manifest /outputresource:lib.dll;#2

Start the application. This time the application will run correctly. The FileNotFoundException was thrown before because the DLL loader could not find the CRT libraries; embedding the manifest gave the loader the information that it needed to be able to find the CRT assembly.

Of course, the library does not use the CRT, nor any unmanaged types, so there is no reason for there to be support for the CRT. Compile this as verifiable code:

cl /LD /clr:safe lib.cpp

Now when you run the application no assembly manifest, nor application manifest is needed because neither the application, nor the library uses side-by-side assemblies.

As a final point, you should remember that if you want to create a strong named library that uses an unmanaged assembly then you should take steps to make sure that the library will not fail strong name validation. If you decide to use mt.exe then you should delay sign the managed library when you build it, then apply mt to insert the manifest resource, and finally re-sign the assembly. It actually makes more sense to choose the second option identified earlier, and create the manifest file yourself and add it to the assembly through a resource script. This way you would not have the extra step of re-signing the assembly.

I hope that you enjoy this tutorial and value the knowledge that you will gain from it. I am always pleased to hear from people who use this tutorial (contact me). If you find this tutorial useful then please also email your comments to mvpga@microsoft.com.

Errata

If you see an error on this page, please contact me and I will fix the problem.

This page is (c) 2007 Richard Grimes, all rights reserved