Is .NET A Wrapper Around Win32?

Analysis of .NET's Use of Win32

Introduction
Managed and Unmanaged Code
Results
   .NET Version 1.0.3705
   .NET Version 1.1.4322
   .NET Version 1.2.30703
   .NET Version 2.0.50215
   .NET Version 2.0.50727
   WinFX 6.0.5070
Native Methods in the Framework
Analysis of Results
Conclusions
Downloads
Notes

Executive Summary
Five versions of the .NET framework are analyzed for their use of unmanaged code. The analysis measures how many methods are implemented as Intermediate Language (IL) and how many are implemented in unmanaged code. It also measures how many method calls are to methods that are IL and how many calls are to methods that are implemented with unmanaged code. These measurements give a metric of how 'managed' the framework is; and the more managed, the better.

The results show that all versions of .NET from 1.0.3705 (the first version) to 2.0.50727 (the released version of .NET v2.0) contain unmanaged code in the framework library. Furthermore, the proportion of managed code rose to a peak at version 1.2.30703 (the version supplied with the first publicly available version of Longhorn) and has decreased in the versions released since. The current version of the .NET library (2.0.50727) is the worst yet in terms of the number of calls to unmanaged code, and the second worst (after 1.1.4322) in terms of the number of methods implemented with unmanaged code.

The conclusion to be drawn is that Microsoft made an effort to make the framework as managed as possible in versions of the framework up to and including 1.2.30703 (the first version of Longhorn), but after this Microsoft retreated from this aspiration, and it now appears they are no longer determined to make the .NET framework entirely managed.

Introduction

I first had access to Longhorn in the middle of 2002 because I had been asked to write a book about WinFS and I was given early access to Longhorn so that I could prepare the book while the operating system was being developed. Sadly, that project fell through (although Microsoft removing WinFS from Longhorn was not the main reason for its failure). When I was first given access to Longhorn my contact at Microsoft enthused about the Longhorn API (LAPI, now known as WinFX). He said that the move from XP to Longhorn would be bigger than the move from DOS to Windows. He also told me that LAPI would be almost entirely managed. In fact, he told me that the LAPI developers were told that they had to use .NET and if they wanted to write native code they had to give a good reason for it. He went on to explain to me that Microsoft would provide native code wrappers around LAPI for, in his words, "VB6 developers".

I found the concept of LAPI exciting: Microsoft's operating system APIs had always been C APIs, so providing a .NET API was a new departure for Microsoft. I am, and always have been, a .NET enthusiast, principally because I recognize the immense advantage that .NET code access security gives. It appeared to me that the operating system would get huge benefits from .NET security.

A lot has happened in the last couple of years. Microsoft have changed the name to Vista, they have removed WinFS and they have retreated from their position of requiring all new development to be done in .NET. Indeed, the evidence at the moment appears to be that Microsoft have completely lost confidence in .NET.

Of course, that last phrase is subjective, and the official line from Microsoft is that they have as much confidence in .NET as they have ever had. In my previous article, Vista and .NET, I analyzed how much of the operating system used .NET. I found that the PDC 2003 build of Longhorn was distributed with a version of WinFX, clearly because operating system applications used this framework. It used .NET in the shell and it also used .NET for several Windows services. In spite of this, the majority of the operating system was still based on the C-based Win32 API, but clearly attempts had been made to use .NET. Vista beta 1, and the subsequent 5219 and 5231 builds, did not have WinFX, and you were expected to install it as a separate component. This meant that none of the operating system could be implemented using WinFX: in one action they changed WinFX from being a required and integral part of Windows to being an optional component. In addition, these builds used .NET less than the PDC03 build of Longhorn. In all versions of Vista to date the shell does not use .NET and there are no .NET services. The few applications that do use .NET are fairly trivial applications, or are applications that could just as easily be built in Win32. Since WinFX is not part of Vista, the much exalted Indigo (Windows Communication Foundation) and Avalon (Windows Presentation Foundation) do not show up at all in Vista.

These results, from my analysis of the use of .NET in Vista, agreed with my assertion that Microsoft is losing confidence in .NET, but I wanted to get a view from a different perspective.

One of the great aspects of .NET v1.0 and v1.1 was that the framework library gave access to almost all of Win32. Before .NET a developer would have to use a variety of technologies: C functions exported from DLLs; inproc and local server COM objects; and even C function pointers exported in such a way that they look like COM interfaces [1]. The .NET framework library provided a unified mechanism to access all of these Windows facilities. This means that the framework library can be described as a wrapper around Win32. However, the framework library is more than this because it provides some features (for example, the regular expressions library and .NET remoting) that are entirely implemented in managed code.

Clearly, if the LAPI developers were told that they had to write managed code, this would reduce the 'wrapper' aspect of the framework library around Win32. I wanted some kind of metric to show how much of a wrapper the framework library is. If I could get a measure of 'wrapper-ness' then I could make a comparison between the various versions of the framework library.

Managed and Unmanaged Code

The descriptive terms 'managed' and 'unmanaged' code are too simplistic. There are actually four ways that the framework can use native code. I will explain these four mechanisms from the perspective of .NET metadata.

The first mechanism is the best known: Platform Invoke. In your .NET source code you will recognize Platform Invoke methods by the [DllImport] attribute on a static method without an implementation. If you view such a method using ILDASM you'll see that a Platform Invoke method has the pinvokeimpl metadata attribute (it is not a custom attribute). For example, this C# code will allow a class to call the Win32 CloseHandle function:

[DllImport("kernel32")]
public static extern bool CloseHandle(uint h);

The IL for this method looks like this:

.method public hidebysig static pinvokeimpl("kernel32" winapi)
bool CloseHandle(unsigned int32 h) cil managed preservesig
{
}

I find this a little confusing because the method is marked as pinvokeimpl, which means that it is implemented in the kernel32.dll dynamic link library, but it is also marked as being cil managed. I think ILDASM is wrong, but I understand why it says this. The .NET ECMA specification gives a description of the metadata that describes methods. This metadata is in the MethodDef table (table 0x06) and each row in this table has these fields:

Name | Type | Description
RVA | ULONG | The Relative Virtual Address of the implementation of the method.
ImplFlags | USHORT | Flags that indicate how the method is implemented.
Flags | USHORT | Flags that indicate the accessibility of the method and other details about how the method is implemented.
Name | string | The index in the string table of the method name.
Signature | blob | The index in the blob table of binary data that describes the method's signature.
ParamList | index | Index into the Param table of the first parameter.

The relevant fields are the ImplFlags and Flags fields. The CloseHandle entry has 0x80 for ImplFlags and 0x2096 for Flags. The ECMA spec indicates that the ImplFlags value is PreserveSig and the Flags value is Public | Static | HideBySig | PInvokeImpl.

So where does cil managed come from? Well, you have to take a closer look at the ImplFlags value. The spec indicates that ImplFlags is not purely a bitmap. The upper bits are a bitmap (which includes the PreserveSig value), but the bottom three bits hold two separate values. Bit 2 (0x04) is set if the method is unmanaged, that is, the RVA field points to native code within the assembly; if this bit is unset then the method is managed. The bottom two bits give a value that indicates how the method is implemented: a value of 0 means that it is CIL, a value of 1 means that the method is native (again, the RVA field points to native code within the assembly), 2 means that the method is OPTIL (unused in current versions of .NET) and 3 means that the runtime implements the method (note that this is not the same as internalcall; this is used, for example, on COM interop methods).

So decoding the ImplFlags value again we see that the bottom three bits are all unset. This means that the code is not unmanaged, and is implemented in CIL; hence this gives us the cil managed that ILDASM shows. However, I think this is wrong. The reason is that pinvokeimpl indicates that the method implementation is not in the assembly (and this is borne out by the RVA field, which has a value of zero). So how the method is implemented is irrelevant here: it is not the concern of this assembly.
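The decoding just described can be sketched in a few lines of C++. This is only an illustration and not part of any tool discussed in this article: the names CodeType, ImplInfo and decodeImplFlags are invented for the example, and the masks are the method implementation attribute values as I understand them from the ECMA metadata specification.

```cpp
#include <cstdint>

// How the method body is provided: the bottom two bits of ImplFlags.
enum class CodeType : std::uint16_t { IL = 0, Native = 1, OPTIL = 2, Runtime = 3 };

// Hypothetical helper type, used only in this example.
struct ImplInfo {
    CodeType codeType;  // bits 0-1 (mask 0x0003)
    bool unmanaged;     // bit 2 (mask 0x0004)
    bool preserveSig;   // bitmap flag 0x0080
    bool internalCall;  // bitmap flag 0x1000
};

// Splits a MethodDef ImplFlags value into the fields discussed in the text.
constexpr ImplInfo decodeImplFlags(std::uint16_t implFlags) {
    return ImplInfo{
        static_cast<CodeType>(implFlags & 0x0003),
        (implFlags & 0x0004) != 0,
        (implFlags & 0x0080) != 0,
        (implFlags & 0x1000) != 0
    };
}
```

For the CloseHandle row, decodeImplFlags(0x0080) reports CodeType::IL with the unmanaged bit clear and PreserveSig set, which corresponds to the cil managed preservesig text that ILDASM prints.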

The second mechanism for accessing unmanaged code is to use COM interop. COM objects are accessed through interfaces. You cannot mark a single method as being a COM method; instead, an entire interface must be marked as being a COM interface. An interface has no implementation; instead, a class implements the methods of an interface. The .NET runtime provides an object called the Runtime Callable Wrapper (RCW), which implements the .NET interfaces that correspond to the COM interfaces implemented by the COM object. So that the RCW knows which interfaces to implement, you have to provide a .NET class that contains metadata to do the mapping. Typically, you will use the tlbimp tool to generate an interop assembly that has the interfaces and the .NET class used by the RCW for COM interop.

The following is the decompiled code for an interface and a class generated by tlbimp (I have removed some values for clarity):

[ComImport, Guid(/*stuff*/), TypeLibType(/*stuff*/)]
public interface ITest
{
   [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType=MethodCodeType.Runtime), DispId(1)]
   void CallMe();
}

[ComImport, CoClass(typeof(TestClass)), Guid(/*stuff*/)]
public interface Test : ITest
{
}

[ComImport, Guid(/*stuff*/), TypeLibType(/*stuff*/), ClassInterface(0)]
public class TestClass : ITest, Test
{
   [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType=MethodCodeType.Runtime), DispId(1)]
   public virtual extern void CallMe();
}

Your code will create an instance of the Test type to access the COM object. The [CoClass] attribute indicates to the RCW the class that gives information about the COM object and the interfaces it implements. The important item is the [ComImport] attribute, which is applied to a whole type (a class or an interface) rather than to an individual method. In IL it appears as the import attribute. Here's what ILDASM gives for TestClass (some values are omitted for clarity):

.class public auto ansi import TestClass
extends [mscorlib]System.Object implements LibTest.ITest, LibTest.Test
{
.custom instance void [mscorlib]GuidAttribute::.ctor(string) = ()
.custom instance void [mscorlib]TypeLibTypeAttribute::.ctor(int16) = ()
.custom instance void [mscorlib]ClassInterfaceAttribute::.ctor(int16) = ()
} // end of class TestClass

The metadata for this item is held in the TypeDef (0x02) table, which has the following fields:

Name | Type | Description
Flags | ULONG | Indicates how the type is implemented and its visibility.
Name | string | The name of the type.
Namespace | string | The namespace that the type belongs to.
Extends | index | An index into one of the tables that describe types, giving the base class for this type.
FieldList | index | The index in the Field table of the first field of the type; subsequent fields follow immediately in the table.
MethodList | index | The index in the MethodDef table of the first method of the type; subsequent methods follow immediately in the table.

The table has a 32-bit Flags field and the row for TestClass has 0x1001 for this field. The bottom three bits give a number that indicates the visibility of the type (in this case, Public), the rest of the value is a bitmap and a value of 0x1000 is Import, and this corresponds to the [ComImport] attribute.
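As a small illustration of that split between a numeric field and a bitmap, the following C++ sketch pulls the two pieces out of a TypeDef Flags value. The function and constant names are hypothetical and exist only for this example; the numeric values are the ones quoted above.

```cpp
#include <cstdint>

// TypeDef.Flags values quoted in the text.
constexpr std::uint32_t kTypeVisibilityMask = 0x00000007;  // bottom three bits
constexpr std::uint32_t kTypePublic         = 0x00000001;
constexpr std::uint32_t kTypeImport         = 0x00001000;  // [ComImport]

// The bottom three bits are a number giving the visibility of the type.
constexpr std::uint32_t typeVisibility(std::uint32_t typeFlags) {
    return typeFlags & kTypeVisibilityMask;
}

// True when the type carries the import attribute.
constexpr bool isComImport(std::uint32_t typeFlags) {
    return (typeFlags & kTypeImport) != 0;
}
```

With the TestClass value, typeVisibility(0x1001) equals kTypePublic and isComImport(0x1001) is true.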

The interface (and hence the class) has a single method, CallMe. ILDASM gives the following:

.method public hidebysig newslot virtual
instance void CallMe() runtime managed internalcall
{
.custom instance void [mscorlib]System.Runtime.InteropServices.DispIdAttribute::.ctor(int32) = ()
.override LibTest.ITest::CallMe
} // end of method TestClass::CallMe

Note that this method is described as runtime managed internalcall. Yet again, the implementation of the method is not in the assembly, so I would argue that the managed metadata attribute is a mistake; it appears because bit 2 (0x04) of ImplFlags is not set, which merely shows that there is no unmanaged implementation in the assembly for this method. The value of ImplFlags for this method is 0x1003. From the discussion above, you'll notice that the bottom two bits give the implementation: Runtime (3). The rest of the value is 0x1000, which is the InternalCall value. The runtime attribute means that the runtime provides the implementation. In this case the implementation is unmanaged code, but note that runtime (ECMA Spec 14.4.3.1) does not indicate how the method is implemented, just that it is implemented by the runtime. The internalcall attribute means that the unmanaged framework DLLs provide the implementation; in this case, the framework provides the necessary code to call the COM interface method.

This last attribute, internalcall, is not limited to COM methods, and it represents the third type of unmanaged code that is called. For example, AppDomain.GetAssemblies, Array.Copy, String.IndexOf and Thread.Join are some of the many methods that are marked as internalcall. It is important to stress that these are methods that are implemented by the runtime itself rather than the framework library, and the annotated ECMA spec (14.4.3.3) gives the following note:

Implementation Specific (Microsoft): internalcall allows the lowest-level parts of the Base Class Library to wrap unmanaged code built into Microsoft's Common Language Runtime.

I want to differentiate between COM methods and non-COM internalcall methods so when I identify an internalcall method I need to check whether the type that implements the method has the [ComImport] attribute; if it does, then the method is a COM method.

The final way to access unmanaged code is through native code embedded in the assembly. This is done by the managed C++ compiler using a mechanism called It Just Works! or IJW. For example, consider the following managed C++:

// Compile with /clr
int main()
{
   System::Console::WriteLine("started");
}

Note that this code does not have any unmanaged code! However, the clue is the switch used to compile this code: /clr. This means that the assembly links with the managed CRT library implemented in msvcm80.dll and the unmanaged CRT library implemented in msvcr80.dll. The CRT is linked because the /clr switch indicates that you may use the CRT, and hence the compiler adds support for the library to the assembly. If you do not use the CRT (nor any unmanaged code) then you should use /clr:safe; however, that would not help in this example.

ILDASM shows lots and lots of data items and methods added to allow you to use the CRT and global unmanaged objects. One such method is _mainCRTStartup, shown here:

.method public static pinvokeimpl(/* No map */)
uint32 _mainCRTStartup() native unmanaged preservesig
{
.entrypoint
.custom instance void [mscorlib]System.Security.SuppressUnmanagedCodeSecurityAttribute::.ctor()
   = ( 01 00 00 00 )
// Embedded native code
// Disassembly of native methods is not supported.
// Managed TargetRVA = 0x000012C9
} // end of method 'Global Functions'::_mainCRTStartup

This method is implemented as native x86 code within the assembly. Note that the method has the pinvokeimpl attribute as well as native unmanaged preservesig.

The reason is shown in the MethodDef entry for this method. This entry has a value for ImplFlags of 0x0085 and for Flags of 0x6016. The bottom two bits of ImplFlags are 01, which means native. The third bit (0x04) indicates whether the code is managed or not (in this case the bit is set, which means unmanaged). The other bits are a bitmap, in this case 0x80, which means PreserveSig. Moving on to Flags, the bottom three bits indicate the access level of the method (in this case 6, which means Public). The rest of the bits in this item are a bitmap where 0x10 means Static and 0x6000 is a combination of PInvokeImpl (0x2000) and HasSecurity (0x4000). The latter attribute refers to the custom attribute [SuppressUnmanagedCodeSecurity].
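The Flags half of that decoding can be sketched as follows. This is an illustrative fragment rather than code from the analysis tool: the helper names memberAccess and hasFlag are made up for the example, and the constants are the values quoted in this article for the MethodDef Flags field.

```cpp
#include <cstdint>

// MethodDef.Flags values quoted in the text.
constexpr std::uint16_t kMemberAccessMask  = 0x0007;  // bottom three bits
constexpr std::uint16_t kMethodPublic      = 0x0006;
constexpr std::uint16_t kMethodStatic      = 0x0010;
constexpr std::uint16_t kMethodHideBySig   = 0x0080;
constexpr std::uint16_t kMethodPInvokeImpl = 0x2000;
constexpr std::uint16_t kMethodHasSecurity = 0x4000;

// The bottom three bits are a number giving the access level.
constexpr std::uint16_t memberAccess(std::uint16_t flags) {
    return static_cast<std::uint16_t>(flags & kMemberAccessMask);
}

// The remaining bits are a bitmap that can be tested one flag at a time.
constexpr bool hasFlag(std::uint16_t flags, std::uint16_t flag) {
    return (flags & flag) != 0;
}
```

For _mainCRTStartup, memberAccess(0x6016) equals kMethodPublic, and hasFlag reports Static, PInvokeImpl and HasSecurity as set. The same helpers applied to the earlier CloseHandle value 0x2096 give Public, Static, HideBySig and PInvokeImpl.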

Note that pinvokeimpl has the comment /* No map */. This is because there is no entry for this method in the assembly's ImplMap table, which has the following fields:

Name | Type | Description
MappingFlags | USHORT | Gives information about the imported method.
MemberForwarded | index | Indicates the index in the MethodDef table of the managed implementation of the method.
ImportName | string | The name of the method in the native DLL.
ImportScope | index | Index of an entry in the ModuleRef table for the DLL that implements the native method.

Normally the pinvokeimpl attribute indicates that the runtime should look for an entry in the ImplMap table that gives the name of the native function and the DLL which exports it. Indeed, a method with the pinvokeimpl attribute will normally have zero for the RVA field of its entry in the MethodDef table. However, in the case of the managed _mainCRTStartup there is a non-zero value for RVA. This address still points to a place in the .text section in the PE file (the part of the file that contains metadata and IL). However, the map file for the code (which you can generate with -Fm using the compiler, or /MAP for the linker) indicates that this is the address of the unmanaged _mainCRTStartup function.

So to summarize, there are four types of calls to unmanaged code: a platform invoke call to a DLL function, a COM interop call to a COM object, a call to an internalcall method, and a call to embedded native code put there by managed C++ IJW.
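Putting the metadata checks from this section together, a classifier for a single method might look like the sketch below. It is my own reading of the rules described above, not the code of the analysis tool: MethodKind, MethodRow and classify are hypothetical names, and the order of the tests (embedded native first, because IJW methods also carry pinvokeimpl) is an assumption.

```cpp
#include <cstdint>

enum class MethodKind { IL, PInvoke, ComInterop, InternalCall, EmbeddedNative };

// Hypothetical record holding just the metadata values the classification needs.
struct MethodRow {
    std::uint16_t implFlags;  // MethodDef.ImplFlags
    std::uint16_t flags;      // MethodDef.Flags
    std::uint32_t typeFlags;  // TypeDef.Flags of the declaring type
};

// Code type Native (bottom two bits == 1) or the unmanaged bit (0x0004) set.
constexpr bool isEmbeddedNative(MethodRow m) {
    return (m.implFlags & 0x0003) == 0x0001 || (m.implFlags & 0x0004) != 0;
}
constexpr bool isInternalCall(MethodRow m)  { return (m.implFlags & 0x1000) != 0; }
constexpr bool isPInvokeImpl(MethodRow m)   { return (m.flags & 0x2000) != 0; }
constexpr bool typeIsComImport(MethodRow m) { return (m.typeFlags & 0x1000) != 0; }

// An internalcall method counts as COM only when its type has the import flag.
constexpr MethodKind classify(MethodRow m) {
    return isEmbeddedNative(m) ? MethodKind::EmbeddedNative
         : isInternalCall(m)   ? (typeIsComImport(m) ? MethodKind::ComInterop
                                                     : MethodKind::InternalCall)
         : isPInvokeImpl(m)    ? MethodKind::PInvoke
                               : MethodKind::IL;
}
```

Using the values quoted in this section, CloseHandle (ImplFlags 0x0080, Flags 0x2096) classifies as PInvoke, TestClass::CallMe (ImplFlags 0x1003 on a type with Flags 0x1001) as ComInterop, and _mainCRTStartup (ImplFlags 0x0085, Flags 0x6016) as EmbeddedNative.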

Results

The results for various versions of .NET will be given in the following sections, but first I need to explain the measurements that have been taken. I have written a command line tool that will load all libraries in a specified folder. For each .NET library in this list of DLLs the tool measures the number of methods of any visibility within the assembly. It examines each method and determines if the method contains native code, or is a Platform Invoke or COM interop method. It also determines the methods that are marked as internalcall. Finally, the tool determines the number of methods that are implemented within the assembly purely in IL.

The next test is to access every method implemented in IL and test each IL opcode to see if it is a call to a method. This gives the measure of the number of method calls made. The tool then analyses these methods to see if they are made to methods within the current assembly or to another assembly; and in each case determines if the method that is called is native, COM, Platform Invoke, internalcall or pure IL.
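The first step of that test, recognising a call and deciding whether its target is defined in the same module, can be sketched as below. This is a hedged illustration only: the article does not say which opcodes the tool counts, so treating call, calli and callvirt as the call opcodes is an assumption, and all of the names are invented for the example. The opcode bytes and the token layout (table number in the top byte, row number in the low 24 bits) come from the CIL and metadata specifications.

```cpp
#include <cstdint>

// Single-byte CIL opcodes that transfer control to another method.
constexpr std::uint8_t kOpCall     = 0x28;  // call
constexpr std::uint8_t kOpCalli    = 0x29;  // calli (call through a function pointer)
constexpr std::uint8_t kOpCallvirt = 0x6F;  // callvirt

constexpr bool isCallOpcode(std::uint8_t opcode) {
    return opcode == kOpCall || opcode == kOpCalli || opcode == kOpCallvirt;
}

// A metadata token keeps the table number in its top byte and the row in the rest.
constexpr std::uint32_t tokenTable(std::uint32_t token) { return token >> 24; }
constexpr std::uint32_t tokenRow(std::uint32_t token)   { return token & 0x00FFFFFF; }

constexpr std::uint32_t kTableMethodDef = 0x06;  // method defined in this module
constexpr std::uint32_t kTableMemberRef = 0x0A;  // reference that still has to be resolved

// A MethodDef token always names a method in the module being examined.
constexpr bool targetIsInThisModule(std::uint32_t token) {
    return tokenTable(token) == kTableMethodDef;
}
```

A MemberRef token needs a further lookup before it can be attributed to another assembly, so a real tool has to resolve the reference and then inspect the target's own metadata to decide whether it is native, COM, Platform Invoke, internalcall or pure IL.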

The reason for these measurements is to determine a metric of whether the .NET framework is a wrapper around Win32. For the purpose of these measurements I define Win32 as being any native DLL or COM object accessed by the framework, and so the calls to COM and pinvokeimpl methods are regarded as being calls to Win32. A few methods are native code embedded within the assembly. These are part of the assembly and so I do not treat them as being Win32 methods. Finally, some methods are marked as being internalcall. These methods are implemented by the unmanaged portion of the .NET framework, and again, I do not treat them as being Win32. The Platform Invoke and COM methods give a measure of how much of a Win32 'wrapper' the framework is, and the native methods and internalcall methods give a measure of how 'managed' the framework library is.

In the results I have a row marked 'Other Methods'. These are calls which would not fit the other categories, or which were not accessible. The majority of these methods are calls through function pointers (calli) and so must be calls to managed methods.

.NET Version 1.0.3705

This is the first version of the framework, supplied as part of Visual Studio 2002. The first table shows how the framework library is implemented:

Description | Count | %
Total Number of Methods | 76772 | 100.0
Total Number of IL Methods | 71063 | 92.6
Total Number of COM Methods | 2599 | 3.4
Total Number of pinvokeimpl Methods | 2098 | 2.7
Total Number of internalcall Methods | 988 | 1.3
Total Number of Native Methods | 28 | 0.04
Total Number of Method Calls | 277451 |
Total Number of Opcodes Examined | 1687220 |

This shows that the framework contains 28 methods that are native, that is, the x86 code is embedded within the assembly. This is a small percentage of the overall number of methods (some 76 thousand methods), but the fact that there are any native methods surprised me. I will do an analysis of this later.
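To make the two measures defined at the start of this section concrete, here is a small sketch that combines the counts reported for version 1.0.3705. The struct and function names are hypothetical; only the numbers come from the table.

```cpp
// Hypothetical container for the method counts of one framework version.
struct MethodCounts {
    long total;
    long com;
    long pinvoke;
    long internalCall;
    long native;
};

constexpr double percent(long part, long total) { return 100.0 * part / total; }

// COM plus Platform Invoke methods: the 'wrapper around Win32' share.
constexpr double win32WrapperPercent(MethodCounts c) {
    return percent(c.com + c.pinvoke, c.total);
}

// internalcall plus embedded native methods: unmanaged, but not counted as Win32.
constexpr double otherUnmanagedPercent(MethodCounts c) {
    return percent(c.internalCall + c.native, c.total);
}

// Counts reported for version 1.0.3705.
constexpr MethodCounts kV1_0_3705 = {76772, 2599, 2098, 988, 28};
```

With these counts, roughly 6.1% of the methods are Win32 wrappers (COM or Platform Invoke) and roughly 1.3% are unmanaged in the other sense (internalcall or embedded native code).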

The next analysis is performed on the method calls that are made within the framework library. In total there are 277 thousand calls. The first table gives the number of calls to methods within the assembly. The percentage is given relative to all method calls (that is, calls to methods within the assembly and to methods implemented in other assemblies):

Description | Count | %
Total Number of Method Calls (internal and external) | 277451 |
Calls to IL Methods | 148037 | 53.4
Calls to COM Methods | 978 | 0.35
Calls to pinvokeimpl Methods | 3140 | 1.13
Calls to internalcall Methods | 3055 | 1.10
Calls to Native Methods | 58 | 0.02
Calls to Other Methods | 1114 | 0.40

The majority of method calls are to methods implemented in IL. For example, there are just 28 native methods and these are called on average twice each (58 method calls). The breakdown of calls to methods implemented in other assemblies is:

Description | Count | %
Total Number of Method Calls (internal and external) | 277451 |
Calls to IL Methods | 116898 | 42.1
Calls to COM Methods | 3 | 0.001
Calls to pinvokeimpl Methods | 166 | 0.06
Calls to internalcall Methods | 4002 | 1.44
Calls to Native Methods | 0 | 0

.NET Version 1.1.4322

This is the second version of the framework, provided as part of Visual Studio 2003 and as part of Windows Server 2003. Here is the number of methods in absolute terms and as a percentage of the whole:

Description | Count | %
Total Number of Methods | 92504 | 100.0
Total Number of IL Methods | 81607 | 88.2
Total Number of COM Methods | 7444 | 8.0
Total Number of pinvokeimpl Methods | 2320 | 2.5
Total Number of internalcall Methods | 1136 | 1.2
Total Number of Native Methods | 38 | 0.04
Total Number of Method Calls | 317267 |
Total Number of Opcodes Examined | 1912759 |

Notice that the number of native methods in the framework has increased by about a third (from 28 to 38) over the previous version of the framework. In this small respect, the framework library is getting less managed. Calls to methods within the assembly are given in this table:

Description | Count | %
Total Number of Method Calls (internal and external) | 317267 |
Calls to IL Methods | 166901 | 52.6
Calls to COM Methods | 1044 | 0.33
Calls to pinvokeimpl Methods | 3443 | 1.1
Calls to internalcall Methods | 3079 | 1.0
Calls to Native Methods | 80 | 0.03
Calls to Other Methods | 1202 | 0.38

Calls to methods in other assemblies:

Description | Count | %
Total Number of Method Calls (internal and external) | 317267 |
Calls to IL Methods | 137249 | 43.3
Calls to COM Methods | 276 | 0.09
Calls to pinvokeimpl Methods | 176 | 0.06
Calls to internalcall Methods | 3817 | 1.2
Calls to Native Methods | 0 | 0

.NET Version 1.2.30703

This version of the framework was provided as part of the first publicly available version of Longhorn (released at the 2003 PDC) and hence contained the first version of WinFX.

Description | Count | %
Total Number of Methods | 160131 | 100.0
Total Number of IL Methods | 151445 | 94.6
Total Number of COM Methods | 4780 | 2.99
Total Number of pinvokeimpl Methods | 2852 | 1.78
Total Number of internalcall Methods | 1000 | 0.62
Total Number of Native Methods | 63 | 0.04
Total Number of Method Calls | 606755 |
Total Number of Opcodes Examined | 3406817 |

In this version the number of embedded native methods has risen by about two thirds over the previous version (from 38 to 63). The number of calls to methods within the assembly is shown here:

Description | Count | %
Total Number of Method Calls (internal and external) | 606755 |
Calls to IL Methods | 328312 | 54.1
Calls to COM Methods | 1152 | 0.19
Calls to pinvokeimpl Methods | 4255 | 0.70
Calls to internalcall Methods | 4393 | 0.72
Calls to Native Methods | 160 | 0.03
Calls to Other Methods | 1995 | 0.33

Calls to methods in other assemblies:

Description | Count | %
Total Number of Method Calls (internal and external) | 606755 |
Calls to IL Methods | 260844 | 43.0
Calls to COM Methods | 3 | 0.00
Calls to pinvokeimpl Methods | 113 | 0.02
Calls to internalcall Methods | 5484 | 0.90
Calls to Native Methods | 0 | 0

.NET Version 2.0.50215

This is the version of the runtime that was released with Vista beta 1.

Description | Count | %
Total Number of Methods | 166990 | 100.0
Total Number of IL Methods | 156193 | 93.5
Total Number of COM Methods | 6647 | 4.0
Total Number of pinvokeimpl Methods | 3087 | 1.8
Total Number of internalcall Methods | 945 | 0.6
Total Number of Native Methods | 124 | 0.07
Total Number of Method Calls | 648607 |
Total Number of Opcodes Examined | 3662071 |

The number of native methods in the library has roughly doubled in this version (from 63 to 124). The number of calls to methods within the assembly is:

Description | Count | %
Total Number of Method Calls (internal and external) | 648607 |
Calls to IL Methods | 366408 | 56.5
Calls to COM Methods | 1472 | 0.23
Calls to pinvokeimpl Methods | 5153 | 0.8
Calls to internalcall Methods | 6481 | 1.0
Calls to Native Methods | 180 | 0.03
Calls to Other Methods | 6193 | 0.95

Calls to methods in other assemblies:

Description | Count | %
Total Number of Method Calls (internal and external) | 648607 |
Calls to IL Methods | 247950 | 38.2
Calls to COM Methods | 2 | 0.00
Calls to pinvokeimpl Methods | 87 | 0.01
Calls to internalcall Methods | 14681 | 2.3
Calls to Native Methods | 0 | 0

.NET Version 2.0.50727

This is the RTM version of .NET version 2.0, and hence contains the release version of the runtime used by WinFX:

Description | Count | %
Total Number of Methods | 233068 | 100.0
Total Number of IL Methods | 216420 | 92.9
Total Number of COM Methods | 9861 | 4.2
Total Number of pinvokeimpl Methods | 5763 | 2.5
Total Number of internalcall Methods | 941 | 0.40
Total Number of Native Methods | 95 | 0.04
Total Number of Method Calls | 812379 |
Total Number of Opcodes Examined | 5029051 |

It is interesting to note that the number of native methods has decreased to about three quarters of the number in the previous version (but is still two and a half times the number in v1.1). The number of calls to methods within the assembly is:

Description | Count | %
Total Number of Method Calls (internal and external) | 812379 |
Calls to IL Methods | 457123 | 56.3
Calls to COM Methods | 1488 | 0.18
Calls to pinvokeimpl Methods | 7992 | 0.98
Calls to internalcall Methods | 6501 | 0.80
Calls to Native Methods | 174 | 0.02
Calls to Other Methods | 9194 | 1.13

Calls to methods in other assemblies:

Description | Count | %
Total Number of Method Calls (internal and external) | 812379 |
Calls to IL Methods | 311193 | 38.3
Calls to COM Methods | 276 | 0.03
Calls to pinvokeimpl Methods | 107 | 0.01
Calls to internalcall Methods | 18331 | 2.3
Calls to Native Methods | 0 | 0

WinFX 6.0.5070

The WinFX API is a collection of assemblies that provides the presentation core and messaging infrastructure for Vista. It can also be installed on other versions of Windows, so I installed it on Windows XP SP2.

Description | Count | %
Total Number of Methods | 65344 | 100.0
Total Number of IL Methods | 62873 | 96.2
Total Number of COM Methods | 1518 | 2.32
Total Number of pinvokeimpl Methods | 955 | 1.46
Total Number of internalcall Methods | 0 | 0
Total Number of Native Methods | 0 | 0
Total Number of Method Calls | 213923 |
Total Number of Opcodes Examined | 1336809 |

Note that although the proportion of pure IL methods is considerably more than for any of the versions of the framework examined so far, there are still some methods that are implemented in unmanaged code.

The number of calls to methods within the assembly are:

Description | Count | %
Total Number of Method Calls (internal and external) | 213923 |
Calls to IL Methods | 124635 | 58.3
Calls to COM Methods | 434 | 0.20
Calls to pinvokeimpl Methods | 1329 | 0.62
Calls to internalcall Methods | 0 | 0
Calls to Native Methods | 0 | 0
Calls to Other Methods | 10825 | 5.1

The 'Other Methods' here are methods that are called through a TypeSpec metadata entry. This table describes managed methods.

Calls to methods in other assemblies:

Description | Count | %
Total Number of Method Calls (internal and external) | 213923 |
Calls to IL Methods | 72786 | 34.0
Calls to COM Methods | 7 | 0.003
Calls to pinvokeimpl Methods | 394 | 0.18
Calls to internalcall Methods | 3248 | 1.52
Calls to Native Methods | 0 | 0

Native Methods in the Framework

It surprised me that there were any embedded native methods at all in the framework. The number of native methods is summarized here:

Version | 1.0.3705 | 1.1.4322 | 1.2.30703 | 2.0.50215 | 2.0.50727
Native methods | 28 | 38 | 63 | 124 | 95

To investigate what is happening here, I will use the released version of .NET 2.0 (2.0.50727) as an example. To get a list of the native methods, I altered the analysis tool so that after it analyzed the runtime libraries it printed out the names of the native methods. This showed that there were three assemblies that had native methods (CustomMarshalers, ISymWrapper and System.Data) and one module (System.EnterpriseServices.Wrapper).

At first sight, the list appeared to contain three types of methods: custom methods, CRT methods and Win32 methods. This last category may appear surprising: these are Win32 API functions like CoCreateInstance and Sleep, and they appear to be implemented as embedded native code. These functions are implemented in Win32 DLLs and so should be imported from those DLLs and should not be embedded in the assembly. To find out what was happening I ran dumpbin /imports on such an assembly and found that these functions were imported through the PE file's import address table (IAT), that is, they were imported through the normal mechanism.

So what is happening here? Well, the first clue is that these four files were written in managed C++ (you can tell this is the case because of the plethora of module static members that are required to use the C runtime library). Managed C++ can use unmanaged DLL functions in two ways: firstly, the code can use platform invoke explicitly through [DllImport], and secondly, it can link with a static import library. This code shows both mechanisms:

#pragma comment(lib, "kernel32.lib")
extern "C" void __stdcall CloseHandle(unsigned int);

using namespace System::Runtime::InteropServices;

public ref class Test
{
public:
   void CloseHandleStaticLib(unsigned int handle)
   {
      CloseHandle(handle);
   }
   [DllImport("kernel32", EntryPoint="CloseHandle")]
   static void CloseHandlePI(unsigned int handle);
};

The first line indicates that the code will use a function in the kernel32.dll library and that access to the function is through information provided by the static linked import library kernel32.lib. The function I will use is CloseHandle and the prototype is normally obtained through including <windows.h>; however, since this code uses just the one function, the second line gives the prototype without the include file. The linker generates an import address table entry for CloseHandle so that when a method uses this function (for example CloseHandleStaticLib) the address of the IAT entry is used. This is It Just Works! (IJW) in action: you write code as if it is native C++ code and the compiler will generate IL for that code.

The second method in the Test class imports CloseHandle through platform invoke. This method is named CloseHandlePI rather than CloseHandle so that the call in CloseHandleStaticLib does not resolve to it. Here's what ILDASM shows for the Test methods:

.method public hidebysig static pinvokeimpl("kernel32" as "CloseHandle" winapi)
void CloseHandlePI(uint32 handle) cil managed preservesig forwardref
{
}

.method public hidebysig instance void CloseHandleStaticLib(uint32 handle) cil managed
{
// Code size 7 (0x7)
.maxstack 1
IL_0000: ldarg.1
IL_0001: call void modopt([mscorlib]System.Runtime.CompilerServices.CallConvStdcall)
            CloseHandle(uint32)
IL_0006: ret
} // end of method Test::CloseHandleStaticLib

The second of these, CloseHandleStaticLib, calls a global method called CloseHandle, which ILDASM shows as follows:

.method public static pinvokeimpl( lasterr stdcall)
void modopt([mscorlib]System.Runtime.CompilerServices.CallConvStdcall)
CloseHandle(uint32 A_0) native unmanaged preservesig
{
.custom instance void [mscorlib]System.Security.SuppressUnmanagedCodeSecurityAttribute::.ctor()
   = ( 01 00 00 00 )
// Embedded native code
// Disassembly of native methods is not supported.
// Managed TargetRVA = 0x000024A2
} // end of method 'Global Functions'::CloseHandle

This is identified as a native method with x86 code embedded at an RVA of 0x24a2. To get a better understanding of what this code represents I searched through the MethodDef table for this assembly to find the address of the next method stored in the .text section after 0x24a2; this method is at 0x24a8. This means that the code for CloseHandle is just 6 bytes in size (0x24a8 - 0x24a2). Here are those bytes:

ff 25 38 30 00 10    JMP [0x10003038]

As you can see, this is x86 for a jump to other code. Inspecting the output from dumpbin /imports again shows the following:

KERNEL32.dll
   10003000 Import Address Table
   10006720 Import Name Table
          0 time date stamp
          0 Index of first forwarder reference

The IAT starts at 0x10003000, so the jump is clearly through an address in the IAT. Platform invoke uses data in the ImplMap table, which is a managed equivalent of the IAT. Therefore, this is not platform invoke, but it has the same effect: it is used to access a function in an unmanaged DLL.
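
The decoding above can be sketched in a few lines of standard C++. This is only an illustration, not code from the analysis tool: the helper names decodeIndirectJmp and offsetIntoIat are hypothetical, and the sketch assumes a 32-bit image loaded at its preferred base address, so that the absolute address in the instruction can be compared directly with the address that dumpbin reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode a 6-byte x86 "JMP [absolute address]" thunk: opcode bytes FF 25
// followed by a 32-bit little-endian address. On success, 'slot' receives
// the address of the pointer that the jump reads. Returns false if the
// bytes are not such a thunk. (Hypothetical helper, for illustration only.)
bool decodeIndirectJmp(const std::uint8_t* code, std::size_t size,
                       std::uint32_t& slot)
{
    if (size < 6 || code[0] != 0xFF || code[1] != 0x25)
        return false;
    std::uint32_t address = 0;
    for (int i = 0; i < 4; ++i)
        address |= static_cast<std::uint32_t>(code[2 + i]) << (8 * i);
    slot = address;
    return true;
}

// Byte offset of a slot from the start of the IAT reported by dumpbin.
// Assumes the slot is not below the start of the table.
std::uint32_t offsetIntoIat(std::uint32_t slot, std::uint32_t iatStart)
{
    assert(slot >= iatStart);
    return slot - iatStart;
}
```

Given the six bytes shown above (ff 25 38 30 00 10), decodeIndirectJmp yields 0x10003038, and offsetIntoIat with the IAT start of 0x10003000 gives 0x38, that is, a slot 56 bytes into the table.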

Note that although the x86 code (the JMP) is embedded it does not mean that the runtime will call it directly. When the runtime sees a call to CloseHandle it will see the pinvokeimpl and native unmanaged attributes on the method and then it will set up the necessary code to switch over to the unmanaged world before running the x86 code.

In effect, the 'native' methods I have identified in the results are the same as platform invoke methods. An analysis of the results shows that the majority of the 95 methods reported as being native for 2.0.50727 are methods implemented in DLLs. A few of these methods contain embedded native code other than a jump through the IAT. Each of the files that contain native methods has the embedded native methods _getFiberPtrId and __security_init_cookie. In addition, ISymWrapper also has new and delete, which indicates that this assembly uses an unmanaged C++ heap. The System.Data assembly has twenty methods with the prefix SNI, which are SQL Server Network Interface methods, although these do not appear in the IAT. This assembly also has a class called SqlDependencyProcessDispatcher which has three native methods. Finally, the System.EnterpriseServices.Wrapper module's native methods are used to access the COM+ API and various COM APIs through IAT imported methods, and it also has three classes, InitializeSpy, Thunk and TransactionStatus, that have native methods.

It is possible to determine within code whether a 'native' method in one of the managed C++ created files is embedded x86 or is a call through the IAT; however, I decided that I did not want to spend any more time doing this. Instead, from the analysis above, you can see that the majority of these 'native' methods are actually calls to native code in DLLs, so I include them in the count of platform invoke methods. They make little difference to the values presented.

Analysis of Results

The table below gives the cumulative results from the five versions of the .NET framework examined. I have also included the results for WinFX 6.0.5070, so that you can compare the percentages of methods implemented in IL and of the IL methods called. I have not included the counts of methods, opcodes or method calls for this entry, since it makes no sense to compare absolute counts between WinFX and the .NET framework library.

Description                                        | 1.0.3705  | 1.1.4322  | 1.2.30703 | 2.0.50215 | 2.0.50727 | 6.0.5070
Total Number of Methods                            | 76772     | 92504     | 160131    | 166990    | 233068    | -
Total Number of Opcodes                            | 1687220   | 1912755   | 3406817   | 3662071   | 5029051   | -
IL Methods                                         | 92.6%     | 88.2%     | 94.6%     | 93.5%     | 92.8%     | 96.2%
Methods Implemented through COM or Platform Invoke | 6.15%     | 10.6%     | 4.81%     | 5.90%     | 6.74%     | 3.78%
Number of Method Calls                             | 277451    | 317267    | 606755    | 648607    | 812379    | -
Number of Calls to IL Methods                      | 95.5%     | 95.9%     | 97.1%     | 94.7%     | 94.6%     | 97.3%
Number of Calls to COM or Platform Invoke Methods  | 1.57%     | 1.58%     | 0.94%     | 1.06%     | 1.24%     | 1.01%

The first point to make is that the framework library was extended considerably going from v1.0 through v1.1, v1.2, to v2.0. The number of methods and opcodes in the framework tripled over this period. Remarkably, the average number of opcodes per method remained nearly constant over this whole period: approximately 21 to 22 opcodes per method. There are two big jumps in the size of the framework. Going from v1.1 to v1.2, there was an increase of 78% in the number of opcodes. The second big jump was going from the Vista Beta 1 version of the v2.0 framework to the RTM version, an increase of 37%. The other changes were fairly flat.
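
The ratios quoted above can be reproduced from the table with two small helpers. This is a sketch in standard C++; the function names are hypothetical and the inputs are simply the counts from the table.

```cpp
#include <cassert>

// Average number of opcodes per method for one framework version.
// (Hypothetical helper; both arguments are counts taken from the table.)
double opcodesPerMethod(long opcodes, long methods)
{
    assert(methods > 0);
    return static_cast<double>(opcodes) / static_cast<double>(methods);
}

// Percentage growth from an earlier count to a later count.
double growthPercent(long before, long after)
{
    assert(before > 0);
    return 100.0 * static_cast<double>(after - before)
                 / static_cast<double>(before);
}
```

For example, growthPercent(1912755, 3406817) gives about 78% for the opcode increase from v1.1 to v1.2, growthPercent(3662071, 5029051) gives about 37% for the increase from Vista Beta 1 to RTM, and opcodesPerMethod(5029051, 233068) gives about 21.6 for the v2.0 RTM.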

The interesting information comes when you look at the implementation of those methods and the types of methods that are called. The figures show that the percentage of methods implemented purely in IL is fairly constant, between 92% and 95%; the one value that does not follow this trend is v1.1, where 88% of the framework is implemented in IL. If that version is ignored, then the proportion of methods implemented in IL increases slightly to v1.2 and then decreases steadily to the v2.0 value.

The number of methods in the framework gives a measure of the available code, but it does not necessarily mean that those methods are called. Furthermore, although the majority of the methods for all versions are pure IL, those methods could simply call another method that is a Platform Invoke or COM Interop method. This is the reason why I analyzed the opcodes in the IL methods. From the table you can see that the number of method calls per method is also fairly constant, being a value between 3.4 and 3.9 calls per method.

Finally, the proportion of methods called that are IL follows a similar trend over the versions as for the proportion of methods that are implemented in IL: it shows a peak at v1.2 and then a steady drop to the RTM of v2.0.
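
The "peak at v1.2, then decline" observation can be checked mechanically against the two percentage rows of the table. The sketch below, in standard C++, uses hypothetical helper names and takes each series in the table's column order (v1.0, v1.1, v1.2, Vista Beta 1, v2.0 RTM); it is an illustration, not part of the analysis tool.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the largest value in a non-empty series (the first one on ties).
std::size_t peakIndex(const std::vector<double>& series)
{
    assert(!series.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < series.size(); ++i)
        if (series[i] > series[best])
            best = i;
    return best;
}

// True if every value after position 'from' is lower than its predecessor.
bool strictlyDecreasingAfter(const std::vector<double>& series,
                             std::size_t from)
{
    for (std::size_t i = from + 1; i < series.size(); ++i)
        if (series[i] >= series[i - 1])
            return false;
    return true;
}
```

With the IL Methods row (92.6, 88.2, 94.6, 93.5, 92.8) and the Calls to IL Methods row (95.5, 95.9, 97.1, 94.7, 94.6), peakIndex returns 2 (the v1.2 column) for both series, and strictlyDecreasingAfter(series, 2) is true for both.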

The following graph illustrates the results in a more convenient form (see note 2):

Finally, it is worth taking a look at the results for WinFX 6.0.5070. This framework is still only 96.2% implemented in IL, and 3.78% of the API are methods that are Platform Invoke or COM interop calls to unmanaged code. This is better than the equivalent build of the .NET framework, Vista Beta 1 (2.0.50215), where 93.5% of the framework is implemented as pure IL methods and 5.9% of methods are unmanaged. The proportion of method calls to pure IL methods is 97.3% and to unmanaged methods is 1.01%. Again, these are better than the values for the equivalent version of the .NET framework (94.7% and 1.06%, respectively).

Conclusions

There are few data points in this analysis and many more versions of the framework will need to be analyzed before a trend can be definitively established. However, tentative conclusions can be drawn. The results show that there appears to have been a big push from v1.0 to v1.2 to implement as much as possible of the framework in IL: v1.2 has proportionally more IL methods than either earlier version. However, it appears that the compulsion to strive for an IL-only framework stopped at v1.2, because later versions of the framework reversed the trend. This appears to agree with the theory I espoused earlier, that is, the .NET team strove to implement as much as possible of the .NET framework in IL, but after the PDC in 2003 they back-tracked on this requirement and now they no longer have any compulsion to make the framework IL-only. Since the PDC 2003 version of the framework (v1.2) is the peak in terms of the proportion of the framework implemented in IL, this backs up the assertion that I gave earlier that .NET in the PDC 2003 build of Longhorn is less of a wrapper around Win32 than the framework library in earlier versions of .NET. It also backs up my assertion that .NET in later builds (including the released version of .NET 2.0) is becoming more of a wrapper.

The one measurement that I have performed so far on WinFX gives good results compared to the general .NET framework. However, the results show that WinFX cannot be described as totally managed, because only 96% of the methods are implemented in IL and, of all the method calls that are made, 97% are to IL methods. WinFX is still dependent upon unmanaged code.

Downloads

The results in this article were taken using the following application. This is an unmanaged C++ application that uses the CorMetaDataDispenser COM object. The source code is not available. This tool is run on the command line and will work with all versions of the framework. The command syntax is:

analyser folder [version]

Where folder is mandatory and indicates the folder that will be searched for library assemblies. The optional version parameter is the version of the framework that you want to search for the framework assemblies. If you do not give this parameter then the most up to date version of .NET will be used. This parameter is useful if you have multiple versions of the framework on your machine.

Download not available at this time (I am adding more features).

Notes

1. The shell in Windows 95, and all versions of Windows since, is implemented using COM-like interfaces. To extend the shell with namespace extensions, context menu handlers and the like, you have to write classes with vtable-based interfaces that are exported from DLLs through a class factory. However, these objects are not COM objects. The reason is that they do not run in a COM apartment and they are not created through COM instantiation. There are great advantages in using objects with interfaces, which explains why the shell uses this paradigm; however, it is not immediately obvious why the shell team decided not to use COM instantiation. COM is essentially a mechanism to manage DLLs: objects are instantiated through their CLSID - a unique 128-bit number that is essentially a class name - and the registry is used to locate the DLL that houses the COM class with a specified CLSID. COM instantiation removes the dependence on the PATH environment variable and the System32 folder, which were the major causes of DLL Hell. The reason why COM instantiation was not used by the Windows 95 shell was its memory usage (remember this, you might hear something similar about Vista). Windows 95 had to run a large amount of code on underpowered machines, so each byte of memory that could be saved was important to the shell team.

2. Note that these lines are not drawn to scale and are offset so that you can compare the shapes.

   

(c) 2006 Richard Grimes, all rights reserved