.NET Fusion

3. Strong Named Assemblies

When an assembly is given a strong name, Fusion can perform version checking and the assembly can be located in a folder other than the application folder. The term 'strong name' means that the name is unique. If two users each create an assembly with the same short name, the runtime cannot distinguish between them. Indeed, you can overwrite one with the other and the runtime will not complain. If the two users give their assemblies strong names, then these assemblies will have unique names and the runtime will be able to determine whether the assembly it is attempting to load is the correct one. To create a strong name you must have a public/private key pair. This pair contains two keys: one is deliberately made public and the other is deliberately kept private. The public key is used in the assembly name, and no two users should have the same public key. To ensure that the public key is the key that the original publisher used, the key embedded in the assembly is validated before the assembly is loaded.

To generate a key pair you can use the sn utility. Note that you should generate the pair only once and use that key pair for all assemblies in the future. You should never give access to your key pair to anyone other than trusted personnel.

3.1 Giving an Assembly a Strong Name

Use the process and library assembly files that you developed on the previous page.

Generate a key pair with the -k switch:

sn -k key.snk

The key pair should be kept secure because it contains your private key. If you wish, you can install the key pair from the file into a cryptographic container using the -i switch:

sn -i key.snk KEYPAIR

Here, KEYPAIR is the name of the container. To give an assembly a strong name you have to give the name of the key file in one of your source files using the [AssemblyKeyFile] attribute or the name of the container using the [AssemblyKeyName] attribute.

.NET Version 3.0
These attributes are deprecated in version 8.0 of the C# compiler (provided with .NET version 3.0/2.0). The reason is that the compiler in earlier versions of the framework would embed the information in the attributes in the assembly. This means that information about the key file name, or key container, will be available to anyone that has access to the assembly. Microsoft consider this a potential leak of sensitive information and so they give a warning if you use either of these attributes. The recommendation now is that you should use the /keyfile or the /keycontainer command line switches (or equivalent project settings in Visual Studio 2005).

This indicates to me a lack of level headed thinking in Microsoft. As I'll explain in a moment, there was never any need for these attributes to appear in the assembly because they are merely messages to the compiler. Microsoft should have altered its compilers to make sure that these attributes were not added to the assembly, rather than the current solution of telling you not to use them.

Add the following line to the source code for the library for all versions of the runtime (ignore the warning that you get with version 3.0/2.0 of the runtime):

[assembly:AssemblyKeyFile("key.snk")]

Compile both the library and the process.

Adding a strong name to your assembly positively ties the assembly to the owner of the key pair, but it does not identify the publisher. This means that two assemblies with the same public key will be from the same publisher, but you do not know who that is. Positive identification of the publisher is achieved by signing the assembly with an X.509 certificate. In spite of this, an assembly with a strong name will have more trust than an assembly without a strong name but this trust would be weakened if it were able to call an assembly without a strong name. So, when you have given an assembly a strong name it can only call assemblies which also have strong names.

Now use ILDASM to view the MANIFEST of the library (ildasm /text /item:lib lib.dll). You will see that a new entry has been added to the metadata for the library.

.assembly lib
{
   .custom instance void [mscorlib]System.Reflection.AssemblyKeyFileAttribute::.ctor(string) =
      ( 01 00 07 6B 65 79 2E 73 6E 6B 00 00 ) // ...key.snk..
  .publickey = (/* data omitted */)
  .hash algorithm 0x00008004
  .ver 1:0:0:0
}

The first thing to notice is that the compiler has added a custom attribute. There is no reason whatsoever for the compiler to do this. This custom attribute is not used by the runtime, and it is not used by user code. The [AssemblyKeyFile] attribute in your code is a message to the compiler telling it to sign the resultant assembly with the key in the specified file. This is a bug in the compiler that has persisted over all versions of .NET. As I mentioned above, Microsoft have recognised this, but instead of fixing the compiler so that it does not add this custom attribute to the outputted assembly, their 'solution' is to issue a warning telling you to use the command line switches instead. In my opinion this is a pretty braindead response to this issue.

When the compiler sees this attribute it loads the key file and extracts the public key. It then creates a hash of the public key and uses the last eight bytes to create a Public Key Token. This token becomes part of the assembly name. Since eight bytes (64 bits) are used, there are 2^64, or about 1.8 x 10^19, different combinations. The cryptographic hash routine makes it extremely unlikely that two public keys will produce the same token, and the random number routine used by the sn utility makes it extremely unlikely that two runs of the utility will create the same key pair. In other words, using a public key token in the strong name makes the name unique for all practical purposes.
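The token derivation can be sketched in a few lines of C#. This is only an illustration, not the compiler's own code. It assumes that the hash is SHA-1 and that the last eight bytes are written out in reverse order (the reversal is a detail not stated above). The class and method names are hypothetical, and the core library's own public key is used simply because it is always available as sample input.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;

public static class PublicKeyTokenSketch
{
    // Hypothetical helper: derive an 8-byte token from a full public key.
    public static byte[] ComputeToken(byte[] publicKey)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(publicKey);   // 20 bytes for SHA-1
            byte[] token = new byte[8];
            for (int i = 0; i < token.Length; i++)
            {
                // Take the last eight bytes of the hash, last byte first.
                token[i] = hash[hash.Length - 1 - i];
            }
            return token;
        }
    }

    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }

    public static void Main()
    {
        // Sample input: the public key of the assembly that defines System.Object.
        AssemblyName name = typeof(object).Assembly.GetName();
        Console.WriteLine("computed: " + ToHex(ComputeToken(name.GetPublicKey())));
        Console.WriteLine("reported: " + ToHex(name.GetPublicKeyToken()));
    }
}
```

If the assumptions hold, the two lines printed should show the same sixteen hex digits.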

However, the public key is public, so a cracker could simply obtain your public key from somewhere and then use it to name their own assembly. Code access security permissions can be based on the strong name, so if the strong name can be compromised then it means that code access security is broken. Recognising this potential issue, Microsoft ensures that a strong name is validated before the assembly is used. To validate the strong name Microsoft must be able to validate the public key that was used to create the public key token.

A public-private key pair is unique: data encrypted with one of the pair can only be decrypted with the other key. (The terms 'public' and 'private' are subjective; both can be used for encryption and decryption, and the only difference is that you deliberately make one public and deliberately keep the other private.) So, to validate a public key, you can use it to decrypt some known value that has been encrypted with the private key. The 'known value' is the hash of the assembly. At compile time, the compiler creates a hash of the file (excluding some locations, like the place where the hash will be stored later) and then it encrypts, or signs, the hash with the private key. This signed hash is stored in the assembly file along with the public key. At runtime Fusion can extract the public key and the signed hash. It uses the public key to decrypt the signed hash to get the hash generated when the assembly was created. It can then recalculate the hash of the assembly and compare the two hashes. If they agree then the public key is validated. If they disagree then one of the following can be deduced:

  • the public key does not correspond to the private key used to sign the hash
  • the hash of the assembly has changed since the assembly was created, that is, the assembly has been tampered with

It does not matter which is the case, because the .NET runtime will treat the assembly as being suspect and so it will not load it.
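The sign-then-validate idea can be tried out with the RSA classes in the framework library. This is a sketch of the principle only: it does not read or write the real strong name signature format, it uses SHA-256 rather than whatever hash the runtime uses, it needs a newer framework than the ones discussed on this page, and the byte array merely stands in for the contents of an assembly file. The class and method names are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SignedHashSketch
{
    // Sign with the private key. SignData hashes the content and signs the hash.
    public static byte[] Sign(RSA keyPair, byte[] content)
    {
        return keyPair.SignData(content, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
    }

    // Validate using only the public half of the key pair.
    public static bool Verify(RSAParameters publicKey, byte[] content, byte[] signature)
    {
        using (RSA rsa = RSA.Create())
        {
            rsa.ImportParameters(publicKey);
            return rsa.VerifyData(content, signature, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
        }
    }

    public static void Main()
    {
        byte[] content = Encoding.UTF8.GetBytes("stand-in for the assembly file");
        using (RSA keyPair = RSA.Create())
        {
            byte[] signature = Sign(keyPair, content);
            RSAParameters publicOnly = keyPair.ExportParameters(false);

            Console.WriteLine("untouched: " + Verify(publicOnly, content, signature));

            content[0] ^= 0xFF;   // simulate tampering with the file
            Console.WriteLine("tampered:  " + Verify(publicOnly, content, signature));
        }
    }
}
```

Verification of the untouched content should succeed and verification after the change should fail, which is the same accept or reject decision that Fusion makes.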

Going back to the output from ILDASM: the .publickey entry contains the public key. The signed hash (the strong name signature) is not shown in the manifest, but its location in the file can be determined using DUMPBIN:

dumpbin /clrheader lib.dll

Here is an edited example:

Dump of file lib.dll
File Type: DLL
  clr Header:
    48 cb
  2.00 runtime version
  210C [ 374] RVA [size] of MetaData Directory
     9 flags
     0 entry point token
     0 [ 0] RVA [size] of Resources Directory
  2050 [ 80] RVA [size] of StrongNameSignature Directory
     0 [ 0] RVA [size] of CodeManagerTable Directory
     0 [ 0] RVA [size] of VTableFixups Directory
     0 [ 0] RVA [size] of ExportAddressTableJumps Directory

The strong name signature is the 128 bytes stored at a location (relative virtual address) of 0x2050.
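The same two numbers can be read from code. The sketch below assumes that the System.Reflection.PortableExecutable types are available (they are part of the System.Reflection.Metadata library, which is not included in the framework versions discussed on this page). The class and method names are hypothetical, and if no path is given the program simply inspects the core library.

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

public static class ClrHeaderSketch
{
    // Report where the strong name signature lives, as dumpbin /clrheader does.
    public static string DescribeSignature(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (PEReader reader = new PEReader(stream))
        {
            CorHeader clr = reader.PEHeaders.CorHeader;
            if (clr == null)
            {
                return "not a managed assembly";
            }

            DirectoryEntry signature = clr.StrongNameSignatureDirectory;
            bool signedFlag = (clr.Flags & CorFlags.StrongNameSigned) != 0;

            return string.Format("RVA 0x{0:X}, size {1} bytes, signed flag {2}",
                signature.RelativeVirtualAddress, signature.Size, signedFlag);
        }
    }

    public static void Main(string[] args)
    {
        // Pass the path to lib.dll, or omit it to inspect the core library.
        string path = args.Length > 0 ? args[0] : typeof(object).Assembly.Location;
        Console.WriteLine(DescribeSignature(path));
    }
}
```

For the library in the dump above I would expect this to report an RVA of 0x2050 and a size of 128 bytes.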

Now that the assembly has a strong name you can store the assembly in a folder that is not under the application's folder. The reason you can do this is because the strong name gives the assembly a unique name. Without a strong name Fusion cannot positively identify an assembly and so it could pick up the wrong file. To prevent this happening, assemblies that do not have a strong name will always be private assemblies.

3.2 Using DEVPATH

This test will store the assembly in a folder identified by the DEVPATH environment variable. To use this environment variable you have to do three things: first, the library must have a strong name; second, you need to add a <developmentMode> element to the machine.config file; and finally, you have to set the DEVPATH environment variable.

As the name suggests, this should only be used for development; you should never use it for release code. Since the <developmentMode> element is added to the machine.config file it is global to your machine, so the DEVPATH will be checked by all applications.

The DEVPATH specifies a folder that can be anywhere on your hard disk, and Fusion treats it much as the LoadLibrary API in Win32 code treats the PATH environment variable; that is, it specifies a folder shared by all applications. For example, create a folder in the root of the drive called bin, then move lib.dll to that folder.

The machine.config file is stored in the following folder:

%systemroot%\Microsoft.NET\Framework\v1.1.4322\CONFIG

Here v1.1.4322 is the version of the framework being used. Open the machine.config file from this folder and add the <developmentMode> element under the <configuration> node.

<runtime>
   <developmentMode developerInstallation="true"/>
</runtime>

Make sure that you save the file. At the command line set the DEVPATH environment variable:

set DEVPATH=c:\bin\

(The last character must be a backslash!) Now run the application and confirm that the library has been loaded from the shared folder. After this test remove the <developmentMode> element.

During the Whidbey (.NET 2.0) beta period Microsoft announced that DEVPATH will be deprecated in future versions of .NET. However, you can still use it in .NET version 2.0. In spite of this, I urge you not to use this facility.

3.3 Codebase with an Absolute Path

Use app.cs, lib.cs, key.snk and app.exe.config from earlier on this page.

In the last section you changed the codeBase through the configuration tool. I said then that the path had to be to a folder under the application folder. Now that the assembly has a strong name it means that the file can be loaded from a location outside of the application folder.

From the last example, the library should be in a folder called bin in the root folder (if it isn't, move it there now). Now open the configuration tool, add app.exe and use the instructions in Example 2.2 to add a codeBase to the assembly. Note that this time the library assembly will have a public key token in the Choose Assembly From Dependent Assemblies dialog. On the Codebases dialog enter 1.0.0.0 for the version and file:///C:/bin/lib.dll for the URI. This time the version is required. Now you can run the application to confirm that the library is being picked up from the requested folder.

Open the configuration file and note that the configuration tool has added the following lines in a <dependentAssembly> node:

<assemblyIdentity name="lib" publicKeyToken="70743f5fc0978ac0"/>
<codeBase version="1.0.0.0" href="file:///C:/bin/lib.dll"/>

The publicKeyToken uniquely identifies the assembly (of course, the value that you'll see will be a different value!).

If you want to obtain the public key token from your assembly you can use the sn utility. Type the following at the command line:

sn -T \bin\lib.dll
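You can also read the token from code. The following is a small sketch that uses AssemblyName.GetAssemblyName, which reads the name from the file without loading the assembly into the process. The class and method names are hypothetical, and if no path is given it falls back to the core library just so that there is something to print.

```csharp
using System;
using System.Reflection;
using System.Text;

public static class TokenReaderSketch
{
    // Format the token bytes the way they appear in a configuration file.
    public static string FormatToken(byte[] token)
    {
        if (token == null || token.Length == 0)
        {
            return "(no strong name)";
        }

        StringBuilder hex = new StringBuilder(token.Length * 2);
        foreach (byte b in token)
        {
            hex.Append(b.ToString("x2"));
        }
        return hex.ToString();
    }

    public static void Main(string[] args)
    {
        // Pass the path to the library, for example \bin\lib.dll.
        string path = args.Length > 0 ? args[0] : typeof(object).Assembly.Location;

        AssemblyName name = AssemblyName.GetAssemblyName(path);
        Console.WriteLine(name.FullName);
        Console.WriteLine("Public key token is " + FormatToken(name.GetPublicKeyToken()));
    }
}
```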

3.4 Loading an Assembly from the Internet

When an assembly has a strong name it means that it can be loaded from anywhere on the local intranet or even from the internet. This example will simulate this behaviour using IIS installed on the local machine. (If you have XP Home you will not have access to IIS, and if you have XP Pro you may find that you have to install IIS before you can follow this example.)

Create a folder under the IIS folder:

md c:\Inetpub\wwwroot\bin

Next move the assembly to this location:

move \bin\lib.dll c:\Inetpub\wwwroot\bin

Now change the configuration file so that the codeBase points to this location through HTTP. Use the configuration tool to change the URI to

http://localhost/bin/lib.dll

Run the process and confirm that the library has been loaded - notice that there is a slight delay as the library is 'downloaded'. (If Fusion cannot find the file, it may be because IIS has not been started; to start it, type net start w3svc at the command line.)

When you specify that a library should be downloaded through an internet protocol in the href attribute Fusion will download the assembly to a special sandboxed folder. There are several ways to view this folder. The first is to use the gacutil tool on the command line:

C:\TestFolder>gacutil -ldl

Microsoft (R) .NET Global Assembly Cache Utility. Version 1.1.4322
Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.
The cache of downloaded files contains the following entries:
lib, Version=1.0.0.0, Culture=neutral,
PublicKeyToken=70743f5fc0978ac0, Custom=null
Number of items = 1

(Note that you can clear the download cache with gacutil -cdl.) The other way to view the download cache is to use the Fusion namespace extension in Windows Explorer. To do this start Windows Explorer and navigate to %windir%\assembly\Download (for example C:\Windows\assembly\Download). You will see the downloaded library listed there.

The log for IIS (the most recent file in %systemroot%\system32\Logfiles\W3SVC1) will have a line like the following:

11:35:17 127.0.0.1 GET /bin/lib.dll 200

Close the log file. Now run the process again. Notice that the process starts quicker than before. Take another look at the IIS log file (reload it) and confirm that the library was not downloaded again. For example:

11:36:45 127.0.0.1 GET /bin/lib.dll 304

(A value of 304 in the HTTP protocol means Not Modified, that is, the file being requested has not been modified since it was last downloaded, so the server is telling the client to use its cached value. This means that although the file is not downloaded again, the server is still consulted, and so this application can only be run if there is access to the server.) However, note that the Assembly.CodeBase property printed on the command line still indicates that the library was downloaded.
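If you want to see both where an assembly was requested from and where the file that was actually loaded sits on disk, you can print the CodeBase and Location properties side by side. This is a minimal sketch with hypothetical names. It inspects the core library so that it runs on its own; in the workshop code you would pass typeof(LibraryCode).Assembly instead. Newer runtimes mark CodeBase as obsolete, hence the pragma.

```csharp
using System;
using System.Reflection;

public static class LoadedFromSketch
{
    public static string Describe(Assembly assembly)
    {
#pragma warning disable SYSLIB0012 // CodeBase is marked obsolete on newer runtimes
        string requestedFrom = assembly.CodeBase;
#pragma warning restore SYSLIB0012

        return "CodeBase: " + requestedFrom + Environment.NewLine +
               "Location: " + assembly.Location;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(typeof(object).Assembly));
    }
}
```

For a library loaded through an http: codeBase I would expect CodeBase to show the URL and Location to show a path in the download cache, but treat that as an expectation to check rather than a promise.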

You will learn in a later example that the GAC is implemented using a specific folder structure on your hard disk. The view that you see in Windows Explorer is actually generated by a shell namespace extension. The Download folder may appear to be a folder under the GAC, but this is yet another subterfuge of the shell namespace extension. To find the location of the real sandboxed folder you need to look in the local settings' application data folder for your account (not the APPDATA folder). At the command line type:

pushd %userprofile%
cd "Local Settings\Application Data"
dir /s lib.dll

This should give you results that look something like this:

C:\Documents and Settings\RichardGrimes\Local Settings\Application Data>dir /s lib.dll
Volume in drive C is Main
Volume Serial Number is A856-82BC

Directory of C:\Documents and Settings\RichardGrimes\Local Settings\Application Data\assembly\dl2\H1XG8483.OCT\AD0BNDW1.VQR\503b66f9\80eddb64_b8f0c401

02/01/2005 10:54 3,584 lib.dll
1 File(s) 3,584 bytes

Total Files Listed:
1 File(s) 3,584 bytes
0 Dir(s) 1,270,640,640 bytes free

It's not clear what the various parts of the path mean, and there is no documentation.

.NET Version 3.0
Version 3.0/2.0 of the runtime also has a folder structure that uses cryptic names. However, the folder under assembly is now named dl3; clearly this indicates that it is the third version of the download folder.

Try the following test. Move back to the development folder (popd) and use the configuration tool to change the URI to:

http://127.0.0.1/bin/lib.dll

Run the process. Next change the URI to:

http://MY_MACHINE/bin/lib.dll

Here MY_MACHINE is the name of your machine (to get this type echo %COMPUTERNAME% at the command line). Again, run the process. What you have done is downloaded the library three times. Yes, it is the same library, and yes, the URIs refer to the same location, but they are different addresses and so Fusion downloads the library for each address. Now move back to the local settings application folder and list the copies of lib.dll in the download folder. These are the results I get (showing just the path below the dl2 folder):

H1XG8483.OCT\AD0BNDW1.VQR\0609c5bc\80eddb64_b8f0c401
H1XG8483.OCT\AD0BNDW1.VQR\503b66f9\80eddb64_b8f0c401
H1XG8483.OCT\AD0BNDW1.VQR\d512bf34\80eddb64_b8f0c401

The only bit that changes is the third folder in the path; this is clearly derived from the URI, and perhaps it is a hash. If you do a similar experiment with two different assemblies downloaded from the same URI folder you'll find that the part of the path that changes is the last folder in the path, so this part must be derived from the assembly's full name.

Clean up by deleting the assembly from the IIS folder (del \Inetpub\wwwroot\bin\*.*) and remove the folder (rmdir \Inetpub\wwwroot\bin). Next move back to the development folder (popd) and clear the download cache with gacutil -cdl.

3.5 Versioning of Strong Name Assemblies

When an assembly has a strong name it means that it has a complete name and so the runtime can perform versioning. To illustrate this follow these steps. First remove any <dependentAssembly> nodes from the configuration file (the easiest way to do this is with notepad rather than with the configuration tool). Run the process to confirm that the runtime has loaded the version of the assembly specified in the MANIFEST of the process. Next, change the library so that it has a different version:

[assembly:AssemblyVersion("1.1.0.0")]

Compile only the library (csc /t:library lib.cs), do not compile the process, instead, use the version that is already there that uses version 1.0.0.0 of the library. Now run the process. You will get an exception (if the Visual Studio Just-In-Time Debugging dialog shows, just click on the No button) and the library will not be loaded:

Unhandled Exception: System.IO.FileLoadException: The located assembly's manifest definition with name 'lib' does not match the assembly reference.
File name: "lib"
   at App.Main()
 

.NET Version 3.0
Version 3.0/2.0 of the runtime is more explicit. The exception description gives the full name of the assembly that could not be loaded.

This indicates, in effect, that the manifest of the process says that the process requires version 1.0.0.0 and this version cannot be found. It is actually better to view this exception like this: the manifest of the process requires an assembly with this name:

lib, Version=1.0.0.0, Culture=neutral, PublicKeyToken=70743f5fc0978ac0

The only library assembly in the application folder has this name:

lib, Version=1.1.0.0, Culture=neutral, PublicKeyToken=70743f5fc0978ac0

Clearly these names are different, and this is why the exception that is thrown is FileLoadException.
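You can make the same comparison in code by parsing the two display names with the AssemblyName class. The sketch below is only an illustration with hypothetical names. It compares the name, version and public key token and ignores the culture, and it says nothing about how Fusion itself performs the check.

```csharp
using System;
using System.Reflection;

public static class NameMismatchSketch
{
    private static string ToHex(byte[] token)
    {
        return BitConverter.ToString(token ?? new byte[0]);
    }

    // Return the first part of the name that differs, or null if they match.
    public static string FirstDifference(AssemblyName requested, AssemblyName found)
    {
        if (!string.Equals(requested.Name, found.Name, StringComparison.OrdinalIgnoreCase))
        {
            return "Name";
        }
        if (!object.Equals(requested.Version, found.Version))
        {
            return "Version";
        }
        if (ToHex(requested.GetPublicKeyToken()) != ToHex(found.GetPublicKeyToken()))
        {
            return "PublicKeyToken";
        }
        return null;
    }

    public static void Main()
    {
        AssemblyName requested = new AssemblyName(
            "lib, Version=1.0.0.0, Culture=neutral, PublicKeyToken=70743f5fc0978ac0");
        AssemblyName found = new AssemblyName(
            "lib, Version=1.1.0.0, Culture=neutral, PublicKeyToken=70743f5fc0978ac0");

        Console.WriteLine("differs in: " + FirstDifference(requested, found));
    }
}
```

For the two names above, the part that differs is the version.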

3.6 Versioning Rebinding

Fusion allows you to change the way that versioning works. There are good reasons for this. You may have a process that uses version 1.0.0.0 of a library assembly, but after deploying the application you notice a bug in the library and fix it to create version 1.1.0.0. The fix does not add new functionality, nor does it change the public interface. You want to deploy the library and not a new version of the process. To do this you can provide a redirect setting in the configuration file.

In the configuration tool, view the properties of the lib configured assembly under the app.exe assembly. When the library's property dialog shows, click on the Binding Policy tab. The list view on this page allows you to specify a version and the version that it will be redirected to; in other words, if an assembly of Requested Version is mentioned in the process's manifest, load a library with the New Version instead.

Now run the process and confirm that the runtime has picked up the new version. Verify that the process assembly has not changed and that the manifest still requests the old version. It is important to point out that the library is still a private assembly.

This dialog will change the application's configuration file to add a <bindingRedirect> entry for the library. You can check this by opening the configuration file with notepad:

<dependentAssembly>
   <assemblyIdentity name="lib"
      publicKeyToken="70743f5fc0978ac0"/>
   <bindingRedirect oldVersion="1.0.0.0"
      newVersion="1.1.0.0"/>
</dependentAssembly>

You can specify a range of values in oldVersion in the form n.n.n.n - n.n.n.n. You can also add more than one <bindingRedirect> in a single <dependentAssembly> node.
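To make the range form concrete, here is a small sketch of how a redirect with a range could be applied to a requested version. The helper is hypothetical and is not Fusion's implementation. It assumes that the range is inclusive at both ends and that versions are compared part by part, which is how the Version class compares them.

```csharp
using System;

public static class RedirectSketch
{
    // oldVersion is either "n.n.n.n" or a range "n.n.n.n - n.n.n.n".
    public static Version Apply(Version requested, string oldVersion, string newVersion)
    {
        string[] bounds = oldVersion.Split('-');
        Version low = new Version(bounds[0].Trim());
        Version high = new Version(bounds[bounds.Length - 1].Trim());

        bool inRange = requested.CompareTo(low) >= 0 && requested.CompareTo(high) <= 0;

        // Outside the range the requested version is left alone.
        return inRange ? new Version(newVersion) : requested;
    }

    public static void Main()
    {
        Console.WriteLine(Apply(new Version("1.0.0.0"), "1.0.0.0 - 1.0.9.9", "1.1.0.0"));
        Console.WriteLine(Apply(new Version("2.0.0.0"), "1.0.0.0 - 1.0.9.9", "1.1.0.0"));
    }
}
```

The first request falls inside the range and is redirected to 1.1.0.0; the second is outside the range and stays at 2.0.0.0.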

3.7 Versioning Redirection and Codebase

If you have a strong named assembly it means that you can load different versions from different locations using <codeBase>. To test this, create a folder in the root folder of your hard disk called bin and under this create two folders, one called 1.0 and another called 1.1. Compile the library two times. The first time change the version to 1.0.0.0 and then move it to \bin\1.0 and the second time change the version to 1.1.0.0 and move it to \bin\1.1. (Do not recompile the process, so that it still requires version 1.0.0.0 of the library.)

Use the configuration tool to remove any binding redirect that may have been added in the last example. Now add two codeBase entries so that version 1.0.0.0 is picked up from file:///C:/bin/1.0/lib.dll and 1.1.0.0 is picked up from file:///C:/bin/1.1/lib.dll. In effect, you are adding the following lines to the <dependentAssembly> node for the library:

<codeBase version="1.0.0.0" href="file:///C:/bin/1.0/lib.dll"/>
<codeBase version="1.1.0.0" href="file:///C:/bin/1.1/lib.dll"/>

The process requires version 1.0.0.0 so run the process and confirm that it picks up version 1.0.0.0 from c:\bin\1.0. Now use the configuration tool to add a binding redirect from version 1.0.0.0 to version 1.1.0.0. Confirm that the following line was added to the configuration file:

<bindingRedirect oldVersion="1.0.0.0" newVersion="1.1.0.0"/>

Run the process again and confirm that it now picks up version 1.1.0.0 of the library from c:\bin\1.1, in other words, the binding redirect has been performed first, to get a new version, and then the codeBase is applied to locate the library.
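The order of the two steps can be modelled with a pair of lookup tables. This is a toy model with hypothetical names, not the binder's real logic: the redirect table and the codeBase table simply mirror the entries in the configuration file above.

```csharp
using System;
using System.Collections.Generic;

public static class ResolveOrderSketch
{
    public static string Resolve(
        string requestedVersion,
        IDictionary<string, string> redirects,
        IDictionary<string, string> codeBases)
    {
        string version = requestedVersion;

        // Step 1: apply any binding redirect to get the version to load.
        string redirected;
        if (redirects.TryGetValue(version, out redirected))
        {
            version = redirected;
        }

        // Step 2: use the codeBase entry for that version to locate the file.
        string href;
        return codeBases.TryGetValue(version, out href) ? href : null;
    }

    public static void Main()
    {
        Dictionary<string, string> codeBases = new Dictionary<string, string>();
        codeBases["1.0.0.0"] = "file:///C:/bin/1.0/lib.dll";
        codeBases["1.1.0.0"] = "file:///C:/bin/1.1/lib.dll";

        Dictionary<string, string> redirects = new Dictionary<string, string>();
        Console.WriteLine(Resolve("1.0.0.0", redirects, codeBases));

        redirects["1.0.0.0"] = "1.1.0.0";
        Console.WriteLine(Resolve("1.0.0.0", redirects, codeBases));
    }
}
```

Without the redirect the request resolves to the 1.0 folder; with it, the same request resolves to the 1.1 folder.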

Clean up your work by removing the C:\bin folder and its contents.

3.8 Delay Signing

If two assemblies have been signed with the same key then it means that they come from the same publisher. If a user has confidence in one of these assemblies they will have the same confidence with the second one. Further, code access security can use StrongNameIdentityPermission to make sure that only assemblies with the specified key will be loaded. For these reasons you should be careful about who has access to the key pair. If a rogue employee gains access to the key pair she could sign an assembly that she's written to do something nasty and then distribute this assembly and users will use this assembly because they have confidence in any assembly with the key.

To prevent this from happening only trusted personnel should have access to the key pair. However, in practical terms this means that the assembly can only be signed once all development has completed, and this will mean that assembly versioning cannot be used, nor can you place the assembly in the GAC. If this were the case it would mean that the application would have to have private assemblies during development and while this is a valid pattern it might not be the pattern that you've designed.

.NET allows you to perform partial signing (also known as delay signing). The idea is that the public key is placed in the assembly, which means that the assembly has a 'strong name' and you can perform versioning.

Using the last example (app.cs, lib.cs and key.snk), delete the config file so that you can start afresh. The first action is to extract the public key from the key pair. To do this use the -p switch of the sn utility:

sn -p key.snk pkey.snk

Next edit the library, the relevant lines are identified below:

[assembly:AssemblyVersion("1.0.0.0")]
[assembly:AssemblyKeyFile("pkey.snk")]
[assembly:AssemblyDelaySign(true)]

Build the entire project.

.NET Version 3.0
The C# compiler version 8.0 will issue a warning about the [AssemblyDelaySign] attribute. It indicates that you should use the /delaysign command line switch instead. Again, the reason is that the C# compiler adds a .custom metadata item in the library assembly. There is no need for the compiler to do this. The [AssemblyDelaySign] attribute is merely a message from the source code to the compiler telling it to partially sign the assembly, it has no effect at all on the runtime, nor on the user code.

Confirm that the library has a public key using:

sn -T lib.dll

Now run the application. You find that an exception will be thrown:

Unhandled Exception: System.IO.FileLoadException: Strong name validation failed
for assembly 'lib'.
File name: "lib"
   at App.Main()

What is happening here is that the assembly loader attempts to validate the file. Normally, when an assembly is signed, a hash of the assembly is generated and encrypted with the private key, and the public key and the signed hash are put in the assembly manifest. When the assembly is loaded, the signed hash can be decrypted with the public key and then a hash can be performed on the assembly. These two hashes can be compared, and if they are different it means that the assembly has been tampered with or the keys do not match. In our situation the file has a public key but does not have the signed hash (however, delay signing does allocate space for the signed hash, which you can see by running dumpbin /clrheader lib.dll and looking at the StrongNameSignature).

Note that although strong name validation helps to identify whether an assembly has been tampered with, you should not rely on this mechanism. There is a bug in Fusion in version 1.1 of the framework which means that a cracker can alter the signed hash in a specific way to turn off validation of the assembly's hash, and so Fusion will load the tampered assembly. The public key remains, so the file is still strong named and can be uniquely identified. For details on how to do this see this page.

The public key indicates that the library has been signed so the hash should be checked, but the 'signed hash' has not been initialized so any check on this will fail. Since the check fails, the assembly is not loaded and hence the FileLoadException is thrown.

What you need to do is to tell the assembly loader not to check the hash and to do this you need to use the sn utility:

sn -Vr lib.dll

Now you can run the application and you'll find that the assembly will be loaded. Once you have turned off verification for a strong named assembly (sn says 'Verification entry added') you can put it in the GAC (we will cover shared assemblies and the GAC on the next page) and verification will still be disabled - you do not have to use -Vr again. Try this: add the assembly to the GAC (gacutil -i lib.dll) and delete the local copy (del lib.dll); run the process and you'll see that the library is being picked up from the GAC because the path in the CodeBase is now to a folder under %windir%\assembly. (Now clean up: remove the library from the GAC, gacutil -u lib, and build the project again so that you have a local copy of the library.)

When all development has been completed a trusted person in the company can obtain the private key and use it to re-sign the assemblies. This does not compile an assembly, all it does is calculate the hash for the assembly, sign it, and then place the signed hash in the space allocated for it. To do this you can use the -R switch of sn:

sn -R lib.dll key.snk

Now you can turn on verification using -Vu:

sn -Vu lib.dll

(the utility indicates that the 'verification entry' has been 'unregistered') and then run the application. Now you'll find that the library is loaded as expected.

Clean up the example by removing the [AssemblyDelaySign] attribute and change the [AssemblyKeyFile] attribute:

[assembly:AssemblyKeyFile("key.snk")]

3.9 Obfuscation of Strong Named Assemblies

.NET assemblies contain metadata. This is information that describes the types in the assembly. .NET metadata describe types completely, so there is no need for an equivalent of C++ header files or import libraries in .NET. .NET compilers create intermediate language and at runtime this is just-in-time compiled to native machine code which is then executed. Intermediate language is 'higher level' than machine code and so it is possible to determine how an algorithm works by viewing its assembly with ILDASM. Worse than this, there are tools available that will convert IL into high level languages like C#, VB.NET and Managed C++. Two notable decompilers are Anakrino and my current favourite, Reflector.

There is much debate about whether decompilers are an issue or not. My personal opinion is that I would not have been able to learn so much about the .NET framework if I had not been able to decompile the framework classes. I owe a lot to Reflector. So, if I have access to your assemblies I will be able to learn a lot about your code. Do you care? Perhaps not, perhaps you don't have a secret algorithm that you want to protect. Even so, I could still make comments about your coding style. I could gain information about your competence as a .NET developer by identifying inefficient sections of code; I could gain information about your use of, or lack of use of, .NET security. Furthermore, if you don't care about people decompiling your code, then why don't you simply publish your source code for free on the web (after stripping comments, of course)? If you are reticent about publishing your source code, then you should also be reticent about making your assemblies available.

Here's a story that will interest you. Recently, I saw a free .NET utility on the internet which I thought would be great as a PocketPC application. I got in touch with the author and asked him if he would convert it to the Compact Framework. He said that he had not done any PocketPC development. I offered to do the work, but because this was his first major .NET project he was reluctant to give me the source code - he was a little ashamed of the code. So I asked him if it was alright if I used Reflector to get the code. He agreed. I was then able to convert that code to the Compact Framework (this is not as easy as it may appear, because the Compact Framework is a cutdown version of the .NET framework and I had to write many classes that were missing in CF, or rather, Reflector extracted them from the desktop framework for me). More significantly, I was able to do a code review and presented the original author with a list of errors in his code. It was no different than if he had given me the original source code.

The bad news is that there is nothing that can be done about this. Someone who is determined to obtain your secret algorithm will be able to do that regardless of the steps that you take to prevent it. There is a possibility that when Microsoft applies Trusted Computing to a future version of Windows, Digital Rights Management will be able to protect the assembly from being decompiled. However, Trusted Computing has its issues, and there are arguments on both sides. Personally, I see the Trusted Computing project as a disabling technology rather than an enabling technology.

Currently, the only way that you can protect your code is to obfuscate it. Obfuscation does not prevent people from decompiling your code; it just makes the decompiled code more difficult to understand. Metadata for public members of public classes is important (in particular the names of members) because methods are imported into another assembly by name. However, names of internal types and private members are of no use at all, because the .NET runtime does not use these names. Information about members is stored in metadata tables, and the intermediate language accesses metadata in these tables through the index of the row in the table that contains the data. The runtime does not use the name.

Try this: edit the process assembly code that you have used in the last section so that it calls a new method:

using System;

class App
{
   static void Main()
   {
      LibraryCode code = new LibraryCode();
      Console.WriteLine("library {0}", code.GetVersion());
      CallMethod();
   }
   static void CallMethod()
   {
      Console.WriteLine("called method");
   }
}

Note that the CallMethod method is private, so it should not be accessible by external code, and hence the name serves no purpose. Also note that the Main method is private too, yet it must be accessible to external code: the runtime must be able to locate and run this code. We'll return to this in a moment. Compile this code as before. Now view it with ILDASM with the /adv switch. From the View menu select Show bytes (this shows the actual bytes that are used in the IL) and then view the Main method:

.entrypoint
// Method begins at RVA 0x2050
// Code size 31 (0x1f)
.maxstack 2
.locals init (class [lib]LibraryCode V_0)
IL_0000: /* 00 | */ nop
IL_0001: /* 73 | (0A)000003 */ newobj instance void [lib]LibraryCode::.ctor()
IL_0006: /* 0A | */ stloc.0
IL_0007: /* 72 | (70)000001 */ ldstr "library {0}"
IL_000c: /* 06 | */ ldloc.0
IL_000d: /* 6F | (0A)000004 */ callvirt instance string [lib]LibraryCode::GetVersion()
IL_0012: /* 28 | (0A)000005 */ call void [mscorlib]System.Console::WriteLine(string, object)
IL_0017: /* 00 | */ nop
IL_0018: /* 28 | (06)000002 */ call void App::CallMethod()
IL_001d: /* 00 | */ nop
IL_001e: /* 2A | */ ret

The first thing to note about this code is that there are lots of nops in there. This is the code produced by compiling for release mode on C# 2.0 (.NET 2.0). If you use the C# compiler supplied with .NET 1.1 you will not get the nops. I don't know why they are there, but I suspect they have to do with Edit and Continue (EnC). (I have little regard for this technology, so I will use any excuse to blame anomalies on it.)

At the top of the method is the .entrypoint metadata. This indicates to the runtime that this method is the method it should run when it starts the process, so the actual name of the method is irrelevant.

Now look at the IL and notice the actual bytes of the opcodes and their parameters. The IL that calls CallMethod is the byte 0x28, which is the opcode for call, followed by 0x06000002, which is a metadata token: the top byte (0x06) identifies the MethodDef table (a table of information about methods defined in this assembly) and the remaining bytes are the index of a row in that table. The method called is the second item in the table (these indices are 1-based). This table contains the Relative Virtual Address (RVA) of the IL for the method, and so the runtime has all the information it needs. ILDASM helps you by providing the name of the method given in the metadata table, but the runtime does not use this. To prove this, close ILDASM, open app.exe in the hex viewer in Visual Studio and scroll until you find the UTF8 string CallMethod. Use the hex editor to write over this method name (do not insert any characters, overwrite existing characters). It does not matter what you type here, but for this example just change the string to NullMethod. Run the application. The process should not have a strong name, so there will not be a strong name validation failure, and the process will run as expected. View the process with ILDASM again. You'll see that the tool helpfully displays that Main calls the method NullMethod but, as we know, the compiler compiled CallMethod. The message is clear: for private methods there is no need to have the name in the assembly. I consider this a bug in the framework compilers.
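
The token arithmetic can be checked with a few lines of code. The following is only a minimal sketch: the TokenDemo class and its TableOf and RowOf helpers are names I have made up for illustration, while MetadataToken and Module.ResolveMethod are the reflection members available from .NET 2.0. It splits the token from the IL above into its table and row, and then shows that a method can be found from its token alone.

```csharp
using System;
using System.Reflection;

class TokenDemo
{
    // The top byte of a metadata token identifies the table (0x06 is MethodDef).
    public static int TableOf(int token)
    {
        return (int)((uint)token >> 24);
    }

    // The lower three bytes are the 1-based row index in that table.
    public static int RowOf(int token)
    {
        return token & 0x00FFFFFF;
    }

    static void CallMethod()
    {
        Console.WriteLine("called method");
    }

    static void Main()
    {
        // The operand of the call instruction in the listing above.
        int token = 0x06000002;
        Console.WriteLine("table 0x{0:X2}, row {1}", TableOf(token), RowOf(token));

        // Reflection exposes the same tokens, and the module can resolve
        // a method from the token without being given the name.
        MethodInfo method = typeof(TokenDemo).GetMethod(
            "CallMethod", BindingFlags.NonPublic | BindingFlags.Static);
        MethodBase resolved = typeof(TokenDemo).Module.ResolveMethod(method.MetadataToken);
        Console.WriteLine("0x{0:X8} resolves to {1}", method.MetadataToken, resolved.Name);
    }
}
```

The row number that reflection reports for CallMethod depends on the order in which the compiler emits the methods, so it will not necessarily be 2 in this sketch.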

I think that the .NET compilers should give you the option of excluding names for types and members that are not accessible outside of the assembly.

Notice that the process calls LibraryCode.GetVersion so let's try changing that string in app.exe. Close ILDASM and use the editor to locate the string GetVersion and change it to SetVersion. View the assembly with ILDASM to confirm that the Main appears to call SetVersion. Now run the process. You'll find that a MissingMethodException will be thrown. The reason is that methods are imported by name, and since there is no method called LibraryCode.SetVersion in the lib assembly the runtime cannot call it.

There is a lot of information in method names. If I get access to an assembly and Reflector shows me that there is a method called ObtainSecretData, I will focus my attention on that method. The simplest version of obfuscation will go through the MethodDef table, determine which methods are private or are members of an internal class, and then locate the string for the name of each such method and change the name to something unintelligible. Since the method name is not used, the obfuscator can give each candidate method the same name and can confuse the cracker further by using some unintelligible character like ~. This will make it difficult to read code with ILDASM or Reflector. A determined cracker could show the metadata tokens in ILDASM and could even replace them with names she had made up. But even so, information has been lost.
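
The selection rule described above can be sketched with reflection. This is an illustration only, not how any particular obfuscator is implemented: a real tool reads the metadata tables directly and must also leave alone names that are needed for other reasons (interface implementations, overrides of external virtual methods, members found through reflection or serialization). The types Visible and Hidden and the helper IsRenameCandidate are hypothetical names.

```csharp
using System;
using System.Reflection;

public class Visible
{
    public void Exported() { }
    private void ObtainSecretData() { }
}

class Hidden
{
    public void Helper() { }
}

class CandidateDemo
{
    // Another assembly needs a method's name only if the declaring type is
    // visible outside the assembly and the method is public or protected.
    public static bool IsRenameCandidate(MethodBase method)
    {
        bool reachable = method.IsPublic || method.IsFamily || method.IsFamilyOrAssembly;
        return !(method.DeclaringType.IsVisible && reachable);
    }

    static void Main()
    {
        BindingFlags all = BindingFlags.Public | BindingFlags.NonPublic
            | BindingFlags.Instance | BindingFlags.Static | BindingFlags.DeclaredOnly;

        // Walk every method defined in this assembly and report the decision.
        foreach (Type type in typeof(CandidateDemo).Assembly.GetTypes())
        {
            foreach (MethodInfo method in type.GetMethods(all))
            {
                Console.WriteLine("{0}.{1}: {2}", type.Name, method.Name,
                    IsRenameCandidate(method) ? "rename" : "keep");
            }
        }
    }
}
```

Note that Hidden.Helper is reported as a candidate even though it is public, because its declaring type is internal.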

Obfuscators can do more than this. They can encrypt literal strings and replace the code that accesses them with code that calls a decryption routine, but a determined cracker can decode the encrypted string themselves (after all, all the information to decrypt the string must be in the assembly or accessible through it). Further, obfuscators can alter the logic in the code. Decompilers (like Reflector) work by identifying patterns in code and translating them to the equivalent high level code. For example, a C# while loop, or an if/then/else construct will produce a particular pattern of IL. An obfuscator will also identify such patterns and alter them so that a decompiler will not recognise them. Of course, this has no effect on ILDASM (which is a disassembler) and, of course, this technique is only effective until the decompiler writers identify patterns in the code generated by the obfuscator.
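
To show why string encryption only slows a cracker down, here is a deliberately simple sketch of the idea. It is not the scheme used by any real obfuscator: the XOR encoding, the Key constant and the StringHidingDemo class are all assumptions made for illustration. Encode stands for the work the obfuscator does when it rewrites the assembly, and Decode stands for the routine it injects and calls in place of each ldstr.

```csharp
using System;
using System.Text;

class StringHidingDemo
{
    // Hypothetical key. Whatever the scheme, the key has to ship with the assembly.
    const byte Key = 0x5A;

    // Done by the obfuscator: the literal is replaced with encoded bytes.
    public static byte[] Encode(string text)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);
        for (int i = 0; i < data.Length; i++)
        {
            data[i] ^= Key;
        }
        return data;
    }

    // Injected into the assembly: called wherever the literal used to be loaded.
    public static string Decode(byte[] data)
    {
        byte[] copy = (byte[])data.Clone();
        for (int i = 0; i < copy.Length; i++)
        {
            copy[i] ^= Key;
        }
        return Encoding.UTF8.GetString(copy);
    }

    static void Main()
    {
        byte[] hidden = Encode("library {0}");

        // This is what a hex viewer would show instead of the readable string.
        Console.WriteLine(BitConverter.ToString(hidden));

        // The code still has to recover the original string in order to use it.
        Console.WriteLine(Decode(hidden), "1.0.0.0");
    }
}
```

A cracker who finds Decode with ILDASM can simply call it, or copy it, to recover every string.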

An obfuscator can make intelligent decisions about the items that it will obfuscate, but it really should allow the developer to override these decisions. Note that .NET version 3.0/2.0 has two attributes, [ObfuscateAssembly] and [Obfuscation]. These are messages to the obfuscator that you are using; they have no effect on the compiler or the runtime. It is interesting that both have a property called StripAfterObfuscation, which allows you to indicate to the obfuscator that it should remove the custom attribute so that a cracker does not have any information about the items that you have decided to (or decided not to) obfuscate.
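
As an illustration of how these attributes might be applied, the sketch below marks the library as a public assembly and asks the obfuscator to leave one type alone. The attributes and their properties are those in System.Reflection; whether a particular obfuscator honours them, and how it interprets them, is something you will have to check in its documentation. The body of LibraryCode is a stand-in for the library code used earlier.

```csharp
using System;
using System.Reflection;

// false: the assembly is a library used by other code, so the obfuscator
// should preserve the names that external callers depend upon.
[assembly: ObfuscateAssembly(false, StripAfterObfuscation = true)]

// Ask the obfuscator to exclude this type, and all of its members,
// and to remove this attribute once it has acted upon it.
[Obfuscation(Exclude = true, ApplyToMembers = true, StripAfterObfuscation = true)]
public class LibraryCode
{
    public string GetVersion()
    {
        return "1.0.0.0";
    }
}

class AttributeDemo
{
    static void Main()
    {
        // Before obfuscation the attribute is ordinary metadata and can be read back.
        object[] attrs = typeof(LibraryCode).GetCustomAttributes(
            typeof(ObfuscationAttribute), false);
        ObfuscationAttribute attr = (ObfuscationAttribute)attrs[0];
        Console.WriteLine("Exclude={0}, Strip={1}", attr.Exclude, attr.StripAfterObfuscation);
    }
}
```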

Throughout this discussion you will have noticed that the obfuscator will have to change the assembly. This means that the hash of the assembly will be changed, so the signed hash, the strong name signature, will be wrong. Therefore, it is important that you sign the assembly after the obfuscator has done its work. The only way that you can do this is to delay sign the assembly when you compile it, and then re-sign it after you have obfuscated it. In the remainder of this section I will focus my attention on Dotfuscator, which is provided with Visual Studio.

Dotfuscator Community Edition is provided free as part of Visual Studio. However, this tool has its problems (although the version supplied with Visual Studio 2005 is an improvement on the version supplied with Visual Studio.NET 2003). The Community Edition will not handle strong named assemblies, and to get that facility you have to upgrade to the Professional Edition. I will concentrate here on the Community Edition.

My first gripe occurs when you indicate to Dotfuscator the assembly that it should work on. The program should test to see if the assembly has a strong name and then ensure that the assembly has been delay signed. It doesn't do this, so this means that you can go through the process of obfuscating the assembly and producing something that will fail strong name validation. What is the use of that?

The second issue (in version 3.0) is that it will rename every member. The braindead implementation of the renaming routine does not recognize that public and protected members of a public type should not be obfuscated. By default it will rename public members, which renders them useless. To get round this issue you have to explicitly exclude the types and members that you want to be left alone. Also, by default it will leave the names of private (internal) types intact. Is there any reason for this? Private types should be obfuscated because no external code should have access to them. Furthermore, rather than renaming everything to the same pointless string, Dotfuscator uses different names for different items. Version 3.0 (with Visual Studio 2005) tries to improve on this with a mechanism that they call Overload-Induction, but frankly there is no point. The idea is to make the string names unreadable, so why expend effort trying to make them readable to the developer?

(I have not done an extensive study, but the size of every assembly I have obfuscated with Dotfuscator 3.0 Community Edition is always a whole multiple of the system page size, 4096 bytes. This means that the assembly is always larger than it should be. If this is repeatable for all assemblies, then I consider it to be a bug.)

Finally, let's cover the issue of strong names. As I have mentioned, the documentation says that if you pay extra for the Professional Edition it will handle strong names for you. This feature should have been in the Community Edition. You already know what to do, but I will mention the steps anyway. The first thing to do is ensure that your assembly is delay signed, so for version 3.0/2.0 use /delaysign and /keyfile, and for earlier versions use [AssemblyDelaySign] and specify the public key file in [AssemblyKeyFile]. Once you have compiled the assembly you must register it to be excluded from strong name validation with sn -Vr. Now you can obfuscate the assembly. Once the tool has completed the task you can switch on strong name validation for this assembly with sn -Vu and finally you must re-sign the assembly with sn -R.
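
The steps above can be summarised as follows. This is a sketch under assumptions: lib.cs and lib.dll stand for your own library, key.snk is the key pair generated earlier, and public.snk is a hypothetical name for a file holding only the public key. The attributes are the older source-level form; with version 3.0/2.0 the compiler switches shown in the comments are preferred.

```csharp
using System.Reflection;

// Delay sign: space is reserved for the signature and only the public key is embedded.
[assembly: AssemblyDelaySign(true)]
[assembly: AssemblyKeyFile("public.snk")]

// Command line steps, in order:
//
//   sn -p key.snk public.snk        extract the public key from the key pair
//   csc /t:library lib.cs           compile (or, without the attributes above:
//                                   csc /t:library /delaysign+ /keyfile:public.snk lib.cs)
//   sn -Vr lib.dll                  skip strong name validation for this assembly
//   (run the obfuscator on lib.dll)
//   sn -Vu lib.dll                  switch strong name validation back on
//   sn -R lib.dll key.snk           re-sign the obfuscated assembly with the key pair
```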

I hope that you enjoy this tutorial and value the knowledge that you will gain from it. I am always pleased to hear from people who use this tutorial (contact me). If you find this tutorial useful then please also email your comments to mvpga@microsoft.com.

Errata

If you see an error on this page, please contact me and I will fix the problem.

This page is (c) 2007 Richard Grimes, all rights reserved