17. .NET Vulnerabilities and Exploits
Is .NET the solution to all of your security problems? Of course the answer is no. .NET is as vulnerable to security attacks as any framework. However, as you have seen throughout this workshop the .NET framework has been built with security in mind right from the beginning. This is in contrast to other competing systems and libraries (particularly those not based on a virtual machine architecture) where security is often added on as an afterthought (if at all).
On this page I will outline some of the exploits that have been carried out on .NET code to date. I will perform an analysis on those exploits and describe the vulnerabilities in .NET that the virus writers have targeted. I will also point out to you ways that you can overcome some of these vulnerabilities and point out ways that Microsoft could help.
17.1 Chronology
The following table has a list of the most well known .NET viruses to date. The table is listed in date order, and a description of the viruses follows after the table.
There are essentially two types of viruses listed here. The first group is viruses that are implemented in .NET simply because the framework offers all the facilities that the writer requires and it offers rapid development; such viruses could be written in VB, Java, C++ or C, but the developer chose .NET because they preferred that platform. These are not .NET viruses, they are merely viruses written in .NET. The other group of viruses is far more interesting because they specifically target the .NET framework. These are viruses that can truly be called .NET viruses and it includes only three viruses from this list: Serot, Gastropod and Impanate.
The first so-called .NET virus that appeared was Donut. Although this targeted .NET applications it was actually an entry point virus, that is, it appended itself to .NET executables and then overwrote the unmanaged entry point of the PE file so that the Donut code would execute when the PE file was run. When run, this virus code attempted to infect all EXEs in the same folder and in up to 20 folders above it. In addition, the virus injected some MSIL into the application, which was intended to display a message before the actual host file was executed. To do this the virus copied the file (adding a space to the executable name) with the intention of changing the entry point back to the original entry point before executing this copy. However, the code did not replace the entry point of the copy with the original entry point, which meant that the unmanaged virus code would be executed again, and so the virus was executed recursively until the file name got too long. XP and later operating systems do not call the unmanaged entry point when running a .NET application and so the Donut virus code will not be called. It is clear that to execute .NET applications securely you must use XP or later.
The next significant .NET virus was the mass-mailer, Sharpei, which prepended itself to applications. Since it does this with code written in C# it is known as a High Level Language Prepender, hence the HLLP in its name. This virus was spread through email purporting to contain a security update. When the code in the email was executed one of the actions was the execution of a VBS script that performed the mass mailing through Outlook; this aspect of the virus was not dependent upon the .NET framework. The virus propagated itself on the local machine by prepending itself to executables. First, the location of the x86 binary from the email was stored in the registry. The virus then dropped and executed a .NET application. This application searched the Windows and Program Files folders for executables and copied the virus code, whose location was saved in the registry, to the front of each executable. The prepending action is simple: it made a copy of the original file, then it copied the virus file over the victim and finally it appended the original file. This way the virus entry point would be run when the victim application was run. The intention was that when the victim file was executed, the virus would run, create a temporary file containing the original code and then execute this temporary file. However, there was a bug that resulted in multiple copies being made. Sharpei does not break .NET security; .NET is used here merely because it makes writing replication code so easy. A similar virus is Flate, which also uses the .NET runtime to prepend itself to applications.
The Gaze virus is also a mass mailer, that is, it spread itself by running a VBS script that obtained email addresses and then mailed itself to those targets. The virus was written in .NET, but it could have been written in any language. Similarly, Letum is a mass mailer written in .NET. The framework has SMTP classes and Letum took advantage of this. This virus harvested email addresses from HTML files it found on the machine and it also posted a message to news servers that had been used with Outlook Express.
The Hobble virus is a mass mailer that distributed itself through Kazaa as well as through email. Like Letum, it obtained the email addresses by looking for local HTML files that have mailto entries in them. The virus also attempted to shut down processes that perform security actions. Another .NET virus spread through file sharing systems is Lupar. When it was run, Lupar would search the hard disk for jpg files with names that suggest they may be paedophile porn. If such files were found, they were moved, and information about the machine, the username and the files that were found was uploaded to an FTP server. The virus then added an instruction to shut down the computer whenever Explorer was run. Again, in both viruses, .NET is used because it makes these actions easy to code.
One of the interesting issues with .NET is the effect of versioning. .NET code that is written for one version of .NET will only run on that version. This means that if a virus is written in .NET version 2.0 that virus will only work with that version.
| However, note that .NET 3.0 is the same as .NET 2.0, so a virus that will work for .NET version 2.0 will also work for .NET version 3.0. Confusing? This is a result of the daft choice of releasing WinFX as .NET version 3.0 rather than any problem with .NET versioning. |
The .NET runtime is far too large to be packaged with the virus so this puts a lot of faith on there being enough machines having the correct version of the framework. Late in 2005 .NET 2.0 was released and some virus writers changed to this version. Idonus was purported to be the first Vista virus, but it was nothing of the kind. It was written with .NET 2.0, and so it will only work on a system with that version of the framework installed; this includes Vista, but it also includes other versions of Windows. Microsoft is always coy about how widespread .NET is, but it is clear that in the interim, most systems are safe from this virus. Furthermore, Idonus uses WinFX, again, in a mistaken belief that this would make it a Vista virus. In fact, the effect of this decision was to make the virus virtually unrunnable because at the time WinFX was released as a CTP (Community Technology Preview) release, which meant that there was a new release every few months. The chance of the virus being run on a machine with the right version of .NET and the right version of WinFX was pretty small. In addition, the virus writers did not appreciate that WinFX was available (and still is available) for other versions of Windows, so it is not a Vista-only framework. Nor did they realise that WinFX was not even installed on the build of Vista available at that time (build 5308) or on earlier builds (build 5308 has the WinFX package but it was not installed: you had to explicitly go through Control Panel to install it; build 5365, released in April 2006, did have WinFX installed). The combination of .NET 2.0 and the CTP release of the WinFX framework narrows the target machines for Idonus to a few geeks with the time to beta test Microsoft's products!
When run, Idonus would move around your hard disk. It did this by copying itself to a random folder, and then the next time it was run, it moved to a new folder. The virus would add itself to the list of processes run at startup. Its method of replication was very destructive: it chose a .NET file with an EXE extension and copied itself over that file. This is a rather dumb, destructive virus.
The Cxover virus is interesting because it utilizes the only cross platform aspect of Microsoft's .NET framework: that the compact framework for Smart Devices is a cut-down version of the desktop framework. When run, this virus would copy itself to the Windows folder and add itself to the registry to be run when Windows starts, then it called ActiveSync repeatedly to determine if a mobile device was connected. When a device attached, it copied itself to the device and started the new process. Once on the device it then proceeded to delete documents. This virus could have been written in C++, since C++ applications also have access to the Remote API and so can copy code to a mobile device.
Now for the more interesting viruses. The Serot virus is another buggy mass mailer virus. It performed many actions, spreading itself through emails and communicating with infected machines through the IRC port, 194. However, relevant to this page is its infection of .NET files. Serot would locate EXE files and test to see if they were .NET applications that had not been signed. The virus then decompiled the .NET application, injected its code, and recompiled the application. Peter Ferrie says this is performed by the Compiler Services methods, by which I am sure he actually means that the code is decompiled with ILReader.dll into IL assembler. (ILReader.dll was written by Lutz Roeder as a tool for developers, not for virus writers; it is no longer available from his site.) The code can then be compiled with the classes in System.Reflection.Emit. The injected code would then drop this infected code and attempt to execute it.
| .NET Version 3.0: Note that version 3.0/2.0 of the framework library has the MethodBody class that can be used to decompile MSIL to assembler. Ever since version 1.0 there has been a class called ILGenerator that can be used to create MSIL that can be used by the classes in the System.Reflection.Emit namespace to create dynamic classes. |
Gastropod carries the aforementioned ILReader.dll file to disassemble MSIL to IL assembler. Interestingly, once the virus has disassembled the code to IL assembler it adds extra code to both the host and virus code. This extra code is harmless: it is either nop (no operation) opcodes, or a combination of ldloc (load local variable on to the stack) and pop (remove from stack), picking a local variable at random. Clearly these are used to try to confuse anti-virus utilities, because they make it more difficult to take a fingerprint of the virus. Analysts who have worked on Gastropod have commented that the code has bugs, some of them quite trivial. However, its polymorphic behaviour shows a certain level of sophistication in the virus writer.
The final virus I will describe is Impanate. This virus is quite careful about the files it will infect: it targeted .NET EXEs but ensured that they did not have certificates, nor would it touch signed files. This virus injected its own code into the .NET application. It did this by altering the metadata tables which are described in the #~ metadata stream. In effect, it picked an entry at random from the StandAloneSigs table and duplicated it to add the signature of the virus's method. It then found the method that it had chosen in the Method table, duplicated this entry and added the virus code to it, performing the necessary calculations for the method's header. Clearly the virus writer understands intimately the format of assemblies. This virus is an entry point obscurer, that is, it attaches its code to a method within the legitimate code. The virus does not perform any action, but it is clear that it could.
17.2 Virus Analysis
The Serot virus highlights a vulnerability of non-obfuscated IL: round tripping. .NET MSIL and metadata mean that disassemblers (and decompilers) can extract compilable code from a binary. Such code can be altered and then recompiled. .NET is not unique in this issue; the same actions can be performed on Java applications.
Try this code:
using System;
using System.Reflection;

class App
{
    static void Main()
    {
        // Look up the private static method by name
        MethodInfo mi = typeof(App).GetMethod(
            "Method", BindingFlags.Static|BindingFlags.NonPublic);
        // MethodBody gives access to the raw IL of the method
        MethodBody mb = mi.GetMethodBody();
        Console.WriteLine("Method has {0} bytes of opcodes",
            mb.GetILAsByteArray().Length);
    }

    static void Method()
    {
        Console.WriteLine("Called Method");
    }
}
This code uses the new .NET 3.0/2.0 MethodBody class to get access
to the IL opcodes in the requested method. All the code does is get the length
of the method in bytes. Compile this code (csc round.cs) and run
it. I get a value of 13 bytes for the method. Next run ildasm to
disassemble this code:
This will generate a file,
round.il with the MSIL for the process. Open this up with
notepad and scroll to the Method method:
.method private hidebysig static void Method() cil managed
{
// Code size 13 (0xd)
.maxstack 8
IL_0000: nop
IL_0001: ldstr "Called Method"
IL_0006: call void [mscorlib]System.Console::WriteLine(string)
IL_000b: nop
IL_000c: ret
} // end of method App::Method
Add some extra nops; it does
not matter where as long as they are before the ret statement and
after the .maxstack statement:
{
// Code size 13 (0xd)
.maxstack 8
IL_0000: nop
nop
nop
IL_0001: ldstr "Called Method"
nop
nop
IL_0006: call void [mscorlib]System.Console::WriteLine(string)
nop
nop
IL_000b: nop
IL_000c: ret
} // end of method App::Method
In this case six extra bytes have been added. ildasm will also
create a .res file with the compiled unmanaged resources. When
you create the altered PE file you need to include those resources. Now compile the process using
the IL assembler:
Run the code again. This time you will find that there are 19 bytes in the
method. Round tripping is easy. It gets a little more complicated if the code
has resources. In this case ildasm will extract every embedded
resource into an appropriately named file. The IL file will indicate which
resources are embedded and which are linked, and it will give the names of the
external resource files (for linked resources) and the files it created (for
embedded resources). When you assemble the altered code you must make sure
that you add the resources to the new PE file, embedding or linking as
appropriate. Performing all of this in code is a little more involved. .NET
3.0/2.0 provides a method called GetMethodBody on the
MethodInfo object (actually, an instance of a private, derived class
called RuntimeMethodInfo) that returns a MethodBody
object. However, as you have seen, the access to the IL (through
GetILAsByteArray) is to raw bytes and not OpCode values. The
ILGenerator class, used to create dynamic assemblies, requires
OpCode values. The framework does not provide a class to generate
OpCode values from bytes, but with the information in the .NET
spec it is a trivial, if tedious, task to create such a class. (I have done
this for unmanaged code and have no desire to do this again for managed code.)
This is the great benefit of Lutz Roeder's ILReader: his
disassembler class gave access to an array of OpCode values for
a method and so it was trivial to alter this array appropriately and pass it
to an instance of the ILGenerator class.
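To give a feel for the byte-to-opcode mapping described above, here is a minimal sketch that builds a lookup table by reflecting over the public static fields of the framework's OpCodes class. The OpCodeDecoder class and its Decode method are names invented for this illustration; they are not framework types and this is not how ILReader was implemented. The sketch assumes well-formed IL on a little-endian machine, and it returns only the opcodes, skipping over their operands.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper: maps raw IL bytes to OpCode values.
static class OpCodeDecoder
{
    // Keyed by OpCode.Value; two-byte opcodes have 0xFE in the high byte.
    static readonly Dictionary<short, OpCode> Map = BuildMap();

    static Dictionary<short, OpCode> BuildMap()
    {
        Dictionary<short, OpCode> map = new Dictionary<short, OpCode>();
        foreach (FieldInfo f in typeof(OpCodes).GetFields(
            BindingFlags.Public | BindingFlags.Static))
        {
            if (f.FieldType != typeof(OpCode)) continue;
            OpCode op = (OpCode)f.GetValue(null);
            map[op.Value] = op;
        }
        return map;
    }

    // Operand bytes that follow an opcode (switch is handled in Decode).
    static int OperandSize(OperandType t)
    {
        switch (t)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            default: return 4; // tokens, int32, float32, long branches
        }
    }

    public static List<OpCode> Decode(byte[] il)
    {
        List<OpCode> result = new List<OpCode>();
        int pos = 0;
        while (pos < il.Length)
        {
            short value = il[pos++];
            if (value == 0xFE) // prefix byte of a two-byte opcode
                value = unchecked((short)(0xFE00 | il[pos++]));
            OpCode op = Map[value];
            result.Add(op);
            if (op.OperandType == OperandType.InlineSwitch)
            {
                // A 4-byte case count followed by that many 4-byte targets
                int count = BitConverter.ToInt32(il, pos);
                pos += 4 + 4 * count;
            }
            else
            {
                pos += OperandSize(op.OperandType);
            }
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        MethodInfo mi = typeof(Program).GetMethod(
            "Method", BindingFlags.Static | BindingFlags.NonPublic);
        byte[] il = mi.GetMethodBody().GetILAsByteArray();
        foreach (OpCode op in OpCodeDecoder.Decode(il))
            Console.WriteLine(op.Name);
    }

    static void Method()
    {
        Console.WriteLine("Called Method");
    }
}
```

The exact list printed depends on the compiler and on whether the build is debug or release, since the compiler decides how many nop opcodes to emit.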
Obfuscation provides some protection from round tripping, but it
must be recognised that obfuscation is not a security technique. Code
that is called within an assembly will be referred through a metadata token.
A token is a 32-bit integer that indicates the metadata table that describes
the member and the index of the member in the table. Run ildasm
using the following command line:
This will start the tool in GUI mode, but the important point is that the
/tok switch means that tokens are shown. Here is the code
generated earlier:
void Method() cil managed
{
// Code size 19 (0x13)
.maxstack 8
IL_0000: nop
IL_0001: nop
IL_0002: nop
IL_0003: ldstr "Called Method" /* 7000004F */
IL_0008: nop
IL_0009: nop
IL_000a: call void [mscorlib/*23000001*/]System.Console/*0100000B*/::WriteLine(string) /* 0A000008 */
IL_000f: nop
IL_0010: nop
IL_0011: nop
IL_0012: ret
} // end of method App::Method
Notice that the method has a token of 0x06000002. The top byte
refers to table 0x06, MethodDef, and the lower three
bytes are the one-based row number for this particular method in the table, in
this case the second entry. If this method is called within the assembly, then
it will be referred to using the token, not its string name.
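The split of a token into a table number and a row can be sketched in a few lines. The TokenInfo class below is a name made up for this illustration; the only framework member it relies on is the MetadataToken property, available from .NET 2.0. Row numbers in the metadata tables start at 1.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: splits a metadata token into its two parts.
static class TokenInfo
{
    // Top byte: the number of the metadata table (0x06 is MethodDef).
    public static int Table(int token)
    {
        return (token >> 24) & 0xFF;
    }

    // Lower three bytes: the one-based row number within that table.
    public static int Row(int token)
    {
        return token & 0x00FFFFFF;
    }
}

class App
{
    static void Main()
    {
        MethodInfo mi = typeof(App).GetMethod(
            "Method", BindingFlags.Static | BindingFlags.NonPublic);
        int token = mi.MetadataToken;
        Console.WriteLine("Token 0x{0:X8}: table 0x{1:X2}, row {2}",
            token, TokenInfo.Table(token), TokenInfo.Row(token));
    }

    static void Method()
    {
        Console.WriteLine("Called Method");
    }
}
```

The same split applies to the other tokens in the listing, for example 0x0A000008 is row 8 of table 0x0A, MemberRef.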
If a class member is called outside of the
assembly it is still referred to using tokens, but this time the tokens refer
to the string values that identify the assembly, class (including namespace
and details of the enclosing class if the type is nested) and member. The code
above, that calls WriteLine, uses tokens that refer to the
AssemblyRef (0x23), TypeRef (0x01) and
MemberRef (0x0a)
tables. These tables hold information about the assembly, the type and type
member, and each table has an index in the string heap which ildasm
uses to obtain the string names.
Thus, the name of a type or member is only important if the item is public or protected. The string name of private members will never be used. For this reason, most obfuscators will rename the string name of private members. Typically, the new name will be the same for all members to make it more difficult for casual snoopers to identify the item referred and some obfuscators will even use an illegal name.
If a type has more than one private field then they will be given the same string name, and when this code is decompiled the result will not be compilable. Thus, obfuscated code has a possibility (but not infallibly so) of defeating round tripping. Obfuscation will never completely protect your code because a careful eye of an experienced developer can disambiguate members, but it would hobble the action of a virus.
Note that the author of Serot suffered from the maxim
that a little knowledge is a dangerous thing. Serot only changed EXEs,
but it made sure that it ignored .NET
applications that have been signed. As you know from earlier pages, signing an
EXE will protect it from tampering, but it does not prevent the EXE from being
round tripped. When the application has been decompiled it will not be re-signed when
it is compiled,
but that does not matter because the runtime treats an unsigned application in
exactly the same way as a signed application: if the signature is there, it
will check it; if it isn't there, it won't. The only situation when the
signature becomes important is if the application calls libraries that have
code marked with [StrongNameIdentityPermission]. However, as has
already been mentioned in this workshop, this attribute offers little
protection, and anyway identity permissions are fairly meaningless in .NET 3.0/2.0.
Gastropod takes Serot's action a step further. It performs the action that I have illustrated above, that is, it decompiles the IL in an assembly, but rather than simply appending its own code, it alters the host and virus code in a random fashion. The purpose of this is to make sure that it is not possible to obtain a fingerprint for this virus, which makes it more difficult for an antivirus program to identify it, and hence disinfect the code. Since the virus has effective access to the source code of the application it could do all kinds of things to the code. The virus code could be split into several types and many methods and spread around the memory that contains the IL (a technique called fractional cavity), making disinfection even more difficult. It could attach itself to just the entry point of the application, so that it would be guaranteed to run once every time the application is run, or it could attach itself to a randomly chosen method so that the number of times it is run, and whether it is run, is less deterministic. .NET MSIL can be generated from high level code, which means that the virus writer has the ability to use some sophisticated code. I can imagine that, in an attempt to hide its activity, a virus writer could detect when the application performs disk action, or accesses a database, before it performs its own work. It could also run on a separate thread so that the host would appear to be running correctly while the virus is performing its action.
Finally, note that a .NET application will be installed on the hard disk, and so it will run under full trust. This means that it will have access to the .NET security policy, and a virus could alter policy to allow access from code on other machines. This could be a very serious issue.
17.3 Vulnerabilities In .NET
A list of .NET vulnerabilities, is this a ToDo list for virus writers? Possibly, but it is also a list of issues that need to be fixed in .NET, or at least it is a list of things to make you think and realise that .NET is not your security panacea.
So far, the .NET viruses that have been identified in the wild have ignored library code. I suspect that the main reason is that .NET libraries can be signed, the signature becomes part of the name of the library and this mechanism acts as a detection of tampering of the library. If a library is decompiled, changed and recompiled without the signature, then it will have a different name and so an assembly that used the original library will ignore the new one. Furthermore, a virus may attempt to append code to an assembly and preserve the original signature, but this will be detected when the assembly is validated. If you are using .NET 1.0 or 1.1 then the validation can be disabled. I have been told that it is possible to do this in .NET 3.0/2.0 as well, but so far I have not been able to find out how, so at the moment I am happy to accept that .NET 3.0/2.0 is safe from this exploit.
However, one way a virus could get round the issues of strong name validation is to change a library, re-sign it and then check every .NET EXE in the current folder to see if any other assembly uses the library. When the virus finds such an assembly it could update the assembly's metadata to have the new public key token (derived from the public key of the key pair used to re-sign the altered library) so that the assembly knows the 'new' name of the library. If the calling assembly is signed then the virus could re-sign the assembly with its key and then kick off another search for assemblies that call this new altered assembly.
The .NET framework
does not provide tools to re-sign an already signed assembly (sn.exe
will compare the public key in the already signed assembly with the public key
in the new key-pair and will abort if these are not the same). However, the
source code for the assembly signing routine is freely available in the Shared
Source CLI and it is simple to deduce from this how to re-sign an already
signed assembly. The
unmanaged .NET API has methods to generate a key pair for signing (StrongNameKeyGen)
and to sign an assembly (StrongNameSignatureGeneration).
If an EXE is signed then the runtime will validate the assembly signature
before running it. If the validation fails, then the application will not run.
The virus could prevent this merely by removing the signature (for example, by
round-tripping as explained earlier). If there is no strong name, then there
is no strong name validation check (the unmanaged API
StrongNameSignatureVerification will return true -
that is, the verification succeeded - if the assembly does not have a strong
name). Currently, there is little point in applying a strong name to an EXE:
since an EXE is not loaded as a library the strong name is not used to prevent
name collision and since a strong name can be removed it has no security
benefits.
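The point that an absent strong name simply means no check can be seen from managed code too. The following is a minimal sketch, assuming you pass assembly paths on the command line; StrongNameCheck and HasPublicKeyToken are names made up for this example. Note that it only reports whether an assembly carries a public key token; it does not validate the signature.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: reports whether an assembly name carries a
// public key token. It does NOT verify the strong name signature.
static class StrongNameCheck
{
    public static bool HasPublicKeyToken(AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken();
        // An assembly without a strong name has no (or an empty) token
        return token != null && token.Length > 0;
    }

    static void Main(string[] args)
    {
        foreach (string path in args)
        {
            // Reads the assembly's identity without loading it for execution
            AssemblyName name = AssemblyName.GetAssemblyName(path);
            Console.WriteLine("{0}: {1}", name.FullName,
                HasPublicKeyToken(name) ? "strong named" : "no strong name");
        }
    }
}
```

Running this against an EXE before and after round tripping it would be one way to see that the strong name has gone.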
The virus writers that perform round tripping have ignored signed
executables: they could have targeted those too. They have also ignored
private libraries, but since most private libraries are not signed there would
be no strong name validation to detect their changes. Even if the private
libraries are strong named, they can be round tripped (which removes the
strong name) and the calling assembly could be altered to remove the strong
name. For example, open the IL file
round.il created in the last
section and look at the entry for the strong named framework assembly,
mscorlib:
{
.publickeytoken = (B7 7A 5C 56 19 34 E0 89 ) // .z\V.4..
.ver 2:0:0:0
}
If you were to remove the strong name from mscorlib then this
code will not refer to your new library because of the .publickeytoken
item. To make this entry refer to your unsigned assembly you simply remove the
.publickeytoken item.
The writer of Impanate has shown that s/he has the skills to perform the actions that I've described here. I suspect that it won't be too long before a more sophisticated and serious .NET virus appears.
17.4 Process Monitor
The last section got me thinking. .NET should really ensure that the strong name on an
assembly does not change. To a certain extent this is ensured for library
assemblies. The main reason for giving a library assembly a strong name is to
put it in the GAC. Physically, the GAC is stored under %systemroot%\assembly
and this folder has the following ACL (using the code from the
ACL page of this
workshop):
Owner is: BUILTIN\Administrators
Primary group is: NT AUTHORITY\SYSTEM
Access Rules:
Allow user: BUILTIN\Users rights: ReadAndExecute, Synchronize [001200A9]
inherited right, inheritance: None propagation: None
Allow user: BUILTIN\Users rights: -1610612736 [A0000000]
inherited right, inheritance: ContainerInherit, ObjectInherit propagation: InheritOnly
Allow user: BUILTIN\Power Users rights: Modify, Synchronize [001301BF]
inherited right, inheritance: None propagation: None
Allow user: BUILTIN\Power Users rights: -536805376 [E0010000]
inherited right, inheritance: ContainerInherit, ObjectInherit propagation: InheritOnly
Allow user: BUILTIN\Administrators rights: FullControl [001F01FF]
inherited right, inheritance: None propagation: None
Allow user: BUILTIN\Administrators rights: 268435456 [10000000]
inherited right, inheritance: ContainerInherit, ObjectInherit propagation: InheritOnly
Allow user: NT AUTHORITY\SYSTEM rights: FullControl [001F01FF]
inherited right, inheritance: None propagation: None
Allow user: NT AUTHORITY\SYSTEM rights: 268435456 [10000000]
inherited right, inheritance: ContainerInherit, ObjectInherit propagation: InheritOnly
Allow user: CREATOR OWNER rights: 268435456 [10000000]
inherited right, inheritance: ContainerInherit, ObjectInherit propagation: InheritOnly
This means that all users can read and execute files in the GAC and
Power Users can modify files in the GAC, but only
Administrators and the System account have full control
(that is, can add files to the GAC). The Program Files and
Windows\System32 folders are protected with an ACL similar to that of the
GAC. So assuming that the virus is not running under an
Administrator or Power User account it will not be able to
alter code in the GAC, Program Files or
Windows\System32.
The vulnerable files are executables and private assemblies because these will
be installed elsewhere on the hard disk. By default, most folders allow User accounts
to read and execute files, append files and create files.
As explained above, .NET protects assemblies from tampering through strong names. This protection is a side effect of the real reason for the strong name - name collision protection - but it is a useful side effect. So how can your applications be protected? It seems to me that there is no solution, although there is a glimmer of a possibility. In this section I will describe the features that I have investigated to see if protection could be applied.
When the .NET runtime has loaded a process it logs information. If .NET maintains such a log, then surely it could check to ensure that the names of the assemblies (the process and the libraries it uses) remain the same for all sessions? Let's take a look at the information that is logged.
In .NET 1.0 and 1.1 details about processes that have been run are held in the following folder:
It appears that an assembly must make a reference to a type in an assembly other
than mscorlib to have an entry in this folder, so that when you run this code:
{
Console.WriteLine("Code Started");
}
you will not get a log file because Console is in mscorlib.
However, if you run this code:
{
UriBuilder u = new UriBuilder();
}
you'll find that a log file will be created because UriBuilder is
in the System assembly. The reason for this is because this logging occurs
to provide information for the .NET 1.0 and 1.1 'Fix Application' facility. The
log file contains information about the versions of the assemblies that the
process uses. For example, here's the file I get for the code above:
ExecutablePath=C:\Tests\test.exe
ApplicationName=test.exe
NumResolutions=1
ActivationSnapShot_1=29782465.2399875360
[29782465.2399875360]
RuntimeVersion=v1.1.4322
System/b77a5c561934e089/NULL/1.0.5000.0=29782465.2399875360/System/b77a5c561934e089/NULL/1.0.5000.0
[29782465.2399875360/System/b77a5c561934e089/NULL/1.0.5000.0]
VerReference=1.0.5000.0
VerAppCfg=1.0.5000.0
VerPublisherCfg=1.0.5000.0
VerAdminCfg=1.0.5000.0
Notice that it gives information about the application path and name, and the
full names of assemblies other than mscorlib. Each time the process
is run with a new configuration a new ActivationSnapShot
will be created and the previous snapshot will be retained so that the 'Fix
Application' facility can rollback to a previous configuration. Although the
full names of the libraries are given, only the short name of the process is
given, so you cannot use this mechanism to determine if the strong name has
changed. The 'Fix Application' facility is not available in .NET 3.0/2.0, so if you
compile this application to .NET 3.0/2.0 no log file will be created.
There is an undocumented registry entry called CLRLoadLogDir. This
is a string entry in the following key:
If you use this to give the name of a folder then whenever a .NET process is run
the framework will create a log file regardless of whether the process is a debug
or release build. So for the first run of test.exe I get:
Log started at 10:54:53 on 07/05/2006
-----------------------------------
Host supplied values (usually set via CorBindToRuntime)
-----------------------------------
-----------------------------------
C:\Tests\test.exe was built with version: v1.1.4322
Yet again, this gives the path to the process, but it does not give the full process name. However, it is interesting that the log file gives the version of the runtime that the process was built for.
| How did I find this registry value? I ran SysInternals RegMon and looked at the registry entries accessed by the process as it started up. This undocumented registry value is one of many. |
.NET 3.0/2.0 performs some other interesting logging. You know that the fuslogvw
tool will show you binding information because I have already shown you this in
the Fusion workshop. Run regedit
and navigate to the following key:
Change the following values (add them if necessary): LogFailures (a
DWORD value) should be 1, and LogPath (a
string value) should be the path to a location on your hard disk (I use
C:\Data\Fusion). You can make these settings in the fuslogvw
tool, but it is simpler to do it this way.
Now run a process that has been compiled for
.NET 2.0. You'll find that under the log folder will be a new folder called
NativeImage and under that will be a folder with the name of the
process (in my case test.exe). Within that folder will be an HTML
file (in my case this is ExplicitBind!FileName=(test.exe).HTM).
Note that Fusion log has been configured to log failures, and yet the
process ran without any exceptions. Load the HTML file and take a look at the
failure logged:
The operation failed.
Bind result: hr = 0x80070002. The system cannot find the file specified.
Assembly manager loaded from: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll
Running under executable C:\Tests\test.exe
--- A detailed error log follows.
LOG: Start binding of native image test, Version=0.0.0.0, Culture=neutral, PublicKeyToken=6c126a78c321d7ed.
LOG: IL assembly loaded from C:\Tests\test.exe.
WRN: No matching native image found.
LOG: Bind to native image assembly did not succeed. Use IL image.
Is this really a failure? The log says that when the runtime is told to run the process it first looks for a native (pre-JIT) image for that process. If a native image cannot be found, the IL image is used and is Just-In-Time compiled.
| The implication of this log file is that Microsoft regards the normal state for a process in .NET 3.0/2.0 to be a native image, that is, Microsoft expects you to pre-JIT all of your processes. I have not seen any documentation that says this, but it is explicit in this log file that an IL image is only executed when a native image is not found. If you are interested in creating native images I recommend the native image page of my Fusion Workshop. |
The interesting point about this log file is that it has the full name of the assembly it is attempting to load. Unfortunately, this log file is overwritten whenever the process is run, so there is no enduring log kept about the name of the assembly.
Another option is the unmanaged CLR APIs. These are implemented as interfaces on
COM objects. You can get access to information about the .NET processes that are
running by calling methods on the ICorPublish interface:
#include <stdio.h>
#include <corpub.h>
#pragma comment(lib, "ole32.lib")
int main()
{
    CoInitialize(0);
    ICorPublish* pub = 0;
    HRESULT hr = CoCreateInstance(CLSID_CorpubPublish, 0, CLSCTX_INPROC_SERVER, IID_ICorPublish, (void**)&pub);
    if (SUCCEEDED(hr))
    {
        // Use interface here
        pub->Release();
    }
    CoUninitialize();
    return 0;
}
The interface has two methods and both will return access to an
ICorPublishProcess interface. The first is GetProcess to which
you pass a Win32 process ID and it will return an object that implements the
ICorPublishProcess interface specifically for this process. The
second method is EnumProcesses which will return an enumerator
object that implements ICorPublishProcessEnum. This enumerator will
give access to all .NET processes and return an object that implements
the ICorPublishProcess interface for each one.
The ICorPublishProcess interface has a method called
IsManaged so that if you call GetProcess you can determine
if the process is managed or unmanaged. In addition, it has a method called
GetDisplayName. In spite of the name, this method actually returns the
path to the image file of the process. Once you have the name of a .NET file you
can use the unmanaged metadata API to get information about the assembly.
// fileName is the wide-character path of the image file (from GetDisplayName)
IMetaDataDispenserEx* pDisp = 0;
hr = CoCreateInstance(CLSID_CorMetaDataDispenser, NULL, CLSCTX_INPROC_SERVER, IID_IMetaDataDispenserEx, (void**)&pDisp);
if (SUCCEEDED(hr))
{
    IMetaDataImport* pImport = 0;
    hr = pDisp->OpenScope(fileName, 0, IID_IMetaDataImport, (IUnknown**)&pImport);
    if (SUCCEEDED(hr))
    {
        // The raw tables are read through IMetaDataTables on the same scope object
        IMetaDataTables* pTables = 0;
        hr = pImport->QueryInterface(IID_IMetaDataTables, (void**)&pTables);
        if (SUCCEEDED(hr))
        {
            DWORD maj = 0;
            pTables->GetColumn(0x20, 1, 1, &maj); // version major value
            DWORD min = 0;
            pTables->GetColumn(0x20, 2, 1, &min); // version minor value
            DWORD build = 0;
            pTables->GetColumn(0x20, 3, 1, &build); // version build value
            DWORD rev = 0;
            pTables->GetColumn(0x20, 4, 1, &rev); // version revision value
            DWORD colVal = 0;
            pTables->GetColumn(0x20, 7, 1, &colVal); // get short name of the assembly
            const char* name = NULL;
            const char* culture = NULL;
            if (colVal != 0)
            {
                pTables->GetString(colVal, &name); // extract the name from the string heap
            }
            pTables->GetColumn(0x20, 8, 1, &colVal); // get culture
            if (colVal != 0)
            {
                pTables->GetString(colVal, &culture); // extract the culture from the string heap
            }
            // The Assembly table does not have the public key token, so
            // ask the runtime to calculate it for us.
            byte* pKey = 0;
            ULONG keySize = 0;
            hr = StrongNameTokenFromAssembly(fileName, (byte**)&pKey, &keySize);
            if (!(hr == S_OK || hr == S_FALSE)) // neither S_OK nor S_FALSE
            {
                pKey = 0;
                keySize = 0;
            }
            // *** use the assembly info here ***
            if (pKey != 0) // only free a buffer that was actually allocated
            {
                StrongNameFreeBuffer(pKey);
            }
            pTables->Release();
        }
        pImport->Release();
    }
    pDisp->Release();
}
As explained earlier, metadata is contained in tables. Table 0x20 is Assembly, that is, the table that
holds the information about the loaded assembly's name. Each table is
made up of rows and columns, and you get the value of a column in a
particular row by passing GetColumn the table number, the zero-based
index of the column and the one-based index of the row.
GetColumn always returns an integer. Some columns hold string data, and
in this case GetColumn will return the
index of the string in the string heap. The GetString method will
return the identified string as a UTF-8 string (names in the string heap are stored as UTF-8;
it is the separate user string heap that holds UTF-16 text). In this code, the assembly short name and culture are held
as strings.
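Once the version numbers, short name and culture have been read, they can be combined into the display name format that the Fusion log uses. The following is only a sketch: MakeFullName is a hypothetical helper of mine, not part of the metadata API, and it assumes the public key token has been obtained separately and already converted to hex text.

```cpp
#include <string>

// Hypothetical helper (not part of any Microsoft API): composes an
// assembly display name from values read out of the Assembly table.
// An empty culture is written as "neutral" and an empty token as "null",
// which is how display names show assemblies without those values.
std::string MakeFullName(const std::string& name,
                         unsigned long verMajor, unsigned long verMinor,
                         unsigned long verBuild, unsigned long verRevision,
                         const std::string& culture,
                         const std::string& tokenHex)
{
    const std::string cultureText = culture.empty() ? std::string("neutral") : culture;
    const std::string tokenText = tokenHex.empty() ? std::string("null") : tokenHex;

    std::string full = name;
    full += ", Version=" + std::to_string(verMajor) + "." + std::to_string(verMinor)
          + "." + std::to_string(verBuild) + "." + std::to_string(verRevision);
    full += ", Culture=" + cultureText;
    full += ", PublicKeyToken=" + tokenText;
    return full;
}
```

Keeping the result in the same format as the Fusion log makes it easy to compare the two by eye.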
Column 6 is the public key column, but it holds an index into the blob heap for
the full public key rather than the token, so it does not give the token
directly. Instead I call the StrongNameTokenFromAssembly
function exported from the mscoree.dll library. This function will
allocate memory for the token, so when you have finished with it you must call
StrongNameFreeBuffer. Note that there is no static import library
for these functions, so you have to call LoadLibrary to load the
DLL and GetProcAddress to get the function address. The token will be
8 bytes and the name of the assembly gives these bytes in the reverse order of
the bytes in this array.
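To show the token in the same form as the assembly name, you need to turn the byte array into hex text. Here is a minimal sketch of that step. FormatToken is a hypothetical helper of mine, and it relies on the observation above that the name gives the bytes in the reverse order of the array; check this against your own output before depending on it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of any Microsoft API): converts a token
// buffer into the lowercase hex text used in an assembly display name.
// Assumption taken from the text above: the display name lists the bytes
// in the reverse order of the array, so the buffer is walked backwards.
std::string FormatToken(const unsigned char* token, std::size_t size)
{
    static const char digits[] = "0123456789abcdef";
    std::string text;
    text.reserve(size * 2);
    for (std::size_t i = size; i > 0; --i)
    {
        const unsigned char b = token[i - 1];
        text.push_back(digits[b >> 4]);   // high nibble first
        text.push_back(digits[b & 0x0F]); // then low nibble
    }
    return text;
}
```

A zero-length buffer gives an empty string, which is a convenient way to represent an assembly that has no strong name.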
As you can see, you can use this code to continually poll for new .NET applications and when a .NET application starts up you can get the full name of the application and use this to determine if the process is new or one that has been run before. For a new process you could store the process path, the full name of the assembly and the date that the process was installed on the machine. If the same file is executed at a later stage, you could compare the full name and the date that the file was written to the disk with the data that you stored earlier and if the two sets of data don't agree your process can flag the user. Clearly the storage of this information must be secure and if you can do this then the mechanism described here should protect you from a virus attacking your processes. Note that this code does not have access to the running process, so you will not be able to pause the application while the user decides what to do.
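The comparison described here can be sketched as follows. This is an illustration only: AssemblyRecord, AssemblyStore and CheckResult are hypothetical names of mine, the store is a plain in-memory map, and a real implementation would have to persist and protect the data as noted above.

```cpp
#include <map>
#include <string>

// Hypothetical record of what was seen the first time a process ran.
struct AssemblyRecord
{
    std::string fullName;     // assembly full name read from the metadata
    long long lastWriteTime;  // file write time, in whatever unit you store
};

enum class CheckResult { NewProcess, Unchanged, Changed };

// Hypothetical in-memory store keyed on the process path. It is neither
// persistent nor secure; it only illustrates the comparison logic.
class AssemblyStore
{
public:
    CheckResult Check(const std::string& path, const AssemblyRecord& seen)
    {
        auto it = records_.find(path);
        if (it == records_.end())
        {
            records_.emplace(path, seen); // first sighting: remember it
            return CheckResult::NewProcess;
        }
        const AssemblyRecord& known = it->second;
        if (known.fullName == seen.fullName && known.lastWriteTime == seen.lastWriteTime)
        {
            return CheckResult::Unchanged;
        }
        return CheckResult::Changed; // keep the original record for the user to inspect
    }

private:
    std::map<std::string, AssemblyRecord> records_;
};
```

The stored record is deliberately left alone when a change is detected, so that repeated runs of a tampered file keep being reported until the user decides what to do.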
Since you have access to the metadata of an assembly you can also get a list of the assemblies that the loaded assembly uses, so you can then recursively get the full name of those assemblies and the assemblies they use and put this information in your store. Note that it is not straightforward to get the file path from the assembly full name. Shared assemblies will be in the GAC and are subject to publisher policies (which may redirect to another version). Private assemblies can be stored in the application folder or a subfolder, and if a private assembly is strong named the configuration file could redirect the runtime to a copy of the assembly elsewhere on the local machine or on another machine.
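The recursive walk over referenced assemblies might look something like the sketch below. The lookup callback is an assumption of mine: it stands in for whatever code you write to read the references of an assembly from its metadata, and CollectReferences is a hypothetical name. The set of names already seen stops the walk from looping if two assemblies end up referring to each other.

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical callback: given an assembly full name, returns the full
// names of the assemblies it references (read from metadata elsewhere).
using ReferenceLookup =
    std::function<std::vector<std::string>(const std::string&)>;

// Collects fullName and everything reachable from it into 'seen'.
void CollectReferences(const std::string& fullName,
                       const ReferenceLookup& lookup,
                       std::set<std::string>& seen)
{
    if (!seen.insert(fullName).second)
    {
        return; // already visited, so do not walk it again
    }
    for (const std::string& ref : lookup(fullName))
    {
        CollectReferences(ref, lookup, seen);
    }
}
```

When the call returns, the set holds every full name that should be written to your store.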
You can inject your own code into a process as a profiler. A profiler is a COM
object that implements the ICorProfilerCallback interface and, if
profiling is enabled, the runtime will call methods on this interface when
certain events occur. When the .NET application starts, Initialize
is called and passed an ICorProfilerInfo interface pointer which
the profiler can call to get information. Unfortunately, you cannot get the full
name of the process this way! However, since the profiler is an in-process COM
object you can call the Win32 GetModuleFileName and then use the metadata API to get
the full name of the assembly. Note that if Initialize throws a
Win32 exception then there will be an execution engine exception and the .NET
process will not run!
It would be far better if the .NET runtime did all of this work for you. Clearly Microsoft have access to the strong name of an assembly, since they even provide this in their Fusion log files. It would be straightforward for them to check every process that is run to determine if it has changed since the last time it was run.
17.5 Conclusion
Here are some tips. These should be obvious, but I will give them anyway:
- Use XP or later. This means that viruses cannot hijack the unmanaged entry point.
- Do not run under an elevated account. Typically, make sure that the users of your machine (and hence the users of your applications) are Users accounts.
- Install applications and private libraries in folders that give only Read and Execute permissions to the Users group. The combination of this point and the last point should ensure that code that runs on your machine should not have write or append access to your applications.
- You can prevent round tripping by code by ensuring that a class has two private fields of the same type, or two private methods with the same signature (if necessary, add dummy parameters).
- Regularly check the processes set to run at start up. There are several ways to make code start at start up, so the simplest way to do this check is to run msconfig.exe (in %windir%\PCHealth\HelpCtr\Binaries). msconfig allows you to temporarily disable a process set to start up, so you can assure yourself that you can undo your action if you pick the wrong file.
There are several points that can be gleaned from this article. First, the good points:
- To date no .NET virus has been distributed as a downloaded plug-in to .NET code. Code Access Security has closed this security hole.
- To date no .NET virus has targeted the framework libraries. The security of the GAC provided by NTFS and the protection of the strong name on the libraries have prevented this.
- To date no .NET virus has targeted the (unmanaged) runtime binaries. This is most likely due to the complexity of the runtime, but the installation in a folder with appropriate NTFS ACLs does help.
- No code has been able to smash the stack or heap. Buffer overruns simply do not occur in managed code.
- There have only been thirteen .NET viruses in the seven years that the runtime has been available to the public (in beta or RTM). Of those viruses only three can be considered true .NET viruses that target the .NET assembly structure.
Code Access Security, the managed heap and the runtime managed stack are clearly great benefits to .NET. So far no one has been able to hijack .NET code through a buffer overrun (or some other stack smashing technique), nor has any Trojan been able to use elevated privileges to do something that it does not have the permission to do. It is possible that a misconfigured machine could allow untrusted code to run, but that is not a vulnerability in .NET, it is a vulnerability in allowing network administrators to administer machines.
The lack of .NET viruses may be explained in four ways. First, it may be because the best virus writers do not use .NET. This is a compelling argument because the weak point on a Windows machine is Windows itself, and the best way to exploit those weaknesses is to use an unmanaged language. Second, virus writers want their code to be compatible with as many machines as possible. XP does not have the .NET framework by default and it needs user intervention to ensure that it is installed. Although .NET 1.1 was provided on the XP SP2 CD the user still had to make an explicit decision to install it. Windows 2003 Server was the first version of Windows to have the runtime installed as part of the operating system. Even if a machine has the runtime installed there is the issue of the version: currently, there are three versions of .NET available. This means that a virus writer does not have a guarantee that the necessary framework to run his/her code will be on every machine, and worse, if the virus attempts to run on a machine without the right runtime the user will get a telltale error. Virus writers don't like people knowing that unauthorized code is running, or has failed to run, on their machine. Third, it might just be that the complexity of the runtime has meant that it has taken the virus writers a long time to learn .NET sufficiently well to be able to exploit its vulnerabilities. The author of Impanate clearly has a good understanding of the structure of .NET files, yet this virus appeared six years after the first beta of .NET was released and three and a half years after the RTM version was released. Finally, it may be that the virus writers acknowledge that .NET is secure and prefer to spend their time on easier targets.
The bad news
- If virus writers are capable of hijacking Win32 libraries it is only a matter of time before a virus writer targets the (unmanaged) runtime binaries. The Shared Source CLI provides a reference implementation of the .NET runtime and this gives virus writers even more help in their quest to find ways of hijacking the runtime.
- There is no protection mechanism on .NET executables. A virus that has access to the folders where a .NET application is installed can inject its own code without the user noticing. The .NET runtime makes no attempt to prevent this.
How do you protect your code? The most important thing is to harden Windows. Run
a firewall, run a virus checker, do not run executable code sent to you from an
untrusted source (via email, as embedded code on a web page...), do not run as
an Administrator and make sure that no other accounts on the
machine can run as an Administrator, and install your .NET applications in folders that have appropriate NTFS ACLs.
These are as applicable to managed code as they are applicable to any code on
your machine.
| I hope that you enjoy this tutorial and value the knowledge that you will gain from it. I am always pleased to hear from people who use this tutorial (contact me). If you find this tutorial useful then please also email your comments to mvpga@microsoft.com. |
Errata
If you see an error on this page, please contact me and I will fix the problem.