.NET Instrumentation Workshop

1. Instrumentation

You instrument an application to allow you to monitor how the application is working. Instrumentation can be vital when you are trying to determine why an application is failing. Equally so, instrumentation can be the cause of poor performance. In this workshop I will outline the various mechanisms that .NET provides to allow you to instrument your code. There is a wide range of options, but sadly Microsoft have failed to understand some of the basic tenets of instrumentation which means that through no fault of your own, you may add deadlocks or performance hogs into your code. At the end of this workshop you will know where the problems are with Microsoft's instrumentation code and you will know how to work around the problems.

1.1 Basic Tenets of Instrumentation

The First Tenet

The most important effect of instrumentation is the observer effect, which is often loosely attributed to Heisenberg's Uncertainty Principle. In effect this says that:

when you measure a system, the action of measurement affects what you measure

Thus, when you add code to monitor your application, the monitoring code will affect the values you obtain. Whenever I read articles about instrumentation I rarely see this stated; the authors act as if the monitoring is totally free. It isn't. Indeed, the instrumentation code often has performance and threading issues, and adding instrumentation to your code could make the difference between an application that runs well and an application that runs slowly and occasionally deadlocks for no apparent reason.
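The cost of measurement is easy to see for yourself. The following is a minimal sketch, not a rigorous benchmark: it times the same loop with and without per-iteration tracing. The class name MeasurementCost, the workload and the iteration count are all assumptions made up for illustration, and the trace goes to an in-memory StringWriter, which is far cheaper than a real trace listener.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class MeasurementCost
{
    // Hypothetical workload: sums the integers below n and, if a writer is
    // supplied, traces every intermediate value to it.
    public static long Sum(int n, TextWriter trace)
    {
        long total = 0;
        for (int i = 0; i < n; i++)
        {
            total += i;
            if (trace != null)
            {
                trace.WriteLine("i={0} total={1}", i, total);
            }
        }
        return total;
    }

    // Times one run of Sum and returns the elapsed wall-clock time.
    public static TimeSpan Time(int n, TextWriter trace)
    {
        Stopwatch watch = Stopwatch.StartNew();
        Sum(n, trace);
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        const int n = 200000; // arbitrary iteration count chosen for the sketch

        Time(n, null); // warm-up run so that JIT compilation of Sum is not counted
        TimeSpan bare = Time(n, null);

        TimeSpan traced;
        using (StringWriter sink = new StringWriter())
        {
            traced = Time(n, sink);
        }

        Console.WriteLine("without tracing: {0} ms", bare.TotalMilliseconds);
        Console.WriteLine("with tracing:    {0} ms", traced.TotalMilliseconds);
    }
}
```

The exact figures will vary from machine to machine and from run to run, so treat them as an indication only. The point is that the traced run does more work than the untraced one, and that extra work is part of what you end up measuring.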

The Second Tenet

The next most important tenet is that you should not use your customers to debug your applications. Too often I hear developers claim that they must have tracing code in their release builds because, if the application fails in the field, they can use the results of the tracing code to determine the issue so that they can fix the code. I am fully aware that no code can be 100% bug free, but the argument that tracing code has to be in released code is an admission that you think that there are some bugs still in your code. Asserts in released code are an admission that you really do not have any quality control. You use an assert when you know that a particular value would be severely detrimental to your application. If a customer sees an assert generated she knows that you knew that there was a bug, and that you simply had not bothered to find out what conditions would cause the bug to appear.

The Third Tenet

The third tenet of instrumentation is that the information you provide for your customers should be appropriate to them, and the information you provide for your developers should be appropriate to them. Customers will not want to know the intermediate results of your algorithm, but your developers may. If you follow the second tenet and remove all tracing code from your release builds then you will not have to worry about whether your tracing messages are appropriate for your customers. I am willing to argue for the second tenet until I am blue in the face, but I recognise that some developers will ignore my advice. Even so, they can at least mitigate the possibility of information leakage by making sure that release builds only trace information that will be of use to the customer.

Logging information is different to tracing information. A log provides permanent information that your application has run and what the application has done. Such logging information is useful for auditing. If you find that information is useful for debugging, then it is tracing information, not logging information.

The Fourth Tenet

The fourth tenet is that you should be aware that your application is not the only application on the machine and therefore any information that you trace or log should not overwhelm any information logged by other applications. Some developers trace every small detail to the event log which has the effect of squeezing out messages from other applications. You should really question why it is necessary to log huge amounts of data, but if you think that it is necessary then play fair and keep that data to yourself.

As a final comment for this section it is instructive to point out that Microsoft do not have tracing in their own release mode code. If Microsoft decide that they will not use their customers as beta testers, do you think it is a good idea for you to use yours?

1.2 Instrumentation In .NET

The .NET framework provides several mechanisms to instrument your code, and the classes can be found in the System.Diagnostics namespace.

Conditional Code
The framework lets you add code to your assemblies that is compiled only under specific conditions. A compiler can use a condition to determine whether it will put the code in the assembly, and it can use a condition to determine whether it will call the code (note that these are two different situations). In addition, your code can decide at run time whether code is called, and the framework provides classes that allow you to make this decision.
Traces
The framework provides an extensive mechanism to allow you to add trace messages to your code. Trace messages are useful when debugging code because you can use them to follow code flow and watch intermediate values. The framework provides an extendable architecture that allows you to collect trace messages and store them.
Asserts
An assert is a check on an important value on which your application depends. If the value is not correct then the rest of your application will be suspect. A failed assert typically suspends the running of the application. For this reason, asserts are vital for debugging but are useless for release mode applications.
Event Log
The event log is a system resource and is shared by all processes on the system. It has two main features. Firstly, because it is a repository for logged information it provides a way for processes without a user interface (for example services) to provide feedback. The second feature is that this repository is shared which means that messages from several processes are collated. If you are trying to determine why an application is failing it may be useful if you find that other applications on the system are failing too.
Performance Counters
Performance counters provide instantaneous values or rates that can be viewed over time. Performance counters can be used to determine how a process is working and are used by administrators to monitor the general health of a system.
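To make the first three mechanisms concrete, here is a minimal sketch that uses conditional code, a trace message and an assert together. The class InstrumentationSketch and its methods are hypothetical names invented for this illustration, and the #define lines merely stand in for the symbols that a debug build would normally pass to the compiler. The event log and performance counters are left out because they need a Windows machine and, in some cases, administrative rights.

```csharp
#define DEBUG   // assumption: stands in for the compiler's debug-build symbol
#define TRACE   // assumption: stands in for the compiler's trace symbol

using System;
using System.Diagnostics;

public static class InstrumentationSketch
{
    // Conditional code: calls to this method are compiled only when DEBUG is
    // defined at the call site. The method itself is always in the assembly.
    [Conditional("DEBUG")]
    public static void DumpState(string label, int value)
    {
        Console.WriteLine("[debug] {0} = {1}", label, value);
    }

    // Hypothetical operation that depends on its input being even.
    public static int Halve(int value)
    {
        // Assert: a check on a value that the rest of the code depends on.
        Debug.Assert(value % 2 == 0, "Halve expects an even value");

        // Trace: the message is passed to every registered trace listener.
        Trace.WriteLine("Halve called with " + value);

        int result = value / 2;
        DumpState("result", result);
        return result;
    }

    public static void Main()
    {
        // Send trace output to the console as well as to the default listener.
        Trace.Listeners.Add(new ConsoleTraceListener());
        Console.WriteLine(Halve(8));
    }
}
```

If you remove the two #define lines and build without the DEBUG and TRACE symbols, the calls to Debug.Assert, Trace.WriteLine and DumpState are all omitted by the compiler, although the body of DumpState remains in the assembly.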

As you can see, there is a rich collection of features. But you should see that there are hidden dangers here. For example, performance counters (and as you'll see later, tracing) require inter-process communication and this means that your process will depend on the performance of another process out of your control.

1.3 Why Instrument Code?

There are several reasons why you will want to instrument your code. In the introduction for this workshop I identified two main reasons: tracing code for debugging and logging information. In some respects the mechanisms for both are very similar, but the type of data will be very different.

During application development tracing is vital as a post mortem technique. You know that your application has died, you know the exception that was thrown and where; but why did it happen? Tracing allows you to gain a record of the code paths and intermediate results which you can use as a forensic tool to determine why an application failed. Asserts also provide some post mortem information, but since the asserted value is so important for normal running the assert is often the cause of the application's death. Often an application dies because of an incorrect value obtained earlier on in the application - the code between the point when this incorrect value was obtained and when the application died could be totally bug free. An assert placed on such an important value allows you to investigate why the value was incorrect, at the point that it was obtained. This is an important aspect of asserts: to prevent the application running further when it is quite clear that fatal damage has already been done.

Instrumentation can also be used in conjunction with the debugger to make debugging easier. You can provide intermediate results and even debugging methods that will only be available in debug builds and can be called from the Immediate window or the Watch window in the Visual Studio debugger. You can also use symbols and conditional compilation, or conditional code, to provide different versions of the application, perhaps versions that have more debugging information, or that use alternative algorithms.
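As a rough sketch of a debugging method that exists only in debug builds, the following class keeps extra intermediate results and exposes a dump method when the DEBUG symbol is defined. The type RunningAverage and the method DebugDump are hypothetical names chosen for this illustration, and the #define line stands in for the symbol that a debug build would normally supply.

```csharp
#define DEBUG   // assumption: stands in for the compiler's debug-build symbol

using System;
using System.Collections.Generic;

public class RunningAverage   // hypothetical type used only for illustration
{
    private int sum;
    private int count;

#if DEBUG
    // Debug-only record of every sample, kept as an intermediate result.
    private readonly List<int> samples = new List<int>();
#endif

    public void Add(int sample)
    {
        sum += sample;
        count++;
#if DEBUG
        samples.Add(sample);
#endif
    }

    public double Value
    {
        get { return count == 0 ? 0.0 : (double)sum / count; }
    }

#if DEBUG
    // Debug-only method: when DEBUG is not defined this member is not
    // compiled into the assembly at all.
    public string DebugDump()
    {
        return string.Format("count={0} sum={1} samples=[{2}]",
            count, sum, string.Join(", ", samples));
    }
#endif
}

public static class DebugOnlyDemo
{
    public static void Main()
    {
        RunningAverage average = new RunningAverage();
        average.Add(2);
        average.Add(4);
        Console.WriteLine(average.Value);
#if DEBUG
        Console.WriteLine(average.DebugDump());
#endif
    }
}
```

With a breakpoint set after the calls to Add, you could evaluate average.DebugDump() in the Immediate window. Because the method and the samples field sit inside #if DEBUG blocks, a build without that symbol contains neither of them.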

As you can see, .NET provides a rich collection of facilities for generating debug information. However, note that it is debug information and should be used as such. Debug information should only appear in debug builds.

Instrumentation can also be used to provide logging information for release builds. There are two general types of information: logs and performance counters. Logs should be used relatively sparingly and should only log important information. You should log two sorts of information: auditing information, that is, a record of what your application does, such as transaction IDs; and critical events, that is, if your algorithm fails then log this, but do not be tempted to dump large amounts of information because that is a debugging task. Performance counters give a measure of how the system is running and could be a rate, like the number of transactions handled per second, or a value, like the number of current transactions. In general performance counters should be useful to system administrators rather than developers. An administrator will need to know the current health of a system, and to get a measure to plan ahead for system expansion, so if your application shows that its disk usage is getting near to its quota, or its memory usage is near to the amount of memory in the machine, the administrator can take steps to remedy these issues.

I hope that you enjoy this tutorial and value the knowledge that you will gain from it. I am always pleased to hear from people who use this tutorial (contact me). If you find this tutorial useful then please also email your comments to mvpga@microsoft.com.

This page is (c) 2007 Richard Grimes, all rights reserved