July 2001/The C# Delegate

Features

The C# Delegate

Stanley B. Lippman

If you’re comparing C# to other “C-family” languages, here’s an unusual feature that has no real equivalent in C++ or Java.

C# is a new and somewhat controversial language invented at Microsoft and delivered as a cornerstone of their new Visual Studio.NET, currently in a first Beta release. C# combines a great many of the features of both C++ and Java. The primary criticism of C# within the Java community is the claim that it is just an imperfect Java clone — one that is more the result of litigation than of language innovation. In the C++ community, the primary criticism, which is also leveled at Java, is that C# is just yet another over-hyped proprietary language.

The purpose of this article is to illustrate a feature of the C# language for which there is no analogous direct support in either C++ or Java. This is the C# delegate type, which serves as a kind of generic pointer to member function. The C# delegate type is, I believe, a thoughtfully innovative language feature that should prove of special interest to the C++ programmer regardless of his or her feelings about either C# or Microsoft.

To motivate the discussion, I’ve loosely organized it around the design of a testHarness class that permits any class to register one or more static or non-static class methods for subsequent execution. The delegate type is at the center of this implementation.

The C# Delegate Type

A delegate is a kind of pointer to function, but with three primary differences:

1) A delegate object can address multiple methods rather than only one method at a time. When we invoke a delegate that addresses multiple methods, the methods are invoked in the order they are assigned to the delegate object — we’ll see how to do that shortly.

2) The methods addressed by a delegate object do not need to be members of the same class. The methods addressed by a delegate object must all share the same prototype and signature. Those methods, however, can be a combination of both static and non-static methods and may be members of one or more different classes.

3) A declaration of a delegate type internally creates a new subtype of either the Delegate or MulticastDelegate abstract base class of the .NET library framework. That base class supports a collection of public methods to query the delegate object and the method(s) to which it refers.
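
The first two differences can be sketched in a few lines. This is only an illustration under assumed names: the Notify delegate type and the Logger and Mailer classes are hypothetical and are not part of the test harness developed in this article. One delegate object addresses a static method of one class and a non-static method of another, and the two are invoked in the order in which they were assigned.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical delegate type: addresses methods that take a string and return void.
public delegate void Notify(string message);

public class Logger
{
    public static List<string> Seen = new List<string>();

    // A static method matching the Notify signature.
    public static void Log(string message) { Seen.Add("log:" + message); }
}

public class Mailer
{
    // A non-static method of a different class with the same signature.
    public void Send(string message) { Logger.Seen.Add("mail:" + message); }
}

public static class MulticastDemo
{
    public static List<string> Run()
    {
        Logger.Seen.Clear();
        Notify n = new Notify(Logger.Log);     // static method
        n += new Notify(new Mailer().Send);    // instance method of another class
        n("hi");                               // invoked in order of assignment
        return Logger.Seen;
    }

    public static void Main()
    {
        foreach (string s in Run())
            Console.WriteLine(s);              // prints "log:hi" then "mail:hi"
    }
}
```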

Declaring a Delegate Type

The declaration of a delegate type generally consists of four components: (a) an access level, (b) the keyword delegate, (c) the return type and signature of the method the delegate type addresses, and (d) the name of the delegate type, which is placed between the return type and signature of the method. For example, the following declares Action to be a public delegate type that addresses methods taking no parameters and with a return type of void:
public delegate void Action();
At first glance, this looks shockingly like a function definition; the only difference is the delegate keyword. The intended symmetry here is to distinguish an ordinary member function from each special case with a keyword rather than a token: virtual, static, and delegate.

If a delegate type is used to address only a single method at any one time, it may address a member function of any return type and signature. If, however, the delegate type addresses two or more methods simultaneously, the return type must be void in the Beta release described here. (Released versions of C# lift this restriction: invoking a delegate that addresses several non-void methods yields the value returned by the last one.) Action, for example, can be used to address either a single or multiple methods. This is the declaration we’ll use for our test harness.

Defining a Delegate Handle

We cannot declare global objects in C#; every object definition must be either a local object, a member of a type, or a parameter in the argument list of a function. For the moment, I’ll just show you the definition of a delegate handle. Then we’ll look at how we declare it as a class member.

A delegate type in C#, as well as the class, interface, and array types, is a reference type. A reference type is separated into two parts:

A named handle that we manipulate directly, and

An unnamed object of the handle’s type that we manipulate indirectly through the handle. This object must be explicitly created using the new expression.

The definition of a reference type is a two-step process. When we write
Action theAction;
theAction represents a handle to a delegate object of the Action delegate type, but is not itself the delegate object. As a class member, it is set to null by default. As a local object, it has no initial value, and if we attempt to use it before it is assigned a value, a compile-time error is generated. For example, the statement
theAction();
causes the invocation of the method(s) addressed by theAction. However, unless a local theAction has been unconditionally assigned between its definition and this use, the invocation triggers a compile-time error message.
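
A small sketch may make the handle behavior concrete. The HandleDemo class and its Step delegate type are hypothetical names chosen for illustration. The offending call is left in a comment so that the program still compiles; CS0165 is the compiler's "use of unassigned local variable" diagnostic.

```csharp
using System;

public static class HandleDemo
{
    public delegate void Step();     // hypothetical delegate type

    static Step fieldHandle;         // a member handle defaults to null

    public static bool FieldStartsNull() { return fieldHandle == null; }

    static void Hello() { Console.WriteLine("hello"); }

    public static void Main()
    {
        Step local;                  // a local handle starts out unassigned
        // local();                  // error CS0165 if uncommented
        local = new Step(Hello);     // now definitely assigned
        local();                     // prints "hello"
        Console.WriteLine(FieldStartsNull());   // prints "True"
    }
}
```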

Allocating a Delegate Object

In this section, we’ll need to minimally access both a static and non-static method, for which I’ve volunteered the following Announce class. The static announceDate method prints the current date to standard output in the long form:
Monday, February 26, 2001
while the non-static announceTime method prints the current time to standard output in the short form:
00:58
where the first two digits represent the hour, beginning at zero for midnight, and the second two digits represent the minute. The class definition looks as follows. It makes use of the DateTime class provided within the .NET class framework.
using System;

public class Announce
{
   public static void announceDate()
   {
      DateTime dt = DateTime.Now;
      Console.WriteLine( "Today's date is {0}",
                         dt.ToLongDateString() );
   }

   public void announceTime()
   {
      DateTime dt = DateTime.Now;
      Console.WriteLine( "The current time now is {0}",
                         dt.ToShortTimeString() );
   }
}
To set theAction to address either method, we must create an Action delegate object using the new expression. For a static method, the argument to the constructor is the name of the class of which the method is a member and the name of the method itself, joined by the dot operator (.):
theAction = new Action( Announce.announceDate );
For a non-static method, the argument to the constructor is the class object through which we wish to invoke the method and the method name, again joined by the dot operator:
Announce an = new Announce();
theAction   = new Action( an.announceTime );
Notice that theAction is reassigned without first checking to see if it currently addresses an object on the heap and, if so, deleting that. In C#, objects on the managed heap are garbage collected by the runtime environment. We do not explicitly delete objects allocated through the new expression.

The new expression is used to allocate either a single object
HelloUser myProg = new HelloUser();
or an array of objects
string [] messages = new string[ 4 ];
on the program’s managed heap. The name of a type follows the keyword new followed by either a pair of parentheses (to indicate a single object) or a bracket pair (to indicate an array object) [1]. (A pervasive characteristic of the C# language design is this insistence on a single distinct form to differentiate between different functionality.)

Garbage Collection: A Quick Overview

When we allocate a reference type on the managed heap, such as the following array object:
int [] fib = new int[6]{ 1,1,2,3,5,8 };
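we can picture the object as keeping a count of the handles that refer to it. (The runtime does not actually store such a count; its garbage collector traces which objects are still reachable from live handles. The count is simply a convenient model.) In this example, the array object referred to by fib has an associated reference count initialized to 1. If we now initialize a second array handle with the object referred to by fib: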
int [] notfib = fib;
the initialization copies the handle, not the array itself (a shallow copy). That is, notfib now also refers to the same array object addressed by fib. The array object’s associated reference count is now 2.

If we modify an element of the array through notfib, as in
notfib [ 0 ] = 0;
that change is also visible through fib. If that kind of multiple access is not acceptable, we would need to program a deep copy. For example,
// allocate a separate array object
notfib = new int [6];

// copy the elements of fib into notfib
// beginning at element 0 of notfib
// see note [2]
fib.CopyTo( notfib, 0 );
notfib no longer addresses the same object referred to by fib. The object previously referred to by both now has its associated reference count decremented by 1. The object referred to by notfib has an initial reference count of 1. If we now reassign fib to also address a new array object — for example, one that contains the first 12 values of the Fibonacci sequence:
fib = new int[12]{ 1,1,2,3,5,8,13,21,34,55,89,144 };
the array object previously referred to by fib now has a reference count of zero. An object on the managed heap that no handle refers to is eligible for deletion when, and if, the garbage collector becomes active.
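
The difference between sharing an array through a second handle and copying its elements can be checked with a short program. This is a minimal sketch; the ArrayHandleDemo class is a hypothetical wrapper, while the array operations are the ones shown above.

```csharp
using System;

public static class ArrayHandleDemo
{
    public static int[] Run()
    {
        int[] fib = new int[6] { 1, 1, 2, 3, 5, 8 };
        int[] notfib = fib;        // copies the handle: one array, two names
        notfib[0] = 0;             // the change is visible through fib as well
        int viaFib = fib[0];       // 0

        notfib = new int[6];       // allocate a separate array object
        fib.CopyTo(notfib, 0);     // copy the elements of fib into notfib
        notfib[1] = 99;            // fib[1] is unaffected this time

        return new int[] { viaFib, fib[1], notfib[1] };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine("{0},{1},{2}", r[0], r[1], r[2]);   // prints "0,1,99"
    }
}
```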

Defining Our Class Properties

Let’s declare the delegate object as a private static member of our testHarness class. For example [3],
public class testHarness
{
   public delegate void    Action();
   static private  Action  theAction;
   // ...
}
Our next step is to provide read and write access to the delegate member. In C#, we do not provide explicit inline methods to read and write non-public data members. Rather, we provide get and set accessors within a named property. Here is our simple delegate property. I’ve called it Tester:
public class testHarness
{
   static public Action Tester
   {
      get{ return theAction; }
      set{ theAction = value; }
   }

   // ...
}
A property can encapsulate either a static or non-static data member. Tester is a static property of type Action, our delegate type. If we wish the property to support read access, we provide a get accessor. (Note the absence of a signature. We define an accessor as a block of code. The compiler internally generates an inline method.)

get must return a value of the property’s type. In this example, it simply returns the object it encapsulates. In a lazy allocation design, a first invocation of get may construct then cache the object for later retrieval.
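
A lazy allocation design of this kind might look as follows. This is a sketch under assumed names: ReportHost, Step, Current, and the Constructions counter are all hypothetical, and the counter exists only to show that the delegate object is constructed once.

```csharp
using System;

public class ReportHost
{
    public delegate void Step();           // hypothetical delegate type
    static private Step cached;            // null until first requested
    static public int Constructions = 0;   // counts how often we allocate

    static void DefaultStep() { Console.WriteLine("default step"); }

    static public Step Current
    {
        get
        {
            if (cached == null)            // first call: construct and cache
            {
                cached = new Step(DefaultStep);
                Constructions++;
            }
            return cached;                 // later calls: reuse the cached object
        }
        set { cached = value; }
    }

    public static void Main()
    {
        ReportHost.Current();              // prints "default step"
        ReportHost.Current();              // prints "default step"
        Console.WriteLine(Constructions);  // prints "1"
    }
}
```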

Similarly, if we wish the property to support write access, we provide a set accessor. Within set, value is a contextual keyword. That is, within a set accessor only, value has a predefined meaning: it is always an object of the type of its property. In our example, it is an object of type Action. At run time, it is bound to the right-hand side of the assignment. In the following example,
Announce an = new Announce();
testHarness.Tester = 
    new testHarness.Action
    ( an.announceTime );
set is expanded inline in place of the occurrence of Tester. The value object is set to the object returned by the new expression.

Invoking the Delegate Object

As we saw earlier, we invoke the method addressed by the delegate by applying the call operator to the delegate:
testHarness.Tester();
This invokes the get accessor of the Tester property, which returns the theAction delegate handle. The call throws a NullReferenceException, however, if theAction is not addressing a delegate object at this point of invocation. The canonical delegate-test-and-execute sequence from outside the class looks as follows:
if ( testHarness.Tester != null )
   testHarness.Tester();
For our testHarness class, our primary method simply encapsulates this test:
static public void run()
{
   if ( theAction != null )
      theAction();
}
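To see why the test matters, the following sketch contrasts a guarded and an unguarded invocation. MiniHarness, Step, and Steps are hypothetical stand-ins for the harness members above, and Run returns a bool only so that the outcome can be observed.

```csharp
using System;

public class MiniHarness
{
    public delegate void Step();          // hypothetical delegate type
    static private Step steps;            // null until something registers

    static public Step Steps
    {
        get { return steps; }
        set { steps = value; }
    }

    // Guarded invocation: reports whether anything was run.
    static public bool Run()
    {
        if (steps == null) return false;
        steps();
        return true;
    }

    // Unguarded invocation through a null handle fails at run time.
    static public bool UnguardedThrows()
    {
        Step none = null;
        try { none(); return false; }
        catch (NullReferenceException) { return true; }
    }

    static void Ping() { Console.WriteLine("ping"); }

    public static void Main()
    {
        Console.WriteLine(Run());               // prints "False"
        Console.WriteLine(UnguardedThrows());   // prints "True"
        Steps = new Step(Ping);
        Console.WriteLine(Run());               // prints "ping" then "True"
    }
}
```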

Assigning Multiple Delegate Objects

To have a delegate address more than a single method, we primarily use the += and -= operators. For example, imagine that we have defined a testHashtable class. Within its constructor, we add each associated test to the testHarness:
public class testHashtable
{
   public void test0() { /* ... */ }
   public void test1() { /* ... */ }
   public testHashtable()
   {
      testHarness.Tester += new testHarness.Action( test0 );
      testHarness.Tester += new testHarness.Action( test1 );
   }
   // ...
}
Similarly, when we define a testArrayList class, we add each associated test within its default constructor. Notice that these methods are static.
public class testArrayList
{
   static public void testCapacity() { /* ... */ }
   static public void testSearch()   { /* ... */ }
   static public void testSort()     { /* ... */ }

   public testArrayList()
   {
      testHarness.Tester += new
         testHarness.Action(testCapacity);
      testHarness.Tester += new testHarness.Action(testSearch);
      testHarness.Tester += new testHarness.Action(testSort);
   }
   // ...
}
When the testHarness.run method is invoked, we do not in general know whether the testHashtable or testArrayList methods are invoked first — that depends on the order of their constructor invocations. But we do know that within each set of class methods, the order of invocation reflects the order of assignment.
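
The ordering rule can be observed with a reduced version of this design. The sketch below uses hypothetical names throughout (Registry, SuiteA, SuiteB, and a public Steps field in place of a property, for brevity). Constructing SuiteB before SuiteA places its method first in the invocation order, while the methods of SuiteA keep their order of assignment.

```csharp
using System;
using System.Collections.Generic;

public class Registry
{
    public delegate void Step();                 // hypothetical delegate type
    static public Step Steps;                    // public field, for brevity only
    static public List<string> Trace = new List<string>();
}

public class SuiteA
{
    public void A0() { Registry.Trace.Add("A0"); }
    public void A1() { Registry.Trace.Add("A1"); }

    public SuiteA()
    {
        Registry.Steps += new Registry.Step(A0);
        Registry.Steps += new Registry.Step(A1);
    }
}

public class SuiteB
{
    static public void B0() { Registry.Trace.Add("B0"); }

    public SuiteB() { Registry.Steps += new Registry.Step(B0); }
}

public static class OrderDemo
{
    public static string Run()
    {
        Registry.Steps = null;
        Registry.Trace.Clear();
        new SuiteB();                            // constructed first: B0 registers first
        new SuiteA();
        Registry.Steps();                        // invokes B0, A0, A1 in that order
        return string.Join(" ", Registry.Trace.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(Run());                // prints "B0 A0 A1"
    }
}
```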

Delegate Objects and Garbage Collection

Consider the following code sequence within a local block:
{
   Announce an = new Announce();
   testHarness.Tester += 
      new testHarness.Action
      ( an.announceTime );
}
When we initialize a delegate object to a non-static method, both the address of the method and a handle to the class object through which to invoke the method are stored. This causes the class object’s associated reference count to be incremented.

When an is initialized with the new expression, the associated reference count of the object on the managed heap is initialized to 1. When an is passed to the constructor of the delegate object, the reference count of the Announce object is incremented to 2. With the termination of the local block, the lifetime of an terminates, and the reference count goes back down to 1, the reference held by the delegate object.

The good news is that the object associated with an invocation of a method referred to by a delegate is guaranteed not to be garbage collected until the delegate object no longer references the method [4]. We don’t have to worry about the object being cleaned up out from under us. The bad news is that the object persists until the delegate object no longer references the method. The method can be removed with the -= operator. For example, the following revised local block first sets, executes, and then removes announceTime from the delegate object.
{
   Announce an = new Announce();
   testHarness.Action act = new testHarness.Action( an.announceTime );

   testHarness.Tester += act;
   testHarness.run();
   testHarness.Tester -= act;
}
For the testHashtable class, our initial design strategy is likely to implement a destructor to remove the test methods added within its constructor. However, destructor invocation does not work the same as it does in C++ [5]. The destructor is not invoked following termination of its object’s lifetime nor with the release of the last reference handle to the object. Rather, the destructor is invoked only during a sweep of the garbage collector, the timing of which is generally unpredictable and, in fact, may not take place at all.

By convention, resource deallocation is factored out into a method named Dispose, which the user can directly invoke:
public void Dispose ()
{
   testHarness.Tester -= new testHarness.Action( test0 );
   testHarness.Tester -= new testHarness.Action( test1 );
}
If a destructor is defined for a class, it typically invokes Dispose.
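
The Dispose method above removes its tests by constructing new delegate objects rather than by keeping the originals. That works because two delegates that address the same method on the same object compare equal. The sketch below illustrates this under assumed names (Bus, Suite, Step). It also implements the library's IDisposable interface so that a using statement calls Dispose at the end of its block, which is one common way to make the call deterministic and is an addition to the design described in this article.

```csharp
using System;

public class Bus
{
    public delegate void Step();          // hypothetical delegate type
    static public Step Steps;             // public field, for brevity only

    // Number of methods currently addressed (0 when the handle is null).
    static public int Count()
    {
        return Steps == null ? 0 : Steps.GetInvocationList().Length;
    }
}

public class Suite : IDisposable
{
    public void Test0() { }

    public Suite() { Bus.Steps += new Bus.Step(Test0); }

    // A new delegate for the same object and method equals the one
    // added in the constructor, so -= finds and removes it.
    public void Dispose() { Bus.Steps -= new Bus.Step(Test0); }
}

public static class DisposeDemo
{
    public static int[] Run()
    {
        Bus.Steps = null;
        int during;
        using (Suite s = new Suite())     // Dispose runs when the block ends
        {
            during = Bus.Count();
        }
        return new int[] { during, Bus.Count() };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine("{0} {1}", r[0], r[1]);   // prints "1 0"
    }
}
```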

Accessing the Underlying Class Interface

Let’s go back to our earlier code sequence for a moment:
{
   Announce an = new Announce();
   testHarness.Action act = 
       new testHarness.Action
       ( an.announceTime );

   testHarness.Tester += act;
   testHarness.run();
   testHarness.Tester -= act;
}
An alternative implementation might first check to see if Tester already addresses one or several other methods. If so, it saves the currently set delegation list, resets Tester to act, invokes run, then resets Tester to the original delegation list.

To discover the number of methods a delegate actually addresses, we can make use of the underlying Delegate class interface. For example,
if ( testHarness.Tester != null &&
     testHarness.Tester.GetInvocationList().Length != 0 )
{
   testHarness.Action oldAct = testHarness.Tester;
   testHarness.Tester = act;
   testHarness.run();
   testHarness.Tester = oldAct;
}
else { ... }
GetInvocationList returns an array of Delegate class objects, each element of which represents a method currently addressed by the delegate. Length is a property of the underlying Array class, which implements the built-in C# array type [6].
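
The save, replace, and restore sequence can be exercised end to end with a small program. The names Runner, SwapDemo, and Step are hypothetical, and a public field stands in for the Tester property. Saving the old handle is safe because a delegate object is immutable: += and -= produce new delegate objects rather than modifying the one the saved handle refers to.

```csharp
using System;
using System.Collections.Generic;

public class Runner
{
    public delegate void Step();                 // hypothetical delegate type
    static public Step Steps;                    // public field, for brevity only
    static public List<string> Trace = new List<string>();

    static public void Run() { if (Steps != null) Steps(); }
}

public static class SwapDemo
{
    static void Old0() { Runner.Trace.Add("old0"); }
    static void Old1() { Runner.Trace.Add("old1"); }
    static void Solo() { Runner.Trace.Add("solo"); }

    public static string Go()
    {
        Runner.Trace.Clear();
        Runner.Steps = new Runner.Step(Old0);
        Runner.Steps += new Runner.Step(Old1);
        int before = Runner.Steps.GetInvocationList().Length;   // 2

        Runner.Step saved = Runner.Steps;        // remember the current list
        Runner.Steps = new Runner.Step(Solo);
        Runner.Run();                            // only Solo runs
        Runner.Steps = saved;                    // restore the original list
        Runner.Run();                            // Old0, then Old1

        return before + ":" + string.Join(",", Runner.Trace.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(Go());                 // prints "2:solo,old0,old1"
    }
}
```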

If we wish, we can access full run-time information about the addressed method through the Method property of the Delegate class. In addition, if the method is non-static, we can also access full run-time information about the object through which the method is to be invoked through the Target property of the Delegate class. The following example uses the GetInvocationList method and the Method and Target properties:
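if ( testHarness.Tester != null )
{
   Delegate [] methods = testHarness.Tester.GetInvocationList();
   foreach ( Delegate d in methods )
   {
      // MethodInfo is declared in the System.Reflection namespace
      MethodInfo theFunction = d.Method;

      // Target is null when the addressed method is static
      Type theTarget = d.Target != null ? d.Target.GetType() : null;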

      // ok: now we can find out everything about the
      //     method addressed by the delegate ...
   }
}
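As a self-contained illustration of what Method and Target expose, the sketch below describes each method a delegate addresses. The Clock and InspectDemo classes and the Step delegate type are hypothetical. Note the test of Target against null, which is needed because a static method has no associated object.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class Clock
{
    public static void Date() { }
    public void Time() { }
}

public static class InspectDemo
{
    public delegate void Step();          // hypothetical delegate type

    public static List<string> Describe(Step steps)
    {
        List<string> lines = new List<string>();
        if (steps == null) return lines;

        foreach (Delegate d in steps.GetInvocationList())
        {
            MethodInfo m = d.Method;
            // Target is null when the addressed method is static.
            string target = d.Target == null ? "static"
                                             : d.Target.GetType().Name;
            lines.Add(m.DeclaringType.Name + "." + m.Name + " (" + target + ")");
        }
        return lines;
    }

    public static void Main()
    {
        Step s = new Step(Clock.Date);
        s += new Step(new Clock().Time);
        foreach (string line in Describe(s))
            Console.WriteLine(line);
        // prints "Clock.Date (static)" then "Clock.Time (Clock)"
    }
}
```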

Summary

I hope this article has piqued your interest in the delegate type within C#, which I believe provides an innovative pointer-to-class-method mechanism, and perhaps in the C# language and the .NET class framework as well. A good starting page for technical resources is <http://www.microsoft.com/net/>. An informative news group with Microsoft developer input dealing with both .NET and C# is <http://discuss.develop.com/dotnet.html>. Of course, questions or comments on C# or this article can be addressed to me at stanleyl@you-niversity.com. Finally, C# is currently in the process of standardization. On October 31, 2000, Hewlett-Packard, Intel, and Microsoft jointly submitted a proposed C# draft standard to ECMA, an international standards body (ECMA TC39/TG2). The current draft standard and other documentation can be found at <http://www.ecma.ch>.

Acknowledgements

I would like to thank Josee Lajoie and Marc Briand for their thoughtful review of an earlier draft of this article. Their feedback has made this a significantly better article. I would also like to thank Caro Segal, Shimon Cohen, and Gabi Bayer of you-niversity.com for providing a safety.NET.

Notes

[1] Two initial C# trip-ups for the C++ programmer are (a) the need to place an empty pair of parentheses following the type name of an object with a default constructor, and (b) the placement of the array subscript between the type specifier and the name of the array.

[2] The built-in array in C# is a kind of Array class object provided in the .NET class library, and both the static and non-static methods of the Array class are available to each built-in array object in C#. CopyTo is a non-static Array method.

[3] In C#, as in Java, each member declaration includes its access level. The default access level is private.

[4] Similarly, the C++ Standard requires that a temporary addressed by a reference must not be eliminated until the end of the reference lifetime.

[5] Internally, destructors do not even exist. A class destructor is transformed into a virtual Finalize method.

[6] In C#, a conditional test must evaluate to a Boolean type. A direct test of Length’s value, such as if( testHarness.Tester.GetInvocationList().Length ), is not a valid condition test. An integer value is not implicitly converted into a Boolean value.

Stanley B. Lippman is IT Program Chair with you-niversity.com, an interactive e-learning provider of technical courses on Patterns, C++, C#, Java, XML, ASP, and the .NET platform. Previously, Stan worked for over five years in Feature Animation both at the Disney and DreamWorks Animation Studios. He was the software Technical Director on the Firebird segment of Fantasia 2000. Prior to that, Stan worked for over a decade at Bell Laboratories. Stan is the author of C++ Primer, Essential C++, and Inside the C++ Object Model. He is currently at work on C# Primer for the DevelopMentor Book Series for Addison-Wesley. He may be reached at stanleyl@you-niversity.com.