Java Solutions


Implementing equals for Mixed-Type Comparison

Angelika Langer and Klaus Kreft

Sometimes you want objects to be equal even if they're not. Here's how to make it happen.


All objects in Java are equality comparable via the equals method because all classes inherit the equals method from class Object. As a result, each time a programmer implements a new class, he or she must check whether the inherited version of equals provides the correct semantics for the new class or whether the default implementation must be overridden. Naturally, any user-defined version of equals should be correct. What does “correct” mean? It means meeting the requirements of the so-called equals contract, which is specified in JavaDoc of Object.equals.

In a previous article [1], we discussed at length how easy it is to inadvertently provide incorrect implementations of equals. The most common mistake is violation of the transitivity requirement imposed by the equals contract. It requires the following: if one object is equal to a second and the second is equal to a third object, then the first and the third object must be equal as well.

The core problem of a non-transitive implementation is that it attempts to allow comparison of objects of different types. All the problems regarding transitivity go away as soon as you disallow mixed-type comparison in the sense that equals returns false unless the two objects are of the exact same type.

The same-type comparison is easy to implement. Here is an example:

public boolean equals(Object other) {
  ...
  if (this.getClass() !=
      other.getClass())
    return false;
  else
    // these objects can be compared;
    // go and compare their fields
    ...
}

This type of implementation of the equals method is easy to get right; it is robust, easy to understand, easy to maintain, and for that reason the recommended way of implementing equals — unless you need something else.

Implementation of a Mixed-Type Comparison

The idea of comparing objects of different types is not entirely off base, in particular for classes from the same class hierarchy. After all, objects from the same class hierarchy (and by class hierarchy we mean all classes derived from a common superclass other than Object) have something in common, namely at least the superclass part. As we demonstrated in [1], providing a correct implementation of a mixed-type comparison is a non-trivial task. In this article, we will show one way of implementing a mixed-type comparison of objects from the same class hierarchy that meets the requirements of the equals contract.

Mixed-Type Comparison Semantics

The most common mistake in mixed-type comparison implementations is the semantics of the comparison.

Q: Under which circumstances do we consider two objects of different types from the same class hierarchy as equal?

A: If they have equal superclass parts.

Q: How about the subclass-specific parts?

The problem is that the intransitive implementations of equals do not care about the subclass-specific parts of the objects that they compare. Here is an example: say we have a Point superclass and a ColoredPoint subclass. When we compare ColoredPoint(0,0,RED) to Point(0,0), the typical implementation says they are equal. If you then go on to compare Point(0,0) to ColoredPoint(0,0,BLUE), then they, too, will compare equal. But when we compare ColoredPoint(0,0,RED) and ColoredPoint(0,0,BLUE), they should compare equal because of the transitivity requirement of the equals contract, but in fact they compare unequal. What went wrong?

The crux lies in the subclass-specific color field. A Point object should only compare equal to a ColoredPoint object if the color field contains a default value. Under this rule, the three objects from the example above are no longer equal. Points and ColoredPoints only compare equal when the ColoredPoint has a default color (i.e., Point(0,0) is equal to one and only one ColoredPoint with the coordinates (0,0)).

Transitivity is preserved under this rule. The default color does not matter. The only thing that counts is that for a transitive mixed-type comparison all the fields that cannot be compared (namely the subclass-specific fields) have well-defined default values. Hence the rule is: two objects from the same class hierarchy compare equal if and only if they have:

To make things a little more challenging, let us consider a slightly more complex class hierarchy and see whether the rule holds. Say, we have an additional subclass Point3D that is derived from class Point and that represents three-dimensional points. Both ColoredPoint and Point3D belong to the same class hierarchy — class Point being the root class of that hierarchy. What does it mean to compare a ColoredPoint to a Point3D?

Under the rule above, it means that a ColoredPoint and a Point3D compare equal if and only if they have the same Point part and default values for their subclass-specific parts, namely the color and the third coordinate. For instance, if black is the default color and zero the default coordinate, then Point(0,0), ColoredPoint(0,0,BLACK), and Point3D(0,0,0) would be equal, but neither ColoredPoint(0,0,RED) nor Point3D(0,0,1) would be equal to Point(0,0).

Let us now implement the rule.

Implementing the Mixed-Type Comparison

The functionality that we want to implement depends on the type of the this object and the other object. There are four cases:

  1. this and other are of the exact same type. All fields can and must be compared.
  2. this is of a subtype of other. other is smaller than this, and only the superclass fields can be compared for equality; the subclass-specific fields of this object must have default values.
  3. this is of a supertype of other. other is the larger object that has subclass-specific fields that must have default values. This can be implemented easiest if we let other do the work, which brings us back to case 2 with the roles of this and other being switched.
  4. this and other are from different branches of the class hierarchy. Both objects have something in common, which must be compared for equality, and they both have subclass-specific fields, which must be checked against the default values. This is the tricky part. First, we must identify the superclass that this and other have in common. Second, we must check the subclass-specific fields of both objects for default values.

The principle of the implementation is the usual recursion: every class’s implementation of equals takes care of comparing its own fields and delegates to the superclass’s implementation of equals for comparison of the inherited fields. Comparing one’s own fields comes in two flavors: either other has fields that this can compare with its own fields, or not. The first flavor is relevant when this is of the same type or of a subtype of other (see cases 1 and 2); in this instance, compare this’s fields with other’s fields. The second flavor (“there is nothing to compare on this level because other is too different”) happens in cases 3 and 4; in this instance, check that this’s fields have default values.

We will encapsulate the actual comparison of a class’s own fields into a private helper method that every class in a class hierarchy must provide. Here is the sketch of an implementation of the helper method, which we named _compareFields, for a class with two int fields:

public class MyClass extends MySuperClass {
  static final int
    field1Default = 0,
    field2Default = 0;
  private int field1 = field1Default;
  private int field2 = field2Default;
  
  ...

  private boolean _compareFields(Object other) {
    // at least my type, check fields
    if (other instanceof MyClass) {
      if (   field1 != ((MyClass)other). field1
        || field2 != ((MyClass)other). field2
         )
         return false;
    // higher type, check defaults
    } else {
      if (   field1 != field1Default
        || field2 != field2Default
        )
        return false;

    }
    return true;
  }
}

Note, that _compareFields is a private method in our sample implementation. This is intentionally so in order to make sure that the method remains class-specific and cannot be overridden in a subclass.

In principle, equals does no more than call the helper method and delegate to super.equals. However, matters are not that simple. This strategy would be fine for cases 1 and 2, but more work is needed to address cases 3 and 4. Case 3, “this is of a supertype of other,” can best be addressed by passing the task on the other object’s equals method. By switching the roles of this and other, we simply reduce case 3 to case 2. Case 4, “this and other are from different branches of the class hierarchy,” is the really challenging situation where we must identify the superclass that both objects have in common and must navigate the class tree in a more sophisticated way than simply going straight up from the subclass to the root class.

Before we delve into the solution for case 4, let us take a look at the implementation of the equals method itself.

public class MyClass extends MySuperClass {
  
  ...

  public boolean equals(Object other) {
    // identical
    if (other == this) return true;
    // null
    if (other == null) return false;
    // incomparable types
    if (!(other instanceof RootClass)) return false;
    // subtype of root; go and compare
    return _navigateClassHierarchy(other);
    }
}

The equals method performs a couple of preliminary checks, like checking whether other is a null reference and whether other belongs to the same class hierarchy as this. The assumption here is that the topmost superclass of the class hierarchy is a class named RootClass. After the preliminaries, equals delegates to another helper method named _navigateClassHierarchy, which controls the class tree traversal and triggers the actual field comparison provided by the other helper method _compareFields.

The Tree Traversal

For illustration, let us consider the following class hierarchy shown in Figure 1. For each of the four relevant situations, Figures 2, 3, 4, and 5 illustrate the location of thisObject and theOtherObject in the class hierarchy (see the box around the class name) and the size of each object in terms of the class slices it comprises (see the yellow box). For instance, in case 1, thisObject is an object of type MyClass and contains a MySuperClass and a MyClass.

MyClass thisObject = new MyClass();
MyClass theOtherObject = null;

Figure 2 diagrams case 1:

// (1) this and other are of the exact same type
// theOtherObject = new MyClass();
... thisObject.equals(theOtherObject) ... 

Figure 3 diagrams case 2:

// (2) this is of a subtype of other
   theOtherObject  = new MySuperClass();
   ... thisObject.equals(theOtherObject) ...

Figure 4 diagrams case 3:

// (3) this is of a supertype of other
   theOtherObject  = new MySubClass_1 ();
   ... thisObject.equals(theOtherObject) ...

Figure 5 diagrams case 4:

// (4) this and other are from different
//     branches of the class hierarchy
   thisObject      = new MySubClass_1();
   theOtherObject  = new MySubClass_2();
   ... thisObject.equals(theOtherObject) ...

The algorithm is easy for the cases 1, “this and other are of the exact same type,” and 2, “this is of a subtype of other.” In these cases, we will either compare this’s fields to other’s fields (case 1) or check this’s field against the default values (case 2). This is exactly what method _compareFields does. Afterwards, we delegate to the superclass for comparison of the inherited fields and start the recursive traversal up the class tree. Here is a first (still incomplete) snippet of the helper method _navigateClassHierarchy:

public class MyClass extends MySuperClass { 

  
  ...

  protected boolean _navigateClassHierarchy(Object other) {

      ...

      // compare my fields
      if(!_compareFields(other)) return false;
      // pass the buck up
      return super._navigateClassHierarchy(other);

  }
}

Case 3, “this is of a supertype of other,” is similar to case 2, “this is of a subtype of other.” If we switch the roles of this and other, we can use the solution of case 2, which means that we simply call other’s _navigateClassHierarchy method and provide this as an argument. Here is a more elaborate implementation of the tree navigation method:

public class MyClass extends MySuperClass {
  ...

  protected boolean _navigateClassHierarchy(Object other) {

    if (other instanceof MyClass)
    {  // switch roles         
      return ((MyClass)other)._navigateClassHierarchy(this);
    }
    else
    {   // compare my fields
      if(!_compareFields(other)) return false;
      // pass the buck up
      return super._navigateClassHierarchy(other);
    }
  }
}

Note, this implementation is still incomplete, since we haven’t addressed case 4, “this and other are from different branches of the class hierarchy,” yet.

In our case 4 example, this is a MySubClass_1 object, and other is a MySubSubClass object. Control flow in the tentative implementation of _navigateClassHierarchy shown above will go to the else-branch because other is not an instance of MyClass. The compareFields method will correctly check whether this’s field has default values. Then control flow delegates to the superclass (i.e., MyClass._navigateClassHierarchy will be invoked).

This time other is an instance of MyClass, because MyClass happens to be superclass that this and other have in common. Note, we have not yet checked whether other has default values for its own subclass-specific fields; in order words, we must still traverse the branch of the class tree from which other stems. This can be achieved by switching roles and calling MySubSubClass._navigateClassHierarchy. This happens to be exactly what the then-branch does anyway. So far, so good.

In MySubSubClass._navigateClassHierarchy, we will check for default values of the sub-subclass-specific fields; we will move up the class tree and process the subclass-specific fields until we end up in MyClass._navigateClassHierarchy again. With the implementation as it is, control flow would now descend again down the other branch that we already traversed (i.e., we will end up in an infinite loop).

To break the cycle, we need to memorize that we already traversed that other branch of the tree. For this purpose, we introduce a boolean flag, which we name reverseOrder, to maintain this extra piece of information. Here is the complete implementation of the _navigateClassHierarchy method:

public class MyClass extends MySuperClass {
  ...
  protected boolean _navigateClassHierarchy
    (Object other, boolean reverseOrder) {

    if (other instanceof MyClass && !reverseOrder)
    {  // reverse order     
      return 
        ((MyClass)other)._navigateClassHierarchy(this,true);
    }
    else
    {   // compare my fields
      if(!_compareFields(other)) return false;
        // pass the buck up
        return
          super._navigateClassHierarchy(other,reverseOrder);
    }
  }
}

Before switching the roles of this and other in order to traverse the other branch of the class tree, we check whether we did it before. If so, we refrain from doing it again and instead start the normal straight ascent up the hierarchy from here on, since both sub-branches have already been traversed. The boolean flag works like a so-called latch: it starts with the initial value false, flips exactly once (namely when traversal of the other branch starts), and never changes its value again. It is handed from one _navigateClassHierarchy method to the next, each of which checks the flag in order to find out whether the other branch has already been traversed or not.

That’s it! This is one way of implementing a correct mixed-type comparison. It is certainly not the only conceivable implementation, but no matter how you achieve the goal, it will take more than a simple instanceof test to get it right. Listings 1 and 2 show the complete source code for the root class and one of the subclasses.

As with all implementations of equals, no matter whether they are same-type comparisons only or mixed-type comparisons, any new class in the class hierarchy must join the game and must employ the same mechanism for its implementation of equals. In our case, it means that all classes in the hierarchy must override the helper methods _navigateClassHierarchy and _compareFields. The equals method itself is identical in all classes and needs only be defined once in the root class.

The implementations of the _compareFields methods are entirely class-specific and fully depend on the number and types of the fields that the respective class adds to the hierarchy. Note, _compareFields is a private method in our sample implementation. This is intentionally so in order to make sure that the method remains class-specific and cannot be overridden in a subclass. Alternatively, we could have implemented a final method in each class that has a class-specific name using a naming scheme like _compare<classname>Fields.

The implementations of _navigateClassHierarchy are mostly identical. This is really boilerplate code that you copy and paste. The few modifications are that the class name in the instanceof test and the cast expression must be changed; it is the name of respective class.

Acknowledgements

The ideas discussed in this article were inspired by comments we received from readers our earlier article on equals [1]. These readers were (in the order their emails arrived) Larry Kuenning, Mark Davis, and Frank Griswold. All three pointed out that there is a way of implementing equals so that it performs a transitive mixed-type comparison. From their tentative first ideas and code snippets, we derived the solution explained in this article. Mark Davis’s suggestions are the main foundation of this article; he precisely described the algorithm and provided most of the implementation.

References

[1] Klaus Kreft and Angelika Langer. “Secrets of equals,” Java Solutions Supplement to C/C++ Users Journal, April 2002, <www.cuj.com/java/articles/a19.htm?topic=java>.

[2] Mark Davis. “Liberté, Egalité, Fraternité,” Java Report, January 2000, <www.macchiato.com/columns/Durable5.html>.

Angelika Langer works as an independent freelance trainer/consultant. Her main area of interest and expertise is object-oriented software development in C++ and Java. She can be reached at langer@camelot.de.

Klaus Kreft is a senior consultant at Siemens Nixdorf in Germany. He can be reached at klaus.kreft@mch.sni.de.