Java Solutions


Sorting Made Easy

David Wagstaff

A little something to make sorting by key much easier.


Sorting is taught in most beginning computer classes — or at least it used to be. With the sorting algorithms included in standard libraries, it’s possible to keep all your data sorted and never understand any of the various sorting algorithms. Keeping to this theme, I’m going to describe how to sort objects in Java without having to write a sort algorithm. You’ll still have to do some minimal work describing what and how you want the data sorted. Along the way, I’ll develop a generic tool to all but eliminate even this task.

No one sort algorithm, not even the celebrated quicksort, is the best in every case. (For the curious, the JDK uses a modified mergesort with guaranteed n*log(n) performance.) I’m going with the assumption that the sorting included in the JDK is adequate for most people most of the time. Even if it should prove to be inadequate for your particular needs, remember the adage: “Make it work. Then make it right. Then make it fast.” Accordingly, I urge you to use it during the prototyping stages, and only later swap in your own sort.

As a trivial example, suppose I have a Vector of five Strings that I want sorted:

Vector v = new Vector();
v.add("Gamma");
v.add("Delta");
v.add("Alpha");
v.add("Beta");
v.add("Epsilon");
System.out.println(v);

Collections.sort(v);
System.out.println(v);

The before and after outputs are:

[Gamma, Delta, Alpha, Beta, Epsilon]
[Alpha, Beta, Delta, Epsilon, Gamma]

Nothing could be simpler — one call to the static method Collections.sort from java.util and you’re done. By the way, sort takes a List object. I chose Vector out of habit. LinkedList and ArrayList are also possible candidates [1].

Arrays are remarkably similar:

String[] a =
  { "Gamma","Delta","Alpha","Beta","Epsilon" };

Arrays.sort(a);

Arrays, although fixed in size, have the added advantage of being able to store primitives like int and double in addition to objects. Accordingly, Arrays.sort is overloaded to handle all the primitive types. It’s also overloaded to sort sub-ranges of an array. Suppose you only want to sort the first three elements.

double[] d = {5.7, -6.3, 45.0, 4.9};

Arrays.sort(d, 0, 2);

While this is fine and good, most of us use considerably more complex data. Let’s try this on a real class:

public class Person {
  String name;
  Date birth;
  double income;
  String phone;

  public String toString() {
    return name;
  }
}

Before the OOP style police arrest me, rest assured that I will come back to proper encapsulation. Chill for now while I explore this class in its simplest form. Let’s start small by trying to sort a List of two Persons.

Person p1 = new Person();
p1.name = "Wilson";
// Assign values to birth, income and
// phone
Person p2 = new Person();
p2.name = "Adams";
// Assign values to birth, income and 
// phone
Vector v = new Vector();
v.add(p1);
v.add(p2);
System.out.println(v);

Collections.sort(v);
System.out.println(v);

The output being:

[Wilson, Adams]
java.lang.ClassCastException: Person

Ouch! A run-time exception — I skipped the minimal work of describing what and how I want the data sorted. Looking beneath the covers at the simple data-types that work, you’ll discover that all of them implement the Comparable interface. The description straight from JavaDocs: “This interface imposes a total ordering on the objects of each class that implements it. This ordering is referred to as the class’s natural ordering.” Ah, so let’s implement Comparable in Person to make it sortable:

public class Person implements Comparable {
  ...
  public int compareTo(Object o) {
    Person that = (Person)o;
    return this.name.compareTo(that.name);
  }
}

The Person class now says that its natural ordering is by its name field. Notice how I passed the buck on determining the return value. Since String already is Comparable, we used its code. (I’ll use this trick again in a major way soon.) More importantly, notice the output:

[Wilson, Adams]
[Adams, Wilson]

This is great until marketing wants the same people sorted by income, age, and phone areas.

You might be tempted to juice up compareTo like this:

final int NAME_FIELD = 0;  //BAD EXAMPLE
final int BIRTH_FIELD = 1;

int order = NAME_FIELD;

public int compareTo(Object o) {
  Person that = (Person)o;
  if (order == NAME_FIELD)
    return this.name.compareTo(that.name);
  if (order == BIRTH_FIELD)
    return this.birth.compareTo(that.birth);
}

The reason you might be tempted is because it will work, but it is dangerous. This compound compareTo method depends on each and every Person object having the order field set the same. Ah, so let’s make order static. Well, now sorting Person objects is no longer thread-safe. What’s a body to do? The Java Collections authors anticipated this need and gave us another interface: Comparator.

Comparator also specifies an ordering for objects; the difference being that Comparable describes the natural or default ordering, while Comparator describes alternative orderings. In my opinion, the Person class has no natural ordering, so I’ll eliminate Comparable and focus instead on Comparator classes:

public class BirthComparator implements Comparator {
  public int compare(Object o1, Object o2) {
    Person p1 = (Person)o1;
    Person p2 = (Person)o2;
    return p1.birth.compareTo(p2.birth);
  }
}

public class IncomeComparator implements Comparator {
  public int compare(Object o1, Object o2) {
    Person p1 = (Person)o1;
    Person p2 = (Person)o2;
    Double d1 = new Double(p1.income);
    Double d2 = new Double(p2.income);
    return d1.compareTo(d2);
  }
}

Now, assuming you have a Vector v of Person objects each filled with values, you can sort them three different ways:

Collections.sort(v); //natural ordering by name
Collections.sort(v, new BirthComparator()); //by birth
Collections.sort(v, new IncomeComparator()); //by income

With the exception of proper encapsulation, this is the orthodox way of sorting on different keys. However, it still smells of redundant code. The Comparators duplicate too much of each other’s code. Instead, I’ll define a GenericComparator to replace all of the individual Comparators. Doing so requires using accessors in Person, a move toward encapsulation [2]. So instead of accessing the fields directly, let’s make them all private and give them get methods:

public class Person implements Comparable {
  private String name;
  private Date birth;
  private double income;
  private String phone;

  public String getName() {
  return name;
  }

  public Date getBirth() {
  return birth;
  }
// ...
}

Using GenericComparator, you can sort on any field you wish.

Collections.sort(v, new GenericComparator("getName"));
Collections.sort(v, new GenericComparator("getBirth"));
Collections.sort(v, new GenericComparator("getIncome"));
Collections.sort(v, new GenericComparator("getPhone"));

I’ve purposely left the guts of GenericComparator until the end. It gets its power from using Java’s reflection. Reflection, however, is beyond the scope of this article. Nevertheless, you should understand the general flow because it’s just using the same trick I’ve been using all along. Let Comparables do their own work and return the result:

import java.lang.reflect.*;
import java.util.*;

public class GenericComparator implements Comparator{
  String methodName;

  public GenericComparator(String methodName) {
    this.methodName = methodName;
  }

  public int compare(Object o1, Object o2) {
    Comparable c1 = null;
    Comparable c2 = null;
    try {
      c1 = (Comparable)o1.getClass().getMethod
        (methodName, null).invoke(o1,null);
      c2 = (Comparable)o2.getClass().getMethod
        (methodName, null).invoke(o2,null);
    }
    catch(Exception e) {
      e.printStackTrace();
    }
    return c1.compareTo(c2);
  }
}

Life seems better when it’s organized. For data, nothing says organized better than being sorted.

Notes

[1] For more information on List and other Java 2 collections, see Chuck Allison, “The Java 2 Collections — Finally, Some News You Can Use,” C/C++ Users Journal Java Solutions Supplement, December 1999.

[2] Some astute readers may be offended at the flippant encapsulation. Please don’t infer that I’m advocating get methods as the object-oriented solution for correctly implementing encapsulation or that proper object-oriented design starts with data and haphazardly throws in methods as an afterthought. As a rule, I first design an interface of what the object will “do.” I also write the Javadocs, JUnit tests, and UML diagrams. All of these are essential, and yet in a learning context, such as this article, all these facets of good design tangle together to obscure using the GenericComparator. Ironically, the GenericComparator forces some encapsulation because it simply won’t work with data members; it must use methods, JavaBeanish gets or otherwise.

David Wagstaff is the lead instructor at Thin Blue Edge, a training company specializing in Java and related technologies. He has a BS in computer science from Brigham Young University and is currently completing his masters at Nation Technology University. You can contact David at davidw@thinblueedge.com.