JVMs have gotten a lot better over the years, but you can still dramatically improve performance with a few simple techniques.
Introduction
The many virtues of Java have been well publicized. In particular, the "Write once, run anywhere" promise has provided developers with the freedom to develop cross-platform applications without the overhead of preprocessor directives. The area in which Java is presumed to be at its weakest is performance.
This presumption is not automatically true today; there are products that can boost Java performance sufficiently to make it a non-issue in many applications. For example, TowerJ is a post-compiler that transforms Java byte code into highly optimized native executables, and JRockit is an adaptively optimizing server side JVM (Java Virtual Machine) (see Resources section below). Even so, some simple practices can improve the performance of your Java code without the need to buy one of the aforementioned tools. In this article I will demonstrate a few of these practices.
This discussion is based mainly on high-throughput, server-side code. Whereas the main overhead incurred with GUI code involves object creation and re-creation, the significant performance figures in server-side code are the execution times of methods. Thus, for the code examples that are included, I have recorded the average time to execute a method. It is not feasible to record the actual execution time of a single call, so I time a series of executions of the method and calculate the average. This in effect mimics the execution of performance-critical code.
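The averaging approach described above can be sketched as follows. This is only an illustration, not the harness used to produce the figures in this article: the class name AverageTimer and the method averageNanos are hypothetical, and the sketch assumes a JVM that offers System.nanoTime and lambda expressions (Java 8 or later).

```java
// Hypothetical timing helper; the names are assumptions, not taken from the article's listings.
final class AverageTimer {
    private AverageTimer() {}

    /** Runs the task the given number of times and returns the mean time per call in nanoseconds. */
    static double averageNanos(Runnable task, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        final StringBuffer sink = new StringBuffer();
        Runnable work = () -> {
            sink.setLength(0);                          // reuse the same buffer on every call
            sink.append("Joe").append(" is my name.");
        };
        averageNanos(work, 10_000);                     // warm-up pass, result discarded
        double average = averageNanos(work, 100_000);
        System.out.println("average ns per call: " + average);
    }
}
```

The warm-up pass is a design choice: it gives the JVM a chance to compile the method before the measured run, so the average is less dominated by start-up cost.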
Each example is accompanied by pseudocode that explains the byte code operations. The actual byte code produced is available from the CUJ website (see www.cuj.com/code). An explanation of all byte code operations is available from the Javasoft website (see Resources section below).
Improving String Manipulation Performance
As in C++, the Java library defines its own String type. Under the covers, this type is implemented as an array of chars, but you do not need to understand this to use strings. The curse of the NULL character ('\0') has frustrated many students learning and using C++; Java has no such distraction and lets programmers concentrate on the application rather than on the tools needed to build it. This worry-free approach to strings has a downside, however: the string concatenation operator '+'.
This operator appears to be very useful. The majority of applications that write data to streams use '+'. For example:
    String name = new String("Joe");
    System.out.println(name + " is my name.");

In the above snippet, it seems that there is not much that can be changed within the println statement that could improve its execution speed. However, the byte code produced for this statement (represented here in pseudocode) reveals the truth (see Listing 1).
This simple piece of code creates five objects [1]: STR_1, STR_2, STR_3, STR_4, and STR_BUF_1.
Note that object creation is extremely expensive in relative terms. Heap storage must be allocated for all the instance variables of the class and each of the class's superclasses; all the instance variables must be initialized; and the class's constructor and the constructors of each superclass must be executed. To create efficient code, it is imperative that object creation be limited to what is absolutely necessary.
So, can the above code be rewritten in a more efficient manner? Consider the following code snippet:
    StringBuffer name = new StringBuffer("Joe");
    System.out.println(name.append(" is my name.").toString());

The equivalent byte code/pseudocode appears in Listing 2.
The above code creates only four objects: STR_1, STR_2, STR_3, and STR_BUF_1. You might think that saving the creation of one object is not that much of a saving. However, the following code creates eight objects:
    String name = new String("Joe");
    name += " is my";
    name += " name.";

whereas this code only creates five:

    StringBuffer name = new StringBuffer("Joe");
    name.append(" is my");
    name.append(" name.").toString();

The second snippet executes more than twice as fast as the first [2].
Conclusion: use StringBuffers to improve the performance of string processing code. The goal is to minimize the creation of new objects, and this can be achieved by using the append method on StringBuffers rather than the concatenation operator on Strings.
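To try the comparison on your own machine, the two snippets can be wrapped in methods and timed side by side. The following is a minimal sketch under stated assumptions: the class name ConcatDemo and its method names are hypothetical, and the timing loop is a simple illustration rather than the measurement code behind note [2]. Results depend heavily on the compiler and JVM in use, so no expected timings are shown.

```java
// Hypothetical comparison harness; names are assumptions made for illustration.
final class ConcatDemo {
    private ConcatDemo() {}

    // Builds the sentence with the concatenation operator.
    static String withConcat() {
        String name = new String("Joe");
        name += " is my";
        name += " name.";
        return name;
    }

    // Builds the same sentence by appending to a single StringBuffer.
    static String withBuffer() {
        StringBuffer name = new StringBuffer("Joe");
        name.append(" is my");
        name.append(" name.");
        return name.toString();
    }

    public static void main(String[] args) {
        final int runs = 100_000;
        int checksum = 0;   // uses the results so the loops cannot be discarded as dead code

        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            checksum += withConcat().length();
        }
        long concatNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            checksum += withBuffer().length();
        }
        long bufferNanos = System.nanoTime() - start;

        System.out.println("concatenation ms: " + concatNanos / 1_000_000);
        System.out.println("StringBuffer ms:  " + bufferNanos / 1_000_000);
        System.out.println("checksum: " + checksum);
    }
}
```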
Faster Logging
In every software project I have worked on, there has been a requirement to have a logging mechanism in place. There are many reasons for an application to include a logging facility. The main reason is for easier maintenance. To enable bug reporting in a released application, it is necessary to have some starting point. In many cases the user submits an ambiguous report that details a problem that could be caused by many factors. If there is a mechanism in place for the user to gather additional information about the problem, then the turnaround for problem resolution is reduced dramatically.
There is no standard means to produce this kind of information. It is generally left up to the developer how this mechanism is put in place. Yet the implementation of the logging mechanism can affect the performance of an application dramatically. The goal should be to have a mechanism that outputs valuable run-time information, but minimizes the effect on run-time performance.
The most obvious way to avoid run-time overhead is to have no logging built into the application that is released; in other words, if the actual code that performs the logging is not compiled into the application, then there is no performance hit. Listing 3 shows a class that defines such a logging mechanism. It can be configured to omit the logging code from the byte code produced. The class will be a Singleton to avoid unnecessary creation of Logger instances.
As you can see, this is a very simple class with one class variable, one class constant, two methods, and a constructor. To use this class, simply get the instance of it, check if debug is turned on, and then call debugMsg, as shown below:
    ...
    Logger myLogger = Logger.getInstance();
    ...
    if (Logger.CAN_DEBUG) {
        myLogger.debugMsg("some debug message");
    }
    ...

Suppose Logger.CAN_DEBUG is false. When the application is built, the compiler eliminates the dead code and produces no byte code for the conditional. It can do so because Logger.CAN_DEBUG is a static final variable: provided it is initialized with a constant expression, its value is known at compile time and can never change. If Logger.CAN_DEBUG is true, the conditional is compiled and byte code is produced for it. Thus, a build with debug messaging turned on results in more byte code being produced.
This idea can be extended to allow more granular handling of the information being produced. For instance, a new static final boolean could be declared as CAN_INFO, and a new method, public void infoMsg(String msg), could be implemented.
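A class along the lines described above might look like the following. This is a minimal sketch, not Listing 3 itself: the class name CompileTimeLogger, the message prefixes, and the use of System.out are assumptions made for illustration; only CAN_DEBUG, CAN_INFO, getInstance, debugMsg, and infoMsg come from the text.

```java
// Hypothetical compile-time-configurable logger; a sketch, not the article's Listing 3.
final class CompileTimeLogger {
    // Compile-time constants: edit these and rebuild to change what gets logged.
    static final boolean CAN_DEBUG = false;
    static final boolean CAN_INFO = true;

    // Singleton instance, created once when the class is loaded.
    private static final CompileTimeLogger INSTANCE = new CompileTimeLogger();

    private CompileTimeLogger() {}

    static CompileTimeLogger getInstance() {
        return INSTANCE;
    }

    void debugMsg(String msg) {
        System.out.println("DEBUG: " + msg);
    }

    void infoMsg(String msg) {
        System.out.println("INFO: " + msg);
    }

    public static void main(String[] args) {
        CompileTimeLogger log = CompileTimeLogger.getInstance();
        if (CompileTimeLogger.CAN_DEBUG) {
            // While CAN_DEBUG is false, the compiler leaves this block out of the class file.
            log.debugMsg("some debug message");
        }
        if (CompileTimeLogger.CAN_INFO) {
            log.infoMsg("some info message");
        }
    }
}
```

Because the compiler copies the value of such constants into every class that uses them, all callers need to be rebuilt when a flag changes, not just the logger class.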
From a performance point of view, this is the best method to use. A number of different builds can be coordinated to reflect what level of messaging is supported. For instance, you could release a production version and a debug version. If a problem occurs with the production version, then swap it with the debug version to pinpoint where the problem is occurring.
The major disadvantage of this approach is that it cannot be configured at run time, for example, as a System property.
The main performance hit in most logging mechanisms is the creation of String objects. Thus, the aim should be to minimize this overhead. The solution will therefore need to include StringBuffers. The Logger class in Listing 4 provides a configurable logging level.
The example code provides a two-level approach to logging. It allows the handling of both debug and info messages. This can be easily extended to handle more types. This class provides a solid base for the logging mechanism.
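One way such a run-time configurable logger could be structured is sketched below. This is not Listing 4: the class name LevelLogger, the level constants and their numeric values, and the choice to treat the level as a bit mask are all assumptions made for illustration. Only setLogLevel and the debug/info split come from the text.

```java
// Hypothetical run-time-configurable logger; a sketch, not the article's Listing 4.
final class LevelLogger {
    // Assumed level values, combined as bit flags (3 enables both debug and info).
    static final int LEVEL_NONE = 0;
    static final int LEVEL_INFO = 1;
    static final int LEVEL_DEBUG = 2;

    private static final LevelLogger INSTANCE = new LevelLogger();

    private int logLevel = LEVEL_NONE;

    private LevelLogger() {}

    static LevelLogger getInstance() {
        return INSTANCE;
    }

    void setLogLevel(int level) {
        logLevel = level;
    }

    boolean canDebug() {
        return (logLevel & LEVEL_DEBUG) != 0;
    }

    boolean canInfo() {
        return (logLevel & LEVEL_INFO) != 0;
    }

    // Takes a StringBuffer so that callers can build messages in one reusable buffer.
    void debugMsg(StringBuffer msg) {
        if (canDebug()) {
            System.out.print("DEBUG: ");
            System.out.println(msg);
        }
    }

    void infoMsg(StringBuffer msg) {
        if (canInfo()) {
            System.out.print("INFO: ");
            System.out.println(msg);
        }
    }
}
```

The sketch assumes single-threaded use; an application that logs from several threads would need to guard the level field and the output calls.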
There are two options available for using this implementation in an application. The first option is to create a base class that will implement a simple API, which the application will extend. The second option is to have the application implement an interface that defines that simple API. The following is an example interface:
    public interface LogAPI {
        public void createMsg();
        public void appendLog(String str);
        public void appendLog(int i);
        public void logDebugMsg();
        public void logInfoMsg();
    }

In TestLogger.java (Listing 5) a sample implementation of this interface is provided.
The reuse of the StringBuffer object is key here. Normally you would write something like the following as a debug message:
    debugMsg("Name:" + name + " Age:" + age);

As discussed earlier, this type of String creation is detrimental to performance. If it is rewritten as shown in TestLogger.java, then gains in performance will become visible.
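The buffer-reuse idea can be illustrated with a small implementation of the LogAPI interface. This is a minimal sketch and not the TestLogger.java listing: the class name ReusableBufferLog, its constructor parameters, the initial buffer capacity, and the sample name and age values are assumptions made for illustration.

```java
import java.io.PrintStream;

// The interface as given in the text (redundant modifiers omitted).
interface LogAPI {
    void createMsg();
    void appendLog(String str);
    void appendLog(int i);
    void logDebugMsg();
    void logInfoMsg();
}

// Hypothetical implementation that builds every message in one reusable StringBuffer.
final class ReusableBufferLog implements LogAPI {
    private final StringBuffer buffer = new StringBuffer(128);
    private final PrintStream out;
    private final boolean debugEnabled;
    private final boolean infoEnabled;

    ReusableBufferLog(PrintStream out, boolean debugEnabled, boolean infoEnabled) {
        this.out = out;
        this.debugEnabled = debugEnabled;
        this.infoEnabled = infoEnabled;
    }

    public void createMsg() {
        buffer.setLength(0);    // empty the buffer instead of allocating a new one
    }

    public void appendLog(String str) {
        buffer.append(str);
    }

    public void appendLog(int i) {
        buffer.append(i);
    }

    public void logDebugMsg() {
        if (debugEnabled) {
            out.println(buffer);
        }
    }

    public void logInfoMsg() {
        if (infoEnabled) {
            out.println(buffer);
        }
    }

    public static void main(String[] args) {
        ReusableBufferLog log = new ReusableBufferLog(System.out, true, false);
        String name = "Joe";    // sample values, chosen only for the demonstration
        int age = 30;
        log.createMsg();
        log.appendLog("Name:");
        log.appendLog(name);
        log.appendLog(" Age:");
        log.appendLog(age);
        log.logDebugMsg();      // prints "Name:Joe Age:30"
    }
}
```

Note that the appends still run when a level is disabled; a caller that wants to avoid even that cost can check the level before building the message.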
The log level can now be tailored at run time using the setLogLevel method defined in the Logger class. A good way to do this is with the aid of System properties. You must define your own property; in this case, it is called "app.loglevel". If the program in question is an application, then you can set the "app.loglevel" property using the -D switch to the JVM [3]. For example:
    java -Dapp.loglevel=3 myApp

On the other hand, if your program is an applet, it can be set using the <PARAM> tag:

    <PARAM NAME="app.loglevel" VALUE="2">

Then, to set the log level, all you need to do is get the property value and call setLogLevel on the result (an applet would read its <PARAM> value with getParameter rather than System.getProperty):
    String strLevel = System.getProperty("app.loglevel");
    if (strLevel != null) {
        int level = Integer.parseInt(strLevel);
        Logger.getInstance().setLogLevel(level);
    }

The benefits of such an approach are:
- reduced object creation, that is, objects are reused
- a well-defined API, which encourages all developers to follow a standard
- extensibility: individual developers can tweak the implementation to suit their own needs
- reduced maintenance cost due to a standard API
- a run-time customizable log level
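The property lookup shown before this list throws a NumberFormatException if "app.loglevel" is set to something that is not a number. A more defensive variant is sketched below using the standard Integer.getInteger method, which falls back to a default when the property is missing or malformed. The class name LogLevelProperty and the method name read are hypothetical.

```java
// Hypothetical helper for reading the log level; only the property name comes from the article.
final class LogLevelProperty {
    static final String PROPERTY = "app.loglevel";

    private LogLevelProperty() {}

    /** Returns the level from the system property, or defaultLevel if it is absent or not a number. */
    static int read(int defaultLevel) {
        // Integer.getInteger looks up the system property and decodes it as an integer.
        return Integer.getInteger(PROPERTY, defaultLevel);
    }

    public static void main(String[] args) {
        // Run with, for example: java -Dapp.loglevel=3 LogLevelProperty
        System.out.println("log level: " + read(0));
    }
}
```

Integer.getInteger also accepts hexadecimal and octal notation, so a value with a leading zero is read as octal; that may or may not be what you want for a log level.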
Better Performance through Custom Collections
When the need arises to store a collection of common objects, the simplest escape route is normally to use java.util.Vector. This class is inefficient in a good deal of the cases in which it is used. There are two main reasons for this inefficiency. The first reason is that Vector is thread safe; therefore, a number of its methods are synchronized. When you always know the application will be single threaded, this synchronization constitutes unnecessary overhead. The second reason why Vector is inefficient is the amount of casting that is used when retrieving objects from it. If all objects stored in the Vector are of the same type, then casting should not be required. Thus, for better performance, we need type-specific single-threaded collections.
StringVector.java (Listing 6) is an example implementation of a collection for String types. Remember that this class can be tailored for all types of objects.
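A type-specific, unsynchronized collection of this kind can be quite small. The following is a minimal sketch and not Listing 6 itself: the default capacity, the doubling growth policy, and the size method are assumptions made for illustration; only the names StringVector, add, and getStringAt come from the text.

```java
// Hypothetical unsynchronized, String-only collection; a sketch, not the article's Listing 6.
final class StringVector {
    private String[] data;
    private int size;

    StringVector() {
        this(10);   // assumed default capacity
    }

    StringVector(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + initialCapacity);
        }
        // Keep at least one slot so that doubling always increases the capacity.
        data = new String[Math.max(1, initialCapacity)];
    }

    // Not synchronized: intended for single-threaded use only.
    void add(String s) {
        if (size == data.length) {
            String[] bigger = new String[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = s;
    }

    // Returns a String directly, so callers need no cast.
    String getStringAt(int index) {
        if (index < 0 || index >= size) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
        return data[index];
    }

    int size() {
        return size;
    }
}
```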
Where you might previously have written code like this:

    ...
    Vector strings = new Vector();
    strings.add("One");
    strings.add("Two");
    String second = (String)strings.elementAt(1);
    ...

you can now write:
    ...
    StringVector strings = new StringVector();
    strings.add("One");
    strings.add("Two");
    String second = strings.getStringAt(1);
    ...

The result is improved performance. TestCollection.java (Listing 7) highlights this performance difference. StringVector's add method takes only 70 percent of the execution time of Vector's add method, and its getStringAt method takes only 25 percent of the execution time of Vector's elementAt method.
This technique can be tailored to meet the requirements of whatever application you are working on. For example, you can create an IntegerVector, an EmployeeVector, etc.
Conclusion
This article is by no means "the Bible" for Java performance. Its purpose is to raise awareness of small changes you can make to your Java code to increase its performance.
Resources
TowerJ: http://www.towerj.com
JRockit: http://www.jrockit.com
Java Virtual Machine Specification: http://java.sun.com/docs/books/vmspec
Notes
[1] As stated at http://java.sun.com/docs/books/vmspec/2nd-edition/html/Concepts.doc.html#19124:
Loading of a class or interface that contains a String literal may create a new String object to represent that literal.
This article assumes that a new String is created to represent the literal.
[2] Time taken on average to execute Example 3 100,000 times is 578 milliseconds and the time taken on average to execute Example 4 100,000 times is 265 milliseconds (on my machine).
[3] The -D switch is used by the majority of JVMs that are available. Microsoft's Jview interpreter uses the /d: switch.
John Keyes is a Senior Developer with Stepping Stone Software Ltd. John worked previously with IONA Technologies. He can be reached at johnkeyes@yahoo.com.