Mixed-Language Programming


JNI-C++ Integration Made Easy

Evgeniy Gabrilovich and Lev Finkelstein

Extremely versatile interfaces like the Java JNI also tend to be extremely cumbersome, as a rule. The authors have found a way to break that rule.


Introduction

The JNI (Java Native Interface) [1] is a powerful framework for seamless integration between Java and other programming languages (called “native languages” in the JNI terminology). A common use of the JNI is when a system architect wants to benefit from both worlds, implementing communication protocols in Java and computationally expensive algorithmic parts in C++. (The latter are usually compiled into a dynamic library, which is then invoked from the Java code.) The JNI provides native applications with much of the functionality of Java, allowing them to call Java methods, access and modify Java variables, manipulate Java exceptions, ensure thread safety through Java thread synchronization mechanisms, and ultimately to directly invoke the JVM (Java Virtual Machine).

This functional wealth is provided through a rather complex interface between the native code and the Java programming environment. For example, accessing an integer Java member variable from the native code is a multi-step process, which involves first querying the class type whose variable is to be retrieved (using the JNI function GetObjectClass), obtaining the field identifier (using GetFieldID), and finally retrieving the field itself (using GetIntField). The latter function is representative of a set of functions (Get<type>Field), each corresponding to a different variable type; thus, accessing a variable requires explicit specification of the appropriate function. To streamline these steps, we provide a template (parameterized with a variable type), which encapsulates all these low-level operations, giving the programmer easy-to-use assignment/modification functionality similar to that of native C++ variables. As far as data types are concerned, we develop a complete set of templates for the C++ primitive types that correspond to Java primitive types. (In JNI programming, these C++ primitive types all begin with a lowercase “j,” as in jint, jfloat, etc.) We also outline a generic approach for accessing any user-defined type.

The JNI also supplies a number of API functions that come in complementary pairs. For example, if a native function has a parameter of type jstring, it should first convert this parameter into a conventional C++ character array using GetStringUTFChars (or GetStringChars, for 16-bit characters) and subsequently explicitly release this temporary representation using the corresponding function ReleaseStringUTFChars (or ReleaseStringChars). This mode of operation can be greatly simplified by implementing a proxy-style smart pointer, which realizes the “construction as resource acquisition” idiom [2] (more widely known as “resource acquisition is initialization”). When the smart pointer accesses the Java string, it transparently converts it to a C++ compatible string (a char*), and when it goes out of scope, the smart pointer destructor releases any temporary resources used. The proxy also provides proper handling of the transfer of ownership, similarly to the C++ Standard auto_ptr construct [3]. We also provide analogous treatment for Java arrays, featuring automatic selection of the access function based on the actual element type (e.g., GetIntArrayElements), release of array elements in the destructor of this “smart container,” and conventional access to array elements with operator[]. The same idiom may also be applied to function pairs such as NewGlobalRef/DeleteGlobalRef, which are used to reserve and release global references (respectively), and MonitorEnter/MonitorExit, which are used for protecting critical sections of code.

To make the discussion concrete, we start by developing a running example, which solves a toy problem. We first show the native code for solving this task without using the framework (“before”) and then show the desired code (“after”), streamlined using the proposed framework. In a subsequent section, we develop the JNI encapsulation framework step by step. The article ends with a larger-scale example. The C++ source files mentioned in this article are not shown, but they are part of the JNI encapsulation framework. The complete source code for the examples and the framework is available on the CUJ website (see www.cuj.com/code).

For background information on the JNI, see the sidebar, “The JNI in a Nutshell.”

A Running Example

Our toy problem defines a Java class JniExample that contains an integer, a static String, and an integer array field. Function main calls a native function implemented in C++, which accesses the Java fields, prints their original (default) values, and ultimately modifies them. Listing 1 shows (a fragment of) the Java class whose variables must be accessed from C++ code.

Listing 2 shows a sample implementation of native C++ code that modifies the above variables, implemented using the original JNI. Observe that, on average, three to four preparatory operations are necessary to access a Java field from the native code. The revised code in Listing 3 reduces this overhead to one constructor invocation per access. Assignment to Java fields also becomes more intuitive.

Template-Based Encapsulation of the JNI

In this section, we evolve the JNI encapsulation framework. First, we discuss access to scalar variables (of both primitive and user-defined types). We then develop a generic resource management scheme that underlies the implementation of containers (arrays and strings). Finally, we apply this scheme to advanced JNI features, such as monitors and global references.

Field Access

Accessing a Java variable from C++ is a cumbersome process at the very least. Our aim is to develop a technique for establishing correspondence between a C++ object and the Java variable so that all low-level access operations become transparent to the programmer. For example, a C++ object corresponding to intField is a proxy [4] of type JNIField<jint>, created using the environment handle and the Java object passed via the JNI native call:

JNIField<jint> intField(env, obj, "intField");

After this declaration, changing the variable value in C++ is as simple as the following:

intField = 17;

To this end, we use a template class JNIField (see Listing 4), whose three constructors present three ways to attach a C++ variable to a Java field. The first constructor receives an environment handle, a reference to the Java object whose field is to be accessed, and the field name and type signature, and it connects to the designated field. The second constructor allows creation of fields whose type signature may be deduced automatically (more on this below). Both constructors compute the field identifier (using implicit calls to the JNI functions GetObjectClass and GetFieldID) and store it together with the object handle. Template class JNIFieldId, which encapsulates the notion of a field identifier, is also defined in jni_field.h. Occasionally, it might be necessary to access the same field in numerous Java objects of the same type (e.g., when iterating over array elements). In such a case, the field identifier may be computed just once and cached for subsequent reuse; this mode is supported by the third constructor. The assignment and casting operators provide easy access to the variable value.

An interesting aspect of the field-handling templates (JNIField, JNIFieldId) is their type inference capability, which is mostly hidden from the user. Note that the JNI only supplies a set of Get<type>Field functions, each corresponding to a different variable type; thus, accessing a variable apparently requires explicit specification of the appropriate function. We circumvent this requirement by using the template specialization technique; namely, we preinstantiate the member functions of JNIFieldId for all the primitive types. (For example, JNIFieldId<jint>::operator= is actually implemented using the SetIntField function, etc.) This way, once the field is instantiated (e.g., on an integer Java variable — JNIField<jint>), we no longer need to specify its type on every variable access.
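The specialization technique can be illustrated with a minimal sketch. It is not the code from jni_field.h: FakeEnv is a hypothetical stand-in for the JNI environment (so that the example builds without a JDK), fields are looked up by name rather than by jfieldID, and the helper FieldAccess and the proxy Field are illustrative names. Only the dispatch idea is taken from the text: one specialization per primitive type selects the matching Get/Set function.

```cpp
#include <map>
#include <string>

// Stand-ins so the sketch builds without a JDK. In real code these typedefs
// come from jni.h, and the accessors take a jobject plus a jfieldID.
typedef int    jint;
typedef double jdouble;

struct FakeEnv {  // hypothetical: mimics only the four calls used below
    std::map<std::string, jint>    ints;
    std::map<std::string, jdouble> doubles;
    jint    GetIntField(const std::string& f)               { return ints[f]; }
    void    SetIntField(const std::string& f, jint v)       { ints[f] = v; }
    jdouble GetDoubleField(const std::string& f)            { return doubles[f]; }
    void    SetDoubleField(const std::string& f, jdouble v) { doubles[f] = v; }
};

// The primary template is declared but never defined, so a field of an
// unsupported type is rejected at compile time.
template <class T> struct FieldAccess;

// One specialization per primitive type picks the matching accessor pair.
template <> struct FieldAccess<jint> {
    static jint get(FakeEnv& e, const std::string& f)         { return e.GetIntField(f); }
    static void set(FakeEnv& e, const std::string& f, jint v) { e.SetIntField(f, v); }
};

template <> struct FieldAccess<jdouble> {
    static jdouble get(FakeEnv& e, const std::string& f)         { return e.GetDoubleField(f); }
    static void set(FakeEnv& e, const std::string& f, jdouble v) { e.SetDoubleField(f, v); }
};

// Proxy: assignment writes through to the field, conversion reads it back.
template <class T>
class Field {
public:
    Field(FakeEnv& env, const std::string& name) : env_(env), name_(name) {}
    Field& operator=(T value) { FieldAccess<T>::set(env_, name_, value); return *this; }
    operator T() const { return FieldAccess<T>::get(env_, name_); }
private:
    FakeEnv&    env_;
    std::string name_;
};
```

With these definitions, Field<jint> intField(env, "intField"); intField = 17; ends up in SetIntField without the caller ever naming that function.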

This mechanism uses a series of basic declarations, where each primitive C++ type is associated with the corresponding Java basic type and Java array type, and is assigned the JNI type signature. The declarations are realized as C++ structs, which are later used by the compiler as a lookup table. (Among other things, this relieves the user from having to remember signatures of various Java types; instead, they can be looked up whenever necessary.) For more details, see the full implementation in jni_field.h and jni_declarations.h.

To conclude the presentation of field access, jni_field.h provides similar definitions for static fields (“class variables” in Java terminology). It defines a template class JNIStaticFieldId (also specialized for the primitive types) and a template class JNIStaticField, which has an additional data member of type jclass for keeping the appropriate class object.

Arrays and User-Defined Data Types

The framework provides (in jni_declarations.h) a set of declarations to facilitate JNI arrays. We actually build a lookup table, which the compiler consults for type inference. This table may be looked up given either a primitive type or an array type. For example, the compiler can automatically deduce that an array of jints is of type jintArray and that an array of type jcharArray consists of jchars and has the JNI signature "[C". Listing 3 contains an example of how to connect C++ code to a Java array of integers ("intArray").
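A lookup table of this kind might be sketched as follows. The struct names Declarations and ArrayDeclarations and the helper signatureOfArray are hypothetical, and the typedefs at the top merely stand in for what jni.h provides; the signature strings ("I", "C", "[I", "[C") are the standard JNI type signatures.

```cpp
#include <string>

// Stand-ins for types that jni.h provides (assumption: jni.h is not included).
typedef int            jint;
typedef unsigned short jchar;
struct jintArray_  {};  typedef jintArray_*  jintArray;
struct jcharArray_ {};  typedef jcharArray_* jcharArray;

// Forward lookup: element type -> array type and JNI signatures.
template <class T> struct Declarations;  // declared only; unknown types fail to compile

template <> struct Declarations<jint> {
    typedef jintArray array_type;
    static const char* signature()       { return "I"; }
    static const char* array_signature() { return "[I"; }
};

template <> struct Declarations<jchar> {
    typedef jcharArray array_type;
    static const char* signature()       { return "C"; }
    static const char* array_signature() { return "[C"; }
};

// Reverse lookup: array type -> element type (signatures are inherited).
template <class A> struct ArrayDeclarations;

template <> struct ArrayDeclarations<jintArray> : Declarations<jint> {
    typedef jint element_type;
};

template <> struct ArrayDeclarations<jcharArray> : Declarations<jchar> {
    typedef jchar element_type;
};

// The compiler deduces A from the argument, so callers never spell a signature.
template <class A>
std::string signatureOfArray(A) {
    return ArrayDeclarations<A>::array_signature();
}
```

Because every entry is a type or a static function, the table is consulted entirely at compile time.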

The framework can also be easily extended for custom (non-primitive) types so that they can be used by the compiler for automatic type inference. jni_declarations.h exemplifies this feature with the declarations for a String data type (struct StringDeclarations). Such declarations are immediately available for the compiler to utilize. For instance, in the C++ code in Listing 3, we assign a new value to the string field of the running example with the following statement:

JNIStaticField<jstring>(env, obj, "stringField") =
   env->NewStringUTF("Good-bye, world!");

Note that we do not designate explicitly the corresponding type signature, as the compiler infers it automatically.

Resource Management

The JNI encapsulation framework features a generic resource management mechanism, which (when adapted properly) greatly simplifies various JNI use cases. Our resource management approach is based on the C++ “construction as resource acquisition” idiom [2]: resources are allocated in the constructor and are released in the destructor of dedicated auxiliary objects. This idiom is implemented using the Proxy pattern [5], with functionality similar to that of the auto_ptr template of the C++ Standard library [3]. Our implementation thus enables the programmer to acquire a Java resource through construction of the corresponding C++ object, without having to explicitly acquire the resource; resource deallocation (release) is also performed automatically, through the C++ object destructor. Customized resource management (via the isCopy parameter for allocation and the mode parameter for deallocation) is also supported.
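The idiom can be sketched generically. The example below is an illustration, not the framework's implementation: ScopedResource, CountingRelease, and demoOwnershipTransfer are hypothetical names, and ownership transfer is expressed with C++11 move semantics instead of the auto_ptr-style copying described above.

```cpp
#include <utility>

// Generic guard: the resource is handed over at construction and released
// exactly once, by whichever guard owns it when that guard is destroyed.
template <class Handle, class Release>
class ScopedResource {
public:
    ScopedResource(Handle h, Release r) : handle_(h), release_(r), owns_(true) {}

    // Moving transfers ownership; the source guard will not release.
    ScopedResource(ScopedResource&& other)
        : handle_(other.handle_), release_(std::move(other.release_)), owns_(other.owns_) {
        other.owns_ = false;
    }

    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

    ~ScopedResource() { if (owns_) release_(handle_); }

    Handle get() const { return handle_; }

private:
    Handle  handle_;
    Release release_;
    bool    owns_;
};

// Illustrative release step that only counts how often it ran. In JNI code
// this is where a call such as DeleteGlobalRef would go.
struct CountingRelease {
    int* counter;
    void operator()(int) const { ++*counter; }
};

// Returns the number of release calls after one transfer of ownership.
int demoOwnershipTransfer() {
    int released = 0;
    {
        ScopedResource<int, CountingRelease> a(42, CountingRelease{&released});
        ScopedResource<int, CountingRelease> b(std::move(a));  // b owns the handle now
    }  // both guards are destroyed here, but only b releases
    return released;
}
```

Only the final owner releases the handle, which is the property that makes it safe to pass such guards between scopes.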

Strings

The JNI exposes Java strings in two encodings: as 8-bit characters in (modified) UTF-8, and as 16-bit wide (Unicode) characters. Our class JNIStringUTFChars provides an interface to Java UTF-8 strings of type jstring. Such a string can be accessed as a raw const char* or converted to C++ std::string for convenience [6]. Since Java strings cannot be modified, class JNIStringUTFChars provides only the const version of operator[]. Function length returns the length of the C++ string (using the JNI function GetStringUTFLength). jni_resource.h, which contains the complete implementation, also defines class JNIStringChars for accessing Java strings of wide (Unicode) characters.

As an example, consider the following code that attaches a C++ variable, str, to the static string field of the running example, using the names of the host class and the string field:

JNIStringUTFChars str(env, "JniExample", "stringField");

The string value can then be printed simply using cout << str.get().
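The shape of such a string proxy can be sketched as follows. This is not the class from jni_resource.h: FakeEnv and FakeString are hypothetical stand-ins for JNIEnv and jstring, the stand-in functions drop the isCopy parameter of the real GetStringUTFChars, and the class attaches to a string directly rather than to a named field.

```cpp
#include <cstddef>
#include <string>

struct FakeString { std::string utf8; };  // hypothetical stand-in for jstring

struct FakeEnv {  // hypothetical: counts acquisitions that were not released
    int outstanding = 0;
    const char* GetStringUTFChars(const FakeString& s) { ++outstanding; return s.utf8.c_str(); }
    void ReleaseStringUTFChars(const FakeString&, const char*) { --outstanding; }
    std::size_t GetStringUTFLength(const FakeString& s) { return s.utf8.size(); }
};

class StringUTFChars {
public:
    // Acquire the character buffer on construction...
    StringUTFChars(FakeEnv& env, const FakeString& s)
        : env_(env), str_(s), chars_(env.GetStringUTFChars(s)) {}

    // ...and give it back when the proxy goes out of scope.
    ~StringUTFChars() { env_.ReleaseStringUTFChars(str_, chars_); }

    StringUTFChars(const StringUTFChars&) = delete;
    StringUTFChars& operator=(const StringUTFChars&) = delete;

    const char* get() const { return chars_; }
    char operator[](std::size_t i) const { return chars_[i]; }  // read-only access
    std::size_t length() const { return env_.GetStringUTFLength(str_); }
    std::string asString() const { return std::string(chars_); }  // copies the characters

private:
    FakeEnv&          env_;
    const FakeString& str_;
    const char*       chars_;
};
```

The copy operations are deleted in this sketch so that the buffer cannot be released twice; the framework described here handles that case through transfer of ownership instead.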

To replace the value of a Java string, you must first create a new string that can later “survive” in the Java environment. In the code fragment below, a C++ variable is instantiated on the static string field of the running example and then assigned a brand new Java string created with the JNI function NewStringUTF:

JNIStaticField<jstring>(env, obj, "stringField") =
   env->NewStringUTF("Good-bye, world!");

Note that there is no need to worry about eventually releasing the memory occupied by this newly created string — this is performed by Java’s garbage collector.

Arrays

Our implementation of arrays features most of the functionality presented up until now. In particular, our arrays provide automatic acquisition and release of elements in the constructor and destructor respectively. (The template specialization trick is used here again to preinstantiate the array template for all primitive types so that the appropriate Get<Type>ArrayElements/Release<Type>ArrayElements functions are automatically selected based on the context.) Function size uses the JNI facility GetArrayLength to determine the number of array elements. Two versions of operator[] (regular and const) are provided to access the individual elements. jni_resource.h contains the implementation.

To access the integer array field of the running example, we instantiate a corresponding C++ variable as follows:

JNIArray<jint> arr(env, obj, "intArray");

Subsequent access of the array elements is straightforward: arr[0] = 0.

The default behavior of JNIArray is to copy all the array elements back into the Java environment once the C++ array goes out of scope. (This is done by JNI function Release<Type>ArrayElements invoked from the array destructor.) When this behavior needs to be overridden, use member function CustomRelease to set the desired mode for Release<Type>ArrayElements.
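The effect of the release mode can be sketched with a stand-in environment. FakeEnv and FakeIntArray are hypothetical and always hand out a copy of the elements, which a real JVM is permitted to do; the constants mirror the values of the JNI release modes (0, JNI_COMMIT, JNI_ABORT). The signature of CustomRelease shown here is an assumption; only its name comes from the text.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef int jint;                 // stand-in for the jni.h typedef
const int kCopyBackAndFree = 0;   // release mode 0 in the JNI
const int kCommit          = 1;   // value of JNI_COMMIT: copy back, keep buffer
const int kAbort           = 2;   // value of JNI_ABORT: free buffer, discard changes

struct FakeIntArray { std::vector<jint> data; };  // hypothetical stand-in for jintArray

struct FakeEnv {  // hypothetical environment that always works on a copy
    jint* GetIntArrayElements(FakeIntArray& a) {
        jint* buf = new jint[a.data.size()];
        std::copy(a.data.begin(), a.data.end(), buf);
        return buf;
    }
    void ReleaseIntArrayElements(FakeIntArray& a, jint* buf, int mode) {
        if (mode != kAbort)  std::copy(buf, buf + a.data.size(), a.data.begin());
        if (mode != kCommit) delete[] buf;
    }
    std::size_t GetArrayLength(const FakeIntArray& a) { return a.data.size(); }
};

class IntArray {
public:
    IntArray(FakeEnv& env, FakeIntArray& arr)
        : env_(env), arr_(arr), elems_(env.GetIntArrayElements(arr)),
          mode_(kCopyBackAndFree) {}

    // The destructor releases the elements using the currently selected mode.
    ~IntArray() { env_.ReleaseIntArrayElements(arr_, elems_, mode_); }

    IntArray(const IntArray&) = delete;
    IntArray& operator=(const IntArray&) = delete;

    // Assumed signature: selects the mode used by the destructor. Choosing
    // kCommit here would leave the buffer allocated, as in the real JNI.
    void CustomRelease(int mode) { mode_ = mode; }

    std::size_t size() const { return env_.GetArrayLength(arr_); }
    jint&       operator[](std::size_t i)       { return elems_[i]; }
    const jint& operator[](std::size_t i) const { return elems_[i]; }

private:
    FakeEnv&      env_;
    FakeIntArray& arr_;
    jint*         elems_;
    int           mode_;
};
```

With the default mode, a write such as arr[0] = 0 reaches the Java-side array when the guard is destroyed; after CustomRelease(kAbort), the same write is discarded.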

Occasionally, it is not necessary to manipulate an entire Java array, which may be quite large. For cases when only a part of the array needs to be accessed, the JNI provides a pair of functions Get<Type>ArrayRegion/Set<Type>ArrayRegion. The file jni_utils.h defines corresponding template functions GetArrayRegion/SetArrayRegion (preinstantiated at compile time for primitive types) that are capable of deducing the element type based on their parameters.

Monitors

Monitors serve to ensure mutual exclusion of threads competing for a shared resource. To ensure resource integrity (“thread safety”), threads should request to enter a monitor at the beginning of the critical section and leave it at the end of the section. We have implemented a resource management technique that uses an auxiliary automatic object of type JNIMonitor, whose constructor enters a monitor and whose destructor leaves the monitor as soon as the object goes out of scope (see jni_resource.h for implementation details). JNIMonitor’s constructor receives a handle to the object that constitutes a shared resource protected by this monitor. Listing 5 shows a sample code fragment that uses a monitor.
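A scoped monitor of this kind might look like the sketch below. FakeEnv and FakeObject are hypothetical stand-ins for JNIEnv and jobject that only track the nesting depth, and guardedIncrement and depthAfterFailure are illustrative helpers. The point shown is that the monitor is left on every exit path, including when an exception propagates.

```cpp
#include <stdexcept>

struct FakeObject {};  // hypothetical stand-in for the jobject being locked

struct FakeEnv {  // hypothetical: records how often the monitor is entered and held
    int depth  = 0;
    int enters = 0;
    int MonitorEnter(FakeObject&) { ++depth; ++enters; return 0; }
    int MonitorExit(FakeObject&)  { --depth; return 0; }
};

class Monitor {
public:
    Monitor(FakeEnv& env, FakeObject& obj) : env_(env), obj_(obj) { env_.MonitorEnter(obj_); }
    ~Monitor() { env_.MonitorExit(obj_); }
    Monitor(const Monitor&) = delete;
    Monitor& operator=(const Monitor&) = delete;
private:
    FakeEnv&    env_;
    FakeObject& obj_;
};

// Critical section: starts at the definition of m and ends when m is destroyed.
int guardedIncrement(FakeEnv& env, FakeObject& lock, int& counter, bool fail) {
    Monitor m(env, lock);
    if (fail) throw std::runtime_error("failure inside the critical section");
    return ++counter;
}

// Returns the monitor nesting depth observed after a guarded call that throws.
int depthAfterFailure() {
    FakeEnv env;
    FakeObject lock;
    int counter = 0;
    try {
        guardedIncrement(env, lock, counter, true);
    } catch (const std::runtime_error&) {
        // The Monitor destructor has already run during stack unwinding.
    }
    return env.depth;
}
```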

Global References

It is sometimes necessary to obtain a reference to a Java object so that it can be used across the function boundaries of C++ code. In such cases, a global reference to this object should be reserved (in contrast to most JNI functions that yield a local reference, which expires when the native method that obtained it returns). jni_resource.h implements an auxiliary template class JNIGlobalRef, whose constructor acquires a global reference to the specified object and whose destructor releases it (see sample usage in Listing 6).

Using the Code

Following the STL methodology, all the framework code resides in header files and is entirely template based, so clients do not need to compile their applications with any additional libraries. In fact, client applications need only include the master file jni_master.h, which itself #includes all the other headers. The entire code for this article with complete Java-C++ integration examples can be obtained from the C/C++ Users Journal website at www.cuj.com/code/.

A More Elaborate Example

We now apply the JNI encapsulation framework to a more substantial example. Suppose we have a Java application in which several concurrent threads generate (a predefined number of) objects with string IDs. The task is to collect these objects in a thread-safe way and ultimately sort them by their IDs. Since earlier versions of the JDK (Java Development Kit), such as JDK 1.1, did not have the Collections framework, it is quite natural to implement the sorting container in native C++ code using the STL. JniComplexExample.java (not shown here for the sake of brevity) contains the Java part of this example, which uses the following native functions:

Listing 6 shows the native code for the container, implemented using the STL multimap. The container holds the (Java originated) objects by global references [7], associated with their string IDs. The container is realized as a Singleton object, which has two thread-safe access functions:

(This vector is inherently sorted, since the objects are extracted from a multimap.) To ensure code portability, thread safety is implemented using JNI monitors. Critical sections start with monitor definition and last until the monitor is automatically destroyed as it goes out of scope.
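The container logic can be sketched in standard C++ alone. This is not Listing 6: the names Collector, add, and sortedHandles are hypothetical, plain int handles stand in for global references, std::mutex replaces the JNI monitors, and a function-local static replaces whatever Singleton construction the original code uses.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class Collector {
public:
    // Function-local static: initialized once, thread-safely, since C++11.
    static Collector& instance() {
        static Collector c;
        return c;
    }

    // Critical section lasts from the definition of lock to the end of scope.
    void add(const std::string& id, int handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.insert(std::make_pair(id, handle));
    }

    // The multimap keeps its entries ordered by key, so a plain traversal
    // yields the handles sorted by ID without an explicit sort.
    std::vector<int> sortedHandles() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<int> out;
        for (std::multimap<std::string, int>::const_iterator it = items_.begin();
             it != items_.end(); ++it) {
            out.push_back(it->second);
        }
        return out;
    }

private:
    Collector() {}
    Collector(const Collector&) = delete;
    Collector& operator=(const Collector&) = delete;

    mutable std::mutex              mutex_;
    std::multimap<std::string, int> items_;
};
```

A multimap rather than a map is needed because several objects may share the same ID; entries with equal keys keep their insertion order.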

Discussion

Let us review the properties of the solution we developed:

  1. It is an easier-to-use, more straightforward approach. Whenever possible, the compiler infers variable types from the context. Java data structures are automatically exported to (and in order to save changes, are later imported from) the C++ code. Auxiliary technical operations are encapsulated in higher-level templates.
  2. It is a less error-prone API. Fewer functions to call means fewer opportunities to err in successive function invocations. Also, it is now possible to perform various checks at compile time, instead of discovering problems much later as run-time errors.
  3. It utilizes proper resource management. Resources are automatically deallocated when they are no longer necessary, which helps prevent resource leaks, deadlocks, and starvation.
  4. It addresses portability issues. Java portability is preserved by using only ANSI-Standard C++ and the STL [3].
  5. A possible drawback of the suggested framework is the compile-time penalty it imposes, due to heavy use of the preprocessor and embedded templates. However, this overhead is limited to the compilation time and does not propagate to the run time. The code size increase is also negligible, since most of the templates only provide type definitions (and thus do not need run-time representation at all), and unused template instantiations are discarded by the code optimizer.
Acknowledgments

The authors are thankful to Marc Briand, Herb Sutter, and Jerry Schwarz for their constructive comments and suggestions.

Notes and References

[1] Java Native Interface Specification, http://java.sun.com/products/jdk/1.2/docs/guide/jni/spec/jniTOC.doc.html.

[2] Bjarne Stroustrup. The C++ Programming Language, Third Edition (Addison-Wesley, 1997).

[3] “Information Technology - Programming Languages - C++,” International Standard ISO/IEC 14882:1998(E).

[4] The name of the C++ variable is obviously arbitrary; we use the same variable name in C++ and Java merely for convenience.

[5] Erich Gamma, et al. Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley, 1995).

[6] Note that the conversion function asString physically copies the characters.

[7] Global references are required so that the Java garbage collector does not destroy the objects prematurely.

[8] D.C. Schmidt and T. Harrison. “Double-Checked Locking: An Optimization Pattern for Efficiently Initializing and Accessing Thread-Safe Objects,” Pattern Languages of Program Design (Addison-Wesley, 1997).

Evgeniy Gabrilovich is an algorithm developer at Zapper Technologies Inc. He holds an M.Sc. degree in Computer Science from the Technion-Israel Institute of Technology. His interests involve computational linguistics, information retrieval, artificial intelligence, and speech processing. He can be contacted at gabr@acm.org.

Lev Finkelstein is an algorithm developer at Zapper Technologies Inc., and is a Ph.D. student in Computer Science at the Technion-Israel Institute of Technology. His interests include artificial intelligence, machine learning, multi-agent systems, and data mining. He can be reached at lev@zapper.com.