Net Gets a Java Buzz

by Ray Valdes

With the explosive growth of the Internet and the World Wide Web, it becomes harder each day to sort out the important from the ephemeral. Yesterday's cool site is today's lame page. There is an ongoing arms race among Web-page designers to add impact to pages that increasingly appear static and tired. The latest such weapon in the designer's arsenal is a technology called "Java."

Before examining Java, it is important to remember that, in evaluating any technology, you can sort out potential winners from also-rans by observing who's on the main evolutionary path and who is stuck on the byways. The details will remain unpredictable, but the broad outlines become clear. For example, a central issue that will never go away is the relative scarcity of bandwidth. Just as you can never be too rich, too thin, or have too much RAM or MIPS, the pipe that connects your desktop to the global network can never be too wide. Any technology that deals effectively with the bandwidth constraint will rapidly find its way into the mainstream. Throwing hardware at the bandwidth problem isn't necessarily a solution, because users will find new kinds of data to ship over the pipe: larger graphic images, animation, audio, and video.

One strategy for dealing with the narrow-pipe problem is to offload work from the server to the client. For instance, using VRML technology, which the lay public sometimes confuses with Java, a Web page can contain not just a GIF or JPEG image, but a locally rendered 3-D model through which the user can navigate and interact. (See "VRML and the World Wide Web," by Joe Stewart, Dr. Dobb's Developer Update, June 1995.) Obviously, the burden is now on the client to generate the umpteen frames-per-second necessary to maintain user attention. The creators of VRML say it was designed for low-bandwidth connections, and this is true: The movements through the virtual world are constrained by the speed of your CPU, rather than that of your Internet connection. If you have an SGI box, you're in luck; anything less (such as a Pentium 90 running Windows NT), and you lose. In the not-too-distant future, faster processors and hardware-assisted rendering will make things better, but what about the interim?

Enter Java

The recently announced Java technology from Sun Microsystems garnered excitement and visibility partly because of its quick licensing by Netscape Communications for incorporation into the wildly popular Netscape browser. (Reportedly, Netscape programmers have already produced a demonstrable Java-enhanced version.)

Java is orthogonal to VRML in that what flows through the pipe is not data (a 3-D model represented in VRML format) but executable code written in a general-purpose programming language. The reason the lay public sometimes confuses Java with VRML is that, in certain circumstances, the demos look similar: a Web page with a small embedded 3-D model with which users can interact. However, entirely different technologies are used to achieve the same end: getting around the bandwidth constraints. These technologies are complementary, not competitive. VRML is for navigating large-scale, 3-D spaces; Java is for anything under the sun that can run at interpreted speeds.

The Java language itself is described in the article "Java and Internet Programming," by Arthur van Hoff (Dr. Dobb's Journal, August 1995), one of the principal members of the Java team. Here I'll focus on topics not covered in van Hoff's article: the context surrounding the Java system, additional news about the Java project, and an independent analysis of the prospects for this technology. My discussion is based on an examination of the system and its documentation, as well as interviews with many of the Java implementors. The members of the Java development team, while justifiably proud of their system, are also refreshingly candid about what's not yet complete in this work in progress. There is none of the "yes, we'll tell you about our neat stuff, but then we'll have to kill you" attitude that afflicts some Silicon Valley companies.

First, a quick recap. The Java system consists of a set of interrelated technologies:

One design goal of the Java language is to achieve a "completely new level of robustness" via a small, simple, architecture-neutral, object-oriented language that resembles a dialect of C++. This goal can be paraphrased as, "Let's get rid of all the complicated, dangerous, and/or stupid crud in C++." (This notion is one that many former C programmers heartily support.) As a consequence, Java has no multiple inheritance of classes, no functions (only methods), no pointer arithmetic, no structs, no typedefs, no #defines, no variable argument lists, no header files, no need to free memory, and, as van Hoff adds with a smile, "no core dumps." Because backward compatibility with C is not a requirement, the result is a language that is simple and elegant, yet equivalent in expressive power to C++.

The language was designed primarily by James Gosling at Sun, starting about five years ago. Gosling is a legendary figure among UNIX programmers; he's known for creating the first C version of Emacs (Richard Stallman wrote the very first Emacs using TECO macros) and the PostScript-based, dynamic windowing environment for SunOS known as "NeWS." The maturity and experience behind Gosling's vision are apparent to those I've spoken with who have programmed extensively in Java. From the start, the heft and balance of the language feels right, and continues to wear well over time. The Java team has already written over 500,000 lines of code in Java.

Much of the language is dynamic. For example, references to member functions and instance variables are symbolic until load time, and only then are resolved to numeric offsets. There is type information available at run time, and classes are themselves first-class objects. However, Java does not go to the extremes of Smalltalk in this regard; simple types such as char and boolean are not objects, although strings and arrays are.
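To make the run-time type information concrete, here is a minimal sketch written against present-day Java (the alpha-release API may differ in detail). The class name RuntimeTypeDemo and its helper methods are my own inventions for illustration. It asks objects for their class at run time, and looks a class up by name, which is the kind of late binding that a self-extending browser would presumably rely on.

```java
// Illustrative sketch only; RuntimeTypeDemo and its methods are hypothetical names.
class RuntimeTypeDemo {
    // Returns the run-time class name of any object reference.
    static String describe(Object o) {
        return o.getClass().getName();
    }

    // Looks a class up by name at run time and reports whether it was found.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3};
        Object asObject = numbers;               // an array is an object
        System.out.println(describe("hello"));   // prints java.lang.String
        System.out.println(describe(asObject));  // prints [I (the JVM's name for int[])
        System.out.println(canLoad("java.lang.Thread"));  // prints true
        System.out.println(canLoad("no.such.Klass"));     // prints false
    }
}
```

A plain int or boolean cannot be passed to describe() without first being wrapped in an object, which is the distinction from Smalltalk drawn above.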

Gosling's vision goes beyond details of syntax and semantics, to the general notion of "moving behavior across the net." Gosling wants to break down machine boundaries, to allow people to publish executable code as they now publish documents. Robustness and security are key to achieving this goal. The Java language system provides for null-pointer checking, array bounds checking, garbage collection (no need to free memory), exception handling, and "bytecode verification" (dataflow analysis at run time for security purposes and to guard against stack overflow). Unlike C/C++, in which the implementation of data types such as char or int is unspecified, everything in Java is nailed down. bytes are signed 8-bit quantities, chars are 16-bit Unicode characters, floats are in IEEE 754 format, byte order is big-endian, and so on.
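The run-time checks are easiest to appreciate in a small example. The following sketch, in present-day Java, uses a hypothetical class SafetyDemo of my own devising; it is not taken from Sun's code. Each method does something that would silently corrupt memory or crash in C, and shows that the Java run time turns the mistake into a catchable exception or a well-defined result.

```java
// Illustrative sketch only; SafetyDemo and its method names are hypothetical.
class SafetyDemo {
    // Indexes one element past the end of an array. The run time raises an
    // exception instead of returning whatever sits in adjacent memory.
    static String readPastEnd() {
        int[] data = new int[4];
        try {
            int index = data.length;          // one past the last valid index
            return "read " + data[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "bounds check caught";
        }
    }

    // Calls a method through a null reference; this too becomes an exception.
    static String dereferenceNull() {
        String s = null;
        try {
            return "length " + s.length();
        } catch (NullPointerException e) {
            return "null check caught";
        }
    }

    // An int is always 32 bits, so overflow wraps identically on every platform.
    static int addOne(int x) {
        return x + 1;
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd());      // prints bounds check caught
        System.out.println(dereferenceNull());  // prints null check caught
        System.out.println(addOne(Integer.MAX_VALUE) == Integer.MIN_VALUE);  // prints true
    }
}
```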

Regarding security, the design philosophy is: "Once you have a set of enforceable language constraints, then you can build a security policy on top of them." This multilayered security allows Java to function as an Internet extension language that provides Web pages with executable content. The goal, obviously, is to prevent a virus from infecting a hapless user's machine, despite promiscuous point-and-click.

The Browser

Of all the components in the Java system, the HotJava browser is the most visible, and provides a compelling argument for the importance of this technology. It is itself written in Java and interpreted at run time. A Web page written for the HotJava browser can contain "applets" written in Java, similar to the way Web pages now contain embedded GIF and JPEG images. Because Web browsers are supposed to ignore HTML codes that they don't understand, the <app> tag does no harm when encountered by existing, non-Java-enabled browsers. Applets can implement anything from a manipulable 3-D cube to an interactive game, a small spreadsheet program, or dancing text. Because of the dynamic nature of the language, the HotJava browser can, in theory, upgrade itself at run time - for example, by downloading support for a new network protocol or graphics format.

For examples of Java-jazzed Web pages, take a look at http://www.stones.com or http://www.hotwired.com. The official Java site (http://java.sun.com) contains online documentation and the complete alpha-stage releases for Solaris and Windows NT for downloading via ftp. These alpha releases include source code for the Java compiler and the HotJava browser. Check out the licensing restrictions before you download the code, however. Sun will gladly provide the entire source code, including the VM interpreter, if you express serious interest in doing a port. In exchange, you give Sun nonexclusive rights to any enhancements and improvements you may make to the source.

This openness, which seems unprecedented outside academia, has led to a flurry of activity among software developers. Outside Sun, ports are underway to many platforms, including Linux, Amiga, NeXT, Windows 3.1, and SunOS. Sun itself has committed to Solaris, Windows NT, Windows 95, and Macintosh. There are also some efforts at "clean-room" implementations, in which programmers work only from the specifications for the language and VM instruction set, rather than from Sun-provided source code. However, at first glance, the system still seems too immature for these efforts to be successful. For example, there are aspects of the run-time system, such as the bytecode verification and dataflow analysis, that aren't documented. While the lower layers of the system (the language and VM design) are stable, the upper layers (the class library that implements an abstract windowing environment) are undergoing a significant rewrite. The people at Sun have yet to cleanly partition the Java system into a separate language package and browser. For the moment, a port seems to consist of taking the whole shooting match and pounding away at different parts of the code that reside at different levels of abstraction until it all somehow works.

This situation can only improve, as the Java project gains visibility and receives an increased share of corporate resources from Sun. Certainly the project is the focus of great interest within and without the corporate walls. A recent talk by Arthur van Hoff at Sun headquarters had double the expected attendance. Attendees included employees from other divisions at Sun who want to transfer into the group, as well as technical staff from potential competitors, who were attempting to discreetly hide their nametags in their pockets. Also present were principals of spanking-new startups, whispering recruitment pitches to old acquaintances.

Performance

The HotJava browser, written in interpreted Java, runs quite nicely on a Pentium 90 machine with 16MB RAM. It's even manageable on a 486 - a pleasant contrast with other emerging Internet technologies such as VRML and RealAudio, whose performance on such hardware leaves something to be desired. This is not too surprising, because the Java system was originally created not for the Web, but for set-top boxes used in interactive TV applications.

The bytecode interpreter was designed with full awareness of past generations of VM designs, including the work by David Ungar's group at Stanford, and the on-the-fly compilation of bytecodes into native machine code pioneered by Peter Deutsch and Allan Schiffman at ParcPlace.

Gosling has come up with some clever tricks to increase performance while preserving platform independence; for example, a technique for the run-time binding of symbolic references to numeric offsets by overwriting the bytecode stream with equivalent _quick instructions. This fundamental technique, by the way, is patented (US patent 5,367,685, "Method and apparatus for resolving data references in generated code"), which may cause problems for anyone seeking to do a clean-room implementation. Although one of the Java engineers states "As far as I know, we'd forgot all about this," it is legitimate to wonder if Sun's corporate lawyers have a different memory-refresh rate.
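The general idea behind the _quick rewriting can be shown with a toy interpreter. What follows is only a sketch of the concept as described above, not Sun's implementation and not a rendering of the patent claims. The opcode names and numbers, the three-field object layout, and the class QuickRewriteDemo are all assumptions of mine. The first time a symbolic field reference is executed, the interpreter resolves the name to a slot offset and overwrites the instruction in place, so later executions skip the lookup.

```java
// Toy illustration only; opcodes, layout, and names are hypothetical.
class QuickRewriteDemo {
    static final int HALT = 0;
    static final int GETFIELD = 1;        // operand: index into a table of field names
    static final int GETFIELD_QUICK = 2;  // operand: numeric slot offset

    // Assumed object layout: a field's offset is its position in this array.
    static final String[] LAYOUT = {"x", "y", "z"};

    // Counts symbolic lookups so the saving is observable.
    static int resolutions = 0;

    // Resolves a symbolic field name to its numeric offset (the slow path).
    static int resolve(String name) {
        resolutions++;
        for (int i = 0; i < LAYOUT.length; i++) {
            if (LAYOUT[i].equals(name)) {
                return i;
            }
        }
        throw new IllegalArgumentException("no such field: " + name);
    }

    // Runs the code and returns the sum of every field value it loads.
    static int run(int[] code, String[] names, int[] slots) {
        int sum = 0;
        int pc = 0;
        while (true) {
            switch (code[pc]) {
                case GETFIELD: {
                    int offset = resolve(names[code[pc + 1]]);
                    code[pc] = GETFIELD_QUICK;  // overwrite the opcode in place
                    code[pc + 1] = offset;      // overwrite the operand with the offset
                    break;                      // pc unchanged: re-dispatch as the quick form
                }
                case GETFIELD_QUICK:
                    sum += slots[code[pc + 1]];
                    pc += 2;
                    break;
                case HALT:
                    return sum;
                default:
                    throw new IllegalStateException("bad opcode " + code[pc]);
            }
        }
    }

    public static void main(String[] args) {
        int[] code = {GETFIELD, 0, GETFIELD, 1, HALT};
        String[] names = {"y", "z"};
        int[] slots = {10, 20, 30};                     // values of x, y, z
        System.out.println(run(code, names, slots));    // prints 50
        System.out.println(run(code, names, slots));    // prints 50
        System.out.println(resolutions);                // prints 2: the second run did no lookups
    }
}
```

Note that the code stream shipped over the network stays symbolic, and hence platform independent; the numeric offsets exist only in the local, rewritten copy.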

To gain further performance, it is always possible to compile bytecodes to native machine code, a technique known at Sun as "just-in-time" compilation. The present, purely interpreted implementation runs 15-20 times slower than C, which sounds worse than it feels in practice. It is still one or two orders of magnitude faster than TCL.

What Are the Shortcomings?

One caution: The system is still evolving. The awt ("abstract window toolkit" or "another window toolkit") class library is being rewritten completely, to switch to a model-view-controller architecture. The event model is not specified in a sufficiently abstract manner, and perhaps needs to be reworked to map well across platforms with disparate event models, such as Macintosh, Windows, and X.

Sun press releases claim that the technology has been over five years in the making, but some members of the Java team candidly admit that it is only in the last two years that the project has focused on the goal of creating an Internet extension language. Once that decision was made, progress happened quickly. For example, the initial version of the browser was mostly written last fall by Jonathan Payne during a two-week period, which indicates three things: the high skill level of Payne and other Java team members, the impressive productivity gains provided by the Java language, and the still-inchoate state of the Java system. These initial quick-and-dirty implementations are getting reworked in the transition from alpha to beta stage. van Hoff is writing a new version of the browser that will have design-mode capabilities for authoring Web pages.

Portability remains an issue. The Java system is very portable if you are writing to the bare hardware, a result of its embedded-systems heritage. Java is a lot less portable if you have to integrate with an operating system or GUI environment, especially if that system has a baroque architecture (yes, we're talking about Windows 3.1, which Sun has punted on for the moment, although some independent efforts are underway). Preemptive multithreading is central to the Java design. For example, the language offers a synchronized keyword, which is applied to a method and locks the object whose method is being invoked. The garbage collector runs in its own thread. All told, there are almost a dozen active threads in a running HotJava environment, besides those spawned by applets.
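A short example shows what the synchronized keyword buys you. This is a minimal sketch in present-day Java syntax (the lambda notation did not exist in the alpha release), and the class names SharedCounter and SynchronizedDemo are hypothetical. Several threads increment one counter; because increment() is synchronized, each call locks the counter object, so no updates are lost.

```java
// Illustrative sketch only; SharedCounter and SynchronizedDemo are hypothetical names.
class SharedCounter {
    private int count = 0;

    // synchronized locks this SharedCounter object for the duration of the call.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }
}

class SynchronizedDemo {
    // Starts the given number of threads, each incrementing a shared counter
    // perThread times, waits for all of them, and returns the final count.
    static int countWithThreads(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000));   // prints 40000
    }
}
```

Remove the synchronized modifier from increment() and the total may come up short, since count++ is a read-modify-write sequence that preemptively scheduled threads can interleave.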

Surprisingly, Java threads seem to function better on Windows NT than on Sun boxes, at least in that threads at the same priority level are scheduled preemptively, unlike the Solaris implementation. Thread guru Tim Lindholm explains that this is due to Java's modest expectations regarding thread facilities. The "cheesy" facilities on Windows NT are a good match, compared to the sophisticated thread library on Solaris, which is overkill and which resulted, paradoxically, in a decision to implement a Java-specific, non-kernel-level implementation of threads. You can expect a switch to standard, kernel-level threads on Solaris before too long.

Java Prospects

After examining the Java system, it seems safe to say: "I have seen the future of the World Wide Web, and it is executable content." This is not to say that Java itself will prevail, only that Java-like technologies will become mainstream.

Are there any viable alternatives to Java? You can speculate that Bill Gates's devotion to Basic will result in Microsoft eventually releasing "VB/WWW." Netscape, which is trying to become the Microsoft of the year 2000 (and has a good shot at it), may attempt to roll its own technology. Certainly Netscape programmers have the chops to pull it off. However, these things take time to create; the Java group has at least a two-year head start and is rapidly gaining adherents.

Existing technologies incubated in academia include Safe-TCL, Guile, Phantom, and Python. People at Sun acknowledge these efforts, but state: "They're not as ready as we are."

One Java-like technology that will be familiar to Dr. Dobb's readers is David Betz's object-oriented language known as "Bob." Like Java, Bob is a tractable subset of C++ and is pseudocompiled into bytecodes interpreted by a stack-oriented virtual machine. Bob debuted in the September 1991 issue of Dr. Dobb's Journal ("Your Own Tiny Object-Oriented Language" by David Betz). Since then, Betz has extended Bob into a language for online conferencing systems ("An Online Conferencing System Construction Kit," by David Betz, Dr. Dobb's Information Highway Sourcebook, Winter 1994) and also made an embeddable version in the form of a language-processor DLL that is callable from Windows applications (see "Callable Bob," Dr. Dobb's Journal, May 1995). Unlike Java, the source code to Betz's bytecode interpreter is freely available (ftp://ftp.mv.net/pub/ddj/1995/1995.05/bobdll.zip). It's possible that Bob could serve as the basis for commercial alternatives to Java; Betz's first language, XLisp, became the foundation for AutoCAD's AutoLISP macro language. Compared to the design goals set for Java, Betz's initial goals for Bob were modest; consequently, the system is very light, consisting of fewer than 4000 lines of clearly written C code.

One existing commercial technology with some heavyweight backers that may stake a claim on Java's turf is Telescript. Unlike Bob, which was a one-man effort, Telescript was developed over several years by a large team of engineers at General Magic, in concert with blue-chip corporate partners such as Sony, Motorola, and AT&T. The initial version has been shipping since last fall, in the form of the Sony Magic Link palmtop computer. There are currently many differences between Telescript and Java; most significantly, that Telescript currently has nothing to do with the World Wide Web. The major similarity is that both systems support the notion of shipping behavior across networks. Given General Magic's sagging fortunes in the PDA market, it's quite possible it will seek the greener pastures of the World Wide Web. However, the company seems hobbled by a top-down management approach and a penchant for secrecy and for overly proprietary systems - not a good combination for an Internet-oriented business. Few companies have the technological resources to go it alone, no matter how deep the pockets of their corporate backers. Netscape Communications provides a good example of how to succeed: By harnessing the power of individuals connected to the Internet, the company has grown its installed base from zero to three million users (80 percent market penetration) in less than nine months.

One person who is following Netscape's approach is Dave Winer, author of Aretha, a C-like language formerly known as Frontierland. Aretha started out as a scripting language for the Macintosh, and Winer recently broadened its scope to work with the Internet and the Web. To help make it a standard, he has made the binaries available for free on the Net, at http://www.hotwired.com/userland/aretha. At the moment, however, Aretha is restricted to the Macintosh platform, and seems better suited for creating CGI scripts on servers rather than executable content on clients.

As you can see, there's no lack of alternatives to Java. For ongoing coverage of these issues, check out http://www.dobbs.com/dddu/java.html. Will Java succeed? Recall Scotty's remark to Captain Kirk: "I can't change the laws of physics, Jim!" Along with the space-time continuum, bandwidth constraints will always be with us. Java provides a well-conceived strategy for circumventing this problem, and the design and implementation seem substantial enough to go the distance.


Ray is senior technical editor at Dr. Dobb's Journal and can be contacted at ray@valdes.com.