Dr. Dobb's Journal December, 2005
In June of this year, Apple Macintosh developers discovered that they had a formidable challenge set before them. The bad news was that to continue in the Macintosh software market, they were going to have to migrate their Power PC-based applications to an Intel x86-based Macintosh. The good news was that the operating system, Mac OS X, had already been ported and was running on prototype x86-based systems. Furthermore, Apple Computer offered cross-compiler Xcode tools that would take an application's existing source code and generate a "universal binary" file of the application. The universal binary file contains both Power PC and x86 versions of the program, and would therefore execute on both the old and new Mac platforms. (For more details on how Apple has engineered this transition, see my article "The Mac's Move to Intel," DDJ, October 2005.)
Apple provided developer reports that describe the migration of an existing Mac application to the new platform as relatively quick and easy. I don't dispute those reports. There are situations where a particular application's software design dovetails nicely with the target platform's software. For example, if the application is written to use Cocoa, an object-oriented API whose hardware-independent frameworks implement many of Mac OS X's system services, the job is straightforward.
However, not all Mac applications are in such an ideal position. Many commercial Mac applications can trace their roots back to when the Mac OS API was the only choice (Mac OS X offers four), and most code was written in a procedural language, instead of an object-oriented one. When migrating an application to a new platform, developers are loath to discard application code that's field-tested and proven to work. Using such "legacy" code potentially reduces the cost and time to port an application because the process starts with stable software. The downside to this strategy is that code idiosyncrasies or subtle differences in the implementation of the legacy APIs on the new platform may hamper the porting process. So, for the majority of Mac developers, the move to Intel might look more like a leap of faith than an easy transition.
This brings us to the $64,000 (or more) question: For most existing Mac applications, how difficult and costly is it to migrate a Power PC Mac application to the x86-based Mac platform? To see where the truth lies, I present in this article a case study of the porting process for a commercial Mac application.
The application in question is Bare Bones Software's BBEdit, an industrial-strength text editor (http://www.barebones.com/). "Text editor" is perhaps a bit of a misnomer here, because over the years, BBEdit's text-processing capabilities have evolved to suit the needs of several categories of user. Programmers use it as a powerful editor for writing application code, web wizards use it to write HTML web pages, and professional writers use it to produce all sorts of text documents, ranging from manuals to novels.
BBEdit offers a wide variety of features and services to meet the demands of these three disparate groups of users. For programming work, the editor displays the source code of C, C++, Objective-C, Java, Perl, and Python (in total, about 20 programming languages) with syntax coloring. It features integration with the Absoft tools and interoperates with the Xcode tools as an external editor. For HTML coding, menu choices let you quickly generate HTML tags. The HTML code is also syntax colored, and syntax checks can be run on the HTML to spot coding errors. Furthermore, BBEdit can render the HTML that you just wrote so that you can preview the results immediately and make changes. When the web page's code is ready, BBEdit lets you save it to the server using an FTP/SFTP connection. For just plain writing, the editor is fast, allowing you to quickly search, change, and scroll through large documents of 7 million words or more. A built-in spellchecker lets everyone (programmers, web wizards, and writers alike) check for spelling errors. Finally, it provides a plug-in interface so that new features and tools can be added to the application in the future.
The current 8.2 version of BBEdit, while changed in many ways from the original version (the World Wide Web didn't even exist when the first version of BBEdit was written), still owes a lot to the code design of its ancestors. To appreciate how the Bare Bones Software team managed the current transition to the x86 platform, it helps to review the editor's rich and complex history: The program has been revised numerous times and completely rewritten several times. As you'll see, certain design decisions, some made years ago, profoundly affected the current (and third) transition to the x86 platform.
BBEdit is a classic Mac application in every sense of the word, being 15 years old. It was written in 1990 to provide a lightweight, yet fast, code editor for the then-extant 68K Mac platform. The bulk of BBEdit was originally written in Object Pascal, while performance-critical sections of the program were written in C and 68K assembly. Mac old-timers will recall that the preferred programming language for Mac applications was Pascal, and that you accessed the Mac APIs with Pascal stack-based calling conventions. C compilers of that era used glue routines to massage the arguments of a C-based API call so that they resembled a Pascal stack call.
The first major code change BBEdit underwent was when Apple transitioned the Mac platform from the Motorola 68K processor to the Motorola-IBM Power PC processor in 1994. The software engineers considered what was required to port the program, and the decision was made to revamp portions of the editor, in particular using C to replace the 68K assembly code.
During the 1999-2000 timeframe, BBEdit was rewritten entirely in C++. Pascal had been on the wane since the early '90s, as many tool vendors and universities embraced C and C++ as the programming languages of choice, and the dwindling support for Pascal compilers prompted the shift.
The compilers of that transition era (notably Metrowerks' CodeWarrior) were able to take an application's source code and generate a "fat binary," which was an application that contained both 68K code and Power PC code. Such an application, while larger in size, could run at native speeds on either a 68K- or Power PC-based platform during the transition period. The current universal binary scheme is similar in concept.
The next code change occurred when BBEdit was migrated from Mac OS 9 (which is considered the "Classic" environment today) to Mac OS X. Work began in 1999, and was completed by Mac OS X's launch in 2001. Although the new OS offered several different APIs (Cocoa, POSIX UNIX, and Java), Apple had to support the Classic APIs, or else the platform would lose a huge body of existing Mac OS software. To this end, Apple provided a fourth set of APIs, the Carbon APIs. On the front end, Carbon resembles the Classic Mac OS APIs. On the back end, Carbon provided existing applications access to all of Mac OS X's system services, such as memory protection and preemptive multitasking. However, Carbon lacked a handful of the Classic APIs that caused compatibility problems, and some of the capabilities of the retained APIs were modified slightly so that they would work within Mac OS X's frameworks.
For the migration to Mac OS X, C, C++, and Carbon were used to handle the core application services, such as text manipulation. However, where deemed appropriate, the other Mac OS X APIs were used to implement specific application features. For example, Cocoa provided access to the WebKit framework, which renders the HTML for BBEdit. These sections of the application were thus written in Objective-C and Objective-C++. Other services, such as SFTP, utilized the POSIX UNIX API. Table 1 shows how BBEdit draws on different Mac OS X APIs to implement its features.
For the move to Mac OS X, a complete rewrite of BBEdit to Cocoa was considered. Given the company's finite resources, it was decided that BBEdit's customers would be best served by adding features that they wanted, rather than rewriting the application. According to Rich Siegel, BBEdit's creator, "New capabilities will be added to the editor as needed, and in doing so, we'll use the right tool for the right job." In other words, if the Carbon APIs offered the best way to implement the features, then the feature code would use the Carbon APIs.
As has become obvious, BBEdit draws upon many of the APIs available in Mac OS X to implement its feature set. The result is that the application consists of an amalgam of C/C++ code and Objective-C/Objective-C++ code. On the surface, this mix of APIs and programming languages might complicate the shift to the x86 platform. However, the earlier port of BBEdit to the Power PC version of Mac OS X worked in BBEdit's favor here, because it limited the problems the team had to deal with to just those brought about by the OS's behavior on the new platform.
First, with Mac OS X 10.4, the OS itself was no longer a moving target. In earlier releases of the OS, its underlying architecture, notably the kernel APIs, was in a state of flux and subject to change. And change these low-level APIs they did, as Apple refined the kernel and underlying frameworks to make improvements. As a consequence, each new release of the OS left broken applications in its wake, an unpleasant outcome that dissuaded many Mac users from switching to Mac OS X. Mac users, after all, want to get work done, not futz with the software.
In 10.4, the kernel programming interfaces (KPIs) have been frozen, and a versioning mechanism lets drivers and other low-level software handle those situations when the KPIs are changed to support new features. The result is an underlying infrastructure for the OS that's stable and consistent across different platforms. This, in turn, makes the porting process manageable.
According to the BBEdit engineers, Mac OS X 10.4 does a good job of hiding the hardware details, while still providing low-level services (such as disk I/O). In addition, its APIs are mostly platform neutral, which means no special code is required to counter side-effects when invoking the APIs on each platform. Put another way, the code to call an API is identical for both platforms, and the results of the API call are identical; no glue code is necessary.
The one side-effect of the x86 processor that Mac OS X can't counter is the Endian issue. The Endian issue arises because of how the data bytes that represent larger values (such as 16- and 32-bit integers, plus floating-point numbers) are organized in memory. (For details, see the accompanying textbox entitled "Endian Issues.")
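To make the byte-ordering problem concrete, here is a minimal C sketch (not taken from BBEdit; every function name in it is made up for illustration). It reads the same four bytes two ways: once by assembling them explicitly in big-endian order, and once by copying them straight into an integer, which is what happens when a program loads binary data without regard to byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host (such as x86), where the least
   significant byte is stored first, and 0 on a big-endian host
   (such as Power PC). Illustrative helper, not a system API. */
int host_is_little_endian(void)
{
    const uint16_t probe = 0x0001;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Decodes four bytes as a big-endian 32-bit value. It works byte by
   byte, so the result is the same on either kind of host. */
uint32_t read_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Copies the same four bytes directly into an integer, the way a
   naive read of binary data does. The result depends on the host. */
uint32_t read_native32(const uint8_t *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Given the bytes 12 34 56 78 (hexadecimal), read_be32 yields 0x12345678 everywhere, while read_native32 yields 0x12345678 on a big-endian host and 0x78563412 on a little-endian one. Data written by a Power PC program and read naively on x86 is scrambled in exactly this way.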
According to Siegel, the Endian issues "were minor, but present." An earlier design decision actually side-stepped what could have been a major problem for the editor in this area. In 2000, BBEdit 6.0 began using Unicode to support the manipulation of multibyte text for Asian, Arabic, and certain European languages. Unicode is a standard for encoding complex language characters and describing their presentation. For example, Unicode not only specifies the characters in Arabic text, it also specifies an algorithm used to handle its bidirectional display. Unicode text can be stored in the 16-bit Unicode Transformation Format (UTF-16), which encodes each character as one or two 16-bit values. (For the record, there are also variations of UTF built on 8- and 32-bit values.)
UTF-16 uses three encoding schemes to describe the byte-ordering of the data: UTF-16BE, UTF-16LE, and UTF-16. In the first two schemes, the name describes the byte ordering, where BE represents Big-Endian and LE represents Little-Endian. For the third encoding, a two-byte sequence at the start of the file makes up a Byte Order Mark (BOM) that describes the Endian encoding scheme.
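The BOM check is simple enough to sketch in a few lines of C. The following is an illustration only, not BBEdit's code or a Mac OS X API, and the type and function names are invented for the example. It relies on the fact that the BOM is the character U+FEFF, so the bytes FE FF at the start of a file indicate Big-Endian data and FF FE indicate Little-Endian data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef enum {
    UTF16_ORDER_UNKNOWN,   /* no BOM found */
    UTF16_ORDER_BIG,       /* bytes FE FF  */
    UTF16_ORDER_LITTLE     /* bytes FF FE  */
} Utf16Order;

/* Inspects the first two bytes of a buffer for a Byte Order Mark. */
Utf16Order utf16_detect_bom(const uint8_t *buf, size_t len)
{
    if (buf == NULL || len < 2)
        return UTF16_ORDER_UNKNOWN;
    if (buf[0] == 0xFE && buf[1] == 0xFF)
        return UTF16_ORDER_BIG;
    if (buf[0] == 0xFF && buf[1] == 0xFE)
        return UTF16_ORDER_LITTLE;
    return UTF16_ORDER_UNKNOWN;
}

/* Reads one 16-bit code unit in the given order. An unknown order is
   treated as Big-Endian, the default the Unicode standard assigns to
   UTF-16 data that has no BOM. */
uint16_t utf16_read_unit(const uint8_t *p, Utf16Order order)
{
    assert(p != NULL);
    if (order == UTF16_ORDER_LITTLE)
        return (uint16_t)(p[0] | (p[1] << 8));
    return (uint16_t)((p[0] << 8) | p[1]);
}
```

Because the decoder assembles each value from individual bytes, it produces the same characters on Power PC and x86 hosts, which is the property that makes BOM-tagged text files portable between the two platforms.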
BBEdit 8.0, which was released in late 2004, uses the full Unicode conversion and rendering features of Mac OS X. These APIs automatically read a file's encoding scheme and manage the data transfers and file I/O appropriately. By choosing to use Unicode early on, the Bare Bones Software engineers not only expanded the number of languages the editor could support, but also avoided what could have been a serious problem with reading and writing files when migrating to the x86-based Mac platform.
Another key revision made in BBEdit 8.0's code design was that the application began using Mac OS X's Preference Services API, rather than storing binary data in a custom resource. This modification also side-stepped the Endian problem.
The biggest impact of the Endian issues was in BBEdit's UI, particularly its menus. The menu choices were stored as tabular data in a custom resource. A byte-swapping routine had to be written to massage the menu data so that it was properly organized in memory for the x86 platform.
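A routine of this kind might look like the sketch below. The layout of BBEdit's menu resource isn't described here, so the MenuEntry structure, its fields, and all function names are hypothetical stand-ins; only the technique, swapping each multibyte field of big-endian data in place when the host is little-endian, is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table row: 8 bytes, stored big-endian in the resource. */
typedef struct {
    uint16_t menu_id;
    uint16_t item_index;
    uint32_t command_id;
} MenuEntry;

/* Reverses the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverses the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v << 24) | ((v & 0x0000FF00u) << 8) |
           ((v >> 8) & 0x0000FF00u) | (v >> 24);
}

/* Returns 1 on a little-endian host (x86), 0 on a big-endian one. */
int menu_host_is_little_endian(void)
{
    const uint16_t probe = 0x0001;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Converts a table loaded from big-endian resource data to host
   order, in place. On a big-endian host the data is already correct
   and is left untouched. */
void menu_table_big_to_host(MenuEntry *entries, size_t count)
{
    size_t i;
    assert(entries != NULL || count == 0);
    if (!menu_host_is_little_endian())
        return;
    for (i = 0; i < count; i++) {
        entries[i].menu_id    = swap16(entries[i].menu_id);
        entries[i].item_index = swap16(entries[i].item_index);
        entries[i].command_id = swap32(entries[i].command_id);
    }
}
```

Note that the swap has to be done field by field, according to each field's size; reversing the whole 8-byte record at once would scramble the fields. That is why such a routine needs knowledge of the resource's layout and can't be applied blindly by the OS.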
Interestingly, the plug-in mechanism was unaffected by the Endian issue. Since this mechanism passes data structures between methods, and these methods access the structures at their native data sizes, the order of the bytes didn't matter. However, older plug-in modules that used the PEF binary format had to be recompiled into the Mach-O format, as that is the executable binary format both Mac OS X platforms will use going forward.
The only code compatibility issue that the BBEdit engineers had to fix was that Apple made a data structure in the Carbon Alias Manager (which is used to locate files by name only on a local drive) opaque. That is, the data structure's contents could no longer be accessed directly. Specifically, the userType field (used to specify the file's type) in an AliasRecord was made opaque. Rather than access that structure directly, you now use getter and setter functions to manage this information.
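The accessor pattern itself can be sketched in portable C. The code below does not use the Carbon Alias Manager; DemoAliasRecord and its two functions are hypothetical names invented for this illustration. It also assumes one plausible motive for making such a field opaque: If the record keeps the value in a fixed byte order, a getter and setter can convert to and from host order, so calling code is identical on both platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record. In a real opaque design this definition would
   be hidden in the implementation file; it is visible here only to
   keep the sketch in one piece. The field is always stored
   big-endian, whatever the host. */
typedef struct {
    uint8_t user_type_be[4];
} DemoAliasRecord;

/* Getter: decodes the stored bytes into a host-order value. */
uint32_t demo_alias_get_user_type(const DemoAliasRecord *rec)
{
    assert(rec != NULL);
    return ((uint32_t)rec->user_type_be[0] << 24) |
           ((uint32_t)rec->user_type_be[1] << 16) |
           ((uint32_t)rec->user_type_be[2] << 8)  |
            (uint32_t)rec->user_type_be[3];
}

/* Setter: encodes a host-order value into the stored bytes. */
void demo_alias_set_user_type(DemoAliasRecord *rec, uint32_t type)
{
    assert(rec != NULL);
    rec->user_type_be[0] = (uint8_t)(type >> 24);
    rec->user_type_be[1] = (uint8_t)(type >> 16);
    rec->user_type_be[2] = (uint8_t)(type >> 8);
    rec->user_type_be[3] = (uint8_t)type;
}
```

Code that previously read or wrote the field directly changes to a call such as demo_alias_get_user_type(&rec), and no longer needs to know how the value is laid out in memory.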
It took the BBEdit engineers about 24 hours to make the changes described here so that BBEdit executed on the x86-based Mac platform. What took much longer was the regression testing that thoroughly stressed every feature of the application to ensure that subtle side-effects due to Endian issues or the APIs weren't overlooked. This rigorous testing also had to verify that any changes to the code to facilitate the port didn't adversely affect the execution of the Power PC version of the application. In all, testing took several weeks. At the end of this testing, Bare Bones Software announced BBEdit 8.2.3, which is now a universal binary.
BBEdit is a sophisticated application that has survived several platform ports over the course of a decade. Despite its complex internal workings and equally complex history, the BBEdit port went smoothly. Solid code design was the key factor that made the code port a success. See the accompanying textbox entitled "Best Programming Practice Rules" for a list of good programming practices that BBEdit's engineers followed to ensure the writing of quality code.
Siegel summarizes the situation: "There were no unexpected bumps during the migration process, particularly when you consider we were using prerelease tools on a prerelease OS." Bare Bones Software's porting experience augurs well for the migration of other Power PC-based Mac applications to the x86-based Mac platform.
DDJ