Columns


CUG New Releases

JMODEM, JPEG, and GZIP

Victor R. Volkman


Victor R. Volkman received a BS in Computer Science from Michigan Technological University. He has been a frequent contributor to The C Users Journal since 1987. He is currently employed as Senior Analyst at H.C.I.A. of Ann Arbor, Michigan. He can be reached by dial-in at the HAL 9000 BBS (313) 663-4173 or by Usenet mail to sysop@hal9k.com.

Introduction

As the new Acquisitions Editor for the C Users' Group Library, I'll be introducing you to the best new volumes of shareware and freeware programming tools over the coming months. Each of these volumes includes full source code in C or C++ (sometimes a little bit of assembly language too). Each of the volumes can be incorporated into your application or used as the basis of an entirely new application. Licensing restrictions vary with each volume, but keep in mind the authors have submitted their source code for one purpose: to help you.

This month, I'll introduce three new volumes:

JMODEM: CUG 380 (one disk)

JMODEM, by Richard B. Johnson (Beverly, MA), is the definitive version of this innovative file-transfer protocol. Johnson wrote and distributed the first version of JMODEM in 1989. Although he wrote the original in assembly language, he has since rewritten it in C and made the source freely available. Johnson hopes that the source will "make it easier for software developers throughout the world to use this very useful protocol."

The CUG distribution includes a pre-built JMODEM MS-DOS executable that can be used as is. Johnson has taken much care to ensure that it behaves well as an externally shelled communications protocol driver. Because it's a protocol driver, you may simply add it to your existing upload/download protocol menu in Procomm, Telix, Commo, and other terminal programs. JMODEM also provides detailed installation instructions for BBS use.

Developers of their own communications programs can also integrate JMODEM support. JMODEM requires as little as 79KB RAM to run and can be built without any floating-point support libraries. The current version runs only on MS-DOS and has been successfully built with Microsoft C and Borland Turbo C.

Why does the world need another file transfer protocol? In short, JMODEM provides more intelligent block sizing, data compression, and CRC support than the more established protocols. Many of the older protocols, such as XMODEM and YMODEM variants, were designed when 1200 baud was state-of-the-art. To accommodate this low baud rate, they use fairly small block sizes, typically ranging from 128 to 1,024 bytes. The older protocols also require the receiver to acknowledge receipt of each block before another block is transmitted. Together, the acknowledgment overhead and block size limitations can cut the effective transmission rate by as much as 25 percent of the maximum throughput.

JMODEM avoids the block size and acknowledgment logjam by allowing block sizes to grow to 8,192 bytes. JMODEM starts with a block size of 512 bytes. With each successive correct transmission, it doubles the block size to 1,024, 2,048, 4,096, and finally 8,192 bytes. Similarly, each successive incorrect transmission causes it to halve the previous block size. If necessary, JMODEM will drop to block sizes as small as 64 bytes, allowing transmission even in a high-noise environment.
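The doubling and halving rule can be sketched in a few lines of C. This is only an illustration of the behavior described above, not code from the JMODEM distribution; the function name next_block_size and the BLOCK_* constants are my own.

```c
#include <stddef.h>

/* Limits taken from the description in the text. */
#define BLOCK_MIN     64u
#define BLOCK_START  512u
#define BLOCK_MAX   8192u

/* Hypothetical helper: return the size to use for the next block,
 * given the current size and whether the last block arrived intact. */
size_t next_block_size(size_t current, int last_block_ok)
{
    if (last_block_ok) {
        /* Success: double, but never exceed the 8,192-byte ceiling. */
        return (current * 2 > BLOCK_MAX) ? BLOCK_MAX : current * 2;
    }
    /* Failure: halve, but never drop below the 64-byte floor. */
    return (current / 2 < BLOCK_MIN) ? BLOCK_MIN : current / 2;
}
```

A sender would begin with BLOCK_START and call next_block_size() after every acknowledgment, so a clean line reaches the maximum block size after only four good blocks.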

JMODEM remains one of the few file transfer protocols with built-in data compression. Specifically, JMODEM will apply Run-Length Encoding (RLE) to blocks which are compressible enough (under RLE) to make the operation worthwhile. When handling pre-compressed data, such as ZIPped files, JMODEM automatically disables its own compression process. Realistically, RLE can significantly increase throughput when used with older 2400 baud (v.22bis) modems on uncompressed data; however, RLE cannot beat modern hardware-implemented compression protocols such as Microcom Networking Protocol (MNP 5) and the CCITT v.42bis LZW compression. (MNP 5 typically achieves 2:1 compression and v.42bis achieves 4:1 compression.)
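To make the "compress only when worthwhile" idea concrete, here is a minimal RLE sketch. The encoding it uses (an escape byte followed by a count and a value) and the names rle_encode, RLE_SENTINEL, and RLE_MIN_RUN are assumptions chosen for illustration; JMODEM's actual wire format may differ.

```c
#include <stddef.h>

#define RLE_SENTINEL 0xBB  /* assumed escape byte, chosen for illustration */
#define RLE_MIN_RUN  4     /* shorter runs cost more to encode than to copy */

/* Encode src[0..n) into dst (capacity cap) as literal bytes plus
 * (sentinel, count, value) triples. Returns the encoded length, or 0 when
 * the result would not be smaller than the input (or would not fit), in
 * which case the caller should send the block uncompressed. */
size_t rle_encode(const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap)
{
    size_t in = 0, out = 0;

    while (in < n) {
        unsigned char value = src[in];
        size_t run = 1;

        /* Measure the run of identical bytes; the count must fit in a byte. */
        while (in + run < n && run < 255 && src[in + run] == value)
            run++;

        if (run >= RLE_MIN_RUN || value == RLE_SENTINEL) {
            /* A literal sentinel byte must always be escaped as a triple. */
            if (out + 3 > cap || out + 3 >= n)
                return 0;
            dst[out++] = RLE_SENTINEL;
            dst[out++] = (unsigned char)run;
            dst[out++] = value;
            in += run;
        } else {
            /* Copy one literal byte; any remaining short run is rescanned. */
            if (out + 1 > cap || out + 1 >= n)
                return 0;
            dst[out++] = value;
            in++;
        }
    }
    return out;
}
```

With this scheme, a block of eight identical bytes shrinks to three, while a block with no runs makes rle_encode() return 0. That early exit is the reason already-compressed data such as ZIP files passes through untouched.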

Last, JMODEM provides a 16-bit Cyclic Redundancy Check (CRC) for further protection against transmission errors. Older protocols, such as XMODEM, often rely on checksums to guard against accidental transmission errors. Some protocols, such as ZMODEM, do offer even higher protection via 32-bit CRCs. However, the 16-bit CRC is more than sufficient, letting only about 1 in 2^16 (65,536) randomly corrupted blocks slip through undetected.
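The difference between a checksum and a CRC is easy to demonstrate. The sketch below assumes the CCITT polynomial (0x1021) with a zero initial value, as used by XMODEM-CRC; I have not verified that JMODEM uses the same polynomial, so treat crc16_ccitt and checksum8 as illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* Simple 8-bit additive checksum, in the style of the original XMODEM. */
unsigned char checksum8(const unsigned char *data, size_t len)
{
    unsigned char sum = 0;
    while (len--)
        sum = (unsigned char)(sum + *data++);
    return sum;
}

/* Bitwise CRC-16 using the CCITT polynomial x^16 + x^12 + x^5 + 1
 * (0x1021) and an initial value of 0. The polynomial is an assumption. */
uint16_t crc16_ccitt(const unsigned char *data, size_t len)
{
    uint16_t crc = 0;

    while (len--) {
        /* Fold the next byte into the high half of the register. */
        crc ^= (uint16_t)(*data++ << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

An additive checksum cannot tell "AB" from "BA" because addition ignores byte order, whereas the CRC of the two strings differs. For the standard test string "123456789", this CRC variant yields 0x31C3.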

JPEG: CUG 381 (one disk)

JPEG Software, by Thomas G. Lane (The Independent JPEG Group), is a complete JPEG image compression and decompression system. JPEG (pronounced "jay-peg") is a standardized compression method for full-color and gray-scale images. JPEG originated from a desire to efficiently handle photographic images. The image can be computer-generated (e.g., a fractal landscape) or captured from an endless variety of sources, including scanners and video capture boards.

The JPEG Software distribution source code is written entirely in C. You can compile it on many platforms, including IBM compatibles, Amiga, Macintosh, Atari ST, DEC VAX/VMS, Cray Y/MP, and most UNIX platforms. Supported UNIX platforms include, but are not limited to, Apollo, HP-UX, SGI Indigo, and SUN Sparcstation. The make system even includes a utility to convert the ANSI-style C code back to older K&R-style.

JPEG differs considerably from file formats such as PCX, GIF, and TIFF, which must reproduce 100 percent of the original image data. Rather, JPEG is "lossy" in that the output image is not necessarily identical to the input image. Applications requiring exact correspondence between input and output bits, such as engineering blueprints, are thus inappropriate for JPEG. However, on typical photographic images, JPEG delivers very good compression levels without visible change. Additionally, with JPEG you can achieve amazingly high compression if you can tolerate a low-quality image. You can trade off image quality against file size by adjusting the compressor's "quality" setting.

By default, the JPEG Software distribution code builds a command-line-driven translator. The currently supported image file formats are: PPM (PBMPLUS color format), PGM (PBMPLUS gray-scale format), GIF, Targa, and RLE (Utah Raster Toolkit format). RLE is supported only if the URT library is available. The compression program, cjpeg, recognizes the input image format automatically, with the exception of some Targa-format files. The decompression program, djpeg, requires you to specify the target file format. The only JPEG file format currently supported is the JFIF format. Support for the TIFF 6.0 JPEG format will probably be added at some future date.

You may incorporate any or all of the JPEG Software source code into your own applications. The only restriction is that you must include a small notice stating: "This software is based in part on the work of the Independent JPEG Group."

GZIP: CUG 382 (one disk)

GZIP, by Jean-loup Gailly (Rueil-Malmaison, France), is a general-purpose archiving and compression utility. Rather than introducing yet another compression file format, GZIP seeks to unite the myriad existing compression methods. GZIP will automatically detect and uncompress files created by Phil Katz's PKZIP and compatible zip methods; it also handles UNIX-derived "pack" (Huffman encoding) and "compress" (LZW) files.
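Automatic detection of this kind usually relies on the "magic" bytes at the start of each file. The following sketch shows the idea using the well-known signatures of the formats mentioned above; the enum and the function name detect_format are hypothetical and are not taken from the GZIP source.

```c
#include <stddef.h>

typedef enum {
    FMT_UNKNOWN,
    FMT_GZIP,      /* gzip's own format */
    FMT_COMPRESS,  /* UNIX compress (LZW) */
    FMT_PACK,      /* UNIX pack (Huffman) */
    FMT_PKZIP      /* PKZIP local file header */
} archive_format;

/* Classify a file by the signature bytes at the start of its contents. */
archive_format detect_format(const unsigned char *buf, size_t len)
{
    if (len >= 2 && buf[0] == 0x1F) {
        switch (buf[1]) {
        case 0x8B: return FMT_GZIP;
        case 0x9D: return FMT_COMPRESS;
        case 0x1E: return FMT_PACK;
        default:   break;
        }
    }
    /* PKZIP entries begin with the four bytes 'P', 'K', 3, 4. */
    if (len >= 4 && buf[0] == 'P' && buf[1] == 'K' &&
        buf[2] == 3 && buf[3] == 4)
        return FMT_PKZIP;

    return FMT_UNKNOWN;
}
```

A caller would read the first four bytes of the input file, pass them to detect_format(), and dispatch to the matching decompression routine.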

GZIP v1.2.2 (released 6/93) supports many platforms, including MS-DOS, OS/2, Atari, Amiga, and DEC VAX/VMS. GZIP works well with most UNIX workstations including those compatible with NeXT, MIPS, SGI Indigo, and Sun Sparcstations. On the MS-DOS platform, GZIP is only guaranteed to work with Microsoft C 5.0 (or later) and Borland Turbo C 2.0 (or later). By default, GZIP builds for the MS-DOS Compact memory model. An additional compilation flag allows for the Large memory model. Other memory model variations might be possible as well, but GZIP specifically disclaims the ability to build for the Small model.

GZIP uses the well-known Lempel-Ziv encoding method as published in the IEEE Transactions on Information Theory (1977). By using Lempel-Ziv, GZIP avoids the patented algorithms prevalent in other implementations. GZIP reduces the size of typical text, such as English or C source code, by about 60-70 percent. Files which are already significantly compressed, such as GIF graphics and VOC audio files, undergo far less reduction. The algorithm is implemented entirely in C, with one exception: the routines that do the performance-critical, longest-string matching have been rewritten in platform-specific assembly language.
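To see why longest-string matching dominates the running time, consider a brute-force version of the search at the heart of 1977-style Lempel-Ziv coding. This is a deliberately naive sketch, not GZIP's routine (which relies on much faster lookup techniques); the lz_match type and the longest_match signature shown here are my own.

```c
#include <stddef.h>

typedef struct {
    size_t distance;  /* how far back the match starts; 0 means no match */
    size_t length;    /* number of matching bytes */
} lz_match;

/* Search the 'window' bytes preceding buf[pos] for the longest string
 * that matches the data starting at buf[pos]. A match may run past pos
 * into the bytes being encoded, which sliding-window schemes permit. */
lz_match longest_match(const unsigned char *buf, size_t len, size_t pos,
                       size_t window, size_t max_len)
{
    lz_match best = {0, 0};
    size_t start = (pos > window) ? pos - window : 0;

    for (size_t cand = start; cand < pos; cand++) {
        size_t n = 0;

        /* Count how many bytes agree, staying inside the buffer. */
        while (n < max_len && pos + n < len && buf[cand + n] == buf[pos + n])
            n++;

        if (n > best.length) {
            best.length = n;
            best.distance = pos - cand;
        }
    }
    return best;
}
```

For the input "abcabcabcd" at position 3, the search finds a match of length 6 at distance 3, so the encoder can replace six bytes with a single (distance, length) pair. Because every candidate position is compared byte by byte, the cost grows with the window size, which helps explain why this loop is the one worth hand-coding in assembly language.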

GZIP is licensed under the GNU General Public License. Therefore, if you plan on incorporating source code from GZIP in your own product, you must make your source code readily available to your users. However, you are still free to distribute the unmodified version of GZIP.EXE and bundle GZIP-archived files with your software.

Call for Submissions

The C Users' Group Library needs your source code to support the C Users' Group mission: to help developers help each other. Currently, the CUG Library offers nearly 300 volumes of source code covering all application areas and all platforms. Applications include function libraries, disassemblers, compilers, text editors, text filters, communications support, text formatters, interpreters, bulletin boards, windowing systems, games, tutorials, math packages, cross-compilers, pre-compilers, and more.

Platforms include all forms of UNIX as well as MS-DOS, Windows 3.x, and OS/2 for PCs. The Atari, Amiga, and Macintosh platforms also have support. Source disks are distributed at minimal cost to users (only $4 per disk). The source will also be made available on CD-ROMs. If your source code is potentially useful to other developers, we want it.

Submission Policy

CUG is interested in all user-supported C and C++ source code. Programs need not be new and unique, nor massive, to be useful to other members. Many times even minor modifications of existing library programs are important to other members, especially if the modifications improve the portability of the code.

CUG accepts submissions only from the author or copyright holder. All submissions must be accompanied by the Author's Release Form. In part, the Author's Release is designed to protect the interest of members who want to restrict for-profit distribution of their product. For tax purposes, CUG is a service of R&D Publications, Inc., a for-profit Kansas corporation.

We had originally intended to organize as a non-profit corporation, but found that it was an unbelievable hassle. We hope the Author's Release Form will allow authors to clearly authorize distribution by CUG while at the same time protecting their residual rights. We make every effort to respect the intentions of the submitting author when distributing software. For a copy of the Author's Release Form, please contact Victor Volkman through any of the methods listed later in this article.

Guidelines for Submission

If you (or perhaps several authors) have placed restrictions on your material, include all the restrictions prominently in the documentation.

Though you may provide documentation of your software in any file format, always include plain ASCII copies of the documentation as well.

Include a one- or two-paragraph summary of the archive contents along with a longer (two-to-ten pages) description detailed enough to help members decide whether the submission is of use to them.

Unless you are serious about policing users of your software, do not place restrictions on its reuse. Reuse restrictions are difficult to enforce, and in the long term, only reduce the credibility of your restrictions.

How to Submit Your Code