Programming the QUANTUMdsp

Downloadable microcode makes softcoding a reality

Charles A. Mirho

Charles is a consultant specializing in multimedia and telephony. He can be reached on CompuServe at 70563,2671.

Many conventional multimedia boards are hard-function devices: their capabilities are defined by ROM chips (or ROM embedded in microcontrollers) planted on the board at the factory. The problem with the hard-function approach is that technology, particularly in the areas of audio compression and telecommunications, marches ruthlessly on. How state-of-the-art is that 8-bit SoundBlaster board you purchased two years ago, or that three-year-old modem? New compression schemes offering improved quality and higher ratios are constantly being invented. Standards evolve. Modem and fax technology advances toward higher bit rates. As a result, hard-function hardware becomes less than state-of-the-art within months of its manufacture. The QUANTUMdsp board from Communication Automation & Control offers a solution to this dilemma. The microcode that other multimedia and telephone boards hardcode into ROM is encapsulated into disk files. These disk files are downloaded to the board's local RAM as needed. The QUANTUMdsp's capability at any given time is defined by the contents of this RAM. Therefore, the board can be decoding a JPEG image while simultaneously playing MPEG-coded audio and rotating multiple 3-D wireframes. Or it can be synthesizing MIDI while answering the telephone, simply by replacing the contents of RAM with a new set of functions. The great advantage of this approach is that pieces of the microcode can be combined and configured on the fly. What's more, updating the microcode disk files updates the board's capabilities. For instance, the initial beta version I had contained a disk file with microcode to emulate a 9600-bps modem. Within weeks I received a floppy disk with new microcode to emulate a 14.4-Kbps modem. I copied the new file over the old one, and I had a working 14.4-Kbps modem running from the Windows Terminal program.

The QUANTUMdsp board supports advanced audio-compression formats such as MPEG, G.722, G.728, and subband coding, as well as the better-known mLaw, aLaw, and ADPCM compressions. The board includes a general-MIDI synthesizer, baseline 24-bit JPEG decoder, 14.4-Kbps analog modem, Class I fax, speaker-independent voice recognition, and a fast, graceful means of rotating 3-D wireframes. And, as just mentioned, many of these are available simultaneously.

Sound Familiar?

Historically, this isn't a new approach. Desktop printers followed a similar evolution during their early stages. Over the last 15 years, printers evolved away from hard function. Early printers had only a built-in set of fonts (still true of low-end printers). When an application used the printer, it was stuck with these internal fonts. If, later on, the application required better fonts, the user had to purchase a new printer, or possibly upgrade the printer's ROM chip to include the new fonts. Then laser printers came along, and everything changed. If an application required a font not included in the printer's internal set, it simply downloaded a suitable "soft font" to the printer. This flexibility was a strong influence in the rise of desktop publishing.

A library of microcode files (the manual calls them "modules") is stored in a separate directory. These modules can be combined sequentially using a simple scripting language. Example 1, for instance, shows a script file for playing linear, 16-bit audio at 8000 samples per second. Linear data exists in the raw, 16-bit native format of the A/D converter that sampled it. No special processing is required to play linear data; it's moved directly from the source to the speaker. Linear data is the exception, however; audio data is usually coded, which means each sample has been compressed. (Without compression, audio files can consume 2 to 50 times more space on a hard drive.) The script in Example 1 offers some insight into the board's inner workings. The first two lines declare a flow-control buffer. A Windows application can get a pointer to this buffer and fill it with audio data from a disk file. (We will see how this is done in a moment.) The flow-control buffer is needed for two reasons. First, the Windows application runs on the PC (Intel) processor under cooperative multitasking, which means its timing is nondeterministic. That is, it is subject to unpredictable delays. The module (fta in the script) is responsible for feeding data to the speaker at exactly 8000 samples per second. That means having a steady supply of audio data on hand at all times; delays are unacceptable, unless you enjoy gaps and hiccups in your audio. The module cannot be left to the mercy of the Windows program, since the Windows program is at the mercy of other Windows programs, ISRs, the Windows memory manager, and so forth.

The second reason for the flow-control buffer is a bit more complex and involves signal theory. Essentially, the problem is that it is most efficient for programs to read large chunks of data at a time from disk files. The Windows program, for the sake of efficiency, would like to read between 4 Kbytes and 32 Kbytes at a time from the audio file. The DSP module, on the other hand, must execute in real time. (The playing of audio is inherently real-time in nature, since we hear sound in real time.) For this reason the DSP module can only deal with tiny chunks of data at any one time (typically between 50 and 3000 bytes). The flow-control buffer lets both the Windows application and the DSP module have it their way: the app feeds large chunks of data into one end of the buffer at a rate it is comfortable with, while the DSP module takes data in small blocks from the other end at its own rate. For this to work, the buffer needs two ends: a head and a tail. Thus the flow-control buffer is implemented as a FIFO data queue.

The FIFO is a circular buffer with associated read and write pointers. The read pointer marks the spot in the FIFO where the next read operation will find data. The write pointer marks the spot where the next write operation will put data. In Figure 1, the area indicated in light green contains data. Reading this data increments the read pointer. Writing more data increments the write pointer. The FIFO is full when the write pointer is one position to the left of the read pointer. The FIFO is empty when the read and write pointers are equal.
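The pointer rules just described can be sketched in a few lines of C. This is an illustration only, not the board's implementation: the Fifo structure, the 16-byte FIFO_SIZE, and the fifo_ function names are all hypothetical, and array indices stand in for the read and write pointers.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_SIZE 16   /* illustrative size; Example 1 declares 40000 bytes */

/* Hypothetical layout; the board's real FIFO structure is not shown here. */
typedef struct {
    unsigned char data[FIFO_SIZE];
    size_t rd;   /* where the next read will find data  */
    size_t wr;   /* where the next write will put data  */
} Fifo;

static void fifo_init(Fifo *f)
{
    assert(f != NULL);
    f->rd = f->wr = 0;            /* equal pointers mean "empty" */
}

/* Empty when the read and write pointers are equal. */
static int fifo_empty(const Fifo *f) { return f->rd == f->wr; }

/* Full when the write pointer is one position behind the read pointer. */
static int fifo_full(const Fifo *f) { return (f->wr + 1) % FIFO_SIZE == f->rd; }

/* Number of bytes waiting to be read. */
static size_t fifo_count(const Fifo *f)
{
    return (f->wr + FIFO_SIZE - f->rd) % FIFO_SIZE;
}

/* Write one byte; returns 0 if the FIFO is full. */
static int fifo_put(Fifo *f, unsigned char b)
{
    if (fifo_full(f)) return 0;
    f->data[f->wr] = b;
    f->wr = (f->wr + 1) % FIFO_SIZE;   /* writing increments the write pointer */
    return 1;
}

/* Read one byte; returns 0 if the FIFO is empty. */
static int fifo_get(Fifo *f, unsigned char *b)
{
    if (fifo_empty(f)) return 0;
    *b = f->data[f->rd];
    f->rd = (f->rd + 1) % FIFO_SIZE;   /* reading increments the read pointer */
    return 1;
}
```

Keeping one position unused is what lets "full" be told apart from "empty"; a buffer declared with FIFO_SIZE bytes therefore holds at most FIFO_SIZE - 1 bytes in this sketch.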

The third line in the script declares a channel, a connection to an external device (such as a speaker, microphone, or telephone line). Channels have two other characteristics besides the device type: the sampling rate and the device number. The sampling rate is simply the number of samples per second that will pass through the channel. The device number specifies which device of the specified type you wish to connect to. If two speakers are connected to the board, you could direct the audio to the first speaker by setting the device number to 1. A channel declaration of the form device.type_sample.rate_device.number defines all of these characteristics. Therefore, AudOut_8_1 in Example 1 specifies an audio output device (a speaker) running at 8000 samples per second and connected as the first speaker (there may be more than one).

Finally, the module itself is declared. The module fta is nothing more than a piece of DSP microcode. The definition of modules in the script is object oriented; each module is treated as a "black box" with inputs, outputs, and possibly, controls. The fta module is simply designed to take "bites" of its input and move them to its output. The input of fta is connected to the flow control buffer, and the output is connected to the channel. fta bites off 80 samples at a time from the flow-control buffer and moves them out over the channel where they reach the speaker to be perceived as sound. The DSP always reads 100 blocks of data per second for a total throughput of 8000 samples per second.
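The numbers in this paragraph can be checked with a little arithmetic. The sketch below is only a worked calculation under stated assumptions: the function names are hypothetical, the audio is assumed to be mono, and the 40,000-byte figure is the FifoSize from Example 1.

```c
#include <assert.h>

/* Blocks the DSP module must move each second. */
static long blocks_per_second(long sample_rate, long block_samples)
{
    assert(block_samples > 0);
    return sample_rate / block_samples;
}

/* Size in bytes of one block taken from the flow-control buffer. */
static long block_bytes(long block_samples, long bytes_per_sample)
{
    return block_samples * bytes_per_sample;
}

/* Milliseconds of audio held by a completely full flow-control buffer
   (mono assumed; a stereo stream would halve the result). */
static long fifo_millis(long fifo_bytes, long sample_rate, long bytes_per_sample)
{
    assert(sample_rate > 0 && bytes_per_sample > 0);
    return fifo_bytes * 1000L / (sample_rate * bytes_per_sample);
}
```

For 16-bit samples at 8000 samples per second, blocks_per_second(8000, 80) gives 100 and block_bytes(80, 2) gives 160 bytes per block. fifo_millis(40000, 8000, 2) gives 2500, so a full buffer cushions the DSP against roughly two and a half seconds of delay on the Windows side. At 44,100 samples per second the same buffer holds a little under half a second.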

Taken in its entirety, the script defines the flow of audio data from the Windows application to the first speaker connected to the board. Graphically, the data flow looks like Figure 2.

The module library is full of simple, useful modules like fta. The modules can be combined sequentially in a script to form more-complex multimedia functions. However, such examples are beyond the scope of this article.

Change is Easy

You may be wondering if all of this scripting is worth the trouble. After all, playing 16-bit uncompressed audio at 8000 samples per second is hardly a monumental feat. Example 2, however, shows how simple it is to modify the script for playing audio at 44,100 samples per second. The only difference is in the channel declaration, AudOut_44_1. The number 44 is shorthand for 44,100, just as 8 is shorthand for 8000.
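To make the naming convention concrete, here is a small C sketch that splits a channel name into its device type, sampling rate, and device number. It is not part of the board software: the Channel structure and the function names are hypothetical, and only the two rate shorthands mentioned in the text (8 and 44) are mapped, since the others are not given here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char type[16];   /* device type, such as "AudOut" or "AudIn" */
    long rate;       /* samples per second; 0 if the shorthand is not known */
    int  number;     /* device number, starting at 1 */
} Channel;

/* Expand the rate shorthand. Only the values named in the text are mapped. */
static long rate_from_shorthand(int code)
{
    switch (code) {
    case 8:  return 8000L;
    case 44: return 44100L;
    default: return 0L;      /* unknown shorthand (assumption: report as 0) */
    }
}

/* Split a name of the form type_rate_number; returns 1 on success. */
static int parse_channel(const char *name, Channel *ch)
{
    int code, number;

    assert(name != NULL && ch != NULL);
    if (sscanf(name, "%15[^_]_%d_%d", ch->type, &code, &number) != 3)
        return 0;
    ch->rate = rate_from_shorthand(code);
    ch->number = number;
    return 1;
}

/* True for an output (speaker) channel. */
static int channel_is_output(const Channel *ch)
{
    return strcmp(ch->type, "AudOut") == 0;
}
```

Parsing "AudOut_8_1" with this sketch yields the type AudOut, a rate of 8000, and device number 1; "AudOut_44_1" differs only in the rate, which is exactly the one-token change between Example 1 and Example 2.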

Suppose you wanted to record audio at 44,100 samples per second instead of playing it. You simply modify the script to reverse the flow of data; see Example 3. There are two changes to the script in Example 3. The first is the channel declaration, which changes the device type from AudOut (specifying a speaker) to AudIn (specifying a microphone). The second difference is that the fta module has been replaced with the module atf, which works like fta but in reverse; it accepts blocks of data from the channel and moves them to the flow-control buffer. Graphically, the flow of data looks like Figure 3. To add or remove compression from the audio stream, a coder/decoder module could be added in series with either the atf or fta module. The module library is full of useful coders and decoders.

Some C Required

Returning to the example of playing 16-bit, linear audio at 8000 samples per second, on the Windows side, a C program is required to read the data file and move the audio data into the flow-control buffer so that it can be sent out over the channel. Listing One shows the complete program. The program has the structure of a DOS program and is compiled and linked for QuickWin. This isn't a requirement, but I use it for the benefit of DOS programmers making the transition to Windows. QuickWin programs are much easier to follow than conventional Windows programs, while still illustrating all the important concepts.

The program begins with a list of necessary header files. In addition to the standard C header files, the file vclib.h is included. This header defines the API to the multimedia board. You can easily spot board-specific functions in the example; they are all prefixed by the letters vc.

The first call is to function vcAddTask, which loads the script file. The microcode from any disk modules in the script is downloaded to the board at this point. The 40,000-byte flow-control buffer defined in the script is also allocated. The first parameter to vcAddTask is the DSP number on which the script will execute. Each board contains a single floating-point 32-bit DSP running at 55 MHz. This should be enough for all but the most intensive applications. If you need more power, however, up to four boards can be added in a single machine. The second parameter is the name of the script file to load. Through its third parameter, the function returns a handle which will be used to identify this script in future function calls. (A single program can load many scripts simultaneously.)

At this point the script is loaded and idle. The example calls the functions vcGetFifoHandle and vcInitFifo. The function vcGetFifoHandle returns a handle to the FIFO buffer, just as the Windows memory-allocation functions return handles to memory blocks. This handle will be dereferenced shortly when it is time to move data from the file into the FIFO. The call to vcInitFifo simply sets the FIFO read pointer equal to the value of the FIFO write pointer, indicating that no data is available in the FIFO (the empty state).

Next the data file is opened. This data file contains audio data sampled at 8000 samples per second. After opening the data file, Listing One calls the function DiskToFifo. This isn't a board function but rather a local function for moving data from the audio file to the FIFO. We will see how the function does this in a moment; for now, suffice it to say that when the function returns, the FIFO is loaded with data and ready to go. The script is loaded and sitting idle with data in the buffer. All that remains is to call vcStartTask to get things started. vcStartTask takes a single parameter, the script handle returned by vcAddTask.

Inside DiskToFifo

Function DiskToFifo moves data from the audio file to the FIFO. It calls vcGetFifoWritePtr to dereference the FIFO handle returned by the earlier call to vcGetFifoHandle. vcGetFifoWritePtr returns two values: the number of bytes available for writing in the FIFO, in the variable lWriteCount, and a pointer to the write position in the FIFO, in the variable lpWrite. The next line of code limits the number of bytes moved to just under 32 Kbytes (0x7fff). This isn't a limitation of the board software but rather a choice in this example to avoid huge data moves that can cause problems in the Intel segmented architecture.

Data is read directly from the data file into the FIFO using the standard C language read function. Since data is read directly into the FIFO this way, the write count returned by vcGetFifoWritePtr must be the number of consecutive bytes available for writing in the FIFO (remember, FIFOs are circular buffers). Obviously, a function like read is "unaware" of the circular nature of the FIFO buffer, and so it is up to the program to supply the number of consecutive bytes.
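To see why the count must be consecutive bytes, consider how it could be derived from the read and write positions of a circular buffer. The function below is a hypothetical sketch, not the board software's code; it assumes index-based pointers and a buffer that keeps one position unused so that a full FIFO can be told apart from an empty one.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes that can be written in one unbroken run starting at wr,
   for a circular buffer of `size` bytes with read index rd. */
static size_t contiguous_write_count(size_t rd, size_t wr, size_t size)
{
    assert(size > 1 && rd < size && wr < size);
    if (wr >= rd) {
        /* Free space runs from wr to the end of the buffer. If rd is 0,
           the last position must stay unused, or wr would wrap onto rd
           and a full FIFO would look empty. */
        return size - wr - (rd == 0 ? 1 : 0);
    }
    /* Free space runs from wr up to one position before rd. */
    return rd - wr - 1;
}
```

With a 16-byte buffer, a read index of 4, and a write index of 10, the sketch reports 6 consecutive bytes (positions 10 through 15) even though 9 are free in total. A second call after the write pointer wraps to 0 would report the remaining 3. A single call to read can safely fill only one such run at a time.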

If an end-of-file condition is reached, indicating that all the data in the file has been moved to the FIFO, the read count is adjusted using the expression lReadCount &= ~0x3L.

This expression rounds down to the nearest multiple of four bytes. This is necessary because the last read may have reached an end-of-file, so the actual read count is something less than the number of bytes requested and is not necessarily a multiple of four. For example, if 0x7fff bytes were requested but only 1001 bytes remained in the file, then the read count will be 1001. This number must be rounded down to the nearest multiple of four bytes: 1001 &= ~0x3L becomes 1000. (An annoying anomaly of the FIFOs: they must be a multiple of four bytes in size, and must be read and written in multiples of four bytes.) Unfortunately, this can discard up to three bytes at the end of the file (one byte in this example), but that's not usually a problem when playing audio files. If it is essential to preserve the last bytes in the file, then pad bytes can be added, making the total read a multiple of four.
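Both options, rounding down and padding, come down to a couple of lines of bit arithmetic. The helper names below are hypothetical, and padding with zero bytes is an assumption on my part (zero is silence for signed 16-bit linear audio; a coded format might need a different pad value).

```c
#include <assert.h>
#include <string.h>

/* Round a byte count down to a multiple of four, as Listing One does. */
static long round_down4(long n)
{
    assert(n >= 0);
    return n & ~0x3L;
}

/* Alternative: pad with zero bytes up to the next multiple of four.
   buf must have room for up to three extra bytes past n.
   Returns the padded count. */
static long pad_to4(unsigned char *buf, long n)
{
    long padded;

    assert(buf != NULL && n >= 0);
    padded = (n + 3L) & ~0x3L;
    memset(buf + n, 0, (size_t)(padded - n));
    return padded;
}
```

round_down4(1001) returns 1000, matching the example in the text, while pad_to4 on the same 1001 bytes returns 1004 and preserves every byte read from the file.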

Even after the entire file is read into the FIFO, it isn't yet safe to shut down and exit the program. The entire file has not been played until the FIFO is empty. The flag donewriting is set to indicate that the file has been read into the FIFO and we are now waiting for the FIFO to play out. An if statement at the top of the function checks if the FIFO has played out by calling the function vcGetFifoReadCount. The FIFO read count will be 0 when the FIFO is empty, indicating that all data has been played.

The last statement in DiskToFifo is a call to the function vcUpdateFifoWritePtr. This function updates the FIFO's write pointer to account for the bytes just written. The FIFO pointers are not updated automatically when the FIFO is read or written, because there is no way for the board software to know how many bytes the standard C read function moved into the FIFO.

The main function calls DiskToFifo from a tight loop until the entire file has been played. When DiskToFifo returns 0, indicating that all data has been played, main calls vcDeleteTask with the handle to the script file. This unloads the script and frees the memory allocated for the FIFO.

Support for Standards

Developers who prefer standards will be happy to know that the board supports all MPC functions for the playing and recording of wave audio, as well as for MIDI (synthesis only). Technically, the QUANTUMdsp is not MPC compliant because it does not include a MIDI port. MIDI output messages are directed to the board's general-MIDI synthesizer (also a downloadable microcode file). I tested the board with both Sound Recorder and Media Player and both worked well. Table 1 shows the linear wave-audio formats supported by the board.

Most of the audio formats can mix with one another in any combination. That means two or more audio files of different sampling rates and sample sizes can play simultaneously. Try that with a SoundBlaster! Sample rate-converter modules in the microcode library allow two or more audio streams with different underlying sampling rates to mix on the same output channel. (The sample rate-converter modules are inserted automatically when the board software detects a format conflict.) Supported audio coder/decoders are G.722, G.728, MPEG Layer Two, ADPCM, mLaw, aLaw, and subband.

Analog modem and fax (Class I) capabilities are available by replacing the standard Windows COMM driver with the board's COMM driver. Once the driver is installed, the Windows Terminal program can be used to dial at rates from 14.4 Kbps right on down to 300 bps. Any Windows communication program that uses the standard COMM driver will work as well. Fax programs that use the Windows COMM driver (such as BitFax from BIT Software) also work, at 9600 bps. Demo programs for 3-D wireframes, 24-bit (16-million-color) JPEG still-image decompression, and an audio jukebox are included.

Conclusion

While the microcode library supports an impressive set of audio functions, there's relatively little in the way of image compression (only JPEG). The lack of a MIDI or joystick port is certainly a drawback for the music-composition and game markets. Motion video isn't supported, but a video daughterboard is planned for the future.

Still, the high-powered set of audio compressions is well suited to voice mail and teleconferencing applications. The fast JPEG decoder and MPEG audio compression should be very useful in top-end presentation packages. The telephone and fax features rival similarly priced, dedicated communication boards. But probably the most important feature of the board is the fact that the buyer is not committing to today's multimedia standards. As standards evolve, as compression formats improve, and as modem bit rates move up, you can expect updates in the form of disk files (courtesy of AT&T and third parties). This should double or possibly triple the useful life of the board over more-conventional hard-function approaches.

Example 1: Script to play linear, 16-bit audio at 8000 samples per second.

FifoSize:   40000              /* size of flow-control buffer */
Local: FlowBuffer FifoSize     /* declare flow control buffer */
Extern: AudOut_8_1             /* output channel */
fta( ILevel)
{
        fin     FlowBuffer
        aout    AudOut_8_1
}

Figure 1: The flow-control buffer.

Figure 2: Flow of audio data from a Windows application to the first speaker.

Example 2: Modifying the script in Example 1 to play linear, 16-bit audio at 44,100 samples per second.

FifoSize:   40000             /* size of flow-control buffer */
Local: FlowBuffer FifoSize    /* declare flow control buffer */
Extern: AudOut_44_1           /* output channel */
fta( ILevel)
{
        fin     FlowBuffer
        aout    AudOut_44_1
}

Example 3: Script to record linear, 16-bit audio at 44,100 samples per second.

FifoSize:   40000             /* size of flow-control buffer */
Local: FlowBuffer FifoSize    /* declare flow control buffer */
Extern: AudIn_44_1            /* input channel */
atf( ILevel)
{
        ain     AudIn_44_1
        fout    FlowBuffer
}

Figure 3: Flow of data when recording audio at 44,100 samples per second.

Table 1: Linear wave audio formats supported (almost all formats can mix).

                   PLAY                        RECORD
Sample      Mono         Stereo          Mono         Stereo
Rate    8 bit 16 bit  8 bit 16 bit   8 bit 16 bit  8 bit 16 bit
 8000     x     x       x     x        x     x       x     x
11025     x     x       x     x        x     x       x     x
16000     x     x       x     x        x     x       x     x
22050     x     x       x     x        x     x       x     x
24000     x     x       x     x        x     x       x     x
32000     x     x       x     x        x     x       x     x
44100     x     x       x     x        x     x       x     x
48000     x     x       x     x        x     x       x     x

For More Information

Communication Automation & Control
1642 Union Blvd., Suite 200
Allentown, PA 18103
800-367-6735

[LISTING ONE]


/*  QuickWin Audio Player */
#include <stdlib.h>
#include <stdio.h>
#include <conio.h>
#include <io.h>
#include <errno.h>
#include <sys\types.h>
#include <sys\stat.h>
#include <fcntl.h>
#include <vclib.h>      /* VCAS function prototypes */

int DiskToFifo(long hf, int fd);
static int fd, tidPLAY;

int main(void)
{
static long hf;

        if( vcAddTask( 1, "PLAY",&tidPLAY) < 0)  /* load script */
                return -1;
        if( vcGetFifoHandle( tidPLAY, "FlowBuffer", &hf) < 0)
                                        /* get handle to flow control buffer */
                return -1;
        if( vcInitFifo( hf) < 0) /* initialize (zero out) flow control buffer*/
                return -1;
        fd=open( "HELLO.L8", O_BINARY|O_RDONLY); /* open the audio data file */
        if(fd == -1)
        {
                printf("Cannot open HELLO.L8\n");
                vcDeleteTask (tidPLAY);     /* unload the script before quitting */
                return -1;
        }

        printf ("Hit return...\n");
        getchar();

        DiskToFifo( hf, fd);        /* put some data in the FIFO */
        if( vcStartTask( tidPLAY) < 0)  /* start playing the audio file */
                return -1;
        while(DiskToFifo(hf,fd)==1); /* play until done */
        vcDeleteTask (tidPLAY);     /* unload script and free the FIFO */
        close(fd);                  /* close input file */
        return 0;
}

int DiskToFifo(long hf, int fd)
{
long    lReadCount, lWriteCount, *lpWrite, li;
static  int donewriting = 0;    /* set once the whole file is in the FIFO */

        if (donewriting)
        {
            vcGetFifoReadCount( hf, &li);  /* if DSP has emptied the FIFO... */
            if(li==0L) return(0);          /* then quit */
            return 1;
        }
        printf(".");         /* print something to indicate activity */
        vcGetFifoWritePtr( hf, &lWriteCount, &lpWrite);
                                                  /* get FIFO write pointer */
        if(lWriteCount > 0x7fff) lWriteCount=0x7fff;
                                     /* limit data moves to 32K */
                                     /* read the disk directly into the FIFO */
        lReadCount= read( fd, (void*)lpWrite, (unsigned int)lWriteCount);
        if(lReadCount < 0) lReadCount = 0;  /* treat a read error as end of file */
        if(lReadCount < lWriteCount)  /* if disk is getting empty... */
        {
                lReadCount &= ~0x3L;           /* ensure 32-bit transfer */
                donewriting = 1;
        }
        vcUpdateFifoWritePtr( hf, lReadCount); /* update FIFO indices */
        return(1);
}
End Listing