Build Your Own RS-232 Sound System

Digital audio from your RS-232 port

Dennis Cronin

Dennis almost completed an EE degree before being lured into the sordid world of fast computers, easy money, and loose connections. He currently specializes in UNIX driver development for Solaris and HP-UX operating systems and can be contacted at denny@cd.com.

Have you ever thought about using the extra RS-232 port on your computer as an audio output? Probably not, right? However, it can be done. This article describes an unusual method of coding data that produces an audio signal when streamed to a speaker attached to the RS-232 port. Granted, the sound isn't CD quality, but it's still quite intelligible. I'll start by examining how to get an audio signal from what was only designed to be a serial data port. Then I'll look at a program that outputs a Microsoft Windows .WAV audio file to the comm port of your PC. Finally, I'll provide the wiring diagrams necessary to build your own speaker to attach to the comm port of your PC.

From ASCII to Audio

To understand how to turn a mundane ASCII character stream into music, you need to look beneath the characters, at the underlying serial bit stream coming from the UART. Although a complete primer on asynchronous communications is beyond the scope of this article, I'll cover some basics of where the audio comes from.

An RS-232 asynchronous communications line idles at a state defined as being the "1" state of the line. This is also referred to as a "marking" state. When you shove, say, the letter A into the transmit register of the UART, it generates a start bit. A start bit is always a 0, and the transition from the marking state of 1 to that first start bit of 0 is the critical point of synchronization whence all the rest of the bit stream is referenced.

After the start bit, the UART proceeds to send the bits of the character, starting with the least-significant bit (LSB). An A is 0x41, so the LSB is a 1. Assuming that the line settings are for 8 bits and no parity, you'll see the UART crank out all 8 bits of the character, immediately followed by a stop bit, which is always a 1. The stop bit is also a critical point, since the receiving UART uses it as a simple check to make sure synchronization was maintained throughout the reception of the character. If it doesn't see a 1 in the stop-bit slot, it reports the familiar "framing error" status. Figure 1 shows what the letter A looks like as it emerges from the UART, all framed up with start and stop bits.
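The framing just described is easy to model in a few lines of C. This is only an illustrative sketch for an ordinary hosted compiler; the names frame_8n1() and print_frame() are mine and do not appear in the listings.

```c
#include <assert.h>
#include <stdio.h>

#define FRAME_BITS 10   /* 1 start + 8 data + 1 stop (8 bits, no parity) */

/* Fill out[0..9] with the line states for one character, in the order
   they leave the UART: start bit, data bits LSB first, stop bit. */
void frame_8n1(unsigned char ch, int out[FRAME_BITS])
{
    int i;

    assert(out != NULL);
    out[0] = 0;                         /* start bit is always 0 */
    for (i = 0; i < 8; i++)
        out[1 + i] = (ch >> i) & 1;     /* least-significant bit first */
    out[9] = 1;                         /* stop bit is always 1 */
}

/* Print a frame as 0s and 1s, with the first bit sent on the left. */
void print_frame(unsigned char ch)
{
    int bits[FRAME_BITS], i;

    frame_8n1(ch, bits);
    for (i = 0; i < FRAME_BITS; i++)
        putchar('0' + bits[i]);
    putchar('\n');
}
```

Calling print_frame('A') prints 0100000101: the start bit, then 0x41 sent LSB first, then the stop bit.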

While the letter A is interesting enough, the character U is really interesting. As you can see in Figure 2(a), U (defined as 0x55) happens to generate a completely alternating bit pattern, including the start and stop bits. When a steady stream of characters is transmitted, the stop bits are immediately followed by the start bit of the next character. Figure 2(b) shows that a steady stream of U characters is actually just a continuous square wave, the frequency of which is simply half the baud rate programmed at the UART.
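To convince yourself that U really is special, you can check every possible character. The sketch below (helper names are hypothetical, not from the listings) tests whether each bit of a framed character differs from the one before it. Since one full cycle of the square wave takes two bit times, the tone frequency is the baud rate divided by two.

```c
#include <assert.h>

/* Return bit n (0..9) of the framed character:
   bit 0 is the start bit, bits 1..8 are data (LSB first), bit 9 is the stop bit. */
static int frame_bit(unsigned char ch, int n)
{
    assert(n >= 0 && n <= 9);
    if (n == 0)
        return 0;                   /* start bit */
    if (n == 9)
        return 1;                   /* stop bit */
    return (ch >> (n - 1)) & 1;     /* data bits */
}

/* Nonzero if every bit in the frame differs from the bit before it.
   The stop bit (1) always differs from the next start bit (0), so a
   stream of such characters alternates without a break. */
int alternates(unsigned char ch)
{
    int n;

    for (n = 1; n <= 9; n++)
        if (frame_bit(ch, n) == frame_bit(ch, n - 1))
            return 0;
    return 1;
}

/* One square-wave cycle spans two bit times. */
long square_wave_hz(long baud)
{
    return baud / 2;
}
```

Of the 256 possible characters, only 0x55 passes the alternates() test. At 115,200 baud, square_wave_hz() gives 57,600 Hz, well beyond the range of hearing.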

If you were to hook up your speaker to a UART while it generated a steady stream of Us, you'd hear a high-pitched tone as this square wave wiggled the speaker back and forth. But suppose you set the baud rate up very high, say at 115,200 baud, which is the fastest baud rate commonly available on PCs. Well, the speaker can't wiggle that fast, nor can we humans hear a frequency that high. Instead, the speaker just sees the effective average value of the waveform, in this case, 0 volts, since there is an equal distribution of 1s and 0s.

Now, suppose you pop a character with an extra bit turned on into the stream of Us every once in a while. Your trusty Us contain four 1s and four 0s, so pick a character that has five 1s and three 0s--a W (0x57), for instance. When you shove the W in, the speaker briefly sees an imbalance in the quick, alternating tugs the stream of Us provides. The response to this brief imbalance manifests itself as an exciting "click" every time you shift out a W. But this isn't quite music yet.

What happens when you turn on six 1s and only two 0s? The speaker will be a little more imbalanced and will move a little farther. How about seven 1s and one 0? It turns out that you can "unbalance" that speaker by four distinct values on each side of the center value created by the steady stream of Us. Thus, you have nine "positions" you can ask the speaker to assume depending on the mix of 1s and 0s you send it. It's a little DAC (digital-to-analog converter), albeit a humble, 3-bit one (plus the zero position). By picking the right character, you can control the movement of the speaker.
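Put another way, the "DAC value" of a character is just the number of 1 bits it contains, measured against the four 1s in a U. A minimal sketch, with function names of my own choosing; the sign is only a convention here, since which way the cone actually moves depends on the line polarity and how the speaker is wired.

```c
#include <assert.h>

/* Count the 1 bits in a character. */
int count_ones(unsigned char ch)
{
    int n = 0;

    while (ch) {
        n += ch & 1;
        ch >>= 1;
    }
    return n;
}

/* Position of the "DAC" relative to the U stream: -4 through +4.
   The start and stop bits cancel each other, so only the eight
   data bits tip the balance. */
int dac_level(unsigned char ch)
{
    int level = count_ones(ch) - 4;

    assert(level >= -4 && level <= 4);
    return level;
}
```

Here dac_level('U') is 0, dac_level('W') is 1, and the extremes 0x00 and 0xff give -4 and +4.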

So how many "samples" per second can your DAC do? With 8 bits per character, plus the start and the stop bit, you have a total of 10 bits for each character. At the 115,200-baud rate, this works out to 11,520 characters (or samples) per second. So your DAC is capable of sustaining a sample rate of just over 11K samples per second, which isn't quite CD quality, but is still good enough for voice-grade audio.
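The same arithmetic works for any line setting. The helper below is an illustrative sketch (the name chars_per_second() is hypothetical) that divides the baud rate by the total number of bits in a framed character.

```c
#include <assert.h>

/* Characters (and therefore DAC samples) per second for a given
   line setting. Every character costs one start bit plus its data,
   parity, and stop bits. */
long chars_per_second(long baud, int data_bits, int parity_bits, int stop_bits)
{
    int frame = 1 + data_bits + parity_bits + stop_bits;   /* 1 = start bit */

    assert(baud > 0 && frame > 1);
    return baud / frame;
}
```

With 8 data bits, no parity, and 1 stop bit, chars_per_second(115200L, 8, 0, 1) returns 11520. At 9600 baud the same settings yield only 960 samples per second, which is why the top baud rate is needed.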

In most digital-audio systems, the DAC is immediately followed by a sharp-cutoff lowpass filter. This prevents aliasing and provides the necessary smoothing function to turn the discrete steps back into smooth analog. You're going to have to forgo this luxury and settle for the simple lowpass action resulting from the speaker's mechanical inertia. It doesn't matter that much. The output resolution of three bits plus the zero position is still the biggest limiting factor in terms of overall "fidelity" (and I use that term very loosely).

Now that you have an understanding of how to coax some audio out of the RS-232 port, let's look at what you have to do to actually play the Microsoft Windows WAV files.

Playing WAV Files Over RS-232

The most common WAV file format contains 8-bit mono audio data in linear pulse code modulation (PCM) format using a sample rate of roughly 11,000 samples per second (11,025, to be exact). Other formats are possible, but so far this is the most common. The 11-kHz sample rate happens to be close enough to our own 11,520 DAC sample rate that a sample-rate conversion isn't even necessary. Sounds will play back slightly less than a semitone too high, but that's close enough for rock 'n' roll.
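The size of the pitch error can be worked out directly. A semitone is 100 cents, a frequency ratio of 2^(1/12), or about 1.0595. Assuming the file was recorded at the standard 11,025 samples per second, playing it at 11,520 gives:

```latex
\frac{f_{\text{play}}}{f_{\text{rec}}} = \frac{11520}{11025} \approx 1.0449,
\qquad
1200 \log_2 \frac{11520}{11025} \approx 76 \text{ cents}
```

Using the rounded figure of 11,000 instead gives about 80 cents. Either way, the shift stays under the 100 cents of a full semitone.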

Information about the actual file format is contained in a RIFF header at the front of the file. For simplicity's sake, you're just going to ignore that header and assume the previously stated sound-file parameters. Well, you're not going to completely ignore the header--in fact, you're going to "play" it. The header is so short (in terms of audio data) that the click it adds to the beginning of the sound is almost imperceptible. Since the program is, by definition, not particularly hi-fi, it's just not worth the extra effort to parse that header, so we're going to make a KISS design decision right off the bat. (Header? I don't see no header.)
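For the curious, here is roughly what the skipped work would look like. This is not part of COMAUDIO.C; it is a hedged sketch that assumes the usual RIFF layout (a 12-byte "RIFF"/size/"WAVE" preamble followed by chunks, each with a 4-byte ID and a 4-byte little-endian size, padded to an even length). The function name find_wav_data() is my own, and it works on a file image already read into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read a 32-bit little-endian value. */
static unsigned long le32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Scan a WAV image for its "data" chunk. Returns the offset of the
   first sample byte and stores the chunk length in *len, or returns
   -1 if the image doesn't look like a WAV file. */
long find_wav_data(const unsigned char *buf, size_t n, unsigned long *len)
{
    size_t pos = 12;                        /* first chunk follows the preamble */

    assert(buf != NULL && len != NULL);
    if (n < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;
    while (pos + 8 <= n) {
        unsigned long size = le32(buf + pos + 4);

        if (memcmp(buf + pos, "data", 4) == 0) {
            *len = size;
            return (long)(pos + 8);
        }
        if (size > n - pos - 8)             /* chunk runs past end of image */
            return -1;
        pos += 8 + size + (size & 1);       /* chunks are padded to even length */
    }
    return -1;
}
```

For a plain PCM file with only a 16-byte "fmt " chunk ahead of the samples, this returns 44, so the "click" being played amounts to a few dozen bytes, or a few milliseconds at 11,520 samples per second.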

Linear PCM is probably the simplest audio format to deal with, as each instantaneous value of the audio waveform is directly represented by a number. In this case, with 8 bits of resolution, the numbers vary from 0 to 255, with 128 used as the zero reference. As these numbers swing above and below the zero reference, you need to assign ranges to the nine possible output values provided by your cheesy DAC.

COMAUDIO.C

The program that makes it possible for you to blast sound out of your RS-232 port is COMAUDIO.C (Listing One, page 73). The area of COMAUDIO.C that's of interest is in the conversion of the 8-bit PCM value to the character for output to the serial chip. As you read the audio file in, the convert() subroutine is called to map each input byte of the audio file onto a value which will ultimately be used to index the nine-slot array of actual output characters. The convert() routine applies a bias and a scaling factor to cause each input value to land in one of these nine possible output zones. If necessary, the signal will be clipped to guarantee that it stays within the range of 0--8. Later during output, this index will select one of the output characters from the dac[9] array.
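The mapping is easier to follow with real numbers. The sketch below restates the set_vol() and convert() arithmetic from Listing One in portable form; the struct and the function names make_vol_map() and pcm_to_index() are my own, introduced only so the example needs no globals.

```c
#include <assert.h>

struct vol_map {
    int atten;      /* width of each output zone, in PCM counts */
    int bias;       /* PCM value where zone 0 begins */
};

/* Same arithmetic as set_vol(): louder settings use narrower zones,
   and the nine zones are centered in the 0..255 input range. */
struct vol_map make_vol_map(int volume)
{
    struct vol_map m;

    assert(volume >= 1 && volume <= 9);
    m.atten = (10 - volume) * 3;
    m.bias = (256 - 9 * m.atten) / 2;
    return m;
}

/* Same arithmetic as convert(): bias, scale, and clip to 0..8. */
int pcm_to_index(const struct vol_map *m, int pcm)
{
    int c = pcm - m->bias;

    if (c < 0)
        c = 0;          /* clip negative peaks */
    c /= m->atten;
    if (c > 8)
        c = 8;          /* clip positive peaks */
    return c;
}
```

At the default volume of 5, atten is 15 and bias is 60. The PCM zero reference of 128 maps to index 4 (the U in the table), anything at or below 74 maps to index 0, and anything at or above 180 clips to index 8.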

The characters in the dac[9] array are arranged in order of increasing numbers of 1s. Except for the values for all 0s (0x00) and all 1s (0xff), there are several possible candidates for each position. The choice was based on minimizing the low-frequency content of the character, so as to reduce subharmonic squeal from the 115,200 carrier as much as possible.

Prior to commencing actual playback, interrupts are disabled. This is necessary, since even the brief disturbance of a timer tick will pose enough interruption that the CPU can fall behind the serial chip. If this happens, the line drops into the marking state, yielding a very audible clicking sound.

The COMAUDIO.C program was compiled and tested under Borland Turbo C 2.0 and Borland C++ 1.0. Although the program was coded on a 486/33 under DOS and Windows 3.1, I also tested it successfully on 286-class PCs. Similar versions of COMAUDIO.C have been tested on a SPARC 1+ under SunOS 4.1.3. If you have a slower machine, you might need to do some optimizing of the output function. (Or better yet, you might just buy a new computer. Jeez, get into the '90s.) After you compile the program, the only thing left to do is hook up a speaker. Sample data files in .WAV, .C3P, and .AU (Sun) format are available electronically; see "Availability," page 2.

Attaching the Speaker

Figure 3 is a wiring diagram for the speaker. While just about any speaker will work, a cheap replacement 8-ohm speaker (Radio Shack 40-1208 or equivalent) is ideal. You actually want the frequency response to be somewhat limited in order to help filter out some of the nasty, high-frequency digital hash.

A capacitor of at least 100 µF, with a voltage rating of at least 16 WVDC, is put in series with the speaker to block DC. Polarity is important; make sure you get the negative terminal of the capacitor attached to the speaker and the positive terminal attached to pin 7 of the DB-25 connector.

The speaker will sound somewhat better if you enclose it in something. A simple cardboard box works well enough; see Figure 4. Plus, this keeps your friends guessing about what complex circuitry, capable of magically turning RS-232 into audio, lurks inside.

Hook the speaker up to one of the comm ports on your PC and give it a whirl. Unless you've been living on a deserted isle, you can probably locate a copy of TADA.WAV, a small sound file that ships with Windows 3.1, to use as a test. Run COMAUDIO with TADA.WAV as an argument, specifying the proper comm port if necessary, and you should hear delightful, although brief, strains of music emanating from your proud little speaker.

ARPEGGIO.C

In addition to the WAV files that are available electronically, there are plenty of places you can download more. But if you want something a little more melodious to play right away, ARPEGGIO.C (Listing Two, page 74) can be used to generate a sample audio file for playback through COMAUDIO. Simply invoke it with a destination filename, and it will write out an audio file containing a catchy little ditty. If your machine doesn't have a math coprocessor, it can take a couple of minutes or more to generate the output file, so be patient. ARPEGGIO.C is easy to modify, and you can change it to generate electronic compositions of your own.
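If you do want to modify ARPEGGIO.C, it helps to know how its note numbers work. Note 0 is the A at 110 Hz, and each step up multiplies the frequency by the twelfth root of 2. The sketch below mirrors that table-building loop and the step rule used by arpeggio(); the names note_freq() and next_step() are mine, and the semitone ratio is hard-coded so the example needs no math library.

```c
#include <assert.h>

#define SEMITONE 1.0594630943592953     /* 2^(1/12), one equal-tempered semitone */

/* Frequency in Hz of note n, where note 0 is A at 110 Hz. */
double note_freq(int n)
{
    double f = 110.0;

    assert(n >= 0);
    while (n-- > 0)
        f *= SEMITONE;                  /* up a semitone */
    return f;
}

/* Next step in the rising arpeggio, using the same rule as arpeggio():
   skip three semitones after steps 4 and 9 of each octave, two otherwise. */
int next_step(int step)
{
    return (step % 12 == 4 || step % 12 == 9) ? step + 3 : step + 2;
}
```

Note 24 works out to 440 Hz, two octaves above the base note. Starting from step 0, next_step() walks through 0, 2, 4, 7, 9, 12 and so on up to 24, which is a major pentatonic scale.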

Figure 1: The letter A, as seen emerging from the UART, all framed up with start and stop bits.

Figure 2: (a) The letter U (defined as 0x55) generates a completely alternating bit pattern, including the start and stop bits; (b) a steady stream of U characters is a continuous square wave, the frequency of which is half the baud rate programmed at the UART.

Figure 3: Wiring diagram for the speaker.

Figure 4: New highs in low-fi: a roll-your-own speaker.

[LISTING ONE]


/*
    COMAUDIO.C - uses PC com port to generate audio
*/
#include <stdio.h>
#include <stdlib.h>
#include <alloc.h>
#include <sys/stat.h>
#include <dos.h>

/* defs for low level access to serial chip */
#define SCC_DATA    0
#define SCC_INTCTRL 1
#define SCC_CTRL    3
#define SCC_STATUS  5
#define TXRDY       0x20
#define comout(scc_base,c)                                  \
{                                                           \
    while((inportb(scc_base + SCC_STATUS) & TXRDY) == 0);   \
    outportb(scc_base + SCC_DATA,c);                        \
}

/* farinc - macro to increment far ptr */
#define farinc(p) {                                         \
                    p++;                                    \
                    if(FP_OFF(p) == 0)                      \
                        p = MK_FP(FP_SEG(p) + 0x1000,0);    \
                  }

/* digital to ASCII analog conversion table */
unsigned char dac[9] = {0x00,0x08,0x12,0x29,0x55,0x6b,0xb7,0xef,0xff};

/* protos */
void main(int argc, char **argv);
int line_setup(int linenum);
void set_vol(int volume);
unsigned char convert(int c);

/*
                        main
*/
void
main(int argc,char **argv)
{
    FILE *fp;
    unsigned char far *p, far *bufp;
    long i;
    int port = 1, volume = 5;
    register int c,scc_base;
    struct stat statbuf;

    /* check arg cnt for sanity */
    if(argc < 2 || argc > 4) {
        printf("Usage: comaudio [wavfile] [[port]] [[volume]]\n");
        exit(1);
    }

    /* if com port spec'd */
    if(argc > 2) {
        port = atoi(argv[2]);
        if(port < 0 || port > 1) {
            printf("Use 0 or 1 to select com1 or com2 respectively.\n");
            exit(1);
        }
    }

    /* see if volume is spec'd */
    if(argc > 3) {
        volume = atoi(argv[3]);
        if(volume < 1 || volume > 9) {
            printf("Volume should be in range 1-9.\n");
            exit(1);
        }
    }
    set_vol(volume);

    /* get length of sound file */
    if(stat(argv[1],&statbuf) != 0) {
        printf("Cannot stat sound file '%s'\n",argv[1]);
        exit(1);
    }

    /* try to alloc mem to hold entire (nibble packed) file */
    bufp = farmalloc(statbuf.st_size / 2);
    if(bufp == NULL) {
        printf("Cannot allocate %ld bytes of memory for sound file\n",
            (long)(statbuf.st_size / 2));
        exit(1);
    }

    /* open sound file */
    fp = fopen(argv[1],"rb");
    if(fp == NULL) {
        printf("Cannot open sound file '%s'\n",argv[1]);
        exit(1);
    }

    /* read entire file into mem */
    for(i = statbuf.st_size / 2, p = bufp; i--; ) {
        /* pack 2 converted vals per byte */
        c = convert(fgetc(fp));
        *p = c | (convert(fgetc(fp)) << 4);
        farinc(p);
    }

    /* set up port */
    scc_base = line_setup(port);

    /* grab from buf and shove out com port */
    disable();      /* ints must be off for full "fidelity" */
    for(i = statbuf.st_size / 2, p = bufp; i--; ) {
        /* unpack two vals per byte */
        c = *p;
        comout(scc_base,dac[c & 0xf]);
        comout(scc_base,dac[c >> 4]);
        farinc(p);
    }
    enable();       /* turn interrupts back on */
    exit(0);
}

/*
                        line_setup

    Sets up spec'd line for 115200, returns base I/O address
    of assoc'd chip channel.
*/
int
line_setup(int linenum)
{
    union REGS regs;
    int scc_base;

    scc_base = linenum ? 0x2f8 : 0x3f8;

    /* BIOS call does most of it */
    regs.h.ah = 0;
    regs.h.al = 0xe3;               /* 9600,N,8,1 */
    regs.x.dx = linenum;
    int86(0x14,&regs,&regs);

    /* now talk nasty to the chip, write baud div = 1 for 115200 */
    outportb(scc_base + SCC_CTRL,inportb(scc_base + SCC_CTRL) | 0x80);
    outportb(scc_base + SCC_DATA,1);        /* write least sig */
    outportb(scc_base + SCC_INTCTRL,0);     /* write most sig */
    outportb(scc_base + SCC_CTRL,inportb(scc_base + SCC_CTRL) & 0x7f);
    return(scc_base);
}

static int atten,bias;

/*
                        set_vol

    Sets conversion factors for spec'd volume.
*/
void
set_vol(int volume)
{
    atten = (10 - volume) * 3;
    bias = (256 - (9 * atten)) / 2;
}

/*
                        convert

    Converts 8 bit PCM value to index into ASCII lookup table
    by attenuating and clipping as necessary.

*/
unsigned char
convert(int c)
{
    c -= bias;
    if(c < 0) c = 0;    /* clip negative peaks */
    c /= atten;
    if(c > 8) c = 8;    /* clip positive peaks */
    return(c);
}

[LISTING TWO]



/*
    ARPEGGIO.C - generates test sound file
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define SAMPRATE    11520           /* for 115.2 kbaud */
#define DURATION    (SAMPRATE/15)   /* 1/15 sec units */
#define MAX_NOTES   73              /* six octaves */

FILE *fp;       /* output file */

/* protos */
void init_notes(void);
void note(int note_num, int duration, int decay);
void arpeggio(int basenote, int step);

/*
                        main
*/
int
main(int argc, char **argv)
{
    if(argc != 2) {
        printf("Usage: arpeggio [outfile]\n");
        exit(1);
    }
    if((fp = fopen(argv[1],"wb")) == NULL) {
        printf("Can't open output file '%s'\n",argv[1]);
        exit(1);
    }
    init_notes();

    arpeggio(24,0);
    arpeggio(15,12);
    arpeggio(22,12);
    arpeggio(19,0);
    arpeggio(24,0);
    note(24,10,8);
    exit(0);
}

/*
                        arpeggio

    Recursively generates a pretty little arpeggio.
*/
void
arpeggio(int basenote, int step)
{
    note(basenote + step,1,2);  /* plinky going up */
    if(step == 24)              /* 2 octave arpeggio */
        return;
    else if(step % 12 == 4 || step % 12 == 9)
        arpeggio(basenote,step + 3);
    else
        arpeggio(basenote,step + 2);
    note(basenote + step,1,8);  /* legato coming down */
}

static double notetab[MAX_NOTES], rad_per_samp;  /* sized to match init_notes loop */

/*
                        init_notes

    Builds note frequency table and calcs radians/sample.
*/
void
init_notes(void)
{
    double twlfth_root2,freq;
    int i;

    twlfth_root2 = pow(2.0,1.0 / 12.0); /* compute semitone interval */
    for(i = 0, freq = 110.0 ; i < MAX_NOTES; i++) {
        notetab[i] = freq;
        freq *= twlfth_root2;           /* up a semitone */
    }
    rad_per_samp = 2.0 * M_PI / SAMPRATE;
}

/*
                        note

    Looks up note in frequency table and performs.
*/
void
note(int note_num, int duration, int decay)
{
    int c;
    long i,cnt;
    double freq,val,env,vol = 50.0;

    freq = notetab[note_num];               /* look up note frequency */
    cnt = duration * DURATION;              /* calc count for duration */
    env = 0.999 + decay * 0.0001;           /* calc envelope decay factor */
    for(i = 0; i < cnt; i++) {
        val = sin(rad_per_samp * freq * i); /* compute sine wave val */
        c = (int)(vol * val) + 128;         /* convert to 8 bit PCM */
        fputc(c,fp);                        /* write to output file */
        vol *= env;                         /* make note decay */
    }
}