STRUCTURED PROGRAMMING

Pondering Imponderables

Jeff Duntemann, K16RA

I'm one of those guys who's always pondering imponderables. Such as: What is it that schnauzers do when they're schnauzing? Why do we continue to reelect that pack of thieving cowards we call a Congress? If there are parakeets, are there also metakeets? Why did God create heavy metal bands when we already had trial lawyers? (Or, Lord knows, Lima beans?)

Occasionally, answers present themselves. Not long ago, while watching Mr. Byte squirming around on his back, rubbing his nose against the carpet, I realized that this was, in fact, schnauzing. Schnauzers must just have itchier noses and do it more often. As for Congress; hey, go figure. I always vote against the incumbent. (There are no good congressmen, and not nearly enough dead ones.) We'll leave the question of metakeets to Lewis Carroll's next incarnation.

But you want an imponderable, well, ponder this: Is there some truly general method for designing software?

The Most-asked Question

I'd guess that this is the most-asked question in all computerdom, even if most people asking the question ask it of themselves. (Usually when a deadline's coming up and they have nothing to show for their six months' worth of "design phase.") It beats out the runner-up, "What the hell is object-oriented programming, anyway?" (asked mostly when the boss has laid down the law that all future projects are to be written in an object-oriented fashion) by an Arizona mile.

I've been spending the lion's share of last year's columns fussing with the newness of the OOP concept. We've definitely chewed the flavor out of that particular gumball, big though it was when it rattled out of the machine, and it's time to chew on something else.

So for the next umpty-ump columns I'll be addressing design issues as I perceive them. In the process I'll lay out my own design process for a specific application, and I'll build the application as I go. I readily admit that I'm not a software design expert. I don't have a degree in computer science or software engineering. (My degree is in English, that most cantankerous interpreter of all.) However, I've designed, implemented, and maintained a couple of middling systems in my time (over and above the little programming projects I've done in lieu of watching "L.A. Law" all these years) and, more important, I've watched a lot of other people try and sometimes fail to design software of their own.

The Nature of Design

Let's go back to the last imponderable I mentioned. Is there a truly general method for designing software? There is indeed. It is, in fact, summarized in just one word: Think!

I say this because I've read between the lines a lot in talking to other programmers about design. Numerous people have asked me to recommend a good design methodology. Closer probing sometimes revealed that they were in fact hunting about for something they thought they might have read about in a copy of Datamation that somebody left on the floor of the MIS/DP men's room in 1982; called something like the Krobolgian-Czstrwytrszyz Decomposition Theorem, which would take a set of vague statements scribbled on the back of a Chinese menu and automagically poof them into tight pseudocode that your little sister could implement in GW BASIC in an afternoon. In other words, what they were really asking for was to please relieve them of the burden of actually thinking about what they had to do.

No. Design is not easy. There may be tools to make certain parts of the process less burdensome, but none of these tools relieves you of the responsibility of understanding what's going on all along.

I've created systems in a number of different ways. In looking back, however, I can cook the process down to the following:

  1. Understand what you can't do.
  2. Understand what you must do.
  3. Resolve any conflicts between #1 and #2.
  4. Do what you must do.
  5. Make sure that what you've got is what you wanted.
As far as I've been able to tell, these five steps apply to any sort of software design project you can name. None of them is easy. All of them require considerable thought. #3 may demand a certain element of statesmanship as well, especially in corporate environments where the people who define #1 and #2 are separate groups that throw blunt objects at one another whenever they get the chance.

Constraint-based Design

But step #1 deserves some explication. No design methodology I've studied gives it the importance that I do. Not surprisingly, many if not most of the failed design projects I've seen, failed precisely because the designers gave little thought to their collection of constraints.

I warn you: Get your constraints down on paper first. Memorize them. Then tape them to your door, and do not cover them with your favorite entries from The Far Side Calendar.

So what's a constraint? A constraint is one of the points through which you draw the boundaries of possibility. Inside is to work. Outside is to dream.

The most obvious constraints are defined by the target platform. One very common constraint is to work in text mode only, often because the target machine or machines have no graphics ability. Another constraint is for the program to work in 640K, without benefit of EMS, again because the target machine or machines have no EMS. Connecting to network or mainframe resources can involve pages of detailed and subtle constraints, things you can't do at certain times, or things you can't do faster than certain speeds. I once saw a guy try to design a system that had to pass a megabyte of data every evening -- through a 300 baud phone link. Not his problem -- he just had to write the program. Not his fault if the damned data was too big!

Much thornier constraints involve the people who use the system. How much operator time per day will the system demand? Will the operator be a seasoned part of the team requesting the system, or a minimum-wage clock-watcher from Rent-A-Loser?

Constraints may also be dictated by organization policies. The company may hire you to design and implement a system for them, but may forget to mention that all systems must conform to SAA (in many mainframe-dominated shops, IBM's word has been law for so long that nobody even questions whether SAA is a good idea or not) or even, God help us, be written in Cobol. Management may not ordinarily enforce the Cobol rule -- until somebody decides his turf is being stepped on during the design process and goes looking for fine print to invoke against the proposed system.

Expect constraints if the proposed system involves personnel information, or financial information, or anything else a corporate body might consider sensitive. Transmissions may have to be encrypted. Password access may have to be strictly controlled.

I'm a fanatic about constraints because constraints affect the shape of the system you're designing more than the functional requirements of the system do. Sounds weird, but it's perfectly true. Before you choose which pair of shoes to wear to Hashimoto-San's party, you'd better find out whether he allows people to wear shoes in his house.

Constraints vs. Requirements

Knowing what's a constraint and what's a requirement is a little like knowing what's a bug and what's a feature. I see it this way: A requirement is part of the list of things that the software you're designing must accomplish. Period. A "requirement" that a system be written in C isn't a requirement -- it's a constraint.

(And to avoid any accusations of C-bashing, I'll add that a requirement that the system be written in Pascal is every bit as much a constraint. Just a pleasanter one....)

These are also examples of constraints:

As you move toward the bottom of this list, the constraints start to look more and more like features. The last item on the list, for example, could be either. The touchstone is this: A constraint is a limitation imposed on the application that takes priority over the feature set. You may know how to implement support for COM3: and COM4:, but some other peripherals in the target machine may already be using those two ports. Thus, you can't just "toss them in;" in fact, if your personal library of comm routines includes COM3: and COM4: support, you may have to explicitly disable it or even carve it out of the source code.

The Topsy Problem

Does it make sense to impose explicit constraints on software you're writing for yourself? (Apart from obvious constraints such as working within the limits of your own hardware.) It can, especially if you as a programmer are prone to "creeping featurism," that is, the irresistible urge to keep stuffing new features into a project far beyond what you imagined at the outset. (To paraphrase the potato chip people, "Betcha can't add just one!")

Projects that grow like Topsy can be hazardous to your delivery schedule, your sleep schedule, your marriage. A set of self-imposed constraints will help you set a clear boundary on when the project is actually finished. Limiting yourself to text mode means you're less likely to spend another two months adding graphics mode on a lark.

Later on, if you really, really want to add graphics support, you can break your own constraints.

The JTERM Project

I've decided to completely rethink and reimplement JTERM, the terminal emulator application that I started writing (in CP/M Turbo Pascal) back in 1984. It grew by accretion and is the epitome of software that has utterly no design or forethought behind it. It sends and captures text files, and it transfers files through XModem checksum. There are no menus at all; everything is done most cryptically by various sequences of control keys.

In short, it richly deserves to be shot in the head and left on the anthill.

So where do we start?

Here: With a list of constraints.

Obviously, these are all self-imposed and, are there mainly to give a certain shape and bounds to the project. Each constraint has some thought behind it, however, as every constraint should.

I'm excluding graphics support to keep the project manageable. At some future time, I may reimplement JTERM for Microsoft Windows 3.0, but that's a whole different design exercise. I'm writing it in Turbo Pascal to make it interesting to the broadest possible subset of "Structured Programming." I'm excluding third-party libraries because I want to be able to distribute the entire package in source code form. Besides, buying all the tough parts will preclude discussion of the design efforts that go into the tough parts. In a real-world design effort (where the source code needn't be distributed) you're probably better off buying as much of the technology in library form as you can.

I'm stopping at COM2: because the interrupt infrastructure for the first two COM: ports is fairly standard and well-understood. Beyond that, things get dicey and the complexity of the project as a whole goes up severely.

Finally, I'm using Turbo Vision for the user interface because Turbo Vision is now going out with every copy of Turbo Pascal as an extension of the runtime library. As such, it instantly becomes a force to contend with, and deserves some serious investigation. Because everyone who has Turbo Pascal 6.0 will also have Turbo Vision, distributing the source for Turbo Vision along with JTERM is unnecessary.

A project simple enough to describe in a few magazine articles won't really be big enough to have much in the line of constraints. And when you're working on your own, you can do pretty much whatever you want. However, let me reiterate that in almost every case, programming for money involves numerous constraints, some clearly stated at the outset, and others that you will have to dig for. You'd better dig for them, too -- or later on they'll come crawling up out of the ground like those jive zombies in Michael Jackson's "Thriller" video, muttering one horrible sound or another:

"You know that no one outside Headquarters can access that data set...."

"You know that our Poughkeepsie offices run everything on Apple II machines...."

"You know that all software here has to be written in Cobol...."

Eek!

The Big Zeller Wrapup

When last I looked at the pile on the windowsill, I had a little more than 80 letters, cards, and e-mail notes about Zeller's Congruence, a widely-used, but rarely-explained algorithm for extracting the day of the week given the year, month, and day in the Gregorian era. Zeller's Congruence revolves around the expression shown in Figure 1, reprinted verbatim from my October 1990 column. The q term represents the day of the month. The m term represents the month, but massaged slightly so that while March is month 3, January and February are months 13 and 14, but of the previous year. (I'm being terse here. For the full story, do refer to the October column.) K is the year term, from 0 to 99, and J is the century term. (that is, 17, 18, 19, and so on.)

Figure 1: The Zeller Expression

      (m + 1) * 26       K     J
  q + ------------ + K +--- + --- - 2 * J
           10            4     4

The majority of my correspondents wrote most helpfully to explain the meaning of the -2*j term in the algorithm, at which I'd thrown up my hands in despair of understanding. Here's the skinny; like everything else but American politics, it's simple enough in hindsight.

The Secret of -2*J

What the expression in Figure 1 does is calculate the way the day of the week advances for each day (q), each month (m), each year (K), and each century (J). For each day, the day of the week advances by one. (Obviously.) For each month, the day of week advances somewhat erratically, but Zeller was bright enough to come up with the ((m+1)*26)/10 term to describe it.

Now, the last four terms in the expression are actually two terms plus two corrections. The day of the week advances by one for every year, so we add K. However, every four years, the day of the week advances by an additional (leap year) day, so we have to add K/4, which throws in an extra day for every four years we add. The K/4 term is thus a necessary correction to the K term.

Now, where is the term that shows how the day of the week advances for every century? You guessed it: -2*j. The day of the week moves back by two days every century. The addition of the J/4 term throws an extra day in every 4 centuries, when the century day (that is, noughty-nought," the 00 year) which is ordinarily not a leap year, is made a leap year to account for a slowly-accumulating round-off error in the number of days it takes to make a year. J/4 is thus a correction to -2*j. (I had falsely assumed -- Lord knows why -- that the day does not advance at all in an ordinary century, leaving J/4 a correction to an unstated 0 term.)

Modulus and Remainder

One of the things that makes Zeller so hard to implement is that what Zeller called the modulus function is not quite the same thing as the MOD operator present in most of our compilers. (I took up this issue in my November 1990 column.) What we call MOD in Pascal and Modula-2 is really a remainder function. Modulus and remainder return identical results for positive quantities, but different results for negative quantities, and my Zeller's Congruence implementation was going haywire every time the -2*j term forced the value of the expression as a whole into the negative.

I implemented a true modulus function, presented as Listing One in the November column -- and blew it. My Modulus(X,Y) function returns an erroneous value in every case where either X or Y is negative and Y is an even multiple of X. Where Y is a multiple of X, X modulus Y is 0, always -- and the (X*Trunc(R-1)) term evaluates to X or -X for those cases, when it should in fact evaluate to 0.

Harry J. Smith of Saratoga, California pointed this out and sent a correct Modulus function, which I've given in Figure 2. If you've already begun using my CalcDayOfWeek function containing the Modulus local function, please replace the old Modulus with the one in Figure 2.

Figure 2: MODULUS2.SRC

  FUNCTION Modulus (X, Y : Integer) : Integer;

  VAR
    Holder : Integer;

  BEGIN
    Holder := X MOD Y;
    IF Holder < 0 THEN Inc(Holder,Abs(Y));
    Modulus := Holder;
  END;

Many thanks, Harry.

Congruence?

But the most remarkable thing I learned from this small mountain of letters is that the whole modulus business could have been avoided by magically replacing the -2*j term with a +5*J term. Everything comes out exactly the same in the end, except that since we're adding a quantity to the expression instead of subtracting it, the expression as a whole never goes negative, and we can use MOD with impunity. MOD, remember, is actually the remainder function, and returns the same results as true modulus for all positive quantities.

Why, though, is +5*J equivalent to -2*j? Keep in mind what happens to the value of the expression once we take it: We calculate expression modulus 7. Adding or subtracting multiples of 7 to the value of expression does not change the final value of expression modulus 7. This is why I could get away with my original kludge (shown in the listing for CalcDayOfWeek in my October 1990 column) of adding 7 repeatedly to the expression any time it came out negative until the value turned positive.

Adding 5*J to the expression does something like that. Think of it this way: Every passing century moves the day of the week back by two days. In other words, if today (Halloween, October 31, 1990) is Wednesday, the day of the week on 10/31/1890 was Friday. We say this because most centuries contain 36,524 days -- two days short of a multiple of seven days.

On the other hand, two days less than a multiple of seven days is absolutely the same thing as five days more than a multiple of seven days. It's just as valid to say that in the century between Halloween 1890 and Halloween 1990, the day of the week went forward by five days: From Friday to Wednesday. We're not the least bit concerned about the actual number of days that pass in a century; we're only concerned with the relative position of the day of the week from one end of the century to the other. +5*J, -2*J; Five steps forward, two steps back: modulo 7, it's all the same.

Products Mentioned

MyFLIN OpalFire Software 329 North State Street Division II Orem, UT 84057 801-227-7100, $59.00

I'm not going to print CalcDayOfWeek here yet a fourth time (which would probably be a DDJ record for one piece of code) since I am thoroughly Zellered-out. The change, if you choose to make it, is simple enough for me to cop out and leave as an exercise to the reader.

Thanks much to everybody who wrote to me on the subject.

And By the Way, What's a FLIN?

The downside to having really original product ideas is that there's no comfortable niche to fit into. Truly original products have to explain themselves very well and very often or they don't get their fair share of the public's band-width.

I received a product not long ago that presents a classic example. The product is MyFLIN, written by an Australian hacker and marketed through an American firm in Utah. It's extremely clever, and extremely useful; but it may also be the least self-explanatory product I've ever seen.

Not that it's hard to use; I don't mean self-explanatory that way. But there is not a whit of explanation on the packaging as to what the product does, and as far as I can tell, the disk-based documentation does not share the secret of what "FLIN" means.

MyFLIN solves a very specific problem that I've had for the whole time I've been a Pascal programmer: I create a passel of procedures, including some middling complex ones with eight or nine parameters. Later on, I go to call those procedures from some other part of the source code, or from a different module, and I realize I don't quite remember the order, spelling, or types of all the parameters. Did ErrorCode come before BufferPtr or after? Was it StringForm, StrForm, or StringFrm? Before you know it, I'm Ctrl-QRing to the top of the file to start searching for the malremembered proc, at considerable cost in time and concentration.

MyFLIN fixes that. It's a TSR that builds a database of your own procedure and function declarations for you. There's no data entry involved; MyFLIN sucks the declaration right out of the screen buffer. You just put the cursor anywhere inside the name of the declared procedure, hit a hot key, and you've captured it. Later on, you can recall it just as easily.

This is seminal stuff. It's pretty much alone in its niche, and you really have to play with it for an evening to get a feel for how indispensable the concept is. The software has a couple of rough spots, but nothing worth any serious carping.

Someday I'll figure out what FLIN means -- but I now have a FLIN, and if you suffer from badly-remembered procedure declarations, you should get one, too.

Two Years Before the Masthead

It's Halloween night again -- and I realize that I finished off my very first column for DDJ two years ago today. Back when Kent Porter first approached me with the idea of taking over the "Structured Programming" column, I was in terror of running out of topics to cover after a couple of months. Now, 25 columns later, I see that my list of things-to-be-discussed covers several pages, single-spaced, and grows seemingly without end.

Some call programming a bottomless pit. It's actually more of a never-ending Lifesaver; no matter how many wonders you pinch off the roll, another is always there, right behind it, waiting to be mastered and savored.

Keep that in mind, the next time you're chasing an intermittent system crash at three ayem. Boo!



_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann



[Figure 1: The Zeller expression]


    (m + 1) * 26        K     J
q + ------------ + K + --- + --- - 2*J
        10              4     4




[FIGURE 2]

FUNCTION Modulus(X,Y : Integer) : Integer;

VAR
  Holder : Integer;

BEGIN
  Holder := X MOD Y;
  IF Holder < 0 THEN Inc(Holder,Abs(Y));
  Modulus := Holder;
END;

Copyright © 1991, Dr. Dobb's Journal