June 2000/We Have Mail

Departments

We Have Mail

Letters to the editor may be sent via email to cujed@cmp.com, or via the postal service to Letters to the Editor, C/C++ Users Journal, 1601 W. 23rd St., Ste 200, Lawrence, KS 66046-2700.

Dear CUJ,

I would just like to make a few comments regarding Scott Meyers' article in the February 2000 CUJ. The main point being made is that if it is possible to implement a function f related to a class C without making f a member function of C, one should do so, as this increases encapsulation. I can see two problems with this approach.

First, it makes for a messy interface if some methods are member functions and some are not. As the article explains, this kind of syntactic inconsistency cannot always be avoided. However, there is no reason to encourage it. Class interfaces are written for class users, not class implementors. They should be as straightforward as possible. It should be of no interest to a user if a method was implemented without accessing private class members. A user just wants a clean and understandable interface to work from, without being exposed to consequences of the class implementation.

My second, and more important, objection is that the suggested approach can hinder future changes to the implementation of a class. To use the example in the article, class Wombat supports eating and sleeping, but sleeping does not need to be implemented as a member function:
class Wombat {
public:
   void eat(double tonsToEat);
   ...
};

void
sleep(Wombat& w,
   double hoursToSnooze);
This definition is usable and, yes, users will get used to and accept the inconsistencies in the interface.

But suppose a few months later the implementor wants to make some changes. It's been realized that to build a better Wombat, the sleep period is influenced by how much food the creature has to sleep off. This should be simple enough, we just need to access the tonsOfLastMeal private member and... ah.

We now have two choices. We could expose tonsOfLastMeal, but we don't want to — that's why it was originally private. So now we have to make sleep a friend of Wombat. This is a kludge, and it's what gives friends a bad name. To quote Mr. Meyers himself, "Much as in real life, friends are often more trouble than they're worth."

I would like to suggest an alternative approach: make sleep a member function, but implement it as a local non-member, non-friend function, as follows:
wombat.h:

class Wombat {
public:
   void eat(double tonsToEat);
   void sleep(double hoursToSnooze);
   ...
};

wombat.c:

namespace {
void
do_wombat_sleep(Wombat& w,
   double hoursToSnooze);
}
...
Wombat::sleep(double hoursToSnooze) {
do_wombat_sleep(*this,
   hoursToSnooze);
}
This has the small overhead of an extra function call, but it allows the sleep function to be implemented as a non-member (thus increasing encapsulation) while still giving the implementor the option to make the function a member if this should become necessary. It also gives the user a consistent interface.

This appears to me to be a win-win situation. I would be interested in other peoples' views on this.

Yours sincerely,

Rupert Welch
rafw@mindless.com

Scott Meyers replies:

Thanks for your letter, which CUJ forwarded to me.

Your first objection was:

First, it makes for a messy interface if some methods are member functions and some are not. As the article explains, this kind of syntactic inconsistency cannot always be avoided.

I'd go further and say that this kind of inconsistency can almost never be avoided. Class users almost always add some convenience functions of their own. I continue to believe that the argument about syntactic consistency is more one of theory than practice.

Your second objection was:

My second, and more important, objection is that the suggested approach can hinder future changes to the implementation of a class.

The key word here is "can" — as opposed to "will." I responded to this argument in a posting I made to the comp.lang.c++.moderated newsgroup in January, so I'll just repeat that posting here:

"On 17 Jan 2000 11:40:35 -0500, merlinvin@my-deja.com wrote: Agreed; however, there's a subtle point that I didn't see covered in the article in question. Providing non-friend/non-member functions has the side effect of guaranteeing that the code will *never* need access to protected or private portions of the class. Unfortunately, requirements have a nasty habit of changing..."

"Upcoming professional and personal obligations will likely preclude my participation in this newsgroup for a while, but since I'm the ‘Program in the Future Tense' guy (MEC++ Item 32), I think it's important to address this point.

I recently finished Kent Beck's extreme Programming explained (http://cseng.aw.com/bookpage.taf?ISBN=0-201-61641-6&ptype=2889). Kent gives a marvelous summary of the costs of building too much flexibility into your software. This is from pages 110-111. It's kind of long, but I think it's important to show enough to grasp the argument.

The key is that risk is money just as much as time is money. If you put in a design feature today and you use it tomorrow, you win, because you pay less to put it in today. Chapter 3, Economics of Software Development, however, suggests that this evaluation is not complete. If there is enough uncertainty, the value of the option of waiting is high enough that you would be better off waiting.

Design isn't free. Another aspect of this situation is that when you put more design in today, you increase the overhead of the system. There is more to test, more to understand, more to explain. So every day you don't just pay interest on the money you spent, you also pay a little design tax. With this in mind, the difference in today's investment and tomorrow's investment can be much greater and it is still a good idea to design tomorrow for tomorrow's problems.

As if this weren't enough, the killer is risk. As the Economics of Software Development pointed out, you can't just evaluate the cost of something that happens tomorrow. You also have to evaluate the probability of it happening. Now, I love to guess and be right just as much as anybody, but what I discovered when I started paying attention was that I didn't guess right nearly as often as I thought. Often the fabulous design that I created a year ago had nearly no correct guesses. I had to rework every bit of the design before I was done, sometimes two or three times.

So, the cost of making a design decision today is the cost of making the decision plus the interest we pay on that money plus the inertia it adds to the system. The benefit of making a design decision today is the expected value of the decision being profitably used in the future.

If the cost of today's decision is high, and the probability of its being right is low, and the probability of knowing a better way tomorrow is high, and the cost of putting in the design tomorrow is low, then we can conclude that we should never make a design decision today if we don't need it today. In fact, this is what XP concludes. ‘Sufficient to the day are the troubles thereof.' In other words, changing your design because of what might happen in the future is generally worthwhile only if you are able to predict with accuracy that the possibility you're designing for will actually take place.

On the surface, this seems to be completely contradictory to my ‘Program in the Future Tense' advice of MEC++ Item 32. It's not. I know it's not, because I believe what I wrote in MEC++ and I also agree with Kent's analysis above. Either I'm self-contradictory in my beliefs, then, or there is some underlying unification of these ideas that I need to identify (and probably write up).

I haven't had enough time to really think about this yet (the CUJ article was nearly two years in the making), but I think an important consideration is who controls what might change in the future and what the implications of these changes will be on class clients. In general, Programming in the Future Tense is about preventing silent failures of client code when they do things over which you have no control. For example, why should base classes have virtual destructors, even if there aren't currently any derived classes? Because clients can add derived classes without the knowledge or permission of the base class, so the base class needs to take that possibility into account at the outset.

This is quite different from adding a function that people might want to call in the future. If, in the future, they want to make the call and the function's not there, their code won't compile — things won't silently fail. Kent's argument (assuming I understand it correctly) is that if you add the function early on ‘just in case,' you add to your testing/maintenance/documentation burden without necessarily getting anything in return.

As I explicitly mention in M32, there is a balance to be struck between what is needed now and what might be needed in the future. In my CUJ article, I simply point out that if your choice is between a member function and a non-member non-friend function, the non-member non-friend is the decision that yields the greater degree of encapsulation. I think this is worth mentioning, because, in my experience, this is the opposite of what most people generally believe." — Scott

Dear CUJ,

At my tender age of 44, after working with C for nearly two decades, I'm still getting caught into believing that 0xffffffff << 32 is zero.

In my youth, as an assembly programmer, I've been carefully checking the number of bits involved when shifting a register. Now, implementing an IP address mask function in a high level language (C++) I only checked the high level logic, that is "0 <= shift <= 32," since the function logic implied that a mask of 0 can be used to mean "any IP address." (As everybody knows, if net and addr are 32-bit quantities such that (net & mask) == net, then the condition (addr & mask) == net determines if the IP address belongs to the network with the given mask. The (net, mask) pair is often given in the form 194.243.254.0/24, where a naive expression for the mask is 0xffffffff << (32 - number_of_bits).)

I was pretty sure I read something about an amendment in the behavior of the shift operator, a few years ago. At that time I reasoned that the BITSPERBYTE macro is not so widely diffused to allow painless checking of the right operands of shift operators, hence it was just fine that the shift operator was amended.

So I wrote buggy code and was caught. And I wrote to the usenet as a perfect idiot who didn't read the FAQs. I got plenty of responses telling me that I was wrong. I checked on a 1996 C++ draft and I was helplessly punished by "5.8 [...] The behavior is undefined if the right operand is negative or greater than or equal to the length in bits of the promoted left operand."

Laugh at me!

Still, I wonder whether younger programmers may get relieved or not reading this story.

Yours,

Alessandro Vesely

You've run afoul of one of those compromises that make C famous as a fast but sometimes dangerous language. Hardware designers have long been notorious about making shift instructions that aren't robust. They seldom have provision for shifting an N-bit word N or more bits in either direction. To minimize the need for guard instructions around every shift, the C Standard continues the practice begun by Dennis Ritchie — it's up to you, the programmer, to avoid bogus shifts. It may not be nice for novices, but it's been that way for three decades. — pjp

Dear Sirs:

I found Brad Offer's article, "Error Logging with Iostreams" in the April 2000 issue of CUJ, interesting since recently I had to do something similar. However, I used a different approach, which didn't require redefining iostream inserters, extractors, and manipulators. My equivalent of his dspl macro (displayError manipulator) is a sequence of two back-to-back endl's, since in our error messages we use a single endl to terminate a line, but we'll never have a blank line (two endls) by itself in the middle of the error message, so this seemed to be a safe way to delimit error messages. To be able to redirect the error messages to whatever display device and/or log file is appropriate, I defined my own streambuf class, and then used the ostream_withassign functionality of clog and cerr to override the default output device with my own by constructing and assigning an instance of my streambuf class to clog and cerr. Finally, logging of the filename and line number was accomplished in a similar manner by using the following macros to log trace and error messages:
#define TRACELOG(debugLevel,outMsg){\
   MyLogMgr *theLogMgr = \
   (MyLogMgr*) \
   G_LogMgr::GetInstance();\
   if ((theLogMgr->GetDebugLevel() &\
        debugLevel) == debugLevel) {\
      theLogMgr->GetTraceLog() << \
         outMsg << endl << endl;}}

#define ERRORLOG(outMsg) {\
   char szMsg[512];\
   sprintf(szMsg, \
  "Error at line %d in file(%s):\n",\
      __LINE__, __FILE__);\
   G_LogMgr::GetInstance()->\
      GetErrorLog() << szMsg << \
         outMsg << endl << endl;}
I have included the implementation of my streambuf class on the CUJ ftp site [look for letters.zip in this month's archive. —mb].

Thanks,

Ed Remmell
email: mithras@idcomm.com

Brad Offer replies:

Dear Ed,

Thank you for your letter. It's always fun to see someone else's solution to a problem. I've been wanting to tackle something involving streambufs.

In that spirit of self-improvement, my goal in writing the Errstream class was to learn more about extending the standard iostreams, which I did. As I mentioned in the article though, a streambuf would be a logical replacement for the static character array that I used to hold error messages. I'm hoping that the code in your letter will be of help in that endeavor.