January 2000/We Have Mail

Departments

We Have Mail

Letters to the editor may be sent via email to cujed@mfi.com, or via the postal service to Letters to the Editor, C/C++ Users Journal, 1601 W. 23rd St., Ste 200, Lawrence, KS 66046-2700.

Marc,

If you don't get a deluge of mail about the component-versioning ideas you raised in Editors Forum, October 1999, then I'll be very surprised. Here is my contribution to the deluge.

I agree very strongly with you about the need to fix this problem, especially in the DLL area on Windows. However, I believe that the "manifest" and""version" functions need one extra point applied to make them workable — the data that they deal with (i.e., version numbers) must have a clear interpretation so that precise rules and outcomes can be defined.

As an example of an interpretation/outcome that worked well (IMHO), here is a what VMS commonly does for its equivalent of a DLL.

1. When a DLL is created, it is assigned a version number, such as 2.1. The first number (2) is called the "major" version, and the second (1) is called the "minor" version.

2. When a .EXE or .DLL is built, a copy of the version number from each DLL that it references during the build is taken and stored in the .EXE.

3. When the .EXE is started (perhaps a long time later, or on another host) the program activator checks the version numbers stored in the .EXE against the current version number of each DLL actually available on the host at that time. This process is recursively repeated for each DLL referenced by the DLLs already checked.

4. The program activator will then only allow the .EXE to start if, for each DLL checked, both of these are true:

(a) the major number is exactly equal, and

(b) the minor number in the .EXE is not greater than that in the available DLL.

5. The creator of a DLL had an obligation to support this mechanism by ensuring that if the major-version number was unchanged, then the interface either only added new methods or altered methods would behave the same if called by an older program using the older calling convention.

The result is that a program built against FOOBAR.DLL V2.3 would run using any available FOOBAR.DLL from V2.3 upwards, but not from FOOBAR V3.0 and upwards. The best part is that when the condition failed, you got an (almost) legible error about which DLL was a problem, what version was expected, and what version is available.

This system worked so well, that right up into VMS V6 I was still able to run a .EXE that was created in VMS V2. (In fact I kept this .EXE just for the fun of testing it on each new release.) This is a time gap of some ten years. I'll bet serious money that Win32 will not achieve that!

There may be better systems for interpreting and checking the "manifest" and "version" information, and the "refusal to run" outcome may not always be the best one. But the point is that the interpretation/outcome was always clearly defined to all concerned.

I wish you luck on your crusade. Do you have Bill's phone number so that you can convince him?

Claude Brown
Sydney, Australia

P.S. What really frustrates me is that the VMS background of the NT developers did not result in this mechanism being available. Damn.

I did indeed receive a lot of mail on that Editor's Forum. Apparently I touched a very sore nerve. Actually, my aims were not nearly so ambitious as DEC's (vendor of VMS systems). I am not proposing a scheme to ensure backward compatibility. I just want to help out the poor guy who has to troubleshoot his component-based application. If he figures out that DLL v2 isn't really backwardly compatible with DLL v1 (documentation notwithstanding) he has a pretty good chance of fixing the problem. But if he can't even find out what versions he has — well, he's just screwed, and so are the rest of us.

Thanks for supporting my little jihad. (As for Big Bill, I probably don't even rate a pixel on his radar screen.) For a couple more responses to the forum, read on. — mb

Dear Marc,

Few things are as delightfully frustrating as seeing the publication of an idea passionately held. This happens to me quite often when I read your magazine, and it happened again today, when I read your October editorial: "Every component, shared library, DLL, etc. should have a version function, callable by the host application." I thought my brain had left my body, traveled to your computer, and typed in what I most wanted published! The programming world definitely needs to hear this stuff.

I am a component specialist, tasked with creating and maintaining telecom systems (running with COM on Windows NT) that must run 365 days a year. The lack of versioning in Windows systems in general, and COM in particular, is a diabolical crime of omission, and I've put a policy in place here to help solve the problem:

1. All COM components created in this shop are required to have a read-only Version property.

2. Any program that uses COM components must list, along with its own version number, the version numbers and names of each and every component used in the program. (This typically goes in an About box for GUI-based programs, or as the first line of output for command-line-based programs, or in the NT event log for an NT service.)

3. There is a Version Tracker program that loads each component, queries it for Version, and reports the results.

4. That version tracking report must be included in the source code control system, as well as in a notification email sent to all developers.

5. Anyone who modifies a component in any way must increment the (hard-coded) version number before recompiling the component.

Obviously, this applies to other "component-like" systems as well. Has anyone else realized that it is impossible to automatically maintain version control of stored procedures (supposedly the only way to build RDBMS-based systems) on a SQL Server? Solution: every stored procedure has an OUTPUT parameter called — you guessed it — Version. Each procedure also has an input BIT parameter that, if set, instructs the procedure to report the version number and return, without executing any other instructions. There you have it — it is now trivial to construct a version tracker for those stored procedures.

Thanks,

John F. Hubbard
Magellan Network Systems
Kailua, HI
johnfh@aloha.net

Hello Marc,

I'd like to address shortly a few points you made in your October forum about components.

While I agree with all your wishes about the semantics of components, I have to differ with your feelings about DLLs and how components are handled on Win32 platforms in general.

Your column seems to imply that Windows is handling components poorly because you (like most of us) had sometimes the misfortune of breaking perfectly working software after installing new components. I think you are mistaking buggy new versions with failure in the component model.

COM has made component programming a reality, and I am talking from the standpoint of a former Unix diehard programmer. Switching to Windows programming was quite an eye opener. While Unix is still struggling with locked-in applications, Win32 developers know the value of reuse and are leveraging on existing components every day to build new applications.

Have you ever wanted to, say, call the PNG parsing routine of Gimp? Or reuse Netscape's HTML display engine? Or add your own email formatting routine in xemacs/Gnus... in Java?

All these are pipe dreams on Unix, where each generation reinvents a wheel carefully polished by the previous one.

On Win32, it is a reality. What started as modest automation desires (how could I automate a mailing in Word?) ended up building an extremely impressive framework to enable applications that could only be dreamed of by gurus. I could go on and on about describing the power of COM but you would probably end up thinking I'm just another one of these Microsoft zealots, so I'll just stop here. Win32 has a lot of shortcomings which I have to fight every day, but COM is certainly not one of them, and there is a lot to be learned for component passionates in there. Just launch OLEVIEW.EXE on your Windows desktop and you will see what I mean.

I only wish Microsoft were more open and would submit the source code for COM, and maybe the very simple project ideas I suggested earlier could become a reality on Unix. Imagine... hacking your Linux kernel in Python... how is that for component programming ?

Cedric Beust
http://beust.com/cedric

You and I may disagree on the nature of the problem with Windows DLLs. I think it is due to a weakness in the underlying component model, not just buggy code. But then, I feel the same way about any component model that tries to ensure backward compatibility just through interfaces. One of the key points I tried to make in my editorial is that interfaces aren't strong enough to enforce compatibility.

That said, developers do succeed in creating components that are backwardly-compatible — most of the time. When their best efforts fail I would like to have some debugging information at my disposal; and the first thing I would like to know is, "which version of component do I really have here?" I don't think that is an unreasonable request.

My editorial may have been unclear on some points, because I am not at all suggesting we throw out the COM baby with the bathwater. I agree that you can do some marvelous things with COM, and with other component models as well, even if they aren't perfect.

I would like to thank everyone who has written about this subject. However, everything I've heard so far has been about Windows. I wonder if anyone using Unix or Macs has had similar problems? —mb

Hi,

Having tailored to sorting needs the older version of quick sort that moves the pivot to the first element and then later move the pivot and equal items to the center, I appreciated the improved quick sort presented by P. J. Plauger, ("Standard C/C++: A Better Sort," CUJ, October 1999).

I have one question about the implementation in the function intro_sort, a portion of which I've included below. Can the conditions tested in the lines marked with q1 and q2 occur? It appears to me that line t1 precludes the test performed in line q1 from being true. If qm == qf is true would not the compare in line t1 step qf past qm? Likewise for q2 and t2. Have I missed something?

for (; ; qf += size) {/* partition about pivot value */ t1: for (; qf < ql && (*cmp)(qf, qm) <= 0; qf += size) ; t2: for (; qb < ql && (*cmp)(qm, ql -= size)<=0; ) ; if (ql <= qf) break; swap(qf, ql, size); q1: if (qm == qf) qm = ql; q2: else if (qm == ql) qm = qf; }

Also is there a description available for the algorithm implemented in function adjust_heap and its usage (in the code excerpt below)? I understand the desired result but do not easily see what's happening.

for (h = n / 2; 0 < h; ) adjust_heap(qb, —h, n, size,cmp); for (qe = qb + n * size; 1 < n; ) {/* pop largest item to (shrinking) end */ swap(qb, qe -= size, size); adjust_heap(qb, 0, —n, size,cmp); }

David Matta
matta@nctimes.net

Without studying the code in detail, I can report that the conditions can indeed happen during the formation of a fat pivot. I certainly got ghastly performance before I took better precautions about keeping track of the original pivot value.

Below is my latest version of template function _Adjust_heap, which contains a few judicious comments that may help you understand what's going on. — pjp

template < class _RanIt, class _Diff, class _Ty > inline void _Adjust_heap ( _RanIt _First, _Diff _Hole, _Diff _Bottom, _Ty_Val ) {//percolate _Hole to _Bottom, //then push _Val, using operator< _Diff _Top = _Hole; _Diff _Idx = 2 * _Hole + 2; for (; _Idx < _Bottom; _Idx = 2 * _Idx + 2) {// move _Hole down to // larger child if (*(_First + _Idx) < *(_First + (_Idx - 1))) —_Idx; *(_First + _Hole) = *(_First + _Idx), _Hole = _Idx; } if (_Idx == _Bottom) {// only child at bottom, // move _Hole down to it *(_First + _Hole) = *(_First + (_Bottom - 1)); _Hole = _Bottom - 1; } _Push_heap(_First, _Hole, _Top, _Val); }

Hi CUJ,

I just read the letters to the editor of the August 1999 edition, and came across a reply to an article on debugging tools. [The letter was in reference to Giovanni Bavestrelli's "Better Assertions for MFC," CUJ, June 1999.] Reading the letter, I thought that both the article author and the letter author might be interested in a C library I wrote. It's not C++, so it may not be relevant for them, but it is fault tolerant, ANSI C, and much more. You can download the source code from:

http://www.link-data.com/sourcecode.html

Regards,
Johan Lindh
johan@link-data.com

Thanks. — mb

Dear Mr. Plauger,

I have some comments on your article "Standard C/C++: Why Y2K?" which appeared in the September 1999 issue of CUJ.

1. My tests of the Standard C library (Microsoft Visual C++ version 4) to determine the last time available from the C time function give the time to be: 19 Jan 2038 03:14:08 UTC. Your article talks about a Y2.037K problem but in my opinion it should be a Y2.038K problem since the roll-over to the problem year is from 2037 to 2038 (when using years only).

2. Surely if you use the notation [1900, 2036) you mean the years 1900 to 2035 inclusive, since in my experience the ")" parenthesis means "not included". For integer series I would use [x, y] as in [1900, 2036]. The last valid complete year (for the time function) is 2037 so the interval would be [1900, 2037]. I guess that one could make a case of using ")" in: [1900, 2038) to indicate that not the whole of the last year is valid.

Incidentally the first time of the time function does not start at 1900. I have not exactly determined the time but I calculate the year to be 1902 = 1970 - 68. (Some C libraries treat negative values as before 1970 and others as an error.) So if a user enters the year 1900 then this cannot be represented in seconds using the C mktime function.

Also you may be interested that I read (I think in comp.risks), but have not confirmed, that on 9 Sept 2001 01:46:40 UTC the number of digits used to represent the time in seconds by the C time function increases from 9 to 10 digits which will cause problems if too few digits have been used in a fixed field data file to store the time.

Philip Burgess

I'll respond to your comments in the order you presented them.

1. Y2.038K is a better analogy to Y2K. I chose 2037 as the last completely safe year for 32-bit times, even at the risk of being off by one with the analogy. Systems vary just enough that I decided not to depend on those 19 days in January as a comfortable buffer.

2. I agree. The notation was simply a botch. With respect to the time function, times are nominally measured from the beginning of 1900, but I agree that they are not representable with a signed 32-bit time with zero at 1 January 1970.

3. The last problem you mention presents a small, but finite risk I suppose.

Thanks for pointing out the shortcomings in the article. — pjp