Columns


Questions & Answers

Locking UNIX Files

Ken Pugh


Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C language courses for corporations. He is the author of C Language for Programmers and All On C, and is a member on the ANSI C committee. He also does custom C programming for communications, graphics, and image databases. His address is 4201 University Dr., Suite 102, Durham, NC 27707.

You may fax questions for Ken to (919) 493-4390. When you hear the answering message, press the * button on your telephone. Ken also receives email at kpugh@dukeac. ac. duke. edu (Internet) or dukeac! kpugh (UUCP).

Q

This is a UNIX question. It is a question I believe could concern every C programmer who works with Unix. Specifically, I work on an Altos 2086 using XENIX 3.4b. If I remove a file (for example using unlink() ) that someone else already has open for writing, what happens if they perform a write after I unlink the file (assuming I've removed the last link)?

Reads and writes can be performed on removed files if the remove took place after the file was open. For example, if when using the more command to read a file, I invoke a shell and remove the file I am more'ing, the more command doesn't notice any difference and keeps on reading. Does the system notice?

My big concern is the write process. For example, if I remove a log file while someone else has it open for writing, and that person or process performs a write, is the system smart enough to avoid overwriting someone else's precious data?

One other question: under MS-DOS if I open a file for reading (fopen("path","r")) and accidentally open the same path for writing (fopen("path","w")), the disk (standard 5.25") file system for that disk sustains major damage. Why can't MS-DOS prevent this? What, besides great care, can be done to avoid this? XENIX handles this problem without file system damage, although the results are never what is desired.

Vern Martin
Alliance, Ohio

A

My UNIX manual says "If one or more processes have the file open when the last link is removed, the removal is postponed until all references to the file have been closed".

If the request to delete the file is postponed till after the last close, then file writing and reading can proceed normally. Records written by one process can be read by another. Writing over other allocated disk blocks should not be a problem.

Two processes that write to the same file may have conflicts. You should use the locking() function to reserve the right to access a particular portion of the file. This lock function interface is:

int locking(file_descriptor,
   mode, size)
int file_descriptor;
int mode;      /*  0 Lock remova l*/
    /*  1 Blocking lock           */
    /*  2 Checking lock           */
long size;      /*  Number of bytes*/
Each version of UNIX has slightly different file locking characteristics. Perhaps some UNIX guru out there can answer this question in more detail.

Regarding your MS-DOS question, do not attempt to open the same file twice for writing, except in the privacy of your own home and unless you have made full backups of your entire disk and you are prepared to reformat the disk.

I can only guess that since MS-DOS was not designed as a multitasking system, it's designers assumed that a file would never be opened twice. However products like DESQview can do some checking along these lines. (I use DESQview extensively as a multitasking environment running over MS-DOS.

My old Wordstar does not have multiple buffers. When I open three or four Wordstar windows to edit multiple files, with fast keystroking I sometimes open up multiple copies of the same file.

I suppose DESQview does some checking for me since I haven't had to reformat the disk. Wordstar creates a couple of temporary files when a document is opened. With multiple copies being edited, the contents in these files may become garbage, but disk allocation (as tested with CHKDSK) does not have problems.

So it is possible to protect against this source of disk corruption — but not with straight MS-DOS. Now I have a question: does anyone know if MS-DOS v4.0 differs in its response to multiple opens?

Q

I have discovered that many short, seemingly simple programs simply do not run correctly until you insert a printf() for debugging purposes. I have noticed that the position of the printf() with respect to the suspect code also affects the outcome. Can you explain this? Thanks.

Mark Petrovic
Stillwater, OK

A

You either have the pointer blues or the compiler bug. The latter is a rather rare item, so I suspect a pointer problem.

Most probably, you are wiping out some code by writing through an uninitialized pointer reference. The addition or deletion of some printf's changes the position of the code causing the damage to occur at a more benign location.

I recommend Robert Ward's book on Debugging C as the best place to start in tracking down these errant pointer problems.

Replies

Repeat Format

In the "Q?/A!" column in The C Users Journal, Vol 8, No. 3., a question was asked about repeat format constructs in C. I devised a function in Microsoft QuickC that will take a format string constructed as you suggested and convert it to a valid format string for printf() with the repeated information. Enclosed is a listing (Listing 1) of the function, and a short main program with sample repeat format strings that demonstrate the function's ability to generate valid printf strings. The function is rather long, but I believe it handles most all of the conditions that were set forth. It would be nice if a compiler could be designed so that a program is compiled directly with the new format string, instead of generating it at runtime.

One problem that may exist in my routine is that it returns a pointer to a local string that effectively disappears upon its return (the data is still there though). It is recursive, so a static buffer wouldn't work. If you have any comments or improvements let me know.

Doug Oliver
Wichita, Kansas

Indentation Styles

I recently read your article (Q?/A!) in The C User's Journal (April 90). In the article you present several styles for the layout of braces within C programs. I also saw the similar discussion in the C Gazette (as you noted). I have a similar style to the one that you use, but with one exception; the braces are "undented" from the code by one space. For example,

/*  Sample if statement */
if (a   0)
 {
   a = 0;
   b = 1;
 }
else
 {
   a = 1;
   b = 0;
 }
This style has all of the plusses that yours has, but it has one more. Sometimes it is necessary to line up braces physically, with pen and paper. I've been using C for nine years, and maybe there will come a time when I never have to do this (but I doubt it!). However, on those rare occasions when this depressing activity becomes necessary, my indenting style helps quite a bit. One may draw lines, connecting the braces, without ever going through any code. This style evolved from an actual problem I encountered where two variable names were different only in the first character (i.e., eflag and aflag). When I drew lines to verify the blocking, they went through the first character of the code. It just happened that this hid a place where my flags were backwards. I continued to use that printout while I was debugging, but I couldn't see the first character so it took me a while to find the problem. When I did, I was a bit ticked off (mild understatement) so I decided to change my style to prevent this in the future. Since then, the problem has never re-occurred. Maybe that's the indenting style, maybe I'm a better programmer. Probably it's a bit of both. Nevertheless, the style serves me well, and is as readable and usable as any other I've come across, more so than most (in my somewhat biased opinion).

On a related note, I have also given up using tabs (except for a single initial tab) for indentation. I move my code between systems on a fairly regular basis, and it never fails to surprise me just how many ways of handling tabs and spaces there are in various editors and printers. As a result, my indenting style consists of a single tab for the first level of code, and three spaces for each following level (with braces being undented by a single space).

Anyway, I just wanted to provide you with another possible style for future consideration, and to let you know that I think your column is very good.

Brian S. Merson
Marlboro, MA

Thank you for your contribution. Your style has a lot going for it. If pretty-print programs would output it, I think I would prefer it to mine. As far as typing it originally, my fingers would probably refuse to learn it. (KP)

To improve the readability of the if () ... code it's convenient to distinguish between the case when else is present and the case when it is not present. This allows to expect or not expect else (which may appear on the next page) already when reading if () ... statement.

Therefore I use a separate line for an open brace when if () is followed with else. When else is not expected the open brace is on the same line as the controlling statement.

So, the codes look like these:

if (...)  {
   ...
   }
if (...)
   {
   ...
   }
else
   {
   ...
   }
David Finkelman
Israel

This also has some possibilities. My only objection would be that adding an else to an if, would require going back and changing the line on which the if appears. (KP)

Great Name / Obscure Code Contest

Here is a submittal (Listing 2) for the all new "Great Name/Obscure Code" contest that you mentioned in the April 1990 issue. Notice the "unique" function names used in the main program. The names of the functions convey so little meaning that the author has used printf statements to document what is going on. Since this code was headed for an imbedded controller environment, the printf's are destined to be removed, leaving an impressive mystery for the unfortunate soul that inherits this piece of "code". This should be an exciting contest and I hope this is an interesting enough entry to compete. I can't wait to read the results!!

David Sterling
Boulder, CO

Microsoft Floating Point Format

Old versions of Mbasic use a proprietary format called Microsoft Binary Format. Quick Basic 4.5 and up use IEEE format but can read/write the old format with a special set of functions. Programs can also be compiled to use the old format by compiling with option /mbf.

A couple of years ago I had to write a query-utility in C to query the database of a program that calculates salaries and taxes. No information on the MBF-format was available but I was able to reverse engineer it. The format is very similar, but not identical to IEEE.

For example MKD$ creates an eight-byte string. The first byte is a sign bit. Then follows a seven bit exponent followed by seven bytes of mantissa.

Enclosed you will find functions that read four and eight-byte floats (Listing 3) . They should be self-explanatory.

Anders Gustafsson