Questions and Answers

Amy Rich

Q I'm running an NFS home directory server on Solaris 8. When people have left the company, we've been doing userdel to remove their accounts. I've just discovered (because of UID reuse) that apparently the quota information stays around, though. I have two questions that deal with this topic. Is there a way to remove a user's quota information when the user is removed with userdel (or perhaps some other tool)? Also, how can I clean up the existing quota entries of users that have already been removed?

A Unfortunately, userdel will not remove a user's quota, so the only way to clear it when you delete the account is to roll your own account-removal script. Writing such a script is generally a good idea anyway, because userdel will not catch things like crontab entries, files outside of the home directory, etc. You can, though, trivially do a mass cleanup of all of the quota entries whose uids no longer have usernames assigned.

First, make sure there's a user on the file server that has no quotas. If there isn't one, set one up using /usr/sbin/edquota (nobody is often a good choice for a prototype user):

/usr/sbin/edquota nobody

fs /export/home blocks (soft = 0, hard = 0) inodes (soft = 0, hard = 0)

Second, run /usr/sbin/repquota -va to find all of the entries that begin with #<uid> (uids that no longer map to a username), and run /usr/sbin/edquota -p nobody <uid> for each one to remove the quota (that is, make it match that of the prototype user, nobody). You can wrap this in a shell script or a Perl script, which you can then run out of cron every so often if you don't get around to cleaning up quotas when you delete accounts. Try something like the following:

#!/usr/local/bin/perl

use strict;
use warnings;
use POSIX qw(WEXITSTATUS WIFSIGNALED WTERMSIG);

my $repquota = '/usr/sbin/repquota';
my $edquota = '/usr/sbin/edquota';

open (REPQUOTA, "$repquota -va |") || die "Can not open $repquota: $!\n";

while (<REPQUOTA>) {
  # Entries for uids with no username are printed as #<uid>
  if (/^\#(\d+)/) {
    # Copy the prototype user's (empty) quota onto the orphaned uid
    if (system ($edquota, "-p", "nobody", $1) != 0) {
      if ($? == -1) {
        die "Failed to execute $edquota: $!\n";
      }
      elsif (WIFSIGNALED ($?)) {
        die "$edquota terminated by signal: " . WTERMSIG($?)
           . (($? & 128) ? ", core dumped\n" : "\n");
      }
      else {
        die "$edquota exited with value " . WEXITSTATUS($?) . "\n";
      }
    }
  }
}

close REPQUOTA;
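
The same cleanup could also be sketched as a shell script for sites that would rather not depend on Perl. This is only a sketch: the function names orphan_uids and print_reset_commands are made up for illustration, and it assumes repquota prints users without a username as #<uid> in the first column, as described above. It prints the edquota commands instead of running them, so the list can be reviewed first.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical, not Solaris tools.

# Read repquota-style output on stdin and print the numeric uid of every
# entry whose first field is "#<uid>" (a uid with no username).
orphan_uids() {
  awk '$1 ~ /^#[0-9]+$/ { print substr($1, 2) }'
}

# Read uids on stdin and print (not run) one edquota command per uid.
# Drop the "echo" once the printed list looks right.
print_reset_commands() {
  while read uid; do
    echo /usr/sbin/edquota -p nobody "$uid"
  done
}

# Typical use on the file server (assumes nobody has no quota set):
#   /usr/sbin/repquota -va | orphan_uids | print_reset_commands
```

Keeping the parsing and the action in separate functions makes it easy to check what would be changed before anything is touched.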

Q Since upgrading to sendmail 8.13.6 because of the security hole, I'm noticing that a lot of df files are being left in my queue without corresponding qf files. Is there a problem with the new sendmail delivering my mail? Every time I test, all of the mail gets through okay (even if a df file is left behind).

A You don't offer a lot of details about your sendmail setup, but I suspect you're using buffered I/O (perhaps a milter?) and running into a known issue with 8.13.6. This problem will be corrected in later versions. From the release notes at:

http://www.sendmail.org/releases/8.13.6.html

(2006-04-11) If a timeout occurs, a df file can be left in the mail queue if buffered files are used. This is a regression in 8.13.6 caused by the new I/O error handling. A fix will be available in the next release, in the meantime simply remove old df files (i.e., if they are older than the maximum queue timeout and if they have no corresponding qf, Qf, or hf files).
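
The cleanup described in the release note might be sketched as follows. This is a rough illustration only: orphan_df_files is a hypothetical helper, the queue path and age in the example are assumptions, and it assumes all queue files live in a single directory (no separate df/qf subdirectories or queue groups). It only lists candidates; nothing is deleted.

```shell
#!/bin/sh
# Sketch only: orphan_df_files is a hypothetical helper, and the queue
# directory and age in the example are assumptions to adjust for your site.

# Print df files under queue directory $1 that were last modified more
# than $2 days ago (as find -mtime +N counts it) and that have no qf, Qf,
# or hf file with the same queue ID in the same directory.
orphan_df_files() {
  queue_dir=$1
  min_age_days=$2
  find "$queue_dir" -type f -name 'df*' -mtime +"$min_age_days" -print |
  while read df_file; do
    dir=`dirname "$df_file"`
    id=`basename "$df_file" | sed 's/^df//'`
    if [ ! -f "$dir/qf$id" ] && [ ! -f "$dir/Qf$id" ] && [ ! -f "$dir/hf$id" ]; then
      echo "$df_file"
    fi
  done
}

# Example: list candidates, assuming a 5-day maximum queue timeout.
#   orphan_df_files /var/spool/mqueue 5
```

Review the list before removing anything, and pick an age that matches your own queue timeout.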

Q I was reading your answer about Solaris coreadm in the March issue of Sys Admin and was wondering if there was a way to make HP/UX 11.23 split core files out by name/pid. That would be much more useful than having the core files overwrite each other all the time.

A HP-UX 11.00 and later provide the core_addpid kernel parameter, which allows you to save core files as core.<pid>. Since you're running 11i v2, run the following command to modify the kernel:

echo 'core_addpid/W 1' | adb -o -w /stand/vmunix /dev/kmem

For HP-UX 11.00 and 11i v1, you can instead run:

echo 'core_addpid/W 1' | adb -k -w /stand/vmunix /dev/kmem

Q I'm investigating using wanboot for Solaris 9 and 10, but it appears that to do a hands-off install, you need to upgrade the OBP to a version that supports wanboot. I've searched the Sunsolve patch database and come up with some hits, but it looks like the majority of our hardware is unsupported. Is there a definitive list of systems that can be upgraded to support wanboot?

A Wanboot requires a minimum of OBP version 4.17 in the 4.x OBP source tree. Newer hardware like the T1000 and T2000 series machines will already be at the correct OBP version and not require an upgrade. You can find a definitive list of patches for various hardware types at the following URL if you have a Sunsolve account:

http://sunsolve.sun.com/handbook_private/Devices/Boot_PROM/BootPROM_Sun4u.html

The patches currently listed for the newest OBP versions equal to or greater than 4.17 are:

Sun Blade 100               119235-01      OBP 4.17
Sun Blade 150               119235-01      OBP 4.17
Sun Blade 1500 (1.062GHz)   119236-01      OBP 4.17
Sun Blade 1500 (1.503GHz)   119237-02      OBP 4.17
Sun Blade 2500 (1.280GHz)   119232-02      OBP 4.17
Sun Blade 2500 (1.600GHz)   119233-01      OBP 4.17
Sun Fire V480               118322-01      OBP 4.17
Sun Fire V490               119243-02      OBP 4.18
Sun Fire V880/V880z         119244-02      OBP 4.18
Sun Fire V890 PCI I/O       119244-02      OBP 4.18
Sun Fire V210               119234-01      OBP 4.17
Sun Fire V240 / Netra 240   119234-01      OBP 4.17
Sun Fire V440 / Netra 440   118319-02      OBP 4.17
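
To check where an existing machine stands against the 4.17 minimum, something like the following could be used. It is a sketch under assumptions: obp_supports_wanboot is a hypothetical helper, and it assumes the PROM version string has the form "OBP 4.17.1 <date> <time>" (the kind of string /usr/sbin/prtconf -V reports on SPARC systems). It only accepts versions in the 4.x tree, per the requirement above.

```shell
#!/bin/sh
# Sketch only: obp_supports_wanboot is a hypothetical helper, and the
# "OBP <major>.<minor>..." string format is an assumption.

# Succeed (status 0) if the version string in $1 is 4.17 or later within
# the 4.x tree, fail (status 1) if it is older or in another tree, and
# return 2 if the string is not in the expected format.
obp_supports_wanboot() {
  major=`echo "$1" | sed -n 's/^OBP \([0-9][0-9]*\)\.[0-9][0-9]*.*/\1/p'`
  minor=`echo "$1" | sed -n 's/^OBP [0-9][0-9]*\.\([0-9][0-9]*\).*/\1/p'`
  [ -n "$major" ] && [ -n "$minor" ] || return 2
  [ "$major" -eq 4 ] && [ "$minor" -ge 17 ]
}

# Example on a running system:
#   obp_supports_wanboot "`/usr/sbin/prtconf -V`" && echo "OBP is new enough"
```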

If your systems are too old to support wanboot in the OBP, you'll need a CDROM drive on your machine. See:

http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh7b?a=view

for information on performing a secure Solaris 10 wanboot from the local CDROM media. In this case, the client uses the wanboot program from the CDROM rather than from the server:

  boot cdrom -o prompt -F wanboot - install

Q I have two files of hostnames, where one is a complete list of all our machines and the other is a list of machines that have already been upgraded. I want to wind up with a list of machines that still need upgrading. What's the best way to accomplish this with the tools that come with the system? I'm not allowed to install any third-party tools, so this needs to be something pretty basic.

A You don't say what kind of system you have, so I won't assume anything fancy like perl. If your two lists are sorted and don't contain duplicate entries (you can get there by using sort -u on both files if order isn't important), you can use the comm command. It takes two files as arguments and prints out three columns of output: lines only in the first file, lines only in the second file, and lines in both files. You can pass the options -1, -2, and/or -3 to comm to suppress the corresponding columns.

Let's say all-hosts contains all of your hostnames and upgraded-hosts contains the ones you've already upgraded:

comm -23 all-hosts upgraded-hosts > remaining-hosts
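
If the lists are not yet sorted, the sort -u step and the comm call can be combined. The following is a minimal sketch: remaining_hosts is a hypothetical helper name, and the temporary file names based on the shell's process ID are a simplification, not a hardened approach. LC_ALL=C is set so that sort and comm agree on the collating order.

```shell
#!/bin/sh
# Sketch only: remaining_hosts is a hypothetical helper name.

# Print the hosts listed in file $1 that are not listed in file $2.
# Both lists are sorted and de-duplicated into temporary copies first,
# so the original files are left untouched.
remaining_hosts() {
  all_sorted=/tmp/all-hosts.$$
  done_sorted=/tmp/upgraded-hosts.$$
  LC_ALL=C sort -u "$1" > "$all_sorted"
  LC_ALL=C sort -u "$2" > "$done_sorted"
  # -2 drops lines only in the second file, -3 drops lines in both
  LC_ALL=C comm -23 "$all_sorted" "$done_sorted"
  rm -f "$all_sorted" "$done_sorted"
}

# Example:
#   remaining_hosts all-hosts upgraded-hosts > remaining-hosts
```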

If your list of hosts isn't sorted, you can use grep to accomplish the same thing. Make sure that your version of grep supports all of the following flags. If you're using Solaris, for example, you'll want to run /usr/xpg4/bin/grep instead of /bin/grep since /bin/grep does not support the -x flag.

-x: All characters in the input line must be used to match the pattern.

-v: Invert the matching logic and only print lines that do NOT match
    the pattern.

-F: Interpret each pattern, separated by the newline character, as a 
    fixed string instead of a regular expression.

-f pattern_file: Read patterns from pattern_file, each separated by 
   the newline character.

The command-line syntax you'll use is:

grep -xvFf upgraded-hosts all-hosts > remaining-hosts
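
To see why the -x flag matters here, consider a small worked case. The wrapper name hosts_not_in and the host names are made up for illustration; the function body is just the grep command above.

```shell
#!/bin/sh
# Sketch only: hosts_not_in is a hypothetical wrapper around the grep
# command shown above. On Solaris, substitute /usr/xpg4/bin/grep.

# Print the lines of file $1 that do not appear, as whole lines, in
# file $2. Input order is preserved and neither file needs to be sorted.
hosts_not_in() {
  grep -xvFf "$2" "$1"
}

# With all-hosts containing web10, web1, and db1, and upgraded-hosts
# containing only web1:
#   hosts_not_in all-hosts upgraded-hosts   prints web10 and db1
# Without -x, the fixed string "web1" would also match inside "web10",
# so web10 would wrongly be treated as already upgraded.
```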

Q I'm running an Apache 1.3.33 Web server on five identically configured Fedora Core 4 systems. Apache is started at boot time via the script located at /etc/rc.d/rc3.d/S90apache (also /etc/rc.d/init.d/apache), which calls the /usr/local/apache/bin/apachectl script distributed with Apache. We use cronolog to handle the log messages for error_log and access_log in httpd.conf as follows (trimmed to a few hosts for the sake of brevity):

Listen 80
NameVirtualHost *:80

<VirtualHost *:80>
  DocumentRoot /data/www.my.domain
  ServerName www.my.domain
  ErrorLog "|/usr/local/sbin/cronolog \
    --symlink=/var/log/apache/error_log \
    /var/log/apache/%Y-%m/error_log"
  CustomLog "|/usr/local/sbin/cronolog \
    --symlink=/var/log/apache/access_log \
    /var/log/apache/%Y-%m/access_log.%Y-%m-%d" combined
</VirtualHost>

<VirtualHost *:80>
  DocumentRoot /data/www.hosta.domain
  ServerName www.hosta.domain
  ErrorLog "|/usr/local/sbin/cronolog \
    --symlink=/var/log/apache/hosta-error_log \
    /var/log/apache/%Y-%m/hosta-error_log"
  CustomLog "|/usr/local/sbin/cronolog \
    --symlink=/var/log/apache/hosta-access_log \
    /var/log/apache/%Y-%m/hosta-access_log.%Y-%m-%d" combined
</VirtualHost>

<VirtualHost *:80>
  DocumentRoot /data/www.hostb.domain
  ServerName www.hostb.domain
  ErrorLog "|/usr/local/sbin/cronolog \
    --symlink=/var/log/apache/hostb-error_log \
    /var/log/apache/%Y-%m/hostb-error_log"
  CustomLog "|/usr/local/sbin/cronolog \
    --symlink=/var/log/apache/hostb-access_log \
    /var/log/apache/%Y-%m/hostb-access_log.%Y-%m-%d" combined
</VirtualHost>

<IfModule mod_ssl.c>
  Listen 443
  <VirtualHost *:443>
    SSLEngine on
    SSLCertificateFile      conf/ssl.crt/www.host.domain.crt
    SSLCertificateKeyFile   conf/ssl.key/www.host.domain.key
    DocumentRoot    /data/www.my.domain
    ErrorLog        "|/usr/local/sbin/cronolog \
      --symlink=/var/log/apache/error_log \
      /var/log/apache/%Y-%m/ssl_error_log"
    CustomLog       "|/usr/local/sbin/cronolog \
      --symlink=/var/log/apache/access_log \
      /var/log/apache/%Y-%m/ssl_access_log.%Y-%m-%d" combined
  </VirtualHost>
</IfModule>

On four of the five machines, the log files were being created with cronolog with ownership of root:other and permissions of 0644. On the fifth box, we noticed a number of log files were all created with the same ownerships, but the permissions 0600. When we restarted the errant box, the log files were created with the correct permissions again. Within a month, though, we had more 0600 permissioned log files on one of the other machines. We rebooted this second machine, and it also started creating log files with the correct permissions again.

This is an issue for us since the log files need to be world readable so another process can do statistic reporting. It's also troublesome, since I'm not sure what's causing the change in permissions. Where should I start looking?

A If the machines are truly identical, including init scripts, cron jobs, etc., then the cause of the discrepancy must be something that's being run by a human and not an automated process being run by the machine. The difference in the permissions on the log files generally points towards a umask issue. Perhaps one of your admins is restarting Apache with an environment different from the one that the operating system has when it boots. If you catch another one of your machines creating log files with the wrong permissions, you may want to use a small script to dump the environment that Apache is running with. It might be illuminating if there are differences.

Make sure that people are starting/restarting Apache in the same way all of the time and that their environment is not being carried over by something like sudo. To ensure that the correct environment is being passed to Apache when it starts, you might want to explicitly set things like the PATH and umask in /etc/rc.d/rc3.d/S90apache and /etc/rc.d/init.d/apache.
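
Pinning the environment in the init script might look something like this. It is a sketch, not a drop-in replacement for your script: start_with_known_env is a hypothetical wrapper, and the PATH value is an assumption. Because cronolog is started by Apache, it inherits whatever umask Apache was started with, so a umask of 022 here should yield 0644 log files.

```shell
#!/bin/sh
# Sketch only: start_with_known_env is a hypothetical wrapper, and the
# PATH value is an assumption; adjust both to match your init script.

# Run the given command with a fixed umask and PATH, regardless of what
# the calling shell (or sudo) passed along. The subshell keeps the
# caller's own umask and PATH unchanged.
start_with_known_env() {
  (
    umask 022
    PATH=/sbin:/usr/sbin:/bin:/usr/bin
    export PATH
    "$@"
  )
}

# In the init script, the start case might then call:
#   start_with_known_env /usr/local/apache/bin/apachectl start
```

An admin whose login shell uses umask 077 would otherwise pass that on to Apache, and new log files would come out as 0600.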

There is also always the chance that one or more of your machines has been compromised, so you may want to boot from CDROM and make a full security sweep of all your machines if you think this might be the case.

Q I'm trying to build sendmail 8.13.6 from source on Solaris 8. I'm following the documentation and running the Build script, but it dies with the following:

gcc -o sendmail main.o alias.o arpadate.o bf.o collect.o conf.o
control.o convtime.o daemon.o deliver.o domain.o envelope.o err.o
headers.o macro.o map.o mci.o milter.o mime.o parseaddr.o queue.o
ratectrl.o readcf.o recipient.o sasl.o savemail.o sfsasl.o shmticklib.o
sm_resolve.o srvrsmtp.o stab.o stats.o sysexits.o timers.o tls.o trace.o
udb.o usersmtp.o util.o version.o
/usr/local/src/Security/sendmail-8.13.6/obj.SunOS.5.8.sun4/ \
  libsmutil/libsmutil.a
/usr/local/src/Security/sendmail-8.13.6/obj.SunOS.5.8.sun4/libsm/libsm.a
-lresolv -lsocket -lnsl
Undefined                       first referenced
symbol                             in file
dbm_pagfno                          map.o
dbm_dirfno                          map.o
ld: fatal: Symbol referencing errors. No output written to sendmail
collect2: ld returned 1 exit status
*** Error code 1
make: Fatal error: Command failed for target 'sendmail'

A This is generally indicative of having replaced the stock Solaris /usr/include/ndbm.h or having a conflicting ndbm.h file in your include path somewhere. You've probably installed Berkeley DB, and it's using the header file from that instead. You can temporarily rename the conflicting file (or remove it if you're not using whatever package it's attached to) and then rebuild the sendmail source using ./Build fresh.
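
One way to track down the conflicting header is to list every ndbm.h in the directories your compiler might search. This is a minimal sketch: find_ndbm_headers is a hypothetical helper, and the directories in the example are assumptions about where headers may live on your system.

```shell
#!/bin/sh
# Sketch only: find_ndbm_headers is a hypothetical helper, and the
# directory list in the example is an assumption.

# Print every ndbm.h found under the given directories, skipping any
# directory that does not exist.
find_ndbm_headers() {
  for dir in "$@"; do
    [ -d "$dir" ] && find "$dir" -name ndbm.h -print
  done
  return 0
}

# Example:
#   find_ndbm_headers /usr/include /usr/local/include
```

Any copy other than the stock /usr/include/ndbm.h is a candidate for the temporary rename described above.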

Amy Rich has more than a decade of Unix systems administration experience in various types of environments. Her current roles include that of Senior Systems Administrator for the University Systems Group at Tufts University, Unix systems administration consultant, author, and charter member of LOPSA. She can be reached at: qna@oceanwave.com.