
Questions and Answers

Amy Rich

Q I have a FreeBSD NFS server running 4.10-STABLE. I'm trying to mount one of its drives (/raid) on a machine running OS X 10.3.9, but the mount fails regardless of whether I try it from the GUI or the command line. The /etc/exports file on the FreeBSD machine contains the following:

/raid   -network 192.168.1.0 -mask 255.255.255.0 
            
I think all of the appropriate processes are running on the server (and I can mount /raid from other machines). Here's my /etc/rc.conf:

hostname="nfsserver.my.domain"
ifconfig_xl0="DHCP"
kern_securelevel_enable="NO"
linux_enable="YES"
lpd_enable="YES"
network_interfaces="xl0 lo0"
sendmail_enable="NO"
sshd_enable="YES"
usbd_enable="NO"
xntpd_enable="YES"
start_vinum="YES"
nfs_server_enable="YES"
nfs_reserved_port_only="YES"
    
And the accompanying process list from ps:

USER    PID %CPU %MEM  VSZ  RSS  TT  STAT STARTED   TIME     COMMAND 
root    288  0.0  0.2  572  284  ??  Is   11Jun06   0:00.01  mountd -r 
root    117  0.0  0.1  360  136  ??  S    11Jun06   5:39.01  nfsd: server (nfsd) 
root    116  0.0  0.1  360  136  ??  S    11Jun06   7:11.22  nfsd: server (nfsd) 
root    115  0.0  0.1  360  136  ??  S    11Jun06  13:21.20  nfsd: server (nfsd) 
root    114  0.0  0.1  360  136  ??  S    11Jun06  62:19.69  nfsd: server (nfsd) 
root    113  0.0  0.1  368  144  ??  Is   11Jun06   0:00.01  nfsd: master (nfsd) 
daemon  108  0.0  0.3  968  588  ??  Is   11Jun06   0:00.05  /usr/sbin/portmap 
I didn't think lockd and statd were strictly necessary, but I've also tried the mount after starting them.

To try to understand what was going on, I killed mountd and reran it in debugging mode:

mountd: getting export list 
mountd: got line #/usr/home      -network     192.168.1.0 -mask \                              
                                 255.255.255.0 
mountd: got line /raid  -network 192.168.1.0 -mask 255.255.255.0 
mountd: making new ep fs=0x3e456789,0xa89abf4 
mountd: doing opt -network 192.168.1.0 -mask 255.255.255.0 
mountd: doing opt -mask 255.255.255.0 
mountd: getting mount list 
mountd: here we go 
You can see that it's actually reading the exports file correctly. So I tried to do the mount on the OS X client:

# mount nfsserver:/raid /raid 
mount_nfs: /raid: Permission denied 
# ls -al /raid 
total 0 
drwxr-xr-x    2 root     admin          68 Jun 11 11:58 . 
drwxrwxr-t   38 root     admin        1292 Jun 11 11:58 .. 
Thinking it might be some weird permissions issue (though I don't see why it would be), I also tried chowning the directory to my own username:

# chown localuser /raid 
# mount nfsserver:/raid /raid 
mount_nfs: /raid: Permission denied 
    
In both cases, the mountd process running in debugging mode on the server reports:

mountd: mount successful 
Just to be certain there weren't any firewall issues, I ran rpcinfo and showmount on the OS X client.

# rpcinfo -p nfsserver 
   program vers proto   port 
    100000    2   tcp    111  portmapper 
    100000    2   udp    111  portmapper 
    100003    2   udp   2049  nfs 
    100003    3   udp   2049  nfs 
    100003    2   tcp   2049  nfs 
    100003    3   tcp   2049  nfs 
    100005    3   udp    684  mountd 
    100005    3   tcp    998  mountd 
    100005    1   udp    684  mountd 
    100005    1   tcp    998  mountd 


# showmount -e nfsserver 
Exports list on nfsserver: 
/raid                              192.168.1.0 
    
I've even tried rebooting both the client and the server with absolutely no luck. I know this should work, but I just can't seem to find the right knobs to twiddle. Can you point out what I'm missing, please?

A To begin with, I would always recommend running statd and lockd when you're exporting files via NFS. As for your mounting issue, in your /etc/rc.conf file, you have the following setting: nfs_reserved_port_only="YES". If you take a look in /etc/rc.network, you're triggering this bit of code:

case ${nfs_reserved_port_only} in 
[Yy][Ee][Ss]) 
  echo -n ' NFS on reserved port only=YES' 
  sysctl vfs.nfs.nfs_privport=1 >/dev/null 
  ;; 
esac 
This tells the server to accept NFS mount requests only from reserved ports (those below 1024). Unfortunately, OS X clients by default never use a reserved source port, so the server rejects the connection. You can work around this on the command line by passing -P, which tells mount_nfs to connect from a reserved port:
mount_nfs -P nfsserver:/raid /raid 
There is no way to make the Finder use a reserved source port, but you can set up a persistent mount with NetInfo. If you go the route of a persistent mount, you'll want to set the options to -s -P -b.
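For reference, a persistent NetInfo mount record would look roughly like the following. The property names here reflect the usual NetInfo mounts format, but this is a sketch from memory, not a verified listing; check the record on your own 10.3 system with nidump before relying on it.

```
# Hypothetical record under /mounts, as displayed by "nidump -r /mounts ."
{
  "name" = ( "nfsserver:/raid" );
  "dir" = ( "/raid" );
  "vfstype" = ( "nfs" );
  "opts" = ( "-s", "-P", "-b" );
}
```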

Q We're in the process of making the leap from Solaris 8 to Solaris 10. As part of this migration, we're slowly moving init scripts into SMF. There's one in particular that I still haven't managed to port yet, though, and it's that script that I have a question about. We have an init script called S99a.postreboot that's run only as part of the JumpStart process and then deleted. It sets up a number of machine-specific environment settings.

For logging purposes, we have a lot of echo statements in this init script that detail what's being changed. Unfortunately, since moving to Solaris 10, all of that output is now lost. How do I modify the init script just enough to get my output back? I don't want to make this a full-blown SMF service because it's very transient.

A SMF is designed to be fairly quiet when running legacy scripts, so all of your output is now going to /var/svc/log/milestone-<whatever level you called your init script from>. If you want output to the console as well as the log file, you can boot from the OBP with boot -m. This only works if you have one message per startup script that you want to display. If you have more than one message per script, then you can modify the echo lines in your script to redirect output to /dev/sysmsg instead of STDOUT.

Sun actually recommends that you modify your init scripts to source /lib/svc/share/smf_include.sh and then pipe your output through the smf_console function, but this still winds up at /dev/sysmsg in the end:

# smf_console 
# 
#   Use as "echo message 2>&1 | smf_console". If SMF_MSGLOG_REDIRECT is 
#   unset, message will be displayed to console. SMF_MSGLOG_REDIRECT is 
#   reserved for future use. 
# 
smf_console () { 
        /usr/bin/tee ${SMF_MSGLOG_REDIRECT:-/dev/msglog} 
} 
    
Then from the /dev/msglog man page:

DESCRIPTION 
     Output from system startup ("rc")  scripts  is  directed  to 
    /dev/msglog, which dispatches it appropriately. 
... 
NOTES 
    In the current version of Solaris, /dev/msglog is  an  alias 
    for  /dev/sysmsg.   In future versions of Solaris, writes to 
    /dev/msglog may be directed into a  more  general  logging 
    mechanism such as syslogd(1M). 
If you want to use smf_console, you need the following line at the top of your init script:

. /lib/svc/share/smf_include.sh 
Then you redirect the output as follows:

echo message 2>&1 | smf_console 
Remember that your output will still be logged to /var/svc/log/ if you need it for future reference.
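To see the redirection behavior in isolation, here is a minimal, self-contained sketch. It reuses the smf_console definition quoted above, but points SMF_MSGLOG_REDIRECT at a scratch file (a hypothetical path, used only for this demonstration) so the effect is visible without a Solaris console:

```shell
#!/bin/sh

# Copy of the smf_console function from smf_include.sh, quoted above.
smf_console () {
        /usr/bin/tee ${SMF_MSGLOG_REDIRECT:-/dev/msglog}
}

# Point the "console" at a scratch file for demonstration purposes;
# on a real system you would leave SMF_MSGLOG_REDIRECT unset so that
# messages go to /dev/msglog.
SMF_MSGLOG_REDIRECT=/tmp/smf_demo.$$
export SMF_MSGLOG_REDIRECT

# tee sends the message to STDOUT and also writes it to the file.
echo "setting up host-specific config" 2>&1 | smf_console
```

Because tee both passes the message through and copies it to the target, the same line appears on the terminal and in the scratch file.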

Q We build a lot of our own packages for Solaris 9, and we've come up with a build system that works pretty well. One thing we get bitten by a lot is programs that stealthily link against dynamic libraries without telling the builder. Often the build machines have many more libraries installed than the client machines because the libraries are needed for other software packages. But when a package containing one of these stealth-linked binaries is installed on a client system that doesn't have the necessary library package, it unsurprisingly fails to run. Since we do batch pushes of new packages, this can break many machines at once.

I was trying to think of a way to combat this so that we check the packages before they go out the door, but parsing through the code or the makefile or the configure script seems extremely error prone. It's impossible to tell what mechanism any given package will use for determining which libraries to link against. Do you know of any software programs I could use (or start with to develop something in house) that are fairly generic?

A If part of your build process includes installing files into a temporary root before packaging, you can run ldd against all of the ELF binaries you find. Not only will this catch which library packages you might be missing, but it will also pick up on programs compiled with missing elements in the LD_RUN_PATH.

In the following Bourne shell script, you'll note that I wind up sorting the data a couple times so that I'm not checking the same libraries over and over. You could instead just keep track of whether you've seen a given file path before, but this is a quick-and-dirty approach that's reasonably fast.

This script also assumes that you're calling it from a master build script, which defines and exports the following variables:

DESTDIR -- the location of the alternate install root for your package. This directory should only contain the files that you intend to package.

BUILD_DEPENDENCIES -- the packages required to build this package

DEPENDENCIES -- the packages required by this one on the client machines at run time. The packages in this list are used to create the depends file during creation of your package.
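As a concrete illustration, the master build script might export these variables like so before calling the checker. All of the paths and package names below are made up for the example:

```shell
#!/bin/sh

# Hypothetical values; substitute your own install root and package names.
DESTDIR=/var/tmp/build/pkgroot
BUILD_DEPENDENCIES="SUNWcsl SUNWzlib SUNWbtool"
DEPENDENCIES="SUNWcsl SUNWzlib"
export DESTDIR BUILD_DEPENDENCIES DEPENDENCIES
```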

Here's the script itself:

#!/bin/sh

# set up some initial variables
recompile=0
outfile="/tmp/output.$$"

# remove and recreate our temp files
rm -f ${outfile} && touch ${outfile} || exit $?
rm -f ${outfile}.pkgs && touch ${outfile}.pkgs || exit $?

# find all of the ELF files within the alternate install root
for file in `find ${DESTDIR} -type f -exec file {} \; | awk -F: \
  '/ELF/ {print $1}'`; do
  # run ldd on the ELF binaries to determine what they link against
  for reqlibs in `ldd ${file} | awk '{print $3}'`; do
    if [ "X${reqlibs}" = "X(file" ]; then
      # we have a bad LD_RUN_PATH
      echo "unknown" >> ${outfile}
    else
      echo ${reqlibs} >> ${outfile}
    fi
  done
done


# sort the output so we only determine what package a file belongs 
# to once
for reqlibs in `sort -u ${outfile}`; do
  pkgname=''
  if [ "X${reqlibs}" = "Xunknown" ]; then
    pkgname='unknown'
  else
    # always choose the last package if there's more than one, since 
    # your binary was built with files from that package
    pkgname=`grep "^${reqlibs} " /var/sadm/install/contents| awk \
      '{print $NF}'`
    if [ "X${pkgname}" = "X" ]; then
      # if we haven't found the file in /var/sadm/install/contents, 
      # it doesn't belong to a package
      pkgname='unknown'
    fi
  fi
  echo "${pkgname}" >> ${outfile}.pkgs
done

# output a "checking" message to the user
echo
echo "Using ldd to check for library dependencies"
echo

# sort again to catch packages that contain more than one library file
for reqlibs in `sort -u ${outfile}.pkgs`; do
  if [ "X${reqlibs}" = "Xunknown" ]; then
    # report an error since we have an unidentified dependency
    echo "MISSING LIBRARY!  Modify build script to include the \
      correct -R path"
    echo "or package the library that's not a member of any package."
    echo "Rebuild when you've fixed the error"
    recompile=1
  elif [ "X`echo ${BUILD_DEPENDENCIES} |grep ${reqlibs}`" = "X" ] || \
    [ "X`echo ${DEPENDENCIES} |grep ${reqlibs}`" = "X" ]; then
    # we're missing the package in either the build or run time 
    # dependencies, so echo the required package name and version 
    # to STDOUT
    echo "${reqlibs} `pkginfo -l ${reqlibs}| awk '/VERSION:/ \
      {print $2}'`"
    recompile=1
  fi
done

# get rid of our tmp files
rm -f ${outfile} ${outfile}.pkgs

if [ ${recompile} -eq 1 ]; then
  # our package has unresolved dependencies; go back and rebuild it.
  echo
  echo "   Binaries in this package are linked against libraries "
  echo "   contained in the packages listed above.  You must edit the "
  echo "   build script, adding them to both DEPENDENCIES and "
  echo "   BUILD_DEPENDENCIES."
  echo
  echo "   After you have done so, rerun the master build script to "
  echo "   build a working package."
  echo

  exit 1
fi
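The dependency check near the end of the script boils down to a grep against the space-separated package lists. This standalone snippet (with hypothetical package names) shows that membership test in isolation:

```shell
#!/bin/sh

# Hypothetical package list for demonstration.
DEPENDENCIES="SUNWcsl SUNWzlib"

check_pkg () {
  # Same membership test the build script uses: if grep finds no match,
  # the package is not declared as a dependency.
  if [ "X`echo ${DEPENDENCIES} | grep ${1}`" = "X" ]; then
    echo "${1}: missing from DEPENDENCIES"
  else
    echo "${1}: already declared"
  fi
}

check_pkg SUNWzlib   # prints "SUNWzlib: already declared"
check_pkg SUNWgzip   # prints "SUNWgzip: missing from DEPENDENCIES"
```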
Amy Rich has more than a decade of Unix systems administration experience in various types of environments. Her current roles include that of Senior Systems Administrator for the University Systems Group at Tufts University, Unix systems administration consultant, author, and charter member of LOPSA. She can be reached at: qna@oceanwave.com.