Questions and Answers

Amy Rich

Q I'm running Solaris 10 and recently got a corrupted superblock on my boot drive. I was able to boot from the DVD and fix it by running fsck, but it made me wonder exactly what information is contained in the superblock that makes it so important? Is it where the boot block is installed?

A The superblock actually comes after the boot block and contains information about the geometry and layout of the cylinder groups and various file system parameters. The superblock is created by the mkfs (usually called by a systems administrator from newfs) program and can be later modified by the tunefs (or mkfs) program.

For a thorough understanding of the information stored in the superblock, view the source code for ufs_fs.h on cvs.opensolaris.org:

http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/ \
  src/uts/common/sys/fs/ufs_fs.h

In particular, take a look at the following portions of the struct for fs. The first part includes variables that describe the layout and sizes of cylinders and blocks. I'm sure you'll recognize some of the information. For example, you were presumably able to recover your superblock because you used fsck with one of the alternate superblocks that is replicated across each individual cylinder group. The location of the superblock is defined by the below variable, fs_sblkno, and defaults to a location of 16 sectors (8192 bytes) into the file system partition:

uint32_t fs_link; /* linked list of file systems */
uint32_t fs_rolled; /* logging only: fs fully rolled */
daddr32_t fs_sblkno; /* addr of super-block in filesys */
daddr32_t fs_cblkno; /* offset of cyl-block in filesys */
daddr32_t fs_iblkno; /* offset of inode-blocks in filesys */
daddr32_t fs_dblkno; /* offset of first data after cg */
int32_t fs_cgoffset; /* cylinder group offset in cylinder */
int32_t fs_cgmask; /* used to calc mod fs_ntrak */
time32_t fs_time; /* last time written */
int32_t fs_size; /* number of blocks in fs */
int32_t fs_dsize; /* number of data blocks in fs */
int32_t fs_ncg; /* number of cylinder groups */
int32_t fs_bsize; /* size of basic blocks in fs */
int32_t fs_fsize; /* size of frag blocks in fs */
int32_t fs_frag; /* number of frags in a block in fs */
int32_t fs_magic; /* magic number */

Some information in the structure is pulled directly from the hardware, such as the number of tracks and sectors per cylinder:

#ifdef _LITTLE_ENDIAN
int32_t fs_state; /* file system state time stamp */
#else
int32_ fs_npsect; /* # sectors/track including spares */
#endif
int32_ fs_si; /* summary info state - lufs only */
int32_ fs_trackskew; /* sector 0 skew, per track */
int32_ fs_ntrak; /* tracks per cylinder */
int32_ fs_nsect; /* sectors per track */
int32_ fs_spc; /* sectors per cylinder */

You'll also recognize the configuration parameters below from the tunefs man page:

int32_t fs_minfree; /* minimum percentage of free blocks */
int32_t fs_rotdelay; /* num of ms for optimal next block */
int32_t fs_rps; /* disk revolutions per second */
int32_ fs_maxcontig; /* max number of contiguous blks */
int32_ fs_maxbpg; /* max number of blks per cyl group */
int32_ fs_optim; /* optimization preference, see below */

And you'll recognize some of the fields cleared at mount time from the information displayed during your fsck:

char fs_fmod; /* super block modified flag */
char fs_clean; /* file system state flag */
char fs_ronly; /* mounted read-only flag */
char fs_flags; /* largefiles flag, etc. */
char fs_fsmnt[MAXMNTLEN]; /* name mounted on */

Q I'm running AIX 5.3 on our systems, and it looks like they're closing idle TCP connections after half a day or so even when the remote application is sending keepalive packets. I tested the same scenario against one of my 5.2 machines, and it doesn't appear to exhibit the problem. Is there some sort of configuration switch in 5.3 that I'm not setting correctly?

A This sounds like the bug described in APAR IY89429:

  When a TCP connection goes idle, and remote end (need not be 
  AIX) sends TCP keepalives, AIX will respond to 5 keepalives 
  and stop responding after that. However, it will respond to 
  FIN when closing the connection.

  **************************************************************
  * USERS AFFECTED:
  * All users with the following filesets at these levels:
  *   bos.net.tcp.client 5.3.0.50
  *   bos.net.tcp.client 5.3.0.51
  *   bos.net.tcp.client 5.3.0.52
  **************************************************************

There's a fix available from:

http://www-1.ibm.com/support/docview.wss?uid=isg1IY89429

Q I'm trying to get automated ssh rsync working from cron from one Solaris 8 machine to another. I'm trying to rsync files from the initiating machine, machineA to the target machine, machineB. Unfortunately, I can't even seem to get an interactive session to work without prompting for the password. I've taken the $HOME/.ssh/id_rsa.pub from machineA and copied it to the $HOME/.ssh/authorized_keys file on machineB:

~/.ssh @machineA $ ls -al
drwxr-xr-x 13 webowner other 512 Feb 5 11:51 ../
drwx------ 2 webserv other 512 Feb 5 11:37 ./
-rw-r--r-- 1 webserv other 924 Mar 4 2004 known_hosts
-rw-r--r-- 1 webserv other 246 Mar 3 2004 id_rsa.pub
-rw------- 1 webserv other 883 Mar 3 2004 id_rsa
~/.ssh @machineA $ md5 id_rsa.pub
MD5 (/tmp/id_rsa.pub) = a1d0430222fdd245aede5445e103dfc5
~/.ssh @machineB $ ls -al
-rwx------ 1 webserv other 246 Feb 5 11:05 authorized_keys*
drwx------ 2 webserv other 512 Feb 5 11:01 ./
drwxr-xr-x 13 webowner other 512 Sep 27 17:07 ../
~/.ssh @machineB $ md5 authorized_keys
MD5 (/tmp/id_rsa.pub) = a1d0430222fdd245aede5445e103dfc5

When I try to ssh from machineA to machineB, though, I get the following output:

@machineB $ /usr/local/bin/ssh -v machineB
OpenSSH_3.7.1p2, SSH protocols 1.5/2.0, OpenSSL 0.9.7i 14 Oct 2005
debug1: Reading configuration data /usr/local/etc/ssh_config
debug1: Applying options for *
debug1: /usr/local/etc/ssh_config line 21: Deprecated option \
        "RhostsAuthentication"
debug1: Connecting to machineB [130.64.1.175] port 22.
debug1: Connection established.
debug1: identity file /usr/local/apache/.ssh/identity type -1
debug1: identity file /usr/local/apache/.ssh/id_rsa type 1
debug1: identity file /usr/local/apache/.ssh/id_dsa type 2
debug1: Remote protocol version 1.99, remote software version \
        OpenSSH_3.7.1p2
debug1: match: OpenSSH_3.7.1p2 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_3.7.1p2
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-cbc hmac-md5 none
debug1: kex: client->server aes128-cbc hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host 'machineB' is known and matches the RSA host key.
debug1: Found key in /usr/local/apache/.ssh/known_hosts:2
debug1: ssh_rsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,password
debug1: Next authentication method: publickey
debug1: Trying private key: /usr/local/apache/.ssh/identity
debug1: Offering public key: /usr/local/apache/.ssh/id_rsa
debug1: Authentications that can continue: publickey,password
debug1: Offering public key: /usr/local/apache/.ssh/id_dsa
debug1: Authentications that can continue: publickey,password
debug1: Next authentication method: password
webserv @machineB's password:

I can see that it's offering the publickey, but the target machine isn't accepting it. What am I doing incorrectly?

A As a security feature, OpenSSH verifies that the ownership of all directories in the chain to the $HOME/.ssh/authorized_keys file and the file itself are actually owned by the user and have the necessary permission restrictions. In your case, the home directory on both machineA and machineB are different from the owner of the $HOME/.ssh directory and the OpenSSH configuration files. You're trying to connect to machineB as webserv, so you need to change the ownership of webserv's home directory to the user webserv.

If the user were actually webowner, then you'd need to recursively change the ownership of the $HOME/.ssh directory on machineB to webowner. If it were another user all together, then you'd need to change the permissions on both the $HOME directory and recursively on the $HOME/.ssh directory.

Q I've had it with Sun's patching tools (Patch Manager, PatchPro, Sun Update Connection, smpatch, etc.) because they seem to have endless amounts of problems. Is there a better alternative, or are we stuck with these kludgey hacks they've thrown together? You'd think that a company this large would handle something as fundamental as system patching in a saner, automated fashion.

A The main thing you need from Sun is the patchdiag.xref file, and you can often build your own patch scripts around that. Many people use the Perl program Patch Check Advanced (pca), by Martin Paul, for patch management on their Solaris machines. Not only is pca easier to use, but, as long as you can get a valid patchdiag.xref file out of Sun (they have been known to release corrupted ones in the past), it's also more thorough and will find and optionally install patches that Sun's tools sometimes miss.

The pca software and accompanying man page can be obtained from:

http://www.par.univie.ac.at/solaris/pca/

You need a Sun Online Account to get access to the actual patches. You'll also want to install wget so that pca can download the patchdiag.xref file as well as the patches themselves.

When you first run pca, it'll prompt you for the location to store the patchdiag.xref file, and then it will exhaustively check the patches you have installed against the patches that are available and produce a report that looks like the following:

Download xref-file to /var/tmp/patchdiag.xref:  done
Using /var/tmp/patchdiag.xref from Feb/23/07
Host: your.host.domain (SunOS 5.8/Generic_117350-44/sparc/sun4u)
Patch  IR   CR RSB Age Synopsis
------ -- - -- --- --- --------------------------------------------
108693 07 < 28 --- 216 Solstice DiskSuite 4.2.1: Product patch
108723 01 < 02 --- 301 SunOS 5.8: lofs patch
108993 63 < 66 RS- 17  SunOS 5.8: LDAP2 client, libc, libthread \
                       and libnsl libraries patc
108820 02 < 04 --- 52  SunOS 5.8: nss_compat.so.1 patch
108823 01 < 02 --- 847 SunOS 5.8: compress/uncompress/zcat patch
108940 57 < 76 -S- 69  Motif 1.2.7 and 2.1.1: Runtime library \
                       patch for Solaris 8
108964 06 < 10 --- 297 SunOS 5.8: /usr/sbin/in.tftpd and \
                       /usr/sbin/snoop patch
108974 50 < 51 R-- 11  SunOS 5.8: dada, uata, dad, sd, ssd and \
                       scsi drivers patch
111323 -- < 01 --- 999 SunOS 5.8: /usr/xpg4/bin/more patch

The patches are sorted so that any dependencies are resolved before attempting to install a patch. The columns in the pca output contain the following information:

Column Purpose
1	The patch number
2	The installed revision of the patch
3	Whether the installed revision of the patch is less than, equal to,
    or greater than the currently available revision.
4	The currently available revision of the patch
5	The state of the patch, one of Recommended, Security, or Bad
6	The number of days since the patch was released (with a maximum of 999)
7	A short description of the patch

To download all of the missing patches, you can run the following command and it will prompt you for your SOA password:

pca -d --user=YourSOAAccount

If you also specify -i, it will install the patches as well.

There are a variety of ways to filter which patches you want to view, download, or install. At larger sites, you'll want to set up a local machine as a centralized patch repository and probably create custom patch bundles for your servers. You can set up pca as a local caching proxy for itself so that one machine automatically handles all of the necessary patch downloads and other machines do not need direct access to the Internet.

To begin, install your favorite Web server and copy pca somewhere under the DocumentRoot as pca-proxy.cgi. Make sure the directory containing pca-proxy.cgi is owned by the user that runs CGI scripts, because it will need write access to download the patchdiag.xref and patch files from Sun. Create a pca.conf file to specify your SOA information so that you can retrieve patches. The files pca-proxy.cgi and pca.conf must be readable by the user who executes CGI scripts, but ideally they should not be writable by that user.

Optionally, you can create patch cluster files to install pre-determined bundles on your servers. Like a patch_order file, this would contain one patch number per line:

108693-28
108723-02
108993-66

On each client, specify the proxy server when you tell pca where to download patches. In the following example, the centralized patch server has a file called 9-recommended-2007-03-05 that lists all of the patches we want to install on the client:

pca --localurl=http://www.my.org/patches/pca-proxy.cgi \
  -i 9-recommended-2007-03-05

If the patches do not already exist on the central patch repository, pca_proxy.cgi will retrieve them from Sun.

For more information about the power and flexibility of pca, see the documentation on the Web site and read the pca.8 man page.

Amy Rich has more than a decade of Unix systems administration experience in various types of environments. Her current roles include that of Senior Systems Administrator for the University Systems Group at Tufts University, Unix systems administration consultant, author, and charter member of LOPSA. She can be reached at: qna@oceanwave.com.