Questions and Answers

Amy Rich

First, thanks to Andy Bach for noticing a few typos in the January column. In the Perl scripts that created spam block lists, strict and warnings were mistakenly capitalized in their use statements, which should read:

use strict;
use warnings;

And in the second script, the my was missing for the initialization of the scp variable:

my $scp = Net::SCP->new("$desthost")
  or die "Can't open scp to $desthost: $!";

Q Now that Update 1 of Solaris 10 has been released, I tried upgrading with LiveUpgrade. I used the following command:

luupgrade -u -s /cdrom/sol_10_106_sparc/s0 -n c0t0d0s4

It chugged through reading the media and then started to process the profile. At that point, it spit out this error:

 ERROR: Installation of the packages from this media of the media
 failed; pfinstall returned these diagnostics:

 Processing profile
 ld.so.1: pfinstall: fatal: relocation error: file
     /cdrom/sol_10_106_sparc/s0/Solaris_10/Tools/Boot/usr/ \
       snadm/lib/libspmisvc.so.1:
 symbol zonecfg_set_root: referenced symbol not found

Is this a LiveUpgrade bug or have I done something incorrectly?

A This is most likely a case of missing patches rather than a LiveUpgrade bug. Every time you plan to upgrade using LiveUpgrade, you need to make sure that you've installed the necessary patches before running luupgrade. To determine what those patches are, take a look at Sun Infodoc 72099:

http://sunsolve.sun.com/searchproxy/document.do?assetkey=1-9-72099-1

To upgrade from Solaris 10 FCS, you'll need one of the following sets of patches (depending on your hardware architecture):

Solaris 10 OS SPARC

S10 FCS sparc 119254-13 or higher patchadd/patchrm patches
S10 FCS sparc 119317-01 or higher SVr4 Packaging Commands (usr) Patch
S10 FCS sparc 120900-03 or higher SUNWzoneu required patch
S10 FCS sparc 121333-02 or higher SUNWzoneu required patch
S10 FCS sparc 121430-02 or higher SUNWlur/SUNWluu required for S10
S10 FCS sparc 120235-01 or higher SUNWluzone required patches
S10 FCS sparc 121428-01 or higher SUNWluzone required patches

Solaris 10 OS x86

S10 FCS x86 119255-13 or higher patchadd/patchrm patches
S10 FCS x86 119318-01 or higher SVr4 Packaging Commands (usr) Patch
S10 FCS x86 117435-01 or higher biosdev patch for GRUB Boot
S10 FCS x86 120901-03 or higher SUNWzoneu required patch
S10 FCS x86 121334-02 or higher SUNWzoneu required patch
S10 FCS x86 121431-02 or higher SUNWlur/SUNWluu required for S10
S10 FCS x86 120236-01 or higher SUNWluzone required patches
S10 FCS x86 121429-01 or higher SUNWluzone required patches
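
One way to check a machine against this list before running luupgrade is to compare it with the output of showrev -p. The following is a minimal sketch, not a Sun-supplied tool. The name has_patch is a hypothetical helper, it assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris rather than the older /bin/sh), and it assumes each installed patch appears on a line that begins with "Patch:" followed by the patch ID. It does not account for patches that have been obsoleted by other patches:

```shell
# has_patch BASE MINREV
# Reads "showrev -p" style output on stdin. Succeeds if patch BASE is
# listed at revision MINREV or higher, fails otherwise.
# (Hypothetical helper name; assumes a POSIX shell.)
has_patch() {
    want_base=$1
    want_rev=$2
    while read -r tag id rest; do
        # Only lines of the form "Patch: 119254-13 ..." are of interest.
        [ "$tag" = "Patch:" ] || continue
        # Split the ID into its base number and revision.
        [ "${id%-*}" = "$want_base" ] || continue
        [ "${id#*-}" -ge "$want_rev" ] && return 0
    done
    return 1
}

# Possible use on the machine to be upgraded:
#   showrev -p | has_patch 119254 13 || echo "need 119254-13 or higher"
```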

Q I've changed from FreeBSD 5.4-STABLE to 6.0. I built a new kernel and a new userland, ran mergemaster, installed, etc. Everything appeared to go fine, but now I'm seeing device timeouts and UP/DOWN bounces on my Ethernet card:

Dec 10 21:17:14 machine.my.domain kernel: nve0: device timeout (51)
Dec 10 21:17:15 machine.my.domain kernel: nve0: link state changed to DOWN
Dec 10 21:17:16 machine.my.domain kernel: nve0: link state changed to UP

I've tried recompiling the kernel with everything from GENERIC in there as well, but it doesn't seem to have made any difference. Is my NIC going bad, or is it an issue with my kernel?

A There was a known issue with the nve driver that was corrected in CVS:

http://www.freebsd.org/cgi/query-pr.cgi?pr=88045

To fix your problem, you can apply the patch or you can cvsup to a newer version of the FreeBSD source. As a workaround, you can use a different Ethernet card that does not need the nve driver.

Q I'm running Apache 1.3.33 under Solaris 8, and it keeps intermittently segmentation faulting on me. I suspect that this is actually the fault of PHP (we're using mod_php as a DSO), but I'd like to get a core dump so I can verify this. I've compiled PHP 4.3.10 with debugging support, but I just can't get httpd to give me a core when it dies. How do I manage to get a core file out of this thing so I can start debugging?

A There are several steps to getting useful information about a PHP/Apache problem. The first, which you've already completed, is to compile PHP with --enable-debug. By default, Apache tries to write its core files to the location defined by ServerRoot in your httpd.conf file. If this location is not writable by the User/Group Apache runs as, you will not get a core file. You can change where your core files are saved by setting the CoreDumpDirectory directive to something like /var/tmp in your httpd.conf file.
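
To see which directory a given httpd.conf will actually use, something like the following sketch may help. The function name core_dump_dir is hypothetical, and the sketch assumes a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris) and a simple configuration with no Include files. It prints the CoreDumpDirectory value if one is set and falls back to ServerRoot otherwise:

```shell
# core_dump_dir
# Reads an httpd.conf on stdin and prints the directory where Apache
# would try to write core files: CoreDumpDirectory if present,
# otherwise ServerRoot. (Hypothetical helper; ignores Include files
# and does not resolve a relative CoreDumpDirectory.)
core_dump_dir() {
    awk '
        # Directive names are case-insensitive; values may be quoted.
        { key = tolower($1); val = $2; gsub(/"/, "", val) }
        key == "coredumpdirectory" { dump = val }
        key == "serverroot"        { root = val }
        END { print (dump != "" ? dump : root) }
    '
}

# Possible use (the path to httpd.conf is an assumption):
#   ls -ld "`core_dump_dir < /usr/local/apache/conf/httpd.conf`"
```

The ls -ld output then shows whether the User/Group that Apache runs as can write to that directory.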

Additionally, under Solaris, there is a tool called coreadm that defines what kinds of programs may generate core files and where they'll be stored on disk. From the coreadm man page:

coreadm [-g  pattern]  [-i  pattern]  [-d  option...]   [-e option...]
...

The first form shown in the synopsis can be executed only by the 
super-user and is used to configure system-wide core file options, 
including a global core file name pattern and a per-process core 
file name pattern for the init(1M) process. All such settings are 
saved in coreadm's configuration file /etc/coreadm.conf for setting 
on reboot. See init(1M).

...

A core file name pattern is a normal file system path name with 
embedded variables, specified with a leading % character, that are 
expanded from values in effect when a core file is generated by the 
operating system. The possible variables are:

%p    process-ID
%u    effective user-ID
%g    effective group-ID
%f    executable file name, up to a maximum of MAXCOMLEN characters
%n    system node name (uname -n)
%m    machine name (uname -m)
%t    decimal value of time(2)
%%    literal %

  ...

When a process is dumping core, the operating system will generate 
two possible core files, the global core file and the per-process 
core file. Both files, one or the other, or no file will be 
generated, based on the system options in effect at the time.

When generated, a global core file will be created mode 600 and 
will be owned by the super-user. Non-privileged users cannot 
examine such files.

Ordinary per-process core files are created mode 600 under the 
credentials of the process. The owner of the process can examine 
such files.

A process that is or ever has been setuid or setgid since its last 
exec(2), including a process that began life with super-user 
privileges and gave up that privilege by way of setuid(2), presents 
security issues with respect to dumping core, as it may contain 
sensitive information in its address space to which the current 
non-privileged owner of the process should not have access.  If 
setid core files are enabled, they will be created mode 600 and 
will be owned by the super-user.

OPTIONS
The following options are supported:

   -d option...
      Disable the specified core file option.  See the -e option for
      descriptions of possible options.  Multiple -e and -d options can
      be specified on the command line. Only super-users can use 
      this option. 

   -e option...
      Enable the specified core file option. Specify option as one 
      of the following:

           global: Allow core dumps using global core pattern
           process: Allow core dumps using per-process core pattern
           global-setid: Allow set-id core dumps using global core
             pattern
           proc-setid: Allow set-id core dumps using per-process core
             pattern
           log: Generate a syslog(3C) message when generation of a
             global core file is attempted.

      Multiple -e and -d options can be specified on the command
      line. Only super-users can use this option.

   -g pattern
      Set the global core file name pattern to pattern.  The pattern
      must start with a / and can contain any of the special %
      variables described in the DESCRIPTION.  Only super-users 
      can use this option.

The coreadm command with no arguments reports the current system configuration, which defaults to enabling only per-process core dumps:

  coreadm
        global core file pattern:
          init core file pattern: core
               global core dumps: disabled
          per-process core dumps: enabled
         global setid core dumps: disabled
    per-process setid core dumps: disabled
        global core dump logging: disabled

Before Solaris 10, Apache had to start as root to bind to port 80 or 443 and then drop privileges to the configured User/Group. Because such a process is treated as setid, turn on both global and setid core dumps, as well as logging, to capture all possible core dumps. Be aware that core dumps from some programs may contain sensitive information, such as user authentication data:

coreadm -e global -e global-setid -e proc-setid -e log
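
After changing the options, you may want to confirm that they took effect. The following is a rough sketch that reads the report printed by coreadm with no arguments and tests a single option. The function name coreadm_enabled is hypothetical, the sketch assumes a POSIX shell, and it assumes the report uses the "label: state" layout shown above:

```shell
# coreadm_enabled LABEL
# Reads the output of "coreadm" (no arguments) on stdin and succeeds
# if the option named LABEL, e.g. "global core dumps", is reported
# as enabled. (Hypothetical helper; assumes the report layout above.)
coreadm_enabled() {
    # Strip the leading blanks and the label, leaving only the state.
    state=$(sed -n "s/^ *$1: *//p")
    [ "$state" = "enabled" ]
}

# Possible use:
#   coreadm | coreadm_enabled 'global setid core dumps' \
#       || echo "global setid core dumps are still disabled"
```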

To save core files to a specific directory, say /var/tmp, specify the global core file name pattern:

coreadm -g '/var/tmp/core.%f.%p'

You can also set the init core pattern as follows:

coreadm -i '/var/tmp/core.%f.%p'
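
To illustrate how such a pattern turns into a file name, here is a small sketch that mimics the expansion for the two variables used above. It is only an illustration of the substitution, not how the kernel implements it. The function name expand_core_pattern is hypothetical, and the other variables from the man page (%u, %g, %n, %m, %t, %%) are deliberately not handled:

```shell
# expand_core_pattern PATTERN EXECNAME PID
# Prints PATTERN with %f replaced by EXECNAME and %p replaced by PID.
# (Hypothetical helper; handles only %f and %p.)
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s/%f/$2/g" -e "s/%p/$3/g"
}

# Possible use:
#   expand_core_pattern '/var/tmp/core.%f.%p' httpd 5111
```

With the pattern above, an httpd process with process ID 5111 yields /var/tmp/core.httpd.5111.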

You can test to make sure your settings are working by sending an httpd process an abort signal with kill:

kill -ABRT 5111

You should see an entry in /var/adm/messages (or wherever you send kern.notice notices via /etc/syslog.conf), which looks like:

  Dec 23 14:51:21 host.your.domain genunix: [ID 603404
  kern.notice] NOTICE: core_log: httpd[5111] core dumped:
  /var/tmp/core.httpd.5111

There should also be core files in /var/tmp named core and core.httpd.5111.

Q I'm installing Solaris 9 on a new Sun Fire V120 using a customized JumpStart process. As part of the JumpStart process, I set up mirroring via SVM. The machine is fine through the first few boots, but after the last reboot (when the file pushes and mirroring happen), the machine gives me this error and is unbootable except from the CDROM:

sorry, variable 'mddb_bootlist1' is not defined in the 'md' module
sorry, variable 'mddb_bootlist2' is not defined in the 'md' module
Cannot mount root on /pseudo/md@0:0,1,blk fstype ufs

I've done a fair bit of hackery with the earlier versions of SVM (SDS under Solaris 8), so I know that the stuff it's complaining about is in /etc/system. All of the entries in there look fine to me, though, and I have other machines booting just fine with the same configuration. Maybe if I provide my /etc/system you can take a look and see if you find anything amiss?

* Solstice Disk Suite should work with only half its replicas
set md:mirrored_root_flag=1

* Up our file descriptors and tcp_conn_hash_size
set rlim_fd_max=4096
set rlim_fd_cur=4096
set tcp:tcp_conn_hash_size=1024

* Begin MDD root info (do not edit)
forceload: misc/md_trans
forceload: misc/md_raid
forceload: misc/md_hotspares
forceload: misc/md_sp
forceload: misc/md_stripe
forceload: misc/md_mirror
forceload: drv/pcisch
forceload: drv/qlc
forceload: drv/fp
forceload: drv/ssd
rootdev:/pseudo/md@0:0,1,blk
* End MDD root info (do not edit)
* Begin MDD database info (do not edit)
set md:mddb_bootlist1="ssd:7:16 ssd:7:1050 ssd:7:2084 ssd:15:16 \
  ssd:15:1050"
set md:mddb_bootlist2="ssd:15:2084"
* End MDD database info (do not edit)

A I'm guessing that you've been propagating your /etc/system file from a Solaris 8 machine based on its entries. The issue you're running into is that, starting with Solaris 9, the settings for mddb_bootlist1 and mddb_bootlist2 are no longer stored in /etc/system. You'll instead find this information in /kernel/drv/md.conf in a slightly different format, which includes not only the id information but also the disk manufacturers, part numbers, and serial numbers:

# Begin MDD database info (do not edit)
mddb_bootlist1="sd:7:16:id1,sd@SFUJITSU_MAT3073N_SUN72G_WWWWWWWWWWWW____XXXXXXXXXXXX/h
sd:7:8208:id1,sd@SFUJITSU_MAT3073N_SUN72G_WWWWWWWWWWWW____XXXXXXXXXXXX/h
sd:7:16400:id1,sd@SFUJITSU_MAT3073N_SUN72G_WWWWWWWWWWWW____XXXXXXXXXXXX/h
sd:15:16:id1,sd@SFUJITSU_MAT3073N_SUN72G_YYYYYYYYYYYY____ZZZZZZZZZZZZ/h
sd:15:8208:id1,sd@SFUJITSU_MAT3073N_SUN72G_YYYYYYYYYYYY____ZZZZZZZZZZZZ/h
sd:15:16400:id1,sd@SFUJITSU_MAT3073N_SUN72G_YYYYYYYYYYYY____ZZZZZZZZZZZZ/h";
# End MDD database info (do not edit)

If you're setting up SVM using the metaroot command, the correct information should be making its way into the proper files. Since you're seeing bogus information in /etc/system, though, I'm guessing that you're either not using metaroot or you're later overwriting /etc/system with one of your finish scripts.
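
A finish script could guard against this by checking the file it is about to install. The following is a minimal sketch with a hypothetical function name, has_stale_bootlist. It simply looks for active (uncommented) mddb_bootlist settings of the kind shown in your /etc/system:

```shell
# has_stale_bootlist
# Reads an /etc/system file on stdin and succeeds if it still contains
# a "set md:mddb_bootlist..." line, which Solaris 9 no longer expects
# there. Comment lines (starting with *) are not matched.
# (Hypothetical helper name.)
has_stale_bootlist() {
    grep '^set md:mddb_bootlist' > /dev/null
}

# Possible use in a finish script, assuming the installed system is
# mounted on /a as is usual during JumpStart:
#   has_stale_bootlist < /a/etc/system \
#       && echo "WARNING: Solaris 8 style mddb_bootlist in /etc/system"
```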

Amy Rich has more than a decade of Unix systems administration experience in various types of environments. Her current roles include that of Senior Systems Administrator for the University Systems Group at Tufts University, Unix systems administration consultant, author, and charter member of LOPSA. She can be reached at: qna@oceanwave.com.