Article nov2006.tar

Questions and Answers

Amy Rich

Q I have a 15" Mac PowerBook running OS X 10.3.9. While I was using Mozilla, the laptop hung, and the only way I could get it to respond was to power it off hard. I've had this happen a few times before, and the machine comes back fine and runs an fsck in the background. This time, though, it just hung on the gray screen and never actually presented me with a login screen.

I tried booting in single-user mode by holding down Command-S. It spit out a few lines and tried to load the kernel, and the last thing it printed before hanging was:

Carbon Lazy values total size 11057 bytes!
            
I have a second boot partition on the machine, which I can boot from just fine. I can also successfully boot off the CD-ROM. In both cases, all of the data on the primary boot partition looks completely unharmed; I just can't get the machine to boot off it. Do you know how I can recover the primary boot partition without losing my data?

A I've generally seen the "Carbon Lazy values total size 11057 bytes!" message in conjunction with bad permissions or bad cache files. Unfortunately, the message is vague, and the fix isn't always the same. Here's a list of steps I'd go through, in order from least to most destructive, to try to fix the issue:

1. Boot off the CD or the other boot partition and run the Disk Utility application. Click to repair both the Disk (an fsck) and Disk Permissions (chown/chmod of various files).

2. Re-apply the 10.3.9 Updater, available from:

http://www.apple.com/downloads/macosx/apple/macosxupdate1039.html

3. Remove and recreate the cache directory /Library/Caches by hand or use a fix-it utility like Tiger Cache Cleaner:

http://www.northernsoftworks.com/tigercachecleaner.html

4. Perform an upgrade to 10.4.x on the primary partition.

5. Copy all of your data to your secondary partition and perform a fresh reinstall (of either 10.3.x or 10.4.x) on the primary partition.
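
If you do step 3 by hand while booted from the other partition, it is safer to move the old cache directory aside than to delete it. The following is only a sketch: the function name is made up for illustration, and the volume path you pass in (for example /Volumes/Primary) is an assumption about where the broken partition is mounted.

```shell
#!/bin/sh
# Move Library/Caches on a mounted (not currently booted) volume aside
# and recreate it empty. The volume root passed in is hypothetical,
# e.g. /Volumes/Primary. Nothing is deleted, so the old caches can be
# moved back if this turns out not to be the problem.
reset_caches() {
    volume_root=$1
    cache_dir="$volume_root/Library/Caches"
    backup_dir="$cache_dir.old.$$"

    if [ ! -d "$cache_dir" ]; then
        echo "no cache directory at $cache_dir" >&2
        return 1
    fi

    # Show the current owner and mode so the new directory can be
    # made to match by hand if needed.
    ls -ld "$cache_dir"

    mv "$cache_dir" "$backup_dir" || return 1
    mkdir "$cache_dir" || return 1
    echo "old caches kept in $backup_dir"
}
```

Run it as root (for example from a sudo shell) as reset_caches /Volumes/Primary, substituting the real volume name, and then try booting from the primary partition again.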

Q We recently applied the Solaris jumbo security+recommended patch cluster to our Solaris 9 machines. After activating the new boot environment (we use luupgrade to patch with minimal downtime) and rebooting, we noticed that one of our processes, smcwebserver (part of the 6130 Array management software) would no longer start. The /var/log/webconsole/console_debug_log only showed the following messages for the startup attempt:

Starting Sun Java(TM) Web Console Version 2.2.4 on Mon Jul 17 \
  16:19:00 EDT 2006 
JAVA_HOME=/opt/se6130/java/usr/j2se 
JAVAHELP_HOME=/opt/se6130/java/usr/j2se/opt 
CLASSPATH=/opt/SUNWtcatu/usr/apache/tomcat/bin/ \
  bootstrap.jar:/opt/se6130/java/usr/j2se/lib/tools.jar:/ \
  opt/se6130/java/usr/j2se/jre/lib/jsse.jar
CATALINA_BASE=/var/opt/webconsole 
CATALINA_HOME=/opt/SUNWtcatu/usr/apache/tomcat 
COM_SUN_WEB_CONSOLE_HOME=/usr/share/webconsole 
COM_SUN_WEB_CONSOLE_BASE=/var/opt/webconsole 
COM_SUN_WEB_CONSOLE_APPBASE=/var/opt/webconsole/webapps 
LD_LIBRARY_PATH=
Java options from java.options property=-server -XX:+BackgroundCompilation

Normally there are a lot of other lines about things like the HttpConnector, Apache, and WebappLoader being initialized and loaded.

After running /usr/sbin/smcwebserver under sh -x to debug the problem, I saw that it was dying on a pwd command inside of /var. Interestingly, when I do an ls of /, /var looks normal:

>ls -al /|grep var 
drwxr-xr-x  29 root         512 Jul 20 09:17 var

But when I do an ls of /var, I get the following error message:

 
>ls -al /var 
ls: /var/..: Permission denied

If I do an ls -ald or an ls -al as root, it works fine. I checked all the patch README files, but nothing looks like it should cause this sort of oddity.

Right now I'm booted off the original boot environment (which works fine) until we can figure out what the heck is going on. Is there a bad patch out there that I missed?

A Your issue doesn't stem from a patch you applied, but from two known bugs that crop up because of your patching procedure (using luupgrade). There is a long-known bug in the Solaris VFS layer in which the permissions of the underlying mount point are consulted instead of the permissions on the mounted file system itself. There's also a known bug with lucreate, in which permissions on newly created mount points are set to 700 instead of the expected 755. I presume that /var is a separate file system on your machine, and you may have others (maybe /usr or /opt) that are affected, too. Anything that uses pwd or getcwd inside one of these mount points as a non-root user (the man program, for example) will fail with a permissions error.

If you have a SunSolve account, take a look at Document ID 102176 for details:

 
http://sunsolve.sun.com/search/document.do?assetkey=1-26-102176-1
    
The workaround for this issue is to change the permissions on any mount points created by lucreate before booting the ABE by using lumount, for example:

 
# lumount ABE_name /mnt 
# chmod 755 /mnt/var 
# luumount ABE_name 
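
If several file systems are separate on your machines, the chmod step can be wrapped in a small loop. This is a sketch rather than anything Live Upgrade provides: the function name and the list of mount points passed to it are assumptions, and it only touches directories whose mode is exactly 700.

```shell
#!/bin/sh
# Reset mount point directories under a mounted ABE from mode 700 to
# 755. The first argument is where lumount mounted the ABE; the rest
# are mount point names relative to it (an assumed list; use the
# file systems that are actually separate on your machine).
fix_mount_points() {
    abe_root=$1
    shift
    for mp in "$@"; do
        dir="$abe_root/$mp"
        [ -d "$dir" ] || continue
        # Parse ls -ld so the script does not depend on a stat command.
        mode=`ls -ld "$dir" | cut -c1-10`
        if [ "$mode" = "drwx------" ]; then
            echo "fixing $dir (was $mode)"
            chmod 755 "$dir"
        fi
    done
}
```

With the function loaded in a root shell, you would run it between the lumount and luumount commands, for example as fix_mount_points /mnt var usr opt.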
    
For machines that you've already patched, you can boot off the old BE and make the changes, or, if you can't afford the downtime, you can NFS export your root filesystem (and any other place where you have mount points that have been created by lucreate) to localhost, mount it, and modify the permissions there.

If your machine is not set up to run as an NFS server, the easiest way to make the change is to first make sure that SUNWnfssr, SUNWnfssu, and SUNWnfssx are installed. Then do the following:

cp /etc/dfs/dfstab /etc/dfs/dfstab.good
echo "share -F nfs -o root=localhost /" >> /etc/dfs/dfstab
/etc/init.d/nfs.server start
mount -F nfs localhost:/ /mnt
chmod 755 /mnt/var
umount /mnt
/etc/init.d/nfs.server stop
mv /etc/dfs/dfstab.good /etc/dfs/dfstab
    
Subsequently remove any NFS packages that you added.

Q I have a bit of an odd question. I'd like to map domain names and IP addresses to actual locations for the purposes of tracking localized marketing effectiveness. How would I go about doing this?

A How to map IPs/hostnames to a physical location is actually a very common question. Whether or not your question has a usable answer is going to greatly depend on what you mean by "localized". You can map IP addresses (and therefore hostnames) to countries based on the various NIC databases by using whois information to get a sense of where the controlling ISP/domain owner might be located within a country. Let's take an example hostname/IP from my blocked spam logs:

from=<gfmjr@rfidalliancelabs.com> 
relay=[222.172.140.56]

The mail claims to be from rfidalliancelabs.com, but the connecting IP address is much more trustworthy. The same generally goes for other server logs (e.g., Web servers). Try to map the IP to a real domain name:

 
dig -x 222.172.140.56 
whois 222.172.140.56

dig doesn't return a name for the IP, but it does tell you that the SOA is ns1.apnic.net. The whois information is much more revealing; the query is redirected to the whois.apnic.net server:

inetnum:      222.172.128.0 - 222.172.255.255 
netname:      CHINANET-YN 
descr:        CHINANET yunnan province network 
descr:        China Telecom 
descr:        A12,Xin-Jie-Kou-Wai Street 
descr:        Beijing 100088 
country:      CN 
admin-c:      CH93-AP 
tech-c:       ZL48-AP 
mnt-by:       APNIC-HM 
mnt-lower:    MAINT-CHINANET-YN 
mnt-routes:   MAINT-CHINANET-YN 
status:       ALLOCATED PORTABLE 
changed:      hm-changed@apnic.net 20040621 
source:       APNIC 

person:       Chinanet Hostmaster 
nic-hdl:      CH93-AP 
e-mail:       anti-spam@ns.chinanet.cn.net 
address:      No.31 ,jingrong street,beijing 
address:      100032 
phone:        10-58501724 
fax-no:       10-58501724 
country:      CN 
changed:      lqing@chinatelecom.com.cn 20051212 
mnt-by:       MAINT-CHINANET 
source:       APNIC 

person:       zhiyong liu 
nic-hdl:      ZL48-AP 
e-mail:       hpnut@mail.yn.cninfo.net 
address:      136 beijin roadkunmingchina 
phone:        871-3360605 
fax-no:       871-3360614 
country:      CN 
changed:      hpnut@mail.yn.cninfo.net 20040426 
mnt-by:       MAINT-CHINANET-YN 
source:       APNIC 
    
So, the owner of the netblock is the Yunnan Province division of Chinanet, which appears to be headquartered in Beijing. What this doesn't take into account is that the end user may be in a different location than the ISP. Chinanet may be reselling the use of part of its IP block to another provider in the same or another country, or the user may be proxying through Chinanet for some reason.
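
If you need to do this for more than a handful of addresses, you can pull out just the fields of interest. The filter below is a rough sketch that assumes the "key: value" layout shown in the APNIC output above; other registries format their records differently, so treat the result as a hint rather than an answer. The function name is made up for illustration.

```shell
#!/bin/sh
# Read whois output on standard input and print the first country
# and netname fields found, separated by a space. Assumes the
# "key:   value" layout shown above. Needs a POSIX awk (use nawk
# on Solaris).
whois_summary() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "") }     # trim trailing whitespace
        tolower($1) == "netname" && netname == "" { netname = $2 }
        tolower($1) == "country" && country == "" { country = $2 }
        END { printf "%s %s\n", country, netname }
    '
}
```

For the address above, you would run whois 222.172.140.56 | whois_summary. Keep in mind the caveat already mentioned: this tells you about the netblock's registrant, not necessarily about the end user.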

There's also an experimental extension, LOC, to DNS for providing location information, but it's not widely used. While this isn't a practical way to map your information, it might be academically interesting. The LOC RR is an expression of latitude, longitude, and altitude along with a precision factor. Take a look at RFC1876 for more information:

http://rfc.net/rfc1876.html 
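
For the academically curious, the text form of a LOC record gives latitude and longitude as degrees, optional minutes, and optional seconds, each followed by a hemisphere letter. The sketch below converts that text to decimal degrees; the function name is an illustration only, and the altitude and precision fields are ignored.

```shell
#!/bin/sh
# Convert the position part of a LOC record's text form, read from
# standard input, to "latitude longitude" in decimal degrees.
# Minutes and seconds are optional, as RFC 1876 allows. Needs a
# POSIX awk (use nawk on Solaris).
loc_to_decimal() {
    awk 'NF {
        value = 0; scale = 1; lat = ""; lon = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[NSEW]$/) {
                # A hemisphere letter ends one coordinate.
                if ($i == "S" || $i == "W")
                    value = -value
                if ($i == "N" || $i == "S") {
                    lat = value
                } else {
                    lon = value
                    break            # ignore altitude and precision
                }
                value = 0; scale = 1
            } else {
                # Degrees first, then minutes (1/60), seconds (1/3600).
                value += $i / scale
                scale *= 60
            }
        }
        printf "%.4f %.4f\n", lat, lon
    }'
}
```

If a host publishes a LOC record, something like dig +short some.host.example LOC | loc_to_decimal would print its coordinates (some.host.example is a placeholder). Most hosts publish nothing, in which case there is no output.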

Q I'm running OS X 10.3.9 and using Fink and FinkCommander to keep track of my third-party software. Recently, I tried doing a selfupdate because I noticed some packages were woefully out of date. Every time I tried (on multiple machines), I'd get the following error:

rsync -az -v rsync://master.us.finkmirrors.net/finkinfo// \
  TIMESTAMP/sw/fink/TIMESTAMP.tmp 
rsync: failed to connect to master.us.finkmirrors.net: \
  Operation timed out 
rsync error: error in socket IO (code 10) at /SourceCache/rsync/ \
  rsync-14/rsync/clientserver.c(93) 
### execution of rsync failed, exit code 10 
Failed to fetch the timestamp file from the rsync server: 
rsync://master.us.finkmirrors.net/finkinfo/. \
  Check the error messages above. 
    
I tried using different mirrors from both the U.S. (where I'm located) and outside, but they all failed with the exact same error. I then tried switching to cvs to do my update, just in case it was an issue with rsync, but that also failed with similar errors.

Did I miss some sort of major upgrade? I looked into upgrading FinkCommander, too, but that, as suspected, didn't seem to help. Can you tell me how to resync my source tree so I can update packages again?

A Your rsync command might be failing because of access issues on your end. Do you allow rsync connections through your firewall? The same goes for cvs. There's one other thing to be aware of if you haven't updated in a while, though. On March 30th, anonymous access to the Fink cvs repository at sourceforge.net stopped functioning correctly due to issues with (and later restructuring of) the SourceForge cvs service. Because the names of the servers changed, there was no way to get out new information via a selfupdate. To load the new server names into your configuration and reactivate cvs updates, you can download the fink-mirrors-0.24.15.2.tar.gz package file from the Fink Web site:

http://sourceforge.net/project/showfiles.php?group_id=17203&package_id=69685 
    
Once you retrieve that file, do the following to correct your list of mirrors:

tar zxf fink-mirrors-0.24.15.2.tar.gz
cd fink-mirrors-0.24.15.2 
sudo ./inject.pl

After the package installation is complete, you'll be queried about your mirrors:

Mirror selection 

The list of possible mirrors in fink has been updated. Do you want 
to review and change your choices? [y/N] y

You can then select new mirror sites for each major package archive from the lists provided. After choosing valid mirrors, you can go back and run sudo fink selfupdate using cvs, and it should work just fine. You might then need to run sudo fink scanpackages if you encounter issues with sudo apt-get update.

Q We have an old Ultra 5 running Solaris 8 and ipfilter acting as our Internet firewall/gateway. Unfortunately, the onboard Ethernet, hme0, has stopped functioning correctly. To fix this, we added another PCI card, but everything is set to use hme0 as the Internet-facing interface. Is there a way to identify the new card as hme0 instead of hme1?

A You can modify the /etc/path_to_inst file so that hme1 is actually seen as hme0, but the next time you do a reconfiguration reboot, this information will be lost. To make the change, look for the following two lines:

"/pci@1f,0/pci@1/pci@1/SUNW,hme@0,1" 1 "hme" 
"/pci@1f,0/pci@1,1/network@1,1" 0 "hme" 
    
Remove the second line and change the first line to:

"/pci@1f,0/pci@1/pci@1/SUNW,hme@0,1" 0 "hme"
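
If you'd rather not edit the file by hand, the same change can be sketched with awk. The function below is an illustration, not a Solaris tool: it writes the edited copy to standard output and leaves the original alone, so you can review the result before installing it. It assumes the two device paths are exactly the ones shown above, which may well differ on your hardware.

```shell
#!/bin/sh
# Print an edited copy of a path_to_inst file: the line for old_path
# is dropped, and the instance number on the line for new_path is set
# to 0. The input file itself is not modified. Needs a POSIX awk
# (use nawk on Solaris).
renumber_instance() {
    file=$1
    old_path=$2
    new_path=$3
    awk -v old="$old_path" -v new="$new_path" '
        $1 == "\"" old "\"" { next }      # drop the failed interface
        $1 == "\"" new "\"" { $2 = 0 }    # take over instance 0
        { print }
    ' "$file"
}
```

You might call it as renumber_instance /etc/path_to_inst '/pci@1f,0/pci@1,1/network@1,1' '/pci@1f,0/pci@1/pci@1/SUNW,hme@0,1' > /tmp/path_to_inst.new, compare the two files with diff, and keep a backup of the original before copying the new one into place.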

You might also want to consider just purchasing a used Ultra 5 (you might have to do a few tricks if you have any hostid-locked licenses) or even a new machine to take the place of your existing one. Prices for entry-level Sun hardware have come down quite a bit in the recent past, and Ultra 5s are extremely cheap.

Amy Rich has more than a decade of Unix systems administration experience in various types of environments. Her current roles include that of Senior Systems Administrator for the University Systems Group at Tufts University, Unix systems administration consultant, author, and charter member of LOPSA. She can be reached at: qna@oceanwave.com.