Questions and Answers
Amy Rich
Q I have a
JumpStart server and a bunch of clients running Solaris 9. Recently we
expanded our JumpStart network from 192.168.1.0/24 to 192.168.0.0/23 because we were running out of addresses. When we made this
change, JumpStart stopped working all together. When I try to jump hosts
off of the new .0 network, I receive the following error (192.168.1.2 is our
JumpStart server):
TFTP: Could not send to 192.168.1.2:36547 (Host is unreachable)
A snoop of the network shows the following interaction (192.168.0.10 is
the client) during boot net - install or boot net -s:
OLD-BROADCAST -> (broadcast) RARP C Who is 0:3:ba:xx:xx:xx ?
192.168.1.252 -> 192.168.0.10 RARP R 0:3:ba:xx:xx:xx is \
192.168.0.10, 192.168.0.10
192.168.0.10 -> BROADCAST TFTP Read "C0A8000A" (octet)
192.168.1.252 -> 192.168.0.10 TFTP Data block 1 (512 bytes)
It's at this point that the TFTP error I
mentioned above is echoed on the client machine.
Just in case it's something wacky on the server
side, here are the entries in the pertinent files:
/etc/bootparams:
client.my.domain root=jsserver.my.domain:/inst/cdrom/SunOS-5.9-sparc/ \
Solaris_9/Tools/Boot install=jsserver.my.domain:/inst/cdrom/ \
SunOS-5.9-sparc boottype=:in install_config=jsserver.my.domain:/ \
inst/jumpstart rootopts=:rsize=32768 term=:vt100 tz=:US/ \
Eastern jnet=192.168.0.0;255.255.254.0 keyboard=:us \
display=:NONE monitor=:NONE pointer=:NONE \
sysid_config=jsserver.my.domain:/inst/jumpstart/sysidcfg/ \
SunOS-5.9 ns=:NONE
/etc/ethers:
0:3:ba:xx:xx:xx client.my.domain
/etc/hosts:
192.168.0.10 client.my.domain
192.168.1.2 jsserver.my.domain
/etc/netmasks:
192.168.0.0 255.255.254.0
And ifconfig for the JumpStart interface on the server shows:
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
inet 192.168.1.2 netmask fffffe00 broadcast 192.168.1.255
As far as I can tell, this should work, but if you
see any errors in my configuration, I'd be grateful if you could
point them out.
A You don't specify your exact hardware, but there was a known issue with
patch 119234-01 (bug ID 6317169),
which upgraded the firmware on the Sun V210/V240 machines to OBP 4.17.1.
Apparently this also broke JumpStarting across a supernet such as your /23.
If you have this patch installed, try rolling back to OBP 4.16.4.
I've also run into a situation where the patch was not installed (but
the machine was still at a higher firmware/OBP level). Upgrading the OBP to
4.22.11 with patch 121683-01 might also work for you, but I don't believe the
problem has actually been fixed yet with that version.
Your other option is to move away from using
bootparams/RARP all together and start using DHCP to jump your clients. You
can use the Sun DHCP server or a third-party DHCP server, like ISC's,
to store your client information.
Q I just upgraded my laptop to
Debian Unstable with the 2.6.17.1 kernel. This box sits on an internal
private network and connects to the external network via an Apple Airport
(second generation) and then a Solaris 8 machine running ipfilter 3.4.20. I'm seeing
very very odd behavior on this laptop when it tries to connect to outside
machines. I can connect to, say, www.google.com, but not www.yahoo.com. I also can't ssh out to other machines I have accounts on from that
laptop, but all other internal machines talk to the Internet just fine.
First, here is the ipfilter configuration file:
#!/sbin/ipf -f -
# internal interface: hme0 192.168.1.1
# external interface: hme1 publicIP
# Block short packets fragmented too short to be real
block in log quick all with short
# By default, block and log everything.
block in log on hme1 all
block out log on hme1 all
block out log on hme0 all
block in log on hme0 all
# Block loopback addressed packets going in/out of network interfaces
# that aren't on the loopback interface
block in log quick on hme1 from 127.0.0.0/8 to any
block in log quick on hme1 from any to 127.0.0.0/8
block in log quick on hme0 from 127.0.0.0/8 to any
block in log quick on hme0 from any to 127.0.0.0/8
# Allow packets to transverse the loopback interface
pass in quick on lo0 all
pass out quick on lo0 all
# Deny reserved addresses but don't log them.
block in quick on hme1 from 10.0.0.0/8 to any
block in quick on hme1 from 172.16.0.0/12 to any
block in quick on hme1 from 192.168.0.0/16 to any
# Allow any other internal traffic
pass in quick on hme0 from 10.1.1.0/24 to 10.1.1.0/24
pass out quick on hme0 from 10.1.1.0/24 to 10.1.1.0/24
pass out quick on hme1 from any to 192.168.100.1/24
# Allow outgoing DNS requests from .2 and .3
pass in quick on hme0 proto tcp/udp from 10.1.1.2 to any port = domain keep state
pass in quick on hme0 proto tcp/udp from 10.1.1.3 to any port = domain keep state
pass out quick on hme1 proto tcp/udp from publicIP to any port = domain keep state
pass out quick on hme1 proto tcp/udp from 0/32 to any port = domain keep state
# Allow NTP from any internal hosts to any external NTP server.
pass in quick on hme0 proto tcp/udp from 10.1.1.0/24 to any port = 123 keep state
pass out quick on hme1 proto tcp/udp from any to any port = 123 keep state
# Allow incoming mail
pass in quick on hme1 proto tcp from any to publicIP port = smtp keep state
pass in quick on hme1 proto tcp from any to 0/32 port = smtp keep state
pass out quick on hme1 proto tcp from 10.1.1.0/24 to any port = smtp keep state
# Outgoing connections: SSH, WWW, NNTP, mail, whois
pass in quick on hme0 proto tcp from 10.1.1.0/24 to any port = 22 keep state
pass out quick on hme1 proto tcp from 10.1.1.0/24 to any port = 22 keep state
pass in quick on hme0 proto tcp from 10.1.1.0/24 to any port = 80 keep state
pass out quick on hme1 proto tcp from 10.1.1.0/24 to any port = 80 keep state
pass in quick on hme0 proto tcp from 10.1.1.0/24 to any port = 443 keep state
pass out quick on hme1 proto tcp from 10.1.1.0/24 to any port = 443 keep state
pass in quick on hme0 proto tcp from 10.1.1.0/24 to any port = nntp keep state
block in quick on hme1 proto tcp from any to any port = nntp keep state
pass out quick on hme1 proto tcp from 10.1.1.0/24 to any port = nntp keep state
pass in quick on hme0 proto tcp from 10.1.1.0/24 to any port = smtp keep state
pass in quick on hme0 proto tcp from 10.1.1.0/24 to any port = whois keep state
pass out quick on hme1 proto tcp from any to any port = whois keep state
# Allow ssh from offsite
pass in quick on hme1 proto tcp from any to publicIP port = 22 keep state
pass in quick on hme1 proto tcp from any to 0/32 port = 22 keep state
# Allow ping out
pass in quick on hme0 proto icmp all keep state
pass out quick on hme1 proto icmp all keep state
# allow auth out
pass out quick on hme1 proto tcp from publicIP to any port = 113 keep state
pass out quick on hme1 proto tcp from publicIP port = 113 to any keep state
pass out quick on hme1 proto tcp from 0/32 to any port = 113 keep state
pass out quick on hme1 proto tcp from 0/32 port = 113 to any keep state
# ftp
pass out quick on hme1 proto tcp from 10.1.1.0/24 to any port = ftp keep state
pass out quick on hme0 proto tcp from any port = ftp-data to 10.1.1.0/24 \
port > 1024 keep state
pass in quick on hme1 proto tcp from 10.1.1.0/24 port = ftp-data to any \
port > 1023 keep state
pass in quick on hme0 proto tcp from 10.1.1.0/24 to any port = ftp keep state
pass in quick on hme0 proto tcp from 10.1.1.0/24 port > 1023 to any \
port > 1023 keep state
pass out quick on hme1 proto tcp from 10.1.1.0/24 port > 1023 to any \
port > 1023 keep state
# return rst for incoming auth
block return-rst in quick on hme1 proto tcp from any to any port = 113 flags S/SA
# Log these:
block return-rst in log on hme1 proto tcp from any to any flags S/SA
# * return ICMP error packets for invalid UDP packets
block return-icmp(net-unr) in proto udp all
Here's some information I've collected while trying to diagnose the problem:
1. Any connections from the Debian laptop that
originate above port 1023 match the following rules, and a connection is
successfully negotiated:
pass in quick on hme0 proto tcp from 10.1.1.0/24 port > 1023 \
to any port > 1023 keep state
pass out quick on hme1 proto tcp from 10.1.1.0/24 port > 1023 \
to any port > 1023 keep state
2. Any connections originating below port 1024 start
the connection negotiation, but the packets back to the gateway machine are
blocked. Here's an ipf log file entry (there are several for each connection, as the
remote end keeps trying to ack the packet) illustrating this for ssh:
Sep 10 20:39:59 gateway.my.domain ipmon[15972]: \
[ID 702911 local0.warning] 20:39:58.894977 2x hme1 @0:2 b \
externalIP[externalIP],22 -> 10.1.1.65,41300 PR tcp len 20 \
660 -AP IN
Checking ipfstat -hio shows that the following rules were hit:
8 pass out quick on hme0 from 10.1.1.0/24 to 10.1.1.0/24
4 block in log on hme1 from any to any
15 block in log on hme0 from any to any
15 pass in quick on hme0 from 10.1.1.0/24 to 10.1.1.0/24
1 pass in quick on hme0 proto tcp from 10.1.1.0/24 to any \
port = 22 keep state
There's obviously something odd going on
between the Debian laptop and the ipfilter firewall where state is being ignored, but I
can't figure out what. Do you have any suggestions for places to look
or more debugging information I can provide?
A Versions
of the Linux kernel starting with 2.6.17 have seen issues with using TCP
window scaling when some older firewall/NAT
software (Cisco, ipfilter, etc.) enters the mix because it increased the TCP scaling
factor if the machine had more memory available. See the following URL at kerneltrap.org for a discussion of the issue:
http://kerneltrap.org/node/6723
In short, if the other end of the connection supports
TCP window scaling (say, www.yahoo.com) with a larger scaling factor, you'll see the packet
blocked because ipfilter thinks that the packet is bigger than the receiving end will
accept. If TCP window scaling is not negotiated or is offered at a smaller
scaling factor (say www.google.com), you won't see this issue. This makes it very hard
to diagnose because it appears that some sites arbitrarily fail while
others work.
Turning off TCP window scaling will fix the problem
but will result in slower throughput on high-latency and high-bandwidth
connections. For more general information on TCP window scaling, see RFC 1323:
http://www.faqs.org/rfcs/rfc1323.html
In Linux, if tcp_window_scaling is on, the following command will return a 1 (0 if it's turned off):
cat /proc/sys/net/ipv4/tcp_window_scaling
You can turn off TCP window scaling temporarily by
replacing the 1 in that file with a 0. You can permanently turn it off by
adding the following to /etc/sysctl.conf:
net.ipv4.tcp_window_scaling=0
The better option is to upgrade your version of ipfilter to the 4.x branch,
currently 4.1.13, which is available from:
http://coombs.anu.edu.au/~avalon/ip_fil4.1.13.tar.gz
Q I've been trying to debug an issue with a new Sun V120. The box is attached to a
console server, and I don't have physical access. When I try to send it a break, it
ignores it. (I can't get to the OS, so I can't do an init 0 from there.) I've
tried dropping to the LOM and running break from there as well with no effect. If I do a poweroff and then a poweron from the LOM, it just
tries to boot from the diag device (the network). How can I get the machine
to break out of its boot loop and drop to the ok prompt?
A For
Sun V120 machines (and others with the necessary LOM/ALOM support) which
are unable to boot and where you can't break out to the ok prompt, you can drop to
the LOM prompt and run the following commands:
bootmode forth
reset
When it finishes the reset, it stops processing and
enters the Forth interpreter before trying to boot the machine. This is
akin to pressing L1-F on a physically attached keyboard. At this point, you can run the
following at the ok prompt
to turn off auto-booting:
setenv auto-boot? false
Now drop back to the LOM prompt and run the following
to return to the normal boot mode:
bootmode normal
reset
With auto-boot? disabled, you can now choose to boot off the network,
CDROM, other disks, etc. Eventually, when you fix/debug the issue, set the auto-boot? variable back to
true.
Amy Rich has more than a decade of Unix systems
administration experience in various types of environments. Her current
roles include that of Senior Systems Administrator for the University
Systems Group at Tufts University, Unix systems administration consultant,
author, and charter member of LOPSA. She can be reached at: qna@oceanwave.com.
|