Article apr2006.tar

Questions and Answers

Amy Rich

Q We're in the process of upgrading the memory in a number of our Sun V240s from 4G to 8G. Two of the machines we're upgrading keep coming up with only 6G of RAM (according to top). My guess is that a couple of the DIMMs or slots are bad. Other than systematically swapping memory in and out of the machine, how do we tell which DIMMs/sockets are bad?

A If you're running Solaris 10 or later, the predictive self-healing feature should tell you exactly which DIMM is bad, including the location silk-screened onto the motherboard next to the slot. Under earlier versions of Solaris, you might see a message during boot (which you can retrieve with dmesg) indicating which DIMM is bad.

You can verify whether the system is seeing the DIMM at all by running /usr/platform/sun4u/sbin/prtdiag -v. The pertinent sections (this one shows a full 8G) of the output are:

======================= Memory Configuration =======================
Segment Table:
---------------------------------------------------------------------
Base Address    Size  Interleave Factor  Contains
-------------------------------------------------------------------
0x0             4GB          16          BankIDs 0,1,2,3,4,5,6,7,8,9, 
                                         10,11,12,13,14,15
0x1000000000    4GB          16          BankIDs 16,17,18,19,20,21, 22,
                                         23,24,25,26,27,28,29,30,31

Bank Table:
-----------------------------------------------------------
           Physical Location
ID       ControllerID  GroupID   Size    Interleave Way
--------------------------------------------------------
0        0             0         256MB   0,1,2,3,4,5,6,7,8,9,10,11,12,
                                         13,14,15
1        0             0         256MB           
2        0             1         256MB           
3        0             1         256MB           
4        0             0         256MB           
5        0             0         256MB           
6        0             1         256MB           
7        0             1         256MB           
8        0             1         256MB           
9        0             1         256MB           
10       0             0         256MB           
11       0             0         256MB           
12       0             1         256MB           
13       0             1         256MB           
14       0             0         256MB           
15       0             0         256MB           
16       1             0         256MB   0,1,2,3,4,5,6,7,8,9,10,11,12, 
                                         13,14,15
17       1             0         256MB           
18       1             1         256MB           
19       1             1         256MB           
20       1             0         256MB           
21       1             0         256MB           
22       1             1         256MB           
23       1             1         256MB           
24       1             1         256MB           
25       1             1         256MB           
26       1             0         256MB           
27       1             0         256MB           
28       1             1         256MB           
29       1             1         256MB           
30       1             0         256MB           
31       1             0         256MB           
     
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID  Labels
--------------------------------------------------
0              0        MB/P0/B0/D0,MB/P0/B0/D1
0              1        MB/P0/B1/D0,MB/P0/B1/D1

Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID  Labels
--------------------------------------------------
1              0        MB/P1/B0/D0,MB/P1/B0/D1
1              1        MB/P1/B1/D0,MB/P1/B1/D1
Each processor controls two physical banks of two DIMMs each. To improve interleaving, each physical bank is divided into logical banks by splitting each DIMM. Each processor controls the logical banks seen in the prtdiag output above. If one or more of your physical DIMMs isn't recognized, you should see it missing from the Memory Module Groups output.

The letters in the Memory Module Groups output stand for Motherboard, Processor, Bank, and DIMM. The diagram at:

http://sunsolve.sun.com/handbook_private/ \
 Devices/System_Board/SYSBD_SunFireV2x0_CPU.html
shows the corresponding location of each DIMM slot on the motherboard. Notice that the Bank and DIMM slots are numbered lower when closer to the CPU as opposed to the outside of the board.

If you put a known good DIMM in the socket with the "dead" DIMM in it, and it's still not seen, you most likely have a bad socket and require a new motherboard.

If you also need to know vendor and part number information to order matching replacement DIMMs, you can use the service console. From the serial console of the machine, type #. to drop to the sc> prompt (you may need to log in). Once there, type showfru to see information on all of the field replaceable units, including the memory modules. One of the DIMMs might look something like the following:

FRU_PROM at MB.P0.B0.D0.SEEPROM
Timestamp: MON DEC 26 12:00:00 UTC 2005
Description: SDRAM DDR, 1024 MB
Manufacture Location: 
Vendor: Samsung
Vendor Part No: M3 12L2828ET0-CA2
The Motherboard/Processor/Bank/DIMM locations in the showfru output match those in the prtdiag output.

Q In my consulting business, I run across a lot of small companies without their own sys admin that need some sort of simple shared-disk solution. They can set up a server with lots of disk, but that's more of an administrative headache than these people need. The other route is to get something like a Netapp or high-end SAN, but those are generally way out of the small business price range. Is there a decent solution for the SOHO market other than "build it yourself?"

A This question recently came up on a mailing list that I'm on, and they discussed two lines of products fitting your description. The first was Buffalo Technology's TeraStation:

http://www.buffalotech.com/products/storage.php
They sell three different models that are internally the same technology but with different capacities: .6TB, 1.0TB, and 1.6TB.

The second was Infrant Technologies' ReadyNAS line of servers:

http://www.infrant.com/products.htm
They sell the ReadyNAS X6, 600, and rackmountable 1000S NAS appliances as well as the boards and OS to roll your own if you want to use commodity hardware. The ReadyNAS X6 and 600 are the same hardware, but the X6 makes management a bit easier for users who don't understand RAID at all. To determine which might be better for your clients, take a look at their "Which is right for me" FAQ in their support forum:

http://www.infrant.com/forum/viewtopic.php?p=7650#7650
Both company's products do CIFS/SMB, but only the ReadyNAS also does NFS, so that may be a deciding factor if your clients are UNIX-based (and don't run something like Samba) instead of Windows or Mac.

Network World did a recent story on SMB-based NAS appliances, at:

http://www.networkworld.com/reviews/2005/102405-storage-test.html
It awarded the ReadyNAS 600 the "Clear Choice Award" over Anthology Solutions' Yellow Machine, Iomega's StorCenter Pro NAS 200d with REV drive, and Netgear's Storage Central SC101.

Tom's Networking also did reviews of each of the abovementioned products:

Buffalo Technology TeraStation -- http://www.tomsnetworking.com/Reviews-200-ProdID-TERASTA.php

Infrant Technologies ReadyNAS 600 -- http://www.tomsnetworking.com/Reviews-217-ProdID-H2H5-1.php

Anthology Solution Yellow Machine -- http://www.tomsnetworking.com/Reviews-231-ProdID-P400T.php

Iomega StorCenter Pro NAS 200 -- http://www.tomsnetworking.com/Reviews-215-ProdID-NAS200D.php

Netgear Storage Central SC101 -- http://www.tomsnetworking.com/Reviews-226-ProdID-SC101.php

Q I'm trying to set up some public-facing slave nameservers for our domain. When I do a transfer of the zone, the slave is picking up the wrong zone file. It's seeing the internal zone instead of the external one (which is meant for public queries). Here's my configuration file for the master and slave servers:

On the master (192.168.6.1):

options {
  datasize default;
  stacksize default;
  coresize default;
  files unlimited;
  transfer-format one-answer;
  directory "/etc/named";
  dump-file "/etc/named/cache_dump.db";
  pid-file "/var/run/named.pid";
  statistics-file "/var/run/named.stats";
  zone-statistics yes;
  listen-on {
    127.0.0.1;
    192.168.6.1;
  };
  forwarders {
    10.1.1.1;
    10.1.1.2;
  };
  allow-transfer { 192.168.6.2; };
  allow-query { 192.168.6/24; 127.0.0.1; };
  notify yes;
  also-notify { 192.168.6.2; };
  provide-ixfr no;
};

view "internal" {
  recursion yes;
  zone "my.domain" in {
    type master;
    file "my.domain";
    allow-transfer { any; };
  };

};

view "external" {
  recursion no;
  zone "my.domain" in {
    type master;
    file "my.domain";
    allow-transfer { 192.168.6.2; };
  };
};
On the slave (192.168.6.2):

options {
  datasize default;
  stacksize default;
  coresize default;
  files unlimited;
  transfer-format one-answer;
  directory "/etc/named";
  dump-file "/etc/named/cache_dump.db";
  pid-file "/var/run/named.pid";
  statistics-file "/var/run/named.stats";
  zone-statistics yes;
  listen-on {
    127.0.0.1;
    192.168.6.2;
  };
  forwarders {
    10.1.1.1;
    10.1.1.2;
  };
  allow-notify { 192.168.6.1; };
  allow-transfer { none; };
  notify no;
  allow-query { 192.168.6/24; 127.0.0.1; };
  forward only;
  recursion yes;
  allow-recursion { 192.168.6/24; 127.0.0.1; };
  request-ixfr no;
};

};

zone "my.domain" in {
  type slave;
  masters { 192.168.6.1; };
  file "my.domain";
};
Can you tell me what I'm doing wrong?

A If you're running BIND 9.3 or later, you can use a TSIG (transaction signature) key to select the view you want. If you're running a version of BIND prior to 9.3.0, you'll need to configure both of your nameservers with additional IP addresses so you can pass transfer the zones based on IP restrictions. Take a look at the BIND FAQ for specific examples on how to implement each of these:

http://www.isc.org/sw/bind/FAQ.php
Look for the question "How can I make a server a slave for both an internal and an external view at the same time? When I tried, both views on the slave were transferred from the same view on the master." If you're unfamiliar with TSIG keys, also take a look at Nominum's BIND FAQ:

http://www.nominum.com/getOpenSourceResource.php?id=6#faq_78
Q I'm running Sun Freeware's version of Firefox 1.0 from:

<ftp://ftp.sunfreeware.com/pub/freeware/sparc/8/ \
  firefox-1.0-sparc-sun-solaris2.8.tar.gz>
under Solaris 9 09/04 s9s_u7wos_09 SPARC (from /etc/release). Occasionally, Firefox will go nuts, eat a ton of memory, then die on me. I'm pretty certain there's a memory leak in there somewhere, so I'd like to file a bug report. I'm not sure where to start or how I provide the best information. Do you have suggestions?

A To begin, I'd strongly suggest upgrading to a newer version of Firefox since you're dealing with outdated software. Version 1.5 of the source code can be obtained from:

http://releases.mozilla.org/pub/mozilla.org/firefox/releases/1.5/source/
and binaries can be obtained from:

ftp://ftp.mozilla.org/pub/mozilla.org/firefox/releases/1.5/contrib/
If you're still having issues after upgrading, then Mozilla has a Web page on how to report Firefox bugs:

<http://www.mozilla.org/support/firefox/bugs
Note that the very first thing they ask you to do is use the latest nightly build with a new profile. There's also a useful tutorial on writing good bug reports that should get you started.

If you suspect a memory leak, you can try running Firefox under libumem with mdb. For information on using these tools, take a look at Robert Benson's article titled "Identifying Memory Management Bugs Within Applications Using the libumem Library", at:

http://access1.sun.com/techarticles/libumem.html
Amy Rich has more than a decade of Unix systems administration experience in various types of environments. Her current roles include that of Senior Systems Administrator for the University Systems Group at Tufts University, Unix systems administration consultant, author, and charter member of LOPSA. She can be reached at: qna@oceanwave.com.