Date: 29 Jun 2006 22:25:18 +0200
From: "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
To: amd64@freebsd.org
Subject: Re: SMP system not running SMP
Message-ID: <wpu063n1zl.fsf@heho.labo>
In-Reply-To: <42450.192.168.0.10.1151509103.squirrel@webmail.sd73.bc.ca>
References: <74DFB78C-4710-4DD2-A3DA-222BABAECE96@khera.org>
 <E1FvEDL-000A7v-BH@dilbert.firstcallgroup.co.uk>
 <20060627230716.44120c49.kgunders@teamcool.net>
 <42450.192.168.0.10.1151509103.squirrel@webmail.sd73.bc.ca>
"UEMURA (fka. MAENAKA) Tetsuya" <maenaka@pluto.dti.ne.jp> writes: > Posted on Tue, 27 Jun 2006 15:06:51 +0100 > By default, FreeBSD couldn't start. Dumping the ahd state when probing > the da and simply stopped. So I set the SCSI BIOS to restrict the device > speed upto 80MB/s and the problem went away. After that, the machine > runs flawlessly for 8 months. I have a Tyan S2882 which I cannot get up for more than a couple of days under moderate load, and the symptoms seem related : config : - tracking -stable - 8G RAM - latest BIOS 3ware 9500S-12 with 1.1T data - RAID-1 MAXTOR ATLAS10K5_73WLS as system-disk on ahd0 - doing nothing else than some test-scripts implying fairly moderate nfs-traffic (i.e. scripts via nfs, (rarely needed) data either on NFS or raid, scripts being CPU-intensive) symptom : - systems cold-boots fine (SMP dual opteron 248) - runs OK for a couple of minutes/hours/days - then total freeze; *never* a panic in 9 months - warm reset either does not detect da0 or indeed dumps ahd state when probing it - even cold reboot sometimes has to be repeated once or twice in order to redetect correctly da0 has tried : - changed scsi-cables and termination three times : no deal - decreased device speed to 80Mhz : seems to eliminate the "minutes" part from "runs OK for a couple of minutes/hours/days" ... observations : - this week I downloaded the latest manual from tyan and came across the following jumper setting (dunno if it was in the original version or whether I overlooked it; the printed manual is at the customer's site) : "Set PCI-X Bridge A (PCI 3 & PCI 4 & SCSI7902 & BCM5704) to operate at a maximum 66MHz; Note: Due to the PCI-X specifications it will be necessary to set this bus to 66MHz if a 133/100MHz PCI-X card is added to this bus." Since I do have a 100MHz PCI-X card (3ware) I set this jumper; system up for three days now, cannot confirm right now this was the culprit but other AMD811X based systems might have the same issue. 
 - this board has dual ahd and dual bge :

   vmstat -i (I just rebooted for an upgrade -stable + linux_base) :

     irq24: bge0 ahd0                  16826          2
     irq25: bge1 ahd1                1305665        157

   network is attached to bge1, disk is on ahd0.

   Interestingly, when I provoke insane swapping, it is the "irq25:"
   process which consumes 50-90%! of cpu-time, but when I stop the
   program provoking swapping and redo vmstat -i, it indeed reports
   slightly increased irq24 activity but no noticeable change in
   irq25 activity ...

   ( I put hint.ahd.1.disabled="1" in /boot/loader.conf since I do
     not need ahd1 but that does not seem to do anything )

FYI. I can test on this box for a couple of more weeks, feel free to
contact me for more information.

Thanx, regards,

Arno

-- 
Arno J. Klaassen
SCITO S.A.
8 rue des Haies
F-75020 Paris, France
http://scito.com
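
The before/after comparison of `vmstat -i` described above can be done mechanically. What follows is only a minimal sketch, not something taken from the original message: the helper names `parse_vmstat_i` and `interrupt_delta` are hypothetical, the line layout is assumed to match the two `irq` lines quoted above (name, device list, total, rate), and the second snapshot is invented purely for illustration.

```python
import re

# Matches lines such as "irq25: bge1 ahd1   1305665   157":
# interrupt name, device list, total count, rate.
# Header and "Total" lines of vmstat -i do not match and are skipped.
LINE_RE = re.compile(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def parse_vmstat_i(text):
    """Return {irq: (devices, total, rate)} parsed from `vmstat -i` text."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            irq, devices, total, rate = m.groups()
            result[irq] = (devices.split(), int(total), int(rate))
    return result


def interrupt_delta(before, after):
    """Per-IRQ increase in the total count between two snapshots."""
    return {irq: after[irq][1] - before[irq][1]
            for irq in after if irq in before}


# First snapshot: the figures quoted in the message.
before = parse_vmstat_i(
    "irq24: bge0 ahd0 16826 2\n"
    "irq25: bge1 ahd1 1305665 157\n"
)
# Second snapshot: made-up numbers, for illustration only.
after = parse_vmstat_i(
    "irq24: bge0 ahd0 17826 2\n"
    "irq25: bge1 ahd1 1305700 157\n"
)

for irq, delta in sorted(interrupt_delta(before, after).items()):
    print(irq, "+%d" % delta)
```

In practice the two snapshots would be captured by running `vmstat -i` before and after the swapping test and feeding the saved output to `parse_vmstat_i`. Because the shared lines list both devices (bge and ahd) on one IRQ, the delta cannot tell which of the two generated the interrupts.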