Date: Fri, 16 Jul 2004 20:40:25 +0100
From: Jason Thomson <jason.thomson@mintel.com>
To: freebsd-hardware@freebsd.org, freebsd-stable@freebsd.org
Cc: Alasdair Lumsden <enquiries@alivewww.com>
Subject: Reproducible FreeBSD 4.10-STABLE (Jul 7) , 3ware 7506-4 lockup.
Message-ID: <40F82F29.9040006@mintel.com>
In-Reply-To: <40E52725.1060409@mintel.com>
References: <1088701228.2638.86.camel@host-83-146-2-180.bulldogdsl.com> <20040701215131.GA83112@elvis.mu.org> <1088722694.2554.48.camel@host-83-146-2-180.bulldogdsl.com> <20040701230015.GA87635@elvis.mu.org> <1088724938.2879.17.camel@host-83-146-2-180.bulldogdsl.com> <20040701233811.GA89536@elvis.mu.org> <1088725862.2879.22.camel@host-83-146-2-180.bulldogdsl.com> <40E52725.1060409@mintel.com>

We can now reproduce the lockup we have been experiencing, but we have not been able to get a crash dump. I'm not sure whether we're doing something wrong, or whether there's some other reason it's not saving the core to the swap device.

Sometime next week we can make the server available on the internet if someone is willing and able to help us debug this. We can probably provide a serial console hookup from another machine if that would help. (We have to migrate the data off this production machine before we can make it available.)

We are very keen to resolve this problem; we have ~20 machines running FreeBSD 4.x with 7506-4 cards, and so far three of them have exhibited it. (Only one is causing problems now - we replaced disks on the other two.)

Recap of the problem:

Hardware / OS:
+ FreeBSD 4.x (various -STABLE versions from 21/01/04 until 07/07/04)
+ Dell 1600SC (UP and SMP)
+ 7506-4 cards
+ 300 / 320 GB Maxtor Maxline II hard drives (only these disks*)

* We have many machines with WD2000JB / WD2500JB that do not exhibit this problem.

To reproduce the problem on the machine in question I run this command:

# dd if=/dev/twed0s1h iseek=137510 bs=1m of=/dev/null

The card then locks up hard within 10 seconds - no further I/O succeeds, but anything that is already cached by the VM can be read / invoked.

Crash dumps are enabled. We have swap (and the dumpdev) configured on a SCSI disk in the same machine. CTRL-ALT-ESC does drop to the debugger.

ddb> panic

followed by

ddb> call boot(0)

does reboot the machine, but savecore does NOT find a kernel core dump on reboot. It is possible that we have something configured wrongly, but I can't see what it is.

Another data point: in one previous instance of this problem, we resolved the symptoms by checking the disks with Maxtor's PowerMax tools. One disk was found to have errors, and repairing / replacing that disk resolved the errors.
(However, if the disk has errors, I would expect the RAID card to deal with it!).
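
For anyone trying to locate the failing region, the offset in the dd command above can be worked out roughly as follows. This is only a sketch: it assumes 512-byte sectors, and the result is relative to the start of the twed0s1h partition, not to the whole RAID unit or to any one physical disk. Mapping it to a physical disk would also need the slice and partition offsets and the array layout, which are not given here.

```shell
#!/bin/sh
# Values taken from the dd command in this message.
iseek_blocks=137510             # iseek= value: input blocks skipped before reading
block_bytes=$((1024 * 1024))    # bs=1m: 1 MiB per block
sector_bytes=512                # ASSUMPTION: 512-byte sectors

# Offset at which dd starts reading, relative to the start of twed0s1h.
offset_bytes=$((iseek_blocks * block_bytes))
offset_sectors=$((offset_bytes / sector_bytes))

echo "byte offset into twed0s1h:   $offset_bytes"
echo "sector offset into twed0s1h: $offset_sectors"
```

The lockup occurs within about 10 seconds of reading from this point onward, so the trigger could lie some distance past the computed offset rather than exactly at it.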