Date: Mon, 24 Oct 2005 11:13:42 -0500
From: Dan Rue <drue@therub.org>
To: Vinod Kashyap <vkashyap@amcc.com>
Cc: freebsd-stable@FreeBSD.org
Subject: Re: twa kernel panic under heavy IO
Message-ID: <20051024161342.GI38097@therub.org>
In-Reply-To: <2B3B2AA816369A4E87D7BE63EC9D2F26C621CC@SDCEXCHANGE01.ad.amcc.com>
References: <2B3B2AA816369A4E87D7BE63EC9D2F26C621CC@SDCEXCHANGE01.ad.amcc.com>

On Thu, Oct 06, 2005 at 01:41:38PM -0700, Vinod Kashyap wrote:
> > -----Original Message-----
> > From: owner-freebsd-stable@freebsd.org
> > [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Jung-uk Kim
> > Sent: Thursday, October 06, 2005 1:30 PM
> > To: freebsd-stable@FreeBSD.org
> > Cc: Dan Rue
> > Subject: Re: twa kernel panic under heavy IO
> >
> > On Thursday 06 October 2005 04:07 pm, Dan Rue wrote:
> > > Greetings,
> > >
> > > I am running a 3ware 9500 SATA raid card in a 12x300GB raid 50
> > > configuration.
> > >
> > > Here is dmesg identifying the controller:
> > > 3ware device driver for 9000 series storage controllers, version:
> > > 2.50.02.012 twa0: <3ware 9000 series Storage Controller> port
> > > 0xb800-0xb8ff mem 0xfb800000-0xfbffffff,0xfc5ffc00-0xfc5ffcff irq
> > > 24 at device 2.0 on pci2 twa0: 12 ports, Firmware FE9X 2.06.00.009,
> > > BIOS BE9X 2.03.01.051
> > >
> > > I was getting occasional kernel panics in 5.4 doing high I/O type
> > > things (typically an rsync operation). I was told that twa was
> > > updated in 5-STABLE, so yesterday I upgraded. I've
>
> Going by the dmesg, you have a 9.1.5.2 driver and 9.2 firmware. The
> driver in 5 -STABLE is from the 9.2 release. So, you might not have
> the driver upgrade done properly. Try using the driver and firmware
> from the same release. If you still see problems, please contact
> 3ware support.

Sorry about that, the driver and firmware were not actually mismatched -
I had pasted my dmesg from a previous email, written when I was running
a different version of FreeBSD.

---

After going around with 3ware web support, this issue has been
concluded, but not resolved.

I tried my 3ware 9500 on FreeBSD 5.3, 5.4, and 5-STABLE. With all of
these versions of OS and driver (I never changed the driver version
manually), I received hard lockups and reboots (though, interestingly,
no kernel panics).
3ware had me check and troubleshoot a number of possibilities, until
they finally decided it was a hardware problem and issued me a
replacement card.

However, in the meantime, I upgraded to FreeBSD 6.0RC1 and the machine
is now working flawlessly. I returned the replacement card unused.

I can only conclude that there is a large (timing?) bug in the twa
driver in FreeBSD 5.3/5.4/5-STABLE (as opposed to an isolated hardware
problem with my setup).

I have pasted the full conversation with 3ware on my website for those
interested here:
http://therub.org/9500.txt (sorry for the poor formatting)

At one point, I received the following error message just before the
machine locked up:

>Oct 12 11:36:13 leopard kernel: initiate_write_filepage: already started

I grepped for that error message in the FreeBSD kernel source, and found
it in sys/ufs/ffs/ffs_softdep.c on line 3580. What makes it really
interesting is the comment above where the error is thrown:

        if (pagedep->pd_state & IOSTARTED) {
                /*
                 * This can only happen if there is a driver that does not
                 * understand chaining. Here biodone will reissue the call
                 * to strategy for the incomplete buffers.
                 */
                printf("initiate_write_filepage: already started\n");
                return;
        }

I know this is a 3ware issue. I am posting this resolution here in
the hope that it may help someone else who hits this bug - and that,
posted publicly, it will get the attention of the 3ware FreeBSD driver
team/individual.

Dan