Date:      Mon, 24 Oct 2005 11:07:28 -0700
From:      "Vinod Kashyap" <vkashyap@amcc.com>
To:        "Dan Rue" <drue@therub.org>
Cc:        freebsd-stable@FreeBSD.org
Subject:   RE: twa kernel panic under heavy IO
Message-ID:  <2B3B2AA816369A4E87D7BE63EC9D2F26D89125@SDCEXCHANGE01.ad.amcc.com>

> -----Original Message-----
> From: Dan Rue [mailto:drue@therub.org]
> Sent: Monday, October 24, 2005 9:14 AM
> To: Vinod Kashyap
> Cc: freebsd-stable@FreeBSD.org
> Subject: Re: twa kernel panic under heavy IO
>
> On Thu, Oct 06, 2005 at 01:41:38PM -0700, Vinod Kashyap wrote:
> > > -----Original Message-----
> > > From: owner-freebsd-stable@freebsd.org
> > > [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Jung-uk Kim
> > > Sent: Thursday, October 06, 2005 1:30 PM
> > > To: freebsd-stable@FreeBSD.org
> > > Cc: Dan Rue
> > > Subject: Re: twa kernel panic under heavy IO
> > >
> > > On Thursday 06 October 2005 04:07 pm, Dan Rue wrote:
> > > > Greetings,
> > > >
> > > > I am running a 3ware 9500 SATA raid card in a 12x300GB raid 50
> > > > configuration.
> > > >
> > > > Here is dmesg identifying the controller:
> > > > 3ware device driver for 9000 series storage controllers, version:
> > > > 2.50.02.012
> > > > twa0: <3ware 9000 series Storage Controller> port 0xb800-0xb8ff mem
> > > > 0xfb800000-0xfbffffff,0xfc5ffc00-0xfc5ffcff irq 24 at device 2.0 on pci2
> > > > twa0: 12 ports, Firmware FE9X 2.06.00.009, BIOS BE9X 2.03.01.051
> > > >
> > > > I was getting occasional kernel panics in 5.4 doing high I/O type
> > > > things (typically an rsync operation).  I was told that twa was
> > > > updated in 5-STABLE, so yesterday I upgraded.  I've
> >
> > Going by the dmesg, you have a 9.1.5.2 driver and 9.2 firmware.  The
> > driver in 5-STABLE is from the 9.2 release.  So, you might not have
> > the driver upgrade done properly.  Try using the driver and firmware
> > from the same release.  If you still see problems, please contact
> > 3ware support.
>
> Sorry about that, the driver and firmware were not actually
> mismatched - I had pasted my dmesg from a previous email when
> I was running a different version of FreeBSD.
>
> ---
>
> After going around with 3ware web support, this issue has
> been concluded, but not resolved.  I tried my 3ware 9500 on
> FreeBSD 5.3, 5.4, and 5-STABLE.  With all of these versions
> of OS and driver (I never changed the driver version
> manually), I received hard lockups and reboots (though,
> interestingly, no kernel panics).
>
> 3ware had me check and troubleshoot a number of
> possibilities, until they finally decided it was a hardware
> problem and issued me a replacement card.  However, in the
> meantime, I upgraded to FreeBSD 6.0RC1 and the machine is now working
> flawlessly.  I returned the replacement card unused.
>
> I can only conclude that this means that there is a large
> (timing?) bug in the twa driver in FreeBSD 5.3/5.4/5-STABLE
> (as opposed to an isolated hardware problem with my setup).
>
> I have pasted the full conversation with 3ware on my website
> for those interested here:
> http://therub.org/9500.txt (sorry for the poor formatting)
>
> At one point, I received the following error message just
> before the machine locked up:
>
> > Oct 12 11:36:13 leopard kernel: initiate_write_filepage: already started
>
> I grepped for that error message in the FreeBSD kernel
> source, and found it in sys/ufs/ffs/ffs_softdep.c on line
> 3580.  What makes it really interesting is the comment above
> where the error is thrown:
>
> if (pagedep->pd_state & IOSTARTED) {
>         /*
>          * This can only happen if there is a driver that does not
>          * understand chaining. Here biodone will reissue the call
>          * to strategy for the incomplete buffers.
>          */
>         printf("initiate_write_filepage: already started\n");
>         return;
> }
>
> I know this is a 3ware issue.  I am posting this resolution
> response here in hopes that it may help someone else that
> hits this bug - and with the hope that publicly it will get
> the attention of the 3ware FreeBSD driver team/individual.
>

The error messages you are seeing are consistent with bad hardware.
The hardware is becoming unavailable, so the driver can no longer talk
to it.  The other message, "initiate_write_filepage...", is different,
but did you see the machine hang after it was printed?  I don't
think it's related to the hang.
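
For readers who have not looked at the quoted ffs_softdep.c fragment,
the check it performs can be sketched in user-space C roughly as
follows.  This is only an illustration: struct pagedep_sketch,
initiate_write_sketch() and the numeric value given to IOSTARTED are
made up for the example and are not the kernel's definitions, and the
step that sets the flag once a write begins is an assumption about the
surrounding code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag value, for illustration only; not the kernel's. */
#define IOSTARTED 0x0200

/* Simplified stand-in for the kernel's struct pagedep. */
struct pagedep_sketch {
	int pd_state;	/* bit flags describing the dependency's state */
};

/*
 * Returns 1 if this call marked the write as started, or 0 if a write
 * was already marked in progress, in which case only a message is
 * printed (mirroring the early return in the quoted fragment).
 */
static int
initiate_write_sketch(struct pagedep_sketch *pd)
{
	assert(pd != NULL);
	if (pd->pd_state & IOSTARTED) {
		printf("initiate_write_sketch: already started\n");
		return (0);
	}
	/* Assumed step: record that I/O on this page is now in flight. */
	pd->pd_state |= IOSTARTED;
	return (1);
}
```

Calling initiate_write_sketch() twice on the same structure without
clearing the flag in between takes the early-return path on the second
call, which corresponds to the "already started" message being logged.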

> Dan
>
--------------------------------------------------------

CONFIDENTIALITY NOTICE: This e-mail message, including any attachments,
is for the sole use of the intended recipient(s) and contains
information that is confidential and proprietary to Applied Micro
Circuits Corporation or its subsidiaries. It is to be used solely for
the purpose of furthering the parties' business relationship. All
unauthorized review, use, disclosure or distribution is prohibited. If
you are not the intended recipient, please contact the sender by reply
e-mail and destroy all copies of the original message.


