Date: Tue, 27 Apr 2010 14:26:14 -0600
From: Scott Long <scottl@samsco.org>
To: Andy Farkas <chuzzwassa@gmail.com>
Cc: freebsd-scsi@freebsd.org
Subject: Re: MFC of "Large set of CAM improvements" breaks I/O to Adaptec 29160 SCSI controller
Message-ID: <76C33FA5-993A-4D23-8ECB-F0913E77A677@samsco.org>
In-Reply-To: <w2hff80e6381004271320m665ae062t8bea44c799a40cbc@mail.gmail.com>
References: <E1O6ilc-0000GP-Q3@dilbert.ticketswitch.com> <4BD6F266.5080403@feral.com> <o2rff80e6381004271308l302a7173qe2dbcd4e4f038305@mail.gmail.com> <4BD74535.4060503@feral.com> <w2hff80e6381004271320m665ae062t8bea44c799a40cbc@mail.gmail.com>

On Apr 27, 2010, at 2:20 PM, Andy Farkas wrote:

> On Wed, Apr 28, 2010 at 6:12 AM, Matthew Jacob <mj@feral.com> wrote:
>
>> Does anything time out (eventually)?
>
> No. I left it sitting overnight and it was still deadlocked
> in the morning...
>

A couple of possible scenarios here:

1. A command completed with an error, that error was reported up to the
periph layer, and the periph failed to properly handle it, leading to a
lost command that eventually livelocked the VM/block layer.

2. An error happened in the transport layer, and the aic7xxx tried to
freeze the CAM queues to perform error recovery. Something broke in the
freeze/unfreeze API, so the aic7xxx was left stranded.

The more I think about it, the more likely it's case 2, since I know that
Alexander has been working in or near that code.

Scott
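
For readers unfamiliar with the failure mode in scenario 2, here is a toy model of a queue guarded by a freeze count. It is only a sketch of the general idea, not the CAM code: the class, the function, and the second freeze from "another layer" are all hypothetical names and assumptions made for illustration. It shows how one missing release leaves a command queued forever, with no timeout to rescue it, which would match the overnight hang reported above.

```python
class FrozenQueue:
    """Toy model of a command queue guarded by a freeze count (hypothetical)."""

    def __init__(self):
        self.freeze_count = 0   # greater than 0 means dispatch is held off
        self.pending = []       # commands waiting to be sent
        self.dispatched = []    # commands handed to the "hardware"

    def freeze(self):
        # Each freeze must later be matched by exactly one release.
        self.freeze_count += 1

    def release(self):
        if self.freeze_count > 0:
            self.freeze_count -= 1
        self._run()

    def submit(self, cmd):
        self.pending.append(cmd)
        self._run()

    def _run(self):
        # Commands only move while the queue is fully unfrozen.
        while self.freeze_count == 0 and self.pending:
            self.dispatched.append(self.pending.pop(0))


def simulate(releases):
    """Freeze twice, submit a command, then release `releases` times."""
    q = FrozenQueue()
    q.submit("read-0")   # goes straight through, queue is not frozen
    q.freeze()           # driver freezes the queue for error recovery
    q.freeze()           # assumed second freeze from another layer
    q.submit("read-1")   # held back while frozen
    for _ in range(releases):
        q.release()
    return q.dispatched, q.pending
```

With balanced calls, `simulate(2)` drains the queue. With one release missing, `simulate(1)` leaves `read-1` pending indefinitely, because nothing else will ever drop the count to zero.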