From owner-freebsd-scsi Wed Mar 26 10:13:54 1997
Return-Path:
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.5/8.8.5) id KAA03970
          for freebsd-scsi-outgoing; Wed, 26 Mar 1997 10:13:54 -0800 (PST)
Received: from pluto.plutotech.com (root@pluto.plutotech.com [206.168.67.1])
          by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA03955
          for ; Wed, 26 Mar 1997 10:13:49 -0800 (PST)
Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130])
          by pluto.plutotech.com (8.8.5/8.8.3) with ESMTP id LAA27397;
          Wed, 26 Mar 1997 11:13:39 -0700 (MST)
Message-Id: <199703261813.LAA27397@pluto.plutotech.com>
X-Mailer: exmh version 2.0beta 12/23/96
To: "Roy M. Hooper"
cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: AHA2940 bug(s) still exist in 2.2.1
In-reply-to: Your message of "Wed, 26 Mar 1997 11:17:08 EST."
             <199703261617.LAA17954@toybox.ottawa.on.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 26 Mar 1997 11:14:00 -0700
From: "Justin T. Gibbs"
Sender: owner-freebsd-scsi@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

>
>It would appear that the bugs in the 2940 drivers are still there in 2.2.1.
>We had the same kind of crash as usual, except the machine didn't come back
>up this time.
>
>We received several "timeout" messages and then the machine froze.

I need the exact timeout messages, information about what AHC options
you have in your kernel config file, and the dmesg output listing the
drives you are using.

I will say this though. Using tagged queueing can still be somewhat
dangerous with the Quantum Atlas II drives. For one thing, even with
only 8 tags outstanding, they can return QUEUE FULL status, which the
generic FreeBSD code simply does not handle very well. The kernel
requeues the transaction repeatedly until it succeeds, with no delay
between retries, which can often cause the drive to simply "give up"
and return BUSY status indefinitely.
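
To illustrate what a delay between retries could look like, here is a
minimal user-space sketch. It is not driver code and not the kernel's
actual handling. Every name in it (toy_drive, toy_submit,
submit_with_backoff, the status codes) is hypothetical, and the
doubling delay is just one assumed policy. The drive is modelled as
reporting QUEUE FULL until a simulated clock passes a fixed point.

```c
#include <assert.h>

/* Hypothetical status codes for this sketch (not the kernel's definitions). */
enum sketch_status { ST_GOOD, ST_QUEUE_FULL };

/* Toy drive model: reports QUEUE FULL until busy_ms of simulated time pass. */
struct toy_drive {
    unsigned busy_ms;   /* how long the drive's queue stays full */
    unsigned now_ms;    /* simulated clock */
    unsigned attempts;  /* how many times the command was submitted */
};

/* Submit one command to the toy drive and count the attempt. */
static enum sketch_status toy_submit(struct toy_drive *d)
{
    d->attempts++;
    return (d->now_ms >= d->busy_ms) ? ST_GOOD : ST_QUEUE_FULL;
}

/*
 * Resubmit after QUEUE FULL, waiting between attempts. The delay starts
 * at first_delay_ms and doubles after each failure, capped at
 * max_delay_ms. Returns the number of submissions used, or -1 if
 * max_attempts submissions all came back QUEUE FULL.
 */
static int submit_with_backoff(struct toy_drive *d, unsigned first_delay_ms,
                               unsigned max_delay_ms, int max_attempts)
{
    unsigned delay = first_delay_ms;

    assert(first_delay_ms > 0 && max_attempts > 0);
    for (int i = 1; i <= max_attempts; i++) {
        if (toy_submit(d) == ST_GOOD)
            return i;
        d->now_ms += delay;     /* stands in for a real timed requeue */
        delay *= 2;
        if (delay > max_delay_ms)
            delay = max_delay_ms;
    }
    return -1;
}
```

With a drive that stays full for 100 ms, a first delay of 10 ms and a
cap of 80 ms, the command goes through on the fifth submission. With
no delay at all, every resubmission would arrive while the queue is
still full.
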
The proper fix for the QUEUE FULL handling is in the works, but it
comes only once we convert to my new CAM SCSI framework, probably a
month or two down the line.

Even if the drive doesn't return QUEUE FULL, it is very possible that
you are experiencing "tag starvation". The driver currently uses
"Simple Queue" tags for all transactions, which allows the drive to
reorder the transactions in any way it sees fit so long as "write
followed by read" consistency is maintained. This means that a
transaction for a location far from the current head position can be
starved by a continuous stream of transactions that don't require
large seeks. The faster and larger the drive, the easier it is to make
this happen. I saw it on a Quantum Atlas II last night during two
concurrent copies over 100BaseT Ethernet.

The driver does attempt to handle this condition by first attempting
to queue an Ordered Tagged transaction to the disk. This should force
the drive to finish all pending transactions before starting any
others. The current timeout for the Ordered Tagged transaction to be
successful is only 1 second, which perhaps isn't really long enough.
Something that may help this problem is to perform ordered writes for
all synchronous write operations, which will be possible with the new
SCSI code.

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================