From owner-freebsd-current  Tue Apr 16 18:28:50 2002
Delivered-To: freebsd-current@freebsd.org
Received: from mail.speakeasy.net (mail11.speakeasy.net [216.254.0.211])
	by hub.freebsd.org (Postfix) with ESMTP id 026E837B400
	for <freebsd-current@freebsd.org>; Tue, 16 Apr 2002 18:28:47 -0700 (PDT)
Received: (qmail 20219 invoked from network); 17 Apr 2002 01:28:46 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender <jhb@FreeBSD.org>)
          by mail11.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP
          for <msch@snafu.de>; 17 Apr 2002 01:28:46 -0000
Received: from laptop.baldwin.cx (laptop.baldwin.cx [192.168.0.4])
	by server.baldwin.cx (8.11.6/8.11.6) with ESMTP id g3H1Svv78637;
	Tue, 16 Apr 2002 21:28:57 -0400 (EDT)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.20020416212755.jhb@FreeBSD.org>
X-Mailer: XFMail 1.5.2 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <3CBCCC84.4AFD0E68@mindspring.com>
Date: Tue, 16 Apr 2002 21:27:55 -0400 (EDT)
From: John Baldwin <jhb@FreeBSD.org>
To: Terry Lambert <tlambert2@mindspring.com>
Subject: Re: ATA errors on recent -current
Cc: freebsd-current@freebsd.org, msch@snafu.de
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-current.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-current>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-current>
X-Loop: FreeBSD.ORG


On 17-Apr-2002 Terry Lambert wrote:
>> What was consistent through all tests was that the disk operates for quite
>> some time until the error occurs the first time. After that, it is not
>> possible to access the disk in UDMA mode any more, regardless of *which*
>> UDMA mode it is. 'Quite some time' means approx. 50% of /usr/ports in
>> the above mentioned 'test'.
>> 
>> After the first switch to PIO4, I umounted the filesystem and switched
>> back to UDMA33 for instance - I couldn't even *mount* the filesystem
>> again!
>> 
>> But w/o Tagged Queuing the disk operates flawlessly, so I'm a bit in
>> doubt, if the errors with WD-disks have the same source... but may be.
> 
> My hunch, which is why I suggested decreasing the number of
> tags seen by the driver, is that the tagged queues are
> overused, and this locks the disk up.  My best guess is an off-by-one
> or an exceptional condition handler that was not an issue until
> recently, because of a FreeBSD interrupt architecture change
> having nothing to do with the driver itself (i.e. the reason it
> only happens under load, and didn't happen under the same load,
> before).

Terry, we've had threaded interrupt handlers for over a year and a half
now.  If they had really broken things in this basic a fashion, we wouldn't
have made it this far with running systems.  Your hypothesis about
something busted in the tagged queueing code seems sound, but blaming
this on interrupt threads doesn't make much sense to me.

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
