From: John Baldwin
To: Terry Lambert
Cc: freebsd-current@freebsd.org, msch@snafu.de
Subject: Re: ATA errors on recent -current
Date: Tue, 16 Apr 2002 21:27:55 -0400 (EDT)
In-Reply-To: <3CBCCC84.4AFD0E68@mindspring.com>

On 17-Apr-2002 Terry Lambert wrote:
>> What was consistent through all tests was that the disk operates quite
>> some time until the error occurs the first time. After that, it is not
>> possible to access the disk in UDMA mode any more, regardless of *which*
>> UDMA mode it is. 'Quite some time' means approx. 50% of /usr/ports in
>> the above mentioned 'test'.
>>
>> After the first switch to PIO4, I unmounted the filesystem and switched
>> back to UDMA33 for instance - I couldn't even *mount* the filesystem
>> again!
>>
>> But w/o Tagged Queuing the disk operates flawlessly, so I'm a bit in
>> doubt if the errors with WD disks have the same source... but maybe.
>
> My hunch, which is why I suggested decreasing the number of
> tags seen by the driver, is that the tagged queues are
> overused, and this locks the disk up.  My best guess is an off-by-one
> or an exceptional condition handler that was not an issue until
> recently, because of a FreeBSD interrupt architecture change
> having nothing to do with the driver itself (i.e. the reason it
> only happens under load, and didn't happen under the same load,
> before).

Terry, we've had threaded interrupt handlers for over a year and a half now.
If they had really broken things in this basic a fashion, we wouldn't have made
it this far with running systems.  Your hypothesis about something busted in
the tagged queueing code seems sound, but blaming this on interrupt threads
doesn't make much sense to me.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/