From: "Reid Linnemann"
To: cperciva@freebsd.org
cc: "freebsd-current@freebsd.org"
Date: Fri, 18 Feb 2005 10:48:07 -0600 (CST)
Subject: Re: ad WRITE_DMA timing out frequently
Message-Id: <20050218164807.2890AA0637@csa.cs.okstate.edu>
In-Reply-To: <42160FAC.7010807@freebsd.org>

On 2/18/2005, "Colin Percival" wrote:

>Reid Linnemann wrote:
>> smartctl doesn't report any logged errors. On a hunch, I also dd'ed a
>> file large enough to fill /var, hoping that it would crater on writing
>> to that sector. It didn't. I know that's not a very useful test, but
>> it seems to hint to me that the disk isn't bad, but the driver is
>> freaking out from some event.
>
>It's quite possible that the driver is at fault, but I'd run a
>"smartctl -t long" test first, just to make sure the drive isn't
>suffering an intermittent fault.
>
>Colin Percival

I ran smartctl -t long on the disk, and sure enough it's healthy:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         22444     -

There must be some strange system event happening that is causing the
driver to freak out. I really have a hunch it's related to the msp queue
mail that is being dumped in /var/spool/clientmqueue, because that
occurrence roughly falls in line with the ad0 failures, and it's all on
the same partition that the error is reported on.
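
For anyone who wants to check the timing hunch rather than eyeball it,
here is a rough Python sketch of the comparison I have in mind. It only
shows the matching step. The function name, the 5-minute window, and
the timestamps are all made up for illustration; real values would have
to come from the kernel log and from the modification times of the files
in /var/spool/clientmqueue.

```python
from datetime import datetime, timedelta


def failures_near_events(failures, events, window=timedelta(minutes=5)):
    """Return the failure timestamps that have at least one event within `window`."""
    return [f for f in failures
            if any(abs(f - e) <= window for e in events)]


# Hypothetical timestamps, invented purely to exercise the function.
failures = [datetime(2005, 2, 18, 3, 1), datetime(2005, 2, 18, 9, 30)]
queue_writes = [datetime(2005, 2, 18, 3, 0), datetime(2005, 2, 18, 6, 0)]

near = failures_near_events(failures, queue_writes)
print(f"{len(near)} of {len(failures)} timeouts fall near a queue write")
```

If most of the timeouts land close to a queue write, that would support
the hunch, though it would not by itself say whether the driver or the
disk is to blame.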