From: Ensel Sharon
To: Robert Watson
Cc: freebsd-fs@freebsd.org
Date: Sun, 25 Jun 2006 00:59:29 -0400 (EDT)
Subject: Re: 6.1 quota bugs cause adaptec 2820sa kernel to crash ?
In-Reply-To: <20060624232457.D8526@fledge.watson.org>

On Sat, 24 Jun 2006, Robert Watson wrote:

> > After loading, the system frequently (multi-daily) crashed with the error:
> >
> > Warning! Controller is no longer running! code=0xbcef0100
> >
> > (after a page or so of aac0 timeout messages)
> >
> > So I disabled quotas on the system, and it has been completely stable ever
> > since.
> >
> > -----
>
> The above sounds a lot like a problem with {Adaptec driver, controller,
> disks}, rather than a quota problem.  So it might be that there's an I/O load
> change with quotas running that triggers the problem.  Alternatively, there's
> a memory corruption bug (or the like) in quotas that corrupts data structures
> for the adaptec driver, but hopefully not.  I believe Scott Long follows this
> list, but if you don't hear back in a bit, you might forward that description
> to him and see if he has thoughts.  In the past, he's maintained the Adaptec
> drivers, but I'm not sure what his level of involvement with them is at this
> point.

Yes, that is why I was asking - is it reasonable that quotas cause this,
or are quotas just "hard", so that if I did something equally "hard" the
system would also crash?

I can tell you this though - it is a busy fileserver with a lot of
intensive rsyncs going on, and with quotas enabled it crashes hourly, and
as soon as you turn them off it never crashes.  No other variables have
changed.

What quota-like activity, that doesn't involve quotas and also won't
destroy data (!), can I run to see if it crashes?

I am more than happy to use this system to test things that could solve
this problem for everyone - it is not the availability of this system that
matters, it is just that the data must remain safe...

thanks...