Date: Wed, 19 Jul 2006 15:14:51 +0100 (BST)
From: Robert Watson
To: User Freebsd
Cc: Kostik Belousov, freebsd-stable@freebsd.org
Subject: Re: file system deadlock - the whole story?
Message-ID: <20060719151327.H5132@fledge.watson.org>
In-Reply-To: <20060719082627.H1799@ganymede.hub.org>

On Wed, 19 Jul 2006, User Freebsd wrote:

> Also note that under FreeBSD 4.x, all three of these machines were pretty
> much my more solid machines, with even more vServers running on them than
> I'm able to run with 6.x ... once I got rid of using unionfs, stability
> skyrocketed :(
>
> Hrmmmm ... but, your 'controller driver' comment ... that is one common
> thing amongst all three servers ... they are all running the iir driver ...
> not sure the *exact* controller, but pluto (older Dual-PIII) shows it as:

Yes, this was going to be my next question -- if you're seeing wedges under
load and there's a common controller in use, maybe we're looking at a driver
bug. Bugs of that sort typically look a lot like what you describe: an I/O
is "lost", and so everything that depends on that I/O wedges waiting for it,
leading to a lot of processes hanging around waiting for vnode locks, etc.

Robert N M Watson
Computer Laboratory
University of Cambridge