From: Doug Ambrisko
Date: Wed, 12 Nov 2008 21:00:53 -0800 (PST)
To: Jeremy Chadwick
Cc: Kostik Belousov, Tim Bishop, freebsd-stable@freebsd.org
Subject: Re: System deadlock when using mksnap_ffs

Jeremy Chadwick writes:
[snip]
| The rest of the below information is good -- but I'm confused about
| something: is there anyone out there who can use mksnap_ffs on a
| filesystem (/usr is a good test source) and NOT experience this
| deadlocking problem?
| Literally *every* FreeBSD box I have root access
| to suffers from this problem, so I'm a little baffled why we end-users
| need to keep providing debugging output when it should be easy as pie
| for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch
| their system wedge.

We can at work, but we have a bunch of other patches.  There are a
few problems with the buffer cache:
  1) The buffer daemon can't use the space that is reserved for it,
     since to flush some stuff it needs to use more buffers.
  2) The buffer cache can get fragmented, which prevents the large I/O
     that the buffer daemon may need.
  3) Other issues ...

I have a fix for "1".  It is pretty easy.  I have a hack'ish fix for "2"
in that I make all requests use the max size, so the cache can't get
fragmented; there is no code to defrag, and it isn't trivial to defrag
the memory.  I have some fixes for some other issues, but there were
some review issues with them.

I might just commit the fixes for 1 and 2.  They make things better and
there were no objections at the time.  We have the patches in shipping
products.

I can try to do some experiments at work like you said, since I had
similar things working before and it is pretty easy to put in printf's
to see the issue.

Doug A.
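
For anyone following along, here is a toy model of problem "2" and of
the "everything is max size" workaround.  It is only a sketch and is not
the FreeBSD buffer cache code: the slot array, MAX_SIZE, TOTAL, and the
helper names are all made up for illustration.  It shows that a space
with plenty of free room can still refuse a large request once small
buffers have fragmented it, while uniform max-size buffers always leave
holes that a large request fits into.

```python
# Toy model only: a flat array of slots stands in for buffer space.
# None of these names come from the FreeBSD sources.

def largest_free_run(slots):
    """Length of the longest run of consecutive free (None) slots."""
    best = run = 0
    for s in slots:
        run = run + 1 if s is None else 0
        best = max(best, run)
    return best

def allocate(slots, size, tag):
    """First fit: claim `size` consecutive free slots.

    Returns the start index, or None if no contiguous run is big enough.
    """
    run = 0
    for i, s in enumerate(slots):
        run = run + 1 if s is None else 0
        if run == size:
            start = i - size + 1
            slots[start:i + 1] = [tag] * size
            return start
    return None

def release(slots, tag):
    """Free every slot owned by `tag`."""
    for i, s in enumerate(slots):
        if s == tag:
            slots[i] = None

MAX_SIZE = 4   # hypothetical largest I/O size, in slots
TOTAL = 16     # hypothetical size of the whole buffer space

# Mixed sizes: fill the space with 1-slot buffers, then free every other one.
mixed = [None] * TOTAL
for n in range(TOTAL):
    allocate(mixed, 1, n)
for n in range(0, TOTAL, 2):
    release(mixed, n)
mixed_free = mixed.count(None)
mixed_big = allocate(mixed, MAX_SIZE, "big")      # fails: holes are 1 slot wide

# Uniform sizes: every request is rounded up to MAX_SIZE.
uniform = [None] * TOTAL
for n in range(TOTAL // MAX_SIZE):
    allocate(uniform, MAX_SIZE, n)
for n in range(0, TOTAL // MAX_SIZE, 2):
    release(uniform, n)
uniform_free = uniform.count(None)
uniform_big = allocate(uniform, MAX_SIZE, "big")  # succeeds: holes are MAX_SIZE wide

print("mixed:   free slots", mixed_free, "large request ->", mixed_big)
print("uniform: free slots", uniform_free, "large request ->", uniform_big)
```

Both layouts end up with the same amount of free space; only the
uniform one can satisfy the large request.  The cost is that small
requests waste the rest of their max-size buffer, which is why this is
a hack rather than a real defragmenter.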