From: Doug Ambrisko
To: Kostik Belousov
Cc: Tim Bishop, Jeremy Chadwick, freebsd-stable@freebsd.org, David Peall
Date: Wed, 12 Nov 2008 20:47:37 -0800 (PST)
Subject: Re: System deadlock when using mksnap_ffs
Message-Id: <200811130447.mAD4lbJG051137@ambrisko.com>
In-Reply-To: <20081112210513.GM47073@deviant.kiev.zoral.com.ua>
List-Id: Production branch of FreeBSD source code

Kostik Belousov writes:
| On Wed, Nov 12, 2008 at 07:49:28PM +0000, Tim Bishop wrote:
| > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
| > > I run the mksnap_ffs command to take the snapshot and some time later
| > > the system completely freezes up:
| > >
| > > paladin# cd /u2/.snap/
| > > paladin# mksnap_ffs /u2 test.1
| >
| > Someone (not named because they choose not to reply to the list) gave me
| > the following patch:
| >
| > --- sys/ufs/ffs/ffs_snapshot.c.orig Wed Mar 22 09:42:31 2006
| > +++ sys/ufs/ffs/ffs_snapshot.c Mon Nov 20 14:59:13 2006
| > @@ -282,6 +282,8 @@ restart:
| >                  if (error)
| >                          goto out;
| >                  bawrite(nbp);
| > +                if (cg % 10 == 0)
| > +                        ffs_syncvnode(vp, MNT_WAIT);
| >          }
| >          /*
| >           * Copy all the cylinder group maps. Although the
| > @@ -303,6 +305,8 @@ restart:
| >                          goto out;
| >                  error = cgaccount(cg, vp, nbp, 1);
| >                  bawrite(nbp);
| > +                if (cg % 10 == 0)
| > +                        ffs_syncvnode(vp, MNT_WAIT);
| >                  if (error)
| >                          goto out;
| >          }
| >
| > With the description:
| >
| > "What can happen is on a big file system it will fill up the buffer
| > cache with I/O and then run out. When the buffer cache fills up then no
| > more disk I/O can happen :-( When you do a sync, it flushes that out to
| > disk so things don't hang."
| >
| > It seems to work too. But it seems more like a workaround than a fix?
|
| It looks hackish, but in fact it is not that wrong, and I even say that
| it provides reasonable workaround.
|
| The usual way to prevent wdrain deadlock is to issue bwillwrite() call
| before any vnode lock is taken. This is sufficient for most VFS syscalls
| that typically put dozen or less dirty buffers into delayed write
| queue.
|
| Snapshot creation does not call bwillwrite() at all, and then does a lot
| of async writes, completely saturating buffer cache with dirty buffers.
| bwillwrite cannot be called after the vnode is locked, and just forcing
| a sync for the embrionic snapshot vnode is good enough.
|
| The 10 counter is debatable, but debate shall be postponed until the patch
| goes into tree. I ask an anonymous submitter to commit it.

Thanks! I plan to commit it tomorrow since I sent it to Tim to test.
The 10 can be tuned, but it has kept a bunch of machines at work up.
Glad people don't think it is too wrong :-) It probably could be
made a little more dynamic, but I wonder whether that would show any real
performance difference, and it might risk more bugs.

Doug A.
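
To make the effect of the "cg % 10" check concrete, here is a rough userland sketch of the idea. It is a toy model and not kernel code: the function name peak_dirty and the parameter cache_limit are made up for illustration, each cylinder group is assumed to queue exactly one dirty buffer (standing in for bawrite()), and nothing drains the queue except the periodic sync (standing in for ffs_syncvnode(vp, MNT_WAIT)). The real buffer cache has background flushing and other consumers, so the numbers mean nothing beyond showing the shape of the problem.

```c
#include <assert.h>

/*
 * Toy model of the snapshot loop (hypothetical, for illustration only).
 *
 * ncg           - number of cylinder groups to walk
 * sync_interval - sync when cg % sync_interval == 0; 0 means never sync,
 *                 as in the unpatched loop
 * cache_limit   - assumed maximum number of dirty buffers before I/O stalls
 *
 * Returns the peak number of dirty buffers seen, or -1 if the queue
 * would have exceeded cache_limit (the modelled hang).
 */
int
peak_dirty(int ncg, int sync_interval, int cache_limit)
{
	int cg, dirty = 0, peak = 0;

	assert(ncg >= 0 && sync_interval >= 0 && cache_limit >= 0);

	for (cg = 0; cg < ncg; cg++) {
		dirty++;		/* async write queued for this cg */
		if (dirty > cache_limit)
			return (-1);	/* cache saturated in this model */
		if (dirty > peak)
			peak = dirty;
		if (sync_interval > 0 && cg % sync_interval == 0)
			dirty = 0;	/* synchronous flush empties the queue */
	}
	return (peak);
}
```

In this model, walking 1000 cylinder groups with an assumed limit of 256 dirty buffers overflows when no sync is done, while syncing every 10 groups keeps the queue at 10 buffers or fewer. A more dynamic version would presumably pick the interval from the actual cache size rather than a constant.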