Date: Wed, 12 Nov 2008 20:47:37 -0800 (PST) From: Doug Ambrisko <ambrisko@ambrisko.com> To: Kostik Belousov <kostikbel@gmail.com> Cc: Tim Bishop <tim@bishnet.net>, Jeremy Chadwick <koitsu@freebsd.org>, freebsd-stable@freebsd.org, David Peall <david@esn.org.za> Subject: Re: System deadlock when using mksnap_ffs Message-ID: <200811130447.mAD4lbJG051137@ambrisko.com> In-Reply-To: <20081112210513.GM47073@deviant.kiev.zoral.com.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
Kostik Belousov writes: | On Wed, Nov 12, 2008 at 07:49:28PM +0000, Tim Bishop wrote: | > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: | > > I run the mksnap_ffs command to take the snapshot and some time later | > > the system completely freezes up: | > > | > > paladin# cd /u2/.snap/ | > > paladin# mksnap_ffs /u2 test.1 | > | > Someone (not named because they choose not to reply to the list) gave me | > the following patch: | > | > --- sys/ufs/ffs/ffs_snapshot.c.orig Wed Mar 22 09:42:31 2006 | > +++ sys/ufs/ffs/ffs_snapshot.c Mon Nov 20 14:59:13 2006 | > @@ -282,6 +282,8 @@ restart: | > if (error) | > goto out; | > bawrite(nbp); | > + if (cg % 10 == 0) | > + ffs_syncvnode(vp, MNT_WAIT); | > } | > /* | > * Copy all the cylinder group maps. Although the | > @@ -303,6 +305,8 @@ restart: | > goto out; | > error = cgaccount(cg, vp, nbp, 1); | > bawrite(nbp); | > + if (cg % 10 == 0) | > + ffs_syncvnode(vp, MNT_WAIT); | > if (error) | > goto out; | > } | > | > With the description: | > | > "What can happen is on a big file system it will fill up the buffer | > cache with I/O and then run out. When the buffer cache fills up then no | > more disk I/O can happen :-( When you do a sync, it flushes that out to | > disk so things don't hang." | > | > It seems to work too. But it seems more like a workaround than a fix? | | It looks hackish, but in fact it is not that wrong, and I even say that | it provides reasonable workaround. | | The usual way to prevent wdrain deadlock is to issue bwillwrite() call | before any vnode lock is taken. This is sufficient for most VFS syscalls | that typically put dozen or less dirty buffers into delayed write | queue. | | Snapshot creation does not call bwillwrite() at all, and then does a lot | of async writes, completely saturating buffer cache with dirty buffers. | bwillwrite cannot be called after the vnode is locked, and just forcing | a sync for the embrionic snapshot vnode is good enough. | | The 10 counter is debatable, but debate shall be postponed until the patch | goes into tree. I ask an anonymous submitter to commit it. Thanks ! I plan to commit it tomorrow since I sent it to Tim to test. The 10 can be tuned but it has kept a bunch of machines at work up. Glad people don't think it is that it is to wrong :-) It probably could be made a little more dynamic but I wonder if it would show any real performance difference and might risk more bugs. Doug A.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200811130447.mAD4lbJG051137>