FreeBSD Mail Archives

Date:      Wed, 12 Nov 2008 20:47:37 -0800 (PST)
From:      Doug Ambrisko <ambrisko@ambrisko.com>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        Tim Bishop <tim@bishnet.net>, Jeremy Chadwick <koitsu@freebsd.org>, freebsd-stable@freebsd.org, David Peall <david@esn.org.za>
Subject:   Re: System deadlock when using mksnap_ffs
Message-ID:  <200811130447.mAD4lbJG051137@ambrisko.com>
In-Reply-To: <20081112210513.GM47073@deviant.kiev.zoral.com.ua>

Kostik Belousov writes:
| On Wed, Nov 12, 2008 at 07:49:28PM +0000, Tim Bishop wrote:
| > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
| > > I run the mksnap_ffs command to take the snapshot and some time later
| > > the system completely freezes up:
| > > 
| > > paladin# cd /u2/.snap/
| > > paladin# mksnap_ffs /u2 test.1
| > 
| > Someone (not named because they chose not to reply to the list) gave me
| > the following patch:
| > 
| > --- sys/ufs/ffs/ffs_snapshot.c.orig	Wed Mar 22 09:42:31 2006
| > +++ sys/ufs/ffs/ffs_snapshot.c	Mon Nov 20 14:59:13 2006
| > @@ -282,6 +282,8 @@ restart:
| >  		if (error)
| >  			goto out;
| >  		bawrite(nbp);
| > +		if (cg % 10 == 0)
| > +			ffs_syncvnode(vp, MNT_WAIT);
| >  	}
| >  	/*
| >  	 * Copy all the cylinder group maps. Although the
| > @@ -303,6 +305,8 @@ restart:
| >  			goto out;
| >  		error = cgaccount(cg, vp, nbp, 1);
| >  		bawrite(nbp);
| > +		if (cg % 10 == 0)
| > +			ffs_syncvnode(vp, MNT_WAIT);
| >  		if (error)
| >  			goto out;
| >  	}
| > 
| > With the description:
| > 
| > "What can happen is on a big file system it will fill up the buffer
| > cache with I/O and then run out.  When the buffer cache fills up then no
| > more disk I/O can happen :-(  When you do a sync, it flushes that out to
| > disk so things don't hang."
| > 
| > It seems to work too. But it seems more like a workaround than a fix?
| 
| It looks hackish, but in fact it is not that wrong, and I would even say
| that it provides a reasonable workaround.
| 
| The usual way to prevent a wdrain deadlock is to issue a bwillwrite() call
| before any vnode lock is taken. This is sufficient for most VFS syscalls,
| which typically put a dozen or fewer dirty buffers into the delayed write
| queue.
| 
| Snapshot creation does not call bwillwrite() at all, and then does a lot
| of async writes, completely saturating the buffer cache with dirty buffers.
| bwillwrite() cannot be called after the vnode is locked, and just forcing
| a sync for the embryonic snapshot vnode is good enough.
| 
| The 10 counter is debatable, but the debate shall be postponed until the
| patch goes into the tree. I ask the anonymous submitter to commit it. Thanks!

I plan to commit it tomorrow, since I sent it to Tim to test.  The 10 can
be tuned, but it has kept a bunch of machines at work up and running.  Glad
people don't think it is too wrong :-)  It probably could be made a little
more dynamic, but I wonder whether that would show any real performance
difference, and it might risk more bugs.

Doug A.
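
To illustrate why the periodic sync in the patch helps, here is a toy
user-space model in C. It is only a sketch and not kernel code: NCG,
MAX_DIRTY, and simulate() are hypothetical names invented for this example,
and the model simply assumes that nothing drains the dirty queue while the
snapshot loop is running.

```c
#include <assert.h>

#define NCG        200  /* hypothetical number of cylinder groups */
#define MAX_DIRTY   64  /* hypothetical dirty-buffer limit of the toy cache */

/*
 * Toy model of the snapshot copy loop. Each iteration queues one async
 * write (standing in for bawrite()). If sync_interval is non-zero, the
 * queue is flushed whenever cg % sync_interval == 0 (standing in for
 * ffs_syncvnode(vp, MNT_WAIT)).
 *
 * Returns the cylinder group at which the toy cache is found full, or -1
 * if the loop finishes without saturating it.
 */
static int
simulate(int sync_interval)
{
	int cg, dirty = 0;

	assert(sync_interval >= 0);
	for (cg = 0; cg < NCG; cg++) {
		if (dirty >= MAX_DIRTY)
			return (cg);	/* cache full: the real system would hang */
		dirty++;		/* one more delayed write queued */
		if (sync_interval > 0 && cg % sync_interval == 0)
			dirty = 0;	/* forced sync empties the queue */
	}
	return (-1);
}
```

In this model simulate(0) returns 64, because the cache fills after 64
unsynced writes, while simulate(10) returns -1, because the queue never
holds more than 10 dirty buffers.  Any interval well below the cache limit
behaves the same way in the model, which fits the remark that the value 10
can be tuned.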

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200811130447.mAD4lbJG051137>
