Date:      Fri, 24 Feb 2006 15:05:29 -0800 (PST)
From:      Doug Ambrisko <ambrisko@ambrisko.com>
To:        Kris Kennaway <kris@obsecurity.org>
Cc:        Greg Rivers <gcr+freebsd-stable@tharned.org>, stable@freebsd.org, "Michael R. Wayne" <freebsd@wayne47.com>
Subject:   Re: Disk I/O system hang on 5.4-RELEASE-p8 i386
Message-ID:  <200602242305.k1ON5TUn065222@ambrisko.com>
In-Reply-To: <20060223235055.GA93873@xor.obsecurity.org>

Kris Kennaway writes:
| On Thu, Feb 23, 2006 at 04:44:46PM -0600, Greg Rivers wrote:
| > On Thu, 23 Feb 2006, Michael R. Wayne wrote:
| > 
| > >Been fighting this for a while.  We have an older server, running
| > >5.4-RELEASE-p8 i386 and used primarily for email, which hangs every
| > >couple of weeks.  The hang seems to be in the disk I/O system; pings
| > >succeed, and I can continue to get a login: prompt on the console until
| > >I enter a login, at which point the response stops.
| > >[snip]
| > 
| > I think you're seeing the UFS deadlock I reported last November for 
| > RELENG_6.  See the thread beginning at 
| > http://lists.freebsd.org/pipermail/freebsd-stable/2005-November/019979.html
| > 
| > I believe this issue has made it onto the show-stopper list for 
| > 6.1-RELEASE and is being actively worked on.
| 
| It's on the todo list, but I don't think it's being worked on yet.
| The main problem is that we need a way to reproduce it on command.
| I'd forgotten that snapshots are involved, so maybe it's just a matter
| of running lots of mksnap_ffs while I/O is in progress.

FWIW, I found a problem where creating a snapshot could exhaust the
available buffers and wedge the system:

Index: ffs_snapshot.c
===================================================================
RCS file: /usr/local/cvsroot/freebsd/src/sys/ufs/ffs/ffs_snapshot.c,v
retrieving revision 1.112
diff -u -p -r1.112 ffs_snapshot.c
--- ffs_snapshot.c	9 Jan 2006 20:42:18 -0000	1.112
+++ ffs_snapshot.c	24 Feb 2006 23:02:19 -0000
@@ -336,6 +336,8 @@ restart:
 		if (error)
 			goto out;
 		bawrite(nbp);
+		if (cg % 10 == 0)
+			ffs_syncvnode(vp, MNT_WAIT);
 	}
 	/*
 	 * Copy all the cylinder group maps. Although the
@@ -357,6 +360,8 @@ restart:
 			goto out;
 		error = cgaccount(cg, vp, nbp, 1);
 		bawrite(nbp);
+		if (cg % 10 == 0)
+			ffs_syncvnode(vp, MNT_WAIT);
 		if (error)
 			goto out;
 	}

This fixed the problem for me.

Doug A.


