Date: Fri, 24 Feb 2006 15:05:29 -0800 (PST)
From: Doug Ambrisko <ambrisko@ambrisko.com>
To: Kris Kennaway <kris@obsecurity.org>
Cc: Greg Rivers <gcr+freebsd-stable@tharned.org>, stable@freebsd.org, "Michael R. Wayne" <freebsd@wayne47.com>
Subject: Re: Disk I/O system hang on 5.4-RELEASE-p8 i386
Message-ID: <200602242305.k1ON5TUn065222@ambrisko.com>
In-Reply-To: <20060223235055.GA93873@xor.obsecurity.org>

Kris Kennaway writes:
| On Thu, Feb 23, 2006 at 04:44:46PM -0600, Greg Rivers wrote:
| > On Thu, 23 Feb 2006, Michael R. Wayne wrote:
| >
| > >Been fighting this for a while.  We have an older server, running
| > >5.4-RELEASE-p8 i386 and used primarily for email, which hangs every
| > >couple of weeks.  The hang seems to be in the disk I/O system; pings
| > >succeed, and I can continue get a login: prompt on the console until
| > >I enter a login at which the response stops.
| > >[snip]
| >
| > I think you're seeing the UFS deadlock I reported last November for
| > RELENG_6.  See the thread beginning at
| > http://lists.freebsd.org/pipermail/freebsd-stable/2005-November/019979.html
| >
| > I believe this issue has made it onto the show-stopper list for
| > 6.1-RELEASE and is being actively worked on.
|
| It's on the todo list, but I don't think it's being worked on yet.
| The main problem is that we need a way to reproduce it on command.
| I'd forgotten that snapshots are involved, so maybe it's just a matter
| of running lots of mksnap_ffs while I/O is in progress.

FWIW, I found a problem when creating snapshots in that it could
exhaust available buffers and wedge:

Index: ffs_snapshot.c
===================================================================
RCS file: /usr/local/cvsroot/freebsd/src/sys/ufs/ffs/ffs_snapshot.c,v
retrieving revision 1.112
diff -u -p -r1.112 ffs_snapshot.c
--- ffs_snapshot.c	9 Jan 2006 20:42:18 -0000	1.112
+++ ffs_snapshot.c	24 Feb 2006 23:02:19 -0000
@@ -336,6 +336,8 @@ restart:
 		if (error)
 			goto out;
 		bawrite(nbp);
+		if (cg % 10 == 0)
+			ffs_syncvnode(vp, MNT_WAIT);
 	}
 	/*
 	 * Copy all the cylinder group maps. Although the
@@ -357,6 +360,8 @@ restart:
 			goto out;
 		error = cgaccount(cg, vp, nbp, 1);
 		bawrite(nbp);
+		if (cg % 10 == 0)
+			ffs_syncvnode(vp, MNT_WAIT);
 		if (error)
 			goto out;
 	}

Fixed this problem for me.

Doug A.
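
For readers unfamiliar with the pattern in the patch above, here is a
minimal user-space sketch of the idea, not kernel code. It assumes a
fixed pool of buffers that asynchronous writes consume until a
synchronous flush returns them. The names buf_pool, pool_async_write,
pool_sync and copy_groups are hypothetical and exist only for this
illustration; they stand in loosely for the buffer cache, bawrite(),
ffs_syncvnode(vp, MNT_WAIT) and the per-cylinder-group loops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a fixed pool of buffers shared by async writes. */
struct buf_pool {
    size_t capacity;   /* total buffers available */
    size_t in_flight;  /* buffers held by writes not yet flushed */
    size_t peak;       /* highest in_flight value observed */
};

/*
 * Queue one asynchronous write. Returns false when no buffer is free;
 * in this model that stands in for the point where a real system
 * would block waiting for a buffer.
 */
static bool pool_async_write(struct buf_pool *p)
{
    assert(p != NULL);
    if (p->in_flight >= p->capacity)
        return false;
    p->in_flight++;
    if (p->in_flight > p->peak)
        p->peak = p->in_flight;
    return true;
}

/* Wait for all queued writes, returning their buffers to the pool. */
static void pool_sync(struct buf_pool *p)
{
    assert(p != NULL);
    p->in_flight = 0;
}

/*
 * Issue one async write per cylinder group, flushing whenever
 * cg % sync_every == 0 (the same test the patch uses with 10).
 * sync_every == 0 disables the periodic flush. Returns the number of
 * groups processed before the pool ran dry, or ncg on success.
 */
static size_t copy_groups(struct buf_pool *p, size_t ncg, size_t sync_every)
{
    for (size_t cg = 0; cg < ncg; cg++) {
        if (!pool_async_write(p))
            return cg;
        if (sync_every != 0 && cg % sync_every == 0)
            pool_sync(p);
    }
    return ncg;
}
```

In this model, calling copy_groups() with sync_every set to 10 keeps at
most 10 buffers outstanding regardless of how many cylinder groups
there are, while a call with sync_every set to 0 runs out as soon as
the group count exceeds the pool size. Whether the real hang has the
same cause is not established by this sketch.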