From owner-freebsd-stable  Tue Jan  9 13:14:59 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from earth.backplane.com (placeholder-dcat-1076843399.broadbandoffice.net [64.47.83.135])
	by hub.freebsd.org (Postfix) with ESMTP id E360D37B69E
	for <stable@FreeBSD.ORG>; Tue,  9 Jan 2001 13:14:40 -0800 (PST)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.1/8.9.3) id f09LETI51662;
	Tue, 9 Jan 2001 13:14:29 -0800 (PST)
	(envelope-from dillon)
Date: Tue, 9 Jan 2001 13:14:29 -0800 (PST)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200101092114.f09LETI51662@earth.backplane.com>
To: Ian Dowse <iedowse@maths.tcd.ie>
Cc: Jaye Mathisen <mrcpu@internetcds.com>, stable@FreeBSD.ORG,
	iedowse@maths.tcd.ie
Subject: Re: Repeated panic in 4.2-stable 
References:  <200101092011.aa63743@salmon.maths.tcd.ie>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

    It's worth trying out the latest -stable first, to see if the problem
    goes away.  Jaye's kernel was from the 19th of December and the
    traceback was from an msync()/block-allocation sequence, so it could
    be related to bugs we've fixed since the 19th.

    On the other hand, his kernel faulted in the kernel rather than panicked,
    so it could be something more insidious related to the frag size or
    the large filesystem (internal overflows or something like that), as
    you posit.

    It is also possible that the blocksize is only an indirect cause,
    and that the real problem is that the larger per-cylinder bitmaps
    are creating more opportunities for Kirk's background bitmap write
    code (which is independent of softupdates) to interfere with the
    clustering code.  I have a patch for that below (not well tested, and
    Kirk hasn't gotten back to me on it yet, so I don't know how real
    the problem is.  It looks real, though).

					-Matt

Index: vfs_cluster.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/vfs_cluster.c,v
retrieving revision 1.92.2.3
diff -u -r1.92.2.3 vfs_cluster.c
--- vfs_cluster.c	2000/12/30 01:51:07	1.92.2.3
+++ vfs_cluster.c	2001/01/06 19:40:52
@@ -392,15 +392,22 @@
 
 			tbp = getblk(vp, lbn + i, size, 0, 0);
 
-			if ((tbp->b_flags & B_CACHE) ||
-				(tbp->b_flags & B_VMIO) == 0) {
+			/*
+			 * If the buffer is already fully valid or locked
+			 * (which could also mean that a background write is
+			 * in progress), or the buffer is not backed by VMIO,
+			 * stop.
+			 */
+			if ((tbp->b_flags & (B_CACHE|B_LOCKED)) ||
+			    (tbp->b_flags & B_VMIO) == 0) {
 				bqrelse(tbp);
 				break;
 			}
 
-			for (j = 0;j < tbp->b_npages; j++)
+			for (j = 0;j < tbp->b_npages; j++) {
 				if (tbp->b_pages[j]->valid)
 					break;
+			}
 
 			if (j != tbp->b_npages) {
 				bqrelse(tbp);
@@ -701,8 +708,13 @@
 
 	while (len > 0) {
 		s = splbio();
+		/*
+		 * If the buffer is not delayed-write (i.e. dirty), or it 
+		 * is delayed-write but either locked or inval, it cannot 
+		 * partake in the clustered write.
+		 */
 		if (((tbp = gbincore(vp, start_lbn)) == NULL) ||
-		  ((tbp->b_flags & (B_INVAL | B_DELWRI)) != B_DELWRI) ||
+		  ((tbp->b_flags & (B_LOCKED | B_INVAL | B_DELWRI)) != B_DELWRI) ||
 		  BUF_LOCK(tbp, LK_EXCLUSIVE | LK_NOWAIT)) {
 			++start_lbn;
 			--len;
@@ -774,12 +786,16 @@
 
 				/*
 				 * If it IS in core, but has different
-				 * characteristics, don't cluster with it.
+				 * characteristics, or is locked (which
+				 * means it could be undergoing a background
+				 * I/O or be in a weird state), then don't
+				 * cluster with it.
 				 */
 				if ((tbp->b_flags & (B_VMIO | B_CLUSTEROK |
 				    B_INVAL | B_DELWRI | B_NEEDCOMMIT))
 				  != (B_DELWRI | B_CLUSTEROK |
 				    (bp->b_flags & (B_VMIO | B_NEEDCOMMIT))) ||
+				    (tbp->b_flags & B_LOCKED) ||
 				    tbp->b_wcred != bp->b_wcred ||
 				    BUF_LOCK(tbp, LK_EXCLUSIVE | LK_NOWAIT)) {
 					splx(s);
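
    As a rough illustration of what the second hunk changes (this is a
    standalone sketch, not part of the patch): the write-clustering test
    accepts a buffer only if, among the masked flags, exactly B_DELWRI is
    set.  Adding B_LOCKED to the mask means a dirty-but-locked buffer no
    longer passes.  The bit values and the two helper names below are
    made up for the example; the real flag definitions live in sys/buf.h.

```c
#include <stdbool.h>

/* Illustrative bit values only, NOT the kernel's actual definitions. */
#define B_DELWRI 0x1u	/* delayed write, i.e. dirty */
#define B_INVAL  0x2u	/* buffer contents are invalid */
#define B_LOCKED 0x4u	/* locked, possibly a background write in progress */

/* Hypothetical helper: the pre-patch test (dirty and not invalid). */
static bool
old_can_cluster(unsigned flags)
{
	return (flags & (B_INVAL | B_DELWRI)) == B_DELWRI;
}

/* Hypothetical helper: the post-patch test (dirty, not invalid, not locked). */
static bool
new_can_cluster(unsigned flags)
{
	return (flags & (B_LOCKED | B_INVAL | B_DELWRI)) == B_DELWRI;
}
```

    With these helpers, a buffer whose flags are B_DELWRI | B_LOCKED is
    accepted by old_can_cluster() but rejected by new_can_cluster(), which
    is the case the patch is meant to exclude.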

