Date: Tue, 9 Jan 2001 13:14:29 -0800 (PST)
From: Matt Dillon
Message-Id: <200101092114.f09LETI51662@earth.backplane.com>
To: Ian Dowse
Cc: Jaye Mathisen, stable@FreeBSD.ORG, iedowse@maths.tcd.ie
Subject: Re: Repeated panic in 4.2-stable
References: <200101092011.aa63743@salmon.maths.tcd.ie>

It's worth trying out the latest -stable first, to see if the problem goes away. Jaye's kernel was from the 19th of December and the traceback was from an msync()/block-allocation sequence, so it could be related to bugs we've fixed since the 19th. On the other hand, his kernel faulted in the kernel rather than panicked, so it could be something more insidious related to the frag size or the large filesystem (internal overflows or something like that), as you posit.

It is also possible that the blocksize is only the indirect effect, and that the real problem is that the larger per-cylinder bitmaps are creating more opportunities for Kirk's background bitmap write code (which is independent of softupdates) to interfere with the clustering code. I have a patch for that below (not well tested, and Kirk hasn't gotten back to me on it yet, so I don't know how real the problem is; it looks real, though).
-Matt

Index: vfs_cluster.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/vfs_cluster.c,v
retrieving revision 1.92.2.3
diff -u -r1.92.2.3 vfs_cluster.c
--- vfs_cluster.c	2000/12/30 01:51:07	1.92.2.3
+++ vfs_cluster.c	2001/01/06 19:40:52
@@ -392,15 +392,22 @@
 			tbp = getblk(vp, lbn + i, size, 0, 0);
-			if ((tbp->b_flags & B_CACHE) ||
-				(tbp->b_flags & B_VMIO) == 0) {
+			/*
+			 * If the buffer is already fully valid or locked
+			 * (which could also mean that a background write is
+			 * in progress), or the buffer is not backed by VMIO,
+			 * stop.
+			 */
+			if ((tbp->b_flags & (B_CACHE|B_LOCKED)) ||
+				(tbp->b_flags & B_VMIO) == 0) {
 				bqrelse(tbp);
 				break;
 			}
-			for (j = 0;j < tbp->b_npages; j++)
+			for (j = 0;j < tbp->b_npages; j++) {
 				if (tbp->b_pages[j]->valid)
 					break;
+			}
 			if (j != tbp->b_npages) {
 				bqrelse(tbp);
@@ -701,8 +708,13 @@
 	while (len > 0) {
 		s = splbio();
+		/*
+		 * If the buffer is not delayed-write (i.e. dirty), or it
+		 * is delayed-write but either locked or inval, it cannot
+		 * partake in the clustered write.
+		 */
 		if (((tbp = gbincore(vp, start_lbn)) == NULL) ||
-		  ((tbp->b_flags & (B_INVAL | B_DELWRI)) != B_DELWRI) ||
+		  ((tbp->b_flags & (B_LOCKED | B_INVAL | B_DELWRI)) != B_DELWRI) ||
 		  BUF_LOCK(tbp, LK_EXCLUSIVE | LK_NOWAIT)) {
 			++start_lbn;
 			--len;
@@ -774,12 +786,16 @@
 				/*
 				 * If it IS in core, but has different
-				 * characteristics, don't cluster with it.
+				 * characteristics, or is locked (which
+				 * means it could be undergoing a background
+				 * I/O or be in a weird state), then don't
+				 * cluster with it.
 				 */
 				if ((tbp->b_flags & (B_VMIO | B_CLUSTEROK |
 				    B_INVAL | B_DELWRI | B_NEEDCOMMIT))
 				  != (B_DELWRI | B_CLUSTEROK |
 				    (bp->b_flags & (B_VMIO | B_NEEDCOMMIT))) ||
+				    (tbp->b_flags & B_LOCKED) ||
 				    tbp->b_wcred != bp->b_wcred ||
 				    BUF_LOCK(tbp, LK_EXCLUSIVE | LK_NOWAIT)) {
 					splx(s);
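
For anyone reading the second hunk, the mask-and-compare test may be easier to follow in isolation. The sketch below is a standalone userland illustration, not kernel code: the flag values are made up for the example (the real definitions are in the kernel's buf.h), and eligible_old(), eligible_new() and check_examples() are hypothetical helper names. It expresses the condition in its positive form ("may this buffer join a clustered write?"), whereas the patch tests the negation in order to skip the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical bit values, chosen only for this illustration.
 * They are NOT the values used by the FreeBSD kernel.
 */
#define B_DELWRI 0x01u	/* delayed-write, i.e. dirty */
#define B_INVAL  0x02u	/* buffer contents are invalid */
#define B_LOCKED 0x04u	/* locked, possibly a background write in progress */

/* Pre-patch test: dirty and not invalid. B_LOCKED is not examined. */
bool
eligible_old(unsigned flags)
{
	return ((flags & (B_INVAL | B_DELWRI)) == B_DELWRI);
}

/*
 * Post-patch test: adding B_LOCKED to the mask means the comparison
 * only succeeds when B_DELWRI is set and both B_INVAL and B_LOCKED
 * are clear.
 */
bool
eligible_new(unsigned flags)
{
	return ((flags & (B_LOCKED | B_INVAL | B_DELWRI)) == B_DELWRI);
}

/* Walk the interesting flag combinations. */
void
check_examples(void)
{
	/* A plain dirty buffer is accepted by both tests. */
	assert(eligible_old(B_DELWRI) && eligible_new(B_DELWRI));

	/* A dirty but locked buffer slips past the old test only. */
	assert(eligible_old(B_DELWRI | B_LOCKED));
	assert(!eligible_new(B_DELWRI | B_LOCKED));

	/* Clean or invalid buffers are rejected by both. */
	assert(!eligible_old(0) && !eligible_new(0));
	assert(!eligible_old(B_DELWRI | B_INVAL));
	assert(!eligible_new(B_DELWRI | B_INVAL));
}
```

The only case where the two tests differ is a dirty buffer that is also locked, which is the case the patch is meant to keep out of the cluster. The third hunk gets the same effect with a separate (tbp->b_flags & B_LOCKED) term instead of widening the mask.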