Date:      Thu, 9 Apr 1998 17:40:57 -0700 (PDT)
From:      Matt Dillon <dillon@best.net>
To:        FreeBSD-gnats-submit@FreeBSD.ORG
Subject:   kern/6258: FreeBSD-2.2.6 VM lockup on kernel map due to brelse calling bfreekva(), dump lockup in getnewbuf() due to fragmented buffer_map
Message-ID:  <199804100040.RAA01574@flea.best.net>

>Number:         6258
>Category:       kern
>Synopsis:       A fix required to prevent kernel lockups in brelse causes the dump program to lock up in 'newbuf'
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr  9 17:50:00 PDT 1998
>Last-Modified:
>Originator:     Matt Dillon
>Organization:
Best Internet Communications, Inc.
>Release:        FreeBSD 2.2.6-STABLE i386
>Environment:

	Heavily loaded shell machine, PPro 200, 128MB of RAM.

>Description:

	Problem #1:  The kernel locks up on the kernel map because brelse()
	calls bfreekva() from a SCSI interrupt while the kernel map is already
	locked (an earlier problem report covers this).  The fix for this was
	to defer the bfreekva() call.
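
	As a rough illustration of the deferral idea (not the kernel code
	itself), the user-space sketch below models the map lock as a plain
	flag.  Every name prefixed with sketch_ is hypothetical.  The release
	path only records that the buffer is idle, so it is harmless even if
	it runs while the "map" is locked; the map is touched later, from a
	context that can safely take the lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct buf; only the fields the sketch needs. */
struct sketch_buf {
    size_t kvasize;   /* bytes of KVA still mapped; 0 once reclaimed */
    int    on_empty;  /* nonzero once the buffer has been released */
};

/* Stand-in for the kernel map lock (assumption: a simple flag). */
static int sketch_map_locked;

/* Models bfreekva(): it manipulates the map, so it needs the lock. */
static void sketch_freekva(struct sketch_buf *bp)
{
    assert(!sketch_map_locked);  /* entering with the lock held is the lockup */
    sketch_map_locked = 1;
    bp->kvasize = 0;
    sketch_map_locked = 0;
}

/* Models the patched brelse(): no map work, safe from an interrupt. */
static void sketch_brelse(struct sketch_buf *bp)
{
    bp->on_empty = 1;            /* KVA is deliberately left mapped */
}

/* Models the deferred cleanup done later on the allocation path. */
static void sketch_reclaim(struct sketch_buf *bp)
{
    if (bp->on_empty && bp->kvasize)
        sketch_freekva(bp);
}
```

	Calling sketch_brelse() while sketch_map_locked is set leaves the
	buffer's KVA in place; a later sketch_reclaim() with the lock clear
	frees it.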

	Problem #2:  Unfortunately, that fix appears to create a new problem.
	Because bfreekva() is no longer called when a buffer is released, the
	buffer_map can become fragmented, which prevents large getnewbuf()
	allocations from succeeding.
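
	To make the fragmentation argument concrete, here is a small
	user-space model, assuming the buffer_map is just an array of
	fixed-size slots (FRAG_SLOTS and the frag_ functions are made up for
	this sketch and have nothing to do with the real vm_map code).  Half
	of the map is free, yet a request for two contiguous slots fails.

```c
#include <assert.h>
#include <stddef.h>

#define FRAG_SLOTS 16  /* hypothetical map size, in fixed-size slots */

/* Count free slots in the map (0 = free, 1 = in use). */
static size_t frag_free_slots(const unsigned char map[FRAG_SLOTS])
{
    size_t i, n = 0;

    for (i = 0; i < FRAG_SLOTS; i++)
        if (!map[i])
            n++;
    return n;
}

/* First-fit search for `want` contiguous free slots; -1 if none exists. */
static int frag_find_space(const unsigned char map[FRAG_SLOTS], size_t want)
{
    size_t i, run = 0;

    assert(want > 0 && want <= FRAG_SLOTS);
    for (i = 0; i < FRAG_SLOTS; i++) {
        run = map[i] ? 0 : run + 1;
        if (run == want)
            return (int)(i + 1 - want);
    }
    return -1;
}

/* Worst case: every other slot is still held by a released buffer. */
static void frag_fragment(unsigned char map[FRAG_SLOTS])
{
    size_t i;

    for (i = 0; i < FRAG_SLOTS; i++)
        map[i] = (i % 2 == 0);
}
```

	After frag_fragment(), frag_free_slots() reports 8 free slots, but
	frag_find_space() can satisfy only single-slot requests.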

	The dump program attempts to allocate a 64K buffer to load the
	disklabel.  About once a week the program locks up in a 'newbuf'
	wait state but still eats CPU: it is constantly being woken up and
	then (as far as I can tell) is unable to allocate a bp of sufficient
	size.  This becomes a permanent condition.

	Since we can't put the bfreekva() call back into brelse(), my solution
	is to put code in getnewbuf().  If the vm_map_findspace() call
	fails, my proposed code (the second set of changes below) frees
	the KVM mappings of all bp's on the EMPTY queue in an attempt to
	defragment the buffer_map, then retries the vm_map_findspace() call.
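
	The defragment-and-retry idea can be sketched in user space as
	follows.  This is only a toy model under stated assumptions: the map
	is an array of slots, and struct toy_buf, toy_findspace(),
	toy_freekva() and toy_alloc() are hypothetical stand-ins, not the
	kernel's struct buf, vm_map_findspace(), bfreekva() or getnewbuf().
	The sweep uses its own loop index so that a buffer the caller is
	holding would not be disturbed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 16  /* hypothetical map size, in fixed-size slots */
#define TOY_NBUF   8  /* hypothetical number of buffers */

struct toy_buf {
    int    on_empty_queue;  /* nonzero if the buffer is idle (EMPTY queue) */
    int    kvabase;         /* first map slot held by this buffer */
    size_t kvasize;         /* number of slots held; 0 = none */
};

static unsigned char  toy_map[TOY_SLOTS];   /* 0 = free, 1 = in use */
static struct toy_buf toy_bufs[TOY_NBUF];

/* First-fit search for `want` contiguous free slots; -1 on failure. */
static int toy_findspace(size_t want)
{
    size_t i, run = 0;

    for (i = 0; i < TOY_SLOTS; i++) {
        run = toy_map[i] ? 0 : run + 1;
        if (run == want)
            return (int)(i + 1 - want);
    }
    return -1;
}

/* Give a buffer's slots back to the map. */
static void toy_freekva(struct toy_buf *bp)
{
    size_t i;

    for (i = 0; i < bp->kvasize; i++)
        toy_map[bp->kvabase + i] = 0;
    bp->kvasize = 0;
}

/* Find space; on failure, strip idle buffers of their slots and retry once. */
static int toy_alloc(size_t want)
{
    int addr = toy_findspace(want);
    size_t i;

    if (addr < 0) {
        for (i = 0; i < TOY_NBUF; i++)
            if (toy_bufs[i].on_empty_queue && toy_bufs[i].kvasize)
                toy_freekva(&toy_bufs[i]);
        addr = toy_findspace(want);
    }
    return addr;  /* like the search it models, this does not reserve the space */
}

/* Buffer i holds slot 2*i; buffers 0..3 are idle, 4..7 are in use. */
static void toy_setup(void)
{
    size_t i;

    for (i = 0; i < TOY_SLOTS; i++)
        toy_map[i] = 0;
    for (i = 0; i < TOY_NBUF; i++) {
        toy_bufs[i].on_empty_queue = (i < 4);
        toy_bufs[i].kvabase = (int)(2 * i);
        toy_bufs[i].kvasize = 1;
        toy_map[2 * i] = 1;
    }
}
```

	After toy_setup(), a four-slot request fails on the first search and
	succeeds once the idle buffers have been stripped.  A request larger
	than the space that sweep can open up still fails, which is why the
	real code still needs its fallback path.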

	I'm running this code now, but the new code path hasn't been
	triggered yet.

>How-To-Repeat:

	I can't reproduce it reliably.  The problem happens once a week
	or so on our admin machine.  However, I believe the general problem
	is important enough to be flagged critical, since the original
	'#ifdef notdef' patch to remove bfreekva() apparently did not make it
	into 2.2.6.  I don't know why.  brelse() is a critical kernel call
	that can occur in an interrupt and should not do anything complex...
	certainly not call bfreekva().  Manipulating the kernel map, and the
	sundry activity associated with it, is much safer to do in
	getnewbuf() than in brelse().

>Fix:


--- LINK/vfs_bio.c	Fri Mar 13 13:13:57 1998
+++ vfs_bio.c	Thu Apr  9 17:38:01 1998
@@ -597,10 +597,12 @@
 		LIST_REMOVE(bp, b_hash);
 		LIST_INSERT_HEAD(&invalhash, bp, b_hash);
 		bp->b_dev = NODEV;
+#ifdef notdef
 		/*
 		 * Get rid of the kva allocation *now*
 		 */
 		bfreekva(bp);
+#endif
 		if (needsbuffer) {
 			wakeup(&needsbuffer);
 			needsbuffer=0;
@@ -986,9 +988,34 @@
 		 */
 		if (vm_map_findspace(buffer_map,
 			vm_map_min(buffer_map), maxsize, &addr)) {
-			bp->b_flags |= B_INVAL;
-			brelse(bp);
-			goto trytofreespace;
+
+			/*
+			 * Matt hack.  Since we can't call bfreekva() in
+			 * brelse(), the bp's on the EMPTY list may all
+			 * still have allocated KVM.  If we can't find
+			 * unused space in the buffer_map, we should try
+			 * to defragment the map by freeing as much from
+			 * the empty list as possible.
+			 */
+			printf("vm_map_findspace() failed, defragmenting freelist\n");
+			{
+			    struct buf *ebp;	/* separate cursor: bp is still used below */
+			    for (ebp = TAILQ_FIRST(&bufqueues[QUEUE_EMPTY]);
+				ebp; ebp = TAILQ_NEXT(ebp, b_freelist)) {
+				if (ebp->b_kvasize)
+				    bfreekva(ebp);
+				if (ebp->b_qindex != QUEUE_EMPTY)
+				    break;
+			    }
+			}
+			addr = 0;
+			if (vm_map_findspace(buffer_map,
+				vm_map_min(buffer_map), maxsize, &addr)) {
+
+				bp->b_flags |= B_INVAL;
+				brelse(bp);
+				goto trytofreespace;
+			}
 		}
 	}
 
>Audit-Trail:
>Unformatted:
