From owner-freebsd-bugs Thu Nov 5 07:20:00 1998 Return-Path: Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id HAA23736 for freebsd-bugs-outgoing; Thu, 5 Nov 1998 07:20:00 -0800 (PST) (envelope-from owner-freebsd-bugs@FreeBSD.ORG) Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id HAA23702 for ; Thu, 5 Nov 1998 07:19:59 -0800 (PST) (envelope-from gnats@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.8.8/8.8.5) id HAA23940; Thu, 5 Nov 1998 07:20:01 -0800 (PST) Received: from bsd7.cs.sunysb.edu (bsd7.cs.sunysb.edu [130.245.1.197]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id HAA23354 for ; Thu, 5 Nov 1998 07:14:38 -0800 (PST) (envelope-from gene@starkhome.cs.sunysb.edu) Received: (from uucp@localhost) by bsd7.cs.sunysb.edu (8.8.7/8.8.5) with UUCP id KAA00674 for FreeBSD-gnats-submit@freebsd.org; Thu, 5 Nov 1998 10:14:22 -0500 (EST) Received: (from gene@localhost) by starkhome.cs.sunysb.edu (8.8.7/8.8.5) id JAA27573; Thu, 5 Nov 1998 09:55:46 -0500 (EST) Message-Id: <199811051455.JAA27573@starkhome.cs.sunysb.edu> Date: Thu, 5 Nov 1998 09:55:46 -0500 (EST) From: stark@FreeBSD.ORG Reply-To: stark@FreeBSD.ORG To: FreeBSD-gnats-submit@FreeBSD.ORG X-Send-Pr-Version: 3.2 Subject: kern/8580: Hanging NFS pagein in nfs_bio.c (2.2.7, workaround patch) Sender: owner-freebsd-bugs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org >Number: 8580 >Category: kern >Synopsis: Hanging NFS pagein in nfs_bio.c (2.2.7, workaround patch) >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Thu Nov 5 07:20:00 PST 1998 >Last-Modified: >Originator: Gene Stark >Organization: SUNY at Stony Brook CS Dept. >Release: FreeBSD 2.2.7-RELEASE i386 >Environment: FreeBSD 2.2.7-RELEASE on a system with user files located on an NFS server. Filesystem buffer size 8192. >Description: A race condition involving ptrace and NFS causes the system to loop in nfs_bioread() while attempting to page in text segment. >How-To-Repeat: Run a certain program under gdb Interrupt the program by ^C Install a breakpoint Re-run the program (so that it is killed and restarts right away) System loops in kernel trying to page in (over NFS) the text page that has the breakpoint in it. The bug is timing-dependent -- I can reproduce the hang a certain fraction of the time, but it seems to require other system load to be present. To get a core dump I had to turn on RB_NOSYNC, because otherwise the system hangs syncing the FS. The looping occurs between "again:" and the "goto again" in the following portion of nfs_bioread(): if (not_readin && n > 0) { if (on < bp->b_validoff || (on + n) > bp->b_validend) { bp->b_flags |= B_NOCACHE; if (bp->b_dirtyend > 0) { if ((bp->b_flags & B_DELWRI) == 0) panic("nfsbioread"); if (VOP_BWRITE(bp) == EINTR) return (EINTR); } else brelse(bp); goto again; } } The system hangs because it is spinning here -- maybe for general robustness the CPU should be yielded if possible in this loop. The reason for the looping is that the desired page of bp is outside the valid range. Stepping through the code with DDB (below), shows that each time around the loop, the buffer bp is not in the cache, and is getting reconstituted. I am using a bsize of 8192, so there are two pages per buffer. The second page is the one that is desired. Examining the VM pages in the buffer shows the first one valid, but the second one invalid, with m->busy == 1, but with PG_BUSY not set. Thus, it seems that pagein is supposed to be occurring. Either the pagein never got started, or else somehow it's completion was missed or forgotten. What seems to be the case is that when the breakpoint is set, a COW occurs, moving? the VM page from the underlying object (assocated with a vnode/NFS pager) to the top-level, anonymous object. Then this process terminates, and a new process is restarted, freeing the top-level object but leaving no VM page in the underlying object. When the new process tries to fetch from this page, it faults. Somehow there is a race condition involving the pagein of this page, leading to the observed looping. (DDB trace showing loop follows) --------- After 72 instructions (0 loads, 0 stores), Stopped at _vm_map_simplify_entry+0x175: call _vm_object_deallocate db> trace _vm_map_simplify_entry(f09f5f00,f1466e80) at _vm_map_simplify_entry+0x175 __vm_map_clip_start(f09f5f00,f1466e80,f3172000) at __vm_map_clip_start+0x14 _vm_map_delete(f09f5f00,f3172000,f3174000,f03983ec,efbffd14) at _vm_map_delete+0x3a _bfreekva(f2d567b4) at _bfreekva+0x22 _brelse(f2d567b4) at _brelse+0x300 _nfs_bioread(f13d3780,efbffe44,8,f102f780,1) at _nfs_bioread+0x716 _nfs_getpages(efbffe80) at _nfs_getpages+0xc8 _vnode_pager_getpages(f1478f80,efbfff38,1,0,efbfff58) at _vnode_pager_getpages+0x64 _vm_pager_get_pages(f1478f80,efbfff38,1,0) at _vm_pager_get_pages+0x1f _vm_fault(f13eb000,a000,1,0,0) at _vm_fault+0x5be _trap_pfault(efbfffbc,1) at _trap_pfault+0xe8 _trap(27,27,9f68,a748,efbfd0fc) at _trap+0x12f calltrap() at calltrap+0x15 --- trap 0xc, eip = 0x2000e8a9, esp = 0xefbfd0c8, ebp = 0xefbfd0fc --- db> step ,100 Stopped at __vm_map_clip_end+0xaf: call _vm_object_reference db> trace __vm_map_clip_end(f09f5f00,f1466e80,f3174000) at __vm_map_clip_end+0xaf _vm_map_delete(f09f5f00,f3172000,f3174000,f03983ec,efbffd14) at _vm_map_delete+0x93 _bfreekva(f2d567b4) at _bfreekva+0x22 _brelse(f2d567b4) at _brelse+0x300 _nfs_bioread(f13d3780,efbffe44,8,f102f780,1) at _nfs_bioread+0x716 _nfs_getpages(efbffe80) at _nfs_getpages+0xc8 _vnode_pager_getpages(f1478f80,efbfff38,1,0,efbfff58) at _vnode_pager_getpages+0x64 _vm_pager_get_pages(f1478f80,efbfff38,1,0) at _vm_pager_get_pages+0x1f _vm_fault(f13eb000,a000,1,0,0) at _vm_fault+0x5be _trap_pfault(efbfffbc,1) at _trap_pfault+0xe8 _trap(27,27,9f68,a748,efbfd0fc) at _trap+0x12f calltrap() at calltrap+0x15 --- trap 0xc, eip = 0x2000e8a9, esp = 0xefbfd0c8, ebp = 0xefbfd0fc --- db> unti After 10 instructions (0 loads, 0 stores), Stopped at _vm_object_reference+0x4e: ret db> After 6 instructions (0 loads, 0 stores), Stopped at __vm_map_clip_end+0xbb: ret db> After 28 instructions (0 loads, 0 stores), Stopped at _vm_map_delete+0x157: call _pmap_remove db> After 19 instructions (0 loads, 0 stores), Stopped at _pmap_remove+0x46: call _get_ptbase db> After 11 instructions (0 loads, 0 stores), Stopped at _get_ptbase+0x30: ret db> After 51 instructions (0 loads, 0 stores), Stopped at _pmap_remove+0xd3: ret db> After 4 instructions (0 loads, 0 stores), Stopped at _vm_map_delete+0x163: call _vm_map_entry_delete db> After 20 instructions (0 loads, 0 stores), Stopped at _vm_map_entry_delete+0x3b: call _vm_object_deallocate db> After 20 instructions (0 loads, 0 stores), Stopped at _vm_object_deallocate+0x1b3: ret db> After 4 instructions (0 loads, 0 stores), Stopped at _vm_map_entry_delete+0x45: call _vm_map_entry_dispose db> After 18 instructions (0 loads, 0 stores), Stopped at _vm_map_entry_dispose+0x81: ret db> After 5 instructions (0 loads, 0 stores), Stopped at _vm_map_entry_delete+0x50: ret db> After 16 instructions (0 loads, 0 stores), Stopped at _vm_map_delete+0x185: ret db> After 4 instructions (0 loads, 0 stores), Stopped at _bfreekva+0x2d: ret db> After 18 instructions (0 loads, 0 stores), Stopped at _brelse+0x45b: ret db> After 21 instructions (0 loads, 0 stores), Stopped at _nfs_bioread+0x5d6: call _nfs_getcacheblk db> After 20 instructions (0 loads, 0 stores), Stopped at _nfs_getcacheblk+0x81: call _getblk db> After 24 instructions (0 loads, 0 stores), Stopped at _getblk+0x5f: call _gbincore db> After 23 instructions (0 loads, 0 stores), Stopped at _gbincore+0x37: ret db> After 9 instructions (0 loads, 0 stores), Stopped at _getblk+0x13c: call _getnewbuf db> After 19 instructions (0 loads, 0 stores), Stopped at _getnewbuf+0x41: call _bremfree db> After 33 instructions (0 loads, 0 stores), Stopped at _bremfree+0x7e: ret db> After 46 instructions (0 loads, 0 stores), Stopped at _getnewbuf+0x2d4: call _bfreekva db> After 9 instructions (0 loads, 0 stores), Stopped at _bfreekva+0x2d: ret db> After 7 instructions (0 loads, 0 stores), Stopped at _getnewbuf+0x2e7: call _vm_map_findspace db> After 202 instructions (0 loads, 0 stores), Stopped at _vm_map_findspace+0xb6: ret db> After 21 instructions (0 loads, 0 stores), Stopped at _getnewbuf+0x339: call _vm_map_insert db> After 26 instructions (0 loads, 0 stores), Stopped at _vm_map_insert+0x5c: call _vm_map_lookup_entry db> After 38 instructions (0 loads, 0 stores), Stopped at _vm_map_lookup_entry+0x6e: ret db> After 66 instructions (0 loads, 0 stores), Stopped at _vm_map_insert+0x250: ret db> After 12 instructions (0 loads, 0 stores), Stopped at _getnewbuf+0x356: ret db> After 12 instructions (0 loads, 0 stores), Stopped at _getblk+0x196: calll *%eax db> After 10 instructions (0 loads, 0 stores), Stopped at _nfs_islocked+0x1b: ret db> After 6 instructions (0 loads, 0 stores), Stopped at _getblk+0x1a3: call _gbincore db> After 23 instructions (0 loads, 0 stores), Stopped at _gbincore+0x37: ret db> After 9 instructions (0 loads, 0 stores), Stopped at _getblk+0x1cf: call _bgetvp db> After 40 instructions (0 loads, 0 stores), Stopped at _bgetvp+0x8c: ret db> After 40 instructions (0 loads, 0 stores), Stopped at _getblk+0x25f: call _allocbuf db> After 90 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x412: call _vm_page_lookup db> After 52 instructions (0 loads, 0 stores), Stopped at _vm_page_lookup+0x96: ret db> After 30 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x51a: call _vfs_buf_set_valid db> After 67 instructions (0 loads, 0 stores), Stopped at _vfs_buf_set_valid+0x87: ret db> After 3 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x523: call _vm_page_wire db> After 13 instructions (0 loads, 0 stores), Stopped at _vm_page_wire+0x27: call _vm_page_unqueue db> After 34 instructions (0 loads, 0 stores), Stopped at _vm_page_unqueue+0x7a: ret db> After 16 instructions (0 loads, 0 stores), Stopped at _vm_page_wire+0x60: ret db> After 32 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x412: call _vm_page_lookup db> After 59 instructions (0 loads, 0 stores), Stopped at _vm_page_lookup+0x96: ret db> After 30 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x51a: call _vfs_buf_set_valid db> After 32 instructions (0 loads, 0 stores), Stopped at _vfs_buf_set_valid+0x87: ret db> After 3 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x523: call _vm_page_wire db> After 13 instructions (0 loads, 0 stores), Stopped at _vm_page_wire+0x27: call _vm_page_unqueue db> After 33 instructions (0 loads, 0 stores), Stopped at _vm_page_unqueue+0x7a: ret db> After 16 instructions (0 loads, 0 stores), Stopped at _vm_page_wire+0x60: ret db> After 27 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x585: call _pmap_qenter db> After 53 instructions (0 loads, 0 stores), Stopped at _pmap_qenter+0x54: ret db> After 24 instructions (0 loads, 0 stores), Stopped at _allocbuf+0x5d2: ret db> After 7 instructions (0 loads, 0 stores), Stopped at _getblk+0x26d: ret db> After 14 instructions (0 loads, 0 stores), Stopped at _nfs_getcacheblk+0xaa: ret db> After 41 instructions (0 loads, 0 stores), Stopped at _nfs_bioread+0x711: call _brelse db> After 77 instructions (0 loads, 0 stores), Stopped at _brelse+0x1d1: call _vm_page_test_dirty db> After 9 instructions (0 loads, 0 stores), Stopped at _vm_page_test_dirty+0x21: ret db> After 32 instructions (0 loads, 0 stores), Stopped at _brelse+0x1d1: call _vm_page_test_dirty db> After 8 instructions (0 loads, 0 stores), Stopped at _vm_page_test_dirty+0x10: call _pmap_is_modified db> After 5 instructions (0 loads, 0 stores), Stopped at _pmap_is_modified+0x8: call _pmap_testbit db> After 37 instructions (0 loads, 0 stores), Stopped at _pmap_testbit+0x100: ret db> After 2 instructions (0 loads, 0 stores), Stopped at _pmap_is_modified+0xe: ret db> After 5 instructions (0 loads, 0 stores), Stopped at _vm_page_test_dirty+0x21: ret db> After 7 instructions (0 loads, 0 stores), Stopped at _brelse+0x1e4: call _vm_page_set_invalid db> After 37 instructions (0 loads, 0 stores), Stopped at _vm_page_set_invalid+0x81: ret db> After 8 instructions (0 loads, 0 stores), Stopped at _brelse+0x1fd: call _pmap_page_protect db> After 10 instructions (0 loads, 0 stores), Stopped at _pmap_page_protect+0x21: call _pmap_remove_all db> After 39 instructions (0 loads, 0 stores), Stopped at _pmap_remove_all+0x18b: ret db> After 2 instructions (0 loads, 0 stores), Stopped at _pmap_page_protect+0x27: ret db> After 21 instructions (0 loads, 0 stores), Stopped at _brelse+0x263: call _vfs_vmio_release db> After 14 instructions (0 loads, 0 stores), Stopped at _vfs_vmio_release+0x2b: call _vm_page_unwire db> After 29 instructions (0 loads, 0 stores), Stopped at _vm_page_unwire+0x75: ret db> After 27 instructions (0 loads, 0 stores), Stopped at _vfs_vmio_release+0x2b: call _vm_page_unwire db> After 29 instructions (0 loads, 0 stores), Stopped at _vm_page_unwire+0x75: ret db> After 17 instructions (0 loads, 0 stores), Stopped at _vfs_vmio_release+0x11a: call _pmap_qremove db> After 27 instructions (0 loads, 0 stores), Stopped at _pmap_qremove+0x32: ret db> After 8 instructions (0 loads, 0 stores), Stopped at _vfs_vmio_release+0x141: call _brelvp db> After 36 instructions (0 loads, 0 stores), Stopped at _brelvp+0x7d: ret db> After 6 instructions (0 loads, 0 stores), Stopped at _vfs_vmio_release+0x14d: ret db> After 35 instructions (0 loads, 0 stores), Stopped at _brelse+0x2fb: call _bfreekva db> After 12 instructions (0 loads, 0 stores), Stopped at _bfreekva+0x1d: call _vm_map_delete db> After 12 instructions (0 loads, 0 stores), Stopped at _vm_map_delete+0x14: call _vm_map_lookup_entry db> After 26 instructions (0 loads, 0 stores), Stopped at _vm_map_lookup_entry+0x6e: ret db> After 10 instructions (0 loads, 0 stores), Stopped at _vm_map_delete+0x35: call __vm_map_clip_start db> After 9 instructions (0 loads, 0 stores), Stopped at __vm_map_clip_start+0xf: call _vm_map_simplify_entry db> After 72 instructions (0 loads, 0 stores), Stopped at _vm_map_simplify_entry+0x175: call _vm_object_deallocate >Fix: WORKAROUND I have been running the following patch for about a month on a moderately loaded system that currently averages about 60 users during peak periods each day. There can be up to 15 or 20 students doing debugging with GDB at a time. The printf is exercised several times per day, more frequently if lots of people are doing debugging. The system has not wedged since the patch was installed, and I have noticed no other ill effects. Gene Stark stark@freebsd.org # cvs diff -c nfs_bio.c Index: nfs_bio.c =================================================================== RCS file: /A/cvs/src/sys/nfs/nfs_bio.c,v retrieving revision 1.28.2.7 diff -c -r1.28.2.7 nfs_bio.c *** nfs_bio.c 1998/01/28 00:26:54 1.28.2.7 --- nfs_bio.c 1998/10/07 00:16:17 *************** *** 356,363 **** panic("nfsbioread"); if (VOP_BWRITE(bp) == EINTR) return (EINTR); ! } else brelse(bp); goto again; } } --- 356,387 ---- panic("nfsbioread"); if (VOP_BWRITE(bp) == EINTR) return (EINTR); ! } else { ! #ifdef NFS_PTRACE_BUG ! /* ! * E. Stark - 9/30/98 ! * Avoid spinning here on a partial ! * buffer, by invalidating the buffer ! * and yielding the CPU for awhile. ! * There is a race condition involving ! * ptrace writes to text segment that ! * seems to exercise this bug. ! */ ! int ret; ! bp->b_flags |= B_INVAL; brelse(bp); + if((ret = + tsleep (0xdeadbeef, PRIBIO | PCATCH, + "nfsptbug", hz/10)) != 0) { + if(ret != EWOULDBLOCK) { + printf("nfs_bioread: nfs_ptrace_bug (ret=0x%x)\n", ret); + return(ret); + } + } + #else + brelse(bp); + #endif + } goto again; } } >Audit-Trail: >Unformatted: To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-bugs" in the body of the message