Date:      Mon, 22 Aug 2011 12:30:11 GMT
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-fs@FreeBSD.org
Subject:   Re: amd64/159930: kernel core
Message-ID:  <201108221230.p7MCUBoN076569@freefall.freebsd.org>

The following reply was made to PR kern/159930; it has been noted by GNATS.

From: John Baldwin <jhb@freebsd.org>
To: freebsd-amd64@freebsd.org
Cc: Wouter Snels <nospam@ofloo.net>,
 freebsd-gnats-submit@freebsd.org
Subject: Re: amd64/159930: kernel core
Date: Mon, 22 Aug 2011 08:27:34 -0400

 On Friday, August 19, 2011 6:50:51 pm Wouter Snels wrote:
 > 
 > >Number:         159930
 > >Category:       amd64
 > >Synopsis:       kernel core
 > >Confidential:   no
 > >Severity:       non-critical
 > >Priority:       medium
 > >Responsible:    freebsd-amd64
 > >State:          open
 > >Quarter:        
 > >Keywords:       
 > >Date-Required:
 > >Class:          sw-bug
 > >Submitter-Id:   current-users
 > >Arrival-Date:   Fri Aug 19 23:00:25 UTC 2011
 > >Closed-Date:
 > >Last-Modified:
 > >Originator:     Wouter Snels
 > >Release:        FreeBSD 8.2
 > >Organization:
 > >Environment:
 > FreeBSD spark.ofloo.net 8.2-RELEASE-p2 FreeBSD 8.2-RELEASE-p2 #0: Wed Jul 13 15:20:57 CEST 2011     ofloo@spark.ofloo.net:/usr/obj/usr/src/sys/OFL  amd64
 > 
 > >Description:
 > Fatal trap 12: page fault while in kernel mode
 > cpuid = 2; apic id = 02
 > fault virtual address   = 0x30
 > fault code              = supervisor read data, page not present
 > instruction pointer     = 0x20:0xffffffff805dd943
 > stack pointer           = 0x28:0xffffff8091e3d6c0
 > frame pointer           = 0x28:0xffffff8091e3d6f0
 > code segment            = base 0x0, limit 0xfffff, type 0x1b
 >                         = DPL 0, pres 1, long 1, def32 0, gran 1
 > processor eflags        = interrupt enabled, resume, IOPL = 0
 > current process         = 18 (softdepflush)
 > trap number             = 12
 > panic: page fault
 > cpuid = 2
 > KDB: stack backtrace:
 > #0 0xffffffff8063300e at kdb_backtrace+0x5e
 > #1 0xffffffff80602627 at panic+0x187
 > #2 0xffffffff808fbbe0 at trap_fatal+0x290
 > #3 0xffffffff808fbfbf at trap_pfault+0x28f
 > #4 0xffffffff808fc49f at trap+0x3df
 > #5 0xffffffff808e4644 at calltrap+0x8
 > #6 0xffffffff805f668a at priv_check_cred+0x3a
 > #7 0xffffffff8084ebd0 at chkdq+0x310
 > #8 0xffffffff8082db5d at ffs_truncate+0xfed
 > #9 0xffffffff8084ac5c at ufs_inactive+0x21c
 > #10 0xffffffff8068a761 at vinactive+0x71
 > #11 0xffffffff806904b8 at vputx+0x2d8
 > #12 0xffffffff80836386 at handle_workitem_remove+0x206
 > #13 0xffffffff8083675e at process_worklist_item+0x20e
 > #14 0xffffffff80838893 at softdep_process_worklist+0xe3
 > #15 0xffffffff80839d3c at softdep_flush+0x17c
 > #16 0xffffffff805d9f28 at fork_exit+0x118
 > #17 0xffffffff808e4b0e at fork_trampoline+0xe
 > Uptime: 2d4h7m56s
 > Cannot dump. Device not defined or unavailable.
 > Automatic reboot in 15 seconds - press a key on the console to abort
 > panic: bufwrite: buffer is not busy???
 
 Hmm, the panic seems to be caused by a null ucred pointer passed to 
 priv_check_cred() in chkdq():
 
         if ((flags & FORCE) == 0 &&
             priv_check_cred(cred, PRIV_VFS_EXCEEDQUOTA, 0))
                 do_check = 1;
         else
                 do_check = 0;
 
 However, ffs_truncate() passes in NOCRED for its credential:
 
         if ((flags & IO_EXT) && extblocks > 0) {
                 ...
 #ifdef QUOTA
                         (void) chkdq(ip, -extblocks, NOCRED, 0);
 #endif
 
 A few other places call chkdq() with NOCRED (but not with the FORCE flag):
 
 ffs/ffs_inode.c:522:    (void) chkdq(ip, -blocksreleased, NOCRED, 0);
 ffs/ffs_softdep.c:6201: (void) chkdq(ip, -datablocks, NOCRED, 0);
 ffs/ffs_softdep.c:6431: (void) chkdq(ip, -datablocks, NOCRED, 0);
 
 Hmm, all these calls should be passing in a negative value though, and 
 reducing usage takes a shorter path at the start of chkdq() that always 
 returns without ever reaching the call to priv_check_cred().  The same is 
 true if the value (e.g. extblocks) is 0.  To reach priv_check_cred() the 
 change passed to chkdq() has to be positive, which implies that extblocks 
 was a negative value.  That seems very odd, especially given the logic in 
 ffs_truncate():
 
         if ((flags & IO_EXT) && extblocks > 0) {
                 ...
                         if ((error = ffs_syncvnode(vp, MNT_WAIT)) != 0)
                                 return (error);
 #ifdef QUOTA
                         (void) chkdq(ip, -extblocks, NOCRED, 0);
 #endif
 
 Nothing changes extblocks in between that check and the call to chkdq().  It 
 would probably be best to get a crashdump if this is reproducible so we can 
 investigate it further.
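
 To make the reasoning above concrete, here is a minimal user-space sketch of 
 the control flow described for chkdq().  It is not the kernel source: 
 model_chkdq(), struct model_cred, MODEL_NOCRED, MODEL_FORCE and the PATH_* 
 values are hypothetical names made up for illustration, and the NULL 
 credential case is reported as a return value rather than dereferenced.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct model_cred { int uid; };
#define MODEL_NOCRED	((struct model_cred *)NULL)
#define MODEL_FORCE	0x01

enum model_path {
	PATH_NO_CHANGE,		/* change == 0: nothing to account for */
	PATH_RELEASE,		/* change < 0: usage reduced, early return */
	PATH_FORCED,		/* change > 0 with FORCE: privilege check skipped */
	PATH_PRIV_CHECK,	/* change > 0: credential is examined */
	PATH_NULL_CRED		/* change > 0 and NULL cred: the kernel would fault */
};

/*
 * Simplified model of the paths discussed in this message.  Only a positive
 * change without FORCE ever looks at the credential.
 */
enum model_path
model_chkdq(long change, const struct model_cred *cred, int flags)
{
	if (change == 0)
		return (PATH_NO_CHANGE);
	if (change < 0)
		return (PATH_RELEASE);
	if ((flags & MODEL_FORCE) != 0)
		return (PATH_FORCED);
	if (cred == NULL)
		return (PATH_NULL_CRED);
	return (PATH_PRIV_CHECK);
}
```

 In this model, any call of the form model_chkdq(-extblocks, MODEL_NOCRED, 0) 
 with extblocks > 0 ends in PATH_RELEASE.  Only a negative extblocks reaches 
 PATH_NULL_CRED, which is why the backtrace is surprising.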
 
 -- 
 John Baldwin


