Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 26 Aug 2019 20:41:19 -0700
From:      Cy Schubert <Cy.Schubert@cschubert.com>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        Cy Schubert <Cy.Schubert@cschubert.com>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r351472 - head/sys/fs/nullfs
Message-ID:  <201908270341.x7R3fJne062385@slippy.cwsent.com>
In-Reply-To: <CAGudoHHucKGxYKz9L46w-KAhbAKQTXZD6syBPer2aHwh37vx0Q@mail.gmail.com>
References:  <201908250513.x7P5DFFx017824@repo.freebsd.org>  <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert.com>  <201908261411.x7QEBMWi075354@slippy.cwsent.com> <A961554B-2B0B-4382-BA73-D69FC0129D0B@cschubert.com> <C3B4FE65-FC10-4248-AB00-0E30532C77EB@cschubert.com> <CAGudoHHGLguWZZMh%2BTOCBwANKKp7-6z5JvP1NgRo3jAr4vZ%2BPw@mail.gmail.com> <201908270025.x7R0PtNT008594@slippy.cwsent.com> <CAGudoHHucKGxYKz9L46w-KAhbAKQTXZD6syBPer2aHwh37vx0Q@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Making progress:

Unread portion of the kernel message buffer:

panic: null_unlock: LK_INTERLOCK passed
cpuid = 2
time = 1566868081
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfffffe0054a954e0
vpanic() at vpanic+0x19d/frame 0xfffffe0054a95530
panic() at panic+0x43/frame 0xfffffe0054a95590
null_unlock() at null_unlock+0x8a/frame 0xfffffe0054a955d0
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x56/frame 0xfffffe0054a955f0
unionfs_unlock() at unionfs_unlock+0x124/frame 0xfffffe0054a95650
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x56/frame 0xfffffe0054a95670
vfs_domount() at vfs_domount+0xc3f/frame 0xfffffe0054a958a0
vfs_donmount() at vfs_donmount+0x934/frame 0xfffffe0054a95940
sys_nmount() at sys_nmount+0x69/frame 0xfffffe0054a95980
amd64_syscall() at amd64_syscall+0x3ae/frame 0xfffffe0054a95ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0054a95ab0
--- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x2bed6a4a, rsp = 
0x7fffffffe2e8, rbp = 0x7fffffffec70 ---
Uptime: 1m35s


-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.


In message <CAGudoHHucKGxYKz9L46w-KAhbAKQTXZD6syBPer2aHwh37vx0Q@mail.gmail.c
om>
, Mateusz Guzik writes:
> Sorry for the half-assed patch. I have a patch to remove the flags
> argument and the change above was port of it.
>
> Try this please:
> https://people.freebsd.org/~mjg/patches/vop_unlock-interlock2.diff
>
> On 8/27/19, Cy Schubert <cy.schubert@gmail.com> wrote:
> > It won't build. Did you intend VOP_UNLOCK(vp,LK_INTERLOCK)?
> >
> > --- vfs_subr.o ---
> > /opt/src/svn-current/sys/kern/vfs_subr.c:1037:17: error: too few arguments
> > to function call, expected 2, have 1
> >                         VOP_UNLOCK(vp);
> >                         ~~~~~~~~~~   ^
> > ./vnode_if.h:873:1: note: 'VOP_UNLOCK' declared here
> > static __inline int VOP_UNLOCK(
> > ^
> > /opt/src/svn-current/sys/kern/vfs_subr.c:1401:16: error: too few arguments
> > to function call, expected 2, have 1
> >                 VOP_UNLOCK(vp);
> >                 ~~~~~~~~~~   ^
> > ./vnode_if.h:873:1: note: 'VOP_UNLOCK' declared here
> > static __inline int VOP_UNLOCK(
> > ^
> > /opt/src/svn-current/sys/kern/vfs_subr.c:1413:15: error: too few arguments
> > to function call, expected 2, have 1
> >         VOP_UNLOCK(vp);
> >         ~~~~~~~~~~   ^
> > ./vnode_if.h:873:1: note: 'VOP_UNLOCK' declared here
> > static __inline int VOP_UNLOCK(
> > ^
> > 3 errors generated.
> > *** [vfs_subr.o] Error code 1
> >
> > make[2]: stopped in /export/obj/opt/src/svn-current/amd64.amd64/sys/BREAK
> >
> >
> > --
> > Cheers,
> > Cy Schubert <Cy.Schubert@cschubert.com>
> > FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  http://www.FreeBSD.org
> >
> > 	The need of the many outweighs the greed of the few.
> >
> >
> > In message
> > <CAGudoHHGLguWZZMh+TOCBwANKKp7-6z5JvP1NgRo3jAr4vZ+Pw@mail.gmail.c
> > om>
> > , Mateusz Guzik writes:
> >> Does this do the trick for you?
> >>
> >> https://people.freebsd.org/~mjg/vop_unlock-interlock.diff
> >>
> >> On 8/26/19, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> >> > On August 26, 2019 8:40:50 AM PDT, Cy Schubert
> >> > <Cy.Schubert@cschubert.com>
> >> > wrote:
> >> >>On August 26, 2019 7:11:22 AM PDT, Cy Schubert
> >> >><Cy.Schubert@cschubert.com> wrote:
> >> >>>In message <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert.com>, Cy
> >> >>>Schubert wr
> >> >>>ites:
> >> >>>> On August 24, 2019 10:13:15 PM PDT, Mateusz Guzik <mjg@FreeBSD.org>
> >> >>>wrote:
> >> >>>> >Author: mjg
> >> >>>> >Date: Sun Aug 25 05:13:15 2019
> >> >>>> >New Revision: 351472
> >> >>>> >URL: https://svnweb.freebsd.org/changeset/base/351472
> >> >>>> >
> >> >>>> >Log:
> >> >>>> >  nullfs: reduce areas protected by vnode interlock
> >> >>>> >
> >> >>>> >Some places only take the interlock to hold the vnode, which was a
> >> >>>> >requiremnt
> >> >>>> >before they started being manipulated with atomics. Use the newly
> >> >>>> >introduced
> >> >>>> >  vholdnz to bump the count.
> >> >>>> >
> >> >>>> >  Reviewed by:	kib
> >> >>>> >  Tested by:	pho
> >> >>>> >  Sponsored by:	The FreeBSD Foundation
> >> >>>> >  Differential Revision:	https://reviews.freebsd.org/D21358
> >> >>>> >
> >> >>>> >Modified:
> >> >>>> >  head/sys/fs/nullfs/null_vnops.c
> >> >>>> >
> >> >>>> >Modified: head/sys/fs/nullfs/null_vnops.c
> >> >>>>
> >> >>>>======================================================================
> ===
> >> ===
> >> >>>> ==
> >> >>>> >--- head/sys/fs/nullfs/null_vnops.c	Sun Aug 25 05:11:43
> >> >>2019	(r35147
> >> >>>> 1)
> >> >>>> >+++ head/sys/fs/nullfs/null_vnops.c	Sun Aug 25 05:13:15
> >> >>2019	(r35147
> >> >>>> 2)
> >> >>>> >@@ -668,7 +668,7 @@ null_lock(struct vop_lock1_args *ap)
> >> >>>> > 		 * We prevent it from being recycled by holding the vno
> >> de
> >> >>>> > 		 * here.
> >> >>>> > 		 */
> >> >>>> >-		vholdl(lvp);
> >> >>>> >+		vholdnz(lvp);
> >> >>>> > 		error = VOP_LOCK(lvp, flags);
> >> >>>> >
> >> >>>> > 		/*
> >> >>>> >@@ -710,31 +710,16 @@ static int
> >> >>>> > null_unlock(struct vop_unlock_args *ap)
> >> >>>> > {
> >> >>>> > 	struct vnode *vp = ap->a_vp;
> >> >>>> >-	int flags = ap->a_flags;
> >> >>>> >-	int mtxlkflag = 0;
> >> >>>> > 	struct null_node *nn;
> >> >>>> > 	struct vnode *lvp;
> >> >>>> > 	int error;
> >> >>>> >
> >> >>>> >-	if ((flags & LK_INTERLOCK) != 0)
> >> >>>> >-		mtxlkflag = 1;
> >> >>>> >-	else if (mtx_owned(VI_MTX(vp)) == 0) {
> >> >>>> >-		VI_LOCK(vp);
> >> >>>> >-		mtxlkflag = 2;
> >> >>>> >-	}
> >> >>>> > 	nn = VTONULL(vp);
> >> >>>> > 	if (nn != NULL && (lvp = NULLVPTOLOWERVP(vp)) != NULL) {
> >> >>>> >-		VI_LOCK_FLAGS(lvp, MTX_DUPOK);
> >> >>>> >-		flags |= LK_INTERLOCK;
> >> >>>> >-		vholdl(lvp);
> >> >>>> >-		VI_UNLOCK(vp);
> >> >>>> >-		error = VOP_UNLOCK(lvp, flags);
> >> >>>> >+		vholdnz(lvp);
> >> >>>> >+		error = VOP_UNLOCK(lvp, 0);
> >> >>>> > 		vdrop(lvp);
> >> >>>> >-		if (mtxlkflag == 0)
> >> >>>> >-			VI_LOCK(vp);
> >> >>>> > 	} else {
> >> >>>> >-		if (mtxlkflag == 2)
> >> >>>> >-			VI_UNLOCK(vp);
> >> >>>> > 		error = vop_stdunlock(ap);
> >> >>>> > 	}
> >> >>>> >
> >> >>>> >@@ -845,10 +830,8 @@ null_getwritemount(struct
> >> >>>vop_getwritemount_args
> >> >>>> >*ap)
> >> >>>> > 	VI_LOCK(vp);
> >> >>>> > 	xp = VTONULL(vp);
> >> >>>> > 	if (xp && (lowervp = xp->null_lowervp)) {
> >> >>>> >-		VI_LOCK_FLAGS(lowervp, MTX_DUPOK);
> >> >>>> >+		vholdnz(lowervp);
> >> >>>> > 		VI_UNLOCK(vp);
> >> >>>> >-		vholdl(lowervp);
> >> >>>> >-		VI_UNLOCK(lowervp);
> >> >>>> > 		VOP_GETWRITEMOUNT(lowervp, ap->a_mpp);
> >> >>>> > 		vdrop(lowervp);
> >> >>>> > 	} else {
> >> >>>>
> >> >>>> Hi mjg@,
> >> >>>>
> >> >>>> This causes trap 12 a few seconds after mountlate during boot.
> >> >>>Reverting this
> >> >>>>  commit allowed it to boot.
> >> >>>>
> >> >>>> Sorry for no backtrace. I managed to
> >> >>>>  revert and test just prior to rushing out to catch the bus, with no
> >> >>>time to
> >> >>>> craft a proper email. I'll post the backtrace when I arrive at work.
> >> >>>
> >> >>>Hi mjg@,
> >> >>>
> >> >>>The gory details below. Don't be fooled by the panic when yppasswd is
> >> >>>started. I was able to reproduce this panic in single user by simply
> >> >>>mounting and using a nullfs.
> >> >>>
> >> >>>kernel trap 12 with interrupts disabled
> >> >>>
> >> >>>
> >> >>>Fatal trap 12: page fault while in kernel mode
> >> >>>cpuid = 0; apic id = 00
> >> >>>fault virtual address   = 0xc
> >> >>>fault code              = supervisor read data, page not present
> >> >>>instruction pointer     = 0x20:0xffffffff806f0633
> >> >>>stack pointer           = 0x28:0xfffffe004b3fa7c0
> >> >>>frame pointer           = 0x28:0xfffffe004b3fa7e0
> >> >>>code segment            = base 0x0, limit 0xfffff, type 0x1b
> >> >>>                        = DPL 0, pres 1, long 1, def32 0, gran 1
> >> >>>processor eflags        = resume, IOPL = 0
> >> >>>current process         = 26 (syncer)
> >> >>>trap number             = 12
> >> >>>panic: page fault
> >> >>>cpuid = 0
> >> >>>time = 1566821203
> >> >>>KDB: stack backtrace:
> >> >>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> >> >>>0xfffffe004b3fa470
> >> >>>vpanic() at vpanic+0x19d/frame 0xfffffe004b3fa4c0
> >> >>>panic() at panic+0x43/frame 0xfffffe004b3fa520
> >> >>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b3fa580
> >> >>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b3fa5e0
> >> >>>trap() at trap+0x29d/frame 0xfffffe004b3fa6f0
> >> >>>calltrap() at calltrap+0x8/frame 0xfffffe004b3fa6f0
> >> >>>--- trap 0xc, rip = 0xffffffff806f0633, rsp = 0xfffffe004b3fa7c0, rbp
> >> >>=
> >> >>>0xfffffe004b3fa7e0 ---
> >> >>>propagate_priority() at propagate_priority+0x153/frame
> >> >>>0xfffffe004b3fa7e0
> >> >>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b3fa830
> >> >>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b3fa8c0
> >> >>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame
> >> >>>0xfffffe004b3fa920
> >> >>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b3fa980
> >> >>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b3fa9c0
> >> >>>VOP_FSYNC_APV() at VOP_FSYNC_APV+0x56/frame 0xfffffe004b3fa9e0
> >> >>>sched_sync() at sched_sync+0x379/frame 0xfffffe004b3faa70
> >> >>>fork_exit() at fork_exit+0x83/frame 0xfffffe004b3faab0
> >> >>>fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004b3faab0
> >> >>>--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >> >>>Uptime: 3m25s
> >> >>>Dumping 921 out of 7974
> >> >>>MB:..2%..11%..21%..32%..42%..51%..61%..72%..82%..91%
> >> >>>
> >> >>>__curthread () at /opt/src/svn-current/sys/amd64/include/pcpu.h:246
> >> >>>246             __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
> >> >>>(OFFSETOF_CURTHREAD));
> >> >>>(kgdb) #0  __curthread () at
> >> >>>/opt/src/svn-current/sys/amd64/include/pcpu.h:246
> >> >>>#1  doadump (textdump=1) at
> >> >>>/opt/src/svn-current/sys/kern/kern_shutdown.c:392
> >> >>>#2  0xffffffff8068aead in kern_reboot (howto=260)
> >> >>>    at /opt/src/svn-current/sys/kern/kern_shutdown.c:479
> >> >>>#3  0xffffffff8068b369 in vpanic (fmt=<optimized out>, ap=<optimized
> >> >>>out>)
> >> >>>    at /opt/src/svn-current/sys/kern/kern_shutdown.c:905
> >> >>>#4  0xffffffff8068b163 in panic (fmt=<unavailable>)
> >> >>>    at /opt/src/svn-current/sys/kern/kern_shutdown.c:832
> >> >>>#5  0xffffffff80a11c2c in trap_fatal (frame=0xfffffe004b3fa700,
> >> >>eva=12)
> >> >>>    at /opt/src/svn-current/sys/amd64/amd64/trap.c:943
> >> >>>#6  0xffffffff80a11c79 in trap_pfault (frame=0xfffffe004b3fa700,
> >> >>>usermode=0)
> >> >>>    at /opt/src/svn-current/sys/amd64/amd64/trap.c:767
> >> >>>#7  0xffffffff80a1126d in trap (frame=0xfffffe004b3fa700)
> >> >>>    at /opt/src/svn-current/sys/amd64/amd64/trap.c:443
> >> >>>#8  <signal handler called>
> >> >>>#9  propagate_priority (td=0xfffff8003e43b000)
> >> >>>    at /opt/src/svn-current/sys/kern/subr_turnstile.c:279
> >> >>>#10 0xffffffff806f13db in turnstile_wait (ts=0xfffff8000258a780,
> >> >>>    owner=<optimized out>, queue=-512)
> >> >>>    at /opt/src/svn-current/sys/kern/subr_turnstile.c:785
> >> >>>#11 0xffffffff80669271 in __mtx_lock_sleep (c=0xfffff80132e4ebf0,
> >> >>>   v=<optimized out>) at
> >> >>/opt/src/svn-current/sys/kern/kern_mutex.c:654
> >> >>>#12 0xffffffff80767b1b in mnt_vnode_next_active_relock (
> >> >>>  mvp=0xfffff80002449800, mp=0xfffff80132d1e000,
> >> >>vp=0xfffff80132e4eb40)
> >> >>>    at /opt/src/svn-current/sys/kern/vfs_subr.c:5590
> >> >>>#13 mnt_vnode_next_active (mvp=<optimized out>, mp=0xfffff80132d1e000)
> >> >>>    at /opt/src/svn-current/sys/kern/vfs_subr.c:5649
> >> >>>#14 0xffffffff80766206 in __mnt_vnode_next_active
> >> >>>(mvp=0xfffff80002449800,
> >> >>>   mp=<optimized out>) at
> >> >>/opt/src/svn-current/sys/kern/vfs_subr.c:5682
> >> >>>#15 vfs_msync (mp=0xfffff80132d1e000, flags=2)
> >> >>>    at /opt/src/svn-current/sys/kern/vfs_subr.c:4238
> >> >>>#16 0xffffffff8076b4a6 in sync_fsync (ap=<optimized out>)
> >> >>>    at /opt/src/svn-current/sys/kern/vfs_subr.c:4468
> >> >>>#17 0xffffffff80a7be46 in VOP_FSYNC_APV (
> >> >>>    vop=0xffffffff80cb4120 <sync_vnodeops>, a=0xfffffe004b3faa08)
> >> >>>    at vnode_if.c:1312
> >> >>>#18 0xffffffff80769319 in VOP_FSYNC (vp=<optimized out>, waitfor=3,
> >> >>>    td=0xfffff800061dc5a0) at ./vnode_if.h:549
> >> >>>#19 sync_vnode (bo=<optimized out>, td=0xfffff800061dc5a0,
> >> >>>  slp=<optimized out>) at
> >> >>/opt/src/svn-current/sys/kern/vfs_subr.c:2316
> >> >>>#20 sched_sync () at /opt/src/svn-current/sys/kern/vfs_subr.c:2418
> >> >>>#21 0xffffffff80647e33 in fork_exit (callout=0xffffffff80768fa0
> >> >>><sched_sync>,
> >> >>>    arg=0x0, frame=0xfffffe004b3faac0)
> >> >>>    at /opt/src/svn-current/sys/kern/kern_fork.c:1045
> >> >>>#22 <signal handler called>
> >> >>>
> >> >>>
> >> >>>dmesg output:
> >> >>>
> >> >>>Updating motd:.
> >> >>>Mounting late filesystems:.
> >> >>>Starting nscd.
> >> >>>Security policy loaded: MAC/ntpd (mac_ntpd)
> >> >>>Starting ntpd.
> >> >>>Starting yppasswdd.
> >> >>>kernel trap 12 with interrupts disabled
> >> >>>
> >> >>>
> >> >>>Fatal trap 12: page fault while in kernel mode
> >> >>>cpuid = 0; apic id = 00
> >> >>>fault virtual address   = 0xc
> >> >>>fault code              = supervisor read data, page not present
> >> >>>instruction pointer     = 0x20:0xffffffff806f0633
> >> >>>stack pointer           = 0x28:0xfffffe004b7fa7c0
> >> >>>frame pointer           = 0x28:0xfffffe004b7fa7e0
> >> >>>code segment            = base 0x0, limit 0xfffff, type 0x1b
> >> >>>                        = DPL 0, pres 1, long 1, def32 0, gran 1
> >> >>>processor eflags        = resume, IOPL = 0
> >> >>>current process         = 26 (syncer)
> >> >>>trap number             = 12
> >> >>>panic: page fault
> >> >>>cpuid = 0
> >> >>>time = 1566820700
> >> >>>KDB: stack backtrace:
> >> >>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> >> >>>0xfffffe004b7fa470
> >> >>>vpanic() at vpanic+0x19d/frame 0xfffffe004b7fa4c0
> >> >>>panic() at panic+0x43/frame 0xfffffe004b7fa520
> >> >>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b7fa580
> >> >>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b7fa5e0
> >> >>>trap() at trap+0x29d/frame 0xfffffe004b7fa6f0
> >> >>>calltrap() at calltrap+0x8/frame 0xfffffe004b7fa6f0
> >> >>>--- trap 0xc, rip = 0xffffffff806f0633, rsp = 0xfffffe004b7fa7c0, rbp
> >> >>=
> >> >>>0xfffffe004b7fa7e0 ---
> >> >>>propagate_priority() at propagate_priority+0x153/frame
> >> >>>0xfffffe004b7fa7e0
> >> >>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b7fa830
> >> >>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b7fa8c0
> >> >>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame
> >> >>>0xfffffe004b7fa920
> >> >>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b7fa980
> >> >>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b7fa9c0
> >> >>>--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >> >>>Uptime: 1m38s
> >> >>>Dumping 810 out of 7974
> >> >>>MB:..2%..12%..22%..32%..42%..52%..62%..72%..81%..91%---<<BOOT>>---
> >> >>
> >> >>This may be a simple case of increasing the kernel stack. I'll try this
> >> >>at noon.
> >> >
> >> > Increasing kernel stack pages from 8 to 20 still results in the panic.
> >> >
> >> >
> >> > --
> >> > Pardon the typos and autocorrect, small keyboard in use.
> >> > Cheers,
> >> > Cy Schubert <Cy.Schubert@cschubert.com>
> >> > FreeBSD UNIX: <cy@FreeBSD.org> Web: http://www.FreeBSD.org
> >> >
> >> > 	The need of the many outweighs the greed of the few.
> >> >
> >>
> >>
> >> --
> >> Mateusz Guzik <mjguzik gmail.com>
> >
> >
> >
>
>
> -- 
> Mateusz Guzik <mjguzik gmail.com>





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201908270341.x7R3fJne062385>