Date: Mon, 26 Aug 2019 20:41:19 -0700 From: Cy Schubert <Cy.Schubert@cschubert.com> To: Mateusz Guzik <mjguzik@gmail.com> Cc: Cy Schubert <Cy.Schubert@cschubert.com>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r351472 - head/sys/fs/nullfs Message-ID: <201908270341.x7R3fJne062385@slippy.cwsent.com> In-Reply-To: <CAGudoHHucKGxYKz9L46w-KAhbAKQTXZD6syBPer2aHwh37vx0Q@mail.gmail.com> References: <201908250513.x7P5DFFx017824@repo.freebsd.org> <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert.com> <201908261411.x7QEBMWi075354@slippy.cwsent.com> <A961554B-2B0B-4382-BA73-D69FC0129D0B@cschubert.com> <C3B4FE65-FC10-4248-AB00-0E30532C77EB@cschubert.com> <CAGudoHHGLguWZZMh%2BTOCBwANKKp7-6z5JvP1NgRo3jAr4vZ%2BPw@mail.gmail.com> <201908270025.x7R0PtNT008594@slippy.cwsent.com> <CAGudoHHucKGxYKz9L46w-KAhbAKQTXZD6syBPer2aHwh37vx0Q@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Making progress: Unread portion of the kernel message buffer: panic: null_unlock: LK_INTERLOCK passed cpuid = 2 time = 1566868081 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0054a954e0 vpanic() at vpanic+0x19d/frame 0xfffffe0054a95530 panic() at panic+0x43/frame 0xfffffe0054a95590 null_unlock() at null_unlock+0x8a/frame 0xfffffe0054a955d0 VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x56/frame 0xfffffe0054a955f0 unionfs_unlock() at unionfs_unlock+0x124/frame 0xfffffe0054a95650 VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x56/frame 0xfffffe0054a95670 vfs_domount() at vfs_domount+0xc3f/frame 0xfffffe0054a958a0 vfs_donmount() at vfs_donmount+0x934/frame 0xfffffe0054a95940 sys_nmount() at sys_nmount+0x69/frame 0xfffffe0054a95980 amd64_syscall() at amd64_syscall+0x3ae/frame 0xfffffe0054a95ab0 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0054a95ab0 --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x2bed6a4a, rsp = 0x7fffffffe2e8, rbp = 0x7fffffffec70 --- Uptime: 1m35s -- Cheers, Cy Schubert <Cy.Schubert@cschubert.com> FreeBSD UNIX: <cy@FreeBSD.org> Web: http://www.FreeBSD.org The need of the many outweighs the greed of the few. In message <CAGudoHHucKGxYKz9L46w-KAhbAKQTXZD6syBPer2aHwh37vx0Q@mail.gmail.c om> , Mateusz Guzik writes: > Sorry for the half-assed patch. I have a patch to remove the flags > argument and the change above was port of it. > > Try this please: > https://people.freebsd.org/~mjg/patches/vop_unlock-interlock2.diff > > On 8/27/19, Cy Schubert <cy.schubert@gmail.com> wrote: > > It won't build. Did you intend VOP_UNLOCK(vp,LK_INTERLOCK)? > > > > --- vfs_subr.o --- > > /opt/src/svn-current/sys/kern/vfs_subr.c:1037:17: error: too few arguments > > to function call, expected 2, have 1 > > VOP_UNLOCK(vp); > > ~~~~~~~~~~ ^ > > ./vnode_if.h:873:1: note: 'VOP_UNLOCK' declared here > > static __inline int VOP_UNLOCK( > > ^ > > /opt/src/svn-current/sys/kern/vfs_subr.c:1401:16: error: too few arguments > > to function call, expected 2, have 1 > > VOP_UNLOCK(vp); > > ~~~~~~~~~~ ^ > > ./vnode_if.h:873:1: note: 'VOP_UNLOCK' declared here > > static __inline int VOP_UNLOCK( > > ^ > > /opt/src/svn-current/sys/kern/vfs_subr.c:1413:15: error: too few arguments > > to function call, expected 2, have 1 > > VOP_UNLOCK(vp); > > ~~~~~~~~~~ ^ > > ./vnode_if.h:873:1: note: 'VOP_UNLOCK' declared here > > static __inline int VOP_UNLOCK( > > ^ > > 3 errors generated. > > *** [vfs_subr.o] Error code 1 > > > > make[2]: stopped in /export/obj/opt/src/svn-current/amd64.amd64/sys/BREAK > > > > > > -- > > Cheers, > > Cy Schubert <Cy.Schubert@cschubert.com> > > FreeBSD UNIX: <cy@FreeBSD.org> Web: http://www.FreeBSD.org > > > > The need of the many outweighs the greed of the few. > > > > > > In message > > <CAGudoHHGLguWZZMh+TOCBwANKKp7-6z5JvP1NgRo3jAr4vZ+Pw@mail.gmail.c > > om> > > , Mateusz Guzik writes: > >> Does this do the trick for you? > >> > >> https://people.freebsd.org/~mjg/vop_unlock-interlock.diff > >> > >> On 8/26/19, Cy Schubert <Cy.Schubert@cschubert.com> wrote: > >> > On August 26, 2019 8:40:50 AM PDT, Cy Schubert > >> > <Cy.Schubert@cschubert.com> > >> > wrote: > >> >>On August 26, 2019 7:11:22 AM PDT, Cy Schubert > >> >><Cy.Schubert@cschubert.com> wrote: > >> >>>In message <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert.com>, Cy > >> >>>Schubert wr > >> >>>ites: > >> >>>> On August 24, 2019 10:13:15 PM PDT, Mateusz Guzik <mjg@FreeBSD.org> > >> >>>wrote: > >> >>>> >Author: mjg > >> >>>> >Date: Sun Aug 25 05:13:15 2019 > >> >>>> >New Revision: 351472 > >> >>>> >URL: https://svnweb.freebsd.org/changeset/base/351472 > >> >>>> > > >> >>>> >Log: > >> >>>> > nullfs: reduce areas protected by vnode interlock > >> >>>> > > >> >>>> >Some places only take the interlock to hold the vnode, which was a > >> >>>> >requiremnt > >> >>>> >before they started being manipulated with atomics. Use the newly > >> >>>> >introduced > >> >>>> > vholdnz to bump the count. > >> >>>> > > >> >>>> > Reviewed by: kib > >> >>>> > Tested by: pho > >> >>>> > Sponsored by: The FreeBSD Foundation > >> >>>> > Differential Revision: https://reviews.freebsd.org/D21358 > >> >>>> > > >> >>>> >Modified: > >> >>>> > head/sys/fs/nullfs/null_vnops.c > >> >>>> > > >> >>>> >Modified: head/sys/fs/nullfs/null_vnops.c > >> >>>> > >> >>>>====================================================================== > === > >> === > >> >>>> == > >> >>>> >--- head/sys/fs/nullfs/null_vnops.c Sun Aug 25 05:11:43 > >> >>2019 (r35147 > >> >>>> 1) > >> >>>> >+++ head/sys/fs/nullfs/null_vnops.c Sun Aug 25 05:13:15 > >> >>2019 (r35147 > >> >>>> 2) > >> >>>> >@@ -668,7 +668,7 @@ null_lock(struct vop_lock1_args *ap) > >> >>>> > * We prevent it from being recycled by holding the vno > >> de > >> >>>> > * here. > >> >>>> > */ > >> >>>> >- vholdl(lvp); > >> >>>> >+ vholdnz(lvp); > >> >>>> > error = VOP_LOCK(lvp, flags); > >> >>>> > > >> >>>> > /* > >> >>>> >@@ -710,31 +710,16 @@ static int > >> >>>> > null_unlock(struct vop_unlock_args *ap) > >> >>>> > { > >> >>>> > struct vnode *vp = ap->a_vp; > >> >>>> >- int flags = ap->a_flags; > >> >>>> >- int mtxlkflag = 0; > >> >>>> > struct null_node *nn; > >> >>>> > struct vnode *lvp; > >> >>>> > int error; > >> >>>> > > >> >>>> >- if ((flags & LK_INTERLOCK) != 0) > >> >>>> >- mtxlkflag = 1; > >> >>>> >- else if (mtx_owned(VI_MTX(vp)) == 0) { > >> >>>> >- VI_LOCK(vp); > >> >>>> >- mtxlkflag = 2; > >> >>>> >- } > >> >>>> > nn = VTONULL(vp); > >> >>>> > if (nn != NULL && (lvp = NULLVPTOLOWERVP(vp)) != NULL) { > >> >>>> >- VI_LOCK_FLAGS(lvp, MTX_DUPOK); > >> >>>> >- flags |= LK_INTERLOCK; > >> >>>> >- vholdl(lvp); > >> >>>> >- VI_UNLOCK(vp); > >> >>>> >- error = VOP_UNLOCK(lvp, flags); > >> >>>> >+ vholdnz(lvp); > >> >>>> >+ error = VOP_UNLOCK(lvp, 0); > >> >>>> > vdrop(lvp); > >> >>>> >- if (mtxlkflag == 0) > >> >>>> >- VI_LOCK(vp); > >> >>>> > } else { > >> >>>> >- if (mtxlkflag == 2) > >> >>>> >- VI_UNLOCK(vp); > >> >>>> > error = vop_stdunlock(ap); > >> >>>> > } > >> >>>> > > >> >>>> >@@ -845,10 +830,8 @@ null_getwritemount(struct > >> >>>vop_getwritemount_args > >> >>>> >*ap) > >> >>>> > VI_LOCK(vp); > >> >>>> > xp = VTONULL(vp); > >> >>>> > if (xp && (lowervp = xp->null_lowervp)) { > >> >>>> >- VI_LOCK_FLAGS(lowervp, MTX_DUPOK); > >> >>>> >+ vholdnz(lowervp); > >> >>>> > VI_UNLOCK(vp); > >> >>>> >- vholdl(lowervp); > >> >>>> >- VI_UNLOCK(lowervp); > >> >>>> > VOP_GETWRITEMOUNT(lowervp, ap->a_mpp); > >> >>>> > vdrop(lowervp); > >> >>>> > } else { > >> >>>> > >> >>>> Hi mjg@, > >> >>>> > >> >>>> This causes trap 12 a few seconds after mountlate during boot. > >> >>>Reverting this > >> >>>> commit allowed it to boot. > >> >>>> > >> >>>> Sorry for no backtrace. I managed to > >> >>>> revert and test just prior to rushing out to catch the bus, with no > >> >>>time to > >> >>>> craft a proper email. I'll post the backtrace when I arrive at work. > >> >>> > >> >>>Hi mjg@, > >> >>> > >> >>>The gory details below. Don't be fooled by the panic when yppasswd is > >> >>>started. I was able to reproduce this panic in single user by simply > >> >>>mounting and using a nullfs. > >> >>> > >> >>>kernel trap 12 with interrupts disabled > >> >>> > >> >>> > >> >>>Fatal trap 12: page fault while in kernel mode > >> >>>cpuid = 0; apic id = 00 > >> >>>fault virtual address = 0xc > >> >>>fault code = supervisor read data, page not present > >> >>>instruction pointer = 0x20:0xffffffff806f0633 > >> >>>stack pointer = 0x28:0xfffffe004b3fa7c0 > >> >>>frame pointer = 0x28:0xfffffe004b3fa7e0 > >> >>>code segment = base 0x0, limit 0xfffff, type 0x1b > >> >>> = DPL 0, pres 1, long 1, def32 0, gran 1 > >> >>>processor eflags = resume, IOPL = 0 > >> >>>current process = 26 (syncer) > >> >>>trap number = 12 > >> >>>panic: page fault > >> >>>cpuid = 0 > >> >>>time = 1566821203 > >> >>>KDB: stack backtrace: > >> >>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > >> >>>0xfffffe004b3fa470 > >> >>>vpanic() at vpanic+0x19d/frame 0xfffffe004b3fa4c0 > >> >>>panic() at panic+0x43/frame 0xfffffe004b3fa520 > >> >>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b3fa580 > >> >>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b3fa5e0 > >> >>>trap() at trap+0x29d/frame 0xfffffe004b3fa6f0 > >> >>>calltrap() at calltrap+0x8/frame 0xfffffe004b3fa6f0 > >> >>>--- trap 0xc, rip = 0xffffffff806f0633, rsp = 0xfffffe004b3fa7c0, rbp > >> >>= > >> >>>0xfffffe004b3fa7e0 --- > >> >>>propagate_priority() at propagate_priority+0x153/frame > >> >>>0xfffffe004b3fa7e0 > >> >>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b3fa830 > >> >>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b3fa8c0 > >> >>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame > >> >>>0xfffffe004b3fa920 > >> >>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b3fa980 > >> >>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b3fa9c0 > >> >>>VOP_FSYNC_APV() at VOP_FSYNC_APV+0x56/frame 0xfffffe004b3fa9e0 > >> >>>sched_sync() at sched_sync+0x379/frame 0xfffffe004b3faa70 > >> >>>fork_exit() at fork_exit+0x83/frame 0xfffffe004b3faab0 > >> >>>fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004b3faab0 > >> >>>--- trap 0, rip = 0, rsp = 0, rbp = 0 --- > >> >>>Uptime: 3m25s > >> >>>Dumping 921 out of 7974 > >> >>>MB:..2%..11%..21%..32%..42%..51%..61%..72%..82%..91% > >> >>> > >> >>>__curthread () at /opt/src/svn-current/sys/amd64/include/pcpu.h:246 > >> >>>246 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" > >> >>>(OFFSETOF_CURTHREAD)); > >> >>>(kgdb) #0 __curthread () at > >> >>>/opt/src/svn-current/sys/amd64/include/pcpu.h:246 > >> >>>#1 doadump (textdump=1) at > >> >>>/opt/src/svn-current/sys/kern/kern_shutdown.c:392 > >> >>>#2 0xffffffff8068aead in kern_reboot (howto=260) > >> >>> at /opt/src/svn-current/sys/kern/kern_shutdown.c:479 > >> >>>#3 0xffffffff8068b369 in vpanic (fmt=<optimized out>, ap=<optimized > >> >>>out>) > >> >>> at /opt/src/svn-current/sys/kern/kern_shutdown.c:905 > >> >>>#4 0xffffffff8068b163 in panic (fmt=<unavailable>) > >> >>> at /opt/src/svn-current/sys/kern/kern_shutdown.c:832 > >> >>>#5 0xffffffff80a11c2c in trap_fatal (frame=0xfffffe004b3fa700, > >> >>eva=12) > >> >>> at /opt/src/svn-current/sys/amd64/amd64/trap.c:943 > >> >>>#6 0xffffffff80a11c79 in trap_pfault (frame=0xfffffe004b3fa700, > >> >>>usermode=0) > >> >>> at /opt/src/svn-current/sys/amd64/amd64/trap.c:767 > >> >>>#7 0xffffffff80a1126d in trap (frame=0xfffffe004b3fa700) > >> >>> at /opt/src/svn-current/sys/amd64/amd64/trap.c:443 > >> >>>#8 <signal handler called> > >> >>>#9 propagate_priority (td=0xfffff8003e43b000) > >> >>> at /opt/src/svn-current/sys/kern/subr_turnstile.c:279 > >> >>>#10 0xffffffff806f13db in turnstile_wait (ts=0xfffff8000258a780, > >> >>> owner=<optimized out>, queue=-512) > >> >>> at /opt/src/svn-current/sys/kern/subr_turnstile.c:785 > >> >>>#11 0xffffffff80669271 in __mtx_lock_sleep (c=0xfffff80132e4ebf0, > >> >>> v=<optimized out>) at > >> >>/opt/src/svn-current/sys/kern/kern_mutex.c:654 > >> >>>#12 0xffffffff80767b1b in mnt_vnode_next_active_relock ( > >> >>> mvp=0xfffff80002449800, mp=0xfffff80132d1e000, > >> >>vp=0xfffff80132e4eb40) > >> >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:5590 > >> >>>#13 mnt_vnode_next_active (mvp=<optimized out>, mp=0xfffff80132d1e000) > >> >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:5649 > >> >>>#14 0xffffffff80766206 in __mnt_vnode_next_active > >> >>>(mvp=0xfffff80002449800, > >> >>> mp=<optimized out>) at > >> >>/opt/src/svn-current/sys/kern/vfs_subr.c:5682 > >> >>>#15 vfs_msync (mp=0xfffff80132d1e000, flags=2) > >> >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:4238 > >> >>>#16 0xffffffff8076b4a6 in sync_fsync (ap=<optimized out>) > >> >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:4468 > >> >>>#17 0xffffffff80a7be46 in VOP_FSYNC_APV ( > >> >>> vop=0xffffffff80cb4120 <sync_vnodeops>, a=0xfffffe004b3faa08) > >> >>> at vnode_if.c:1312 > >> >>>#18 0xffffffff80769319 in VOP_FSYNC (vp=<optimized out>, waitfor=3, > >> >>> td=0xfffff800061dc5a0) at ./vnode_if.h:549 > >> >>>#19 sync_vnode (bo=<optimized out>, td=0xfffff800061dc5a0, > >> >>> slp=<optimized out>) at > >> >>/opt/src/svn-current/sys/kern/vfs_subr.c:2316 > >> >>>#20 sched_sync () at /opt/src/svn-current/sys/kern/vfs_subr.c:2418 > >> >>>#21 0xffffffff80647e33 in fork_exit (callout=0xffffffff80768fa0 > >> >>><sched_sync>, > >> >>> arg=0x0, frame=0xfffffe004b3faac0) > >> >>> at /opt/src/svn-current/sys/kern/kern_fork.c:1045 > >> >>>#22 <signal handler called> > >> >>> > >> >>> > >> >>>dmesg output: > >> >>> > >> >>>Updating motd:. > >> >>>Mounting late filesystems:. > >> >>>Starting nscd. > >> >>>Security policy loaded: MAC/ntpd (mac_ntpd) > >> >>>Starting ntpd. > >> >>>Starting yppasswdd. > >> >>>kernel trap 12 with interrupts disabled > >> >>> > >> >>> > >> >>>Fatal trap 12: page fault while in kernel mode > >> >>>cpuid = 0; apic id = 00 > >> >>>fault virtual address = 0xc > >> >>>fault code = supervisor read data, page not present > >> >>>instruction pointer = 0x20:0xffffffff806f0633 > >> >>>stack pointer = 0x28:0xfffffe004b7fa7c0 > >> >>>frame pointer = 0x28:0xfffffe004b7fa7e0 > >> >>>code segment = base 0x0, limit 0xfffff, type 0x1b > >> >>> = DPL 0, pres 1, long 1, def32 0, gran 1 > >> >>>processor eflags = resume, IOPL = 0 > >> >>>current process = 26 (syncer) > >> >>>trap number = 12 > >> >>>panic: page fault > >> >>>cpuid = 0 > >> >>>time = 1566820700 > >> >>>KDB: stack backtrace: > >> >>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > >> >>>0xfffffe004b7fa470 > >> >>>vpanic() at vpanic+0x19d/frame 0xfffffe004b7fa4c0 > >> >>>panic() at panic+0x43/frame 0xfffffe004b7fa520 > >> >>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b7fa580 > >> >>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b7fa5e0 > >> >>>trap() at trap+0x29d/frame 0xfffffe004b7fa6f0 > >> >>>calltrap() at calltrap+0x8/frame 0xfffffe004b7fa6f0 > >> >>>--- trap 0xc, rip = 0xffffffff806f0633, rsp = 0xfffffe004b7fa7c0, rbp > >> >>= > >> >>>0xfffffe004b7fa7e0 --- > >> >>>propagate_priority() at propagate_priority+0x153/frame > >> >>>0xfffffe004b7fa7e0 > >> >>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b7fa830 > >> >>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b7fa8c0 > >> >>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame > >> >>>0xfffffe004b7fa920 > >> >>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b7fa980 > >> >>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b7fa9c0 > >> >>>--- trap 0, rip = 0, rsp = 0, rbp = 0 --- > >> >>>Uptime: 1m38s > >> >>>Dumping 810 out of 7974 > >> >>>MB:..2%..12%..22%..32%..42%..52%..62%..72%..81%..91%---<<BOOT>>--- > >> >> > >> >>This may be a simple case of increasing the kernel stack. I'll try this > >> >>at noon. > >> > > >> > Increasing kernel stack pages from 8 to 20 still results in the panic. > >> > > >> > > >> > -- > >> > Pardon the typos and autocorrect, small keyboard in use. > >> > Cheers, > >> > Cy Schubert <Cy.Schubert@cschubert.com> > >> > FreeBSD UNIX: <cy@FreeBSD.org> Web: http://www.FreeBSD.org > >> > > >> > The need of the many outweighs the greed of the few. > >> > > >> > >> > >> -- > >> Mateusz Guzik <mjguzik gmail.com> > > > > > > > > > -- > Mateusz Guzik <mjguzik gmail.com>
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201908270341.x7R3fJne062385>