Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 26 Aug 2019 11:56:43 -0700
From:      Cy Schubert <Cy.Schubert@cschubert.com>
To:        Mateusz Guzik <mjg@FreeBSD.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r351472 - head/sys/fs/nullfs
Message-ID:  <C3B4FE65-FC10-4248-AB00-0E30532C77EB@cschubert.com>
In-Reply-To: <A961554B-2B0B-4382-BA73-D69FC0129D0B@cschubert.com>
References:  <201908250513.x7P5DFFx017824@repo.freebsd.org> <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert.com> <201908261411.x7QEBMWi075354@slippy.cwsent.com> <A961554B-2B0B-4382-BA73-D69FC0129D0B@cschubert.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On August 26, 2019 8:40:50 AM PDT, Cy Schubert <Cy=2ESchubert@cschubert=2Ec=
om> wrote:
>On August 26, 2019 7:11:22 AM PDT, Cy Schubert
><Cy=2ESchubert@cschubert=2Ecom> wrote:
>>In message <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert=2Ecom>, Cy
>>Schubert wr
>>ites:
>>> On August 24, 2019 10:13:15 PM PDT, Mateusz Guzik <mjg@FreeBSD=2Eorg>
>>wrote:
>>> >Author: mjg
>>> >Date: Sun Aug 25 05:13:15 2019
>>> >New Revision: 351472
>>> >URL: https://svnweb=2Efreebsd=2Eorg/changeset/base/351472
>>> >
>>> >Log:
>>> >  nullfs: reduce areas protected by vnode interlock
>>> > =20
>>> >Some places only take the interlock to hold the vnode, which was a
>>> >requiremnt
>>> >before they started being manipulated with atomics=2E Use the newly
>>> >introduced
>>> >  vholdnz to bump the count=2E
>>> > =20
>>> >  Reviewed by:	kib
>>> >  Tested by:	pho
>>> >  Sponsored by:	The FreeBSD Foundation
>>> >  Differential Revision:	https://reviews=2Efreebsd=2Eorg/D21358
>>> >
>>> >Modified:
>>> >  head/sys/fs/nullfs/null_vnops=2Ec
>>> >
>>> >Modified: head/sys/fs/nullfs/null_vnops=2Ec
>>>
>>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D
>>> =3D=3D
>>> >--- head/sys/fs/nullfs/null_vnops=2Ec	Sun Aug 25 05:11:43
>2019	(r35147
>>> 1)
>>> >+++ head/sys/fs/nullfs/null_vnops=2Ec	Sun Aug 25 05:13:15
>2019	(r35147
>>> 2)
>>> >@@ -668,7 +668,7 @@ null_lock(struct vop_lock1_args *ap)
>>> > 		 * We prevent it from being recycled by holding the vnode
>>> > 		 * here=2E
>>> > 		 */
>>> >-		vholdl(lvp);
>>> >+		vholdnz(lvp);
>>> > 		error =3D VOP_LOCK(lvp, flags);
>>> >=20
>>> > 		/*
>>> >@@ -710,31 +710,16 @@ static int
>>> > null_unlock(struct vop_unlock_args *ap)
>>> > {
>>> > 	struct vnode *vp =3D ap->a_vp;
>>> >-	int flags =3D ap->a_flags;
>>> >-	int mtxlkflag =3D 0;
>>> > 	struct null_node *nn;
>>> > 	struct vnode *lvp;
>>> > 	int error;
>>> >=20
>>> >-	if ((flags & LK_INTERLOCK) !=3D 0)
>>> >-		mtxlkflag =3D 1;
>>> >-	else if (mtx_owned(VI_MTX(vp)) =3D=3D 0) {
>>> >-		VI_LOCK(vp);
>>> >-		mtxlkflag =3D 2;
>>> >-	}
>>> > 	nn =3D VTONULL(vp);
>>> > 	if (nn !=3D NULL && (lvp =3D NULLVPTOLOWERVP(vp)) !=3D NULL) {
>>> >-		VI_LOCK_FLAGS(lvp, MTX_DUPOK);
>>> >-		flags |=3D LK_INTERLOCK;
>>> >-		vholdl(lvp);
>>> >-		VI_UNLOCK(vp);
>>> >-		error =3D VOP_UNLOCK(lvp, flags);
>>> >+		vholdnz(lvp);
>>> >+		error =3D VOP_UNLOCK(lvp, 0);
>>> > 		vdrop(lvp);
>>> >-		if (mtxlkflag =3D=3D 0)
>>> >-			VI_LOCK(vp);
>>> > 	} else {
>>> >-		if (mtxlkflag =3D=3D 2)
>>> >-			VI_UNLOCK(vp);
>>> > 		error =3D vop_stdunlock(ap);
>>> > 	}
>>> >=20
>>> >@@ -845,10 +830,8 @@ null_getwritemount(struct
>>vop_getwritemount_args
>>> >*ap)
>>> > 	VI_LOCK(vp);
>>> > 	xp =3D VTONULL(vp);
>>> > 	if (xp && (lowervp =3D xp->null_lowervp)) {
>>> >-		VI_LOCK_FLAGS(lowervp, MTX_DUPOK);
>>> >+		vholdnz(lowervp);
>>> > 		VI_UNLOCK(vp);
>>> >-		vholdl(lowervp);
>>> >-		VI_UNLOCK(lowervp);
>>> > 		VOP_GETWRITEMOUNT(lowervp, ap->a_mpp);
>>> > 		vdrop(lowervp);
>>> > 	} else {
>>>
>>> Hi mjg@,
>>>
>>> This causes trap 12 a few seconds after mountlate during boot=2E
>>Reverting this
>>>  commit allowed it to boot=2E
>>>
>>> Sorry for no backtrace=2E I managed to=20
>>>  revert and test just prior to rushing out to catch the bus, with no
>>time to=20
>>> craft a proper email=2E I'll post the backtrace when I arrive at work=
=2E
>>
>>Hi mjg@,
>>
>>The gory details below=2E Don't be fooled by the panic when yppasswd is
>>started=2E I was able to reproduce this panic in single user by simply
>>mounting and using a nullfs=2E
>>
>>kernel trap 12 with interrupts disabled
>>
>>
>>Fatal trap 12: page fault while in kernel mode
>>cpuid =3D 0; apic id =3D 00
>>fault virtual address   =3D 0xc
>>fault code              =3D supervisor read data, page not present
>>instruction pointer     =3D 0x20:0xffffffff806f0633
>>stack pointer           =3D 0x28:0xfffffe004b3fa7c0
>>frame pointer           =3D 0x28:0xfffffe004b3fa7e0
>>code segment            =3D base 0x0, limit 0xfffff, type 0x1b
>>                        =3D DPL 0, pres 1, long 1, def32 0, gran 1
>>processor eflags        =3D resume, IOPL =3D 0
>>current process         =3D 26 (syncer)
>>trap number             =3D 12
>>panic: page fault
>>cpuid =3D 0
>>time =3D 1566821203
>>KDB: stack backtrace:
>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>>0xfffffe004b3fa470
>>vpanic() at vpanic+0x19d/frame 0xfffffe004b3fa4c0
>>panic() at panic+0x43/frame 0xfffffe004b3fa520
>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b3fa580
>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b3fa5e0
>>trap() at trap+0x29d/frame 0xfffffe004b3fa6f0
>>calltrap() at calltrap+0x8/frame 0xfffffe004b3fa6f0
>>--- trap 0xc, rip =3D 0xffffffff806f0633, rsp =3D 0xfffffe004b3fa7c0, rb=
p
>=3D
>>0xfffffe004b3fa7e0 ---
>>propagate_priority() at propagate_priority+0x153/frame
>>0xfffffe004b3fa7e0
>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b3fa830
>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b3fa8c0
>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame
>>0xfffffe004b3fa920
>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b3fa980
>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b3fa9c0
>>VOP_FSYNC_APV() at VOP_FSYNC_APV+0x56/frame 0xfffffe004b3fa9e0
>>sched_sync() at sched_sync+0x379/frame 0xfffffe004b3faa70
>>fork_exit() at fork_exit+0x83/frame 0xfffffe004b3faab0
>>fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004b3faab0
>>--- trap 0, rip =3D 0, rsp =3D 0, rbp =3D 0 ---
>>Uptime: 3m25s
>>Dumping 921 out of 7974
>>MB:=2E=2E2%=2E=2E11%=2E=2E21%=2E=2E32%=2E=2E42%=2E=2E51%=2E=2E61%=2E=2E7=
2%=2E=2E82%=2E=2E91%
>>
>>__curthread () at /opt/src/svn-current/sys/amd64/include/pcpu=2Eh:246
>>246             __asm("movq %%gs:%P1,%0" : "=3Dr" (td) : "n"
>>(OFFSETOF_CURTHREAD));
>>(kgdb) #0  __curthread () at
>>/opt/src/svn-current/sys/amd64/include/pcpu=2Eh:246
>>#1  doadump (textdump=3D1) at
>>/opt/src/svn-current/sys/kern/kern_shutdown=2Ec:392
>>#2  0xffffffff8068aead in kern_reboot (howto=3D260)
>>    at /opt/src/svn-current/sys/kern/kern_shutdown=2Ec:479
>>#3  0xffffffff8068b369 in vpanic (fmt=3D<optimized out>, ap=3D<optimized
>>out>)
>>    at /opt/src/svn-current/sys/kern/kern_shutdown=2Ec:905
>>#4  0xffffffff8068b163 in panic (fmt=3D<unavailable>)
>>    at /opt/src/svn-current/sys/kern/kern_shutdown=2Ec:832
>>#5  0xffffffff80a11c2c in trap_fatal (frame=3D0xfffffe004b3fa700,
>eva=3D12)
>>    at /opt/src/svn-current/sys/amd64/amd64/trap=2Ec:943
>>#6  0xffffffff80a11c79 in trap_pfault (frame=3D0xfffffe004b3fa700,
>>usermode=3D0)
>>    at /opt/src/svn-current/sys/amd64/amd64/trap=2Ec:767
>>#7  0xffffffff80a1126d in trap (frame=3D0xfffffe004b3fa700)
>>    at /opt/src/svn-current/sys/amd64/amd64/trap=2Ec:443
>>#8  <signal handler called>
>>#9  propagate_priority (td=3D0xfffff8003e43b000)
>>    at /opt/src/svn-current/sys/kern/subr_turnstile=2Ec:279
>>#10 0xffffffff806f13db in turnstile_wait (ts=3D0xfffff8000258a780,=20
>>    owner=3D<optimized out>, queue=3D-512)
>>    at /opt/src/svn-current/sys/kern/subr_turnstile=2Ec:785
>>#11 0xffffffff80669271 in __mtx_lock_sleep (c=3D0xfffff80132e4ebf0,=20
>>   v=3D<optimized out>) at
>/opt/src/svn-current/sys/kern/kern_mutex=2Ec:654
>>#12 0xffffffff80767b1b in mnt_vnode_next_active_relock (
>>  mvp=3D0xfffff80002449800, mp=3D0xfffff80132d1e000,
>vp=3D0xfffff80132e4eb40)
>>    at /opt/src/svn-current/sys/kern/vfs_subr=2Ec:5590
>>#13 mnt_vnode_next_active (mvp=3D<optimized out>, mp=3D0xfffff80132d1e00=
0)
>>    at /opt/src/svn-current/sys/kern/vfs_subr=2Ec:5649
>>#14 0xffffffff80766206 in __mnt_vnode_next_active
>>(mvp=3D0xfffff80002449800,=20
>>   mp=3D<optimized out>) at
>/opt/src/svn-current/sys/kern/vfs_subr=2Ec:5682
>>#15 vfs_msync (mp=3D0xfffff80132d1e000, flags=3D2)
>>    at /opt/src/svn-current/sys/kern/vfs_subr=2Ec:4238
>>#16 0xffffffff8076b4a6 in sync_fsync (ap=3D<optimized out>)
>>    at /opt/src/svn-current/sys/kern/vfs_subr=2Ec:4468
>>#17 0xffffffff80a7be46 in VOP_FSYNC_APV (
>>    vop=3D0xffffffff80cb4120 <sync_vnodeops>, a=3D0xfffffe004b3faa08)
>>    at vnode_if=2Ec:1312
>>#18 0xffffffff80769319 in VOP_FSYNC (vp=3D<optimized out>, waitfor=3D3,=
=20
>>    td=3D0xfffff800061dc5a0) at =2E/vnode_if=2Eh:549
>>#19 sync_vnode (bo=3D<optimized out>, td=3D0xfffff800061dc5a0,=20
>>  slp=3D<optimized out>) at
>/opt/src/svn-current/sys/kern/vfs_subr=2Ec:2316
>>#20 sched_sync () at /opt/src/svn-current/sys/kern/vfs_subr=2Ec:2418
>>#21 0xffffffff80647e33 in fork_exit (callout=3D0xffffffff80768fa0
>><sched_sync>,=20
>>    arg=3D0x0, frame=3D0xfffffe004b3faac0)
>>    at /opt/src/svn-current/sys/kern/kern_fork=2Ec:1045
>>#22 <signal handler called>
>>
>>
>>dmesg output:
>>
>>Updating motd:=2E
>>Mounting late filesystems:=2E
>>Starting nscd=2E
>>Security policy loaded: MAC/ntpd (mac_ntpd)
>>Starting ntpd=2E
>>Starting yppasswdd=2E
>>kernel trap 12 with interrupts disabled
>>
>>
>>Fatal trap 12: page fault while in kernel mode
>>cpuid =3D 0; apic id =3D 00
>>fault virtual address   =3D 0xc
>>fault code              =3D supervisor read data, page not present
>>instruction pointer     =3D 0x20:0xffffffff806f0633
>>stack pointer           =3D 0x28:0xfffffe004b7fa7c0
>>frame pointer           =3D 0x28:0xfffffe004b7fa7e0
>>code segment            =3D base 0x0, limit 0xfffff, type 0x1b
>>                        =3D DPL 0, pres 1, long 1, def32 0, gran 1
>>processor eflags        =3D resume, IOPL =3D 0
>>current process         =3D 26 (syncer)
>>trap number             =3D 12
>>panic: page fault
>>cpuid =3D 0
>>time =3D 1566820700
>>KDB: stack backtrace:
>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>>0xfffffe004b7fa470
>>vpanic() at vpanic+0x19d/frame 0xfffffe004b7fa4c0
>>panic() at panic+0x43/frame 0xfffffe004b7fa520
>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b7fa580
>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b7fa5e0
>>trap() at trap+0x29d/frame 0xfffffe004b7fa6f0
>>calltrap() at calltrap+0x8/frame 0xfffffe004b7fa6f0
>>--- trap 0xc, rip =3D 0xffffffff806f0633, rsp =3D 0xfffffe004b7fa7c0, rb=
p
>=3D
>>0xfffffe004b7fa7e0 ---
>>propagate_priority() at propagate_priority+0x153/frame
>>0xfffffe004b7fa7e0
>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b7fa830
>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b7fa8c0
>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame
>>0xfffffe004b7fa920
>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b7fa980
>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b7fa9c0
>>--- trap 0, rip =3D 0, rsp =3D 0, rbp =3D 0 ---
>>Uptime: 1m38s
>>Dumping 810 out of 7974
>>MB:=2E=2E2%=2E=2E12%=2E=2E22%=2E=2E32%=2E=2E42%=2E=2E52%=2E=2E62%=2E=2E7=
2%=2E=2E81%=2E=2E91%---<<BOOT>>---
>
>This may be a simple case of increasing the kernel stack=2E I'll try this
>at noon=2E

Increasing kernel stack pages from 8 to 20 still results in the panic=2E


--=20
Pardon the typos and autocorrect, small keyboard in use=2E
Cheers,
Cy Schubert <Cy=2ESchubert@cschubert=2Ecom>
FreeBSD UNIX: <cy@FreeBSD=2Eorg> Web: http://www=2EFreeBSD=2Eorg

	The need of the many outweighs the greed of the few=2E



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?C3B4FE65-FC10-4248-AB00-0E30532C77EB>