From owner-svn-src-all@freebsd.org Mon Aug 26 20:17:49 2019 Return-Path: Delivered-To: svn-src-all@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id A9D2FE41F8; Mon, 26 Aug 2019 20:17:49 +0000 (UTC) (envelope-from cy.schubert@cschubert.com) Received: from smtp-out-so.shaw.ca (smtp-out-so.shaw.ca [64.59.136.139]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "Client", Issuer "CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 46HNbc3VkWz3NKj; Mon, 26 Aug 2019 20:17:47 +0000 (UTC) (envelope-from cy.schubert@cschubert.com) Received: from spqr.komquats.com ([70.67.125.17]) by shaw.ca with ESMTPA id 2LQuiqZ9jSrVc2LQvin03k; Mon, 26 Aug 2019 14:17:46 -0600 X-Authority-Analysis: v=2.3 cv=L5ZjvNb8 c=1 sm=1 tr=0 a=VFtTW3WuZNDh6VkGe7fA3g==:117 a=VFtTW3WuZNDh6VkGe7fA3g==:17 a=kj9zAlcOel0A:10 a=FmdZ9Uzk2mMA:10 a=YxBL1-UpAAAA:8 a=6I5d2MoRAAAA:8 a=pGLkceISAAAA:8 a=BrtT2lZyVd4BvNs1i8sA:9 a=082q3YohYhMu5kk3:21 a=dPDYV4cP5bbATGA8:21 a=8HBsMUk44bn0z0wV:21 a=CjuIK1q_8ugA:10 a=Ia-lj3WSrqcvXOmTRaiG:22 a=IjZwj45LgO3ly-622nXo:22 Received: from slippy.cwsent.com (slippy8 [10.2.2.6]) by spqr.komquats.com (Postfix) with ESMTPS id EB01A40E; Mon, 26 Aug 2019 13:17:43 -0700 (PDT) Received: from slippy.cwsent.com (localhost [127.0.0.1]) by slippy.cwsent.com (8.15.2/8.15.2) with ESMTP id x7QKHhCV050140; Mon, 26 Aug 2019 13:17:43 -0700 (PDT) (envelope-from Cy.Schubert@cschubert.com) Received: from slippy (cy@localhost) by slippy.cwsent.com (8.15.2/8.15.2/Submit) with ESMTP id x7QKHhOu050137; Mon, 26 Aug 2019 13:17:43 -0700 (PDT) (envelope-from Cy.Schubert@cschubert.com) Message-Id: <201908262017.x7QKHhOu050137@slippy.cwsent.com> X-Authentication-Warning: slippy.cwsent.com: cy owned process doing -bs X-Mailer: exmh version 2.9.0 11/07/2018 with nmh-1.7.1 Reply-to: Cy Schubert From: Cy Schubert X-os: FreeBSD X-Sender: cy@cwsent.com X-URL: http://www.cschubert.com/ To: Mateusz Guzik cc: Cy Schubert , src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r351472 - head/sys/fs/nullfs In-reply-to: References: <201908250513.x7P5DFFx017824@repo.freebsd.org> <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert.com> <201908261411.x7QEBMWi075354@slippy.cwsent.com> Comments: In-reply-to Mateusz Guzik message dated "Mon, 26 Aug 2019 22:15:01 +0200." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 26 Aug 2019 13:17:43 -0700 X-CMAE-Envelope: MS4wfFsWrUUBPo/AqYvh+ANBXj7kpoHJa8hXII97Aw/WOIWDqCNK/cDsU7kQmyUhPKEum5zUFfPrZwWjNcMK7efud8tWp7yHCni6e8JF+okbjhVrse/QnuqI gtwPSQnfhCvLxlDKTVWd1+S6d/M+EgNDnEP+xzAYzPrxuMwtQLSrAxUfm74JvpPSisnmcFcJU8I5Ego5LiUgtH/u7IaEEj2XQud9Va9ZLNChwLPHVt1CvGr+ xMAd+KwSUuB75fPEE4UI7ppNDf+MeGrxldCnUSmKbf4= X-Rspamd-Queue-Id: 46HNbc3VkWz3NKj X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=none; spf=none (mx1.freebsd.org: domain of cy.schubert@cschubert.com has no SPF policy when checking 64.59.136.139) smtp.mailfrom=cy.schubert@cschubert.com X-Spamd-Result: default: False [-4.85 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCVD_COUNT_FIVE(0.00)[5]; HAS_REPLYTO(0.00)[Cy.Schubert@cschubert.com]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MV_CASE(0.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; HAS_XAW(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCPT_COUNT_FIVE(0.00)[5]; REPLYTO_EQ_FROM(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_HAM_SHORT(-0.97)[-0.969,0]; RCVD_IN_DNSWL_NONE(0.00)[139.136.59.64.list.dnswl.org : 127.0.5.0]; R_SPF_NA(0.00)[]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:6327, ipnet:64.59.128.0/20, country:CA]; RCVD_TLS_LAST(0.00)[]; IP_SCORE(-2.28)[ip: (-5.71), ipnet: 64.59.128.0/20(-3.14), asn: 6327(-2.45), country: CA(-0.09)]; RECEIVED_SPAMHAUS_PBL(0.00)[17.125.67.70.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.11] X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2019 20:17:49 -0000 Thanks. My lunch hour is over. I'll try this after I get home from $JOB. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org The need of the many outweighs the greed of the few. In message , Mateusz Guzik writes: > Does this do the trick for you? > > https://people.freebsd.org/~mjg/vop_unlock-interlock.diff > > On 8/26/19, Cy Schubert wrote: > > On August 26, 2019 8:40:50 AM PDT, Cy Schubert > > wrote: > >>On August 26, 2019 7:11:22 AM PDT, Cy Schubert > >> wrote: > >>>In message <6C6BF3A4-DC07-4D9A-9A5F-FA075BF535A8@cschubert.com>, Cy > >>>Schubert wr > >>>ites: > >>>> On August 24, 2019 10:13:15 PM PDT, Mateusz Guzik > >>>wrote: > >>>> >Author: mjg > >>>> >Date: Sun Aug 25 05:13:15 2019 > >>>> >New Revision: 351472 > >>>> >URL: https://svnweb.freebsd.org/changeset/base/351472 > >>>> > > >>>> >Log: > >>>> > nullfs: reduce areas protected by vnode interlock > >>>> > > >>>> >Some places only take the interlock to hold the vnode, which was a > >>>> >requiremnt > >>>> >before they started being manipulated with atomics. Use the newly > >>>> >introduced > >>>> > vholdnz to bump the count. > >>>> > > >>>> > Reviewed by: kib > >>>> > Tested by: pho > >>>> > Sponsored by: The FreeBSD Foundation > >>>> > Differential Revision: https://reviews.freebsd.org/D21358 > >>>> > > >>>> >Modified: > >>>> > head/sys/fs/nullfs/null_vnops.c > >>>> > > >>>> >Modified: head/sys/fs/nullfs/null_vnops.c > >>>> > >>>>========================================================================= > === > >>>> == > >>>> >--- head/sys/fs/nullfs/null_vnops.c Sun Aug 25 05:11:43 > >>2019 (r35147 > >>>> 1) > >>>> >+++ head/sys/fs/nullfs/null_vnops.c Sun Aug 25 05:13:15 > >>2019 (r35147 > >>>> 2) > >>>> >@@ -668,7 +668,7 @@ null_lock(struct vop_lock1_args *ap) > >>>> > * We prevent it from being recycled by holding the vno > de > >>>> > * here. > >>>> > */ > >>>> >- vholdl(lvp); > >>>> >+ vholdnz(lvp); > >>>> > error = VOP_LOCK(lvp, flags); > >>>> > > >>>> > /* > >>>> >@@ -710,31 +710,16 @@ static int > >>>> > null_unlock(struct vop_unlock_args *ap) > >>>> > { > >>>> > struct vnode *vp = ap->a_vp; > >>>> >- int flags = ap->a_flags; > >>>> >- int mtxlkflag = 0; > >>>> > struct null_node *nn; > >>>> > struct vnode *lvp; > >>>> > int error; > >>>> > > >>>> >- if ((flags & LK_INTERLOCK) != 0) > >>>> >- mtxlkflag = 1; > >>>> >- else if (mtx_owned(VI_MTX(vp)) == 0) { > >>>> >- VI_LOCK(vp); > >>>> >- mtxlkflag = 2; > >>>> >- } > >>>> > nn = VTONULL(vp); > >>>> > if (nn != NULL && (lvp = NULLVPTOLOWERVP(vp)) != NULL) { > >>>> >- VI_LOCK_FLAGS(lvp, MTX_DUPOK); > >>>> >- flags |= LK_INTERLOCK; > >>>> >- vholdl(lvp); > >>>> >- VI_UNLOCK(vp); > >>>> >- error = VOP_UNLOCK(lvp, flags); > >>>> >+ vholdnz(lvp); > >>>> >+ error = VOP_UNLOCK(lvp, 0); > >>>> > vdrop(lvp); > >>>> >- if (mtxlkflag == 0) > >>>> >- VI_LOCK(vp); > >>>> > } else { > >>>> >- if (mtxlkflag == 2) > >>>> >- VI_UNLOCK(vp); > >>>> > error = vop_stdunlock(ap); > >>>> > } > >>>> > > >>>> >@@ -845,10 +830,8 @@ null_getwritemount(struct > >>>vop_getwritemount_args > >>>> >*ap) > >>>> > VI_LOCK(vp); > >>>> > xp = VTONULL(vp); > >>>> > if (xp && (lowervp = xp->null_lowervp)) { > >>>> >- VI_LOCK_FLAGS(lowervp, MTX_DUPOK); > >>>> >+ vholdnz(lowervp); > >>>> > VI_UNLOCK(vp); > >>>> >- vholdl(lowervp); > >>>> >- VI_UNLOCK(lowervp); > >>>> > VOP_GETWRITEMOUNT(lowervp, ap->a_mpp); > >>>> > vdrop(lowervp); > >>>> > } else { > >>>> > >>>> Hi mjg@, > >>>> > >>>> This causes trap 12 a few seconds after mountlate during boot. > >>>Reverting this > >>>> commit allowed it to boot. > >>>> > >>>> Sorry for no backtrace. I managed to > >>>> revert and test just prior to rushing out to catch the bus, with no > >>>time to > >>>> craft a proper email. I'll post the backtrace when I arrive at work. > >>> > >>>Hi mjg@, > >>> > >>>The gory details below. Don't be fooled by the panic when yppasswd is > >>>started. I was able to reproduce this panic in single user by simply > >>>mounting and using a nullfs. > >>> > >>>kernel trap 12 with interrupts disabled > >>> > >>> > >>>Fatal trap 12: page fault while in kernel mode > >>>cpuid = 0; apic id = 00 > >>>fault virtual address = 0xc > >>>fault code = supervisor read data, page not present > >>>instruction pointer = 0x20:0xffffffff806f0633 > >>>stack pointer = 0x28:0xfffffe004b3fa7c0 > >>>frame pointer = 0x28:0xfffffe004b3fa7e0 > >>>code segment = base 0x0, limit 0xfffff, type 0x1b > >>> = DPL 0, pres 1, long 1, def32 0, gran 1 > >>>processor eflags = resume, IOPL = 0 > >>>current process = 26 (syncer) > >>>trap number = 12 > >>>panic: page fault > >>>cpuid = 0 > >>>time = 1566821203 > >>>KDB: stack backtrace: > >>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > >>>0xfffffe004b3fa470 > >>>vpanic() at vpanic+0x19d/frame 0xfffffe004b3fa4c0 > >>>panic() at panic+0x43/frame 0xfffffe004b3fa520 > >>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b3fa580 > >>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b3fa5e0 > >>>trap() at trap+0x29d/frame 0xfffffe004b3fa6f0 > >>>calltrap() at calltrap+0x8/frame 0xfffffe004b3fa6f0 > >>>--- trap 0xc, rip = 0xffffffff806f0633, rsp = 0xfffffe004b3fa7c0, rbp > >>= > >>>0xfffffe004b3fa7e0 --- > >>>propagate_priority() at propagate_priority+0x153/frame > >>>0xfffffe004b3fa7e0 > >>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b3fa830 > >>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b3fa8c0 > >>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame > >>>0xfffffe004b3fa920 > >>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b3fa980 > >>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b3fa9c0 > >>>VOP_FSYNC_APV() at VOP_FSYNC_APV+0x56/frame 0xfffffe004b3fa9e0 > >>>sched_sync() at sched_sync+0x379/frame 0xfffffe004b3faa70 > >>>fork_exit() at fork_exit+0x83/frame 0xfffffe004b3faab0 > >>>fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004b3faab0 > >>>--- trap 0, rip = 0, rsp = 0, rbp = 0 --- > >>>Uptime: 3m25s > >>>Dumping 921 out of 7974 > >>>MB:..2%..11%..21%..32%..42%..51%..61%..72%..82%..91% > >>> > >>>__curthread () at /opt/src/svn-current/sys/amd64/include/pcpu.h:246 > >>>246 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" > >>>(OFFSETOF_CURTHREAD)); > >>>(kgdb) #0 __curthread () at > >>>/opt/src/svn-current/sys/amd64/include/pcpu.h:246 > >>>#1 doadump (textdump=1) at > >>>/opt/src/svn-current/sys/kern/kern_shutdown.c:392 > >>>#2 0xffffffff8068aead in kern_reboot (howto=260) > >>> at /opt/src/svn-current/sys/kern/kern_shutdown.c:479 > >>>#3 0xffffffff8068b369 in vpanic (fmt=, ap= >>>out>) > >>> at /opt/src/svn-current/sys/kern/kern_shutdown.c:905 > >>>#4 0xffffffff8068b163 in panic (fmt=) > >>> at /opt/src/svn-current/sys/kern/kern_shutdown.c:832 > >>>#5 0xffffffff80a11c2c in trap_fatal (frame=0xfffffe004b3fa700, > >>eva=12) > >>> at /opt/src/svn-current/sys/amd64/amd64/trap.c:943 > >>>#6 0xffffffff80a11c79 in trap_pfault (frame=0xfffffe004b3fa700, > >>>usermode=0) > >>> at /opt/src/svn-current/sys/amd64/amd64/trap.c:767 > >>>#7 0xffffffff80a1126d in trap (frame=0xfffffe004b3fa700) > >>> at /opt/src/svn-current/sys/amd64/amd64/trap.c:443 > >>>#8 > >>>#9 propagate_priority (td=0xfffff8003e43b000) > >>> at /opt/src/svn-current/sys/kern/subr_turnstile.c:279 > >>>#10 0xffffffff806f13db in turnstile_wait (ts=0xfffff8000258a780, > >>> owner=, queue=-512) > >>> at /opt/src/svn-current/sys/kern/subr_turnstile.c:785 > >>>#11 0xffffffff80669271 in __mtx_lock_sleep (c=0xfffff80132e4ebf0, > >>> v=) at > >>/opt/src/svn-current/sys/kern/kern_mutex.c:654 > >>>#12 0xffffffff80767b1b in mnt_vnode_next_active_relock ( > >>> mvp=0xfffff80002449800, mp=0xfffff80132d1e000, > >>vp=0xfffff80132e4eb40) > >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:5590 > >>>#13 mnt_vnode_next_active (mvp=, mp=0xfffff80132d1e000) > >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:5649 > >>>#14 0xffffffff80766206 in __mnt_vnode_next_active > >>>(mvp=0xfffff80002449800, > >>> mp=) at > >>/opt/src/svn-current/sys/kern/vfs_subr.c:5682 > >>>#15 vfs_msync (mp=0xfffff80132d1e000, flags=2) > >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:4238 > >>>#16 0xffffffff8076b4a6 in sync_fsync (ap=) > >>> at /opt/src/svn-current/sys/kern/vfs_subr.c:4468 > >>>#17 0xffffffff80a7be46 in VOP_FSYNC_APV ( > >>> vop=0xffffffff80cb4120 , a=0xfffffe004b3faa08) > >>> at vnode_if.c:1312 > >>>#18 0xffffffff80769319 in VOP_FSYNC (vp=, waitfor=3, > >>> td=0xfffff800061dc5a0) at ./vnode_if.h:549 > >>>#19 sync_vnode (bo=, td=0xfffff800061dc5a0, > >>> slp=) at > >>/opt/src/svn-current/sys/kern/vfs_subr.c:2316 > >>>#20 sched_sync () at /opt/src/svn-current/sys/kern/vfs_subr.c:2418 > >>>#21 0xffffffff80647e33 in fork_exit (callout=0xffffffff80768fa0 > >>>, > >>> arg=0x0, frame=0xfffffe004b3faac0) > >>> at /opt/src/svn-current/sys/kern/kern_fork.c:1045 > >>>#22 > >>> > >>> > >>>dmesg output: > >>> > >>>Updating motd:. > >>>Mounting late filesystems:. > >>>Starting nscd. > >>>Security policy loaded: MAC/ntpd (mac_ntpd) > >>>Starting ntpd. > >>>Starting yppasswdd. > >>>kernel trap 12 with interrupts disabled > >>> > >>> > >>>Fatal trap 12: page fault while in kernel mode > >>>cpuid = 0; apic id = 00 > >>>fault virtual address = 0xc > >>>fault code = supervisor read data, page not present > >>>instruction pointer = 0x20:0xffffffff806f0633 > >>>stack pointer = 0x28:0xfffffe004b7fa7c0 > >>>frame pointer = 0x28:0xfffffe004b7fa7e0 > >>>code segment = base 0x0, limit 0xfffff, type 0x1b > >>> = DPL 0, pres 1, long 1, def32 0, gran 1 > >>>processor eflags = resume, IOPL = 0 > >>>current process = 26 (syncer) > >>>trap number = 12 > >>>panic: page fault > >>>cpuid = 0 > >>>time = 1566820700 > >>>KDB: stack backtrace: > >>>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > >>>0xfffffe004b7fa470 > >>>vpanic() at vpanic+0x19d/frame 0xfffffe004b7fa4c0 > >>>panic() at panic+0x43/frame 0xfffffe004b7fa520 > >>>trap_fatal() at trap_fatal+0x39c/frame 0xfffffe004b7fa580 > >>>trap_pfault() at trap_pfault+0x49/frame 0xfffffe004b7fa5e0 > >>>trap() at trap+0x29d/frame 0xfffffe004b7fa6f0 > >>>calltrap() at calltrap+0x8/frame 0xfffffe004b7fa6f0 > >>>--- trap 0xc, rip = 0xffffffff806f0633, rsp = 0xfffffe004b7fa7c0, rbp > >>= > >>>0xfffffe004b7fa7e0 --- > >>>propagate_priority() at propagate_priority+0x153/frame > >>>0xfffffe004b7fa7e0 > >>>turnstile_wait() at turnstile_wait+0x32b/frame 0xfffffe004b7fa830 > >>>__mtx_lock_sleep() at __mtx_lock_sleep+0x191/frame 0xfffffe004b7fa8c0 > >>>mnt_vnode_next_active() at mnt_vnode_next_active+0x2cb/frame > >>>0xfffffe004b7fa920 > >>>vfs_msync() at vfs_msync+0x266/frame 0xfffffe004b7fa980 > >>>sync_fsync() at sync_fsync+0xe6/frame 0xfffffe004b7fa9c0 > >>>--- trap 0, rip = 0, rsp = 0, rbp = 0 --- > >>>Uptime: 1m38s > >>>Dumping 810 out of 7974 > >>>MB:..2%..12%..22%..32%..42%..52%..62%..72%..81%..91%---<>--- > >> > >>This may be a simple case of increasing the kernel stack. I'll try this > >>at noon. > > > > Increasing kernel stack pages from 8 to 20 still results in the panic. > > > > > > -- > > Pardon the typos and autocorrect, small keyboard in use. > > Cheers, > > Cy Schubert > > FreeBSD UNIX: Web: http://www.FreeBSD.org > > > > The need of the many outweighs the greed of the few. > > > > > -- > Mateusz Guzik