Date:      Tue, 30 Jun 2020 20:33:39 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 247668] Page fault in zfsctl_snapdir_getattr
Message-ID:  <bug-247668-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247668

            Bug ID: 247668
           Summary: Page fault in zfsctl_snapdir_getattr
           Product: Base System
           Version: 12.1-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: asomers@FreeBSD.org

On a very heavily loaded server I observed the following kernel-mode page
fault.  The offending process was a "procstat -af", which did VOP_GETATTR on
every open file descriptor on the whole system, including the .zfs/snapshot
directories.  On one of those, it called dsl_dataset_phys, which tried to
dereference a null pointer.  There were also 5 "zfs destroy" processes, and
dozens of "zfs list" and "zfs recv" running concurrently.
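
The trap frame in the backtrace below reports eva=24 while the ds argument to dsl_dataset_phys is a plausible kernel address. That pattern is consistent with reading a field at offset 24 through a null pointer one level below ds. A minimal sketch of the arithmetic, assuming a buffer handle laid out as three 64-bit fields followed by a data pointer (struct model_dbuf and fault_address are hypothetical names for illustration, not copied from the ZFS headers):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a buffer handle: three 64-bit fields
 * followed by a data pointer. Not copied from the ZFS headers.
 */
struct model_dbuf {
	uint64_t db_object;
	uint64_t db_offset;
	uint64_t db_size;
	void *db_data;
};

/*
 * Address the CPU would touch when reading db_data through a handle
 * located at `base`. With base == 0 (a null handle) the result is
 * simply the offset of the field within the structure.
 */
uintptr_t
fault_address(uintptr_t base)
{
	return (base + offsetof(struct model_dbuf, db_data));
}
```

Under that assumed layout, fault_address(0) evaluates to 24, matching the reported fault address. This does not prove which pointer was null; it only shows that the number fits a null handle rather than a null ds.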

I suspect that zfsctl_snapdir_getattr is missing some lock when it checks
dsl_dataset_phys, while trying to calculate the directory's nlink attribute.
But it's not clear what lock it ought to hold.  It's worth noting that ZoL
doesn't have this problem because it doesn't even try to calculate nlink;
instead it always returns "2".
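
The difference between the two approaches can be sketched in userspace C. Everything here is a hypothetical model (model_phys, model_dataset, and both functions are invented for illustration), and the "2 plus one link per snapshot" rule is an assumption about how a computed nlink would be derived, not a quote of the FreeBSD code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-dataset record; not the ZFS struct. */
struct model_phys {
	uint64_t snapshot_count;
};

/* Hypothetical model of a dataset whose record may already be gone. */
struct model_dataset {
	struct model_phys *phys;	/* NULL once torn down */
};

/* A directory's own "." entry plus its entry in the parent. */
#define	MODEL_NLINK_BASE	2

/*
 * Computed variant: base links plus one per snapshot (assumed rule).
 * The NULL check only keeps this userspace model from crashing. In a
 * kernel the pointer could still change between the check and the use
 * unless the caller holds whatever lock or reference protects it.
 */
uint64_t
model_nlink_computed(const struct model_dataset *ds)
{
	if (ds == NULL || ds->phys == NULL)
		return (MODEL_NLINK_BASE);
	return (MODEL_NLINK_BASE + ds->phys->snapshot_count);
}

/* Constant variant: never touches the dataset, so nothing can race. */
uint64_t
model_nlink_constant(void)
{
	return (MODEL_NLINK_BASE);
}
```

The constant variant trades accuracy of the link count for having no dependency on dataset state at all, which is why it cannot hit this kind of fault.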

Sadly, I haven't been able to reproduce the issue on any non-production
machine.

The server in question is running 12-STABLE at svn r346022.

#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xffffffff80bbe655 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:451
#3  0xffffffff80bbea96 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:880
#4  0xffffffff80bbe8b3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:807
#5  0xffffffff81090310 in trap_fatal (frame=0xfffffe04b95c08a0, eva=24)
    at /usr/src/sys/amd64/amd64/trap.c:925
#6  0xffffffff8109035f in trap_pfault (frame=0xfffffe04b95c08a0,
    usermode=<optimized out>, signo=<optimized out>, ucode=<optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:743
#7  0xffffffff8108f9b8 in trap (frame=0xfffffe04b95c08a0)
    at /usr/src/sys/amd64/amd64/trap.c:407
#8  <signal handler called>
#9  0xffffffff825f4cbc in dsl_dataset_phys (ds=0xfffff86821e72e10)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:257
#10 zfsctl_snapdir_getattr (ap=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c:1133
#11 0xffffffff81211315 in VOP_GETATTR_APV (
    vop=0xffffffff826be060 <zfsctl_ops_snapdir>, a=0xfffffe04b95c0a98)
    at vnode_if.c:733
#12 0xffffffff80c7bd29 in VOP_GETATTR (vp=0x1, vap=<optimized out>,
    cred=0xfffff88e58a45700) at ./vnode_if.h:309
#13 vop_stdvptocnp (ap=<optimized out>) at /usr/src/sys/kern/vfs_default.c:743
#14 0xffffffff8121495b in VOP_VPTOCNP_APV (
    vop=0xffffffff81b281b8 <default_vnodeops>, a=0xfffffe04b95c0d90)
    at vnode_if.c:3718
#15 0xffffffff80c78304 in VOP_VPTOCNP (vp=0x0, vpp=<optimized out>,
    cred=0xfffff88e58a45700, buf=0xfffff86ed5d7d400 "",
    buflen=0xfffffe04b95c0e34) at ./vnode_if.h:1599
#16 vn_vptocnp (vp=0xfffffe04b95c0e28, cred=<optimized out>,
    buf=<optimized out>, buflen=<optimized out>)
    at /usr/src/sys/kern/vfs_cache.c:2296
#17 0xffffffff80c77db7 in vn_fullpath1 (td=0xfffff865848d7000,
    vp=0xfffff80e4a8a53c0, rdir=0xfffff860440f0b40, buf=0xfffff86ed5d7d400 "",
    retbuf=0xfffffe04b95c0fa8, buflen=1023)
    at /usr/src/sys/kern/vfs_cache.c:2392
#18 0xffffffff80c780f8 in vn_fullpath (td=0xfffff865848d7000,
    vn=0xfffff80e4a8a53c0, retbuf=0xfffff865848d75a0,
    freebuf=0xfffffe04b95c0fb0) at /usr/src/sys/kern/vfs_cache.c:2221
#19 0xffffffff80ca0635 in vn_fill_kinfo_vnode (vp=0xfffff80e4a8a53c0,
    kif=0xfffff831bcf5e818) at /usr/src/sys/kern/vfs_vnops.c:2352
#20 0xffffffff80c9d3f6 in vn_fill_kinfo (fp=<optimized out>,
    kif=0xfffff831bcf5e818, fdp=<optimized out>)
    at /usr/src/sys/kern/vfs_vnops.c:2318
#21 0xffffffff80b6ca25 in fo_fill_kinfo (fp=<optimized out>,
    kif=<optimized out>, fdp=<optimized out>) at /usr/src/sys/sys/file.h:407
#22 export_file_to_kinfo (fp=<optimized out>, fd=<optimized out>,
    rightsp=<optimized out>, kif=<optimized out>, fdp=0xfffff86618252450,
    flags=1) at /usr/src/sys/kern/kern_descrip.c:3494
#23 export_file_to_sb (fp=0xfffff8210a788460, fd=4, rightsp=<optimized out>,
    efbuf=<optimized out>) at /usr/src/sys/kern/kern_descrip.c:3560
#24 kern_proc_filedesc_out (p=<optimized out>, sb=<optimized out>,
    maxlen=<optimized out>, flags=-1124734960)
    at /usr/src/sys/kern/kern_descrip.c:3667
#25 0xffffffff80b6dbbd in sysctl_kern_proc_filedesc (oidp=<optimized out>,
    arg1=0xfffffe04b95c12bc, arg2=<optimized out>, req=<optimized out>)
    at /usr/src/sys/kern/kern_descrip.c:3701
#26 0xffffffff80bcd639 in sysctl_root_handler_locked (
    oid=0xffffffff81b0a760 <sysctl___kern_proc_filedesc>,
    arg1=0xfffffe04b95c12bc, arg2=1, req=0xfffffe04b95c11f0,
    tracker=0xfffffe04b95c1168) at /usr/src/sys/kern/kern_sysctl.c:166
#27 0xffffffff80bcccf9 in sysctl_root (oidp=<optimized out>,
    arg1=0xfffffe04b95c12bc, arg2=1, req=0xfffffe04b95c11f0)
    at /usr/src/sys/kern/kern_sysctl.c:2062
#28 0xffffffff80bcd368 in userland_sysctl (td=0xfffff865848d7000,
    name=0xfffffe04b95c12b0, namelen=4, old=<optimized out>,
    oldlenp=<optimized out>, inkernel=<optimized out>, new=0x0, newlen=0,
    retval=0xfffffe04b95c1318, flags=0) at /usr/src/sys/kern/kern_sysctl.c:2157
#29 0xffffffff80bcd1af in sys___sysctl (td=0xfffff865848d7000,
    uap=0xfffff865848d73c0) at /usr/src/sys/kern/kern_sysctl.c:2092
#30 0xffffffff81090e87 in syscallenter (td=0xfffff865848d7000)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#31 amd64_syscall (td=0xfffff865848d7000, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1168
#32 <signal handler called>
#33 0x000000080045789a in ?? ()
