Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 13 Dec 2021 20:34:13 +0100
From:      Peter <pmc@citylink.dinoex.sub.org>
To:        freebsd-stable@freebsd.org
Subject:   12.3:[ZFS pointer corruption] kernel crash exporting FreeBSD src repo
Message-ID:  <YbegNQCTRxon6K6L@gate.intra.daemon.contact>

next in thread | raw e-mail | index | archive | help

Hija,

  I have a filesystem with the FreeBSD src repo clone.
=20
When I do=20
>  git checkout-index -a --prefix=3D/new-filesystem/
in 5 of 6 cases I get a kernel crash.

The repo otherwise looks sane:

# git status
On branch local.generic
Your branch is ahead of 'origin/releng/12.3' by 51 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
# git branch -v -v
* local.generic             f99bc919533 [origin/releng/12.3: ahead 51] em/i=
gb capabilities fix f=FCr 12.3
  local.generic.releng/11.4 95388afb572 [origin/releng/11.4: ahead 12] KERN=
CONF, make conf, sendmail mc
  local.generic.releng/12.2 63c868925fc [origin/releng/12.2: ahead 45] Kern=
elconfigs aktualisiert
  main                      01bad87a76d [origin/main: behind 12250] UPDATIN=
G: add an entry for commit 875977314881


The filesystem is also responsive and in working order.
ZFS scrub does not see anything to complain.
No other error messages anywhere.
zdb runs through it and finds no error.

Everything else on the system (and there is a lot) appears to work
just fine.

In 2 of 5 cases a kernel dump is written, which is quite exactly the
same every time (in another case the name was not "HEAD", but
".arcconfig", another time "sys").


Fatal trap 12: page fault while in kernel mode
cpuid =3D 19; apic id =3D 19
fault virtual address   =3D 0x410
fault code              =3D supervisor read data, page not present
instruction pointer     =3D 0x20:0xffffffff808e9815
stack pointer           =3D 0x28:0xfffffe00e667bf50
frame pointer           =3D 0x28:0xfffffe00e667bff0
code segment            =3D base 0x0, limit 0xfffff, type 0x1b
                        =3D DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        =3D interrupt enabled, resume, IOPL =3D 0
current process         =3D 18330 (git)
trap number             =3D 12

#8  0xffffffff808e9815 in _sx_xlock_hard ()
#9  0xffffffff8039e35b in __dbuf_hold_impl ()
#10 0xffffffff8039e44f in dbuf_hold ()
#11 0xffffffff803bc152 in dnode_hold_impl ()
#12 0xffffffff803a597d in dmu_bonus_hold ()
#13 0xffffffff80468b16 in zfs_zget ()
#14 0xffffffff80443ccf in zfs_dirent_lookup ()
#15 0xffffffff80443d9b in zfs_dirlook ()
#16 0xffffffff8045ddb0 in zfs_lookup ()
#17 0xffffffff8045e569 in zfs_freebsd_cachedlookup ()
#18 0xffffffff80d5ac5c in VOP_CACHEDLOOKUP_APV ()
#19 0xffffffff8099e27c in vfs_cache_lookup ()
#20 0xffffffff80d5ab2c in VOP_LOOKUP_APV ()
#21 0xffffffff80835ba5 in null_lookup ()
#22 0xffffffff80d5ab2c in VOP_LOOKUP_APV ()
#23 0xffffffff809a7cc1 in lookup ()
#24 0xffffffff809a716f in namei ()
#25 0xffffffff809bff32 in kern_statat ()
#26 0xffffffff809c06cf in sys_fstatat ()
#27 0xffffffff80cd38b0 in amd64_syscall ()
#28 0xffffffff80caba8e in fast_syscall_common ()
------------------------------------------------------
#8  0xffffffff808e9815 in _sx_xlock_hard (sx=3D0xfffffe00ab1680a0,=20
    x=3D<value optimized out>, opts=3D<value optimized out>)
    at /usr/src/sys/kern/kern_sx.c:682
#9  0xffffffff8039e35b in __dbuf_hold_impl (dh=3D0xfffff80018449000)
    at src/sys/sys/sx.h:168
#10 0xffffffff8039e44f in dbuf_hold (dn=3D0xfffff800183918a0, blkid=3D5302,=
=20
    tag=3D0xffffffff80d7ef0b)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3016
#11 0xffffffff803bc152 in dnode_hold_impl (os=3D0xfffff8001815e400,=20
    object=3D169684, flag=3D1, slots=3D0, dnp=3D0xfffffe00e667c150)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c:1329
#12 0xffffffff803a597d in dmu_bonus_hold (os=3D<value optimized out>,=20
    object=3D<value optimized out>, tag=3D<value optimized out>,=20
    dbp=3D0xfffffe00e667c1e8)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:345
#13 0xffffffff80468b16 in zfs_zget (zfsvfs=3D0xfffffe00ba7fe000, obj_num=3D=
169684,=20
    zpp=3D0xfffffe00e667c298)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:=
1165
#14 0xffffffff80443ccf in zfs_dirent_lookup (dzp=3D0xfffff80704c676c0,=20
    name=3D0xfffffe00e667c3c0 "HEAD", zpp=3D<value optimized out>, flag=3D2)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:187
#15 0xffffffff80443d9b in zfs_dirlook (dzp=3D0xfffff80704c676c0,=20
    name=3D<value optimized out>, zpp=3D0xfffffe00e667c338)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
#16 0xffffffff8045ddb0 in zfs_lookup (dvp=3D<value optimized out>,=20
    nm=3D0xfffffe00e667c3c0 "HEAD", vpp=3D<value optimized out>,=20
    cnp=3D<value optimized out>, nameiop=3D0, cr=3D<value optimized out>, f=
lags=3D0,=20
    cached=3D1)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:=
1671
#17 0xffffffff8045e569 in zfs_freebsd_cachedlookup (ap=3D0xfffffe00e667c520)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:=
4973
#18 0xffffffff80d5ac5c in VOP_CACHEDLOOKUP_APV (vop=3D0xffffffff81067150,=
=20
    a=3D0xfffffe00e667c520) at src/sys/sys/signalvar.h:352
#19 0xffffffff8099e27c in vfs_cache_lookup (ap=3D<value optimized out>)
    at vnode_if.h:80
#20 0xffffffff80d5ab2c in VOP_LOOKUP_APV (vop=3D0xffffffff81067150,=20
    a=3D0xfffffe00e667c5a8) at src/sys/sys/signalvar.h:352
#21 0xffffffff80835ba5 in null_lookup (ap=3D0xfffffe00e667c690) at vnode_if=
=2Eh:54
#22 0xffffffff80d5ab2c in VOP_LOOKUP_APV (vop=3D0xffffffff810a2cc0,=20
    a=3D0xfffffe00e667c690) at src/sys/sys/signalvar.h:352
#23 0xffffffff809a7cc1 in lookup (ndp=3D0xfffffe00e667c8b8) at vnode_if.h:54
#24 0xffffffff809a716f in namei (ndp=3D0xfffffe00e667c8b8)
    at /usr/src/sys/kern/vfs_lookup.c:499
#25 0xffffffff809bff32 in kern_statat (td=3D0xfffff805b25b1000,=20
    flag=3D<value optimized out>, fd=3D<value optimized out>,=20
    path=3D0x8009c7000 <Address 0x8009c7000 out of bounds>,=20
    pathseg=3DUIO_USERSPACE, sbp=3D0xfffffe00e667c9d8, hook=3D0)
    at /usr/src/sys/kern/vfs_syscalls.c:2327
#26 0xffffffff809c06cf in sys_fstatat (td=3D<value optimized out>,=20
    uap=3D0xfffff805b25b13c0) at /usr/src/sys/kern/vfs_syscalls.c:2304
#27 0xffffffff80cd38b0 in amd64_syscall (td=3D0xfffff805b25b1000, traced=3D=
0)
    at src/sys/amd64/amd64/../../kern/subr_syscall.c:144
#28 0xffffffff80caba8e in fast_syscall_common ()
    at /usr/src/sys/amd64/amd64/exception.S:582


Trying to understand this, it all looks quite useful up to #10. And
there I am loosing track: how do we get from #10 to #9 ?

Anyway, I'm bored of all these "optimized out". So I managed to build
a kernel with -O0. (It's no longer very easy to do that, one might
find a switch that works, but then it won't work for the
modules. Thats one reason why I want everything compiled into the
kernel, and no loadable modules.) It then does even run - and it runs
quite well, only a little bit slower.

Now the backtrace is more intellegible:


#9  0xffffffff80bf6ec8 in _sx_xlock_hard (sx=3D0xfffffe00ab2500a0, x=3D0, o=
pts=3D0)
    at ../../../kern/kern_sx.c:683
#10 0xffffffff803e4595 in __sx_xlock (sx=3D0xfffffe00ab2500a0,=20
    td=3D0xfffff80106b3c000, opts=3D0, file=3D0x0, line=3D0) at sx.h:168
#11 0xffffffff803e8947 in dbuf_rele (db=3D0xfffffe00ab250030, tag=3D0x0)
    at ../../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3143
#12 0xffffffff803eaefa in __dbuf_hold_impl (dh=3D0xfffff80499edf000)
    at ../../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2987
#13 0xffffffff803e9fce in dbuf_hold_impl (dn=3D0xfffff8001fe2bb80, level=3D=
0 '\0',=20
    blkid=3D5302, fail_sparse=3D0, fail_uncached=3D0, tag=3D0xffffffff81369=
fca,=20
    dbp=3D0xfffffe00e533eb18)
    at ../../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3016

It then took me a night trying to understand that code (uintptr_t,
never seen before - back when I did learn C, the stance was that you
shouldn't mix integers and pointers - let alone in the same variable).

Then I figured that the x=3D0 in #9 is probably not a good thing.

So I looked where that comes from. That is difficult, but it seems to
come from the db variable in #11, which is a "dmu_buf_impl_t *", and
that looks like this:


$6 =3D {db =3D {db_object =3D 18446735312811784192, db_offset =3D 515396075=
52, db_size =3D 15168, db_data =3D 0x6}, db_objset =3D 0x4000001000,=20
  db_dnode_handle =3D 0x0, db_parent =3D 0x0, db_hash_next =3D 0x0, db_link=
 =3D {avl_child =3D 0xfffffe00ab25a070, avl_pcb =3D 0}, db_blkid =3D 0,=20
  db_blkptr =3D 0x0, db_level =3D 0 '\0', db_mtx =3D {lock_object =3D {lo_n=
ame =3D 0x0, lo_flags =3D 0, lo_data =3D 0, lo_witness =3D 0x0}, sx_lock =
=3D 0},=20
  db_state =3D DB_UNCACHED, db_holds =3D {rc_count =3D 0}, db_buf =3D 0x0, =
db_changed =3D {cv_description =3D 0x0, cv_waiters =3D 0}, db_data_pending =
=3D 0x0,=20
  db_last_dirty =3D 0x0, db_cache_link =3D {list_next =3D 0x0, list_prev =
=3D 0x0}, db_caching_status =3D DB_DBUF_CACHE, db_user =3D 0x0,=20
  db_user_immediate_evict =3D 0 '\0', db_freed_in_flight =3D 0 '\0', db_pen=
ding_evict =3D 0 '\0', db_dirtycnt =3D 0 '\0'}

We can see that the locking info (db_mtx) is just zeroed out, like
almost everything else in there. This is not a dbuf, this is a rogue
dbuf!

I then spent some time reading about these dbufs - but there doesn't
seem to be an easy way to get an inventory and monitor them somehow,
so the question where this havoc comes from is not easy to figure out.

Anyway, I don't necessarily want to figure it out. I want my pool
running.

And since this must have something to do with the pool (there are 5
other pools on the machine, and they do work), I dumped the pool with
"zfs send" into a file.
I replaced the disks.
I replaced the disk controller with a different brand.
I recreated the pool and loaded the file back in.

And the kernel crashes do still appear.

So the bottomline is: ZFS doesn't like git. Now how do I get that pool
working?=20


cheerio,
PMc



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YbegNQCTRxon6K6L>