Date: Thu, 7 Jul 2016 10:26:10 -0400
From: David Cross <dcrosstech@gmail.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-stable@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: Reproducable panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
Message-ID: <CAM9edePjo+UnWSzHLrcbsw0-5Z6y7xcbWB5eg1fak+zqbZWndQ@mail.gmail.com>
In-Reply-To: <20160707001218.GI38613@kib.kiev.ua>
References: <CAM9edeOek_zqRPt-0vDMNMK9CH31yAeVPAirWVvcuUWy5xsm4A@mail.gmail.com>
 <CAM9edeN1Npc=cNth2gAk1XFLvar-jZqzxWX50pLQVxDusMrOVg@mail.gmail.com>
 <20160706151822.GC38613@kib.kiev.ua>
 <CAM9edeMDdjO6C2BRXBxDV-trUG5A0NEua+K0H_wERq7H4AR72g@mail.gmail.com>
 <CAM9edePfMxm26yYC=o10CGhRSDUHXTTNosFc_T89v4Pxt0JM0g@mail.gmail.com>
 <20160706173758.GF38613@kib.kiev.ua>
 <CAM9edeOb0yUqaXbTMGBJVFqgJ++yaDr4tGV1TQ_UPOYmv4p2fw@mail.gmail.com>
 <20160707001218.GI38613@kib.kiev.ua>
The state was printed after the panic, yes. If I understand the idea of
softupdates correctly, I don't think it's odd that there was no attempt to
write this buffer: it has b_dep defined, which means those dependent blocks
should be written first, right?

Also, I was just able to reproduce this on 11.0-ALPHA6. I did a fresh fsck
on the filesystem to ensure it was clean (I typically don't fsck between
reproduction runs, since it takes so long; when I do need a 'clean' slate I
just restore the snapshot, which is faster than fsck). The panic from
11.0-ALPHA6 is:

root@bhyve103:~ #
panic: softdep_deallocate_dependencies: dangling deps
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe011b3861b0
vpanic() at vpanic+0x182/frame 0xfffffe011b386230
panic() at panic+0x43/frame 0xfffffe011b386290
softdep_deallocate_dependencies() at softdep_deallocate_dependencies+0x71/frame 0xfffffe011b3862b0
brelse() at brelse+0x162/frame 0xfffffe011b386310
bufwrite() at bufwrite+0x206/frame 0xfffffe011b386360
ffs_write() at ffs_write+0x3ed/frame 0xfffffe011b386410
VOP_WRITE_APV() at VOP_WRITE_APV+0x16f/frame 0xfffffe011b386520
vnode_pager_generic_putpages() at vnode_pager_generic_putpages+0x2d5/frame 0xfffffe011b3865f0
VOP_PUTPAGES_APV() at VOP_PUTPAGES_APV+0xda/frame 0xfffffe011b386620
vnode_pager_putpages() at vnode_pager_putpages+0x89/frame 0xfffffe011b386690
vm_pageout_flush() at vm_pageout_flush+0x12d/frame 0xfffffe011b386720
vm_object_page_collect_flush() at vm_object_page_collect_flush+0x23a/frame 0xfffffe011b386820
vm_object_page_clean() at vm_object_page_clean+0x1be/frame 0xfffffe011b3868a0
vm_object_terminate() at vm_object_terminate+0xa5/frame 0xfffffe011b3868e0
vnode_destroy_vobject() at vnode_destroy_vobject+0x63/frame 0xfffffe011b386910
ufs_reclaim() at ufs_reclaim+0x1f/frame 0xfffffe011b386940
VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0xda/frame 0xfffffe011b386970
vgonel() at vgonel+0x204/frame 0xfffffe011b3869e0
vnlru_proc() at vnlru_proc+0x577/frame 0xfffffe011b386a70
fork_exit() at fork_exit+0x84/frame 0xfffffe011b386ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe011b386ab0

Pardon the machine name; I have a setup script for bhyve VMs, and I didn't
tweak the name, just the install location:

root@bhyve103:~ # uname -a
FreeBSD bhyve103.priv.dcrosstech.com 11.0-ALPHA6 FreeBSD 11.0-ALPHA6 #0 r302303: Fri Jul  1 03:32:49 UTC 2016     root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64

On the 10.3 kernel I was also able to walk the mnt_nvnodes list before the
FS panic, and I have the vnode * saved from before the vnlru attempted
reclaim.

print *((struct vnode *)0xfffff80002dc2760)
$6 = {v_tag = 0xffffffff8072b891 "ufs", v_op = 0xffffffff80a13c40,
  v_data = 0xfffff8006a20b160, v_mount = 0xfffff800024e9cc0,
  v_nmntvnodes = {tqe_next = 0xfffff80002dc2588, tqe_prev = 0xfffff80002dc2958},
  v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0},
  v_hashlist = {le_next = 0x0, le_prev = 0xfffffe0000932ef8},
  v_cache_src = {lh_first = 0x0},
  v_cache_dst = {tqh_first = 0xfffff8006a18ce00, tqh_last = 0xfffff8006a18ce20},
  v_cache_dd = 0x0,
  v_lock = {lock_object = {lo_name = 0xffffffff8072b891 "ufs",
      lo_flags = 117112832, lo_data = 0, lo_witness = 0xfffffe0000607280},
    lk_lock = 1, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96},
  v_interlock = {lock_object = {lo_name = 0xffffffff8074a89f "vnode interlock",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0xfffffe00005fd680},
    mtx_lock = 4},
  v_vnlock = 0xfffff80002dc27c8,
  v_actfreelist = {tqe_next = 0xfffff80002dc2938, tqe_prev = 0xfffff80002dc2648},
  v_bufobj = {bo_lock = {lock_object = {lo_name = 0xffffffff80754d34 "bufobj interlock",
        lo_flags = 86179840, lo_data = 0, lo_witness = 0xfffffe0000605700},
      rw_lock = 1},
    bo_ops = 0xffffffff809e97c0, bo_object = 0xfffff80002c1b400,
    bo_synclist = {le_next = 0xfffff80002dc2a08, le_prev = 0xfffff80002dc2688},
    bo_private = 0xfffff80002dc2760, __bo_vnode = 0xfffff80002dc2760,
    bo_clean
    = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80002dc2880},
      bv_root = {pt_root = 0}, bv_cnt = 0},
    bo_dirty = {bv_hd = {tqh_first = 0xfffffe00f7ae8658, tqh_last = 0xfffffe00f7ae86a8},
      bv_root = {pt_root = 18446741878841706297}, bv_cnt = 1},
    bo_numoutput = 0, bo_flag = 1, bo_bsize = 16384},
  v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0,
  v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff80002dc28e8},
    rl_currdep = 0x0},
  v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0,
  v_holdcnt = 2, v_usecount = 0, v_iflag = 512, v_vflag = 0,
  v_writecount = 0, v_hash = 18236560, v_type = VREG}

I think what is wanted is the buffers and their dependency lists.. I am not
sure where those are under all of this.. bo_*?

On Wed, Jul 6, 2016 at 8:12 PM, Konstantin Belousov <kostikbel@gmail.com> wrote:
> On Wed, Jul 06, 2016 at 02:21:20PM -0400, David Cross wrote:
> > (kgdb) up 5
> > #5  0xffffffff804aafa1 in brelse (bp=0xfffffe00f77457d0) at buf.h:428
> > 428                     (*bioops.io_deallocate)(bp);
> > Current language:  auto; currently minimal
> > (kgdb) p/x *(struct buf *)0xfffffe00f77457d0
> > $1 = {b_bufobj = 0xfffff80002e88480, b_bcount = 0x4000, b_caller1 = 0x0,
> >   b_data = 0xfffffe00f857b000, b_error = 0x0, b_iocmd = 0x0, b_ioflags = 0x0,
> >   b_iooffset = 0x0, b_resid = 0x0, b_iodone = 0x0, b_blkno = 0x115d6400,
> >   b_offset = 0x0, b_bobufs = {tqe_next = 0x0, tqe_prev = 0xfffff80002e884d0},
> >   b_vflags = 0x0, b_freelist = {tqe_next = 0xfffffe00f7745a28,
> >     tqe_prev = 0xffffffff80c2afc0}, b_qindex = 0x0, b_flags = 0x20402800,
> >   b_xflags = 0x2, b_lock = {lock_object = {lo_name = 0xffffffff8075030b,
> >       lo_flags = 0x6730000, lo_data = 0x0, lo_witness = 0xfffffe0000602f00},
> >     lk_lock = 0xfffff800022e8000, lk_exslpfail = 0x0, lk_timo = 0x0,
> >     lk_pri = 0x60}, b_bufsize = 0x4000, b_runningbufspace = 0x0,
> >   b_kvabase = 0xfffffe00f857b000, b_kvaalloc = 0x0, b_kvasize = 0x4000,
> >   b_lblkno = 0x0, b_vp = 0xfffff80002e883b0, b_dirtyoff = 0x0,
> >   b_dirtyend = 0x0, b_rcred = 0x0, b_wcred = 0x0, b_saveaddr = 0x0,
> >   b_pager = {pg_reqpage = 0x0}, b_cluster = {cluster_head = {tqh_first = 0x0,
> >       tqh_last = 0x0}, cluster_entry = {tqe_next = 0x0, tqe_prev = 0x0}},
> >   b_pages = {0xfffff800b99b30b0, 0xfffff800b99b3118, 0xfffff800b99b3180,
> >     0xfffff800b99b31e8, 0x0 <repeats 28 times>}, b_npages = 0x4,
> >   b_dep = {lh_first = 0xfffff800023d8c00}, b_fsprivate1 = 0x0,
> >   b_fsprivate2 = 0x0, b_fsprivate3 = 0x0, b_pin_count = 0x0}
> >
> > This is the freshly allocated buf that causes the panic; is this what is
> > needed?  I "know" which vnode will cause the panic on vnlru cleanup, but I
> > don't know how to walk the memory list without a 'hook'.. as in, I can
> > set up the kernel in a state that I know will panic when the vnode is
> > cleaned up, I can force a panic 'early' (kill -9 1), and then I could get
> > that vnode.. if I could get the vnode list to walk.
>
> Was the state printed after the panic occurred?  What is strange is that
> the buffer was not even tried for i/o, AFAIS.  Apart from the empty
> b_error/b_iocmd, the b_lblkno is zero, which means that the buffer was
> never allocated on the disk.
>
> The b_blkno looks strangely high.  Can you print *(bp->b_vp)?  If it is
> a UFS vnode, do p *(struct inode *)(<vnode>->v_data).  I am esp. interested
> in the vnode size.
>
> Can you reproduce the problem on HEAD?