From: David Cross <dcrosstech@gmail.com>
Date: Thu, 7 Jul 2016 10:26:10 -0400
Subject: Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
To: Konstantin Belousov
Cc: freebsd-stable@freebsd.org, freebsd-hackers@freebsd.org
In-Reply-To: <20160707001218.GI38613@kib.kiev.ua>
References: <20160706151822.GC38613@kib.kiev.ua> <20160706173758.GF38613@kib.kiev.ua> <20160707001218.GI38613@kib.kiev.ua>

The state was printed after the panic, yes. If I understand the idea of softupdates correctly, I don't think it's odd that this buffer was never even attempted to be written: it has b_dep set, and that means the dependent blocks should be written first, right?
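(For reference, my understanding of the path that fires here, sketched from memory rather than quoted from the tree: brelse() hands a buffer it is about to throw away to the soft-updates io_deallocate hook, and that hook refuses to discard a buffer that still carries b_dep records unless the buffer actually failed with an I/O error, since silently dropping the dependencies would break the write ordering soft updates relies on. Roughly:

/*
 * Rough sketch of the check behind the "dangling deps" panic; see
 * softdep_deallocate_dependencies() in sys/ufs/ffs/ffs_softdep.c for
 * the real thing.  The BIO_ERROR condition is my recollection, not a
 * quote of the source.
 */
static void
softdep_deallocate_dependencies_sketch(struct buf *bp)
{

	if ((bp->b_ioflags & BIO_ERROR) == 0)
		panic("softdep_deallocate_dependencies: dangling deps");
	/* otherwise: report the error and allow the deps to be torn down */
}
)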
Also, I was just able to reproduce this on 11.0-ALPHA6. I did a fresh fsck on the filesystem to ensure it was clean (I typically don't fsck between reproduction runs, since it takes so long; when I do need a 'clean' slate I just restore the snapshot, which is faster than fsck). The panic from 11.0-ALPHA6 is:

root@bhyve103:~ # panic: softdep_deallocate_dependencies: dangling deps
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe011b3861b0
vpanic() at vpanic+0x182/frame 0xfffffe011b386230
panic() at panic+0x43/frame 0xfffffe011b386290
softdep_deallocate_dependencies() at softdep_deallocate_dependencies+0x71/frame 0xfffffe011b3862b0
brelse() at brelse+0x162/frame 0xfffffe011b386310
bufwrite() at bufwrite+0x206/frame 0xfffffe011b386360
ffs_write() at ffs_write+0x3ed/frame 0xfffffe011b386410
VOP_WRITE_APV() at VOP_WRITE_APV+0x16f/frame 0xfffffe011b386520
vnode_pager_generic_putpages() at vnode_pager_generic_putpages+0x2d5/frame 0xfffffe011b3865f0
VOP_PUTPAGES_APV() at VOP_PUTPAGES_APV+0xda/frame 0xfffffe011b386620
vnode_pager_putpages() at vnode_pager_putpages+0x89/frame 0xfffffe011b386690
vm_pageout_flush() at vm_pageout_flush+0x12d/frame 0xfffffe011b386720
vm_object_page_collect_flush() at vm_object_page_collect_flush+0x23a/frame 0xfffffe011b386820
vm_object_page_clean() at vm_object_page_clean+0x1be/frame 0xfffffe011b3868a0
vm_object_terminate() at vm_object_terminate+0xa5/frame 0xfffffe011b3868e0
vnode_destroy_vobject() at vnode_destroy_vobject+0x63/frame 0xfffffe011b386910
ufs_reclaim() at ufs_reclaim+0x1f/frame 0xfffffe011b386940
VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0xda/frame 0xfffffe011b386970
vgonel() at vgonel+0x204/frame 0xfffffe011b3869e0
vnlru_proc() at vnlru_proc+0x577/frame 0xfffffe011b386a70
fork_exit() at fork_exit+0x84/frame 0xfffffe011b386ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe011b386ab0

Pardon the machine name; I have a setup script for bhyve VMs, and I didn't tweak the name, just the install location:

root@bhyve103:~ # uname -a
FreeBSD bhyve103.priv.dcrosstech.com 11.0-ALPHA6 FreeBSD 11.0-ALPHA6 #0 r302303: Fri Jul 1 03:32:49 UTC 2016 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

On the 10.3 kernel I was also able to walk the mnt_nvnodes list before the FS panic, and I have the vnode * saved from before vnlru attempted the reclaim.
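(In C terms, the walk amounts to something like the loop below; in kgdb I just chased the v_nmntvnodes.tqe_next pointers by hand. Field names are from my reading of the 10.x headers and locking is omitted, so treat this as a sketch, not copy-paste code; "mp" is assumed to be the struct mount * for the filesystem under test.

/*
 * Sketch: find candidate vnodes on a mount by walking its vnode list
 * and reporting regular files that still have dirty buffers queued.
 */
static void
find_dirty_vnodes_sketch(struct mount *mp)
{
	struct vnode *vp;

	TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) {
		if (vp->v_type == VREG && vp->v_bufobj.bo_dirty.bv_cnt != 0)
			printf("vnode %p: %d dirty buf(s)\n", vp,
			    vp->v_bufobj.bo_dirty.bv_cnt);
	}
}
)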
print *((struct vnode *)0xfffff80002dc2760)
$6 = {v_tag = 0xffffffff8072b891 "ufs", v_op = 0xffffffff80a13c40,
  v_data = 0xfffff8006a20b160, v_mount = 0xfffff800024e9cc0,
  v_nmntvnodes = {tqe_next = 0xfffff80002dc2588, tqe_prev = 0xfffff80002dc2958},
  v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0},
  v_hashlist = {le_next = 0x0, le_prev = 0xfffffe0000932ef8},
  v_cache_src = {lh_first = 0x0},
  v_cache_dst = {tqh_first = 0xfffff8006a18ce00, tqh_last = 0xfffff8006a18ce20},
  v_cache_dd = 0x0,
  v_lock = {lock_object = {lo_name = 0xffffffff8072b891 "ufs",
      lo_flags = 117112832, lo_data = 0, lo_witness = 0xfffffe0000607280},
    lk_lock = 1, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96},
  v_interlock = {lock_object = {lo_name = 0xffffffff8074a89f "vnode interlock",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0xfffffe00005fd680},
    mtx_lock = 4},
  v_vnlock = 0xfffff80002dc27c8,
  v_actfreelist = {tqe_next = 0xfffff80002dc2938, tqe_prev = 0xfffff80002dc2648},
  v_bufobj = {bo_lock = {lock_object = {lo_name = 0xffffffff80754d34 "bufobj interlock",
        lo_flags = 86179840, lo_data = 0, lo_witness = 0xfffffe0000605700},
      rw_lock = 1},
    bo_ops = 0xffffffff809e97c0, bo_object = 0xfffff80002c1b400,
    bo_synclist = {le_next = 0xfffff80002dc2a08, le_prev = 0xfffff80002dc2688},
    bo_private = 0xfffff80002dc2760, __bo_vnode = 0xfffff80002dc2760,
    bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80002dc2880},
      bv_root = {pt_root = 0}, bv_cnt = 0},
    bo_dirty = {bv_hd = {tqh_first = 0xfffffe00f7ae8658, tqh_last = 0xfffffe00f7ae86a8},
      bv_root = {pt_root = 18446741878841706297}, bv_cnt = 1},
    bo_numoutput = 0, bo_flag = 1, bo_bsize = 16384},
  v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0,
  v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff80002dc28e8},
    rl_currdep = 0x0},
  v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0,
  v_holdcnt = 2, v_usecount = 0, v_iflag = 512, v_vflag = 0,
  v_writecount = 0, v_hash = 18236560, v_type = VREG}

I think what is wanted is the buffers and their dependency lists, but I am not sure where those live under all of this... bo_*?
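(Trying to answer my own question, roughly in C terms rather than kgdb syntax; this is a sketch based on my reading of buf.h and softdep.h, so corrections welcome. bo_dirty.bv_hd is the tail queue of dirty buffers hanging off the vnode's bufobj, and each buffer's b_dep is a list of struct worklist records describing its soft-updates dependencies:

/*
 * Sketch only; the addresses in comments are the ones from the dump
 * above, and wk_type identifies the kind of dependency record.
 */
static void
dump_first_dirty_buf_deps_sketch(struct vnode *vp)	/* vp == 0xfffff80002dc2760 here */
{
	struct buf *bp;
	struct worklist *wk;

	bp = TAILQ_FIRST(&vp->v_bufobj.bo_dirty.bv_hd);	/* 0xfffffe00f7ae8658 above */
	LIST_FOREACH(wk, &bp->b_dep, wk_list)
		printf("dep %p, type %d\n", wk, wk->wk_type);
}
)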
On Wed, Jul 6, 2016 at 8:12 PM, Konstantin Belousov wrote:
> On Wed, Jul 06, 2016 at 02:21:20PM -0400, David Cross wrote:
> > (kgdb) up 5
> > #5  0xffffffff804aafa1 in brelse (bp=0xfffffe00f77457d0) at buf.h:428
> > 428             (*bioops.io_deallocate)(bp);
> > Current language: auto; currently minimal
> > (kgdb) p/x *(struct buf *)0xfffffe00f77457d0
> > $1 = {b_bufobj = 0xfffff80002e88480, b_bcount = 0x4000, b_caller1 = 0x0,
> >   b_data = 0xfffffe00f857b000, b_error = 0x0, b_iocmd = 0x0, b_ioflags = 0x0,
> >   b_iooffset = 0x0, b_resid = 0x0, b_iodone = 0x0, b_blkno = 0x115d6400,
> >   b_offset = 0x0, b_bobufs = {tqe_next = 0x0, tqe_prev = 0xfffff80002e884d0},
> >   b_vflags = 0x0, b_freelist = {tqe_next = 0xfffffe00f7745a28,
> >     tqe_prev = 0xffffffff80c2afc0}, b_qindex = 0x0, b_flags = 0x20402800,
> >   b_xflags = 0x2, b_lock = {lock_object = {lo_name = 0xffffffff8075030b,
> >       lo_flags = 0x6730000, lo_data = 0x0, lo_witness = 0xfffffe0000602f00},
> >     lk_lock = 0xfffff800022e8000, lk_exslpfail = 0x0, lk_timo = 0x0,
> >     lk_pri = 0x60}, b_bufsize = 0x4000, b_runningbufspace = 0x0,
> >   b_kvabase = 0xfffffe00f857b000, b_kvaalloc = 0x0, b_kvasize = 0x4000,
> >   b_lblkno = 0x0, b_vp = 0xfffff80002e883b0, b_dirtyoff = 0x0,
> >   b_dirtyend = 0x0, b_rcred = 0x0, b_wcred = 0x0, b_saveaddr = 0x0,
> >   b_pager = {pg_reqpage = 0x0}, b_cluster = {cluster_head = {tqh_first = 0x0,
> >       tqh_last = 0x0}, cluster_entry = {tqe_next = 0x0, tqe_prev = 0x0}},
> >   b_pages = {0xfffff800b99b30b0, 0xfffff800b99b3118, 0xfffff800b99b3180,
> >     0xfffff800b99b31e8, 0x0 }, b_npages = 0x4,
> >   b_dep = {lh_first = 0xfffff800023d8c00}, b_fsprivate1 = 0x0,
> >   b_fsprivate2 = 0x0, b_fsprivate3 = 0x0, b_pin_count = 0x0}
> >
> > This is the freshly allocated buf that causes the panic; is this what is
> > needed?  I "know" which vnode will cause the panic on vnlru cleanup, but I
> > don't know how to walk the memory list without a 'hook'.. as in, I can
> > set up the kernel in a state that I know will panic when the vnode is
> > cleaned up, I can force a panic 'early' (kill -9 1), and then I could get
> > that vnode.. if I could get the vnode list to walk.
>
> Was the state printed after the panic occurred?  What is strange is that
> the buffer was not even tried for i/o, AFAIS.  Apart from the empty
> b_error/b_iocmd, the b_lblkno is zero, which means that the buffer was
> never allocated on the disk.
>
> The b_blkno looks strangely high.  Can you print *(bp->b_vp)?  If it is
> a UFS vnode, do p *(struct inode)(->v_data).  I am esp. interested
> in the vnode size.
>
> Can you reproduce the problem on HEAD ?
>
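(For anyone following along: the exact expression in the quoted request above was stripped by the list's MIME filter, so the fragment below is only my guess at what is being asked for, in C terms rather than kgdb syntax.

/*
 * Guess at the requested inspection: follow the buffer to its vnode,
 * then to the UFS inode hanging off v_data, and look at the cached
 * file size.  Addresses in comments are the ones from the dump above.
 */
static void
show_inode_size_sketch(struct buf *bp)	/* bp == 0xfffffe00f77457d0 above */
{
	struct vnode *vp = bp->b_vp;	/* 0xfffff80002e883b0 in the dump */
	struct inode *ip = VTOI(vp);	/* i.e. (struct inode *)vp->v_data */

	printf("i_size = %ju\n", (uintmax_t)ip->i_size);
}
)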