Message-Id: <201607181758.u6IHw7h6047690@gw.catspoiler.org>
Date: Mon, 18 Jul 2016 10:58:07 -0700 (PDT)
From: Don Lewis
Subject: Re: zfs solaris assert panic in 11.0-ALPHA5 r302256
To: avg@FreeBSD.org
cc: freebsd-current@FreeBSD.org
In-Reply-To: <831f819b-68bb-5425-a4cf-47a51a136ddf@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

On 18 Jul, Andriy Gapon wrote:
> On 18/07/2016 20:40, Don Lewis wrote:
>> On 18 Jul, Andriy Gapon wrote:
>>> On 08/07/2016 07:13, Don Lewis wrote:
>>>> My package building machine just crashed with this panic during a
>>>> poudriere run:
>>>>
>>>> panic: solaris assert: (dn->dn_phys->dn_nlevels == 0 && db->db_level == 0) || dn->dn_phys->dn_nlevels > db->db_level || dn->dn_next_nlevels[txgoff] > db->db_level || dn->dn_next_nlevels[(tx->tx_txg-1) & TXG_MASK] > db->db_level || dn->dn_next_nlevels[(tx->tx_txg
>>>
>>> Don,
>>>
>>> do you have a crash dump?
>>> It would be interesting to see a pretty-print of dn, dn->dn_phys, db and
>>> tx in the frame where the assert is hit.
>>
>> I do.  Unfortunately kgdb reports that the values of dn and db were
>> optimized out.
>>
>
> Well... You can try to use kgdb7111 from ports, perhaps it would work
> better.  Also, it's often possible to find values of wanted variables by
> finding a relevant value that's not optimized out and then following
> through pointers, etc to get to the right values.  In other cases it's
> possible to get the values by examining the disassembly and values of
> registers.

This is with kgdb from ports:

[snip]
#13 0xffffffff824be23a in assfail (
    a=0x80 , f=0xfffffe085a435d90 "", l=0)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81
#14 0xffffffff8215b928 in dbuf_dirty (db=, tx=)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1232
#15 0xffffffff8215c0e4 in dbuf_dirty (db=, tx=)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1383
#16 0xffffffff8215c0e4 in dbuf_dirty (db=, tx=)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1383
#17 0xffffffff82166fd9 in dmu_write_uio_dnode (dn=, uio=, size=, tx=)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1193
#18 0xffffffff82166e92 in dmu_write_uio_dbuf (zdb=0xfffff806bacc0b88,
    uio=0xfffffe085a4368f0, size=65536, tx=0xfffff803a4043600)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1244
#19 0xffffffff82224bac in zfs_write (vp=, uio=, ioflag=, cr=, ct=)
[snip]

There's not a lot for me to get traction with ...

This is also not a very repeatable thing for me.  I've only had it
happen once even though this machine is kept very busy building ports.
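
If the values of dn and db can eventually be recovered from the dump, the
clauses of the assertion that the panic message shows in full can be checked
by hand with a small model.  The following is only a sketch in Python, not
the kernel code.  The function and argument names are made up for
illustration.  TXG_SIZE = 4 (hence TXG_MASK = 3) and
txgoff = tx_txg & TXG_MASK are assumptions based on the ZFS sources, not
something stated in this thread.  The final clause, which the panic message
cuts off, is deliberately not modelled.

```python
# Assumed constants (from ZFS's txg.h as far as I know; verify against your tree).
TXG_SIZE = 4
TXG_MASK = TXG_SIZE - 1


def visible_clauses(dn_nlevels, dn_next_nlevels, db_level, tx_txg):
    """Evaluate the assertion clauses that the panic message shows in full.

    dn_nlevels      -- stands in for dn->dn_phys->dn_nlevels
    dn_next_nlevels -- stands in for dn->dn_next_nlevels[], TXG_SIZE entries
    db_level        -- stands in for db->db_level
    tx_txg          -- stands in for tx->tx_txg

    The truncated last clause of the panic string is NOT included.
    """
    if len(dn_next_nlevels) != TXG_SIZE:
        raise ValueError("expected %d dn_next_nlevels entries" % TXG_SIZE)
    txgoff = tx_txg & TXG_MASK  # assumption: how txgoff is derived
    return {
        "nlevels == 0 && level == 0": dn_nlevels == 0 and db_level == 0,
        "nlevels > level": dn_nlevels > db_level,
        "next_nlevels[txgoff] > level": dn_next_nlevels[txgoff] > db_level,
        "next_nlevels[(txg-1) & TXG_MASK] > level":
            dn_next_nlevels[(tx_txg - 1) & TXG_MASK] > db_level,
    }


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the crash dump in this thread.
    result = visible_clauses(dn_nlevels=1,
                             dn_next_nlevels=[0, 0, 0, 0],
                             db_level=1,
                             tx_txg=10)
    for clause, holds in result.items():
        print("%-45s %s" % (clause, holds))
```

All visible clauses evaluating to false is consistent with the assert
firing, but because the last clause is missing from the panic string it does
not prove it on its own.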