Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 21 Aug 2015 10:48:50 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        freebsd-fs@FreeBSD.org
Subject:   Re: solaris assert: avl_is_empty(&dn -> dn_dbufs) panic
Message-ID:  <201508211748.t7LHmo96096088@gw.catspoiler.org>
In-Reply-To: <201508211734.t7LHYQ33096035@gw.catspoiler.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 21 Aug, Don Lewis wrote:
> On 21 Aug, Don Lewis wrote:
>> I just started getting this panic:
>> 
>> solaris assert: avl_is_empty(&dn -> dn_dbufs), file:
>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c,
>> line 495
>> 
>> System info:
>> FreeBSD zipper.catspoiler.org 11.0-CURRENT FreeBSD 11.0-CURRENT #25 r286923: Wed Aug 19 09:28:53 PDT 2015     dl@zipper.catspoiler.org:/usr/obj/usr/src/sys/GENERIC  amd64
>> 
>> My zfs pool has one mirrored vdev.  Scrub doesn't find any problems.
>> 
>> %zpool status
>>   pool: zroot
>>  state: ONLINE
>>   scan: scrub repaired 0 in 2h58m with 0 errors on Fri Aug 21 00:44:52 2015
>> config:
>> 
>> 	NAME        STATE     READ WRITE CKSUM
>> 	zroot       ONLINE       0     0     0
>> 	  mirror-0  ONLINE       0     0     0
>> 	    ada0p3  ONLINE       0     0     0
>> 	    ada1p3  ONLINE       0     0     0
>> 
>> This panic is reproduceable and happens every time I use poudriere to
>> build ports using my 9.3-RELEASE amd64 jail and occurs at the end of the
>> poudriere run when it is unmounting filesystems.
>> 
>> [00:10:43] ====>> Stopping 4 builders
>> 93amd64-default-job-01: removed
>> 93amd64-default-job-01-n: removed
>> 93amd64-default-job-02: removed
>> 93amd64-default-job-02-n: removed
>> 93amd64-default-job-03: removed
>> 93amd64-default-job-03-n: removed
>> 93amd64-default-job-04: removed
>> 93amd64-default-job-04-n: removed
>> [00:10:46] ====>> Creating pkgng repository
>> Creating repository in /tmp/packages: 100%
>> Packing files for repository: 100%
>> [00:10:55] ====>> Committing packages to repository
>> [00:10:55] ====>> Removing old packages
>> [00:10:55] ====>> Built ports: devel/py-pymtbl net/sie-nmsg net/p5-Net-Nmsg net/axa
>> [93amd64-default] [2015-08-21_00h47m41s] [committing:] Queued: 4  Built: 4  Failed: 0  Skipped: 0  Ignored: 0  Tobuild: 0   Time: 00:10:53
>> [00:10:55] ====>> Logs: /var/poudriere/data/logs/bulk/93amd64-default/2015-08-21_00h47m41s
>> [00:10:55] ====>> Cleaning up
>> 93amd64-default: removed
>> 93amd64-default-n: removed
>> [00:10:55] ====>> Umounting file systems
>> Write failed: Broken pipe
>> 
>> Prior to that, I ran poudriere a number of times with a 10.2-STABLE
>> amd64 jail without incident.
>> 
>> I've kicked off a bunch of poudriere runs for other jails and
>> will check on it in the morning.
> 
> Died the same way after building ports on the first jail,
> 10.1-RELEASE amd64.
> 
> Since there have been some zfs commits since r286923, I upgraded to
> r286998 this morning and tried again with no better luck.  I got the
> same panic again.
> 
> This machine has mirrored swap, and even though I've done what
> gmirror(8) says to do in order to capture crash dumps, I've had no luck
> with that.  The dump is getting written, but savecore is unable to find
> it.

Not sure what is happening with savecore during boot, but I was able to
run it manually and collect the crash dump.


Unread portion of the kernel message buffer:
panic: solaris assert: avl_is_empty(&dn->dn_dbufs), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c, line: 495
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0859e4e4e0
vpanic() at vpanic+0x189/frame 0xfffffe0859e4e560
panic() at panic+0x43/frame 0xfffffe0859e4e5c0
assfail() at assfail+0x1a/frame 0xfffffe0859e4e5d0
dnode_sync() at dnode_sync+0x6c8/frame 0xfffffe0859e4e6b0
dmu_objset_sync_dnodes() at dmu_objset_sync_dnodes+0x2b/frame 0xfffffe0859e4e6e0
dmu_objset_sync() at dmu_objset_sync+0x29e/frame 0xfffffe0859e4e7b0
dsl_pool_sync() at dsl_pool_sync+0x348/frame 0xfffffe0859e4e820
spa_sync() at spa_sync+0x442/frame 0xfffffe0859e4e910
txg_sync_thread() at txg_sync_thread+0x23d/frame 0xfffffe0859e4e9f0
fork_exit() at fork_exit+0x84/frame 0xfffffe0859e4ea30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0859e4ea30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic



(kgdb) bt
#0  doadump (textdump=0) at pcpu.h:221
#1  0xffffffff8037bb86 in db_fncall (dummy1=<value optimized out>, 
    dummy2=<value optimized out>, dummy3=<value optimized out>, 
    dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:568
#2  0xffffffff8037b941 in db_command (cmd_table=0x0)
    at /usr/src/sys/ddb/db_command.c:440
#3  0xffffffff8037b5d4 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:493
#4  0xffffffff8037e18b in db_trap (type=<value optimized out>, code=0)
    at /usr/src/sys/ddb/db_main.c:251
#5  0xffffffff80a5b294 in kdb_trap (type=3, code=0, tf=<value optimized out>)
    at /usr/src/sys/kern/subr_kdb.c:654
#6  0xffffffff80e6a4b1 in trap (frame=0xfffffe0859e4e410)
    at /usr/src/sys/amd64/amd64/trap.c:540
#7  0xffffffff80e49f22 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:235
#8  0xffffffff80a5a96e in kdb_enter (why=0xffffffff81379010 "panic", 
    msg=0xffffffff80a60b60 "UH\211\ufffdAWAVATSH\203\ufffdPI\211\ufffdA\211\ufffdH\213\004%Py\ufffd\201H\211E\ufffd\201<%\ufffd\210\ufffd\201") at cpufunc.h:63
#9  0xffffffff80a1e2c9 in vpanic (fmt=<value optimized out>, 
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:619
#10 0xffffffff80a1e333 in panic (fmt=0xffffffff81aafa90 "\004")
    at /usr/src/sys/kern/kern_shutdown.c:557
---Type <return> to continue, or q <return> to quit---
#11 0xffffffff8240922a in assfail (a=<value optimized out>, 
    f=<value optimized out>, l=<value optimized out>)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81
#12 0xffffffff820d4f78 in dnode_sync (dn=0xfffff8040b72d3d0, 
    tx=0xfffff8001598ec00)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:495
#13 0xffffffff820c922b in dmu_objset_sync_dnodes (list=0xfffff80007712b90, 
    newlist=<value optimized out>, tx=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1045
#14 0xffffffff820c8ede in dmu_objset_sync (os=0xfffff80007712800, 
    pio=<value optimized out>, tx=0xfffff8001598ec00)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1163
#15 0xffffffff820e8e78 in dsl_pool_sync (dp=0xfffff80015676000, txg=2660975)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:536
#16 0xffffffff8210dca2 in spa_sync (spa=0xfffffe00089c6000, txg=2660975)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:6641
#17 0xffffffff8211843d in txg_sync_thread (arg=0xfffff80015676000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:517
#18 0xffffffff809e47c4 in fork_exit (
    callout=0xffffffff82118200 <txg_sync_thread>, arg=0xfffff80015676000, 
    frame=0xfffffe0859e4ea40) at /usr/src/sys/kern/kern_fork.c:1006
---Type <return> to continue, or q <return> to quit---
#19 0xffffffff80e4a45e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:610
#20 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201508211748.t7LHmo96096088>