Date:      Mon, 24 Dec 2012 16:11:56 -0800
From:      Derek Kulinski <takeda@takeda.tk>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject:   Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
Message-ID:  <574019558.20121224161156@takeda.tk>
In-Reply-To: <50D8E500.1070408@FreeBSD.org>
References:  <1824023197.20121223142308@takeda.tk> <50D87C56.70709@FreeBSD.org> <331959998.20121224101719@takeda.tk> <50D8E500.1070408@FreeBSD.org>

Hello Andriy,

Monday, December 24, 2012, 3:28:00 PM, you wrote:

> I've looked through the cores and it does look like in all cases some sort of
> memory corruption is a precursor to a subsequent crash.

> I can't decidedly say if the corruptions are caused by the hardware, by some
> code overwriting random memory locations ("rogue" driver) or by a "simpler" bug
> like use after free.

> I am always inclined to suspect the hardware first.

> You can try to reproduce the problem with some additional checks enabled in the
> kernel.  Those should catch the problem earlier and thus make its source clearer.

> I recommend the following:
> options         INVARIANTS
> options         INVARIANT_SUPPORT
> options         WITNESS
> options         DEBUG_MEMGUARD
> makeoptions     DEBUG+="-DDEBUG"

> The last is really needed only for the ZFS and OpenSolaris compat code.  It may
> result in some extra noise from unrelated subsystems.
> Perhaps you could just add "#define DEBUG" to
> sys/cddl/contrib/opensolaris/uts/common/sys/debug.h.  I haven't tested this
> approach though.

> Also, please put vm.memguard.desc="arc_buf_hdr_t" into loader.conf.

> Please note that these options will make your system significantly slower.

I recompiled the kernel and am now running with the options you
specified (I enabled DEBUG in the file).

Anyway, even at boot time I started getting the following warnings. Is
this anything to worry about?

Dec 24 16:06:03 chinatsu kernel: Creating and/or trimming log files
Dec 24 16:06:03 chinatsu kernel: lock order reversal:
Dec 24 16:06:03 chinatsu kernel: 1st 0xffffffff80bf5780 pf task mtx (pf task mtx) @ /usr/src/sys/contrib/pf/net/pf.c:3330
Dec 24 16:06:03 chinatsu kernel: .
Dec 24 16:06:03 chinatsu kernel: 2nd 0xfffffe0009211af8 radix node head (radix node head) @ /usr/src/sys/net/route.c:384
Dec 24 16:06:03 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:03 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:03 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:03 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:03 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:03 chinatsu kernel: _rw_rlock() at
Dec 24 16:06:03 chinatsu kernel: Starting syslogd.
Dec 24 16:06:03 chinatsu kernel: _rw_rlock+0x81
Dec 24 16:06:03 chinatsu kernel: rtalloc1_fib() at rtalloc1_fib+0x11c
Dec 24 16:06:03 chinatsu kernel: rtalloc_ign_fib() at rtalloc_ign_fib+0xc5
Dec 24 16:06:03 chinatsu kernel: pf_routable() at pf_routable+0x1fd
Dec 24 16:06:03 chinatsu kernel: pf_test_rule() at pf_test_rule+0x6cf
Dec 24 16:06:03 chinatsu kernel: pf_test() at pf_test+0xf58
Dec 24 16:06:03 chinatsu kernel: pf_check_in() at pf_check_in+0x2b
Dec 24 16:06:03 chinatsu kernel: pfil_run_hooks() at pfil_run_hooks+0xd2
Dec 24 16:06:03 chinatsu kernel: ip_input() at ip_input+0x2dc
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: ether_demux() at ether_demux+0x17d
Dec 24 16:06:03 chinatsu kernel: ether_nh_input() at ether_nh_input+0x209
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: alc_int_task() at alc_int_task+0x2ff
Dec 24 16:06:03 chinatsu kernel: taskqueue_run_locked() at taskqueue_run_locked+0x93
Dec 24 16:06:03 chinatsu kernel: taskqueue_thread_loop() at taskqueue_thread_loop+0x3e
Dec 24 16:06:03 chinatsu kernel: fork_exit() at fork_exit+0x133
Dec 24 16:06:03 chinatsu kernel: fork_trampoline() at fork_trampoline+0xe
Dec 24 16:06:03 chinatsu kernel: --- trap 0, rip = 0, rsp = 0xffffff85fb2ebbb0, rbp = 0 ---
Dec 24 16:06:03 chinatsu kernel: No core dumps found.
Dec 24 16:06:04 chinatsu kernel: lock order reversal:
Dec 24 16:06:04 chinatsu kernel: 1st 0xffffff85b9cb8dd8 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2677
Dec 24 16:06:04 chinatsu kernel: 2nd 0xfffffe00092c5c00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
Dec 24 16:06:04 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:04 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:04 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:04 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:04 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:04 chinatsu kernel: _sx_xlock() at _sx_xlock+0x61
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_acquire() at ufsdirhash_acquire+0x33
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove() at
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove+0x16
Dec 24 16:06:04 chinatsu kernel: ufs_dirremove() at ufs_dirremove+0x1bb
Dec 24 16:06:04 chinatsu kernel: ufs_remove() at ufs_remove+0x92
Dec 24 16:06:04 chinatsu kernel: VOP_REMOVE_APV() at VOP_REMOVE_APV+0xb7
Dec 24 16:06:04 chinatsu kernel: kern_unlinkat() at kern_unlinkat+0x2eb
Dec 24 16:06:04 chinatsu kernel: amd64_syscall() at amd64_syscall+0x30e
Dec 24 16:06:04 chinatsu kernel: Xfast_syscall() at Xfast_syscall+0xf7
Dec 24 16:06:04 chinatsu kernel: --- syscall (10, FreeBSD ELF64, sys_unlink), rip = 0x80090a22c, rsp = 0x7fffffff
Dec 24 16:06:04 chinatsu kernel: ca88, rbp = 0x7fffffffdf20 ---
Dec 24 16:06:04 chinatsu kernel: lock order reversal:
Dec 24 16:06:04 chinatsu kernel: 1st 0xfffffe00266ddbd8 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:849
Dec 24 16:06:04 chinatsu kernel: 2nd 0xfffffe002679a818 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2158
Dec 24 16:06:04 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:04 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:04 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:04 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:04 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:04 chinatsu kernel: __lockmgr_args() at __lockmgr_args+0x10d9
Dec 24 16:06:04 chinatsu kernel: vop_stdlock() at vop_stdlock+0x39
Dec 24 16:06:04 chinatsu kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0xbf
Dec 24 16:06:04 chinatsu kernel: _vn_lock() at _vn_lock+0x47
Dec 24 16:06:04 chinatsu kernel: vget() at vget+0x7b
Dec 24 16:06:04 chinatsu kernel: devfs_allocv() at devfs_allocv+0x13f
Dec 24 16:06:04 chinatsu kernel: devfs_root() at devfs_root+0x4d
Dec 24 16:06:04 chinatsu kernel: vfs_donmount() at vfs_donmount+0xafa
Dec 24 16:06:04 chinatsu kernel: sys_nmount() at sys_nmount+0x66
Dec 24 16:06:04 chinatsu kernel: amd64_syscall() at amd64_syscall+0x30e
Dec 24 16:06:04 chinatsu kernel: Xfast_syscall() at Xfast_syscall+0xf7
Dec 24 16:06:04 chinatsu kernel: --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x800a8d71c, rsp = 0x7fffffffccc8, rbp = 0x801009048 ---
Dec 24 16:06:05 chinatsu named[1387]: starting BIND 9.8.3-P4 -t /var/named -u bind
Dec 24 16:06:05 chinatsu kernel: Starting named.
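
To see at a glance which lock pairs WITNESS is reporting, the output above
can be reduced to its "1st"/"2nd" lines. Below is a minimal Python sketch,
assuming the syslog line format shown above. The function name
lock_order_reversals and the regular expression are illustrative
assumptions, not anything shipped with FreeBSD.

```python
import re

# Assumed line shape (taken from the log above):
#   "... kernel: 1st 0x<addr> <lock name> (<lock type>) @ <file>:<line>"
LOCK_LINE = re.compile(
    r"kernel: (?P<ordinal>1st|2nd) 0x[0-9a-f]+ "
    r"(?P<name>.+?) \(.+?\) @ (?P<where>\S+:\d+)"
)


def lock_order_reversals(lines):
    """Return a list of ((name, where), (name, where)) pairs, one per report.

    Hypothetical helper: it pairs each "1st" line with the next "2nd" line
    and ignores everything else, including backtraces and stray lines.
    """
    reversals = []
    first = None
    for line in lines:
        if "lock order reversal:" in line:
            first = None  # a new report starts; drop any partial state
            continue
        match = LOCK_LINE.search(line)
        if match is None:
            continue
        entry = (match.group("name"), match.group("where"))
        if match.group("ordinal") == "1st":
            first = entry
        elif first is not None:
            reversals.append((first, entry))
            first = None
    return reversals


# A few lines copied from the log above, including the stray "." line.
sample = [
    "Dec 24 16:06:03 chinatsu kernel: lock order reversal:",
    "Dec 24 16:06:03 chinatsu kernel: 1st 0xffffffff80bf5780 pf task mtx (pf task mtx) @ /usr/src/sys/contrib/pf/net/pf.c:3330",
    "Dec 24 16:06:03 chinatsu kernel: .",
    "Dec 24 16:06:03 chinatsu kernel: 2nd 0xfffffe0009211af8 radix node head (radix node head) @ /usr/src/sys/net/route.c:384",
    "Dec 24 16:06:03 chinatsu kernel: KDB: stack backtrace:",
    "Dec 24 16:06:04 chinatsu kernel: lock order reversal:",
    "Dec 24 16:06:04 chinatsu kernel: 1st 0xfffffe00266ddbd8 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:849",
    "Dec 24 16:06:04 chinatsu kernel: 2nd 0xfffffe002679a818 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2158",
]

for first, second in lock_order_reversals(sample):
    print(f"{first[0]} ({first[1]}) -> {second[0]} ({second[1]})")
```

In practice the input would be the lines of /var/log/messages rather than
the hard-coded sample. The sketch only summarizes the reports; it does not
say whether a given reversal is harmless.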



-- 
Best regards,
 Derek                            mailto:takeda@takeda.tk

People say Microsoft paid 14M$ for using the Rolling Stones song 'Start me up' in their commercials. This is wrong. Microsoft paid 14M$ only for a part of the song. For instance, they didn't use the line 'You'll make a grown man cry'.



