Date: Mon, 24 Dec 2012 16:11:56 -0800
From: Derek Kulinski
To: Andriy Gapon
Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject: Re: FreeBSD 9.1-RELEASE crashes almost daily; backtraces always list zfs routines
Message-ID: <574019558.20121224161156@takeda.tk>
In-Reply-To: <50D8E500.1070408@FreeBSD.org>
References: <1824023197.20121223142308@takeda.tk> <50D87C56.70709@FreeBSD.org> <331959998.20121224101719@takeda.tk> <50D8E500.1070408@FreeBSD.org>

Hello Andriy,

Monday, December 24, 2012, 3:28:00 PM, you wrote:

> I've looked through the cores and it does look like in all cases some sort of
> memory corruption is a precursor to a subsequent crash.
> I can't decidedly say if the corruptions are caused by the hardware, by some
> code overwriting random memory locations ("rogue" driver) or by a "simpler" bug
> like use after free.
> I am always inclined to suspect the hardware first.
> You can try to reproduce the problem with some additional checks enabled in the
> kernel. Those should catch the problem earlier and thus make its source clearer.
> I recommend the following:
> options INVARIANTS
> options INVARIANT_SUPPORT
> options WITNESS
> options DEBUG_MEMGUARD
> makeoptions DEBUG+="-DDEBUG"
> The last is really needed only for the ZFS and OpenSolaris compat code. It may
> result in some extra noise from unrelated subsystems.
> Perhaps you could just add "#define DEBUG" to
> sys/cddl/contrib/opensolaris/uts/common/sys/debug.h. I haven't tested this
> approach though.
> Also, please put vm.memguard.desc="arc_buf_hdr_t" into loader.conf.
> Please note that these options will make your system significantly slower.

I recompiled the kernel and am now running with the options you specified
(I enabled DEBUG in the file).

Even at boot time I started getting the following warnings; do they mean
anything?

Dec 24 16:06:03 chinatsu kernel: Creating and/or trimming log files
Dec 24 16:06:03 chinatsu kernel: lock order reversal:
Dec 24 16:06:03 chinatsu kernel: 1st 0xffffffff80bf5780 pf task mtx (pf task mtx) @ /usr/src/sys/contrib/pf/net/pf.c:3330
Dec 24 16:06:03 chinatsu kernel: .
Dec 24 16:06:03 chinatsu kernel: 2nd 0xfffffe0009211af8 radix node head (radix node head) @ /usr/src/sys/net/route.c:384
Dec 24 16:06:03 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:03 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:03 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:03 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:03 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:03 chinatsu kernel: _rw_rlock() at
Dec 24 16:06:03 chinatsu kernel: Starting syslogd.
Dec 24 16:06:03 chinatsu kernel: _rw_rlock+0x81
Dec 24 16:06:03 chinatsu kernel: rtalloc1_fib() at rtalloc1_fib+0x11c
Dec 24 16:06:03 chinatsu kernel: rtalloc_ign_fib() at rtalloc_ign_fib+0xc5
Dec 24 16:06:03 chinatsu kernel: pf_routable() at pf_routable+0x1fd
Dec 24 16:06:03 chinatsu kernel: pf_test_rule() at pf_test_rule+0x6cf
Dec 24 16:06:03 chinatsu kernel: pf_test() at pf_test+0xf58
Dec 24 16:06:03 chinatsu kernel: pf_check_in() at pf_check_in+0x2b
Dec 24 16:06:03 chinatsu kernel: pfil_run_hooks() at pfil_run_hooks+0xd2
Dec 24 16:06:03 chinatsu kernel: ip_input() at ip_input+0x2dc
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: ether_demux() at ether_demux+0x17d
Dec 24 16:06:03 chinatsu kernel: ether_nh_input() at ether_nh_input+0x209
Dec 24 16:06:03 chinatsu kernel: netisr_dispatch_src() at netisr_dispatch_src+0x170
Dec 24 16:06:03 chinatsu kernel: alc_int_task() at alc_int_task+0x2ff
Dec 24 16:06:03 chinatsu kernel: taskqueue_run_locked() at taskqueue_run_locked+0x93
Dec 24 16:06:03 chinatsu kernel: taskqueue_thread_loop() at taskqueue_thread_loop+0x3e
Dec 24 16:06:03 chinatsu kernel: fork_exit() at fork_exit+0x133
Dec 24 16:06:03 chinatsu kernel: fork_trampoline() at fork_trampoline+0xe
Dec 24 16:06:03 chinatsu kernel: --- trap 0, rip = 0, rsp = 0xffffff85fb2ebbb0, rbp = 0 ---
Dec 24 16:06:03 chinatsu kernel: No core dumps found.
Dec 24 16:06:04 chinatsu kernel: lock order reversal:
Dec 24 16:06:04 chinatsu kernel: 1st 0xffffff85b9cb8dd8 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2677
Dec 24 16:06:04 chinatsu kernel: 2nd 0xfffffe00092c5c00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
Dec 24 16:06:04 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:04 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:04 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:04 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:04 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:04 chinatsu kernel: _sx_xlock() at _sx_xlock+0x61
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_acquire() at ufsdirhash_acquire+0x33
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove() at
Dec 24 16:06:04 chinatsu kernel: ufsdirhash_remove+0x16
Dec 24 16:06:04 chinatsu kernel: ufs_dirremove() at ufs_dirremove+0x1bb
Dec 24 16:06:04 chinatsu kernel: ufs_remove() at ufs_remove+0x92
Dec 24 16:06:04 chinatsu kernel: VOP_REMOVE_APV() at VOP_REMOVE_APV+0xb7
Dec 24 16:06:04 chinatsu kernel: kern_unlinkat() at kern_unlinkat+0x2eb
Dec 24 16:06:04 chinatsu kernel: amd64_syscall() at amd64_syscall+0x30e
Dec 24 16:06:04 chinatsu kernel: Xfast_syscall() at Xfast_syscall+0xf7
Dec 24 16:06:04 chinatsu kernel: --- syscall (10, FreeBSD ELF64, sys_unlink), rip = 0x80090a22c, rsp = 0x7fffffff
Dec 24 16:06:04 chinatsu kernel: ca88, rbp = 0x7fffffffdf20 ---
Dec 24 16:06:04 chinatsu kernel: lock order reversal:
Dec 24 16:06:04 chinatsu kernel: 1st 0xfffffe00266ddbd8 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:849
Dec 24 16:06:04 chinatsu kernel: 2nd 0xfffffe002679a818 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2158
Dec 24 16:06:04 chinatsu kernel: KDB: stack backtrace:
Dec 24 16:06:04 chinatsu kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Dec 24 16:06:04 chinatsu kernel: kdb_backtrace() at kdb_backtrace+0x37
Dec 24 16:06:04 chinatsu kernel: _witness_debugger() at _witness_debugger+0x2c
Dec 24 16:06:04 chinatsu kernel: witness_checkorder() at witness_checkorder+0x844
Dec 24 16:06:04 chinatsu kernel: __lockmgr_args() at __lockmgr_args+0x10d9
Dec 24 16:06:04 chinatsu kernel: vop_stdlock() at vop_stdlock+0x39
Dec 24 16:06:04 chinatsu kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0xbf
Dec 24 16:06:04 chinatsu kernel: _vn_lock() at _vn_lock+0x47
Dec 24 16:06:04 chinatsu kernel: vget() at vget+0x7b
Dec 24 16:06:04 chinatsu kernel: devfs_allocv() at devfs_allocv+0x13f
Dec 24 16:06:04 chinatsu kernel: devfs_root() at devfs_root+0x4d
Dec 24 16:06:04 chinatsu kernel: vfs_donmount() at vfs_donmount+0xafa
Dec 24 16:06:04 chinatsu kernel: sys_nmount() at sys_nmount+0x66
Dec 24 16:06:04 chinatsu kernel: amd64_syscall() at amd64_syscall+0x30e
Dec 24 16:06:04 chinatsu kernel: Xfast_syscall() at Xfast_syscall+0xf7
Dec 24 16:06:04 chinatsu kernel: --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x800a8d71c, rsp = 0x7fffffffccc8, rbp = 0x801009048 ---
Dec 24 16:06:05 chinatsu named[1387]: starting BIND 9.8.3-P4 -t /var/named -u bind
Dec 24 16:06:05 chinatsu kernel: Starting named.

-- 
Best regards,
 Derek                          mailto:takeda@takeda.tk

People say Microsoft paid 14M$ for using the Rolling Stones song 'Start
me up' in their commercials. This is wrong. Microsoft paid 14M$ only for
a part of the song. For instance, they didn't use the line 'You'll make
a grown man cry'.
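
For anyone skimming the log above: the WITNESS output is easier to read once each "lock order reversal" report is reduced to its 1st/2nd lock pair. The following is only a rough sketch of how that could be done, assuming the syslog line layout shown in this message. The names LOCK_LINE, SAMPLE and lock_order_reversals are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Matches the "1st"/"2nd" lines WITNESS prints for a lock order reversal, e.g.
#   ... kernel: 1st 0x... pf task mtx (pf task mtx) @ /usr/src/.../pf.c:3330
# The layout is assumed from the log in this message only.
LOCK_LINE = re.compile(
    r"kernel:\s+(1st|2nd)\s+0x[0-9a-f]+\s+(.+?)\s+\((.+?)\)\s+@\s+(\S+):(\d+)"
)

# The six lock lines from the boot log above, including the stray "." line
# that syslog interleaved between the first pair.
SAMPLE = [
    "Dec 24 16:06:03 chinatsu kernel: 1st 0xffffffff80bf5780 pf task mtx (pf task mtx) @ /usr/src/sys/contrib/pf/net/pf.c:3330",
    "Dec 24 16:06:03 chinatsu kernel: .",
    "Dec 24 16:06:03 chinatsu kernel: 2nd 0xfffffe0009211af8 radix node head (radix node head) @ /usr/src/sys/net/route.c:384",
    "Dec 24 16:06:04 chinatsu kernel: 1st 0xffffff85b9cb8dd8 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2677",
    "Dec 24 16:06:04 chinatsu kernel: 2nd 0xfffffe00092c5c00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284",
    "Dec 24 16:06:04 chinatsu kernel: 1st 0xfffffe00266ddbd8 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:849",
    "Dec 24 16:06:04 chinatsu kernel: 2nd 0xfffffe002679a818 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2158",
]


def lock_order_reversals(lines):
    """Return a list of ((name, file, line), (name, file, line)) lock pairs."""
    pairs = []
    first = None
    for line in lines:
        match = LOCK_LINE.search(line)
        if not match:
            continue  # backtrace frames and unrelated messages are skipped
        order, name, _lock_type, path, lineno = match.groups()
        entry = (name, path, int(lineno))
        if order == "1st":
            first = entry
        elif first is not None:
            pairs.append((first, entry))
            first = None
    return pairs


if __name__ == "__main__":
    for (name1, file1, line1), (name2, file2, line2) in lock_order_reversals(SAMPLE):
        print(f"{name1} ({file1}:{line1}) -> {name2} ({file2}:{line2})")
```

Fed the full /var/log/messages instead of SAMPLE, this would list one pair per report, which makes it easier to see whether a new reversal involves the same locks as an earlier one.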