From: John Baldwin
To: freebsd-hackers@freebsd.org
Cc: Wojciech Puchar
Subject: Re: freebsd crash under I/O - got error messages part 4
Date: Tue, 16 Dec 2014 11:54:47 -0500
Message-Id: <201412161154.47153.jhb@freebsd.org>

On Sunday, December 07, 2014 3:57:25 pm Wojciech Puchar wrote:
> even without UFS_DIRHASH there are still errors under heavier I/O load.
>
> lock order reversal:
> 1st 0xfffff800182c9068 ufs (ufs) @ kern/vfs_subr.c:2137
> 2nd 0xfffffe00612443d0 bufwait (bufwait) @ ufs/ffs/ffs_vnops.c:262
> 3rd 0xfffff8004a877418 ufs (ufs) @ kern/vfs_subr.c:2137
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe007870ae20
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe007870aed0
> witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe007870af60
> __lockmgr_args() at __lockmgr_args+0x9ea/frame 0xfffffe007870b0a0
> ffs_lock() at ffs_lock+0x84/frame 0xfffffe007870b0f0
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xd9/frame 0xfffffe007870b120
> _vn_lock() at _vn_lock+0xaa/frame 0xfffffe007870b190
> vget() at vget+0x67/frame 0xfffffe007870b1d0
> vfs_hash_get() at vfs_hash_get+0xe1/frame 0xfffffe007870b220
> ffs_vgetf() at ffs_vgetf+0x40/frame 0xfffffe007870b2b0
> softdep_sync_buf() at softdep_sync_buf+0x3b3/frame 0xfffffe007870b390
> ffs_syncvnode() at ffs_syncvnode+0x286/frame 0xfffffe007870b410
> ffs_truncate() at ffs_truncate+0x614/frame 0xfffffe007870b600
> ufs_direnter() at ufs_direnter+0x722/frame 0xfffffe007870b6c0
> ufs_mkdir() at ufs_mkdir+0x4d0/frame 0xfffffe007870b850
> VOP_MKDIR_APV() at VOP_MKDIR_APV+0xd1/frame 0xfffffe007870b880
> kern_mkdirat() at kern_mkdirat+0x1be/frame 0xfffffe007870baa0
> amd64_syscall() at amd64_syscall+0x216/frame 0xfffffe007870bbb0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe007870bbb0
> --- syscall (136, FreeBSD ELF64, sys_mkdir), rip = 0x800c0e93a, rsp = 0x7fffffffd3f8, rbp = 0x7fffffffd520 ---

The LORs are known false positives (certainly the dirhash one is and is
documented as such in the source).  They are unrelated to the hard I/O errors
you had in your first mail.  I'm not sure how to track those down further
however. :(

-- 
John Baldwin