Date: Tue, 14 Jan 2014 17:36:42 +0400
From: Slawa Olhovchenkov
To: freebsd-fs@freebsd.org
Subject: zfs locking
Message-ID: <20140114133642.GM16734@zxy.spb.ru>

I have some kernel traps inside ZFS.

Jan 14 00:13:05 srv3 kernel: Fatal trap 12: page fault while in kernel mode
Jan 14 00:13:05 srv3 kernel: cpuid = 15; apic id = 2e
Jan 14 00:13:05 srv3 kernel: fault virtual address = 0x14
Jan 14 00:13:05 srv3 kernel: fault code = supervisor read data, page not present
Jan 14 00:13:05 srv3 kernel: instruction pointer = 0x20:0xffffffff80de2dd1
Jan 14 00:13:05 srv3 kernel: stack pointer = 0x28:0xfffffe104ac45460
Jan 14 00:16:04 srv3 syslogd: kernel boot file is /boot/kernel/kernel
Jan 14 00:16:04 srv3 kernel: current process = 13233 (aiod22)
Jan 14 00:16:04 srv3 kernel: trap number = 12
Jan 14 00:16:04 srv3 kernel: panic: page fault
Jan 14 00:16:04 srv3 kernel: cpuid = 3
Jan 14 00:16:04 srv3 kernel: KDB: stack backtrace:
Jan 14 00:16:04 srv3 kernel: #0 0xffffffff80523b50 at kdb_backtrace+0x60
Jan 14 00:16:04 srv3 kernel: #1 0xffffffff804edfa5 at panic+0x155
Jan 14 00:16:04 srv3 kernel: #2 0xffffffff806e9a42 at trap_fatal+0x3a2
Jan 14 00:16:04 srv3 kernel: #3 0xffffffff806e9d19 at trap_pfault+0x2c9
Jan 14 00:16:04 srv3 kernel: #4 0xffffffff806e94a6 at trap+0x5e6
Jan 14 00:16:04 srv3 kernel: #5 0xffffffff806d07d2 at calltrap+0x8
Jan 14 00:16:04 srv3 kernel: #6 0xffffffff80dea2f6 at dbuf_read+0x656
Jan 14 00:16:04 srv3 kernel: #7 0xffffffff80df12ff at dmu_buf_hold_array_by_dnode+0x1cf
Jan 14 00:16:04 srv3 kernel: #8 0xffffffff80df21d6 at dmu_read_uio+0x66
Jan 14 00:16:04 srv3 kernel: #9 0xffffffff80e79107 at zfs_freebsd_read+0x357
Jan 14 00:16:04 srv3 kernel: #10 0xffffffff80784a12 at VOP_READ_APV+0x92
Jan 14 00:16:04 srv3 kernel: #11 0xffffffff80596266 at vn_read+0x166
Jan 14 00:16:04 srv3 kernel: #12 0xffffffff80592beb at vn_io_fault+0x15b
Jan 14 00:16:04 srv3 kernel: #13 0xffffffff8104f387 at aio_daemon+0x387
Jan 14 00:16:04 srv3 kernel: #14 0xffffffff804c01ea at fork_exit+0x9a
Jan 14 00:16:04 srv3 kernel: #15 0xffffffff806d0d0e at fork_trampoline+0xe
Jan 14 00:16:04 srv3 kernel: Uptime: 2d9h6m7s

The faulting instruction pointer, 0x20:0xffffffff80de2dd1, is inside arc_read:

(kgdb) x/30i 0xffffffff80de2dc0
0xffffffff80de2dc0:	add    %ch,%al
0xffffffff80de2dc2:	pop    %rdx
0xffffffff80de2dc3:	(bad)
0xffffffff80de2dc5:	add    %cl,-0x77(%rax)
0xffffffff80de2dc8:	retq
0xffffffff80de2dc9:	mov    0xc0(%r12),%rax
0xffffffff80de2dd1:	movslq 0x14(%rax),%rsi
0xffffffff80de2dd5:	mov    $0xffffffff80ee7c80,%rdi
0xffffffff80de2ddc:	callq  0xffffffff806cfa60
0xffffffff80de2de1:	mov    0x18(%rbp),%rax
0xffffffff80de2de5:	testb  $0x4,(%rax)
0xffffffff80de2de8:	je     0xffffffff80de2e07
0xffffffff80de2dea:	mov    %rbx,%rdi
0xffffffff80de2ded:	callq  0xffffffff80e520e0
0xffffffff80de2df2:	xor    %r15d,%r15d
0xffffffff80de2df5:	mov    %r15d,%eax
0xffffffff80de2df8:	add    $0x78,%rsp
0xffffffff80de2dfc:	pop    %rbx
0xffffffff80de2dfd:	pop    %r12
0xffffffff80de2dff:	pop    %r13
0xffffffff80de2e01:	pop    %r14
0xffffffff80de2e03:	pop    %r15
0xffffffff80de2e05:	pop    %rbp
0xffffffff80de2e06:	retq

This corresponds to the following code in arc_read:

		DTRACE_PROBE2(l2arc__read, vdev_t *, vd, zio_t *, rzio);
		ARCSTAT_INCR(arcstat_l2_read_bytes,	// arc_read+2137
		    hdr->b_l2hdr->b_asize);

		if (*arc_flags & ARC_NOWAIT) {
			zio_nowait(rzio);
			return (0);
		}

		ASSERT(*arc_flags & ARC_WAIT);
		if (zio_wait(rzio) == 0)
			return (0);

		/* l2arc read error; goto zio_read() */

Is this a locking issue?
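
One observation that may help narrow this down: the fault virtual address (0x14) equals the displacement in the faulting instruction, movslq 0x14(%rax),%rsi. That suggests %rax, presumably hdr->b_l2hdr, was NULL when b_asize was read. The sketch below checks that reading under an assumed layout. The struct is a hypothetical stand-in, not copied from this kernel's source; the field order, every field name other than b_asize, and the LP64 (amd64) sizes are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the L2ARC buffer header. Field names other
 * than b_asize, and the field order, are assumptions for illustration.
 */
struct l2hdr_sketch {
	void		*b_dev;		/* assumed: device pointer, 8 bytes */
	uint64_t	b_daddr;	/* assumed: disk address, 8 bytes */
	int		b_compress;	/* assumed: compression type, 4 bytes */
	int		b_asize;	/* field read by the faulting code */
};

/*
 * Address the CPU would touch when reading base->b_asize.
 * With base == 0 (a NULL header pointer) this is just the field offset.
 */
uintptr_t
b_asize_access_addr(uintptr_t base)
{
	return (base + offsetof(struct l2hdr_sketch, b_asize));
}

/* On an LP64 platform the NULL case lands on 0x14, the reported address. */
void
check_null_case(void)
{
	assert(b_asize_access_addr(0) == 0x14);
}
```

If the assumed layout is right, the panic is a NULL dereference of b_l2hdr rather than a wild pointer. That would be consistent with b_l2hdr being cleared by another thread between the decision to take the L2ARC path and this read, but the sketch alone does not show which lock, if any, is missing.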