From: Andriy Gapon
To: freebsd-bugs@FreeBSD.org
Subject: Re: kern/176636: Periodical crashes with 9.1-R
Date: Sun, 10 Mar 2013 11:10:01 GMT
Message-Id: <201303101110.r2ABA1Lc013173@freefall.freebsd.org>

The following reply was made to PR kern/176636; it has been noted by GNATS.

From: Andriy Gapon
To: Rasmus Skaarup
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/176636: Periodical crashes with 9.1-R
Date: Sun, 10 Mar 2013 13:01:32 +0200

on 07/03/2013 07:00 Rasmus Skaarup said the following:
>
> This is the only kind of panic I get - after your patch:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address = 0x60
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff8162e4f0
> stack pointer = 0x28:0xffffff81624726e0
> frame pointer = 0x28:0xffffff81624727d0
> code segment = base 0x0, limit 0xfffff, type 0x1b
>              = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 26068 (zpool)
> trap number = 12
> panic: page fault
> cpuid = 1
> KDB: stack backtrace:
> #0 0xffffffff809208a6 at kdb_backtrace+0x66
> #1 0xffffffff808ea8be at panic+0x1ce
> #2 0xffffffff80bd8240 at trap_fatal+0x290
> #3 0xffffffff80bd857d at trap_pfault+0x1ed
> #4 0xffffffff80bd8b9e at trap+0x3ce
> #5 0xffffffff80bc315f at calltrap+0x8
> #6 0xffffffff81673975 at sa_handle_get_from_db+0x95
> #7 0xffffffff81673a38 at sa_handle_get+0x48
> #8 0xffffffff8169f516 at zfs_grab_sa_handle+0x96
> #9 0xffffffff8169faca at zfs_obj_to_path+0x6a
> #10 0xffffffff816b8c75 at zfs_ioc_obj_to_path+0x75
> #11 0xffffffff816bad46 at zfsdev_ioctl+0xe6
> #12 0xffffffff807db28b at devfs_ioctl_f+0x7b
> #13 0xffffffff80932325 at kern_ioctl+0x115
> #14 0xffffffff8093255d at sys_ioctl+0xfd
> #15 0xffffffff80bd7ae6 at amd64_syscall+0x546
> #16 0xffffffff80bc3447 at Xfast_syscall+0xf7

It is possible that while the memory corruption was occurring (either
because of the bug for which I sent you the patch or for some other
reason), some bad / corrupted ZFS metadata was written to stable storage.
That corrupted data could now be causing further panics.
It would be interesting to re-create a pool from scratch and see how that
behaves.
If you do that, please use the patch.

-- 
Andriy Gapon
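
The suggestion above to re-create a pool from scratch could be carried out roughly as follows. This is only a sketch, not part of the original reply: the function name `recreate_pool`, the `RUN` variable, and the pool and device names in the usage line are all hypothetical, and the reporter's actual pool layout is not given in this message. The function defaults to a dry run that only prints the `zpool` commands, because destroying a pool erases its data; back up anything needed first, and boot a kernel with the patch applied before running it for real.

```shell
#!/bin/sh
# Sketch: re-create a ZFS pool from scratch to check whether on-disk
# corruption is behind the panics. All names here are hypothetical.
#
# RUN is an assumption of this sketch: when unset, it defaults to "echo",
# so the commands are only printed. Set RUN to the empty string (RUN=)
# to execute them for real.

recreate_pool() {
    pool="$1"; shift
    # Remaining arguments are the vdev specification for the new pool,
    # for example: mirror ada1 ada2
    ${RUN-echo} zpool destroy "$pool"
    ${RUN-echo} zpool create "$pool" "$@"
    # Report pool health and any errors found on the fresh pool.
    ${RUN-echo} zpool status -v "$pool"
}
```

For example, `recreate_pool tank mirror ada1 ada2` prints the three commands without touching any disk; the same call with `RUN=` set in the environment would actually run them.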