From: Niobos
Date: Tue, 11 Sep 2012 15:44:41 +0200
To: freebsd-stable@freebsd.org
Subject: Kernel panic with geom_multipath + ZFS
Message-ID: <504F4049.9080801@dest-unreach.be>
List-Id: Production branch of FreeBSD source code

Hi,

I'm under the illusion that I've found a bug in the FreeBSD kernel, but since I'm new to FreeBSD, a quiet voice tells me it's probably a case of "you're doing it wrong". Also, I'm not sure whether this is the right place to report it, so feel free to redirect me.

I'll start with some context:

* FreeBSD storage.[...] 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan 3 07:46:30 UTC 2012 root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
* There are 5 expansion units attached via SAS, daisy-chained. Each unit has 12 disks, totalling 60 disks. To provide path redundancy, the units are connected HBA-1-2-3-4-5 and HBA-5-4-3-2-1.
* I've configured a ZFS pool on top, with 6 RAID-Z2 arrays of 8+2 disks each.

This setup should be able to survive a disk failure. However, manually ejecting one of the disks causes a kernel panic. I've manually OCR'd it below. The panic is not triggered by the ejection itself; I can see that in the kernel log a few seconds after the ejection. I think the panic is triggered by access to the (now ejected) disk.

> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff807ced68
> stack pointer = 0x28:0xffffff80002ecb70
> frame pointer = 0x28:0xffffff80002ecbc0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 13 (g_down)
> trap number = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff808680fe at kdb_backtrace+0x5e
> #1 0xffffffff80832cb7 at panic+0x184
> #2 0xffffffff80b18400 at trap_fatal+0x290
> #3 0xffffffff80b18749 at trap_pfault+0x1f9
> #4 0xffffffff80b18c0f at trap+0x3df
> #5 0xffffffff80b0313f at calltrap+0x8
> #6 0xffffffff80g3f874 at g_io_schedule_down+0x1d4
> #7 0xffffffff807cfb7c at g_down_procbody+0x5c
> #8 0xffffffff8080682f at fork_exit+0x11f
> #9 0xffffffff80b0366e at fork_trampoline+0xe
> Uptime: 7m16s
> Automatic reboot in 15 seconds - press a key on the console to abort

So the question is either "what am I doing wrong?" or "can anyone confirm this is a bug?"

Thanks in advance,
Niels

PS: I'm trying to post via email and read via nntp://gmane; I'm not sure how well this works.
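
For readers checking the layout described in this message, the disk counts can be verified with a little arithmetic. The following is only a minimal Python sketch: the constant and function names are illustrative assumptions and do not come from FreeBSD, GEOM, or ZFS. It confirms that 5 units of 12 disks match 6 RAID-Z2 arrays of 8+2 disks, and encodes the general RAID-Z2 property that an array tolerates up to two failed disks.

```python
# Illustrative arithmetic only; all names here are hypothetical,
# not part of any FreeBSD, GEOM, or ZFS interface.
UNITS = 5                    # SAS expansion units, daisy-chained
DISKS_PER_UNIT = 12
VDEVS = 6                    # RAID-Z2 arrays in the pool
DATA_DISKS_PER_VDEV = 8
PARITY_DISKS_PER_VDEV = 2    # RAID-Z2 keeps two disks' worth of parity per array

total_disks = UNITS * DISKS_PER_UNIT
disks_per_vdev = DATA_DISKS_PER_VDEV + PARITY_DISKS_PER_VDEV
disks_in_pool = VDEVS * disks_per_vdev


def vdev_survives(failed_disks: int) -> bool:
    """Return True if one RAID-Z2 array stays available with this many failed disks."""
    return failed_disks <= PARITY_DISKS_PER_VDEV


if __name__ == "__main__":
    print("disks attached:", total_disks)
    print("disks used by pool:", disks_in_pool)
    print("array survives one ejected disk:", vdev_survives(1))
```

Under these numbers every attached disk is used by the pool, and ejecting a single disk stays well within what one array should tolerate, which is consistent with the expectation stated in the message that the setup should survive the ejection.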