Message-ID: <4B0C1A72.3000301@comcast.net>
Date: Tue, 24 Nov 2009 12:40:02 -0500
From: Steve Polyack
To: freebsd-hardware@freebsd.org, freebsd-stable, freebsd-geom@FreeBSD.org
Subject: Panic possibly related to glabel/geom and siis(4)

I have a system running 8.0-PRERELEASE with multiple drives and SATA port multipliers (siis controllers and PMPs). All of the attached drives are labeled via glabel(8) and then added to a ZFS pool. While testing how the system reacts to a dead drive (simulated by physically removing a drive during operation), I was able to trigger a panic.

Now, I know that the SATA PMP and siis(4) code to handle and recover from device errors is incomplete, but I believe the crash may be specific to using glabel'd drives. Basically, after removing a drive while the zpool is in use and issuing 'camcontrol reset' and 'rescan' on the appropriate bus, the physical device associated with the drive disappears. In this case:

(pass5:siisch7:0:15:0): lost device
(pass5:siisch7:0:15:0): removing device entry
(ada2:siisch7:0:0:0): lost device

and /dev/ada2 disappears. However, the associated glabel /dev/label/bigdisk07 remains. Since my ZFS pool is built on the drive glabels, I believe this is why ZFS never notices the drives disappear either. Do glabels typically go away after a physical device is lost? Should this not be the case?

After some runtime with the physical device missing, the kernel panics:

(ada2:siisch7:0:0:0): Synchronize cache failed
(ada2:siisch7:0:0:0): removing device entry

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 14
fault virtual address   = 0x48
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff8035f375
stack pointer           = 0x28:0xffffff800006db60
frame pointer           = 0x28:0xffffff800006db70
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2 (g_event)
[thread pid 2 tid 100014 ]
Stopped at      _mtx_lock_flags+0x15:   lock cmpxchgq   %rsi,0x18(%rdi)
db> bt
Tracing pid 2 tid 100014 td 0xffffff00014d4ab0
_mtx_lock_flags() at _mtx_lock_flags+0x15
vdev_geom_release() at vdev_geom_release+0x33
vdev_geom_orphan() at vdev_geom_orphan+0x15c
g_run_events() at g_run_events+0x104
g_event_procbody() at g_event_procbody+0x55
fork_exit() at fork_exit+0x118
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800006dd30, rbp = 0 ---

I'm open to trying patches and other suggestions.

Thanks.