From: Matthew Jacob
Organization: Feral Software
Date: Wed, 03 Aug 2011 07:14:30 -0700
To: freebsd-geom@freebsd.org
Subject: Re: Poor interaction between gmultipath(8), ZFS and isp(4)
Message-ID: <4E3957C6.1000406@feral.com>
In-Reply-To: <4E394269.3090208@darkbsd.org>

Known problem. Or rather, one of a long set of known problems. Most of
these were addressed at Panasas under RELENG_7, but I have not had the
time to redo them for RELENG_8 and later. Nor was I really happy with a
lot of the results.
At least from my perspective, due to work commitments, I'm unlikely to
get to this very soon. Regrets.

On 8/3/2011 5:43 AM, Stephane LAPIE wrote:
> Hello list,
>
> (Not 100% sure the bug is in GEOM_MULTIPATH or in another driver.)
>
> I am running a FreeBSD 8.2-RELEASE server with ZFSv15, with the
> following hardware :
>
> http://www.darkbsd.org/~darksoul/server_dmesg.txt
>
> I have a dual fibre-channel controller (isp(4) driver), and I am
> accessing 16 RAID0 logical drives on a Promise vTrak E630fD (1 volume /
> physical disk)
>
> Since both controllers are plugged to the same storage unit with no LUN
> masking, both controllers end up seeing the same devices. Which is what
> made me combine these devices using geom_multipath.
>
> Here is my zpool structure :
> config:
>
>         NAME                  STATE     READ WRITE CKSUM
>         data                  ONLINE       0     0     0
>           raidz1              ONLINE       0     0     0
>             multipath/disk0   ONLINE       0     0     0
>             multipath/disk1   ONLINE       0     0     0
>             multipath/disk2   ONLINE       0     0     0
>             multipath/disk3   ONLINE       0     0     0
>             multipath/disk4   ONLINE       0     0     0
>             multipath/disk5   ONLINE       0     0     0
>             multipath/disk6   ONLINE       0     0     0
>             multipath/disk7   ONLINE       0     0     0
>           raidz1              ONLINE       0     0     0
>             multipath/disk8   ONLINE       0     0     0
>             multipath/disk9   ONLINE       0     0     0
>             multipath/disk10  ONLINE       0     0     0
>             multipath/disk11  ONLINE       0     0     0
>             multipath/disk12  ONLINE       0     0     0
>             multipath/disk13  ONLINE       0     0     0
>             multipath/disk14  ONLINE       0     0     0
>             multipath/disk15  ONLINE       0     0     0
>
> errors: No known data errors
>
>
> Using gmultipath, I eventually want to have disk{1,3,5,7,9,11,13,15} use
> the second controller, while the rest uses the first. The idea was that
> if anyone removed the fiber, it would switch everything over to the
> remaining fiber.
>
> For the sake of testing, I put every multipath device on the same
> controller, isp1.
>
> Here is the kernel log fragment I could acquire from my test (removing a
> fiber on which transfers are actively running), however since I don't
> have serial console access, I couldn't acquire the relevant kernel panic
> trace (it simply mentions a kernel trap during a page fault in g_mp_kt
> in the last readable section displayed, but I reckon it's like every CPU
> raises the panic message)
>
> http://www.darkbsd.org/~darksoul/server_lastlog_before_kernelpanic.txt
>
> After that, I get the aforementioned kernel panic. I can consistently
> reproduce it, and will try to acquire serial console output to get more
> detailed kernel panic trace, but it feels like everything is occurring at
> the same time without proper locking, or confirming relevant structures
> are still allocated. This looks like a race condition between isp(4)
> loopdown provoking da(4) destruction, and gmultipath(8) failover.
> (Therefore having g_mp_kt accessing a da(4) structure that is being
> destroyed, or already destroyed, and accessing unallocated memory)
>
> Maybe this is similar to this issue :
> http://freebsd.1045724.n5.nabble.com/Kernel-panic-with-gmultipath-td4204700.html
>
>
> Could this be tuned so that :
> 1) initially, on isp(4) loopdown -> da(4) devices depending on it return
> SCSI errors, provoking clean failover of gmultipath
> 2) afterwards, on isp(4) timeout -> da(4) devices are destroyed
>
> Is this a case for using the following boot hints ?
> - "hint.isp.0.loop_down_limit" and "hint.isp.0.gone_device_time" (though
> I am not quite sure what the difference is between the two ... Which one
> does the actual deallocation of underlying devices ?)
>
> Thanks in advance for your time,
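
For illustration only, the two-phase ordering asked for in the quoted message (errors and failover first, device destruction later) can be sketched as a toy model. This is not how isp(4), da(4) or gmultipath(8) are implemented; the class, the method names and the device names da0/da16 are all hypothetical, and the model only shows why destroying a device after failover never touches the active path.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class PathState(Enum):
    ACTIVE = auto()     # path is usable
    FAILED = auto()     # loop is down; I/O on this path returns an error
    DESTROYED = auto()  # underlying device has been torn down


@dataclass
class Multipath:
    """Toy stand-in for one multipath device (hypothetical model)."""
    paths: dict = field(default_factory=dict)  # path name -> PathState
    active: str = ""

    def add_path(self, name):
        self.paths[name] = PathState.ACTIVE
        if not self.active:
            self.active = name

    def loop_down(self, name):
        # Phase 1: the path reports errors while the device still exists,
        # so failover can happen against valid structures.
        self.paths[name] = PathState.FAILED
        if self.active == name:
            self.fail_over()

    def device_timeout(self, name):
        # Phase 2: the device is destroyed only after failover is done.
        if self.active == name:
            raise RuntimeError("destroying a device that is still the active path")
        self.paths[name] = PathState.DESTROYED

    def fail_over(self):
        # Pick the first path that is still usable.
        for name, state in self.paths.items():
            if state is PathState.ACTIVE:
                self.active = name
                return
        raise RuntimeError("no usable path left")


def simulate():
    mp = Multipath()
    mp.add_path("da0")    # hypothetical: LUN as seen through the first controller
    mp.add_path("da16")   # hypothetical: same LUN through the second controller
    mp.loop_down("da0")       # fiber pulled: errors first, failover follows
    mp.device_timeout("da0")  # later: the dead path's device goes away
    return mp.active, mp.paths["da0"], mp.paths["da16"]
```

Calling simulate() walks through a fiber pull on the first path. If device_timeout() ran before loop_down(), the model would raise instead, which mirrors the suspected race in the report.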