From: Stephane LAPIE
Date: Wed, 03 Aug 2011 21:43:21 +0900
To: freebsd-geom@freebsd.org
Subject: Poor interaction between gmultipath(8), ZFS and isp(4)
Message-ID: <4E394269.3090208@darkbsd.org>

Hello list,

(I am not 100% sure whether the bug is in GEOM_MULTIPATH or in another driver.)

I am running a FreeBSD 8.2-RELEASE server with ZFSv15 on the following hardware:
http://www.darkbsd.org/~darksoul/server_dmesg.txt

I have a dual fibre-channel controller (isp(4) driver), through which I access 16 RAID0 logical drives on a Promise vTrak E630fD (one volume per physical disk).

Since both controllers are connected to the same storage unit with no LUN masking, both end up seeing the same devices, which is why I combined these devices using geom_multipath.
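To make the setup concrete, here is a rough Python sketch that prints the kind of gmultipath label commands involved. The device names are an assumption for illustration only (LUN N seen as daN through the first controller and as da(N+16) through the second); the real names are whatever da(4) assigns at attach time, so they should be checked against dmesg before running anything.

```python
# Sketch only: print one "gmultipath label" command per logical drive.
# ASSUMPTION: LUN n appears as da<n> via the first controller and as
# da<n + 16> via the second; the real da(4) numbering may differ.

NUM_DISKS = 16  # 16 logical drives exported by the array


def label_commands(num_disks=NUM_DISKS):
    """Return the gmultipath label command line for each multipath device."""
    commands = []
    for n in range(num_disks):
        first_path = f"/dev/da{n}"               # hypothetical device name
        second_path = f"/dev/da{n + num_disks}"  # hypothetical device name
        commands.append(
            f"gmultipath label -v disk{n} {first_path} {second_path}"
        )
    return commands


if __name__ == "__main__":
    for command in label_commands():
        print(command)
```

Each resulting provider then shows up as multipath/diskN, which is the name used in the pool.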
Here is my zpool structure:

config:

        NAME                   STATE     READ WRITE CKSUM
        data                   ONLINE       0     0     0
          raidz1               ONLINE       0     0     0
            multipath/disk0    ONLINE       0     0     0
            multipath/disk1    ONLINE       0     0     0
            multipath/disk2    ONLINE       0     0     0
            multipath/disk3    ONLINE       0     0     0
            multipath/disk4    ONLINE       0     0     0
            multipath/disk5    ONLINE       0     0     0
            multipath/disk6    ONLINE       0     0     0
            multipath/disk7    ONLINE       0     0     0
          raidz1               ONLINE       0     0     0
            multipath/disk8    ONLINE       0     0     0
            multipath/disk9    ONLINE       0     0     0
            multipath/disk10   ONLINE       0     0     0
            multipath/disk11   ONLINE       0     0     0
            multipath/disk12   ONLINE       0     0     0
            multipath/disk13   ONLINE       0     0     0
            multipath/disk14   ONLINE       0     0     0
            multipath/disk15   ONLINE       0     0     0

errors: No known data errors

Using gmultipath, I eventually want to have disk{1,3,5,7,9,11,13,15} use the second controller, while the rest use the first. The idea was that if anyone removed a fiber, everything would switch over to the remaining fiber.

For the sake of testing, I put every multipath device on the same controller, isp1.

Here is the kernel log fragment I could acquire from my test (removing a fiber on which transfers are actively running). Since I don't have serial console access, I couldn't capture the kernel panic trace itself: the last readable section displayed simply mentions a kernel trap during a page fault in g_mp_kt, and I reckon every CPU is raising the panic message at once.
http://www.darkbsd.org/~darksoul/server_lastlog_before_kernelpanic.txt

After that, I get the aforementioned kernel panic. I can consistently reproduce it, and will try to get serial console output for a more detailed panic trace, but it feels like everything is occurring at the same time without proper locking and without confirming that the relevant structures are still allocated.

This looks like a race condition between isp(4) loopdown provoking da(4) destruction and gmultipath(8) failover.
(In other words, g_mp_kt would be accessing a da(4) structure that is being destroyed, or has already been destroyed, and so touching unallocated memory.)

Maybe this is similar to this issue:
http://freebsd.1045724.n5.nabble.com/Kernel-panic-with-gmultipath-td4204700.html

Could this be tuned so that:
1) initially, on isp(4) loopdown -> da(4) devices depending on it return SCSI errors, provoking a clean failover of gmultipath
2) afterwards, on isp(4) timeout -> da(4) devices are destroyed

Is this a case for using the following boot hints?
- "hint.isp.0.loop_down_limit" and "hint.isp.0.gone_device_time"
(though I am not quite sure what the difference is between the two. Which one does the actual deallocation of the underlying devices?)

Thanks in advance for your time,
-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo
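
For clarity, the ordering asked about in 1) and 2) above can be modelled with a small Python toy. This is not kernel code and says nothing about how isp(4), da(4) or g_mp_kt are actually implemented; every class and method name is made up for illustration, and the device names da0 and da16 are hypothetical. It only shows why "errors first, destruction later" would let the multipath layer drop a path before the underlying device goes away.

```python
# Toy model of the requested ordering; NOT kernel code.
# All class, method and device names are made up for illustration.


class Path:
    """One da(4)-like provider behind a single controller port."""

    def __init__(self, name):
        self.name = name
        self.loop_up = True
        self.destroyed = False

    def read(self):
        if self.destroyed:
            # Models the suspected bug: touching a provider that is gone.
            raise RuntimeError(f"use of destroyed provider {self.name}")
        if not self.loop_up:
            # Step 1: the loop is down, but the device still exists
            # and fails the request with an error.
            raise IOError(f"I/O error on {self.name}")
        return f"data via {self.name}"


class Multipath:
    """Active/passive multipath over a list of paths."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.active = self.paths[0]

    def read(self):
        while True:
            try:
                return self.active.read()
            except IOError:
                # Clean failover: drop the failed path, then retry.
                self.paths.remove(self.active)
                if not self.paths:
                    raise
                self.active = self.paths[0]


def demo():
    a, b = Path("da0"), Path("da16")
    mp = Multipath([a, b])
    a.loop_up = False      # step 1: loopdown, errors are returned
    result = mp.read()     # failover to the remaining path
    a.destroyed = True     # step 2: timeout, device is destroyed
    still_ok = mp.read()   # multipath no longer references da0
    return result, still_ok, [p.name for p in mp.paths]
```

If step 2 happened before the failover in step 1 had completed, the first read in this model would hit the RuntimeError branch instead, which is the kind of ordering problem I suspect.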