From: Nikolay Denev
Date: Fri, 20 Jan 2012 10:09:56 +0200
To: Gary Palmer
Cc: Alexander Motin, FreeBSD-Current, Dennis K?gel, freebsd-geom@freebsd.org
Subject: Re: RFC: GEOM MULTIPATH rewrite
Message-Id: <059C17DB-3A7B-41AA-BF91-2F8EBAF17D01@gmail.com>

On Nov 14, 2011, at 11:09 PM, Gary Palmer wrote:

> On Tue, Nov 01, 2011 at 10:24:06PM +0200, Alexander Motin wrote:
>> On 01.11.2011 19:50, Dennis K?gel wrote:
>>> Not sure if replying on-list or off-list makes more sense...
>>
>> Replying on-list could share experience with other users.
>>
>>> Anyway, some first impressions, on stable/9:
>>>
>>> The lab environment here is an EMC VNX / Clariion SAN, which has two
>>> Storage Processors, connected to different switches, connected to two
>>> isp(4)s on the test machine. So at any time, the machine sees four
>>> paths, but only two are available (depending on which SP owns the LUN).
>>>
>>> 580# camcontrol devlist
>>> at scbus0 target 0 lun 0 (da0,pass0)
>>> at scbus0 target 1 lun 0 (da1,pass1)
>>> at scbus1 target 0 lun 0 (da2,pass2)
>>> at scbus1 target 1 lun 0 (da3,pass3)
>>> at scbus2 target 0 lun 0 (da4,pass4)
>>> at scbus2 target 1 lun 0 (da5,pass5)
>>> at scbus4 target 0 lun 0 (cd0,pass6)
>>>
>>> I miss the ability to "add" disks to automatic mode multipaths, but
>>> I (just now) realized this only makes sense when gmultipath has some
>>> kind of path checking facility (like periodically trying to read
>>> sector 0 of each configured device, this is what Linux'
>>> devicemapper-multipathd does).
>>
>> In automatic mode other paths are supposed to be detected via metadata
>> reading. If in your case some paths are not readable, automatic mode
>> can't work as expected. By the way, could you describe how your
>> configuration is supposed to work, like when other paths will start
>> working?
>
> Without knowledge of the particular Clariion SAN Dennis is working with,
> I've seen some so-called active/active RAID controllers force a LUN
> fail over from one controller to another (taking it offline for 3
> seconds in the process) because the LUN received an I/O down a path to
> the controller that was formerly taking the standby role for that LUN
> (and it was per-LUN, so some would be owned by one controller and some
> by the other). During the controller switch, all I/O to the LUN would
> fail. Thankfully that particular RAID model where I observed this
> behaviour hasn't been sold in several years, but I would tend to expect
> such behaviour at the lower end of the storage market with the higher
> end units doing true active/active configurations.
> (and no, I won't name the manufacturer on a public list)
>
> This is exactly why Linux ships with a multipath configuration file, so
> it can describe exactly what form of brain damage the controller in
> question implements so it can work around it, and maybe even
> document some vendor-specific extensions so that the host can detect
> which controller is taking which role for a particular path.
>
> Even some controllers that don't have pathological behaviour when
> they receive I/O down the wrong path have sub-optimal behaviour unless
> you choose the right path. NetApp SANs in particular typically have two
> independent controllers with a high-speed internal interconnect, however
> there is a measurable and not-insignificant penalty for sending the I/O
> to the "partner" controller for a LUN, across the internal interconnect
> (called a "VTIC" I believe) to the "owner" controller. I've been told,
> although I have not measured this myself, that it can add several ms to
> a transaction, which when talking about SAN storage is potentially
> several times what it takes to do the same I/O directly to the
> controller that owns it. There's probably a way to make the "partner"
> controller not advertise the LUN until it takes over in a failover
> scenario, but every NetApp I've worked with is set (by default I
> believe) to advertise the LUN out both controllers.
>
> Gary

Another thing I've observed is that active/active probably only makes
sense if you are accessing a single LUN.
In my tests, where I have 24 LUNs that form 4 vdevs in a single zpool, the
highest performance was achieved when I split the active paths among the
controllers installed in the server importing the pool.
(basically "gmultipath rotate $LUN" in rc.local for half of the paths).
Using active/active in this situation resulted in fluctuating performance.
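
For anyone wanting to try the same split, here is a rough sketch of what
such an rc.local fragment might look like. It is not the exact script used
above. It assumes the multipath device names are passed in by hand, that
each device has exactly two paths (so a single "rotate" moves it to the
other controller), and that the devices are in Active/Passive mode. The
function name rotate_alternate and the GMULTIPATH variable are made up for
illustration.

```shell
#!/bin/sh
# Sketch: rotate the active path on every second multipath device so the
# active paths end up split between two controllers.
# Assumption: two paths per device, Active/Passive mode.

# Command used for rotation; set GMULTIPATH="echo gmultipath" for a dry run.
: "${GMULTIPATH:=gmultipath}"

# rotate_alternate NAME... (hypothetical helper, not part of gmultipath)
rotate_alternate() {
    i=0
    for lun in "$@"; do
        # Leave devices at even positions alone, rotate those at odd ones.
        if [ $((i % 2)) -eq 1 ]; then
            $GMULTIPATH rotate "$lun"
        fi
        i=$((i + 1))
    done
}
```

Calling "rotate_alternate LUN0 LUN1 LUN2 LUN3" would then rotate only LUN1
and LUN3. Which devices belong on which controller depends on the vdev
layout, so picking every second name is only a starting point.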