From: Nikolay Denev <ndenev@gmail.com>
To: Alexander Motin
Cc: Gary Palmer, FreeBSD-Current, Dennis Kögel, freebsd-geom@freebsd.org
Date: Fri, 20 Jan 2012 14:13:16 +0200
Subject: Re: RFC: GEOM MULTIPATH rewrite

On Jan 20, 2012, at 1:30 PM, Alexander Motin wrote:

> On 01/20/12 13:08, Nikolay Denev wrote:
>> On 20.01.2012, at 12:51, Alexander Motin wrote:
>>
>>> On 01/20/12 10:09, Nikolay Denev wrote:
>>>> Another thing I've observed is that active/active probably only makes
>>>> sense if you are accessing a single LUN. In my tests, where I have 24
>>>> LUNs that form 4 vdevs in a single zpool, the highest performance was
>>>> achieved when I split the active paths between the controllers
>>>> installed in the server importing the pool (basically "gmultipath
>>>> rotate $LUN" in rc.local for half of the paths). Using active/active
>>>> in this situation resulted in fluctuating performance.
>>>
>>> How big was the fluctuation? Between the speed of one path and of all
>>> paths?
>>>
>>> Several active/active devices without knowledge of each other will,
>>> with some probability, send part of their requests via the same links,
>>> while ZFS itself already does some balancing between vdevs.
>>>
>>> --
>>> Alexander Motin
>>
>> I will test in a bit and post the results.
>>
>> P.S.: Is there a way to enable/disable active-active on the fly? I'm
>> currently re-labeling to achieve that.
>
> No, there is not right now. But for experiments you may achieve the same
> result by manually marking all paths except one as failed. It is not
> dangerous: if the active link fails, all the other paths will resurrect
> automatically.
>
> --
> Alexander Motin
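For reference, the manual toggle Alexander describes would look something
like this (just a sketch; LD_0/da0/da24 are example names from my setup
below):

  # mark the da24 path of LD_0 as failed, leaving only da0 active
  gmultipath fail LD_0 da24
  # later, mark it operational (usable for active-active) again
  gmultipath restore LD_0 da24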
I had to destroy and relabel anyway, since I was not using active-active
before. Here's what I did (maybe a little too verbose):

gmultipath label -A -v LD_0 /dev/da0 /dev/da24
gmultipath label -A -v LD_1 /dev/da1 /dev/da25
gmultipath label -A -v LD_2 /dev/da2 /dev/da26
gmultipath label -A -v LD_3 /dev/da3 /dev/da27
gmultipath label -A -v LD_4 /dev/da4 /dev/da28
gmultipath label -A -v LD_5 /dev/da5 /dev/da29
gmultipath label -A -v LD_6 /dev/da6 /dev/da30
gmultipath label -A -v LD_7 /dev/da7 /dev/da31
gmultipath label -A -v LD_8 /dev/da8 /dev/da32
gmultipath label -A -v LD_9 /dev/da9 /dev/da33
gmultipath label -A -v LD_10 /dev/da10 /dev/da34
gmultipath label -A -v LD_11 /dev/da11 /dev/da35
gmultipath label -A -v LD_12 /dev/da12 /dev/da36
gmultipath label -A -v LD_13 /dev/da13 /dev/da37
gmultipath label -A -v LD_14 /dev/da14 /dev/da38
gmultipath label -A -v LD_15 /dev/da15 /dev/da39
gmultipath label -A -v LD_16 /dev/da16 /dev/da40
gmultipath label -A -v LD_17 /dev/da17 /dev/da41
gmultipath label -A -v LD_18 /dev/da18 /dev/da42
gmultipath label -A -v LD_19 /dev/da19 /dev/da43
gmultipath label -A -v LD_20 /dev/da20 /dev/da44
gmultipath label -A -v LD_21 /dev/da21 /dev/da45
gmultipath label -A -v LD_22 /dev/da22 /dev/da46
gmultipath label -A -v LD_23 /dev/da23 /dev/da47
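(The same labeling could be scripted; a sketch in sh, assuming the
daN/da(N+24) pairing holds for all 24 LUNs:

  # label LD_0..LD_23 in active-active (-A) mode,
  # pairing each daN with its second path da(N+24)
  for i in $(jot 24 0); do
      gmultipath label -A -v "LD_${i}" "/dev/da${i}" "/dev/da$((i + 24))"
  done
)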
:~# gmultipath status
            Name    Status  Components
  multipath/LD_0   OPTIMAL  da0 (ACTIVE) da24 (ACTIVE)
  multipath/LD_1   OPTIMAL  da1 (ACTIVE) da25 (ACTIVE)
  multipath/LD_2   OPTIMAL  da2 (ACTIVE) da26 (ACTIVE)
  multipath/LD_3   OPTIMAL  da3 (ACTIVE) da27 (ACTIVE)
  multipath/LD_4   OPTIMAL  da4 (ACTIVE) da28 (ACTIVE)
  multipath/LD_5   OPTIMAL  da5 (ACTIVE) da29 (ACTIVE)
  multipath/LD_6   OPTIMAL  da6 (ACTIVE) da30 (ACTIVE)
  multipath/LD_7   OPTIMAL  da7 (ACTIVE) da31 (ACTIVE)
  multipath/LD_8   OPTIMAL  da8 (ACTIVE) da32 (ACTIVE)
  multipath/LD_9   OPTIMAL  da9 (ACTIVE) da33 (ACTIVE)
 multipath/LD_10   OPTIMAL  da10 (ACTIVE) da34 (ACTIVE)
 multipath/LD_11   OPTIMAL  da11 (ACTIVE) da35 (ACTIVE)
 multipath/LD_12   OPTIMAL  da12 (ACTIVE) da36 (ACTIVE)
 multipath/LD_13   OPTIMAL  da13 (ACTIVE) da37 (ACTIVE)
 multipath/LD_14   OPTIMAL  da14 (ACTIVE) da38 (ACTIVE)
 multipath/LD_15   OPTIMAL  da15 (ACTIVE) da39 (ACTIVE)
 multipath/LD_16   OPTIMAL  da16 (ACTIVE) da40 (ACTIVE)
 multipath/LD_17   OPTIMAL  da17 (ACTIVE) da41 (ACTIVE)
 multipath/LD_18   OPTIMAL  da18 (ACTIVE) da42 (ACTIVE)
 multipath/LD_19   OPTIMAL  da19 (ACTIVE) da43 (ACTIVE)
 multipath/LD_20   OPTIMAL  da20 (ACTIVE) da44 (ACTIVE)
 multipath/LD_21   OPTIMAL  da21 (ACTIVE) da45 (ACTIVE)
 multipath/LD_22   OPTIMAL  da22 (ACTIVE) da46 (ACTIVE)
 multipath/LD_23   OPTIMAL  da23 (ACTIVE) da47 (ACTIVE)

:~# zpool import tank
:~# zpool status
  pool: tank
 state: ONLINE
 scan: none requested
config:

        NAME                 STATE     READ WRITE CKSUM
        tank                 ONLINE       0     0     0
          raidz2-0           ONLINE       0     0     0
            multipath/LD_0   ONLINE       0     0     0
            multipath/LD_1   ONLINE       0     0     0
            multipath/LD_2   ONLINE       0     0     0
            multipath/LD_3   ONLINE       0     0     0
            multipath/LD_4   ONLINE       0     0     0
            multipath/LD_5   ONLINE       0     0     0
          raidz2-1           ONLINE       0     0     0
            multipath/LD_6   ONLINE       0     0     0
            multipath/LD_7   ONLINE       0     0     0
            multipath/LD_8   ONLINE       0     0     0
            multipath/LD_9   ONLINE       0     0     0
            multipath/LD_10  ONLINE       0     0     0
            multipath/LD_11  ONLINE       0     0     0
          raidz2-2           ONLINE       0     0     0
            multipath/LD_12  ONLINE       0     0     0
            multipath/LD_13  ONLINE       0     0     0
            multipath/LD_14  ONLINE       0     0     0
            multipath/LD_15  ONLINE       0     0     0
            multipath/LD_16  ONLINE       0     0     0
            multipath/LD_17  ONLINE       0     0     0
          raidz2-3           ONLINE       0     0     0
            multipath/LD_18  ONLINE       0     0     0
            multipath/LD_19  ONLINE       0     0     0
            multipath/LD_20  ONLINE       0     0     0
            multipath/LD_21  ONLINE       0     0     0
            multipath/LD_22  ONLINE       0     0     0
            multipath/LD_23  ONLINE       0     0     0

errors: No known data errors

And now a very naive benchmark:

:~# dd if=/dev/zero of=/tank/TEST bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes transferred in 7.282780 secs (73717855 bytes/sec)
:~# dd if=/dev/zero of=/tank/TEST bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes transferred in 38.422724 secs (13972745 bytes/sec)
:~# dd if=/dev/zero of=/tank/TEST bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes transferred in 10.810989 secs (49659740 bytes/sec)

Now deactivate the alternative paths:

/sbin/gmultipath fail LD_0 da24
/sbin/gmultipath fail LD_1 da25
/sbin/gmultipath fail LD_2 da26
/sbin/gmultipath fail LD_3 da27
/sbin/gmultipath fail LD_4 da28
/sbin/gmultipath fail LD_5 da29
/sbin/gmultipath fail LD_6 da6
/sbin/gmultipath fail LD_7 da7
/sbin/gmultipath fail LD_8 da8
/sbin/gmultipath fail LD_9 da9
/sbin/gmultipath fail LD_10 da10
/sbin/gmultipath fail LD_11 da11
/sbin/gmultipath fail LD_12 da36
/sbin/gmultipath fail LD_13 da37
/sbin/gmultipath fail LD_14 da38
/sbin/gmultipath fail LD_15 da39
/sbin/gmultipath fail LD_16 da40
/sbin/gmultipath fail LD_17 da41
/sbin/gmultipath fail LD_18 da18
/sbin/gmultipath fail LD_19 da19
/sbin/gmultipath fail LD_20 da20
/sbin/gmultipath fail LD_21 da21
/sbin/gmultipath fail LD_22 da22
/sbin/gmultipath fail LD_23 da23

:~# gmultipath status
            Name    Status  Components
  multipath/LD_0  DEGRADED  da0 (ACTIVE) da24 (FAIL)
  multipath/LD_1  DEGRADED  da1 (ACTIVE) da25 (FAIL)
  multipath/LD_2  DEGRADED  da2 (ACTIVE) da26 (FAIL)
  multipath/LD_3  DEGRADED  da3 (ACTIVE) da27 (FAIL)
  multipath/LD_4  DEGRADED  da4 (ACTIVE) da28 (FAIL)
  multipath/LD_5  DEGRADED  da5 (ACTIVE) da29 (FAIL)
  multipath/LD_6  DEGRADED  da6 (FAIL) da30 (ACTIVE)
  multipath/LD_7  DEGRADED  da7 (FAIL) da31 (ACTIVE)
  multipath/LD_8  DEGRADED  da8 (FAIL) da32 (ACTIVE)
  multipath/LD_9  DEGRADED  da9 (FAIL) da33 (ACTIVE)
 multipath/LD_10  DEGRADED  da10 (FAIL) da34 (ACTIVE)
 multipath/LD_11  DEGRADED  da11 (FAIL) da35 (ACTIVE)
 multipath/LD_12  DEGRADED  da12 (ACTIVE) da36 (FAIL)
 multipath/LD_13  DEGRADED  da13 (ACTIVE) da37 (FAIL)
 multipath/LD_14  DEGRADED  da14 (ACTIVE) da38 (FAIL)
 multipath/LD_15  DEGRADED  da15 (ACTIVE) da39 (FAIL)
 multipath/LD_16  DEGRADED  da16 (ACTIVE) da40 (FAIL)
 multipath/LD_17  DEGRADED  da17 (ACTIVE) da41 (FAIL)
 multipath/LD_18  DEGRADED  da18 (FAIL) da42 (ACTIVE)
 multipath/LD_19  DEGRADED  da19 (FAIL) da43 (ACTIVE)
 multipath/LD_20  DEGRADED  da20 (FAIL) da44 (ACTIVE)
 multipath/LD_21  DEGRADED  da21 (FAIL) da45 (ACTIVE)
 multipath/LD_22  DEGRADED  da22 (FAIL) da46 (ACTIVE)
 multipath/LD_23  DEGRADED  da23 (FAIL) da47 (ACTIVE)

And the benchmark again:

:~# dd if=/dev/zero of=/tank/TEST bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes transferred in 1.083226 secs (495622270 bytes/sec)
:~# dd if=/dev/zero of=/tank/TEST bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes transferred in 1.409975 secs (380766249 bytes/sec)
:~# dd if=/dev/zero of=/tank/TEST bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes transferred in 1.136110 secs (472551848 bytes/sec)
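For completeness, the per-controller split I mentioned earlier ("gmultipath
rotate $LUN" in rc.local) could be scripted as well; a sketch that should
give the same active/standby layout as the fail commands above:

  # rotate the active path of LD_6..LD_11 and LD_18..LD_23 to the second
  # controller, so each controller serves two of the four vdevs
  for i in $(jot 6 6) $(jot 6 18); do
      gmultipath rotate "LD_${i}"
  done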
P.S.: The server is running 8.2-STABLE with a dual-port isp(4) card and is
directly connected to a 4Gbps Xyratex dual-controller (active-active)
storage array. All 24 SAS drives are set up as single-disk RAID0 LUNs.