Date:      Thu, 18 Jun 2015 11:05:19 +0300
From:      Max Gurtovoy <maxg@mellanox.com>
To:        <freebsd-scsi@freebsd.org>, Sagi Grimberg <sagig@mellanox.com>, Oren Duer <oren@mellanox.com>, Hans Petter Selasky <hanss@mellanox.com>
Subject:   Re: gmultipath HA over iscsi/iser
Message-ID:  <55827BBF.6040206@mellanox.com>
In-Reply-To: <557DA8C0.1020209@mellanox.com>
References:  <557DA8C0.1020209@mellanox.com>

Has anyone checked the gmultipath utility?
Thanks,
Max.


On 6/14/2015 7:16 PM, Max Gurtovoy wrote:
> Hello,
> Lately I have been testing HA using the gmultipath utility over
> iSCSI/iSER devices.
> I'm working on the 11-CURRENT code base.
> I created one LUN on the target side and connected to it through two
> different physical ports on the initiator side.
> On the initiator side I see /dev/da0 and /dev/da1.
> I created a multipath device using:
>     gmultipath label dm0 /dev/da0 /dev/da1
> Now I have a new device, /dev/multipath/dm0.
> I set kern.iscsi.fail_on_disconnection=1 (to fail I/O fast).
>
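For reference, the setup steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the `run` helper, the `setup_multipath` function and the `DRY_RUN` variable are hypothetical names introduced here, while the device names, the `gmultipath label` invocation and the sysctl are the ones quoted in the message. With `DRY_RUN=1` (the default in this sketch) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run and setup_multipath are illustrative names,
# not part of gmultipath. DRY_RUN=1 prints the commands instead of running them.
: "${DRY_RUN:=1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_multipath NAME PROVIDER...
setup_multipath() {
    mp_name=$1
    shift
    # Label the given providers as one multipath device (/dev/multipath/NAME).
    run gmultipath label "$mp_name" "$@"
    # Fail I/O quickly when an iSCSI session drops, as in the message.
    run sysctl kern.iscsi.fail_on_disconnection=1
    # Show the resulting device and its components.
    run gmultipath status
}

setup_multipath dm0 /dev/da0 /dev/da1
```

Setting `DRY_RUN=0` would run the same three commands for real, which requires root on the initiator.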
> Issue 1:
> -------------
> I can't run simple fio/dd traffic over /dev/da0 or /dev/da1.
> The only traffic that is possible goes through the multipath device dm0.
> Is this by design?
> In the Linux implementation we can run traffic on both the block
> devices and the multipath devices.
>
> Issue 2:
> --------------
> I ran fio traffic over the multipath device dm0 on the initiator side
> while toggling the ports in a loop:
>
> Port 1 down --> sleep 2 mins (iSCSI/iSER device tries to reconnect,
> with no success) --> port 1 up --> sleep 5 mins (iSCSI/iSER device
> reconnects successfully)
> Port 2 down --> sleep 2 mins (iSCSI/iSER device tries to reconnect,
> with no success) --> port 2 up --> sleep 5 mins (iSCSI/iSER device
> reconnects successfully)
>
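The toggling loop above might look roughly like the sketch below. The message does not say how the ports were taken down, so using `ifconfig` on the initiator is an assumption, as are the interface names `mlxen0` and `mlxen1` and the helper names `run`, `toggle_port` and `toggle_pass`. The 2-minute and 5-minute sleeps are taken from the description. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: interface names and helper names are hypothetical.
# DRY_RUN=1 prints the commands (including the sleeps) instead of running them.
: "${DRY_RUN:=1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# toggle_port IFNAME DOWN_SECS UP_SECS
# Take one port down, wait, bring it back up, then wait for the reconnect.
toggle_port() {
    run ifconfig "$1" down
    run sleep "$2"
    run ifconfig "$1" up
    run sleep "$3"
}

# toggle_pass IFNAME...
# One pass over all ports: 2 minutes down, 5 minutes up for each.
toggle_pass() {
    for port in "$@"; do
        toggle_port "$port" 120 300
    done
}

toggle_pass mlxen0 mlxen1
```

For a long-running test, the final call could be wrapped in `while :; do toggle_pass mlxen0 mlxen1; done` with `DRY_RUN=0`, while fio runs against the multipath device in another terminal.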
> The expected result is that when port N is down, the traffic moves to
> the available port and continues successfully.
> I ran this test for many hours and the traffic FAILED (even though
> there was at least one usable path between initiator and target).
>
> log:
>
> # gmultipath status
>              Name   Status  Components
>  multipath/dm_tcp  OPTIMAL  da0 (ACTIVE)
>                             da1 (PASSIVE)
> multipath/dm_iser  OPTIMAL  da2 (ACTIVE)
>                             da3 (PASSIVE)
>
>
> # fio ..... (over /dev/multipath/dm_iser or /dev/multipath/dm_tcp)
>
>
> fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
> task1: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=8
> ...
> task1: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=8
> fio-2.1.3
> Starting 8 threads
> fio: pid=101071, err=6/file:filesetup.c:575, func=open(/dev/multipath/dm_tcp), error=Device not configured
>
> task1: (groupid=0, jobs=8): err= 6 (file:filesetup.c:575, func=open(/dev/multipath/dm_tcp), error=Device not configured): pid=101071: Thu Jun 11 17:25:47 2015
>   read : io=296400MB, bw=32122KB/s, iops=8030, runt=9448911msec
>     clat (usec): min=131, max=5541.8K, avg=504.40, stdev=5660.23
>      lat (usec): min=132, max=5541.8K, avg=504.55, stdev=5660.23
>     clat percentiles (usec):
>      |  1.00th=[  251],  5.00th=[  298], 10.00th=[  330], 20.00th=[  370],
>      | 30.00th=[  406], 40.00th=[  446], 50.00th=[  478], 60.00th=[  510],
>      | 70.00th=[  540], 80.00th=[  580], 90.00th=[  644], 95.00th=[  700],
>      | 99.00th=[ 1448], 99.50th=[ 1704], 99.90th=[ 1976], 99.95th=[ 2064],
>      | 99.99th=[ 2256]
>     bw (KB  /s): min=    2, max= 5576, per=12.64%, avg=4060.97, stdev=352.37
>   write: io=295596MB, bw=32034KB/s, iops=8008, runt=9448911msec
>     clat (usec): min=125, max=5541.8K, avg=490.13, stdev=5143.96
>      lat (usec): min=125, max=5541.8K, avg=490.41, stdev=5143.96
>     clat percentiles (usec):
>      |  1.00th=[  239],  5.00th=[  282], 10.00th=[  310], 20.00th=[  354],
>      | 30.00th=[  390], 40.00th=[  426], 50.00th=[  466], 60.00th=[  502],
>      | 70.00th=[  532], 80.00th=[  572], 90.00th=[  628], 95.00th=[  692],
>      | 99.00th=[ 1432], 99.50th=[ 1688], 99.90th=[ 1960], 99.95th=[ 2040],
>      | 99.99th=[ 2256]
>     bw (KB  /s): min=    3, max= 5512, per=12.64%, avg=4049.74, stdev=355.11
>     lat (usec) : 250=1.29%, 500=56.84%, 750=38.78%, 1000=0.94%
>     lat (msec) : 2=2.08%, 4=0.07%, 10=0.01%, 20=0.01%, 50=0.01%
>     lat (msec) : 100=0.01%, >=2000=0.01%
>   cpu          : usr=0.61%, sys=4.33%, ctx=151634083, majf=0, minf=3
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=75878522/w=75672554/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>    READ: io=296400MB, aggrb=32121KB/s, minb=32121KB/s, maxb=32121KB/s, mint=9448911msec, maxt=9448911msec
>   WRITE: io=295596MB, aggrb=32034KB/s, minb=32034KB/s, maxb=32034KB/s, mint=9448911msec, maxt=9448911msec
>
>
> # gmultipath status
>              Name    Status  Components
>  multipath/dm_tcp  DEGRADED  da1 (ACTIVE)
> multipath/dm_iser  DEGRADED  da3 (ACTIVE)
>
>
> We can see that there are active paths to the multipath devices, but
> the traffic still failed.
> Any suggestions? Has anyone seen this before?
>
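To narrow down when a device stops having a usable path, a check like the one below could be run between port toggles and its result logged next to the fio errors. It is a sketch that assumes the `gmultipath status` layout shown in the logs above (a device line with name, status, component and state, followed by continuation lines with only component and state); `count_active` is a hypothetical helper name, not a gmultipath subcommand.

```shell
#!/bin/sh
# count_active DEVICE
# Reads `gmultipath status` output on stdin and prints how many components
# of DEVICE are marked (ACTIVE). Assumes the column layout from the logs above.
count_active() {
    awk -v dev="$1" '
        NF == 0 { next }                      # skip blank lines
        $1 == "Name" { next }                 # skip the header line
        $1 ~ /^multipath\// { cur = $1; state = $4 }   # first line of a device
        $1 !~ /^multipath\// { state = $2 }   # continuation line of that device
        cur == dev && state == "(ACTIVE)" { n++ }
        END { print n + 0 }
    '
}

# Example using the DEGRADED status output quoted in this message.
count_active multipath/dm_iser <<'EOF'
             Name    Status  Components
 multipath/dm_tcp  DEGRADED  da1 (ACTIVE)
multipath/dm_iser  DEGRADED  da3 (ACTIVE)
EOF
```

On a live system the same function would be fed from the real command, for example `gmultipath status | count_active multipath/dm_tcp`.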
> Thanks,
> Max Gurtovoy.
> Mellanox Technologies.
>



