Date: Thu, 18 Jun 2015 11:05:19 +0300
From: Max Gurtovoy <maxg@mellanox.com>
To: <freebsd-scsi@freebsd.org>, Sagi Grimberg <sagig@mellanox.com>, Oren Duer <oren@mellanox.com>, Hans Petter Selasky <hanss@mellanox.com>
Subject: Re: gmultipath HA over iscsi/iser
Message-ID: <55827BBF.6040206@mellanox.com>
In-Reply-To: <557DA8C0.1020209@mellanox.com>
References: <557DA8C0.1020209@mellanox.com>
Has anyone checked the gmultipath utility?

Thanks,
Max.

On 6/14/2015 7:16 PM, Max Gurtovoy wrote:
> Hello,
> Lately I have been testing HA using the gmultipath utility over iSCSI/iSER devices.
> I'm working on the 11-CURRENT code base.
> I created 1 LUN on the target side and connected to it via 2 different
> physical ports from the initiator side.
> On the initiator side I see /dev/da0 and /dev/da1.
> I created the multipath device using:
> gmultipath label dm0 /dev/da0 /dev/da1
> Now I have a new device, /dev/multipath/dm0.
> I set kern.iscsi.fail_on_disconnection=1 (to fail I/O fast).
>
> Issue 1:
> -------------
> I can't run simple fio/dd traffic over /dev/da0 or /dev/da1.
> The only traffic that is possible goes through the multipath device dm0.
> Is this by design?
> In the Linux implementation we can run traffic on both the block devices
> and the multipath device.
>
> Issue 2:
> --------------
> I ran fio traffic over the multipath device dm0 on the initiator
> side while toggling the ports in a loop:
>
> Port 1 down --> sleep 2 mins (iSCSI/iSER device reconnecting meanwhile
> with no success) --> port 1 up --> sleep 5 mins (iSCSI/iSER device
> reconnecting successfully)
> Port 2 down --> sleep 2 mins (iSCSI/iSER device reconnecting meanwhile
> with no success) --> port 2 up --> sleep 5 mins (iSCSI/iSER device
> reconnecting successfully)
>
> The expected result is that when port N is down, the traffic
> moves to the available port and continues successfully.
> I ran this test for many hours and the traffic FAILED (even though there
> was always at least 1 working path between initiator and target).
>
> log:
>
> # gmultipath status
>               Name    Status  Components
>   multipath/dm_tcp   OPTIMAL  da0 (ACTIVE)
>                               da1 (PASSIVE)
>   multipath/dm_iser  OPTIMAL  da2 (ACTIVE)
>                               da3 (PASSIVE)
>
> # fio ..... (over /dev/multipath/dm_iser or /dev/multipath/dm_tcp)
>
> fio: this platform does not support process shared mutexes, forcing
> use of threads. Use the 'thread' option to get rid of this warning.
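[For reference, the port-toggling loop described in Issue 2 amounts to roughly the following script. This is only a sketch of the procedure from the mail: the interface names mlxen0/mlxen1 are hypothetical placeholders for the two initiator-side ports, and it must run as root on the initiator.]

```shell
#!/bin/sh
# Sketch of the failover test loop described above (hypothetical interface
# names; substitute the initiator's two physical ports).
while true; do
    for ifc in mlxen0 mlxen1; do
        ifconfig "$ifc" down
        sleep 120   # 2 min: the session on this port keeps reconnecting without success
        ifconfig "$ifc" up
        sleep 300   # 5 min: let the session reconnect and the path recover
    done
done
```

[While this runs, fio stays pointed at /dev/multipath/dm0, so at any moment at least one of the two paths should be usable.]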
>
> task1: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=8
> ...
> task1: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=8
> fio-2.1.3
> Starting 8 threads
> fio: pid=101071, err=6/file:filesetup.c:575,
> func=open(/dev/multipath/dm_tcp), error=Device not configured
> task1: (groupid=0, jobs=8): err= 6 (file:filesetup.c:575,
> func=open(/dev/multipath/dm_tcp), error=Device not configured):
> pid=101071: Thu Jun 11 17:25:47 2015
>   read : io=296400MB, bw=32122KB/s, iops=8030, runt=9448911msec
>     clat (usec): min=131, max=5541.8K, avg=504.40, stdev=5660.23
>      lat (usec): min=132, max=5541.8K, avg=504.55, stdev=5660.23
>     clat percentiles (usec):
>      |  1.00th=[  251],  5.00th=[  298], 10.00th=[  330], 20.00th=[  370],
>      | 30.00th=[  406], 40.00th=[  446], 50.00th=[  478], 60.00th=[  510],
>      | 70.00th=[  540], 80.00th=[  580], 90.00th=[  644], 95.00th=[  700],
>      | 99.00th=[ 1448], 99.50th=[ 1704], 99.90th=[ 1976], 99.95th=[ 2064],
>      | 99.99th=[ 2256]
>     bw (KB  /s): min=    2, max= 5576, per=12.64%, avg=4060.97, stdev=352.37
>   write: io=295596MB, bw=32034KB/s, iops=8008, runt=9448911msec
>     clat (usec): min=125, max=5541.8K, avg=490.13, stdev=5143.96
>      lat (usec): min=125, max=5541.8K, avg=490.41, stdev=5143.96
>     clat percentiles (usec):
>      |  1.00th=[  239],  5.00th=[  282], 10.00th=[  310], 20.00th=[  354],
>      | 30.00th=[  390], 40.00th=[  426], 50.00th=[  466], 60.00th=[  502],
>      | 70.00th=[  532], 80.00th=[  572], 90.00th=[  628], 95.00th=[  692],
>      | 99.00th=[ 1432], 99.50th=[ 1688], 99.90th=[ 1960], 99.95th=[ 2040],
>      | 99.99th=[ 2256]
>     bw (KB  /s): min=    3, max= 5512, per=12.64%, avg=4049.74, stdev=355.11
>     lat (usec) : 250=1.29%, 500=56.84%, 750=38.78%, 1000=0.94%
>     lat (msec) : 2=2.08%, 4=0.07%, 10=0.01%, 20=0.01%, 50=0.01%
>     lat (msec) : 100=0.01%, >=2000=0.01%
>   cpu          : usr=0.61%, sys=4.33%, ctx=151634083, majf=0, minf=3
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=75878522/w=75672554/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>    READ: io=296400MB, aggrb=32121KB/s, minb=32121KB/s, maxb=32121KB/s,
> mint=9448911msec, maxt=9448911msec
>   WRITE: io=295596MB, aggrb=32034KB/s, minb=32034KB/s, maxb=32034KB/s,
> mint=9448911msec, maxt=9448911msec
>
> # gmultipath status
>               Name    Status   Components
>   multipath/dm_tcp   DEGRADED  da1 (ACTIVE)
>   multipath/dm_iser  DEGRADED  da3 (ACTIVE)
>
> We can see that there are still ACTIVE paths to the multipath devices,
> yet the traffic failed.
> Any suggestions? Has anyone seen this before?
>
> Thanks,
> Max Gurtovoy.
> Mellanox Technologies.
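[When a path does not come back after the port recovers, it may help to check both the GEOM_MULTIPATH view and the iSCSI session state. A few standard FreeBSD commands that could narrow this down, using the dm_iser/da2 names from the log above (run as root on the initiator):]

```shell
# Detailed per-component state; a component marked FAIL was dropped by
# GEOM_MULTIPATH and will not be used until re-added.
gmultipath list dm_iser

# Verify the underlying iSCSI/iSER sessions actually reconnected.
iscsictl -L

# If a recovered component was removed from the multipath device,
# it can be reattached manually.
gmultipath add dm_iser da2
```

[Comparing the two outputs shows whether the failure is in session recovery (iscsictl still shows the session down) or in gmultipath not restoring a component that did come back.]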