Date: Fri, 13 Sep 2019 15:13:25 -0400
From: John Fleming <john@spikefishsolutions.com>
To: freebsd-infiniband@freebsd.org
Subject: Re: Just joined the infiniband club
Message-ID: <CABy3cGxHhwA6H+dXDbBQJ6U4C0=doWkPjD6vbMdoQ_y70qL0wg@mail.gmail.com>
In-Reply-To: <CABy3cGwY7sEMRocSL2UF7JdJM77wFQnXb-=MtJL4axBoL0S=2w@mail.gmail.com>
References: <CABy3cGxXa8J1j+odmfdQ6b534BiPwOMUAMOYqXKMD6zGOeBE3w@mail.gmail.com>
 <00acac6f-3f13-a343-36c5-00fe45620eb0@gmail.com>
 <CABy3cGzfc-UjPOxMFDYtL+OUPw8MYH7WS3picXjGmC=a=Q1xQQ@mail.gmail.com>
 <CABy3cGwY7sEMRocSL2UF7JdJM77wFQnXb-=MtJL4axBoL0S=2w@mail.gmail.com>
And of course I meant ethernet mode, not linux mode.

On Fri, Sep 13, 2019 at 2:36 PM John Fleming
<john@spikefishsolutions.com> wrote:
>
> Top post I know, but I meant to send this to freebsd-infiniband, not stable.
>
> > On 2019-09-07 19:00, John Fleming wrote:
> > > Hi all, I've recently joined the club. I have two Dell R720s connected
> > > directly to each other. The card is a ConnectX-4. I was having a lot of
> > > problems with network drops. Where I'm at now is I'm running FreeBSD
> > > 12-STABLE as of a week ago, the cards have been cross-flashed with
> > > OEM firmware (these are Lenovo, I think), and I'm no longer getting
> > > network drops. This box is basically my storage server. It's exporting
> > > a RAID 10 ZFS volume to a Linux (compute 19.04, 5.0.0-27-generic) box
> > > which is running GNS3 for a lab.
> > >
> > > So many questions.. sorry if this is a bit rambly!
> > >
> > > From what I understand this card is really 4 x 25 gig lanes. If I
> > > understand that correctly, then one data transfer should be able to do
> > > at most 25 gig (best case), correct?
> > >
> > > I'm not getting what the difference between connected mode and
> > > datagram mode is. Does this have anything to do with the card
> > > operating in InfiniBand mode vs Ethernet mode? FreeBSD is using the
> > > modules compiled in connected mode with the shell script (which is
> > > really a bash script, not an sh script) from the freebsd-infiniband page.
> >
> > Nothing to do with Ethernet...
> >
> > Google turned up a brief explanation here:
> >
> > https://wiki.archlinux.org/index.php/InfiniBand
>
> I still don't get why I would want to use one or the other, or why the
> option is even there, but it doesn't matter. After the firmware upgrade
> and the move to FreeBSD stable (unsure which of the two triggered this)
> I can no longer set connected mode on Linux. There are a lot of posts
> that say you have to disable enhanced IPoIB mode via a modules.conf
> setting, but the driver has no idea what that option is, and echoing
> connected to the mode file throws a write error. I poked around in the
> Linux source, but I'm not even a level 1 fighter on C; I'm like the
> generic NPC that says hi at the gates.
>
> > Those are my module building scripts on the wiki. What bash extensions
> > did you see?
>
> Isn't this a bashism? When I run it inside sh it throws a fit. No
> worries, I just edited loader.conf
>
> auto-append-line
>
> > > The Linux box complains if the MTU is over 2044, with "expect multicast
> > > drops" or something like that, so the MTU on both boxes is set to 2044.
> > >
> > > Everything I'm reading makes it sound like there is no RDMA support in
> > > FreeBSD, or maybe that was no NFS-over-RDMA support. Is that correct?
> >
> > RDMA is inherent in InfiniBand AFAIK. Last I checked, there was no
> > support in FreeBSD for NFS over RDMA, but news travels slowly in this
> > group so a little digging might prove otherwise.
> >
> > > So far it seems like these cards struggle to fill a 10 gig pipe. Using
> > > iperf (2), the best I'm getting is around 6 Gbit/sec. Interfaces
> > > aren't showing drops on either end. It doesn't seem to matter if I do
> > > 1, 2 or 4 threads on iperf.
> >
> > You'll need both ends in connected mode with a fairly large MTU to get
> > good throughput. CentOS defaults to 64k, but FreeBSD is unstable at
> > that size last I checked. I got good results with 16k.
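(For the archives: the recipe I keep running into for the Linux side looks
roughly like the sketch below. The ipoib_enhanced option seems to belong to
Mellanox's OFED build of ib_ipoib rather than the in-tree driver, which would
explain why my kernel doesn't recognize it, so treat this as untested notes;
ib0 and the 16k MTU are only example values.)

  # /etc/modprobe.d/ib_ipoib.conf -- MLNX_OFED's ib_ipoib only, as far as I can tell
  options ib_ipoib ipoib_enhanced=0

  # reload the module, then flip the IPoIB interface to connected mode and
  # raise the MTU (ib0 / 16384 are just examples)
  modprobe -r ib_ipoib && modprobe ib_ipoib
  echo connected > /sys/class/net/ib0/mode
  ip link set ib0 mtu 16384 up

On the FreeBSD side the connected/datagram choice is baked in when the
modules are built (that's what the wiki script does), so there's no mode
file to echo into there.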
> >
> > My FreeBSD ZFS NFS server performed comparably to the CentOS servers,
> > with some buffer space errors causing the interface to shut down (under
> > the same loads that caused the CentOS servers to lock up completely).
> > Someone mentioned that this buffer space bug has been fixed, but I no
> > longer have a way to test it.
> >
> > Best,
> >
> > Jason
> >
> > --
> > Earth is a beta site.
>
> So .. I ended up switching to ETHERNET mode via mlxconfig -d PCID set
> LINK_TYPE_P1=2 LINK_TYPE_P2=2
> Oh, and I also set the MTU to 9000.
>
> After that.. the flood gates opened massively.
>
> root@R720-Storage:~ # iperf -c 10.255.255.55 -P4
> ------------------------------------------------------------
> Client connecting to 10.255.255.55, TCP port 5001
> TCP window size: 1.01 MByte (default)
> ------------------------------------------------------------
> [  6] local 10.255.255.22 port 62256 connected with 10.255.255.55 port 5001
> [  3] local 10.255.255.22 port 51842 connected with 10.255.255.55 port 5001
> [  4] local 10.255.255.22 port 53680 connected with 10.255.255.55 port 5001
> [  5] local 10.255.255.22 port 33455 connected with 10.255.255.55 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  6]  0.0-10.0 sec  24.6 GBytes  21.1 Gbits/sec
> [  3]  0.0-10.0 sec  23.8 GBytes  20.5 Gbits/sec
> [  4]  0.0-10.0 sec  33.4 GBytes  28.7 Gbits/sec
> [  5]  0.0-10.0 sec  32.9 GBytes  28.3 Gbits/sec
> [SUM]  0.0-10.0 sec   115 GBytes  98.5 Gbits/sec
> root@R720-Storage:~ #
>
> root@compute720:~# iperf -c 10.255.255.22 -P4
> ------------------------------------------------------------
> Client connecting to 10.255.255.22, TCP port 5001
> TCP window size:  325 KByte (default)
> ------------------------------------------------------------
> [  5] local 10.255.255.55 port 50022 connected with 10.255.255.22 port 5001
> [  3] local 10.255.255.55 port 50026 connected with 10.255.255.22 port 5001
> [  6] local 10.255.255.55 port 50024 connected with 10.255.255.22 port 5001
> [  4] local 10.255.255.55 port 50020 connected with 10.255.255.22 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  5]  0.0-10.0 sec  27.4 GBytes  23.5 Gbits/sec
> [  3]  0.0-10.0 sec  26.2 GBytes  22.5 Gbits/sec
> [  6]  0.0-10.0 sec  26.8 GBytes  23.1 Gbits/sec
> [  4]  0.0-10.0 sec  26.0 GBytes  22.3 Gbits/sec
> [SUM]  0.0-10.0 sec   106 GBytes  91.4 Gbits/sec
> root@compute720:~#
>
> I should point out that before doing this, while still in IB mode with
> datagram mode, I disabled SMT and set the power profile to performance on
> both boxes. That moved me up to 10-12 Gbit/sec; nothing like the change to
> ethernet, with which it looks like I can now fill the pipe.
>
> Also note that a single connection doesn't do more than 25-ish Gbit/sec.
>
> Back to SATA being the bottleneck, but at least if it's coming out of the
> cache there should be more than enough network IO.
>
> Oh, one last thing: I thought I read somewhere that you needed a switch to
> do ethernet mode. That doesn't seem to be the case. I haven't shut down
> opensm yet, but I'll try that later, as I'm assuming I no longer need it.
>
> w00t!
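P.S. For anyone else flipping a back-to-back pair over, the switch boils
down to roughly the following (written out as a sketch, not a transcript:
the mlxconfig device argument is a placeholder, mce0 is just what mlx5en
names the port on my FreeBSD box, the Linux interface name will differ, and
the opensm part is an assumption I haven't verified yet).

  # flip both ports from InfiniBand (1) to Ethernet (2); the new link type
  # only takes effect after a reboot/power cycle
  mlxconfig -d <pci-device> set LINK_TYPE_P1=2 LINK_TYPE_P2=2

  # FreeBSD side: bring the mlx5en interface up with a 9000 MTU
  ifconfig mce0 mtu 9000 up

  # Linux side: same idea, different tool
  ip link set <iface> mtu 9000 up

  # opensm only manages an InfiniBand fabric, so with both ports in ethernet
  # mode it should be safe to kill it -- still on my to-do list to confirm
  pkill opensm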