Date: Mon, 29 Oct 2018 15:25:07 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>, Andrew Vylegzhanin <avv314@gmail.com>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, "freebsd-infiniband@freebsd.org" <freebsd-infiniband@freebsd.org>
Subject: Re: NFS + Infiniband problem
Message-ID: <YTOPR0101MB11622A9797376128D2FA182ADDF30@YTOPR0101MB1162.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <201810291506.w9TF6YAP057202@pdx.rh.CN85.dnsmgr.net>
References: <CA%2BBi_YiHoxFc3wsEPnMeBHWgW-nh6sXQCEgBTb=-nD6-XcjZ%2Bg@mail.gmail.com>, <201810291506.w9TF6YAP057202@pdx.rh.CN85.dnsmgr.net>

Rodney W. Grimes wrote:
Andrew Vylegzhanin wrote:
>> Hello everyone,
>>
>> I have several FreeBSD machines connected via an Infiniband network (FDR
>> switch Mellanox SW3036 + ConnectX-3 VPI cards).
>> One of them is a NAS server with multiple ZFS pools.
>>
>> All kernels (11.2-RELEASE on clients and 12.0-BETA1 (11.2 also tried) on
>> server) are with infiniband connected mode (option IPOIB_CM, option SDM)
>> and world with OFED stack support (WITH_OFED='yes').
>>
>> File transfers via FTP or SSH between server and clients work almost
>> flawlessly (~12 Gbit/s).
>>
>> But when I try to copy in/out some significant data via an NFS share mounted
>> on clients, NFS I/O hangs completely or gets extremely slow (a couple of kB/s)
>> after an uncertain amount of copied data. For example, on
>> one node I can copy a 1 GB file, and after that NFS hangs on a file of size 30 kB.
>>
>> Some details:
>> [root@node4 ~]# mount_nfs -o wsize=30000 -o proto=tcp 10.0.2.1:/zdata2 /mnt
>                               ^^^^^^^^^^^^
>I am not sure what the interaction between page sizes, TSO needs,
>buffer needs and all that are but I always use a power of 2 wsize
>and rsize.
They should always be a power of 2. I think the code clips the value, but it
might only clip to a multiple of 512. If it didn't clip this down to 16384, then
that would definitely be a problem.

Also, normally the same size is used for rsize and wsize. If you don't do that,
you end up with weird sized blocks in the buffer cache. I think it still works
when this is done, but it could cause performance hits. It probably doesn't
matter for a simple performance test.
(You can find out what options it is actually using by typing "nfsstat -m"
after doing the mount.)

> You might try that. And as Rick suggested, turn off
>TSO, if you can. Is infiniband using RDMA to do this, if so then
>the page size stuff is probably very important, use multiples of
>4096 only.
RDMA is not supported by the FreeBSD NFS client.
There is a way to use RDMA on a separate connection with NFSv4.1 or later, but
I've never written code for that. (It is not practical to try to implement it
without access to hardware that does it.)

rick
[performance stuff snipped]
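
As a rough illustration of the power-of-2 advice above, here is a minimal sketch that picks the largest power of two not exceeding a requested size and uses that one value for both rsize and wsize. The helper names are hypothetical, and this is not the clipping logic in the NFS client code; it only shows how one might choose the values by hand before running mount_nfs.

```python
def largest_power_of_two_at_most(n: int) -> int:
    """Return the largest power of two that is <= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("size must be a positive integer")
    # bit_length() is the position of the highest set bit, counted from 1.
    return 1 << (n.bit_length() - 1)


def mount_command(server_path: str, mount_point: str, requested: int) -> str:
    """Build a mount_nfs command line with identical rsize and wsize.

    Hypothetical helper: it only formats a string and mounts nothing.
    """
    size = largest_power_of_two_at_most(requested)
    return (f"mount_nfs -o rsize={size},wsize={size},proto=tcp "
            f"{server_path} {mount_point}")


if __name__ == "__main__":
    # 30000 is the wsize from the original report; it rounds down to 16384.
    print(mount_command("10.0.2.1:/zdata2", "/mnt", 30000))
```

After mounting with the printed command, "nfsstat -m" should show which sizes the client actually ended up using.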
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YTOPR0101MB11622A9797376128D2FA182ADDF30>