From owner-freebsd-fs@freebsd.org Mon Oct 29 15:48:50 2018
From: "Rodney W. Grimes"
Message-Id: <201810291548.w9TFmjaD057417@pdx.rh.CN85.dnsmgr.net>
Subject: Re: NFS + Infiniband problem
To: Rick Macklem
Date: Mon, 29 Oct 2018 08:48:45 -0700 (PDT)
CC: Andrew Vylegzhanin, "freebsd-fs@freebsd.org", "freebsd-infiniband@freebsd.org"

> Rodney W. Grimes wrote:
> Andrew Vylegzhanin wrote:
> >> Hello everyone,
> >>
> >> I have a several FreeBSD machines connected via Infiniband network ( FDR
> >> switch Mellanox SW3036 + ConnectX-3 VPI cards ).
> >> One of them is a NAS-server with multiple ZFS pools.
> >>
> >> All kernels (11.2-RELEASE on clients and 12.0-BETA1 (11.2 also tried) on
> >> server) are with infiniband connected mode (option IPOIB_CM, option SDM)
> >> and world with OFED stack support. (WITH_OFED='yes').
> >>
> >> File transfers via FTP or SSH between server and clients works almost
> >> flawless ( ~ 12 Gbit/s ).
> >>
> >> But when I try to copy in/out some significant data via NFS share mounted
> >> on clients, NFS i/o hangs at all or got extremely slow (couple kB/s)
> >> transfer speed after uncertain amount of copied data. For example, on the
> >> one node I can copy 1GB file, and after NFS hang on file with size 30 kb.
> >>
> >> Some details:
> >> [root@node4 ~]# mount_nfs -o wsize=30000 -o proto=tcp 10.0.2.1:/zdata2 /mnt
> >                               ^^^^^^^^^^^^
> >I am not sure what the interaction between page sizes, TSO needs,
> >buffer needs and all that are but I always use a power of 2 wsize
> >and rsize.
> They should always be a power of 2. I think the code clips the value, but it might
> only clip to a multiple of 512. If it didn't clip this down to 16384, then that
> would definitely be a problem. Also, normally the same size for rsize and wsize
> is used. If you don't do that, you end up with weird sized blocks in the buffer cache.
> I think it still works when this is done, but could cause performance hits.
> Probably doesn't matter for a simple performance test.
> (You can find out what options it is actually using by typing "nfsstat -m" after doing the mount.)
>
> > You might try that. And as Rick suggested, turn off
> >TSO, if you can. Is infiniband using RDMA to do this, if so then
> >the page size stuff is probably very important, use multiples of
> >4096 only.
> RDMA is not supported by the FreeBSD NFS client. There is a way to use RDMA
> on a separate connection with NFSv4.1 or later, but I've never written code
> for that. (Not practical to try to implement without access to hardware that
> does it.)
It would be very easy to arrange for a pair of PCIe 10G IB cards and a cable
to go between them, if that would be of use to you in someday doing some of
this work, or for that matter even just for playing with NFS over IB.

> rick

--
Rod Grimes                                                 rgrimes@freebsd.org