From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Rodney W. Grimes", Andrew Vylegzhanin
CC: "freebsd-fs@freebsd.org", "freebsd-infiniband@freebsd.org"
Subject: Re: NFS + Infiniband problem
Date: Mon, 29 Oct 2018 15:25:07 +0000

Rodney W. Grimes wrote:
Andrew Vylegzhanin wrote:
>> Hello everyone,
>>
>> I have several FreeBSD machines connected via an Infiniband network (FDR
>> switch Mellanox SW3036 + ConnectX-3 VPI cards).
>> One of them is a NAS server with multiple ZFS pools.
>>
>> All kernels (11.2-RELEASE on clients and 12.0-BETA1 (11.2 also tried) on
>> server) are with infiniband connected mode (option IPOIB_CM, option SDM)
>> and world with OFED stack support (WITH_OFED='yes').
>>
>> File transfers via FTP or SSH between server and clients work almost
>> flawlessly (~12 Gbit/s).
>>
>> But when I try to copy in/out some significant data via an NFS share mounted
>> on clients, NFS I/O hangs entirely or gets extremely slow (a couple of kB/s)
>> after an uncertain amount of copied data. For example, on
>> one node I can copy a 1GB file, and afterwards NFS hangs on a file of size 30 kb.
>>
>> Some details:
>> [root@node4 ~]# mount_nfs -o wsize=30000 -o proto=tcp 10.0.2.1:/zdata2 /mnt
>                               ^^^^^^^^^^^^
>I am not sure what the interaction between page sizes, TSO needs,
>buffer needs and all that are but I always use a power of 2 wsize
>and rsize.

They should always be a power of 2. I think the code clips the value, but it
might only clip to a multiple of 512. If it didn't clip this down to 16384, then
that would definitely be a problem. Also, normally the same size for rsize and
wsize is used.
If you don't do that, you end up with weird sized blocks in the buffer
cache. I think it still works when this is done, but it could cause performance
hits. Probably doesn't matter for a simple performance test.
(You can find out what options it is actually using by typing "nfsstat -m"
after doing the mount.)

> You might try that. And as Rick suggested, turn off
>TSO, if you can. Is infiniband using RDMA to do this? If so, then
>the page size stuff is probably very important, use multiples of
>4096 only.

RDMA is not supported by the FreeBSD NFS client.
There is a way to use RDMA on a separate connection with NFSv4.1 or later,
but I've never written code for that. (Not practical to try to implement without
access to hardware that does it.)

rick
[performance stuff snipped]
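
To illustrate the sizing advice above, here is a rough sketch of picking a power-of-2 value by hand and using it for both rsize and wsize. The helper name pow2_floor is hypothetical, and this is not the kernel's clipping logic (which, as noted, may only clip to a multiple of 512). It just finds the largest power of 2 that does not exceed the requested size. The server address and mount point are the ones from the original report, and the mount command is printed, not run.

```sh
#!/bin/sh
# Hypothetical helper: print the largest power of 2 that is <= $1.
# Illustration only; it does not reproduce what the NFS client does internally.
pow2_floor() {
    n=$1
    p=1
    while [ $((p * 2)) -le "$n" ]; do
        p=$((p * 2))
    done
    echo "$p"
}

# 30000 was the wsize requested in the original mount command.
size=$(pow2_floor 30000)

# Print the mount command with the same value for rsize and wsize.
echo "mount_nfs -o rsize=${size} -o wsize=${size} -o proto=tcp 10.0.2.1:/zdata2 /mnt"
```

For a requested size of 30000 this yields 16384. After doing the real mount, "nfsstat -m" shows which options are actually in effect.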