From: Rick Macklem
To: Andrew Vylegzhanin
CC: "Rodney W. Grimes", "freebsd-fs@freebsd.org", "freebsd-infiniband@freebsd.org"
Subject: Re: NFS + Infiniband problem
Date: Wed, 31 Oct 2018 02:53:49 +0000

Andrew Vylegzhanin wrote:
>> >> Some details:
>> >>
>> >> [root@node4 ~]# mount_nfs -o wsize=30000 -o proto=tcp 10.0.2.1:/zdata2 /mnt
>> > ^^^^^^^^^^^^
>
>Again after some tests.
>I've tried 4096, 8192, 16384 wsize/rsize. Only the 4096 value gives a
>measurable result, and it's extremely slow: ~10-16 MB/s for writing
>(depending on the number of threads), 10-12 MB/s for reading. With other
>values NFS hangs (or almost hangs: a couple of kB/s on average).
>
>Changing sysctl net.inet.tcp.tso=0 (w/o reboot) on both sides had no effect.
>
>AFAIK, the infiniband interface has no option for TSO, LRO:
>ib0: flags=8043 metric 0 mtu 65520
>        options=80018
>        lladdr 80.0.2.8.fe.80.0.0.0.0.0.0.e4.1d.2d.3.0.50.df.51
>        inet 10.0.2.1 netmask 0xffffff00 broadcast 10.0.2.255
>        nd6 options=29
>
>BTW, here is my sysctl.conf file with optimisation for congestion control
>and tcp buffers on 10/40 Gbit/s links (the server also had a 40 Gbit/s
>Intel ixl ethernet):
>
>kern.ipc.maxsockbuf=16777216
>net.inet.tcp.sendbuf_max=16777216
>net.inet.tcp.recvbuf_max=16777216
>net.inet.tcp.sendbuf_auto=1
>net.inet.tcp.recvbuf_auto=1
>net.inet.tcp.sendbuf_inc=16384
>net.inet.tcp.recvbuf_inc=524288
>net.inet.tcp.cc.algorithm=htcp

Well, I'm not familiar with the current TCP stack (and, as noted before, I
know nothing about InfiniBand). All I can suggest is testing with the default
congestion control algorithm. (I always test with the default, which appears
to be newreno.)
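
A minimal sketch of that test as a sysctl.conf change, assuming a stock
kernel where newreno is the default. Only the congestion control line
differs from the quoted file; leaving the buffer settings alone is a choice
made here so that one variable changes at a time. The same value can be set
without a reboot via "sysctl net.inet.tcp.cc.algorithm=newreno" on both
sides, and "sysctl net.inet.tcp.cc.available" lists what is loaded. The
setting is the default for new connections, so remounting before retesting
is probably needed.

```
# /etc/sysctl.conf fragment (sketch, not a verified fix)
# Assumption: newreno is the stock default congestion control algorithm.
#net.inet.tcp.cc.algorithm=htcp
net.inet.tcp.cc.algorithm=newreno
```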

NFS traffic looks very different from a typical use of TCP: lots of small TCP
segments in both directions, interspersed with some larger ones (the write
requests or read replies).

rick