From owner-freebsd-net@freebsd.org Sat Aug 29 02:10:52 2020
From: Rick Macklem
To: "Scheffenegger, Richard"
CC: Michael Tuexen, "freebsd-net@freebsd.org"
Subject: Re: TFO for NFS
Date: Sat, 29 Aug 2020 02:10:45 +0000
List-Id: Networking and TCP/IP with FreeBSD

Scheffenegger, Richard wrote:
>I know, NFS TCP sessions are some of the most long-lived sessions in regular
>use.
Ok, so I'll admit I can't wrap my head around this.
It is way out of my area of expertise (so I've added freebsd-net@ to the cc), but
it seems to me that NFS is about the least appropriate use for TFO.

It seems that, for TFO to be useful, the application needs to be making frequent,
short-lived TCP connections, often across a WAN/the Internet.
NFS mounts do neither of the above.
- They, as we've noted, normally only do a TCP connect at mount time.
- They are usually run in low-latency LAN environments. (High-latency connections
  hammer NFS performance, due to its frequent small RPCs whose replies the client
  must wait for synchronously.)

All you might save is one RTT. Take a look at how many RPCs (each with an RTT)
happen on an active NFS mount.

>My rationale is two-fold:
>
>First, having a relatively high-profile use of the TFO option in the core OS modules
>will definitely expose that feature to at least some use.
Well, I don't think it is NFS's job to expose a feature that is not useful for it.
(If you were to implement this and benchmarking showed a significant
improvement in elapsed time to do an NFS mount, then that could be a
different story.)

>Second, in case of a network disconnect (or, something my company does,
>that would be most comparable to unassigning and reassigning the server IP
>address between different physical ports), while there is IO load, TFO may reduce
>(ever so slightly) the latency impact of the enqueued IOs.
I'm not sure I understand this. NFS always uses port# 2049.
If you are referring to the host IP address, then wouldn't that be handled via
ARP and routing?
(Does this require a fresh TCP connection to the same server
IP address?)

>My plan is first to simply enable the socket option - that should result in TFO
>getting negotiated, but no actual latency improvement, while the traditional
>connect() sequence to set up a TCP session is done, from the client side; the
>server side will not need to change, and can send out initial data right away with
>the syn/ack (at least in theory, if the syn contained a full NFS request that can be
>responded to).
>
>Changing the client to make use of the SYN+data facilities would be a 2nd step.
Well, during an NFS mount, there is first a TCP connection made by
mount_nfs in userspace and it is only used for a single Null RPC.
--> This checks that the server is up and running.
Then mount_nfs does nmount(2), which will create a second TCP connection
which is normally used until unmount.
--> All you save is the RTT for the first RPC of many.

>Also, I shall make this configurable, since some network devices may inhibit TFO
>packets, incurring a delay (but that's mostly public internet, not private networks
>where NFS is being used).
>Ideally with TFO defaulting to on (once it's working
>properly), but able to explicitly disable it for certain mounts.
NFS suffered TSO related bugs for several (> 5) years (and I wouldn't be surprised
if there are still net device drivers broken such that TSO must be disabled to make
NFS work ok on them).

As such, I get very nervous about this kind of thing.

Reliability always trumps performance when it comes to file system work.

Now, if you are interested in improving NFS performance over TCP, that
could be a very interesting project, but I doubt TFO would be relevant.
Especially when you look at long fat pipes (TCP connections with a large
delay * bandwidth), there is probably a lot that could be done.
--> Read-ahead, write-back algorithm changes. Read/Write data size.
     Throttling/congestion avoidance/window sizing in TCP.
     And the list goes on and on...

I do hope that NFS over TLS allows more use of NFS across the Internet,
so performance work related to NFS running on WAN/Internet connections
would be a great thing to do. (I'm not conversant with the current TCP stack,
so I'm not the guy to tackle this.)

rick

Richard Scheffenegger


-----Original Message-----
From: Rick Macklem
Sent: Friday, 28 August 2020 04:35
To: Scheffenegger, Richard ; rmacklem@freebsd.org
Cc: Michael Tuexen
Subject: Re: TFO for NFS

Well, you'll find the soconnect() stuff in sys/rpc/clnt_vc.c.
If you just want to play around with it, have fun.

As for this being useful in practice, that seems unlikely.
When the kernel RPC code uses TCP it establishes one TCP connection at mount
time and uses that connection until unmount unless the connection breaks somehow.
(A server will often disconnect after about 5 minutes of no activity on the
connection. This almost never happens for NFSv4, since the NFSv4 client does
an RPC every 30sec to maintain the lease against the server.)
--> A new TCP connection usually only happens after a
       network partition heals.
(There was a bug that caused reconnects during certain cases of signal
handling, but that was fixed about 3 years ago.)

rick

________________________________________
From: Scheffenegger, Richard
Sent: Thursday, August 27, 2020 6:29 PM
To: rmacklem@freebsd.org
Cc: Michael Tuexen
Subject: TFO for NFS

Hi Rick,

I've seen you are very active with the fbsd nfs code, having branched the
nfs-over-tls project.

Is anyone else contributing to this project yet?

After some discussion in today's freebsd-transport call with tuexen@, I was
wondering if the TCP Fast Open option could be added as a proof-of-concept
to the in-kernel RPC handler.
It may also be a nice augmentation of nfs-over-tls when available, to absorb
some of the added tls connection setup latency...

Right now, I am quite unfamiliar with all the rpc code, which appears to
handle all the basic plumbing of NFS.

Would you be interested in helping me with advice and reviews, in order to
try and get something around TFO working?

(The reduction in time-to-first-IO by 1 RTT may be helpful in some scenarios,
or when TLS 1.2 instead of 1.3 is in use, where speeding up the tls handshake
would potentially also be a nice property.)

Having said all this, for a client to actually make use of TFO, it is likely
that slight changes / additions need to be made, in order to send out the
initial data (TLS or RPC) right away before any soconnect(), using sendmsg()
instead - causing the socket itself to figure out that tcp can connect and
send data at the same time...

Best regards,

Richard Scheffenegger
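
The objection raised in this thread, that TFO saves at most one RTT while an
active mount performs many RPCs that each cost an RTT, can be put into rough
numbers. The sketch below is illustrative only: the function name and the
model (one handshake RTT, exactly one RTT per RPC, all RPCs issued
synchronously over a single connection) are simplifying assumptions, not
measurements and not anything stated by the participants.

```python
def tfo_saving_fraction(rpc_count: int, rtt_ms: float) -> float:
    """Return the fraction of total round-trip wait time that TFO could save.

    Hypothetical model (an assumption for illustration):
      - without TFO: one RTT for the TCP handshake plus one RTT per RPC
      - with TFO:    the first RPC rides on the SYN, so the handshake RTT
                     is no longer paid separately
    """
    if rpc_count < 1:
        raise ValueError("rpc_count must be at least 1")
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")

    without_tfo_ms = (1 + rpc_count) * rtt_ms  # handshake + RPCs
    with_tfo_ms = rpc_count * rtt_ms           # RPCs only
    return (without_tfo_ms - with_tfo_ms) / without_tfo_ms


if __name__ == "__main__":
    # RPC counts chosen arbitrarily to show the trend.
    for n in (1, 100, 10_000):
        fraction = tfo_saving_fraction(n, rtt_ms=0.5)
        print(f"{n:>6} RPCs: TFO saves {fraction:.4%} of round-trip wait time")
```

Under this model the saving is 1 / (rpc_count + 1) regardless of the RTT, so
it only matters for connections that carry very few RPCs, such as the
single Null RPC connection made by mount_nfs.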