Date: Fri, 2 Jul 2021 02:40:49 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Peter Eriksson <pen@lysator.liu.se>
Cc: freebsd-net <freebsd-net@freebsd.org>
Subject: Re: RFC: NFS trunking (multiple TCP connections for a mount
Message-ID: <YQXPR0101MB09680E95ACA0D07F817688AEDD1F9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YQXPR0101MB0968C4F4865ADA058CCEEA17DD009@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
References: <YQXPR0101MB0968DC173855A82AAF45F08FDD039@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>,<362300CE-30DA-4552-A3E4-0F3DFE385B2A@lysator.liu.se>,<YQXPR0101MB0968C4F4865ADA058CCEEA17DD009@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
Rick Macklem wrote:
>In case anyone is interested in testing and/or reviewing the patch,
>it is at https://reviews.freebsd.org/D30970.
>
>Only lightly tested at this point.
>
>The NFS mount option is "nconnect=<N>", where 2 <= N <= 16,
>same as Linux. (I haven't done a man page patch yet.)
I have updated the patch so that the original TCP connection is
used for RPCs that consist of small messages (therefore not needing
much network bandwidth) and the RPCs (Read/Readdir/Write) that
use larger messages are sent on the N-1 additional TCP connections
in a round robin fashion.

The message below was posted a couple of days ago on linux-nfs@vger.kernel.org.
It might be unfair to put it here, out of context, but I think it at least
suggests that separating the larger RPC messages from the small ones
(mostly Lookup/Getattr/Access metadata related RPCs) may be useful
under certain circumstances.
> The original issue described was how a high read/write process on the
> client could slow another process trying to do heavy metadata
> operations (like walking the filesystem). Using a different mount to
> the same multi-homed server seems to help a lot (probably because of
> the independent slot table).
--> For this implementation, there is no separate session/slot table.
(Note that each I/O RPC only uses one table slot.)

I did not make this small vs large RPCs on a separate TCP connection
a separate option, since I believe there are already too many mount options.
If others feel it should be a separate mount option, please speak up.

The phabricator patch has been updated. Please test/review/comment.

Thanks, rick

Thanks everyone, for your input, rick

________________________________________
From: Peter Eriksson <pen@lysator.liu.se>
Sent: Tuesday, June 29, 2021 5:11 AM
To: Rick Macklem
Cc: freebsd-net
Subject: Re: RFC: NFS trunking (multiple TCP connections for a mount

> I don't understand how multiple TCP connections to the same
> server IP address will distribute the load across multiple network
> interfaces?
> I thought that lagg would have handled this?

A lagg typically keeps all data in a TCP stream on a specific lagg member (depending on how the lagg is set up, unless you select the “roundrobin” option in freebsd - don’t do that unless you like out-of-order packets…)

Network equipment with laggs typically hash the IP streams over the lagg members based on MAC addresses (source&target), IP addresses (source&target) and port numbers.

(We have been diagnosing a fun problem locally where we see packet losses/performance drops over our internal backbone network for certain combinations of odd/even IP addresses/port numbers when things pass certain SPB “routers” (which typically hash the streams over many “channels” between routers)… Fun fun. :-)

I think the multiple NFS TCP streams could make for some nice performance improvements in certain cases. And it would be a more generalisation of having multiple streams between two hosts - one-or-many over IPv4 and one-or-many over IPv6 at the same time. Windows SMB has a similar feature.

Just avoid the Linux NFS mounting deadlock issue with “down” servers please :-)

- Peter
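For readers following the thread, the connection selection Rick describes (small RPCs on the original TCP connection, Read/Readdir/Write spread round robin over the N-1 additional connections) can be illustrated roughly as below. This is only a sketch in Python, not code from the D30970 patch: the class name ConnectionPicker, the method pick(), the use of index 0 for the original connection, and the string RPC names are all assumptions made for illustration. Only the 2 <= N <= 16 range and the Read/Readdir/Write split come from the message.

```python
# RPCs that the message says use larger messages.
LARGE_RPCS = {"Read", "Readdir", "Write"}


class ConnectionPicker:
    """Hypothetical helper: choose a TCP connection index for each RPC.

    Index 0 stands for the original connection; indices 1..N-1 stand for
    the additional connections. This numbering is an assumption.
    """

    def __init__(self, nconnect: int) -> None:
        # The message gives the mount option's range as 2 <= N <= 16.
        if not 2 <= nconnect <= 16:
            raise ValueError("nconnect must be between 2 and 16")
        self.nconnect = nconnect
        self._next = 0  # round-robin cursor over the additional connections

    def pick(self, rpc_name: str) -> int:
        """Return 0 for small RPCs, else the next of connections 1..N-1."""
        if rpc_name not in LARGE_RPCS:
            return 0
        index = 1 + self._next
        self._next = (self._next + 1) % (self.nconnect - 1)
        return index


if __name__ == "__main__":
    picker = ConnectionPicker(nconnect=3)
    for rpc in ["Getattr", "Read", "Write", "Lookup", "Read"]:
        print(rpc, "-> connection", picker.pick(rpc))
```

With nconnect=3, metadata RPCs such as Getattr and Lookup always map to the original connection, while successive I/O RPCs alternate between the two additional ones. Since each TCP connection has its own port pair, a lagg that hashes on addresses and ports, as Peter describes, may place these streams on different members.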