Date: Fri, 11 Sep 2020 17:15:42 +0000
From: "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>
To: "sthaug@nethelp.no" <sthaug@nethelp.no>
Cc: "net@FreeBSD.org" <net@FreeBSD.org>, "transport@freebsd.org" <transport@freebsd.org>
Subject: RE: Socket option to configure Ethernet PCP / CoS per-flow
Message-ID: <SN4PR0601MB37283BD2AFBC97768D92D90886240@SN4PR0601MB3728.namprd06.prod.outlook.com>
In-Reply-To: <20200911.185432.122001633.sthaug@nethelp.no>
References: <SN4PR0601MB372898DF9D2838392B22C7AF86240@SN4PR0601MB3728.namprd06.prod.outlook.com> <20200911.185432.122001633.sthaug@nethelp.no>
Thank you for the quick feedback.

On a related note: it just occurred to me that the PCP functionality could be extended to make more effective use of PFC (priority flow control) without explicitly managing it directly at the application level.

Right now, PFC typically degenerates to good old flow control, as all traffic is handled in the default class (0, or whatever is set up using the IOCTL interface API).

Typically, the different Ethernet classes come with a notion of prioritization between them: traffic in a "higher" class may be forwarded before traffic in a lower class. But that is not a strong requirement. Using WRR with 1/8th of the bandwidth "reserved" for each class in a switch, and assigning flows to a random PCP value, PFC could work in a more scalable fashion, blocking only the fraction of traffic that is actually building queues (because it has to go over a lower-bandwidth link, or to a NIC that excessively pauses its ingress), thus reducing the chance of congestion trees forming...

E.g. PCP runs from 0 (default) to 7.

Adding a socket option to explicitly assign traffic to one of these classes would allow testing and configuring applications to make use of the "real" prioritization capabilities of modern switches.

And what I was just pondering was a special interface level setting (e.g. 8), which results in a socket picking a "random" value when created, to distribute packets across all the queues available in hardware, so that PFC no longer collapses in effect to old FC-style "on"/"off" for all traffic...

Perhaps someone here has experience with congestion tree formation in multi-hop switching environments, and can comment on whether the above approach would be feasible to address that FC issue?

Richard Scheffenegger

-----Original Message-----
From: sthaug@nethelp.no <sthaug@nethelp.no>
Sent: Friday, 11 September 2020 18:55
To: Scheffenegger, Richard <Richard.Scheffenegger@netapp.com>
Cc: net@FreeBSD.org; transport@freebsd.org
Subject: Re: Socket option to configure Ethernet PCP / CoS per-flow

> However, while this allows all traffic sent via a specific interface to be marked with a PCP (priority code point), it defeats the purpose of PFC (priority flow control), which works by individually pausing different queues of an interface, provided there is an actual differentiation of traffic into those various classes.
>
> Internally, we have added a socket option (SO_VLAN_PCP) to change the PCP specifically for traffic associated with that socket, to be marked differently from whatever the interface default is (unmarked, or the default PCP).
>
> Does the community see value in having such a socket option widely available? (Linux currently doesn't seem to have a per-socket option either, only a per-interface IOCTL API).

I've been doing quite a bit of network testing using iperf3 and similar tools, and have wanted this type of functionality since the interface option became available. Having this at the socket level would make it possible to teach iperf3, ping and other tools to set PCP, and would simplify testing of L2 networks.

So the answer is a definite yes! This would be valuable.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no
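
For illustration only, the "random" setting (e.g. 8) pondered earlier in this message could be sketched roughly as below. This is not the SO_VLAN_PCP implementation mentioned in the thread: the names PCP_RANDOM, pcp_resolve() and vlan_tci() are hypothetical, and rand() stands in for whatever random source a real kernel would use. The only fixed fact assumed here is the 802.1Q tag layout, where the PCP occupies the top 3 bits of the 16-bit TCI and the VLAN ID the low 12 bits.

```c
#include <stdint.h>
#include <stdlib.h>

#define PCP_MIN    0	/* default class */
#define PCP_MAX    7	/* highest 802.1Q priority code point */
#define PCP_RANDOM 8	/* hypothetical "pick a class for me" value */

/*
 * Turn the value requested by an application into a concrete PCP.
 * 0..7 are used as given; PCP_RANDOM picks one class per call, so a
 * socket would call this once at creation time and keep the result.
 * Returns -1 for anything else.
 */
int
pcp_resolve(int requested)
{
	if (requested >= PCP_MIN && requested <= PCP_MAX)
		return (requested);
	if (requested == PCP_RANDOM)
		return (rand() % (PCP_MAX + 1));	/* stand-in PRNG */
	return (-1);
}

/*
 * Build the 802.1Q Tag Control Information field:
 * PCP in bits 15..13, DEI (bit 12) left at 0, VLAN ID in bits 11..0.
 */
uint16_t
vlan_tci(int pcp, int vid)
{
	return ((uint16_t)(((pcp & 0x7) << 13) | (vid & 0x0fff)));
}
```

With something like this, each socket created with the "random" setting would land in one of the eight classes, so a PFC pause on one class would stop only the flows that happened to draw that class rather than all traffic on the link.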