Date: Wed, 20 Aug 2014 16:41:26 +0200
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Hans Petter Selasky <hps@selasky.org>
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
Message-ID: <CA%2BhQ2%2BgfFYg01KyP34zjY7tXv23ZHOg%2BhDeo3s5UoHtQS3cn2w@mail.gmail.com>
In-Reply-To: <53F4A2AF.6080102@selasky.org>
References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> <CA%2BhQ2%2BhGP5qxRZoDUg1y1dO5juytveT6cW83YgApecRu4vk_dQ@mail.gmail.com> <53F4A2AF.6080102@selasky.org>

On Wed, Aug 20, 2014 at 3:29 PM, Hans Petter Selasky <hps@selasky.org> wrote:

> Hi Luigi,
>
> On 08/20/14 11:32, Luigi Rizzo wrote:
>
>> On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky <hps@selasky.org>
>> wrote:
>>
>>> Hi,
>>>
>>> A month has passed since the last e-mail on this topic, and in the
>>> meanwhile some new patches have been created and tested:
>>>
>>> Basically the approach has been changed a little bit:
>>>
>>> - The creation of hardware transmit rings has been made independent of
>>> the TCP stack. This allows firewall applications to forward traffic into
>>> hardware transmit rings aswell, and not only native TCP applications.
>>> This should be one more reason to get the feature into the kernel.
>>> ...
>>>
>> the patch seems to include only part of the generic code (ie no ioctls
>> for manipulating the rates, no backend code). Do i miss something ?
>>
>
> The IOCTLs for managing the rates are:
>
> SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL
>
> And they go to the if_ioctl callback.

i really think these new 'advanced' features should go through
some ethtool-like API, not more ioctls.
We have a strong need to design and implement such
an API also to have a uniform mechanism to manipulate
rss, queues and other NIC features.

...

>> I have a few comments/concerns:
>>
>> + looks like flowid and txringid are overlapped in scope,
>> both will be used (in the backend) to select a specific
>> tx queue. I don't have a solution but would like to know
>> how do you plan to address this -- does one have priority
>> over the other, etc.
>>
>
> Not 100% . In some cases the flowID is used differently than the txringid,
> though it might be possible to join the two. Would need to investigate
> current users of the flow ID.

in some 10G drivers i have seen, at the driver level the flowid
is used on the tx path to assign packets to a given tx queue,
generally to improve cpu affinity. Of course some applications
may want a true flow classifier so they do not have to re-do
the classification multiple times. But then, we have a ton of
different classifiers with the same need -- e.g. ipfw dynamic rules,
dummynet pipe/queue id, divert ports...
Pipes are stored in mtags, which are very expensive so i do see
a point in embedding them in the mbufs, it's just that going
this path there is no end to the list.

>> + related to the above, a (possibly unavoidable) side effect
>> of this type of changes is that mbufs explode with custom fields,
>> so if we could perhaps make one between flowid and txringid,
>> that would be useful.
>>
>
> Right, but ratecontrol is an in-general useful feature, especially for
> high throughput networks, or do you think otherwise?

of course i think the feature is useful,
but see the previous point. We should find a way to manage
it (and others) that does not pollute or require continuous
changes to the struct mbuf.

>> + i am not particularly happy about the explosion of ioctls for
>> setting and getting rates. Next we'll want to add scheduling,
>> and intervals, and queue sizes and so on.
>> For these commands outside the critical path it would be
>> preferable a single command with an extensible structure.
>> Bikeshed material i am sure.
>>
>
> There is only one IOCTL in the critical path and that is the IOCTL to
> change or update the TX ring ID. The other IOCTLs are in the non-critical
> path towards the if_ioctl() callback.
>
> If we can merge the flowID and the txringid into one field, would it be
> acceptable to add an IOCTL to read/write this value for all sockets?

see above. i'd prefer an ethtool-like solution.

cheers
luigi
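
A minimal sketch of what "a single command with an extensible structure"
could look like, written as plain user-space C rather than kernel code.
Every name in it (struct nicctl_req, nicctl_handle, the NICCTL_* constants)
is hypothetical and comes neither from the patch nor from FreeBSD. The
assumption is that the request carries its own size, so that fields can be
appended later without adding new commands; the in-memory ring table only
stands in for a driver backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; nothing here exists in FreeBSD or the patch. */
enum nicctl_op {
	NICCTL_RATE_SET = 1,	/* set the rate of a tx ring */
	NICCTL_RATE_GET = 2	/* read the rate of a tx ring back */
};

struct nicctl_req {
	uint32_t size;		/* bytes the caller filled in; lets the struct grow */
	uint32_t op;		/* one of enum nicctl_op */
	uint32_t ring_id;	/* which tx ring the request refers to */
	uint32_t reserved;	/* keeps rate_bps 8-byte aligned */
	uint64_t rate_bps;	/* rate in bits per second */
	/* later versions would append fields (burst, queue size, ...) here */
};

/* Size of the first version of the request; stays fixed as the struct grows. */
#define NICCTL_V1_SIZE \
	(offsetof(struct nicctl_req, rate_bps) + sizeof(uint64_t))

#define NICCTL_MAX_RINGS 8

/* Stand-in for driver state: one configured rate per tx ring. */
static uint64_t ring_rate_bps[NICCTL_MAX_RINGS];

/*
 * Single entry point for all non-critical-path requests.
 * Returns 0 on success and -1 on a malformed or unsupported request.
 */
int
nicctl_handle(void *buf, size_t len)
{
	struct nicctl_req req;
	uint32_t size;
	size_t n;

	assert(buf != NULL);
	if (len < sizeof(size))
		return (-1);
	memcpy(&size, buf, sizeof(size));
	/* Reject requests shorter than version 1 or longer than the buffer. */
	if (size < NICCTL_V1_SIZE || size > len)
		return (-1);

	/*
	 * Copy only the part both sides understand. Fields a newer caller
	 * appended beyond sizeof(req) are ignored in this sketch.
	 */
	n = size < sizeof(req) ? size : sizeof(req);
	memset(&req, 0, sizeof(req));
	memcpy(&req, buf, n);

	if (req.ring_id >= NICCTL_MAX_RINGS)
		return (-1);

	switch (req.op) {
	case NICCTL_RATE_SET:
		ring_rate_bps[req.ring_id] = req.rate_bps;
		return (0);
	case NICCTL_RATE_GET:
		req.rate_bps = ring_rate_bps[req.ring_id];
		memcpy(buf, &req, n);	/* hand the result back to the caller */
		return (0);
	default:
		return (-1);
	}
}
```

A caller fills in size with the size of the structure it was compiled
against. A caller built against a larger, later version is still accepted
because the handler copies only the prefix it knows about; whether unknown
trailing fields should be ignored or rejected is a design choice this sketch
does not settle.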