Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 20 Aug 2014 16:41:26 +0200
From:      Luigi Rizzo <rizzo@iet.unipi.it>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
Message-ID:  <CA%2BhQ2%2BgfFYg01KyP34zjY7tXv23ZHOg%2BhDeo3s5UoHtQS3cn2w@mail.gmail.com>
In-Reply-To: <53F4A2AF.6080102@selasky.org>
References:  <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> <CA%2BhQ2%2BhGP5qxRZoDUg1y1dO5juytveT6cW83YgApecRu4vk_dQ@mail.gmail.com> <53F4A2AF.6080102@selasky.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Aug 20, 2014 at 3:29 PM, Hans Petter Selasky <hps@selasky.org>
wrote:

> Hi Luigi,
>
>
> On 08/20/14 11:32, Luigi Rizzo wrote:
>
>> On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky <hps@selasky.org>
>> wrote:
>>
>>  Hi,
>>>
>>> A month has passed since the last e-mail on this topic, and in the
>>> meanwhile some new patches have been created and tested:
>>>
>>> Basically the approach has been changed a little bit:
>>>
>>> - The creation of hardware transmit rings has been made independent of
>>> the
>>> TCP stack. This allows firewall applications to forward traffic into
>>> hardware transmit rings aswell, and not only native TCP applications.
>>> This
>>> should be one more reason to get the feature into the kernel.
>>> =E2=80=8B...
>>>
>> =E2=80=8Bthe patch seems to include only part of the generic code (ie no=
 ioctls
>> for manipulating the rates, no backend code). Do i miss something ?
>>
>
> The IOCTLs for managing the rates are:
>
> SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL
>
> And they go to the if_ioctl callback.=E2=80=8B


=E2=80=8Bi really think these new 'advanced' features should go
through some ethtool-like API, not more ioctls.
We have a strong need to design and implement such
an API also to have a uniform mechanism to manipulate
rss, queues and other NIC features.


=E2=80=8B...=E2=80=8B
>
>
>
>> I have a few comments/concerns:
>>
>> + looks like flowid and txringid are overlapped in scope,
>>    both will be used (in the backend) to select a specific
>>    tx queue. I don't have a solution but would like to know
>>    how do you plan to address this -- does one have priority
>>    over the other, etc.
>>
>
> Not 100% . In some cases the flowID is used differently than the txringid=
,
> though it might be possible to join the two. Would need to investigate
> current users of the flow ID.


=E2=80=8Bin some 10G drivers i have seen, at the driver
level the flowid is used on the tx path to assign
packets to a given =E2=80=8Btx queue, generally to improve
cpu affinity. Of course some applications
may want a true flow classifier so they do not
have to re-do the classification multiple times.
But then, we have a ton of different classifiers
with the same need -- e.g. ipfw dynamic rules,
dummynet pipe/queue id, divert ports...
Pipes are stored in mtags, which are very expensive
so i do see a point in embedding them in the mbufs,
it's just that going this path there is no end
to the list.



>
>  + related to the above, a (possibly unavoidable) side effect
>>    of this type of changes is that mbufs explode with custom fields,
>>    so if we could perhaps make one between flowid and txringid,
>>    that would be useful.
>>
>
> Right, but ratecontrol is an in-general useful feature, especially for
> high throughput networks, or do you think otherwise?


of course i think =E2=80=8Bthe feature is useful,
but see the previous point. We should find a
way to manage it (and others) that does not pollute
or require continuous changes to the struct mbuf.

>
>
>
>> + i am not particularly happy about the explosion of ioctls for
>>    setting and getting rates. Next we'll want to add scheduling,
>>    and intervals, and queue sizes and so on.
>>    For these commands outside the critical path it would be
>>    preferable a single command with an extensible structure.
>>    Bikeshed material i am sure.
>>
>
> There is only one IOCTL in the critical path and that is the IOCTL to
> change or update the TX ring ID. The other IOCTLs are in the non-critical
> path towards the if_ioctl() callback.
>
> If we can merge the flowID and the txringid into one field, would it be
> acceptable to add an IOCTL to read/write this value for all sockets?


=E2=80=8Bsee above. i'd prefer an ethtool-like solution.

cheers
luigi
=E2=80=8B



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CA%2BhQ2%2BgfFYg01KyP34zjY7tXv23ZHOg%2BhDeo3s5UoHtQS3cn2w>