Date:      Thu, 14 Aug 2014 15:52:34 +0400
From:      "Alexander V. Chernikov" <melifaro@yandex-team.ru>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Luigi Rizzo <luigi@freebsd.org>, "Andrey V. Elsukov" <ae@freebsd.org>, freebsd-ipfw <freebsd-ipfw@freebsd.org>
Subject:   Re: [CFT] new tables for ipfw
Message-ID:  <53ECA302.8010100@yandex-team.ru>
In-Reply-To: <CA%2BhQ2%2BgxVYmXb%2BHOw4qUm6tykmEvBRkrV0RhZsnC6B08FLKvdA@mail.gmail.com>
References:  <53EBC687.9050503@yandex-team.ru>	<CA%2BhQ2%2Bg=A_rLHCVpBqn0AtFLu_gNGtzbmXvc-7JhpLqPSWw44A@mail.gmail.com>	<53EC880B.3020903@yandex-team.ru>	<CA%2BhQ2%2BiPPhy47eN0=KaSYBaNMdObY20yko7dRY1MMuP_mfnmOQ@mail.gmail.com>	<53EC960A.1030603@yandex-team.ru> <CA%2BhQ2%2BgxVYmXb%2BHOw4qUm6tykmEvBRkrV0RhZsnC6B08FLKvdA@mail.gmail.com>

On 14.08.2014 15:15, Luigi Rizzo wrote:
>
>
>
> On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov 
> <melifaro@yandex-team.ru <mailto:melifaro@yandex-team.ru>> wrote:
>
>     On 14.08.2014 14:44, Luigi Rizzo wrote:
>>
>>
>>
>>     On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov
>>     <melifaro@yandex-team.ru <mailto:melifaro@yandex-team.ru>> wrote:
>>
>>         On 14.08.2014 13:23, Luigi Rizzo wrote:
>>>
>>>
>>>
>>>         On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov
>>>         <melifaro@yandex-team.ru <mailto:melifaro@yandex-team.ru>>
>>>         wrote:
>>>
>>>             Hello list.
>>>
>>>             I've been hacking ipfw for a while and It seems there is
>>>             something ready to test/review in projects/ipfw branch.
>>>
>>>
>>>         this is a fantastic piece of work, thanks for doing it and for
>>>         integrating the feedback.
>>>
>>>         I have some detailed feedback that will send you privately,
>>>         but just a curiosity:
>>>
>>>             ...
>>>
>>>             Some examples (see ipfw(8) manual page for the description):
>>>
>>>             ...
>>>
>>>
>>>               ipfw table mi_test create type cidr algo "cidr:hash
>>>             masks=/30,/64"
>>>
>>>
>>>         why do we need to specify mask lengths in the above ?
>>         Well, since we're hashing IP we have to know mask to cut host
>>         bits in advance.
>>         (And the real reason is that I'm too lazy to implement
>>         hierarchical  matching (check /32, then /31, then /30) like
>>         how, for example,
>>
>>
>>     oh well for that we should use cidr:radix
>>
>>     Research results have never shown a strong superiority of
>>     hierarchical hash tables over good radix implementations,
>>     and in those cases one usually adopts partial prefix
>>     expansion so you only have, say, masks that are a
>>     multiple of 2..8 bits so you only need a small number of
>>     hash lookups.
>     Definitely, especially for IPv6. So I was actually thinking about
>     covering some special sparse cases (e.g. someone having a bunch of
>     /32 and a bunch of /30 and that's all).
>
>     Btw, since we're talking about "good radix implementation": what
>     license does DXR have? :)
>     Is it OK to merge it as another cidr implementation?
>
> "cidr" is a very ugly name, i'd rather use "addr"
Ok, no problem with that. "addr" really sounds better.
>
> DXR has a bsd license and of course it is possible to use it.
> You should ask Marko Zec for his latest version of the code
> (and probably make sure we have one copy of the code in the source tree).
Great! I'll ask him :)
>
> Speaking of features, one thing that would be nice is the ability
> for tables to reference the in-kernel tables (e.g. fibs, socket
> lists, interface lists...), perhaps in readonly mode.
> How complex do you think that would be ?
Implementing algo support for a particular provider like sockets/iflists 
shouldn't be hard. Most of an algorithm's complexity lies in table 
modifications. Here we only have to support lookup and dump operations, 
so it is a question of providing the necessary bindings to existing 
mechanisms (via some direct binding or by utilizing things like 
kernel_sysctl for dump support).

It looks like the following maps well to the current table concept:
* such tables are not created by default
* user issues
  `ipfw table kfib create type addr algo "addr:kernel fib=0"`
  or
  `ipfw table ktcp create type flow algo "flow:kernel_tcp fib=0"`
  or
  `ipfw table kiface create type iface algo "iface:kernel"`
* tables have a special "readonly" type, flush_all requests are ignored
* no state is stored internally

So the generic table handling code needs to be modified to support read-only 
tables (and to make more callbacks optional).
Additionally, we might need to proxy the "info" request into an algo callback 
(optional, "real" algorithms won't implement it) to be able to show the 
number of items (and some other info) to the user.
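
To make the idea a bit more concrete, here is a rough userland sketch of how generic code could treat a read-only, kernel-backed table. All names here (tbl_algo, tbl_add, kernel_algo and so on) are made up for illustration and are not the actual interface in projects/ipfw; a static array stands in for the kernel-owned data, and returning EROFS on modification is just one possible choice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algo descriptor: only lookup is mandatory. */
struct tbl_algo {
	const char *name;
	int readonly;	/* backed by kernel state, never modified by users */
	int (*lookup)(void *state, uint32_t key, uint32_t *val);
	int (*add)(void *state, uint32_t key, uint32_t val);	/* optional */
	int (*flush)(void *state);				/* optional */
	size_t (*count)(void *state);	/* optional "info" proxy */
};

struct tbl {
	const struct tbl_algo *algo;
	void *state;	/* NULL for kernel-backed tables: no internal state */
	size_t nitems;	/* kept by generic code for ordinary algorithms */
};

int
tbl_lookup(const struct tbl *t, uint32_t key, uint32_t *val)
{
	assert(t->algo->lookup != NULL);
	return (t->algo->lookup(t->state, key, val));
}

int
tbl_add(struct tbl *t, uint32_t key, uint32_t val)
{
	int error;

	/* Read-only tables and algos without an add callback reject changes. */
	if (t->algo->readonly || t->algo->add == NULL)
		return (EROFS);
	error = t->algo->add(t->state, key, val);
	if (error == 0)
		t->nitems++;
	return (error);
}

int
tbl_flush(struct tbl *t)
{
	int error;

	/* Flush requests on read-only tables are silently ignored. */
	if (t->algo->readonly || t->algo->flush == NULL)
		return (0);
	error = t->algo->flush(t->state);
	if (error == 0)
		t->nitems = 0;
	return (error);
}

size_t
tbl_count(const struct tbl *t)
{
	/* Proxy to the algo if it knows better, else use the generic counter. */
	if (t->algo->count != NULL)
		return (t->algo->count(t->state));
	return (t->nitems);
}

/* Stand-in for kernel-owned data (e.g. a FIB); purely illustrative. */
static const struct { uint32_t key, val; } fake_kernel_entries[] = {
	{ 0x0a000001, 1 },
	{ 0x0a000002, 2 },
};
#define	NENTRIES \
	(sizeof(fake_kernel_entries) / sizeof(fake_kernel_entries[0]))

static int
kern_lookup(void *state, uint32_t key, uint32_t *val)
{
	(void)state;
	for (size_t i = 0; i < NENTRIES; i++) {
		if (fake_kernel_entries[i].key == key) {
			*val = fake_kernel_entries[i].val;
			return (1);
		}
	}
	return (0);
}

static size_t
kern_count(void *state)
{
	(void)state;
	return (NENTRIES);
}

static const struct tbl_algo kernel_algo = {
	.name = "addr:kernel",
	.readonly = 1,
	.lookup = kern_lookup,
	.count = kern_count,
};
```

With something like this, a table created on top of kernel_algo answers lookups and item counts straight from the backing data, while add requests fail and flush requests are no-ops, without the algo having to provide those callbacks at all.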



>
> cheers
> luigi
>



