From owner-freebsd-net@FreeBSD.ORG  Wed May 21 20:44:03 2014
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id AD5A757A;
 Wed, 21 May 2014 20:44:03 +0000 (UTC)
Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238])
 by mx1.freebsd.org (Postfix) with ESMTP id 200D32E36;
 Wed, 21 May 2014 20:44:02 +0000 (UTC)
Received: by onelab2.iet.unipi.it (Postfix, from userid 275)
 id D0B2E7300A; Wed, 21 May 2014 22:48:26 +0200 (CEST)
Date: Wed, 21 May 2014 22:48:26 +0200
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: "Alexander V. Chernikov" <melifaro@freebsd.org>
Subject: Re: [CFT]: ipfw named tables / different tabletypes
Message-ID: <20140521204826.GA67124@onelab2.iet.unipi.it>
References: <5379FE3C.6060501@FreeBSD.org>
 <20140521111002.GB62462@onelab2.iet.unipi.it>
 <537CEC12.8050404@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <537CEC12.8050404@FreeBSD.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: Luigi Rizzo <luigi@freebsd.org>, FreeBSD Net <net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 21 May 2014 20:44:03 -0000

On Wed, May 21, 2014 at 10:10:26PM +0400, Alexander V. Chernikov wrote:
> On 21.05.2014 15:10, Luigi Rizzo wrote:
> > On Mon, May 19, 2014 at 04:51:08PM +0400, Alexander V. Chernikov wrote:
> >> Hello list!
> >>
> >> This patch adds ability to name tables / reference them by name.
> >> Additionally, it simplifies adding new table types.
> > Hi Alex,
> > at a high level, i think your changes here are in the right direction.
> Hello!
> >
> > However i think part of the design, and also of the patch below,
> > is not sound/correct so i would like to wait to commit at least
> > until some of the problems below are resolved.
> >
> > 1. The patch as is has several issues (variable declarations in the
> >     middle of a block, assignment in conditionals, incorrect
> >     comments in cut&pasted code, missing checks on strings)
> >     that should be corrected.
> ..missing documentation and so on :)
> Of course, I'll fix all these (or most of these :))

good thanks

> > 3. there is no explanation, here or in the code, on how the
> >     names are managed. I could try to figure out from the code
> >     but it would be the wrong way to understand things so let me
> >     ask, and please explain what you have in mind.
> Currently it is very simple non-resizable hash table with fnv hash based 
> on table name.

that is not an issue. The question is whether one needs to lookup
the hash table every time you have a 'table' argument (of course i
think one should not, and the implementation i propose below gives
direct access to the table without name lookups, as the internal
identifier is still poiniter or small integer, just one that is not exposed
to userspace).

> >
> > Let me address first the name <-> table-id thing.
> >
> > Introducing a symbolic name for tables is a great and useful feature.
> > However the implementation has some tricky issues:
> >
> > - do you want to retain the old numeric identifiers or not ?
> >    I think it is a bad idea to have two namespaces for the same thing,
> >    so if we are switching to alphanumeric strings for tables we should
> >    as well get rid of the numbers (i.e. consider them as strings as well).
> >
> >    I am strongly in favour of not using names as aliases for numbers.
> >    It would require no changes for clients issuing ipfw commands
> >    from a script, and would not require users to to manually handle
> >    the name-id translations.
> Well. I'd prefer not to. However, code we're discussing assumes that 
> numeric ids
> are primary ones (e.g. you can't have named, but unnumbered table,
> you have to choose number by yourself). Switching to named-only tables
> can be tricky since we don't want to match them by inside rules and we 
> don't want
> to loose atomicity when allocating table ids via separate cmds before 
> adding rule.

i think this is solved by the implementation i proposed below.

> It looks like we can do the following:
> 1) Add another IP_FW3_ADD opcode with the following layout:
> 
> {
> rule len;
> unmodified rule itself;
> tlv1 {type=table name;len=..;id=1;"_TABLENAME11_"}
> tlv2 {type=table name;len=..;id=2;"_TABLENAME4_"}
> ..
> }
> Values inside appropriate opcodes { O_IP_XXX_LOOKUP and so on } won't be 
> real values
> but values described in attached TLVs.
> 
> After validating/parsing/checking under UH lock we replace fake values 
> with real ones (probably, newly allocated)
> and return or rollback atomically.

yes this should work, probably not even requiring a new IP_FW3_ADD opcode
because the extra tlv entries can appear in a block by themselves.

> The same thing can be done for displaying ruleset, however I'd prefer 
> client to ask for table names and caching them while displaying.
> 
> Old clients will retain the ability to operate on tables but will see 
> table ids, so nothing will change for them
> (except the case when new binary adds table "5" and new binary sees id 
> which is not 5, but that shouldn't be a problem).

we can solve this by using 'low' numbers for the numeric tables
(these were limited anyways) and allocate the fake entries in
another range.

> >
> > - The rename command worries me a lot:
> >
> > 	> ipfw table <num> name XXX
> > 	> Names (or renames) table. Not the name has to be unique.
> >
> >    because it is neither clear nor intuitive whether you want to
> >    1. just rename a table (possibly breaking or creating references
> >       for rules in the firewall), or
> >    2. modify the name-id translation preserving existing pointers.
> >
> >    Consider the sequence
> > 	ipfw table 13 name bar
> >          ipfw add 100 count dst-ip table bar
> >          ipfw table 13 name foo
> >          ipfw table 14 name bar
> >          ipfw add 200 count src-port 22 dst-ip table bar
> >
> >     Approach #1 would detach rule 100 from table 13 and then connect to 14
> >     (in a non-atomic way), whereas approach #2 would make rule 100 and 200
> >     point to different tables (which is confusing).
> >
> >     Now part of this problem goes away if we get rid of number altogether.
> >
> >     You may legitimately want to swap tables in an atomic way (we have something
> >     similar for rulesets):
> > 	ipfw set swap number number
> There is some problem here:
> Atomic ruleset changes is a great thing, but tables have no atomic support.
> We, for example, are solving this by using different table range on 
> every new configuration.
> It won't be possible with named-only tables (since names usually care 
> some semantic in contrast to numbers).
> The only thing I can think of is separate namespace per each set.
> What do you think?

maybe i am missing some detail but it seems reasonably easy to implement
the atomic swap -- and the use case is when you want to move from
one configuration to a new one:
	ipfw table foo-new flush // clear initial content
	ipfw table foo-new add  ... <repeat as needed>
	ipfw table swap foo-current foo-new // swap the content of the table objects

so you preserve the semantic of the name very easily.

> >
> > So here is what i would propose:
> > - [gradually] introduce new commands that accept strings for every place where
> >    a number was previously accepted. Internally in the firewall, the old
> >    16-bit number is interpreted as a string.
> Yup.
> >
> > - within the kernel create a small set of functions (invoked on insert, list, delete)
> >    that do proper checks on the string and translate it to a pointer (or short integer,
> >    i.e. an index in an array) to the proper object. Done properly, we can reuse
> >    this code also for pipes, sets, nat groups and whatnot.
> Yup.
> >
> >    When a rule references an object that does not exist, you create an empty
> >    object that a subsequent table/pipe/XXX 'create' would initialize.
> >    On 'destroy', no need to touch the ruleset, just clear the object.
> Yes. This looks better than requiring user to create table of given type 
> _before_
> referencing it.
> >
> > - for renames, try to follow the approach used for sets i.e.
> >    ipfw table rename old-name new-name
> > 	changes the name, preserves the object.
> > 	Does not touch the ruleset, only the translation table
> Well, I'm not sure about renaming. I'd prefer permitting renaming iff 
> table is not referenced.
> (and why do we need to rename tables?)

you are right, let's forget it. And 'ipfw set move' actually does a different thing
which has no use here.

> >    ipfw table swap first-table second-table
> > 	atomically swap the content of the two table objects
> > 	(once again not touching the rulesets)

cheers
luigi