From owner-freebsd-net@FreeBSD.ORG Thu May 22 15:34:55 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 36BE94EB; Thu, 22 May 2014 15:34:55 +0000 (UTC) Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BDB672E10; Thu, 22 May 2014 15:34:54 +0000 (UTC) Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f] (helo=ptichko.yndx.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1WnR6m-000KZo-P0; Thu, 22 May 2014 15:24:24 +0400 Message-ID: <537E18D3.2010201@FreeBSD.org> Date: Thu, 22 May 2014 19:33:39 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.0.1 MIME-Version: 1.0 To: Luigi Rizzo Subject: Re: [CFT]: ipfw named tables / different tabletypes References: <5379FE3C.6060501@FreeBSD.org> <20140521111002.GB62462@onelab2.iet.unipi.it> <537CEC12.8050404@FreeBSD.org> <20140521204826.GA67124@onelab2.iet.unipi.it> <537E1029.70007@FreeBSD.org> In-Reply-To: <537E1029.70007@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Luigi Rizzo , FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 May 2014 15:34:55 -0000 On 22.05.2014 18:56, Alexander V. Chernikov wrote: It looks like we have reached some kind of consensus on table naming, so I'm going to implement the following as the first part: * named-only tables, no "user-visible" indexes * Keep the same opcodes, use additional TLVs to pass names in rules * Use explicit userland object names retrieval while listing * Make the previous ones easily extendable for other ipfw objects * Introduce table references and explicit typecasting (while permitting user to refernce non-existing tables) * leave table atomics for one the next stages Are you OK with this? > On 22.05.2014 00:48, Luigi Rizzo wrote: >> On Wed, May 21, 2014 at 10:10:26PM +0400, Alexander V. Chernikov wrote: >>> On 21.05.2014 15:10, Luigi Rizzo wrote: >>>> On Mon, May 19, 2014 at 04:51:08PM +0400, Alexander V. Chernikov >>>> wrote: >>>>> Hello list! >>>>> >>>>> This patch adds ability to name tables / reference them by name. >>>>> Additionally, it simplifies adding new table types. >>>> Hi Alex, >>>> at a high level, i think your changes here are in the right direction. >>> Hello! >>>> However i think part of the design, and also of the patch below, >>>> is not sound/correct so i would like to wait to commit at least >>>> until some of the problems below are resolved. >>>> >>>> 1. The patch as is has several issues (variable declarations in the >>>> middle of a block, assignment in conditionals, incorrect >>>> comments in cut&pasted code, missing checks on strings) >>>> that should be corrected. >>> ..missing documentation and so on :) >>> Of course, I'll fix all these (or most of these :)) >> good thanks >> >>>> 3. there is no explanation, here or in the code, on how the >>>> names are managed. I could try to figure out from the code >>>> but it would be the wrong way to understand things so let me >>>> ask, and please explain what you have in mind. >>> Currently it is very simple non-resizable hash table with fnv hash >>> based >>> on table name. >> that is not an issue. The question is whether one needs to lookup >> the hash table every time you have a 'table' argument (of course i >> think one should not, and the implementation i propose below gives >> direct access to the table without name lookups, as the internal >> identifier is still poiniter or small integer, just one that is not >> exposed >> to userspace). >> >>>> Let me address first the name <-> table-id thing. >>>> >>>> Introducing a symbolic name for tables is a great and useful feature. >>>> However the implementation has some tricky issues: >>>> >>>> - do you want to retain the old numeric identifiers or not ? >>>> I think it is a bad idea to have two namespaces for the same >>>> thing, >>>> so if we are switching to alphanumeric strings for tables we >>>> should >>>> as well get rid of the numbers (i.e. consider them as strings >>>> as well). >>>> >>>> I am strongly in favour of not using names as aliases for numbers. >>>> It would require no changes for clients issuing ipfw commands >>>> from a script, and would not require users to to manually handle >>>> the name-id translations. >>> Well. I'd prefer not to. However, code we're discussing assumes that >>> numeric ids >>> are primary ones (e.g. you can't have named, but unnumbered table, >>> you have to choose number by yourself). Switching to named-only tables >>> can be tricky since we don't want to match them by inside rules and we >>> don't want >>> to loose atomicity when allocating table ids via separate cmds before >>> adding rule. >> i think this is solved by the implementation i proposed below. >> >>> It looks like we can do the following: >>> 1) Add another IP_FW3_ADD opcode with the following layout: >>> >>> { >>> rule len; >>> unmodified rule itself; >>> tlv1 {type=table name;len=..;id=1;"_TABLENAME11_"} >>> tlv2 {type=table name;len=..;id=2;"_TABLENAME4_"} >>> .. >>> } >>> Values inside appropriate opcodes { O_IP_XXX_LOOKUP and so on } >>> won't be >>> real values >>> but values described in attached TLVs. >>> >>> After validating/parsing/checking under UH lock we replace fake values >>> with real ones (probably, newly allocated) >>> and return or rollback atomically. >> yes this should work, probably not even requiring a new IP_FW3_ADD >> opcode >> because the extra tlv entries can appear in a block by themselves. >> >>> The same thing can be done for displaying ruleset, however I'd prefer >>> client to ask for table names and caching them while displaying. >>> >>> Old clients will retain the ability to operate on tables but will see >>> table ids, so nothing will change for them >>> (except the case when new binary adds table "5" and new binary sees id >>> which is not 5, but that shouldn't be a problem). >> we can solve this by using 'low' numbers for the numeric tables >> (these were limited anyways) and allocate the fake entries in >> another range. > Currently we have u16 space available in base opcode. > Introducing another range will require additional opcode, additional > array for new tables and so on. > I'd prefer not to do this. Since table ids are allocated by ourselves > we can (and should) pack them > efficiently and 65k _real tables_ currently available is a quite good > value. > > We preserve compability for old binaries so people with old userland > and scripts should > not not notice anything. We have no public userland API so the only > base binary is /sbin/ipfw which is either old or new. > >> >>>> - The rename command worries me a lot: >>>> >>>> > ipfw table name XXX >>>> > Names (or renames) table. Not the name has to be unique. >>>> >>>> because it is neither clear nor intuitive whether you want to >>>> 1. just rename a table (possibly breaking or creating references >>>> for rules in the firewall), or >>>> 2. modify the name-id translation preserving existing pointers. >>>> >>>> Consider the sequence >>>> ipfw table 13 name bar >>>> ipfw add 100 count dst-ip table bar >>>> ipfw table 13 name foo >>>> ipfw table 14 name bar >>>> ipfw add 200 count src-port 22 dst-ip table bar >>>> >>>> Approach #1 would detach rule 100 from table 13 and then >>>> connect to 14 >>>> (in a non-atomic way), whereas approach #2 would make rule 100 >>>> and 200 >>>> point to different tables (which is confusing). >>>> >>>> Now part of this problem goes away if we get rid of number >>>> altogether. >>>> >>>> You may legitimately want to swap tables in an atomic way (we >>>> have something >>>> similar for rulesets): >>>> ipfw set swap number number >>> There is some problem here: >>> Atomic ruleset changes is a great thing, but tables have no atomic >>> support. >>> We, for example, are solving this by using different table range on >>> every new configuration. >>> It won't be possible with named-only tables (since names usually care >>> some semantic in contrast to numbers). >>> The only thing I can think of is separate namespace per each set. >>> What do you think? >> maybe i am missing some detail but it seems reasonably easy to implement >> the atomic swap -- and the use case is when you want to move from >> one configuration to a new one: >> ipfw table foo-new flush // clear initial content >> ipfw table foo-new add ... >> ipfw table swap foo-current foo-new // swap the content of the >> table objects >> >> so you preserve the semantic of the name very easily. > Yes. We can easily add atomic table swap that way. However, I'm > talking about different use scenario: > Atomically swap entire ruleset which has some tables depency: > > > e.g. we have: > > " > 100 allow ip from table(TABLE1) to me > 200 allow ip from table(TABLE2) to (TABLE3) 80 > > table TABLE1 1.1.1.1/32 > table TABLE1 1.0.0.0/16 > > table TABLE2 2.2.2.2/32 > > table TABLE3 3.3.3.3/32 > " > and we want to _atomically_ change this to > > " > 100 allow ip from table(TABLE1) to me > +200 allow ip from table(TABLE4) to any > 300 allow ip from table(TABLE2) to (TABLE3) 80 > > table TABLE1 1.1.1.1/32 > -table TABLE1 1.0.0.0/16 > > -table TABLE2 2.2.2.2/32 > +table TABLE2 77.77.77.0/24 > > table TABLE3 3.3.3.3/32 > > +table TABLE4 4.4.4.4/32 > " > >> >>>> So here is what i would propose: >>>> - [gradually] introduce new commands that accept strings for every >>>> place where >>>> a number was previously accepted. Internally in the firewall, >>>> the old >>>> 16-bit number is interpreted as a string. >>> Yup. >>>> - within the kernel create a small set of functions (invoked on >>>> insert, list, delete) >>>> that do proper checks on the string and translate it to a >>>> pointer (or short integer, >>>> i.e. an index in an array) to the proper object. Done properly, >>>> we can reuse >>>> this code also for pipes, sets, nat groups and whatnot. >>> Yup. >>>> When a rule references an object that does not exist, you >>>> create an empty >>>> object that a subsequent table/pipe/XXX 'create' would initialize. >>>> On 'destroy', no need to touch the ruleset, just clear the object. >>> Yes. This looks better than requiring user to create table of given >>> type >>> _before_ >>> referencing it. >>>> - for renames, try to follow the approach used for sets i.e. >>>> ipfw table rename old-name new-name >>>> changes the name, preserves the object. >>>> Does not touch the ruleset, only the translation table >>> Well, I'm not sure about renaming. I'd prefer permitting renaming iff >>> table is not referenced. >>> (and why do we need to rename tables?) >> you are right, let's forget it. And 'ipfw set move' actually does a >> different thing >> which has no use here. >> >>>> ipfw table swap first-table second-table >>>> atomically swap the content of the two table objects >>>> (once again not touching the rulesets) >> cheers >> luigi >> >