Date:      Sun, 01 Jun 2014 17:51:33 +0400
From:      "Alexander V. Chernikov" <melifaro@FreeBSD.org>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        Luigi Rizzo <luigi@FreeBSD.org>, FreeBSD Net <net@FreeBSD.org>
Subject:   Re: [CFT]: ipfw named tables / different tabletypes
Message-ID:  <538B2FE5.6070407@FreeBSD.org>
In-Reply-To: <20140522163812.GA77634@onelab2.iet.unipi.it>
References:  <5379FE3C.6060501@FreeBSD.org> <20140521111002.GB62462@onelab2.iet.unipi.it> <537CEC12.8050404@FreeBSD.org> <20140521204826.GA67124@onelab2.iet.unipi.it> <537E1029.70007@FreeBSD.org> <20140522154740.GA76448@onelab2.iet.unipi.it> <537E2153.1040005@FreeBSD.org> <20140522163812.GA77634@onelab2.iet.unipi.it>

This is a multi-part message in MIME format.
--------------090406000404070502000405
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 22.05.2014 20:38, Luigi Rizzo wrote:

Long story short, a new version is ready.
I've tried to minimize the changes in this patch to ease review and commit.

Changes:
* Add a set-aware named-object API capable of searching for and
allocating objects by name or index.
* Switch the tables code to use string ids for configuration tasks.
* Change the locking model: most configuration changes are protected by
the UH lock; runtime-visible changes are protected by both locks.
* Reduce the number of arguments passed to
ipfw_add_table_entry()/ipfw_del_table_entry() by using a separate
structure.
* Add an internal V_fw_tables_sets tunable (set to 0) to prepare for
set-aware tables (requires opcode/client support).
* Implement typed table referencing (tables are implicitly allocated,
with all their state such as radix pointers, on reference).
* Add a "destroy" command to ipfw(8) using the new IP_FW_OBJ_DEL opcode.
Namedobj in more detail:
* Black-box API providing methods to add/del/search/enumerate objects
* Statically-sized hashes for names/indexes
* Per-set bitmask to indicate free indexes
* Separate methods for index alloc/delete/resize
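The per-set free-index bitmask can be sketched in plain userland C as follows. This is an illustration only, under stated assumptions: the demo_* names are hypothetical, the number of sets is reduced to 4, and the first free bit is found with a simple loop rather than ffsl(). As in the patch, a set bit means "index is free", and each set owns its own run of blocks inside one allocation.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define	DEMO_SETS		4	/* the patch uses IPFW_MAX_SETS */
#define	DEMO_BLOCK_ITEMS	(CHAR_BIT * sizeof(unsigned long))

struct demo_idxmap {
	unsigned long	*mask;		/* bit set => index is free */
	uint32_t	blocks;		/* "long" blocks per set */
};

/* Allocate a bitmap covering @items indexes for every set. */
int
demo_idxmap_init(struct demo_idxmap *m, uint32_t items)
{
	uint32_t blocks;
	size_t bytes;

	/* Round up to a whole number of blocks */
	blocks = (uint32_t)((items + DEMO_BLOCK_ITEMS - 1) / DEMO_BLOCK_ITEMS);
	bytes = (size_t)blocks * DEMO_SETS * sizeof(unsigned long);

	m->mask = malloc(bytes);
	if (m->mask == NULL)
		return (-1);
	memset(m->mask, 0xFF, bytes);	/* mark all indexes as free */
	m->blocks = blocks;
	return (0);
}

/* Take the lowest free index in @set; returns 0 on success. */
int
demo_idx_alloc(struct demo_idxmap *m, uint32_t set, uint16_t *pidx)
{
	unsigned long *base;
	uint32_t b;
	unsigned int bit;

	if (set >= DEMO_SETS)
		return (-1);
	base = &m->mask[(size_t)set * m->blocks];
	for (b = 0; b < m->blocks; b++) {
		if (base[b] == 0)
			continue;	/* block has no free index */
		for (bit = 0; bit < DEMO_BLOCK_ITEMS; bit++) {
			if ((base[b] & (1UL << bit)) == 0)
				continue;
			base[b] &= ~(1UL << bit);	/* mark as used */
			*pidx = (uint16_t)(b * DEMO_BLOCK_ITEMS + bit);
			return (0);
		}
	}
	return (-1);
}

/* Return @idx to @set; fails if it was not allocated. */
int
demo_idx_free(struct demo_idxmap *m, uint32_t set, uint16_t idx)
{
	unsigned long *base, bit;
	uint32_t b;

	b = (uint32_t)(idx / DEMO_BLOCK_ITEMS);
	bit = 1UL << (idx % DEMO_BLOCK_ITEMS);
	if (set >= DEMO_SETS || b >= m->blocks)
		return (-1);
	base = &m->mask[(size_t)set * m->blocks];
	if (base[b] & bit)
		return (-1);		/* already free */
	base[b] |= bit;
	return (0);
}
```

Keeping the bitmap separate from the hash heads is what allows a resize to allocate a larger bitmap without locks held and then merge the old contents into it, which is the role ipfw_objhash_bitmap_alloc() and ipfw_objhash_bitmap_merge() play in the attached diff.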


Basically, there should not be any user-visible changes except the
following:
* reducing tables_max is not supported
* changing a table's type via flush & add won't work while the table is
referenced


I haven't removed any numbering restrictions, in order to protect
against the following case: someone with an old client unintentionally
references too many tables (e.g. 1000-1128), then tries to allocate a
table from the "valid" range and fails. An old client has no way to
destroy a table, so the only way to solve this is either a module unload
or a reboot.

I've uploaded the same patch to Phabricator since it provides quite
handy diffs:
https://phabric.freebsd.org/D139 (no login required).

> On Thu, May 22, 2014 at 08:09:55PM +0400, Alexander V. Chernikov wrote:
>> On 22.05.2014 19:47, Luigi Rizzo wrote:
>>> On Thu, May 22, 2014 at 06:56:41PM +0400, Alexander V. Chernikov wrote:
>>>> On 22.05.2014 00:48, Luigi Rizzo wrote:
>>>>> On Wed, May 21, 2014 at 10:10:26PM +0400, Alexander V. Chernikov wrote:
>>> ...
>>>>> we can solve this by using 'low' numbers for the numeric tables
>>>>> (these were limited anyways) and allocate the fake entries in
>>>>> another range.
>>>> Currently we have u16 space available in base opcode.
>>> yes but the standard range for tables is much more limited:
>>>
>>> 	net.inet.ip.fw.tables_max: 128
>>>
>>> so one can just (say) use 32k for "old" tables and the rest
>>> for tables with non numeric names.
>>> Does not seem to be a problem in practice.
>> Well, using upper 32k means that you set this default to 65k which
>> consumes 256k of memory on 32-bit arch.
>> Embedded people won't be very happy about this (and changing table
>> numbers on resize would be a nightmare).
> no no, this is an implementation detail but
> within the kernel you can just remap the 'old' and 'new'
> table identifiers to a single contiguous range.
> The only thing you need to do is that when you push
> identifiers up to userland, those with 'new' names will
> be mapped to the 32-64k range.
>
> Example:
> user first specifies tables
> 	"18, goodguys, 530, badguys" in the same rule
>     /sbin/ipfw will generate these numbers:
> 	18, 32768, 530, 32769 ; tlv {32768:goodguys, 32769:badguys}
>     The kernel will then do a lookup of those identifiers and
> 	18: internal index 1, name "18"
> 	32768: internal index 2, name "goodguys"
> 	530: internal index 3, name "530"
> 	32769: internal index 4, name "badguys"
>
> Then the next rule contains tables
> 	1, badguys, 18
>      /sbin/ipfw generates
> 	1, 32768, 18 ; tlv {32768:badguys} // note different from before
>      Kernel looks up the names and remaps
> 	1: internal index 5, name "1"
> 	32768: internal index 4, name "badguys"
> 	18: internal index 1, name "18"
>
> Finally when you do an 'ipfw show' the kernel will remap names
> between 1 and 32768 to themselves, and other names to 32768+
> (or some other large number, say 40k and above) so
> as they are found. So the rules will be pushed up with
> 	18, 40000, 530, 40001
> 	1, 40001, 18
>
> we can discuss the other details privately
>
> cheers
> luigi
>
>
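
As a rough illustration of the remapping described in the quoted example above, the kernel-side name resolution could look like the sketch below. It is not taken from the patch: the demo_* names, the 32768 split and the flat name array are assumptions used only to reproduce the numbers in the example. Identifiers below 32768 are treated as old-style numeric tables whose name is their decimal representation; anything else must be named by a TLV attached to the rule. Internal indexes are handed out in order of first appearance, starting at 1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define	DEMO_NAMED_BASE	32768	/* first id used for non-numeric names */
#define	DEMO_MAX	16

/* Hypothetical per-rule name TLV: userland id -> table name. */
struct demo_ntlv {
	uint16_t	idx;
	const char	*name;
};

char demo_names[DEMO_MAX][64];	/* slot i holds internal index i + 1 */
int demo_count;

/*
 * Map a userland table id to an internal index, allocating a new
 * one the first time a name is seen. Returns 0 on failure.
 */
int
demo_resolve(uint16_t uidx, const struct demo_ntlv *tlvs, int ntlvs)
{
	char buf[64];
	const char *name = NULL;
	int i;

	if (uidx < DEMO_NAMED_BASE) {
		/* Old-style numeric table: the name is the number */
		snprintf(buf, sizeof(buf), "%u", (unsigned int)uidx);
		name = buf;
	} else {
		for (i = 0; i < ntlvs; i++) {
			if (tlvs[i].idx == uidx) {
				name = tlvs[i].name;
				break;
			}
		}
		if (name == NULL)
			return (0);	/* no TLV names this id */
	}
	if (strlen(name) >= sizeof(demo_names[0]))
		return (0);

	for (i = 0; i < demo_count; i++) {
		if (strcmp(demo_names[i], name) == 0)
			return (i + 1);	/* already known */
	}
	if (demo_count == DEMO_MAX)
		return (0);
	strcpy(demo_names[demo_count], name);
	return (++demo_count);
}
```

Note that the same userland id (32768) resolves to different internal indexes in the two rules, because the id is only meaningful together with the TLV block sent along with that rule.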
> 1. first, the
>>>>> maybe i am missing some detail but it seems reasonably easy to implement
>>>>> the atomic swap -- and the use case is when you want to move from
>>>>> one configuration to a new one:
>>>>> 	ipfw table foo-new flush // clear initial content
>>>>> 	ipfw table foo-new add  ... <repeat as needed>
>>>>> 	ipfw table swap foo-current foo-new // swap the content of the table objects
>>>>>
>>>>> so you preserve the semantic of the name very easily.
>>>> Yes. We can easily add atomic table swap that way. However, I'm talking
>>>> about different use scenario:
>>>> Atomically swap an entire ruleset which has some table dependencies:
>>>>
>>>>
>>>> e.g. we have:
>>>>
>>>> "
>>>> 100 allow ip from table(TABLE1) to me
>>>> 200 allow ip from table(TABLE2) to (TABLE3) 80
>>>>
>>>> table TABLE1 1.1.1.1/32
>>>> table TABLE1 1.0.0.0/16
>>>>
>>>> table TABLE2 2.2.2.2/32
>>>>
>>>> table TABLE3 3.3.3.3/32
>>>> "
>>>> and we want to _atomically_ change this to
>>>>
>>>> "
>>>> 100 allow ip from table(TABLE1) to me
>>>> +200 allow ip from table(TABLE4) to any
>>>> 300 allow ip from table(TABLE2) to (TABLE3) 80
>>>>
>>>> table TABLE1 1.1.1.1/32
>>>> -table TABLE1 1.0.0.0/16
>>>>
>>>> -table TABLE2 2.2.2.2/32
>>>> +table TABLE2 77.77.77.0/24
>>>>
>>>> table TABLE3 3.3.3.3/32
>>>>
>>>> +table TABLE4 4.4.4.4/32
>>>> "
>>> aargh, that's too much -- because between changing
>>> one table and all tables there are infinite intermediate
>>> points that all make sense.
>> It depends. As I said before, we're currently solving this problem by
>> adding new rules (to set X) referencing tables from different range
>> (2048 tables per ruleset) and then doing swap.
>> (And not being able to use named tables to store real names after
>> implementing them is a bit discouraging).
>>
>>> For those cases i think the way to go could be to
>>> insert a 'disabled' new ruleset (however complex it is,
>>> so it covers all possible cases), and then do the set swap,
>>> or disable/enable.
>> We can think of per-set arrays/namespaces of tables:
>>
>> so "ipfw add 100 set X allow ip from table(Y) to ..." will reference
>> table Y in set X and
>> "ipfw table ABC list" can differ from "ipfw table ABC set 5 list".
>>
>> This behavior can break some users setups so we can provide
>> sysctl/tunable to turn this off or on.
>>
>>> cheers
>>> luigi
>>>


--------------090406000404070502000405
Content-Type: text/x-patch;
 name="ipfw_ntables4.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="ipfw_ntables4.diff"

Index: sys/netinet/ip_fw.h
===================================================================
--- sys/netinet/ip_fw.h	(revision 266310)
+++ sys/netinet/ip_fw.h	(working copy)
@@ -37,6 +37,11 @@
 #define	IPFW_DEFAULT_RULE	65535
 
 /*
+ * Number of sets supported by ipfw
+ */
+#define	IPFW_MAX_SETS		32
+
+/*
  * Default number of ipfw tables.
  */
 #define	IPFW_TABLES_MAX		65535
@@ -74,6 +79,7 @@ typedef struct _ip_fw3_opheader {
 #define	IP_FW_TABLE_XDEL	87	/* delete entry */
 #define	IP_FW_TABLE_XGETSIZE	88	/* get table size */
 #define	IP_FW_TABLE_XLIST	89	/* list table contents */
+#define	IP_FW_OBJ_DEL		90	/* del table/pipe/etc */
 
 /*
  * The kernel representation of ipfw rules is made of a list of
@@ -632,7 +638,7 @@ typedef struct	_ipfw_table {
 } ipfw_table;
 
 typedef struct	_ipfw_xtable {
-	ip_fw3_opheader	opheader;	/* eXtended tables are controlled via IP_FW3 */
+	ip_fw3_opheader	opheader;	/* IP_FW3 opcode */
 	uint32_t	size;		/* size of entries in bytes	*/
 	uint32_t	cnt;		/* # of entries			*/
 	uint16_t	tbl;		/* table number			*/
@@ -640,4 +646,26 @@ typedef struct	_ipfw_xtable {
 	ipfw_table_xentry xent[0];	/* entries			*/
 } ipfw_xtable;
 
+typedef struct  _ipfw_xtable_tlv {
+	uint16_t        type;		/* TLV type */
+	uint16_t        length;		/* Total length, aligned to u32	*/
+} ipfw_xtable_tlv;
+
+#define	IPFW_TLV_NAME	1
+/* Object name TLV */
+typedef struct _ipfw_xtable_ntlv {
+	ipfw_xtable_tlv	head;		/* TLV header */
+	uint16_t	idx;		/* Name index */
+	uint16_t	spare;		/* unused */
+	char		name[64];	/* Null-terminated name */
+} ipfw_xtable_ntlv;
+
+typedef struct _ipfw_obj_header {
+	ip_fw3_opheader	opheader;	/* IP_FW3 opcode		*/
+	uint32_t	set;		/* Set we're operating		*/
+	uint16_t	idx;		/* object name index		*/
+	uint16_t	objtype;	/* object type			*/
+} ipfw_obj_header;
+#define	IPFW_OBJTYPE_TABLE	1
+
 #endif /* _IPFW2_H */
Index: sys/netpfil/ipfw/ip_fw2.c
===================================================================
--- sys/netpfil/ipfw/ip_fw2.c	(revision 266306)
+++ sys/netpfil/ipfw/ip_fw2.c	(working copy)
@@ -121,6 +121,7 @@ VNET_DEFINE(int, autoinc_step);
 VNET_DEFINE(int, fw_one_pass) = 1;
 
 VNET_DEFINE(unsigned int, fw_tables_max);
+VNET_DEFINE(unsigned int, fw_tables_sets) = 0;	/* Don't use set-aware tables */
 /* Use 128 tables by default */
 static unsigned int default_fw_tables = IPFW_TABLES_DEFAULT;
 
@@ -2719,7 +2720,6 @@ vnet_ipfw_uninit(const void *unused)
 	ipfw_dyn_uninit(0);	/* run the callout_drain */
 	IPFW_WUNLOCK(chain);
 
-	ipfw_destroy_tables(chain);
 	reap = NULL;
 	IPFW_WLOCK(chain);
 	for (i = 0; i < chain->n_rules; i++) {
@@ -2731,6 +2731,7 @@ vnet_ipfw_uninit(const void *unused)
 		free(chain->map, M_IPFW);
 	IPFW_WUNLOCK(chain);
 	IPFW_UH_WUNLOCK(chain);
+	ipfw_destroy_tables(chain);
 	if (reap != NULL)
 		ipfw_reap_rules(reap);
 	IPFW_LOCK_DESTROY(chain);
Index: sys/netpfil/ipfw/ip_fw_private.h
===================================================================
--- sys/netpfil/ipfw/ip_fw_private.h	(revision 266306)
+++ sys/netpfil/ipfw/ip_fw_private.h	(working copy)
@@ -212,6 +212,11 @@ VNET_DECLARE(int, autoinc_step);
 VNET_DECLARE(unsigned int, fw_tables_max);
 #define V_fw_tables_max		VNET(fw_tables_max)
 
+VNET_DECLARE(unsigned int, fw_tables_sets);
+#define V_fw_tables_sets	VNET(fw_tables_sets)
+
+struct tables_config;
+
 struct ip_fw_chain {
 	struct ip_fw	**map;		/* array of rule ptrs to ease lookup */
 	uint32_t	id;		/* ruleset id */
@@ -219,7 +224,6 @@ struct ip_fw_chain {
 	LIST_HEAD(nat_list, cfg_nat) nat;       /* list of nat entries */
 	struct radix_node_head **tables;	/* IPv4 tables */
 	struct radix_node_head **xtables;	/* extended tables */
-	uint8_t		*tabletype;	/* Array of table types */
 #if defined( __linux__ ) || defined( _WIN32 )
 	spinlock_t rwmtx;
 #else
@@ -229,6 +233,7 @@ struct ip_fw_chain {
 	uint32_t	gencnt;		/* NAT generation count */
 	struct ip_fw	*reap;		/* list of rules to reap */
 	struct ip_fw	*default_rule;
+	struct tables_config *tblcfg;	/* tables module data */
 #if defined( __linux__ ) || defined( _WIN32 )
 	spinlock_t uh_lock;
 #else
@@ -295,13 +300,84 @@ struct sockopt;	/* used by tcp_var.h */
 #define IPFW_UH_WLOCK(p) rw_wlock(&(p)->uh_lock)
 #define IPFW_UH_WUNLOCK(p) rw_wunlock(&(p)->uh_lock)
 
+struct tid_info {
+	uint32_t	set;	/* table set */
+	uint16_t	uidx;	/* table index */
+	uint8_t		type;	/* table type */
+	uint8_t		spare;
+	void		*tlvs;	/* Pointer to first TLV */
+	int		tlen;	/* Total TLV size block */
+};
+
+struct obj_idx {
+	uint16_t	uidx;	/* internal index supplied by userland */
+	uint16_t	kidx;	/* kernel object index */
+	uint16_t	off;	/* tlv offset from rule end in 4-byte words */
+	uint8_t		new;	/* index is newly-allocated */
+	uint8_t		type;	/* object type within its category */
+};
+
+struct rule_check_info {
+	uint16_t	table_opcodes;	/* count of opcodes referencing table */
+	uint16_t	new_tables;	/* count of opcodes referencing table */
+	uint32_t	tableset;	/* ipfw set id for table */
+	void		*tlvs;		/* Pointer to first TLV if any */
+	int		tlen;		/* Total TLV size block */
+	uint8_t		fw3;		/* opcode is new */
+	struct ip_fw	*krule;		/* resulting rule pointer */
+	struct obj_idx	obuf[8];	/* table references storage */
+};
+
+struct tentry_info {
+	void		*paddr;
+	int		plen;		/* Total entry length		*/
+	uint8_t		masklen;	/* mask length			*/
+	uint8_t		spare;
+	uint16_t	flags;		/* record flags			*/
+	uint32_t	value;		/* value			*/
+};
+
 /* In ip_fw_sockopt.c */
 int ipfw_find_rule(struct ip_fw_chain *chain, uint32_t key, uint32_t id);
-int ipfw_add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule);
 int ipfw_ctl(struct sockopt *sopt);
 int ipfw_chk(struct ip_fw_args *args);
 void ipfw_reap_rules(struct ip_fw *head);
 
+struct namedobj_instance;
+
+struct named_object {
+	TAILQ_ENTRY(named_object)	nn_next;	/* namehash */
+	TAILQ_ENTRY(named_object)	nv_next;	/* valuehash */
+	char			*name;	/* object name */
+	uint8_t			type;	/* object type */
+	uint8_t			compat;	/* Object name is number */
+	uint16_t		kidx;	/* object kernel index */
+	uint16_t		uidx;	/* userland idx for compat records */
+	uint32_t		set;	/* set object belongs to */
+	uint32_t		refcnt;	/* number of references */
+};
+TAILQ_HEAD(namedobjects_head, named_object);
+
+typedef void (objhash_cb_t)(struct namedobj_instance *ni, struct named_object *,
+    void *arg);
+struct namedobj_instance *ipfw_objhash_create(uint32_t items);
+void ipfw_objhash_destroy(struct namedobj_instance *);
+void ipfw_objhash_bitmap_alloc(uint32_t items, void **idx, int *pblocks);
+int ipfw_objhash_bitmap_merge(struct namedobj_instance *ni,
+    void **idx, int *blocks);
+void ipfw_objhash_bitmap_free(void *idx, int blocks);
+struct named_object *ipfw_objhash_lookup_name(struct namedobj_instance *ni,
+    uint32_t set, char *name);
+struct named_object *ipfw_objhash_lookup_idx(struct namedobj_instance *ni,
+    uint32_t set, uint16_t idx);
+void ipfw_objhash_add(struct namedobj_instance *ni, struct named_object *no);
+void ipfw_objhash_del(struct namedobj_instance *ni, struct named_object *no);
+void ipfw_objhash_foreach(struct namedobj_instance *ni, objhash_cb_t *f,
+    void *arg);
+int ipfw_objhash_free_idx(struct namedobj_instance *ni, uint32_t set,
+    uint16_t idx);
+int ipfw_objhash_alloc_idx(void *n, uint32_t set, uint16_t *pidx);
+
 /* In ip_fw_table.c */
 struct radix_node;
 int ipfw_lookup_table(struct ip_fw_chain *ch, uint16_t tbl, in_addr_t addr,
@@ -309,18 +385,28 @@ int ipfw_lookup_table(struct ip_fw_chain *ch, uint
 int ipfw_lookup_table_extended(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
     uint32_t *val, int type);
 int ipfw_init_tables(struct ip_fw_chain *ch);
+int ipfw_destroy_table(struct ip_fw_chain *ch, struct tid_info *ti, int force);
 void ipfw_destroy_tables(struct ip_fw_chain *ch);
-int ipfw_flush_table(struct ip_fw_chain *ch, uint16_t tbl);
-int ipfw_add_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type, uint32_t value);
-int ipfw_del_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type);
-int ipfw_count_table(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt);
+int ipfw_flush_table(struct ip_fw_chain *ch, struct tid_info *ti);
+int ipfw_add_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei);
+int ipfw_del_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei);
+int ipfw_count_table(struct ip_fw_chain *ch, struct tid_info *ti,
+    uint32_t *cnt);
 int ipfw_dump_table_entry(struct radix_node *rn, void *arg);
-int ipfw_dump_table(struct ip_fw_chain *ch, ipfw_table *tbl);
-int ipfw_count_xtable(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt);
-int ipfw_dump_xtable(struct ip_fw_chain *ch, ipfw_xtable *tbl);
+int ipfw_dump_table(struct ip_fw_chain *ch, struct tid_info *ti,
+    ipfw_table *tbl);
+int ipfw_count_xtable(struct ip_fw_chain *ch, struct tid_info *ti,
+    uint32_t *cnt);
+int ipfw_dump_xtable(struct ip_fw_chain *ch, struct tid_info *ti,
+    ipfw_xtable *tbl);
 int ipfw_resize_tables(struct ip_fw_chain *ch, unsigned int ntables);
+int ipfw_rewrite_table_uidx(struct ip_fw_chain *chain,
+    struct rule_check_info *ci);
+int ipfw_rewrite_table_kidx(struct ip_fw_chain *chain, struct ip_fw *rule);
+void ipfw_unbind_table_rule(struct ip_fw_chain *chain, struct ip_fw *rule);
+void ipfw_unbind_table_list(struct ip_fw_chain *chain, struct ip_fw *head);
 
 /* In ip_fw_nat.c -- XXX to be moved to ip_var.h */
 
Index: sys/netpfil/ipfw/ip_fw_sockopt.c
===================================================================
--- sys/netpfil/ipfw/ip_fw_sockopt.c	(revision 266306)
+++ sys/netpfil/ipfw/ip_fw_sockopt.c	(working copy)
@@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/socketvar.h>
 #include <sys/sysctl.h>
 #include <sys/syslog.h>
+#include <sys/fnv_hash.h>
 #include <net/if.h>
 #include <net/route.h>
 #include <net/vnet.h>
@@ -67,6 +68,25 @@ __FBSDID("$FreeBSD$");
 #include <security/mac/mac_framework.h>
 #endif
 
+#define	NAMEDOBJ_HASH_SIZE	32
+
+struct namedobj_instance {
+	struct namedobjects_head	*names;
+	struct namedobjects_head	*values;
+	uint32_t nn_size;		/* names hash size */
+	uint32_t nv_size;		/* number hash size */
+	u_long *idx_mask;		/* used items bitmask */
+	uint32_t max_blocks;		/* number of "long" blocks in bitmask */
+	uint16_t free_off[IPFW_MAX_SETS];	/* first possible free offset */
+};
+#define	BLOCK_ITEMS	(8 * sizeof(u_long))	/* Number of items for ffsl() */
+//#define	IDX_SET(i, s)	(((i) & 0xFF) << 16 | ((s) & 0xFFFF))
+#define	IDX_SET(i, s)	(i)
+
+static uint32_t objhash_hash_name(struct namedobj_instance *ni, char *name);
+static uint32_t objhash_hash_val(struct namedobj_instance *ni, uint32_t val);
+
+
 MALLOC_DEFINE(M_IPFW, "IpFw/IpAcct", "IpFw/IpAcct chain's");
 
 /*
@@ -152,8 +172,9 @@ swap_map(struct ip_fw_chain *chain, struct ip_fw *
  * XXX DO NOT USE FOR THE DEFAULT RULE.
  * Must be called without IPFW_UH held
  */
-int
-ipfw_add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule)
+static int
+add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule,
+    struct rule_check_info *ci)
 {
 	struct ip_fw *rule;
 	int i, l, insert_before;
@@ -164,19 +185,37 @@ swap_map(struct ip_fw_chain *chain, struct ip_fw *
 
 	l = RULESIZE(input_rule);
 	rule = malloc(l, M_IPFW, M_WAITOK | M_ZERO);
+	bcopy(input_rule, rule, l);
+	/* clear fields not settable from userland */
+	rule->x_next = NULL;
+	rule->next_rule = NULL;
+	IPFW_ZERO_RULE_COUNTER(rule);
+
+	/* Check if we need to do table remap */
+	if (ci->table_opcodes > 0) {
+		ci->krule = rule;
+		i = ipfw_rewrite_table_uidx(chain, ci);
+		if (i != 0) {
+			/* rewrite failed, return error */
+			free(rule, M_IPFW);
+			return (i);
+		}
+	}
+
 	/* get_map returns with IPFW_UH_WLOCK if successful */
 	map = get_map(chain, 1, 0 /* not locked */);
 	if (map == NULL) {
+		if (ci->table_opcodes > 0) {
+			/* We need to unbind tables */
+			IPFW_UH_WLOCK(chain);
+			ipfw_unbind_table_rule(chain, rule);
+			IPFW_UH_WUNLOCK(chain);
+		}
+
 		free(rule, M_IPFW);
-		return ENOSPC;
+		return (ENOSPC);
 	}
 
-	bcopy(input_rule, rule, l);
-	/* clear fields not settable from userland */
-	rule->x_next = NULL;
-	rule->next_rule = NULL;
-	IPFW_ZERO_RULE_COUNTER(rule);
-
 	if (V_autoinc_step < 1)
 		V_autoinc_step = 1;
 	else if (V_autoinc_step > 1000)
@@ -421,6 +460,7 @@ del_entry(struct ip_fw_chain *chain, uint32_t arg)
 
 	rule = chain->reap;
 	chain->reap = NULL;
+	ipfw_unbind_table_list(chain, rule);
 	IPFW_UH_WUNLOCK(chain);
 	ipfw_reap_rules(rule);
 	if (map)
@@ -517,7 +557,7 @@ zero_entry(struct ip_fw_chain *chain, u_int32_t ar
  * Rules are simple, so this mostly need to check rule sizes.
  */
 static int
-check_ipfw_struct(struct ip_fw *rule, int size)
+check_ipfw_struct(struct ip_fw *rule, int size, struct rule_check_info *ci)
 {
 	int l, cmdlen = 0;
 	int have_action=0;
@@ -529,10 +569,15 @@ static int
 	}
 	/* first, check for valid size */
 	l = RULESIZE(rule);
-	if (l != size) {
+	if (l > size) {
 		printf("ipfw: size mismatch (have %d want %d)\n", size, l);
 		return (EINVAL);
 	}
+	if (size > l) {
+		/* Save TLV information */
+		ci->tlvs = (caddr_t)rule + l;
+		ci->tlen = size - l;
+	}
 	if (rule->act_ofs >= rule->cmd_len) {
 		printf("ipfw: bogus action offset (%u > %u)\n",
 		    rule->act_ofs, rule->cmd_len - 1);
@@ -662,6 +707,7 @@ static int
 			    cmdlen != F_INSN_SIZE(ipfw_insn_u32) + 1 &&
 			    cmdlen != F_INSN_SIZE(ipfw_insn_u32))
 				goto bad_size;
+			ci->table_opcodes++;
 			break;
 		case O_MACADDR2:
 			if (cmdlen != F_INSN_SIZE(ipfw_insn_mac))
@@ -694,6 +740,8 @@ static int
 		case O_RECV:
 		case O_XMIT:
 		case O_VIA:
+			if (((ipfw_insn_if *)cmd)->name[0] == '\1')
+				ci->table_opcodes++;
 			if (cmdlen != F_INSN_SIZE(ipfw_insn_if))
 				goto bad_size;
 			break;
@@ -879,7 +927,7 @@ ipfw_getrules(struct ip_fw_chain *chain, void *buf
 	char *bp = buf;
 	char *ep = bp + space;
 	struct ip_fw *rule, *dst;
-	int l, i;
+	int error, i, l;
 	time_t	boot_seconds;
 
         boot_seconds = boottime.tv_sec;
@@ -890,8 +938,11 @@ ipfw_getrules(struct ip_fw_chain *chain, void *buf
 		    /* Convert rule to FreeBSd 7.2 format */
 		    l = RULESIZE7(rule);
 		    if (bp + l + sizeof(uint32_t) <= ep) {
-			int error;
 			bcopy(rule, bp, l + sizeof(uint32_t));
+			error = ipfw_rewrite_table_kidx(chain,
+			    (struct ip_fw *)bp);
+			if (error != 0)
+				return (0);
 			error = convert_rule_to_7((struct ip_fw *) bp);
 			if (error)
 				return 0; /*XXX correct? */
@@ -918,6 +969,13 @@ ipfw_getrules(struct ip_fw_chain *chain, void *buf
 		}
 		dst = (struct ip_fw *)bp;
 		bcopy(rule, dst, l);
+		error = ipfw_rewrite_table_kidx(chain, dst);
+		if (error != 0) {
+			printf("Stop on rule %d. Fail to convert table\n",
+			    rule->rulenum);
+			break;
+		}
+
 		/*
 		 * XXX HACK. Store the disable mask in the "next"
 		 * pointer in a wild attempt to keep the ABI the same.
@@ -949,6 +1007,7 @@ ipfw_ctl(struct sockopt *sopt)
 	uint32_t opt;
 	char xbuf[128];
 	ip_fw3_opheader *op3 = NULL;
+	struct rule_check_info ci;
 
 	error = priv_check(sopt->sopt_td, PRIV_NETINET_IPFW);
 	if (error)
@@ -1027,6 +1086,8 @@ ipfw_ctl(struct sockopt *sopt)
 		error = sooptcopyin(sopt, rule, RULE_MAXSIZE,
 			sizeof(struct ip_fw7) );
 
+		memset(&ci, 0, sizeof(struct rule_check_info));
+
 		/*
 		 * If the size of commands equals RULESIZE7 then we assume
 		 * a FreeBSD7.2 binary is talking to us (set is7=1).
@@ -1044,15 +1105,15 @@ ipfw_ctl(struct sockopt *sopt)
 			return error;
 		    }
 		    if (error == 0)
-			error = check_ipfw_struct(rule, RULESIZE(rule));
+			error = check_ipfw_struct(rule, RULESIZE(rule), &ci);
 		} else {
 		    is7 = 0;
 		if (error == 0)
-			error = check_ipfw_struct(rule, sopt->sopt_valsize);
+			error = check_ipfw_struct(rule, sopt->sopt_valsize,&ci);
 		}
 		if (error == 0) {
-			/* locking is done within ipfw_add_rule() */
-			error = ipfw_add_rule(chain, rule);
+			/* locking is done within add_rule() */
+			error = add_rule(chain, rule, &ci);
 			size = RULESIZE(rule);
 			if (!error && sopt->sopt_dir == SOPT_GET) {
 				if (is7) {
@@ -1114,30 +1175,59 @@ ipfw_ctl(struct sockopt *sopt)
 		break;
 
 	/*--- TABLE manipulations are protected by the IPFW_LOCK ---*/
-	case IP_FW_TABLE_ADD:
+	case IP_FW_OBJ_DEL: /* IP_FW3 */
 		{
-			ipfw_table_entry ent;
+			struct _ipfw_obj_header *oh;
+			struct tid_info ti;
 
-			error = sooptcopyin(sopt, &ent,
-			    sizeof(ent), sizeof(ent));
-			if (error)
+			if (sopt->sopt_valsize < sizeof(*oh)) {
+				error = EINVAL;
 				break;
-			error = ipfw_add_table_entry(chain, ent.tbl,
-			    &ent.addr, sizeof(ent.addr), ent.masklen, 
-			    IPFW_TABLE_CIDR, ent.value);
+			}
+
+			oh = (struct _ipfw_obj_header *)(op3 + 1);
+
+			switch (oh->objtype) {
+			case IPFW_OBJTYPE_TABLE:
+				memset(&ti, 0, sizeof(ti));
+				ti.set = oh->set;
+				ti.uidx = oh->idx;
+				ti.tlvs = (oh + 1);
+				ti.tlen = sopt->sopt_valsize - sizeof(*oh);
+				error = ipfw_destroy_table(chain, &ti, 0);
+				break;
+			default:
+				error = ENOTSUP;
+				break;
+			}
+			break;
 		}
-		break;
-
+	case IP_FW_TABLE_ADD:
 	case IP_FW_TABLE_DEL:
 		{
 			ipfw_table_entry ent;
+			struct tentry_info tei;
+			struct tid_info ti;
 
 			error = sooptcopyin(sopt, &ent,
 			    sizeof(ent), sizeof(ent));
 			if (error)
 				break;
-			error = ipfw_del_table_entry(chain, ent.tbl,
-			    &ent.addr, sizeof(ent.addr), ent.masklen, IPFW_TABLE_CIDR);
+
+			memset(&tei, 0, sizeof(tei));
+			tei.paddr = &ent.addr;
+			tei.plen = sizeof(ent.addr);
+			tei.masklen = ent.masklen;
+			tei.flags = 0;
+			tei.value = ent.value;
+			memset(&ti, 0, sizeof(ti));
+			ti.set = RESVD_SET;
+			ti.uidx = ent.tbl;
+			ti.type = IPFW_TABLE_CIDR;
+
+			error = (opt == IP_FW_TABLE_ADD) ?
+			    ipfw_add_table_entry(chain, &ti, &tei) :
+			    ipfw_del_table_entry(chain, &ti, &tei);
 		}
 		break;
 
@@ -1145,6 +1235,8 @@ ipfw_ctl(struct sockopt *sopt)
 	case IP_FW_TABLE_XDEL: /* IP_FW3 */
 		{
 			ipfw_table_xentry *xent = (ipfw_table_xentry *)(op3 + 1);
+			struct tentry_info tei;
+			struct tid_info ti;
 
 			/* Check minimum header size */
 			if (IP_FW3_OPLENGTH(sopt) < offsetof(ipfw_table_xentry, k)) {
@@ -1160,35 +1252,52 @@ ipfw_ctl(struct sockopt *sopt)
 			
 			len = xent->len - offsetof(ipfw_table_xentry, k);
 
+			memset(&tei, 0, sizeof(tei));
+			tei.paddr = &xent->k;
+			tei.plen = len;
+			tei.masklen = xent->masklen;
+			tei.flags = 0;
+			tei.value = xent->value;
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0;	/* XXX: No way to specify set  */
+			ti.uidx = xent->tbl;
+			ti.type = xent->type;
+
 			error = (opt == IP_FW_TABLE_XADD) ?
-				ipfw_add_table_entry(chain, xent->tbl, &xent->k, 
-					len, xent->masklen, xent->type, xent->value) :
-				ipfw_del_table_entry(chain, xent->tbl, &xent->k,
-					len, xent->masklen, xent->type);
+			    ipfw_add_table_entry(chain, &ti, &tei) :
+			    ipfw_del_table_entry(chain, &ti, &tei);
 		}
 		break;
 
 	case IP_FW_TABLE_FLUSH:
 		{
 			u_int16_t tbl;
+			struct tid_info ti;
 
 			error = sooptcopyin(sopt, &tbl,
 			    sizeof(tbl), sizeof(tbl));
 			if (error)
 				break;
-			error = ipfw_flush_table(chain, tbl);
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl;
+			error = ipfw_flush_table(chain, &ti);
 		}
 		break;
 
 	case IP_FW_TABLE_GETSIZE:
 		{
 			u_int32_t tbl, cnt;
+			struct tid_info ti;
 
 			if ((error = sooptcopyin(sopt, &tbl, sizeof(tbl),
 			    sizeof(tbl))))
 				break;
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_count_table(chain, tbl, &cnt);
+			error = ipfw_count_table(chain, &ti, &cnt);
 			IPFW_RUNLOCK(chain);
 			if (error)
 				break;
@@ -1199,6 +1308,7 @@ ipfw_ctl(struct sockopt *sopt)
 	case IP_FW_TABLE_LIST:
 		{
 			ipfw_table *tbl;
+			struct tid_info ti;
 
 			if (sopt->sopt_valsize < sizeof(*tbl)) {
 				error = EINVAL;
@@ -1213,8 +1323,11 @@ ipfw_ctl(struct sockopt *sopt)
 			}
 			tbl->size = (size - sizeof(*tbl)) /
 			    sizeof(ipfw_table_entry);
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl->tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_dump_table(chain, tbl);
+			error = ipfw_dump_table(chain, &ti, tbl);
 			IPFW_RUNLOCK(chain);
 			if (error) {
 				free(tbl, M_TEMP);
@@ -1228,6 +1341,7 @@ ipfw_ctl(struct sockopt *sopt)
 	case IP_FW_TABLE_XGETSIZE: /* IP_FW3 */
 		{
 			uint32_t *tbl;
+			struct tid_info ti;
 
 			if (IP_FW3_OPLENGTH(sopt) < sizeof(uint32_t)) {
 				error = EINVAL;
@@ -1236,8 +1350,11 @@ ipfw_ctl(struct sockopt *sopt)
 
 			tbl = (uint32_t *)(op3 + 1);
 
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = *tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_count_xtable(chain, *tbl, tbl);
+			error = ipfw_count_xtable(chain, &ti, tbl);
 			IPFW_RUNLOCK(chain);
 			if (error)
 				break;
@@ -1248,6 +1365,7 @@ ipfw_ctl(struct sockopt *sopt)
 	case IP_FW_TABLE_XLIST: /* IP_FW3 */
 		{
 			ipfw_xtable *tbl;
+			struct tid_info ti;
 
 			if ((size = valsize) < sizeof(ipfw_xtable)) {
 				error = EINVAL;
@@ -1260,8 +1378,11 @@ ipfw_ctl(struct sockopt *sopt)
 			/* Get maximum number of entries we can store */
 			tbl->size = (size - sizeof(ipfw_xtable)) /
 			    sizeof(ipfw_table_xentry);
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl->tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_dump_xtable(chain, tbl);
+			error = ipfw_dump_xtable(chain, &ti, tbl);
 			IPFW_RUNLOCK(chain);
 			if (error) {
 				free(tbl, M_TEMP);
@@ -1444,4 +1565,271 @@ convert_rule_to_8(struct ip_fw *rule)
 	return 0;
 }
 
+/*
+ * Named object api
+ *
+ */
+
+void
+ipfw_objhash_bitmap_alloc(uint32_t items, void **idx, int *pblocks)
+{
+	size_t size;
+	int max_blocks;
+	void *idx_mask;
+
+	items = roundup2(items, BLOCK_ITEMS);	/* Align to block size */
+	max_blocks = items / BLOCK_ITEMS;
+	size = items / 8;
+	idx_mask = malloc(size * IPFW_MAX_SETS, M_IPFW, M_WAITOK);
+	/* Mark all as free */
+	memset(idx_mask, 0xFF, size * IPFW_MAX_SETS);
+
+	*idx = idx_mask;
+	*pblocks = max_blocks;
+}
+
+int
+ipfw_objhash_bitmap_merge(struct namedobj_instance *ni, void **idx, int *blocks)
+{
+	int old_blocks, new_blocks;
+	u_long *old_idx, *new_idx;
+	int i;
+
+	old_idx = ni->idx_mask;
+	old_blocks = ni->max_blocks;
+	new_idx = *idx;
+	new_blocks = *blocks;
+
+	/*
+	 * FIXME: Permit reducing total amount of tables
+	 */
+	if (old_blocks > new_blocks)
+		return (1);
+
+	for (i = 0; i < IPFW_MAX_SETS; i++) {
+		memcpy(&new_idx[new_blocks * i], &old_idx[old_blocks * i],
+		    old_blocks * sizeof(u_long));
+	}
+
+	ni->idx_mask = new_idx;
+	ni->max_blocks = new_blocks;
+
+	/* Save old values */
+	*idx = old_idx;
+	*blocks = old_blocks;
+
+	return (0);
+}
+
+void
+ipfw_objhash_bitmap_free(void *idx, int blocks)
+{
+
+	free(idx, M_IPFW);
+}
+
+/*
+ * Creates named hash instance.
+ * Must be called without holding any locks.
+ * Return pointer to new instance.
+ */
+struct namedobj_instance *
+ipfw_objhash_create(uint32_t items)
+{
+	struct namedobj_instance *ni;
+	int i;
+	size_t size;
+
+	size = sizeof(struct namedobj_instance) +
+	    sizeof(struct namedobjects_head) * NAMEDOBJ_HASH_SIZE +
+	    sizeof(struct namedobjects_head) * NAMEDOBJ_HASH_SIZE;
+
+	ni = malloc(size, M_IPFW, M_WAITOK | M_ZERO);
+	ni->nn_size = NAMEDOBJ_HASH_SIZE;
+	ni->nv_size = NAMEDOBJ_HASH_SIZE;
+
+	ni->names = (struct namedobjects_head *)(ni +1);
+	ni->values = &ni->names[ni->nn_size];
+
+	for (i = 0; i < ni->nn_size; i++)
+		TAILQ_INIT(&ni->names[i]);
+
+	for (i = 0; i < ni->nv_size; i++)
+		TAILQ_INIT(&ni->values[i]);
+
+	/* Allocate bitmask separately due to possible resize */
+	ipfw_objhash_bitmap_alloc(items, (void*)&ni->idx_mask, &ni->max_blocks);
+
+	return (ni);
+}
+
+void
+ipfw_objhash_destroy(struct namedobj_instance *ni)
+{
+
+	free(ni->idx_mask, M_IPFW);
+	free(ni, M_IPFW);
+}
+
+static uint32_t
+objhash_hash_name(struct namedobj_instance *ni, char *name)
+{
+	uint32_t v;
+
+	v = fnv_32_str(name, FNV1_32_INIT);
+
+	return (v % ni->nn_size);
+}
+
+static uint32_t
+objhash_hash_val(struct namedobj_instance *ni, uint32_t val)
+{
+	uint32_t v;
+
+	v = val % 31; /* Assume hash size to be 32 */
+
+	return (v % ni->nv_size);
+}
+
+struct named_object *
+ipfw_objhash_lookup_name(struct namedobj_instance *ni, uint32_t set, char *name)
+{
+	struct named_object *no;
+	uint32_t hash;
+
+	hash = objhash_hash_name(ni, name);
+	
+	TAILQ_FOREACH(no, &ni->names[hash], nn_next) {
+		if ((strcmp(no->name, name) == 0) && (no->set == set))
+			return (no);
+	}
+
+	return (NULL);
+}
+
+struct named_object *
+ipfw_objhash_lookup_idx(struct namedobj_instance *ni, uint32_t set,
+    uint16_t idx)
+{
+	struct named_object *no;
+	uint32_t hash;
+
+	hash = objhash_hash_val(ni, IDX_SET(idx, set));
+	
+	TAILQ_FOREACH(no, &ni->values[hash], nv_next) {
+		if ((no->kidx == idx) && (no->set == set))
+			return (no);
+	}
+
+	return (NULL);
+}
+
+void
+ipfw_objhash_add(struct namedobj_instance *ni, struct named_object *no)
+{
+	uint32_t hash;
+
+	hash = objhash_hash_name(ni, no->name);
+	TAILQ_INSERT_HEAD(&ni->names[hash], no, nn_next);
+
+	hash = objhash_hash_val(ni, IDX_SET(no->kidx, no->set));
+	TAILQ_INSERT_HEAD(&ni->values[hash], no, nv_next);
+}
+
+void
+ipfw_objhash_del(struct namedobj_instance *ni, struct named_object *no)
+{
+	uint32_t hash;
+
+	hash = objhash_hash_name(ni, no->name);
+	TAILQ_REMOVE(&ni->names[hash], no, nn_next);
+
+	hash = objhash_hash_val(ni, IDX_SET(no->kidx, no->set));
+	TAILQ_REMOVE(&ni->values[hash], no, nv_next);
+}
+
+/*
+ * Runs @f for each found named object.
+ * It is safe to delete objects from the callback.
+ */
+void
+ipfw_objhash_foreach(struct namedobj_instance *ni, objhash_cb_t *f, void *arg)
+{
+	struct named_object *no, *no_tmp;
+	int i;
+
+	for (i = 0; i < ni->nn_size; i++) {
+		TAILQ_FOREACH_SAFE(no, &ni->names[i], nn_next, no_tmp)
+			f(ni, no, arg);
+	}
+}
+
+/*
+ * Removes index from given set.
+ * Returns 0 on success.
+ */
+int
+ipfw_objhash_free_idx(struct namedobj_instance *ni, uint32_t set, uint16_t idx)
+{
+	u_long *mask;
+	int i, v;
+
+	i = idx / BLOCK_ITEMS;
+	v = idx % BLOCK_ITEMS;
+
+	if ((i >= ni->max_blocks) || set >= IPFW_MAX_SETS)
+		return (1);
+
+	mask = &ni->idx_mask[set * ni->max_blocks + i];
+
+	if ((*mask & ((u_long)1 << v)) != 0)
+		return (1);
+
+	/* Mark as free */
+	*mask |= (u_long)1 << v;
+
+	/* Update free offset */
+	if (ni->free_off[set] > i)
+		ni->free_off[set] = i;
+	
+	return (0);
+}
+
+/*
+ * Allocates new index in given set and stores it in @pidx.
+ * Returns 0 on success.
+ */
+int
+ipfw_objhash_alloc_idx(void *n, uint32_t set, uint16_t *pidx)
+{
+	struct namedobj_instance *ni;
+	u_long *mask;
+	int i, off, v;
+
+	if (set >= IPFW_MAX_SETS)
+		return (-1);
+
+	ni = (struct namedobj_instance *)n;
+
+	off = ni->free_off[set];
+	mask = &ni->idx_mask[set * ni->max_blocks + off];
+
+	for (i = off; i < ni->max_blocks; i++, mask++) {
+		if ((v = ffsl(*mask)) == 0)
+			continue;
+
+		/* Mark as busy */
+		*mask &= ~ ((u_long)1 << (v - 1));
+
+		ni->free_off[set] = i;
+		
+		v = BLOCK_ITEMS * i + v - 1;
+
+		*pidx = v;
+		return (0);
+	}
+
+	return (1);
+}
+
 /* end of file */
Index: sys/netpfil/ipfw/ip_fw_table.c
===================================================================
--- sys/netpfil/ipfw/ip_fw_table.c	(revision 266310)
+++ sys/netpfil/ipfw/ip_fw_table.c	(working copy)
@@ -100,6 +100,49 @@ struct table_xentry {
 	u_int32_t		value;
 };
 
+/*
+ * Table has the following `type` concepts:
+ *
+ * `type` represents lookup key type (cidr, ifp, uid, etc..)
+ * `ftype` is pure userland field helping to properly format table data
+ * `atype` represents exact lookup algorithm for given tabletype.
+ *     For example, we can use more efficient search schemes if we plan
+ *     to use some specific table for storing host-routes only.
+ *
+ */
+struct table_config {
+	struct named_object	no;
+	uint8_t		ftype;		/* format table type */
+	uint8_t		atype;		/* algorithm type */
+	uint8_t		linked;		/* 1 if already linked */
+	uint8_t		spare0;
+	uint32_t	count;		/* Number of records */
+	char		tablename[64];	/* table name */
+	void		*state;		/* Store some state if needed */
+	void		*xstate;
+};
+#define	TABLE_SET(set)	((V_fw_tables_sets != 0) ? set : 0)
+
+struct tables_config {
+	struct namedobj_instance	*namehash;
+};
+
+static struct table_config *find_table(struct namedobj_instance *ni,
+    struct tid_info *ti);
+static struct table_config *alloc_table_config(struct namedobj_instance *ni,
+    struct tid_info *ti);
+static void free_table_config(struct namedobj_instance *ni,
+    struct table_config *tc);
+static void link_table(struct ip_fw_chain *chain, struct table_config *tc);
+static void unlink_table(struct ip_fw_chain *chain, struct table_config *tc);
+static int alloc_table_state(void **state, void **xstate, uint8_t type);
+static void free_table_state(void **state, void **xstate, uint8_t type);
+
+
+#define	CHAIN_TO_TCFG(chain)	((struct tables_config *)(chain)->tblcfg)
+#define	CHAIN_TO_NI(chain)	(CHAIN_TO_TCFG(chain)->namehash)
+
+
 /*
  * The radix code expects addr and mask to be array of bytes,
  * with the first byte being the length of the array. rn_inithead
@@ -136,10 +179,10 @@ ipv6_writemask(struct in6_addr *addr6, uint8_t mas
 #endif
 
 int
-ipfw_add_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type, uint32_t value)
+ipfw_add_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei)
 {
-	struct radix_node_head *rnh, **rnh_ptr;
+	struct radix_node_head *rnh;
 	struct table_entry *ent;
 	struct table_xentry *xent;
 	struct radix_node *rn;
@@ -147,51 +190,57 @@ int
 	int offset;
 	void *ent_ptr;
 	struct sockaddr *addr_ptr, *mask_ptr;
+	struct table_config *tc, *tc_new;
+	struct namedobj_instance *ni;
 	char c;
+	uint8_t mlen;
+	uint16_t kidx;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 
-	switch (type) {
+	mlen = tei->masklen;
+
+	switch (ti->type) {
 	case IPFW_TABLE_CIDR:
-		if (plen == sizeof(in_addr_t)) {
+		if (tei->plen == sizeof(in_addr_t)) {
 #ifdef INET
 			/* IPv4 case */
 			if (mlen > 32)
 				return (EINVAL);
 			ent = malloc(sizeof(*ent), M_IPFW_TBL, M_WAITOK | M_ZERO);
-			ent->value = value;
+			ent->value = tei->value;
 			/* Set 'total' structure length */
 			KEY_LEN(ent->addr) = KEY_LEN_INET;
 			KEY_LEN(ent->mask) = KEY_LEN_INET;
 			/* Set offset of IPv4 address in bits */
 			offset = OFF_LEN_INET;
-			ent->mask.sin_addr.s_addr = htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
-			addr = *((in_addr_t *)paddr);
+			ent->mask.sin_addr.s_addr =
+			    htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
+			addr = *((in_addr_t *)tei->paddr);
 			ent->addr.sin_addr.s_addr = addr & ent->mask.sin_addr.s_addr;
 			/* Set pointers */
-			rnh_ptr = &ch->tables[tbl];
 			ent_ptr = ent;
 			addr_ptr = (struct sockaddr *)&ent->addr;
 			mask_ptr = (struct sockaddr *)&ent->mask;
 #endif
 #ifdef INET6
-		} else if (plen == sizeof(struct in6_addr)) {
+		} else if (tei->plen == sizeof(struct in6_addr)) {
 			/* IPv6 case */
 			if (mlen > 128)
 				return (EINVAL);
 			xent = malloc(sizeof(*xent), M_IPFW_TBL, M_WAITOK | M_ZERO);
-			xent->value = value;
+			xent->value = tei->value;
 			/* Set 'total' structure length */
 			KEY_LEN(xent->a.addr6) = KEY_LEN_INET6;
 			KEY_LEN(xent->m.mask6) = KEY_LEN_INET6;
 			/* Set offset of IPv6 address in bits */
 			offset = OFF_LEN_INET6;
 			ipv6_writemask(&xent->m.mask6.sin6_addr, mlen);
-			memcpy(&xent->a.addr6.sin6_addr, paddr, sizeof(struct in6_addr));
+			memcpy(&xent->a.addr6.sin6_addr, tei->paddr,
+			    sizeof(struct in6_addr));
 			APPLY_MASK(&xent->a.addr6.sin6_addr, &xent->m.mask6.sin6_addr);
 			/* Set pointers */
-			rnh_ptr = &ch->xtables[tbl];
 			ent_ptr = xent;
 			addr_ptr = (struct sockaddr *)&xent->a.addr6;
 			mask_ptr = (struct sockaddr *)&xent->m.mask6;
@@ -204,22 +253,23 @@ int
 	
 	case IPFW_TABLE_INTERFACE:
 		/* Check if string is terminated */
-		c = ((char *)paddr)[IF_NAMESIZE - 1];
-		((char *)paddr)[IF_NAMESIZE - 1] = '\0';
-		if (((mlen = strlen((char *)paddr)) == IF_NAMESIZE - 1) && (c != '\0'))
+		c = ((char *)tei->paddr)[IF_NAMESIZE - 1];
+		((char *)tei->paddr)[IF_NAMESIZE - 1] = '\0';
+		mlen = strlen((char *)tei->paddr);
+		if ((mlen == IF_NAMESIZE - 1) && (c != '\0'))
 			return (EINVAL);
 
 		/* Include last \0 into comparison */
 		mlen++;
 
 		xent = malloc(sizeof(*xent), M_IPFW_TBL, M_WAITOK | M_ZERO);
-		xent->value = value;
+		xent->value = tei->value;
 		/* Set 'total' structure length */
 		KEY_LEN(xent->a.iface) = KEY_LEN_IFACE + mlen;
 		KEY_LEN(xent->m.ifmask) = KEY_LEN_IFACE + mlen;
 		/* Set offset of interface name in bits */
 		offset = OFF_LEN_IFACE;
-		memcpy(xent->a.iface.ifname, paddr, mlen);
+		memcpy(xent->a.iface.ifname, tei->paddr, mlen);
 		/* Assume direct match */
 		/* TODO: Add interface pattern matching */
 #if 0
@@ -227,7 +277,6 @@ int
 		mask_ptr = (struct sockaddr *)&xent->m.ifmask;
 #endif
 		/* Set pointers */
-		rnh_ptr = &ch->xtables[tbl];
 		ent_ptr = xent;
 		addr_ptr = (struct sockaddr *)&xent->a.iface;
 		mask_ptr = NULL;
@@ -237,84 +286,128 @@ int
 		return (EINVAL);
 	}
 
-	IPFW_WLOCK(ch);
+	IPFW_UH_WLOCK(ch);
 
-	/* Check if tabletype is valid */
-	if ((ch->tabletype[tbl] != 0) && (ch->tabletype[tbl] != type)) {
-		IPFW_WUNLOCK(ch);
-		free(ent_ptr, M_IPFW_TBL);
-		return (EINVAL);
-	}
+	ni = CHAIN_TO_NI(ch);
 
-	/* Check if radix tree exists */
-	if ((rnh = *rnh_ptr) == NULL) {
-		IPFW_WUNLOCK(ch);
-		/* Create radix for a new table */
-		if (!rn_inithead((void **)&rnh, offset)) {
-			free(ent_ptr, M_IPFW_TBL);
+	tc_new = NULL;
+	if ((tc = find_table(ni, ti)) == NULL) {
+		/* Not found. We have to create new one */
+		IPFW_UH_WUNLOCK(ch);
+
+		tc_new = alloc_table_config(ni, ti);
+		if (tc_new == NULL)
 			return (ENOMEM);
-		}
 
-		IPFW_WLOCK(ch);
-		if (*rnh_ptr != NULL) {
-			/* Tree is already attached by other thread */
-			rn_detachhead((void **)&rnh);
-			rnh = *rnh_ptr;
-			/* Check table type another time */
-			if (ch->tabletype[tbl] != type) {
-				IPFW_WUNLOCK(ch);
-				free(ent_ptr, M_IPFW_TBL);
+		IPFW_UH_WLOCK(ch);
+
+		/* Check if table was already allocated by another thread */
+		if ((tc = find_table(ni, ti)) != NULL) {
+			if (tc->no.type != ti->type) {
+				IPFW_UH_WUNLOCK(ch);
+				free_table_config(ni, tc_new);
 				return (EINVAL);
 			}
 		} else {
-			*rnh_ptr = rnh;
-			/* 
-			 * Set table type. It can be set already
-			 * (if we have IPv6-only table) but setting
-			 * it another time does not hurt
+			/*
+			 * New table.
+			 * Set tc_new to NULL so that it is not freed afterwards.
 			 */
-			ch->tabletype[tbl] = type;
+			tc = tc_new;
+			tc_new = NULL;
+
+			/* Allocate table index. */
+			if (ipfw_objhash_alloc_idx(ni, ti->set, &kidx) != 0) {
+				/* Index full. */
+				IPFW_UH_WUNLOCK(ch);
+				printf("Unable to allocate index for table %s."
+				    " Consider increasing "
+				    "net.inet.ip.fw.tables_max\n",
+				    tc->no.name);
+				free_table_config(ni, tc);
+				return (EBUSY);
+			}
+			/* Save kidx */
+			tc->no.kidx = kidx;
 		}
+	} else {
+		/* We still have to check table type */
+		if (tc->no.type != ti->type) {
+			IPFW_UH_WUNLOCK(ch);
+			return (EINVAL);
+		}
 	}
+	kidx = tc->no.kidx;
 
+	/* We've got valid table in @tc. Let's add data */
+	IPFW_WLOCK(ch);
+
+	if (tc->linked == 0) {
+		link_table(ch, tc);
+	}
+
+	/* XXX: Temporary until splitting add/del to per-type functions */
+	rnh = NULL;
+	switch (ti->type) {
+	case IPFW_TABLE_CIDR:
+		if (tei->plen == sizeof(in_addr_t))
+			rnh = ch->tables[kidx];
+		else
+			rnh = ch->xtables[kidx];
+		break;
+	case IPFW_TABLE_INTERFACE:
+		rnh = ch->xtables[kidx];
+		break;
+	}
+
 	rn = rnh->rnh_addaddr(addr_ptr, mask_ptr, rnh, ent_ptr);
 	IPFW_WUNLOCK(ch);
+	IPFW_UH_WUNLOCK(ch);
 
+	if (tc_new != NULL)
+		free_table_config(ni, tc_new);
+
 	if (rn == NULL) {
 		free(ent_ptr, M_IPFW_TBL);
 		return (EEXIST);
 	}
+
 	return (0);
 }
 
 int
-ipfw_del_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type)
+ipfw_del_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei)
 {
-	struct radix_node_head *rnh, **rnh_ptr;
+	struct radix_node_head *rnh;
 	struct table_entry *ent;
 	in_addr_t addr;
 	struct sockaddr_in sa, mask;
 	struct sockaddr *sa_ptr, *mask_ptr;
+	struct table_config *tc;
+	struct namedobj_instance *ni;
 	char c;
+	uint8_t mlen;
+	uint16_t kidx;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 
-	switch (type) {
+	mlen = tei->masklen;
+
+	switch (ti->type) {
 	case IPFW_TABLE_CIDR:
-		if (plen == sizeof(in_addr_t)) {
+		if (tei->plen == sizeof(in_addr_t)) {
 			/* Set 'total' structure length */
 			KEY_LEN(sa) = KEY_LEN_INET;
 			KEY_LEN(mask) = KEY_LEN_INET;
 			mask.sin_addr.s_addr = htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
-			addr = *((in_addr_t *)paddr);
+			addr = *((in_addr_t *)tei->paddr);
 			sa.sin_addr.s_addr = addr & mask.sin_addr.s_addr;
-			rnh_ptr = &ch->tables[tbl];
 			sa_ptr = (struct sockaddr *)&sa;
 			mask_ptr = (struct sockaddr *)&mask;
 #ifdef INET6
-		} else if (plen == sizeof(struct in6_addr)) {
+		} else if (tei->plen == sizeof(struct in6_addr)) {
 			/* IPv6 case */
 			if (mlen > 128)
 				return (EINVAL);
@@ -325,9 +418,9 @@ int
 			KEY_LEN(sa6) = KEY_LEN_INET6;
 			KEY_LEN(mask6) = KEY_LEN_INET6;
 			ipv6_writemask(&mask6.sin6_addr, mlen);
-			memcpy(&sa6.sin6_addr, paddr, sizeof(struct in6_addr));
+			memcpy(&sa6.sin6_addr, tei->paddr,
+			    sizeof(struct in6_addr));
 			APPLY_MASK(&sa6.sin6_addr, &mask6.sin6_addr);
-			rnh_ptr = &ch->xtables[tbl];
 			sa_ptr = (struct sockaddr *)&sa6;
 			mask_ptr = (struct sockaddr *)&mask6;
 #endif
@@ -339,9 +432,10 @@ int
 
 	case IPFW_TABLE_INTERFACE:
 		/* Check if string is terminated */
-		c = ((char *)paddr)[IF_NAMESIZE - 1];
-		((char *)paddr)[IF_NAMESIZE - 1] = '\0';
-		if (((mlen = strlen((char *)paddr)) == IF_NAMESIZE - 1) && (c != '\0'))
+		c = ((char *)tei->paddr)[IF_NAMESIZE - 1];
+		((char *)tei->paddr)[IF_NAMESIZE - 1] = '\0';
+		mlen = strlen((char *)tei->paddr);
+		if ((mlen == IF_NAMESIZE - 1) && (c != '\0'))
 			return (EINVAL);
 
 		struct xaddr_iface ifname, ifmask;
@@ -360,9 +454,8 @@ int
 		mask_ptr = (struct sockaddr *)&ifmask;
 #endif
 		mask_ptr = NULL;
-		memcpy(ifname.ifname, paddr, mlen);
+		memcpy(ifname.ifname, tei->paddr, mlen);
 		/* Set pointers */
-		rnh_ptr = &ch->xtables[tbl];
 		sa_ptr = (struct sockaddr *)&ifname;
 
 		break;
@@ -371,20 +464,39 @@ int
 		return (EINVAL);
 	}
 
-	IPFW_WLOCK(ch);
-	if ((rnh = *rnh_ptr) == NULL) {
-		IPFW_WUNLOCK(ch);
+	IPFW_UH_RLOCK(ch);
+	ni = CHAIN_TO_NI(ch);
+	if ((tc = find_table(ni, ti)) == NULL) {
+		IPFW_UH_RUNLOCK(ch);
 		return (ESRCH);
 	}
 
-	if (ch->tabletype[tbl] != type) {
-		IPFW_WUNLOCK(ch);
+	if (tc->no.type != ti->type) {
+		IPFW_UH_RUNLOCK(ch);
 		return (EINVAL);
 	}
+	kidx = tc->no.kidx;
 
+	IPFW_WLOCK(ch);
+
+	rnh = NULL;
+	switch (ti->type) {
+	case IPFW_TABLE_CIDR:
+		if (tei->plen == sizeof(in_addr_t))
+			rnh = ch->tables[kidx];
+		else
+			rnh = ch->xtables[kidx];
+		break;
+	case IPFW_TABLE_INTERFACE:
+		rnh = ch->xtables[kidx];
+		break;
+	}
+
 	ent = (struct table_entry *)rnh->rnh_deladdr(sa_ptr, mask_ptr, rnh);
 	IPFW_WUNLOCK(ch);
 
+	IPFW_UH_RUNLOCK(ch);
+
 	if (ent == NULL)
 		return (ESRCH);
 
@@ -405,66 +517,161 @@ flush_table_entry(struct radix_node *rn, void *arg
 	return (0);
 }
 
+/*
+ * Flushes all entries in given table while minimizing chain WLOCK hold time.
+ *
+ */
 int
-ipfw_flush_table(struct ip_fw_chain *ch, uint16_t tbl)
+ipfw_flush_table(struct ip_fw_chain *ch, struct tid_info *ti)
 {
-	struct radix_node_head *rnh, *xrnh;
+	struct namedobj_instance *ni;
+	struct table_config *tc;
+	void *ostate, *oxstate;
+	void *state, *xstate;
+	int error;
+	uint8_t type;
+	uint16_t kidx;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 
 	/*
-	 * We free both (IPv4 and extended) radix trees and
-	 * clear table type here to permit table to be reused
-	 * for different type without module reload
+	 * Stage 1: determine table type.
+	 * Reference found table to ensure it won't disappear.
 	 */
+	IPFW_UH_WLOCK(ch);
+	ni = CHAIN_TO_NI(ch);
+	if ((tc = find_table(ni, ti)) == NULL) {
+		IPFW_UH_WUNLOCK(ch);
+		return (ESRCH);
+	}
+	type = tc->no.type;
+	tc->no.refcnt++;
+	IPFW_UH_WUNLOCK(ch);
 
+	/*
+	 * Stage 2: allocate new state for given type.
+	 */
+	if ((error = alloc_table_state(&state, &xstate, type)) != 0) {
+		IPFW_UH_WLOCK(ch);
+		tc->no.refcnt--;
+		IPFW_UH_WUNLOCK(ch);
+		return (error);
+	}
+
+	/*
+	 * Stage 3: swap old state pointers with newly-allocated ones.
+	 * Decrease refcount.
+	 */
+	IPFW_UH_WLOCK(ch);
 	IPFW_WLOCK(ch);
-	/* Set IPv4 table pointer to zero */
-	if ((rnh = ch->tables[tbl]) != NULL)
-		ch->tables[tbl] = NULL;
-	/* Set extended table pointer to zero */
-	if ((xrnh = ch->xtables[tbl]) != NULL)
-		ch->xtables[tbl] = NULL;
-	/* Zero table type */
-	ch->tabletype[tbl] = 0;
+
+	ni = CHAIN_TO_NI(ch);
+	kidx = tc->no.kidx;
+
+	ostate = ch->tables[kidx];
+	ch->tables[kidx] = state;
+	oxstate = ch->xtables[kidx];
+	ch->xtables[kidx] = xstate;
+
+	tc->no.refcnt--;
+
 	IPFW_WUNLOCK(ch);
+	IPFW_UH_WUNLOCK(ch);
 
-	if (rnh != NULL) {
-		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
-		rn_detachhead((void **)&rnh);
+	/*
+	 * Stage 4: perform real flush.
+	 */
+	free_table_state(&ostate, &oxstate, type);
+
+	return (0);
+}
+
+/*
+ * Destroys given table @ti: unlinks it, frees its index and its state.
+ */
+int
+ipfw_destroy_table(struct ip_fw_chain *ch, struct tid_info *ti, int force)
+{
+	struct namedobj_instance *ni;
+	struct table_config *tc;
+
+	ti->set = TABLE_SET(ti->set);
+
+	IPFW_UH_WLOCK(ch);
+
+	ni = CHAIN_TO_NI(ch);
+	if ((tc = find_table(ni, ti)) == NULL) {
+		IPFW_UH_WUNLOCK(ch);
+		return (ESRCH);
 	}
 
-	if (xrnh != NULL) {
-		xrnh->rnh_walktree(xrnh, flush_table_entry, xrnh);
-		rn_detachhead((void **)&xrnh);
+	/* Do not permit destroying used tables */
+	if (tc->no.refcnt > 0 && force == 0) {
+		IPFW_UH_WUNLOCK(ch);
+		return (EBUSY);
 	}
 
+	IPFW_WLOCK(ch);
+	unlink_table(ch, tc);
+	IPFW_WUNLOCK(ch);
+
+	/* Free obj index */
+	if (ipfw_objhash_free_idx(ni, tc->no.set, tc->no.kidx) != 0)
+		printf("Error unlinking kidx %d from table %s\n",
+		    tc->no.kidx, tc->tablename);
+
+	IPFW_UH_WUNLOCK(ch);
+
+	free_table_config(ni, tc);
+
 	return (0);
 }
 
+static void
+destroy_table_locked(struct namedobj_instance *ni, struct named_object *no,
+    void *arg)
+{
+
+	unlink_table((struct ip_fw_chain *)arg, (struct table_config *)no);
+	if (ipfw_objhash_free_idx(ni, no->set, no->kidx) != 0)
+		printf("Error unlinking kidx %d from table %s\n",
+		    no->kidx, no->name);
+	free_table_config(ni, (struct table_config *)no);
+}
+
 void
 ipfw_destroy_tables(struct ip_fw_chain *ch)
 {
-	uint16_t tbl;
 
-	/* Flush all tables */
-	for (tbl = 0; tbl < V_fw_tables_max; tbl++)
-		ipfw_flush_table(ch, tbl);
+	/* Remove all tables from working set */
+	IPFW_UH_WLOCK(ch);
+	IPFW_WLOCK(ch);
+	ipfw_objhash_foreach(CHAIN_TO_NI(ch), destroy_table_locked, ch);
+	IPFW_WUNLOCK(ch);
+	IPFW_UH_WUNLOCK(ch);
 
 	/* Free pointers itself */
 	free(ch->tables, M_IPFW);
 	free(ch->xtables, M_IPFW);
-	free(ch->tabletype, M_IPFW);
+
+	ipfw_objhash_destroy(CHAIN_TO_NI(ch));
+	free(CHAIN_TO_TCFG(ch), M_IPFW);
 }
 
 int
 ipfw_init_tables(struct ip_fw_chain *ch)
 {
+	struct tables_config *tcfg;
+
 	/* Allocate pointers */
 	ch->tables = malloc(V_fw_tables_max * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
 	ch->xtables = malloc(V_fw_tables_max * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
-	ch->tabletype = malloc(V_fw_tables_max * sizeof(uint8_t), M_IPFW, M_WAITOK | M_ZERO);
+
+	tcfg = malloc(sizeof(struct tables_config), M_IPFW, M_WAITOK | M_ZERO);
+	tcfg->namehash = ipfw_objhash_create(V_fw_tables_max);
+	ch->tblcfg = tcfg;
+
 	return (0);
 }
 
@@ -473,8 +680,10 @@ ipfw_resize_tables(struct ip_fw_chain *ch, unsigne
 {
 	struct radix_node_head **tables, **xtables, *rnh;
 	struct radix_node_head **tables_old, **xtables_old;
-	uint8_t *tabletype, *tabletype_old;
 	unsigned int ntables_old, tbl;
+	struct namedobj_instance *ni;
+	void *new_idx;
+	int new_blocks;
 
 	/* Check new value for validity */
 	if (ntables > IPFW_TABLES_MAX)
@@ -483,24 +692,31 @@ ipfw_resize_tables(struct ip_fw_chain *ch, unsigne
 	/* Allocate new pointers */
 	tables = malloc(ntables * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
 	xtables = malloc(ntables * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
-	tabletype = malloc(ntables * sizeof(uint8_t), M_IPFW, M_WAITOK | M_ZERO);
+	ipfw_objhash_bitmap_alloc(ntables, (void *)&new_idx, &new_blocks);
 
 	IPFW_WLOCK(ch);
 
 	tbl = (ntables >= V_fw_tables_max) ? V_fw_tables_max : ntables;
+	ni = CHAIN_TO_NI(ch);
 
+	/* Temporarily restrict decreasing max_tables */
+	if (ipfw_objhash_bitmap_merge(ni, &new_idx, &new_blocks) != 0) {
+		IPFW_WUNLOCK(ch);
+		free(tables, M_IPFW);
+		free(xtables, M_IPFW);
+		ipfw_objhash_bitmap_free(new_idx, new_blocks);
+		return (EINVAL);
+	}
+
 	/* Copy old table pointers */
 	memcpy(tables, ch->tables, sizeof(void *) * tbl);
 	memcpy(xtables, ch->xtables, sizeof(void *) * tbl);
-	memcpy(tabletype, ch->tabletype, sizeof(uint8_t) * tbl);
 
 	/* Change pointers and number of tables */
 	tables_old = ch->tables;
 	xtables_old = ch->xtables;
-	tabletype_old = ch->tabletype;
 	ch->tables = tables;
 	ch->xtables = xtables;
-	ch->tabletype = tabletype;
 
 	ntables_old = V_fw_tables_max;
 	V_fw_tables_max = ntables;
@@ -525,7 +741,7 @@ ipfw_resize_tables(struct ip_fw_chain *ch, unsigne
 	/* Free old pointers */
 	free(tables_old, M_IPFW);
 	free(xtables_old, M_IPFW);
-	free(tabletype_old, M_IPFW);
+	ipfw_objhash_bitmap_free(new_idx, new_blocks);
 
 	return (0);
 }
@@ -602,14 +818,17 @@ count_table_entry(struct radix_node *rn, void *arg
 }
 
 int
-ipfw_count_table(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt)
+ipfw_count_table(struct ip_fw_chain *ch, struct tid_info *ti, uint32_t *cnt)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (ESRCH);
 	*cnt = 0;
-	if ((rnh = ch->tables[tbl]) == NULL)
+	if ((rnh = ch->tables[tc->no.kidx]) == NULL)
 		return (0);
 	rnh->rnh_walktree(rnh, count_table_entry, cnt);
 	return (0);
@@ -637,14 +856,17 @@ dump_table_entry(struct radix_node *rn, void *arg)
 }
 
 int
-ipfw_dump_table(struct ip_fw_chain *ch, ipfw_table *tbl)
+ipfw_dump_table(struct ip_fw_chain *ch, struct tid_info *ti, ipfw_table *tbl)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
-	if (tbl->tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (ESRCH);
 	tbl->cnt = 0;
-	if ((rnh = ch->tables[tbl->tbl]) == NULL)
+	if ((rnh = ch->tables[tc->no.kidx]) == NULL)
 		return (0);
 	rnh->rnh_walktree(rnh, dump_table_entry, tbl);
 	return (0);
@@ -660,16 +882,19 @@ count_table_xentry(struct radix_node *rn, void *ar
 }
 
 int
-ipfw_count_xtable(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt)
+ipfw_count_xtable(struct ip_fw_chain *ch, struct tid_info *ti, uint32_t *cnt)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 	*cnt = 0;
-	if ((rnh = ch->tables[tbl]) != NULL)
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (0);	/* XXX: We should return ESRCH */
+	if ((rnh = ch->tables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, count_table_xentry, cnt);
-	if ((rnh = ch->xtables[tbl]) != NULL)
+	if ((rnh = ch->xtables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, count_table_xentry, cnt);
 	/* Return zero if table is empty */
 	if (*cnt > 0)
@@ -747,19 +972,700 @@ dump_table_xentry_extended(struct radix_node *rn,
 }
 
 int
-ipfw_dump_xtable(struct ip_fw_chain *ch, ipfw_xtable *tbl)
+ipfw_dump_xtable(struct ip_fw_chain *ch, struct tid_info *ti, ipfw_xtable *tbl)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
 	if (tbl->tbl >= V_fw_tables_max)
 		return (EINVAL);
 	tbl->cnt = 0;
-	tbl->type = ch->tabletype[tbl->tbl];
-	if ((rnh = ch->tables[tbl->tbl]) != NULL)
+
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (0);	/* XXX: We should return ESRCH */
+	tbl->type = tc->no.type;
+	if ((rnh = ch->tables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, dump_table_xentry_base, tbl);
-	if ((rnh = ch->xtables[tbl->tbl]) != NULL)
+	if ((rnh = ch->xtables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, dump_table_xentry_extended, tbl);
 	return (0);
 }
 
+/*
+ * Tables rewriting code 
+ *
+ */
+
+/*
+ * Determine table number and lookup type for @cmd.
+ * Fill @puidx and @ptype with appropriate values.
+ * Returns 0 for relevant opcodes, 1 otherwise.
+ */
+static int
+classify_table_opcode(ipfw_insn *cmd, uint16_t *puidx, uint8_t *ptype)
+{
+	ipfw_insn_if *cmdif;
+	int skip;
+	uint16_t v;
+
+	skip = 1;
+
+	switch (cmd->opcode) {
+	case O_IP_SRC_LOOKUP:
+	case O_IP_DST_LOOKUP:
+		/* Basic IPv4/IPv6 or u32 lookups */
+		*puidx = cmd->arg1;
+		/* Assume CIDR by default */
+		*ptype = IPFW_TABLE_CIDR;
+		skip = 0;
+		
+		if (F_LEN(cmd) > F_INSN_SIZE(ipfw_insn_u32)) {
+			/*
+			 * generic lookup. The key must be
+			 * in 32bit big-endian format.
+			 */
+			v = ((ipfw_insn_u32 *)cmd)->d[1];
+			switch (v) {
+			case 0:
+			case 1:
+				/* IPv4 src/dst */
+				break;
+			case 2:
+			case 3:
+				/* src/dst port */
+				//type = IPFW_TABLE_U16;
+				break;
+			case 4:
+				/* uid/gid */
+				//type = IPFW_TABLE_U32;
+			case 5:
+				//type = IPFW_TABLE_U32;
+				/* jid */
+			case 6:
+				//type = IPFW_TABLE_U16;
+				/* dscp */
+				break;
+			}
+		}
+		break;
+	case O_XMIT:
+	case O_RECV:
+	case O_VIA:
+		/* Interface table, possibly */
+		cmdif = (ipfw_insn_if *)cmd;
+		if (cmdif->name[0] != '\1')
+			break;
+
+		*ptype = IPFW_TABLE_INTERFACE;
+		*puidx = cmdif->p.glob;
+		skip = 0;
+		break;
+	}
+
+	return (skip);
+}
+
+/*
+ * Sets new table value for given opcode.
+ * Assume the same opcodes as classify_table_opcode()
+ */
+static void
+update_table_opcode(ipfw_insn *cmd, uint16_t idx)
+{
+	ipfw_insn_if *cmdif;
+
+	switch (cmd->opcode) {
+	case O_IP_SRC_LOOKUP:
+	case O_IP_DST_LOOKUP:
+		/* Basic IPv4/IPv6 or u32 lookups */
+		cmd->arg1 = idx;
+		break;
+	case O_XMIT:
+	case O_RECV:
+	case O_VIA:
+		/* Interface table, possibly */
+		cmdif = (ipfw_insn_if *)cmd;
+		cmdif->p.glob = idx;
+		break;
+	}
+}
+
+static char *
+find_name_tlv(void *tlvs, int len, uint16_t uidx)
+{
+	ipfw_xtable_ntlv *ntlv;
+	uintptr_t pa, pe;
+	int l;
+
+	pa = (uintptr_t)tlvs;
+	pe = pa + len;
+	l = 0;
+	for (; pa < pe; pa += l) {
+		ntlv = (ipfw_xtable_ntlv *)pa;
+		l = ntlv->head.length;
+		if (ntlv->head.type != IPFW_TLV_NAME)
+			continue;
+		if (ntlv->idx != uidx)
+			continue;
+		
+		return (ntlv->name);
+	}
+
+	return (NULL);
+}
+
+static struct table_config *
+find_table(struct namedobj_instance *ni, struct tid_info *ti)
+{
+	char *name, bname[16];
+	struct named_object *no;
+
+	if (ti->tlvs != NULL) {
+		name = find_name_tlv(ti->tlvs, ti->tlen, ti->uidx);
+		if (name == NULL)
+			return (NULL);
+	} else {
+		snprintf(bname, sizeof(bname), "%d", ti->uidx);
+		name = bname;
+	}
+
+	no = ipfw_objhash_lookup_name(ni, ti->set, name);
+
+	return ((struct table_config *)no);
+}
+
+static int
+alloc_table_state(void **state, void **xstate, uint8_t type)
+{
+
+	switch (type) {
+	case IPFW_TABLE_CIDR:
+		if (!rn_inithead(state, OFF_LEN_INET))
+			return (ENOMEM);
+		if (!rn_inithead(xstate, OFF_LEN_INET6)) {
+			rn_detachhead(state);
+			return (ENOMEM);
+		}
+		break;
+	case IPFW_TABLE_INTERFACE:
+		*state = NULL;
+		if (!rn_inithead(xstate, OFF_LEN_IFACE))
+			return (ENOMEM);
+		break;
+	}
+	
+	return (0);
+}
+
+
+static struct table_config *
+alloc_table_config(struct namedobj_instance *ni, struct tid_info *ti)
+{
+	char *name, bname[16];
+	struct table_config *tc;
+	int error;
+
+	if (ti->tlvs != NULL) {
+		name = find_name_tlv(ti->tlvs, ti->tlen, ti->uidx);
+		if (name == NULL)
+			return (NULL);
+	} else {
+		snprintf(bname, sizeof(bname), "%d", ti->uidx);
+		name = bname;
+	}
+
+	tc = malloc(sizeof(struct table_config), M_IPFW, M_WAITOK | M_ZERO);
+	tc->no.name = tc->tablename;
+	tc->no.type = ti->type;
+	tc->no.set = ti->set;
+	strlcpy(tc->tablename, name, sizeof(tc->tablename));
+
+	if (ti->tlvs == NULL) {
+		tc->no.compat = 1;
+		tc->no.uidx = ti->uidx;
+	}
+
+	/* Preallocate data structures for new tables */
+	error = alloc_table_state(&tc->state, &tc->xstate, ti->type);
+	if (error != 0) {
+		free(tc, M_IPFW);
+		return (NULL);
+	}
+	
+	return (tc);
+}
+
+static void
+free_table_state(void **state, void **xstate, uint8_t type)
+{
+	struct radix_node_head *rnh;
+
+	switch (type) {
+	case IPFW_TABLE_CIDR:
+		rnh = (struct radix_node_head *)(*state);
+		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
+		rn_detachhead(state);
+
+		rnh = (struct radix_node_head *)(*xstate);
+		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
+		rn_detachhead(xstate);
+		break;
+	case IPFW_TABLE_INTERFACE:
+		rnh = (struct radix_node_head *)(*xstate);
+		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
+		rn_detachhead(xstate);
+		break;
+	}
+}
+
+static void
+free_table_config(struct namedobj_instance *ni, struct table_config *tc)
+{
+
+	if (tc->linked == 0)
+		free_table_state(&tc->state, &tc->xstate, tc->no.type);
+
+	free(tc, M_IPFW);
+}
+
+/*
+ * Links @tc to @chain table named instance.
+ * Sets appropriate type/states in @chain table info.
+ */
+static void
+link_table(struct ip_fw_chain *chain, struct table_config *tc)
+{
+	struct namedobj_instance *ni;
+	uint16_t kidx;
+
+	IPFW_UH_WLOCK_ASSERT(chain);
+	IPFW_WLOCK_ASSERT(chain);
+
+	ni = CHAIN_TO_NI(chain);
+	kidx = tc->no.kidx;
+
+	ipfw_objhash_add(ni, &tc->no);
+	chain->tables[kidx] = tc->state;
+	chain->xtables[kidx] = tc->xstate;
+
+	tc->linked = 1;
+}
+
+/*
+ * Unlinks @tc from @chain table named instance.
+ * Zeroes states in @chain and stores them in @tc.
+ */
+static void
+unlink_table(struct ip_fw_chain *chain, struct table_config *tc)
+{
+	struct namedobj_instance *ni;
+	uint16_t kidx;
+
+	IPFW_UH_WLOCK_ASSERT(chain);
+	IPFW_WLOCK_ASSERT(chain);
+
+	ni = CHAIN_TO_NI(chain);
+	kidx = tc->no.kidx;
+
+	/* Clear state and save pointers for flush */
+	ipfw_objhash_del(ni, &tc->no);
+	tc->state = chain->tables[kidx];
+	chain->tables[kidx] = NULL;
+	tc->xstate = chain->xtables[kidx];
+	chain->xtables[kidx] = NULL;
+
+	tc->linked = 0;
+}
+
+/*
+ * Finds named object by @uidx number.
+ * Refs found object, allocate new index for non-existing object.
+ * Fills in @pidx with userland/kernel indexes.
+ *
+ * Returns 0 on success.
+ */
+static int
+bind_table(struct namedobj_instance *ni, struct rule_check_info *ci,
+    struct obj_idx *pidx, struct tid_info *ti)
+{
+	struct table_config *tc;
+
+	tc = find_table(ni, ti);
+
+	pidx->uidx = ti->uidx;
+	pidx->type = ti->type;
+
+	if (tc == NULL) {
+		/* No such table yet: reserve a new kernel index for it */
+		if (ipfw_objhash_alloc_idx(ni, ti->set, &pidx->kidx) != 0) {
+			printf("Unable to allocate table index in set %u."
+			    " Consider increasing net.inet.ip.fw.tables_max\n",
+				    ti->set);
+			return (EBUSY);
+		}
+
+		pidx->new = 1;
+		ci->new_tables++;
+
+		return (0);
+	}
+
+	/* Check if table type is valid first */
+	if (tc->no.type != ti->type)
+		return (EINVAL);
+
+	tc->no.refcnt++;
+
+	pidx->kidx = tc->no.kidx;
+
+	return (0);
+}
+
+/*
+ * Compatibility function for old ipfw(8) binaries.
+ * Rewrites table kernel indices with userland ones.
+ * Works for \d+ tables only (e.g. for tables converted
+ * from old numbered system calls).
+ *
+ * Returns 0 on success.
+ * Raises error on any other tables.
+ */
+int
+ipfw_rewrite_table_kidx(struct ip_fw_chain *chain, struct ip_fw *rule)
+{
+	int cmdlen, l;
+	ipfw_insn *cmd;
+	uint32_t set;
+	uint16_t kidx;
+	uint8_t type;
+	struct named_object *no;
+	struct namedobj_instance *ni;
+
+	ni = CHAIN_TO_NI(chain);
+
+	set = TABLE_SET(rule->set);
+	
+	l = rule->cmd_len;
+	cmd = rule->cmd;
+	cmdlen = 0;
+	for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+		cmdlen = F_LEN(cmd);
+
+		if (classify_table_opcode(cmd, &kidx, &type) != 0)
+			continue;
+
+		if ((no = ipfw_objhash_lookup_idx(ni, set, kidx)) == NULL)
+			return (1);
+
+		if (no->compat == 0)
+			return (2);
+
+		update_table_opcode(cmd, no->uidx);
+	}
+
+	return (0);
+}
+
+
+/*
+ * Checks if each table opcode references a table of the appropriate type.
+ * Adds a reference to every table found.
+ * Rewrites user-supplied opcode values with kernel ones.
+ *
+ * Returns 0 on success and appropriate error code otherwise.
+ */
+int
+ipfw_rewrite_table_uidx(struct ip_fw_chain *chain,
+    struct rule_check_info *ci)
+{
+	int cmdlen, error, ftype, l;
+	ipfw_insn *cmd;
+	uint16_t uidx;
+	uint8_t type;
+	struct table_config *tc;
+	struct namedobj_instance *ni;
+	struct named_object *no, *no_n, *no_tmp;
+	struct obj_idx *pidx, *p, *oib;
+	struct namedobjects_head nh;
+	struct tid_info ti;
+
+	ni = CHAIN_TO_NI(chain);
+
+	/*
+	 * Prepare an array for storing opcode indices.
+	 * Use stack allocation by default.
+	 */
+	if (ci->table_opcodes <= (sizeof(ci->obuf)/sizeof(ci->obuf[0]))) {
+		/* Stack */
+		pidx = ci->obuf;
+	} else
+		pidx = malloc(ci->table_opcodes * sizeof(struct obj_idx),
+		    M_IPFW, M_WAITOK | M_ZERO);
+
+	oib = pidx;
+	error = 0;
+
+	type = 0;
+	ftype = 0;
+
+	ci->tableset = TABLE_SET(ci->krule->set);
+
+	memset(&ti, 0, sizeof(ti));
+	ti.set = ci->tableset;
+	ti.tlvs = ci->tlvs;
+	ti.tlen = ci->tlen;
+
+	/*
+	 * Stage 1: reference existing tables and determine number
+	 * of tables we need to allocate
+	 */
+	IPFW_UH_WLOCK(chain);
+
+	l = ci->krule->cmd_len;
+	cmd = ci->krule->cmd;
+	cmdlen = 0;
+	for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+		cmdlen = F_LEN(cmd);
+
+		if (classify_table_opcode(cmd, &ti.uidx, &ti.type) != 0)
+			continue;
+
+		/*
+		 * Got table opcode with necessary info.
+		 * Try to reference existing tables and allocate
+		 * indices for non-existing ones while holding write lock.
+		 */
+		if ((error = bind_table(ni, ci, pidx, &ti)) != 0)
+			break;
+
+		/*
+		 * @pidx stores either existing ref'd table id or new one.
+		 * Move to next index
+		 */
+
+		pidx++;
+	}
+
+	if (error != 0) {
+		/* Unref everything we have already done */
+		for (p = oib; p < pidx; p++) {
+			if (p->new != 0) {
+				ipfw_objhash_free_idx(ni, ci->tableset,p->kidx);
+				continue;
+			}
+
+			/* Find & unref by existing idx */
+			no = ipfw_objhash_lookup_idx(ni, ci->tableset, p->kidx);
+			KASSERT(no!=NULL, ("Ref'd table %d disappeared",
+			    p->kidx));
+
+			no->refcnt--;
+		}
+
+		IPFW_UH_WUNLOCK(chain);
+
+		if (oib != ci->obuf)
+			free(oib, M_IPFW);
+
+		return (error);
+	}
+
+	IPFW_UH_WUNLOCK(chain);
+
+	/*
+	 * Stage 2: allocate table configs for every non-existent table
+	 */
+	/* Prepare queue to store configs; it is also walked at free: below */
+	TAILQ_INIT(&nh);
+
+	if (ci->new_tables > 0) {
+
+		for (p = oib; p < pidx; p++) {
+			if (p->new == 0)
+				continue;
+
+			/* TODO: get name from TLV */
+			ti.uidx = p->uidx;
+			ti.type = p->type;
+
+			tc = alloc_table_config(ni, &ti);
+
+			if (tc == NULL) {
+				error = ENOMEM;
+				goto free;
+			}
+
+			tc->no.kidx = p->kidx;
+			tc->no.refcnt = 1;
+
+			/* Add to list */
+			TAILQ_INSERT_TAIL(&nh, &tc->no, nn_next);
+		}
+
+		/*
+		 * Stage 2.1: Check if we're going to create 2 tables
+		 * with the same name, but different table types.
+		 */
+		TAILQ_FOREACH(no, &nh, nn_next) {
+			TAILQ_FOREACH(no_tmp, &nh, nn_next) {
+				if (strcmp(no->name, no_tmp->name) != 0)
+					continue;
+				if (no->type != no_tmp->type) {
+					error = EINVAL;
+					goto free;
+				}
+			}
+		}
+
+		/*
+		 * Stage 3: link & reference new table configs
+		 */
+
+		IPFW_UH_WLOCK(chain);
+
+		/*
+		 * Stage 3.1: Check if some tables we need to create have
+		 * already been created with a different table type.
+		 */
+
+		error = 0;
+		TAILQ_FOREACH_SAFE(no, &nh, nn_next, no_tmp) {
+			no_n = ipfw_objhash_lookup_name(ni, no->set, no->name);
+			if (no_n == NULL)
+				continue;
+
+			if (no_n->type != no->type) {
+				error = EINVAL;
+				break;
+			}
+
+		}
+
+		if (error != 0) {
+			/*
+			 * Someone has allocated a table with a different type.
+			 * We have to roll back everything.
+			 */
+			IPFW_UH_WUNLOCK(chain);
+
+			goto free;
+		}
+
+
+		/*
+		 * Finally, attach tables and rewrite rule.
+		 * We need to set table type for each new table,
+		 * so we have to acquire main WLOCK.
+		 */
+		IPFW_WLOCK(chain);
+		TAILQ_FOREACH_SAFE(no, &nh, nn_next, no_tmp) {
+			no_n = ipfw_objhash_lookup_name(ni, no->set, no->name);
+			if (no_n != NULL) {
+				/* Increase refcount for existing table */
+				no_n->refcnt++;
+				/* Keep oib array in sync: update kidx */
+				for (p = oib; p < pidx; p++) {
+					if (p->kidx == no->kidx) {
+						p->kidx = no_n->kidx;
+						break;
+					}
+				}
+
+				continue;
+			}
+
+			/* New table. Attach to runtime hash */
+			TAILQ_REMOVE(&nh, no, nn_next);
+
+			link_table(chain, (struct table_config *)no);
+		}
+		IPFW_WUNLOCK(chain);
+
+		/* Perform rule rewrite */
+		l = ci->krule->cmd_len;
+		cmd = ci->krule->cmd;
+		cmdlen = 0;
+		pidx = oib;
+		for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+			cmdlen = F_LEN(cmd);
+
+			if (classify_table_opcode(cmd, &uidx, &type) != 0)
+				continue;
+			update_table_opcode(cmd, pidx->kidx);
+			pidx++;
+		}
+
+		IPFW_UH_WUNLOCK(chain);
+	}
+
+	error = 0;
+
+	/*
+	 * Stage 4: free resources
+	 */
+free:
+	TAILQ_FOREACH_SAFE(no, &nh, nn_next, no_tmp)
+		free_table_config(ni, (struct table_config *)no);
+
+	if (oib != ci->obuf)
+		free(oib, M_IPFW);
+
+	return (error);
+}
+
+/*
+ * Remove references from every table used in @rule.
+ */
+void
+ipfw_unbind_table_rule(struct ip_fw_chain *chain, struct ip_fw *rule)
+{
+	int cmdlen, l;
+	ipfw_insn *cmd;
+	struct namedobj_instance *ni;
+	struct named_object *no;
+	uint32_t set;
+	uint16_t kidx;
+	uint8_t type;
+
+	ni = CHAIN_TO_NI(chain);
+
+	set = TABLE_SET(rule->set);
+
+	l = rule->cmd_len;
+	cmd = rule->cmd;
+	cmdlen = 0;
+	for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+		cmdlen = F_LEN(cmd);
+
+		if (classify_table_opcode(cmd, &kidx, &type) != 0)
+			continue;
+
+		no = ipfw_objhash_lookup_idx(ni, set, kidx); 
+
+		KASSERT(no != NULL, ("table id %d not found", kidx));
+		KASSERT(no->type == type, ("wrong type %d (%d) for table id %d",
+		    no->type, type, kidx));
+		KASSERT(no->refcnt > 0, ("refcount for table %d is %d",
+		    kidx, no->refcnt));
+
+		no->refcnt--;
+	}
+}
+
+
+/*
+ * Removes table bindings for every rule in rule chain @head.
+ */
+void
+ipfw_unbind_table_list(struct ip_fw_chain *chain, struct ip_fw *head)
+{
+	struct ip_fw *rule;
+
+	while ((rule = head) != NULL) {
+		head = head->x_next;
+		ipfw_unbind_table_rule(chain, rule);
+	}
+}
+
+
 /* end of file */

--------------090406000404070502000405--
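
For reviewers who want to see the Stage 1 logic of ipfw_rewrite_table_uidx() in isolation, here is a rough userland sketch of the "reference an existing table or reserve an index, and undo everything on the first error" pattern. It is not part of the patch. Every sk_* name, the fixed-size registry and the byte array standing in for the per-set free-index bitmask are assumptions made up for illustration; locking is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_TABLES	8

/* Hypothetical stand-in for a linked table_config. */
struct sk_table {
	int		used;	/* slot holds a linked table */
	uint8_t		type;
	uint32_t	uidx;
	uint32_t	refcnt;
};

/* Hypothetical stand-in for the named-object instance. */
struct sk_registry {
	struct sk_table	tables[SK_MAX_TABLES];
	uint8_t		idx_busy[SK_MAX_TABLES]; /* reserved or linked */
};

/* One entry per table opcode, loosely modelled on struct obj_idx. */
struct sk_obj_idx {
	uint32_t	uidx;
	uint16_t	kidx;
	uint8_t		type;
	uint8_t		is_new;
};

struct sk_request {
	uint32_t	uidx;
	uint8_t		type;
};

/* Returns the kernel index of the linked table with @uidx, or -1. */
static int
sk_find(const struct sk_registry *r, uint32_t uidx)
{
	int k;

	for (k = 0; k < SK_MAX_TABLES; k++)
		if (r->tables[k].used && r->tables[k].uidx == uidx)
			return (k);
	return (-1);
}

/* References an existing table or reserves a free index for a new one. */
static int
sk_bind(struct sk_registry *r, const struct sk_request *rq,
    struct sk_obj_idx *p)
{
	int k;

	p->uidx = rq->uidx;
	p->type = rq->type;
	p->is_new = 0;

	if ((k = sk_find(r, rq->uidx)) < 0) {
		for (k = 0; k < SK_MAX_TABLES; k++)
			if (r->idx_busy[k] == 0)
				break;
		if (k == SK_MAX_TABLES)
			return (EBUSY);
		r->idx_busy[k] = 1;
		p->kidx = (uint16_t)k;
		p->is_new = 1;
		return (0);
	}

	if (r->tables[k].type != rq->type)
		return (EINVAL);

	r->tables[k].refcnt++;
	p->kidx = (uint16_t)k;
	return (0);
}

/* Binds @n requests; on failure undoes the ones already bound. */
int
sk_bind_all(struct sk_registry *r, const struct sk_request *rq, size_t n,
    struct sk_obj_idx *out)
{
	size_t i, j;
	int error;

	error = 0;
	for (i = 0; i < n; i++)
		if ((error = sk_bind(r, &rq[i], &out[i])) != 0)
			break;
	if (error == 0)
		return (0);

	for (j = 0; j < i; j++) {
		if (out[j].is_new != 0) {
			/* Give the reserved index back */
			r->idx_busy[out[j].kidx] = 0;
			continue;
		}
		assert(r->tables[out[j].kidx].refcnt > 0);
		r->tables[out[j].kidx].refcnt--;
	}
	return (error);
}
```

As in the patch, a reserved index does not become visible to lookups until the table is actually linked, so two opcodes naming the same not-yet-existing table each get their own index in this sketch.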


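The opcode walk that appears in ipfw_rewrite_table_kidx(), ipfw_rewrite_table_uidx() and ipfw_unbind_table_rule() can also be tried outside the kernel. The sketch below is a simplified model only: struct sk_insn, SK_LEN(), the two opcode values and sk_classify() are hypothetical stand-ins for ipfw_insn, F_LEN() and classify_table_opcode(), and the replacement indices come from a plain array rather than from the obj_idx buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical instruction header; len counts 32-bit words. */
struct sk_insn {
	uint8_t		opcode;
	uint8_t		len;	/* low 6 bits: length, high bits: flags */
	uint16_t	arg1;
};

#define SK_LEN_MASK	0x3f
#define SK_LEN(cmd)	((cmd)->len & SK_LEN_MASK)

enum { SK_O_NOP = 0, SK_O_TABLE_LOOKUP = 1 };

/* Returns 0 and stores the table index if @cmd references a table. */
static int
sk_classify(const struct sk_insn *cmd, uint16_t *idx)
{

	if (cmd->opcode != SK_O_TABLE_LOOKUP)
		return (1);
	*idx = cmd->arg1;
	return (0);
}

/*
 * Walks @l words of instructions starting at @cmd and replaces the
 * index of the n-th table opcode with map[n].
 * Returns the number of table opcodes rewritten.
 */
int
sk_rewrite(struct sk_insn *cmd, int l, const uint16_t *map, int nmap)
{
	int cmdlen, n;
	uint16_t idx;

	n = 0;
	for (cmdlen = 0; l > 0; l -= cmdlen, cmd += cmdlen) {
		cmdlen = SK_LEN(cmd);
		/* A zero or oversized length would break the walk */
		assert(cmdlen > 0 && cmdlen <= l);

		if (sk_classify(cmd, &idx) != 0)
			continue;

		assert(n < nmap);
		cmd->arg1 = map[n++];
	}
	return (n);
}
```

The walk advances by the length stored in each instruction, so payload words are never inspected as opcodes, whatever bytes they happen to contain.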
