Date:      Thu, 05 Jun 2014 14:22:28 +0400
From:      "Alexander V. Chernikov" <melifaro@ipfw.ru>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        Luigi Rizzo <luigi@FreeBSD.org>, Bill Yuan <bycn82@gmail.com>, FreeBSD Net <net@FreeBSD.org>
Subject:   Re: [CFT]: ipfw named tables / different tabletypes
Message-ID:  <539044E4.1020904@ipfw.ru>
In-Reply-To: <538B2FE5.6070407@FreeBSD.org>
References:  <5379FE3C.6060501@FreeBSD.org> <20140521111002.GB62462@onelab2.iet.unipi.it> <537CEC12.8050404@FreeBSD.org> <20140521204826.GA67124@onelab2.iet.unipi.it> <537E1029.70007@FreeBSD.org> <20140522154740.GA76448@onelab2.iet.unipi.it> <537E2153.1040005@FreeBSD.org> <20140522163812.GA77634@onelab2.iet.unipi.it> <538B2FE5.6070407@FreeBSD.org>

This is a multi-part message in MIME format.
--------------060502050508080706040508
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 01.06.2014 17:51, Alexander V. Chernikov wrote:
> On 22.05.2014 20:38, Luigi Rizzo wrote:
>
> Long story short, new version is ready.
> I've tried to minimize changes in this patch to ease review/commit.
>
> Changes:
> * Add a set-aware named-object API capable of searching for and
> allocating objects by their name/idx.
> * Switch tables code to use string ids for configuration tasks.
> * Change locking model: most configuration changes are protected with 
> UH lock, runtime-visible are protected with both locks.
> * Reduce number of arguments passed to ipfw_table_add/del by using 
> separate structure.
> * Add internal V_fw_tables_sets tunable (set to 0) to prepare for 
> set-aware tables (requires opcodes/client support)
> * Implement typed table referencing (and tables are implicitly 
> allocated with all state like radix ptrs on reference)
> * Add "destroy" command to ipfw(8) using the new IP_FW_OBJ_DEL opcode
>
> Namedobj more detailed:
> * Blackbox api providing methods to add/del/search/enumerate objects
> * Statically-sized hashes for names/indexes
> * Per-set bitmask to indicate free indexes
> * Separate methods for index alloc/delete/resize
>
>
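To show the per-set free-index bitmask in isolation, here is a minimal
userland sketch. It is not the code from the patch: every sk_ name is
made up for illustration, and it scans bits with a plain loop rather
than ffsl(). As in the patch, a set bit means the index is free.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SK_MAX_SETS	32	/* mirrors IPFW_MAX_SETS in the patch */
#define SK_BLOCK_BITS	(CHAR_BIT * sizeof(unsigned long))

/* Hypothetical container: one bitmask region per set, 1 bit per index. */
struct sk_idxmap {
	unsigned long	*mask;		/* bit set == index is free */
	uint32_t	blocks;		/* "long" blocks per set */
};

static int
sk_idxmap_init(struct sk_idxmap *m, uint32_t items)
{
	size_t nblocks, bytes;

	assert(items > 0);
	nblocks = (items + SK_BLOCK_BITS - 1) / SK_BLOCK_BITS;
	bytes = nblocks * SK_MAX_SETS * sizeof(unsigned long);
	m->mask = malloc(bytes);
	if (m->mask == NULL)
		return (-1);
	memset(m->mask, 0xFF, bytes);	/* mark every index as free */
	m->blocks = (uint32_t)nblocks;
	return (0);
}

/* Take the lowest free index in @set; returns 0 on success. */
static int
sk_idx_alloc(struct sk_idxmap *m, uint32_t set, uint16_t *pidx)
{
	unsigned long *base;
	uint32_t i, bit;

	if (set >= SK_MAX_SETS)
		return (1);
	base = &m->mask[(size_t)set * m->blocks];
	for (i = 0; i < m->blocks; i++) {
		if (base[i] == 0)
			continue;	/* block fully used */
		for (bit = 0; bit < SK_BLOCK_BITS; bit++) {
			if ((base[i] & (1UL << bit)) == 0)
				continue;
			base[i] &= ~(1UL << bit);	/* mark as used */
			*pidx = (uint16_t)(i * SK_BLOCK_BITS + bit);
			return (0);
		}
	}
	return (1);	/* no free index left in this set */
}

/* Give an index back; returns 1 if it was out of range or already free. */
static int
sk_idx_free(struct sk_idxmap *m, uint32_t set, uint16_t idx)
{
	uint32_t i = (uint32_t)(idx / SK_BLOCK_BITS);
	uint32_t bit = (uint32_t)(idx % SK_BLOCK_BITS);
	unsigned long *blk;

	if (set >= SK_MAX_SETS || i >= m->blocks)
		return (1);
	blk = &m->mask[(size_t)set * m->blocks + i];
	if ((*blk & (1UL << bit)) != 0)
		return (1);
	*blk |= 1UL << bit;
	return (0);
}
```

Each set has its own region of the mask, so index 0 can be in use in
set 0 and still be free in set 1.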
> Basically, there should not be any user-visible changes except the 
> following:
> * reducing table_max is not supported
> * changing a table's type via flush & add won't work if the table is referenced
>
>
> I haven't removed any numbering restrictions to protect the following 
> case:
> one (with old client) unintentionally references too many tables (e.g. 
> 1000-1128),
> tries to allocate table from "valid" range and fails. Old client does 
> not have any ability to
> destroy any table, so the only way to solve this is either module 
> unload or reboot.
>
> I've uploaded the same patch to phabricator since it provides quite 
> handy diffs:
> https://phabric.freebsd.org/D139 (no login required).
A bit cleaner version attached.
>
>> On Thu, May 22, 2014 at 08:09:55PM +0400, Alexander V. Chernikov wrote:
>>> On 22.05.2014 19:47, Luigi Rizzo wrote:
>>>> On Thu, May 22, 2014 at 06:56:41PM +0400, Alexander V. Chernikov 
>>>> wrote:
>>>>> On 22.05.2014 00:48, Luigi Rizzo wrote:
>>>>>> On Wed, May 21, 2014 at 10:10:26PM +0400, Alexander V. Chernikov 
>>>>>> wrote:
>>>> ...
>>>>>> we can solve this by using 'low' numbers for the numeric tables
>>>>>> (these were limited anyways) and allocate the fake entries in
>>>>>> another range.
>>>>> Currently we have u16 space available in base opcode.
>>>> yes but the standard range for tables is much more limited:
>>>>
>>>>     net.inet.ip.fw.tables_max: 128
>>>>
>>>> so one can just (say) use 32k for "old" tables and the rest
>>>> for tables with non numeric names.
>>>> Does not seem to be a problem in practice.
>>> Well, using upper 32k means that you set this default to 65k which
>>> consumes 256k of memory on 32-bit arch.
>>> Embedded people won't be very happy about this (and changing table
>>> numbers on resize would be a nightmare).
>> no no, this is an implementation detail but
>> within the kernel you can just remap the 'old' and 'new'
>> table identifiers to a single contiguous range.
>> The only thing you need to do is that when you push
>> identifiers up to userland, those with 'new' names will
>> be mapped to the 32-64k range.
>>
>> Example:
>> user first specifies tables
>>     "18, goodguys, 530, badguys" in the same rule
>>     /sbin/ipfw will generate these numbers:
>>     18, 32768, 530, 32769 ; tlv {32768:goodguys, 32769:badguys}
>>     The kernel will then do a lookup of those identifiers and
>>     18: internal index 1, name "18"
>>     32768: internal index 2, name "goodguys"
>>     530: internal index 3, name "530"
>>     32769: internal index 4, name "badguys"
>>
>> Then the next rule contains tables
>>     1, badguys, 18
>>      /sbin/ipfw generates
>>     1, 32768, 18 ; tlv {32768:badguys} // note different from before
>>      Kernel looks up the names and remaps
>>     1: internal index 5, name "1"
>>     32768: internal index 4, name "badguys"
>>     18: internal index 1, name "18"
>>
>> Finally when you do an 'ipfw show' the kernel will remap names
>> between 1 and 32768 to themselves, and other names to 32768+
>> (or some other large number, say 40k and above) so
>> as they are found. So the rules will be pushed up with
>>     18, 40000, 530, 40001
>>     1, 40001, 18
>>
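The remapping in the example above can be sketched in a few lines of
userland C. This is only an illustration under simple assumptions: a
fixed-size array stands in for the kernel's name hash, numeric tables
are named by their decimal string, and all sk_ identifiers are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SK_MAX_TABLES	16
#define SK_NAME_LEN	64

/* Hypothetical name registry; slot i holds the name of index i + 1. */
struct sk_names {
	char	name[SK_MAX_TABLES][SK_NAME_LEN];
	int	count;
};

/*
 * Return the 1-based internal index bound to @name, allocating the
 * next free one on first use. Returns 0 when the registry is full.
 * Names longer than SK_NAME_LEN - 1 are truncated in this sketch.
 */
static int
sk_bind(struct sk_names *t, const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < t->count; i++) {
		if (strcmp(t->name[i], name) == 0)
			return (i + 1);
	}
	if (t->count == SK_MAX_TABLES)
		return (0);
	snprintf(t->name[t->count], SK_NAME_LEN, "%s", name);
	return (++t->count);
}

/* "Old" numeric table: its name is simply the number as a string. */
static int
sk_bind_num(struct sk_names *t, unsigned int id)
{
	char buf[SK_NAME_LEN];

	snprintf(buf, sizeof(buf), "%u", id);
	return (sk_bind(t, buf));
}
```

Binding 18, goodguys, 530, badguys and then 1, badguys, 18 in that
order yields internal indexes 1, 2, 3, 4 and then 5, 4, 1, matching
the walk-through above. The identifier userland picked for a name
(32768 vs 32769) never matters, only the name does.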
>> we can discuss the other details privately
>>
>> cheers
>> luigi
>>
>>
>> 1. first, the
>>>>>> maybe i am missing some detail but it seems reasonably easy to 
>>>>>> implement
>>>>>> the atomic swap -- and the use case is when you want to move from
>>>>>> one configuration to a new one:
>>>>>>     ipfw table foo-new flush // clear initial content
>>>>>>     ipfw table foo-new add  ... <repeat as needed>
>>>>>>     ipfw table swap foo-current foo-new // swap the content of 
>>>>>> the table objects
>>>>>>
>>>>>> so you preserve the semantic of the name very easily.
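A rough sketch of what such a swap amounts to: the two table objects
keep their names and only exchange their contents. The struct and
function here are hypothetical, and in the kernel the exchange would
have to happen under the appropriate write lock, which this userland
sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table object: a stable name plus swappable contents. */
struct sk_table {
	const char	*name;		/* stays with the object */
	void		*content;	/* e.g. lookup-structure head */
	unsigned int	count;		/* number of entries */
};

/*
 * Exchange the contents of two tables and leave their names alone, so
 * rules that reference "foo-current" see the new data after the swap.
 */
static void
sk_table_swap(struct sk_table *a, struct sk_table *b)
{
	void *content;
	unsigned int count;

	assert(a != NULL && b != NULL);

	content = a->content;
	count = a->count;
	a->content = b->content;
	a->count = b->count;
	b->content = content;
	b->count = count;
}
```

Because only pointers and counters move, the cost does not depend on
how many entries either table holds.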
>>>>> Yes. We can easily add atomic table swap that way. However, I'm 
>>>>> talking
>>>>> about different use scenario:
>>>>> Atomically swap an entire ruleset which has some table dependencies:
>>>>>
>>>>>
>>>>> e.g. we have:
>>>>>
>>>>> "
>>>>> 100 allow ip from table(TABLE1) to me
>>>>> 200 allow ip from table(TABLE2) to (TABLE3) 80
>>>>>
>>>>> table TABLE1 1.1.1.1/32
>>>>> table TABLE1 1.0.0.0/16
>>>>>
>>>>> table TABLE2 2.2.2.2/32
>>>>>
>>>>> table TABLE3 3.3.3.3/32
>>>>> "
>>>>> and we want to _atomically_ change this to
>>>>>
>>>>> "
>>>>> 100 allow ip from table(TABLE1) to me
>>>>> +200 allow ip from table(TABLE4) to any
>>>>> 300 allow ip from table(TABLE2) to (TABLE3) 80
>>>>>
>>>>> table TABLE1 1.1.1.1/32
>>>>> -table TABLE1 1.0.0.0/16
>>>>>
>>>>> -table TABLE2 2.2.2.2/32
>>>>> +table TABLE2 77.77.77.0/24
>>>>>
>>>>> table TABLE3 3.3.3.3/32
>>>>>
>>>>> +table TABLE4 4.4.4.4/32
>>>>> "
>>>> aargh, that's too much -- because between changing
>>>> one table and all tables there are infinite intermediate
>>>> points that all make sense.
>>> It depends. As I said before, we're currently solving this problem by
>>> adding new rules (to set X) referencing tables from different range
>>> (2048 tables per ruleset) and then doing swap.
>>> (And not being able to use named tables to store real names after
>>> implementing them is a bit discouraging).
>>>
>>>> For those cases i think the way to go could be to
>>>> insert a 'disabled' new ruleset (however complex it is,
>>>> so it covers all possible cases), and then do the set swap,
>>>> or disable/enable.
>>> We can think of per-set arrays/namespaces of tables:
>>>
>>> so "ipfw add 100 set X allow ip from table(Y) to ..." will reference
>>> table Y in set X and
>>> "ipfw table ABC list" can differ from "ipfw table ABC set 5 list".
>>>
>>> This behavior can break some users setups so we can provide
>>> sysctl/tunable to turn this off or on.
>>>
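One possible reading of the per-set namespace with an on/off switch,
as a sketch only: the lookup key is (set, name), and when the switch
is off every lookup collapses to set 0, so the old behavior is kept.
The names below are hypothetical and the linear scan stands in for a
real hash lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table record keyed by (set, name). */
struct sk_obj {
	const char	*name;
	uint32_t	set;
};

/*
 * Find @name in @set. With @sets_enabled == 0 the set argument is
 * ignored and everything lives in set 0 (a single global namespace).
 */
static const struct sk_obj *
sk_lookup(const struct sk_obj *objs, size_t n, int sets_enabled,
    uint32_t set, const char *name)
{
	uint32_t eff_set;
	size_t i;

	assert(name != NULL);
	eff_set = sets_enabled ? set : 0;
	for (i = 0; i < n; i++) {
		if (objs[i].set == eff_set && strcmp(objs[i].name, name) == 0)
			return (&objs[i]);
	}
	return (NULL);
}
```

With the switch on, "ABC" in set 5 and "ABC" in set 0 are distinct
objects. With it off, both requests resolve to the set 0 table.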
>>>> cheers
>>>> luigi
>>>>
>


--------------060502050508080706040508
Content-Type: text/x-patch;
 name="D139_4.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="D139_4.diff"

Index: sbin/ipfw/ipfw2.c
===================================================================
--- sbin/ipfw/ipfw2.c
+++ sbin/ipfw/ipfw2.c
@@ -4243,6 +4243,23 @@
 		do {
 			table_list(xent.tbl, is_all);
 		} while (++xent.tbl < a);
+	} else if (_substrcmp(*av, "destroy") == 0) {
+		char xbuf[sizeof(ipfw_obj_header) + sizeof(ipfw_xtable_ntlv)];
+		ipfw_obj_header *oh;
+		ipfw_xtable_ntlv *ntlv;
+
+		memset(xbuf, 0, sizeof(xbuf));
+		oh = (ipfw_obj_header *)xbuf;
+		ntlv = (ipfw_xtable_ntlv *)(oh + 1);
+
+		ntlv->head.type = IPFW_TLV_NAME;
+		ntlv->head.length = sizeof(*ntlv);
+		ntlv->idx = 1;
+		snprintf(ntlv->name, sizeof(ntlv->name), "%d", xent.tbl);
+		oh->idx = 1;
+		oh->objtype = IPFW_OBJTYPE_TABLE;
+		if (do_setcmd3(IP_FW_OBJ_DEL, xbuf, sizeof(xbuf)) != 0)
+			err(EX_OSERR, "setsockopt(IP_FW_OBJ_DEL)");
 	} else
 		errx(EX_USAGE, "invalid table command %s", *av);
 }
Index: sys/netinet/ip_fw.h
===================================================================
--- sys/netinet/ip_fw.h
+++ sys/netinet/ip_fw.h
@@ -37,6 +37,11 @@
 #define	IPFW_DEFAULT_RULE	65535
 
 /*
+ * Number of sets supported by ipfw
+ */
+#define	IPFW_MAX_SETS		32
+
+/*
  * Default number of ipfw tables.
  */
 #define	IPFW_TABLES_MAX		65535
@@ -74,6 +79,7 @@
 #define	IP_FW_TABLE_XDEL	87	/* delete entry */
 #define	IP_FW_TABLE_XGETSIZE	88	/* get table size */
 #define	IP_FW_TABLE_XLIST	89	/* list table contents */
+#define	IP_FW_OBJ_DEL		90	/* del table/pipe/etc */
 
 /*
  * The kernel representation of ipfw rules is made of a list of
@@ -632,12 +638,34 @@
 } ipfw_table;
 
 typedef struct	_ipfw_xtable {
-	ip_fw3_opheader	opheader;	/* eXtended tables are controlled via IP_FW3 */
+	ip_fw3_opheader	opheader;	/* IP_FW3 opcode */
 	uint32_t	size;		/* size of entries in bytes	*/
 	uint32_t	cnt;		/* # of entries			*/
 	uint16_t	tbl;		/* table number			*/
 	uint8_t		type;		/* table type			*/
 	ipfw_table_xentry xent[0];	/* entries			*/
 } ipfw_xtable;
 
+typedef struct  _ipfw_xtable_tlv {
+	uint16_t        type;		/* TLV type */
+	uint16_t        length;		/* Total length, aligned to u32	*/
+} ipfw_xtable_tlv;
+
+#define	IPFW_TLV_NAME	1
+/* Object name TLV */
+typedef struct _ipfw_xtable_ntlv {
+	ipfw_xtable_tlv	head;		/* TLV header */
+	uint16_t	idx;		/* Name index */
+	uint16_t	spare;		/* unused */
+	char		name[64];	/* Null-terminated name */
+} ipfw_xtable_ntlv;
+
+typedef struct _ipfw_obj_header {
+	ip_fw3_opheader	opheader;	/* IP_FW3 opcode		*/
+	uint32_t	set;		/* Set we're operating		*/
+	uint16_t	idx;		/* object name index		*/
+	uint16_t	objtype;	/* object type			*/
+} ipfw_obj_header;
+#define	IPFW_OBJTYPE_TABLE	1
+
 #endif /* _IPFW2_H */
Index: sys/netpfil/ipfw/ip_fw2.c
===================================================================
--- sys/netpfil/ipfw/ip_fw2.c
+++ sys/netpfil/ipfw/ip_fw2.c
@@ -121,6 +121,7 @@
 VNET_DEFINE(int, fw_one_pass) = 1;
 
 VNET_DEFINE(unsigned int, fw_tables_max);
+VNET_DEFINE(unsigned int, fw_tables_sets) = 0;	/* Don't use set-aware tables */
 /* Use 128 tables by default */
 static unsigned int default_fw_tables = IPFW_TABLES_DEFAULT;
 
@@ -2719,7 +2720,6 @@
 	ipfw_dyn_uninit(0);	/* run the callout_drain */
 	IPFW_WUNLOCK(chain);
 
-	ipfw_destroy_tables(chain);
 	reap = NULL;
 	IPFW_WLOCK(chain);
 	for (i = 0; i < chain->n_rules; i++) {
@@ -2731,6 +2731,7 @@
 		free(chain->map, M_IPFW);
 	IPFW_WUNLOCK(chain);
 	IPFW_UH_WUNLOCK(chain);
+	ipfw_destroy_tables(chain);
 	if (reap != NULL)
 		ipfw_reap_rules(reap);
 	IPFW_LOCK_DESTROY(chain);
Index: sys/netpfil/ipfw/ip_fw_private.h
===================================================================
--- sys/netpfil/ipfw/ip_fw_private.h
+++ sys/netpfil/ipfw/ip_fw_private.h
@@ -212,14 +212,18 @@
 VNET_DECLARE(unsigned int, fw_tables_max);
 #define V_fw_tables_max		VNET(fw_tables_max)
 
+VNET_DECLARE(unsigned int, fw_tables_sets);
+#define V_fw_tables_sets	VNET(fw_tables_sets)
+
+struct tables_config;
+
 struct ip_fw_chain {
 	struct ip_fw	**map;		/* array of rule ptrs to ease lookup */
 	uint32_t	id;		/* ruleset id */
 	int		n_rules;	/* number of static rules */
 	LIST_HEAD(nat_list, cfg_nat) nat;       /* list of nat entries */
 	struct radix_node_head **tables;	/* IPv4 tables */
 	struct radix_node_head **xtables;	/* extended tables */
-	uint8_t		*tabletype;	/* Array of table types */
 #if defined( __linux__ ) || defined( _WIN32 )
 	spinlock_t rwmtx;
 #else
@@ -229,6 +233,7 @@
 	uint32_t	gencnt;		/* NAT generation count */
 	struct ip_fw	*reap;		/* list of rules to reap */
 	struct ip_fw	*default_rule;
+	struct tables_config *tblcfg;	/* tables module data */
 #if defined( __linux__ ) || defined( _WIN32 )
 	spinlock_t uh_lock;
 #else
@@ -295,32 +300,113 @@
 #define IPFW_UH_WLOCK(p) rw_wlock(&(p)->uh_lock)
 #define IPFW_UH_WUNLOCK(p) rw_wunlock(&(p)->uh_lock)
 
+struct tid_info {
+	uint32_t	set;	/* table set */
+	uint16_t	uidx;	/* table index */
+	uint8_t		type;	/* table type */
+	uint8_t		spare;
+	void		*tlvs;	/* Pointer to first TLV */
+	int		tlen;	/* Total TLV size block */
+};
+
+struct obj_idx {
+	uint16_t	uidx;	/* internal index supplied by userland */
+	uint16_t	kidx;	/* kernel object index */
+	uint16_t	off;	/* tlv offset from rule end in 4-byte words */
+	uint8_t		new;	/* index is newly-allocated */
+	uint8_t		type;	/* object type within its category */
+};
+
+struct rule_check_info {
+	uint16_t	table_opcodes;	/* count of opcodes referencing table */
+	uint16_t	new_tables;	/* count of opcodes referencing table */
+	uint32_t	tableset;	/* ipfw set id for table */
+	void		*tlvs;		/* Pointer to first TLV if any */
+	int		tlen;		/* Total TLV size block */
+	uint8_t		fw3;		/* opcode is new */
+	struct ip_fw	*krule;		/* resulting rule pointer */
+	struct obj_idx	obuf[8];	/* table references storage */
+};
+
+struct tentry_info {
+	void		*paddr;
+	int		plen;		/* Total entry length		*/
+	uint8_t		masklen;	/* mask length			*/
+	uint8_t		spare;
+	uint16_t	flags;		/* record flags			*/
+	uint32_t	value;		/* value			*/
+};
+
 /* In ip_fw_sockopt.c */
 int ipfw_find_rule(struct ip_fw_chain *chain, uint32_t key, uint32_t id);
-int ipfw_add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule);
 int ipfw_ctl(struct sockopt *sopt);
 int ipfw_chk(struct ip_fw_args *args);
 void ipfw_reap_rules(struct ip_fw *head);
 
+struct namedobj_instance;
+
+struct named_object {
+	TAILQ_ENTRY(named_object)	nn_next;	/* namehash */
+	TAILQ_ENTRY(named_object)	nv_next;	/* valuehash */
+	char			*name;	/* object name */
+	uint8_t			type;	/* object type */
+	uint8_t			compat;	/* Object name is number */
+	uint16_t		kidx;	/* object kernel index */
+	uint16_t		uidx;	/* userland idx for compat records */
+	uint32_t		set;	/* set object belongs to */
+	uint32_t		refcnt;	/* number of references */
+};
+TAILQ_HEAD(namedobjects_head, named_object);
+
+typedef void (objhash_cb_t)(struct namedobj_instance *ni, struct named_object *,
+    void *arg);
+struct namedobj_instance *ipfw_objhash_create(uint32_t items);
+void ipfw_objhash_destroy(struct namedobj_instance *);
+void ipfw_objhash_bitmap_alloc(uint32_t items, void **idx, int *pblocks);
+int ipfw_objhash_bitmap_merge(struct namedobj_instance *ni,
+    void **idx, int *blocks);
+void ipfw_objhash_bitmap_free(void *idx, int blocks);
+struct named_object *ipfw_objhash_lookup_name(struct namedobj_instance *ni,
+    uint32_t set, char *name);
+struct named_object *ipfw_objhash_lookup_idx(struct namedobj_instance *ni,
+    uint32_t set, uint16_t idx);
+void ipfw_objhash_add(struct namedobj_instance *ni, struct named_object *no);
+void ipfw_objhash_del(struct namedobj_instance *ni, struct named_object *no);
+void ipfw_objhash_foreach(struct namedobj_instance *ni, objhash_cb_t *f,
+    void *arg);
+int ipfw_objhash_free_idx(struct namedobj_instance *ni, uint32_t set,
+    uint16_t idx);
+int ipfw_objhash_alloc_idx(void *n, uint32_t set, uint16_t *pidx);
+
 /* In ip_fw_table.c */
 struct radix_node;
 int ipfw_lookup_table(struct ip_fw_chain *ch, uint16_t tbl, in_addr_t addr,
     uint32_t *val);
 int ipfw_lookup_table_extended(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
     uint32_t *val, int type);
 int ipfw_init_tables(struct ip_fw_chain *ch);
+int ipfw_destroy_table(struct ip_fw_chain *ch, struct tid_info *ti, int force);
 void ipfw_destroy_tables(struct ip_fw_chain *ch);
-int ipfw_flush_table(struct ip_fw_chain *ch, uint16_t tbl);
-int ipfw_add_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type, uint32_t value);
-int ipfw_del_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type);
-int ipfw_count_table(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt);
+int ipfw_flush_table(struct ip_fw_chain *ch, struct tid_info *ti);
+int ipfw_add_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei);
+int ipfw_del_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei);
+int ipfw_count_table(struct ip_fw_chain *ch, struct tid_info *ti,
+    uint32_t *cnt);
 int ipfw_dump_table_entry(struct radix_node *rn, void *arg);
-int ipfw_dump_table(struct ip_fw_chain *ch, ipfw_table *tbl);
-int ipfw_count_xtable(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt);
-int ipfw_dump_xtable(struct ip_fw_chain *ch, ipfw_xtable *tbl);
+int ipfw_dump_table(struct ip_fw_chain *ch, struct tid_info *ti,
+    ipfw_table *tbl);
+int ipfw_count_xtable(struct ip_fw_chain *ch, struct tid_info *ti,
+    uint32_t *cnt);
+int ipfw_dump_xtable(struct ip_fw_chain *ch, struct tid_info *ti,
+    ipfw_xtable *tbl);
 int ipfw_resize_tables(struct ip_fw_chain *ch, unsigned int ntables);
+int ipfw_rewrite_table_uidx(struct ip_fw_chain *chain,
+    struct rule_check_info *ci);
+int ipfw_rewrite_table_kidx(struct ip_fw_chain *chain, struct ip_fw *rule);
+void ipfw_unbind_table_rule(struct ip_fw_chain *chain, struct ip_fw *rule);
+void ipfw_unbind_table_list(struct ip_fw_chain *chain, struct ip_fw *head);
 
 /* In ip_fw_nat.c -- XXX to be moved to ip_var.h */
 
Index: sys/netpfil/ipfw/ip_fw_sockopt.c
===================================================================
--- sys/netpfil/ipfw/ip_fw_sockopt.c
+++ sys/netpfil/ipfw/ip_fw_sockopt.c
@@ -53,6 +53,7 @@
 #include <sys/socketvar.h>
 #include <sys/sysctl.h>
 #include <sys/syslog.h>
+#include <sys/fnv_hash.h>
 #include <net/if.h>
 #include <net/route.h>
 #include <net/vnet.h>
@@ -67,6 +68,25 @@
 #include <security/mac/mac_framework.h>
 #endif
 
+#define	NAMEDOBJ_HASH_SIZE	32
+
+struct namedobj_instance {
+	struct namedobjects_head	*names;
+	struct namedobjects_head	*values;
+	uint32_t nn_size;		/* names hash size */
+	uint32_t nv_size;		/* number hash size */
+	u_long *idx_mask;		/* used items bitmask */
+	uint32_t max_blocks;		/* number of "long" blocks in bitmask */
+	uint16_t free_off[IPFW_MAX_SETS];	/* first possible free offset */
+};
+#define	BLOCK_ITEMS	(8 * sizeof(u_long))	/* Number of items for ffsl() */
+
+static uint32_t objhash_hash_name(struct namedobj_instance *ni, uint32_t set,
+    char *name);
+static uint32_t objhash_hash_val(struct namedobj_instance *ni, uint32_t set,
+    uint32_t val);
+
+
 MALLOC_DEFINE(M_IPFW, "IpFw/IpAcct", "IpFw/IpAcct chain's");
 
 /*
@@ -152,8 +172,9 @@
  * XXX DO NOT USE FOR THE DEFAULT RULE.
  * Must be called without IPFW_UH held
  */
-int
-ipfw_add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule)
+static int
+add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule,
+    struct rule_check_info *ci)
 {
 	struct ip_fw *rule;
 	int i, l, insert_before;
@@ -164,19 +185,37 @@
 
 	l = RULESIZE(input_rule);
 	rule = malloc(l, M_IPFW, M_WAITOK | M_ZERO);
-	/* get_map returns with IPFW_UH_WLOCK if successful */
-	map = get_map(chain, 1, 0 /* not locked */);
-	if (map == NULL) {
-		free(rule, M_IPFW);
-		return ENOSPC;
-	}
-
 	bcopy(input_rule, rule, l);
 	/* clear fields not settable from userland */
 	rule->x_next = NULL;
 	rule->next_rule = NULL;
 	IPFW_ZERO_RULE_COUNTER(rule);
 
+	/* Check if we need to do table remap */
+	if (ci->table_opcodes > 0) {
+		ci->krule = rule;
+		i = ipfw_rewrite_table_uidx(chain, ci);
+		if (i != 0) {
+			/* rewrite failed, return error */
+			free(rule, M_IPFW);
+			return (i);
+		}
+	}
+
+	/* get_map returns with IPFW_UH_WLOCK if successful */
+	map = get_map(chain, 1, 0 /* not locked */);
+	if (map == NULL) {
+		if (ci->table_opcodes > 0) {
+			/* We need to unbind tables */
+			IPFW_UH_WLOCK(chain);
+			ipfw_unbind_table_rule(chain, rule);
+			IPFW_UH_WUNLOCK(chain);
+		}
+
+		free(rule, M_IPFW);
+		return (ENOSPC);
+	}
+
 	if (V_autoinc_step < 1)
 		V_autoinc_step = 1;
 	else if (V_autoinc_step > 1000)
@@ -421,6 +460,7 @@
 
 	rule = chain->reap;
 	chain->reap = NULL;
+	ipfw_unbind_table_list(chain, rule);
 	IPFW_UH_WUNLOCK(chain);
 	ipfw_reap_rules(rule);
 	if (map)
@@ -517,7 +557,7 @@
  * Rules are simple, so this mostly need to check rule sizes.
  */
 static int
-check_ipfw_struct(struct ip_fw *rule, int size)
+check_ipfw_struct(struct ip_fw *rule, int size, struct rule_check_info *ci)
 {
 	int l, cmdlen = 0;
 	int have_action=0;
@@ -662,6 +702,7 @@
 			    cmdlen != F_INSN_SIZE(ipfw_insn_u32) + 1 &&
 			    cmdlen != F_INSN_SIZE(ipfw_insn_u32))
 				goto bad_size;
+			ci->table_opcodes++;
 			break;
 		case O_MACADDR2:
 			if (cmdlen != F_INSN_SIZE(ipfw_insn_mac))
@@ -694,6 +735,8 @@
 		case O_RECV:
 		case O_XMIT:
 		case O_VIA:
+			if (((ipfw_insn_if *)cmd)->name[0] == '\1')
+				ci->table_opcodes++;
 			if (cmdlen != F_INSN_SIZE(ipfw_insn_if))
 				goto bad_size;
 			break;
@@ -879,7 +922,7 @@
 	char *bp = buf;
 	char *ep = bp + space;
 	struct ip_fw *rule, *dst;
-	int l, i;
+	int error, i, l;
 	time_t	boot_seconds;
 
         boot_seconds = boottime.tv_sec;
@@ -890,8 +933,11 @@
 		    /* Convert rule to FreeBSd 7.2 format */
 		    l = RULESIZE7(rule);
 		    if (bp + l + sizeof(uint32_t) <= ep) {
-			int error;
 			bcopy(rule, bp, l + sizeof(uint32_t));
+			error = ipfw_rewrite_table_kidx(chain,
+			    (struct ip_fw *)bp);
+			if (error != 0)
+				return (0);
 			error = convert_rule_to_7((struct ip_fw *) bp);
 			if (error)
 				return 0; /*XXX correct? */
@@ -918,6 +964,13 @@
 		}
 		dst = (struct ip_fw *)bp;
 		bcopy(rule, dst, l);
+		error = ipfw_rewrite_table_kidx(chain, dst);
+		if (error != 0) {
+			printf("Stop on rule %d. Fail to convert table\n",
+			    rule->rulenum);
+			break;
+		}
+
 		/*
 		 * XXX HACK. Store the disable mask in the "next"
 		 * pointer in a wild attempt to keep the ABI the same.
@@ -949,6 +1002,7 @@
 	uint32_t opt;
 	char xbuf[128];
 	ip_fw3_opheader *op3 = NULL;
+	struct rule_check_info ci;
 
 	error = priv_check(sopt->sopt_td, PRIV_NETINET_IPFW);
 	if (error)
@@ -1027,6 +1081,8 @@
 		error = sooptcopyin(sopt, rule, RULE_MAXSIZE,
 			sizeof(struct ip_fw7) );
 
+		memset(&ci, 0, sizeof(struct rule_check_info));
+
 		/*
 		 * If the size of commands equals RULESIZE7 then we assume
 		 * a FreeBSD7.2 binary is talking to us (set is7=1).
@@ -1044,15 +1100,15 @@
 			return error;
 		    }
 		    if (error == 0)
-			error = check_ipfw_struct(rule, RULESIZE(rule));
+			error = check_ipfw_struct(rule, RULESIZE(rule), &ci);
 		} else {
 		    is7 = 0;
 		if (error == 0)
-			error = check_ipfw_struct(rule, sopt->sopt_valsize);
+			error = check_ipfw_struct(rule, sopt->sopt_valsize,&ci);
 		}
 		if (error == 0) {
-			/* locking is done within ipfw_add_rule() */
-			error = ipfw_add_rule(chain, rule);
+			/* locking is done within add_rule() */
+			error = add_rule(chain, rule, &ci);
 			size = RULESIZE(rule);
 			if (!error && sopt->sopt_dir == SOPT_GET) {
 				if (is7) {
@@ -1114,37 +1170,67 @@
 		break;
 
 	/*--- TABLE manipulations are protected by the IPFW_LOCK ---*/
-	case IP_FW_TABLE_ADD:
+	case IP_FW_OBJ_DEL: /* IP_FW3 */
 		{
-			ipfw_table_entry ent;
+			struct _ipfw_obj_header *oh;
+			struct tid_info ti;
 
-			error = sooptcopyin(sopt, &ent,
-			    sizeof(ent), sizeof(ent));
-			if (error)
+			if (sopt->sopt_valsize < sizeof(*oh)) {
+				error = EINVAL;
 				break;
-			error = ipfw_add_table_entry(chain, ent.tbl,
-			    &ent.addr, sizeof(ent.addr), ent.masklen, 
-			    IPFW_TABLE_CIDR, ent.value);
-		}
-		break;
+			}
+
+			oh = (struct _ipfw_obj_header *)(op3 + 1);
 
+			switch (oh->objtype) {
+			case IPFW_OBJTYPE_TABLE:
+				memset(&ti, 0, sizeof(ti));
+				ti.set = oh->set;
+				ti.uidx = oh->idx;
+				ti.tlvs = (oh + 1);
+				ti.tlen = sopt->sopt_valsize - sizeof(*oh);
+				error = ipfw_destroy_table(chain, &ti, 0);
+				break;
+			default:
+				error = ENOTSUP;
+				break;
+			}
+			break;
+		}
+	case IP_FW_TABLE_ADD:
 	case IP_FW_TABLE_DEL:
 		{
 			ipfw_table_entry ent;
+			struct tentry_info tei;
+			struct tid_info ti;
 
 			error = sooptcopyin(sopt, &ent,
 			    sizeof(ent), sizeof(ent));
 			if (error)
 				break;
-			error = ipfw_del_table_entry(chain, ent.tbl,
-			    &ent.addr, sizeof(ent.addr), ent.masklen, IPFW_TABLE_CIDR);
+
+			memset(&tei, 0, sizeof(tei));
+			tei.paddr = &ent.addr;
+			tei.plen = sizeof(ent.addr);
+			tei.masklen = ent.masklen;
+			tei.value = ent.value;
+			memset(&ti, 0, sizeof(ti));
+			ti.set = RESVD_SET;
+			ti.uidx = ent.tbl;
+			ti.type = IPFW_TABLE_CIDR;
+
+			error = (opt == IP_FW_TABLE_ADD) ?
+			    ipfw_add_table_entry(chain, &ti, &tei) :
+			    ipfw_del_table_entry(chain, &ti, &tei);
 		}
 		break;
 
 	case IP_FW_TABLE_XADD: /* IP_FW3 */
 	case IP_FW_TABLE_XDEL: /* IP_FW3 */
 		{
 			ipfw_table_xentry *xent = (ipfw_table_xentry *)(op3 + 1);
+			struct tentry_info tei;
+			struct tid_info ti;
 
 			/* Check minimum header size */
 			if (IP_FW3_OPLENGTH(sopt) < offsetof(ipfw_table_xentry, k)) {
@@ -1160,35 +1246,51 @@
 			
 			len = xent->len - offsetof(ipfw_table_xentry, k);
 
+			memset(&tei, 0, sizeof(tei));
+			tei.paddr = &xent->k;
+			tei.plen = len;
+			tei.masklen = xent->masklen;
+			tei.value = xent->value;
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0;	/* XXX: No way to specify set  */
+			ti.uidx = xent->tbl;
+			ti.type = xent->type;
+
 			error = (opt == IP_FW_TABLE_XADD) ?
-				ipfw_add_table_entry(chain, xent->tbl, &xent->k, 
-					len, xent->masklen, xent->type, xent->value) :
-				ipfw_del_table_entry(chain, xent->tbl, &xent->k,
-					len, xent->masklen, xent->type);
+			    ipfw_add_table_entry(chain, &ti, &tei) :
+			    ipfw_del_table_entry(chain, &ti, &tei);
 		}
 		break;
 
 	case IP_FW_TABLE_FLUSH:
 		{
 			u_int16_t tbl;
+			struct tid_info ti;
 
 			error = sooptcopyin(sopt, &tbl,
 			    sizeof(tbl), sizeof(tbl));
 			if (error)
 				break;
-			error = ipfw_flush_table(chain, tbl);
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl;
+			error = ipfw_flush_table(chain, &ti);
 		}
 		break;
 
 	case IP_FW_TABLE_GETSIZE:
 		{
 			u_int32_t tbl, cnt;
+			struct tid_info ti;
 
 			if ((error = sooptcopyin(sopt, &tbl, sizeof(tbl),
 			    sizeof(tbl))))
 				break;
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_count_table(chain, tbl, &cnt);
+			error = ipfw_count_table(chain, &ti, &cnt);
 			IPFW_RUNLOCK(chain);
 			if (error)
 				break;
@@ -1199,6 +1301,7 @@
 	case IP_FW_TABLE_LIST:
 		{
 			ipfw_table *tbl;
+			struct tid_info ti;
 
 			if (sopt->sopt_valsize < sizeof(*tbl)) {
 				error = EINVAL;
@@ -1213,8 +1316,11 @@
 			}
 			tbl->size = (size - sizeof(*tbl)) /
 			    sizeof(ipfw_table_entry);
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl->tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_dump_table(chain, tbl);
+			error = ipfw_dump_table(chain, &ti, tbl);
 			IPFW_RUNLOCK(chain);
 			if (error) {
 				free(tbl, M_TEMP);
@@ -1228,16 +1334,20 @@
 	case IP_FW_TABLE_XGETSIZE: /* IP_FW3 */
 		{
 			uint32_t *tbl;
+			struct tid_info ti;
 
 			if (IP_FW3_OPLENGTH(sopt) < sizeof(uint32_t)) {
 				error = EINVAL;
 				break;
 			}
 
 			tbl = (uint32_t *)(op3 + 1);
 
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = *tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_count_xtable(chain, *tbl, tbl);
+			error = ipfw_count_xtable(chain, &ti, tbl);
 			IPFW_RUNLOCK(chain);
 			if (error)
 				break;
@@ -1248,6 +1358,7 @@
 	case IP_FW_TABLE_XLIST: /* IP_FW3 */
 		{
 			ipfw_xtable *tbl;
+			struct tid_info ti;
 
 			if ((size = valsize) < sizeof(ipfw_xtable)) {
 				error = EINVAL;
@@ -1260,8 +1371,11 @@
 			/* Get maximum number of entries we can store */
 			tbl->size = (size - sizeof(ipfw_xtable)) /
 			    sizeof(ipfw_table_xentry);
+			memset(&ti, 0, sizeof(ti));
+			ti.set = 0; /* XXX: No way to specify set */
+			ti.uidx = tbl->tbl;
 			IPFW_RLOCK(chain);
-			error = ipfw_dump_xtable(chain, tbl);
+			error = ipfw_dump_xtable(chain, &ti, tbl);
 			IPFW_RUNLOCK(chain);
 			if (error) {
 				free(tbl, M_TEMP);
@@ -1444,4 +1558,271 @@
 	return 0;
 }
 
+/*
+ * Named object api
+ *
+ */
+
+void
+ipfw_objhash_bitmap_alloc(uint32_t items, void **idx, int *pblocks)
+{
+	size_t size;
+	int max_blocks;
+	void *idx_mask;
+
+	items = roundup2(items, BLOCK_ITEMS);	/* Align to block size */
+	max_blocks = items / BLOCK_ITEMS;
+	size = items / 8;
+	idx_mask = malloc(size * IPFW_MAX_SETS, M_IPFW, M_WAITOK);
+	/* Mark all as free */
+	memset(idx_mask, 0xFF, size * IPFW_MAX_SETS);
+
+	*idx = idx_mask;
+	*pblocks = max_blocks;
+}
+
+int
+ipfw_objhash_bitmap_merge(struct namedobj_instance *ni, void **idx, int *blocks)
+{
+	int old_blocks, new_blocks;
+	u_long *old_idx, *new_idx;
+	int i;
+
+	old_idx = ni->idx_mask;
+	old_blocks = ni->max_blocks;
+	new_idx = *idx;
+	new_blocks = *blocks;
+
+	/*
+	 * FIXME: Permit reducing total amount of tables
+	 */
+	if (old_blocks > new_blocks)
+		return (1);
+
+	for (i = 0; i < IPFW_MAX_SETS; i++) {
+		memcpy(&new_idx[new_blocks * i], &old_idx[old_blocks * i],
+		    old_blocks * sizeof(u_long));
+	}
+
+	ni->idx_mask = new_idx;
+	ni->max_blocks = new_blocks;
+
+	/* Save old values */
+	*idx = old_idx;
+	*blocks = old_blocks;
+
+	return (0);
+}
+
+void
+ipfw_objhash_bitmap_free(void *idx, int blocks)
+{
+
+	free(idx, M_IPFW);
+}
+
+/*
+ * Creates named hash instance.
+ * Must be called without holding any locks.
+ * Return pointer to new instance.
+ */
+struct namedobj_instance *
+ipfw_objhash_create(uint32_t items)
+{
+	struct namedobj_instance *ni;
+	int i;
+	size_t size;
+
+	size = sizeof(struct namedobj_instance) +
+	    sizeof(struct namedobjects_head) * NAMEDOBJ_HASH_SIZE +
+	    sizeof(struct namedobjects_head) * NAMEDOBJ_HASH_SIZE;
+
+	ni = malloc(size, M_IPFW, M_WAITOK | M_ZERO);
+	ni->nn_size = NAMEDOBJ_HASH_SIZE;
+	ni->nv_size = NAMEDOBJ_HASH_SIZE;
+
+	ni->names = (struct namedobjects_head *)(ni +1);
+	ni->values = &ni->names[ni->nn_size];
+
+	for (i = 0; i < ni->nn_size; i++)
+		TAILQ_INIT(&ni->names[i]);
+
+	for (i = 0; i < ni->nv_size; i++)
+		TAILQ_INIT(&ni->values[i]);
+
+	/* Allocate bitmask separately due to possible resize */
+	ipfw_objhash_bitmap_alloc(items, (void*)&ni->idx_mask, &ni->max_blocks);
+
+	return (ni);
+}
+
+void
+ipfw_objhash_destroy(struct namedobj_instance *ni)
+{
+
+	free(ni->idx_mask, M_IPFW);
+	free(ni, M_IPFW);
+}
+
+static uint32_t
+objhash_hash_name(struct namedobj_instance *ni, uint32_t set, char *name)
+{
+	uint32_t v;
+
+	v = fnv_32_str(name, FNV1_32_INIT);
+
+	return (v % ni->nn_size);
+}
+
+static uint32_t
+objhash_hash_val(struct namedobj_instance *ni, uint32_t set, uint32_t val)
+{
+	uint32_t v;
+
+	v = val % (ni->nv_size - 1);
+
+	return (v);
+}
+
+struct named_object *
+ipfw_objhash_lookup_name(struct namedobj_instance *ni, uint32_t set, char *name)
+{
+	struct named_object *no;
+	uint32_t hash;
+
+	hash = objhash_hash_name(ni, set, name);
+	
+	TAILQ_FOREACH(no, &ni->names[hash], nn_next) {
+		if ((strcmp(no->name, name) == 0) && (no->set == set))
+			return (no);
+	}
+
+	return (NULL);
+}
+
+struct named_object *
+ipfw_objhash_lookup_idx(struct namedobj_instance *ni, uint32_t set,
+    uint16_t idx)
+{
+	struct named_object *no;
+	uint32_t hash;
+
+	hash = objhash_hash_val(ni, set, idx);
+	
+	TAILQ_FOREACH(no, &ni->values[hash], nv_next) {
+		if ((no->kidx == idx) && (no->set == set))
+			return (no);
+	}
+
+	return (NULL);
+}
+
+void
+ipfw_objhash_add(struct namedobj_instance *ni, struct named_object *no)
+{
+	uint32_t hash;
+
+	hash = objhash_hash_name(ni, no->set, no->name);
+	TAILQ_INSERT_HEAD(&ni->names[hash], no, nn_next);
+
+	hash = objhash_hash_val(ni, no->set, no->kidx);
+	TAILQ_INSERT_HEAD(&ni->values[hash], no, nv_next);
+}
+
+void
+ipfw_objhash_del(struct namedobj_instance *ni, struct named_object *no)
+{
+	uint32_t hash;
+
+	hash = objhash_hash_name(ni, no->set, no->name);
+	TAILQ_REMOVE(&ni->names[hash], no, nn_next);
+
+	hash = objhash_hash_val(ni, no->set, no->kidx);
+	TAILQ_REMOVE(&ni->values[hash], no, nv_next);
+}
+
+/*
+ * Runs @f for each found named object.
+ * It is safe to delete objects from the callback.
+ */
+void
+ipfw_objhash_foreach(struct namedobj_instance *ni, objhash_cb_t *f, void *arg)
+{
+	struct named_object *no, *no_tmp;
+	int i;
+
+	for (i = 0; i < ni->nn_size; i++) {
+		TAILQ_FOREACH_SAFE(no, &ni->names[i], nn_next, no_tmp)
+			f(ni, no, arg);
+	}
+}
+
+/*
+ * Removes index from given set.
+ * Returns 0 on success.
+ */
+int
+ipfw_objhash_free_idx(struct namedobj_instance *ni, uint32_t set, uint16_t idx)
+{
+	u_long *mask;
+	int i, v;
+
+	i = idx / BLOCK_ITEMS;
+	v = idx % BLOCK_ITEMS;
+
+	if ((i >= ni->max_blocks) || set >= IPFW_MAX_SETS)
+		return (1);
+
+	mask = &ni->idx_mask[set * ni->max_blocks + i];
+
+	if ((*mask & ((u_long)1 << v)) != 0)
+		return (1);
+
+	/* Mark as free */
+	*mask |= (u_long)1 << v;
+
+	/* Update free offset */
+	if (ni->free_off[set] > i)
+		ni->free_off[set] = i;
+	
+	return (0);
+}
+
+/*
+ * Allocates new index in given set and stores it in @pidx.
+ * Returns 0 on success.
+ */
+int
+ipfw_objhash_alloc_idx(void *n, uint32_t set, uint16_t *pidx)
+{
+	struct namedobj_instance *ni;
+	u_long *mask;
+	int i, off, v;
+
+	if (set >= IPFW_MAX_SETS)
+		return (-1);
+
+	ni = (struct namedobj_instance *)n;
+
+	off = ni->free_off[set];
+	mask = &ni->idx_mask[set * ni->max_blocks + off];
+
+	for (i = off; i < ni->max_blocks; i++, mask++) {
+		if ((v = ffsl(*mask)) == 0)
+			continue;
+
+		/* Mark as busy */
+		*mask &= ~ ((u_long)1 << (v - 1));
+
+		ni->free_off[set] = i;
+		
+		v = BLOCK_ITEMS * i + v - 1;
+
+		*pidx = v;
+		return (0);
+	}
+
+	return (1);
+}
+
 /* end of file */
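
For readers who want to try the index bitmap outside the kernel, here is a
minimal userland sketch of the allocation scheme used by
ipfw_objhash_alloc_idx()/ipfw_objhash_free_idx() above. It is not part of
the patch. All sk_* names are hypothetical, a single set is assumed, every
bit is assumed to start out free (the real bitmap setup lives in
ipfw_objhash_bitmap_alloc(), which may differ), and sk_ffsl() is a portable
stand-in for ffsl(3).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical sizes for the sketch; the patch sizes the bitmap at runtime. */
#define	SK_BLOCK_ITEMS	(CHAR_BIT * sizeof(unsigned long))
#define	SK_BLOCKS	4

struct sk_bitmap {
	unsigned long	mask[SK_BLOCKS];	/* bit set == index is free */
	int		free_off;		/* first block worth scanning */
};

/* Portable stand-in for ffsl(3): 1-based position of lowest set bit, or 0. */
static int
sk_ffsl(unsigned long w)
{
	int bit;

	if (w == 0)
		return (0);
	for (bit = 1; (w & 1UL) == 0; bit++)
		w >>= 1;
	return (bit);
}

/* Assumption: a fresh bitmap has every index free. */
static void
sk_init(struct sk_bitmap *bm)
{
	int i;

	for (i = 0; i < SK_BLOCKS; i++)
		bm->mask[i] = ~0UL;
	bm->free_off = 0;
}

/* Takes the lowest free index; returns 0 on success, 1 if the map is full. */
static int
sk_alloc_idx(struct sk_bitmap *bm, uint16_t *pidx)
{
	int i, v;

	for (i = bm->free_off; i < SK_BLOCKS; i++) {
		if ((v = sk_ffsl(bm->mask[i])) == 0)
			continue;
		/* Mark as busy */
		bm->mask[i] &= ~(1UL << (v - 1));
		bm->free_off = i;
		*pidx = (uint16_t)(SK_BLOCK_ITEMS * i + v - 1);
		return (0);
	}
	return (1);
}

/* Releases @idx; returns 1 if it is out of range or already free. */
static int
sk_free_idx(struct sk_bitmap *bm, uint16_t idx)
{
	int i, v;

	i = (int)(idx / SK_BLOCK_ITEMS);
	v = (int)(idx % SK_BLOCK_ITEMS);
	if (i >= SK_BLOCKS)
		return (1);
	if ((bm->mask[i] & (1UL << v)) != 0)
		return (1);
	/* Mark as free and pull the scan hint back if needed */
	bm->mask[i] |= 1UL << v;
	if (bm->free_off > i)
		bm->free_off = i;
	return (0);
}
```

In this sketch, after sk_init(), two sk_alloc_idx() calls hand out indices 0
and 1; freeing 0 and allocating again returns 0, because the freed bit is
once more the lowest set bit in its block. A second sk_free_idx() on the
same index is rejected, which mirrors the "already free" check in
ipfw_objhash_free_idx().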
Index: sys/netpfil/ipfw/ip_fw_table.c
===================================================================
--- sys/netpfil/ipfw/ip_fw_table.c
+++ sys/netpfil/ipfw/ip_fw_table.c
@@ -100,6 +100,49 @@
 	u_int32_t		value;
 };
 
+/*
+ * Table has the following `type` concepts:
+ *
+ * `type` represents lookup key type (cidr, ifp, uid, etc..)
+ * `ftype` is pure userland field helping to properly format table data
+ * `atype` represents exact lookup algorithm for given tabletype.
+ *     For example, we can use more efficient search schemes if we plan
+ *     to use some specific table for storing host-routes only.
+ *
+ */
+struct table_config {
+	struct named_object	no;
+	uint8_t		ftype;		/* format table type */
+	uint8_t		atype;		/* algorithm type */
+	uint8_t		linked;		/* 1 if already linked */
+	uint8_t		spare0;
+	uint32_t	count;		/* Number of records */
+	char		tablename[64];	/* table name */
+	void		*state;		/* Store some state if needed */
+	void		*xstate;
+};
+#define	TABLE_SET(set)	((V_fw_tables_sets != 0) ? (set) : 0)
+
+struct tables_config {
+	struct namedobj_instance	*namehash;
+};
+
+static struct table_config *find_table(struct namedobj_instance *ni,
+    struct tid_info *ti);
+static struct table_config *alloc_table_config(struct namedobj_instance *ni,
+    struct tid_info *ti);
+static void free_table_config(struct namedobj_instance *ni,
+    struct table_config *tc);
+static void link_table(struct ip_fw_chain *chain, struct table_config *tc);
+static void unlink_table(struct ip_fw_chain *chain, struct table_config *tc);
+static int alloc_table_state(void **state, void **xstate, uint8_t type);
+static void free_table_state(void **state, void **xstate, uint8_t type);
+
+
+#define	CHAIN_TO_TCFG(chain)	((struct tables_config *)(chain)->tblcfg)
+#define	CHAIN_TO_NI(chain)	(CHAIN_TO_TCFG(chain)->namehash)
+
+
 /*
  * The radix code expects addr and mask to be array of bytes,
  * with the first byte being the length of the array. rn_inithead
@@ -136,62 +179,68 @@
 #endif
 
 int
-ipfw_add_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type, uint32_t value)
+ipfw_add_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei)
 {
-	struct radix_node_head *rnh, **rnh_ptr;
+	struct radix_node_head *rnh;
 	struct table_entry *ent;
 	struct table_xentry *xent;
 	struct radix_node *rn;
 	in_addr_t addr;
 	int offset;
 	void *ent_ptr;
 	struct sockaddr *addr_ptr, *mask_ptr;
+	struct table_config *tc, *tc_new;
+	struct namedobj_instance *ni;
 	char c;
+	uint8_t mlen;
+	uint16_t kidx;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 
-	switch (type) {
+	mlen = tei->masklen;
+
+	switch (ti->type) {
 	case IPFW_TABLE_CIDR:
-		if (plen == sizeof(in_addr_t)) {
+		if (tei->plen == sizeof(in_addr_t)) {
 #ifdef INET
 			/* IPv4 case */
 			if (mlen > 32)
 				return (EINVAL);
 			ent = malloc(sizeof(*ent), M_IPFW_TBL, M_WAITOK | M_ZERO);
-			ent->value = value;
+			ent->value = tei->value;
 			/* Set 'total' structure length */
 			KEY_LEN(ent->addr) = KEY_LEN_INET;
 			KEY_LEN(ent->mask) = KEY_LEN_INET;
 			/* Set offset of IPv4 address in bits */
 			offset = OFF_LEN_INET;
-			ent->mask.sin_addr.s_addr = htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
-			addr = *((in_addr_t *)paddr);
+			ent->mask.sin_addr.s_addr =
+			    htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
+			addr = *((in_addr_t *)tei->paddr);
 			ent->addr.sin_addr.s_addr = addr & ent->mask.sin_addr.s_addr;
 			/* Set pointers */
-			rnh_ptr = &ch->tables[tbl];
 			ent_ptr = ent;
 			addr_ptr = (struct sockaddr *)&ent->addr;
 			mask_ptr = (struct sockaddr *)&ent->mask;
 #endif
 #ifdef INET6
-		} else if (plen == sizeof(struct in6_addr)) {
+		} else if (tei->plen == sizeof(struct in6_addr)) {
 			/* IPv6 case */
 			if (mlen > 128)
 				return (EINVAL);
 			xent = malloc(sizeof(*xent), M_IPFW_TBL, M_WAITOK | M_ZERO);
-			xent->value = value;
+			xent->value = tei->value;
 			/* Set 'total' structure length */
 			KEY_LEN(xent->a.addr6) = KEY_LEN_INET6;
 			KEY_LEN(xent->m.mask6) = KEY_LEN_INET6;
 			/* Set offset of IPv6 address in bits */
 			offset = OFF_LEN_INET6;
 			ipv6_writemask(&xent->m.mask6.sin6_addr, mlen);
-			memcpy(&xent->a.addr6.sin6_addr, paddr, sizeof(struct in6_addr));
+			memcpy(&xent->a.addr6.sin6_addr, tei->paddr,
+			    sizeof(struct in6_addr));
 			APPLY_MASK(&xent->a.addr6.sin6_addr, &xent->m.mask6.sin6_addr);
 			/* Set pointers */
-			rnh_ptr = &ch->xtables[tbl];
 			ent_ptr = xent;
 			addr_ptr = (struct sockaddr *)&xent->a.addr6;
 			mask_ptr = (struct sockaddr *)&xent->m.mask6;
@@ -204,30 +253,30 @@
 	
 	case IPFW_TABLE_INTERFACE:
 		/* Check if string is terminated */
-		c = ((char *)paddr)[IF_NAMESIZE - 1];
-		((char *)paddr)[IF_NAMESIZE - 1] = '\0';
-		if (((mlen = strlen((char *)paddr)) == IF_NAMESIZE - 1) && (c != '\0'))
+		c = ((char *)tei->paddr)[IF_NAMESIZE - 1];
+		((char *)tei->paddr)[IF_NAMESIZE - 1] = '\0';
+		mlen = strlen((char *)tei->paddr);
+		if ((mlen == IF_NAMESIZE - 1) && (c != '\0'))
 			return (EINVAL);
 
 		/* Include last \0 into comparison */
 		mlen++;
 
 		xent = malloc(sizeof(*xent), M_IPFW_TBL, M_WAITOK | M_ZERO);
-		xent->value = value;
+		xent->value = tei->value;
 		/* Set 'total' structure length */
 		KEY_LEN(xent->a.iface) = KEY_LEN_IFACE + mlen;
 		KEY_LEN(xent->m.ifmask) = KEY_LEN_IFACE + mlen;
 		/* Set offset of interface name in bits */
 		offset = OFF_LEN_IFACE;
-		memcpy(xent->a.iface.ifname, paddr, mlen);
+		memcpy(xent->a.iface.ifname, tei->paddr, mlen);
 		/* Assume direct match */
 		/* TODO: Add interface pattern matching */
 #if 0
 		memset(xent->m.ifmask.ifname, 0xFF, IF_NAMESIZE);
 		mask_ptr = (struct sockaddr *)&xent->m.ifmask;
 #endif
 		/* Set pointers */
-		rnh_ptr = &ch->xtables[tbl];
 		ent_ptr = xent;
 		addr_ptr = (struct sockaddr *)&xent->a.iface;
 		mask_ptr = NULL;
@@ -237,84 +286,128 @@
 		return (EINVAL);
 	}
 
-	IPFW_WLOCK(ch);
+	IPFW_UH_WLOCK(ch);
 
-	/* Check if tabletype is valid */
-	if ((ch->tabletype[tbl] != 0) && (ch->tabletype[tbl] != type)) {
-		IPFW_WUNLOCK(ch);
-		free(ent_ptr, M_IPFW_TBL);
-		return (EINVAL);
-	}
+	ni = CHAIN_TO_NI(ch);
 
-	/* Check if radix tree exists */
-	if ((rnh = *rnh_ptr) == NULL) {
-		IPFW_WUNLOCK(ch);
-		/* Create radix for a new table */
-		if (!rn_inithead((void **)&rnh, offset)) {
-			free(ent_ptr, M_IPFW_TBL);
+	tc_new = NULL;
+	if ((tc = find_table(ni, ti)) == NULL) {
+		/* Not found. We have to create new one */
+		IPFW_UH_WUNLOCK(ch);
+
+		tc_new = alloc_table_config(ni, ti);
+		if (tc_new == NULL)
 			return (ENOMEM);
-		}
 
-		IPFW_WLOCK(ch);
-		if (*rnh_ptr != NULL) {
-			/* Tree is already attached by other thread */
-			rn_detachhead((void **)&rnh);
-			rnh = *rnh_ptr;
-			/* Check table type another time */
-			if (ch->tabletype[tbl] != type) {
-				IPFW_WUNLOCK(ch);
-				free(ent_ptr, M_IPFW_TBL);
+		IPFW_UH_WLOCK(ch);
+
+		/* Check if table was already allocated by another thread */
+		if ((tc = find_table(ni, ti)) != NULL) {
+			if (tc->no.type != ti->type) {
+				IPFW_UH_WUNLOCK(ch);
+				free_table_config(ni, tc_new);
 				return (EINVAL);
 			}
 		} else {
-			*rnh_ptr = rnh;
-			/* 
-			 * Set table type. It can be set already
-			 * (if we have IPv6-only table) but setting
-			 * it another time does not hurt
+			/*
+			 * New table.
+			 * Set tc_new to NULL so it is not freed afterwards.
 			 */
-			ch->tabletype[tbl] = type;
+			tc = tc_new;
+			tc_new = NULL;
+
+			/* Allocate table index. */
+			if (ipfw_objhash_alloc_idx(ni, ti->set, &kidx) != 0) {
+				/* Index full. */
+				IPFW_UH_WUNLOCK(ch);
+				printf("Unable to allocate index for table %s."
+				    " Consider increasing "
+				    "net.inet.ip.fw.tables_max\n",
+				    tc->no.name);
+				free_table_config(ni, tc);
+				return (EBUSY);
+			}
+			/* Save kidx */
+			tc->no.kidx = kidx;
 		}
+	} else {
+		/* We still have to check table type */
+		if (tc->no.type != ti->type) {
+			IPFW_UH_WUNLOCK(ch);
+			return (EINVAL);
+		}
+	}
+	kidx = tc->no.kidx;
+
+	/* We've got valid table in @tc. Let's add data */
+	IPFW_WLOCK(ch);
+
+	if (tc->linked == 0) {
+		link_table(ch, tc);
+	}
+
+	/* XXX: Temporary until splitting add/del to per-type functions */
+	rnh = NULL;
+	switch (ti->type) {
+	case IPFW_TABLE_CIDR:
+		if (tei->plen == sizeof(in_addr_t))
+			rnh = ch->tables[kidx];
+		else
+			rnh = ch->xtables[kidx];
+		break;
+	case IPFW_TABLE_INTERFACE:
+		rnh = ch->xtables[kidx];
+		break;
 	}
 
 	rn = rnh->rnh_addaddr(addr_ptr, mask_ptr, rnh, ent_ptr);
 	IPFW_WUNLOCK(ch);
+	IPFW_UH_WUNLOCK(ch);
+
+	if (tc_new != NULL)
+		free_table_config(ni, tc_new);
 
 	if (rn == NULL) {
 		free(ent_ptr, M_IPFW_TBL);
 		return (EEXIST);
 	}
+
 	return (0);
 }
 
 int
-ipfw_del_table_entry(struct ip_fw_chain *ch, uint16_t tbl, void *paddr,
-    uint8_t plen, uint8_t mlen, uint8_t type)
+ipfw_del_table_entry(struct ip_fw_chain *ch, struct tid_info *ti,
+    struct tentry_info *tei)
 {
-	struct radix_node_head *rnh, **rnh_ptr;
+	struct radix_node_head *rnh;
 	struct table_entry *ent;
 	in_addr_t addr;
 	struct sockaddr_in sa, mask;
 	struct sockaddr *sa_ptr, *mask_ptr;
+	struct table_config *tc;
+	struct namedobj_instance *ni;
 	char c;
+	uint8_t mlen;
+	uint16_t kidx;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 
-	switch (type) {
+	mlen = tei->masklen;
+
+	switch (ti->type) {
 	case IPFW_TABLE_CIDR:
-		if (plen == sizeof(in_addr_t)) {
+		if (tei->plen == sizeof(in_addr_t)) {
 			/* Set 'total' structure length */
 			KEY_LEN(sa) = KEY_LEN_INET;
 			KEY_LEN(mask) = KEY_LEN_INET;
 			mask.sin_addr.s_addr = htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
-			addr = *((in_addr_t *)paddr);
+			addr = *((in_addr_t *)tei->paddr);
 			sa.sin_addr.s_addr = addr & mask.sin_addr.s_addr;
-			rnh_ptr = &ch->tables[tbl];
 			sa_ptr = (struct sockaddr *)&sa;
 			mask_ptr = (struct sockaddr *)&mask;
 #ifdef INET6
-		} else if (plen == sizeof(struct in6_addr)) {
+		} else if (tei->plen == sizeof(struct in6_addr)) {
 			/* IPv6 case */
 			if (mlen > 128)
 				return (EINVAL);
@@ -325,9 +418,9 @@
 			KEY_LEN(sa6) = KEY_LEN_INET6;
 			KEY_LEN(mask6) = KEY_LEN_INET6;
 			ipv6_writemask(&mask6.sin6_addr, mlen);
-			memcpy(&sa6.sin6_addr, paddr, sizeof(struct in6_addr));
+			memcpy(&sa6.sin6_addr, tei->paddr,
+			    sizeof(struct in6_addr));
 			APPLY_MASK(&sa6.sin6_addr, &mask6.sin6_addr);
-			rnh_ptr = &ch->xtables[tbl];
 			sa_ptr = (struct sockaddr *)&sa6;
 			mask_ptr = (struct sockaddr *)&mask6;
 #endif
@@ -339,9 +432,10 @@
 
 	case IPFW_TABLE_INTERFACE:
 		/* Check if string is terminated */
-		c = ((char *)paddr)[IF_NAMESIZE - 1];
-		((char *)paddr)[IF_NAMESIZE - 1] = '\0';
-		if (((mlen = strlen((char *)paddr)) == IF_NAMESIZE - 1) && (c != '\0'))
+		c = ((char *)tei->paddr)[IF_NAMESIZE - 1];
+		((char *)tei->paddr)[IF_NAMESIZE - 1] = '\0';
+		mlen = strlen((char *)tei->paddr);
+		if ((mlen == IF_NAMESIZE - 1) && (c != '\0'))
 			return (EINVAL);
 
 		struct xaddr_iface ifname, ifmask;
@@ -360,31 +454,49 @@
 		mask_ptr = (struct sockaddr *)&ifmask;
 #endif
 		mask_ptr = NULL;
-		memcpy(ifname.ifname, paddr, mlen);
+		memcpy(ifname.ifname, tei->paddr, mlen);
 		/* Set pointers */
-		rnh_ptr = &ch->xtables[tbl];
 		sa_ptr = (struct sockaddr *)&ifname;
 
 		break;
 
 	default:
 		return (EINVAL);
 	}
 
-	IPFW_WLOCK(ch);
-	if ((rnh = *rnh_ptr) == NULL) {
-		IPFW_WUNLOCK(ch);
+	IPFW_UH_RLOCK(ch);
+	ni = CHAIN_TO_NI(ch);
+	if ((tc = find_table(ni, ti)) == NULL) {
+		IPFW_UH_RUNLOCK(ch);
 		return (ESRCH);
 	}
 
-	if (ch->tabletype[tbl] != type) {
-		IPFW_WUNLOCK(ch);
+	if (tc->no.type != ti->type) {
+		IPFW_UH_RUNLOCK(ch);
 		return (EINVAL);
 	}
+	kidx = tc->no.kidx;
+
+	IPFW_WLOCK(ch);
+
+	rnh = NULL;
+	switch (ti->type) {
+	case IPFW_TABLE_CIDR:
+		if (tei->plen == sizeof(in_addr_t))
+			rnh = ch->tables[kidx];
+		else
+			rnh = ch->xtables[kidx];
+		break;
+	case IPFW_TABLE_INTERFACE:
+		rnh = ch->xtables[kidx];
+		break;
+	}
 
 	ent = (struct table_entry *)rnh->rnh_deladdr(sa_ptr, mask_ptr, rnh);
 	IPFW_WUNLOCK(ch);
 
+	IPFW_UH_RUNLOCK(ch);
+
 	if (ent == NULL)
 		return (ESRCH);
 
@@ -405,102 +517,206 @@
 	return (0);
 }
 
+/*
+ * Flushes all entries in given table while minimizing chain WLOCK hold time.
+ *
+ */
 int
-ipfw_flush_table(struct ip_fw_chain *ch, uint16_t tbl)
+ipfw_flush_table(struct ip_fw_chain *ch, struct tid_info *ti)
 {
-	struct radix_node_head *rnh, *xrnh;
+	struct namedobj_instance *ni;
+	struct table_config *tc;
+	void *ostate, *oxstate;
+	void *state, *xstate;
+	int error;
+	uint8_t type;
+	uint16_t kidx;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 
 	/*
-	 * We free both (IPv4 and extended) radix trees and
-	 * clear table type here to permit table to be reused
-	 * for different type without module reload
+	 * Stage 1: determine table type.
+	 * Reference found table to ensure it won't disappear.
+	 */
+	IPFW_UH_WLOCK(ch);
+	ni = CHAIN_TO_NI(ch);
+	if ((tc = find_table(ni, ti)) == NULL) {
+		IPFW_UH_WUNLOCK(ch);
+		return (ESRCH);
+	}
+	type = tc->no.type;
+	tc->no.refcnt++;
+	IPFW_UH_WUNLOCK(ch);
+
+	/*
+	 * Stage 2: allocate new state for given type.
 	 */
+	if ((error = alloc_table_state(&state, &xstate, type)) != 0) {
+		IPFW_UH_WLOCK(ch);
+		tc->no.refcnt--;
+		IPFW_UH_WUNLOCK(ch);
+		return (error);
+	}
 
+	/*
+	 * Stage 3: swap old state pointers with newly-allocated ones.
+	 * Decrease refcount.
+	 */
+	IPFW_UH_WLOCK(ch);
 	IPFW_WLOCK(ch);
-	/* Set IPv4 table pointer to zero */
-	if ((rnh = ch->tables[tbl]) != NULL)
-		ch->tables[tbl] = NULL;
-	/* Set extended table pointer to zero */
-	if ((xrnh = ch->xtables[tbl]) != NULL)
-		ch->xtables[tbl] = NULL;
-	/* Zero table type */
-	ch->tabletype[tbl] = 0;
+
+	ni = CHAIN_TO_NI(ch);
+	kidx = tc->no.kidx;
+
+	ostate = ch->tables[kidx];
+	ch->tables[kidx] = state;
+	oxstate = ch->xtables[kidx];
+	ch->xtables[kidx] = xstate;
+
+	tc->no.refcnt--;
+
 	IPFW_WUNLOCK(ch);
+	IPFW_UH_WUNLOCK(ch);
 
-	if (rnh != NULL) {
-		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
-		rn_detachhead((void **)&rnh);
+	/*
+	 * Stage 4: perform real flush.
+	 */
+	free_table_state(&ostate, &oxstate, type);
+
+	return (0);
+}
+
+/*
+ * Destroys given table @ti: unlinks it, frees its index and flushes it.
+ */
+int
+ipfw_destroy_table(struct ip_fw_chain *ch, struct tid_info *ti, int force)
+{
+	struct namedobj_instance *ni;
+	struct table_config *tc;
+
+	ti->set = TABLE_SET(ti->set);
+
+	IPFW_UH_WLOCK(ch);
+
+	ni = CHAIN_TO_NI(ch);
+	if ((tc = find_table(ni, ti)) == NULL) {
+		IPFW_UH_WUNLOCK(ch);
+		return (ESRCH);
 	}
 
-	if (xrnh != NULL) {
-		xrnh->rnh_walktree(xrnh, flush_table_entry, xrnh);
-		rn_detachhead((void **)&xrnh);
+	/* Do not permit destroying used tables */
+	if (tc->no.refcnt > 0 && force == 0) {
+		IPFW_UH_WUNLOCK(ch);
+		return (EBUSY);
 	}
 
+	IPFW_WLOCK(ch);
+	unlink_table(ch, tc);
+	IPFW_WUNLOCK(ch);
+
+	/* Free obj index */
+	if (ipfw_objhash_free_idx(ni, tc->no.set, tc->no.kidx) != 0)
+		printf("Error unlinking kidx %d from table %s\n",
+		    tc->no.kidx, tc->tablename);
+
+	IPFW_UH_WUNLOCK(ch);
+
+	free_table_config(ni, tc);
+
 	return (0);
 }
 
+static void
+destroy_table_locked(struct namedobj_instance *ni, struct named_object *no,
+    void *arg)
+{
+
+	unlink_table((struct ip_fw_chain *)arg, (struct table_config *)no);
+	if (ipfw_objhash_free_idx(ni, no->set, no->kidx) != 0)
+		printf("Error unlinking kidx %d from table %s\n",
+		    no->kidx, no->name);
+	free_table_config(ni, (struct table_config *)no);
+}
+
 void
 ipfw_destroy_tables(struct ip_fw_chain *ch)
 {
-	uint16_t tbl;
 
-	/* Flush all tables */
-	for (tbl = 0; tbl < V_fw_tables_max; tbl++)
-		ipfw_flush_table(ch, tbl);
+	/* Remove all tables from working set */
+	IPFW_UH_WLOCK(ch);
+	IPFW_WLOCK(ch);
+	ipfw_objhash_foreach(CHAIN_TO_NI(ch), destroy_table_locked, ch);
+	IPFW_WUNLOCK(ch);
+	IPFW_UH_WUNLOCK(ch);
 
 	/* Free pointers itself */
 	free(ch->tables, M_IPFW);
 	free(ch->xtables, M_IPFW);
-	free(ch->tabletype, M_IPFW);
+
+	ipfw_objhash_destroy(CHAIN_TO_NI(ch));
+	free(CHAIN_TO_TCFG(ch), M_IPFW);
 }
 
 int
 ipfw_init_tables(struct ip_fw_chain *ch)
 {
+	struct tables_config *tcfg;
+
 	/* Allocate pointers */
 	ch->tables = malloc(V_fw_tables_max * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
 	ch->xtables = malloc(V_fw_tables_max * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
-	ch->tabletype = malloc(V_fw_tables_max * sizeof(uint8_t), M_IPFW, M_WAITOK | M_ZERO);
+
+	tcfg = malloc(sizeof(struct tables_config), M_IPFW, M_WAITOK | M_ZERO);
+	tcfg->namehash = ipfw_objhash_create(V_fw_tables_max);
+	ch->tblcfg = tcfg;
+
 	return (0);
 }
 
 int
 ipfw_resize_tables(struct ip_fw_chain *ch, unsigned int ntables)
 {
 	struct radix_node_head **tables, **xtables, *rnh;
 	struct radix_node_head **tables_old, **xtables_old;
-	uint8_t *tabletype, *tabletype_old;
 	unsigned int ntables_old, tbl;
+	struct namedobj_instance *ni;
+	void *new_idx;
+	int new_blocks;
 
 	/* Check new value for validity */
 	if (ntables > IPFW_TABLES_MAX)
 		ntables = IPFW_TABLES_MAX;
 
 	/* Allocate new pointers */
 	tables = malloc(ntables * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
 	xtables = malloc(ntables * sizeof(void *), M_IPFW, M_WAITOK | M_ZERO);
-	tabletype = malloc(ntables * sizeof(uint8_t), M_IPFW, M_WAITOK | M_ZERO);
+	ipfw_objhash_bitmap_alloc(ntables, (void *)&new_idx, &new_blocks);
 
 	IPFW_WLOCK(ch);
 
 	tbl = (ntables >= V_fw_tables_max) ? V_fw_tables_max : ntables;
+	ni = CHAIN_TO_NI(ch);
+
+	/* Temporarily restrict decreasing max_tables */
+	if (ipfw_objhash_bitmap_merge(ni, &new_idx, &new_blocks) != 0) {
+		IPFW_WUNLOCK(ch);
+		free(tables, M_IPFW);
+		free(xtables, M_IPFW);
+		ipfw_objhash_bitmap_free(new_idx, new_blocks);
+		return (EINVAL);
+	}
 
 	/* Copy old table pointers */
 	memcpy(tables, ch->tables, sizeof(void *) * tbl);
 	memcpy(xtables, ch->xtables, sizeof(void *) * tbl);
-	memcpy(tabletype, ch->tabletype, sizeof(uint8_t) * tbl);
 
 	/* Change pointers and number of tables */
 	tables_old = ch->tables;
 	xtables_old = ch->xtables;
-	tabletype_old = ch->tabletype;
 	ch->tables = tables;
 	ch->xtables = xtables;
-	ch->tabletype = tabletype;
 
 	ntables_old = V_fw_tables_max;
 	V_fw_tables_max = ntables;
@@ -525,7 +741,7 @@
 	/* Free old pointers */
 	free(tables_old, M_IPFW);
 	free(xtables_old, M_IPFW);
-	free(tabletype_old, M_IPFW);
+	ipfw_objhash_bitmap_free(new_idx, new_blocks);
 
 	return (0);
 }
@@ -602,14 +818,17 @@
 }
 
 int
-ipfw_count_table(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt)
+ipfw_count_table(struct ip_fw_chain *ch, struct tid_info *ti, uint32_t *cnt)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (ESRCH);
 	*cnt = 0;
-	if ((rnh = ch->tables[tbl]) == NULL)
+	if ((rnh = ch->tables[tc->no.kidx]) == NULL)
 		return (0);
 	rnh->rnh_walktree(rnh, count_table_entry, cnt);
 	return (0);
@@ -637,14 +856,17 @@
 }
 
 int
-ipfw_dump_table(struct ip_fw_chain *ch, ipfw_table *tbl)
+ipfw_dump_table(struct ip_fw_chain *ch, struct tid_info *ti, ipfw_table *tbl)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
-	if (tbl->tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (ESRCH);
 	tbl->cnt = 0;
-	if ((rnh = ch->tables[tbl->tbl]) == NULL)
+	if ((rnh = ch->tables[tc->no.kidx]) == NULL)
 		return (0);
 	rnh->rnh_walktree(rnh, dump_table_entry, tbl);
 	return (0);
@@ -660,16 +882,19 @@
 }
 
 int
-ipfw_count_xtable(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt)
+ipfw_count_xtable(struct ip_fw_chain *ch, struct tid_info *ti, uint32_t *cnt)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
-	if (tbl >= V_fw_tables_max)
+	if (ti->uidx >= V_fw_tables_max)
 		return (EINVAL);
 	*cnt = 0;
-	if ((rnh = ch->tables[tbl]) != NULL)
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (0);	/* XXX: We should return ESRCH */
+	if ((rnh = ch->tables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, count_table_xentry, cnt);
-	if ((rnh = ch->xtables[tbl]) != NULL)
+	if ((rnh = ch->xtables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, count_table_xentry, cnt);
 	/* Return zero if table is empty */
 	if (*cnt > 0)
@@ -747,19 +972,700 @@
 }
 
 int
-ipfw_dump_xtable(struct ip_fw_chain *ch, ipfw_xtable *tbl)
+ipfw_dump_xtable(struct ip_fw_chain *ch, struct tid_info *ti, ipfw_xtable *tbl)
 {
 	struct radix_node_head *rnh;
+	struct table_config *tc;
 
 	if (tbl->tbl >= V_fw_tables_max)
 		return (EINVAL);
 	tbl->cnt = 0;
-	tbl->type = ch->tabletype[tbl->tbl];
-	if ((rnh = ch->tables[tbl->tbl]) != NULL)
+
+	if ((tc = find_table(CHAIN_TO_NI(ch), ti)) == NULL)
+		return (0);	/* XXX: We should return ESRCH */
+	tbl->type = tc->no.type;
+	if ((rnh = ch->tables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, dump_table_xentry_base, tbl);
-	if ((rnh = ch->xtables[tbl->tbl]) != NULL)
+	if ((rnh = ch->xtables[tc->no.kidx]) != NULL)
 		rnh->rnh_walktree(rnh, dump_table_xentry_extended, tbl);
 	return (0);
 }
 
+/*
+ * Table rewriting code.
+ *
+ */
+
+/*
+ * Determine table number and lookup type for @cmd.
+ * Fill @puidx and @ptype with appropriate values.
+ * Returns 0 for relevant opcodes, 1 otherwise.
+ */
+static int
+classify_table_opcode(ipfw_insn *cmd, uint16_t *puidx, uint8_t *ptype)
+{
+	ipfw_insn_if *cmdif;
+	int skip;
+	uint16_t v;
+
+	skip = 1;
+
+	switch (cmd->opcode) {
+	case O_IP_SRC_LOOKUP:
+	case O_IP_DST_LOOKUP:
+		/* Basic IPv4/IPv6 or u32 lookups */
+		*puidx = cmd->arg1;
+		/* Assume CIDR by default */
+		*ptype = IPFW_TABLE_CIDR;
+		skip = 0;
+		
+		if (F_LEN(cmd) > F_INSN_SIZE(ipfw_insn_u32)) {
+			/*
+			 * generic lookup. The key must be
+			 * in 32bit big-endian format.
+			 */
+			v = ((ipfw_insn_u32 *)cmd)->d[1];
+			switch (v) {
+			case 0:
+			case 1:
+				/* IPv4 src/dst */
+				break;
+			case 2:
+			case 3:
+				/* src/dst port */
+				//type = IPFW_TABLE_U16;
+				break;
+			case 4:
+				/* uid/gid */
+				//type = IPFW_TABLE_U32;
+			case 5:
+				//type = IPFW_TABLE_U32;
+				/* jid */
+			case 6:
+				//type = IPFW_TABLE_U16;
+				/* dscp */
+				break;
+			}
+		}
+		break;
+	case O_XMIT:
+	case O_RECV:
+	case O_VIA:
+		/* Interface table, possibly */
+		cmdif = (ipfw_insn_if *)cmd;
+		if (cmdif->name[0] != '\1')
+			break;
+
+		*ptype = IPFW_TABLE_INTERFACE;
+		*puidx = cmdif->p.glob;
+		skip = 0;
+		break;
+	}
+
+	return (skip);
+}
+
+/*
+ * Sets new table value for given opcode.
+ * Assume the same opcodes as classify_table_opcode()
+ */
+static void
+update_table_opcode(ipfw_insn *cmd, uint16_t idx)
+{
+	ipfw_insn_if *cmdif;
+
+	switch (cmd->opcode) {
+	case O_IP_SRC_LOOKUP:
+	case O_IP_DST_LOOKUP:
+		/* Basic IPv4/IPv6 or u32 lookups */
+		cmd->arg1 = idx;
+		break;
+	case O_XMIT:
+	case O_RECV:
+	case O_VIA:
+		/* Interface table, possibly */
+		cmdif = (ipfw_insn_if *)cmd;
+		cmdif->p.glob = idx;
+		break;
+	}
+}
+
+static char *
+find_name_tlv(void *tlvs, int len, uint16_t uidx)
+{
+	ipfw_xtable_ntlv *ntlv;
+	uintptr_t pa, pe;
+	int l;
+
+	pa = (uintptr_t)tlvs;
+	pe = pa + len;
+	l = 0;
+	for (; pa < pe; pa += l) {
+		ntlv = (ipfw_xtable_ntlv *)pa;
+		l = ntlv->head.length;
+		if (ntlv->head.type != IPFW_TLV_NAME)
+			continue;
+		if (ntlv->idx != uidx)
+			continue;
+		
+		return (ntlv->name);
+	}
+
+	return (NULL);
+}
+
+static struct table_config *
+find_table(struct namedobj_instance *ni, struct tid_info *ti)
+{
+	char *name, bname[16];
+	struct named_object *no;
+
+	if (ti->tlvs != NULL) {
+		name = find_name_tlv(ti->tlvs, ti->tlen, ti->uidx);
+		if (name == NULL)
+			return (NULL);
+	} else {
+		snprintf(bname, sizeof(bname), "%d", ti->uidx);
+		name = bname;
+	}
+
+	no = ipfw_objhash_lookup_name(ni, ti->set, name);
+
+	return ((struct table_config *)no);
+}
+
+static int
+alloc_table_state(void **state, void **xstate, uint8_t type)
+{
+
+	switch (type) {
+	case IPFW_TABLE_CIDR:
+		if (!rn_inithead(state, OFF_LEN_INET))
+			return (ENOMEM);
+		if (!rn_inithead(xstate, OFF_LEN_INET6)) {
+			rn_detachhead(state);
+			return (ENOMEM);
+		}
+		break;
+	case IPFW_TABLE_INTERFACE:
+		*state = NULL;
+		if (!rn_inithead(xstate, OFF_LEN_IFACE))
+			return (ENOMEM);
+		break;
+	}
+	
+	return (0);
+}
+
+
+static struct table_config *
+alloc_table_config(struct namedobj_instance *ni, struct tid_info *ti)
+{
+	char *name, bname[16];
+	struct table_config *tc;
+	int error;
+
+	if (ti->tlvs != NULL) {
+		name = find_name_tlv(ti->tlvs, ti->tlen, ti->uidx);
+		if (name == NULL)
+			return (NULL);
+	} else {
+		snprintf(bname, sizeof(bname), "%d", ti->uidx);
+		name = bname;
+	}
+
+	tc = malloc(sizeof(struct table_config), M_IPFW, M_WAITOK | M_ZERO);
+	tc->no.name = tc->tablename;
+	tc->no.type = ti->type;
+	tc->no.set = ti->set;
+	strlcpy(tc->tablename, name, sizeof(tc->tablename));
+
+	if (ti->tlvs == NULL) {
+		tc->no.compat = 1;
+		tc->no.uidx = ti->uidx;
+	}
+
+	/* Preallocate data structures for new tables */
+	error = alloc_table_state(&tc->state, &tc->xstate, ti->type);
+	if (error != 0) {
+		free(tc, M_IPFW);
+		return (NULL);
+	}
+	
+	return (tc);
+}
+
+static void
+free_table_state(void **state, void **xstate, uint8_t type)
+{
+	struct radix_node_head *rnh;
+
+	switch (type) {
+	case IPFW_TABLE_CIDR:
+		rnh = (struct radix_node_head *)(*state);
+		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
+		rn_detachhead(state);
+
+		rnh = (struct radix_node_head *)(*xstate);
+		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
+		rn_detachhead(xstate);
+		break;
+	case IPFW_TABLE_INTERFACE:
+		rnh = (struct radix_node_head *)(*xstate);
+		rnh->rnh_walktree(rnh, flush_table_entry, rnh);
+		rn_detachhead(xstate);
+		break;
+	}
+}
+
+static void
+free_table_config(struct namedobj_instance *ni, struct table_config *tc)
+{
+
+	if (tc->linked == 0)
+		free_table_state(&tc->state, &tc->xstate, tc->no.type);
+
+	free(tc, M_IPFW);
+}
+
+/*
+ * Links @tc to @chain table named instance.
+ * Sets appropriate type/states in @chain table info.
+ */
+static void
+link_table(struct ip_fw_chain *chain, struct table_config *tc)
+{
+	struct namedobj_instance *ni;
+	uint16_t kidx;
+
+	IPFW_UH_WLOCK_ASSERT(chain);
+	IPFW_WLOCK_ASSERT(chain);
+
+	ni = CHAIN_TO_NI(chain);
+	kidx = tc->no.kidx;
+
+	ipfw_objhash_add(ni, &tc->no);
+	chain->tables[kidx] = tc->state;
+	chain->xtables[kidx] = tc->xstate;
+
+	tc->linked = 1;
+}
+
+/*
+ * Unlinks @tc from @chain table named instance.
+ * Zeroes states in @chain and stores them in @tc.
+ */
+static void
+unlink_table(struct ip_fw_chain *chain, struct table_config *tc)
+{
+	struct namedobj_instance *ni;
+	uint16_t kidx;
+
+	IPFW_UH_WLOCK_ASSERT(chain);
+	IPFW_WLOCK_ASSERT(chain);
+
+	ni = CHAIN_TO_NI(chain);
+	kidx = tc->no.kidx;
+
+	/* Clear state and save pointers for flush */
+	ipfw_objhash_del(ni, &tc->no);
+	tc->state = chain->tables[kidx];
+	chain->tables[kidx] = NULL;
+	tc->xstate = chain->xtables[kidx];
+	chain->xtables[kidx] = NULL;
+
+	tc->linked = 0;
+}
+
+/*
+ * Finds named object by @uidx number.
+ * Refs found object, allocates new index for non-existing object.
+ * Fills in @pidx with userland/kernel indexes.
+ *
+ * Returns 0 on success.
+ */
+static int
+bind_table(struct namedobj_instance *ni, struct rule_check_info *ci,
+    struct obj_idx *pidx, struct tid_info *ti)
+{
+	struct table_config *tc;
+
+	tc = find_table(ni, ti);
+
+	pidx->uidx = ti->uidx;
+	pidx->type = ti->type;
+
+	if (tc == NULL) {
+		/* Table does not exist yet: reserve an index for it */
+		if (ipfw_objhash_alloc_idx(ni, ti->set, &pidx->kidx) != 0) {
+			printf("Unable to allocate table index in set %u."
+			    " Consider increasing net.inet.ip.fw.tables_max\n",
+				    ti->set);
+			return (EBUSY);
+		}
+
+		pidx->new = 1;
+		ci->new_tables++;
+
+		return (0);
+	}
+
+	/* Check if table type is valid first */
+	if (tc->no.type != ti->type)
+		return (EINVAL);
+
+	tc->no.refcnt++;
+
+	pidx->kidx = tc->no.kidx;
+
+	return (0);
+}
+
+/*
+ * Compatibility function for old ipfw(8) binaries.
+ * Rewrites table kernel indices with userland ones.
+ * Works for \d+ tables only (e.g. for tables, converted
+ * from old numbered system calls).
+ *
+ * Returns 0 on success.
+ * Raises error on any other tables.
+ */
+int
+ipfw_rewrite_table_kidx(struct ip_fw_chain *chain, struct ip_fw *rule)
+{
+	int cmdlen, l;
+	ipfw_insn *cmd;
+	uint32_t set;
+	uint16_t kidx;
+	uint8_t type;
+	struct named_object *no;
+	struct namedobj_instance *ni;
+
+	ni = CHAIN_TO_NI(chain);
+
+	set = TABLE_SET(rule->set);
+	
+	l = rule->cmd_len;
+	cmd = rule->cmd;
+	cmdlen = 0;
+	for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+		cmdlen = F_LEN(cmd);
+
+		if (classify_table_opcode(cmd, &kidx, &type) != 0)
+			continue;
+
+		if ((no = ipfw_objhash_lookup_idx(ni, set, kidx)) == NULL)
+			return (1);
+
+		if (no->compat == 0)
+			return (2);
+
+		update_table_opcode(cmd, no->uidx);
+	}
+
+	return (0);
+}
+
+
+/*
+ * Checks if opcode is referencing table of appropriate type.
+ * Adds reference count for found table if true.
+ * Rewrites user-supplied opcode values with kernel ones.
+ *
+ * Returns 0 on success and appropriate error code otherwise.
+ */
+int
+ipfw_rewrite_table_uidx(struct ip_fw_chain *chain,
+    struct rule_check_info *ci)
+{
+	int cmdlen, error, ftype, l;
+	ipfw_insn *cmd;
+	uint16_t uidx;
+	uint8_t type;
+	struct table_config *tc;
+	struct namedobj_instance *ni;
+	struct named_object *no, *no_n, *no_tmp;
+	struct obj_idx *pidx, *p, *oib;
+	struct namedobjects_head nh;
+	struct tid_info ti;
+
+	ni = CHAIN_TO_NI(chain);
+
+	/*
+	 * Prepare an array for storing opcode indices.
+	 * Use stack allocation by default.
+	 */
+	if (ci->table_opcodes <= (sizeof(ci->obuf)/sizeof(ci->obuf[0]))) {
+		/* Stack */
+		pidx = ci->obuf;
+	} else
+		pidx = malloc(ci->table_opcodes * sizeof(struct obj_idx),
+		    M_IPFW, M_WAITOK | M_ZERO);
+
+	oib = pidx;
+	error = 0;
+
+	type = 0;
+	ftype = 0;
+
+	ci->tableset = TABLE_SET(ci->krule->set);
+
+	memset(&ti, 0, sizeof(ti));
+	ti.set = ci->tableset;
+	ti.tlvs = ci->tlvs;
+	ti.tlen = ci->tlen;
+
+	/*
+	 * Stage 1: reference existing tables and determine number
+	 * of tables we need to allocate
+	 */
+	IPFW_UH_WLOCK(chain);
+
+	l = ci->krule->cmd_len;
+	cmd = ci->krule->cmd;
+	cmdlen = 0;
+	for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+		cmdlen = F_LEN(cmd);
+
+		if (classify_table_opcode(cmd, &ti.uidx, &ti.type) != 0)
+			continue;
+
+		/*
+		 * Got table opcode with necessary info.
+		 * Try to reference existing tables and allocate
+		 * indices for non-existing one while holding write lock.
+		 */
+		if ((error = bind_table(ni, ci, pidx, &ti)) != 0)
+			break;
+
+		/*
+		 * @pidx stores either existing ref'd table id or new one.
+		 * Move to next index
+		 */
+
+		pidx++;
+	}
+
+	if (error != 0) {
+		/* Unref everything we have already done */
+		for (p = oib; p < pidx; p++) {
+			if (p->new != 0) {
+				ipfw_objhash_free_idx(ni, ci->tableset,p->kidx);
+				continue;
+			}
+
+			/* Find & unref by existing idx */
+			no = ipfw_objhash_lookup_idx(ni, ci->tableset, p->kidx);
+			KASSERT(no != NULL, ("Ref'd table %d disappeared",
+			    p->kidx));
+
+			no->refcnt--;
+		}
+
+		IPFW_UH_WUNLOCK(chain);
+
+		if (oib != ci->obuf)
+			free(oib, M_IPFW);
+
+		return (error);
+	}
+
+	IPFW_UH_WUNLOCK(chain);
+
+	/*
+	 * Stage 2: allocate table configs for every non-existent table
+	 */
+
+	/* Prepare queue to store configs; Stage 4 always walks it */
+	TAILQ_INIT(&nh);
+	if (ci->new_tables > 0) {
+
+		for (p = oib; p < pidx; p++) {
+			if (p->new == 0)
+				continue;
+
+			/* TODO: get name from TLV */
+			ti.uidx = p->uidx;
+			ti.type = p->type;
+
+			tc = alloc_table_config(ni, &ti);
+
+			if (tc == NULL) {
+				error = ENOMEM;
+				goto free;
+			}
+
+			tc->no.kidx = p->kidx;
+			tc->no.refcnt = 1;
+
+			/* Add to list */
+			TAILQ_INSERT_TAIL(&nh, &tc->no, nn_next);
+		}
+
+		/*
+		 * Stage 2.1: Check if we're going to create 2 tables
+		 * with the same name, but different table types.
+		 */
+		TAILQ_FOREACH(no, &nh, nn_next) {
+			TAILQ_FOREACH(no_tmp, &nh, nn_next) {
+				if (strcmp(no->name, no_tmp->name) != 0)
+					continue;
+				if (no->type != no_tmp->type) {
+					error = EINVAL;
+					goto free;
+				}
+			}
+		}
+
+		/*
+		 * Stage 3: link & reference new table configs
+		 */
+
+		IPFW_UH_WLOCK(chain);
+
+		/*
+		 * Stage 3.1: Check if some tables we need to create have
+		 * already been created with a different table type.
+		 */
+
+		error = 0;
+		TAILQ_FOREACH_SAFE(no, &nh, nn_next, no_tmp) {
+			no_n = ipfw_objhash_lookup_name(ni, no->set, no->name);
+			if (no_n == NULL)
+				continue;
+
+			if (no_n->type != no->type) {
+				error = EINVAL;
+				break;
+			}
+
+		}
+
+		if (error != 0) {
+			/*
+			 * Someone has allocated table with different table type.
+			 * We have to rollback everything.
+			 */
+			IPFW_UH_WUNLOCK(chain);
+
+			goto free;
+		}
+
+
+		/*
+		 * Finally, attach tables and rewrite rule.
+		 * We need to set table type for each new table,
+		 * so we have to acquire main WLOCK.
+		 */
+		IPFW_WLOCK(chain);
+		TAILQ_FOREACH_SAFE(no, &nh, nn_next, no_tmp) {
+			no_n = ipfw_objhash_lookup_name(ni, no->set, no->name);
+			if (no_n != NULL) {
+				/* Increase refcount for existing table */
+				no_n->refcnt++;
+				/* Keep oib array in sync: update kidx */
+				for (p = oib; p < pidx; p++) {
+					if (p->kidx == no->kidx) {
+						p->kidx = no_n->kidx;
+						break;
+					}
+				}
+
+				continue;
+			}
+
+			/* New table. Attach to runtime hash */
+			TAILQ_REMOVE(&nh, no, nn_next);
+
+			link_table(chain, (struct table_config *)no);
+		}
+		IPFW_WUNLOCK(chain);
+
+		/* Perform rule rewrite */
+		l = ci->krule->cmd_len;
+		cmd = ci->krule->cmd;
+		cmdlen = 0;
+		pidx = oib;
+		for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+			cmdlen = F_LEN(cmd);
+
+			if (classify_table_opcode(cmd, &uidx, &type) != 0)
+				continue;
+			update_table_opcode(cmd, pidx->kidx);
+			pidx++;
+		}
+
+		IPFW_UH_WUNLOCK(chain);
+	}
+
+	error = 0;
+
+	/*
+	 * Stage 4: free resources
+	 */
+free:
+	TAILQ_FOREACH_SAFE(no, &nh, nn_next, no_tmp)
+		free_table_config(ni, (struct table_config *)no);
+
+	if (oib != ci->obuf)
+		free(oib, M_IPFW);
+
+	return (error);
+}
+
+/*
+ * Remove references from every table used in @rule.
+ */
+void
+ipfw_unbind_table_rule(struct ip_fw_chain *chain, struct ip_fw *rule)
+{
+	int cmdlen, l;
+	ipfw_insn *cmd;
+	struct namedobj_instance *ni;
+	struct named_object *no;
+	uint32_t set;
+	uint16_t kidx;
+	uint8_t type;
+
+	ni = CHAIN_TO_NI(chain);
+
+	set = TABLE_SET(rule->set);
+
+	l = rule->cmd_len;
+	cmd = rule->cmd;
+	cmdlen = 0;
+	for ( ;	l > 0 ; l -= cmdlen, cmd += cmdlen) {
+		cmdlen = F_LEN(cmd);
+
+		if (classify_table_opcode(cmd, &kidx, &type) != 0)
+			continue;
+
+		no = ipfw_objhash_lookup_idx(ni, set, kidx); 
+
+		KASSERT(no != NULL, ("table id %d not found", kidx));
+		KASSERT(no->type == type, ("wrong type %d (%d) for table id %d",
+		    no->type, type, kidx));
+		KASSERT(no->refcnt > 0, ("refcount for table %d is %d",
+		    kidx, no->refcnt));
+
+		no->refcnt--;
+	}
+}
+
+
+/*
+ * Removes table bindings for every rule in rule chain @head.
+ */
+void
+ipfw_unbind_table_list(struct ip_fw_chain *chain, struct ip_fw *head)
+{
+	struct ip_fw *rule;
+
+	while ((rule = head) != NULL) {
+		head = head->x_next;
+		ipfw_unbind_table_rule(chain, rule);
+	}
+}
+
+
 /* end of file */

--------------060502050508080706040508--


