From owner-svn-src-all@FreeBSD.ORG Thu Oct 9 19:32:36 2014 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B3DC250A; Thu, 9 Oct 2014 19:32:36 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9D3C5EB6; Thu, 9 Oct 2014 19:32:36 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.9/8.14.9) with ESMTP id s99JWaVX065628; Thu, 9 Oct 2014 19:32:36 GMT (envelope-from melifaro@FreeBSD.org) Received: (from melifaro@localhost) by svn.freebsd.org (8.14.9/8.14.9/Submit) id s99JWaSc065624; Thu, 9 Oct 2014 19:32:36 GMT (envelope-from melifaro@FreeBSD.org) Message-Id: <201410091932.s99JWaSc065624@svn.freebsd.org> X-Authentication-Warning: svn.freebsd.org: melifaro set sender to melifaro@FreeBSD.org using -f From: "Alexander V. Chernikov" Date: Thu, 9 Oct 2014 19:32:36 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r272840 - in head: sbin/ipfw sys/conf sys/modules/ipfw sys/netgraph sys/netinet sys/netpfil/ipfw X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Oct 2014 19:32:36 -0000 Author: melifaro Date: Thu Oct 9 19:32:35 2014 New Revision: 272840 URL: https://svnweb.freebsd.org/changeset/base/272840 Log: Merge projects/ipfw to HEAD. Main user-visible changes are related to tables: * Tables are now identified by names, not numbers. There can be up to 65k tables with up to 63-byte long names. * Tables are now set-aware (default off), so you can switch/move them atomically with rules. * More functionality is supported (swap, lock, limits, user-level lookup, batched add/del) by generic table code. * New table types are added (flow) so you can match multiple packet fields at once. * Ability to add different type of lookup algorithms for particular table type has been added. * New table algorithms are added (cidr:hash, iface:array, number:array and flow:hash) to make certain types of lookup more effective. * Table value are now capable of holding multiple data fields for different tablearg users Performance changes: * Main ipfw lock was converted to rmlock * Rule counters were separated from rule itself and made per-cpu. * Radix table entries fits into 128 bytes * struct ip_fw is now more compact so more rules will fit into 64 bytes * interface tables uses array of existing ifindexes for faster match ABI changes: All functionality supported by old ipfw(8) remains functional. Old & new binaries can work together with the following restrictions: * Tables named other than ^\d+$ are shown as table(65535) in ruleset in old binaries Internal changes:. Changing table ids to numbers resulted in format modification for most sockopt codes. Old sopt format was compact, but very hard to extend (no versioning, inability to add more opcodes), so * All relevant opcodes were converted to TLV-based versioned IP_FW3-based codes. * The remaining opcodes were also converted to be able to eliminate all older opcodes at once * All IP_FW3 handlers uses special API instead of calling sooptcopy* directly to ease adding another communication methods * struct ip_fw is now different for kernel and userland * tablearg value has been changed to 0 to ease future extensions * table "values" are now indexes in special value array which holds extended data for given index * Batched add/delete has been added to tables code * Most changes has been done to permit batched rule addition. * interface tracking API has been added (started on demand) to permit effective interface tables operations * O(1) skipto cache, currently turned off by default at compile-time (eats 512K). * Several steps has been made towards making libipfw: * most of new functions were separated into "parse/prepare/show and actuall-do-stuff" pieces (already merged). * there are separate functions for parsing text string into "struct ip_fw" and printing "struct ip_fw" to supplied buffer (already merged). * Probably some more less significant/forgotten features MFC after: 1 month Sponsored by: Yandex LLC Added: head/sbin/ipfw/tables.c - copied unchanged from r272837, projects/ipfw/sbin/ipfw/tables.c head/sys/netpfil/ipfw/ip_fw_iface.c - copied unchanged from r272837, projects/ipfw/sys/netpfil/ipfw/ip_fw_iface.c head/sys/netpfil/ipfw/ip_fw_table.h (contents, props changed) - copied, changed from r272837, projects/ipfw/sys/netpfil/ipfw/ip_fw_table.h head/sys/netpfil/ipfw/ip_fw_table_algo.c (contents, props changed) - copied, changed from r272837, projects/ipfw/sys/netpfil/ipfw/ip_fw_table_algo.c head/sys/netpfil/ipfw/ip_fw_table_value.c - copied unchanged from r272837, projects/ipfw/sys/netpfil/ipfw/ip_fw_table_value.c Modified: head/sbin/ipfw/Makefile head/sbin/ipfw/ipfw.8 head/sbin/ipfw/ipfw2.c head/sbin/ipfw/ipfw2.h head/sbin/ipfw/main.c head/sbin/ipfw/nat.c head/sys/conf/files head/sys/modules/ipfw/Makefile head/sys/netgraph/ng_ipfw.c head/sys/netinet/ip_fw.h head/sys/netpfil/ipfw/ip_dummynet.c head/sys/netpfil/ipfw/ip_fw2.c head/sys/netpfil/ipfw/ip_fw_dynamic.c head/sys/netpfil/ipfw/ip_fw_log.c head/sys/netpfil/ipfw/ip_fw_nat.c head/sys/netpfil/ipfw/ip_fw_private.h head/sys/netpfil/ipfw/ip_fw_sockopt.c head/sys/netpfil/ipfw/ip_fw_table.c Directory Properties: head/ (props changed) head/sbin/ (props changed) head/sbin/ipfw/ (props changed) head/sys/ (props changed) head/sys/conf/ (props changed) Modified: head/sbin/ipfw/Makefile ============================================================================== --- head/sbin/ipfw/Makefile Thu Oct 9 19:13:33 2014 (r272839) +++ head/sbin/ipfw/Makefile Thu Oct 9 19:32:35 2014 (r272840) @@ -3,7 +3,7 @@ .include PROG= ipfw -SRCS= ipfw2.c dummynet.c ipv6.c main.c nat.c +SRCS= ipfw2.c dummynet.c ipv6.c main.c nat.c tables.c WARNS?= 2 .if ${MK_PF} != "no" Modified: head/sbin/ipfw/ipfw.8 ============================================================================== --- head/sbin/ipfw/ipfw.8 Thu Oct 9 19:13:33 2014 (r272839) +++ head/sbin/ipfw/ipfw.8 Thu Oct 9 19:32:35 2014 (r272840) @@ -1,7 +1,7 @@ .\" .\" $FreeBSD$ .\" -.Dd May 31, 2014 +.Dd Aug 13, 2014 .Dt IPFW 8 .Os .Sh NAME @@ -48,17 +48,43 @@ in-kernel NAT. .Brq Cm firewall | altq | one_pass | debug | verbose | dyn_keepalive .Ss LOOKUP TABLES .Nm -.Cm table Ar number Cm add Ar addr Ns Oo / Ns Ar masklen Oc Op Ar value +.Oo Cm set Ar N Oc Cm table Ar name Cm create Ar create-options .Nm -.Cm table Ar number Cm delete Ar addr Ns Op / Ns Ar masklen +.Oo Cm set Ar N Oc Cm table Ar name Cm destroy .Nm -.Cm table -.Brq Ar number | all -.Cm flush +.Oo Cm set Ar N Oc Cm table Ar name Cm modify Ar modify-options +.Nm +.Oo Cm set Ar N Oc Cm table Ar name Cm swap Ar name +.Nm +.Oo Cm set Ar N Oc Cm table Ar name Cm add Ar table-key Op Ar value +.Nm +.Oo Cm set Ar N Oc Cm table Ar name Cm add Op Ar table-key Ar value ... +.Nm +.Oo Cm set Ar N Oc Cm table Ar name Cm atomic add Op Ar table-key Ar value ... +.Nm +.Oo Cm set Ar N Oc Cm table Ar name Cm delete Op Ar table-key ... .Nm -.Cm table -.Brq Ar number | all +.Oo Cm set Ar N Oc Cm table Ar name Cm lookup Ar addr +.Nm +.Oo Cm set Ar N Oc Cm table Ar name Cm lock +.Nm +.Oo Cm set Ar N Oc Cm table Ar name Cm unlock +.Nm +.Oo Cm set Ar N Oc Cm table +.Brq Ar name | all .Cm list +.Nm +.Oo Cm set Ar N Oc Cm table +.Brq Ar name | all +.Cm info +.Nm +.Oo Cm set Ar N Oc Cm table +.Brq Ar name | all +.Cm detail +.Nm +.Oo Cm set Ar N Oc Cm table +.Brq Ar name | all +.Cm flush .Ss DUMMYNET CONFIGURATION (TRAFFIC SHAPER AND PACKET SCHEDULER) .Nm .Brq Cm pipe | queue | sched @@ -87,6 +113,13 @@ in-kernel NAT. .Oc .Oc .Ar pathname +.Ss INTERNAL DIAGNOSTICS +.Nm +.Cm internal iflist +.Nm +.Cm internal talist +.Nm +.Cm internal vlist .Sh DESCRIPTION The .Nm @@ -822,10 +855,11 @@ It is possible to use the .Cm tablearg keyword with a skipto for a .Em computed -skipto, but care should be used, as no destination caching -is possible in this case so the rules are always walked to find it, -starting from the -.Cm skipto . +skipto. Skipto may work either in O(log(N)) or in O(1) depending +on amount of memory and/or sysctl variables. +See the +.Sx SYSCTL VARIABLES +section for more details. .It Cm call Ar number | tablearg The current rule number is saved in the internal stack and ruleset processing continues with the first rule numbered @@ -1152,7 +1186,7 @@ with multiple addresses) is provided for its use is discouraged. .It Ar addr : Oo Cm not Oc Bro .Cm any | me | me6 | -.Cm table Ns Pq Ar number Ns Op , Ns Ar value +.Cm table Ns Pq Ar name Ns Op , Ns Ar value .Ar | addr-list | addr-set .Brc .Bl -tag -width indent @@ -1164,8 +1198,8 @@ matches any IP address configured on an matches any IPv6 address configured on an interface in the system. The address list is evaluated at the time the packet is analysed. -.It Cm table Ns Pq Ar number Ns Op , Ns Ar value -Matches any IPv4 address for which an entry exists in the lookup table +.It Cm table Ns Pq Ar name Ns Op , Ns Ar value +Matches any IPv4 or IPv6 address for which an entry exists in the lookup table .Ar number . If an optional 32-bit unsigned .Ar value @@ -1359,6 +1393,19 @@ and IPsec encapsulated security payload .It Cm fib Ar fibnum Matches a packet that has been tagged to use the given FIB (routing table) number. +.It Cm flow Ar table Ns Pq Ar name Ns Op , Ns Ar value +Search for the flow entry in lookup table +.Ar name . +If not found, the match fails. +Otherwise, the match succeeds and +.Cm tablearg +is set to the value extracted from the table. +.Pp +This option can be useful to quickly dispatch traffic based on +certain packet fields. +See the +.Sx LOOKUP TABLES +section below for more information on lookup tables. .It Cm flow-id Ar labels Matches IPv6 packets containing any of the flow labels given in .Ar labels . @@ -1550,9 +1597,9 @@ of source and destination addresses and specified. Currently, only IPv4 flows are supported. -.It Cm lookup Bro Cm dst-ip | dst-port | src-ip | src-port | uid | jail Brc Ar N +.It Cm lookup Bro Cm dst-ip | dst-port | src-ip | src-port | uid | jail Brc Ar name Search an entry in lookup table -.Ar N +.Ar name that matches the field specified as argument. If not found, the match fails. Otherwise, the match succeeds and @@ -1616,13 +1663,19 @@ and they are always printed as hexadecim option is used, in which case symbolic resolution will be attempted). .It Cm proto Ar protocol Matches packets with the corresponding IP protocol. -.It Cm recv | xmit | via Brq Ar ifX | Ar if Ns Cm * | Ar table Ns Pq Ar number Ns Op , Ns Ar value | Ar ipno | Ar any +.It Cm recv | xmit | via Brq Ar ifX | Ar if Ns Cm * | Ar table Ns Po Ar name Ns Oo , Ns Ar value Oc Pc | Ar ipno | Ar any Matches packets received, transmitted or going through, respectively, the interface specified by exact name .Po Ar ifX Pc , by device name .Po Ar if* Pc , by IP address, or through some interface. +Table +.Ar name +may be used to match interface by its kernel ifindex. +See the +.Sx LOOKUP TABLES +section below for more information on lookup tables. .Pp The .Cm via @@ -1817,15 +1870,35 @@ connected networks instead of all source .Sh LOOKUP TABLES Lookup tables are useful to handle large sparse sets of addresses or other search keys (e.g., ports, jail IDs, interface names). -In the rest of this section we will use the term ``address''. -There may be up to 65535 different lookup tables, numbered 0 to 65534. +In the rest of this section we will use the term ``key''. +Table name needs to match the following spec: +.Ar table-name . +Tables with the same name can be created in different +.Ar sets . +However, rule links to the tables in +.Ar set 0 +by default. +This behavior can be controlled by +.Va net.inet.ip.fw.tables_sets +variable. +See the +.Sx SETS OF RULES +section for more information. +There may be up to 65535 different lookup tables. .Pp +The following table types are supported: +.Bl -tag -width indent +.It Ar table-type : Ar addr | iface | number | flow +.It Ar table-key : Ar addr Ns Oo / Ns Ar masklen Oc | iface-name | number | flow-spec +.It Ar flow-spec : Ar flow-field Ns Op , Ns Ar flow-spec +.It Ar flow-field : src-ip | proto | src-port | dst-ip | dst-port +.It Cm addr +matches IPv4 or IPv6 address. Each entry is represented by an .Ar addr Ns Op / Ns Ar masklen and will match all addresses with base .Ar addr -(specified as an IPv4/IPv6 address, a hostname or an unsigned integer) -and mask width of +(specified as an IPv4/IPv6 address, or a hostname) and mask width of .Ar masklen bits. If @@ -1833,29 +1906,140 @@ If is not specified, it defaults to 32 for IPv4 and 128 for IPv6. When looking up an IP address in a table, the most specific entry will match. -Associated with each entry is a 32-bit unsigned -.Ar value , -which can optionally be checked by a rule matching code. -When adding an entry, if -.Ar value -is not specified, it defaults to 0. +.It Cm iface +matches interface names. +Each entry is represented by string treated as interface name. +Wildcards are not supported. +.It Cm number +maches protocol ports, uids/gids or jail IDs. +Each entry is represented by 32-bit unsigned integer. +Ranges are not supported. +.It Cm flow +Matches packet fields specified by +.Ar flow +type suboptions with table entries. +.El +.Pp +Tables require explicit creation via +.Cm create +before use. .Pp -An entry can be added to a table -.Pq Cm add , -or removed from a table -.Pq Cm delete . -A table can be examined -.Pq Cm list -or flushed -.Pq Cm flush . +The following creation options are supported: +.Bl -tag -width indent +.It Ar create-options : Ar create-option | create-options +.It Ar create-option : Cm type Ar table-type | Cm valtype Ar value-mask | Cm algo Ar algo-desc | +.Cm limit Ar number | Cm locked +.It Cm type +Table key type. +.It Cm valtype +Table value mask. +.It Cm algo +Table algorithm to use (see below). +.It Cm limit +Maximum number of items that may be inserted into table. +.It Cm locked +Restrict any table modifications. +.El .Pp -Internally, each table is stored in a Radix tree, the same way as -the routing table (see -.Xr route 4 ) . +Some of these options may be modified later via +.Cm modify +keyword. +The following options can be changed: +.Bl -tag -width indent +.It Ar modify-options : Ar modify-option | modify-options +.It Ar modify-option : Cm limit Ar number +.It Cm limit +Alter maximum number of items that may be inserted into table. +.El .Pp -Lookup tables currently support only ports, jail IDs, IPv4/IPv6 addresses -and interface names. -Wildcards is not supported for interface names. +Additionally, table can be locked or unlocked using +.Cm lock +or +.Cm unlock +commands. +.Pp +Tables of the same +.Ar type +can be swapped with each other using +.Cm swap Ar name +command. +Swap may fail if tables limits are set and data exchange +would result in limits hit. +Operation is performed atomically. +.Pp +One or more entries can be added to a table at once using +.Cm add +command. +Addition of all items are performed atomically. +By default, error in addition of one entry does not influence +addition of other entries. However, non-zero error code is returned +in that case. +Special +.Cm atomic +keyword may be specified before +.Cm add +to indicate all-or-none add request. +.Pp +One or more entries can be removed from a table at once using +.Cm delete +command. +By default, error in removal of one entry does not influence +removing of other entries. However, non-zero error code is returned +in that case. +.Pp +It may be possible to check what entry will be found on particular +.Ar table-key +using +.Cm lookup +.Ae table-key +command. +This functionality is optional and may be unsupported in some algorithms. +.Pp +The following operations can be performed on +.Ar one +or +.Cm all +tables: +.Bl -tag -width indent +.It Cm list +List all entries. +.It Cm flush +Removes all entries. +.It Cm info +Shows generic table information. +.It Cm detail +Shows generic table information and algo-specific data. +.El +.Pp +The following lookup algorithms are supported: +.Bl -tag -width indent +.It Ar algo-desc : algo-name | "algo-name algo-data" +.It Ar algo-name: Ar addr:radix | addr:hash | iface:arrray | number:array | flow:hash +.It Cm addr:radix +Separate Radix trees for IPv4 and IPv6, the same way as the routing table (see +.Xr route 4 ) . +Default choice for +.Ar addr +type. +.It Cm addr:hash +Separate auto-growing hashes for IPv4 and IPv6. +Accepts entries with the same mask length specified initially via +.Cm "addr:hash masks=/v4,/v6" +algorithm creation options. +Assume /32 and /128 masks by default. +Search removes host bits (according to mask) from supplied address and checks +resulting key in appropriate hash. +Mostly optimized for /64 and byte-ranged IPv6 masks. +.It Cm iface:arrray +Array storing sorted indexes for entries which are presented in the system. +Optimized for very fast lookup. +.It Cm number:array +Array storing sorted u32 numbers. +.It Cm flow:hash +Auto-growing hash storing flow entries. +Search calculates hash on required packet fields and searches for matching +entries in selected bucket. +.El .Pp The .Cm tablearg @@ -1864,6 +2048,39 @@ the argument for a rule action, action p This can significantly reduce number of rules in some configurations. If two tables are used in a rule, the result of the second (destination) is used. +.Pp +Each record may hold one or more values according to +.Ar value-mask . +This mask is set on table creation via +.Cm valtype +option. +The following value types are supported: +.Bl -tag -width indent +.It Ar value-mask : Ar value-type Ns Op , Ns Ar value-mask +.It Ar value-type : Ar skipto | pipe | fib | nat | dscp | tag | divert | +.Ar netgraph | limit | ipv4 +.It Cm skipto +rule number to jump to. +.It Cm pipe +Pipe number to use. +.It Cm fib +fib number to match/set. +.It Cm nat +nat number to jump to. +.It Cm dscp +dscp value to match/set. +.It Cm tag +tag number to match/set. +.It Cm divert +port number to divert traffic to. +.It Cm netgraph +hook number to move packet to. +.It Cm limit +maximum number of connections. +.It Cm ipv4 +IPv4 nexthop to fwd packets to. +.El +.Pp The .Cm tablearg argument can be used with the following actions: @@ -1873,32 +2090,34 @@ action parameters: rule options: .Cm limit, tagged. .Pp -When used with -.Cm fwd -it is possible to supply table entries with values -that are in the form of IP addresses or hostnames. -See the -.Sx EXAMPLES -Section for example usage of tables and the tablearg keyword. -.Pp When used with the .Cm skipto action, the user should be aware that the code will walk the ruleset -up to a rule equal to, or past, the given number, -and should therefore try keep the -ruleset compact between the skipto and the target rules. +up to a rule equal to, or past, the given number. +.Pp +See the +.Sx EXAMPLES +Section for example usage of tables and the tablearg keyword. .Sh SETS OF RULES -Each rule belongs to one of 32 different +Each rule or table belongs to one of 32 different .Em sets , numbered 0 to 31. Set 31 is reserved for the default rule. .Pp -By default, rules are put in set 0, unless you use the +By default, rules or tables are put in set 0, unless you use the .Cm set N -attribute when entering a new rule. +attribute when adding a new rule or table. Sets can be individually and atomically enabled or disabled, so this mechanism permits an easy way to store multiple configurations of the firewall and quickly (and atomically) switch between them. +.Pp +By default, tables from set 0 are referenced when adding rule with +table opcodes regardless of rule set. +This behavior can be changed by setting +.Va net.inet.ip.fw.tables_set +variable to 1. +Rule's set will then be used for table references. +.Pp The command to enable/disable sets is .Bd -ragged -offset indent .Nm @@ -2968,6 +3187,22 @@ Controls whether bridged packets are pas .Nm . Default is no. .El +.Sh INTERNAL DIAGNOSTICS +There are some commands that may be useful to understand current state +of certain subsystems inside kernel module. +These commands provide debugging output which may change without notice. +.Pp +Currently the following commands are available as +.Cm internal +sub-options: +.Bl -tag -width indent +.It Cm iflist +Lists all interface which are currently tracked by +.Nm +with their in-kernel status. +.It Cm talist +List all table lookup algorithms currently available. +.El .Sh EXAMPLES There are far too many possible uses of .Nm @@ -3220,30 +3455,43 @@ Then we classify traffic using a single .Dl "ipfw pipe 1 config bw 1000Kbyte/s" .Dl "ipfw pipe 4 config bw 4000Kbyte/s" .Dl "..." -.Dl "ipfw table 1 add 192.168.2.0/24 1" -.Dl "ipfw table 1 add 192.168.0.0/27 4" -.Dl "ipfw table 1 add 192.168.0.2 1" +.Dl "ipfw table T1 create type addr" +.Dl "ipfw table T1 add 192.168.2.0/24 1" +.Dl "ipfw table T1 add 192.168.0.0/27 4" +.Dl "ipfw table T1 add 192.168.0.2 1" .Dl "..." -.Dl "ipfw add pipe tablearg ip from table(1) to any" +.Dl "ipfw add pipe tablearg ip from 'table(T1)' to any" .Pp Using the .Cm fwd action, the table entries may include hostnames and IP addresses. .Pp -.Dl "ipfw table 1 add 192.168.2.0/24 10.23.2.1" -.Dl "ipfw table 1 add 192.168.0.0/27 router1.dmz" +.Dl "ipfw table T2 create type addr ftype ip" +.Dl "ipfw table T2 add 192.168.2.0/24 10.23.2.1" +.Dl "ipfw table T21 add 192.168.0.0/27 router1.dmz" .Dl "..." .Dl "ipfw add 100 fwd tablearg ip from any to table(1)" .Pp In the following example per-interface firewall is created: .Pp -.Dl "ipfw table 10 add vlan20 12000" -.Dl "ipfw table 10 add vlan30 13000" -.Dl "ipfw table 20 add vlan20 22000" -.Dl "ipfw table 20 add vlan30 23000" +.Dl "ipfw table IN create type iface valtype skipto,fib" +.Dl "ipfw table IN add vlan20 12000,12" +.Dl "ipfw table IN add vlan30 13000,13" +.Dl "ipfw table OUT create type iface valtype skipto" +.Dl "ipfw table OUT add vlan20 22000" +.Dl "ipfw table OUT add vlan30 23000" +.Dl ".." +.Dl "ipfw add 100 ipfw setfib tablearg ip from any to any recv 'table(IN)' in" +.Dl "ipfw add 200 ipfw skipto tablearg ip from any to any recv 'table(IN)' in" +.Dl "ipfw add 300 ipfw skipto tablearg ip from any to any xmit 'table(OUT)' out" +.Pp +The following example illustrate usage of flow tables: +.Pp +.Dl "ipfw table fl create type flow:flow:src-ip,proto,dst-ip,dst-port" +.Dl "ipfw table fl add 2a02:6b8:77::88,tcp,2a02:6b8:77::99,80 11" +.Dl "ipfw table fl add 10.0.0.1,udp,10.0.0.2,53 12" .Dl ".." -.Dl "ipfw add 100 ipfw skipto tablearg ip from any to any recv 'table(10)' in" -.Dl "ipfw add 200 ipfw skipto tablearg ip from any to any xmit 'table(10)' out" +.Dl "ipfw add 100 allow ip from any to any flow 'table(fl,11)' recv ix0" .Ss SETS OF RULES To add a set of rules atomically, e.g.\& set 18: .Pp Modified: head/sbin/ipfw/ipfw2.c ============================================================================== --- head/sbin/ipfw/ipfw2.c Thu Oct 9 19:13:33 2014 (r272839) +++ head/sbin/ipfw/ipfw2.c Thu Oct 9 19:32:35 2014 (r272840) @@ -66,22 +66,13 @@ struct format_opts { uint32_t first; /* first rule to request */ uint32_t last; /* last rule to request */ uint32_t dcnt; /* number of dynamic states */ + ipfw_obj_ctlv *tstate; /* table state data */ }; -#define IP_FW_TARG IP_FW_TABLEARG -#define ip_fw_bcounter ip_fw -#define ip_fw_rule ip_fw -struct tidx; int resvd_set_number = RESVD_SET; int ipfw_socket = -1; -uint32_t ipfw_tables_max = 0; /* Number of tables supported by kernel */ - -#ifndef s6_addr32 -#define s6_addr32 __u6_addr.__u6_addr32 -#endif - #define CHECK_LENGTH(v, len) do { \ if ((v) < (len)) \ errx(EX_DATAERR, "Rule too long"); \ @@ -362,6 +353,7 @@ static struct _s_x rule_options[] = { { "src-ipv6", TOK_SRCIP6}, { "src-ip6", TOK_SRCIP6}, { "lookup", TOK_LOOKUP}, + { "flow", TOK_FLOW}, { "//", TOK_COMMENT }, { "not", TOK_NOT }, /* pseudo option */ @@ -376,6 +368,11 @@ static struct _s_x rule_options[] = { }; void bprint_uint_arg(struct buf_pr *bp, const char *str, uint32_t arg); +static int ipfw_get_config(struct cmdline_opts *co, struct format_opts *fo, + ipfw_cfg_lheader **pcfg, size_t *psize); +static int ipfw_show_config(struct cmdline_opts *co, struct format_opts *fo, + ipfw_cfg_lheader *cfg, size_t sz, int ac, char **av); +static void ipfw_list_tifaces(void); /* * Simple string buffer API. @@ -514,6 +511,26 @@ safe_realloc(void *ptr, size_t size) } /* + * Compare things like interface or table names. + */ +int +stringnum_cmp(const char *a, const char *b) +{ + int la, lb; + + la = strlen(a); + lb = strlen(b); + + if (la > lb) + return (1); + else if (la < lb) + return (-01); + + return (strcmp(a, b)); +} + + +/* * conditionally runs the command. * Selected options or negative -> getsockopt */ @@ -546,20 +563,18 @@ do_cmd(int optname, void *optval, uintpt } /* - * do_setcmd3 - pass ipfw control cmd to kernel + * do_set3 - pass ipfw control cmd to kernel * @optname: option name * @optval: pointer to option data * @optlen: option length * - * Function encapsulates option value in IP_FW3 socket option - * and calls setsockopt(). - * Function returns 0 on success or -1 otherwise. + * Assumes op3 header is already embedded. + * Calls setsockopt() with IP_FW3 as kernel-visible opcode. + * Returns 0 on success or errno otherwise. */ -static int -do_setcmd3(int optname, void *optval, socklen_t optlen) +int +do_set3(int optname, ip_fw3_opheader *op3, uintptr_t optlen) { - socklen_t len; - ip_fw3_opheader *op3; if (co.test_only) return (0); @@ -569,14 +584,40 @@ do_setcmd3(int optname, void *optval, so if (ipfw_socket < 0) err(EX_UNAVAILABLE, "socket"); - len = sizeof(ip_fw3_opheader) + optlen; - op3 = alloca(len); - /* Zero reserved fields */ - memset(op3, 0, sizeof(ip_fw3_opheader)); - memcpy(op3 + 1, optval, optlen); op3->opcode = optname; - return setsockopt(ipfw_socket, IPPROTO_IP, IP_FW3, op3, len); + return (setsockopt(ipfw_socket, IPPROTO_IP, IP_FW3, op3, optlen)); +} + +/* + * do_get3 - pass ipfw control cmd to kernel + * @optname: option name + * @optval: pointer to option data + * @optlen: pointer to option length + * + * Assumes op3 header is already embedded. + * Calls getsockopt() with IP_FW3 as kernel-visible opcode. + * Returns 0 on success or errno otherwise. + */ +int +do_get3(int optname, ip_fw3_opheader *op3, size_t *optlen) +{ + int error; + + if (co.test_only) + return (0); + + if (ipfw_socket == -1) + ipfw_socket = socket(AF_INET, SOCK_RAW, IPPROTO_RAW); + if (ipfw_socket < 0) + err(EX_UNAVAILABLE, "socket"); + + op3->opcode = optname; + + error = getsockopt(ipfw_socket, IPPROTO_IP, IP_FW3, op3, + (socklen_t *)optlen); + + return (error); } /** @@ -596,6 +637,37 @@ match_token(struct _s_x *table, char *st } /** + * match_token takes a table and a string, returns the value associated + * with the string for the best match. + * + * Returns: + * value from @table for matched records + * -1 for non-matched records + * -2 if more than one records match @string. + */ +int +match_token_relaxed(struct _s_x *table, char *string) +{ + struct _s_x *pt, *m; + int i, c; + + i = strlen(string); + c = 0; + + for (pt = table ; i != 0 && pt->s != NULL ; pt++) { + if (strncmp(pt->s, string, i) != 0) + continue; + m = pt; + c++; + } + + if (c == 1) + return (m->x); + + return (c > 0 ? -2: -1); +} + +/** * match_value takes a table and a value, returns the string associated * with the value (NULL in case of failure). */ @@ -608,16 +680,36 @@ match_value(struct _s_x *p, int value) return NULL; } +size_t +concat_tokens(char *buf, size_t bufsize, struct _s_x *table, char *delimiter) +{ + struct _s_x *pt; + int l; + size_t sz; + + for (sz = 0, pt = table ; pt->s != NULL; pt++) { + l = snprintf(buf + sz, bufsize - sz, "%s%s", + (sz == 0) ? "" : delimiter, pt->s); + sz += l; + bufsize += l; + if (sz > bufsize) + return (bufsize); + } + + return (sz); +} + /* * helper function to process a set of flags and set bits in the * appropriate masks. */ -void -fill_flags(struct _s_x *flags, char *p, uint8_t *set, uint8_t *clear) +int +fill_flags(struct _s_x *flags, char *p, char **e, uint32_t *set, + uint32_t *clear) { char *q; /* points to the separator */ int val; - uint8_t *which; /* mask we are working on */ + uint32_t *which; /* mask we are working on */ while (p && *p) { if (*p == '!') { @@ -629,11 +721,35 @@ fill_flags(struct _s_x *flags, char *p, if (q) *q++ = '\0'; val = match_token(flags, p); - if (val <= 0) - errx(EX_DATAERR, "invalid flag %s", p); - *which |= (uint8_t)val; + if (val <= 0) { + if (e != NULL) + *e = p; + return (-1); + } + *which |= (uint32_t)val; p = q; } + return (0); +} + +void +print_flags_buffer(char *buf, size_t sz, struct _s_x *list, uint32_t set) +{ + char const *comma = ""; + int i, l; + + for (i = 0; list[i].x != 0; i++) { + if ((set & list[i].x) == 0) + continue; + + set &= ~list[i].x; + l = snprintf(buf, sz, "%s%s", comma, list[i].s); + if (l >= sz) + return; + comma = ","; + buf += l; + sz -=l; + } } /* @@ -1018,6 +1134,7 @@ print_flags(struct buf_pr *bp, char cons } } + /* * Print the ip address contained in a command. */ @@ -1029,6 +1146,7 @@ print_ip(struct buf_pr *bp, struct forma struct in_addr *ia; uint32_t len = F_LEN((ipfw_insn *)cmd); uint32_t *a = ((ipfw_insn_u32 *)cmd)->d; + char *t; if (cmd->o.opcode == O_IP_DST_LOOKUP && len > F_INSN_SIZE(ipfw_insn_u32)) { uint32_t d = a[1]; @@ -1036,8 +1154,9 @@ print_ip(struct buf_pr *bp, struct forma if (d < sizeof(lookup_key)/sizeof(lookup_key[0])) arg = match_value(rule_options, lookup_key[d]); - bprintf(bp, "%s lookup %s %d", cmd->o.len & F_NOT ? " not": "", - arg, cmd->o.arg1); + t = table_search_ctlv(fo->tstate, ((ipfw_insn *)cmd)->arg1); + bprintf(bp, "%s lookup %s %s", cmd->o.len & F_NOT ? " not": "", + arg, t); return; } bprintf(bp, "%s%s ", cmd->o.len & F_NOT ? " not": "", s); @@ -1048,7 +1167,8 @@ print_ip(struct buf_pr *bp, struct forma } if (cmd->o.opcode == O_IP_SRC_LOOKUP || cmd->o.opcode == O_IP_DST_LOOKUP) { - bprintf(bp, "table(%u", ((ipfw_insn *)cmd)->arg1); + t = table_search_ctlv(fo->tstate, ((ipfw_insn *)cmd)->arg1); + bprintf(bp, "table(%s", t); if (len == F_INSN_SIZE(ipfw_insn_u32)) bprintf(bp, ",%u", *a); bprintf(bp, ")"); @@ -1262,12 +1382,9 @@ show_static_rule(struct cmdline_opts *co ipfw_insn_log *logptr = NULL; /* set if we find an O_LOG */ ipfw_insn_altq *altqptr = NULL; /* set if we find an O_ALTQ */ int or_block = 0; /* we are in an or block */ - uint32_t set_disable; uint32_t uval; - bcopy(&rule->next_rule, &set_disable, sizeof(set_disable)); - - if (set_disable & (1 << rule->set)) { + if ((fo->set_mask & (1 << rule->set)) == 0) { /* disabled mask */ if (!co->show_sets) return; @@ -1504,7 +1621,7 @@ show_static_rule(struct cmdline_opts *co break; } } - if (rule->_pad & 1) { /* empty rules before options */ + if (rule->flags & IPFW_RULE_NOOPT) { /* empty rules before options */ if (!co->do_compact) { show_prerequisites(bp, &flags, HAVE_PROTO, 0); bprintf(bp, " from any to any"); @@ -1707,7 +1824,7 @@ show_static_rule(struct cmdline_opts *co case O_RECV: case O_VIA: { - char const *s; + char const *s, *t; ipfw_insn_if *cmdif = (ipfw_insn_if *)cmd; if (cmd->opcode == O_XMIT) @@ -1719,10 +1836,26 @@ show_static_rule(struct cmdline_opts *co if (cmdif->name[0] == '\0') bprintf(bp, " %s %s", s, inet_ntoa(cmdif->p.ip)); - else if (cmdif->name[0] == '\1') /* interface table */ - bprintf(bp, " %s table(%d)", s, cmdif->p.glob); - else + else if (cmdif->name[0] == '\1') { + /* interface table */ + t = table_search_ctlv(fo->tstate, + cmdif->p.kidx); + bprintf(bp, " %s table(%s)", s, t); + } else bprintf(bp, " %s %s", s, cmdif->name); + + break; + } + case O_IP_FLOW_LOOKUP: + { + char *t; + + t = table_search_ctlv(fo->tstate, cmd->arg1); + bprintf(bp, " flow table(%s", t); + if (F_LEN(cmd) == F_INSN_SIZE(ipfw_insn_u32)) + bprintf(bp, ",%u", + ((ipfw_insn_u32 *)cmd)->d[0]); + bprintf(bp, ")"); break; } case O_IPID: @@ -1974,6 +2107,19 @@ show_dyn_state(struct cmdline_opts *co, bprintf(bp, " UNKNOWN <-> UNKNOWN\n"); } +static int +do_range_cmd(int cmd, ipfw_range_tlv *rt) +{ + ipfw_range_header rh; + + memset(&rh, 0, sizeof(rh)); + memcpy(&rh.range, rt, sizeof(*rt)); + rh.range.head.length = sizeof(*rt); + rh.range.head.type = IPFW_TLV_RANGE; + + return (do_set3(cmd, &rh.opheader, sizeof(rh))); +} + /* * This one handles all set-related commands * ipfw set { show | enable | disable } @@ -1984,77 +2130,75 @@ show_dyn_state(struct cmdline_opts *co, void ipfw_sets_handler(char *av[]) { - uint32_t set_disable, masks[2]; - int i, nbytes; - uint16_t rulenum; - uint8_t cmd, new_set; + uint32_t masks[2]; + int i; + uint8_t cmd, new_set, rulenum; + ipfw_range_tlv rt; + char *msg; + size_t size; av++; + memset(&rt, 0, sizeof(rt)); if (av[0] == NULL) errx(EX_USAGE, "set needs command"); if (_substrcmp(*av, "show") == 0) { - void *data = NULL; - char const *msg; - int nalloc; - - nalloc = nbytes = sizeof(struct ip_fw); - while (nbytes >= nalloc) { - if (data) - free(data); - nalloc = nalloc * 2 + 200; - nbytes = nalloc; - data = safe_calloc(1, nbytes); - if (do_cmd(IP_FW_GET, data, (uintptr_t)&nbytes) < 0) - err(EX_OSERR, "getsockopt(IP_FW_GET)"); - } + struct format_opts fo; + ipfw_cfg_lheader *cfg; - bcopy(&((struct ip_fw *)data)->next_rule, - &set_disable, sizeof(set_disable)); + memset(&fo, 0, sizeof(fo)); + if (ipfw_get_config(&co, &fo, &cfg, &size) != 0) + err(EX_OSERR, "requesting config failed"); - for (i = 0, msg = "disable" ; i < RESVD_SET; i++) - if ((set_disable & (1<set_mask & (1<set_mask != (uint32_t)-1) ? " enable" : "enable"; for (i = 0; i < RESVD_SET; i++) - if (!(set_disable & (1<set_mask & (1< RESVD_SET) + rt.set = atoi(av[0]); + rt.new_set = atoi(av[1]); + if (!isdigit(*(av[0])) || rt.set > RESVD_SET) *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***