Date: Sun, 14 Apr 2019 12:05:08 +0000 (UTC) From: "Andrey V. Elsukov" <ae@FreeBSD.org> To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r346205 - in stable/11: sbin/ipfw sys/netinet sys/netpfil/ipfw sys/netpfil/ipfw/nat64 sys/netpfil/ipfw/nptv6 Message-ID: <201904141205.x3EC58O3036897@repo.freebsd.org>
next in thread | raw e-mail | index | archive | help
Author: ae Date: Sun Apr 14 12:05:08 2019 New Revision: 346205 URL: https://svnweb.freebsd.org/changeset/base/346205 Log: MFC r341471: Reimplement how net.inet.ip.fw.dyn_keep_states works. Turning on of this feature allows to keep dynamic states when parent rule is deleted. But it works only when the default rule is "allow from any to any". Now when rule with dynamic opcode is going to be deleted, and net.inet.ip.fw.dyn_keep_states is enabled, existing states will reference named objects corresponding to this rule, and also reference the rule. And when ipfw_dyn_lookup_state() will find state for deleted parent rule, it will return the pointer to the deleted rule, that is still valid. This implementation doesn't support O_LIMIT_PARENT rules. The refcnt field was added to struct ip_fw to keep reference, also next pointer added to be able iterate rules and not damage the content when deleted rules are chained. Named objects are referenced only when states are going to be deleted to be able reuse kidx of named objects when new parent rules will be installed. ipfw_dyn_get_count() function was modified and now it also looks into dynamic states and constructs maps of existing named objects. This is needed to correctly export orphaned states into userland. ipfw_free_rule() was changed to be global, since now dynamic state can free rule, when it is expired and references counters becomes 1. External actions subsystem also modified, since external actions can be deregisterd and instances can be destroyed. In these cases deleted rules, that are referenced by orphaned states, must be modified to prevent access to freed memory. ipfw_dyn_reset_eaction(), ipfw_reset_eaction_instance() functions added for these purposes. Obtained from: Yandex LLC Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D17532 MFC r341472: Add ability to request listing and deleting only for dynamic states. This can be useful, when net.inet.ip.fw.dyn_keep_states is enabled, but after rules reloading some state must be deleted. Added new flag '-D' for such purpose. Retire '-e' flag, since there can not be expired states in the meaning that this flag historically had. Also add "verbose" mode for listing of dynamic states, it can be enabled with '-v' flag and adds additional information to states list. This can be useful for debugging. Obtained from: Yandex LLC Sponsored by: Yandex LLC MFC r344018: Remove `set' field from state structure and use set from parent rule. Initially it was introduced because parent rule pointer could be freed, and rule's information could become inaccessible. In r341471 this was changed. And now we don't need this information, and also it can become stale. E.g. rule can be moved from one set to another. This can lead to parent's set and state's set will not match. In this case it is possible that static rule will be freed, but dynamic state will not. This can happen when `ipfw delete set N` command is used to delete rules, that were moved to another set. To fix the problem we will use the set number from parent rule. Obtained from: Yandex LLC Sponsored by: Yandex LLC MFC r344870: Fix the problem with O_LIMIT states introduced in r344018. dyn_install_state() uses `rule` pointer when it creates state. For O_LIMIT states this pointer actually is not struct ip_fw, it is pointer to O_LIMIT_PARENT state, that keeps actual pointer to ip_fw parent rule. Thus we need to cache rule id and number before calling dyn_get_parent_state(), so we can use them later when the `rule` pointer is overrided. PR: 236292 Modified: stable/11/sbin/ipfw/ipfw.8 stable/11/sbin/ipfw/ipfw2.c stable/11/sbin/ipfw/ipfw2.h stable/11/sbin/ipfw/main.c stable/11/sys/netinet/ip_fw.h stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c stable/11/sys/netpfil/ipfw/ip_fw_eaction.c stable/11/sys/netpfil/ipfw/ip_fw_private.h stable/11/sys/netpfil/ipfw/ip_fw_sockopt.c stable/11/sys/netpfil/ipfw/nat64/nat64lsn_control.c stable/11/sys/netpfil/ipfw/nat64/nat64stl_control.c stable/11/sys/netpfil/ipfw/nptv6/nptv6.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sbin/ipfw/ipfw.8 ============================================================================== --- stable/11/sbin/ipfw/ipfw.8 Sun Apr 14 11:52:00 2019 (r346204) +++ stable/11/sbin/ipfw/ipfw.8 Sun Apr 14 12:05:08 2019 (r346205) @@ -1,7 +1,7 @@ .\" .\" $FreeBSD$ .\" -.Dd September 27, 2018 +.Dd December 4, 2018 .Dt IPFW 8 .Os .Sh NAME @@ -310,10 +310,9 @@ i.e., omitting the "ip from any to any" string when this does not carry any additional information. .It Fl d When listing, show dynamic rules in addition to static ones. -.It Fl e -When listing and -.Fl d -is specified, also show expired dynamic rules. +.It Fl D +When listing, show only dynamic states. +When deleting, delete only dynamic states. .It Fl f Run without prompting for confirmation for commands that can cause problems if misused, i.e., Modified: stable/11/sbin/ipfw/ipfw2.c ============================================================================== --- stable/11/sbin/ipfw/ipfw2.c Sun Apr 14 11:52:00 2019 (r346204) +++ stable/11/sbin/ipfw/ipfw2.c Sun Apr 14 12:05:08 2019 (r346205) @@ -2245,10 +2245,9 @@ show_dyn_state(struct cmdline_opts *co, struct format_ uint16_t rulenum; char buf[INET6_ADDRSTRLEN]; - if (!co->do_expired) { - if (!d->expire && !(d->dyn_type == O_LIMIT_PARENT)) - return; - } + if (d->expire == 0 && d->dyn_type != O_LIMIT_PARENT) + return; + bcopy(&d->rule, &rulenum, sizeof(rulenum)); bprintf(bp, "%05d", rulenum); if (fo->pcwidth > 0 || fo->bcwidth > 0) { @@ -2290,6 +2289,33 @@ show_dyn_state(struct cmdline_opts *co, struct format_ if (d->kidx != 0) bprintf(bp, " :%s", object_search_ctlv(fo->tstate, d->kidx, IPFW_TLV_STATE_NAME)); + +#define BOTH_SYN (TH_SYN | (TH_SYN << 8)) +#define BOTH_FIN (TH_FIN | (TH_FIN << 8)) + if (co->verbose) { + bprintf(bp, " state 0x%08x%s", d->state, + d->state ? " ": ","); + if (d->state & IPFW_DYN_ORPHANED) + bprintf(bp, "ORPHANED,"); + if ((d->state & BOTH_SYN) == BOTH_SYN) + bprintf(bp, "BOTH_SYN,"); + else { + if (d->state & TH_SYN) + bprintf(bp, "F_SYN,"); + if (d->state & (TH_SYN << 8)) + bprintf(bp, "R_SYN,"); + } + if ((d->state & BOTH_FIN) == BOTH_FIN) + bprintf(bp, "BOTH_FIN,"); + else { + if (d->state & TH_FIN) + bprintf(bp, "F_FIN,"); + if (d->state & (TH_FIN << 8)) + bprintf(bp, "R_FIN,"); + } + bprintf(bp, " f_ack 0x%x, r_ack 0x%x", d->ack_fwd, + d->ack_rev); + } } static int @@ -2693,7 +2719,8 @@ ipfw_list(int ac, char *av[], int show_counters) cfg = NULL; sfo.show_counters = show_counters; sfo.show_time = co.do_time; - sfo.flags = IPFW_CFG_GET_STATIC; + if (co.do_dynamic != 2) + sfo.flags |= IPFW_CFG_GET_STATIC; if (co.do_dynamic != 0) sfo.flags |= IPFW_CFG_GET_STATES; if ((sfo.show_counters | sfo.show_time) != 0) @@ -2738,17 +2765,15 @@ ipfw_show_config(struct cmdline_opts *co, struct forma fo->set_mask = cfg->set_mask; ctlv = (ipfw_obj_ctlv *)(cfg + 1); + if (ctlv->head.type == IPFW_TLV_TBLNAME_LIST) { + object_sort_ctlv(ctlv); + fo->tstate = ctlv; + readsz += ctlv->head.length; + ctlv = (ipfw_obj_ctlv *)((caddr_t)ctlv + ctlv->head.length); + } if (cfg->flags & IPFW_CFG_GET_STATIC) { /* We've requested static rules */ - if (ctlv->head.type == IPFW_TLV_TBLNAME_LIST) { - object_sort_ctlv(ctlv); - fo->tstate = ctlv; - readsz += ctlv->head.length; - ctlv = (ipfw_obj_ctlv *)((caddr_t)ctlv + - ctlv->head.length); - } - if (ctlv->head.type == IPFW_TLV_RULE_LIST) { rbase = (ipfw_obj_tlv *)(ctlv + 1); rcnt = ctlv->count; @@ -2775,10 +2800,12 @@ ipfw_show_config(struct cmdline_opts *co, struct forma if (ac == 0) { fo->first = 0; fo->last = IPFW_DEFAULT_RULE; - list_static_range(co, fo, &bp, rbase, rcnt); + if (cfg->flags & IPFW_CFG_GET_STATIC) + list_static_range(co, fo, &bp, rbase, rcnt); if (co->do_dynamic && dynsz > 0) { - printf("## Dynamic rules (%d %zu):\n", fo->dcnt, dynsz); + printf("## Dynamic rules (%d %zu):\n", fo->dcnt, + dynsz); list_dyn_range(co, fo, &bp, dynbase, dynsz); } @@ -2798,6 +2825,9 @@ ipfw_show_config(struct cmdline_opts *co, struct forma continue; } + if ((cfg->flags & IPFW_CFG_GET_STATIC) == 0) + continue; + if (list_static_range(co, fo, &bp, rbase, rcnt) == 0) { /* give precedence to other error(s) */ if (exitval == EX_OK) @@ -3311,6 +3341,8 @@ ipfw_delete(char *av[]) rt.flags |= IPFW_RCFLAG_SET; } } + if (co.do_dynamic == 2) + rt.flags |= IPFW_RCFLAG_DYNAMIC; i = do_range_cmd(IP_FW_XDEL, &rt); if (i != 0) { exitval = EX_UNAVAILABLE; @@ -3318,7 +3350,8 @@ ipfw_delete(char *av[]) continue; warn("rule %u: setsockopt(IP_FW_XDEL)", rt.start_rule); - } else if (rt.new_set == 0 && do_set == 0) { + } else if (rt.new_set == 0 && do_set == 0 && + co.do_dynamic != 2) { exitval = EX_UNAVAILABLE; if (co.do_quiet) continue; Modified: stable/11/sbin/ipfw/ipfw2.h ============================================================================== --- stable/11/sbin/ipfw/ipfw2.h Sun Apr 14 11:52:00 2019 (r346204) +++ stable/11/sbin/ipfw/ipfw2.h Sun Apr 14 12:05:08 2019 (r346205) @@ -37,8 +37,6 @@ struct cmdline_opts { int do_quiet; /* Be quiet in add and flush */ int do_pipe; /* this cmd refers to a pipe/queue/sched */ int do_nat; /* this cmd refers to a nat config */ - int do_dynamic; /* display dynamic rules */ - int do_expired; /* display expired dynamic rules */ int do_compact; /* show rules in compact mode */ int do_force; /* do not ask for confirmation */ int show_sets; /* display the set each rule belongs to */ @@ -48,6 +46,8 @@ struct cmdline_opts { /* The options below can have multiple values. */ + int do_dynamic; /* 1 - display dynamic rules */ + /* 2 - display/delete only dynamic rules */ int do_sort; /* field to sort results (0 = no) */ /* valid fields are 1 and above */ Modified: stable/11/sbin/ipfw/main.c ============================================================================== --- stable/11/sbin/ipfw/main.c Sun Apr 14 11:52:00 2019 (r346204) +++ stable/11/sbin/ipfw/main.c Sun Apr 14 12:05:08 2019 (r346205) @@ -262,7 +262,7 @@ ipfw_main(int oldac, char **oldav) save_av = av; optind = optreset = 1; /* restart getopt() */ - while ((ch = getopt(ac, av, "abcdefhinNp:qs:STtv")) != -1) + while ((ch = getopt(ac, av, "abcdDefhinNp:qs:STtv")) != -1) switch (ch) { case 'a': do_acct = 1; @@ -281,8 +281,12 @@ ipfw_main(int oldac, char **oldav) co.do_dynamic = 1; break; + case 'D': + co.do_dynamic = 2; + break; + case 'e': - co.do_expired = 1; + /* nop for compatibility */ break; case 'f': Modified: stable/11/sys/netinet/ip_fw.h ============================================================================== --- stable/11/sys/netinet/ip_fw.h Sun Apr 14 11:52:00 2019 (r346204) +++ stable/11/sys/netinet/ip_fw.h Sun Apr 14 12:05:08 2019 (r346205) @@ -706,6 +706,7 @@ struct _ipfw_dyn_rule { u_int32_t state; /* state of this rule (typically a * combination of TCP flags) */ +#define IPFW_DYN_ORPHANED 0x40000 /* state's parent rule was deleted */ u_int32_t ack_fwd; /* most recent ACKs in forward */ u_int32_t ack_rev; /* and reverse directions (used */ /* to generate keepalives) */ @@ -936,9 +937,10 @@ typedef struct _ipfw_range_tlv { #define IPFW_RCFLAG_RANGE 0x01 /* rule range is set */ #define IPFW_RCFLAG_ALL 0x02 /* match ALL rules */ #define IPFW_RCFLAG_SET 0x04 /* match rules in given set */ +#define IPFW_RCFLAG_DYNAMIC 0x08 /* match only dynamic states */ /* User-settable flags */ #define IPFW_RCFLAG_USER (IPFW_RCFLAG_RANGE | IPFW_RCFLAG_ALL | \ - IPFW_RCFLAG_SET) + IPFW_RCFLAG_SET | IPFW_RCFLAG_DYNAMIC) /* Internally used flags */ #define IPFW_RCFLAG_DEFAULT 0x0100 /* Do not skip defaul rule */ Modified: stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c ============================================================================== --- stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c Sun Apr 14 11:52:00 2019 (r346204) +++ stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c Sun Apr 14 12:05:08 2019 (r346205) @@ -121,6 +121,12 @@ __FBSDID("$FreeBSD$"); (d)->bcnt_ ## dir += pktlen; \ } while (0) +#define DYN_REFERENCED 0x01 +/* + * DYN_REFERENCED flag is used to show that state keeps reference to named + * object, and this reference should be released when state becomes expired. + */ + struct dyn_data { void *parent; /* pointer to parent rule */ uint32_t chain_id; /* cached ruleset id */ @@ -129,7 +135,7 @@ struct dyn_data { uint32_t hashval; /* hash value used for hash resize */ uint16_t fibnum; /* fib used to send keepalives */ uint8_t _pad[3]; - uint8_t set; /* parent rule set number */ + uint8_t flags; /* internal flags */ uint16_t rulenum; /* parent rule number */ uint32_t ruleid; /* parent rule id */ @@ -155,8 +161,7 @@ struct dyn_data { struct dyn_parent { void *parent; /* pointer to parent rule */ uint32_t count; /* number of linked states */ - uint8_t _pad; - uint8_t set; /* parent rule set number */ + uint8_t _pad[2]; uint16_t rulenum; /* parent rule number */ uint32_t ruleid; /* parent rule id */ uint32_t hashval; /* hash value used for hash resize */ @@ -499,7 +504,7 @@ static int dyn_lookup_ipv6_state_locked(const struct i uint32_t, const void *, int, uint32_t, uint16_t); static struct dyn_ipv6_state *dyn_alloc_ipv6_state( const struct ipfw_flow_id *, uint32_t, uint16_t, uint8_t); -static int dyn_add_ipv6_state(void *, uint32_t, uint16_t, uint8_t, +static int dyn_add_ipv6_state(void *, uint32_t, uint16_t, const struct ipfw_flow_id *, uint32_t, const void *, int, uint32_t, struct ipfw_dyn_info *, uint16_t, uint16_t, uint8_t); static void dyn_export_ipv6_state(const struct dyn_ipv6_state *, @@ -520,8 +525,7 @@ static struct dyn_ipv6_state *dyn_lookup_ipv6_parent_l const struct ipfw_flow_id *, uint32_t, const void *, uint32_t, uint16_t, uint32_t); static struct dyn_ipv6_state *dyn_add_ipv6_parent(void *, uint32_t, uint16_t, - uint8_t, const struct ipfw_flow_id *, uint32_t, uint32_t, uint32_t, - uint16_t); + const struct ipfw_flow_id *, uint32_t, uint32_t, uint32_t, uint16_t); #endif /* INET6 */ /* Functions to work with limit states */ @@ -532,17 +536,17 @@ static struct dyn_ipv4_state *dyn_lookup_ipv4_parent( static struct dyn_ipv4_state *dyn_lookup_ipv4_parent_locked( const struct ipfw_flow_id *, const void *, uint32_t, uint16_t, uint32_t); static struct dyn_parent *dyn_alloc_parent(void *, uint32_t, uint16_t, - uint8_t, uint32_t); + uint32_t); static struct dyn_ipv4_state *dyn_add_ipv4_parent(void *, uint32_t, uint16_t, - uint8_t, const struct ipfw_flow_id *, uint32_t, uint32_t, uint16_t); + const struct ipfw_flow_id *, uint32_t, uint32_t, uint16_t); static void dyn_tick(void *); static void dyn_expire_states(struct ip_fw_chain *, ipfw_range_tlv *); static void dyn_free_states(struct ip_fw_chain *); -static void dyn_export_parent(const struct dyn_parent *, uint16_t, +static void dyn_export_parent(const struct dyn_parent *, uint16_t, uint8_t, ipfw_dyn_rule *); static void dyn_export_data(const struct dyn_data *, uint16_t, uint8_t, - ipfw_dyn_rule *); + uint8_t, ipfw_dyn_rule *); static uint32_t dyn_update_tcp_state(struct dyn_data *, const struct ipfw_flow_id *, const struct tcphdr *, int); static void dyn_update_proto_state(struct dyn_data *, @@ -555,7 +559,7 @@ static int dyn_lookup_ipv4_state_locked(const struct i const void *, int, uint32_t, uint16_t); static struct dyn_ipv4_state *dyn_alloc_ipv4_state( const struct ipfw_flow_id *, uint16_t, uint8_t); -static int dyn_add_ipv4_state(void *, uint32_t, uint16_t, uint8_t, +static int dyn_add_ipv4_state(void *, uint32_t, uint16_t, const struct ipfw_flow_id *, const void *, int, uint32_t, struct ipfw_dyn_info *, uint16_t, uint16_t, uint8_t); static void dyn_export_ipv4_state(const struct dyn_ipv4_state *, @@ -1398,20 +1402,29 @@ ipfw_dyn_lookup_state(const struct ip_fw_args *args, c * should be deleted by dyn_expire_states(). * * In case when dyn_keep_states is enabled, return - * pointer to default rule and corresponding f_pos - * value. - * XXX: In this case we lose the cache efficiency, - * since f_pos is not cached, because it seems - * there is no easy way to atomically switch - * all fields related to parent rule of given - * state. + * pointer to deleted rule and f_pos value + * corresponding to penultimate rule. + * When we have enabled V_dyn_keep_states, states + * that become orphaned will get the DYN_REFERENCED + * flag and rule will keep around. So we can return + * it. But since it is not in the rules map, we need + * return such f_pos value, so after the state + * handling if the search will continue, the next rule + * will be the last one - the default rule. */ if (V_layer3_chain.map[data->f_pos] == rule) { data->chain_id = V_layer3_chain.id; info->f_pos = data->f_pos; } else if (V_dyn_keep_states != 0) { - rule = V_layer3_chain.default_rule; - info->f_pos = V_layer3_chain.n_rules - 1; + /* + * The original rule pointer is still usable. + * So, we return it, but f_pos need to be + * changed to point to the penultimate rule. + */ + MPASS(V_layer3_chain.n_rules > 1); + data->chain_id = V_layer3_chain.id; + data->f_pos = V_layer3_chain.n_rules - 2; + info->f_pos = data->f_pos; } else { rule = NULL; info->direction = MATCH_NONE; @@ -1443,7 +1456,7 @@ ipfw_dyn_lookup_state(const struct ip_fw_args *args, c static struct dyn_parent * dyn_alloc_parent(void *parent, uint32_t ruleid, uint16_t rulenum, - uint8_t set, uint32_t hashval) + uint32_t hashval) { struct dyn_parent *limit; @@ -1462,7 +1475,6 @@ dyn_alloc_parent(void *parent, uint32_t ruleid, uint16 limit->parent = parent; limit->ruleid = ruleid; limit->rulenum = rulenum; - limit->set = set; limit->hashval = hashval; limit->expire = time_uptime + V_dyn_short_lifetime; return (limit); @@ -1470,7 +1482,7 @@ dyn_alloc_parent(void *parent, uint32_t ruleid, uint16 static struct dyn_data * dyn_alloc_dyndata(void *parent, uint32_t ruleid, uint16_t rulenum, - uint8_t set, const struct ipfw_flow_id *pkt, const void *ulp, int pktlen, + const struct ipfw_flow_id *pkt, const void *ulp, int pktlen, uint32_t hashval, uint16_t fibnum) { struct dyn_data *data; @@ -1489,7 +1501,6 @@ dyn_alloc_dyndata(void *parent, uint32_t ruleid, uint1 data->parent = parent; data->ruleid = ruleid; data->rulenum = rulenum; - data->set = set; data->fibnum = fibnum; data->hashval = hashval; data->expire = time_uptime + V_dyn_syn_lifetime; @@ -1526,8 +1537,8 @@ dyn_alloc_ipv4_state(const struct ipfw_flow_id *pkt, u */ static struct dyn_ipv4_state * dyn_add_ipv4_parent(void *rule, uint32_t ruleid, uint16_t rulenum, - uint8_t set, const struct ipfw_flow_id *pkt, uint32_t hashval, - uint32_t version, uint16_t kidx) + const struct ipfw_flow_id *pkt, uint32_t hashval, uint32_t version, + uint16_t kidx) { struct dyn_ipv4_state *s; struct dyn_parent *limit; @@ -1554,7 +1565,7 @@ dyn_add_ipv4_parent(void *rule, uint32_t ruleid, uint1 } } - limit = dyn_alloc_parent(rule, ruleid, rulenum, set, hashval); + limit = dyn_alloc_parent(rule, ruleid, rulenum, hashval); if (limit == NULL) { DYN_BUCKET_UNLOCK(bucket); return (NULL); @@ -1579,7 +1590,7 @@ dyn_add_ipv4_parent(void *rule, uint32_t ruleid, uint1 static int dyn_add_ipv4_state(void *parent, uint32_t ruleid, uint16_t rulenum, - uint8_t set, const struct ipfw_flow_id *pkt, const void *ulp, int pktlen, + const struct ipfw_flow_id *pkt, const void *ulp, int pktlen, uint32_t hashval, struct ipfw_dyn_info *info, uint16_t fibnum, uint16_t kidx, uint8_t type) { @@ -1604,7 +1615,7 @@ dyn_add_ipv4_state(void *parent, uint32_t ruleid, uint } } - data = dyn_alloc_dyndata(parent, ruleid, rulenum, set, pkt, ulp, + data = dyn_alloc_dyndata(parent, ruleid, rulenum, pkt, ulp, pktlen, hashval, fibnum); if (data == NULL) { DYN_BUCKET_UNLOCK(bucket); @@ -1657,8 +1668,8 @@ dyn_alloc_ipv6_state(const struct ipfw_flow_id *pkt, u */ static struct dyn_ipv6_state * dyn_add_ipv6_parent(void *rule, uint32_t ruleid, uint16_t rulenum, - uint8_t set, const struct ipfw_flow_id *pkt, uint32_t zoneid, - uint32_t hashval, uint32_t version, uint16_t kidx) + const struct ipfw_flow_id *pkt, uint32_t zoneid, uint32_t hashval, + uint32_t version, uint16_t kidx) { struct dyn_ipv6_state *s; struct dyn_parent *limit; @@ -1685,7 +1696,7 @@ dyn_add_ipv6_parent(void *rule, uint32_t ruleid, uint1 } } - limit = dyn_alloc_parent(rule, ruleid, rulenum, set, hashval); + limit = dyn_alloc_parent(rule, ruleid, rulenum, hashval); if (limit == NULL) { DYN_BUCKET_UNLOCK(bucket); return (NULL); @@ -1710,8 +1721,8 @@ dyn_add_ipv6_parent(void *rule, uint32_t ruleid, uint1 static int dyn_add_ipv6_state(void *parent, uint32_t ruleid, uint16_t rulenum, - uint8_t set, const struct ipfw_flow_id *pkt, uint32_t zoneid, - const void *ulp, int pktlen, uint32_t hashval, struct ipfw_dyn_info *info, + const struct ipfw_flow_id *pkt, uint32_t zoneid, const void *ulp, + int pktlen, uint32_t hashval, struct ipfw_dyn_info *info, uint16_t fibnum, uint16_t kidx, uint8_t type) { struct dyn_ipv6_state *s; @@ -1735,7 +1746,7 @@ dyn_add_ipv6_state(void *parent, uint32_t ruleid, uint } } - data = dyn_alloc_dyndata(parent, ruleid, rulenum, set, pkt, ulp, + data = dyn_alloc_dyndata(parent, ruleid, rulenum, pkt, ulp, pktlen, hashval, fibnum); if (data == NULL) { DYN_BUCKET_UNLOCK(bucket); @@ -1785,8 +1796,7 @@ dyn_get_parent_state(const struct ipfw_flow_id *pkt, u DYNSTATE_CRITICAL_EXIT(); s = dyn_add_ipv4_parent(rule, rule->id, - rule->rulenum, rule->set, pkt, hashval, - version, kidx); + rule->rulenum, pkt, hashval, version, kidx); if (s == NULL) return (NULL); /* Now we are in critical section again. */ @@ -1809,8 +1819,8 @@ dyn_get_parent_state(const struct ipfw_flow_id *pkt, u DYNSTATE_CRITICAL_EXIT(); s = dyn_add_ipv6_parent(rule, rule->id, - rule->rulenum, rule->set, pkt, zoneid, hashval, - version, kidx); + rule->rulenum, pkt, zoneid, hashval, version, + kidx); if (s == NULL) return (NULL); /* Now we are in critical section again. */ @@ -1853,17 +1863,18 @@ dyn_get_parent_state(const struct ipfw_flow_id *pkt, u static int dyn_install_state(const struct ipfw_flow_id *pkt, uint32_t zoneid, - uint16_t fibnum, const void *ulp, int pktlen, void *rule, - uint32_t ruleid, uint16_t rulenum, uint8_t set, + uint16_t fibnum, const void *ulp, int pktlen, struct ip_fw *rule, struct ipfw_dyn_info *info, uint32_t limit, uint16_t limit_mask, uint16_t kidx, uint8_t type) { struct ipfw_flow_id id; - uint32_t hashval, parent_hashval; + uint32_t hashval, parent_hashval, ruleid, rulenum; int ret; MPASS(type == O_LIMIT || type == O_KEEP_STATE); + ruleid = rule->id; + rulenum = rule->rulenum; if (type == O_LIMIT) { /* Create masked flow id and calculate bucket */ id.addr_type = pkt->addr_type; @@ -1918,11 +1929,11 @@ dyn_install_state(const struct ipfw_flow_id *pkt, uint hashval = hash_packet(pkt); if (IS_IP4_FLOW_ID(pkt)) - ret = dyn_add_ipv4_state(rule, ruleid, rulenum, set, pkt, + ret = dyn_add_ipv4_state(rule, ruleid, rulenum, pkt, ulp, pktlen, hashval, info, fibnum, kidx, type); #ifdef INET6 else if (IS_IP6_FLOW_ID(pkt)) - ret = dyn_add_ipv6_state(rule, ruleid, rulenum, set, pkt, + ret = dyn_add_ipv6_state(rule, ruleid, rulenum, pkt, zoneid, ulp, pktlen, hashval, info, fibnum, kidx, type); #endif /* INET6 */ else @@ -1995,8 +2006,8 @@ ipfw_dyn_install_state(struct ip_fw_chain *chain, stru #ifdef INET6 IS_IP6_FLOW_ID(&args->f_id) ? dyn_getscopeid(args): #endif - 0, M_GETFIB(args->m), ulp, pktlen, rule, rule->id, rule->rulenum, - rule->set, info, limit, limit_mask, cmd->o.arg1, cmd->o.opcode)); + 0, M_GETFIB(args->m), ulp, pktlen, rule, info, limit, + limit_mask, cmd->o.arg1, cmd->o.opcode)); } /* @@ -2093,7 +2104,11 @@ dyn_free_states(struct ip_fw_chain *chain) } /* - * Returns 1 when state is matched by specified range, otherwise returns 0. + * Returns: + * 0 when state is not matched by specified range; + * 1 when state is matched by specified range; + * 2 when state is matched by specified range and requested deletion of + * dynamic states. */ static int dyn_match_range(uint16_t rulenum, uint8_t set, const ipfw_range_tlv *rt) @@ -2101,50 +2116,121 @@ dyn_match_range(uint16_t rulenum, uint8_t set, const i MPASS(rt != NULL); /* flush all states */ - if (rt->flags & IPFW_RCFLAG_ALL) + if (rt->flags & IPFW_RCFLAG_ALL) { + if (rt->flags & IPFW_RCFLAG_DYNAMIC) + return (2); /* forced */ return (1); + } if ((rt->flags & IPFW_RCFLAG_SET) != 0 && set != rt->set) return (0); if ((rt->flags & IPFW_RCFLAG_RANGE) != 0 && (rulenum < rt->start_rule || rulenum > rt->end_rule)) return (0); + if (rt->flags & IPFW_RCFLAG_DYNAMIC) + return (2); return (1); } +static void +dyn_acquire_rule(struct ip_fw_chain *ch, struct dyn_data *data, + struct ip_fw *rule, uint16_t kidx) +{ + struct dyn_state_obj *obj; + + /* + * Do not acquire reference twice. + * This can happen when rule deletion executed for + * the same range, but different ruleset id. + */ + if (data->flags & DYN_REFERENCED) + return; + + IPFW_UH_WLOCK_ASSERT(ch); + MPASS(kidx != 0); + + data->flags |= DYN_REFERENCED; + /* Reference the named object */ + obj = SRV_OBJECT(ch, kidx); + obj->no.refcnt++; + MPASS(obj->no.etlv == IPFW_TLV_STATE_NAME); + + /* Reference the parent rule */ + rule->refcnt++; +} + +static void +dyn_release_rule(struct ip_fw_chain *ch, struct dyn_data *data, + struct ip_fw *rule, uint16_t kidx) +{ + struct dyn_state_obj *obj; + + IPFW_UH_WLOCK_ASSERT(ch); + MPASS(kidx != 0); + + obj = SRV_OBJECT(ch, kidx); + if (obj->no.refcnt == 1) + dyn_destroy(ch, &obj->no); + else + obj->no.refcnt--; + + if (--rule->refcnt == 1) + ipfw_free_rule(rule); +} + +/* + * We do not keep O_LIMIT_PARENT states when V_dyn_keep_states is enabled. + * O_LIMIT state is created when new connection is going to be established + * and there is no matching state. So, since the old parent rule was deleted + * we can't create new states with old parent, and thus we can not account + * new connections with already established connections, and can not do + * proper limiting. + */ static int -dyn_match_ipv4_state(struct dyn_ipv4_state *s, const ipfw_range_tlv *rt) +dyn_match_ipv4_state(struct ip_fw_chain *ch, struct dyn_ipv4_state *s, + const ipfw_range_tlv *rt) { + struct ip_fw *rule; + int ret; - if (s->type == O_LIMIT_PARENT) - return (dyn_match_range(s->limit->rulenum, - s->limit->set, rt)); + if (s->type == O_LIMIT_PARENT) { + rule = s->limit->parent; + return (dyn_match_range(s->limit->rulenum, rule->set, rt)); + } + rule = s->data->parent; if (s->type == O_LIMIT) - return (dyn_match_range(s->data->rulenum, s->data->set, rt)); + rule = ((struct dyn_ipv4_state *)rule)->limit->parent; - if (V_dyn_keep_states == 0 && - dyn_match_range(s->data->rulenum, s->data->set, rt)) - return (1); + ret = dyn_match_range(s->data->rulenum, rule->set, rt); + if (ret == 0 || V_dyn_keep_states == 0 || ret > 1) + return (ret); + dyn_acquire_rule(ch, s->data, rule, s->kidx); return (0); } #ifdef INET6 static int -dyn_match_ipv6_state(struct dyn_ipv6_state *s, const ipfw_range_tlv *rt) +dyn_match_ipv6_state(struct ip_fw_chain *ch, struct dyn_ipv6_state *s, + const ipfw_range_tlv *rt) { + struct ip_fw *rule; + int ret; - if (s->type == O_LIMIT_PARENT) - return (dyn_match_range(s->limit->rulenum, - s->limit->set, rt)); + if (s->type == O_LIMIT_PARENT) { + rule = s->limit->parent; + return (dyn_match_range(s->limit->rulenum, rule->set, rt)); + } + rule = s->data->parent; if (s->type == O_LIMIT) - return (dyn_match_range(s->data->rulenum, s->data->set, rt)); + rule = ((struct dyn_ipv6_state *)rule)->limit->parent; - if (V_dyn_keep_states == 0 && - dyn_match_range(s->data->rulenum, s->data->set, rt)) - return (1); + ret = dyn_match_range(s->data->rulenum, rule->set, rt); + if (ret == 0 || V_dyn_keep_states == 0 || ret > 1) + return (ret); + dyn_acquire_rule(ch, s->data, rule, s->kidx); return (0); } #endif @@ -2154,7 +2240,7 @@ dyn_match_ipv6_state(struct dyn_ipv6_state *s, const i * @rt can be used to specify the range of states for deletion. */ static void -dyn_expire_states(struct ip_fw_chain *chain, ipfw_range_tlv *rt) +dyn_expire_states(struct ip_fw_chain *ch, ipfw_range_tlv *rt) { struct dyn_ipv4_slist expired_ipv4; #ifdef INET6 @@ -2162,8 +2248,11 @@ dyn_expire_states(struct ip_fw_chain *chain, ipfw_rang struct dyn_ipv6_state *s6, *s6n, *s6p; #endif struct dyn_ipv4_state *s4, *s4n, *s4p; + void *rule; int bucket, removed, length, max_length; + IPFW_UH_WLOCK_ASSERT(ch); + /* * Unlink expired states from each bucket. * With acquired bucket lock iterate entries of each lists: @@ -2188,7 +2277,8 @@ dyn_expire_states(struct ip_fw_chain *chain, ipfw_rang while (s != NULL) { \ next = CK_SLIST_NEXT(s, entry); \ if ((TIME_LEQ((s)->exp, time_uptime) && extra) || \ - (rt != NULL && dyn_match_ ## af ## _state(s, rt))) {\ + (rt != NULL && \ + dyn_match_ ## af ## _state(ch, s, rt))) { \ if (prev != NULL) \ CK_SLIST_REMOVE_AFTER(prev, entry); \ else \ @@ -2200,6 +2290,14 @@ dyn_expire_states(struct ip_fw_chain *chain, ipfw_rang DYN_COUNT_DEC(dyn_parent_count); \ else { \ DYN_COUNT_DEC(dyn_count); \ + if (s->data->flags & DYN_REFERENCED) { \ + rule = s->data->parent; \ + if (s->type == O_LIMIT) \ + rule = ((__typeof(s)) \ + rule)->limit->parent;\ + dyn_release_rule(ch, s->data, \ + rule, s->kidx); \ + } \ if (s->type == O_LIMIT) { \ s = s->data->parent; \ DPARENT_COUNT_DEC(s->limit); \ @@ -2684,6 +2782,42 @@ ipfw_expire_dyn_states(struct ip_fw_chain *chain, ipfw } /* + * Pass through all states and reset eaction for orphaned rules. + */ +void +ipfw_dyn_reset_eaction(struct ip_fw_chain *ch, uint16_t eaction_id, + uint16_t default_id, uint16_t instance_id) +{ +#ifdef INET6 + struct dyn_ipv6_state *s6; +#endif + struct dyn_ipv4_state *s4; + struct ip_fw *rule; + uint32_t bucket; + +#define DYN_RESET_EACTION(s, h, b) \ + CK_SLIST_FOREACH(s, &V_dyn_ ## h[b], entry) { \ + if ((s->data->flags & DYN_REFERENCED) == 0) \ + continue; \ + rule = s->data->parent; \ + if (s->type == O_LIMIT) \ + rule = ((__typeof(s))rule)->limit->parent; \ + ipfw_reset_eaction(ch, rule, eaction_id, \ + default_id, instance_id); \ + } + + IPFW_UH_WLOCK_ASSERT(ch); + if (V_dyn_count == 0) + return; + for (bucket = 0; bucket < V_curr_dyn_buckets; bucket++) { + DYN_RESET_EACTION(s4, ipv4, bucket); +#ifdef INET6 + DYN_RESET_EACTION(s6, ipv6, bucket); +#endif + } +} + +/* * Returns size of dynamic states in legacy format */ int @@ -2695,12 +2829,41 @@ ipfw_dyn_len(void) /* * Returns number of dynamic states. + * Marks every named object index used by dynamic states with bit in @bmask. + * Returns number of named objects accounted in bmask via @nocnt. * Used by dump format v1 (current). */ uint32_t -ipfw_dyn_get_count(void) +ipfw_dyn_get_count(uint32_t *bmask, int *nocnt) { +#ifdef INET6 + struct dyn_ipv6_state *s6; +#endif + struct dyn_ipv4_state *s4; + uint32_t bucket; +#define DYN_COUNT_OBJECTS(s, h, b) \ + CK_SLIST_FOREACH(s, &V_dyn_ ## h[b], entry) { \ + MPASS(s->kidx != 0); \ + if (ipfw_mark_object_kidx(bmask, IPFW_TLV_STATE_NAME, \ + s->kidx) != 0) \ + (*nocnt)++; \ + } + + IPFW_UH_RLOCK_ASSERT(&V_layer3_chain); + + /* No need to pass through all the buckets. */ + *nocnt = 0; + if (V_dyn_count + V_dyn_parent_count == 0) + return (0); + + for (bucket = 0; bucket < V_curr_dyn_buckets; bucket++) { + DYN_COUNT_OBJECTS(s4, ipv4, bucket); +#ifdef INET6 + DYN_COUNT_OBJECTS(s6, ipv6, bucket); +#endif + } + return (V_dyn_count + V_dyn_parent_count); } @@ -2734,7 +2897,7 @@ ipfw_is_dyn_rule(struct ip_fw *rule) } static void -dyn_export_parent(const struct dyn_parent *p, uint16_t kidx, +dyn_export_parent(const struct dyn_parent *p, uint16_t kidx, uint8_t set, ipfw_dyn_rule *dst) { @@ -2746,9 +2909,9 @@ dyn_export_parent(const struct dyn_parent *p, uint16_t /* 'rule' is used to pass up the rule number and set */ memcpy(&dst->rule, &p->rulenum, sizeof(p->rulenum)); + /* store set number into high word of dst->rule pointer. */ - memcpy((char *)&dst->rule + sizeof(p->rulenum), &p->set, - sizeof(p->set)); + memcpy((char *)&dst->rule + sizeof(p->rulenum), &set, sizeof(set)); /* unused fields */ dst->pcnt = 0; @@ -2767,7 +2930,7 @@ dyn_export_parent(const struct dyn_parent *p, uint16_t static void dyn_export_data(const struct dyn_data *data, uint16_t kidx, uint8_t type, - ipfw_dyn_rule *dst) + uint8_t set, ipfw_dyn_rule *dst) { dst->dyn_type = type; @@ -2779,13 +2942,16 @@ dyn_export_data(const struct dyn_data *data, uint16_t /* 'rule' is used to pass up the rule number and set */ memcpy(&dst->rule, &data->rulenum, sizeof(data->rulenum)); + /* store set number into high word of dst->rule pointer. */ - memcpy((char *)&dst->rule + sizeof(data->rulenum), &data->set, - sizeof(data->set)); + memcpy((char *)&dst->rule + sizeof(data->rulenum), &set, sizeof(set)); + dst->state = data->state; + if (data->flags & DYN_REFERENCED) + dst->state |= IPFW_DYN_ORPHANED; + /* unused fields */ dst->parent = NULL; - dst->state = data->state; dst->ack_fwd = data->ack_fwd; dst->ack_rev = data->ack_rev; dst->count = 0; @@ -2800,13 +2966,18 @@ dyn_export_data(const struct dyn_data *data, uint16_t static void dyn_export_ipv4_state(const struct dyn_ipv4_state *s, ipfw_dyn_rule *dst) { + struct ip_fw *rule; switch (s->type) { case O_LIMIT_PARENT: - dyn_export_parent(s->limit, s->kidx, dst); + rule = s->limit->parent; + dyn_export_parent(s->limit, s->kidx, rule->set, dst); break; default: - dyn_export_data(s->data, s->kidx, s->type, dst); + rule = s->data->parent; + if (s->type == O_LIMIT) + rule = ((struct dyn_ipv4_state *)rule)->limit->parent; + dyn_export_data(s->data, s->kidx, s->type, rule->set, dst); } dst->id.dst_ip = s->dst; @@ -2827,13 +2998,18 @@ dyn_export_ipv4_state(const struct dyn_ipv4_state *s, static void dyn_export_ipv6_state(const struct dyn_ipv6_state *s, ipfw_dyn_rule *dst) { + struct ip_fw *rule; switch (s->type) { case O_LIMIT_PARENT: - dyn_export_parent(s->limit, s->kidx, dst); + rule = s->limit->parent; + dyn_export_parent(s->limit, s->kidx, rule->set, dst); break; default: - dyn_export_data(s->data, s->kidx, s->type, dst); + rule = s->data->parent; + if (s->type == O_LIMIT) + rule = ((struct dyn_ipv6_state *)rule)->limit->parent; + dyn_export_data(s->data, s->kidx, s->type, rule->set, dst); } dst->id.src_ip6 = s->src; Modified: stable/11/sys/netpfil/ipfw/ip_fw_eaction.c ============================================================================== --- stable/11/sys/netpfil/ipfw/ip_fw_eaction.c Sun Apr 14 11:52:00 2019 (r346204) +++ stable/11/sys/netpfil/ipfw/ip_fw_eaction.c Sun Apr 14 12:05:08 2019 (r346205) @@ -252,11 +252,10 @@ destroy_eaction_obj(struct ip_fw_chain *ch, struct nam * Resets all eaction opcodes to default handlers. */ static void -reset_eaction_obj(struct ip_fw_chain *ch, uint16_t eaction_id) +reset_eaction_rules(struct ip_fw_chain *ch, uint16_t eaction_id, + uint16_t instance_id, bool reset_rules) { struct named_object *no; - struct ip_fw *rule; - ipfw_insn *cmd; int i; IPFW_UH_WLOCK_ASSERT(ch); @@ -267,35 +266,32 @@ reset_eaction_obj(struct ip_fw_chain *ch, uint16_t eac panic("Default external action handler is not found"); if (eaction_id == no->kidx) panic("Wrong eaction_id"); - EACTION_DEBUG("replace id %u with %u", eaction_id, no->kidx); + + EACTION_DEBUG("Going to replace id %u with %u", eaction_id, no->kidx); IPFW_WLOCK(ch); - for (i = 0; i < ch->n_rules; i++) { - rule = ch->map[i]; - cmd = ACTION_PTR(rule); - if (cmd->opcode != O_EXTERNAL_ACTION) - continue; - if (cmd->arg1 != eaction_id) - continue; - cmd->arg1 = no->kidx; /* Set to default id */ - /* - * XXX: we only bump refcount on default_eaction. - * Refcount on the original object will be just - * ignored on destroy. But on default_eaction it - * will be decremented on rule deletion. - */ - no->refcnt++; - /* - * Since named_object related to this instance will be - * also destroyed, truncate the chain of opcodes to - * remove the rest of cmd chain just after O_EXTERNAL_ACTION - * opcode. - */ - if (rule->act_ofs < rule->cmd_len - 1) { - EACTION_DEBUG("truncate rule %d: len %u -> %u", - rule->rulenum, rule->cmd_len, rule->act_ofs + 1); - rule->cmd_len = rule->act_ofs + 1; + /* + * Reset eaction objects only if it is referenced by rules. + * But always reset objects for orphaned dynamic states. + */ + if (reset_rules) { + for (i = 0; i < ch->n_rules; i++) { + /* + * Refcount on the original object will be just + * ignored on destroy. Refcount on default_eaction + * will be decremented on rule deletion, thus we + * need to reference default_eaction object. + */ + if (ipfw_reset_eaction(ch, ch->map[i], eaction_id, + no->kidx, instance_id) != 0) + no->refcnt++; } } + /* + * Reset eaction opcodes for orphaned dynamic states. + * Since parent rules are already deleted, we don't need to + * reference named object of default_eaction. + */ + ipfw_dyn_reset_eaction(ch, eaction_id, no->kidx, instance_id); IPFW_WUNLOCK(ch); } @@ -368,12 +364,71 @@ ipfw_del_eaction(struct ip_fw_chain *ch, uint16_t eact IPFW_UH_WUNLOCK(ch); return (EINVAL); } - if (no->refcnt > 1) - reset_eaction_obj(ch, eaction_id); + reset_eaction_rules(ch, eaction_id, 0, (no->refcnt > 1)); EACTION_DEBUG("External action '%s' with id %u unregistered", no->name, eaction_id); destroy_eaction_obj(ch, no); IPFW_UH_WUNLOCK(ch); + return (0); +} + +int +ipfw_reset_eaction(struct ip_fw_chain *ch, struct ip_fw *rule, + uint16_t eaction_id, uint16_t default_id, uint16_t instance_id) +{ + ipfw_insn *cmd, *icmd; + + IPFW_UH_WLOCK_ASSERT(ch); + IPFW_WLOCK_ASSERT(ch); + + cmd = ACTION_PTR(rule); + if (cmd->opcode != O_EXTERNAL_ACTION || + cmd->arg1 != eaction_id) *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201904141205.x3EC58O3036897>