From owner-svn-src-stable@freebsd.org Sun Apr 14 11:06:43 2019 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C6567157156E; Sun, 14 Apr 2019 11:06:43 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5A63C8BB68; Sun, 14 Apr 2019 11:06:43 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 2D6491F912; Sun, 14 Apr 2019 11:06:43 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id x3EB6hI4004762; Sun, 14 Apr 2019 11:06:43 GMT (envelope-from ae@FreeBSD.org) Received: (from ae@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id x3EB6gcA004755; Sun, 14 Apr 2019 11:06:42 GMT (envelope-from ae@FreeBSD.org) Message-Id: <201904141106.x3EB6gcA004755@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ae set sender to ae@FreeBSD.org using -f From: "Andrey V. Elsukov" Date: Sun, 14 Apr 2019 11:06:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r346201 - stable/11/sys/netpfil/ipfw X-SVN-Group: stable-11 X-SVN-Commit-Author: ae X-SVN-Commit-Paths: stable/11/sys/netpfil/ipfw X-SVN-Commit-Revision: 346201 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 5A63C8BB68 X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-2.95 / 15.00]; local_wl_from(0.00)[FreeBSD.org]; NEURAL_HAM_MEDIUM(-1.00)[-0.999,0]; NEURAL_HAM_SHORT(-0.96)[-0.956,0]; ASN(0.00)[asn:11403, ipnet:2610:1c1:1::/48, country:US]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Apr 2019 11:06:44 -0000 Author: ae Date: Sun Apr 14 11:06:42 2019 New Revision: 346201 URL: https://svnweb.freebsd.org/changeset/base/346201 Log: MFC r342908: Reduce the size of struct ip_fw_args from 240 to 128 bytes on amd64. And refactor the code to avoid unneeded initialization to reduce overhead of per-packet processing. ipfw(4) can be invoked by pfil(9) framework for each packet several times. Each call uses on-stack variable of type struct ip_fw_args to keep the state of ipfw(4) processing. Currently this variable has 240 bytes size on amd64. Each time ipfw(4) does bzero() on it, and then it initializes some fields. glebius@ has reported that they at Netflix discovered, that initialization of this variable produces significant overhead on packet processing. After patching I managed to increase performance of packet processing on simple routing with ipfw(4) firewalling to about 11% from 9.8Mpps up to 11Mpps (Xeon E5-2660 v4@ + Mellanox 100G card). Introduced new field flags, it is used to keep track of what fields was initialized. Some fields were moved into the anonymous union, to reduce the size. They all are mutually exclusive. dummypar field was unused, and therefore it is removed. The hopstore6 field type was changed from sockaddr_in6 to a bit smaller struct ip_fw_nh6. And now the size of struct ip_fw_args is 128 bytes. ipfw_chk() was modified to properly handle ip_fw_args.flags instead of rely on checking for NULL pointers. Obtained from: Yandex LLC Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D18690 MFC r342909: Fix the build with INVARIANTS. MFC r343551: Fix the bug introduced in r342908, that causes problems with dynamic handling for protocols without ports numbers. Since port numbers were uninitialized for protocols like ICMP/ICMPv6, ipfw_chk() used some non-zero values to create dynamic states, and due this it failed to match replies with created states. Reported by: Oliver Hartmann, Boris Lytochkin Obtained from: Yandex LLC Modified: stable/11/sys/netpfil/ipfw/ip_fw2.c stable/11/sys/netpfil/ipfw/ip_fw_log.c stable/11/sys/netpfil/ipfw/ip_fw_pfil.c stable/11/sys/netpfil/ipfw/ip_fw_private.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/netpfil/ipfw/ip_fw2.c ============================================================================== --- stable/11/sys/netpfil/ipfw/ip_fw2.c Sun Apr 14 10:44:06 2019 (r346200) +++ stable/11/sys/netpfil/ipfw/ip_fw2.c Sun Apr 14 11:06:42 2019 (r346201) @@ -1189,6 +1189,7 @@ set_match(struct ip_fw_args *args, int slot, args->rule.slot = slot + 1; /* we use 0 as a marker */ args->rule.rule_id = 1 + chain->map[slot]->id; args->rule.rulenum = chain->map[slot]->rulenum; + args->flags |= IPFW_ARGS_REF; } #ifndef LINEAR_SKIPTO @@ -1374,17 +1375,12 @@ ipfw_chk(struct ip_fw_args *args) * Only valid for IPv4 packets. */ uint8_t proto; - uint16_t src_port = 0, dst_port = 0; /* NOTE: host format */ + uint16_t src_port, dst_port; /* NOTE: host format */ struct in_addr src_ip, dst_ip; /* NOTE: network format */ int iplen = 0; int pktlen; - uint16_t etype = 0; /* Host order stored ether type */ + uint16_t etype; /* Host order stored ether type */ - /* - * dyn_dir = MATCH_UNKNOWN when rules unchecked, - * MATCH_NONE when checked and not matched (q = NULL), - * MATCH_FORWARD or MATCH_REVERSE otherwise (q != NULL) - */ struct ipfw_dyn_info dyn_info; struct ip_fw *q = NULL; struct ip_fw_chain *chain = &V_layer3_chain; @@ -1412,10 +1408,8 @@ ipfw_chk(struct ip_fw_args *args) dst_ip.s_addr = 0; /* make sure it is initialized */ src_ip.s_addr = 0; /* make sure it is initialized */ + src_port = dst_port = 0; pktlen = m->m_pkthdr.len; - args->f_id.fib = M_GETFIB(m); /* note mbuf not altered) */ - proto = args->f_id.proto = 0; /* mark f_id invalid */ - /* XXX 0 is a valid proto: IP/IPv6 Hop-by-Hop Option */ DYN_INFO_INIT(&dyn_info); /* @@ -1439,18 +1433,19 @@ do { \ /* * if we have an ether header, */ - if (args->eh) + if (args->flags & IPFW_ARGS_ETHER) etype = ntohs(args->eh->ether_type); + else + etype = 0; /* Identify IP packets and fill up variables. */ if (pktlen >= sizeof(struct ip6_hdr) && - (args->eh == NULL || etype == ETHERTYPE_IPV6) && ip->ip_v == 6) { + (etype == 0 || etype == ETHERTYPE_IPV6) && ip->ip_v == 6) { struct ip6_hdr *ip6 = (struct ip6_hdr *)ip; + is_ipv6 = 1; - args->f_id.addr_type = 6; hlen = sizeof(struct ip6_hdr); proto = ip6->ip6_nxt; - /* Search extension headers to find upper layer protocols */ while (ulp == NULL && offset == 0) { switch (proto) { @@ -1623,20 +1618,18 @@ do { \ } ip = mtod(m, struct ip *); ip6 = (struct ip6_hdr *)ip; + args->f_id.addr_type = 6; args->f_id.src_ip6 = ip6->ip6_src; args->f_id.dst_ip6 = ip6->ip6_dst; - args->f_id.src_ip = 0; - args->f_id.dst_ip = 0; args->f_id.flow_id6 = ntohl(ip6->ip6_flow); iplen = ntohs(ip6->ip6_plen) + sizeof(*ip6); } else if (pktlen >= sizeof(struct ip) && - (args->eh == NULL || etype == ETHERTYPE_IP) && ip->ip_v == 4) { - is_ipv4 = 1; + (etype == 0 || etype == ETHERTYPE_IP) && ip->ip_v == 4) { + is_ipv4 = 1; hlen = ip->ip_hl << 2; - args->f_id.addr_type = 4; - /* - * Collect parameters into local variables for faster matching. + * Collect parameters into local variables for faster + * matching. */ proto = ip->ip_p; src_ip = ip->ip_src; @@ -1689,23 +1682,30 @@ do { \ } ip = mtod(m, struct ip *); + args->f_id.addr_type = 4; args->f_id.src_ip = ntohl(src_ip.s_addr); args->f_id.dst_ip = ntohl(dst_ip.s_addr); + } else { + proto = 0; + dst_ip.s_addr = src_ip.s_addr = 0; + + args->f_id.addr_type = 1; /* XXX */ } #undef PULLUP_TO pktlen = iplen < pktlen ? iplen: pktlen; - if (proto) { /* we may have port numbers, store them */ - args->f_id.proto = proto; - args->f_id.src_port = src_port = ntohs(src_port); - args->f_id.dst_port = dst_port = ntohs(dst_port); - } + /* Properly initialize the rest of f_id */ + args->f_id.proto = proto; + args->f_id.src_port = src_port = ntohs(src_port); + args->f_id.dst_port = dst_port = ntohs(dst_port); + args->f_id.fib = M_GETFIB(m); + IPFW_PF_RLOCK(chain); if (! V_ipfw_vnet_ready) { /* shutting down, leave NOW. */ IPFW_PF_RUNLOCK(chain); return (IP_FW_PASS); /* accept */ } - if (args->rule.slot) { + if (args->flags & IPFW_ARGS_REF) { /* * Packet has already been tagged as a result of a previous * match on rule args->rule aka args->rule_id (PIPE, QUEUE, @@ -1833,7 +1833,7 @@ do { \ break; case O_MACADDR2: - if (args->eh != NULL) { /* have MAC header */ + if (args->flags & IPFW_ARGS_ETHER) { u_int32_t *want = (u_int32_t *) ((ipfw_insn_mac *)cmd)->addr; u_int32_t *mask = (u_int32_t *) @@ -1848,7 +1848,7 @@ do { \ break; case O_MAC_TYPE: - if (args->eh != NULL) { + if (args->flags & IPFW_ARGS_ETHER) { u_int16_t *p = ((ipfw_insn_u16 *)cmd)->ports; int i; @@ -1869,19 +1869,21 @@ do { \ break; case O_LAYER2: - match = (args->eh != NULL); + match = (args->flags & IPFW_ARGS_ETHER); break; case O_DIVERTED: - { - /* For diverted packets, args->rule.info + if ((args->flags & IPFW_ARGS_REF) == 0) + break; + /* + * For diverted packets, args->rule.info * contains the divert port (in host format) * reason and direction. */ - uint32_t i = args->rule.info; - match = (i&IPFW_IS_MASK) == IPFW_IS_DIVERT && - cmd->arg1 & ((i & IPFW_INFO_IN) ? 1 : 2); - } + match = ((args->rule.info & IPFW_IS_MASK) == + IPFW_IS_DIVERT) && ( + ((args->rule.info & IPFW_INFO_IN) ? + 1: 2) & cmd->arg1); break; case O_PROTO: @@ -2053,7 +2055,8 @@ do { \ #ifdef INET6 /* FALLTHROUGH */ case O_IP6_SRC_ME: - match= is_ipv6 && ipfw_localip6(&args->f_id.src_ip6); + match = is_ipv6 && + ipfw_localip6(&args->f_id.src_ip6); #endif break; @@ -2089,7 +2092,8 @@ do { \ #ifdef INET6 /* FALLTHROUGH */ case O_IP6_DST_ME: - match= is_ipv6 && ipfw_localip6(&args->f_id.dst_ip6); + match = is_ipv6 && + ipfw_localip6(&args->f_id.dst_ip6); #endif break; @@ -2687,8 +2691,8 @@ do { \ case O_DIVERT: case O_TEE: - if (args->eh) /* not on layer 2 */ - break; + if (args->flags & IPFW_ARGS_ETHER) + break; /* not on layer 2 */ /* otherwise this is terminal */ l = 0; /* exit inner loop */ done = 1; /* exit outer loop */ @@ -2864,8 +2868,8 @@ do { \ break; case O_FORWARD_IP: - if (args->eh) /* not valid on layer2 pkts */ - break; + if (args->flags & IPFW_ARGS_ETHER) + break; /* not valid on layer2 pkts */ if (q != f || dyn_info.direction == MATCH_FORWARD) { struct sockaddr_in *sa; @@ -2882,32 +2886,22 @@ do { \ * table_value as next_hop6 address. */ if (is_ipv6) { - struct sockaddr_in6 *sa6; + struct ip_fw_nh6 *nh6; - sa6 = args->next_hop6 = - &args->hopstore6; - sa6->sin6_family = AF_INET6; - sa6->sin6_len = sizeof(*sa6); - sa6->sin6_addr = TARG_VAL( + args->flags |= IPFW_ARGS_NH6; + nh6 = &args->hopstore6; + nh6->sin6_addr = TARG_VAL( chain, tablearg, nh6); - sa6->sin6_port = sa->sin_port; - /* - * Set sin6_scope_id only for - * link-local unicast addresses. - */ - if (IN6_IS_ADDR_LINKLOCAL( - &sa6->sin6_addr)) - sa6->sin6_scope_id = - TARG_VAL(chain, - tablearg, - zoneid); + nh6->sin6_port = sa->sin_port; + nh6->sin6_scope_id = TARG_VAL( + chain, tablearg, zoneid); } else #endif { + args->flags |= IPFW_ARGS_NH4; args->hopstore.sin_port = sa->sin_port; - sa = args->next_hop = - &args->hopstore; + sa = &args->hopstore; sa->sin_family = AF_INET; sa->sin_len = sizeof(*sa); sa->sin_addr.s_addr = htonl( @@ -2915,7 +2909,8 @@ do { \ nh4)); } } else { - args->next_hop = sa; + args->flags |= IPFW_ARGS_NH4PTR; + args->next_hop = sa; } } retval = IP_FW_PASS; @@ -2925,13 +2920,14 @@ do { \ #ifdef INET6 case O_FORWARD_IP6: - if (args->eh) /* not valid on layer2 pkts */ - break; + if (args->flags & IPFW_ARGS_ETHER) + break; /* not valid on layer2 pkts */ if (q != f || dyn_info.direction == MATCH_FORWARD) { struct sockaddr_in6 *sin6; sin6 = &(((ipfw_insn_sa6 *)cmd)->sa); + args->flags |= IPFW_ARGS_NH6PTR; args->next_hop6 = sin6; } retval = IP_FW_PASS; @@ -2960,7 +2956,7 @@ do { \ if (fib >= rt_numfibs) fib = 0; M_SETFIB(m, fib); - args->f_id.fib = fib; + args->f_id.fib = fib; /* XXX */ l = 0; /* exit inner loop */ break; } @@ -3007,6 +3003,7 @@ do { \ struct cfg_nat *t; int nat_id; + args->rule.info = 0; set_match(args, f_pos, chain); /* Check if this is 'global' nat rule */ if (cmd->arg1 == IP_FW_NAT44_GLOBAL) { @@ -3059,6 +3056,7 @@ do { \ else ip->ip_sum = in_cksum(m, hlen); retval = IP_FW_REASS; + args->rule.info = 0; set_match(args, f_pos, chain); } done = 1; /* exit outer loop */ Modified: stable/11/sys/netpfil/ipfw/ip_fw_log.c ============================================================================== --- stable/11/sys/netpfil/ipfw/ip_fw_log.c Sun Apr 14 10:44:06 2019 (r346200) +++ stable/11/sys/netpfil/ipfw/ip_fw_log.c Sun Apr 14 11:06:42 2019 (r346201) @@ -105,7 +105,7 @@ ipfw_log(struct ip_fw_chain *chain, struct ip_fw *f, u char action2[92], proto[128], fragment[32]; if (V_fw_verbose == 0) { - if (args->eh) /* layer2, use orig hdr */ + if (args->flags & IPFW_ARGS_ETHER) /* layer2, use orig hdr */ ipfw_bpf_mtap2(args->eh, ETHER_HDR_LEN, m); else { /* Add fake header. Later we will store Modified: stable/11/sys/netpfil/ipfw/ip_fw_pfil.c ============================================================================== --- stable/11/sys/netpfil/ipfw/ip_fw_pfil.c Sun Apr 14 10:44:06 2019 (r346200) +++ stable/11/sys/netpfil/ipfw/ip_fw_pfil.c Sun Apr 14 11:06:42 2019 (r346201) @@ -124,13 +124,11 @@ ipfw_check_packet(void *arg, struct mbuf **m0, struct { struct ip_fw_args args; struct m_tag *tag; - int ipfw; - int ret; + int ipfw, ret; /* convert dir to IPFW values */ dir = (dir == PFIL_IN) ? DIR_IN : DIR_OUT; - bzero(&args, sizeof(args)); - + args.flags = 0; again: /* * extract and remove the tag if present. If we are left @@ -142,6 +140,7 @@ again: m_tag_delete(*m0, tag); if (args.rule.info & IPFW_ONEPASS) return (0); + args.flags |= IPFW_ARGS_REF; } args.m = *m0; @@ -155,53 +154,82 @@ again: __func__)); /* breaking out of the switch means drop */ - ret = 0; /* default return value for pass */ switch (ipfw) { case IP_FW_PASS: /* next_hop may be set by ipfw_chk */ - if (args.next_hop == NULL && args.next_hop6 == NULL) - break; /* pass */ + if ((args.flags & (IPFW_ARGS_NH4 | IPFW_ARGS_NH4PTR | + IPFW_ARGS_NH6 | IPFW_ARGS_NH6PTR)) == 0) { + ret = 0; + break; + } #if (!defined(INET6) && !defined(INET)) ret = EACCES; #else { - struct m_tag *fwd_tag; + void *psa; size_t len; - - KASSERT(args.next_hop == NULL || args.next_hop6 == NULL, - ("%s: both next_hop=%p and next_hop6=%p not NULL", __func__, - args.next_hop, args.next_hop6)); -#ifdef INET6 - if (args.next_hop6 != NULL) - len = sizeof(struct sockaddr_in6); -#endif #ifdef INET - if (args.next_hop != NULL) + if (args.flags & (IPFW_ARGS_NH4 | IPFW_ARGS_NH4PTR)) { + MPASS((args.flags & (IPFW_ARGS_NH4 | + IPFW_ARGS_NH4PTR)) != (IPFW_ARGS_NH4 | + IPFW_ARGS_NH4PTR)); + MPASS((args.flags & (IPFW_ARGS_NH6 | + IPFW_ARGS_NH6PTR)) == 0); len = sizeof(struct sockaddr_in); -#endif - - /* Incoming packets should not be tagged so we do not + psa = (args.flags & IPFW_ARGS_NH4) ? + &args.hopstore : args.next_hop; + if (in_localip(satosin(psa)->sin_addr)) + (*m0)->m_flags |= M_FASTFWD_OURS; + (*m0)->m_flags |= M_IP_NEXTHOP; + } +#endif /* INET */ +#ifdef INET6 + if (args.flags & (IPFW_ARGS_NH6 | IPFW_ARGS_NH6PTR)) { + MPASS((args.flags & (IPFW_ARGS_NH6 | + IPFW_ARGS_NH6PTR)) != (IPFW_ARGS_NH6 | + IPFW_ARGS_NH6PTR)); + MPASS((args.flags & (IPFW_ARGS_NH4 | + IPFW_ARGS_NH4PTR)) == 0); + len = sizeof(struct sockaddr_in6); + psa = args.next_hop6; + (*m0)->m_flags |= M_IP6_NEXTHOP; + } +#endif /* INET6 */ + /* + * Incoming packets should not be tagged so we do not * m_tag_find. Outgoing packets may be tagged, so we * reuse the tag if present. */ - fwd_tag = (dir == DIR_IN) ? NULL : + tag = (dir == DIR_IN) ? NULL : m_tag_find(*m0, PACKET_TAG_IPFORWARD, NULL); - if (fwd_tag != NULL) { - m_tag_unlink(*m0, fwd_tag); + if (tag != NULL) { + m_tag_unlink(*m0, tag); } else { - fwd_tag = m_tag_get(PACKET_TAG_IPFORWARD, len, + tag = m_tag_get(PACKET_TAG_IPFORWARD, len, M_NOWAIT); - if (fwd_tag == NULL) { + if (tag == NULL) { ret = EACCES; break; /* i.e. drop */ } } + if ((args.flags & IPFW_ARGS_NH6) == 0) + bcopy(psa, tag + 1, len); + m_tag_prepend(*m0, tag); + ret = 0; #ifdef INET6 - if (args.next_hop6 != NULL) { + /* IPv6 next hop needs additional handling */ + if (args.flags & (IPFW_ARGS_NH6 | IPFW_ARGS_NH6PTR)) { struct sockaddr_in6 *sa6; - sa6 = (struct sockaddr_in6 *)(fwd_tag + 1); - bcopy(args.next_hop6, sa6, len); + sa6 = satosin6(tag + 1); + if (args.flags & IPFW_ARGS_NH6) { + sa6->sin6_family = AF_INET6; + sa6->sin6_len = sizeof(*sa6); + sa6->sin6_addr = args.hopstore6.sin6_addr; + sa6->sin6_port = args.hopstore6.sin6_port; + sa6->sin6_scope_id = + args.hopstore6.sin6_scope_id; + } /* * If nh6 address is link-local we should convert * it to kernel internal form before doing any @@ -213,18 +241,8 @@ again: } if (in6_localip(&sa6->sin6_addr)) (*m0)->m_flags |= M_FASTFWD_OURS; - (*m0)->m_flags |= M_IP6_NEXTHOP; } -#endif -#ifdef INET - if (args.next_hop != NULL) { - bcopy(args.next_hop, (fwd_tag+1), len); - if (in_localip(args.next_hop->sin_addr)) - (*m0)->m_flags |= M_FASTFWD_OURS; - (*m0)->m_flags |= M_IP_NEXTHOP; - } -#endif - m_tag_prepend(*m0, fwd_tag); +#endif /* INET6 */ } #endif /* INET || INET6 */ break; @@ -237,6 +255,7 @@ again: ret = EACCES; if (ip_dn_io_ptr == NULL) break; /* i.e. drop */ + MPASS(args.flags & IPFW_ARGS_REF); if (mtod(*m0, struct ip *)->ip_v == 4) ret = ip_dn_io_ptr(m0, dir, &args); else if (mtod(*m0, struct ip *)->ip_v == 6) @@ -260,6 +279,7 @@ again: ret = EACCES; break; /* i.e. drop */ } + MPASS(args.flags & IPFW_ARGS_REF); ret = ipfw_divert(m0, dir, &args.rule, (ipfw == IP_FW_TEE) ? 1 : 0); /* continue processing for the original packet (tee). */ @@ -273,6 +293,7 @@ again: ret = EACCES; break; /* i.e. drop */ } + MPASS(args.flags & IPFW_ARGS_REF); ret = ng_ipfw_input_p(m0, dir, &args, (ipfw == IP_FW_NGTEE) ? 1 : 0); if (ipfw == IP_FW_NGTEE) /* ignore errors for NGTEE */ @@ -281,13 +302,15 @@ again: case IP_FW_NAT: /* honor one-pass in case of successful nat */ - if (V_fw_one_pass) - break; /* ret is already 0 */ + if (V_fw_one_pass) { + ret = 0; + break; + } goto again; case IP_FW_REASS: goto again; /* continue with packet */ - + case IP_FW_NAT64: ret = 0; break; @@ -302,7 +325,7 @@ again: *m0 = NULL; } - return ret; + return (ret); } /* @@ -312,15 +335,14 @@ int ipfw_check_frame(void *arg, struct mbuf **m0, struct ifnet *ifp, int dir, struct inpcb *inp) { - struct ether_header *eh; + struct ip_fw_args args; struct ether_header save_eh; + struct ether_header *eh; + struct m_tag *mtag; struct mbuf *m; int i, ret; - struct ip_fw_args args; - struct m_tag *mtag; - bzero(&args, sizeof(args)); - + args.flags = IPFW_ARGS_ETHER; again: /* fetch start point from rule, if any. remove the tag if present. */ mtag = m_tag_locate(*m0, MTAG_IPFW_RULE, 0, NULL); @@ -329,6 +351,7 @@ again: m_tag_delete(*m0, mtag); if (args.rule.info & IPFW_ONEPASS) return (0); + args.flags |= IPFW_ARGS_REF; } /* I need some amt of data to be contiguous */ @@ -347,10 +370,8 @@ again: args.m = m; /* the packet we are looking at */ args.oif = dir == PFIL_OUT ? ifp: NULL; /* destination, if any */ - args.next_hop = NULL; /* we do not support forward yet */ - args.next_hop6 = NULL; /* we do not support forward yet */ args.eh = &save_eh; /* MAC header for bridged/MAC packets */ - args.inp = NULL; /* used by ipfw uid/gid/jail rules */ + args.inp = inp; /* used by ipfw uid/gid/jail rules */ i = ipfw_chk(&args); m = args.m; if (m != NULL) { @@ -381,14 +402,14 @@ again: case IP_FW_DUMMYNET: ret = EACCES; - int dir2; if (ip_dn_io_ptr == NULL) break; /* i.e. drop */ *m0 = NULL; - dir2 = (dir == PFIL_IN) ? DIR_IN : DIR_OUT; - ip_dn_io_ptr(&m, dir2 | PROTO_LAYER2, &args); + dir = (dir == PFIL_IN) ? DIR_IN : DIR_OUT; + MPASS(args.flags & IPFW_ARGS_REF); + ip_dn_io_ptr(&m, dir | PROTO_LAYER2, &args); return 0; case IP_FW_NGTEE: @@ -397,6 +418,7 @@ again: ret = EACCES; break; /* i.e. drop */ } + MPASS(args.flags & IPFW_ARGS_REF); ret = ng_ipfw_input_p(m0, (dir == PFIL_IN) ? DIR_IN : DIR_OUT, &args, (i == IP_FW_NGTEE) ? 1 : 0); if (i == IP_FW_NGTEE) /* ignore errors for NGTEE */ @@ -413,7 +435,7 @@ again: *m0 = NULL; } - return ret; + return (ret); } /* do the divert, return 1 on error 0 on success */ Modified: stable/11/sys/netpfil/ipfw/ip_fw_private.h ============================================================================== --- stable/11/sys/netpfil/ipfw/ip_fw_private.h Sun Apr 14 10:44:06 2019 (r346200) +++ stable/11/sys/netpfil/ipfw/ip_fw_private.h Sun Apr 14 11:06:42 2019 (r346201) @@ -82,11 +82,13 @@ struct _ip6dn_args { * efficient to pass variables around and extend the interface. */ struct ip_fw_args { - struct mbuf *m; /* the mbuf chain */ - struct ifnet *oif; /* output interface */ - struct sockaddr_in *next_hop; /* forward address */ - struct sockaddr_in6 *next_hop6; /* ipv6 forward address */ - + uint32_t flags; +#define IPFW_ARGS_ETHER 0x0001 /* has valid ethernet header */ +#define IPFW_ARGS_NH4 0x0002 /* has IPv4 next hop in hopstore */ +#define IPFW_ARGS_NH6 0x0004 /* has IPv6 next hop in hopstore */ +#define IPFW_ARGS_NH4PTR 0x0008 /* has IPv4 next hop in next_hop */ +#define IPFW_ARGS_NH6PTR 0x0010 /* has IPv6 next hop in next_hop6 */ +#define IPFW_ARGS_REF 0x0020 /* has valid ipfw_rule_ref */ /* * On return, it points to the matching rule. * On entry, rule.slot > 0 means the info is valid and @@ -94,19 +96,33 @@ struct ip_fw_args { * If chain_id == chain->id && slot >0 then jump to that slot. * Otherwise, we locate the first rule >= rulenum:rule_id */ - struct ipfw_rule_ref rule; /* match/restart info */ + struct ipfw_rule_ref rule; /* match/restart info */ - struct ether_header *eh; /* for bridged packets */ - - struct ipfw_flow_id f_id; /* grabbed from IP header */ - //uint32_t cookie; /* a cookie depending on rule action */ - struct inpcb *inp; - - struct _ip6dn_args dummypar; /* dummynet->ip6_output */ - union { /* store here if cannot use a pointer */ - struct sockaddr_in hopstore; - struct sockaddr_in6 hopstore6; + struct ifnet *oif; /* output interface */ + struct inpcb *inp; + union { + /* + * We don't support forwarding on layer2, thus we can + * keep eh pointer in this union. + * next_hop[6] pointers can be used to point to next hop + * stored in rule's opcode to avoid copying into hopstore. + * Also, it is expected that all 0x1-0x10 flags are mutually + * exclusive. + */ + struct ether_header *eh; /* for bridged packets */ + struct sockaddr_in *next_hop; + struct sockaddr_in6 *next_hop6; + /* ipfw next hop storage */ + struct sockaddr_in hopstore; + struct ip_fw_nh6 { + struct in6_addr sin6_addr; + uint32_t sin6_scope_id; + uint16_t sin6_port; + } hopstore6; }; + + struct mbuf *m; /* the mbuf chain */ + struct ipfw_flow_id f_id; /* grabbed from IP header */ }; MALLOC_DECLARE(M_IPFW);