From: "Andrey V. Elsukov"
Date: Mon, 18 Mar 2019 14:00:19 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r345275 - in head: sbin/ipfw sys/conf sys/modules/ipfw_nat64 sys/netinet6 sys/netpfil/ipfw/nat64

Author: ae
Date: Mon Mar 18 14:00:19 2019
New Revision: 345275
URL: https://svnweb.freebsd.org/changeset/base/345275

Log:
  Revert r345274. It appears that not all 32-bit architectures have
  necessary CK primitives.

Modified:
  head/sbin/ipfw/ipfw.8
  head/sbin/ipfw/ipfw2.h
  head/sbin/ipfw/nat64lsn.c
  head/sys/conf/files
  head/sys/modules/ipfw_nat64/Makefile
  head/sys/netinet6/ip_fw_nat64.h
  head/sys/netpfil/ipfw/nat64/nat64lsn.c
  head/sys/netpfil/ipfw/nat64/nat64lsn.h
  head/sys/netpfil/ipfw/nat64/nat64lsn_control.c

Modified: head/sbin/ipfw/ipfw.8
==============================================================================
--- head/sbin/ipfw/ipfw.8 Mon Mar 18 12:59:08 2019 (r345274)
+++ head/sbin/ipfw/ipfw.8 Mon Mar 18 14:00:19 2019 (r345275)
@@ -3300,7 +3300,6 @@ See .Sx SYSCTL VARIABLES for more info.
.Sh IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION -.Ss Stateful translation .Nm supports in-kernel IPv6/IPv4 network address and protocol translation. Stateful NAT64 translation allows IPv6-only clients to contact IPv4 servers @@ -3318,8 +3317,7 @@ to be able use stateful NAT64 translator. Stateful NAT64 uses a bunch of memory for several types of objects. When IPv6 client initiates connection, NAT64 translator creates a host entry in the states table. -Each host entry uses preallocated IPv4 alias entry. -Each alias entry has a number of ports group entries allocated on demand. +Each host entry has a number of ports group entries allocated on demand. Ports group entries contains connection state entries. There are several options to control limits and lifetime for these objects. .Pp @@ -3339,11 +3337,6 @@ First time an original packet is handled and consumed and then it is handled again as translated packet. This behavior can be changed by sysctl variable .Va net.inet.ip.fw.nat64_direct_output . -Also translated packet can be tagged using -.Cm tag -rule action, and then matched by -.Cm tagged -opcode to avoid loops and extra overhead. .Pp The stateful NAT64 configuration command is the following: .Bd -ragged -offset indent @@ -3371,16 +3364,15 @@ to represent IPv4 addresses. This IPv6 prefix should b The translator implementation follows RFC6052, that restricts the length of prefixes to one of following: 32, 40, 48, 56, 64, or 96. The Well-Known IPv6 Prefix 64:ff9b:: must be 96 bits long. -The special -.Ar ::/length -prefix can be used to handle several IPv6 prefixes with one NAT64 instance. -The NAT64 instance will determine a destination IPv4 address from prefix -.Ar length . -.It Cm states_chunks Ar number -The number of states chunks in single ports group. -Each ports group by default can keep 64 state entries in single chunk. -The above value affects the maximum number of states that can be associated with single IPv4 alias address and port. 
-The value must be power of 2, and up to 128. +.It Cm max_ports Ar number +Maximum number of ports reserved for upper level protocols to one IPv6 client. +All reserved ports are divided into chunks between supported protocols. +The number of connections from one IPv6 client is limited by this option. +Note that closed TCP connections still remain in the list of connections until +.Cm tcp_close_age +interval will not expire. +Default value is +.Ar 2048 . .It Cm host_del_age Ar seconds The number of seconds until the host entry for a IPv6 client will be deleted and all its resources will be released due to inactivity. Modified: head/sbin/ipfw/ipfw2.h ============================================================================== --- head/sbin/ipfw/ipfw2.h Mon Mar 18 12:59:08 2019 (r345274) +++ head/sbin/ipfw/ipfw2.h Mon Mar 18 14:00:19 2019 (r345275) @@ -278,7 +278,6 @@ enum tokens { TOK_AGG_LEN, TOK_AGG_COUNT, TOK_MAX_PORTS, - TOK_STATES_CHUNKS, TOK_JMAXLEN, TOK_PORT_RANGE, TOK_HOST_DEL_AGE, Modified: head/sbin/ipfw/nat64lsn.c ============================================================================== --- head/sbin/ipfw/nat64lsn.c Mon Mar 18 12:59:08 2019 (r345274) +++ head/sbin/ipfw/nat64lsn.c Mon Mar 18 14:00:19 2019 (r345275) @@ -87,70 +87,68 @@ nat64lsn_print_states(void *buf) char sflags[4], *sf, *proto; ipfw_obj_header *oh; ipfw_obj_data *od; - ipfw_nat64lsn_stg_v1 *stg; - ipfw_nat64lsn_state_v1 *ste; + ipfw_nat64lsn_stg *stg; + ipfw_nat64lsn_state *ste; uint64_t next_idx; int i, sz; oh = (ipfw_obj_header *)buf; od = (ipfw_obj_data *)(oh + 1); - stg = (ipfw_nat64lsn_stg_v1 *)(od + 1); + stg = (ipfw_nat64lsn_stg *)(od + 1); sz = od->head.length - sizeof(*od); next_idx = 0; while (sz > 0 && next_idx != 0xFF) { - next_idx = stg->next.index; + next_idx = stg->next_idx; sz -= sizeof(*stg); if (stg->count == 0) { stg++; continue; } - /* - * NOTE: addresses are in network byte order, - * ports are in host byte order. 
- */ + switch (stg->proto) { + case IPPROTO_TCP: + proto = "TCP"; + break; + case IPPROTO_UDP: + proto = "UDP"; + break; + case IPPROTO_ICMPV6: + proto = "ICMPv6"; + break; + } + inet_ntop(AF_INET6, &stg->host6, s, sizeof(s)); inet_ntop(AF_INET, &stg->alias4, a, sizeof(a)); - ste = (ipfw_nat64lsn_state_v1 *)(stg + 1); + ste = (ipfw_nat64lsn_state *)(stg + 1); for (i = 0; i < stg->count && sz > 0; i++) { sf = sflags; - inet_ntop(AF_INET6, &ste->host6, s, sizeof(s)); inet_ntop(AF_INET, &ste->daddr, f, sizeof(f)); - switch (ste->proto) { - case IPPROTO_TCP: - proto = "TCP"; + if (stg->proto == IPPROTO_TCP) { if (ste->flags & 0x02) *sf++ = 'S'; if (ste->flags & 0x04) *sf++ = 'E'; if (ste->flags & 0x01) *sf++ = 'F'; - break; - case IPPROTO_UDP: - proto = "UDP"; - break; - case IPPROTO_ICMP: - proto = "ICMPv6"; - break; } *sf = '\0'; - switch (ste->proto) { + switch (stg->proto) { case IPPROTO_TCP: case IPPROTO_UDP: printf("%s:%d\t%s:%d\t%s\t%s\t%d\t%s:%d\n", s, ste->sport, a, ste->aport, proto, sflags, ste->idle, f, ste->dport); break; - case IPPROTO_ICMP: + case IPPROTO_ICMPV6: printf("%s\t%s\t%s\t\t%d\t%s\n", s, a, proto, ste->idle, f); break; default: printf("%s\t%s\t%d\t\t%d\t%s\n", - s, a, ste->proto, ste->idle, f); + s, a, stg->proto, ste->idle, f); } ste++; sz -= sizeof(*ste); } - stg = (ipfw_nat64lsn_stg_v1 *)ste; + stg = (ipfw_nat64lsn_stg *)ste; } return (next_idx); } @@ -176,7 +174,6 @@ nat64lsn_states_cb(ipfw_nat64lsn_cfg *cfg, const char err(EX_OSERR, NULL); do { oh = (ipfw_obj_header *)buf; - oh->opheader.version = 1; /* Force using ov new API */ od = (ipfw_obj_data *)(oh + 1); nat64lsn_fill_ntlv(&oh->ntlv, cfg->name, set); od->head.type = IPFW_TLV_OBJDATA; @@ -366,8 +363,12 @@ nat64lsn_parse_int(const char *arg, const char *desc) static struct _s_x nat64newcmds[] = { { "prefix6", TOK_PREFIX6 }, + { "agg_len", TOK_AGG_LEN }, /* not yet */ + { "agg_count", TOK_AGG_COUNT }, /* not yet */ + { "port_range", TOK_PORT_RANGE }, /* not yet */ { "jmaxlen", 
TOK_JMAXLEN }, { "prefix4", TOK_PREFIX4 }, + { "max_ports", TOK_MAX_PORTS }, { "host_del_age", TOK_HOST_DEL_AGE }, { "pg_del_age", TOK_PG_DEL_AGE }, { "tcp_syn_age", TOK_TCP_SYN_AGE }, @@ -375,13 +376,10 @@ static struct _s_x nat64newcmds[] = { { "tcp_est_age", TOK_TCP_EST_AGE }, { "udp_age", TOK_UDP_AGE }, { "icmp_age", TOK_ICMP_AGE }, - { "states_chunks",TOK_STATES_CHUNKS }, { "log", TOK_LOG }, { "-log", TOK_LOGOFF }, { "allow_private", TOK_PRIVATE }, { "-allow_private", TOK_PRIVATEOFF }, - /* for compatibility with old configurations */ - { "max_ports", TOK_MAX_PORTS }, /* unused */ { NULL, 0 } }; @@ -438,17 +436,42 @@ nat64lsn_create(const char *name, uint8_t set, int ac, nat64lsn_parse_prefix(*av, AF_INET6, &cfg->prefix6, &cfg->plen6); if (ipfw_check_nat64prefix(&cfg->prefix6, - cfg->plen6) != 0 && - !IN6_IS_ADDR_UNSPECIFIED(&cfg->prefix6)) + cfg->plen6) != 0) errx(EX_USAGE, "Bad prefix6 %s", *av); ac--; av++; break; +#if 0 + case TOK_AGG_LEN: + NEED1("Aggregation prefix len required"); + cfg->agg_prefix_len = nat64lsn_parse_int(*av, opt); + ac--; av++; + break; + case TOK_AGG_COUNT: + NEED1("Max per-prefix count required"); + cfg->agg_prefix_max = nat64lsn_parse_int(*av, opt); + ac--; av++; + break; + case TOK_PORT_RANGE: + NEED1("port range x[:y] required"); + if ((p = strchr(*av, ':')) == NULL) + cfg->min_port = (uint16_t)nat64lsn_parse_int( + *av, opt); + else { + *p++ = '\0'; + cfg->min_port = (uint16_t)nat64lsn_parse_int( + *av, opt); + cfg->max_port = (uint16_t)nat64lsn_parse_int( + p, opt); + } + ac--; av++; + break; case TOK_JMAXLEN: NEED1("job queue length required"); cfg->jmaxlen = nat64lsn_parse_int(*av, opt); ac--; av++; break; +#endif case TOK_MAX_PORTS: NEED1("Max per-user ports required"); cfg->max_ports = nat64lsn_parse_int(*av, opt); @@ -496,12 +519,6 @@ nat64lsn_create(const char *name, uint8_t set, int ac, *av, opt); ac--; av++; break; - case TOK_STATES_CHUNKS: - NEED1("number of chunks required"); - cfg->states_chunks = 
(uint8_t)nat64lsn_parse_int( - *av, opt); - ac--; av++; - break; case TOK_LOG: cfg->flags |= NAT64_LOG; break; @@ -613,12 +630,6 @@ nat64lsn_config(const char *name, uint8_t set, int ac, *av, opt); ac--; av++; break; - case TOK_STATES_CHUNKS: - NEED1("number of chunks required"); - cfg->states_chunks = (uint8_t)nat64lsn_parse_int( - *av, opt); - ac--; av++; - break; case TOK_LOG: cfg->flags |= NAT64_LOG; break; @@ -778,24 +789,31 @@ nat64lsn_show_cb(ipfw_nat64lsn_cfg *cfg, const char *n printf("nat64lsn %s prefix4 %s/%u", cfg->name, abuf, cfg->plen4); inet_ntop(AF_INET6, &cfg->prefix6, abuf, sizeof(abuf)); printf(" prefix6 %s/%u", abuf, cfg->plen6); - if (co.verbose || cfg->states_chunks > 1) - printf(" states_chunks %u", cfg->states_chunks); - if (co.verbose || cfg->nh_delete_delay != NAT64LSN_HOST_AGE) +#if 0 + printf("agg_len %u agg_count %u ", cfg->agg_prefix_len, + cfg->agg_prefix_max); + if (cfg->min_port != NAT64LSN_PORT_MIN || + cfg->max_port != NAT64LSN_PORT_MAX) + printf(" port_range %u:%u", cfg->min_port, cfg->max_port); + if (cfg->jmaxlen != NAT64LSN_JMAXLEN) + printf(" jmaxlen %u ", cfg->jmaxlen); +#endif + if (cfg->max_ports != NAT64LSN_MAX_PORTS) + printf(" max_ports %u", cfg->max_ports); + if (cfg->nh_delete_delay != NAT64LSN_HOST_AGE) printf(" host_del_age %u", cfg->nh_delete_delay); - if (co.verbose || cfg->pg_delete_delay != NAT64LSN_PG_AGE) + if (cfg->pg_delete_delay != NAT64LSN_PG_AGE) printf(" pg_del_age %u ", cfg->pg_delete_delay); - if (co.verbose || cfg->st_syn_ttl != NAT64LSN_TCP_SYN_AGE) + if (cfg->st_syn_ttl != NAT64LSN_TCP_SYN_AGE) printf(" tcp_syn_age %u", cfg->st_syn_ttl); - if (co.verbose || cfg->st_close_ttl != NAT64LSN_TCP_FIN_AGE) + if (cfg->st_close_ttl != NAT64LSN_TCP_FIN_AGE) printf(" tcp_close_age %u", cfg->st_close_ttl); - if (co.verbose || cfg->st_estab_ttl != NAT64LSN_TCP_EST_AGE) + if (cfg->st_estab_ttl != NAT64LSN_TCP_EST_AGE) printf(" tcp_est_age %u", cfg->st_estab_ttl); - if (co.verbose || cfg->st_udp_ttl != 
NAT64LSN_UDP_AGE) + if (cfg->st_udp_ttl != NAT64LSN_UDP_AGE) printf(" udp_age %u", cfg->st_udp_ttl); - if (co.verbose || cfg->st_icmp_ttl != NAT64LSN_ICMP_AGE) + if (cfg->st_icmp_ttl != NAT64LSN_ICMP_AGE) printf(" icmp_age %u", cfg->st_icmp_ttl); - if (co.verbose || cfg->jmaxlen != NAT64LSN_JMAXLEN) - printf(" jmaxlen %u ", cfg->jmaxlen); if (cfg->flags & NAT64_LOG) printf(" log"); if (cfg->flags & NAT64_ALLOW_PRIVATE) Modified: head/sys/conf/files ============================================================================== --- head/sys/conf/files Mon Mar 18 12:59:08 2019 (r345274) +++ head/sys/conf/files Mon Mar 18 14:00:19 2019 (r345275) @@ -4398,9 +4398,9 @@ netpfil/ipfw/nat64/nat64clat.c optional inet inet6 ipf netpfil/ipfw/nat64/nat64clat_control.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nat64/nat64lsn.c optional inet inet6 ipfirewall \ - ipfirewall_nat64 compile-with "${NORMAL_C} -I$S/contrib/ck/include" + ipfirewall_nat64 netpfil/ipfw/nat64/nat64lsn_control.c optional inet inet6 ipfirewall \ - ipfirewall_nat64 compile-with "${NORMAL_C} -I$S/contrib/ck/include" + ipfirewall_nat64 netpfil/ipfw/nat64/nat64stl.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nat64/nat64stl_control.c optional inet inet6 ipfirewall \ Modified: head/sys/modules/ipfw_nat64/Makefile ============================================================================== --- head/sys/modules/ipfw_nat64/Makefile Mon Mar 18 12:59:08 2019 (r345274) +++ head/sys/modules/ipfw_nat64/Makefile Mon Mar 18 14:00:19 2019 (r345275) @@ -8,6 +8,4 @@ SRCS+= nat64clat.c nat64clat_control.c SRCS+= nat64lsn.c nat64lsn_control.c SRCS+= nat64stl.c nat64stl_control.c -CFLAGS+= -I${SRCTOP}/sys/contrib/ck/include - .include Modified: head/sys/netinet6/ip_fw_nat64.h ============================================================================== --- head/sys/netinet6/ip_fw_nat64.h Mon Mar 18 12:59:08 2019 (r345274) +++ head/sys/netinet6/ip_fw_nat64.h Mon Mar 18 14:00:19 2019 
(r345275) @@ -122,7 +122,7 @@ typedef struct _ipfw_nat64clat_cfg { /* * NAT64LSN default configuration values */ -#define NAT64LSN_MAX_PORTS 2048 /* Unused */ +#define NAT64LSN_MAX_PORTS 2048 /* Max number of ports per host */ #define NAT64LSN_JMAXLEN 2048 /* Max outstanding requests. */ #define NAT64LSN_TCP_SYN_AGE 10 /* State's TTL after SYN received. */ #define NAT64LSN_TCP_EST_AGE (2 * 3600) /* TTL for established connection */ @@ -135,20 +135,16 @@ typedef struct _ipfw_nat64clat_cfg { typedef struct _ipfw_nat64lsn_cfg { char name[64]; /* NAT name */ uint32_t flags; - - uint32_t max_ports; /* Unused */ - uint32_t agg_prefix_len; /* Unused */ - uint32_t agg_prefix_max; /* Unused */ - + uint32_t max_ports; /* Max ports per client */ + uint32_t agg_prefix_len; /* Prefix length to count */ + uint32_t agg_prefix_max; /* Max hosts per agg prefix */ struct in_addr prefix4; uint16_t plen4; /* Prefix length */ uint16_t plen6; /* Prefix length */ struct in6_addr prefix6; /* NAT64 prefix */ uint32_t jmaxlen; /* Max jobqueue length */ - - uint16_t min_port; /* Unused */ - uint16_t max_port; /* Unused */ - + uint16_t min_port; /* Min port group # to use */ + uint16_t max_port; /* Max port group # to use */ uint16_t nh_delete_delay;/* Stale host delete delay */ uint16_t pg_delete_delay;/* Stale portgroup delete delay */ uint16_t st_syn_ttl; /* TCP syn expire */ @@ -157,7 +153,7 @@ typedef struct _ipfw_nat64lsn_cfg { uint16_t st_udp_ttl; /* UDP expire */ uint16_t st_icmp_ttl; /* ICMP expire */ uint8_t set; /* Named instance set [0..31] */ - uint8_t states_chunks; /* Number of states chunks per PG */ + uint8_t spare; } ipfw_nat64lsn_cfg; typedef struct _ipfw_nat64lsn_state { @@ -181,30 +177,5 @@ typedef struct _ipfw_nat64lsn_stg { uint32_t spare2; } ipfw_nat64lsn_stg; -typedef struct _ipfw_nat64lsn_state_v1 { - struct in6_addr host6; /* Bound IPv6 host */ - struct in_addr daddr; /* Remote IPv4 address */ - uint16_t dport; /* Remote destination port */ - uint16_t aport; /* 
Local alias port */ - uint16_t sport; /* Source port */ - uint16_t spare; - uint16_t idle; /* Last used time */ - uint8_t flags; /* State flags */ - uint8_t proto; /* protocol */ -} ipfw_nat64lsn_state_v1; - -typedef struct _ipfw_nat64lsn_stg_v1 { - union nat64lsn_pgidx { - uint64_t index; - struct { - uint8_t chunk; /* states chunk */ - uint8_t proto; /* protocol */ - uint16_t port; /* base port */ - in_addr_t addr; /* alias address */ - }; - } next; /* next state index */ - struct in_addr alias4; /* IPv4 alias address */ - uint32_t count; /* Number of states */ -} ipfw_nat64lsn_stg_v1; - #endif /* _NETINET6_IP_FW_NAT64_H_ */ + Modified: head/sys/netpfil/ipfw/nat64/nat64lsn.c ============================================================================== --- head/sys/netpfil/ipfw/nat64/nat64lsn.c Mon Mar 18 12:59:08 2019 (r345274) +++ head/sys/netpfil/ipfw/nat64/nat64lsn.c Mon Mar 18 14:00:19 2019 (r345275) @@ -33,17 +33,16 @@ __FBSDID("$FreeBSD$"); #include #include #include -#include -#include #include -#include #include #include #include #include #include #include +#include #include +#include #include #include @@ -72,22 +71,17 @@ __FBSDID("$FreeBSD$"); MALLOC_DEFINE(M_NAT64LSN, "NAT64LSN", "NAT64LSN"); -static epoch_t nat64lsn_epoch; -#define NAT64LSN_EPOCH_ENTER(et) epoch_enter_preempt(nat64lsn_epoch, &(et)) -#define NAT64LSN_EPOCH_EXIT(et) epoch_exit_preempt(nat64lsn_epoch, &(et)) -#define NAT64LSN_EPOCH_WAIT() epoch_wait_preempt(nat64lsn_epoch) -#define NAT64LSN_EPOCH_ASSERT() MPASS(in_epoch(nat64lsn_epoch)) -#define NAT64LSN_EPOCH_CALL(c, f) epoch_call(nat64lsn_epoch, (c), (f)) +static void nat64lsn_periodic(void *data); +#define PERIODIC_DELAY 4 +static uint8_t nat64lsn_proto_map[256]; +uint8_t nat64lsn_rproto_map[NAT_MAX_PROTO]; -static uma_zone_t nat64lsn_host_zone; -static uma_zone_t nat64lsn_pgchunk_zone; -static uma_zone_t nat64lsn_pg_zone; -static uma_zone_t nat64lsn_aliaslink_zone; -static uma_zone_t nat64lsn_state_zone; -static uma_zone_t 
nat64lsn_job_zone; +#define NAT64_FLAG_FIN 0x01 /* FIN was seen */ +#define NAT64_FLAG_SYN 0x02 /* First syn in->out */ +#define NAT64_FLAG_ESTAB 0x04 /* Packet with Ack */ +#define NAT64_FLAGS_TCP (NAT64_FLAG_SYN|NAT64_FLAG_ESTAB|NAT64_FLAG_FIN) -static void nat64lsn_periodic(void *data); -#define PERIODIC_DELAY 4 +#define NAT64_FLAG_RDR 0x80 /* Port redirect */ #define NAT64_LOOKUP(chain, cmd) \ (struct nat64lsn_cfg *)SRV_OBJECT((chain), (cmd)->arg1) /* @@ -97,33 +91,25 @@ static void nat64lsn_periodic(void *data); enum nat64lsn_jtype { JTYPE_NEWHOST = 1, JTYPE_NEWPORTGROUP, - JTYPE_DESTROY, + JTYPE_DELPORTGROUP, }; struct nat64lsn_job_item { - STAILQ_ENTRY(nat64lsn_job_item) entries; + TAILQ_ENTRY(nat64lsn_job_item) next; enum nat64lsn_jtype jtype; - - union { - struct { /* used by JTYPE_NEWHOST, JTYPE_NEWPORTGROUP */ - struct mbuf *m; - struct nat64lsn_host *host; - struct nat64lsn_state *state; - uint32_t src6_hval; - uint32_t state_hval; - struct ipfw_flow_id f_id; - in_addr_t faddr; - uint16_t port; - uint8_t proto; - uint8_t done; - }; - struct { /* used by JTYPE_DESTROY */ - struct nat64lsn_hosts_slist hosts; - struct nat64lsn_pg_slist portgroups; - struct nat64lsn_pgchunk *pgchunk; - struct epoch_context epoch_ctx; - }; - }; + struct nat64lsn_host *nh; + struct nat64lsn_portgroup *pg; + void *spare_idx; + struct in6_addr haddr; + uint8_t nat_proto; + uint8_t done; + int needs_idx; + int delcount; + unsigned int fhash; /* Flow hash */ + uint32_t aaddr; /* Last used address (net) */ + struct mbuf *m; + struct ipfw_flow_id f_id; + uint64_t delmask[NAT64LSN_PGPTRNMASK]; }; static struct mtx jmtx; @@ -132,278 +118,143 @@ static struct mtx jmtx; #define JQUEUE_LOCK() mtx_lock(&jmtx) #define JQUEUE_UNLOCK() mtx_unlock(&jmtx) -static int nat64lsn_alloc_host(struct nat64lsn_cfg *cfg, - struct nat64lsn_job_item *ji); -static int nat64lsn_alloc_pg(struct nat64lsn_cfg *cfg, - struct nat64lsn_job_item *ji); -static struct nat64lsn_job_item *nat64lsn_create_job( - 
struct nat64lsn_cfg *cfg, int jtype); static void nat64lsn_enqueue_job(struct nat64lsn_cfg *cfg, struct nat64lsn_job_item *ji); -static void nat64lsn_job_destroy(epoch_context_t ctx); -static void nat64lsn_destroy_host(struct nat64lsn_host *host); -static void nat64lsn_destroy_pg(struct nat64lsn_pg *pg); +static void nat64lsn_enqueue_jobs(struct nat64lsn_cfg *cfg, + struct nat64lsn_job_head *jhead, int jlen); +static struct nat64lsn_job_item *nat64lsn_create_job(struct nat64lsn_cfg *cfg, + const struct ipfw_flow_id *f_id, int jtype); +static int nat64lsn_request_portgroup(struct nat64lsn_cfg *cfg, + const struct ipfw_flow_id *f_id, struct mbuf **pm, uint32_t aaddr, + int needs_idx); +static int nat64lsn_request_host(struct nat64lsn_cfg *cfg, + const struct ipfw_flow_id *f_id, struct mbuf **pm); static int nat64lsn_translate4(struct nat64lsn_cfg *cfg, - const struct ipfw_flow_id *f_id, struct mbuf **mp); + const struct ipfw_flow_id *f_id, struct mbuf **pm); static int nat64lsn_translate6(struct nat64lsn_cfg *cfg, - struct ipfw_flow_id *f_id, struct mbuf **mp); -static int nat64lsn_translate6_internal(struct nat64lsn_cfg *cfg, - struct mbuf **mp, struct nat64lsn_state *state, uint8_t flags); + struct ipfw_flow_id *f_id, struct mbuf **pm); -#define NAT64_BIT_TCP_FIN 0 /* FIN was seen */ -#define NAT64_BIT_TCP_SYN 1 /* First syn in->out */ -#define NAT64_BIT_TCP_ESTAB 2 /* Packet with Ack */ -#define NAT64_BIT_READY_IPV4 6 /* state is ready for translate4 */ -#define NAT64_BIT_STALE 7 /* state is going to be expired */ +static int alloc_portgroup(struct nat64lsn_job_item *ji); +static void destroy_portgroup(struct nat64lsn_portgroup *pg); +static void destroy_host6(struct nat64lsn_host *nh); +static int alloc_host6(struct nat64lsn_cfg *cfg, struct nat64lsn_job_item *ji); -#define NAT64_FLAG_FIN (1 << NAT64_BIT_TCP_FIN) -#define NAT64_FLAG_SYN (1 << NAT64_BIT_TCP_SYN) -#define NAT64_FLAG_ESTAB (1 << NAT64_BIT_TCP_ESTAB) -#define NAT64_FLAGS_TCP 
(NAT64_FLAG_SYN|NAT64_FLAG_ESTAB|NAT64_FLAG_FIN) +static int attach_portgroup(struct nat64lsn_cfg *cfg, + struct nat64lsn_job_item *ji); +static int attach_host6(struct nat64lsn_cfg *cfg, struct nat64lsn_job_item *ji); -#define NAT64_FLAG_READY (1 << NAT64_BIT_READY_IPV4) -#define NAT64_FLAG_STALE (1 << NAT64_BIT_STALE) -static inline uint8_t -convert_tcp_flags(uint8_t flags) -{ - uint8_t result; +/* XXX tmp */ +static uma_zone_t nat64lsn_host_zone; +static uma_zone_t nat64lsn_pg_zone; +static uma_zone_t nat64lsn_pgidx_zone; - result = flags & (TH_FIN|TH_SYN); - result |= (flags & TH_RST) >> 2; /* Treat RST as FIN */ - result |= (flags & TH_ACK) >> 2; /* Treat ACK as estab */ +static unsigned int nat64lsn_periodic_chkstates(struct nat64lsn_cfg *cfg, + struct nat64lsn_host *nh); - return (result); -} +#define I6_hash(x) (djb_hash((const unsigned char *)(x), 16)) +#define I6_first(_ph, h) (_ph)[h] +#define I6_next(x) (x)->next +#define I6_val(x) (&(x)->addr) +#define I6_cmp(a, b) IN6_ARE_ADDR_EQUAL(a, b) +#define I6_lock(a, b) +#define I6_unlock(a, b) -static void -nat64lsn_log(struct pfloghdr *plog, struct mbuf *m, sa_family_t family, - uintptr_t state) -{ +#define I6HASH_FIND(_cfg, _res, _a) \ + CHT_FIND(_cfg->ih, _cfg->ihsize, I6_, _res, _a) +#define I6HASH_INSERT(_cfg, _i) \ + CHT_INSERT_HEAD(_cfg->ih, _cfg->ihsize, I6_, _i) +#define I6HASH_REMOVE(_cfg, _res, _tmp, _a) \ + CHT_REMOVE(_cfg->ih, _cfg->ihsize, I6_, _res, _tmp, _a) - memset(plog, 0, sizeof(*plog)); - plog->length = PFLOG_REAL_HDRLEN; - plog->af = family; - plog->action = PF_NAT; - plog->dir = PF_IN; - plog->rulenr = htonl(state >> 32); - plog->subrulenr = htonl(state & 0xffffffff); - plog->ruleset[0] = '\0'; - strlcpy(plog->ifname, "NAT64LSN", sizeof(plog->ifname)); - ipfw_bpf_mtap2(plog, PFLOG_HDRLEN, m); -} +#define I6HASH_FOREACH_SAFE(_cfg, _x, _tmp, _cb, _arg) \ + CHT_FOREACH_SAFE(_cfg->ih, _cfg->ihsize, I6_, _x, _tmp, _cb, _arg) -#define HVAL(p, n, s) jenkins_hash32((const uint32_t *)(p), (n), 
(s)) -#define HOST_HVAL(c, a) HVAL((a),\ - sizeof(struct in6_addr) / sizeof(uint32_t), (c)->hash_seed) -#define HOSTS(c, v) ((c)->hosts_hash[(v) & ((c)->hosts_hashsize - 1)]) +#define HASH_IN4(x) djb_hash((const unsigned char *)(x), 8) -#define ALIASLINK_HVAL(c, f) HVAL(&(f)->dst_ip6,\ - sizeof(struct in6_addr) * 2 / sizeof(uint32_t), (c)->hash_seed) -#define ALIAS_BYHASH(c, v) \ - ((c)->aliases[(v) & ((1 << (32 - (c)->plen4)) - 1)]) -static struct nat64lsn_aliaslink* -nat64lsn_get_aliaslink(struct nat64lsn_cfg *cfg __unused, - struct nat64lsn_host *host, const struct ipfw_flow_id *f_id __unused) +static unsigned +djb_hash(const unsigned char *h, const int len) { + unsigned int result = 0; + int i; - /* - * We can implement some different algorithms how - * select an alias address. - * XXX: for now we use first available. - */ - return (CK_SLIST_FIRST(&host->aliases)); + for (i = 0; i < len; i++) + result = 33 * result ^ h[i]; + + return (result); } -#define FADDR_CHUNK(p, a) ((a) & ((p)->chunks_count - 1)) -#define FREEMASK_CHUNK(p, v) \ - ((p)->chunks_count == 1 ? &(p)->freemask : \ - &((p)->freemask_chunk[FADDR_CHUNK(p, v)])) -#define STATES_CHUNK(p, v) \ - ((p)->chunks_count == 1 ? 
(p)->states : \ - ((p)->states_chunk[FADDR_CHUNK(p, v)])) -#define STATE_HVAL(c, d) HVAL((d), 2, (c)->hash_seed) -#define STATE_HASH(h, v) \ - ((h)->states_hash[(v) & ((h)->states_hashsize - 1)]) - -#define NAT64LSN_TRY_PGCNT 32 -static struct nat64lsn_pg* -nat64lsn_get_pg(uint32_t *chunkmask, uint32_t *pgmask, - struct nat64lsn_pgchunk **chunks, struct nat64lsn_pg **pgptr, - uint32_t *pgidx, in_addr_t faddr) +/* +static size_t +bitmask_size(size_t num, int *level) { - struct nat64lsn_pg *pg, *oldpg; - uint32_t idx, oldidx; - int cnt; + size_t x; + int c; - cnt = 0; - /* First try last used PG */ - oldpg = pg = ck_pr_load_ptr(pgptr); - idx = oldidx = ck_pr_load_32(pgidx); - /* If pgidx is out of range, reset it to the first pgchunk */ - if (!ISSET32(*chunkmask, idx / 32)) - idx = 0; - do { - ck_pr_fence_load(); - if (pg != NULL && - bitcount64(*FREEMASK_CHUNK(pg, faddr)) > 0) { - /* - * If last used PG has not free states, - * try to update pointer. - * NOTE: it can be already updated by jobs handler, - * thus we use CAS operation. 
- */ - if (cnt > 0) - ck_pr_cas_ptr(pgptr, oldpg, pg); - return (pg); - } - /* Stop if idx is out of range */ - if (!ISSET32(*chunkmask, idx / 32)) - break; + for (c = 0, x = num; num > 1; num /= 64, c++) + ; - if (ISSET32(pgmask[idx / 32], idx % 32)) - pg = ck_pr_load_ptr( - &chunks[idx / 32]->pgptr[idx % 32]); - else - pg = NULL; + return (x); +} - idx++; - } while (++cnt < NAT64LSN_TRY_PGCNT); +static void +bitmask_prepare(uint64_t *pmask, size_t bufsize, int level) +{ + size_t x, z; - /* If pgidx is out of range, reset it to the first pgchunk */ - if (!ISSET32(*chunkmask, idx / 32)) - idx = 0; - ck_pr_cas_32(pgidx, oldidx, idx); - return (NULL); + memset(pmask, 0xFF, bufsize); + for (x = 0, z = 1; level > 1; x += z, z *= 64, level--) + ; + pmask[x] ~= 0x01; } +*/ -static struct nat64lsn_state* -nat64lsn_get_state6to4(struct nat64lsn_cfg *cfg, struct nat64lsn_host *host, - const struct ipfw_flow_id *f_id, uint32_t hval, in_addr_t faddr, - uint16_t port, uint8_t proto) +static void +nat64lsn_log(struct pfloghdr *plog, struct mbuf *m, sa_family_t family, + uint32_t n, uint32_t sn) { - struct nat64lsn_aliaslink *link; - struct nat64lsn_state *state; - struct nat64lsn_pg *pg; - int i, offset; - NAT64LSN_EPOCH_ASSERT(); - - /* Check that we already have state for given arguments */ - CK_SLIST_FOREACH(state, &STATE_HASH(host, hval), entries) { - if (state->proto == proto && state->ip_dst == faddr && - state->sport == port && state->dport == f_id->dst_port) - return (state); - } - - link = nat64lsn_get_aliaslink(cfg, host, f_id); - if (link == NULL) - return (NULL); - - switch (proto) { - case IPPROTO_TCP: - pg = nat64lsn_get_pg( - &link->alias->tcp_chunkmask, link->alias->tcp_pgmask, - link->alias->tcp, &link->alias->tcp_pg, - &link->alias->tcp_pgidx, faddr); - break; - case IPPROTO_UDP: - pg = nat64lsn_get_pg( - &link->alias->udp_chunkmask, link->alias->udp_pgmask, - link->alias->udp, &link->alias->udp_pg, - &link->alias->udp_pgidx, faddr); - break; - case 
IPPROTO_ICMP: - pg = nat64lsn_get_pg( - &link->alias->icmp_chunkmask, link->alias->icmp_pgmask, - link->alias->icmp, &link->alias->icmp_pg, - &link->alias->icmp_pgidx, faddr); - break; - default: - panic("%s: wrong proto %d", __func__, proto); - } - if (pg == NULL) - return (NULL); - - /* Check that PG has some free states */ - state = NULL; - i = bitcount64(*FREEMASK_CHUNK(pg, faddr)); - while (i-- > 0) { - offset = ffsll(*FREEMASK_CHUNK(pg, faddr)); - if (offset == 0) { - /* - * We lost the race. - * No more free states in this PG. - */ - break; - } - - /* Lets try to atomically grab the state */ - if (ck_pr_btr_64(FREEMASK_CHUNK(pg, faddr), offset - 1)) { - state = &STATES_CHUNK(pg, faddr)->state[offset - 1]; - /* Initialize */ - state->flags = proto != IPPROTO_TCP ? 0 : - convert_tcp_flags(f_id->_flags); - state->proto = proto; - state->aport = pg->base_port + offset - 1; - state->dport = f_id->dst_port; - state->sport = port; - state->ip6_dst = f_id->dst_ip6; - state->ip_dst = faddr; - state->ip_src = link->alias->addr; - state->hval = hval; - state->host = host; - SET_AGE(state->timestamp); - - /* Insert new state into host's hash table */ - HOST_LOCK(host); - CK_SLIST_INSERT_HEAD(&STATE_HASH(host, hval), - state, entries); - host->states_count++; - /* - * XXX: In case if host is going to be expired, - * reset NAT64LSN_DEADHOST flag. 
- */ - host->flags &= ~NAT64LSN_DEADHOST; - HOST_UNLOCK(host); - NAT64STAT_INC(&cfg->base.stats, screated); - /* Mark the state as ready for translate4 */ - ck_pr_fence_store(); - ck_pr_bts_32(&state->flags, NAT64_BIT_READY_IPV4); - break; - } - } - return (state); + memset(plog, 0, sizeof(*plog)); + plog->length = PFLOG_REAL_HDRLEN; + plog->af = family; + plog->action = PF_NAT; + plog->dir = PF_IN; + plog->rulenr = htonl(n); + plog->subrulenr = htonl(sn); + plog->ruleset[0] = '\0'; + strlcpy(plog->ifname, "NAT64LSN", sizeof(plog->ifname)); + ipfw_bpf_mtap2(plog, PFLOG_HDRLEN, m); } - /* * Inspects icmp packets to see if the message contains different * packet header so we need to alter @addr and @port. */ static int -inspect_icmp_mbuf(struct mbuf **mp, uint8_t *proto, uint32_t *addr, +inspect_icmp_mbuf(struct mbuf **m, uint8_t *nat_proto, uint32_t *addr, uint16_t *port) { - struct icmp *icmp; struct ip *ip; + struct tcphdr *tcp; + struct udphdr *udp; + struct icmphdr *icmp; int off; - uint8_t inner_proto; + uint8_t proto; - ip = mtod(*mp, struct ip *); /* Outer IP header */ + ip = mtod(*m, struct ip *); /* Outer IP header */ off = (ip->ip_hl << 2) + ICMP_MINLEN; - if ((*mp)->m_len < off) - *mp = m_pullup(*mp, off); - if (*mp == NULL) + if ((*m)->m_len < off) + *m = m_pullup(*m, off); + if (*m == NULL) return (ENOMEM); - ip = mtod(*mp, struct ip *); /* Outer IP header */ - icmp = L3HDR(ip, struct icmp *); + ip = mtod(*m, struct ip *); /* Outer IP header */ + icmp = L3HDR(ip, struct icmphdr *); switch (icmp->icmp_type) { case ICMP_ECHO: case ICMP_ECHOREPLY: /* Use icmp ID as distinguisher */ - *port = ntohs(icmp->icmp_id); + *port = ntohs(*((uint16_t *)(icmp + 1))); return (0); case ICMP_UNREACH: case ICMP_TIMXCEED: @@ -415,133 +266,90 @@ inspect_icmp_mbuf(struct mbuf **mp, uint8_t *proto, ui * ICMP_UNREACH and ICMP_TIMXCEED contains IP header + 64 bits * of ULP header. 
*/ - if ((*mp)->m_pkthdr.len < off + sizeof(struct ip) + ICMP_MINLEN) + if ((*m)->m_pkthdr.len < off + sizeof(struct ip) + ICMP_MINLEN) return (EINVAL); - if ((*mp)->m_len < off + sizeof(struct ip) + ICMP_MINLEN) - *mp = m_pullup(*mp, off + sizeof(struct ip) + ICMP_MINLEN); - if (*mp == NULL) + if ((*m)->m_len < off + sizeof(struct ip) + ICMP_MINLEN) + *m = m_pullup(*m, off + sizeof(struct ip) + ICMP_MINLEN); + if (*m == NULL) return (ENOMEM); - ip = mtodo(*mp, off); /* Inner IP header */ - inner_proto = ip->ip_p; + ip = mtodo(*m, off); /* Inner IP header */ + proto = ip->ip_p; off += ip->ip_hl << 2; /* Skip inner IP header */ *addr = ntohl(ip->ip_src.s_addr); - if ((*mp)->m_len < off + ICMP_MINLEN) - *mp = m_pullup(*mp, off + ICMP_MINLEN); - if (*mp == NULL) + if ((*m)->m_len < off + ICMP_MINLEN) + *m = m_pullup(*m, off + ICMP_MINLEN); + if (*m == NULL) return (ENOMEM); - switch (inner_proto) { + switch (proto) { case IPPROTO_TCP: + tcp = mtodo(*m, off); + *nat_proto = NAT_PROTO_TCP; + *port = ntohs(tcp->th_sport); + return (0); case IPPROTO_UDP: - /* Copy source port from the header */ - *port = ntohs(*((uint16_t *)mtodo(*mp, off))); - *proto = inner_proto; + udp = mtodo(*m, off); + *nat_proto = NAT_PROTO_UDP; + *port = ntohs(udp->uh_sport); return (0); case IPPROTO_ICMP: /* * We will translate only ICMP errors for our ICMP * echo requests. 
*/ - icmp = mtodo(*mp, off); + icmp = mtodo(*m, off); if (icmp->icmp_type != ICMP_ECHO) return (EOPNOTSUPP); - *port = ntohs(icmp->icmp_id); + *port = ntohs(*((uint16_t *)(icmp + 1))); return (0); }; return (EOPNOTSUPP); } -static struct nat64lsn_state* -nat64lsn_get_state4to6(struct nat64lsn_cfg *cfg, struct nat64lsn_alias *alias, - in_addr_t faddr, uint16_t port, uint8_t proto) +static inline uint8_t +convert_tcp_flags(uint8_t flags) { - struct nat64lsn_state *state; - struct nat64lsn_pg *pg; - int chunk_idx, pg_idx, state_idx; + uint8_t result; - NAT64LSN_EPOCH_ASSERT(); + result = flags & (TH_FIN|TH_SYN); + result |= (flags & TH_RST) >> 2; /* Treat RST as FIN */ + result |= (flags & TH_ACK) >> 2; /* Treat ACK as estab */ - if (port < NAT64_MIN_PORT) - return (NULL); - /* - * Alias keeps 32 pgchunks for each protocol. - * Each pgchunk has 32 pointers to portgroup. - * Each portgroup has 64 states for ports. - */ - port -= NAT64_MIN_PORT; - chunk_idx = port / 2048; - - port -= chunk_idx * 2048; - pg_idx = port / 64; - state_idx = port % 64; - - /* - * First check in proto_chunkmask that we have allocated PG chunk. - * Then check in proto_pgmask that we have valid PG pointer. - */ - pg = NULL; - switch (proto) { - case IPPROTO_TCP: - if (ISSET32(alias->tcp_chunkmask, chunk_idx) && - ISSET32(alias->tcp_pgmask[chunk_idx], pg_idx)) { - pg = alias->tcp[chunk_idx]->pgptr[pg_idx]; - break; - } - return (NULL); - case IPPROTO_UDP: - if (ISSET32(alias->udp_chunkmask, chunk_idx) && - ISSET32(alias->udp_pgmask[chunk_idx], pg_idx)) { - pg = alias->udp[chunk_idx]->pgptr[pg_idx]; - break; - } - return (NULL); - case IPPROTO_ICMP: - if (ISSET32(alias->icmp_chunkmask, chunk_idx) && - ISSET32(alias->icmp_pgmask[chunk_idx], pg_idx)) { - pg = alias->icmp[chunk_idx]->pgptr[pg_idx]; - break; - } *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***