From owner-freebsd-pf@freebsd.org Fri Jan 25 00:32:02 2019
From: "Kristof Provost"
To: byrnejb@harte-lyne.ca
Cc: freebsd-pf@freebsd.org
Subject: Re: routing LAN traffic through/around a pf gateway
Date: Fri, 25 Jan 2019 13:31:46 +1300
Message-ID: <77538042-3448-4C7F-8499-F492A06E52E9@sigsegv.be>
List-Id: "Technical discussion and general questions about packet filter \(pf\)"

On 25 Jan 2019, at 9:37, James B. Byrne via freebsd-pf wrote:

> I have limited knowledge of PF being in the process of transitioning
> from 20+ years of RHEL/CentOS to FreeBSD. Neither do I possess a
> great fund of knowledge respecting IP routing. That said this is my
> problem:
>
> On a small test LAN I have three hosts, W44, W4 and G5:
>
> network layout, gateway address 216.185.71.5
>
>      W44                    G5                   w4
> 216.185.71.44   ---->  216.185.71.5         216.185.71.4    int_if IP
> 192.168.150.44         192.168.150.5  ----> 192.168.150.4   int_if IP alias
>
> Using ssh and with PF running on the gateway, when I connect from
> 216.185.71.44 to 216.185.71.4 then the ssh session operates normally.
> However, if instead I connect from 216.185.71.44 to 192.168.150.4 then
> the initial connection is made but the ssh session remains responsive
> for a brief time before it becomes non-responsive.
> If I terminate the PF running on the gateway the ssh session again
> becomes responsive. If I do not terminate PF then eventually the ssh
> session client disconnects with a timeout error.
>
> Besides macros the entire active contents of pf.conf on G5 are:
>
> scrub in all no-df max-mss 1440 fragment reassemble
>
> block return out log all
>
> block drop in log all
>
> pass log on $int_if
>
> pass inet proto icmp all \
>      icmp-type $icmp_types keep state
>
> pass out quick on $ext_if inet proto udp \
>      from any \
>      to any port 33433 >< 33626 keep state
>
> Which results in these rules when PF is running:
>
> @0 scrub in all no-df max-mss 1440 fragment reassemble
> @1 block return out log all
> @2 block drop in log all
> @3 pass log on em0 all flags S/SA keep state
> @4 pass inet proto icmp all icmp-type echoreq keep state
> @5 pass inet proto icmp all icmp-type unreach keep state
> @6 pass out quick on em1 inet proto udp from any to any port 33433 >< 33626 keep state
>

You don’t appear to have a rule permitting the SSH traffic to pass
through your router. I’m more than a little surprised you managed to
establish a connection in the first place. Unless the connection
existed before you started pf, of course.

Try adding something like:

	pass inet proto tcp to port 22

Regards,
Kristof

From owner-freebsd-pf@freebsd.org Fri Jan 25 13:14:18 2019
Date: Fri, 25 Jan 2019 15:14:09 +0200
From: Konstantin Belousov
To: Andreas Longwitz
Cc: freebsd-pf@freebsd.org, Gleb Smirnoff, Kristof Provost
Subject: Re: rdr pass for proto tcp sometimes creates states with expire time zero and so breaking connections
Message-ID: <20190125131409.GZ24863@kib.kiev.ua>
References: <5BC51424.5000309@incore.de> <5BD45882.1000207@incore.de> <5BEB3B9A.9080402@incore.de> <20181113222533.GJ9744@FreeBSD.org> <5C49ECAA.7060505@incore.de> <20190124203802.GU24863@kib.kiev.ua> <5C4A37A1.80206@incore.de>
In-Reply-To: <5C4A37A1.80206@incore.de>

On Thu, Jan 24, 2019 at 11:09:37PM +0100, Andreas Longwitz wrote:
> >> I think the problem is the cmpxchg8b instruction used in
> >> counter_u64_fetch(), because this machine instruction always writes to
> >> memory, also when we only want to read and have (EDX:EAX) = (ECX:EBX):
> >>
> >>     TEMP64 <- DEST
> >>     IF (EDX:EAX = TEMP64)
> >>        THEN
> >>           ZF <- 1
> >>           DEST <- ECX:EBX
> >>        ELSE
> >>           ZF <- 0
> >>           EDX:EAX <- TEMP64
> >>           DEST <- TEMP64
> >>     FI
> >>
> >> If one CPU increments the counter in pf_create_state() and another does
> >> the fetch, then both CPUs may run the cmpxchg8b at once with the
> >> chance that both read the same memory value in TEMP64 and the fetching
> >> CPU is the second CPU that writes and so the increment is lost. That's
> >> what I see without the above patch two or three times a week.
> >
> > Please try the following patch. The idea is to make the value to compare
> > with unlikely to be equal to the memory content, for fetch_one().
>
> During my research I first had the same idea, but it did not work. In
> the actual coding eax/edx is not well defined before cmpxchg8b is
> executed, but it does not help for the problem to do so.
>
> > Also it fixes a silly bug in zero_one().
> >
> > diff --git a/sys/i386/include/counter.h b/sys/i386/include/counter.h
> > index 7fd26d2a960..aa20831ba18 100644
> > --- a/sys/i386/include/counter.h
> > +++ b/sys/i386/include/counter.h
> > @@ -78,6 +78,9 @@ counter_u64_read_one_8b(uint64_t *p)
> >  	uint32_t res_lo, res_high;
> >  
> >  	__asm __volatile(
> > +	"movl (%0),%%eax\n\t"
> > +	"movl 4(%0),%%edx\n\t"
> > +	"addl $0x10000000,%%edx\n\t" /* avoid write */
> >  	"movl %%eax,%%ebx\n\t"
> >  	"movl %%edx,%%ecx\n\t"
> >  	"cmpxchg8b (%2)"
>
> We can not avoid the write done by cmpxchg8b as can be seen from the
> microcode given above, we always end up with "DEST <- TEMP". From the
> Intel instruction reference manual:
>
> The destination operand is written back if the comparison fails. (The
> processor never produces a locked read without also producing a locked
> write).

I see, the AMD APM is clearer there, stating that the instruction always
does a read-modify-write regardless of the lock prefix.

>
> Maybe it is enough to prefix the cmpxchg8b with LOCK only in function
> counter_u64_read_one_8b().

I am not sure. Let's switch to the IPI method for fetch, similar to clear.
I do not think that the cost of fetch is too important compared with
the race.

> >
> > @@ -120,11 +123,11 @@ counter_u64_zero_one_8b(uint64_t *p)
> >  {
> >  	__asm __volatile(
> > +"\n1:\n\t"
> >  	"movl (%0),%%eax\n\t"
> > -	"movl 4(%0),%%edx\n"
> > +	"movl 4(%0),%%edx\n\t"
> >  	"xorl %%ebx,%%ebx\n\t"
> >  	"xorl %%ecx,%%ecx\n\t"
> > -"1:\n\t"
> >  	"cmpxchg8b (%0)\n\t"
> >  	"jnz 1b"
> >  	:
>
> If jnz jumps back the instruction cmpxchg8b has loaded registers eax/edx
> with (%0), therefore I do not understand the silly bug.

Ignore me.

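The lost increment described in this thread can be replayed with a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct cx8_op, cx8_load(), cx8_store() and run_scenario() are hypothetical names, and the read and write phases of an un-LOCKed cmpxchg8b are modeled as two separate function calls so that the interleaving of two CPUs can be reproduced deterministically.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one in-flight cmpxchg8b without a LOCK prefix. */
struct cx8_op {
	uint64_t temp;		/* TEMP64, latched by the read phase */
};

/* Read phase: TEMP64 <- DEST. */
static void
cx8_load(struct cx8_op *op, const uint64_t *dest)
{
	op->temp = *dest;
}

/*
 * Write phase, following the pseudocode quoted in the thread. DEST is
 * stored to on both branches; on a mismatch the old value is written back.
 * Returns the resulting ZF.
 */
static bool
cx8_store(const struct cx8_op *op, uint64_t *dest, uint64_t *edx_eax,
    uint64_t ecx_ebx)
{
	if (*edx_eax == op->temp) {
		*dest = ecx_ebx;
		return (true);
	}
	*edx_eax = op->temp;
	*dest = op->temp;	/* write-back of the (possibly stale) value */
	return (false);
}

/*
 * Replays one increment and one fetch against the same counter slot and
 * returns the value left in the slot afterwards.
 */
static uint64_t
run_scenario(uint64_t initial, bool interleaved)
{
	uint64_t slot = initial;
	struct cx8_op inc, fetch;
	uint64_t inc_cmp = initial;	/* incrementing CPU: EDX:EAX */
	/* Fetching CPU: EDX:EAX made to mismatch, as in the first patch. */
	uint64_t fetch_cmp = initial + (UINT64_C(0x10000000) << 32);

	if (interleaved) {
		/* Both CPUs latch the same TEMP64; the fetcher writes last. */
		cx8_load(&fetch, &slot);
		cx8_load(&inc, &slot);
		(void)cx8_store(&inc, &slot, &inc_cmp, initial + 1);
		(void)cx8_store(&fetch, &slot, &fetch_cmp, fetch_cmp);
	} else {
		/* Increment completes before the fetch begins. */
		cx8_load(&inc, &slot);
		(void)cx8_store(&inc, &slot, &inc_cmp, initial + 1);
		cx8_load(&fetch, &slot);
		(void)cx8_store(&fetch, &slot, &fetch_cmp, fetch_cmp);
	}
	return (slot);
}
```

Calling run_scenario(41, false) leaves 42 in the slot, while run_scenario(41, true) leaves 41: the fetcher's write-back of its stale TEMP64 undoes the increment even though its comparison was made to fail. In this model, forcing a mismatch in EDX:EAX does not help, which is consistent with the observation above.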
diff --git a/sys/i386/include/counter.h b/sys/i386/include/counter.h
index 7fd26d2a960..278f89123a4 100644
--- a/sys/i386/include/counter.h
+++ b/sys/i386/include/counter.h
@@ -72,7 +72,12 @@ counter_64_inc_8b(uint64_t *p, int64_t inc)
 }
 
 #ifdef IN_SUBR_COUNTER_C
-static inline uint64_t
+struct counter_u64_fetch_cx8_arg {
+	uint64_t res;
+	uint64_t *p;
+};
+
+static uint64_t
 counter_u64_read_one_8b(uint64_t *p)
 {
 	uint32_t res_lo, res_high;
@@ -87,9 +92,22 @@ counter_u64_read_one_8b(uint64_t *p)
 	return (res_lo + ((uint64_t)res_high << 32));
 }
 
+static void
+counter_u64_fetch_cx8_one(void *arg1)
+{
+	struct counter_u64_fetch_cx8_arg *arg;
+	uint64_t val;
+
+	arg = arg1;
+	val = counter_u64_read_one_8b((uint64_t *)((char *)arg->p +
+	    UMA_PCPU_ALLOC_SIZE * PCPU_GET(cpuid)));
+	atomic_add_64(&arg->res, val);
+}
+
 static inline uint64_t
 counter_u64_fetch_inline(uint64_t *p)
 {
+	struct counter_u64_fetch_cx8_arg arg;
 	uint64_t res;
 	int i;
 
@@ -108,9 +126,10 @@ counter_u64_fetch_inline(uint64_t *p)
 		}
 		critical_exit();
 	} else {
-		CPU_FOREACH(i)
-			res += counter_u64_read_one_8b((uint64_t *)((char *)p +
-			    UMA_PCPU_ALLOC_SIZE * i));
+		arg.p = p;
+		arg.res = 0;
+		smp_rendezvous(NULL, counter_u64_fetch_cx8_one, NULL, &arg);
+		res = arg.res;
 	}
 	return (res);
 }
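
The data flow of the patch above can be mimicked in user space. This is a rough sketch, assuming a fixed CPU count and slot stride: SKETCH_NCPU, SKETCH_STRIDE, struct fetch_arg, fetch_one() and fetch_all() are hypothetical names, and a plain loop stands in for smp_rendezvous(), so nothing here actually runs concurrently. It only shows the shape of the change: each CPU reads its own slot and adds the value atomically to a shared result, rather than one CPU touching every slot.

```c
#include <stdatomic.h>
#include <stdint.h>

#define	SKETCH_NCPU	4	/* hypothetical CPU count */
#define	SKETCH_STRIDE	64	/* hypothetical stand-in for UMA_PCPU_ALLOC_SIZE */

struct fetch_arg {
	_Atomic uint64_t res;	/* shared accumulator */
	uint64_t *p;		/* address of CPU 0's slot */
};

/* What each CPU would run: read only its own slot, add it to the total. */
static void
fetch_one(struct fetch_arg *arg, int cpuid)
{
	uint64_t val;

	val = *(uint64_t *)((char *)arg->p + SKETCH_STRIDE * cpuid);
	atomic_fetch_add(&arg->res, val);
}

/* Sums all per-CPU slots by invoking fetch_one() once per CPU. */
static uint64_t
fetch_all(uint64_t *p)
{
	struct fetch_arg arg;
	int cpu;

	arg.p = p;
	atomic_init(&arg.res, 0);
	/* Plain loop standing in for smp_rendezvous(); no real concurrency. */
	for (cpu = 0; cpu < SKETCH_NCPU; cpu++)
		fetch_one(&arg, cpu);
	return (atomic_load(&arg.res));
}
```

With slots holding 1, 2, 3 and 4 at stride SKETCH_STRIDE, fetch_all() returns 10. The atomic add matters in the kernel version because the callback runs on all CPUs at once; in this sequential sketch it is kept only to mirror the structure of the patch.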