From: Andreas Longwitz
To: Kristof Provost
CC: freebsd-pf@freebsd.org
Date: Tue, 13 Nov 2018 22:01:14 +0100
Subject: Re: rdr pass for proto tcp sometimes creates states with expire time zero and so breaking connections
Message-ID: <5BEB3B9A.9080402@incore.de>
References: <5BC51424.5000309@incore.de> <5BD45882.1000207@incore.de>

>> Are there any hints why the counter pf_default_rule->states_cur
>> could get a negative value ?
>
> I’m afraid I have no idea right now.

OK, in the meantime I did some more research, and I am now quite sure
that the bogus pf_default_rule->states_cur counter is not a problem in
pf. I am convinced it is a problem in counter(9) on i386 servers. The
critical code is the machine instruction cmpxchg8b used in
/sys/i386/include/counter.h.

From the Intel instruction set reference manual:

   This instruction can be used with a LOCK prefix to allow the
   instruction to be executed atomically.

Two other places in the kernel use cmpxchg8b:

   /sys/i386/include/atomic.h
   /sys/cddl/contrib/opensolaris/common/atomic/i386/opensolaris_atomic.S

Both use the LOCK prefix; atomic.h gives a detailed explanation.
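
To make the pattern concrete, here is a minimal sketch of a 64-bit
counter update built on a compare-and-swap retry loop, written in
portable C11 rather than inline assembly. The names demo_counter_add
and demo_counter_fetch are hypothetical and are not part of counter(9);
the sketch also uses a single shared variable, not the per-CPU layout
of counter.h. The assumption is that on i386 a compiler would typically
implement the 64-bit compare-exchange with a locked cmpxchg8b.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: add inc to *c with a compare-and-swap retry
 * loop and return the new value. This mirrors the "cmpxchg8b; jnz 1b"
 * structure, but leaves the choice of instruction to the compiler.
 */
static uint64_t
demo_counter_add(_Atomic uint64_t *c, int64_t inc)
{
	uint64_t old = atomic_load(c);
	uint64_t new;

	do {
		/* Unsigned arithmetic, so a negative inc wraps as intended. */
		new = old + (uint64_t)inc;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(c, &old, new));

	return (new);
}

/* Hypothetical helper: read the 64-bit value atomically. */
static uint64_t
demo_counter_fetch(_Atomic uint64_t *c)
{
	return (atomic_load(c));
}
```

A caller would declare "_Atomic uint64_t c = 0;" and then call
demo_counter_add(&c, 1) on state creation and demo_counter_add(&c, -1)
on state removal.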

Because counter.h lacks the LOCK prefix, I propose the following patch
to get around the leak:

--- counter.h.orig	2015-07-03 16:45:36.000000000 +0200
+++ counter.h	2018-11-13 16:07:20.329053000 +0100
@@ -60,6 +60,7 @@
 	"movl %%edx,%%ecx\n\t"
 	"addl (%%edi),%%ebx\n\t"
 	"adcl 4(%%edi),%%ecx\n\t"
+	"lock \n\t"
 	"cmpxchg8b %%fs:(%%esi)\n\t"
 	"jnz 1b"
 	:
@@ -76,6 +77,7 @@
 	__asm __volatile(
 	"movl %%eax,%%ebx\n\t"
 	"movl %%edx,%%ecx\n\t"
+	"lock \n\t"
 	"cmpxchg8b (%2)"
 	: "=a" (res_lo), "=d"(res_high)
 	: "SD" (p)
@@ -121,6 +123,7 @@
 	"xorl %%ebx,%%ebx\n\t"
 	"xorl %%ecx,%%ecx\n\t"
 	"1:\n\t"
+	"lock \n\t"
 	"cmpxchg8b (%0)\n\t"
 	"jnz 1b"
 	:

Kind regards,
Andreas