From owner-freebsd-pf@freebsd.org Sat Oct 27 12:22:39 2018 Return-Path: Delivered-To: freebsd-pf@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B74FE10ED8FD for ; Sat, 27 Oct 2018 12:22:39 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from dss.incore.de (dss.incore.de [195.145.1.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 293168FA63 for ; Sat, 27 Oct 2018 12:22:38 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from inetmail.dmz (inetmail.dmz [10.3.0.3]) by dss.incore.de (Postfix) with ESMTP id 69C93139A5; Sat, 27 Oct 2018 14:22:28 +0200 (CEST) X-Virus-Scanned: amavisd-new at incore.de Received: from dss.incore.de ([10.3.0.3]) by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 10024) with LMTP id PrdFTlJ0Xk4E; Sat, 27 Oct 2018 14:22:26 +0200 (CEST) Received: from mail.local.incore (fwintern.dmz [10.0.0.253]) by dss.incore.de (Postfix) with ESMTP id D0F0B139A0; Sat, 27 Oct 2018 14:22:26 +0200 (CEST) Received: from bsdmhs.longwitz (unknown [192.168.99.6]) by mail.local.incore (Postfix) with ESMTP id 974AC1B0; Sat, 27 Oct 2018 14:22:26 +0200 (CEST) Message-ID: <5BD45882.1000207@incore.de> Date: Sat, 27 Oct 2018 14:22:26 +0200 From: Andreas Longwitz User-Agent: Thunderbird 2.0.0.19 (X11/20090113) MIME-Version: 1.0 To: Kristof Provost CC: freebsd-pf@freebsd.org Subject: Re: rdr pass for proto tcp sometimes creates states with expire time zero and so breaking connections References: <5BC51424.5000309@incore.de> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Oct 2018 12:22:39 -0000 Thanks very much for answer especially for the hint to openbsd. > I wonder if there’s an integer overflow in the of_state_expires() > calculation. > The OpenBSD people have a cast to u_int64_t in their version: > > |timeout = (u_int64_t)timeout * (end - states) / (end - start); > | > > Perhaps this would fix your problem? (Untested, not even compiled) > > | if (end && states > start && start < end) { > if (states < end) { > timeout = (uint64_t)timeout * (end - states) / (end - start); > return (state->expire + timeout; > } > else > return (time_uptime); > } > return (state->expire + timeout); I can confirm the patch of the openbsd people adding the uint64_t cast makes sense. If timeout * (end - states) becomes bigger than UINT32_MAX (I am on i386) the cast prevents the overflow of this product and the result of the adaptive calculation will always be correct. Example: start=6000, end=12000, timeout=86400 * 5 (5 days), states=100 result 140972, result with cast patch 856800. In the problem I have reported for states of "rdr pass" rules I see start=6000, end=12000, timeout=86400 and (obviously erroneous, probably negative) states=0xffffffd0. Therefore the condition "states < end" is false and instead of doing the adaptive calculation the function pf_states_expires() returns time_update. In the code snippet start = state->rule.ptr->timeout[PFTM_ADAPTIVE_START]; if (start) { end = state->rule.ptr->timeout[PFTM_ADAPTIVE_END]; states = counter_u64_fetch(state->rule.ptr->states_cur); } else { start = V_pf_default_rule.timeout[PFTM_ADAPTIVE_START]; end = V_pf_default_rule.timeout[PFTM_ADAPTIVE_END]; states = V_pf_status.states; } I see start=6000, so the variable states with value 0xffffffd0 is hold in a counter variable. Based on these facts there are two questions (please notice I have not any adaptive statements in my pf.conf): 1. Why I see start=6000 for states of "rdr pass" rules ? For all states of filter rules I see always start=0 and therefore a counter variable is never used. 2, The counter variable states_cur is changed exclusiv by the macros STATE_INC/DEC_COUNTERS, why one of them can be negative ? The answer of question 1 I can give myself: All states of filter rules include a pointer to the underlying rule which includes counter variables too. A state of a "rdr pass" rule does not have such an underlying rule, so the global pf_default_rule is used. That means in the first line of the code snippet above we have state->rule.ptr == pf_default_rule and the value of timeout[PFTM_ADAPTIVE_END] indeed is initialized with 6000. Further the counter variable for states_cur of pf_default_rule is used für all "rdr/nat/binat pass" rules together. This was a little bit suprising for me, but I think this is intended behaviour. Correct ? I still don't have an answer for question 2. My problem is best described in the log of commit 263029: Once pf became not covered by a single mutex, many counters in it became race prone. Some just gather statistics, but some are later used in different calculations. A real problem was the race provoked underflow of the states_cur counter on a rule. Once it goes below zero, it wraps to UINT32_MAX. Later this value is used in pf_state_expires() and any state created by this rule is immediately expired. Thus, make fields states_cur, states_tot and src_nodes of struct pf_rule be counter(9)s. I run FreeBSD 10 r338093 and of course have this and the following commits included. Are there any hints why the counter pf_default_rule->states_cur could get a negative value ? --- Andreas Longwitz