Date:      Thu, 1 Mar 2018 10:22:05 -0800
From:      Ermal Luçi <eri@freebsd.org>
To:        Joe Jones <joe@stream-technologies.com>
Cc:        Kristof Provost <kristof@sigsegv.be>, "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
Subject:   Re: Kernel Panic
Message-ID:  <CAPBZQG03AxCbNFGtdhG07CaR_YOydmVK=6VwtMo--6Gxz7za6w@mail.gmail.com>
In-Reply-To: <235640a7-9463-6268-e8b2-3a333a011368@stream-technologies.com>
References:  <5A842FC6.7020806@stream-technologies.com> <FCB6BE6F-5346-42EC-ACB2-9CD99A1A16F0@sigsegv.be> <5A8443BF.8040208@stream-technologies.com> <5289570D-24E1-4292-B4D2-D2F67D7D2D4F@sigsegv.be> <5A93EDC9.7020407@stream-technologies.com> <9F39A687-FB34-4984-B969-5264DF38544E@sigsegv.be> <19aedb50-34c0-417d-fc1e-e8d519655684@stream-technologies.com> <22A6028C-9BBA-4117-8734-D976EA5A1367@sigsegv.be> <06755C0B-4633-4FF7-988B-97A0A04D4EF6@sigsegv.be> <a5d9b51c-192a-ab66-d4d6-118cd5094591@stream-technologies.com> <E69A083C-8017-4DB6-A464-6C465D38FB41@sigsegv.be> <235640a7-9463-6268-e8b2-3a333a011368@stream-technologies.com>

On Thu, Mar 1, 2018 at 9:43 AM, Joe Jones <joe@stream-technologies.com>
wrote:

> Hi Kristof,
>
> It's just the master that crashed, the backup can take over.
>
> We think the panic we got by compiling with WITNESS and INVARIANTS may be a
> red herring.
>
> We are now looking at rules like
>
> nat on $isp_if from <napts> to any -> <external_napts> sticky-address
>
> If we replace the external_napts table with a single address rather than a
> block of addresses, the box does not crash.
>
> We are following this line of investigation at the moment.
>

This is a known issue and should be documented somewhere, possibly in the
man page. It dates from when locking was redesigned for pf(4).

https://github.com/freebsd/freebsd/blob/releng/11.1/sys/netpfil/pf/pf_lb.c#L428

* XXXGL: in the round-robin case we need to store
* the round-robin machine state in the rule, thus
* forwarding thread needs to modify rule.
*
* This is done w/o locking, because performance is assumed
* more important than round-robin precision.
*
* In the simpliest case we just update the "rpool->cur"
* pointer. However, if pool contains tables or dynamic
* addresses, then "tblidx" is also used to store machine
* state. Since "tblidx" is int, concurrent access to it can't
* lead to inconsistence, only to lost of precision.
*
* Things get worse, if table contains not hosts, but
* prefixes. In this case counter also stores machine state,
* and for IPv6 address, counter can't be updated atomically.
* Probably, using round-robin on a table containing IPv6
* prefixes (or even IPv4) would cause a panic.

The fix is to add proper locking around this scenario.
At a minimum it would need a RULES_WLOCK in there, or maybe resort to
atomics.
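
To illustrate the atomics option, here is a minimal user-space sketch in C11.
It is not the pf(4) code: struct rr_pool, rr_pool_init() and rr_pool_pick()
are hypothetical names made up for this example, and it only covers the case
where the round-robin state fits in one machine word (an index into an IPv4
prefix). A 128-bit IPv6 counter cannot be advanced this way and would still
need a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the round-robin state stored in a rule's pool. */
struct rr_pool {
	uint32_t base;              /* first address of the prefix, host order */
	uint32_t count;             /* number of addresses in the prefix */
	atomic_uint_fast32_t next;  /* shared round-robin ticket counter */
};

static void
rr_pool_init(struct rr_pool *p, uint32_t base, uint32_t count)
{
	assert(count > 0);          /* an empty pool has nothing to hand out */
	p->base = base;
	p->count = count;
	atomic_init(&p->next, 0);
}

/*
 * Pick the next address. The only shared write is a single atomic
 * fetch-and-add, so concurrent callers each get a distinct ticket and
 * the result always stays inside [base, base + count).
 */
static uint32_t
rr_pool_pick(struct rr_pool *p)
{
	uint_fast32_t ticket;

	ticket = atomic_fetch_add_explicit(&p->next, 1, memory_order_relaxed);
	return (p->base + (uint32_t)(ticket % p->count));
}
```

The offset is derived from a private copy of the ticket, so no thread ever
reads a half-updated counter. When the ticket counter eventually wraps, the
sequence may skip ahead unless count is a power of two, which only costs
round-robin precision.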



> Regards
> Joe Jones
>
>
> On 01/03/18 09:57, Kristof Provost wrote:
>
>> On 1 Mar 2018, at 15:37, Joe Jones wrote:
>>
>>> yes we use pfsync. Yesterday we tried with pfsync switched off, the box
>>> still locked up but this time without a panic.
>>>
>>> We make the DIOCRADDADDRS ioctl on the master and the backup (we use
>>> CARPed pairs).
>>>
>> Interesting. It might be related to pfsync. Is it the master that panics
>> or the backup? Or both?
>>
>> Regards,
>> Kristof
>>
>
> _______________________________________________
> freebsd-pf@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-pf
> To unsubscribe, send any mail to "freebsd-pf-unsubscribe@freebsd.org"
>
--
Ermal


