Date: Thu, 1 Mar 2018 10:22:05 -0800
From: Ermal Luçi <eri@freebsd.org>
To: Joe Jones <joe@stream-technologies.com>
Cc: Kristof Provost <kristof@sigsegv.be>, "freebsd-pf@freebsd.org" <freebsd-pf@freebsd.org>
Subject: Re: Kernel Panic
Message-ID: <CAPBZQG03AxCbNFGtdhG07CaR_YOydmVK=6VwtMo--6Gxz7za6w@mail.gmail.com>
In-Reply-To: <235640a7-9463-6268-e8b2-3a333a011368@stream-technologies.com>
References: <5A842FC6.7020806@stream-technologies.com>
 <FCB6BE6F-5346-42EC-ACB2-9CD99A1A16F0@sigsegv.be>
 <5A8443BF.8040208@stream-technologies.com>
 <5289570D-24E1-4292-B4D2-D2F67D7D2D4F@sigsegv.be>
 <5A93EDC9.7020407@stream-technologies.com>
 <9F39A687-FB34-4984-B969-5264DF38544E@sigsegv.be>
 <19aedb50-34c0-417d-fc1e-e8d519655684@stream-technologies.com>
 <22A6028C-9BBA-4117-8734-D976EA5A1367@sigsegv.be>
 <06755C0B-4633-4FF7-988B-97A0A04D4EF6@sigsegv.be>
 <a5d9b51c-192a-ab66-d4d6-118cd5094591@stream-technologies.com>
 <E69A083C-8017-4DB6-A464-6C465D38FB41@sigsegv.be>
 <235640a7-9463-6268-e8b2-3a333a011368@stream-technologies.com>
On Thu, Mar 1, 2018 at 9:43 AM, Joe Jones <joe@stream-technologies.com> wrote:

> Hi Kristo,
>
> It's just the master that crashed, the backup can take over.
>
> We think the panic we got by compiling with witness and invariant may be a
> red herring.
>
> We are now looking rules like
>
> nat on $isp_if from <napts> to any -> <external_napts> sticky-address
>
> if we replace the external_napts table with a single address rather than a
> block of addresses the box does not crash.
>
> We are following this line of investigation at the moment.
>

This is a known issue and should be documented somewhere, possibly in the
man page. It dates from when locking was redesigned for pf(4).

https://github.com/freebsd/freebsd/blob/releng/11.1/sys/netpfil/pf/pf_lb.c#L428

 * XXXGL: in the round-robin case we need to store
 * the round-robin machine state in the rule, thus
 * forwarding thread needs to modify rule.
 *
 * This is done w/o locking, because performance is assumed
 * more important than round-robin precision.
 *
 * In the simpliest case we just update the "rpool->cur"
 * pointer. However, if pool contains tables or dynamic
 * addresses, then "tblidx" is also used to store machine
 * state. Since "tblidx" is int, concurrent access to it can't
 * lead to inconsistence, only to lost of precision.
 *
 * Things get worse, if table contains not hosts, but
 * prefixes. In this case counter also stores machine state,
 * and for IPv6 address, counter can't be updated atomically.
 * Probably, using round-robin on a table containing IPv6
 * prefixes (or even IPv4) would cause a panic.

The fix is to add proper locking around this scenario. At a minimum, a
RULES_WLOCK would be needed there, or maybe resort to atomics.

> Regards
> Joe Jones
>
>
> On 01/03/18 09:57, Kristof Provost wrote:
>
>> On 1 Mar 2018, at 15:37, Joe Jones wrote:
>>
>>> yes we use pfsync. Yesterday we tried with pfsync switched off, the box
>>> still locked up but this time without a panic.
>>>
>>> We make the DIOCRADDADDRS ioctl on the master and the backup (we use
>>> CARPed pairs).
>>>
>> Interesting. It might be related to pfsync. Is it the master that panics
>> or the backup? Or both?
>>
>> Regards,
>> Kristof
>>

-- 
Ermal
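
P.S. To illustrate the "atomics" option mentioned above, here is a minimal
userland sketch in C11. It is not the pf(4) code: struct rr_pool and rr_pick()
are hypothetical names made up for this example, and it only covers the simple
case of a single integer index. The prefix case described in the comment, where
an IPv6 address counter is also part of the state, cannot be handled with one
atomic integer and would still need a lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for round-robin pool state; not the pf(4) struct. */
struct rr_pool {
	atomic_uint next;	/* pick counter, shared by all callers */
	unsigned int naddrs;	/* number of addresses in the pool, > 0 */
};

/*
 * Return the index of the next address to use.  The fetch-and-add is a
 * single atomic read-modify-write, so two concurrent callers never see
 * a torn or half-updated counter.  The sequence has a discontinuity when
 * the counter wraps, unless naddrs is a power of two.
 */
static unsigned int
rr_pick(struct rr_pool *p)
{
	unsigned int n;

	assert(p->naddrs > 0);
	n = atomic_fetch_add_explicit(&p->next, 1, memory_order_relaxed);
	return (n % p->naddrs);
}
```

With a pool of three addresses and the counter starting at zero, successive
calls to rr_pick() return 0, 1, 2 and then 0 again. Relaxed ordering is
assumed to be enough here because only the counter itself is shared.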