Date: Mon, 26 Feb 2018 11:21:45 +0000
From: Joe Jones <joe@stream-technologies.com>
To: Kristof Provost <kristof@sigsegv.be>
Cc: freebsd-pf@freebsd.org
Subject: Re: Kernel Panic
Message-ID: <5A93EDC9.7020407@stream-technologies.com>
In-Reply-To: <5289570D-24E1-4292-B4D2-D2F67D7D2D4F@sigsegv.be>
References: <5A842FC6.7020806@stream-technologies.com> <FCB6BE6F-5346-42EC-ACB2-9CD99A1A16F0@sigsegv.be> <5A8443BF.8040208@stream-technologies.com> <5289570D-24E1-4292-B4D2-D2F67D7D2D4F@sigsegv.be>

Hi Kristof,

we are not updating rules during the test, although in production we
will reload the rule set from time to time. We are constantly adding to
and removing from tables, though, using the DIOCRADDADDRS and
DIOCRDELADDRS ioctls, and DIOCKILLSTATES is also being called a lot.
These are all in response to RADIUS events. We tried using the pfctl
shell command rather than calling the ioctls directly, to check that the
problem wasn't in how we are calling them.

A little background: our production system is running on 8.4 and has
been stable for years. We are in the process of moving to 11.1 and are
having big problems with stability when we allow customer traffic into
the machine. At the moment we are using mirror ports on the switch to
play live traffic into it. We're trying to work out the simplest
configuration that causes a problem, with a view to producing a good bug
report.

I have noticed that the pfil interface

https://www.freebsd.org/cgi/man.cgi?query=pfil&sektion=9

has locking in it which didn't exist in 8. I think it was introduced in
9? The locking functions appear in the man page in 10. I don't know if
that interface is used directly by pf, but I'm guessing packet
processing needs to be thread safe in a way it didn't in 8.

Regards,
Joe Jones

On 25/02/18 10:56, Kristof Provost wrote:
> On 14 Feb 2018, at 19:57, Joe Jones wrote:
>> On 14/02/18 13:09, Kristof Provost wrote:
>>> On 14 Feb 2018, at 23:47, Joe Jones wrote:
>>>> we are running test traffic through our system, after between 1 and
>>>> 12 hours we get a kernel panic, always in the pfr_pool_get function
>>>> in /usr/src/sys/netpfil/pf/pf_table.c line 2140. After a bit of
>>>> investigation I confirmed that ke2 is set to null on line 2122.
>>>>
>>> It’d probably be interesting to know what the contents of uaddr/addr
>>> is here.
>>> From a very quick look at the code there’s supposed to be a route
>>> lookup there, and I’d expect there to always be a result. The code
>>> certainly expects it, because that looks to be what causes the panic.
>>>
>>
>> (kgdb) p *uaddr
>> No symbol "uaddr" in current context.
>>
>> (kgdb) p *addr
>> $1 = {
>>   pfa = {
>>     v4 = {
>>       s_addr = 2016475826
>>     },
>>     v6 = {
>>       __u6_addr = {
>>         __u6_addr8 = 0xfffffe0000310d0c "��0x0\r1",
>>         __u6_addr16 = 0xfffffe0000310d0c,
>>         __u6_addr32 = 0xfffffe0000310d0c
>>       }
>>     },
>>     addr8 = 0xfffffe0000310d0c "��0x0\r1",
>>     addr16 = 0xfffffe0000310d0c,
>>     addr32 = 0xfffffe0000310d0c
>>   }
>> }
>>
> Interesting… That looks okay, so I have no idea why that lookup
> returned NULL.
> Are you modifying tables/rules at all during this test?
>
>> Am I right in thinking that's in network order.
>>
> I believe so, yes.
>
> Regards,
> Kristof