Date: Fri, 18 Sep 2015 16:28:57 +0200
From: Palle Girgensohn <girgen@FreeBSD.org>
To: Julien Charbon <jch@freebsd.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Kernel panics in tcp_twclose using jails + VIMAGE
Message-ID: <B0023E50-EB97-497B-8045-F1F10BBAF6AC@FreeBSD.org>
In-Reply-To: <55FC1809.3070903@freebsd.org>
References: <26B0FF93-8AE3-4514-BDA1-B966230AAB65@FreeBSD.org> <55FC1809.3070903@freebsd.org>

> On 18 Sep 2015, at 15:56, Julien Charbon <jch@freebsd.org> wrote:
>
> Hi Palle,
>
> On 18/09/15 11:12, Palle Girgensohn wrote:
>> We see daily panics on our production systems (web server, apache
>> running MPM event, openjdk8. Kernel with VIMAGE. Jails using netgraph
>> interfaces [not epair]).
>>
>> The problem started after the summer. Normal port upgrades seem to
>> be the only difference. The problem occurs with 10.2-p2 kernel as
>> well as 10.1-p4 and 10.1-p15.
>>
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203175
>>
>> Any ideas?
>
> Thanks for your detailed report. I am not aware of any tcp_twclose()
> related issues (without VIMAGE) since FreeBSD 10.0 (does not mean there
> are none). A few interesting facts (at least for me):
>
> - Your crash happens when unlocking an inp exclusive lock with INP_WUNLOCK()
>
> - Something is already wrong before calling turnstile_broadcast() as it
>   is called with ts = NULL:
>
> turnstile_broadcast (ts=0x0, queue=1) at /usr/src/sys/kern/subr_turnstile.c:838
> __rw_wunlock_hard () at /usr/src/sys/kern/kern_rwlock.c:988
> tcp_twclose () at /usr/src/sys/netinet/tcp_timewait.c:540
> tcp_tw_2msl_scan () at /usr/src/sys/netinet/tcp_timewait.c:748
> tcp_slowtimo () at /usr/src/sys/netinet/tcp_timer.c:198
>
> I won't go too far here as I am not expert enough in VIMAGE, but one
> question anyway:
>
> - Can you correlate this kernel panic to a particular event? Like for
>   example a VIMAGE/VNET jail destruction.
>
> I will test that on my side on a 10.2 machine.
>
> --
> Julien
>

Hi,

Thanks for your reply.

It is not related to jail destruction. It *might* be related to apache httpd (MPM event) forking during normal operation, but we have not found any specific event that triggers the panic. The system crashes during normal operation, with no excessive load (though load is at least partly responsible: a more loaded server is more likely to crash).

Note that we use netgraph, not epair, although I don't believe it makes a difference.

Palle

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - http://gpgtools.org

iQEcBAEBCAAGBQJV/B+pAAoJEIhV+7FrxBJDBb8H/RXUyYSm2xIRAT0/gIGLbQVh
rLyEOPQcbQ4ST319Gtf/Us99qy2zF973m3FMlmeeuN5hmqB9I0KHPxskD7HZKd00
5kzXAvbsot8f96629sc7Vpp62XWXpd5kvO4uNijbyuGUSbI1j3GSurKIxgq1Jc86
MOYNY2h0DDdzkbUjCYUo/4bQ3YQ+DaGrT407tU3bdYbsxSHqrfbhkiLJhJCidiYS
4t5EpVqtu1FypJaMdJCdxbmPMnk3y1HcyKai651zRSePsAXNUb4GzGJ2+RqHF+N6
rqj3mZFg0G36Xx4OQ4qe6t+1+YCMsBRiC6K0I5NKXXJEZlNvFcNguQgJafHUhIU=
=C6TS
-----END PGP SIGNATURE-----
