Date: Fri, 18 Sep 2015 16:28:57 +0200
From: Palle Girgensohn <girgen@FreeBSD.org>
To: Julien Charbon <jch@freebsd.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Kernel panics in tcp_twclose using jails + VIMAGE
Message-ID: <B0023E50-EB97-497B-8045-F1F10BBAF6AC@FreeBSD.org>
In-Reply-To: <55FC1809.3070903@freebsd.org>
References: <26B0FF93-8AE3-4514-BDA1-B966230AAB65@FreeBSD.org> <55FC1809.3070903@freebsd.org>

> On 18 Sep 2015, at 15:56, Julien Charbon <jch@freebsd.org> wrote:
>
> Hi Palle,
>
> On 18/09/15 11:12, Palle Girgensohn wrote:
>> We see daily panics on our production systems (web server, apache
>> running MPM event, openjdk8. Kernel with VIMAGE. Jails using netgraph
>> interfaces [not epair]).
>>
>> The problem started after the summer. Normal port upgrades seem to
>> be the only difference. The problem occurs with 10.2-p2 kernel as
>> well as 10.1-p4 and 10.1-p15.
>>
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203175
>>
>> Any ideas?
>
> Thanks for your detailed report. I am not aware of any tcp_twclose()
> related issues (without VIMAGE) since FreeBSD 10.0 (does not mean there
> are none). A few interesting facts (at least for me):
>
> - Your crash happens when unlocking an inp exclusive lock with
>   INP_WUNLOCK()
>
> - Something is already wrong before calling turnstile_broadcast() as it
>   is called with ts = NULL:
>
> turnstile_broadcast (ts=0x0, queue=1) at
> /usr/src/sys/kern/subr_turnstile.c:838
> __rw_wunlock_hard () at /usr/src/sys/kern/kern_rwlock.c:988
> tcp_twclose () at /usr/src/sys/netinet/tcp_timewait.c:540
> tcp_tw_2msl_scan () at /usr/src/sys/netinet/tcp_timewait.c:748
> tcp_slowtimo () at /usr/src/sys/netinet/tcp_timer.c:198
>
> I won't go too far here as I am not expert enough in VIMAGE, but one
> question anyway:
>
> - Can you correlate this kernel panic to a particular event? Like for
>   example a VIMAGE/VNET jail destruction.
>
> I will test that on my side on a 10.2 machine.
>
> --
> Julien
>

Hi,

Thanks for your reply.

It is not related to jail destruction. It *might* be related to apache
httpd (MPM event) forking during normal operation, but we have not found
any specific event that triggers the panic.
The system crashes during normal operation, with no excessive load
(although load is at least partly responsible: a more heavily loaded
server is more likely to crash).

Note that we use netgraph, not epair, although I don't believe it makes
a difference.

Palle