From owner-freebsd-net@freebsd.org Tue Sep 22 20:58:24 2015 Return-Path: Delivered-To: freebsd-net@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C653CA034C6 for ; Tue, 22 Sep 2015 20:58:24 +0000 (UTC) (envelope-from girgen@FreeBSD.org) Received: from mail.pingpong.net (mail.pingpong.net [79.136.116.202]) by mx1.freebsd.org (Postfix) with ESMTP id 5020F18D9; Tue, 22 Sep 2015 20:58:24 +0000 (UTC) (envelope-from girgen@FreeBSD.org) Received: from mail.pingpong.net (localhost [127.0.0.1]) by mail.pingpong.net (Postfix) with ESMTP id 205F4DC61; Tue, 22 Sep 2015 22:58:23 +0200 (CEST) X-Virus-Scanned: by amavisd-new at pingpong.net Received: from mail.pingpong.net ([127.0.0.1]) by mail.pingpong.net (mail.pingpong.net [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 9F7RDaJoxs4j; Tue, 22 Sep 2015 22:58:22 +0200 (CEST) Received: from [10.0.1.12] (h-155-4-74-242.na.cust.bahnhof.se [155.4.74.242]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.pingpong.net (Postfix) with ESMTPSA id E2BDBDC54; Tue, 22 Sep 2015 22:58:22 +0200 (CEST) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: Kernel panics in tcp_twclose From: Palle Girgensohn In-Reply-To: <56019AF8.8000705@freebsd.org> Date: Tue, 22 Sep 2015 22:58:22 +0200 Cc: Konstantin Belousov , freebsd-net@freebsd.org, Hans Petter Selasky Content-Transfer-Encoding: quoted-printable Message-Id: References: <26B0FF93-8AE3-4514-BDA1-B966230AAB65@FreeBSD.org> <55FC1809.3070903@freebsd.org> <20150918160605.GN67105@kib.kiev.ua> <55FFBE01.6060706@freebsd.org> <3721F099-F45D-4DCD-8AB3-84D1ABC44145@FreeBSD.org> <73856F2B-3E70-483C-9988-C84E798CEB44@FreeBSD.org> <44EBAC98-4761-4E47-8E47-5032430A1C8A@FreeBSD.org> <56019AF8.8000705@freebsd.org> To: Julien Charbon X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Sep 2015 20:58:25 -0000 > 22 sep 2015 kl. 20:16 skrev Julien Charbon : >=20 >=20 > Hi Palle, >=20 > On 22/09/15 18:49, Palle Girgensohn wrote: >>> 22 sep 2015 kl. 18:46 skrev Palle Girgensohn >>> : >>>> 21 sep 2015 kl. 15:53 skrev Palle Girgensohn >>>> : >>>>> 21 sep 2015 kl. 10:21 skrev Julien Charbon >>>>> : On 18/09/15 18:06, Konstantin Belousov >>>>> wrote: >>>>>> On Fri, Sep 18, 2015 at 03:56:25PM +0200, Julien Charbon >>>>>> wrote: >>>>>>> [...] >>>>> - Second, if issue is still in stable/10, compile 10.2 kernel >>>>> with these options: >>>>>=20 >>>>> options DDB options DEADLKRES options >>>>> INVARIANTS options INVARIANT_SUPPORT options >>>>> WITNESS options WITNESS_SKIPSPIN >>>>>=20 >>>>> To see where the original fault is coming from. >>>> [...] >>>>=20 >>>> I'll try stable/10 now. Would you suggest a "clean" stable/10, >>>> or could 287621 and 287780 help? >>>>=20 >>>> I'll add the debugging suggested options right away. >>>>=20 >>>> Palle >>>=20 >>> I have a new core dump from ^/stable/10 with: >>>=20 >>> options DDB options DEADLKRES options >>> INVARIANTS options INVARIANT_SUPPORT options >>> WITNESS options WITNESS_SKIPSPIN >>=20 >> # kgdb kernel /var/crash/vmcore.2 GNU gdb 6.1.1 [FreeBSD] Copyright >> 2004 Free Software Foundation, Inc. GDB is free software, covered >> by the GNU General Public License, and you are welcome to change it >> and/or distribute copies of it under certain conditions. Type "show >> copying" to see the conditions. There is absolutely no warranty for >> GDB. Type "show warranty" for details. This GDB was configured as >> "amd64-marcel-freebsd"... >>=20 >> Unread portion of the kernel message buffer: panic: tcp_detach: >> INP_TIMEWAIT && INP_DROPPED && tp !=3D NULL cpuid =3D 16 KDB: stack >> backtrace: db_trace_self_wrapper() at >> db_trace_self_wrapper+0x2b/frame 0xfffffe183d9e97e0 kdb_backtrace() >> at kdb_backtrace+0x39/frame 0xfffffe183d9e9890 vpanic() at >> vpanic+0x126/frame 0xfffffe183d9e98d0 kassert_panic() at >> kassert_panic+0x139/frame 0xfffffe183d9e9940 tcp_usr_detach() at >> tcp_usr_detach+0xf9/frame 0xfffffe183d9e9970 sofree() at >> sofree+0x1f1/frame 0xfffffe183d9e99a0 soclose() at >> soclose+0x3a0/frame 0xfffffe183d9e99f0 _fdrop() at >> _fdrop+0x29/frame 0xfffffe183d9e9a10 closef() at closef+0x1e2/frame >> 0xfffffe183d9e9aa0 closefp() at closefp+0x9d/frame >> 0xfffffe183d9e9ae0 amd64_syscall() at amd64_syscall+0x25a/frame >> 0xfffffe183d9e9bf0 Xfast_syscall() at Xfast_syscall+0xfb/frame >> 0xfffffe183d9e9bf0 --- syscall (6, FreeBSD ELF64, sys_close), rip =3D >> 0x801c8d94a, rsp =3D 0x7ffff91c8668, rbp =3D 0x7ffff91c8680 --- KDB: >> enter: panic Uptime: 18h57m59s Dumping 23085 out of 98263 >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91% >>=20 >> Reading symbols from /boot/kernel/nullfs.ko.symbols...done. Loaded >> symbols for /boot/kernel/nullfs.ko.symbols Reading symbols from >> /boot/kernel/zfs.ko.symbols...done. Loaded symbols for >> /boot/kernel/zfs.ko.symbols Reading symbols from >> /boot/kernel/opensolaris.ko.symbols...done. Loaded symbols for >> /boot/kernel/opensolaris.ko.symbols Reading symbols from >> /boot/kernel/ng_bridge.ko.symbols...done. Loaded symbols for >> /boot/kernel/ng_bridge.ko.symbols Reading symbols from >> /boot/kernel/netgraph.ko.symbols...done. Loaded symbols for >> /boot/kernel/netgraph.ko.symbols Reading symbols from >> /boot/kernel/ng_eiface.ko.symbols...done. Loaded symbols for >> /boot/kernel/ng_eiface.ko.symbols Reading symbols from >> /boot/kernel/ng_ether.ko.symbols...done. Loaded symbols for >> /boot/kernel/ng_ether.ko.symbols Reading symbols from >> /boot/kernel/accf_data.ko.symbols...done. Loaded symbols for >> /boot/kernel/accf_data.ko.symbols Reading symbols from >> /boot/kernel/accf_http.ko.symbols...done. Loaded symbols for >> /boot/kernel/accf_http.ko.symbols Reading symbols from >> /boot/kernel/ums.ko.symbols...done. Loaded symbols for >> /boot/kernel/ums.ko.symbols Reading symbols from >> /boot/kernel/ng_socket.ko.symbols...done. Loaded symbols for >> /boot/kernel/ng_socket.ko.symbols Reading symbols from >> /boot/kernel/fdescfs.ko.symbols...done. Loaded symbols for >> /boot/kernel/fdescfs.ko.symbols #0 doadump (textdump=3D1) at >> pcpu.h:219 219 __asm("movq %%gs:%1,%0" : "=3Dr" (td) = (kgdb) bt #0 >> doadump (textdump=3D1) at pcpu.h:219 #1 0xffffffff8094b337 in >> kern_reboot (howto=3D260) at /usr/src/sys/kern/kern_shutdown.c:451 #2 >> 0xffffffff8094b845 in vpanic (fmt=3D, ap=3D> optimized out>) at /usr/src/sys/kern/kern_shutdown.c:758 #3 >> 0xffffffff8094b6d9 in kassert_panic (fmt=3D) at >> /usr/src/sys/kern/kern_shutdown.c:646 #4 0xffffffff80b1ee59 in >> tcp_usr_detach (so=3D) at >> /usr/src/sys/netinet/tcp_usrreq.c:202 #5 0xffffffff809cd291 in >> sofree (so=3D0xfffff801dd302000) at >> /usr/src/sys/kern/uipc_socket.c:747 #6 0xffffffff809cdb00 in >> soclose (so=3D) at >> /usr/src/sys/kern/uipc_socket.c:849 #7 0xffffffff808fe659 in >> _fdrop (fp=3D0xfffff802a593db40, td=3D0x0) at file.h:343 #8 >> 0xffffffff80901092 in closef (fp=3D0xfffff802a593db40, >> td=3D0xfffff80eebc894a0) at /usr/src/sys/kern/kern_descrip.c:2338 #9 >> 0xffffffff808feb5d in closefp (fdp=3D0xfffff80b20cce000, fd=3D> optimized out>, fp=3D0xfffff802a593db40, td=3D0xfffff80eebc894a0, >> holdleaders=3D) at >> /usr/src/sys/kern/kern_descrip.c:1194 #10 0xffffffff80d7bc3a in >> amd64_syscall (td=3D0xfffff80eebc894a0, traced=3D0) at >> subr_syscall.c:134 #11 0xffffffff80d5f1db in Xfast_syscall () at >> /usr/src/sys/amd64/amd64/exception.S:396 #12 0x0000000801c8d94a in >> ?? () Previous frame inner to this frame (corrupt stack?) Current >> language: auto; currently minimal >=20 > Thanks for the information. As I suspected the initial error was > elsewhere than tcp_twclose(), never got this assertion before: >=20 > tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp !=3D NULL >=20 > from here: >=20 > static void > tcp_detach(struct socket *so, struct inpcb *inp) > { > struct tcpcb *tp; >=20 > INP_INFO_WLOCK_ASSERT(&V_tcbinfo); > INP_WLOCK_ASSERT(inp); >=20 > KASSERT(so->so_pcb =3D=3D inp, ("tcp_detach: so_pcb !=3D = inp")); > KASSERT(inp->inp_socket =3D=3D so, ("tcp_detach: inp_socket !=3D = so")); >=20 > tp =3D intotcpcb(inp); >=20 > if (inp->inp_flags & INP_TIMEWAIT) { > if (inp->inp_flags & INP_DROPPED) { > KASSERT(tp =3D=3D NULL, ("tcp_detach: = INP_TIMEWAIT > && " > "INP_DROPPED && tp !=3D NULL")); >=20 > Let me check if I could find a path that could lead to this > unexpected case. Unexpected because: INP_DROPPED and inp->inp_ppcb > is set to NULL are set at same time here: >=20 > void > tcp_twclose(struct tcptw *tw, int reuse) > { > struct socket *so; > struct inpcb *inp; >=20 > inp =3D tw->tw_inpcb; > KASSERT((inp->inp_flags & INP_TIMEWAIT), ("tcp_twclose: > !timewait")); > KASSERT(intotw(inp) =3D=3D tw, ("tcp_twclose: inp_ppcb !=3D = tw")); > INP_INFO_WLOCK_ASSERT(&V_tcbinfo); /* in_pcbfree() */ > INP_WLOCK_ASSERT(inp); >=20 > tcp_tw_2msl_stop(tw, reuse); > inp->inp_ppcb =3D NULL; > in_pcbdrop(inp); > ... >=20 > Interesting and by the way could you try: >=20 > # kgdb kernel /var/crash/vmcore.2 > (kgdb) info threads >=20 > To see if other thread are also in TCP stack at the same time, and if > one of this thread is referencing the same inp. >=20 > Thanks. >=20 > -- > Julien >=20 BTW, this backtrace looks quite different from the previous ones? Is it = a different problem, or just a different way to reveal the same problem? Palle