Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 22 Sep 2015 22:32:34 +0200
From:      Palle Girgensohn <girgen@FreeBSD.org>
To:        Julien Charbon <jch@freebsd.org>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-net@freebsd.org, Hans Petter Selasky <hps@selasky.org>
Subject:   Re: Kernel panics in tcp_twclose
Message-ID:  <A46D96ED-459C-49F5-8B9A-19144C086C22@FreeBSD.org>
In-Reply-To: <56019AF8.8000705@freebsd.org>
References:  <26B0FF93-8AE3-4514-BDA1-B966230AAB65@FreeBSD.org> <55FC1809.3070903@freebsd.org> <20150918160605.GN67105@kib.kiev.ua> <55FFBE01.6060706@freebsd.org> <3721F099-F45D-4DCD-8AB3-84D1ABC44145@FreeBSD.org> <73856F2B-3E70-483C-9988-C84E798CEB44@FreeBSD.org> <44EBAC98-4761-4E47-8E47-5032430A1C8A@FreeBSD.org> <56019AF8.8000705@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Hi,

> 22 sep 2015 kl. 20:16 skrev Julien Charbon <jch@freebsd.org>:
>=20
>=20
> Hi Palle,
>=20
> On 22/09/15 18:49, Palle Girgensohn wrote:
>>> 22 sep 2015 kl. 18:46 skrev Palle Girgensohn
>>> <girgen@FreeBSD.org>:
>>>> 21 sep 2015 kl. 15:53 skrev Palle Girgensohn
>>>> <girgen@FreeBSD.org>:
>>>>> 21 sep 2015 kl. 10:21 skrev Julien Charbon
>>>>> <jch@FreeBSD.org>: On 18/09/15 18:06, Konstantin Belousov
>>>>> wrote:
>>>>>> On Fri, Sep 18, 2015 at 03:56:25PM +0200, Julien Charbon
>>>>>> wrote:
>>>>>>> [...]
>>>>> - Second, if issue is still in stable/10, compile 10.2 kernel
>>>>> with these options:
>>>>>=20
>>>>> options        DDB options        DEADLKRES options
>>>>> INVARIANTS options        INVARIANT_SUPPORT options
>>>>> WITNESS options        WITNESS_SKIPSPIN
>>>>>=20
>>>>> To see where the original fault is coming from.
>>>> [...]
>>>>=20
>>>> I'll try stable/10 now. Would you suggest a "clean" stable/10,
>>>> or could 287621 and 287780 help?
>>>>=20
>>>> I'll add the debugging suggested options right away.
>>>>=20
>>>> Palle
>>>=20
>>> I have a new core dump from ^/stable/10 with:
>>>=20
>>> options        DDB options        DEADLKRES options
>>> INVARIANTS options        INVARIANT_SUPPORT options
>>> WITNESS options        WITNESS_SKIPSPIN
>>=20
>> # kgdb kernel /var/crash/vmcore.2 GNU gdb 6.1.1 [FreeBSD] Copyright
>> 2004 Free Software Foundation, Inc. GDB is free software, covered
>> by the GNU General Public License, and you are welcome to change it
>> and/or distribute copies of it under certain conditions. Type "show
>> copying" to see the conditions. There is absolutely no warranty for
>> GDB.  Type "show warranty" for details. This GDB was configured as
>> "amd64-marcel-freebsd"...
>>=20
>> Unread portion of the kernel message buffer: panic: tcp_detach:
>> INP_TIMEWAIT && INP_DROPPED && tp !=3D NULL cpuid =3D 16 KDB: stack
>> backtrace: db_trace_self_wrapper() at
>> db_trace_self_wrapper+0x2b/frame 0xfffffe183d9e97e0 kdb_backtrace()
>> at kdb_backtrace+0x39/frame 0xfffffe183d9e9890 vpanic() at
>> vpanic+0x126/frame 0xfffffe183d9e98d0 kassert_panic() at
>> kassert_panic+0x139/frame 0xfffffe183d9e9940 tcp_usr_detach() at
>> tcp_usr_detach+0xf9/frame 0xfffffe183d9e9970 sofree() at
>> sofree+0x1f1/frame 0xfffffe183d9e99a0 soclose() at
>> soclose+0x3a0/frame 0xfffffe183d9e99f0 _fdrop() at
>> _fdrop+0x29/frame 0xfffffe183d9e9a10 closef() at closef+0x1e2/frame
>> 0xfffffe183d9e9aa0 closefp() at closefp+0x9d/frame
>> 0xfffffe183d9e9ae0 amd64_syscall() at amd64_syscall+0x25a/frame
>> 0xfffffe183d9e9bf0 Xfast_syscall() at Xfast_syscall+0xfb/frame
>> 0xfffffe183d9e9bf0 --- syscall (6, FreeBSD ELF64, sys_close), rip =3D
>> 0x801c8d94a, rsp =3D 0x7ffff91c8668, rbp =3D 0x7ffff91c8680 --- KDB:
>> enter: panic Uptime: 18h57m59s Dumping 23085 out of 98263
>> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>=20
>> Reading symbols from /boot/kernel/nullfs.ko.symbols...done. Loaded
>> symbols for /boot/kernel/nullfs.ko.symbols Reading symbols from
>> /boot/kernel/zfs.ko.symbols...done. Loaded symbols for
>> /boot/kernel/zfs.ko.symbols Reading symbols from
>> /boot/kernel/opensolaris.ko.symbols...done. Loaded symbols for
>> /boot/kernel/opensolaris.ko.symbols Reading symbols from
>> /boot/kernel/ng_bridge.ko.symbols...done. Loaded symbols for
>> /boot/kernel/ng_bridge.ko.symbols Reading symbols from
>> /boot/kernel/netgraph.ko.symbols...done. Loaded symbols for
>> /boot/kernel/netgraph.ko.symbols Reading symbols from
>> /boot/kernel/ng_eiface.ko.symbols...done. Loaded symbols for
>> /boot/kernel/ng_eiface.ko.symbols Reading symbols from
>> /boot/kernel/ng_ether.ko.symbols...done. Loaded symbols for
>> /boot/kernel/ng_ether.ko.symbols Reading symbols from
>> /boot/kernel/accf_data.ko.symbols...done. Loaded symbols for
>> /boot/kernel/accf_data.ko.symbols Reading symbols from
>> /boot/kernel/accf_http.ko.symbols...done. Loaded symbols for
>> /boot/kernel/accf_http.ko.symbols Reading symbols from
>> /boot/kernel/ums.ko.symbols...done. Loaded symbols for
>> /boot/kernel/ums.ko.symbols Reading symbols from
>> /boot/kernel/ng_socket.ko.symbols...done. Loaded symbols for
>> /boot/kernel/ng_socket.ko.symbols Reading symbols from
>> /boot/kernel/fdescfs.ko.symbols...done. Loaded symbols for
>> /boot/kernel/fdescfs.ko.symbols #0  doadump (textdump=3D1) at
>> pcpu.h:219 219		__asm("movq %%gs:%1,%0" : "=3Dr" (td) =
(kgdb) bt #0
>> doadump (textdump=3D1) at pcpu.h:219 #1  0xffffffff8094b337 in
>> kern_reboot (howto=3D260) at /usr/src/sys/kern/kern_shutdown.c:451 #2
>> 0xffffffff8094b845 in vpanic (fmt=3D<value optimized out>, ap=3D<value
>> optimized out>) at /usr/src/sys/kern/kern_shutdown.c:758 #3
>> 0xffffffff8094b6d9 in kassert_panic (fmt=3D<value optimized out>) at
>> /usr/src/sys/kern/kern_shutdown.c:646 #4  0xffffffff80b1ee59 in
>> tcp_usr_detach (so=3D<value optimized out>) at
>> /usr/src/sys/netinet/tcp_usrreq.c:202 #5  0xffffffff809cd291 in
>> sofree (so=3D0xfffff801dd302000) at
>> /usr/src/sys/kern/uipc_socket.c:747 #6  0xffffffff809cdb00 in
>> soclose (so=3D<value optimized out>) at
>> /usr/src/sys/kern/uipc_socket.c:849 #7  0xffffffff808fe659 in
>> _fdrop (fp=3D0xfffff802a593db40, td=3D0x0) at file.h:343 #8
>> 0xffffffff80901092 in closef (fp=3D0xfffff802a593db40,
>> td=3D0xfffff80eebc894a0) at /usr/src/sys/kern/kern_descrip.c:2338 #9
>> 0xffffffff808feb5d in closefp (fdp=3D0xfffff80b20cce000, fd=3D<value
>> optimized out>, fp=3D0xfffff802a593db40, td=3D0xfffff80eebc894a0,
>> holdleaders=3D<value optimized out>) at
>> /usr/src/sys/kern/kern_descrip.c:1194 #10 0xffffffff80d7bc3a in
>> amd64_syscall (td=3D0xfffff80eebc894a0, traced=3D0) at
>> subr_syscall.c:134 #11 0xffffffff80d5f1db in Xfast_syscall () at
>> /usr/src/sys/amd64/amd64/exception.S:396 #12 0x0000000801c8d94a in
>> ?? () Previous frame inner to this frame (corrupt stack?) Current
>> language:  auto; currently minimal
>=20
> Thanks for the information.  As I suspected the initial error was
> elsewhere than tcp_twclose(), never got this assertion before:
>=20
> tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp !=3D NULL
>=20
> from here:
>=20
> static void
> tcp_detach(struct socket *so, struct inpcb *inp)
> {
>        struct tcpcb *tp;
>=20
>        INP_INFO_WLOCK_ASSERT(&V_tcbinfo);
>        INP_WLOCK_ASSERT(inp);
>=20
>        KASSERT(so->so_pcb =3D=3D inp, ("tcp_detach: so_pcb !=3D =
inp"));
>        KASSERT(inp->inp_socket =3D=3D so, ("tcp_detach: inp_socket !=3D =
so"));
>=20
>        tp =3D intotcpcb(inp);
>=20
>        if (inp->inp_flags & INP_TIMEWAIT) {
>                if (inp->inp_flags & INP_DROPPED) {
>                        KASSERT(tp =3D=3D NULL, ("tcp_detach: =
INP_TIMEWAIT
> && "
>                            "INP_DROPPED && tp !=3D NULL"));
>=20
> Let me check if I could find a path that could lead to this
> unexpected case.  Unexpected because:  INP_DROPPED and inp->inp_ppcb
> is set to NULL are set at same time here:
>=20
> void
> tcp_twclose(struct tcptw *tw, int reuse)
> {
>        struct socket *so;
>        struct inpcb *inp;
>=20
>        inp =3D tw->tw_inpcb;
>        KASSERT((inp->inp_flags & INP_TIMEWAIT), ("tcp_twclose:
> !timewait"));
>        KASSERT(intotw(inp) =3D=3D tw, ("tcp_twclose: inp_ppcb !=3D =
tw"));
>        INP_INFO_WLOCK_ASSERT(&V_tcbinfo);      /* in_pcbfree() */
>        INP_WLOCK_ASSERT(inp);
>=20
>        tcp_tw_2msl_stop(tw, reuse);
>        inp->inp_ppcb =3D NULL;
>        in_pcbdrop(inp);
>        ...
>=20
> Interesting and by the way could you try:
>=20
> # kgdb kernel /var/crash/vmcore.2
> (kgdb) info threads
>=20
> To see if other thread are also in TCP stack at the same time, and if
> one of this thread is referencing the same inp.
>=20
> Thanks.
>=20
> --
> Julien
>=20


Thanks for pursuing this. Enclosed is output of info threads. +6100 =
threads, so I compressed it. Most threads are in sched_switch, but the =
few ones below differ.

Anything else I can do? I could supply the entire core file, or access =
to the machine?=20

Palle

$ grep -v -A 1  sched info_threads2.txt=20
(kgdb) info threads
  6132 Thread 105613 (PID=3D43806: python2.7)  sched_switch =
(td=3D0xfffff803c56af940, newtd=3D<value optimized out>,=20
--
* 4206 Thread 106143 (PID=3D13296: httpd)  doadump (textdump=3D1) at =
pcpu.h:219
  4205 Thread 106142 (PID=3D13296: httpd)  sched_switch =
(td=3D0xfffff80eebc89940, newtd=3D<value optimized out>,=20
--
  1656 Thread 106919 (PID=3D10658: jsvc)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  1655 Thread 106918 (PID=3D10658: jsvc)  sched_switch =
(td=3D0xfffff804900f7000, newtd=3D<value optimized out>,=20
--
  1435 Thread 106442 (PID=3D10436: jsvc)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  1434 Thread 106441 (PID=3D10436: jsvc)  sched_switch =
(td=3D0xfffff80fe183e4a0, newtd=3D<value optimized out>,=20
--
  592 Thread 100139 (PID=3D12: intr/swi0: uart uart)  fork_trampoline () =
at /usr/src/sys/amd64/amd64/exception.S:608
  591 Thread 100138 (PID=3D12: intr/irq1: atkbd0)  sched_switch =
(td=3D0xfffff80110893000, newtd=3D<value optimized out>,=20
--
  590 Thread 100120 (PID=3D12: intr/irq22: uhci3)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  589 Thread 100114 (PID=3D12: intr/irq23: uhci2 uhci4)  fork_trampoline =
() at /usr/src/sys/amd64/amd64/exception.S:608
  588 Thread 100108 (PID=3D12: intr/irq20: hpet0 uhci1+)  sched_switch =
(td=3D0xfffff801105fa000,=20
--
  586 Thread 100101 (PID=3D12: intr/irq260: bce3)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  585 Thread 100100 (PID=3D12: intr/irq259: bce2)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  584 Thread 100099 (PID=3D12: intr/irq258: bce1)  sched_switch =
(td=3D0xfffff8011049b000, newtd=3D<value optimized out>,=20
--
  580 Thread 100084 (PID=3D12: intr/swi5: fast taskq)  fork_trampoline =
() at /usr/src/sys/amd64/amd64/exception.S:608
  579 Thread 100082 (PID=3D12: intr/swi6: Giant taskq)  sched_switch =
(td=3D0xfffff80110495940,=20
--
  578 Thread 100052 (PID=3D12: intr/swi3: vm)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  577 Thread 100051 (PID=3D12: intr/swi1: netisr 0)  sched_switch =
(td=3D0xfffff801102f4000, newtd=3D<value optimized out>,=20
--
  576 Thread 100050 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  575 Thread 100049 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  574 Thread 100048 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  573 Thread 100047 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  572 Thread 100046 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  571 Thread 100045 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  570 Thread 100044 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  569 Thread 100043 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  568 Thread 100042 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  567 Thread 100041 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  566 Thread 100040 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  565 Thread 100039 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  564 Thread 100038 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  563 Thread 100037 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  562 Thread 100036 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  561 Thread 100035 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  560 Thread 100034 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  559 Thread 100033 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  558 Thread 100032 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  557 Thread 100031 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  556 Thread 100030 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  555 Thread 100029 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  554 Thread 100028 (PID=3D12: intr/swi4: clock)  fork_trampoline () at =
/usr/src/sys/amd64/amd64/exception.S:608
  553 Thread 100027 (PID=3D12: intr/swi4: clock)  sched_switch =
(td=3D0xfffff801102cf000, newtd=3D<value optimized out>,=20
--
  552 Thread 100026 (PID=3D11: idle/idle: cpu23)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  551 Thread 100025 (PID=3D11: idle/idle: cpu22)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  550 Thread 100024 (PID=3D11: idle/idle: cpu21)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  549 Thread 100023 (PID=3D11: idle/idle: cpu20)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  548 Thread 100022 (PID=3D11: idle/idle: cpu19)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  547 Thread 100021 (PID=3D11: idle/idle: cpu18)  sched_switch =
(td=3D0xfffff801102d1000, newtd=3D<value optimized out>,=20
--
  546 Thread 100020 (PID=3D11: idle/idle: cpu17)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  545 Thread 100019 (PID=3D11: idle/idle: cpu16)  sched_switch =
(td=3D0xfffff801102d1940, newtd=3D<value optimized out>,=20
--
  544 Thread 100018 (PID=3D11: idle/idle: cpu15)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  543 Thread 100017 (PID=3D11: idle/idle: cpu14)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  542 Thread 100016 (PID=3D11: idle/idle: cpu13)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  541 Thread 100015 (PID=3D11: idle/idle: cpu12)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  540 Thread 100014 (PID=3D11: idle/idle: cpu11)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  539 Thread 100013 (PID=3D11: idle/idle: cpu10)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  538 Thread 100012 (PID=3D11: idle/idle: cpu9)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  537 Thread 100011 (PID=3D11: idle/idle: cpu8)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  536 Thread 100010 (PID=3D11: idle/idle: cpu7)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  535 Thread 100009 (PID=3D11: idle/idle: cpu6)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  534 Thread 100008 (PID=3D11: idle/idle: cpu5)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  533 Thread 100007 (PID=3D11: idle/idle: cpu4)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  532 Thread 100006 (PID=3D11: idle/idle: cpu3)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  531 Thread 100005 (PID=3D11: idle/idle: cpu2)  sched_switch =
(td=3D0xfffff801102b84a0, newtd=3D<value optimized out>,=20
--
  530 Thread 100004 (PID=3D11: idle/idle: cpu1)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  529 Thread 100003 (PID=3D11: idle/idle: cpu0)  0xffffffff80d68e88 in =
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1451
  528 Thread 100002 (PID=3D1: init)  sched_switch =
(td=3D0xfffff801102b94a0, newtd=3D<value optimized out>,=20
--



Palle



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?A46D96ED-459C-49F5-8B9A-19144C086C22>