Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 05 Apr 2006 11:50:11 +0900
From:      Kazuaki Oda <kaakun@highway.ne.jp>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: kernel panic: page fault
Message-ID:  <44333063.70606@highway.ne.jp>
In-Reply-To: <20060404141813.H22854@fledge.watson.org>
References:  <4430FAAF.2040809@highway.ne.jp> <20060403133210.U36756@fledge.watson.org> <44311AB5.2010407@highway.ne.jp> <20060404141813.H22854@fledge.watson.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Robert Watson wrote:
> 
> On Mon, 3 Apr 2006, Kazuaki Oda wrote:
> 
>>> Also, are you running with INVARIANTS and/or WITNESS?
>>
>>
>> Sorry, I compiled kernel without INVARIANTS and WITNESS.
> 
> 
> No problem at all -- the debugging information you have sent me is 
> enough to track down the source of the problem.  It looks like we have 
> an inconsistency in how we handle (especially in my new world order) the 
> recycling of timewait state for an inpcb that is still present.  I've 
> committed a work-around which should prevent the panic you're seeing, 
> but I need to investigate a bit more before I can commit a full solution.
> 
> For those interested, the problem is how to handle sockets with attached 
> inpcbs that represent closed or time wait TCP connections.  This can 
> happen if shutdown() is called on a socket, kicking the TCP state engine 
> into a close cycle, rather than a reset.  In the current world order, 
> the following sets of socket, pcb, and ppcb protocol state can occur:
> 
> fd -> socket <-> inpcb <-> tcpcb    Normal TCP socket in various states.
> fd -> socket <-> inpcb <-> twtcp    Unclosed TCP socket in time wait.
> fd -> socket <-> inpcb <-> NULL        Unclosed TCP socket after tw 
> recycle.
>       socket <-> inpcb <-> tcpcb        Socket closed, buffer still needed.
>                  inpcb <-> twtcp        Socket closed, time wait.
> 
> The problem was that the middle case exists, but was not accounted for. 
> There's another problem that is still present in the new socket/pcb 
> model, in which the inpcb of an open socket with a closed TCP connection 
> continues to reserve the address/port combination.  This is related to 
> the inpcb without twtcp case, where we recycle the twtcp, but can't 
> recylce the inpcb immediately because there's still an fd reference to 
> the socket, and hence a socket reference to the inpcb.
> 
> My current leaning is to do the following:
> 
> - Since we now keep inpcb's around for the lifetime of the socket, either
>   teach the inpcb lookup routines to ignore INP_DROPPED, or to move them to
>   another inpcb list for open but dropped inpcb's.  This will avoid the
>   reservation hanging around.
> 
> - Either eliminate the inp_ppcb pointer NULL case by prohibiting 
> recycling of
>   the twtcp state of a socket that is still open, or more formally 
> supporting
>   it through the previous change.  The trick is to prevent the twtcp/inpcb
>   pairs from turning up and being used during input and allocation 
> collision
>   processing.
> 
> The summary is that we're not quite there yet in terms of how it all 
> should be working, but that we should avoid the panic for now due to the 
> workarounds I committed (which basically are changes to handle the 
> inp_ppcb pointer being NULL for INP_TIMEWAIT sockets).
> 
> Thanks!
> 
> Robert N M Watson
> 
> 

I re-cvsup'ed today and rebuilt kernel.  After reboot with new kernel, I got:


# kgdb kernel.debug /var/crash/vmcore.5
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc071886b
stack pointer           = 0x28:0xd4422b74
frame pointer           = 0x28:0xd4422b80
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 14 (swi1: net)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 9m26s
Dumping 510 MB (2 chunks)
   chunk 0: 1MB (158 pages) ... ok
   chunk 1: 510MB (130544 pages) 494 478 462 446 430 414 398 382 366 350 334 318 
302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14

#0  doadump () at pcpu.h:166
166             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:166
#1  0xc066f042 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0xc066f369 in panic (fmt=0xc088de83 "%s")
     at /usr/src/sys/kern/kern_shutdown.c:558
#3  0xc083ca50 in trap_fatal (frame=0xd4422b34, eva=0)
     at /usr/src/sys/i386/i386/trap.c:870
#4  0xc083c78f in trap_pfault (frame=0xd4422b34, usermode=0, eva=0)
     at /usr/src/sys/i386/i386/trap.c:778
#5  0xc083c3a5 in trap (frame=
       {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = -1014804444, tf_esi = 0, 
tf_ebp = -733860992, tf_isp = -733861024, tf_ebx = -997587984, tf_edx = 
-997587984, tf_ecx = -1020269616, tf_eax = -733860800, tf_trapno = 12, tf_err = 
0, tf_eip = -1066301333, tf_cs = 32, tf_eflags = 66118, tf_esp = -997587984, 
tf_ss = 0})
     at /usr/src/sys/i386/i386/trap.c:463
#6  0xc082841a in calltrap () at /usr/src/sys/i386/i386/exception.s:138
#7  0xc071886b in tcp_timewait (tw=0x0, to=0xd4422c40, th=0xc3835024,
     m=0xc3801b00, tlen=0) at pcpu.h:166
#8  0xc0715a44 in tcp_input (m=0xc3801b00, off0=20)
     at /usr/src/sys/netinet/tcp_input.c:763
#9  0xc070ee6d in ip_input (m=0xc3801b00)
     at /usr/src/sys/netinet/ip_input.c:656
#10 0xc06eb98b in netisr_processqueue (ni=0xc0971e38)
     at /usr/src/sys/net/netisr.c:236
#11 0xc06ebb8a in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
#12 0xc0659e49 in ithread_execute_handlers (p=0xc32fd468, ie=0xc333f100)
     at /usr/src/sys/kern/kern_intr.c:662
#13 0xc0659f69 in ithread_loop (arg=0xc32dc840)
     at /usr/src/sys/kern/kern_intr.c:745
#14 0xc0658d61 in fork_exit (callout=0xc0659f14 <ithread_loop>,
     arg=0xc32dc840, frame=0xd4422d38) at /usr/src/sys/kern/kern_fork.c:819
#15 0xc082847c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:199
(kgdb) frame 8
#8  0xc0715a44 in tcp_input (m=0xc3801b00, off0=20)
     at /usr/src/sys/netinet/tcp_input.c:763
763                     if (tcp_timewait(intotw(inp), &to, th, m, tlen))
(kgdb) p inp
$1 = (struct inpcb *) 0xc48a03f0
(kgdb) p *inp
$2 = {inp_hash = {le_next = 0xc48ad690, le_prev = 0xc4962348}, inp_list = {
     le_next = 0xc48cddc8, le_prev = 0xc48d2dd0}, inp_flow = 0, inp_inc = {
     inc_flags = 0 '\0', inc_len = 0 '\0', inc_pad = 0, inc_ie = {
       ie_fport = 4364, ie_lport = 20480, ie_dependfaddr = {ie46_foreign = {
           ia46_pad32 = {0, 0, 0}, ia46_addr4 = {s_addr = 84650176}},
         ie6_foreign = {__u6_addr = {
             __u6_addr8 = '\0' <repeats 12 times>, "ĀĻ\v\005", __u6_addr16 = {
               0, 0, 0, 0, 0, 0, 43200, 1291}, __u6_addr32 = {0, 0, 0,
               84650176}}}}, ie_dependladdr = {ie46_local = {ia46_pad32 = {0,
             0, 0}, ia46_addr4 = {s_addr = 51095744}}, ie6_local = {
           __u6_addr = {__u6_addr8 = '\0' <repeats 12 times>, "ĀĻ\v\003",
             __u6_addr16 = {0, 0, 0, 0, 0, 0, 43200, 779}, __u6_addr32 = {0, 0,
               0, 51095744}}}}}}, inp_ppcb = 0x0, inp_pcbinfo = 0xc0972ba0,
   inp_socket = 0xc494a530, inp_label = 0x0, inp_flags = 8388608, inp_sp = 0x0,
   inp_vflag = 41 ')', inp_ip_ttl = 64 '@', inp_ip_p = 0 '\0',
   inp_ip_minttl = 0 '\0', inp_depend4 = {inp4_ip_tos = 0 '\0',
     inp4_options = 0x0, inp4_moptions = 0x0}, inp_depend6 = {
     inp6_options = 0x0, inp6_outputopts = 0x0, inp6_moptions = 0x0,
     inp6_icmp6filt = 0x0, inp6_cksum = 0, inp6_ifindex = 0, inp6_hops = 0},
   inp_portlist = {le_next = 0xc48cddc8, le_prev = 0xc48d2e44},
   inp_phd = 0xc341e780, inp_gencnt = 926118, inp_mtx = {mtx_object = {
       lo_name = 0xc08b6e09 "inp", lo_type = 0xc08b4936 "tcpinp",
       lo_flags = 21692416, lo_witness_data = {lod_list = {stqe_next = 0x0},
         lod_witness = 0x0}}, mtx_lock = 3274697680, mtx_recurse = 0}}
(kgdb) p *inp->inp_socket
$3 = {so_count = 1, so_type = 1, so_options = 12, so_linger = 0,
   so_state = 8192, so_qstate = 0, so_pcb = 0xc48a03f0, so_proto = 0xc093a7c8,
   so_head = 0x0, so_incomp = {tqh_first = 0x0, tqh_last = 0x0}, so_comp = {
     tqh_first = 0x0, tqh_last = 0x0}, so_list = {tqe_next = 0x0,
     tqe_prev = 0xc37a4bd0}, so_qlen = 0, so_incqlen = 0, so_qlimit = 0,
   so_timeo = 0, so_error = 0, so_sigio = 0x0, so_oobmark = 0, so_aiojobq = {
     tqh_first = 0x0, tqh_last = 0xc494a578}, so_rcv = {sb_sel = {si_thrlist = {
         tqe_next = 0x0, tqe_prev = 0xc3ecaa50}, si_thread = 0x0, si_note = {
         kl_list = {slh_first = 0x0}, kl_lock = 0xc0653644 <knlist_mtx_lock>,
         kl_unlock = 0xc065367c <knlist_mtx_unlock>,
         kl_locked = 0xc06536b8 <knlist_mtx_locked>, kl_lockarg = 0xc494a5a4},
       si_flags = 0}, sb_mtx = {mtx_object = {lo_name = 0xc08adcf7 "so_rcv",
         lo_type = 0xc08adcf7 "so_rcv", lo_flags = 16973824, lo_witness_data = {
           lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4,
       mtx_recurse = 0}, sb_state = 32, sb_mb = 0x0, sb_mbtail = 0x0,
     sb_lastrecord = 0x0, sb_cc = 0, sb_hiwat = 66608, sb_mbcnt = 0,
     sb_mbmax = 262144, sb_ctl = 0, sb_lowat = 1, sb_timeo = 0, sb_flags = 0},
   so_snd = {sb_sel = {si_thrlist = {tqe_next = 0x0, tqe_prev = 0x0},
       si_thread = 0x0, si_note = {kl_list = {slh_first = 0x0},
         kl_lock = 0xc0653644 <knlist_mtx_lock>,
         kl_unlock = 0xc065367c <knlist_mtx_unlock>,
         kl_locked = 0xc06536b8 <knlist_mtx_locked>, kl_lockarg = 0xc494a610},
       si_flags = 0}, sb_mtx = {mtx_object = {lo_name = 0xc08adcf0 "so_snd",
         lo_type = 0xc08adcf0 "so_snd", lo_flags = 16973824, lo_witness_data = {
           lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4,
       mtx_recurse = 0}, sb_state = 16, sb_mb = 0x0, sb_mbtail = 0x0,
     sb_lastrecord = 0x0, sb_cc = 0, sb_hiwat = 33304, sb_mbcnt = 0,
     sb_mbmax = 262144, sb_ctl = 0, sb_lowat = 2048, sb_timeo = 0,
     sb_flags = 0}, so_upcall = 0, so_upcallarg = 0x0, so_cred = 0xc3b91600,
   so_label = 0x0, so_peerlabel = 0x0, so_gencnt = 622463, so_emuldata = 0x0,
   so_accf = 0x0}
(kgdb) p *inp->inp_ppcb
Attempt to dereference a generic pointer.
(kgdb) frame 7
#7  0xc071886b in tcp_timewait (tw=0x0, to=0xd4422c40, th=0xc3835024,
     m=0xc3801b00, tlen=0) at pcpu.h:166
166             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) p tw
$4 = (struct tcptw *) 0x0
(kgdb) p *tw
Cannot access memory at address 0x0
(kgdb) p *to
$5 = {to_flags = 49, to_tsval = 97163995, to_tsecr = 0, to_mss = 1460,
   to_requested_s_scale = 0 '\0', to_nsacks = 0 '\0', to_sacks = 0x0}
(kgdb) p *th
$6 = {th_sport = 4364, th_dport = 20480, th_seq = 2749703150, th_ack = 0,
   th_x2 = 0, th_off = 10, th_flags = 2 '\002', th_win = 57344, th_sum = 0,
   th_urp = 0}
(kgdb)


Is more information required?

--
Kazuaki Oda



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?44333063.70606>