From owner-freebsd-hackers@FreeBSD.ORG  Sun May 28 20:34:50 2006
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
X-Original-To: freebsd-hackers@freebsd.org
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 490EE16CA55
	for <freebsd-hackers@freebsd.org>; Sun, 28 May 2006 20:29:36 +0000 (UTC)
	(envelope-from lists-freebsd@silverwraith.com)
Received: from pear.silverwraith.com (pear.silverwraith.com [69.12.167.160])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A2AEA43D68
	for <freebsd-hackers@freebsd.org>; Sun, 28 May 2006 20:29:29 +0000 (GMT)
	(envelope-from lists-freebsd@silverwraith.com)
Received: from avleen by pear.silverwraith.com with local (Exim 4.61 (FreeBSD))
	(envelope-from <lists-freebsd@silverwraith.com>) id 1FkRtw-0002lD-3e
	for freebsd-hackers@freebsd.org; Sun, 28 May 2006 13:30:16 -0700
Date: Sun, 28 May 2006 13:30:16 -0700
From: Avleen Vig <lists-freebsd@silverwraith.com>
To: freebsd-hackers@freebsd.org
Message-ID: <20060528203015.GA8791@silverwraith.com>
References: <20060512220019.GA1911@silverwraith.com>
	<20060512223919.GA21382@fonon.realnet>
	<20060513014020.GE1911@silverwraith.com>
	<20060513074033.GA1236@fonon.realnet>
	<20060515175802.GA727@silverwraith.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20060515175802.GA727@silverwraith.com>
User-Agent: Mutt/1.5.11
Subject: 6.1 crash data (was: Re: no core file handler recognizes format)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 28 May 2006 20:34:53 -0000

On Mon, May 15, 2006 at 10:58:02AM -0700, Avleen Vig wrote:
> On Sat, May 13, 2006 at 11:40:33AM +0400, Stanislav Sedov wrote:
> > Rebuild your kernel with INVARIANTS enabled and debug info. It will
> > provide more information in case the crash happens again.

Ok, I finally got a core file with the crash :-)
Here's some of what kgdb tells me.
As far as I can tell, the crash happened while trying to set a TOS value
for outbound network packets.
Help please?


[root@gooseberry] ~ # kgdb -c /var/crash/vmcore.0 /usr/obj/usr/src/sys/GOOSEBERRY/kernel.debug
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
       
Unread portion of the kernel message buffer:
       
       
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x58
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc05efa9a
stack pointer           = 0x28:0xd6cb7ae0
frame pointer           = 0x28:0xd6cb7b10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 20115 (python)
trap number             = 12
panic: page fault
Uptime: 10d6h22m19s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 511MB (130800 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:165
#1  0xc0553492 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0xc05537ac in panic (fmt=0xc071873f "%s")
    at /usr/src/sys/kern/kern_shutdown.c:558
#3  0xc06fc00c in trap_fatal (frame=0xd6cb7aa0, eva=0)
    at /usr/src/sys/i386/i386/trap.c:836
#4  0xc06fbd17 in trap_pfault (frame=0xd6cb7aa0, usermode=0, eva=88)
    at /usr/src/sys/i386/i386/trap.c:744
#5  0xc06fb94d in trap (frame=
      {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 0, tf_esi = -691307388, tf_ebp = -691307760, tf_isp = -691307828, tf_ebx = 0, tf_edx = -691307120, tf_ecx = 0, tf_eax = 8, tf_trapno = 12, tf_err = 2, tf_eip = -1067517286, tf_cs = 32, tf_eflags = 66183, tf_esp = -691307388, tf_ss = -691307784})
    at /usr/src/sys/i386/i386/trap.c:434
#6  0xc06e994a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc05efa9a in ip_ctloutput (so=0x8, sopt=0xd6cb7c84)
    at /usr/src/sys/netinet/ip_output.c:1210
#8  0xc0601ad1 in tcp_ctloutput (so=0xc57aede8, sopt=0xd6cb7c84)
    at /usr/src/sys/netinet/tcp_usrreq.c:1038
#9  0xc05971a7 in sosetopt (so=0xc57aede8, sopt=0xd6cb7c84)
    at /usr/src/sys/kern/uipc_socket.c:1560
#10 0xc059cec9 in kern_setsockopt (td=0xc4b03900, s=8, level=8, name=8,
    val=0xbfbfab68, valseg=UIO_USERSPACE, valsize=0)
    at /usr/src/sys/kern/uipc_syscalls.c:1351
#11 0xc059cdee in setsockopt (td=0x8, uap=0xd6cb7d90)
    at /usr/src/sys/kern/uipc_syscalls.c:1307
#12 0xc06fc322 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = -1077957792, tf_esi = -1077957784, tf_ebp = -1077957768, tf_isp = -691307164, tf_ebx = 708028888, tf_edx = 170620760, tf_ecx = -1077958488, tf_eax = 105, tf_trapno = 22, tf_err = 2, tf_eip = 673659967, tf_cs = 51, tf_eflags = 662, tf_esp = -1077957844, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:981
#13 0xc06e999f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#14 0x00000033 in ?? ()


(kgdb) up 7
#7  0xc05efa9a in ip_ctloutput (so=0x8, sopt=0xd6cb7c84)
    at /usr/src/sys/netinet/ip_output.c:1210
1210                                    inp->inp_ip_tos = optval;

(kgdb) p optval
$1 = 8

(kgdb) p inp
$2 = (struct inpcb *) 0x0

(kgdb) p inp->inp_ip_tos
There is no member named inp_ip_tos.

(kgdb) p inp->inp_depend4.inp4_ip_tos
Cannot access memory at address 0x58
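
The numbers in this session can be cross-checked with a few lines of
Python. This is only a sketch: to_u32 is a hypothetical helper name, and
reading 0x58 as a member offset from a NULL inp is an inference from the
output above, not something kgdb states.

```python
# Sketch: kgdb prints i386 trap-frame registers as signed 32-bit integers.
# to_u32 is a hypothetical helper name, not part of kgdb.

def to_u32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF

# tf_eip and tf_ebp from frame #5 should match the instruction pointer
# and frame pointer printed in the panic message.
print(hex(to_u32(-1067517286)))  # 0xc05efa9a
print(hex(to_u32(-691307760)))   # 0xd6cb7b10

# eva=88 in trap_pfault is the fault virtual address in decimal.
eva = 88
print(hex(eva))                  # 0x58

# Assumed interpretation: with inp == NULL, the faulting write lands at
# the offset of the TOS member, i.e. 0 + 0x58.
inp = 0x0
print(hex(inp + eva))            # 0x58
```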

**** Here I went up one more, to #8:

(kgdb) up 1
#8  0xc0601ad1 in tcp_ctloutput (so=0xc57aede8, sopt=0xd6cb7c84)
    at /usr/src/sys/netinet/tcp_usrreq.c:1038
1038                    error = ip_ctloutput(so, sopt);

(kgdb) p *so
$14 = {so_count = 1, so_type = 1, so_options = 4, so_linger = 0,
  so_state = 8448, so_qstate = 0, so_pcb = 0x0, so_proto = 0xc076e588,
  so_head = 0x0, so_incomp = {tqh_first = 0x0, tqh_last = 0x0}, so_comp = {
    tqh_first = 0x0, tqh_last = 0x0}, so_list = {tqe_next = 0x0,
    tqe_prev = 0xc3baa5b4}, so_qlen = 0, so_incqlen = 0, so_qlimit = 0,
  so_timeo = 0, so_error = 54, so_sigio = 0x0, so_oobmark = 0, so_aiojobq = {
    tqh_first = 0x0, tqh_last = 0xc57aee30}, so_rcv = {sb_sel = {si_thrlist = {
        tqe_next = 0x0, tqe_prev = 0x0}, si_thread = 0x0, si_note = {
        kl_list = {slh_first = 0x0}, kl_lock = 0xc0535980 <knlist_mtx_lock>,
        kl_unlock = 0xc05359b0 <knlist_mtx_unlock>,
        kl_locked = 0xc05359e0 <knlist_mtx_locked>, kl_lockarg = 0xc57aee5c},
      si_flags = 0}, sb_mtx = {mtx_object = {lo_class = 0xc0764584,
        lo_name = 0xc07312d1 "so_rcv", lo_type = 0xc07312d1 "so_rcv",
        lo_flags = 196608, lo_list = {tqe_next = 0x0, tqe_prev = 0x0},
        lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, sb_state = 32,
    sb_mb = 0x0, sb_mbtail = 0x0, sb_lastrecord = 0x0, sb_cc = 0,
    sb_hiwat = 65700, sb_mbcnt = 0, sb_mbmax = 525600, sb_ctl = 0,
    sb_lowat = 1, sb_timeo = 0, sb_flags = 0}, so_snd = {sb_sel = {
      si_thrlist = {tqe_next = 0x0, tqe_prev = 0x0}, si_thread = 0x0,
      si_note = {kl_list = {slh_first = 0x0},
        kl_lock = 0xc0535980 <knlist_mtx_lock>,
        kl_unlock = 0xc05359b0 <knlist_mtx_unlock>,
        kl_locked = 0xc05359e0 <knlist_mtx_locked>, kl_lockarg = 0xc57aeed4},
      si_flags = 0}, sb_mtx = {mtx_object = {lo_class = 0xc0764584,
        lo_name = 0xc07312ca "so_snd", lo_type = 0xc07312ca "so_snd",
        lo_flags = 196608, lo_list = {tqe_next = 0x0, tqe_prev = 0x0},
        lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, sb_state = 16,
    sb_mb = 0x0, sb_mbtail = 0x0, sb_lastrecord = 0x0, sb_cc = 0,
    sb_hiwat = 33580, sb_mbcnt = 0, sb_mbmax = 268640, sb_ctl = 0,
    sb_lowat = 2048, sb_timeo = 0, sb_flags = 0}, so_upcall = 0,
  so_upcallarg = 0x0, so_cred = 0xc54fc880, so_label = 0x0,
  so_peerlabel = 0x0, so_gencnt = 1765445, so_emuldata = 0x0, so_accf = 0x0}
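
A few fields in the dump above are easier to read once decoded. The
sketch below does that for so_state and so_error. The constant values
are assumptions taken from the FreeBSD 6.x <sys/socketvar.h> and
<sys/errno.h> headers and should be checked against the source tree
that built this kernel; decode_flags is a hypothetical helper.

```python
# Sketch: decode so_state and so_error from the struct socket dump.
# Values are assumed from FreeBSD 6.x headers; verify before relying on them.

SS_FLAGS = {
    0x0001: "SS_NOFDREF",
    0x0002: "SS_ISCONNECTED",
    0x0004: "SS_ISCONNECTING",
    0x0008: "SS_ISDISCONNECTING",
    0x0100: "SS_NBIO",
    0x0200: "SS_ASYNC",
    0x0400: "SS_ISCONFIRMING",
    0x2000: "SS_ISDISCONNECTED",
}

# Only a handful of FreeBSD errno values, not the full table.
FREEBSD_ERRNO = {
    32: "EPIPE",
    54: "ECONNRESET",
    57: "ENOTCONN",
    60: "ETIMEDOUT",
    61: "ECONNREFUSED",
}

def decode_flags(value, table):
    """Return the names of all bits set in value, lowest bit first."""
    return [name for bit, name in sorted(table.items()) if value & bit]

print(decode_flags(8448, SS_FLAGS))  # so_state = 8448 (0x2100)
print(FREEBSD_ERRNO.get(54))         # so_error = 54
```

If those values are right, the socket was non-blocking and already
disconnected after a connection reset, and so_pcb = 0x0 would be
consistent with inp being NULL in frame #7.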

(kgdb) p *sopt
$15 = {sopt_dir = SOPT_SET, sopt_level = 0, sopt_name = 3,
  sopt_val = 0xbfbfab68, sopt_valsize = 4, sopt_td = 0xc4b03900}
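
The sopt contents can be mapped back to option names in the same way.
This is a sketch under the assumption that the numeric values below
match FreeBSD's <netinet/in.h> and <netinet/ip.h>; describe_sopt is a
hypothetical helper, and optval = 8 is taken from frame #7.

```python
# Sketch: name the socket option described by the sopt dump.
# Constant values are assumed from FreeBSD headers; verify before use.

IPPROTO_IP = 0
IP_OPTS = {1: "IP_OPTIONS", 2: "IP_HDRINCL", 3: "IP_TOS", 4: "IP_TTL"}
IPTOS = {
    0x10: "IPTOS_LOWDELAY",
    0x08: "IPTOS_THROUGHPUT",
    0x04: "IPTOS_RELIABILITY",
}

def describe_sopt(level, name, optval):
    """Render a (level, name, value) triple as a readable setsockopt call."""
    level_s = "IPPROTO_IP" if level == IPPROTO_IP else str(level)
    name_s = IP_OPTS.get(name, str(name)) if level == IPPROTO_IP else str(name)
    # Only translate the value for IP_TOS; otherwise show it in hex.
    val_s = IPTOS.get(optval, hex(optval)) if name_s == "IP_TOS" else hex(optval)
    return "setsockopt(%s, %s, %s)" % (level_s, name_s, val_s)

# sopt_level = 0, sopt_name = 3, optval = 8
print(describe_sopt(0, 3, 8))
```

In Python the corresponding call would look like
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 0x08), although
whether the python process here does exactly that is an assumption.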


That's about all I was able to find out with my limited debugging
skills (and from reading Michael Lucas's OnLamp.com article on kernel
debugging). Every time I've seen the panic, it's been while some python
process was running, which seems like more than a coincidence.