Date:      Wed, 21 Aug 2002 23:01:34 +0100 (BST)
From:      Gavin Atkinson <gavin@ury.york.ac.uk>
To:        Ian Dowse <iedowse@maths.tcd.ie>
Cc:        <freebsd-stable@FreeBSD.ORG>, <dillon@FreeBSD.ORG>
Subject:   Re: mbuf usage - how do i track it down? 
Message-ID:  <Pine.BSF.4.33.0208212209360.26041-100000@ury.york.ac.uk>
In-Reply-To: <200208211849.aa43591@salmon.maths.tcd.ie>

On Wed, 21 Aug 2002, Ian Dowse wrote:
> In message <Pine.BSF.4.33.0208211647430.22899-100000@ury.york.ac.uk>, Gavin Atkinson writes:
> >So how do I find out what is actually allocating these mbufs? Something
> >seems to be leaking them.
> 	http://www.maths.tcd.ie/~iedowse/FreeBSD/minfo/
> It does a pile of consistency checks and can dump mbuf contents
> with the -x flag. Run it redirected to a file so that it gets
> a reasonably consistent snapshot of the system, and then examine
> the file.

OK, this is interesting. It looks like I end up with a loop in an mbuf
chain.

Chain 0xc0cee900
        0xc0cee900 len 41 flags 0x002 type 1 next 0xc0e3ba00 prev 0x0
        0xc0e3ba00 len 181 flags 0x002 type 1 next 0xc0cc9a00 prev 0xc0cee900
        0xc0cc9a00 len 135 flags 0x002 type 1 next 0xc0dacc00 prev 0xc0e3ba00
        0xc0dacc00 len 185 flags 0x002 type 1 next 0xc0dd9c00 prev 0xc0cc9a00
<snip>
        0xc0e50a00 len 144 flags 0x002 type 1 next 0xc0d16400 prev 0xc0e10900
        0xc0d16400 len 236 flags 0x000 type 0 next 0xc0e45700 prev 0xc0e50a00
        0xc0e45700 len 567 flags 0x003 type 0 next 0xc0e4f900 prev 0xc0da4f00
        0xc0e4f900 len 16 flags 0x000 type 0 next 0xc0dd2b00 prev 0xc0e45700
        0xc0dd2b00 len 40 flags 0x002 type 0 next 0xc0bd7600 prev 0xc0e4f900
        0xc0bd7600 len 16 flags 0x000 type 8 next 0xc0dd2b00 prev 0xc0dd2b00
        0xc0dd2b00 len 40 flags 0x002 type 0 next 0xc0bd7600 prev 0xc0e4f900
        0xc0bd7600 len 16 flags 0x000 type 8 next 0xc0dd2b00 prev 0xc0dd2b00

Running it again a while later, it gets stuck in a different loop:
        0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
        0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
        0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
        0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
        0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
        0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
        0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
        0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
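
Both dumps above end with two entries pointing at each other through
their next fields. As a rough sketch of how such a loop could be detected
without printing forever, here is Floyd's two-pointer check on a
stand-in structure. The names struct node, chain_has_loop and self_check
are hypothetical and are not taken from minfo or the kernel; only the
next link is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf: only the link field matters. */
struct node {
	struct node *next;
};

/*
 * Return 1 if the chain starting at head loops back on itself,
 * 0 if it ends in NULL. Uses Floyd's slow/fast pointer walk, so it
 * needs no extra memory and always terminates.
 */
int
chain_has_loop(const struct node *head)
{
	const struct node *slow = head;
	const struct node *fast = head;

	while (fast != NULL && fast->next != NULL) {
		slow = slow->next;
		fast = fast->next->next;
		if (slow == fast)
			return (1);
	}
	return (0);
}

/* Usage: rebuild the "a -> b -> c -> b" shape seen at the end of a dump. */
void
self_check(void)
{
	struct node a, b, c;

	a.next = &b;
	b.next = &c;
	c.next = &b;		/* c points back at b: a two-node loop */
	assert(chain_has_loop(&a) == 1);

	c.next = NULL;		/* break the loop */
	assert(chain_has_loop(&a) == 0);
}
```

This only says whether a loop exists; it does not say which link was
corrupted.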

Running it again, I see other suspicious results:

        0xc107cf00 len 150 flags 0x002 type 1 next 0xc0fe3d00 prev 0xc104a400
        0xc107d000 len 0 flags 0x000 type 0 next 0x0 prev 0xc107d100
        0xc107d100 len 876824626 flags 0x5757 type 22839 next 0xc107d000 prev 0xc107d200
        0xc107d200 len 0 flags 0x77e7 type 0 next 0xc107d100 prev 0xc107d300
        0xc107d300 len 539915361 flags 0x6369 type 27713 next 0xc107d200 prev 0xc107d400
        0xc107d400 len 13315 flags 0x000 type 0 next 0xc107d300 prev 0xc107d500
        0xc107d500 len -4014831 flags 0x1fe7 type -12313 next 0xc107d400 prev 0xc107d600
        0xc107d600 len 1852994816 flags 0x000 type 58 next 0xc107d500 prev 0xc0e57e00
        0xc107d700 len 208 flags 0x002 type 0 next 0xc0fee700 prev 0xc107c100
        0xc107d800 len 37 flags 0x002 type 0 next 0xc0e72c00 prev 0xc1045f00
        0xc107d900 len 16 flags 0x000 type 0 next 0xc104d300 prev 0xc0fd1200

So I'm at a loss as to where to go from here. Assuming minfo works, I may
be seeing mbuf chain corruption.

I have also seen the following panic during times of low mbufs:
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xc
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0196db8
stack pointer           = 0x10:0xc97f0bf0
frame pointer           = 0x10:0xc97f0bfc
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 167 (natd)
interrupt mask          =
trap number             = 12
panic: page fault

#5  0xc02bd553 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = 1,
      tf_esi = 0, tf_ebp = -914420740, tf_isp = -914420772,
      tf_ebx = -1058123776, tf_edx = -1, tf_ecx = -925641856, tf_eax = -1,
      tf_trapno = 12, tf_err = 0, tf_eip = -1072075336, tf_cs = 8,
      tf_eflags = 66050, tf_esp = -1058123776, tf_ss = 1})
    at /usr/src/sys/i386/i386/trap.c:466
#6  0xc0196db8 in m_copydata (m=0x0, off=-1, len=1,
    cp=0xc0ee5070 ":sha1:UPA6SGRR2Z7Y3YSND2J3JYQTFPMKK5JI")
    at /usr/src/sys/kern/uipc_mbuf.c:863
#7  0xc01e438c in tcp_output (tp=0xc8e11140)
    at /usr/src/sys/netinet/tcp_output.c:607
#8  0xc01e325b in tcp_input (m=0xc0ee5000, off0=20, proto=6)
    at /usr/src/sys/netinet/tcp_input.c:2158
#9  0xc01dcc73 in ip_input (m=0xc0ee5000)
    at /usr/src/sys/netinet/ip_input.c:821
#10 0xc01d605b in div_output (so=0xc8d3ce40, m=0xc0ee5000, sin=0xc1ac2650,
    control=0x0) at /usr/src/sys/netinet/ip_divert.c:327
#11 0xc01d61fb in div_send (so=0xc8d3ce40, flags=0, m=0xc0ee5000,
    nam=0xc1ac2650, control=0x0, p=0xc874b380)
    at /usr/src/sys/netinet/ip_divert.c:440
#12 0xc01990db in sosend (so=0xc8d3ce40, addr=0xc1ac2650, uio=0xc97f0ecc,
    top=0xc0ee5000, control=0x0, flags=0, p=0xc874b380)
    at /usr/src/sys/kern/uipc_socket.c:609
#13 0xc019c49b in sendit (p=0xc874b380, s=3, mp=0xc97f0f0c, flags=0)
    at /usr/src/sys/kern/uipc_syscalls.c:585
#14 0xc019c59e in sendto (p=0xc874b380, uap=0xc97f0f80)
    at /usr/src/sys/kern/uipc_syscalls.c:638
#15 0xc02bdfe1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = -1078002552, tf_esi = 1, tf_ebp = -1077937016,
      tf_isp = -914419756, tf_ebx = 60, tf_edx = 3, tf_ecx = 1, tf_eax = 133,
      tf_trapno = 7, tf_err = 2, tf_eip = 134551364, tf_cs = 31,
      tf_eflags = 643, tf_esp = -1078002724, tf_ss = 47})
    at /usr/src/sys/i386/i386/trap.c:1175
#16 0xc02aef45 in Xint0x80_syscall ()

(kgdb) f 6
#6  0xc0196db8 in m_copydata (m=0x0, off=-1, len=1,
    cp=0xc0ee5070 ":sha1:UPA6SGRR2Z7Y3YSND2J3JYQTFPMKK5JI")
    at /usr/src/sys/kern/uipc_mbuf.c:863
863             while (len > 0) {

(kgdb) f 7
#7  0xc01e438c in tcp_output (tp=0xc8e11140)
    at /usr/src/sys/netinet/tcp_output.c:607
607                             m_copydata(so->so_snd.sb_mb, off, (int) len,
(kgdb)
(kgdb) p so
$1 = (struct socket *) 0xc8d3d380
(kgdb) p *so
$2 = {so_type = 1, so_options = 4, so_linger = 0, so_state = 258,
  so_pcb = 0xc8e11080 "\003X{`k(", so_proto = 0xc0332be8,
  so_head = 0x0, so_incomp = {tqh_first = 0x0, tqh_last = 0xc8d3d394},
  so_comp = {tqh_first = 0x0, tqh_last = 0xc8d3d39c}, so_list = {
    tqe_next = 0x0, tqe_prev = 0x0}, so_qlen = 0, so_incqlen = 0,
  so_qlimit = 0, so_timeo = 0, so_error = 0, so_sigio = 0x0, so_oobmark = 0,
  so_aiojobq = {tqh_first = 0x0, tqh_last = 0xc8d3d3c0}, so_rcv = {sb_cc = 0,
    sb_hiwat = 57920, sb_mbcnt = 0, sb_mbmax = 262144, sb_lowat = 1,
    sb_mb = 0x0, sb_sel = {si_pid = 0, si_note = {slh_first = 0x0},
      si_flags = 0}, sb_flags = 0, sb_timeo = 0}, so_snd = {sb_cc = 0,
    sb_hiwat = 33304, sb_mbcnt = 0, sb_mbmax = 262144, sb_lowat = 2048,
    sb_mb = 0x0, sb_sel = {si_pid = 0, si_note = {slh_first = 0x0},
      si_flags = 0}, sb_flags = 0, sb_timeo = 0}, so_upcall = 0,
  so_upcallarg = 0x0, so_cred = 0xc1a0a900, so_gencnt = 72140,
  so_emuldata = 0x0, so_accf = 0x0}
(kgdb)
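
The fault address of 0xc together with m=0x0 in frame 6 would be
consistent with reading the length field through a NULL mbuf pointer.
The sketch below checks that arithmetic. It assumes the mbuf header
starts with three 4-byte pointers (next, nextpkt, data) followed by the
length, which is my recollection of the i386 layout and not something
shown in the dump. The struct mbuf_hdr_i386 and null_deref_address names
are hypothetical, and fixed-width integers stand in for the 32-bit
pointers so the result does not depend on the machine compiling it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout of the start of an mbuf on i386 (an assumption, not
 * copied from the kernel headers). Pointers are modelled as uint32_t.
 */
struct mbuf_hdr_i386 {
	uint32_t mh_next;	/* next mbuf in chain */
	uint32_t mh_nextpkt;	/* next chain in queue */
	uint32_t mh_data;	/* location of data */
	int32_t  mh_len;	/* amount of data in this mbuf */
	int16_t  mh_type;
	int16_t  mh_flags;
};

/* Address touched when the length is read through a NULL pointer. */
size_t
null_deref_address(void)
{
	return (offsetof(struct mbuf_hdr_i386, mh_len));
}
```

Under that assumed layout the length sits at offset 12, matching the
reported fault virtual address of 0xc.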

So it looks like there is also a use-mbuf-alloc-without-checking
bug somewhere.
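
To illustrate the kind of check I mean, here is a minimal user-space
sketch of a chain copy that refuses to dereference a NULL link and
reports an error instead. It is not the kernel's m_copydata; struct buf
and chain_copydata are hypothetical names, and returning -1 is just one
possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct mbuf. */
struct buf {
	struct buf *next;
	const char *data;
	int len;
};

/*
 * Copy len bytes, starting off bytes into the chain, to out.
 * Returns 0 on success, or -1 if the arguments are negative or the
 * chain runs out first. On failure out may be partially written.
 */
int
chain_copydata(const struct buf *m, int off, int len, char *out)
{
	int count;

	if (off < 0 || len < 0)
		return (-1);

	/* Skip whole buffers until off falls inside one. */
	while (off > 0) {
		if (m == NULL)
			return (-1);
		if (off < m->len)
			break;
		off -= m->len;
		m = m->next;
	}

	/* Copy piece by piece, checking the link before each use. */
	while (len > 0) {
		if (m == NULL)
			return (-1);
		count = m->len - off;
		if (count > len)
			count = len;
		memcpy(out, m->data + off, (size_t)count);
		len -= count;
		out += count;
		off = 0;
		m = m->next;
	}
	return (0);
}
```

With a NULL chain and a negative offset, as in frame 6 above, this
version returns -1 before touching any buffer.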

Gavin

