Date:      Wed, 21 Aug 2002 22:57:31 -0400
From:      Bosko Milekic <bmilekic@unixdaemons.com>
To:        Gavin Atkinson <gavin@ury.york.ac.uk>
Cc:        Ian Dowse <iedowse@maths.tcd.ie>, freebsd-stable@FreeBSD.ORG, dillon@FreeBSD.ORG
Subject:   Re: mbuf usage - how do i track it down?
Message-ID:  <20020821225731.A32832@unixdaemons.com>
In-Reply-To: <Pine.BSF.4.33.0208212209360.26041-100000@ury.york.ac.uk>; from gavin@ury.york.ac.uk on Wed, Aug 21, 2002 at 11:01:34PM +0100
References:  <200208211849.aa43591@salmon.maths.tcd.ie> <Pine.BSF.4.33.0208212209360.26041-100000@ury.york.ac.uk>


Wow.

Is this something that only started happening recently?  How recent a
-STABLE are you running?

On Wed, Aug 21, 2002 at 11:01:34PM +0100, Gavin Atkinson wrote:
> On Wed, 21 Aug 2002, Ian Dowse wrote:
> > In message <Pine.BSF.4.33.0208211647430.22899-100000@ury.york.ac.uk>, Gavin Atkinson writes:
> > >So how do I find out what is actually allocating these mbufs. Something
> > >seems to be leaking them.
> > 	http://www.maths.tcd.ie/~iedowse/FreeBSD/minfo/
> > It does a pile of consistency checks and can dump mbuf contents
> > with the -x flag. Run it redirected to a file so that it gets
> > a reasonably consistent snapshot of the system, and then examine
> > the file.
> 
> OK, this is interesting. It looks like i end up with a loop in an mbuf
> chain.
> 
> Chain 0xc0cee900
>         0xc0cee900 len 41 flags 0x002 type 1 next 0xc0e3ba00 prev 0x0
>         0xc0e3ba00 len 181 flags 0x002 type 1 next 0xc0cc9a00 prev 0xc0cee900
>         0xc0cc9a00 len 135 flags 0x002 type 1 next 0xc0dacc00 prev 0xc0e3ba00
>         0xc0dacc00 len 185 flags 0x002 type 1 next 0xc0dd9c00 prev 0xc0cc9a00
> <snip>
>         0xc0e50a00 len 144 flags 0x002 type 1 next 0xc0d16400 prev 0xc0e10900
>         0xc0d16400 len 236 flags 0x000 type 0 next 0xc0e45700 prev 0xc0e50a00
>         0xc0e45700 len 567 flags 0x003 type 0 next 0xc0e4f900 prev 0xc0da4f00
>         0xc0e4f900 len 16 flags 0x000 type 0 next 0xc0dd2b00 prev 0xc0e45700
>         0xc0dd2b00 len 40 flags 0x002 type 0 next 0xc0bd7600 prev 0xc0e4f900
>         0xc0bd7600 len 16 flags 0x000 type 8 next 0xc0dd2b00 prev 0xc0dd2b00
>         0xc0dd2b00 len 40 flags 0x002 type 0 next 0xc0bd7600 prev 0xc0e4f900
>         0xc0bd7600 len 16 flags 0x000 type 8 next 0xc0dd2b00 prev 0xc0dd2b00
> 
> Running it again a while later, it gets stuck in a different loop:
>         0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
>         0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
>         0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
>         0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
>         0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
>         0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
>         0xc0f8ac00 len 203 flags 0x002 type 0 next 0xc0decb00 prev 0xc0f98e00
>         0xc0decb00 len 209 flags 0x002 type 1 next 0xc0f8ac00 prev 0xc0f8ac00
> 
> Running it again, i see other suspicious results
> 
>         0xc107cf00 len 150 flags 0x002 type 1 next 0xc0fe3d00 prev 0xc104a400
>         0xc107d000 len 0 flags 0x000 type 0 next 0x0 prev 0xc107d100
>         0xc107d100 len 876824626 flags 0x5757 type 22839 next 0xc107d000 prev 0xc107d200
>         0xc107d200 len 0 flags 0x77e7 type 0 next 0xc107d100 prev 0xc107d300
>         0xc107d300 len 539915361 flags 0x6369 type 27713 next 0xc107d200 prev 0xc107d400
>         0xc107d400 len 13315 flags 0x000 type 0 next 0xc107d300 prev 0xc107d500
>         0xc107d500 len -4014831 flags 0x1fe7 type -12313 next 0xc107d400 prev 0xc107d600
>         0xc107d600 len 1852994816 flags 0x000 type 58 next 0xc107d500 prev 0xc0e57e00
>         0xc107d700 len 208 flags 0x002 type 0 next 0xc0fee700 prev 0xc107c100
>         0xc107d800 len 37 flags 0x002 type 0 next 0xc0e72c00 prev 0xc1045f00
>         0xc107d900 len 16 flags 0x000 type 0 next 0xc104d300 prev 0xc0fd1200
> 
> So i'm at a loss as to where to go from here. Assuming minfo works, i may
> be seeing mbuf chain corruption.
> 
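The looping chains in the minfo output quoted above (for example
0xc0dd2b00 and 0xc0bd7600 pointing at each other) are the kind of thing a
walker can detect before it gets stuck. A minimal userland sketch,
assuming a simplified node type: "struct node" and chain_has_loop() are
hypothetical names for illustration, not the kernel's struct mbuf and not
part of minfo.

```c
#include <stddef.h>

/* Hypothetical, simplified chain node; NOT the kernel's struct mbuf. */
struct node {
	struct node *next;	/* next element in the chain */
	int len;		/* payload length, unused by the check */
};

/*
 * Floyd's cycle detection.  Returns 1 if following the next pointers
 * from head ever revisits a node, 0 if the chain ends at NULL.
 */
static int
chain_has_loop(const struct node *head)
{
	const struct node *slow = head;
	const struct node *fast = head;

	while (fast != NULL && fast->next != NULL) {
		slow = slow->next;		/* advance one step */
		fast = fast->next->next;	/* advance two steps */
		if (slow == fast)
			return (1);
	}
	return (0);
}
```

This only says that a loop exists, not which pointer was overwritten, so
it is a guard for a dump tool rather than a diagnosis.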
> I have also seen the following panic, during times of low mbufs:
> panic messages:
> ---
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0xc
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc0196db8
> stack pointer           = 0x10:0xc97f0bf0
> frame pointer           = 0x10:0xc97f0bfc
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 167 (natd)
> interrupt mask          =
> trap number             = 12
> panic: page fault
> 
> #5  0xc02bd553 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = 1,
>       tf_esi = 0, tf_ebp = -914420740, tf_isp = -914420772,
>       tf_ebx = -1058123776, tf_edx = -1, tf_ecx = -925641856, tf_eax = -1,
>       tf_trapno = 12, tf_err = 0, tf_eip = -1072075336, tf_cs = 8,
>       tf_eflags = 66050, tf_esp = -1058123776, tf_ss = 1})
>     at /usr/src/sys/i386/i386/trap.c:466
> #6  0xc0196db8 in m_copydata (m=0x0, off=-1, len=1,
>     cp=0xc0ee5070 ":sha1:UPA6SGRR2Z7Y3YSND2J3JYQTFPMKK5JI")
>     at /usr/src/sys/kern/uipc_mbuf.c:863
> #7  0xc01e438c in tcp_output (tp=0xc8e11140)
>     at /usr/src/sys/netinet/tcp_output.c:607
> #8  0xc01e325b in tcp_input (m=0xc0ee5000, off0=20, proto=6)
>     at /usr/src/sys/netinet/tcp_input.c:2158
> #9  0xc01dcc73 in ip_input (m=0xc0ee5000)
>     at /usr/src/sys/netinet/ip_input.c:821
> #10 0xc01d605b in div_output (so=0xc8d3ce40, m=0xc0ee5000, sin=0xc1ac2650,
>     control=0x0) at /usr/src/sys/netinet/ip_divert.c:327
> #11 0xc01d61fb in div_send (so=0xc8d3ce40, flags=0, m=0xc0ee5000,
>     nam=0xc1ac2650, control=0x0, p=0xc874b380)
>     at /usr/src/sys/netinet/ip_divert.c:440
> #12 0xc01990db in sosend (so=0xc8d3ce40, addr=0xc1ac2650, uio=0xc97f0ecc,
>     top=0xc0ee5000, control=0x0, flags=0, p=0xc874b380)
>     at /usr/src/sys/kern/uipc_socket.c:609
> #13 0xc019c49b in sendit (p=0xc874b380, s=3, mp=0xc97f0f0c, flags=0)
>     at /usr/src/sys/kern/uipc_syscalls.c:585
> #14 0xc019c59e in sendto (p=0xc874b380, uap=0xc97f0f80)
>     at /usr/src/sys/kern/uipc_syscalls.c:638
> #15 0xc02bdfe1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
>       tf_edi = -1078002552, tf_esi = 1, tf_ebp = -1077937016,
>       tf_isp = -914419756, tf_ebx = 60, tf_edx = 3, tf_ecx = 1, tf_eax = 133,
>       tf_trapno = 7, tf_err = 2, tf_eip = 134551364, tf_cs = 31,
>       tf_eflags = 643, tf_esp = -1078002724, tf_ss = 47})
>     at /usr/src/sys/i386/i386/trap.c:1175
> #16 0xc02aef45 in Xint0x80_syscall ()
> 
> (kgdb) f 6
> #6  0xc0196db8 in m_copydata (m=0x0, off=-1, len=1,
>     cp=0xc0ee5070 ":sha1:UPA6SGRR2Z7Y3YSND2J3JYQTFPMKK5JI")
>     at /usr/src/sys/kern/uipc_mbuf.c:863
> 863             while (len > 0) {
> 
> (kgdb) f 7
> #7  0xc01e438c in tcp_output (tp=0xc8e11140)
>     at /usr/src/sys/netinet/tcp_output.c:607
> 607                             m_copydata(so->so_snd.sb_mb, off, (int) len,
> (kgdb)
> (kgdb) p so
> $1 = (struct socket *) 0xc8d3d380
> (kgdb) p *so
> $2 = {so_type = 1, so_options = 4, so_linger = 0, so_state = 258,
>   so_pcb = 0xc8e11080 "\003X{`k(", so_proto = 0xc0332be8,
>   so_head = 0x0, so_incomp = {tqh_first = 0x0, tqh_last = 0xc8d3d394},
>   so_comp = {tqh_first = 0x0, tqh_last = 0xc8d3d39c}, so_list = {
>     tqe_next = 0x0, tqe_prev = 0x0}, so_qlen = 0, so_incqlen = 0,
>   so_qlimit = 0, so_timeo = 0, so_error = 0, so_sigio = 0x0, so_oobmark = 0,
>   so_aiojobq = {tqh_first = 0x0, tqh_last = 0xc8d3d3c0}, so_rcv = {sb_cc = 0,
>     sb_hiwat = 57920, sb_mbcnt = 0, sb_mbmax = 262144, sb_lowat = 1,
>     sb_mb = 0x0, sb_sel = {si_pid = 0, si_note = {slh_first = 0x0},
>       si_flags = 0}, sb_flags = 0, sb_timeo = 0}, so_snd = {sb_cc = 0,
>     sb_hiwat = 33304, sb_mbcnt = 0, sb_mbmax = 262144, sb_lowat = 2048,
>     sb_mb = 0x0, sb_sel = {si_pid = 0, si_note = {slh_first = 0x0},
>       si_flags = 0}, sb_flags = 0, sb_timeo = 0}, so_upcall = 0,
>   so_upcallarg = 0x0, so_cred = 0xc1a0a900, so_gencnt = 72140,
>   so_emuldata = 0x0, so_accf = 0x0}
> (kgdb)
> 
> So it looks like there is also a use-mbuf-alloc-without-checking
> bug somewhere.
> 
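In the trace quoted above, m_copydata() is entered with m=0x0 and off=-1,
and the fault address 0xc is consistent with reading a field at a small
offset from a NULL pointer. A minimal userland sketch of a copy routine
that rejects such arguments instead of dereferencing them follows.
"struct seg" and seg_copydata_checked() are hypothetical names for
illustration, not the code in uipc_mbuf.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified buffer segment; NOT the kernel's struct mbuf. */
struct seg {
	struct seg *next;	/* next segment in the chain */
	const char *data;	/* start of this segment's bytes */
	int len;		/* number of valid bytes in data */
};

/*
 * Copy len bytes, starting off bytes into the chain, to cp.
 * Returns 0 on success, -1 if off or len is negative or if the chain
 * ends before the request is satisfied.
 */
static int
seg_copydata_checked(const struct seg *s, int off, int len, char *cp)
{
	int count;

	if (off < 0 || len < 0)
		return (-1);
	/* Skip whole segments that lie before the offset. */
	while (off > 0) {
		if (s == NULL)
			return (-1);
		if (off < s->len)
			break;
		off -= s->len;
		s = s->next;
	}
	/* Copy segment by segment, checking the pointer each time. */
	while (len > 0) {
		if (s == NULL)
			return (-1);
		count = s->len - off;
		if (count > len)
			count = len;
		memcpy(cp, s->data + off, (size_t)count);
		len -= count;
		cp += count;
		off = 0;
		s = s->next;
	}
	return (0);
}
```

Returning an error only hides the symptom. The caller would still need to
explain how it arrived at an empty chain and a negative offset.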
> Gavin
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message
> 

-- 
Bosko Milekic * bmilekic@unixdaemons.com * bmilekic@FreeBSD.org

