Date: Mon, 10 May 2010 20:12:48 +0200
From: Ivan Voras <ivoras@freebsd.org>
To: Andrew Gallatin <gallatin@cs.duke.edu>
Cc: freebsd-net@freebsd.org, developers@freebsd.org
Subject: Re: FreeBSD.org IPv6 issue - AAAA records disabled
Message-ID: <AANLkTikxCST-ZBH2iBhkZ2aF1QedPhXR9i-kQNOGHoNt@mail.gmail.com>
In-Reply-To: <4BE82011.6050009@cs.duke.edu>
References: <4BD885C6.10600@FreeBSD.org> <20100429204544.GC1286@arthur.nitro.dk> <1272998683.2406.38.camel@localhost.localdomain> <20100504190328.GC31196@valentine.liquidneon.com> <4BE80F07.8090309@cs.duke.edu> <AANLkTimVvm1AfOoJax9AcSWNLGJqIGe7EPE1FssA7tDe@mail.gmail.com> <4BE82011.6050009@cs.duke.edu>

On 10 May 2010 17:02, Andrew Gallatin <gallatin@cs.duke.edu> wrote:
> Ivan Voras wrote:
> In the case of 143046, he says that disabling IPv6 support in his
> jails solved the problem.  He also saw the panic:sbdrop, as well
> as some others which makes it seem like something is holding a
> pointer to a free mbuf and modifying it behind the back of the
> new owner.  He saw this most often:
>
> Tracing pid 12 tid 100063 td 0xffffff00092c7390
> mb_free_ext() at mb_free_ext+0x15
> m_freem() at m_freem+0x23
> ether_input() at ether_input+0x56
> mxge_intr() at mxge_intr+0x5b2
> intr_event_execute_handlers() at intr_event_execute_handlers+0x132
> ithread_loop() at ithread_loop+0x7d
> fork_exit() at fork_exit+0x121
> fork_trampoline() at fork_trampoline+0xe
>
> This is coming from the "discard frame w/o packet header"
> codepath in ether_input.  With the way I stock my rings,
> this is impossible unless something else is scribbling on
> my mbufs.

I have a different trace, but the dmesg buffer on crashed machines
usually contains the message
"em0: discard frame w/o packet header"
just before the panic, so this still looks like the same problem.

Here are my most usual traces:
#1 0xffffffff8058d179 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:416
#2 0xffffffff8058d5ac in panic (
    fmt=0xffffffff80971c38 "%s: sockbuf %p and mbuf %p clashing")
    at /usr/src/sys/kern/kern_shutdown.c:579
#3 0xffffffff805ea6d4 in sbsndptr (sb=NA)
    at /usr/src/sys/kern/uipc_sockbuf.c:954
#4 0xffffffff806ff812 in tcp_output (tp=0xffffff0001ae46e0)
    at /usr/src/sys/netinet/tcp_output.c:814
#5 0xffffffff8070c274 in tcp_usr_send (so=0xffffff0001f7ed48, flags=0, m=NA)
    at tcp_offload.h:282
#6 0xffffffff805f05b6 in sosend_generic (so=0xffffff0001f7ed48, addr=0x0,
    uio=0x0, top=0xffffff0001909c00, control=0x0, flags=NA)
    at /usr/src/sys/kern/uipc_socket.c:1256
#7 0xffffffff8077c93f in svc_vc_reply (xprt=0xffffff00018ba000, msg=NA)
    at /usr/src/sys/rpc/svc_vc.c:732
#8 0xffffffff8077851a in svc_sendreply_common (rqstp=0xffffff00425f2800,
    rply=0xffffff8070811990, body=0xffffff006c70f000)
    at /usr/src/sys/rpc/svc.c:538
#9 0xffffffff807793b9 in svc_sendreply_mbuf (NA)
    at /usr/src/sys/rpc/svc.c:594
#10 0xffffffff80760669 in nfssvc_program (rqst=0xffffff00425f2800, xprt=NA)
    at /usr/src/sys/nfsserver/nfs_srvkrpc.c:371
#11 0xffffffff80778f42 in svc_run_internal (pool=0xffffff00016cea00,
    ismaster=0) at /usr/src/sys/rpc/svc.c:893
#12 0xffffffff807792bb in svc_thread_start (arg=NA)
    at /usr/src/sys/rpc/svc.c:1198
#13 0xffffffff80564b78 in fork_exit (
    callout=0xffffffff807792b0 <svc_thread_start>, arg=0xffffff00016cea00,
    frame=0xffffff8070811c80) at /usr/src/sys/kern/kern_fork.c:843
#14 0xffffffff808594ae in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:561
(Note: NFS, for what it's worth)

And:
#1 0xffffffff805a2a69 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:416
#2 0xffffffff805a2e9c in panic (fmt=0xffffffff8098f15f "sbdrop")
    at /usr/src/sys/kern/kern_shutdown.c:579
#3 0xffffffff80603a53 in sbdrop_internal (sb=NA)
    at /usr/src/sys/kern/uipc_sockbuf.c:858
#4 0xffffffff80710ae7 in tcp_do_segment (m=0xffffff0001941500,
    th=0xffffff000194157c, so=0xffffff0032efaaa0, tp=0xffffff00ddc7f000,
    drop_hdrlen=52, tlen=14, iptos=0 '\0', ti_locked=2)
    at /usr/src/sys/netinet/tcp_input.c:2355
#5 0xffffffff807128d2 in tcp_input (m=0xffffff0001941500, off0=NA)
    at /usr/src/sys/netinet/tcp_input.c:1020
#6 0xffffffff806ae23a in ip_input (m=0xffffff0001941500)
    at /usr/src/sys/netinet/ip_input.c:804
#7 0xffffffff8065b30d in swi_net (arg=NA) at /usr/src/sys/net/netisr.c:716
#8 0xffffffff8057bf8d in intr_event_execute_handlers (p=NA)
    at /usr/src/sys/kern/kern_intr.c:1220
#9 0xffffffff8057d63e in ithread_loop (arg=0xffffff00014b66c0)
    at /usr/src/sys/kern/kern_intr.c:1233
#10 0xffffffff80579f58 in fork_exit (
    callout=0xffffffff8057d5b0 <ithread_loop>, arg=0xffffff00014b66c0,
    frame=0xffffff8000044c80) at /usr/src/sys/kern/kern_fork.c:843
#11 0xffffffff8086f3ae in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:561
#12 0x0000000000000000 in ?? ()
(Looks like a regular input path)
> I think something may be holding onto an mbuf after free,
> then re-freeing it.  But only after somebody else allocated
> it.  I was hoping that the mbuf double free referenced
> above was the smoking gun, but it turns out that there isn't
> even a bge interface in my pr (just bce and mxge).
Since I see this on an em interface, the problem looks interface-agnostic.
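
To make the hypothesis quoted above concrete, here is a minimal user-space
sketch of the suspected failure mode. It is not kernel code and is not taken
from this thread: struct demo_buf, DEMO_PKTHDR and all demo_* functions are
hypothetical stand-ins for the mbuf, its packet-header flag and the allocator.
The assumption is that the "discard frame w/o packet header" message is
printed when the received buffer no longer carries its packet-header flag, so
a write through a stale pointer to a recycled buffer would be enough to
trigger it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the packet-header flag; not the real M_PKTHDR. */
#define DEMO_PKTHDR 0x1u

/* Hypothetical stand-in for an mbuf: only the fields the demo needs. */
struct demo_buf {
    unsigned flags;
    int in_use;
};

/* One-slot pool, so the next allocation always reuses the freed buffer. */
static struct demo_buf pool_slot;

struct demo_buf *demo_alloc(unsigned flags)
{
    assert(!pool_slot.in_use);
    pool_slot.in_use = 1;
    pool_slot.flags = flags;
    return &pool_slot;
}

void demo_free(struct demo_buf *b)
{
    b->in_use = 0;
    b->flags = 0;
}

/* Assumed shape of the input check: returns 1 if accepted, 0 if discarded. */
int demo_input(const struct demo_buf *b)
{
    if ((b->flags & DEMO_PKTHDR) == 0) {
        printf("demo0: discard frame w/o packet header\n");
        return 0;
    }
    return 1;
}

/* Runs one receive cycle; stale_write != 0 adds the suspected bug. */
int demo_scenario(int stale_write)
{
    struct demo_buf *stale = demo_alloc(DEMO_PKTHDR); /* first owner */
    demo_free(stale);            /* freed, but the pointer is kept around */

    /* The "driver" restocks its ring and gets the same memory back. */
    struct demo_buf *rx = demo_alloc(DEMO_PKTHDR);

    if (stale_write)
        stale->flags = 0;        /* old owner scribbles on the new owner */

    int accepted = demo_input(rx);
    demo_free(rx);
    return accepted;
}
```

Calling demo_scenario(0) accepts the frame, while demo_scenario(1) takes the
discard path even though the buffer was allocated with the header flag set,
which matches the "impossible unless something else is scribbling on my
mbufs" observation.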
Also, moving this to freebsd-net@.
