Date:      Thu, 28 Sep 2006 14:08:52 -0700
From:      "Jack Vogel" <jfvogel@gmail.com>
To:        "Gleb Smirnoff" <glebius@freebsd.org>
Cc:        Michiel Boland <michiel@boland.org>, freebsd-current@freebsd.org
Subject:   Re: panic in em_txeof
Message-ID:  <2a41acea0609281408k65fc2a3g35bffdb6712bb280@mail.gmail.com>
In-Reply-To: <20060928194831.GF59833@FreeBSD.org>
References:  <Pine.GSO.4.64.0609281300150.29275@brakkenstein.nijmegen.internl.net> <20060928194831.GF59833@FreeBSD.org>

On 9/28/06, Gleb Smirnoff <glebius@freebsd.org> wrote:
> On Thu, Sep 28, 2006 at 01:09:05PM +0200, Michiel Boland wrote:
> M> -CURRENT from 25 Sept. (if_em.c has rev 1.147)
> M>
> M> em, connected to cisco 2950 at 100 Mb full/duplex with TSO disabled.
> M>
> M> At high load, the card stopped passing network traffic. After I
> M> ifconfig-ed the interface down and up again, I got this panic.
> M>
> M> Obviously neither the network card malfunction nor the panic is any good.
> M> I hope someone can figure out what's going on.
> M>
> M> Cheers
> M> Michiel
> M>
> M> Fatal trap 12: page fault while in kernel mode
> M> fault virtual address        = 0x568
> M> fault code           = supervisor read, page not present
> M> instruction pointer  = 0x20:0xc0464d9a
> M> stack pointer                = 0x28:0xd3358c50
> M> frame pointer                = 0x28:0xd3358c64
> M> code segment         = base 0x0, limit 0xfffff, type 0x1b
> M>                      = DPL 0, pres 1, def32 1, gran 1
> M> processor eflags     = interrupt enabled, resume, IOPL = 0
> M> current process              = 11 (swi4: clock sio)
> M> trap number          = 12
> M> panic: page fault
> M> KDB: stack backtrace:
> M> kdb_backtrace(100,c20736c0,28,d3358c10,c,...) at kdb_backtrace+0x29
> M> panic(c063a952,c065b591,0,0,fffff,...) at panic+0xa8
> M> trap_fatal(d3358c10,568,c20736c0,c069d0a0,0,...) at trap_fatal+0x2b6
> M> trap_pfault(d3358c10,0,568) at trap_pfault+0x1cb
> M> trap(d3350008,c04f0028,c2150028,568,ad,...) at trap+0x38d
> M> calltrap() at calltrap+0x5
> M> --- trap 0xc, eip = 0xc0464d9a, esp = 0xd3358c50, ebp = 0xd3358c64 ---
> M> em_txeof(c20f1000) at em_txeof+0x86
> M> em_watchdog(c2131000) at em_watchdog+0xa6
> M> if_slowtimo(0) at if_slowtimo+0x66
> M> softclock(0) at softclock+0x252
> M> ithread_execute_handlers(c2072b04,c2070500) at ithread_execute_handlers+0x125
> M> ithread_loop(c20426c0,d3358d38) at ithread_loop+0x54
> M> fork_exit(c04cea10,c20426c0,d3358d38) at fork_exit+0x7a
> M> fork_trampoline() at fork_trampoline+0x8
> M> --- trap 0x1, eip = 0, esp = 0xd3358d6c, ebp = 0 ---
> M> Uptime: 2d23h21m50s
> M> Physical memory: 505 MB
> M> Dumping 117 MB: 102 86 (CTRL-C to abort)  70 54 38 22 (CTRL-C to abort)
> M> (CTRL-C to abort)  6
> M>
> M> #0  doadump () at pcpu.h:166
> M> 166          __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> M> (kgdb) bt
> M> #0  doadump () at pcpu.h:166
> M> #1  0xc04e3ca4 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> M> #2  0xc04e3f6c in panic (fmt=0xc063a952 "%s") at
> M> /usr/src/sys/kern/kern_shutdown.c:565
> M> #3  0xc0616d0a in trap_fatal (frame=0xd3358c10, eva=1384) at
> M> /usr/src/sys/i386/i386/trap.c:867
> M> #4  0xc0616a2b in trap_pfault (frame=0xd3358c10, usermode=0, eva=1384) at
> M> /usr/src/sys/i386/i386/trap.c:776
> M> #5  0xc0616625 in trap (frame=
> M>       {tf_fs = -751501304, tf_es = -1068564440, tf_ds = -1038811096, tf_edi
> M>       = 1384, tf_esi = 173, tf_ebp = -751465372, tf_isp = -751465412,
> M>       tf_ebx = -1038800176, tf_edx = -1039200256, tf_ecx = -865996036,
> M>       tf_eax = 2768, tf_trapno = 12, tf_err = 0, tf_eip = -1069134438,
> M>       tf_cs = 32, tf_eflags = 66054, tf_esp = -1038938112, tf_ss = 231}) at
> M>       /usr/src/sys/i386/i386/trap.c:461
> M> #6  0xc060759a in calltrap () at /usr/src/sys/i386/i386/exception.s:138
> M> #7  0xc0464d9a in em_txeof (adapter=0xc20f1000) at
> M> /usr/src/sys/dev/em/if_em.c:2956
> M> #8  0xc0461ace in em_watchdog (ifp=0xc2131000) at
> M> /usr/src/sys/dev/em/if_em.c:963
> M> #9  0xc05576de in if_slowtimo (arg=0x0) at /usr/src/sys/net/if.c:1415
> M> #10 0xc04f1ac2 in softclock (dummy=0x0) at
> M> /usr/src/sys/kern/kern_timeout.c:271
> M> #11 0xc04ce955 in ithread_execute_handlers (p=0xc2072b04, ie=0xc2070500) at
> M> /usr/src/sys/kern/kern_intr.c:662
> M> #12 0xc04cea64 in ithread_loop (arg=0xc20426c0) at
> M> /usr/src/sys/kern/kern_intr.c:745
> M> #13 0xc04cd8b6 in fork_exit (callout=0xc04cea10 <ithread_loop>,
> M> arg=0xc20426c0, frame=0xd3358d38) at /usr/src/sys/kern/kern_fork.c:818
> M> #14 0xc06075fc in fork_trampoline () at
> M> /usr/src/sys/i386/i386/exception.s:199
> M> (kgdb) f 7
> M> #7  0xc0464d9a in em_txeof (adapter=0xc20f1000) at
> M> /usr/src/sys/dev/em/if_em.c:2956
> M> 2956                 num_avail++;
> M> (kgdb) info locals
> M> i = 173
> M> num_avail = 231
> M> tx_buffer = (struct em_buffer *) 0x568
> M> tx_desc = (struct em_tx_desc *) 0xc2152ad0
> M> ifp = (struct ifnet *) 0xc2131000
>
> Yeah, looks like calling em_txeof() from watchdog wasn't a perfect idea.
> Anyway, this code is temporary and isn't merged to RELENG_6 and isn't
> planned to be merged.

Yeah, I've been staring at the code trying to understand how that tx_buffer
pointer could end up with such obviously bogus content. So far I'm not sure...
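
One possible reading of the bogus pointer, offered as a guess rather than a
diagnosis: 0x568 is 1384 decimal, which is exactly 173 * 8, and i is 173 in
the locals above. If struct em_buffer is 8 bytes on i386 (two 32-bit pointers,
which is an assumption and not checked against if_em.c rev 1.147), then
tx_buffer looks like the address of element i of an array whose base is NULL.
That would be consistent with the tx buffer array being gone or not yet set up
around the ifconfig down/up. A minimal sketch of the arithmetic follows; the
names em_buffer_i386, implied_base and report_panic_values are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for struct em_buffer as laid out on i386.
 * Assumption: the real structure is two 32-bit pointers (8 bytes).
 */
struct em_buffer_i386 {
	uint32_t m_head;	/* stands in for a struct mbuf pointer */
	uint32_t map;		/* stands in for a bus_dmamap_t */
};

/* Array base implied by the address of element 'index'. */
uint32_t
implied_base(uint32_t elem_addr, uint32_t index, uint32_t elem_size)
{
	return (elem_addr - index * elem_size);
}

/* Plug in the values reported by "info locals" in the panic. */
void
report_panic_values(void)
{
	const uint32_t tx_buffer = 0x568;	/* faulting pointer */
	const uint32_t i = 173;			/* ring index */
	const uint32_t size = (uint32_t)sizeof(struct em_buffer_i386);

	printf("element size = %u, implied array base = 0x%x\n",
	    (unsigned)size, (unsigned)implied_base(tx_buffer, i, size));
}
```

With the values from the panic, implied_base() comes out as 0. That only shows
the numbers fit a NULL base under the 8-byte assumption; it does not show how
the base came to be NULL.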

I would suggest taking the call to em_txeof() out of the watchdog code
(I didn't know that had been put in until now) and seeing if that makes
things better.

Jack


