Date:      Sat, 02 Oct 2010 10:37:01 -0400
From:      Mike Tancsa <mike@sentex.net>
To:        "Li, Qing" <qing.li@bluecoat.com>
Cc:        freebsd-stable@freebsd.org
Subject:   RE: if_rtdel: error 47 (netgraph or mpd issue?)
Message-ID:  <201010021437.o92EbAIl033701@lava.sentex.ca>
In-Reply-To: <201009171759.o8HHxCJM037780@lava.sentex.ca>
References:  <201008312102.o7VL2MJr000894@lava.sentex.ca> <B583FBF374231F4A89607B4D08578A4308026A4D@bcs-mail03.internal.cacheflow.com> <201009012255.o81MtMXn009701@lava.sentex.ca> <B583FBF374231F4A89607B4D08578A4308026ABC@bcs-mail03.internal.cacheflow.com> <201009081512.o88FCIq8064280@lava.sentex.ca> <AANLkTimkKpxLJZo0Oxce4tFXD3i4Jg1adw68B-LwxgAm@mail.gmail.com> <201009081535.o88FZKQS064396@lava.sentex.ca> <B583FBF374231F4A89607B4D08578A43080CAB7F@bcs-mail03.internal.cacheflow.com> <201009101651.o8AGp8uU080952@lava.sentex.ca> <201009171759.o8HHxCJM037780@lava.sentex.ca>

FYI,
         I disabled IPv6 in mpd and also set ipv6_enable="NO", and
the box has been stable for 2 weeks now.  Previously, it would crash
every 5 days or so.  Could the problem be something in inet6 or mpd?
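
For anyone wanting to try the same workaround, a minimal sketch of the two settings follows. The rc.conf line is the variable named above. The mpd line is an assumption: the message does not say how IPv6 was disabled in mpd, and turning off the ipv6cp bundle option is only one plausible way to do it in mpd5.

```
# /etc/rc.conf: disable IPv6 at boot (the variable named in the message)
ipv6_enable="NO"

# mpd.conf, inside the bundle definition.
# Assumption: IPv6 is disabled by turning off IPV6CP negotiation.
	set bundle disable ipv6cp
```

The rc.conf change takes effect on the next boot, and mpd must be restarted for its configuration change to apply.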

         ---Mike


At 01:59 PM 9/17/2010, Mike Tancsa wrote:
>At 12:51 PM 9/10/2010, Mike Tancsa wrote:
>
>
>>FYI, I enabled witness in the kernel and am seeing the following
>>
>>
>>uma_zalloc_arg: zone "128" with the following non-sleepable locks held:
>>exclusive rw ifnet_rw (ifnet_rw) r = 0 (0xc0b56ec4) locked @ 
>>/usr/src/sys/net/if.c:419
>
>
>Hi,
>         Another crash. I had it break to the serial debugger this time
>
>
>Fatal trap 12: page fault while in kernel mode
>cpuid = 1; apic id = 01
>fault virtual address   = 0x24
>fault code              = supervisor read, page not present
>instruction pointer     = 0x20:0xc64c79e4
>stack pointer           = 0x28:0xe7c84864
>frame pointer           = 0x28:0xe7c84a9c
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 1280 (mpd5)
>[thread pid 1280 tid 100096 ]
>Stopped at      ng_path2noderef+0x174:  testb   $0x1,0x24(%esi)
>db> bt
>Tracing pid 1280 tid 100096 td 0xc58f7780
>ng_path2noderef(cace4b80,cb0a5350,e7c84ab8,e7c84ab4,0,...) at 
>ng_path2noderef+0x174
>ng_address_path(cace4b80,c64d4400,cb0a5350,0,28885ba0,...) at 
>ng_address_path+0x40
>ngc_send(cb66db44,0,cb2f4500,cba946f0,0,...) at ngc_send+0x182
>sosend_generic(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend_generic+0x50d
>sosend(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend+0x3f
>kern_sendit(c58f7780,8d,e7c84c60,0,0,...) at kern_sendit+0x107
>sendit(0,cba946f0,7,e7c84c7c,1,...) at sendit+0xb1
>sendto(c58f7780,e7c84cf8,c093d225,c091bcfe,282,...) at sendto+0x48
>syscall(e7c84d38) at syscall+0x1da
>Xint0x80_syscall() at Xint0x80_syscall+0x21
>--- syscall (133, FreeBSD ELF32, sendto), eip = 0x284b13c7, esp = 
>0xbf9fe4cc, ebp = 0xbf9fe4f8 ---
>db> where
>Tracing pid 1280 tid 100096 td 0xc58f7780
>ng_path2noderef(cace4b80,cb0a5350,e7c84ab8,e7c84ab4,0,...) at 
>ng_path2noderef+0x174
>ng_address_path(cace4b80,c64d4400,cb0a5350,0,28885ba0,...) at 
>ng_address_path+0x40
>ngc_send(cb66db44,0,cb2f4500,cba946f0,0,...) at ngc_send+0x182
>sosend_generic(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend_generic+0x50d
>sosend(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend+0x3f
>kern_sendit(c58f7780,8d,e7c84c60,0,0,...) at kern_sendit+0x107
>sendit(0,cba946f0,7,e7c84c7c,1,...) at sendit+0xb1
>sendto(c58f7780,e7c84cf8,c093d225,c091bcfe,282,...) at sendto+0x48
>syscall(e7c84d38) at syscall+0x1da
>Xint0x80_syscall() at Xint0x80_syscall+0x21
>--- syscall (133, FreeBSD ELF32, sendto), eip = 0x284b13c7, esp = 
>0xbf9fe4cc, ebp = 0xbf9fe4f8 ---
>db> show locks
>exclusive sx so_snd_sx (so_snd_sx) r = 0 (0xcb66dc64) locked @ 
>/usr/src/sys/kern/uipc_sockbuf.c:148
>db> show alllocks
>Process 1928 (sshd) thread 0xc6402a00 (100094)
>exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xc669a898) locked @ 
>/usr/src/sys/kern/uipc_sockbuf.c:148
>Process 1281 (ng_queue) thread 0xc58f6a00 (100057)
>shared rw radix node head (radix node head) r = 0 (0xc56e1580) 
>locked @ /usr/src/sys/net/route.c:362
>Process 1280 (mpd5) thread 0xc58f7780 (100096)
>exclusive sx so_snd_sx (so_snd_sx) r = 0 (0xcb66dc64) locked @ 
>/usr/src/sys/kern/uipc_sockbuf.c:148
>db> call doadump()
>Physical memory: 2032 MB
>Dumping 274 MB: 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3
>Dump complete
>
>
>
>
>panic:
>
>GNU gdb 6.1.1 [FreeBSD]
>Copyright 2004 Free Software Foundation, Inc.
>GDB is free software, covered by the GNU General Public License, and you are
>welcome to change it and/or distribute copies of it under certain conditions.
>Type "show copying" to see the conditions.
>There is absolutely no warranty for GDB.  Type "show warranty" for details.
>This GDB was configured as "i386-marcel-freebsd"...
>
>Unread portion of the kernel message buffer:
>
>
>Fatal trap 12: page fault while in kernel mode
>cpuid = 1; apic id = 01
>fault virtual address   = 0x24
>fault code              = supervisor read, page not present
>instruction pointer     = 0x20:0xc64c79e4
>stack pointer           = 0x28:0xe7c84864
>frame pointer           = 0x28:0xe7c84a9c
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 1280 (mpd5)
>Physical memory: 2032 MB
>Dumping 274 MB: 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3
>
>#0  doadump () at pcpu.h:231
>231     pcpu.h: No such file or directory.
>         in pcpu.h
>(kgdb) #0  doadump () at pcpu.h:231
>#1  0xc04a5899 in db_fncall (dummy1=1, dummy2=0, dummy3=-1061510048,
>     dummy4=0xe7c84600 "") at /usr/src/sys/ddb/db_command.c:548
>#2  0xc04a5c91 in db_command (last_cmdp=0xc09cf71c, cmd_table=0x0, dopager=1)
>     at /usr/src/sys/ddb/db_command.c:445
>#3  0xc04a5dea in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
>#4  0xc04a7c6d in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:229
>#5  0xc069c7ae in kdb_trap (type=12, code=0, tf=0xe7c84824)
>     at /usr/src/sys/kern/subr_kdb.c:535
>#6  0xc08aabcf in trap_fatal (frame=0xe7c84824, eva=36)
>     at /usr/src/sys/i386/i386/trap.c:929
>#7  0xc08aadf0 in trap_pfault (frame=0xe7c84824, usermode=0, eva=36)
>     at /usr/src/sys/i386/i386/trap.c:851
>#8  0xc08ab5e3 in trap (frame=0xe7c84824) at /usr/src/sys/i386/i386/trap.c:533
>#9  0xc088ecdc in calltrap () at /usr/src/sys/i386/i386/exception.s:166
>#10 0xc64c79e4 in ng_path2noderef (here=0xcace4b80,
>     address=0xcb0a5350 "ctrl", destp=0xe7c84ab8, lasthook=0xe7c84ab4)
>     at 
> /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1757
>#11 0xc64c7d40 in ng_address_path (here=0xcace4b80, item=0xc64d4400,
>     address=0xcb0a5350 "ctrl", retaddr=0)
>     at 
> /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3536
>#12 0xc64c2662 in ngc_send (so=0xcb66db44, flags=0, m=0xcb2f4500,
>     addr=0xcba946f0, control=0x0, td=0xc58f7780)
>     at /usr/src/sys/modules/netgraph/socket/../../../netgraph/ng_socket.c:296
>#13 0xc06cf68d in sosend_generic (so=0xcb66db44, addr=0xcba946f0,
>     uio=0xe7c84bec, top=0xcb2f4500, control=0x0, flags=0, td=0xc58f7780)
>     at /usr/src/sys/kern/uipc_socket.c:1260
>#14 0xc06cbe2f in sosend (so=0xcb66db44, addr=0xcba946f0, uio=0xe7c84bec,
>     top=0x0, control=0x0, flags=0, td=0xc58f7780)
>     at /usr/src/sys/kern/uipc_socket.c:1304
>#15 0xc06d21f7 in kern_sendit (td=0xc58f7780, s=141, mp=0xe7c84c60, flags=0,
>     control=0x0, segflg=UIO_USERSPACE)
>     at /usr/src/sys/kern/uipc_syscalls.c:788
>#16 0xc06d23f1 in sendit (td=0xc58f7780, s=141, mp=0xe7c84c60, flags=0)
>     at /usr/src/sys/kern/uipc_syscalls.c:724
>#17 0xc06d2508 in sendto (td=0xc58f7780, uap=0xe7c84cf8)
>     at /usr/src/sys/kern/uipc_syscalls.c:840
>#18 0xc08aafea in syscall (frame=0xe7c84d38)
>     at /usr/src/sys/i386/i386/trap.c:1111
>#19 0xc088ed41 in Xint0x80_syscall ()
>     at /usr/src/sys/i386/i386/exception.s:264
>#20 0x00000033 in ?? ()
>Previous frame inner to this frame (corrupt stack?)
>
>
>(kgdb) up 10
>#10 0xc64c79e4 in ng_path2noderef (here=0xcace4b80, 
>address=0xcb0a5350 "ctrl", destp=0xe7c84ab8, lasthook=0xe7c84ab4)
>     at 
> /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1757
>1757                    NG_NODE_UNREF(oldnode); /* XXX another race */
>(kgdb) list
>1752                     * instead of the direct hook in this crawl?
>1753                     */
>1754                    oldnode = node;
>1755                    if ((node = NG_PEER_NODE(hook)))
>1756                            NG_NODE_REF(node);      /* XXX RACE */
>1757                    NG_NODE_UNREF(oldnode); /* XXX another race */
>1758                    if (NG_NODE_NOT_VALID(node)) {
>1759                            NG_NODE_UNREF(node);    /* XXX more races */
>1760                            node = NULL;
>1761                    }
>(kgdb)
>
>(kgdb) p *hook
>$3 = {hk_name = "ctrl", '\0' <repeats 27 times>, hk_private = 
>0xcb90a5c0, hk_flags = 0, hk_type = 0, hk_peer = 0xcab92e80,
>   hk_node = 0xcace4b80, hk_hooks = {le_next = 0x0, le_prev = 
> 0xcace4bb4}, hk_rcvmsg = 0, hk_rcvdata = 0, hk_refs = 2}
>(kgdb)
>(kgdb) p *node
>Cannot access memory at address 0x0
>(kgdb)
>
>_______________________________________________
>freebsd-stable@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
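
One reading of the kgdb output quoted above is that NG_PEER_NODE(hook) yielded NULL (p *node reports address 0x0) and that the following NG_NODE_NOT_VALID(node) test then dereferenced it, which would match the fault at address 0x24. The user-space sketch below illustrates only that NULL-check pattern. The types and the function step_to_peer are hypothetical stand-ins, not the netgraph structures or any actual fix, and the sketch does not address the races flagged by the XXX comments in ng_base.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real netgraph types. */
struct node {
	int refs;     /* reference count */
	int invalid;  /* nonzero once the node is being torn down */
};

struct hook {
	struct hook *peer;  /* peer hook, may be NULL */
	struct node *node;  /* owning node, may be NULL */
};

static void node_ref(struct node *n)   { n->refs++; }
static void node_unref(struct node *n) { n->refs--; }

/*
 * Step from `cur` across hook `h` to the peer node.
 * Always drops the caller's reference on `cur`.
 * Returns the peer node with one new reference held, or NULL when
 * there is no peer node or the peer node is no longer valid.
 */
static struct node *
step_to_peer(struct node *cur, const struct hook *h)
{
	struct node *next = NULL;

	assert(cur != NULL);

	if (h != NULL && h->peer != NULL)
		next = h->peer->node;
	if (next != NULL)
		node_ref(next);
	node_unref(cur);

	/* Test validity only when a peer node actually exists. */
	if (next != NULL && next->invalid) {
		node_unref(next);
		next = NULL;
	}
	return next;
}
```

A caller would treat a NULL return as "path component not found" and stop walking the path. Being single-threaded, the sketch says nothing about whether the peer can disappear between the lookup and the reference, which is the race the source comments point at.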

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
