Date: Sat, 02 Oct 2010 10:37:01 -0400
From: Mike Tancsa <mike@sentex.net>
To: "Li, Qing" <qing.li@bluecoat.com>
Cc: freebsd-stable@freebsd.org
Subject: RE: if_rtdel: error 47 (netgraph or mpd issue?)
Message-ID: <201010021437.o92EbAIl033701@lava.sentex.ca>
In-Reply-To: <201009171759.o8HHxCJM037780@lava.sentex.ca>
References: <201008312102.o7VL2MJr000894@lava.sentex.ca>
    <B583FBF374231F4A89607B4D08578A4308026A4D@bcs-mail03.internal.cacheflow.com>
    <201009012255.o81MtMXn009701@lava.sentex.ca>
    <B583FBF374231F4A89607B4D08578A4308026ABC@bcs-mail03.internal.cacheflow.com>
    <201009081512.o88FCIq8064280@lava.sentex.ca>
    <AANLkTimkKpxLJZo0Oxce4tFXD3i4Jg1adw68B-LwxgAm@mail.gmail.com>
    <201009081535.o88FZKQS064396@lava.sentex.ca>
    <B583FBF374231F4A89607B4D08578A43080CAB7F@bcs-mail03.internal.cacheflow.com>
    <201009101651.o8AGp8uU080952@lava.sentex.ca>
    <201009171759.o8HHxCJM037780@lava.sentex.ca>

FYI, I disabled ipv6 in mpd as well as set ipv6_enable="NO" and the box has been stable for 2 weeks now. Previously, it would crash every 5 days or so. Something in inet6 or mpd?

        ---Mike

At 01:59 PM 9/17/2010, Mike Tancsa wrote:
>At 12:51 PM 9/10/2010, Mike Tancsa wrote:
>
>>FYI, I enabled witness in the kernel and am seeing the following
>>
>>uma_zalloc_arg: zone "128" with the following non-sleepable locks held:
>>exclusive rw ifnet_rw (ifnet_rw) r = 0 (0xc0b56ec4) locked @ /usr/src/sys/net/if.c:419
>
>Hi,
>         Another crash. I had it break to the serial debugger this time
>
>Fatal trap 12: page fault while in kernel mode
>cpuid = 1; apic id = 01
>fault virtual address   = 0x24
>fault code              = supervisor read, page not present
>instruction pointer     = 0x20:0xc64c79e4
>stack pointer           = 0x28:0xe7c84864
>frame pointer           = 0x28:0xe7c84a9c
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, def32 1, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 1280 (mpd5)
>[thread pid 1280 tid 100096 ]
>Stopped at      ng_path2noderef+0x174:  testb   $0x1,0x24(%esi)
>db> bt
>Tracing pid 1280 tid 100096 td 0xc58f7780
>ng_path2noderef(cace4b80,cb0a5350,e7c84ab8,e7c84ab4,0,...) at ng_path2noderef+0x174
>ng_address_path(cace4b80,c64d4400,cb0a5350,0,28885ba0,...) at ng_address_path+0x40
>ngc_send(cb66db44,0,cb2f4500,cba946f0,0,...) at ngc_send+0x182
>sosend_generic(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend_generic+0x50d
>sosend(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend+0x3f
>kern_sendit(c58f7780,8d,e7c84c60,0,0,...) at kern_sendit+0x107
>sendit(0,cba946f0,7,e7c84c7c,1,...) at sendit+0xb1
>sendto(c58f7780,e7c84cf8,c093d225,c091bcfe,282,...) at sendto+0x48
>syscall(e7c84d38) at syscall+0x1da
>Xint0x80_syscall() at Xint0x80_syscall+0x21
>--- syscall (133, FreeBSD ELF32, sendto), eip = 0x284b13c7, esp = 0xbf9fe4cc, ebp = 0xbf9fe4f8 ---
>db> where
>Tracing pid 1280 tid 100096 td 0xc58f7780
>ng_path2noderef(cace4b80,cb0a5350,e7c84ab8,e7c84ab4,0,...) at ng_path2noderef+0x174
>ng_address_path(cace4b80,c64d4400,cb0a5350,0,28885ba0,...) at ng_address_path+0x40
>ngc_send(cb66db44,0,cb2f4500,cba946f0,0,...) at ngc_send+0x182
>sosend_generic(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend_generic+0x50d
>sosend(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend+0x3f
>kern_sendit(c58f7780,8d,e7c84c60,0,0,...) at kern_sendit+0x107
>sendit(0,cba946f0,7,e7c84c7c,1,...) at sendit+0xb1
>sendto(c58f7780,e7c84cf8,c093d225,c091bcfe,282,...) at sendto+0x48
>syscall(e7c84d38) at syscall+0x1da
>Xint0x80_syscall() at Xint0x80_syscall+0x21
>--- syscall (133, FreeBSD ELF32, sendto), eip = 0x284b13c7, esp = 0xbf9fe4cc, ebp = 0xbf9fe4f8 ---
>db> show locks
>exclusive sx so_snd_sx (so_snd_sx) r = 0 (0xcb66dc64) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
>db> show alllocks
>Process 1928 (sshd) thread 0xc6402a00 (100094)
>exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xc669a898) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
>Process 1281 (ng_queue) thread 0xc58f6a00 (100057)
>shared rw radix node head (radix node head) r = 0 (0xc56e1580) locked @ /usr/src/sys/net/route.c:362
>Process 1280 (mpd5) thread 0xc58f7780 (100096)
>exclusive sx so_snd_sx (so_snd_sx) r = 0 (0xcb66dc64) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
>db> call doadump()
>Physical memory: 2032 MB
>Dumping 274 MB: 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3
>Dump complete
>
>panic:
>
>GNU gdb 6.1.1 [FreeBSD]
>Copyright 2004 Free Software Foundation, Inc.
>GDB is free software, covered by the GNU General Public License, and you are
>welcome to change it and/or distribute copies of it under certain conditions.
>Type "show copying" to see the conditions.
>There is absolutely no warranty for GDB. Type "show warranty" for details.
>This GDB was configured as "i386-marcel-freebsd"...
>
>Unread portion of the kernel message buffer:
>
>Fatal trap 12: page fault while in kernel mode
>cpuid = 1; apic id = 01
>fault virtual address   = 0x24
>fault code              = supervisor read, page not present
>instruction pointer     = 0x20:0xc64c79e4
>stack pointer           = 0x28:0xe7c84864
>frame pointer           = 0x28:0xe7c84a9c
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, def32 1, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 1280 (mpd5)
>Physical memory: 2032 MB
>Dumping 274 MB: 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3
>
>#0  doadump () at pcpu.h:231
>231     pcpu.h: No such file or directory.
>        in pcpu.h
>(kgdb) #0  doadump () at pcpu.h:231
>#1  0xc04a5899 in db_fncall (dummy1=1, dummy2=0, dummy3=-1061510048,
>     dummy4=0xe7c84600 "") at /usr/src/sys/ddb/db_command.c:548
>#2  0xc04a5c91 in db_command (last_cmdp=0xc09cf71c, cmd_table=0x0, dopager=1)
>     at /usr/src/sys/ddb/db_command.c:445
>#3  0xc04a5dea in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
>#4  0xc04a7c6d in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:229
>#5  0xc069c7ae in kdb_trap (type=12, code=0, tf=0xe7c84824)
>     at /usr/src/sys/kern/subr_kdb.c:535
>#6  0xc08aabcf in trap_fatal (frame=0xe7c84824, eva=36)
>     at /usr/src/sys/i386/i386/trap.c:929
>#7  0xc08aadf0 in trap_pfault (frame=0xe7c84824, usermode=0, eva=36)
>     at /usr/src/sys/i386/i386/trap.c:851
>#8  0xc08ab5e3 in trap (frame=0xe7c84824) at /usr/src/sys/i386/i386/trap.c:533
>#9  0xc088ecdc in calltrap () at /usr/src/sys/i386/i386/exception.s:166
>#10 0xc64c79e4 in ng_path2noderef (here=0xcace4b80,
>     address=0xcb0a5350 "ctrl", destp=0xe7c84ab8, lasthook=0xe7c84ab4)
>     at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1757
>#11 0xc64c7d40 in ng_address_path (here=0xcace4b80, item=0xc64d4400,
>     address=0xcb0a5350 "ctrl", retaddr=0)
>     at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3536
>#12 0xc64c2662 in ngc_send (so=0xcb66db44, flags=0, m=0xcb2f4500,
>     addr=0xcba946f0, control=0x0, td=0xc58f7780)
>     at /usr/src/sys/modules/netgraph/socket/../../../netgraph/ng_socket.c:296
>#13 0xc06cf68d in sosend_generic (so=0xcb66db44, addr=0xcba946f0,
>     uio=0xe7c84bec, top=0xcb2f4500, control=0x0, flags=0, td=0xc58f7780)
>     at /usr/src/sys/kern/uipc_socket.c:1260
>#14 0xc06cbe2f in sosend (so=0xcb66db44, addr=0xcba946f0, uio=0xe7c84bec,
>     top=0x0, control=0x0, flags=0, td=0xc58f7780)
>     at /usr/src/sys/kern/uipc_socket.c:1304
>#15 0xc06d21f7 in kern_sendit (td=0xc58f7780, s=141, mp=0xe7c84c60, flags=0,
>     control=0x0, segflg=UIO_USERSPACE)
>     at /usr/src/sys/kern/uipc_syscalls.c:788
>#16 0xc06d23f1 in sendit (td=0xc58f7780, s=141, mp=0xe7c84c60, flags=0)
>     at /usr/src/sys/kern/uipc_syscalls.c:724
>#17 0xc06d2508 in sendto (td=0xc58f7780, uap=0xe7c84cf8)
>     at /usr/src/sys/kern/uipc_syscalls.c:840
>#18 0xc08aafea in syscall (frame=0xe7c84d38)
>     at /usr/src/sys/i386/i386/trap.c:1111
>#19 0xc088ed41 in Xint0x80_syscall ()
>     at /usr/src/sys/i386/i386/exception.s:264
>#20 0x00000033 in ?? ()
>Previous frame inner to this frame (corrupt stack?)
>
>(kgdb) up 10
>#10 0xc64c79e4 in ng_path2noderef (here=0xcace4b80,
>     address=0xcb0a5350 "ctrl", destp=0xe7c84ab8, lasthook=0xe7c84ab4)
>     at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1757
>1757                    NG_NODE_UNREF(oldnode); /* XXX another race */
>(kgdb) list
>1752                     * instead of the direct hook in this crawl?
>1753                     */
>1754                    oldnode = node;
>1755                    if ((node = NG_PEER_NODE(hook)))
>1756                            NG_NODE_REF(node);      /* XXX RACE */
>1757                    NG_NODE_UNREF(oldnode); /* XXX another race */
>1758                    if (NG_NODE_NOT_VALID(node)) {
>1759                            NG_NODE_UNREF(node);    /* XXX more races */
>1760                            node = NULL;
>1761                    }
>(kgdb)
>
>(kgdb) p *hook
>$3 = {hk_name = "ctrl", '\0' <repeats 27 times>, hk_private = 0xcb90a5c0,
>   hk_flags = 0, hk_type = 0, hk_peer = 0xcab92e80, hk_node = 0xcace4b80,
>   hk_hooks = {le_next = 0x0, le_prev = 0xcace4bb4}, hk_rcvmsg = 0,
>   hk_rcvdata = 0, hk_refs = 2}
>(kgdb)
>(kgdb) p *node
>Cannot access memory at address 0x0
>(kgdb)
>
>_______________________________________________
>freebsd-stable@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

--------------------------------------------------------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike