Date: Wed, 16 Feb 2011 15:04:09 +0600
From: Eugene Grosbein <egrosbein@rdtc.ru>
To: Gleb Smirnoff <glebius@freebsd.org>
Cc: Przemyslaw Frasunek <przemyslaw@frasunek.com>, Mike Tancsa <mike@sentex.net>, mav@freebsd.org, bz@freebsd.org, "net@freebsd.org" <net@freebsd.org>
Subject: Re: Netgraph/mpd5 stability issues
Message-ID: <4D5B9309.30508@rdtc.ru>
In-Reply-To: <20110216084635.GI42041@glebius.int.ru>
References: <20110131144838.GO62007@FreeBSD.org> <4D46F655.9000701@rdtc.ru> <20110131204816.GV62007@glebius.int.ru> <4D5A989E.8020703@sentex.net> <4D5B4F07.6080801@rdtc.ru> <20110216084635.GI42041@glebius.int.ru>

On 16.02.2011 14:46, Gleb Smirnoff wrote:

> On Wed, Feb 16, 2011 at 10:13:59AM +0600, Eugene Grosbein wrote:
> E> I run AMD64 with 4GB of memory, lots of memory is free and
> E> I still get panics often, sometimes two in a couple of hours.
> E> It does not seem memory exhaustion to me. It seems as very low probable race
> E> that happens occasionally but may happen any time.
> E>
> E> With Gleb's patch, it is obvious that panic happens at moments of user disconnect.
>
> I missed: did my patch fix panics in the ng_address_hook(), in this block?
>
>         if ((hook == NULL) ||
>             NG_HOOK_NOT_VALID(hook) ||
>             NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
>             NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
>                 NG_FREE_ITEM(item);
>                 TRAP_ERROR();
>                 return (ENETDOWN);
>         }

It seems so, yes. All my panics are now in _chkhook() being called with a bad hook
as its first argument.

> All the panics reported by you and Mike recently have traces unrelated
> to netgraph, and also traces look weird.

No, almost all my panics are related to netgraph. The call chains look like this:

ip_fastforward() - ng_rmnode_self() - ng_address_hook() - trap
sendto() - kern_sendit() - sosend_generic() - ng_parse_get_token() - ... - trap

Only one of my panics was unrelated to netgraph, with igmp_change_state() in the trace.

> May be there is some kind of memory corruption? May be try memguard(9)?

I can try memguard too; please tell me again which settings I should use.

One more thing: I've noticed that my traces show plenty of recursive calls,
for example (from my letter of 07.02):

panic: page fault
cpuid = 1
KDB: stack backtrace:
X_db_sym_numargs() at 0xffffffff801a227a = X_db_sym_numargs+0x15a
kdb_backtrace() at 0xffffffff8033d547 = kdb_backtrace+0x37
panic() at 0xffffffff8030b567 = panic+0x187
dblfault_handler() at 0xffffffff804c0ca0 = dblfault_handler+0x330
dblfault_handler() at 0xffffffff804c107f = dblfault_handler+0x70f
trap() at 0xffffffff804c155f = trap+0x3df
calltrap() at 0xffffffff804a8de4 = calltrap+0x8
--- trap 0xc, rip = 0xffffffff803e4f36, rsp = 0xffffff80ebff7400, rbp = 0xffffff80ebff7430 ---
ng_parse_get_token() at 0xffffffff803e4f36 = ng_parse_get_token+0x6596
ng_parse_get_token() at 0xffffffff803e5ccf = ng_parse_get_token+0x732f
ng_destroy_hook() at 0xffffffff803d53b2 = ng_destroy_hook+0x222
ng_rmnode() at 0xffffffff803d6118 = ng_rmnode+0xa08
ng_snd_item() at 0xffffffff803d8520 = ng_snd_item+0x3f0
ng_destroy_hook() at 0xffffffff803d52ed = ng_destroy_hook+0x15d
ng_rmnode() at 0xffffffff803d57b9 = ng_rmnode+0xa9
ng_rmnode() at 0xffffffff803d7664 = ng_rmnode+0x1f54
ng_snd_item() at 0xffffffff803d8520 = ng_snd_item+0x3f0
ng_parse_get_token() at 0xffffffff803e97fa = ng_parse_get_token+0xae5a
sosend_generic() at 0xffffffff80373df6 = sosend_generic+0x436
kern_sendit() at 0xffffffff803776d5 = kern_sendit+0x1a5
kern_sendit() at 0xffffffff8037790c = kern_sendit+0x3dc
sendto() at 0xffffffff803779fd = sendto+0x4d
syscallenter() at 0xffffffff8034a015 = syscallenter+0x1e5
syscall() at 0xffffffff804c10fb = syscall+0x4b
Xfast_syscall() at 0xffffffff804a90c2 = Xfast_syscall+0xe2
--- syscall (133, FreeBSD ELF64, sendto), rip = 0x8018c971c, rsp = 0x7fffffbfeab8, rbp = 0x80203dcc0 ---
Uptime: 2d17h1m42s

Is this normal? Is NETGRAPH protected against such an execution flow?

Eugene Grosbein
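
For readers following the thread, the guard quoted above from ng_address_hook() can be sketched in user space roughly as follows. This is only an illustration, not the kernel code: the sk_* types, the SK_INVALID flag and the helper functions are hypothetical stand-ins for netgraph's hook and node structures and its NG_HOOK_NOT_VALID / NG_NODE_NOT_VALID macros, and the NULL checks inside the helpers are an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a netgraph node and hook. */
#define SK_INVALID 0x1          /* assumed "being torn down" flag */

struct sk_node {
    int flags;
};

struct sk_hook {
    int flags;
    struct sk_hook *peer;       /* hook on the other end of the edge */
    struct sk_node *node;       /* node that owns this hook */
};

/* Sketch-only helpers; they also treat NULL as "not valid". */
static int
sk_hook_not_valid(const struct sk_hook *h)
{
    return (h == NULL || (h->flags & SK_INVALID) != 0);
}

static int
sk_node_not_valid(const struct sk_node *n)
{
    return (n == NULL || (n->flags & SK_INVALID) != 0);
}

/*
 * Resolve the destination node for data sent out of 'hook'.
 * Returns 0 and stores the peer's node in *dest, or ENETDOWN if the
 * hook, its peer, or the peer's node is missing or flagged invalid.
 */
static int
sk_address_hook(struct sk_hook *hook, struct sk_node **dest)
{
    struct sk_hook *peer;
    struct sk_node *peernode;

    assert(dest != NULL);
    if (hook == NULL ||
        sk_hook_not_valid(hook) ||
        sk_hook_not_valid(peer = hook->peer) ||
        sk_node_not_valid(peernode = peer->node))
        return (ENETDOWN);

    *dest = peernode;
    return (0);
}
```

A flag test of this kind only helps while the hook structure is still allocated. It cannot detect a pointer to memory that has already been freed, so it does not by itself rule out a race between delivery and teardown.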