Date: Wed, 16 Feb 2011 13:23:30 +0300
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Eugene Grosbein <egrosbein@rdtc.ru>
Cc: Przemyslaw Frasunek <przemyslaw@frasunek.com>, Mike Tancsa <mike@sentex.net>, mav@FreeBSD.org, bz@FreeBSD.org, "net@freebsd.org" <net@FreeBSD.org>, julian@FreeBSD.org
Subject: Re: Netgraph/mpd5 stability issues
Message-ID: <20110216102330.GJ42041@glebius.int.ru>
In-Reply-To: <4D5B9309.30508@rdtc.ru>
References: <20110131144838.GO62007@FreeBSD.org> <4D46F655.9000701@rdtc.ru> <20110131204816.GV62007@glebius.int.ru> <4D5A989E.8020703@sentex.net> <4D5B4F07.6080801@rdtc.ru> <20110216084635.GI42041@glebius.int.ru> <4D5B9309.30508@rdtc.ru>

On Wed, Feb 16, 2011 at 03:04:09PM +0600, Eugene Grosbein wrote:
E> On 16.02.2011 14:46, Gleb Smirnoff wrote:
E> > On Wed, Feb 16, 2011 at 10:13:59AM +0600, Eugene Grosbein wrote:
E> > E> I run AMD64 with 4GB of memory, lots of memory is free and
E> > E> I still get panics often, sometimes two in a couple of hours.
E> > E> It does not seem memory exhaustion to me. It seems as very low probable race
E> > E> that happens occasionally but may happen any time.
E> > E>
E> > E> With Gleb's patch, it is obvious that panic happens at moments of user disconnect.
E> >
E> > I missed: did my patch fix panics in the ng_address_hook(), in this block?
E> >
E> >         if ((hook == NULL) ||
E> >             NG_HOOK_NOT_VALID(hook) ||
E> >             NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
E> >             NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
E> >                 NG_FREE_ITEM(item);
E> >                 TRAP_ERROR();
E> >                 return (ENETDOWN);
E> >         }
E>
E> It seems, yes. All my panics now are in _chkhook() being called
E> with bad hook as first argument.

That is because of NETGRAPH_DEBUG, not my patch :(. Unfortunately, we
don't have coredumps and can't tell whether locking the destroy path
helped or not.

E> Only one of my panics was unrelated to netgraph, with igmp_change_state() in trace.
E>
E> > May be there is some kind of memory corruption? May be try memguard(9)?
E>
E> I can try memguard too, please tell again what setting should I use.

You need to set vm.memguard.desc to a memory type you want to monitor.
You can try for some time (several hours) all netgraph related memory types:

  vmstat -m | grep -i netgraph | awk '{print $1}'

E> One more thing: I've noticed my traced show there are plenty of recursive calls,
E> for example (from my letter of 07.02):

...

E> Is it normal, is NETGRAPH protected from such execution flow?

Yes, this is weird. For example kern_sendit() can't call kern_sendit()
for sure. Most other double calls in the trace are weird too.

--
Totus tuus, Glebius.
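
A minimal sketch of the memguard(9) advice in the message above, not part of the original mail. It assumes a kernel built with options DEBUG_MEMGUARD and that vm.memguard.desc holds a single malloc type name, so the netgraph types have to be tried one after another. The function names netgraph_types and memguard_cmds are hypothetical helpers. The script only prints the sysctl commands; it does not run them.

```shell
#!/bin/sh
# Hypothetical helper: read `vmstat -m` output on stdin and print the
# names of the netgraph-related malloc types, one per line. This is the
# same grep/awk filter as in the message.
netgraph_types() {
    grep -i netgraph | awk '{print $1}'
}

# Hypothetical helper: print (do not execute) one sysctl command per
# netgraph malloc type. Assumption: memguard watches one type at a time,
# so each command is meant to be run separately, leaving the box under
# load for a few hours before moving on to the next type.
memguard_cmds() {
    netgraph_types | while read -r mtype; do
        printf 'sysctl vm.memguard.desc=%s\n' "$mtype"
    done
}
```

With the functions loaded in a shell, `vmstat -m | memguard_cmds` lists the candidate commands; pick one, run it as root, and wait for the next panic before trying another type.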