Date: Wed, 6 Feb 2008 22:59:32 -0500 From: Louis Mamakos <louie@transsys.com> To: Gleb Smirnoff <glebius@FreeBSD.org> Cc: cvs-src@FreeBSD.org, Alexander Motin <mav@FreeBSD.org>, src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/netgraph/netflow ng_netflow.c Message-ID: <F97B091A-A8D5-46F6-AB5D-6E1F915E53BD@transsys.com> In-Reply-To: <20080205141739.GX14339@FreeBSD.org> References: <200801271501.m0RF1Hki089075@repoman.freebsd.org> <20080202201153.GL14339@FreeBSD.org> <47A4E122.8080901@FreeBSD.org> <C0C34BEB-3EB8-4552-B0BD-CE481311C77A@transsys.com> <20080205141739.GX14339@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Feb 5, 2008, at 9:17 AM, Gleb Smirnoff wrote: > On Sun, Feb 03, 2008 at 11:36:49AM -0500, Louis Mamakos wrote: > L> > Gleb Smirnoff =D0=BF=D0=B8=D1=88=D0=B5=D1=82: > L> >> you should have asked me for review before committing! This is > L> >> not a bug, this is a feature. This was quite clear from the =20 > comments, > L> >> that you removed: > L> >> - /* if export hook disconnected stop running expire(). */ > L> >> This is intended behavior. We must not lose information unless > L> >> user explicitly wants to lose information. In the latter case > L> >> he will connect ng_hole(4) node to the "export" hook. But we =20 > must > L> >> not lose information if user runs some script that swaps =20 > receiving > L> >> node on the "export" hook. > L> >> Please backout this change! > L> > > L> > Expire process was not depending completely on connected hook =20 > even before > L> > this commit. For example, every TCP session closing forces some =20= > data > L> > export. So even with export hook disconnected some data still =20 > will be lost > L> > and not just lost, but it was leading to memory leak which I =20 > have fixed > L> > with other commit. > > That's true. The active TCP close should be reworked. And the new =20 > active expiry > feature violates the original design, when no export hook ment, no =20 > data lose. :( > > L> If there's a concern about no losing the netflow data, then it's =20= > likely that > L> it's usually the case that an export hook is connected. If a =20 > user wanted to > L> change the export arrangement for the netflow data, then just =20 > disconnected > L> and reconnecting to the export hook won't caused data to be lost =20= > if the > L> expiry parameters are set to something reasonable. > > Since expiry runs periodically, then it can race with hook change I'm not sure why I'd have an expectation that I would never, under any circumstances lose data when switching the export hook. If I really, really wanted to arrange for that, perhaps I'd connect the export hook to a "tee" node so I could swap in different export destinations in a "make before break" sort of arrangement. > > > L> Finally, in the absence of infinite amounts of memory, data will =20= > eventually > L> be lost. The only decision is over what duration data should be =20= > kept around > L> so that it might be harvested. It's a huge surprise that the =20 > netflow module > L> consumes large amounts of kernel memory. As a user, I expected =20= > the > L> expiration timers to be the policy that I specify to control how =20= > long the > L> netflow stats are stored, and my expectation wasn't met. > > Huge surprise? How can you expect a kernel module that stores a lot =20= > of data > consume a little kernel memory? I suppose the problem is that I had no expectation that a kernel =20 module, would consume unbounded amounts of kernel resources. I certainly didn't =20 expect that it would have a need to store "a lot of data" given that there are =20 documented parameters on how the in-kernel state should be expired. That this =20 expiration doesn't occur is a significant difference that would I would have =20 expected as reasonable behavior. You start with the presumption that the data being collected is so =20 precious that it cannot be dropped under any circumstances. That's probably a faulty premise to begin with, given that most of the netflow export happens =20 on an unreliable UDP transport. > > I agree that the behavior should be documented in manual page and =20 > using > ng_hole(4) for your case should be advised. If you send me a manual =20= > page patch, > I can commit it. Driving the kernel into resource exhaustion for no really good reason =20= doesn't seem like the right default behavior. I really think that the netflow module should default into a safe mode of operation rather than =20 unexpected consumption of a limited resource. Louis Mamakos
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?F97B091A-A8D5-46F6-AB5D-6E1F915E53BD>