Date:      Wed, 6 Feb 2008 22:59:32 -0500
From:      Louis Mamakos <louie@transsys.com>
To:        Gleb Smirnoff <glebius@FreeBSD.org>
Cc:        cvs-src@FreeBSD.org, Alexander Motin <mav@FreeBSD.org>, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/netgraph/netflow ng_netflow.c
Message-ID:  <F97B091A-A8D5-46F6-AB5D-6E1F915E53BD@transsys.com>
In-Reply-To: <20080205141739.GX14339@FreeBSD.org>
References:  <200801271501.m0RF1Hki089075@repoman.freebsd.org> <20080202201153.GL14339@FreeBSD.org> <47A4E122.8080901@FreeBSD.org> <C0C34BEB-3EB8-4552-B0BD-CE481311C77A@transsys.com> <20080205141739.GX14339@FreeBSD.org>


On Feb 5, 2008, at 9:17 AM, Gleb Smirnoff wrote:

> On Sun, Feb 03, 2008 at 11:36:49AM -0500, Louis Mamakos wrote:
> L> > Gleb Smirnoff wrote:
> L> >>  you should have asked me for review before committing! This is
> L> >> not a bug, this is a feature. This was quite clear from the comments,
> L> >> that you removed:
> L> >> -       /* if export hook disconnected stop running expire(). */
> L> >>  This is intended behavior. We must not lose information unless
> L> >> user explicitly wants to lose information. In the latter case
> L> >> he will connect ng_hole(4) node to the "export" hook. But we must
> L> >> not lose information if user runs some script that swaps receiving
> L> >> node on the "export" hook.
> L> >>  Please backout this change!
> L> >
> L> > Expire process was not depending completely on connected hook even before
> L> > this commit. For example, every TCP session closing forces some data
> L> > export. So even with export hook disconnected some data still will be lost
> L> > and not just lost, but it was leading to memory leak which I have fixed
> L> > with other commit.
>
> That's true. The active TCP close should be reworked. And the new active expiry
> feature violates the original design, where no export hook meant no data loss. :(
>
> L>  If there's a concern about not losing the netflow data, then it's likely that
> L>  it's usually the case that an export hook is connected.  If a user wanted to
> L>  change the export arrangement for the netflow data, then just disconnecting
> L>  and reconnecting to the export hook won't cause data to be lost if the
> L>  expiry parameters are set to something reasonable.
>
> Since expiry runs periodically, it can race with a hook change.

I'm not sure why I'd have an expectation that I would never, under any
circumstances, lose data when switching the export hook.  If I really,
really wanted to arrange for that, perhaps I'd connect the export hook
to a "tee" node so I could swap in different export destinations in a
"make before break" sort of arrangement.

>
>
> L>  Finally, in the absence of infinite amounts of memory, data will eventually
> L>  be lost.  The only decision is over what duration data should be kept around
> L>  so that it might be harvested.  It's a huge surprise that the netflow module
> L>  consumes large amounts of kernel memory.  As a user, I expected the
> L>  expiration timers to be the policy that I specify to control how long the
> L>  netflow stats are stored, and my expectation wasn't met.
>
> Huge surprise? How can you expect a kernel module that stores a lot of data
> to consume little kernel memory?

I suppose the problem is that I had no expectation that a kernel module would
consume unbounded amounts of kernel resources.  I certainly didn't expect that
it would have a need to store "a lot of data" given that there are documented
parameters on how the in-kernel state should be expired.  That this expiration
doesn't occur is a significant difference from what I would have expected as
reasonable behavior.

You start with the presumption that the data being collected is so precious that
it cannot be dropped under any circumstances.  That's probably a faulty
premise to begin with, given that most of the netflow export happens on an
unreliable UDP transport.
>
> I agree that the behavior should be documented in the manual page, and using
> ng_hole(4) for your case should be advised. If you send me a manual page patch,
> I can commit it.

Driving the kernel into resource exhaustion for no really good reason doesn't
seem like the right default behavior.  I really think that the netflow
module should default into a safe mode of operation rather than unexpected
consumption of a limited resource.

Louis Mamakos



