From owner-cvs-src@FreeBSD.ORG Thu Feb 7 03:59:35 2008 Return-Path: Delivered-To: cvs-src@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1506116A41A; Thu, 7 Feb 2008 03:59:35 +0000 (UTC) (envelope-from louie@transsys.com) Received: from ringworld.transsys.com (ringworld.transsys.com [144.202.0.15]) by mx1.freebsd.org (Postfix) with ESMTP id B65AE13C461; Thu, 7 Feb 2008 03:59:34 +0000 (UTC) (envelope-from louie@transsys.com) Received: from PM-G5.transsys.com (c-69-141-143-164.hsd1.nj.comcast.net [69.141.143.164]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: louie) by ringworld.transsys.com (Postfix) with ESMTP id 9446B5C5A; Wed, 6 Feb 2008 22:59:33 -0500 (EST) Message-Id: From: Louis Mamakos To: Gleb Smirnoff In-Reply-To: <20080205141739.GX14339@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed; delsp=yes Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Apple Message framework v915) Date: Wed, 6 Feb 2008 22:59:32 -0500 References: <200801271501.m0RF1Hki089075@repoman.freebsd.org> <20080202201153.GL14339@FreeBSD.org> <47A4E122.8080901@FreeBSD.org> <20080205141739.GX14339@FreeBSD.org> X-Mailer: Apple Mail (2.915) Cc: cvs-src@FreeBSD.org, Alexander Motin , src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/netgraph/netflow ng_netflow.c X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Feb 2008 03:59:35 -0000 On Feb 5, 2008, at 9:17 AM, Gleb Smirnoff wrote: > On Sun, Feb 03, 2008 at 11:36:49AM -0500, Louis Mamakos wrote: > L> > Gleb Smirnoff =D0=BF=D0=B8=D1=88=D0=B5=D1=82: > L> >> you should have asked me for review before committing! This is > L> >> not a bug, this is a feature. This was quite clear from the =20 > comments, > L> >> that you removed: > L> >> - /* if export hook disconnected stop running expire(). */ > L> >> This is intended behavior. We must not lose information unless > L> >> user explicitly wants to lose information. In the latter case > L> >> he will connect ng_hole(4) node to the "export" hook. But we =20 > must > L> >> not lose information if user runs some script that swaps =20 > receiving > L> >> node on the "export" hook. > L> >> Please backout this change! > L> > > L> > Expire process was not depending completely on connected hook =20 > even before > L> > this commit. For example, every TCP session closing forces some =20= > data > L> > export. So even with export hook disconnected some data still =20 > will be lost > L> > and not just lost, but it was leading to memory leak which I =20 > have fixed > L> > with other commit. > > That's true. The active TCP close should be reworked. And the new =20 > active expiry > feature violates the original design, when no export hook ment, no =20 > data lose. :( > > L> If there's a concern about no losing the netflow data, then it's =20= > likely that > L> it's usually the case that an export hook is connected. If a =20 > user wanted to > L> change the export arrangement for the netflow data, then just =20 > disconnected > L> and reconnecting to the export hook won't caused data to be lost =20= > if the > L> expiry parameters are set to something reasonable. > > Since expiry runs periodically, then it can race with hook change I'm not sure why I'd have an expectation that I would never, under any circumstances lose data when switching the export hook. If I really, really wanted to arrange for that, perhaps I'd connect the export hook to a "tee" node so I could swap in different export destinations in a "make before break" sort of arrangement. > > > L> Finally, in the absence of infinite amounts of memory, data will =20= > eventually > L> be lost. The only decision is over what duration data should be =20= > kept around > L> so that it might be harvested. It's a huge surprise that the =20 > netflow module > L> consumes large amounts of kernel memory. As a user, I expected =20= > the > L> expiration timers to be the policy that I specify to control how =20= > long the > L> netflow stats are stored, and my expectation wasn't met. > > Huge surprise? How can you expect a kernel module that stores a lot =20= > of data > consume a little kernel memory? I suppose the problem is that I had no expectation that a kernel =20 module, would consume unbounded amounts of kernel resources. I certainly didn't =20 expect that it would have a need to store "a lot of data" given that there are =20 documented parameters on how the in-kernel state should be expired. That this =20 expiration doesn't occur is a significant difference that would I would have =20 expected as reasonable behavior. You start with the presumption that the data being collected is so =20 precious that it cannot be dropped under any circumstances. That's probably a faulty premise to begin with, given that most of the netflow export happens =20 on an unreliable UDP transport. > > I agree that the behavior should be documented in manual page and =20 > using > ng_hole(4) for your case should be advised. If you send me a manual =20= > page patch, > I can commit it. Driving the kernel into resource exhaustion for no really good reason =20= doesn't seem like the right default behavior. I really think that the netflow module should default into a safe mode of operation rather than =20 unexpected consumption of a limited resource. Louis Mamakos