Date: Wed, 01 Sep 2004 17:12:06 -0600
From: Scott Long <scottl@samsco.org>
To: "Marc G. Fournier"
Cc: Allan Fields, freebsd-current@freebsd.org
Subject: Re: vnode leak in FFS code ... ?

Marc G. Fournier wrote:
> On Wed, 1 Sep 2004, Allan Fields wrote:
>
>> On Wed, Sep 01, 2004 at 03:19:27PM -0300, Marc G. Fournier wrote:
>>
>>> I don't know if this is applicable to -current as well, but so far
>>> anything like this I've uncovered in 4.x has needed an equivalent
>>> fix in 5.x, so I figured it can't hurt to ask, especially with
>>> everyone working towards a STABLE 5.x branch ... I do not have a
>>> 5.x machine running this sort of load at the moment, so I can't
>>> test or provide feedback there ... all my 5.x machines are more or
>>> less desktops ...
>>>
>>> On Saturday I'm going to try an unmount of the bigger file system,
>>> to see if it frees everything up without a reboot ... but if
>>> someone can suggest something to check to see if it is a) a leak
>>> and b) fixable between now and then, please let me know ... again,
>>> this is a 4.10 system, but most of the work that Tor and David have
>>> done (re: vnodes) in the past relating to my servers has been
>>> applied to 5.x first and MFC'd afterwards, so I suspect that this
>>> too may be something that applies to both branches ...
>>
>> Unmounting the filesystems will call vflush() and should flush all
>> vnodes from under that mount point.  I'm not entirely sure if this
>> is the best you can do w/o rebooting.
>
> Understood, and agreed ... *but* ... is there a way, before I do
> that, of determining if this is something that needs to be fixed at
> the OS level?  Is there a leak here that I can somehow identify while
> it's in this state?
>
> The server has *only* been up 25 days

It's really hard to tell if there is a vnode leak here.  The vnode
pool is fairly fluid and has nothing to do with the number of files
that are actually 'open'.

Vnodes get created when the VFS layer wants to access an object that
isn't already in the cache, and they only get destroyed when the
object itself is destroyed.  A vnode that represents a file that was
opened will stay 'active' in the system long after the file has been
closed, because it's cheaper to keep it active in the cache than it is
to discard it and then risk having to go through the pain of a namei()
and VOP_LOOKUP() again later.  Only if the kern.maxvnodes limit is hit
will old vnodes start getting recycled to represent other objects.  In
other words, just because you open a file and then close it does not
mean that the vfs.numvnodes counter will increase by 1 and then
decrease by 1.  It will increase by 1 and then stay there until
something causes the vnode to be destroyed.  Unmounting a filesystem
is one way to destroy vnodes, and unlinking files is another (IIRC; my
memory might not be correct here).

So you've obviously bumped kern.maxvnodes well above the limit that
the auto-tuner would normally generate.  Why did you do that, if not
because you knew you'd have a large working set of referenced (but
maybe not all open at once) filesystem objects?  In reality, a number
that large is pretty impractical unless you're doing something
specialized like a very large (and active) squid cache or a large NNTP
server.  In those cases you'll likely reach the maxvnodes limit
relatively quickly, after which you'll be relying on the system to do
its best at keeping the cache hot with useful objects.  If you don't
have a case like that, then numvnodes might never grow to meet
maxvnodes, or will only do so over a good deal of time.  Taking 25
days to grow is not terribly unusual if you are serving a large
website, fileserver, file-based database, etc.  However, for
slow-growth cases like this, a large maxvnodes is likely unneeded and
is a waste of kernel RAM.

If unmounting the filesystem doesn't result in numvnodes decreasing,
then there may well be a leak.  Unfortunately, you haven't provided
that kind of information yet.

Scott
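
A small, untested sketch of watching these counters from userland with
sysctlbyname(3) follows.  Only the sysctl names vfs.numvnodes and
kern.maxvnodes come from the thread itself; the program, the one-second
polling interval, and the int-versus-long handling are assumptions, not
anything proposed in the discussion.  Running it across the Saturday
unmount should show whether numvnodes actually drops.

/*
 * Untested sketch: poll the two counters discussed above once a
 * second.  Only the sysctl names (vfs.numvnodes, kern.maxvnodes) come
 * from the thread; everything else here is an assumption.  The value
 * is read into a union because the counter width may be int or long
 * depending on the release.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdio.h>
#include <unistd.h>

static long
read_counter(const char *name)
{
	union { int i; long l; } val;
	size_t len = sizeof(val);

	if (sysctlbyname(name, &val, &len, NULL, 0) == -1)
		err(1, "sysctlbyname(%s)", name);
	/* The kernel reports how many bytes it copied out. */
	return (len == sizeof(val.i) ? (long)val.i : val.l);
}

int
main(void)
{
	for (;;) {
		printf("numvnodes=%ld maxvnodes=%ld\n",
		    read_counter("vfs.numvnodes"),
		    read_counter("kern.maxvnodes"));
		sleep(1);
	}
}

The same numbers can of course be read with sysctl(8), e.g.
"sysctl vfs.numvnodes kern.maxvnodes" in a shell loop; the C version
just makes the type handling explicit.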