From: "Marc G. Fournier"
To: freebsd-current@freebsd.org
Date: Wed, 1 Sep 2004 15:19:27 -0300 (ADT)
Subject: vnode leak in FFS code ... ?

I don't know if this is applicable to -current as well, but so far, anything like this I've uncovered in 4.x has needed an equivalent fix in 5.x, so I figured it can't hurt to ask, especially with everyone working towards a STABLE 5.x branch ... I do not have a 5.x machine running this sort of load at the moment, so I can't test or provide feedback there ... all my 5.x machines are more or less desktops ...

On Saturday, I'm going to try an unmount of the bigger file system, to see if it frees everything up without a reboot ... but if someone can suggest something to check to see whether it is a) a leak and b) fixable between now and then, please let me know ... again, this is a 4.10 system, but most of the work that Tor and David have done (re: vnodes) in the past relating to my servers has been applied to 5.x first and MFC'd afterwards, so I suspect that this too may be something that applies to both branches ...

-----------------

I have two servers, both running 4.10 from within a few days of each other (Aug 5 for venus, Aug 7 for neptune) ... both running jail environments ... one with ~60 jails running, the other with ~80 ... the one with 60 has been up for ~25 days now and is on the verge of running out of vnodes:

Aug 31 20:58:00 venus root: debug.numvnodes: 519920 - debug.freevnodes: 11058 - debug.vnlru_nowhere: 256463 - vlrup
Aug 31 20:59:01 venus root: debug.numvnodes: 519920 - debug.freevnodes: 13155 - debug.vnlru_nowhere: 256482 - vlrup
Aug 31 21:00:03 venus root: debug.numvnodes: 519920 - debug.freevnodes: 13092 - debug.vnlru_nowhere: 256482 - vlruwt

while the other one has been up for only ~1 day, but is using a lot fewer vnodes for more processes:

Aug 31 20:58:00 neptune root: debug.numvnodes: 344062 - debug.freevnodes: 208655 - debug.vnlru_nowhere: 0 - vlruwt
Aug 31 20:59:00 neptune root: debug.numvnodes: 344062 - debug.freevnodes: 208602 - debug.vnlru_nowhere: 0 - vlruwt
Aug 31 21:00:03 neptune root: debug.numvnodes: 344062 - debug.freevnodes: 208319 - debug.vnlru_nowhere: 0 - vlruwt

I've tried shutting down all of the VMs on venus, and umount'd all of the unionfs mounts, as well as the one nfs mount we have ... the above #s are from after the VMs (and mounts) were recreated ...
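For anyone who wants to watch the same counters on their own box, a trivial periodic job is all it takes ... a rough sketch follows, using only stock sysctl(8), ps(1) and logger(1); the ps/awk step, which pulls the vlruwt/vlrup state (the wait channel of the vnlru kernel thread), is a best guess at reproducing lines like the above, so adjust to taste:

    #!/bin/sh
    # sample the vnode counters and hand them to syslog; the last field is
    # the wait channel of the vnlru kernel thread (vlruwt when it is idle,
    # vlrup when it has been trying, and failing, to reclaim vnodes)
    num=`sysctl -n debug.numvnodes`
    free=`sysctl -n debug.freevnodes`
    nowhere=`sysctl -n debug.vnlru_nowhere`
    state=`ps -axo wchan,comm | awk '$2 == "vnlru" { print $1 }'`
    logger "debug.numvnodes: $num - debug.freevnodes: $free - debug.vnlru_nowhere: $nowhere - $state"

Run from cron once a minute as root, that should produce 'root:'-tagged syslog lines much like the ones quoted above.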
Now, my understanding of vnodes is that for every file opened, a vnode is created ... in my case, since I'm using unionfs, there are two vnodes per file ... is it possible that there are 'stale' vnodes that aren't being freed up? Is there some way of 'viewing' the vnode structure? For instance, fstat shows:

venus# fstat | wc -l
   19531

So, obviously it isn't just open files that I'm dealing with here, for even if I double that, it is nowhere near 519920 ...

So, where else are the vnodes going? Is there a 'leak'? What can I look at to try and narrow this down / provide more information? Even some way of determining a specific process that is sucking back a lot of them, so I could move it to a different machine ... ?

Looking at vmstat -m ... specifically the work that David did on separating the union vs regular vnodes:

  UNION mount     60      2K      3K  204800K        162    0     0  32
       undcac      0      0K      1K  204800K  343638713    0     0  16
       unpath  13146    227K   1025K  204800K   43541149    0     0  16,32,64,128
  Export Host      1      1K      1K  204800K        164    0     0  256
       vnodes    141      7K      8K  204800K        613    0     0  16,32,64,128,256

Why does 'vnodes' show only 141 InUse? Or, in this case, should I be looking at:

     FFS node 496600 124150K 127870K  204800K  401059293    0     0  256

496k FFS nodes, if I'm reading that right? (496600 * 256 bytes is 124150K, which matches the MemUse column exactly, so at least the numbers agree with each other ...) vs neptune, which is showing only:

     FFS node 300433  75109K  80257K  204800K    3875307    0     0  256

Hrmmm, maybe I'm mis-reading all of this, and going down the wrong path here, so hopefully someone will correct me if I am ... but, for now ...

Looking at vmstat -m a bit further, the top of the report has:

Memory statistics by bucket size
    Size   In Use    Free     Requests  HighWater  Couldfree
      16    13116   28356   2063580697       1280       7822
      32    77734    7002    168084205        640     316065
      64   465006   48402   2804541088        320     637084
     128   100182   60010    591859866        160    1850304
     256   500029   12163   1178322001         80     123078

Now, the only things that are using a lot of the '256 Size' memory are:

     FFS node 494513 123629K 127870K  204800K  401104542    0     0  256
     vfscache 449709  29178K  32434K  204800K  737673766    0     0  64,128,256,512K

Since only 500029 are 'InUse', and since FFS node is exclusively 256 ... I'm going to guess that most of vfscache is using something else ... so, my question becomes: if ~123000 'Could be Freed', why aren't they? Assuming, of course, I'm not on the wrong trail here :(
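In case it helps to see which file system (and therefore, roughly, which jail) holds the most files open, here is a quick-and-dirty per-mount breakdown of the fstat output ... the field number is an assumption based on the default fstat column layout (MOUNT is normally the fifth column), so it may need adjusting:

    #!/bin/sh
    # count open files per mount point and show the busiest twenty;
    # socket and pipe lines have a different layout and just show up as noise
    fstat | awk 'NR > 1 { n[$5]++ } END { for (m in n) printf "%8d %s\n", n[m], m }' \
        | sort -rn | head -20

Of course, that only counts files that are open right now, so it won't explain the gap between ~19531 open files and 519920 vnodes ... but it might at least show which jail is churning through the most of them.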