Date: Sun, 17 Jul 2005 12:10:33 -0300 (ADT)
From: "Marc G. Fournier" <scrappy@hub.org>
To: freebsd-stable@freebsd.org
Subject: vnode leak in NFS (Was: Re: 4.11-STABLE leaks vnodes worse then 4.x from Feb 13th ... ?)
Message-ID: <20050717120926.R66818@ganymede.hub.org>
In-Reply-To: <20050715120008.H66818@ganymede.hub.org>
References: <20050715120008.H66818@ganymede.hub.org>

Wow, now this was unexpected ... figured I'd try a quick theory this
morning ... debug.freevnodes was down to:

debug.freevnodes: 103987

umount /nfs and ...

# sysctl debug.freevnodes
debug.freevnodes: 332106

The vnode leak isn't in the unionfs code ... it's in the nfs code :(

On Fri, 15 Jul 2005, Marc G. Fournier wrote:

>
> Recently, I started having problems with one of my newest servers ...
> figuring that it might have something to do with the fact that I went SATA
> for this one (all others are SCSI), I figured it might be a driver issue
> causing the problems, since everything else is the same as the other 5
> servers on our network ...
>
> Today, I'm starting to wonder if I've been just looking at the "most
> obvious" cause, instead of looking deeper ...
>
> The problem that manifests itself is similar to the old 'ran out of vnode'
> issue I used to experience under 4.x ... the server would still run, be
> totally pingable, and you could even get the motd when you tried to ssh in,
> but you couldn't get a prompt, and all processes were hung ...
>
> I just upgraded the kernel on this machine (mercury) on the 13th of July,
> and it's been running 1 day, 12 hrs now ... there is hardly anything running
> on this machine (10 jails), and vnode usage is:
>
> debug.numvnodes: 336460 - debug.freevnodes: 5275 - debug.vnlru_nowhere: 0 -
> vlruwt
>
> One of my older servers (neptune), running kernels from Feb 13th of this
> year, and with 81 jails running on it, is using up *significantly fewer*
> vnodes (uptime: 1 day, 10 hours):
>
> debug.numvnodes: 279710 - debug.freevnodes: 91442 - debug.vnlru_nowhere: 0 -
> vlruwt
>
> Now, compared to neptune, mercury isn't running anything special ... several
> apache 1 processes, postfix, cyrus-imapd and that's it ... neptune, on the
> other hand, is running the full gamut ... aolserver, java, apache 1 and 2,
> postfix, etc ...
>
> So, I'm starting to think that the problem isn't "hardware related", but the
> kernel itself ... the latest 4.11-STABLE kernel seems to have brought in new
> vnode leakage, or ... vnlru isn't working as it should be to free up vnodes
> ...
>
> Looking at that process on mercury:
>
> # ps aux | grep vnlru
> root 7 0.0 0.0 0 0 ?? DL Wed11PM 0:00.65 (vnlru)
>
> whereas on neptune:
>
> # ps aux | grep vnlru
> root 9 0.0 0.0 0 0 ?? DL Thu01AM 0:00.79 (vnlru)
>
> so about the same amount of CPU time being expended ... a bit more on the
> more loaded server, but not a major amount ...
>
> I'd like to try and debug this, but don't know where to start ... I realize
> that 4.x isn't being pushed anymore, but there are a lot of us that haven't
> moved to 5.x yet (am working on that for our next server, but it's going to
> take me several months before I can convert all our existing servers up) ...
>
> I do have a serial console on this server, if that helps to debug things ...
>
> I've heard that there was some work done on 5.x to clean up some of the
> vnode leaks ... not sure if that is fact or just rumor ... but, if so, would
> any of them be MFCable to 4.x?
>
> Thanks ...
>
> ----
> Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
>

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
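
For readers following the numbers, the arithmetic behind the comparison can be sketched in a few lines of sh. This is only an illustration: the helper names vnodes_in_use and freevnodes_gained are made up for this note and are not part of FreeBSD, and "in use" is assumed here to mean simply debug.numvnodes minus debug.freevnodes. The figures are the ones quoted in the message.

```shell
#!/bin/sh
# Hypothetical helpers (not FreeBSD tools), shown only to make the
# arithmetic in the message explicit.

# Vnodes in use, assumed to be total allocated minus those on the free list.
# Arguments: $1 = debug.numvnodes, $2 = debug.freevnodes
vnodes_in_use() {
    echo $(( $1 - $2 ))
}

# Free vnodes gained across some action (here, an unmount).
# Arguments: $1 = debug.freevnodes before, $2 = debug.freevnodes after
freevnodes_gained() {
    echo $(( $2 - $1 ))
}

vnodes_in_use 336460 5275         # mercury, 10 jails   -> 331185
vnodes_in_use 279710 91442        # neptune, 81 jails   -> 188268
freevnodes_gained 103987 332106   # around 'umount /nfs' -> 228119
```

On a live system the before and after values would presumably be captured with `sysctl -n debug.freevnodes` on either side of the unmount, rather than typed in by hand.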
