Date: Mon, 14 Mar 2005 23:29:39 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Jeff Roberson <jroberson@chesapeake.net>
Cc: arch@freebsd.org
Subject: Re: Freeing vnodes.
Message-ID: <200503150729.j2F7Tdxl052412@apollo.backplane.com>
References: <20050314213038.V20708@mail.chesapeake.net>
:I have a patch at http://www.chesapeake.net/~jroberson/freevnodes.diff
:that allows us to start reclaiming vnodes from the free list and release
:their memory.  It also changes the semantics of wantfreevnodes, and makes
:getnewvnode() much prettier.

    Hmm.  I'm not sure your logic is correct in this bit:

	+	/*
	+	 * We try to get half way to wantfreevnodes each time we run.  This
	+	 * slows down the process slightly, giving vnodes a greater chance
	+	 * of being lru'd to the back of the list.
	+	 */
	+	count = (freevnodes - wantfreevnodes) / 2;
	+	for (; count > 0; count--) {
	...

    This seems to indicate that you are going to try to destroy *MOST*
    of the vnodes on the free list.  freevnodes is typically a fairly
    high number while wantfreevnodes is typically a very low number, so
    this calculation is typically going to be, well, a big number.  I
    think you did not intend this.

    Didn't you just want to destroy enough vnodes to have
    'wantfreevnodes' worth of slop so getnewvnode() could allocate new
    vnodes?  In that case the calculation would be:

	count = numvnodes - desiredvnodes + wantfreevnodes;
	while (count > 0) {
	    ... physically free a vnode and reduce the numvnodes count ...
	    --count;
	}

:This also removes the recycling from the getnewvnode() path.  Instead, it
:is done by a new helper function that is called from vnlru_proc().  This
:function just frees vnodes from the head of the list until we reach our
:wantfreevnodes target.
:
:I haven't perf tested this yet, but I have a box that is doing a
:buildworld with a fairly constant freevnodes count which shows that vnodes
:are actually being uma_zfree'd.
:
:Comments?  Anyone willing to do some perf tests for me?
:
:Thanks,
:Jeff

    There is a second issue when you remove the recycling from the
    getnewvnode() path.  Generally speaking, when the system runs out of
    vnodes you wind up with a LOT of processes sleeping in the
    getnewvnode() procedure all at once.
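    To make the difference between the two calculations concrete, here is
    a minimal user-land sketch.  All the numbers are hypothetical, chosen
    only for illustration; the variables merely stand in for the kernel
    globals of the same names:

	#include <assert.h>

	/* Hypothetical values loosely modeled on the kernel counters. */
	static int desiredvnodes  = 25000;	/* kern.maxvnodes */
	static int numvnodes      = 25000;	/* vnodes currently allocated */
	static int freevnodes     = 20000;	/* vnodes on the free list */
	static int wantfreevnodes = 25;		/* desired free-list slop */

	/* The patch's pass size: half the distance to wantfreevnodes. */
	static int
	count_patch(void)
	{
		return ((freevnodes - wantfreevnodes) / 2);
	}

	/* The suggested pass size: just enough slop for getnewvnode(). */
	static int
	count_suggested(void)
	{
		return (numvnodes - desiredvnodes + wantfreevnodes);
	}

	int
	main(void)
	{
		/* The patch would destroy most of the free list in one run. */
		assert(count_patch() == 9987);
		/* The suggestion frees only the wantfreevnodes slop. */
		assert(count_suggested() == 25);
		return (0);
	}

    With these (made-up) values the patch frees 9987 vnodes per pass while
    the suggested calculation frees 25, which is the difference being
    argued above.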
    You need to make very sure that you are actually freeing up a
    sufficient number of vnodes to allow *all* the processes waiting for
    a new vnode to proceed at once.  This would argue against only going
    'half way' to the goal, and would also argue for a more dynamic goal
    that is based on the load on getnewvnode().

    Even worse, since processes open and close files all the time, there
    can be a HUGE load on getnewvnode() (think of cvsupd and find, or a
    cvs update, etc...).  This load can easily outstrip vnlru_proc()'s
    new ability to free vnodes and potentially cause a lot of
    unnecessary blockages.

    I love the idea of being able to free vnodes in vnlru_proc() rather
    than free-and-reuse them in allocvnode(), but I cannot figure out
    how vnlru_proc() could possibly adapt to the huge load range that
    getnewvnode() has to deal with.  Plus, keep in mind that the vnodes
    being reused at that point are basically already dead except for the
    vgonel().

    This brings up the true crux of the problem: the true overhead of
    reusing a vnode inline with the getnewvnode() call is that vgonel()
    potentially has to update the related inode, and so could cause an
    unrelated process to block inside getnewvnode().  But even this case
    is a hard sell, because the buffer cache for the filesystem has
    almost certainly cached the inode's disk block, and a stat() alone
    does not actually dirty an inode; you usually have to read() from
    the file too.  I would argue that if anything needs fixing here it
    is the 'late' inode sync.  [note: inode syncing can occur in
    INACTIVE as well as RECLAIM, and I'm not sure which routine it
    occurs in 'most of the time'.]

					-Matt
					Matthew Dillon
					<dillon@backplane.com>
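    [The 'dynamic goal' idea above can be sketched as follows.  This is
    not kernel code; every name is hypothetical, and vnlru_waiters
    merely stands in for a count of processes currently sleeping in
    getnewvnode():]

	#include <assert.h>

	/*
	 * Size a vnlru pass from observed demand rather than a fixed
	 * fraction of the free list: free at least the wantfreevnodes
	 * slop, but never less than the number of sleepers, so that all
	 * of them can proceed at once when they wake up.
	 */
	static int
	vnlru_target(int numvnodes, int desiredvnodes, int wantfreevnodes,
	    int vnlru_waiters)
	{
		int count;

		count = numvnodes - desiredvnodes + wantfreevnodes;
		if (count < vnlru_waiters)
			count = vnlru_waiters;
		return (count > 0 ? count : 0);
	}

	int
	main(void)
	{
		/* Idle system: only the slop is freed. */
		assert(vnlru_target(25000, 25000, 25, 0) == 25);
		/* 400 sleepers: the goal scales with the load. */
		assert(vnlru_target(25000, 25000, 25, 400) == 400);
		return (0);
	}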
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200503150729.j2F7Tdxl052412>