From owner-freebsd-hackers@FreeBSD.ORG Mon Oct 15 20:58:18 2012
Date: Mon, 15 Oct 2012 16:58:16 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Ivan Voras
Cc: freebsd-hackers@freebsd.org
Message-ID: <1516511249.2287339.1350334696127.JavaMail.root@erie.cs.uoguelph.ca>
Subject: Re: NFS server bottlenecks
List-Id: Technical Discussions relating to FreeBSD

Ivan Voras wrote:
> On 13/10/2012 17:22, Nikolay Denev wrote:
>
> > drc3.patch applied and built cleanly and shows nice improvement!
> >
> > I've done a quick benchmark using iozone over the NFS mount from the
> > Linux host.
>
> Hi,
>
> If you are already testing, could you please also test this patch:
>
> http://people.freebsd.org/~ivoras/diffs/nfscache_lock.patch

I don't think (it is hard to test this) your trim cache algorithm will
choose the correct entries to delete. The problem is that UDP entries
very seldom time out (unless the NFS server is seeing hardly any load)
and are mostly trimmed because the size exceeds the high water mark.

With your code, it will clear out all of the entries in the first hash
buckets that aren't currently busy, until the total count drops below
the high water mark. (If you monitor a busy server with "nfsstat -e -s",
you'll see the cache never goes below the high water mark, which is 500
by default.) This would delete entries for fairly recent requests.

If you are going to replace the global LRU list with one for each hash
bucket, then you'll have to compare the time stamps on the least
recently used entries of all the hash buckets and then delete those. If
you keep the timestamp of the least recent entry for each hash bucket in
the hash bucket head, you could at least use that to select which bucket
to delete from next, but you'll still need to:
- lock that hash bucket
- delete a few entries from that bucket's LRU list
- unlock the hash bucket
- repeat for various buckets until the count is below the high water mark

Or something like that. I think you'll find it a lot more work than one
LRU list and one mutex. Remember that the mutex isn't held for long.

Btw, the code looks very nice. (If I were being a style(9) zealot, I'd
remind you that it likes "return (X);" and not "return X;".)

rick

> It should apply to HEAD without Rick's patches.
>
> It's a bit different approach than Rick's, breaking down locks even
> more.