From: Rick Macklem
To: Garrett Wollman
Cc: Nikolay Denev, FreeBSD Hackers
Date: Sat, 13 Oct 2012 09:03:22 -0400 (EDT)
Subject: Re: NFS server bottlenecks
Message-ID: <611092759.2189637.1350133402953.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20600.62541.243673.307571@hergotha.csail.mit.edu>
List-Id: Technical Discussions relating to FreeBSD

Garrett Wollman wrote:
> < said:
>
> > I've attached the patch drc3.patch (it assumes drc2.patch has
> > already been
> > applied) that replaces the single mutex
> > with one for each hash list
> > for tcp. It also increases the size of NFSRVCACHE_HASHSIZE to 200.
>
> I haven't tested this at all, but I think putting all of the mutexes
> in an array like that is likely to cause cache-line ping-ponging. It
> may be better to use a pool mutex, or to put the mutexes adjacent in
> memory to the list heads that they protect.

Well, I'll admit I don't know how to do this.

What the code does need is a "set of mutexes", where any of the mutexes
can be referred to by an "index". I could easily define a structure
that has:

    struct nfsrc_hashhead {
        struct nfsrvcachehead head;
        struct mtx mutex;
    } nfsrc_hashhead[NFSRVCACHE_HASHSIZE];

- but all that does is leave a small structure between each
  "struct mtx" and I wouldn't have thought that would make much
  difference. (How big is a typical hardware cache line these days?
  I have no idea.)
- I suppose I could "waste space" and define a glob of unused space
  between them, like:

    struct nfsrc_hashhead {
        struct nfsrvcachehead head;
        char garbage[N];
        struct mtx mutex;
    } nfsrc_hashhead[NFSRVCACHE_HASHSIZE];

- If this makes sense, how big should N be? (Somewhat less than the
  length of a cache line, I'd guess. It seems that the structure should
  be at least a cache line length in size.)

All this seems "kinda hokey" to me and beyond what code at this level
should be worrying about, but I'm game to make changes, if others think
it's appropriate.

I've never used mtx_pool(9) mutexes, but it doesn't sound like they
would be the right fit, from reading the man page. (Assuming that
mtx_pool_find() is guaranteed to return the same mutex for the same
address passed in as an argument, it would seem that they would work,
since I can pass &nfsrvcachehead[i] in as the pointer arg to "index"
a mutex.)

Hopefully jhb@ can say if using mtx_pool(9) for this would be better
than an array:

    struct mtx nfsrc_tcpmtx[NFSRVCACHE_HASHSIZE];

Does anyone conversant with mutexes know what the best coding approach
is?
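For illustration only, here is a minimal userland sketch of the "mutex next to the list head it protects, one bucket per cache line" layout discussed above. It is not taken from drc3.patch. The assumptions: a 64-byte cache line, pthread_mutex_t as a stand-in for "struct mtx", a one-pointer stand-in for "struct nfsrvcachehead", and the hypothetical names nfsrc_hashtbl, nfsrc_hash_init() and nfsrc_bucket(). Rather than picking N for a garbage[N] array by hand, it asks the compiler to align the structure, which also rounds sizeof up to a multiple of the cache line.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real value is machine dependent. */
#define CACHE_LINE_SIZE     64
#define NFSRVCACHE_HASHSIZE 200

/* Hypothetical stand-in for the kernel's list head type. */
struct nfsrvcachehead {
    void *lh_first;
};

/*
 * One hash bucket: the mutex and the list head it protects sit together.
 * Aligning the first member to the cache line size forces the whole
 * structure to that alignment, so the compiler pads sizeof() up to a
 * multiple of CACHE_LINE_SIZE and no two buckets share a line.
 */
struct nfsrc_hashhead {
    _Alignas(CACHE_LINE_SIZE) pthread_mutex_t mutex; /* stand-in for struct mtx */
    struct nfsrvcachehead head;
};

static_assert(sizeof(struct nfsrc_hashhead) % CACHE_LINE_SIZE == 0,
    "each bucket must occupy whole cache lines");

/* Hypothetical table name; the array itself inherits the alignment. */
static struct nfsrc_hashhead nfsrc_hashtbl[NFSRVCACHE_HASHSIZE];

/* Initialize every per-bucket mutex and empty every list head. */
static void
nfsrc_hash_init(void)
{
    for (size_t i = 0; i < NFSRVCACHE_HASHSIZE; i++) {
        pthread_mutex_init(&nfsrc_hashtbl[i].mutex, NULL);
        nfsrc_hashtbl[i].head.lh_first = NULL;
    }
}

/* Map a key to its bucket; the modulo hash is an assumption. */
static struct nfsrc_hashhead *
nfsrc_bucket(uint32_t key)
{
    return &nfsrc_hashtbl[key % NFSRVCACHE_HASHSIZE];
}
```

A caller would do something like "b = nfsrc_bucket(key); pthread_mutex_lock(&b->mutex);", walk b->head, then unlock. In the kernel the same shape would presumably use "struct mtx" with an alignment attribute such as __aligned(CACHE_LINE_SIZE); that mapping is an assumption here and untested. The cost is memory: with these assumptions each of the 200 buckets takes at least one full cache line.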
>(But I probably won't be
> able to do the performance testing on any of these for a while. I
> have a server running the "drc2" code but haven't gotten my users to
> put a load on it yet.)
>
No rush. At this point, the earliest I could commit something like this
to head would be December.

rick
ps: I hope John doesn't mind being added to the cc list yet again. It's
    just that I suspect he knows a fair bit about mutex implementation
    and possible hardware cache line effects.

> -GAWollman
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to
> "freebsd-hackers-unsubscribe@freebsd.org"