Date: Tue, 19 May 2009 11:01:11 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: freebsd-arch@freebsd.org
Subject: nfs server resource exhaustion (before it's too late)

In the experimental nfs server (sys/fs/nfsserver), there is a function
that, when it returns non-zero, causes the server to reply NFSERR_DELAY
to the client so that the client will try the RPC again a little later.
(For NFSv2 over UDP, which doesn't have NFSERR_DELAY, the server simply
drops the request and assumes the client will time out and retry it.)
This is intended to avoid the situation where the server cannot
m_get()/m_getcl()/malloc() part way through processing a request, due
to resource exhaustion. (The malloc() case isn't as critical, since I
have high water marks set to limit the number of allocations for the
various NFSv4 state related structures that are malloc()'d.)

At this point the function is just a stub:

int
nfsrv_mallocmget_limit(void)
{

        return (0);
}
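For context, here is roughly how a non-zero return gets used. This is
just an illustrative sketch, not the actual sys/fs/nfsserver dispatch
code; everything in it except nfsrv_mallocmget_limit() and NFSERR_DELAY
is made up for the example:

/*
 * Illustrative sketch only.  NFSERR_DELAY is the NFS4ERR_DELAY value
 * (10008); EXAMPLE_DROP is a hypothetical marker meaning "send no
 * reply at all and let the client time out".
 */
#ifndef NFSERR_DELAY
#define NFSERR_DELAY    10008
#endif
#define EXAMPLE_DROP    (-1)

int     nfsrv_mallocmget_limit(void);

static int
example_dispatch(int is_nfsv2_udp)
{

        if (nfsrv_mallocmget_limit() != 0) {
                if (is_nfsv2_udp) {
                        /*
                         * NFSv2 has no NFSERR_DELAY, so drop the
                         * request and assume the client will time
                         * out and retry it.
                         */
                        return (EXAMPLE_DROP);
                }
                /* Ask the client to retry the RPC a little later. */
                return (NFSERR_DELAY);
        }
        /* ... otherwise process the RPC normally ... */
        return (0);
}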
I just took a quick look (I don't know anything about UMA, except that
it seems to be used by m_get() and m_getcl()) and this is what I could
come up with for doing the above on FreeBSD 8. (It wasn't obvious to
me whether a limit is set for the various zones used by malloc(), so I
didn't include those.)

int
nfsrv_mallocmget_limit(void)
{
        u_int32_t pages, maxpages;

        uma_zone_get_pagecnts(zone_clust, &pages, &maxpages);
        /*
         * Flag exhaustion once the cluster zone is within about 5/6
         * (~83%) of its page limit; maxpages == 0 means no limit set.
         */
        if (maxpages != 0 && (pages * 12 / 10) > maxpages)
                return (1);
        return (0);
}

At this point, the only existing function I could find that returns
the above information is sysctl_vm_zone_stats(), and it looks like
overkill. Also, the check needs to be relatively low overhead, since
it is done for every nfs rpc the server gets, so I thought this might
be ok:

/* added to sys/vm/uma_core.c */
void
uma_zone_get_pagecnts(uma_zone_t zone, u_int32_t *pages,
    u_int32_t *maxpages)
{
        uma_keg_t keg;

        ZONE_LOCK(zone);
        keg = zone_first_keg(zone);
        *pages = keg->uk_pages;
        *maxpages = keg->uk_maxpages;
        ZONE_UNLOCK(zone);
}

Does this look reasonable, or can anyone suggest a better alternative?

Thanks in advance for any suggestions, rick
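Since m_get() allocates mbufs from zone_mbuf and m_getcl() also grabs
a cluster from zone_clust, the same check could presumably be applied
to zone_mbuf as well. A rough, untested sketch of that (the helper
function is just my guess at a tidy way to check both zones, and it
assumes the uma_zone_get_pagecnts() addition above):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>           /* externs for zone_mbuf, zone_clust */
#include <vm/uma.h>

/* Return non-zero if the zone is within ~5/6 of its page limit. */
static int
example_zone_near_limit(uma_zone_t zone)
{
        u_int32_t pages, maxpages;

        uma_zone_get_pagecnts(zone, &pages, &maxpages);
        return (maxpages != 0 && (pages * 12 / 10) > maxpages);
}

int
nfsrv_mallocmget_limit(void)
{

        if (example_zone_near_limit(zone_mbuf) ||
            example_zone_near_limit(zone_clust))
                return (1);
        return (0);
}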