From owner-freebsd-bugs Thu Dec 2 10: 0:24 1999 Delivered-To: freebsd-bugs@freebsd.org Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21]) by hub.freebsd.org (Postfix) with ESMTP id C479314E85 for ; Thu, 2 Dec 1999 10:00:20 -0800 (PST) (envelope-from gnats@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.9.3/8.9.2) id KAA18650; Thu, 2 Dec 1999 10:00:01 -0800 (PST) (envelope-from gnats@FreeBSD.org) Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11]) by hub.freebsd.org (Postfix) with SMTP id 3D8E314DB8 for ; Thu, 2 Dec 1999 09:56:19 -0800 (PST) (envelope-from iedowse@maths.tcd.ie) Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ; 2 Dec 1999 17:54:52 +0000 (GMT) Message-Id: <199912021754.aa92503@walton.maths.tcd.ie> Date: Thu, 2 Dec 1999 17:54:52 +0000 (GMT) From: iedowse@maths.tcd.ie Reply-To: iedowse@maths.tcd.ie To: FreeBSD-gnats-submit@freebsd.org X-Send-Pr-Version: 3.2 Subject: kern/15222: mbuf leak in nfs_srvcache.c Sender: owner-freebsd-bugs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org >Number: 15222 >Category: kern >Synopsis: mbuf leak in nfs_srvcache.c >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Thu Dec 2 10:00:01 PST 1999 >Closed-Date: >Last-Modified: >Originator: Ian Dowse >Release: FreeBSD 3.3-STABLE i386 >Organization: School of Mathematics Trinity College Dublin >Environment: All versions of FreeBSD, when used as an NFS server. This problem will only have a significant impact on very busy NFS servers. >Description: FreeBSD maintains a cache of some recent NFS server requests and their replies. This allows duplicate requests to be discarded while the requested operation is still in progress, and avoids the need to redo an operation if the reply gets lost. The default maximum cache size is 64 entries. When a request is received, nfsrv_getcache() either finds an existing cache entry, or creates a new one. After the operation completes, nfsrv_updatecache() updates the cache entry, sometimes including a copy of the reply mbuf chain. The problem is that nfsrv_updatecache() assumes it will called only once for a given cache entry, so it does not check if the cache entry already has an associated mbuf chain. Normally this assumption is correct, but under certain high-load situations, the following can occur: 1) A request R is received by the server, say a write with the 'filesync' option. 2) 64 other requests are received before R completes. The cache entry for R gets flushed out. 3) A retransmit of R is received (R1); since the cache entry is gone, a new cache entry is created, and execution of R1 begins. 4) R completes, and a copy of the reply mbuf chain gets saved by nfsrv_updatecache(). 5) R1 completes. nfsrv_updatecache() replaces the cache entry with a copy of R1's reply, but doesn't free the original mbuf chain. This problem was leaking 200-500 mbufs each day on our busiest NFS server. There is another leak in nfsrv_cleancache(), but I think this will only be triggered if all the nfsd's get killed. The patch below fixes both of these problems. As commented in the patch, this problem only occurs when the cache is too small, so it might be desirable to increase the cache size (double it maybe) when nfsrv_updatecache() finds a RC_DONE entry. Recording the event in the nfs stats structure and having a sysctl to increase the cache size would be another option. I can submit patches for either. >How-To-Repeat: 1) Keep the disk on the NFS server busy - for example run find /whatever -type f |xargs cat > /dev/null 2) Send an NFS filesync write request for a file 'file1' 3) Send 64 access requests for another file 'file2' each with a different xid, to flush the cache. 4) Retransmit the NFS filesync write (with the same xid). 5) Wait for the two replies to the writes. This should leak just a few mbufs, so it needs to be repeated a number of times before the leak becomes apparent. >Fix: --- nfs_srvcache.c.orig Thu Dec 2 16:22:37 1999 +++ nfs_srvcache.c Thu Dec 2 16:49:44 1999 @@ -293,6 +293,19 @@ goto loop; } rp->rc_flag |= RC_LOCKED; + if (rp->rc_state == RC_DONE) { + /* + * This can happen if the cache is too small; + * retransmits of the same request aren't + * dropped so we see the operation completing + * more than once. + * XXX maybe increase desirednfsrvcache here. + */ + if (rp->rc_flag & RC_REPMBUF) { + m_freem(rp->rc_reply); + rp->rc_flag &= ~RC_REPMBUF; + } + } rp->rc_state = RC_DONE; /* * If we have a valid reply update status and save @@ -332,6 +345,10 @@ nextrp = rp->rc_lru.tqe_next; LIST_REMOVE(rp, rc_hash); TAILQ_REMOVE(&nfsrvlruhead, rp, rc_lru); + if (rp->rc_flag & RC_REPMBUF) + m_freem(rp->rc_reply); + if (rp->rc_flag & RC_NAM) + free(rp->rc_nam, M_SONAME); free(rp, M_NFSD); } numnfsrvcache = 0; >Release-Note: >Audit-Trail: >Unformatted: To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-bugs" in the body of the message