Date: Thu, 2 Dec 1999 17:54:52 +0000 (GMT) From: iedowse@maths.tcd.ie To: FreeBSD-gnats-submit@freebsd.org Subject: kern/15222: mbuf leak in nfs_srvcache.c Message-ID: <199912021754.aa92503@walton.maths.tcd.ie>
next in thread | raw e-mail | index | archive | help
>Number: 15222
>Category: kern
>Synopsis: mbuf leak in nfs_srvcache.c
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Dec 2 10:00:01 PST 1999
>Closed-Date:
>Last-Modified:
>Originator: Ian Dowse
>Release: FreeBSD 3.3-STABLE i386
>Organization:
School of Mathematics
Trinity College Dublin
>Environment:
All versions of FreeBSD, when used as an NFS server. This problem
will only have a significant impact on very busy NFS servers.
>Description:
FreeBSD maintains a cache of some recent NFS server requests and
their replies. This allows duplicate requests to be discarded while
the requested operation is still in progress, and avoids the need
to redo an operation if the reply gets lost. The default maximum
cache size is 64 entries.
When a request is received, nfsrv_getcache() either finds an existing
cache entry, or creates a new one. After the operation completes,
nfsrv_updatecache() updates the cache entry, sometimes including
a copy of the reply mbuf chain.
The problem is that nfsrv_updatecache() assumes it will called only
once for a given cache entry, so it does not check if the cache entry
already has an associated mbuf chain. Normally this assumption is
correct, but under certain high-load situations, the following can
occur:
1) A request R is received by the server, say a write
with the 'filesync' option.
2) 64 other requests are received before R completes.
The cache entry for R gets flushed out.
3) A retransmit of R is received (R1); since the cache
entry is gone, a new cache entry is created, and
execution of R1 begins.
4) R completes, and a copy of the reply mbuf chain gets
saved by nfsrv_updatecache().
5) R1 completes. nfsrv_updatecache() replaces the cache
entry with a copy of R1's reply, but doesn't free
the original mbuf chain.
This problem was leaking 200-500 mbufs each day on our busiest
NFS server.
There is another leak in nfsrv_cleancache(), but I think this will
only be triggered if all the nfsd's get killed. The patch below
fixes both of these problems.
As commented in the patch, this problem only occurs when the cache
is too small, so it might be desirable to increase the cache size
(double it maybe) when nfsrv_updatecache() finds a RC_DONE entry.
Recording the event in the nfs stats structure and having a sysctl
to increase the cache size would be another option. I can submit
patches for either.
>How-To-Repeat:
1) Keep the disk on the NFS server busy - for example run
find /whatever -type f |xargs cat > /dev/null
2) Send an NFS filesync write request for a file 'file1'
3) Send 64 access requests for another file 'file2' each
with a different xid, to flush the cache.
4) Retransmit the NFS filesync write (with the same xid).
5) Wait for the two replies to the writes.
This should leak just a few mbufs, so it needs to be repeated a
number of times before the leak becomes apparent.
>Fix:
--- nfs_srvcache.c.orig Thu Dec 2 16:22:37 1999
+++ nfs_srvcache.c Thu Dec 2 16:49:44 1999
@@ -293,6 +293,19 @@
goto loop;
}
rp->rc_flag |= RC_LOCKED;
+ if (rp->rc_state == RC_DONE) {
+ /*
+ * This can happen if the cache is too small;
+ * retransmits of the same request aren't
+ * dropped so we see the operation completing
+ * more than once.
+ * XXX maybe increase desirednfsrvcache here.
+ */
+ if (rp->rc_flag & RC_REPMBUF) {
+ m_freem(rp->rc_reply);
+ rp->rc_flag &= ~RC_REPMBUF;
+ }
+ }
rp->rc_state = RC_DONE;
/*
* If we have a valid reply update status and save
@@ -332,6 +345,10 @@
nextrp = rp->rc_lru.tqe_next;
LIST_REMOVE(rp, rc_hash);
TAILQ_REMOVE(&nfsrvlruhead, rp, rc_lru);
+ if (rp->rc_flag & RC_REPMBUF)
+ m_freem(rp->rc_reply);
+ if (rp->rc_flag & RC_NAM)
+ free(rp->rc_nam, M_SONAME);
free(rp, M_NFSD);
}
numnfsrvcache = 0;
>Release-Note:
>Audit-Trail:
>Unformatted:
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199912021754.aa92503>
