Date:      Mon, 22 Mar 2010 17:04:40 +0200
From:      Daniel Braniss <danny@cs.huji.ac.il>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        bug-followup@FreeBSD.org, freebsd-fs@FreeBSD.org, Kai Kockro <kkockro@web.de>
Subject:   Re: kern/144330: [nfs] mbuf leakage in nfsd with zfs 
Message-ID:  <E1NtjBJ-000AyL-B5@kabab.cs.huji.ac.il>
In-Reply-To: <Pine.GSO.4.63.1003220949490.11799@muncher.cs.uoguelph.ca> 
References:  <201003171120.o2HBK3CV082081@freefall.freebsd.org>  <20100317113953.GA14582@icarus.home.lan> <Pine.GSO.4.63.1003171844120.20254@muncher.cs.uoguelph.ca> <86tys9eqo6.fsf@kopusha.onet> <Pine.GSO.4.63.1003212018180.28991@muncher.cs.uoguelph.ca> <E1NtfW6-0008E7-9q@kabab.cs.huji.ac.il> <Pine.GSO.4.63.1003220949490.11799@muncher.cs.uoguelph.ca>

> 
> 
> On Mon, 22 Mar 2010, Daniel Braniss wrote:
> 
> >
> > well, it's much better!, but no cookies yet :-)
> >
> 
> Well, that's good news. I'll try and get dfr to review it and then
> commit it. Thanks Mikolaj, for finding this.
> 
> > from comparing graphs in
> > 	ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbuf-leak/
> > store-01-e.ps: a production server running newfsd - now up almost 20 days
> > 	notice that the average used mbuf is below 1000!
> >
> > store-02.ps: kernel without last patch, classic nfsd
> > 	the leak is huge.
> >
> > store-02++.ps: with latest patch
> > 	the leak is much smaller but I see 2 issues:
> > 		- the initial leap to over 2000, then a smaller leak.
> 
> The initial leap doesn't worry me. That's just a design constraint.
yes, but new-nfsd does it better.

> A slow leak after that is still a problem. (I might have seen the
> slow leak in testing here. I'll poke at it and see if I can reproduce
> that.)

all I do is mount udp on a client and start a write process.

> 
> >
> > could someone explain replay_prune() to me?
> >
> I just looked at it and I think it does the following:
>  	- when it thinks the cache is too big (either too many entries
>            or too much mbuf data) it loops around until:
>  		- no longer too much or can't free any more
>                  (when an entry is free'd, rc_size and rc_count are
>                   reduced)
>            (the loop is from the end of the tailq, so it is freeing
>             the least recently used entries)
>  	- the test for rce_repmsg.rm_xid != 0 avoids freeing ones
>            that are in progress, since rce_repmsg is all zeroed until
>            the reply has been generated

thanks for the information, it's what i thought, but the coding made it look
as if something else could happen - why else restart the search of the queue
after each match?
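
To make the description above concrete, here is a minimal user-space sketch of
that pruning loop. It is not the kernel code: the names (struct entry,
struct cache, cache_insert, cache_free, cache_prune, CACHE_MAX_ENTRIES) are
made up for illustration, a hand-rolled doubly linked list stands in for the
tailq, and the limit of 4 entries is arbitrary. One possible reading of the
restart is that the entry just freed can no longer be used to continue the
walk, so beginning again at the tail is the simplest safe choice; that is an
assumption, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_MAX_ENTRIES 4     /* hypothetical cap on the entry count */

struct entry {
    struct entry *prev, *next;  /* head = most recent, tail = least recent */
    uint32_t xid;               /* 0 means the reply is still in progress */
    size_t size;                /* bytes of reply data held by this entry */
};

struct cache {
    struct entry *head, *tail;
    int count;                  /* number of entries */
    size_t size, maxsize;       /* bytes held, and the byte limit */
};

/* True when there are too many entries or too much cached data. */
bool too_big(const struct cache *c)
{
    return c->count > CACHE_MAX_ENTRIES || c->size > c->maxsize;
}

/* Add a new entry at the head (the most recently used end). */
struct entry *cache_insert(struct cache *c, uint32_t xid, size_t size)
{
    struct entry *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    e->xid = xid;
    e->size = size;
    e->next = c->head;
    if (c->head != NULL)
        c->head->prev = e;
    else
        c->tail = e;
    c->head = e;
    c->count++;
    c->size += size;
    return e;
}

/* Unlink and free one entry, reducing the cache totals. */
void cache_free(struct cache *c, struct entry *e)
{
    if (e->prev != NULL) e->prev->next = e->next; else c->head = e->next;
    if (e->next != NULL) e->next->prev = e->prev; else c->tail = e->prev;
    c->count--;
    c->size -= e->size;
    free(e);
}

/* Free least recently used, completed entries until the cache fits. */
void cache_prune(struct cache *c)
{
    bool freed_one = true;

    while (freed_one && too_big(c)) {
        freed_one = false;
        /* Rescan from the tail on every pass; skip in-progress entries. */
        for (struct entry *e = c->tail; e != NULL; e = e->prev) {
            if (e->xid != 0) {
                cache_free(c, e);
                freed_one = true;
                break;          /* e is gone, so start over from the tail */
            }
        }
    }
}
```

The loop ends either when the cache is back under both limits or when a full
pass finds nothing it may free, which is the "can't free any more" case (for
example, when every remaining entry is still in progress).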

> I did notice that the call to replay_prune() from replay_setsize() does 
> not lock the mutex before calling it, so it doesn't look smp safe to me 
> for this case, but I doubt that would cause a slow leak. (I think this is
> only called when the number of mbuf clusters in the kernel changes and
> might cause a kernel crash if the tailq wasn't in a consistent state as
> it rattled through the list in the loop.)
> 
there seems to be an NFSLOCK involved before calling replay_setsize ...

well, the server is a 2 cpu quad nehalem, so maybe I should try several 
clients ...

> rick
> 
btw, the new-nfsd has been running on a production server for almost 20 days
and all seems fine.

anyways, things are looking better,
cheers,
	danny




