FreeBSD Mail Archives

Date:      Thu, 4 Mar 2010 18:43:10 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Daniel Braniss <danny@cs.huji.ac.il>
Cc:        stable@freebsd.org, freebsd-fs@freebsd.org, Willem Jan Withagen <wjw@digiware.nl>, Eirik Øverby <ltning@anduin.net>, rwatson@freebsd.org, Jeremy Chadwick <freebsd@jdc.parodius.com>
Subject:   Re: mbuf leakage with nfs/udp (was mbuf leakage with nfs/zfs?) 
Message-ID:  <Pine.GSO.4.63.1003041836010.5966@muncher.cs.uoguelph.ca>
In-Reply-To: <E1Nn55u-000Ik6-9C@kabab.cs.huji.ac.il>
References:  <20100226174021.8feadad9.gerrit@pmp.uni-hannover.de>  <E1Nl6VA-000557-D9@kabab.cs.huji.ac.il> <20100226224320.8c4259bf.gerrit@pmp.uni-hannover.de> <4B884757.9040001@digiware.nl> <20100227080220.ac6a2e4d.gerrit@pmp.uni-hannover.de> <4B892918.4080701@digiware.nl> <20100227202105.f31cbef7.gerrit@pmp.uni-hannover.de> <20100227193819.GA60576@icarus.home.lan> <BD8AC9F6-DF96-41F9-8E92-48A4E5606DC7@anduin.net> <4B89943C.70704@digiware.nl> <20100227220310.GA65110@icarus.home.lan> <Pine.GSO.4.63.1003011703100.26054@muncher.cs.uoguelph.ca> <E1NmPHy-0009jy-Dj@kabab.cs.huji.ac.il> <Pine.GSO.4.63.1003021947470.3879@muncher.cs.uoguelph.ca> <E1NmkOe-000PSY-JT@kabab.cs.huji.ac.il> <Pine.GSO.4.63.1003031936170.29530@muncher.cs.uoguelph.ca> <E1Nn55u-000Ik6-9C@kabab.cs.huji.ac.il>




On Thu, 4 Mar 2010, Daniel Braniss wrote:

>
> correct. The interesting side effect is that I can't see any negative
> issues when disabling the cache.

If the client retries a non-idempotent RPC, the server will do it
again, which can result in data corruption. This is likely to happen
infrequently, but with potentially nasty results. (The paper that
describes this was given at a late 1980s Usenix by Chet J. His name
is in a comment somewhere, I think. I won't dare to try and spell it.:-)

> seems ok, I have been running it now on a semi-production server and
> it's holding up quite nicely; the cache seems not up to expectations:
>
> store-mg-03# nfsstat -se
> Server Info:
>  Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> 48176764    262687  12582599     19732   4225907   9186574    780793    818837
>   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>     7623       160     27753     59551     59552    118216         0   1992779
>    Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
>        0    979005        19         0   1644267         0         0         0
>     Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
>        0         0         0         0         0         0         0         0
>    LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
>        0         0         0         0         0         0         0         0
>    Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
>        0         0         0         0         0         0
> Server:
> Retfailed    Faults   Clients
>        0         0         0
> OpenOwner     Opens LockOwner     Locks    Delegs
>        0         0         0         0         0
> Server Cache Stats:
>   Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
>      307         0       297  80943198         0         0
>
If you are referring to the high miss rate, that is normal and to be
expected. It's the 297 Non-idempotent hits that could have caused data
corruption without the cache. When there is a hit, the RPC reply comes
from the cache, so that the RPC isn't performed again on the server.
(Some/many of these are not harmful. For example, a retried Remove
simply fails with ENOENT, but others...)
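
To make the hit/miss behaviour above concrete, here is a minimal sketch in Python. It is a toy model, not the FreeBSD server code: the class ToyServer, the (client, xid) cache key, and the helper send_with_retry are all hypothetical names chosen for illustration. It only shows the idea that a retried request is answered from the cache instead of being performed a second time.

```python
import errno


class ToyServer:
    """Hypothetical stand-in for an NFS server; not FreeBSD code."""

    def __init__(self, use_cache):
        self.files = {"a.txt"}      # toy "file system": a set of names
        self.use_cache = use_cache
        self.cache = {}             # assumed key: (client, xid) -> saved reply
        self.non_idem_hits = 0
        self.misses = 0

    def remove(self, client, xid, name):
        key = (client, xid)
        if self.use_cache:
            if key in self.cache:
                # Retry of a request already performed: replay the saved
                # reply instead of doing the Remove again.
                self.non_idem_hits += 1
                return self.cache[key]
            self.misses += 1

        # Actually perform the (non-idempotent) operation.
        if name in self.files:
            self.files.remove(name)
            reply = 0
        else:
            reply = errno.ENOENT

        if self.use_cache:
            self.cache[key] = reply
        return reply


def send_with_retry(server, client, xid, name):
    """Simulate a lost reply: the same request is sent twice with the
    same xid, and only the reply to the retry reaches the client."""
    server.remove(client, xid, name)         # first reply is lost
    return server.remove(client, xid, name)  # retry


without_cache = send_with_retry(ToyServer(use_cache=False), "c1", 1, "a.txt")
cached_server = ToyServer(use_cache=True)
with_cache = send_with_retry(cached_server, "c1", 1, "a.txt")
```

In this toy, the uncached server reports ENOENT to the client even though the Remove succeeded the first time, while the cached server replays the original success reply and counts one non-idempotent hit. Every first-time request counts as a miss, which is why a very high miss count is the normal case.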

Glad to hear that the experimental server is working ok for you, rick

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.GSO.4.63.1003041836010.5966>
