Date: Sat, 20 Feb 2016 17:58:24 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: lev@FreeBSD.org
Cc: freebsd-fs@freebsd.org
Subject: Re: Panic in NFS client on CURRENT
Message-ID: <353969052.570755.1456009104365.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <56C84922.8050803@FreeBSD.org>
References: <56C752CD.4090203@FreeBSD.org> <1022369130.4303814.1455930123897.JavaMail.zimbra@uoguelph.ca> <56C84922.8050803@FreeBSD.org>
Lev wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On 20.02.2016 04:02, Rick Macklem wrote:
>
> >> Basically, I'm asking if there was a server reboot or nfsd thread
> >> restart or some kind of network partition that would separate
> >> some client(s) from the server. OR Panic occurred during normal
> >> operation.
> There was NO server reboot/restarts. MAYBE, this VM (where client
> runs) lost network connectivity for several seconds, but server itself
> was NOT stopped, restarted or rebooted.
>
Well, the stack trace you put in the PR showed a recovery from an expired
lease. This should only occur when the client is partitioned from the server
for more than a lease duration (120sec on FreeBSD). For a FreeBSD server, even
a 120+sec network partitioning won't cause an expired recovery unless a
conflicting open/lock request is made.
(A Linux server will reply NFS4ERR_EXPIRED as soon as the lease has expired
without a renewal, and Linux uses a lease of 60sec, so it is easier to
reproduce with a Linux NFSv4 server if you happen to have one.)
--> So I don't know why it would go into a lease expired recovery.
    (A network partitioning of a few seconds shouldn't do it.)

I think the only way to know what caused this would be to have a packet
capture that started before the problem occurred. (Maybe your network setup is
somehow directing some RPC messages to the wrong place or they're being
blocked by some firewall setup.)

If you have an NFSv4.0 mount you should see a Renew RPC about once per minute
(half a lease duration), which keeps the lease from expiring. For NFSv4.1, it
is an RPC with just a Sequence operation, which should have the same effect.

Reproducing this shouldn't be easy (which is a good thing;-). It has been a
while, but it should take something like:
- Network partition a client from the server for several minutes while it has
  a file open.
  (It might also need to have a byte range lock on the file. I can't remember
  for sure if just an open is sufficient.)
- Try and open the same file on another client (and maybe get a conflicting
  byte range lock).
--> This should result in a reply to the client of NFS4ERR_EXPIRED.

If you look at a packet trace in wireshark, it is a server reply of
NFS4ERR_EXPIRED that tells the client to go into this recovery cycle.

Unfortunately I am away from home until April, so I don't have access to
wireshark until then. (I will try and reproduce a NFSERR_EXPIRED failure with
the laptops I have with me, but I'm not sure if I can pull it off.)

Btw, this type of recovery isn't specified by the RFC and can only recover
opens and not byte range locks.

Fyi, the recovery from a server reboot (or reload of nfsd.ko in a FreeBSD
server) is specified by the RFC and can recover opens and locks. It starts
with the server replying NFSERR_STALECLIENTID or NFSERR_STALESTATEID.

Good luck with it, rick
ps: I'll email if I reproduce the NFSERR_EXPIRED and find any problem beyond
    the panic with fix already posted.
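The lease timing described above can be sketched as a small model. This is only an illustration of the reasoning in this message, not the FreeBSD or Linux server code: the `Server` class, `renew_interval` and `reply_after_partition` are hypothetical names, and the partition length is treated as the time since the last successful renewal, which is a simplification. The numeric value of NFS4ERR_EXPIRED is the one defined by the NFSv4 RFCs.

```python
from dataclasses import dataclass

NFS4_OK = 0
NFS4ERR_EXPIRED = 10011  # error code as defined in the NFSv4 RFCs


@dataclass(frozen=True)
class Server:
    """Hypothetical model of a server's lease policy (illustration only)."""
    lease_secs: int
    # True: an expired lease is only reported when a conflicting
    # open/lock request arrives (the FreeBSD behaviour described above).
    expire_only_on_conflict: bool


# Lease durations as stated in the message above.
FREEBSD = Server(lease_secs=120, expire_only_on_conflict=True)
LINUX = Server(lease_secs=60, expire_only_on_conflict=False)


def renew_interval(server: Server) -> float:
    """The client renews at about half a lease duration."""
    return server.lease_secs / 2


def reply_after_partition(server: Server, partition_secs: float,
                          conflicting_request: bool) -> int:
    """Status the client would see on its first RPC after the partition.

    Assumes no renewal got through during the partition.
    """
    if partition_secs <= server.lease_secs:
        return NFS4_OK  # lease never lapsed
    if server.expire_only_on_conflict and not conflicting_request:
        return NFS4_OK  # lapsed, but nobody else asked for the state
    return NFS4ERR_EXPIRED


if __name__ == "__main__":
    # A partition of a few seconds should not expire the lease.
    print(reply_after_partition(FREEBSD, 5, conflicting_request=True))
    # Several minutes plus a conflicting open from another client should.
    print(reply_after_partition(FREEBSD, 300, conflicting_request=True))
```

Under this model, a few seconds of lost connectivity against a FreeBSD server never yields NFS4ERR_EXPIRED, which is why a packet capture is needed to see what actually triggered the recovery.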
> - -- > // Lev Serebryakov > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2 > > iQJ7BAEBCgBmBQJWyEkiXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w > ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRGOTZEMUNBMEI1RjQzMThCNjc0QjMzMEFF > QUIwM0M1OEJGREM0NzhGAAoJEOqwPFi/3EePfCsP9AuK494J8cUft0MmvAly7yVw > iF0R0joxwttp9t6qydMjlQfmj6yoX+UACFWWRBZGgGrS8K7PcGSsGFl5s/Bt1ylL > lw3GDr7GsVNDOhG4ypwsiqI2Wq/PzFhBMUpuUq6A+kdqZVH1ApQFyDKrdWbvDQLx > 9Dm6vvW/fx6W1PgJp4i2B8zSf4vz7s91JyPMXnN9IQNG/1H9WERudzx/2kp1ws9y > wYCXVmsidMO9j0DQ4eVVSM2vSfc6VKgyjWhVeHguRXc5F3L5VGuoSXyzCkceC66r > t+8MDYhrsm00hrkZyTO6s1KcC8OKrgZBr9p0UIM1oMaqo02DyWp7KfM1nDMW9FI6 > IXsLaizPnnf7u+gGI2SllNXMaPvcREAxrnQDHKTdifKkpXrSroYYfJGmxAsRidmY > 8nwZ1bytGeSHlTYSq1XTJLCWsSoM/o0Vgl+bGXvajWFkFT/GRGb5akWUBZhkzo7n > TTpm0zrLuSvqWwRvqisoAuKW7QmCF2E0ei0E01TA3DDpF31dLOCApMq4t/UooT5h > w25dTRpc+WPUEwKXSzZ90kPHmmoRz7dn8y6Oeb681GtqoauMBgVUuWhI7+sobRBy > gcyIIpPB1Y0vteslzd5JDRUWcDUGg23fqRgax+J+motaNEXus2P6RxZTkq3DmOgO > qpvv/BwLVn++rVnTNWY= > =hihT > -----END PGP SIGNATURE----- >