Date: Fri, 17 Jul 2015 15:31:59 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Graham Allan <allan@physics.umn.edu>
Cc: Ahmed Kamal via freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
Message-ID: <184170291.10949389.1437161519387.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <20150716235022.GF32479@physics.umn.edu>
References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca>
 <5594B008.10202@freebsd.org>
 <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca>
 <CANzjMX5eN1FsnHMf6KGZe_b3vwxxF=dy3fJUHxeGO4BXuNzfPA@mail.gmail.com>
 <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca>
 <CANzjMX427XNQJ1o6Wh2CVy1LF1ivspGcfNeRCmv%2BOyApK2UhJg@mail.gmail.com>
 <CANzjMX5xyUz6OkMKS4O-MrV2w58YT9ricOPLJWVtAR5Ci-LMew@mail.gmail.com>
 <20150716235022.GF32479@physics.umn.edu>
Graham Allan wrote:
> I'm curious how things are going for you with this?
>
> Reading your thread did pique my interest since we have a lot of
> Scientific Linux (RHEL clone) boxes with FreeBSD NFSv4 servers. I meant
> to glance through our logs for signs of the same issue, but today I
> started investigating a machine which appeared to have hung processes,
> high rpciod load, and high traffic to the NFS server. Of course it is
> exactly this issue.
>
> The affected machine is running SL5 though most of our server nodes are
> now SL6. I can see errors from most of them but the SL6 systems appear
> less affected - I see a stream of the sequence-id errors in their logs but
> things in general keep working. The one SL5 machine I'm looking at
> has a single sequence-id error in today's logs, but then goes into a
> stream of "state recovery failed" then "Lock reclaim failed". It's
> probably partly related to the particular workload on this machine.
>
> I would try switching our SL6 machines to NFS 4.1 to see if the
> behaviour changes, but 4.1 isn't supported by our 9.3 servers (is it in
> 10.1?).
>
Btw, I've done some testing against a fairly recent Fedora and haven't seen
the problem. If either of you guys could load a recent Fedora on a test client
box, it would be interesting to see if it suffers from this. (My experience is
that the Fedora distros have more up to date Linux NFS clients.)

rick

> At the NFS servers, most of the sysctl settings are already tuned
> from defaults. eg tcp.highwater=100000, vfs.nfsd.tcpcachetimeo=300,
> 128-256 nfs kernel threads.
>
> Graham
>
> On Fri, Jul 03, 2015 at 01:21:00AM +0200, Ahmed Kamal via freebsd-fs wrote:
> > PS: Today (after adjusting tcp.highwater) I didn't get any screaming
> > reports from users about hung vnc sessions. So maybe just maybe, linux
> > clients are able to somehow recover from this bad sequence messages. I
> > could still see the bad sequence error message in logs though
> >
> > Why isn't the highwater tunable set to something better by default ? I mean
> > this server is certainly not under a high or unusual load (it's only 40 PCs
> > mounting from it)
> >
> > On Fri, Jul 3, 2015 at 1:15 AM, Ahmed Kamal <email.ahmedkamal@googlemail.com> wrote:
> >
> > > Thanks all .. I understand now we're doing the "right thing" .. Although
> > > if mounting keeps wedging, I will have to solve it somehow! Either using
> > > Xin's patch .. or Upgrading RHEL to 6.x and using NFS4.1.
> > >
> > > Regarding Xin's patch, is it possible to build the patched nfsd code, as a
> > > kernel module ? I'm looking to minimize my delta to upstream.
> > >
> > > Also would adopting Xin's patch and hiding it behind a
> > > kern.nfs.allow_linux_broken_client be an option (I'm probably not the last
> > > person on earth to hit this) ?
> > >
> > > Thanks a lot for all the help!
> > >
> > > On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> > >
> > >> Ahmed Kamal wrote:
> > >> > Appreciating the fruitful discussion! Can someone please explain to me,
> > >> > what would happen in the current situation (linux client doing this
> > >> > skip-by-1 thing, and freebsd not doing it) ? What is the effect of that?
> > >> Well, as you've seen, the Linux client doesn't function correctly against
> > >> the FreeBSD server (and probably others that don't support this
> > >> "skip-by-1" case).
> > >>
> > >> > What do users see? Any chances of data loss?
> > >> Hmm. Mostly it will cause Opens to fail, but I can't guess what the Linux
> > >> client behaviour is after receiving NFS4ERR_BAD_SEQID. You're the guy
> > >> observing it.
> > >>
> > >> >
> > >> > Also, I find it strange that netapp have acknowledged this is a bug on
> > >> > their side, which has been fixed since then!
> > >> Yea, I think Netapp screwed up. For some reason their server allowed this,
> > >> then was fixed to not allow it and then someone decided that was broken
> > >> and reversed it.
> > >>
> > >> > I also find it strange that I'm the first to hit this :) Is no one
> > >> > running nfs4 yet!
> > >> >
> > >> Well, it seems to be slowly catching on. I suspect that the Linux client
> > >> mounting a Netapp is the most common use of it. Since it appears that they
> > >> flip flopped w.r.t. who's bug this is, it has probably persisted.
> > >>
> > >> It may turn out that the Linux client has been fixed or it may turn out
> > >> that most servers allowed this "skip-by-1" even though David Noveck (one
> > >> of the main authors of the protocol) seems to agree with me that it should
> > >> not be allowed.
> > >>
> > >> It is possible that others have bumped into this, but it wasn't isolated
> > >> (I wouldn't have guessed it, so it was good you pointed to the RedHat
> > >> discussion)
> > >> and they worked around it by reverting to NFSv3 or similar.
> > >> The protocol is rather complex in this area and changed completely for
> > >> NFSv4.1, so many have also probably moved onto NFSv4.1 where this won't
> > >> be an issue.
> > >> (NFSv4.1 uses sessions to provide exactly once RPC semantics and doesn't
> > >> use these seqid fields.)
> > >>
> > >> This is all just mho, rick
> > >>
> > >> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> > >> >
> > >> > > Julian Elischer wrote:
> > >> > > > On 7/2/15 9:09 AM, Rick Macklem wrote:
> > >> > > > > I am going to post to nfsv4@ietf.org to see what they say. Please
> > >> > > > > let me know if Xin Li's patch resolves your problem, even though I
> > >> > > > > don't believe it is correct except for the UINT32_MAX case. Good
> > >> > > > > luck with it, rick
> > >> > > > and please keep us all in the loop as to what they say!
> > >> > > >
> > >> > > > the general N+2 bit sounds like bullshit to me.. its always N+1 in a
> > >> > > > number field that has a
> > >> > > > bit of slack at wrap time (probably due to some ambiguity in the
> > >> > > > original spec).
> > >> > > >
> > >> > > Actually, since N is the lock op already done, N + 1 is the next lock
> > >> > > operation in order. Since lock ops need to be strictly ordered, allowing
> > >> > > N + 2 (which means N + 2 would be done before N + 1) makes no sense.
> > >> > >
> > >> > > I think the author of the RFC meant that N + 2 or greater fails, but it
> > >> > > was poorly worded.
> > >> > >
> > >> > > I will pass along whatever I get from nfsv4@ietf.org. (There is an archive
> > >> > > of it somewhere, but I can't remember where.;-)
> > >> > >
> > >> > > rick
> > >> > > _______________________________________________
> > >> > > freebsd-fs@freebsd.org mailing list
> > >> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > >> > > To unsubscribe, send any mail to
> > >> > > "freebsd-fs-unsubscribe@freebsd.org"
> > >> > >
> > >> >
> > >>
> > >
> >
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> --
> -------------------------------------------------------------------------
> Graham Allan - allan@physics.umn.edu - gta@umn.edu - (612) 624-5040
> School of Physics and Astronomy - University of Minnesota
> -------------------------------------------------------------------------
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
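
For anyone following the seqid discussion quoted in this thread, the ordering
rule Rick describes (after lock op N, only N + 1 is acceptable as the next
seqid, and N + 2 or anything else is rejected) can be sketched in C roughly as
below. This is only an illustration, not the FreeBSD nfsd code. The enum, the
function name check_seqid, and the handling of a repeated seqid as a replay
(which follows the NFSv4.0 RFC's description rather than anything stated in
this thread) are assumptions. So is the wrap behaviour: the sketch assumes
plain 32-bit wraparound where 0 follows UINT32_MAX, which the thread does not
spell out.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch; not the names used by nfsd. */
enum seqid_result {
	SEQID_NEXT,	/* the expected next operation for this owner */
	SEQID_REPLAY,	/* retransmission of the last operation */
	SEQID_BAD	/* out of order; the NFS4ERR_BAD_SEQID case */
};

/*
 * Classify a client-supplied seqid against the last seqid the server
 * processed for the same open/lock owner.
 *
 * Assumption: unsigned 32-bit wraparound, so when last == UINT32_MAX
 * the expected next value is 0.
 */
static enum seqid_result
check_seqid(uint32_t last, uint32_t got)
{
	if (got == (uint32_t)(last + 1))
		return (SEQID_NEXT);
	if (got == last)
		return (SEQID_REPLAY);
	/* Covers "skip-by-1" (last + 2) as well as stale seqids. */
	return (SEQID_BAD);
}
```

With last = 5, a request carrying seqid 6 is accepted and 5 is treated as a
retransmission. Seqid 7, the "skip-by-1" the Linux client is described as
sending, is rejected, and that rejection is what the client sees as
NFS4ERR_BAD_SEQID.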