From owner-freebsd-fs@freebsd.org Sun Jul 5 22:30:05 2015
From: Rick Macklem <rmacklem@uoguelph.ca>
Date: Sun, 5 Jul 2015 18:28:53 -0400 (EDT)
To: Ahmed Kamal
Cc: Julian Elischer, freebsd-fs@freebsd.org, Xin LI
Message-ID: <1463698530.4486572.1436135333962.JavaMail.zimbra@uoguelph.ca>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)

Ahmed Kamal wrote:
> Hi folks,
>
> Just a quick update. I did not test Xin's patches yet .. What I did so far
> is to increase the tcp highwater tunable and increase nfsd threads to 60.
> Today (a working day) I noticed I only got one bad sequence error message!
> Check this:
>
> # grep 'bad sequence' messages* | awk '{print $1 $2}' | uniq -c
>    1 messages:Jul5
>   39 messages.1:Jun28
>   15 messages.1:Jun29
>    4 messages.1:Jun30
>    9 messages.1:Jul1
>   23 messages.1:Jul2
>    1 messages.1:Jul4
>    1 messages.2:Jun28
>
> So there seems to be an improvement! Not sure if the Linux nfs4 client is
> able to somehow recover from those bad-sequence situations or not .. I did
> get some user complaints that running "ls -l" is sometimes slow and takes
> a couple of seconds to finish.
>
> One final question .. Do you folks think nfs4.1 is more reliable in
> general than nfs4 .. I've always only used nfs3 (I guess it can't work
> here with /home/* being separate zfs filesystems) ..
> So should I go through the pain of upgrading a few servers to RHEL-6 to
> try out nfs4.1? Basically do you expect the protocol to be more solid? I
> know it's a fluffy question, just give me your thoughts. Thanks a lot!
>
All I can say is that the "bad seqid" errors should not occur, since NFSv4.1
doesn't use the seqid#s to order RPCs. Also I would say that a correctly
implemented NFSv4.1 protocol should function "more correctly", since all RPCs
are performed "exactly once". (How much effect this will have in practice, I
can't say.) On the other hand, NFSv4.1 is a newer protocol (with an RFC of
over 500 pages), so it is hard to say how mature the implementations are. I
think only testing will give you the answer.

I would suggest that you test Xin Li's patch that allows the "seqid + 2" case
and see if that makes the "bad seqid" errors go away. (Even though I think
this would indicate a client bug, adding this in a way that it can be enabled
via a sysctl seems reasonable.)

Btw, I haven't seen any additional posts from nfsv4@ietf.org on this, rick

> On Fri, Jul 3, 2015 at 2:51 AM, Rick Macklem wrote:
>
> > Ahmed Kamal wrote:
> > > PS: Today (after adjusting tcp.highwater) I didn't get any screaming
> > > reports from users about hung vnc sessions. So maybe, just maybe,
> > > Linux clients are able to somehow recover from these bad sequence
> > > messages. I could still see the bad sequence error message in the
> > > logs, though.
> > >
> > > Why isn't the highwater tunable set to something better by default? I
> > > mean this server is certainly not under a high or unusual load (it's
> > > only 40 PCs mounting from it).
> > >
> > > On Fri, Jul 3, 2015 at 1:15 AM, Ahmed Kamal
> > > <email.ahmedkamal@googlemail.com> wrote:
> > >
> > > > Thanks all .. I understand now we're doing the "right thing" ..
> > > > Although if mounting keeps wedging, I will have to solve it somehow!
> > > > Either using Xin's patch .. or upgrading RHEL to 6.x and using
> > > > NFS4.1.
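Concretely, the highwater tunable and nfsd thread count discussed above can be inspected and raised along these lines. This is only a sketch: the sysctl and rc.conf names are what I believe the FreeBSD new NFS server uses, so verify them with `sysctl -a` and nfsd(8) on your release, and treat the values as illustrative rather than recommendations:

```shell
# Inspect the current TCP duplicate-request-cache (DRC) highwater mark.
sysctl vfs.nfsd.tcphighwater

# Raise it on the running system (example value, not a tuning recommendation).
sysctl vfs.nfsd.tcphighwater=100000

# Persist the setting across reboots.
echo 'vfs.nfsd.tcphighwater=100000' >> /etc/sysctl.conf

# More nfsd threads: nfsd(8) takes -n; set it in /etc/rc.conf, e.g.
#   nfs_server_flags="-u -t -n 60"
# and then restart the nfsd service.
```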
> > > >
> > > > Regarding Xin's patch, is it possible to build the patched nfsd code
> > > > as a kernel module? I'm looking to minimize my delta to upstream.
> > >
> > Yes, you can build the nfsd as a module. If your kernel config does not
> > include "options NFSD", the module will get loaded/used. It is also
> > possible to replace the module without rebooting, but you need to kill
> > off the nfsd daemons, then kldunload nfsd.ko and replace nfsd.ko with
> > the new one. (In /boot/.)
> >
> > > > Also would adopting Xin's patch and hiding it behind a
> > > > kern.nfs.allow_linux_broken_client be an option (I'm probably not
> > > > the last person on earth to hit this)?
> >
> > If it fixes your problem, I think this is reasonable.
> > I'm also hoping that someone who works on the Linux client reports
> > if/when this was changed.
> >
> > rick
> >
> > > > Thanks a lot for all the help!
> > > >
> > > > On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem wrote:
> > > >
> > > >> Ahmed Kamal wrote:
> > > >> > Appreciating the fruitful discussion! Can someone please explain
> > > >> > to me what would happen in the current situation (Linux client
> > > >> > doing this skip-by-1 thing, and FreeBSD not doing it)? What is
> > > >> > the effect of that?
> > > >> Well, as you've seen, the Linux client doesn't function correctly
> > > >> against the FreeBSD server (and probably others that don't support
> > > >> this "skip-by-1" case).
> > > >>
> > > >> > What do users see? Any chances of data loss?
> > > >> Hmm. Mostly it will cause Opens to fail, but I can't guess what the
> > > >> Linux client behaviour is after receiving NFS4ERR_BAD_SEQID. You're
> > > >> the guy observing it.
> > > >>
> > > >> > Also, I find it strange that Netapp have acknowledged this is a
> > > >> > bug on their side, which has been fixed since then!
> > > >> Yea, I think Netapp screwed up.
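The no-reboot module swap described earlier in this message would look something like the following sketch. The paths and service names are assumed (the default module directory is used here; adjust to your layout), it must run as root, the patched nfsd.ko should be built from the matching source tree, and it only applies when "options NFSD" is not compiled into the kernel:

```shell
# Stop the NFS server daemons so the module can be unloaded.
service nfsd stop

# Unload the old module and install the patched one.
kldunload nfsd
cp /path/to/patched/nfsd.ko /boot/kernel/nfsd.ko

# Load the new module and restart the daemons.
kldload nfsd
service nfsd start
```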
> > > >> For some reason their server allowed this, then was fixed to not
> > > >> allow it, and then someone decided that was broken and reversed it.
> > > >>
> > > >> > I also find it strange that I'm the first to hit this :) Is no
> > > >> > one running nfs4 yet!
> > > >>
> > > >> Well, it seems to be slowly catching on. I suspect that the Linux
> > > >> client mounting a Netapp is the most common use of it. Since it
> > > >> appears that they flip-flopped w.r.t. whose bug this is, it has
> > > >> probably persisted.
> > > >>
> > > >> It may turn out that the Linux client has been fixed, or it may
> > > >> turn out that most servers allowed this "skip-by-1", even though
> > > >> David Noveck (one of the main authors of the protocol) seems to
> > > >> agree with me that it should not be allowed.
> > > >>
> > > >> It is possible that others have bumped into this, but it wasn't
> > > >> isolated (I wouldn't have guessed it, so it was good you pointed
> > > >> to the RedHat discussion) and they worked around it by reverting
> > > >> to NFSv3 or similar.
> > > >> The protocol is rather complex in this area and changed completely
> > > >> for NFSv4.1, so many have probably also moved on to NFSv4.1, where
> > > >> this won't be an issue. (NFSv4.1 uses sessions to provide
> > > >> exactly-once RPC semantics and doesn't use these seqid fields.)
> > > >>
> > > >> This is all just mho, rick
> > > >>
> > > >> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem wrote:
> > > >> >
> > > >> > > Julian Elischer wrote:
> > > >> > > > On 7/2/15 9:09 AM, Rick Macklem wrote:
> > > >> > > > > I am going to post to nfsv4@ietf.org to see what they say.
> > > >> > > > > Please let me know if Xin Li's patch resolves your problem,
> > > >> > > > > even though I don't believe it is correct except for the
> > > >> > > > > UINT32_MAX case.
> > > >> > > > > Good luck with it, rick
> > > >> > > > and please keep us all in the loop as to what they say!
> > > >> > > >
> > > >> > > > The general N+2 bit sounds like bullshit to me.. it's always
> > > >> > > > N+1 in a number field that has a bit of slack at wrap time
> > > >> > > > (probably due to some ambiguity in the original spec).
> > > >> > > >
> > > >> > > Actually, since N is the lock op already done, N + 1 is the
> > > >> > > next lock operation in order. Since lock ops need to be
> > > >> > > strictly ordered, allowing N + 2 (which would mean N + 2 is
> > > >> > > done before N + 1) makes no sense.
> > > >> > >
> > > >> > > I think the author of the RFC meant that N + 2 or greater
> > > >> > > fails, but it was poorly worded.
> > > >> > >
> > > >> > > I will pass along whatever I get from nfsv4@ietf.org. (There
> > > >> > > is an archive of it somewhere, but I can't remember where. ;-)
> > > >> > >
> > > >> > > rick
> > > >> > > _______________________________________________
> > > >> > > freebsd-fs@freebsd.org mailing list
> > > >> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > > >> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
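The seqid ordering rule debated in the quoted thread above (the only valid seqid for the next lock op is N + 1, with UINT32_MAX wrapping to 0, the one case where Rick considers the patch correct) can be sketched as a small check. This is only an illustration of the rule, not the FreeBSD server's actual code:

```shell
# seqid_ok CURRENT RECEIVED -- succeed only if RECEIVED is the single valid
# successor of CURRENT: N + 1, wrapping so that UINT32_MAX is followed by 0.
seqid_ok() {
    cur=$1
    got=$2
    expected=$(( (cur + 1) % 4294967296 ))  # wrap at 2^32
    [ "$got" -eq "$expected" ]
}

seqid_ok 41 42        && echo "N+1 accepted"
seqid_ok 41 43        || echo "N+2 rejected (would draw NFS4ERR_BAD_SEQID)"
seqid_ok 4294967295 0 && echo "wrap from UINT32_MAX accepted"
```

Under this reading, a Linux client that skips to N + 2 is rejected by a strict server such as FreeBSD's, while the "skip-by-1" servers (Netapp, per the discussion) accept it.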