Date: Wed, 01 Jul 2015 16:49:02 -0700
From: Xin Li <delphij@delphij.net>
To: Ahmed Kamal <email.ahmedkamal@googlemail.com>, Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-fs@freebsd.org
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
Message-ID: <55947C6E.5060409@delphij.net>
In-Reply-To: <CANzjMX7xKBvnzJhQhB_ZrUnyE2m_FJXXy4fm_RFnuZfBDyDm2A@mail.gmail.com>
References: <CANzjMX45QaC8yZx2nHPAohJRvQjmUOHuhMQWP9nX%2BsrJs707Hg@mail.gmail.com> <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca> <CANzjMX7xKBvnzJhQhB_ZrUnyE2m_FJXXy4fm_RFnuZfBDyDm2A@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 07/01/15 16:44, Ahmed Kamal via freebsd-fs wrote:
> The not so great news is, after updating sysctl and rebooting the
> nas box, I still saw a few (NFS: v4 server nas returned a bad
> sequence-id error!) lines in logs. Users have already left, so I
> don't know how bad is it ..
>
> Could you share more info on what this error means? RedHat seems to
> think the client can skip-by-1 and choose larger IDs and that would
> be totally fine ? Also how serious is this error, would it cause
> NFS session stall like that ?

I wonder if this would help, which loosen the check:

Index: sys/fs/nfsserver/nfs_nfsdstate.c
===================================================================
- --- sys/fs/nfsserver/nfs_nfsdstate.c	(revision 285016)
+++ sys/fs/nfsserver/nfs_nfsdstate.c	(working copy)
@@ -3805,7 +3805,8 @@ nfsrv_checkseqid(struct nfsrv_descript *nd, u_int3
 		printf("refcnt=%d\n", stp->ls_op->rc_refcnt);
 		panic("nfsrvstate op refcnt");
 	}
- -	if ((stp->ls_seq + 1) == seqid) {
+	if ((stp->ls_seq + 1) == seqid ||
+	    (stp->ls_seq + 2) == seqid) {
 		if (stp->ls_op)
 			nfsrvd_derefcache(stp->ls_op);
 		stp->ls_op = op;

Personally I don't quite buy the skip-by-1 is Okay argument but it
seems that the RFC text can be interpreted that way.

Cheers,

> On Thu, Jul 2, 2015 at 1:36 AM, Rick Macklem <rmacklem@uoguelph.ca>
> wrote:
>
>> Ahmed Kamal wrote:
>>> Hi all,
>>>
>>> I'm a refugee from linux land. I just set up my first freebsd
>>> 10.1 zfs box, sharing /home over nfs. Since every home directory
>>> is its own zfs dataset, I chose to use nfsv4 to enable
>>> recursively sharing/mounting any directory under /home (I
>>> understand nfs4 is a must in this scenario!)
>>>
>>> I'm able to mount form linux (rhel5 latest kernel) successfully.
>>> Users are working fine. However every now and then a user screams
>>> that his session is frozen. Usually the processes are stuck in
>>> nfs_wait or rpc_* state. I tried using a much newer linux kernel
>>> (3.2 however it still faced the same problem). The errors in
>>> Linux log files are mostly:
>>>
>>> Jul 1 17:41:47 mammoth kernel: NFS: v4 server nas returned a *bad
>>> sequence-id error*!
>>> Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_locks: unhandled
>>> error -11. Zeroing state
>>> Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_open_state: Lock
>>> reclaim failed!
>>>
>> Btw, a client should only do "reclaim" operations after the
>> server has replied with NFS4ERR_STALE_CLIENTID or
>> NFS4ERR_STALE_STATEID. I am pretty certain that the FreeBSD NFSv4
>> server only generates these replies after it has rebooted, so
>> assuming the server didn't reboot, I have no idea why the client
>> would attempt these and am not surprised they failed.
>>
>> I'm guessing that the DRC constipation somehow caused the Linux
>> client to go into recovery mode?
>>
>> rick
>>
>>> My search led me to
>>> (https://access.redhat.com/solutions/1328073) a detailed
>>> analysis of the issue, which you can read over here
>>> https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf
>>> .. NetApp confirmed this was a bug for them (I'm wondering if
>>> this is still in FreeBSD?!)
>>>
>>> PS: Right before sending this, I saw dmesg on the freebsd box
>>> advising increasing vfs.nfsd.tcphighwater .. So I up'ed that to
>>> 64000. I also up'ed the number of nfs server threads (-t) from
>>> 10 to 60 (we're roughly 40 linux machines)
>>>
>>> Any advice is most appreciated!
>>>
>>> Thanks
>>> _______________________________________________
>>> freebsd-fs@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>> To unsubscribe, send any mail to
>>> "freebsd-fs-unsubscribe@freebsd.org"
>>>
>>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

- -- 
Xin LI <delphij@delphij.net>    https://www.delphij.net/
FreeBSD - The Power to Serve!           Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.1.5 (FreeBSD)

iQIcBAEBCgAGBQJVlHxuAAoJEJW2GBstM+nskvsP/ire8QyTfL6mF1njMNZwI/k5
AQ+BwWs5r8LzcRN/4v7/gelbS+lXnYVbVHMl8q6j+HzUzQ3yId4ZGlJWpJtHDNnj
+gV8kmFt/og1QTrQRbN81i4GEr914SlKWmo7LsxrWmEhAiKsN0sYsjELD/mH5BZX
1wRe3vTvyrMwm+6u1krqT8ZrxRANBFBmNqiFb8sag7B3oJQZsGhAyUSsJvUhb00o
ozwC2NT5y8Jv0QcZdC/wGeYc8FmRNQTAjE22WkzbsUey/e7FxL7vflCGgngYCIxE
zbZNW65xThZO8fti5MxiepJ27VPa5ocX0CQihBFYp5haG6fzWBGalV/ggAOwYL44
nz1caLhdKIj9JSd8QwLdTArq8+6H8Sx4jp4iGzQnppNo8PtG/AlHlw9uDKaUF4iw
H+tMb6qMu2FQJ9X+phtplzvjZxCbBbwY205GeTm5eElOkYzIyYvqIvZasos02ze0
v3SQXtpIHjrnndXMVNRJOkhYquGxVFxUm5IJ7o+0wrgVJp1V3cBKd4vs0o84Mgu5
EPGKCyt8x/B6ujCxkunODpNOb+sFyq6aqsDLAO6JSih5HfQntpxoZTjm8p4KjsG6
nPqXQXmi2NoOd6WPOunp7w/y+fKA4YdLAhPC7rbXQwpLL81UqNH141BrtscN0ovi
pyRlJ4r3Zs75qUwVSkzL
=3/OG
-----END PGP SIGNATURE-----
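
For readers following the patch in this message, here is a minimal standalone sketch of the comparison it loosens. It is not the FreeBSD code: `classify_seqid` and the `seqid_result` enum are hypothetical names made up for illustration, and the replay branch (seqid equal to the last one seen) is an assumption based on a reading of the NFSv4 seqid rules, since that part of nfsrv_checkseqid() is not shown in the diff. The point is only that the stock check accepts `last + 1`, the patch also accepts `last + 2`, and unsigned 32-bit arithmetic makes both wrap around cleanly.

```c
#include <stdint.h>

/* Hypothetical classification of an incoming seqid; not a kernel type. */
enum seqid_result {
	SEQID_NEXT,	/* last + 1: accepted by the stock check */
	SEQID_SKIPPED,	/* last + 2: accepted only with the patch */
	SEQID_REPLAY,	/* same as last: assumed to be a retransmit */
	SEQID_BAD	/* anything else: a "bad sequence-id" case */
};

/*
 * Compare the seqid sent by the client against the last one the
 * server recorded for this open/lock owner (ls_seq in the patch).
 * The casts keep the sums in 32 bits, so 0xffffffff + 1 wraps to 0.
 */
static enum seqid_result
classify_seqid(uint32_t last_seq, uint32_t seqid)
{
	if ((uint32_t)(last_seq + 1) == seqid)
		return (SEQID_NEXT);
	if ((uint32_t)(last_seq + 2) == seqid)
		return (SEQID_SKIPPED);
	if (last_seq == seqid)
		return (SEQID_REPLAY);
	return (SEQID_BAD);
}
```

With the patch applied, the first two results take the same "accept" path; without it, only SEQID_NEXT does, and a client that skipped one value would land in the error path.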