Date: Thu, 23 Jul 2015 17:53:03 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Graham Allan <allan@physics.umn.edu>
Cc: Ahmed Kamal <email.ahmedkamal@googlemail.com>, Ahmed Kamal via freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
Message-ID: <1935759160.2320694.1437688383362.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55B12EB7.6030607@physics.umn.edu>
References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca>
 <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca>
 <CANzjMX427XNQJ1o6Wh2CVy1LF1ivspGcfNeRCmv%2BOyApK2UhJg@mail.gmail.com>
 <CANzjMX5xyUz6OkMKS4O-MrV2w58YT9ricOPLJWVtAR5Ci-LMew@mail.gmail.com>
 <20150716235022.GF32479@physics.umn.edu>
 <184170291.10949389.1437161519387.JavaMail.zimbra@uoguelph.ca>
 <CANzjMX4NmxBErtEu=e5yEGJ6gAJBF4_ar_aPdNDO2-tUcePqTQ@mail.gmail.com>
 <55B12EB7.6030607@physics.umn.edu>

Graham Allan wrote:
> For our part, the user whose code triggered the pathological behaviour
> on SL5 reran it on SL6 without incident - I still see lots of
> sequence-id errors in the logs, but nothing bad happened.
>
> I'd still like to ask them to rerun again on SL5 to see if the "accept
> skipped seqid" patch had any effect, though I think we expect not. Maybe
> it would be nice if I could get set up to capture rolling tcpdumps of
> the nfs traffic before they run that though...
>
> Graham
>
> On 7/20/2015 10:26 PM, Ahmed Kamal wrote:
> > Hi folks,
> >
> > I've upgraded a test client to rhel6 today, and I'll keep an eye on it
> > to see what happens.
> >
> > During the process, I made the (I guess mistake) of zfs send | recv to a
> > locally attached usb disk for backup purposes .. long story short,
> > sharenfs property on the received filesystem was causing some nfs/mountd
> > errors in logs .. I wasn't too happy with what I got .. I destroyed the
> > backup datasets and the whole pool eventually .. and then rebooted the
> > whole nas box .. After reboot my logs are still flooded with
> >
> > Jul 21 05:12:36 nas kernel: nfsrv_cache_session: no session
> > Jul 21 05:13:07 nas last message repeated 7536 times
> > Jul 21 05:15:08 nas last message repeated 29664 times
> >
> > Not sure what that means .. or how it can be stopped .. Anyway, will
> > keep you posted on progress.
>
Oh, I didn't see the part about "reboot" before. Unfortunately, it sounds
like the client isn't recovering after the session is lost. When the server
reboots, the client(s) will get NFS4ERR_BAD_SESSION errors back because the
server reboot has deleted all sessions. The NFS4ERR_BAD_SESSION should
trigger state recovery on the client. (It doesn't sound like the clients
went into recovery, starting with a Create_session operation, but without
a packet trace, I can't be sure?)

rick

>
> --
> -------------------------------------------------------------------------
> Graham Allan - gta@umn.edu - allan@physics.umn.edu
> School of Physics and Astronomy - University of Minnesota
> -------------------------------------------------------------------------
>
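
The recovery sequence described in the reply above (server reboot deletes all sessions, the client sees NFS4ERR_BAD_SESSION, the client starts recovery with a Create_session) can be illustrated with a toy model. This is only a sketch: `ToyServer`, `ToyClient`, and the `recovers` flag are hypothetical names invented for illustration, not FreeBSD or Linux code, and real recovery involves more steps (such as re-establishing the client ID and reclaiming opens and locks) that are left out here.

```python
# Toy model only: it mimics the session bookkeeping described in the
# message, not the actual FreeBSD server or Linux client implementation.

class ToyServer:
    def __init__(self):
        self.sessions = set()
        self.next_id = 1  # kept increasing so old session ids never match

    def create_session(self):
        sid = self.next_id
        self.next_id += 1
        self.sessions.add(sid)
        return sid

    def reboot(self):
        # A reboot discards every session the server knew about.
        self.sessions.clear()

    def sequence(self, sid):
        # Each request names a session; an unknown one is rejected.
        # The error name is spelled as in the message above.
        return "NFS4_OK" if sid in self.sessions else "NFS4ERR_BAD_SESSION"


class ToyClient:
    def __init__(self, server, recovers=True):
        self.server = server
        self.recovers = recovers  # hypothetical switch: does recovery run?
        self.sid = server.create_session()
        self.trace = ["CREATE_SESSION"]

    def call(self):
        status = self.server.sequence(self.sid)
        self.trace.append("SEQUENCE -> " + status)
        if status == "NFS4ERR_BAD_SESSION" and self.recovers:
            # Recovery starts by creating a fresh session, then retries.
            self.sid = self.server.create_session()
            self.trace.append("CREATE_SESSION")
            status = self.server.sequence(self.sid)
            self.trace.append("SEQUENCE -> " + status)
        return status


if __name__ == "__main__":
    server = ToyServer()
    good = ToyClient(server, recovers=True)
    stuck = ToyClient(server, recovers=False)
    server.reboot()
    print("recovering client:", good.call())
    print("stuck client:     ", stuck.call(), stuck.call())
    print("\n".join(good.trace))
```

In this model a client that never issues a new Create_session keeps presenting the stale session and gets the same error on every call, which is the behaviour the reply suspects. A packet trace would show whether the real clients ever send that Create_session.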