Date: Fri, 6 Oct 2023 17:31:56 -0700
From: Rick Macklem <rick.macklem@gmail.com>
To: J David <j.david.lists@gmail.com>
Cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: FreeBSD 13.2 NFS client mount hangs
Message-ID: <CAM5tNy7AK7Hq%2Bnnxftg=s5wk=YHw27F7EBydLX=D_z1KvFuD_Q@mail.gmail.com>
In-Reply-To: <CABXB=RT14gofYHkMMr8cj%2BTy2QRUgn6zunho4T2Kq2NxAWmuAQ@mail.gmail.com>
References: <CABXB=RRSHMhZQFL28eHKjhAYmU87qjpQ=B1=8VRSZoXat9=r5A@mail.gmail.com>
 <CAM5tNy4sqc18UCZF0vgL%2BXP6vF0wgt_3Yi07yY4wqeuzs6haMA@mail.gmail.com>
 <CABXB=RSUJ3mpYF5puAm0hSxeavozxyf7Ruab8mPrtBOu6bxM-w@mail.gmail.com>
 <CAM5tNy52x2s=9Os%2BPAa=-iz7F_o_4_9XxJbRAR28V1v9A4nN6A@mail.gmail.com>
 <CABXB=RT14gofYHkMMr8cj%2BTy2QRUgn6zunho4T2Kq2NxAWmuAQ@mail.gmail.com>

On Fri, Oct 6, 2023 at 10:48 AM J David <j.david.lists@gmail.com> wrote:
>
> On Mon, Oct 2, 2023 at 7:08 PM Rick Macklem <rick.macklem@gmail.com> wrote:
> > > The nfscbd daemon is not running on any of the clients.
> > >
> > > > If the Linux server still
> > > > issues delegations
> > >
> > > How would I determine that?
> > nfsstat -E -c
> > and then look at the number under "Delegs". It is a current count of
> > delegations, so if it remains 0 over time, no delegations are being issued.
>
> But if this is done from a client that is not running nfscbd, isn't it
> pretty well guaranteed to be zero?
>
> Checking all the clients I can find, "Deleg" is zero on all of them.
> On about half, "DelegRet" is nonzero but small (1-100), but I don't
> know what that is or if it's related.
>
> > I have attached a small patch which should make the NFS client handle
> > this error correctly.
>
> I will look for a way to try this patch, but the clients in this case
> are all managed with freebsd-update and don't have enough disk space
> to build a kernel locally, so it may be tricky.
>
> > > > # tcpdump -s 0 -w out.pcap host <nfs-server-name>
> > > > Let this run for a while and then pull out.pcap into wireshark and see what
> > > > traffic is going between the NFS client and server.
> > > > (Unlike tcpdump, wireshark does know how to decode NFS properly.)
> > >
> > > If/when the issue happens again, I will attempt to do this and report back.
>
> I am also working on getting access to Wireshark.
>
> In the interim, it did happen again, so the best I can do is put a
> little bit of tcpdump output here: https://pastebin.com/UDrphwr5 .
Maybe someone who is familiar with tcpdump output can look at this.
(I always use wireshark.)
It looks to me like the TCP checksum is failing on the client->server request,
but maybe I am not reading it correctly.
If the checksum is incorrect, then there is something badly broken in your
network fabric.

rick

>
> I can't vouch for "correct" but it does mostly seem to decode the NFS packets.
>
> It seems to loop the same couple of actions with long delays (15
> seconds) between retries:
>
> This sequence:
> +0.0000s: Client -> server xid 1205841201 getattr fh 0,7/2 ("Getattr"
> in packet body)
> +1.4106s: Client -> server xid 1205841202 getattr fh 0,5/2 ("Renew" in
> packet body)
> +0.0002s: Server -> client xid 1205841202 getattr LNK 12231267145 ids
> 1/53 sz 0 ("Renew" in packet body)
> +3.8001s: Server -> client xid 1205841201 getattr ERROR: Request
> couldn't be completed in time ("Getattr" in packet body)
>
> Repeats after 15 seconds:
> +15.0090s: client -> server 1205841203 getattr fh 0,7/2 ("Getattr" in
> packet body)
> ... etc
>
> The "fh 0,7/2" and "fh 0,5/2" seem to be consistent each time. The xid
> (transaction/request ID?) increments each time.
>
> Maybe that will provide a lucky flash of insight in the interim.
>
> Thanks!
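
The quoted packet sequence can be read as request/reply pairs matched by
xid. The following is only a rough sketch in Python, not something either
poster ran. It assumes each "+N s" figure is the delay since the previous
packet (the thread does not say which tcpdump timestamp mode was used), and
the EVENTS list and the reply_latencies helper are hypothetical names
introduced here for illustration.

```python
# Events transcribed by hand from the quoted tcpdump summary.
# ASSUMPTION: each delta is seconds since the previous packet,
# not an absolute timestamp.
EVENTS = [
    (0.0000, "request", 1205841201, "Getattr"),
    (1.4106, "request", 1205841202, "Renew"),
    (0.0002, "reply", 1205841202, "Renew"),
    (3.8001, "reply", 1205841201, "Getattr"),
]


def reply_latencies(events):
    """Return {xid: seconds between a request and its matching reply}."""
    clock = 0.0   # running time reconstructed from the deltas
    sent = {}     # xid -> time the request was seen
    latency = {}  # xid -> reply time minus request time
    for delta, kind, xid, _op in events:
        clock += delta
        if kind == "request":
            sent[xid] = clock
        elif xid in sent:
            latency[xid] = round(clock - sent[xid], 4)
    return latency


if __name__ == "__main__":
    for xid, secs in reply_latencies(EVENTS).items():
        print(f"xid {xid}: reply after {secs:.4f}s")
```

If the delta assumption holds, the Renew is answered almost immediately
while the Getattr error arrives a little over five seconds after the
request. If the figures are absolute offsets instead, the arithmetic above
does not apply.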