FreeBSD Mail Archives

Date:      Sun, 22 Dec 2019 17:01:24 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Daniel Braniss <danny@cs.huji.ac.il>
Cc:        Adam McDougall <mcdouga9@egr.msu.edu>, "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   Re: nfs lockd errors after NetApp software upgrade.
Message-ID:  <YQBPR0101MB14276E7F9C127374C3E36952DD2F0@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <AE8F5D6B-E7DA-4AB9-B909-7D362A6A406B@cs.huji.ac.il>
References:  <EBC4AD74-EC62-4C67-AB93-1AA91F662AAC@cs.huji.ac.il> <YQBPR0101MB1427411AFE335E869B9CF022DD530@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM> <0121E289-D2AE-44BA-ADAC-4814CAEE676F@cs.huji.ac.il> <CAGfybS-3Rvs57=oGFEfii_9a=aWxPr6dEq1Y1LqHbLXK1ZKmXA@mail.gmail.com> <YQBPR0101MB1427F9BE658B9A46C7E08335DD520@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM> <854B6E5A-C6BC-44B3-A656-FC9B8EF19881@cs.huji.ac.il> <YQBPR0101MB1427F445F1F1EAF382E5131ADD520@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM> <8770BD0D-4B72-431A-B4F5-A29D4DBA03B1@cs.huji.ac.il> <b1182bbf-fd0b-a23d-1cc4-ddf9513bcb2e@egr.msu.edu> <YQBPR0101MB1427CE52BBA32A888443BFB4DD2D0@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM> <8A78F67B-C244-45CF-B9BF-D7062669B33B@cs.huji.ac.il> <YQBPR0101MB1427C9D4CF8918F10B6FD400DD2C0@YQBPR0101MB1427.CANPRD01.PROD.OUTLOOK.COM>, <AE8F5D6B-E7DA-4AB9-B909-7D362A6A406B@cs.huji.ac.il>

Daniel Braniss wrote:
>> On 21 Dec 2019, at 19:32, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>>
>> Daniel Braniss wrote:
>>>> On 20 Dec 2019, at 19:19, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>>>>
>>>> Adam McDougall wrote:
>>>>> Try changing bool_t do_tcp = FALSE; to TRUE in
>>>>> /usr/src/sys/nlm/nlm_prot_impl.c, recompile the kernel and try again. I
>>>>> think this makes it match Linux client behavior. I suspect I ran into
>>>>> the same issue as you. I do think I used nolockd as a workaround
>>>>> temporarily. I can provide some more details if it works.
>>>> If this fixes the problem, please let me know.
>>>>
>>>> I'm not sure I'd want to change the default, since it might break things for
>>>> others, but I can definitely make it a tunable, so that people don't need to
>>>> recompile a kernel to deal with it.
>>>>
>>> great! I was just about to see how it can be done (tunable) but need to check
>>> if it can be done at any time, or just at boot time.
>> I haven't looked at the code, but I suspect changing it on the fly could cause problems,
>> so I am inclined to make it a tunable (boot time only).
>my feelings too.
>>
>>> thanks.
>>> btw, currently, from several hours of analysing the traffic, it seems that nlm is UDP.
>> I assume that means you haven't tried flipping it to TCP yet.
>I will soon, but I have my doubts. The problem is caused by multiple events, i.e., it
>happened once while I was doing svn checkout, but I have done it several times since,
>and no issues. So it must be an aggregation of factors. Other hosts are reporting
>locks times too.
Well, I've noted the flawed protocol. Here's an example (from my limited
understanding of these protocols, where there has never been a published spec):
- The NLM supports a "blocking lock request" that goes something like this...
   - client requests lock and is willing to wait for it
   - if server has a conflicting lock on the file, it replies "I'll acquire the lock for
      you when I can and let you know".
     --> When the conflicting lock is released, the server acquires the lock and does
            a callback (server->client RPC) to tell the client it now has the lock.
You don't have to think about this for long to realize that any network unreliability
or partitioning could result in trouble.
The kernel RPC layer may do some retries of the RPCs (this is controlled by the
parameters set for the RPC), but at some point the protocol asks the NSM
(rpc.statd) if the machine is "up" and then uses the NSM's answer to deal with it.
(The NSM basically pokes other systems and notes they are "up" if they get
 replies to these pokes. It uses IP broadcast at some point.)
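
The blocking-lock exchange described above can be sketched as a toy simulation.
This is only an illustration under stated assumptions: the Server class, the
deliver_callback helper and the two clients "A" and "B" are hypothetical names,
there is no real RPC or NSM involved, and the "network" is a single boolean. It
shows how a lost server->client callback could leave the server believing that
a client holds the lock while that client is still waiting.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    """Toy lock server for a single file (not the real NLM wire protocol)."""
    holder: Optional[str] = None
    waiters: deque = field(default_factory=deque)

    def lock(self, client: str) -> str:
        """Grant the lock if free, otherwise queue the client as a waiter."""
        if self.holder is None:
            self.holder = client
            return "granted"
        self.waiters.append(client)  # "I'll acquire it for you and let you know"
        return "blocked"

    def unlock(self, client: str) -> Optional[str]:
        """Release the lock and hand it to the next waiter, if there is one."""
        assert self.holder == client
        self.holder = self.waiters.popleft() if self.waiters else None
        return self.holder


def deliver_callback(client_view: dict, client: str, network_up: bool) -> None:
    """Server->client 'you now have the lock' callback; lost if the network is down."""
    if network_up:
        client_view[client] = "held"


def simulate(network_up: bool) -> tuple:
    """Return (server's idea of the holder, client B's own idea of its state)."""
    server = Server()
    client_view = {"A": "none", "B": "none"}
    if server.lock("A") == "granted":
        client_view["A"] = "held"
    if server.lock("B") == "blocked":
        client_view["B"] = "waiting"
    granted_to = server.unlock("A")
    client_view["A"] = "none"
    if granted_to is not None:
        deliver_callback(client_view, granted_to, network_up)
    return server.holder, client_view["B"]


print(simulate(network_up=True))   # ('B', 'held')
print(simulate(network_up=False))  # ('B', 'waiting')
```

In the second run the two sides disagree: the server has recorded B as the
holder, but B never heard about it. The sketch leaves out RPC retries and the
NSM "is it up?" check entirely.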

Now, maybe switching to TCP will make the RPCs reliable enough that it will
work, or maybe it won't? (It certainly sounds like the Netapp upgrade is causing
some kind of network issue, and the NLM doesn't tolerate that well.)

rick

danny
=0A=
>
> Please let us know how it goes, rick
>
> danny
>
>
> rick
>
> On 12/19/19 9:21 AM, Daniel Braniss wrote:
>
>
> On 19 Dec 2019, at 16:09, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> Daniel Braniss wrote:
> [stuff snipped]
> all mounts are nfsv3/tcp
> This doesn't affect what the NLM code (rpc.lockd) uses. I honestly don't know when
> the NLM uses tcp vs udp. I think rpc.statd still uses IP broadcast at times.
> can the replay cache have any influence here? I tend to remember way back issues
> with it,
>
> To me, it looks like a network configuration issue.
> that was/is my gut feelings too, but, as far as we can tell, nothing has changed in the network infrastructure,
> the problems appeared after the NetAPP's software was updated, it was working fine till then.
>
> the problems are also happening on freebsd 12.1
>
> You could capture packets (maybe when a client first starts rpc.statd and rpc.lockd)
> and then look at them in wireshark. I'd disable startup of rpc.lockd and rpc.statd
> at boot for a test client and then run something like:
> # tcpdump -s 0 -w out.pcap host <netapp-host>
> - and then start rpc.statd and rpc.lockd
> Then I'd look at out.pcap in wireshark (much better at decoding this stuff than
> tcpdump). I'd look for things like different reply IP addresses from the Netapp,
> which might confuse this tired old NLM protocol Sun devised in the mid-1980s.
>
> it's going to be an interesting weekend :-(
>
> the error is also appearing on freebsd-11.2-stable, I'm now checking if it's also
> happening on 12.1
> btw, the NetApp version is 9.3P17
> Yes. I wasn't the author of the NSM and NLM code (long ago I refused to even
> try to implement it, because I knew the protocol was badly broken) and I avoid
> fiddling with it. As such, it won't have changed much since around FreeBSD 7.
> and we haven't had any issues with it for years, so you must have done something good
>
> cheers,
>     danny
>
>
> rick
>
> cheers,
>      danny
>
> rick
>
> Cheers
>
> Richard
> (NetApp admin)
>
> On Wed, 18 Dec 2019 at 15:46, Daniel Braniss <danny@cs.huji.ac.il> wrote:
>
>
> On 18 Dec 2019, at 16:55, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> Daniel Braniss wrote:
>
> Hi,
> The server with the problems is running FreeBSD 11.1 stable, it was working fine for several months,
> but after a software upgrade of our NetAPP server it's reporting many lockd errors and becomes catatonic,
> ...
> Dec 18 13:11:02 moo-09 kernel: nfs server fr-06:/web/www: lockd not responding
> Dec 18 13:11:45 moo-09 last message repeated 7 times
> Dec 18 13:12:55 moo-09 last message repeated 8 times
> Dec 18 13:13:10 moo-09 kernel: nfs server fr-06:/web/www: lockd is alive again
> Dec 18 13:13:10 moo-09 last message repeated 8 times
> Dec 18 13:13:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 194 already in queue awaiting acceptance (1 occurrences)
> Dec 18 13:14:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 193 already in queue awaiting acceptance (3957 occurrences)
> Dec 18 13:15:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 193 already in queue awaiting acceptance …
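
To see how long each lockd stall lasted, one could pair the "lockd not
responding" and "lockd is alive again" kernel messages. The following is a
rough sketch only: the function name lockd_outages is hypothetical, the regular
expression is fitted to the few lines quoted above rather than to any
documented syslog format, and the year is an assumption because syslog
timestamps do not carry one.

```python
import re
from datetime import datetime

# The two kernel messages quoted in the log excerpt, minus the "> " prefix.
LOG = """\
Dec 18 13:11:02 moo-09 kernel: nfs server fr-06:/web/www: lockd not responding
Dec 18 13:11:45 moo-09 last message repeated 7 times
Dec 18 13:12:55 moo-09 last message repeated 8 times
Dec 18 13:13:10 moo-09 kernel: nfs server fr-06:/web/www: lockd is alive again
"""

LINE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: nfs server (\S+): "
    r"lockd (not responding|is alive again)$"
)


def lockd_outages(text, year=2019):
    """Return (server:path, seconds) for each not-responding/alive-again pair."""
    down_since = {}
    outages = []
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skips "last message repeated" and unrelated lines
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        server_path, state = m.group(2), m.group(3)
        if state == "not responding":
            # Remember only the first report of an ongoing outage.
            down_since.setdefault(server_path, stamp)
        elif server_path in down_since:
            started = down_since.pop(server_path)
            outages.append((server_path, (stamp - started).total_seconds()))
    return outages


print(lockd_outages(LOG))  # [('fr-06:/web/www', 128.0)]
```

For the excerpt above this gives one stall of 128 seconds (13:11:02 to
13:13:10). It does not handle logs that span a year boundary.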
> Seems like their software upgrade didn't improve handling of NLM RPCs?
> Appears to be handling RPCs slowly and/or intermittently. Note that no one
> tests it with IPv6, so at least make sure you are still using IPv4 for the mounts and
> try and make sure IP broadcast works between client and Netapp. I think the NLM
> and NSM (rpc.statd) still use IP broadcast sometimes.
>
> we are ipv4 - we have our own class c :-)
> Maybe the network guys can suggest more w.r.t. why, but as I've stated before,
> the NLM is a fundamentally broken protocol which was never published by Sun,
> so I suggest you avoid using it if at all possible.
> well, at the moment the ball is in NetAPP's court, and switching to NFSv4 at the moment is out of the question, it's
> a production server used by several thousand students.
>
>
> - If the locks don't need to be seen by other clients, you can just use the "nolockd"
> mount option.
> or
> - If locks need to be seen by other clients, try NFSv4 mounts. Netapp filers
> should support NFSv4.1, which is a much better protocol than NFSv4.0.
>
> Good luck with it, rick
> thanks
>     danny
>
> …
> any ideas?
>
> thanks,
>    danny
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YQBPR0101MB14276E7F9C127374C3E36952DD2F0>
