Date:      Sat, 10 Apr 2021 15:56:33 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>, "tuexen@freebsd.org" <tuexen@freebsd.org>
Cc:        Youssef GHORBAL <youssef.ghorbal@pasteur.fr>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: NFS Mount Hangs
Message-ID:  <YQXPR0101MB096894FBD385DB9A42C1399FDD729@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <SN4PR0601MB37287855390FB8A989381CFE86729@SN4PR0601MB3728.namprd06.prod.outlook.com>
References:  <C643BB9C-6B61-4DAC-8CF9-CE04EA7292D0@tildenparkcapital.com> <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <D67AF317-D238-4EC0-8C7F-22D54AD5144C@pasteur.fr> <YQXPR0101MB09684AB7BEFA911213604467DD669@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <C87066D3-BBF1-44E1-8398-E4EB6903B0F2@tildenparkcapital.com> <8E745920-1092-4312-B251-B49D11FE8028@pasteur.fr> <YQXPR0101MB0968C44C7C82A3EB64F384D0DD7B9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <DEF8564D-0FE9-4C2C-9F3B-9BCDD423377C@freebsd.org> <YQXPR0101MB0968E0A17D8BCACFAF132225DD7A9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <SN4PR0601MB3728E392BCA494EAD49605FE86789@SN4PR0601MB3728.namprd06.prod.outlook.com> <YQXPR0101MB09686B4F921B96DCAFEBF874DD789@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <765CE1CD-6AAB-4BEF-97C6-C2A1F0FF4AC5@freebsd.org> <YQXPR0101MB096876B44F33BAD8991B62C8DD789@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <2B189169-C0C9-4DE6-A01A-BE916F10BABA@freebsd.org> <YQXPR0101MB09688645194907BBAA6E7C7ADD789@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <BF5D23D3-5DBD-4E29-9C6B-F4CCDC205353@freebsd.org> <YQXPR0101MB096826445C85921C8F6410A2DD779@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <E4A51EAD-8F9A-49BB-8852-F9D61BDD9EA4@freebsd.org> <YQXPR0101MB09682F230F25FBF3BC427135DD729@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <SN4PR0601MB3728AF2554FDDFB4EEF2C95B86729@SN4PR0601MB3728.namprd06.prod.outlook.com>, <077ECE2B-A84C-440D-AAAB-00293C841F14@freebsd.org>, <SN4PR0601MB37287855390FB8A989381CFE86729@SN4PR0601MB3728.namprd06.prod.outlook.com>

Scheffenegger, Richard <Richard.Scheffenegger@netapp.com> wrote:
>>Rick wrote:
>> Hi Rick,
>>
>>> Well, I have some good news and some bad news (the bad is mostly for Richard).
>>>
>>> The only message logged is:
>>> tcpflags 0x4<RST>; tcp_do_segment: Timestamp missing, segment processed normally
>>>
Btw, I did get one additional message during further testing (with r367492 reverted):
 tcpflags 0x4<RST>; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted
   by remote endpoint

This only happened once in several test cycles.

>>> But...the RST battle no longer occurs. Just one RST that works and then the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>>
>>> So, what is different?
>>>
>>> r367492 is reverted from the FreeBSD server.
>>> I did the revert because I think it might be what is causing otis@'s hang. (In his case, the Recv-Q grows on the socket for the stuck Linux client, while others work.)
>>>
>>> Why does reverting fix this?
>>> My only guess is that the krpc gets the upcall right away and sees an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
This was bogus and incorrect. The diagnostic printf() I saw was generated for the
back channel, and that would have occurred after the socket was shut down.

>>
>> With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?
If Send-Q is 0 when the network is partitioned, after healing, the krpc sees no activity on
the socket (until it acquires/processes an RPC it will not do a sosend()).
Without the 6 minute timeout, the RST battle goes on "forever" (I've never actually
waited more than 30 minutes, which is close enough to "forever" for me).
--> With the 6 minute timeout, the "battle" stops after 6 minutes, when the timeout
      causes a soshutdown(..SHUT_WR) on the socket.
      (Since the soshutdown() patch is not yet in "main" (I got comments, but no "reviewed"
       on it), the 6 minute timer won't help if enabled in main. The soclose() won't happen
       for TCP connections with the back channel enabled, such as Linux NFSv4.1/4.2 ones.)

If Send-Q is non-empty when the network is partitioned, the battle will not happen.

>
>My understanding is that he needs this error indication when calling shutdown().
There are several ways the krpc notices that a TCP connection is no longer functional.
- An error return like EPIPE from either sosend() or soreceive().
- A return of 0 from soreceive() with no data (normal EOF from other end).
- A 6 minute timeout on the server end, when no activity has occurred on the
  connection. This timer is currently disabled for NFSv4.1/4.2 mounts in "main",
  but I enabled it for this testing, to stop the "RST battle goes on forever"
  during testing. I am thinking of enabling it on "main", but this crude bandaid
  shouldn't be thought of as a "fix for the RST battle".
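
To make the three detection paths in the list above concrete, here is a rough
userland sketch in C. It is not krpc code: classify_conn(), enum conn_verdict
and IDLE_LIMIT_SECS are hypothetical names made up for illustration, and it
assumes a non-blocking read()/recv() style interface rather than the kernel's
soreceive()/sosend().

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical names throughout; this only mirrors the list above. */
enum conn_verdict {
	CONN_ALIVE,		/* data arrived, or nothing conclusive yet */
	CONN_ERROR,		/* hard error such as EPIPE or ECONNRESET */
	CONN_EOF,		/* read returned 0: normal EOF from the other end */
	CONN_IDLE_TIMEOUT	/* no activity for IDLE_LIMIT_SECS */
};

#define IDLE_LIMIT_SECS	(6 * 60)	/* the 6 minute timer (assumed value) */

/*
 * nread:     return value of the last read attempt on the socket
 * err:       errno captured right after that attempt (ignored if nread >= 0)
 * idle_secs: seconds since the connection last saw any activity
 */
enum conn_verdict
classify_conn(ssize_t nread, int err, long idle_secs)
{
	if (nread > 0)
		return (CONN_ALIVE);
	if (nread == 0)
		return (CONN_EOF);
	/* nread < 0: "no data yet" style errors are not fatal by themselves. */
	if (err != EAGAIN && err != EWOULDBLOCK && err != EINTR)
		return (CONN_ERROR);
	/* Nothing readable and no hard error: only the idle timer can fire. */
	if (idle_secs >= IDLE_LIMIT_SECS)
		return (CONN_IDLE_TIMEOUT);
	return (CONN_ALIVE);
}
```

In this simplified model, a connection that never reports an error or EOF
(roughly the "Send-Q is 0 after the partition heals" case) is only ever
declared dead by the idle limit, e.g. classify_conn(-1, EAGAIN, 360).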

>>
>> From what you describe, this is on writes, isn't it? (I'm asking, as the original problem that was fixed with r367492 occurs in the read path (draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack).)
>>
>> I only added the so_snd buffer after some discussion, if the WAKESOR shouldn't have a symmetric equivalent on WAKESOW....
>>
>> Thus a partial backout (leaving the WAKESOR part inside, but reverting the WAKESOW part) would still fix my initial problem about erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
>>
>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the revert only on the so_snd upcall?
Since the krpc only uses receive upcalls, I don't see how reverting the send side would have
any effect?

>Since the release of 13.0 is almost done, can we try to fix the issue instead of reverting the commit?
I think it has already shipped broken.
I don't know if an errata is possible, or if it will be broken until 13.1.

--> I am much more concerned with the otis@ stuck client problem than this RST battle that only
       occurs after a network partitioning, especially if it is 13.0 specific.
       I did this testing to try to reproduce Jason's stuck client (with connection in CLOSE_WAIT)
       problem, which I failed to reproduce.

rick

Rs: agree, a good understanding of where the interaction between stack, socket and in-kernel TCP user breaks is needed;

>
> If this doesn't help, some major surgery will be necessary to prevent NFS sessions with SACK enabled from transmitting DSACKs...

My understanding is that the problem is related to getting a local error indication after
receiving a RST segment too late or not at all.

Rs: but the move of the upcall should not materially change that; I don't have a PC here to see if any upcall actually happens on RST...

Best regards
Michael
>
>
>> I know from a printf that this happened, but whether it caused the RST battle to not happen, I don't know.
>>
>> I can put r367492 back in and do more testing if you'd like, but I think it probably needs to be reverted?
>
> Please, I don't quite understand why the exact timing of the upcall would be that critical here...
>
> A comparison of the soxxx calls and errors between the "good" and the "bad" would be perfect. I don't know if this is easy to do though, as these calls appear to be scattered all around the RPC / NFS source paths.
>
>> This does not explain the original hung Linux client problem, but does shed light on the RST war I could create by doing a network partitioning.
>>
>> rick
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"


