Date:      Fri, 21 Jun 2019 14:23:54 -0700
From:      Igor Grinchenko <igor-fbsdnet@grinchenko.org>
To:        Michael Tuexen <tuexen@freebsd.org>
Cc:        freebsd-net@freebsd.org
Subject:   Re: unexpected TCP resets (RST) in 12.0-RELEASE
Message-ID:  <20190621212354.GP94573@sun.grinchenko.org>
In-Reply-To: <D977D765-1043-4789-81DB-0985D256BA71@freebsd.org>
References:  <20190621081941.GM94573@sun.grinchenko.org> <D977D765-1043-4789-81DB-0985D256BA71@freebsd.org>

On Fri, Jun 21, 2019 at 12:26:46PM +0200, Michael Tuexen wrote:
> > On 21. Jun 2019, at 10:19, Igor Grinchenko <igor-fbsdnet@grinchenko.org> wrote:
<..>
> > 
> > It doesn't seem to matter what app that is, doesn't matter what event mechanism is used (kqueue or select). TCP stack just refuses to handshake from time to time. The rate of these increases with the rate of incoming connections. 12.0-RELEASE-p6 seems to be producing fewer of these, it could be due to the fix in https://www.freebsd.org/security/advisories/FreeBSD-EN-19:11.net.asc . While my guess is not very scientific, it seems to be related to the new epoch(9) based synchronization. A 11.2-RELEASE host, serving the same exact traffic, which I kept for baselining is rock-solid and doesn't produce TCP resets like these.
> Hi Igor,
> 
> could you do
> sudo sysctl net.inet.tcp.log_debug=1
> on the host sending the RST segments and see if you get some messages in /var/log/messages.
> Do you see these messages? If yes, what is logged?
> 
> Do you have a way to reproduce this issue?
> 
> Best regards
> Michael
> > 

Michael,

First of all, thank you for the MIB pointer. It is very useful and should really be promoted for bug reports and mailing list postings.
Unfortunately, setting net.inet.tcp.log_debug=1 doesn't result in any log entries for this particular case. I see log entries for other, legitimate resets, but not for these.

I have been trying to find an easy way to reproduce it, but haven't managed to yet. It is a moderately loaded server running FPM, accepting and serving 300-500 requests per second, each on a new TCP connection. I seem to get significantly fewer resets on more powerful hardware (Intel Gold 6140 vs E5-2670 v3), all other things being equal.
Enabling net.inet.tcp.syncookies_only=1 doesn't seem to help, either.
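
For anyone who wants to try reproducing this, here is a rough sketch of the kind of client-side probe that could imitate the workload above: it opens a fresh TCP connection for every attempt at a fixed rate and tallies how each attempt ends. It is only an illustration, not something I have confirmed triggers the problem. The function name probe() and its parameters are made up for this example, and it assumes the resets arrive during the handshake, which a client would normally see as ECONNREFUSED.

```python
import socket
import time
from collections import Counter


def probe(host, port, count, rate, timeout=2.0):
    """Open `count` fresh TCP connections at roughly `rate` per second.

    Hypothetical helper for illustration only. Returns a Counter with the
    outcomes 'ok', 'refused', 'reset', 'timeout' and 'other'.
    """
    outcomes = Counter()
    interval = 1.0 / rate
    for _ in range(count):
        started = time.monotonic()
        try:
            # One new connection per attempt, closed right after the handshake.
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except ConnectionRefusedError:
            # An RST sent in reply to the SYN surfaces as ECONNREFUSED.
            outcomes["refused"] += 1
        except ConnectionResetError:
            outcomes["reset"] += 1
        except socket.timeout:
            outcomes["timeout"] += 1
        except OSError:
            outcomes["other"] += 1
        # Sleep off the rest of the interval to hold the target rate.
        delay = interval - (time.monotonic() - started)
        if delay > 0:
            time.sleep(delay)
    return outcomes
```

Calling something like probe("server.example", 80, 10000, 400) from another machine (host name and numbers are placeholders) and watching the 'refused' count would show whether a purely synthetic connection rate is enough to trigger the resets. The loop is sequential, so it cannot exceed one connection per round trip; several copies would have to run in parallel to reach higher rates.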

Is there anything else I can run to get better insight into what might be happening?

--
Igor