FreeBSD Mail Archives

Date:      Thu, 18 Jul 2019 09:41:14 +0200
From:      Michael Tuexen <tuexen@freebsd.org>
To:        Vitalij Satanivskij <satan@ukr.net>
Cc:        Paul <devgs@ukr.net>, freebsd-net@freebsd.org
Subject:   Re: Issues with TCP Timestamps allocation
Message-ID:  <BDA6466F-26E4-4DF9-9F19-6B4ABB377D05@freebsd.org>
In-Reply-To: <20190718073555.GA3770@hell.ukr.net>
References:  <1562579483.67527000.24rw4xi5@frv39.fwdcdn.com> <32FD061B-245C-41D2-81DE-1B4756A7173D@freebsd.org> <1562591379.369129000.gpmxvurq@frv39.fwdcdn.com> <DF65CA7F-B5FC-499D-B053-0531596D230C@freebsd.org> <1562599181.734953000.1l9a1d23@frv39.fwdcdn.com> <0C475A01-9BCD-4E4A-9731-09AB919CA9BE@freebsd.org> <1562676414.933145000.z3zteyqp@frv39.fwdcdn.com> <1E9F3F99-C3E9-44DD-AA70-9B11E19D4769@freebsd.org> <20190717074243.GA65665@hell.ukr.net> <B7BD397D-1B0E-4EFD-94EE-483C22952CD7@freebsd.org> <20190718073555.GA3770@hell.ukr.net>

> On 18. Jul 2019, at 09:35, Vitalij Satanivskij <satan@ukr.net> wrote:
>
> Yep. The patch works.
Thanks for testing and reporting.

Best regards
Michael
>
> Michael Tuexen wrote:
> MT> > On 17. Jul 2019, at 09:42, Vitalij Satanivskij <satan@ukr.net> wrote:
> MT> >
> MT> > Hello.
> MT> >
> MT> > Is there any news on this problem?
> MT> Please find a patch in https://reviews.freebsd.org/D20980
> MT>
> MT> If possible, please test and report.
> MT>
> MT> Best regards
> MT> Michael
> MT> >
> MT> > I'm using FreeBSD 12 on my desktop and can confirm that the problem occurs with some hosts.
> MT> >
> MT> > Michael Tuexen wrote:
> MT> > MT>
> MT> > MT> > On 9. Jul 2019, at 14:58, Paul <devgs@ukr.net> wrote:
> MT> > MT> >
> MT> > MT> > Hi Michael,
> MT> > MT> >
> MT> > MT> > 9 July 2019, 15:34:29, by "Michael Tuexen" <tuexen@freebsd.org>:
> MT> > MT> >
> MT> > MT> >>
> MT> > MT> >>> On 8. Jul 2019, at 17:22, Paul <devgs@ukr.net> wrote:
> MT> > MT> >>>
> MT> > MT> >>> 8 July 2019, 17:12:21, by "Michael Tuexen" <tuexen@freebsd.org>:
> MT> > MT> >>>
> MT> > MT> >>>>> On 8. Jul 2019, at 15:24, Paul <devgs@ukr.net> wrote:
> MT> > MT> >>>>>
> MT> > MT> >>>>> Hi Michael,
> MT> > MT> >>>>>
> MT> > MT> >>>>> 8 July 2019, 15:53:15, by "Michael Tuexen" <tuexen@freebsd.org>:
> MT> > MT> >>>>>
> MT> > MT> >>>>>>> On 8. Jul 2019, at 12:37, Paul <devgs@ukr.net> wrote:
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> Hi team,
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> Recently we upgraded to 12 Stable. Immediately after, we started
> MT> > MT> >>>>>>> seeing strange connection establishment timeouts to a fixed set
> MT> > MT> >>>>>>> of external (world) hosts. The issue was persistent and easy to reproduce.
> MT> > MT> >>>>>>> Thanks to the patience and dedication of our system engineer, we have tracked
> MT> > MT> >>>>>>> this issue down to a specific commit:
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> https://svnweb.freebsd.org/base?view=revision&revision=338053
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> This patch was also back-ported into 11 Stable:
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> https://svnweb.freebsd.org/base?view=revision&revision=348435
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> Among other things, this patch changes the timestamp allocation strategy
> MT> > MT> >>>>>>> by introducing deterministic randomness via a hash function that takes
> MT> > MT> >>>>>>> into account a random key as well as the source address, source port, destination
> MT> > MT> >>>>>>> address and destination port. As a result, the timestamp offsets of different
> MT> > MT> >>>>>>> tuples (SA,SP,DA,DP) will be wildly different and will jump from small
> MT> > MT> >>>>>>> to large numbers and back whenever something in the tuple changes.
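
As a rough illustration of the effect described above, here is a minimal C sketch. It is not the kernel code: `struct tuple`, `toy_keyed_hash()` and both `ts_offset_*()` helpers are hypothetical names, and FNV-1a is only a stand-in for whatever keyed hash `tcp_keyed_hash()` really uses. It shows that hashing the full tuple yields unrelated offsets per connection, while a variant that ignores the ports yields one offset per address pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the connection 4-tuple (not struct in_conninfo). */
struct tuple {
	uint32_t saddr, daddr;	/* source / destination address */
	uint16_t sport, dport;	/* source / destination port */
};

/* Toy keyed hash: FNV-1a over the key followed by the tuple fields. */
static uint32_t
toy_keyed_hash(const struct tuple *t, uint32_t key)
{
	uint32_t fields[5] = { key, t->saddr, t->daddr, t->sport, t->dport };
	const unsigned char *p = (const unsigned char *)fields;
	uint32_t h = 2166136261u;	/* FNV offset basis */

	for (size_t i = 0; i < sizeof(fields); i++) {
		h ^= p[i];
		h *= 16777619u;		/* FNV prime */
	}
	return (h);
}

/* Offset depends on the whole tuple: a new source port gives a new offset. */
static uint32_t
ts_offset_with_ports(const struct tuple *t, uint32_t key)
{
	return (toy_keyed_hash(t, key));
}

/* Offset depends on the addresses only: the ports are zeroed before hashing. */
static uint32_t
ts_offset_without_ports(const struct tuple *t, uint32_t key)
{
	struct tuple copy = *t;

	copy.sport = 0;
	copy.dport = 0;
	return (toy_keyed_hash(&copy, key));
}
```

Calling both helpers for two tuples that differ only in the source port gives two different values from `ts_offset_with_ports()` and the same value from `ts_offset_without_ports()`.
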
> MT> > MT> >>>>>> Hi Paul,
> MT> > MT> >>>>>>
> MT> > MT> >>>>>> this is correct.
> MT> > MT> >>>>>>
> MT> > MT> >>>>>> Please note that the same happens with the old method if two hosts with
> MT> > MT> >>>>>> different uptimes are behind a consumer grade NAT.
> MT> > MT> >>>>>
> MT> > MT> >>>>> If the NAT does not replace timestamps then yes, that should be the case.
> MT> > MT> >>>>>
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> After performing various tests against hosts that produce the above mentioned
> MT> > MT> >>>>>>> issue, we came to the conclusion that there are some interesting implementations
> MT> > MT> >>>>>>> that drop SYN packets with timestamps smaller than the largest timestamp
> MT> > MT> >>>>>>> value from streams of all recent or current connections from a specific
> MT> > MT> >>>>>>> address. This looks like some kind of SYN flood protection.
> MT> > MT> >>>>>> This also breaks multiple hosts with different uptimes behind a consumer
> MT> > MT> >>>>>> level NAT talking to such a server.
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> To ensure that each external host does not see wild jumps in
> MT> > MT> >>>>>>> timestamp values, I propose a patch that removes the ports from the equation
> MT> > MT> >>>>>>> altogether when calculating the timestamp offset:
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> Index: sys/netinet/tcp_subr.c
> MT> > MT> >>>>>>> ===================================================================
> MT> > MT> >>>>>>> --- sys/netinet/tcp_subr.c	(revision 348435)
> MT> > MT> >>>>>>> +++ sys/netinet/tcp_subr.c	(working copy)
> MT> > MT> >>>>>>> @@ -2224,7 +2224,22 @@
> MT> > MT> >>>>>>> uint32_t
> MT> > MT> >>>>>>> tcp_new_ts_offset(struct in_conninfo *inc)
> MT> > MT> >>>>>>> {
> MT> > MT> >>>>>>> -	return (tcp_keyed_hash(inc, V_ts_offset_secret));
> MT> > MT> >>>>>>> +        /*
> MT> > MT> >>>>>>> +         * Some implementations show strange behaviour when wildly random
> MT> > MT> >>>>>>> +         * timestamps are allocated for different streams. It seems that only the
> MT> > MT> >>>>>>> +         * SYN packets are affected. Observed implementations drop SYN packets
> MT> > MT> >>>>>>> +         * with timestamps smaller than the largest timestamp value of all
> MT> > MT> >>>>>>> +         * recent or current connections from a specific address. To mitigate
> MT> > MT> >>>>>>> +         * this we are going to ensure that each host will always observe
> MT> > MT> >>>>>>> +         * timestamps as increasing no matter the stream: by dropping ports
> MT> > MT> >>>>>>> +         * from the equation.
> MT> > MT> >>>>>>> +         */
> MT> > MT> >>>>>>> +        struct in_conninfo inc_copy = *inc;
> MT> > MT> >>>>>>> +
> MT> > MT> >>>>>>> +        inc_copy.inc_fport = 0;
> MT> > MT> >>>>>>> +        inc_copy.inc_lport = 0;
> MT> > MT> >>>>>>> +
> MT> > MT> >>>>>>> +	return (tcp_keyed_hash(&inc_copy, V_ts_offset_secret));
> MT> > MT> >>>>>>> }
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> /*
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> In any case, the fix for the uptime leak implemented in rev338053 is
> MT> > MT> >>>>>>> not going to suffer, because a supposed attacker is already able to use
> MT> > MT> >>>>>>> any fixed values of SP and DP, albeit not 0, to take them out
> MT> > MT> >>>>>>> of the equation.
> MT> > MT> >>>>>> Can you describe how a peer can compute the uptime from two observed timestamps?
> MT> > MT> >>>>>> I don't see how you can do that...
> MT> > MT> >>>>>
> MT> > MT> >>>>> A supposed attacker could run a script that continuously monitors timestamps,
> MT> > MT> >>>>> for example via a periodic TCP connection from a fixed local port (e.g. 12345)
> MT> > MT> >>>>> and a fixed local address to the victim's fixed address and port (e.g. 80).
> MT> > MT> >>>>> Whenever a large discrepancy is observed, the attacker can assume that a reboot has
> MT> > MT> >>>>> happened (due to V_ts_offset_secret re-generation), hence the received
> MT> > MT> >>>>> timestamp is considered an approximate point of reboot from which the uptime
> MT> > MT> >>>>> can be calculated, until the next reboot and so on.
> MT> > MT> >>>> Ahh, I see. The patch we are talking about is not intended to protect against
> MT> > MT> >>>> continuous monitoring, which is something you can always do. You could even
> MT> > MT> >>>> watch for service availability and detect reboots. A change of the local key
> MT> > MT> >>>> would also look similar to a reboot without a temporary loss of connectivity.
> MT> > MT> >>>>
> MT> > MT> >>>> Thanks for the clarification.
> MT> > MT> >>>>>
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> Here is a list of example hosts with which we were able to reproduce the
> MT> > MT> >>>>>>> issue:
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> curl -v http://88.99.60.171:80
> MT> > MT> >>>>>>> curl -v http://163.172.71.252:80
> MT> > MT> >>>>>>> curl -v http://5.9.242.150:80
> MT> > MT> >>>>>>> curl -v https://185.134.205.105:443
> MT> > MT> >>>>>>> curl -v https://136.243.1.231:443
> MT> > MT> >>>>>>> curl -v https://144.76.196.4:443
> MT> > MT> >>>>>>> curl -v http://94.127.191.194:80
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> To reproduce, call curl repeatedly with the same URL a number of times.
> MT> > MT> >>>>>>> You are going to see some of the requests stuck in
> MT> > MT> >>>>>>> `*    Trying XXX.XXX.XXX.XXX...`
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> For some reason, the easiest way to reproduce the issue is with nc:
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> $ echo "foooooo" | nc -v 88.99.60.171 80
> MT> > MT> >>>>>>>
> MT> > MT> >>>>>>> Only a few such calls are required until one of them is stuck on connect(),
> MT> > MT> >>>>>>> issuing SYN packets with an exponential backoff.
> MT> > MT> >>>>>> Thanks for providing an end-point to test with. I'll take a look.
> MT> > MT> >>>>>> Just to be clear: You are running a FreeBSD client against one of the above
> MT> > MT> >>>>>> servers and experience the problem with the new timestamp computations.
> MT> > MT> >>>>>>
> MT> > MT> >>>>>> You are not running arbitrary clients against a FreeBSD server...
> MT> > MT> >>>>>
> MT> > MT> >>>>> We are talking about FreeBSD being the client. The peers that exhibit this unwanted
> MT> > MT> >>>>> behaviour are unknown. A little bit of tinkering showed that some of them run
> MT> > MT> >>>>> Debian:
> MT> > MT> >>>>>
> MT> > MT> >>>>> telnet 88.99.60.171 22
> MT> > MT> >>>>> Trying 88.99.60.171...
> MT> > MT> >>>>> Connected to 88.99.60.171.
> MT> > MT> >>>>> Escape character is '^]'.
> MT> > MT> >>>>> SSH-2.0-OpenSSH_6.7p1 Debian-5+deb8u3
> MT> > MT> >>>> Also some are hosted by Hetzner, but not all. I will look into
> MT> > MT> >>>> this tomorrow, since I'm on a deadline today (well it is 2am tomorrow
> MT> > MT> >>>> morning, to be precise)...
> MT> > MT> >>>
> MT> > MT> >>> Thanks a lot, I would appreciate that.
> MT> > MT> >> Hi Paul,
> MT> > MT> >>
> MT> > MT> >> I have looked into this.
> MT> > MT> >>
> MT> > MT> >> * The FreeBSD behaviour is the one which is specified in the last bullet item
> MT> > MT> >>  in https://tools.ietf.org/html/rfc7323#section-5.4
> MT> > MT> >>  It is also the one which is RECOMMENDED in
> MT> > MT> >>  https://tools.ietf.org/html/rfc7323#section-7.1
> MT> > MT> >>
> MT> > MT> >> * My NAT box (a popular one in Germany) does NOT rewrite TCP timestamps.
> MT> > MT> >>
> MT> > MT> >> This means that the hosts you are referring to have some sort of protection
> MT> > MT> >> which makes incorrect assumptions. It will also break multiple hosts behind
> MT> > MT> >> a NAT.
> MT> > MT> >>
> MT> > MT> >> I can run
> MT> > MT> >> curl -v http://88.99.60.171:80
> MT> > MT> >> in a loop without any problems from a FreeBSD head system. I tested 1000
> MT> > MT> >> iterations or so. The TS.val is jumping up and down as expected.
> MT> > MT> >> I'm wondering why you are observing errors in this case, too.
> MT> > MT> >>
> MT> > MT> >> However, doing something like
> MT> > MT> >> echo "foooooo" | nc -v 88.99.60.171 80
> MT> > MT> >> triggers the problem.
> MT> > MT> >>
> MT> > MT> >> So I think there is some functionality (in a middlebox or running on the host)
> MT> > MT> >> which incorrectly assumes monotonic timestamps between multiple TCP connections
> MT> > MT> >> coming from the same IP address, but only in case of errors at the application layer.
> MT> > MT> >
> MT> > MT> > Yeah, exactly, some hosts seem to enable this only in case of an error in HTTP
> MT> > MT> > communication (some smart proxy?). However, there are some that behave this way
> MT> > MT> > regardless of errors, for example these:
> MT> > MT> >
> MT> > MT> > curl -v https://185.134.205.105:443
> MT> > MT> > curl -v https://136.243.1.231:443
> MT> > MT> Wireshark sees an Encrypted Alert in both cases. So I guess this is another indication
> MT> > MT> of "error at the application layer".
> MT> > MT> >
> MT> > MT> >>
> MT> > MT> >> Do you have any insight into whether the hosts you listed share something in
> MT> > MT> >> common? Some of them are hosted by Hetzner, but not all.
> MT> > MT> >
> MT> > MT> > Nope. The whole set of endpoints that we have detected so far is pretty diverse,
> MT> > MT> > containing a lot of different geographic locations, as well as different
> MT> > MT> > hosters.
> MT> > MT> OK. Thanks for the clarification.
> MT> > MT> >
> MT> > MT> >>
> MT> > MT> >> I think in general, it is the correct thing to include the port numbers in
> MT> > MT> >> the offset computation. We might add a sysctl variable to control the inclusion.
> MT> > MT> >> This would allow interworking with broken middleboxes.
> MT> > MT> >
> MT> > MT> > Yeah, I completely agree that these rare cases should not dictate the implementation.
> MT> > MT> > But the ability to enable a work-around via sysctl would be greatly appreciated.
> MT> > MT> > Currently we are unable to roll out the upgrade across all servers because of this
> MT> > MT> > issue: even though it does not happen often, a lot of requests from our users
> MT> > MT> > get stuck or fail altogether. For example, the host 185.134.205.105 is a kind of
> MT> > MT> > social network that our proxy servers connect to in order to securely access content,
> MT> > MT> > such as images, on behalf of our users.
> MT> > MT> >
> MT> > MT> >>
> MT> > MT> >> Please note, this does not fix the case of multiple clients behind a NAT.
> MT> > MT> >
> MT> > MT> > Yeah, that's true. Fortunately we don't use NAT.
> MT> > MT> >
> MT> > MT> >>
> MT> > MT> >> I'm also trying to figure out how and why Linux and Windows are handling this.
> MT> > MT> >
> MT> > MT> > Thanks for bothering!
> MT> > MT> Will let you know what I figure out.
> MT> > MT>
> MT> > MT> Best regards
> MT> > MT> Michael
> MT> > MT> >
> MT> > MT> >>
> MT> > MT> >> Best regards
> MT> > MT> >> Michael
> MT> > MT> >>
> MT> > MT> >>>>
> MT> > MT> >>>> Best regards
> MT> > MT> >>>> Michael
> MT> > MT> >>>>>
> MT> > MT> >>>>>>
> MT> > MT> >>>>>> Best regards
> MT> > MT> >>>>>> Michael
> MT> > MT> >>>>>>
> MT> > MT>
> MT> > MT> _______________________________________________
> MT> > MT> freebsd-net@freebsd.org mailing list
> MT> > MT> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> MT> > MT> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?BDA6466F-26E4-4DF9-9F19-6B4ABB377D05>
