From: Michael Tuexen
To: Paul
Cc: freebsd-net@freebsd.org
Subject: Re: Issues with TCP Timestamps allocation
Date: Mon, 8 Jul 2019 14:53:11 +0200
Message-Id: <32FD061B-245C-41D2-81DE-1B4756A7173D@freebsd.org>
In-Reply-To: <1562579483.67527000.24rw4xi5@frv39.fwdcdn.com>
References: <1562579483.67527000.24rw4xi5@frv39.fwdcdn.com>
List-Id: Networking and TCP/IP with FreeBSD

> On 8. Jul 2019, at 12:37, Paul wrote:
>
> Hi team,
>
> Recently we had an upgrade to 12 Stable. Immediately after, we have started
> seeing some strange connection establishment timeouts to some fixed number
> of external (world) hosts. The issue was persistent and easy to reproduce.
> Thanks to the patience and dedication of our system engineer we have tracked
> this issue down to a specific commit:
>
> https://svnweb.freebsd.org/base?view=revision&revision=338053
>
> This patch was also back-ported into 11 Stable:
>
> https://svnweb.freebsd.org/base?view=revision&revision=348435
>
> Among other things this patch changes the timestamp allocation strategy,
> by introducing a deterministic randomness via a hash function that takes
> into account a random key as well as source address, source port, dest
> address and dest port. As a result, timestamp offsets of different
> tuples (SA,SP,DA,DP) will be wildly different and will jump from small
> to large numbers and back, as long as something in the tuple changes.

Hi Paul,

this is correct. Please note that the same happens with the old method, if
two hosts with different uptimes are behind a consumer grade NAT.

>
> After performing various tests of hosts that produce the above mentioned
> issue we came to the conclusion that there are some interesting implementations
> that drop SYN packets with timestamps smaller than the largest timestamp
> value from streams of all recent or current connections from a specific
> address. This looks like some kind of SYN flood protection.

This also breaks multiple hosts with different uptimes behind a consumer
level NAT talking to such a server.
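
The interaction described above can be sketched in userland C. This is only a model under stated assumptions: `struct tuple`, `toy_keyed_hash`, `syn_tsval` and `syn_filter_accept` are hypothetical names, the FNV-1a style mix merely stands in for the kernel's keyed hash, and the drop rule (plain unsigned comparison against the largest TSval seen from one client address) is the behaviour Paul infers from his tests, not a confirmed server implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-tuple; NOT the kernel's struct in_conninfo. */
struct tuple {
	uint32_t laddr, faddr;
	uint16_t lport, fport;
};

/* Toy keyed mix (FNV-1a style) standing in for the real keyed hash. */
static uint32_t
toy_keyed_hash(const struct tuple *t, uint32_t key)
{
	uint32_t words[5] = { key, t->laddr, t->faddr, t->lport, t->fport };
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < 5; i++) {
		for (int b = 0; b < 4; b++) {
			h ^= (words[i] >> (8 * b)) & 0xffu;
			h *= 16777619u;
		}
	}
	return (h);
}

/* TSval sent in a SYN: clock plus per-tuple offset, wrapping mod 2^32. */
static uint32_t
syn_tsval(const struct tuple *t, uint32_t key, uint32_t now_ms)
{
	return (now_ms + toy_keyed_hash(t, key));
}

/* Assumed per-client-address state kept by the filtering server. */
struct syn_filter {
	uint32_t max_tsval;
	bool seen;
};

/* Assumed rule: drop a SYN whose TSval is below the largest one seen. */
static bool
syn_filter_accept(struct syn_filter *f, uint32_t tsval)
{
	if (f->seen && tsval < f->max_tsval)
		return (false);
	f->max_tsval = tsval;
	f->seen = true;
	return (true);
}
```

Feeding such a filter the TSvals of successive connections that differ only in the source port shows the failure mode: whenever the new tuple's offset happens to be smaller than the previous one, the SYN is rejected until the retransmission clock catches up or the server forgets its state.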
>
> To ensure that each external host is not going to see wild jumps of
> timestamp values I propose a patch that removes ports from the equation
> altogether, when calculating the timestamp offset:
>
> Index: sys/netinet/tcp_subr.c
> ===================================================================
> --- sys/netinet/tcp_subr.c	(revision 348435)
> +++ sys/netinet/tcp_subr.c	(working copy)
> @@ -2224,7 +2224,22 @@
>  uint32_t
>  tcp_new_ts_offset(struct in_conninfo *inc)
>  {
> -	return (tcp_keyed_hash(inc, V_ts_offset_secret));
> +	/*
> +	 * Some implementations show a strange behaviour when wildly random
> +	 * timestamps are allocated for different streams. It seems that only the
> +	 * SYN packets are affected. Observed implementations drop SYN packets
> +	 * with timestamps smaller than the largest timestamp value of all
> +	 * recent or current connections from a specific address. To mitigate
> +	 * this we are going to ensure that each host will always observe
> +	 * timestamps as increasing no matter the stream: by dropping ports
> +	 * from the equation.
> +	 */
> +	struct in_conninfo inc_copy = *inc;
> +
> +	inc_copy.inc_fport = 0;
> +	inc_copy.inc_lport = 0;
> +
> +	return (tcp_keyed_hash(&inc_copy, V_ts_offset_secret));
>  }
>
>  /*
>
> In any case, the solution of the uptime leak, implemented in rev338053 is
> not going to suffer, because a supposed attacker is currently able to use
> any fixed values of SP and DP, albeit not 0, anyway, to remove them out
> of the equation.

Can you describe how a peer can compute the uptime from two observed
timestamps? I don't see how you can do that...
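
For reference, the effect of the quoted patch on the offset can be modelled outside the kernel. This is a rough sketch, not the kernel code: `struct tuple`, `toy_keyed_hash`, `ts_offset_with_ports` and `ts_offset_without_ports` are hypothetical names, and the FNV-1a style mix is only a stand-in for `tcp_keyed_hash()`. The only step taken from the patch is copying the tuple and zeroing both ports before hashing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-tuple; NOT the kernel's struct in_conninfo. */
struct tuple {
	uint32_t laddr, faddr;
	uint16_t lport, fport;
};

/* Toy keyed mix (FNV-1a style) standing in for tcp_keyed_hash(). */
static uint32_t
toy_keyed_hash(const struct tuple *t, uint32_t key)
{
	uint32_t words[5] = { key, t->laddr, t->faddr, t->lport, t->fport };
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < 5; i++) {
		for (int b = 0; b < 4; b++) {
			h ^= (words[i] >> (8 * b)) & 0xffu;
			h *= 16777619u;
		}
	}
	return (h);
}

/* Model of the current behaviour: the offset depends on the full 4-tuple. */
static uint32_t
ts_offset_with_ports(const struct tuple *t, uint32_t key)
{
	return (toy_keyed_hash(t, key));
}

/* Model of the proposed behaviour: copy the tuple, zero both ports, hash. */
static uint32_t
ts_offset_without_ports(const struct tuple *t, uint32_t key)
{
	struct tuple copy = *t;

	copy.lport = 0;
	copy.fport = 0;
	return (toy_keyed_hash(&copy, key));
}
```

In this model every connection between the same pair of addresses shares one offset, so the TSvals a given remote host sees follow the local clock, while different remote addresses still get unrelated offsets.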
>
> Here is the list of example hosts that we were able to reproduce the
> issue with:
>
> curl -v http://88.99.60.171:80
> curl -v http://163.172.71.252:80
> curl -v http://5.9.242.150:80
> curl -v https://185.134.205.105:443
> curl -v https://136.243.1.231:443
> curl -v https://144.76.196.4:443
> curl -v http://94.127.191.194:80
>
> To reproduce, call curl repeatedly with the same URL some number of times.
> You are going to see some of the requests stuck in
> `* Trying XXX.XXX.XXX.XXX...`
>
> For some reason, the easiest way to reproduce the issue is with nc:
>
> $ echo "foooooo" | nc -v 88.99.60.171 80
>
> Only a few such calls are required until one of them is stuck on connect():
> issuing SYN packets with an exponential backoff.

Thanks for providing an end-point to test with. I'll take a look.
Just to be clear: You are running a FreeBSD client against one of the above
servers and experience the problem with the new timestamp computations.
You are not running arbitrary clients against a FreeBSD server...

Best regards
Michael