From owner-svn-src-all@FreeBSD.ORG  Fri Apr  3 08:52:05 2015
Return-Path: <owner-svn-src-all@FreeBSD.ORG>
Delivered-To: svn-src-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id E1FAE4A7;
 Fri,  3 Apr 2015 08:52:05 +0000 (UTC)
Received: from cyrus.watson.org (cyrus.watson.org [198.74.231.69])
 by mx1.freebsd.org (Postfix) with ESMTP id 9474195D;
 Fri,  3 Apr 2015 08:52:05 +0000 (UTC)
Received: from [10.0.1.17] (host81-157-243-31.range81-157.btcentralplus.com
 [81.157.243.31])
 by cyrus.watson.org (Postfix) with ESMTPSA id C3B0446BB1;
 Fri,  3 Apr 2015 04:52:03 -0400 (EDT)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
Subject: Re: svn commit: r280971 - in head: contrib/ipfilter/tools
 share/man/man4 sys/contrib/ipfilter/netinet sys/netinet sys/netipsec
 sys/netpfil/pf
From: "Robert N. M. Watson" <rwatson@FreeBSD.org>
In-Reply-To: <551E520E.1040708@selasky.org>
Date: Fri, 3 Apr 2015 09:52:01 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <6DF5FB51-8135-4144-BD3A-6E4127A23AA7@FreeBSD.org>
References: <201504012226.t31MQedN044443@svn.freebsd.org>
 <1427929676.82583.103.camel@freebsd.org> <20150402123522.GC64665@FreeBSD.org>
 <20150402133751.GA549@dft-labs.eu> <20150402134217.GG64665@FreeBSD.org>
 <20150402135157.GB549@dft-labs.eu> <1427983109.82583.115.camel@freebsd.org>
 <20150402142318.GC549@dft-labs.eu> <20150402143420.GI64665@FreeBSD.org>
 <20150402153805.GD549@dft-labs.eu>
 <alpine.BSF.2.11.1504021657440.27263@fledge.watson.org>
 <551D8143.4060509@selasky.org> <551D8945.8050906@selasky.org>
 <8900318B-8155-4131-A0C3-3DE169782EFC@FreeBSD.org>
 <551D8C6C.9060504@selasky.org>
 <alpine.BSF.2.11.1504021939390.64391@fledge.watson.org>
 <551DA5EA.1080908@selasky.org> <551DAC9E.9010303@selasky.org>
 <358EC58D-1F92-411E-ADEB-8072020E9EB3@FreeBSD.org>
 <551DEF26.4000403@selasky.org>
 <4B7DAA59-389F-41AE-99D8-034A7AA61C99@FreeBSD.org>
 <551E520E.1040708@selasky.org>
To: Hans Petter Selasky <hps@selasky.org>
X-Mailer: Apple Mail (2.2070.6)
Cc: Mateusz Guzik <mjguzik@gmail.com>, Ian Lepore <ian@freebsd.org>,
 svn-src-all@freebsd.org, src-committers@freebsd.org,
 Gleb Smirnoff <glebius@FreeBSD.org>, svn-src-head@freebsd.org
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
 user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all/>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 03 Apr 2015 08:52:06 -0000

On 3 Apr 2015, at 09:40, Hans Petter Selasky <hps@selasky.org> wrote:

>> There are countless covert channels in TCP/IP; breaking the IP
>> implementation to close a covert channel is probably not a worthwhile
>> investment.
>
> The IP ID channel is a _broadcast_ channel to all devices connected to
> the same network stack, including all VPN connections and even
> localhost. It is high speed and it cannot be blocked by firewall rules,
> and works across large networks. The other covert channels can easily
> be reduced by firewall rules. This one can't.
>
> Now that Gleb has put in a patch so that the shared IP ID counter is
> not used that frequently, only for specific traffic like ping packets,
> I believe this is very likely to be abused.

Research into covert channels has been going on for 30+ years, and the
conclusion of that research has been that it is almost impossible to
eliminate covert channels from designs intended for high-performance
data sharing. This is just one covert channel of countless channels, and
given that the stack would need to be fundamentally redesigned to
eliminate many of them, covert-channel elimination should not be a
primary design concern for this code. As such, we might eliminate it as
a side effect of another change, but I don't think it's a good
motivation to make changes.

>> As indicated in pretty much the original RFC on the topic, IP IDs
>> need to be at minimum unique to a 2-tuple pair, so cannot be
>> unique only at the granularity of TCP or UDP connections, GRE
>> associations, etc. However, our current implementation keeps them
>> globally unique, which means they wrap much faster than
>> necessary. Shifting to unique IP ID spaces for IP 2-tuples would
>> provide for a much longer wrapping time at the cost of
>> maintaining (and looking up!) additional state. There are various
>> ways to improve things -- and not all require a full set of
>> per-IP-2-tuple IP ID counters, for example, you could have hash
>> buckets based on 2 tuples. It's harder to do this in a
>> multiprocessor-scalable way, however, as the uniqueness
>> requirements are global, and the IP ID space is very small -- a
>> more fundamental problem. In general, the world therefore tries
>> quite hard not to fragment, using TCP PMTU and careful MTU
>> selection for UDP (etc). Also, the world has become quite a lot
>> more homogeneous with respect to link-layer MTU over time --
>> e.g., with convergence on Ethernet, although VPNs have made
>> things a bit less fun.
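
To make the hash-bucket idea in the quoted text above concrete, here is a minimal sketch in C. It is an illustration under stated assumptions, not the FreeBSD implementation: the names ipid_bucket, ipid_hash, ipid_next and ipid_wrap_seconds, the bucket count, and the toy hash are all hypothetical. It is also deliberately single-threaded; making the per-bucket increment multiprocessor-scalable is the hard part noted above and is not attempted here.

```c
#include <assert.h>
#include <stdint.h>

#define IPID_BUCKETS 256	/* hypothetical size; must be a power of two */

/* One 16-bit counter per bucket instead of one global counter. */
static uint16_t ipid_bucket[IPID_BUCKETS];

/* Toy mixing hash over the (src, dst) address pair; not a vetted hash. */
static uint32_t
ipid_hash(uint32_t src, uint32_t dst)
{
	uint32_t h = src ^ (dst * 2654435761u);

	h ^= h >> 16;
	return (h & (IPID_BUCKETS - 1));
}

/*
 * Next IP ID for this 2-tuple. Every packet for a given (src, dst) pair
 * lands in the same bucket, so IDs stay unique per 2-tuple until that
 * bucket's counter wraps. No locking or atomics: single-threaded sketch.
 */
static uint16_t
ipid_next(uint32_t src, uint32_t dst)
{
	assert((IPID_BUCKETS & (IPID_BUCKETS - 1)) == 0);
	return (ipid_bucket[ipid_hash(src, dst)]++);
}

/* Seconds until a 16-bit counter wraps at a given packet rate. */
static double
ipid_wrap_seconds(double packets_per_second)
{
	return (65536.0 / packets_per_second);
}
```

The wrap-time helper shows why spreading packets over buckets helps: a counter that sees only a fraction of the total packet rate takes proportionally longer to run through its 65536 values.
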
>
> The IP ID field should have been 64-bit, containing a copy of the
> 16-bit source and destination TCP/UDP ports and a 32-bit sequence
> number. Now that's not possible, but how about saying that each unique
> IP can have at maximum 16 different connections passing to another
> unique IP. And then reduce the sequence number to 8-bits. So:
>
> IP ID = ((src port) & 0xF) | (((dst port) & 0xF) << 4) |
>     ((inp->inp_sequence++) << 8);
>
> Whenever we see TCP PMTU activated we can release some more
> combinations to a common pool somewhere. This will also work with IP
> encapsulations, where some bits of the sequence number get replaced, if
> the IP ID is encoded the same ...
>
> You might call me a freshman in the IP stack area and I'm very
> surprised about all the issues I've come across in this area over the
> last couple of months. I am starting to understand why DragonFly forked
> and why there is something called InfiniBand.

Before engaging further in this conversation, and trying to modify the
behaviour of the TCP/IP stack, you need to educate yourself about the
design and history of the protocols involved. Otherwise, you're going to
repeatedly suggest ideas that are fundamentally broken, and we're going
to waste our time shooting them down when you could just have done a bit
of background reading and learned the basics of the protocol design and
implementation.

Robert


>
> Robert and Gleb:
>
> >  multiprocessor-scalable
>
> Won't r280971 do exactly what you told me was not a good idea and are
> giving me some criticism for? Namely, result in one IP ID counter per
> TCP/UDP connection. Suppose you have two applications, each running on
> its own core. One causes updates to the IP ID value X times per time
> unit and the other one Y times per time unit. If (X - Y) is odd (50%
> chance), then at some point the two IP IDs *will* reach exactly the
> same value in a predictable fashion, even if the amount of traffic is
> considered "low". And I think the chance increases with more cores,
> looking at this from the pure perspective of mathematics.
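
The arithmetic behind the quoted claim can be checked with a small C sketch. It uses an idealized model that is an assumption of this example, not something stated in the thread: two independent 16-bit counters advance by fixed amounts x and y per time step, and values are compared only at step boundaries. The function name first_collision is hypothetical. In that model the counters coincide at step t when (x - y) * t equals the initial gap modulo 65536, and an odd difference is invertible modulo 65536, so a solution always exists.

```c
#include <assert.h>
#include <stdint.h>

/*
 * First time step t in [0, 65536) at which two 16-bit counters, starting
 * at a0 and b0 and advancing by x and y per step, hold the same value.
 * Returns -1 if they never coincide at a step boundary.
 */
static int32_t
first_collision(uint16_t a0, uint16_t b0, uint16_t x, uint16_t y)
{
	uint16_t a = a0, b = b0;
	int32_t t;

	for (t = 0; t < 65536; t++) {
		if (a == b)
			return (t);
		a = (uint16_t)(a + x);
		b = (uint16_t)(b + y);
	}
	/* An odd x - y always collides within 65536 steps, so it is even here. */
	assert(((uint16_t)(x - y) & 1) == 0);
	return (-1);
}
```

With an even difference the outcome depends on the starting gap: counters starting at 0 and 1 and advancing by 3 and 1 never meet, since 2 * t is never congruent to 1 modulo 65536.
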
>
> Why then is r280971 fine, when it does the same as D2211, except that
> D2211 does it in a predictable fashion while r280971 is unpredictable?
> A clear IP ID number sequence on a TCP stream maybe wouldn't even need
> an explanation. Even a 12-year-old would understand: a-ha, this TCP
> stream is incrementing that fast and that stream is incrementing that
> fast, and in the end there is a collision. When a collision happens we
> will have a retransmit, and then maybe we can randomize the next IP ID
> value a bit to avoid repeated collisions.
>
> --HPS