From: "Robert N. M. Watson"
Date: Fri, 3 Apr 2015 09:52:01 +0100
To: Hans Petter Selasky
Cc: Mateusz Guzik, Ian Lepore, svn-src-all@freebsd.org, src-committers@freebsd.org, Gleb Smirnoff, svn-src-head@freebsd.org
Subject: Re: svn commit: r280971 - in head: contrib/ipfilter/tools share/man/man4 sys/contrib/ipfilter/netinet sys/netinet sys/netipsec sys/netpfil/pf
Message-Id: <6DF5FB51-8135-4144-BD3A-6E4127A23AA7@FreeBSD.org>
In-Reply-To: <551E520E.1040708@selasky.org>

On 3 Apr 2015, at 09:40, Hans Petter Selasky wrote:

>> There are countless covert channels in TCP/IP; breaking the IP implementation to close a covert channel is probably not a worthwhile investment.
>
> The IP ID channel is a _broadcast_ channel to all devices connected to the same network stack, including all VPN connections and even localhost. It is high speed and it cannot be blocked by firewall rules, and works across large networks. The other covert channels can easily be reduced by firewall rules. This one can't.
>
> Now that Gleb put in a patch that the shared IP ID counter is not used that frequently, only for specific traffic like ping packets, I believe this is very likely to be abused.

Research into covert channels has been going on for 30+ years, and the conclusion of that research has been that it is almost impossible to eliminate covert channels from designs intended for high-performance data sharing. This is just one covert channel of countless channels, and given that the stack would need to be fundamentally redesigned to eliminate many of them, covert-channel elimination should not be a primary design concern for this code. As such, we might eliminate it as a side effect of another change, but I don't think it's a good motivation to make changes.

>> As indicated in pretty much the original RFC on the topic, IP IDs
>> need to be at minimum unique to a 2-tuple pair, so cannot be
>> unique only at the granularity of TCP or UDP connections, GRE
>> associations, etc.
>> However, our current implementation keeps them
>> globally unique, which means they wrap much faster than
>> necessary. Shifting to unique IP ID spaces for IP 2-tuples would
>> provide for a much longer wrapping time at the cost of
>> maintaining (and looking up!) additional state. There are various
>> ways to improve things -- and not all require a full set of
>> per-IP-2-tuple IP ID counters, for example, you could have hash
>> buckets based on 2 tuples. It's harder to do this in a
>> multiprocessor-scalable way, however, as the uniqueness
>> requirements are global, and the IP ID space is very small -- a
>> more fundamental problem. In general, the world therefore tries
>> quite hard not to fragment, using TCP PMTU and careful MTU
>> selection for UDP (etc). Also, the world has become quite a lot
>> more homogeneous with respect to link-layer MTU over time --
>> e.g., with convergence on Ethernet, although VPNs have made
>> things a bit less fun.
>
> The IP ID field should have been 64-bit, containing a copy of the 16-bit source and destination TCP/UDP ports and a 32-bit sequence number. Now that's not possible, but how about saying that each unique IP can have at maximum 16 different connections passing to another unique IP. And then reduce the sequence number to 8-bits. So:
>
> IP ID = ((src port) & 0xF) | (((dst port) & 0xF) << 4) | ((inp->inp_sequence++) << 8);
>
> Whenever we see TCP PMTU activated we can release some more combinations to a common pool somewhere. Will also work with IP encapsulations, where some bits of the sequence number gets replaced, if the IP ID is encoded the same ...
>
> You might call me a freshman in the IP stack area and I'm very surprised about all the issues I've come across in this area the last couple of months. I start understanding why DragonFly forked and why there is something called infiniband.
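The packing expression quoted above can be written out as a small C function to make the bit layout concrete. This is only a sketch of the arithmetic in the quoted proposal, not FreeBSD code: `pack_ip_id` and its parameters are hypothetical names, and `seq` stands in for the `inp->inp_sequence` counter from the quote, assumed here to be 8 bits wide as the proposal states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of the packing in the quoted proposal (hypothetical helper,
 * not part of the FreeBSD stack):
 *   bits 0-3   low 4 bits of the source port
 *   bits 4-7   low 4 bits of the destination port
 *   bits 8-15  per-connection sequence number (assumed 8 bits wide)
 */
static uint16_t
pack_ip_id(uint16_t src_port, uint16_t dst_port, uint8_t *seq)
{
	uint16_t id;

	assert(seq != NULL);
	id = (uint16_t)((src_port & 0xF) | ((dst_port & 0xF) << 4) |
	    ((uint16_t)*seq << 8));
	(*seq)++;	/* post-increment, as in the quoted expression */
	return (id);
}
```

Under this layout only the low four bits of each port reach the ID, so two connections to the same destination port from source ports 1024 and 1040 produce identical IDs whenever their sequence values match, and each connection's sequence part wraps after 256 packets.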

Before engaging further in this conversation, and trying to modify the behaviour of the TCP/IP stack, you need to educate yourself about the design and history of the protocols involved. Otherwise, you're going to repeatedly suggest ideas that are fundamentally broken, and we're going to waste our time shooting them down when you could just have done a bit of background reading and learned the basics of the protocol design and implementation.

Robert

>
> Robert and Gleb:
>
> > multiprocessor-scalable
>
> Won't r280971 do exactly what you told me was not a good idea and are giving me some criticism for? Namely, result in one IP ID counter per TCP/UDP connection. If you have two applications that each run on their own core. One causes updates to the IP-ID value X times per time unit and the other one Y times per time unit. If "(X - Y)" is odd (50% chance), then at some point the IP-ID *will* resemble exactly to the same value in a predictable fashion, even if the amount of traffic is considered "low". And I think the chance increases with more cores, looking at this from the pure perspective of mathematics.
>
> Why is then r280971 fine, when it is doing the same like D2211, only D2211 does it in a predictable fashion while r280971 is unpredictable. A clear IP ID number sequence on a TCP stream maybe wouldn't even need an explanation. Even a 12-year-old would understand, a-ha, that TCP stream is incrementing that fast and that stream is incrementing that fast, and in the end there is a collision. When a collision happens we will have a retransmit, and then maybe we can then randomize the next IP ID value a bit to avoid repeated collisions.
>
> --HPS
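The wrap-around argument in the quoted text (two counters advancing X and Y times per time unit) can be checked with a toy model. The sketch below is an illustration under simplifying assumptions, not FreeBSD code: `first_collision` is a hypothetical helper, both counters are plain 16-bit values, and they are compared only at whole time units, whereas real traffic does not advance in lockstep.

```c
#include <stdint.h>

/*
 * Hypothetical model, not FreeBSD code: two independent 16-bit IP ID
 * counters start at a and b and advance by x and y per time unit.
 * Returns the first time unit at which both hold the same value, or
 * -1 if they never coincide. The pair of counters repeats after
 * 65536 steps, so searching that far is enough.
 */
static int32_t
first_collision(uint16_t a, uint16_t b, uint16_t x, uint16_t y)
{
	uint32_t t;

	for (t = 0; t < 65536; t++) {
		if (a == b)
			return ((int32_t)t);
		a = (uint16_t)(a + x);	/* wraps modulo 2^16 */
		b = (uint16_t)(b + y);
	}
	return (-1);
}
```

In this model an odd difference between `x` and `y` always yields a match within 65536 time units, because an odd number is invertible modulo 2^16. An even difference may never match: `first_collision(0, 1, 4, 2)` returns -1.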