From: "Robert N. M. Watson"
Date: Sat, 4 Apr 2015 18:11:55 +0100
Subject: Re: Patch to reduce use of global IP ID value(s) to avoid leaking information
To: Hans Petter Selasky
Cc: "emeric.poupon@stormshield.eu >> Emeric POUPON", "freebsd-net@freebsd.org"
In-Reply-To: <55200A51.3090008@selasky.org>
List-Id: Networking and TCP/IP with FreeBSD

On 4 Apr 2015, at 16:59, Hans Petter Selasky wrote:

> I think you confuse what I'm trying to explain to you from the responses I get. I'm not talking about putting data into the IP ID field or other IP fields, which then someone at the receiving end picks out and stores. This is well known.
>
> I'm talking about sampling the IP ID value you get in return from a PING response. A firewall typically has multiple ports. If pinging the gateway from any of these ports cause an increment of a shared IP ID value, then anyone that can ping the common firewall will see the IP ID updates the other parties are doing. This can be used for RX and TX communication. Can you confirm that you understand? I have a feeling we are talking about completely different ways of side channel communication.

I entirely understood, and have entirely understood throughout the thread, what you were describing, but you appear to be confused about the terminology. Normally, 'covert channel' and 'side channel' are used in the following senses:

A 'covert channel' is a mechanism by which two collaborating (and, arguably, colluding) parties are able to communicate data despite it not being an intentional communications channel. It is widely recognised that covert channels are effectively impossible to eliminate when a resource is shared by two potentially colluding parties -- e.g., a common processor, cache, bus, etc. A global IP ID counter is a potential covert channel because two parties can modulate a signal, with a variable amount of noise, but it's far from the only one -- signals can be modulated onto a variety of protocol IDs (e.g., TCP and UDP port-number allocation) but also timing information (e.g., ICMP and TCP rate limiters that also exist in the stack and can also be used for signalling). One particularly neat covert channel, published by Steven Murdoch, is using CPU load to trigger clock drift, on which a signal can be modulated quite effectively -- albeit slowly -- which will be visible via TCP timestamping.

A 'side channel' is a mechanism by which an attacking party (the 'spy') can extract information about an (unwilling) target process without their collaboration or collusion, despite an assumed isolation policy.
It is widely recognised that side channels are difficult to eliminate while maintaining efficient sharing of an underlying resource -- e.g., a common processor, cache, bus, etc. -- and almost impossible if the shared medium isn't fully understood. The IP ID counter can also be a side channel as it can reveal information about the global IP transmission rate of a host, for example. Another recent example is that of the cache side-channel attack, in which the effectiveness of process isolation is greatly limited by hyperthreaded processors.

During the 1980s and 1990s, covert channels were studied heavily in the security literature, especially in the context of secure operating systems. It was common to try to measure the effective bandwidth of covert channels (e.g., how fast a signal could be modulated onto CPU utilisation between 'high' and 'low' processes in an MLS system). By the early 1990s, it had been demonstrated that covert channels were almost impossible to control in practice for general-purpose systems, even using very restrictive kernel designs and real-time scheduling policies -- today, some systems do effectively accomplish this (e.g., MILS systems) but only through very restrictive policies and simplistic designs. By the late 1990s, the topic of covert-channel analysis had become extremely niche due to broad recognition that trying to solve the problem was effectively pointless. The topic has seen some recent resurgence in large part because covert channels can be used to attack anonymity systems such as Tor -- e.g., causing hidden services reached via Tor's overlay network to reveal themselves via the Internet.

Where we can limit or close covert channels efficiently, reliably, and without disrupting underlying functionality, it doesn't hurt to do so.
However, limiting covert channels as a primary design concern -- and in particular, suggesting that we can do so in a strong way -- is simply setting ourselves up for failure, as shown by a long research literature. For the IP ID field, a far more pressing problem is to ensure maximum robustness for high packet rates. The technique you've proposed -- simple cryptographic randomness of a global field -- can substantially damage robustness by reducing the reuse period for IP IDs, increasing the chances of data corruption when, for example, using UDP frames larger than the MTU at a significant rate. The good news is that addressing that problem also reduces the degree to which covert channels can be utilised, since maintaining the IP ID at greater 2-tuple granularity reduces the degree to which shared channels exist at all. More mature mechanisms can help lengthen the reuse period for pseudo-random sequences as well (e.g., dedicating a few MSB or LSB to ensuring a minimum reuse period).

As Gleb and I have both pointed out, there are extensive covert channels in the design of the TCP/IP protocol suite and its practical implementations, and given the 'weakest link' nature of covert channel defence (any channel is sufficient), it is wise to be extremely sceptical of claims that these can be resolved in a way that provides any benefit at all to our users. A more reasonable conclusion is that firewall consumers should not make incorrect assumptions about defence against colluding parties, as there are countless means of signalling information across a shared resource such as a firewall. This is especially true as firewalls are frequently configured to allow intentional communication, and almost any type of communication taking place at higher layers in the stack will allow signalling at vast bandwidths -- e.g., via DNS query content and rates, VPN packet rates, etc.

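To put rough numbers on the IP ID reuse-period concern raised above, here is a minimal Python sketch. It is not FreeBSD code, and the names `reuse_probability`, `HybridIpId` and `min_reuse_distance` are invented for illustration. It assumes the 16-bit IPv4 Identification field, and it reads "dedicating a few MSB" as placing a small sequential counter in the top bits with random bits below it, which is only one possible interpretation of that remark.

```python
import random

ID_SPACE = 1 << 16  # the IPv4 Identification field is 16 bits wide


def reuse_probability(window: int) -> float:
    """Chance that a given ID is drawn again within the next `window`
    packets when every ID is picked uniformly at random. A plain
    sequential counter gives 0 for any window shorter than ID_SPACE."""
    return 1.0 - (1.0 - 1.0 / ID_SPACE) ** window


class HybridIpId:
    """Hypothetical generator: the top `counter_bits` bits count up by one
    per packet, the remaining low bits are random. Two packets can share
    an ID only if their counter values match, so at least
    2**counter_bits packets separate any reuse."""

    def __init__(self, counter_bits: int, rng=None):
        if not 1 <= counter_bits <= 15:
            raise ValueError("counter_bits must be between 1 and 15")
        self.counter_bits = counter_bits
        self.random_bits = 16 - counter_bits
        self.counter = 0
        # Assumption: SystemRandom stands in for a kernel entropy source.
        self.rng = rng if rng is not None else random.SystemRandom()

    def next(self) -> int:
        high = self.counter
        self.counter = (self.counter + 1) % (1 << self.counter_bits)
        low = self.rng.getrandbits(self.random_bits)
        return (high << self.random_bits) | low


def min_reuse_distance(ids):
    """Smallest number of packets between two uses of the same ID,
    or None if no ID repeats in the sequence."""
    last_seen = {}
    best = None
    for index, value in enumerate(ids):
        if value in last_seen:
            gap = index - last_seen[value]
            if best is None or gap < best:
                best = gap
        last_seen[value] = index
    return best
```

With purely random IDs, `reuse_probability(1000)` comes out at roughly 1.5%, so a reuse within a thousand packets is far from rare at high rates, whereas `HybridIpId(8)` cannot repeat an ID in fewer than 256 packets. The cost is that the counter bits are predictable again, which is the robustness-versus-leakage trade-off discussed above. A per-2-tuple variant would keep one such generator per (source, destination) pair rather than a single global one.
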
If you want to make covert-channel prevention a primary design goal of the FreeBSD stack, a vast amount of work will be required -- you've spotted just one instance of many possible covert channels -- and it's not clear it will offer practical benefit nor allow the implementation to be at all efficient -- which is far more important to most FreeBSD users. Being alarmist about just one of many covert channels has little actual benefit, and is leading you to propose design solutions that may substantially damage real network-stack functionality. If you want to pursue a goal of covert-channel reduction, you need to be far more systematic in analysing the potential channels, rather than just hacking on the one you happen to have observed. If you aren't systematic, there's actually no benefit to changing the system to address just one channel! But, more generally, trying to limit covert channels is something you need to seek a broad architectural consensus on from the network-stack developer community, and make a primary review goal for all existing, and all future, code. This will mean new coding standards, testing tools, etc. The reason I say this is that prior work in this area -- in particular relating to trusted-system design as captured in first the Orange Book and later Common Criteria Protection Profiles -- has provided clear evidence that this is a vastly difficult undertaking, which has been effectively abandoned for all but the highest-security of system designs.

Robert