From: David Chisnall
To: Michael Butler
Cc: Don Lewis, freebsd-current@FreeBSD.org
Subject: Re: firebox build fails post clang-3.4 merge
Date: Fri, 28 Feb 2014 09:09:32 +0000

On 28 Feb 2014, at 01:51, Michael Butler wrote:

> I guess what I'm trying to get at is that I am used to a compiler which
> takes one of two actions, irrespective of the
> complexities of the source language or target architecture ..
>
> 1) the compiler has no definitive translation of "semantic intent"
> because the code is ambiguous - produces an error for the programmer to
> that effect
>
> 2) the compiler has no idea how to translate unambiguous code into
> functional machine code - produces an error for the compiler author(s)
> benefit to expose missing code-generation cases

If you're actually used to compilers like this, then I can only assume
that you normally only compile programs that are a single compilation
unit with no dynamic flow control (e.g. function pointers or
data-dependent conditionals), because that's the only case where it is
possible to implement a compiler that does what you claim you are used
to. You're certainly not used to any C/C++ compiler that's been released
in the last 20 years.

The reason for the 'land mines', as you put it, is that the compiler has
determined that a certain code path ought to be unreachable, either
because of programmer-provided annotations or some knowledge of the
language semantics, but can't statically prove that it really is because
it can't do full symbolic execution of the entire program to prove that
it is in all cases, for all possible inputs. It then has two choices:

- Merrily continue with this assumption, and if it happens to be wrong
  continue executing code with the program in an undefined state.

- Insert something that will cause abnormal program termination,
  allowing it to be debugged and (hopefully) preventing it becoming an
  arbitrary code execution vulnerability.

In the vast majority of cases, sadly, it will actually do the first. It
will try, however, to do the latter if it won't harm performance too
much. You can, alternatively, ask the compiler not to take advantage of
any of this knowledge for optimisation and aggressively tell you if
you're doing anything that might be unsafe.
Compiling with these options gives you code that runs at around 10-50%
of the speed of an optimised compile. Maybe that's what you're used to?

Now, things are looking promising for you. The estimates we did a couple
of years ago showed that (as long as you don't use shared libraries), it
should be feasible (although quite time-consuming) to do the sorts of
analysis that you're used to for moderate-sized codebases once computers
with 1-2TB of RAM become common. At the current rate of development,
that's only a couple more years away. It may still take a few weeks to
compile though...

David
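
For readers who want to see the two choices from the message side by side, here is a minimal C sketch. It is an illustration only: in the message the compiler makes this decision itself, whereas here each choice is written out by hand using the GCC/Clang builtins __builtin_unreachable() and __builtin_trap(). The enum and the function names (shape, sides_assume, sides_trap, check_valid_inputs) are hypothetical and do not come from the thread.

```c
#include <assert.h>

/* Hypothetical enum, used only for this illustration. */
enum shape { SHAPE_CIRCLE, SHAPE_SQUARE };

/*
 * Choice 1: carry on with the assumption.
 * __builtin_unreachable() tells the compiler that control never gets
 * past the switch. If a value outside the enum does arrive, behaviour
 * is undefined and execution may continue in an arbitrary state.
 */
static int sides_assume(enum shape s)
{
    switch (s) {
    case SHAPE_CIRCLE: return 0;
    case SHAPE_SQUARE: return 4;
    }
    __builtin_unreachable();
}

/*
 * Choice 2: terminate abnormally.
 * __builtin_trap() stops the program if the "impossible" path is ever
 * taken, so the fault shows up at the point where the assumption broke.
 */
static int sides_trap(enum shape s)
{
    switch (s) {
    case SHAPE_CIRCLE: return 0;
    case SHAPE_SQUARE: return 4;
    }
    __builtin_trap();
}

/* Both versions agree on every valid input; they differ only when the
 * assumption is violated, which this check deliberately never does. */
static void check_valid_inputs(void)
{
    assert(sides_assume(SHAPE_CIRCLE) == sides_trap(SHAPE_CIRCLE));
    assert(sides_assume(SHAPE_SQUARE) == sides_trap(SHAPE_SQUARE));
}
```

The third mode mentioned in the message, where the compiler reports unsafe operations instead of exploiting them, is not shown. Clang's -fsanitize=undefined is one real option of that kind, though whether it is the one meant in the message is an assumption.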