Date:      Fri, 28 Feb 2014 09:09:32 +0000
From:      David Chisnall <theraven@FreeBSD.org>
To:        Michael Butler <imb@protected-networks.net>
Cc:        Don Lewis <truckman@FreeBSD.org>, freebsd-current@FreeBSD.org
Subject:   Re: firebox build fails post clang-3.4 merge
Message-ID:  <191C11E5-DE27-4C8B-AC20-50259447254F@FreeBSD.org>
In-Reply-To: <530FEBB5.70502@protected-networks.net>
References:  <201402270057.s1R0vkjH084327@gw.catspoiler.org> <530EA5CD.2070508@protected-networks.net> <A540F9BE-19E5-4ACF-B4A1-106B894F89FA@FreeBSD.org> <530FEBB5.70502@protected-networks.net>

On 28 Feb 2014, at 01:51, Michael Butler <imb@protected-networks.net> wrote:

> I guess what I'm trying to get at is that I am used to a compiler which
> takes one of two actions, irrespective of the complexities of the source
> language or target architecture ..
>
> 1) the compiler has no definitive translation of "semantic intent"
> because the code is ambiguous - produces an error for the programmer to
> that effect
>
> 2) the compiler has no idea how to translate unambiguous code into
> functional machine code - produces an error for the compiler author(s)
> benefit to expose missing code-generation cases

If you're actually used to compilers like this, then I can only assume
that you normally only compile programs that are a single compilation
unit with no dynamic flow control (e.g. function pointers or
data-dependent conditionals), because that's the only case where it is
possible to implement a compiler that does what you claim you are used
to.  You're certainly not used to any C/C++ compiler that's been
released in the last 20 years.
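
As a minimal sketch of why dynamic flow control defeats that kind of
compile-time verdict, consider the fragment below.  The names scale,
call_through and check_known_good are made up for illustration and do
not come from the thread.  The code is unambiguous and trivially
translatable, so neither of the two errors described above applies, yet
whether it misbehaves depends entirely on values and function pointers
supplied at run time, possibly from another compilation unit.

```c
#include <assert.h>

/* Well defined for most inputs; undefined if parts == 0
 * (or if total == INT_MIN and parts == -1). */
int scale(int total, int parts)
{
    return total / parts;
}

/* The compiler building this unit cannot know which function a
 * caller elsewhere will pass in, or with which arguments. */
int call_through(int (*fn)(int, int), int a, int b)
{
    return fn(a, b);
}

/* A call with inputs that are known to be fine. */
void check_known_good(void)
{
    assert(call_through(scale, 10, 2) == 5);
}
```

The same source text with a second argument of 0 would be undefined
behaviour; nothing in the text of call_through distinguishes the two
cases.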

The reason for the 'land mines', as you put it, is that the compiler has
determined that a certain code path ought to be unreachable, either
because of programmer-provided annotations or some knowledge of the
language semantics, but can't statically prove that it really is,
because it can't do full symbolic execution of the entire program for
all possible inputs.  It then has two choices:

- Merrily continue with this assumption, and if it happens to be wrong
  continue executing code with the program in an undefined state.

- Insert something that will cause abnormal program termination,
  allowing it to be debugged and (hopefully) preventing it becoming an
  arbitrary code execution vulnerability.
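
The two choices above can be imitated by hand with the GCC/Clang
builtins __builtin_unreachable() and __builtin_trap().  This is only a
programmer-written analogue, not a description of what clang emits
internally; the enum and function names are hypothetical.

```c
#include <assert.h>

enum op { OP_ADD, OP_SUB };   /* hypothetical opcode set */

/* Choice 1: promise that no other value can arrive.  Reaching
 * __builtin_unreachable() is undefined behaviour, so the optimiser
 * is free to assume it never happens. */
int apply_assume(enum op o, int a, int b)
{
    switch (o) {
    case OP_ADD: return a + b;
    case OP_SUB: return a - b;
    }
    __builtin_unreachable();
}

/* Choice 2: hold the same belief, but terminate abnormally if it
 * ever turns out to be wrong. */
int apply_trap(enum op o, int a, int b)
{
    switch (o) {
    case OP_ADD: return a + b;
    case OP_SUB: return a - b;
    }
    __builtin_trap();
}

/* For every input the programmer intended, the variants agree. */
void check_valid_inputs(void)
{
    assert(apply_assume(OP_ADD, 2, 3) == apply_trap(OP_ADD, 2, 3));
    assert(apply_assume(OP_SUB, 5, 3) == apply_trap(OP_SUB, 5, 3));
}
```

The variants differ only when an out-of-range value such as (enum op)7
is passed: apply_trap stops the program at that point, while
apply_assume has no defined result at all.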

In the vast majority of cases, sadly, it will actually do the first.  It
will try, however, to do the latter if it won't harm performance too
much.  You can, alternatively, ask the compiler not to take advantage of
any of this knowledge for optimisation and aggressively tell you if
you're doing anything that might be unsafe.  Compiling with these
options gives you code that runs at around 10-50% of the speed of an
optimised compile.  Maybe that's what you're used to?
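
The message does not name the options it has in mind.  As one assumed
example, both Clang and GCC accept -fsanitize=undefined, which adds
run-time checks for many kinds of undefined behaviour, and -O0, which
disables optimisation.  The sketch below (function names are
hypothetical) shows the sort of code where the difference is visible:
signed overflow is undefined in C, so an optimiser may assume that
x + 1 > x always holds.

```c
#include <assert.h>
#include <limits.h>

/* Undefined for x == INT_MAX.  An optimising compiler is allowed to
 * treat the comparison as always true; a build with
 * -fsanitize=undefined reports the overflow at run time instead. */
int next_is_larger(int x)
{
    return x + 1 > x;
}

/* Equivalent question asked without any undefined behaviour. */
int next_is_larger_checked(int x)
{
    return x != INT_MAX;
}

/* Only inputs with defined behaviour are exercised here. */
void check_defined_inputs(void)
{
    assert(next_is_larger(41) == 1);
    assert(next_is_larger_checked(41) == 1);
    assert(next_is_larger_checked(INT_MAX) == 0);
}
```

Something like "cc -O0 -fsanitize=undefined file.c" would be one way to
build it with the checks enabled; the exact slowdown will depend on the
program and on which checks are turned on.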

Now, things are looking promising for you.  The estimates we did a
couple of years ago showed that (as long as you don't use shared
libraries) it should be feasible (although quite time-consuming) to do
the sorts of analysis that you're used to for moderate-sized codebases
once computers with 1-2TB of RAM become common.  At the current rate of
development, that's only a couple more years away.  It may still take a
few weeks to compile though...

David