Date:      Fri, 25 Nov 2005 00:02:20 +0100
From:      Jens Schweikhardt <schweikh@schweikhardt.net>
To:        tridge@samba.org
Cc:        "Andrew P." <infofarmer@gmail.com>, freebsd-ports@freebsd.org, sobomax@portaone.com
Subject:   Re: FreeBSD ccache port is wonderfiul!
Message-ID:  <20051124230220.GB1923@schweikhardt.net>
In-Reply-To: <17286.14954.340289.972852@samba.org>
References:  <200511210625.16973.ringworm01@gmail.com> <cb5206420511210825v7b4dc852jf3f29f325d8ed7fd@mail.gmail.com> <20051124182645.GA1923@schweikhardt.net> <17286.14954.340289.972852@samba.org>

Tridge et al,

On Fri, Nov 25, 2005 at 09:10:50AM +1100, tridge@samba.org wrote:
# Jens,
# 
#  > Note to Tridge: buildworld compiles a cc, installs a complete
#  > temporary build environment and uses that environment's compiler
#  > and headers from that point on.  The interesting problem is how to
#  > make ccache find the same cc's.  There are ways, but they're not
#  > entirely obvious.
# 
# interesting - that certainly makes things more complex, though ccache
# is meant to cope with those sorts of games. 
# 
#  > Please see the thread "Using ccache for build{world, kernel}" on the
#  > current@ mailing list (only two weeks ago).
# 
# ahh, i can see a potential explanation. Looking at this:
# 
#   http://lists.freebsd.org/pipermail/freebsd-current/2005-November/058052.html
# 
# The change of compiler during the build process would normally
# invalidate the cache as the mtime/size of the compiler is used as part
# of the cache hash index. What you are trying to do is find ways to
# defeat that mechanism, as you want to be able to use cached object
# files from an older compiler with a newer one.

Exactly.
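
To illustrate the invalidation described in the quoted paragraph, here is a
minimal sketch. It is not ccache's actual code: the function names, the
choice of SHA-256, and the layout of the key are assumptions made purely for
illustration. It only shows why a compiler identified by size and mtime
produces a different cache key once it has been rebuilt, even if nothing else
about the compile changes.

```python
import hashlib
import os

def compiler_stamp(path):
    """Identify a compiler by file size and modification time only."""
    st = os.stat(path)
    return f"{st.st_size}:{st.st_mtime_ns}"

def cache_key(compiler_path, preprocessed_source, args):
    """Hypothetical cache key: compiler stamp + arguments + preprocessed source."""
    h = hashlib.sha256()
    h.update(compiler_stamp(compiler_path).encode())
    for arg in args:
        # NUL separators keep adjacent arguments from running together.
        h.update(b"\0" + arg.encode())
    h.update(b"\0" + preprocessed_source)
    return h.hexdigest()
```

With a key built this way, a freshly installed cc with a new mtime misses the
cache for every file, which matches the behaviour discussed in this thread.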

# I see that people have previously used CCACHE_PATH to do this which
# overrides the 'find the compiler' code in ccache, and makes ccache use
# the same compiler for both stages of the build, which of course is
# incorrect, but fast :-)

Yes, it's the reason why some people incorrectly claim ccache would not
work well for FreeBSD's "make buildworld". They run into the "old compiler
does not find new headers" problem.

# Your CCACHE_NOHASH_SIZE_MTIME patch takes a different approach, and
# just disables the 'detect the compiler has changed' code, allowing the
# new compiler to be used, but still using the cached object files from
# the old compiler. Strictly speaking that is also unsafe, but will at
# least use the right compiler for uncached compiles, which is an
# improvement over the CCACHE_PATH method.

Yes, strictly speaking it's unsafe--unless you can guarantee that code
generation is not affected. This is a case of "Every art has rules that
you are not allowed to break as a grasshopper. Once you're enlightened,
you no longer need to abide by the rules" :-) That's why the man page
patch says "...only if you know what you are doing". In this particular
case, deleting the cache after cc updates is certainly a sensible
approach.

# So if I understand the problem correctly, ccache would build world in
# freebsd fine if none of the overrides were used (CCACHE_PATH or
# CCACHE_NOHASH_SIZE_MTIME), but it wouldn't gain nearly as much in
# build speed as the 2nd stage build with the freshly installed compiler
# would be uncached. Is that right? (I'm just making sure I haven't
# missed a real ccache bug somewhere).

Yes, that's right. Without the overrides, ccache is effective only while
building the temporary environment, which is only a fraction of the work
of a complete buildworld. People wonder why they only get a few hundred
cache hits instead of thousands.

# So, assuming this isn't a real ccache bug, then really what you are
# doing is finding different ways to defeat the compiler bootstrapping
# process in build world. Why not have an option in the build process to
# not use compiler bootstrapping? If lots of people are defeating it
# using ccache tricks anyway, then perhaps you need an option that
# explicitly disables it? That might make the source of the problem a
# little clearer when people get build failures.

I'm not sure I understand what you mean by "not use compiler
bootstrapping". It's a non-optional part of the FreeBSD build and I like
to start from scratch. A new world *must* build, install, and use a
temporary environment.

# I also noticed this patch from Maxim in the above thread:
# 
#   http://www.portaone.com/~sobomax/ccache.buildworld
# 
# that one changes ccache to use a hash of the compiler binary instead
# of the size/mtime. It also adds an extra caching layer, which caches
# the compiler's hash indexed by the hash of its size and mtime. I think
# that extra caching layer is not really needed, and just adds a lot of
# complexity. Hashing the cc binary should be fast enough that it won't
# be noticed in the build (especially as it will be in memory), and some
# quick tests here seem to confirm that it's lost in the noise, at least
# with gcc. Unless cc on freebsd is particularly large (gcc on my system
# is just 90k) I don't think it's a win.

Here cc is about 165k:
schweikh@hal9000:~ $ ll /usr/local/libexec/ccache/bin
total 486
-r-xr-xr-x  1 root  wheel   77296 Nov  9 22:35 c++
-r-xr-xr-x  1 root  wheel    4080 Nov  9 22:35 c89
-r-xr-xr-x  1 root  wheel    4380 Nov  9 22:35 c99
-r-xr-xr-x  1 root  wheel  164956 Nov  9 22:35 cc
-r-xr-xr-x  1 root  wheel   77296 Nov 13 16:27 g++
-r-xr-xr-x  1 root  wheel  164956 Nov 13 16:27 gcc

For some reason, cc and gcc (normally hard linked) are statically
linked, while c++/g++ are dynamically linked.
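
The idea of hashing the compiler binary instead of its size/mtime can be
sketched as follows. This is an illustration only, not the code from either
patch mentioned above; the function name, the chunk size, and the use of
SHA-256 are assumptions. The point is that two binaries with identical
contents yield the same identifier regardless of when they were installed.

```python
import hashlib

def hash_compiler_binary(path, chunk_size=65536):
    """Return a hex digest of the file contents, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

For binaries in the 100k to 200k range, like those listed above, this is a
single small sequential read per invocation.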

# I've committed a simpler patch:
# 
#   http://build.samba.org/?function=diff;tree=ccache;date=1132869249;author=tridge
# 
# which should achieve the same thing. Maybe you could try it on the
# freebsd build with CCACHE_HASH_COMPILER set and see if it helps?

Will do, but I'm moving and it may take until January for a definitive
go/no go.

# Of course, hashing the compiler binary isn't a perfect way of
# detecting that a compiler hasn't changed, as compilers often use
# multiple stages with separate binaries and the 2nd stage might have
# changed with no change in the main cc binary. That's why ccache
# normally uses the size/mtime, as that tends to change when a new
# compiler is installed, so it's more conservative than hashing the
# binary. Ideally ccache would have some way of asking the compiler for
# some sort of ID that guarantees it's the same, but that would be pretty
# tricky to add to compilers I think.

Yes, I always wondered about that, too. cc1 is the actual binary
responsible for code generation, and maybe even the assembler must be
considered. Looking only at the compiler driver may give false positives
(cache hits that should have been misses) if only cc1 or as changes
(unlikely, but maybe not just of academic interest).
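
One way to picture a more complete check is to fold several toolchain
components into a single identifier. The sketch below is hypothetical and is
not a ccache feature discussed in this thread: the function names are
invented, and which files to include (driver, cc1, as) and where they live
would have to be decided by the caller.

```python
import hashlib

def file_digest(path, chunk_size=65536):
    """Return the raw SHA-256 digest of one file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()

def toolchain_id(component_paths):
    """Combine the digests of several binaries, in the order given.

    Each digest has a fixed length, so concatenating them is unambiguous.
    """
    h = hashlib.sha256()
    for path in component_paths:
        h.update(file_digest(path))
    return h.hexdigest()
```

With an identifier like this, a change to cc1 or as alters the result even
when the driver binary is byte-for-byte unchanged.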

# Cheers, Tridge
# 
# PS: I've committed your CCACHE_NOHASH_SIZE_MTIME patch to cvs. See 
# http://build.samba.org/?function=diff;tree=ccache;date=1132866608;author=tridge

That's great news, thanks! Any plans for a 2.5 release?

Regards,

	Jens
-- 
Jens Schweikhardt http://www.schweikhardt.net/
SIGSIG -- signature too long (core dumped)


