Date:      Thu, 27 Dec 2012 16:42:48 +1100
From:      "Michael Vale" <masked@internode.on.net>
To:        <freebsd-hackers@freebsd.org>, <freebsd-arch@freebsd.org>, <freebsd-ports@freebsd.org>
Subject:   Cross Compiling of ports Makefiles.
Message-ID:  <E49F944634B24A7387BC937CBE19D2D9@forexamplePC>

Hi, 

As some of you are aware, I’ve been implementing a complete series of cross-compiling functions for the ports Makefiles.

I took a good 3+ week break after my last email, which had a patch to show. Since then I’ve totally rewritten it, starting from scratch and not including any of Ray’s Zrouter code either.

While it’s still a work in progress, I have outlined the entire system. It installs target builds into the same staging directory as a BSD system that is ready to be flashed onto NAND for embedded use, complete with pkg registry and ldconfig; everything has been thought of. The reason I have chosen to install the ports into a tree is so they can be compiled after the kernel/world build and install, and be combined into one firmware image seamlessly. Some ports won’t just be optional applications for future embedded firmware images; they’ll be an integral part of them. The goal here is to be able to build complete firmware images in one fell swoop.

This is perhaps beyond the scope of most of you out there, but I may wish to pick and choose, excluding required parts of the BSD system and replacing them with the busybox port, and replacing libc with Google’s Bionic, uClibc or even musl. This cannot currently be achieved with the likes of tinderbox and poudriere.

It will still be possible to build packages though.

Due to the nature of cross building, I’ll first lay out the options and then tell you which one I am implementing first, as there are reasons for having different build environments/toolchains.

OK, firstly, I was going to give you full detail on every possible cross-compiling scenario as I outlined them, but it’s much of a muchness: there are pros and cons to each and every step. The one I’m about to put to you now is the most feature-complete and the quickest to implement. That doesn’t mean that building without a DESTDIR jail in the future, just using the build system and its tools without a new toolchain, doesn’t make sense (sometimes it does!), nor that I’m not going to do it, nor that I’m not going to do a full ‘Canadian Cross’.

Ultimately, the goal is for the minimal command to invoke cross-compilation to be TARGET(_ARCH)=${ARCH} make.

This could go on for hours, so having just deleted two extra paragraphs, I’m going to summarise.

First we check for Clang (as the cross-compiler), or whether we need to install xdev (a BSD make build of gcc for the target arch).
(OK, so some of this won’t be in Makefile order (it’s upside down and back to front), but I’m just spitting it out as it comes.)
If GNU configure is used, it is usually pretty good at detecting the compiler’s executable path from the TARGET triple alone. For the worst-case scenario, also put the directory containing ${CC} at the beginning of the global env ${PATH} to override any subsequent entries.
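
A minimal sketch of that PATH override, in POSIX sh. The toolchain directory and the target triple below are hypothetical placeholders, not values from the actual Makefiles; --host is the standard GNU configure option naming the system the built programs will run on.

```shell
#!/bin/sh
# Hypothetical defaults: adjust to wherever the cross toolchain really lives.
CROSS_BIN="${CROSS_BIN:-/usr/local/cross/bin}"
TARGET_TRIPLE="${TARGET_TRIPLE:-mips-unknown-freebsd}"

# cross_path TOOLCHAIN_DIR OLD_PATH
# Prints a PATH with the cross toolchain directory in front, so a
# triple-prefixed compiler there is found before any native one.
cross_path() {
    printf '%s:%s\n' "$1" "$2"
}

# configure_args TRIPLE
# Prints the argument that tells GNU configure which system to build for.
configure_args() {
    printf '%s\n' "--host=$1"
}

# Example use inside a port's work directory (left commented out on purpose):
#   PATH=$(cross_path "$CROSS_BIN" "$PATH") \
#       ./configure "$(configure_args "$TARGET_TRIPLE")"
```

With the toolchain directory first in PATH, configure only needs the triple to locate the prefixed tools; an explicit CC is then just a fallback.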

pre-chroot: is mostly used to declare global env variables that keep the build from failing and make sure the install will complete.

do-chroot: here we first have to install any BUILD_DEPENDS. Remember these can be libraries too, and they have to be built with the build machine’s usual tools and installed in their usual place (lucky we are using a chrooted jail here! We could easily make a mess otherwise). Remember also that some depends can be both a BUILD dep AND a RUN dep of the TARGET. That’s okay: they should always be declared correctly, and we should never have to cross-compile a BUILD depend. However, a BUILD depend can be built twice: once for the build system, and again for the TARGET as a RUN depend.
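
The build/run split can be sketched as follows, in POSIX sh. The function name and the port names in the usage line are illustrative only, not part of the ports framework.

```shell
#!/bin/sh
# plan_deps BUILD_LIST RUN_LIST
# Each argument is a space-separated list of port names.
# Prints "name host" for every build dependency (built natively for the
# build machine) and "name target" for every run dependency (cross-compiled).
# A port that appears in both lists is printed twice, once per role.
plan_deps() {
    for dep in $1; do
        printf '%s host\n' "$dep"
    done
    for dep in $2; do
        printf '%s target\n' "$dep"
    done
}

# Example (hypothetical port names):
#   plan_deps "gmake zlib" "zlib busybox"
```

Here zlib would be built twice: once natively so the build tools can link against it, and once for the target as a run depend.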

The beauty of doing this work is that we can now treat the lib and run depends more suitably. During this process we can strip the libs, exclude the headers and change the directory structure. First, this saves on inodes. Second, pkg register, libtool and ld require the files to be installed into the root tree correctly in order to build valid databases and register them.

Now, the BUILD/HOST system has already had its tail cut off by DESTDIR. There are plenty of ways we can install everything into a valid sub-directory while DESTDIR is still considered ROOT, and PREFIX or LOCALDIR doesn’t have some obscure prepended directory that doesn’t exist in the CROSS_STAGING_ROOT. One way is to add a variable in bsd.lib.mk, and in every single one of make’s install targets, between ${DESTDIR} and ${LOCALBASE} or ${PREFIX}. We could also include if statements for cross builds; that would leave it at that, and we could go ahead and simply install into a sub-directory before pkg, ldconfig and firmware image packing occur. But I’d rather keep all cross-building in bsd.cross.mk and include it from bsd.port.mk. Instead, within the DESTDIR do-chroot:, re-define ${DESTDIR} as ${_bldroot}${DESTDIR} and have all TARGET_LIBS, RUN_DEPENDS and the TARGET install in a CHROOTED=no chroot.
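
A rough sketch of two of those steps, in POSIX sh: composing the re-defined DESTDIR, and dropping headers from the staged tree. The function names are hypothetical, and treating every directory named "include" as headers is a simplifying assumption. Stripping the libs is not shown, since it needs the target toolchain's own strip.

```shell
#!/bin/sh
# staged_destdir BLDROOT DESTDIR
# Mirrors re-defining ${DESTDIR} as ${_bldroot}${DESTDIR}.
staged_destdir() {
    printf '%s%s\n' "$1" "$2"
}

# prune_headers STAGING_ROOT
# Removes every directory named "include" under the staging root, which
# saves space and inodes on the target image. Refuses to run on "/" or on
# a path that is not a directory.
prune_headers() {
    [ -n "$1" ] && [ "$1" != "/" ] && [ -d "$1" ] || return 1
    # -prune stops find from descending into a directory it will delete.
    find "$1" -type d -name include -prune -exec rm -rf {} +
}
```

Pruning after the install, rather than patching each port's install target, keeps the cross-specific logic in one place.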

Doing the same thing could also remove the need for a DESTDIR jail install at all, using the real build machine’s build env rather than a jail. Regardless, we still have to install these targets, and their DESTDIR is skewed. There are a few options.

One is to have a MAKEOBJDIRPREFIX-like option, and redefine every target’s DESTDIR as ${makeobjDESTDIR} before running do-install. I’ve yet to complete this stage, but I believe this is the way to do it.

There are other options but they aren’t as elegant/will make baby jesus cry.

Now, the install of these targets won’t require a chroot. A chroot could be done, and that would be okay for one port. But if there is already a cross-compiled system in there ready for flashing to disk, there’s no way to chroot without temporarily moving files out of the existing target system and copying or building programs like /bin/sh that will execute on the build machine and allow chroot to run.



We could patch/sed the PLIST files for pkg register to work, patch/sed/edit ldconfig’s db, and take some other steps. But I don’t like that idea.


That’s why I’m opting for the other option, which is to create some INSTALL_DEPENDS, or CROSS_COMPILING_CHROOT_INSTALL_DEPENDS if you will: just /bin/sh and another few
build TARGET port in jailed DESTDIR/CHROOTED=yes.

This is achieved by installing all build dependencies first...


Sorry, I’m too tired to continue on any further!

I wanted to wait until the initial plan works, shoot an email off then get into the good stuff.  But it’s taking me longer than I thought even just to describe all the processes.

I didn’t want to submit half-baked Makefiles that don’t work, but I can only write about half of one anyway haha!

Anyway, I’m going to spend some time working on them in the next few days, so please expect an update.

Date:      Wed, 26 Dec 2012 21:47:07 -0800
From:      Alfred Perlstein <alfred@ixsystems.com>
To:        Peter Wemm <peter@wemm.org>
Cc:        "arch@freebsd.org" <arch@freebsd.org>, Adrian Chadd <adrian@freebsd.org>, Alfred Perlstein <bright@mu.org>, Rui Paulo <rpaulo@freebsd.org>
Subject:   Re: UPDATE Re: making use of userland dtrace on FreeBSD
Message-ID:  <50DBE0DB.6090804@ixsystems.com>
In-Reply-To: <CAGE5yCrnoNhOh3VaYU3bO6BwA=bpxD5QzkZvD+HaUwvXNQ+Ufw@mail.gmail.com>

On 12/26/12 9:32 PM, Peter Wemm wrote:
> On Wed, Dec 26, 2012 at 8:41 PM, Alfred Perlstein <bright@mu.org> wrote:
>> On 12/26/12 8:21 PM, Peter Wemm wrote:
>>> On Wed, Dec 26, 2012 at 8:00 PM, Alfred Perlstein <bright@mu.org> wrote:
>>>
>>>> What would be the drawbacks?  I don't want to hurt freebsd for heavy
>>>> performance, but I think this functionality should work out of the box
>>>> for most people.
>>> The drawbacks are mostly performance related.  It defeats certain
>>> hardware optimizations for call/return on leaf functions.  It'll
>>> mostly affect things like math, crypto, compression and multimedia
>>> libraries (that's ffmpeg, bzip2/gzip/libarchive, openssl, etc) but, we
>>> generally don't seem to care about that sort of performance anyway, so
>>> what's one more loss?
>>
>> Can you clarify some?  If it was somewhat easy to re-add
>> -fomit-frame-pointer to critical libraries like this, then that would be OK?
> No, you can't add MD flags like this.  The way to do it is see things
> like PIC, WARNS, etc where you can do overrides of defaults on a
> directory basis, and respect the system-wide user overrides.
>
> Remember, -fno-omit-frame-pointer is the default on i386 (except at
> high -O levels with gcc, I don't know where clang, the default
> compiler, draws the line).  Other platforms don't even have frame
> pointers.  You can't just scatter that switch around the place.

Agreed!  It seems that the -fno-omit-frame-pointer documentation is a bit
strange. The manual page indicates:
>            -O also turns on -fomit-frame-pointer on machines where doing so
>            does not interfere with debugging.
It then goes on to specify, under the actual option, that it's turned on
under -O, -O2, -O3, etc.


>
>> To be honest, I'm not sure if you're serious about "generally don't seem to
>> care" or just feel defeated on the issue and we should care.
> We took quite a performance beating because of not using the
> tuned-by-perl assembler code in openssl on amd64, for example.  This
> flows through to benchmarks on things like apache throughput with
> mod_ssl.  Or throughput on stunnel(1).
I don't recall if I was involved in that discussion, but that is troubling.

>
> My drive-by comment about not seeming to care any more is that people
> (except for Bruce) generally don't actually measure the performance
> impact of their changes any more.  The last time this was widespread
> was when Kris Kennaway used to be constantly abusing machines and
> reporting the effects as measured by ministat(1).
>
> If somebody were to say "this change makes world take 15% longer to
> compile but has no meaningful effect on things like bzip2, openssl
> throughput etc" and posted the actual ministat output to back it up
> then there wouldn't even be a question on performance at all.  It'd
> only be "is 15% more build time worth ubiquitous dtrace?"  And that's a
> far easier thing to answer.
>
> A hand-wave leads to bikesheds.  Actual numbers are bikeshed repellant.
>
> I myself have killed patches that turned out to be premature
> optimizations because it actually didn't make any difference.  For
> example, I never committed the lazy tlb shootdown to AMD64 because it
> made things slower on the hardware of the day - opteron silicon had
> *hardware* address space tags on their TLB and the lazy shootdown code
> just added more synchronization work that just added overhead..  eg:
> buildworld was around 2% slower with the patches.
>
> Another example was the mtxpool code that caused cache line thrashing.
> If we cared about performance that would never have gone in. Sure, it
> compiled and worked, but the costs weren't quantified till much later
> and we realized how much trouble they were beyond a certain usage
> level.
>
> What's 2%?  It multiplies out.. 2% here, 1% there.. 3% over there,
> 0.5% somewhere else.. before you know it, there's a pretty big overall
> hit.
I see, well I will run some numbers and report back.
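
To put a rough number on "it multiplies out", here is a small POSIX sh sketch. It assumes each slowdown applies to the whole runtime, so they compound multiplicatively; the function name is made up for illustration.

```shell
#!/bin/sh
# compound_slowdown PCT...
# Prints the combined slowdown in percent, to two decimal places, assuming
# each listed percentage applies to the whole runtime and they multiply.
compound_slowdown() {
    printf '%s\n' "$@" | awk '
        BEGIN { total = 1 }
        { total *= 1 + $1 / 100 }
        END { printf "%.2f\n", (total - 1) * 100 }'
}

# The figures quoted above:
#   compound_slowdown 2 1 3 0.5    # prints 6.64
```

So the four small hits quoted above come to roughly 6.6% overall, slightly more than their plain sum of 6.5%.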

>
>>> Of course it wouldn't be required with dwarf unwinding awareness, but
>>> we don't have that.
>>>
>>> We have -fno-omit-frame-pointer on the amd64 kernel whenever debugging
>>> is compiled in because there's no unwinder for doing stack traces.  We
>>> need a dwarf2+ unwinder and somebody to instrument the call frame
>>> state through the remaining assembler code.
>>>
>> How much work is that exactly?  I've only been a gdb user, not a hacker.
> gdb has a stack unwinder.  kdb/ddb/stack(9) do not.  There's well
> established GPL code to do it, as well as libunwind and variants.
> Basically what this code has to do is run the dwarf2+ state machine to
> find all the call/return frames instead of assuming the compiler did
> it.  Heck, even glibc has a dwarf2 unwinder built into it as part of
> their exception processing system.
>
> I'm not entirely sure what more work src/lib/libelf and
> src/lib/libdwarf need.  It looks like it's got just enough implemented
> to support ctfconvert etc and doesn't have an unwinder in it.
>
This really seems beyond my skill level / time allotment.  Let's see
where the numbers put us in terms of system performance and then we can
make a call on it.

I'd rather take a few % of perf for the power of dtrace, but not if that
% is double digits.

-Alfred


