From owner-freebsd-arch@FreeBSD.ORG Thu Dec 27 05:43:02 2012
From: "Michael Vale"
Subject: Cross Compiling of ports Makefiles.
Date: Thu, 27 Dec 2012 16:42:48 +1100
List-Id: Discussion related to FreeBSD architecture

Hi,

For those of you who are aware, I’ve been implementing a complete series of cross-compiling functions for the ports Makefiles.

I had a good 3+ week break since my last email with a patch to show, and I’ve totally rewritten it and have started from scratch, not including any of Ray’s Zrouter code either.

While it’s still a work in progress, I have outlined the entire system to produce target installs into the same staging directory as a BSD system ready to be flashed onto NAND for embedded use, complete with pkg registry and ldconfig; everything has been thought of.
- The reason I have chosen this method for the ports to be installed into a tree is so they can be compiled after build/install kernel/world and be combined into one firmware image seamlessly. Some ports won’t just be optional applications for future embedded firmware images; they’ll be an integral part of them. The goal here is to be able to build complete firmware images in one fell swoop. This is perhaps beyond the scope of most of you out there, but I may wish to pick and choose, excluding required parts of the BSD system and replacing them with the busybox port, and replacing libc with Google’s Bionic, uClibc or even musl. This cannot currently be achieved with the likes of tinderbox and poudriere.

It will still be possible to build packages though.

Due to the nature of cross building, first I’ll lay out the options and then tell you which one I am implementing first, as there are reasons for having different build environments/toolchains.

Ok, firstly I was going to give you all the detail of all possible cross-compiling scenarios as I outline them, but I’ll have you know it’s much of a muchness. There are pros and cons to each and every different step; the one I’m about to put to you now is the most feature complete and quickest to implement. That doesn’t mean that building without a DESTDIR JAIL in the future, and just using the build system and its tools without a new toolchain, doesn’t make sense (sometimes it does!), or that I’m not going to do it, or that I’m not going to do a full "Canadian Cross".

Ultimately, as a goal, the minimal command to invoke cross compilation is TARGET(_ARCH)=${ARCH} make.

This could go on for hours, so after just deleting two extra paragraphs, I’m going to summarise.

First we check for CLANG (as the x-compiler) or whether we need to install xdev (bsd make of gcc compiled for target arch).
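
A minimal sketch of that first check, written in Python purely for illustration (the real logic would live in the ports Makefiles). The function name, the "needs-xdev" marker and the example triple are hypothetical; the only assumption about clang is that it accepts a `-target` triple.

```python
import shutil


def pick_cross_compiler(target_triple, which=shutil.which):
    """Decide which cross compiler to use for target_triple.

    Hypothetical helper: prefer clang, since a single clang binary can
    be pointed at another architecture with -target. If clang is not
    on PATH, report that an xdev-style cross gcc must be built first.
    """
    clang = which("clang")
    if clang is not None:
        # argv prefix to use as the cross compiler
        return ("clang", [clang, "-target", target_triple])
    # No clang available: the caller has to install xdev for this target.
    return ("needs-xdev", [])


if __name__ == "__main__":
    kind, argv = pick_cross_compiler("mips-unknown-freebsd")
    print(kind, " ".join(argv))
```

The `which` parameter is only there so the lookup can be replaced when experimenting on a machine without clang installed.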
(Ok, so some of this won’t be in Makefile order (upside down and back to front), but I’m just spitting it out as it comes.)

If GNU configure is used, it is usually pretty good at detecting the compiler’s executable path from the TARGET triple alone. For the worst case scenario, also set ${CC}’s path at the beginning of the global env ${PATH} to override any subsequent entries.

pre-chroot: is mostly used to declare global env variables to keep the build from failing and to make sure the install will complete.

do-chroot: here we have to firstly install any BUILD_DEPENDS. Remember these can be libraries too, and they have to be built with the build machine’s usual stuff and installed in their usual place (lucky we are using a CHROOTED JAIL here! we could easily make a mess otherwise), remembering that sometimes a depend can be both a BUILD dep AND a RUN dep of the TARGET. That’s okay; they should always be declared correctly, and we never have to cross-compile a BUILD depend. However, a BUILD depend can be built twice: once for the build system, and again for the TARGET as a RUN depend.

The beauty of doing this work is we can now treat the lib and run depends more suitably. During this process we can strip the libs, exclude the headers and change the directory structure: one, to save on inodes, and second, because pkg register, libtool and ld require that the files are installed into the root tree correctly in order for them to build valid databases and register them. Now, the BUILD/HOST system has already had its tail cut off by DESTDIR. There are plenty of ways we can install everything into a valid sub-directory and have DESTDIR still considered ROOT, so that PREFIX or LOCALDIR doesn’t have some obscure prepended directory that doesn’t exist in the CROSS_STAGING_ROOT.
Some ways include adding a variable in bsd.lib.mk and in every single one of make’s install targets between ${DESTDIR} and ${LOCALBASE} or ${PREFIX}. We could also include if statements for cross; this would leave it at that and we could go ahead and simply install into a sub-directory before pkg, ldconfig and firmware image packing occur. But I’d rather keep all cross-building in bsd.cross.mk, include it in bsd.port.mk, and instead, within the DESTDIR do-chroot:, re-define ${DESTDIR} as ${_bldroot}${DESTDIR} and have all TARGET_LIBS, RUN_DEPENDS and the TARGET install in a CHROOTED=no chroot.

Doing the same thing could also remove the need for a DESTDIR JAIL install at all and just use the real build machine’s build env rather than a jail. Regardless, we still have to install these targets and their DESTDIR is skewed. There are a few options.

One is to have a MAKEOBJDIRPREFIX-like option, and redefine every target’s DESTDIR as ${makeobjDESTDIR} before running do-install. Now, I’ve yet to complete this stage, but I believe this is the way to do it.

There are other options but they aren’t as elegant/will make baby jesus cry.

Now the install of these targets won’t require a chroot. A chroot could be done, and that would be okay for one port. But if there is already a cross-compiled system in there ready for flashing to disk, there’s no way to chroot without temporarily moving files from the existing target system and copying or building programs like /bin/sh that will execute on the build machine and allow chroot to run.

We can patch/sed PLIST files for pkg register to work, patch/sed/edit ldconfig’s db, and some other steps. But I don’t like that idea.

That’s why I’m opting for the other option, and that is to create some INSTALL_DEPENDS or CROSS_COMPILING_CHROOT_INSTALL_DEPENDS, if you will,
just /bin/sh and a few others.

Build the TARGET port in a jailed DESTDIR/CHROOTED=yes.

This is achieved by installing all build dependencies first...

Sorry, I’m too tired to continue any further!

I wanted to wait until the initial plan works, shoot an email off, then get into the good stuff. But it’s taking me longer than I thought even just to describe all the processes.

I didn’t want to submit half-baked Makefiles that don’t work, but I can only write about half of one anyway haha!

Anyway, I’m going to spend some time working on them in the next few days, so please expect an update.

From owner-freebsd-arch@FreeBSD.ORG Thu Dec 27 05:47:09 2012
Message-ID: <50DBE0DB.6090804@ixsystems.com>
Date: Wed, 26 Dec 2012 21:47:07 -0800
From: Alfred Perlstein
To: Peter Wemm
Subject: Re: UPDATE Re: making use of userland dtrace on FreeBSD
References: <50D49DFF.3060803@ixsystems.com> <50DBC7E2.1070505@mu.org> <50DBD193.7080505@mu.org>
Cc: "arch@freebsd.org", Adrian Chadd, Alfred Perlstein, Rui Paulo

On 12/26/12 9:32 PM, Peter Wemm wrote:
> On Wed, Dec 26, 2012 at 8:41 PM, Alfred Perlstein wrote:
>> On 12/26/12 8:21 PM, Peter Wemm wrote:
>>> On Wed, Dec 26, 2012 at 8:00 PM, Alfred Perlstein wrote:
>>>
>>>> What would be the drawbacks?  I don't want to hurt freebsd for heavy
>>>> performance, but I think this functionality should work out of the box
>>>> for most people.
>>> The drawbacks are mostly performance related.  It defeats certain
>>> hardware optimizations for call/return on leaf functions.  It'll
>>> mostly affect things like math, crypto, compression and multimedia
>>> libraries (that's ffmpeg, bzip2/gzip/libarchive, openssl, etc) but, we
>>> generally don't seem to care about that sort of performance anyway, so
>>> what's one more loss?
>>
>> Can you clarify some?  If it was somewhat easy to re-add
>> -fomit-frame-pointer to critical libraries like this, then that would be OK?
> No, you can't add MD flags like this.  The way to do it is see things
> like PIC, WARNS, etc where you can do overrides of defaults on a
> directory basis, and respect the system-wide user overrides.
>
> Remember, -fno-omit-frame-pointer is the default on i386 (except at
> high -O levels with gcc, I don't know where clang, the default
> compiler, draws the line).  Other platforms don't even have frame
> pointers.  You can't just scatter that switch around the place.

Agreed!
It seems that the -fno-omit-frame-pointer documentation is a bit strange; the manual page indicates:

> -O also turns on -fomit-frame-pointer on machines where doing so
> does not interfere with debugging.

Then it goes on to specify, under the actual option, that it's turned on under -O, -O2, -O3, etc.

>
>> To be honest, I'm not sure if you're serious about "generally don't seem to
>> care" or just feel defeated on the issue and we should care.
> We took quite a performance beating because of not using the
> tuned-by-perl assembler code in openssl on amd64, for example.  This
> flows through to benchmarks on things like apache throughput with
> mod_ssl.  Or throughput on stunnel(1).

I don't recall if I was involved in that discussion, but that is troubling.

>
> My drive-by comment about not seeming to care any more is that people
> (except for Bruce) generally don't actually measure the performance
> impact of their changes any more.  The last time this was widespread
> was when Kris Kennaway used to be constantly abusing machines and
> reporting the effects as measured by ministat(1).
>
> If somebody were to say "this change makes world take 15% longer to
> compile but has no meaningful effect on things like bzip2, openssl
> throughput etc" and posted the actual ministat output to back it up
> then there wouldn't even be a question on performance at all.  It'd
> only be "is 15% more build time worth ubiquitous dtrace?"  And that's a
> far easier thing to answer.
>
> A hand-wave leads to bikesheds.  Actual numbers are bikeshed repellent.
>
> I myself have killed patches that turned out to be premature
> optimizations because it actually didn't make any difference.  For
> example, I never committed the lazy tlb shootdown to AMD64 because it
> made things slower on the hardware of the day - opteron silicon had
> *hardware* address space tags on their TLB and the lazy shootdown code
> just added more synchronization work that just added overhead..
> eg: buildworld was around 2% slower with the patches.
>
> Another example was the mtxpool code that caused cache line thrashing.
> If we cared about performance that would never have gone in.  Sure, it
> compiled and worked, but the costs weren't quantified till much later
> and we realized how much trouble they were beyond a certain usage
> level.
>
> What's 2%?  It multiplies out..  2% here, 1% there.. 3% over there,
> 0.5% somewhere else.. before you know it, there's a pretty big overall
> hit.

I see, well I will run some numbers and report back.

>
>>> Of course it wouldn't be required with dwarf unwinding awareness, but
>>> we don't have that.
>>>
>>> We have -fno-omit-frame-pointer on the amd64 kernel whenever debugging
>>> is compiled in because there's no unwinder for doing stack traces.  We
>>> need a dwarf2+ unwinder and somebody to instrument the call frame
>>> state through the remaining assembler code.
>>>
>> How much work is that exactly?  I've only been a gdb user, not a hacker.
> gdb has a stack unwinder. kdb/ddb/stack(9) do not.  There's well
> established GPL code to do it, as well as libunwind and variants.
> Basically what this code has to do is run the dwarf2+ state machine to
> find all the call/return frames instead of assuming the compiler did
> it.  Heck, even glibc has a dwarf2 unwinder built into it as part of
> their exception processing system.
>
> I'm not entirely sure what more work src/lib/libelf and
> src/lib/libdwarf need.  It looks like it's got just enough implemented
> to support the ctfconvert etc and doesn't have an unwinder in it.
>

This really seems beyond my skill level / time allotment.  Let's see where the numbers put us in terms of system performance and then we can make a call on it.  I'd rather take a hit of a few % in perf for the power of dtrace, but not if that % is double digits.

-Alfred
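
To put a number on the "2% here, 1% there" remark quoted above, here is a small worked example in Python. It assumes the individual slowdowns are independent and compound multiplicatively, which is an illustrative assumption and not something measured in this thread; the function name is made up for the example.

```python
import math


def compound_slowdown(percent_hits):
    """Combine per-change slowdowns (given in percent) multiplicatively.

    Assumption for illustration only: each hit scales the total run
    time independently of the others.
    """
    factor = math.prod(1 + p / 100 for p in percent_hits)
    return (factor - 1) * 100


# The four figures from the quoted message: 2%, 1%, 3% and 0.5%.
hits = [2, 1, 3, 0.5]
total = compound_slowdown(hits)
print(f"{total:.2f}% overall")  # prints "6.64% overall"
```

Under that assumption the four small hits add up to roughly 6.6% overall, slightly more than the 6.5% a plain sum would give.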