From owner-freebsd-arm@freebsd.org Thu Jan 26 18:31:35 2017
Subject: Re: qemu-arm-static appears to have problems with signal delivery during (at least) poudrirer-devel based cross builds of some ports with ALLOW_MAKE_JOBS=yes
From: Mark Millard
In-Reply-To: <049fd4e6-209b-4385-48ed-f3413ab27e52@gmail.com>
Date: Thu, 26 Jan 2017 10:04:52 -0800
Cc: Sean Bruno, freebsd-arm@freebsd.org
Message-Id: <5AB92372-6862-4F60-84B2-9B3E7B7FF3C9@dsl-only.net>
References: <7AF92A3C-3563-4B2E-B14A-D6BAF30A16A2@dsl-only.net>
 <9d7129d7-da2d-18e9-38ae-06f3483450f7@freebsd.org> <4399212D-B4DD-460F-AD1B-9250FB412B38@dsl-only.net> <049fd4e6-209b-4385-48ed-f3413ab27e52@gmail.com>
To: meloun.michal@gmail.com
List-Id: "Porting FreeBSD to ARM processors."

On 2017-Jan-26, at 5:54 AM, Michal Meloun wrote:

> On 26.01.2017 5:26, Mark Millard wrote:
>> On 2017-Jan-25, at 12:27 PM, Sean Bruno wrote:
>>
>>> Mark:
>>>
>>> There was a recent update this week that was submitted and accepted to
>>> qemu-user-static.
>>>
>>> Want to give it a spin again and see if you are able to make progress?
>>>
>>> sean "top poster for maximum effect" bruno
>>
>> I updated my /usr/ports to -r432460 (from today) and rebuilt.
>> I then tried doing some poudriere -x -a arm.armv6 port builds
>> again, with ALLOW_MAKE_JOBS=yes and -J 1 in use.
>>
>> Unfortunately the qemu-user-static update did not fix the
>> problem I've been seeing.
>>
>> An example extracted from a print/texinfo log still shows
>> "TCG temporary leak before 00021826":
>
> I just rebuilt print/texinfo without a single problem.
> Well, with slightly different CFLAGS
> CFLAGS+= -O2 -munaligned-access -mcpu=cortex-a15 -fno-builtin-sin
> -fno-builtin-cos
>
> Michal

I had already reported that on retries either the failure point in
the overall sequence for the port changes or the build completes
for whatever I was trying to build that initially failed. (I did
not repeat that in the new report.) When I retried, print/texinfo
built okay.

I've never gotten anything large like lang/gcc6 with a full
bootstrap to complete with ALLOW_MAKE_JOBS=yes and -J 1 in use,
nowhere near doing so. (In my context ALLOW_MAKE_JOBS means what
portmaster would do for -j 4.
Poudriere seems to give no control of this [-J is a different
issue].)

I've also not been able to use gdb on the .core produced: a
qemu_gmake.core file extracted from the compressed tar archive
of the failed work directory. Running file on it reports:

# file /root/poudriere_failure/work/.build/qemu_gmake.core
/root/poudriere_failure/work/.build/qemu_gmake.core: ELF 32-bit LSB core file ARM, version 1 (FreeBSD), FreeBSD-style, from 'ke'

(I suspect that "version 1 (FreeBSD)" is not really intended to
be supported as it stands.)

I submitted bugzilla 216132 as a segmentation fault report against
devel/gdb, but the patch that was tried just allowed gdb to get
farther, show other problems, and still fail overall on handling
qemu_gmake.core. See 216132.

===
Mark Millard
markmi at dsl-only.net

> ....
> mv warn-on-use.h-t warn-on-use.h
> /bin/mkdir -p sys
> rm -f sys/types.h-t sys/types.h && \
> { echo '/* DO NOT EDIT! GENERATED AUTOMATICALLY! */'; \
> sed -e 's|@''GUARD_PREFIX''@|GL|g' \
> -e 's|@''INCLUDE_NEXT''@|include_next|g' \
> -e 's|@''PRAGMA_SYSTEM_HEADER''@|#pragma GCC system_header|g' \
> -e 's|@''PRAGMA_COLUMNS''@||g' \
> -e 's|@''NEXT_SYS_TYPES_H''@||g' \
> -e 's|@''WINDOWS_64_BIT_OFF_T''@|0|g' \
> < ./sys_types.in.h; \
> } > sys/types.h-t && \
> mv sys/types.h-t sys/types.h
> rm -f unistd.h-t unistd.h && \
> ..
>
>
>>
>> . . .
>> rm -f sys/types.h-t sys/types.h && \
>> { echo '/* DO NOT EDIT! GENERATED AUTOMATICALLY! */'; \
>> sed -e 's|@''GUARD_PREFIX''@|GL|g' \
>> -e 's|@''INCLUDE_NEXT''@|include_next|g' \
>> -e 's|@''PRAGMA_SYSTEM_HEADER''@|#pragma GCC system_header|g' \
>> -e 's|@''PRAGMA_COLUMNS''@||g' \
>> -e 's|@''NEXT_SYS_TYPES_H''@||g' \
>> -e 's|@''WINDOWS_64_BIT_OFF_T''@|0|g' \
>> < ./sys_types.in.h; \
>> } > sys/types.h-t && \
>> mv sys/types.h-t sys/types.h
>> TCG temporary leak before 00021826
>> qemu: uncaught target signal 4 (Illegal instruction) - core dumped
>> Illegal instruction
>> gmake[2]: *** [Makefile:1174: all-recursive] Error 1
>> gmake[2]: Leaving directory '/wrkdirs/usr/ports/print/texinfo/work/texinfo-6.1'
>> gmake[1]: *** [Makefile:1113: all] Error 2
>> gmake[1]: Leaving directory '/wrkdirs/usr/ports/print/texinfo/work/texinfo-6.1'
>> ===> Compilation failed unexpectedly.
>> Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
>> the maintainer.
>> *** Error code 1
>>
>> Stop.
>> make: stopped in /usr/ports/print/texinfo
>> ====>> Cleaning up wrkdir
>> ===> Cleaning for texinfo-6.1.20160425,1
>> build of print/texinfo ended at Wed Jan 25 20:08:32 PST 2017
>> build time: 00:06:57
>> !!! build failure encountered !!!
>>
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
>> On 01/15/17 07:09, Mark Millard wrote:
>>> On 2017-Jan-14, at 10:53 PM, Mark Millard wrote:
>>>
>>>> [Context: head (12) -r312009 and ports head -r431413.]
>>>>
>>>> I've been experimenting on amd64 with poudriere-devel with -x
>>>> for -a arm.armv6 and I ran into:
>>>>
>>>>> TCG temporary leak before 00021826
>>>>> qemu: uncaught target signal 4 (Illegal instruction) - core dumped
>>>>
>>>> in 3 of the 31 ports for the build, but 4 skipped so 3 of 27
>>>> attempted. The 00021826 is the same number in all the examples
>>>> so far (whatever its base).
>>>>
>>>> These seem to be the only TCG messages and each failure starts with
>>>> one and then reports the qemu message. (Also true for the below.)
>>>> As far as I can tell the TCG notice is the report of an internal
>>>> qemu problem that is then translated into an Illegal instruction.
>>>>
>>>> This was with ALLOW_MAKE_JOBS=yes but -J 1 for poudriere.
>>>>
>>>> For 2 of the problem ports retries worked, still using
>>>> ALLOW_MAKE_JOBS=yes and -J 1 .
>>>>
>>>> But the 3rd port failed each time tried with ALLOW_MAKE_JOBS=yes
>>>> --but in a different step each time.
>>>>
>>>> In all failure cases it was gmake that got the "illegal instruction".
>>>>
>>>> But disabling ALLOW_MAKE_JOBS=yes appears (so far) to avoid the
>>>> issue. For example, that 3rd failing port built fine. (I've
>>>> been doing more ports since, with ALLOW_MAKE_JOBS=yes repeatedly
>>>> failing and lack of it working.)
>>>>
>>>> My guess is SIGCHLD delivery sometimes touches something (or a timing)
>>>> that is not handled well in qemu-arm-static. I've had no problems
>>>> on an rpi2 or bpim3 in the past.
>>>>
>>>> (I have seen some analogous "sometimes" issues on powerpc under
>>>> and version of lang that mishandled the stack part of the ABI
>>>> FreeBSD uses, SIGCHLD sometimes getting on the stack at a bad-time
>>>> for the messed up code generation, leading to stack corruption. Code
>>>> not getting signals had no problems.)
>>>>
>>>> Note: The amd64 context is FreeBSD under VirtualBox under macOS
>>>> and it has had no problem for native builds of world, kernel,
>>>> or ports.
>>>
>>> Avoiding ALLOW_MAKE_JOBS=yes is not sufficient to guarantee builds
>>> will work. Here is one that got near the end before failing the
>>> same way:
>>>
>>> . . .
>>> install -m 0644 /wrkdirs/usr/ports/devel/arm-none-eabi-gcc/work/gcc-6.3.0/gcc/cp/type-utils.h /wrkdirs/usr/ports/devel/arm-none-eabi-gcc/work/stage/usr/local/lib/gcc/arm-none-eabi/6.3.0/plugin/include/cp/type-utils.h
>>> install: DONTSTRIP set - will not strip installed binaries
>>> TCG temporary leak before 00021826
>>> qemu: uncaught target signal 4 (Illegal instruction) - core dumped
>>> gmake[1]: *** [Makefile:4176: install-gcc] Illegal instruction
>>> gmake[1]: Leaving directory '/wrkdirs/usr/ports/devel/arm-none-eabi-gcc/work/.build'
>>> *** Error code 2
>>>
>>> Stop.
>>> make: stopped in /usr/ports/devel/arm-none-eabi-gcc
>>> ====>> Cleaning up wrkdir
>>> ===> Cleaning for arm-none-eabi-gcc-6.3.0
>>> build of devel/arm-none-eabi-gcc ended at Sun Jan 15 00:04:02 PST 2017
>>> build time: 02:52:28
>>> !!! build failure encountered !!!
>>>
>>>
>>> Going back to the earlier initial problem (that I happen to have the
>>> material for handy): expanding the .tbz of the failed build and finding
>>> the core showed:
>>>
>>> # find . -name "*.core" -exec file {} \;
>>> ./work/binutils-2.27/ld/qemu_gmake.core: ELF 32-bit LSB core file ARM, version 1 (FreeBSD), FreeBSD-style, from 'ke'
>>>
>>> [I've not figured out what I can do with that --or how.]
>>>
>>>
>>> One thing unusual on my part is that I use -mcpu=cortex-a7 . That
>>> matches how I historically buildworld buildkernel for installation
>>> on the rpi2 and bpim3. I've never had problems like this with
>>> builds on the rpi2 or the bpim3 (buildworld, buildkernel, port
>>> builds). It might be that qemu-arm-static has a problem with
>>> -mcpu=cortex-a7 code that is generated --but not always.
>>>
>>> Using the make.conf as an example:
>>>
>>> # more /usr/local/etc/poudriere.d/head-cortex-a7-make.conf
>>> WANT_QT_VERBOSE_CONFIGURE=1
>>> #
>>> DEFAULT_VERSIONS+=perl5=5.24
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> MALLOC_PRODUCTION=
>>> #
>>> #system clang 3.8+ (gcc6 rejects -march=armv7a):
>>> #CFLAGS+= -march=armv7-a -mcpu=cortex-a7
>>> #CXXFLAGS+= -march=armv7-a -mcpu=cortex-a7
>>> #CPPFLAGS+= -march=armv7-a -mcpu=cortex-a7
>>> #
>>> #lang/gcc6's xgcc stage considers the above conflicting so use just:
>>> CFLAGS+= -mcpu=cortex-a7
>>> CXXFLAGS+= -mcpu=cortex-a7
>>> CPPFLAGS+= -mcpu=cortex-a7
>>>
>>>
>>> For my context poudriere with -x for -a arm.armv6 and the use of
>>> qemu-arm-static does not look reliable enough to depend on. It is
>>> not obvious that the -x use contributes to the problem: it may well
>>> not.
>>>
>>> ===
>>> Mark Millard
>>> markmi at dsl-only.net
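
Every failure quoted in this thread shows the same two-line signature:
a "TCG temporary leak before ..." notice followed immediately by
"qemu: uncaught target signal 4 (Illegal instruction)". A minimal
sketch of tallying that signature across saved build logs follows,
in Python. The function name find_qemu_failures and both regular
expressions are illustrative assumptions built from the log excerpts
above. They are not part of poudriere or qemu, and other qemu
versions may word these messages differently.

```python
import re

# Patterns modeled on the log excerpts quoted in this thread (assumed wording).
TCG_RE = re.compile(r"TCG temporary leak before ([0-9a-fA-F]+)")
SIGNAL_RE = re.compile(r"qemu: uncaught target signal (\d+) \((.+?)\)")


def find_qemu_failures(log_text):
    """Return (leak_address, signal_number, signal_name) for each TCG
    notice that is immediately followed by a qemu signal report."""
    # Drop mail quote markers such as ">>> " so quoted logs also match.
    lines = [re.sub(r"^(?:>\s?)+", "", ln) for ln in log_text.splitlines()]
    failures = []
    # Walk adjacent line pairs: (line i, line i+1).
    for current, following in zip(lines, lines[1:]):
        tcg = TCG_RE.search(current)
        sig = SIGNAL_RE.search(following)
        if tcg and sig:
            failures.append((tcg.group(1), int(sig.group(1)), sig.group(2)))
    return failures


sample = """\
>> mv sys/types.h-t sys/types.h
>> TCG temporary leak before 00021826
>> qemu: uncaught target signal 4 (Illegal instruction) - core dumped
>> Illegal instruction
"""
print(find_qemu_failures(sample))
```

Running this over each log from a bulk build would show whether the
00021826 value really is constant across ports and whether signal 4
is the only signal reported.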