From: Mark Millard
Date: Sun, 28 Oct 2018 06:58:13 -0700
Subject: Re: head -r339076 amd64 -> armv7 port cross build attempt with native tools involved: hangs between a cc (wait) and its child ld (uwait)
To: Mikaël Urankar, Sean Bruno
Cc: FreeBSD Toolchain, freeBSD, FreeBSD Ports ML
List-Id: Technical Discussions relating to FreeBSD
Message-Id: <324BD0F0-4017-4395-9B59-B7A8558EA6FD@yahoo.com>
References: <33C58480-1E76-4748-83B4-CB39FAD8584A@yahoo.com> <220332B7-0B5E-4378-AD48-FDFB8F135A50@yahoo.com>

[I have a work around for the specific activity to avoid the hang.]

On 2018-Oct-27, at 6:00 PM, Mark Millard wrote:

> [The bigger test still hung up.]
>
> On 2018-Oct-27, at 5:30 PM, Mark Millard wrote:
>
>> [Just the __packed removal patch was sufficient to no longer
>> have the hang problem that I originally reported for the
>> print/texinfo build in poudriere.]
>>
>> On 2018-Oct-27, at 4:33 PM, Mark Millard wrote:
>>
>>> [Some of this discussion occurred off list. The point here
>>> is not specific to the hang that I originally reported.]
>>>
>>> On 2018-Oct-27, at 3:03 PM, Mark Millard wrote:
>>>>
>>
>> Mikaël Urankar is being quoted below:
>>
>>>>> . . .
>>>>>
>>>>>> There are bugs in qemu that can cause such deadlock, you can try these
>>>>>> 2 patches:
>>>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/9424a5ffde4de2768ab6baa45fdbe0dbb56a7371
>>>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/d6f65a7f07d280b6906d499d8e465d4d2026c52b
>>
>> Back to me:
>>
>>>>> I'll try those later. Thanks. (I need to get back to sleep.)
>>>>>
>>>>> It was interesting that attach/detach to the ld process
>>>>> caused it to progress. The rest of the build completed
>>>>> just fine. But that one spot consistently hung up before
>>>>> trying gdb to look at the back trace.
>>>>>
>>>>
>>>> Looking at the qemu code related to the 2nd patch: the
>>>> structure of the field copies (via __get_user) seems
>>>> very sensitive to the ABI rules for the target and
>>>> how things align and such, given that the structure
>>>> description and code are host code. __packed vs. not
>>>> is possibly not sufficient control to always make things
>>>> match right across all the potential combinations of
>>>> host and target from what I can see.
>>>>
>>>> Lack of __packed may prove sufficient for my specific
>>>> context (amd64 host and armv7 target) but it seems
>>>> non-obvious what to do in general.
>>>>
>>>> There would also seem to be big endian vs. little endian
>>>> issues on the individual __get_user styles of copies
>>>> when the host and target do not match for a multi-byte
>>>> numeric encoding.
>>>
>>> Well, I get the following for:
>>>
>>> #include "/usr/include/sys/event.h" // kevent
>>> #include <stddef.h> // offsetof
>>> #include <stdio.h> // printf
>>>
>>> int
>>> main()
>>> {
>>>     printf("%lu\n", (unsigned long) sizeof(struct kevent));
>>>     printf("ident %lu\n", (unsigned long) offsetof(struct kevent, ident));
>>>     printf("filter %lu\n", (unsigned long) offsetof(struct kevent, filter));
>>>     printf("flags %lu\n", (unsigned long) offsetof(struct kevent, flags));
>>>     printf("fflags %lu\n", (unsigned long) offsetof(struct kevent, fflags));
>>>     printf("data %lu\n", (unsigned long) offsetof(struct kevent, data));
>>>     printf("udata %lu\n", (unsigned long) offsetof(struct kevent, udata));
>>>     printf("ext %lu\n", (unsigned long) offsetof(struct kevent, ext));
>>>     return 0;
>>> }
>>>
>>> (This code avoided warnings for type mismatches with the
>>> printf strings and such.)
>>>
>>> amd64 native [host of qemu use] (comments hand added):
>>>
>>> # ./a.out
>>> 64
>>> ident 0
>>> filter 8 // NOTE!
>>> flags 10 // NOTE!
>>> fflags 12 // NOTE!
>>> data 16
>>> udata 24
>>> ext 32
>>>
>>> (The above is not particularly important but I
>>> include it for completeness.)
>>>
>>> armv7 native [target in qemu use] (comments hand added):
>>>
>>> # ./a.out
>>> 64 // NOTE vs. below!
>>> ident 0
>>> filter 4 // NOTE vs. above!
>>> flags 6 // NOTE vs. above!
>>> fflags 8 // NOTE vs. above!
>>> data 16 // NOTE vs. below!
>>> udata 24 // NOTE vs. below!
>>> ext 32 // NOTE vs. below!
>>>
>>> /usr/include/sys/event.h lacks __packed in both cases.
>>>
>>> With __packed in qemu-arm-static's source code
>>> for target_freebsd_kevent I confirm that via
>>> gdb for the qemu-arm-static:
>>>
>>> p/d sizeof(struct target_freebsd_kevent)
>>> p/d &((struct target_freebsd_kevent *)0)->ident
>>> p/d &((struct target_freebsd_kevent *)0)->filter
>>> p/d &((struct target_freebsd_kevent *)0)->flags
>>> p/d &((struct target_freebsd_kevent *)0)->fflags
>>> p/d &((struct target_freebsd_kevent *)0)->data
>>> p/d &((struct target_freebsd_kevent *)0)->udata
>>> p/d &((struct target_freebsd_kevent *)0)->ext
>>>
>>> reports as the 2nd patch's problem-report
>>> material reports (56,0,4,6,8,12,20,24): not
>>> even the right size.
>>>
>>> I also confirm that removing __packed in qemu's
>>> code and rebuilding and then checking with gdb
>>> reported a match to the above armv7 native report
>>> (64,0,4,6,8,16,24,32).
>>>
>>> I have not verified __packed used vs. not for any
>>> other combination of host and target platforms.
>>
>> Removing the 2 examples of __packed, including the
>> 1 for target_freebsd_kevent, as in Mikaël Urankar's
>> 2nd listed patch, was sufficient to avoid the hang
>> that I originally reported. (Technically FreeBSD 11
>> is not involved and so one of the __packed removals
>> is not relevant to my example.)
>>
>> I have not applied Mikaël Urankar's first listed
>> patch at all. It did not prove necessary for my
>> context.
>>
>> Again: the only tested context is amd64 -> armv7
>> (host -> target) under a head -r339076 based
>> build. (So still 12.)
>>
>> I'm doing a larger amd64 -> armv7 rebuild (around
>> 210 ports overall) that originally included the
>> problematical hang and a full-bootstrap build
>> of lang/gcc8 (so extensive emulation use after
>> the clang-based stages). Prior to the patch,
>> all smaller attempts also hung at the same
>> place for print/texinfo.
>>
>> But I'll only report if this larger test has
>> a problem.
>
>
> The bigger test still hung up in the same old place.
> A gdb attach/detach sequence against the qemu-arm-static
> for the ld again let it continue from there.
>
> Drat. But good to know.

Having lld use -Wl,--no-threads avoids the problem.

Without the option, lld for N "cpus" creates N or so
extra worker threads (besides the thread for main) plus
one more that does something different. Having only
the thread for main (and possibly one more) avoids the
hangups. In my context, N==28 (Hyper-V) or N==32
(native FreeBSD boot) was in use.

Also: The hangups when there were around N+2 threads
total only happened when lld was executed as emulated
code instead of as host-native code. Some autoconfig
activity does not use ${CC} or the like and so some
lld use ends up emulated even when most of the
clang/llvm activity in the poudriere bulk run is
host-native.

Side note:

The ports infrastructure does not have LINKER_TYPE
in use like buildworld buildkernel does, so I did
not use

LDFLAGS.lld+=-Wl,--no-threads

like I do for buildworld buildkernel . For now I'm using

LDFLAGS.clang+=-Wl,--no-threads

with

LDFLAGS+=${LDFLAGS.${CHOSEN_COMPILER_TYPE}}

in order to select the option when lld is more likely
to be in use. I also avoid the LDFLAGS.clang assignment
for powerpc* families, because lld is not used in that
context (so far).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)