Date:      Sat, 27 Oct 2018 18:00:34 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Mikaël Urankar <mikael.urankar@gmail.com>, Sean Bruno <sbruno@freebsd.org>
Cc:        FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, freeBSD <freebsd-hackers@freebsd.org>, FreeBSD Ports ML <freebsd-ports@freebsd.org>
Subject:   Re: head -r339076 amd64 -> armv7 port cross build attempt with native tools involved: hangs between a cc (wait) and its child ld (uwait)
Message-ID:  <D333D3B5-C7B3-4A48-92E2-673C0FFAA96F@yahoo.com>
In-Reply-To: <220332B7-0B5E-4378-AD48-FDFB8F135A50@yahoo.com>
References:  <33C58480-1E76-4748-83B4-CB39FAD8584A@yahoo.com> <CAJwjRmS0u6ONZTOX%2B-aFuOjm2FFDR-vkSO8h4j47d5OODPsDjA@mail.gmail.com> <D3CCBEF4-BCEF-4D6F-A503-AAE512D3D875@yahoo.com> <CBB0AC55-9EFE-4B58-8139-CE7CC265BF21@yahoo.com> <E0E27A7F-D4F5-450B-B6FE-03664E48D3BB@yahoo.com> <220332B7-0B5E-4378-AD48-FDFB8F135A50@yahoo.com>

[The bigger test still hung up.]

On 2018-Oct-27, at 5:30 PM, Mark Millard <marklmi at yahoo.com> wrote:

> [Just the __packed removal patch was sufficient to no longer
> have the hang problem that I originally reported for the
> print/texinfo build in poudriere.]
>
> On 2018-Oct-27, at 4:33 PM, Mark Millard <marklmi at yahoo.com> wrote:
>
>> [Some of this discussion occurred off list. The point here
>> is not specific to the hang that I originally reported.]
>>
>> On 2018-Oct-27, at 3:03 PM, Mark Millard <marklmi at yahoo.com> wrote:
>>>
>
> Mikaël Urankar is being quoted below:
>
>>>> . . .
>>>>
>>>>> There are bugs in qemu that can cause such deadlock, you can try these
>>>>> 2 patches:
>>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/9424a5ffde4de2768ab6baa45fdbe0dbb56a7371
>>>>> https://github.com/MikaelUrankar/qemu-bsd-user/commit/d6f65a7f07d280b6906d499d8e465d4d2026c52b
>
> Back to me:
>
>>>> I'll try those later. Thanks. (I need to get back to sleep.)
>>>>
>>>> It was interesting that a gdb attach/detach to the ld process
>>>> caused it to progress. The rest of the build completed
>>>> just fine. But that one spot consistently hung up before
>>>> I tried gdb to look at the backtrace.
>>>>
>>>
>>> Looking at the qemu code related to the 2nd patch: the
>>> structure of the field copies (via __get_user) seems
>>> very sensitive to the target's ABI rules for alignment
>>> and padding, given that the structure description and
>>> the copying code are compiled as host code. __packed vs.
>>> not is possibly not sufficient control to make the layouts
>>> match across all the potential combinations of host and
>>> target, from what I can see.
>>>
>>> Lack of __packed may prove sufficient for my specific
>>> context (amd64 host and armv7 target) but it seems
>>> non-obvious what to do in general.
>>>
>>> There would also seem to be big endian vs. little endian
>>> issues for the individual __get_user style copies
>>> when the host and target byte orders do not match for a
>>> multi-byte numeric encoding.
>>
>> Well, I get the following for:
>>
>> #include "/usr/include/sys/event.h" // kevent
>> #include <stddef.h> // offsetof
>> #include <stdio.h>  // printf
>>
>> int
>> main()
>> {
>>       printf("%lu\n", (unsigned long) sizeof(struct kevent));
>>       printf("ident %lu\n", (unsigned long) offsetof(struct kevent, ident));
>>       printf("filter %lu\n", (unsigned long) offsetof(struct kevent, filter));
>>       printf("flags %lu\n", (unsigned long) offsetof(struct kevent, flags));
>>       printf("fflags %lu\n", (unsigned long) offsetof(struct kevent, fflags));
>>       printf("data %lu\n", (unsigned long) offsetof(struct kevent, data));
>>       printf("udata %lu\n", (unsigned long) offsetof(struct kevent, udata));
>>       printf("ext %lu\n", (unsigned long) offsetof(struct kevent, ext));
>>       return 0;
>> }
>>
>> (This code avoided warnings for type mismatches with the
>> printf strings and such.)
>>
>> amd64 native [host of qemu use] (comments hand added):
>>
>> # ./a.out
>> 64
>> ident 0
>> filter 8  // NOTE!
>> flags 10  // NOTE!
>> fflags 12 // NOTE!
>> data 16
>> udata 24
>> ext 32
>>
>> (The above is not particularly important but I
>> include it for completeness.)
>>
>> armv7 native [target in qemu use] (comments hand added):
>>=20
>> # ./a.out
>> 64       // NOTE vs. below!
>> ident 0
>> filter 4 // NOTE vs. above!
>> flags 6  // NOTE vs. above!
>> fflags 8 // NOTE vs. above!
>> data 16  // NOTE vs. below!
>> udata 24 // NOTE vs. below!
>> ext 32   // NOTE vs. below!
>>
>> /usr/include/sys/event.h lacks __packed in both cases.
>>
>> With __packed present in qemu-arm-static's source code
>> for target_freebsd_kevent, I confirmed the layout via
>> gdb against qemu-arm-static:
>>
>> p/d sizeof(struct target_freebsd_kevent)
>> p/d &((struct target_freebsd_kevent *)0)->ident
>> p/d &((struct target_freebsd_kevent *)0)->filter
>> p/d &((struct target_freebsd_kevent *)0)->flags
>> p/d &((struct target_freebsd_kevent *)0)->fflags
>> p/d &((struct target_freebsd_kevent *)0)->data
>> p/d &((struct target_freebsd_kevent *)0)->udata
>> p/d &((struct target_freebsd_kevent *)0)->ext
>>
>> These report what the 2nd patch's problem-report
>> material reports (56,0,4,6,8,12,20,24): not
>> even the right size.
>>
>> I also confirmed that removing __packed from qemu's
>> code, rebuilding, and then checking with gdb
>> reported a match to the above armv7 native report
>> (64,0,4,6,8,16,24,32).
>>
>> I have not verified __packed used vs. not for any
>> other combination of host and target platforms.
>
> Removing the 2 examples of __packed, including the
> 1 for target_freebsd_kevent, as in Mikaël Urankar's
> 2nd listed patch, was sufficient to avoid the hang
> that I originally reported. (Technically FreeBSD 11
> is not involved and so one of the __packed removals
> is not relevant to my example.)
>
> I have not applied Mikaël Urankar's first listed
> patch at all. It did not prove necessary for my
> context.
>
> Again: the only tested context is amd64 -> armv7
> (host -> target) under a head -r339076 based
> build. (So still 12.)
>
> I'm doing a larger amd64 -> armv7 rebuild (around
> 210 ports overall) that originally included the
> problematic hang, as well as a full-bootstrap build
> of lang/gcc8 (so extensive emulation use after
> the clang-based stages). Prior to the patch,
> all smaller attempts also hung at the same
> place for print/texinfo.
>
> But I'll only report if this larger test has
> a problem.


The bigger test still hung up in the same old place.
A gdb attach/detach sequence against the qemu-arm-static
process running ld again let it continue from there.

Drat. But good to know.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



