Date: Mon, 29 Jan 2024 10:45:19 +1100
From: Nathan Reilly-list <lists@nreilly.com>
To: Guido Falsi <mad@madpilot.net>
Cc: emulation@freebsd.org, "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, freebsd-pkg@freebsd.org
Subject: Re: qemu-user-static aarch64 lockup/race? (was Re: Python failure in poudriere on arm64 (via qemu-user-static cross compiling))
Message-ID: <D2DD631F-8AED-48B7-8FB3-86F93BA707F2@nreilly.com>
In-Reply-To: <5ef2ab66-25ef-45f1-aa5a-4b614eab2f40@madpilot.net>
References: <6a33726b-eb6f-418e-9fbd-6d0b9b4bfaa8@madpilot.net> <0fc7f929-6e5b-4a33-97d2-8a9c0c07d524@madpilot.net> <79a5eb0f-d04e-4c1a-9d8a-185e1fb4e4a2@madpilot.net> <CANCZdfr-0w2EMa=_hFT3p4gFSDO-P1Yf8Vb-1eLiwRVomo1Jfg@mail.gmail.com> <a1845758-3535-4aa0-9274-d3b13dd3801b@madpilot.net> <5ef2ab66-25ef-45f1-aa5a-4b614eab2f40@madpilot.net>
[-- Attachment #1 --]
> On 29 Jan 2024, at 8:43 am, Guido Falsi <mad@madpilot.net> wrote:
>
> On 28/01/24 22:34, Guido Falsi wrote:
>> On 28/01/24 22:23, Warner Losh wrote:
>>>
>>> On Sun, Jan 28, 2024, 12:38 PM Guido Falsi <mad@madpilot.net> wrote:
>>>
>>>     On 28/01/24 15:15, Guido Falsi wrote:
>>>     [snip]
>>>      > Creating repository in /tmp/packages:   0%
>>>      >
>>>
>>>     BTW, forgot to mention last time this worked without issue was around
>>>     20th December.
>>>
>>>
>>> I think this is a bsd-user issue. There is a race somewhere in that code that causes the hangs. I'd love a reproducible test case that is somewhat smaller than python... there are bigger races with the newer stuff and I've not had the time to chase it there either. 😞
>> First of all thanks for your feedback. It encourages me having someone else with better knowledge about this confirm that a race condition is actually a possible cause!
>> Strange this has not been happening up to mid December.
>> My main and fully reproducible use case is actually mostly with pkg.
>> at the end of the run poudriere runs `pkg repo` to create the meta files and sign the repo. It forks itself (ncpus + 2 I guess, even forcing it to 1 worker I see three processes), and then locks up, with all the processes stopping using CPU (ps output is in my message)
>> I guess this can be reproduced with any poudriere repo with at least more than ncpus packages in it. can also be reproduced using `poudriere pkgclean -u <etc>`
>> If that does not work I'm not sure how to reproduce it in other ways, but I can try writing some code mocking what pkg seems to be doing, not an expert at such things, though.
>
> In case it helps further narrow down things, it looks like the lockup is happening somewhere around here:
>
> https://github.com/freebsd/pkg/blob/56fa3f87d9d9644348b89680dfd8af47a860ee82/libpkg/pkg_repo_create.c#L778
>
> and/or in the pkg_create_repo_worker() function here:
>
> https://github.com/freebsd/pkg/blob/56fa3f87d9d9644348b89680dfd8af47a860ee82/libpkg/pkg_repo_create.c#L341
>
>
> (I'm trying to spare you the time needed to find the actual code being executed, I guess you would have identified this in a few minutes yourself, but I'm trying to make myself useful)

There appears to be a GitHub issue for poudriere with this, but it seems to be looking in another direction.

https://github.com/freebsd/poudriere/issues/1009

Regards,
Nathan
