Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 22 Feb 2024 01:23:08 +0000
From:      Jessica Clarke <jrtc27@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        "src-committers@freebsd.org" <src-committers@freebsd.org>, "dev-commits-src-all@freebsd.org" <dev-commits-src-all@freebsd.org>, "dev-commits-src-main@freebsd.org" <dev-commits-src-main@freebsd.org>
Subject:   Re: git: 8271d9b99a3b - main - libsys: remove usage of pthread_once and _once_stub
Message-ID:  <DD27C896-F134-4BC7-97F8-0A13C0F00114@freebsd.org>
In-Reply-To: <ZdafOihWozCdmcTx@kib.kiev.ua>
References:  <202402210029.41L0TOH5000231@gitrepo.freebsd.org> <964A29A2-4C51-4037-8EBE-320008D48AE0@freebsd.org> <ZdYGEGyT3949IZRC@kib.kiev.ua> <4715B319-B7DE-4D06-9F27-00CFE5AF89A7@freebsd.org> <ZdafOihWozCdmcTx@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help

On 22 Feb 2024, at 01:11, Konstantin Belousov <kostikbel@gmail.com> wrote:
> 
> On Wed, Feb 21, 2024 at 05:23:10PM +0000, Jessica Clarke wrote:
>> On 21 Feb 2024, at 14:17, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>> 
>>> On Wed, Feb 21, 2024 at 12:51:04AM +0000, Jessica Clarke wrote:
>>>> On 21 Feb 2024, at 00:29, Konstantin Belousov <kib@FreeBSD.org> wrote:
>>>>> 
>>>>> The branch main has been updated by kib:
>>>>> 
>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=8271d9b99a3b98c662ee9a6257a144284b7e1728
>>>>> 
>>>>> commit 8271d9b99a3b98c662ee9a6257a144284b7e1728
>>>>> Author:     Konstantin Belousov <kib@FreeBSD.org>
>>>>> AuthorDate: 2024-02-20 14:45:29 +0000
>>>>> Commit:     Konstantin Belousov <kib@FreeBSD.org>
>>>>> CommitDate: 2024-02-21 00:26:11 +0000
>>>>> 
>>>>>  libsys: remove usage of pthread_once and _once_stub
>>>>> 
>>>>>  that existed in auxv.c, use simple bool gate instead. This leaves a
>>>>>  small window if two threads try to call _elf_aux_info(3) simultaneously.
>>>>>  The situation is safe because auxv parsing is really idempotent. The
>>>>>  parsed data is the same, and we store atomic types (int/long/ptr) so
>>>>>  double-init does not matter.
>>>> 
>>>> You still need to load acquire and store release aux_once though,
>>>> otherwise you can see aux_once as true yet read the pre-initialised
>>>> data. In practice that’s surely very hard to hit, but the code as
>>>> written is now wrong. Also, idempotence should probably be made
>>>> unnecessary by using 0/1/2 state for uninitialised/initialising/
>>>> initialised, as it’s still technically UB from a C AM perspective due
>>>> to not being data race free if two threads initialise at the same time.
>>>> Better to just do the correct thing rather than risk things going wrong.
>>> 
>>> There is too much to handle 'in process' state for loosing thread, I need
>>> the whole libthr machinery.
>> 
>> What do you need libthr for? In pseudo-C:
>> 
>> x = load_acquire(&aux_once)
>> if (__predict_true(x == 2))
>>    return;
>> if (x == 1 || !compare_exchange_strong_acquire(&aux_once, &x, 1)) {
>>    while (x != 2) {
>>        yield();
>>        x = load_acquire(&aux_once)
>>    }
>>    return;
>> }
>> /* initialise as before */
>> store_release(&aux_once, 2);
>> 
>> I believe that’s all you need. Or compare exchange 0 to 1 as the
>> initial operation; makes the source code shorter at the expense of a
>> more expensive fast path:
>> 
>> x = 0;
>> if (__predict_true(!compare_exchange_strong_acquire(&aux_once, &x, 1)) {
>>    while (__predict_false(x != 2)) {
>>        yield();
>>        x = load_acquire(&aux_once)
>>    }
>>    return;
>> }
>> /* initialise as before */
>> store_release(&aux_once, 2);
>> 
>> I probably have bugs in the above, but you get the gist.
> The bug in the fragment above is with the yield().  If low-priority thread
> enters the '1' (in progress) block, and then is preempted by high-priority
> thread also entering init_auxv(), the process would never make a progress.
> 
> This is why I said that we need libthr (or umtx), to use real locking and
> move the waiting thread off cpu.  In kernel, yield can be used in similar
> situations because we can bump the priority, although it is tricky.

Yes, priority inversion is an issue here. But this is (without all the
C++ abstraction) how libcxxrt implements __cxa_guard_acquire, so if
it’s good enough for C++ constructors for static storage duration
objects declared at local scope, surely it’s also good enough for
aux_once? And if it’s not good enough for aux_once then libcxxrt should
be deemed broken...

One could easily adapt the above to use UMTX_OP_WAIT/WAKE though.

Jess

>>> I added the fences, thanks for noting.
>> 
>> Thanks.
>> 
>> Jess
>> 
>>> WRT being UB from pure C, we already have much more assumptions about
>>> atomicity.





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?DD27C896-F134-4BC7-97F8-0A13C0F00114>