From: mmel@freebsd.org
Date: Wed, 24 Jul 2024 19:34:47 +0200
Subject: Re: armv7-on-aarch64 stuck at urdlck
To: Konstantin Belousov, John F Carr
Cc: Mark Millard, FreeBSD Current, freebsd-arm@freebsd.org
Message-ID: <28484869-05fd-4391-9501-10b93280f7a4@freebsd.org>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On 24.07.2024 17:47, Konstantin Belousov wrote:
> On Wed, Jul 24, 2024 at 01:07:39PM +0000, John F Carr wrote:
>>
>>
>>> On Jul 24, 2024, at 06:50, Konstantin Belousov wrote:
>>>
>>> On Wed, Jul 24, 2024 at 12:34:57PM +0200, mmel@freebsd.org wrote:
>>>>
>>>>
>>>> On 24.07.2024 12:24, Konstantin Belousov wrote:
>>>>> On Tue, Jul 23, 2024 at 08:11:13PM +0000, John F Carr wrote:
>>>>>> On Jul 23, 2024, at 13:46, Michal Meloun wrote:
>>>>>>>
>>>>>>> On 23.07.2024 11:36, Konstantin Belousov wrote:
>>>>>>>> On Tue, Jul 23, 2024 at 09:53:41AM +0200, Michal Meloun wrote:
>>>>>>>>> The good news is that I'm finally able to generate a working/locking
>>>>>>>>> test case. The culprit (at least for me) is whether "-mcpu" is used when
>>>>>>>>> compiling libthr (e.g. indirectly injected via CPUTYPE in /etc/make.conf).
>>>>>>>>> If it is not used, libthr is broken (regardless of -O level or debug/normal
>>>>>>>>> build), but -mcpu=cortex-a15 will always produce a working libthr.
>>>>>>>> I think this is very significant progress.
>>>>>>>> Do you plan to drill down more to see what is going on?
>>>>>>>
>>>>>>> So the problem is now clear, and I fear it may apply to other architectures as well.
>>>>>>> dlopen_object() (from rtld_elf),
>>>>>>> https://cgit.freebsd.org/src/tree/libexec/rtld-elf/rtld.c#n3766,
>>>>>>> holds the rtld_bind_lock write lock for almost the entire time a new library is loaded.
>>>>>>> If the code that runs while loading the library uses a not yet resolved symbol, the _rtld_bind() function attempts to take a read lock on rtld_bind_lock and a deadlock occurs.
>>>>>>>
>>>>>>> In this case, it is round_up() in _thr_stack_fix_protection,
>>>>>>> https://cgit.freebsd.org/src/tree/lib/libthr/thread/thr_stack.c#n136.
>>>>>>> Its division issues a call to __aeabi_uidiv (since not all armv7 processors support HW divide).
>>>>>>>
>>>>>>> Unfortunately, I'm not sure how to fix it. The compiler can emit __aeabi_<> in any place, and I'm not sure if it can resolve all the symbols used by rtld_elf and libthr beforehand.
>>>>>>>
>>>>>>>
>>>>>>> Michal
>>>>>>>
>>>>>>
>>>>>> In this case (but not for all __aeabi_ functions) we can avoid division
>>>>>> as long as page size is a power of 2.
>>>>>>
>>>>>> The function is
>>>>>>
>>>>>> static inline size_t
>>>>>> round_up(size_t size)
>>>>>> {
>>>>>> 	if (size % _thr_page_size != 0)
>>>>>> 		size = ((size / _thr_page_size) + 1) *
>>>>>> 		    _thr_page_size;
>>>>>> 	return size;
>>>>>> }
>>>>>>
>>>>>> The body can be condensed to
>>>>>>
>>>>>> 	return (size + _thr_page_size - 1) & ~(_thr_page_size - 1);
>>>>>>
>>>>>> This is shorter in both lines of code and instruction bytes.
>>>>>
>>>>> Let's not allow this to be lost. Could anybody confirm that the patch
>>>>> below fixes the issue?
>>>>>
>>>>> commit d560f4f6690a48476565278fd07ca131bf4eeb3c
>>>>> Author: Konstantin Belousov
>>>>> Date:   Wed Jul 24 13:17:55 2024 +0300
>>>>>
>>>>>     rtld: avoid division in __thr_map_stacks_exec()
>>>>>     The function is called by rtld with the rtld bind lock write-locked,
>>>>>     when fixing the stack permission during dso load. Not every ARMv7 CPU
>>>>>     supports the div, which causes the recursive entry into rtld to resolve
>>>>>     the __aeabi_uidiv symbol, causing self-lock.
>>>>>     Workaround the problem by using roundup2() instead of open-coding less
>>>>>     efficient formula.
>>>>>     Diagnosed by: mmel
>>>>>     Based on submission by: John F Carr
>>>>>     Sponsored by: The FreeBSD Foundation
>>>>>     MFC after: 1 week
>>>>>
>>> Just realized that it is wrong. Stack size is user-controlled and it does
>>> not need to be a power of two.
>>
>> Your change is correct. _thr_page_size is set to getpagesize(),
>> which is a power of 2. The call to roundup2 takes a user-provided
>> size and rounds it up to a multiple of the system page size.
>>
>> I tested the change and it works. My change also works and
>> should compile to identical code. I forgot there was a standard
>> function to do the rounding.
> Right, my bad, thank you for correcting my thinko.
>
>>
>>>> For the final resolution of the deadlocks, after a full day of digging, I'm very much
>>>> inclined to add -znow to the linker flags for libthr.so (and maybe also
>>>> for ld-elf.so). The runtime cost of resolving all symbols at startup is very
>>>> low. Direct pre-resolving in _thr_rtld_init() is problematic for the __aeabi_*
>>>> symbols, since they don't have official C prototypes, and some are not
>>>> compatible with C calling conventions.
>>> I do not like it. `-z now' changes (breaks) the ABI and makes some symbols
>>> not preemptible.
>>>
>>> In the worst case, we would need a call to the asm routine which causes the
>>> resolution of the __aeabi_* symbols on arm.
>>>
>>
>> It would also be possible to link libthr with libgcc.a and use a linker map
>> to hide the __aeabi_ symbols.
> In principle yes, but if the ARM ABI states that __aeabi symbols must be used,
> and exported from libc, then this is also some form of ABI breakage.

I hope that https://reviews.freebsd.org/D46104 is acceptable :)
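
For reference, a minimal standalone sketch of the rounding question discussed in the quoted thread. It is not the libthr code: round_up_div(), round_up_mask() and rounders_agree() are hypothetical names, and the page size is passed as a parameter instead of being read from _thr_page_size. It only illustrates that the division form and the mask form (the same arithmetic roundup2() performs) give the same result for any size, provided the page size is a power of 2.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Division-based rounding, modelled on the quoted round_up().
 * On ARMv7 cores without hardware divide, the % and / here are
 * compiled into calls to runtime helpers such as __aeabi_uidiv.
 */
static size_t
round_up_div(size_t size, size_t page_size)
{
	if (size % page_size != 0)
		size = ((size / page_size) + 1) * page_size;
	return (size);
}

/*
 * Mask-based rounding. Valid only when page_size is a power of 2;
 * the size being rounded may be any value.
 */
static size_t
round_up_mask(size_t size, size_t page_size)
{
	return ((size + page_size - 1) & ~(page_size - 1));
}

/*
 * Hypothetical checker: returns 1 if both variants agree for every
 * size in [0, limit], 0 otherwise.
 */
static int
rounders_agree(size_t page_size, size_t limit)
{
	size_t size;

	/* Precondition: page_size is a nonzero power of 2. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	for (size = 0; size <= limit; size++) {
		if (round_up_div(size, page_size) !=
		    round_up_mask(size, page_size))
			return (0);
	}
	return (1);
}
```

Calling rounders_agree(4096, 100000) exercises sizes that are not powers of 2 themselves, which is the case raised in the thread: only the alignment has to be a power of 2, not the user-supplied stack size.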