List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
Subject: Re: armv7-on-aarch64 stuck at urdlck
From: Mark Millard
Date: Mon, 22 Jul 2024 10:27:20 -0700
To: mmel@freebsd.org
Cc: FreeBSD Current, "freebsd-arm@freebsd.org", "kib@freebsd.org >> Konstantin Belousov"

On Jul 22, 2024, at 09:41, meloun.michal@gmail.com wrote:

> On 22.07.2024 18:26, Mark Millard wrote:
>> On Jul 22, 2024, at 06:40, Michal Meloun wrote:
>>> On 22.07.2024 13:46, Mark Millard wrote:
>>>> On Jul 21, 2024, at 22:59, Michal Meloun wrote:
>>>>> I don't want to hijack the original thread, so I'm replying in a
>>>>> new one.
>>>>>
>>>>> My tegra tracks current and has been running 24/7, building
>>>>> kernel/world and kde5 in a loop, for a few years now. But I have
>>>>> never encountered the aforementioned lockup in native armv7.
>>>>>
>>>>> I have seen a usermode mutex lockup in an arm32 jail on aarch64,
>>>>> but only very rarely (once a month or so), and all my attempts to
>>>>> reproduce it in a more deterministic way have failed. Also, I
>>>>> don't think I've ever seen this with the debug version of libc.
>>>>>
>>>>> Unfortunately I also failed to reproduce the given lockup using
>>>>> dlopen_test.c, on either native armv7 or an arm32 jail.
>>>>>
>>>>> Michal Meloun
>>>> What is the output of:
>>>> # readelf -a /libexec/ld-elf.so.1 | grep -E "(^[^ 0-9]|.*_rtld_get_stack_prot)"
>>>> in your armv7 context(s)? Does it include the likes of:
>>>> QUOTE
>>>> Symbol table '.symtab' contains 911 entries:
>>>> 903: 000000000001b9ac 16 FUNC GLOBAL DEFAULT 11 _rtld_get_stack_prot
>>>> END QUOTE
>>>> vs. not?
>>>> Note that the "debug version of libc" being involved likely means
>>>> that DEBUG_FLAGS was defined. That in turn likely means that strip
>>>> is not being used. In such a case, I expect that the .symtab entry
>>>> for _rtld_get_stack_prot (and more) exists for such a context.
>>> At this time, I have the standard (thus stripped, non-debug) version
>>> of the runtime linker library installed.
>>> Thus it has only the dynamic relocation record for
>>> _rtld_get_stack_prot:
>>>
>>> root@tegra124:~/dlopen_test # readelf -a /libexec/ld-elf.so.1 | grep -E "(^[^ 0-9]|.*_rtld_get_stack_prot)"
>>> ELF Header:
>>> Elf file type is DYN (Shared object file)
>>> Entry point 0x1449c
>>> There are 10 program headers, starting at offset 52
>>> Program Headers:
>>> There are 23 section headers, starting at offset 0x1a448:
>>> Section Headers:
>>> Key to Flags:
>>> Dynamic section at offset 0x19fa4 contains 15 entries:
>>> Relocation section (.rel.dyn):
>>> r_offset r_info r_type st_value st_name
>>> Symbol table '.dynsym' contains 27 entries:
>>> 5: 000000000001ba0c 16 FUNC GLOBAL DEFAULT 12 _rtld_get_stack_prot@@FBSDprivate_1.0 (11)
>>> Notes at offset 0x00000174 with length 0x00000018:
>>> Histogram for bucket list length (total of 6 buckets):
>>> Histogram for bucket list length (total of 27 buckets):
>>> Version symbol section (.gnu.version):
>>> Version definition section (.gnu.version_d):
>>> Attribute Section: aeabi
>>>
>>> ------
>>>
>>> root@tegra124:~/dlopen_test # ./dlopen_test
>>> root@tegra124:~/dlopen_test #
>> Just to be sure . . .
>> Did you at some point "pkg install cairo" (or analogous) so that
>> the following (or some vintage) were in place?
>> # ls -lodT /usr/local/lib/libcairo.so*
>> lrwxr-xr-x 1 root wheel - 21 Apr 29 19:45:15 2024 /usr/local/lib/libcairo.so -> libcairo.so.2.11704.0
>> lrwxr-xr-x 1 root wheel - 21 Apr 29 19:45:15 2024 /usr/local/lib/libcairo.so.2 -> libcairo.so.2.11704.0
>> -rwxr-xr-x 1 root wheel - 1118272 Apr 29 19:45:15 2024 /usr/local/lib/libcairo.so.2.11704.0
>> # file /usr/local/lib/libcairo.so.2.11704.0
>> /usr/local/lib/libcairo.so.2.11704.0: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (FreeBSD), dynamically linked, for FreeBSD 15.0 (1500018), stripped
>> (Installing cairo would also install other things it needs.)
>> For the failing contexts, the a.out from dlopen_test.c will only
>> hang if the library (and what it requires) is actually there to
>> load.
> Yep, I have cairo installed (but compiled from sources, not installed
> by pkg). And I have verified that dlopen() returns success.
> In the meantime I tried all combinations (debug/stripped) of ld_elf
> and libthr. All combinations work without problems on the native
> system and in an arm32 jail.

Thanks for the information.

My personal builds, which are the ones that work in my testing, are
built on aarch64 as armv7 instead of on amd64. The known failing ones
are built on amd64. But I've no more specific information suggesting
a tie to the type of build host for the world used.

> Btw, gdb has long had problems with stepping inside ld_elf. It's
> better to run the test program without it and connect to the test
> program to get the "correct" stack trace.
>

In part I was deliberately exploring what sequence leads to the
hangups vs. lack of hangups and the like: more context than a
backtrace of the stuck state can provide.

But doing "./a.out &" and then "gdb -p..."
to attach to it:

_umtx_op () at _umtx_op.S:4
warning: 4 _umtx_op.S: No such file or directory
(gdb) bt
#0 _umtx_op () at _umtx_op.S:4
#1 0x2036845c in _umtx_op_err (obj=0x4, op=12, val=0, uaddr=0x0, uaddr2=0x0) at /home/pkgbuild/worktrees/main/lib/libsys/_umtx_op_err.c:36
#2 0x20115da8 in __thr_rwlock_rdlock (rwlock=0x4, rwlock@entry=0x20137c40, flags=3, tsp=) at /home/pkgbuild/worktrees/main/lib/libthr/thread/thr_umtx.c:294
#3 0x2010ebf4 in _thr_rwlock_rdlock (rwlock=0x20137c40, flags=0, tsp=0x0) at /home/pkgbuild/worktrees/main/lib/libthr/thread/thr_umtx.h:229
#4 _thr_rtld_rlock_acquire (lock=0x20137c40) at /home/pkgbuild/worktrees/main/lib/libthr/thread/thr_rtld.c:121
#5 0x20060788 in rlock_acquire (lock=0x2008af10 , lockstate=lockstate@entry=0xffffd114) at /home/pkgbuild/worktrees/main/libexec/rtld-elf/rtld_lock.c:259
#6 0x20059098 in _rtld_bind (obj=0x2008f404, reloff=496) at /home/pkgbuild/worktrees/main/libexec/rtld-elf/rtld.c:1035
#7 0x2005483c in _rtld_bind_start () at /home/pkgbuild/worktrees/main/libexec/rtld-elf/arm/rtld_start.S:89
#8 0x2005483c in _rtld_bind_start () at /home/pkgbuild/worktrees/main/libexec/rtld-elf/arm/rtld_start.S:89
#9 0x2005483c in _rtld_bind_start () at /home/pkgbuild/worktrees/main/libexec/rtld-elf/arm/rtld_start.S:89
. . .

It does not seem significantly different than I'd reported
for the hungup state.

An issue here is that the pkgbase world possibly is -O2 based
despite having debug information (but is stripped). This can
make details less reliable. So, for example, the rwlock=0x4
vs. rwlock@entry=0x20137c40 for __thr_rwlock_rdlock could
well be suspect.

===
Mark Millard
marklmi at yahoo.com
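
For reference, the readelf check quoted earlier in this message (whether
_rtld_get_stack_prot shows up in '.symtab' or only in '.dynsym') could be
scripted roughly as follows. This is only a sketch under stated
assumptions: the function name tables_defining and the two sample strings
are illustrative, the samples are abbreviated from the readelf output
quoted above, and the code only parses text rather than running readelf
itself.

```python
import re

# Header line that readelf prints before each symbol table listing.
TABLE_HEADER = re.compile(r"Symbol table '([^']+)' contains \d+ entries")
# Symbol table entries start with an index followed by a colon, e.g. "  5: ".
TABLE_ENTRY = re.compile(r"^\s*\d+:\s")


def tables_defining(readelf_output, symbol):
    """Return the symbol tables (in order seen) whose entries name `symbol`."""
    # Match the bare name, optionally followed by a version suffix
    # such as "@@FBSDprivate_1.0".
    name = re.compile(r"(^|\s)" + re.escape(symbol) + r"(@|\s|$)")
    tables = []
    current = None
    for line in readelf_output.splitlines():
        header = TABLE_HEADER.search(line)
        if header:
            current = header.group(1)
            continue
        if current and TABLE_ENTRY.match(line) and name.search(line):
            if current not in tables:
                tables.append(current)
    return tables


SYMBOL = "_rtld_get_stack_prot"

# Abbreviated from the stripped ld-elf.so.1 output quoted in this thread.
STRIPPED_SAMPLE = (
    "Symbol table '.dynsym' contains 27 entries:\n"
    "     5: 000000000001ba0c    16 FUNC    GLOBAL DEFAULT   12 "
    "_rtld_get_stack_prot@@FBSDprivate_1.0 (11)\n"
)

# Abbreviated from the '.symtab' example quoted in this thread.
UNSTRIPPED_SAMPLE = (
    "Symbol table '.symtab' contains 911 entries:\n"
    "   903: 000000000001b9ac    16 FUNC    GLOBAL DEFAULT   11 "
    "_rtld_get_stack_prot\n"
)

print(tables_defining(STRIPPED_SAMPLE, SYMBOL))    # ['.dynsym']
print(tables_defining(UNSTRIPPED_SAMPLE, SYMBOL))  # ['.symtab']
```

A result containing '.symtab' would suggest an unstripped runtime linker;
only '.dynsym' matches the stripped case. The text to feed in would be
the output of "readelf -a /libexec/ld-elf.so.1" captured in the armv7
context being checked.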