Date: Tue, 28 Dec 2021 10:49:11 +0100
From: Jan Mikkelsen <janm@transactionware.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: closefrom blocking, wchan urdlck
Message-ID: <BD613855-83EA-46BA-9F75-063D803A02EA@transactionware.com>
In-Reply-To: <YcnZdCAyWxNaUpE%2B@kib.kiev.ua>
References: <2B3BA665-D42A-4B5F-AD2F-ED10E64A7276@transactionware.com>
 <YcnFDUxa/m1xeaUS@kib.kiev.ua>
 <BE544C61-86CB-48B3-92C7-39F7FFDE64DE@transactionware.com>
 <YcnVwe/Yb0YU5PVe@kib.kiev.ua>
 <9CB0803A-E15B-47F9-97A9-03597D41C01E@transactionware.com>
 <YcnZdCAyWxNaUpE%2B@kib.kiev.ua>

> On 27 Dec 2021, at 16:19, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> On Mon, Dec 27, 2021 at 04:13:50PM +0100, Jan Mikkelsen wrote:
>>
>>
>>> On 27 Dec 2021, at 16:03, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>>
>>> On Mon, Dec 27, 2021 at 03:54:57PM +0100, Jan Mikkelsen wrote:
>>>>
>>>>> On 27 Dec 2021, at 14:52, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>>>>
>>>>> On Mon, Dec 27, 2021 at 01:39:11PM +0100, Jan Mikkelsen wrote:
>>>>>> Hi,
>>>>>>
>>>>>> (On 11.2)
>>>>>>
>>>>>> I am occasionally seeing closefrom() block in a child process created by a call to pdfork().
>>>>>>
>>>>>> When this does happen, it is very early after the process has started, while other threads are being created elsewhere in the process. I cannot reproduce it after the thread creation is complete. According to the sigaction man page, this should be async signal safe.
>>>>>>
>>>>>> Stack trace from the call to closefrom():
>>>>>>
>>>>>> * frame #0: 0x000000080090276c libthr.so.3`_umtx_op_err at _umtx_op_err.S:37
>>>>>>   frame #1: 0x00000008008f6121 libthr.so.3`__thr_rwlock_rdlock(rwlock=<unavailable>, flags=<unavailable>, tsp=<unavailable>) at thr_umtx.c:307:10
>>>>>>   frame #2: 0x00000008008ff1ac libthr.so.3`_thr_rtld_rlock_acquire [inlined] _thr_rwlock_rdlock(rwlock=0x0000000800911600, flags=0, tsp=0x0000000000000000) at thr_umtx.h:232:10
>>>>>>   frame #3: 0x00000008008ff19b libthr.so.3`_thr_rtld_rlock_acquire(lock=0x0000000800911600) at thr_rtld.c:125
>>>>>>   frame #4: 0x000000080075332b ld-elf.so.1`rlock_acquire(lock=0x0000000800765270, lockstate=0x00007fffdfbfb8d0) at rtld_lock.c:208:2
>>>>>>   frame #5: 0x000000080074ba20 ld-elf.so.1`_rtld_bind(obj=0x0000000800769000, reloff=6072) at rtld.c:861:5
>>>>>>   frame #6: 0x0000000800747c7d ld-elf.so.1`_rtld_bind_start at rtld_start.S:121
>>>>>>   frame #7: 0x00000000006562d3 prog`Twio::ProcHandle::spawn(this=<unavailable>, command="/bin/echo", args=0x0000000800d7e000, descriptor_mapping=<unavailable>, descriptor_end=3) at prochandle_pdfork.cpp:308:2
>>>>> And where is the closefrom() call in the demonstrated trace?
>>>>>
>>>>> What version of the system do you use?
>>>>> You need at least cbdec8db18b533f6d7be (on HEAD) or a5659943e37a74c96e
>>>>> (stable/13) for pdfork() to behave sanely. But you still not allowed to
>>>>> call non-async signal safe functions in the child before exec.
>>>>
>>>>
>>>> This is 12.2-p11. I just noticed that I wrote 11.2 above, that is incorrect.
>>
….
>>>> The commit you can apply cleanly to 12.2, I’m running a build now. Are there other issues with pdfork in 12.2?
>>>
>>> pdfork() with threading processes requires 21f749da82e755aafab1276 and the
>>> followup cbdec8db18b533f6d7be. I do not believe any of this is in 12.3,
>>> and definitely not in 12.2.
>>
>> Thanks, will check and apply.

I can no longer reproduce the problem with these two changes added to a 12.2 source tree.

Thanks!

Regards,

Jan M.