Date:      Mon, 24 Nov 2025 00:19:56 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 290958] ctfmerge: random Segmentation fault: 11 for `make buildkernel' on macOS
Message-ID:  <bug-290958-227-TG82JOfBqB@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-290958-227@https.bugs.freebsd.org/bugzilla/>


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=290958

Mark Peek <mp@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|bugs@FreeBSD.org            |mp@FreeBSD.org
                 CC|                            |mp@FreeBSD.org

--- Comment #2 from Mark Peek <mp@FreeBSD.org> ---
Created attachment 265610
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=265610&action=edit
Patch for missing locking around ctfmerge fifo operations

I was able to reproduce this issue by running the build in a loop, and then
simplified the reproduction by running just the ctfmerge command from the last
crash in a loop. It would fail fairly quickly, within 100 iterations.

(lldb) bt all
  thread #1
    frame #0: 0x00000001978ca4f8 libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019790a0dc libsystem_pthread.dylib`_pthread_cond_wait + 984
    frame #2: 0x0000000104eefca0 ctfmerge`main + 1736
    frame #3: 0x0000000197541d54 dyld`start + 7184
  thread #2
    frame #0: 0x00000001978c99c8 libsystem_kernel.dylib`__psynch_mutexwait + 8
    frame #1: 0x0000000197906e3c libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 84
    frame #2: 0x0000000197904868 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 220
    frame #3: 0x0000000104ef05dc ctfmerge`worker_thread + 980
    frame #4: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136
* thread #3, stop reason = ESR_EC_DABORT_EL0 (fault address: 0x17f5)
  * frame #0: 0x0000000104ef093c ctfmerge`fifo_len + 16
    frame #1: 0x0000000104ef06d4 ctfmerge`worker_thread + 1228
    frame #2: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136
  thread #4
    frame #0: 0x00000001978ca4f8 libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019790a0dc libsystem_pthread.dylib`_pthread_cond_wait + 984
    frame #2: 0x0000000104ef06e8 ctfmerge`worker_thread + 1248
    frame #3: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136

I fixed the above occurrence by locking around the fifo_len() call, and then
hit the same crash at another fifo_len() call site:

(lldb) bt all
  thread #1
    frame #0: 0x00000001978ca4f8 libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019790a0dc libsystem_pthread.dylib`_pthread_cond_wait + 984
    frame #2: 0x0000000102317ca0 ctfmerge`main(argc=<unavailable>, argv=<unavailable>) at ctfmerge.c:928:3 [opt]
    frame #3: 0x0000000197541d54 dyld`start + 7184
  thread #2
    frame #0: 0x00000001978c99c8 libsystem_kernel.dylib`__psynch_mutexwait + 8
    frame #1: 0x0000000197906e3c libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 84
    frame #2: 0x0000000197904868 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 220
    frame #3: 0x000000019790a168 libsystem_pthread.dylib`_pthread_cond_wait + 1124
    frame #4: 0x00000001023186f8 ctfmerge`worker_runphase2(wq=0x0000000102344968) at ctfmerge.c:472:4 [opt] [inlined]
    frame #5: 0x0000000102318624 ctfmerge`worker_thread(wq=0x0000000102344968) at ctfmerge.c:544:2 [opt]
    frame #6: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136
  thread #3
    frame #0: 0x00000001978c99c8 libsystem_kernel.dylib`__psynch_mutexwait + 8
    frame #1: 0x0000000197906e3c libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 84
    frame #2: 0x0000000197904868 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 220
    frame #3: 0x000000019790a168 libsystem_pthread.dylib`_pthread_cond_wait + 1124
    frame #4: 0x00000001023186f8 ctfmerge`worker_runphase2(wq=0x0000000102344968) at ctfmerge.c:472:4 [opt] [inlined]
    frame #5: 0x0000000102318624 ctfmerge`worker_thread(wq=0x0000000102344968) at ctfmerge.c:544:2 [opt]
    frame #6: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136
* thread #4, stop reason = ESR_EC_DABORT_EL0 (fault address: 0x2176)
  * frame #0: 0x000000010231894c ctfmerge`fifo_len + 16
    frame #1: 0x00000001023186e4 ctfmerge`worker_runphase2(wq=0x0000000102344968) at ctfmerge.c:471:7 [opt] [inlined]
    frame #2: 0x0000000102318624 ctfmerge`worker_thread(wq=0x0000000102344968) at ctfmerge.c:544:2 [opt]
    frame #3: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136
  thread #5
    frame #0: 0x00000001978c99c8 libsystem_kernel.dylib`__psynch_mutexwait + 8
    frame #1: 0x0000000197906e3c libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 84
    frame #2: 0x0000000197904868 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 220
    frame #3: 0x0000000102318578 ctfmerge`worker_thread(wq=0x0000000102344968) at ctfmerge.c:532:3 [opt]
    frame #4: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136
  thread #6
    frame #0: 0x00000001978ca4f8 libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019790a0dc libsystem_pthread.dylib`_pthread_cond_wait + 984
    frame #2: 0x00000001023186f8 ctfmerge`worker_runphase2(wq=0x0000000102344968) at ctfmerge.c:472:4 [opt] [inlined]
    frame #3: 0x0000000102318624 ctfmerge`worker_thread(wq=0x0000000102344968) at ctfmerge.c:544:2 [opt]
    frame #4: 0x0000000197909c08 libsystem_pthread.dylib`_pthread_start + 136

I fixed the second one as well, and then found a third unlocked call while
reviewing all the fifo_*() calls for the attached patch. With the patch applied,
I ran the loop twice to 10000 iterations without an issue.
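
For readers who have not looked at the patch, the general pattern of "locking
around fifo operations" can be sketched as below. This is not the attached
patch and not the real ctfmerge code: the fifo_t, workqueue_t, worker() and
run_demo() definitions are hypothetical stand-ins, and the sketch assumes that
fifo_len() walks a linked list, so an unlocked call can follow a node pointer
that another thread is changing or freeing. The point is only that every
fifo_*() call, including the length check in the wait loop, happens while the
queue mutex is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a linked-list fifo; not ctfmerge's fifo.c. */
typedef struct node { struct node *next; int item; } node_t;
typedef struct { node_t *head, *tail; } fifo_t;

static void fifo_add(fifo_t *f, int item) {
    node_t *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->item = item;
    n->next = NULL;
    if (f->tail != NULL) f->tail->next = n; else f->head = n;
    f->tail = n;
}

/* Returns 0 and stores the item on success, -1 if the fifo is empty. */
static int fifo_remove(fifo_t *f, int *item) {
    node_t *n = f->head;
    if (n == NULL) return -1;
    f->head = n->next;
    if (f->head == NULL) f->tail = NULL;
    *item = n->item;
    free(n);
    return 0;
}

/* Walks the list: unsafe if another thread adds or removes concurrently. */
static int fifo_len(const fifo_t *f) {
    int len = 0;
    for (const node_t *n = f->head; n != NULL; n = n->next) len++;
    return len;
}

typedef struct {
    pthread_mutex_t lock;      /* protects every field below */
    pthread_cond_t nonempty;
    fifo_t fifo;
    int done;                  /* producer has finished adding items */
    long consumed;
} workqueue_t;

static void *worker(void *arg) {
    workqueue_t *wq = arg;
    for (;;) {
        int item;
        pthread_mutex_lock(&wq->lock);
        /* fifo_len() is only ever called with wq->lock held. */
        while (fifo_len(&wq->fifo) == 0 && !wq->done)
            pthread_cond_wait(&wq->nonempty, &wq->lock);
        if (fifo_remove(&wq->fifo, &item) != 0) {
            /* Empty and done: nothing left to process. */
            pthread_mutex_unlock(&wq->lock);
            return NULL;
        }
        wq->consumed++;
        pthread_mutex_unlock(&wq->lock);
    }
}

/* Starts nworkers threads, queues nitems items, returns items consumed. */
long run_demo(int nworkers, int nitems) {
    workqueue_t wq = { .fifo = { NULL, NULL }, .done = 0, .consumed = 0 };
    pthread_t *tids = malloc(sizeof(*tids) * (size_t)nworkers);
    assert(tids != NULL);
    pthread_mutex_init(&wq.lock, NULL);
    pthread_cond_init(&wq.nonempty, NULL);
    for (int i = 0; i < nworkers; i++)
        assert(pthread_create(&tids[i], NULL, worker, &wq) == 0);
    for (int i = 0; i < nitems; i++) {
        pthread_mutex_lock(&wq.lock);
        fifo_add(&wq.fifo, i);
        pthread_cond_signal(&wq.nonempty);
        pthread_mutex_unlock(&wq.lock);
    }
    pthread_mutex_lock(&wq.lock);
    wq.done = 1;
    pthread_cond_broadcast(&wq.nonempty);
    pthread_mutex_unlock(&wq.lock);
    for (int i = 0; i < nworkers; i++)
        pthread_join(tids[i], NULL);
    pthread_cond_destroy(&wq.nonempty);
    pthread_mutex_destroy(&wq.lock);
    free(tids);
    return wq.consumed;
}
```

Calling run_demo() with a few workers and a few thousand items from a small
main() built with -pthread exercises the pattern. Moving the fifo_len() check
outside the mutex in this sketch would reintroduce the kind of race seen in the
backtraces above, although whether it actually crashes depends on timing.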


Note: to get a core dump and an lldb backtrace on macOS:
1. Make /cores writable by the user: "chmod 777 /cores"
2. Raise the core size limit: "ulimit -c unlimited"
3. Codesign the ctfmerge binary with an entitlement that allows core dumps:
   /usr/libexec/PlistBuddy -c "Add :com.apple.security.get-task-allow bool true" tmp.entitlements
   codesign -s - -f --entitlements tmp.entitlements /path/to/ctfmerge

Then run lldb:
    lldb -c /cores/core.<pid> -f /path/to/ctfmerge
    (lldb) bt all

-- 
You are receiving this mail because:
You are the assignee for the bug.
