Date: Sat, 03 Jan 2026 22:06:31 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 292167] occasional SMP APs not starting on ppc64 guests
Message-ID: <bug-292167-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=292167

            Bug ID: 292167
           Summary: occasional SMP APs not starting on ppc64 guests
           Product: Base System
           Version: 15.0-RELEASE
          Hardware: powerpc
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: adrian@freebsd.org

QEMU version:

```
qemu-devel-10.1.20251031       QEMU CPU Emulator - development version
```

-HEAD revision:

```
commit aa611fa7e835ae77a623cc6d05020f5ee76dc881 (HEAD -> 20251231_ppc64_file_test)
Author: Dag-Erling Smørgrav <des@FreeBSD.org>
Date:   Wed Dec 31 14:10:39 2025 +0100

    depend-cleanup.sh: Reduce repetition

    MFC after:      1 week
    Reviewed by:    imp
    Differential Revision:  https://reviews.freebsd.org/D54329
```

Booting 15 and 16 builds, BE or LE ISO:

```
#!/bin/sh

qemu-system-ppc64 \
    -s -S \
    -machine pseries \
    -cpu power9 \
    -cdrom freebsd-16.iso \
    -m 6144M \
    -smp 8 \
    -nographic
```

Sometimes I get nothing after the AP init:

```
real memory  = 6396932096 (6100 MB)
avail memory = 6135046144 (5850 MB)
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
random: registering fast source PowerISA DARN random number generator
random: fast provider: "PowerISA DARN random number generator"
arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
kbd0 at kbdmux0
ofwbus0: <Open Firmware Device Tree> on nexus0
cpulist0: <Open Firmware CPU Group> on ofwbus0
cpu0: <Open Firmware CPU> on cpulist0
xicp0: <External Interrupt Presentation Controller> on ofwbus0
xicp0: Handling CPUs 0-7
vdevice0: <POWER Hypervisor Virtual Device Root> on ofwbus0
vscsi0: <POWER Hypervisor Virtual SCSI Bus> irq 16781572 on vdevice0
vscsi0: Queue depth 22 commands
llan0: <POWER Hypervisor Virtual Ethernet> irq 16781571 on vdevice0
llan0: Ethernet address: 52:54:00:12:34:56
uart0: <POWER Hypervisor Virtual Serial Port> irq 16781569 on vdevice0
pcib0: <RTAS Host-PCI bridge> on ofwbus0
pci0: <POWER Hypervisor PCI bus> on pcib0
xhci0: <NEC uPD720200 USB 3.0 controller> mem 0x81020000-0x81023fff irq 4609 at device 1.0 numa-domain 0 on pci0
xhci0: 32 bytes context size, 32-bit DMA
xhci0: xECP capabilities <PROTO,PROTO>
usbus0 numa-domain 0 on xhci0
vgapci0: <VGA-compatible display> mem 0x80000000-0x80ffffff,0x81000000-0x81000fff at device 0.0 numa-domain 0 on pci0
vgapci0: Boot video device
rtas0: <Run-Time Abstraction Services> on ofwbus0
rtas0: registered as a time-of-day clock, resolution 0.002000s
ossl0: <OpenSSL crypto> on nexus0
Timecounter "timebase" frequency 512000000 Hz quality 1000
Event timer "decrementer" frequency 512000000 Hz quality 1000
Timecounters tick every 1.000 msec
llan0: link state changed to UP
usbus0: 5.0Gbps Super Speed USB v3.0
ugen0.1: <(0x1033) XHCI root HUB> at usbus0
uhub0 numa-domain 0 on usbus0
uhub0: <(0x1033) XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
Trying to mount root from cd9660:/dev/iso9660/BOOT [ro]...
Launching APs: 2 5 6 4 7 1 3
WARNING: WITNESS option enabled, expect reduced performance.
<hang>
```

Firing up gdb on the build kernel/image:

```
adrian@test-3:/data/1/adrian/freebsd/freebsd-src-ppc64 % kgdb ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug
GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd16.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug...
(kgdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000000000000100 in ?? ()
(kgdb) cont
Continuing.
^C
Thread 8 received signal SIGINT, Interrupt.
[Switching to Thread 1.8]
0xc000000000e8ba3c in ?? ()
(kgdb) add-symbol-file ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug 0xc000000000101000
add symbol table from file "../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug" at
        .text_addr = 0xc000000000101000
(y or n) bt
Please answer y or n.
(y or n) y
Reading symbols from ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug...
```

```
(kgdb) info threads
  Id   Target Id                    Frame
  1    Thread 1.1 (CPU#0 [running]) cpu_switch ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
  2    Thread 1.2 (CPU#1 [halted ]) phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  3    Thread 1.3 (CPU#2 [running]) cpu_switch ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
  4    Thread 1.4 (CPU#3 [halted ]) phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  5    Thread 1.5 (CPU#4 [halted ]) phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  6    Thread 1.6 (CPU#5 [halted ]) phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  7    Thread 1.7 (CPU#6 [halted ]) phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
* 8    Thread 1.8 (CPU#7 [halted ]) phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
```

Now, what's interesting is that gdb threads 1 and 3 (CPU#0 and CPU#2) are both doing this:

```
(kgdb) thread 1
[Switching to thread 1 (Thread 1.1)]
#0  cpu_switch ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
195             ld      %r7,TD_LOCK(%r13)
(kgdb) bt
#0  cpu_switch ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
#1  0xc0000000008d3244 in sched_switch (td=0xc00800002940e840, flags=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:2419
#2  0xc0000000008a30ec in mi_switch (flags=267)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_synch.c:530
#3  0xc0000000008d6620 in sched_bind (td=0xc00800002940e840, cpu=6)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:3065
#4  0xc0000000008f93c0 in taskqgroup_binder (ctx=0xc000000005090600)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:764
#5  0xc0000000008f9f78 in gtaskqueue_run_locked (queue=0xc00000000535d480)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:368
#6  0xc0000000008f9c5c in gtaskqueue_thread_loop (arg=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:544
#7  0xc0000000008279c0 in fork_exit (callout=0xc0000000008f9b20 <gtaskqueue_thread_loop>, arg=0xc008000029366098, frame=0xc00800003190c940)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_fork.c:1155
#8  0xc000000000e7d3bc in fork_trampoline ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:357
Backtrace stopped: frame did not save the PC
(kgdb)
```

```
(kgdb) thread 3
[Switching to thread 3 (Thread 1.3)]
#0  cpu_switch ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
195             ld      %r7,TD_LOCK(%r13)
(kgdb) bt
#0  cpu_switch ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
#1  0xc0000000008d3244 in sched_switch (td=0xc00800002940acc0, flags=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:2419
#2  0xc0000000008a30ec in mi_switch (flags=267)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_synch.c:530
#3  0xc0000000008d6620 in sched_bind (td=0xc00800002940acc0, cpu=0)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:3065
#4  0xc0000000008f93c0 in taskqgroup_binder (ctx=0xc000000005090780)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:764
#5  0xc0000000008f9f78 in gtaskqueue_run_locked (queue=0xc00000000535dd80)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:368
#6  0xc0000000008f9c5c in gtaskqueue_thread_loop (arg=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:544
#7  0xc0000000008279c0 in fork_exit (callout=0xc0000000008f9b20 <gtaskqueue_thread_loop>, arg=0xc008000029366008, frame=0xc00800003188a940)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_fork.c:1155
#8  0xc000000000e7d3bc in fork_trampoline ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:357
Backtrace stopped: frame did not save the PC
(kgdb)
```

So thread 1 (cpu0) is waiting for thread 7 (cpu6), and thread 3 (cpu2) is waiting for cpu0. cpu2 can't make forward progress as cpu0 is in sched switch, and the rest of them are sitting in the idle loop (e.g. thread 7 / cpu6):

```
(kgdb) thread 7
[Switching to thread 7 (Thread 1.7)]
#0  phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
45              ld      %r0,16(%r1)
(kgdb) bt
#0  phyp_hcall ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
#1  0xc000000000e922b8 in phyp_cpu_idle (sbt=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/platform_chrp.c:594
#2  0xc000000000e6be5c in cpu_idle (busy=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/cpu.c:745
#3  0xc0000000008d6e5c in sched_idletd (dummy=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:3210
#4  0xc0000000008279c0 in fork_exit (callout=0xc0000000008d691c <sched_idletd>, arg=0x0, frame=0xc008000031870940)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_fork.c:1155
#5  0xc000000000e7d3bc in fork_trampoline ()
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:357
Backtrace stopped: frame did not save the PC
(kgdb)
```

So there's something weird going on here. I may have to go and fire up KTR inside the kernel to log the set of scheduler operations before things went out to lunch.

-- 
You are receiving this mail because:
You are the assignee for the bug.
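The thread-to-CPU bookkeeping in the backtraces above is easy to get wrong by hand, since gdb thread N is CPU#(N-1). What follows is a minimal Python sketch for tabulating who is waiting on whom, assuming the `info threads` lines and the `sched_bind` frames are pasted in as text. The names `INFO_THREADS`, `SCHED_BIND_FRAMES`, `parse_threads` and `bind_targets` are made up for illustration and are not part of gdb or FreeBSD.

```python
import re

# Trimmed excerpts of the kgdb output quoted above; only the parts the
# regular expressions below look at are kept.
INFO_THREADS = """\
  1    Thread 1.1 (CPU#0 [running]) cpu_switch ()
  2    Thread 1.2 (CPU#1 [halted ]) phyp_hcall ()
  3    Thread 1.3 (CPU#2 [running]) cpu_switch ()
  4    Thread 1.4 (CPU#3 [halted ]) phyp_hcall ()
  5    Thread 1.5 (CPU#4 [halted ]) phyp_hcall ()
  6    Thread 1.6 (CPU#5 [halted ]) phyp_hcall ()
  7    Thread 1.7 (CPU#6 [halted ]) phyp_hcall ()
* 8    Thread 1.8 (CPU#7 [halted ]) phyp_hcall ()
"""

# sched_bind() frames copied from the two backtraces, keyed by gdb thread id.
SCHED_BIND_FRAMES = {
    1: "sched_bind (td=0xc00800002940e840, cpu=6)",
    3: "sched_bind (td=0xc00800002940acc0, cpu=0)",
}

THREAD_RE = re.compile(r"Thread 1\.(\d+) \(CPU#(\d+) \[(\w+)\s*\]\)")
BIND_RE = re.compile(r"sched_bind \(td=0x[0-9a-f]+, cpu=(\d+)\)")


def parse_threads(text):
    """Map gdb thread id -> (cpu id, state) from 'info threads' output."""
    return {int(t): (int(c), s) for t, c, s in THREAD_RE.findall(text)}


def bind_targets(frames, threads):
    """Map the cpu a thread is on -> the cpu it asked sched_bind() for."""
    targets = {}
    for tid, frame in frames.items():
        match = BIND_RE.search(frame)
        if match:
            targets[threads[tid][0]] = int(match.group(1))
    return targets


threads = parse_threads(INFO_THREADS)
targets = bind_targets(SCHED_BIND_FRAMES, threads)
state_of = {cpu: state for cpu, state in threads.values()}

for src, dst in sorted(targets.items()):
    print(f"cpu{src} -> sched_bind(cpu={dst}), target is {state_of[dst]}")
```

With the excerpts above this reports cpu0 binding to cpu6 (halted) and cpu2 binding to cpu0 (running), which matches the reading of the backtraces. It only restates what gdb already showed and says nothing about why the switch never completes.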
