Date:      Sat, 03 Jan 2026 22:06:31 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 292167] occasional SMP APs not starting on ppc64 guests
Message-ID:  <bug-292167-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=292167

            Bug ID: 292167
           Summary: occasional SMP APs not starting on ppc64 guests
           Product: Base System
           Version: 15.0-RELEASE
          Hardware: powerpc
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: adrian@freebsd.org

QEMU version:

```
qemu-devel-10.1.20251031       QEMU CPU Emulator - development version
```

-HEAD revision:

```
commit aa611fa7e835ae77a623cc6d05020f5ee76dc881 (HEAD -> 20251231_ppc64_file_test)
Author: Dag-Erling Smørgrav <des@FreeBSD.org>
Date:   Wed Dec 31 14:10:39 2025 +0100

    depend-cleanup.sh: Reduce repetition

    MFC after:      1 week
    Reviewed by:    imp
    Differential Revision:  https://reviews.freebsd.org/D54329
```

This happens when booting both 15 and 16 builds, from either the BE or LE ISO:

```
#!/bin/sh

qemu-system-ppc64 \
        -s -S \
        -machine pseries \
        -cpu power9 \
        -cdrom freebsd-16.iso \
        -m 6144M \
        -smp 8 \
        -nographic
```

Sometimes the boot hangs with no further output after the APs are launched:

```
real memory  = 6396932096 (6100 MB)
avail memory = 6135046144 (5850 MB)
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
random: registering fast source PowerISA DARN random number generator
random: fast provider: "PowerISA DARN random number generator"
arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
kbd0 at kbdmux0
ofwbus0: <Open Firmware Device Tree> on nexus0
cpulist0: <Open Firmware CPU Group> on ofwbus0
cpu0: <Open Firmware CPU> on cpulist0
xicp0: <External Interrupt Presentation Controller> on ofwbus0
xicp0: Handling CPUs 0-7
vdevice0: <POWER Hypervisor Virtual Device Root> on ofwbus0
vscsi0: <POWER Hypervisor Virtual SCSI Bus> irq 16781572 on vdevice0
vscsi0: Queue depth 22 commands
llan0: <POWER Hypervisor Virtual Ethernet> irq 16781571 on vdevice0
llan0: Ethernet address: 52:54:00:12:34:56
uart0: <POWER Hypervisor Virtual Serial Port> irq 16781569 on vdevice0
pcib0: <RTAS Host-PCI bridge> on ofwbus0
pci0: <POWER Hypervisor PCI bus> on pcib0
xhci0: <NEC uPD720200 USB 3.0 controller> mem 0x81020000-0x81023fff irq 4609 at device 1.0 numa-domain 0 on pci0
xhci0: 32 bytes context size, 32-bit DMA
xhci0: xECP capabilities <PROTO,PROTO>
usbus0 numa-domain 0 on xhci0
vgapci0: <VGA-compatible display> mem 0x80000000-0x80ffffff,0x81000000-0x81000fff at device 0.0 numa-domain 0 on pci0
vgapci0: Boot video device
rtas0: <Run-Time Abstraction Services> on ofwbus0
rtas0: registered as a time-of-day clock, resolution 0.002000s
ossl0: <OpenSSL crypto> on nexus0
Timecounter "timebase" frequency 512000000 Hz quality 1000
Event timer "decrementer" frequency 512000000 Hz quality 1000
Timecounters tick every 1.000 msec
llan0: link state changed to UP
usbus0: 5.0Gbps Super Speed USB v3.0
ugen0.1: <(0x1033) XHCI root HUB> at usbus0
uhub0 numa-domain 0 on usbus0
uhub0: <(0x1033) XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
Trying to mount root from cd9660:/dev/iso9660/BOOT [ro]...
Launching APs: 2 5 6 4 7 1 3
WARNING: WITNESS option enabled, expect reduced performance.
<hang>
```

Firing up kgdb against the built kernel image:

```
adrian@test-3:/data/1/adrian/freebsd/freebsd-src-ppc64 % kgdb ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug
GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>;
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd16.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug...
(kgdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000000000000100 in ?? ()
(kgdb) cont
Continuing.
^C
Thread 8 received signal SIGINT, Interrupt.
[Switching to Thread 1.8]
0xc000000000e8ba3c in ?? ()
(kgdb) add-symbol-file ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug 0xc000000000101000
add symbol table from file "../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug" at
        .text_addr = 0xc000000000101000
(y or n) bt
Please answer y or n.
(y or n) y
Reading symbols from ../freebsd-obj-ppc64/data/1/adrian/freebsd/freebsd-src-ppc64/powerpc.powerpc64/sys/GENERIC64/kernel.debug...
```


```
(kgdb) info threads
  Id   Target Id                    Frame
  1    Thread 1.1 (CPU#0 [running]) cpu_switch () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
  2    Thread 1.2 (CPU#1 [halted ]) phyp_hcall () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  3    Thread 1.3 (CPU#2 [running]) cpu_switch () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
  4    Thread 1.4 (CPU#3 [halted ]) phyp_hcall () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  5    Thread 1.5 (CPU#4 [halted ]) phyp_hcall () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  6    Thread 1.6 (CPU#5 [halted ]) phyp_hcall () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
  7    Thread 1.7 (CPU#6 [halted ]) phyp_hcall () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
* 8    Thread 1.8 (CPU#7 [halted ]) phyp_hcall () at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
```
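
For repeated runs it may help to summarise the per-CPU state mechanically rather than eyeballing it. This is only a rough sketch: `count_cpu_states` is a made-up helper name, and it assumes the `info threads` output has been pasted or logged as plain text in the same `(CPU#N [state])` form shown above.

```shell
#!/bin/sh
# Hypothetical helper: tally vCPU states from `info threads` text on stdin.
count_cpu_states() {
        # Pull the bare state word out of each "(CPU#N [state ])" field,
        # then count how many vCPUs are in each state.
        sed -n 's/.*(CPU#[0-9]* \[\([a-z]*\) *]).*/\1/p' |
            awk '{ n[$1]++ } END { for (s in n) print s, n[s] }' |
            sort
}

# Thread lines from the hung boot above (source paths trimmed).
count_cpu_states <<'EOF'
  1    Thread 1.1 (CPU#0 [running]) cpu_switch ()
  2    Thread 1.2 (CPU#1 [halted ]) phyp_hcall ()
  3    Thread 1.3 (CPU#2 [running]) cpu_switch ()
  4    Thread 1.4 (CPU#3 [halted ]) phyp_hcall ()
  5    Thread 1.5 (CPU#4 [halted ]) phyp_hcall ()
  6    Thread 1.6 (CPU#5 [halted ]) phyp_hcall ()
  7    Thread 1.7 (CPU#6 [halted ]) phyp_hcall ()
* 8    Thread 1.8 (CPU#7 [halted ]) phyp_hcall ()
EOF
```

For this capture it reports six halted vCPUs and two running ones.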

Now, what's interesting is that threads 1 and 3 (CPU#0 and CPU#2) are both doing this:

```
(kgdb) thread 1
[Switching to thread 1 (Thread 1.1)]
#0  cpu_switch () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
195             ld      %r7,TD_LOCK(%r13)
(kgdb) bt
#0  cpu_switch () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
#1  0xc0000000008d3244 in sched_switch (td=0xc00800002940e840, flags=<optimized
out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:2419
#2  0xc0000000008a30ec in mi_switch (flags=267) at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_synch.c:530
#3  0xc0000000008d6620 in sched_bind (td=0xc00800002940e840, cpu=6)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:3065
#4  0xc0000000008f93c0 in taskqgroup_binder (ctx=0xc000000005090600)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:764
#5  0xc0000000008f9f78 in gtaskqueue_run_locked (queue=0xc00000000535d480)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:368
#6  0xc0000000008f9c5c in gtaskqueue_thread_loop (arg=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:544
#7  0xc0000000008279c0 in fork_exit (callout=0xc0000000008f9b20
<gtaskqueue_thread_loop>, arg=0xc008000029366098, 
    frame=0xc00800003190c940) at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_fork.c:1155
#8  0xc000000000e7d3bc in fork_trampoline () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:357
Backtrace stopped: frame did not save the PC
(kgdb) 
```

```
(kgdb) thread 3
[Switching to thread 3 (Thread 1.3)]
#0  cpu_switch () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
195             ld      %r7,TD_LOCK(%r13)
(kgdb) bt
#0  cpu_switch () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:195
#1  0xc0000000008d3244 in sched_switch (td=0xc00800002940acc0, flags=<optimized
out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:2419
#2  0xc0000000008a30ec in mi_switch (flags=267) at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_synch.c:530
#3  0xc0000000008d6620 in sched_bind (td=0xc00800002940acc0, cpu=0)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:3065
#4  0xc0000000008f93c0 in taskqgroup_binder (ctx=0xc000000005090780)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:764
#5  0xc0000000008f9f78 in gtaskqueue_run_locked (queue=0xc00000000535dd80)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:368
#6  0xc0000000008f9c5c in gtaskqueue_thread_loop (arg=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/subr_gtaskqueue.c:544
#7  0xc0000000008279c0 in fork_exit (callout=0xc0000000008f9b20
<gtaskqueue_thread_loop>, arg=0xc008000029366008, 
    frame=0xc00800003188a940) at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_fork.c:1155
#8  0xc000000000e7d3bc in fork_trampoline () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:357
Backtrace stopped: frame did not save the PC
(kgdb) 
```

So thread 1 (cpu0) is in sched_bind() waiting to move to cpu6 (thread 7), and
thread 3 (cpu2) is in sched_bind() waiting to move to cpu0.
cpu2 can't make forward progress while cpu0 is itself sitting in cpu_switch(),
and the rest of them (eg thread 7 / cpu6) are idle in phyp_hcall():

```
(kgdb) thread 7
[Switching to thread 7 (Thread 1.7)]
#0  phyp_hcall () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
45              ld      %r0,16(%r1)
(kgdb) bt
#0  phyp_hcall () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/phyp-hvcall.S:45
#1  0xc000000000e922b8 in phyp_cpu_idle (sbt=<optimized out>)
    at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/pseries/platform_chrp.c:594
#2  0xc000000000e6be5c in cpu_idle (busy=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/cpu.c:745
#3  0xc0000000008d6e5c in sched_idletd (dummy=<optimized out>)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/sched_ule.c:3210
#4  0xc0000000008279c0 in fork_exit (callout=0xc0000000008d691c <sched_idletd>,
arg=0x0, frame=0xc008000031870940)
    at /data/1/adrian/freebsd/freebsd-src-ppc64/sys/kern/kern_fork.c:1155
#5  0xc000000000e7d3bc in fork_trampoline () at
/data/1/adrian/freebsd/freebsd-src-ppc64/sys/powerpc/powerpc/swtch64.S:357
Backtrace stopped: frame did not save the PC
(kgdb) 
```
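
The bind targets read off the backtraces above can also be extracted from a saved log, which is handy when comparing several hung boots. A minimal sketch, assuming the `sched_bind` frames are logged unwrapped in the form gdb printed them here; `bind_targets` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list which CPU each thread is trying to bind to,
# based on the "sched_bind (td=..., cpu=N)" frames in backtrace text on stdin.
bind_targets() {
        sed -n 's/.* in sched_bind (td=\(0x[0-9a-f]*\), cpu=\([0-9]*\)).*/td \1 wants cpu \2/p'
}

# Frame #3 from thread 1 and from thread 3 above.
bind_targets <<'EOF'
#3  0xc0000000008d6620 in sched_bind (td=0xc00800002940e840, cpu=6)
#3  0xc0000000008d6620 in sched_bind (td=0xc00800002940acc0, cpu=0)
EOF
```

It only reports what each thread asked for; it says nothing about why the target CPU never picks the thread up.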

So there's something weird going on here. I may have to enable KTR in the
kernel to log the sequence of scheduler operations leading up to the hang.
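
As a starting point for that, here is a sketch of a kernel config that layers scheduler tracing on top of GENERIC64. The file name `GENERIC64-KTR`, the `write_ktr_config` helper and the buffer size are placeholders of mine, not something already in the tree; the option names are the ones documented in ktr(4). The file would need to live alongside GENERIC64 under sys/powerpc/conf for the `include` to resolve.

```shell
#!/bin/sh
# Hypothetical helper: write a kernel config that adds KTR scheduler tracing
# on top of GENERIC64. File name and KTR_ENTRIES value are assumptions.
write_ktr_config() {
        cat > "$1" <<'EOF'
include GENERIC64
ident   GENERIC64-KTR
options KTR
options KTR_ENTRIES=32768
options KTR_COMPILE=(KTR_SCHED)
options KTR_MASK=(KTR_SCHED)
EOF
}

write_ktr_config GENERIC64-KTR
```

KTR_COMPILE controls which trace classes are built in and KTR_MASK which are enabled at boot, so setting both to KTR_SCHED should capture scheduler events from the start, including around the AP launch.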

-- 
You are receiving this mail because:
You are the assignee for the bug.
