Date: Thu, 06 Nov 2025 19:09:03 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 290843] killpg deadlock against a stopped interrupted fork
Message-ID: <bug-290843-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=290843

            Bug ID: 290843
           Summary: killpg deadlock against a stopped interrupted fork
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: bdrewery@FreeBSD.org
                CC: kib@FreeBSD.org, markj@FreeBSD.org

This is on CURRENT 55c28005f544282b984ae0e15dacd0c108d8ab12, but I've seen this for a few years. I had to disable the DEADLKRES option because I hit it so often in Poudriere tests. I finally found a simple repro today.

Basic summary: `killpg(pgid, STOP)` against a forking child blocks further `killpg(pgid)`. The repro is given later. Here's the simplest result:

```
# procstat -t 51155 31783
  PID    TID COMM  TDNAME  CPU  PRI STATE  WCHAN
51155 302071 sh    -        -1  115 sleep  killpg r
31783 128716 sh    -        -1  115 stop   -
# procstat -kk 51155 31783
  PID    TID COMM  TDNAME  KSTACK
51155 302071 sh    -       mi_switch+0x172 sleepq_switch+0x109 _sx_xlock_hard+0x513 _sx_xlock+0xac killpg1+0x138 kern_kill+0x222 amd64_syscall+0x451 fast_syscall_common+0xf8
31783 128716 sh    -       mi_switch+0x172 thread_suspend_check+0xbd sig_intr+0x7a fork1+0x448 sys_fork+0x54 amd64_syscall+0x451 fast_syscall_common+0xf8
```

Using `kill -CONT -31783` (which goes through killpg) blocks on killpg racer, while avoiding killpg with `kill -CONT 31783` does not block.

Repro:

```
# `kill -STOP; kill -TERM; kill -CONT` against a forking job (job control enabled).
# foo() is trying to repro a blank $() value, which is not required for the
# repro but brings in enough forking to trigger the problem quickly, so I left it in.
sh -c 'trap "kill -9 %1; exit" INT; foo() { unset cmd; cmd=$(/sbin/sysctl -n vm.loadavg|/usr/bin/awk "{print \$2,\$3,\$4}"); case "${cmd:+set}" in set) ;; *) exit 99 ;; esac }; runner() { while foo; do :; done }; launch() { local -; set -m; PS4="child+ " runner & }; set -x; while :; do launch; sleep 0.1; kill -STOP %1; kill -TERM %1; kill -CONT %1; ret=0; wait; if [ $ret -eq 99 ]; then exit 99; fi; done;'
```

It appears https://reviews.freebsd.org/D40493 and https://reviews.freebsd.org/D41128 may have relevant discussion and earlier attempts at a fix.

-- 
You are receiving this mail because:
You are the assignee for the bug.
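
For anyone trying to confirm the same state on their own system, the `procstat -t` output shown earlier can be filtered for threads sleeping with a WCHAN that starts with `killpg`. The following is only a rough sketch, not part of the original report: the function name `find_killpg_blocked` is made up for illustration, and it assumes COMM and TDNAME contain no spaces, so that STATE is the 7th whitespace-separated field and WCHAN begins at the 8th.

```sh
#!/bin/sh
# Hypothetical helper (not from the report): read `procstat -t` output on
# stdin and print each PID that has a thread in STATE "sleep" whose WCHAN
# begins with "killpg" (shown as "killpg r" in the output above).
# Assumption: COMM and TDNAME have no embedded spaces, so the columns are
#   $1=PID $2=TID $3=COMM $4=TDNAME $5=CPU $6=PRI $7=STATE $8...=WCHAN
find_killpg_blocked() {
    awk '$7 == "sleep" && $8 == "killpg" && !seen[$1]++ { print $1 }'
}

# Example use on a FreeBSD host, with the PIDs from the report:
#   procstat -t 51155 31783 | find_killpg_blocked
```

A PID printed by this filter is only a candidate. The `procstat -kk` stack (`_sx_xlock` under `killpg1`) is what actually shows the thread waiting in killpg.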
