Date: Thu, 27 Dec 2018 22:45:18 +0000
From: bugzilla-noreply@freebsd.org
To: testing@freebsd.org
Subject: [Bug 233646] Flakey test case: bin.sh.builtins.functional_test.kill1
Message-ID: <bug-233646-32464-oabj91MzZ1@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-233646-32464@https.bugs.freebsd.org/bugzilla/>
References: <bug-233646-32464@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233646

Jilles Tjoelker <jilles@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Open

--- Comment #3 from Jilles Tjoelker <jilles@FreeBSD.org> ---
In the below text, wait(2) means any wait system call; sh(1) uses wait3()
which appears as wait4() in ktrace.

The test case is meant to test that a terminated, wait(2)ed for but not
wait(1)ed for job can be passed to kill(1) without error (the command will do
nothing). The part with the second background job, p2 and wait is intended to
wait for the first background job to terminate and be wait(2)ed for, without
taking excessive time or wait(1)ing for it (which would make the %1
specification invalid).

If the first background job is slow to terminate, the kill command will do
something but this is harmless. If the first background job terminates but the
kernel has not returned it yet via wait(2), the kill command will kill a zombie
which per POSIX does nothing successfully.

I noticed that the problem is quickly reproduced on head using a loop like

  while sh builtins/kill1.0; do :; done

using head's sh as well as stable/11's sh, while it can run for quite a while
on stable/11 using stable/11's sh as well as head's sh built against stable/11.
Reproducing with ktrace -i seems hard, but reproducing with plain ktrace works.

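To compare how quickly the failure shows up on different kernel and sh combinations, the reproduction loop above can be given an iteration counter. The following is only a sketch: the function name runs_until_failure and the limit parameter are hypothetical and not part of the test suite.

```python
import subprocess
import sys
from typing import Optional, Sequence


def runs_until_failure(cmd: Sequence[str], limit: int) -> Optional[int]:
    """Run cmd repeatedly, stopping at the first nonzero exit status.

    Returns the number of runs that succeeded before the first failure,
    or None if all `limit` runs succeeded.
    """
    for completed in range(limit):
        if subprocess.run(cmd).returncode != 0:
            return completed
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to repeat is taken from the command line.
    count = runs_until_failure(sys.argv[1:], limit=10000)
    if count is None:
        print("no failure within the iteration limit")
    else:
        print(f"first failure after {count} successful runs")
```

Saved under a name of your choice, for example repeat.py, it would be run from the directory containing builtins/ as: python3 repeat.py sh builtins/kill1.0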
The below ktrace extract seems to indicate that the kernel is at fault,
returning an [ESRCH] error for killing a zombie:

 19837 sh       CALL  fork
 19837 sh       RET   fork 19838/0x4d7e
 19837 sh       CALL  wait4(0xffffffff,0x7fffffffe91c,0x1<WNOHANG>,0)
 19837 sh       RET   wait4 0
 19837 sh       CALL  fork
 19837 sh       RET   fork 19839/0x4d7f
 19837 sh       CALL  sigprocmask(SIG_BLOCK,0x7fffffffe820,0x7fffffffe810)
 19837 sh       RET   sigprocmask 0
 19837 sh       CALL  sigaction(SIGCHLD,0x7fffffffe850,0x7fffffffe830)
 19837 sh       RET   sigaction 0
 19837 sh       CALL  wait4(0xffffffff,0x7fffffffe80c,0x1<WNOHANG>,0)
 19837 sh       RET   wait4 19839/0x4d7f
 19837 sh       CALL  sigaction(SIGCHLD,0x7fffffffe830,0)
 19837 sh       RET   sigaction 0
 19837 sh       CALL  sigprocmask(SIG_SETMASK,0x7fffffffe810,0)
 19837 sh       RET   sigprocmask 0
 19837 sh       CALL  kill(0x4d7e,SIGTERM)
 19837 sh       RET   kill -1 errno 3 No such process

Process ID 19838 (0x4d7e) has not been returned by a wait4() call, so it must
either be still running or a zombie. In either case, a kill() on it must
succeed.

It appears that there is no test that specifically verifies that killing a
zombie process succeeds.
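
A minimal sketch of what such a check could look like, written in Python rather than as part of the existing test suite. The function name kill_zombie_succeeds is hypothetical. The sketch assumes a platform where os.waitid() is available, and uses WNOWAIT to wait for the child to exit without reaping it, so that the child is known to be a zombie when kill() is called.

```python
import os
import signal


def kill_zombie_succeeds() -> bool:
    """Fork a child, let it become a zombie, then signal it.

    Returns True if kill() on the zombie succeeds, False if it fails
    with ESRCH as in the ktrace extract above.
    """
    pid = os.fork()
    if pid == 0:
        # Child: exit at once, skipping interpreter cleanup.
        os._exit(0)

    # Parent: block until the child has exited, but leave it unreaped
    # (WNOWAIT), so that it stays a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    try:
        os.kill(pid, signal.SIGTERM)  # should succeed and do nothing
        succeeded = True
    except ProcessLookupError:        # errno ESRCH
        succeeded = False
    finally:
        os.waitpid(pid, 0)            # reap the zombie
    return succeeded


if __name__ == "__main__":
    if kill_zombie_succeeds():
        print("kill() on zombie succeeded")
    else:
        print("kill() on zombie failed with ESRCH")
```

Because the failure described here is intermittent, a single call is unlikely to catch it; calling the function in a loop would be the closer analogue of the reproduction loop.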
