Date: Wed, 18 Nov 2020 02:17:57 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 251227] setpgid sometimes returns ESRCH instead of EACCES
Message-ID: <bug-251227-227-auO1CmqqqJ@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-251227-227@https.bugs.freebsd.org/bugzilla/>
References: <bug-251227-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=251227

--- Comment #3 from Mahmoud Al-Qudsi <mqudsi@neosmart.net> ---
The main issue is that a failed zombie race breaks job control in shells;
that's actually what led me to file this issue. We were getting reports of
setpgid(2) failure when setting up a job in fish [0].

Typical job control setup involves setting up a new pgrp that has control of
the terminal; by convention the pgrp is assigned the pid of the first process
executed in the job pipeline. When executing `foo | bar`, there's obviously no
hard guarantee that by the time the shell forks to init `bar`, `foo` has not
yet finished execution (except if you add cross-process synchronization
post-fork but pre-exec, which is extremely heavy-handed and performs noticeably
poorly). Shells count on the fact that as long as they have not reaped `foo`,
then the job pgrp with the same pid as `foo` will still be around by the time
the shell calls setpgid for `bar`.

Apart from the bigger issue that using pfind() instead of pfind_any() here
prevents a subsequent process in the same job from getting access to a shell
that was assigned over to the newly minted pgrp that now contains only zombies,
EACCES is used to distinguish actual errors calling setpgid (e.g. EPERM,
EINVAL, and in other cases, ESRCH), which qualify as exceptions stemming from
incorrect call semantics, from the unavoidable race condition where a shell
needs to call setpgid but is scheduled after the child's fork+exec has already
occurred. So shells abort or error out when ESRCH is returned, but silently
ignore EACCES because it's an expected race condition. This exact behavior is
actually spelled out in the POSIX.1-2004 setpgid page under the section
entitled "RATIONALE" [1] (I don't have a copy of POSIX.1-2001 in front of me
right now).
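
To make the error handling described above concrete, here is a minimal sketch in C of how a shell might classify setpgid(2) failures. The function names (setpgid_error_is_benign, job_setpgid) are hypothetical and are not taken from fish or from any other shell; the only real API used is setpgid(2) itself. It assumes, as the comment states, that EACCES is the one error treated as the expected fork+exec race and that everything else is a real failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: decide whether a setpgid(2) errno value is the
 * expected race (the child already called exec) rather than a real error.
 * Assumption from the comment above: only EACCES is silently ignored.
 */
bool setpgid_error_is_benign(int err)
{
    return err == EACCES;
}

/*
 * Hypothetical helper: place `child` into the job's process group `pgid`.
 * Returns 0 on success or on the benign race, -1 on a real error, in which
 * case errno is left as set by setpgid(2) (e.g. EPERM, EINVAL, ESRCH).
 */
int job_setpgid(pid_t child, pid_t pgid)
{
    assert(child > 0 && pgid > 0);

    if (setpgid(child, pgid) == 0)
        return 0;               /* child joined (or created) the group */

    if (setpgid_error_is_benign(errno))
        return 0;               /* child exec'd first; nothing to do */

    return -1;                  /* caller should abort or report the job */
}
```

For `foo | bar`, a shell using this sketch would call job_setpgid(foo_pid, foo_pid) after forking `foo` and job_setpgid(bar_pid, foo_pid) after forking `bar`. The behavior reported in this bug is that the second call can fail with ESRCH when `foo` is already a zombie, which a classifier like this one treats as a real error.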

[0]: https://github.com/fish-shell/fish-shell/issues/7474
[1]: https://pubs.opengroup.org/onlinepubs/009695399/functions/setpgid.html

-- 
You are receiving this mail because:
You are the assignee for the bug.