Date:      Wed, 18 Nov 2020 02:17:57 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 251227] setpgid sometimes returns ESRCH instead of EACCES
Message-ID:  <bug-251227-227-auO1CmqqqJ@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-251227-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-251227-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=251227

--- Comment #3 from Mahmoud Al-Qudsi <mqudsi@neosmart.net> ---
The main issue is that a failed zombie race breaks job control in shells;
that's actually what led me to file this issue. We were getting reports of
setpgid(2) failure when setting up a job in fish [0].

Typical job control setup involves setting up a new pgrp that has control of
the terminal; by convention the pgrp is assigned the pid of the first process
executed in the job pipeline. When executing `foo | bar`, there's obviously no
hard guarantee that by the time the shell forks to init `bar`, `foo` has not
already finished execution (except if you add cross-process synchronization
post-fork but pre-exec, which is extremely heavy handed and performs
noticeably poorly). Shells count on the fact that as long as they have not
reaped `foo`, the job pgrp with the same pid as `foo` will still be around by
the time the shell calls setpgid for `bar`.
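
To make the setup above concrete, here is a minimal sketch of how a shell
might place each pipeline member into the job's pgrp. It is not fish's actual
code; spawn_in_job() is a hypothetical helper, and pipe wiring and the
tcsetpgrp(3) call are left out. Both parent and child call setpgid(2) so the
pgrp is in place no matter which side is scheduled first.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork one member of a job and put it in the job's pgrp.
 * pgid == 0 means "first process in the job": the new pgrp takes its pid.
 * Returns the child's pid, or -1 if fork() failed.
 */
static pid_t spawn_in_job(pid_t pgid, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: setpgid(0, 0) creates a pgrp named after our own pid;
         * setpgid(0, pgid) joins the pgrp created for the first process. */
        (void)setpgid(0, pgid);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: repeat the call in case the child has not run yet.
     * The result is ignored here for brevity; a real shell inspects errno. */
    (void)setpgid(pid, pgid != 0 ? pgid : pid);
    return pid;
}
```

For `foo | bar`, the shell would call spawn_in_job(0, foo_argv) first and then
spawn_in_job(foo_pid, bar_argv). The second call is the one that can find `foo`
already exited but not yet reaped.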

Apart from the bigger issue that using pfind() instead of pfind_any() here
prevents a subsequent process in the same job from getting access to a shell
that was assigned over to the newly minted pgrp that now contains only
zombies, EACCES is used to distinguish the unavoidable race condition, where a
shell needs to call setpgid but is scheduled after the child's fork+exec has
already occurred, from actual errors calling setpgid (e.g. EPERM, EINVAL, and
in other cases, ESRCH) that qualify as exceptions stemming from incorrect call
semantics. So shells abort or error out when ESRCH is returned, but silently
ignore EACCES because it's an expected race condition. This exact behavior is
actually spelled out in POSIX.1-2004's setpgid page under the section
entitled "RATIONALE" [1] (I don't have a copy of POSIX.1-2001 in front of me
right now).
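
The errno handling described above can be sketched roughly as follows. The
name join_job() is hypothetical and this is not taken from any particular
shell; it only illustrates treating EACCES as the benign exec race while
reporting everything else (including ESRCH) as a real failure.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical parent-side wrapper around setpgid(2).
 * Returns 0 if the child is in the pgrp, or if the call lost the expected
 * race with the child's exec (EACCES). Returns -1 with errno set otherwise.
 */
static int join_job(pid_t pid, pid_t pgid)
{
    if (setpgid(pid, pgid) == 0)
        return 0;

    if (errno == EACCES) {
        /* The child has already exec'd. It is assumed to have called
         * setpgid itself between fork and exec, so nothing is wrong. */
        return 0;
    }

    /* EPERM, EINVAL, ESRCH, ...: the caller should treat these as errors. */
    return -1;
}
```

With this policy, an ESRCH returned for a job whose only member is an unreaped
zombie surfaces as a hard error, which is the failure reported in [0].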

[0]: https://github.com/fish-shell/fish-shell/issues/7474
[1]: https://pubs.opengroup.org/onlinepubs/009695399/functions/setpgid.html

-- 
You are receiving this mail because:
You are the assignee for the bug.


