Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 4 Nov 2020 01:12:09 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        =?UTF-8?Q?J=C3=A9r=C3=A9mie_Galarneau?= <jeremie.galarneau@efficios.com>
Cc:        freebsd-hackers@freebsd.org, Michael Jeanson <mjeanson@efficios.com>
Subject:   Re: poll() POLLHUP behaviour inconsistency
Message-ID:  <CAGudoHE1Y-e=9ukKdXu_upbnXvf523wYSnk5S2DJVX=2Sp_2HA@mail.gmail.com>
In-Reply-To: <CA%2BjJMxuBj7EsZ76zg%2B9_bjm46N50%2BF7UG7yGt4zXy0q3VZbGng@mail.gmail.com>
References:  <CA%2BjJMxuBj7EsZ76zg%2B9_bjm46N50%2BF7UG7yGt4zXy0q3VZbGng@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 11/4/20, J=C3=A9r=C3=A9mie Galarneau <jeremie.galarneau@efficios.com> wr=
ote:
> Hi,
>
> Michael and myself are porting code from Linux to FreeBSD and we have
> noticed a
> peculiar difference in the way poll() events are handled.
>
> In short, we have a process that monitors the lifetime of other processes=
.
> It
> does so by sharing a pipe between the parent and the child on every fork:
> read-end in the parent, write-end in the child. The pipe is not used to
> communicate; it's only used to poll() on the death of the child process.
>
> On Linux, poll() is called with a POLLHUP event and nothing else. When
> the child process dies, the poll() call returns with 'revents =3D=3D POLL=
HUP'.
>
> After some head scratching, we figured that on FreeBSD (12.1 and 12.2) if
> the
> child process died while the parent was not waiting in poll(), we would g=
et
> 'revents =3D=3D POLLHUP' on the next invocation of poll(), like we do on =
Linux.
> However, if the parent is in poll() when the child dies, the call to poll=
()
> never unblocks. This resulted in occasional hangs in the application.
>
> I am joining a reproducer [1].
>
>
> As indicated, changing the 'events' to 'POLLIN | POLLHUP' causes both eve=
nts
> to
> be delivered in both cases (child dies before/during calling poll()).
>
> The following excerpts of the FreeBSD, Linux, and Open Specification seem
> in agreement that passing POLLHUP is unnecessary as it is checked
> implicitly.
>
> FreeBSD (POLL(2))
>   This flag is always checked, even if not present in the events bitmask
> [...]
>
> Open Group:
>   This flag is only valid in the revents bitmask; it shall be ignored in =
the
>   events member.
>
> Linux (poll(2)):
>   Hang up (only  returned  in revents; ignored in events).
>
>
> I am surprised by the behaviour being different depending on the moment t=
he
> child process' death occurs.
>
> This is not a big deal for us to work-around, but I would like to know if=
 I
> should open a bug and try to fix it or if this is intentional (and perhap=
s
> documented?) behaviour.
>
> Thanks!
> J=C3=A9r=C3=A9mie Galarneau
>
> [1] https://gist.github.com/jgalar/5c3c2673b69fa0df652bda80a88f860c
>

Thanks for the detailed report with a reproducer.

pipe_poll checks for POLLIN | POLLRDNORM and POLLOUT | POLLWRNORM in
order to decide whether to add itself to the list of waiters. Since
you don't specify any of it and POLLHUP condition is not met, it
neglects to do anything but at the same time does not return any
events to poll itself. Then poll blocks waiting for wakeups which
never come since pipe_poll did not add us anywhere.

A trivial hack looks like this:
diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
index 239cf3bbdfb..59bc03e032a 100644
--- a/sys/kern/sys_pipe.c
+++ b/sys/kern/sys_pipe.c
@@ -1458,13 +1458,13 @@ pipe_poll(struct file *fp, int events, struct
ucred *active_cred,
        }

        if (revents =3D=3D 0) {
-               if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM)) {
+               if (fp->f_flag & FREAD) {
                        selrecord(td, &rpipe->pipe_sel);
                        if (SEL_WAITING(&rpipe->pipe_sel))
                                rpipe->pipe_state |=3D PIPE_SEL;
                }

-               if (fp->f_flag & FWRITE && events & (POLLOUT | POLLWRNORM))=
 {
+               if (fp->f_flag & FWRITE) {
                        selrecord(td, &wpipe->pipe_sel);
                        if (SEL_WAITING(&wpipe->pipe_sel))
                                wpipe->pipe_state |=3D PIPE_SEL;

With this in place the reproducer passes. I don't know yet if this is
just a pipe or general poll problem.

I don't know what the right fix is right now, may take few days. This
may or may not be a candidate for errata for the 12.2 release,
depending on how the extensive the real fix will turn out to be.

That said you may need to implement a workaround regardless of the
issue getting fixed.

Thanks,
--=20
Mateusz Guzik <mjguzik gmail.com>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAGudoHE1Y-e=9ukKdXu_upbnXvf523wYSnk5S2DJVX=2Sp_2HA>