Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 5 Nov 2020 00:22:57 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        =?UTF-8?Q?J=C3=A9r=C3=A9mie_Galarneau?= <jeremie.galarneau@efficios.com>
Cc:        freebsd-hackers@freebsd.org, Michael Jeanson <mjeanson@efficios.com>
Subject:   Re: poll() POLLHUP behaviour inconsistency
Message-ID:  <CAGudoHEK04aXMvAsa4ufPfUUzX6RUJ2Uiv6K06RccGmR3=%2BV%2BA@mail.gmail.com>
In-Reply-To: <CA%2BjJMxuzRAkqRez9g%2B=9=kioHCboNDi1eF=iigcyWcKu1uBejg@mail.gmail.com>
References:  <CA%2BjJMxuBj7EsZ76zg%2B9_bjm46N50%2BF7UG7yGt4zXy0q3VZbGng@mail.gmail.com> <CAGudoHE1Y-e=9ukKdXu_upbnXvf523wYSnk5S2DJVX=2Sp_2HA@mail.gmail.com> <CA%2BjJMxuzRAkqRez9g%2B=9=kioHCboNDi1eF=iigcyWcKu1uBejg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 11/4/20, J=C3=A9r=C3=A9mie Galarneau <jeremie.galarneau@efficios.com> wr=
ote:
> On Tue, 3 Nov 2020 at 19:12, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>> On 11/4/20, J=C3=A9r=C3=A9mie Galarneau <jeremie.galarneau@efficios.com>=
 wrote:
>> > Hi,
>> >
>> > Michael and myself are porting code from Linux to FreeBSD and we have
>> > noticed a
>> > peculiar difference in the way poll() events are handled.
>> >
>> > In short, we have a process that monitors the lifetime of other
>> > processes.
>> > It
>> > does so by sharing a pipe between the parent and the child on every
>> > fork:
>> > read-end in the parent, write-end in the child. The pipe is not used t=
o
>> > communicate; it's only used to poll() on the death of the child
>> > process.
>> >
>> > On Linux, poll() is called with a POLLHUP event and nothing else. When
>> > the child process dies, the poll() call returns with 'revents =3D=3D
>> > POLLHUP'.
>> >
>> > After some head scratching, we figured that on FreeBSD (12.1 and 12.2)
>> > if
>> > the
>> > child process died while the parent was not waiting in poll(), we woul=
d
>> > get
>> > 'revents =3D=3D POLLHUP' on the next invocation of poll(), like we do =
on
>> > Linux.
>> > However, if the parent is in poll() when the child dies, the call to
>> > poll()
>> > never unblocks. This resulted in occasional hangs in the application.
>> >
>> > I am joining a reproducer [1].
>> >
>> >
>> > As indicated, changing the 'events' to 'POLLIN | POLLHUP' causes both
>> > events
>> > to
>> > be delivered in both cases (child dies before/during calling poll()).
>> >
>> > The following excerpts of the FreeBSD, Linux, and Open Specification
>> > seem
>> > in agreement that passing POLLHUP is unnecessary as it is checked
>> > implicitly.
>> >
>> > FreeBSD (POLL(2))
>> >   This flag is always checked, even if not present in the events
>> > bitmask
>> > [...]
>> >
>> > Open Group:
>> >   This flag is only valid in the revents bitmask; it shall be ignored =
in
>> > the
>> >   events member.
>> >
>> > Linux (poll(2)):
>> >   Hang up (only  returned  in revents; ignored in events).
>> >
>> >
>> > I am surprised by the behaviour being different depending on the momen=
t
>> > the
>> > child process' death occurs.
>> >
>> > This is not a big deal for us to work-around, but I would like to know
>> > if I
>> > should open a bug and try to fix it or if this is intentional (and
>> > perhaps
>> > documented?) behaviour.
>> >
>> > Thanks!
>> > J=C3=A9r=C3=A9mie Galarneau
>> >
>> > [1] https://gist.github.com/jgalar/5c3c2673b69fa0df652bda80a88f860c
>> >
>>
>> Thanks for the detailed report with a reproducer.
>>
>> pipe_poll checks for POLLIN | POLLRDNORM and POLLOUT | POLLWRNORM in
>> order to decide whether to add itself to the list of waiters. Since
>> you don't specify any of it and POLLHUP condition is not met, it
>> neglects to do anything but at the same time does not return any
>> events to poll itself. Then poll blocks waiting for wakeups which
>> never come since pipe_poll did not add us anywhere.
>>
>> A trivial hack looks like this:
>> diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
>> index 239cf3bbdfb..59bc03e032a 100644
>> --- a/sys/kern/sys_pipe.c
>> +++ b/sys/kern/sys_pipe.c
>> @@ -1458,13 +1458,13 @@ pipe_poll(struct file *fp, int events, struct
>> ucred *active_cred,
>>         }
>>
>>         if (revents =3D=3D 0) {
>> -               if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM)=
)
>> {
>> +               if (fp->f_flag & FREAD) {
>>                         selrecord(td, &rpipe->pipe_sel);
>>                         if (SEL_WAITING(&rpipe->pipe_sel))
>>                                 rpipe->pipe_state |=3D PIPE_SEL;
>>                 }
>>
>> -               if (fp->f_flag & FWRITE && events & (POLLOUT |
>> POLLWRNORM)) {
>> +               if (fp->f_flag & FWRITE) {
>>                         selrecord(td, &wpipe->pipe_sel);
>>                         if (SEL_WAITING(&wpipe->pipe_sel))
>>                                 wpipe->pipe_state |=3D PIPE_SEL;
>>
>> With this in place the reproducer passes. I don't know yet if this is
>> just a pipe or general poll problem.
>>
>> I don't know what the right fix is right now, may take few days. This
>> may or may not be a candidate for errata for the 12.2 release,
>> depending on how the extensive the real fix will turn out to be.
>>
>> That said you may need to implement a workaround regardless of the
>> issue getting fixed.
>
> Hi,
>
> Thanks for the great (and quick!) reply.
> Let me know if I can help with the fix in any way.
>

The fix landed in https://svnweb.freebsd.org/changeset/base/367352 in
the development branch and I'm going to merge it in few days to the
stable/12 -- that is it will be a part of future releases.

I don't intend to work on issuing an errata for 12.2 as the problem
seems to have an easy workaround. However, if fixing this is of
importance I can look into it.

--=20
Mateusz Guzik <mjguzik gmail.com>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAGudoHEK04aXMvAsa4ufPfUUzX6RUJ2Uiv6K06RccGmR3=%2BV%2BA>