From: Jérémie Galarneau
Date: Wed, 4 Nov 2020 17:01:18 -0500
Subject: Re: poll() POLLHUP behaviour inconsistency
To: Mateusz Guzik
Cc: freebsd-hackers@freebsd.org, Michael Jeanson
List-Id: Technical Discussions relating to FreeBSD

On Tue, 3 Nov 2020 at 19:12, Mateusz Guzik wrote:
>
> On 11/4/20, Jérémie Galarneau wrote:
> > Hi,
> >
> > Michael and myself are porting code from Linux to FreeBSD and we have
> > noticed a peculiar difference in the way poll() events are handled.
> >
> > In short, we have a process that monitors the lifetime of other processes.
> > It does so by sharing a pipe between the parent and the child on every fork:
> > read-end in the parent, write-end in the child. The pipe is not used to
> > communicate; it's only used to poll() on the death of the child process.
> >
> > On Linux, poll() is called with a POLLHUP event and nothing else. When
> > the child process dies, the poll() call returns with 'revents == POLLHUP'.
> >
> > After some head scratching, we figured that on FreeBSD (12.1 and 12.2) if
> > the child process died while the parent was not waiting in poll(), we would
> > get 'revents == POLLHUP' on the next invocation of poll(), like we do on
> > Linux. However, if the parent is in poll() when the child dies, the call to
> > poll() never unblocks. This resulted in occasional hangs in the application.
> >
> > I am joining a reproducer [1].
> >
> > As indicated, changing the 'events' to 'POLLIN | POLLHUP' causes both events
> > to be delivered in both cases (child dies before/during calling poll()).
> >
> > The following excerpts of the FreeBSD, Linux, and Open Specification seem
> > in agreement that passing POLLHUP is unnecessary as it is checked
> > implicitly.
> >
> > FreeBSD (POLL(2))
> >   This flag is always checked, even if not present in the events bitmask
> >   [...]
> >
> > Open Group:
> >   This flag is only valid in the revents bitmask; it shall be ignored in the
> >   events member.
> >
> > Linux (poll(2)):
> >   Hang up (only returned in revents; ignored in events).
> >
> > I am surprised by the behaviour being different depending on the moment the
> > child process' death occurs.
> >
> > This is not a big deal for us to work-around, but I would like to know if I
> > should open a bug and try to fix it or if this is intentional (and perhaps
> > documented?) behaviour.
> >
> > Thanks!
> > Jérémie Galarneau
> >
> > [1] https://gist.github.com/jgalar/5c3c2673b69fa0df652bda80a88f860c
> >
>
> Thanks for the detailed report with a reproducer.
>
> pipe_poll checks for POLLIN | POLLRDNORM and POLLOUT | POLLWRNORM in
> order to decide whether to add itself to the list of waiters. Since
> you don't specify any of it and POLLHUP condition is not met, it
> neglects to do anything but at the same time does not return any
> events to poll itself. Then poll blocks waiting for wakeups which
> never come since pipe_poll did not add us anywhere.
>
> A trivial hack looks like this:
> diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
> index 239cf3bbdfb..59bc03e032a 100644
> --- a/sys/kern/sys_pipe.c
> +++ b/sys/kern/sys_pipe.c
> @@ -1458,13 +1458,13 @@ pipe_poll(struct file *fp, int events, struct ucred *active_cred,
>  	}
>
>  	if (revents == 0) {
> -		if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM)) {
> +		if (fp->f_flag & FREAD) {
>  			selrecord(td, &rpipe->pipe_sel);
>  			if (SEL_WAITING(&rpipe->pipe_sel))
>  				rpipe->pipe_state |= PIPE_SEL;
>  		}
>
> -		if (fp->f_flag & FWRITE && events & (POLLOUT | POLLWRNORM)) {
> +		if (fp->f_flag & FWRITE) {
>  			selrecord(td, &wpipe->pipe_sel);
>  			if (SEL_WAITING(&wpipe->pipe_sel))
>  				wpipe->pipe_state |= PIPE_SEL;
>
> With this in place the reproducer passes. I don't know yet if this is
> just a pipe or general poll problem.
>
> I don't know what the right fix is right now; it may take a few days. This
> may or may not be a candidate for errata for the 12.2 release,
> depending on how extensive the real fix will turn out to be.
>
> That said, you may need to implement a workaround regardless of the
> issue getting fixed.

Hi,

Thanks for the great (and quick!) reply.

Let me know if I can help with the fix in any way.

Jérémie

>
> Thanks,
> --
> Mateusz Guzik

-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com
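
For reference, the workaround mentioned in the thread (polling with 'POLLIN | POLLHUP' instead of POLLHUP alone) could look roughly like the sketch below. It is not the reproducer from [1]; the helper names spawn_monitored_child() and wait_for_hangup() are made up for illustration, and the child simply sleeps for a while before exiting. It assumes the pipe carries no data, as in the setup described above.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>	/* waitpid(), for callers that reap the child */
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that holds only the write end of a
 * pipe and exits after delay_ms milliseconds. Returns the read end for
 * the parent to monitor, or -1 on error. The child's pid is stored in
 * *child.
 */
static int spawn_monitored_child(int delay_ms, pid_t *child)
{
	int fds[2];

	assert(child != NULL);
	if (pipe(fds) == -1)
		return -1;

	*child = fork();
	if (*child == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (*child == 0) {
		/* Child: keep the write end open until exit. */
		close(fds[0]);
		poll(NULL, 0, delay_ms);	/* used here as a plain sleep */
		_exit(0);
	}

	/* Parent: drop the write end so only the child holds it. */
	close(fds[1]);
	return fds[0];
}

/*
 * Hypothetical helper: wait until the write end of the pipe is closed.
 * Returns 1 on hang-up, 0 on timeout (or if only data is readable),
 * -1 on error. POLLIN is requested explicitly as the workaround, so
 * the caller does not rely on POLLHUP being checked implicitly.
 */
static int wait_for_hangup(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLHUP };
	int ret;

	/* Retry if a signal interrupts poll(); the timeout restarts. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret == -1 && errno == EINTR);

	if (ret <= 0)
		return ret;
	return (pfd.revents & POLLHUP) ? 1 : 0;
}
```

A caller would obtain the descriptor with spawn_monitored_child(100, &pid), block in wait_for_hangup(fd, -1), and then reap the child with waitpid(). The check uses 'revents & POLLHUP' rather than 'revents == POLLHUP' because, as noted above, both events are delivered on FreeBSD once POLLIN is requested.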