From owner-freebsd-hackers@freebsd.org Wed Nov 4 00:12:13 2020 Return-Path: Delivered-To: freebsd-hackers@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 9D303448E0F for ; Wed, 4 Nov 2020 00:12:13 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CQnCJ64Zwz4fHR for ; Wed, 4 Nov 2020 00:12:12 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm1-x32e.google.com with SMTP id p22so872347wmg.3 for ; Tue, 03 Nov 2020 16:12:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=qdFE7B6HxXwRfV9VKWtC+niIHZHqjPPlR5eZFEdpNfQ=; b=XGR0lSx4Q+g0CmSaIV1xmnun+TaRhkedacu5mJy/ABxxjh1vLzxso6oDyUG7kuTfiH g6XPPXhv+0w0Wf8Ko2cJ8N7JtbXuRsH804PxSo1AD/A7UuAU97W0ftSuFNkJ2ka4OOfB vhX5aRjXpKGlxqtedpQif2pDjRnCgmla944zj6hTO6sL6b5F78Oxph7uIiNEHgB22wdy LRiC/QQs4olCAV1ou+JibbczeQbOAX249K22kMwDQ2I3PG/6uyO4aFkjY+KSTpzwhqKC VWzz3i4OEPyFwpipY6+w8Z8mLMuejTpsnv42oa1GGdDiM2JUZECCZF2tI9jifsnii7OK CMeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=qdFE7B6HxXwRfV9VKWtC+niIHZHqjPPlR5eZFEdpNfQ=; b=dsar3ruXjmf80+fK+u58U974lMXuBacYd7D8iuxIlghFaHLuz+8Azq7/08yxV46Gcp 6o5BxqNVzbk14kiCxNg4wv1pyKUvA0K5stW1yPIkNf3y/u0KwYEO+DFLjqusUn/MtGur lzm/s+RabdboBqZAbxKCq/27tBuAdT1qEt2M8Q4c3tFJd/hTdSOqEI65oRvm3oFp2sDe xjU3eAroV+bTq2r4Tyvna8g3f0FRpE1hbh3Y7oq+nEZocP3ajoyY9GeKmlCgmtO5IvN4 TQuBXZMiqa0eMk3/w1q6UOA558Tv96di0rhHRDasyS3Ovd38sP5WzB3ntbbiYoziinj2 LttQ== X-Gm-Message-State: AOAM531vQiCvdnUdSzpv7hc2zNtxo3TIJOGKGXeW1b7uDaqShjDyAg2o t9kUUjCqc9y+lzX8DIUSrneif0N/UUWOx5iPDZzSnyz97IA= X-Google-Smtp-Source: ABdhPJxrkNztjycDpTLt/1F7C9E04eNMdgfonY+dukc++Wix625HOsu8TXGdknGK9w4jrP3+IviYI6/DyQfcMGtM25M= X-Received: by 2002:a1c:2441:: with SMTP id k62mr1781400wmk.10.1604448730643; Tue, 03 Nov 2020 16:12:10 -0800 (PST) MIME-Version: 1.0 Received: by 2002:a5d:4c4f:0:0:0:0:0 with HTTP; Tue, 3 Nov 2020 16:12:09 -0800 (PST) In-Reply-To: References: From: Mateusz Guzik Date: Wed, 4 Nov 2020 01:12:09 +0100 Message-ID: Subject: Re: poll() POLLHUP behaviour inconsistency To: =?UTF-8?Q?J=C3=A9r=C3=A9mie_Galarneau?= Cc: freebsd-hackers@freebsd.org, Michael Jeanson Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 4CQnCJ64Zwz4fHR X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=XGR0lSx4; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of mjguzik@gmail.com designates 2a00:1450:4864:20::32e as permitted sender) smtp.mailfrom=mjguzik@gmail.com X-Spamd-Result: default: False [-3.18 / 15.00]; RCVD_TLS_ALL(0.00)[]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; FREEMAIL_FROM(0.00)[gmail.com]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-hackers@freebsd.org]; NEURAL_HAM_LONG(-1.03)[-1.031]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::32e:from]; NEURAL_HAM_SHORT(-0.15)[-0.148]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; RCVD_COUNT_TWO(0.00)[2]; MAILMAN_DEST(0.00)[freebsd-hackers]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Nov 2020 00:12:13 -0000 On 11/4/20, J=C3=A9r=C3=A9mie Galarneau wr= ote: > Hi, > > Michael and myself are porting code from Linux to FreeBSD and we have > noticed a > peculiar difference in the way poll() events are handled. > > In short, we have a process that monitors the lifetime of other processes= . > It > does so by sharing a pipe between the parent and the child on every fork: > read-end in the parent, write-end in the child. The pipe is not used to > communicate; it's only used to poll() on the death of the child process. > > On Linux, poll() is called with a POLLHUP event and nothing else. When > the child process dies, the poll() call returns with 'revents =3D=3D POLL= HUP'. > > After some head scratching, we figured that on FreeBSD (12.1 and 12.2) if > the > child process died while the parent was not waiting in poll(), we would g= et > 'revents =3D=3D POLLHUP' on the next invocation of poll(), like we do on = Linux. > However, if the parent is in poll() when the child dies, the call to poll= () > never unblocks. This resulted in occasional hangs in the application. > > I am joining a reproducer [1]. > > > As indicated, changing the 'events' to 'POLLIN | POLLHUP' causes both eve= nts > to > be delivered in both cases (child dies before/during calling poll()). > > The following excerpts of the FreeBSD, Linux, and Open Specification seem > in agreement that passing POLLHUP is unnecessary as it is checked > implicitly. > > FreeBSD (POLL(2)) > This flag is always checked, even if not present in the events bitmask > [...] > > Open Group: > This flag is only valid in the revents bitmask; it shall be ignored in = the > events member. > > Linux (poll(2)): > Hang up (only returned in revents; ignored in events). > > > I am surprised by the behaviour being different depending on the moment t= he > child process' death occurs. > > This is not a big deal for us to work-around, but I would like to know if= I > should open a bug and try to fix it or if this is intentional (and perhap= s > documented?) behaviour. > > Thanks! > J=C3=A9r=C3=A9mie Galarneau > > [1] https://gist.github.com/jgalar/5c3c2673b69fa0df652bda80a88f860c > Thanks for the detailed report with a reproducer. pipe_poll checks for POLLIN | POLLRDNORM and POLLOUT | POLLWRNORM in order to decide whether to add itself to the list of waiters. Since you don't specify any of it and POLLHUP condition is not met, it neglects to do anything but at the same time does not return any events to poll itself. Then poll blocks waiting for wakeups which never come since pipe_poll did not add us anywhere. A trivial hack looks like this: diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c index 239cf3bbdfb..59bc03e032a 100644 --- a/sys/kern/sys_pipe.c +++ b/sys/kern/sys_pipe.c @@ -1458,13 +1458,13 @@ pipe_poll(struct file *fp, int events, struct ucred *active_cred, } if (revents =3D=3D 0) { - if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM)) { + if (fp->f_flag & FREAD) { selrecord(td, &rpipe->pipe_sel); if (SEL_WAITING(&rpipe->pipe_sel)) rpipe->pipe_state |=3D PIPE_SEL; } - if (fp->f_flag & FWRITE && events & (POLLOUT | POLLWRNORM))= { + if (fp->f_flag & FWRITE) { selrecord(td, &wpipe->pipe_sel); if (SEL_WAITING(&wpipe->pipe_sel)) wpipe->pipe_state |=3D PIPE_SEL; With this in place the reproducer passes. I don't know yet if this is just a pipe or general poll problem. I don't know what the right fix is right now, may take few days. This may or may not be a candidate for errata for the 12.2 release, depending on how the extensive the real fix will turn out to be. That said you may need to implement a workaround regardless of the issue getting fixed. Thanks, --=20 Mateusz Guzik