From owner-freebsd-bugs@FreeBSD.ORG Thu Mar 23 10:10:18 2006 Return-Path: X-Original-To: freebsd-bugs@hub.freebsd.org Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BCE4216A400 for ; Thu, 23 Mar 2006 10:10:18 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 311E543D48 for ; Thu, 23 Mar 2006 10:10:18 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k2NAAICD084888 for ; Thu, 23 Mar 2006 10:10:18 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.4/8.13.4/Submit) id k2NAAH71084887; Thu, 23 Mar 2006 10:10:17 GMT (envelope-from gnats) Date: Thu, 23 Mar 2006 10:10:17 GMT Message-Id: <200603231010.k2NAAH71084887@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: Oliver Fromme Cc: Subject: Re: kern/94772: FIFOs (named pipes) + select() == broken X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Oliver Fromme List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Mar 2006 10:10:18 -0000 The following reply was made to PR kern/94772; it has been noted by GNATS. From: Oliver Fromme To: bde@zeta.org.au (Bruce Evans) Cc: bug-followup@freebsd.org Subject: Re: kern/94772: FIFOs (named pipes) + select() == broken Date: Thu, 23 Mar 2006 11:08:50 +0100 (CET) Hi, I'm answering on several emails at once, to make things simpler. Bruce Evans wrote: > Oliver Fromme wrote: > > Bruce Evans wrote: > > > Here is a program that tests more cases. I made it give no output > > > ... > > > > It does not produce any output on Solaris 9, NetBSD 3.0, > > DEC UNIX 4.0D and Linux 2.4.32. (I had to replace signal() > > with sigset() on Solaris, add a few missing #includes and > > write small replacements for err() and warnx().) > > I thought that the signal() was portable. Unfortunately, it's not. SysV (e.g. Solaris) has different semantics: When the signal handler is executed, the signal's disposition is set to SIG_DFL. That means that the handler is only executed once, unless you call signal() again. The solution is to use sigset() which behaves more like BSD's signal(). On the other hand, FreeBSD doesn't know sigset() at all. On Linux, the situation is even more complex: When using libc4 or libc5, signal() has SysV semantics, and when using glibc2, it has BSD semantics. However, when using glibc2 with -D_XOPEN_SOURCE=500, it's again SysV, and in this latter case sigset() is defined in the header file (not in the other cases). Bottom line: For portable programs, neither signal() nor sigset() should be used. Instead, sigaction() should be used, which behaves the same on BSD and SysV, and should be supported everywhere. > Under FreeBSD, > of all things is the only missing include. FreeBSD generally seems to require less includes than the standard says. I had to add , , and (although the latter two probably only because of my err() and warnx() replacements). > I stopped trying to avoid > using the err() family in test programs when Linux got them 6-8 years > ago. Yes, but Solaris and DEC UNIX (and probably other commercial UNIX systems) don't have them. Fortunately, it was easy to write replacements in this case, because they were only called with single constant strings. > > > (By the way, DEC UNIX 4.0D _does_ have a bug: If the FIFO > > > has O_NONBLOCK set and no writer has opened the FIFO, then > > > select() doesn't block. > > > > Actually, it's not a bug. I've read SUSv3 wrong. That > > behaviour is perfectly fine. In fact, SUSv3 (a.k.a. > > POSIX-2001) requires that select() doesn't block in that > > case, and the behaviour of select() and poll() must be > > independet of whether O_NONBLOCK is set or not. > > I have tried to find POSIX saying that many times since I think > it is the correct behaviour, but I couldn't find it for either > select() or poll() before today. Now I can find it for [p]select() > but not for poll(). From POSIX.1-2001-draft7.txt for pselect(): > > %%% > 31193 A descriptor shall be considered ready for reading when a call to an input function with > 31194 O_NONBLOCK clear would not block, whether or not the function would transfer data > 31195 successfully. (The function might return data, an end-of-file indication, or an error other than > 31196 one indicating that it is blocked, and in each of these cases the descriptor shall be considered > 31197 ready for reading.) > %%% I've got SUSv3 a.k.a. IEEE Std 1003.1-2001 ("POSIX"). You can download it from The Open Group's website (you have to register with them, but it's free). However, I don't know how much it differs from the draft that you have. The above paragraph from the select() spec seems to be the same. > Other parts of POSIX make it clear that O_NONBLOCK reads must never block, That's right, but it does not matter for select()/poll(). > so if O_NONBLOCK is set then pselect() for read must never block either. No, I think that's not right. The standard clearly says that select() should always behave as if O_NONBLOCK was not set: "A descriptor shall be considered ready for reading when a call to an input function with O_NONBLOCK clear would not block". For poll there is a similar statement which is even clearer: "The poll() function shall not be affected by the O_NONBLOCK flag." Therefore: select() and poll() are not dependent on the O_NONBLOCK flag. They should always behave as if it was not set. Furthermore, the standard says a few things about the read() function when used on (nameless) pipes or FIFOs: [quote begin] When attempting to read from an empty pipe or FIFO: * If no process has the pipe open for writing, read() shall return 0 to indicate end-of-file. * If some process has the pipe open for writing and O_NONBLOCK is set, read() shall return -1 and set errno to [EAGAIN]. * If some process has the pipe open for writing and O_NONBLOCK is clear, read() shall block the calling thread until some data is written or the pipe is closed by all processes that had the pipe open for writing. [quote end] That clearly means that select() should _not_ block when no process has the FIFO open for writing. (Because the select() behaviour depends on the behaviour of read() as if the O_NONBLOCK flag is clear.) Furthermore, it als means that it does _not_ matter if there was a a writer previously or not. > > - if (events & (POLLOUT | POLLWRNORM)) > > - if (sowriteable(so)) > > - revents |= events & (POLLOUT | POLLWRNORM); > > + if (events & (POLLOUT | POLLWRNORM) && sowriteable(so)) > > + revents |= events & (POLLOUT | POLLWRNORM); > > + else { > > + /* > > + * POLLOUT and POLLHUP shall not both be set. > > + * Therefore check only for POLLHUP if POLLOUT > > + * has not been set. (Note that POLLHUP need > > + * not be in events; it's always checked.) > > + */ > > + if (so->so_rcv.sb_state & SBS_CANTRCVMORE && > > + so->so_rcv.sb_cc == 0) > > + revents |= POLLHUP; > > + } > > I think SBS_CANTSENDMORE in so_snd should be checked here. Agreed. > I think the receiver count shouldn't be checked here. Agreed. That would handle the case correctly where both POLLIN and POLLHUP can be set at the same time. > I'm surprised that > my test succeeds with this -- doesn't it prevent POLLHUP being set in the > hangup+ case? Yes, I think it prevents that (i.e. POLLHUP would act more like a "POLLEOF"). That's not correct behaviour, of course. I'll fix that. > This might be clearer with SBS_CANTSENDMORE checked first. > SBS_CANTSENDMORE set implies !sowriteable() so the behaviour is the same, > and I think it is clearer to not even look at the output bits in > `events' in the hangup case. So you mean in the SBS_CANTSENDMORE case, POLLHUP should be set without checking if the caller has requested POLLOUT in the events mask? That sounds reasonable, because POLLOUT certainly can't be returned in that case. It makes the code more complex, though. I'll have a look at that and try to implement it that way. > This also fixes poll() on sockets. Sockets are more often used than named > pipes so the change needs a few weeks of testing before MFC. I see. Bruce Evans wrote: > Bruce Evans wrote: > > I intened to check the behaviour for this in my test programs but don't > > seem to have done it. I intended to follow Linux's behaviour even if this > > is nonstandard. Linux used to have some special cases including a gripe > > in a comment about having to have them to match Sun's behaviour, but I > > couldn't find these when I last checked. Perhaps the difference is > > precisely between select() and poll(), to follow the standard for select() > > and exploit the fuzziness for poll(). > > I added the check. I'll try that later today. (At least I hope to have enough time for it.) > select() on a named pipe: > % selectp: state 0: expected set; got clear > [...] > Now there is an extra failure for state 0. Some complications will be > required to fix this without breaking poll() on named pipe. State 0 is > when the read descriptor is open with O_NONBLOCK and there has "never" > been a writer. In this state, select() on the read descriptor must > succeed to conform to POSIX, but poll() on the read descriptor must > block to conform to Linux. I think the Linux behaviour is what happens > naturally -- the socket isn't hung up so sopoll() won't set POLLHUP, Now that might be debatable. SUSv3 says that POLLHUP means that the device is disconnected. That doesn't sound like it should make a difference if there was a previous writer or not. In fact, when I open a FIFO which doesn't have a writer currently, there's no way to know if there was a writer previously (before I opened the FIFO) who "hung it up". Personally I think that Linux is in error. POLLHUP should be set when "the device is disconnected" (SUSv3), i.e. when there is no writer, period. However, I see your point that it might be more beneficial to be Linux-compliant instead of standard-compliant. > and there is no input so sopoll() won't set POLLIN, so sopoll() won't > set any flags in revents and poll() will block. An extra flag seems to > be necessary to distinguish this state so that select() doesn't block. Yes, if we want to be Linux-compliant. That'll make the code a lot more complicated. *sigh* Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd Any opinions expressed in this message may be personal to the author and may not necessarily reflect the opinions of secnetix in any way. "The ITU has offered the IETF formal alignment with its corresponding technology, Penguins, but that won't fly." -- RFC 2549