Date:       Thu, 23 Mar 2006 05:10:20 GMT
From:       Bruce Evans <bde@zeta.org.au>
To:         freebsd-bugs@FreeBSD.org
Subject:    Re: kern/94772: FIFOs (named pipes) + select() == broken
Message-ID: <200603230510.k2N5AKax065735@freefall.freebsd.org>

The following reply was made to PR kern/94772; it has been noted by GNATS.

From: Bruce Evans <bde@zeta.org.au>
To: Oliver Fromme <olli@lurza.secnetix.de>
Cc: bug-followup@freebsd.org
Subject: Re: kern/94772: FIFOs (named pipes) + select() == broken
Date: Thu, 23 Mar 2006 16:02:54 +1100 (EST)

On Thu, 23 Mar 2006, Bruce Evans wrote:

> On Wed, 22 Mar 2006, Oliver Fromme wrote:
>> Oliver Fromme wrote:
>> > Bruce Evans wrote:

> I intended to check the behaviour for this in my test programs but don't
> seem to have done it.  I intended to follow Linux's behaviour even if this
> is nonstandard.  Linux used to have some special cases including a gripe
> in a comment about having to have them to match Sun's behaviour, but I
> couldn't find these when I last checked.  Perhaps the difference is
> precisely between select() and poll(), to follow the standard for select()
> and exploit the fuzziness for poll().

I added the check.  Linux-2.6.10 in fact acts as guessed above.  So the
check for select() is for the behaviour specified by POSIX (select() on a
read descriptor that is in nonblocking mode and is for a fifo that has
never had a writer returns success), while the check for poll() is for
exactly the opposite behaviour (poll() blocks instead of returning with
POLLIN set; the test actually uses a nonblocking poll() and only checks
that POLLIN is not set, since a test that poll() blocks would be messier
and I think I understand at least the FreeBSD implementation well enough
to know that this test is equivalent).

> I'll add tests for the O_NONBLOCK behaviour before mailing the
> test for poll().

First a small change to add it to the select() test:

%%%
--- select.c~	Sun Feb 12 23:42:30 2006
+++ select.c	Thu Mar 23 13:47:23 2006
@@ -30,7 +30,19 @@
 		err(1, "open for read");
 #endif
 
-	kill(ppid, SIGUSR1);
+	if (fd >= FD_SETSIZE)
+		errx(1, "fd = %d too large for select()", fd);
+
+#ifdef NAMEDPIPE
+	FD_ZERO(&rfds);
+	FD_SET(fd, &rfds);
+	tv.tv_sec = 0;
+	tv.tv_usec = 0;
+	if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
+		err(1, "select");
+	if (!FD_ISSET(fd, &rfds))
+		warnx("state 0: expected set; got clear");
+#endif
 
-	/* XXX should check that fd fits in rfds. */
+	kill(ppid, SIGUSR1);
 	usleep(1);
%%%

poll() test:

%%%
#include <sys/poll.h>
#include <sys/stat.h>

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

static pid_t cpid;
static pid_t ppid;
static volatile sig_atomic_t state;

static void
catch(int sig)
{
	state++;
}

#ifdef USE_POLLINIGNEOF
/*
 * FreeBSD's POLLINIGNEOF (which causes half of the bugs when the kernel
 * uses it) can be used to fix up the broken cases 3 and 6a if the kernel
 * uses it, i.e., for named pipes but not for pipes.  Note that the sense
 * of POLLINIGNEOF is reversed when passed to the kernel -- it means
 * don't-ignore-EOF in .events and if it is set there then it means
 * not-POLLHUP in .revents.
 *
 * This leaves the following broken cases:
 * state 6 (hangup but data available) for poll on a named pipe:
 *	should have POLLIN | POLLHUP, but have POLLIN only.  In this
 *	case, we don't try POLLINIGNEOF since the resulting pair of revents
 *	cannot be distinguished from the pair for a case in which POLLIN
 *	only is correct.
 * state 6a (hangup and no data available) for poll on a plain pipe:
 *	should have POLLHUP only, but have POLLIN | POLLHUP.  This is
 *	what I thought is correct, but it is not what Linux-2.6.10 does
 *	for named pipes.  FreeBSD's select() currently depends on POLLIN
 *	being set in this case, and Linux's select() acts the same as
 *	FreeBSD's select() in this case.
 * states 3 and 6a (hangup and no data available) for select on a named pipe:
 *	should have FD_SET() set as in old-FreeBSD and Linux-2.6.10, but
 *	have FD_SET() clear.  The POLLINIGNEOF changes just broke select()
 *	here.  So what was the PR (34020?) which inspired these changes
 *	about?  poll() only?  This regression test uses nonblocking mode
 *	for all polls and a timeout of 0 for all selects so that the
 *	kernel state can be seen without blocking for long.  I hope that
 *	the select() blocks iff the resulting .revents indicates that it
 *	should block (it shouldn't block if it would set POLLIN).
 */
int
mypoll(struct pollfd *fds, nfds_t nfds, int timeout)
{
	struct pollfd mypfd;
	int r;

	r = poll(fds, nfds, timeout);
	if (nfds != 1 || timeout != 0 || fds[0].revents & POLLIN)
		return (r);
	mypfd = fds[0];
	mypfd.events |= POLLINIGNEOF;
	r = poll(&mypfd, 1, 0);
	if (r >= 0) {
		if (mypfd.revents &= POLLIN) {
			mypfd.revents &= ~POLLIN;
			mypfd.revents |= POLLHUP;
		}
		fds[0].revents = mypfd.revents;
	}
	return (r);
}

#define	poll(fds, nfds, timeout)	mypoll((fds), (nfds), (timeout))
#endif

static void
child(int fd)
{
	struct pollfd pfd;
	char buf[256];

#ifdef NAMEDPIPE
	pfd.fd = open("p", O_RDONLY | O_NONBLOCK);
	if (pfd.fd < 0)
		err(1, "open for read");
#else
	pfd.fd = fd;
#endif
	pfd.events = POLLIN;

#ifdef NAMEDPIPE
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != 0)
		warnx("state 0: expected 0; got %#x", pfd.revents);
#endif

	kill(ppid, SIGUSR1);

	usleep(1);
	while (state != 1)
		;
#ifndef NAMEDPIPE
	/*
	 * The connection cannot be reestablished.  Use the code that delays
	 * the read until after the writer disconnects since that case is
	 * more interesting.
	 */
	state = 4;
	goto state4;
#endif
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != 0)
		warnx("state 1: expected 0; got %#x", pfd.revents);
	kill(ppid, SIGUSR1);

	usleep(1);
	while (state != 2)
		;
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != POLLIN)
		warnx("state 2: expected POLLIN; got %#x", pfd.revents);
	if (read(pfd.fd, buf, sizeof buf) != 1)
		err(1, "read");
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != 0)
		warnx("state 2a: expected 0; got %#x", pfd.revents);
	kill(ppid, SIGUSR1);

	usleep(1);
	while (state != 3)
		;
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != POLLHUP)
		warnx("state 3: expected POLLHUP; got %#x", pfd.revents);
	kill(ppid, SIGUSR1);

	/*
	 * Now we expect a new writer, and a new connection too since
	 * we read all the data.  The only new point is that we didn't
	 * start quite from scratch since the read fd is not new.  Check
	 * startup state as above, but don't do the read as above.
	 */
	usleep(1);
	while (state != 4)
		;
state4:
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != 0)
		warnx("state 4: expected 0; got %#x", pfd.revents);
	kill(ppid, SIGUSR1);

	usleep(1);
	while (state != 5)
		;
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != POLLIN)
		warnx("state 5: expected POLLIN; got %#x", pfd.revents);
	kill(ppid, SIGUSR1);

	usleep(1);
	while (state != 6)
		;
	/*
	 * Now we have no writer, but should still have data from the old
	 * writer.  Check that we have both a data condition and a hangup
	 * condition, and that we can read the data in the usual way.
	 * Since Linux does this, programs must not quit reading when they
	 * see POLLHUP; they must see POLLHUP without POLLIN (or another
	 * input condition) before they decide that there is EOF.  gdb-6.1.1
	 * is an example of a broken program that quits on POLLHUP only --
	 * see its event-loop.c.
	 */
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != (POLLIN | POLLHUP))
		warnx("state 6: expected POLLIN | POLLHUP; got %#x",
		    pfd.revents);
	if (read(pfd.fd, buf, sizeof buf) != 1)
		err(1, "read");
	if (poll(&pfd, 1, 0) < 0)
		err(1, "poll");
	if (pfd.revents != POLLHUP)
		warnx("state 6a: expected POLLHUP; got %#x", pfd.revents);

	close(pfd.fd);
	kill(ppid, SIGUSR1);
	exit(0);
}

static void
parent(int fd)
{
	usleep(1);
	while (state != 1)
		;
#ifdef NAMEDPIPE
	fd = open("p", O_WRONLY | O_NONBLOCK);
	if (fd < 0)
		err(1, "open for write");
#endif
	kill(cpid, SIGUSR1);

	usleep(1);
	while (state != 2)
		;
	if (write(fd, "", 1) != 1)
		err(1, "write");
	kill(cpid, SIGUSR1);

	usleep(1);
	while (state != 3)
		;
	if (close(fd) != 0)
		err(1, "close for write");
	kill(cpid, SIGUSR1);

	usleep(1);
	while (state != 4)
		;
#ifndef NAMEDPIPE
	return;
#endif
	fd = open("p", O_WRONLY | O_NONBLOCK);
	if (fd < 0)
		err(1, "open for write");
	kill(cpid, SIGUSR1);

	usleep(1);
	while (state != 5)
		;
	if (write(fd, "", 1) != 1)
		err(1, "write");
	kill(cpid, SIGUSR1);

	usleep(1);
	while (state != 6)
		;
	if (close(fd) != 0)
		err(1, "close for write");
	kill(cpid, SIGUSR1);

	usleep(1);
	while (state != 7)
		;
}

int
main(void)
{
	int fd[2];
	int i;

#ifdef NAMEDPIPE
	if (mkfifo("p", 0666) != 0 && errno != EEXIST)
		err(1, "mkfifo");
#endif
	signal(SIGUSR1, catch);
	ppid = getpid();
	for (i = 0; i < 2; i++) {
#ifndef NAMEDPIPE
		if (pipe(fd) != 0)
			err(1, "pipe");
#else
		fd[0] = -1;
		fd[1] = -1;
#endif
		state = 0;
		switch (cpid = fork()) {
		case -1:
			err(1, "fork");
		case 0:
			(void)close(fd[1]);
			child(fd[0]);
			break;
		default:
			(void)close(fd[0]);
			parent(fd[1]);
			break;
		}
	}
	return (0);
}
%%%

The error output of these is null under Linux-2.6.10, but under
FreeBSD-5.oldcurrent it is:

poll() on a nameless pipe:
% poll: state 6a: expected POLLHUP; got 0x11
% poll: state 6a: expected POLLHUP; got 0x11

No change for this.  For poll(), Linux consistently doesn't set POLLIN
when there is only null data, so we check for this.

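The warnx() lines print revents in hex, so 0x11 above is POLLIN | POLLHUP
and 0x1 is POLLIN alone, which matches the comments in the test.  As a
reading aid only, here is a minimal sketch of a decoder that turns a
revents value into flag names.  The name revents_str() and its interface
are hypothetical and not part of the test programs above; it knows only
the four flags listed in its table and ignores any other bits.

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original tests): write the names of
 * the flags set in revents into buf, separated by " | ".  Bits other than
 * the four named here are ignored.  Returns buf.
 */
static char *
revents_str(short revents, char *buf, size_t len)
{
	static const struct {
		short bit;
		const char *name;
	} flags[] = {
		{ POLLIN, "POLLIN" },
		{ POLLHUP, "POLLHUP" },
		{ POLLERR, "POLLERR" },
		{ POLLNVAL, "POLLNVAL" },
	};
	size_t i, used;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	used = 0;
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if ((revents & flags[i].bit) == 0)
			continue;
		/* snprintf() truncates safely if buf is too small. */
		snprintf(buf + used, len - used, "%s%s",
		    used != 0 ? " | " : "", flags[i].name);
		used = strlen(buf);
	}
	if (used == 0)
		snprintf(buf, len, "0");
	return (buf);
}
```

It uses the symbolic constants from <poll.h> rather than hard-coded
numbers, so the decoding does not depend on the bit values of any one
system.
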
poll() on a named pipe:
% pollp: state 3: expected POLLHUP; got 0
% pollp: state 6: expected POLLIN | POLLHUP; got 0x1
% pollp: state 6a: expected POLLHUP; got 0
% pollp: state 3: expected POLLHUP; got 0
% pollp: state 6: expected POLLIN | POLLHUP; got 0x1
% pollp: state 6a: expected POLLHUP; got 0

No change for this, except I didn't compile with POLLINIGNEOF used so
states 3 and 6a don't get fixed up.

select() on a nameless pipe:
<no output>

No change for this.  Here it doesn't matter if hangup is indicated by
POLLHUP or POLLIN | POLLHUP -- selscan() converts both to data-ready
although it's null data.

select() on a named pipe:
% selectp: state 0: expected set; got clear
% selectp: state 3: expected set; got clear
% selectp: state 6a: expected set; got clear
% selectp: state 0: expected set; got clear
% selectp: state 3: expected set; got clear
% selectp: state 6a: expected set; got clear

Now there is an extra failure for state 0.  Some complications will be
required to fix this without breaking poll() on a named pipe.  State 0 is
when the read descriptor is open with O_NONBLOCK and there has "never"
been a writer.  In this state, select() on the read descriptor must
succeed to conform to POSIX, but poll() on the read descriptor must block
to conform to Linux.  I think the Linux behaviour is what happens
naturally -- the socket isn't hung up so sopoll() won't set POLLHUP, and
there is no input so sopoll() won't set POLLIN, so sopoll() won't set any
flags in revents and poll() will block.  An extra flag seems to be
necessary to distinguish this state so that select() doesn't block.
POLLINIGNEOF was supposed to be this flag.

Bruce
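
The comment before state 6 in the poll() test says that a reader must not
treat POLLHUP alone as EOF while data may still be queued.  One way to
follow that rule is sketched below, under the assumption that the caller
simply wants to consume everything up to EOF.  The name drain_until_eof()
is hypothetical and not taken from the tests or from gdb; the function
lets read() returning 0 be the only EOF indication and never quits on
POLLHUP by itself.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original tests): read from fd until
 * end-of-file and return the number of bytes read, or -1 on error.
 * POLLHUP alone is never treated as EOF; only read() returning 0 is.
 */
static ssize_t
drain_until_eof(int fd)
{
	struct pollfd pfd;
	char buf[256];
	ssize_t n, total;

	assert(fd >= 0);
	pfd.fd = fd;
	pfd.events = POLLIN;
	total = 0;
	for (;;) {
		pfd.revents = 0;
		if (poll(&pfd, 1, -1) < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (pfd.revents & (POLLERR | POLLNVAL))
			return (-1);
		/* POLLIN, POLLHUP, or both: let read() decide. */
		n = read(fd, buf, sizeof(buf));
		if (n < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return (-1);
		}
		if (n == 0)
			return (total);	/* real EOF */
		total += n;
	}
}
```

With a writer that sends data and then closes, this returns the byte
count whether the kernel reports POLLIN | POLLHUP, POLLIN only, or
POLLHUP only at each step.  It does not help in the failing named-pipe
cases above where neither flag is set, since poll() then just blocks.
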