Date: Thu, 12 Jan 2012 23:52:00 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Giovanni Trematerra <gianni@freebsd.org> Cc: flo@freebsd.org, Attilio Rao <attilio@freebsd.org>, Konstantin Belousov <kib@freebsd.org>, freebsd-arch@freebsd.org, jilles@freebsd.org Subject: Re: pipe/fifo code merged. Message-ID: <20120112221422.V1340@besplex.bde.org> In-Reply-To: <CACfq091z6wUG_ApKhfk%2BuDw7=te6ucjXQnkzWEjGnR0pAUzbCg@mail.gmail.com> References: <CACfq093o9iVZKxCj58OR2hpCLDYTUTdxg_re_bEMYn2SrNrLCQ@mail.gmail.com> <20120110005155.S2378@besplex.bde.org> <CACfq09225iMYLe6p8jSiVhsDw_rqTyEHsvPdtZXLrQYT0-skzg@mail.gmail.com> <20120110153807.H943@besplex.bde.org> <CACfq091AT7OP6Bxd3g5me8=HR%2BykUrVA4zEaBw8qtBxb9Ue1fQ@mail.gmail.com> <CACfq091z6wUG_ApKhfk%2BuDw7=te6ucjXQnkzWEjGnR0pAUzbCg@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 12 Jan 2012, Giovanni Trematerra wrote: > thanks again to point me out those regression tests I missed in first place. > I ran those tests and results were identical with patched and non > patched kernel. > So at least in that regard the patch doesn't introduce more regressions. > There are some tests that fail. If you and others think it's worth to > fix them I can > take this on. Matching the old behaviour is good enough for now. > ... > STOCK KERNEL 10.0-CURRENT > > [gianni@devbox: poll]% ./pipepoll > 1..20 > ok 1 Pipe state 4: expected 0; got 0 > ok 2 Pipe state 5: expected POLLIN; got POLLIN > ok 3 Pipe state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP > not ok 4 Pipe state 6a: expected POLLHUP; got POLLIN | POLLHUP > ok 5 Sock state 4: expected 0; got 0 > ok 6 Sock state 5: expected POLLIN; got POLLIN > ok 7 Sock state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP > not ok 8 Sock state 6a: expected POLLHUP; got POLLIN | POLLHUP > ok 9 FIFO state 0: expected 0; got 0 > ok 10 FIFO state 1: expected 0; got 0 > ok 11 FIFO state 2: expected POLLIN; got POLLIN > ok 12 FIFO state 2a: expected 0; got 0 > not ok 13 FIFO state 3: expected POLLHUP; got POLLIN | POLLHUP > ok 14 FIFO state 4: expected 0; got 0 > ok 15 FIFO state 5: expected POLLIN; got POLLIN > ok 16 FIFO state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP > not ok 17 FIFO state 6a: expected POLLHUP; got POLLIN | POLLHUP Returning spurious POLLIN with POLLHUP (when there is no input available) is not too serious. It means that applications can't trust POLLIN alone. They must also check POLLHUP (which they should check anyway) and then if both are set they are reduced to read()ing the file to see if there is any input, much like they have to do if they use select() instead of poll() (since select() cannot distiginguish these conditions). Applications that have been converted from poll() to select() should still do the right thing by still using read(). However, gdb was broken by this conversion: echo 'p 0' | gdb /bin/cat This prints "Hangup detected on fd 0\n", then "error detected on stdin", then exits with status 0. It never sees its input. It should do the same thing as for normal input of 'p 0\n<terminal EOF>' -- that is see the input and execute it, then see the EOF and not complain about it (twice), then exit with status 0. This is because it thinks that POLLHUP implies that there is no more input. It doesn't even check POLLIN when it sees POLLHUP. If it checked, then it would have to worry about spurious POLLIN, but here the problem is the opposite -- there is non-spurious POLLIN, and non-spurions POLLHUP. POLLHUP must be acted on immediately when the kernel detected it, to unblock poll() or select(), even when there is unread input, since otherwise any unread input would leave the poll blocked forever. > not ok 18 FIFO state 6b: poll result 0 expected 1. expected POLLHUP; got 0 This one is more interesting and delicate. Linux fails in the same way as FreeBSD here. 6b through 6d test that if there is a hackup condition, then new and old readers all see it.... > not ok 19 FIFO state 6c: expected POLLHUP; got POLLIN | POLLHUP > not ok 20 FIFO state 6d: expected POLLHUP; got POLLIN | POLLHUP Back to spurious POLLIN. ... 6b tests a new reader and fails because the new reader doesn't see the hangup condition. 6c tests that the old reader still sees the hangup condition after the open for the new reader has (possibly and erroneously) cleared it. This passes. 6d tests that the old reader still sees the hangup condition after the new reader has gone away. This passes too. Note that it is only possibly to get a new reader by opening with O_NONBLOCK. Otherwise, the open for the new reader blocks so the difference between never having had a "connection" and having a hangup condition for a previous connection cannot be seen. Some users want the new open to not see the hangup so that poll on it blocks waiting for a new connection. I want it to see the hangup condition so that the condition only depends on the "file" state and not on the timing of the opens. Note that the hangup condition is cleared when the last reader that can see it goes away. Thus there is only a difference in unusual cases. There are 2 main types of unusual cases: - when there are races between herds of readers and writers. Now it seems best to not have the hangup condition depend so much on the timing. - when some readers intentionally don't go away on hangup, but wait for a new connection. Such waiting is difficult either way. Sticky hangup prevents blocking in poll on the old fd's. Now it seems best for new readers to not see the old hangups. My preferred behviour prevents this when the old readers don't go away. They might stay either because you want them to, or because you you can't control them and their owner wants them to. But very complicated setups that intentionally go near the unusual cases probably need external synchronization to avoid races and access control to prevent uncontrolled accesses changing the fifo state and eating the i/o. It is not just hangup that involves device state. > [gianni@devbox: poll]% ./pipeselect > 1..20 > ok 1 Pipe state 4: expected clear; got clear > ok 2 Pipe state 5: expected set; got set > ok 3 Pipe state 6: expected set; got set > ok 4 Pipe state 6a: expected set; got set > ok 5 Sock state 4: expected clear; got clear > ok 6 Sock state 5: expected set; got set > ok 7 Sock state 6: expected set; got set > ok 8 Sock state 6a: expected set; got set > not ok 9 FIFO state 0: expected set; got clear Everything except #9 for select() gives the expected results in my version. State 0 is just the state after initial open in O_RDONLY mode. See the large comment about this in pipeselect.c (it says that select() must see POLLIN although poll() must not). Perhaps the comment is wrong or out of date. > ok 10 FIFO state 1: expected clear; got clear > ok 11 FIFO state 2: expected set; got set > ok 12 FIFO state 2a: expected clear; got clear > ok 13 FIFO state 3: expected set; got set > ok 14 FIFO state 4: expected clear; got clear > ok 15 FIFO state 5: expected set; got set > ok 16 FIFO state 6: expected set; got set > ok 17 FIFO state 6a: expected set; got set > not ok 18 FIFO state 6b: expected set; got clear > ok 19 FIFO state 6c: expected set; got set > ok 20 FIFO state 6d: expected set; got set select() passes all except #9 and #18 because it can't distinguish the spurious POLLIN from POLLHUP. I added some tests, mainly for POLLOUT. Unfortunately, this patch won't apply cleanly, because -current has some changes that I haven't merged. I forget if the kernel needs any changes to pass these. Not many anyway. Output tests and the changes in -current are missing for pipeselect.c. % Index: pipepoll.c % =================================================================== % RCS file: /home/ncvs/src/tools/regression/poll/pipepoll.c,v % retrieving revision 1.1 % diff -u -2 -r1.1 pipepoll.c % --- pipepoll.c 12 Jul 2009 12:50:43 -0000 1.1 % +++ pipepoll.c 25 Aug 2009 13:58:28 -0000 % @@ -30,4 +30,7 @@ % result = "POLLIN"; % break; % + case POLLOUT: % + result = "POLLOUT"; % + break; % case POLLHUP: % result = "POLLHUP"; % @@ -36,4 +39,13 @@ % result = "POLLIN | POLLHUP"; % break; % + case POLLIN | POLLOUT: % + result = "POLLIN | POLLOUT"; % + break; % + case POLLIN | POLLOUT | POLLHUP: % + result = "POLLIN | POLLOUT | POLLHUP"; % + break; % + case POLLOUT | POLLHUP: % + result = "POLLOUT | POLLHUP"; % + break; % default: % asprintf(&ncresult, "%#x", events); % @@ -81,10 +93,10 @@ % } % pfd.fd = fd; % - pfd.events = POLLIN; % + pfd.events = POLLIN | POLLOUT; % % if (filetype == FT_FIFO) { % if (poll(&pfd, 1, 0) < 0) % err(1, "poll"); % - report(num++, "0", 0, pfd.revents); % + report(num++, "0", POLLOUT, pfd.revents); % } % kill(ppid, SIGUSR1); Note that this is buggy. The fifo is still opened O_RDONLY, so in -current poll() returns 0, though it should probably return POLLERR (see previous mail). IIRC, this test for POLLOUT passes under Linux. This verifies that Linux doesn't check the open mode for output. The tests could be expanded to check intentionally that silly combinations of flags and open modes don't work. Except for the above, they apparently avoid the silly combinations, else the change to check the open mode would have caused more failures. % @@ -104,5 +116,5 @@ % if (poll(&pfd, 1, 0) < 0) % err(1, "poll"); % - report(num++, "1", 0, pfd.revents); % + report(num++, "1", POLLOUT, pfd.revents); % kill(ppid, SIGUSR1); % % @@ -112,10 +124,10 @@ % if (poll(&pfd, 1, 0) < 0) % err(1, "poll"); % - report(num++, "2", POLLIN, pfd.revents); % + report(num++, "2", POLLIN | POLLOUT, pfd.revents); % if (read(fd, buf, sizeof buf) != 1) % err(1, "read"); % if (poll(&pfd, 1, 0) < 0) % err(1, "poll"); % - report(num++, "2a", 0, pfd.revents); % + report(num++, "2a", POLLOUT, pfd.revents); % kill(ppid, SIGUSR1); % % @@ -140,5 +152,5 @@ % if (poll(&pfd, 1, 0) < 0) % err(1, "poll"); % - report(num++, "4", 0, pfd.revents); % + report(num++, "4", POLLOUT, pfd.revents); % kill(ppid, SIGUSR1); % % @@ -148,5 +160,5 @@ % if (poll(&pfd, 1, 0) < 0) % err(1, "poll"); % - report(num++, "5", POLLIN, pfd.revents); % + report(num++, "5", POLLIN | POLLOUT, pfd.revents); % kill(ppid, SIGUSR1); % % @@ -193,7 +205,16 @@ % report(num++, "6d", POLLHUP, pfd.revents); % } % - close(fd); % kill(ppid, SIGUSR1); % % + if (filetype != FT_FIFO) % + close (fd); % + else { % + usleep(1); % + while (state != 7) % + ; % + close(fd); % + kill(ppid, SIGUSR1); % + } % + % exit(0); % } % @@ -202,4 +223,6 @@ % parent(int fd) % { % + struct pollfd pfd; % + % usleep(1); % while (state != 1) % @@ -210,4 +233,10 @@ % err(1, "open for write"); % } % + pfd.fd = fd; % + pfd.events = POLLIN | POLLOUT; % + % + if (poll(&pfd, 1, 0) < 0) % + err(1, "poll"); % + report(-1, "1p", POLLOUT, pfd.revents); % kill(cpid, SIGUSR1); % % @@ -253,4 +282,19 @@ % while (state != 7) % ; % + fd = open(FIFONAME, O_WRONLY | O_NONBLOCK); % + if (fd < 0) % + err(1, "open for write"); % + pfd.fd = fd; % + if (poll(&pfd, 1, 0) < 0) % + err(1, "poll"); % + report(-1, "7", POLLOUT, pfd.revents); % + kill(cpid, SIGUSR1); % + % + usleep(1); % + while (state != 8) % + ; % + if (poll(&pfd, 1, 0) < 0) % + err(1, "poll"); % + report(-1, "8", POLLHUP, pfd.revents); % } % Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120112221422.V1340>