Date: Wed, 11 Jan 2012 01:14:44 +0100 From: Giovanni Trematerra <gianni@freebsd.org> To: Bruce Evans <brde@optusnet.com.au> Cc: jilles@freebsd.org, Attilio Rao <attilio@freebsd.org>, flo@freebsd.org, Konstantin Belousov <kib@freebsd.org>, freebsd-arch@freebsd.org Subject: Re: pipe/fifo code merged. Message-ID: <CACfq0922=y20Bo2ekp82DTv2BTqD%2BbJEYppK0nkbkef-9CuKdA@mail.gmail.com> In-Reply-To: <20120110211510.T1676@besplex.bde.org> References: <CACfq093o9iVZKxCj58OR2hpCLDYTUTdxg_re_bEMYn2SrNrLCQ@mail.gmail.com> <20120110005155.S2378@besplex.bde.org> <CACfq09225iMYLe6p8jSiVhsDw_rqTyEHsvPdtZXLrQYT0-skzg@mail.gmail.com> <20120110153807.H943@besplex.bde.org> <20120110211510.T1676@besplex.bde.org>
index | next in thread | previous in thread | raw e-mail
On Tue, Jan 10, 2012 at 1:04 PM, Bruce Evans <brde@optusnet.com.au> wrote: > On Tue, 10 Jan 2012, Bruce Evans wrote: > >> I think you don't want me to read the patch, since I would see too much >> detail starting with style bugs. Anyway.. >> ... > > > One more set of details. > > % + PIPE_UNLOCK(rpipe); > % + if (fifo_iseof(fp)) > % + events |= POLLINIGNEOF; > % + PIPE_LOCK(rpipe); > > This is new code (needed to force POLLIGNEOF for fifos). > > It is a layering violation to call the fifo code for non-fifos here. > fifo_iseof() handles this internally by checking fp->vnode->v_fifoinfo. > The pipe layer should know if it is dealing with a fifo in a better way > than that. I fixed this in http://www.trematerra.net/patches/pipefifo_merge2.4.diff > > I don't like unlocking in the middle in general, and here it gives > races. We will miss setting POLLIN | POLLRDNORM for certain changes > if they weren't set earlier and the state changed while unlocked. Why > unlock anyway or lock in fifo_iseof()? Only fi_seqcount == fi_wgen > is checked under the lock there. Races in that check are probably > just as harmless as races here. And locking doesn't even prevent them, > since if fi_seqcount or fi_wgen can change underneath us, they can also > change just after we check them. They rarely change compared with the > buffer count raced with above. fixed that too as you suggest. > % @@ -1326,58 +1549,66 @@ pipe_poll(fp, events, active_cred, td) > % struct ucred *active_cred; > % struct thread *td; > % { > % - struct pipe *rpipe = fp->f_data; > % + struct pipeinfo *pip = fp->f_data; > % + struct pipe *rpipe; > % struct pipe *wpipe; > % int revents = 0; > % #ifdef MAC > % int error; > % #endif > % % - wpipe = rpipe->pipe_peer; > > % + rpipe = pip->pi_rpipe; > % + wpipe = pip->pi_wpipe->pipe_peer; > % PIPE_LOCK(rpipe); > % #ifdef MAC > % error = mac_pipe_check_poll(active_cred, rpipe->pipe_pair); > % if (error) > % - goto locked_error; > % + return (0); > > Seems to be broken. The unlock is now missing. fixed. > > % -static int > % -fifo_poll_f(struct file *fp, int events, struct ucred *cred, struct > thread *td) > % -{ > % - struct fifoinfo *fip; > % - struct file filetmp; > % - int levents, revents = 0; > % - > % - fip = fp->f_data; > % - levents = events & > % - (POLLIN | POLLINIGNEOF | POLLPRI | POLLRDNORM | POLLRDBAND); > % - if ((fp->f_flag & FREAD) && levents) { > % - filetmp.f_data = fip->fi_readsock; > % - filetmp.f_cred = cred; > % - mtx_lock(&fifo_mtx); > % - if (fp->f_seqcount == fip->fi_wgen) > % - levents |= POLLINIGNEOF; > % - mtx_unlock(&fifo_mtx); > % - revents |= soo_poll(&filetmp, levents, cred, td); > % - } > % - levents = events & (POLLOUT | POLLWRNORM | POLLWRBAND); > % - if ((fp->f_flag & FWRITE) && levents) { > % - filetmp.f_data = fip->fi_writesock; > % - filetmp.f_cred = cred; > % - revents |= soo_poll(&filetmp, levents, cred, td); > % - } > % - return (revents); > % -} > > This was reasonably clean. My version is cleaner: > - POLLIGNEOF is an old mistake of mine. I tried to kill it, but kib@ > propagated it to sys_pipe.c too, where it has survived another release > or two. In my version, I still have it in the call to soo_poll() but > don't have it in the `levents = events & ...' mask. Thus it is a > pure kernel flag, and acts the same as your isfifo flag -- it tells > the socket layer to do something unusual because this is a fifo. It > is not needed any more, since the pipe layer is close to the fifo > layer so it can just do something unusual. It can determine whether > the pipe is a fifo without passing around flags (the flag should be > in pipe_state). > - My version is missing the FREAD and FWRITE checks. These seem to be > necessary, but I think they don't belong at this level. Also, the > error handling for them seems quite broken (nonexistent). I think > POLLERR is supposed to be returned for attempts to poll for an > impossible condition, but the FREAD and FWRITE checks give a return > of 0. And returning 0 is much worse than returning success, since > it will cause at least poll() to block() when it should return, > Here is the commit that added these checks: > > % ---------------------------- > % revision 1.118 > % date: 2005/09/12 10:16:18; author: rwatson; state: Exp; lines: +2 -2 > % Only poll the fifo for read events if the fifo is attached to a readable > % file descriptor. Otherwise, the read end of a fifo might return that it > % is writable (which it isn't). > > But it should return (with POLLERR). This is an error condition and > should be detected. > > POSIX is fuzzy about this. It only says that POLLERR is for when an error > occurred. It defines the POLLNVAL error clearly as meaning that the fd > is invalid. Well that is not so clear. A non-open fd is clearly invalid. > This is handled in upper layers. Polling for a direction that can't work > can be considered as an invalid fd too, unless "invalid" has its technical > meaning. Linux-2.6.10 sets POLLERR for reading from a pipe or fifo with > no readers, and has an XXX comment saying that most Unices don't do this > for fifos. This seems wrong to me, and FreeBSD doesn't do it for any of > pipes, fifos or sockets. But for pipes, there is tricky EOF handling > associated with this condition. I can't see anywhere where Linux gives > this based on the open mode. > > % % Only poll the fifo for write events if the fifo attached to a writable > % file descriptor. Otherwise, the write end of a fifo might return that > % it is readable (which it isn't). > > Seems to be necessary too. I can't see anywhere where Linux returns > POLLERR for i/o errors or unwritable files. > > % % In the event that a file is FREAD|FWRITE (which is allowed by POSIX, but > % has undefined behavior), we poll for both. > % % MFC after: 3 days > % ---------------------------- > > select() is interestingly different than poll(). It can't return POLLERR. > Thus, the old broken behaviour gave the best close to possible behaviour > for select() at the usual level. The POLLERR's should make it return > success, and the false successes in the kernel would have done the same. > Only cases where there were no false successes in the kernel were broken. > > > I don't like defaults set by initializations in declarations 'revents > = 0'. Both the default and the return of 0 here seem to be wrong. > This is an error condition, so I think POLLERR should be returned, as > about. Otherwise, poll() will probably block. And the block is not > just transient, at least in the above since the error condition can > never go away. You will only be saved from blocking forever if there > is success on some other file descriptor or event. > > % #endif > % - if (events & (POLLIN | POLLRDNORM)) > % - if ((rpipe->pipe_state & PIPE_DIRECTW) || > % - (rpipe->pipe_buffer.cnt > 0)) > % - revents |= events & (POLLIN | POLLRDNORM); > % + if (fp->f_flag & FREAD) { > % + if (events & (POLLIN | POLLRDNORM)) > % + if ((rpipe->pipe_state & PIPE_DIRECTW) || > % + (rpipe->pipe_buffer.cnt > 0)) > % + revents |= events & (POLLIN | POLLRDNORM); > > The change in fifos_vnops.c was done cleanly by adding the FREAD check > to the events mask check. With fifos now polled here, it is needed > (modulo bugs) here too. But here it makes the important changes for > fifos, if any, unreadable by indenting everything. > > % - if (events & (POLLOUT | POLLWRNORM)) > % - if (wpipe->pipe_present != PIPE_ACTIVE || > % - (wpipe->pipe_state & PIPE_EOF) || > % - (((wpipe->pipe_state & PIPE_DIRECTW) == 0) && > % - ((wpipe->pipe_buffer.size - wpipe->pipe_buffer.cnt) >= > PIPE_BUF || > % - wpipe->pipe_buffer.size == 0))) > % - revents |= events & (POLLOUT | POLLWRNORM); > % + PIPE_UNLOCK(rpipe); > % + if (fifo_iseof(fp)) > % + events |= POLLINIGNEOF; > % + PIPE_LOCK(rpipe); > % % - if ((events & POLLINIGNEOF) == 0) { > % - if (rpipe->pipe_state & PIPE_EOF) { > % - revents |= (events & (POLLIN | POLLRDNORM)); > % - if (wpipe->pipe_present != PIPE_ACTIVE || > % - (wpipe->pipe_state & PIPE_EOF)) > % - revents |= POLLHUP; > % + if ((events & POLLINIGNEOF) == 0) { > % + if (rpipe->pipe_state & PIPE_EOF) { > % + revents |= (events & (POLLIN | POLLRDNORM)); > % + if (wpipe->pipe_present != PIPE_ACTIVE || > % + (wpipe->pipe_state & PIPE_EOF)) > % + revents |= POLLHUP; > % + } > % } > % } > % + if (fp->f_flag & FWRITE) > % + if (events & (POLLOUT | POLLWRNORM)) > % + if (wpipe->pipe_present != PIPE_ACTIVE || > % + (wpipe->pipe_state & PIPE_EOF) || % + > (((wpipe->pipe_state & PIPE_DIRECTW) == 0) && > % + ((wpipe->pipe_buffer.size - wpipe->pipe_buffer.cnt) >>= > % + PIPE_BUF || wpipe->pipe_buffer.size == 0))) > % + revents |= events & (POLLOUT | POLLWRNORM); > % % if (revents == 0) { > % - if (events & (POLLIN | POLLRDNORM)) { > % - selrecord(td, &rpipe->pipe_sel); > % - if (SEL_WAITING(&rpipe->pipe_sel)) > % - rpipe->pipe_state |= PIPE_SEL; > % - } > % + if (fp->f_flag & FREAD) > % + if (events & (POLLIN | POLLRDNORM)) { > % + selrecord(td, &rpipe->pipe_sel); > % + if (SEL_WAITING(&rpipe->pipe_sel)) > % + rpipe->pipe_state |= PIPE_SEL; > % + } > % % - if (events & (POLLOUT | POLLWRNORM)) { > % - selrecord(td, &wpipe->pipe_sel); > % - if (SEL_WAITING(&wpipe->pipe_sel)) > % - wpipe->pipe_state |= PIPE_SEL; > % - } > % + if (fp->f_flag & FWRITE) > % + if (events & (POLLOUT | POLLWRNORM)) { > % + selrecord(td, &wpipe->pipe_sel); > % + if (SEL_WAITING(&wpipe->pipe_sel)) > % + wpipe->pipe_state |= PIPE_SEL; > % + } > % } > % -#ifdef MAC > % -locked_error: > % -#endif > % PIPE_UNLOCK(rpipe); > % % return (revents); > > It seems that not much really changed here. To avoid indentation and > fix bugs, the FREAD and FWRITE checks should be done up front. I think > they can be done before locking and mac checking. Something like: > > if ((fp->f_flag & FREAD) && (events & (POLLIN | POLLRDNORM)) > return (POLLERR); > if ((fp->f_flag & FWRITE) && (events & (POLLOUT | POLLWRNORM)) > return (POLLERR); > if (events & POLLINIGNEOF) > return (POLLER); /* try to kill this too */ > > Since the diff for pipe_poll() was unreadable, here it is again with > the old lines removed. A few more problems are now obvious: > > % @@ -1326,58 +1549,66 @@ pipe_poll(fp, events, active_cred, td) > % struct ucred *active_cred; > % struct thread *td; > % { > % + struct pipeinfo *pip = fp->f_data; > % + struct pipe *rpipe; > % struct pipe *wpipe; > % int revents = 0; > % #ifdef MAC > % int error; > % #endif > % % + rpipe = pip->pi_rpipe; > % + wpipe = pip->pi_wpipe->pipe_peer; > % PIPE_LOCK(rpipe); > % #ifdef MAC > % error = mac_pipe_check_poll(active_cred, rpipe->pipe_pair); > % if (error) > % + return (0); > % #endif > % + if (fp->f_flag & FREAD) { > % + if (events & (POLLIN | POLLRDNORM)) > % + if ((rpipe->pipe_state & PIPE_DIRECTW) || > % + (rpipe->pipe_buffer.cnt > 0)) > % + revents |= events & (POLLIN | POLLRDNORM); > % > > This style bug (extra blank line) was common in old code. It helps make > the diffs unreadable too. > > > % % + if ((events & POLLINIGNEOF) == 0) { > % + if (rpipe->pipe_state & PIPE_EOF) { > % + revents |= (events & (POLLIN | POLLRDNORM)); > % + if (wpipe->pipe_present != PIPE_ACTIVE || > % + (wpipe->pipe_state & PIPE_EOF)) > % + revents |= POLLHUP; > % + } > > This is old code, reindented. It was not needed, since it used to > just check for the POLLINIGNEOF mistake in the user events. Now it > is needed to give the modified (POLLINIGNEOF) semantics from the kernel > flag for fifos. It is much uglier than the corresponding code in the > old fifo_poll_f(). That begins with putting the relevant user events > in levents. So that it doesn't have to repeat the long mask expressions. > Well, that's about the limits of the cleanups. Something like the > above is still needed to give the semantics change. > > The socket layer still has code that corresponds exactly to the above. > It is now not needed, since it now only supports the POLLINIGNEOF > mistake in the user events. One copy of this code is bad enough. > It seems to me that your concerns aren't related to the patch. I'll try to address them when the patch will be into the tree, -- Giannihome | help
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CACfq0922=y20Bo2ekp82DTv2BTqD%2BbJEYppK0nkbkef-9CuKdA>
