From owner-freebsd-bugs@FreeBSD.ORG Wed Jun 18 21:20:07 2003
Date: Wed, 18 Jun 2003 21:20:05 -0700 (PDT)
Message-Id: <200306190420.h5J4K5ZY023158@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Bruce Evans
Subject: Re: kern/53447: poll(2) semantics differ from susV3/POSIX

The following reply was made to PR kern/53447; it has been noted by GNATS.

From: Bruce Evans
To: "Artem 'Zazoobr' Ignatjev"
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: kern/53447: poll(2) semantics differ from susV3/POSIX
Date: Thu, 19 Jun 2003 14:18:12 +1000 (EST)

On Wed, 18 Jun 2003, Artem 'Zazoobr' Ignatjev wrote:

> clemens fischer wrote:
> > ...
> > Mhh, then this is apparently a problem with BSD poll() semantics.
> >
> > poll is expected to set the POLLHUP bit on EOF, but FreeBSD
> > apparently does not, but signals POLLIN and then returns 0 on
> > read(). Is someone involved with the FreeBSD crowd and can post a
> > bug report for this?
> >
>
> FreeBSD DOES set POLLHUP bit; but, also, EOF on pipe or disconnected
> socket can be caught by reading 0 bytes from ready-to-read descriptor.

The latter is very standard (required by POSIX). Whether POLLIN should be
set together with POLLHUP for EOF is not so clear. It is permitted by
POSIX and seems least surprising, so FreeBSD does it. POSIX mainly
requires POLLOUT and POLLHUP to not both be set. This all goes naturally
with read(), write() and select() semantics: for most types of files
including pipes, read() returns 0 with no error on EOF, and select() has
no standard way to select on EOF, so reading works best if EOF satisfies
POLLIN. OTOH write() returns -1 and a nonzero errno (EPIPE for pipes) on
EOF, and write-selects on pipes (if not the whole process) normally get
terminated by SIGPIPE, so select()'s lack of understanding of EOF is less
of a problem for writes than for reads.

POLLHUP is more broken for named pipes and sockets than for nameless
pipes. It seems to be unimplemented, and FreeBSD may have broken POLLHUP
for all types of EOFs by making poll() and select() for reading always
block waiting for a writer if there isn't one (and there is no data).
Other systems apparently handle initial EOFs (ones where the open() was
nonblocking and there was no writer at open time and none since)
specially, but POSIX doesn't seem to mention any special handling for
initial EOFs, and handling all EOFs like this makes it harder to detect
them.
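
The read/write asymmetry described above can be checked from userland. The
following is only a minimal sketch, assuming an anonymous pipe made with
pipe(2); the helper names poll_once(), read_at_eof() and
write_errno_after_reader_close() are made up for illustration and are not
from the PR. Which of POLLIN and POLLHUP the reader sees at EOF differs
between systems, so the sketch only records revents and relies on read()
returning 0.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: one non-blocking poll() on fd; returns revents, or -1 on error. */
static short poll_once(int fd, short events)
{
    struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };

    if (poll(&pfd, 1, 0) < 0)
        return -1;
    return pfd.revents;
}

/*
 * Read side: close the only writer, then poll and read.
 * Stores what poll() reported in *revents_out and returns read()'s result,
 * which POSIX requires to be 0 (EOF, no error) for an empty pipe.
 */
static ssize_t read_at_eof(short *revents_out)
{
    int fds[2];
    char c;
    ssize_t n;

    assert(revents_out != NULL);
    if (pipe(fds) != 0)
        return -1;
    close(fds[1]);                      /* last writer is gone */
    *revents_out = poll_once(fds[0], POLLIN);
    n = read(fds[0], &c, 1);
    close(fds[0]);
    return n;
}

/*
 * Write side: close the only reader, then write.
 * SIGPIPE is ignored (process-wide) so that write() fails with an errno
 * instead of the process being terminated. Returns that errno, 0 if the
 * write unexpectedly succeeded, or -1 if the pipe could not be created.
 */
static int write_errno_after_reader_close(void)
{
    int fds[2];
    int err = 0;

    if (pipe(fds) != 0)
        return -1;
    signal(SIGPIPE, SIG_IGN);
    close(fds[0]);                      /* last reader is gone */
    if (write(fds[1], "x", 1) < 0)
        err = errno;
    close(fds[1]);
    return err;
}
```

Calling read_at_eof() and then inspecting the stored revents shows whether a
given system reports POLLIN, POLLHUP or both at EOF; the read() result should
be 0 either way.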
> See the code below (it's /sys/kern/sys_pipe.c 1.60.2.13, used in FreeBSD
> 4.8-RELEASE):
> int
> pipe_poll(fp, events, cred, p)
> 	struct file *fp;
> 	int events;
> 	struct ucred *cred;
> 	struct proc *p;
> {
> 	struct pipe *rpipe = (struct pipe *)fp->f_data;
> 	struct pipe *wpipe;
> 	int revents = 0;
>
> 	wpipe = rpipe->pipe_peer;
> 	if (events & (POLLIN | POLLRDNORM))
> 		if ((rpipe->pipe_state & PIPE_DIRECTW) ||
> 		    (rpipe->pipe_buffer.cnt > 0) ||
> >		    (rpipe->pipe_state & PIPE_EOF))
> >			revents |= events & (POLLIN | POLLRDNORM);
>
> 	if (events & (POLLOUT | POLLWRNORM))
> 		if (wpipe == NULL || (wpipe->pipe_state & PIPE_EOF) ||
> 		    (((wpipe->pipe_state & PIPE_DIRECTW) == 0) &&
> 		    (wpipe->pipe_buffer.size - wpipe->pipe_buffer.cnt) >= PIPE_BUF))
> 			revents |= events & (POLLOUT | POLLWRNORM);
>
> >	if ((rpipe->pipe_state & PIPE_EOF) ||
> >	    (wpipe == NULL) ||
> >	    (wpipe->pipe_state & PIPE_EOF))
> >		revents |= POLLHUP;

The only known bug in polling on nameless pipes is near here. POLLHUP is
set for both sides if PIPE_EOF is set for either side. This may be
correct for writing but it is broken for reading. The writer may have
written something and then exited. This gives POLLHUP for the reader
(presumably because it gives PIPE_EOF for the writer). But EOF, and thus
POLLHUP, should not occur for the reader until the data already written
has been read. This bug breaks at least gdb's detection of EOF (try
"echo 'p 0' | gdb /bin/cat").

Bruce
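
A reader can protect itself against an early POLLHUP by treating it only as
a reason to call read(), and by accepting EOF only when read() returns 0.
The following is a userland sketch under that assumption, not the kernel
fix; drain_until_eof() and demo_write_then_close() are hypothetical names,
and the demo imitates the "writer writes, then exits" case by closing the
write end before the reader polls.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd into buf until read() reports EOF
 * or buf is full. POLLHUP alone is never taken as EOF; it only triggers
 * another read(). Returns the number of bytes stored, or -1 on error.
 */
static ssize_t drain_until_eof(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    assert(buf != NULL);
    while (total < cap) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        ssize_t n;

        if (poll(&pfd, 1, -1) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (pfd.revents & POLLNVAL)
            return -1;                  /* fd is not open */
        if (!(pfd.revents & (POLLIN | POLLHUP | POLLERR)))
            continue;

        n = read(fd, buf + total, cap - total);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (n == 0)
            break;                      /* real EOF: all written data was read */
        total += (size_t)n;
    }
    return (ssize_t)total;
}

/*
 * Demo: the writer writes 4 bytes and goes away before the reader polls.
 * Returns the number of bytes the reader recovered, or -1 on error.
 */
static ssize_t demo_write_then_close(char *buf, size_t cap)
{
    int fds[2];
    ssize_t got;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "p 0\n", 4) != 4) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                      /* writer is gone, data still queued */
    got = drain_until_eof(fds[0], buf, cap);
    close(fds[0]);
    return got;
}
```

With this loop the queued "p 0" line is still delivered even if poll()
already reports POLLHUP on the first call.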