Date: Mon, 7 Sep 2015 18:17:17 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: bugzilla-noreply@freebsd.org
Cc: freebsd-bugs@freebsd.org
Subject: Re: [Bug 202933] unwanted behaviour change when writing to revoked terminals
Message-ID: <20150907171030.H828@besplex.bde.org>
In-Reply-To: <bug-202933-8@https.bugs.freebsd.org/bugzilla/>
References: <bug-202933-8@https.bugs.freebsd.org/bugzilla/>

On Sun, 6 Sep 2015 bugzilla-noreply@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202933
>
> When a terminal is revoked, writing to it sets errno to:
> - ENXIO until FreeBSD 10.1 kernel
> - EIO with FreeBSD 10.2 kernel
>
> The following program can be used to see this behaviour change:
> ------------------------------------------------------------------
> #include <stdio.h>
> #include <sys/types.h>
> #include <unistd.h>
> #include <fcntl.h>
> extern int errno;
> int main() {
>   int id = open("/dev/console", O_RDWR);
>   revoke("/dev/console");
>   int ret = write(id, "X", 1);
>   if (ret < 0) printf("errno=%d\n", errno);
>   return 0;
> }
> ------------------------------------------------------------------
> It returns 6 (ENXIO) on FreeBSD 10.1 and 5 (EIO) on FreeBSD 10.2.

EIO is correct.

> I wonder if this new behaviour would not be an unwanted side-effect due to
> kernel changes.

IIRC, this was a bug in devfs in FreeBSD-10.1 (maybe FreeBSD-9 too).
Dead ttys lost their connection to deadfs, so the code in deadfs that
intentionally returns EIO for terminals was not reached.

> For instance, this leads to bug ID 202932 for the rsyslog8 port, that loops
> endlessly after /dev/console is revoked since the errno code tested to handle
> correctly this case is now EIO instead of ENXIO.
>
> This could happen to some other tools for the same reason.

Only for ones broken like rsyslog8.

It is probably read() that is broken in rsyslog8.  read() on a dead
terminal is specified to return 0 (EOF), but the corresponding bug in
devfs made the code in deadfs that does that unreachable, so ENXIO was
returned for read() too.

Detecting a dead terminal is not very easy, since reading it never fails
and select() always reports that it is ready to read (since read() would
not fail).  The best way to detect it is to use poll() instead of
select().  poll() returns POLLHUP for all dead file descriptors.  You
then have to fight with system-dependent bugs in poll().
E.g., the correct way to use POLLHUP is to check for POLLIN at the same
time and ignore POLLHUP if POLLIN is set.  POLLIN together with POLLHUP
should mean that the connection was hung up but non-null buffered data
can be read.  Some applications can ignore the buffered data, but all
should read it to prevent it leaking to the next open() (whether it can
leak depends on the file type).  But this is broken in many systems for
some file types including all dead file descriptors.  For dead file
descriptors, there is _no_ buffered data that can be read using the dead
fd (there may be buffered data that can be read using a new open, and
there is no way to clear that using the dead fd), but dead poll() returns
POLLIN together with POLLHUP.  This means that the state of POLLHUP
without POLLIN can never be trusted to work.  To work around this,
something like the following should be used:

    poll() for POLLIN
    if POLLHUP is not set; then
        the connection state is open
    else if (POLLHUP is set and) POLLIN is not set; then
        the connection state is closed
    else (POLLHUP and POLLIN are both set); then
        try to read some data
        if null data; then
            either we lost a race or poll() is broken
            consider fixing poll()
            poll() and read data a few more times
            if POLLIN remains set and the data remains null; then
                poll() seems to be broken but we can never be sure
                that we didn't just lose the race again
            endif
            repeat until the race case is unlikely
        else
            the connection state is closed but has buffered data
            read all of this data now or later to reach the
            fully-closed case
        endif
    endif

Closure of a connection by revoke(2) is quite different from a normal
hangup.  revoke(2) should flush all i/o for all file types.  I'm not
sure if it does.  Normal hangup of a terminal does flush all i/o, modulo
bugs.  The case of non-null buffered data together with POLLHUP occurs
mainly for fifos and perhaps for sockets.
Not for revoked ones, but since normal hangup is caused by just the last
writer going away, and because this can quite reasonably happen before
the reader finishes reading the data written by the last writer (or
earlier writers), normal hangup never flushes the data.  I suspect that
revoke() doesn't flush it either, and this is a bug.

For terminals, data can easily be lost for normal hangup.  This is a
security bug, modulo unnecessary races in it.  (Drivers tend to have the
following bug: the hangup condition is often delivered to software and
upper layers as soon as possible, while data is buffered and delivered
after a delay.  Thus in a normal sloppy application that is not careful
to drain its data, together with broken draining in drivers, hangup is
often seen before the last few bytes of data and these bytes are lost
unnecessarily.)

Bruce
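
The poll() workaround given as pseudocode earlier in this message could
be sketched in C roughly as follows.  This is only an illustration under
stated assumptions, not code from the original message: the names
probe_conn, enum conn_state and MAX_RETRIES are hypothetical, the retry
bound is arbitrary, and "null data" is taken to mean that read() returned
0 or failed.  The CONN_UNKNOWN result for poll() errors and POLLNVAL is
also an addition that the pseudocode does not cover.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result type; these names are not from the original message. */
enum conn_state {
    CONN_OPEN,              /* POLLHUP not set */
    CONN_CLOSED,            /* hung up, nothing readable on this fd */
    CONN_CLOSED_WITH_DATA,  /* hung up, but buffered data was read */
    CONN_UNKNOWN            /* poll() failed or the fd is invalid */
};

/* Arbitrary assumption: how often to re-poll before distrusting POLLIN. */
#define MAX_RETRIES 3

/*
 * Probe fd without blocking.  When POLLHUP and POLLIN are both reported,
 * try to read into buf; *nread is set to the number of bytes obtained
 * (0 for every result other than CONN_CLOSED_WITH_DATA).
 */
enum conn_state
probe_conn(int fd, char *buf, size_t len, ssize_t *nread)
{
    struct pollfd pfd;
    ssize_t n;
    int i;

    assert(fd >= 0 && buf != NULL && len > 0 && nread != NULL);
    *nread = 0;

    for (i = 0; i < MAX_RETRIES; i++) {
        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        /* Timeout 0: report the current state only, never wait. */
        if (poll(&pfd, 1, 0) < 0 || (pfd.revents & POLLNVAL))
            return CONN_UNKNOWN;
        if (!(pfd.revents & POLLHUP))
            return CONN_OPEN;
        if (!(pfd.revents & POLLIN))
            return CONN_CLOSED;

        /* POLLHUP and POLLIN are both set: see whether data really exists. */
        n = read(fd, buf, len);
        if (n > 0) {
            *nread = n;
            return CONN_CLOSED_WITH_DATA;
        }
        /* Null data: we lost a race or poll() is broken; poll again. */
    }

    /* POLLIN stayed set but no data ever arrived: treat the fd as closed. */
    return CONN_CLOSED;
}
```

A caller that gets CONN_CLOSED_WITH_DATA would keep calling probe_conn()
until it returns CONN_CLOSED, which corresponds to reading all buffered
data to reach the fully-closed case.  As the pseudocode notes, a bounded
retry can only make the lost-race case unlikely, not rule it out.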
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150907171030.H828>