Date: Mon, 18 Feb 2002 14:04:04 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: David Malone <dwmalone@maths.tcd.ie>
Cc: "Todd C. Miller" <Todd.Miller@courtesan.com>, <audit@FreeBSD.ORG>, Chris Johnson <cjohnson@palomine.net>, Brian McDonald <brian@lustygrapes.net>
Subject: Re: Syslog hangong on console.
Message-ID: <20020218131837.K4236-100000@gamplex.bde.org>
In-Reply-To: <200202171759.aa61427@salmon.maths.tcd.ie>

On Sun, 17 Feb 2002, David Malone wrote:

> > Below is the diff I committed to OpenBSD some time ago.  It goes a
> > bit farther and opens all files with O_NONBLOCK and then changes
> > to blocking writes for real files.
> ...
> BTW - some people seemed to be indicating that syslogd was blocking
> at some stage other than the open, which is what I wasn't able to
> reproduce.  FreeBSD's syslogd uses ttymsg() to write to the tty,
> which should never block.  The only way I could see it happening was
> if isatty() lied after the tty was opened.

I sent David a large private mail which was mostly about this problem.
ttymsg() may block in close(2) or _exit(2) when it clears O_NONBLOCK,
which is eventually the usual case if there is a blockage downstream.
(Usually the first few writes go to driver buffers and write(2) and
writev(2) return successfully, but the data isn't guaranteed to go out
unless you do a tcdrain(3) or equivalent, and this is not practical in
ttymsg() or syslog() (since it might block).)  Blocking in _exit(2) is
especially bad, since it gives unkillable processes.  These can cause
the process table to fill up in ttymsg().  I sent David some old patches
related to limiting the children.

Using O_NONBLOCK without using tcdrain(3) gives a different kind of
brokenness.  Unfortunately, David's change to syslog.c gives a perfect
example of this.  The changed code is essentially:

	fd = open(... O_NONBLOCK);
	write(fd, ...);
	close(fd);

Here the write normally immediately returns successfully after copying
the data to driver buffers, even when the physical device is completely
blocked.  Then the close flushes the data in the driver and hardware
buffers because O_NONBLOCK is still set at close time.

I "fixed" this in FreeBSD.  In 4.4BSD-Lite, ttylclose() checks the
IO_NDELAY flag to decide whether to flush the buffers.  This is
nonsense, since the flags passed to ttylclose are the open/fcntl flags,
not those flags converted to IO_* flags.
FreeBSD's ttylclose() checks FNONBLOCK instead.  The result of the fix
is that if the close is the last-close, code like the above drops all
the data if the device is completely blocked, and writes only a few
bytes even if the device is completely unblocked (only those bytes that
have reached their destination before the close flushes the buffers are
sure to have gone out).  Without the fix, the behaviour is worse:
processes may block forever in close(2) or _exit(2) despite use of
O_NONBLOCK.  Perhaps multiple processes may block for the same device --
there are some races that may permit first-opens to complete while
last-closes are blocked.

Bruce

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-audit" in the body of the message
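
The approach quoted at the top of this message (open everything with
O_NONBLOCK, then change to blocking writes for real files) might look
roughly like the sketch below.  This is only an illustration under
stated assumptions, not the OpenBSD or FreeBSD code: the function name
open_log_target() and the choice of open flags are hypothetical, and
"real file" is assumed to mean S_ISREG().

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from any syslogd: open a log target
 * with O_NONBLOCK so that open(2) itself cannot hang, then clear
 * O_NONBLOCK again if the target turns out to be a regular file.
 * Ttys and other devices keep O_NONBLOCK set.
 * Returns the descriptor, or -1 on error.
 */
int
open_log_target(const char *path)
{
	struct stat sb;
	int fd, flags;

	assert(path != NULL);

	fd = open(path, O_WRONLY | O_APPEND | O_NONBLOCK);
	if (fd == -1)
		return (-1);

	if (fstat(fd, &sb) == -1 || (flags = fcntl(fd, F_GETFL)) == -1) {
		close(fd);
		return (-1);
	}

	/* Assumption: "real file" means a regular file. */
	if (S_ISREG(sb.st_mode) &&
	    fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

A caller would check for -1 and otherwise write(2) to the descriptor.
Note that this only avoids a hang in open(2).  For a tty the descriptor
still has O_NONBLOCK set at close time, so the data loss on last-close
described above is not addressed by it.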