Date: Thu, 21 Dec 2017 16:51:59 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-bugs@freebsd.org
Subject: Re: [Bug 203129] syslogd: /dev/console: Interrupted system call
Message-ID: <20171221125104.U1074@besplex.bde.org>
In-Reply-To: <bug-203129-8-ueDAunTmzI@https.bugs.freebsd.org/bugzilla/>
References: <bug-203129-8@https.bugs.freebsd.org/bugzilla/> <bug-203129-8-ueDAunTmzI@https.bugs.freebsd.org/bugzilla/>

On Wed, 20 Dec 2017 a bug that doesn't want replies@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203129
>
> --- Comment #10 from heas@shrubbery.net ---
> I have noticed another regressive behavior from this:
>
> Local system status:
> 3:01AM up 36 days, 7 hrs, 3 users, load averages: 165.55, 164.19, 161.10
>
> I am forced to reboot at this point.
>
> This is a headless server with a service processor and a serial console with
> h/w flow control running 11.1-p4. A similar server without a head and no serial
> console does not have this problem, nor does a headless 10.3 server, nor did
> this server when it was running 10.3.

Configuring h/w flow control for consoles is an error. It asks for the
system to hang waiting to do console output whenever the other end of the
connection invokes flow control (perhaps because someone unplugged its
cable or turned it off overnight). Software flow control might do the
same. syslogd uses wall/ttymsg.c which has a fork bomb when things block.

Normally the bugs run the other way and waits are too short, giving lost
and corrupted output when close() closes without waiting; this can be
ameliorated using an extra process to keep the console open, and this
often happens automatically by using getty.

Waiting for the other end to come back is obviously right, but not very
easy to program. Even dropping the output in a controlled way has never
been programmed in FreeBSD. Not using flow control is a hack to give
uncontrolled dropping. The output goes out, but the other end drops it.
Even if the other end hasn't invoked flow control, the sender doesn't know
if the output was received since acks for it have never been programmed in
FreeBSD.

h/w flow control for consoles should be locked off using the initial state
(set initial value to off) and lock state devices (set lock flag to on).
There are many bugs in this area.

sio set good defaults for consoles in the initial and lock state devices
about 25 years ago, except it only locked the speeds, CLOCAL and HUPCL.
Except for HUPCL, this survived a move to upper layers until the upper
layers broke this and much more by removing the special initialization of
lock state devices in FreeBSD-8.

uart dishonored the initial and lock state devices for precisely the
speeds, CLOCAL and HUPCL until FreeBSD-11. uart used an internal
hard-coded "fixation" method. Now it only uses fixation for HUPCL.

I don't know of any related change between FreeBSD-10 and FreeBSD-11
except for fixing the dishonoring for the speeds and CLOCAL in uart. Since
the upper layers are still missing initialization of the lock state
device, the speeds and CLOCAL are neither locked nor fixated. Since uart
users apparently don't use /etc/rc.d/serial, too many entries in gettytab
are needed to avoid blowing away the speeds and CLOCAL when one is used.
Blowing away was previously prevented by fixation, so almost any entry in
gettytab worked. The user must still pick one. Perhaps the problem is
just that too many entries have CRTSCTS and the user picked one of those.

If gettytab is not used, then there should be no problem since init and
syslogd (which uses wall/ttymsg.c) are even more clueless about ttys than
getty. These programs expect to be able to just open /dev/console and do
i/o to it. This depends on good defaults for the initial state device.
The driver attempts to provide these, and usually does this OK for system
consoles (but nothing else). Then if getty is run, without locking it
tends to clobber the good defaults (getty holds the tty open so the
initial state is not used again).

My version of wall/ttymsg.c avoids the fork bomb by waiting too much. (It
tries to do all the i/o in a single subprocess, but is still clueless
about ttys so it doesn't drain or flush this i/o properly.)

In versions of FreeBSD where I fixed some of the bugs in upper layers,
fixing the default initial state to the same values that sio used almost
25 years ago gave surprising behaviour. It gives a raw initial state.
This is correct for serial h/w ttys that are not system consoles, but it
is weird for vtys. Syscons doesn't support the initial and lock state
devices.

I sometimes run statistics programs like top without getty on vtys. These
programs expect to just open the tty and do i/o to it just like init. Raw
mode is not suitable for this. I fix this using the hack of another
program to change the state and keep the tty open. In /etc/ttys:

    ttyv8 "/bin/sh -c '(stty sane; /home/bde/bin/scripts/netstat -d -I em0 1) </dev/ttyv8 >/dev/ttyv8 2>&1'" cons25 on secure
    ...

Curses programs like top don't need the separate stty, but simple programs
like netstat do.

Bruce
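
The /etc/ttys hack above can also be approximated in C. The sketch below is only an illustration with hypothetical names (make_cooked, run_on_tty): it opens the tty, switches a raw state to a basic cooked one, makes the tty the standard descriptors and then execs the command, so the tty stays open for as long as the command runs. It sets only a small subset of what "stty sane" sets and leaves the control characters alone.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Turn a raw termios state into a basic cooked one. */
void
make_cooked(struct termios *t)
{
	t->c_iflag |= ICRNL | IXON | BRKINT;
	t->c_oflag |= OPOST | ONLCR;
	t->c_lflag |= ICANON | ISIG | IEXTEN | ECHO | ECHOE | ECHOK;
}

/*
 * Open tty, put it in cooked mode, attach it to stdin, stdout and
 * stderr, and exec argv.  Returns only on failure, with -1.
 */
int
run_on_tty(const char *tty, char *const argv[])
{
	struct termios t;
	int fd;

	fd = open(tty, O_RDWR);
	if (fd < 0)
		return (-1);
	if (tcgetattr(fd, &t) != 0) {
		close(fd);
		return (-1);
	}
	make_cooked(&t);
	if (tcsetattr(fd, TCSANOW, &t) != 0) {
		close(fd);
		return (-1);
	}
	if (dup2(fd, 0) < 0 || dup2(fd, 1) < 0 || dup2(fd, 2) < 0)
		return (-1);
	if (fd > 2)
		close(fd);
	execvp(argv[0], argv);
	return (-1);
}
```

A small main() could call run_on_tty("/dev/ttyv8", argv) with the netstat command line from the ttys entry as argv.
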