Date:      Thu, 21 Dec 2017 16:51:59 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-bugs@freebsd.org
Subject:   Re: [Bug 203129] syslogd: /dev/console: Interrupted system call
Message-ID:  <20171221125104.U1074@besplex.bde.org>
In-Reply-To: <bug-203129-8-ueDAunTmzI@https.bugs.freebsd.org/bugzilla/>
References:  <bug-203129-8@https.bugs.freebsd.org/bugzilla/> <bug-203129-8-ueDAunTmzI@https.bugs.freebsd.org/bugzilla/>

On Wed, 20 Dec 2017 a bug that doesn't want replies@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203129
>
> --- Comment #10 from heas@shrubbery.net ---
> I have noticed another regressive behavior from this:
>
> Local system status:
> 3:01AM  up 36 days, 7 hrs, 3 users, load averages: 165.55, 164.19, 161.10
>
> I am forced to reboot at this point.
>
> This is a headless server with a service processor and a serial console with
> h/w flow control running 11.1-p4. A similar server without a head and no serial
> console does not have this problem, nor does a headless 10.3 server, nor did
> this server when it was running 10.3.

Configuring h/w flow control for consoles is an error.  It asks for
the system to hang waiting to do console output whenever the other end
of the connection invokes flow control (perhaps because someone
unplugged its cable or turned it off overnight).  Software flow control
might do the same.  syslogd uses wall/ttymsg.c which has a fork bomb
when things block.  Normally the bugs run the other way and waits are
too short, giving lost and corrupted output when close() closes without
waiting; this can be ameliorated using an extra process to keep the
console open, and this often happens automatically by using getty.

Waiting for the other end to come back is obviously right, but not very
easy to program.  Even dropping the output in a controlled way has never
been programmed in FreeBSD.  Not using flow control is a hack to give
uncontrolled dropping.  The output goes out, but the other end drops it.
Even if the other end hasn't invoked flow control, the sender doesn't
know if the output was received since acks for it have never been programmed
in FreeBSD.
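
To make "dropping the output in a controlled way" concrete, here is a
minimal sketch, not anything that exists in FreeBSD.  The helper names
set_nonblocking() and write_or_drop() and the timeout value are
assumptions made up for illustration.  The idea is to wait a bounded
time for the descriptor to accept output and to give up on the rest of
the message instead of hanging or forking:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: put fd in non-blocking mode.  Returns 0 or -1. */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return (-1);
	return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0);
}

/*
 * Hypothetical helper: write up to len bytes to a non-blocking fd,
 * waiting at most timeout_ms each time the fd is not writable.
 * Returns the number of bytes written; the caller drops the rest.
 */
static size_t
write_or_drop(int fd, const char *buf, size_t len, int timeout_ms)
{
	size_t done = 0;

	assert(buf != NULL);
	while (done < len) {
		struct pollfd pfd = { .fd = fd, .events = POLLOUT };
		int r = poll(&pfd, 1, timeout_ms);

		if (r == 0)
			break;		/* other end is not draining: drop */
		if (r < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: wait again */
			break;
		}
		if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
			break;		/* fd is unusable: drop */

		ssize_t n = write(fd, buf + done, len - done);
		if (n < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			break;
		}
		done += (size_t)n;
	}
	return (done);
}
```

A caller would log or count the case where the return value is less
than len.  Note that this only bounds the wait on the sending side; it
still says nothing about whether the other end received the bytes.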

h/w flow control for consoles should be locked off using the initial state
(set initial value to off) and lock state devices (set lock flag to on).
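
As a rough sketch of what that means in terms of termios, assuming the
usual naming of the state devices for uart unit 0 (/dev/ttyu0.init and
/dev/ttyu0.lock; adjust for the actual console), the initial state
gets CRTSCTS cleared and the lock state gets the CRTSCTS bit set, since
a set bit in the lock state means the flag may not be changed.  The
function names here are hypothetical:

```c
#define _DEFAULT_SOURCE		/* glibc needs this for CRTSCTS; harmless on FreeBSD */
#include <assert.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Initial state: hardware flow control off. */
static void
init_clear_hwflow(struct termios *t)
{
	assert(t != NULL);
	t->c_cflag &= ~CRTSCTS;
}

/* Lock state: a set bit means the flag is locked at its current value. */
static void
lock_set_hwflow(struct termios *t)
{
	assert(t != NULL);
	t->c_cflag |= CRTSCTS;
}

/*
 * Read, edit and write back the termios of one state device.
 * Returns 0 on success and -1 on any failure.
 */
static int
apply_state(const char *path, void (*edit)(struct termios *))
{
	struct termios t;
	int fd, rv = -1;

	fd = open(path, O_RDONLY | O_NONBLOCK);
	if (fd < 0)
		return (-1);
	if (tcgetattr(fd, &t) == 0) {
		edit(&t);
		rv = tcsetattr(fd, TCSANOW, &t);
	}
	close(fd);
	return (rv);
}
```

Calling apply_state("/dev/ttyu0.init", init_clear_hwflow) and then
apply_state("/dev/ttyu0.lock", lock_set_hwflow) should be roughly what
"stty -f /dev/ttyu0.init -crtscts" and "stty -f /dev/ttyu0.lock crtscts"
do.  Since the lock state is not specially initialized by the upper
layers, this has to be redone on every boot.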

There are many bugs in this area.  sio set good defaults for consoles in
the initial and lock state devices about 25 years ago, except it only
locked the speeds, CLOCAL and HUPCL.  Except for HUPCL, this survived a
move to upper layers until the upper layers broke this and much more by
removing the special initialization of lock state devices in FreeBSD-8.
uart dishonored the initial and lock state devices for precisely the
speeds, CLOCAL and HUPCL until FreeBSD-11.  uart used an internal
hard-coded "fixation" method.  Now it only uses fixation for HUPCL.

I don't know of any related change between FreeBSD-10 and FreeBSD-11
except for fixing the dishonoring for the speeds and CLOCAL in uart.
Since the upper layers are still missing initialization of the lock
state device, the speeds and CLOCAL are neither locked nor fixated.
Since uart users apparently don't use /etc/rc.d/serial, too many
entries in gettytab are needed to avoid blowing away the speeds and
CLOCAL when one is used.  Blowing away was previously prevented by
fixation, so almost any entry in gettytab worked.  The user must still
pick one.  Perhaps the problem is just that too many entries have
CRTSCTS and the user picked one of those.

If gettytab is not used, then there should be no problem since init
and syslogd (which uses wall/ttymsg.c) are even more clueless about
ttys than getty.  These programs expect to be able to just open
/dev/console and do i/o to it.  This depends on good defaults for
the initial state device.  The driver attempts to provide these,
and usually does this OK for system consoles (but nothing else).
Then if getty is run, without locking it tends to clobber the
good defaults (getty holds the tty open so the initial state is not
used again).

My version of wall/ttymsg.c avoids the fork bomb by waiting too much.
(It tries to do all the i/o in a single subprocess, but is still
clueless about ttys so it doesn't drain or flush this i/o properly.)

In versions of FreeBSD where I fixed some of the bugs in upper layers,
fixing the default initial state to the same values that sio used
almost 25 years ago gave surprising behaviour.  It gives a raw initial
state.  This is correct for serial h/w ttys that are not system
consoles, but it is weird for vtys.  Syscons doesn't support the initial
and lock state devices.  I sometimes run statistics programs like top
without getty on vtys.  These programs expect to just open the tty and
do i/o to it just like init.  Raw mode is not suitable for this.  I
fix this using the hack of another program to change the state and keep
the tty open.  In /etc/ttys:

ttyv8	"/bin/sh -c '(stty sane; /home/bde/bin/scripts/netstat -d -I em0 1) </dev/ttyv8 >/dev/ttyv8 2>&1'"		cons25	on  secure
...

Curses programs like top don't need the separate stty, but simple programs
like netstat do.

Bruce


