Date: Mon, 23 Dec 2013 20:55:56 +1100 (EST)
From: Bruce Evans
To: John-Mark Gurney
Cc: freebsd-arm@freebsd.org, freebsd-arch@FreeBSD.org
Subject: Re: mountroot issues (was Re: 10.0-release proposed patch for Atmel)
In-Reply-To: <20131222192842.GI99167@funkthat.com>
Message-ID: <20131223164823.B954@besplex.bde.org>

On Sun, 22 Dec 2013, John-Mark Gurney wrote:

> Warner Losh wrote this message on Sat, Dec 21, 2013 at 23:44 -0700:
>> Right now, the mountroot prompt doesn't work on Atmel CPUs.  Almost
>> all the characters are eaten.  I recently committed an elegant fix
>> for this into head to mask the interrupt for new characters and
>> only do polling.
>
> So, a similar issue plagues i386/amd64 too.  There the console mostly
> works, but it will drop characters on occasion...  The problem is that
> mount root spins calling into the console code instead of asking the
> console code for a single character and having the console code wait
> for this character... and if you type a character while it's outside
> the console routines, that character will be dropped...

This was broken by the multiple console changes (and buggy drivers).
It last (mostly) worked in FreeBSD-4, since there were no multiple
console changes there.  It worked just like you hinted: the mountroot
code _doesn't_ spin calling into the console code; it calls into the
console code, which spins.  Non-buggy drivers of course do something
to prevent interrupt handlers from doing anything while spinning.

syscons is the main FreeBSD driver that I know of that was, once upon
a time, non-buggy here.  In sccngetch(), it simply disabled all tty
interrupts, including the keyboard one, using spltty().  This worked
between FreeBSD-1 and FreeBSD-4.
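Roughly, the pattern was like the following sketch (a sketch only, not
the actual sccngetch() code; kbd_poll_char() is a hypothetical stand-in
for the low-level keyboard poll):

int
cons_getc_sketch(void)
{
        int c, s;

        s = spltty();           /* mask tty (and keyboard) interrupts */
        for (;;) {
                c = kbd_poll_char();    /* hypothetical low-level poll */
                if (c != -1)
                        break;
                DELAY(10);
        }
        splx(s);                /* restore the previous interrupt mask */
        return (c);
}

The point is that the interrupt mask is held for the whole wait, so the
keyboard interrupt handler cannot run and eat the character between
polls.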
This was broken by SMPng.  spltty() became null, of course, and the
keyboard interrupt became Giant-locked (it still is).  But the
spltty()s for locking out the interrupt handler in console code were
not replaced by Giant locking, and Giant locking (or almost any normal
locking) is not permitted in console drivers anyway.  Apparently, AT
keyboard interrupts have magic timing (delayed interrupts?) that allows
console input to mostly work without locking.  Multiplexed keyboards,
with some actually being USB, increase the problem significantly --
they probably have different interrupt timing, and since no normal
locking is permitted it is difficult to lock things throughout the
poll.

The big SMPng change occurred just 6 weeks before multiple consoles.
Thus there was a small window in which the old API would have worked if
its locking had been updated.  Closing of this window gave the current
brokenness, less any fixes for this brokenness in syscons's cngrab() --
the polling loop is at too high a level for locking in the driver to
help, and syscons didn't have any anyway.

My cndbctl() API was designed to fix this problem, except it was
misdesigned and only applied to ddb.  syscons was the only console
driver that supported it, but syscons only used it for console
_output_, and it was broken a few years after the multiple console
changes.

cngrab() was mostly designed by me and mostly implemented by avg.  It
was only used by syscons too.  Strangely, it now does mostly keyboard
stuff where cndbctl() did only screen stuff.  I don't see where it does
anything to stop the interrupt handler doing anything.
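To be concrete about what grabbing is supposed to provide, here is a
sketch of the discipline (a sketch only, not the current kern_cons.c
code): the grab must bracket the whole polling loop, and each driver's
grab method must actually keep its interrupt handler from consuming
input for the duration:

int
cngetc_sketch(void)
{
        int c;

        cngrab();               /* drivers should mask their rx interrupts here */
        while ((c = cncheckc()) == -1)
                DELAY(10);      /* poll; nothing may eat the character meanwhile */
        cnungrab();             /* drivers restore interrupt-driven input */
        return (c);
}

With that discipline, per-character locking in the low-level routines
becomes mostly irrelevant; the grab is the real "lock".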
Syscons console output was also broken by the null spltty().  The
breakage was more serious and was worked around by sprinkling buggy
locking.  This sometimes gives deadlock if the locks are non-recursive,
and doesn't work if they are recursive.  Syscons was never properly
locked.  spltty() was like a recursive lock.  It basically only worked
for locking out interrupt handlers, and if you used it for more (say to
prevent the console driver being entered while spltty() is raised),
then you made it like a non-recursive lock and got deadlock.

The races from not locking are the usual ones.  You can have one thread
in console output updating critical pointers, or in console input
accessing critical device registers.  Then another thread may want to
do console i/o.  With no locking, these threads just clobber each
other's state.  With locking, the console output cannot be as
synchronous as desired, and may cause deadlock.  Deadlock is only a
serious problem for ddb console i/o, but after avoiding it there it is
trivial to avoid it for threads.  Normal locking is not permitted
anywhere in ddb since it may deadlock, and blowing away the locks is
not a solution since it gives the problem of clobbering state.  With
normal locking, deadlock always occurs in cases like the following:

    console driver acquires locks
    console driver changes critical state
    --> debugger trap
    ... debugger does console i/o
    console driver blocks acquiring the same locks (deadlock)

The debugger trap may be on the same or a different CPU.  If it is on
the same CPU, then it is obviously difficult to get back to the
interrupted code.  (Similarly if the critical code is interrupted by a
fast interrupt handler, an NMI, or some other trap.  Except for
debugger traps, the problem can be reduced by forbidding console i/o
except in emergencies like panics.)  If it is on a different CPU, then
the CPU handling the trap can more easily wait for the CPU doing the
i/o, but this is still hard and not done (kdb_enter() actually starts
by stopping all the other CPUs as soon as possible).

The com driver in 386BSD and 4.4BSD worked similarly to syscons.  It
used splhigh() instead of spltty() around its polling loops.  This
disables far too many interrupts.  It even breaks clock interrupts, and
thus the system time, in the 386BSD era when the system time was kept
simply in timer interrupt ticks.

In sio, I used essentially the cngrab() method, but pushed the method
into the driver since I don't like churning APIs (the driver calls the
method whenever it is entered, so as to not depend on higher layers
doing it).  This was based on the multiple console code in my debugger,
where the enabled consoles were "opened" on each entry to the driver,
or by a debugger command to change the set of enabled consoles.  There
was also an "ioctl" method to change the state of enabled consoles.
The "open" method should do a complete initialization of the device
state if necessary.  Similarly, the "close" method should try to
restore the original state.  This was of course broken by the multiple
console changes.  The "pollc" method never really worked, since it
cannot keep the device "open" across polls unless a higher level does
the "open", and switching the device state in open/close makes it flap
too much if the switch is non-null.

sio also had some smaller bugs here:
- when there are shared interrupts, devices attached to the interrupt,
  or even all sio devices, are polled for activity.  Since the polling
  doesn't use the interrupt status, it detects activity on devices that
  are not supposed to interrupt.
- there is some SMP modification (amplification?) of the previous bug.
  I forget the details.

uart did nothing to disable device interrupts while polling, at least
for ns8250.  uart's locking is convoluted and still seems buggy.  From
a slightly old version:

% static void
% ns8250_putc(struct uart_bas *bas, int c)
% {
%         int limit;
%
%         limit = 250000;
%         while ((uart_getreg(bas, REG_LSR) & LSR_THRE) == 0 && --limit)
%                 DELAY(4);
%         uart_setreg(bas, REG_DATA, c);
%         uart_barrier(bas);
%         limit = 250000;
%         while ((uart_getreg(bas, REG_LSR) & LSR_TEMT) == 0 && --limit)
%                 DELAY(4);
% }

This is wrapped by uart_putc().  uart_putc() acquires a mutex
(di->hwmtx), except when kdb_active is set it doesn't acquire the
mutex.  Not acquiring the mutex avoids deadlock with ddb (if not kdb),
but means that the mutex doesn't actually work for ddb i/o, and ddb
input is the most common case of input.  di->hwmtx doesn't do much
except provide locking for console i/o routines.  With correct console
grabbing, it should be impossible for the interrupt handler to run
while console i/o is in progress.  Synchronization occurs when grabbing
acquires the mutex.

% static int
% ns8250_rxready(struct uart_bas *bas)
% {
%
%         return ((uart_getreg(bas, REG_LSR) & LSR_RXRDY) != 0 ? 1 : 0);
% }

Similarly.  Upper layers acquire the mutex.

% static int
% ns8250_getc(struct uart_bas *bas, struct mtx *hwmtx)
% {
%         int c;
%
%         uart_lock(hwmtx);
%
%         while ((uart_getreg(bas, REG_LSR) & LSR_RXRDY) == 0) {
%                 uart_unlock(hwmtx);
%                 DELAY(4);
%                 uart_lock(hwmtx);
%         }
%
%         c = uart_getreg(bas, REG_DATA);
%
%         uart_unlock(hwmtx);
%
%         return (c);
% }

This function does its own locking.  It releases the mutex after every
poll.  Releasing the mutex doesn't do much except let the interrupt
handler run and eat your input.  It is done because the function may
take arbitrarily long to return, but I don't see why locking up all i/o
on the device should be a problem.  Anyway, correct grabbing gives a
much longer-term "lock" on all the i/o (not the mutex, but whatever
grabbing does to stop interrupt activity and keep it stopped).  It
makes the locking in the above have no effect.  This function uses a
correct API, but is currently bogus since the API is broken.
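For ns8250, the missing driver half of a working grab would be
something like the following sketch (hypothetical, not current uart
code; it also assumes a single console device for the saved state):
mask the receive interrupt in the IER while grabbed, so the interrupt
handler cannot run and eat input between the unlock/lock in the polling
loop above, and restore it on ungrab:

static uint8_t ns8250_saved_ier;        /* assumption: one console device */

static void
ns8250_grab_sketch(struct uart_bas *bas)
{

        ns8250_saved_ier = uart_getreg(bas, REG_IER);
        /* Mask the rx interrupt so the handler cannot steal input. */
        uart_setreg(bas, REG_IER, ns8250_saved_ier & ~IER_ERXRDY);
        uart_barrier(bas);
}

static void
ns8250_ungrab_sketch(struct uart_bas *bas)
{

        /* Restore interrupt-driven input. */
        uart_setreg(bas, REG_IER, ns8250_saved_ier);
        uart_barrier(bas);
}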
uart still mostly uses the old console API internally, but the current
console API has been broken to not have cn_getc, leaving this function
unattached to it.  In the console API, cngetc() still exists at the top
level, but the cn_getc method became unused because the polling loop
for it moved to the top level.  It was changed to use only cn_checkc
for input.  Then cn_checkc was broken by renaming it to cn_getc and
removing the real cn_getc.  The CONS_DRIVER() obfuscation hides some of
the details of this from drivers, and uart still uses the old names
internally.

With multiple active consoles, bounding the time taken by console i/o
is actually a problem for output too.  You might have a mixture of fast
and slow devices.  cngetc() only has to wait for one device, but
cnputc() has to wait for all of them.  It would be better if the output
to the fastest device went out first, but there are no outer loops for
output, so the output goes out in device order.  The exact arrangement
is:
- cngets(): grab all devices until input is read
- cngetc(): missing grabbing.  Thus broken for external use.  It is
  mainly used by cngets(), where it is grabbed globally, and by
  cnputc(), where it is grabbed around its call.
- cncheckc(): missing grabbing.  Thus broken for external use.  It is
  mainly used when shutting down.  There a transient grab is correct.
  Grabbing is still needed for device initialization in some cases.
  Perhaps in shutdown a nearby printf() does (should do) sufficient
  initialization.
- cnputs(): missing grabbing.  It loops calling cnputc().
- cnputc(): missing grabbing.

So grabbing is basically only implemented for input.

> The problem is, not many of us spend time at the mountroot prompt, and
> so even if we notice the issue, it's so minor that we just deal w/
> it...

Also, it is not noticeable for syscons.  I rarely use serial consoles
and deal with the problem for sio by holding down keys until the right
characters get through.  Perhaps transiently disabling interrupts, or
more likely doing lots of initialization and finalization on every
console driver entry, makes the problem less noticeable for sio.  The
initialization and finalization do lots of slow bus accesses.  This
makes the duty cycle maybe 99% with interrupts disabled and 1% enabled.
I now remember experimenting with much longer intentional delays in the
initialization to work around the problem of losing state on non-null
mode switches (especially high-frequency ones caused by polling for
input).  The delays can be made as large as 10-20 milliseconds before
becoming perceptible.

> The method I came up with years ago was to add a routine/flag that
> would have the console wait for a character instead of simply
> returning when there was no character...  Though if we implement
> cngrab properly where we don't flush buffers, turn off interrupts,
> etc, then that would work too...

That's how it used to work.  It just doesn't work for multiple
consoles.  However, in the usual case of only 1 active console, this
method would work fine.  The API churn also means that you can't simply
go back to the old method for 1 active console :-(.

You could also try the old method with a limit of a second or so for
each device.  After getting input from one device, give preference to
that one so the delays from cycling round them all are not too painful.
The timeouts would also be useful for output.

Bruce