Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 4 Jan 2015 23:36:03 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: svn commit: r276626 - head/sys/kern
Message-ID:  <20150104211113.A1445@besplex.bde.org>
In-Reply-To: <54A87C28.2030304@selasky.org>
References:  <201501031721.t03HLKHp060964@svn.freebsd.org> <20150104045338.V2929@besplex.bde.org> <54A87C28.2030304@selasky.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 4 Jan 2015, Hans Petter Selasky wrote:

> On 01/03/15 21:17, Bruce Evans wrote:
>> On Sat, 3 Jan 2015, Hans Petter Selasky wrote:
>> 
>>> Log:
>>>  Rework r276532 a bit. Always avoid recursing into the console drivers
>>>  clients, hence they might not handle it very well. This change allows
>>>  debugging mutex problems with kernel console drivers when
>>>  "debug.witness.skipspin=0" is set in the boot environment.
>> 
>> This keeps getting worse.  Console drivers are required to be reentrant
>> so that console i/o works in any context, including reentering the
>> console driver a couple of times.

Please trim quotes.

[quotes trimmed excessively since 2 copies of everything is too much already]

>>> ...
>>> +        if (mtx_owned(&cnputs_mtx))
>>> +            return;
>> 
>> The largest obvious new bug here is losing all output from panic()
>> (and possibly other critical output) for panics that occur while
>> the mutex is held by the same CPU.  The previous change made this
>> case work (for reentrant console drivers) instead of deadlocking.

> Thanks for your analysis. I can get the the full backtrace if you like, but 
> not right now. It is quite long!
>
> Quickly summarized, the problem is that printf recurses from inside a 
> mtx_lock_spin() because of a LOR, due to some testing I'm doing, which is 
> because witness calls directly to printf() again. In other words, we cannot 
> use printf() to debug problems in the console drivers, and actually printf() 
> should be disabled when we are inside certain code paths, simply.

I now remember more of the problem.  Broken console drivers use mutex
locks.  The most broken ones don't even declare these locks as
MTX_NOWITNESS.  Thus when a LOR is detected in one of these locks, it
causes endless recursion into the console driver.  Finally, LORs
_should_ be detected for many uses of these locks, since the fundamantal
brokenness of using mutex locks in console drivers includes there being
no single correct order for them.

printf() is required to work in any context.  It shouldn't be disabled.
Making it work for witnessing of itself is especially easy compared
with making it work for ddb'ing itself.  For witness, the recursion
only occurs in certain contexts that can be limited.  For ddb, it occurs
on any instruction within the console driver (when all the instructions
are stepped through).

printf() is perhaps the must useful place to put a ddb breakpoint at.
I often sprinkle debugging printf()s which are rarely excuted and put a
breakpoint a printf() instead of in all the callers.  Then when the
printf() is reached, I sometimes step a fair way into it.

The surprising thing is that the recursion/deadlock doesn't happen more
often.  This seems to be because:
- most systems configure MTX_SKIPSPIN, so witness never detects LORs with
   spinlocks
- most console drivers aren't so broken as to not use MTX_NOWITNESS

You said that you force !MTX_SKIPSPIN.  I suspect that you also use
one of the broken console drivers.  (Oops, you say later that you are
using vt).  vt and usb_serial are obviously broken, since they do
unconditional mutex locking with a non-recursive witnessed mutex,
without even the kdb_active hack to ignore the lock while in ddb.  The
above change helps a little for these drivers.  The higher-level lock
prevents interference from console calls from cnputs().  All other
console calls, which include all output when PRINTF_BUFR_SIZE is
configured and all ddb output, just deadlock as before.

Note that when the less broken console drivers are called while kdb is
active, the kdb_active hack results in the locking having no effect.
Recursive cases just proceed (but without risk of recursing endless
unless another lock is (mis)used), so the lock is useless in this case.
This case must work without the lock, and once it does the lock is not
needed in non-ddb cases either.

The quick fix for recursion in witness is to use MTX_NOWITNESS for all
console driver locks.  This is undesireable since it breaks witnessing
of these locks in other contexts, but sio always did it.  MTX_RECURSE
might also be needed, but is too dangerous, and it takes very large
locking bugs or rare races (something like printf()s in trap handlers)
for recursion to occur, once you have removed the recursion for witnessing
a bug in yourself.

The fundamental LOR problem with using mutex locking in console drivers is:
- console drivers need to be callable from anywhere, so their locks need to
   be ordered after all others
- however, to be useful on devices with the console shared with a normal
   device, the same lock needs to be used for the console part as for the
   main part.  (Sharing the lock gives potential deadlock or just too long
   a delay before the console i/o is done, but normally the main part
   releases the lock soon enough.)  In the main part, the lock is not
   special.  It shouldn't be MTX_NOWITNESS, and other locks may be
   acquired while it is held.  For no LOR for acquiring these other locks,
   the other locks need to be ordered after the console lock.  This conflicts
   with the requirement that the console lock be ordered after all others.
The locking table is ordered so that there is no LOR for the normal use.
Thus there is almost always a LOR for console use.  This LOR is normally
harmless, but not always.  It may indicate larger locking bugs in console
drivers.  To avoid seeing this LOR and getting endless recursion from it,
MTX_NOWITNESS must be used.

I tried reordering the locking table to put console locks last.  This seemed
to work at first, but that was only because MTX_SKIPSPIN was enabled.
When spinlocks were properly witnessed, (false) LORs were detected in normal
operation.  I now use the following which gives only minor reording by
pushing down console locks as far as possible.  This was only done carefully
for console and serial drivers that I used (sio, cy and syscons).

X diff -c2 ./kern/subr_witness.c~ ./kern/subr_witness.c
X *** ./kern/subr_witness.c~	Tue Apr 10 01:51:12 2012
X --- ./kern/subr_witness.c	Tue Oct 21 19:14:20 2014
X ***************
X *** 629,635 ****
X   #endif
X   	{ "rm.mutex_mtx", &lock_class_mtx_spin },
X   	{ "sio", &lock_class_mtx_spin },
X   	{ "scrlock", &lock_class_mtx_spin },
X - #ifdef __i386__
X   	{ "cy", &lock_class_mtx_spin },
X   #endif
X --- 629,637 ----
X   #endif
X   	{ "rm.mutex_mtx", &lock_class_mtx_spin },
X + #define MISORDERED
X + #undef MISORDERED
X + #ifdef MISORDERED
X   	{ "sio", &lock_class_mtx_spin },
X   	{ "scrlock", &lock_class_mtx_spin },
X   	{ "cy", &lock_class_mtx_spin },
X   #endif

cy is neither i386-specific or a console driver.

X ***************
X *** 639,643 ****
X --- 641,647 ----
X   #endif
X   	{ "scc_hwmtx", &lock_class_mtx_spin },
X + #ifdef MISORDERED
X   	{ "uart_hwmtx", &lock_class_mtx_spin },
X + #endif
X   	{ "fast_taskqueue", &lock_class_mtx_spin },
X   	{ "intr table", &lock_class_mtx_spin },
X ***************
X *** 650,662 ****
X --- 654,679 ----
X   	{ "rm_spinlock", &lock_class_mtx_spin },
X   	{ "turnstile chain", &lock_class_mtx_spin },
X + #ifndef MISORDERED
X + 	{ "sio", &lock_class_mtx_spin },
X + 	{ "cy", &lock_class_mtx_spin },
X + #endif
X   	{ "turnstile lock", &lock_class_mtx_spin },
X   	{ "sched lock", &lock_class_mtx_spin },
X   	{ "td_contested", &lock_class_mtx_spin },

When the interrupt handler calls the scheduler, its lock must be ordered
before scheduler locks.


X + #ifndef MISORDERED
X + 	{ "scrlock", &lock_class_mtx_spin },
X + #endif
X   	{ "callout", &lock_class_mtx_spin },
X   	{ "entropy harvest mutex", &lock_class_mtx_spin },
X + #ifdef MISORDERED
X   	{ "syscons video lock", &lock_class_mtx_spin },
X + #endif
X   #ifdef SMP
X   	{ "smp rendezvous", &lock_class_mtx_spin },
X   #endif
X + #ifndef MISORDERED
X + 	{ "uart_hwmtx", &lock_class_mtx_spin },
X + 	{ "syscons video lock", &lock_class_mtx_spin },
X + #endif

I think uart can be pushed down this far.  Its locked regions are too
short, but I think it releases the lock before acquiring others, so
it can be a leaf lock for all uses.  This has barely been tested.

X   #ifdef __powerpc__
X   	{ "tlb0", &lock_class_mtx_spin },

> Instead of a nice LOR printout, I immediately hit a panic, because other 
> mutexes marked as non-recursive suddenly got recursed inside the console 
> drivers, in this case "dev/vt". For now, printouts are not allowed if inside 
> "cnputc()". This at least got me on with the debugging. Previously 
> "cnputs_mtx" was marked non-recursive, and I think that is correct.

So it was just new bug in vt.  Try the quick fix for its console mutex.
Or better, use syscons and use the quick fix for its mutexes.  Its bugs
are better understood and partly fixed in a one of my versions.

vt is also missing the kdb_active hack.  It also has some lower-level
spin mutexes (mainly VTBUF_LOCK()?).  This gives obvious deadlocks.
E.g., one CPU may be doing normal outout and hold VTBUF_LOCK().  Then
a debugger trap on another CPU will first stop the first CPU and then
deadlock on VTBUF_LOCK().  If the kdb_active hack is used to avoid
this deadlock, then the locking simply doesn't work.

vt's locks don't seem to be in the static LOR table.  The table is small,
so there must be many other locks like that.  I don't see how a default
order can work.

Non-recursive is mostly correct.  The recursion indicates another bug.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150104211113.A1445>