Date: Sat, 22 Nov 2008 22:52:30 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Maksim Yevmenkin <emax@freebsd.org> Cc: "current@freebsd.org" <current@freebsd.org>, Bruce Evans <brde@optusnet.com.au> Subject: Re: syscons(4) races Message-ID: <20081122204840.K1008@besplex.bde.org> In-Reply-To: <bb4a86c70811201625v4dd78c64p64868b6d636a7602@mail.gmail.com> References: <bb4a86c70811201625v4dd78c64p64868b6d636a7602@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 20 Nov 2008, Maksim Yevmenkin wrote: > [moving to -current] I'm not subscribed, so when I send mail there it is delayed. >>> --- head/sys/dev/syscons/syscons.c Sun Nov 16 21:57:54 2008 >>> (r185012) >>> +++ head/sys/dev/syscons/syscons.c Sun Nov 16 22:39:04 2008 >>> (r185013) >>> ... >>> + mtx_lock(&Giant); >>> /* assert(sc_console != NULL) */ >>> >>> /* >> >> Acquiring Giant in debugger mode is especially invalid (except deadlock >> is not so likely for a recursive lock). > > ok, in this case i have a somewhat stupid question. when kbdmux(4) is > the default keyboard, all those kbdd_xxx() calls (that sccngetch() > makes) will call into kbdmux(4) functions that will grab giant. > kbdmux was changed about 2 months ago to do that. it sounds like those > changes are completely wrong. am i correct here? Yes, the low-level console functions are supposed to be reentrant (more details below), so they cannot do things like that. >> BTW, sccngetch() shouldn't exist. It existed to demultiplex the 2 >> console driver entry points sc_cngetch() and sc_cncheckc(), but >> sc_cncheckc() has gone away. > > the entry point in consdev struct is still there. also cncheckc() is > trying to call it (if its present). so it looks like it may not be > completely gone, just sysconst(4) no longer implements it. Ah, this is compatibility cruft which is bogusly only used by a driver that is newer than the cruft (xen). Normal drivers use CONSOLE_DRIVER() but xen uses the old interface CONS_DRIVER(). CONSOLE_DRIVER() differs from CONS_DRIVER() only in not taking the cn_checkc() entry point. Binary compatibility and correct naming were broken by renaming cn_checkc() to cn_getc(). Names are still correct at the top level (there, cncheckc() gives the non-blocking interface and cngetc() gives the blocking interface, while in normal drivers there is now only cn_getc() which gives the non-blocking interface). It is cn_dbctl() that has completely gone away. >> There is still a problem for calls to sc_cngetc() from the low-console >> driver. These can race with sckbdevent(). In debugger mode, no locking >> is permitted, so syscons' getc functions must be carefully written to >> do only harmless things when they lose races. They are not carefully >> written (e.g., almost the first thing they do, shown in the above, may >> give a buffer overrun: >> >> if (fkeycp < fkey.len) { >> mtx_unlock(&Giant); >> splx(s); >> return fkey.str[fkeycp++]; >> } > > all right, so we can not use locks, i'm guessing this includes spin > locks too. can we use atomic operations here? since, in polling mode, > consumer is going to call getc() in a loop, can we use atomic > reference counter to make sure there is only one caller running at a > time? if someone already grabbed reference counter just return -1 as > if there is no input? It is safer to use local spinlocks and safer still to use home-made locks based on atomic_cmpset(), but still wrong to do simply that since it can cause deadlock, especially for debugging -- it is not possible to make sure that there is only 1 caller running, since debugging (or NMI, or perhaps an ordinary interrupt that cannot reasonably be masked) can preempt any caller. sio uses its local spinlock (I don't like this) together with a hack to allow reentering from the debugger only: % need_unlock = 0; % if (!kdb_active && sio_inited == 2 && !mtx_owned(&sio_lock)) { % mtx_lock_spin(&sio_lock); % need_unlock = 1; % } This is only done for cn_putc(). cn_getc() (really cn_checkc()) is completely missing locking (except in debugger mode) and needs higher-level locking anyway, since the whole polling loop in cngetc() needs to be locked to prevent interference. Here sio_lock is the local spinlock (actually per-driver so it isn't very local), sio_inited is a flag to prevent use of sio_lock here before sio_lock is initialized (this flag requires delicate initialization using atomic_cmpset to avoid races), and !kdb_active is the hack to let the debugger in irrespective of the state of the lock and without touching the lock. Any lock used for the console must be MTX_QUIET and/or MTX_NOWITNESS to prevent console i/o generating endless i/o about itself. I think it must be a spinlock too. The lock thus reduces to a simple spinlock which can be implemented directly using atomic_cmpset(), thus making it clear that the lock has no interactions with other parts of the system. sio shouldn't be using its normal-i/o driver lock for the console. It attempts to be reentrant and does a context switch of the normal-device state while in console mode. However, this only works for UP. For SMP, something more is needed to stop interference from other CPUs. In debugger mode, the system provides the something by stopping other CPUs so the old UP code works (this also provides the necessary locking for the loop in cngetc()). In non-debugger mode, abusing the normal-i/o driver lock provides the something. This is ugly but works quite well. Analysis: - Suppose one CPU is doing console i/o and acquires the lock. Then other CPUs have be prevented from touching any device state (since sio console i/o is not completely reentrant -- for UP it depends on masking interrupts to prevent more than 1 level of reentrance (although it preserves the state on the stack), and stopping other CPUs corresponds to masking interrupts. - Suppose this code is called with the lock held. Then waiting until the lock is not held is best provided the wait isn't very long (in particular, provided The wait isn't infinite due to deadlock). If the lock is held by another CPU, then waiting corresponds to waiting in the previous case (except since we are abusing the normal-i/o dirver lock, we may be waiting for normal i/o and not console i/o). Using a spinlock prevents the lock being held by the same CPU except in cases involving non-maskable exceptions: - debugger exceptions are handled by the !kdb_active hack. Now sio's console i/o reentrancy can be used a couple of layers deep (for ddb tracing the console i/o code; ddb doesn't support debugging itself, so the reentrance can't go deeper than a couple). - other traps like SIGBUS shouldn't get here since the console i/o code shouldn't cause such traps. - NMIs. An NMI in console i/o code probably causes deadlock if the NMI handler does console i/o. I think NMIs can cause deadlock generally -- the implementation of spinlocks uses disabling of interrupts to prevent deadlock, but NMIs cannot be disabled; thus NMI handlers cannot call mtx_lock_spin(), but I think they do, if only by wandering into console i/o code to report a problem. >> ... >> However, you seem to have made the races very common by (ab)using cngetc() >> for keyboard multiplexing. cngetc() is not designed for this. You need >> to know syscons' internals and do the necessary Giant locking in the caller. > > this is what i'm not getting. are you saying that lower layer keyboard > drivers can not use any locking whatsoever in polling mode (or rather > debug mode)? are you saying that we need a completely different path > for handling keyboard input in debug mode? and/or possibly polling > mode? I'm saying thay lower layer console drivers cannot use any normal locking in any mode. In non-debugger modes, at least cn_putc() must not block on any normal lock, since interrupt handlers can call printf() which calls cn_putc(). Before [Free]BSD used mutex or ithreads, this almost required the console i/o routines to be reentrant. Debugger mode requires reentrancy even more, but is otherwise simpler. Now with ithreads, even non-spin locks might work (at a cost of hopefully unimportant delays) for interrupt handlers, but debugger mode still requires reentrancy. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20081122204840.K1008>