From: "Maksim Yevmenkin"
Sender: maksim.yevmenkin@gmail.com
Date: Thu, 20 Nov 2008 16:25:40 -0800
To: "Bruce Evans"
Cc: "current@freebsd.org"
Subject: Re: syscons(4) races

[moving to -current]

Bruce,

ok, i'm looking at it again...

[...]

>> More locking for syscons(4). This should prevent races with sckbdevent().
>
> This cannot be the correct fix. It should cause panics due to failure
> of the (missing) assertion that no locks may be accessed in debugger mode.
> Locks may not be accessed in debugger mode because:
> (1) locks may be in an inconsistent state (when locking code is being
>     debugged) (unless they are made to work virtually -- switch all locks
>     to a safe state while in debugger mode, but display the unswitched
>     state in all debugger commands)
> (2) deadlock is possible (would also be avoided by virtualization).

ok, fair enough.

>> --- head/sys/dev/syscons/syscons.c Sun Nov 16 21:57:54 2008 (r185012)
>> +++ head/sys/dev/syscons/syscons.c Sun Nov 16 22:39:04 2008 (r185013)
>> @@ -1572,6 +1572,7 @@ sccngetch(int flags)
>>  int s = spltty(); /* block sckbdevent and scrn_timer while we poll */
>
> The comment still says that this is what blocks sckbdevent().
>
> The comment was wrong too for debugger mode -- in debugger mode,
> interrupts are masked in hardware and other CPUs are stopped, so nothing
> except possibly this function's internals can call sckbdevent().

ok

> Internal calls to scrn_timer() used to be prevented by sccndbctl()
> setting syscons' `debugger' flag and this function and others checking
> this flag. This has been lost. I think the flag isn't needed and
> wasn't used to protect sckbdevent(), and your problem has nothing to
> do with debugger mode except for breaking it. Instead your problem
> is a layering one (continued below (*)).

ok

>>  int c;
>>
>> + mtx_lock(&Giant);
>>  /* assert(sc_console != NULL) */
>>
>>  /*
>
> Acquiring Giant in debugger mode is especially invalid (except deadlock
> is not so likely for a recursive lock).

ok, in this case i have a somewhat stupid question. when kbdmux(4) is the
default keyboard, all those kbdd_xxx() calls (that sccngetch() makes) will
call into kbdmux(4) functions that will grab Giant. kbdmux was changed about
2 months ago to do that. it sounds like those changes are completely wrong.
am i correct here?

> BTW, sccngetch() shouldn't exist. It existed to demultiplex the 2
> console driver entry points sc_cngetch() and sc_cncheckc(), but
> sc_cncheckc() has gone away.

the entry point in the consdev struct is still there. also, cncheckc() is
trying to call it (if it's present). so it looks like it may not be
completely gone, just that syscons(4) no longer implements it.

> (*) syscons has several getc routines. The layering and locking for
> these should be something like:
>
> sc_cngetc(): must not access any locks; currently implemented by
>     calling scgetc(); therefore:
> scgetc(): must not access any locks

ok.

> sccngetch(): bogus layering -- see above
> sckbdevent(): I think this is the entry point for all normal
>     (non-low-level-console) syscons input. It is currently implemented
>     by calling scgetc(), which must not access any locks. Therefore,
>     any locking must be in this function. I think it currently uses
>     Giant locking.

yes, it does.

> There is still a problem for calls to sc_cngetc() from the low-console
> driver. These can race with sckbdevent(). In debugger mode, no locking
> is permitted, so syscons' getc functions must be carefully written to
> do only harmless things when they lose races.
> They are not carefully
> written (e.g., almost the first thing they do, shown in the above, may
> give a buffer overrun:
>
>     if (fkeycp < fkey.len) {
>         mtx_unlock(&Giant);
>         splx(s);
>         return fkey.str[fkeycp++];
>     }

all right, so we cannot use locks, and i'm guessing this includes spin locks
too. can we use atomic operations here? since, in polling mode, the consumer
is going to call getc() in a loop, can we use an atomic reference counter to
make sure there is only one caller running at a time? if someone has already
grabbed the reference counter, just return -1 as if there is no input?

> since the increment is not atomic with the bounds check), but they
> mostly work. (Syscons' putc routines have a much larger number of
> races like this, but they mostly work too. It used to be easy to cause
> panics by racing normal console output with printfs from an interrupt
> handler (use a high frequency timeout interrupt handler that prints
> something), but almost any locking would fix that and I think Giant-locking
> everything fixed it accidentally (**). So there is only a problem in
> modes like debugger mode where locking is not permitted.)
>
> The races for sc_cngetc() were normally limited:
> - most calls were in debugger mode, and debugger mode limits problems
> - the only other calls were for things like gets() for non-auto
>   mountroot and cngetc() for the "hit any key to reboot". This case
>   might even supply the needed and permitted Giant locking accidentally.
>
> However, you seem to have made the races very common by (ab)using cngetc()
> for keyboard multiplexing. cngetc() is not designed for this. You need
> to know syscons' internals and do the necessary Giant locking in the caller.

this is what i'm not getting. are you saying that lower layer keyboard
drivers cannot use any locking whatsoever in polling mode (or rather debug
mode)? are you saying that we need a completely different path for handling
keyboard input in debug mode, and/or possibly polling mode?

thanks,
max
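
p.s. to make the atomic idea above a bit more concrete, here is a rough
userland sketch using C11 <stdatomic.h>. it is not kernel code and not a
proposed patch: every name in it (fkey_buf, poll_busy, fkey_load,
getc_gated, poll_getc) is made up for illustration, and the kernel would
have to use its own atomic primitives instead. it only shows the gate: a
caller that finds the gate taken returns -1 as if there were no input, so
the bounds check and the increment are never run by two callers at once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the pending function-key string in syscons. */
struct fkey_buf {
    char   str[16];
    size_t len;   /* number of valid bytes in str */
    size_t pos;   /* next byte to hand out (plays the role of fkeycp) */
};

/* Gate: set while one caller is inside the getc body. */
static atomic_flag poll_busy = ATOMIC_FLAG_INIT;
static struct fkey_buf fkey;

/* Load a pending string; only safe while nobody is polling. */
static void fkey_load(const char *s)
{
    size_t n = strlen(s);

    assert(n <= sizeof(fkey.str));
    memcpy(fkey.str, s, n);
    fkey.len = n;
    fkey.pos = 0;
}

/* The racy check-then-increment; must only run with the gate held. */
static int getc_gated(void)
{
    if (fkey.pos < fkey.len)
        return (unsigned char)fkey.str[fkey.pos++];
    return -1;
}

/* Returns the next byte, or -1 if there is no input or the gate is taken. */
static int poll_getc(void)
{
    int c;

    if (atomic_flag_test_and_set_explicit(&poll_busy, memory_order_acquire))
        return -1;  /* lost the race: pretend there is no input */
    c = getc_gated();
    atomic_flag_clear_explicit(&poll_busy, memory_order_release);
    return c;
}
```

the gate never blocks, so it cannot deadlock by itself. the catch i can see:
if the debugger is entered while a stopped cpu still holds the gate, every
poll from the debugger returns -1 and no input ever arrives, so this may
only trade a deadlock for a dead keyboard.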