From owner-freebsd-current@FreeBSD.ORG Fri Nov 7 05:17:23 2003 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2FEDF16A4CF; Fri, 7 Nov 2003 05:17:23 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 705D243F85; Fri, 7 Nov 2003 05:17:21 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id AAA09012; Sat, 8 Nov 2003 00:17:07 +1100 Date: Sat, 8 Nov 2003 00:17:06 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Morten Johansen In-Reply-To: <3FAAF50B.9030105@morten-johansen.net> Message-ID: <20031107230107.O3769@gamplex.bde.org> References: <3FA966B2.9040704@morten-johansen.net> <20031105202947.A43448@pooker.samsco.home> <3FAAF50B.9030105@morten-johansen.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-current@freebsd.org Subject: Re: the PS/2 mouse problem X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Nov 2003 13:17:23 -0000 On Fri, 7 Nov 2003, Morten Johansen wrote: > Morten Johansen wrote: > > Scott Long wrote: > > > >> One thought that I had was to make psmintr() be INTR_FAST. I need to > >> stare at the code some more to fully understand it, but it looks like it > >> wouldn't be all that hard to do. Basically just use the interrupt > >> handler > >> to pull all of the data out of the hardware and into a ring buffer in > >> memory, and then a fast taskqueue to process that ring buffer. It would > >> at least answer the question of whether the observed problems are due to > >> ithread latency. And if done right, no locks would be needed in > >> psmintr(). However, it is usually easier to use a lock even if not strictly necessary. psm as currently structured uses the technique of calling psmintr() from the timeout handler. This requires a lock. If this were not done, then the timeout routine would probably need to access hardware using scattered i/o instructions, and these would need locks (to prevent them competing with i/o instructions in psmintr()). Putting all the hardware accesses in the fast interrupt handler is simpler. The sio driver uses this technique but doesn't manage to put _all_ the i/o's in the interrupt handler, so it ends up having to lock out the interrupt handler all over the place. Ring buffers can be self-locking using delicate atomic instructions, but they are easier to implement using locks. > > I can reproduce the problem consistently on my machine, by moving the > > mouse around, while executing e.g this command in a xterm: > > > > dd if=/dev/zero of=test bs=32768 count=4000; sync; sync; sync > > > > when the sync'ing sets in the mouse attacks. > > It is very likely due to interrupt latency. > > > > I'd be happy to test any clever patches. > > Wow. You are completly right! > By using a MTX_SPIN mutex instead, and marking the interrupt handler > INTR_MPSAFE | INTR_FAST, my problem goes away. > I am no longer able to reproduce the mouse attack. > I have not noticed any side-effects of this. Could there be any? > I will file a PR with an updated patch, unless you think it's a better > idea to rearrange the driver. > Probably the locking could be done better anyway. Er, psmintr() needs large changes to become a fast interrupt handler. it does many things that may not be done by a fast interrupt handler, starting with the first statement in it: /* read until there is nothing to read */ while((c = read_aux_data_no_wait(sc->kbdc)) != -1) { This calls into the keyboard driver, which is not written to support any fast interrupt handlers. In general, fast interrupt handlers may not call any functions, since the "any" function doesn't know that it is called in fast interrupt handler context and may do things that may not be done in fast interrupt handler context. As it happens, read_aux_data_no_wait() does the following bad things: - it accesses private keyboard data. All data that is accessed by a fast interrupt handler must be locked by a common lock or use self-locking accesses. Data in another subsystem can't reasonably be locked by this (although the keyboard subsystem is close to psm, you don't want to export the complexities of psmintr()'s locking to the keyboard subsystem). - it calls other functions. The closure of all these calls must be examined and made fast-interrupt-handler safe before this is safe. The lowest level will resolve to something like inb(PSMPORT) and this alone is obviously safe provided PSMPORT is only accessed in the interrupt handler or is otherwise locked. (Perhaps the private keyboard data is actually private psm data that mainly points to PSMPORT. Then there is no problem with the data accesses. But the function calls make it unclear who owns the data.) - it sometimes calls the DELAY() function, which is not permitted in fast interrupt handlers since apart from locking issues, fast interrupt handlers are not permitted to busy-wait. Many of the complications for fast interrupt handlers shouldn't be needed in psm. Just make psmintr() INTR_MPSAFE. This is nontrival, however. Fine grained locking gaves many of the complications that were only in fast interrupt handlers in RELENG_4. E.g., for psmintr() to be MPSAFE, all of its calls into the keyboard subsystem need to be MPSAFE, and they are unlikely to be so unless the keyboard subsystem is made MPSAFE. The following method can be used to avoid some of the complications: make the interrupt handler not touch much data, so that it can be locked easily. The data should be little more than a ring buffer. Make the handler either INTR_MPSAFE or INTR_FAST (it doesn't matter for slow devices like psm). Put all the rest of what was in the interrupt handler in non-MPSAFE code (except where it accesses data shared with the interrupt handler) so that all of this code and its closure doesn't need to be made MPSAFE. This method is what the sio driver uses in -current, sort of accidentally. sio's SWI handler and all of the tty subsystem are not MPSAFE, but this has almost no visible effect because very low latency is only needed at the lowest level. BTW, the flags combination (INTR_MPSAFE | INTR_FAST) makes no sense. INTR_MPSAFE only applies to non-INTR_FAST handlers and INTR_FAST is currently non-advisory. Bruce