From owner-freebsd-usb@FreeBSD.ORG  Sat Apr  2 18:02:28 2005
To: ticso@cicely.de
In-Reply-To: Your message of "Sat, 02 Apr 2005 19:19:39 +0200."
             <20050402171938.GU2072@cicely12.cicely.de> 
Date: Sat, 02 Apr 2005 19:02:18 +0100
From: Ian Dowse <iedowse@maths.tcd.ie>
Message-ID: <200504021902.aa86365@salmon.maths.tcd.ie>
cc: ticso@cicely12.cicely.de
cc: iedowse@maths.tcd.ie
cc: sebastien.b@swissinfo.org
cc: freebsd-usb@freebsd.org
Subject: Re: panic: uhci_abort_xfer: not in process context 

In message <20050402171938.GU2072@cicely12.cicely.de>, Bernd Walter writes:
>I fully agree with you - in general.
>You see the interrupt routine as non-blocking context.
>Fine - abort_xfer is bad as it blocks for a long time - agreed.
>But we need some kind of synchronisation with the top half.
>Although taking a mutex may block I don't consider it bad as it should
>not be held for a long time and therefore a long wait is unlikely.
>Allowing to wait for a Mutex in the bottom half is the whole reason we
>have interrupt threads.
>But that is the current problem - we are not even allowed to take
>a Mutex, just because intr_context isn't safe.
>The panic was triggered by a userland call that was claimed to be in
>interrupt context.
>OK - I was wrong in that it was not close, but that doesn't change
>anything with the basic problem:
>The panic triggered when it shouldn't.

Maybe I'm misunderstanding the cause of some of these panics, so
correct me if this sounds wrong. There seem to be two ways for the
"not in process context" panic to occur. One is where usbd_abort_pipe()
is called directly from an interrupt thread. The other is where a
callback called from the interrupt thread sleeps: while it sleeps,
other threads can enter the USB code with intr_context > 0, so the
panic can be triggered incorrectly.
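
To make the second case concrete, here is a toy, single-threaded
model of it in plain C. It is only a sketch: struct toy_bus and all
of the toy_* functions are made up for illustration and are not the
real USB code; the only thing taken from the real code is the idea
of an intr_context counter that is checked on the abort path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-bus state; not the real bus structure. */
struct toy_bus {
    int intr_context;   /* > 0 while the interrupt thread runs callbacks */
};

/* Models the check on the abort path: true means "would panic". */
static bool toy_abort_would_panic(const struct toy_bus *bus)
{
    return bus->intr_context > 0;
}

/* An unrelated userland thread that wants to abort a transfer. */
static bool toy_userland_abort(struct toy_bus *bus)
{
    return toy_abort_would_panic(bus);
}

typedef bool (*toy_other_thread_fn)(struct toy_bus *);

/*
 * A callback that sleeps. Sleeping lets another thread run, which is
 * modelled here by simply calling that thread's function in the middle.
 */
static bool toy_sleeping_callback(struct toy_bus *bus, toy_other_thread_fn other)
{
    return other(bus);
}

/* Returns true if the userland thread hit the (false) panic. */
bool toy_false_panic_demo(bool callback_sleeps)
{
    struct toy_bus bus = { 0 };
    bool panicked = false;

    bus.intr_context++;                 /* interrupt thread enters */
    if (callback_sleeps)
        panicked = toy_sleeping_callback(&bus, toy_userland_abort);
    bus.intr_context--;                 /* interrupt thread leaves */
    assert(bus.intr_context == 0);

    /* Without the sleep, the other thread only runs afterwards. */
    if (!callback_sleeps)
        panicked = toy_userland_abort(&bus);
    return panicked;
}
```

In this model toy_false_panic_demo(true) reports the panic even
though the aborting thread is an ordinary process, while
toy_false_panic_demo(false) does not.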

In both cases, the bug is that a callback function is sleeping and
allowing other threads to run. I'm not sure I understand your
comments about locking mutexes though. It is fine for a callback
function to acquire a mutex: acquiring a mutex is not really
sleeping, since currently held locks are not dropped even if the
mutex cannot be acquired immediately. For example, the USB system
will hold Giant at the time that the interrupt thread is calling
completion callbacks. Even if one of those callbacks needs to acquire
another mutex that is currently locked, Giant will not be dropped
while that mutex is acquired, so no other threads can enter the USB
code. The problem is limited to tsleep/msleep, where all mutexes are
dropped.
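
The difference can be demonstrated in userland with POSIX threads,
as a loose analogy only. In the sketch below "giant" is an ordinary
pthread mutex standing in for Giant, "sc_mtx" is a hypothetical
driver mutex, and pthread_cond_wait() plays the role of msleep()
because it also releases its mutex while waiting. None of this is
kernel code and the function names are invented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;   /* stands in for Giant */
static pthread_mutex_t sc_mtx = PTHREAD_MUTEX_INITIALIZER;  /* hypothetical driver mutex */
static pthread_cond_t wakeup = PTHREAD_COND_INITIALIZER;

/* Another thread probing whether it can enter the code guarded by giant. */
static void *enter_nowait(void *arg)
{
    bool *entered = arg;

    if (pthread_mutex_trylock(&giant) == 0) {
        *entered = true;
        pthread_mutex_unlock(&giant);
    }
    return NULL;
}

/* Another thread that waits for giant, then wakes the sleeper. */
static void *enter_wait(void *arg)
{
    bool *entered = arg;

    pthread_mutex_lock(&giant);
    *entered = true;
    pthread_cond_signal(&wakeup);
    pthread_mutex_unlock(&giant);
    return NULL;
}

/* "Callback" that only takes a second mutex: giant is held throughout. */
bool entered_while_callback_locks(void)
{
    pthread_t other;
    bool entered = false;
    int rc;

    pthread_mutex_lock(&giant);
    rc = pthread_create(&other, NULL, enter_nowait, &entered);
    assert(rc == 0);
    pthread_mutex_lock(&sc_mtx);        /* could block; giant is not dropped */
    pthread_join(other, NULL);
    pthread_mutex_unlock(&sc_mtx);
    pthread_mutex_unlock(&giant);
    return entered;
}

/* "Callback" that sleeps: giant is released for the duration of the wait. */
bool entered_while_callback_sleeps(void)
{
    pthread_t other;
    bool entered = false;
    int rc;

    pthread_mutex_lock(&giant);
    rc = pthread_create(&other, NULL, enter_wait, &entered);
    assert(rc == 0);
    while (!entered)
        pthread_cond_wait(&wakeup, &giant);   /* drops giant while asleep */
    pthread_mutex_unlock(&giant);
    pthread_join(other, NULL);
    return entered;
}
```

The first function keeps the other thread out; the second lets it in
while the "callback" still believes it is in the middle of its work.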

A more specific statement of the original comments would be that
callbacks in asynchronous event systems should not call tsleep or
msleep, no matter what the calling context is.
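
One common way to follow that rule, sketched here purely as an
assumption and not as what the USB code does, is for the callback
to queue the slow work and return, leaving a thread that is allowed
to sleep to run it later. The toy_* names and the fixed-size queue
below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 8

/* Hypothetical deferred-work queue, drained later from process context. */
struct toy_task_queue {
    void (*fn[TOY_QUEUE_LEN])(void *);
    void *arg[TOY_QUEUE_LEN];
    size_t count;
};

/* Non-blocking: fails instead of waiting when the queue is full. */
static bool toy_enqueue(struct toy_task_queue *q, void (*fn)(void *), void *arg)
{
    if (q->count == TOY_QUEUE_LEN)
        return false;
    q->fn[q->count] = fn;
    q->arg[q->count] = arg;
    q->count++;
    return true;
}

/* The slow part that might need to sleep; here it just counts runs. */
static void toy_slow_cleanup(void *arg)
{
    *(int *)arg += 1;
}

/* Completion callback: never sleeps, only schedules the slow work. */
bool toy_xfer_done(struct toy_task_queue *q, int *cleanups_run)
{
    return toy_enqueue(q, toy_slow_cleanup, cleanups_run);
}

/* Called from a thread that may sleep. Returns the number of tasks run. */
size_t toy_drain(struct toy_task_queue *q)
{
    size_t i, n;

    assert(q != NULL);
    n = q->count;
    for (i = 0; i < n; i++)
        q->fn[i](q->arg[i]);
    q->count = 0;
    return n;
}
```

The callback returns immediately after toy_xfer_done(); the cleanup
only happens when toy_drain() is later called from process context.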

Ian