From owner-freebsd-current@FreeBSD.ORG  Sat Jan 24 03:07:50 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DE038106566B;
	Sat, 24 Jan 2009 03:07:50 +0000 (UTC)
	(envelope-from prvs=julian=268335b7e@elischer.org)
Received: from smtp-outbound.ironport.com (smtp-outbound.ironport.com
	[63.251.108.112])
	by mx1.freebsd.org (Postfix) with ESMTP id B7FA48FC12;
	Sat, 24 Jan 2009 03:07:50 +0000 (UTC)
	(envelope-from prvs=julian=268335b7e@elischer.org)
Received: from jelischer-laptop.sfo.ironport.com (HELO
	julian-mac.elischer.org) ([10.251.22.38])
	by smtp-outbound.ironport.com with ESMTP; 23 Jan 2009 18:40:11 -0800
Message-ID: <497A7F8E.60804@elischer.org>
Date: Fri, 23 Jan 2009 18:40:14 -0800
From: Julian Elischer <julian@elischer.org>
User-Agent: Thunderbird 2.0.0.19 (Macintosh/20081209)
MIME-Version: 1.0
To: Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
References: <20090123154336.GJ60948@e.0x20.net>	<200901232146.58360.hselasky@c2i.net>	<bb4a86c70901231345u24cb0786obe9b26174af9ffd3@mail.gmail.com>	<200901232337.05627.hselasky@c2i.net>
	<bb4a86c70901231514x7696e457k3a6dc0e59d4daed7@mail.gmail.com>
In-Reply-To: <bb4a86c70901231514x7696e457k3a6dc0e59d4daed7@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-current@freebsd.org, current@freebsd.org,
	Alfred Perlstein <alfred@freebsd.org>,
	Hans Petter Selasky <hselasky@c2i.net>
Subject: Re: panic: mutex Giant not owned
	at	/usr/src/sys/kern/tty_ttydisc.c:1127
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 24 Jan 2009 03:07:51 -0000

Maksim Yevmenkin wrote:
> [...]
> 
>>> i've added taskqueue_drain() to detach() in my "stall" diff that i
>>> posted few hours back. is that addresses some of your concerns?
>> Yes, at detach you always need to drain by principle.
> 
> ok, fine :) taskqueue_drain() in detach is go :) move to the next item :)
> 
>>> also
>>> keep in mind that while netgraph node is created and destroyed
>>> primarily by the driver in response to hardware arrival and departure,
>>> nothing protects it from the netgraph side. in other words there is
>>> another way to manipulate netgraph node using netgraph api. for
>>> example user can request shutdown of a netgraph node, etc. so all of
>>> this has to be taking into consideration.
>> Ok, I've fixed this: Destroy is now non-blocking USB-wise. See:
>>
>> http://perforce.freebsd.org/chv.cgi?CH=156586
> 
> i'm sorry, did i mention there is no sleeping in netgraph? :-)

it's not just netgraph.. it's the network stack in general..
can you imagine tcp deciding to sleep?

> 
> again, sorry, but this is not going to work. you still doing
> UBT_LOCK()/UBT_UNLOCK() in netgraph method. that is you are trying to
> grab the same lock that is used for locking interfaces.
> 
> more importantly, like i said before, usb2_transfer_stop() does
> USB_BUS_LOCK()/USB_BUS_UNLOCK(). is there a guarantee that another usb
> device will not do something synchronously over the same usb bus?
> 
> i'm actually working on including some of your changes (except getting
> rid of ubt_task), so, please, lets just stop for a second and talk
> before we start commit war here :) please bear with me for just a
> second, ok?
> 
>>> in fact, if i understand you correctly, with addition of
>>> taskqueue_drain() to detach(), the later becomes completely
>>> synchronous. that is
>>>
>>> 1) NG_NODE_REF()
>>> 2) ng_rmnode_self (i.e. mark node as possibly dead, detach all the
>>> hooks, etc. etc.)
>>> 3) taskqueue_drain() (i.e. sleep until task has finished)
>>> 4) usb2_transfer_unsetup() (which does "usb2_transfer_drain() and sleeps)
>>> 5) NG_NODE_UNREF()
>>> 6) free everything up
>>>
>>> in theory, items (1) and (5) above just an additional protection
>>> because i was not sure about inner workings of usb2. i think, when we
>>> sort everything out they can be removed.
>> I have a question. What is the following doing at the middle of the
>> ubt_detach():
>>
>>        if (node != NULL)
>>                NG_NODE_UNREF(node);
>>
>> If you say that:
>>
>>                ng_rmnode_self(node);
>>
>> Cleans up the last reference?
> 
> the complete code is
> 
> node_p  node = sc->sc_node;
> 
> if (node != NULL) {
>    sc->sc_node = NULL;  <-- clear sc_node
> 
>    NG_NODE_SET_PRIVATE(node, NULL);
>    NG_NODE_REALLY_DIE(node);
> 
>    NG_NODE_REF(node); <--- grab +1 reference
>    ng_rmnode_self(node); <--- mark node as "dead", but not ensure its
> not free()d
> }
> 
> /* bla, bla */
> 
> if (node != NULL)
>    NG_NODE_UNREF(node); <--- drop 1 reference and possibly free() node
> 
> so, say we enter detach() with only one reference (from attach()).
> when we see the sc_node pointer, we assume that there could be
> multiple (> 1) references on it. so just for added protection we grab
> one more (now reference count is 2). then we call ng_rmnode_self()
> which will mark node as "dead" and drop one reference (now reference
> count is 1). then we do "bla bla" stuff which could access node
> pointer. the important thing is that node pointer still points to a
> valid "dead" node, so NG_NODE_NOT_VALID() can be used to check if node
> is "dead" or "alive". finally when "bla bla" stuff is done, we know we
> are not going to access node pointer any mode and we drop our
> reference (now reference count is 0 and node is destroyed).
> 
> 
>>> i do not understand. if usb2_transfer_unsetup() calls
>>> usb2_transfer_drain() and sleeps, then what difference does it make in
>>> detach()?
>> You are right that USB-wise it does not matter if you call
>> usb2_transfer_drain() separately or if you call usb2_transfer_unsetup(). I'm
>> just thinking with regard to the state inside bluetooth, because there are
>> two contexts running in parallell. I.E. Callbacks might be executed in
>> paralell to bluetooth tearing down.
> 
> do you mean transfer stop on hook disconnection? but those are
> protected by interface locks, no? and like i said, i'm changing
> ubt_task to call drain() instead of stop() to make stop completely
> synchronous (which is actually a nice thing).
> 
>>> i just followed original code that did it this way, i.e.
>>> drain and unsetup all transfers in one call.
>> Yes, that's fine.
> 
> ok, fine. one more down :)
> 
>>>> In USB2 a "usb2_start_hardware()" call is always paired with a USB
>>>> transfer callback, no matter what you do! This way of design has been
>>>> chosen so that you can do refcounts inside the callback!
>>> ok, and that is exactly what code does. the only thing is that
>>> refcounts are on netgraph node and not on softc structure. again the
>>> reason for that is because netgraph node can be accessed from both
>>> usb2 driver and netgraph subsystem directly.
>> Isn't it enough to increment the refcount at attach and decrement it at
>> detach ????
> 
> it would be if we could sleep in netgraph, and we can not. in theory,
> we only need one extra reference which would tell us if there anything
> in usb2 that still could access node pointer. in practice, instead of
> checking every transfer and making sure its no pending before dropping
> that extra reference i just add multiple references for each usb2
> transfer and ubt_task (i.e. things that could access node pointer).
> 
> [...]
> 
>>> well, when i looked at the code, i saw that both functions do
>>> USB_BUS_LOCK() and USB_BUS_UNLOCK().
>> Well, this is the lock of the USB IRQ handler. One per USB controller. There
>> are no more locks than this that will be locked. The IRQ handler is called
>> from a high priority context and will not block very long. Same with USB
>> callbacks.
> 
> right, and what if some other usb device attached to the same usb
> controller is doing something synchronously? do you see that you could
> potentially block netgraph for arbitrary time and that is really out
> of ng_ubt2 driver control?
> 
>>> also, now that you mentioned it, i should call usb2_transfer_drain()
>>> in ubt_task() instead of usb2_transfer_stop() to make transfer stop
>>> completely synchronous as well.
>> When I think about it, usb2_transfer_stop() is enough. See my latest Patch in
>> P4.
> 
> please read my comment above.
> 
>>> please let me reiterate, sleeping in netgraph is NOT allowed. even if
>>> its for a few milliseconds. please accept it as a fact. think of all
>>> netgraph methods as fast interrupt handlers. it is very typical for
>>> fast interrupt handlers to do minimal amount of work and schedule swi
>>> to do more heavy processing. that is exactly what the code does here.
>> Fixed!
> 
> i beg to differ :)
> 
> thanks,
> max
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"