From owner-freebsd-current@FreeBSD.ORG Sat Jan 24 03:07:50 2009 Return-Path: Delivered-To: current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DE038106566B; Sat, 24 Jan 2009 03:07:50 +0000 (UTC) (envelope-from prvs=julian=268335b7e@elischer.org) Received: from smtp-outbound.ironport.com (smtp-outbound.ironport.com [63.251.108.112]) by mx1.freebsd.org (Postfix) with ESMTP id B7FA48FC12; Sat, 24 Jan 2009 03:07:50 +0000 (UTC) (envelope-from prvs=julian=268335b7e@elischer.org) Received: from jelischer-laptop.sfo.ironport.com (HELO julian-mac.elischer.org) ([10.251.22.38]) by smtp-outbound.ironport.com with ESMTP; 23 Jan 2009 18:40:11 -0800 Message-ID: <497A7F8E.60804@elischer.org> Date: Fri, 23 Jan 2009 18:40:14 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.19 (Macintosh/20081209) MIME-Version: 1.0 To: Maksim Yevmenkin References: <20090123154336.GJ60948@e.0x20.net> <200901232146.58360.hselasky@c2i.net> <200901232337.05627.hselasky@c2i.net> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, current@freebsd.org, Alfred Perlstein , Hans Petter Selasky Subject: Re: panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 24 Jan 2009 03:07:51 -0000 Maksim Yevmenkin wrote: > [...] > >>> i've added taskqueue_drain() to detach() in my "stall" diff that i >>> posted few hours back. is that addresses some of your concerns? >> Yes, at detach you always need to drain by principle. > > ok, fine :) taskqueue_drain() in detach is go :) move to the next item :) > >>> also >>> keep in mind that while netgraph node is created and destroyed >>> primarily by the driver in response to hardware arrival and departure, >>> nothing protects it from the netgraph side. in other words there is >>> another way to manipulate netgraph node using netgraph api. for >>> example user can request shutdown of a netgraph node, etc. so all of >>> this has to be taking into consideration. >> Ok, I've fixed this: Destroy is now non-blocking USB-wise. See: >> >> http://perforce.freebsd.org/chv.cgi?CH=156586 > > i'm sorry, did i mention there is no sleeping in netgraph? :-) it's not just netgraph.. it's the network stack in general.. can you imagine tcp deciding to sleep? > > again, sorry, but this is not going to work. you still doing > UBT_LOCK()/UBT_UNLOCK() in netgraph method. that is you are trying to > grab the same lock that is used for locking interfaces. > > more importantly, like i said before, usb2_transfer_stop() does > USB_BUS_LOCK()/USB_BUS_UNLOCK(). is there a guarantee that another usb > device will not do something synchronously over the same usb bus? > > i'm actually working on including some of your changes (except getting > rid of ubt_task), so, please, lets just stop for a second and talk > before we start commit war here :) please bear with me for just a > second, ok? > >>> in fact, if i understand you correctly, with addition of >>> taskqueue_drain() to detach(), the later becomes completely >>> synchronous. that is >>> >>> 1) NG_NODE_REF() >>> 2) ng_rmnode_self (i.e. mark node as possibly dead, detach all the >>> hooks, etc. etc.) >>> 3) taskqueue_drain() (i.e. sleep until task has finished) >>> 4) usb2_transfer_unsetup() (which does "usb2_transfer_drain() and sleeps) >>> 5) NG_NODE_UNREF() >>> 6) free everything up >>> >>> in theory, items (1) and (5) above just an additional protection >>> because i was not sure about inner workings of usb2. i think, when we >>> sort everything out they can be removed. >> I have a question. What is the following doing at the middle of the >> ubt_detach(): >> >> if (node != NULL) >> NG_NODE_UNREF(node); >> >> If you say that: >> >> ng_rmnode_self(node); >> >> Cleans up the last reference? > > the complete code is > > node_p node = sc->sc_node; > > if (node != NULL) { > sc->sc_node = NULL; <-- clear sc_node > > NG_NODE_SET_PRIVATE(node, NULL); > NG_NODE_REALLY_DIE(node); > > NG_NODE_REF(node); <--- grab +1 reference > ng_rmnode_self(node); <--- mark node as "dead", but not ensure its > not free()d > } > > /* bla, bla */ > > if (node != NULL) > NG_NODE_UNREF(node); <--- drop 1 reference and possibly free() node > > so, say we enter detach() with only one reference (from attach()). > when we see the sc_node pointer, we assume that there could be > multiple (> 1) references on it. so just for added protection we grab > one more (now reference count is 2). then we call ng_rmnode_self() > which will mark node as "dead" and drop one reference (now reference > count is 1). then we do "bla bla" stuff which could access node > pointer. the important thing is that node pointer still points to a > valid "dead" node, so NG_NODE_NOT_VALID() can be used to check if node > is "dead" or "alive". finally when "bla bla" stuff is done, we know we > are not going to access node pointer any mode and we drop our > reference (now reference count is 0 and node is destroyed). > > >>> i do not understand. if usb2_transfer_unsetup() calls >>> usb2_transfer_drain() and sleeps, then what difference does it make in >>> detach()? >> You are right that USB-wise it does not matter if you call >> usb2_transfer_drain() separately or if you call usb2_transfer_unsetup(). I'm >> just thinking with regard to the state inside bluetooth, because there are >> two contexts running in parallell. I.E. Callbacks might be executed in >> paralell to bluetooth tearing down. > > do you mean transfer stop on hook disconnection? but those are > protected by interface locks, no? and like i said, i'm changing > ubt_task to call drain() instead of stop() to make stop completely > synchronous (which is actually a nice thing). > >>> i just followed original code that did it this way, i.e. >>> drain and unsetup all transfers in one call. >> Yes, that's fine. > > ok, fine. one more down :) > >>>> In USB2 a "usb2_start_hardware()" call is always paired with a USB >>>> transfer callback, no matter what you do! This way of design has been >>>> chosen so that you can do refcounts inside the callback! >>> ok, and that is exactly what code does. the only thing is that >>> refcounts are on netgraph node and not on softc structure. again the >>> reason for that is because netgraph node can be accessed from both >>> usb2 driver and netgraph subsystem directly. >> Isn't it enough to increment the refcount at attach and decrement it at >> detach ???? > > it would be if we could sleep in netgraph, and we can not. in theory, > we only need one extra reference which would tell us if there anything > in usb2 that still could access node pointer. in practice, instead of > checking every transfer and making sure its no pending before dropping > that extra reference i just add multiple references for each usb2 > transfer and ubt_task (i.e. things that could access node pointer). > > [...] > >>> well, when i looked at the code, i saw that both functions do >>> USB_BUS_LOCK() and USB_BUS_UNLOCK(). >> Well, this is the lock of the USB IRQ handler. One per USB controller. There >> are no more locks than this that will be locked. The IRQ handler is called >> from a high priority context and will not block very long. Same with USB >> callbacks. > > right, and what if some other usb device attached to the same usb > controller is doing something synchronously? do you see that you could > potentially block netgraph for arbitrary time and that is really out > of ng_ubt2 driver control? > >>> also, now that you mentioned it, i should call usb2_transfer_drain() >>> in ubt_task() instead of usb2_transfer_stop() to make transfer stop >>> completely synchronous as well. >> When I think about it, usb2_transfer_stop() is enough. See my latest Patch in >> P4. > > please read my comment above. > >>> please let me reiterate, sleeping in netgraph is NOT allowed. even if >>> its for a few milliseconds. please accept it as a fact. think of all >>> netgraph methods as fast interrupt handlers. it is very typical for >>> fast interrupt handlers to do minimal amount of work and schedule swi >>> to do more heavy processing. that is exactly what the code does here. >> Fixed! > > i beg to differ :) > > thanks, > max > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"