Date: Sat, 27 Aug 2005 18:44:51 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: "M. Warner Losh" <imp@bsdimp.com>
Cc: bzeeb-lists@lists.zabbadoz.net, freebsd-current@FreeBSD.org, dandee@volny.cz
Subject: Re: LOR route vr0
Message-ID: <20050827184153.A24510@fledge.watson.org>
In-Reply-To: <20050827.114013.35047360.imp@bsdimp.com>
References: <Pine.BSF.4.53.0508270912550.969@e0-0.zab2.int.zabbadoz.net> <20050827.104631.10908351.imp@bsdimp.com> <20050827181827.O24510@fledge.watson.org> <20050827.114013.35047360.imp@bsdimp.com>
On Sat, 27 Aug 2005, M. Warner Losh wrote:
> : Generally speaking, network interface device driver locks follow network
> : stack locks in the lock order. However, I've not really looked much at
> : the route table locking so can't speak to whether that is the case
> : specifically for routing locks. If it is, the below traces reflect the
> : correct order, and you might want to add a hard-coded entry to witness in
> : order to catch the reverse order.
>
> Can you post a quickie summary on how to do that? I tried last night and
> was unsuccessful...
You need to add an entry to subr_witness.c creating a graph edge between
the softc lock and the routing lock. An example of an entry in
subr_witness.c:
	/*
	 * TCP/IP
	 */
	{ "tcp", &lock_class_mtx_sleep },
	{ "tcpinp", &lock_class_mtx_sleep },
	{ "so_snd", &lock_class_mtx_sleep },
	{ NULL, NULL },
Note that each set of ordered entries is terminated with a double-null
entry. This declares that locks of type "tcp" precede "tcpinp", which in
turn precede "so_snd".
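To make the table semantics concrete, here is a small userland sketch, not
the kernel code. It models the order table as an array of groups, each
ended by a double-null entry, and answers whether one name is declared
before another. The struct, the class strings, the "example_a" and
"example_b" names, the final empty group ending the table, and the function
declared_before() are all assumptions made for illustration; only the three
TCP/IP names come from the excerpt above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for one row of the order table. The real table
 * pairs a lock name with a lock class pointer; a plain string stands in
 * for the class here (assumption, for illustration only).
 */
struct order_entry {
	const char *name;	/* lock name; NULL ends a group */
	const char *class_name;	/* placeholder for the lock class */
};

static const struct order_entry order_lists[] = {
	/* TCP/IP, as in the excerpt above */
	{ "tcp", "sleep mutex" },
	{ "tcpinp", "sleep mutex" },
	{ "so_snd", "sleep mutex" },
	{ NULL, NULL },
	/* A made-up second group, to show that groups are independent. */
	{ "example_a", "sleep mutex" },
	{ "example_b", "sleep mutex" },
	{ NULL, NULL },
	/* An empty group ends the table in this sketch. */
	{ NULL, NULL },
};

/* Return 1 if some group lists "first" ahead of "second", else 0. */
static int
declared_before(const char *first, const char *second)
{
	const struct order_entry *e = order_lists;

	while (e->name != NULL) {		/* one pass per group */
		int seen_first = 0;

		for (; e->name != NULL; e++) {
			if (seen_first && strcmp(e->name, second) == 0)
				return (1);
			if (strcmp(e->name, first) == 0)
				seen_first = 1;
		}
		e++;				/* step past the group terminator */
	}
	return (0);
}
```

With this table, declared_before("tcp", "so_snd") is true and the reverse
is false, while names from different groups have no declared relationship.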
> : Lock order reversals between the
> : network stack and device drivers tend to occur as a result of the device
> : driver calling into the network stack while holding the device driver
> : mutex.
>
> I'm as sure as I can be that no locks are held when I call INTO the
> network layer. As far as I can tell, I only do that when I call
> ifp->if_input, and I drop the locks to do that.
If I had to guess, you do a media status update, which can cause routing
socket events indicating the link went up or down.
> : Someone (tm) should work out if the right order is route locks ->
> : device driver locks, as it's likely a common class of bugs across many
> : drivers.
>
> I just discovered the problem in my code. I'm not sure where the
> other order happens, but in my code I do the following:
>
> ED_LOCK(sc);
> ed_setrcr(sc);
> ed_ds_getmcst(sc);
> IF_ADDR_LOCK(sc->ifp);
> TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
> ...
> IF_ADDR_UNLOCK(sc->ifp);
> ED_UNLOCK(sc);
>
> since the lock for ED should be a leaf lock, this causes problems. I'm
> guessing that the network layer calls into the driver with this lock
> held. Without hard coding the locking into witness (see above), I'm
> unsure where this happens. A quick grep of the code doesn't reveal
> anything obvious...
I think this case should be OK, and we should document that as being the
case using a hard-coded witness entry.
> When I comment out the above IF_ADDR locks, I have no more LORs, but I
> think maybe other problems :-).
Hmmm. I was thinking that it was a separate issue. Could you try adding
a graph edge to witness forcing the ifaddrmtx's to fall before the driver
mutexes, in order to identify a path by which ifaddrmtx precedes the
driver mutex?
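
As a rough illustration of what a forced edge buys you, here is a toy
userland model, not witness itself. It hard-codes one "A before B" edge and
flags any acquisition of A while B is already held, so the offending path
shows up at the point it happens. The lock names "if_addr_mtx" and "network
driver" are my assumptions about what the names would be, and toy_lock()
and toy_unlock_all() are hypothetical helpers; check the actual mtx_init()
calls before copying the names anywhere.

```c
#include <string.h>

#define	MAX_HELD	8

/*
 * The single hard-coded edge: forced_before must be taken first.
 * Both names are assumptions for this sketch; swap them to model the
 * opposite order.
 */
static const char *forced_before = "if_addr_mtx";
static const char *forced_after = "network driver";

static const char *held[MAX_HELD];	/* names of locks currently held */
static int nheld;

/*
 * Record an acquisition. Return 1 if it reverses the forced edge, that
 * is, forced_before is being taken while forced_after is already held.
 */
static int
toy_lock(const char *name)
{
	int i, reversal = 0;

	if (strcmp(name, forced_before) == 0) {
		for (i = 0; i < nheld; i++)
			if (strcmp(held[i], forced_after) == 0)
				reversal = 1;
	}
	if (nheld < MAX_HELD)
		held[nheld++] = name;
	return (reversal);
}

/* Drop everything, as at the end of a code path. */
static void
toy_unlock_all(void)
{
	nheld = 0;
}
```

Taking "network driver" and then "if_addr_mtx" makes the second toy_lock()
call return 1; taking them the other way round returns 0 both times.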
Robert N M Watson
