Date: Sat, 27 Aug 2005 18:44:51 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: "M. Warner Losh" <imp@bsdimp.com>
Cc: bzeeb-lists@lists.zabbadoz.net, freebsd-current@FreeBSD.org, dandee@volny.cz
Subject: Re: LOR route vr0
Message-ID: <20050827184153.A24510@fledge.watson.org>
In-Reply-To: <20050827.114013.35047360.imp@bsdimp.com>
References: <Pine.BSF.4.53.0508270912550.969@e0-0.zab2.int.zabbadoz.net>
    <20050827.104631.10908351.imp@bsdimp.com>
    <20050827181827.O24510@fledge.watson.org>
    <20050827.114013.35047360.imp@bsdimp.com>
On Sat, 27 Aug 2005, M. Warner Losh wrote:

> : Generally speaking, network interface device driver locks follow network
> : stack locks in the lock order.  However, I've not really looked much at
> : the route table locking so can't speak to whether that is the case
> : specifically for routing locks.  If it is, the below traces reflect the
> : correct order, and you might want to add a hard-coded entry to witness in
> : order to catch the reverse order.
>
> Can you post a quickie summary on how to do that?  I tried last night and
> was unsuccessful...

You need to add an entry to subr_witness.c creating a graph edge between
the softc lock and the routing lock.  An example of an entry in
subr_witness.c:

        /*
         * TCP/IP
         */
        { "tcp", &lock_class_mtx_sleep },
        { "tcpinp", &lock_class_mtx_sleep },
        { "so_snd", &lock_class_mtx_sleep },
        { NULL, NULL },

Note that each set of ordered entries is terminated with a double-null,
i.e. a { NULL, NULL } entry.  This declares that locks of type "tcp"
precede "tcpinp", which precede "so_snd".

> : Lock order reversals between the
> : network stack and device drivers tend to occur as a result of the device
> : driver calling into the network stack while holding the device driver
> : mutex.
>
> I'm as sure as I can be that no locks are held when I call INTO the
> network layer.  As far as I can tell, I only do that when I call
> ifp->if_input, and I drop the locks to do that.

If I had to guess, you do a media status update, which can cause routing
socket events indicating the link went up or down.

> : Someone (tm) should work out if the right order is route locks ->
> : device driver locks, as it's likely a common class of bugs across many
> : drivers.
>
> I just discovered the problem in my code.  I'm not sure where the
> other order happens, but in my code I do the following:
>
>       ED_LOCK(sc);
>       ed_setrcr(sc);
>       ed_ds_getmcst(sc);
>       IF_ADDR_LOCK(sc->ifp);
>       TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
> ...
>       IF_ADDR_UNLOCK(sc->ifp);
>       ED_UNLOCK(sc);
>
> Since the lock for ED should be a leaf lock, this causes problems.  I'm
> guessing that the network layer calls into the driver with this lock
> held.  Without hard coding the locking into witness (see above), I'm
> unsure where this happens.  A quick grep of the code doesn't reveal
> anything obvious...

I think this case should be OK, and we should document that as being the
case using a hard-coded witness entry.

> When I comment out the above IF_ADDR locks, I have no more LORs, but I
> think maybe other problems :-).

Hmmm.  I was thinking that it was a separate issue.  Could you try adding
a graph edge to witness forcing the ifaddrmtx's to fall before the driver
mutexes, in order to identify a path by which ifaddrmtx precedes the
driver mutex?

Robert N M Watson
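
As a rough illustration of how an ordered table with double-null
terminators reads, here is a small user-space model.  It is a sketch, not
the kernel code: struct order_entry, order_list, declared_before(), and
the "mtx_sleep" class strings are stand-ins invented for this example, and
the names "if_addr_mtx" and "network driver" are assumed placeholders for
the address-list and driver lock types.  A real entry has to use whatever
name or type string the locks are actually initialized with, and the
assumption that the whole table ends with one extra { NULL, NULL } should
be checked against subr_witness.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an order-list entry; not the kernel's struct. */
struct order_entry {
	const char *name;       /* lock type name; NULL terminates a set */
	const char *lock_class; /* stand-in for &lock_class_mtx_sleep */
};

static const struct order_entry order_list[] = {
	/* TCP/IP set, as in the example quoted in this message. */
	{ "tcp", "mtx_sleep" },
	{ "tcpinp", "mtx_sleep" },
	{ "so_snd", "mtx_sleep" },
	{ NULL, NULL },
	/* Hypothetical set: address-list lock before driver lock. */
	{ "if_addr_mtx", "mtx_sleep" },
	{ "network driver", "mtx_sleep" },
	{ NULL, NULL },
	/* Assumed end-of-table marker. */
	{ NULL, NULL }
};

/*
 * Return 1 if 'first' is listed before 'second' within the same set,
 * i.e. the table declares that 'first' must be acquired first.
 * Return 0 otherwise (including when they sit in different sets).
 */
static int
declared_before(const struct order_entry *list, const char *first,
    const char *second)
{
	const struct order_entry *e = list;

	assert(list != NULL && first != NULL && second != NULL);
	while (e->name != NULL) {		/* one pass per set */
		int seen_first = 0;

		for (; e->name != NULL; e++) {	/* walk a single set */
			if (seen_first && strcmp(e->name, second) == 0)
				return (1);
			if (strcmp(e->name, first) == 0)
				seen_first = 1;
		}
		e++;				/* skip the set terminator */
	}
	return (0);
}
```

With this table, declared_before(order_list, "tcp", "so_snd") returns 1
and the reversed query returns 0.  Entries in different sets have no
declared order between them, which is why a new set is needed to make
witness catch the ifaddrmtx versus driver mutex case.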