From owner-freebsd-current@FreeBSD.ORG Sat Aug 27 18:04:42 2005
X-Original-To: freebsd-current@FreeBSD.org
Delivered-To: freebsd-current@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP id AB54E16A41F;
        Sat, 27 Aug 2005 18:04:42 +0000 (GMT)
        (envelope-from imp@bsdimp.com)
Received: from harmony.village.org (vc4-2-0-87.dsl.netrack.net [199.45.160.85])
        by mx1.FreeBSD.org (Postfix) with ESMTP id 41CBC43D46;
        Sat, 27 Aug 2005 18:04:42 +0000 (GMT)
        (envelope-from imp@bsdimp.com)
Received: from localhost (localhost.village.org [127.0.0.1])
        by harmony.village.org (8.13.3/8.13.3) with ESMTP id j7RI4TrR017803;
        Sat, 27 Aug 2005 12:04:29 -0600 (MDT)
        (envelope-from imp@bsdimp.com)
Date: Sat, 27 Aug 2005 12:04:48 -0600 (MDT)
Message-Id: <20050827.120448.37592601.imp@bsdimp.com>
To: rwatson@FreeBSD.org
From: "M. Warner Losh"
In-Reply-To: <20050827184153.A24510@fledge.watson.org>
References: <20050827181827.O24510@fledge.watson.org>
        <20050827.114013.35047360.imp@bsdimp.com>
        <20050827184153.A24510@fledge.watson.org>
X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-2.0
        (harmony.village.org [127.0.0.1]); Sat, 27 Aug 2005 12:04:34 -0600 (MDT)
Cc: bzeeb-lists@lists.zabbadoz.net, freebsd-current@FreeBSD.org, dandee@volny.cz
Subject: Re: LOR route vr0
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Sat, 27 Aug 2005 18:04:42 -0000

In message: <20050827184153.A24510@fledge.watson.org>
            Robert Watson writes:
: On Sat, 27 Aug 2005, M. Warner Losh wrote:
:
: > : Generally speaking, network interface device driver locks follow network
: > : stack locks in the lock order.  However, I've not really looked much at
: > : the route table locking so can't speak to whether that is the case
: > : specifically for routing locks.  If it is, the below traces reflect the
: > : correct order, and you might want to add a hard-coded entry to witness in
: > : order to catch the reverse order.
: >
: > Can you post a quickie summary on how to do that?  I tried last night and
: > was unsuccessful...
:
: You need to add an entry to subr_witness.c creating a graph edge between
: the softc lock and the routing lock.  An example of an entry in
: subr_witness.c:
:
:         /*
:          * TCP/IP
:          */
:         { "tcp", &lock_class_mtx_sleep },
:         { "tcpinp", &lock_class_mtx_sleep },
:         { "so_snd", &lock_class_mtx_sleep },
:         { NULL, NULL },
:
: Note that sets of ordered entries are terminated with a double-null.  This
: declares that locks of type "tcp" precede "tcpinp" which precede
: "so_snd".

So would I add "ed1" to the list or "network driver"?

: > : Lock order reversals between the
: > : network stack and device drivers tend to occur as a result of the device
: > : driver calling into the network stack while holding the device driver
: > : mutex.
: >
: > I'm as sure as I can be that no locks are held when I call INTO the
: > network layer.  As far as I can tell, I only do that when I call
: > ifp->if_input, and I drop the locks to do that.
:
: If I had to guess, you do a media status update, which can cause routing
: socket events indicating the link went up or down.

No link monitoring, since the ED card I'm testing has no mii bus.
That might be ANOTHER problem, but it isn't this one :-).

: > : Someone (tm) should work out if the right order is route locks ->
: > : device driver locks, as it's likely a common class of bugs across many
: > : drivers.
: >
: > I just discovered the problem in my code.  I'm not sure where the
: > other order happens, but in my code I do the following:
: >
: >         ED_LOCK(sc);
: >         ed_setrcr(sc);
: >         ed_ds_getmcst(sc);
: >         IF_ADDR_LOCK(sc->ifp);
: >         TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
: >         ...
: >         IF_ADDR_UNLOCK(sc->ifp);
: >         ED_UNLOCK(sc);
: >
: > Since the lock for ED should be a leaf lock, this causes problems.  I'm
: > guessing that the network layer calls into the driver with this lock
: > held.  Without hard coding the locking into witness (see above), I'm
: > unsure where this happens.  A quick grep of the code doesn't reveal
: > anything obvious...
:
: I think this case should be OK, and we should document that as being the
: case using a hard-coded witness entry.

Rearranging the code in this case would be at the very least awkward.
Maybe quite difficult, but likely doable.

: > When I comment out the above IF_ADDR locks, I have no more LORs, but I
: > think maybe other problems :-).
:
: Hmmm.  I was thinking that it was a separate issue.  Could you try adding
: a graph edge to witness forcing the ifaddrmtx's to fall before the driver
: mutexes, in order to identify a path by which ifaddrmtx precedes the
: driver mutex?

I'll try again.

Warner
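
For readers following the thread, here is a minimal userland sketch of the
order-list idea described above.  It is not the code from subr_witness.c:
the struct layouts, the order_precedes() and order_reversed() helpers, and
the second group's lock names ("ifaddrmtx", "network driver") are all
assumptions made for illustration, and the real witness code builds a lock
graph rather than scanning a table.  The sketch only shows how a group of
names ending in a { NULL, NULL } entry can state that one lock must be
acquired before another.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's lock class; only its address is used here. */
struct lock_class { const char *lc_name; };
static struct lock_class lock_class_mtx_sleep = { "sleep mutex" };

/* One row of the order table: a lock name and its class. */
struct order_entry {
	const char *name;
	struct lock_class *cls;
};

static const struct order_entry order_lists[] = {
	/* TCP/IP group, as quoted in the message. */
	{ "tcp", &lock_class_mtx_sleep },
	{ "tcpinp", &lock_class_mtx_sleep },
	{ "so_snd", &lock_class_mtx_sleep },
	{ NULL, NULL },
	/* Hypothetical group: address-list lock before the driver lock. */
	{ "ifaddrmtx", &lock_class_mtx_sleep },
	{ "network driver", &lock_class_mtx_sleep },
	{ NULL, NULL },
};

/*
 * Return 1 if the table says "first" must be taken before "second",
 * i.e. both names sit in the same group and "first" is listed earlier.
 */
int
order_precedes(const char *first, const char *second)
{
	size_t i, n = sizeof(order_lists) / sizeof(order_lists[0]);
	int seen_first = 0;

	assert(first != NULL && second != NULL);
	for (i = 0; i < n; i++) {
		const char *name = order_lists[i].name;

		if (name == NULL) {
			/* End of a group: ordering does not carry over. */
			seen_first = 0;
			continue;
		}
		if (strcmp(name, first) == 0)
			seen_first = 1;
		else if (seen_first && strcmp(name, second) == 0)
			return (1);
	}
	return (0);
}

/*
 * Return 1 if acquiring "acquiring" while "held" is already held goes
 * against the table, which is what a lock order reversal report means.
 */
int
order_reversed(const char *held, const char *acquiring)
{
	return (order_precedes(acquiring, held));
}
```

With this table, order_reversed("network driver", "ifaddrmtx") returns 1.
That mirrors the pattern in the quoted ED_LOCK/IF_ADDR_LOCK snippet, where
the driver lock is already held when the address-list lock is taken.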