From: Seigo Tanimura
To: arch@freebsd.org
Date: Mon, 18 Aug 2003 16:13:26 +0900
Message-Id: <200308180713.h7I7DQPE005765@urban>
Subject: Embedding a vnode type to its interlock mutex type

In short:

A vnode should embed its type name (VREG, VCHR, etc.) in the type of
the interlock mutex to avoid a false LOR alarm by Witness.

The Details:

With my p4 branch (tanimura_socket), I get the LOR (lock order
reversal) shown below:

--- v --- LOR log --- v ---
lock order reversal
 1st 0xfffff800262faa90 local rcv (local rcv) @ kern/uipc_socket.c:1250
 2nd 0xc03c9a08 vm page queue mutex (vm page queue mutex) @ kern/uipc_cow.c:88
Stack backtrace:
_mtx_lock_flags() at _mtx_lock_flags+0x9c
socow_iodone() at socow_iodone+0x1c
m_free() at m_free+0xf4
soreceive() at soreceive+0xe50
soo_read() at soo_read+0x78
dofileread() at dofileread+0x60
read() at read+0x3c
syscall() at syscall+0x280
--- ^ --- LOR log --- ^ ---

where the lock order is:

--- v --- lock order by Witness --- v ---
4 pseudofs_vncache -- last acquired @ fs/pseudofs/pseudofs_vncache.c:227
5 mntvnode -- last acquired @ nfsclient/nfs_vfsops.c:1014
13 vnode interlock -- last acquired @ nfsclient/nfs_vfsops.c:1010
14 Syncer mtx -- last acquired @ kern/vfs_subr.c:1798
14 spechash -- last acquired @ kern/vfs_subr.c:2009
14 vnode_free_list -- last acquired @ kern/vfs_subr.c:760
14 local snd -- last acquired @ kern/uipc_socket.c:2173
15 local rcv -- last acquired @ kern/uipc_socket.c:2174
16 mbuf PCPU list lock -- last acquired @ kern/subr_mbuf.c:926
17 mbuf subsystem general lists lock -- last acquired @ kern/subr_mbuf.c:676
16 Malloc Stats -- last acquired @ kern/kern_malloc.c:324
16 sleep mtxpool -- last acquired @ kern/kern_prot.c:1686
16 socket generation -- last acquired @ kern/uipc_socket.c:263
16 sellck -- last acquired @ kern/sys_generic.c:816
16 UMA pcpu -- (already displayed)
10 vm object_list -- last acquired @ vm/vm_object.c:621
11 kmem object -- last acquired @ vm/vm_meter.c:179
12 vm page queue mutex -- last acquired @ vm/vm_pageout.c:1464
13 vnode interlock -- (already displayed)
--- ^ --- lock order by Witness --- ^ ---

Witness yells out because:

- vm page queue mutex is immediately followed by vnode interlock, and

- local rcv (the receive lock of an AF_LOCAL socket) is locked after
  vnode interlock in another place.

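The inference drawn from those two observations can be modelled in user space. What follows is only a rough sketch, not the kernel's Witness code: it assumes Witness keys its order graph on the lock type name and reports a reversal when a new acquisition contradicts an order already implied by earlier ones. The names struct order_graph, type_index(), reaches() and acquire(), as well as the type strings "vnode interlock (VREG)" and "vnode interlock (VSOCK)", are made up for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_TYPES 16

/* Hypothetical user-space model of a lock order graph keyed by type name. */
struct order_graph {
	const char *names[MAX_TYPES];
	int ntypes;
	int before[MAX_TYPES][MAX_TYPES];	/* before[a][b]: a held when b taken */
};

/* Look up a lock type by name, registering it on first use. */
static int
type_index(struct order_graph *g, const char *name)
{
	int i;

	for (i = 0; i < g->ntypes; i++)
		if (strcmp(g->names[i], name) == 0)
			return (i);
	assert(g->ntypes < MAX_TYPES);
	g->names[g->ntypes] = name;
	return (g->ntypes++);
}

/* Is there a chain of recorded orders leading from 'from' to 'to'? */
static int
reaches(const struct order_graph *g, int from, int to, int *seen)
{
	int i;

	if (from == to)
		return (1);
	seen[from] = 1;
	for (i = 0; i < g->ntypes; i++)
		if (g->before[from][i] && !seen[i] && reaches(g, i, to, seen))
			return (1);
	return (0);
}

/*
 * Record "held, then acquired".  Returns 1 (a reversal) if the opposite
 * order is already implied by earlier records, 0 otherwise.
 */
static int
acquire(struct order_graph *g, const char *held, const char *acquired)
{
	int seen[MAX_TYPES] = { 0 };
	int h = type_index(g, held);
	int a = type_index(g, acquired);

	if (reaches(g, a, h, seen))
		return (1);
	g->before[h][a] = 1;
	return (0);
}

/* Every vnode interlock shares one type, as in the log above. */
int
lor_with_shared_type(void)
{
	struct order_graph g;

	memset(&g, 0, sizeof(g));
	acquire(&g, "vm page queue mutex", "vnode interlock");
	acquire(&g, "vnode interlock", "local snd");
	acquire(&g, "local snd", "local rcv");
	/* The socow_iodone() path: page queue mutex taken under local rcv. */
	return (acquire(&g, "local rcv", "vm page queue mutex"));
}

/* The vnode type is embedded in the interlock type name (assumed names). */
int
lor_with_split_types(void)
{
	struct order_graph g;

	memset(&g, 0, sizeof(g));
	acquire(&g, "vm page queue mutex", "vnode interlock (VREG)");
	acquire(&g, "vnode interlock (VSOCK)", "local snd");
	acquire(&g, "local snd", "local rcv");
	return (acquire(&g, "local rcv", "vm page queue mutex"));
}
```

In this model lor_with_shared_type() returns 1, because the single "vnode interlock" node links vm page queue mutex to local rcv, while lor_with_split_types() returns 0, because the two chains no longer share a node.
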
Witness thus seems to presume that vm page queue mutex should have
been acquired before local rcv. Although Witness treats this as a LOR,
I believe it is safe because AF_LOCAL sockets and VSOCK vnodes should
not be used during VM operations.

If we distinguished the interlocks of VSOCK vnodes from those of VREG
(or VCHR) vnodes, there would be two lock orders:

    vm page queue mutex
    VREG/VCHR vnode interlock
    Syncer mtx
    spechash
    :
    :

and

    VSOCK vnode interlock
    local snd
    local rcv
    vm page queue mutex
    :
    :

They should safely coexist.

To accomplish that, the type of a mutex must change when the type of
its vnode changes. While simply destroying and reinitializing the
interlock of a vnode might work, that is likely to be overkill because
the type of a mutex is meaningful only to Witness. It would hence be
better for Witness to provide an API to change the type of an
initialized mutex.

For sockets, I implemented a trick to change lock types quite a few
days ago. The idea is to embed the address family name in the types of
the socket locks. It solved almost all of the false LOR alerts for
routing sockets, where a routing socket may need to be altered during
a send from an inet or inet6 socket. It is quite easy for a socket to
change its lock types because a socket is always freed when it is
closed.

Comments?

-- 
Seigo Tanimura