From: Robert Watson
Date: Sat, 28 Apr 2007 10:57:34 +0100 (BST)
To: Julian Elischer
Cc: Attilio Rao, freebsd-hackers@freebsd.org, Hans Petter Selasky
Subject: Re: msleep() on recursivly locked mutexes
Message-ID: <20070428105428.I28395@fledge.watson.org>
In-Reply-To: <463235D1.5030605@elischer.org>

On Fri, 27 Apr 2007, Julian Elischer wrote:

> Basically you shouldn't have a recursed mutex FULL STOP. We have a couple of
> instances in the kernel where we allow a mutex to recurse, but they had to
> be hard fought, and the general rule is "Don't". If you are recursing on a
> mutex you need to switch to some other method of doing things. e.g.
> reference counts, turnstiles, whatever.. use the mutex to create these but
> don't hold the mutex for long enough to need to recurse on it. A mutex
> should generally lock, dash-in and work, unlock. We have some cases where
> that is not true, but we are trying to get rid of them, not add more.

Most of these instances have to do with legacy code and data structures that
involve high levels of code recursion and reentrance. This is frequently an
unreliable way to organize code anyway, and often involves other bugs that are
less visible. Over time, it's my hope that we can eliminate quite a few
sources of remaining lock recursion, but there are some tricky cases involving
repeated callbacks between layers that make that harder.

For example, in the socket/network pcb relationship, there's a lack of clarity
on which side drives the overlapping state machines present in both sets of
data structures. Over time, we're migrating towards a model in which the
socket infrastructure is more of a "library" in service to network protocols
that will drive the actual transitions, but in the mean time, lock recursion
is required.

For any significantly rewritten or new code, I would expect that recursion
would be avoided in almost all cases.

Robert N M Watson
Computer Laboratory
University of Cambridge
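
The "lock, dash in and work, unlock" rule and the reference-count alternative discussed in this message can be illustrated with a small userland sketch. It is not FreeBSD kernel code and does not come from the thread: Python's threading.Lock stands in for a non-recursive mutex, and the Session class with its append, hold, release and notify methods is a hypothetical example invented for illustration. Helpers with a _locked suffix assume the caller already holds the lock, so no code path needs to acquire it twice.

```python
import threading


class Session:
    """Hypothetical object protected by a plain, non-recursive lock."""

    def __init__(self):
        self._lock = threading.Lock()  # not an RLock: re-acquiring would deadlock
        self._refs = 1                 # reference count, protected by _lock
        self._events = []              # state protected by _lock

    # Helpers with the _locked suffix assume the caller already holds _lock,
    # so they can call one another without ever re-acquiring it.
    def _append_locked(self, event):
        self._events.append(event)

    def _flush_locked(self):
        drained, self._events = self._events, []
        return drained

    # Public entry points: lock, dash in and work, unlock.
    def append(self, event):
        with self._lock:
            self._append_locked(event)

    def append_and_flush(self, event):
        with self._lock:
            self._append_locked(event)
            return self._flush_locked()

    # A reference keeps the object valid while the lock is NOT held,
    # for example across a callback that may call back into append().
    def hold(self):
        with self._lock:
            self._refs += 1

    def release(self):
        with self._lock:
            self._refs -= 1
            return self._refs == 0  # True when the last reference is dropped

    def notify(self, callback):
        self.hold()
        try:
            callback(self)          # lock is not held here, so re-entry is safe
        finally:
            self.release()


session = Session()
session.append("a")
session.notify(lambda s: s.append("b"))  # callback re-enters without recursion
drained = session.append_and_flush("c")
```

The design choice in this sketch is that the lock is only ever held for short, leaf-level work, while the reference count (itself updated under the lock) carries the longer-lived "this object is in use" state across the callback.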