From: Robert Watson
Date: Fri, 26 Oct 2007 22:42:13 +0100 (BST)
To: John Baldwin
Cc: freebsd-current@freebsd.org
Subject: Re: Deadlock, exclusive sx so_rcv_sx, amd64
Message-ID: <20071026223628.O99770@fledge.watson.org>
In-Reply-To: <200710261222.28656.jhb@freebsd.org>

On Fri, 26 Oct 2007, John Baldwin wrote:

> "sbwait" is waiting for data to come in on a socket and "pfault" is waiting
> on disk I/O.  It is a bit odd that 1187 is holding a lock while sleeping
> though that is permitted with an sx lock.  Still, if it's supposed to be
> protect socket's receive buffer that is odd.  Maybe get a trace of the
> process blocked in "sbwait" (tr ) and bug rwatson@ about it.

This is normal -- there are two kinds of locks on each socket buffer: a mutex protecting the integrity of the data structure, and an sx lock serializing I/O on the socket buffer.
The latter is intended to prevent I/O interlacing, and replaced the older sblock/sbunlock, which were implemented using tsleep(), flags, and the mutex as an interlock. It is normal for the sx lock to be held over sleeps -- both sbwait, indicating that the I/O has not yet completed and is waiting on the network or remote endpoint, and a page fault, indicating that a data copy to or from user space is in progress and has blocked waiting on paging. Other threads blocked on the sx lock sleep interruptibly, thanks to Attilio's addition of interruptible sx lock calls.

It's not impossible that there are deadlocks involved, but if so, they likely existed before the change to formal sx locks, as the previous "by hand" lock construction had essentially identical (but slower) properties.

There is an interesting question about whether the strong semantics in the presence of interlaced I/O requests (i.e., simultaneous requests from multiple threads on a single socket) are required; if they are not, we might be able to weaken the locking here with some reworking of the socket buffer data structures and send/receive routines. For the time being we should leave them as-is for stream sockets; we have already optimized them out for UDP sockets by virtue of a simplified sosend_dgram(), which was part of our optimization work for BIND. FYI, BIND uses a single UDP socket for all transactions, and since each transaction is atomic (being a datagram), the overhead of socket buffer locking was significant, not to mention unnecessary. This problem was originally pointed out by Jinmei Tatuya.

So, in summary: sleeping while holding the so_rcv/so_snd sx locks is normal, but deadlocks are not, so if the pointer comes back in the direction of the socket code after some more investigation, let me know.

Robert N M Watson
Computer Laboratory
University of Cambridge
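
As a rough user-space illustration of the two-lock arrangement described in this message (this is not the kernel code: the struct, field, and function names below are invented for the example, and a plain pthread mutex stands in for the sx lock), a receiver can hold the I/O serialization lock across a sleep while the data-structure mutex is released:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Hypothetical user-space stand-in for a socket buffer. */
struct sockbuf_demo {
    pthread_mutex_t sb_mtx;  /* protects data, len, waiting */
    pthread_cond_t  sb_cv;   /* signalled when data arrives (cf. "sbwait") */
    pthread_mutex_t sb_io;   /* stand-in for the sx lock: serializes whole I/Os */
    char   data[64];
    size_t len;
    int    waiting;          /* nonzero while a receiver sleeps for data */
};

struct recv_args {
    struct sockbuf_demo *sb;
    char   out[64];
    size_t got;
};

static void *receiver(void *arg)
{
    struct recv_args *ra = arg;
    struct sockbuf_demo *sb = ra->sb;

    pthread_mutex_lock(&sb->sb_io);    /* held for the entire receive */
    pthread_mutex_lock(&sb->sb_mtx);
    while (sb->len == 0) {
        sb->waiting = 1;
        /* Sleeps with sb_mtx dropped but sb_io still held. */
        pthread_cond_wait(&sb->sb_cv, &sb->sb_mtx);
    }
    sb->waiting = 0;
    memcpy(ra->out, sb->data, sb->len);
    ra->got = sb->len;
    sb->len = 0;
    pthread_mutex_unlock(&sb->sb_mtx);
    pthread_mutex_unlock(&sb->sb_io);
    return NULL;
}

/*
 * Returns 1 if the I/O lock was observed busy while the receiver slept
 * and the receiver then got the data intact; 0 otherwise.
 */
int sockbuf_demo_run(void)
{
    struct sockbuf_demo sb;
    struct recv_args ra;
    pthread_t tid;
    int rc, io_busy;

    memset(&sb, 0, sizeof(sb));
    memset(&ra, 0, sizeof(ra));
    pthread_mutex_init(&sb.sb_mtx, NULL);
    pthread_mutex_init(&sb.sb_io, NULL);
    pthread_cond_init(&sb.sb_cv, NULL);
    ra.sb = &sb;

    rc = pthread_create(&tid, NULL, receiver, &ra);
    assert(rc == 0);
    (void)rc;

    /* Wait until the receiver is asleep; exit the loop holding sb_mtx. */
    for (;;) {
        pthread_mutex_lock(&sb.sb_mtx);
        if (sb.waiting)
            break;
        pthread_mutex_unlock(&sb.sb_mtx);
        sched_yield();
    }

    /* We own sb_mtx, yet the sleeping receiver still owns sb_io. */
    rc = pthread_mutex_trylock(&sb.sb_io);
    io_busy = (rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&sb.sb_io);

    /* "Network" delivers data and wakes the receiver. */
    memcpy(sb.data, "ping", 4);
    sb.len = 4;
    pthread_cond_signal(&sb.sb_cv);
    pthread_mutex_unlock(&sb.sb_mtx);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&sb.sb_cv);
    pthread_mutex_destroy(&sb.sb_io);
    pthread_mutex_destroy(&sb.sb_mtx);

    return io_busy && ra.got == 4 && memcmp(ra.out, "ping", 4) == 0;
}
```

Built with -pthread, sockbuf_demo_run() is expected to return 1: the other thread can take the data mutex while the receiver sleeps, but an attempt on the I/O lock reports EBUSY, which is the "lock held while sleeping" state seen in the original report. Unlike the kernel's interruptible sx lock calls, a blocking wait on sb_io in this sketch would not be interruptible.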