From: Attilio Rao
Reply-To: attilio@FreeBSD.org
To: Jeff Roberson
Cc: arch@freebsd.org, Alfred Perlstein, Kris Kennaway
Date: Fri, 03 Aug 2007 09:30:27 +0200
Subject: Re: Fine grain select locking.
Message-ID: <46B2D993.6070409@FreeBSD.org>
In-Reply-To: <20070802190033.J561@10.0.0.1>
List-Id: Discussion related to FreeBSD architecture

Jeff Roberson wrote:
>
> On Thu, 2 Aug 2007, Alfred Perlstein wrote:
>
>> * Jeff Roberson [070802 17:52] wrote:
>>>
>>> I believe filedescriptor locking is the place where we are most lacking.
>>> The new sx helped tremendously. However, this is still going to be a
>>> scalability limiter. I have looked into both linux and solaris's solution
>>> to this problem. Briefly, linux uses RCU to protect the list, which is
>>> close to ideal as this is certainly a read heavy workload. Solaris on the
>>> other hand uses the actual file lock to protect the descriptor slot. So
>>> they fetch the file pointer, lock it, and then check to see if they lost a
>>> race with the slot being reassigned while they were acquiring the lock.
>>> This approach is perhaps better than rcu in many cases except when the
>>> descriptor set is expanded. Then they have to lock every file in the
>>> set.
>>
>> Certainly this is an extreme edge case... ?
>
> Well that may be true, yes. However, there are other problems with this
> scheme. For example, flags settings could be done entirely with cmpset,
> without using a lock at all. In most cases we're just setting a bit
> which can be done with atomic_set. When we're doing multiple operations
> we could compute the value and attempt to set it in a loop. So we can
> totally eliminate locking the descriptor here.
>
> We also could use atomic ops to protect the file descriptor reference
> count. This would eliminate another use of the FILE_LOCK(). I'm not
> sure if it's possible to merge this with an approach that uses the
> FILE_LOCK() to protect the descriptor table. Although I've not thought
> it all the way through.
>
> If the ref count and flags were done with atomics the main consumer of
> FILE_LOCK would actually be the unix domain socket garbage collection
> code. How's that for old unix baggage. Do many programs actually pass
> around descriptors these days? inetd? others? It might be worth it to
> lock this separately from the file lock.

I'm sure I've already implemented it, but later I realized that there is a
race with the p_fd field (if I got you right, you are referring to the
fdesc_mtx here), so we should probably sort out those paths first.

Thanks,
Attilio
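
For illustration only, the lock-free flag and reference-count updates
discussed in the quoted text could look roughly like the userspace sketch
below. It uses C11 <stdatomic.h> in place of the kernel's atomic(9)
primitives (atomic_set_int, atomic_cmpset_int), and struct file_sketch and
all function names are hypothetical stand-ins, not the real struct file or
any code from the patch under discussion. It does not address the p_fd race
mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct file; not the real kernel layout. */
struct file_sketch {
	_Atomic uint32_t f_flag;	/* descriptor flags */
	_Atomic uint32_t f_count;	/* reference count */
};

/* Common case: set a single bit with one atomic OR, no lock taken. */
static inline void
file_flag_set(struct file_sketch *fp, uint32_t bit)
{
	atomic_fetch_or(&fp->f_flag, bit);
}

/*
 * Multiple operations: compute the new value from a snapshot and try to
 * install it with compare-and-swap, retrying if another thread raced us.
 * On failure the CAS refreshes 'old' with the current value.
 */
static inline uint32_t
file_flag_update(struct file_sketch *fp, uint32_t clear, uint32_t set)
{
	uint32_t old = atomic_load(&fp->f_flag);
	uint32_t nval;

	do {
		nval = (old & ~clear) | set;
	} while (!atomic_compare_exchange_weak(&fp->f_flag, &old, nval));
	return (nval);
}

/* Take an extra reference; the caller must already hold one. */
static inline void
file_hold(struct file_sketch *fp)
{
	uint32_t old = atomic_fetch_add(&fp->f_count, 1);

	assert(old > 0);
}

/* Drop a reference; returns true when the last one went away. */
static inline bool
file_drop(struct file_sketch *fp)
{
	uint32_t old = atomic_fetch_sub(&fp->f_count, 1);

	assert(old > 0);
	return (old == 1);
}
```

With this shape, neither path takes a per-file lock: a caller that holds a
reference can call file_flag_set() or file_flag_update() directly, and
whoever sees file_drop() return true is responsible for the teardown.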