From: "Robert N. M. Watson"
Date: Fri, 25 Mar 2011 21:38:10 +0000
To: John Baldwin
Cc: freebsd-net@freebsd.org, Jim
Subject: Re: why use INP_WLOCK instead of INP_RLOCK
Message-Id: <997B5A3A-8AC7-42F5-BE43-64B6CB4E2A25@freebsd.org>
In-Reply-To: <201103251701.34576.jhb@freebsd.org>

On 25 Mar 2011, at 21:01, John Baldwin wrote:

> On Tuesday, February 01, 2011 12:54:33 am Jim wrote:
>> I am not sure if anybody has asked it before. I could not find answer by
>> doing rough search on Internet, if it is duplicate question, sorry in
>> advance.
>>
>> My question is that, for getting socket options in tcp_ctloutput() in
>> tcp_usrreq.c, why do we need to do lock with INP_WLOCK(inp) as setting
>> socket options does. Why do we just use INP_RLOCK(inp), as it looks not
>> changing anything in tcp control block?
>
> I think mostly it is just because no one has bothered to change it.
> Realistically it probably won't make any noticeable difference unless your
> workload consists of doing lots of calls to getsockopt() but not sending any
> actual traffic on the associated sockets. :) (Almost all of the other
> operations on a TCP connection require a write lock on the pcb.)

Just to reiterate John's point here: the critical performance paths for
TCP (input and output) both require the inpcb lock to be held
exclusively, and socket options are typically called from the same user
thread doing I/O, meaning that acquiring read locks instead of write
locks is unlikely to make any measurable difference. However, in
principle I believe most if not all getsockopt() calls in TCP should be
fine with just a read lock, and for socket options used with UDP, there
might well be some benefit to using a read lock, since most UDP
operations use read locks and not write locks on the inpcb.

I should further note that Jeff Roberson has some exciting in-progress
work to reduce transmit-input contention on the inpcb that appears to
make quite a noticeable difference in improving TCP performance. We
don't have much global lock contention currently when in the steady
state, but the per-connection locks do get heavily contended. His work
is similar to some work done in the Linux stack a year or two ago to
defer input processing to the user thread rather than contending on the
inpcb lock, if it's already held. Hopefully we'll see the results of
that work in 9.0, and possibly backported to 8.x.

I also have a large pending patchset adding connection group support,
and aligning software lookup tables with hardware work distribution via
RSS, which is due to go in before 9.0.

Robert