From owner-freebsd-net@FreeBSD.ORG Sat Aug 9 11:30:28 2008 Return-Path: Delivered-To: net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ABB2210656CB for ; Sat, 9 Aug 2008 11:30:28 +0000 (UTC) (envelope-from robert@fledge.watson.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 58A178FC2F for ; Sat, 9 Aug 2008 11:30:28 +0000 (UTC) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 2D13046CCB for ; Sat, 9 Aug 2008 07:13:15 -0400 (EDT) X-Return-Path: X-Received: from cyrus.watson.org ([unix socket]) by cyrus.watson.org (Cyrus v2.1.18) with LMTP; Fri, 08 Aug 2008 17:22:18 -0400 X-Sieve: CMU Sieve 2.2 X-Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by cyrus.watson.org (Postfix) with ESMTP id F236446C0B for ; Fri, 8 Aug 2008 17:22:17 -0400 (EDT) X-Received: from hub.freebsd.org (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id 5D637157DA1; Fri, 8 Aug 2008 21:21:23 +0000 (UTC) (envelope-from owner-freebsd-stable@freebsd.org) X-Received: from hub.freebsd.org (localhost [127.0.0.1]) by hub.freebsd.org (Postfix) with ESMTP id E305A1065698; Fri, 8 Aug 2008 21:21:20 +0000 (UTC) (envelope-from owner-freebsd-stable@freebsd.org) X-Delivered-To: stable@FreeBSD.org X-Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D7FDC1065670 for ; Fri, 8 Aug 2008 21:21:12 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) X-Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 8ADE38FC13 for ; Fri, 8 Aug 2008 21:21:12 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) X-Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 2E71646C0B for ; Fri, 8 Aug 2008 17:21:12 -0400 (EDT) Date: Fri, 8 Aug 2008 22:21:12 +0100 (BST) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: stable@FreeBSD.org In-Reply-To: Message-ID: References: User-Agent: Alpine 1.10 (BSF 962 2008-03-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Sender: owner-freebsd-stable@freebsd.org Errors-To: owner-freebsd-stable@freebsd.org ReSent-Date: Sat, 9 Aug 2008 12:13:10 +0100 (BST) ReSent-From: robert ReSent-To: net@FreeBSD.org ReSent-Subject: Re: HEADS UP: inpcb/inpcbinfo rwlocking: coming to a 7-STABLE branch near you ReSent-Message-ID: ReSent-User-Agent: Alpine 1.10 (BSF 962 2008-03-14) Cc: Subject: Re: HEADS UP: inpcb/inpcbinfo rwlocking: coming to a 7-STABLE branch near you X-BeenThere: freebsd-net@freebsd.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Aug 2008 11:30:28 -0000 On Sun, 3 Aug 2008, Robert Watson wrote: > This is an advance warning that, late next week, I will be merging a fairly > large set of changes to the IPv4 and IPv6 protocols layered over the > inpcb/inpcbinfo kernel infrastructure. To be specific, this affects TCP, > UDP, and raw sockets on both IPv4 and IPv6. I will post a further e-mail > announcement along with patch set and schedule in a day or two once it's > prepared. Patches, which require the MFC of rwlock try-locking, which I did earlier today: http://www.watson.org/~robert/freebsd/netperf/20080808-7stable-rwlock-inpcb.diff These incude the inpcb/inpcbinfo read/write locking changes (although not yet for raw/divert sockets). Any testing, especially with heavy UDP loads, would be much appreciated -- this are fairly complex changes, and also quite a complex MFC. Robert N M Watson Computer Laboratory University of Cambridge > > The thrust of this change is to replace the mutexes protecting the inpcb and > inpcbinfo data structures with read-write locks (rwlocks). These structures > represent, respectively, particular sockets and the global socket lists for > all socket types in IPv4 and IPv6 except for SCTP. When you run netstat, > inpcbinfo is the data structure referencing all connections, and each line in > the nestat output reflects the contents of a specific inpcb. > > In the current stage of this work, the intent is to improve performance for > datagram-related protocols on SMP systems by allowing concurrent acquisition > of both global and connection locks during receive and transmit. This is > possible because, in the common case, no connection or global state is > modified during UDP/raw receive and transmit at the IP layer, so a read lock > is sufficient to prevent data in those structures from unexpectedly changing. > For receive, socket layer state is modified, but this is separately protected > by socket layer locks. On transmit, no state is modified at any layer, so in > principle we will allow fully parallel transmit from multiple threads down to > about the routing and network interface layers, whereas previously they would > bottleneck in UDP. > > The applications targeted by this change are threaded UDP server > applications, such as BIND9, nsd, and UDP-based memcached. Kris Kennaway and > Paul Saab have done fairly extensive testing with the changes and > demonstrated significant performance improvements due to reduced contention > and overhead. Perhaps they can mention some of those numbers in a follow-up > to this post. > > The reason for the heads up is that, while carefully-tested, changes of this > sort do come with risks. We've carefully structured them so as to avoid > breaking the ABIs for netstat, etc, but it's not impossible that some > problems will arise as the changes settle. The goal, however, is to see > these performance improvements in 7.1, and since they've had a bit to shake > out in 8.x and seen some heavy use, I think now is the right time to merge > them. > > In any case, I will send out e-mail in a couple of days with a proposed merge > patch and schedule for merging, and perhaps if you are in a positition where > you might benefit from these improvements, or have interesting UDP or > raw-socket based applications running on 7.x, you could test the candidate > patch before it's merged, reporting any problems. Unless I receive negative > feedback, I will plan on merging the changes late in the week, and keep a > close eye on stable@ for any reports of problems. > > Thanks, > > Robert N M Watson > Computer Laboratory > University of Cambridge > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > _______________________________________________ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"