Date: Thu, 6 Dec 2012 09:35:16 +0000 (GMT)
From: Robert Watson
To: John Baldwin
Cc: Barney Cordoba, freebsd-net@freebsd.org
Subject: Re: Latency issues with buf_ring

On Tue, 4 Dec 2012, John Baldwin wrote:

>> Q2: Are there any case studies or benchmarks for buf_ring, or it is just
>> blindly being used because someone claimed it was better and offered it for
>> free? One of the points of locking is to avoid race conditions, so the
>> fact that you have races in a supposed lock-less scheme seems more than just
>> ironic.
>
> The buf_ring author claims it has benefits in high pps workloads. I am not
> aware of any benchmarks, etc.

... joining this conversation a bit late -- still about two years behind on
net@ :-) ...

There are several places where having a good buf_ring primitive should offer
significant benefits over blocking locks around queues:

- ifnet transmit enqueue path, whether owned by the general stack (ifqueue) or
  the driver (as is often the case with if_transmit).

- netisr queues used in deferred input dispatch, including loopback.

- A future lockless hand-off of inbound TCP segments from the ithread/netisr to
  an already running user thread a la Van Jacobson's proposal to the Linux
  community (now implemented), which would significantly reduce contention on
  inpcb locks in many workloads.

I've measured significant lock contention in all those places in the past, and
I believe buf_ring was intended to address at least the first case. This isn't
the same as having benchmarks showing that the current code is "better", but
the right primitive used in the right way should almost certainly help all of
those cases substantially. I know that when Philip Paeps was working with the
Solarflare driver, switching to lockless dispatch in the outbound path made a
significant difference.

One thing we do need to make sure is handled well is bounds on queue length,
since we don't want infinitely long queues when a backlog begins to form --
there's no reason this can't be done, although the specifics depend on what one
wants to accomplish and how.

I would like to see us making use of lockless queue primitives in these kinds
of scenarios, motivated by benchmarking, and ideally addressing architectures
with weaker memory consistency properly. We should definitely minimise the
number of different implementations of those primitives as much as possible,
since (as with locks themselves) they are very hard to get right, and debugging
problems with them can be quite difficult.

Robert
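
The points above about bounding queue length and coping with weaker memory
consistency can be sketched in C11. This is a minimal single-producer,
single-consumer ring under stated assumptions, not the buf_ring
implementation (which also has to handle multiple producers): the names
spsc_ring, ring_init, ring_enqueue, ring_dequeue and the capacity RING_SLOTS
are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed capacity; a power of two so indices can be masked. */
#define RING_SLOTS 8
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0,
    "RING_SLOTS must be a power of two");

struct spsc_ring {
	_Atomic size_t head;      /* next slot to fill; written only by producer */
	_Atomic size_t tail;      /* next slot to drain; written only by consumer */
	void *slot[RING_SLOTS];
};

static void
ring_init(struct spsc_ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/*
 * Producer side.  Returns false when the ring is full, so a backlog is
 * bounded: the caller must drop the item or back off.
 */
static bool
ring_enqueue(struct spsc_ring *r, void *item)
{
	size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store to tail. */
	size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SLOTS)
		return (false);
	r->slot[head & (RING_SLOTS - 1)] = item;
	/* Release publishes the slot contents before the new head is seen. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return (true);
}

/* Consumer side.  Returns NULL when the ring is empty. */
static void *
ring_dequeue(struct spsc_ring *r)
{
	size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release store to head. */
	size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	void *item;

	if (head == tail)
		return (NULL);
	item = r->slot[tail & (RING_SLOTS - 1)];
	/* Release hands the slot back only after it has been read. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return (item);
}
```

The explicit acquire/release pairs are what an architecture with weaker
memory ordering needs. On a strongly ordered machine the same code would
often appear to work with plain loads and stores, which is part of why these
primitives are hard to get right.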