From: Navdeep Parhar
Date: Wed, 21 Aug 2013 12:59:34 -0700
To: Scott Long
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Andre Oppermann
Subject: Re: svn commit: r254520 - in head/sys: kern sys
Message-ID: <52151C26.10004@FreeBSD.org>
References: <201308191116.r7JBGsc6065793@svn.freebsd.org>
 <521256CE.6070706@FreeBSD.org> <5212870A.50105@freebsd.org>
 <521291F1.8060500@FreeBSD.org> <5214D5E0.9040002@freebsd.org>

On 08/21/13 12:41, Scott Long wrote:
>
> On Aug 21, 2013, at 8:59 AM, Andre Oppermann wrote:
>
>> On 19.08.2013 23:45, Navdeep Parhar wrote:
>>> On 08/19/13 13:58, Andre Oppermann wrote:
>>>> On 19.08.2013 19:33, Navdeep Parhar wrote:
>>>>> On 08/19/13 04:16, Andre Oppermann wrote:
>>>>>> Author: andre
>>>>>> Date: Mon Aug 19 11:16:53 2013
>>>>>> New Revision: 254520
>>>>>> URL: http://svnweb.freebsd.org/changeset/base/254520
>>>>>>
>>>>>> Log:
>>>>>>   Remove the unused M_NOFREE mbuf flag.  It didn't have any in-tree users
>>>>>>   for a very long time, if ever.
>>>>>>
>>>>>>   Should such a functionality ever be needed again the appropriate and
>>>>>>   much better way to do it is through a custom EXT_SOMETHING external mbuf
>>>>>>   type together with a dedicated *ext_free function.
>>>>>>
>>>>>> Discussed with: trociny, glebius
>>>>>>
>>>>>> Modified:
>>>>>>   head/sys/kern/kern_mbuf.c
>>>>>>   head/sys/kern/uipc_mbuf.c
>>>>>>   head/sys/sys/mbuf.h
>>>>>>
>>>>>
>>>>> Hello Andre,
>>>>>
>>>>> Is this just garbage collection or is there some other reason for this?
>>>>
>>>> This is garbage collection and removal of not quite right, rotten,
>>>> functionality.
>>>>
>>>>> I recently tried some experiments to reduce the number of mbuf and
>>>>> cluster allocations in a 40G NIC driver.  M_NOFREE and EXT_EXTREF proved
>>>>> very useful and the code changes to the kernel were minimal.  See
>>>>> user/np/cxl_tuning.  The experiment was quite successful and I was
>>>>> planning to bring in most of those changes to HEAD.  I was hoping to get
>>>>> some runtime mileage on the approach in general before tweaking the
>>>>> ctors/dtors for jumbop, jumbo9, jumbo16 to allow for an mbuf+refcnt
>>>>> within the cluster.  But now M_NOFREE has vanished without a warning...
>>>>
>>>> I'm looking through your experimental code and those are some really good
>>>> numbers you're achieving there!
>>>>
>>>> However a couple of things don't feel quite right, hackish even, and not
>>>> fit for HEAD.  This is a bit the same situation we had with some of the
>>>> first 1GigE cards quite a number of years back (mostly ti(4)).  There
>>>> we ended up with a couple of just good enough hacks to make it fast.
>>>> Most of the remains I've collected today.
>>>
>>> If M_NOFREE and EXT_EXTREF are properly supported in the tree (and I'm
>>> arguing that they were, before r254520) then the changes are perfectly
>>> legitimate.  The only hackish part was that I was getting the cluster
>>> from the jumbop zone while bypassing its normal refcnt mechanism.  This
>>> I did so as to use the same zone as m_uiotombuf to keep it "hot" for all
>>> consumers (driver + network stack).
>>
>> If you insist I'll revert the commit removing M_NOFREE.  EXT_EXTREF isn't
>> touched yet, but should get better support.
>>
>> The hackish part for me is that the driver again manages its own memory
>> pool.  Windows works that way, NetBSD is moving towards it, while FreeBSD
>> has, and remains with, a central network memory pool.  The latter (our
>> current) way of doing it seems more efficient overall, especially on
>> heavily loaded networked machines.  There may be significant queues
>> building up (think of an app that is blocked while many of its socket
>> buffers fill up), delaying the freeing and returning of network memory
>> resources.  Together with fragmentation this can lead to very bad
>> outcomes.  Router applications with many interfaces also greatly benefit
>> from central memory pools.
>>
>> So I'm really not sure that we should move back in the driver owned pool
>> direction with lots of code duplication and copy-pasting (see NetBSD).
>> Also it is kinda weird to have a kernel based pool for data going down
>> the stack and another one in each driver for those going up.
>>
>> Actually I'm of the opinion that we should stay with the central memory
>> pool and fix it so that it works just as well for those cases where a
>> driver pool currently performs better.
>
> The central memory pool approach is too slow, unfortunately.  There's a
> reason that other OS's are moving to them.  At Netflix we are
> currently working on some approaches to private memory pools in order to
> achieve better efficiency, and we're closely watching and anticipating
> Navdeep's work.

I should point out that I went to great lengths to use the jumbop zone
in my experiments, and not create my own pool of memory for the rx
buffers.  The hope was to share cache warmth (sounds very cosy :-) with
the likes of m_uiotombuf (which uses jumbop too) etc.

So I'm actually in the camp that prefers central pools.  I'm just trying
out ways to reduce the trips we have to make to the pool(s) involved.
Laying down mbufs within clusters, and packing multiple frames per
cluster clearly helps.  Careful cluster recycling within the NIC seems
to work too.
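
For illustration only, here is a minimal userspace sketch of the general
idea of packing several frames into one cluster with a reference count
kept inside the cluster itself.  It is not the code in user/np/cxl_tuning
and it does not use the kernel mbuf API: cl_hdr, frame, cl_alloc, cl_pack,
cl_release and frame_free are all hypothetical names, malloc stands in for
the jumbop zone, memcpy stands in for the NIC's DMA, and the refcount is
not atomic as it would have to be in a real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CL_SIZE 4096	/* assumed page-sized cluster */

/* Hypothetical header laid down at the start of each cluster. */
struct cl_hdr {
	unsigned refcnt;	/* 1 for the driver + 1 per frame handed up */
	size_t	 used;		/* bytes consumed so far, header included */
};

/* Hypothetical stand-in for an mbuf whose data lives in a shared cluster. */
struct frame {
	struct cl_hdr	*cl;
	uint8_t		*data;
	size_t		 len;
};

static int cl_frees;		/* counts trips back to the allocator */

static struct cl_hdr *
cl_alloc(void)
{
	struct cl_hdr *cl = malloc(CL_SIZE);

	if (cl == NULL)
		return (NULL);
	cl->refcnt = 1;			/* the driver's own reference */
	cl->used = sizeof(*cl);
	return (cl);
}

/* Drop one reference; the cluster goes back only when the last one is gone. */
static void
cl_release(struct cl_hdr *cl)
{
	assert(cl->refcnt > 0);
	if (--cl->refcnt == 0) {
		cl_frees++;
		free(cl);
	}
}

/* Place one more frame in the cluster.  Returns 0, or -1 if it is full. */
static int
cl_pack(struct cl_hdr *cl, const void *pkt, size_t len, struct frame *f)
{
	/* Round the offset up to a 64-byte boundary. */
	size_t off = (cl->used + 63) & ~(size_t)63;

	if (len > CL_SIZE || off > CL_SIZE - len)
		return (-1);
	f->cl = cl;
	f->data = (uint8_t *)cl + off;
	f->len = len;
	memcpy(f->data, pkt, len);	/* stand-in for the NIC's DMA */
	cl->used = off + len;
	cl->refcnt++;			/* the frame pins the cluster */
	return (0);
}

/* Called when the stack is done with a frame. */
static void
frame_free(struct frame *f)
{
	cl_release(f->cl);
	f->cl = NULL;
}
```

With these assumptions two 1500-byte frames share one 4K cluster, so
receiving both costs one allocation and one free instead of two of each.
The cluster is returned only after the driver and every frame have dropped
their references.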

Regards,
Navdeep