From: "Robert N. M. Watson"
To: Karim Fodil-Lemelin
Cc: Juli Mallett, Rui Paulo, Ryan Stone, Luigi Rizzo, FreeBSD Net
Date: Wed, 6 Oct 2010 18:42:56 +0100
Subject: Re: mbuf changes
Message-Id: <3C5CDDB3-73BF-4551-8F42-1FBCDB757AB6@freebsd.org>
In-Reply-To: <4CAB4BE6.3070307@xiplink.com>
List-Id: Networking and TCP/IP with FreeBSD

On 5 Oct 2010, at 17:01, Karim Fodil-Lemelin wrote:

> I will share some of the experience I had doing embedded mtags.
> Hopefully it's relevant :)
>
> The idea of carrying a certain amount of mbuf tags within the mbuf
> structure is somewhat similar to, but much cleaner than, imo, Linux's
> skbuff char cb[40 - 48] (it was 40 bytes in 2.4.x ...).
> Now this idea is not new, although as you know the devil is in the
> details...

Hi Karim:

This sounds like very interesting work, and something we should figure
out how to generalize for FreeBSD. I had also been pondering something
along these lines in order to improve MAC label performance when using
ubiquitously labeled policies. Since MAC often stores references to
externally managed data from mbuf labels (i.e., deep structures, rather
than just bytes of data), it's especially important to make sure we get
the tear-down stuff right...

Robert

> What we did for BSD is create a container in the mbuf and extend the
> API with functions we (pompously) called m_tag_fast_alloc() and
> m_tag_fast_free(). This means the standard m_tag_alloc() is still
> supported across the system and the old behavior is unchanged (a list
> of allocated structs attached to the packet header). What's different
> is the availability of a 'fast' call that directly uses the container
> within the mbuf, effectively avoiding those mallocs and cache misses.
> I'll explain later how we effectively support calling m_tag_delete()
> on a 'fast' tag.
>
> The trick to save CPU cycles was also to quickly revert to the
> standard tag mechanism if some component in the system is manipulating
> the tag list by deleting elements. Effectively, m_tag_fast_free() is a
> NOP and fast tags are not deleted once allocated (unless m_free() is
> called on the mbuf, of course). When m_tag_delete() is called, the
> container simply becomes invalid for further 'fast tag' additions.
> This is not flexible, but it has the merit of reducing the overall
> number of operations, given that almost no components delete tags
> without deleting the mbuf (loopback does, but it's a special case).
>
> One last thing we did is perform various operational tests to come up
> with the most statistically optimized container size.
> Now this is much easier to do on a proprietary system than for a
> general-purpose OS, but it's certainly possible.
>
> Finally, we did see a speed increase for our application, and if
> someone is interested I could provide a patch, although I would have
> to rewrite it without the proprietary bits in it.
>
> Best regards,
>
> Karim.
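
For readers who want to see the shape of the scheme Karim describes, the
following is a minimal userland C sketch. It is not the patch mentioned in
the thread and it does not use the real FreeBSD mbuf or m_tag structures:
struct pkt, struct tag, struct fast_slot, the tag_* and pkt_* functions, and
the FAST_SLOTS and FAST_DATA_BYTES sizes are all hypothetical stand-ins. Two
behaviors are assumptions on my part rather than statements from the email:
the fast allocator falls back to the heap allocator automatically, and
deleting a fast tag only unlinks it from the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FAST_SLOTS      2   /* hypothetical; the email tuned its size empirically */
#define FAST_DATA_BYTES 16  /* hypothetical per-tag payload limit */

struct tag {                /* simplified stand-in for a packet tag header */
    struct tag *next;
    uint16_t    id;
    uint16_t    len;
    int         fast;       /* 1 if stored in the embedded container */
};

struct fast_slot {          /* one embedded tag plus its payload bytes */
    struct tag    hdr;
    unsigned char data[FAST_DATA_BYTES];
};

struct pkt {                /* simplified stand-in for a packet header */
    struct tag      *tags;       /* tag list, fast and heap tags mixed */
    int              fast_used;  /* embedded slots handed out so far */
    int              fast_valid; /* cleared by the first tag deletion */
    struct fast_slot fast[FAST_SLOTS];
};

void pkt_init(struct pkt *p)
{
    p->tags = NULL;
    p->fast_used = 0;
    p->fast_valid = 1;
}

/* Standard path: one heap allocation per tag, payload follows the header. */
struct tag *tag_alloc(struct pkt *p, uint16_t id, uint16_t len)
{
    struct tag *t = malloc(sizeof(*t) + len);
    if (t == NULL)
        return NULL;
    t->id = id;
    t->len = len;
    t->fast = 0;
    t->next = p->tags;
    p->tags = t;
    return t;
}

/* Fast path: use an embedded slot, or fall back to the heap (assumption). */
struct tag *tag_fast_alloc(struct pkt *p, uint16_t id, uint16_t len)
{
    assert(p->fast_used <= FAST_SLOTS);
    if (!p->fast_valid || p->fast_used == FAST_SLOTS || len > FAST_DATA_BYTES)
        return tag_alloc(p, id, len);
    struct tag *t = &p->fast[p->fast_used++].hdr;
    t->id = id;
    t->len = len;
    t->fast = 1;
    t->next = p->tags;
    p->tags = t;
    return t;
}

/* Deliberately a no-op: fast tags live until the packet itself is freed. */
void tag_fast_free(struct pkt *p, struct tag *t)
{
    (void)p;
    (void)t;
}

/* Any deletion invalidates the container for further fast additions. */
void tag_delete(struct pkt *p, struct tag *t)
{
    struct tag **pp = &p->tags;
    while (*pp != NULL && *pp != t)
        pp = &(*pp)->next;
    if (*pp == NULL)
        return;             /* tag is not on this packet */
    *pp = t->next;
    p->fast_valid = 0;
    if (!t->fast)
        free(t);            /* embedded storage is never passed to free() */
}

/* Tear-down when the packet is released: only heap tags need freeing. */
void pkt_free_tags(struct pkt *p)
{
    while (p->tags != NULL) {
        struct tag *t = p->tags;
        p->tags = t->next;
        if (!t->fast)
            free(t);
    }
    pkt_init(p);
}
```

In this sketch a caller would run pkt_init() once per packet, call
tag_fast_alloc() wherever it would otherwise call the standard allocator, and
call pkt_free_tags() when the packet is released. The per-tag fast flag is what
keeps tear-down safe when embedded and heap tags share one list, which is the
kind of detail Robert's point about MAC labels suggests would need care in a
real implementation.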