From: Yonghyeon PYUN
Date: Tue, 8 Jul 2014 11:14:39 +0900
To: Hans Petter Selasky
Cc: freebsd-net@freebsd.org, freebsd-current@FreeBSD.org
Subject: Re: [RFC] Allow m_dup() to use JUMBO clusters
Message-ID: <20140708021439.GA3965@michelle.fasterthan.com>
In-Reply-To: <53BA5657.8010309@selasky.org>
Reply-To: pyunyh@gmail.com

On Mon, Jul 07, 2014 at 10:12:07AM +0200, Hans Petter Selasky wrote:
> Hi,
>
> I'm asking for some input on the attached m_dup() patch, so that
> existing functionality or dependencies are not broken. The background
> for the change is to allow m_dup() to defrag long mbuf chains that
> doesn't fit into a specific hardware's scatter gather entries, typically
> when doing TSO.
>
> In my case the HW limit is 16 entries of length 4K for doing a 64KByte

I wonder how the HW can handle a full-sized TSO packet
(64KB + Ethernet header + VLAN tag).

> TSO packet. Currently m_dup() is at best producing 32 entries of each 2K
> for a 64Kbytes TSO packet.
>
> By allowing m_dup() to get JUMBO clusters when allocating mbufs, we
> avoid creating a new function, specific to the hardware, to defrag some
> rare-occurring very long mbuf chains into a mbuf chain below 16 entries.
>

I think m_dup() was used to get a writable copy of mbuf chains.
If m_dup() starts to allocate jumbo mbufs, it will eventually fail on
long-running boxes. This will break firewall (ipfw divert, pf/ipf
dup-to) rules and several Ethernet drivers.
I don't know how many TSO requests could be queued by the HW, but if
the number is very small, the driver may be able to pre-allocate that
number of buffers (N * (64KB + Ethernet header + VLAN tag)) in the
driver.
The upper stack will almost always generate more than 16 mbufs for TSO
packets. When the driver sees that the mbuf chain of a TSO packet has
more than 16 entries, it can copy the chain into the pre-allocated
buffer.

I recall I didn't implement TSO on txp(4) because the firmware of the
txp(4) controller does not support more than 16 fragment descriptors.
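
To make the arithmetic and the pre-allocated buffer idea above a bit
more concrete, here is a rough userspace sketch in C. It is not driver
code and not the patch under discussion: struct seg, segs_needed(),
chain_count() and chain_linearize() are hypothetical names, and struct
seg is only a simplified stand-in for an mbuf. The 16-entry and 4K
limits and the 64KB size are the numbers quoted in the thread; 14 and 4
are the usual Ethernet header and VLAN tag lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Limits quoted in the thread; treat them as assumptions about one device. */
#define HW_MAX_SEGS     16              /* scatter/gather entries per packet */
#define HW_SEG_SIZE     4096            /* bytes per scatter/gather entry */
#define TSO_MAX_PAYLOAD (64 * 1024)     /* "64KB" TSO packet */
#define ETH_HDR_LEN     14              /* Ethernet header */
#define VLAN_TAG_LEN    4               /* 802.1Q tag */

/* Simplified stand-in for an mbuf: one data segment in a chain. */
struct seg {
	struct seg *next;
	const void *data;
	size_t len;
};

/* Number of fixed-size buffers needed for len bytes (ceiling division). */
static size_t
segs_needed(size_t len, size_t seg_size)
{
	assert(seg_size > 0);
	return ((len + seg_size - 1) / seg_size);
}

/* Count the segments in a chain. */
static size_t
chain_count(const struct seg *s)
{
	size_t n = 0;

	for (; s != NULL; s = s->next)
		n++;
	return (n);
}

/*
 * Copy a chain into one contiguous, pre-allocated buffer.
 * Returns the number of bytes copied, or 0 if the chain does not fit.
 */
static size_t
chain_linearize(const struct seg *s, void *buf, size_t buflen)
{
	unsigned char *dst = buf;
	size_t off = 0;

	for (; s != NULL; s = s->next) {
		if (s->len > buflen - off)
			return (0);
		memcpy(dst + off, s->data, s->len);
		off += s->len;
	}
	return (off);
}
```

With these numbers, segs_needed() gives 32 entries for 64KB in 2K
clusters and 16 entries in 4K buffers, but 17 entries once the Ethernet
header and VLAN tag are added, which is the concern raised above. A
driver following the suggestion would check chain_count() against
HW_MAX_SEGS and fall back to something like chain_linearize() into a
buffer sized TSO_MAX_PAYLOAD + ETH_HDR_LEN + VLAN_TAG_LEN. In a real
driver the copy would presumably be done with m_copydata(9) into
DMA-able memory rather than memcpy().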