From owner-freebsd-net@FreeBSD.ORG Sat Dec 18 22:48:54 2010
From: Pyun YongHyeon <pyunyh@gmail.com>
Date: Sat, 18 Dec 2010 14:48:09 -0800
To: abcde abcde
Cc: freebsd-net@freebsd.org
Subject: Re: nfe_defrag() routine in nvidia ethernet driver
Message-ID: <20101218224809.GA22768@michelle.cdnetworks.com>
In-Reply-To: <808782.86181.qm@web53807.mail.re2.yahoo.com>
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD

On Thu, Dec 16, 2010 at 07:53:16PM -0800, abcde abcde wrote:
> Hi, we ported the nvidia ethernet driver to our product.  It has been OK
> until recently, when we ran into an error condition where packets were
> dropped quietly.  The root cause is in the nfe_encap() routine, where
> nfe_defrag() is called to reduce the length of the mbuf chain to 32 if it is
> longer than that.  In the event those 32 mbufs still need more than 32
> segments, the subsequent call to bus_dmamap_load_mbuf_sg() returns an error
> and the packet is dropped.
>
> My questions are:
>
> 1. There appears to be a generic m_defrag() routine available, which does
> not stop at 32 and is used by a couple of other drivers (Intel and Broadcom,
> to name a few).  What was the need for an nvidia-specific version of the
> defrag routine?
>

As John said, m_defrag(9) is an expensive operation.  Since all nfe(4)
controllers support multiple TX buffers, use m_collapse(9) instead.
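For illustration, here is a minimal sketch of the load-then-collapse retry
pattern this suggests (assuming nfe(4)-style naming; the tag/map handling and
error paths are simplified, so this is not the driver's actual code).  The
chain is only coalesced when bus_dmamap_load_mbuf_sg(9) reports EFBIG:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>
#include <sys/bus.h>
#include <machine/bus.h>

#define NFE_MAX_SCATTER	32	/* TX segment limit cited in this thread */

static int
encap_sketch(bus_dma_tag_t tag, bus_dmamap_t map, struct mbuf **m_head)
{
	bus_dma_segment_t segs[NFE_MAX_SCATTER];
	struct mbuf *m;
	int error, nsegs;

	error = bus_dmamap_load_mbuf_sg(tag, map, *m_head, segs, &nsegs,
	    BUS_DMA_NOWAIT);
	if (error == EFBIG) {
		/* Chain needs too many segments; coalesce it and retry once. */
		m = m_collapse(*m_head, M_NOWAIT, NFE_MAX_SCATTER);
		if (m == NULL) {
			m_freem(*m_head);
			*m_head = NULL;
			return (ENOBUFS);
		}
		*m_head = m;
		error = bus_dmamap_load_mbuf_sg(tag, map, *m_head, segs,
		    &nsegs, BUS_DMA_NOWAIT);
	}
	if (error != 0) {
		m_freem(*m_head);
		*m_head = NULL;
		return (error);
	}
	/* ... fill TX descriptors from segs[0 .. nsegs - 1] ... */
	return (0);
}

m_collapse(9) only rewrites as much of the chain as needed to get under the
segment limit, which is why it is cheaper than m_defrag(9)'s full copy into a
fresh chain.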
> 2. The NFE_MAX_SCATTER constant, which limits how many segments can be used,
> is defined to be 32, while the corresponding constants for other drivers are
> 100 or 64 (again, Intel or Broadcom).  How was the value 32 picked?  Does
> anybody know the reasoning behind it?
>

I think nfe(4) controllers have no limit on the number of segments that can be
used.  However, most Ethernet controllers targeted at non-server systems are
not good at supporting multiple outstanding DMA read operations on the PCIe
bus.  Even when a controller does support multiple DMA read operations, it
takes longer to fetch a TX frame that is split into a long mbuf chain than a
short or single contiguous one; the CPU is much faster than the controller's
DMA engine.  The magic number 32 was chosen to balance performance against
resource usage: 32 should be large enough for TSO to send a full 64KB TCP
segment.  If the controller had no TSO capability, I would have used 16.
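For what it's worth, a quick back-of-the-envelope check of that sizing (an
illustration only, not driver code): a maximal TSO frame of 65535 bytes held
in standard 2KB mbuf clusters spans at most ceil(65535 / 2048) = 32 clusters,
so a 32-segment limit just covers it.

#include <stdio.h>

#define MCLBYTES	2048	/* mirrors the standard FreeBSD mbuf cluster size */
#define NFE_MAX_SCATTER	32	/* value cited above */

int
main(void)
{
	int tso_max = 65535;	/* largest IP packet a TSO burst can carry */
	int clusters = (tso_max + MCLBYTES - 1) / MCLBYTES;

	printf("64KB TSO frame: at most %d clusters, limit %d\n",
	    clusters, NFE_MAX_SCATTER);
	return (0);
}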