From: Jack Vogel
To: Alexander Sack
Cc: Murat Balaban, freebsd-net@freebsd.org, freebsd-performance@freebsd.org, Andrew Gallatin
Date: Fri, 14 May 2010 10:01:24 -0700
Subject: Re: Intel 10Gb

On Fri, May 14, 2010 at 8:18 AM, Alexander Sack wrote:

> On Fri, May 14, 2010 at 10:07 AM, Andrew Gallatin wrote:
> > Alexander Sack wrote:
> > <...>
> >>> Using this driver/firmware combo, we can receive minimal packets at
> >>> line rate (14.8Mpps) to userspace. You can even access this using a
> >>> libpcap interface. The trick is that the fast paths are OS-bypass,
> >>> and don't suffer from OS overheads, like lock contention. See
> >>> http://www.myri.com/scs/SNF/doc/index.html for details.
> >>
> >> But your timestamps will be atrocious at 10G speeds. Myricom doesn't
> >> timestamp packets AFAIK. If you want reliable timestamps you need to
> >> look at companies like Endace, Napatech, etc.
> >
> > I see your old help ticket in our system. Yes, our timestamping
> > is not as good as a dedicated capture card with a GPS reference,
> > but it is good enough for most people.
>
> I was told btw that it doesn't timestamp at ALL. I am assuming NOW
> that is incorrect.
>
> Define *most* people.
>
> I am not knocking the Myricom card. In fact I so wish you guys would
> just add the ability to latch to a 1PPS for timestamping and it would
> be perfect.
>
> We use I think an older version of the card internally for replay.
> It's a great multi-purpose card.
>
> However with IPG at 10G in the nanoseconds, anyone trying to do OWDs
> or RTT will find it difficult compared to an Endace or Napatech card.
>
> Btw, I was referring to bpf(4) specifically, so please don't take my
> comments as a knock against it.
>
> >> PS I am not sure but Intel also supports writing packets directly in
> >> cache (yet I thought the 82599 driver actually does a prefetch anyway
> >> which had me confused on why that helps)
> >
> > You're talking about DCA. We support DCA as well (and I suspect some
> > other 10G NICs do too). There are a few barriers to using DCA on
> > FreeBSD, not least of which is that FreeBSD doesn't currently have the
> > infrastructure to support it (no IOATDMA or DCA drivers).
>
> Right.
>
> > DCA is also problematic because support from system/motherboard
> > vendors is very spotty. The vendor must provide the correct tag table
> > in BIOS such that the tags match the CPU/core numbering in the system.
> > Many motherboard vendors don't bother with this, and you cannot enable
> > DCA on a lot of systems, even though the underlying chipset supports
> > DCA. I've done hacks to force-enable it in the past, with mixed
> > results. The problem is that DCA depends on having the correct tag
> > table, so that packets can be prefetched into the correct CPU's cache.
> > If the tag table is incorrect, DCA is a big pessimization, because it
> > blows the cache in other CPUs.
>
> Right.
>
> > That said, I would *love* it if FreeBSD grew ioatdma/dca support.
> > Jack, does Intel have any interest in porting DCA support to FreeBSD?
>
> Question for Jack or Drew, what DOES FreeBSD have to do to support
> DCA? I thought DCA was something you just enable on the NIC chipset
> and if the system is IOATDMA aware, it just works. Is that not right
> (assuming cache tags are correct and accessible)? i.e. I thought this
> was more hardware black magic than anything specific the OS has to do.
>

OK, let me see if I can clarify some of this.

First, there IS an I/OAT driver that I did for FreeBSD like 3 or 4
years ago, in the timeframe that we put the feature out.
However, at that time all it was good for was the DMA aspect of things,
and Prafulla used it to accelerate the stack copies. Interest did not
seem that great, so I put the code aside. It's now badly dated and
needs to be brought up to date, since there are a few different
versions of the hardware now.

At one point, maybe a year back, I started to take the code apart
thinking I would JUST do DCA. That got back-burnered due to other
higher priority issues, but it's still an item in my queue. I also had
a nibble of interest in using the DMA engine, so perhaps I should not
go down the road of just doing the DCA support in the I/OAT part of the
driver.

The question is how to make the infrastructure work. To answer
Alexander's question, DCA support is NOT in the NIC, it's in the
chipset. That's why the I/OAT driver was done as a separate driver,
with the NIC as the user of the info. It's been a while since I was
into the code, but if memory serves, the I/OAT driver just enables the
support in the chipset, and then the NIC driver configures its engine
to use it.

DCA and DMA were supported in Linux in the same driver, perhaps because
the chipset features were easily handled together; I'm not sure :)

Fabien's data earlier in this thread suggested that a strategically
placed prefetch did you more good than DCA did, if I recall. What do
you all think of that?

As far as I'm concerned, right now I am willing to resurrect the
driver, clean it up, and make the features available. We can see how
valuable they are after that. How does that sound?

Cheers,

Jack
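
For reference, the 14.8Mpps line-rate figure and the nanosecond-scale
packet spacing mentioned in the quoted discussion follow from standard
Ethernet framing overhead. Below is a minimal sketch of that arithmetic,
assuming minimum 64-byte frames, 8 bytes of preamble/SFD, and the
standard 12-byte minimum inter-packet gap. None of these constants or
function names come from the thread; they are illustrative only.

```python
# Worked arithmetic for 10GbE line rate with minimum-size frames.
# Assumptions (standard Ethernet framing, not taken from the thread):
#   64-byte minimum frame, 8 bytes preamble + SFD, 12-byte minimum IPG.

LINK_BPS = 10_000_000_000   # 10 Gb/s link speed
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including FCS
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
MIN_IPG_BYTES = 12          # minimum inter-packet gap


def wire_bits(frame_bytes: int = MIN_FRAME_BYTES) -> int:
    """Bits one frame occupies on the wire, including preamble and IPG."""
    return (frame_bytes + PREAMBLE_SFD_BYTES + MIN_IPG_BYTES) * 8


def packets_per_second(frame_bytes: int = MIN_FRAME_BYTES,
                       link_bps: int = LINK_BPS) -> float:
    """Maximum back-to-back packet rate for a given frame size."""
    return link_bps / wire_bits(frame_bytes)


def ns_per_packet(frame_bytes: int = MIN_FRAME_BYTES,
                  link_bps: int = LINK_BPS) -> float:
    """Time between the starts of consecutive back-to-back packets."""
    return wire_bits(frame_bytes) * 1e9 / link_bps


def ipg_ns(link_bps: int = LINK_BPS) -> float:
    """Duration of the minimum inter-packet gap on the wire."""
    return MIN_IPG_BYTES * 8 * 1e9 / link_bps


if __name__ == "__main__":
    print(f"packets/s: {packets_per_second():,.0f}")
    print(f"ns/packet: {ns_per_packet():.1f}")
    print(f"IPG in ns: {ipg_ns():.1f}")
```

Under these assumptions a minimum-size packet can arrive roughly every
67 ns, and the gap itself lasts under 10 ns, which gives a sense of the
timestamp resolution that one-way-delay or RTT measurements at 10G
would need.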