From: Jack Vogel
To: Adrian Chadd
Cc: FreeBSD Net, "freebsd-arch@freebsd.org"
Date: Mon, 14 Jul 2014 23:21:12 -0700
Subject: Re: UDP/TCP versus IP frames - subtle out of order packets with hardware hashing
List-Id: Discussion related to FreeBSD architecture

I had missed the fact that Alex turned this off in the Linux driver; it
sounds to me like it's the right thing to do for FreeBSD also.

Jack

On Mon, Jul 14, 2014 at 5:44 PM, Adrian Chadd wrote:

> Hi,
>
> Whilst digging into UDP receive side scaling on the Intel ixgbe(4)
> NIC, I stumbled across how it hashes traffic between IP fragmented
> traffic and non IP-fragmented traffic.
>
> Here's how it surfaced:
>
> * the ixgbe(4) NIC is configured to hash on both IP (2-tuple) and
> TCP/UDP (4-tuple);
> * when a non-fragmented UDP frame comes in, it's hashed on the 4-tuple
> and comes into queue A;
> * when a fragmented UDP frame comes in, it's hashed on the IP 2-tuple
> and comes into queue B.
>
> So if there's a mix of small and large datagrams, we'll end up with
> some packets coming in via queue A and some by queue B. In normal
> operation that'll result in out of order packets.
>
> For the RSS stuff I'm working on it means that some packets will match
> the PCBGROUP setup and some won't. By default UDP configures a 2-tuple
> hash so it expects packets to come in hashed appropriately. But that
> only matches for large frames. For small frames it'll be hashed via
> the 4-tuple and it won't match.
>
> The IP reassembly code doesn't recalculate the flowid/flowtype once
> it's finished. It'd be nice to do that before further processing so it
> can be placed in the right netisr.
>
> So there's a couple of semi-overlapping issues:
>
> * Right now we could get TCP and UDP frames out of order. I'd like to
> at least have ixgbe(4) hash on the 2-tuple for UDP rather than the
> 4-tuple. That fixes that silly corner case. It's not likely going to
> show up except for things like forwarding workloads. Maybe people
> doing memcached work, I'm not sure.
>
> * Whether or not to calculate the flowid/flowtype in ip_reass() (or
> maybe in the netisr input path, in case there's no flowid assigned) so
> work is better distributed;
>
> * .. then if we do that, we could do 4-tuple UDP hashing again and
> we'd just recalculate for any large frames.
>
> Here's what happened with Linux and ixgbe in 2010 on this topic:
>
> http://comments.gmane.org/gmane.linux.network/166687
>
> What do people think?
>
>
> -a
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
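
For anyone who wants to see the queue split from the quoted message in isolation, here is a rough sketch. It is a small Python model, not driver code. `toeplitz_hash` follows the usual RSS Toeplitz construction; `RSS_KEY` is the sample key commonly shown in RSS documentation (a real NIC may be programmed with a different one); `rss_input`, `pick_queue`, `flowid_after_reassembly`, and the modulo-based queue selection are hypothetical simplifications (hardware normally indexes an indirection table). The addresses and ports are arbitrary example values.

```python
import socket
import struct

# Sample 40-byte RSS key (assumption: the NIC uses this key; real drivers may not).
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec2"
    "4167253d43a38fb0"
    "d0ca2bcbae7b30b4"
    "77cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash: XOR the 32-bit key window at every set input bit."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 32:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            if byte & (0x80 >> b):
                # Window starts at the position of the current input bit.
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def rss_input(src_ip, dst_ip, src_port=None, dst_port=None) -> bytes:
    """Build the hash input: 2-tuple, or 4-tuple when ports are given."""
    data = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
    if src_port is not None and dst_port is not None:
        data += struct.pack("!HH", src_port, dst_port)
    return data


def pick_queue(src_ip, dst_ip, src_port, dst_port, *,
               fragmented, udp_4tuple, nqueues=8) -> int:
    """Hypothetical receive-queue choice for one UDP frame."""
    # A fragmented frame is hashed on the IP 2-tuple regardless of the
    # UDP hash setting, as described in the quoted message.
    if udp_4tuple and not fragmented:
        data = rss_input(src_ip, dst_ip, src_port, dst_port)
    else:
        data = rss_input(src_ip, dst_ip)
    # Simplification: modulo instead of an indirection table lookup.
    return toeplitz_hash(RSS_KEY, data) % nqueues


def flowid_after_reassembly(src_ip, dst_ip, src_port, dst_port) -> int:
    """Hypothetical software rehash once the whole datagram is available."""
    return toeplitz_hash(RSS_KEY, rss_input(src_ip, dst_ip, src_port, dst_port))


if __name__ == "__main__":
    flow = ("66.9.149.187", "161.142.100.80", 2794, 1766)
    for udp_4tuple in (True, False):
        small = pick_queue(*flow, fragmented=False, udp_4tuple=udp_4tuple)
        large = pick_queue(*flow, fragmented=True, udp_4tuple=udp_4tuple)
        print(f"udp_4tuple={udp_4tuple}: small -> queue {small}, "
              f"fragmented -> queue {large}")
```

With `udp_4tuple=False` both frame sizes of a flow always map to the same queue, which is the 2-tuple behaviour proposed above for ixgbe(4). `flowid_after_reassembly` models the alternative of recalculating the 4-tuple hash in software for reassembled datagrams.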