From: Dustin Marquess
To: "freebsd-net@freebsd.org"
Date: Mon, 25 Apr 2016 01:18:30 -0500
Subject: Re: Issues with ixl(4)
References: <2C78DBCF-26F2-44D0-A45E-6EE8918648EA@netapp.com>

So I've done some more testing, and it's definitely some kind of
interaction between ixl & lagg, and maybe even ix & lagg. It doesn't
matter whether lagg is using "lacp" or "loadbalance" (with the switch set
appropriately); it happens with both.

I did find out that statically adding an ARP entry for the "bad hosts"
fixes it, so it's something to do with ARP replies (tcpdump on the lagg
doesn't always show the ARP replies arriving).

Pretty much exactly this problem:

https://lists.freebsd.org/pipermail/freebsd-net/2015-June/042593.html

Except that fix is already in the code. I tried going all the way back to
r294499 of -CURRENT and that didn't change it. I also tried 10.3, but that
immediately panics on the Intel-based ixl machine. I'll see if I can get
the AMD-based ix machine to boot 10.3 for testing.

-Dustin

On Thu, Apr 21, 2016 at 4:52 PM, K. Macy wrote:
>
>
> On Wednesday, April 20, 2016, Dustin Marquess wrote:
>>
>> I tried backing out that change and everything worked for a few minutes
>> and then started acting up again. Then I noticed Sean Bruno's "TCP Packets
>> Drop!!!" email about LACP.
>> I disabled LACP on the switch side and then
>> changed the lagg config from "lacp" to "roundrobin", and so far so good. On
>> the switch side it looks like member ports were randomly bouncing in the
>> LACP bundle, and when I'd tcpdump an interface I wouldn't see anything until
>> another LACP & LLDP packet came in.
>>
>> So something seems to be broken with lagg's LACP support recently. The
>> good news is I don't think the route caching is causing this problem. I'll
>> put it back in and retest to make sure though.
>>
>
>
> Glad to hear I was in error.
> -M
>
>>
>> Thanks for the help!
>> -Dustin
>>
>> On Tue, Apr 19, 2016 at 6:15 PM, K. Macy wrote:
>>>
>>> On Mon, Apr 18, 2016 at 10:45 PM, Eggert, Lars wrote:
>>> > I haven't played with lagg+vlan+bridge, but I briefly evaluated XL710
>>> > boards last year
>>> > (https://lists.freebsd.org/pipermail/freebsd-net/2015-October/043584.html)
>>> > and saw very poor throughputs and latencies even in very simple setups. As
>>> > far as I could figure it out, TSO/LRO wasn't being performed (although
>>> > enabled) and so I ran into packet-rate issues.
>>> >
>>> > I basically gave up and went with a different vendor. FWIW, the XL710
>>> > boards in the same machines booted into Linux performed fine.
>>> >
>>>
>>> FWIW, NFLX sees performance close to that of cxgbe (by far the best
>>> maintained, best performing FreeBSD 40G driver) with an iflib
>>> converted driver. The iflib updated driver will be imported by 11 but
>>> won't become the default driver until 11.1 for want of QA resources at
>>> Intel.
>>>
>>> -M
>>
>>
>
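
For anyone trying to reproduce the ARP symptom described at the top of this
message, here is a rough sketch of one way to spot ARP requests that never
get a reply in a capture. It is not something from the thread: it assumes
the one-line text format printed by "tcpdump -n ... arp", and the function
name, patterns, and sample addresses are all made up for illustration.

```python
import re

# Assumed tcpdump -n text format for ARP lines, for example:
#   "... ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 28"
#   "... ARP, Reply 10.0.0.2 is-at 00:11:22:33:44:55, length 46"
REQUEST = re.compile(r"ARP, Request who-has (\S+)(?: \(\S+\))? tell ([^,\s]+)")
REPLY = re.compile(r"ARP, Reply (\S+) is-at ([0-9a-fA-F:]+)")


def unanswered_arp_targets(lines):
    """Return the sorted target IPs that were requested but never answered."""
    asked, answered = set(), set()
    for line in lines:
        m = REQUEST.search(line)
        if m:
            asked.add(m.group(1))      # IP somebody asked about
            continue
        m = REPLY.search(line)
        if m:
            answered.add(m.group(1))   # IP that announced its MAC address
    return sorted(asked - answered)


# Hypothetical capture lines (addresses are invented for the example).
sample = [
    "01:18:30.000001 ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 28",
    "01:18:30.000200 ARP, Reply 10.0.0.2 is-at 00:11:22:33:44:55, length 46",
    "01:18:31.000001 ARP, Request who-has 10.0.0.3 tell 10.0.0.1, length 28",
]
print(unanswered_arp_targets(sample))
```

Running it over captures taken on the lagg and on each member port at the
same time might help show where the replies go missing, though it only
looks at what tcpdump printed and says nothing about why a reply is absent.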