Date: Mon, 8 Aug 2016 15:52:16 +0200
From: Ben RUBSON <ben.rubson@gmail.com>
To: freebsd-net <freebsd-net@freebsd.org>
Subject: Re: Unstable local network throughput
Message-ID: <647B1F5C-EF03-4DC5-B5AC-75AD1995A20B@gmail.com>
In-Reply-To: <bc872304-aac7-2b21-2f83-7aea3cc82386@selasky.org>
References: <3C0D892F-2BE8-4650-B9FC-93C8EE0443E1@gmail.com> <bed13ae3-0b8f-b1af-7418-7bf1b9fc74bc@selasky.org> <3B164B7B-CBFB-4518-B57D-A96EABB71647@gmail.com> <5D6DF8EA-D9AA-4617-8561-2D7E22A738C3@gmail.com> <06E414D5-9CDA-46D1-A26F-0B07E76FDB34@gmail.com> <0b14bf39-ed71-b9fb-1998-bd9676466df6@selasky.org> <E5BE8DAC-AB6A-491E-A901-4E513367278B@gmail.com> <CAFMmRNz8WryZVVR-_OvB7Ad3tR1NqPpXpv_QEPkoffxdFzdUQw@mail.gmail.com> <A5742F79-C2C2-4040-A369-D8CFE6B48D33@gmail.com> <bc872304-aac7-2b21-2f83-7aea3cc82386@selasky.org>

> On 05 Aug 2016, at 10:30, Hans Petter Selasky <hps@selasky.org> wrote:
>
> On 08/04/16 23:49, Ben RUBSON wrote:
>>>
>>> On 04 Aug 2016, at 20:15, Ryan Stone <rysto32@gmail.com> wrote:
>>>
>>> On Thu, Aug 4, 2016 at 11:33 AM, Ben RUBSON <ben.rubson@gmail.com> wrote:
>>> But even without RSS, I should be able to go up to 2x40Gbps, don't you think so ?
>>> Nobody already did this ?
>>>
>>> Try this patch
>>> (...)
>>
>> I also just tested the NODEBUG kernel but it did not help.
>
> Hi,
>
> When running these tests, do you see any CPUs fully utilised?

No, CPUs look like this on both servers :

27 processes:  1 running, 26 sleeping
CPU 0:   1.1% user,  0.0% nice, 16.7% system,  0.0% interrupt, 82.2% idle
CPU 1:   1.1% user,  0.0% nice, 18.9% system,  0.0% interrupt, 80.0% idle
CPU 2:   1.9% user,  0.0% nice, 17.8% system,  0.0% interrupt, 80.4% idle
CPU 3:   1.1% user,  0.0% nice, 15.2% system,  0.0% interrupt, 83.7% idle
CPU 4:   0.4% user,  0.0% nice, 16.3% system,  0.0% interrupt, 83.3% idle
CPU 5:   1.1% user,  0.0% nice, 14.4% system,  0.0% interrupt, 84.4% idle
CPU 6:   2.6% user,  0.0% nice, 17.4% system,  0.0% interrupt, 80.0% idle
CPU 7:   2.2% user,  0.0% nice, 15.2% system,  0.0% interrupt, 82.6% idle
CPU 8:   1.1% user,  0.0% nice,  3.0% system, 15.9% interrupt, 80.0% idle
CPU 9:   0.0% user,  0.0% nice,  3.0% system, 32.2% interrupt, 64.8% idle
CPU 10:  0.0% user,  0.0% nice,  0.4% system, 58.9% interrupt, 40.7% idle
CPU 11:  0.0% user,  0.0% nice,  0.4% system, 77.4% interrupt, 22.2% idle
CPU 12:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 13:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 14:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 15:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 16:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 17:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 18:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 19:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 20:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 21:  0.0% user,  0.0% nice,  0.0% system,  0.4% interrupt, 99.6% idle
CPU 22:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 23:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle

Load is correctly spread over the NUMA connected to the NIC (the first 12 CPUs).
There is clearly enough power to fulfill the full-duplex link !
I tried many cpuset configurations (IRQs over the 12 CPUs etc...), but no improvement at all.

> Did you check the RX/TX pauseframes settings and the mlx4 sysctl statistics counters, if there is packet loss?

I tried to disable RX/TX pauseframes, but it did not help.
And "sysctl -a | grep mlx | grep err" counters are all 0.
I also played with ring size, adaptive interrupt moderation... with no luck.

Ben
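
For anyone wanting to repeat the "is any CPU fully utilised?" check on their own output, here is a minimal sketch in Python. It assumes per-CPU lines in the same format as the listing above; the names `parse_cpu_lines` and `saturated` and the 90% threshold are hypothetical choices made for illustration and are not part of any FreeBSD tool.

```python
import re

# Matches one per-CPU line in the format shown in the listing above.
CPU_LINE = re.compile(
    r"CPU\s+(\d+):\s+([\d.]+)% user,\s+([\d.]+)% nice,\s+([\d.]+)% system,"
    r"\s+([\d.]+)% interrupt,\s+([\d.]+)% idle"
)

def parse_cpu_lines(text):
    """Return {cpu_id: busy_percent}, where busy is taken as 100 - idle."""
    usage = {}
    for m in CPU_LINE.finditer(text):
        cpu = int(m.group(1))
        idle = float(m.group(6))
        usage[cpu] = round(100.0 - idle, 1)
    return usage

def saturated(usage, threshold=90.0):
    """Return the sorted CPU ids whose busy percentage reaches the threshold.

    The 90% default is an arbitrary illustrative cut-off.
    """
    return sorted(cpu for cpu, busy in usage.items() if busy >= threshold)

# Three lines copied from the listing above.
sample = """
CPU 9:   0.0% user,  0.0% nice,  3.0% system, 32.2% interrupt, 64.8% idle
CPU 11:  0.0% user,  0.0% nice,  0.4% system, 77.4% interrupt, 22.2% idle
CPU 12:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
"""

usage = parse_cpu_lines(sample)
print(usage)
print(saturated(usage))
```

On these three sample lines the busiest CPU is CPU 11 at 77.8% busy, so nothing reaches the 90% threshold, which matches the reading above that no single CPU is the bottleneck.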
