Date: Mon, 8 Aug 2016 15:52:16 +0200
From: Ben RUBSON <ben.rubson@gmail.com>
To: freebsd-net <freebsd-net@freebsd.org>
Subject: Re: Unstable local network throughput
Message-ID: <647B1F5C-EF03-4DC5-B5AC-75AD1995A20B@gmail.com>
In-Reply-To: <bc872304-aac7-2b21-2f83-7aea3cc82386@selasky.org>
References: <3C0D892F-2BE8-4650-B9FC-93C8EE0443E1@gmail.com>
 <bed13ae3-0b8f-b1af-7418-7bf1b9fc74bc@selasky.org>
 <3B164B7B-CBFB-4518-B57D-A96EABB71647@gmail.com>
 <5D6DF8EA-D9AA-4617-8561-2D7E22A738C3@gmail.com>
 <06E414D5-9CDA-46D1-A26F-0B07E76FDB34@gmail.com>
 <0b14bf39-ed71-b9fb-1998-bd9676466df6@selasky.org>
 <E5BE8DAC-AB6A-491E-A901-4E513367278B@gmail.com>
 <CAFMmRNz8WryZVVR-_OvB7Ad3tR1NqPpXpv_QEPkoffxdFzdUQw@mail.gmail.com>
 <A5742F79-C2C2-4040-A369-D8CFE6B48D33@gmail.com>
 <bc872304-aac7-2b21-2f83-7aea3cc82386@selasky.org>

> On 05 Aug 2016, at 10:30, Hans Petter Selasky <hps@selasky.org> wrote:
>
> On 08/04/16 23:49, Ben RUBSON wrote:
>>>
>>> On 04 Aug 2016, at 20:15, Ryan Stone <rysto32@gmail.com> wrote:
>>>
>>> On Thu, Aug 4, 2016 at 11:33 AM, Ben RUBSON <ben.rubson@gmail.com> wrote:
>>> But even without RSS, I should be able to go up to 2x40Gbps, don't you think so ?
>>> Nobody already did this ?
>>>
>>> Try this patch
>>> (...)
>>
>> I also just tested the NODEBUG kernel but it did not help.
>
> Hi,
>
> When running these tests, do you see any CPUs fully utilised?

No, CPUs look like this on both servers :

27 processes:  1 running, 26 sleeping
CPU 0:   1.1% user,  0.0% nice, 16.7% system,  0.0% interrupt, 82.2% idle
CPU 1:   1.1% user,  0.0% nice, 18.9% system,  0.0% interrupt, 80.0% idle
CPU 2:   1.9% user,  0.0% nice, 17.8% system,  0.0% interrupt, 80.4% idle
CPU 3:   1.1% user,  0.0% nice, 15.2% system,  0.0% interrupt, 83.7% idle
CPU 4:   0.4% user,  0.0% nice, 16.3% system,  0.0% interrupt, 83.3% idle
CPU 5:   1.1% user,  0.0% nice, 14.4% system,  0.0% interrupt, 84.4% idle
CPU 6:   2.6% user,  0.0% nice, 17.4% system,  0.0% interrupt, 80.0% idle
CPU 7:   2.2% user,  0.0% nice, 15.2% system,  0.0% interrupt, 82.6% idle
CPU 8:   1.1% user,  0.0% nice,  3.0% system, 15.9% interrupt, 80.0% idle
CPU 9:   0.0% user,  0.0% nice,  3.0% system, 32.2% interrupt, 64.8% idle
CPU 10:  0.0% user,  0.0% nice,  0.4% system, 58.9% interrupt, 40.7% idle
CPU 11:  0.0% user,  0.0% nice,  0.4% system, 77.4% interrupt, 22.2% idle
CPU 12:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 13:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 14:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 15:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 16:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 17:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 18:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 19:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 20:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 21:  0.0% user,  0.0% nice,  0.0% system,  0.4% interrupt, 99.6% idle
CPU 22:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 23:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle

Load is correctly spread over the NUMA connected to the NIC (the first 12 CPUs).
There is clearly enough power to fulfill the full-duplex link !
I tried many cpuset configurations (IRQs over the 12 CPUs etc...), but no improvement at all.

> Did you check the RX/TX pauseframes settings and the mlx4 sysctl statistics counters, if there is packet loss?

I tried to disable RX/TX pauseframes, but it did not help.
And "sysctl -a | grep mlx | grep err" counters are all 0.
I also played with ring size, adaptive interrupt moderation... with no luck.

Ben
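
For anyone who wants to check a per-CPU listing like the one above
mechanically rather than by eye, here is a minimal Python sketch. It is
not something used in this thread: the names parse_top and busiest are
made up for illustration, and the regular expression only assumes the
line layout shown in this message.

```python
import re

# Matches one per-CPU line in the layout pasted in this message, e.g.
# "CPU 11:  0.0% user,  0.0% nice,  0.4% system, 77.4% interrupt, 22.2% idle"
CPU_LINE = re.compile(
    r"CPU\s+(\d+):\s+([\d.]+)% user,\s+([\d.]+)% nice,\s+([\d.]+)% system,"
    r"\s+([\d.]+)% interrupt,\s+([\d.]+)% idle"
)

FIELDS = ("user", "nice", "system", "interrupt", "idle")


def parse_top(text):
    """Return {cpu_id: {"user": ..., "nice": ..., ...}} for every CPU line found."""
    stats = {}
    for match in CPU_LINE.finditer(text):
        values = [float(v) for v in match.groups()[1:]]
        stats[int(match.group(1))] = dict(zip(FIELDS, values))
    return stats


def busiest(stats):
    """Return (cpu_id, busy_percent) for the CPU with the least idle time."""
    cpu = min(stats, key=lambda c: stats[c]["idle"])
    return cpu, 100.0 - stats[cpu]["idle"]


# Three lines copied from the listing above, used as sample input.
SAMPLE = """\
CPU 10:  0.0% user,  0.0% nice,  0.4% system, 58.9% interrupt, 40.7% idle
CPU 11:  0.0% user,  0.0% nice,  0.4% system, 77.4% interrupt, 22.2% idle
CPU 12:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
"""

if __name__ == "__main__":
    cpu, busy = busiest(parse_top(SAMPLE))
    print(f"busiest CPU: {cpu} ({busy:.1f}% busy)")
```

Feeding it the full listing reports which CPU is closest to saturation,
which is the question being asked about fully utilised CPUs.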