Date: Mon, 23 Jan 2017 16:39:36 +0100
From: Olivier Cochard-Labbé <olivier@freebsd.org>
To: Matthew Macy <mmacy@nextbsd.org>
Cc: Sean Bruno <sbruno@freebsd.org>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Subject: Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending
Message-ID: <CA%2Bq%2BTcpHmuOGyp5A290WmUvGTnOSse7v8gj4=R8kZ=m51-_s4A@mail.gmail.com>
In-Reply-To: <159902b73ed.10775291e21533.7488368455500235608@nextbsd.org>
References: <30f21c75-d3a2-edcd-1999-d5ed9f970c06@freebsd.org>
 <b000a957-8d17-a04d-6275-0d3920aa8a17@freebsd.org>
 <CA%2Bq%2BTcramTrYgYT-s%2B=aBZzRJV8FmKQqGt=1twPhLBR7AoXkcQ@mail.gmail.com>
 <1598d97bf2a.c6bcb76838987.6501340920645175463@nextbsd.org>
 <574a7ac7-4842-9518-8286-a4d89a9f7a27@freebsd.org>
 <CA%2Bq%2BTco-dcoU8EZnDEzgoK-v2Q2=U5GF6ASMSj0kwzd_wB5xig@mail.gmail.com>
 <6c6cb534-73c7-464b-8af1-7445a9c0188c@freebsd.org>
 <1598f29d379.ea6360351471.8752933472741761813@nextbsd.org>
 <CA%2Bq%2BTcpUXXPEQtdMFup6EZzyCKs9Ep%2BnS5SB%2Bfm6bSJSDs34_w@mail.gmail.com>
 <1598f3f8588.d20017893749.339651164872952258@nextbsd.org>
 <1598f42ad77.eeec05be4113.9201780237587761460@nextbsd.org>
 <CA%2Bq%2BTcp5LwrnXt75tNpYpAr1KWx9YpLx5kMHhPR%2BYgAs__n1eA@mail.gmail.com>
 <159902b73ed.10775291e21533.7488368455500235608@nextbsd.org>

On Thu, Jan 12, 2017 at 1:54 AM, Matthew Macy <mmacy@nextbsd.org> wrote:

> > A flame graph for the core cycle count and a flame graph with cache
> > miss stats from pmc would be a great start.
> >
> > I didn't know the exact event name to use for cache miss stats, but
> > here are the flame graphs for CPU_CLK_UNHALTED_CORE:
> > http://dev.bsdrp.net/netgate.r311848.CPU_CLK_UNHALTED_CORE.svg
> > http://dev.bsdrp.net/netgate.r311849.CPU_CLK_UNHALTED_CORE.svg
>
> Thanks. Having twice as many txqs would definitely help. It's also clear
> that there may be some sort of performance issue in iflib_txq_drain.
> Although it could just be non-stop cache misses on the packet headers.

Any news about the performance issue in iflib_txq_drain?

On different hardware (PC Engines APU2), I see a 20% performance drop
between r311848 and r311849:

x head r311848: packets per second
+ head r311849: packets per second
+--------------------------------------------------------------------------+
| ++                                                                      x|
|+++                                                                 x xx x|
|                                                                     |_A_||
||A|                                                                       |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5        580021        588650        585676      585406.1     3550.8673
+   5        463865        467599        465428      465638.6     1437.9347
Difference at 95.0% confidence
        -119768 +/- 3950.78
        -20.4589% +/- 0.558328%
        (Student's t, pooled s = 2708.9)

Because it's an AMD processor, I couldn't find the pmc equivalent of
CPU_CLK_UNHALTED_CORE, so I used BU_CPU_CLK_UNHALTED instead, but I have
no idea whether it's the right one.

http://dev.bsdrp.net/apu2.r311848.BU_CPU_CLK_UNHALTED.svg
http://dev.bsdrp.net/apu2.r311849.BU_CPU_CLK_UNHALTED.svg

Thanks
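
For anyone who wants to check the ministat figures above, the difference, the pooled standard deviation and the 95% interval can be recomputed from the summary rows alone. This is only a minimal sketch in Python, not ministat's own code: the function name is made up for illustration, and the Student's t critical value (2.306 for 8 degrees of freedom, two-sided 95%) is hard-coded from a standard table as an assumption. The relative error ministat prints (+/- 0.558328%) uses a different formula and is not reproduced here.

```python
import math

# Summary rows copied from the ministat output in this message.
N_REF, AVG_REF, SD_REF = 5, 585406.1, 3550.8673   # x: head r311848
N_NEW, AVG_NEW, SD_NEW = 5, 465638.6, 1437.9347   # +: head r311849

# Assumption: two-sided 95% Student's t critical value for
# N_REF + N_NEW - 2 = 8 degrees of freedom, taken from a t table.
T_CRIT_95_DF8 = 2.306


def pooled_t_difference(n1, avg1, sd1, n2, avg2, sd2, t_crit):
    """Hypothetical helper: difference of means with a pooled-variance
    Student's t confidence half-width."""
    dof = n1 + n2 - 2
    # Pooled standard deviation of the two samples.
    pooled_s = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof)
    # Half-width of the confidence interval on the difference of means.
    half_width = t_crit * pooled_s * math.sqrt(1.0 / n1 + 1.0 / n2)
    diff = avg2 - avg1
    # Difference relative to the reference (first) data set, in percent.
    rel_pct = 100.0 * diff / avg1
    return diff, half_width, pooled_s, rel_pct


diff, half_width, pooled_s, rel_pct = pooled_t_difference(
    N_REF, AVG_REF, SD_REF, N_NEW, AVG_NEW, SD_NEW, T_CRIT_95_DF8
)

print(f"difference: {diff:.1f} +/- {half_width:.2f} pps")
print(f"relative:   {rel_pct:.4f}%")
print(f"pooled s:   {pooled_s:.1f}")
```

Since the half-width (about 3951 pps) is far smaller than the difference itself (about 119768 pps), the drop is well outside measurement noise for these five runs.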