Date:      Thu, 11 Aug 2016 08:59:09 +0200
From:      Ben RUBSON <ben.rubson@gmail.com>
To:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: Unstable local network throughput
Message-ID:  <7DD30CE7-32E6-4D26-91D4-C1D4F2319655@gmail.com>
In-Reply-To: <CAJ-Vmo=Mfcvd41gtrt8GJfEtP-DQFfXt7pZ8eRLQzu73M=sX4A@mail.gmail.com>
References:  <3C0D892F-2BE8-4650-B9FC-93C8EE0443E1@gmail.com> <bed13ae3-0b8f-b1af-7418-7bf1b9fc74bc@selasky.org> <3B164B7B-CBFB-4518-B57D-A96EABB71647@gmail.com> <5D6DF8EA-D9AA-4617-8561-2D7E22A738C3@gmail.com> <BD0B68D1-CDCD-4E09-AF22-34318B6CEAA7@gmail.com> <CAJ-VmomW0Wth-uQU-OPTfRAsXW1kTDy-VyO2w-pgNosb-N1o=Q@mail.gmail.com> <B4D77A84-8F02-43E7-AD65-5B92423FC344@gmail.com> <CAJ-Vmo=Mfcvd41gtrt8GJfEtP-DQFfXt7pZ8eRLQzu73M=sX4A@mail.gmail.com>


> On 11 Aug 2016, at 00:11, Adrian Chadd <adrian.chadd@gmail.com> wrote:
>
> hi,
>
> ok, let's start by getting the NUMA bits into the kernel so you can
> mess with things.
>
> add this to the kernel
>
> options MAXMEMDOM=8
> (which hopefully is enough)
> options VM_NUMA_ALLOC
> options DEVICE_NUMA
>
> Then reboot and post your 'dmesg' output to the list. This should show
> exactly which domain devices are in.

http://pastebin.com/raw/yaYEytME

> Install the 'intel-pcm' package. There's a 'pcm-numa.x' command - do
> kldload cpuctl, then run pcm-numa.x and see if it works. It should
> give us some useful information about NUMA.
> (Same as pcm-memory.x, pcm-pcie.x, etc.)

Yes, these tools work:

# pcm-numa.x

 Intel(r) Performance Counter Monitor: NUMA monitoring utility
 Copyright (c) 2009-2016 Intel Corporation

Number of physical cores: 12
Number of logical cores: 24
Number of online logical cores: 24
Threads (logical cores) per physical core: 2
Num sockets: 2
Physical cores per socket: 6
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2400000000 Hz
Package thermal spec power: 85 Watt; Package minimum power: 31 Watt; Package maximum power: 170 Watt;
ERROR: QPI LL monitoring device (0:127:9:2) is missing. The QPI statistics will be incomplete or missing.
Socket 0: 2 memory controllers detected with total number of 5 channels. 1 QPI ports detected.
ERROR: QPI LL monitoring device (0:255:9:2) is missing. The QPI statistics will be incomplete or missing.
Socket 1: 2 memory controllers detected with total number of 5 channels. 1 QPI ports detected.
Socket 0
Max QPI link 0 speed: 16.0 GBytes/second (8.0 GT/second)
Socket 1
Max QPI link 0 speed: 16.0 GBytes/second (8.0 GT/second)

Detected Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz "Intel(r) microarchitecture codename Haswell-EP/EN/EX"
Update every 1.0 seconds
Time elapsed: 1010 ms
Core | IPC  | Instructions | Cycles  |  Local DRAM accesses | Remote DRAM Accesses
   0   0.70       1158 K     1655 K       577                 245
   1   0.33        186 K      557 K       160                  15
   2   0.43        317 K      745 K       385                  31
   3   0.36        260 K      718 K       232                  33
   4   0.31        186 K      602 K       188                  11
   5   0.39        314 K      806 K       371                  43
   6   0.36        235 K      659 K       257                  46
   7   0.35        200 K      576 K       133                  44
   8   0.42        423 K     1011 K       226                  20
   9   0.60       1309 K     2199 K       379                 104
  10   0.34        192 K      562 K       161                  26
  11   0.38        257 K      684 K       158                  44
  12   0.35        185 K      528 K        39                 121
  13   0.32        199 K      616 K        51                 171
  14   0.31        184 K      594 K        34                 130
  15   0.35        272 K      783 K        47                 256
  16   0.31        178 K      579 K        26                 127
  17   0.37        272 K      729 K        87                 204
  18   0.52        485 K      942 K        35                 204
  19   0.40        285 K      723 K        16                 147
  20   0.31        195 K      620 K        10                 134
  21   0.33        201 K      615 K        30                 114
  22   0.29        176 K      612 K        24                 110
  23   0.52        896 K     1716 K        86                 895
-------------------------------------------------------------------------------------------------------------------
   *   0.43       8575 K       19 M      3712                3275
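
In this sample, cores 0-11 report more local than remote DRAM accesses, and cores 12-23 the opposite. To put a number on that, here is a minimal Python sketch that parses rows like the ones above and prints the share of remote accesses per row. The names (ROW, parse_rows, remote_share, SAMPLE) are made up for illustration and are not part of intel-pcm. The regular expression only assumes the column layout seen in this one run, so it may need adjusting for other pcm-numa.x versions.

```python
import re

# One data row: core id (or "*" for the total line), IPC, instructions,
# cycles, local DRAM accesses, remote DRAM accesses. Instructions and
# cycles may carry a K/M/G suffix. Layout assumed from the output above.
ROW = re.compile(
    r"^\s*(\d+|\*)\s+([\d.]+)\s+(\d+)\s*([KMG]?)\s+(\d+)\s*([KMG]?)"
    r"\s+(\d+)\s+(\d+)\s*$"
)


def parse_rows(text):
    """Return {core: (local, remote)} for every line that matches ROW."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m.group(1)] = (int(m.group(7)), int(m.group(8)))
    return rows


def remote_share(local, remote):
    """Fraction of DRAM accesses that were remote (0.0 if there were none)."""
    total = local + remote
    return remote / total if total else 0.0


# Three rows copied from the output above; paste the full table to see all cores.
SAMPLE = """\
   0   0.70       1158 K     1655 K       577                 245
  12   0.35        185 K      528 K        39                 121
   *   0.43       8575 K       19 M      3712                3275
"""

if __name__ == "__main__":
    for core, (local, remote) in parse_rows(SAMPLE).items():
        print(f"core {core:>2}: {remote_share(local, remote):.0%} remote")
```

The header and separator lines do not match the pattern and are skipped, so the whole pcm-numa.x table can be pasted in as-is.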

> Then next is playing around with interrupt thread / userland cpuset
> and memory affinity. We can look at that next.

Waiting for your instructions!

Ben



