Date:      Wed, 17 Aug 2016 10:25:55 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Ben RUBSON <ben.rubson@gmail.com>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: Unstable local network throughput
Message-ID:  <CAJ-VmonZ08-x0T=5S0aEB9qBtjda7beDBEaU1eY9=8jWhA_TnQ@mail.gmail.com>
In-Reply-To: <9ED07C8F-4E43-4C5F-A893-61F9ADA76E56@gmail.com>
References:  <3C0D892F-2BE8-4650-B9FC-93C8EE0443E1@gmail.com> <bed13ae3-0b8f-b1af-7418-7bf1b9fc74bc@selasky.org> <3B164B7B-CBFB-4518-B57D-A96EABB71647@gmail.com> <5D6DF8EA-D9AA-4617-8561-2D7E22A738C3@gmail.com> <BD0B68D1-CDCD-4E09-AF22-34318B6CEAA7@gmail.com> <CAJ-VmomW0Wth-uQU-OPTfRAsXW1kTDy-VyO2w-pgNosb-N1o=Q@mail.gmail.com> <B4D77A84-8F02-43E7-AD65-5B92423FC344@gmail.com> <CAJ-Vmo=Mfcvd41gtrt8GJfEtP-DQFfXt7pZ8eRLQzu73M=sX4A@mail.gmail.com> <7DD30CE7-32E6-4D26-91D4-C1D4F2319655@gmail.com> <CAJ-VmongwvbY3QqKBV+FJCHOfSdr-=v9CmLH1z=Tqwz19AtUpg@mail.gmail.com> <AF923C63-2414-4DCE-9FD9-CAE02E3AC8CE@gmail.com> <CAJ-VmonL8kVs3=BBg02cbzXA9NpAh-trdCBh4qkjw29dOCau-g@mail.gmail.com> <91AEB1BD-44EA-43AD-A9A1-6DEBF367DF9B@gmail.com> <EB650D09-5AAC-4425-9687-ED6BBCF63ED1@gmail.com> <CAJ-Vmo=J9GnUYYD7noxvd-RfvXmZn56UYn92wfg8Y3eHUc2-Vg@mail.gmail.com> <9ED07C8F-4E43-4C5F-A893-61F9ADA76E56@gmail.com>

On 17 August 2016 at 08:43, Ben RUBSON <ben.rubson@gmail.com> wrote:
>
>> On 17 Aug 2016, at 17:38, Adrian Chadd <adrian.chadd@gmail.com> wrote:
>>
>> [snip]
>>
>> ok, so this is what I was seeing when I was working on this stuff last.
>>
>> The big abusers are:
>>
>> * so_snd lock, for TX'ing producer/consumer socket data (see the
>> sketch below)
>> * tcp stack pcb locking (which RSS tries to work around, but again
>> that only helps across multiple sockets, not producer/consumer
>> locking on a single socket)
>> * for some of the workloads, the scheduler spinlocks are pretty
>> heavily contended, and that's likely worth digging into.
>>
>> Thanks! I'll go try this on a couple of boxes I have with
>> intel/chelsio 40g hardware in them and see if I can reproduce it. (My
>> test boxes have the 40g NICs in NUMA domain 1...)
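
(To make the so_snd producer/consumer point concrete, here's a minimal
userland sketch. It's entirely hypothetical, not code from this thread:
NPROD threads write through one shared socketpair, then through one
socketpair each, and the two aggregate rates are compared. If the
per-socket send-buffer lock is the bottleneck, the shared case should
scale visibly worse as NPROD grows. Assumes a POSIX-ish FreeBSD box
with pthreads.)

/*
 * Hypothetical microbenchmark, not from this thread: NPROD producer
 * threads write fixed-size messages either through ONE shared
 * socketpair (so every producer serializes on that socket's in-kernel
 * send-buffer lock) or through one socketpair EACH.  Comparing the two
 * aggregate rates makes the contention visible from userland.
 * Build: cc -O2 sosnd.c -o sosnd -lpthread
 */
#include <sys/socket.h>

#include <err.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define NPROD  4                 /* producer threads */
#define WRITES 200000            /* writes per producer */
#define MSGLEN 128               /* bytes per write */

static int wfd[NPROD];           /* write end used by each producer */

static void *
producer(void *arg)
{
        char msg[MSGLEN];
        int fd = wfd[(long)arg];

        memset(msg, 'x', sizeof(msg));
        for (int i = 0; i < WRITES; i++)
                if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg))
                        err(1, "write");
        return (NULL);
}

static void *
reader(void *arg)
{
        char buf[65536];
        int fd = (int)(long)arg;

        while (read(fd, buf, sizeof(buf)) > 0)  /* drain until EOF */
                ;
        close(fd);
        return (NULL);
}

static double
run(int shared)
{
        pthread_t prod[NPROD], rd[NPROD];
        struct timespec t0, t1;
        int sv[2], npairs = shared ? 1 : NPROD;

        for (int i = 0; i < npairs; i++) {
                if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
                        err(1, "socketpair");
                pthread_create(&rd[i], NULL, reader, (void *)(long)sv[1]);
                if (shared)
                        for (int j = 0; j < NPROD; j++)
                                wfd[j] = sv[0];
                else
                        wfd[i] = sv[0];
        }
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < NPROD; i++)
                pthread_create(&prod[i], NULL, producer, (void *)i);
        for (int i = 0; i < NPROD; i++)
                pthread_join(prod[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        for (int i = 0; i < npairs; i++)        /* EOF for the readers */
                close(shared ? wfd[0] : wfd[i]);
        for (int i = 0; i < npairs; i++)
                pthread_join(rd[i], NULL);
        double secs = (t1.tv_sec - t0.tv_sec) +
            (t1.tv_nsec - t0.tv_nsec) / 1e9;
        return ((double)NPROD * WRITES * MSGLEN / secs / 1e6);  /* MB/s */
}

int
main(void)
{
        printf("shared socket:       %8.1f MB/s\n", run(1));
        printf("socket per producer: %8.1f MB/s\n", run(0));
        return (0);
}

(The per-producer case is roughly what RSS buys you, independent
sockets with independent locks; the shared case is the
producer/consumer pattern that RSS doesn't help.)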
>
> You're welcome, happy to help and troubleshoot :)
>
> What about the performance differing from one reboot to another, as
> if the NUMA domains had switched (0 to 1 & 1 to 0)?
> Have you seen this before?

I've seen some varying behaviours, yeah. There are a lot of missing
pieces in kernel-side NUMA, so a lot of the kernel memory allocation
behaviours are undefined. Well, they're defined; it's just that there's
no way right now for kernel consumers (eg mbufs, etc) to allocate
domain-local memory. So it's "by accident", and sometimes it's fine;
sometimes it's not.
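
(For what it's worth, the userland half of this did land later: FreeBSD
12 added cpuset_setdomain(2), which lets a process at least prefer
pages from a given NUMA domain. The piece described above as missing is
the in-kernel side, e.g. mbuf allocation. A minimal sketch, with the
domain number an assumption you'd match to wherever the NIC actually
sits:)

/*
 * Hedged sketch, assuming FreeBSD 12 or later (this API postdates the
 * thread): cpuset_setdomain(2) asks the kernel to prefer pages from a
 * given NUMA domain for this process's allocations.  The domain number
 * below is an assumption, not something from this thread.
 * Build: cc -O2 pin.c -o pin
 */
#include <sys/param.h>
#include <sys/cpuset.h>
#include <sys/domainset.h>

#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
        domainset_t mask;
        const int nic_domain = 1;       /* assumed: domain with the 40g NIC */
        const size_t len = 64UL << 20;  /* 64 MB test buffer */

        DOMAINSET_ZERO(&mask);
        DOMAINSET_SET(nic_domain, &mask);

        /* Prefer (not require) pages from nic_domain from here on. */
        if (cpuset_setdomain(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1,
            sizeof(mask), &mask, DOMAINSET_POLICY_PREFER) == -1)
                err(1, "cpuset_setdomain");

        char *buf = malloc(len);
        if (buf == NULL)
                err(1, "malloc");
        memset(buf, 0, len);            /* first touch places the pages */
        printf("touched %zu MB preferring domain %d\n",
            len >> 20, nic_domain);
        free(buf);
        return (0);
}

(DOMAINSET_POLICY_PREFER is best-effort: under memory pressure the
kernel can still hand back pages from another domain, which is worth
keeping in mind when results vary between runs.)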



-adrian


