Date:      Sat, 28 Jul 2018 13:33:36 -0700
From:      Adrian Chadd <adrian.chadd@gmail.com>
To:        ryan@ixsystems.com, FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: 9k jumbo clusters
Message-ID:  <CAJ-VmomHQ%2BzcJ%2BHXAjMg9aS1RPZsdHy0tYjdKzjpwrUY%2B05NiQ@mail.gmail.com>
In-Reply-To: <20180727221843.GZ2884@funkthat.com>
References:  <EBDE6EDD-D875-43D8-8D65-1F1344A6B817@ixsystems.com> <20180727221843.GZ2884@funkthat.com>

On Fri, 27 Jul 2018 at 15:19, John-Mark Gurney <jmg@funkthat.com> wrote:

> Ryan Moeller wrote this message on Fri, Jul 27, 2018 at 12:45 -0700:
> > There is a long-standing issue with 9k mbuf jumbo clusters in FreeBSD.
> > For example:
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183381
> > https://lists.freebsd.org/pipermail/freebsd-net/2013-March/034890.html
> >
> > This comment suggests the 16k pool does not have the fragmentation problem:
> > https://reviews.freebsd.org/D11560#239462
> > I'm curious whether that has been confirmed.
> >
> > Is anyone working on the pathological case with 9k jumbo clusters in the
> > physical memory allocator?  There was an interesting discussion started a
> > few years ago but I'm not sure what ever came of it:
> > http://docs.freebsd.org/cgi/mid.cgi?21225.20047.947384.390241
> >
> > I have seen some work in the direction of avoiding larger than page size
> > jumbo clusters in 12-CURRENT.  Many existing drivers avoid the 9k cluster
> > size already.  The code for larger cluster sizes in iflib is #ifdef'd out
> > so it maxes out at the page size jumbo clusters until "CONTIGMALLOC_WORKS"
> > (apparently it doesn't).
> >
> > With all the changes due to iflib, is there any chance some of this will
> > get MFC'd to address the serious problem that remains in 11-STABLE?
> >
> > Otherwise, would it be feasible to disable the use of the 9k cluster pool
> > in at least some of the popular NIC drivers as a solution for the stable
> > branches?
> >
> > Finally, I have studied some of the driver code in 11-STABLE and posted the
> > gist of my notes in relation to this problem.  If anyone spots a mistake or
> > has something else to contribute, comments on the gist would be greatly
> > appreciated!
> > https://gist.github.com/freqlabs/eba9b755f17a223260246becfbb150a1
>
> Drivers need to be fixed to use 4k pages instead of clusters.  I really hope
> no one is using a card that can't do 4k pages, or if they are, then they
> should get a real card that can do scatter/gather on 4k pages for jumbo
> frames.


Yeah, but it's 2018 and your server has at minimum a dozen million 4k
pages.

So if you're doing stuff like lots of network packet kerchunking why not
have specialised allocator paths that can do things like "hey, always give
me 64k physical contig pages for storage/mbufs because you know what?
they're going to be allocated/freed together always."
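
As a rough illustration of that idea (not code from this thread, and not
kernel code), here is a minimal userland sketch of a fixed-size chunk pool.
It reserves one contiguous arena up front and hands out 64k chunks from a
free stack, so the general-purpose allocator is never asked to find a
contiguous run under load. All names (chunk_pool, chunk_alloc, and so on)
are hypothetical, and the arena is only virtually contiguous here; a kernel
version would have to sit on top of a physically contiguous allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed chunk size, taken from the "64k" in the text above. */
#define CHUNK_SIZE ((size_t)64 * 1024)

/* Hypothetical pool: one contiguous arena carved into equal chunks. */
struct chunk_pool {
	unsigned char *arena;	/* single contiguous backing region */
	void **free_list;	/* stack of currently free chunks */
	size_t nfree;		/* number of entries on the stack */
	size_t nchunks;		/* total chunks in the arena */
};

/* Reserve the whole arena once; returns 0 on success, -1 on failure. */
static int
chunk_pool_init(struct chunk_pool *cp, size_t nchunks)
{
	size_t i;

	if (nchunks == 0)
		return (-1);
	/* C11 aligned_alloc: size is a multiple of the alignment. */
	cp->arena = aligned_alloc(CHUNK_SIZE, nchunks * CHUNK_SIZE);
	cp->free_list = malloc(nchunks * sizeof(*cp->free_list));
	if (cp->arena == NULL || cp->free_list == NULL) {
		free(cp->arena);
		free(cp->free_list);
		return (-1);
	}
	for (i = 0; i < nchunks; i++)
		cp->free_list[i] = cp->arena + i * CHUNK_SIZE;
	cp->nfree = cp->nchunks = nchunks;
	return (0);
}

/* Pop a chunk off the free stack; NULL when the pool is exhausted. */
static void *
chunk_alloc(struct chunk_pool *cp)
{
	if (cp->nfree == 0)
		return (NULL);
	return (cp->free_list[--cp->nfree]);
}

/* Push a chunk back; it must have come from this pool. */
static void
chunk_free(struct chunk_pool *cp, void *p)
{
	assert(p != NULL && cp->nfree < cp->nchunks);
	cp->free_list[cp->nfree++] = p;
}

/* Release the arena and the free stack. */
static void
chunk_pool_destroy(struct chunk_pool *cp)
{
	free(cp->free_list);
	free(cp->arena);
}
```

Allocation and free are a constant-time pop and push, and because every
chunk has the same size the pool itself cannot fragment. The cost is that
the memory is reserved up front whether or not the workload needs it.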

There was always a race between bus bandwidth, memory bandwidth and
bus/memory latencies. I'm not currently on the disk/packet pushing side of
things, but the last couple times I was it was at different points in that
4d space and almost every single time there was a benefit from having a
couple of specialised allocators so you didn't have to try and manage a few
dozen million 4k pages based on your changing workload.

I enjoy the 4k page size management stuff for my 128MB routers. Your 128G
server has a lot of 4k pages. It's a bit silly.
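
For scale, the page counts behind that comparison can be worked out with a
few lines of C. This is only a sketch: the helper name page_count is
hypothetical, and "128MB" and "128G" are assumed to mean 128 MiB and
128 GiB.

```c
#include <stdint.h>

/* Pages of page_bytes needed to cover mem_bytes, rounding up. */
static uint64_t
page_count(uint64_t mem_bytes, uint64_t page_bytes)
{
	return ((mem_bytes + page_bytes - 1) / page_bytes);
}
```

Under those assumptions, page_count(128ULL << 20, 4096) is 32768 pages for
the router and page_count(128ULL << 30, 4096) is 33554432 pages for the
server. Managing the same server in 64k units, page_count(128ULL << 30,
65536), brings that down to 2097152.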




-adrian


