Date:      Sat, 24 Mar 2012 14:17:57 -0700
From:      Jack Vogel <jfvogel@gmail.com>
To:        Juli Mallett <jmallett@freebsd.org>
Cc:        freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject:   Re: nmbclusters: how do we want to fix this for 8.3 ?
Message-ID:  <CAFOYbcm8_UHTcw_7ZC2fg3Z-AMVcqNK+WEA1HNsjhnmiS5yNMQ@mail.gmail.com>
In-Reply-To: <CACVs6=_kGtQX05baYdi2xqG380uLpcmn9WWo4NeGZ+vrXEnXZw@mail.gmail.com>
References:  <CAFOYbc=oU5DxZDZQZZe4wJhVDoP=ocVOnpDq7bT=HbVkAjffLQ@mail.gmail.com> <20120222205231.GA81949@onelab2.iet.unipi.it> <1329944986.2621.46.camel@bwh-desktop> <20120222214433.GA82582@onelab2.iet.unipi.it> <CAFOYbc=BWkvGuqAOVehaYEVc7R_4b1Cq1i7Ged=-YEpCekNvfA@mail.gmail.com> <134564BB-676B-49BB-8BDA-6B8EB8965969@netasq.com> <ji5ldg$8tl$1@dough.gmane.org> <CACVs6=_avBzUm0mJd+kNvPuBodmc56wHmdg_pCrAODfztVnamw@mail.gmail.com> <20120324200853.GE2253@funkthat.com> <CAFOYbcm_UySny1pUq2hYBcLDpCq6-BwBZLYVEnwAwcy6vtcvng@mail.gmail.com> <CACVs6=_kGtQX05baYdi2xqG380uLpcmn9WWo4NeGZ+vrXEnXZw@mail.gmail.com>

This whole issue only came up on a system with 10G devices, and only igb
does anything like you're talking about, not a device/driver on most low-end
systems.  So, we are trading red herrings it would seem.

I'm not opposed to economizing things in a sensible way; it was I who
brought the issue up, after all :)

Jack


On Sat, Mar 24, 2012 at 2:02 PM, Juli Mallett <jmallett@freebsd.org> wrote:

> On Sat, Mar 24, 2012 at 13:33, Jack Vogel <jfvogel@gmail.com> wrote:
> > On Sat, Mar 24, 2012 at 1:08 PM, John-Mark Gurney <jmg@funkthat.com>
> wrote:
> >> If we had some sort of tuning algorithm that would keep track of the
> >> current receive queue usage depth, and always keep enough mbufs on the
> >> queue to handle the largest expected burst of packets (either historical,
> >> or by looking at the largest TCP window size, etc.), this would both
> >> improve memory usage and in general reduce the number of required mbufs
> >> on the system...  If you have fast processors, you might be able to get
> >> away with fewer mbufs since you can drain the receive queue faster, but
> >> on slower systems, you would use more mbufs.
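
To make that concrete, here is a rough userland sketch of the sort of
feedback loop being described: track the high-water mark of ring occupancy
from the rx path, then periodically resize the posted buffer count toward
the largest recent burst plus some slack, clamped between a floor and a
ceiling.  The structure, the names, and the 25% slack figure are invented
for illustration, not anything in an existing driver.

#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical per-queue bookkeeping: how full the receive ring got
 * since the last adjustment, and how many buffers are currently posted.
 */
struct rxq_stats {
	uint32_t posted;	/* mbufs currently posted to the ring */
	uint32_t hiwat;		/* largest observed burst (ring occupancy) */
	uint32_t min_bufs;	/* floor, e.g. 128 */
	uint32_t max_bufs;	/* ceiling, e.g. the hardware ring size */
};

/* Called from the rx interrupt/poll path with the current occupancy. */
static void
rxq_observe(struct rxq_stats *q, uint32_t occupancy)
{
	if (occupancy > q->hiwat)
		q->hiwat = occupancy;
}

/*
 * Called periodically (say once a second): resize toward the largest
 * recent burst plus 25% slack, clamped to [min_bufs, max_bufs], then
 * let the high-water mark decay so an idle queue can shrink again.
 */
static uint32_t
rxq_retune(struct rxq_stats *q)
{
	uint32_t target = q->hiwat + q->hiwat / 4;

	if (target < q->min_bufs)
		target = q->min_bufs;
	if (target > q->max_bufs)
		target = q->max_bufs;
	q->posted = target;
	q->hiwat /= 2;		/* bursts must recur to keep the memory */
	return (target);
}

int
main(void)
{
	struct rxq_stats q = { .posted = 128, .min_bufs = 128,
	    .max_bufs = 4096 };

	rxq_observe(&q, 900);	/* a burst nearly filled 900 ring slots */
	printf("retuned to %u buffers\n", (unsigned)rxq_retune(&q));
	rxq_observe(&q, 40);	/* mostly idle afterwards */
	printf("retuned to %u buffers\n", (unsigned)rxq_retune(&q));
	return (0);
}
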
> >
> > These are the days when machines might have 64 GIGABYTES of main storage,
> > so having sufficient memory to run high-performance networking seems
> > little to ask.
>
> I think the suggestion is that this should be configurable.  FreeBSD
> is also being used on systems, in production, doing networking-related
> tasks, with <128MB of RAM.  And it works fine, more or less.
>
> >> This tuning would also fix the problem of interfaces not coming up,
> >> since at boot each interface might only allocate 128 or so mbufs, and
> >> then dynamically grow as necessary...
> >
> > You want modern fast networked servers but only give them 128 mbufs?
> > Ya right, allocating memory takes time, so when you do this people will
> > whine about latency :)
>
> Allocating memory doesn't have to take much time.  A multi-queue
> driver could steal mbufs from an underutilized queue.  It could grow
> the number of descriptors based on load.  Some of those things are
> hard to implement in the first place and harder to cover the corner
> cases of, but not all.
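
A rebalancing pass along those lines could be as simple as the toy sketch
below: find the queue with the largest surplus over its recent demand and
credit its spare buffers to the queue that is running dry.  Everything here
(the struct, the names, the notion of "demand") is invented for
illustration; a real driver would have to repost the buffers under the
queue locks and in whatever descriptor format the hardware expects.

#include <stdint.h>
#include <stdio.h>

/* Toy per-queue state: buffers currently posted vs. recent peak demand. */
struct toy_rxq {
	uint32_t posted;
	uint32_t demand;	/* high-water mark since the last rebalance */
};

/*
 * Move up to 'want' spare buffers from the queue with the largest
 * surplus to queue 'dst'.  Returns how many were actually moved.
 */
static uint32_t
rxq_steal(struct toy_rxq *qs, int nq, int dst, uint32_t want)
{
	uint32_t spare, best_spare = 0;
	int i, victim = -1;

	for (i = 0; i < nq; i++) {
		spare = (qs[i].posted > qs[i].demand) ?
		    qs[i].posted - qs[i].demand : 0;
		if (i != dst && spare > best_spare) {
			best_spare = spare;
			victim = i;
		}
	}
	if (victim < 0)
		return (0);
	if (want > best_spare)
		want = best_spare;
	qs[victim].posted -= want;
	qs[dst].posted += want;
	return (want);
}

int
main(void)
{
	struct toy_rxq qs[2] = {
		{ .posted = 1024, .demand = 100 },	/* underutilized */
		{ .posted = 256,  .demand = 250 },	/* nearly dry */
	};

	printf("moved %u buffers to queue 1\n",
	    (unsigned)rxq_steal(qs, 2, 1, 512));
	return (0);
}
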
>
> > When you start pumping 10G...40G...100G, the scale of the system
> > is different; thinking in terms of the old 10Mb or 100Mb days just
> > doesn't work.
>
> This is a red herring.  Yes, some systems need to do 40/100G.  They
> require special tuning.  The default shouldn't assume that everyone's
> getting maximum pps.  This seems an especially silly argument when
> much of the silicon available can't even keep up with maximum packet
> rates with minimally-sized frames, at 10G or even at 1G.
>
> But again, 1G NICs are the default now.  Does every FreeBSD system
> with a 1G NIC have loads of memory?  No.  I have an Atheros system
> with 2 1G NICs and 256MB of RAM.  It can't do anything at 1gbps.  Not
> even drop packets.  Why should its memory usage model be tuned for
> something it can't do?
>
> I'm not saying it should be impossible to allocate a bajillion
> gigaquads of memory to receive rings, I certainly do it myself all the
> time.  But choosing defaults is a tricky thing, and systems that are
> "pumping 10G" need other tweaks anyway, whether that's enabling
> forwarding or something else.  Because they have to be configured for
> the task that they are to do.  If part of that is increasing the
> number of receive descriptors (as the Intel drivers already allow us
> to do -- thanks, Jack) and the number of queues, is that such a bad
> thing?  I really don't think it makes sense for my 8-core system or my
> 16-core system to come up with 8 or 16 queues *per interface*.  That
> just doesn't make sense.  8/N or 16/N queues where N is the number of
> interfaces makes more sense under heavy load.  1 queue per port is
> *ideal* if a single core can handle the load of that interface.
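
For what it's worth, the division being argued for here is trivial to
compute; roughly the following, where clamping to at least one queue and to
the hardware maximum are the only wrinkles (the function name and
parameters are illustrative, not an existing API).

#include <stdio.h>

/* Queues per port: share the cores across ports, clamp to [1, hw_max]. */
static int
queues_per_port(int ncpu, int nports, int hw_max)
{
	int q = ncpu / (nports > 0 ? nports : 1);

	if (q < 1)
		q = 1;
	if (q > hw_max)
		q = hw_max;
	return (q);
}

int
main(void)
{
	/* 16 cores, 4 ports, hardware limit of 8 queues: 4 queues per port. */
	printf("%d queues per port\n", queues_per_port(16, 4, 8));
	return (0);
}
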
>
> > Sorry but the direction is to scale everything, not pare back on the
> > network, IMHO.
>
> There is not just one direction.  There is not just one point of
> scaling.  Relatively new defaults do not necessarily have to be
> increased in the future.  I mean, should a 1G NIC use 64 queues on a
> 64-core system that can do 100gbps @ 64 bytes on one core?  It's
> actively harmful to performance.  The answer to "what's the most
> sensible default?" is not "what does a system that just forwards
> packets need?"  A system that just forwards packets already needs IPs
> configured and a sysctl set.  If we make it easier to change the
> tuning of the system for that scenario, then nobody's going to care
> what our defaults are, or think us "slow" for them.
>
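
For reference, the sort of per-role configuration being talked about is
already only a few lines on a forwarding box today.  Roughly the following;
the exact tunable names vary by driver and release, so treat the
driver-specific ones as examples rather than gospel:

# /etc/rc.conf -- make the box a router
gateway_enable="YES"            # sets net.inet.ip.forwarding=1 at boot

# /boot/loader.conf -- scale buffering and queues for the workload
kern.ipc.nmbclusters="262144"   # more mbuf clusters for 10G forwarding
hw.igb.rxd="2048"               # deeper receive rings (igb-specific example)
hw.igb.num_queues="2"           # cap queues per port (igb-specific example)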


