Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 04 Apr 2021 20:01:43 +0000
From:      "Poul-Henning Kamp" <phk@phk.freebsd.dk>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Warner Losh <imp@bsdimp.com>, Mateusz Guzik <mjguzik@gmail.com>, FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: [SOLVED] Re: Strange behavior after running under high load
Message-ID:  <11597.1617566503@critter.freebsd.dk>
In-Reply-To: <YGoSSXzGDZBDl922@kib.kiev.ua>
References:  <58bea0f0-5c3d-4263-ebee-f939a7e169e9@freebsd.org> <494d4aab-487b-83c9-03f3-10cf470081c5@freebsd.org> <CAGudoHHDBxOWc_u6=c1v8x%2Bw-yfYEhv_-BALCj5t95HkobCZeA@mail.gmail.com> <81671.1617432659@critter.freebsd.dk> <CAGudoHFp4x3C7fzh-SM4DQ%2B7t3YuREuknUBd-VaO=%2Bs2th4J6A@mail.gmail.com> <CANCZdfrthB8QLbeF%2Bfux9i1H2_jF6LRppkYe1dhEt7URBo4qSw@mail.gmail.com> <YGn8%2BW/ipcysamdI@kib.kiev.ua> <11447.1617562904@critter.freebsd.dk> <YGoSSXzGDZBDl922@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
--------
Konstantin Belousov writes:

> > B) We lack a nuanced call-back to tell the subsystems to release some =
of their memory "without major delay".

> The delay in the wall clock sense does not drive the issue.

I didnt say anything about "wall clock" and you're missing my point by a w=
ide margin.

We need to make major memory consumers, like vnodes take action *before* s=
hortages happen, so that *when* they happen, a lot of memory can be releas=
ed to relive them.

> We cannot expect any io to proceed while we are low on memory [...]

Which is precisely why the top level goal should be for that to never happ=
en, while still allowing the freeable" memory to be used as a cache as muc=
h as possible.

> > C) We have never attempted to enlist userland, where jemalloc often ha=
ng on to a lot of unused VM pages.
> > =

> The userland does not add to this problem, [...]

No, but userland can help solve it:  The unused pages from jemalloc/userla=
nd can very quickly be released to relieve any imminent shortage the kerne=
l might have.

As can pages from vnodes, and for that matter socket buffers.

But there are always costs, actual costs, ie: what it will take to release=
 the memory (locking, VM mappings, washing) and potential costs (lack of f=
uture caching opportunities).

These costs need to be presented to the central memory allocator, so when =
it decides back-pressure is appropriate, it can decide who to punk for how=
 much memory.

> But normally operating system does not have an issue with user pages.  =


Only if you disregard all non-UNIX operating systems.

Many other kernels have cooperated with userland to balance memory (and fo=
r that matter disk-space).

Just imagine how much better the desktop experience would be, if we could =
send SIGVM to firefox to tell it stop being a memory-pig.

(At least two of the major operating systems in the desktop world does som=
ething like that today.)

> Io latency is not the factor there. We must avoid situations where
> instantiating a vnode stalls waiting for KVA to appear, similarly we
> must avoid system state where vnodes allocation consumed so much kmem
> that other allocations stall.

My argument is the precise opposite:  We must make vnodes and the allocati=
ons they cause responsive to the sytems overall memory availability, well =
in advance of the shortage happening in the first place.

> Quite indicative is that we do not shrink the vnode list on low memory
> events.  Vnlru also does not account for the memory pressure.

The only reason we do not, is that we cannot tell definitively if freeing =
a vnode will cause disk-I/O (which may not matter with SSD's) or even how =
much memory it might free, if anything.

-- =

Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    =

Never attribute to malice what can adequately be explained by incompetence=
.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?11597.1617566503>