Date:      Tue, 30 Mar 2021 14:33:01 +0000
From:      bugzilla-noreply@freebsd.org
To:        net@FreeBSD.org
Subject:   [Bug 254333] [tcp] sysctl net.inet.tcp.hostcache.list hangs
Message-ID:  <bug-254333-7501-Sb4iqkca7I@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-254333-7501@https.bugs.freebsd.org/bugzilla/>
References:  <bug-254333-7501@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254333

--- Comment #13 from Richard Scheffenegger <rscheff@freebsd.org> ---
Thanks. I cannot comment on why hostcache.count counts up beyond the
limit.

But the other data confirms that what you are seeing is the (unsuccessful)
attempt by the sbuf_new function to allocate a huge amount of kernel memory.

When trying to dump all the entries of the hostcache, hostcache.list
tries to grab hostcache.cachelimit * 128 bytes, or 1966080 * 128 ~= 250 MB
of contiguous kernel memory - twice (!).
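
As a quick check of that arithmetic, here is a minimal user-space sketch.
It assumes the 128 bytes per entry quoted above; the names HC_ENTRY_BYTES,
hostcache_list_bytes and bytes_to_mib are made up for illustration and are
not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: per-entry output size, taken from the figure quoted above. */
#define HC_ENTRY_BYTES 128u

/* Bytes needed to hold the output for every possible entry at once. */
uint64_t
hostcache_list_bytes(uint64_t cachelimit)
{
	/* Refuse inputs whose product would not fit in 64 bits. */
	assert(cachelimit <= UINT64_MAX / HC_ENTRY_BYTES);
	return (cachelimit * HC_ENTRY_BYTES);
}

/* Convert a byte count to whole MiB (rounding down). */
uint64_t
bytes_to_mib(uint64_t bytes)
{
	return (bytes / (1024u * 1024u));
}
```

With cachelimit = 1966080 this gives 251658240 bytes, i.e. 240 MiB (roughly
250 MB) for a single buffer.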

(Also, if the count really is > cachelimit, hostcache.list may eventually
simply fail due to insufficient memory...)

The following diffs are under review, and should address these particular
issues:
o) immediately respond with the buffer size required, without actually
allocating/preparing that buffer
o) allocate only one bucket's worth of output and move the output
bucket-by-bucket to userspace (reducing the memory footprint temporarily
required from 2x 250 MB down to 1x 4kB).
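
The two points above can be sketched in plain user-space C. This is only a
model of the idea, not the code from the reviews: the table layout, the
sizes, and all names (hc_list, hc_out_fn, hc_sink_count, ...) are
hypothetical, the output callback stands in for the copy to userspace, and
locking is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define HC_BUCKETS	4	/* hypothetical number of hash buckets */
#define HC_BUCKET_MAX	3	/* hypothetical per-bucket entry limit */
#define HC_LINE_BYTES	128	/* per-entry output size quoted above */

struct hc_entry { unsigned addr; unsigned mtu; };
struct hc_bucket { int count; struct hc_entry e[HC_BUCKET_MAX]; };

/* Stand-in for the copy to userspace; returns 0 on success. */
typedef int (*hc_out_fn)(void *ctx, const char *data, size_t len);

/* Example sink: only counts the calls and bytes it receives. */
struct hc_sink { size_t bytes; int calls; };

int
hc_sink_count(void *ctx, const char *data, size_t len)
{
	struct hc_sink *s = ctx;

	(void)data;
	s->bytes += len;
	s->calls++;
	return (0);
}

/*
 * Walk the table one bucket at a time.  With out == NULL, only report
 * the worst-case size; nothing is formatted or buffered.  Otherwise
 * return the number of bytes handed to out.
 */
size_t
hc_list(const struct hc_bucket *tbl, int nbuckets, hc_out_fn out, void *ctx)
{
	char buf[HC_BUCKET_MAX * HC_LINE_BYTES];	/* one bucket's worth */
	size_t total = 0;

	if (out == NULL)
		return ((size_t)nbuckets * HC_BUCKET_MAX * HC_LINE_BYTES);

	for (int b = 0; b < nbuckets; b++) {
		size_t used = 0;

		for (int i = 0; i < tbl[b].count; i++) {
			int n = snprintf(buf + used, sizeof(buf) - used,
			    "%08x %u\n", tbl[b].e[i].addr, tbl[b].e[i].mtu);
			assert(n > 0 && (size_t)n < sizeof(buf) - used);
			used += (size_t)n;
		}
		/* Flush this bucket before the buffer is reused. */
		if (used > 0 && out(ctx, buf, used) != 0)
			break;
		total += used;
	}
	return (total);
}
```

The working buffer is sized for a single bucket, so its size no longer
depends on the total number of entries, and a size-only query returns
before any output is prepared.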

See
https://reviews.freebsd.org/D29471
https://reviews.freebsd.org/D29481
https://reviews.freebsd.org/D29483

Patching with only D29471 should mostly address the issue, although there
remain issues around holding a lock for an extended period of time while
repeatedly moving the output to userspace. This may have undesired side
effects, so you probably want to go with all three.

To "unstick" the system in this state, refrain from issuing hostcache.list
multiple times, and try to free up the above-mentioned chunk of in-kernel
memory (at least temporarily). That should allow all the processes stalled
in malloc to succeed eventually, and return properly.

But on systems with a high uptime or kernel memory churn, and a very large
hostcache.cachelimit, it really is only a question of time until such a huge
malloc blocks "indefinitely". The above patches therefore try to be much
smarter about how much kernel memory is really needed.

The downside is that there is a higher chance that hash buckets change
while the list is being produced than there would be with that huge memory
allocation (if successful).

But hostcache.list does not provide a "complete" snapshot of all the entries
in the hostcache at one specific moment in time anyway - they may already
change between the evaluation of different buckets (but the longer time for
moving data to userspace may allow more changes to happen).

-- 
You are receiving this mail because:
You are on the CC list for the bug.