Date:      Fri, 2 Apr 2021 23:18:10 +0200
From:      Stefan Esser <se@freebsd.org>
To:        FreeBSD CURRENT <freebsd-current@freebsd.org>
Cc:        Mateusz Guzik <mjguzik@gmail.com>
Subject:   [SOLVED] Re: Strange behavior after running under high load
Message-ID:  <494d4aab-487b-83c9-03f3-10cf470081c5@freebsd.org>
In-Reply-To: <58bea0f0-5c3d-4263-ebee-f939a7e169e9@freebsd.org>
References:  <58bea0f0-5c3d-4263-ebee-f939a7e169e9@freebsd.org>

On 28.03.21 at 16:39, Stefan Esser wrote:
> After a period of high load, my now idle system needs 4 to 10 seconds to
> run any trivial command - even after 20 minutes of no load ...
>
>
> I have run some Monte-Carlo simulations for a few hours, with initially 35
> processes running in parallel for some 10 seconds each.
>
> The load decreased over time since some parameter sets were faster to process.
> All in all 63000 processes ran within some 3 hours.
>
> When the system became idle, interactive performance was very bad. Running
> any trivial command (e.g. uptime) takes some 5 to 10 seconds. Since I have
> to have this system working, I plan to reboot it later today, but will keep
> it in this state for some more time to see whether this state persists or
> whether the system recovers from it.
>
> Any ideas what might cause such a system state???

It seems that Mateusz Guzik was right to mention performance issues when
the system is very low on vnodes. (Thanks!)

I have been able to reproduce the issue and have checked vnode stats:

kern.maxvnodes: 620370
kern.minvnodes: 155092
vm.stats.vm.v_vnodepgsout: 6890171
vm.stats.vm.v_vnodepgsin: 18475530
vm.stats.vm.v_vnodeout: 228516
vm.stats.vm.v_vnodein: 1592444
vfs.wantfreevnodes: 155092
vfs.freevnodes: 47	<----- obviously too low ...
vfs.vnodes_created: 19554702
vfs.numvnodes: 621284
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 6412
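
The check made by eye above can be sketched as a small script. This is
only an illustration and not part of the original report: it parses text
in the "name: value" format printed by sysctl and compares vfs.freevnodes
against vfs.wantfreevnodes. The helper names (parse_sysctl,
free_vnode_shortfall) and the embedded sample are assumptions made for
the example; the sample values are copied from the listing above.

```python
# Sample copied from the sysctl listing above (a subset of the lines).
SAMPLE = """\
kern.maxvnodes: 620370
kern.minvnodes: 155092
vfs.wantfreevnodes: 155092
vfs.freevnodes: 47
vfs.numvnodes: 621284
"""


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'name: value'
        # Use the first token only, so trailing annotations are ignored.
        tokens = value.split()
        if tokens and tokens[0].isdigit():
            stats[name.strip()] = int(tokens[0])
    return stats


def free_vnode_shortfall(stats):
    """Return how far vfs.freevnodes is below vfs.wantfreevnodes (0 if not below)."""
    return max(0, stats["vfs.wantfreevnodes"] - stats["vfs.freevnodes"])


stats = parse_sysctl(SAMPLE)
shortfall = free_vnode_shortfall(stats)
print(f"free vnodes: {stats['vfs.freevnodes']}, shortfall: {shortfall}")
```

For the values above, the shortfall is 155092 - 47 = 155045 vnodes.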

The freevnodes value stayed in this range for several minutes, with
typical program start times (e.g. for "uptime") of around 10 to
15 seconds.

After raising maxvnodes to 2,000,000 from roughly 600,000, system
performance is restored and I get:

kern.maxvnodes: 2000000
kern.minvnodes: 500000
vm.stats.vm.v_vnodepgsout: 7875198
vm.stats.vm.v_vnodepgsin: 20788679
vm.stats.vm.v_vnodeout: 261179
vm.stats.vm.v_vnodein: 1817599
vfs.wantfreevnodes: 500000
vfs.freevnodes: 205988	<----- far higher than before, but still below wantfreevnodes
vfs.vnodes_created: 19956502
vfs.numvnodes: 912880
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 20702
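
To make the difference between the two snapshots concrete, the
arithmetic can be written out as follows. This is a sketch for
illustration only; the dictionary layout and the function name
vnode_usage are assumptions, and the numbers are copied from the two
listings above.

```python
# Values copied from the two sysctl listings above.
BEFORE = {"kern.maxvnodes": 620370, "vfs.numvnodes": 621284,
          "vfs.freevnodes": 47, "vfs.wantfreevnodes": 155092}
AFTER = {"kern.maxvnodes": 2000000, "vfs.numvnodes": 912880,
         "vfs.freevnodes": 205988, "vfs.wantfreevnodes": 500000}


def vnode_usage(stats):
    """Return (headroom, percent_used, free_deficit) for one snapshot."""
    # Headroom is negative when numvnodes exceeds the configured limit.
    headroom = stats["kern.maxvnodes"] - stats["vfs.numvnodes"]
    percent_used = 100.0 * stats["vfs.numvnodes"] / stats["kern.maxvnodes"]
    # How many free vnodes are missing relative to the wanted number.
    free_deficit = stats["vfs.wantfreevnodes"] - stats["vfs.freevnodes"]
    return headroom, percent_used, free_deficit


for label, snap in (("before", BEFORE), ("after", AFTER)):
    headroom, used, deficit = vnode_usage(snap)
    print(f"{label}: headroom={headroom} used={used:.1f}% free deficit={deficit}")
```

With these numbers the first snapshot is 914 vnodes above kern.maxvnodes,
while the second has more than a million vnodes of headroom, although
vfs.freevnodes is below vfs.wantfreevnodes in both.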

I do not know why the performance impact is so high: there are still a
few free vnodes (more than are needed to load the shared libraries for,
e.g., the uptime program). Most probably each attempt to get a vnode
triggers a clean-up attempt that runs for a significant time, but has no
chance of getting anywhere near the goal of 155k or 500k free vnodes.

Anyway, kern.maxvnodes can be changed at run-time, so this is easy
to fix. It seems that no message is logged to report this situation.
A rate-limited hint to raise the limit should help other affected users.
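
The idea of a rate-limited hint can be sketched as below. This is a
user-space illustration in Python, not kernel code and not an existing
FreeBSD facility: the class RateLimitedHint, the function
check_free_vnodes, the 1% threshold, the 60 second interval and the
message text are all assumptions chosen for the example.

```python
import time


class RateLimitedHint:
    """Emit a message at most once per `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last = None  # time of the last emitted message, if any

    def maybe_emit(self, message, emit=print):
        """Call emit(message) unless one was emitted within the interval."""
        now = self.clock()
        if self.last is not None and now - self.last < self.interval:
            return False  # suppressed
        self.last = now
        emit(message)
        return True


def check_free_vnodes(free, wanted, hint):
    """Hypothetical policy: hint when free vnodes drop below 1% of the target."""
    if free * 100 < wanted:
        return hint.maybe_emit(
            f"only {free} free vnodes (target {wanted}); "
            "consider raising kern.maxvnodes")
    return False


hint = RateLimitedHint(interval=60.0)
check_free_vnodes(47, 155092, hint)  # emits the hint
check_free_vnodes(47, 155092, hint)  # suppressed: still within 60 seconds
```

The clock is passed in as a parameter so that the suppression logic can
be exercised without waiting for real time to pass.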

Regards, STefan


Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?494d4aab-487b-83c9-03f3-10cf470081c5>