Date:      Mon, 29 Mar 2021 03:11:05 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Stefan Esser <se@freebsd.org>
Cc:        Andriy Gapon <avg@freebsd.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: Strange behavior after running under high load
Message-ID:  <CAGudoHFQN6EkR2Y33sKwbooUGqP-oLJ0yqjpL3HuE7gn7vRLPQ@mail.gmail.com>
In-Reply-To: <c64447e2-4ed4-a948-e15a-efcd66c271a9@freebsd.org>
References:  <58bea0f0-5c3d-4263-ebee-f939a7e169e9@freebsd.org> <e8398b06-de30-63a8-94b1-a336ed32ed27@FreeBSD.org> <c64447e2-4ed4-a948-e15a-efcd66c271a9@freebsd.org>

This may be the problem fixed in
e9272225e6bed840b00eef1c817b188c172338ee ("vfs: fix vnlru marker
handling for filtered/unfiltered cases").

However, there is a long-standing performance bug: if the vnode limit
is hit and there is nothing to reclaim, the code just sleeps for one
second.
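
A rough way to check whether a system is sitting at the limit is to
compare vfs.numvnodes against kern.maxvnodes. The sketch below is an
illustration only: it assumes the usual "name: value" output of
sysctl(8), the helper names are hypothetical, and the sample numbers are
made up rather than taken from the affected machine.

```python
import subprocess


def parse_sysctl(text):
    """Parse 'name: value' lines as printed by sysctl(8) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value.strip())
    return values


def vnode_headroom(values):
    """Return how many vnodes can still be allocated before the limit."""
    return values["kern.maxvnodes"] - values["vfs.numvnodes"]


def read_vnode_sysctls():
    """Query the live values; this only works on a FreeBSD host."""
    out = subprocess.run(
        ["sysctl", "kern.maxvnodes", "vfs.numvnodes", "vfs.freevnodes"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sysctl(out)


# Made-up sample output, used so the sketch runs on any platform.
sample = "kern.maxvnodes: 1000\nvfs.numvnodes: 1000\nvfs.freevnodes: 0\n"
vals = parse_sysctl(sample)
print(vnode_headroom(vals))  # prints 0
```

On a live FreeBSD system, vnode_headroom(read_vnode_sysctls()) gives the
same figure for the running kernel. A headroom at or near zero together
with vfs.freevnodes at zero would be consistent with the situation
described above, though it does not prove it.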

On 3/28/21, Stefan Esser <se@freebsd.org> wrote:
> Am 28.03.21 um 17:44 schrieb Andriy Gapon:
>> On 28/03/2021 17:39, Stefan Esser wrote:
>>> After a period of high load, my now idle system needs 4 to 10 seconds to
>>> run any trivial command - even after 20 minutes of no load ...
>>>
>>>
>>> I have run some Monte-Carlo simulations for a few hours, with initially
>>> 35 processes running in parallel for some 10 seconds each.
>>
>> I saw somewhat similar symptoms with 13-CURRENT some time ago.
>> To me it looked like even small kernel memory allocations took a very long
>> time.
>> But it was hard to properly diagnose that as my favorite tool, dtrace,
>> was also affected by the same problem.
>
> That could have been the case - but I had to reboot to recover the system.
>
> I had let it sit idle for a few hours and the last "time uptime" before
> the reboot took 15 seconds of real time to complete.
>
> Response from within the shell (e.g. "echo *") was instantaneous, though.
>
> I tried to trace the program execution of "uptime" with truss and found
> that the loading of shared libraries proceeded at about one or two per
> second until all were attached and then the program quickly printed the
> expected results.
>
> I could probably recreate the issue by running the same set of programs
> that triggered it a few hours ago, but this is a production system and
> I need it to be operational through the week ...
>
> Regards, STefan
>
>


-- 
Mateusz Guzik <mjguzik gmail.com>


