FreeBSD Mail Archives

Date:      Thu, 15 Mar 2012 20:00:41 +0100
From:      Svatopluk Kraus <onwahe@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        hackers@freebsd.org
Subject:   Re: [vfs] buf_daemon() slows down write() severely on low-speed CPU
Message-ID:  <CAFHCsPXZ7MmaVpfN0H%2BaY5ubvY3L_guN2ugUOZ-wEnhKuJV=tw@mail.gmail.com>
In-Reply-To: <20120315112959.GP75778@deviant.kiev.zoral.com.ua>
References:  <CAFHCsPVqNCYj-obQqS4iyKR-xK0AaRJU_6KX=fccEK4U8NaktQ@mail.gmail.com> <20120312181921.GF75778@deviant.kiev.zoral.com.ua> <CAFHCsPWZD065A0su_LJn8Q4RW1pft_DobbsgSph1NNZ=mNXYYw@mail.gmail.com> <20120315112959.GP75778@deviant.kiev.zoral.com.ua>

2012/3/15 Konstantin Belousov <kostikbel@gmail.com>:
> On Tue, Mar 13, 2012 at 01:54:38PM +0100, Svatopluk Kraus wrote:
>> On Mon, Mar 12, 2012 at 7:19 PM, Konstantin Belousov
>> <kostikbel@gmail.com> wrote:
>> > On Mon, Mar 12, 2012 at 04:00:58PM +0100, Svatopluk Kraus wrote:
>> >> Hi,
>> >>
>> >>    I have solved a following problem. If a big file (according to
>> >> 'hidirtybuffers') is being written, the write speed is very poor.
>> >>
>> >>    It's observed on system with elan 486 and 32MB RAM (i.e., low speed
>> >> CPU and not too much memory) running FreeBSD-9.
>> >>
>> >>    Analysis: A file is being written. All or almost all dirty buffers
>> >> belong to the file. The file vnode is almost all time locked by
>> >> writing process. The buf_daemon() can not flush any dirty buffer as a
>> >> chance to acquire the file vnode lock is very low. A number of dirty
>> >> buffers grows up very slow and with each new dirty buffer slower,
>> >> because buf_daemon() eats more and more CPU time by looping on dirty
>> >> buffers queue (with very low or no effect).
>> >>
>> >>    This slowing down effect is started by buf_daemon() itself, when
>> >> 'numdirtybuffers' reaches 'lodirtybuffers' threshold and buf_daemon()
>> >> is waked up by own timeout. The timeout fires at 'hz' period, but
>> >> starts to fire at 'hz/10' immediately as buf_daemon() fails to reach
>> >> 'lodirtybuffers' threshold. When 'numdirtybuffers' (now slowly)
>> >> reaches ((lodirtybuffers + hidirtybuffers) / 2) threshold, the
>> >> buf_daemon() can be waked up within bdwrite() too and it's much worse.
>> >> Finally and with very slow speed, the 'hidirtybuffers' or
>> >> 'dirtybufthresh' is reached, the dirty buffers are flushed, and
>> >> everything starts from beginning...
>> > Note that for some time, bufdaemon work is distributed among bufdaemon
>> > thread itself and any thread that fails to allocate a buffer, esp.
>> > a thread that owns vnode lock and covers long queue of dirty buffers.
>>
>> However, the problem starts when numdirtybuffers reaches
>> lodirtybuffers count and ends around hidirtybuffers count. There are
>> still plenty of free buffers in system.
>>
>> >>
>> >>    On the system, a buffer size is 512 bytes and the default
>> >> thresholds are following:
>> >>
>> >>    vfs.hidirtybuffers = 134
>> >>    vfs.lodirtybuffers = 67
>> >>    vfs.dirtybufthresh = 120
>> >>
>> >>    For example, a 2MB file is copied into flash disk in about 3
>> >> minutes and 15 second. If dirtybufthresh is set to 40, the copy time
>> >> is about 20 seconds.
>> >>
>> >>    My solution is a mix of three things:
>> >>    1. Suppression of buf_daemon() wakeup by setting bd_request to 1 in
>> >> the main buf_daemon() loop.
>> > I cannot understand this. Please provide a patch that shows what do
>> > you mean there.
>> >
>>       curthread->td_pflags |= TDP_NORUNNINGBUF | TDP_BUFNEED;
>>       mtx_lock(&bdlock);
>>       for (;;) {
>> -             bd_request = 0;
>> +             bd_request = 1;
>>               mtx_unlock(&bdlock);
> Is this a complete patch ? The change just causes lost wakeups for bufdaemon,
> nothing more.
Yes, it's a complete patch. And exactly, it causes lost wakeups which are:
1. !! UNREASONABLE !!, because bufdaemon is not sleeping,
2. not wanted, because it looks like it's correct behaviour for the
sleep with hz/10 period. However, if the sleep with hz/10 period is
expected to be waked up by bd_wakeup(), then bd_request should be set
to 0 just before the sleep() call, and then bufdaemon behaviour will be
clear.

All stuff around bd_request and bufdaemon sleep is under bdlock, so
if bd_request is 0 and bufdaemon is not sleeping, then all wakeups are
unreasonable! The patch is about that mainly.
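
To make the point concrete, here is a small userspace model of the
bd_request gate. It is only a sketch under assumptions, not the kernel
code: struct bd_model, model_bd_wakeup(), model_daemon_pass() and
model_daemon_sleep() are hypothetical names made up for illustration,
and the only behaviour taken from this thread is that bd_wakeup() sets
bd_request to 1 and that the daemon loop sets it to 0 (original) or 1
(patched). Locking is left out because the model is single-threaded.

```c
#include <stdbool.h>

/* Hypothetical model state; not the real vfs_bio.c variables. */
struct bd_model {
    int  bd_request;      /* 0 = a wakeup() may be issued, 1 = suppressed */
    bool sleeping;        /* true only while the modelled daemon sleeps */
    int  useless_wakeups; /* wakeup() issued while the daemon was running */
    int  useful_wakeups;  /* wakeup() issued while the daemon was sleeping */
};

/* Modelled bd_wakeup(): issues a wakeup only when bd_request is 0. */
static void model_bd_wakeup(struct bd_model *m)
{
    if (m->bd_request != 0)
        return;                 /* gated: nothing to do */
    m->bd_request = 1;
    if (m->sleeping)
        m->useful_wakeups++;
    else
        m->useless_wakeups++;   /* nobody is asleep on bd_request */
}

/*
 * One busy pass of the modelled daemon loop.
 * loop_value is what bd_request gets at the top of the loop:
 * 0 models the original code, 1 models the patch.
 * writer_calls is how many times writers call bd_wakeup() meanwhile.
 */
static void model_daemon_pass(struct bd_model *m, int loop_value,
                              int writer_calls)
{
    m->bd_request = loop_value;
    m->sleeping = false;
    for (int i = 0; i < writer_calls; i++)
        model_bd_wakeup(m);
}

/* The sleep as proposed: open the gate only for the sleep itself. */
static void model_daemon_sleep(struct bd_model *m, int writer_calls)
{
    m->bd_request = 0;          /* only now is a wakeup() meaningful */
    m->sleeping = true;
    for (int i = 0; i < writer_calls; i++)
        model_bd_wakeup(m);
    m->sleeping = false;
    m->bd_request = 1;          /* covers the timeout case as well */
}
```

In this model every busy pass with loop_value 0 costs one wakeup() that
nobody can receive, while loop_value 1 costs none, and a wakeup during
model_daemon_sleep() still gets through exactly once.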

>
>>
>> I read the description of the bd_request variable. However, bd_request
>> should serve as an indicator that buf_daemon() is asleep. I.e., the
>> following paradigm should be used:
>>
>> mtx_lock(&bdlock);
>> bd_request = 0;    /* now, it's only time when wakeup() will be meaningful */
>> sleep(&bd_request, ..., hz/10);
>> bd_request = 1;   /* in case of timeout, we must set it (bd_wakeup()
>> already set it) */
>> mtx_unlock(&bdlock);
>>
>> My patch follows the paradigm. What happens without the patch in the
>> described problem: buf_daemon() fails in its job and goes to sleep
>> with hz/10 period. It supposes that the next early wakeup will do nothing
>> too. bd_request is untouched but buf_daemon() doesn't know if its last
>> wakeup was made by bd_wakeup() or by timeout. So, bd_request could be
>> 0 and buf_daemon() can be waked up before hz/10 just by bd_wakeup().
>> Moreover, setting bd_request to 0 when buf_daemon() is not asleep
>> can cause time-consuming and useless wakeup() calls without effect.
