Date: Thu, 15 Mar 2012 20:00:41 +0100
From: Svatopluk Kraus <onwahe@gmail.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: hackers@freebsd.org
Subject: Re: [vfs] buf_daemon() slows down write() severely on low-speed CPU
Message-ID: <CAFHCsPXZ7MmaVpfN0H%2BaY5ubvY3L_guN2ugUOZ-wEnhKuJV=tw@mail.gmail.com>
In-Reply-To: <20120315112959.GP75778@deviant.kiev.zoral.com.ua>
References: <CAFHCsPVqNCYj-obQqS4iyKR-xK0AaRJU_6KX=fccEK4U8NaktQ@mail.gmail.com>
 <20120312181921.GF75778@deviant.kiev.zoral.com.ua>
 <CAFHCsPWZD065A0su_LJn8Q4RW1pft_DobbsgSph1NNZ=mNXYYw@mail.gmail.com>
 <20120315112959.GP75778@deviant.kiev.zoral.com.ua>

2012/3/15 Konstantin Belousov <kostikbel@gmail.com>:
> On Tue, Mar 13, 2012 at 01:54:38PM +0100, Svatopluk Kraus wrote:
>> On Mon, Mar 12, 2012 at 7:19 PM, Konstantin Belousov
>> <kostikbel@gmail.com> wrote:
>> > On Mon, Mar 12, 2012 at 04:00:58PM +0100, Svatopluk Kraus wrote:
>> >> Hi,
>> >>
>> >>    I have solved a following problem. If a big file (according to
>> >> 'hidirtybuffers') is being written, the write speed is very poor.
>> >>
>> >>    It's observed on system with elan 486 and 32MB RAM (i.e., low speed
>> >> CPU and not too much memory) running FreeBSD-9.
>> >>
>> >>    Analysis: A file is being written. All or almost all dirty buffers
>> >> belong to the file. The file vnode is almost all time locked by
>> >> writing process. The buf_daemon() can not flush any dirty buffer as a
>> >> chance to acquire the file vnode lock is very low. A number of dirty
>> >> buffers grows up very slow and with each new dirty buffer slower,
>> >> because buf_daemon() eats more and more CPU time by looping on dirty
>> >> buffers queue (with very low or no effect).
>> >>
>> >>    This slowing down effect is started by buf_daemon() itself, when
>> >> 'numdirtybuffers' reaches 'lodirtybuffers' threshold and buf_daemon()
>> >> is waked up by own timeout. The timeout fires at 'hz' period, but
>> >> starts to fire at 'hz/10' immediately as buf_daemon() fails to reach
>> >> 'lodirtybuffers' threshold. When 'numdirtybuffers' (now slowly)
>> >> reaches ((lodirtybuffers + hidirtybuffers) / 2) threshold, the
>> >> buf_daemon() can be waked up within bdwrite() too and it's much worse.
>> >> Finally and with very slow speed, the 'hidirtybuffers' or
>> >> 'dirtybufthresh' is reached, the dirty buffers are flushed, and
>> >> everything starts from beginning...
>> > Note that for some time, bufdaemon work is distributed among bufdaemon
>> > thread itself and any thread that fails to allocate a buffer, esp.
>> > a thread that owns vnode lock and covers long queue of dirty buffers.
>>
>> However, the problem starts when numdirtybuffers reaches
>> lodirtybuffers count and ends around hidirtybuffers count. There are
>> still plenty of free buffers in system.
>>
>> >>
>> >>    On the system, a buffer size is 512 bytes and the default
>> >> thresholds are following:
>> >>
>> >>    vfs.hidirtybuffers = 134
>> >>    vfs.lodirtybuffers = 67
>> >>    vfs.dirtybufthresh = 120
>> >>
>> >>    For example, a 2MB file is copied into flash disk in about 3
>> >> minutes and 15 seconds. If dirtybufthresh is set to 40, the copy time
>> >> is about 20 seconds.
>> >>
>> >>    My solution is a mix of three things:
>> >>    1. Suppression of buf_daemon() wakeup by setting bd_request to 1 in
>> >> the main buf_daemon() loop.
>> > I cannot understand this. Please provide a patch that shows what do
>> > you mean there.
>> >
>>       curthread->td_pflags |= TDP_NORUNNINGBUF | TDP_BUFNEED;
>>       mtx_lock(&bdlock);
>>       for (;;) {
>> -             bd_request = 0;
>> +             bd_request = 1;
>>               mtx_unlock(&bdlock);
> Is this a complete patch ? The change just causes lost wakeups for bufdaemon,
> nothing more.

Yes, it's a complete patch. And exactly, it causes lost wakeups which are:
1. !! UNREASONABLE !!, because bufdaemon is not sleeping,
2. not wanted, because it looks that it's correct behaviour for the
sleep with hz/10 period. However, if the sleep with hz/10 period is
expected to be waked up by bd_wakeup(), then bd_request should be set
to 0 just before sleep() call, and then bufdaemon behaviour will be
clear.

All stuff around bd_request and bufdaemon sleep is under bdlock, so
if bd_request is 0 and bufdaemon is not sleeping, then all wakeups are
unreasonable! The patch is about that mainly.

>
>>
>> I read description of bd_request variable. However, bd_request should
>> serve as an indicator that buf_daemon() is in sleep. I.e., the
>> following paradigm should be used:
>>
>> mtx_lock(&bdlock);
>> bd_request = 0;    /* now, it's only time when wakeup() will be meaningful */
>> sleep(&bd_request, ..., hz/10);
>> bd_request = 1;    /* in case of timeout, we must set it (bd_wakeup() already set it) */
>> mtx_unlock(&bdlock);
>>
>> My patch follows the paradigm. What happens without the patch in
>> described problem: buf_daemon() fails in its job and goes to sleep
>> with hz/10 period. It supposes that next early wakeup will do nothing
>> too. bd_request is untouched but buf_daemon() doesn't know if its last
>> wakeup was made by bd_wakeup() or by timeout. So, bd_request could be
>> 0 and buf_daemon() can be waked up before hz/10 just by bd_wakeup().
>> Moreover, setting bd_request to 0 when buf_daemon() is not in sleep
>> can cause time consuming and useless wakeup() calls without effect.
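
To make the argument about useless wakeups concrete, here is a minimal userland sketch of the flag protocol. It is a toy model under stated assumptions, not kernel code: the names bd_model, model_bd_wakeup() and count_useless_wakeups() are hypothetical, the model is single-threaded, every sleep is assumed to end by timeout, and any other condition the real bd_wakeup() checks is left out. It only counts how many times a wakeup() would be issued while the daemon is not asleep, depending on the value written to bd_request at the top of the loop.

```c
#include <assert.h>

/* Toy userland model of the flag protocol; NOT kernel code. */
struct bd_model {
	int bd_request;	/* 0: the next bd_wakeup() will call wakeup() */
	int sleeping;	/* 1 while the modelled daemon is in its sleep */
	int useless;	/* wakeup() calls made while the daemon was awake */
};

/* Simplified stand-in for bd_wakeup(): signal only when bd_request is 0. */
static void
model_bd_wakeup(struct bd_model *m)
{
	if (m->bd_request == 0) {
		m->bd_request = 1;
		if (!m->sleeping)
			m->useless++;	/* nobody is asleep on the channel */
		m->sleeping = 0;
	}
}

/*
 * Run 'iterations' daemon loops. 'loop_top_value' is what the loop writes
 * to bd_request at its top (0 = unpatched, 1 = patched). 'busy_calls' is
 * the number of bd_wakeup() calls arriving while the daemon is flushing.
 */
static int
count_useless_wakeups(int loop_top_value, int iterations, int busy_calls)
{
	struct bd_model m = { 0, 0, 0 };
	int i, j;

	assert(iterations >= 0 && busy_calls >= 0);
	for (i = 0; i < iterations; i++) {
		m.bd_request = loop_top_value;	/* top of the for (;;) loop */
		for (j = 0; j < busy_calls; j++)
			model_bd_wakeup(&m);	/* writer runs, daemon is awake */
		m.bd_request = 0;		/* cleared just before sleeping */
		m.sleeping = 1;
		/* Assumption: the sleep ends by timeout in this model. */
		m.sleeping = 0;
		m.bd_request = 1;
	}
	return (m.useless);
}
```

In this model, count_useless_wakeups(0, 10, 5) returns 10 and count_useless_wakeups(1, 10, 5) returns 0: with the flag cleared at the loop top, one wakeup() per iteration is issued while nobody sleeps, and with the flag set there it is none. The flag itself already limits the waste to one call per iteration, so the cost in the real system depends on how often the loop runs.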