Date: Wed, 17 Jul 2013 17:46:27 -0400
From: Mark Johnston <markj@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Cc: Konstantin Belousov <kostikbel@gmail.com>, smh@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: syncer causing latency spikes
Message-ID: <20130717214627.GC8289@charmander>
In-Reply-To: <201307171615.35484.jhb@freebsd.org>
References: <20130717180720.GA8289@charmander> <20130717191852.GS5991@kib.kiev.ua> <201307171615.35484.jhb@freebsd.org>

On Wed, Jul 17, 2013 at 04:15:35PM -0400, John Baldwin wrote:
> On Wednesday, July 17, 2013 3:18:52 pm Konstantin Belousov wrote:
> > On Wed, Jul 17, 2013 at 02:07:55PM -0400, Mark Johnston wrote:
> > > During such an fsync, DTrace shows me that syncer sleeps of 50-200ms are
> > > happening up to 8 or 10 times a second. When this happens, a bunch of
> > > postgres threads become blocked in vn_write() waiting for the vnode lock
> > > to become free. It looks like the write-clustering code is limited to
> > > using (nswbuf / 2) pbufs, and FreeBSD prevents one from setting nswbuf
> > > to anything greater than 256.
> > Syncer is probably just a victim of profiling. Would postgres called
> > fsync(2), you then blame the fsync code for the pauses.
> >
> > Just add a tunable to allow the user to manually-tune the nswbuf,
> > regardless of the buffer cache sizing. And yes, nswbuf default max
> > probably should be bumped to something like 1024, at least on 64bit
> > architectures which do not starve for kernel memory.
>
> Also, if you are seeing I/O stalls with mfi(4), then you might need a
> firmware update for your mfi(4) controller. cc'ing smh@ who knows more about
> that particular issue (IIRC).

I tried upgrading the firmware to the latest available image (I believe
it was from March), but that didn't help. I wouldn't call my problem a
stall in the sense of commands timing out (which I've seen before), it's
just that we manage to generate a large enough backlog that the
driver/controller take at least several seconds to clear it, during
which all I/O is stalled in the kernel.
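
For reference, the pbuf arithmetic from the quoted text can be sketched as follows. This is only an illustration of the numbers mentioned in the thread (a cap of 256 on nswbuf, a suggested cap of 1024, and clustering using nswbuf / 2 pbufs); the function and constant names are made up for the example and are not taken from the kernel source, and integer division is an assumption.

```python
# Illustration only: names are hypothetical, not kernel identifiers.
CURRENT_MAX_NSWBUF = 256    # cap mentioned in the thread
PROPOSED_MAX_NSWBUF = 1024  # value suggested in the thread for 64-bit systems


def cluster_pbuf_limit(nswbuf: int) -> int:
    """pbufs available to write clustering, per the (nswbuf / 2) rule above."""
    return nswbuf // 2  # integer division assumed


if __name__ == "__main__":
    for cap in (CURRENT_MAX_NSWBUF, PROPOSED_MAX_NSWBUF):
        print(f"nswbuf={cap}: clustering limited to {cluster_pbuf_limit(cap)} pbufs")
```

So under these assumptions, raising the cap from 256 to 1024 would raise the clustering limit from 128 to 512 pbufs.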