From: John Baldwin
To: freebsd-stable@freebsd.org
Cc: Konstantin Belousov, smh@freebsd.org, Mark Johnston
Subject: Re: syncer causing latency spikes
Date: Wed, 17 Jul 2013 16:15:35 -0400
Message-Id: <201307171615.35484.jhb@freebsd.org>
In-Reply-To: <20130717191852.GS5991@kib.kiev.ua>
References: <20130717180720.GA8289@charmander> <20130717191852.GS5991@kib.kiev.ua>

On Wednesday, July 17, 2013 3:18:52 pm Konstantin Belousov wrote:
> On Wed, Jul 17, 2013 at 02:07:55PM -0400, Mark Johnston wrote:
> > During such an fsync, DTrace shows me that syncer sleeps of 50-200ms are
> > happening up to 8 or 10 times a second. When this happens, a bunch of
> > postgres threads become blocked in vn_write() waiting for the vnode lock
> > to become free. It looks like the write-clustering code is limited to
> > using (nswbuf / 2) pbufs, and FreeBSD prevents one from setting nswbuf
> > to anything greater than 256.
> Syncer is probably just a victim of profiling. Would postgres called
> fsync(2), you then blame the fsync code for the pauses.
>
> Just add a tunable to allow the user to manually-tune the nswbuf,
> regardless of the buffer cache sizing. And yes, nswbuf default max
> probably should be bumped to something like 1024, at least on 64bit
> architectures which do not starve for kernel memory.

Also, if you are seeing I/O stalls with mfi(4), then you might need a firmware
update for your mfi(4) controller. cc'ing smh@ who knows more about that
particular issue (IIRC).

-- 
John Baldwin