From: Mark Johnston
Date: Wed, 17 Jul 2013 17:46:27 -0400
To: John Baldwin
Cc: Konstantin Belousov, smh@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: syncer causing latency spikes
Message-ID: <20130717214627.GC8289@charmander>
In-Reply-To: <201307171615.35484.jhb@freebsd.org>
References: <20130717180720.GA8289@charmander> <20130717191852.GS5991@kib.kiev.ua> <201307171615.35484.jhb@freebsd.org>

On Wed, Jul 17, 2013 at 04:15:35PM -0400, John Baldwin wrote:
> On Wednesday, July 17, 2013 3:18:52 pm Konstantin Belousov wrote:
> > On Wed, Jul 17, 2013 at 02:07:55PM -0400, Mark Johnston wrote:
> > > During such an fsync, DTrace shows me that syncer sleeps of 50-200ms are
> > > happening up to 8 or 10 times a second. When this happens, a bunch of
> > > postgres threads become blocked in vn_write() waiting for the vnode lock
> > > to become free. It looks like the write-clustering code is limited to
> > > using (nswbuf / 2) pbufs, and FreeBSD prevents one from setting nswbuf
> > > to anything greater than 256.
> > Syncer is probably just a victim of profiling. Had postgres called
> > fsync(2), you would then blame the fsync code for the pauses.
> >
> > Just add a tunable to allow the user to manually tune nswbuf,
> > regardless of the buffer cache sizing.
> > And yes, the default nswbuf maximum should probably be bumped to
> > something like 1024, at least on 64-bit architectures, which are not
> > starved for kernel memory.
>
> Also, if you are seeing I/O stalls with mfi(4), then you might need a
> firmware update for your mfi(4) controller. cc'ing smh@ who knows more about
> that particular issue (IIRC).

I tried upgrading the firmware to the latest available image (I believe it
was from March), but that didn't help. I wouldn't call my problem a stall in
the sense of commands timing out (which I've seen before); it's just that we
manage to generate a large enough backlog that the driver/controller take at
least several seconds to clear it, during which all I/O is stalled in the
kernel.
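
To make the pbuf arithmetic from the quoted discussion concrete, here is a rough sketch in Python. It is an illustrative model, not kernel code. The 256 ceiling, the (nswbuf / 2) clustering limit and the suggested 1024 ceiling come from the thread. The nbuf/4 ratio, the lower bound of 16, the function names and the override behaviour of the tunable are all assumptions made for the example.

```python
# Illustrative model of the nswbuf sizing discussed in the thread.
# Not kernel code: names and the nbuf/4 rule are assumptions.

NSWBUF_MIN = 16    # assumed lower bound (not stated in the thread)
NSWBUF_MAX = 256   # upper bound mentioned in the thread


def default_nswbuf(nbuf, cap=NSWBUF_MAX):
    """Assumed sizing rule: nbuf / 4, clamped to [NSWBUF_MIN, cap]."""
    return max(min(nbuf // 4, cap), NSWBUF_MIN)


def effective_nswbuf(nbuf, tunable=None, cap=NSWBUF_MAX):
    """Hypothetical tunable: a positive value overrides the computed
    default regardless of the buffer cache size (nbuf)."""
    if tunable is not None and tunable > 0:
        return tunable
    return default_nswbuf(nbuf, cap)


def cluster_pbuf_limit(nswbuf):
    """Write clustering may use at most nswbuf / 2 pbufs (per the thread)."""
    return nswbuf // 2


if __name__ == "__main__":
    nbuf = 100000  # made-up buffer count for illustration
    for label, n in [
        ("current cap", effective_nswbuf(nbuf)),
        ("cap raised to 1024", effective_nswbuf(nbuf, cap=1024)),
        ("tunable set to 2048", effective_nswbuf(nbuf, tunable=2048)),
    ]:
        print(f"{label}: nswbuf={n}, clustering pbufs={cluster_pbuf_limit(n)}")
```

Under these assumptions, a large buffer cache leaves write clustering with 128 pbufs at the current cap, and raising the cap to 1024 would allow 512.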