From owner-freebsd-stable@FreeBSD.ORG Wed Jul 17 22:58:47 2013
From: "Steven Hartland"
To: "John Baldwin"
Cc: Konstantin Belousov, Mark Johnston
References: <20130717180720.GA8289@charmander> <20130717191852.GS5991@kib.kiev.ua> <201307171615.35484.jhb@freebsd.org>
Subject: Re: syncer causing latency spikes
Date: Wed, 17 Jul 2013 23:59:10 +0100
List-Id: Production branch of FreeBSD source code

----- Original Message -----
From: "John Baldwin"

> On Wednesday, July 17, 2013 3:18:52 pm Konstantin Belousov wrote:
>> On Wed, Jul 17, 2013 at 02:07:55PM -0400, Mark Johnston wrote:
>> > During such an fsync, DTrace shows me that syncer sleeps of 50-200ms are
>> > happening up to 8 or 10 times a second. When this happens, a bunch of
>> > postgres threads become blocked in vn_write() waiting for the vnode lock
>> > to become free. It looks like the write-clustering code is limited to
>> > using (nswbuf / 2) pbufs, and FreeBSD prevents one from setting nswbuf
>> > to anything greater than 256.
>> Syncer is probably just a victim of profiling. Would postgres called
>> fsync(2), you then blame the fsync code for the pauses.
>>
>> Just add a tunable to allow the user to manually-tune the nswbuf,
>> regardless of the buffer cache sizing. And yes, nswbuf default max
>> probably should be bumped to something like 1024, at least on 64bit
>> architectures which do not starve for kernel memory.
>
> Also, if you are seeing I/O stalls with mfi(4), then you might need a
> firmware update for your mfi(4) controller. cc'ing smh@ who knows more about
> that particular issue (IIRC).

Indeed, if you're seeing any IO timeouts in /var/log/messages or on the
console, there is a known issue in older mfi(4) firmware which can cause
extended IO stalls.

We believe that fixed firmware packages include "APP" version > *.130.*

For Dell-branded HW that corresponds to FW package version 21.2.1-0000

    Regards
    Steve