From owner-freebsd-hackers@FreeBSD.ORG Tue Mar 27 13:40:11 2012
Message-ID: <4F71C333.9010506@softhammer.net>
Date: Tue, 27 Mar 2012 09:40:03 -0400
From: Steve Sanders <ssanders@softhammer.net>
To: freebsd-hackers@freebsd.org
References: <4F3922A8.2090808@softhammer.net>
Subject: Re: Odd RAID Performance Issue
List-Id: Technical Discussions relating to FreeBSD

Thanks for all of the suggestions. We do tune the logging UFS partition to use 64K blocks.

We found a solution that makes this problem go away. We've modified CAM so that if a controller has two or more disks attached, it divides the number of I/O slots on the card between the disks. A twa card has 252 slots available, and CAM splits these between the two 'disks' attached to the controller, each disk getting 126 slots (a rough sketch of the idea is below). The queue depths reported by iostat get ridiculously long (~1000), but we no longer consume memory from the runningspace buffers. Since we never cross the vfs.hirunningspace mark, the system does not pause.

I'm now wondering what causes runningspace usage. Could someone point me to the code where we end up allocating from the running space buffers?

An interesting side effect of this change is that it has made a mess of iostat's reports. 'iostat -x 1' now shows the database disk as 150-200% busy and very often shows service times of 15 seconds. Given that the data looks good and the system isn't pausing, a 15-second operation time seems unlikely. I believe this to be an effect of a large number of NCQ operations terminating within the 1-second sampling window, so the summed operation durations add up to much more than the 1-second window itself.

Not realistic, but illustrative: imagine five 1-second NCQ operations all terminating in the same 1-second window. The current code will calculate the total duration as 5 seconds, and dividing by 1 second yields 500%.
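To spell that arithmetic out, here is a toy illustration in C; the variable names are mine and not taken from the devstat code:

    #include <stdio.h>

    /*
     * Toy illustration of how summing the durations of overlapping NCQ
     * operations over a fixed sampling window can exceed 100% busy.
     * All names here are illustrative, not from the devstat code.
     */
    int
    main(void)
    {
            double window = 1.0;            /* sampling interval, seconds */
            int ncq_ops = 5;                /* overlapping ops completing in the window */
            double op_duration = 1.0;       /* each op spent 1 second "in flight" */

            /* Overlapping time gets counted once per operation. */
            double total = ncq_ops * op_duration;   /* 5.0 seconds */
            printf("%%busy = %.0f%%\n", total / window * 100.0);    /* prints 500% */
            return (0);
    }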
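For completeness, the cam change mentioned above boils down to something like the following. This is a simplified sketch with made-up names (controller_slots, disks_on_controller); the actual patch is against the CAM queueing code:

    /*
     * Sketch: split a controller's command slots evenly among the
     * disks attached to it, instead of letting one disk claim them all.
     */
    static int
    slots_for_disk(int controller_slots, int disks_on_controller)
    {
            if (disks_on_controller < 2)
                    return (controller_slots);      /* sole disk keeps every slot */
            /* e.g. twa: 252 slots / 2 disks = 126 slots per disk */
            return (controller_slots / disks_on_controller);
    }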
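On my own runningspace question, the closest I've gotten is the accounting in sys/kern/vfs_bio.c. The following is paraphrased from my reading of the 8.x code, not a verbatim excerpt, so take the details with a grain of salt:

    /*
     * Paraphrase of the runningbufspace accounting in sys/kern/vfs_bio.c.
     * Every in-flight write is charged against the global counter, and
     * async writers stall in "wdrain" once it crosses vfs.hirunningspace.
     */
    static void
    bufwrite_sketch(struct buf *bp)
    {
            /* Charge this write against the global in-flight total. */
            bp->b_runningbufspace = bp->b_bufsize;
            atomic_add_long(&runningbufspace, bp->b_runningbufspace);

            /* ... hand the buffer to the driver ... */

            /*
             * Throttle: sleep ("wdrain") until completions call
             * runningbufwakeup() and drain the total back down.
             */
            if (runningbufspace > hirunningspace)
                    waitrunningbufspace();
    }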
Thanks

On 02/13/2012 11:51 AM, Ivan Voras wrote:
> On 13/02/2012 15:48, Stephen Sanders wrote:
>> We've an application that logs data on one very large raid6 array
>> and updates/accesses a database on another smaller raid5 array.
> You would be better off with RAID10 for a database (or anything which
> does random IO).
>
>> Both arrays are connected to the same PCIe 3ware RAID controller. The
>> system has 2 six core 3Ghz processors and 24 GB of RAM. The system is
>> running FreeBSD 8.1.
> Did you do any additional OS tuning? Do you use UFS or ZFS?
>
>> The problem we're encountering is that the disk subsystem appears to
>> 'pause' periodically. It looks as if this is a result of disk read/write
>> operations from the database array taking a very long time to complete
>> (up to 8 sec).
> You should be able to monitor this with "iostat -x 1" (or whatever
> number of seconds instead of "1") - the last three columns should tell
> you if the device(s) are extraordinarily busy, and the r/s and w/s
> columns should tell you what the real IOPS rate is. You should probably
> post a sample output from this command when the problem appears.