From: Stephen Sanders <ssanders@softhammer.net>
Date: Tue, 27 Mar 2012 16:48:56 -0400
To: freebsd-hackers@freebsd.org
Subject: Re: Odd RAID Performance Issue

A bit of a lapse on my part on the running space usage question. One of the
test systems has four g_up/g_down threads running, hence the better
runningbufspace usage. biodone() gets called a lot more often, so the buffer
usage is not backing up.

It also appears that devstat_start_transaction() / devstat_end_transaction()
are getting called from g_up/g_down. It seems like that could subject some of
the counter updates to thread scheduling effects, for example g_up() running a
lot more often than g_down(), so that start_count ends up less than end_count.
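To make that concern concrete, here is a minimal user-space sketch. It is not
the devstat code; the thread and variable names are made up, and the only
assumption is that one thread bumps a start counter, another bumps an end
counter, and a reader samples both without any synchronization.

        /*
         * Illustration only, not devstat: a "dispatch" thread bumps
         * start_count, a "completion" thread bumps end_count, and an
         * unsynchronized reader takes a two-step snapshot.  Because the
         * two reads are not atomic as a pair, the snapshot can show
         * end_count ahead of start_count, i.e. a negative number of
         * outstanding operations.
         */
        #include <pthread.h>
        #include <stdatomic.h>
        #include <stdio.h>
        #include <unistd.h>

        static atomic_ulong start_count;        /* ops handed to the driver */
        static atomic_ulong end_count;          /* ops completed */
        static atomic_int stop;

        static void *
        dispatch(void *arg)                     /* stand-in for the g_down side */
        {
                (void)arg;
                while (!atomic_load(&stop))
                        atomic_fetch_add(&start_count, 1);
                return (NULL);
        }

        static void *
        complete(void *arg)                     /* stand-in for the g_up side */
        {
                (void)arg;
                while (!atomic_load(&stop))
                        if (atomic_load(&end_count) < atomic_load(&start_count))
                                atomic_fetch_add(&end_count, 1);
                return (NULL);
        }

        int
        main(void)
        {
                pthread_t up, down;
                int i;

                pthread_create(&down, NULL, dispatch, NULL);
                pthread_create(&up, NULL, complete, NULL);
                for (i = 0; i < 5; i++) {
                        unsigned long s = atomic_load(&start_count);
                        usleep(1000);           /* reader gets descheduled here */
                        unsigned long e = atomic_load(&end_count);
                        printf("start %lu end %lu outstanding %ld\n",
                            s, e, (long)s - (long)e);
                        sleep(1);
                }
                atomic_store(&stop, 1);
                pthread_join(up, NULL);
                pthread_join(down, NULL);
                return (0);
        }

Built with "cc -pthread", the snapshot routinely reports a negative number of
outstanding operations, which is the same kind of skew a stats reader would
see if the two counters are updated from independently scheduled threads.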
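On the 150-200% busy and multi-second service times mentioned in the quoted
message below, here is the back-of-the-envelope arithmetic I have in mind. The
numbers are invented; the only assumption is that per-operation durations get
summed and then divided by the sampling interval.

        /*
         * Invented numbers, arithmetic illustration only: if five queued
         * (NCQ) operations that were each outstanding for about 1 second
         * all complete inside the same 1 second sampling window, summing
         * their durations and dividing by the window reports 500% busy,
         * and the per-operation times (which include queue wait) look
         * like multi-second "service" times.
         */
        #include <stdio.h>

        int
        main(void)
        {
                const double window = 1.0;      /* sampling interval, seconds */
                const int completed = 5;        /* ops completing in the window */
                const double outstanding = 1.0; /* time each op was in flight */

                double total_busy = completed * outstanding;    /* 5.0 seconds */
                double busy_pct = total_busy / window * 100.0;  /* 500% */

                printf("%.0f%% busy over a %.0f second window\n",
                    busy_pct, window);
                return (0);
        }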
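For reference, the slot-splitting change described in the quoted message below
amounts to nothing more than this division. The function is only a sketch of
the arithmetic; split_openings and the standalone harness are mine, not the
actual CAM patch.

        /*
         * Sketch of the slot-splitting arithmetic only, not the actual CAM
         * change: divide a controller's command slots evenly across the
         * disks attached to it, giving any remainder to the first disks.
         */
        #include <stdio.h>

        static void
        split_openings(int controller_slots, int ndisks, int *per_disk)
        {
                int base = controller_slots / ndisks;
                int extra = controller_slots % ndisks;
                int i;

                for (i = 0; i < ndisks; i++)
                        per_disk[i] = base + (i < extra ? 1 : 0);
        }

        int
        main(void)
        {
                int openings[2];

                split_openings(252, 2, openings);       /* twa example: 126 each */
                printf("disk0 %d slots, disk1 %d slots\n",
                    openings[0], openings[1]);
                return (0);
        }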
On 3/27/2012 9:40 AM, Steve Sanders wrote:
> Thanks for all of the suggestions. We do tune the logging UFS partition to
> have 64K blocks.
>
> We found a solution that makes this problem go away.
>
> We've modified the CAM layer so that if a controller has 2 or more disks
> attached, it divides the number of I/O slots on the card between the disks.
> So a twa card has 252 slots available, and CAM splits this between the two
> 'disks' attached to the controller, each disk getting 126 slots.
>
> The queue depths reported by iostat get ridiculously long (~1000), but we do
> not end up using memory from the runningspace buffers. Since we don't go
> over the vfs.hirunningspace mark, the system does not pause.
>
> I'm now wondering what causes runningspace usage. Could someone point me to
> the code where blocks end up being counted against runningbufspace (and the
> vfs.hirunningspace limit)?
>
> An interesting side effect of this has been to make a mess of iostat's
> reports. 'iostat -x 1' now shows the database disk as 150-200% used and very
> often shows service times of 15 seconds. Given that the data looks good and
> the system isn't pausing, a 15 second operation time seems unlikely.
>
> I believe this to be an effect of a large number of NCQ operations
> terminating in the 1 second elapsed-time window, so the operation durations
> add up to much more than the 1 second window.
>
> Not realistic but illustrative: imagine five 1-second NCQ operations
> terminating in the 1 second window. The current code will calculate the
> duration as 5 seconds, and dividing by 1 yields 500%.
>
> Thanks
>
> On 02/13/2012 11:51 AM, Ivan Voras wrote:
>> On 13/02/2012 15:48, Stephen Sanders wrote:
>>> We've an application that logs data on one very large RAID6 array and
>>> updates/accesses a database on another, smaller RAID5 array.
>> You would be better off with RAID10 for a database (or anything that does
>> random I/O).
>>
>>> Both arrays are connected to the same PCIe 3ware RAID controller. The
>>> system has two six-core 3 GHz processors and 24 GB of RAM, and is running
>>> FreeBSD 8.1.
>> Did you do any additional OS tuning? Do you use UFS or ZFS?
>>
>>> The problem we're encountering is that the disk subsystem appears to
>>> 'pause' periodically. It looks as if this is a result of disk read/write
>>> operations on the database array taking a very long time to complete (up
>>> to 8 seconds).
>> You should be able to monitor this with "iostat -x 1" (or whatever number
>> of seconds instead of "1"). The last three columns should tell you whether
>> the device(s) are extraordinarily busy, and the r/s and w/s columns should
>> tell you the real IOPS rate. You should probably post a sample of this
>> command's output when the problem appears.
>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"