From: Stephen Sanders
Date: Mon, 13 Feb 2012 09:48:08 -0500
To: freebsd-hackers@freebsd.org
Subject: Odd RAID Performance Issue

We have an application that logs data to one very large RAID6 array and updates/accesses a database on another, smaller RAID5 array. Both arrays are connected to the same PCIe 3ware RAID controller. The system has two six-core 3 GHz processors and 24 GB of RAM, and is running FreeBSD 8.1.

The average read/write rate to the database is 2 MB/s, while the average write rate to the data logging array is 300 MB/s. Writes to the logging array are somewhat bursty.

The problem we're encountering is that the disk subsystem appears to 'pause' periodically. It looks as if this is a result of disk read/write operations on the database array taking a very long time to complete (up to 8 sec).
When a disk read takes this long, the system appears to start running out of memory due to bio block buffering. Most processes end up in either getblk() or waithighrunning().

We've instrumented g_vfs_strategy() and bufdone_finish() using dtrace. The results indicate that a number of reads and writes are taking 4-8 seconds. So far, the disk driver and hardware look OK, as read/write operations there appear to complete in the millisecond range. We believe our instrumentation points to something between the VFS layer and CAM as the culprit.

We've gotten the same result on FreeBSD 8.2 but have not yet tried FreeBSD 9. The problem is not limited to a single system; it occurs on a couple of systems.

Does this sound familiar to anyone out there?

Thanks
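
P.S. For anyone wanting to reproduce the measurement, here is a minimal sketch of how paired strategy-entry and bufdone_finish() timestamps could be reduced to per-request latencies offline. It is not our actual dtrace script: the event tuple layout, the probe-name labels, and the helper names pair_latencies() and slow_requests() are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event record: (timestamp in ns, probe label, buffer address).
# The layout is an assumption for this sketch, not real dtrace output.
Event = Tuple[int, str, int]


def pair_latencies(events: Iterable[Event]) -> List[Tuple[int, int]]:
    """Return (buffer address, latency in ns) for each completed request."""
    started: Dict[int, int] = {}
    latencies: List[Tuple[int, int]] = []
    for ts, probe, buf in events:
        if probe == "g_vfs_strategy":
            # Remember when this buffer was handed down to GEOM.
            started[buf] = ts
        elif probe == "bufdone_finish" and buf in started:
            # Completion seen: latency is completion time minus start time.
            latencies.append((buf, ts - started.pop(buf)))
    return latencies


def slow_requests(latencies: List[Tuple[int, int]],
                  threshold_ns: int = 1_000_000_000) -> List[Tuple[int, int]]:
    """Keep only requests whose latency is at least threshold_ns."""
    return [(buf, ns) for buf, ns in latencies if ns >= threshold_ns]
```

Keying on the buffer address lets overlapping requests be matched independently; completions with no recorded start are simply ignored.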