Date:      Tue, 27 Mar 2012 09:40:03 -0400
From:      Steve Sanders <ssanders@softhammer.net>
To:        freebsd-hackers@freebsd.org
Subject:   Re: Odd RAID Performance Issue
Message-ID:  <4F71C333.9010506@softhammer.net>
In-Reply-To: <jhbf2o$7f6$1@dough.gmane.org>
References:  <4F3922A8.2090808@softhammer.net> <jhbf2o$7f6$1@dough.gmane.org>

Thanks for all of the suggestions.  We do tune the logging UFS
partition to use 64K blocks.
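
For anyone curious, the 64K blocks come from the newfs block/frag
sizes; something along these lines does it (the device name below is
just a placeholder for our logging partition):

    newfs -U -b 65536 -f 8192 /dev/da1p1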

We found a solution that makes this problem go away. 

We've modified CAM so that if a controller has two or more disks
attached, it divides the number of I/O slots on the card between the
disks.  The twa card has 252 slots available, and CAM splits these
between the two 'disks' attached to the controller, giving each disk
126 slots.
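
In spirit the change is just this kind of even split (an illustrative
sketch only -- the function and its arguments are made-up names, not
the actual CAM structures we patched):

/*
 * Illustrative sketch, not the real patch: divide a controller's
 * command slots evenly among the disks attached to it.
 */
int
slots_per_disk(int controller_slots, int ndisks)
{
	if (ndisks <= 1)
		return (controller_slots);	/* a lone disk keeps every slot */
	return (controller_slots / ndisks);	/* e.g. 252 / 2 = 126 per disk */
}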

The queue depths reported by iostat get ridiculously long (~1000), but
we do not end up using memory from the runningspace buffers.  Since we
don't go over the vfs.hirunningspace mark, the system does not pause.

I'm now wondering what causes runningspace usage.  Could someone point
me to the code where we end up allocating blocks from high running space?
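
My current guess is that it lives somewhere around
bufwrite()/waitrunningbufspace() in sys/kern/vfs_bio.c, but I haven't
traced it.  The mental model I have is something like the userland
sketch below (paraphrased, not the actual source -- corrections
welcome):

#include <stdio.h>

/*
 * Paraphrased sketch of the runningspace throttle, not vfs_bio.c
 * itself.  Each write that is issued claims its byte count against a
 * global counter; a writer that pushes the counter past the
 * hirunningspace mark would sleep until completions bring it back
 * down.  That sleep is the "pause" we were seeing.
 */
static long runningspace;			/* bytes of writes in flight */
static long hirunningspace = 1024 * 1024;	/* stand-in for vfs.hirunningspace */

static void
start_write(long nbytes)
{
	runningspace += nbytes;			/* claimed when the write is issued */
	if (runningspace > hirunningspace)
		printf("writer would sleep here (%ld > %ld)\n",
		    runningspace, hirunningspace);
}

static void
write_done(long nbytes)
{
	runningspace -= nbytes;			/* released when the I/O completes */
}

int
main(void)
{
	start_write(512 * 1024);		/* under the mark, no pause */
	start_write(768 * 1024);		/* crosses the mark -> pause */
	write_done(768 * 1024);
	write_done(512 * 1024);
	return (0);
}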

An interesting side effect of this has been to make a mess of iostat's
reports.  'iostat -x 1' now shows the database disk as 150-200% busy
and very often shows service times of 15 seconds.  Given that the data
looks good and the system isn't pausing, a 15-second operation time
seems unlikely.

I believe this is an effect of a large number of NCQ operations
completing within the 1-second elapsed-time window, so the summed
operation durations add up to much more than the 1-second window
itself.

Not realistic, but illustrative: imagine five 1-second NCQ operations
completing in the same 1-second window.  The current code will add the
durations up to 5 seconds, and dividing by the 1-second window yields
500%.
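
Spelled out, the arithmetic I have in mind (just an illustration of my
hypothesis, not devstat's actual bookkeeping):

#include <stdio.h>

int
main(void)
{
	double window = 1.0;		/* one-second iostat interval */
	double per_op = 1.0;		/* each NCQ operation took ~1 second */
	int nops = 5;			/* five of them complete in that window */

	double busy_time = nops * per_op;		/* 5 seconds of summed duration */
	double pct_busy = busy_time / window * 100.0;	/* reported as 500% "busy" */

	printf("busy = %.0f%%\n", pct_busy);
	return (0);
}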

Thanks


On 02/13/2012 11:51 AM, Ivan Voras wrote:
> On 13/02/2012 15:48, Stephen Sanders wrote:
>> We've an application that logs data on one very large raid6 array
>> and updates/accesses a database on another smaller raid5 array.
> You would be better off with RAID10 for a database (or anything which
> does random IO).
>
>> Both arrays are connected to the same PCIe 3ware RAID controller.   The
>> system has 2 six core 3Ghz processors and 24 GB of RAM.  The system is
>> running FreeBSD 8.1.
> Did you do any additional OS tuning? Do you use UFS or ZFS?
>
>> The problem we're encountering is that the disk subsystem appears to
>> 'pause' periodically.   It looks as if this is a result of disk read/write
>> operations from the database array taking a very long time to complete
>> (up to 8 sec).
> You should be able to monitor this with "iostat -x 1" (or whatever
> number of seconds instead of "1") - the last three columns should tell
> you if the device(s) are extraordinarily busy, and the r/s and w/s
> columns should tell you what the real IOPS rate is. You should probably
> post a sample output from this command when the problem appears.
>
>
>



