From: Alexander Motin
Date: Thu, 03 Dec 2009 14:53:46 +0200
Message-ID: <4B17B4DA.7050403@FreeBSD.org>
To: Maxim Sobolev
Cc: FreeBSD-Current, "Derek (freebsd lists)" <482254ac@razorfever.net>, Emil Mikulic
Subject: Re: gmirror 'load' algorithm (Was: Re: siis/atacam/ata/gmirror 8.0-BETA3 disk performance)
References: <4A9E8677.1020208@FreeBSD.org> <20090903002106.GB17538@dmr.ath.cx> <4AA0075A.5010109@FreeBSD.org> <4B16FFA9.6070002@FreeBSD.org>
In-Reply-To: <4B16FFA9.6070002@FreeBSD.org>

Maxim Sobolev wrote:
> Alexander Motin wrote:
>> I have played a bit with this patch on a 4-disk mirror. It works better
>> than the original algorithm, but is still not perfect.
>>
>> 1. I managed to produce a situation with 4 read streams where 3 drives
>> were busy while the fourth one was completely idle. gmirror preferred to
>> constantly seek one of the drives over short distances rather than use
>> the idle drive, because its heads were a few gigabytes away from that
>> point.
>>
>> IMHO request locality priority should be made almost equal for any
>> nonzero distance. As we can see with split mode, even small gaps
>> between requests can significantly reduce drive performance. So I
>> think it is not so important whether data are 100MB or 500GB away from
>> the current head position. The perfect case is when requests are
>> completely sequential. But everything beyond a few megabytes from the
>> current position just won't fit in the drive cache.
>>
>> 2. IMHO it would be much better to use averaged request queue depth as
>> the load measure, instead of last request submit time. Request submit
>> time works fine only for equal requests, equal drives and serialized
>> load, but that is actually the case where complicated load balancing is
>> just not needed. The fact that some drive just got a request does not
>> mean anything if another one got 50 requests one second ago and is
>> still processing them.
>
> Can you try this one:
>
> http://sobomax.sippysoft.com/~sobomax/geom_mirror.diff
>
> It implements different logic - instead of looking at the time, it
> checks the outstanding request queue length and the proximity of
> recently served requests to decide where to schedule requests.

Your patch changes the "round-robin" algorithm rather than "load". I have
reimplemented it for the "load" algorithm and changed the math a bit, as I
wrote before. The patch is here:

http://people.freebsd.org/~mav/gmirror.patch

Here are some benchmarks for a gmirror of 4 drives:

### load original
                linear 1MB read    random
 1 process      MBps: 101          tps: 161
 2 processes    MBps:  78          tps: 265
 4 processes    MBps:  90          tps: 325
 8 processes    MBps: 101          tps: 384
16 processes    MBps: 118          tps: 426
32 processes    MBps: 142          tps: 457

Random performance is not bad, but linear is terrible, as requests jump
between drives and kick each other.

### round-robin
                linear 1MB read    random
 1 process      MBps:  64          tps: 158
 2 processes    MBps: 131          tps: 260
 4 processes    MBps: 235          tps: 342
 5 processes    MBps: 240          tps: 362
 8 processes    MBps: 239          tps: 397
16 processes    MBps: 246          tps: 432
32 processes    MBps: 258          tps: 452

This is completely predictable. Random is fine, linear is not really
linear. Perfect request balancing between drives.

### load mav@
                linear 1MB read    random
 1 process      MBps: 104          tps: 159
 2 processes    MBps: 214          tps: 256
 4 processes    MBps: 425          tps: 332
 5 processes    MBps: 300          tps: 352
 8 processes    MBps: 245          tps: 391
16 processes    MBps: 255          tps: 436
32 processes    MBps: 263          tps: 457

Random is close to round-robin. Request balancing is close to perfect.
Linear shows the maximum possible performance for a number of processes up
to the number of drives, using only as many disks as needed. With more
processes than disks, performance predictably drops, but it still beats
all other methods.
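
The selection idea discussed in this thread (queue depth as the load measure, plus a locality bonus that is nearly flat for any nonzero distance) can be sketched roughly as below. This is only an illustration in Python, not the code from gmirror.patch: the names Disk, cost, pick_disk, submit and complete, all weights, and the 2MB "near" window are assumptions made up for the example, and it uses the instantaneous queue depth where the discussion suggests an averaged one.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration.
SEQUENTIAL_COST = 0           # request starts exactly where the last one ended
NEAR_COST = 1                 # within NEAR_BYTES of the last position
FAR_COST = 2                  # anything further away, 100MB or 500GB alike
QUEUE_COST = 4                # cost of each outstanding request
NEAR_BYTES = 2 * 1024 * 1024  # assumed "may still be in drive cache" window


@dataclass
class Disk:
    name: str
    outstanding: int = 0      # requests submitted but not yet completed
    last_end: int = 0         # byte offset just past the last request sent here


def cost(disk, offset):
    """Lower is better: queue depth dominates, locality breaks ties."""
    distance = abs(offset - disk.last_end)
    if distance == 0:
        locality = SEQUENTIAL_COST
    elif distance <= NEAR_BYTES:
        locality = NEAR_COST
    else:
        locality = FAR_COST
    return disk.outstanding * QUEUE_COST + locality


def pick_disk(disks, offset):
    """Return the cheapest disk for a read at the given offset."""
    return min(disks, key=lambda d: cost(d, offset))


def submit(disks, offset, length):
    """Choose a disk for the request and record it as outstanding."""
    disk = pick_disk(disks, offset)
    disk.outstanding += 1
    disk.last_end = offset + length
    return disk


def complete(disk):
    """Mark one request on this disk as finished."""
    disk.outstanding -= 1


if __name__ == "__main__":
    # Three busy drives and one idle drive whose heads are far away.
    disks = [Disk("d0", 1, 100), Disk("d1", 1, 200),
             Disk("d2", 1, 300), Disk("d3", 0, 500 * 10**9)]
    print(pick_disk(disks, 100).name)
```

Because one outstanding request costs more than the worst locality penalty in this sketch, an idle drive is preferred over a busy one regardless of where its heads are, while a serialized sequential stream stays on a single drive.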

I think it is hardly possible to get much more out of this mirror.

-- 
Alexander Motin