Date: Fri, 8 Jan 2010 05:43:19 +1100 (EST)
From: Bruce Evans
To: Alexander Motin
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Ivan Voras, Bruce Evans
Subject: Re: svn commit: r201658 - head/sbin/geom/class/stripe
Message-ID: <20100108052300.P4481@besplex.bde.org>
In-Reply-To: <4B4612E2.70506@FreeBSD.org>

On Thu, 7 Jan 2010, Alexander Motin wrote:

> Bruce Evans wrote:
>> On Thu, 7 Jan 2010, Alexander Motin wrote:
>>> One thing IMHO would be nice to see there is the alignment of the
>>> read-ahead requests to the array stripe size/offset. Dirty hack I've
>>> tried there, reduced number of requests to the array components by 30%.
>>
>> ffs thinks that bsize alignment is adequate.
>
> Alignment definitely should be adequate, but it is a different task. I
> say that bsize and stripe size are not needed to be equal, as they serve
> different purposes. Even with perfectly aligned usual 16K bsize and 64K
> stripe size we will have 75% chance of misaligned I/O.

That's what I meant by bsize alignment (only) being thought to be adequate
(by ffs, not us).

>> It doesn't try to align
>> files any more than that. Then for sequential reads from the beginning
>> of the file, vfs read clustering tries to read MAXPHYS bytes at a time,
>> so it perfectly preserves any initial misalignment.

Even with alignment by vfs, with misalignment by ffs we would get things
like an initial 64K read being split up into 48K (to reach a stripe
alignment boundary), then 16K (since that is all that is left to read).
For larger files starting with this misalignment, we would get i/o sizes
like 48K+128K+...+128K+trailingK, except with ffs there is also the
pessimal allocation of indirect blocks, which will put a bubble in the
i/o at 12*bsize (default 192K) to seek far away to the indirect block.

> I think we would benefit, if vfs could shrink first request of long
> read-ahead session a bit, to get all further MAXPHYS-sized reads to not
> cross more stripe boundaries than really required. In worst case it can
> generate one extra request to array, in best case it can twice reduce
> number of requests to the array components.

Yes, that is the only relatively easy thing to change.

Bruce
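
The request sizes discussed above can be checked with a small model. What follows is only a sketch, not kernel code: the function names are made up for illustration, MAXPHYS is taken as 128K and the stripe size as 64K to match the sizes in this message, and it assumes one component request per stripe that a read touches (it ignores any merging a stripe implementation might do, and it ignores the indirect-block bubble).

```python
KIB = 1024


def cluster_reads(start, length, maxphys=128 * KIB, stripe=64 * KIB,
                  align_first=False):
    """Model sequential clustered reads as a list of (offset, size).

    With align_first=True, only the first request is shortened so that
    it ends on a stripe boundary (the change proposed in the thread).
    """
    reads = []
    off, end = start, start + length
    while off < end:
        size = min(maxphys, end - off)
        if align_first and not reads and off % stripe:
            # Shrink the first request to reach the next stripe boundary.
            size = min(size, stripe - off % stripe)
        reads.append((off, size))
        off += size
    return reads


def component_requests(reads, stripe=64 * KIB):
    """Count stripes touched by each read (assumed: one request each)."""
    total = 0
    for off, size in reads:
        total += (off + size - 1) // stripe - off // stripe + 1
    return total


def misaligned_fraction(bsize=16 * KIB, stripe=64 * KIB):
    """Fraction of bsize-aligned start offsets that miss a stripe boundary."""
    starts = range(0, stripe, bsize)
    return sum(1 for o in starts if o % stripe) / len(starts)
```

For a file that starts 16K into a stripe, cluster_reads(16 * KIB, 64 * KIB, align_first=True) gives the 48K+16K split described above, and misaligned_fraction() gives 0.75 for 16K bsize and 64K stripe. Under the same assumptions, a 1M sequential read from that offset costs 24 component requests unaligned and 17 with the first request shortened; these numbers come from the model only, not from any measurement.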