From: Julian Elischer
Date: Thu, 20 Dec 2007 15:41:48 -0800
To: Peter Schuller
Cc: freebsd-fs@freebsd.org, ticso@cicely.de, Ivan Voras
Subject: Re: readv: parallel or sequential?
Message-ID: <476AFDBC.9040301@elischer.org>
In-Reply-To: <200712210036.49040.peter.schuller@infidyne.com>

Peter Schuller wrote:
>> In case the application uses serialized access there is not much to do
>> beside preread or caching writes to make use of multiple spindles.
>
> Agreed.
>
>> But an application has to be careful, because parallel access within
>> a single file almost always means that access is not linear anymore, so
>> many other performance tunings won't work as well as they could, so
>> this could easily outweigh the performance gain from multiple access.
>
> For seek bound applications you don't really care anyway. If you have a
> mixture of stream bound and seek bound I/O going on you will run into various
> issues which are difficult to avoid without very careful application-specific
> tuning I think. But for the simple case of doing concurrent seek-bound I/O I
> would expect it to be handled gracefully by the OS.
>
> And I do mean to the same file, rather than file descriptor (in response to
> the other post on descriptors).
>
>> Nonlinear access from within an application has to be for another reason
>> and not as a performance tuning.
>
> Why? Again, PostgreSQL, other databases, or any file access pattern which is
> seek bound stands to gain more or less linearly from concurrent I/O being
> propagated to constituent devices in a non-serialized fashion. This is a
> pretty basic assumption in my book when designing an application. Whenever
> something is seek bound, assuming I have concurrency in my app, I look at the
> number of constituent devices on the device and the type of RAID or similar
> being used (including stripe sizes in relation to the size of my I/O
> requests, etc).
>
> I fully expect to be able to scale linearly with the number of underlying
> devices, assuming raid0/raid10 or something equivalent, and assuming I have a
> concurrency that is sufficiently high to keep all drives busy.
>
> (There are valid exceptions of course, such as raidz/raidz2. But that's beyond
> the scope of this discussion.)

Multiple reads and writes to the same file *from different file descriptors*
(same process or not) might proceed in "parallel", but readv and writev are
implemented serially with respect to the filesystem.
Now, IF THE FILESYSTEM IS NOT DOING SYNCHRONOUS DISK ACCESSES, the reads and
writes might proceed in parallel or be grouped, clustered or otherwise
rearranged.
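
To make the distinction concrete, here is a minimal sketch of the two ways of issuing the requests, written in Python and assuming a Unix system where os.readv and os.pread are available. The helper names (make_test_file, read_with_readv, read_block, read_concurrently) and the BLOCK/NBLOCKS sizes are made up for illustration. It only shows how the I/O is submitted: one readv call filling its buffers in order, versus several threads that each open their own descriptor and call pread at their own offset. Whether the second form is actually serviced in parallel by the disks depends on the filesystem and the underlying devices, which this sketch does not measure.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096      # illustrative request size, not tuned to any stripe size
NBLOCKS = 8       # illustrative number of requests


def make_test_file():
    """Create a temp file of NBLOCKS blocks; block i is filled with byte value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(NBLOCKS):
            f.write(bytes([i]) * BLOCK)
    return path


def read_with_readv(path):
    """Issue ONE readv call: the buffers are filled in order from a single offset."""
    bufs = [bytearray(BLOCK) for _ in range(NBLOCKS)]
    fd = os.open(path, os.O_RDONLY)
    try:
        nread = os.readv(fd, bufs)
    finally:
        os.close(fd)
    return nread, bufs


def read_block(path, index):
    """Open a private descriptor and read one block at its own offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, BLOCK, index * BLOCK)
    finally:
        os.close(fd)


def read_concurrently(path, workers=4):
    """Issue the same reads as independent requests from several threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in request order, whatever order they complete in.
        return list(pool.map(lambda i: read_block(path, i), range(NBLOCKS)))


if __name__ == "__main__":
    path = make_test_file()
    try:
        nread, bufs = read_with_readv(path)
        blocks = read_concurrently(path)
        same = all(bytes(b) == blk for b, blk in zip(bufs, blocks))
        print("readv bytes:", nread, "contents match:", same)
    finally:
        os.unlink(path)
```

Both paths return the same data; the difference is only in how many independent requests the kernel sees at once, which is what gives the filesystem the chance to spread them across devices.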