From: Bruce Evans
Date: Fri, 10 Jun 2005 23:23:59 +1000 (EST)
To: Poul-Henning Kamp
Cc: arch@freebsd.org, Pawel Jakub Dawidek
Subject: Re: simplify disksort, please review.
Message-ID: <20050610231928.J25650@delplex.bde.org>
In-Reply-To: <9131.1118346135@critter.freebsd.dk>
References: <9131.1118346135@critter.freebsd.dk>

On Thu, 9 Jun 2005, Poul-Henning Kamp wrote:

> In message <20050609193008.GB837@darkness.comp.waw.pl>, Pawel Jakub Dawidek writes:
>
>> The one example of how the order can be broken (write(offset, size)):
>>
>>     write(1024, 512)
>>     write(0, 2048)
>
> If you issue these two requests just like that, you get no guarantee
> which order they get written in.
>
> It's not just disksort which might surprise you, tagged queuing and
> write caches may mess up your day as well.

Internal (buffer) caches too.  For 2 separate writes there must normally
be 2 separate buffers, and if the buffer data overlaps then the buffers
may be incoherent, especially if they are malloced (which rarely happens
now, so overlapping buffers are more likely to just clobber each other
when their data is written to in memory than to become incoherent).
File systems should use a fixed block size, with all buffers beginning
on a block boundary, so that they never generate overlapping buffers.

Bruce
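
A minimal sketch of the fixed-block-size point above, in C.  The
512-byte block size and the helper names (block_cover, block_of,
block_count, ranges_overlap) are assumptions made up for illustration;
none of them come from the FreeBSD tree.  It only shows the arithmetic:
the two example requests overlap as raw byte ranges, but once every
buffer is exactly one block starting on a block boundary, both requests
resolve to the same buffer for the shared block instead of to two
overlapping buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLKSIZE 512u	/* hypothetical fixed block size, in bytes */

/* Half-open byte range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Smallest block-aligned range that covers the request (offset, size). */
static inline struct range
block_cover(uint64_t offset, uint64_t size)
{
	struct range r;

	r.start = offset - offset % BLKSIZE;			/* round down */
	r.end = (offset + size + BLKSIZE - 1) / BLKSIZE * BLKSIZE; /* round up */
	return (r);
}

/* True if two half-open byte ranges share at least one byte. */
static inline bool
ranges_overlap(struct range a, struct range b)
{
	return (a.start < b.end && b.start < a.end);
}

/* Number of one-block buffers needed to cover an aligned range. */
static inline uint64_t
block_count(struct range r)
{
	return ((r.end - r.start) / BLKSIZE);
}

/*
 * Block number that holds a byte offset.  If buffers are keyed by
 * block number, two buffers are either the same buffer or disjoint.
 */
static inline uint64_t
block_of(uint64_t offset)
{
	return (offset / BLKSIZE);
}
```

With these assumptions, write(1024, 512) touches only block 2, and
write(0, 2048) touches blocks 0 through 3, so the shared data lives in
the single buffer for block 2.  Ordering between the two requests is
still not guaranteed; the sketch only covers the overlap problem.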