From: Jeff Roberson
Date: Fri, 10 Jun 2005 04:05:52 -0400 (EDT)
To: Poul-Henning Kamp
Cc: arch@FreeBSD.org, Pawel Jakub Dawidek
Subject: Re: simplify disksort, please review.
Message-ID: <20050610040302.S16943@mail.chesapeake.net>
In-Reply-To: <9131.1118346135@critter.freebsd.dk>
References: <9131.1118346135@critter.freebsd.dk>

On Thu, 9 Jun 2005, Poul-Henning Kamp wrote:

> In message <20050609193008.GB837@darkness.comp.waw.pl>, Pawel Jakub Dawidek writes:
>
> >The one example of how the order can be broken (write(offset, size)):
> >
> >	write(1024, 512)
> >	write(0, 2048)
>
> If you issue these two requests just like that, you get no guarantee
> which order they get written in.
>
> It's not just disksort which might surprise you, tagged queuing and
> write caches may mess up your day as well.

Through the filesystem, the buf will be locked until the first write
completes.  Truthfully, it's likely to be DELWRI, which means the second
write will be a memcpy into the buffer.  If you fsync'd between the two,
the first could still be in the drive cache when the second comes down,
but the drive cache will certainly remain coherent.

Anyway, it's really not important to solve this at the disk queue layer,
because it's only an issue with raw devices, and it's very unusual to
have multiple processes opening the same raw device to issue a lot of
I/O of different sizes.

> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.