Date: Tue, 24 Mar 2009 23:18:49 +1100 (EST)
From: Bruce Evans
To: Julian Elischer
Cc: freebsd-fs@freebsd.org
Subject: Re: Trying to understand how aio(4), mount(8) async (and gjournal) relate
Message-ID: <20090324224001.D1670@besplex.bde.org>
In-Reply-To: <49C7C45B.7040708@elischer.org>

On Mon, 23 Mar 2009, Julian Elischer wrote:

> Mel Flynn wrote:
>> If one mounts a disk with async, does this internally use aio system calls,
>> or is there a subsystem available which does largely the same?
>> If so, why are there 2 subsystems?
>
> the async mount operation tells the system that writes done by a process may
> be actually written to the filesystem at any time later, and that it is ok to
> return "success" to the user immediately, regardless of whether the write has
> actually been done.
> This also applies to the metadata. It does not affect reads at all.

Actually, this only (*) applies to the metadata.  The async mount option
does not affect writes, except for the metadata part of writes and for a
small effect on the amount of asyncness of writes.  The data part of
writes is always (*) written asynchronously and "success" is returned to
the user immediately, irrespective of whether the write has actually been
done.  (Success may be returned even if the write was attempted but
failed.  Writes are retried endlessly to a fault, so the data for a failed
write is not always lost.  (The usual fault is panicking when the disk
goes away.)  It is necessary to use fsync() or the still-unimplemented
POSIX interface fdatasync() to ensure that the data has been written or
get a report of non-success.)

(*) Not quite only or always, since
(1) The async mount option is incompatible with soft updates and is
    silently ignored if soft updates is configured.  Soft updates
    essentially gives async everything, with stronger ordering of writes
    so that the file system is always consistent (but often out of date).
(2) The sync mount option may be used to force synchronous writes of data.
    This option should give synchronous everything, but when it is mixed
    with the async mount option (and that option is not ignored) it gives
    the weird and undesirable behaviour of sync writes of data and async
    writes of metadata.

>> When using aio, for example with squid, does this mean the underlying
>> provider needs to be mounted async or is this totally unrelated?  Similarly
>> if said disk is on a gjournalled partition, is the async mount redundant or
>> is using aio redundant or neither?
>
> no the aio system calls are implemented in a manner that allows
> the caller to request IO and then return at a later time to find out
> the result. it has no connection to the mount option.

Yes, essentially Unix kernels always had and used async writes, and the
relatively new and quite different aio interface lets applications do
async writes too.

Bruce
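
To illustrate the fsync() remark above, here is a minimal sketch in C of a
write whose result is only trusted after fsync() has returned.  The helper
name write_durably() and its interface are made up for this example (it is
not an existing API), and the sketch assumes a regular file on a local
file system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_durably() is a hypothetical helper, not an existing API.
 * It writes len bytes to path and returns 0 only after fsync() has
 * reported success; otherwise it returns -1 with errno set.
 */
int
write_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd, saved;

	assert(path != NULL && buf != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return (-1);

	/* Success from write() only means the kernel accepted the data. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Ask for the data to reach the disk and check the report. */
	if (fsync(fd) == -1)
		goto fail;

	return (close(fd));

fail:
	/* Preserve the original errno across the cleanup close(). */
	saved = errno;
	(void)close(fd);
	errno = saved;
	return (-1);
}
```

A caller would treat a return of -1 as "the data may not be on disk".  As
far as the description above goes, this check is needed with or without
the async mount option, since that option mainly changes how metadata is
written.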