From: "Poul-Henning Kamp"
To: Scott Long
cc: arch@freebsd.org
cc: Garrett Wollman
Subject: Re: Alignment of disk-I/O from userland.
Date: Tue, 07 Oct 2003 01:11:11 +0200
Message-ID: <27374.1065481871@critter.freebsd.dk>
In-Reply-To: Your message of "Mon, 06 Oct 2003 16:44:32 MDT." <20031006163218.L55190@pooker.samsco.home>

In message <20031006163218.L55190@pooker.samsco.home>, Scott Long writes:

>We already
>have the busdma interface whose sole purpose is to take system
>buffers and prepare them for transfer to/from hardware [...]

I certainly do agree that _if_ we do want to do copy/align busdma
would be a good place for it.

>As for returning an error code for a buffer that we (arbitrarily) believe
>to be too big to align, [...]

I have never advocated returning an error based on "alignment and
size", only based on alignment alone.

But I also just realized a complication I had not thought of earlier,
and which may modify our thinking further:

This is an issue for all physread()/physwrite() drivers, not just disks.
In other words, if I want to write 1MB blocks to a SCSI tape, and
I don't align my in-memory buffer sufficiently for the hardware,
busdma would have to allocate 1MB of memory (it may _possibly_ be
able to do so as disjoint pages rather than consecutively) and copy
the entire request over.

For disks we can chop the request at sector boundaries or multiples
thereof and deal with it that way, but we don't have that option
for scsi_sa or even scsi_pt devices.

Currently we impose a 128k upper limit on I/O requests, but we have
already more or less agreed that needs to grow into the 4-16MB range
soon.

The more I think about it, the more arguments I find for retaining
the status quo of requiring userland to do proper alignment (but
with better error-checking).

Particularly since the only unaligned case I know of yet, newfs(8),
is by trivial accident rather than need or intent.

The question of how to communicate the alignment required to userland
has been raised. I propose this answer:

Sufficient alignment can be obtained by any one of these methods:

	1. Allocate your buffer with malloc(3).
	2. Align it to the request size.
	3. Align it to a page.

(The first is somewhat dependent on the behaviour of phkmalloc, and
can be removed, but it offers a nice clean shortcut for most
programmers.)

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
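
As an illustration of method 3 from the list in the message (align the
buffer to a page), here is a minimal userland sketch. It is not taken
from the thread: the helper names alloc_io_buffer() and is_aligned() are
hypothetical, and it assumes a system that provides posix_memalign(3)
and sysconf(_SC_PAGESIZE), which may not match the FreeBSD release
under discussion.

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: return a buffer of `len` bytes whose start
 * address is aligned to the system page size, or NULL on failure.
 * The caller releases it with free(3).
 */
static void *
alloc_io_buffer(size_t len)
{
	void *buf = NULL;
	long pagesize = sysconf(_SC_PAGESIZE);

	if (pagesize <= 0)
		return (NULL);
	/* posix_memalign() returns 0 on success and sets buf. */
	if (posix_memalign(&buf, (size_t)pagesize, len) != 0)
		return (NULL);
	return (buf);
}

/*
 * Hypothetical helper: non-zero if `p` is aligned to `align`.
 * Assumption: `align` is a power of two.
 */
static int
is_aligned(const void *p, size_t align)
{
	return (((uintptr_t)p & (align - 1)) == 0);
}
```

A program doing raw device I/O could obtain its buffer from
alloc_io_buffer() before calling read(2) or write(2), and is_aligned()
is the sort of check that better error reporting would hinge on. What
alignment a given device actually needs is not specified in the message.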