From owner-freebsd-current@FreeBSD.ORG Tue Feb 18 20:19:06 2014
From: Adrian Chadd
Sender: adrian.chadd@gmail.com
To: John Baldwin
Cc: freebsd-current, David Chisnall, Gleb Smirnoff, "current@freebsd.org"
Date: Tue, 18 Feb 2014 12:19:04 -0800
Subject: Re: [CFT] new sendfile(2)
In-Reply-To: <201402181328.26553.jhb@freebsd.org>
References: <20140217111635.GL26785@glebius.int.ru> <201402181328.26553.jhb@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current

On 18 February 2014 10:28, John Baldwin wrote:
> On Monday, February 17, 2014 6:24:21 am David Chisnall wrote:
>> P.S. If aio() is creating a new thread per request, rather than scheduling
>> them from a pool, then that is also likely a bug. The aio APIs were designed
>> so that systems with DMA controllers could issue DMA requests in the syscall
>> and return immediately, then trigger the notification in response to the
>> DMA-finished interrupt. There shouldn't need to be any kernel threads
>> created to do this...
>
> AIO uses a pool, but the requests are all done synchronously from that
> pool. While our low-level disk routines are async (e.g. GEOM etc.),
> the filesystem code above that generally is not. The aio code does have
> some special gunk in place for sockets (and I believe raw disk I/O) to
> make it truly async, but aio for files uses synchronous I/O from a pool
> of worker threads.

Just to expand on John's response - which is absolutely correct:

* The IO strategy routines these days do indeed do things via callbacks,
  so no AIO worker threads are required.
* However, any blocking that goes on in the completion path ends up
  making the disk IO rate drop dramatically - so there's still a single
  AIO completion thread involved in posting the kqueue notifications
  (i.e., doing kqueue notifications from the strategy completion callback
  doesn't work well because of lock contention on kqueue, etc.).
* The disk code is all blocking - especially when trying to do metadata
  reads for things like directory traversal.

I don't know if Scott is working on the async directory stuff or not,
but nailing a fully async path for filesystem strategy() calls on an
arbitrary file would really aid high throughput AIO based systems. We'd
be able to do zero copy disk IO for both read and write.

-a
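
For readers who want something concrete, here is a rough user-space sketch of the two ideas discussed in this message: file requests serviced by blocking reads from a pool of worker threads, with a single separate thread posting the completion notifications. It is only an illustration, not FreeBSD kernel code. The names run_aio_model, worker and completion_thread are made up for this example, and a plain Python list stands in for the kqueue events a real caller would receive.

```python
import os
import queue
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


def run_aio_model(path, requests, workers=4):
    """Service (offset, length) read requests with a worker pool.

    Returns the posted notifications as (offset, data) tuples, in
    completion order (which may differ from submission order).
    """
    completions = queue.Queue()  # hand-off from workers to the completion thread
    notifications = []           # hypothetical stand-in for delivered kqueue events
    done = object()              # sentinel that tells the completion thread to exit

    def worker(fd, offset, length):
        # Blocking read: the worker thread sleeps until the data is available.
        data = os.pread(fd, length, offset)
        completions.put((offset, data))

    def completion_thread():
        # Only this thread posts notifications, so the workers never
        # contend on whatever protects the notification path.
        while True:
            item = completions.get()
            if item is done:
                break
            notifications.append(item)

    poster = threading.Thread(target=completion_thread)
    poster.start()
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(worker, fd, off, length)
                       for off, length in requests]
            for future in futures:
                future.result()  # re-raise any error from a worker
    finally:
        os.close(fd)
        completions.put(done)
        poster.join()
    return notifications


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789abcdef")
    try:
        for offset, data in sorted(run_aio_model(tmp.name, [(0, 4), (4, 4), (8, 8)])):
            print(offset, data)
    finally:
        os.unlink(tmp.name)
```

Every request in this model ties up a worker thread for the duration of its read. That is the cost a fully async strategy() path would remove.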