From: "M. Warner Losh"
Date: Mon, 22 Mar 2010 12:45:05 -0600 (MDT)
Message-Id: <20100322.124505.787670930858384500.imp@bsdimp.com>
References: <1269134585.00231959.1269122405@10.7.7.3> <4BA6279E.3010201@FreeBSD.org>
To: scottl@samsco.org
Cc: mav@freebsd.org, freebsd-current@freebsd.org, ivoras@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: Increasing MAXPHYS

In message: Scott Long writes:
: I'd like to go in the opposite direction. The queue-dispatch-queue
: model of GEOM is elegant and easy to extend, but very wasteful for
: the simple case, where the simple case is one or two simple
: partition transforms (mbr, bsdlabel) and/or a simple stripe/mirror
: transform. None of these need a dedicated dispatch context in order
: to operate. What I'd like to explore is compiling the GEOM stack at
: creation time into a linear array of operations that happen without
: a g_down/g_up context switch. As providers and consumers taste each
: other and build a stack, that stack gets compiled into a graph, and
: that graph gets executed directly from the calling context, both
: from the dev_strategy() side on the top and the bio_done() on the
: bottom. GEOM classes that need a detached context can mark
: themselves as such, doing so will prevent a graph from being
: created, and the current dispatch model will be retained.

I have a few things to say on this.

First, I've done similar things at past companies for systems that
resemble geom's queueing environment. It is possible to convert the
queueing nodes in the graph into filtering nodes. Another way to look
at this is to say you're implementing direct dispatch into geom's
stack. This can be both good and bad, but it should reduce latency a
lot.

One problem that I see is that you would be calling into the driver
from a different set of contexts. The queueing was there to protect
the driver from LoRs (lock order reversals) caused by its routines
being called from many different contexts, sometimes with other locks
held (often a fact of life in the kernel).

So this is certainly worth exploring, especially if we have optimized
up/down paths for certain geom classes while still allowing the
current robust, but slow, paths for the more complicated nodes in the
tree. It remains to be seen whether there will be issues around lock
ordering, but we've hit that with both geom and ifnet in the past, so
caution (e.g., running with WITNESS turned on early and often) is
advised.

Warner
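
For readers who want the idea in concrete form, the following is a
minimal userland sketch in C of the "compile the stack into a linear
array" proposal quoted above. It is not GEOM code and uses no GEOM
APIs: struct io_req, struct layer, stack_compile(), stack_run_direct()
and the needs_own_context flag are all hypothetical names made up for
illustration, and locking is ignored entirely. It only shows simple
offset transforms being run back to back in the caller's context, with
compilation refused when any layer asks to keep its own context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request: each layer rewrites the offset on the way down. */
struct io_req {
    uint64_t offset;
    uint64_t length;
};

typedef void (*xform_fn)(struct io_req *req, const void *arg);

/* Hypothetical layer description, one per class instance in the stack. */
struct layer {
    xform_fn    fn;
    const void *arg;
    bool        needs_own_context; /* class wants the queued path kept */
};

#define MAX_LAYERS 8

/* The "compiled" form: a flat array of operations. */
struct compiled_stack {
    xform_fn    fn[MAX_LAYERS];
    const void *arg[MAX_LAYERS];
    size_t      n;
};

/* Partition-style transform: shift the request by the partition start. */
void partition_xform(struct io_req *req, const void *arg)
{
    const uint64_t *start = arg;
    req->offset += *start;
}

/*
 * Flatten the layers into a linear array. Returns false, leaving the
 * compiled stack empty, if any layer needs a detached context; the
 * caller would then stay on the existing queued dispatch path.
 */
bool stack_compile(const struct layer *layers, size_t n,
                   struct compiled_stack *out)
{
    assert(n <= MAX_LAYERS);
    out->n = 0;
    for (size_t i = 0; i < n; i++) {
        if (layers[i].needs_own_context) {
            out->n = 0;
            return false;
        }
        out->fn[i] = layers[i].fn;
        out->arg[i] = layers[i].arg;
        out->n++;
    }
    return true;
}

/* Run every transform directly in the calling context, with no queueing. */
void stack_run_direct(const struct compiled_stack *cs, struct io_req *req)
{
    for (size_t i = 0; i < cs->n; i++)
        cs->fn[i](req, cs->arg[i]);
}
```

A caller would try stack_compile() once when the stack is built and use
stack_run_direct() per request only if it succeeded. The lock-ordering
concern raised above is exactly what this sketch leaves out: the final
call into the driver would now happen in whatever context, and under
whatever locks, the original caller had.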