Date: Wed, 17 Oct 2007 15:09:35 +0200
From: Ulf Lilleengen
To: Fabio Checconi
Message-ID: <20071017130934.GA26180@stud.ntnu.no>
In-Reply-To: <20071017121907.GL99087@gandalf.sssup.it>
Cc: freebsd-hackers@freebsd.org, s223560@studenti.ing.unipi.it, luigi@FreeBSD.org
Subject: Re: Pluggable Disk Scheduler Project

On Wed, Oct 17, 2007 at 02:19:07 +0200, Fabio Checconi wrote:
> > From: Ulf Lilleengen
> > Date: Wed, Oct 17, 2007 01:07:15PM +0200
> >
> > On Tue, Oct 16, 2007 at 04:10:37 +0200, Karsten Behrmann wrote:
> > Over to a more general view of its architecture:
> >
> > When I looked at this project for the first time, I was under the impression
> > that this would be best done in a GEOM class.
> >
> > However, I think the approach that was taken in the Hybrid project is better
>
> Ok.  I think that such a solution requires a lot more effort on the
> design and coding sides, as it requires the modification of the
> drivers and can bring us problems with locking and with the queueing
> assumptions that may vary on a per-driver basis.
>

I completely agree that converting device drivers is a problem, but at
least it would be _optional_ (having different scheduler plugins could
make this possible). One does not necessarily need to convert the
drivers.

> Maybe I've not enough experience/knowledge of the driver subsystem,
> but I would not remove the queueing that is done now by the drivers
> (think of ata freezepoints,) but instead I'd like to try to grab
> the requests before they get to the driver (e.g., in/before their
> d_strategy call) and have some sort of pull mechanism when requests
> complete (still don't have any (serious) idea on that, I fear that
> the right place to do that, for locking issues and so on, can be
> driver dependent.)  Any ideas on that?  Which drivers can be good
> starting points to try to write down some code?
>

If you look at it, Hybrid is just a generalization of the existing
bioq_* API. And this API is used by GEOM classes _before_ device
drivers get the requests, AFAIK.

For a simple example of a driver, the md driver might be a good place
to look. Note that I have little experience with and knowledge of the
driver subsystem myself.

Also note (from the Hybrid page):

* we could not provide support for non work-conserving schedulers, due
  to a couple of reasons:
    1. the assumption, in some drivers, that bioq_disksort() will make
       requests immediately available (so a subsequent bioq_first()
       will not return NULL).
    2. the fact that there is no bioq_lock()/bioq_unlock(), so the
       scheduler does not have a safe way to generate requests for a
       given queue.

This certainly argues for having this in the GEOM layer, but perhaps
it's possible to change the assumptions made in some drivers? The
locking issue should perhaps be better planned though, and an audit of
the driver disksort code is necessary.

Also:

* as said, the ATA driver in 6.x/7.x moves the disksort one layer
  below the one we are working at, so this particular work won't help
  on ATA-based 6.x machines. We should figure out how to address this,
  because the work done at that layer is mostly a replica of the
  bioq_*() API.

So, I see this can get a bit messy, given that the ATA driver does its
own disksorts, but perhaps it would be possible to fix this by changing
the general ATA driver to use its own pluggable scheduler.

Anyway, I shouldn't demand that you do this, especially since I don't
have any code to show, and because you decide what you want to do.
However, I'd hate to see the Hybrid effort go to waste :)

I was hoping some of the authors of the project would reply with their
thoughts, so I CC'ed them.

--
Ulf Lilleengen
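
To make the first point quoted from the Hybrid page concrete, here is a
minimal userland sketch of the "sort, then immediately take the first
request" pattern and of what changes once a scheduler is allowed to hold
requests back. Everything in it is an assumption for illustration: the
toy_* names and the hold flag are hypothetical, this is not the FreeBSD
struct bio or bioq_* code, and the ascending sort is a simplification
(the ordering done by the real bioq_disksort() is more involved).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); NOT the kernel's bio or bio queue types. */
struct toy_bio {
    long offset;              /* disk offset of the request */
    struct toy_bio *next;     /* next request in the queue */
};

struct toy_queue {
    struct toy_bio *head;     /* requests kept in ascending offset order */
    int hold;                 /* nonzero: scheduler is idling on purpose,
                                 i.e. it is non work-conserving right now */
};

/* Rough analogue of bioq_disksort(): insert in ascending offset order. */
static void toy_disksort(struct toy_queue *q, struct toy_bio *b)
{
    struct toy_bio **pp;

    assert(q != NULL && b != NULL);
    pp = &q->head;
    while (*pp != NULL && (*pp)->offset <= b->offset)
        pp = &(*pp)->next;
    b->next = *pp;
    *pp = b;
}

/*
 * Rough analogue of bioq_first(): peek at the next request.  Returns NULL
 * when the queue is empty OR when the scheduler is holding requests back.
 */
static struct toy_bio *toy_first(const struct toy_queue *q)
{
    return q->hold ? NULL : q->head;
}

/*
 * Driver-side pattern: queue a request, then start whatever is first.
 * A driver that assumes a work-conserving queue would skip the NULL check
 * and dereference the result; this variant reports -1 instead.
 */
static long toy_driver_start(struct toy_queue *q, struct toy_bio *b)
{
    struct toy_bio *bp;

    toy_disksort(q, b);
    bp = toy_first(q);
    if (bp == NULL)
        return -1;            /* nothing dispatchable right now */
    return bp->offset;
}
```

With hold left at 0, every call to toy_driver_start() returns the lowest
queued offset, which is the behaviour the existing drivers rely on. Once
hold is set, the same call returns -1 even though the queue is not empty,
and that is the case a driver audit would have to look for.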