Date: Wed, 17 Oct 2007 13:07:15 +0200
From: Ulf Lilleengen
To: Karsten Behrmann
Cc: freebsd-hackers@freebsd.org, Fabio Checconi
Subject: Re: Pluggable Disk Scheduler Project
Message-ID: <20071017110715.GA25075@stud.ntnu.no>
In-Reply-To: <20071016161037.5ab1b74f@39-25.mops.rwth-aachen.de>
References: <20071011022001.GC13480@gandalf.sssup.it> <20071016161037.5ab1b74f@39-25.mops.rwth-aachen.de>
List-Id: Technical Discussions relating to FreeBSD

On Tue, Oct 16, 2007 at 04:10:37 +0200, Karsten Behrmann wrote:
> > Hi,
> >     is anybody working on the `Pluggable Disk Scheduler Project' from
> > the ideas page?
> I've been kicking the idea around in my head, but I'm probably newer to
> everything involved than you are, so feel free to pick it up. If you want,
> we can toss some ideas and code to each other, though I don't really
> have anything on the latter.
>
> [...]
> > After reading [1], [2] and its follow-ups the main problems that
> > need to be addressed seem to be:
> >
> > o is working on disk scheduling worth at all?
> Probably, one of the main applications would be to make the background
> fsck a little more well-behaved.
I agree. As I said before, the ability to give I/O priorities is probably one
of the most important things.

> > o Where is the right place (in GEOM) for a disk scheduler?
[...]
> > o How can anticipation be introduced into the GEOM framework?
> I wouldn't focus on just anticipation, but also other types of
> schedulers (I/O scheduling influenced by nice value?)
>
> > o What can be an interface for disk schedulers?
> good question, but geom seems a good start ;)
>
> > o How to deal with devices that handle multiple request per time?
> Bad news first: this is most disks out there, in a way ;)
> SCSI has tagged queuing, ATA has native command queuing or
> whatever the ata people came up over their morning coffee today.
> I'll mention a bit more about this further down.
>
> > o How to deal with metadata requests and other VFS issues?
> Like any other disk request, though for priority-respecting
> schedulers this may get rather interesting.
>
> [...]
> > The main idea is to allow the scheduler to enqueue the requests
> > having only one (other small fixed numbers can be better on some
> > hardware) outstanding request and to pass new requests to its
> > provider only after the service of the previous one ended.
[...]
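
For illustration, here is a minimal user-space sketch of the idea quoted above: the scheduler holds requests in its own queue and lets only a small fixed number (one, in the proposal) be outstanding at the provider at any time. This is not FreeBSD kernel code and is not taken from any patch in this thread; `struct request`, `struct sched`, and every function name are hypothetical, and plain FIFO order stands in for a real scheduling policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request type; a real GEOM scheduler would handle struct bio. */
struct request {
    long offset;                  /* starting offset on the device */
    struct request *next;         /* link in the scheduler's queue */
};

/* Hypothetical scheduler state: held-back requests plus an in-flight count. */
struct sched {
    struct request *head, *tail;  /* requests not yet passed to the provider */
    int in_flight;                /* requests currently at the provider */
    int max_in_flight;            /* 1 in the quoted proposal */
    void (*issue)(struct request *); /* hands one request to the provider */
};

/* Stand-in provider that only records the order in which requests arrive. */
static long issued[16];
static int issued_count;

static void record_issue(struct request *r)
{
    if (issued_count < 16)
        issued[issued_count++] = r->offset;
}

static void sched_init(struct sched *s, int max_in_flight,
                       void (*issue)(struct request *))
{
    s->head = s->tail = NULL;
    s->in_flight = 0;
    s->max_in_flight = max_in_flight;
    s->issue = issue;
}

/* Pass queued requests down while the provider still has a free slot. */
static void sched_dispatch(struct sched *s)
{
    while (s->head != NULL && s->in_flight < s->max_in_flight) {
        struct request *r = s->head;   /* FIFO here; a policy would choose */
        s->head = r->next;
        if (s->head == NULL)
            s->tail = NULL;
        r->next = NULL;
        s->in_flight++;
        s->issue(r);
    }
}

/* A new request arrives: queue it, then dispatch if a slot is free. */
static void sched_enqueue(struct sched *s, struct request *r)
{
    r->next = NULL;
    if (s->tail != NULL)
        s->tail->next = r;
    else
        s->head = r;
    s->tail = r;
    sched_dispatch(s);
}

/* The provider finished one request: free the slot, dispatch the next. */
static void sched_done(struct sched *s)
{
    assert(s->in_flight > 0);
    s->in_flight--;
    sched_dispatch(s);
}
```

With `max_in_flight` set to 1, three requests enqueued back to back reach the provider one at a time, each only after `sched_done()` has been called for the previous one. Everything still waiting in the queue remains available for the scheduler to reorder or delay.
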
> - servers where anticipatory performs better than elevator
> - realtime environments that need a scheduler fitting their needs
> - the background fsck, if someone implements a "priority" scheduler
Apache is actually a good candidate according to the old anticipatory design
document (not sure of its relevance today, but...)

Over to a more general view of its architecture:

When I looked at this project for the first time, I was under the impression
that this would be best done in a GEOM class. However, I think the approach
that was taken in the Hybrid project is better, for the following reasons:

- It can be used by _both_ GEOM classes and device drivers (which
  might use some other scheduler type?).
- It does not remove any configurability, since changing schedulers etc.
  can be done by the user with sysctl.
- It could make it possible for a GEOM class to decide for itself which
  scheduler it wants to use (most GEOM classes use the standard
  bioq_disksort interface in disk_subr.c).
- The ability to stack a GEOM class with a scheduler could easily be
  "emulated" by creating a GEOM class that utilizes the disksched framework.

All in all, I just think this approach gives more flexibility than putting it
in a GEOM class that has to be added manually by a user. Just my thoughts on
this.

Also, I got my test-box up again today, and will be trying your patch as soon
as I've upgraded it to CURRENT, Fabio.

-- 
Ulf Lilleengen