From owner-freebsd-arch@FreeBSD.ORG Sun Jan 4 01:07:08 2004
To: "M. Warner Losh"
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Sat, 03 Jan 2004 15:36:44 MST." <20040103.153644.107852018.imp@bsdimp.com>
Date: Sun, 04 Jan 2004 10:06:55 +0100
Message-ID: <27622.1073207215@critter.freebsd.dk>
cc: arch@freebsd.org
Subject: Re: Simple patch: Make DFLTPHYS an option
X-BeenThere: freebsd-arch@freebsd.org
Precedence: list
List-Id: Discussion related to FreeBSD architecture

In message <20040103.153644.107852018.imp@bsdimp.com>, "M. Warner Losh" writes:

>The folks on IRC cautioned that this is not for the faint of heart,
>since there's 'issues' with physio, MAXPHYS, etc.
>
>Comments?

We need to get MAXPHYS into the megabyte range eventually.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@FreeBSD.ORG Sun Jan 4 03:11:59 2004
Date: Sun, 4 Jan 2004 22:11:51 +1100 (EST)
From: Bruce Evans
To: Scott Long
In-Reply-To: <3FF7BD89.4080406@freebsd.org>
Message-ID: <20040104211704.O582@gamplex.bde.org>
References: <20040103.153644.107852018.imp@bsdimp.com> <3FF7967A.1090401@freebsd.org><3FF7BD89.4080406@freebsd.org>
cc: arch@freebsd.org
Subject: Re: Simple patch: Make DFLTPHYS an option

On Sun, 4 Jan 2004, Scott Long wrote:

> Bruce Evans wrote:
> > On Sat, 3 Jan 2004, Scott Long wrote:
> >
> >>The key, though, is to ensure that the block system is actually
> >>honoring the per-device disk.d_maxsize variable.  I'm not sure if it is
> >>right now.
> >
> > It at least used to work (up to MAXPHYS).  The ad driver used a max
> > i/o size of 128K until recently.  This has rotted back to 64K for some
> > reason (64K is encoded as DFLTPHYS in the non-dma case and as 64 * 1024
> > in the dma case).
>
> I've seen evidence lately that this might be broken, but I need to track
> it down further.

Do you mean sizes other than DFLTPHYS or the ad driver?
For ad, I remember seeing the commit that reduced the size, but I couldn't
find it easily.  It seems to have been just the big ATAng commit.

I don't know of any problems with i/o size maxes different from the
defaults except for the one in spec_getpages().  I/O sizes of up to
(VM_INITIAL_PAGEIN * PAGE_SIZE) bytes must work for disk devices, since
spec_getpages() doesn't honor dev->si_iosize_max.  This value accidentally
defaults to the same value as DFLTPHYS on machines with 4K pages and to the
same value as MAXPHYS on machines with 8K pages.  Thus the "maximum" given
by dev->si_iosize_max cannot actually be the maximum on any machine if it
is < DFLTPHYS, and the usual default of DFLTPHYS is never the actual maximum
on non-broken machines with 8K pages.

Most disk drivers handle this by splitting up large i/o's into smaller ones
internally.  physio() does similar splitting (based on si_iosize_max).  So
si_iosize_max is not very useful for disks.  physio() would do better just
to split up based on MAXPHYS (since large sizes only occur if the user
requests them).  Clustering may benefit from using a smaller size (since a
smaller size may actually be better and users can't control it).  physio()
needs si_iosize_max mainly to avoid wrong splitting for non-disk devices
(mainly tapes).

> >>Also, increasing MAXPHYS will lead to your KVA being chewed up quite
> >>quickly, which in turn will lead to unpleasant panics.  A lot of work
> >>needs to go into fixing this; increasing the value here has little
> >>value even to people who shun seatbelts.
> >
> > Not all that quickly.  MAXPHYS affects mainly pbufs, and there are a
> > limited number of them (256 max?), and their kva is statically allocated.
> > 256 times the current MAXPHYS gives 16M.
> > This could easily be increased
> > by a factor of up to about 8 without necessarily breaking things (e.g.,
> > by stealing 112MB from buffer kva using VM_BCACHE_SIZE if the default
> > normal-buffer kva size is large (if it is small then there should be
> > space to spare, else there would be no space to spare on systems with
> > more RAM so that the buffer kva size is larger).
>
> VFS, softupdates, UFS_DIRHASH, etc, all contribute to KVA being eaten
> faster than it used to be.

Don't use them then :-).  (I mostly don't.)

> Even with smarter tuning of common culprits
> like maxvnodes, KVA is still under a lot of pressure.

This depends on the memory size.  I use VM_BCACHE_SIZE = 512M and have no
problems fitting everything else in the remaining 512M - on a machine with
1GB.  With more physical memory, it becomes harder to fit everything in
without kludges.  (The default BKVASIZE and VM_BCACHE_SIZE are already
kludged to take 1/4 as much space as they should, although this is not
necessary on machines with not much physical memory or with more KVA than
i386's have.)

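As a rough illustration of the request splitting and the pbuf arithmetic
discussed in this message, here is a minimal Python sketch.  It is not
FreeBSD code: split_request() and pbuf_kva() are hypothetical helper names,
and the 64K and 256 values are simply the figures quoted in the thread (64K
for DFLTPHYS, and 256 pbufs at 64K each for the 16M total).

```python
KIB = 1024
DFLTPHYS = 64 * KIB  # the thread states that 64K is encoded as DFLTPHYS


def split_request(offset, length, iosize_max):
    """Split one transfer into (offset, length) chunks of at most iosize_max bytes.

    This mimics the kind of splitting described for physio(); it is a toy,
    not the kernel's algorithm.
    """
    if iosize_max <= 0:
        raise ValueError("iosize_max must be positive")
    chunks = []
    while length > 0:
        n = min(length, iosize_max)
        chunks.append((offset, n))
        offset += n
        length -= n
    return chunks


def pbuf_kva(npbufs, bytes_per_pbuf):
    """KVA reserved when every pbuf statically maps bytes_per_pbuf bytes."""
    return npbufs * bytes_per_pbuf
```

With these definitions, split_request(0, 160 * KIB, DFLTPHYS) yields two 64K
chunks followed by one 32K chunk.  pbuf_kva(256, 64 * KIB) is 16M, and
raising the per-buffer size by a factor of 8 gives 128M, i.e. 112M more,
which is where the 112MB figure comes from.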
Bruce

From owner-freebsd-arch@FreeBSD.ORG Mon Jan 5 16:32:40 2004
Message-ID: <000501c3d3ec$96a898c0$86932d50@HP25268139141>
From: "John Maclean"
To:
Date: Tue, 6 Jan 2004 00:32:38 -0000
Subject:

Sirs,

I have read your site http://www.freebsd.org/platforms/index.html as I am
interested in learning C on a UNIX type platform. Could you tell me please
if FreeBSD would run on an Athlon XP based laptop?

Regards,

John Maclean

From owner-freebsd-arch@FreeBSD.ORG Mon Jan 5 16:45:24 2004
From: Dominic Marks
To: "John Maclean" ,
Date: Tue, 6 Jan 2004 00:49:25 +0000
References: <000501c3d3ec$96a898c0$86932d50@HP25268139141>
In-Reply-To: <000501c3d3ec$96a898c0$86932d50@HP25268139141>
Message-Id: <200401060049.25439.dom@wirespeed.org.uk>
Subject: Re:
Reply-To: dom@wirespeed.org.uk

On Tuesday 06 Jan 2004 12:32 am, John Maclean wrote:
> Sirs,
>
> I have read your site http://www.freebsd.org/platforms/index.html as
> I am interested in learning C on a UNIX type platform. Could you
> tell me please if FreeBSD would run on an Athlon XP based laptop?

It will almost certainly run like a dream.
Congratulations on your excellent choice of learning environment :-)

NB: The freebsd-questions list would have been more appropriate, but I
can see why you picked out freebsd-arch. I recommend consulting the
FreeBSD Handbook for a rundown of how to install FreeBSD and
descriptions of the purposes of the various mailing lists. You can
access this at http://www.uk.freebsd.org/handbook/.

Enjoy.

> Regards,
>
> John Maclean
>
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to
> "freebsd-arch-unsubscribe@freebsd.org"

--
Dominic

From owner-freebsd-arch@FreeBSD.ORG Mon Jan 5 21:00:30 2004
From: jared forrester
To:
Date: Mon, 5 Jan 2004 23:04:08 -0600
References: <000501c3d3ec$96a898c0$86932d50@HP25268139141>
In-Reply-To: <000501c3d3ec$96a898c0$86932d50@HP25268139141>
Message-Id: <200401052304.08194.forres3j@uregina.ca>
Subject: Re: [PMX:#]
X-List-Received-Date: Tue, 06
 Jan 2004 05:00:30 -0000

On January 5, 2004 06:32 pm, John Maclean wrote:
> Sirs,
>
> I have read your site http://www.freebsd.org/platforms/index.html as I am
> interested in learning C on a UNIX type platform. Could you tell me please
> if FreeBSD would run on an Athlon XP based laptop?
>
> Regards,
>
> John Maclean

Well, the short answer would be perhaps. The long answer is that it really
depends on exactly what hardware makes up the laptop and how much of it you
need to work. You should start by making an inventory of everything it
contains (what type of IDE controller, sound card type, etc.) as well as
your IRQ settings (probably not necessary but it can't hurt). Next you
should read the FreeBSD Handbook located at:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/index.html

It details what hardware is supported etc. As far as laptops go, you will
have some trouble with cardbus (it's supported in the 5.x series but not in
4.x last I checked). It's unlikely that FreeBSD will not run at all, but,
if anything, some of your hardware will not work 100%. I used to run
FreeBSD 4.8 on a Compaq Armada 7400 but it did not support the triflex ide
controller (though the generic driver did work, just without udma-33), the
usb controller (not at all), nor the cardbus slots (though they did work as
16 bit pcmcia slots and I could use a linksys 10/100 nic). However, FreeBSD
did still run.

I hope that helps

Jared.

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 6 02:47:39 2004
Message-Id: <200401061047.i06AlbjB022155@smtp4.server.rpi.edu>
To: clapham.99@tiscali.co.uk
From: higgsr@rpi.edu
Date: Tue, 06 Jan 2004 5:47:37 EST
cc: freebsd-arch@freebsd.org
Subject: Re:
Reply-To: higgsr@rpi.edu

This type of question should probably be sent to freebsd-questions.

Anyway, you should make an inventory of your hardware (ethernet,
hard drives, any raid controllers, etc.)

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/install-pre.html

and have a look at the hardware docs in either

ftp://ftp4.freebsd.org/pub/FreeBSD/releases/i386/4.9-RELEASE/
ftp://ftp4.freebsd.org/pub/FreeBSD/releases/i386/5.2-RC2/

depending on which branch you are interested in.

Ray Higgs

On Tue, 6 Jan 2004 00:32:38 -0000 "John Maclean" wrote:

> Sirs,
>
> I have read your site http://www.freebsd.org/platforms/index.html as I am
> interested in learning C on a UNIX type platform.
> Could you tell me please
> if FreeBSD would run on an Athlon XP based laptop?
>
> Regards,
>
> John Maclean

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 6 22:22:59 2004
Date: Wed, 7 Jan 2004 16:52:52 +1030
From: Greg 'groggy' Lehey
To: FreeBSD Architecture Mailing List
Message-ID: <20040107062252.GQ7617@wantadilla.lemis.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="QOaciPm+FYh5cTV+"
Content-Disposition: inline
Organization: The FreeBSD Project
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.FreeBSD.org/
X-PGP-Fingerprint: 9A1B 8202 BCCE B846 F92F 09AC 22E6 F290 507A 4223
Subject: Vinum and GEOM: the future

--QOaciPm+FYh5cTV+
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Vinum and GEOM overlap significantly in their features, and they do
some things not only differently but in an incompatible manner.  The
development of GEOM has resulted in Vinum features atrophying and
rotting.  For example, at present, it's not possible to put swap on a
Vinum volume, due to a change in the swapon() code which requires
GEOM.

Vinum was written at a time when many of its features were not
available in any other form.  With the advent of GEOM, that has
changed.  What should we do with Vinum?  The obvious options are:

1.  Ditch it.  It's served its purpose, and there are better
    alternatives.
2.  Keep it alongside GEOM, and maintain code such as the swapon()
    code to handle both.
3.  Modify it to understand GEOM.

As I'll explain, I think that the only serious option is (3).  Vinum
needs to be modified to work with GEOM, at least in those areas which
overlap.

One problem I have is understanding the relationship between GEOM and
Vinum.  Yes, it's easy to understand statements (from geom(4)) like:

     In the fixed hierarchy above it is not possible to mirror two
     physical disks and then partition the mirror into subdisks, instead
     one is forced to make subdisks on the physical volumes and to mirror
     these two and two resulting in a much more complex configuration.
     GEOM on the other hand does not care in which order things are
     done, ...

It's also very clear that GEOM offers significant advantages in this
area (but also more room for users to shoot themselves in the foot;
the quote above continues: "the only restriction is that cycles in the
graph will not be allowed.").  The question I have is: what other
advantages does it offer?

I'm currently writing a paper for presentation at Linux.Conf.Au
in Adelaide next week
(http://lca2004.linux.org.au/, in case you're interested, and yes,
they specifically asked for a paper about Vinum.
Go figure), and I've come up with the following list of Vinum features:

- Online configuration via the vinum utility program.
- Automatic error detection and recovery where possible.
- State information for each object.  This enables Vinum to function
  correctly even if some objects are not accessible.
- Persistent configuration.  Each Vinum drive stores two copies of the
  configuration, so the system can start up automatically.  The
  configuration includes state information, so any degraded objects
  will remain so over a reboot, or even when moved to a new system.
- Support for Vinum root file systems.
- Online rebuild of objects.

Interestingly, none of these touch GEOM as far as I can see.  Am I
missing something?

Based on this understanding, my intentions for Vinum currently don't
go beyond replacing the following:

- Replace the objects volume, plex and subdisk with a corresponding
  geom.  I expect this to enable a more arbitrary means of joining
  together the objects, but that's about all.
- Replace the ioctls with gctl_s.  This seems to be more cosmetic than
  functional, though also a good idea.

This will certainly be worthwhile, but somehow I was expecting more.
Can anybody suggest other things that could be changed with benefit?

Greg
--
See complete headers for address and phone numbers.

--QOaciPm+FYh5cTV+
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.0 (FreeBSD)

iD8DBQE/+6W8IubykFB6QiMRAh1dAKCUDONVoK2uF646FolDst79gwR24ACfc5+6
2hRth80ABV3yoQmbPx+Hjwc=
=EJes
-----END PGP SIGNATURE-----

--QOaciPm+FYh5cTV+--

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 6 23:49:28 2004
To: "Greg 'groggy' Lehey"
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Wed, 07 Jan 2004 16:52:52 +1030." <20040107062252.GQ7617@wantadilla.lemis.com>
Date: Wed, 07 Jan 2004 08:49:23 +0100
Message-ID: <25392.1073461763@critter.freebsd.dk>
cc: FreeBSD Architecture Mailing List
Subject: Re: Vinum and GEOM: the future

In message <20040107062252.GQ7617@wantadilla.lemis.com>, "Greg 'groggy' Lehey" writes:

>It's also very clear that GEOM offers significant advantages in this
>area (but also more room for users to shoot themselves in the foot;

I think it is important to keep a clear distinction between "GEOM"
(the infrastructure component) on one side and the GEOM classes on
the other.
The intent was from the start that all politics would happen in the
classes, not in the infrastructure code, and as a result, you can
seriously penetrate your feet with a badly thought out or implemented
class.

On the other hand, it is also possible to write classes in a way
which prevents such footshooting, at least that's the verdict so far.

>the quote above continues: "the only restriction is that cycles in the
>graph will not be allowed.").  The question I have is: what other
>advantages does it offer?

Parents praising kids is a dreadful thing to listen to, but as I
see it, the main advantage is to give us the necessary infrastructure
to do all the weird things we want to do, without running into
problems with recursion, kernel-stack overruns and needless code
duplication.

Infrastructure is always hard to argue for in advance, but once in
place it rapidly becomes nearly invisible and after a while people
start to generalize from it.

Before GEOM, nobody ever asked me to be able to encrypt only one
copy of a mirror ("We take that disk home for the night") because
obviously mirroring was something you did with CCD or Vinum, and
neither did anything like encryption.  Now with GEOM you should see
some of the requests I get...

>(http://lca2004.linux.org.au/, in case you're interested, and yes,
>they specifically asked for a paper about Vinum.  Go figure), and I've
>come up with the following list of Vinum features:
>
> [...]
>
>Interestingly, none of these touch GEOM as far as I can see.  Am I
>missing something?

Yes and no.

None of these are GEOM's jobs, they are all stuff which the GEOM
classes should do.  (Of course GEOM should make it as easy as possible
and offer sensible libraries etc).

If you put all of Vinum into one GEOM class, like I did with CCD,
then you would basically have the same situation as before GEOM,
except a lot of the magic code you had to do for Vinum now can rely
on GEOM to offer these facilities as standard.
The entire auto-discovery thing for instance.

>Based on this understanding, my intentions for Vinum currently don't
>go beyond replacing the following:
>
>- Replace the objects volume, plex and subdisk with a corresponding
>  geom.  I expect this to enable a more arbitrary means of joining
>  together the objects, but that's about all.
>- Replace the ioctls with gctl_s.  This seems to be more cosmetic than
>  functional, though also a good idea.

Well, as I've said before, I would really suggest you just start
out by making Vinum a single GEOM class, where you use consumers
instead of calling devsw()->strategy() to access the disks vinum
lives on, and offer providers instead of dev_t's for access to the
vinum entities you expose (volumes, plex etc).

I know ScottL started working on RaidFrame, and listening to him
during the process was very much like "OK, this is no longer necessary
... this goes ... don't need this ... get rid of that ..." and I
would hope the vinum experience would be the same.

This would not result in changes to the vinum user experience or
require documentation changes, and it would give you a good clean
point to work further from.

>This will certainly be worthwhile, but somehow I was expecting more.
>Can anybody suggest other things that could be changed with benefit?

Not off the top of my head.

In the long term, I would hope that we would end up where we have
one very general MIRROR class, one very general RAID5 class etc.
These would just do the basic disk-request transformations, but not
contain any autoconfiguration or all that.

The other classes, VINUM could be a good example, would add the
"high-level" intelligence, by autodiscovering metadata and based on
the metadata configuring STRIPE, MIRROR and RAID5 modules to do the
right thing.  Other such "high-level" classes might be RAIDFRAME,
"VERITAS" and "AIX-LVM".

But this is long term stuff, and we need to crawl first.
For now I'm content with GEOM giving us the ability to implement
transformations in a clean way.

Poul-Henning

From owner-freebsd-arch@FreeBSD.ORG Tue Jan 6 23:50:41 2004
From: "Jesper Louis Andersen"
Date: Wed, 7 Jan 2004 08:50:55 +0100
To: Greg 'groggy' Lehey
Message-ID: <20040107075055.GC12220@miracle.mongers.org>
References: <20040107062252.GQ7617@wantadilla.lemis.com>
In-Reply-To: <20040107062252.GQ7617@wantadilla.lemis.com>
cc: FreeBSD Architecture Mailing List
Subject: Re: Vinum and GEOM: the future

Quoting Greg 'groggy' Lehey (grog@FreeBSD.org):

Not doing any useful work for FreeBSD won't keep me from commenting
on this issue. I saw it coming about the time when GEOM was added to
the kernel and then made mandatory for the operation of the kernel.

> 1. Ditch it.  It's served its purpose, and there are better
>    alternatives.
> 2.
>    Keep it alongside GEOM, and maintain code such as the swapon()
>    code to handle both.
> 3. Modify it to understand GEOM.

I think you explain the points of (3) quite well. I will kill (1) then:

o GEOM is new in the sense that it has not seen a bashing of n system
  administrators running it on flaky disks (to quote the GEOM author
  from gbde(1) in a slightly modified manner ;).

o GEOM provides things which Vinum is not capable of at present, and it
  will probably be beyond the scope to add said features to Vinum, apart
  from the fact that it would start a competition between the 2
  implementations of the same feature. FreeBSD does not have enough
  committers to support that IMO. This is radically different from e.g.
  Linux, where it is common to see pieces of code compete for inclusion
  into the kernel.

o Vinum is stable. Vinum is proven to work. It is a bit of a dark land,
  though, since people who _really_ need RAID probably do it in hardware
  anyway. This fact makes it hard to ditch Vinum entirely, because that
  would leave people with no stable option when choosing to build
  software RAID systems.

That said, I think it should die at some point. There really is no reason
for having 2 things that do the same job unless they differ in their
presumed stability.

I did some abstract thinking about how to implement a RAID-1 using
GEOM. In the abstract, the kernel work is not hard. Yet it is complex:
there is a very high risk that you forget to handle some small corner
case correctly, and then data on the disks is lost when that case
happens. People using RAID systems should not use them as a substitute
for backups. Backups should already be in place. People use RAID to
minimize the downtime window, and if that is the point, then stability
matters a whole lot.

Modifying Vinum to work with GEOM is changing Vinum, yes. But not in the
area that matters with respect to stability.

My 2 cents, somebody else may attack (2).

--
j.

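The point that RAID-1 is simple in the abstract but easy to get wrong in
the corner cases can be illustrated with a toy model.  The Python sketch
below is only an assumption-laden illustration, not Vinum or GEOM code:
ToyMirror and all of its methods are hypothetical names, and "disks" are
plain in-memory lists.

```python
class ToyMirror:
    """Toy RAID-1: every healthy member holds a full copy of the data."""

    def __init__(self, nmembers, nblocks):
        self.members = [[None] * nblocks for _ in range(nmembers)]
        self.healthy = [True] * nmembers

    def _first_healthy(self):
        for idx, ok in enumerate(self.healthy):
            if ok:
                return idx
        raise OSError("no healthy member left")

    def fail(self, idx):
        """Mark a member as failed; it receives no further writes."""
        self.healthy[idx] = False

    def write(self, block, data):
        self._first_healthy()  # raises if the mirror is completely dead
        for idx, ok in enumerate(self.healthy):
            if ok:
                self.members[idx][block] = data

    def read(self, block):
        return self.members[self._first_healthy()][block]

    def rebuild(self, idx):
        """Copy a healthy member onto member idx before trusting it again."""
        source = self._first_healthy()
        self.members[idx] = list(self.members[source])
        self.healthy[idx] = True
```

The corner case shows up once a member has failed and writes continue: the
failed member now holds stale blocks.  Setting its healthy flag back to True
without calling rebuild() would let read() return old data, which is exactly
the kind of small mistake that loses data in a real implementation.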
From owner-freebsd-arch@FreeBSD.ORG Wed Jan 7 07:49:13 2004
Date: Wed, 7 Jan 2004 10:47:48 -0500 (EST)
From: Robert Watson
To: "Greg 'groggy' Lehey"
In-Reply-To: <20040107062252.GQ7617@wantadilla.lemis.com>
Message-ID:
cc: FreeBSD Architecture Mailing List
Subject: Re: Vinum and GEOM: the future

On Wed, 7 Jan 2004, Greg 'groggy' Lehey wrote:

> 1.  Ditch it.  It's served its purpose, and there are better
>     alternatives.

Right now, Vinum remains the most production-ready implementation of RAID5
on FreeBSD.  Not only that, but even if we do eventually decide to kick out
Vinum, we need to provide a sensible migration path to whatever replaces
it.

> 2.  Keep it alongside GEOM, and maintain code such as the swapon()
>     code to handle both.
One of the nice things about the move to GEOM is that we now have a consistent and reliable abstraction for storage devices, with well-defined APIs for querying storage properties, rather than attempting to futz around with "Is it a character device that kind of implements the things we're kind of looking for". The disk(9) API provides a pretty reasonable API for non-GEOM devices to export "I am a storage thing", and experience seems to demonstrate that writing GEOM transforms and services is far easier than digging around to do it from scratch. > 3. Modify it to understand GEOM. Vinum seems to consist of two components: things to make up for a lack of GEOM/devfs, and things that implement volume/RAID services. Gradually trimming the overlap will allow the body of Vinum to implement that which it actually exists to do: volume and RAID services, and seems like a natural direction. > - Online configuration via the vinum utility program. > - Automatic error detection and recovery where possible. > - State information for each object. This enables Vinum to function > correctly even if some objects are not accessible. > - Persistent configuration. Each Vinum drive stores two copies of the > configuration, so the system can start up automatically. The > configuration includes state information, so any degraded objects > will remain so over a reboot, or even when moved to a new system. > - Support for Vinum root file systems. > - Online rebuild of objects. > > Interestingly, none of these touch GEOM as far as I can see. Am I > missing something? An important goal of GEOM is to allow storage transform authors to have to deal with less paperwork by providing reasonable abstractions. 
If half of the paperwork evaporates from Vinum, it will be a lot easier
to do these things -- for example, you get decent notification of disk
arrival/removal so that you can automatically configure, it provides a
framework to allow interlocking pieces to cooperate, and a better-defined
mechanism to pass requests up and down the stack. Another benefit is that
you get Vinum's hands out of the internals of device management, which
should improve maintainability and reduce complexity.

> Based on this understanding, my intentions for Vinum currently don't go
> beyond replacing the following:
>
> - Replace the objects volume, plex and subdisk with a corresponding
>   geom. I expect this to enable a more arbitrary means of joining
>   together the objects, but that's about all.
> - Replace the ioctls with gctl_s. This seems to be more cosmetic than
>   functional, though also a good idea.
>
> This will certainly be worthwhile, but somehow I was expecting more.
> Can anybody suggest other things that could be changed with benefit?

I think there's a spectrum of possibilities you can explore, and that it
offers a lot of choices.

The most obvious first step is to have Vinum export its storage units
using the disk(9) API, which will permit GEOM consumers to attach to
those devices as "disks". This will get swap up and running again with
what I hope will be little difficulty, and basically put Vinum in the
same situation in which disk devices currently sit. disk(9) allows you to
say "Hi, I'm a disk, and I implement the following methods, and have the
following properties". The one caution is to be careful about generating
cycles: i.e., only export volumes, not the bits that make up volumes. You
would continue to use character devices for things like the Vinum control
node.

A second phase involves an actual "GEOMification", in which you modify
Vinum to consume and produce GEOM instances.
I.e., you turn Vinum into one large GEOM class, using GEOM to discover and access storage objects, and using GEOM to expose new storage objects, and use GEOM's stage engine and bio management. As I mentioned, this will strip a lot of the "paperwork" from Vinum, and result in Vinum no longer directly producing or consuming character devices for storage I/O. Note that in this stage, one of the things you can do is move to using GEOM ctl operations to manage Vinum, but that's not obligatory: you could still maintain the use of a character device for control ioctls. A third, and optional stage, would be to then decompose Vinum into its logical components, creating GEOM classes for each of those components. This will be a lot more work, but I think would be well worth it. However, it will take a fair amount of time, so I think that this makes sense only after performing one of the above steps as an interim stage. My recommendation would be to begin by simply attacking the disk(9) issue. Chances are, the changes will be small -- avoiding cycles might fall out naturally, or it might require a little tweaking. Once it exports disk(9), you're at a point where you can pause for breath and take on the larger tasks. 
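The caution about cycles in the first step can be modelled in a few lines
of userland C. This is only a sketch under stated assumptions: vobj,
vobj_kind, vobj_should_export and vobj_export_all are hypothetical names
and are not part of disk(9), GEOM or Vinum. A real implementation would
describe each volume to disk(9) rather than set a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object kinds, named after Vinum's volume/plex/subdisk. */
enum vobj_kind { VOBJ_SUBDISK, VOBJ_PLEX, VOBJ_VOLUME };

/* Hypothetical stand-in for the properties a disk-like provider advertises. */
struct vobj {
	enum vobj_kind kind;
	const char *name;
	unsigned long long mediasize;	/* size in bytes */
	unsigned sectorsize;		/* sector size in bytes */
	bool exported;			/* set when offered as a "disk" */
};

/*
 * Export policy: only top-level volumes are offered as disks.  Plexes and
 * subdisks are the pieces that make up a volume; exporting them as well
 * would let a consumer attach to its own building blocks (a cycle).
 */
static bool
vobj_should_export(const struct vobj *o)
{
	return (o->kind == VOBJ_VOLUME);
}

/* Walk the object table, mark what gets exported, return how many. */
static size_t
vobj_export_all(struct vobj *objs, size_t n)
{
	size_t i, count = 0;

	assert(objs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		objs[i].exported = vobj_should_export(&objs[i]);
		if (objs[i].exported)
			count++;
	}
	return (count);
}
```

With one subdisk, one plex and one volume in the table, only the volume
ends up marked as exported.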
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org Senior Research Scientist, McAfee Research

From owner-freebsd-arch@FreeBSD.ORG Wed Jan 7 12:40:26 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 85EF416A4CE for ; Wed, 7 Jan 2004 12:40:26 -0800 (PST) Received: from web61104.mail.yahoo.com (web61104.mail.yahoo.com [216.155.196.106]) by mx1.FreeBSD.org (Postfix) with SMTP id 4EE2543D5A for ; Wed, 7 Jan 2004 12:40:04 -0800 (PST) (envelope-from pawel_worach@yahoo.com) Message-ID: <20040107204003.9827.qmail@web61104.mail.yahoo.com> Received: from [193.234.190.150] by web61104.mail.yahoo.com via HTTP; Wed, 07 Jan 2004 12:40:03 PST Date: Wed, 7 Jan 2004 12:40:03 -0800 (PST) From: Pawel Worach To: freebsd-arch@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: Rewamped clone_root script X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jan 2004 20:40:26 -0000

Hi!

I have been using this script for a while instead of
/usr/share/examples/diskless/clone_root which is
currently outdated because of the /lib changes.

If you like it please feel free to commit.
Regards
Pawel Worach

[Attachment clone_root.gz: uuencoded gzip data (mode 644), truncated in this archive and not reproduced]

Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org
(Postfix) with ESMTP id 4E96516A4CE for ; Wed, 7 Jan 2004 15:18:25 -0800 (PST) Received: from dragon.nuxi.com (trang.nuxi.com [66.93.134.19]) by mx1.FreeBSD.org (Postfix) with ESMTP id 05B1943D60 for ; Wed, 7 Jan 2004 15:18:21 -0800 (PST) (envelope-from obrien@dragon.nuxi.com) Received: from dragon.nuxi.com (obrien@localhost [127.0.0.1]) by dragon.nuxi.com (8.12.10/8.12.9) with ESMTP id i07NIJvT064937; Wed, 7 Jan 2004 15:18:19 -0800 (PST) (envelope-from obrien@dragon.nuxi.com) Received: (from obrien@localhost) by dragon.nuxi.com (8.12.10/8.12.10/Submit) id i07NIJrC064936; Wed, 7 Jan 2004 15:18:19 -0800 (PST) (envelope-from obrien) Date: Wed, 7 Jan 2004 15:18:19 -0800 From: "David O'Brien" To: John Maclean Message-ID: <20040107231819.GA64718@dragon.nuxi.com> References: <000501c3d3ec$96a898c0$86932d50@HP25268139141> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <000501c3d3ec$96a898c0$86932d50@HP25268139141> User-Agent: Mutt/1.4.1i X-Operating-System: FreeBSD 5.2-CURRENT Organization: The NUXI BSD Group X-Pgp-Rsa-Fingerprint: B7 4D 3E E9 11 39 5F A3 90 76 5D 69 58 D9 98 7A X-Pgp-Rsa-Keyid: 1024/34F9F9D5 cc: freebsd-arch@FreeBSD.org Subject: Re: your mail X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: obrien@FreeBSD.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jan 2004 23:18:25 -0000 On Tue, Jan 06, 2004 at 12:32:38AM -0000, John Maclean wrote: > I have read our site http://www.freebsd.org/platforms/index.html as I am > interested in learning C on a UNIX type platform. Could you tell me please > if FreeBsd would run on a Athlon Xp based laptop? Yes FreeBSD will run on Athlon XP-M laptops. This question is appropriate for freebsd-questions, not freebsd-arch. 
From owner-freebsd-arch@FreeBSD.ORG Thu Jan 8 11:00:19 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 89F0A16A4CE for ; Thu, 8 Jan 2004 11:00:19 -0800 (PST) Received: from carver.gumbysoft.com (carver.gumbysoft.com [66.220.23.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 31DBB43D45 for ; Thu, 8 Jan 2004 11:00:17 -0800 (PST) (envelope-from dwhite@gumbysoft.com) Received: by carver.gumbysoft.com (Postfix, from userid 1000) id E073C72DBF; Thu, 8 Jan 2004 11:00:16 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by carver.gumbysoft.com (Postfix) with ESMTP id DE33B72DB5; Thu, 8 Jan 2004 11:00:16 -0800 (PST) Date: Thu, 8 Jan 2004 11:00:16 -0800 (PST) From: Doug White To: Pawel Worach In-Reply-To: <20040107204003.9827.qmail@web61104.mail.yahoo.com> Message-ID: <20040108105941.R14836@carver.gumbysoft.com> References: <20040107204003.9827.qmail@web61104.mail.yahoo.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-arch@freebsd.org Subject: Re: Rewamped clone_root script X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jan 2004 19:00:19 -0000 On Wed, 7 Jan 2004, Pawel Worach wrote: > Hi! > > I have been using this script for a while instad of > /usr/share/examples/diskless/clone_root which is > curretly outdated because of the /lib changes. > > If you like it please feel free to commit. Please file this as a PR so we don't lose it. :) Thanks! 
-- Doug White | FreeBSD: The Power to Serve dwhite@gumbysoft.com | www.FreeBSD.org From owner-freebsd-arch@FreeBSD.ORG Thu Jan 8 16:49:29 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2318D16A4CE; Thu, 8 Jan 2004 16:49:29 -0800 (PST) Received: from smtp1.server.rpi.edu (smtp1.server.rpi.edu [128.113.2.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9426543D41; Thu, 8 Jan 2004 16:49:27 -0800 (PST) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp1.server.rpi.edu (8.12.8/8.12.8) with ESMTP id i090nQLJ028441; Thu, 8 Jan 2004 19:49:26 -0500 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: Date: Thu, 8 Jan 2004 19:49:25 -0500 To: freebsd-ports@FreeBSD.ORG From: Garance A Drosihn Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-Scanned-By: CanIt (www . canit . ca) X-Mailman-Approved-At: Thu, 08 Jan 2004 16:54:29 -0800 Subject: Call for feedback on a Ports-collection change X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jan 2004 00:49:29 -0000 I have been pondering a possible change for the way the ports collection is done. I've done a little exploration into the idea, but I thought I'd ask for more feedback before sinking more time into it. I believe I have someone who would do much of the actual work for this change, so I think I can make it happen, but I want to know if the FreeBSD ports project would be interested in this idea if I come up with some working version. To keep this as a "doable" project, I also have a fairly modest goal: Further reduce the inode-count of the ports collection. That's it. 
There are many things which could be done as a follow-on to this, but that's all I want to try for right now. The method is also pretty modest. The only thing that makes this a big project is the need to do it across the entire ports collection, and without causing any disruption. What I want to do is create one new file per port, and then move almost all the other files into that new file. Ideally each port would end up with just two files. The Makefile, and this new file (some ports might also need a Makefile.inc file). Especially as disks get ever-larger, I think we're better off with fewer-but-larger files, instead of a larger number of tiny files. I would also write a single simple program, which knows how to find the correct info for any given purpose. Thus, the format of the file should not be important. The program would know what to do for both "old-style" and "new-style" ports, so we don't have to convert the entire collection at once. I think the easiest and clearest way to implement this would be one C program, and not 800 lines of /bin/sh commands and deep make-magic. Does this seem like a reasonable project for me to pursue? Does it conflict with other projects which are already in the works to do a similar restructuring? I wouldn't want to start this project if no one thinks it is worth doing. 
[This message is BCC'ed to FreeBSD-arch, but I expect the discussion to happen on FreeBSD-ports] -- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu From owner-freebsd-arch@FreeBSD.ORG Fri Jan 9 11:01:20 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 67B4216A4CE for ; Fri, 9 Jan 2004 11:01:20 -0800 (PST) Received: from smtp.mho.com (smtp.mho.net [64.58.4.6]) by mx1.FreeBSD.org (Postfix) with SMTP id 0B0AA43D3F for ; Fri, 9 Jan 2004 11:01:18 -0800 (PST) (envelope-from scottl@freebsd.org) Received: (qmail 25173 invoked by uid 1002); 9 Jan 2004 19:01:17 -0000 Received: from unknown (HELO freebsd.org) (64.58.1.252) by smtp.mho.net with SMTP; 9 Jan 2004 19:01:17 -0000 Message-ID: <3FFEFA18.1060805@freebsd.org> Date: Fri, 09 Jan 2004 11:59:36 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031103 X-Accept-Language: en-us, en MIME-Version: 1.0 To: arch@freebsd.org Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Subject: Interrupt API change X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jan 2004 19:01:20 -0000 All, At the September DevSummit, Peter Wemm proposed changing the device driver API so that interrupt routines return an INT instead of a VOID. The primary purpose of this is two-fold. The first is so interrupt handlers can communicate back to the low-level interrupt routines whether or not they were able to handle the interrupt. Heuristics can then be built on this information to better detect things like interrupt storms. 
This change also paves the way for the proposal to make interrupts be
multi-tiered. The first-level interrupt handler can relay back whether it
wants the second-level handler to be run (similar to filter interrupt
handlers in Mac OS X).

I'm not ready to go in the multi-level interrupt direction just yet, but
changing the API now will help that later if needed. The change will
consist of changing the driver_intr_t typedef in /sys/sys/bus.h to return
an int, and then doing a sweep of the entire tree. I expect no functional
change from this right now.

Future plans for the interrupt API include the possibility of multi-level
interrupt handlers, and expanding the API to handle message interrupts.
Message interrupts are like what is found in the PCI/PCI-X/PCI-Express
specs under 'Message Signaled Interrupts'. Under MSI, a device sends a
data message over the bus to signal an interrupt instead of raising an
INTx line. The interrupt controller sees this message and turns it into
an interrupt vector for the CPU. Since MSI allows devices to define more
than one message, drivers can in turn have multiple, specific interrupt
handlers. Our interrupt API has no concept of this, and will need to be
enhanced to support the idea of negotiating the number of allowed
messages, and then registering the appropriate handler for each message.
I've not decided exactly how to do this yet, so anyone with knowledge of
the PCI specs is welcome to give input.
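One possible shape for the negotiate-then-register idea, sketched as a
userland model. Everything here (msi_dev, msi_negotiate, msi_register,
msi_deliver, count_intr) is hypothetical and is not an existing FreeBSD
interface. The only facts borrowed from the PCI MSI definition are that a
function may use at most 32 messages and that the granted count is a
power of two.

```c
#include <assert.h>
#include <stddef.h>

#define MSI_MAX_MESSAGES 32	/* MSI allows at most 32 messages per function */

/* Hypothetical per-message handler type; nonzero means "handled". */
typedef int (*msi_handler_t)(void *arg);

/* Hypothetical per-device bookkeeping: one handler slot per message. */
struct msi_dev {
	unsigned granted;			/* messages granted so far */
	msi_handler_t handler[MSI_MAX_MESSAGES];
	void *arg[MSI_MAX_MESSAGES];
};

/*
 * Negotiate the message count: grant the largest power of two that is
 * no larger than what the driver wants and what the system has free.
 */
static unsigned
msi_negotiate(struct msi_dev *d, unsigned wanted, unsigned available)
{
	unsigned limit = wanted < available ? wanted : available;
	unsigned n = 0;

	if (limit > MSI_MAX_MESSAGES)
		limit = MSI_MAX_MESSAGES;
	if (limit > 0) {
		n = 1;
		while (n * 2 <= limit)
			n *= 2;
	}
	d->granted = n;
	return (n);
}

/* Register a handler for one specific message; -1 if out of range. */
static int
msi_register(struct msi_dev *d, unsigned msg, msi_handler_t h, void *arg)
{
	if (msg >= d->granted || h == NULL)
		return (-1);
	d->handler[msg] = h;
	d->arg[msg] = arg;
	return (0);
}

/* Model the controller turning message `msg` into a handler call. */
static int
msi_deliver(struct msi_dev *d, unsigned msg)
{
	if (msg >= d->granted || d->handler[msg] == NULL)
		return (-1);
	return (d->handler[msg](d->arg[msg]));
}

/* Example handler: count how often this message fired. */
static int
count_intr(void *arg)
{
	int *counter = arg;

	(*counter)++;
	return (1);
}
```

A driver that would like six messages gets four in this model, and can
then attach, say, a receive handler to message 0 and a transmit handler
to message 1.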
Thanks, Scott From owner-freebsd-arch@FreeBSD.ORG Fri Jan 9 11:28:49 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9BA3616A4CE; Fri, 9 Jan 2004 11:28:49 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6C0E443D2D; Fri, 9 Jan 2004 11:28:48 -0800 (PST) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id i09JSliw015491; Fri, 9 Jan 2004 14:28:47 -0500 (EST) Date: Fri, 9 Jan 2004 14:28:47 -0500 (EST) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Scott Long In-Reply-To: <3FFEFA18.1060805@freebsd.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org Subject: Re: Interrupt API change X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jan 2004 19:28:49 -0000 On Fri, 9 Jan 2004, Scott Long wrote: > All, > > At the September DevSummit, Peter Wemm proposed changing the device > driver API so that interrupt routines return an INT instead of a VOID. > The primary purpose of this is two-fold. The first is so interrupt > handlers can communicate back to the low-level interrupt routines > whether or not they were able to handle the interrupt. Heuristics > can then be built on this information to better detect things like > interrupt storms. This change also paves the way for the proposal > to make interrupts be multi-tiered. The first level interrupt handler > can relay back whether it wants the second-level handler to be run > (similar to filter interrupt handlers in Max OS X). 
> > I'm not ready to go in the multi-level interrupt direction just yet, > but changing the API now will help that later if needed. The change > will consist of changing the driver_intr_t typedef in /sys/sys/bus.h > to return an int, and then doing a sweep of the entire tree. I expect > no functional change from this right now. Coming from a background in Solaris device drivers, I think is a good idea :) Can I suggest that instead of just returning 0 or non-zero, that you use something more indicative like INTR_CLAIMED / INTR_UNCLAIMED ? From owner-freebsd-arch@FreeBSD.ORG Fri Jan 9 11:34:51 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D0ADD16A4CE for ; Fri, 9 Jan 2004 11:34:51 -0800 (PST) Received: from smtp.mho.com (smtp.mho.net [64.58.4.6]) by mx1.FreeBSD.org (Postfix) with SMTP id 98D5843D1D for ; Fri, 9 Jan 2004 11:34:49 -0800 (PST) (envelope-from scottl@freebsd.org) Received: (qmail 27228 invoked by uid 1002); 9 Jan 2004 19:34:49 -0000 Received: from unknown (HELO freebsd.org) (64.58.1.252) by smtp.mho.net with SMTP; 9 Jan 2004 19:34:49 -0000 Message-ID: <3FFF01F3.7070700@freebsd.org> Date: Fri, 09 Jan 2004 12:33:07 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031103 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniel Eischen References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: arch@FreeBSD.org Subject: Re: Interrupt API change X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jan 2004 19:34:51 -0000 Daniel Eischen wrote: > On Fri, 9 Jan 2004, Scott Long wrote: > > >>All, >> >>At the September DevSummit, Peter Wemm proposed changing the 
device >>driver API so that interrupt routines return an INT instead of a VOID. >>The primary purpose of this is two-fold. The first is so interrupt >>handlers can communicate back to the low-level interrupt routines >>whether or not they were able to handle the interrupt. Heuristics >>can then be built on this information to better detect things like >>interrupt storms. This change also paves the way for the proposal >>to make interrupts be multi-tiered. The first level interrupt handler >>can relay back whether it wants the second-level handler to be run >>(similar to filter interrupt handlers in Max OS X). >> >>I'm not ready to go in the multi-level interrupt direction just yet, >>but changing the API now will help that later if needed. The change >>will consist of changing the driver_intr_t typedef in /sys/sys/bus.h >>to return an int, and then doing a sweep of the entire tree. I expect >>no functional change from this right now. > > > Coming from a background in Solaris device drivers, I think > is a good idea :) Can I suggest that instead of just returning > 0 or non-zero, that you use something more indicative like > INTR_CLAIMED / INTR_UNCLAIMED ? > > > Yes, right after I hit the Send button I realized that I had forgotten to mention this. There will be an enumeration of return codes, but I haven't thought up a good name for them yet. 
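As a rough illustration of what such an enumeration could look like from
the driver side: this is a sketch only. The names INTR_CLAIMED and
INTR_UNCLAIMED are simply the ones suggested above and nothing has been
decided; foo_softc, FOO_INTR_PENDING and foo_intr are made up for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes, using the names suggested in this thread. */
enum intr_result {
	INTR_UNCLAIMED = 0,	/* the device did not raise this interrupt */
	INTR_CLAIMED = 1	/* the device raised it and was serviced */
};

/* The proposed shape: the handler type returns int instead of void. */
typedef int driver_intr_t(void *arg);

/* Made-up device state: a status word with one "interrupt pending" bit. */
#define FOO_INTR_PENDING 0x1u

struct foo_softc {
	unsigned status;	/* stands in for a device status register */
	unsigned handled;	/* interrupts serviced so far */
};

/* A handler that reports whether the interrupt was really its own. */
static int
foo_intr(void *arg)
{
	struct foo_softc *sc = arg;

	if ((sc->status & FOO_INTR_PENDING) == 0)
		return (INTR_UNCLAIMED);
	sc->status &= ~FOO_INTR_PENDING;	/* acknowledge the device */
	sc->handled++;
	return (INTR_CLAIMED);
}
```

Existing handlers converted in a tree-wide sweep could return the
"claimed" value unconditionally, which would keep the change a no-op.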
Scott

From owner-freebsd-arch@FreeBSD.ORG Sat Jan 10 06:09:56 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0CC2116A4CE; Sat, 10 Jan 2004 06:09:56 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0BF4543D41; Sat, 10 Jan 2004 06:09:54 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id BAA10343; Sun, 11 Jan 2004 01:09:50 +1100 Date: Sun, 11 Jan 2004 01:09:49 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Scott Long In-Reply-To: <3FFEFA18.1060805@freebsd.org> Message-ID: <20040111005502.O22604@gamplex.bde.org> References: <3FFEFA18.1060805@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org Subject: Re: Interrupt API change X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 10 Jan 2004 14:09:56 -0000

On Fri, 9 Jan 2004, Scott Long wrote:

> At the September DevSummit, Peter Wemm proposed changing the device
> driver API so that interrupt routines return an INT instead of a VOID.

This design error was rejected last time it was discussed (about 8-10
years ago).

> The primary purpose of this is two-fold. The first is so interrupt
> handlers can communicate back to the low-level interrupt routines
> whether or not they were able to handle the interrupt. Heuristics
> can then be built on this information to better detect things like
> interrupt storms.

That's about all it can do.
In the shared irq case, there is no alternative to calling all the
handlers if one of the handlers did something, since activity by one
handler is unrelated to activity by others (except as an optimization
that only works in the edge-triggered case -- devices rarely interrupt
concurrently, so assuming that they never do is usually most efficient).
In the non-shared case, individual handlers can better decide about
interrupt storms in a device-specific way. The non-shared case will
hopefully be almost all cases when APIC support becomes standard on
i386's.

> This change also paves the way for the proposal
> to make interrupts be multi-tiered.

That is another reason not to do it. Interrupts shouldn't be multi-tiered
except in special cases.

> The first level interrupt handler
> can relay back whether it wants the second-level handler to be run
> (similar to filter interrupt handlers in Mac OS X).

The first level interrupt handler should call (or schedule) other levels
as necessary (as in RELENG_4, but not using inefficient scheduling as in
-current).

Bruce

From owner-freebsd-arch@FreeBSD.ORG Sat Jan 10 06:15:30 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 18F2916A4CE; Sat, 10 Jan 2004 06:15:30 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8D7AA43D1D; Sat, 10 Jan 2004 06:15:26 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.10/8.12.10) with ESMTP id i0AEFIFu060043; Sat, 10 Jan 2004 15:15:24 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: Bruce Evans From: "Poul-Henning Kamp" In-Reply-To: Your message of "Sun, 11 Jan 2004 01:09:49 +1100."
<20040111005502.O22604@gamplex.bde.org> Date: Sat, 10 Jan 2004 15:15:18 +0100 Message-ID: <60042.1073744118@critter.freebsd.dk> cc: arch@freebsd.org Subject: Re: Interrupt API change X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 10 Jan 2004 14:15:30 -0000 In message <20040111005502.O22604@gamplex.bde.org>, Bruce Evans writes: >> Heuristics >> can then be built on this information to better detect things like >> interrupt storms. > >That'a about all it can do. Considering the fatality of interrupt storms, this alone is reason enough to implement the change. >In the non-shared case, >individual handlers can better decide about interrupt storms in a >device-specific way. Considering that interrupt storms more often than not are caused by driver bugs, expecting the driver to mitigate seems terminally optimistic to me. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
From owner-freebsd-arch@FreeBSD.ORG Sat Jan 10 20:14:55 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0371116A4CE for ; Sat, 10 Jan 2004 20:14:55 -0800 (PST) Received: from smtp.mho.com (smtp.mho.net [64.58.4.6]) by mx1.FreeBSD.org (Postfix) with SMTP id 8923C43D46 for ; Sat, 10 Jan 2004 20:14:53 -0800 (PST) (envelope-from scottl@freebsd.org) Received: (qmail 22706 invoked by uid 1002); 11 Jan 2004 04:14:50 -0000 Received: from unknown (HELO freebsd.org) (64.58.1.252) by smtp.mho.net with SMTP; 11 Jan 2004 04:14:50 -0000 Message-ID: <4000CD54.30801@freebsd.org> Date: Sat, 10 Jan 2004 21:13:08 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031103 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Bruce Evans References: <3FFEFA18.1060805@freebsd.org> <20040111005502.O22604@gamplex.bde.org> In-Reply-To: <20040111005502.O22604@gamplex.bde.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: arch@freebsd.org Subject: Re: Interrupt API change X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 11 Jan 2004 04:14:55 -0000 Bruce Evans wrote: > > That'a about all it can do. In the shared irq case, there is no > alternative to calling all the handlers if one of the handlers did > something, since activity by one handler is unrelated to activity by > others (except as an optimization that only works in the edge triggered > case -- devices usually rarely interrupt concurrently so assuming that > thety never do is usually most efficient). In the non-shared case, > individual handlers can better decide about interrupt storms in a > device-specific way. 
The non-shared case will hopefully be almost all > cases when APIC support becomes standard on i386's. > This is way too overly optimistic. Interrupt routing is still limited by things like the number of physical PCI INTx lines. The APIC can't do anything about devices that share the same physical line. MSI will help this, but I suspect that MSI will only be supported on the higher-end PCI/PCI-X cards, at least until PCI-Express is adopted. I wouldn't expect the shared PCI INTx line problem to go away for at least another 5-7 years. There is no reason to duplicate interrupt storm heuristics in every single PCI driver. For now, the change will be essentially a no-op. However, getting it in will allow us to experiment with it in the future with ease. I'm not advocating that we break shared interrupt semantics and use this to short-circuit handlers. > > The first level interrupt handler should call (or schedule) other levels > as necessary (as in RELENG_4, but not using inefficient scheduling as in > -current). I understand, and that's why I haven't committed to doing it yet. > > Bruce >
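The dispatch side discussed in this thread can be sketched in userland C
as follows. All names (irq_line, irq_dispatch, pending_intr, STRAY_LIMIT,
IRQ_MAX_HANDLERS) and the threshold of 5 are assumptions made for the
example, not existing kernel code. Every handler on the shared line is
still called, with no short-circuiting; the return values only feed a
counter that flags a possible storm in one central place instead of in
each driver.

```c
#include <assert.h>
#include <stddef.h>

#define IRQ_MAX_HANDLERS 4	/* arbitrary size for the sketch */
#define STRAY_LIMIT 5		/* hypothetical storm threshold */

/* Handler type for the sketch: nonzero means "this was my interrupt". */
typedef int intr_handler_t(void *arg);

/* Hypothetical state for one shared interrupt line. */
struct irq_line {
	intr_handler_t *handler[IRQ_MAX_HANDLERS];
	void *arg[IRQ_MAX_HANDLERS];
	size_t nhandlers;
	unsigned consecutive_stray;	/* interrupts in a row nobody claimed */
	int storming;			/* set once the threshold is reached */
};

/* Example handler: claims the interrupt only if its device is pending. */
static int
pending_intr(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return (0);
	*pending = 0;		/* service and acknowledge the device */
	return (1);
}

/*
 * Dispatch one interrupt on a shared line.  All handlers run regardless
 * of what earlier ones returned; the results are used only to track how
 * many interrupts in a row went unclaimed.
 */
static int
irq_dispatch(struct irq_line *l)
{
	size_t i;
	int claimed = 0;

	assert(l->nhandlers <= IRQ_MAX_HANDLERS);
	for (i = 0; i < l->nhandlers; i++) {
		if (l->handler[i](l->arg[i]) != 0)
			claimed++;
	}
	if (claimed > 0)
		l->consecutive_stray = 0;
	else if (++l->consecutive_stray >= STRAY_LIMIT)
		l->storming = 1;
	return (claimed);
}
```

What to do once the flag is set (log it, throttle the line, mask it) is
left open here, as it is in the discussion.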