From owner-freebsd-fs Mon Apr 29 4:41:38 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from mk-smarthost-2.mail.uk.tiscali.com (mk-smarthost-2.mail.uk.worldonline.com [212.74.112.72])
	by hub.freebsd.org (Postfix) with ESMTP id 173CB37B41B;
	Mon, 29 Apr 2002 04:41:32 -0700 (PDT)
Received: from [212.139.129.125] (helo=bloodhound.uk.worldonline.com)
	by mk-smarthost-2.mail.uk.tiscali.com with esmtp (Exim 3.35 #1)
	id 1729WY-000HaN-00; Mon, 29 Apr 2002 12:40:54 +0100
Received: from brian by bloodhound.uk.worldonline.com with local (Exim 3.22 #1)
	id 1729X7-0000vS-00; Mon, 29 Apr 2002 12:41:29 +0100
Date: Mon, 29 Apr 2002 12:41:29 +0100
From: Brian Candler
To: freebsd-fs@freebsd.org, freebsd-net@freebsd.org
Subject: Re: NFS clearing attribute cache in nfs_open
Message-ID: <20020429124129.A3409@linnet.org>
References: <20020426181535.B2748@linnet.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20020426181535.B2748@linnet.org>; from B.Candler@pobox.com on Fri, Apr 26, 2002 at 06:15:35PM +0100
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Thanks for comments so far. It looks like the best solution for me is going to be to boot diskless into a ramdisk. It's been quite a job finding out how to do that; there doesn't seem to be much in the way of documentation. Reverse-engineering the source suggests that the following methods exist:

(1) Including an MD image within the kernel itself

    options MD_ROOT
    options MD_ROOT_SIZE n    # kilobytes

then use usr/src/release/write_mfs_in_kernel.c to patch the image into the kernel. Trouble is you must reserve sufficient space for the image, and repatch a clean kernel every time you wish to change the ramdisk.

(2) Module preload mechanism

dev/md/md.c calls preload_search_info(mod, MODINFO_NAME)
This accepts a 'module' of type "md_image" or "mfs_root"
It then creates a ramdisk using data at MODINFO_ADDR of MODINFO_SIZE.
These in turn come from sys/kern/subr_module.c which uses a block of "preload_metadata"

As far as I can tell, either I can put something like

    load -t md_image myfile

in boot/loader.rc, or do what release/Makefile does, which is to put

    mfsroot_load="YES"
    mfsroot_type="mfs_root"
    mfsroot_name="/boot/mfsroot"

in boot/loader.conf.

I'm not sure if that's sufficient to replace the NFS root, or if I have to recompile the kernel _without_ options BOOTP and BOOTP_NFSROOT for this to work, but I'll give it a go later on...

Cheers,

Brian.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Mon Apr 29 10: 4:40 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from mail102.csoft.net (lilly.csoft.net [63.111.22.101])
	by hub.freebsd.org (Postfix) with SMTP id D6E9737B42A
	for ; Mon, 29 Apr 2002 10:04:11 -0700 (PDT)
Received: (qmail 96487 invoked by uid 1234); 29 Apr 2002 17:00:28 -0000
Received: from localhost (sendmail-bs@127.0.0.1)
	by localhost with SMTP; 29 Apr 2002 17:00:28 -0000
Date: Mon, 29 Apr 2002 12:00:28 -0500 (EST)
From: Joshua Steele
X-X-Sender: jsteele@lilly
To: freebsd-fs@freebsd.org
Subject: newfs overwrite...
Message-ID: <20020429115813.M95917-100000@lilly>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

I had a 60 gig IDE hard drive that was full of business data overwritten when upgrading a system to fbsd 4.5 The newfs program was executed on the drive, and (i assume) that the file system table was overwritten.
I do not think the drive was completely formatted, because the process took less than 30 seconds. Is there a way to reverse this process and get the old fs table back, or rebuild it..i really need the data. Any suggestions/comments would be well appreciated. Joshua Steele To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Apr 29 10:18:28 2002 Delivered-To: freebsd-fs@freebsd.org Received: from mail102.csoft.net (lilly.csoft.net [63.111.22.101]) by hub.freebsd.org (Postfix) with SMTP id BBD2837B404 for ; Mon, 29 Apr 2002 10:18:25 -0700 (PDT) Received: (qmail 98630 invoked by uid 1234); 29 Apr 2002 17:14:42 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 29 Apr 2002 17:14:42 -0000 Date: Mon, 29 Apr 2002 12:14:42 -0500 (EST) From: Joshua Steele X-X-Sender: jsteele@lilly To: Michael Sierchio Cc: freebsd-fs@freebsd.org Subject: Re: newfs overwrite... In-Reply-To: <3CCD7E14.2070809@tenebras.com> Message-ID: <20020429121106.V97112-100000@lilly> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Well..this was the backup/storage server. I contacted drivesavers, and its going to be about 7,000.00US to get it fixed by them...which is not an option because i do not have that much in resources to get the drive fixed (i am a small business) Are there any other tools, etc. for freebsd that aide in rebuilding the fs table? Or am i basically not going to be able to repair the drive, and might as well move on and start salvaging what financial data i do have at the current time before the tax quarter is up.... 
Joshua Steele Codefusion Internet Services http://www.CodefusionIS.com (301) 777-1142 On Mon, 29 Apr 2002, Michael Sierchio wrote: > Joshua Steele wrote: > > I had a 60 gig IDE hard drive that was full of business data overwritten > > when upgrading a system to fbsd 4.5 The newfs program was executed on the > > drive, and (i assume) that the file system table was overwritten. I do > > not think the drive was completely formatted, because the process took > > less than 30 seconds. Is there a way to reverse this process and get the > > old fs table back, or rebuild it..i really need the data. Any > > suggestions/comments would be well appreciated. > > You could try one of the numerous commercial services such as > drivesavers.com (@ about $100/MB last I heard), or simple restore > from your latest backup ;-) > > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Apr 29 11: 1:21 2002 Delivered-To: freebsd-fs@freebsd.org Received: from smtp.comcast.net (smtp.comcast.net [24.153.64.2]) by hub.freebsd.org (Postfix) with ESMTP id 139B537B41A for ; Mon, 29 Apr 2002 11:00:37 -0700 (PDT) Received: from leto (pcp529856pcs.nash01.tn.comcast.net [68.52.131.181]) by mtaout45-01.icomcast.net (iPlanet Messaging Server 5.1 HotFix 0.3 (built Apr 8 2002)) with ESMTP id <0GVC006B0COXVM@mtaout45-01.icomcast.net> for freebsd-fs@freebsd.org; Mon, 29 Apr 2002 14:00:33 -0400 (EDT) Date: Mon, 29 Apr 2002 12:56:11 -0500 (CDT) From: "Brandon D. Valentine" Subject: Re: newfs overwrite... 
In-reply-to: <20020429121106.V97112-100000@lilly>
X-X-Sender: bandix@leto.homeportal.2wire.net
To: Joshua Steele
Cc: Michael Sierchio, freebsd-fs@freebsd.org
Message-id: <20020429124054.X1710-100000@leto.homeportal.2wire.net>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=US-ASCII
Content-transfer-encoding: 7BIT
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Mon, 29 Apr 2002, Joshua Steele wrote:

>Well..this was the backup/storage server. I contacted drivesavers, and
>its going to be about 7,000.00US to get it fixed by them...which is not an
>option because i do not have that much in resources to get the drive fixed
>(i am a small business)

ALWAYS ALWAYS ALWAYS back up critical data to tape. Sorry to be so harsh, but let this be a lesson to you. Recently my boss blew away a fair bit of PHP coding with an unintentionally greedy rm command. He would have felt awful except that 24 hours had elapsed since the file was last modified and so my backup schedule (courtesy of amanda) had already put the file onto a tape. It took a matter of minutes to restore it, literally.

$7k is an expensive lesson in the relative unreliability of hard drives. For under $10k you can get a 19-tape AIT-2 library from Overland Data w/ a full complement of tapes. Amanda is free software. Protect your data. If you can't afford the hardware up front, there are online merchants who will help you finance it. The investment pays off the minute something like this happens.

>Are there any other tools, etc. for freebsd that aide in rebuilding the fs
>table? Or am i basically not going to be able to repair the drive, and
>might as well move on and start salvaging what financial data i do have at
>the current time before the tax quarter is up....

I recently used gpart (ports/sysutils/gpart) to recover the partition table on a botched WinXP system for a friend.
I don't know if it can actually recover data from a newfs'd filesystem. The system I used it on had only had its partition table wiped. It's worth a try though. If that doesn't work you might start reading fs(5) and friends and see if you can't come up with something creative. The data is still there on disk but since you newfs'd it you'll be without the inode-to-file mapping that makes the data meaningful. My suggestion is, don't ever mount that drive rw, only mount it read-only so you don't overwrite what's hiding on the disk. The data is at least in some form recoverable, it's just a matter of how much time and effort you need to put into it. At a certain number of hours of work that $7k starts to sound like a bargain. Good luck! Brandon D. Valentine -- "Time to resign from the human race, wipe those tears from your lovely face. Baby, wave to the man in the ol' red caboose before all hell breaks loose." - Kinky Friedman To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Apr 29 15:30:38 2002 Delivered-To: freebsd-fs@freebsd.org Received: from mailhub.kki.krakow.pl (nova.kki.krakow.pl [195.116.9.2]) by hub.freebsd.org (Postfix) with ESMTP id 890BA37B405 for ; Mon, 29 Apr 2002 15:30:34 -0700 (PDT) Received: (from root@localhost) by mailhub.kki.krakow.pl (8.11.2/8.11.2) id g3U1bJ606322; Tue, 30 Apr 2002 03:37:19 +0200 Date: Tue, 30 Apr 2002 03:37:19 +0200 From: scaner-virus@mailhub.kki.krakow.pl Message-Id: <200204300137.g3U1bJ606322@mailhub.kki.krakow.pl> To: freebsd-fs@freebsd.org Subject: BYL WIRUS W POCZCIE DO CIEBIE OD vw-audi@kki.krakow.pl (VIRUS IN MAIL FOR YOU FROM vw-audi@kki.krakow.pl) Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org V I R U S A L E R T Our viruschecker found the System antyvirusowy w Krakowski Komercyjny Internet znalazl VIRUS !! 
o nazwie: I-Worm.Klez.h virus(es) in an email to you from: w poczcie do Ciebie od: vw-audi@kki.krakow.pl Delivery of the email was stopped! Dostarczanie listu zostalo wstrzymane! The ID of your quarantined message is: numer identyfikacyjny chorego listu jest: virus-20020430-033719-6135 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Apr 29 15:45:26 2002 Delivered-To: freebsd-fs@freebsd.org Received: from mail.allcaps.org (mail.allcaps.org [208.252.245.17]) by hub.freebsd.org (Postfix) with ESMTP id 00A7137B404 for ; Mon, 29 Apr 2002 15:45:24 -0700 (PDT) Received: from mail.allcaps.org (localhost [127.0.0.1]) by mail.allcaps.org (Postfix) with ESMTP id 83C7632601 for ; Mon, 29 Apr 2002 15:45:23 -0700 (PDT) Received: from localhost (bsder@localhost) by mail.allcaps.org (8.12.3/8.12.3/Submit) with ESMTP id g3TMjNJ9016549 for ; Mon, 29 Apr 2002 15:45:23 -0700 (PDT) X-Authentication-Warning: mail.allcaps.org: bsder owned process doing -bs Date: Mon, 29 Apr 2002 15:45:23 -0700 (PDT) From: "Andrew P. Lentvorski" To: Subject: Non-standard root filesystems Message-ID: <20020429153020.Q16532-100000@mail.allcaps.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Could someone fill me in on the current status of either A) remounting a root filesystem or B) mounting a non-standard fs as a root fs? I've bumped into this three times now. Twice as part of the vinum stuff and a third as part of a solid-state media problem. In the case of vinum, I actually had a client choose a Linux system because its rootfs is software RAIDable. Other people are also starting to bump into the issue as brought up in the NFS attribute thread. 
(AFS can't be used because the root filesystem can't be remounted completely) What is the current status? Or, alternatively, what are the issues with implementation? Is it a political problem (no perceived need/low priority/coming in FreeBSD 5.X) or is it a massive amount of work problem? Thanks, -a To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Apr 29 16:16:55 2002 Delivered-To: freebsd-fs@freebsd.org Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11]) by hub.freebsd.org (Postfix) with SMTP id 9D75037B41A for ; Mon, 29 Apr 2002 16:16:48 -0700 (PDT) Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ; 30 Apr 2002 00:16:47 +0100 (BST) To: Joshua Steele Cc: Michael Sierchio , freebsd-fs@freebsd.org Subject: Re: newfs overwrite... In-Reply-To: Your message of "Mon, 29 Apr 2002 12:14:42 CDT." <20020429121106.V97112-100000@lilly> Date: Tue, 30 Apr 2002 00:16:47 +0100 From: Ian Dowse Message-ID: <200204300016.aa17293@salmon.maths.tcd.ie> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org In message <20020429121106.V97112-100000@lilly>, Joshua Steele writes: >Are there any other tools, etc. for freebsd that aide in rebuilding the fs >table? Or am i basically not going to be able to repair the drive, and >might as well move on and start salvaging what financial data i do have at >the current time before the tax quarter is up.... If the newfs was run with the same parameters as when the filesystem was originally created, then all of the top-level metadata will have been completely obliterated (used/free block lists, used/free inode lists, the inode structures themselves, and the top-level mapping between inodes and file blocks). This means that all of the records linking file names/types to file data blocks are gone. 
The data contained in the files is probably still intact, but it is scattered in block-sized chunks across the disk, interleaved with blocks from deleted files and anything else that has ever been written to the filesystem. About the only thing you can easily do to recover some fragments of text data is something like strings /dev/whatever | grep -100 'some string' where 'some string' is a text string contained in the data you want. If there are lots of files that begin with a known header (e.g. word documents), you might have some luck with a program that extracts N blocks after every block that has the right header at the start. There's a simple example at http://www.maths.tcd.ie/~iedowse/FreeBSD/docfind.pl but it is dumb and assumes that all office documents are 256k long word documents. Ian To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Apr 29 16:16:56 2002 Delivered-To: freebsd-fs@freebsd.org Received: from smtp.comcast.net (smtp.comcast.net [24.153.64.2]) by hub.freebsd.org (Postfix) with ESMTP id 103B037B404 for ; Mon, 29 Apr 2002 16:16:51 -0700 (PDT) Received: from leto (pcp529856pcs.nash01.tn.comcast.net [68.52.131.181]) by mtaout02.icomcast.net (iPlanet Messaging Server 5.1 HotFix 0.3 (built Apr 8 2002)) with ESMTP id <0GVC00HAARC13A@mtaout02.icomcast.net> for freebsd-fs@freebsd.org; Mon, 29 Apr 2002 19:16:50 -0400 (EDT) Date: Mon, 29 Apr 2002 18:12:24 -0500 (CDT) From: "Brandon D. Valentine" Subject: Re: Non-standard root filesystems In-reply-to: <20020429153020.Q16532-100000@mail.allcaps.org> X-X-Sender: bandix@leto.homeportal.2wire.net To: "Andrew P. 
Lentvorski" Cc: freebsd-fs@freebsd.org Message-id: <20020429180845.F2248-100000@leto.homeportal.2wire.net> MIME-version: 1.0 Content-type: TEXT/PLAIN; charset=US-ASCII Content-transfer-encoding: 7BIT Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Mon, 29 Apr 2002, Andrew P. Lentvorski wrote: >I've bumped into this three times now. Twice as part of the vinum stuff >and a third as part of a solid-state media problem. In the case of vinum, >I actually had a client choose a Linux system because its rootfs is >software RAIDable. Not being up on the latest in FreeBSD's software RAID I cannot comment on how evolved it might be. However, if nobody here can help you I'd suggest looking at NetBSD's RAIDframe, which has very good support for software RAIDed root filesystems. I myself am of the persuasion that the only particularly good reason to make / a RAID filesystem is to add fault tolerance, which you don't /really/ get with software RAID. I tend to go with SCSI hardware RAID cards doing RAID1 mirroring for my server system drives. Brandon D. Valentine -- "Time to resign from the human race, wipe those tears from your lovely face. Baby, wave to the man in the ol' red caboose before all hell breaks loose." 
- Kinky Friedman To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 5: 9:54 2002 Delivered-To: freebsd-fs@freebsd.org Received: from goanga.com (goanga.com [193.231.240.30]) by hub.freebsd.org (Postfix) with ESMTP id 4EC6437B419; Tue, 30 Apr 2002 05:09:47 -0700 (PDT) Received: from abc.ro (goanga.com [193.231.240.30]) by goanga.com (8.11.3/8.11.3) with ESMTP id g3UC9cg80311; Tue, 30 Apr 2002 15:09:39 +0300 (EEST) (envelope-from andrei@abc.ro) Message-ID: <3CCE8982.6A915F2B@abc.ro> Date: Tue, 30 Apr 2002 15:09:38 +0300 From: ANdrei Organization: Cronon AG - tech department X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.2.12 i386) X-Accept-Language: de, ro, en MIME-Version: 1.0 To: FS@FREEBSD.ORG, bugs@FREEBSD.ORG Subject: xterm & directory cat Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org why is it possible to do a "cat" o a directory in freebsd? and, if this is intended (for whatever purpose, but i can't think of any reasonable...), maybe others can verify that this crashes your xterm, if you run this command form a xterm under X... actually, it changes your character set (or whatever, i'm not much into how this works), but the effect is that you can't use your terminal any more (in a normal way :) hope i'm not missing smtg, and this really s a bug, and i'm posting to the right lists... feedback is appreciated, but please cc me, cause i'm not subscribed... aloha, ANdrei -- ----------------------------------[ http://www.goanga.com ]-- Never take life seriously. _ _ Nobody gets out alive anyway. 
o' \.=./ `o (o o) -----------------------------------------ooO--(_)--Ooo------- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 5:53:44 2002 Delivered-To: freebsd-fs@freebsd.org Received: from areilly.bpc-users.org (CPE-144-132-240-160.nsw.bigpond.net.au [144.132.240.160]) by hub.freebsd.org (Postfix) with SMTP id B898B37B404 for ; Tue, 30 Apr 2002 05:53:38 -0700 (PDT) Received: (qmail 4370 invoked from network); 30 Apr 2002 12:53:38 -0000 Received: from localhost (andrew@127.0.0.1) by localhost with SMTP; 30 Apr 2002 12:53:38 -0000 Subject: Re: xterm & directory cat From: Andrew Reilly To: ANdrei Cc: fs@freebsd.org, bugs@freebsd.org In-Reply-To: <3CCE8982.6A915F2B@abc.ro> References: <3CCE8982.6A915F2B@abc.ro> Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.3 Date: 30 Apr 2002 22:53:38 +1000 Message-Id: <1020171218.3085.26.camel@gurney.reilly.home> Mime-Version: 1.0 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Tue, 2002-04-30 at 22:09, ANdrei wrote: > why is it possible to do a "cat" o a directory in freebsd? and, if this Why would it not be? Directories in any unix system are just files. Well, they have slightly more significance than files, but behave as files in many respects. The format is binary, and consists of the text string of the name and the inode number of the file. > is intended (for whatever purpose, but i can't think of any > reasonable...), maybe others can verify that this crashes your xterm, if > you run this command form a xterm under X... actually, it changes your > character set (or whatever, i'm not much into how this works), but the > effect is that you can't use your terminal any more (in a normal way :) That will happen if you cat just about _any_ binary file. 
Xterm emulates an ANSI terminal, and some combinations of random binary characters are quite likely to be interpreted as control sequences that could do just about anything. Putting the terminal into a funny mode is very likely. Try using hd or less instead: they display the non-ASCII characters more usefully. > hope i'm not missing smtg, and this really s a bug, and i'm posting to > the right lists... Not sure what the right list would be. Perhaps -questions. It's not a bug, anyway. > feedback is appreciated, but please cc me, cause i'm not subscribed... OK. -- Andrew To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 8:25:11 2002 Delivered-To: freebsd-fs@freebsd.org Received: from falcon.prod.itd.earthlink.net (falcon.mail.pas.earthlink.net [207.217.120.74]) by hub.freebsd.org (Postfix) with ESMTP id 0C5E737B41C; Tue, 30 Apr 2002 08:25:03 -0700 (PDT) Received: from pool0453.cvx21-bradley.dialup.earthlink.net ([209.179.193.198] helo=mindspring.com) by falcon.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 172ZUg-0001gf-00; Tue, 30 Apr 2002 08:24:43 -0700 Message-ID: <3CCEB71D.1AD1F911@mindspring.com> Date: Tue, 30 Apr 2002 08:24:13 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: ANdrei Cc: FS@FREEBSD.ORG, bugs@FREEBSD.ORG Subject: Re: xterm & directory cat References: <3CCE8982.6A915F2B@abc.ro> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org ANdrei wrote: > why is it possible to do a "cat" o a directory in freebsd? Because directories are implemented as files. It's an implementation detail. 
It has the side effect of letting you mmap large directories to get them faulted in quickly, if you write your code correctly. > and, if this > is intended (for whatever purpose, but i can't think of any > reasonable...), maybe others can verify that this crashes your xterm, if > you run this command form a xterm under X... actually, it changes your > character set (or whatever, i'm not much into how this works), but the > effect is that you can't use your terminal any more (in a normal way :) Try cat'ing /bin/ls. It's not crashing your xterm, by the way, it's just sending it an escape sequence that locks it up. > hope i'm not missing smtg, and this really s a bug, and i'm posting to > the right lists... > feedback is appreciated, but please cc me, cause i'm not subscribed... The right list would have been -questions. After your xterm is "crashed", use control-right-mouse-button "full reset". Your xterm will "uncrash". -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 8:41:19 2002 Delivered-To: freebsd-fs@freebsd.org Received: from accms33.physik.rwth-aachen.de (accms33.physik.RWTH-Aachen.DE [137.226.46.133]) by hub.freebsd.org (Postfix) with ESMTP id 22D8337B41C; Tue, 30 Apr 2002 08:41:14 -0700 (PDT) Received: (from kuku@localhost) by accms33.physik.rwth-aachen.de (8.9.3/8.9.3) id RAA24705; Tue, 30 Apr 2002 17:40:57 +0200 Date: Tue, 30 Apr 2002 17:40:57 +0200 From: Christoph Kukulies To: Terry Lambert Cc: ANdrei , FS@FreeBSD.ORG, bugs@FreeBSD.ORG Subject: Re: xterm & directory cat Message-ID: <20020430174057.A24695@gilberto.physik.rwth-aachen.de> References: <3CCE8982.6A915F2B@abc.ro> <3CCEB71D.1AD1F911@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3CCEB71D.1AD1F911@mindspring.com>; from tlambert2@mindspring.com on Tue, Apr 30, 2002 at 08:24:13AM -0700 Sender: 
owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Tue, Apr 30, 2002 at 08:24:13AM -0700, Terry Lambert wrote: > ANdrei wrote: > > and, if this > > is intended (for whatever purpose, but i can't think of any > > reasonable...), maybe others can verify that this crashes your xterm, if > > you run this command form a xterm under X... actually, it changes your > > character set (or whatever, i'm not much into how this works), but the > > effect is that you can't use your terminal any more (in a normal way :) > > Try cat'ing /bin/ls. > > It's not crashing your xterm, by the way, it's just sending it an > escape sequence that locks it up. > > > > hope i'm not missing smtg, and this really s a bug, and i'm posting to > > the right lists... > > feedback is appreciated, but please cc me, cause i'm not subscribed... > > The right list would have been -questions. > > After your xterm is "crashed", use control-right-mouse-button Pardon Sir, control-middle-mouse-button :-) > "full reset". Your xterm will "uncrash". > > -- Terry -- Chris Christoph P. U. 
Kukulies kukulies@rwth-aachen.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 9:28:35 2002 Delivered-To: freebsd-fs@freebsd.org Received: from scaup.prod.itd.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49]) by hub.freebsd.org (Postfix) with ESMTP id 0CE8037B41E for ; Tue, 30 Apr 2002 09:28:29 -0700 (PDT) Received: from pool0495.cvx21-bradley.dialup.earthlink.net ([209.179.193.240] helo=mindspring.com) by scaup.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 172aUE-0007JO-00; Tue, 30 Apr 2002 09:28:18 -0700 Message-ID: <3CCEC5E5.FED0CBF@mindspring.com> Date: Tue, 30 Apr 2002 09:27:17 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Joshua Steele Cc: Michael Sierchio , freebsd-fs@freebsd.org Subject: Re: newfs overwrite... References: <20020429121106.V97112-100000@lilly> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Joshua Steele wrote: > Well..this was the backup/storage server. I contacted drivesavers, and > its going to be about 7,000.00US to get it fixed by them...which is not an > option because i do not have that much in resources to get the drive fixed > (i am a small business) > > Are there any other tools, etc. for freebsd that aide in rebuilding the fs > table? Or am i basically not going to be able to repair the drive, and > might as well move on and start salvaging what financial data i do have at > the current time before the tax quarter is up.... 
Buy a much-larger-than-60G disk (preferably, more than twice as large), and:

1)  dd the image of the 60G disk into a single file

    (Note: not really necessary, but it prevents you from screwing up your "live" disk.)

2)  Start copying out chunks of data based on cylinder groups, and identification of secondary indirect blocks

The data better be *really* valuable, as this is a manual, labor intensive operation. If it's recognizable to a human, then you are going to be doing a lot of looking; if it's not, you are going to be using the remainder of the disk space to write some programs to recover particular types of file contents. It's always easiest if the drive is human readable.

I recovered a good 250,000 lines of source code from a spammed drive this way, at one time in my misspent youth, so that the project, due in a couple of days after the fact, would not be turned in late.

The main problem is that when you delete a file, the physical analogy is to take the contents out of the file folder, rip the label off the file folder, and then shuffle the pages that were in the folder into your blank printer paper (knowing that the printer will erase them before it prints on them), after which you throw the file folder back into the supply cabinet.

You've basically done this with all your files.

If the papers don't contain binary information (e.g. the moral equivalent of encrypted data, in terms of being able to identify which piece of paper goes in which file folder, or which piece of paper goes in what order), then it's just a big sorting job.

If it's binary data, you can basically perform an iterative search based on your knowledge of the contents, in order to recover the data. For an executable, this is probably not worthwhile (you can always replace it), but identifying "magic numbers" for things like PostScript, ELF executables, etc., is actually very easy; the remainder of the file, less so.
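
[Editor's sketch, not part of the original message.] The "magic number" search described above can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions: it works on the dd'd image file rather than the live disk, it assumes file contents begin on a 1 KiB boundary (the `step` value is a guess and should be set to the filesystem's actual fragment size), and the names `SIGNATURES`, `find_magic` and `carve` are hypothetical helpers invented for illustration. It only finds where candidate files start; as the message says, working out the rest of each file is the hard part.

```python
# Sketch: scan a disk image for well-known file-start signatures.
# Assumption: the two signatures below (ELF and PostScript) are just
# examples; extend the table for the file types you actually care about.
SIGNATURES = {
    b"\x7fELF": "ELF executable",
    b"%!PS": "PostScript",
}


def find_magic(image_path, step=1024, signatures=SIGNATURES):
    """Return (offset, label) for each step-aligned chunk that begins
    with one of the known signatures."""
    hits = []
    offset = 0
    # Open read-only: nothing here ever writes to the image.
    with open(image_path, "rb") as image:
        while True:
            chunk = image.read(step)
            if not chunk:
                break
            for sig, label in signatures.items():
                if chunk.startswith(sig):
                    hits.append((offset, label))
                    break
            offset += step
    return hits


def carve(image_path, offset, length):
    """Copy `length` bytes starting at `offset` out of the image so a
    human (or another tool) can inspect the candidate file."""
    with open(image_path, "rb") as image:
        image.seek(offset)
        return image.read(length)
```

A possible use would be to call `find_magic("disk.img")` on the image made in step 1, then `carve()` a generous number of bytes after each hit and inspect the result by hand. The carved data is only contiguous if the file happened to be laid out contiguously on disk, which is not guaranteed.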
The other hint you have is that every set of 9 pages in large file folders is "stapled together" -- members of the same cylinder group.

If you have a rough idea of the FS size (which you do), then examining the post-newfs disk read-only will tell you where all the FS layout information lives. From this, you can probably recover directory information pretty easily, which can give you inode and relative cylinder group information; doing this requires a fairly deep understanding of the FS in question.

The drive recovery place might be a deal. Basically, they copy the normally readable data off the disk, and then read the disk, taking head hysteresis into account, to recover the misaligned track writes, if any, to recover the data (which is why MILSPEC erasure requires the writing of patterned data to the disk, from both seek directions, to achieve erasure of "secret" data).

From a theoretical standpoint:

o   Everything above is predicated on the idea you are using FFS. If you use another FS, the recovery details become very much easier or very much harder, depending on the FS.

o   It's pretty trivial to change the process to lazy-bind the contents of deleted information, so that instead of writing zeroed inodes to the disk, you leave the index information intact, and only zero it on reallocation; this makes undeleting files a lot easier, because it doesn't put the unlabelled file folder back into the file cabinet. It also leaves the papers in the folder, though they are available for the printer to grab and clear at random, if it's asked to print (saving new files to the FS may overwrite "deleted" data). This would be a rather simple operation for FFS, actually.

o   It's also pretty trivial to change it so that formatting actually scrubs the disk, and then deletion also scrubs the disk.
In combination, this would be a bad thing, but separately, it would allow you to recover a lot of data much more quickly, by being able to rule out large amounts of disk space from consideration. o It's pretty trivial to change the formatting process to resemble the Windows formatting process, which means that the newfs can be made largely reversible. This is actually probably a pretty good idea, for general small businesses like yours, actually. No one has seriously attempted to productize UNIX, yet... not even Univel, back in the day. Anecdote time: One thing we often did at the local university any time a machine was donated was to first undelete everything, and see if there were games on the disks. The FS layout helped us considerably. This was before doing such things was considered illegal. o If you are depending on the data being unrecoverable merely because you format the disk... it's not going to happen... o The data is always recoverable. The speed and time is a matter of the effort you are willing to expend. Depending on the unrecoverability of the data is a losing proposition, unless it's encrypted, and if it's something like DES, using "the crypt breaker's workbench" makes it pretty trivial to recover the data, as well. o Having some of the financial data on hand in a format that allows recreation of partial data gives you enough information that you can probably eliminate the data There are some things that can make a disk unrecoverable, but they all require the use of cryptographic mechanisms. If you have used a good one on your financial data on the disk... it's time to start over entering the data. If you have time pressure on you right now, spend the money. If you have some leeway, then recover the data the slow way, and if it's not panning out, then spend the money before the time-to-recover window closes. You might also look at this as an opportunity to build the tools needed to recover the data more quickly.
It's actually not that difficult to build such tools, and you have a test image (now) that is a relatively expensive thing to create. 8-(. Frankly, any time I've done this to a disk, I've always been most concerned with a small subset of the data, not the whole disk, so the recovery was simultaneously much easier and much less totally labor intensive; like a linear search, I could stop after only having examined about 50% of my total data set. It also means that all the tools I wrote for the job were so small that I just threw them away when I was done with them (e.g. I didn't archive them for posterity, but I also didn't actively seek to get rid of them, they just got backed up on tape and ignored, over time). -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 9:36:18 2002 Delivered-To: freebsd-fs@freebsd.org Received: from scaup.prod.itd.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49]) by hub.freebsd.org (Postfix) with ESMTP id B9D6B37B41D for ; Tue, 30 Apr 2002 09:36:10 -0700 (PDT) Received: from pool0495.cvx21-bradley.dialup.earthlink.net ([209.179.193.240] helo=mindspring.com) by scaup.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 172abi-0002fb-00; Tue, 30 Apr 2002 09:36:02 -0700 Message-ID: <3CCEC7D5.D22356A0@mindspring.com> Date: Tue, 30 Apr 2002 09:35:33 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Andrew P. Lentvorski" Cc: freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems References: <20020429153020.Q16532-100000@mail.allcaps.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org FreeBSD treats root mounts as "special", relative to all other mounts. 
This is a design error, but overcoming it requires a reorganization of the mount code that's not really politically easy to accomplish, even though it's technically very easy. Some of the stuff Poul is doing right now will probably help you in the future with assembling things like RAID-able volumes -- but not help you right now. As far as software RAID is concerned: it's a bad idea, from a performance perspective; I don't recommend it. Note that I'm the person who did the original user space RAIDframe port to FreeBSD in the mid 1990's, so I'm not just talking out my butt: the amount of overhead for parity calculation and storage is *considerable*, and makes RAID hardware a *much* better idea. -- Terry "Andrew P. Lentvorski" wrote: > > Could someone fill me in on the current status of either A) remounting a > root filesystem or B) mounting a non-standard fs as a root fs? > > I've bumped into this three times now. Twice as part of the vinum stuff > and a third as part of a solid-state media problem. In the case of vinum, > I actually had a client choose a Linux system because its rootfs is > software RAIDable. > > Other people are also starting to bump into the issue as brought up in the > NFS attribute thread. (AFS can't be used because the root filesystem > can't be remounted completely) > > What is the current status? Or, alternatively, what are the issues with > implementation? Is it a political problem (no perceived need/low > priority/coming in FreeBSD 5.X) or is it a massive amount of work problem? 
> > Thanks, > -a > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-fs" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 9:49:17 2002 Delivered-To: freebsd-fs@freebsd.org Received: from 66-162-33-178.gen.twtelecom.net (66-162-33-178.gen.twtelecom.net [66.162.33.178]) by hub.freebsd.org (Postfix) with ESMTP id ABD4437B417 for ; Tue, 30 Apr 2002 09:49:13 -0700 (PDT) Received: from [10.4.1.134] (helo=expertcity.com) by 66-162-33-178.gen.twtelecom.net with esmtp (Exim 3.22 #4) id 172aoT-0004tY-00 for freebsd-fs@freebsd.org; Tue, 30 Apr 2002 09:49:13 -0700 Message-ID: <3CCECB01.2020203@expertcity.com> Date: Tue, 30 Apr 2002 09:49:05 -0700 From: Jeff Behl User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4.1) Gecko/20020314 Netscape6/6.2.2 X-Accept-Language: en-us MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: NFS mount over 1TB Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org I have a 1.3TB volume mounted over NFS. It mounts, but df shows negative values. Is this expected? Is there anything I can do to get the real values? FreeBSD expert60.snv 4.5-RELEASE FreeBSD 4.5-RELEASE #0: Tue Feb 5 10:38:06 PST 2002 root@expert60.snv:/usr/obj/usr/src/sys/expert60 i386 thanks. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 9:52:34 2002 Delivered-To: freebsd-fs@freebsd.org Received: from scaup.prod.itd.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49]) by hub.freebsd.org (Postfix) with ESMTP id D3E2E37B400; Tue, 30 Apr 2002 09:52:30 -0700 (PDT) Received: from pool0495.cvx21-bradley.dialup.earthlink.net ([209.179.193.240] helo=mindspring.com) by scaup.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 172arA-0002Xu-00; Tue, 30 Apr 2002 09:52:00 -0700 Message-ID: <3CCECB92.68EF0387@mindspring.com> Date: Tue, 30 Apr 2002 09:51:30 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Christoph Kukulies Cc: ANdrei , FS@FreeBSD.ORG, bugs@FreeBSD.ORG Subject: Re: xterm & directory cat References: <3CCE8982.6A915F2B@abc.ro> <3CCEB71D.1AD1F911@mindspring.com> <20020430174057.A24695@gilberto.physik.rwth-aachen.de> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Christoph Kukulies wrote: > > After your xterm is "crashed", use control-right-mouse-button > > Pardon Sir, control-middle-mouse-button :-) > > > "full reset". Your xterm will "uncrash". Guess you aren't using a laptop or a 2 button mouse... 
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 10: 1:35 2002 Delivered-To: freebsd-fs@freebsd.org Received: from mail.cise.ufl.edu (beach.cise.ufl.edu [128.227.205.211]) by hub.freebsd.org (Postfix) with ESMTP id F3D5B37B41A for ; Tue, 30 Apr 2002 10:01:31 -0700 (PDT) Received: from waterspout (waterspout.cise.ufl.edu [128.227.205.52]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by mail.cise.ufl.edu (Postfix) with ESMTP id 691506C55; Tue, 30 Apr 2002 13:01:30 -0400 (EDT) Date: Tue, 30 Apr 2002 13:01:28 -0400 From: James F.Hranicky To: "Terry Lambert" Cc: bsder@allcaps.org, freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems Message-Id: <20020430130128.11428802.jfh@cise.ufl.edu> In-Reply-To: <3CCEC7D5.D22356A0@mindspring.com> References: <20020429153020.Q16532-100000@mail.allcaps.org> <3CCEC7D5.D22356A0@mindspring.com> Organization: University of Florida CISE Department X-Mailer: Sylpheed version 0.7.5 (GTK+ 1.2.8; sparc-sun-solaris2.8) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Tue, 30 Apr 2002 09:35:33 -0700 "Terry Lambert" wrote: > As far as software RAID is concerned: it's a bad idea, from a > performance perspective; I don't recommend it. Note that I'm > the person who did the original user space RAIDframe port to > FreeBSD in the mid 1990's, so I'm not just talking out my butt: > the amount of overhead for parity calculation and storage is > *considerable*, and makes RAID hardware a *much* better idea. Perhaps with RAID 5, but with 0+1 using vinum, wouldn't you see an increase with long enough plexes? 
Granted, it's more disks, but it may be cheaper than hardware RAID (or not, haven't looked in a while). Plus, just a simple two disk mirror for / wouldn't be all that bad, considering / shouldn't get heavy write traffic, right? ---------------------------------------------------------------------- | Jim Hranicky, Senior SysAdmin UF/CISE Department | | E314D CSE Building Phone (352) 392-1499 | | jfh@cise.ufl.edu http://www.cise.ufl.edu/~jfh | ---------------------------------------------------------------------- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 10:11:16 2002 Delivered-To: freebsd-fs@freebsd.org Received: from scaup.prod.itd.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49]) by hub.freebsd.org (Postfix) with ESMTP id 7DFFD37B421 for ; Tue, 30 Apr 2002 10:10:37 -0700 (PDT) Received: from pool0495.cvx21-bradley.dialup.earthlink.net ([209.179.193.240] helo=mindspring.com) by scaup.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 172b97-0005Sh-00; Tue, 30 Apr 2002 10:10:34 -0700 Message-ID: <3CCECFEA.88B9D86A@mindspring.com> Date: Tue, 30 Apr 2002 10:10:02 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "James F.Hranicky" Cc: bsder@allcaps.org, freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems References: <20020429153020.Q16532-100000@mail.allcaps.org> <3CCEC7D5.D22356A0@mindspring.com> <20020430130128.11428802.jfh@cise.ufl.edu> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org "James F.Hranicky" wrote: > "Terry Lambert" wrote: > > As far as software RAID is concerned: it's a bad idea, from a > > performance perspective; I don't recommend it. 
Note that I'm > > the person who did the original user space RAIDframe port to > > FreeBSD in the mid 1990's, so I'm not just talking out my butt: > > the amount of overhead for parity calculation and storage is > > *considerable*, and makes RAID hardware a *much* better idea. > > Perhaps with RAID 5, but with 0+1 using vinum, wouldn't you > see an increase with long enough plexes? Only for reads. Realize that there are no load-balancing strategies available here, other than simple round-robin; also realize that this assumes access will be non-sequential, and that you'll actually hurt sequential access by cache shootdown of the contents of the track cache on the physical disk. Writes will still be screwed by having to be done twice in order to be considered committed to stable storage. > Granted, it's more disks, but it may be cheaper than hardware RAID > (or not, haven't looked in a while). > > Plus, just a simple two disk mirror for / wouldn't be all that bad, > considering / shouldn't get heavy write traffic, right? IMO, / should get zero write traffic, since it should be mounted read-only. Damn hard to install a "rootkit" on a read-only /... As I suggested, though... some of PHK's stuff ("GEOM", if it gets done and gets committed) will go a ways to help the "problem". I guess I don't totally understand the point, other than that you can use software RAID on / on Linux and not FreeBSD? That's not really news, and it's not really an interesting thing to do, even if it were news. It's actually possible to do with hoop-jumping, but it's really not something I would recommend doing under any circumstances, anyway. How does Linux handle a half-plex failure detected at boot time? It seems to me that the only way to detect this is "not booting", which means that you're not loading any software capable of handling the problem, anyway, and you're screwed. Hardware RAID doesn't have this problem... 
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 10:16:10 2002 Delivered-To: freebsd-fs@freebsd.org Received: from laptop.tenebras.com (laptop.tenebras.com [66.92.188.18]) by hub.freebsd.org (Postfix) with SMTP id 638F637B443 for ; Tue, 30 Apr 2002 10:15:50 -0700 (PDT) Received: (qmail 10214 invoked from network); 30 Apr 2002 17:15:49 -0000 Received: from sapphire.tenebras.com (HELO tenebras.com) (66.92.188.241) by 0 with SMTP; 30 Apr 2002 17:15:49 -0000 Message-ID: <3CCED145.6010607@tenebras.com> Date: Tue, 30 Apr 2002 10:15:49 -0700 From: Michael Sierchio Reply-To: kudzu@tenebras.com User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0rc1) Gecko/20020427 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "James F.Hranicky" Cc: Terry Lambert , bsder@allcaps.org, freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems References: <20020429153020.Q16532-100000@mail.allcaps.org> <3CCEC7D5.D22356A0@mindspring.com> <20020430130128.11428802.jfh@cise.ufl.edu> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org James F.Hranicky wrote: > "Terry Lambert" wrote: > >>As far as software RAID is concerned: it's a bad idea, from a >>performance perspective; I don't recommend it. Note that I'm >>the person who did the original user space RAIDframe port to >>FreeBSD in the mid 1990's, so I'm not just talking out my butt: >>the amount of overhead for parity calculation and storage is >>*considerable*, and makes RAID hardware a *much* better idea. > > > Perhaps with RAID 5, but with 0+1 using vinum, wouldn't you > see an increase with long enough plexes? 
> > Granted, it's more disks, but it may be cheaper than hardware RAID > (or not, haven't looked in a while). > > Plus, just a simple two disk mirror for / wouldn't be all that bad, > considering / shouldn't get heavy write traffic, right? It's true that RAID 5 is costlier on writes than RAID 1, but the problem with the comparison between software and hardware RAID is that the differences are revealed when other-than-normal operation is in force. It's really quite something to see mirror catch-up or parity recalculation on an entire disk because of crash or replacement -- when under something like Veritas or Vinum -- might as well take the system offline. "Cheaper than hardware RAID" depends on your metric, I suppose -- if the initial cost is all you're interested in, it's cheaper. Presumably folks use RAID because they're motivated by other concerns. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 13:42: 2 2002 Delivered-To: freebsd-fs@freebsd.org Received: from quic.net (romulus.quic.net [216.23.27.8]) by hub.freebsd.org (Postfix) with SMTP id 498B337B41C for ; Tue, 30 Apr 2002 13:41:48 -0700 (PDT) Received: (qmail 5147 invoked by uid 1032); 30 Apr 2002 20:41:53 -0000 From: utsl@quic.net Date: Tue, 30 Apr 2002 16:41:53 -0400 To: Terry Lambert Cc: "Andrew P. 
Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems Message-ID: <20020430204153.GB3603@quic.net> References: <20020429153020.Q16532-100000@mail.allcaps.org> <3CCEC7D5.D22356A0@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3CCEC7D5.D22356A0@mindspring.com> User-Agent: Mutt/1.3.27i Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Tue, Apr 30, 2002 at 09:35:33AM -0700, Terry Lambert wrote: > FreeBSD treats root mounts as "special", relative to all other > mounts. This is a design error, but overcoming it requires a > reorganization of the mount code that's not really politically > easy to accomplish, even though it's technically very easy. > > Some of the stuff Poul is doing right now will probably help > you in the future with assembing things like RAID-able > volumes in the future -- but not help you right now. Linux has a syscall (pivot_root) to swap the root with another mounted filesystem. It is occasionally quite useful, and I've been wondering about implementing it (or something similar) on FreeBSD. Possibly you can tell me why that wouldn't work, or would be a bad idea. > As far as software RAID is concerned: it's a bad idea, from a > performance perspective; I don't recommend it. Note that I'm > the person who did the original user space RAIDframe port to > FreeBSD in the mid 1990's, so I'm not just talking out my butt: > the amount of overhead for parity calculation and storage is > *considerable*, and makes RAID hardware a *much* better idea. I agree with you about the performance. Hardware RAID is faster, more reliable, uses less resources, etc. However, many people don't have the budget for it. In my case, I have production systems running Linux with software RAID. 
I would much rather run hardware RAID and FreeBSD, but I have no budget to buy SCSI RAID controllers. Switching to FreeBSD+Vinum would be a reasonable solution, but I can't mirror root, and that creates a political problem. I get, "If FreeBSD and Vinum will be better, how come you can't mirror the root filesystem?" ---Nathan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 13:49:31 2002 Delivered-To: freebsd-fs@freebsd.org Received: from mail.takas.lt (mail-src.takas.lt [212.59.31.77]) by hub.freebsd.org (Postfix) with ESMTP id A0EF237B43A; Tue, 30 Apr 2002 13:49:10 -0700 (PDT) Received: from mfa.vip.mail.tpe.yahoo.com ([213.190.46.17]) by mail.takas.lt with Microsoft SMTPSVC(5.0.2195.2966); Tue, 30 Apr 2002 22:49:02 +0200 From: "Carla" To: "dielqosgh@aol.com" Subject: It will work for you Content-Type: text/plain; charset="us-ascii";format=flowed Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Outlook Express 5.00.2615.200 Message-ID: X-OriginalArrivalTime: 30 Apr 2002 20:49:08.0173 (UTC) FILETIME=[79245FD0:01C1F088] Date: 30 Apr 2002 22:49:08 +0200 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org As seen on NBC, CBS, CNN, and even Oprah! The health discovery that actually reverses aging while burning fat, without dieting or exercise! This proven discovery has even been reported on by the New England Journal of Medicine. Forget aging and dieting forever! And it's Guaranteed! Click here: http://66.231.133.70/sj1/index.html Would you like to lose weight while you sleep! No dieting! No hunger pains! No Cravings! No strenuous exercise! Change your life forever! 100% GUARANTEED! 1.Body Fat Loss 82% improvement. 2.Wrinkle Reduction 61% improvement. 3.Energy Level 84% improvement. 4.Muscle Strength 88% improvement. 5.Sexual Potency 75% improvement. 
6.Emotional Stability 67% improvement. 7.Memory 62% improvement. *********************************************************** You are receiving this email as a subscriber to the Opt-In America Mailing List. To unsubscribe from future offers, just click here: mailto:affiliateoptout@btamail.net.cn?Subject=off To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 15:58:25 2002 Delivered-To: freebsd-fs@freebsd.org Received: from mail.allcaps.org (mail.allcaps.org [208.252.245.17]) by hub.freebsd.org (Postfix) with ESMTP id 3D23337B43C for ; Tue, 30 Apr 2002 15:58:10 -0700 (PDT) Received: by mail.allcaps.org (Postfix, from userid 501) id C516E32601; Tue, 30 Apr 2002 15:58:09 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.allcaps.org (Postfix) with ESMTP id C01CE2E81E; Tue, 30 Apr 2002 15:58:09 -0700 (PDT) Date: Tue, 30 Apr 2002 15:58:09 -0700 (PDT) From: "Andrew P. Lentvorski" To: Cc: Terry Lambert , Subject: Re: Non-standard root filesystems In-Reply-To: <20020430204153.GB3603@quic.net> Message-ID: <20020430141744.P312-100000@mail.allcaps.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org While this sort of mutated into a software RAID discussion, remember that two of the original problems (solid-state media and mounting an unusual FS type) are not RAID-related. utsl@quic.net wrote: > Linux has a syscall (pivot_root) to swap the root with another mounted > filesystem. It is occasionally quite useful, and I've been wondering > about implementing it (or something similar) on FreeBSD. > > Possibly you can tell me why that wouldn't work, or would be a bad > idea. That sounds like a fine idea. What are the issues with doing that? 
Terry Lambert wrote: > How does Linux handle a half-plex failure detected at boot time? > > It seems to me that the only way to detect this is "not booting", > which means that you're not loading any software capable of > handling the problem, anyway, and you're screwed. > > Hardware RAID doesn't have this problem... But sometimes that's okay. If it can't boot because of failure, the system is still consistent. Things don't have to be automatic, all the time. As long as I can manually reconfigure and boot off of the other plex, I'm okay. I simply want a redundant disk array (RAID 1), effectively. With vinum, I don't have that. A failure on the original boot drives takes the system out--suddenly and badly. However, the failure is *not* in vinum, it's in the way FreeBSD handles mounting /. If the drives weren't redundant, I'd have to do daily backups. My current system is set to shut down gracefully on any disk failure. If need be, I can take a backup image while the system has a failed drive (or is even rebuilding the new one). Given that redundancy, I do monthly backups on my CVS server/bug tracking repository. The amount of work saved over a year given that I am chief/cook/bottle-washer is worth $300 (the price of a 3ware Escalade card). And the line feeding that machine is only a T1; performance isn't an issue. However, I'm now captive to whether 3ware wants to continue supporting their card. That decision is subject to the whim of 3ware management. If the card fails, and 3ware isn't giving out info for supporting their new cards on FreeBSD, I'm toast. It's all a case of balancing price and risk. Terry Lambert wrote: > I guess I don't totally understand the point, other than that > you can use software RAID on / on Linux and not FreeBSD? See my opening statement, the implications reach further than softRAID. > That's not really news, and it's not really an interesting thing to do, > even if it were news. Perhaps. 
But I would ask about how many people have requested this of vinum before attempting to dismiss it quite so quickly. > It's actually possible to do with hoop-jumping, but it's really > not something I would recommend doing under any circumstances, > anyway. I'd really love to have a pointer to the hoop jumping. It might present some ways around the solid-state media issue as well as helping out with the non-standard fs type. Thanks, -a To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 17: 2:41 2002 Delivered-To: freebsd-fs@freebsd.org Received: from quic.net (romulus.quic.net [216.23.27.8]) by hub.freebsd.org (Postfix) with SMTP id 5D85237B400 for ; Tue, 30 Apr 2002 17:02:29 -0700 (PDT) Received: (qmail 415 invoked by uid 1032); 1 May 2002 00:02:36 -0000 From: utsl@quic.net Date: Tue, 30 Apr 2002 20:02:36 -0400 To: "Andrew P. Lentvorski" Cc: freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems Message-ID: <20020501000236.GB28212@quic.net> References: <20020430204153.GB3603@quic.net> <20020430141744.P312-100000@mail.allcaps.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020430141744.P312-100000@mail.allcaps.org> User-Agent: Mutt/1.3.27i Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Tue, Apr 30, 2002 at 03:58:09PM -0700, Andrew P. Lentvorski wrote: > utsl@quic.net wrote: > > Linux has a syscall (pivot_root) to swap the root with another mounted > > filesystem. It is occasionally quite useful, and I've been wondering > > about implementing it (or something similar) on FreeBSD. > > > > Possibly you can tell me why that wouldn't work, or would be a bad > > idea. > That sounds like a fine idea. What are the issues with doing that? 
I've been taking a look, and I think it is probably beyond my skill. :( From what I can see the following would be necessary: 1. locate mount structs for both old and new root filesystems 2. lock mount structures 3. swap pointers in mount list? (not sure if this is necessary) 4. set MNT_ROOTFS on new root filesystem, clear on old 5. unlock mount structures (I think it'd be safe at that point) 6. the ugly part: walk through all processes and file descriptors, and switch the root directory. Also switch cwd if it is /. I'm probably missing something. ---Nathan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 17:13:29 2002 Delivered-To: freebsd-fs@freebsd.org Received: from bg.sics.se (uofa-dsl-nen-23.dakotacom.net [150.135.176.23]) by hub.freebsd.org (Postfix) with ESMTP id A88F937B405; Tue, 30 Apr 2002 17:13:20 -0700 (PDT) Received: (from bg@localhost) by bg.sics.se (8.11.6/8.11.6) id g3R17GX08156; Fri, 26 Apr 2002 18:07:16 -0700 (MST) (envelope-from bg) To: Brian Candler Cc: freebsd-fs@FreeBSD.ORG, freebsd-net@FreeBSD.ORG Subject: Re: NFS clearing attribute cache in nfs_open References: <20020426181535.B2748@linnet.org> From: Bjoern Groenvall Date: 26 Apr 2002 18:07:15 -0700 In-Reply-To: Brian Candler's message of "Fri, 26 Apr 2002 18:15:35 +0100" Message-ID: Lines: 23 X-Mailer: Gnus v5.7/Emacs 20.6 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org SunOS 4 used to have an NFS mount option nocto (no close to open consistency checking) that would suppress getting fresh attributes when opening a file. IMHO it is ok to enable this option on filesystems that never change (or at least rarely). I always used it when mounting /usr on diskless systems. It put less load on the server and also eliminated some delays. 
One may also safely enable this option on filesystems that are accessed from only one host at a time. It all boils down to if there is no writing or sharing, then, there can be no inconsistencies if nocto is enabled. Hope this helps, Björn -- _ _ ,_______________. Bjorn Gronvall (Björn Grönvall) /_______________/| Swedish Institute of Computer Science | || PO Box 1263, S-164 29 Kista, Sweden | Schroedingers || Email: bg@sics.se, Phone +46 -8 633 15 25 | Cat |/ Cellular +46 -70 768 06 35, Fax +46 -8 751 72 30 `---------------' To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 17:58:40 2002 Delivered-To: freebsd-fs@freebsd.org Received: from harrier.prod.itd.earthlink.net (harrier.mail.pas.earthlink.net [207.217.120.12]) by hub.freebsd.org (Postfix) with ESMTP id 9335037B417 for ; Tue, 30 Apr 2002 17:58:34 -0700 (PDT) Received: from pool0580.cvx40-bradley.dialup.earthlink.net ([216.244.44.70] helo=mindspring.com) by harrier.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 172iRz-0005Aq-00; Tue, 30 Apr 2002 17:58:32 -0700 Message-ID: <3CCF3D98.3495D84D@mindspring.com> Date: Tue, 30 Apr 2002 17:58:00 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: utsl@quic.net Cc: "Andrew P. Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems References: <20020429153020.Q16532-100000@mail.allcaps.org> <3CCEC7D5.D22356A0@mindspring.com> <20020430204153.GB3603@quic.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org utsl@quic.net wrote: > On Tue, Apr 30, 2002 at 09:35:33AM -0700, Terry Lambert wrote: > > FreeBSD treats root mounts as "special", relative to all other > > mounts. 
This is a design error, but overcoming it requires a > > reorganization of the mount code that's not really politically > > easy to accomplish, even though it's technically very easy. > > > > Some of the stuff Poul is doing right now will probably help > > you in the future with assembing things like RAID-able > > volumes in the future -- but not help you right now. > > Linux has a syscall (pivot_root) to swap the root with another mounted > filesystem. It is occasionally quite useful, and I've been wondering > about implementing it (or something similar) on FreeBSD. > > Possibly you can tell me why that wouldn't work, or would be a bad > idea. Doing that would be very hard. The way mount points work won't exactly make it impossible, but it won't make it easy. Here's the architectural fix: 1) Separate the mount point covering code from the per FS mounting code. 2) Add a separate VOP for setting the "mounted on" information into the superblock (some FS's, like FFS, like to record the "last mounted on" information; this is actually not used for anything that I've ever seen (right now), so it would probably be OK to rip out completely (right now; it could later be useful for automounting and getting rid of /etc/fstab entirely)). 3) When mounting an FS at the VFS_MOUNT layer, simply get a pointer into the list of mounted file systems. *DO NOT* deal with the mount point covering at all in the per FS code! 4) Deal with the mount point covering in the higher level code; this reduces the amount of crap you have to parse in a per FS manner anyway. The covering is done by referencing the FS in the system mounted FS layer from #3 (above). At this point, from the VFS perspective, all mounts -- root and non-root -- are exactly the same: you implement the one type of mount (the "fill in this mount table entry and set up the in core mount structure data" kind), and it's taken care of... 
the only difference between a root and a non-root mount is the vnode covering code for the mount, and that all uses the same code at a higher layer. This would also make your "pivot" FS work correctly... to do that, you would have to cover an opaque vnode. You could actually do this with any vnode, by revoking the vnode, and making it a deadfs vnode. > > As far as software RAID is concerned: it's a bad idea, from a > > performance perspective; I don't recommend it. Note that I'm > > the person who did the original user space RAIDframe port to > > FreeBSD in the mid 1990's, so I'm not just talking out my butt: > > the amount of overhead for parity calculation and storage is > > *considerable*, and makes RAID hardware a *much* better idea. > > I agree with you about the performance. Hardware RAID is faster, more > reliable, uses less resources, etc. However, many people don't have the > budget for it. I guess they don't get RAID. 8-) 8-) 8-) 8-). > In my case, I have production systems running Linux with software RAID. > I would much rather run hardware RAID and FreeBSD, but I have no budget > to buy SCSI RAID controllers. Switching to FreeBSD+Vinum would be a > reasonable solution, but I can't mirror root, and that creates a > political problem. I get, "If FreeBSD and Vinum will be better, how come > you can't mirror the root filesystem?" How does mirroring the root FS recover after an error? If you can't load the kernel to load the software RAID, then you can't run the software RAID to recover from a failure, right? How does Linux solve this problem? *Does* Linux solve this problem, or are we really talking about an unrecoverable condition that Linux lets you get yourself into, but FreeBSD doesn't? 
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Apr 30 19:20: 4 2002 Delivered-To: freebsd-fs@freebsd.org Received: from quic.net (romulus.quic.net [216.23.27.8]) by hub.freebsd.org (Postfix) with SMTP id 4715437B404 for ; Tue, 30 Apr 2002 19:19:52 -0700 (PDT) Received: (qmail 9153 invoked by uid 1032); 1 May 2002 02:19:59 -0000 From: utsl@quic.net Date: Tue, 30 Apr 2002 22:19:59 -0400 To: Terry Lambert Cc: "Andrew P. Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems Message-ID: <20020501021959.GA20232@quic.net> References: <20020429153020.Q16532-100000@mail.allcaps.org> <3CCEC7D5.D22356A0@mindspring.com> <20020430204153.GB3603@quic.net> <3CCF3D98.3495D84D@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3CCF3D98.3495D84D@mindspring.com> User-Agent: Mutt/1.3.27i Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Tue, Apr 30, 2002 at 05:58:00PM -0700, Terry Lambert wrote: > utsl@quic.net wrote: > > On Tue, Apr 30, 2002 at 09:35:33AM -0700, Terry Lambert wrote: > > > FreeBSD treats root mounts as "special", relative to all other > > > mounts. This is a design error, but overcoming it requires a > > > reorganization of the mount code that's not really politically > > > easy to accomplish, even though it's technically very easy. > > > > > > Some of the stuff Poul is doing right now will probably help > > > you in the future with assembing things like RAID-able > > > volumes in the future -- but not help you right now. > > > > Linux has a syscall (pivot_root) to swap the root with another mounted > > filesystem. It is occasionally quite useful, and I've been wondering > > about implementing it (or something similar) on FreeBSD. 
> > > > Possibly you can tell me why that wouldn't work, or would be a bad > > idea. > > Doing that would be very hard. The way mount points work > won't exactly make it impossible, but it won't make it easy. > > Here's the architectural fix: > > 1) Seperate the mount point covering code from the per FS > mounting code. I'm not sure what you're talking about here. Could you point me at files and functions to read? I'm not particularly familiar with VFS. (My kernel hacking days were years ago, and not on a Unix or even Unix-like kernel...) > 2) Add a seperate VOP for setting the "mounted on" information > into the superblock (some FS's, like FFS, like to record > the "last mounted on" information; this is actually not > used for anything that I've ever seen (right now), so it > would probably be OK to rip out completely (right now; it > could later be useful for automounting and getting rid of > /etc/fstab entirely). I'd think this wouldn't be necessary. I've never seen the last mounted tag used for anything, either. I'm not sure why you'd want to get rid of /etc/fstab. > 3) When mounting an FS at the VFS_MOUNT layer, simply get a > pointer into the list of mounted file systems. *DO NOT* > deal with the mount point covering at all in the per FS > code! > > 4) Deal with the mount point covering in the higher level > code; this reduces the amount of crap you have to > parse in a per FS manner anyway. The covering is done > by referencing the FS in the system mounted FS layer > from #3 (above). > > At this point, from the VFS perspective, all mounts -- root and > non-root -- are exactly the same: you implement the one type of > mount (the "fill in this mount table entry and set up the in core > mount structure data" kind), and it's taken care of... the only > difference between a root and a non-root mount is the vnode > covering code for the mount, and that all uses the same code at > a higher layer. Hmm. Sounds like there's some complexity I missed. 
In any case, this is well beyond me. It sounds like you're saying there's some code that I haven't seen that would need to be refactored between VFS and FS. Changes like these should be made by someone who knows what he's doing, and I clearly don't. > This would also make your "pivot" FS work correctly... to do that, > you would have to cover an opaque vnode. You could actually do > this with any vnode, by revoking the vnode, and making it a deadfs > vnode. I'm not sure what you mean by "cover an opaque vnode." I don't think I know enough about how VFS mounts work in FreeBSD to discuss this intelligently. Maybe after a lot of reading... > > In my case, I have production systems running Linux with software RAID. > > I would much rather run hardware RAID and FreeBSD, but I have no budget > > to buy SCSI RAID controllers. Switching to FreeBSD+Vinum would be a > > reasonable solution, but I can't mirror root, and that creates a > > political problem. I get, "If FreeBSD and Vinum will be better, how come > > you can't mirror the root filesystem?" > > How does mirroring the root FS recover after an error? If you > can't load the kernel to load the software RAID, then you can't > run the software RAID to recover from a failure, right? Assuming RAID 1, you have a 50% chance that the primary disk fails. If the secondary disk fails (not first to boot), shutdown, replace it, and reboot. On most systems nowadays, it's possible to set a boot order so that the BIOS will try to boot the second drive, if the first drive doesn't boot. That will work sometimes. So if you have to, you boot from something else. (The other disk most likely, or possibly floppy, CD, or network.) At least there'd be something there to recover. With a root mirror, the worst case is still much less painful than a complete restore from tape. > How does Linux solve this problem? 
*Does* Linux solve this > problem, or are we really talking about an unrecoverable > condition that Linux lets you get yourself into, but FreeBSD > doesn't? About the way I described above. It's more of a problem for firmware and/or boot loader than OS. As for unrecoverable: I'd much rather drive in, swap a disk, reboot from floppy, and get to go home when the mirror resyncs, than have to do restore from backup. I _hate_ restoring root filesystems from backups. ---Nathan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed May 1 12:27:11 2002 Delivered-To: freebsd-fs@freebsd.org Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11]) by hub.freebsd.org (Postfix) with SMTP id 25E8737B419 for ; Wed, 1 May 2002 12:27:06 -0700 (PDT) Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ; 1 May 2002 20:27:05 +0100 (BST) To: utsl@quic.net Cc: "Andrew P. Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems In-Reply-To: Your message of "Tue, 30 Apr 2002 20:02:36 EDT." <20020501000236.GB28212@quic.net> Date: Wed, 01 May 2002 20:27:04 +0100 From: Ian Dowse Message-ID: <200205012027.aa63727@salmon.maths.tcd.ie> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org In message <20020501000236.GB28212@quic.net>, utsl@quic.net writes: >On Tue, Apr 30, 2002 at 03:58:09PM -0700, Andrew P. Lentvorski wrote: >> utsl@quic.net wrote: >> > Linux has a syscall (pivot_root) to swap the root with another mounted >> > filesystem. It is occasionally quite useful, and I've been wondering >> > about implementing it (or something similar) on FreeBSD. >> That sounds like a fine idea. What are the issues with doing that? > >I've been taking a look, and I think it is probably beyond my skill. 
:(
>From what I can see the following would be necessary:

I presume you know that FreeBSD already allows you to mount another filesystem directly over /? Do you actually need to remove the original root filesystem, or is it just for cosmetic reasons that you would like it to disappear from the mountlist?

One thing I have thought about before, but never actually tried implementing, is to permit the root filesystem to be forcibly unmounted so long as there is another filesystem mounted directly above it to become the new root (you obviously have to specify the root filesystem by device name). That might be relatively easy to do, and it doesn't require a new system call.

There are a few problems though: I think init(8) would need to have a signal that causes it to re-exec itself, since otherwise it could get killed if any of the executable needed to be paged in.

Ian

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Wed May 1 12:46: 8 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from laptop.tenebras.com (laptop.tenebras.com [66.92.188.18]) by hub.freebsd.org (Postfix) with SMTP id 06B6137B416 for ; Wed, 1 May 2002 12:46:05 -0700 (PDT)
Received: (qmail 14024 invoked from network); 1 May 2002 19:46:04 -0000
Received: from sapphire.tenebras.com (HELO tenebras.com) (66.92.188.241) by 0 with SMTP; 1 May 2002 19:46:04 -0000
Message-ID: <3CD045FB.7060902@tenebras.com>
Date: Wed, 01 May 2002 12:46:03 -0700
From: Michael Sierchio
Reply-To: kudzu@tenebras.com
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0rc1) Gecko/20020427
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Ian Dowse
Cc: utsl@quic.net, "Andrew P.
Lentvorski" , freebsd-fs@freebsd.org
Subject: Re: Non-standard root filesystems
References: <200205012027.aa63727@salmon.maths.tcd.ie>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Ian Dowse wrote:

> I presume you know that FreeBSD already allows you to mount another
> filesystem directly over /? Do you actually need to remove the
> original root filesystem, or is it just for cosmetic reasons that
> you would like it to disappear from the mountlist?

The union filesystem is currently "busted", AFAIK, in 4.5. Perhaps Terry Lambert can comment more meaningfully on the cache coherency issue.

We're doing work for a commercial project, but hope to return a portion of it to the FreeBSD Project if we can get it committed -- the politics of which are beyond my comprehension ;-)

There is also some weirdness, IIRC, to doing union mounts which mix RO and RW filesystems, but I may be misremembering.
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed May 1 13: 6:40 2002 Delivered-To: freebsd-fs@freebsd.org Received: from quic.net (romulus.quic.net [216.23.27.8]) by hub.freebsd.org (Postfix) with SMTP id 263B837B419 for ; Wed, 1 May 2002 13:06:37 -0700 (PDT) Received: (qmail 3161 invoked by uid 1032); 1 May 2002 20:06:43 -0000 From: utsl@quic.net Date: Wed, 1 May 2002 16:06:43 -0400 To: Michael Sierchio Cc: Ian Dowse , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems Message-ID: <20020501200643.GC32719@quic.net> References: <200205012027.aa63727@salmon.maths.tcd.ie> <3CD045FB.7060902@tenebras.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3CD045FB.7060902@tenebras.com> User-Agent: Mutt/1.3.27i Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Wed, May 01, 2002 at 12:46:03PM -0700, Michael Sierchio wrote: > Ian Dowse wrote: > > >I presume you know that FreeBSD already allows you to mount another > >filesystem directly over /? Do you actually need to remove the > >original root filesystem, or is it just for cosmetic reasons that > >you would like it to disappear from the mountlist? I wasn't aware that it could do that with the root filesystem. I'm not concerned about the cosmetics, only functionality. From what I read below, I think the functionality is somewhat in doubt. > Union filesytem is currently "busted" AFAIK in 4.5. Perhaps Terry > Lambert can comment more meaningfully on the cache coherency issue. Is it planned to fix it for 5.0? 
> We're doing work for a commercial project, but hope to return a > portion of it to the FreeBSD Project if we can get it committed -- > the politics of which are beyond my comprehension ;-) > > There is also some weirdness, IIRC, to doing union mounts which > mix RO and RW filesystems, but I may be misremembering. That's a shame. It could be useful for CDs. ---Nathan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed May 1 13:15:48 2002 Delivered-To: freebsd-fs@freebsd.org Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11]) by hub.freebsd.org (Postfix) with SMTP id 2456437B417 for ; Wed, 1 May 2002 13:15:45 -0700 (PDT) Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ; 1 May 2002 21:15:44 +0100 (BST) To: kudzu@tenebras.com Cc: utsl@quic.net, "Andrew P. Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems In-Reply-To: Your message of "Wed, 01 May 2002 12:46:03 PDT." <3CD045FB.7060902@tenebras.com> Date: Wed, 01 May 2002 21:15:43 +0100 From: Ian Dowse Message-ID: <200205012115.aa68752@salmon.maths.tcd.ie> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org In message <3CD045FB.7060902@tenebras.com>, Michael Sierchio writes: >Union filesytem is currently "busted" AFAIK in 4.5. Perhaps Terry I never said anything about using union filesystems; you can just mount another filesystem directly over /, and it hides the underlying root filesystem. This is no different from the way that mounting /usr hides anything that might have been in the /usr directory before it was mounted on. 
Ian To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed May 1 13:25:47 2002 Delivered-To: freebsd-fs@freebsd.org Received: from laptop.tenebras.com (laptop.tenebras.com [66.92.188.18]) by hub.freebsd.org (Postfix) with SMTP id 8701037B400 for ; Wed, 1 May 2002 13:25:42 -0700 (PDT) Received: (qmail 14220 invoked from network); 1 May 2002 20:25:41 -0000 Received: from sapphire.tenebras.com (HELO tenebras.com) (66.92.188.241) by 0 with SMTP; 1 May 2002 20:25:41 -0000 Message-ID: <3CD04F45.1040704@tenebras.com> Date: Wed, 01 May 2002 13:25:41 -0700 From: Michael Sierchio Reply-To: kudzu@tenebras.com User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0rc1) Gecko/20020427 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Ian Dowse Cc: utsl@quic.net, "Andrew P. Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems References: <200205012115.aa68752@salmon.maths.tcd.ie> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Ian Dowse wrote: > I never said anything about using union filesystems; you can just > mount another filesystem directly over /, and it hides the underlying > root filesystem. Covering mounts are also broken, IIRC, with the same cache coherency problem. I may be mistaken, but the problem is with NULLFS, which is required for covering mounts. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed May 1 13:28:48 2002 Delivered-To: freebsd-fs@freebsd.org Received: from laptop.tenebras.com (laptop.tenebras.com [66.92.188.18]) by hub.freebsd.org (Postfix) with SMTP id CBF9037B41B for ; Wed, 1 May 2002 13:28:34 -0700 (PDT) Received: (qmail 14241 invoked from network); 1 May 2002 20:28:34 -0000 Received: from sapphire.tenebras.com (HELO tenebras.com) (66.92.188.241) by 0 with SMTP; 1 May 2002 20:28:34 -0000 Message-ID: <3CD04FF1.7080304@tenebras.com> Date: Wed, 01 May 2002 13:28:33 -0700 From: Michael Sierchio Reply-To: kudzu@tenebras.com User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0rc1) Gecko/20020427 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Ian Dowse Cc: utsl@quic.net, "Andrew P. Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems References: <200205012115.aa68752@salmon.maths.tcd.ie> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Ian Dowse wrote: > I never said anything about using union filesystems; you can just > mount another filesystem directly over /, and it hides the underlying > root filesystem. This is no different from the way that mounting > /usr hides anything that might have been in the /usr directory > before it was mounted on. 
You can mount a filesystem anywhere in the file hierarchy , the mount point doesn't have to be a fs, and hide the underlying directory -- but you can't do this with the root filesystem To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed May 1 13:52:31 2002 Delivered-To: freebsd-fs@freebsd.org Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11]) by hub.freebsd.org (Postfix) with SMTP id 2B08B37B419 for ; Wed, 1 May 2002 13:52:28 -0700 (PDT) Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ; 1 May 2002 21:52:27 +0100 (BST) To: kudzu@tenebras.com Cc: utsl@quic.net, "Andrew P. Lentvorski" , freebsd-fs@freebsd.org Subject: Re: Non-standard root filesystems In-Reply-To: Your message of "Wed, 01 May 2002 13:28:33 PDT." <3CD04FF1.7080304@tenebras.com> Date: Wed, 01 May 2002 21:52:25 +0100 From: Ian Dowse Message-ID: <200205012152.aa72665@salmon.maths.tcd.ie> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org In message <3CD04FF1.7080304@tenebras.com>, Michael Sierchio writes: >You can mount a filesystem anywhere in the file hierarchy , the mount >point doesn't have to be a fs, and hide the underlying directory -- >but you can't do this with the root filesystem I had forgotten that there is a bug in /sbin/mount that stops you from doing this directly, but for filesystems with a separate mount_xxx program, it will work. Unfortunately it looks like there is another problem. I fixed a bug about a year ago relating to mounting filesystems over /, but it looks as if the bugfix got accidentally reverted by a subsequent commit a few months later. This bug results in a "vrele: negative ref count" panic if you unmount the filesystem or reboot. 
Ian

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Wed May 1 15:13:13 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from focker.2y.net (pcp745076pcs.reston01.va.comcast.net [68.49.139.189]) by hub.freebsd.org (Postfix) with ESMTP id E65AA37B419 for ; Wed, 1 May 2002 15:10:35 -0700 (PDT)
Received: (from johnnye@localhost) by focker.2y.net (8.11.6/8.11.6) id g41MC7l53041 for freebsd-fs@freebsd.org; Wed, 1 May 2002 18:12:07 -0400 (EDT) (envelope-from johnnye)
Date: Wed, 1 May 2002 18:12:07 -0400 (EDT)
From: john
Message-Id: <200205012212.g41MC7l53041@focker.2y.net>
To: freebsd-fs@freebsd.org
Subject: Need help recovering disklabel/slices, wiped out two blocks
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

First let me tell you how I got this problem. I was excited about getting my jaz drive hooked up so I could now do weekly backups. I was following the handbook's guide to setting up a tape drive and made a small typo. Instead of "dd if=/dev/zero of=/dev/da0 count=2", I did:

    # dd if=/dev/zero of=/dev/ad0 count=2

Nothing happened immediately; I realized my mistake and got my jaz drive working. I think I remember copying my home slice just to check out the write speed, so that's good. But then I rebooted and got the good ol' "NON SYSTEM DISK REINSERT AND PRESS ENTER" as if the master boot record was gone.

The first thing I did was use sysinstall from the floppies to check out fdisk: everything was gone! No partitions/slices to be found anywhere, so I exited without making changes. Next I tried the fixit disk to see what that told me, and in devices it only showed /mnt/dev/ad0 for disk drives. So that's where I left off; I decided to get some more help before screwing anything else up.
I checked out the manpages for disklabel, and it seems like that's what I need to re-create the slices, but I also found out that I need to know the exact geometry of my slices beforehand. Is there a backup somewhere on the disk? I think someone in 1999 created a C program to get the geometries, but I can't find that. Someone else recommended hex editing, which I know nothing about. And lastly someone also mentioned super blocks having backups.

The install was standard; I let FreeBSD take over my disk, but I did not take the default slices because I made a 10 gig slice for data, which is what I didn't back up yet :(

Any help is greatly appreciated,

/~John Eisenhower

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Wed May 1 17:10:51 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from mail.allcaps.org (mail.allcaps.org [208.252.245.17]) by hub.freebsd.org (Postfix) with ESMTP id D192C37B41E for ; Wed, 1 May 2002 17:10:44 -0700 (PDT)
Received: by mail.allcaps.org (Postfix, from userid 501) id E703E32601; Wed, 1 May 2002 17:10:38 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by mail.allcaps.org (Postfix) with ESMTP id E23972E821; Wed, 1 May 2002 17:10:38 -0700 (PDT)
Date: Wed, 1 May 2002 17:10:38 -0700 (PDT)
From: "Andrew P. Lentvorski"
To: Ian Dowse
Cc: ,
Subject: Re: Non-standard root filesystems
In-Reply-To: <200205012027.aa63727@salmon.maths.tcd.ie>
Message-ID: <20020501170107.E2382-100000@mail.allcaps.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Wed, 1 May 2002, Ian Dowse wrote:

> I presume you know that FreeBSD already allows you to mount another
> filesystem directly over /?
Do you actually need to remove the
> original root filesystem, or is it just for cosmetic reasons that
> you would like it to disappear from the mountlist?

I believe that if you open a file on the underlying filesystem, its descriptor remains active even if you union mount/overlay mount the fs. Until you close all the open files, there is still access going to the underlying fs.

Somebody please correct me if I am mistaken on this.

-a

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Wed May 1 23: 1:24 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from pintail.mail.pas.earthlink.net (pintail.mail.pas.earthlink.net [207.217.120.122]) by hub.freebsd.org (Postfix) with ESMTP id 22C2D37B400 for ; Wed, 1 May 2002 23:01:21 -0700 (PDT)
Received: from pool0052.cvx21-bradley.dialup.earthlink.net ([209.179.192.52] helo=mindspring.com) by pintail.mail.pas.earthlink.net with esmtp (Exim 3.33 #2) id 1739eS-0004pG-00; Wed, 01 May 2002 23:01:12 -0700
Message-ID: <3CD0D60B.C629901E@mindspring.com>
Date: Wed, 01 May 2002 23:00:43 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Ian Dowse
Cc: utsl@quic.net, "Andrew P. Lentvorski" , freebsd-fs@freebsd.org
Subject: Re: Non-standard root filesystems
References: <200205012027.aa63727@salmon.maths.tcd.ie>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Ian Dowse wrote:
> I presume you know that FreeBSD already allows you to mount another
> filesystem directly over /? Do you actually need to remove the
> original root filesystem, or is it just for cosmetic reasons that
> you would like it to disappear from the mountlist?

My first thought, as well.
But the problem is really deeper than that; doing a mount-over will not maintain the already covered mount points, particularly if there are missing intermediate directories.

> One thing I have thought about before, but never actually tried
> implementing, is to permit the root filesystem to be forcibly
> unmounted so long as there is another filesystem mounted directly
> above it to become the new root (you obviously have to specify the
> root filesystem by device name). That might be relatively easy to
> do, and it doesn't require a new system call.

To do this requires violating the stacking abstraction. By doing this, you would grant indirect access to underlying FSs to areas which have been mounted over.

The worst case scenario here is a cryptographic FS stacking layer mounted over top of the FS that it's stacked on top of, which uses metadata obtained by directory or other namespace folding. By doing this, it's possible to access the cleartext and the ciphertext simultaneously, which would permit recovery of the "pad". After that, revocation of access is impossible.

Worse, it's possible to replace the ciphertext with all zeros, recover the pad from a reading of the exposed ciphertext without a key, reverse the process to replace the zeros with the ciphertext, and then, when reading the other ciphertext, merely XOR it with the recovered pad, allowing you to decrypt the data without a key.

You *really* don't want to allow this.

> There are a few problems though: I think init(8) would need to have
> a signal that causes it to re-exec itself, since otherwise it could
> get killed if any of the executable needed to be paged in.

A more correct approach would probably be to provide a static root, and then union mount the "real" root over the static root. I suggested this a long, long time ago (1996) for the devfs, which would mount the devfs automatically on /dev, and then root would be mounted union over the / part of /dev/, leaving the /dev/ contents intact.
This would, as a side effect, provide the persistence in /dev that everyone always bitches about for permissions, deletions, etc.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Wed May 1 23:48:53 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from swan.prod.itd.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id 3116D37B417 for ; Wed, 1 May 2002 23:48:50 -0700 (PDT)
Received: from pool0052.cvx21-bradley.dialup.earthlink.net ([209.179.192.52] helo=mindspring.com) by swan.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173AOL-0001G0-00; Wed, 01 May 2002 23:48:38 -0700
Message-ID: <3CD0E128.980AA927@mindspring.com>
Date: Wed, 01 May 2002 23:48:08 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: kudzu@tenebras.com
Cc: Ian Dowse , utsl@quic.net, "Andrew P. Lentvorski" , freebsd-fs@freebsd.org
Subject: Re: Non-standard root filesystems
References: <200205012115.aa68752@salmon.maths.tcd.ie> <3CD04F45.1040704@tenebras.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Michael Sierchio wrote:
> Ian Dowse wrote:
> > I never said anything about using union filesystems; you can just
> > mount another filesystem directly over /, and it hides the underlying
> > root filesystem.
>
> Covering mounts are also broken, IIRC, with the same cache coherency
> problem. I may be mistaken, but the problem is with NULLFS, which
> is required for covering mounts.

He's talking about a fully covering mount, not a stacking mount. The coherency issues are particular to the stacking process, and to the "unionfs", which is a stacking FS.
The "union" mount option, which is handled internally to the FS framework itself, has different problems.

The main problem with covering mounts is what happens to inferior mount points, when you cover a mount point higher up in the mount hierarchy (e.g. "What happens to an already mounted "/usr/" when you cover over "/"?"), and "slip under" (e.g. "How do I replace, rather than cover, an FS at a mount point?").

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Thu May 2 4:53:17 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from mail.teligent.se (mail.teligent.se [212.209.126.130]) by hub.freebsd.org (Postfix) with ESMTP id AE6CC37B416 for ; Thu, 2 May 2002 04:53:14 -0700 (PDT)
Received: from annar (dyn-globen-ab-134.teligent.se [172.21.0.134]) by mail.teligent.se (8.11.1/8.11.1) with SMTP id g42Bl1719553 for ; Thu, 2 May 2002 13:47:01 +0200 (CEST) (envelope-from anna.ruthstrom@teligent.se)
From: Anna M Ruthström
To:
Subject: Filesystem
Date: Thu, 2 May 2002 13:53:07 +0200
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6600
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Hello,

I'm using BSD 3.4 on my current system. Is there any limitation on how many directories you can have in one directory, and if so, can this parameter be changed? I heard something about 32 676 directories in a directory; is this true?

Thanks a lot!

/ Reg.
Anna From owner-freebsd-fs Thu May 2 7:25:42 2002 Delivered-To: freebsd-fs@freebsd.org Received: from bingnet2.cc.binghamton.edu (bingnet2.cc.binghamton.edu [128.226.1.18]) by hub.freebsd.org (Postfix) with ESMTP id 79FB037B400 for ; Thu, 2 May 2002 07:25:34 -0700 (PDT) Received: from onyx ([128.226.182.171]) by bingnet2.cc.binghamton.edu (8.11.6/8.11.6) with ESMTP id g42EPO416098; Thu, 2 May 2002 10:25:24 -0400 (EDT) Date: Thu, 2 May 2002 10:25:24 -0400 (EDT) From: Zhihui Zhang X-Sender: zzhang@onyx To: Anna M Ruthström Cc: freebsd-fs@freebsd.org Subject: Re: Filesystem In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN Content-Transfer-Encoding: QUOTED-PRINTABLE Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org You can create as many entries as you want in a directory. However, performance will be poor, because later lookups have to scan the directory linearly. -Zhihui On Thu, 2 May 2002, Anna M Ruthström wrote: > Hello, > > I'm using BSD 3.4 on my current system. > Is there any limit on how many directories you can have in one > directory, and if so, can this parameter be changed? I heard something about > 32 676 directories in a directory; is this true? > > Thanks a lot! > > / Reg.
Anna From owner-freebsd-fs Thu May 2 9:39: 2 2002 Delivered-To: freebsd-fs@freebsd.org Received: from avocet.prod.itd.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by hub.freebsd.org (Postfix) with ESMTP id 22FE037B416 for ; Thu, 2 May 2002 09:38:56 -0700 (PDT) Received: from pool0542.cvx22-bradley.dialup.earthlink.net ([209.179.200.32] helo=mindspring.com) by avocet.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173JbZ-0002th-00; Thu, 02 May 2002 09:38:53 -0700 Message-ID: <3CD16B80.BAC37697@mindspring.com> Date: Thu, 02 May 2002 09:38:24 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Anna M Ruthström Cc: freebsd-fs@freebsd.org Subject: Re: Filesystem References: Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Anna M Ruthström wrote: > Hello, > > I'm using BSD 3.4 on my current system. > Is there any limit on how many directories you can have in one > directory, and if so, can this parameter be changed? I heard something about > 32 676 directories in a directory; is this true? There is no limitation except inodes and available disk space. However, it's an incredibly bad idea to have huge numbers of entries in a directory, since search time for all but a few FS implementations is linear, so depending on a faster search time would make your code dependent on a particular UNIX or FS implementation. It's always a bad idea to intentionally write non-portable code.
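[Editorial illustration, not part of the original message.] The linear search cost mentioned above can be made concrete with a small Python sketch. It only models lookups in an in-memory list of names and is not a model of any real filesystem's on-disk format; `linear_lookup` and `sorted_lookup` are hypothetical names chosen for this example. It counts comparisons for a linear scan and for a binary search over sorted names.

```python
def linear_lookup(names, wanted):
    """Scan entries in order, as a linear directory search would.

    Returns (found, comparisons). A miss costs len(names) comparisons.
    """
    comparisons = 0
    for name in names:
        comparisons += 1
        if name == wanted:
            return True, comparisons
    return False, comparisons


def sorted_lookup(sorted_names, wanted):
    """Binary search over sorted names, counting loop comparisons.

    Returns (found, comparisons); the count grows roughly as log2(N).
    """
    lo, hi = 0, len(sorted_names)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_names[mid] < wanted:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_names) and sorted_names[lo] == wanted
    return found, comparisons
```

With 1024 zero-padded names, a linear miss costs 1024 comparisons, while the sorted lookup needs at most 11 loop steps. The gap widens as the directory grows, which is the scaling concern raised in the message.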
If you are trying to abuse the FS name space as a hierarchical database with a wide key space, you would be better off:
1) Using a real database, where search will be O(log2(N)) instead of O(N) for non-existent entries and O(N/2) for existing entries
2) Using a relational instead of a hierarchical database; by their nature, hierarchical databases are designed to be deep and not wide
3) Or using a hierarchical database with tier indexing, so that certain tiers can be very wide, but most data is hierarchical (an example of this would be an LDAP directory; see http://www.openldap.org )
-- Terry From owner-freebsd-fs Thu May 2 10:21:29 2002 Delivered-To: freebsd-fs@freebsd.org Received: from laptop.tenebras.com (laptop.tenebras.com [66.92.188.18]) by hub.freebsd.org (Postfix) with SMTP id D3A7337B446 for ; Thu, 2 May 2002 10:20:49 -0700 (PDT) Received: (qmail 18320 invoked from network); 2 May 2002 17:20:48 -0000 Received: from unknown (HELO tenebras.com) (192.168.1.123) by 0 with SMTP; 2 May 2002 17:20:48 -0000 Message-ID: <3CD1756C.1060102@tenebras.com> Date: Thu, 02 May 2002 10:20:44 -0700 From: Michael Sierchio Reply-To: kudzu@tenebras.com User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:0.9.9) Gecko/20020416 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Terry Lambert Cc: freebsd-fs@freebsd.org Subject: Re: Filesystem References: <3CD16B80.BAC37697@mindspring.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Terry Lambert wrote: > It's always a bad idea to intentionally write non-portable code.
Seems to work for Microsoft ;-) To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu May 2 14:51:15 2002 Delivered-To: freebsd-fs@freebsd.org Received: from reiher.informatik.uni-wuerzburg.de (wi4d22.informatik.uni-wuerzburg.de [132.187.101.122]) by hub.freebsd.org (Postfix) with ESMTP id 1160837B41A; Thu, 2 May 2002 14:51:12 -0700 (PDT) Received: by reiher.informatik.uni-wuerzburg.de (Postfix, from userid 1001) id 95F3AAF1E; Thu, 2 May 2002 23:51:10 +0200 (CEST) Date: Thu, 2 May 2002 23:51:10 +0200 From: Matthias Buelow To: Terry Lambert Cc: ANdrei , FS@FreeBSD.ORG, bugs@FreeBSD.ORG Subject: Re: xterm & directory cat Message-ID: <20020502215110.GA587@reiher.informatik.uni-wuerzburg> References: <3CCE8982.6A915F2B@abc.ro> <3CCEB71D.1AD1F911@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3CCEB71D.1AD1F911@mindspring.com> User-Agent: Mutt/1.3.28i Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Terry Lambert writes: >After your xterm is "crashed", use control-right-mouse-button >"full reset". Your xterm will "uncrash". Typing reset^J (control+j, in case it doesn't accept ^M - return), or echo ^V^O (output a literal ctrl+o) will also reset the terminal. 
--mkb From owner-freebsd-fs Thu May 2 15:13:59 2002 Delivered-To: freebsd-fs@freebsd.org Received: from harrier.prod.itd.earthlink.net (harrier.mail.pas.earthlink.net [207.217.120.12]) by hub.freebsd.org (Postfix) with ESMTP id 8659E37B416; Thu, 2 May 2002 15:13:56 -0700 (PDT) Received: from pool0524.cvx21-bradley.dialup.earthlink.net ([209.179.194.14] helo=mindspring.com) by harrier.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173Opc-0006DP-00; Thu, 02 May 2002 15:13:45 -0700 Message-ID: <3CD1B9FC.6D75FF9A@mindspring.com> Date: Thu, 02 May 2002 15:13:16 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Matthias Buelow Cc: ANdrei , FS@FreeBSD.ORG, bugs@FreeBSD.ORG Subject: Re: xterm & directory cat References: <3CCE8982.6A915F2B@abc.ro> <3CCEB71D.1AD1F911@mindspring.com> <20020502215110.GA587@reiher.informatik.uni-wuerzburg> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Matthias Buelow wrote: > Terry Lambert writes: > >After your xterm is "crashed", use control-right-mouse-button > >"full reset". Your xterm will "uncrash". > > Typing reset^J (control+j, in case it doesn't accept ^M - return), > or echo ^V^O (output a literal ctrl+o) will also reset the terminal. On ANSI 3.64, "ESC #" is "lock keyboard". If that's seen, the only way to reset it is a ctrl-shift-break (on a VT100) or the xterm menu-based reset described previously. We used to put this escape sequence into our .finger files so that we could do ANSI 3.64 animations that the watcher would have to "sit back and enjoy".
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu May 2 17:21:50 2002 Delivered-To: freebsd-fs@freebsd.org Received: from reiher.informatik.uni-wuerzburg.de (wi4d22.informatik.uni-wuerzburg.de [132.187.101.122]) by hub.freebsd.org (Postfix) with ESMTP id 5BEB737B404; Thu, 2 May 2002 17:21:44 -0700 (PDT) Received: by reiher.informatik.uni-wuerzburg.de (Postfix, from userid 1001) id A8326AF1E; Fri, 3 May 2002 02:21:42 +0200 (CEST) Date: Fri, 3 May 2002 02:21:42 +0200 From: Matthias Buelow To: Terry Lambert Cc: ANdrei , FS@FreeBSD.ORG, bugs@FreeBSD.ORG Subject: Re: xterm & directory cat Message-ID: <20020503002142.GA382@reiher.informatik.uni-wuerzburg> References: <3CCE8982.6A915F2B@abc.ro> <3CCEB71D.1AD1F911@mindspring.com> <20020502215110.GA587@reiher.informatik.uni-wuerzburg> <3CD1B9FC.6D75FF9A@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3CD1B9FC.6D75FF9A@mindspring.com> User-Agent: Mutt/1.3.28i Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Terry Lambert writes: >On ANSI 3.64, "ESC #" is "lock keyboard". If that's seen, >the only way to reset is is a ctrl-shift-break (on a VT100) >or using the xterm menu based reset, previously described. Hmm, doesn't seem to work on xterm (xf4.2.0), though... 
--mkb From owner-freebsd-fs Fri May 3 0:59:56 2002 Delivered-To: freebsd-fs@freebsd.org Received: from vbook.express.ru (vbook.nc.express.ru [212.24.37.35]) by hub.freebsd.org (Postfix) with ESMTP id 0AD3837B404 for ; Fri, 3 May 2002 00:59:53 -0700 (PDT) Received: from vova by vbook.express.ru with local (Exim 3.36 #1) id 173Xyn-000GuD-00 for fs@freebsd.org; Fri, 03 May 2002 11:59:49 +0400 Subject: Re: Filesystem From: "Vladimir B. " Grebenschikov To: fs@freebsd.org In-Reply-To: References: Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable X-Mailer: Ximian Evolution 1.0.3 Date: 03 May 2002 11:59:48 +0400 Message-Id: <1020412788.5512.2.camel@vbook.express.ru> Mime-Version: 1.0 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Thu, 02.05.2002, at 19:25, Zhihui Zhang wrote: > You can create as many entries as you want in a directory. It is true only for files. But the original question was about subdirectories. Actually, there is a limit on the number of hard links for a file (or directory), and this limit is about 32k. Each subdirectory hard-links its '..' to the parent directory, so you can't create more than 32k subdirectories (this is true for 3.x), but I don't know how this stands on -STABLE and -CURRENT. Does anybody know? > -Zhihui > > On Thu, 2 May 2002, Anna M Ruthström wrote: > > > Hello, > > > > I'm using BSD 3.4 on my current system. > > Is there any limit on how many directories you can have in one > > directory, and if so, can this parameter be changed? I heard something about > > 32 676 directories in a directory; is this true? > > > > Thanks a lot! > > > > / Reg.
Anna -- Vladimir B. Grebenschikov vova@sw.ru SWsoft From owner-freebsd-fs Fri May 3 9:28: 2 2002 Delivered-To: freebsd-fs@freebsd.org Received: from gull.prod.itd.earthlink.net (gull.mail.pas.earthlink.net [207.217.120.84]) by hub.freebsd.org (Postfix) with ESMTP id 008F837B405 for ; Fri, 3 May 2002 09:27:59 -0700 (PDT) Received: from pool0248.cvx40-bradley.dialup.earthlink.net ([216.244.42.248] helo=mindspring.com) by gull.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173fuT-000373-00; Fri, 03 May 2002 09:27:54 -0700 Message-ID: <3CD2BA6D.B3661EED@mindspring.com> Date: Fri, 03 May 2002 09:27:25 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Vladimir B. Grebenschikov" Cc: fs@freebsd.org Subject: Re: Filesystem References: <1020412788.5512.2.camel@vbook.express.ru> Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org "Vladimir B. Grebenschikov" wrote: > > On Thu, 02.05.2002, at 19:25, Zhihui Zhang wrote: > > You can create as many entries as you want in a directory. > > It is true only for files.
> > But the original question was about subdirectories. > > Actually, there is a limit on the number of hard links for a file (or directory), and > this limit is about 32k. > > Each subdirectory hard-links its '..' to the parent directory, so you can't > create more than 32k subdirectories (this is true for 3.x), > but I don't know how this stands on -STABLE and -CURRENT. > > Does anybody know? The limit you are complaining about is FAT/VFAT specific. All FS's have their own limits. In FFS, a directory is a file. Therefore you are limited by the number of files. The number of files is limited by disk space and reserved space for inodes. -- Terry From owner-freebsd-fs Fri May 3 10:24:52 2002 Delivered-To: freebsd-fs@freebsd.org Received: from kali.avantgo.com (shadow.avantgo.com [64.157.226.66]) by hub.freebsd.org (Postfix) with ESMTP id AA89E37B416 for ; Fri, 3 May 2002 10:24:47 -0700 (PDT) Received: from river.avantgo.com ([10.11.30.114]) by kali.avantgo.com with Microsoft SMTPSVC(5.0.2195.3779); Fri, 3 May 2002 10:24:47 -0700 Date: Fri, 3 May 2002 10:24:46 -0700 (PDT) From: Scott Hess To: Terry Lambert Cc: "Vladimir B.
Grebenschikov" , Subject: Re: Filesystem In-Reply-To: <3CD2BA6D.B3661EED@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT X-OriginalArrivalTime: 03 May 2002 17:24:47.0412 (UTC) FILETIME=[6C65F740:01C1F2C7] Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org

Looks to me like Vladimir's supposition is correct for FFS:

scott@ganja2:test> uname -a
FreeBSD ganja2.avantgo.com 4.5-RELEASE-p4 FreeBSD 4.5-RELEASE-p4 #0: Tue Apr 23 16:28:25 PDT 2002 root@agdev2.avantgo.com:/avantgo/obj/avantgo/src/sys/GANJA i386
scott@ganja2:test> perl -e 'for (my $ii = 0; 1; $ii++) { mkdir(sprintf("%06u", $ii), 0755) || die "$!";}'
Too many links at -e line 1.
scott@ganja2:test> ls | wc
32765 32765 229355
scott@ganja2:test> ls -ld .
drwxr-xr-x 32767 scott wheel 530944 May 3 10:23 .

[32767 is 32765 links from the '..' of the subdirectories, one for the '.' of test, and one for test's entry in its parent directory.]

Later, scott

On Fri, 3 May 2002, Terry Lambert wrote: > "Vladimir B. Grebenschikov" wrote: > > > > On Thu, 02.05.2002, at 19:25, Zhihui Zhang wrote: > > > You can create as many entries as you want in a directory. > > > > It is true only for files. > > > > But the original question was about subdirectories. > > > > Actually, there is a limit on the number of hard links for a file (or directory), and > > this limit is about 32k. > > > > Each subdirectory hard-links its '..' to the parent directory, so you can't > > create more than 32k subdirectories (this is true for 3.x), > > but I don't know how this stands on -STABLE and -CURRENT. > > > > Does anybody know? > > The limit you are complaining about is FAT/VFAT specific. > > All FS's have their own limits. > > In FFS, a directory is a file. Therefore you are limited by > the number of files. The number of files is limited by disk > space and reserved space for inodes.
> > -- Terry > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-fs" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri May 3 10:45:20 2002 Delivered-To: freebsd-fs@freebsd.org Received: from kali.avantgo.com (shadow.avantgo.com [64.157.226.66]) by hub.freebsd.org (Postfix) with ESMTP id 5F86437B41C for ; Fri, 3 May 2002 10:45:11 -0700 (PDT) Received: from river.avantgo.com ([10.11.30.114]) by kali.avantgo.com with Microsoft SMTPSVC(5.0.2195.3779); Fri, 3 May 2002 10:45:11 -0700 Date: Fri, 3 May 2002 10:45:10 -0700 (PDT) From: Scott Hess To: fs@freebsd.org Subject: Re: Filesystem In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT X-OriginalArrivalTime: 03 May 2002 17:45:11.0178 (UTC) FILETIME=[45D1D6A0:01C1F2CA] Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Hmm. And I just noticed an interesting effect from softupdates (I think). If I follow my perl script with 'ls | xargs rmdir', and then immediately run the perl script again, I get the 'Too many links' error with fewer directories created. So: scott@ganja2:test> perl -e 'for (my $ii = 0; 1; $ii++) { mkdir(sprintf("%06u", $ii), 0755) || die "$!";}' ; ls | wc Too many links at -e line 1. 32765 32765 229355 scott@ganja2:test> ls | xargs rmdir ; perl -e 'for (my $ii = 0; 1; $ii++) { mkdir(sprintf("%06u", $ii), 0755) || die "$!";}' ; ls | wc Too many links at -e line 1. 3212 3212 22484 The number of directories created by the second pass varies from run to run. Sometimes it's even all 32765 directories... 
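[Editorial illustration, not part of the original message.] For readers who want to repeat the experiment above without Perl, here is a minimal Python sketch, assuming a POSIX system. The helper names `expected_nlink`, `max_subdirs`, and `make_subdirs` are hypothetical, and the value 32767 is taken from the `ls -ld` output earlier in the thread rather than from any header file. The `cap` argument is an added safety measure so that a casual run does not create tens of thousands of directories.

```python
import errno
import os
import tempfile

# Assumption: the link-count ceiling observed in the thread (ls -ld showed 32767).
LINK_MAX_OBSERVED = 32767


def expected_nlink(subdirs):
    """Link count of a directory holding `subdirs` subdirectories.

    One link for its entry in the parent, one for its own '.',
    and one for the '..' entry of each subdirectory.
    """
    return 2 + subdirs


def max_subdirs(link_max=LINK_MAX_OBSERVED):
    """Largest number of subdirectories allowed by a given link ceiling."""
    return link_max - 2


def make_subdirs(parent, cap):
    """Create numbered subdirectories until mkdir fails with EMLINK
    ("Too many links") or `cap` directories exist. Returns the count."""
    created = 0
    while created < cap:
        try:
            os.mkdir(os.path.join(parent, "%06d" % created), 0o755)
        except OSError as exc:
            if exc.errno == errno.EMLINK:
                break  # the filesystem's link limit was reached
            raise
        created += 1
    return created
```

Calling `make_subdirs(some_dir, cap=40000)` on a filesystem with the limit described here should stop at 32765, matching `max_subdirs()`. Other filesystems may have a different limit or may not count subdirectories in `st_nlink` at all, so treat the result as an observation about one system.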
Later, scott On Fri, 3 May 2002, Scott Hess wrote: > Looks to me like Vladimir's supposition is correct for FFS: > > scott@ganja2:test> uname -a > FreeBSD ganja2.avantgo.com 4.5-RELEASE-p4 FreeBSD 4.5-RELEASE-p4 #0: Tue Apr 23 16:28:25 PDT 2002 > root@agdev2.avantgo.com:/avantgo/obj/avantgo/src/sys/GANJA i386 > scott@ganja2:test> perl -e 'for (my $ii = 0; 1; $ii++) { mkdir(sprintf("%06u", $ii), 0755) || die "$!";}' > Too many links at -e line 1. > scott@ganja2:test> ls | wc > 32765 32765 229355 > scott@ganja2:test> ls -ld . > drwxr-xr-x 32767 scott wheel 530944 May 3 10:23 . > > [32767 is 32765 links from .. of subdirs, 1 for . of test, one for test in > parent directory.] > > Later, > scott > > On Fri, 3 May 2002, Terry Lambert wrote: > > "Vladimir B. Grebenschikov" wrote: > > > > > > χ Thu, 02.05.2002, Χ 19:25, Zhihui Zhang ΞΑΠΙΣΑΜ: > > > > You can create as many entries as you want in a directory. > > > > > > It is true only for files. > > > > > > But original question was about subdirectories. > > > > > > Actually there are limit on number of hardlinks for file(or dirs) and > > > this limit about 32k > > > > > > Each subdirectory hardlinks it's '..' to parent directory so, you can't > > > create more then 32k subdirectories (it is true for 3.x) > > > but I don't know how with this problem on -STABLE and -CURRENT ? > > > > > > Anybody knows ? > > > > The limit you are complaining about are FAT/VFAT specific. > > > > All FS's have their own limits. > > > > In FFS, a directory is a file. Therefore you are limited by > > the number of files. THe number of files is limited by disk > > space and reserved space for inodes. 
> > > > -- Terry > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > > with "unsubscribe freebsd-fs" in the body of the message > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri May 3 11: 3:52 2002 Delivered-To: freebsd-fs@freebsd.org Received: from swan.prod.itd.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id 0799537B400 for ; Fri, 3 May 2002 11:03:42 -0700 (PDT) Received: from pool0248.cvx40-bradley.dialup.earthlink.net ([216.244.42.248] helo=mindspring.com) by swan.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173hP7-0005bi-00; Fri, 03 May 2002 11:03:37 -0700 Message-ID: <3CD2D0DC.9B636A54@mindspring.com> Date: Fri, 03 May 2002 11:03:08 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Scott Hess Cc: "Vladimir B. Grebenschikov" , fs@freebsd.org Subject: Re: Filesystem References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Scott Hess wrote: > Looks to me like Vladimir's supposition is correct for FFS: > > scott@ganja2:test> uname -a > FreeBSD ganja2.avantgo.com 4.5-RELEASE-p4 FreeBSD 4.5-RELEASE-p4 #0: Tue Apr 23 16:28:25 PDT 2002 > root@agdev2.avantgo.com:/avantgo/obj/avantgo/src/sys/GANJA i386 > scott@ganja2:test> perl -e 'for (my $ii = 0; 1; $ii++) { mkdir(sprintf("%06u", $ii), 0755) || die "$!";}' > Too many links at -e line 1. > scott@ganja2:test> ls | wc > 32765 32765 229355 > scott@ganja2:test> ls -ld . > drwxr-xr-x 32767 scott wheel 530944 May 3 10:23 . > > [32767 is 32765 links from .. of subdirs, 1 for . of test, one for test in > parent directory.] Crap. Forgot about the link count. Yes, it's an int16_t. 
It looks like it's signed for a single compare, so it could be made unsigned, and the compare done against both 0 and 65535 (and/or prevented from underflowing in the first place). If you look at the stat man page, there's an nlink_t. This is a u_int16_t. So minimally, there is a limit of 65535 on all FS's, period. I believe this limit is common to all UNIX systems that use hard links to do ".." processing. Technically, it should be possible to do this without affecting the link count. However, the link count makes it really easy to open a directory and determine whether there are subdirectories, and to know how many there are. In any case, it's still an incredibly bad idea to have even a tenth of that many objects in a single directory, period. -- Terry From owner-freebsd-fs Fri May 3 11: 9:53 2002 Delivered-To: freebsd-fs@freebsd.org Received: from swan.prod.itd.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id 2259637B41C for ; Fri, 3 May 2002 11:09:43 -0700 (PDT) Received: from pool0248.cvx40-bradley.dialup.earthlink.net ([216.244.42.248] helo=mindspring.com) by swan.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173hUz-0006Gr-00; Fri, 03 May 2002 11:09:42 -0700 Message-ID: <3CD2D249.405D9EB4@mindspring.com> Date: Fri, 03 May 2002 11:09:13 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Scott Hess Cc: fs@freebsd.org Subject: Re: Filesystem References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Scott Hess wrote: > Hmm.
And I just noticed an interesting effect from softupdates (I think). > If I follow my perl script with 'ls | xargs rmdir', and then immediately > run the perl script again, I get the 'Too many links' error with fewer > directories created. So: [ ... ] > The number of directories created by the second pass varies from run to > run. Sometimes it's even all 32765 directories... The commit of the unlink operations are pending, so the real number of links, relative to the effective number of links, is much larger. Probably you should beat up Poul and Kirk, since they are supposedly currently working on a DARPA funded "UFS2" that's supposed to overcome field size limitations; it's just possible (but unlikely) that they have missed this one. Since they aren't changing the directory structure to a btree or anything (I don't know if they are reserving space or a type field so someone else can do that later without redoing the FS), you are still doing a bad thing when you put a lot of files or directories into a single directory. PS: If you want a flat FS with a single namespace, you should use inodes and manipulate them with NFS handles, instead. Then it can be as flat as you want. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri May 3 13:31:23 2002 Delivered-To: freebsd-fs@freebsd.org Received: from repulse.cnchost.com (repulse.concentric.net [207.155.248.4]) by hub.freebsd.org (Postfix) with ESMTP id 791C137B404 for ; Fri, 3 May 2002 13:31:18 -0700 (PDT) Received: from bitblocks.com (adsl-209-204-185-216.sonic.net [209.204.185.216]) by repulse.cnchost.com id QAA24496; Fri, 3 May 2002 16:31:04 -0400 (EDT) [ConcentricHost SMTP Relay 1.14] Message-ID: <200205032031.QAA24496@repulse.cnchost.com> To: Terry Lambert Cc: Scott Hess , "Vladimir B. Grebenschikov" , fs@FreeBSD.ORG Subject: Re: Filesystem In-reply-to: Your message of "Fri, 03 May 2002 11:03:08 PDT." 
<3CD2D0DC.9B636A54@mindspring.com> Date: Fri, 03 May 2002 13:31:03 -0700 From: Bakul Shah Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Terry Lambert writes: > I believe this limit is common to all UNIX systems that use hard > links to do ".." processing. Technically, it should be possible > to do this without affecting the link count. However, it's really easy > to open a directory and determine if there are subdirectories by > examining the link count on it, and to know how many there are by > examining the link count on it. Plan9 does ".." right. The same can be done in Unix by storing the rooted path in the kernel for a process's current working directory and by following some path rewrite rules: //.. == //../ == / /../ == / You would also have to deal with intermediate directories being renamed, filesystems being forcibly unmounted, and so on. Not storing the entire path for the cwd may have been the right decision for the '70s, but not since then. > In any case, it's still an incredibly bad idea to have even a tenth > of that many objects in a single directory, period. IMHO it is a bad idea not to have evolved directories to use a B-tree representation (at least when the number of entries exceeds some threshold). Implement mechanisms and leave policies to the users!
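[Editorial illustration, not part of the original message.] The idea of keeping a rooted path for the current working directory and resolving ".." textually can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `lexical_cd` is a hypothetical name, the function is not a description of how Plan 9 or any kernel implements it, and it ignores symlinks, renames of intermediate directories, and forced unmounts, which the message notes would also need handling. The only rewrite rule taken directly from the message is that ".." at the root stays at the root.

```python
def lexical_cd(cwd, target):
    """Return the new rooted path after changing directory to `target`.

    `cwd` is a rooted path such as "/usr/local". '.' and '..' are
    resolved on the stored path text alone, never via '..' hard links.
    """
    # An absolute target replaces the stored path; a relative one extends it.
    if target.startswith("/"):
        parts = []
    else:
        parts = [p for p in cwd.split("/") if p]
    for comp in target.split("/"):
        if comp in ("", "."):
            continue  # empty components and '.' change nothing
        if comp == "..":
            if parts:
                parts.pop()  # drop the last component
            # at the root, '..' stays at the root: /../ == /
        else:
            parts.append(comp)
    return "/" + "/".join(parts)
```

For example, `lexical_cd("/usr/local", "..")` yields `"/usr"` and `lexical_cd("/", "..")` yields `"/"`. Because nothing here depends on a '..' entry inside the directory itself, a scheme like this would not need the per-subdirectory hard link that drives up the parent's link count.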
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri May 3 16:53: 6 2002 Delivered-To: freebsd-fs@freebsd.org Received: from albatross.prod.itd.earthlink.net (albatross.mail.pas.earthlink.net [207.217.120.120]) by hub.freebsd.org (Postfix) with ESMTP id 0DBD337B404 for ; Fri, 3 May 2002 16:53:04 -0700 (PDT) Received: from pool0260.cvx40-bradley.dialup.earthlink.net ([216.244.43.5] helo=mindspring.com) by albatross.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173mr1-0002vA-00; Fri, 03 May 2002 16:52:47 -0700 Message-ID: <3CD322B2.FBEF3C19@mindspring.com> Date: Fri, 03 May 2002 16:52:18 -0700 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Bakul Shah Cc: Scott Hess , "Vladimir B. Grebenschikov" , fs@FreeBSD.ORG Subject: Re: Filesystem References: <200205032031.QAA24496@repulse.cnchost.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Bakul Shah wrote: > > In any case, it's still an incredibly bad idea to have even a tenth > > of that man objects in a single directory, period. > > IMHO it is a bad idea to not have evolved directories to use > a B-tree representation (at least when the number of entries > exceed some threshold. Implement mechanisms and leave > policies to the users! You can argue this, but then we are left with software that will only run well on AIX or [insert pet platform here], and runs dog slow on other platforms, because it assumes that the underlying implementation will always be O(log2(N)) instead of O(N) or O(N**2). 
It's a really crappy program that relies on underlying OS specific features for its efficiencies, because as soon as it's ported to an OS where the assumptions it makes are no longer true, it's screwed. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri May 3 17:19:50 2002 Delivered-To: freebsd-fs@freebsd.org Received: from illustrious.cnchost.com (illustrious.concentric.net [207.155.252.7]) by hub.freebsd.org (Postfix) with ESMTP id 77F7737B417 for ; Fri, 3 May 2002 17:19:47 -0700 (PDT) Received: from bitblocks.com (adsl-209-204-185-216.sonic.net [209.204.185.216]) by illustrious.cnchost.com id UAA13780; Fri, 3 May 2002 20:19:40 -0400 (EDT) [ConcentricHost SMTP Relay 1.14] Message-ID: <200205040019.UAA13780@illustrious.cnchost.com> To: Terry Lambert Cc: Bakul Shah , Scott Hess , "Vladimir B. Grebenschikov" , fs@FreeBSD.ORG Subject: Re: Filesystem In-reply-to: Your message of "Fri, 03 May 2002 16:52:18 PDT." <3CD322B2.FBEF3C19@mindspring.com> Date: Fri, 03 May 2002 17:19:39 -0700 From: Bakul Shah Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Terry Lambert writes: > Bakul Shah wrote: > > > In any case, it's still an incredibly bad idea to have even a tenth > > > of that man objects in a single directory, period. > > > > IMHO it is a bad idea to not have evolved directories to use > > a B-tree representation (at least when the number of entries > > exceed some threshold. Implement mechanisms and leave > > policies to the users! > > You can argue this, but then we are left with software that > will only run well on AIX or [insert pet platform here], and > runs dog slow on other platforms, because it assumes that > the underlying implementation will always be O(log2(N)) instead > of O(N) or O(N**2). 
>
> It's a really crappy program that relies on underlying OS
> specific features for its efficiencies, because as soon as it's
> ported to an OS where the assumptions it makes are no longer
> true, it's screwed.

Unless enough systems provide this capability no one sane will use it. So yes, portability suffers.

My frustration is with the 70s mindset when it comes to extending basic capabilities. Reasoning like: the disk access speed is very slow so the speed of directory access is not an issue. And since speed is not an issue a linear search is fine and dandy. Never mind that with a large buffer cache, chances are you will find dir. blocks in core and a linear search is not a great searching strategy when you have more than 10 to 20 items. And this is not the only such instance in the kernel.

If you build scalable solutions people will use them. If enough Unix variants provide fast dir search, others will have to pick it up.

From owner-freebsd-fs Fri May 3 17:46:39 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from falcon.prod.itd.earthlink.net (falcon.mail.pas.earthlink.net [207.217.120.74]) by hub.freebsd.org (Postfix) with ESMTP id 9D4FC37B417 for ; Fri, 3 May 2002 17:46:35 -0700 (PDT)
Received: from pool0260.cvx40-bradley.dialup.earthlink.net ([216.244.43.5] helo=mindspring.com) by falcon.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 173ngy-0003hi-00; Fri, 03 May 2002 17:46:28 -0700
Message-ID: <3CD32F43.327CDA46@mindspring.com>
Date: Fri, 03 May 2002 17:45:55 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Bakul Shah
Cc: Scott Hess , "Vladimir B.
Grebenschikov" , fs@FreeBSD.ORG
Subject: Re: Filesystem
References: <200205040019.UAA13780@illustrious.cnchost.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Bakul Shah wrote:
> Unless enough systems provide this capability no one sane
> will use it.  So yes, portability suffers.

Generally the "large number of objects in a directory" objection is brought up by some [FS] advocate, who wants to advocate their pet FS, and isn't really noticing a real problem: instead, they are implementing a solution, and then trying to invent a problem that requires it. So they really don't care about portability, they care about advocacy.

> My frustration is with the 70s mindset when it comes to
> extending basic capabilities.  Reasoning like: the disk
> access speed is very slow so the speed of directory access is
> not an issue.  And since speed is not an issue a linear
> search is fine and dandy.  Never mind that with a large
> buffer cache, chances are you will find dir. blocks in core
> and a linear search is not a great searching strategy when
> you have more than 10 to 20 items.  And this is not the
> only such instance in the kernel.

This case is an externalized interface used by programs, which have a choice in their implementation, and choose badly.

I agree that there is a lot of room for extending the basic OS capabilities; I would also argue that there is generally a lot of research in that area, as well, and that research isn't being put into practice, for the most part, for a reason other than "not invented here" or "it's too hard: let's go shopping" (though that is sometimes the reason). The main reason I think applies is that legacy interoperability has more value than the other benefits that the change brings.

For FS research, Poul and Kirk are working on UFS2.
It would be wise to appeal to them to include sufficient mechanisms in the way of extension fields so that you could (for example) mark a directory as being "btree" or "patricia tree" or "trie" or whatever, and have it work transparently with legacy support for linear directory block layout. Poul has already stated that he and Kirk do not intend to change the directory layout to a btree, even though they have an on disk format change that they will be making, so this is the most logical compatibility break. I don't know if they intend to provide sufficient extension insertion points for this type of thing (this would include extensions to the soft updates system being pluggable, as well -- so I doubt it). Either way, they haven't said, so you could always ask Kirk.

> If you build scalable solutions people will use them.  If
> enough Unix variants provide fast dir search, others will
> have to pick it up.

Fast dir search won't be picked up until important applications start to rely on it. And important applications won't rely on it until it's generally available. So the only real way to break the log-jam is to come up with a killer app, which relies on some feature you want to proselytize.

The main enemy of new features like this is that there is always more than one way to solve a problem. 8-).

-- Terry

From owner-freebsd-fs Fri May 3 19:59: 2 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from smtp02.mrf.mail.rcn.net (smtp02.mrf.mail.rcn.net [207.172.4.61]) by hub.freebsd.org (Postfix) with ESMTP id A188437B400 for ; Fri, 3 May 2002 19:58:49 -0700 (PDT)
Received: from 66-44-0-84.s84.apx1.lnh.md.dialup.rcn.com ([66.44.0.84] helo=localhost.)
by smtp02.mrf.mail.rcn.net with smtp (Exim 3.33 #10) id 173pl1-0003Vt-00; Fri, 03 May 2002 22:58:48 -0400
References: <200205032031.QAA24496@repulse.cnchost.com>
In-Reply-To: <200205032031.QAA24496@repulse.cnchost.com>
Date: Fri, 3 May 2002 22:41:26 EDT
From: Eric Jacobs
To: fs@freebsd.org, Bakul Shah
Subject: Re: Filesystem
Organization:
X-Mailer: Post Office 0.7.2 build 20010211(by eric@localhost 2001/02/11 21:52:30)
Message-ID:
X-Mailer: PostOffice 0.7

Bakul Shah wrote:
> Terry Lambert writes:
>
> > I believe this limit is common to all UNIX systems that use hard links
> > to do ".." processing.  Technically, it should be possible to do this
> > without affecting link count.  However, it's really easy
> > to open a directory and determine if there are subdirectories by
> > examining the link count on it, and to know how many there are by
> > examining the link count on it.
> >
> Plan9 does ".." right.  The same can be done in Unix by
> storing the rooted path in the kernel for a process'es
> current working dir. and by following some path rewrite
> rules:
>
> //.. ==
> //../ == /
> /../ == /

Those rules aren't valid on the basis of syntax alone. You would have to know which components are symbolic links. And once you take into account symbolic links, you have essentially what namei does anyway.

I think what Terry Lambert was saying was that since hard-linking directories isn't allowed anyway, there's no need to refcount them, except for the subdirectory counting tricks.

> You would also have to deal with middle directories being
> renamed, filesystems being forcibly unmounted and so on.
>
> Not storing the entire path for cwd may have been the right
> decision for '70s but not since then....

The entire path is stored indirectly via the VFS name cache, so getcwd() works _even_ for filesystems which do not implement "..". Implementing ".." at the VFS level would be just as simple. Probably the only reason it isn't is because it has been traditionally handled at the FS level.

> > In any case, it's still an incredibly bad idea to have even a tenth of
> > that many objects in a single directory, period.
>
> IMHO it is a bad idea to not have evolved directories to use a B-tree
> representation (at least when the number of entries exceeds some
> threshold).  Implement mechanisms and leave policies to the users!

If you can handle access considerations yourself, one creative solution might be to use getfh(2) and fhopen(2) and store the file handles however you want. This bypasses the kernel lookup entirely.

-- P

From owner-freebsd-fs Fri May 3 21:19:33 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from quic.net (romulus.quic.net [216.23.27.8]) by hub.freebsd.org (Postfix) with SMTP id 18AE937B400 for ; Fri, 3 May 2002 21:19:30 -0700 (PDT)
Received: (qmail 23710 invoked by uid 1032); 4 May 2002 04:19:36 -0000
From: utsl@quic.net
Date: Sat, 4 May 2002 00:19:36 -0400
To: Terry Lambert
Cc: Bakul Shah , Scott Hess , "Vladimir B.
Grebenschikov" , fs@FreeBSD.ORG
Subject: Re: Filesystem
Message-ID: <20020504041936.GA19646@quic.net>
References: <200205040019.UAA13780@illustrious.cnchost.com> <3CD32F43.327CDA46@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3CD32F43.327CDA46@mindspring.com>
User-Agent: Mutt/1.3.27i

On Fri, May 03, 2002 at 05:45:55PM -0700, Terry Lambert wrote:
> > If you build scalable solutions people will use them.  If
> > enough Unix variants provide fast dir search, others will
> > have to pick it up.
>
> Fast dir search won't be picked up until important applications
> start to rely on it.  And important applications won't rely on
> it until it's generally available.  So the only real way to break
> the log-jam is to come up with a killer app, which relies on some
> feature you want to proselytize.
>
> The main enemy of new features like this is that there is always
> more than one way to solve a problem.  8-).

In this particular case, most sane people try to rewrite the application to avoid this kind of situation in the first place. Most people add some directory hierarchy, like squid does, or use or abuse a database. There are also a few masochistic types that roll their own filesystem in userspace, and use raw disk.

OTOH, I've seen a very large application (it ran on a Sun E10K) that did absolutely nothing about it. It was designed to put some ~1-2k files into a spool directory, and rotate every day. Unfortunately, the application didn't ever get redesigned to handle the scale it was being used for. So when I dealt with it, they had a filesystem that had 800,000 to 1M files in 15-16 directories. (Varied from day to day.)
I found out about it when I was asked to figure out why the incremental backups for that filesystem never completed. They would run for ~35-40 hours and then crash. If I remember right, the backup program was running out of address space. 8-)

Even if the filesystem had used btrees, the backup program would still have crashed. It was trying to make a list in memory of all the files it needed to back up. It never actually wrote anything to tape... I don't know if all backup software does incrementals that way, but I'd bet most of them do.

So there can be other disadvantages to having lots of files in a directory besides slow directory lookups.

---Nathan

From owner-freebsd-fs Sat May 4 7:31: 9 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from gull.prod.itd.earthlink.net (gull.mail.pas.earthlink.net [207.217.120.84]) by hub.freebsd.org (Postfix) with ESMTP id 35BD437B41A for ; Sat, 4 May 2002 07:31:02 -0700 (PDT)
Received: from pool0048.cvx22-bradley.dialup.earthlink.net ([209.179.198.48] helo=mindspring.com) by gull.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 1740Yu-0006Rr-00; Sat, 04 May 2002 07:31:00 -0700
Message-ID: <3CD3F086.9F400956@mindspring.com>
Date: Sat, 04 May 2002 07:30:30 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Eric Jacobs
Cc: fs@freebsd.org, Bakul Shah
Subject: Re: Filesystem
References: <200205032031.QAA24496@repulse.cnchost.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Eric Jacobs wrote:
> > Plan9 does ".." right.  The same can be done in Unix by
> > storing the rooted path in the kernel for a process'es
> > current working dir.
> > and by following some path rewrite
> > rules:
> >
> > //.. ==
> > //../ == /
> > /../ == /
>
> Those rules aren't valid on the basis of syntax alone.  You would
> have to know which components are symbolic links.  And once you take
> into account symbolic links, you have essentially what namei does
> anyway.
>
> I think what Terry Lambert was saying was that since hard-linking
> directories isn't allowed anyway, there's no need to refcount them,
> except for the subdirectory counting tricks.

I meant that ".." being treated as a link is useful, because the link count itself can be useful information. However, the trade-off is that it limits the number of subdirectories.

The trade-off in the other direction is that you have to be prepared to descend into the directory. This isn't really that big a deal these days, now that there is an attribute bit indicating the entry is a directory in the directory entry itself, so it's possible to both avoid the stat, and still get the information, if the link count is such that it "indicates" there are no subdirectories.

Basically, some software will have to be hacked to traverse a directory for subdirectories, instead of just stat'ing the parent inode, and only traversing if the link count was > 2.

The disallowing of hard links on directories was actually my suggestion from ~1994, on the basis of working around POSIX time update requirements for hosted file services. If you pretend that directories are special, and that they aren't files, you can escape from a number of time updates that would otherwise be a "SHALL update" vs. a "SHALL mark for update".

Hard links on directories also fail to maintain parent/child relationships properly. Without such links, you are guaranteed that you can cache the parent in the child inode, which can let you further speed reverse traversal. Since it was only ever an option for root, it's really no big loss.
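
The link-count shortcut and the directory-entry type bit described above can be illustrated from userland. The following is a minimal Python sketch, not anything from the FreeBSD sources: the function names are hypothetical, and it assumes the traditional UFS convention that an empty directory has a link count of 2 ("." plus its entry in the parent). Filesystems that do not follow that convention make the hint unusable, which is why the sketch falls back to scanning.

```python
import os


def nlink_hint(nlink):
    """Interpret a directory link count the traditional UFS way.

    Returns False for "no subdirectories", True for "at least one",
    and None when the count is below 2 and so cannot be trusted
    (an assumption: such filesystems do not keep the classic count).
    """
    if nlink < 2:
        return None
    return nlink > 2


def list_subdirs(path):
    """Find subdirectories by scanning the directory entries.

    os.scandir() answers is_dir() from the entry's type field when the
    platform provides one, so no per-entry stat is normally needed.
    """
    with os.scandir(path) as entries:
        return sorted(e.name for e in entries
                      if e.is_dir(follow_symlinks=False))


def subdirs(path):
    """Skip the scan only when the link count says there is nothing to find."""
    if nlink_hint(os.stat(path).st_nlink) is False:
        return []
    return list_subdirs(path)
```

Calling `subdirs("/usr/src")` would stat the directory once and scan it only if the link count leaves open the possibility of subdirectories.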

> > You would also have to deal with middle directories being
> > renamed, filesystems being forcibly unmounted and so on.
> >
> > Not storing the entire path for cwd may have been the right
> > decision for '70s but not since then....
>
> The entire path is stored indirectly via the VFS name cache, so
> getcwd() works _even_ for filesystems which do not implement "..".
> Implementing ".." at the VFS level would be just as simple.  Probably
> the only reason it isn't is because it has been traditionally handled
> at the FS level.

The cache implementation LRU's it out. Saving the path-on-open works when not doing so fails, only because leaf nodes of type file don't maintain proper parent pointers.

The implementation at the VFS level should be handled by having real vnodes/inodes for hard links. Maintaining the link-to-link relationship would require some additional overhead, but it's minor.

Doing this would also allow you to store the parent inode of any inode... and since non-leaf inodes are always guaranteed to be directories, the recoverability of any open file's path to the root is guaranteed.

If 128 bytes is too large a stretch, it can be done with smaller "link nodes", but the net effect is the same: by moving the link out to an abstract FS artifact, rather than an artifact of a count and a directory entry, you gain a lot of benefit.

> > > In any case, it's still an incredibly bad idea to have even a tenth of
> > > that many objects in a single directory, period.
> >
> > IMHO it is a bad idea to not have evolved directories to use a B-tree
> > representation (at least when the number of entries exceeds some
> > threshold).  Implement mechanisms and leave policies to the users!
>
> If you can handle access considerations yourself, one creative solution
> might be to use getfh(2) and fhopen(2) and store the file handles however
> you want.  This bypasses the kernel lookup entirely.

I mentioned this, as a means of getting a flat (inode) name space.
The only real problem with this (and it's a doozy!) is that the fsck process expects to have a real directory from which it can derive the reference count, or the inode is considered "lost" and will end up in "lost+found" on the next fsck, as an FS inconsistency.

-- Terry

From owner-freebsd-fs Sat May 4 8:16: 4 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from gull.prod.itd.earthlink.net (gull.mail.pas.earthlink.net [207.217.120.84]) by hub.freebsd.org (Postfix) with ESMTP id 85C5D37B400 for ; Sat, 4 May 2002 08:16:00 -0700 (PDT)
Received: from pool0048.cvx22-bradley.dialup.earthlink.net ([209.179.198.48] helo=mindspring.com) by gull.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 1741GB-0003B5-00; Sat, 04 May 2002 08:15:44 -0700
Message-ID: <3CD3FB02.3EC1DA29@mindspring.com>
Date: Sat, 04 May 2002 08:15:14 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: utsl@quic.net
Cc: Bakul Shah , Scott Hess , "Vladimir B. Grebenschikov" , fs@FreeBSD.ORG
Subject: Re: Filesystem
References: <200205040019.UAA13780@illustrious.cnchost.com> <3CD32F43.327CDA46@mindspring.com> <20020504041936.GA19646@quic.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

utsl@quic.net wrote:
[ ... linear directory search times on the majority of systems ... ]
> OTOH, I've seen a very large application (it ran on a Sun E10K) that did
> absolutely nothing about it.  It was designed to put some ~1-2k files
> into a spool directory, and rotate every day.  Unfortunately, the
> application didn't ever get redesigned to handle the scale it was being
> used for.
> So when I dealt with it, they had a filesystem that had
> 800,000 to 1M files in 15-16 directories.  (Varied from day to day.)  I
> found out about it when I was asked to figure out why the incremental
> backups for that filesystem never completed.  They would run for ~35-40
> hours and then crash.  If I remember right, the backup program was
> running out of address space.  8-)
>
> Even if the filesystem had used btrees, the backup program would still
> have crashed.  It was trying to make a list in memory of all the files
> it needed to back up.  It never actually wrote anything to tape... I don't
> know if all backup software does incrementals that way, but I'd bet most
> of them do.
>
> So there can be other disadvantages to having lots of files in a
> directory besides slow directory lookups.

I wasn't really trying to exhaustively list all the reasons that it was bad to put a bunch of files in a large directory. There are an incredibly large number of reasons for it to be bad, and I have better things to do than spending the rest of time pointing out impedance mismatches in algorithms. 8-).

My take on an application that doesn't scale is that "fixing" the application by changing the behaviour of the underlying system is just propping up bad code. Bad code deserves to lose.

So if someone wrote an application like that, it's just as well that the programmer who failed to consider scaling issues lose out to the programmer who considered them. After all, it's very likely that the failure to consider scaling issues is more of an "all or nothing" thing, and that the failure to consider one means that solving it in the OS will just expose the next one.

There's really no way you can make the OS behave perfectly for all applications. At some point, applications programmers will have to learn how to program, or all bets are off.
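
The backup failure quoted above came from building the complete file list in memory before writing anything. As an illustration of the application-side alternative, here is a minimal Python sketch, assuming a hypothetical helper name `changed_since` and a simple "modified after a timestamp" rule for incrementals. It yields candidates one at a time, so memory use grows with the number of pending directories rather than the number of files.

```python
import os


def changed_since(root, since):
    """Yield paths of regular files under root modified after `since`.

    Only the stack of directories still to be visited is held in memory;
    file paths are handed to the caller as they are found.
    """
    pending = [root]
    while pending:
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    if entry.stat(follow_symlinks=False).st_mtime > since:
                        yield entry.path
```

A caller would consume this in a loop, for example `for path in changed_since("/var/spool/app", last_backup_time): ...`, where both the path and the variable are placeholders.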

-- Terry

From owner-freebsd-fs Sat May 4 13:39:25 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from glatton.cnchost.com (glatton.cnchost.com [207.155.248.47]) by hub.freebsd.org (Postfix) with ESMTP id 1525B37B400 for ; Sat, 4 May 2002 13:39:19 -0700 (PDT)
Received: from bitblocks.com (adsl-209-204-185-216.sonic.net [209.204.185.216]) by glatton.cnchost.com id QAA16949; Sat, 4 May 2002 16:39:08 -0400 (EDT) [ConcentricHost SMTP Relay 1.14]
Message-ID: <200205042039.QAA16949@glatton.cnchost.com>
To: Terry Lambert
Cc: Scott Hess , "Vladimir B. Grebenschikov" , fs@FreeBSD.ORG
Subject: Re: Filesystem
In-reply-to: Your message of "Fri, 03 May 2002 17:45:55 PDT." <3CD32F43.327CDA46@mindspring.com>
Date: Sat, 04 May 2002 13:39:07 -0700
From: Bakul Shah

> Generally the "large number of objects in a directory" objection
> is brought up by some [FS] advocate, who wants to advocate their
> pet FS, and isn't really noticing a real problem: instead, they
> are implementing a solution, and then trying to invent a problem
> that requires it.  So they really don't care about portability,
> they care about advocacy.

If anything, I am advocating the zero, one and infinity rule. Don't place arbitrary limits and justify it as a good thing.

A directory provides a namespace and using it for that purpose for any number of files is a perfectly sane thing to do. If it worked well enough, you wouldn't see abominations like ftp.netcom.com:users/ba/bakul (where they had to use a two level scheme to improve access time).

BTW, I too have come across a couple of cases where tens of thousands of files were stored in one dir.
In one case it was a "throw away" program that ended up being a critical tool and the prototype got used and expanded instead of being simply rewritten. By the time I got involved, it also depended on Sybase but they couldn't figure out what to do with those zillion existing files!

> > My frustration is with the 70s mindset when it comes to
> > extending basic capabilities.  Reasoning like: the disk
> > access speed is very slow so the speed of directory access is
> > not an issue.  And since speed is not an issue a linear
> > search is fine and dandy.  Never mind that with a large
> > buffer cache, chances are you will find dir. blocks in core
> > and a linear search is not a great searching strategy when
> > you have more than 10 to 20 items.  And this is not the
> > only such instance in the kernel.
>
> This case is an externalized interface used by programs, which
> have a choice in their implementation, and choose badly.

Can you please translate that in simple English?

> I agree that there is a lot of room for extending the basic OS
> capabilities; I would also argue that there is generally a lot
> of research in that area, as well, and that research isn't being
> put into practice, for the most part, for a reason other than
> "not invented here" or "it's too hard: let's go shopping" (though
> that is sometimes the reason).  The main reason I think applies
> is that legacy interoperability has more value than the other
> benefits that the change brings.

XFS from SGI also has btree directories so at least some vendors are doing this.

I suspect in the free software community the reason is likely to be a) it is not hip enough, b) people who care & have expertise don't have time and/or inclination, c) NIH -- let us do our own cool thing even if it is just a tiny variation. BTW, I don't look down on any of these reasons. It is just the way things are.

> For FS research, Poul and Kirk are working on UFS2.
> It would be
> wise to appeal to them to include sufficient mechanisms in the
> way of extension fields so that you could (for example) mark a
> directory as being "btree" or "patricia tree" or "trie" or
> whatever, and have it work transparently with legacy support for
> linear directory block layout.  Poul has already stated that he
> and Kirk do not intend to change the directory layout to a btree,
> even though they have an on disk format change that they will be
> making, so this is the most logical compatibility break.  I don't
> know if they intend to provide sufficient extension insertion
> points for this type of thing (this would include extensions to
> the soft updates system being pluggable, as well -- so I doubt
> it).  Either way, they haven't said, so you could always ask Kirk.

I don't recall the goals of UFS2 being published except extending some limits. But to me this is a perfect example of NIH (I admit I don't know the details but seems that way). Why not just clone XFS? SGI can already achieve amazing data rates with it, its design is proven by fire and it has a lot of good features.

Note that a number of companies are currently doing a lot of research in the FS area but I am afraid most of it will die with the companies.

> Fast dir search won't be picked up until important applications
> start to rely on it.  And important applications won't rely on
> it until it's generally available.  So the only real way to break
> the log-jam is to come up with a killer app, which relies on some
> feature you want to proselytize.

No app has to critically rely on fast dir search for it to be useful. It can be done (almost) transparently and most all lookups will speed up. Store a "dir type code" somewhere in the dir. inode. Use that to select the appropriate function table from namei(). Add a way to convert between dir types.
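
The "dir type code" idea in the last paragraph can be sketched as a small dispatch table. This is an illustrative Python model only, not kernel code and not any existing FreeBSD interface: the type codes, the dictionary standing in for a directory inode, and every function name are hypothetical. A sorted list with binary search stands in for a real B-tree.

```python
import bisect

DIR_LINEAR = 0  # hypothetical type code: entries kept in creation order
DIR_SORTED = 1  # hypothetical type code: entries kept sorted by name


def linear_lookup(entries, name):
    """O(N) scan over (name, inode_number) pairs."""
    for entry_name, ino in entries:
        if entry_name == name:
            return ino
    return None


def sorted_lookup(entries, name):
    """O(log N) binary search; entries must be sorted by name."""
    i = bisect.bisect_left(entries, (name,))
    if i < len(entries) and entries[i][0] == name:
        return entries[i][1]
    return None


# One function table per directory type, selected by the stored code.
DIR_OPS = {
    DIR_LINEAR: {"lookup": linear_lookup, "build": list},
    DIR_SORTED: {"lookup": sorted_lookup, "build": sorted},
}


def dir_lookup(inode, name):
    """Dispatch the lookup through the table chosen by the type code."""
    return DIR_OPS[inode["dir_type"]]["lookup"](inode["entries"], name)


def dir_convert(inode, new_type):
    """Rebuild the entries in the layout the new type expects."""
    return {"dir_type": new_type,
            "entries": DIR_OPS[new_type]["build"](inode["entries"])}
```

Callers only ever use `dir_lookup`, so a directory converted with `dir_convert` gets the faster search without any change on their side, which is the transparency the paragraph argues for.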

From owner-freebsd-fs Sat May 4 19:59: 7 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from scaup.prod.itd.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49]) by hub.freebsd.org (Postfix) with ESMTP id D739037B404 for ; Sat, 4 May 2002 19:58:55 -0700 (PDT)
Received: from pool0267.cvx22-bradley.dialup.earthlink.net ([209.179.199.12] helo=mindspring.com) by scaup.prod.itd.earthlink.net with esmtp (Exim 3.33 #2) id 174CEU-0004ev-00; Sat, 04 May 2002 19:58:42 -0700
Message-ID: <3CD49FC5.D1B17CB7@mindspring.com>
Date: Sat, 04 May 2002 19:58:13 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Bakul Shah
Cc: Scott Hess , "Vladimir B. Grebenschikov" , fs@FreeBSD.ORG
Subject: Re: Filesystem
References: <200205042039.QAA16949@glatton.cnchost.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Bakul Shah wrote:
> If anything, I am advocating the zero, one and infinity rule.
> Don't place arbitrary limits and justify it as a good thing.

And, if anything, I'm saying "don't try to pretend that arbitrary rules don't exist on 140+ UNIX systems, just because they don't exist on 2-3 UNIX systems when configured with particular non-default options". 8-).

> A directory provides a namespace and using it for that
> purpose for any number of files is a perfectly sane thing to
> do.

Time to whip out the ol' "reductio ad absurdum"...

Actually, it's not. It turns out, the size of hard disks is finite, and will tend to continue to be finite for the foreseeable future.
So no matter how you look at it, there's an "arbitrary limit" imposed by the amount of storage you have, divided by the number of bytes it takes to store an average entry.

> If it worked well enough, you wouldn't see abominations
> like ftp.netcom.com:users/ba/bakul (where they had to use a two
> level scheme to improve access time).

Actually, this is a pretty reasonable thing to do; it gives a nice limit on the number of entries when the thing scanning it isn't running software, but is running wetware, instead. No matter how you slice it, putting 10,000 files in a directory, or even 1,000, generally results in significant barriers to processing the thing by a human being.

> BTW, I too have come across a couple of cases where tens of
> thousands of files were stored in one dir.  In one case it
> was a "throw away" program that ended up being a critical
> tool and the prototype got used and expanded instead of being
> simply rewritten.  By the time I got involved, it also
> depended on Sybase but they couldn't figure out what to do
> with those zillion existing files!

Bite the bullet, and actually solve the problem, if it doesn't work for you. If it works for you, then file the observation and move on.

In other words, perhaps the problem wasn't "what to do with those zillion existing files", but "why do you think it's necessary to *do* something with those zillion existing files?". 8-).

> > This case is an externalized interface used by programs, which
> > have a choice in their implementation, and choose badly.
>
> Can you please translate that in simple English?

Yeah.
You have to interface to the underlying system to get to the files, and you don't necessarily have control of which underlying system you end up running on, so it behooves you to not make implementation assumptions that depend on a particular underlying system, or a particular implementation technology or algorithm being present in the underlying system -- especially if such technology is not implemented in the vast majority of systems.

Or in even plainer English... "Write portable code."

> > I agree that there is a lot of room for extending the basic OS
> > capabilities; I would also argue that there is generally a lot
> > of research in that area, as well, and that research isn't being
> > put into practice, for the most part, for a reason other than
> > "not invented here" or "it's too hard: let's go shopping" (though
> > that is sometimes the reason).  The main reason I think applies
> > is that legacy interoperability has more value than the other
> > benefits that the change brings.
>
> XFS from SGI also has btree directories so at least some
> vendors are doing this.

Yet I don't see the adoption of this technology happening in Linux, even though it's available for Linux, and the only places I *do* see it being adopted are where it's integrated into the default filesystem type for the particular OS platform.

This is why I've thought implementing XFS in FreeBSD was a losing proposition: it's not a case of "if you build it, they will come", it's a case of "if it comes with the OS, then, yeah, we'll leave it turned on".

> I suspect in the free software community the reason is likely
> to be a) it is not hip enough,

See, most people have a general misunderstanding of what drives Open Source projects. They think they can just declare a project, and, by doing so, hordes of programmers will descend upon it and write the code for you.

Like army ants eating everything in their path. Or killer bees attacking someone trying to dig a post hole through their nest.

The fact is that the only thing that motivates volunteerism on an Open Source project is preexisting working code. There is nothing so populous as a declared Open Source project which goes nowhere for lack of something to tinker with.

> b) people who care & have expertise don't have time and/or
> inclination,

Certainly, my motivation to work on XFS (or anything else where it can't be compiled into and distributed on a CDROM as the default because of license conflicts) is pretty much zilch.

I think that people generally don't give engineers credit for intelligence. They treat them as if they were autistic savants, unable to really understand the ramifications of their work. RMS certainly does this, when he assumes that programmers will program for the love of it, with no reward asked or given, other than the task itself. I guess if you get enough grants and other funding unrelated to your work product, you might develop that idea.

> c) NIH -- let
> us do our own cool thing even if it is just a tiny variation.
> BTW, I don't look down on any of these reasons.  It is just the
> way things are.

I look down on NIH. It's blatant stupidity.

Generally, what people call NIH comes down to other factors, though. For example, a lot of people think any Open Source license is equivalent, and so they are incapable of understanding someone who writes new code to fulfill a function, merely to get out from under a license.

> I don't recall the goals of UFS2 being published except
> extending some limits.

Read the DARPA document that initiated the work. It goes into some more detail.

> But to me this is a perfect example
> of NIH (I admit I don't know the details but seems that way).
> Why not just clone XFS?  SGI can already achieve amazing data
> rates with it, its design is proven by fire and it has a lot
> of good features.

The License.
The impossibility of distributing a CDROM that installs a precompiled binary image with the FS on it due to license issues making the resulting binary illegal, according to most corporate IP lawyers (engineering opinions do not matter in a legal risk analysis equation).

> Note that a number of companies are currently doing a lot of
> research in the FS area but I am afraid most of it will die
> with the companies.

Yes. That's a matter for tort reform of IP law, requiring source escrow to obtain any protection whatsoever.

> > Fast dir search won't be picked up until important applications
> > start to rely on it.  And important applications won't rely on
> > it until it's generally available.  So the only real way to break
> > the log-jam is to come up with a killer app, which relies on some
> > feature you want to proselytize.
>
> No app has to critically rely on fast dir search for it to be
> useful.  It can be done (almost) transparently and most all
> lookups will speed up.  Store a "dir type code" somewhere in
> the dir. inode.  Use that to select the appropriate function
> table from namei().  Add a way to convert between dir types.

No app that you know of so far. Maybe there is a "killer app" that needs it. It's doubtful, though.

If you're right, it means that fast and large directory performance is irrelevant to the big picture. I'd actually agree with that assessment.

-- Terry
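
The two-level layout discussed in this exchange (the users/ba/bakul example) is the kind of application-side workaround that stays portable across directory implementations. A minimal Python sketch follows; the function name and the choice of a two-character prefix are assumptions made for illustration, not a description of how the cited FTP site actually computed its paths.

```python
import posixpath


def two_level_path(root, name, width=2):
    """Place `name` under a subdirectory named after its first characters.

    This bounds the size of any one directory by fanning entries out
    across prefix directories, whatever search the filesystem uses.
    """
    if not name:
        raise ValueError("name must not be empty")
    return posixpath.join(root, name[:width], name)
```

With these assumptions, `two_level_path("users", "bakul")` gives `users/ba/bakul`, matching the layout quoted in the thread.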