From owner-freebsd-fs Sun Mar 18 00:38:48 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from relay.butya.kz (butya-gw.butya.kz [212.154.129.94]) by hub.freebsd.org (Postfix) with ESMTP id 62D8437B719; Sun, 18 Mar 2001 00:38:43 -0800 (PST) (envelope-from bp@butya.kz)
Received: by relay.butya.kz (Postfix, from userid 1000) id 952D5288DD; Sun, 18 Mar 2001 14:38:37 +0600 (ALMT)
Received: from localhost (localhost [127.0.0.1]) by relay.butya.kz (Postfix) with ESMTP id 8E43F2878C; Sun, 18 Mar 2001 14:38:37 +0600 (ALMT)
Date: Sun, 18 Mar 2001 14:38:37 +0600 (ALMT)
From: Boris Popov
To: Sergey Babkin
Cc: security@freebsd.org, Wes Peters, Robert Watson, fs@freebsd.org
Subject: Re: about common group & user ID space (PR kern/14584)
In-Reply-To: <3AB3FC38.94711FFF@bellatlantic.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sat, 17 Mar 2001, Sergey Babkin wrote:

> I want to commit PR kern/14584. I've been told that it's good
> to discuss it in -arch, -security and -fs. (It has been sort of
> discussed on -hackers already, there were not much replies).

Well, the idea looks good. It doesn't break any existing command, except that one needs a (simple) tool to manage the required pseudo-flat UID/GID space.

However, I would like it more if this behavior could be enabled on a per-mount basis (but I guess we're out of spare mount options).
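For readers who have not seen the PR: going only by Sergey's description later in this thread, the change appears to let a file whose owner ID lies in the "common" ID range be treated as owned by any process that belongs to the group with that same numeric ID. The following is a rough model of that check, not the actual kernel patch. The function name, the parameter names and the use of a plain set for the process's groups are all assumptions made for illustration; only the threshold of 100 and the "lowest common ID" sysctl come from the thread.

```python
def owner_matches(file_uid, proc_uid, proc_gids, commonid_low, disable_below=100):
    """Model of the owner test with a common UID/GID space (hypothetical names).

    file_uid      -- owner ID stored on the file
    proc_uid      -- effective UID of the process
    proc_gids     -- set of group IDs the process belongs to
    commonid_low  -- lowest ID of the common space (the sysctl in the thread)
    disable_below -- values under this turn the feature off (100 in the thread)
    """
    if file_uid == proc_uid:
        return True  # the ordinary owner check still applies
    if commonid_low < disable_below:
        return False  # common ID code disabled altogether
    # The extra cost mentioned in the thread: compare the file's owner ID
    # against every group of the process.
    return file_uid >= commonid_low and file_uid in proc_gids
```

With `commonid_low` set to 1000, a file owned by ID 5000 would count as "owned" by a process in group 5000, while a file owned by ID 50 would not, even for a process in group 50.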
-- Boris Popov http://www.butya.kz/~bp/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sun Mar 18 4:40:40 2001 Delivered-To: freebsd-fs@freebsd.org Received: from ringworld.nanolink.com (ringworld.nanolink.com [195.24.48.13]) by hub.freebsd.org (Postfix) with SMTP id 44C6D37B719 for ; Sun, 18 Mar 2001 04:40:12 -0800 (PST) (envelope-from roam@orbitel.bg) Received: (qmail 69058 invoked by uid 1000); 18 Mar 2001 12:38:22 -0000 Date: Sun, 18 Mar 2001 14:38:22 +0200 From: Peter Pentchev To: Dag-Erling Smorgrav Cc: Tony Finch , Duncan Barclay , Kris Kennaway , hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: httpfs Message-ID: <20010318143822.F49603@ringworld.oblivion.bg> Mail-Followup-To: Dag-Erling Smorgrav , Tony Finch , Duncan Barclay , Kris Kennaway , hackers@FreeBSD.ORG, fs@FreeBSD.ORG References: <20010310031515.A8998@mollari.cthul.hu> <20010315095533.C12432@ringworld.oblivion.bg> <000d01c0ad3c$0ed83fb0$d26020c2@Cadence.COM> <000d01c0ad3c$0ed83fb0$d26020c2@Cadence.COM> <20010315124244.A442@ringworld.oblivion.bg> <20010316054649.F385@hand.dotat.at> <20010316174424.A428@ringworld.oblivion.bg> <20010317180055.A486@ringworld.oblivion.bg> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from des@ofug.org on Sat, Mar 17, 2001 at 05:03:42PM +0100 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Sat, Mar 17, 2001 at 05:03:42PM +0100, Dag-Erling Smorgrav wrote: > Peter Pentchev writes: > > On Sat, Mar 17, 2001 at 04:53:34PM +0100, Dag-Erling Smorgrav wrote: > > > Peter Pentchev writes: > > > > There was at the time - socketpair(2) had totally slipped my mind ;) > > > Umm, you want pipe(2), not socketpair(2). > > Actually, I want socketpair(2). 
pipe(2) was what I used before, > > and that's the reason I had a read-only file descriptor - the portalfs > > architecture allows for only one fd to be returned, and pipe(2) > > provides a one-way pipe. > > Not in FreeBSD. Oops. OK. I RTFM'd, and fixed it. Thanks to everyone who pointed that out :) G'luck, Peter -- This sentence no verb. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sun Mar 18 6:51:59 2001 Delivered-To: freebsd-fs@freebsd.org Received: from point.osg.gov.bc.ca (point.osg.gov.bc.ca [142.32.102.44]) by hub.freebsd.org (Postfix) with ESMTP id 5C6B637B718; Sun, 18 Mar 2001 06:51:54 -0800 (PST) (envelope-from Cy.Schubert@uumail.gov.bc.ca) Received: (from daemon@localhost) by point.osg.gov.bc.ca (8.8.7/8.8.8) id GAA03707; Sun, 18 Mar 2001 06:48:21 -0800 Received: from passer.osg.gov.bc.ca(142.32.110.29) via SMTP by point.osg.gov.bc.ca, id smtpda03705; Sun Mar 18 06:48:09 2001 Received: (from uucp@localhost) by passer.osg.gov.bc.ca (8.11.2/8.9.1) id f2IEm0D17167; Sun, 18 Mar 2001 06:48:00 -0800 (PST) Received: from cwsys9.cwsent.com(10.2.2.1), claiming to be "cwsys.cwsent.com" via SMTP by passer9.cwsent.com, id smtpdb17165; Sun Mar 18 06:47:45 2001 Received: (from uucp@localhost) by cwsys.cwsent.com (8.11.3/8.9.1) id f2IElef41927; Sun, 18 Mar 2001 06:47:40 -0800 (PST) Message-Id: <200103181447.f2IElef41927@cwsys.cwsent.com> Received: from localhost.cwsent.com(127.0.0.1), claiming to be "cwsys" via SMTP by localhost.cwsent.com, id smtpdF41921; Sun Mar 18 06:47:18 2001 X-Mailer: exmh version 2.3.1 01/18/2001 with nmh-1.0.4 Reply-To: Cy Schubert - ITSD Open Systems Group From: Cy Schubert - ITSD Open Systems Group X-Sender: schubert To: Sergey Babkin Cc: security@FreeBSD.ORG, Wes Peters , Robert Watson , fs@FreeBSD.ORG Subject: Re: about common group & user ID space (PR kern/14584) In-reply-to: Your message of "Sat, 17 Mar 2001 19:07:20 EST." 
<3AB3FC38.94711FFF@bellatlantic.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sun, 18 Mar 2001 06:47:17 -0800 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org In message <3AB3FC38.94711FFF@bellatlantic.net>, Sergey Babkin writes: > All, > > I want to commit PR kern/14584. I've been told that it's good From an operational standpoint I see one problem. Some sites use UID 0-999 and 65000-65535 for use by special accounts, such as www, ftp, oracle, etc. In some cases this policy is dictated by a desire to have some kind of commonality across various vendor platforms, some of which reserve some odd UID's and GID's for vendor supplied software or purposes. The only suggestion I would make is that a range could be specified. For example instead of vfs.commonid, vfs.commonid.low and vfs.commonid.high, allowing a site to, for example, reserve UID/GID's 10000-19999 or any other range as common ID's. Regards, Phone: (250)387-8437 Cy Schubert Fax: (250)387-5766 Team Leader, Sun/Alpha Team Internet: Cy.Schubert@osg.gov.bc.ca Open Systems Group, ITSD, ISTA Province of BC To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sun Mar 18 11:48:33 2001 Delivered-To: freebsd-fs@freebsd.org Received: from lariat.org (lariat.org [12.23.109.2]) by hub.freebsd.org (Postfix) with ESMTP id 5201537B719; Sun, 18 Mar 2001 11:48:29 -0800 (PST) (envelope-from brett@lariat.org) Received: from mustang.lariat.org (IDENT:ppp0.lariat.org@lariat.org [12.23.109.2]) by lariat.org (8.9.3/8.9.3) with ESMTP id MAA01358; Sun, 18 Mar 2001 12:42:26 -0700 (MST) Message-Id: <4.3.2.7.2.20010318123759.00d9dd10@localhost> X-Sender: brett@localhost X-Mailer: QUALCOMM Windows Eudora Version 4.3.2 Date: Sun, 18 Mar 2001 12:42:17 -0700 To: Terry Lambert , babkin@bellatlantic.net (Sergey Babkin) From: Brett Glass Subject: Re: about common group & user ID space (PR kern/14584) Cc: 
security@FreeBSD.ORG, wes@softweyr.com (Wes Peters), rwatson@FreeBSD.ORG (Robert Watson), fs@FreeBSD.ORG In-Reply-To: <200103180738.AAA03250@usr05.primenet.com> References: <3AB3FC38.94711FFF@bellatlantic.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org At 12:38 AM 3/18/2001, Terry Lambert wrote: >The benefits in not having the grovel through the FS contents, or >do a more complex ID space transformations, and the moving of the >majority of changes to user space, combined with the fact that if >you turn it off, the ownership doesn't need to be reverted, are >all plusses. At the same time, it'd be nice to eliminate the arbitrary limitations on (a) the number of groups of which a user can be a member and (b) the number of members in a group. Both of these limitations often bite administrators who, for example, want most users of a system to be members of a particular group or want to implement group-based access control schemes with a moderate degree of granularity. Classes won't cut it for this purpose, alas, because they're not built into file system security. 
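Limit (a) above is presumably the per-process cap on supplementary groups that POSIX calls NGROUPS_MAX. A minimal sketch for checking how close a given account is to that cap on a particular system follows. It uses Python only for brevity, the function name is made up for illustration, and no specific limit value is assumed since it differs between systems and releases.

```python
import os


def group_limits():
    """Return (supplementary group IDs of this process, per-process limit)."""
    current = os.getgroups()              # groups the kernel attached at login
    limit = os.sysconf("SC_NGROUPS_MAX")  # POSIX name for the cap
    return current, limit


groups, limit = group_limits()
print(f"member of {len(groups)} group(s); this system allows {limit}")
```

Running it as the affected user shows whether the login process actually received every group listed for that user in /etc/group.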
--Brett

From owner-freebsd-fs Sun Mar 18 12:51:48 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id B3BD737B71F; Sun, 18 Mar 2001 12:51:43 -0800 (PST) (envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2IKp4g01900; Sun, 18 Mar 2001 12:51:04 -0800 (PST) (envelope-from dillon)
Date: Sun, 18 Mar 2001 12:51:04 -0800 (PST)
From: Matt Dillon
Message-Id: <200103182051.f2IKp4g01900@earth.backplane.com>
To: "Duncan Barclay"
Cc: "Peter Pentchev", "Kris Kennaway"
Subject: Re: httpfs
References: <20010310031515.A8998@mollari.cthul.hu> <20010315095533.C12432@ringworld.oblivion.bg> <000d01c0ad3c$0ed83fb0$d26020c2@Cadence.COM>
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

:I don't really think that portalfs is the right thing to use to build
:an httpfs with, but I would like to see how you managed to get your example
:to work. Are you using stdout to create an anonymous file handle? What happens
:if two processes concurrently read from /p/http/*?
:
:Duncan
:
:--
:_____________________________________________________________
:Duncan Barclay | God smiles upon the little children,

You could certainly write a program to sit in the middle and cache the request to handle that case.

The problem with portalfs is that you can't 'cd' into it or do directory operations on it, and filesystem operations such as lseek, fstat, and so forth cannot be intercepted. It would be the ultimate coolness if you could.

We need a better solution than faking an NFS mount to be able to run *real* filesystems in user space.

But, that aside, portalfs works just dandy for getting simple file handles from a path.
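Earlier in this thread Peter noted that portalfs can hand back only one descriptor, which is why he reached for socketpair(2): a single end of a socket pair can be both read and written. A small sketch of that property, written in Python for brevity, is shown here. It does not use portalfs at all, and the request and reply strings are placeholders rather than real HTTP.

```python
import socket


def roundtrip():
    """Send a request one way and a reply back over one connected socket pair."""
    client, daemon = socket.socketpair()  # two connected, bidirectional endpoints
    try:
        client.sendall(b"GET /index.html")  # the consumer writes on its single fd
        request = daemon.recv(1024)         # the backend reads the request
        daemon.sendall(b"200 OK")           # and answers over the same pair
        reply = client.recv(1024)           # the consumer reads on the same fd
        return request, reply
    finally:
        client.close()
        daemon.close()
```

The consumer needs nothing beyond the one descriptor it was given, which matches the one-fd constraint described for portalfs.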
-Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sun Mar 18 14:10:34 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id 6258337B718; Sun, 18 Mar 2001 14:10:27 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.1/8.11.1) with SMTP id f2ILRVh47947; Sun, 18 Mar 2001 16:27:31 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Sun, 18 Mar 2001 16:27:30 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Matt Dillon Cc: Duncan Barclay , Peter Pentchev , Kris Kennaway , hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: httpfs In-Reply-To: <200103182051.f2IKp4g01900@earth.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Sun, 18 Mar 2001, Matt Dillon wrote: > You could certainly write a program to sit in the middle and cache > the request to handle that case. > > The problem with portalfs is that you can't 'cd' into it or do > directory operations on it, and filesystem operations such as lseek, > fstat, and so forth cannot be intercepted. It would be the ultimate > coolness if you could. > > We need a better solution then faking an NFS mount to be able to run > *real* filesystems in user space. > > But, that aside, portalfs works just dandy for getting simple file handles > from a path. Take a look at the XFS module included with Arla, and the Coda kernel module. They're both targetted at the idea that a userspace daemon will deal with open/close/directory requests, providing container vnodes for the actual files on demand, allowing the kernel to efficiently provide them to consumers. It's easy to imagine an HTTP backend daemon for them. 
The Arla kernel module is probably a bit more mature and better maintained; on the other hand, the Coda module is in our sys/ tree already. The OpenBSD folk have actually imported Arla into their distribution, which is not a bad idea now that OpenAFS is around... (Of course, we still need someone to port OpenAFS so that we have a free server -- with IFS on the server side, we should be able to exhibit a substantially simpler implementation with the same performance benefits as the AFS iopen() stuff :-)

Robert N M Watson FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org NAI Labs, Safeport Network Services

From owner-freebsd-fs Sun Mar 18 15:39:37 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from smtp10.phx.gblx.net (smtp10.phx.gblx.net [206.165.6.140]) by hub.freebsd.org (Postfix) with ESMTP id 92FD937B71A; Sun, 18 Mar 2001 15:39:32 -0800 (PST) (envelope-from tlambert@usr05.primenet.com)
Received: (from daemon@localhost) by smtp10.phx.gblx.net (8.9.3/8.9.3) id QAA15412; Sun, 18 Mar 2001 16:39:14 -0700
Received: from usr05.primenet.com(206.165.6.205) via SMTP by smtp10.phx.gblx.net, id smtpd7updia; Sun Mar 18 16:39:12 2001
Received: (from tlambert@localhost) by usr05.primenet.com (8.8.5/8.8.5) id QAA18696; Sun, 18 Mar 2001 16:39:21 -0700 (MST)
From: Terry Lambert
Message-Id: <200103182339.QAA18696@usr05.primenet.com>
Subject: Re: about common group & user ID space (PR kern/14584)
To: brett@lariat.org (Brett Glass)
Date: Sun, 18 Mar 2001 23:39:21 +0000 (GMT)
Cc: tlambert@primenet.com (Terry Lambert), babkin@bellatlantic.net (Sergey Babkin), security@FreeBSD.ORG, wes@softweyr.com (Wes Peters), rwatson@FreeBSD.ORG (Robert Watson), fs@FreeBSD.ORG
In-Reply-To: <4.3.2.7.2.20010318123759.00d9dd10@localhost> from "Brett Glass" at Mar 18, 2001 12:42:17 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain;
charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > At the same time, it'd be nice to eliminate the arbitrary limitations > on (a) the number of groups of which a user can be a member and (b) the > number of members in a group. Both of these limitations often bite > administrators who, for example, want most users of a system to be > members of a particular group or want to implement group-based access > control schemes with a moderate degree of granularity. Classes won't > cut it for this purpose, alas, because they're not built into file > system security. I think that you will run into the limitations inherent in the quota record storage format and NFSv2 UID/GID, well before you face that limit. I think that trying to make a user a member of 50,000 groups is probably a mistake, and it's not "arbitrary" to prevent this. There is really no limit on the number of members permitted in a group, I believe. If you are talking about line length, I'd say you should consider getting rid of "pico" and using a real editor. I think there are patches floating around to allow repeats of group lines in order to set up larger lists of members, in any case (they may already be integrated into FreeBSD; they aren't in BSDI, from looking at the BSDI system I have access to). I think the workaround for the "I want groups to be more than groups and act more like classes, but I'm too lazy to implement classes properly" problem is pretty simple: write an SGID program that gets you a shell. Alternately, write a program that lets you add a group (and spawn a subshell) that's SUID root, and does a check against the group password field. Give the password to the users you want to have access to the group. Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sun Mar 18 22:58:50 2001 Delivered-To: freebsd-fs@freebsd.org Received: from lariat.org (lariat.org [12.23.109.2]) by hub.freebsd.org (Postfix) with ESMTP id 09B2537B718; Sun, 18 Mar 2001 22:58:47 -0800 (PST) (envelope-from brett@lariat.org) Received: from mustang.lariat.org (IDENT:ppp0.lariat.org@lariat.org [12.23.109.2]) by lariat.org (8.9.3/8.9.3) with ESMTP id XAA06450; Sun, 18 Mar 2001 23:54:49 -0700 (MST) Message-Id: <4.3.2.7.2.20010318234944.00e3a620@localhost> X-Sender: brett@localhost X-Mailer: QUALCOMM Windows Eudora Version 4.3.2 Date: Sun, 18 Mar 2001 23:54:30 -0700 To: Terry Lambert From: Brett Glass Subject: Re: about common group & user ID space (PR kern/14584) Cc: tlambert@primenet.com (Terry Lambert), babkin@bellatlantic.net (Sergey Babkin), security@FreeBSD.ORG, wes@softweyr.com (Wes Peters), rwatson@FreeBSD.ORG (Robert Watson), fs@FreeBSD.ORG In-Reply-To: <200103182339.QAA18696@usr05.primenet.com> References: <4.3.2.7.2.20010318123759.00d9dd10@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org At 04:39 PM 3/18/2001, Terry Lambert wrote: >I think that trying to make a user a member of 50,000 groups is >probably a mistake, and it's not "arbitrary" to prevent this. On the other hand, the current limit is quite low. >There is really no limit on the number of members permitted in a >group, I believe. I recently had to help out a client who hit that limit. He ran a graphic arts house and wanted his customers to be able to FTP jobs in. So, he added them. One day, after about two years, the system croaked because the group was too large. 
--Brett

From owner-freebsd-fs Mon Mar 19 10:51:20 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from peter3.wemm.org (c1315225-a.plstn1.sfba.home.com [65.0.135.147]) by hub.freebsd.org (Postfix) with ESMTP id 938DD37B71E for ; Mon, 19 Mar 2001 10:51:12 -0800 (PST) (envelope-from peter@netplex.com.au)
Received: from mobile.wemm.org (mobile.wemm.org [10.0.0.5]) by peter3.wemm.org (8.11.0/8.11.0) with ESMTP id f2JIpCp78454 for ; Mon, 19 Mar 2001 10:51:12 -0800 (PST) (envelope-from peter@netplex.com.au)
Received: from netplex.com.au (localhost [127.0.0.1]) by mobile.wemm.org (8.11.1/8.11.1) with ESMTP id f2JIp8h36375; Mon, 19 Mar 2001 10:51:08 -0800 (PST) (envelope-from peter@netplex.com.au)
Message-Id: <200103191851.f2JIp8h36375@mobile.wemm.org>
X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4
To: Alfred Perlstein
Cc: André Luiz dos Santos, fs@FreeBSD.ORG
Subject: Re: Truncating a file.
In-Reply-To: <20010315181601.O29888@fw.wintelcom.net>
Content-Transfer-Encoding: 8bit
Date: Mon, 19 Mar 2001 10:51:08 -0800
From: Peter Wemm
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Alfred Perlstein wrote:
> * André Luiz dos Santos [010315 18:11] wrote:
> >
> > With ftruncate, you can remove part of the end of a file. Is there a way to
> > remove part of the beginning of a file?
> > I'm developing a SOCKS5 server that stores the data received from the first
> > connection to a local file, and when the second connection is writable, read
> > that file and write the data to this second connection. As data is read from
> > the local file, its beginning becomes useless, so I'd like to truncate it
> > out. Is it possible?
>
> No.

I tried to tempt the VIVAFS author into porting his BSDI 4.3-net2 based filesystem to 4.4Lite and FreeBSD.
It had many unusual features, including the ability to reverse truncate from the beginning of the file. It also had rotating files, etc. When we last spoke in June 1996, he was intending to take a shot at it in between working on QDDB. The truncate-from-the-beginning stuff is nice because it didn't leave holes like the F_FREESP fcntl method. It actually shortened the file. F_FREESP does not do that. The rotating file thing is also nice. Suppose you set your file limit to 10MB, you keep writing data into it, and it never grows beyond 10MB. As you append data onto the end of it, a corresponding amount falls off the beginning. This is nearly perfect for self maintaining "recent activity" log files. I'm tempted to mention the name of the author and his email address, but I'm sure that the determined among us can find him. :-) Cheers, -Peter -- Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 10:54:35 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peter3.wemm.org (c1315225-a.plstn1.sfba.home.com [65.0.135.147]) by hub.freebsd.org (Postfix) with ESMTP id EFD3437B718 for ; Mon, 19 Mar 2001 10:54:31 -0800 (PST) (envelope-from peter@netplex.com.au) Received: from mobile.wemm.org (mobile.wemm.org [10.0.0.5]) by peter3.wemm.org (8.11.0/8.11.0) with ESMTP id f2JIsVp78477 for ; Mon, 19 Mar 2001 10:54:31 -0800 (PST) (envelope-from peter@netplex.com.au) Received: from netplex.com.au (localhost [127.0.0.1]) by mobile.wemm.org (8.11.1/8.11.1) with ESMTP id f2JIsUh36403; Mon, 19 Mar 2001 10:54:30 -0800 (PST) (envelope-from peter@netplex.com.au) Message-Id: <200103191854.f2JIsUh36403@mobile.wemm.org> X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4 To: Terry Lambert Cc: andre@netvision.com.br, fs@FreeBSD.ORG Subject: Re: Truncating a file. 
In-Reply-To: <200103160600.XAA04319@usr05.primenet.com>
Date: Mon, 19 Mar 2001 10:54:30 -0800
From: Peter Wemm
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Terry Lambert wrote:
> > With ftruncate, you can remove part of the end of a file. Is there a way to
> > remove part of the beginning of a file?
> > I'm developing a SOCKS5 server that stores the data received from the first
> > connection to a local file, and when the second connection is writable, read
> > that file and write the data to this second connection. As data is read from
> > the local file, its beginning becomes useless, so I'd like to truncate it
> > out. Is it possible?
>
> FreeBSD doesn't support the de facto industry standard F_FREESP
> fcntl(2) command argument (which would free the area referred
> to by the contents of a flock structure).

F_FREESP does not truncate; it just makes a hole. Truncating from the beginning means taking a 10MB file, cutting the first 1MB off the beginning, and being left with a 9MB file.

VIVAFS implements this, but it was based on 4.3-net2 and BSD/OS 2.x and was considered "tainted" pending a reimplementation.
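Without filesystem support, the closest portable approximation is to copy the tail of the file down to offset 0 and then ftruncate(2) the result. The sketch below shows that emulation in Python for brevity. The names `drop_head` and `demo` and the chunk size are invented for illustration. Unlike the VIVAFS feature described above, this costs time proportional to the file size and is not atomic, so any reader holding its own offset has to subtract the dropped amount afterwards.

```python
import os
import tempfile


def drop_head(fd, nbytes, chunk=64 * 1024):
    """Drop the first nbytes of the file open on fd; return the new size."""
    size = os.fstat(fd).st_size
    if nbytes <= 0:
        return size
    if nbytes >= size:
        os.ftruncate(fd, 0)
        return 0
    src, dst = nbytes, 0
    while src < size:
        # Reads always stay ahead of writes, so the forward copy never
        # overwrites data that has not been read yet.
        data = os.pread(fd, min(chunk, size - src), src)
        if not data:
            break  # file shrank underneath us
        done = 0
        while done < len(data):  # pwrite may write less than asked
            done += os.pwrite(fd, data[done:], dst + done)
        src += len(data)
        dst += len(data)
    os.ftruncate(fd, dst)  # the ordinary tail truncation finishes the job
    return dst


def demo():
    """Write ten bytes to a scratch file, drop the first four, return the rest."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        drop_head(fd, 4)
        return os.pread(fd, 100, 0)
    finally:
        os.close(fd)
        os.remove(path)
```

For the SOCKS5 spool in the original question, it is probably cheaper to remember a read offset and truncate the file to zero whenever the reader has caught up with the writer, which avoids copying altogether.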
Cheers, -Peter -- Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 16:57: 0 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp02.teb1.iconnet.net (smtp02.teb1.iconnet.net [209.3.218.43]) by hub.freebsd.org (Postfix) with ESMTP id 8642637B71E; Mon, 19 Mar 2001 16:56:34 -0800 (PST) (envelope-from babkin@bellatlantic.net) Received: from bellatlantic.net (client-151-198-135-36.nnj.dialup.bellatlantic.net [151.198.135.36]) by smtp02.teb1.iconnet.net (8.9.1/8.9.1) with ESMTP id TAA23200; Mon, 19 Mar 2001 19:55:05 -0500 (EST) Message-ID: <3AB6AA65.1B6ED19E@bellatlantic.net> Date: Mon, 19 Mar 2001 19:55:01 -0500 From: Sergey Babkin X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 4.0-19990626-CURRENT i386) X-Accept-Language: en, ru MIME-Version: 1.0 To: Terry Lambert Cc: security@FreeBSD.ORG, Wes Peters , Robert Watson , fs@FreeBSD.ORG, arch@FreeBSD.ORG Subject: Re: about common group & user ID space (PR kern/14584) References: <200103180738.AAA03250@usr05.primenet.com> Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Terry Lambert wrote: > > > I want to commit PR kern/14584. I've been told that it's good > > to discuss it in -arch, -security and -fs. (It has been sort of > > discussed on -hackers already, there were not much replies). > > So I've posted a message on -arch, and now on -security and -fs. > > I've also discussed this idea shortly with Kirk McKusick at > > Usenix-2000 at the BSD BOF and he generally liked it and suggested > > to review further. > > You could do this a bit more cleanly by just stealing the sign > bit, and setting if the uid field contained a group ID. 
>
> There would be no conversion problem for an existing system.

That was my original idea, but some thinking and experimentation have shown that it creates too many incompatibilities, such as:

- programs displaying the owner by name would break, and that includes both the standard programs and random applications
- when exported by NFS, the same problem would arise for the clients
- chown would have to be changed, both the program and the system call, as you mention later

and possibly other sorts of breakage.

> This changes the check to a one line change, conditional on
> the high bit being set.

No, the change would be the same, just wrapped in a condition check for this bit.

> Note that this change is really necessary in the user space code
> anyway: even if you make the UID and GID numeric values not
> intersect, there is still the possibility of a group and user
> having the same name, so a set-by-name needs a separate flag
> (think "chown bin.bin foo", for example).

In the way I propose it, the sysadmins are supposed to create a pseudo-user with the same name and ID as each group. That automagically makes all commands, such as chown and ls, work properly. Of course, that means that no real user and group may share a name, but a common namespace looks natural alongside a common ID space.

Because the traditional users and groups with low IDs do have overlapping names and IDs, the sysctl sets the lowest ID from which the common ID space starts. If the sysctl sets this value to below 100 (the traditional range for the system IDs), then the common ID code is disabled altogether. The value of 100 is set by a kernel config option and may be changed.

> The benefits in not having to grovel through the FS contents, or
> do a more complex ID space transformations, and the moving of the
> majority of changes to user space, combined with the fact that if
> you turn it off, the ownership doesn't need to be reverted, are
> all plusses.

This is not quite so.
My patch requires only one little change in the kernel and no user-space changes at all. It has some expectations for the assignment of user and group IDs and names, but these expectations are justified to make the common ID space look reasonable. The downside is that it's slightly slower (each file owner ID in the common space has to be checked against all of the process's groups). I'm not sure yet whether it allows more complex transformations, or whether it does so comparably to your proposal.

-SB

From owner-freebsd-fs Mon Mar 19 16:58:38 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from smtp02.teb1.iconnet.net (smtp02.teb1.iconnet.net [209.3.218.43]) by hub.freebsd.org (Postfix) with ESMTP id DFF4E37B735; Mon, 19 Mar 2001 16:58:32 -0800 (PST) (envelope-from babkin@bellatlantic.net)
Received: from bellatlantic.net (client-151-198-135-36.nnj.dialup.bellatlantic.net [151.198.135.36]) by smtp02.teb1.iconnet.net (8.9.1/8.9.1) with ESMTP id TAA23220; Mon, 19 Mar 2001 19:57:46 -0500 (EST)
Message-ID: <3AB6AB09.1D43B872@bellatlantic.net>
Date: Mon, 19 Mar 2001 19:57:45 -0500
From: Sergey Babkin
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 4.0-19990626-CURRENT i386)
X-Accept-Language: en, ru
MIME-Version: 1.0
To: Boris Popov
Cc: security@freebsd.org, Wes Peters, Robert Watson, fs@freebsd.org, arch@bellatlantic.net
Subject: Re: about common group & user ID space (PR kern/14584)
References:
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Boris Popov wrote:
>
> On Sat, 17 Mar 2001, Sergey Babkin wrote:
>
> > I want to commit PR kern/14584. I've been told that it's good
> > to discuss it in -arch, -security and -fs. (It has been sort of
> > discussed on -hackers already, there were not much replies).
>
> However, I would like it more if this behavior could be enabled on a
> per-mount basis (but I guess we're out of spare mount
> options).

Eh, I should have cc-ed it to all the lists at once. I've already answered this in -arch:

I think that this should be a system-wide option: the /etc/passwd and /etc/group files are common to the whole OS, and this option describes their contents. So setting this value per filesystem makes no sense and may cause non-obvious errors when different filesystems get mounted by mistake with different values of common ID.

-SB

From owner-freebsd-fs Mon Mar 19 17:01:44 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from smtp02.teb1.iconnet.net (smtp02.teb1.iconnet.net [209.3.218.43]) by hub.freebsd.org (Postfix) with ESMTP id 0AF7537B737; Mon, 19 Mar 2001 17:01:39 -0800 (PST) (envelope-from babkin@bellatlantic.net)
Received: from bellatlantic.net (client-151-198-135-36.nnj.dialup.bellatlantic.net [151.198.135.36]) by smtp02.teb1.iconnet.net (8.9.1/8.9.1) with ESMTP id UAA23273; Mon, 19 Mar 2001 20:00:40 -0500 (EST)
Message-ID: <3AB6ABB7.A208EECE@bellatlantic.net>
Date: Mon, 19 Mar 2001 20:00:39 -0500
From: Sergey Babkin
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 4.0-19990626-CURRENT i386)
X-Accept-Language: en, ru
MIME-Version: 1.0
To: Cy Schubert - ITSD Open Systems Group
Cc: security@FreeBSD.ORG, Wes Peters, Robert Watson, fs@FreeBSD.ORG, arch@FreeBSD.ORG
Subject: Re: about common group & user ID space (PR kern/14584)
References: <200103181447.f2IElef41927@cwsys.cwsent.com>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Cy Schubert - ITSD Open Systems Group wrote:
>
> In message <3AB3FC38.94711FFF@bellatlantic.net>, Sergey Babkin writes:
> > All,
> >
> > I want to commit PR kern/14584.
I've been told that it's good > > >From an operational standpoint I see one problem. Some sites use UID > 0-999 and 65000-65535 for use by special accounts, such as www, ftp, > oracle, etc. In some cases this policy is dictated by a desire to have > some kind of commonality across various vendor platforms, some of which > reserve some odd UID's and GID's for vendor supplied software or > purposes. The only suggestion I would make is that a range could be > specified. For example instead of vfs.commonid, vfs.commonid.low and > vfs.commonid.high, allowing a site to, for example, reserve UID/GID's > 10000-19999 or any other range as common ID's. I'm not sure if it's so important: probably, normally the IDs around 65535 are used for things like nobody/nogroup. But since it's easy to implement, I guess it would not hurt. So I agree with this proposal. -SB To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 17:16:49 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp02.teb1.iconnet.net (smtp02.teb1.iconnet.net [209.3.218.43]) by hub.freebsd.org (Postfix) with ESMTP id 2458537B740; Mon, 19 Mar 2001 17:16:31 -0800 (PST) (envelope-from babkin@bellatlantic.net) Received: from bellatlantic.net (client-151-198-135-36.nnj.dialup.bellatlantic.net [151.198.135.36]) by smtp02.teb1.iconnet.net (8.9.1/8.9.1) with ESMTP id UAA23373; Mon, 19 Mar 2001 20:15:12 -0500 (EST) Message-ID: <3AB6AF1F.9452E231@bellatlantic.net> Date: Mon, 19 Mar 2001 20:15:11 -0500 From: Sergey Babkin X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 4.0-19990626-CURRENT i386) X-Accept-Language: en, ru MIME-Version: 1.0 To: Terry Lambert Cc: Brett Glass , security@FreeBSD.ORG, Wes Peters , Robert Watson , fs@FreeBSD.ORG, arch@FreeBSD.ORG Subject: Re: about common group & user ID space (PR kern/14584) References: <200103182339.QAA18696@usr05.primenet.com> Content-Type: text/plain; charset=koi8-r 
Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

Terry Lambert wrote:
>
> > At the same time, it'd be nice to eliminate the arbitrary limitations

These things are not really related to the common ID space. I definitely would not like to do them in the same patch, just to keep things separate.

> > on (a) the number of groups of which a user can be a member and (b) the

For this there is some macro (can't remember the name) which can be defined in the kernel config file as an option with a higher value. Setting it higher means higher system overhead, but since memory sizes have increased significantly over the last few years, I think that a higher default value makes sense.

> > number of members in a group. Both of these limitations often bite
> > administrators who, for example, want most users of a system to be
> > members of a particular group or want to implement group-based access
> > control schemes with a moderate degree of granularity. Classes won't
> > cut it for this purpose, alas, because they're not built into file
> > system security.
>
> I think that you will run into the limitations inherent in the
> quota record storage format and NFSv2 UID/GID, well before you
> face that limit.
>
> There is really no limit on the number of members permitted in a
> group, I believe. If you are talking about line length, I'd say

I think there is such a limit. Or at least there was in the 2.0.5 days. I'm not sure about the line length limit. I remember that there was such a limit in SVR4.2, so if a group line grew past some size, getgrent() and friends went crazy.

> you should consider getting rid of "pico" and using a real editor.

The common workaround is to split a group record into multiple lines in /etc/group, like:

staff:*:20:root
staff:*:20:babkin

Keep no more than about 50 users per line. This may break things like adduser but it's not a big loss.
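The splitting described above can be sketched in a few lines of Python. This is only an illustration: the helper name split_group_line and its per_line parameter are assumptions made for the example, not an existing tool, and the default of 50 simply follows the rule of thumb given above.

```python
def split_group_line(name, passwd, gid, members, per_line=50):
    """Return /etc/group-style lines with at most per_line members each.

    Hypothetical helper for illustration; it only formats text and does
    not touch the real /etc/group file.
    """
    prefix = f"{name}:{passwd}:{gid}:"
    if not members:
        # A group with no members still needs one (empty) record.
        return [prefix]
    return [
        prefix + ",".join(members[i:i + per_line])
        for i in range(0, len(members), per_line)
    ]


# Reproduce the two-line "staff" example by allowing one member per line.
for line in split_group_line("staff", "*", 20, ["root", "babkin"], per_line=1):
    print(line)
```

Every emitted line repeats the same name and GID, which is what lets the repeated records be read as one group.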
The important things, such as setting process permissions on login, work fine. > I think there are patches floating around to allow repeats of > group lines in order to set up larger lists of members, in any > case (they may already be integrated into FreeBSD; they aren't in > BSDI, from looking at the BSDI system I have access to). No patches are really required. If you discount the secondary stuff like useradd/adduser, repeated lines just work out of the box on all the Unix systems where I tried: FreeBSD, Linux, HP-UX, UnixWare, SCO OpenServer, ICL DRS/NX (old SVR4.2). -SB To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 20: 1:33 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id D745937B73C; Mon, 19 Mar 2001 20:01:23 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.1/8.11.1) with SMTP id f2K40Ih69662; Mon, 19 Mar 2001 23:00:19 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Mon, 19 Mar 2001 23:00:18 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Sergey Babkin Cc: security@freebsd.org, Wes Peters , fs@freebsd.org Subject: Re: about common group & user ID space (PR kern/14584) In-Reply-To: <3AB3FC38.94711FFF@bellatlantic.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Sergey, Sorry for my long delay in getting back to you with regards to your proposed changes. Let me start by saying I had a number of reactions at various levels (gut, technical, ...) 
and that one of the interesting aspects about the suggested changes is that they're remarkably self-consistent: most security "extensions" I've seen contain relatively easy-to-find inconsistencies that render them useless against a qualified attacker. My gut reaction to the changes is one of concern: it strikes me that while the changes have a number of nice properties (not least of which is the consistency of them, and that they don't require underlying file system changes), fundamentally there are a few objections that can be made. First, it's a hack, in that it will not be consistently applied across file systems, or even across boots depending on the kernel used. Second, it changes the semantics of well-defined interfaces and primitives such that they are more open than they used to be (a certain class of subject credentials will have a strict superset of the rights they previously had) without providing the application any way to determine if the feature is in effect (no pathconf(), so that leaves direct experimentation, etc). Third, your patches include no attempt to uniformly update documentation referring to users/groups to bring them in sync with the new implementation. Fourth, many applications exist that make strong assumptions about the UNIX protection mechanisms that will no longer hold true (this is related to (3), although not quite identical). Fifth, the resulting system is highly non-portable, in that neither users nor administrators with expectations from other systems (or even from FreeBSD) will be able to apply their knowledge and experience with the mechanism in place, and safety assumptions may no longer hold. Sixth, applications that assume that preserving permissions across certain types of file system operations will no longer behave correctly (for example, when you tar on UFS and untar on NFS). Seventh, it (as you point out explicitly, and by design) intersects two namespaces that have traditionally not been combined in the kernel. 
Userland code has often made assumptions about mapping uid and gid values, but that has never been a property of the kernel policy. Eighth, it introduces additional hard-coded uid/gid values into the kernel, something we've been trying to move away from (in theory, only two constant values should be relevant, leaving aside default device permissions: uid 0 and the uid used to represent NOVAL in vop_setattr() (which is evil also :-)).

Now, none of these is a reason to completely reject the idea. In fact, there's precedent for conditionally compiled divergent security hacks in UFS, in the form of SUIDDIR, which adopts a modified file ownership/creation/inheritance model making for easier use of Samba on closed file servers (it represents a substantial security risk if not on a closed system).

Ok, so that was the "gut reaction" and the "why the gut reaction doesn't rule out adding this feature". Let me go on to various other relevant responses.

My first response was an initial concern that this policy would introduce an "inconsistency". That is to say, based on this modified kernel policy and common uses of it in the userland policy environment, easily exploitable inconsistencies could be found and used to gain privilege. In my initial glance, I was unable to identify such an inconsistency -- that isn't to say it doesn't exist, just that on a quick initial analysis I didn't find one. Hence my comment above on this being relatively unusual :-).

On determining that I didn't find any vulnerabilities off-hand, I was interested, as this is both unusual, and the changes bring some nice new system properties. As I said above, there is some precedent for this type of conditionally compiled feature (read: "hack"), and as long as it is clearly documented as such, my reaction is again not a rejection :-).

I should take a moment also to respond to your comments on ACLs. In my view, they all apply.
ACLs are a pain to deal with, because they increase the already high administrative overhead of managing per-file permissions. Personally, I'm a fan of the AFS ACL model, where protections are present only on directories, hard links are prohibited, and sub-directories inherit protections on creation. I even had an implementation of this on FreeBSD at one point, although it's quite dated now. However, ACLs have a number of things going for them: 1) They are portable. POSIX.1e pretty much defines everything you need to know (not quite) to implement a portable DAC mechanism. Many operating systems implement POSIX.1e with a high degree of compliance. Many applications know about, or are learning about, POSIX.1e. For example, Samba's new ACL support will speak POSIX.1e. 2) They provide compatibility with file modes: if you don't know about ACLs, all the mode commands "just work". This goes for users and for applications. You might not end up with the permissions you expect, but you'll end up with conservative and safe permissions according to the permission model. Applications and users won't make assumptions about UNIX mode compatibility and be wrong, failing open. The result was an even uglier ACL model, but the argument that this was desirable was a strong one. 3) They're widely used and fairly well inspected by a fair number of security types. So while I don't like POSIX.1e ACLs, I decided to implement them because these all seemed to be strong properties that were hard to ignore. Cutting "yet another discretionary access control mechanism" was really out of the question from these perspectives. In a few days, I'll be committing options UFS_ACL to the -CURRENT tree, and the result will be a fairly complete POSIX.1e/POSIX.2c implementation. Some userland tools, such as mv, cp, backup stuff, mtree, will need to be updated, and we have a few more bits of the ACL editing library to finish so as to support applications such as Samba. 
Other than to strongly caution against using your feature in most situations (especially where portability and safety involving multiple file systems, machines, operating systems, etc), I won't stop you from committing it (especially if you use it locally and with success). I will say that I think divergent and non-portable security models are likely to be more trouble than they are worth, and make my job substantially more complicated (we'll be starting work on a FreeBSD Security Architecture document at some point, and each time a random hack is added, we have to deal with the consequences :-). Robert N M Watson FreeBSD Core Team, TrustedBSD Project robert@fledge.watson.org NAI Labs, Safeport Network Services To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 20:46:21 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id B62C937B71B; Mon, 19 Mar 2001 20:46:17 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.1/8.11.1) with SMTP id f2K4ehh69973; Mon, 19 Mar 2001 23:40:43 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Mon, 19 Mar 2001 23:40:43 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Sergey Babkin Cc: security@freebsd.org, Wes Peters , fs@freebsd.org Subject: Re: about common group & user ID space (PR kern/14584) In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Mon, 19 Mar 2001, Robert Watson wrote: > Personally, I'm a fan of the AFS ACL model, where protections are present > only on directories, hard links are prohibited, and sub-directories > inherit protections on creation. 
I even had an implementation of this on > FreeBSD at one point, although it's quite dated now. However, ACLs have > a number of things going for them: Just as an aside, btw, AFS uses a common numeric namespace for both users and groups, as well as for remote users from other cells. Users can also allocate and manage groups on demand. The single numeric namespace makes things a lot more consistent :-). (although I think it allocates negative values to groups, and positive ones to users..) Robert N M Watson FreeBSD Core Team, TrustedBSD Project robert@fledge.watson.org NAI Labs, Safeport Network Services To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 21:11:11 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id 9D99437B72E; Mon, 19 Mar 2001 21:11:04 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.1/8.11.1) with SMTP id f2K5Avh70371; Tue, 20 Mar 2001 00:10:57 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Tue, 20 Mar 2001 00:10:57 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: freebsd-fs@FreeBSD.org, mckusick@FreeBSD.org Cc: jedgar@FreeBSD.org Subject: First round review request, ACLs for UFS commit Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org For the past few months, we (Chris Faulhaber and myself) have been developing and testing POSIX.1e ACL support for FreeBSD. Most of the code to support ACLs is now committed to the base source tree as of 5.0-CURRENT. 
This includes support libraries, generic ACL support code in the kernel, relevant vnode operations, introduction of extended attributes, and userland ACL tools talked about in POSIX.2c. It has also included work to maintain API and tool portability with SGI and Linux developers. I'm now about ready to do the last set of commits, which introduce the ACL semantics into the UFS code itself (mapping ACLs onto UFS EAs). I'd like to pause at this point and ask for some additional review of the code before committing it. A number of sites have been working with ACLs for a few months now and found them to work fairly well. The most recent revision of the ACL code is 0.6.1, available for download from: http://www.TrustedBSD.org/downloads/ It relies on the most recent round of EA commits to the tree, and the userland getfacl and setfacl tools committed this afternoon. It is my intent to request review on freebsd-fs, wait a few days, then bump over to -current and -arch, wait a few days, and then commit assuming no show-stoppers turn up. There are several things left on the TODO list, including fix a couple of issues with ACL validation, implement a few more supporting library calls, etc. The goal now, however, is to get increased testing and experimentation by making the code available to a wider audience. 
Thanks, Robert N M Watson FreeBSD Core Team, TrustedBSD Project robert@fledge.watson.org NAI Labs, Safeport Network Services To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 21:21:42 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132]) by hub.freebsd.org (Postfix) with ESMTP id 9A72737B731; Mon, 19 Mar 2001 21:21:33 -0800 (PST) (envelope-from tlambert@usr05.primenet.com) Received: (from daemon@localhost) by smtp02.primenet.com (8.9.3/8.9.3) id WAA14447; Mon, 19 Mar 2001 22:14:43 -0700 (MST) Received: from usr05.primenet.com(206.165.6.205) via SMTP by smtp02.primenet.com, id smtpdAAAgvai7B; Mon Mar 19 22:14:28 2001 Received: (from tlambert@localhost) by usr05.primenet.com (8.8.5/8.8.5) id WAA23451; Mon, 19 Mar 2001 22:21:04 -0700 (MST) From: Terry Lambert Message-Id: <200103200521.WAA23451@usr05.primenet.com> Subject: Re: about common group & user ID space (PR kern/14584) To: babkin@bellatlantic.net (Sergey Babkin) Date: Tue, 20 Mar 2001 05:20:59 +0000 (GMT) Cc: fs@FreeBSD.ORG, arch@FreeBSD.ORG In-Reply-To: <3AB6AA65.1B6ED19E@bellatlantic.net> from "Sergey Babkin" at Mar 19, 2001 07:55:01 PM X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > > You could do this a bit more cleanly by just stealing the sign > > bit, and setting if the uid field contained a group ID. > > > > There would be no conversion problem for an existing system. 
> > That was my original idea but some thinking and experimentation > has shown that it creates too many incompatibilities, such as: > > - programs displaying the owner by name would break, and > that includes both the standard programs and random applications > - when exported by nfs, the same problem would stand for the > clients > - chown will have to be changed - both the program and system call, > as you mention later > > and possibly other sorts of breakages. The NFS breakage is going to be there in any case; the semantics will be different on the remote machine, giving ownership to a particular user (who doesn't exist). This will turn "owner by name" numeric at best, and give ownership to a particular person, not group, at worst. You also have the problem that the FreeBSD machine has to be your NIS master; in a heterogeneous environment, Sun boxes are still better NIS servers, since they understand the full complement of NIS maps, which FreeBSD doesn't, and they support automount (as opposed to amd, which happily requires a reboot to unwedge in many situations). Any time you internalize or externalize a uid/gid space, you will have that problem. Plus, with your approach, you are either going to have to make an exception for certain ID ranges, permitting overlap, or you are going to be stuck renumbering things like "bin" and "kmem". Further, even if the FreeBSD was the NIS master for NFS name interpretation, the only safe way to make the maps transportable would be to have identical group and password name/ID pairs. This breaks for normal duplication, which exists now: you can't have two entries in either file for the same key field, since a getpwuid or getgrgid will only return the first matching value in all cases. > > This changes the check to a one line change, conditional on > > the high bit being set. > > No, the change would be the same, just wrapped into a condition > check for this bit. 
I think you could "fudge" the in-core copy of one id to be the other, with the bit OR'ed in or AND'ed off, as appropriate...

> In the way I propose it, the sysadmins are supposed to create
> a pseudo-user with the same name and ID as each group. That
> automagically makes all commands, such as chown and ls, work
> properly. Of course, that means that no real users and groups
> must have the same name, but the common namespace looks natural
> with the common ID space. Because the traditional users and groups
> with low IDs do have overlapping names and IDs, the sysctl sets
> the lowest ID from which the common ID space starts. If the sysctl
> sets this value to below 100 (traditional range for the
> system IDs), then the common ID code is disabled altogether.
> The value of 100 is set by a kernel config option and may be changed.

This explodes when your remote NIS server doesn't enforce the new semantics; this is sort of the opposite of the problem I cite above with not being able to maintain a single namespace.

Really, the namespace and the ID space are paired, so the only practical thing to do is to separate the ID space and the namespace at the same time.

I really think it's a lot easier to do this by stealing a bit somewhere (second one down from the sign, if the sign is to be held sacrosanct) than it is to rely on semantic enforcement by your tools. As soon as you do that, it becomes significantly less useful. At least with a stolen bit, the ownership on the remote machine works, even if it doesn't precisely "make sense" the same way it does on the hacked FreeBSD box.

> > The benefits in not having to grovel through the FS contents, or
> > do more complex ID space transformations, and the moving of the
> > majority of changes to user space, combined with the fact that if
> > you turn it off, the ownership doesn't need to be reverted, are
> > all plusses.
>
> This is not quite so.
My patch requires only one little change in
> the kernel and no user-level changes at all. It has some
> expectations for the assignment of user and group IDs and names,
> but these expectations are justified to make the common ID space
> look reasonable. The downside is that it's slightly slower
> (each file owner ID in the common space has to be checked
> against all of the process's groups). I'm not sure yet if it allows
> more complex transformations and whether it does it comparable
> to your proposal.

You could do the transformations with a mapping layer (left as an exercise for the student), but it would not really be worth it, I think.

The biggest problem is that the tools have to have a gentleman's agreement between themselves across systems that everyone will sign up to honor. That's really too kludgy to trust, unless you are in a homogeneous environment (if then). Placing this as a restriction makes the idea much, much less generally useful than it would otherwise be.

Terry Lambert terry@lambert.org
---
Any opinions in this posting are my own and not those of my present or previous employers.

From owner-freebsd-fs Mon Mar 19 22: 1:20 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 4A48537B718; Mon, 19 Mar 2001 22:01:12 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id CD97259283; Tue, 20 Mar 2001 00:00:06 -0600 (CST) Date: Tue, 20 Mar 2001 00:00:06 -0600 From: "Michael C . Wu" To: Robert Watson Cc: freebsd-fs@FreeBSD.org, mckusick@FreeBSD.org, jedgar@FreeBSD.org Subject: Re: First round review request, ACLs for UFS commit Message-ID: <20010320000006.C43637@peorth.iteration.net> Reply-To: "Michael C . Wu" Mail-Followup-To: "Michael C .
Wu" , Robert Watson , freebsd-fs@FreeBSD.org, mckusick@FreeBSD.org, jedgar@FreeBSD.org References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from rwatson@FreeBSD.org on Tue, Mar 20, 2001 at 12:10:57AM -0500 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Tue, Mar 20, 2001 at 12:10:57AM -0500, Robert Watson scribbled: | For the past few months, we (Chris Faulhaber and myself) have been Great work guys! I think the code that you guys produces is very well written and commented. :) Now I know why there are so few responses to your request for reviews. Your (pl.) code is impeccable. | The most recent revision of the ACL code is 0.6.1, available for download | http://www.TrustedBSD.org/downloads/ Just a small question after reading the latest patch. You don't seem to handle the case where the user forgets that he is not mounting a ACL'ed filesystem and expecting ACL's to work. There seems to be a default fallback to old behavior. Is this necessarily good? i.e. Should we have a default set of ACL's instead? Secondly, how does this affect the performance of the filesystem? Also, there are no man pages that I see. -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. 
| +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Mar 19 23:29:26 2001 Delivered-To: freebsd-fs@freebsd.org Received: from lariat.org (lariat.org [12.23.109.2]) by hub.freebsd.org (Postfix) with ESMTP id A6DDF37B718; Mon, 19 Mar 2001 23:29:18 -0800 (PST) (envelope-from brett@lariat.org) Received: from mustang.lariat.org (IDENT:ppp0.lariat.org@lariat.org [12.23.109.2]) by lariat.org (8.9.3/8.9.3) with ESMTP id AAA20088; Tue, 20 Mar 2001 00:25:56 -0700 (MST) Message-Id: <4.3.2.7.2.20010320002008.00d12b50@localhost> X-Sender: brett@localhost X-Mailer: QUALCOMM Windows Eudora Version 4.3.2 Date: Tue, 20 Mar 2001 00:25:37 -0700 To: Sergey Babkin , Terry Lambert From: Brett Glass Subject: Re: about common group & user ID space (PR kern/14584) Cc: security@FreeBSD.ORG, Wes Peters , Robert Watson , fs@FreeBSD.ORG, arch@FreeBSD.ORG In-Reply-To: <3AB6AF1F.9452E231@bellatlantic.net> References: <200103182339.QAA18696@usr05.primenet.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org At 06:15 PM 3/19/2001, Sergey Babkin wrote: >> > on (a) the number of groups of which a user can be a member and (b) the > >For this there is some macro (can't remember the name) which >can be defined in the kernel config file as an option with >a higher value. Setting it higher means higher system overhead >but since the memory size has increased significantly over >the last few years, I think that a higher default value makes >sense. I do too. Could you submit this as a patch? >I think there is such a limit. Or at least it was in the 2.0.5 days. >I'm not sure about the line length limit. I remember that there >was such a limit in SVR4.2, so if a group line grew past some size, >getgrent() and friends went crazy. 
I believe that it was between 100 and 130 when it lost it. Don't know if it was the number of characters or the number of users.

>The common workaround is to split a group record into multiple
>lines in /etc/group, like:
>
>staff:*:20:root
>staff:*:20:babkin
>
>Keep no more than about 50 users per line.
>This may break things like adduser but it's not a big loss.

Breaking adduser WOULD be a loss. If one of our sysadmins-in-training were adding users to the system, he or she wouldn't know what to do next. And those of us who COULD wouldn't want to take the time. Perhaps adduser ought to be patched to deal with this... say, by understanding multiple lines and limiting the number of users on any one line.

--Brett

From owner-freebsd-fs Tue Mar 20 2:45:38 2001 Delivered-To: freebsd-fs@freebsd.org Received: from baerenklau.de.freebsd.org (baerenklau.de.freebsd.org [195.185.195.14]) by hub.freebsd.org (Postfix) with ESMTP id 956D837B718; Tue, 20 Mar 2001 02:45:23 -0800 (PST) (envelope-from w@panke.de.freebsd.org) Received: (from uucp@localhost) by baerenklau.de.freebsd.org (8.8.8/8.8.8) with UUCP id LAA15220; Tue, 20 Mar 2001 11:43:56 +0100 (CET) (envelope-from w@panke.de.freebsd.org) Received: (from w@localhost) by paula.panke.de.freebsd.org (8.9.3/8.8.8) id LAA01232; Tue, 20 Mar 2001 11:30:52 +0100 (CET) (envelope-from w) Date: Tue, 20 Mar 2001 11:30:52 +0100 From: Wolfram Schneider To: Brett Glass Cc: Terry Lambert , Sergey Babkin , security@FreeBSD.ORG, Wes Peters , Robert Watson , fs@FreeBSD.ORG Subject: Re: about common group & user ID space (PR kern/14584) Message-ID: <20010320113052.A1141@paula.panke.de.freebsd.org> References: <3AB3FC38.94711FFF@bellatlantic.net> <200103180738.AAA03250@usr05.primenet.com> <4.3.2.7.2.20010318123759.00d9dd10@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0i In-Reply-To:
<4.3.2.7.2.20010318123759.00d9dd10@localhost>; from brett@lariat.org on Sun, Mar 18, 2001 at 12:42:17PM -0700 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

On 2001-03-18 12:42:17 -0700, Brett Glass wrote:
> At the same time, it'd be nice to eliminate the arbitrary limitations
> on (a) the number of groups of which a user can be a member and (b) the
> number of members in a group. Both of these limitations often bite
> administrators who, for example, want most users of a system to be
> members of a particular group or want to implement group-based access
> control schemes with a moderate degree of granularity.

The current length limit for a line in /etc/group is 256KByte, which should be enough for 65536 users in one group ;-)

Please keep in mind that other OSes have lower limits, e.g. Solaris had a limit of 1024 characters (~200 users per group), and NIS/YP may not work with lines longer than 1024 characters.

You can increase the limit if you want and recompile your libc. See src/lib/libc/gen/getgrent.c,v for more details. The support for long lines was added in Dec 1996.
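As a rough check on these limits, the Python sketch below estimates how many member names fit on one group line. It rests on assumptions that are not taken from getgrent.c: each member costs its name length plus one byte for the separating comma, and the short "name:passwd:gid:" prefix is ignored. The function name members_per_line is purely illustrative.

```python
def members_per_line(line_limit, name_len):
    """Estimate how many member names fit on one /etc/group line.

    Assumption: every member costs name_len bytes plus one comma,
    and the short "name:passwd:gid:" prefix is ignored.
    """
    return line_limit // (name_len + 1)


LIMIT_256K = 256 * 1024  # the 256 KByte line limit mentioned above

# Compare very short, 8-character and 16-character user names.
for name_len in (3, 8, 16):
    print(name_len, members_per_line(LIMIT_256K, name_len))
```

Under these assumptions, 65536 members fit in 256 KByte only if names average three characters, and a 1024-character line holds about 200 members when names are around four characters long.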
-Wolfram -- Wolfram Schneider http://wolfram.schneider.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 6:13:20 2001 Delivered-To: freebsd-fs@freebsd.org Received: from bilver.wjv.com (dhcp-1-23.n01.orldfl01.us.ra.verio.net [157.238.210.23]) by hub.freebsd.org (Postfix) with ESMTP id 518A937B725 for ; Tue, 20 Mar 2001 06:12:55 -0800 (PST) (envelope-from bill@bilver.wjv.com) Received: (from bill@localhost) by bilver.wjv.com (8.11.1/8.11.1) id f2KECSZ04701 for freebsd-fs@freebsd.org; Tue, 20 Mar 2001 09:12:28 -0500 (EST) (envelope-from bill) Date: Tue, 20 Mar 2001 09:09:27 -0500 From: Bill Vermillion To: freebsd-fs@freebsd.org Subject: Re: about common group & user ID space (PR kern/14584) Message-ID: <20010320090926.B4220@wjv.com> Reply-To: bv@wjv.com References: <3AB3FC38.94711FFF@bellatlantic.net> <200103180738.AAA03250@usr05.primenet.com> <4.3.2.7.2.20010318123759.00d9dd10@localhost> <20010320113052.A1141@paula.panke.de.freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010320113052.A1141@paula.panke.de.freebsd.org>; from bsd@panke.de.freebsd.org on Tue, Mar 20, 2001 at 11:30:52AM +0100 Organization: W.J.Vermillion / Orlando - Winter Park Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Tue, Mar 20, 2001 at 11:30:52AM +0100, Wolfram Schneider thus spoke: > On 2001-03-18 12:42:17 -0700, Brett Glass wrote: > > > At the same time, it'd be nice to eliminate the arbitrary > > limitations on (a) the number of groups of which a user can > > be a member and (b) the number of members in a group. Both of > > these limitations often bite administrators who, for example, > > want most users of a system to be members of a particular group > > or want to implement group-based access control schemes with a > > moderate degree of granularity. 
> The current length limit for a line in /etc/group is 256KByte,
> which should be enough for 65536 users in one group ;-)

Is your copy of 'bc' broken, or are you figuring in hex :-)

If all users were at the 16 character name limit you would have about 16,000 users in a group, and about 32,000 if you limited them to 8 character names.

This is just a rough 'back of the envelope' figure, not counting commas, etc., or the actual bytes in 256K - I just used 256,000 as the number.

:-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-)

Bill
--
Bill Vermillion - bv @ wjv . com

From owner-freebsd-fs Tue Mar 20 9:11:54 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 3D78437B71C; Tue, 20 Mar 2001 09:11:45 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 6A9ED59283; Tue, 20 Mar 2001 11:11:44 -0600 (CST) Date: Tue, 20 Mar 2001 11:11:44 -0600 From: "Michael C . Wu" To: dillon@freebsd.org, grog@freebsd.org, fs@freebsd.org, hackers@freebsd.org Subject: tuning a VERY heavily (30.0) loaded server Message-ID: <20010320111144.A51924@peorth.iteration.net> Reply-To: "Michael C . Wu" Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

[Lengthy email, bear with me please, it is quite interesting. This box averages 30.0 load with no problems.]

system stats at http://zoo.ee.ntu.edu.tw/~keichii/

Hello Everyone,

I have a friend who admins a very heavily loaded BBS server.
(In Taiwan, BBS'es are still very popular, because they are the primary form of scholastic communication in colleges/universities. And FreeBSD runs on most of the university systems in Taiwan ;) )

This box is rather a FreeBSD advocate itself, as you will see. It runs a self-written Perl SMTP daemon. (Sendmail and Postfix croak) SMTPD pipes the mail to "bbsmail" that delivers the mail to BBS users. SMTPd averages about

BBSd averages about 3000 users at any given time of the day. Peak usage is about 4300 users before the box dies. Each user averages 4-5KB/sec bandwidth. BBSd is an in-house modification of a popular BBSD in Taiwan.

There is an innd backend to BBSd that gets a full feed of tw.bbs.* and many other local newsgroups. Average file size is about 4K.

/home/bbsusers* is on a vinum stripe'd volume with 3 Ultra160 9G 10000RPM drives on sym0 at stripe size 256K. Greg: I know this should be a prime number, can we safely use <150K stripe sizes? CPU time is not a problem.

The other parts of the system rest on 3*Ultra160 9g 10K RPM on AHC0 at stripe size 256K.

Physical memory is 2.5 GB. We do MFS and it croaks/crashes at midnight, our peak load time. We do md0, it croaks before peak time. Dual PIII-750 CPU's

Due to the structure of BBS's, we cannot split the load across different servers. We also think that we probably cannot get more performance out of hardware upgrades that we can afford. (i.e. Please don't tell us to buy a Starfire 4500 :-) We are all volunteer werkers at El Cheapo university budgets.)

We average around 30.0 server load with no noticeable delays for users. Peak load is up to 50.0. Average process count is around 4000 to 5000.

We have followed Alfred's advice to do sysctl -w vfs.vmiodirenable=1
It allows us to survive the peak load a little longer than before.

And we are putting our logs of sockstat, iostat 5, vmstat 5, netstat 5, dmesg, uname -a on the following URL.
http://zoo.ee.ntu.edu.tw/~keichii/

*DRUM ROLL*
What do you think we can do to make this server survive the peak load of around 5000 users? :)

* How should we set up our IPFW?
* What should be the optimal newfs and tunefs configurations for our filesystems?
* What should we try as vinum stripe sizes?
* What is possibly the bottleneck that we have for load 30.0? (since we are not CPU-bound nor memory bound)
* Are there any VM tweaks that we can do?
* Anything else we can do?

Thanks,
Michael
--
+-----------------------------------------------------------+
| keichii@iteration.net         | keichii@freebsd.org       |
| http://iteration.net/~keichii | Yes, BSD is a conspiracy. |
+-----------------------------------------------------------+

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 9:12:39 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id B468F37B71B; Tue, 20 Mar 2001 09:12:32 -0800 (PST) (envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.1/8.11.1) with SMTP id f2KHCNh78306; Tue, 20 Mar 2001 12:12:27 -0500 (EST) (envelope-from robert@fledge.watson.org)
Date: Tue, 20 Mar 2001 12:12:22 -0500 (EST)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: "Michael C . Wu"
Cc: freebsd-fs@FreeBSD.org, mckusick@FreeBSD.org, jedgar@FreeBSD.org
Subject: Re: First round review request, ACLs for UFS commit
In-Reply-To: <20010320000006.C43637@peorth.iteration.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, 20 Mar 2001, Michael C .
Wu wrote:

> On Tue, Mar 20, 2001 at 12:10:57AM -0500, Robert Watson scribbled:
> | For the past few months, we (Chris Faulhaber and myself) have been
>
> Great work guys! I think the code that you guys produce
> is very well written and commented. :)

Thanks :-).

> | The most recent revision of the ACL code is 0.6.1, available for download
> | http://www.TrustedBSD.org/downloads/
>
> Just a small question after reading the latest patch. You don't seem to
> handle the case where the user forgets that he is not mounting an ACL'ed
> filesystem and expecting ACL's to work. There seems to be a default
> fallback to old behavior. Is this necessarily good? i.e. Should we have
> a default set of ACL's instead?

POSIX.1e defines ACLs to act as a superset to normal UNIX file permissions. This means that applications which expect permissions act on ACL'd file systems safely (if a little oddly) without modification. However, it does mean that if applications want to run on both types, and have ACL-aware behavior, they must know how to deal with that.

In an earlier iteration, I actually had ACL emulation running over permissions, where it was possible to address permission-based file systems using the ACL interfaces and get the "right" behavior: acl_get_{fd,file}() would return an ACL derived from the permissions, and acl_set_{fd,file}() would accept ACLs that can be represented purely as permissions, returning an appropriate error if not. However, this is actually in violation of the POSIX.1e spec (see discussion on the POSIX.1e mailing list that I host), and so in more recent revisions of the code, emulation is not provided. I was tempted to leave a kernel option in place that provided emulation on non-ACL file systems, but the impact would be that applications might make incorrect assumptions.
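
[A minimal sketch of the emulation idea described above, written in Python rather than kernel C. This is not the TrustedBSD code: the tag names, the tuple layout, and the function names acl_from_mode() and mode_from_acl() are made up for illustration, and the set-id and sticky bits are ignored.]

```python
# Hypothetical illustration only; not the TrustedBSD implementation.
# An ACL entry is modelled as (tag, qualifier, "rwx"-style string).
BASE_TAGS = ("user_obj", "group_obj", "other")
PERM_BITS = (("r", 4), ("w", 2), ("x", 1))


def _rwx(bits):
    """Render a 3-bit permission value as an 'rwx'-style string."""
    return "".join(ch if bits & mask else "-" for ch, mask in PERM_BITS)


def acl_from_mode(mode):
    """Derive the three-entry ACL that plain permission bits can express."""
    return [
        ("user_obj", None, _rwx((mode >> 6) & 7)),
        ("group_obj", None, _rwx((mode >> 3) & 7)),
        ("other", None, _rwx(mode & 7)),
    ]


def mode_from_acl(acl):
    """Map an ACL back to mode bits, refusing anything permissions cannot hold."""
    perms = {}
    for tag, qualifier, rwx in acl:
        if tag not in BASE_TAGS or qualifier is not None:
            raise ValueError("entry %s:%s needs real ACL storage" % (tag, qualifier))
        if tag in perms:
            raise ValueError("duplicate %s entry" % tag)
        # Add up the bit value of every permission letter that is set.
        perms[tag] = sum(mask for ch, (_, mask) in zip(rwx, PERM_BITS) if ch != "-")
    if set(perms) != set(BASE_TAGS):
        raise ValueError("ACL must contain user_obj, group_obj and other")
    return (perms["user_obj"] << 6) | (perms["group_obj"] << 3) | perms["other"]


if __name__ == "__main__":
    print(acl_from_mode(0o640))
    print(oct(mode_from_acl(acl_from_mode(0o755))))
```

An ACL carrying a named user or group entry has no permission-bit equivalent, which is the case where the emulated set call would have had to return an error.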

I've also been tempted to grab an FFS superblock flag to indicate whether or not the given file system has ACLs enabled, so that ACL policy is available to the file system checker, and allowing distinguishing of the ACL policy from mount policy. However, we only have a very limited number of FFS flags, and an even more limited number of mount flags. In my ideal world, the following FFS flags would be allocated:

FS_EA      UFS extended attributes are enabled for this file system
FS_ACL     ACLs are enabled for this file system
FS_EANG    FFS next generation EAs are enabled for this file system

This would allow fsck-time and mount-time activity to be more policy driven (right now, if the UFS_ACL code is enabled and the two posix1e.acl_{access,default} EAs are available, ACLs are available, which is not quite so desirable). It looks like the FFS flags field is only 8 bits though :-(.

> Secondly, how does this affect the performance of the filesystem?

The performance of the current ACL implementation is primarily driven by the speed of the EA implementation. ACLs themselves add very, very little overhead, it all depends on where and how the ACLs are stored on the file system. The current EA implementation makes use of backing files which are essentially inode number indexed arrays of EA data for each defined EA. There are several performance concerns with this implementation:

1) For a given EA, all access to the EA is channeled through a single vnode lock, preventing concurrent access. (Concurrent access is also limited by the EA implementation, which is something I'm working on fixing by moving to using mutexes and outstanding transaction counts, but it will be a bit before I can get that work finished and evaluated).

2) Locality is not taken into account in storing EAs on disk near the inode they correspond to, introducing potential extra disk seeks.

3) EA and more specifically ACL data is not read in with the inode and so operations that need access to the ACL data may take additional seeks to pull it in.

For some initial evaluation of the performance of the current EA mechanism, please see my paper presented at BSDCon 2000. It's available on the TrustedBSD site under Documentation -> Implementation. Note that the interfaces and implementation have evolved somewhat since then, but a number of the general concerns there are still accurate.

Because ACL performance is driven by EA performance, and because the EA implementation is cleanly separated from the ACL implementation (the ACL code accesses the EAs using the normal EA vnode operations on UFS), it's possible to improve ACL performance by improving EA performance. In fact, modulo a bit of fuzz, the cost of ACLs is probably identical to the cost of EAs. As such, the main target for improving ACL performance would be improving EA performance. We hope to have work on this started by late summer, pending the availability of resources to do that. (Hence the FS_EANG flag described above, distinguishing the two implementations).

> Also, there are no man pages that I see.

There is not currently a good source of documentation for the actual UFS ACL implementation. However, lots of general ACL documentation actually already exists in the base system. Please take a look at:

getfacl(1), setfacl(1)
acl(3), acl_get(3), acl_delete(3), acl_dup(3), acl_free(3), acl_from_text(3), acl_get(3), acl_init(3), acl_set(3), acl_to_text(3), acl_valid(3)
acl(9), VOP_ACLCHECK(9), VOP_GETACL(9), VOP_SETACL(9)

What I should probably do to address the lack of UFS documentation is add a README.extendedattributes and README.acls in the ufs/ufs directory, in the style of similar files in ufs/ffs. Also, we should add a UFS man page of some sort that documents various UFS features (including hacks such as SUIDDIR, etc) to make these features more accessible.

You might also be interested in:

extattrctl(8), getextattr(8), setextattr(8)
extattr(9), VOP_GETEXTATTR(9), VOP_SETEXTATTR(9)

Note that substantial changes are still underway for the EA mechanisms and interfaces to reflect the needs of applications now being written to use them (Thomas Moestl has an updated tar that speaks ACLs and EAs, and has noted a number of limitations to the EA interface that need to be addressed). There's also portability work going on with respect to the EA interface that may result in changes to EAs.

Let me know if you have any additional questions.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 9:22:45 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 10FEF37B728; Tue, 20 Mar 2001 09:22:41 -0800 (PST) (envelope-from keichii@peorth.iteration.net)
Received: by peorth.iteration.net (Postfix, from userid 1001) id 76F3B59283; Tue, 20 Mar 2001 11:22:40 -0600 (CST)
Date: Tue, 20 Mar 2001 11:22:40 -0600
From: "Michael C . Wu"
To: izero@ms26.hinet.net
Cc: dillon@freebsd.org, grog@freebsd.org, fs@freebsd.org, hackers@freebsd.org
Subject: Re: tuning a VERY heavily (30.0) loaded server
Message-ID: <20010320112239.A52424@peorth.iteration.net>
Reply-To: "Michael C .
Wu"
References: <20010320111144.A51924@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320111144.A51924@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 11:11:44AM -0600
X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20
X-PGP-Key-ID: 0x8FA12E20
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, Mar 20, 2001 at 11:11:44AM -0600, Michael C . Wu scribbled:
| system stats at
| http://zoo.ee.ntu.edu.tw/~keichii/
| It runs a self-written Perl SMTP daemon. (Sendmail and Postfix croak)
| SMTPD pipes the mail to "bbsmail" that delivers the mail to
| BBS users. SMTPd averages about

$ mailq |wc -l
2694
$ gzcat maillog.0.gz |wc -l
14407
$ gzcat maillog.2.gz |wc -l
52413

--
+-----------------------------------------------------------+
| keichii@iteration.net         | keichii@freebsd.org       |
| http://iteration.net/~keichii | Yes, BSD is a conspiracy. |
+-----------------------------------------------------------+

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 9:27:25 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 4942337B729; Tue, 20 Mar 2001 09:27:19 -0800 (PST) (envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2KHRHg21496; Tue, 20 Mar 2001 09:27:17 -0800 (PST)
Date: Tue, 20 Mar 2001 09:27:17 -0800
From: Alfred Perlstein
To: "Michael C .
Wu"
Cc: dillon@FreeBSD.ORG, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
Message-ID: <20010320092717.R29888@fw.wintelcom.net>
References: <20010320111144.A51924@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320111144.A51924@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 11:11:44AM -0600
X-all-your-base: are belong to us.
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* Michael C . Wu [010320 09:11] wrote:
> [Lengthy email, bear with me please, it is quite interesting.
> This box averages 30.0 load with no problems.]

cool..

> system stats at
> http://zoo.ee.ntu.edu.tw/~keichii/

Where's the crashdump/traceback?

> Physical memory is 2.5 GB. We do MFS and it croaks/crashes
> at midnight, our peak load time. We do md0, it croaks before
> peak time.

Explain the crash. What is md0/MFS being used for? Why do you need it?

> Due to the structure of BBS's, we cannot split the load across
> different servers. We also think that we probably cannot
> get more performance out of hardware upgrades that we can afford.
> (i.e. Please don't tell us to buy a Starfire 4500 :-) We are all volunteer
> werkers at El Cheapo university budgets.)

Well, getting hardware RAID is always a nice thing and really not too expensive.

> We have followed Alfred's advice to do sysctl -w vfs.vmiodirenable=1
> It allows us to survive the peak load a little longer than before.

cool..

> And we are putting our logs of sockstat, iostat 5, vmstat 5,
> netstat 5, dmesg, uname -a on the following URL.
>
> http://zoo.ee.ntu.edu.tw/~keichii/
>
> *DRUM ROLL*
> What do you think we can do to make this server survive the
> peak load of around 5000 users? :)
> [snip several non-interesting ideas]
> * Anything else we can do?

Well first off, telling us which version of FreeBSD this is...

Second, provide a crashdump with debug symbols, and show us the backtrace.

Third, consider alternatives to MFS since it seems to be a key factor in your stability problems. If you just need a pretty fast /tmp, I would use a softupdates partition as it's probably more efficient than MFS/MD.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 9:38:35 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 44B2F37B734; Tue, 20 Mar 2001 09:38:24 -0800 (PST) (envelope-from keichii@peorth.iteration.net)
Received: by peorth.iteration.net (Postfix, from userid 1001) id EE4CA59283; Tue, 20 Mar 2001 11:38:18 -0600 (CST)
Date: Tue, 20 Mar 2001 11:38:18 -0600
From: "Michael C . Wu"
To: Alfred Perlstein
Cc: "Michael C . Wu" , dillon@FreeBSD.ORG, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
Message-ID: <20010320113818.B52586@peorth.iteration.net>
Reply-To: "Michael C . Wu"
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320092717.R29888@fw.wintelcom.net>; from bright@wintelcom.net on Tue, Mar 20, 2001 at 09:27:17AM -0800
X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20
X-PGP-Key-ID: 0x8FA12E20
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, Mar 20, 2001 at 09:27:17AM -0800, Alfred Perlstein scribbled:
| * Michael C . Wu [010320 09:11] wrote:
| > [Lengthy email, bear with me please, it is quite interesting.
| > This box averages 30.0 load with no problems.]
|
| cool..
|
FreeBSD zoo.ee.ntu.edu.tw 4.2-STABLE FreeBSD 4.2-STABLE #0: Tue Mar 20 11:10:46 CST 2001 root@:/usr/src/sys/compile/SimFarm i386

| > system stats at
| > http://zoo.ee.ntu.edu.tw/~keichii/
|
| Where's the crashdump/traceback?

I'll try to get one tomorrow night. It always crashes. :) It's quite hard trying to get a crashdump when you are 15000 miles away from the console.

| > Physical memory is 2.5 GB. We do MFS and it croaks/crashes
| > at midnight, our peak load time. We do md0, it croaks before
| > peak time.
|
| Explain the crash. What is md0/MFS being used for? Why do you
| need it?

md0/MFS is used for caching the articles that BBS users read. They often read the same articles over and over again, and we find that a 128MB MFS/md0 will have 70% hitrate

When our MFS/md0 fills up after long usage, the box easily dies. (We crontab clean the mfs, but sometimes the load shoots up for no reason and is not able to clean the mfs in time.) If we don't do this cache, the data for the bulletin boards

| > Due to the structure of BBS's, we cannot split the load across
| > different servers. We also think that we probably cannot
| > get more performance out of hardware upgrades that we can afford.
| > (i.e. Please don't tell us to buy a Starfire 4500 :-) We are all volunteer
| > werkers at El Cheapo university budgets.)
|
| Well, getting hardware RAID is always a nice thing and really not
| too expensive.

I looked into that; it seems that hardware RAID would perform worse, because the RAID card's onboard CPU becomes the bottleneck.

| > We have followed Alfred's advice to do sysctl -w vfs.vmiodirenable=1
| > It allows us to survive the peak load a little longer than before.
|
| cool..
|
| > And we are putting our logs of sockstat, iostat 5, vmstat 5,
| > netstat 5, dmesg, uname -a on the following URL.
| >
| > http://zoo.ee.ntu.edu.tw/~keichii/
| > * Anything else we can do?
|
| Well first off, telling us which version of FreeBSD this is...
|
FreeBSD zoo.ee.ntu.edu.tw 4.2-STABLE FreeBSD 4.2-STABLE #0: Tue Mar 20 11:10:46 CST 2001 root@:/usr/src/sys/compile/SimFarm i386

| Second, provide a crashdump with debug symbols, and show us
| the backtrace.
|
| Third, consider alternatives to MFS since it seems to be a key
| factor in your stability problems. If you just need a pretty
| fast /tmp, I would use a softupdates partition as it's probably
| more efficient than MFS/MD.

The hard drives will die very quickly if we don't have MFS...

--
+-----------------------------------------------------------+
| keichii@iteration.net         | keichii@freebsd.org       |
| http://iteration.net/~keichii | Yes, BSD is a conspiracy. |
+-----------------------------------------------------------+

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 9:49:30 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from boreas.isi.edu (boreas.isi.edu [128.9.160.161]) by hub.freebsd.org (Postfix) with ESMTP id D0B1D37B740; Tue, 20 Mar 2001 09:49:21 -0800 (PST) (envelope-from faber@ISI.EDU)
Received: from ted.isi.edu (ted.isi.edu [128.9.160.104]) by boreas.isi.edu (8.11.2/8.11.2) with ESMTP id f2KHnK520704; Tue, 20 Mar 2001 09:49:20 -0800 (PST)
Received: (from faber@localhost) by ted.isi.edu (8.11.2/8.11.2) id f2KHmls08662; Tue, 20 Mar 2001 09:48:47 -0800 (PST) (envelope-from faber)
Date: Tue, 20 Mar 2001 09:48:37 -0800
From: Ted Faber
To: "Michael C . Wu"
Cc: Alfred Perlstein , "Michael C .
Wu" , dillon@FreeBSD.ORG, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
Message-ID: <20010320094837.B1284@ted.isi.edu>
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="BwCQnh7xodEAoBMC"
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320113818.B52586@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 11:38:18AM -0600
X-url: http://www.isi.edu/~faber
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

--BwCQnh7xodEAoBMC
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Mar 20, 2001 at 11:38:18AM -0600, Michael C . Wu wrote:
> On Tue, Mar 20, 2001 at 09:27:17AM -0800, Alfred Perlstein scribbled:
> | * Michael C . Wu [010320 09:11] wrote:
> | > Physical memory is 2.5 GB. We do MFS and it croaks/crashes
> | > at midnight, our peak load time. We do md0, it croaks before
> | > peak time.
> |
> | Explain the crash. What is md0/MFS being used for? Why do you
> | need it?
>
> md0/MFS is used for caching the articles that BBS users read.
> They often read the same articles over and over again,
> and we find that a 128MB MFS/md0 will have 70% hitrate
>
> When our MFS/md0 fills up after long usage, the box easily
> dies. (We crontab clean the mfs, but sometimes the load
> shoots up for no reason and is not able to clean the mfs in time.)
> If we don't do this cache, the data for the bulletin boards

Forgive me if this is a stupid question, but how much swap is there on this machine? Is the combination of the packed MFS and high process load exhausting your swap?

--BwCQnh7xodEAoBMC
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (FreeBSD)
Comment: For info see http://www.gnupg.org

iD8DBQE6t5f1aUz3f+Zf+XsRAnKTAKD2KfRmKT5xISmnSw92iVPTxdGTtgCffv16
V6TK3KKHF799LzyDTMhxu7o=
=PUFq
-----END PGP SIGNATURE-----

--BwCQnh7xodEAoBMC--

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 9:51:45 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 6F0BD37B740; Tue, 20 Mar 2001 09:51:41 -0800 (PST) (envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KHopk94248; Tue, 20 Mar 2001 09:50:51 -0800 (PST) (envelope-from dillon)
Date: Tue, 20 Mar 2001 09:50:51 -0800 (PST)
From: Matt Dillon
Message-Id: <200103201750.f2KHopk94248@earth.backplane.com>
To: "Michael C . Wu"
Cc: Alfred Perlstein , "Michael C . Wu" , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net>
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

One thing that comes to mind is that you can smarthost your outgoing email to another host so the queues don't build up. This should greatly reduce mail load. In fact, I would recommend offloading email entirely if possible... email always hits disks hard.

Definitely get rid of MFS. MFS wastes 2x the memory allocated to it. Use a softupdates-enabled filesystem in place of MFS, or use a swap-backed VN-based partition with softupdates enabled.

Alfred's vmiodirenable suggestion is a good one. With all the memory you have you can also try turning off write_behind, e.g.
setting vfs.write_behind to 0.

I don't have enough information on the type of paging your machine is doing or the disk configuration. If you have multiple HD's, swap should definitely be spread across at least two of them.

A few minutes worth of 'vmstat 1' output during the heavily loaded period would be useful, plus 'sysctl -a | fgrep vm'. I might be able to make suggestions on optimizing the VM system.

-Matt

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 9:55:25 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id C050437B73F; Tue, 20 Mar 2001 09:55:21 -0800 (PST) (envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KHstW94364; Tue, 20 Mar 2001 09:54:55 -0800 (PST) (envelope-from dillon)
Date: Tue, 20 Mar 2001 09:54:55 -0800 (PST)
From: Matt Dillon
Message-Id: <200103201754.f2KHstW94364@earth.backplane.com>
To: "Michael C . Wu"
Cc: Alfred Perlstein , "Michael C . Wu" , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net>
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

:md0/MFS is used for caching the articles that BBS users read.
:They often read the same articles over and over again,
:and we find that a 128MB MFS/md0 will have 70% hitrate
:
:When our MFS/md0 fills up after long usage, the box easily
:dies. (We crontab clean the mfs, but sometimes the load
:shoots up for no reason and is not able to clean the mfs in time.)
:If we don't do this cache, the data for the bulletin boards

Definitely throw away MFS.
A normal filesystem is plenty good enough for caching articles that BBS users read. MFS will just waste memory unnecessarily.

It does seem to me that you might not have sufficient swap configured either, as per Ted's thought. With 2.5G of physical memory, you should have *AT LEAST* 3G of configured swap. I would recommend a 1G swap partition on each of your three 9G drives (for 3G total).

-Matt

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 10: 1:39 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from ns1.via-net-works.net.ar (ns1.via-net-works.net.ar [200.10.100.10]) by hub.freebsd.org (Postfix) with ESMTP id 519BD37B73F for ; Tue, 20 Mar 2001 10:01:32 -0800 (PST) (envelope-from fpscha@ns1.via-net-works.net.ar)
Received: (from fpscha@localhost) by ns1.via-net-works.net.ar (8.9.3/8.9.3) id OAA69131 for freebsd-fs@freebsd.org; Tue, 20 Mar 2001 14:54:57 -0300 (ART)
From: Fernando Schapachnik
Message-Id: <200103201754.OAA69131@ns1.via-net-works.net.ar>
Subject: growfs
To: freebsd-fs@freebsd.org
Date: Tue, 20 Mar 2001 14:54:57 -0300 (ART)
Reply-To: Fernando Schapachnik
X-Mailer: ELM [version 2.4ME+ PL82 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=ISO-8859-1
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hello,

I was wondering how usable the growfs implementation available in -CURRENT is. Any chance of using it on -STABLE?

Thanks!

Fernando P. Schapachnik
Administración de la red
VIA NET.WORKS ARGENTINA S.A.
fschapachnik@vianetworks.com.ar

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 10: 2:31 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 2E40237B73F; Tue, 20 Mar 2001 10:02:26 -0800 (PST) (envelope-from keichii@peorth.iteration.net)
Received: by peorth.iteration.net (Postfix, from userid 1001) id 3E35359283; Tue, 20 Mar 2001 12:01:12 -0600 (CST)
Date: Tue, 20 Mar 2001 12:01:12 -0600
From: "Michael C . Wu"
To: izero@ms26.hinet.net, cross@math.psu.edu
Cc: Alfred Perlstein , "Michael C . Wu" , dillon@FreeBSD.ORG, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded scerver
Message-ID: <20010320120112.C52586@peorth.iteration.net>
Reply-To: "Michael C . Wu"
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320113818.B52586@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 11:38:18AM -0600
X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20
X-PGP-Key-ID: 0x8FA12E20
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

MRTG Graph at http://zoonews.ee.ntu.edu.tw/mrtg/zoo.html

|
| FreeBSD zoo.ee.ntu.edu.tw 4.2-STABLE FreeBSD 4.2-STABLE
| #0: Tue Mar 20 11:10:46 CST 2001 root@:/usr/src/sys/compile/SimFarm i386
|
| | > system stats at
| | > http://zoo.ee.ntu.edu.tw/~keichii/
| md0/MFS is used for caching the articles that BBS users read.
| They often read the same articles over and over again,
| and we find that a 128MB MFS/md0 will have 70% hitrate
|
| When our MFS/md0 fills up after long usage, the box easily
| dies.
(We crontab clean the mfs, but sometimes the load
| shoots up for no reason and is not able to clean the mfs in time.)
| If we don't do this cache, the data for the bulletin boards
|

Another problem is that we have around 4000+ processes accessing lots of SHM at the same time..

The *UGLY* source code for the BBS is at http://zoo.ee.ntu.edu.tw/~keichii/zoo_bbsd_src.tgz

We can only provide crash dumps for trusted people because of the thousands of passwords in the dump.

--
+-----------------------------------------------------------+
| keichii@iteration.net         | keichii@freebsd.org       |
| http://iteration.net/~keichii | Yes, BSD is a conspiracy. |
+-----------------------------------------------------------+

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 10: 5:42 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id E6B0E37B73F; Tue, 20 Mar 2001 10:05:35 -0800 (PST) (envelope-from keichii@peorth.iteration.net)
Received: by peorth.iteration.net (Postfix, from userid 1001) id F01F259283; Tue, 20 Mar 2001 12:03:14 -0600 (CST)
Date: Tue, 20 Mar 2001 12:03:14 -0600
From: "Michael C . Wu"
To: Ted Faber
Cc: fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
Message-ID: <20010320120314.D52586@peorth.iteration.net>
Reply-To: "Michael C .
Wu"
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320094837.B1284@ted.isi.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320094837.B1284@ted.isi.edu>; from faber@ISI.EDU on Tue, Mar 20, 2001 at 09:48:37AM -0800
X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20
X-PGP-Key-ID: 0x8FA12E20
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, Mar 20, 2001 at 09:48:37AM -0800, Ted Faber scribbled:
| On Tue, Mar 20, 2001 at 11:38:18AM -0600, Michael C . Wu wrote:
| > On Tue, Mar 20, 2001 at 09:27:17AM -0800, Alfred Perlstein scribbled:
| > | * Michael C . Wu [010320 09:11] wrote:
| > | > Physical memory is 2.5 GB. We do MFS and it croaks/crashes
| > | > at midnight, our peak load time. We do md0, it croaks before
| > | > peak time.
| > |
| > | Explain the crash. What is md0/MFS being used for? Why do you
| > | need it?
| >
| > md0/MFS is used for caching the articles that BBS users read.
| > They often read the same articles over and over again,
| > and we find that a 128MB MFS/md0 will have 70% hitrate
| >
| > When our MFS/md0 fills up after long usage, the box easily
| > dies. (We crontab clean the mfs, but sometimes the load
| > shoots up for no reason and is not able to clean the mfs in time.)
| > If we don't do this cache, the data for the bulletin boards
|
| Forgive me if this is a stupid question, but how much swap is there on
| this machine? Is the combination of the packed MFS and high process
| load exhausting your swap?

SWAP is never touched.
:)

last pid: 23395; load averages: 2.08, 2.92, 3.60 up 0+01:29:58 02:03:27
1529 processes:24 running, 1505 sleeping
CPU states: 40.5% user, 0.0% nice, 46.4% system, 1.1% interrupt, 12.0% idle
Mem: 705M Active, 1369M Inact, 332M Wired, 99M Cache, 265M Buf, 7504K Free
Swap: 512M Total, 512M Free

-- 
+-----------------------------------------------------------+
| keichii@iteration.net         | keichii@freebsd.org       |
| http://iteration.net/~keichii | Yes, BSD is a conspiracy. |
+-----------------------------------------------------------+

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 10:12: 2 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 0F33237B718; Tue, 20 Mar 2001 10:11:55 -0800 (PST) (envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2KI99W22661; Tue, 20 Mar 2001 10:09:09 -0800 (PST)
Date: Tue, 20 Mar 2001 10:09:09 -0800
From: Alfred Perlstein
To: "Michael C . Wu"
Cc: izero@ms26.hinet.net, cross@math.psu.edu, "Michael C . Wu" , dillon@FreeBSD.ORG, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded scerver
Message-ID: <20010320100909.T29888@fw.wintelcom.net>
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320120112.C52586@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 12:01:12PM -0600
X-all-your-base: are belong to us.
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* Michael C . 
Wu [010320 10:01] wrote: > MRTG Graph at > http://zoonews.ee.ntu.edu.tw/mrtg/zoo.html > > | > | FreeBSD zoo.ee.ntu.edu.tw 4.2-STABLE FreeBSD 4.2-STABLE > | #0: Tue Mar 20 11:10:46 CST 2001 root@:/usr/src/sys/compile/SimFarm i386 > | > | | > system stats at > | | > http://zoo.ee.ntu.edu.tw/~keichii/ > | md0/MFS is used for caching the articles that BBS users read. > | They often read the same articles over and over again, > | and we find that a 128MB MFS/md0 will have 70% hitrate > | > | When our MFS/md0 fills up after long usage, the box easily > | dies. (We crontab clean the mfs, but sometimes the load > | shoots up for no reason and is not able to clean the mfs in time.) > | If we dont do this cache, the data for the bulletin boards > | > > Another problem is that we have around 4000+ processes accessing > lots of SHM at the same time.. How much SHM? Like, what's the combined size of all segments in the system? You can make SHM non-pageable which results in a lot of saved memory for attached processes. You want to be after this date and have this file: Revision 1.3.2.3 / (download) - annotate - [select for diffs], Sun Dec 17 02:05:41 2000 UTC (3 months ago) by alfred Branch: RELENG_4 Changes since 1.3.2.2: +37 -32 lines Diff to previous 1.3.2.2 (colored) to branchpoint 1.3 (colored) next main 1.4 (colored) MFC: phys_pager fix for multiple segments Then set kern.ipc.shm_use_phys=1 > The *UGLY* source code for the BBS is at > http://zoo.ee.ntu.edu.tw/~keichii/zoo_bbsd_src.tgz tis ok, maybe later... though :) > > We can only provide crash dumps for trusted people because > of the thousands of passwords in the dump. Heh. 
:) -- -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:19:38 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 0066637B719; Tue, 20 Mar 2001 10:19:30 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KIHMx94840; Tue, 20 Mar 2001 10:17:22 -0800 (PST) (envelope-from dillon) Date: Tue, 20 Mar 2001 10:17:22 -0800 (PST) From: Matt Dillon Message-Id: <200103201817.f2KIHMx94840@earth.backplane.com> To: Alfred Perlstein Cc: "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, "Michael C . Wu" , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <20010320100909.T29888@fw.wintelcom.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org : :How much SHM? Like, what's the combined size of all segments in :the system? You can make SHM non-pageable which results in a lot :of saved memory for attached processes. : :You want to be after this date and have this file: : : :Revision 1.3.2.3 / (download) - annotate - [select for diffs], Sun Dec 17 02:05:41 2000 UTC (3 months ago) by alfred :Branch: RELENG_4 :Changes since 1.3.2.2: +37 -32 lines :Diff to previous 1.3.2.2 (colored) to branchpoint 1.3 (colored) next main 1.4 (colored) : :MFC: phys_pager fix for multiple segments : :Then set kern.ipc.shm_use_phys=1 We never MFC'd that? After the release we should. 
-Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:24:31 2001 Delivered-To: freebsd-fs@freebsd.org Received: from boreas.isi.edu (boreas.isi.edu [128.9.160.161]) by hub.freebsd.org (Postfix) with ESMTP id C9CB237B718; Tue, 20 Mar 2001 10:24:22 -0800 (PST) (envelope-from faber@ISI.EDU) Received: from ted.isi.edu (ted.isi.edu [128.9.160.104]) by boreas.isi.edu (8.11.2/8.11.2) with ESMTP id f2KIM1527394; Tue, 20 Mar 2001 10:22:01 -0800 (PST) Received: (from faber@localhost) by ted.isi.edu (8.11.2/8.11.2) id f2KIM1909090; Tue, 20 Mar 2001 10:22:01 -0800 (PST) (envelope-from faber) Date: Tue, 20 Mar 2001 10:21:56 -0800 From: Ted Faber To: "Michael C . Wu" Cc: fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded server Message-ID: <20010320102156.C1284@ted.isi.edu> References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320094837.B1284@ted.isi.edu> <20010320120314.D52586@peorth.iteration.net> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=php-sha1; protocol="application/pgp-signature"; boundary="ZwgA9U+XZDXt4+m+" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010320120314.D52586@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 12:03:14PM -0600 X-url: http://www.isi.edu/~faber Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org --ZwgA9U+XZDXt4+m+ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Tue, Mar 20, 2001 at 12:03:14PM -0600, Michael C . Wu wrote: > On Tue, Mar 20, 2001 at 09:48:37AM -0800, Ted Faber scribbled: > | Forgive me if this is a stupid question, but how much swap is there on > | this machine? Is the combination of the packed MFS and high process > | load exhausting your swap? > > > SWAP is never touched. 
:) > > last pid: 23395; load averages: 2.08, 2.92, 3.60 up 0+01:29:58 02:03:27 > 1529 processes:24 running, 1505 sleeping > CPU states: 40.5% user, 0.0% nice, 46.4% system, 1.1% interrupt, 12.0% idle > Mem: 705M Active, 1369M Inact, 332M Wired, 99M Cache, 265M Buf, 7504K Free > Swap: 512M Total, 512M Free A couple other people have mentioned that this is your swap load when the machine's quiet. MFS can exhaust your swap quickly, and if you scale these load numbers up by a factor of 10, I think you're going to touch swap. (Even here you're already down to 7M free mem.) I wouldn't be surprised if you're running out of swap. --ZwgA9U+XZDXt4+m+ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.4 (FreeBSD) Comment: For info see http://www.gnupg.org iD8DBQE6t5/EaUz3f+Zf+XsRAlCNAJsGrpqo0bwQ4UWFZKzu+ZgCb6ROmQCbBFNw h1gZvaVRLZ4fhbGWbCZNJG4= =J9px -----END PGP SIGNATURE----- --ZwgA9U+XZDXt4+m+-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:24:44 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id B338E37B71D; Tue, 20 Mar 2001 10:24:31 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 91F8C59289; Tue, 20 Mar 2001 12:22:45 -0600 (CST) Date: Tue, 20 Mar 2001 12:22:45 -0600 From: "Michael C . Wu" To: Matt Dillon Cc: "Michael C . Wu" , Alfred Perlstein , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded server Message-ID: <20010320122245.E52586@peorth.iteration.net> Reply-To: "Michael C . 
Wu"
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <200103201750.f2KHopk94248@earth.backplane.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200103201750.f2KHopk94248@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Mar 20, 2001 at 09:50:51AM -0800
X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20
X-PGP-Key-ID: 0x8FA12E20
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, Mar 20, 2001 at 09:50:51AM -0800, Matt Dillon scribbled:
| One thing that comes to mind is that you can smarthost your outgoing
| email to another host so the queues don't build up. This should
| greatly reduce mail load. In fact, I would recommend offloading email
| entirely if possible... email always hits disks hard.
|
| Definitely get rid of MFS. MFS wastes 2x the memory allocated to it.
| Use a softupdates-enabled filesystem in place of MFS, or use a
| swap-backed VN-based partition with softupdates enabled.
| Alfred's vmiodirenable suggestion is a good one.
|
| With all the memory you have you can also try turning off write_behind,
| e.g. setting vfs.write_behind to 0.

Done. :) Thank you.

| I don't have enough information on the type of paging your machine
| is doing or the disk configuration. If you have multiple HD's, swap
| should definitely be spread across at least two of them.
|
| A few minutes worth of 'vmstat 1' output during the heavily loaded
| period would be useful, plus 'sysctl -a | fgrep vm'. I might be able

sysctl -a always crashes the system. It happens on other similarly
loaded BBS'es in Taiwan.

| to make suggestions on optimizing the VM system.
We have 'vmstat 5' available at http://zoo.ee.ntu.edu.tw/~keichii/ Fresh hot vmstat 1 log at http://zoo.ee.ntu.edu.tw/~keichii/vmstat_1.log -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. | +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:24:49 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 4281F37B71B; Tue, 20 Mar 2001 10:24:28 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 780F259283; Tue, 20 Mar 2001 12:21:56 -0600 (CST) Date: Tue, 20 Mar 2001 12:21:56 -0600 From: "Michael C . Wu" To: Alfred Perlstein Cc: izero@ms26.hinet.net, cross@math.psu.edu, "Michael C . Wu" , dillon@FreeBSD.ORG, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver Message-ID: <20010320122156.A53182@peorth.iteration.net> Reply-To: "Michael C . Wu" References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <20010320100909.T29888@fw.wintelcom.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010320100909.T29888@fw.wintelcom.net>; from bright@wintelcom.net on Tue, Mar 20, 2001 at 10:09:09AM -0800 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Tue, Mar 20, 2001 at 10:09:09AM -0800, Alfred Perlstein scribbled: | * Michael C . 
Wu [010320 10:01] wrote:
| > MRTG Graph at
| > http://zoonews.ee.ntu.edu.tw/mrtg/zoo.html
| >
| > |
| > | FreeBSD zoo.ee.ntu.edu.tw 4.2-STABLE FreeBSD 4.2-STABLE
| > | #0: Tue Mar 20 11:10:46 CST 2001 root@:/usr/src/sys/compile/SimFarm i386
| > |
| > | | > system stats at
| > | | > http://zoo.ee.ntu.edu.tw/~keichii/
| > | md0/MFS is used for caching the articles that BBS users read.
| > | They often read the same articles over and over again,
| > | and we find that a 128MB MFS/md0 will have 70% hitrate
| > |
| > | When our MFS/md0 fills up after long usage, the box easily
| > | dies. (We crontab clean the mfs, but sometimes the load
| > | shoots up for no reason and is not able to clean the mfs in time.)
| > | If we don't do this cache, the data for the bulletin boards
| > |
| >
| > Another problem is that we have around 4000+ processes accessing
| > lots of SHM at the same time..
|
| How much SHM? Like, what's the combined size of all segments in
| the system? You can make SHM non-pageable which results in a lot
| of saved memory for attached processes.
|

ipcs -b
Shared Memory:
T     ID   KEY  MODE         OWNER  GROUP    SEGSZ
m  65536  1304  --rw-------  bbs    bbs     131076
m  65537  1217  --rw-------  bbs    wheel  1633728
m  65538  1215  --rw-------  bbs    wheel   768016
m  65539  1219  --rw-------  bbs    bbs    2065956
m  65540  1111  --rw-------  bbs    bbs         12
m  65541  1302  --rw-------  bbs    bbs      40016
m  65542  1303  --rw-------  bbs    bbs      40016
m  65543  1201  --rw-------  bbs    bbs      33328
m  65544  1301  --rw-------  bbs    bbs      33328

VM_KMEM_SIZE_MAX: when we raise this variable, the system dies easily,
but on a similarly configured system (bbs.kkcity.com.tw) at similar
load, it helps a lot to keep the system stable.

We have a hashd daemon, written with pthreads, that uses 4300*2 unix
domain sockets. There are eight of these daemons, each serving about
500 bbsd's.

| Revision 1.3.2.3 / (download) - annotate - [select for diffs], Sun Dec 17 02:05:41 2000 UTC (3 months ago) by alfred | Branch: RELENG_4 | Changes since 1.3.2.2: +37 -32 lines | Diff to previous 1.3.2.2 (colored) to branchpoint 1.3 (colored) next main 1.4 (colored) | | MFC: phys_pager fix for multiple segments | | Then set kern.ipc.shm_use_phys=1 O.K. | > The *UGLY* source code for the BBS is at | > http://zoo.ee.ntu.edu.tw/~keichii/zoo_bbsd_src.tgz | | tis ok, maybe later... though :) Someone asked me in private for this. :) -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. | +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:24:56 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 513F337B719; Tue, 20 Mar 2001 10:24:49 -0800 (PST) (envelope-from bright@fw.wintelcom.net) Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2KIMkg23099; Tue, 20 Mar 2001 10:22:46 -0800 (PST) Date: Tue, 20 Mar 2001 10:22:46 -0800 From: Alfred Perlstein To: Matt Dillon Cc: "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, "Michael C . 
Wu" , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver Message-ID: <20010320102246.U29888@fw.wintelcom.net> References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <20010320100909.T29888@fw.wintelcom.net> <200103201817.f2KIHMx94840@earth.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103201817.f2KIHMx94840@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Mar 20, 2001 at 10:17:22AM -0800 X-all-your-base: are belong to us. Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org * Matt Dillon [010320 10:17] wrote: > : > :How much SHM? Like, what's the combined size of all segments in > :the system? You can make SHM non-pageable which results in a lot > :of saved memory for attached processes. > : > :You want to be after this date and have this file: > : > : > :Revision 1.3.2.3 / (download) - annotate - [select for diffs], Sun Dec 17 02:05:41 2000 UTC (3 months ago) by alfred > :Branch: RELENG_4 > :Changes since 1.3.2.2: +37 -32 lines > :Diff to previous 1.3.2.2 (colored) to branchpoint 1.3 (colored) next main 1.4 (colored) > : > :MFC: phys_pager fix for multiple segments > : > :Then set kern.ipc.shm_use_phys=1 > > We never MFC'd that? After the release we should. I MFC'd it a long time ago (3 months): > :Branch: RELENG_4 I just wasn't sure if he was up to date with 4.2-stable enough to get it. 
:) -- -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:27: 9 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 94F8237B718; Tue, 20 Mar 2001 10:27:03 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 1DF5A5928B; Tue, 20 Mar 2001 12:23:50 -0600 (CST) Date: Tue, 20 Mar 2001 12:23:50 -0600 From: "Michael C . Wu" To: Matt Dillon Cc: "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, Alfred Perlstein , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver Message-ID: <20010320122350.F52586@peorth.iteration.net> Reply-To: "Michael C . Wu" References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <200103201815.f2KIFR594803@earth.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103201815.f2KIFR594803@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Mar 20, 2001 at 10:15:27AM -0800 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Tue, Mar 20, 2001 at 10:15:27AM -0800, Matt Dillon scribbled: | | :Another problem is that we have around 4000+ processes accessing | :lots of SHM at the same time.. | | How big is 'lots'? If the shared memory segment is smallish, e.g. | less then 64MB, you should be ok. If it is larger then you will | have to do some kernel tuning to avoid running out of pmap entries. 
This is exactly what happens to us sometimes. We run out of pmap entries. :) But what can we tune? -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. | +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:27:16 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id BF0F537B719; Tue, 20 Mar 2001 10:27:09 -0800 (PST) (envelope-from bright@fw.wintelcom.net) Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2KINog23123; Tue, 20 Mar 2001 10:23:50 -0800 (PST) Date: Tue, 20 Mar 2001 10:23:50 -0800 From: Alfred Perlstein To: Matt Dillon Cc: "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver Message-ID: <20010320102350.V29888@fw.wintelcom.net> References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <200103201815.f2KIFR594803@earth.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103201815.f2KIFR594803@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Mar 20, 2001 at 10:15:27AM -0800 X-all-your-base: are belong to us. Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org * Matt Dillon [010320 10:16] wrote: > > :Another problem is that we have around 4000+ processes accessing > :lots of SHM at the same time.. > > How big is 'lots'? If the shared memory segment is smallish, e.g. > less then 64MB, you should be ok. 
If it is larger then you will > have to do some kernel tuning to avoid running out of pmap entries. kern.ipc.shm_use_phys should remove the need for pv entries. it's the default on solaris. -- -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:33: 1 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 3E3B137B718; Tue, 20 Mar 2001 10:32:57 -0800 (PST) (envelope-from bright@fw.wintelcom.net) Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2KITPL23345; Tue, 20 Mar 2001 10:29:25 -0800 (PST) Date: Tue, 20 Mar 2001 10:29:24 -0800 From: Alfred Perlstein To: "Michael C . Wu" Cc: Matt Dillon , "Michael C . Wu" , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded server Message-ID: <20010320102924.X29888@fw.wintelcom.net> References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <200103201750.f2KHopk94248@earth.backplane.com> <20010320122245.E52586@peorth.iteration.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010320122245.E52586@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 12:22:45PM -0600 X-all-your-base: are belong to us. Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org * Michael C . Wu [010320 10:27] wrote: > On Tue, Mar 20, 2001 at 09:50:51AM -0800, Matt Dillon scribbled: > > sysctl -a always crashes the system. It happens on other similiarly > loaded BBS'es in Taiwan. WHY ARE THERE NO TRACEBACKS BEING POSTED TO THE LISTS? THIS IS THE WHOLE POINT OF FREEBSD/OPEN-SOURCE. 
ARE YOU GUYS SO USED TO MICROSOFT THAT YOU DON'T EXPECT US TO CARE
ABOUT THIS? WE CARE, WE USE FREEBSD FOR OUR OWN BUSINESSES.

ARGH.

THANKS,
-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 10:35:21 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id C64F737B71A; Tue, 20 Mar 2001 10:35:14 -0800 (PST) (envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2KIVUc23476; Tue, 20 Mar 2001 10:31:30 -0800 (PST)
Date: Tue, 20 Mar 2001 10:31:30 -0800
From: Alfred Perlstein
To: "Michael C . Wu"
Cc: Matt Dillon , "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded scerver
Message-ID: <20010320103130.Y29888@fw.wintelcom.net>
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <200103201815.f2KIFR594803@earth.backplane.com> <20010320122350.F52586@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010320122350.F52586@peorth.iteration.net>; from keichii@iteration.net on Tue, Mar 20, 2001 at 12:23:50PM -0600
X-all-your-base: are belong to us.
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* Michael C . Wu [010320 10:27] wrote:
> On Tue, Mar 20, 2001 at 10:15:27AM -0800, Matt Dillon scribbled:
> |
> | :Another problem is that we have around 4000+ processes accessing
> | :lots of SHM at the same time..
> |
> | How big is 'lots'? If the shared memory segment is smallish, e.g.
> | less then 64MB, you should be ok. If it is larger then you will > | have to do some kernel tuning to avoid running out of pmap entries. > > This is exactly what happens to us sometimes. We run out of pmap entries. :) > But what can we tune? If this is a result of the shared memory, then my sysctl should fix it. Be aware, that it doesn't fix it on the fly! You must drop and recreate the shared memory segments. better to reboot actually and set the variable before any shm is allocated. -- -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:41:15 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 4D6B237B719; Tue, 20 Mar 2001 10:41:03 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KIcZP95379; Tue, 20 Mar 2001 10:38:35 -0800 (PST) (envelope-from dillon) Date: Tue, 20 Mar 2001 10:38:35 -0800 (PST) From: Matt Dillon Message-Id: <200103201838.f2KIcZP95379@earth.backplane.com> To: "Michael C . Wu" Cc: izero@ms26.hinet.net, cross@math.psu.edu, Alfred Perlstein , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <200103201815.f2KIFR594803@earth.backplane.com> <20010320122350.F52586@peorth.iteration.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :| How big is 'lots'? If the shared memory segment is smallish, e.g. :| less then 64MB, you should be ok. If it is larger then you will :| have to do some kernel tuning to avoid running out of pmap entries. 
: :This is exactly what happens to us sometimes. We run out of pmap entries. :) :But what can we tune? :-- :+-----------------------------------------------------------+ :| keichii@iteration.net | keichii@freebsd.org | What Alfred said: sysctl -w kern.ipc.shm_use_phys=1 (run prior to creating the initial shared memory segment, e.g. when the machine is booted). That should solve the pv entry problem. What Alfred said in regards to 'sysctl -a' crashing too... We'll fix it if you give us a traceback! The kernel config you are using would be useful. It sounds like there are a bunch of things you either need to tune or have already tuned in the kernel configuration. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:43: 4 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth.backplane.com [208.161.114.65]) by hub.freebsd.org (Postfix) with ESMTP id 20FA537B718; Tue, 20 Mar 2001 10:43:01 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KIFR594803; Tue, 20 Mar 2001 10:15:27 -0800 (PST) (envelope-from dillon) Date: Tue, 20 Mar 2001 10:15:27 -0800 (PST) From: Matt Dillon Message-Id: <200103201815.f2KIFR594803@earth.backplane.com> To: "Michael C . Wu" Cc: izero@ms26.hinet.net, cross@math.psu.edu, Alfred Perlstein , "Michael C . Wu" , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :Another problem is that we have around 4000+ processes accessing :lots of SHM at the same time.. How big is 'lots'? If the shared memory segment is smallish, e.g. 
less than 64MB, you should be ok. If it is larger then you will
have to do some kernel tuning to avoid running out of pmap entries.

-Matt

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 10:48:54 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 9C89B37B71C; Tue, 20 Mar 2001 10:48:46 -0800 (PST) (envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KImj995560; Tue, 20 Mar 2001 10:48:45 -0800 (PST) (envelope-from dillon)
Date: Tue, 20 Mar 2001 10:48:45 -0800 (PST)
From: Matt Dillon
Message-Id: <200103201848.f2KImj995560@earth.backplane.com>
To: "Michael C . Wu"
Cc: Alfred Perlstein , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <200103201750.f2KHopk94248@earth.backplane.com> <20010320122245.E52586@peorth.iteration.net>
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

:We have 'vmstat 5' available at http://zoo.ee.ntu.edu.tw/~keichii/
:Fresh hot vmstat 1 log at
:http://zoo.ee.ntu.edu.tw/~keichii/vmstat_1.log
:
:--
:+-----------------------------------------------------------+
:| keichii@iteration.net         | keichii@freebsd.org       |

Your vmstat output indicates:

* That you have plenty of cpu
* That you are not paging heavily (good!)

Ah. Your kernel config is in that directory too. Cool. Looks about
what I expected. The default VM_KMEM_SIZE_MAX is 200MB; I'm not sure
why you are reducing it to 192MB (but it wouldn't make much of a
difference, I guess). I usually don't increase 'maxusers' above 256
myself, but 512 should be fine. Everything else looks fine too.

The iostat output sheds more light on the disk activity. It doesn't
look all that bad. If your users are accessing a lot of different
files, it might be beneficial to mount the filesystems in question
with the 'noatime' option. This, coupled with softupdates, should
remove any need for MFS.

-Matt

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Mar 20 10:49:18 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from mail.enteract.com (mail.enteract.com [207.229.143.33]) by hub.freebsd.org (Postfix) with ESMTP id 8F07A37B71A; Tue, 20 Mar 2001 10:49:12 -0800 (PST) (envelope-from dscheidt@tumbolia.com)
Received: from shell-3.enteract.com (dscheidt@shell-3.enteract.com [207.229.143.42]) by mail.enteract.com (8.11.1/8.11.2) with ESMTP id f2KIEjG70592; Tue, 20 Mar 2001 12:14:45 -0600 (CST) (envelope-from dscheidt@tumbolia.com)
Date: Tue, 20 Mar 2001 12:14:45 -0600 (CST)
From: David Scheidt
X-Sender: dscheidt@shell-3.enteract.com
To: "Michael C . Wu"
Cc: Ted Faber , fs@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: tuning a VERY heavily (30.0) loaded server
In-Reply-To: <20010320120314.D52586@peorth.iteration.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, 20 Mar 2001, Michael C . Wu wrote:
:
:SWAP is never touched. :)

Your vmstat output shows page-out activity. I can't tell if it's to
swap or to file-backed memory, but it's happening. Do you know this
isn't happening when your box blows up?

:
:last pid: 23395; load averages: 2.08, 2.92, 3.60 up 0+01:29:58 02:03:27
:1529 processes:24 running, 1505 sleeping
:CPU states: 40.5% user, 0.0% nice, 46.4% system, 1.1% interrupt, 12.0% idle
:Mem: 705M Active, 1369M Inact, 332M Wired, 99M Cache, 265M Buf, 7504K Free
:Swap: 512M Total, 512M Free

You really, really should have at least as much swap as RAM, probably
closer to 2X.
A big spike in load can run you out of swap very quickly -- less than a minute.

-- dscheidt@tumbolia.com Bipedalism is only a fad.

From owner-freebsd-fs Tue Mar 20 10:50:20 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id B476C37B71A; Tue, 20 Mar 2001 10:50:15 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id D0F0E59283; Tue, 20 Mar 2001 12:49:38 -0600 (CST) Date: Tue, 20 Mar 2001 12:49:38 -0600 From: "Michael C . Wu" To: Matt Dillon Cc: "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, Alfred Perlstein , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver Message-ID: <20010320124938.H52586@peorth.iteration.net> Reply-To: "Michael C . Wu" References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <200103201815.f2KIFR594803@earth.backplane.com> <20010320122350.F52586@peorth.iteration.net> <200103201838.f2KIcZP95379@earth.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103201838.f2KIcZP95379@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Mar 20, 2001 at 10:38:35AM -0800 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

On Tue, Mar 20, 2001 at 10:38:35AM -0800, Matt Dillon scribbled:
|
| :| How big is 'lots'? If the shared memory segment is smallish, e.g.
| :| less than 64MB, you should be ok.
If it is larger then you will | :| have to do some kernel tuning to avoid running out of pmap entries. | : | :This is exactly what happens to us sometimes. We run out of pmap entries. :) | :But what can we tune? | sysctl -w kern.ipc.shm_use_phys=1 | | (run prior to creating the initial shared memory segment, e.g. when | the machine is booted). | | That should solve the pv entry problem. What Alfred said in regards | to 'sysctl -a' crashing too... We'll fix it if you give us a | traceback! Yes. I promised a trace for public and dump if you want. But there is no one at the NOC now. So it will have to be tomorrow night. We will take the box down at peak load to better help you guys. :) -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. | +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 10:53:45 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 8409437B71A; Tue, 20 Mar 2001 10:53:41 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KIquW95665; Tue, 20 Mar 2001 10:52:56 -0800 (PST) (envelope-from dillon) Date: Tue, 20 Mar 2001 10:52:56 -0800 (PST) From: Matt Dillon Message-Id: <200103201852.f2KIquW95665@earth.backplane.com> To: Alfred Perlstein Cc: "Michael C . Wu" , "Michael C . 
Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded scerver References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320120112.C52586@peorth.iteration.net> <200103201815.f2KIFR594803@earth.backplane.com> <20010320122350.F52586@peorth.iteration.net> <20010320103130.Y29888@fw.wintelcom.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

:If this is a result of the shared memory, then my sysctl should fix it.
:
:Be aware, that it doesn't fix it on the fly! You must drop and recreate
:the shared memory segments.
:
:better to reboot actually and set the variable before any shm is
:allocated.
:
:--
:-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

Let's see. Approximately 4MB shared across 4000 processes. That eats 1024 PTEs per process, or around 4 million pmap elements that would be saved. That's a lot of KVM that would be saved. I'll bet turning that option on will magically solve most of Michael's problems too (though I'd still get rid of the MFS filesystem).
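As a rough check of the arithmetic above, here is a small Python sketch. It assumes 4K pages (the usual i386 page size) and takes the per-mapping cost as a parameter; the 28-byte default is the figure quoted elsewhere in this thread, not something measured here, and the function name is made up for illustration.

```python
PAGE_SIZE = 4096  # assumed i386 page size, in bytes


def shm_mapping_cost(segment_bytes, nprocs, bytes_per_mapping=28):
    """Return (pages per process, total mappings, total bytes of KVM).

    Each process that maps the segment needs one mapping per page, so the
    total grows with segment size times process count.
    """
    pages = segment_bytes // PAGE_SIZE
    mappings = pages * nprocs
    return pages, mappings, mappings * bytes_per_mapping


# 4MB of shared memory attached by 4000 processes
pages, mappings, kvm = shm_mapping_cost(4 * 1024 * 1024, 4000)
print(pages, mappings, kvm // 1024)  # pages/process, total mappings, KB of KVM
```

With these inputs the sketch gives 1024 pages per process and a little over 4 million mappings, which is where the "around 4 million pmap elements" estimate comes from.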
-Matt

From owner-freebsd-fs Tue Mar 20 10:57: 3 2001 Delivered-To: freebsd-fs@freebsd.org Received: from genesis.tao.org.uk (genesis.tao.org.uk [212.135.162.62]) by hub.freebsd.org (Postfix) with ESMTP id 76A5637B71A for ; Tue, 20 Mar 2001 10:56:56 -0800 (PST) (envelope-from joe@tao.org.uk) Received: from tao.org.uk (genius.tao.org.uk [212.135.162.50]) by genesis.tao.org.uk (Postfix) with ESMTP id 53A3A4A24; Tue, 20 Mar 2001 18:56:55 +0000 (GMT) Received: by tao.org.uk (Postfix, from userid 100) id E1C343120; Tue, 20 Mar 2001 18:57:00 +0000 (GMT) Date: Tue, 20 Mar 2001 18:57:00 +0000 From: Josef Karthauser To: Fernando Schapachnik Cc: freebsd-fs@freebsd.org Subject: Re: growfs Message-ID: <20010320185658.C5954@tao.org.uk> References: <200103201754.OAA69131@ns1.via-net-works.net.ar> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="p2kqVDKq5asng8Dg" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103201754.OAA69131@ns1.via-net-works.net.ar>; from fpscha@ns1.via-net-works.net.ar on Tue, Mar 20, 2001 at 02:54:57PM -0300 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org --p2kqVDKq5asng8Dg Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Mar 20, 2001 at 02:54:57PM -0300, Fernando Schapachnik wrote:
> Hello,
> I was wondering how usable is the growfs implementation
> available in -CURRENT.
>
> Any chance of using it on -STABLE?

You should just be able to compile it on -stable and use it. It was tested quite thoroughly by the authors, but I've not seen any success/failure comments on any of the mailing lists since it was committed, so I don't know how much real world testing it has received. That's why it's not been MFC'd to stable yet.
Joe --p2kqVDKq5asng8Dg Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.4 (FreeBSD) Comment: For info see http://www.gnupg.org iEYEARECAAYFAjq3p/oACgkQXVIcjOaxUBYT2gCgmOM+oqI+voLW8gvLZdea77Ws nacAn3CZ4q2M7y3GWklbVROvKbLBgVSD =VkKV -----END PGP SIGNATURE----- --p2kqVDKq5asng8Dg-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 11: 4:30 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 1D7D737B718; Tue, 20 Mar 2001 11:04:25 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2KJ4GP95937; Tue, 20 Mar 2001 11:04:16 -0800 (PST) (envelope-from dillon) Date: Tue, 20 Mar 2001 11:04:16 -0800 (PST) From: Matt Dillon Message-Id: <200103201904.f2KJ4GP95937@earth.backplane.com> To: Ted Faber Cc: "Michael C . Wu" , fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded server References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <20010320094837.B1284@ted.isi.edu> <20010320120314.D52586@peorth.iteration.net> <20010320102156.C1284@ted.isi.edu> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :> SWAP is never touched. :) :> :> last pid: 23395; load averages: 2.08, 2.92, 3.60 up 0+01:29:58 02:03:27 :> 1529 processes:24 running, 1505 sleeping :> CPU states: 40.5% user, 0.0% nice, 46.4% system, 1.1% interrupt, 12.0% idle :> Mem: 705M Active, 1369M Inact, 332M Wired, 99M Cache, 265M Buf, 7504K Free :> Swap: 512M Total, 512M Free : :A couple other people have mentioned that this is your swap load when :the machine's quiet. 
MFS can exhaust your swap quickly, and if you
:scale these load numbers up by a factor of 10, I think you're going to
:touch swap. (Even here you're already down to 7M free mem.)

That is almost certainly what is occurring. Since swap is otherwise not being used much, I'm going to retract my '3G of swap' recommendation (though if you ever repartition your disks I would still do it). You don't need 3G of swap, the 512M is fine as long as you scrap MFS.

-Matt

From owner-freebsd-fs Tue Mar 20 18:25:46 2001 Delivered-To: freebsd-fs@freebsd.org Received: from ego.mind.net (ego.mind.net [206.99.66.9]) by hub.freebsd.org (Postfix) with ESMTP id 2AD9E37B71C; Tue, 20 Mar 2001 18:25:36 -0800 (PST) (envelope-from takhus@takhus.mind.net) Received: from takhus.dyn.mind.net (AFN-Dyn-2084622070.pc.ashlandfiber.net [208.46.220.70]) by ego.mind.net (8.9.3/8.9.3) with ESMTP id SAA19471; Tue, 20 Mar 2001 18:15:19 -0800 Received: from localhost (fleisher@localhost) by takhus.dyn.mind.net (8.11.3/8.11.3) with ESMTP id f2L2FJp18281; Tue, 20 Mar 2001 18:15:19 -0800 (PST) (envelope-from takhus@takhus.mind.net) X-Authentication-Warning: takhus.dyn.mind.net: fleisher owned process doing -bs Date: Tue, 20 Mar 2001 18:15:19 -0800 (PST) From: Tony Fleisher X-Sender: fleisher@takhus.dyn.mind.net To: Brett Glass Cc: Sergey Babkin , security@FreeBSD.ORG, fs@FreeBSD.ORG, arch@FreeBSD.ORG Subject: Re: about common group & user ID space (PR kern/14584) In-Reply-To: <4.3.2.7.2.20010320002008.00d12b50@localhost> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

On Tue, 20 Mar 2001, Brett Glass wrote: > At 06:15 PM 3/19/2001, Sergey Babkin wrote: > > >> > on (a) the number of groups of which a user can be a member and (b) the > > > >For this there is some macro (can't remember the name) which > >can be
defined in the kernel config file as an option with > >a higher value. Setting it higher means higher system overhead > >but since the memory size has increased significantly over > >the last few years, I think that a higher default value makes > >sense. > > I do too. Could you submit this as a patch? > > >I think there is such a limit. Or at least it was in the 2.0.5 days. > >I'm not sure about the line length limit. I remember that there > >was such a limit in SVR4.2, so if a group line grew past some size, > >getgrent() and friends went crazy. > > I believe that it was between 100 and 130 when it lost it. Don't > know if it was the number of characters or the number of users. > > [details about a workaround and adduser breakage removed] I believe that the limit on the length of a line in the group file was removed prior to 3.0-RELEASE. See revision 1.14 of src/lib/libc/gen/getgrent.c by wosch. http://www.FreeBSD.org/cgi/cvsweb.cgi/src/lib/libc/gen/getgrent.c Regards, Tony. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Mar 20 23:12: 1 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp05.primenet.com (smtp05.primenet.com [206.165.6.135]) by hub.freebsd.org (Postfix) with ESMTP id 9BBBB37B71C for ; Tue, 20 Mar 2001 23:11:57 -0800 (PST) (envelope-from tlambert@usr05.primenet.com) Received: (from daemon@localhost) by smtp05.primenet.com (8.9.3/8.9.3) id AAA07811; Wed, 21 Mar 2001 00:06:16 -0700 (MST) Received: from usr05.primenet.com(206.165.6.205) via SMTP by smtp05.primenet.com, id smtpdAAAhiainp; Wed Mar 21 00:06:05 2001 Received: (from tlambert@localhost) by usr05.primenet.com (8.8.5/8.8.5) id AAA22512; Wed, 21 Mar 2001 00:11:41 -0700 (MST) From: Terry Lambert Message-Id: <200103210711.AAA22512@usr05.primenet.com> Subject: Re: growfs To: fschapachnik@vianetworks.com.ar Date: Wed, 21 Mar 2001 07:11:41 +0000 (GMT) Cc: freebsd-fs@FreeBSD.ORG In-Reply-To: 
<200103201754.OAA69131@ns1.via-net-works.net.ar> from "Fernando Schapachnik" at Mar 20, 2001 02:54:57 PM X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

> I was wondering how usable is the growfs implementation
> available in -CURRENT.
>
> Any chance of using it on -STABLE?

It will work as well as it works pretty much everywhere, going back a long time. I believe the first version from "Der Mouse" ran on FreeBSD 1.1.5.

The problem with this is that you will get fragmentation; consider the following two cases: the first is a disk of size "10"; the second is a disk of size "6" that has been "grown" to size "10". The "*" are allocated blocks, and the "." are unallocated blocks; "o" are blocks that would have been allocated in the new space, if it had been there at the time they were allocated (but it wasn't):

    .*.*..*.*.***..*.**.  01  .*o*oo*.*o***.o*o**.
    *..**.*.**.*****.*..  02  *.o**o*o**.*****o*oo
    **.**.***...*.*.***.  03  **o**o***.oo*o*o***.
    .**..*.*.**.**.**.**  04  .**oo*o*o**o**o**.**
    *...***..****.*.*..*  05  *.o.***.o****o*o*oo*
    ..**..**..****.*.**.  06  oo**o.**oo****o*.**o
    .*..***..**.*.*.*.*.  07  ....................
    ***.*..*..*..*..**..  08  ....................
    ..*.*.**..**.***..*.  09  ....................
    .*..***.*..*..*..*..  10  ....................

You see, blocks are allocated by picking empty space in a random cylinder, and at less than an 85% fill, there is effectively a 99.98% probability of not getting a collision and having to retry (the allocation policy is effectively a hash, and that's why FFS filesystems do not fragment in the first place, unless they are overfilled or someone foolishly reduces the free reserve space -- see Knuth, "The Art of Computer Programming", Vol. 3, "Sorting and Searching").
So effectively, by growing the disk, you make FFS need a defragmenter, since the probability of an allocation being tried in line "08" is the same as it being tried in line "03". So it's OK for an administrator who is willing to take a (perhaps significant) performance hit, in trade for not having to do a backup and restore, but it's pretty useless for the general case.

If you decide to write an FFS defragmenter over this (it's possible to write one, but until you change the size of an FS, it's a pretty useless piece of software), then you should build it so that it can migrate data out of either the start or end of an FFS, by taking one or both of those areas as not being locations where data will be relocated to during defragmentation. Doing this would let you shrink FFS partitions, as well, by defragging the data out of the area which you are going to make "go away".

In general, doing this to the end of a disk is significantly easier than the start of the disk, since you will have a difficult time relocating superblock and other information not specifically associated with a cylinder group (e.g. boot code, disklabels, etc.). A better solution for doing that would be to simply shrink it from the end, and then use a non-destructive (end byte first) data copy to relocate the entire filesystem further down on the disk, leaving the relative block offsets intact (rewriting would require you to renumber the block offsets after moving the start after you clear out the front of the disk).

Much of this goes to hell during a power outage; migrating the data safely is a much harder problem; in doing that, you would copy the data first, then modify the metadata to point to the copy instead of the original, so that in case of a power failure, metadata pointing to one instead of the other would still be valid, and you could pick up where you left off. The same goes for metadata, as well, but it's just an additional level of complication.
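The end-byte-first copy mentioned above can be sketched in a few lines of Python. This is only an illustration on an in-memory bytearray standing in for the disk; the function name and the toy data are made up, and it does nothing about the power-failure problem. The point is that when the destination overlaps the source at a higher offset, copying the last byte first never overwrites data that has not been moved yet.

```python
def relocate_up(disk, src, length, dst):
    """Move disk[src:src+length] to disk[dst:dst+length] in place.

    Assumes dst >= src; the two ranges may overlap. Copying from the
    last byte down to the first means every byte is read before the
    copy can overwrite it.
    """
    if dst < src:
        raise ValueError("this sketch only moves data toward higher offsets")
    for i in range(length - 1, -1, -1):
        disk[dst + i] = disk[src + i]


# A 10-byte "disk" whose first 6 bytes hold the filesystem.
disk = bytearray(b"ABCDEF....")
relocate_up(disk, 0, 6, 4)
print(disk)  # → bytearray(b'ABCDABCDEF')
```

After the move the filesystem image "ABCDEF" sits intact at offset 4, and the first four bytes are stale leftovers that can be reused. A front-to-back copy over the same overlapping ranges would have clobbered "EF" before reading them.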
Personally, I've always found at this point that people lose interest, and decide it's easier to back up their FS data, repartition the disk, and restore from the backup...

Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers.

From owner-freebsd-fs Wed Mar 21 0: 6:30 2001 Delivered-To: freebsd-fs@freebsd.org Received: from eozoon.coleman.org (adsl-209-233-238-136.dsl.snfc21.pacbell.net [209.233.238.136]) by hub.freebsd.org (Postfix) with ESMTP id C12A337B71B for ; Wed, 21 Mar 2001 00:06:27 -0800 (PST) (envelope-from don@eozoon.coleman.org) Received: from eozoon.coleman.org (eozoon.coleman.org [127.0.0.1] (may be forged)) by eozoon.coleman.org (8.9.3/8.9.3) with ESMTP id AAA09667; Wed, 21 Mar 2001 00:06:03 -0800 (PST) Message-Id: <200103210806.AAA09667@eozoon.coleman.org> X-Mailer: exmh version 2.3.1 01/18/2001 with nmh-1.0.4 To: Terry Lambert Reply-To: don@coleman.org Cc: fschapachnik@vianetworks.com.ar, freebsd-fs@FreeBSD.ORG Subject: Re: growfs In-reply-to: Your message of "Wed, 21 Mar 2001 07:11:41 GMT." <200103210711.AAA22512@usr05.primenet.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 21 Mar 2001 00:06:02 -0800 From: Don Coleman Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

Terry, I don't think the picture is quite as bad as you paint it. The clustering code of FFS will defragment files automatically as they grow. While it is true that a highly fragmented filesystem will not be magically fixed by growing it, any new files will be written out as large files as they get large.
don

From owner-freebsd-fs Wed Mar 21 0:56:29 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132]) by hub.freebsd.org (Postfix) with ESMTP id C569237B71F for ; Wed, 21 Mar 2001 00:56:25 -0800 (PST) (envelope-from tlambert@usr05.primenet.com) Received: (from daemon@localhost) by smtp02.primenet.com (8.9.3/8.9.3) id BAA00744; Wed, 21 Mar 2001 01:49:39 -0700 (MST) Received: from usr05.primenet.com(206.165.6.205) via SMTP by smtp02.primenet.com, id smtpdAAApBa4wb; Wed Mar 21 01:49:28 2001 Received: (from tlambert@localhost) by usr05.primenet.com (8.8.5/8.8.5) id BAA24089; Wed, 21 Mar 2001 01:56:10 -0700 (MST) From: Terry Lambert Message-Id: <200103210856.BAA24089@usr05.primenet.com> Subject: Re: growfs To: don@coleman.org Date: Wed, 21 Mar 2001 08:55:48 +0000 (GMT) Cc: tlambert@primenet.com (Terry Lambert), fschapachnik@vianetworks.com.ar, freebsd-fs@FreeBSD.ORG In-Reply-To: <200103210806.AAA09667@eozoon.coleman.org> from "Don Coleman" at Mar 21, 2001 12:06:02 AM X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

> I don't think the picture is quite as bad as you paint it.
>
> The clustering code of FFS will defragment files automatically as
> they grow. While it is true that a highly fragmented filesystem
> will not be magically fixed by growing it, any new files will
> be written out as large files as they get large.

I think that rather misses the most probable time for someone to actually run the command: when they need more disk space, because their disks are full.

Also realize that the free reserve has been eroded over time as disks get larger; ideally, it would be 15% (1G on a 6G drive); the current tradeoff for "more space" vs. "better efficiency" is 8% (1G on a 12.5G drive). I don't think if someone has a 37G drive today (say it's dedicated to a Vinum plex, so it can be made bigger if we want, or say it's part of a larger RAID array), that they will think of running "growfs" on the thing when it has "only 3G free". With some of the 75G disks out there today, that becomes "only 6G free".

People are used to thinking of the free reserve as "a hell of a lot of wasted space"; mostly because they aren't computer scientists, and simply don't understand hash fill algorithms or the reason for the free reserve. Even if they understand it intellectually, there are many computer scientists who grew up on systems where main memory was 4k, or even with the first PC, where that free reserve is equivalent to 600 times the size of the largest available hard drive for the original IBM PC XT. Intellectually, they may know the math, but their gut still tells them "that's a hell of a lot of wasted space".

My gut reaction, which I have to fight, is to tweak the free reserve down to 6%, and get another 600M/1.2G of disk space. I know that if I did this, I'd be able to rationalize it as a temporary stopgap that I will fix correctly by deleting and/or compressing junk later (yeah, right), or by doing The Right Thing and adding more disk, and then using backup/restore to defragment things. I know that if I did this, I would be trying to pull one over on myself. So I don't do it.

Finally, say you are right: assume that we are talking about files which grow over time, instead of just talking about the normal disks you'd see at any ISP or commercial or educational environment, where the only things that grow over time are log files, directories, and email folders (if they happen to be stored in mail spool, rather than half a dozen other formats). Even with that, we take an 80% full disk, and we "growfs" it to twice its previous size. There is a 50/50 chance that a new allocation will be on the new region vs. the old.
This means that for a disk size K with a new aggregate size 2*K, that it takes .5 * (2*K) more data to hit the 85% hash limit, and .12 * (2*K) more data to hit the 92% fill limit of the current eroded newfs free reserve. In other words, the disk is 80% full, and you double the space so that it's conceptually 40% full, but then you only get to add 24% (of the original) or 12% (of the new) more data before the original disk is 92% full. 24% - 7% = 18%... in other words, serious fragmentation based thrashing becomes a problem when there is 80%+18% = 98% of the original disk worth of data, as opposed to 92% of the original disk worth of data (8% of the original disk free being the "worst acceptable allowable tradeoff" for newfs as it sits today). A backup and restore (or intentional -- non-side effect -- defragmentation) will drop both disks in the combined plex to 42%, with another 50% of the available space (i.e. "one whole disk") until it becomes a problem. And if *I* can come close to being able to rationalize doing this anyway to allow me to procrastinate... Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. 
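The headroom figure in the message above (24% of the original disk, or 12% of the doubled volume) can be reproduced with a short Python sketch. It assumes, as the message does, that the original region starts 80% full, that the acceptable ceiling is 92%, and that exactly half of all new data lands in the original region; the function name is made up for illustration.

```python
def data_until_limit(fill, limit, share_to_old=0.5):
    """Amount of new data, as a fraction of the original disk size K,
    that can be written before the original region reaches `limit`,
    given that `share_to_old` of all new data is allocated there."""
    return (limit - fill) / share_to_old


added = data_until_limit(0.80, 0.92)
print(round(added, 2))      # as a fraction of the original disk
print(round(added / 2, 2))  # the same amount as a fraction of the doubled volume
```

Changing `share_to_old` shows how sensitive the result is to the 50/50 assumption: the more new allocations land in the old region, the sooner it hits the ceiling.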
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 3:14:23 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peter3.wemm.org (c1315225-a.plstn1.sfba.home.com [65.0.135.147]) by hub.freebsd.org (Postfix) with ESMTP id 67FE137B72E; Wed, 21 Mar 2001 03:14:14 -0800 (PST) (envelope-from peter@netplex.com.au) Received: from mobile.wemm.org (mobile.wemm.org [10.0.0.5]) by peter3.wemm.org (8.11.0/8.11.0) with ESMTP id f2LBEEp89135; Wed, 21 Mar 2001 03:14:14 -0800 (PST) (envelope-from peter@netplex.com.au) Received: from netplex.com.au (localhost [127.0.0.1]) by mobile.wemm.org (8.11.1/8.11.1) with ESMTP id f2LBE0h57371; Wed, 21 Mar 2001 03:14:04 -0800 (PST) (envelope-from peter@netplex.com.au) Message-Id: <200103211114.f2LBE0h57371@mobile.wemm.org> X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4 To: Matt Dillon Cc: Alfred Perlstein , "Michael C . Wu" , "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver In-Reply-To: <200103201852.f2KIquW95665@earth.backplane.com> Date: Wed, 21 Mar 2001 03:13:59 -0800 From: Peter Wemm Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Matt Dillon wrote: > :If this is a result of the shared memory, then my sysctl should fix it. > : > :Be aware, that it doesn't fix it on the fly! You must drop and recreate > :the shared memory segments. > : > :better to reboot actually and set the variable before any shm is > :allocated. > : > :-- > :-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] > > Lets see. Approximately 4MB shared across 4000 processes. That eats > 1024 pte's per process, or around 4 million pmap elements that would be > saved. That's a lot of KVM that would be saved. Also, 4MB = 1024 pages, at 28 bytes per mapping == 28k per process. 
28k * 4000 processes = 112000k of kvm, i.e. about 110MB of kvm. I bet you'll find that you are right on the limit, and you are seeing lots of page unwiring by the page daemon to try and stop it running out of pv entries. Do you see messages on the console saying that pmap_collect got activated? Do a sysctl vm.zone (or vmstat -z) and see the entry for 'PV ENTRY'. vmstat -z is different to vm.zone on 4.x, but somebody broke it on -current so that you cannot use it on crashdumps anymore. :-( GRRRRR!

You can increase PMAP_SHPGPERPROC (kernel compile option). Or use the sysctl to use physical memory backed shm mappings, which do not consume pv_entries for each page per process.

Cheers, -Peter -- Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au "All of this is for nothing if we don't go to the stars" - JMS/B5

From owner-freebsd-fs Wed Mar 21 4:52:14 2001 Delivered-To: freebsd-fs@freebsd.org Received: from ns1.via-net-works.net.ar (ns1.via-net-works.net.ar [200.10.100.10]) by hub.freebsd.org (Postfix) with ESMTP id 7865637B73B for ; Wed, 21 Mar 2001 04:52:11 -0800 (PST) (envelope-from fpscha@ns1.via-net-works.net.ar) Received: (from fpscha@localhost) by ns1.via-net-works.net.ar (8.9.3/8.9.3) id JAA94399; Wed, 21 Mar 2001 09:52:52 -0300 (ART) From: Fernando Schapachnik Message-Id: <200103211252.JAA94399@ns1.via-net-works.net.ar> Subject: Re: growfs In-Reply-To: <200103210711.AAA22512@usr05.primenet.com> "from Terry Lambert at Mar 21, 2001 07:11:41 am" To: Terry Lambert Date: Wed, 21 Mar 2001 09:52:52 -0300 (ART) Cc: fschapachnik@vianetworks.com.ar, freebsd-fs@FreeBSD.ORG Reply-To: Fernando Schapachnik X-Mailer: ELM [version 2.4ME+ PL82 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=ISO-8859-1 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

In a previous message, Terry Lambert wrote:
> > I was wondering how usable is the growfs implementation
> > available in -CURRENT.
> >
> > Any chance of using it on -STABLE?
>
> It will work as well as it works pretty much everywhere, going
> back a long time. I believe the first version from "Der Mouse"
> ran on FreeBSD 1.1.5.

Any place it can be downloaded? (I mean: is it just userland and I grab it from -CURRENT, or are there kernel/UFS code patches also?)

>
> The problem with this is that you will get fragmentation;
> consider the following two cases; the first is a disk of
> size "10"; the second is a disk of size "6" that has been
> "grown" to size "10". The "*" are allocated blocks, and
> the "." are unallocated blocks; "o" are blocks that would
> have been allocated in the new space, if it had been there
> at the time the were allocated (but it wasn't):

Thanks for the extensive explanation! Actually, this is good enough for what I need: I'm planning to offer backup space to customers. So I will start with an x Gb HD (vinum'ed, of course). When I have sold the whole of it, or am about to, I will add a new one. The trick is that there are very high chances that when I've sold x Gb of backup space, the real used space is y, with y << x, so basically I'm adding disk space when there is still a lot of space available. Will that suffice, or am I mistaken somewhere? Thanks!

Fernando P. Schapachnik Network Administration VIA NET.WORKS ARGENTINA S.A.
fschapachnik@vianetworks.com.ar To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 9:51: 6 2001 Delivered-To: freebsd-fs@freebsd.org Received: from postfix.conectiva.com.br (perninha.conectiva.com.br [200.250.58.156]) by hub.freebsd.org (Postfix) with ESMTP id 3384537B741 for ; Wed, 21 Mar 2001 09:50:57 -0800 (PST) (envelope-from riel@conectiva.com.br) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by postfix.conectiva.com.br (Postfix) with SMTP id C7CE016E24 for ; Wed, 21 Mar 2001 14:50:51 -0300 (EST) Received: (qmail 30402 invoked by uid 0); 21 Mar 2001 17:50:13 -0000 Received: from dial10.ras.conectiva (HELO imladris.rielhome.conectiva) (root@10.0.8.10) by burns.conectiva with SMTP; 21 Mar 2001 17:50:13 -0000 Received: from localhost (IDENT:riel@localhost [127.0.0.1]) by imladris.rielhome.conectiva (8.11.2/8.11.2) with ESMTP id f2LGVjh10858; Wed, 21 Mar 2001 13:31:45 -0300 Date: Wed, 21 Mar 2001 13:31:45 -0300 (BRST) From: Rik van Riel X-Sender: riel@imladris.rielhome.conectiva To: Peter Wemm Cc: Matt Dillon , Alfred Perlstein , "Michael C . Wu" , "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver In-Reply-To: <200103211114.f2LBE0h57371@mobile.wemm.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Wed, 21 Mar 2001, Peter Wemm wrote: > Also, 4MB = 1024 pages, at 28 bytes per mapping == 28k per process. 28 bytes/mapping is a LOT. I've implemented an (admittedly not completely architecture-independent) reverse mapping patch for Linux with an overhead of 8 bytes/pte... I wonder how hard/easy would it be to reduce the memory overhead of some of these old Mach data structures in FreeBSD... 
regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com.br/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 10: 4: 9 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 1E87C37B721; Wed, 21 Mar 2001 09:58:38 -0800 (PST) (envelope-from bright@fw.wintelcom.net) Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2LHucH27120; Wed, 21 Mar 2001 09:56:38 -0800 (PST) Date: Wed, 21 Mar 2001 09:56:38 -0800 From: Alfred Perlstein To: Rik van Riel Cc: Peter Wemm , Matt Dillon , "Michael C . Wu" , "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver Message-ID: <20010321095638.H12319@fw.wintelcom.net> References: <200103211114.f2LBE0h57371@mobile.wemm.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from riel@conectiva.com.br on Wed, Mar 21, 2001 at 01:31:45PM -0300 X-all-your-base: are belong to us. Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org * Rik van Riel [010321 09:51] wrote: > On Wed, 21 Mar 2001, Peter Wemm wrote: > > > Also, 4MB = 1024 pages, at 28 bytes per mapping == 28k per process. > > 28 bytes/mapping is a LOT. I've implemented an (admittedly > not completely architecture-independent) reverse mapping > patch for Linux with an overhead of 8 bytes/pte... > > I wonder how hard/easy would it be to reduce the memory > overhead of some of these old Mach data structures in FreeBSD... "Our" Alan Cox and Tor Egge have been trimming these structs down for quite some time. 
Perhaps they should look at Linux's system, however last I checked Linux's was an order of magnitude less complex which might prohibit that simplification in FreeBSD. If you have suggestions, let's hear them. :) -- -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 11:33:51 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth.backplane.com [208.161.114.65]) by hub.freebsd.org (Postfix) with ESMTP id AA1DC37B7FA; Wed, 21 Mar 2001 11:33:29 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2LJ7cp17933; Wed, 21 Mar 2001 11:07:38 -0800 (PST) (envelope-from dillon) Date: Wed, 21 Mar 2001 11:07:38 -0800 (PST) From: Matt Dillon Message-Id: <200103211907.f2LJ7cp17933@earth.backplane.com> To: Alfred Perlstein Cc: "Michael C . Wu" , Rik van Riel , Peter Wemm , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> <200103211817.f2LIHR416007@earth.backplane.com> <20010321102836.N12319@fw.wintelcom.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :Hey, talking about large amounts of swap, did you know that: : 4.2-STABLE FreeBSD 4.2-STABLE #1: Sat Feb 10 01:26:41 PST 2001 :has a max swap limit that's possibly 'low': : : b: 15912412 0 swap # (Cyl. 0 - 990*) : c: 17912412 0 unused 0 0 # (Cyl. 0 - 1114*) : :If I made b == c, then i couldn't swapon it. : :Don't ask why I have that much swap, I just needed a bunch on a dedicated :disk. :) : :-- :-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] You would have to reconfigure your kernel to reduce NSWAP from 4 to either 2 or 1. 
Bitmap overhead for swap is 2 bits per page (4K) of swap. There is a maximum of 2G / 16 / NSWAPDEV blocks (512 byte chunks) of swap per swap device. If NSWAPDEV is 4, this comes to (2G / 16 / 4) blocks x 512 bytes = 16 GiB, or about 17GB. So you would be able to create approximately four 17GB swap partitions. If you reduce NSWAPDEV to 2 you would be able to create approximately two 34GB swap partitions. If you reduce NSWAPDEV to 1 you would be able to create approximately one 68GB swap partition. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 11:34: 4 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth.backplane.com [208.161.114.65]) by hub.freebsd.org (Postfix) with ESMTP id F20BB37B765; Wed, 21 Mar 2001 11:33:29 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2LIHR416007; Wed, 21 Mar 2001 10:17:27 -0800 (PST) (envelope-from dillon) Date: Wed, 21 Mar 2001 10:17:27 -0800 (PST) From: Matt Dillon Message-Id: <200103211817.f2LIHR416007@earth.backplane.com> To: "Michael C . Wu" Cc: Rik van Riel , Peter Wemm , Alfred Perlstein , "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :B) Added 3gb of swap on one drive, 1gb of swap on a raid volume : another 1gb swap on another raid volume :C) enabled vfs.vmiodirenable and kern.ipc.shm_use_phys : :-- :+-----------------------------------------------------------+ :| keichii@iteration.net | keichii@freebsd.org | I'd reduce that 3gb on that one drive to 1gb. The kernel allocates a bitmap for 4 * (largest_swap_partition), i.e.
it will allocate a bitmap for 3gb x 4 = 12 gb worth of swap, even though you only have 5. If you reduce the 3gb to 1gb, then the kernel will allocate a bitmap for 1gb x 4 = 4gb worth of swap, using 1/3 the memory for the bitmap. Each page of swap eats 2 bits of memory for the bitmap so we aren't talking about a huge amount of memory, but it's worth doing. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 11:34:31 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth.backplane.com [208.161.114.65]) by hub.freebsd.org (Postfix) with ESMTP id 547CF37B718; Wed, 21 Mar 2001 11:34:24 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2LIeYA16476; Wed, 21 Mar 2001 10:40:34 -0800 (PST) (envelope-from dillon) Date: Wed, 21 Mar 2001 10:40:34 -0800 (PST) From: Matt Dillon Message-Id: <200103211840.f2LIeYA16476@earth.backplane.com> To: Alfred Perlstein Cc: Rik van Riel , Peter Wemm , "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321095638.H12319@fw.wintelcom.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :* Rik van Riel [010321 09:51] wrote: :> On Wed, 21 Mar 2001, Peter Wemm wrote: :> :> > Also, 4MB = 1024 pages, at 28 bytes per mapping == 28k per process. :> :> 28 bytes/mapping is a LOT. I've implemented an (admittedly :> not completely architecture-independent) reverse mapping :> patch for Linux with an overhead of 8 bytes/pte... :> :> I wonder how hard/easy would it be to reduce the memory :> overhead of some of these old Mach data structures in FreeBSD... : :"Our" Alan Cox and Tor Egge have been trimming :these structs down for quite some time. 
Perhaps they should
:look at Linux's system, however last I checked Linux's was
:an order of magnitude less complex which might prohibit that
:simplification in FreeBSD.
:
:If you have suggestions, let's hear them. :)
:
:--
:-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

We've looked at those structures quite a bit. DG and I talked about it a year or two ago but we came to the conclusion that the extra linkages in our pv_entry gave us significant performance benefits during rundowns. Since then Tor has done a lot of cleanup, but I don't think the analysis has changed much.

typedef struct pv_entry {
        pmap_t          pv_pmap;        /* pmap where mapping lies */
        vm_offset_t     pv_va;          /* virtual address for mapping */
        TAILQ_ENTRY(pv_entry)   pv_list;
        TAILQ_ENTRY(pv_entry)   pv_plist;
        vm_page_t       pv_ptem;        /* VM page for pte */
} *pv_entry_t;

pv_pmap    The pmap associated with the pv_entry.

pv_va      The virtual address of the pv_entry in the pmap. Used to quickly track down the pv_entry associated with a (pmap, vm_page_t, va) when iterating a pv_list or pv_plist.

pv_list    pv_entry's associated with a vm_page_t.
pv_plist   pv_entry's associated with a pmap.

           A pv_entry can be located either through pv_list or through pv_plist. The kernel chooses which list to iterate through to find a pv_entry based on which of the two lists has fewer elements. One of the two nodes could be removed from the pv_entry structure (saving 8 bytes), but at the cost of performance for certain cases. If you have a huge number of processes sharing a page of memory, iterating through pv_plist to locate a mapping is usually more efficient. If you have fewer processes but full mappings (e.g. the page table page is full), then iterating through pv_list is more efficient.

pv_ptem    The vm_page_t associated with a pv_entry. This field is used to quickly find associated vm_page_t's when we are wiping whole page tables (e.g. on process exit).
It could be removed, but at significant cost to process exits and munmap()'s of large areas. Theoretically we can remove half the structure, but at a significant cost in performance. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 11:36:25 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 4CB4C37B71B; Wed, 21 Mar 2001 11:36:12 -0800 (PST) (envelope-from bright@fw.wintelcom.net) Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2LISbL28025; Wed, 21 Mar 2001 10:28:37 -0800 (PST) Date: Wed, 21 Mar 2001 10:28:37 -0800 From: Alfred Perlstein To: Matt Dillon Cc: "Michael C . Wu" , Rik van Riel , Peter Wemm , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.org, fs@FreeBSD.org, hackers@FreeBSD.org Subject: Re: tuning a VERY heavily (30.0) loaded s cerver Message-ID: <20010321102836.N12319@fw.wintelcom.net> References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> <200103211817.f2LIHR416007@earth.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103211817.f2LIHR416007@earth.backplane.com>; from dillon@earth.backplane.com on Wed, Mar 21, 2001 at 10:17:27AM -0800 X-all-your-base: are belong to us. Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org * Matt Dillon [010321 10:20] wrote: > > :B) Added 3gb of swap on one drive, 1gb of swap on a raid volume > : another 1gb swap on another raid volume > :C) enabled vfs.vmiodirenable and kern.ipc.shm_use_phys > : > :-- > :+-----------------------------------------------------------+ > :| keichii@iteration.net | keichii@freebsd.org | > > I'd reduce that 3gb on that one drive to 1gb. The kernel > allocates a bitmap for 4 * (largest_swap_partition), i.e. 
> it will allocate a bitmap for 3gb x 4 = 12 gb worth of swap, > even though you only have 5. If you reduce the 3gb to 1gb, then > the kernel will allocate a bitmap for 1gb x 4 = 4gb worth of swap, > using 1/3 the memory for the bitmap. Each page of swap eats 2 bits of > memory for the bitmap so we aren't talking about a huge > amount of memory, but it's worth doing. Hey, talking about large amounts of swap, did you know that: 4.2-STABLE FreeBSD 4.2-STABLE #1: Sat Feb 10 01:26:41 PST 2001 has a max swap limit that's possibly 'low': b: 15912412 0 swap # (Cyl. 0 - 990*) c: 17912412 0 unused 0 0 # (Cyl. 0 - 1114*) If I made b == c, then i couldn't swapon it. Don't ask why I have that much swap, I just needed a bunch on a dedicated disk. :) -- -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 11:57:12 2001 Delivered-To: freebsd-fs@freebsd.org Received: from postfix.conectiva.com.br (perninha.conectiva.com.br [200.250.58.156]) by hub.freebsd.org (Postfix) with ESMTP id 1B10637B730 for ; Wed, 21 Mar 2001 11:56:57 -0800 (PST) (envelope-from riel@conectiva.com.br) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by postfix.conectiva.com.br (Postfix) with SMTP id 4385C16E3F for ; Wed, 21 Mar 2001 16:25:29 -0300 (EST) Received: (qmail 1026 invoked by uid 0); 21 Mar 2001 19:24:50 -0000 Received: from dial11.ras.conectiva (HELO imladris.rielhome.conectiva) (root@10.0.8.11) by burns.conectiva with SMTP; 21 Mar 2001 19:24:50 -0000 Received: from localhost (IDENT:riel@localhost [127.0.0.1]) by imladris.rielhome.conectiva (8.11.2/8.11.2) with ESMTP id f2LJEXh17515; Wed, 21 Mar 2001 16:14:33 -0300 Date: Wed, 21 Mar 2001 16:14:32 -0300 (BRST) From: Rik van Riel X-Sender: riel@imladris.rielhome.conectiva To: Matt Dillon Cc: Alfred Perlstein , Peter Wemm , "Michael C . 
Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver In-Reply-To: <200103211840.f2LIeYA16476@earth.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org

On Wed, 21 Mar 2001, Matt Dillon wrote:

> We've looked at those structures quite a bit. DG and I talked about
> it a year or two ago but we came to the conclusion that the extra
> linkages in our pv_entry gave us significant performance benefits
> during rundowns. Since then Tor has done a lot of cleanup, but
> I don't think the analysis has changed much.
>
> typedef struct pv_entry {
>         pmap_t          pv_pmap;        /* pmap where mapping lies */
>         vm_offset_t     pv_va;          /* virtual address for mapping */
>         TAILQ_ENTRY(pv_entry)   pv_list;
>         TAILQ_ENTRY(pv_entry)   pv_plist;
>         vm_page_t       pv_ptem;        /* VM page for pte */
> } *pv_entry_t;

The (maybe too lightweight) structure I have in my patch looks like this:

struct pte_chain {
        struct pte_chain * next;
        pte_t * ptep;
};

Each pte_chain hangs off a page of physical memory and the ptep is a pointer to a page table entry. The page struct of the page table page itself is used to note down which address space and offset we have.

This means that FreeBSD's pv_pmap, pv_va and pv_ptem are in the page table page and NOT in each pte_chain structure...

The only issue is address space rundowns, but this _could_ be ok due to the fact that systems usually seem to have more short-running processes than long-running ones and the tasks that are short-running will have their pte_chain nearer to the beginning of the list.
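To make the pte_chain idea concrete, here is a minimal user-space sketch of such a chain. It is not code from the patch: the `pte_t` typedef, the stripped-down `struct page`, and the helper names `rmap_add`, `rmap_remove` and `rmap_count` are all hypothetical, and inserting new entries at the head of the chain is an assumption made to match the remark that short-running tasks end up near the beginning of the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: a stand-in for the architecture's page table entry type. */
typedef unsigned long pte_t;

/* The structure quoted in the message. */
struct pte_chain {
	struct pte_chain *next;
	pte_t *ptep;
};

/* Hypothetical, stripped-down physical page descriptor. */
struct page {
	struct pte_chain *chain;	/* every pte that maps this page */
};

/* Record that *ptep maps pg. New entries go at the head (assumption). */
int rmap_add(struct page *pg, pte_t *ptep)
{
	struct pte_chain *pc;

	assert(pg != NULL && ptep != NULL);
	pc = malloc(sizeof(*pc));
	if (pc == NULL)
		return -1;
	pc->ptep = ptep;
	pc->next = pg->chain;
	pg->chain = pc;
	return 0;
}

/*
 * Forget the mapping through ptep. Returns how many chain entries were
 * visited before it was found, or -1 if ptep does not map this page.
 */
int rmap_remove(struct page *pg, pte_t *ptep)
{
	struct pte_chain **pp;
	int visited = 0;

	for (pp = &pg->chain; *pp != NULL; pp = &(*pp)->next) {
		visited++;
		if ((*pp)->ptep == ptep) {
			struct pte_chain *dead = *pp;

			*pp = dead->next;
			free(dead);
			return visited;
		}
	}
	return -1;
}

/* Number of ptes currently mapping pg. */
size_t rmap_count(const struct page *pg)
{
	const struct pte_chain *pc;
	size_t n = 0;

	for (pc = pg->chain; pc != NULL; pc = pc->next)
		n++;
	return n;
}
```

With head insertion, the most recently added mapping is found after visiting a single entry, while an old mapping on a heavily shared page costs a walk over everything added after it. That is the rundown cost being weighed against the 8 bytes per mapping.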
Finding all pages in an address_space is simply done by walking the populated parts of the page tables, this is cache friendly and relatively fast for everything except really huge sparse mappings (but in that case, the finer grained locking makes sure this penalty gets restricted to the exiting task only and doesn't block the rest of the system). The whole patch is available at: http://www.surriel.com/patches/2.4/2.4.1-pmap-swapsonuml I have some newer code available, but haven't bothered coding up a new patch yet since the reverse mapping is experimental stuff and we're still busy finetuning and debugging 2.4 ;) regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com.br/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 12: 7:11 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id A1B4437B735; Wed, 21 Mar 2001 12:07:03 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id B2FB05928C; Wed, 21 Mar 2001 12:06:20 -0600 (CST) Date: Wed, 21 Mar 2001 12:06:20 -0600 From: "Michael C . Wu" To: Rik van Riel Cc: Peter Wemm , Matt Dillon , Alfred Perlstein , "Michael C . Wu" , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver Message-ID: <20010321120620.A932@peorth.iteration.net> Reply-To: "Michael C . 
Wu" References: <200103211114.f2LBE0h57371@mobile.wemm.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from riel@conectiva.com.br on Wed, Mar 21, 2001 at 01:31:45PM -0300 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Wed, Mar 21, 2001 at 01:31:45PM -0300, Rik van Riel scribbled: | On Wed, 21 Mar 2001, Peter Wemm wrote: For those interested in this system: I have put up the kernel profiles at http://zoo.ee.ntu.edu.tw/~keichii/kernel_profiles/ This is kgmon -rb ;sleep 30;kgmon -hp ran every minute on the system by cron. i will let this run for more than 72 hours to get a feel for the usage peak and lows. In addition, this should provide for a good study of how FreeBSD 4.2 operates under high load in SMP. ckbdevent+0x1b9 atkbd_intr(c029c860,0,bfbfd2a4,c0227dbf,c029c860) at atkbd_intr+0x22 atkbd_isa_intr(c029c860,0,3011002f,2f,bfbf002f) at atkbd_isa_intr+0x18 Xresume1() at Xresume1+0x35 interrupt, eip = 0x807ca7b, esp = 0xf5cbffe0, ebp = 0xbfbfd2a4 db> Well, I know this trace is somewhat useless because of ctrl-alt-esc interrupt. The box did not crash today after we : A) Took out MFS/Md0 B) Added 3gb of swap on one drive, 1gb of swap on a raid volume another 1gb swap on another raid volume C) enabled vfs.vmiodirenable and kern.ipc.shm_use_phys -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. 
| +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 12:37:14 2001 Delivered-To: freebsd-fs@freebsd.org Received: from mail.cise.ufl.edu (coast.cise.ufl.edu [128.227.205.212]) by hub.freebsd.org (Postfix) with ESMTP id F3E2E37B718 for ; Wed, 21 Mar 2001 12:37:12 -0800 (PST) (envelope-from jfh@cise.ufl.edu) Received: from cise.ufl.edu (waterspout.cise.ufl.edu [128.227.205.52]) by mail.cise.ufl.edu (Postfix) with ESMTP id 4AC9FD815 for ; Wed, 21 Mar 2001 15:32:35 -0500 (EST) To: freebsd-fs@FreeBSD.org Subject: NFS ACLs? (was Re: First round review request, ACLs for UFS commit) In-Reply-To: Message from Robert Watson of "Tue, 20 Mar 2001 12:12:22 EST." Date: Wed, 21 Mar 2001 15:32:35 -0500 From: "James F. Hranicky" Message-Id: <20010321203235.4AC9FD815@mail.cise.ufl.edu> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Robert Watson wrote: > > Let me know if you have any additional questions. Are there plans for adding support for ACLs in the FreeBSD NFS implementation? Are there issues that need to be addressed to ensure interoperability with other OSs' implementations? If I could find any kind of spec for ACLs over NFS, this might be a fun project to tinker with in my "copious spare time" , but I've had a devil of a time finding any good docs...the POSIX drafts don't seem to mention NFS specifically, and I haven't found anything else since the last time I talked to you a few months back...perhaps I'm still not looking in the right place? 
---------------------------------------------------------------------- | Jim Hranicky, Senior SysAdmin UF/CISE Department | | E314D CSE Building Phone (352) 392-1499 | | jfh@cise.ufl.edu http://www.cise.ufl.edu/~jfh | ---------------------------------------------------------------------- - Encryption: its use by criminals is far less - - frightening than its banishment by governments - - Vote for Privacy - To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 17:59:50 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133]) by hub.freebsd.org (Postfix) with ESMTP id 72F4237B71B for ; Wed, 21 Mar 2001 17:59:47 -0800 (PST) (envelope-from tlambert@usr05.primenet.com) Received: (from daemon@localhost) by smtp03.primenet.com (8.9.3/8.9.3) id SAA11764; Wed, 21 Mar 2001 18:56:25 -0700 (MST) Received: from usr05.primenet.com(206.165.6.205) via SMTP by smtp03.primenet.com, id smtpdAAA9Xai6w; Wed Mar 21 18:56:21 2001 Received: (from tlambert@localhost) by usr05.primenet.com (8.8.5/8.8.5) id SAA13427; Wed, 21 Mar 2001 18:59:41 -0700 (MST) From: Terry Lambert Message-Id: <200103220159.SAA13427@usr05.primenet.com> Subject: Re: growfs To: fschapachnik@vianetworks.com.ar Date: Thu, 22 Mar 2001 01:59:41 +0000 (GMT) Cc: tlambert@primenet.com (Terry Lambert), freebsd-fs@FreeBSD.ORG In-Reply-To: <200103211252.JAA94399@ns1.via-net-works.net.ar> from "Fernando Schapachnik" at Mar 21, 2001 09:52:52 AM X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > > > I was wondering how usable is the growfs implementation > > > available in -CURRENT. > > > > > > Any chance of using it on -STABLE? > > > > It will work as well as it works pretty much everywhere, going > > back a long time. 
I believe the first version from "Der Mouse" > > ran on FreeBSD 1.1.5. > > Any place it can be downloaded? (I mean: is it just userland and I > grab it from -CURRENT, or there kernel/UFS code patches also?). It's called "fsresize" these days: http://www.nethelp.no/scsi/fsresize.c > > The problem with this is that you will get fragmentation; > > consider the following two cases; the first is a disk of > > size "10"; the second is a disk of size "6" that has been > > "grown" to size "10". The "*" are allocated blocks, and > > the "." are unallocated blocks; "o" are blocks that would > > have been allocated in the new space, if it had been there > > at the time the were allocated (but it wasn't): > > Thanks for the extense explanation! > > Actually, this is good enough for what I need: I'm planning to offer > backup space to customers. So I will start with an x Gb HD (vinum'ed, > of course). When I sold the whole of it, or I'm about to do it, then > I will add a new one. The trick is that there are very high chances > that when I've sold x Gb of backup space, the real used space is y, > being y << x, so basically I'm adding disk space when there is a lot > of space available still. > > Will that suffice, or am I lying somewhere? If you have a lot of space out there, then you will be fine. Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Mar 21 20:11:11 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp02.teb1.iconnet.net (smtp02.teb1.iconnet.net [209.3.218.43]) by hub.freebsd.org (Postfix) with ESMTP id BDA3D37B71E; Wed, 21 Mar 2001 20:10:51 -0800 (PST) (envelope-from babkin@bellatlantic.net) Received: from bellatlantic.net (client-151-198-117-202.nnj.dialup.bellatlantic.net [151.198.117.202]) by smtp02.teb1.iconnet.net (8.9.1/8.9.1) with ESMTP id XAA12126; Wed, 21 Mar 2001 23:10:46 -0500 (EST) Message-ID: <3AB97B45.37E4957F@bellatlantic.net> Date: Wed, 21 Mar 2001 23:10:45 -0500 From: Sergey Babkin X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 4.0-19990626-CURRENT i386) X-Accept-Language: en, ru MIME-Version: 1.0 To: Terry Lambert Cc: fs@FreeBSD.ORG, arch@FreeBSD.ORG Subject: Re: about common group & user ID space (PR kern/14584) References: <200103200521.WAA23451@usr05.primenet.com> Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Terry Lambert wrote: > > > > You could do this a bit more cleanly by just stealing the sign > > > bit, and setting if the uid field contained a group ID. > > > > > That was my original idea but some thinking and experimentation > > has shown that it creates too many incompatibilities, such as: > > > > - programs displaying the owner by name would break, and > > that includes both the standard programs and random applications > > - when exported by nfs, the same problem would stand for the > > clients > > - chown will have to be changed - both the program and system call, > > as you mention later > > > > and possibly other sorts of breakages. > > The NFS breakage is going to be there in any case; the semantics > will be different on the remote machine, giving ownership to a > particular user (who doesn't exist). 
This will turn "owner by > name" numeric at best, and give ownership to a particular person, > not group, at worst. It is expected that people will bring their whole uid/gid/name space (or at least the portion described by the sysctls as common space) into the required consistent form before enabling this feature. This is why I don't want this feature enabled by default. I plan to implement it over NFS as well, but if the NFS server does not support this feature, then yes, ownership will be given only to the pseudo-user whose ID coincides with the group ID (and if the namespace is consistent then such a pseudo-user with the same name and id as the group must exist). > You also have the problem that the FreeBSD machine has to be > your NIS master; in a heterogeneous environment, Sun boxes are > still better NIS servers, since they understand the full > complement of NIS maps, which FreeBSD doesn't, and they support > automount (as opposed to amd, which happily requires a reboot > to unwedge in many situations). > > Any time you internalize or externalize a uid/gid space, you > will have that problem. No, I don't. The origin of the passwd/group files is irrelevant. Only their contents matter. And yes, if someone is using NIS, their choice would be to either set the commonalized portion of the ID space as local or make the ids/names unique in the NIS maps. Or, of course, not enable this feature. > Plus, with your approach, you are either going to have to make > an exception for certain ID ranges, permitting overlap, or you > are going to be stuck renumbering things like "bin" and "kmem". Yes, the common id range is tuned by sysctl. > Further, even if the FreeBSD was the NIS master for NFS name > interpretation, the only safe way to make the maps transportable > would be to have identical group and password name/ID pairs.
> This breaks for normal duplication, which exists now: you can't > have two entries in either file for the same key field, since > a getpwuid or getgrgid will only return the first matching value > in all cases. That's why I consider the uniqueness rule so important in the distribution of uids/gids/names:
- each user and group must have a different name in the common namespace
- each user and group must have a different id in the common id space
- for each group there must be a pseudo-user with the same ID and name
And because, for historical compatibility reasons, applying these requirements to the whole id space is impossible, only part of the id space is enabled as compliant. > > > This changes the check to a one line change, conditional on > > > the high bit being set. > > > > No, the change would be the same, just wrapped into a condition > > check for this bit. > > I think you could "fudge" the in core copy of one id to be the > other, with the bit OR'ed in or AND'ed off, as appropriate... Hm, probably yes, they can be put into a loop. > > In the way I propose it, the sysadmins are supposed to create > > a pseudo-user with the same name and ID as each group. That > > This explodes when your remote NIS server doesn't enforce the > new semantics; this is sort of the opposite of the problem I > cite above with not being able to maintain a single namespace. The passwd/group files do not enforce it either. So it's just a convention that the sysadmins should comply with. If they can't comply they must not enable this feature. > I really think it's a lot easier to do this by stealing a bit > somewhere (second one down from the sign, if the sign is to be > held sacrosanct) than it is to rely on semantic enforcement by > your tools. As soon as you do that, it becomes significantly > less useful. At least with a stolen bit, the ownership on the > remote machine works, even if it doesn't precisely "make sense" > the same way it does on the hacked FreeBSD box.
Without the stolen bit it works on the remote machine as well, and even names will be shown properly because NFS implies the same users/groups on both machines. Just the permissions on the remote machines will be more restrictive. > The biggest problem is that the tools have to have a gentleman's > agreement between themselves across systems that everyone will > sign up to honor. That's really too kludgy to trust, unless you > are in a homogeneous environment (if then). Placing this as a > restriction makes the idea much, much less generally useful than > it would otherwise be. Well, the standard permissions scheme and especially NFS have this sort of agreement as well. Nothing stops an NFS client from assigning the IDs and names completely differently than on the server. Another typical breakage is when a user is not listed in /etc/group for his primary group. -SB To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 4:52:20 2001 Delivered-To: freebsd-fs@freebsd.org Received: from ns1.via-net-works.net.ar (ns1.via-net-works.net.ar [200.10.100.10]) by hub.freebsd.org (Postfix) with ESMTP id C147037B718 for ; Thu, 22 Mar 2001 04:52:16 -0800 (PST) (envelope-from fpscha@ns1.via-net-works.net.ar) Received: (from fpscha@localhost) by ns1.via-net-works.net.ar (8.9.3/8.9.3) id JAA29879; Thu, 22 Mar 2001 09:53:02 -0300 (ART) From: Fernando Schapachnik Message-Id: <200103221253.JAA29879@ns1.via-net-works.net.ar> Subject: Re: growfs In-Reply-To: <200103220159.SAA13427@usr05.primenet.com> "from Terry Lambert at Mar 22, 2001 01:59:41 am" To: Terry Lambert Date: Thu, 22 Mar 2001 09:53:02 -0300 (ART) Cc: fschapachnik@vianetworks.com.ar, freebsd-fs@FreeBSD.ORG Reply-To: Fernando Schapachnik X-Mailer: ELM [version 2.4ME+ PL82 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=ISO-8859-1 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop:
FreeBSD.org In a previous message, Terry Lambert wrote: > > > > I was wondering how usable is the growfs implementation > > > > available in -CURRENT. > > > > > > > > Any chance of using it on -STABLE? > > > > > > It will work as well as it works pretty much everywhere, going > > > back a long time. I believe the first version from "Der Mouse" > > > ran on FreeBSD 1.1.5. > > > > Any place it can be downloaded? (I mean: is it just userland and I > > grab it from -CURRENT, or there kernel/UFS code patches also?). > > It's called "fsresize" these days: > > http://www.nethelp.no/scsi/fsresize.c Mmmm... How is this related to: http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=26529+28670+/usr/local/www/db/text/2000/freebsd-fs/20001210.freebsd-fs Are they different things? Thanks! Fernando P. Schapachnik Network Administration VIA NET.WORKS ARGENTINA S.A. fschapachnik@vianetworks.com.ar Switchboard: (54-11) 4323-3333 - Support: 0810-333-AYUDA To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 5:17:57 2001 Delivered-To: freebsd-fs@freebsd.org Received: from relay.ioffe.rssi.ru (relay.ioffe.rssi.ru [194.85.224.33]) by hub.freebsd.org (Postfix) with ESMTP id A093937B71D for ; Thu, 22 Mar 2001 05:17:45 -0800 (PST) (envelope-from kopts@astro.ioffe.rssi.ru) Received: from astro.ioffe.rssi.ru (astro.ioffe.rssi.ru [194.85.229.130]) by relay.ioffe.rssi.ru (8.9.1/8.9.1) with ESMTP id QAA12249 for ; Thu, 22 Mar 2001 16:17:37 +0300 (MSK) Received: by astro.ioffe.rssi.ru (8.9.3/Clnt-2.14-AS-eef) id QAA35068; Thu, 22 Mar 2001 16:17:33 +0300 (MSK) Date: Thu, 22 Mar 2001 16:17:33 +0300 (MSK) From: Alexey Koptsevich To: fs@freebsd.org Subject: localized filenames in msdos fs Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Hello, Is there a way to correctly interpret local (e.g.
Russian cp1251) letters in the filenames on the msdos filesystem? Please cc: me your reply. Thanks, Alex To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 6:44:25 2001 Delivered-To: freebsd-fs@freebsd.org Received: from mx.nsu.ru (mx.nsu.ru [193.124.215.71]) by hub.freebsd.org (Postfix) with ESMTP id 04FCB37B71C for ; Thu, 22 Mar 2001 06:44:22 -0800 (PST) (envelope-from fjoe@iclub.nsu.ru) Received: from iclub.nsu.ru (root@iclub.nsu.ru [193.124.222.66]) by mx.nsu.ru (8.9.1/8.9.0) with ESMTP id UAA08291; Thu, 22 Mar 2001 20:42:03 +0600 (NOVT) Received: from localhost (fjoe@localhost) by iclub.nsu.ru (8.11.2/8.11.2) with ESMTP id f2MEg2t94390; Thu, 22 Mar 2001 20:42:02 +0600 (NS) (envelope-from fjoe@iclub.nsu.ru) Date: Thu, 22 Mar 2001 20:42:02 +0600 (NS) From: Max Khon To: Alexey Koptsevich Cc: fs@FreeBSD.ORG Subject: Re: localized filenames in msdos fs In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org hi, there! On Thu, 22 Mar 2001, Alexey Koptsevich wrote: > Is there a way to correctly interpret local (e.g. Russian cp1251) letters > in the filenames on the msdos filesystem? russian filenames on msdos filesystem are stored in cp866 (not cp1251). You can use -W=koi2dos,-L=ru_RU.KOI8-R options. 
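As a sketch of how those options might look in /etc/fstab, assuming a 4.x system where mount(8) hands options that begin with a hyphen on to mount_msdos: the device name /dev/ad0s1 and the mount point /dos are placeholders, and koi2dos is assumed to be one of the conversion tables installed for mount_msdos.

```
# /etc/fstab entry (device and mount point are placeholders)
/dev/ad0s1   /dos   msdos   rw,-W=koi2dos,-L=ru_RU.KOI8-R   0   0
```

The option string is the one given above, unchanged; only the surrounding fields are illustrative.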
/fjoe To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 6:57:30 2001 Delivered-To: freebsd-fs@freebsd.org Received: from tao.org.uk (genesis.tao.org.uk [212.135.162.62]) by hub.freebsd.org (Postfix) with ESMTP id 4BC7F37B71A for ; Thu, 22 Mar 2001 06:57:24 -0800 (PST) (envelope-from joe@tao.org.uk) Received: by tao.org.uk (Postfix, from userid 100) id 724F93120; Thu, 22 Mar 2001 14:57:22 +0000 (GMT) Date: Thu, 22 Mar 2001 14:57:21 +0000 From: Josef Karthauser To: Fernando Schapachnik Cc: Terry Lambert , freebsd-fs@FreeBSD.ORG Subject: Re: growfs Message-ID: <20010322145721.E7142@tao.org.uk> References: <200103220159.SAA13427@usr05.primenet.com> <200103221253.JAA29879@ns1.via-net-works.net.ar> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="b8GWCKCLzrXbuNet" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103221253.JAA29879@ns1.via-net-works.net.ar>; from fpscha@ns1.via-net-works.net.ar on Thu, Mar 22, 2001 at 09:53:02AM -0300 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org --b8GWCKCLzrXbuNet Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Mar 22, 2001 at 09:53:02AM -0300, Fernando Schapachnik wrote: > In an earlier message, Terry Lambert wrote: > > > > > I was wondering how usable is the growfs implementation > > > > > available in -CURRENT. > > > > > > > > > > Any chance of using it on -STABLE? > > > > > > > > It will work as well as it works pretty much everywhere, going > > > > back a long time. I believe the first version from "Der Mouse" > > > > ran on FreeBSD 1.1.5. > > > > > > Any place it can be downloaded? (I mean: is it just userland and I > > > grab it from -CURRENT, or there kernel/UFS code patches also?).
> > > > It's called "fsresize" these days: > > > > http://www.nethelp.no/scsi/fsresize.c > > Mmmm... How is this related to: > http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=26529+28670+/usr/local/www/db/text/2000/freebsd-fs/20001210.freebsd-fs > > Are they different things? Growfs is in the tree in -current. fsresize is something different. Joe --b8GWCKCLzrXbuNet Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.4 (FreeBSD) Comment: For info see http://www.gnupg.org iEYEARECAAYFAjq6EtEACgkQXVIcjOaxUBbWrACg2WHs4TL5GoqpmNk0JnrNwO+5 j+UAoMK0oAEgRJyDgYr1rSiX5deuJnTD =3ilu -----END PGP SIGNATURE----- --b8GWCKCLzrXbuNet-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 11:56:57 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132]) by hub.freebsd.org (Postfix) with ESMTP id BEA9737B71F for ; Thu, 22 Mar 2001 11:56:54 -0800 (PST) (envelope-from tlambert@usr06.primenet.com) Received: (from daemon@localhost) by smtp02.primenet.com (8.9.3/8.9.3) id MAA18064; Thu, 22 Mar 2001 12:50:08 -0700 (MST) Received: from usr06.primenet.com(206.165.6.206) via SMTP by smtp02.primenet.com, id smtpdAAAhlaymJ; Thu Mar 22 12:50:01 2001 Received: (from tlambert@localhost) by usr06.primenet.com (8.8.5/8.8.5) id MAA15736; Thu, 22 Mar 2001 12:56:43 -0700 (MST) From: Terry Lambert Message-Id: <200103221956.MAA15736@usr06.primenet.com> Subject: Re: growfs To: fschapachnik@vianetworks.com.ar Date: Thu, 22 Mar 2001 19:56:43 +0000 (GMT) Cc: tlambert@primenet.com (Terry Lambert), freebsd-fs@FreeBSD.ORG In-Reply-To: <200103221253.JAA29879@ns1.via-net-works.net.ar> from "Fernando Schapachnik" at Mar 22, 2001 09:53:02 AM X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender:
owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > > It's called "fsresize" these days: > > > > http://www.nethelp.no/scsi/fsresize.c > > Mmmm... How is this related to: > http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=26529+28670+/usr/local/www/db/text/2000/freebsd-fs/20001210.freebsd-fs > > Are they different things? Same ecological niche. I think "growfs" is newer, which is why I picked it. I looked in the "man page searcher" at: http://www.FreeBSD.org/cgi/man.cgi?manpath=FreeBSD+5.0-current Which apparently doesn't cover -current. It seems that growfs is in the -current source tree, however, according to my local CVS mirror. I don't know how hard a compile it would be on an older version of FreeBSD; if not hard, you'd think it would have been merged back for the 4.3 release. I've used "fsresize" before (it's been around a _long_ time), but haven't used "growfs". I think I would only run either of them on a machine where I had already backed up the data, and which was running on a UPS so that the process doesn't get interrupted in a state which will screw me (I'm not confident that the operations are ordered so as to prevent this, or restartable, in the event of an interruption). Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 12:28:57 2001 Delivered-To: freebsd-fs@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 3F1A837B71C; Thu, 22 Mar 2001 12:28:53 -0800 (PST) (envelope-from keichii@peorth.iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id B902B59293; Thu, 22 Mar 2001 14:28:52 -0600 (CST) Date: Thu, 22 Mar 2001 14:28:52 -0600 From: "Michael C .
Wu" To: fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver Message-ID: <20010322142852.A19619@peorth.iteration.net> Reply-To: "Michael C . Wu" References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> <200103211817.f2LIHR416007@earth.backplane.com> <20010321102836.N12319@fw.wintelcom.net> <200103211907.f2LJ7cp17933@earth.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103211907.f2LJ7cp17933@earth.backplane.com>; from dillon@earth.backplane.com on Wed, Mar 21, 2001 at 11:07:38AM -0800 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Just an update on the lovely loaded BBS server. We made our record-breaking number of users last night. After implementing the changes suggested and kqueue'ifying the BBS daemon, we saw a dramatic increase in server power. The top number of users was 4704, serving SSH, HTTP, SMTP, innd, BBSD with no delays. (Meanwhile, we had kernel profiling ON :) ) We had peak load averages of 100.0, read: no delay. I am certain that we could have taken on 6000 users had we had that many users. (It died due to an unrelated reason, not because of the load, since the number of users had gone down to 4400.) iostat became a fraction of what it used to be before we set vfs.vmiodirenable=1. (Why is vfs.vmiodirenable=1 not enabled by default?) We used to die at about 4200 users with average loads of 200.0, even 300.0. For those still interested in kq'ed BBSD stats: http://zoo.ee.ntu.edu.tw/~keichii I'm pondering whether this should have been posted to FreeBSD-Advocacy. :^) -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy.
| +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 12:34:41 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 002E437B71A; Thu, 22 Mar 2001 12:34:35 -0800 (PST) (envelope-from bright@fw.wintelcom.net) Received: (from bright@localhost) by fw.wintelcom.net (8.10.0/8.10.0) id f2MKYSp11302; Thu, 22 Mar 2001 12:34:28 -0800 (PST) Date: Thu, 22 Mar 2001 12:34:28 -0800 From: Alfred Perlstein To: "Michael C . Wu" Cc: fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver Message-ID: <20010322123428.D9431@fw.wintelcom.net> References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> <200103211817.f2LIHR416007@earth.backplane.com> <20010321102836.N12319@fw.wintelcom.net> <200103211907.f2LJ7cp17933@earth.backplane.com> <20010322142852.A19619@peorth.iteration.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010322142852.A19619@peorth.iteration.net>; from keichii@iteration.net on Thu, Mar 22, 2001 at 02:28:52PM -0600 X-all-your-base: are belong to us. Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org * Michael C . Wu [010322 12:29] wrote: > Just an update on the lovely loaded BBS server. > We made our record-breaking number of users last night. > > After implementing the changes suggested, and kqueue'ifying > the BBS daemon. We saw a dramatic increase in server power. > > Top number of users was 4704 users. Serving SSH, HTTP, SMTP, innd, BBSD > with no delays. (Meanwhile, we had kernel profiling ON :) ) > We had peak load averages of 100.0, read: no delay. I am certain > that we could taken on 6000 users had we had that many users. 
> (It died due to unrelated reason, not because of the load since the > number of users had gone down to 4400.) iostat became a fraction of what > it used to be before we set vfs.vmiodirenable=1. > > (Why is vfs.vmiodirenable=1 not enabled by default?) It's not a good thing for boxes with < 128megs of ram IMO. It wastes a bunch of ram. When I get a chance I'm going to look at having FreeBSD auto tune such things as maxusers and things like vfs.vmiodirenable for large installs. Actually, i'm tempted to change the defaults for large installs then leave the "optimizations" for small machines to the people who have small machines. :) > We used to die at about 4200 users with average loads of 200.0 even 300.0 > For those still interested in kq'ed BBSD stats: > http://zoo.ee.ntu.edu.tw/~keichii > > I'm ponder if this should have been posted to FreeBSD-Advocacy. :^) Why not write up an article about this BBS scene and get it posted to slashdot or daemonnews? It would really make good press: "FreeBSD community responds to BBS problem with spectacular results." or something. :) -- -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org] Instead of asking why a piece of software is using "1970s technology," start asking why software is ignoring 30 years of accumulated wisdom. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 12:36:22 2001 Delivered-To: freebsd-fs@freebsd.org Received: from tao.org.uk (genesis.tao.org.uk [212.135.162.62]) by hub.freebsd.org (Postfix) with ESMTP id 169B437B71E for ; Thu, 22 Mar 2001 12:36:18 -0800 (PST) (envelope-from joe@tao.org.uk) Received: by tao.org.uk (Postfix, from userid 100) id 37ABC3120; Thu, 22 Mar 2001 20:36:16 +0000 (GMT) Date: Thu, 22 Mar 2001 20:36:16 +0000 From: Josef Karthauser To: Terry Lambert Cc: fschapachnik@vianetworks.com.ar, freebsd-fs@FreeBSD.org Subject: Re: growfs Message-ID: <20010322203616.C566@tao.org.uk> References: <200103221253.JAA29879@ns1.via-net-works.net.ar> <200103221956.MAA15736@usr06.primenet.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="/e2eDi0V/xtL+Mc8" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200103221956.MAA15736@usr06.primenet.com>; from tlambert@primenet.com on Thu, Mar 22, 2001 at 07:56:43PM +0000 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org --/e2eDi0V/xtL+Mc8 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Mar 22, 2001 at 07:56:43PM +0000, Terry Lambert wrote: > > > It's called "fsresize" these days: > > > > > > http://www.nethelp.no/scsi/fsresize.c > > > > Mmmm... How is this related to: > > http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=26529+28670+/usr/local/www/db/text/2000/freebsd-fs/20001210.freebsd-fs > > > > Are they different things? > > Same ecological niche. > > I think "growfs" is newer, which is why I picked it. > > I looked in the "man page searcher" at: > > http://www.FreeBSD.org/cgi/man.cgi?manpath=FreeBSD+5.0-current > > Which apparently doesn't cover -current.
It seems that growfs > in the -current source tree, however, according to my local CVS > mirror. > > I don't know how hard a compile it would be on an older version > of FreeBSD; if not hard, you'd think it would have been merged > back for the 4.3 release. The guys who wrote it did so under 4.1 so it should work fine :) Joe --/e2eDi0V/xtL+Mc8 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.4 (FreeBSD) Comment: For info see http://www.gnupg.org iEYEARECAAYFAjq6Yj8ACgkQXVIcjOaxUBaYRwCeLHsuCb5VQ/etqywHCajMib7m 00gAoM2Dp26OTv9r9wGStcnBDbiiM3Zb =8lE7 -----END PGP SIGNATURE----- --/e2eDi0V/xtL+Mc8-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 13:46:13 2001 Delivered-To: freebsd-fs@freebsd.org Received: from ajax1.sovam.com (ajax1.sovam.com [194.67.1.172]) by hub.freebsd.org (Postfix) with ESMTP id 2230737B71C; Thu, 22 Mar 2001 13:46:06 -0800 (PST) (envelope-from avn@any.ru) Received: from ts9-a178.dial.sovam.com ([195.239.70.178]:1066 "EHLO ts9-a178.dial.sovam.com" ident: "avn" whoson: "-unregistered-" smtp-auth: TLS-CIPHER: TLS-PEER: ) by ajax1.sovam.com with ESMTP id ; Fri, 23 Mar 2001 00:45:46 +0300 Date: Fri, 23 Mar 2001 00:49:28 +0300 (MSK) From: "Alexey V. Neyman" X-X-Sender: To: "Michael C . Wu" Cc: , Subject: Re: tuning a VERY heavily (30.0) loaded s cerver In-Reply-To: <20010322142852.A19619@peorth.iteration.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org hello there! On Thu, 22 Mar 2001, Michael C . Wu wrote: >(Why is vfs.vmiodirenable=1 not enabled by default?) By the way, is there any all-in-one-place description of sysctl tuneables? Looking through all the man pages and collecting notices about MIB variables seems rather tiresome and, I think, pointless.
I doubt if they are all documented in man pages. # Alexey To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 13:53:24 2001 Delivered-To: freebsd-fs@freebsd.org Received: from mailout01.sul.t-online.com (mailout01.sul.t-online.com [194.25.134.80]) by hub.freebsd.org (Postfix) with ESMTP id 1E69837B718 for ; Thu, 22 Mar 2001 13:53:20 -0800 (PST) (envelope-from bfischer@Techfak.Uni-Bielefeld.DE) Received: from fwd04.sul.t-online.com by mailout01.sul.t-online.com with smtp id 14gD1C-0000vr-08; Thu, 22 Mar 2001 22:53:18 +0100 Received: from frolic.no-support.loc (520094253176-0001@[217.0.156.251]) by fmrl04.sul.t-online.com with esmtp id 14gD18-11sqSeC; Thu, 22 Mar 2001 22:53:14 +0100 Received: (from bjoern@localhost) by frolic.no-support.loc (8.11.1/8.9.3) id f2MLmk902231; Thu, 22 Mar 2001 22:48:46 +0100 (CET) (envelope-from bjoern) From: Bjoern Fischer Date: Thu, 22 Mar 2001 22:48:46 +0100 To: "James F. Hranicky" Cc: freebsd-fs@FreeBSD.ORG Subject: Re: NFS ACLs? (was Re: First round review request, ACLs for UFS commit) Message-ID: <20010322224845.A589@frolic.no-support.loc> References: <20010321203235.4AC9FD815@mail.cise.ufl.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010321203235.4AC9FD815@mail.cise.ufl.edu>; from jfh@cise.ufl.edu on Wed, Mar 21, 2001 at 03:32:35PM -0500 X-Sender: 520094253176-0001@t-dialin.net Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > Are there plans for adding support for ACLs in the FreeBSD NFS > implementation? Are there issues that need to be addressed to > ensure interoperability with other OSs' implementations? 
> > If I could find any kind of spec for ACLs over NFS, this might be a > fun project to tinker with in my "copious spare time" , but I've had > a devil of a time finding any good docs...the POSIX drafts don't seem > to mention NFS specifically, and I haven't found anything else since > the last time I talked to you a few months back...perhaps I'm still not > looking in the right place? NFSv4 (RFC2624 and RFC3010) has support for ACLs. There are no plans that I am aware of to bring NFSv4 to FreeBSD. The first open source NFSv4 implementations will probably be available on Linux and OpenBSD. The new RPC code in -current and the cleanup of the vnode code should pave the way for NFSv4 in FreeBSD. Bjoern Fischer -- -----BEGIN GEEK CODE BLOCK----- GCS d--(+) s++: a- C+++(-) UB++++OSI++++$ P+++(-) L---(++) !E W- N+ o>+ K- !w !O !M !V PS++ PE- PGP++ t+++ !5 X++ tv- b+++ D++ G e+ h-- y+ ------END GEEK CODE BLOCK------ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Mar 22 16:14:32 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 3BF7E37B71A; Thu, 22 Mar 2001 16:14:28 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2N0EBC61507; Thu, 22 Mar 2001 16:14:11 -0800 (PST) (envelope-from dillon) Date: Thu, 22 Mar 2001 16:14:11 -0800 (PST) From: Matt Dillon Message-Id: <200103230014.f2N0EBC61507@earth.backplane.com> To: "Michael C .
Wu" Cc: fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> <200103211817.f2LIHR416007@earth.backplane.com> <20010321102836.N12319@fw.wintelcom.net> <200103211907.f2LJ7cp17933@earth.backplane.com> <20010322142852.A19619@peorth.iteration.net> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :(Why is vfs.vmiodirenable=1 not enabled by default?) : The only reason it isn't enabled by default is some unresolved filesystem corruption that occurs very rarely (with or without it) that Kirk and I are still trying to nail down. I want to get that figured out first. It is true that some people have brought up memory use issues, but I don't consider memory use to really be that much of an issue. This is a cache, after all, so the blocks can be reused at just about any time. And directory blocks do not get cached well at all with vmiodirenable turned off. So the net result should be an increase in performance even on low-memory boxes. 
-Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri Mar 23 5:44:58 2001 Delivered-To: freebsd-fs@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id 9974037B71F; Fri, 23 Mar 2001 05:44:52 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.1/8.11.1) with SMTP id f2NDi9h27008; Fri, 23 Mar 2001 08:44:29 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Fri, 23 Mar 2001 08:44:09 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: "Alexey V. Neyman" Cc: "Michael C . Wu" , fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Fri, 23 Mar 2001, Alexey V. Neyman wrote: > On Thu, 22 Mar 2001, Michael C . Wu wrote: > > >(Why is vfs.vmiodirenable=1 not enabled by default?) > By the way, is there any all-in-one-place description of sysctl tuneables?
> Looking all the man pages and collecting notices about MIB variables seems > rather tiresome and, I think, pointless. I doubt if they are all > documented in man pages. sysctl(3) describes a number of the constant-named sysctl variables, and a number of sysctl's are described in the man pages associated with the features tweaked by the sysctl's. For example, the jail(8) man page describes the jail.* namespace. However, you're right that there are vast hordes of under-documented sysctl's. That said, probably only the "tweakable" (writable) sysctl's need to be documented in the general case, since many are used for the sole purpose of exporting kernel data for supported interfaces, whereas those sysctl's themselves are subject to change. For example, a large number of read-only sysctl's were introduced to support the non-setgid-kmem operation of top, systat, and various other *stat's recently. Also, many sysctl's are "self-documenting", in that the in-kernel sysctl macro declarations include a description field. I don't think sysctl(8) currently knows how to read that field, but if you look at the SYSCTL definitions in the kernel source, they're probably a decent starting point. A magic script to extract the sysctl names, types, and descriptions might be useful.
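A minimal sketch of what such an extraction script could look like, in Python. It assumes the declarations follow the shape SYSCTL_INT(_parent, OID_AUTO, name, flags, ..., "description"), with the parent node first, the name third and the description string last, and with no nested parentheses in the argument list. The sample declaration, the variable in it and the helper names are made up for the example; declarations hidden behind wrapper macros would be missed.

```python
import re

# Matches a SYSCTL_<TYPE>(...) declaration that starts at the beginning of a line.
SYSCTL_RE = re.compile(r"^SYSCTL_(\w+)\(([^()]+)\)", re.MULTILINE)

def split_args(arglist):
    """Split on commas that are not inside a double-quoted string."""
    parts, current, in_string = [], [], False
    for ch in arglist:
        if ch == '"':
            in_string = not in_string
        if ch == "," and not in_string:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts

def find_sysctls(source):
    """Yield (type, dotted name, description) for each declaration found."""
    for kind, arglist in SYSCTL_RE.findall(source):
        if kind == "DECL":  # SYSCTL_DECL only declares a parent node
            continue
        args = [a.strip() for a in split_args(arglist)]
        parent = args[0].lstrip("_").replace("_", ".")
        name = args[2]
        full = f"{parent}.{name}" if parent else name
        yield kind, full, args[-1].strip('"')

# Hypothetical declaration, not taken from the real kernel sources.
sample = '''
SYSCTL_DECL(_vfs);
SYSCTL_INT(_vfs, OID_AUTO, examplevar, CTLFLAG_RW,
    &examplevar, 0, "Hypothetical example, not a real sysctl");
'''

for entry in find_sysctls(sample):
    print(*entry, sep="\t")
```

To run it over a source tree one would read each .c file under the kernel sources and pass its contents to find_sysctls; that part is left out here to keep the sketch short.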
Robert N M Watson FreeBSD Core Team, TrustedBSD Project robert@fledge.watson.org NAI Labs, Safeport Network Services To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri Mar 23 11:57:47 2001 Delivered-To: freebsd-fs@freebsd.org Received: from roaming.cacheboy.net (host213-123-132-142.btopenworld.com [213.123.132.142]) by hub.freebsd.org (Postfix) with ESMTP id 9712637B718; Fri, 23 Mar 2001 11:57:33 -0800 (PST) (envelope-from adrian@roaming.cacheboy.net) Received: (from adrian@localhost) by roaming.cacheboy.net (8.11.1/8.11.1) id f2NJkkB05859; Fri, 23 Mar 2001 20:46:46 +0100 (CET) (envelope-from adrian) Date: Fri, 23 Mar 2001 20:11:03 +0100 From: Adrian Chadd To: Robert Watson Cc: "Alexey V. Neyman" , "Michael C . Wu" , fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver Message-ID: <20010323201103.A5828@roaming.cacheboy.net> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from rwatson@FreeBSD.ORG on Fri, Mar 23, 2001 at 08:44:09AM -0500 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Fri, Mar 23, 2001, Robert Watson wrote: > > On Fri, 23 Mar 2001, Alexey V. Neyman wrote: > > > On Thu, 22 Mar 2001, Michael C . Wu wrote: > > > > >(Why is vfs.vmiodirenable=1 not enabled by default?) > > By the way, is there any all-in-one-place description of sysctl tuneables? > > Looking all the man pages and collecting notices about MIB variables seems > > rather tiresome and, I think, pointless. I doubt if they are all > > documented in man pages. > > sysctl(3) describes a number of the constant-named sysctl variables, and a > number of sysctl's are described in the man pages associated with the > features tweaked by the sysctl's. For example, the jail(8) man page > describes the jail.* namespace. 
However, you're right that there are vast > hoards of under-documented sysctl's. That said, probably only the > "tweakable" (writable) sysctl's need to be documented in the general case, > since many are used for the sole purpose of exporting kernel data for > supported interfaces, whereas the sysctl's are subject to change. For > example, a large number of read-only sysctl's were introduced to support > the non-setgid-kmem operation of top, systat, and various other *stat's > recently. Also, many sysctl's are "self-documenting", in that the > declaration of the sysctl macros in-kernel include a description field. I > don't think sysctl(8) currently knows how to read that field, but if you > look at the SYSCTL definitions in the kernel source, they're probably a > decent starting point. A magic script to extract the sysctl names, types, > and descriptions might be useful.. A while back I started running through the undocumented sysctls and documenting them. I didn't get through all of them, and the main reason I stopped was because there wasn't a nifty way to extract the sysctls short of writing a script to extract them from /usr/src. Someone did point out that you could stuff the sysctl's into an elf segment and only load it when needed, but I don't know much about elf. If someone would like to do this, I'm sure a small group of us (Asmodai? :-P) could walk the sysctl tree again and figure out what the undocumented sysctls are. :-) adrian -- Adrian Chadd "Programming is like sex: One mistake and you have to support for a lifetime." 
-- rec.humor.funny To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 0:17:28 2001 Delivered-To: freebsd-fs@freebsd.org Received: from vexpert.dbai.tuwien.ac.at (vexpert.dbai.tuwien.ac.at [128.130.111.12]) by hub.freebsd.org (Postfix) with ESMTP id 294DE37B71A; Sat, 24 Mar 2001 00:17:15 -0800 (PST) (envelope-from pfeifer@dbai.tuwien.ac.at) Received: from deneb (deneb [128.130.111.2]) by vexpert.dbai.tuwien.ac.at (8.11.1/8.11.1) with ESMTP id f2O8HBe23577; Sat, 24 Mar 2001 09:17:12 +0100 (MET) Date: Sat, 24 Mar 2001 09:17:12 +0100 (CET) From: Gerald Pfeifer To: , Subject: Displaying options for current NFS mounts Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org I tried to get some responses to this on -questions a couple of months ago, but failed: What I'd like to see is `mount -v' printing mail:/var/mail on /var/mail (nfs: v3, udp) vexpert:/files7 on /system (nfs: v3, tcp) vexpert:/files5 on /.amd_mnt/vexpert/files5 (nfs: v3, udp) ^^^^^^^^^^^^ instead of mail:/var/mail on /var/mail (nfs) vexpert:/files7 on /system (nfs) vexpert:/files5 on /.amd_mnt/vexpert/files5 (nfs) ^^^ This kind of information is incredibly useful for debugging, yet I haven't found ANY way to obtain it, let alone such a natural one. 
Gerald To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 5:42: 5 2001 Delivered-To: freebsd-fs@freebsd.org Received: from heechee.tobez.org (254.adsl0.ryv.worldonline.dk [213.237.10.254]) by hub.freebsd.org (Postfix) with ESMTP id 2B4B037B71E; Sat, 24 Mar 2001 05:41:52 -0800 (PST) (envelope-from tobez@tobez.org) Received: by heechee.tobez.org (Postfix, from userid 1001) id D9D95550E; Sat, 24 Mar 2001 14:41:50 +0100 (CET) Date: Sat, 24 Mar 2001 14:41:50 +0100 From: Anton Berezin To: Adrian Chadd Cc: Robert Watson , "Alexey V. Neyman" , "Michael C . Wu" , fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver Message-ID: <20010324144150.A59930@heechee.tobez.org> References: <20010323201103.A5828@roaming.cacheboy.net> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="k+w/mQv8wyuph6w0" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010323201103.A5828@roaming.cacheboy.net>; from adrian@FreeBSD.ORG on Fri, Mar 23, 2001 at 08:11:03PM +0100 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org --k+w/mQv8wyuph6w0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Fri, Mar 23, 2001 at 08:11:03PM +0100, Adrian Chadd wrote: > A while back I started running through the undocumented sysctls and > documenting them. I didn't get through all of them, and the main reason > I stopped was because there wasn't a nifty way to extract the sysctls > short of writing a script to extract them from /usr/src. > > Someone did point out that you could stuff the sysctl's into an elf > segment and only load it when needed, but I don't know much about elf. > If someone would like to do this, I'm sure a small group of us > (Asmodai? :-P) could walk the sysctl tree again and figure out what > the undocumented sysctls are. 
:-) Some time ago I wrote such a script and even sent it to someone; never got any response, though. It is pretty minimal but does the job. I believe that the only sysctls it misses are those which use auxiliary defines to minimize the number of parameters (like #define P1B_SYSCTL in posix4/posix4_mib.c). FWIW, the script is attached. Cheers, &Anton. -- May the tuna salad be with you.
--k+w/mQv8wyuph6w0 Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=sysctl_find

#! /usr/bin/perl -w
use File::Find;
use Text::ParseWords;

sub check_file {
	return unless /\.c$/ && -r;
	local $/;	# slurp mode: read the whole file at once
	open SRC, "< $_" or print("can't open $File::Find::name: $!\n"), return;
	my $src = <SRC>;	# memory hog
	close SRC;
	# collect every SYSCTL_xxx(...) macro invocation that starts a line
	my @found = ($src =~ /\nSYSCTL_(\w+\([^()]+\))/sg);
	return unless @found;
	print "$File::Find::dir/$_:\n";
	for (@found) {
		tr/\n\t / /s;
		my ($type) = /^(\w+)\(/;
		next if $type eq "DECL";
		s/^.*\(//;
		s/\)//;
		my @args = quotewords ',', 1, $_;
		#print "|@args|\n";
		$args[0] =~ s/^\s*_//;
		$args[0] =~ s/\s+$//;
		$args[0] =~ tr/_/./;
		$args[2] =~ s/^\s+//;
		$args[2] =~ s/\s+$//;
		print "$type\t$args[0].$args[2]\t$args[-1]\n";
	}
}

find( \&check_file, '/usr/src/sys');

--k+w/mQv8wyuph6w0-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 6:21:29 2001 Delivered-To: freebsd-fs@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id CB5BC37B719; Sat, 24 Mar 2001 06:21:18 -0800 (PST) (envelope-from des@ofug.org) Received: (from des@localhost) by flood.ping.uio.no (8.9.3/8.9.3) id PAA09436; Sat, 24 Mar 2001 15:21:06 +0100 (CET) (envelope-from des@ofug.org) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: Matt Dillon Cc: Alfred Perlstein , "Michael C . 
Wu" , Rik van Riel , Peter Wemm , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> <200103211817.f2LIHR416007@earth.backplane.com> <20010321102836.N12319@fw.wintelcom.net> <200103211907.f2LJ7cp17933@earth.backplane.com> From: Dag-Erling Smorgrav Date: 24 Mar 2001 15:21:05 +0100 In-Reply-To: Matt Dillon's message of "Wed, 21 Mar 2001 11:07:38 -0800 (PST)" Message-ID: Lines: 11 User-Agent: Gnus/5.0802 (Gnus v5.8.2) Emacs/20.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Matt Dillon writes: > So you would be able to create approximately four 17GB swap partitions. > If you reduce NSWAP to 2 you would be able to create approximately > two 34GB swap partitions. If you reduce NSWAP to 1 you would be able > to create approximately one 68GB swap partition. "approximately one"? :) DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 10:32:15 2001 Delivered-To: freebsd-fs@freebsd.org Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id B6CC937B719; Sat, 24 Mar 2001 10:32:11 -0800 (PST) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.2/8.9.3) id f2OIVNb05212; Sat, 24 Mar 2001 10:31:23 -0800 (PST) (envelope-from dillon) Date: Sat, 24 Mar 2001 10:31:23 -0800 (PST) From: Matt Dillon Message-Id: <200103241831.f2OIVNb05212@earth.backplane.com> To: Dag-Erling Smorgrav Cc: Alfred Perlstein , "Michael C . 
Wu" , Rik van Riel , Peter Wemm , izero@ms26.hinet.net, cross@math.psu.edu, grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded s cerver References: <200103211114.f2LBE0h57371@mobile.wemm.org> <20010321120620.A932@peorth.iteration.net> <200103211817.f2LIHR416007@earth.backplane.com> <20010321102836.N12319@fw.wintelcom.net> <200103211907.f2LJ7cp17933@earth.backplane.com> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :Matt Dillon writes: :> So you would be able to create approximately four 17GB swap partitions. :> If you reduce NSWAP to 2 you would be able to create approximately :> two 34GB swap partitions. If you reduce NSWAP to 1 you would be able :> to create approximately one 68GB swap partition. : :"approximately one"? :) : :DES :-- :Dag-Erling Smorgrav - des@ofug.org Well, before Tor's patch you could create more, but then the machine would get angry at you and crash. Now you can try to create more and the machine will only admonish you for being silly, and then ignore the request. 
-Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 14:15:54 2001 Delivered-To: freebsd-fs@freebsd.org Received: from bazooka.unixfreak.org (bazooka.unixfreak.org [63.198.170.138]) by hub.freebsd.org (Postfix) with ESMTP id 38DB737B718; Sat, 24 Mar 2001 14:15:40 -0800 (PST) (envelope-from dima@unixfreak.org) Received: from spike.unixfreak.org (spike [63.198.170.139]) by bazooka.unixfreak.org (Postfix) with ESMTP id A025A3E09; Sat, 24 Mar 2001 14:15:39 -0800 (PST) To: Gerald Pfeifer Cc: freebsd-hackers@freebsd.org, freebsd-fs@freebsd.org Subject: Re: Displaying options for current NFS mounts In-Reply-To: ; from pfeifer@dbai.tuwien.ac.at on "Sat, 24 Mar 2001 09:17:12 +0100 (CET)" Date: Sat, 24 Mar 2001 14:15:39 -0800 From: Dima Dorfman Message-Id: <20010324221539.A025A3E09@bazooka.unixfreak.org> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Gerald Pfeifer writes: > What I'd like to see is `mount -v' printing > > vexpert:/files5 on /.amd_mnt/vexpert/files5 (nfs: v3, udp) > ^^^^^^^^^^^^ > instead of > > vexpert:/files5 on /.amd_mnt/vexpert/files5 (nfs) > ^^^ > This kind of information is incredibly useful for debugging, yet I > haven't found ANY way to obtain it, let alone such a natural one. IIRC tcpdump can detect NFS3 vs. NFS2, but that's suboptimal. Implementing the above functionality in mount(8) isn't actually that hard. We would need to export the filesystem-specific _args structures (e.g., nfs_args, ffs_args) to the userland. If we do that, mount(8) will be able to display all kinds of interesting, filesystem-specific stuff (e.g., NFS version and transport, whether a mounted CDROM is using Joliet, etc.). I tried to export this stuff in struct statfs, but ran into a problem: I'd need the complete definitions of _args in , but I can't include, e.g., because the latter includes the former ()! 
The patch below kind of implements this functionality. I only export nfs_args (not _args), and I only modified mount(8) to print the NFS version, but printing the transport and others is simple from there. To work around the above problem, I pasted the struct nfs_args definition into mount.h. It is *horribly* ugly, but it does work.

If some other people display interest in this, and someone can suggest a less ugly way of getting the definitions of _args into mount.h (the only other way I can think of is to just move all of them from /.h to mount.h permanently), I'll implement this stuff in the other filesystems.

Regards

Dima Dorfman dima@unixfreak.org

P.S. If you want to try the patch, you'll need to rebuild at least the kernel, libc, mount, mountd, and amd, since the size of struct statfs changes. I only did those, and it seems to work on my system.

Index: sys/sys/mount.h
===================================================================
RCS file: /st/src/FreeBSD/src/sys/sys/mount.h,v
retrieving revision 1.102
diff -u -r1.102 mount.h
--- sys/sys/mount.h	2001/03/01 20:59:59	1.102
+++ sys/sys/mount.h	2001/03/24 22:03:13
@@ -69,6 +69,44 @@
 #define	MNAMELEN	72	/* length of buffer for returned name */
 #endif

+/* XXXDD: from src/sys/nfs/nfs.h! fixme! */
+/*
+ * Arguments to mount NFS
+ */
+#ifndef NFS_ARGS_DEFINED
+#define NFS_ARGS_DEFINED
+#define NFS_ARGSVERSION	3		/* change when nfs_args changes */
+struct nfs_args {
+	int		version;	/* args structure version number */
+	struct sockaddr	*addr;		/* file server address */
+	int		addrlen;	/* length of address */
+	int		sotype;		/* Socket type */
+	int		proto;		/* and Protocol */
+	u_char		*fh;		/* File handle to be mounted */
+	int		fhsize;		/* Size, in bytes, of fh */
+	int		flags;		/* flags */
+	int		wsize;		/* write size in bytes */
+	int		rsize;		/* read size in bytes */
+	int		readdirsize;	/* readdir size in bytes */
+	int		timeo;		/* initial timeout in .1 secs */
+	int		retrans;	/* times to retry send */
+	int		maxgrouplist;	/* Max. size of group list */
+	int		readahead;	/* # of blocks to readahead */
+	int		leaseterm;	/* Term (sec) of lease */
+	int		deadthresh;	/* Retrans threshold */
+	char		*hostname;	/* server's name */
+	int		acregmin;	/* cache attrs for reg files min time */
+	int		acregmax;	/* cache attrs for reg files max time */
+	int		acdirmin;	/* cache attrs for dirs min time */
+	int		acdirmax;	/* cache attrs for dirs max time */
+};
+#endif /* !NFS_ARGS_DEFINED */
+
+/* filesystem-specific mount options */
+union mount_info {
+	struct nfs_args nfs;
+};
+
 struct statfs {
 	long	f_spare2;		/* placeholder */
 	long	f_bsize;		/* fundamental file system block size */
@@ -92,6 +130,7 @@
 	char	f_mntfromname[MNAMELEN];/* mounted filesystem */
 	short	f_spares2;		/* unused spare */
 	long	f_spare[2];		/* unused spare */
+	union	mount_info f_mtinfo;	/* filesystem-specific mount info */
 };

 #ifdef _KERNEL
Index: sys/nfs/nfs.h
===================================================================
RCS file: /st/src/FreeBSD/src/sys/nfs/nfs.h,v
retrieving revision 1.57
diff -u -r1.57 nfs.h
--- sys/nfs/nfs.h	2001/02/18 13:30:19	1.57
+++ sys/nfs/nfs.h	2001/03/24 22:03:13
@@ -116,6 +116,8 @@
 /*
  * Arguments to mount NFS
  */
+#ifndef NFS_ARGS_DEFINED
+#define NFS_ARGS_DEFINED
 #define NFS_ARGSVERSION	3		/* change when nfs_args changes */
 struct nfs_args {
 	int		version;	/* args structure version number */
@@ -141,6 +143,7 @@
 	int		acdirmin;	/* cache attrs for dirs min time */
 	int		acdirmax;	/* cache attrs for dirs max time */
 };
+#endif /* !NFS_ARGS_DEFINED */

 /*
  * NFS mount option flags
Index: sys/nfs/nfs_vfsops.c
===================================================================
RCS file: /st/src/FreeBSD/src/sys/nfs/nfs_vfsops.c,v
retrieving revision 1.94
diff -u -r1.94 nfs_vfsops.c
--- sys/nfs/nfs_vfsops.c	2001/03/01 20:59:19	1.94
+++ sys/nfs/nfs_vfsops.c	2001/03/24 22:03:14
@@ -307,6 +307,9 @@
 		sbp->f_type = mp->mnt_vfc->vfc_typenum;
 		bcopy(mp->mnt_stat.f_mntonname, sbp->f_mntonname, MNAMELEN);
 		bcopy(mp->mnt_stat.f_mntfromname, sbp->f_mntfromname, MNAMELEN);
+		bcopy((caddr_t)&mp->mnt_stat.f_mtinfo.nfs,
+		    (caddr_t)&sbp->f_mtinfo.nfs,
+		    sizeof(mp->mnt_stat.f_mtinfo.nfs));
 	}
 	nfsm_reqdone;
 	vput(vp);
@@ -892,6 +895,7 @@
 	bcopy((caddr_t)argp->fh, (caddr_t)nmp->nm_fh, argp->fhsize);
 	bcopy(hst, mp->mnt_stat.f_mntfromname, MNAMELEN);
 	bcopy(pth, mp->mnt_stat.f_mntonname, MNAMELEN);
+	bcopy(argp, &mp->mnt_stat.f_mtinfo.nfs, sizeof(*argp));
 	nmp->nm_nam = nam;
 	/* Set up the sockets and per-host congestion */
 	nmp->nm_sotype = argp->sotype;
Index: sbin/mount/mount.c
===================================================================
RCS file: /st/src/FreeBSD/src/sbin/mount/mount.c,v
retrieving revision 1.41
diff -u -r1.41 mount.c
--- sbin/mount/mount.c	2000/11/22 17:54:56	1.41
+++ sbin/mount/mount.c	2001/03/24 22:03:14
@@ -50,6 +50,9 @@
 #include
 #include

+#include
+#include
+
 #include
 #include
 #include
@@ -530,6 +533,20 @@
 			(void)printf(", reads: sync %ld async %ld",
 			    sfp->f_syncreads, sfp->f_asyncreads);
 	}
+
+	/*
+	 * File-system specific options.
+	 */
+	if (strcmp(sfp->f_fstypename, "nfs") == 0) {
+		struct nfs_args *nfsa = &sfp->f_mtinfo.nfs;
+
+		if (nfsa->version != NFS_ARGSVERSION) {
+			(void)printf("\n");
+			errx(1, "nfs_args version mismatch");
+		}
+		(void)printf(", %s",
+		    (nfsa->flags & NFSMNT_NFSV3) ? "v3" : "v2");
+	}
 	(void)printf(")\n");
 }

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 17:43: 8 2001 Delivered-To: freebsd-fs@freebsd.org Received: from dell.dannyland.org (dell.dannyland.org [64.81.36.13]) by hub.freebsd.org (Postfix) with ESMTP id 3B52C37B718; Sat, 24 Mar 2001 17:43:05 -0800 (PST) (envelope-from dannyman@toldme.com) Received: by dell.dannyland.org (Postfix, from userid 1001) id 576BC5C3D; Sat, 24 Mar 2001 17:43:14 -0800 (PST) Date: Sat, 24 Mar 2001 17:43:14 -0800 From: dannyman To: Matt Dillon Cc: "Michael C . 
Wu" , Alfred Perlstein , grog@FreeBSD.ORG, fs@FreeBSD.ORG, hackers@FreeBSD.ORG Subject: Re: tuning a VERY heavily (30.0) loaded server Message-ID: <20010324174314.B38361@dell.dannyland.org> References: <20010320111144.A51924@peorth.iteration.net> <20010320092717.R29888@fw.wintelcom.net> <20010320113818.B52586@peorth.iteration.net> <200103201750.f2KHopk94248@earth.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0.1i In-Reply-To: <200103201750.f2KHopk94248@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Mar 20, 2001 at 09:50:51AM -0800 X-Loop: djhoward@uiuc.edu X-URL: http://www.dannyland.org/~dannyman/ Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Tue, Mar 20, 2001 at 09:50:51AM -0800, Matt Dillon wrote: > One thing that comes to mind is that you can smarthost your outgoing > email to another host so the queues don't build up. This should > greatly reduce mail load. In fact, I would recommend offloading email > entirely if possible... email always hits disks hard. > > Definitely get rid of MFS. MFS wastes 2x the memory allocated to it. > Use a softupdates-enabled filesystem in place of MFS, or use a > swap-backed VN-based partition with softupdates enabled. > > Alfred's vmiodirenable suggestion is a good one. [...] This might help a little: mount things with -o noatime? If you are reading the same files over and over and over again, you needn't bother WRITEing an atime ... 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 18:36:36 2001 Delivered-To: freebsd-fs@freebsd.org Received: from relay.butya.kz (butya-gw.butya.kz [212.154.129.94]) by hub.freebsd.org (Postfix) with ESMTP id 9947837B719; Sat, 24 Mar 2001 18:36:30 -0800 (PST) (envelope-from bp@butya.kz) Received: by relay.butya.kz (Postfix, from userid 1000) id 57E2928E25; Sun, 25 Mar 2001 09:36:21 +0700 (ALMST) Received: from localhost (localhost [127.0.0.1]) by relay.butya.kz (Postfix) with ESMTP id 447962866F; Sun, 25 Mar 2001 09:36:21 +0700 (ALMST) Date: Sun, 25 Mar 2001 09:36:21 +0700 (ALMST) From: Boris Popov To: Dima Dorfman Cc: Gerald Pfeifer , freebsd-hackers@freebsd.org, freebsd-fs@freebsd.org Subject: Re: Displaying options for current NFS mounts In-Reply-To: <20010324221539.A025A3E09@bazooka.unixfreak.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Sat, 24 Mar 2001, Dima Dorfman wrote: > Implementing the above functionality in mount(8) isn't actually that > hard. We would need to export the filesystem-specific _args > structures (e.g., nfs_args, ffs_args) to the userland. If we do that, > mount(8) will be able to display all kinds of interesting, > filesystem-specific stuff (e.g., NFS version and transport, whether a > mounted CDROM is using Joliet, etc.). This is probably a step in the wrong direction. That way, mount(8) would have to know about every type of filesystem and its flags/options/etc. Maybe the fs (kernel part) could report these options as a string in a buffer supplied by userland? It is not necessary to use the existing statfs struct; we now have a very flexible VOP_GETEXTATTR() with a corresponding syscall. 
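[Archive note] The "report the options as a string" idea above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name, the EX_* constants and the output format are hypothetical, and it does not use VOP_GETEXTATTR() or any real kernel interface. It only shows the filesystem formatting its own options into a caller-supplied buffer, so that mount(8) never needs to know what the flags mean.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag bit and transport codes, for illustration only;
 * these are NOT the values from any real NFS header. */
#define EX_NFSV3	0x1
#define EX_TCP		1
#define EX_UDP		2

/*
 * Render the mount's options as text into buf (at most len bytes,
 * NUL-terminated when len > 0).  Returns the length the full string
 * needs, like snprintf(), so the caller can detect truncation and
 * retry with a larger buffer.
 */
int
ex_nfs_describe(char *buf, size_t len, int flags, int transport)
{
	const char *vers = (flags & EX_NFSV3) ? "v3" : "v2";
	const char *tp = (transport == EX_TCP) ? "tcp" :
	    (transport == EX_UDP) ? "udp" : "unknown";

	assert(buf != NULL || len == 0);
	return snprintf(buf, len, "%s, %s", vers, tp);
}
```

A generic caller would simply print whatever text comes back, e.g. "(nfs: %s)", which is also how a third-party filesystem could take part without any change to mount(8).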
> If some other people display intrest in this, and someone can suggest > a less ugly way of getting the definitions of _args into mount.h > (the only other way I can think of is to just move all of them from > /.h to mount.h permamently), I'll implement this stuff in the > other filesystems. Think about third party filesystems :) -- Boris Popov http://www.butya.kz/~bp/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 20:18: 2 2001 Delivered-To: freebsd-fs@freebsd.org Received: from mail.webmonster.de (datasink.webmonster.de [194.162.162.209]) by hub.freebsd.org (Postfix) with SMTP id 37A2C37B719 for ; Sat, 24 Mar 2001 20:17:59 -0800 (PST) (envelope-from karsten@rohrbach.de) Received: (qmail 28445 invoked by uid 1000); 25 Mar 2001 04:07:52 -0000 Date: Sun, 25 Mar 2001 06:07:52 +0200 From: "Karsten W. Rohrbach" To: bv@bilver.wjv.com Cc: freebsd-fs@freebsd.org Subject: Re: Design a journalled file system Message-ID: <20010325060752.A28058@rohrbach.de> Reply-To: karsten@rohrbach.de Mail-Followup-To: bv@bilver.wjv.com, freebsd-fs@freebsd.org References: <20010226221132.C20550@prism.flugsvamp.com> <200102270620.XAA13824@usr05.primenet.com> <20010227084658.D20550@prism.flugsvamp.com> <20010227101911.A88501@wjv.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010227101911.A88501@wjv.com>; from bill@bilver.wjv.com on Tue, Feb 27, 2001 at 10:19:11AM -0500 X-Arbitrary-Number-Of-The-Day: 42 X-Sender: karsten@rohrbach.de Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Bill Vermillion(bill@bilver.wjv.com)@Tue, Feb 27, 2001 at 10:19:11AM -0500: > On Tue, Feb 27, 2001 at 08:46:58AM -0600, Jonathan Lemon thus spoke: > Is my mind playing tricks on me? I seem to recall that on an SGI > there is a separte boot file system then the XFS. 
It's been a > couple of years now - but I convertned several from the 5.x to the > 6.x Irix with the new XFS. yeah afaik there is a separate partition/slice/whatsoever that holds the boot files. > > Why does the boot file system have to be the same as a running > file-system. I know that in some of the Sys V.x Intel variants, > there is a separate booting file system conforming to the old > s51 file system because the newer file systems they use wont > boot in an iNTEL environment. i know some people will ignite their flamethrowers now but i like the idea behind /dev/ipldevice on ibm aix. its just container for some very simple structure that holds the files needed to boot as far as devices and other drivers are loaded to get into the next stage and kinda kldload(8) the other drivers and stuff before commencing the real rc stuff. this means -- in the simplest implementation -- having a partition, lets say /dev/da0e with approx. 10 mb size and symlinking it to /dev/bootdevice. then some administration model like linux' lilo has to be run where the image of the boot file system gets assembled somehow. dirty hack: having a directory /bootstage where all the files (loader, rcfiles, kernel, modules) are copied in and cd /bootstage && find .|cpio -o /dev/bootdevice now the loader has to grok cpio or tar format. very stable and convenient way. to be suitable for production use there has to be some kind of selection mechanism for the old setup but that's not a big point in discussion i guess. cheers, /k -- > LET Jesus be YOUR anchor! When Satan rocks your boat, THROW Jesus overboard! 
KR433/KR11-RIPE -- http://www.webmonster.de -- ftp://ftp.webmonster.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sat Mar 24 20:19:39 2001 Delivered-To: freebsd-fs@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id EA35737B719; Sat, 24 Mar 2001 20:19:32 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id OAA14767; Sun, 25 Mar 2001 14:19:19 +1000 Date: Sun, 25 Mar 2001 14:18:43 +1000 (EST) From: Bruce Evans X-Sender: bde@besplex.bde.org To: Dima Dorfman Cc: Gerald Pfeifer , freebsd-hackers@FreeBSD.ORG, freebsd-fs@FreeBSD.ORG Subject: Re: Displaying options for current NFS mounts In-Reply-To: <20010324221539.A025A3E09@bazooka.unixfreak.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Sat, 24 Mar 2001, Dima Dorfman wrote: > I tried to export this stuff in struct statfs, but ran into a problem: > I'd need the complete definitions of _args in , but I > can't include, e.g., because the latter includes the > former ()! mount.h used to know too much about all sorts of filesystems, but this was fixed in 4.4BSD. It is impossible for mount.h or mount(8) to know about all file systems, since filesystems can be dynamically loaded, and ugly for it to know about more than 1 (or 0 -- ffs is too special). > The patch below kind of implements this functionality. I only export > nfs_args (not _args), and I only modified mount(8) to print > the NFS version, but printing the transport and others is simple from > there. To work around the above problem, I pasted the struct nfs_args > definition into mount.h. It is *horribly* ugly, but it does work. Only mount_foofs can reasonably know about the options for foofs. 
Perhaps mount(8) could fork-exec mount_foofs(8) to print options for foofs. Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message
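[Archive note] The fork-exec suggestion in the last message could be sketched roughly as follows. This is an assumption-laden illustration, not code from the thread: the function names are made up, and the flag that would ask a mount_foofs helper to print its options is hypothetical (no such flag is claimed to exist), so it is passed in as a parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Build the helper name "mount_<fstype>" in buf.
 * Returns 0 on success, -1 if the name would not fit.
 */
int
ex_helper_name(char *buf, size_t len, const char *fstype)
{
	int n;

	assert(buf != NULL && fstype != NULL);
	n = snprintf(buf, len, "mount_%s", fstype);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Fork and exec the helper with a (hypothetical) "show options" flag
 * and the mount point.  Returns the helper's exit status, or -1 if
 * fork/wait failed or the helper did not exit normally.
 */
int
ex_show_options(const char *helper, const char *flag, const char *mntpt)
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		/* child: the helper is looked up via PATH */
		execlp(helper, helper, flag, mntpt, (char *)NULL);
		_exit(127);	/* exec failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this split, mount(8) only needs the filesystem type name from statfs; everything filesystem-specific stays inside the helper, which also covers dynamically loaded and third-party filesystems.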