From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 01:23:29 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3467337B401 for ; Sun, 22 Jun 2003 01:23:29 -0700 (PDT) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 21FB343FCB for ; Sun, 22 Jun 2003 01:23:28 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id SAA22520; Sun, 22 Jun 2003 18:23:21 +1000 Date: Sun, 22 Jun 2003 18:23:20 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: David Schultz In-Reply-To: <20030622035258.GB60460@HAL9000.homeunix.com> Message-ID: <20030622180851.K55800@gamplex.bde.org> References: <20030622005124.GA59673@HAL9000.homeunix.com> <20030622035258.GB60460@HAL9000.homeunix.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org Subject: Re: Per-source CFLAGS X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Jun 2003 08:23:29 -0000 On Sat, 21 Jun 2003, David Schultz wrote: > On Sun, Jun 22, 2003, Bruce Evans wrote: > > For this, you really want per-file WARNS, since among other reasons > > compiler-dependent flags shouldn't be put in individual Makefiles. > > ... > > Do you need to turn off all warnings or just ones for non-broken > > precedence and a few other non-broken things? gcc doesn't give > > In this case, we really do want to ignore all the warnings. This > is vendor code, written in a style that makes it easiest for the > author to maintain. But not necessarily easiest for us to maintain. 
We enable some warnings for lots of things under contrib although most
things under contrib are not FreeBSD-warning clean. I realize that gdtoa
is special since it is compiled as part of libc.

> It so happens that -w is a de facto (if not
> de jure) standard; it is supported by the GNU, Intel, and Sun C
> compilers at least.

It's not de jure in POSIX (c99).

> > > # SINGLE SUFFIX RULES
> > > .c:
> > > -    ${CC} ${CFLAGS} ${LDFLAGS} -o ${.TARGET} ${.IMPSRC}
> > > +    ${CC} ${CFLAGS} ${CFLAGS_${.IMPSRC}} ${LDFLAGS} \
> > > +        -o ${.TARGET} ${.IMPSRC}
> > > ...
> >
> > Some rules are specified by POSIX, so they can't be changed. I don't
> > see how ${CFLAGS} can be per-file directly, so the POSIX spec seems to
> > be actively opposed to per-file CFLAGS.
>
> ??? You mean we can't add a variable that will normally expand to
> nil? This seems like a compatible change, unless you're worried
> about someone's makefile breaking because they defined
> CFLAGS_foo.c to mean something else.

From POSIX.1-200x-draft7.txt:

% 23836 Default Rules
% 23837 The default rules for make shall achieve results that are the same as if the following were used.
% ...
% 23864 SINGLE SUFFIX RULES
% 23865 .c:
% 23866     $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $<

This leaves little scope for modifying the default rules.

Bruce From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 01:37:23 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D158B37B401 for ; Sun, 22 Jun 2003 01:37:23 -0700 (PDT) Received: from whale.sunbay.crimea.ua (whale.sunbay.crimea.ua [212.110.138.65]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7896E43F75 for ; Sun, 22 Jun 2003 01:37:20 -0700 (PDT) (envelope-from ru@sunbay.com) Received: from whale.sunbay.crimea.ua (ru@localhost [127.0.0.1]) h5M8bFVd005550 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 22 Jun 2003 11:37:16 +0300 (EEST) (envelope-from ru@sunbay.com) Received: (from ru@localhost) by whale.sunbay.crimea.ua (8.12.9/8.12.8/Submit) id h5M8bFx7005545; Sun, 22 Jun 2003 11:37:15 +0300 (EEST) (envelope-from ru) Date: Sun, 22 Jun 2003 11:37:14 +0300 From: Ruslan Ermilov To: Bruce Evans Message-ID: <20030622083714.GD99674@sunbay.com> References: <20030622005124.GA59673@HAL9000.homeunix.com> <20030622035258.GB60460@HAL9000.homeunix.com> <20030622180851.K55800@gamplex.bde.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="AkbCVLjbJ9qUtAXD" Content-Disposition: inline In-Reply-To: <20030622180851.K55800@gamplex.bde.org> User-Agent: Mutt/1.5.4i cc: arch@freebsd.org cc: David Schultz Subject: Re: Per-source CFLAGS X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Jun 2003 08:37:24 -0000 --AkbCVLjbJ9qUtAXD Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Jun 22, 2003 at 06:23:20PM +1000, Bruce Evans wrote: [...] > > > Some rules are specified by POSIX, so they can't be changed. 
I don't
> > > see how ${CFLAGS} can be per-file directly, so the POSIX spec seems to
> > > be actively opposed to per-file CFLAGS.
> >
> > ??? You mean we can't add a variable that will normally expand to
> > nil? This seems like a compatible change, unless you're worried
> > about someone's makefile breaking because they defined
> > CFLAGS_foo.c to mean something else.
>
> From POSIX.1-200x-draft7.txt:
>
> % 23836 Default Rules
> % 23837 The default rules for make shall achieve results that are the same as if the following were used.
> % ...
> % 23864 SINGLE SUFFIX RULES
> % 23865 .c:
> % 23866     $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $<
>
> This leaves little scope for modifying the default rules.
>

A double suffix rule would be more appropriate here:

23883 DOUBLE SUFFIX RULES
23884 .c.o:
23885     $(CC) $(CFLAGS) -c $<
23886 .f.o:
23887     $(FC) $(FFLAGS) -c $<
23888 .y.o:
23889     $(YACC) $(YFLAGS) $<
23890     $(CC) $(CFLAGS) -c y.tab.c
23891     rm -f y.tab.c
23892     mv y.tab.o $@

Anyway, this only means we should not add the support for per-source
CFLAGS to the %POSIX section of sys.mk.

I still have some concerns with the proposed implementation.
All already existing per-file knobs override the global knob,
and I think that maybe the per-source CFLAGS should behave
the same?
Doing it this way is more flexible; you're free
to augment the global CFLAGS by saying

    CFLAGS_foo.c= ${CFLAGS} -DFOO

Cheers,
-- 
Ruslan Ermilov		Sysadmin and DBA,
ru@sunbay.com		Sunbay Software Ltd,
ru@FreeBSD.org		FreeBSD committer

--AkbCVLjbJ9qUtAXD
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (FreeBSD)

iD8DBQE+9Wq6Ukv4P6juNwoRApwXAJ40ttjhe6+nuLfil2gC/MeRK5r8ggCfUDZn
HZqwKWaN0OF9XbC8T1b7Mk0=
=zxou
-----END PGP SIGNATURE-----

--AkbCVLjbJ9qUtAXD--

From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 01:50:24 2003
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3252137B401 for ; Sun, 22 Jun 2003 01:50:24 -0700 (PDT)
Received: from HAL9000.homeunix.com (ip114.bella-vista.sfo.interquest.net [66.199.86.114]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8C1D143FA3 for ; Sun, 22 Jun 2003 01:50:23 -0700 (PDT) (envelope-from dschultz@OCF.Berkeley.EDU)
Received: from HAL9000.homeunix.com (localhost [127.0.0.1]) by HAL9000.homeunix.com (8.12.9/8.12.9) with ESMTP id h5M8oKJa062066; Sun, 22 Jun 2003 01:50:20 -0700 (PDT) (envelope-from dschultz@OCF.Berkeley.EDU)
Received: (from das@localhost) by HAL9000.homeunix.com (8.12.9/8.12.9/Submit) id h5M8oKiJ062065; Sun, 22 Jun 2003 01:50:20 -0700 (PDT) (envelope-from dschultz@OCF.Berkeley.EDU)
Date: Sun, 22 Jun 2003 01:50:20 -0700
From: David Schultz
To: Bruce Evans
Message-ID: <20030622085020.GA61926@HAL9000.homeunix.com>
Mail-Followup-To: Bruce Evans , arch@freebsd.org
References: <20030622005124.GA59673@HAL9000.homeunix.com> <20030622114150.L54976@gamplex.bde.org> <20030622035258.GB60460@HAL9000.homeunix.com> <20030622180851.K55800@gamplex.bde.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030622180851.K55800@gamplex.bde.org>
cc: arch@freebsd.org
Subject: Re: Per-source CFLAGS
X-BeenThere:
freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Jun 2003 08:50:24 -0000 On Sun, Jun 22, 2003, Bruce Evans wrote: > On Sat, 21 Jun 2003, David Schultz wrote: > > ??? You mean we can't add a variable that will normally expand to > > nil? This seems like a compatible change, unless you're worried > > about someone's makefile breaking because they defined > > CFLAGS_foo.c to mean something else. > > >From POSIX.1-200x-draft7.txt: > > % 23836 Default Rules > % 23837 The default rules for make shall achieve results that are the same as if the following were used. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > % ... > % 23864 SINGLE SUFFIX RULES > % 23865 .c: > % 23866 $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $< > > This leaves little scope for modifying the default rules. The results *are* the same with the added ${CFLAGS_$<}, with the exception of the extra space in the argument list, and I don't think that's what the POSIX people were thinking. Is there a specific problem that this patch would cause for people expecting standards-compliant make magic (other than a name conflict)? By the way, is your only complaint that I should not be making this modification in sys.mk? I'd be perfectly happy to remove that part. I really only care about bsd.lib.mk at the moment, and the rest was a hasty afterthought for completeness' sake. To do a complete job without touching sys.mk, it looks like I would need to duplicate a number of default rules in bsd.prog.mk, though... 
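
As a rough illustration of the variable under discussion (a sketch only, not the actual patch; the file name gdtoa.c and the -w value are assumed for the example, and ${.IMPSRC} is assumed to carry no directory component), a BSD make fragment in the style of the quoted diff might look like this:

```make
# Hypothetical Makefile fragment; gdtoa.c and -w are illustrative only.
# CFLAGS_<source> is looked up per file and is empty when not defined.
CFLAGS_gdtoa.c=	-w

.c.o:
	${CC} ${CFLAGS} ${CFLAGS_${.IMPSRC}} -c ${.IMPSRC}
```

For any source without a matching CFLAGS_<source> definition the nested variable expands to nothing, so the compile command is the same as before apart from the extra space.
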
From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 01:59:27 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6376837B401; Sun, 22 Jun 2003 01:59:27 -0700 (PDT) Received: from HAL9000.homeunix.com (ip114.bella-vista.sfo.interquest.net [66.199.86.114]) by mx1.FreeBSD.org (Postfix) with ESMTP id BEDE143F85; Sun, 22 Jun 2003 01:59:26 -0700 (PDT) (envelope-from dschultz@OCF.Berkeley.EDU) Received: from HAL9000.homeunix.com (localhost [127.0.0.1]) by HAL9000.homeunix.com (8.12.9/8.12.9) with ESMTP id h5M8xNJa062114; Sun, 22 Jun 2003 01:59:24 -0700 (PDT) (envelope-from dschultz@OCF.Berkeley.EDU) Received: (from das@localhost) by HAL9000.homeunix.com (8.12.9/8.12.9/Submit) id h5M8xNOF062113; Sun, 22 Jun 2003 01:59:23 -0700 (PDT) (envelope-from dschultz@OCF.Berkeley.EDU) Date: Sun, 22 Jun 2003 01:59:23 -0700 From: David Schultz To: Ruslan Ermilov Message-ID: <20030622085923.GA62034@HAL9000.homeunix.com> Mail-Followup-To: Ruslan Ermilov , Bruce Evans , arch@FreeBSD.org References: <20030622005124.GA59673@HAL9000.homeunix.com> <20030622035258.GB60460@HAL9000.homeunix.com> <20030622180851.K55800@gamplex.bde.org> <20030622083714.GD99674@sunbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030622083714.GD99674@sunbay.com> cc: arch@FreeBSD.org Subject: Re: Per-source CFLAGS X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Jun 2003 08:59:27 -0000 On Sun, Jun 22, 2003, Ruslan Ermilov wrote: > I still have some concerns with the proposed implementation. > All already existing per-file knobs override the global knob, > and I think that maybe the per-source CFLAGS should behave > the same? 
Doing it this way is more flexible; you're free > to augment the global CFLAGS by saying > > CFLAGS_foo.c= ${CFLAGS} -DFOO Overriding the global settings is kinda the point, but FWIW, your proposal is indeed more flexible. But can you think of a way to implement it that doesn't involve generating a CFLAGS_foo.c variable for every file? From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 03:02:53 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9E92E37B401 for ; Sun, 22 Jun 2003 03:02:53 -0700 (PDT) Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9D6AA43F93 for ; Sun, 22 Jun 2003 03:02:52 -0700 (PDT) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id h5MA2qDZ089791 for ; Sun, 22 Jun 2003 03:02:52 -0700 (PDT) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.9/8.12.9) with ESMTP id h5MA2pSx082898 for ; Sun, 22 Jun 2003 03:02:51 -0700 (PDT) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.9/8.12.9/Submit) id h5MA2om4082897 for arch@freebsd.org; Sun, 22 Jun 2003 03:02:50 -0700 (PDT) (envelope-from marcel) Date: Sun, 22 Jun 2003 03:02:49 -0700 From: Marcel Moolenaar To: arch@freebsd.org Message-ID: <20030622100249.GA82703@dhcp01.pn.xcllnt.net> References: <20030622005124.GA59673@HAL9000.homeunix.com> <20030622045529.GA80446@dhcp01.pn.xcllnt.net> <20030622064521.GA61030@HAL9000.homeunix.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030622064521.GA61030@HAL9000.homeunix.com> User-Agent: Mutt/1.5.4i Subject: Re: Per-source CFLAGS X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 
2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 22 Jun 2003 10:02:53 -0000

On Sat, Jun 21, 2003 at 11:45:22PM -0700, David Schultz wrote:
> >
> > Per file compilation options are in direct conflict with make
> > invocator control, by way of it being a makefile writer knob.
> > Put differently: it's a feature for developers, not builders.
> > We already see the problem with that when we define CFLAGS on
> > the make command line, rather than in the environment. I'm
> > not opposed to per-file options, but it seems to push the
> > need to split make invocator knobs from makefile writer knobs.
> > Until we have such separation, I request that per-file options
> > be made conditional so that make invocators still have control
> > without being powerless.
>
> I expect that this feature would not be used except in very
> special cases, and I would be opposed to gratuitous use of it.

My expectations are less rose-coloured :-)

> In fact, most of these cases are so special that the relevant
> file probably won't even work without the extra option.
> For example, Peter mentioned a while ago that vfprintf.c was
> causing an ICE unless -O was turned off.

A #pragma around the function is normally a good way to avoid
file-scoped pessimisation.

> Since these things are only used selectively, it only makes sense
> to disable them selectively. For instance, if we set it on two
> files to temporarily work around a gcc bug, and on another file
> because it's vendor code that we don't want to see warnings for, a
> big knob that says ``Turn off all the special cases'' wouldn't
> make much sense. However, if what you're looking for is the
> ability to say
> GDTOA_WARNS=YES
> in your make.conf, that can certainly be done on a case by case
> basis.

It's exactly what I don't want. It's control without power.
Think what a build invocator has to figure out if he wants a
uniform build and there are hundreds of makefiles, all possibly
containing specialized compiler options on a per-file basis.

> Would this satisfy your concerns?

Not really, but don't worry about it. As I said, I'm not opposed
to the feature. There are legit cases where it's useful. If it's
causing us problems, we'll just deal with it then and there. It's
just that I've already been there, and I'm not in a hurry to go
back :-)

-- 
Marcel Moolenaar	USPA: A-39004	marcel@xcllnt.net

From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 03:27:51 2003
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F206037B401 for ; Sun, 22 Jun 2003 03:27:50 -0700 (PDT)
Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id AD07D43F85 for ; Sun, 22 Jun 2003 03:27:49 -0700 (PDT) (envelope-from bde@zeta.org.au)
Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id UAA31953; Sun, 22 Jun 2003 20:27:42 +1000
Date: Sun, 22 Jun 2003 20:27:41 +1000 (EST)
From: Bruce Evans
X-X-Sender: bde@gamplex.bde.org
To: David Schultz
In-Reply-To: <20030622085020.GA61926@HAL9000.homeunix.com>
Message-ID: <20030622200641.D56263@gamplex.bde.org>
References: <20030622005124.GA59673@HAL9000.homeunix.com> <20030622035258.GB60460@HAL9000.homeunix.com> <20030622085020.GA61926@HAL9000.homeunix.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: arch@freebsd.org
Subject: Re: Per-source CFLAGS
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 22 Jun 2003 10:27:51 -0000

On Sun, 22 Jun 2003, David Schultz wrote:

> On Sun, Jun 22, 2003,
Bruce Evans wrote:
> > On Sat, 21 Jun 2003, David Schultz wrote:
> > > ??? You mean we can't add a variable that will normally expand to
> > > nil? This seems like a compatible change, unless you're worried
> > > about someone's makefile breaking because they defined
> > > CFLAGS_foo.c to mean something else.
> >
> > From POSIX.1-200x-draft7.txt:
> >
> > % 23836 Default Rules
> > % 23837 The default rules for make shall achieve results that are the same as if the following were used.
>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > % ...
> > % 23864 SINGLE SUFFIX RULES
> > % 23865 .c:
> > % 23866     $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $<
> >
> > This leaves little scope for modifying the default rules.
>
> The results *are* the same with the added ${CFLAGS_$<}, with the
> exception of the extra space in the argument list, and I don't
> think that's what the POSIX people were thinking. Is there a

The space is not part of the results :-).

> specific problem that this patch would cause for people expecting
> standards-compliant make magic (other than a name conflict)?

The spec says that a macro named CFLAGS_foo.c cannot modify the rule
unless the application makefile adds it to CFLAGS itself. Adding
magic macros to CFLAGS breaks this.

Anyway, it's bogus to change the separate set of POSIX rules, since the
whole point of having a separate set is for the POSIX rules to not be
affected by FreeBSDisms.

> By the way, is your only complaint that I should not be making
> this modification in sys.mk? I'd be perfectly happy to remove
> that part. I really only care about bsd.lib.mk at the moment, and

I only care about the POSIX part of sys.mk for now. I think the change
isn't too obtrusive to put in sys.mk provided there is a way to avoid it,
and the POSIX part gives such a way.

Bruce From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 20:10:40 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A328C37B401 for ; Sun, 22 Jun 2003 20:10:40 -0700 (PDT) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id D997143F3F for ; Sun, 22 Jun 2003 20:10:39 -0700 (PDT) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (localhost [127.0.0.1]) by fledge.watson.org (8.12.9/8.12.9) with ESMTP id h5N3AMKJ048958; Sun, 22 Jun 2003 23:10:22 -0400 (EDT) (envelope-from robert@fledge.watson.org) Received: from localhost (robert@localhost)h5N3ABDp048953; Sun, 22 Jun 2003 23:10:11 -0400 (EDT) (envelope-from robert@fledge.watson.org) Date: Sun, 22 Jun 2003 23:10:10 -0400 (EDT) From: Robert Watson X-Sender: robert@fledge.watson.org To: John-Mark Gurney In-Reply-To: <20030621011002.GG15336@funkthat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org Subject: Re: make /dev/pci really readable X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jun 2003 03:10:40 -0000 On Fri, 20 Jun 2003, John-Mark Gurney wrote: > John-Mark Gurney wrote this message on Mon, Jun 16, 2003 at 22:29 -0700: > > Bruce Evans wrote this message on Tue, Jun 17, 2003 at 12:36 +1000: > > > On Mon, 16 Jun 2003, Robert Watson wrote: > > > > It looks like (although I haven't tried), user processes can > > > > also cause the kernel to allocate unlimited amounts of kernel memory, > > > > which is another bit we probably need to tighten down. > > > > > > Much more serious. > > > > Yep, the pattern_buf is allocated, and in some cases a berak happens > > w/o freeing it. 
So there is a memory leak her. Will be fixed soon. > > Ok, I think I have a good patch. It's attached. Fixes the memory leak. > I have also fix the pci manpage to talk about the errors, but it isn't > included in the patch. Per my earlier and out-of-band comments, the /dev/pci code could use some further robustness improvements. In particular, make sure that the code is careful to validate all user arguments for sensibility, such as the issue regarding the allocation of unlimited amounts of kernel memory that I raised earlier. I think we're close to this being safe, but need to take it carefully. This code was clearly not designed to be exposed to untrusted users... Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories From owner-freebsd-arch@FreeBSD.ORG Sun Jun 22 20:32:06 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E39FE37B401; Sun, 22 Jun 2003 20:32:06 -0700 (PDT) Received: from mail.cyberonic.com (mail.cyberonic.com [4.17.179.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9F87743FA3; Sun, 22 Jun 2003 20:32:03 -0700 (PDT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (node-40244c0a.sfo.onnet.us.uu.net [64.36.76.10]) by mail.cyberonic.com (8.12.8/8.12.5) with ESMTP id h5N3xfMo004649; Sun, 22 Jun 2003 23:59:42 -0400 Received: (from jmg@localhost) by hydrogen.funkthat.com (8.12.9/8.11.6) id h5N3WJOb076942; Sun, 22 Jun 2003 20:32:19 -0700 (PDT) (envelope-from jmg) Date: Sun, 22 Jun 2003 20:32:19 -0700 From: John-Mark Gurney To: Robert Watson Message-ID: <20030623033219.GI57612@funkthat.com> Mail-Followup-To: Robert Watson , Bruce Evans , arch@freebsd.org References: <20030621011002.GG15336@funkthat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-Operating-System: FreeBSD 
4.2-RELEASE i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html cc: arch@freebsd.org Subject: Re: make /dev/pci really readable X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: John-Mark Gurney List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jun 2003 03:32:07 -0000 Robert Watson wrote this message on Sun, Jun 22, 2003 at 23:10 -0400: > > On Fri, 20 Jun 2003, John-Mark Gurney wrote: > > > John-Mark Gurney wrote this message on Mon, Jun 16, 2003 at 22:29 -0700: > > > Bruce Evans wrote this message on Tue, Jun 17, 2003 at 12:36 +1000: > > > > On Mon, 16 Jun 2003, Robert Watson wrote: > > > > > It looks like (although I haven't tried), user processes can > > > > > also cause the kernel to allocate unlimited amounts of kernel memory, > > > > > which is another bit we probably need to tighten down. > > > > > > > > Much more serious. > > > > > > Yep, the pattern_buf is allocated, and in some cases a berak happens > > > w/o freeing it. So there is a memory leak her. Will be fixed soon. > > > > Ok, I think I have a good patch. It's attached. Fixes the memory leak. > > I have also fix the pci manpage to talk about the errors, but it isn't > > included in the patch. > > Per my earlier and out-of-band comments, the /dev/pci code could use some > further robustness improvements. In particular, make sure that the code > is careful to validate all user arguments for sensibility, such as the > issue regarding the allocation of unlimited amounts of kernel memory that > I raised earlier. I think we're close to this being safe, but need to > take it carefully. This code was clearly not designed to be exposed to > untrusted users... Ok, yes, I missed that one. 
I have committed a fix for that problem. I just
did a double check, and I don't see any more unchecked user input.

The memory leak I thought you were talking about was the part that wasn't
freeing memory that was allocated (and bounded by an unvalidated variable).

Do you want me to reverse the permission check? or?

-- 
John-Mark Gurney				Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."

From owner-freebsd-arch@FreeBSD.ORG Mon Jun 23 06:25:50 2003
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C9E6F37B401 for ; Mon, 23 Jun 2003 06:25:50 -0700 (PDT)
Received: from mail1.belgacom.be (mail1.belgacom.be [195.13.15.36]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2729143F93 for ; Mon, 23 Jun 2003 06:25:49 -0700 (PDT) (envelope-from david.huysmans@belgacom.be)
Received: from AEV003.BGC.NET ([45.216.116.102]) by mail1.belgacom.be with SMTP id h5NDPk324759 for ; Mon, 23 Jun 2003 13:25:46 GMT
Received: from 45.216.116.50 by AEV003.BGC.NET (InterScan E-Mail VirusWall NT); Mon, 23 Jun 2003 15:25:37 +0200
content-class: urn:content-classes:message
Date: Mon, 23 Jun 2003 15:25:32 +0200
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPartTM-000-c8bb4eb4-f132-4d8d-a3eb-e377c247d400"
Message-ID: <73EECD58E1AED211BD940008C75DE6AA0C9B41C2@atgx41.bc>
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0
Thread-Topic: little question
Thread-Index: AcM5iu1Y8ZYHunakRm66VW+EgPjuhA==
From:
To:
X-Mailman-Approved-At: Mon, 23 Jun 2003 06:39:14 -0700
Subject: little question
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 23 Jun 2003 13:25:51 -0000

This is a multi-part message in MIME format.

------=_NextPartTM-000-c8bb4eb4-f132-4d8d-a3eb-e377c247d400
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

are you going to make a HP-PA risc version ??? 32bit

------=_NextPartTM-000-c8bb4eb4-f132-4d8d-a3eb-e377c247d400
Content-Type: text/plain; name="Disclaimer_Belgacom.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="Disclaimer_Belgacom.txt"

***** DISCLAIMER *****
"This e-mail and any attachments thereto may contain information
which is confidential and/or protected by intellectual property
rights and are intended for the sole use of the recipient(s) named above.
Any use of the information contained herein (including, but not limited to,
total or partial reproduction, communication or distribution in any form)
by persons other than the designated recipient(s) is prohibited.
If you have received this e-mail in error, please notify the sender either
by telephone or by e-mail and delete the material from any computer.
Thank you for your cooperation."

------=_NextPartTM-000-c8bb4eb4-f132-4d8d-a3eb-e377c247d400--

From owner-freebsd-arch@FreeBSD.ORG Mon Jun 23 07:15:05 2003
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 225F137B404 for ; Mon, 23 Jun 2003 07:15:04 -0700 (PDT)
Received: from comp.chem.msu.su (comp-ext.chem.msu.su [158.250.32.157]) by mx1.FreeBSD.org (Postfix) with ESMTP id D54E543FCB for ; Mon, 23 Jun 2003 07:15:00 -0700 (PDT) (envelope-from yar@comp.chem.msu.su)
Received: from comp.chem.msu.su (localhost [127.0.0.1]) by comp.chem.msu.su (8.12.3p2/8.12.3) with ESMTP id h5NEEvpV082270 for ; Mon, 23 Jun 2003 18:14:57 +0400 (MSD) (envelope-from yar@comp.chem.msu.su)
Received: (from yar@localhost) by comp.chem.msu.su (8.12.3p2/8.12.3/Submit) id h5NEEvwI082269 for arch@freebsd.org; Mon, 23 Jun 2003 18:14:57 +0400 (MSD) (envelope-from yar)
Date: Mon, 23 Jun 2003 18:14:56 +0400
From: Yar Tikhiy
To: arch@freebsd.org
Message-ID: <20030623141456.GA79865@comp.chem.msu.su>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.3i
Subject: VOP_RENAME(9) arguments question
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 23 Jun 2003 14:15:05 -0000

Hi there,

I've seen code in the VOP_RENAME() vop implementations handling
various cases of some of fvp, fdvp, tvp, and tdvp being different,
or the same, or NULL. Some of these cases seem clear to me:

	tvp == NULL -- the destination file doesn't exist yet
	tdvp == fdvp -- renaming within the same directory
	tvp == fvp -- source and destination are hardlinks to the same file

But what does the case of "tvp == tdvp" mean?
I guess this can be true only if renaming to the root directory, which will fail, yet is a special case for unlocking tvp and tdvp. Can there be other special cases of arguments to VOP_RENAME(9) a developer should be aware of? -- Yar From owner-freebsd-arch@FreeBSD.ORG Mon Jun 23 15:22:53 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B8D8B37B401; Mon, 23 Jun 2003 15:22:53 -0700 (PDT) Received: from kientzle.com (h-66-166-149-50.SNVACAID.covad.net [66.166.149.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 306D543F3F; Mon, 23 Jun 2003 15:22:53 -0700 (PDT) (envelope-from kientzle@acm.org) Received: from acm.org (big.x.kientzle.com [66.166.149.54]) by kientzle.com (8.12.9/8.12.9) with ESMTP id h5NMMqtJ078254; Mon, 23 Jun 2003 15:22:52 -0700 (PDT) (envelope-from kientzle@acm.org) Message-ID: <3EF77E52.6070007@acm.org> Date: Mon, 23 Jun 2003 15:25:22 -0700 From: Tim Kientzle User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0.1) Gecko/20021005 X-Accept-Language: en-us, en MIME-Version: 1.0 To: arch@freebsd.org, ru@freebsd.org Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Subject: Proposal: execvP X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: kientzle@acm.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jun 2003 22:22:54 -0000 I've encountered a couple of places now where I could really use an exec* function that is identical to execvp(3), except that it accepts a path specification instead of automatically using the PATH environment variable. For lack of a better name, I propose adding the following to lib/libc/gen/exec.c: /* Exec 'file', searching the specified path. 
*/
int execvP(const char *file, const char *path, char *const argv[]);

The implementation itself is trivial; a three-line edit converts the
existing execvp() into execvP(), and then execvp() gets a new
implementation as follows:

int
execvp(const char *file, char *const argv[])
{
	const char *path;

	path = getenv("PATH");
	if (!path)
		path = _PATH_DEFPATH;
	return execvP(file, path, argv);
}

In essence, execvP() is merely publishing an already-existing capability
within the library by breaking execvp() into two very natural pieces.
Without this, I basically will have to copy a slightly modified version
of execvp() into several utilities, which seems a rather pointless
exercise.

Thoughts?

Tim Kientzle

From owner-freebsd-arch@FreeBSD.ORG Mon Jun 23 15:41:00 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 110C537B401; Mon, 23 Jun 2003 15:41:00 -0700 (PDT) Received: from mail1.panix.com (mail1.panix.com [166.84.1.72]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7D78A43F3F; Mon, 23 Jun 2003 15:40:59 -0700 (PDT) (envelope-from rsi@panix.com) Received: from panix1.panix.com (panix1.panix.com [166.84.1.1]) by mail1.panix.com (Postfix) with ESMTP id BC69B4871E; Mon, 23 Jun 2003 18:40:58 -0400 (EDT) Received: (from rsi@localhost) by panix1.panix.com (8.11.6p2/8.8.8/PanixN1.1) id h5NMewU27649; Mon, 23 Jun 2003 18:40:58 -0400 (EDT) Message-Id: <200306232240.h5NMewU27649@panix1.panix.com> Sender: rsi@panix.com To: kientzle@acm.org References: <3EF77E52.6070007@acm.org> From: Rajappa Iyer Date: 23 Jun 2003 15:40:58 -0700 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii cc: arch@freebsd.org Subject: Re: Proposal: execvP X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jun 2003
22:41:00 -0000 Tim Kientzle writes: > In essence, execvP() is merely publishing an already-existing capability > within the library by breaking execvp() into two very natural pieces. > Without this, I basically will have to copy a slightly modified version > of execvp() into several utilities, which seems a rather pointless > exercise. What's wrong with putenv("PATH=newpath"); execvp(...); What am I missing? rsi -- a.k.a. Rajappa Iyer. Absinthe makes the tart grow fonder. From owner-freebsd-arch@FreeBSD.ORG Mon Jun 23 15:42:19 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EC2D437B401; Mon, 23 Jun 2003 15:42:19 -0700 (PDT) Received: from kientzle.com (h-66-166-149-50.SNVACAID.covad.net [66.166.149.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4E57643F3F; Mon, 23 Jun 2003 15:42:19 -0700 (PDT) (envelope-from kientzle@acm.org) Received: from acm.org (big.x.kientzle.com [66.166.149.54]) by kientzle.com (8.12.9/8.12.9) with ESMTP id h5NMgJtJ078333; Mon, 23 Jun 2003 15:42:19 -0700 (PDT) (envelope-from kientzle@acm.org) Message-ID: <3EF782E0.2050803@acm.org> Date: Mon, 23 Jun 2003 15:44:48 -0700 From: Tim Kientzle User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0.1) Gecko/20021005 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Rajappa Iyer References: <3EF77E52.6070007@acm.org> <200306232240.h5NMewU27649@panix1.panix.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: arch@freebsd.org Subject: Re: Proposal: execvP X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: kientzle@acm.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jun 2003 22:42:20 -0000 Rajappa Iyer wrote: > Tim Kientzle writes: >>In essence, execvP() is merely publishing an 
already-existing capability >>within the library by breaking execvp() into two very natural pieces. >>Without this, I basically will have to copy a slightly modified version >>of execvp() into several utilities, which seems a rather pointless >>exercise. > > > What's wrong with > > putenv("PATH=newpath"); > execvp(...); > > What am I missing? > > rsi Your suggested code changes the PATH that is used by child processes. Tim From owner-freebsd-arch@FreeBSD.ORG Mon Jun 23 16:47:04 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 13FDF37B401 for ; Mon, 23 Jun 2003 16:47:04 -0700 (PDT) Received: from obsecurity.dyndns.org (adsl-64-169-104-32.dsl.lsan03.pacbell.net [64.169.104.32]) by mx1.FreeBSD.org (Postfix) with ESMTP id 43B5C43F75 for ; Mon, 23 Jun 2003 16:47:03 -0700 (PDT) (envelope-from kris@obsecurity.org) Received: from rot13.obsecurity.org (rot13.obsecurity.org [10.0.0.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id 0910866BE5; Mon, 23 Jun 2003 16:47:02 -0700 (PDT) Received: by rot13.obsecurity.org (Postfix, from userid 1000) id DF459B84; Mon, 23 Jun 2003 16:47:01 -0700 (PDT) Date: Mon, 23 Jun 2003 16:47:01 -0700 From: Kris Kennaway To: david.huysmans@belgacom.be Message-ID: <20030623234701.GA16515@rot13.obsecurity.org> References: <73EECD58E1AED211BD940008C75DE6AA0C9B41C2@atgx41.bc> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="tThc/1wpZn/ma/RB" Content-Disposition: inline In-Reply-To: <73EECD58E1AED211BD940008C75DE6AA0C9B41C2@atgx41.bc> User-Agent: Mutt/1.4.1i cc: freebsd-arch@FreeBSD.org Subject: Re: little question X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jun 2003 23:47:05 -0000 
--tThc/1wpZn/ma/RB Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Jun 23, 2003 at 03:25:32PM +0200, david.huysmans@belgacom.be wrote: > are you going to make a HP-PA risc version ??? 32bit=20 Extremely unlikely. Try NetBSD. Kris --tThc/1wpZn/ma/RB Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (FreeBSD) iD8DBQE+95F1Wry0BWjoQKURAgXeAJ0UPNnq+FacBW/bAfClde1YMXO0mQCfU6IQ lOEd5wahEaGdVBRScU5OjjY= =M5ap -----END PGP SIGNATURE----- --tThc/1wpZn/ma/RB-- From owner-freebsd-arch@FreeBSD.ORG Tue Jun 24 04:51:45 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A3AF837B401 for ; Tue, 24 Jun 2003 04:51:45 -0700 (PDT) Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1473843F3F for ; Tue, 24 Jun 2003 04:51:43 -0700 (PDT) (envelope-from dcs@tcoip.com.br) Received: from tcoip.com.br ([10.0.2.6]) by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h5OBpWM11546; Tue, 24 Jun 2003 08:51:33 -0300 Message-ID: <3EF83B43.3000100@tcoip.com.br> Date: Tue, 24 Jun 2003 08:51:31 -0300 From: "Daniel C. 
Sobral" User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.4b) Gecko/20030606 X-Accept-Language: en-us, en, pt-br, ja MIME-Version: 1.0 To: Erik Trulsson References: <200306151406.aa36218@salmon.maths.tcd.ie> <200306151826.h5FIPvM7046944@gw.catspoiler.org> <20030615190209.GA75458@falcon.midgard.homeip.net> In-Reply-To: <20030615190209.GA75458@falcon.midgard.homeip.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-arch@freebsd.org Subject: Re: Message buffer and printf reentrancy patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Jun 2003 11:51:45 -0000 Erik Trulsson wrote: > > With a C99 compiler it is always true. In C89 it was implementation > defined if integer division rounded towards zero or towards > negative-infinity. In C99 integer division always rounds towards zero. > This combined with the fact that (a/b)*b + a%b == a is always true (for > integer a,b and b!=0) means that (neg_int % pos_int <= 0 ) is always > true in C99, while it wasn't always true in C89. Heh. I recall when ANS Forth was being discussed that people found it silly all the discussion around whether signed integer division was symmetric or floored, when even C didn't bother with it. :-) (And, if you are interested in such things, the problem was that Forth-79 was symmetric, Forth-83 was floored (or vice versa -- I don't know); the solution was to leave / as implementation defined and creating two new operators: one symmetric and one floored. :) -- Daniel C. 
Sobral (8-DCS) Gerencia de Operacoes Divisao de Comunicacao de Dados Coordenacao de Seguranca VIVO Centro Oeste Norte Fones: 55-61-313-7654/Cel: 55-61-9618-0904 E-mail: Daniel.Capo@tco.net.br Daniel.Sobral@tcoip.com.br dcs@tcoip.com.br Outros: dcs@newsguy.com dcs@freebsd.org capo@notorious.bsdconspiracy.net I finally got it all together ... but I forgot where I put it. From owner-freebsd-arch@FreeBSD.ORG Tue Jun 24 09:43:03 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D38B537B401 for ; Tue, 24 Jun 2003 09:43:03 -0700 (PDT) Received: from milla.ask33.net (milla.ask33.net [217.197.166.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1D00443F93 for ; Tue, 24 Jun 2003 09:42:52 -0700 (PDT) (envelope-from nick@milla.ask33.net) Received: by milla.ask33.net (Postfix, from userid 1001) id F29653ABB53; Tue, 24 Jun 2003 18:46:02 +0200 (CEST) Date: Tue, 24 Jun 2003 18:46:02 +0200 From: Pawel Jakub Dawidek To: freebsd-arch@freebsd.org Message-ID: <20030624164602.GW7587@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="KZCIPwrNpw38UenM" Content-Disposition: inline X-PGP-Key-URL: http://garage.freebsd.pl/jules.asc X-OS: FreeBSD 4.8-RELEASE i386 X-URL: http://garage.freebsd.pl User-Agent: Mutt/1.5.1i Subject: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Jun 2003 16:43:04 -0000 --KZCIPwrNpw38UenM Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hello. Some time ago I've implemented private memory zones for IPC mechism. Every jail and main host got its own memory for IPC operations. 
It was implemented for FreeBSD 4.x. Avaliable at: http://garage.freebsd.pl/privipc.tbz http://garage.freebsd.pl/privipc.README I want to port this to FreeBSD 5.x, but with many improvements. Because of that there are few things to talk about and I'm curious if anyone will be interested in answering my questions and at the end commiting this to -CURRENT. Patch will not be a "fast hack" so the best way will be commiting this in parts. I got already working sysvipv_msg mechanism. So if anyone is interested in, please inform me and I'll ask my questions and I'll send also what I got now. Thanks! --=20 Pawel Jakub Dawidek pawel@dawidek.net UNIX Systems Programmer/Administrator http://garage.freebsd.pl Am I Evil? Yes, I Am! http://cerber.sourceforge.net --KZCIPwrNpw38UenM Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (FreeBSD) iQCVAwUBPviASj/PhmMH/Mf1AQF3RgQAmNu52v0zX7bKHFuWhbYIybpuh6cI4Ua6 mwnpi6p1k4h2Irn/QDJNR5/kR/6mjkjXMhCuSoJCHbwSJMu2W3hpaHReUCjlCpps xggR+vbcELWcK3i3OECknRFIm2bON3l9ZLza+bKpoxrn5WNI58/ueXeHfkE6sUPM g5nOjX1JfSE= =sSZw -----END PGP SIGNATURE----- --KZCIPwrNpw38UenM-- From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 05:22:13 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 799AD37B401 for ; Wed, 25 Jun 2003 05:22:13 -0700 (PDT) Received: from milla.ask33.net (milla.ask33.net [217.197.166.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9CACF43FFD for ; Wed, 25 Jun 2003 05:22:07 -0700 (PDT) (envelope-from nick@milla.ask33.net) Received: by milla.ask33.net (Postfix, from userid 1001) id 495FE3ABB53; Wed, 25 Jun 2003 14:25:39 +0200 (CEST) Date: Wed, 25 Jun 2003 14:25:39 +0200 From: Pawel Jakub Dawidek To: Max Khon Message-ID: <20030625122539.GI7587@garage.freebsd.pl> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625112130.GA72312@iclub.nsu.ru> Mime-Version: 
1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="+bs7B30DeWCM5QK8" Content-Disposition: inline In-Reply-To: <20030625112130.GA72312@iclub.nsu.ru> X-PGP-Key-URL: http://garage.freebsd.pl/jules.asc X-OS: FreeBSD 4.8-RELEASE i386 X-URL: http://garage.freebsd.pl User-Agent: Mutt/1.5.1i cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 12:22:13 -0000

--+bs7B30DeWCM5QK8
Content-Type: text/plain; charset=iso-8859-2
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Jun 25, 2003 at 06:21:30PM +0700, Max Khon wrote:
+> > Some time ago I've implemented private memory zones for IPC mechism.
+> > Every jail and main host got its own memory for IPC operations.
+> > It was implemented for FreeBSD 4.x. Avaliable at:
+> >
+> > http://garage.freebsd.pl/privipc.tbz
+> > http://garage.freebsd.pl/privipc.README
+> >
+> > I want to port this to FreeBSD 5.x, but with many improvements.
+> > Because of that there are few things to talk about and I'm curious if
+> > anyone will be interested in answering my questions and at the end
+> > commiting this to -CURRENT.
+> >
+> > Patch will not be a "fast hack" so the best way will be commiting this
+> > in parts. I got already working sysvipv_msg mechanism.
+> >
+> > So if anyone is interested in, please inform me and I'll ask my
+> > questions and I'll send also what I got now.
+>
+> I'm interested in reviewing and committing this stuff

Thanks.

So first of all, I implemented something like allocate-on-demand.
Memory zones are allocated only when an IPC syscall is called from
inside a jail. This is the best way, I think, because:

1. We don't allocate memory if it isn't needed.
2.
We don't have to fight with locking the prison list when loading IPC
as a kld module (allocating memory while the lock is held, ehh).

I'm also proposing to create mirrors of these values:

	security.jail.ipc.msgmax
	security.jail.ipc.msgmni
	security.jail.ipc.msgmnb
	security.jail.ipc.msgtql
	security.jail.ipc.msgssz
	security.jail.ipc.msgseg

They will always be read-write and will be used to calculate the memory
allocated for newly created jails.

If everything I'm saying sounds reasonable, I'll send a patch for sysvipc_msg.

-- 
Pawel Jakub Dawidek                       pawel@dawidek.net
UNIX Systems Programmer/Administrator     http://garage.freebsd.pl
Am I Evil? Yes, I Am!                     http://cerber.sourceforge.net

--+bs7B30DeWCM5QK8
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (FreeBSD)

iQCVAwUBPvmUwz/PhmMH/Mf1AQEHZAP/RKVXZOmtLozSs8z5qZSN/24049mXzlaS
THwRUt8V1DzRY1bjr7zo33h8DAWb9nN+2Y2YfCHeEeWLZuZ8GS41UW8Q6yhXQnjg
X2YG3yeCBUVaqjZ5tKmjmEMJdv3xGI24vUYNS62738E79rlHnVisRNiPIUMCi87F
u7GvR9YlT/s=
=3Ffd
-----END PGP SIGNATURE-----

--+bs7B30DeWCM5QK8--

From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 06:51:12 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2CCF037B401 for ; Wed, 25 Jun 2003 06:51:12 -0700 (PDT) Received: from demos.su (mx.demos.su [194.87.0.32]) by mx1.FreeBSD.org (Postfix) with ESMTP id 289B543FE0 for ; Wed, 25 Jun 2003 06:51:10 -0700 (PDT) (envelope-from mitya@fling-wing.demos.su) Received: from [194.87.5.69] (HELO fling-wing.demos.su) by demos.su (CommuniGate Pro SMTP 4.1b7/D) with ESMTP-TLS id 78031086; Wed, 25 Jun 2003 17:51:08 +0400 Received: from fling-wing.demos.su (localhost [127.0.0.1]) by fling-wing.demos.su (8.12.9/8.12.6) with ESMTP id h5PDp75R021556; Wed, 25 Jun 2003 17:51:07 +0400 (MSD) (envelope-from mitya@fling-wing.demos.su) Received: (from mitya@localhost) by fling-wing.demos.su
(8.12.9/8.12.6/Submit) id h5PDp7qk021555; Wed, 25 Jun 2003 17:51:07 +0400 (MSD) Date: Wed, 25 Jun 2003 17:51:06 +0400 From: Dmitry Sivachenko To: Pawel Jakub Dawidek Message-ID: <20030625135106.GA19868@fling-wing.demos.su> References: <20030624164602.GW7587@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="qMm9M+Fa2AknHoGS" Content-Disposition: inline In-Reply-To: <20030624164602.GW7587@garage.freebsd.pl> WWW-Home-Page: http://mitya.pp.ru/ X-PGP-Key: http://mitya.pp.ru/mitya.asc User-Agent: Mutt/1.5.4i cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 13:51:12 -0000

--qMm9M+Fa2AknHoGS
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jun 24, 2003 at 06:46:02PM +0200, Pawel Jakub Dawidek wrote:
> Hello.
>
> Some time ago I've implemented private memory zones for IPC mechism.
> Every jail and main host got its own memory for IPC operations.
> It was implemented for FreeBSD 4.x. Avaliable at:
>
> http://garage.freebsd.pl/privipc.tbz
> http://garage.freebsd.pl/privipc.README

I think it would be better to add checks to disallow the use of IPC
primitives created in one jail from another.
Thus we will avoid allocating separate segments of kernel memory for
each jail.

It could be trivially achieved by adding another field to struct ipc_perm,
but Robert Watson said he knows another way of doing this without
breaking ABI (if I understood him right).
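
A minimal sketch of the kind of check suggested above, assuming a
hypothetical extra field that records the creating jail. The names
prison_sketch, ipc_perm_sketch, ipc_set_owner and ipc_jail_visible are
invented for illustration; they are not taken from the FreeBSD source or
from Robert Watson's patch, and the policy (an object is visible only
from the jail that created it, with NULL standing for the host) is an
assumption. As the message notes, a real extra field in struct ipc_perm
would change the structure layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's jail descriptor. */
struct prison_sketch {
	int pr_id;
};

/*
 * Hypothetical permission record: the usual owner/mode data plus one
 * extra field naming the jail the object was created in (NULL = host).
 */
struct ipc_perm_sketch {
	unsigned int uid;
	unsigned int mode;
	const struct prison_sketch *owner_jail;
};

/* Record the creating jail when the IPC object is set up. */
static void
ipc_set_owner(struct ipc_perm_sketch *perm,
    const struct prison_sketch *creator)
{
	assert(perm != NULL);
	perm->owner_jail = creator;
}

/*
 * Return true when a caller running in 'caller' may use the object.
 * Assumed policy: the caller must be in the same jail as the creator;
 * host-created objects (owner_jail == NULL) are visible only to the host.
 */
static bool
ipc_jail_visible(const struct ipc_perm_sketch *perm,
    const struct prison_sketch *caller)
{
	assert(perm != NULL);
	return perm->owner_jail == caller;
}
```

With this shape all jails keep sharing one pool of IPC objects, and the
check runs wherever an existing object is looked up by key or id.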
--qMm9M+Fa2AknHoGS Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (FreeBSD) iD8DBQE++ajKEZSZYxPV34ARAniHAKCkNLYNkLMuWU+n/Sby90GP1KnbQwCggzHx /95lgkqTkgDcO2l/GXBmEx0= =R/Ho -----END PGP SIGNATURE----- --qMm9M+Fa2AknHoGS-- From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 07:45:31 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5485137B405 for ; Wed, 25 Jun 2003 07:45:31 -0700 (PDT) Received: from milla.ask33.net (milla.ask33.net [217.197.166.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id 57E6A43FB1 for ; Wed, 25 Jun 2003 07:45:27 -0700 (PDT) (envelope-from nick@milla.ask33.net) Received: by milla.ask33.net (Postfix, from userid 1001) id 739363ABB53; Wed, 25 Jun 2003 16:48:49 +0200 (CEST) Date: Wed, 25 Jun 2003 16:48:49 +0200 From: Pawel Jakub Dawidek To: Dmitry Sivachenko Message-ID: <20030625144849.GJ7587@garage.freebsd.pl> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625135106.GA19868@fling-wing.demos.su> <20030625140518.GA23435@fling-wing.demos.su> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="OPsOAWNf+lTlQ18+" Content-Disposition: inline In-Reply-To: <20030625140518.GA23435@fling-wing.demos.su> X-PGP-Key-URL: http://garage.freebsd.pl/jules.asc X-OS: FreeBSD 4.8-RELEASE i386 X-URL: http://garage.freebsd.pl User-Agent: Mutt/1.5.1i cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 14:45:31 -0000

--OPsOAWNf+lTlQ18+
Content-Type: text/plain; charset=iso-8859-2
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Jun 25, 2003 at 06:05:18PM +0400, Dmitry Sivachenko wrote:
+> > > Some time ago I've implemented private memory zones for IPC mechism.
+> > > Every jail and main host got its own memory for IPC operations.
+> > > It was implemented for FreeBSD 4.x. Avaliable at:
+> > >
+> > > http://garage.freebsd.pl/privipc.tbz
+> > > http://garage.freebsd.pl/privipc.README
+> >
+> > I think it would be better to add checks to disallow the use of IPC
+> > primitives created in one jail from another.
+> > Thus we will avoid allocating separate segments of kernel memory for
+> > each jail.
+> >
+> > It could be trivially achieved by adding another field to struct ipc_perm,
+> > but Robert Watson said he knows another way of doing this without
+> > breaking ABI (if I understood him right).
+> >
+>
+> Please look at his patch:
+>
+> http://www.watson.org/~robert/freebsd/mac_sysvipc.diff
+>
+> It does slightly different things, but we could borrow from it.

But you got still *one* memory zones for every jail and main host.
And I want to separate them.

-- 
Pawel Jakub Dawidek                       pawel@dawidek.net
UNIX Systems Programmer/Administrator     http://garage.freebsd.pl
Am I Evil? Yes, I Am!
http://cerber.sourceforge.net --OPsOAWNf+lTlQ18+ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (FreeBSD) iQCVAwUBPvm2UT/PhmMH/Mf1AQEVNgP/X3aFLNbD24xBPxS1ln2c0jx5Y2+/JvWc f3wuPN+nsBIL1BrvVdiTcT3WciOT9shRwfniLxe1c+biMbL7LIq+uAw1QyCPG93y YQjKrTa7MkDahnn4NLRUyKxw7GgpXJ2dbEE+jQf93jlUFTkjzweVJ6d9YhnZMEa+ qpDnLDfxYcU= =51Jk -----END PGP SIGNATURE----- --OPsOAWNf+lTlQ18+-- From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 07:52:37 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 02ED337B401 for ; Wed, 25 Jun 2003 07:52:37 -0700 (PDT) Received: from demos.su (mx.demos.su [194.87.0.32]) by mx1.FreeBSD.org (Postfix) with ESMTP id 94B8D43FEA for ; Wed, 25 Jun 2003 07:52:35 -0700 (PDT) (envelope-from mitya@fling-wing.demos.su) Received: from [194.87.5.69] (HELO fling-wing.demos.su) by demos.su (CommuniGate Pro SMTP 4.1b7/D) with ESMTP-TLS id 78046347; Wed, 25 Jun 2003 18:52:34 +0400 Received: from fling-wing.demos.su (localhost [127.0.0.1]) by fling-wing.demos.su (8.12.9/8.12.6) with ESMTP id h5PEqY5R029779; Wed, 25 Jun 2003 18:52:34 +0400 (MSD) (envelope-from mitya@fling-wing.demos.su) Received: (from mitya@localhost) by fling-wing.demos.su (8.12.9/8.12.6/Submit) id h5PEqXoY029778; Wed, 25 Jun 2003 18:52:33 +0400 (MSD) Date: Wed, 25 Jun 2003 18:52:33 +0400 From: Dmitry Sivachenko To: Pawel Jakub Dawidek Message-ID: <20030625145233.GA28322@fling-wing.demos.su> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625135106.GA19868@fling-wing.demos.su> <20030625140518.GA23435@fling-wing.demos.su> <20030625144849.GJ7587@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: <20030625144849.GJ7587@garage.freebsd.pl> WWW-Home-Page: http://mitya.pp.ru/ X-PGP-Key: http://mitya.pp.ru/mitya.asc User-Agent: Mutt/1.5.4i cc: 
freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 14:52:37 -0000 On Wed, Jun 25, 2003 at 04:48:49PM +0200, Pawel Jakub Dawidek wrote: > On Wed, Jun 25, 2003 at 06:05:18PM +0400, Dmitry Sivachenko wrote: > +> > > Some time ago I've implemented private memory zones for IPC mechism. > +> > > Every jail and main host got its own memory for IPC operations. > +> > > It was implemented for FreeBSD 4.x. Avaliable at: > +> > > > +> > > http://garage.freebsd.pl/privipc.tbz > +> > > http://garage.freebsd.pl/privipc.README > +> > > +> > I think it would be better to add checks to disallow the use of IPC > +> > primitives created in one jail from another. > +> > Thus we will avoid allocating separate segments of kernel memory for > +> > each jail. > +> > > +> > It could be trivially achieved by adding another field to struct ipc_perm, > +> > but Robert Watson said he knows another way of doing this without > +> > breaking ABI (if I understood him right). > +> > > +> > +> Please look at his patch: > +> > +> http://www.watson.org/~robert/freebsd/mac_sysvipc.diff > +> > +> It does slightly different things, but we could borrow from it. > > But you got still *one* memory zones for every jail and main host. Yes, that is exactly what I want. This is similar to separate IP stack for each jail: this is more powerful solution, but more expensive (uses more kernel memory). Jail is not a true virtual machine. Let's keep it a *light* virtual machine replacement, with single IP stack, one memory zones for all jails and host, etc. > And I want to separate them. > Then you should join Marco Zec and contribute to his project. Jail will hardly become a true virtual machine. 
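
For comparison with the single shared zone argued for above, the
private-zone alternative in this thread allocates each jail's zone only
on first use. A rough sketch of that idea, using hypothetical names
(jail_ipc_sketch, jail_ipc_zone, jail_ipc_total) and userland calloc()
in place of any kernel allocator; it is not the code from the privipc
patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-jail IPC state; zone stays NULL until first use. */
struct jail_ipc_sketch {
	void	*zone;
	size_t	 zone_size;
};

/*
 * Return the jail's private zone, allocating it on the first IPC call.
 * Returns NULL if the allocation fails.
 */
static void *
jail_ipc_zone(struct jail_ipc_sketch *jail, size_t zone_size)
{
	assert(jail != NULL);
	if (jail->zone == NULL) {
		jail->zone = calloc(1, zone_size);
		if (jail->zone != NULL)
			jail->zone_size = zone_size;
	}
	return jail->zone;
}

/* Total bytes held by all jails; jails that never used IPC count as 0. */
static size_t
jail_ipc_total(const struct jail_ipc_sketch *jails, size_t njails)
{
	size_t total = 0;

	for (size_t i = 0; i < njails; i++)
		total += jails[i].zone_size;
	return total;
}
```

Under this scheme the memory cost grows with the number of jails that
actually issue IPC calls, not with the number of jails that exist, which
is the trade-off the two sides of the thread weigh differently.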
From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 07:58:48 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9F7CA37B40A; Wed, 25 Jun 2003 07:58:48 -0700 (PDT) Received: from milla.ask33.net (milla.ask33.net [217.197.166.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id 962C643FF3; Wed, 25 Jun 2003 07:58:46 -0700 (PDT) (envelope-from nick@milla.ask33.net) Received: by milla.ask33.net (Postfix, from userid 1001) id 9DE663ABB51; Wed, 25 Jun 2003 17:02:21 +0200 (CEST) Date: Wed, 25 Jun 2003 17:02:21 +0200 From: Pawel Jakub Dawidek To: Dmitry Sivachenko Message-ID: <20030625150221.GL7587@garage.freebsd.pl> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625135106.GA19868@fling-wing.demos.su> <20030625140518.GA23435@fling-wing.demos.su> <20030625144849.GJ7587@garage.freebsd.pl> <20030625145233.GA28322@fling-wing.demos.su> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="FvF9dqTwB4R3n80B" Content-Disposition: inline In-Reply-To: <20030625145233.GA28322@fling-wing.demos.su> X-PGP-Key-URL: http://garage.freebsd.pl/jules.asc X-OS: FreeBSD 4.8-RELEASE i386 X-URL: http://garage.freebsd.pl User-Agent: Mutt/1.5.1i cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 14:58:48 -0000 --FvF9dqTwB4R3n80B Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jun 25, 2003 at 06:52:33PM +0400, Dmitry Sivachenko wrote: +> > But you got still *one* memory zones for every jail and main host. +>=20 +> Yes, that is exactly what I want. 
+> This is similar to separate IP stack for each jail: this is more powerful
+> solution, but more expensive (uses more kernel memory).

But note that my implementation allocates memory "on demand".
If no IPC syscall is used inside a jail, memory will not be allocated.
I think also that it will be trivial to add a value to the jail struct
that will tell if we want separate IPC memory zones for this jail or not.

+> Jail is not a true virtual machine.
+> Let's keep it a *light* virtual machine replacement, with single IP stack,
+> one memory zones for all jails and host, etc.

I think it should be and it isn't now because of implementation problems.
Am I wrong? Poul? Robert?

-- 
Pawel Jakub Dawidek                       pawel@dawidek.net
UNIX Systems Programmer/Administrator     http://garage.freebsd.pl
Am I Evil? Yes, I Am!                     http://cerber.sourceforge.net

--FvF9dqTwB4R3n80B
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (FreeBSD)

iQCVAwUBPvm5fT/PhmMH/Mf1AQHxwwQAiw9AeQnaPglEuqFhU4GHHF+GYSlC4g2r
yta5TOwhLBtvubyIdNy7Iim3bpStfw2b8V5QMgHPfFM8yfJ2pRZZrvEck34WZeDZ
6prjTvBmMUD+9nCcpu+jmXIRmRZoBW8EBmg6z0ChB9l1zvia7sj3xRhLpvBm4k/9
c8NkwcUNEZc=
=SvO5
-----END PGP SIGNATURE-----

--FvF9dqTwB4R3n80B--

From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 08:21:22 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2D71437B401 for ; Wed, 25 Jun 2003 08:21:22 -0700 (PDT) Received: from demos.su (mx.demos.su [194.87.0.32]) by mx1.FreeBSD.org (Postfix) with ESMTP id B015443FF2 for ; Wed, 25 Jun 2003 08:21:20 -0700 (PDT) (envelope-from mitya@fling-wing.demos.su) Received: from [194.87.5.69] (HELO fling-wing.demos.su) by demos.su (CommuniGate Pro SMTP 4.1b7/D) with ESMTP-TLS id 78049492; Wed, 25 Jun 2003 19:21:19 +0400 Received: from fling-wing.demos.su (localhost [127.0.0.1]) by fling-wing.demos.su (8.12.9/8.12.6) with ESMTP id
h5PFLJ5R033397; Wed, 25 Jun 2003 19:21:19 +0400 (MSD) (envelope-from mitya@fling-wing.demos.su) Received: (from mitya@localhost) by fling-wing.demos.su (8.12.9/8.12.6/Submit) id h5PFLJa8033396; Wed, 25 Jun 2003 19:21:19 +0400 (MSD) Date: Wed, 25 Jun 2003 19:21:19 +0400 From: Dmitry Sivachenko To: Pawel Jakub Dawidek Message-ID: <20030625152119.GA31396@fling-wing.demos.su> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625135106.GA19868@fling-wing.demos.su> <20030625140518.GA23435@fling-wing.demos.su> <20030625144849.GJ7587@garage.freebsd.pl> <20030625145233.GA28322@fling-wing.demos.su> <20030625150221.GL7587@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: <20030625150221.GL7587@garage.freebsd.pl> WWW-Home-Page: http://mitya.pp.ru/ X-PGP-Key: http://mitya.pp.ru/mitya.asc User-Agent: Mutt/1.5.4i cc: freebsd-arch@FreeBSD.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 15:21:22 -0000 On Wed, Jun 25, 2003 at 05:02:21PM +0200, Pawel Jakub Dawidek wrote: > On Wed, Jun 25, 2003 at 06:52:33PM +0400, Dmitry Sivachenko wrote: > +> > But you got still *one* memory zones for every jail and main host. > +> > +> Yes, that is exactly what I want. > +> This is similar to separate IP stack for each jail: this is more powerful > +> solution, but more expensive (uses more kernel memory). > > But note that my implementation allocates memory "on demand". This is part of the problem: with single memory zone for all jails, less memory is allocated. With private memory zones, if m jails use IPC, you need to allocate m*M kbytes (for some value of M you consider sufficient for one jail). 
With one memory zone for all jails, it is enough to allocate N kbytes where M < N < m*M, because every jail will not use all M kbytes at the same time. > If IPC syscall will not be used inside of jail memory will not be allocated. > If think also that this will be trivial to add value to jail struct > that will thell if we want separate IPC memory zones for this jail or not. > > +> Jail is not a true virtual machine. > +> Let's keep it a *light* virtual machine replacement, with single IP stack, > +> one memory zones for all jails and host, etc. > > I think it should be and it isn't now because of implementaion problems. > Am I wrong? Poul? Robert? > From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 08:28:17 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 775AE37B401; Wed, 25 Jun 2003 08:28:17 -0700 (PDT) Received: from milla.ask33.net (milla.ask33.net [217.197.166.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6E71143FB1; Wed, 25 Jun 2003 08:28:16 -0700 (PDT) (envelope-from nick@milla.ask33.net) Received: by milla.ask33.net (Postfix, from userid 1001) id 3B7103ABB51; Wed, 25 Jun 2003 17:31:53 +0200 (CEST) Date: Wed, 25 Jun 2003 17:31:53 +0200 From: Pawel Jakub Dawidek To: Dmitry Sivachenko Message-ID: <20030625153153.GO7587@garage.freebsd.pl> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625135106.GA19868@fling-wing.demos.su> <20030625140518.GA23435@fling-wing.demos.su> <20030625144849.GJ7587@garage.freebsd.pl> <20030625145233.GA28322@fling-wing.demos.su> <20030625150221.GL7587@garage.freebsd.pl> <20030625152119.GA31396@fling-wing.demos.su> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="v1mHNXBTCsim3EdZ" Content-Disposition: inline In-Reply-To: <20030625152119.GA31396@fling-wing.demos.su> X-PGP-Key-URL: http://garage.freebsd.pl/jules.asc X-OS: FreeBSD 4.8-RELEASE i386 
X-URL: http://garage.freebsd.pl User-Agent: Mutt/1.5.1i cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 15:28:17 -0000 --v1mHNXBTCsim3EdZ Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jun 25, 2003 at 07:21:19PM +0400, Dmitry Sivachenko wrote: +> > +> > But you got still *one* memory zones for every jail and main host. +> > +> +> > +> Yes, that is exactly what I want. +> > +> This is similar to separate IP stack for each jail: this is more powerful +> > +> solution, but more expensive (uses more kernel memory). +> > +> > But note that my implementation allocates memory "on demand". +> +> This is part of the problem: with single memory zone for all jails, +> less memory is allocated. With private memory zones, if m jails use IPC, +> you need to allocate m*M kbytes (for some value of M you consider +> sufficient for one jail). +> +> With one memory zone for all jails, it is enough to allocate N kbytes where +> M < N < m*M, because every jail will not use all M kbytes at the same time. Of course, but please. We could start wondering if struct prison in every ucred struct don't consume to much memory. Of course we allocate more memory, but if we want to run for example two instants of postgresql in two diffrent jails? But ok, it will be good compromise to add sysctl security.jail.privipc IMHO. So we could turn this feature on if it is needed. What is your opinion? -- Pawel Jakub Dawidek pawel@dawidek.net UNIX Systems Programmer/Administrator http://garage.freebsd.pl Am I Evil? Yes, I Am!
http://cerber.sourceforge.net --v1mHNXBTCsim3EdZ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (FreeBSD) iQCVAwUBPvnAaT/PhmMH/Mf1AQFoswP9GVRLmrU27QPU8YZ6zmfSTG+BBOI7Man8 a5ap2DrbdAfLj8QnBL5LZmSXdn4KgMly6PcxycImyXgiIrBAfRi1xzpwQYxkF5ar 5SQJDZIgQ3+3X8oMaAUD7iVRJtBUrWAbi2+xRPi3IrVfWjSr2J3zhiua0TGxFc/m qIft1YXKUVA= =TunJ -----END PGP SIGNATURE----- --v1mHNXBTCsim3EdZ-- From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 08:46:31 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E481837B401 for ; Wed, 25 Jun 2003 08:46:31 -0700 (PDT) Received: from demos.su (mx.demos.su [194.87.0.32]) by mx1.FreeBSD.org (Postfix) with ESMTP id C57C743FD7 for ; Wed, 25 Jun 2003 08:46:29 -0700 (PDT) (envelope-from mitya@fling-wing.demos.su) Received: from [194.87.5.69] (HELO fling-wing.demos.su) by demos.su (CommuniGate Pro SMTP 4.1b7/D) with ESMTP-TLS id 78052741; Wed, 25 Jun 2003 19:46:28 +0400 Received: from fling-wing.demos.su (localhost [127.0.0.1]) by fling-wing.demos.su (8.12.9/8.12.6) with ESMTP id h5PFkS5R035903; Wed, 25 Jun 2003 19:46:28 +0400 (MSD) (envelope-from mitya@fling-wing.demos.su) Received: (from mitya@localhost) by fling-wing.demos.su (8.12.9/8.12.6/Submit) id h5PFkRxu035902; Wed, 25 Jun 2003 19:46:27 +0400 (MSD) Date: Wed, 25 Jun 2003 19:46:27 +0400 From: Dmitry Sivachenko To: Pawel Jakub Dawidek Message-ID: <20030625154627.GA35011@fling-wing.demos.su> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625135106.GA19868@fling-wing.demos.su> <20030625140518.GA23435@fling-wing.demos.su> <20030625144849.GJ7587@garage.freebsd.pl> <20030625145233.GA28322@fling-wing.demos.su> <20030625150221.GL7587@garage.freebsd.pl> <20030625152119.GA31396@fling-wing.demos.su> <20030625153153.GO7587@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: 
inline In-Reply-To: <20030625153153.GO7587@garage.freebsd.pl> WWW-Home-Page: http://mitya.pp.ru/ X-PGP-Key: http://mitya.pp.ru/mitya.asc User-Agent: Mutt/1.5.4i cc: freebsd-arch@FreeBSD.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 15:46:32 -0000 On Wed, Jun 25, 2003 at 05:31:53PM +0200, Pawel Jakub Dawidek wrote: > On Wed, Jun 25, 2003 at 07:21:19PM +0400, Dmitry Sivachenko wrote: > +> > +> > But you got still *one* memory zones for every jail and main host. > +> > +> > +> > +> Yes, that is exactly what I want. > +> > +> This is similar to separate IP stack for each jail: this is more powerful > +> > +> solution, but more expensive (uses more kernel memory). > +> > > +> > But note that my implementation allocates memory "on demand". > +> > +> This is part of the problem: with single memory zone for all jails, > +> less memory is allocated. With private memory zones, if m jails use IPC, > +> you need to allocate m*M kbytes (for some value of M you consider > +> sufficient for one jail). > +> > +> With one memory zone for all jails, it is enough to allocate N kbytes where > +> M < N < m*M, because every jail will not use all M kbytes at the same time. > > Of course, but please. We could start wondering if struct prison in every > ucred struct don't consume to much memory. Of course we allocate more memory, Common sence is your friend. > but if we want to run for example two instants of postgresql in two > diffrent jails? I propose to add additional checks for p->p_prison. If two different users (with different UIDs) can use IPC, then it is simple to allow processes from different jails to use it too (and do not interfere with each other). > > But ok, it will be good compromise to add sysctl security.jail.privipc IMHO. 
> So we could turn this feature on if it is needed. What is your opinion? > My point of view is that allowing jailed processes to safely use single memory zone is simple and sufficient solution. From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 10:26:51 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4971937B401 for ; Wed, 25 Jun 2003 10:26:51 -0700 (PDT) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 89FCB43FE9 for ; Wed, 25 Jun 2003 10:26:50 -0700 (PDT) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (localhost [127.0.0.1]) by fledge.watson.org (8.12.9/8.12.9) with ESMTP id h5PHQSKJ057237; Wed, 25 Jun 2003 13:26:28 -0400 (EDT) (envelope-from robert@fledge.watson.org) Received: from localhost (robert@localhost)h5PHQS1I057232; Wed, 25 Jun 2003 13:26:28 -0400 (EDT) (envelope-from robert@fledge.watson.org) Date: Wed, 25 Jun 2003 13:26:28 -0400 (EDT) From: Robert Watson X-Sender: robert@fledge.watson.org To: Pawel Jakub Dawidek In-Reply-To: <20030624164602.GW7587@garage.freebsd.pl> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 17:26:51 -0000 On Tue, 24 Jun 2003, Pawel Jakub Dawidek wrote: > Some time ago I've implemented private memory zones for IPC mechism. > Every jail and main host got its own memory for IPC operations. > It was implemented for FreeBSD 4.x. Avaliable at: > > http://garage.freebsd.pl/privipc.tbz > http://garage.freebsd.pl/privipc.README > > I want to port this to FreeBSD 5.x, but with many improvements. 
Because > of that there are few things to talk about and I'm curious if anyone > will be interested in answering my questions and at the end commiting > this to -CURRENT. > > Patch will not be a "fast hack" so the best way will be commiting this > in parts. I got already working sysvipv_msg mechanism. > > So if anyone is interested in, please inform me and I'll ask my > questions and I'll send also what I got now. We have some initial patches that wrap the user ipcperm structure in a kernel-specific structure, which we use to add a MAC label. It would be easy to also add a prison pointer. We probably won't get to merging this patch for a couple of weeks, but it's worth keeping in mind. http://www.watson.org/~robert/freebsd/mac_sysvipc.diff This needs style cleanup, bug fixing, testing, etc, but it's the direction we're pushing in for MAC right now. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 10:48:52 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D982E37B401; Wed, 25 Jun 2003 10:48:52 -0700 (PDT) Received: from milla.ask33.net (milla.ask33.net [217.197.166.60]) by mx1.FreeBSD.org (Postfix) with ESMTP id E3AAB44003; Wed, 25 Jun 2003 10:48:50 -0700 (PDT) (envelope-from nick@milla.ask33.net) Received: by milla.ask33.net (Postfix, from userid 1001) id 325033ABB51; Wed, 25 Jun 2003 19:52:25 +0200 (CEST) Date: Wed, 25 Jun 2003 19:52:25 +0200 From: Pawel Jakub Dawidek To: Robert Watson Message-ID: <20030625175225.GS7587@garage.freebsd.pl> References: <20030624164602.GW7587@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="37umynISzNxy+PmB" Content-Disposition: inline In-Reply-To: X-PGP-Key-URL: http://garage.freebsd.pl/jules.asc X-OS: FreeBSD 4.8-RELEASE 
i386 X-URL: http://garage.freebsd.pl User-Agent: Mutt/1.5.1i cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 17:48:53 -0000 --37umynISzNxy+PmB Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jun 25, 2003 at 01:26:28PM -0400, Robert Watson wrote: +> We have some initial patches that wrap the user ipcperm structure in a +> kernel-specific structure, which we use to add a MAC label. It would be +> easy to also add a prison pointer. We probably won't get to merging this +> patch for a couple of weeks, but it's worth keeping in mind. +> +> http://www.watson.org/~robert/freebsd/mac_sysvipc.diff +> +> This needs style cleanup, bug fixing, testing, etc, but it's the direction +> we're pushing in for MAC right now. Hmm, I'm not sure if I understand patch well, but with this stuff we will be able to run for example two postgresql servers in diffrent jails? Or it only will provide denying specified requests? -- Pawel Jakub Dawidek pawel@dawidek.net UNIX Systems Programmer/Administrator http://garage.freebsd.pl Am I Evil? Yes, I Am!
http://cerber.sourceforge.net --37umynISzNxy+PmB Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (FreeBSD) iQCVAwUBPvnhWT/PhmMH/Mf1AQErngP+PmlvSViY3gSYrq7GjklXJnhjTNuLfo3i i/S5pEDiYw9BPD2g706HexhYikyvoz81WcGzWO72nYY0VnaSIa/cU9jUrWDxirre m6+c2W6ba2yaKvKjhnOabRKNbzvPIXlG+VwpRwisgvzO3l0iV3USio1MM6RG2i/d glTDsUb9TT8= =CZsf -----END PGP SIGNATURE----- --37umynISzNxy+PmB-- From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 16:22:40 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EB94F37B401; Wed, 25 Jun 2003 16:22:40 -0700 (PDT) Received: from mx.nsu.ru (mx.nsu.ru [212.192.164.5]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8225643FE0; Wed, 25 Jun 2003 16:22:39 -0700 (PDT) (envelope-from fjoe@iclub.nsu.ru) Received: from mail by mx.nsu.ru with drweb-scanned (Exim 3.36 #1 (Debian)) id 19VJip-0007su-00; Thu, 26 Jun 2003 06:30:39 +0700 Received: from iclub.nsu.ru ([193.124.215.97] ident=root) by mx.nsu.ru with esmtp (Exim 3.36 #1 (Debian)) id 19VJhA-0007f6-00; Thu, 26 Jun 2003 06:28:56 +0700 Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1]) by iclub.nsu.ru (8.12.9/8.12.9) with ESMTP id h5PNKjMk093607; Thu, 26 Jun 2003 06:20:45 +0700 (NSS) (envelope-from fjoe@iclub.nsu.ru) Received: (from fjoe@localhost) by iclub.nsu.ru (8.12.9/8.12.9/Submit) id h5PNKj5w093606; Thu, 26 Jun 2003 06:20:45 +0700 (NSS) Date: Thu, 26 Jun 2003 06:20:45 +0700 From: Max Khon To: Dmitry Sivachenko Message-ID: <20030625232045.GB92939@iclub.nsu.ru> References: <20030624164602.GW7587@garage.freebsd.pl> <20030625135106.GA19868@fling-wing.demos.su> <20030625140518.GA23435@fling-wing.demos.su> <20030625144849.GJ7587@garage.freebsd.pl> <20030625145233.GA28322@fling-wing.demos.su> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20030625145233.GA28322@fling-wing.demos.su> User-Agent: Mutt/1.4.1i X-Envelope-To: demon@freebsd.org, nick@garage.freebsd.pl, freebsd-arch@freebsd.org X-Bogosity: No, tests=bogofilter, spamicity=0.000000, version=0.13.6.3 X-Spam-Status: No, hits=-106.0 required=5.0 tests=BOGOFILTER_TEST_PASS,EMAIL_ATTRIBUTION,IN_REP_TO, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT, USER_IN_WHITELIST version=2.55 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp) cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 23:22:41 -0000 hi, there! On Wed, Jun 25, 2003 at 06:52:33PM +0400, Dmitry Sivachenko wrote: > Yes, that is exactly what I want. > This is similar to separate IP stack for each jail: this is more powerful > solution, but more expensive (uses more kernel memory). > > Jail is not a true virtual machine. > Let's keep it a *light* virtual machine replacement, with single IP stack, > one memory zones for all jails and host, etc. btw I know of two projects whose goal is IP stack virtualization for jail. Virtual IP stack (as well as virtualized sysvipc with separate memory zones) can be quite useful. Can provide two solutions? - with shared memory zone (for those who want "light" version) - with separate memory zones (for people who want to keep sysvipc fully separated, i.e. 
one user can't exhaust all sysvipc resources and make sysvipc unusable for second user) /fjoe From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 16:24:33 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A2D6B37B401; Wed, 25 Jun 2003 16:24:33 -0700 (PDT) Received: from sccrmhc12.attbi.com (sccrmhc12.comcast.net [204.127.202.56]) by mx1.FreeBSD.org (Postfix) with ESMTP id B6F4544020; Wed, 25 Jun 2003 16:24:32 -0700 (PDT) (envelope-from julian@elischer.org) Received: from interjet.elischer.org ([12.233.125.100]) by attbi.com (sccrmhc12) with ESMTP id <2003062523243101200d3lpse>; Wed, 25 Jun 2003 23:24:31 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id QAA05124; Wed, 25 Jun 2003 16:24:30 -0700 (PDT) Date: Wed, 25 Jun 2003 16:24:29 -0700 (PDT) From: Julian Elischer To: Max Khon In-Reply-To: <20030625232045.GB92939@iclub.nsu.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: Dmitry Sivachenko cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 23:24:33 -0000 On Thu, 26 Jun 2003, Max Khon wrote: > hi, there! > > On Wed, Jun 25, 2003 at 06:52:33PM +0400, Dmitry Sivachenko wrote: > > > Yes, that is exactly what I want. > > This is similar to separate IP stack for each jail: this is more powerful > > solution, but more expensive (uses more kernel memory). > > > > Jail is not a true virtual machine. > > Let's keep it a *light* virtual machine replacement, with single IP stack, > > one memory zones for all jails and host, etc. 
> > btw I know of two projects whose goal is IP stack virtualization for jail. > Virtual IP stack (as well as virtualized sysvipc with separate > memory zones) can be quite useful. Can provide two solutions? > > - with shared memory zone (for those who want "light" version) > - with separate memory zones (for people who want to keep > sysvipc fully separated, i.e. one user can't exhaust all sysvipc resources > and make sysvipc unusable for second user) Is either of these projects Marco Zec's project? > > /fjoe > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Wed Jun 25 16:26:00 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6E9DE37B404; Wed, 25 Jun 2003 16:26:00 -0700 (PDT) Received: from mx.nsu.ru (mx.nsu.ru [212.192.164.5]) by mx1.FreeBSD.org (Postfix) with ESMTP id BC09743F75; Wed, 25 Jun 2003 16:25:58 -0700 (PDT) (envelope-from fjoe@iclub.nsu.ru) Received: from mail by mx.nsu.ru with drweb-scanned (Exim 3.36 #1 (Debian)) id 19VJlk-0008K1-00; Thu, 26 Jun 2003 06:33:40 +0700 Received: from iclub.nsu.ru ([193.124.215.97] ident=root) by mx.nsu.ru with esmtp (Exim 3.36 #1 (Debian)) id 19VJku-0008C2-00; Thu, 26 Jun 2003 06:32:48 +0700 Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1]) by iclub.nsu.ru (8.12.9/8.12.9) with ESMTP id h5PNOgMk093691; Thu, 26 Jun 2003 06:24:42 +0700 (NSS) (envelope-from fjoe@iclub.nsu.ru) Received: (from fjoe@localhost) by iclub.nsu.ru (8.12.9/8.12.9/Submit) id h5PNOf2I093690; Thu, 26 Jun 2003 06:24:41 +0700 (NSS) Date: Thu, 26 Jun 2003 06:24:41 +0700 From: Max Khon To: Pawel Jakub Dawidek Message-ID: <20030625232441.GC92939@iclub.nsu.ru> References: <20030624164602.GW7587@garage.freebsd.pl> 
<20030625175225.GS7587@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030625175225.GS7587@garage.freebsd.pl> User-Agent: Mutt/1.4.1i X-Envelope-To: nick@garage.freebsd.pl, rwatson@freebsd.org, freebsd-arch@freebsd.org X-Bogosity: No, tests=bogofilter, spamicity=0.000000, version=0.13.6.3 X-Spam-Status: No, hits=-106.5 required=5.0 tests=BOGOFILTER_TEST_PASS,EMAIL_ATTRIBUTION,IN_REP_TO, QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT_MUTT,USER_IN_WHITELIST version=2.55 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp) cc: Robert Watson cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Jun 2003 23:26:00 -0000 hi, there! On Wed, Jun 25, 2003 at 07:52:25PM +0200, Pawel Jakub Dawidek wrote: > +> We have some initial patches that wrap the user ipcperm structure in a > +> kernel-specific structure, which we use to add a MAC label. It would be > +> easy to also add a prison pointer. We probably won't get to merging this > +> patch for a couple of weeks, but it's worth keeping in mind. > +> > +> http://www.watson.org/~robert/freebsd/mac_sysvipc.diff > +> > +> This needs style cleanup, bug fixing, testing, etc, but it's the direction > +> we're pushing in for MAC right now. > > Hmm, I'm not sure if I understand patch well, but with this stuff we will > be able to run for example two postgresql servers in diffrent jails? no > Or it only will provide denying specified requests? yes. the goal is to use existing MAC framework to deny access to foreign (from other jail) sysvipc objects. 
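To make the "prison pointer" idea discussed above a little more concrete, here is a minimal userland sketch. It is not taken from mac_sysvipc.diff. The struct names, the field layout, and the policy (strict pointer equality, with NULL standing for the host) are all assumptions made for illustration, and the real kernel struct prison and struct ipc_perm are deliberately not used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct prison; only an id is modeled. */
struct prison_sketch {
	int pr_id;
};

/*
 * Hypothetical kernel-side wrapper around the user-visible permission data,
 * extended with a pointer to the owning jail (NULL means the host).
 */
struct ipc_kperm_sketch {
	unsigned short mode;			/* stand-in for the ipc_perm contents */
	const struct prison_sketch *prison;	/* owning jail, NULL for the host */
};

/*
 * Assumed policy: a caller may touch an IPC object only if it lives in the
 * same jail as the object.  Objects from any other jail are "foreign".
 */
static bool
ipc_same_prison(const struct ipc_kperm_sketch *obj,
    const struct prison_sketch *caller)
{
	assert(obj != NULL);
	return (obj->prison == caller);
}
```

A check of this shape only denies access to foreign objects. All jails still share one key namespace and one memory zone, which matches the "no" given above to the question about running two postgresql servers in different jails.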
/fjoe From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 05:24:30 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 46EDC37B401 for ; Fri, 27 Jun 2003 05:24:30 -0700 (PDT) Received: from asterix.rsu.ru (asterix.rsu.ru [195.208.245.250]) by mx1.FreeBSD.org (Postfix) with ESMTP id C713E44003 for ; Fri, 27 Jun 2003 05:24:28 -0700 (PDT) (envelope-from bushman@rsu.ru) Received: from rsu.ru (mac.cc.rsu.ru [195.208.252.173]) by asterix.rsu.ru (8.12.9/8.12.9) with ESMTP id h5RCOM4d030817; Fri, 27 Jun 2003 16:24:22 +0400 (MSD) (envelope-from bushman@rsu.ru) Date: Fri, 27 Jun 2003 16:24:18 +0400 Content-Type: text/plain; charset=US-ASCII; format=flowed Mime-Version: 1.0 (Apple Message framework v552) To: arch@freebsd.org From: "Michael A. Bushkov" Content-Transfer-Encoding: 7bit Message-Id: <45F05EA1-A89A-11D7-9C1D-000393BC13C6@rsu.ru> X-Mailer: Apple Mail (2.552) X-Spam-Status: No, hits=-100.0 required=5.0 tests=USER_IN_WHITELIST version=2.54 X-Spam-Checker-Version: SpamAssassin 2.54 (1.174.2.17-2003-05-11-exp) cc: os@rsu.ru Subject: dynamically linked root and nscd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jun 2003 12:24:30 -0000 Good day! We've read some messages from "Making a dynamically-linked root", but we're still not sure if these things would be done or not. Will next versions of FreeBSD have a dynamically linked root? And another questions: are you interested in developing nscd (Caching daemon) analog for FreeBSD? If you are, we have an ability to develop it. 
Michael Bushkov Rostov State University From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 08:26:53 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B1A8837B401 for ; Fri, 27 Jun 2003 08:26:53 -0700 (PDT) Received: from mail.speakeasy.net (mail14.speakeasy.net [216.254.0.214]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1446643F93 for ; Fri, 27 Jun 2003 08:26:53 -0700 (PDT) (envelope-from jhb@FreeBSD.org) Received: (qmail 27266 invoked from network); 27 Jun 2003 15:26:52 -0000 Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender )encrypted SMTP for ; 27 Jun 2003 15:26:52 -0000 Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1]) by server.baldwin.cx (8.12.9/8.12.9) with ESMTP id h5RFQoGI062275; Fri, 27 Jun 2003 11:26:50 -0400 (EDT) (envelope-from jhb@FreeBSD.org) Message-ID: X-Mailer: XFMail 1.5.4 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <3EF3C12F.9060303@btc.adaptec.com> Date: Fri, 27 Jun 2003 11:27:02 -0400 (EDT) From: John Baldwin To: Scott Long cc: freebsd-arch@freebsd.org Subject: RE: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jun 2003 15:26:54 -0000 On 21-Jun-2003 Scott Long wrote: > All, > > As I work towards locking down storage drivers, I'm also preening their > use of busdma. A common theme among them is a misuse of > bus_dmamap_load() and the associated callback mechanism. For most, the > consequence is harmless as long as the card can support the amount of > physical memory in the system (systems with IOMMU's not withstanding). 
> However, in cases such as PAE where busdma might have to use bounce
> buffers, most drivers don't handle the possibility of bus_dmamap_load()
> returning EINPROGRESS. The consequence of this is twofold:
> bus_dmamap_load() returns without the callback being called, but the
> driver doesn't detect this and merrily goes on its way. Later on the
> callback does get called, and any state that was shared with it gets
> corrupted. This is a problem even for drivers that are under Giant.
>
> The solution for this is mostly a mechanical cut-n-paste of the code
> dealing with the callback. However, locking down the drivers presents
> a new problem with the callback. Since the callback can be called
> asynchronously from an SWI, it needs some way to synchronize with the
> driver. Adding code to each callback to conditionally grab the driver
> mutex incurs a penalty (albeit small) and requires more effort. The
> better solution is to export the driver mutex to busdma and have the
> SWI that runs the callback lock the mutex before calling the callback.

Erm, what's wrong with this:

void
foo_function()
{
	mtx_assert(&mylock, MA_OWNED);
	...
}

void
foo_callback()
{
	mtx_lock(&mylock);
	foo_function();
	mtx_unlock(&mylock);
}

?

Using this approach is more flexible in case there is a driver that uses
a sx lock or a (not yet implemented) reader-writer lock or a critical
section, or whatever. This just means that the callback uses a wrapper
function but that really isn't that hard to do and there are other cases
(callouts in general) that need this. To me this seems to be adding a
special case to the API that won't work for all situations anyways. I
also don't see wrapper functions as being all that hard.

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"
- http://www.FreeBSD.org/ From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 08:46:40 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 221EE37B404 for ; Fri, 27 Jun 2003 08:46:40 -0700 (PDT) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id D52D243FDD for ; Fri, 27 Jun 2003 08:46:38 -0700 (PDT) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (localhost [127.0.0.1]) by fledge.watson.org (8.12.9/8.12.9) with ESMTP id h5RFkBKJ073840; Fri, 27 Jun 2003 11:46:11 -0400 (EDT) (envelope-from robert@fledge.watson.org) Received: from localhost (robert@localhost)h5RFkAkY073837; Fri, 27 Jun 2003 11:46:11 -0400 (EDT) (envelope-from robert@fledge.watson.org) Date: Fri, 27 Jun 2003 11:46:10 -0400 (EDT) From: Robert Watson X-Sender: robert@fledge.watson.org To: "Michael A. Bushkov" In-Reply-To: <45F05EA1-A89A-11D7-9C1D-000393BC13C6@rsu.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org cc: os@rsu.ru Subject: Re: dynamically linked root and nscd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jun 2003 15:46:40 -0000 On Fri, 27 Jun 2003, Michael A. Bushkov wrote: > We've read some messages from "Making a dynamically-linked root", but > we're still not sure if these things would be done or not. Will next > versions of FreeBSD have a dynamically linked root? It sounds like 5.2 and future versions will support building the system with a dynamically linked root; it's not clear to me that the decision on whether to default to a dynamically linked root has been made yet (it sounded like the jury was out on detailed performance measurements, etc). 
> And another questions: are you interested in developing nscd (Caching > daemon) analog for FreeBSD? If you are, we have an ability to develop > it. I think the answer is that, regardless of whether the default is dynamic or not, there's still interest in a caching daemon for nsswitch, since it will provide a way to do easy centralized caching and management, as well as allow more functionality for those who choose not to link the system dynamically. If you have the resources and interest to create such a thing, I encourage you to do so :-). A useful starting exercise would be to look at any existing implementations (especially Berkeley-licensed ones) and see whether they could be used verbatim, how they could be improved on, etc. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 08:56:48 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 629E237B401; Fri, 27 Jun 2003 08:56:48 -0700 (PDT) Received: from godel.mtl.distributel.net (nat.MTL.distributel.NET [66.38.181.24]) by mx1.FreeBSD.org (Postfix) with ESMTP id 82FFB43F3F; Fri, 27 Jun 2003 08:56:47 -0700 (PDT) (envelope-from bmilekic@technokratis.com) Received: from godel.mtl.distributel.net (localhost [127.0.0.1]) h5RBxoDL008440; Fri, 27 Jun 2003 11:59:50 GMT (envelope-from bmilekic@technokratis.com) Received: (from bmilekic@localhost) by godel.mtl.distributel.net (8.12.9/8.12.9/Submit) id h5RBxoWD008439; Fri, 27 Jun 2003 11:59:50 GMT X-Authentication-Warning: godel.mtl.distributel.net: bmilekic set sender to bmilekic@technokratis.com using -f Date: Fri, 27 Jun 2003 11:59:50 +0000 From: Bosko Milekic To: Robert Watson Message-ID: <20030627115950.GA8424@technokratis.com> References: <45F05EA1-A89A-11D7-9C1D-000393BC13C6@rsu.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i cc: arch@freebsd.org cc: os@rsu.ru cc: "Michael A. Bushkov" Subject: Re: dynamically linked root and nscd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jun 2003 15:56:48 -0000 On Fri, Jun 27, 2003 at 11:46:10AM -0400, Robert Watson wrote: > > On Fri, 27 Jun 2003, Michael A. Bushkov wrote: > > > We've read some messages from "Making a dynamically-linked root", but > > we're still not sure if these things would be done or not. Will next > > versions of FreeBSD have a dynamically linked root? > > It sounds like 5.2 and future versions will support building the system > with a dynamically linked root; it's not clear to me that the decision on > whether to default to a dynamically linked root has been made yet (it > sounded like the jury was out on detailed performance measurements, etc). > > > And another questions: are you interested in developing nscd (Caching > > daemon) analog for FreeBSD? If you are, we have an ability to develop > > it. > > I think the answer is that, regardless of whether the default is dynamic > or not, there's still interest in a caching daemon for nsswitch, since it > will provide a way to do easy centralized caching and management, as well > as allow more functionality for those who choose not to link the system > dynamically. If you have the resources and interest to create such a > thing, I encourage you to do so :-). A useful starting exercise would be > to look at any existing implementations (especially Berkeley-licensed > ones) and see whether they could be used verbatim, how they could be > improved on, etc. I hate to intrude like this here, but I have a question. When you guys talk about "caching daemon," I hope you mean the same thing. 
Do you mean "a daemon that would only do caching and be queried by the
libc stuff before the nss code calls the backend" or do you mean "a
daemon that the nss code would talk to and that would not only do caching
but also take care of calling the backend?"  Because, in the former case,
you still need to dynamically link whereas in the latter (more appealing)
case, you don't.

I may be totally wrong here, but I could have sworn I saw someone post
about working/having worked on something like the latter somewhere on our
lists within the past two weeks.

> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Network Associates Laboratories

--
Bosko Milekic  *  bmilekic@technokratis.com  *  bmilekic@FreeBSD.org
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 09:03:06 2003
Date: Fri, 27 Jun 2003 12:02:38 -0400 (EDT)
From: Robert Watson
To: Bosko Milekic
In-Reply-To: <20030627115950.GA8424@technokratis.com>
cc: arch@freebsd.org
cc: os@rsu.ru
cc: "Michael A.
Bushkov"
Subject: Re: dynamically linked root and nscd

On Fri, 27 Jun 2003, Bosko Milekic wrote:

> I hate to intrude like this here, but I have a question.
>
> When you guys talk about "caching daemon," I hope you mean the same
> thing. Do you mean "a daemon that would only do caching and be queried
> by the libc stuff before the nss code calls the backend" or do you mean
> "a daemon that the nss code would talk to and that would not only do
> caching but also take care of calling the backend?" Because, in the
> former case, you still need to dynamically link whereas in the latter
> (more appealing case), you don't.

What I have in mind is an NSS libc stub client that speaks to a UNIX
domain socket, which hooks you up to nscd which dynamically links against
the providers of the NSS lookup services.

> I may be totally wrong here, but I could have sworn I saw someone
> post about working/having worked on something like the latter somewhere
> on our lists within the past two weeks.

Could well be, and should definitely be looked at before implementing
anything.  It could also be that NetBSD has an nscd.
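
The stub-client idea described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not an existing
FreeBSD interface: the names nscd_connect() and nscd_query(), the socket
path passed in by the caller, and the one-line "database key" wire format
are all hypothetical.  The daemon side, which would link against the NSS
providers, is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical wire format (an assumption for this sketch):
 * the stub sends "<database> <key>\n" and reads one reply line back.
 */

/* Connect to the daemon's UNIX domain socket; returns an fd or -1. */
static int
nscd_connect(const char *path)
{
    struct sockaddr_un sun;
    int fd;

    if (strlen(path) >= sizeof(sun.sun_path))
        return (-1);
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return (-1);
    memset(&sun, 0, sizeof(sun));
    sun.sun_family = AF_UNIX;
    strcpy(sun.sun_path, path);
    if (connect(fd, (struct sockaddr *)&sun, sizeof(sun)) == -1) {
        close(fd);
        return (-1);
    }
    return (fd);
}

/* Send one lookup and copy the reply line (without '\n') into out. */
static int
nscd_query(int fd, const char *db, const char *key, char *out, size_t outlen)
{
    char req[256];
    size_t n = 0;
    int len;

    if (outlen == 0)
        return (-1);
    len = snprintf(req, sizeof(req), "%s %s\n", db, key);
    if (len < 0 || (size_t)len >= sizeof(req))
        return (-1);
    if (write(fd, req, (size_t)len) != (ssize_t)len)
        return (-1);
    /* Read byte by byte until newline or until the buffer is full. */
    while (n + 1 < outlen) {
        ssize_t r = read(fd, out + n, 1);

        if (r <= 0)
            return (-1);
        if (out[n] == '\n')
            break;
        n++;
    }
    out[n] = '\0';
    return (0);
}
```

The point of the split is that the statically linked program only needs
socket calls in libc; anything that requires loading provider modules
stays inside the daemon.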

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 09:13:47 2003
Date: Fri, 27 Jun 2003 12:16:46 +0000
From: Bosko Milekic
To: Robert Watson
Message-ID: <20030627121646.GA8678@technokratis.com>
References: <20030627115950.GA8424@technokratis.com>
cc: arch@freebsd.org
cc: os@rsu.ru
cc: "Michael A. Bushkov"
Subject: Re: dynamically linked root and nscd

On Fri, Jun 27, 2003 at 12:02:38PM -0400, Robert Watson wrote:
>
> On Fri, 27 Jun 2003, Bosko Milekic wrote:
>
> > I hate to intrude like this here, but I have a question.
> >
> > When you guys talk about "caching daemon," I hope you mean the same
> > thing.
> > Do you mean "a daemon that would only do caching and be queried
> > by the libc stuff before the nss code calls the backend" or do you mean
> > "a daemon that the nss code would talk to and that would not only do
> > caching but also take care of calling the backend?" Because, in the
> > former case, you still need to dynamically link whereas in the latter
> > (more appealing case), you don't.
>
> What I have in mind is an NSS libc stub client that speaks to a UNIX
> domain socket, which hooks you up to nscd which dynamically links against
> the providers of the NSS lookup services.

Yes, this is precisely what I had in mind, too.

I found the relevant post:

    From: Michael Bushkov
    List-Id: Technical Discussions relating to FreeBSD
    Subject: Re[2]: nscd for freebsd
    Date: Fri, 20 Jun 2003 17:43:06 +0400

It seems that the person in the CC was involved in the first discussion,
too; these are in fact the same guys from RSU.

So, to the guys from RSU: why is a dynamically-linked-root important for
the daemon dispatcher/cache engine idea?

--
Bosko Milekic  *  bmilekic@technokratis.com  *  bmilekic@FreeBSD.org
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 12:33:26 2003
From: Andrew Gallatin
Message-ID: <16124.39930.142492.356163@grasshopper.cs.duke.edu>
Date: Fri, 27 Jun 2003 15:33:14 -0400 (EDT)
To: Scott Long
In-Reply-To: <3EF3C12F.9060303@btc.adaptec.com>
References: <3EF3C12F.9060303@btc.adaptec.com>
cc: freebsd-arch@freebsd.org
Subject: Re: API change for bus_dma

Scott Long writes:
 > All,
 >
 > As I work towards locking down storage drivers, I'm also preening their
 > use of busdma. A common theme among them is a misuse of
 > bus_dmamap_load() and the associated callback mechanism.
 > For most, the

Why not just add a way to avoid deferring the callback entirely?  I'm
talking about something like Solaris' ability to pass DDI_DMA_DONTWAIT
as the callback function to ddi_dma_buf_bind_handle().

My desire is to know immediately whether the DMA mapping failed.  If it
failed, that's OK with me.  Since my driver locks user memory and maps it
for DMA for a potentially unbounded amount of time, I want to know about
a failure right away, not pile up requests that will never be satisfied.

Can I get the behaviour I'm after by adding BUS_DMA_NOWAIT to the flags
I pass to bus_dmamap_load()?

Drew

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 12:50:09 2003
Message-ID: <3EFC9F2D.6020908@btc.adaptec.com>
Date: Fri, 27 Jun 2003 13:46:53 -0600
From: Scott Long
To: Andrew Gallatin
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
In-Reply-To: <16124.39930.142492.356163@grasshopper.cs.duke.edu>
cc: freebsd-arch@freebsd.org
Subject: Re: API change
 for bus_dma

Andrew Gallatin wrote:
> Scott Long writes:
> > All,
> >
> > As I work towards locking down storage drivers, I'm also preening their
> > use of busdma. A common theme among them is a misuse of
> > bus_dmamap_load() and the associated callback mechanism. For most, the
>
> Why not just add a way to avoid deferring the callback entirely? I'm
> talking about something like Solaris' ability to pass DDI_DMA_DONTWAIT
> as the callback function to ddi_dma_buf_bind_handle().
>

As you hinted below, BUS_DMA_NOWAIT does what you want.  It will return
ENOMEM to the caller if the bounce buffers cannot be pre-allocated
during bus_dmamap_load().

> My desire is to know immediately whether the DMA mapping failed. If
> it failed, that's OK with me. Since my driver locks users memory and maps
> it for DMA for a potentially unbounded amount of time, I want to
> know about a failure right away, not pile up requests that will never
> be satisfied.

bus_dmamap_load() is designed to always return immediately with either
success (meaning that the callback was called), or some sort of error
code.  A return of EINPROGRESS means that bouncing was required but
bounce buffers were not available, so the callback was deferred.

There is a flaw in that bus_dmamap_load() treats every request
individually.  If a request comes in that has to be deferred, and a
second request comes in that doesn't need to be deferred, it's likely
that the second request will be serviced before the first.  SCSI drivers
(that check EINPROGRESS correctly) get around this by telling CAM to
freeze the queue until the deferred request is serviced.  The better
solution is probably for bus_dmamap_load() to defer any new requests
while prior ones are still pending.
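
The return-code contract and the ordering flaw described above can be
illustrated with a small userspace model.  This is a sketch under stated
assumptions, not the kernel implementation: every model_* name is
hypothetical, MODEL_NOWAIT merely stands in for BUS_DMA_NOWAIT, and the
bounce pool is reduced to a page counter.  Setting strict_order models
the suggested fix of deferring new requests behind pending ones.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NOWAIT 0x1    /* stands in for BUS_DMA_NOWAIT */
#define MODEL_MAXQ   8

typedef void model_cb_t(void *arg, int error);

struct model_req {
    int pages;
    model_cb_t *cb;
    void *arg;
};

struct model_tag {
    int free_pages;         /* bounce pages currently available */
    int strict_order;       /* 1: new requests wait behind pending ones */
    struct model_req q[MODEL_MAXQ];
    int qlen;
};

/* Completion log so a caller can observe callback order. */
static int model_log[MODEL_MAXQ * 2];
static int model_nlog;

static void
model_note(void *arg, int error)
{
    if (error == 0 &&
        model_nlog < (int)(sizeof(model_log) / sizeof(model_log[0])))
        model_log[model_nlog++] = *(int *)arg;
}

/* 0: callback already ran; EINPROGRESS: deferred; ENOMEM: refused. */
static int
model_load(struct model_tag *t, int pages, int flags, model_cb_t *cb,
    void *arg)
{
    int nowait = (flags & MODEL_NOWAIT) != 0;

    if (pages <= t->free_pages &&
        (nowait || !t->strict_order || t->qlen == 0)) {
        t->free_pages -= pages;
        cb(arg, 0);
        return (0);
    }
    if (nowait || t->qlen == MODEL_MAXQ)
        return (ENOMEM);
    t->q[t->qlen++] = (struct model_req){ pages, cb, arg };
    return (EINPROGRESS);
}

/* Give pages back and run deferred callbacks, oldest first. */
static void
model_release(struct model_tag *t, int pages)
{
    t->free_pages += pages;
    while (t->qlen > 0 && t->q[0].pages <= t->free_pages) {
        struct model_req r = t->q[0];

        for (int i = 1; i < t->qlen; i++)
            t->q[i - 1] = t->q[i];
        t->qlen--;
        t->free_pages -= r.pages;
        r.cb(r.arg, 0);
    }
}
```

With strict_order left at 0, a small request submitted after a deferred
large one completes first, which is the reordering mentioned above; with
strict_order set to 1 the small request queues behind the large one.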

Scott

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 13:41:13 2003
From: Andrew Gallatin
Message-ID: <16124.43999.333761.397624@grasshopper.cs.duke.edu>
Date: Fri, 27 Jun 2003 16:41:03 -0400 (EDT)
To: Scott Long
In-Reply-To: <3EFC9F2D.6020908@btc.adaptec.com>
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
 <3EFC9F2D.6020908@btc.adaptec.com>
cc: freebsd-arch@freebsd.org
Subject: Re: API change for bus_dma

Scott Long writes:
 >
 > As you hinted below, BUS_DMA_NOWAIT does what you want. It will return
 > ENOMEM to the caller if the bounce buffers cannot be pre-allocated
 > during bus_dmamap_load().

OK, thanks.  It looks like sparc64 also returns ENOMEM if it runs out of
sgmap space..

One more question: What's the FreeBSD equivalent of Solaris'
DDI_DMA_CONSISTENT and DDI_DMA_STREAMING?

Thanks,

Drew

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 13:46:54 2003
Message-ID: <3EFCAC7A.6060305@btc.adaptec.com>
Date: Fri, 27 Jun 2003 14:43:38 -0600
From: Scott Long
To: Andrew Gallatin
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
 <3EFC9F2D.6020908@btc.adaptec.com>
 <16124.43999.333761.397624@grasshopper.cs.duke.edu>
In-Reply-To: <16124.43999.333761.397624@grasshopper.cs.duke.edu>
cc: freebsd-arch@freebsd.org
Subject: Re: API change for bus_dma

Andrew Gallatin wrote:
> Scott Long writes:
> >
> > As you hinted below, BUS_DMA_NOWAIT does what you want.
> > It will return
> > ENOMEM to the caller if the bounce buffers cannot be pre-allocated
> > during bus_dmamap_load().
>
> OK, thanks. I looks like sparc64 also returns ENOMEM if it runs out of
> sgmap space..
>
> One more question: What's the FreeBSD equivalent of Solaris'
> DDI_DMA_CONSISTENT and DDI_DMA_STREAMING?
>
> Thanks,
>
> Drew

I'm not familiar with Solaris DDI.  bus_dmamem_alloc() is guaranteed to
give you contiguous memory that doesn't require bouncing (or ENOMEM if
that's not possible).  I can't imagine what DDI_DMA_STREAMING is.

Scott

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 13:58:43 2003
From: Andrew Gallatin
Message-ID: <16124.45051.919899.414795@grasshopper.cs.duke.edu>
Date: Fri, 27 Jun 2003 16:58:35 -0400 (EDT)
To: Scott Long
In-Reply-To: <3EFCAC7A.6060305@btc.adaptec.com>
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
 <3EFC9F2D.6020908@btc.adaptec.com>
 <16124.43999.333761.397624@grasshopper.cs.duke.edu>
 <3EFCAC7A.6060305@btc.adaptec.com>
X-Mailer: VM 6.75 under 21.1 (patch 12)
"Channel Islands" XEmacs Lucid cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jun 2003 20:58:43 -0000 Scott Long writes: > > I'm not familiar with Solaris DDI. bus_dmamem_alloc() is guaranteed to > give you contiguous memory that doesn't require bouncing (or ENOMEM if > that's not possible). I can't imagine what DDI_DMA_STREAMING is. Most sparc's have 2 different sorts of DMA modes. One is cache coherent (aka DDI_DMA_CONSISTENT) -- this is what we all know and love from PC, alphas, macs, etc. The other mode (DDI_DMA_STREAMING) allows non cache coherent DMA. This requires you to call ddi_dma_sync() between your last touch of the data and you starting a DMA read from a device. And vice-versa for a DMA write. The reason people use DDI_DMA_STREAMING is because coherent DMA bandwith tends to be abysmal on most sparcs. Using DDI_DMA_STREAMING upgrades the bandwith from abysmal to just bad. Here are some examples: For u80, UltraSPARC II, using chip "Psycho", 98 MBytes/s consistent vs. 150 MBytes/s streaming. For sunfire, UltraSPARC III, using chip "Schizo", 70 MBytes/s consistent vs. 173 MBytes/s streaming. (compare to 450MB/sec for most intel 64-bit/66MHz PCI slots).. 

Drew

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 14:08:16 2003
Message-ID: <3EFCB178.9030207@btc.adaptec.com>
Date: Fri, 27 Jun 2003 15:04:56 -0600
From: Scott Long
To: Andrew Gallatin
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
 <3EFC9F2D.6020908@btc.adaptec.com>
 <16124.43999.333761.397624@grasshopper.cs.duke.edu>
 <3EFCAC7A.6060305@btc.adaptec.com>
 <16124.45051.919899.414795@grasshopper.cs.duke.edu>
In-Reply-To: <16124.45051.919899.414795@grasshopper.cs.duke.edu>
cc: freebsd-arch@freebsd.org
Subject: Re: API change for bus_dma

Andrew Gallatin wrote:
> Scott Long writes:
> >
> > I'm not familiar with Solaris DDI.
> > bus_dmamem_alloc() is guaranteed to
> > give you contiguous memory that doesn't require bouncing (or ENOMEM if
> > that's not possible). I can't imagine what DDI_DMA_STREAMING is.
>
> Most sparc's have 2 different sorts of DMA modes. One is cache
> coherent (aka DDI_DMA_CONSISTENT) -- this is what we all know and love
> from PC, alphas, macs, etc.
>
> The other mode (DDI_DMA_STREAMING) allows non cache coherent DMA.
> This requires you to call ddi_dma_sync() between your last touch of
> the data and you starting a DMA read from a device. And vice-versa
> for a DMA write.
>
> The reason people use DDI_DMA_STREAMING is because coherent DMA
> bandwith tends to be abysmal on most sparcs. Using DDI_DMA_STREAMING
> upgrades the bandwith from abysmal to just bad. Here are some
> examples:
>
> For u80, UltraSPARC II, using chip "Psycho",
> 98 MBytes/s consistent vs. 150 MBytes/s streaming.
> For sunfire, UltraSPARC III, using chip "Schizo",
> 70 MBytes/s consistent vs. 173 MBytes/s streaming.
>
> (compare to 450MB/sec for most intel 64-bit/66MHz PCI slots)..
>
> Drew
>

The approach taken with busdma is that you don't assume coherency.  The
idea is to call bus_dmamap_sync() with the appropriate opcode to signal
pre|post read|write, and have that take care of the platform-specific
magic.  On i386, when bouncing does not occur, these calls are no-ops;
otherwise the actual bouncing bcopy() takes place.  On sparc64 it looks
like it does the appropriate IOMMU and memory barrier magic.
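
The bounce-copy behaviour described above for i386 can be modelled in
userspace along these lines.  This is a sketch only: the model_* names
and the struct layout are assumptions made for illustration, and the
real bus_dmamap_sync() takes a DMA tag and map rather than this
structure.  Here "write" means the device reads memory and "read" means
the device writes memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the pre|post read|write opcodes. */
enum model_syncop {
    MODEL_PREREAD,
    MODEL_POSTREAD,
    MODEL_PREWRITE,
    MODEL_POSTWRITE
};

struct model_map {
    void *client;   /* the driver's own buffer */
    void *bounce;   /* device-reachable copy, or NULL if not bouncing */
    size_t len;
};

static void
model_sync(struct model_map *m, enum model_syncop op)
{
    /* No bounce buffer: the device uses client directly, nothing to do. */
    if (m->bounce == NULL)
        return;

    switch (op) {
    case MODEL_PREWRITE:
        /* CPU filled client; copy it where the device will read it. */
        memcpy(m->bounce, m->client, m->len);
        break;
    case MODEL_POSTREAD:
        /* Device filled bounce; copy it back for the CPU to read. */
        memcpy(m->client, m->bounce, m->len);
        break;
    default:
        /* PREREAD and POSTWRITE need no copy in this model. */
        break;
    }
}
```

On a platform that is not cache coherent, the same two call sites would
be where cache flushes or barriers go, which is why the driver is asked
to make the calls even when they cost nothing locally.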

Scott

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 14:22:06 2003
From: Andrew Gallatin
Message-ID: <16124.46454.595892.860118@grasshopper.cs.duke.edu>
Date: Fri, 27 Jun 2003 17:21:58 -0400 (EDT)
To: Scott Long
In-Reply-To: <3EFCB178.9030207@btc.adaptec.com>
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
 <3EFC9F2D.6020908@btc.adaptec.com>
 <16124.43999.333761.397624@grasshopper.cs.duke.edu>
 <3EFCAC7A.6060305@btc.adaptec.com>
 <16124.45051.919899.414795@grasshopper.cs.duke.edu>
 <3EFCB178.9030207@btc.adaptec.com>
cc: freebsd-arch@freebsd.org
Subject: Re: API change for bus_dma

Scott Long writes:
 >
 > The approach taken with busdma is that you don't assume coherency.
 > The

Unfortunately, in our application we must assume coherency in some
situations.  We have kernel memory mmap'ed into user space for zero-copy
io of small messages.  Doing a system call to force the dma sync would
add unacceptable latency.  (we're talking sub 10us latencies here,
without syscalls).

 > idea is to call bus_dmamap_sync() with the appropriate opcode to signal
 > pre|post read|write, and have that take care of the platform-specific
 > magic. On i386 when bouncing does not occur, these are NOOP, otherwise
 > the actual bouncing bcopy() takes place. On sparc64 it looks like it
 > does the appropriate IOMMU and memory barrier magic.

Sure, but we're a 64-bit card and never bounce.  If we've bounced, we
might as well take the ball and go home, so to speak ;)

Anyway, this has saved me a lot of time.  It's now apparent that there's
no point in our using busdma, since the main gain would have been to
enable us to run on sparc64.  Directly using physical addresses works
great everywhere else..

Drew

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 14:32:30 2003
Message-ID: <3EFCB725.4060902@btc.adaptec.com>
Date: Fri, 27 Jun 2003 15:29:09 -0600
From: Scott Long
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386;
 en-US; rv:1.3) Gecko/20030414
To: Andrew Gallatin
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
 <3EFC9F2D.6020908@btc.adaptec.com>
 <16124.43999.333761.397624@grasshopper.cs.duke.edu>
 <3EFCAC7A.6060305@btc.adaptec.com>
 <16124.45051.919899.414795@grasshopper.cs.duke.edu>
 <3EFCB178.9030207@btc.adaptec.com>
 <16124.46454.595892.860118@grasshopper.cs.duke.edu>
In-Reply-To: <16124.46454.595892.860118@grasshopper.cs.duke.edu>
cc: freebsd-arch@freebsd.org
Subject: Re: API change for bus_dma

Andrew Gallatin wrote:
> Scott Long writes:
> >
> > The approach taken with busdma is that you don't assume coherency. The
>
> Unfortunately, in our application we must assume coherency in some
> situations. We have kernel memory mmap'ed into user space for
> zero-copy io of small messages. Doing a system call to force the dma
> sync would add unacceptable latency. (we're talking sub 10us latencies
> here, without syscalls).
>

The bus_dmamap_sync() isn't done from a separate system call.  The flow
is this:

    bus_dmamap_load();
    driver_callback()
    {
        set up S/G list;
        bus_dmamap_sync(PREWRITE);
        tell hardware that DMA is ready;
    }

The callback gets called immediately as long as conditions are met, as
we discussed previously.

Then:

    driver_intr()
    {
        see that hardware has DMA'd data to us;
        bus_dmamap_sync(POSTREAD);
        bus_dmamap_unload();
    }

As I understand it, it is possible to set the Psycho bridge to use a
coherent address range, but FreeBSD doesn't take advantage of that yet.

Scott

> > idea is to call bus_dmamap_sync() with the appropriate opcode to signal
> > pre|post read|write, and have that take care of the platform-specific
> > magic. On i386 when bouncing does not occur, these are NOOP, otherwise
> > the actual bouncing bcopy() takes place. On sparc64 it looks like it
> > does the appropriate IOMMU and memory barrier magic.
>
> Sure, but we're a 64-bit card and never bounce. If we've bounced, we
> might as well take the ball and go home, so to speak ;)
>
> Anyway, this has saved me a lot of time. Its now apparent that
> there's no point in our using busdma, since the main gain would have
> been to enable us to run on sparc64. Directly using physical
> addresses works great everywhere else..
>
> Drew
>
>
>

From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 14:52:37 2003
From: Andrew Gallatin
Message-ID: <16124.48285.343025.428957@grasshopper.cs.duke.edu>
Date: Fri, 27 Jun 2003 17:52:29 -0400 (EDT)
To: Scott Long
In-Reply-To: <3EFCB725.4060902@btc.adaptec.com>
References: <3EF3C12F.9060303@btc.adaptec.com>
 <16124.39930.142492.356163@grasshopper.cs.duke.edu>
 <3EFC9F2D.6020908@btc.adaptec.com>
 <16124.43999.333761.397624@grasshopper.cs.duke.edu>
 <3EFCAC7A.6060305@btc.adaptec.com>
 <16124.45051.919899.414795@grasshopper.cs.duke.edu>
 <3EFCB178.9030207@btc.adaptec.com>
 <16124.46454.595892.860118@grasshopper.cs.duke.edu>
 <3EFCB725.4060902@btc.adaptec.com>
cc: Andrew Gallatin
cc: freebsd-arch@freebsd.org
Subject: Re: API change for bus_dma

Scott Long writes:
 > >
 > > Unfortunately, in our application we must assume coherency in some
 > > situations. We have kernel memory mmap'ed into user space for
 > > zero-copy io of small messages. Doing a system call to force the dma
 > > sync would add unacceptable latency. (we're talking sub 10us latencies
 > > here, without syscalls).
 > >
 >
 > The bus_dmamap_sync() isn't done from a separate system call. The flow
 > is this:
 >
 >
 > bus_dmamap_load();
 > driver_callback()
 > {
 > set up S/G list;
 > bus_dmamap_sync(PREWRITE);
 > tell hardware that DMA is ready;
 > }
 >
 > The callback gets called immediately as long as conditions are met, as
 > we have discuss prior.
 >
 > Then:
 >
 > driver_intr()
 > {
 > see that hardware has DMA'd data to us;
 > bus_dmamap_sync(POSTREAD);
 > bus_dmamap_unload();
 > }

In our application, millions of separate DMAs can happen with no kernel
intervention at all.

We do the bus_dmamap_load() in the context of an ioctl which allocates
some kernel memory, and pokes the DMA addresses for the kernel memory
down onto the board.  The user then mmaps the kernel memory, and also
mmaps a page of SRAM on our board.
When the user wants to initiate a DMA, he writes to the device's SRAM indicating an offset in the mmapped kernel memory. The board then initiates the transfer (after doing bounds checking). When the user is done, or the process exits, the DMA addresses are cleared from the NIC, and the kernel memory is freed. This is OS-bypass networking. > As I understand it, it is possible to set the psycho bridge to use > a coherent address range, but FreeBSD doesn't take advantage of that > yet. > Yes, that's what Solaris does. Drew From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 15:30:59 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8426F37B401 for ; Fri, 27 Jun 2003 15:30:59 -0700 (PDT) Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id A152644001 for ; Fri, 27 Jun 2003 15:30:58 -0700 (PDT) (envelope-from gibbs@scsiguy.com) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h5RMU7822962; Fri, 27 Jun 2003 15:30:07 -0700 Received: from [10.100.253.70] (aslan.btc.adaptec.com [10.100.253.70]) by redfish.adaptec.com (8.8.8p2+Sun/8.8.8) with ESMTP id PAA02455; Fri, 27 Jun 2003 15:30:57 -0700 (PDT) Date: Fri, 27 Jun 2003 16:32:07 -0600 From: "Justin T.
Gibbs" To: Andrew Gallatin , Scott Long Message-ID: <721230000.1056753126@aslan.btc.adaptec.com> In-Reply-To: <16124.48285.343025.428957@grasshopper.cs.duke.edu> References: <3EF3C12F.9060303@btc.adaptec.com> <16124.39930.142492.356163@grasshopper.cs.duke.edu> <3EFC9F2D.6020908@btc.adaptec.com> <16124.43999.333761.397624@grasshopper.cs.duke.edu> <3EFCAC7A.6060305@btc.adaptec.com> <16124.45051.919899.414795@grasshopper.cs.duke.edu> <3EFCB178.9030207@btc.adaptec.com> <16124.46454.595892.860118@grasshopper.cs.duke.edu> <3EFCB725.4060902@btc.adaptec.com> <16124.48285.343025.428957@grasshopper.cs.duke.edu> X-Mailer: Mulberry/3.0.3 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: "Justin T. Gibbs" List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jun 2003 22:30:59 -0000 > > As I understand it, it is possible to set the psycho bridge to use > > a coherent address range, but FreeBSD doesn't take advantage of that > > yet. > > > > Yes, that's what Solaris does. We added BUS_DMA_COHERENT to the API just before shipping 5.1. It is only a "hint", so if you need to verify that the implementation was able to give you coherent memory, we should add an API to allow you to know. Of course, the Sparc bus dma implementation doesn't honor the flag yet, but I'm sure that will change shortly.
-- Justin From owner-freebsd-arch@FreeBSD.ORG Fri Jun 27 23:25:13 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A497937B401; Fri, 27 Jun 2003 23:25:13 -0700 (PDT) Received: from asterix.rsu.ru (asterix.rsu.ru [195.208.245.250]) by mx1.FreeBSD.org (Postfix) with ESMTP id F142A44011; Fri, 27 Jun 2003 23:25:11 -0700 (PDT) (envelope-from os@rsu.ru) Received: from brain.cc.rsu.ru (brain.cc.rsu.ru [195.208.252.154]) (authenticated bits=0) by asterix.rsu.ru (8.12.9/8.12.9) with ESMTP id h5S6P34e063146 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Sat, 28 Jun 2003 10:25:03 +0400 (MSD) (envelope-from os@rsu.ru) Date: Sat, 28 Jun 2003 10:25:03 +0400 (MSD) From: Oleg Sharoiko X-X-Sender: os@brain.cc.rsu.ru To: Bosko Milekic In-Reply-To: <20030627121646.GA8678@technokratis.com> Message-ID: <20030628092435.B547@brain.cc.rsu.ru> References: <20030627115950.GA8424@technokratis.com> <20030627121646.GA8678@technokratis.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-100.5 required=5.0 tests=EMAIL_ATTRIBUTION,USER_IN_WHITELIST version=2.54 X-Spam-Checker-Version: SpamAssassin 2.54 (1.174.2.17-2003-05-11-exp) cc: arch@freebsd.org cc: Robert Watson cc: "Michael A. Bushkov" Subject: Re: dynamically linked root and nscd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 06:25:13 -0000 On Fri, 27 Jun 2003, Bosko Milekic wrote: BM> So, to the guys from RSU: why is a dynamically-linked-root important BM> for the daemon dispatcher/cache engine idea? We (guys from RSU) have to have a clear understanding of the road FreeBSD's NSS will be developed. Linux/Solaris seem to implement 'in process' model of NSS. 
MacOS X seems to have 'IPC' model (correct me if I'm wrong). 'In process' model has the benefit of being (semi-)compatible with existing NSS modules (at least it can be implemented this way). This means that third-party open source modules can be used in FreeBSD (e.g. padl's nss_ldap). But 'in process' model doesn't support static binaries. 'IPC' model will work fine with static binaries, but it's very different from 'in process' model. With 'IPC' model FreeBSD will need its own set of NSS modules. If I understood correctly there is some interest in a combination of both of these models, correct? Do we really need both of these models? With a dynamically linked root we have a working 'in process' NSS and only a 'clean caching' daemon is required. We are ready to work hard on a project that is really needed but we cannot waste our time on code that will be discarded (as happened with our IPC implementation of NSS). Our final question is: should we develop a 'caching only' daemon or a daemon which will also dispatch requests to NSS agents? I hope my English is understandable. If not - please let me know and I'll try to re-express my thoughts. -- Oleg Sharoiko. Software and Network Engineer Computer Center of Rostov State University.
From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 00:46:12 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F034637B401 for ; Sat, 28 Jun 2003 00:46:12 -0700 (PDT) Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net [207.217.120.189]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5956243FD7 for ; Sat, 28 Jun 2003 00:46:12 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from user-2ivfjii.dialup.mindspring.com ([165.247.206.82] helo=mindspring.com) by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 19WAPS-0004dd-00; Sat, 28 Jun 2003 00:46:11 -0700 Message-ID: <3EFD4755.49BAF150@mindspring.com> Date: Sat, 28 Jun 2003 00:44:21 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Andrew Gallatin References: <3EF3C12F.9060303@btc.adaptec.com> <16124.39930.142492.356163@grasshopper.cs.duke.edu> <3EFC9F2D.6020908@btc.adaptec.com> <16124.43999.333761.397624@grasshopper.cs.duke.edu> <16124.45051.919899.414795@grasshopper.cs.duke.edu> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4574c562441626f6360c4526e0ed31fa12601a10902912494350badd9bab72f9c350badd9bab72f9c cc: Scott Long cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 07:46:13 -0000 Andrew Gallatin wrote: > Most sparc's have 2 different sorts of DMA modes. One is cache > coherent (aka DDI_DMA_CONSISTENT) -- this is what we all know and love > from PC, alphas, macs, etc. 
"contiguous" > The other mode (DDI_DMA_STREAMING) allows non cache coherent DMA. > This requires you to call ddi_dma_sync() between your last touch of > the data and you starting a DMA read from a device. And vice-versa > for a DMA write. "scatter/gather" > The reason people use DDI_DMA_STREAMING is because coherent DMA > bandwith tends to be abysmal on most sparcs. I'm not surprised; in order to present a contiguous physical RAM buffer to the device DMA engine, you have two choices: 1) Get lucky 2) Copy the data before triggering the DMA The data copy is what kills you. -- Terry From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 00:54:06 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4995337B401 for ; Sat, 28 Jun 2003 00:54:06 -0700 (PDT) Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net [207.217.120.189]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8A4B643F75 for ; Sat, 28 Jun 2003 00:54:05 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from user-2ivfjii.dialup.mindspring.com ([165.247.206.82] helo=mindspring.com) by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 19WAX5-0005Or-00; Sat, 28 Jun 2003 00:54:04 -0700 Message-ID: <3EFD492A.60C18556@mindspring.com> Date: Sat, 28 Jun 2003 00:52:10 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Andrew Gallatin References: <3EF3C12F.9060303@btc.adaptec.com> <16124.39930.142492.356163@grasshopper.cs.duke.edu> <3EFC9F2D.6020908@btc.adaptec.com> <16124.43999.333761.397624@grasshopper.cs.duke.edu> <3EFCAC7A.6060305@btc.adaptec.com> <16124.45051.919899.414795@grasshopper.cs.duke.edu> <16124.46454.595892.860118@grasshopper.cs.duke.edu> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: 
b1a02af9316fbb217a47c185c03b154d40683398e744b8a4574c562441626f63374ab4d5f00590f3387f7b89c61deb1d350badd9bab72f9c350badd9bab72f9c cc: Scott Long cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 07:54:06 -0000 Andrew Gallatin wrote: > Scott Long writes: > > The approach taken with busdma is that you don't assume coherency. The > > Unfortunately, in our application we must assume coherency in some > situations. We have kernel memory mmap'ed into user space for > zero-copy io of small messages. Doing a system call to force the dma > sync would add unacceptable latency. (we're talking sub 10us latencies > here, without syscalls). "contigmalloc" You have to do the same thing for BT848 and frame buffers that eat host memory instead of having their own to play in. In general, this has to be done in the device driver, very early on in the life of the system to stand any chance of succeeding: because we do not properly use ELF section attribute tags, it's not possible to defrag physical memory in FreeBSD to do these allocations later. The tags are necessary to identify code in the defragmentation code path so that it does not attempt to relocate itself while it is running. NB: FreeBSD doesn't support kernel paging and discard of init routines in device drivers, once the driver is operational, for the same reason: lack of section tags indicating "paging path" in the first instance, and lack of section tags indicating "discard after initialization" in the second.
-- Terry From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 00:55:01 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E570137B401; Sat, 28 Jun 2003 00:55:01 -0700 (PDT) Received: from mx.nsu.ru (mx.nsu.ru [212.192.164.5]) by mx1.FreeBSD.org (Postfix) with ESMTP id B508F43F85; Sat, 28 Jun 2003 00:55:00 -0700 (PDT) (envelope-from fjoe@iclub.nsu.ru) Received: from mail by mx.nsu.ru with drweb-scanned (Exim 3.36 #1 (Debian)) id 19WAfh-00057F-00; Sat, 28 Jun 2003 15:02:57 +0700 Received: from iclub.nsu.ru ([193.124.215.97] ident=root) by mx.nsu.ru with esmtp (Exim 3.36 #1 (Debian)) id 19WAfZ-0004vu-00; Sat, 28 Jun 2003 15:02:50 +0700 Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1]) by iclub.nsu.ru (8.12.9/8.12.9) with ESMTP id h5S7sDMk075406; Sat, 28 Jun 2003 14:54:13 +0700 (NSS) (envelope-from fjoe@iclub.nsu.ru) Received: (from fjoe@localhost) by iclub.nsu.ru (8.12.9/8.12.9/Submit) id h5S7sC90075405; Sat, 28 Jun 2003 14:54:12 +0700 (NSS) Date: Sat, 28 Jun 2003 14:54:12 +0700 From: Max Khon To: Julian Elischer Message-ID: <20030628075412.GD74123@iclub.nsu.ru> References: <20030625232045.GB92939@iclub.nsu.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-Envelope-To: julian@elischer.org, demon@freebsd.org, freebsd-arch@freebsd.org X-Bogosity: No, tests=bogofilter, spamicity=0.000001, version=0.13.6.3 X-Spam-Status: No, hits=-106.5 required=5.0 tests=BOGOFILTER_TEST_PASS,EMAIL_ATTRIBUTION,IN_REP_TO, QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT_MUTT,USER_IN_WHITELIST version=2.55 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp) cc: Dmitry Sivachenko cc: freebsd-arch@freebsd.org Subject: Re: Jailed sysvipc implementation. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 07:55:02 -0000 hi, there! On Wed, Jun 25, 2003 at 04:24:29PM -0700, Julian Elischer wrote: > > btw I know of two projects whose goal is IP stack virtualization for jail. > > Virtual IP stack (as well as virtualized sysvipc with separate > > memory zones) can be quite useful. Can provide two solutions? > > > > - with shared memory zone (for those who want "light" version) > > - with separate memory zones (for people who want to keep > > sysvipc fully separated, i.e. one user can't exhaust all sysvipc resources > > and make sysvipc unusable for second user) > > Is either of these projects Marco Zec's project? yes. The other is Riccardo Scandariato work. Both were discussed on -net about a month ago. /fjoe From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 01:04:48 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 51A6037B405 for ; Sat, 28 Jun 2003 01:04:48 -0700 (PDT) Received: from mail.cyberonic.com (mail.cyberonic.com [4.17.179.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4543643F75 for ; Sat, 28 Jun 2003 01:04:47 -0700 (PDT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (node-40244c0a.sfo.onnet.us.uu.net [64.36.76.10]) by mail.cyberonic.com (8.12.8/8.12.5) with ESMTP id h5S8ThHl015499; Sat, 28 Jun 2003 04:29:44 -0400 Received: (from jmg@localhost) by hydrogen.funkthat.com (8.12.9/8.11.6) id h5S84m5J083802; Sat, 28 Jun 2003 01:04:48 -0700 (PDT) (envelope-from jmg) Date: Sat, 28 Jun 2003 01:04:48 -0700 From: John-Mark Gurney To: Terry Lambert Message-ID: <20030628080448.GI55920@funkthat.com> Mail-Followup-To: Terry Lambert , Andrew Gallatin , Scott Long , freebsd-arch@freebsd.org 
References: <3EF3C12F.9060303@btc.adaptec.com> <16124.39930.142492.356163@grasshopper.cs.duke.edu> <3EFC9F2D.6020908@btc.adaptec.com> <16124.43999.333761.397624@grasshopper.cs.duke.edu> <3EFCAC7A.6060305@btc.adaptec.com> <16124.45051.919899.414795@grasshopper.cs.duke.edu> <16124.46454.595892.860118@grasshopper.cs.duke.edu> <3EFD492A.60C18556@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3EFD492A.60C18556@mindspring.com> User-Agent: Mutt/1.4.1i X-Operating-System: FreeBSD 4.2-RELEASE i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html cc: Scott Long cc: Andrew Gallatin cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: John-Mark Gurney List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 08:04:48 -0000 Terry Lambert wrote this message on Sat, Jun 28, 2003 at 00:52 -0700: > Andrew Gallatin wrote: > > Scott Long writes: > > > The approach taken with busdma is that you don't assume coherency. The > > > > Unfortunately, in our application we must assume coherency in some > > situations. We have kernel memory mmap'ed into user space for > > zero-copy io of small messages. Doing a system call to force the dma > > sync would add unacceptable latency. (we're talking sub 10us latencies > > here, without syscalls). > > "contigmalloc" > > You have to do the same thing for BT848 and frame buffers that eat > host memory instead of having their own to play in.
In general, > this has to be done in the device driver, very early on in the life > of the system to stand any chance of succeeding: because we do not > properly use ELF section attribute tags, it's not possible to defrag > physical memory in FreeBSD to do these allocations later. The tags > are necessary to identify code in the defragmentation code path so > that it does not attempt to relocate itself while it's running itself. > > NB: FreeBSD doesn't support kernel paging and discard of init routines > in device drivers, once the driver is operational, for the same reason: > lack of section tags indicating "paging path" in the first instance, > and lack of section tags indicating "discard after initialization" in > the second. I'm sorry, no, this will not solve the problem he is talking about. You need to reread the information that Andrew has provided before. In a previous email you got confused on the STREAMING/COHERENT flag's meaning. Using contigmalloc only gives you a linear address space, but does not guarantee that the processor will snoop the memory write cycles by the bridge or device to keep the cache of the cpu the same with the memory. For what Andrew needs, he needs the processor to have the same information as in memory. On multiprocessor systems, it can get expensive if every processor has to snoop every memory write that happens. -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." 
From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 01:54:59 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 09D2D37B405 for ; Sat, 28 Jun 2003 01:54:59 -0700 (PDT) Received: from puffin.mail.pas.earthlink.net (puffin.mail.pas.earthlink.net [207.217.120.139]) by mx1.FreeBSD.org (Postfix) with ESMTP id C7CCD4400D for ; Sat, 28 Jun 2003 01:54:57 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from user-2ivfjii.dialup.mindspring.com ([165.247.206.82] helo=mindspring.com) by puffin.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 19WBTu-0006oI-00; Sat, 28 Jun 2003 01:54:51 -0700 Message-ID: <3EFD574B.9419EE71@mindspring.com> Date: Sat, 28 Jun 2003 01:52:27 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: John-Mark Gurney References: <3EF3C12F.9060303@btc.adaptec.com> <16124.39930.142492.356163@grasshopper.cs.duke.edu> <3EFC9F2D.6020908@btc.adaptec.com> <16124.43999.333761.397624@grasshopper.cs.duke.edu> <3EFCAC7A.6060305@btc.adaptec.com> <16124.45051.919899.414795@grasshopper.cs.duke.edu> <16124.46454.595892.860118@grasshopper.cs.duke.edu> <3EFD492A.60C18556@mindspring.com> <20030628080448.GI55920@funkthat.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4567a418c676e86940cf383f60ee7f58093caf27dac41a8fd350badd9bab72f9c350badd9bab72f9c cc: Scott Long cc: Andrew Gallatin cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 08:54:59 -0000 John-Mark Gurney wrote: > I'm sorry, no, this will not solve the problem he is 
talking about. > You need to reread the information that Andrew has provided before. > In a previous email you got confused on the STREAMING/COHERENT flag's > meaning. Using contigmalloc only gives you a linear address space, > but does not guarantee that the processor will snoop the memory write > cycles by the bridge or device to keep the cache of the cpu the same > with the memory. For what Andrew needs, he needs the processor to have > the same information as in memory. On multiprocessor systems, it can > get expensive if every processor has to snoop every memory write that > happens. Clearly, I don't have a deep understanding of SPARC64 SMP hardware; given what he was saying, it still looks to me that the issue he was attempting to address was related to whether or not the memory in question was physically vs. logically contiguous: It also still looks to me that the use of "cache coherent" in: was referring to user space memory and device memory, and not the processor cache. Reading the Solaris ddi_dma_sync(9) man page: Doesn't change that impression for me (it mentions that explicitly calling the function may result in cache flushes, but doesn't imply snooping will occur). There's a good programming article on "Writing Device Drivers" in the "Sun Product Documentation" online: That discusses this in detail, and which seems (to me) to be authoritative. I'd be happy to be corrected, but if you're going to correct me, please tell me *why* I'm wrong, instead of just telling me *that* I'm wrong, since I really *am* interested in not being wrong for the same root cause in the future. 
Thanks, -- Terry From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 08:46:58 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B8B8837B401; Sat, 28 Jun 2003 08:46:58 -0700 (PDT) Received: from godel.mtl.distributel.net (nat.MTL.distributel.NET [66.38.181.24]) by mx1.FreeBSD.org (Postfix) with ESMTP id B68584401F; Sat, 28 Jun 2003 08:46:57 -0700 (PDT) (envelope-from bmilekic@technokratis.com) Received: from godel.mtl.distributel.net (localhost [127.0.0.1]) h5SBo0DL015208; Sat, 28 Jun 2003 11:50:00 GMT (envelope-from bmilekic@technokratis.com) Received: (from bmilekic@localhost) by godel.mtl.distributel.net (8.12.9/8.12.9/Submit) id h5SBnxWP015204; Sat, 28 Jun 2003 11:49:59 GMT X-Authentication-Warning: godel.mtl.distributel.net: bmilekic set sender to bmilekic@technokratis.com using -f Date: Sat, 28 Jun 2003 11:49:59 +0000 From: Bosko Milekic To: Oleg Sharoiko Message-ID: <20030628114959.GA15104@technokratis.com> References: <20030627115950.GA8424@technokratis.com> <20030627121646.GA8678@technokratis.com> <20030628092435.B547@brain.cc.rsu.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030628092435.B547@brain.cc.rsu.ru> User-Agent: Mutt/1.4.1i cc: arch@freebsd.org cc: Robert Watson cc: "Michael A. Bushkov" Subject: Re: dynamically linked root and nscd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 15:46:59 -0000 On Sat, Jun 28, 2003 at 10:25:03AM +0400, Oleg Sharoiko wrote: > > On Fri, 27 Jun 2003, Bosko Milekic wrote: > > BM> So, to the guys from RSU: why is a dynamically-linked-root important > BM> for the daemon dispatcher/cache engine idea? 
> > We (guys from RSU) have to have a clear understanding of the road FreeBSD's > NSS will be developed. Linux/Solaris seem to implement 'in process' model of > NSS. MacOS X seem to have 'IPC' model (correct me if I'm wrong). > > 'In process' model has the benefit of being (semi-)compatible with existing > NSS modules (at least it can be implemented this way). This means that > third-party open source modules can be used in FreeBSD (ex. padl's nss_ldap). > But 'in process' model doesn't support static binaries. > > 'IPC' model will work fine with static binaries, but it's very different from > 'in process' model. With 'ICP' model FreeBSD will need it's own set of NSS > modules. > > If I understood correctly there is some interest in combination of both of > these model, correct? Do we really need both of this models? With dynamicly > linked root we have a working 'in process' NSS and only 'clean caching' daemon > is required. We are ready to work hard on a project that is really needed but > we can not waste our time on code that will be discarded (as it happened with > our IPC implementation of NSS). > > Our final question is: should we develop a 'caching only' daemon or a daemon > which will also dispatch requests to NSS agents? > > I hope my English is understandable. If not - please let me know and I'll try > to re-express my thoughts. > > -- > Oleg Sharoiko. > Software and Network Engineer > Computer Center of Rostov State University. This is what I suspected your response would be and it is why I wanted to make sure that things were well understood. I can't speak for everyone, but I'm a big fan of what you call the IPC model. Here is a little bit about why. 1) It works for both statically and dynamically linked binaries. 2) You can introduce a caching layer between the libc dispatch code and the daemon dispatch code. I think that the daemon should be what dispatches the nss requests. 
If it's going to do caching, it needs to know about what data they return anyway and going back and forth between the libc code and the daemon over a socket is wasteful; so you may as well go "up to" the daemon _once_ when the call is made, and then "back down" to the libc code _once_, once the data is available. Anything else in-between should not warrant back-and-forth activity between the libc code and the daemon. 3) You can maintain a single connection instance to the backend as specified by an agent module, instead of having to constantly re-establish the connection. 4) You can cache the data read from the config files. I know that Jacques Vidrine (who has done the work on nss in FreeBSD) may tend to disagree. In a previous reply to you he mentioned that he thought this model was overkill because: 1) It would require all consumers in libc to marshal arguments 2) Some NSS modules already have their own daemons and that seems to be the direction they're headed in. For (1), I'm sure it's possible to pass the data down to the daemon opaquely and have the daemon pass it down to the agent module, unless I misunderstand the problem. For (2), this is a terrible direction. It would be better if they were taught to instead deal with a well-defined interface and all connect using the facility provided by the single running cache+dispatch daemon. There seems to be a growing desire to move to a dynamically linked root, but I really think that if the only reason to do that is to accommodate the libc nsdispatch code, then it's probably the libc nsdispatch code which should change, assuming the 'IPC' model solution doesn't have any shortcomings which I've left out above?
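The "go up to the daemon once, come back down once" flow argued for above can be sketched in plain user-space C. This is only an illustration under stated assumptions: daemon_lookup(), agent_lookup(), the fixed-size table and the "value-for-" answers are all hypothetical names and data invented for the sketch, not part of nsdispatch(3) or of any existing FreeBSD daemon, and the socket transport between libc and the daemon is left out entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8	/* arbitrary size chosen for the sketch */
#define KEY_MAX     64
#define VAL_MAX     64

struct cache_entry {
	int  used;
	char key[KEY_MAX];
	char val[VAL_MAX];
};

static struct cache_entry cache[CACHE_SLOTS];
static int next_slot;		/* round-robin replacement index */
static int agent_calls;		/* how often the backend agent was consulted */

/* Hypothetical agent module: stands in for a real backend query. */
static int
agent_lookup(const char *key, char *out, size_t outlen)
{
	agent_calls++;
	snprintf(out, outlen, "value-for-%s", key);
	return (0);
}

/*
 * Daemon-side entry point: answer from the cache when possible,
 * otherwise dispatch to the agent once and remember the answer.
 */
static int
daemon_lookup(const char *key, char *out, size_t outlen)
{
	struct cache_entry *e;
	int i;

	assert(key != NULL && out != NULL);
	if (outlen == 0 || strlen(key) >= KEY_MAX)
		return (-1);

	for (i = 0; i < CACHE_SLOTS; i++) {
		if (cache[i].used && strcmp(cache[i].key, key) == 0) {
			snprintf(out, outlen, "%s", cache[i].val);
			return (0);	/* cache hit: no agent round trip */
		}
	}

	if (agent_lookup(key, out, outlen) != 0)
		return (-1);

	e = &cache[next_slot];
	next_slot = (next_slot + 1) % CACHE_SLOTS;
	e->used = 1;
	snprintf(e->key, sizeof(e->key), "%s", key);
	snprintf(e->val, sizeof(e->val), "%s", out);
	return (0);
}
```

A libc-side stub would call daemon_lookup() (over whatever IPC is chosen) exactly once per request; a repeated lookup of the same key is then served without touching the agent. Cache expiry and negative caching are deliberately omitted.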
Regards, -- Bosko Milekic * bmilekic@technokratis.com * bmilekic@FreeBSD.org TECHNOkRATIS Consulting Services * http://www.technokratis.com/ From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 09:32:16 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 15D4937B401; Sat, 28 Jun 2003 09:32:16 -0700 (PDT) Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 84AD443FB1; Sat, 28 Jun 2003 09:32:15 -0700 (PDT) (envelope-from scottl@freebsd.org) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h5SGVF818914; Sat, 28 Jun 2003 09:31:15 -0700 Received: from freebsd.org (hollin.btc.adaptec.com [10.100.253.56]) by redfish.adaptec.com (8.8.8p2+Sun/8.8.8) with ESMTP id JAA15203; Sat, 28 Jun 2003 09:32:05 -0700 (PDT) Message-ID: <3EFDC2EF.1060807@freebsd.org> Date: Sat, 28 Jun 2003 10:31:43 -0600 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3) Gecko/20030425 X-Accept-Language: en-us, en MIME-Version: 1.0 To: John Baldwin References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: "Justin T. Gibbs" cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 16:32:16 -0000 John Baldwin wrote: > Using this approach is more flexible in case there is a driver > that uses a sx lock or a (not yet implemented) reader-writer lock > or a critical section, or whatever. 
This just means that the > callback uses a wrapper function but that really isn't that hard to > do and there are other cases (callouts in general) that need this. > To me this seems to be adding a special case to the API that won't > work for all situations anyways. I also don't see wrapper functions > as being all that hard. >

Ok, after many semi-private discussions, how about this:

1) New flag, BUS_DMA_INSWI, to signal that the caller is busdma_swi().
2) Remove callback_mtx and replace it with callback2, a function pointer that wraps the callback with driver-dependent locking. This makes things more flexible for alternate locking strategies.
3) Move vm_swi to be INTR_MPSAFE. On every single arch, vm_swi only exists to call busdma_swi(). This should not preclude other tasks from being added to this SWI as long as appropriate locking is done.
4) Have busdma_swi() check whether callback2==NULL. If it is, grab Giant before calling bus_dmamap_load(). If it isn't, call bus_dmamap_load() with callback2 instead of the original callback.
5) bus_dmamap_load() checks BUS_DMA_INSWI==0 before overwriting the callback and callback_args fields of the map. It will blindly call 'callback' and rely on the caller (either the driver or busdma_swi) giving it the right pointer.
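Steps 4 and 5 of the proposal can be modelled in user space to make the control flow concrete. The following is a rough sketch, not kernel code: every identifier prefixed with sim_ or drv_ is hypothetical, the flag value is made up, Giant and the driver lock are reduced to integer flags, and the load always completes immediately instead of being deferred.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_BUS_DMA_INSWI 0x01	/* made-up value standing in for BUS_DMA_INSWI */

typedef void (*sim_callback_t)(void *arg);

/* Hypothetical subset of the dmamap fields named in the proposal. */
struct sim_dmamap {
	sim_callback_t callback;	/* driver's plain callback */
	sim_callback_t callback2;	/* optional wrapper that takes the driver lock */
	void	      *callback_arg;
};

static int giant_held;		/* stands in for Giant */
static int giant_acquires;	/* counts how often Giant was taken */
static int driver_lock_held;	/* stands in for a driver-private lock */

/* Step 5: only record the callback when not called from the SWI. */
static void
sim_dmamap_load(struct sim_dmamap *map, sim_callback_t cb, void *arg,
    int flags)
{
	if ((flags & SIM_BUS_DMA_INSWI) == 0) {
		map->callback = cb;
		map->callback_arg = arg;
	}
	cb(arg);		/* called blindly; the caller chose the pointer */
}

/* Step 4: the SWI picks Giant or the driver's wrapper. */
static void
sim_busdma_swi(struct sim_dmamap *map)
{
	if (map->callback2 == NULL) {
		giant_held = 1;
		giant_acquires++;
		sim_dmamap_load(map, map->callback, map->callback_arg,
		    SIM_BUS_DMA_INSWI);
		giant_held = 0;
	} else {
		sim_dmamap_load(map, map->callback2, map->callback_arg,
		    SIM_BUS_DMA_INSWI);
	}
}

/* Example driver callback: must run with some lock held. */
static void
drv_callback(void *arg)
{
	assert(giant_held || driver_lock_held);
	(*(int *)arg)++;
}

/* Example callback2: wraps the callback in driver-dependent locking. */
static void
drv_callback_locked(void *arg)
{
	driver_lock_held = 1;
	drv_callback(arg);
	driver_lock_held = 0;
}
```

In this model a driver that leaves callback2 NULL gets its deferred callback run under Giant, while a driver that supplies drv_callback_locked never causes Giant to be taken, and the SWI path does not overwrite the callback stored by the original load.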
Scott From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 14:33:26 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BC18C37B401; Sat, 28 Jun 2003 14:33:26 -0700 (PDT) Received: from aslan.scsiguy.com (mail.scsiguy.com [63.229.232.106]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9F50144001; Sat, 28 Jun 2003 14:33:25 -0700 (PDT) (envelope-from gibbs@scsiguy.com) Received: from aslan.scsiguy.com (aslan.scsiguy.com [63.229.232.106]) by aslan.scsiguy.com (8.12.9/8.12.8) with ESMTP id h5SLXPEU046677; Sat, 28 Jun 2003 15:33:25 -0600 (MDT) (envelope-from gibbs@scsiguy.com) Date: Sat, 28 Jun 2003 15:33:25 -0600 From: "Justin T. Gibbs" To: Scott Long , John Baldwin Message-ID: <2768600000.1056836005@aslan.scsiguy.com> In-Reply-To: <3EFDC2EF.1060807@freebsd.org> References: <3EFDC2EF.1060807@freebsd.org> X-Mailer: Mulberry/3.0.3 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 21:33:27 -0000 > Ok, after many semi-private discussions, how about this: There is only one problem with this strategy. The original idea of using a mutex allowed the busdma API to use that same mutex as the strategy for locking the fields of the tag, dmamap, etc. In other words, the agreement would have been that the caller always has the lock held before calling into bus dma, so that bus dma only has to grab additional locks to protect data shared with other clients.
For this to work in the more general scheme, you would have to register "acquire lock"/"release lock" functions in the tag since locking within the callback does not allow for the protection of the tag or dmamap fields in the deferred case (they would only be protected *during* the callback). Again, what we want to achieve is as few lock acquires and releases in the common case as possible. For architectures like x86, the only data structure that needs to be locked for the common case of no deferral and no bounce page allocations is the tag (it will soon hold the S/G list passed to the callback). Other implementations may need to acquire other locks, but using the client's lock still removes one lock acquire and release in each invocation that is not deferred. -- Justin From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 14:54:07 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5363837B401 for ; Sat, 28 Jun 2003 14:54:07 -0700 (PDT) Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11]) by mx1.FreeBSD.org (Postfix) with SMTP id 3169243FF2 for ; Sat, 28 Jun 2003 14:54:05 -0700 (PDT) (envelope-from iedowse@maths.tcd.ie) Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ; 28 Jun 2003 22:54:04 +0100 (BST) To: freebsd-arch@freebsd.org Date: Sat, 28 Jun 2003 22:54:04 +0100 From: Ian Dowse Message-ID: <200306282254.aa83607@salmon.maths.tcd.ie> Subject: Unmounting by filesystem ID X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Jun 2003 21:54:07 -0000 The patch below adds a new mount flag MNT_BYFSID that can be used to unmount a filesystem by specifying its filesystem ID instead of a path. 
The umount utility is changed to use this mechanism by default. This approach has a number of advantages: - It avoids any lookup operations that could potentially block forever, so filesystems such as NFS can be reliably unmounted even if the server is not responding but looking up the root node would require an RPC (maybe to an underlying filesystem). - The filesystem specification is unambiguous, so umount(8) can be sure that it is unmounting the correct filesystem (more work in umount(8) may be required here). - Detached filesystems can be unmounted. If a filesystem becomes detached from the filesystem hierarchy because the underlying filesystem got unmounted, it does not require a reboot to unmount it. Since unmounting by a path name is now only required for compatibility, in that case unmount() now just does a string comparison to find the correct filesystem. Also, this patch only affects unmounting; a similar approach could be applied to MNT_UPDATE mount operations. I would like to commit this during the next few days. Any comments or suggestions? Ian Index: sys/kern/vfs_mount.c =================================================================== RCS file: /dump/FreeBSD-CVS/src/sys/kern/vfs_mount.c,v retrieving revision 1.108 diff -u -r1.108 vfs_mount.c --- sys/kern/vfs_mount.c 11 Jun 2003 00:56:58 -0000 1.108 +++ sys/kern/vfs_mount.c 28 Jun 2003 21:12:18 -0000 @@ -1224,17 +1224,42 @@ int flags; } */ *uap; { - register struct vnode *vp; + fsid_t fsid; struct mount *mp; - int error; - struct nameidata nd; + char *pathbuf; + int error, id0, id1; - NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_USERSPACE, uap->path, td); - if ((error = namei(&nd)) != 0) + pathbuf = malloc(MNAMELEN, M_TEMP, M_WAITOK); + error = copyinstr(uap->path, pathbuf, MNAMELEN, NULL); + if (error) { + free(pathbuf, M_TEMP); return (error); - vp = nd.ni_vp; - NDFREE(&nd, NDF_ONLY_PNBUF); - mp = vp->v_mount; + } + if (uap->flags & MNT_BYFSID) { + /* Decode the filesystem ID. 
*/ + if (sscanf(pathbuf, "FSID:%d:%d", &id0, &id1) != 2) { + free(pathbuf, M_TEMP); + return (EINVAL); + } + fsid.val[0] = id0; + fsid.val[1] = id1; + + mtx_lock(&mountlist_mtx); + TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) + if (bcmp(&mp->mnt_stat.f_fsid, &fsid, + sizeof(fsid)) == 0) + break; + mtx_unlock(&mountlist_mtx); + } else { + mtx_lock(&mountlist_mtx); + TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) + if (strcmp(mp->mnt_stat.f_mntonname, pathbuf) == 0) + break; + mtx_unlock(&mountlist_mtx); + } + free(pathbuf, M_TEMP); + if (mp == NULL) + return (ENOENT); /* * Only root, or the user that did the original mount is @@ -1242,28 +1267,15 @@ */ if (mp->mnt_cred->cr_uid != td->td_ucred->cr_uid) { error = suser(td); - if (error) { - vput(vp); + if (error) return (error); - } } /* * Don't allow unmounting the root filesystem. */ - if (mp->mnt_flag & MNT_ROOTFS) { - vput(vp); - return (EINVAL); - } - - /* - * Must be the root of the filesystem - */ - if ((vp->v_vflag & VV_ROOT) == 0) { - vput(vp); + if (mp->mnt_flag & MNT_ROOTFS) return (EINVAL); - } - vput(vp); return (dounmount(mp, uap->flags, td)); } Index: sys/sys/mount.h =================================================================== RCS file: /dump/FreeBSD-CVS/src/sys/sys/mount.h,v retrieving revision 1.147 diff -u -r1.147 mount.h --- sys/sys/mount.h 26 Mar 2003 22:15:58 -0000 1.147 +++ sys/sys/mount.h 1 Apr 2003 12:32:39 -0000 @@ -224,12 +224,9 @@ #define MNT_RELOAD 0x00040000 /* reload filesystem data */ #define MNT_FORCE 0x00080000 /* force unmount or readonly change */ #define MNT_SNAPSHOT 0x01000000 /* snapshot the filesystem */ +#define MNT_BYFSID 0x08000000 /* specify filesystem by ID. */ #define MNT_CMDFLAGS (MNT_UPDATE | MNT_DELEXPORT | MNT_RELOAD | \ - MNT_FORCE | MNT_SNAPSHOT) -/* - * Still available - */ -#define MNT_SPARE3 0x08000000 + MNT_FORCE | MNT_SNAPSHOT | MNT_BYFSID) /* * Internal filesystem control flags stored in mnt_kern_flag. 
* Index: sbin/umount/umount.c =================================================================== RCS file: /dump/FreeBSD-CVS/src/sbin/umount/umount.c,v retrieving revision 1.34 diff -u -r1.34 umount.c --- sbin/umount/umount.c 7 Apr 2003 12:56:01 -0000 1.34 +++ sbin/umount/umount.c 28 Jun 2003 21:05:18 -0000 @@ -54,6 +54,7 @@ #include #include +#include #include #include #include @@ -72,10 +73,10 @@ int fflag, vflag; char *nfshost; -void checkmntlist (char *, char **, char **, char **); +struct statfs *checkmntlist (char *, char **); int checkvfsname (const char *, char **); -char *getmntname (const char *, const char *, - mntwhat, char **, dowhat); +struct statfs *getmntentry (const char *, const char *, mntwhat, char **, + dowhat); char *getrealname(char *, char *resolved_path); char **makevfslist (const char *); size_t mntinfo (struct statfs **); @@ -83,7 +84,7 @@ int sacmp (struct sockaddr *, struct sockaddr *); int umountall (char **); int checkname (char *, char **); -int umountfs (char *, char *, char *); +int umountfs (char *, char *, fsid_t *, char *); void usage (void); int xdr_dir (XDR *, char *); @@ -92,8 +93,8 @@ { int all, errs, ch, mntsize, error; char **typelist = NULL, *mntonname, *mntfromname; - char *type, *mntfromnamerev, *mntonnamerev; - struct statfs *mntbuf; + char *type; + struct statfs *mntbuf, *sfsrev; struct addrinfo hints; /* Start disks transferring immediately. 
*/ @@ -166,18 +167,16 @@ */ mntonname = mntbuf[mntsize].f_mntonname; mntfromname = mntbuf[mntsize].f_mntfromname; - mntonnamerev = getmntname(getmntname(mntonname, - NULL, MNTFROM, &type, NAME), NULL, - MNTON, &type, NAME); - mntfromnamerev = getmntname(mntonnamerev, - NULL, MNTFROM, &type, NAME); + sfsrev = getmntentry(mntonname, NULL, MNTON, &type, + NAME); - if (strcmp(mntonnamerev, mntonname) == 0 && - strcmp(mntfromnamerev, mntfromname ) != 0) + if (!fflag && bcmp(&sfsrev->f_fsid, + &mntbuf[mntsize].f_fsid, sizeof(fsid_t)) != 0) { warnx("cannot umount %s, %s\n " "is mounted there, umount it first", - mntonname, mntfromnamerev); + mntonname, sfsrev->f_mntfromname); + } if (checkname(mntbuf[mntsize].f_mntonname, typelist) != 0) @@ -196,7 +195,7 @@ errs = 1; break; } - (void)getmntname(NULL, NULL, NOTHING, NULL, FREE); + (void)getmntentry(NULL, NULL, NOTHING, NULL, FREE); exit(errs); } @@ -258,29 +257,29 @@ { size_t len; int speclen; - char *mntonname, *mntfromname; - char *mntfromnamerev; char *resolved, realname[MAXPATHLEN]; char *type, *hostp, *delimp, *origname; + struct statfs *sfs, *sfsrev; len = 0; - mntfromname = mntonname = delimp = hostp = NULL; + delimp = hostp = NULL; + sfs = NULL; /* * 1. Check if the name exists in the mounttable. */ - (void)checkmntlist(name, &mntfromname, &mntonname, &type); + sfs = checkmntlist(name, &type); /* * 2. Remove trailing slashes if there are any. After that * we look up the name in the mounttable again. */ - if (mntfromname == NULL && mntonname == NULL) { + if (sfs == NULL) { speclen = strlen(name); for (speclen = strlen(name); speclen > 1 && name[speclen - 1] == '/'; speclen--) name[speclen - 1] = '\0'; - (void)checkmntlist(name, &mntfromname, &mntonname, &type); + sfs = checkmntlist(name, &type); resolved = name; /* Save off original name in origname */ if ((origname = strdup(name)) == NULL) @@ -290,7 +289,7 @@ * has been used and translate it to the ':' syntax. * Look up the name in the mounttable again. 
*/ - if (mntfromname == NULL && mntonname == NULL) { + if (sfs == NULL) { if ((delimp = strrchr(name, '@')) != NULL) { hostp = delimp + 1; if (*hostp != '\0') { @@ -313,8 +312,7 @@ speclen--) name[speclen - 1] = '\0'; name[len + speclen + 1] = '\0'; - (void)checkmntlist(name, &mntfromname, - &mntonname, &type); + sfs = checkmntlist(name, &type); resolved = name; } /* @@ -325,11 +323,10 @@ * basedir of mountpoint and add the dirname again. * Check the name in mounttable one last time. */ - if (mntfromname == NULL && mntonname == NULL) { + if (sfs == NULL) { (void)strcpy(name, origname); if ((getrealname(name, realname)) != NULL) { - (void)checkmntlist(realname, - &mntfromname, &mntonname, &type); + sfs = checkmntlist(realname, &type); resolved = realname; } /* @@ -343,9 +340,9 @@ * fstat structure get's more reliable, * but at the moment we cannot thrust it. */ - if (mntfromname == NULL && mntonname == NULL) { + if (sfs == NULL) { (void)strcpy(name, origname); - if (umountfs(NULL, origname, + if (umountfs(NULL, origname, NULL, "none") == 0) {; warnx("%s not found in " "mount table, " @@ -370,38 +367,37 @@ * Check if the reverse entrys of the mounttable are really the * same as the normal ones. */ - if ((mntfromnamerev = strdup(getmntname(getmntname(mntfromname, - NULL, MNTON, &type, NAME), NULL, MNTFROM, &type, NAME))) == NULL) - err(1, "strdup"); + sfsrev = getmntentry(sfs->f_mntonname, NULL, MNTON, &type, NAME); /* * Mark the uppermost mount as unmounted. */ - (void)getmntname(mntfromname, mntonname, NOTHING, &type, MARK); + (void)getmntentry(sfs->f_mntfromname, sfs->f_mntonname, NOTHING, &type, + MARK); /* * If several equal mounts are in the mounttable, check the order * and warn the user if necessary. 
*/ - if (strcmp(mntfromnamerev, mntfromname ) != 0 && - strcmp(resolved, mntonname) != 0) { + if (fflag != MNT_FORCE && sfsrev != sfs) { warnx("cannot umount %s, %s\n " "is mounted there, umount it first", - mntonname, mntfromnamerev); + sfs->f_mntonname, sfsrev->f_mntfromname); - /* call getmntname again to set mntcheck[i] to 0 */ - (void)getmntname(mntfromname, mntonname, + /* call getmntentry again to set mntcheck[i] to 0 */ + (void)getmntentry(sfs->f_mntfromname, sfs->f_mntonname, NOTHING, &type, UNMARK); return (1); } - free(mntfromnamerev); - return (umountfs(mntfromname, mntonname, type)); + return (umountfs(sfs->f_mntfromname, sfs->f_mntonname, &sfs->f_fsid, + type)); } /* * NFS stuff and unmount(2) call */ int -umountfs(char *mntfromname, char *mntonname, char *type) +umountfs(char *mntfromname, char *mntonname, fsid_t *fsid, char *type) { + char fsidbuf[64]; enum clnt_stat clnt_stat; struct timeval try; struct addrinfo *ai, hints; @@ -439,14 +435,27 @@ * A non-NULL return means that this is the last * mount from mntfromname that is still mounted. */ - if (getmntname(mntfromname, NULL, NOTHING, &type, COUNT) - != NULL) + if (getmntentry(mntfromname, NULL, NOTHING, &type, COUNT) + != NULL) do_rpc = 1; } if (!namematch(ai)) return (1); - if (unmount(mntonname, fflag) != 0 ) { + /* First try to unmount using the specified filesystem ID. */ + if (fsid != NULL) { + snprintf(fsidbuf, sizeof(fsidbuf), "FSID:%d:%d", fsid->val[0], + fsid->val[1]); + if (unmount(fsidbuf, fflag | MNT_BYFSID) != 0) { + warn("unmount of %s failed", mntonname); + if (errno != ENOENT) + return (1); + /* Compatability for old kernels. 
*/ + warnx("retrying using path instead of filesystem ID"); + fsid = NULL; + } + } + if (fsid == NULL && unmount(mntonname, fflag) != 0) { warn("unmount of %s failed", mntonname); return (1); } @@ -490,9 +499,9 @@ return (0); } -char * -getmntname(const char *fromname, const char *onname, - mntwhat what, char **type, dowhat mark) +struct statfs * +getmntentry(const char *fromname, const char *onname, mntwhat what, + char **type, dowhat mark) { static struct statfs *mntbuf; static size_t mntsize = 0; @@ -523,21 +532,15 @@ case NAME: /* Return only the specific name */ for (i = mntsize - 1; i >= 0; i--) { - if (fromname != NULL && what == MNTON && - !strcmp(mntbuf[i].f_mntfromname, fromname) && - mntcheck[i] != 1) { + if (fromname != NULL && !strcmp((what == MNTFROM) ? + mntbuf[i].f_mntfromname : mntbuf[i].f_mntonname, + fromname) && mntcheck[i] != 1) { if (type) *type = mntbuf[i].f_fstypename; - return (mntbuf[i].f_mntonname); - } - if (fromname != NULL && what == MNTFROM && - !strcmp(mntbuf[i].f_mntonname, fromname) && - mntcheck[i] != 1) { - if (type) - *type = mntbuf[i].f_fstypename; - return (mntbuf[i].f_mntfromname); + return (&mntbuf[i]); } } + return (NULL); case MARK: /* Mark current mount with '1' and return name */ @@ -546,7 +549,7 @@ (strcmp(mntbuf[i].f_mntonname, onname) == 0) && (strcmp(mntbuf[i].f_mntfromname, fromname) == 0)) { mntcheck[i] = 1; - return (mntbuf[i].f_mntonname); + return (&mntbuf[i]); } } return (NULL); @@ -557,7 +560,7 @@ (strcmp(mntbuf[i].f_mntonname, onname) == 0) && (strcmp(mntbuf[i].f_mntfromname, fromname) == 0)) { mntcheck[i] = 0; - return (mntbuf[i].f_mntonname); + return (&mntbuf[i]); } } return (NULL); @@ -582,7 +585,7 @@ } } if (count <= 1) - return (mntbuf[i].f_mntonname); + return (&mntbuf[i]); else return (NULL); case FREE: @@ -646,17 +649,15 @@ return (0); } -void -checkmntlist(char *name, char **fromname, char **onname, char **type) +struct statfs * +checkmntlist(char *name, char **type) { + struct statfs *sfs; - 
*fromname = getmntname(name, NULL, MNTFROM, type, NAME); - if (*fromname == NULL) { - *onname = getmntname(name, NULL, MNTON, type, NAME); - if (*onname != NULL) - *fromname = name; - } else - *onname = name; + sfs = getmntentry(name, NULL, MNTON, type, NAME); + if (sfs == NULL) + sfs = getmntentry(name, NULL, MNTFROM, type, NAME); + return (sfs); } size_t From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 17:27:27 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2D62C37B401 for ; Sat, 28 Jun 2003 17:27:27 -0700 (PDT) Received: from obsecurity.dyndns.org (adsl-64-169-104-32.dsl.lsan03.pacbell.net [64.169.104.32]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4C18244035 for ; Sat, 28 Jun 2003 17:27:26 -0700 (PDT) (envelope-from kris@obsecurity.org) Received: from rot13.obsecurity.org (rot13.obsecurity.org [10.0.0.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id 15ED666BE5; Sat, 28 Jun 2003 17:27:25 -0700 (PDT) Received: by rot13.obsecurity.org (Postfix, from userid 1000) id E8B64B1F; Sat, 28 Jun 2003 17:27:24 -0700 (PDT) Date: Sat, 28 Jun 2003 17:27:24 -0700 From: Kris Kennaway To: Ian Dowse Message-ID: <20030629002724.GA61860@rot13.obsecurity.org> References: <200306282254.aa83607@salmon.maths.tcd.ie> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="UugvWAfsgieZRqgk" Content-Disposition: inline In-Reply-To: <200306282254.aa83607@salmon.maths.tcd.ie> User-Agent: Mutt/1.4.1i cc: freebsd-arch@freebsd.org Subject: Re: Unmounting by filesystem ID X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Jun 2003 00:27:27 -0000 --UugvWAfsgieZRqgk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline 
Content-Transfer-Encoding: quoted-printable

On Sat, Jun 28, 2003 at 10:54:04PM +0100, Ian Dowse wrote:
>
> The patch below adds a new mount flag MNT_BYFSID that can be used
> to unmount a filesystem by specifying its filesystem ID instead of
> a path. The umount utility is changed to use this mechanism by
> default. This approach has a number of advantages:
>
> - It avoids any lookup operations that could potentially block
> forever, so filesystems such as NFS can be reliably unmounted
> even if the server is not responding but looking up the root node
> would require an RPC (maybe to an underlying filesystem).
> - The filesystem specification is unambiguous, so umount(8) can
> be sure that it is unmounting the correct filesystem (more
> work in umount(8) may be required here).
> - Detached filesystems can be unmounted. If a filesystem becomes
> detached from the filesystem hierarchy because the underlying
> filesystem got unmounted, it does not require a reboot to unmount
> it.
>
> Since unmounting by a path name is now only required for compatibility,
> in that case unmount() now just does a string comparison to find
> the correct filesystem. Also, this patch only affects unmounting;
> a similar approach could be applied to MNT_UPDATE mount operations.
>
> I would like to commit this during the next few days. Any comments
> or suggestions?

The approach sounds great to me; these are long-standing annoyances in FreeBSD.
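For readers following the patch, the "FSID:%d:%d" string that umount(8) builds and unmount(2) parses can be tried out in isolation. The sketch below is a userland approximation only: `struct fsid_model`, `fsid_encode()` and `fsid_decode()` are hypothetical names, and the struct merely assumes that a filesystem ID is a pair of 32-bit integers as the patch's use of `val[0]` and `val[1]` suggests.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed shape of a filesystem ID: two 32-bit values. */
struct fsid_model {
	int32_t val[2];
};

/*
 * Build the string handed to unmount() together with MNT_BYFSID.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
fsid_encode(const struct fsid_model *fsid, char *buf, size_t len)
{
	int n;

	assert(fsid != NULL && buf != NULL);
	n = snprintf(buf, len, "FSID:%d:%d", (int)fsid->val[0],
	    (int)fsid->val[1]);
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}

/*
 * Parse the string back, as the kernel side of the patch does with
 * sscanf(). Returns 0 on success, -1 where the patch returns EINVAL.
 */
static int
fsid_decode(const char *s, struct fsid_model *fsid)
{
	int id0, id1;

	if (sscanf(s, "FSID:%d:%d", &id0, &id1) != 2)
		return (-1);
	fsid->val[0] = id0;
	fsid->val[1] = id1;
	return (0);
}
```

An ordinary mount path such as "/mnt" fails the decode step, which is why the flag, not the string contents, decides how the kernel interprets the argument.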
Kris --UugvWAfsgieZRqgk Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (FreeBSD) iD8DBQE+/jJsWry0BWjoQKURAtgxAKCOYWvNJVyogbJL8zlVna2NvzMeugCg/ac3 iVsxySX53MlvGDdoTuTu4Uc= =inDX -----END PGP SIGNATURE----- --UugvWAfsgieZRqgk-- From owner-freebsd-arch@FreeBSD.ORG Sat Jun 28 19:03:06 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BDFC237B401; Sat, 28 Jun 2003 19:03:06 -0700 (PDT) Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2D77744005; Sat, 28 Jun 2003 19:03:06 -0700 (PDT) (envelope-from scottl@freebsd.org) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h5T228811793; Sat, 28 Jun 2003 19:02:08 -0700 Received: from freebsd.org (hollin.btc.adaptec.com [10.100.253.56]) by redfish.adaptec.com (8.8.8p2+Sun/8.8.8) with ESMTP id TAA27563; Sat, 28 Jun 2003 19:03:02 -0700 (PDT) Message-ID: <3EFE48E8.1040700@freebsd.org> Date: Sat, 28 Jun 2003 20:03:20 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.3.1) Gecko/20030425 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Justin T. Gibbs" References: <3EFDC2EF.1060807@freebsd.org> <2768600000.1056836005@aslan.scsiguy.com> In-Reply-To: <2768600000.1056836005@aslan.scsiguy.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-arch@freebsd.org Subject: Re: API change for bus_dma X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Jun 2003 02:03:07 -0000 Justin T. 
Gibbs wrote:
>>Ok, after many semi-private discussions, how about this:
>
>
> There is only one problem with this strategy. The original idea
> of using a mutex allowed the busdma API to use that same mutex as
> the strategy for locking the fields of the tag, dmamap, etc. In
> other words, the agreement would have been that the caller always
> has the lock held before calling into bus dma, so that bus dma
> only has to grab additional locks to protect data shared with
> other clients. For this to work in the more general scheme, you
> would have to register "acquire lock"/"release lock" functions in
> the tag since locking within the callback does not allow for the
> protection of the tag or dmamap fields in the deferred case (they
> would only be protected *during* the callback).
>
> Again, what we want to achieve is as few lock acquires and releases
> in the common case as possible. For architectures like x86, the only
> data structure that needs to be locked for the common case of no deferral
> and no bounce page allocations is the tag (it will soon hold the S/G list
> passed to the callback). Other implementations may need to acquire other
> locks, but using the client's lock still removes one lock acquire and
> release in each invocation that is not deferred.
>
> --
> Justin
>
>

This is becoming wonderfully complex. What is the purpose of storing the S/G list in the tag? Are we going to enforce a 1:1 relationship between tags and maps? That would really suck for the aac(4) driver.

Scott
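To make the "acquire lock"/"release lock" idea in the quoted text more concrete, here is a rough userland sketch. It is not the busdma API: `struct dma_tag_model`, the `dma_lockfunc_t` type, the `nsegs` field standing in for tag state such as the S/G list, and the toy client lock are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void dma_lockfunc_t(void *lockarg);

/* Hypothetical tag: the client registers how to take and drop its lock. */
struct dma_tag_model {
	dma_lockfunc_t *acquire;
	dma_lockfunc_t *release;
	void *lockarg;
	int nsegs;              /* stands in for tag state (e.g. the S/G list) */
};

/*
 * Non-deferred path: the agreement is that the caller already holds its
 * lock, so the tag is touched without any extra acquire/release.
 */
static void
load_immediate(struct dma_tag_model *tag, int nsegs)
{
	tag->nsegs = nsegs;
}

/*
 * Deferred path: nobody holds the client's lock on entry, so the
 * registered functions protect the tag for the whole update, not only
 * during the callback.
 */
static void
load_deferred(struct dma_tag_model *tag, int nsegs)
{
	assert(tag->acquire != NULL && tag->release != NULL);
	tag->acquire(tag->lockarg);
	tag->nsegs = nsegs;
	tag->release(tag->lockarg);
}

/* A toy client lock that counts acquisitions. */
struct client_lock_model {
	int held;
	int acquires;
};

static void
client_acquire(void *arg)
{
	struct client_lock_model *l = arg;

	assert(!l->held);
	l->held = 1;
	l->acquires++;
}

static void
client_release(void *arg)
{
	struct client_lock_model *l = arg;

	assert(l->held);
	l->held = 0;
}
```

The point of the split is visible in the counts: a non-deferred load costs only the acquire the client was doing anyway, while a deferred load adds exactly one acquire and release through the registered functions.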