Date:      Sun, 7 Jun 1998 19:44:26 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        wjw@surf.IAEhv.nl (Willem Jan  Withagen)
Cc:        mike@smith.net.au, current@FreeBSD.ORG
Subject:   Re: Variant Link implementation (Was: Re: lorder problem: ....... )
Message-ID:  <199806071944.MAA24713@usr06.primenet.com>
In-Reply-To: <199806071230.OAA28150@surf.IAEhv.nl> from "Willem Jan  Withagen" at Jun 7, 98 02:30:10 pm

> >Ok.  I did actually look at how this could be implemented last time it 
> >came up, and I think it would be *reasonably* straightforward.
> 
> Can you come up with that scenario? I might be tempted to take a look at it
> and possibly try to implement it. But I need to make some time estimates
> if I ever will be able to finish it.


The problem is that the envp can be replaced, and there is no simple
way to access it.  Try the following code:

-- spamchild.c -----------------------------------------------------------
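#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
        /* Adding a variable makes libc reallocate environ. */
        setenv("SPAM", "the luncheon meat", 0);
        if (fork())
                exit(0);        /* parent exits; the child lingers for ps */
        printf("SPAMchild(%d): hanging out\n", (int)getpid());
        sleep(10000);
        return 0;
}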
--------------------------------------------------------------------------

then do a "ps -gaxewww | grep SPAM"; you won't find the new environment
variable in the child process environment.  This is because the environ
is reallocated, and moved somewhere the kernel can't find it.  The
original environment hangs out, taking up space.
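
The relocation can also be observed from inside the process.  This is a
minimal sketch, assuming a libc that grows the environment by allocating
a new array (common, but not guaranteed); the helper name
environ_moved_after_setenv is made up for illustration:

```c
#include <stdint.h>
#include <stdlib.h>

extern char **environ;

/*
 * Hypothetical helper: set a variable and report whether libc moved
 * the environ array while doing so.
 * Returns 1 if environ moved, 0 if it stayed put, -1 if setenv failed.
 */
int
environ_moved_after_setenv(const char *name, const char *value)
{
        /* Remember where the array lived before the call. */
        uintptr_t before = (uintptr_t)environ;

        if (setenv(name, value, 1) != 0)
                return -1;
        return (uintptr_t)environ != before;
}
```

Whether the first call for a new name reports a move depends on the libc;
the point is only that nothing outside the process is told when it does.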



A (sort of) fix would be to make a kernel call in crt0 to push the
address of the envp to the kernel so that it is known.  Then if the
contents of the address were modified, the kernel would be able to
identify the new environment.  This has several problems:

1)	Legacy applications would fail to act as expected.  This is
	the number one issue, if the impetus for the change is the
	variance between a.out and ELF "universes".  They fail
	because they have not been recompiled.

2)	If one process is traversing the environment of another, it
	would have to do so with a kernel interface.  This is an
	inevitable result of making the interface available at all;
	the variant symlink evaluation is such that the kernel must
	do the equivalent.  Now consider the case of a process that
	points its envp at an invalid address.

3)	ABI applications are compiled against their native crt0,
	not the new one.  All ABI applications, even if recompiled,
	will not have the necessary functionality.
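
The crt0 idea in miniature, modeled entirely in user space.  No such
kernel call exists; fake_kernel_register_envp and fake_kernel_getenv are
hypothetical stand-ins, and a static variable plays the part of the
per-process kernel state.  The sketch only shows why registering the
address of the pointer, rather than its value, survives a reallocation:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Stand-in for per-process kernel state: the address of the envp pointer. */
static char ***registered_envpp;

/* What a modified crt0 would do once at startup (hypothetical call). */
void
fake_kernel_register_envp(char ***envpp)
{
        registered_envpp = envpp;
}

/*
 * What the "kernel" would do later: follow the registered address to
 * whatever array the process uses now and scan it for name.  A real
 * kernel would have to validate every pointer it follows (problem 2
 * above); this sketch does not.
 */
const char *
fake_kernel_getenv(const char *name)
{
        size_t len = strlen(name);
        char **ep;

        if (registered_envpp == NULL || *registered_envpp == NULL)
                return NULL;
        for (ep = *registered_envpp; *ep != NULL; ep++)
                if (strncmp(*ep, name, len) == 0 && (*ep)[len] == '=')
                        return *ep + len + 1;
        return NULL;
}
```

Registering &environ before a setenv() and looking the variable up
afterwards finds the new value.  A binary linked against an old crt0
never makes the registration, which is problems 1 and 3.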


> Syntax is in some ways a cosmetic issue. What we need next to syntax is the
> semantics of the variants. The easy way I once have thought of doing it,
> is in libc (or at least still in userspace). This way semantics are fairly
> easy and the solution was doable. But I never got around to figuring out
> where it would kill things. 

This breaks in the same way that crt0.o breaks (above).


Let me preface this with "I have most of this code completed":

A better solution is to change getenv/setenv/putenv/unsetenv to be
wrappers to a single multiplex system call (you have to have a new
system call either way).  Here is the whole plan:

1)	Add a char **environ to the proc structure.

2)	Make execve(2) parse the envp argument into this structure,
	instead of just making a local copy.

3)	Get rid of "environ" usability for all future applications:
	deprecate it (at worst) or delete it outright (at best).

4)	Make getenv/setenv/putenv/unsetenv call the kernel to manipulate
	the environment.  Basically, this moves some of the libc code
	into the kernel.  I know this is not optimal, but we have
	legacy applications with impossible to identify "&environ",
	and this is the compromise.  If you can think of a better
	one, suggest it.

5)	Make the "${" / "}" delimiters expand out of the environment
	in path expansion.  This is not AFS compatible, but the use
	of a terminal delimiter is necessary.
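
A user-space sketch of the expansion in step 5.  The real thing would run
in the kernel's path lookup against the per-process environment from
step 1; here getenv() stands in for that lookup, the function name
expand_variant_path is hypothetical, and expanding an unset variable to
nothing is an assumption the plan does not settle:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy path into out, replacing each ${NAME} with the value of NAME.
 * Returns 0 on success, -1 if out is too small, a name is too long,
 * or a "${" has no closing "}".
 */
int
expand_variant_path(const char *path, char *out, size_t outsz)
{
        size_t o = 0;

        if (outsz == 0)
                return -1;
        while (*path != '\0') {
                if (path[0] == '$' && path[1] == '{') {
                        const char *end = strchr(path + 2, '}');
                        const char *val;
                        char name[64];
                        size_t n, vlen;

                        if (end == NULL)
                                return -1;      /* no terminal delimiter */
                        n = (size_t)(end - (path + 2));
                        if (n >= sizeof(name))
                                return -1;
                        memcpy(name, path + 2, n);
                        name[n] = '\0';
                        val = getenv(name);
                        vlen = (val != NULL) ? strlen(val) : 0;
                        if (o + vlen >= outsz)
                                return -1;
                        if (vlen > 0)
                                memcpy(out + o, val, vlen);
                        o += vlen;
                        path = end + 1;
                } else {
                        if (o + 1 >= outsz)
                                return -1;
                        out[o++] = *path++;
                }
        }
        out[o] = '\0';
        return 0;
}
```

The closing "}" is what lets a variable sit in the middle of a path
component, which is why the terminal delimiter is necessary.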

In addition to this, the following is necessary to support a solution
to the a.out/ELF variance:

6)	Define a class of "system" variables in the namespace.  These
	are environment variables which may be set by the kernel and
	will override user settings, if present.

7)	My suggestion is to use the "#" as a prefix.  This solves the:

		set > x				# save environment
		...
		...
		. x				# restore environment

	problem, in that it escapes system environment variables into
	comments.

8)	Modify the image activators that deal with ABI format (rather
	than interpreters with "#!" or deencapsulators, like gzip) to
	set the system variable "#abi".
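
One reading of steps 6 through 8, as a user-space sketch.  "Override" is
taken here to mean that a user-level set of a "#" name is refused while
the kernel may write it; that interpretation, the function names, and
the table sizes are all assumptions.  Saved by "set > x", a system
variable would appear as a line starting with "#", which ". x" skips as
a comment.

```c
#include <stddef.h>
#include <string.h>

#define SYSVAR_SLOTS 16

struct sysvar { char name[32]; char value[64]; };

/* Stand-in for one process's kernel-held variable table. */
static struct sysvar sysvar_table[SYSVAR_SLOTS];
static size_t sysvar_count;

/* Insert or overwrite an entry; -1 if it does not fit. */
static int
table_set(const char *name, const char *value)
{
        size_t i;

        if (strlen(name) >= sizeof(sysvar_table[0].name) ||
            strlen(value) >= sizeof(sysvar_table[0].value))
                return -1;
        for (i = 0; i < sysvar_count; i++)
                if (strcmp(sysvar_table[i].name, name) == 0)
                        break;
        if (i == sysvar_count) {
                if (sysvar_count == SYSVAR_SLOTS)
                        return -1;
                strcpy(sysvar_table[i].name, name);
                sysvar_count++;
        }
        strcpy(sysvar_table[i].value, value);
        return 0;
}

/* User-level set: names in the system ("#") class are refused. */
int
user_setenv(const char *name, const char *value)
{
        return (name[0] == '#') ? -1 : table_set(name, value);
}

/* Kernel-level set, e.g. from an image activator: "#" names only. */
int
kernel_setenv(const char *name, const char *value)
{
        return (name[0] != '#') ? -1 : table_set(name, value);
}

/* Look a variable up by its full name, "#" included. */
const char *
var_lookup(const char *name)
{
        size_t i;

        for (i = 0; i < sysvar_count; i++)
                if (strcmp(sysvar_table[i].name, name) == 0)
                        return sysvar_table[i].value;
        return NULL;
}
```

An image activator would then call something like
kernel_setenv("#abi", "elf"); the value "elf" is only illustrative, since
the plan does not say what "#abi" contains.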


Fun things that can also be done using this interface:

9)	Extend the interface arguments.  First of all, add a version
	number as the first argument.  Second, add a PID as the second
	argument; if the pid is 0, reference the current process.

10)	Enforce normal permissions on inter-process access.

11)	You now have solved the problem of modifying a process's
	environment from another process.  8-).

12)	Take another step.  What is the difference between an
	environment and a logical name table?  Add flag bits
	into the high order byte of the version number: LNM_SYS,
	LNM_GRP, LNM_LCL.

13)	Define LNM_LCL as "the current process's char **environ";
	define LNM_GRP as "the current process's process group or
	session leader's char **environ"; define LNM_SYS as "the
	init process's char **environ".

14)	Modify "getenv" to look first in LNM_LCL, then in LNM_GRP,
	then in LNM_SYS for any environment variable.

15)	Modify unsetenv(3) to operate only on LNM_GRP.  The setenv(3)
	call will only write LNM_LCL.

16)	Provide access to an "lnm(3)" interface, with set/get/getnext
	capability.

17)	Set your system wide environment (news server, etc.) in the
	rc.local file.  8-).

18)	Set your session specific stuff elsewhere.  Login.conf is a
	good idea; so is a session manager -- like "xdm".  For the
	"DISPLAY" and "TERM" variables, for example.

> (Dumb question: Does INIT have an environment which we can load?)

See #13, above.  8-).
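
Items 9 and 12 through 15 fit together roughly as follows.  This is a
user-space model only: the bit values, the lnm_call name and argument
order, and the fixed-size tables are invented, three static arrays stand
in for the three environments, and only pid 0 is handled, so the
permission checks of item 10 are not modeled.

```c
#include <stddef.h>
#include <string.h>

/* Invented values: version in the low bits, table flag in the high byte. */
#define LNM_VERSION 1u
#define LNM_LCL 0x01000000u     /* this process */
#define LNM_GRP 0x02000000u     /* process group or session leader */
#define LNM_SYS 0x04000000u     /* init */
#define LNM_SLOTS 8

enum lnm_op { LNM_GET, LNM_SET, LNM_UNSET };

struct lnm_var { int used; char name[32]; char value[64]; };

/* Stand-ins for the three kernel-held environments. */
static struct lnm_var lnm_lcl[LNM_SLOTS], lnm_grp[LNM_SLOTS], lnm_sys[LNM_SLOTS];

/* The single multiplexed entry point; pid 0 means the current process. */
int
lnm_call(unsigned verflags, int pid, enum lnm_op op,
    const char *name, const char *value, const char **result)
{
        struct lnm_var *t, *hit = NULL, *spare = NULL;
        int i;

        if ((verflags & 0x00ffffffu) != LNM_VERSION || pid != 0)
                return -1;      /* other pids are not modeled here */
        switch (verflags & 0xff000000u) {
        case LNM_LCL: t = lnm_lcl; break;
        case LNM_GRP: t = lnm_grp; break;
        case LNM_SYS: t = lnm_sys; break;
        default: return -1;
        }
        for (i = 0; i < LNM_SLOTS; i++) {
                if (t[i].used && strcmp(t[i].name, name) == 0)
                        hit = &t[i];
                else if (!t[i].used && spare == NULL)
                        spare = &t[i];
        }
        switch (op) {
        case LNM_GET:
                if (hit == NULL)
                        return -1;
                *result = hit->value;
                return 0;
        case LNM_SET:
                if (hit == NULL)
                        hit = spare;
                if (hit == NULL || strlen(name) >= sizeof(hit->name) ||
                    strlen(value) >= sizeof(hit->value))
                        return -1;
                strcpy(hit->name, name);
                strcpy(hit->value, value);
                hit->used = 1;
                return 0;
        case LNM_UNSET:
                if (hit != NULL)
                        hit->used = 0;
                return 0;
        }
        return -1;
}

/* getenv: first match wins, searching local, then group, then system. */
const char *
lnm_getenv(const char *name)
{
        static const unsigned order[] = { LNM_LCL, LNM_GRP, LNM_SYS };
        const char *val;
        int i;

        for (i = 0; i < 3; i++)
                if (lnm_call(LNM_VERSION | order[i], 0, LNM_GET, name, NULL, &val) == 0)
                        return val;
        return NULL;
}

/* setenv writes only the local table. */
int
lnm_setenv(const char *name, const char *value)
{
        return lnm_call(LNM_VERSION | LNM_LCL, 0, LNM_SET, name, value, NULL);
}

/* unsetenv operates only on the group table, as item 15 specifies. */
int
lnm_unsetenv(const char *name)
{
        return lnm_call(LNM_VERSION | LNM_GRP, 0, LNM_UNSET, name, NULL, NULL);
}
```

With this lookup order, a value set once in the LNM_SYS table is visible
to every process that has not shadowed it locally or in its group, which
is what items 17 and 18 rely on.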

> Doing it thru sysctl-interrogation has the advantage of the variables
> being global to all processes. It has the disadvantage that it is no
> longer local to one's process-space.

"System" variables *must* be local to the process space.

Consider the Linux ELF binary NetScape loading the FreeBSD a.out
binary helper app on a predominantly FreeBSD ELF system.  8-(.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.



