Date:      Thu, 26 Oct 1995 13:58:06 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        bde@zeta.org.au (Bruce Evans)
Cc:        bde@zeta.org.au, terry@lambert.org, CVS-commiters@freefall.freebsd.org, bde@freefall.freebsd.org, cvs-sys@freefall.freebsd.org, hackers@freebsd.org, swallace@ece.uci.edu
Subject:   Re: SYSCALL IDEAS [Was: cvs commit: src/sys/kern sysv_msg.c sysv_sem.c sysv_shm.c]
Message-ID:  <199510262058.NAA21619@phaeton.artisoft.com>
In-Reply-To: <199510260520.PAA07866@godzilla.zeta.org.au> from "Bruce Evans" at Oct 26, 95 03:20:48 pm

> >> Which one?  In NetBSD it's register_t, which may be longer than an
> >> int.  This causes problems.
> 
> >Yeah.  They ignored the definition of "int".  That's a problem.
> 
> "int" is machine-dependent.  On 68000's you would have to support
> some user compilers using 16 bit ints (for speed) and others using
> 32 bit ints (for easy porting).  We're close to having the same
> problems with 32 vs 64 bit ints.

Well, I'd argue that "max size for a one bus cycle transfer" is an int,
so the Lattice/etc. 32 bit 68k int is wrong and the Manx Aztec C 16
bit int is correct.

The problem in this case isn't sizeof(int) > sizeof(long), but that
sizeof(int) < sizeof(long) or sizeof(int) < sizeof(void *).

> >The real problem is the lack of atomic sized types and the use of "short"
> >as a synonym for "16 bit value", "long" for "32 bit value" and "quad"
> >for "64 bit value".
> 
> NetBSD has fixed this.  It uses the typedefs in <machine/types.h>
> (int16_t, int32_t, int64_t) a lot.

Not with 64 bit ints they don't.  A 64 bit int means a 64 bit (or
greater) long, and either a 16 or 32 bit short.  You lose access to
either 16 or 32 bit atomic types, period.  That's what's broken.

> >The real screwup is when int goes greater than 32 bits: the standard
> >*stupidly* requires long to go up as well, because it can't conceive of
> >a maximally sized type named anything other than "long".
> 
> This is fundamental.  longs are at least as large as ints.

This is fundamental to the definition of a long as a non-deterministically
sized type.  This is in no other way "fundamental".

> >I think using registers for calls and inlining are antithetical.  You
> 
> Calls to inline functions don't use the standard calling convention.

Calls to system calls *must* use *some* calling convention agreed upon
in advance and invariant under optimization; otherwise the call is
indeterminate under optimization.  This is obvious.

> >> This may be true if you control the ABI.
> 
> >You *do* control the ABI.  You are either running a known ABI paradigm
> >(ie: int push, sizeof(int) == sizeof(register_t), etc.), or you are
> >running a compatibility ABI, in which case you know at binary load time
> 
> You know it but you don't control it.

Excuse me?  You are attempting to assert exactly that control on the
Intel ABI and you are arguing that it can't be done?

> >and can force alternate argument call gating without much trouble (but
> >some runtime overhead: to be expected in any case of non-native code
> >execution anyway).
> 
> I've thought of using alternative gates to stop the compatibility
> interface (for other OS's) from slowing down the standard interface.
> We plan to use trap gates instead of call gates for standard syscalls.
> We already support int 0x80 for Linux and NetBSD uses int 0x80 for
> its native syscalls.  If everything used int 0x80, then decoding would
> be expensive.  The expense can be pushed to image activation time using
> alternative TSS's and gates.  We might end up with the following:
> 
> 	int 0x80 (trap gate) for native syscalls
> 	int 0x80 (trap gate) for NetBSD syscalls
> 	int 0x80 (trap gate) for Linux syscalls
> 	lcall(7, 0) (call gate) for ibcs2 compatibility and slowness
> 	...

This is the way to do things; my issue was with making the proposed
modified calling convention the default calling convention.


> Er, you have this exactly backwards.  My wrappers provide part of
> what is required to handle nontrivial conversions from a 16 bit
> ABI to a 32 bit one.

If I produce new code for an emulated environment, I'm *not* going
to be doing so in a native environment.  The code production will
be in the cross environment as well.

Wrappering is a non-issue for conversions.  Wrappering *is* an issue
in code complexity.  I believe complexity should be reduced wherever
possible -- and this is not done by imposing ever more rigid standards
on the programmer, it's done by making the code environment more
flexible.


What wall time differentials do you expect from the conversion, and
why?


> 
> >Structure casts are not a portability problem (as below).
> 
> >> >The answer is to compile with the packing being the default for data
> >> >matchup -- in the alpha case, 64 bits.  For devices that need it, use
> >> >packing #pragma's to tighten them up on a case by case basis, in the
> >> >header file defining the structure.
> >> 
> >> That won't help much.  The problem is that syscall args don't form
> >> a struct.  Even if they are in memory, the packing may be different.
> 
> >This is not allowed to be the case.  Either the nature of the push
> >must be changed to ensure a register_t sized push, or the structure
> 
> Yeah, right.  Change the Xenix 286 ABI to push 386 register_t's.

That's an emulated environment, and the stack decoding in the kernel
must be based on knowledge of the source address.

Consider for example a PPC port of FreeBSD capable of running Intel
FreeBSD binaries.  The capability and the endianness issues must be
handled in the ABI call emulation layers.  The code will execute
in an emulated 386 environment, but all kernel support will be
native.

This is on the order of an XDR interface that is contextualized for
the native byte/word order of the machine instead of network byte
order.

The point is that the call conversion is layered separately from the
call implementation.

Currently, there are a number of internal implementations that do not
understand UIO_SYSSPACE as a copy source/destination for this type of
support, but the abstraction of the ABI interface from the implementation
interface internal to the kernel itself *must* take this into account.

Going to an architecture-dependent passing mechanism is a mistake; it
greatly complicates the internal interfaces for ABI translation, creating
the need for an intermediate layering for native calls.

> >packing must be specifiable at declaration time, or (preferably) both.
> 
> Packing can't be specified in C.  That's why my machine generated
> wrappers are required - to provide portability.  It isn't worth
> supporting both because if you support the worst case (weird packing)
> then it just takes more code to support the array case.

#pragma.  If we are talking about compiler modifications, packing #pragma's
are much more generally useful than register call wrappings and the
internal code you have suggested for varargs.

What is your wrapper for open(const char *, int, ...)?


> My ideas for syscalls are based on what is done for vnodes :-).  It
> is possible to do better for syscalls (remove the uap's from the non-
> machine-generated code) because stacking isn't required.  Stacking
> seems to require passing around uap's because one layer might modify
> *uap.  This is too hard to do with call-by-value function call args.

What about non-native architecture ABI support?

> >> The ABI is a convention, and can't be changed.
> 
> >The ABI is an agreement between user and kernel space, and is abstracted
> >from implementation by (1) binary file identification, (2) execution
> >class, and (3) call gate.
> 
> >That means we can vary it without changing the underlying implementation,
> >with the cost being restricted to the abstraction layering (already in
> >place as additional overhead anyway for ABI class emulation) and
> >additional overhead for non-native ABI's.
> 
> You can't vary Xenix 286's syscall parameter passing conventions!

What have I said that implied this?  The abstraction layering, since
it is *in the kernel* in the class-call handling code, would be
transparent.

The point is to have a native ABI that matches the kernel exported ABI
to reduce overhead for native apps.

By going to a register passing mechanism, you destroy this, and complicate
the non-native ABI's at the same time... don't you see this as well?


> >I wouldn't have a problem with an alternate execution class, and potentially
> >trap gate, to cause there to be a "very fast" calling mechanism that is
> >there *as*an*alternative*to*the*default* "portable" calling mechanism.
> >But mandating that the default be such that the majority of code on the
> >net would require patching to make it run (admittedly, mostly header
> >files) is *bogus*.
> 
> The default should be fast.  Since the convention is enforced by mostly
> machine generated glue in /usr/src/lib/libc/i386, the C convention is
> irrelevant except for its impact on the complexity of the glue.

And the glue in the underlying ABI implementations on the other side of
the user/kernel barrier.


> >I object to needing to include standard system call prototypes to allow
> >their use.  I put up with lseek/truncate/ftruncate/mmap BS because it's
> 
> It isn't required.  However, passing of args that have the correct type
> (after the default promotions) is required.  The second arg to lseek
> must be off_t, not long, except of course if off_t is long.

??? If it's not required, then either you *aren't* talking about passing
arguments to system calls in registers *or* you expect the compiler to
"do the right thing" magically.

Right now I can write code that makes system calls in assembly.  With
register passing conventions, I will need to either use library or
machine generated external routine calls (unacceptable), or I will have
to have documentation of what the compiler has done so that I can do
the same thing (non-portable as hell -- ever use M4 to write machine
independent assembly code?).


> >Note that prototypes of system calls screw up your ability to properly
> >utilize the syscall(2) call gate mechanism the way it was intended.  A
> 
> The use of syscall() in general requires handling all the messy conversion
> issues that we have been discussing in your own code.

Why is that?  A push is a push.  A trap is a trap.

The only "messy conversion" is in pushing the arguments on the stack,
and I still fail to see the benefit of playing "keep up with Linux"
in this regard.

> >user space checkpointing mechanism that defines its own open interface
> >and calls the real open call via syscall (to remember the file names)
> >will fail to compile when the inlined open conflicts with the user
> >definition.
> 
> Yes, it can't work in general :-].  It assumes all machines are vaxes.

This isn't true.  Did you even download the Rutgers University process
checkpoint/migration code?

> >Probably a "correct" approach would be to either (1) push 64 bits per
> >for all arguments, using type identification to decide on pushing a 0
> >for the quad high order dword or pushing a value that really exists at
> >the time of the call, or
> 
> This would be slow and is beside the point.  It's easy to implement a
> good ABI for a particular machine when you can design it.  We'll have just
> one chance to redesign the ABI when we switch to int 0x80 syscalls.

I don't like it as an option either.  I would much rather revert the
type of the offset in the standard to the stack alignment type, as it
was pre-quad, and define additional quad-knowledgable functions to
deal with the quad values which the file system and VM can't currently
support anyway.

> (2) choose a different violation of the ANSI C
> >and POSIX standards -- interface extension -- instead of passing quad's
> 
> POSIX allows most reasonable extensions.

Yeah, well using quad for off_t isn't one of them.

> >for the default truncate/seek/mmap call interfaces.  The additional
> 
> It even allows nonstandard syscalls such as truncate/mmap :-).
> 
> >without prototypes (qtruncate/qftruncate/qseek/qlseek/qmmap).  And
> >you wouldn't even implement qmmap until the vmio file windowing is fixed
> >in the kernel VM system.
> 
> Linux has llseek.  That way leads to many ifdefs.

It leads to:

#ifdef HAS_QUADS
#ifndef INT_IS_64
#if BSD44
#define	lseek(filedes,offset,whence)	qlseek(filedes,offset,whence)
#define	off_t				quad_t
#endif	/* BSD44*/
#if LINUX
#define	lseek(filedes,offset,whence)	llseek(filedes,offset,whence)
#define	off_t				quad_t
#endif	/* LINUX*/
#endif	/* !INT_IS_64*/
#endif	/* HAS_QUADS*/

in one app-specific header file.


> >In any case, I think the benefits are questionable, and should be
> >explored as an execution class other than the default execution class
> >(ABI) before they are integrated.  Even after they are integrated,
> >portability would dictate that their use be optional.
> 
> This discussion has become sidetracked.  Please restrict further discussion
> to the original point, which is to simplify the apparent interface to
> syscalls without changing the actual interface at all and without reducing
> efficiency significantly.

I believe the wrappering to constitute obfuscation, not simplification.

There is a difference between conceptual simplification (with no
document to reference to recover the concepts) and implementational
simplification.

I'm opposed to additional obfuscation of the call interface to
allow questionable processor specific optimization to take place
at the expense of ABI emulation portability.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


