From owner-freebsd-hackers  Tue Apr  2 16:27:53 2002
Message-ID: <3CAA4C43.1F19A758@mindspring.com>
Date: Tue, 02 Apr 2002 16:26:43 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony}  (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "E.B. Dreger" <eddy+public+spam@noc.everquick.net>
Cc: Alfred Perlstein <bright@mu.org>, hackers@freebsd.org
Subject: Re: dlopen(), ld.so, and library wrappers
References: <Pine.LNX.4.20.0204021914260.21594-100000@www.everquick.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

"E.B. Dreger" wrote:
> So you're saying that:
> 
> 1) Program foo contains function called getpid()
> 2) Program foo links with libc
> 3) foo.getpid uses dlsym(RTLD_NEXT, "getpid") to call libc.getpid
> 4) Shared object bar, loaded by foo, will receive a pointer to
>    foo.getpid instead of libc.getpid when getpid is resolved.
> 
> ???
> 
> If that'll work, great.  But I didn't interpret the manpage that
> way.  I read it as shared object bar being able to provide a
> wrapper for program foo, which is the _opposite_ of what I want.
> 
> I'll try the above.  If that doesn't do what I want, I'll see if
> I can restate more clearly.


I recently wanted to do what you want to do, which is to write
a program that could load shared object modules, and have the
external symbol references of the modules resolve to symbols
defined in the program.

Your example case is more complicated, in that it wants the
symbols in the program to have duplicates in libc.so, and to
resolve the symbols preferentially to the ones defined in the
program.

The answer is: it works as expected.
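
A minimal sketch of step 3 of the quoted list, for a program-level
getpid() that forwards to the next definition in search order.  This
is not code from the thread: the counter wrapper_calls is a
hypothetical name added only to make the wrapper observable, and the
fallback to syscall() when dlsym() finds nothing is an assumption.

```c
#define _GNU_SOURCE	/* glibc needs this for RTLD_NEXT; FreeBSD does not */
#include <sys/types.h>
#include <sys/syscall.h>
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical counter, only here so the wrapper's effect is visible. */
int wrapper_calls = 0;

/*
 * Program-level getpid(): counts the call, then forwards to the next
 * definition in search order (normally libc's), found via RTLD_NEXT.
 */
pid_t
getpid(void)
{
	static pid_t (*next_getpid)(void);

	wrapper_calls++;
#ifdef RTLD_NEXT
	if (next_getpid == NULL)
		next_getpid = (pid_t (*)(void))dlsym(RTLD_NEXT, "getpid");
#endif
	/* RTLD_NEXT must never hand back this wrapper itself. */
	assert(next_getpid != getpid);
	if (next_getpid != NULL)
		return next_getpid();
	/* Assumed fallback: ask the kernel directly if nothing was found. */
	return (pid_t)syscall(SYS_getpid);
}
```

Any caller that binds to this definition goes through the counter
first and still gets the real process ID back.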

Alfred's point about inferior nodes is actually not quite
correct in practice.  In practice, if I have an arbitrary
load order for modules, things can be undefined.  Specifically:


	main:
		#include <sys/syscall.h>
		#include <sys/types.h>
		#include <unistd.h>
		#include <stdio.h>
		pid_t
		getpid( void)
		{
			printf( "main getpid\n");
			return syscall( SYS_getpid);
		}

		int
		main( void)
		{
			/* load modules "fee" and "foo" in any order */

			...

			foomain();
			return 0;
		}
		...

	foo.so:

		void
		foomain( void)
		{
			printf( "pid is %d\n", (int)getpid());
		}
		...

	fee.so:
		#include <sys/syscall.h>
		#include <sys/types.h>
		#include <unistd.h>
		#include <stdio.h>
		pid_t
		getpid(void)
		{
			printf( "fee getpid\n");
			return syscall(SYS_getpid);
		}
		...

Which getpid() foomain() ends up calling depends on whether fee.so
was loaded before or after foo.so was loaded.

The other PITA here is that when you are linking, you can't know
that all your symbols are resolved, and that you haven't forgotten
a shared library on the linkage line, because the linking of
shared objects doesn't treat them as RTLD_NOW at "ld" invocation
time.  This is arguably a linker bug, since it's not reporting
missing symbols at link time.  You can recreate this without shared
objects that you are going to dlopen, simply by creating a shared
library that references symbols in a second shared library (no
static data references!), and then using routines in the first
library in your main.  The linker won't bitch (but should!) when
you link your program only against the first library.  Instead, the
program crashes at runtime when it does RTLD_LAZY evaluation of
shared library symbols... and they aren't there.  Lazy evaluation
would be OK if the initial link time treated symbol graph resolution
as RTLD_NOW; this would take about 350 lines of code in ld.so, the
last time I looked.
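
For modules loaded with dlopen(), a program can at least request
eager binding itself by passing RTLD_NOW, so that a missing symbol
is reported at load time rather than at the first call.  A minimal
sketch, assuming nothing beyond the standard dlopen() interface;
check_module is a hypothetical helper name.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Try to load `path` with every symbol reference bound immediately.
 * Returns 0 if the object loads with all references resolved,
 * -1 otherwise (the reason is printed to stderr).
 */
int
check_module(const char *path)
{
	void *handle;

	assert(path != NULL);
	handle = dlopen(path, RTLD_NOW);
	if (handle == NULL) {
		const char *err = dlerror();

		fprintf(stderr, "%s: %s\n", path, err ? err : "unknown error");
		return -1;
	}
	dlclose(handle);
	return 0;
}
```

This only covers objects the program opens explicitly; it does
nothing for libraries named on the link line.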


Also, you can't intentionally override functions in a shared library
referenced from main() with symbols in a shared object, without
doing explicit work to do the override.

Also, the dlopen( NULL, mode); doesn't work the way you would
expect (or want) with the RTLD_GLOBAL and RTLD_LAZY flags.
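
The message does not spell out the mismatch, so the following is
only a probe for experimenting, not a demonstration of the problem.
dlopen( NULL, mode) returns a handle for the main program, and
dlsym() on that handle shows which symbols are visible through it.
program_has_symbol is a hypothetical helper name.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if `name` is visible through the main program's handle, else 0. */
int
program_has_symbol(const char *name)
{
	void *self;
	void *addr;

	assert(name != NULL);
	self = dlopen(NULL, RTLD_LAZY | RTLD_GLOBAL);
	if (self == NULL)
		return 0;
	addr = dlsym(self, name);
	dlclose(self);
	return addr != NULL;
}
```

Note that symbols defined in the executable itself are found this
way only if they are exported in its dynamic symbol table (for
example by linking with -export-dynamic); symbols from the shared
libraries it was linked against are visible regardless.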

A common way of addressing ordering issues is to dereference
your function pointers out of a struct of function pointers,
which you obtain by asking a module entry point for the function
pointers.
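
The struct-of-function-pointers approach can be sketched as follows.
All names here (struct module_ops, module_get_ops, the fee_*
functions, call_module_getpid) are hypothetical; in a real setup the
module half would live in the shared object and the host would find
module_get_ops with dlsym() on the handle returned by dlopen().

```c
#include <sys/types.h>
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Table of entry points that a module hands to its host. */
struct module_ops {
	const char *(*name)(void);
	pid_t (*getpid)(void);
};

/* --- module side (would live in fee.so) --- */
static const char *
fee_name(void)
{
	return "fee";
}

static pid_t
fee_getpid(void)
{
	return getpid();
}

static const struct module_ops fee_ops = { fee_name, fee_getpid };

/* The one well-known symbol the host has to look up by name. */
const struct module_ops *
module_get_ops(void)
{
	return &fee_ops;
}

/* --- host side --- */
pid_t
call_module_getpid(const struct module_ops *ops)
{
	assert(ops != NULL && ops->getpid != NULL);
	/* Explicit call through the table: load order no longer matters. */
	return ops->getpid();
}
```

Because every call names the table it goes through, the host picks
the implementation itself instead of leaving the choice to symbol
search order.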

In Microsoft-land, this is called an "interface", but an "interface"
is just a pure virtual base class with a default protection type of
"public", so you can just "#define interface struct", and be there.

Basically, the module passes back a pointer to an implementation
class for the pure virtual base class (this entry point in
Microsoft-land is called "IUnknown"; it's a basic extension to
the OLE mechanism, which defines some additional well known
entry points, and which is a subset of COM, which defines entry
points like "process attach/detach" and "thread attach/detach").

Actually, FreeBSD could use some of these, particularly for
dealing with threading on things like the UMICH and Netscape
LDAP libraries, which are not intrinsically thread-safe (the
thread-attach/detach permits you to insert a serialization
barrier that you would not otherwise be able to get).

You can simulate the "process attach/detach" pretty easily in
FreeBSD, by adding entries to the shared object's CTOR and DTOR
linker set lists, which are processed out of .init and .fini to
ensure construction/destruction of statically declared class
instances with pure virtual base classes, when any such module
is loaded.  See "TEXT_SET()".  Unfortunately, the DTOR handling
is not via .fini processing at detach time, it's via atexit()
processing, so you can't depend on destruction.  That is a bad
problem if your class is expected to destroy subclasses hung
off a list, or pointed to from the class(es) in the object you
are unloading.
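
On a GCC or Clang toolchain, load-time and unload-time hooks of this
kind can be sketched with the constructor and destructor function
attributes.  This is a swapped-in technique, not the TEXT_SET()
linker-set mechanism named above, and module_attached,
module_attach and module_detach are hypothetical names.

```c
#include <assert.h>

/* Hypothetical state flag: 1 while the object counts as attached. */
int module_attached = 0;

/*
 * Runs when the object containing it is loaded; for code linked into
 * the executable itself, that is before main().
 */
__attribute__((constructor))
static void
module_attach(void)
{
	assert(module_attached == 0);
	module_attached = 1;
}

/*
 * Runs at unload or at normal process exit; which of the two applies
 * is up to the runtime linker.
 */
__attribute__((destructor))
static void
module_detach(void)
{
	module_attached = 0;
}
```

Whether the destructor fires at dlclose() time or only at exit is
exactly the platform-dependent point raised above, so the sketch
makes no promise about it.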

-- Terry
