Message-ID: <3CAA4C43.1F19A758@mindspring.com>
Date: Tue, 02 Apr 2002 16:26:43 -0800
From: Terry Lambert
To: "E.B. Dreger"
Cc: Alfred Perlstein, hackers@freebsd.org
Subject: Re: dlopen(), ld.so, and library wrappers

"E.B. Dreger" wrote:
> So you're saying that:
>
> 1) Program foo contains function called getpid()
> 2) Program foo links with libc
> 3) foo.getpid uses dlsym(RTLD_NEXT, "getpid") to call libc.getpid
> 4) Shared object bar, loaded by foo, will receive a pointer to
>    foo.getpid instead of libc.getpid when getpid is resolved.
>
> ???
>
> If that'll work, great.  But I didn't interpret the manpage that
> way.  I read it as shared object bar being able to provide a
> wrapper for program foo, which is the _opposite_ of what I want.
>
> I'll try the above.  If that doesn't do what I want, I'll see if
> I can restate more clearly.

I recently wanted to do what you want to do, which is to write a
program that could load shared object modules, and have the external
symbol references of the modules resolve to symbols defined in the
program.
Your example case is more complicated, in that it wants the symbols
in the program to have duplicates in libc.so, and to resolve the
symbols preferentially to the ones defined in the program.

The answer is: it works as expected.

Alfred's point about inferior nodes is actually not quite correct,
in practice.  In practice, if I have an arbitrary load order for
modules, things can be undefined.  Specifically:

main:
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/syscall.h>

    /* the program's own getpid(), which duplicates the one in libc */
    pid_t
    getpid(void)
    {
        printf("main getpid\n");
        return syscall(SYS_getpid);
    }

    int
    main(void)
    {
        /* load modules "fee" and "foo" in any order */
        ...
        foomain();
    }
    ...

foo.so:
    /* calls whichever getpid() the runtime linker binds it to */
    void
    foomain(void)
    {
        printf("pid is %d\n", getpid());
    }
    ...

fee.so:
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/syscall.h>

    /* a second getpid(), this one defined in a module */
    pid_t
    getpid(void)
    {
        printf("fee getpid\n");
        return syscall(SYS_getpid);
    }
    ...

Which one foomain() ends up calling depends on whether fee.so was
loaded before or after foo.so was loaded.

The other PITA here is that when you are linking, you can't know
that all your symbols are resolved, and that you haven't forgotten
a shared library on the linkage line, because the linking of shared
objects doesn't treat them as RTLD_NOW at "ld" invocation time.
This is arguably a linker bug, since it's not reporting missing
symbols at link time.

You can recreate this without shared objects that you are going to
dlopen, simply by creating a shared library that references symbols
in a second shared library (no static data references!), and then
using routines in the first library in your main.  The linker won't
bitch (but should!) when you link your program only against the
first library.  Instead, it crashes at runtime when it does
RTLD_LAZY evaluation of shared library symbols... and they aren't
there (which would be OK, if the initial linktime treated symbol
graph resolution as RTLD_NOW; this would take about 350 lines of
code in ld.so, the last time I looked).
Also, you can't intentionally override functions in a shared library
referenced from main() with symbols in a shared object, without
doing explicit work to do the override.

Also, the dlopen(NULL, mode); doesn't work the way you would expect
(or want) with the RTLD_GLOBAL and RTLD_LAZY flags.

A common way of addressing ordering issues is to dereference your
function pointers out of a struct of function pointers, which you
obtain by asking a module entry point for the function pointers.

In Microsoft-land, this is called an "interface", but an "interface"
is just a pure virtual base class with a default protection type of
"public", so you can just "#define interface struct", and be there.

Basically, the module passes back a pointer to an implementation
class for the pure virtual base class (this entry point in
Microsoft-land is called "IUnknown"; it's a basic extension to the
OLE mechanism, which defines some additional well known entry
points, and which is a subset of COM, which defines entry points
like "process attach/detach" and "thread attach/detach").

Actually, FreeBSD could use some of these, particularly for dealing
with threading on things like the UMICH and Netscape LDAP libraries,
which are not intrinsically thread-safe (the thread-attach/detach
permits you to insert a serialization barrier that you would not
otherwise be able to get).

You can simulate the "process attach/detach" pretty easily in
FreeBSD, by adding entries to the shared object's CTOR and DTOR
linker set lists, which are processed out of .init and .fini to
ensure construction/destruction of statically declared class
instances with pure virtual base classes, when any such module is
loaded.  See "TEXT_SET()".
Unfortunately, the DTOR handling is not via .fini processing at
detach time, it's via atexit() processing, so you can't depend on
destruction, which is a bad problem if your class is expected to
destroy subclasses hung off a list, or with pointers to them, in
the class(es) in the object you are unloading.

-- Terry