Date: Wed, 30 Apr 2003 18:11:03 -0700
From: Peter Wemm <peter@wemm.org>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
Cc: threads@freebsd.org
Subject: Re: Question about rtld-elf. Anyone?.. Anyone?
Message-ID: <20030501011103.6E20D2A7EA@canning.wemm.org>
In-Reply-To: <Pine.GSO.4.10.10304301927460.24833-100000@pcnet1.pcnet.com>

Daniel Eischen wrote:
> On Wed, 30 Apr 2003, Daniel Eischen wrote:
>
> > On Tue, 29 Apr 2003, Peter Wemm wrote:
> > > One way I've seen is to have libc and the respective pthreads libraries
> > > provide the public access to things like dlopen() etc.  That way, the
> > > threads package of your choice does its own serialization of the entry
> > > points into the dynamic linker guts/internals.  As John Polstra said
> > > earlier, he has some thoughts about how to make the actual lazy symbol
> > > lookup be thread-safe.
> >
> > I think this would work.  It could even be done in our libc, just
> > as malloc, stdio, and friends use locking stubs (overridden by our
> > threads libraries).
> >
> > > If I recall correctly, our old a.out based shared lib implementation did it
> > > precisely this way.  dlopen() was a function in libc, that called through
> > > a vector into the guts of ld.so.1.  The dynamic linker itself never provided
> > > direct call access to this stuff.  Some systems put these public functions
> > > in a separate library, -ldl.  The ELF implementation that we use does, and
> > > doesn't give the threads library a chance to wrap them.
> > >
> > > (And no, this is not an invitation for getting sidetracked on making
> > > ld-elf.so.1 into libdl.so.1 as a service library, etc etc)
> > >
> > > How would things go if we renamed the ld-elf.so functions to __rtld_dlopen()
> > > etc and then had libc provide a weak dlopen() function that redirected to
> > > __rtld_dlopen(), and give libpthread a chance to provide a replacement?
> > > And of course, deal with making the runtime symbol resolution as John
> > > suggested in the commit logs.
> >
> > Or just have libc provide the necessary locking so that we don't need
> > to repeat it in libc_r, libthr, and libpthread.
> >
> > Is a simple mutex around dlopen, dlsym, etc, sufficient?  We don't need
> > to handle recursive calls, right?
>
> As an experiment, I made the dlfoo calls in rtld-elf weak
> (__dlfoo -> dlfoo) and then overrode them in libpthread
> and protected them with mutexes.
>
> I can get mozilla to work about 1/2 of the time now, but
> it still gets stuck in the same state the other 1/2 of
> the time.  This is a bit of an improvement, and seems to
> indicate (at least to me) that rtld-elf is the culprit.

As John said, the problem is twofold.  One is the symbol resolution
itself, eg: when you access a function for the first time, a lazy
binding call happens.  He had ideas about how to make that fully reentrant.
The second problem was preventing dlopen() and friends being called in
parallel.  It sounds like you've dealt with only the second problem...

Cheers,
-Peter
--
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5
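
For illustration only, the "mutex around dlopen, dlsym, etc" idea from the
quoted text might look roughly like the sketch below.  The locked_dl*() names
are hypothetical and are not the rtld-elf, libc, or libpthread interfaces;
they simply call the public dlfcn.h functions under one lock.  The mutex is
made recursive on the assumption that a library initializer run from inside
dlopen() could itself call dlopen() or dlsym() on the same thread.  This
covers only the second problem above (parallel dlopen() and friends) and does
nothing for lazy binding.

```c
/* Request POSIX.1-2008 declarations (for PTHREAD_MUTEX_RECURSIVE). */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>

/* One process-wide lock for all wrapped entry points (hypothetical). */
static pthread_mutex_t dl_lock;
static pthread_once_t dl_once = PTHREAD_ONCE_INIT;

/* Create the lock as a recursive mutex, exactly once. */
static void dl_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    assert(rc == 0);
    rc = pthread_mutex_init(&dl_lock, &attr);
    assert(rc == 0);
    (void)rc;                       /* quiet unused warning under NDEBUG */
    pthread_mutexattr_destroy(&attr);
}

/* Serialized dlopen(): only one thread is inside the linker at a time. */
void *locked_dlopen(const char *path, int mode)
{
    void *handle;

    pthread_once(&dl_once, dl_lock_init);
    pthread_mutex_lock(&dl_lock);
    handle = dlopen(path, mode);
    pthread_mutex_unlock(&dl_lock);
    return handle;
}

/* Serialized dlsym(). */
void *locked_dlsym(void *handle, const char *name)
{
    void *sym;

    pthread_once(&dl_once, dl_lock_init);
    pthread_mutex_lock(&dl_lock);
    sym = dlsym(handle, name);
    pthread_mutex_unlock(&dl_lock);
    return sym;
}

/* Serialized dlclose(). */
int locked_dlclose(void *handle)
{
    int rc;

    pthread_once(&dl_once, dl_lock_init);
    pthread_mutex_lock(&dl_lock);
    rc = dlclose(handle);
    pthread_mutex_unlock(&dl_lock);
    return rc;
}
```

A caller would use locked_dlopen(NULL, RTLD_NOW) and friends in place of the
plain calls.  A call to a not-yet-bound function from any thread still enters
the linker's lazy binding path without taking this lock, which is consistent
with the "works about 1/2 of the time" result described above.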