Date:      Wed, 24 Feb 1999 19:00:29 -0800 (PST)
From:      John Polstra <jdp@polstra.com>
To:        Archie Cobbs <archie@whistle.com>
Cc:        terry@whistle.com, hackers@FreeBSD.ORG
Subject:   Re: Interesting ld.so bug
Message-ID:  <XFMail.990224190029.jdp@polstra.com>
In-Reply-To: <199902230410.UAA72778@bubba.whistle.com>

Archie Cobbs wrote:
> John Polstra writes:
>> > Now when we run a java class that uses the java_jni.c native method,
>> > the call to Java_bar1() succeeds, and the call from there to bar1()
>> > succeeds, but when bar1() tries to call bar2(), it jumps to a very
>> > low address and segfaults. It seems that the bar2() trampoline is
>> > using an uninitialized base address or whatever.
>> > 
>> > NOW, if we remove "db.c" from the compilation of "libfoo.so",
>> > then everything works!
>> 
>> Was the code in the static libgdbm.a library compiled with -fpic?
>> I bet it wasn't, and that's probably the problem.  All code that's
>> included in a shared library should be PIC code.

Hey, you never responded to that.  Could you try rebuilding
libgdbm.a with -fpic and see if it fixes the problem?

> Actually, now something else is going on..  here's some more info:
> 
>             With db.c     Without db.c
>             ---------     ------------
> 
> RTLD_LAZY      fails          works!
> 
> RTLD_NOW       fails          fails
> 
> Terry thinks there is a screwup in RTLD_NOW in that it's failing
> to recurse.

I don't think that's it.  The code is correct as far as I can see.
The relocation function isn't supposed to recurse -- it simply loops
over all objects that have been loaded since the last time it ran,
relocating each one of them.  The recursion was done prior to that,
in load_needed_objects().
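
To make the split between the two phases concrete, here is a toy model in C. It is only a sketch under assumptions: struct toy_obj, toy_load_needed() and toy_relocate_from() are hypothetical names invented for illustration, not the actual ld.so source. The loading step walks the dependency graph recursively and appends each object to a flat list; the relocation step just loops over that list from a given starting point.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NEEDED 4

/* Hypothetical, simplified stand-in for a loaded shared object. */
struct toy_obj {
    const char *name;
    struct toy_obj *needed[TOY_MAX_NEEDED]; /* direct dependencies, NULL-padded */
    struct toy_obj *next;                   /* global list, in load order */
    int loaded;
    int relocated;
};

/* Head and tail of the toy load list. */
struct toy_obj *toy_list_head;
struct toy_obj *toy_list_tail;

/* Phase 1: append obj to the load list, then recurse into its dependencies.
 * This is where all of the recursion happens in this model. */
void toy_load_needed(struct toy_obj *obj)
{
    assert(obj != NULL);
    if (obj->loaded)
        return;                 /* already on the list, nothing to do */
    obj->loaded = 1;
    obj->next = NULL;
    if (toy_list_tail != NULL)
        toy_list_tail->next = obj;
    else
        toy_list_head = obj;
    toy_list_tail = obj;
    for (int i = 0; i < TOY_MAX_NEEDED && obj->needed[i] != NULL; i++)
        toy_load_needed(obj->needed[i]);
}

/* Phase 2: a flat loop over the list starting at `first`. No recursion is
 * needed because phase 1 already put every dependency on the list.
 * Returns the number of objects relocated by this call. */
int toy_relocate_from(struct toy_obj *first)
{
    int count = 0;
    for (struct toy_obj *o = first; o != NULL; o = o->next) {
        if (!o->relocated) {
            o->relocated = 1;   /* real code would process relocation records here */
            count++;
        }
    }
    return count;
}
```

In this model, loading a libfoo.so that needs libgdbm.so and libc.so puts all three on the list, and a single pass of toy_relocate_from() starting at the first newly loaded object handles every one of them.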

Any time lazy binding works when immediate binding fails, it's most
likely because the program doesn't ever actually call the problematic
function.  So it never has to be bound, and the bug is never
encountered.
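
The same point as a toy model in C. This is a sketch under assumptions, not the dynamic linker's real data structures: struct toy_slot and the toy_* functions are hypothetical names made up for illustration. Each imported function gets a slot; immediate binding fills every slot at load time and fails on the first one it cannot resolve, while lazy binding fills a slot only when the function is first called.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_fn)(void);

/* Hypothetical slot for one imported function. */
struct toy_slot {
    const char *symbol;
    toy_fn definition;  /* what symbol lookup would find; NULL = cannot be bound */
    toy_fn target;      /* NULL until the slot has been bound */
};

/* Bind a single slot. Returns 0 on success, -1 if it cannot be resolved. */
int toy_bind(struct toy_slot *s)
{
    assert(s != NULL);
    if (s->definition == NULL)
        return -1;
    s->target = s->definition;
    return 0;
}

/* Immediate binding: every slot must bind at load time, called or not. */
int toy_load_now(struct toy_slot *slots, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (toy_bind(&slots[i]) != 0)
            return -1;
    return 0;
}

/* Lazy binding: loading binds nothing, so it cannot fail on a bad slot. */
int toy_load_lazy(struct toy_slot *slots, size_t n)
{
    (void)slots;
    (void)n;
    return 0;
}

/* Call through a slot, binding it on first use.
 * Returns -1 if the slot cannot be bound, else 0 with the result in *out. */
int toy_call(struct toy_slot *s, int *out)
{
    if (s->target == NULL && toy_bind(s) != 0)
        return -1;
    *out = s->target();
    return 0;
}

/* A function that resolves fine, used as the "good" import. */
int toy_bar1(void)
{
    return 1;
}
```

With one good slot and one unresolvable slot, toy_load_now() fails up front, whereas toy_load_lazy() succeeds and the program keeps working for as long as it only calls through the good slot.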

> However, this can be worked around by adding this to the build
> of the library (discovered by Amancio):
> 
>   -export-dynamic -lgdbm -lc

If you are building a shared library, don't link it against non-PIC
static libraries.  It kills performance at best, and at worst it is
asking for trouble.  You should be using "-lc_pic" here, and libgdbm
needs to have PIC object files in it.

The --export-dynamic thing might be a clue, but it's awfully hard
for me to tell.  There's no way I can duplicate the problem in any
reasonable amount of time based on the description you guys have given
me.  Every time I think of trying it, I start to wonder if this is a
trick -- like maybe I'm on Candid Camera, or America's Funniest Home
Videos.  "Now let's see if we can get him to attempt constructing THIS
test case, *snicker* *giggle* ..." :-)

Could you try to distill the test case down to a set of files packed
all together into one gzipped tar file smaller than 2 MB, with a
Makefile such that I can type "make" to build it and "./test" to run
it, and I don't have to install anything into root-owned directories?
That's the only way I'm going to be able to do anything on this one.

John
---
  John Polstra                                               jdp@polstra.com
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Nobody ever went broke underestimating the taste of the American public."
                                                            -- H. L. Mencken





