From owner-freebsd-current Sat May 23 12:51:46 1998
Return-Path:
Received: (from majordom@localhost)
        by hub.freebsd.org (8.8.8/8.8.8) id MAA19895
        for freebsd-current-outgoing; Sat, 23 May 1998 12:51:46 -0700 (PDT)
        (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from smtp04.primenet.com (daemon@smtp04.primenet.com [206.165.6.134])
        by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id MAA19881
        for ; Sat, 23 May 1998 12:51:43 -0700 (PDT)
        (envelope-from tlambert@usr07.primenet.com)
Received: (from daemon@localhost)
        by smtp04.primenet.com (8.8.8/8.8.8) id MAA13712;
        Sat, 23 May 1998 12:51:41 -0700 (MST)
Received: from usr07.primenet.com(206.165.6.207)
        via SMTP by smtp04.primenet.com, id smtpd013704;
        Sat May 23 12:51:40 1998
Received: (from tlambert@localhost)
        by usr07.primenet.com (8.8.5/8.8.5) id MAA10260;
        Sat, 23 May 1998 12:51:36 -0700 (MST)
From: Terry Lambert
Message-Id: <199805231951.MAA10260@usr07.primenet.com>
Subject: Re: Fix for undefined "__error" and discussion of shared object versioning
To: syssgm@dtir.qld.gov.au (Stephen McKay)
Date: Sat, 23 May 1998 19:51:36 +0000 (GMT)
Cc: tlambert@primenet.com, freebsd-current@FreeBSD.ORG, syssgm@dtir.qld.gov.au
In-Reply-To: <199805231040.UAA02235@troll.dtir.qld.gov.au> from "Stephen McKay" at May 23, 98 08:40:06 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> I'm sure you have misread my message.  Here is a diff from the test code
> you sent on 20 May 1998 03:12:25 +0000 to the test code I sent back on
> 20 May 1998 17:47:10 +1000:

[ ... ]

> All I did was count the number of times ___error() is called.  I didn't
> rename the symbol.  Since ___error() is called even when linked with -lc_r
> I conclude that __error() in libc_r is not overriding the weak __error
> supplied by your modified errno.h.
> Thus, threaded applications would
> all share the same errno instead of getting one each, which led to my
> claim that a more extensive multithreaded test case is required.

I see what you are attempting.

The weak symbol is apparently being screwed over by our linker
*before* the libraries are examined for identical non-weak symbols.

Specifically, in pass 1, in ld.c, there is code:

	/*
	 * If this symbol has acquired final definition, we're done.
	 * Commons must be allowed to bind to shared object data
	 * definitions.
	 */
	if (sp->defined &&
	    (sp->common_size == 0 ||
	     relocatable_output || building_shared_object)) {
		if ((sp->defined & N_TYPE) == N_SETV)
			/* Allocate zero entry in set vector */
			setv_fill_count++;
		/*
		 * At this stage, we do not know whether an alias
		 * is going to be defined for real here, or whether
		 * it refers to a shared object symbol. The decision
		 * is deferred until digest_pass2().
		 */
		if (!sp->alias)
			defined_global_sym_count++;
		continue;
	}

This causes the symbol to be bound to the weak value, even though
there is a shared library definition.

This is *WRONG*.  The ld program is *BROKEN*.

So the problem you are seeing is specifically because the *PROGRAM*
object has the weak definition.  This will never be the case for
the legacy code you are dealing with.

In the shared library case, the loading of shared objects and the
resolution of weak symbols is, in fact, correct.

Practically, this means that the weak __error definition to ___error
*WILL* work, but *ONLY* if it occurs in shared objects, and *NOT* in
the main program.

This was the point of the _ERRNO_ protection of the static function
and weak symbol definition, in my last posting.  You don't have it
defined when you compile normal programs.  It would be better to tag
this off of "SHARABLE" or whatever the compiler likes to define when
you are compiling PIC code for a shared library.

> In other words, there is now just one errno because ___error from errno.h
> is used in preference to __error in libc_r.

That's because you are defining it in your objects that you are linking
against, as well as your stub shared library.

> Yes, I've looked at these.  That's why I'm so disappointed that the
> technique doesn't work.  Having played with it a bit, I'm now convinced
> that no tweaking with errno.h can ever fix the problem.

About 8 hours of work on ld could fix it.  I hacked together a stupid
instrumented ld that almost works in about an hour.  It still doesn't
do the right thing, quite, since it doesn't put the correct shared
library offset into the symbol definition; it does, however, put the
expected ___errno string in the relocation symbol external name
references, so it's a matter of clobbering one value (or more
accurately, pulling the weak value off of nzlist in favor of the
shared object value).

> >For programs *already* linked against libc_r instead of libc, or
> >linked against the new libc, I *EXPECT* the standin to *NEVER* be
> >called.
>
> Yes!  This is where I claim the experimental evidence is against you.

You are still running the wrong experiment, I think.  The errno.h
static function and weak symbol declaration should *ONLY* occur in
shared library compilations, and *not* when you include errno.h in
the objects for the program you are linking against the shared
libraries.

> Spurred by your description of load ordering, I built a small library
> (lib__error.so) containing just /usr/src/lib/libc/sys/__errno.c with
> an execution counter in __error_unthreaded.  I linked this to a small
> test program, using -lc_r as well.

[ ... ]

> Output of ldd:
>
> foo:
>         -l__error.0 => /syshome/syssgm/lib/lib__error.so.0.0 (0x20014000)
>         -lc_r.3 => /usr/lib/libc_r.so.3.0 (0x20019000)
>         -lc.3 => /usr/lib/libc.so.3.1 (0x2009b000)
> -----------------------------------------------------------------------------
> Output of foo:
> -----------------------------------------------------------------------------
> errno is 0
> count is 1
> errno is 21
> count is 3
> -----------------------------------------------------------------------------

The problem here is that foo is getting the __error = ___error from
foo.o, not from the shared library.

I would expect the strong __error in

	/usr/lib/libc_r.so.3.0

to override the weak __error = ___error in

	/syshome/syssgm/lib/lib__error.so.0.0

But it's *NOT* going to override the __error = ___error definition
that occurs in foo.o, because of the ld bug (see above) which prevents
strong references in shared libraries from overriding weak references
in the user's code.

This may, in fact, be broken for certain shared library data
definitions as well (I haven't looked closely, only close enough to
see that it seems wrong, even in the case where common_size != 0).

> Now on to a hack that actually works:

[ ... hack to ld.so ... ]

I'm anxious about this hack because what you are doing is covering
a bug in ld that is interfering with your test case.

I think this can be adequately dispensed with by doing the right
thing in errno.h and bsd.lib.mk.

> So, for the folks that really care about this, we now have 3 possible
> options:
>
> 1) back out the errno change, and possibly put it back after ELF.
>
> 2) hack ld.so (prototype works fine)
>
> 3) bump ALL library major numbers
>
> Which will it be?

There are two more:

4)	hack errno.h to define the weak symbol mechanism I proposed,
	and fix ld so that errno.h doesn't have to know that a
	shared library compilation unit is including it.

5)	hack errno.h to define the weak symbol mechanism I proposed,
	and hack bsd.lib.mk so that errno.h knows that a shared
	library compilation unit is including it.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
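
For readers who want to see the shape of the weak symbol mechanism that
options 4 and 5 refer to, here is a minimal sketch. It is not the
errno.h change discussed in this thread. Every name in it is
hypothetical: demo_error stands in for __error, the weak function body
stands in for the ___error standin, and DEMO_SHLIB_BUILD stands in for
whatever macro the library build would define. It also assumes a
compiler that accepts the GCC-style __attribute__((weak)) syntax, which
is not what the a.out toolchain of the time used.

```c
#include <errno.h>

/*
 * Hypothetical guard macro.  In the scheme of option 5 this would be
 * supplied by the library build (for example a -D flag), never by an
 * ordinary program compile.  It is defined here only so the sketch
 * compiles on its own.
 */
#define DEMO_SHLIB_BUILD 1

/* Counts calls into the weak standin, like the counter in the test case. */
static int standin_calls;

#ifdef DEMO_SHLIB_BUILD
/*
 * Weak fallback: a strong demo_error from another object or library is
 * expected to take precedence when one is linked in.  With no strong
 * definition present, this one is used.
 */
__attribute__((weak)) int *demo_error(void)
{
	standin_calls++;
	return &errno;
}
#else
/* Ordinary programs only see the declaration and bind to a library copy. */
extern int *demo_error(void);
#endif

/* Mirrors the usual "#define errno (*__error())" arrangement. */
#define demo_errno (*demo_error())

/* Touch demo_errno twice and report how often the standin ran. */
int demo_run(void)
{
	demo_errno = 0;
	demo_errno = EISDIR;
	return standin_calls;
}
```

Built on its own, with no strong demo_error anywhere, each use of
demo_errno goes through the weak standin, so the counter rises. The
point of the guard is that an object compiled without the macro carries
no weak definition of its own for the linker to bind to.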