Date:      Tue, 6 Nov 2018 19:12:22 -0800
From:      Mark Millard <marklmi26-fbsd@yahoo.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        svn-src-head@freebsd.org
Subject:   Re: svn commit: r339876 - head/libexec/rtld-elf
Message-ID:  <8FFCF603-6315-4D1C-858B-FC7233C17DD7@yahoo.com>
In-Reply-To: <20181102185014.GP5335@kib.kiev.ua>
References:  <8E5A5F3A-F1A7-4702-A2F7-65D74CC5B2E5@yahoo.com> <20181102004101.GI5335@kib.kiev.ua> <E44F5772-1F8A-40B8-9C4E-B8362B768F37@yahoo.com> <003A49D7-6E8B-4775-A70B-E0EB44505D4B@yahoo.com> <20181102113827.GM5335@kib.kiev.ua> <7B29A4C8-228D-41CB-B594-98DFA456E9C8@yahoo.com> <20181102155234.GN5335@kib.kiev.ua> <E93B3880-281E-482C-9DA7-851398543B97@yahoo.com> <20181102185014.GP5335@kib.kiev.ua>

[I present a little information about the longer-existing
failure's odd backtrace for /libexec/ld-elf.so.1 /bin/ls,
but on powerpc64 FreeBSD instead of 32-bit powerpc FreeBSD.]

On 2018-Nov-2, at 11:50, Konstantin Belousov <kostikbel at gmail.com> wrote:

> On Fri, Nov 02, 2018 at 10:38:08AM -0700, Mark Millard wrote:
>> On 2018-Nov-2, at 8:52 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
>>
>>> . . .
>>
>> That seems better. But it crashes during /bin/ls execution
>> ( 0x0180???? addresses ), apparently in a library routine
>> ( 0x41?????? addresses ):
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x411220b4 in ?? ()
>> (gdb) bt
>> #0  0x411220b4 in ?? ()
>> #1  0x4112200c in ?? ()
>> #2  0x01803c84 in ?? ()
>> #3  0x018023b4 in ?? ()
>> #4  0x010121a0 in .rtld_start () at /usr/src/libexec/rtld-elf/powerpc/rtld_start.S:112
>>
>> Using a normal gdb run of /bin/ls suggests:
>>
>> #2  0x01803c84 in ?? () should be in main and seems to be: bl 0x1818914 <getopt_long@plt>
>> #3  0x018023b4 in ?? () should be in _start
>>
>> Looking in the test context:
>>
>>   0x1803c80:	bl      0x1818914
>>   0x1803c84:	cmpwi   cr7,r3,-1
>>
>> and:
>>
>>   0x1818914:	li      r11,59
>>   0x1818918:	b       0x18186f4
>>
>> and:
>>
>>   0x18186f4:	rlwinm  r11,r11,2,0,29
>>   0x18186f8:	addis   r11,r11,386
>>   0x18186fc:	lwz     r11,-30316(r11)
>>   0x1818700:	mtctr   r11
>>   0x1818704:	bctr
>>
>> Breaking at the bctr and using info reg:
>>
>> r11            0x4125ffa0	1093009312
>>
>> It looks like there is some amount of
>> activity before the traceback addresses
>> show up.
>>
>> I've not found a good way to fill in the "in ??()"
>> (or analogous) information. The addresses 0x411220??
>> do not match up with a normal run of /bin/ls from
>> gdb: the addresses can not be accessed.
>>
>> It does appear that the code is in /lib/libc.so.7 in the
>> test context:
>>
>> Breakpoint 2, reloc_non_plt (obj=0x41041600, obj_rtld=0x41104b57, flags=4, lockstate=0x0) at /usr/src/libexec/rtld-elf/powerpc/reloc.c:338
>> . . .
>>
> There seems to be an issue with the direct execution mode on ppc.
> Even otherwise working ld-elf.so.1 segfaults if I try to use it as
> standalone binary.
>
> But if I specify patched ld-elf.so.1 as the interpreter for some program,
> using 'cc -Wl,-I,<path>/ld-elf.so.1' it works.  So I see there two bugs,
> one is regression due to textsize calculation, which should be fixed by
> my patch.  Another is the direct exec problem.

I've got a little more information about the odd backtrace
from the /libexec/ld-elf.so.1 /bin/ls failure that the
prior patch allowed getting to, although for a powerpc64
example context.

The information only identifies where the code was
in /bin/ls and /lib/libc.so.7 in the backtrace. For
libc.so.7 I found the same code sequences in a gdb run of
/bin/ls directly. I matched one sequence first, used its
address there vs. in the /libexec/ld-elf.so.1 /bin/ls
process to find the offset for going back and forth, and
then used that to find the code for the 2nd backtrace address.

Overall it suggests to me that (in somewhat
symbolic terms):

bl     <00001322.plt_call.getenv>

eventually led to executing the wrong code.


The supporting detail is as follows.

For the /libexec/ld-elf.so.1 part of the backtrace it
was easy to find where the code was:

(gdb) run /bin/ls
Starting program: /libexec/ld-elf.so.1 /bin/ls

Program received signal SIGSEGV, Segmentation fault.
0x000000080118d81c in ?? ()
(gdb) bt
#0  0x000000080118d81c in ?? ()
#1  0x000000080118d920 in ?? ()
#2  0x0000000010002558 in ?? ()
#3  0x00000000100037b0 in ?? ()
#4  0x0000000001018450 in ._rtld_start () at /usr/src/libexec/rtld-elf/powerpc64/rtld_start.S:104
Backtrace stopped: frame did not save the PC

(gdb)
101		ld      %r7,128(%r1)	/* exit proc */
102		ld      %r8,136(%r1)	/* ps_strings */
103
104		blrl	/* _start(argc, argv, envp, obj, cleanup, ps_strings) */
105
106		li      %r0,1		/* _exit() */
107		sc


For the /bin/ls part of the backtrace it was also easy
to find where the code was:

(gdb) symbol-file /bin/ls
Load new symbol table from "/bin/ls"? (y or n) y
Reading symbols from /bin/ls...Reading symbols from /usr/lib/debug//bin/ls.debug...done.
done.
(gdb) bt
#0  0x000000080118d81c in ?? ()
#1  0x000000080118d920 in ?? ()
#2  0x0000000010002558 in main (argc=<optimized out>, argv=0x80134bdb0) at /usr/src/bin/ls/ls.c:268
#3  0x00000000100037b0 in _start (argc=<optimized out>, argv=0x3fffffffffffdb70, env=0x3fffffffffffdb88, obj=<optimized out>, cleanup=<optimized out>, ps_strings=<optimized out>)
    at /usr/src/lib/csu/powerpc64/crt1.c:96
#4  0x0000000001018450 in ?? ()
#5  0x0000000000000000 in ?? ()

(gdb) fr 3
#3  0x00000000100037b0 in _start (argc=<optimized out>, argv=0x3fffffffffffdb70, env=0x3fffffffffffdb88, obj=<optimized out>, cleanup=<optimized out>, ps_strings=<optimized out>)
    at /usr/src/lib/csu/powerpc64/crt1.c:96
96		exit(main(argc, argv, env));
(gdb) down
#2  0x0000000010002558 in main (argc=<optimized out>, argv=0x80134bdb0) at /usr/src/bin/ls/ls.c:268
268		while ((ch = getopt_long(argc, argv,



For the messy /lib/libc.so.7 part of the backtrace both
addresses are in getopt_internal. I show extractions from
the gdb /bin/ls output because it has helpful symbolic
information displayed. But that means that the addresses
are offset from those in the bt for the failure process.

For #1  0x000000080118d920 in ?? () I end up with:

(gdb) x/32i 0x81019b6c0+0xad0-0x880
   0x81019b910 <getopt_internal+592>:	stw     r9,0(r18)
   0x81019b914 <getopt_internal+596>:	addis   r3,r2,-5
   0x81019b918 <getopt_internal+600>:	addi    r3,r3,30120
   0x81019b91c <getopt_internal+604>:	bl      0x81018dfe0 <00001322.plt_call.getenv>
   0x81019b920 <getopt_internal+608>:	ld      r2,40(r1)

(The machine code around it all matches around
0x000000080118d920 in the failure context.)

The getenv call in the source is the 2nd line of:

        if (posixly_correct == -1 || optreset)
                posixly_correct = (getenv("POSIXLY_CORRECT") != NULL);

For #0  0x000000080118d81c in ?? () I end up with:

(gdb) x/32i 0x81019b6c0+0xad0-0x880-0x110
   0x81019b800 <getopt_internal+320>:	bne     cr7,0x81019b868 <getopt_internal+424>
   0x81019b804 <getopt_internal+324>:	lwa     r5,0(r29)
   0x81019b808 <getopt_internal+328>:	stw     r17,0(r18)
   0x81019b80c <getopt_internal+332>:	cmpw    cr7,r5,r19
   0x81019b810 <getopt_internal+336>:	bge     cr7,0x81019ba60 <getopt_internal+928>
   0x81019b814 <getopt_internal+340>:	rldicr  r9,r5,3,60
   0x81019b818 <getopt_internal+344>:	ldx     r10,r20,r9
   0x81019b81c <getopt_internal+348>:	lbz     r9,0(r10)

with the failure being that r10 is zero in that last
line above. Again the surrounding code matches.

The source code line is reported to be:

                if (*(place = nargv[optind]) != '-' ||
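
As a rough cross-check of how the last three instructions above line up with
nargv[optind], here is a small Python sketch. It is not part of the gdb
sessions. It assumes that r5 holds optind and r20 holds nargv, which is a
reading of the disassembly rather than something gdb reported; the helper
names and the sample optind value are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rotl64(value, shift):
    """Rotate a 64-bit value left by `shift` bits."""
    shift %= 64
    value &= MASK64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def rldicr(rs, sh, me):
    """Emulate PowerPC rldicr: rotate left by sh, keep IBM bits 0..me.

    IBM bit numbering counts from the most significant bit (bit 0),
    so keeping bits 0..me clears the low (63 - me) bits.
    """
    keep_mask = (MASK64 << (63 - me)) & MASK64
    return rotl64(rs, sh) & keep_mask


def nargv_slot_offset(optind):
    """Byte offset of nargv[optind], as `rldicr r9,r5,3,60` computes it."""
    return rldicr(optind, 3, 60)


if __name__ == "__main__":
    optind = 1  # hypothetical value, for illustration only
    print(hex(nargv_slot_offset(optind)))
```

Under that reading, rldicr r9,r5,3,60 is just optind * 8 (the size of a
pointer on powerpc64), ldx r10,r20,r9 loads the pointer stored at
nargv + optind * 8, and lbz r9,0(r10) reads its first character. A zero in
r10 would then mean nargv[optind] was read as a NULL pointer in the
failure process.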

I got the line number information from breakpoints 3 and 4
below (from the gdb /bin/ls process):

(gdb) info br
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000010002360 in main at /usr/src/bin/ls/ls.c:231
	breakpoint already hit 1 time
3       breakpoint     keep y   0x000000081019b81c in getopt_internal at /usr/src/lib/libc/stdlib/getopt_long.c:411
4       breakpoint     keep y   0x000000081019b91c in getopt_internal at /usr/src/lib/libc/stdlib/getopt_long.c:379

Line 379 has the getenv call, matching the machine code showing
the call.

(I set the breakpoints just as a way of using "info br" to list
the information later.)
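
The back-and-forth between the two processes' libc addresses can be
sketched as below. This is only an illustration of the arithmetic: the
addresses are copied from the gdb output in this message, while the
constant and function names are made up for the example. It assumes libc
is simply loaded at a different base in the two processes, so one matched
pair of addresses gives the offset for the rest.

```python
# Addresses copied from the two gdb sessions in this message.
FAIL_FRAME1 = 0x80118D920  # bt #1 in the /libexec/ld-elf.so.1 /bin/ls process
FAIL_FRAME0 = 0x80118D81C  # bt #0 (the faulting lbz) in that same process
REF_FRAME1 = 0x81019B920   # matching code in the plain gdb /bin/ls process
REF_GETOPT_INTERNAL = 0x81019B6C0  # getopt_internal in the plain process


def load_delta(ref_addr, fail_addr):
    """Offset between the two processes, from one matched address pair."""
    return ref_addr - fail_addr


def to_reference(fail_addr, delta):
    """Translate a failure-process address into the plain process."""
    return fail_addr + delta


delta = load_delta(REF_FRAME1, FAIL_FRAME1)
ref_frame0 = to_reference(FAIL_FRAME0, delta)

print(hex(delta))                        # 0xf00e000
print(hex(ref_frame0))                   # 0x81019b81c
print(ref_frame0 - REF_GETOPT_INTERNAL)  # 348, i.e. getopt_internal+348
```

The translated #0 address agrees with breakpoint 3 above and with the
<getopt_internal+348> line in the disassembly.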

Overall this seems to suggest that:

bl     <00001322.plt_call.getenv>

led to something odd happening and got to the wrong
code.

That is all the additional information that I have
at this point. I hope it is of some use.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



