Date:      Sun, 3 Jun 2001 22:03:31 +0200 (CEST)
From:      "Hartmann, O." <ohartman@klima.physik.uni-mainz.de>
To:        Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
Cc:        John Polstra <jdp@polstra.com>, <stable@FreeBSD.ORG>
Subject:   Re: NIS/YP still broken!
Message-ID:  <Pine.BSF.4.33.0106032156210.1370-100000@klima.physik.uni-mainz.de>
In-Reply-To: <20010603194836.A34626@curry.mchp.siemens.de>

Here I am again; I had a lot of work last night ...


:>On Sat, 02-Jun-2001 at 15:22:37 -0700, John Polstra wrote:
:>> In article <Pine.BSF.4.33.0106021209140.10271-100000@klima.physik.uni-mainz.de>,
:>> Hartmann, O. <ohartman@klima.physik.uni-mainz.de> wrote:
:>> >
:>> > FreeBSD 4.3-STABLE still has a broken NIS/YP! If there is more than
:>> > one slave server to which ypxfrd should distribute its tables, the
:>> > push seems to lock up and time out.
:>> >
:>> > This was reported here earlier and I got a 'fix' for it, but the fix
:>> > hasn't been merged because it targets a symptom, not the cause itself.
:>
:>I will happily jump in here since I can easily reproduce it.
:>
:>
:>> We would love to fix this, but unfortunately the people who can debug
:>> it have not been able to reproduce the problem.  If you are willing to
:>> help, maybe you can debug it by remote control. :-)
:>
:>Well, if these people would like a step-by-step guide on how to
:>reproduce it, I can try...

Well, this problem occurs on ALL systems here that are configured as
NIS/YP servers and run a recent FreeBSD 4.3-STABLE. It should be
possible to reproduce the problem!

:>
:>
:>> Currently, my best hypothesis about the cause of this problem is that
:>> yppush is reading from an invalid memory address which happens to fall
:>> into the region occupied by the dynamic linker.  Thus making small
:>> changes to the dynamic linker causes the behavior of yppush to change.
:>>
:>> To test this hypothesis, let's try an experiment.  Please apply the
:>> patch below to "/usr/src/usr.sbin/yppush/yppush_main.c":
:>>
:>> Index: yppush_main.c
:>> ===================================================================
:>> RCS file: /home/ncvs/src/usr.sbin/yppush/yppush_main.c,v
:>> retrieving revision 1.11
:>> diff -u -r1.11 yppush_main.c
:>> --- yppush_main.c	1999/08/28 01:21:09	1.11
:>> +++ yppush_main.c	2001/06/02 21:35:11
:>> @@ -545,6 +545,11 @@
:>>  	struct hostlist *tmp;
:>>  	struct sigaction sa;
:>>
:>> +	static char *rtld_base = (char *)0;	/* Patch me */
:>> +	static char *rtld_limit = (char *)0;	/* Patch me too */
:>> +	if (rtld_base != NULL && rtld_limit > rtld_base)
:>> +		munmap(rtld_base, rtld_limit - rtld_base);
:>> +
:>>  	while ((ch = getopt(argc, argv, "d:j:p:h:t:v")) != -1) {
:>>  		switch(ch) {
:>>  		case 'd':
:>>
:>> Then rebuild and reinstall yppush like this:
:>>
:>> 	make clean
:>> 	make obj
:>> 	make depend
:>> 	DEBUG_FLAGS=-g make
:>> 	STRIP= make install
:>>
:>> and verify that the program is still failing.  I hope it will still
:>> fail, or we are out of luck.
:>
:>Ack, the program still fails as before.

I patched the source, too. yppush fails as before, but I have trouble catching
its process ID and getting the memory map as described. Maybe I'm too stupid
to get it.
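
For what it is worth, one way to catch the process ID from a shell might
be to start yppush in the background and use "$!". The sketch below is
only that, a sketch: dump_map is a hypothetical helper name, and the
PROC_ROOT variable is an assumption added so the function can be tried
against a scratch directory; on a system with procfs mounted it would
simply default to /proc.

```shell
#!/bin/sh
# Hypothetical helper: print the procfs memory map of process $1.
# PROC_ROOT is an assumption for testing; it defaults to /proc.
dump_map() {
    pid=$1
    mapfile="${PROC_ROOT:-/proc}/$pid/map"
    if [ ! -r "$mapfile" ]; then
        echo "cannot read $mapfile (process gone, or procfs not mounted?)" >&2
        return 1
    fi
    # Read with dd and a large block size, as suggested in the thread,
    # because cat often fails on these files.
    dd bs=64k < "$mapfile" 2>/dev/null
}

# Intended use (not run here): start yppush in the background with its
# usual arguments, then dump the map while it is still running.
#   yppush ... &
#   dump_map $!
```

If yppush exits before the map can be read, the sleep(30) call suggested
in the quoted text below should keep it alive long enough.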


:>
:>
:>> As it is shown here, the patch should do nothing.  Next you must
:>> determine where the dynamic linker is loaded, and patch the low and
:>> high limits into the two lines labeled "Patch me" and "Patch me too".
:>> You can do this as follows.  Run yppush manually and see what its
:>> process ID is.  While the program is still running, display its map
:>> file "/proc/PID/map".  For example, if the process ID is 12345 you
:>> would want to see "/proc/12345/map".  I recommend that you look at the
:>> file like this:
:>>
:>>     dd bs=64k < /proc/12345/map
:>>
:>> since "cat" often doesn't work on these kinds of files.  I hope that
:>> yppush will run long enough for you to snare this information.  If
:>> it finishes too quickly, try adding a call ``sleep(30)'' just after
:>> the added lines in yppush_main.c.
:>>
:>> The map file will resemble this:
:>>
:>> 0x8048000 0x8049000 1 0 0xcb8a78a0 r-x 1 0 0x0 COW NC vnode
:>> 0x8049000 0x804a000 1 0 0xcb79d1e0 rw- 1 0 0x2180 NCOW NNC default
:>>
:>> 0x28049000 0x2805a000 17 0 0xcb55a120 r-x 38 19 0x4 COW NC vnode
:>> 0x2805a000 0x2805b000 1 0 0xcb39b120 rw- 1 0 0x2180 COW NNC vnode
:>> 0x2805b000 0x2805d000 2 0 0xcb5c6a20 rw- 2 0 0x2180 NCOW NNC default
:>> 0x2805d000 0x28065000 6 0 0xcb5c6a20 rwx 2 0 0x2180 NCOW NNC default
:>>
:>> 0x28065000 0x280e2000 44 0 0xc0355a00 r-x 46 23 0x4 COW NC vnode
:>> 0x280e2000 0x280e7000 5 0 0xcb34f120 rwx 1 0 0x2180 COW NNC vnode
:>> 0x280e7000 0x280fb000 2 0 0xcb3c2240 rwx 1 0 0x2180 NCOW NNC default
:>>
:>> 0xbfbe0000 0xbfc00000 4 0 0xcb45b600 rwx 1 0 0x2180 NCOW NNC default
:>
:>The map here looks slightly different:
:>
:>0x8048000 0x804d000 5 0 0xd6927ea0 r-x 1 0 0x0 COW NC vnode
:>0x804d000 0x804f000 2 0 0xd6894d20 rw- 2 0 0x2180 NCOW NNC default
:>0x804f000 0x8066000 16 0 0xd6894d20 rwx 2 0 0x2180 NCOW NNC default  <--- additional
:>
:>0x1804d000 0x1805e000 17 0 0xd73f4d80 r-x 10 5 0x0 COW NC vnode
:>0x1805e000 0x1805f000 1 0 0xd7246ba0 rw- 1 0 0x2180 COW NNC vnode
:>0x1805f000 0x18061000 2 0 0xd72ec540 rw- 2 0 0x2180 NCOW NNC default
:>0x18061000 0x18069000 5 0 0xd72ec540 rwx 2 0 0x2180 NCOW NNC default
:>
:>0x18069000 0x180e6000 103 0 0xc0280300 r-x 104 45 0x0 COW NC vnode
:>0x180e6000 0x180eb000 5 0 0xd71fba20 rwx 1 0 0x2180 COW NNC vnode
:>0x180eb000 0x180ff000 7 0 0xd7b96c60 rwx 1 0 0x2180 NCOW NNC default
:>
:>0xbfbe0000 0xbfc00000 4 0 0xd79b18a0 rwx 1 0 0x2180 NCOW NNC default
:>
:>
:>> except that I have added some blank lines to make it easier to
:>> explain.  The first 3 groups of lines above correspond to (1) the
:>> program itself, (2) the dynamic linker, and (3) the shared library
:>> libc.so.4.  The final line is the runtime stack.  Except for the
:>> stack, each group begins with one or two "vnode" lines.  That's how
:>> you can recognize where each group starts.  The first two numbers in
:>> each line are the start and end+1 addresses of a region of memory.
:>>
:>> The first group is the executable, and the second group is the dynamic
:>> linker.  As you can see, in this example the dynamic linker occupies
:>> the region starting at 0x28049000 and ending just below 0x28065000.
:>> The numbers you want to look at in the second group are these:
:>>
:>> ||||||||||
:>> VVVVVVVVVV
:>> 0x28049000 0x2805a000 17 0 0xcb55a120 r-x 38 19 0x4 COW NC vnode
:>> 0x2805a000 0x2805b000 1 0 0xcb39b120 rw- 1 0 0x2180 COW NNC vnode
:>> 0x2805b000 0x2805d000 2 0 0xcb5c6a20 rw- 2 0 0x2180 NCOW NNC default
:>> 0x2805d000 0x28065000 6 0 0xcb5c6a20 rwx 2 0 0x2180 NCOW NNC default
:>>            ^^^^^^^^^^
:>>            ||||||||||
:>
:>So in my case it is:
:>
:>||||||||||
:>VVVVVVVVVV
:>0x1804d000 0x1805e000 17 0 0xd73f4d80 r-x 10 5 0x0 COW NC vnode
:>0x1805e000 0x1805f000 1 0 0xd7246ba0 rw- 1 0 0x2180 COW NNC vnode
:>0x1805f000 0x18061000 2 0 0xd72ec540 rw- 2 0 0x2180 NCOW NNC default
:>0x18061000 0x18069000 5 0 0xd72ec540 rwx 2 0 0x2180 NCOW NNC default
:>           ^^^^^^^^^^
:>           ||||||||||
:>
:>> Now take the first number and replace the 0 with it in the "Patch me"
:>> line.  And take the second number and replace the 0 with it in the
:>> "Patch me too" line, like this:
:>>
:>> 	static char *rtld_base = (char *)0x28049000;     /* Patch me */
:>> 	static char *rtld_limit = (char *)0x28065000;    /* Patch me too */
:>>
:>> (The numbers will no doubt be different on your system.)
:>
:>Done, I have now:
:>
:>       static char *rtld_base = (char *)0x1804d000;     /* Patch me */
:>       static char *rtld_limit = (char *)0x18069000;    /* Patch me too */
:>
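
Picking the two addresses by eye is error-prone, so here is a rough
sketch of how they might be pulled out of a saved map dump with awk. It
relies on the rule described above (a group starts at a "vnode" line
that follows a non-vnode line) and assumes the second group is the
dynamic linker, which should hold for a dynamically linked yppush.
rtld_range is a hypothetical name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print "base limit" for the second group of a
# procfs map dump, read from the named files or from stdin.
rtld_range() {
    awk '
        NF == 0 { next }                  # ignore blank separator lines
        {
            is_vnode = ($NF == "vnode")
            # A new group begins at the first vnode line after a
            # non-vnode line (or at the start of the input).
            if (is_vnode && !prev_vnode) group++
            prev_vnode = is_vnode
            if (group == 2) {
                if (base == "") base = $1  # start of the first region
                limit = $2                 # end+1 of the latest region
            }
        }
        END {
            if (base == "") exit 1
            print base, limit
        }
    ' "$@"
}
```

Fed the example map quoted above, this should print the same pair that
goes into the "Patch me" and "Patch me too" lines. A real map file has
no blank lines between groups, which is why the grouping is derived
from the vnode lines rather than from the spacing.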
:>
:>> Rebuild yppush again and install it the same way as you did before
:>> (with DEBUG_FLAGS=-g and STRIP= ).
:>>
:>> With the proper addresses patched in, yppush will unmap the dynamic
:>> linker from memory as soon as it starts up.  So if anything in yppush
:>> tries to read from that region of memory, a segmentation violation
:>> will occur and you should get a core dump.  With gdb, get a stack
:>> trace and send it to me in that case.
:>
:>I have a corefile but can't debug it:
:>
:>Core was generated by `yppush'.
:>Program terminated with signal 11, Segmentation fault.
:>Cannot access memory at address 0x180600a8.
:>#0  0x1804f358 in ?? ()Cannot access memory at address 0x180600a8.
:>
:>Anything I did wrong?
:>
:>Thanks a lot for helping,
:>
:>	-Andre
:>

As for this: the problem seems to have reappeared with FreeBSD 4.3-STABLE;
it had disappeared while I was using FreeBSD 4.2-STABLE and 4.3-BETA
through 4.3-RC.

On the other hand, I cannot investigate this problem much, because all the
systems I mentioned are 'in production'. At the moment we have two slave
NIS servers configured and a third machine is on its way. I will work
around the problem by manipulating the /var/yp/ypservers file, exchanging
the servers in a 'round robin' manner with a shell script.
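
A first, untested sketch of what such a script could look like follows.
It takes one reading of 'round robin': each server from a complete list
is written to the target file on its own, a caller-supplied command is
run, and the complete list is restored at the end. The function name
push_one_at_a_time and its arguments are hypothetical, and whatever has
to be done to rebuild and push the maps after the file changes is left
to the command passed in.

```shell
#!/bin/sh
# Hypothetical sketch: for each server named in FULL_LIST, install a
# TARGET file naming only that server, then run the given command.
# usage: push_one_at_a_time FULL_LIST TARGET COMMAND [ARGS...]
push_one_at_a_time() {
    full_list=$1
    target=$2
    shift 2
    status=0
    while IFS= read -r server; do
        [ -n "$server" ] || continue            # skip empty lines
        printf '%s\n' "$server" > "$target"
        # Keep the command from swallowing the rest of the server list.
        "$@" < /dev/null || { status=1; break; }
    done < "$full_list"
    # Always restore the complete list, even after a failure.
    cp "$full_list" "$target"
    return "$status"
}
```

The complete list is kept in a separate file so that an interrupted run
cannot leave /var/yp/ypservers naming only one slave for good.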

--
Best regards,
O. Hartmann

ohartman@klima.physik.uni-mainz.de
----------------------------------------------------------------
IT-Administration des Institut fuer Physik der Atmosphaere (IPA)
----------------------------------------------------------------
Johannes Gutenberg Universitaet Mainz
Becherweg 21
55099 Mainz

Tel: +496131/3924662 (Maschinensaal)
Tel: +496131/3924144
FAX: +496131/3923532

