Date: Sat, 2 Jun 2001 15:22:37 -0700 (PDT)
From: John Polstra <jdp@polstra.com>
To: stable@freebsd.org
Cc: ohartman@klima.physik.uni-mainz.de
Subject: Re: NIS/YP still broken!
Message-ID: <200106022222.f52MMbR35496@vashon.polstra.com>
In-Reply-To: <Pine.BSF.4.33.0106021209140.10271-100000@klima.physik.uni-mainz.de>
References: <Pine.BSF.4.33.0106021209140.10271-100000@klima.physik.uni-mainz.de>

In article <Pine.BSF.4.33.0106021209140.10271-100000@klima.physik.uni-mainz.de>,
Hartmann, O. <ohartman@klima.physik.uni-mainz.de> wrote:
>
> FreeBSD 4.3-STABLE has still a broken NIS/YP! If there are more than
> one slave servers ypxfrd should spread its tables, push seems to
> lock up and get a timeout.
>
> This was reported earlier here and I got a 'fix' for this but this fix
> hasn't been merged in due it targets a sypmtome, not the cause itself.
We would love to fix this, but unfortunately the people who can debug
it have not been able to reproduce the problem. If you are willing to
help, maybe you can debug it by remote control. :-)
Currently, my best hypothesis about the cause of this problem is that
yppush is reading from an invalid memory address which happens to fall
into the region occupied by the dynamic linker. Thus, making small
changes to the dynamic linker causes the behavior of yppush to change.
To test this hypothesis, let's try an experiment. Please apply the
patch below to "/usr/src/usr.sbin/yppush/yppush_main.c":
Index: yppush_main.c
===================================================================
RCS file: /home/ncvs/src/usr.sbin/yppush/yppush_main.c,v
retrieving revision 1.11
diff -u -r1.11 yppush_main.c
--- yppush_main.c 1999/08/28 01:21:09 1.11
+++ yppush_main.c 2001/06/02 21:35:11
@@ -545,6 +545,11 @@
 	struct hostlist *tmp;
 	struct sigaction sa;
 
+	static char *rtld_base = (char *)0;	/* Patch me */
+	static char *rtld_limit = (char *)0;	/* Patch me too */
+	if (rtld_base != NULL && rtld_limit > rtld_base)
+		munmap(rtld_base, rtld_limit - rtld_base);
+
 	while ((ch = getopt(argc, argv, "d:j:p:h:t:v")) != -1) {
 		switch(ch) {
 		case 'd':
Then rebuild and reinstall yppush like this:

    make clean
    make obj
    make depend
    DEBUG_FLAGS=-g make
    STRIP= make install

and verify that the program is still failing. I hope it will still
fail, or we are out of luck.
As shown here, the patch does nothing, because both pointers are still
null. Next you must
determine where the dynamic linker is loaded, and patch the low and
high limits into the two lines labeled "Patch me" and "Patch me too".
You can do this as follows. Run yppush manually and see what its
process ID is. While the program is still running, display its map
file "/proc/PID/map". For example, if the process ID is 12345 you
would want to see "/proc/12345/map". I recommend that you look at the
file like this:

    dd bs=64k < /proc/12345/map

since "cat" often doesn't work on these kinds of files. I hope that
yppush will run long enough for you to snare this information. If
it finishes too quickly, try adding a call to ``sleep(30)'' just after
the added lines in yppush_main.c.
The map file will resemble this:

0x8048000 0x8049000 1 0 0xcb8a78a0 r-x 1 0 0x0 COW NC vnode
0x8049000 0x804a000 1 0 0xcb79d1e0 rw- 1 0 0x2180 NCOW NNC default

0x28049000 0x2805a000 17 0 0xcb55a120 r-x 38 19 0x4 COW NC vnode
0x2805a000 0x2805b000 1 0 0xcb39b120 rw- 1 0 0x2180 COW NNC vnode
0x2805b000 0x2805d000 2 0 0xcb5c6a20 rw- 2 0 0x2180 NCOW NNC default
0x2805d000 0x28065000 6 0 0xcb5c6a20 rwx 2 0 0x2180 NCOW NNC default

0x28065000 0x280e2000 44 0 0xc0355a00 r-x 46 23 0x4 COW NC vnode
0x280e2000 0x280e7000 5 0 0xcb34f120 rwx 1 0 0x2180 COW NNC vnode
0x280e7000 0x280fb000 2 0 0xcb3c2240 rwx 1 0 0x2180 NCOW NNC default

0xbfbe0000 0xbfc00000 4 0 0xcb45b600 rwx 1 0 0x2180 NCOW NNC default

except that I have added some blank lines to make it easier to
explain. The first 3 groups of lines above correspond to (1) the
program itself, (2) the dynamic linker, and (3) the shared library
libc.so.4. The final line is the runtime stack. Except for the
stack, each group begins with one or two "vnode" lines. That's how
you can recognize where each group starts. The first two numbers in
each line are the start and end+1 addresses of a region of memory.
The first group is the executable, and the second group is the dynamic
linker. As you can see, in this example the dynamic linker occupies
the region starting at 0x28049000 and ending just below 0x28065000.
The numbers you want to look at in the second group are these:

||||||||||
VVVVVVVVVV
0x28049000 0x2805a000 17 0 0xcb55a120 r-x 38 19 0x4 COW NC vnode
0x2805a000 0x2805b000 1 0 0xcb39b120 rw- 1 0 0x2180 COW NNC vnode
0x2805b000 0x2805d000 2 0 0xcb5c6a20 rw- 2 0 0x2180 NCOW NNC default
0x2805d000 0x28065000 6 0 0xcb5c6a20 rwx 2 0 0x2180 NCOW NNC default
           ^^^^^^^^^^
           ||||||||||

Now take the first number and replace the 0 with it in the "Patch me"
line. And take the second number and replace the 0 with it in the
"Patch me too" line, like this:

    static char *rtld_base = (char *)0x28049000;    /* Patch me */
    static char *rtld_limit = (char *)0x28065000;   /* Patch me too */

(The numbers will no doubt be different on your system.)
Rebuild yppush again and install it the same way as you did before
(with DEBUG_FLAGS=-g and STRIP= ).
With the proper addresses patched in, yppush will unmap the dynamic
linker from memory as soon as it starts up. So if anything in yppush
tries to read from that region of memory, a segmentation violation
will occur and you should get a core dump. In that case, use gdb to
get a stack trace from the core file and send it to me.
There are a dozen things that could go wrong with this procedure, but
I don't have any better ideas at the moment.
John
--
John Polstra jdp@polstra.com
John D. Polstra & Co., Inc. Seattle, Washington USA
"Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa
