Date: Thu, 12 Jul 2007 13:09:23 -0700
From: Peter Wemm <peter@wemm.org>
To: freebsd-current@freebsd.org
Cc: Michiel Boland <michiel@boland.org>, peter@freebsd.org
Subject: Re: upgrade 6-STABLE to -CURRENT on sparc64 renders box unusable
Message-ID: <200707121309.23820.peter@wemm.org>
In-Reply-To: <Pine.GSO.4.64.0707102211410.4947@neerbosch.nijmegen.internl.net>
References: <Pine.GSO.4.64.0707101403490.13827@neerbosch.nijmegen.internl.net> <Pine.GSO.4.64.0707101824231.15001@neerbosch.nijmegen.internl.net> <Pine.GSO.4.64.0707102211410.4947@neerbosch.nijmegen.internl.net>

On Tuesday 10 July 2007, Michiel Boland wrote:
> Well, in fact I did manage to debug this further. :)
>
> The problem is that on sparc64 and -CURRENT, every executable
> segfaults in
>
> _rtld
> init_rtld
> relocate_objects
> reloc_non_plt
> mmap
> __getosreldate
>
> It appears that __getosreldate was added five days ago, which may
> explain why the breakage on sparc64 hasn't been reported yet. (I am
> ccing peter@ since he committed this.)
>
> If I apply the following patch, then rebuild libc, things are more or
> less ok again. Of course this patch is very suboptimal, I am just
> trying to point out where the problem is.
>
> --- __getosreldate.c.orig	2007-07-10 22:29:02.000000000 +0200
> +++ __getosreldate.c	2007-07-10 22:28:20.000000000 +0200
> @@ -42,13 +42,10 @@
>  int
>  __getosreldate(void)
>  {
> -	static int osreldate;
> +	int osreldate;
>  	size_t len;
>  	int oid[2];
>  	int error, osrel;
> -
> -	if (osreldate != 0)
> -		return (osreldate);
>
>  	oid[0] = CTL_KERN;
>  	oid[1] = KERN_OSRELDATE;

Your other option would be to add WITHOUT_SYSCALL_COMPAT=yes
to /etc/make.conf. That gets rid of the __getosreldate() calls entirely,
but at the expense of being able to boot an older kernel after userland
has been updated. We could make this option default on sparc64 if it
was acceptable.

Another option might be to hack rtld given the unusual circumstances:

Index: libexec/rtld-elf/sparc64/reloc.c
@@ -247,6 +247,9 @@
 	return (0);
 }

+
+void *__sys_freebsd6_mmap(void *, size_t, int, int, int, int, off_t);
+
 int
 reloc_non_plt(Obj_Entry *obj, Obj_Entry *obj_rtld)
 {
@@ -260,7 +263,8 @@
 	 * The dynamic loader may be called from a thread, we have
 	 * limited amounts of stack available so we cannot use alloca().
 	 */
-	cache = mmap(NULL, bytes, PROT_READ|PROT_WRITE, MAP_ANON, -1, 0);
+	cache = __sys_freebsd6_mmap(NULL, bytes, PROT_READ|PROT_WRITE, MAP_ANON,
+	    -1, 0, 0);
 	if (cache == MAP_FAILED)
 		cache = NULL;

This would avoid the pre-reloc-fixup use of __getosreldate() via mmap.
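
For readers following along, the caching pattern that the quoted libc patch removes can be sketched roughly as below. This is only an illustration, not the libc source: query_osreldate(), query_count, the two getosreldate_* names, and the value 700000 are hypothetical stand-ins for the sysctl lookup. The static local is the likely trouble spot, presumably because position-independent code reaches such an object through addresses that rtld has not yet relocated for itself; that reading is an interpretation of the thread, not something the posts state outright.

```c
/*
 * Hypothetical stand-in for the sysctl(CTL_KERN, KERN_OSRELDATE) lookup.
 * The counter and the returned value are made up for illustration.
 */
int query_count;

static int
query_osreldate(void)
{
	query_count++;
	return (700000);	/* placeholder, not a real release date */
}

/* Cached variant: the result is kept in a static local (data segment). */
int
getosreldate_cached(void)
{
	static int osreldate;

	if (osreldate != 0)
		return (osreldate);
	osreldate = query_osreldate();
	return (osreldate);
}

/* Uncached variant, in the spirit of the quoted patch: stack storage only. */
int
getosreldate_uncached(void)
{
	int osreldate;

	osreldate = query_osreldate();
	return (osreldate);
}
```

The cached variant performs the lookup once and then answers from the static; the uncached variant repeats the lookup on every call, which is why the quoted patch is described as suboptimal.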

In spite of the name, freebsd6_mmap is "standard" in the tree right now
and isn't going to become 'compat6' till comfortably after the release.
The __getosreldate() thing would go away at the same time, so the
problem would be "solved". The catch would be that a slightly
out-of-date userland would depend on COMPAT_FREEBSD6 on sparc64.
sparc64 boxes would be able to boot/run relatively old kernel.old's even
after a fresh build/install world.

PS: I've been told the same problem applies to powerpc.

PPS: I tried for 4 days to get a sun4v box to build world (shared with
sparc64). I ended up giving up and just building/installing a new libc.
I forgot that ld-elf.so.1 was statically linked against libc_pic.a.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5