Date: Tue, 22 Aug 2017 20:08:57 +0200
From: "Hartmann, O." <ohartmann@walstatt.org>
To: David Wolfskill <david@catwhisker.org>
Cc: Konstantin Belousov <kostikbel@gmail.com>, current@freebsd.org, Dimitry Andric <dim@FreeBSD.org>
Subject: Re: SIGSEGV in /bin/sh after r322740 -> r322776 update
Message-ID: <20170822200841.180f633b@hermann>
In-Reply-To: <20170822133836.GQ1130@albert.catwhisker.org>
References: <20170822114627.GC1130@albert.catwhisker.org>
 <20170822115923.GC1700@kib.kiev.ua>
 <20170822122836.GH1130@albert.catwhisker.org>
 <20170822123449.GD1700@kib.kiev.ua>
 <20170822124617.GN1130@albert.catwhisker.org>
 <20170822131958.GE1700@kib.kiev.ua>
 <20170822133836.GQ1130@albert.catwhisker.org>
On Tue, 22 Aug 2017 06:38:36 -0700
David Wolfskill <david@catwhisker.org> wrote:

I also ran into this problem after "upgrading" to r322769, and now, on
ALL systems where I did this "upgrade", I have a wrecked system that
isn't even capable of compiling a new kernel or world.

I can understand that weird things and havoc can happen on systems
running CURRENT with customised kernels, along with some hidden
problems, but this serious problem occurs even on vanilla GENERIC
systems up to r322798!

I just tried to cleandir everything and rebuild world and kernel, which
on some slow boxes is a pain in the arse (and I always thought
LLVM/CLANG's goal was to shorten compile cycles ... the opposite seems
to be the case, by the way).

The question arising with a view to GENERIC: do those changes even get
tested on real hardware, or is it all theory/virtual when committed?
Just a question.

I'm awaiting this patch in the hope that I can rebuild everything back
to normal.

Thanks,

oh

> On Tue, Aug 22, 2017 at 04:19:58PM +0300, Konstantin Belousov wrote:
> > ...
> > > > Ok, can you rebuild kernel and libc from scratch ?  I.e. remove
> > > > your object directories.
> > >
> > > I think I'll need a working /bin/sh to do that.  As noted, I could
> > > try the stable/11 /bin/sh; on the other hand, if it's dying in a
> > > library, that's not likely to help a whole lot. :-}
> > I highly suspect that this is not /bin/sh at all.  Backtrace
> > strongly suggests that the malloc() has issues, but again I suspect
> > that the reason is not an issue in malloc, but its use of TLS.
>
> I think I hope that this use of "TLS" is not the one associated with
> (say) SSL.... :-}
>
> > The amd64 changes were to the TLS base register handling. So you
> > might try to boot previous kernel.  If this works out without
> > replacing libc then it is definitely TLS, but I still do not know
> > what is wrong. ....
>
> OK; we have a bit of progress, then:
> * When I tried to rename the kernel directories in /boot, I got more
>   segfaults.  So I figured I'd use the boot menu to select
>   kernel.old, and just tried "sudo shutdown -r now" -- and got a
>   segfault.  "sudo reboot" did, as well.  So did "sudo kill 1".  On a
>   whim, I tried "sudo halt"; that actually worked.
>
> * After the (successful) reboot from kernel.old, I was able to rename
>   kernel directories without issue.  This may be useful evidence.
>
> * Flushed with that success, I have started a fresh clean build of
>   r322776.  (I had managed to clear /usr/obj prior to the reboot.)
>
> * I should be able to provide updated status within about 30 minutes.
>
> Thanks again for all your help!
>
> Peace,
> david