Date: Thu, 19 Jun 2014 03:22:21 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc: freebsd-arm@freebsd.org
Subject: Re: AVILA getting close!
Message-ID: <20140619102221.GL31367@funkthat.com>
In-Reply-To: <53A21C30.7060601@freebsd.org>
References: <20140618225808.GG31367@funkthat.com> <53A21C30.7060601@freebsd.org>

Nathan Whitehorn wrote this message on Wed, Jun 18, 2014 at 16:09 -0700:
>
> On 06/18/14 15:58, John-Mark Gurney wrote:
> >So, w/ the recent couple of patches that alc has provided, I no longer
> >receive kernel panics on my AVILA board!
> >
> >$ uname -a
> >FreeBSD avila.funkthat.com 11.0-CURRENT FreeBSD 11.0-CURRENT #27
> >r267333:267349M: Wed Jun 11 09:57:58 PDT 2014
> >jmg@carbon.funkthat.com:/usr/obj/arm.armeb/usr/src.avila/sys/AVILA arm
> >$ uptime
> >12:15AM up 1 day, 15 mins, 2 users, load averages: 0.13, 0.11, 0.08
> >
> >This survived a portsnap extract... This is all over NFS...
> >
> >Though the issue that I'm now having is that some binaries (newsyslog)
> >and sometimes other binaries (awk, grep) core dump...
> >
> >I believe this is an issue w/ rtld, or related... If I compile newsyslog
> >-static, it works fine... Otherwise I get a SIGILL, and that is
> >because it jumps off into the weeds.. Though gdb on arm isn't very
> >useful..
> >
> >The trouble appears to be when resolving a symbol that hasn't been
> >called yet... The trouble starts when newsyslog starts parsing a
> >line that isn't a comment line and tries to strdup it... stepi'ing
> >has me go into
> >_rtld_bind_start ->
> > _rtld_bind ->
> > rlock_acquire ->
> > thread_mask_set ->
> > def_thread_set_flag (via function pointer)
> > def_rlock_acquire ->
> > atomic_add_acq_int
> >
> >Turning on rtld's debug doesn't tell me anything I didn't know already:
> >"memchr" in "libc.so.7" ==> 0x2017af30 in "libc.so.7"
> >"strdup" in "newsyslog" ==> 0x200cc8b0 in "libc.so.7"
> >Bus error (core dumped)
> >
> >I've posted both a gdb log showing the stepi, and my copy of
> >ld-elf.so.1 to:
> >https://www.funkthat.com/~jmg/20140619/
> >
> >Let me know if there is any additional information...
> >
> What happens if you set LD_BIND_NOW=1 in the environment first?

I identified where the SIGILL is coming from, it's coming from
arm/undefined.c:272:
http://fxr.watson.org/fxr/source/arm/arm/undefined.c#L272

Also, I forgot that I get a different fault when not running under gdb
than I do under gdb... under gdb, I get the SIGILL... when not running
under gdb, as far as I can tell, I jump through a bogus function pointer,
and end up executing random code that gives me a SIGBUS instead...

So, this whole SIGILL may be a problem w/ gdb, and not the final
problem...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
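
For readers following the LD_BIND_NOW question in the quoted text: the sketch below is a toy model, in Python, of the difference between lazy binding (a symbol is resolved on its first call, which is the path through _rtld_bind in the trace) and eager binding (everything is resolved at load time). It is not rtld's code. The class ToyLinker, its methods, and the stand-in "library" dictionary are all hypothetical names made up for illustration.

```python
# Toy model of lazy vs. eager symbol binding. This is NOT how rtld is
# implemented; every name here is hypothetical and only illustrates the idea.

class ToyLinker:
    def __init__(self, library, bind_now=False):
        self.library = library    # symbol name -> callable (stands in for libc.so.7)
        self.slots = {}           # resolved call targets (stands in for the GOT)
        self.bind_calls = 0       # how many times the lazy resolver ran
        self.bind_now = bind_now  # True models running with LD_BIND_NOW=1

    def load(self, needed):
        # Eager mode resolves every referenced symbol before any call is made,
        # so the lazy resolver below is never entered.
        if self.bind_now:
            for name in needed:
                self.slots[name] = self.library[name]

    def _bind(self, name):
        # Lazy resolver: reached only on the first call through an empty slot.
        self.bind_calls += 1
        self.slots[name] = self.library[name]
        return self.slots[name]

    def call(self, name, *args):
        target = self.slots.get(name)
        if target is None:
            target = self._bind(name)
        return target(*args)


# Stand-ins for two libc functions mentioned in the debug output.
library = {
    "strdup": lambda s: str(s),
    "memchr": lambda s, c: s.find(c),
}

lazy = ToyLinker(library)
lazy.load(["strdup", "memchr"])
lazy.call("strdup", "abc")   # first call goes through the resolver
lazy.call("strdup", "abc")   # second call uses the already-filled slot

eager = ToyLinker(library, bind_now=True)
eager.load(["strdup", "memchr"])
eager.call("strdup", "abc")  # resolver is never entered in eager mode
```

In this model, a crash that only occurs inside the lazy resolver would not be reached in eager mode, which is the kind of distinction the LD_BIND_NOW=1 experiment is presumably meant to expose.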