Date: Sat, 02 Nov 2013 07:33:40 -0600
From: Ian Lepore <ian@FreeBSD.org>
To: Tim Kientzle <tim@kientzle.com>
Cc: freebsd-arm@FreeBSD.org, Howard Su <howard0su@gmail.com>
Subject: Re: sshd crash
Message-ID: <1383399220.31172.116.camel@revolution.hippie.lan>
In-Reply-To: <EB18203F-C516-4917-9AA4-DBA6E66DAAB6@kientzle.com>
References: <CAAvnz_rj43Ww6=mMfnp2u5TA2pWb20vWOqyAtuK08wgzy0dH6A@mail.gmail.com>
 <1383313834.31172.65.camel@revolution.hippie.lan>
 <CAHNYxxMMF_GJv10drYuQFO%2Bav%2BTdp8OBvJfFZObEZ=tgaBovSA@mail.gmail.com>
 <1383328423.31172.92.camel@revolution.hippie.lan>
 <CAHNYxxNiuKP8wfTaZuL%2BBXiLcYA9eU3LBb-659ZBYr-WBSmZeQ@mail.gmail.com>
 <1383343354.31172.102.camel@revolution.hippie.lan>
 <EB18203F-C516-4917-9AA4-DBA6E66DAAB6@kientzle.com>

On Fri, 2013-11-01 at 22:35 -0700, Tim Kientzle wrote:
> On Nov 1, 2013, at 3:02 PM, Ian Lepore <ian@freebsd.org> wrote:
>
> > On Sat, 2013-11-02 at 02:40 +0800, Jia-Shiun Li wrote:
> >> On Sat, Nov 2, 2013 at 1:53 AM, Ian Lepore <ian@freebsd.org> wrote:
> >>> On Sat, 2013-11-02 at 01:44 +0800, Jia-Shiun Li wrote:
> >>>> may I add: putty causes this to happen. mine 0.62. But ssh from another
> >>>> FreeBSD host has no problem.
> >>>>
> >>>> I suspect it to be some issues related to memory or malloc issues
> >>>> specific to bbb. 'tmux a -d' without existing detached sessions
> >>>> causes tmux client to core dump. But sshd and it are both fine on rpi.
> >>>>
> >>>> -Jia-Shiun.
> >>>
> >>> This is the first I've heard of being able to ssh to an arm platform
> >>> that doesn't have PrivSep disabled, since about July or so. I've never
> >>> heard a report yet that anything on the client side could make a
> >>> difference.
> >>>
> >>> It's definitely not a beaglebone thing, it happens on every arm board
> >>> I've got... dreamplug, rpi, bbw, imx53, wandboard.
> >>
> >>
> >> Ok let me make sure I did not mix things up. ;)
> >>
> >> IIRC I once saw similar issue on rpi shortly. But after another
> >> weekly update it was gone. I did not pay too much attention on rpi,
> >> and thought it was bbb specific.
> >>
> >> I did not change sshd_config, UsePrivilegeSeparation supposed
> >> remaining on as default is.
>
> I started looking into it a couple of months ago but didn't get
> very far; Diane Bruce got a lot further than I did.
>
> If I recall correctly, it started up when the malloc libc symbols
> were changed. That may have altered what malloc implementation
> sshd used.
>
> So it could be a long-standing stray write that jemalloc just
> happens to detect.
>
> It could also be related to locking (there's some multi-threaded
> crypto code in sshd that may be involved).

There's lots of stuff with lock in the name, but I don't think there are
actually any threads involved in sshd, just forking. ldd says sshd
doesn't link to libthr.

I'm not sure it's a mundane stray-write either. The routine that's
asserting is checking to see if the contents of a page are all-zero
because a jemalloc internal flag is set that says it should be. I had
the routine print the non-zero data it found, and it looks like this:

  not-zero at 0 0x20c99000 = 0x20800a00
  not-zero at 1 0x20c99004 = 0x00000001
  not-zero at 2 0x20c99008 = 0x0000002f
  not-zero at 3 0x20c9900c = 0xffffffff
  not-zero at 4 0x20c99010 = 0x00007fff
  not-zero at 5 0x20c99014 = 0x00000003
  not-zero at 96 0x20c99180 = 0x5a5a5a5a
  not-zero at 97 0x20c99184 = 0x5a5a5a5a
  not-zero at 98 0x20c99188 = 0x5a5a5a5a

The 0x5a continues to the end of the page.

So jemalloc has metadata that says it thinks the page is all-zeroes, and
the page is a mix of data and some zeroes and the 5a junk-fill byte. It
seems more like the metadata is in error somehow. (Maybe a stray write
hit the metadata.)

-- Ian
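
For anyone who wants to produce the same kind of dump, the check described above (walk a page that is supposed to be all-zero and print every 32-bit word that isn't) can be sketched roughly as below. This is not jemalloc's code and not the exact debugging patch used here: the function name report_nonzero_words, its parameters, and the choice of 32-bit words are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of jemalloc): scan a region that is
 * expected to be all-zero, one 32-bit word at a time.  Each non-zero
 * word is printed to 'out' (if 'out' is not NULL) as
 *   not-zero at <word index> <address> = 0x<value>
 * Returns the number of non-zero words found, so 0 means "really zeroed".
 */
static size_t
report_nonzero_words(const void *page, size_t page_size, FILE *out)
{
	const unsigned char *bytes = page;
	size_t nwords = page_size / sizeof(uint32_t);
	size_t count = 0;

	/* Assumption: the region is a whole number of 32-bit words. */
	assert(page_size % sizeof(uint32_t) == 0);

	for (size_t i = 0; i < nwords; i++) {
		uint32_t word;

		/* memcpy avoids alignment and aliasing problems. */
		memcpy(&word, bytes + i * sizeof(word), sizeof(word));
		if (word == 0)
			continue;
		if (out != NULL)
			fprintf(out, "not-zero at %zu %p = 0x%08x\n", i,
			    (const void *)(bytes + i * sizeof(word)),
			    (unsigned)word);
		count++;
	}
	return (count);
}
```

Calling something like report_nonzero_words(addr, 4096, stderr) just before the assertion fires would give output in the shape shown above; the 4096 page size is also an assumption and should be whatever the platform actually uses.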