Date:      Mon, 7 May 2012 21:46:16 -0700
From:      Jason Evans <jasone@FreeBSD.org>
To:        Steve Wills <swills@FreeBSD.org>
Cc:        current@FreeBSD.org
Subject:   Re: <jemalloc>: jemalloc_arena.c:182: Failed assertion: "p[i] == 0"
Message-ID:  <A67012C5-E54A-4F60-A1DD-AAFB3867793B@FreeBSD.org>
In-Reply-To: <a457b78de070b45bbffdd06271c6a7ef.squirrel@mouf.net>
References:  <20120421185402.GH1743@albert.catwhisker.org> <7AD8956D-AD18-4CAB-9953-06E00185A7DA@freebsd.org> <a457b78de070b45bbffdd06271c6a7ef.squirrel@mouf.net>

On May 7, 2012, at 12:19 PM, Steve Wills wrote:
>> On Apr 21, 2012, at 11:54 AM, David Wolfskill wrote:
>>> After applying Dimitry Andric's patches to contrib/jemalloc and
>>> replacing /usr/bin/as with one built last Sunday, I was finally(!)
>>> able to rebuild head as of 234536:
>>>
>>> FreeBSD freebeast.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #797
>>> 234536M: Sat Apr 21 10:23:33 PDT 2012
>>> root@freebeast.catwhisker.org:/usr/obj/usr/src/sys/GENERIC  i386
>>>
>>> However, as I was copying a /usr/obj hierarchy via tar -- e.g.:
>>>
>>> root@freebeast:/common/home/david # (cd /var/tmp && rm -fr obj && mkdir obj) && (cd /usr && tar cpf - obj) | (cd /var/tmp && tar xpf -)
>>>
>>> it ran for a while, then:
>>>
>>> <jemalloc>: jemalloc_arena.c:182: Failed assertion: "p[i] == 0"
>>> Abort (core dumped)
>>> root@freebeast:/common/home/david # echo $?
>>> 134
>>> root@freebeast:/common/home/david # ls -lTio *.core
>>> ls: No match.
>>> root@freebeast:/common/home/david #
>>>
>>> So ... no core file, apparently.
>>>
>>> freebeast(10.0-C)[2] find /usr/src/contrib/jemalloc -type f -name jemalloc_arena.c
>>> freebeast(10.0-C)[3]
>>>
>>> No file named "jemalloc_arena.c", either.
>>>
>>> But contrib/jemalloc/src/arena.c contains a function,
>>> arena_chunk_validate_zeroed():
>>>
>>>   175 static inline void
>>>   176 arena_chunk_validate_zeroed(arena_chunk_t *chunk, size_t run_ind)
>>>   177 {
>>>   178         size_t i;
>>>   179         UNUSED size_t *p = (size_t *)((uintptr_t)chunk + (run_ind << LG_PAGE));
>>>   180
>>>   181         for (i = 0; i < PAGE / sizeof(size_t); i++)
>>>   182                 assert(p[i] == 0);
>>>   183 }
>>>
>>> Thoughts?
>>
>> I received a similar report yesterday in the context of filezilla, but
>> didn't get as far as reproducing it.  I think the problem is in
>> chunk_alloc_dss(), which dangerously claims that newly allocated memory is
>> zeroed.  It looks like I formalized this bad assumption in early 2010,
>> though the bug existed before that.  It's a bigger deal now because sbrk()
>> is preferred over mmap(), so the bug has languished for a couple of years.
>> I'll get a fix committed today (and revert the order of preference
>> between sbrk() and mmap()).
>>
>> By the way, I wonder why not everyone hits this (I don't).
>
> I just now hit the same issue while using ports tinderbox. It was calling
> tar during the "makeJail" tinderbox subcommand and gave the same error as
> in the subject. Funny thing is I had run the same command (on a different
> "jail") right before this and didn't get the error. What's the status of
> this? Should I set MALLOC_PRODUCTION=yes in /etc/make.conf, rebuild world
> and forget about it?

How recent is your system?  This problem should have been fixed by
r234569, so if you're still seeing problems after that revision, there's
another problem we need to figure out.  (By the way, it's possible for
an application to trigger this assertion, but unlikely.)
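
For readers following the thread, here is a minimal, self-contained sketch
of the failure mode discussed above.  It is not jemalloc's code and not the
change committed in r234569.  The names sketch_page_is_zeroed(),
sketch_alloc_recycled(), SKETCH_LG_PAGE and SKETCH_PAGE are hypothetical,
and the 4 KiB page size is an assumption.  The first function mirrors the
quoted arena_chunk_validate_zeroed() loop but returns a result instead of
asserting.  The second stands in for an allocation path that may return
previously used memory, and zeroes it explicitly when the caller asks for
zeroed memory rather than assuming it already is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed page size for this sketch only (4 KiB). */
#define SKETCH_LG_PAGE 12
#define SKETCH_PAGE ((size_t)1 << SKETCH_LG_PAGE)

/*
 * Returns 1 if every word of page number run_ind within chunk is zero,
 * 0 otherwise.  Same walk as the quoted validation loop, minus assert().
 * chunk must be suitably aligned for size_t access.
 */
static int
sketch_page_is_zeroed(const void *chunk, size_t run_ind)
{
	const size_t *p = (const size_t *)((uintptr_t)chunk +
	    (run_ind << SKETCH_LG_PAGE));
	size_t i;

	for (i = 0; i < SKETCH_PAGE / sizeof(size_t); i++) {
		if (p[i] != 0)
			return (0);
	}
	return (1);
}

/*
 * Hypothetical allocation hook.  region stands in for memory that may
 * have been used before.  If the caller wants zeroed memory, zero it
 * here instead of reporting it as zeroed on faith.
 */
static void *
sketch_alloc_recycled(void *region, size_t size, int want_zero)
{
	if (want_zero)
		memset(region, 0, size);
	return (region);
}
```

Filling a buffer with a nonzero byte pattern, passing it through
sketch_alloc_recycled() with want_zero set to 0, and then checking it with
sketch_page_is_zeroed() reproduces the kind of mismatch the assertion
reports; with want_zero set to 1 the check passes.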

Thanks,
Jason


