Date: Mon, 30 Apr 2012 12:34:32 -0700
From: Jason Evans <jasone@freebsd.org>
To: Gustau Pérez i Querol <gperez@entel.upc.edu>
Cc: avilla@freebsd.org, FreeBSD current <freebsd-current@freebsd.org>
Subject: Re: RFC: jemalloc: qdbus sigsegv in malloc_init
Message-ID: <2D080258-652B-4EFA-8F6F-6ECA3CA4404B@freebsd.org>
In-Reply-To: <4F9E9E06.4070004@entel.upc.edu>
References: <4F9E9E06.4070004@entel.upc.edu>

On Apr 30, 2012, at 7:13 AM, Gustau Pérez i Querol wrote:
> The KDE team is seeing some strange problems with the new version (4.8.1) of devel/dbus-qt4 on current. It works on stable. I also suspect that the problem described below is affecting the experimental cinnamon port (an alternative to gnome3, possible replacement of gnome2).
>
> The problem happens on both i386 and amd64 with an empty /etc/malloc.conf and a simple /etc/make.conf. Everything was compiled with base gcc (no clang). The kernel was compiled with no debug support, but I can enable it if needed. There are reports from avilla@freebsd.org of the same behavior with a clang-compiled world and kernel and with MALLOC_PRODUCTION=yes.
>
> When qdbus starts, it segfaults. The backtrace of the problem with r234769 can be found here: http://pastebin.com/ryBXtqGF. When starting the qdbus daemon by hand in an X+twm session, we see it call calloc many times and, after a fixed number of calls, segfault. The segfault happens in rb_gen (a quite large macro defined in $SRC_BASE/contrib/jemalloc/include/jemalloc/internal/rb.h).
>
> If the daemon is started by hand, I'm able to skip all the calls qdbus makes to calloc up to the one causing the segfault. At that point, in rb_gen, we don't know exactly what is going on or how to debug the macro. ktrace dumps are available, but we were unable to find anything new in them.
>
> With old versions of current from before the jemalloc imports (as of March 30th), the daemon segfaulted at malloc.c:2426. With revisions from April 20th to 24th (I can be more precise; it was during the jemalloc imports), the daemon segfaulted in malloc_init. Backtraces are available if needed, and if necessary I can go back to those revisions and recompile world+kernel to see how it behaves.
>
> Any help from freebsd-current@ (perhaps Jason Evans can help us) will be appreciated. Any additional info, like source revisions, can be provided.
> I would like to stress that the experimental devel/dbus-qt4 works fine with recent stable.

The crash is happening in page run management, so there is some pretty bad memory corruption going on by the time of the crash. If I understand you correctly, you have reproduced the crash on a system that does *not* have MALLOC_PRODUCTION defined, which means that none of the assertions in jemalloc caught the problem.

Adrian Chadd made the excellent suggestion of trying valgrind; it's likely to point out the problem almost immediately. If that doesn't work, the utrace functionality in malloc may help you figure out what activity has occurred by the time of the crash, and give you a better understanding of what happened to memory around the address that is involved in the crash.

Jason
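
As an illustration of why the crash site (rb_gen, malloc_init) says little about where the corruption was introduced, here is a minimal sketch in C. It is not how jemalloc or valgrind work internally; the names guarded_alloc, guard_intact, GUARD_SIZE and GUARD_BYTE are hypothetical and exist only for this example. It shows that a write past the end of a buffer succeeds silently and is only noticed when something later inspects the neighbouring bytes, which is the same reason an allocator tends to crash long after the offending write.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical guard parameters, chosen only for this illustration. */
#define GUARD_SIZE 8
#define GUARD_BYTE 0xA5

/*
 * Allocate `size` usable bytes followed by GUARD_SIZE guard bytes.
 * The guard bytes stand in for whatever lives next to the buffer
 * (for example, allocator bookkeeping).
 */
void *guarded_alloc(size_t size)
{
    unsigned char *p = malloc(size + GUARD_SIZE);
    if (p == NULL)
        return NULL;
    memset(p + size, GUARD_BYTE, GUARD_SIZE);
    return p;
}

/*
 * Return 1 if the guard region after the `size` usable bytes is
 * untouched, 0 if something has overwritten it.
 */
int guard_intact(const void *ptr, size_t size)
{
    const unsigned char *g = (const unsigned char *)ptr + size;
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (g[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

A caller would allocate with guarded_alloc(n), write to the buffer, and call guard_intact(buf, n) at some later point; a single write to buf[n] goes unnoticed until that check. A tool such as valgrind is useful precisely because it reports the invalid write when it happens rather than when its effects surface.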