Date: Mon, 21 May 2012 10:54:54 +0800
From: David Xu <listlog2011@gmail.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: Alberto Villa <avilla@freebsd.org>, Gustau Pérez i Querol <gperez@entel.upc.edu>, davidxu@freebsd.org, FreeBSD current <freebsd-current@freebsd.org>
Subject: Re: RFC: jemalloc: qdbus sigsegv in malloc_init
Message-ID: <4FB9AE7E.6090109@gmail.com>
In-Reply-To: <20120520172419.GQ2358@deviant.kiev.zoral.com.ua>
References: <4F9E9E06.4070004@entel.upc.edu> <4FB88925.4070008@gmail.com> <CAJp7RHaOkEzyfD5e6pLMSBxvCBYCn9BWv=9BWu0CYsQHzGyFdg@mail.gmail.com> <20120520172419.GQ2358@deviant.kiev.zoral.com.ua>

On 2012/5/21 1:24, Konstantin Belousov wrote:
> On Sun, May 20, 2012 at 06:42:35PM +0200, Alberto Villa wrote:
>> On Sun, May 20, 2012 at 8:03 AM, David Xu<listlog2011@gmail.com> wrote:
>>> qdbus segfaults on my machine too, I tracked it down, and found the
>>> problem is in QT, it deleted current_thread_data_key, but it still uses
>>> it in some cxa hooks, I applied the following patch, and it works fine.
>> Thanks for the analysis David!
>>
>>> I think the bug depends on linking order in QT library ? if the
>>> qthread_unix.cpp is linked as lastest module, the key will be deleted
>>> after all cxa hooks run, then it will be fine, otherwise, it would crash.
>> Is this really possible?
> No, I do not think it is possible.
>
> The only possibility for something weird happen is for atexit/__cxa_atexit
> functions to be registered from another atexit function, and then we
> indeed could call the newly registered function too late.
>
> I wonder if the following hack makes any change in the observed behaviour.
>
> diff --git a/lib/libc/stdlib/atexit.c b/lib/libc/stdlib/atexit.c
> index 511172a..bab850c 100644
> --- a/lib/libc/stdlib/atexit.c
> +++ b/lib/libc/stdlib/atexit.c
> @@ -72,6 +72,7 @@ struct atexit {
>  };
>
>  static struct atexit *__atexit;	/* points to head of LIFO stack */
> +static int atexit_gen;
>
>  /*
>   * Register the function described by 'fptr' to be called at application
> @@ -107,6 +108,7 @@ atexit_register(struct atexit_fn *fptr)
>  		__atexit = p;
>  	}
>  	p->fns[p->ind++] = *fptr;
> +	atexit_gen++;
>  	_MUTEX_UNLOCK(&atexit_mutex);
>  	return 0;
>  }
> @@ -162,7 +164,7 @@ __cxa_finalize(void *dso)
>  	struct dl_phdr_info phdr_info;
>  	struct atexit *p;
>  	struct atexit_fn fn;
> -	int n, has_phdr;
> +	int atexit_gen_prev, n, has_phdr;
>
>  	if (dso != NULL)
>  		has_phdr = _rtld_addr_phdr(dso,&phdr_info);
> @@ -170,6 +172,8 @@ __cxa_finalize(void *dso)
>  		has_phdr = 0;
>
>  	_MUTEX_LOCK(&atexit_mutex);
> +retry:
> +	atexit_gen_prev = atexit_gen;
>  	for (p = __atexit; p; p = p->next) {
>  		for (n = p->ind; --n>= 0;) {
>  			if (p->fns[n].fn_type == ATEXIT_FN_EMPTY)
> @@ -196,6 +200,8 @@ __cxa_finalize(void *dso)
>  			_MUTEX_LOCK(&atexit_mutex);
>  		}
>  	}
> +	if (atexit_gen_prev != atexit_gen)
> +		goto retry;
>  	_MUTEX_UNLOCK(&atexit_mutex);
>  	if (dso == NULL)
>  		_MUTEX_DESTROY(&atexit_mutex);

I have tried your patch; it does not fix the problem. As I said, it is a
bug in Qt: the pthread key current_thread_data_key is deleted too early by
a global C++ object, while other global C++ objects still need it.

The following session shows how I found the problem:

davidxu@xyf:~%gdb qdbus
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...(no debugging symbols found)...
(gdb) break __cxa_finalize
Function "__cxa_finalize" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (__cxa_finalize) pending.
(gdb) run
Starting program: /usr/local/bin/qdbus
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...[New LWP 100077]
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...Breakpoint 2 at 0x2864ac26
Pending breakpoint "__cxa_finalize" resolved
(no debugging symbols found)...[New Thread 29007300 (LWP 100077/qdbus)]
(no debugging symbols found)...:1.0
 org.gnome.SessionManager
:1.11
:1.111
:1.12
:1.13
 org.gtk.vfs.Daemon
:1.143
:1.15
 org.pulseaudio.Server
:1.17
 org.gnome.Panel
:1.18
:1.19
:1.20
 org.gtk.Private.HalVolumeMonitor
:1.21
 org.gtk.Private.GPhoto2VolumeMonitor
:1.22
:1.24
 org.gnome.ScreenSaver
:1.25
:1.27
:1.28
:1.29
:1.30
:1.31
 org.gnome.panel.applet.WnckletFactory
:1.32
:1.33
:1.34
:1.35
 org.gnome.panel.applet.CPUFreqAppletFactory
:1.36
 org.gnome.panel.applet.NotificationAreaAppletFactory
:1.37
 org.gnome.panel.applet.MultiLoadAppletFactory
:1.38
:1.39
:1.4
 org.gnome.GConf
:1.41
 org.gnome.panel.applet.ClockAppletFactory
:1.49
:1.5
 org.gnome.SettingsDaemon
:1.50
:1.53
:1.64
:1.7
 org.freedesktop.secrets
 org.gnome.keyring
:1.75
 org.gtk.vfs.Metadata
:1.76
 org.gnome.Terminal.Display_0_0
:1.77
org.freedesktop.DBus
[Switching to Thread 29007300 (LWP 100077/qdbus)]

Breakpoint 2, 0x2864ac26 in __cxa_finalize () from /lib/libc.so.7
(gdb) print current_thread_data_key
$1 = 0
(gdb) thread tsd
Key 0, destructor 0x281d77f0 <_Z27destroy_current_thread_dataPv>
Key 1, destructor 0x28732dc0 <g_thread_create_full>
Key 2, destructor 0x28726a00 <g_slice_get_config>
Key 3, destructor 0x0 <???>

Here you can see that destroy_current_thread_data() is registered as the
destructor for key 0, which is the value of current_thread_data_key.

(gdb) bt
#0  0x2864ac26 in __cxa_finalize () from /lib/libc.so.7
#1  0x285efe1a in exit () from /lib/libc.so.7
#2  0x08051db5 in main ()
(gdb) break QThreadData::current()
Breakpoint 3 at 0x281d7856
(gdb) info breakpoints
Num Type           Disp Enb Address    What
2   breakpoint     keep y   0x2864ac26 <__cxa_finalize+6>
        breakpoint already hit 1 time
3   breakpoint     keep y   0x281d7856 <QThreadData::current()+6>
(gdb) delete 2
(gdb) cont
Continuing.

Breakpoint 3, 0x281d7856 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
(gdb) bt
#0  0x281d7856 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#1  0x281d4747 in QThread::currentThread () from /usr/local/lib/qt4/libQtCore.so.4
#2  0x28097248 in QDBusConnectionPrivate::deleteYourself () from /usr/local/lib/qt4/libQtDBus.so.4
#3  0x2808f2ea in QDBusConnection::~QDBusConnection () from /usr/local/lib/qt4/libQtDBus.so.4
#4  0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#5  0x285efe1a in exit () from /lib/libc.so.7
#6  0x08051db5 in main ()
(gdb) thread tsd
Key 1, destructor 0x0 <???>
Key 2, destructor 0x0 <???>
Key 3, destructor 0x0 <???>

Here you can see that destroy_current_thread_data() has been executed and
unregistered: the key current_thread_data_key, which is index 0, has been
deleted.

(gdb) cont
Continuing.

Breakpoint 3, 0x281d7856 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
(gdb) bt
#0  0x281d7856 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#1  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#2  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#3  0x281d5a9e in QAdoptedThread::QAdoptedThread () from /usr/local/lib/qt4/libQtCore.so.4
#4  0x281d7934 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#5  0x281d4747 in QThread::currentThread () from /usr/local/lib/qt4/libQtCore.so.4
#6  0x28097248 in QDBusConnectionPrivate::deleteYourself () from /usr/local/lib/qt4/libQtDBus.so.4
#7  0x2808f2ea in QDBusConnection::~QDBusConnection () from /usr/local/lib/qt4/libQtDBus.so.4
#8  0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#9  0x285efe1a in exit () from /lib/libc.so.7
#10 0x08051db5 in main ()

Now the stupid code starts to create a new thread...

(gdb) cont
Continuing.

Breakpoint 3, 0x281d7856 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
(gdb) bt
#0  0x281d7856 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#1  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#2  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#3  0x281d5a9e in QAdoptedThread::QAdoptedThread () from /usr/local/lib/qt4/libQtCore.so.4
#4  0x281d7934 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#5  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#6  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#7  0x281d5a9e in QAdoptedThread::QAdoptedThread () from /usr/local/lib/qt4/libQtCore.so.4
#8  0x281d7934 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#9  0x281d4747 in QThread::currentThread () from /usr/local/lib/qt4/libQtCore.so.4
#10 0x28097248 in QDBusConnectionPrivate::deleteYourself () from /usr/local/lib/qt4/libQtDBus.so.4
#11 0x2808f2ea in QDBusConnection::~QDBusConnection () from /usr/local/lib/qt4/libQtDBus.so.4
#12 0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#13 0x285efe1a in exit () from /lib/libc.so.7
#14 0x08051db5 in main ()
(gdb)
#0  0x281d7856 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#1  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#2  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#3  0x281d5a9e in QAdoptedThread::QAdoptedThread () from /usr/local/lib/qt4/libQtCore.so.4
#4  0x281d7934 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#5  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#6  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#7  0x281d5a9e in QAdoptedThread::QAdoptedThread () from /usr/local/lib/qt4/libQtCore.so.4
#8  0x281d7934 in QThreadData::current () from /usr/local/lib/qt4/libQtCore.so.4
#9  0x281d4747 in QThread::currentThread () from /usr/local/lib/qt4/libQtCore.so.4
#10 0x28097248 in QDBusConnectionPrivate::deleteYourself () from /usr/local/lib/qt4/libQtDBus.so.4
#11 0x2808f2ea in QDBusConnection::~QDBusConnection () from /usr/local/lib/qt4/libQtDBus.so.4
#12 0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#13 0x285efe1a in exit () from /lib/libc.so.7
#14 0x08051db5 in main ()
(gdb)

This is infinite recursion inside the Qt library: QThreadData::current()
keeps constructing a new QAdoptedThread, which calls QThreadData::current()
again, until the stack overflows.

As I said, it depends on the order in which the global objects are
destructed. If the object that deletes current_thread_data_key is
destructed last, the problem won't happen, but now it is destructed too
early. I believe no specification says which C++ object should be
destructed first when the objects live in different compiled modules that
are then linked together into a shared object (.so file).
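
A minimal sketch of this ordering problem, not Qt's actual code. All
names in it (KeyOwner, KeyUser, key_valid, event_log) are hypothetical,
and a plain flag stands in for the pthread key so that the example stays
within defined behaviour. It uses block-scope objects, which are destroyed
in reverse order of construction, so that both orders can be forced
deterministically; with globals in different modules the order is whatever
the link happens to produce.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names come from Qt.
static bool key_valid = false;              // models "the pthread key exists"
static std::vector<std::string> event_log;  // records events in order

// Models the global object that creates and deletes the key.
struct KeyOwner {
    KeyOwner() {
        assert(!key_valid);  // only one owner at a time in this sketch
        key_valid = true;
        event_log.push_back("key created");
    }
    ~KeyOwner() {
        key_valid = false;
        event_log.push_back("key deleted");
    }
};

// Models a global object whose destructor still needs the key,
// like the QDBusConnection destructor in the backtraces.
struct KeyUser {
    ~KeyUser() {
        event_log.push_back(key_valid ? "user destroyed: key valid"
                                      : "user destroyed: key already deleted");
    }
};

// The failing order: the key owner is destroyed before its user.
std::vector<std::string> destroy_owner_first() {
    event_log.clear();
    {
        KeyUser user;    // constructed first, so destroyed last
        KeyOwner owner;  // constructed last, so destroyed first
    }
    return event_log;
}

// The harmless order: the key owner is destroyed after its user.
std::vector<std::string> destroy_owner_last() {
    event_log.clear();
    {
        KeyOwner owner;  // constructed first, so destroyed last
        KeyUser user;    // constructed last, so destroyed first
    }
    return event_log;
}
```

Calling destroy_owner_first() reproduces the shape of the failure: the
user's destructor runs after the key is gone. In the real library there
is no such flag to check, so the destructor goes on to use the deleted
key.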