Date:      Mon, 21 May 2012 10:54:54 +0800
From:      David Xu <listlog2011@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Alberto Villa <avilla@freebsd.org>, Gustau Pérez i Querol <gperez@entel.upc.edu>, davidxu@freebsd.org, FreeBSD current <freebsd-current@freebsd.org>
Subject:   Re: RFC: jemalloc: qdbus sigsegv in malloc_init
Message-ID:  <4FB9AE7E.6090109@gmail.com>
In-Reply-To: <20120520172419.GQ2358@deviant.kiev.zoral.com.ua>
References:  <4F9E9E06.4070004@entel.upc.edu> <4FB88925.4070008@gmail.com> <CAJp7RHaOkEzyfD5e6pLMSBxvCBYCn9BWv=9BWu0CYsQHzGyFdg@mail.gmail.com> <20120520172419.GQ2358@deviant.kiev.zoral.com.ua>

On 2012/5/21 1:24, Konstantin Belousov wrote:
> On Sun, May 20, 2012 at 06:42:35PM +0200, Alberto Villa wrote:
>> On Sun, May 20, 2012 at 8:03 AM, David Xu<listlog2011@gmail.com>  wrote:
>>> qdbus segfaults on my machine too. I tracked it down and found that the
>>> problem is in Qt: it deletes current_thread_data_key but still uses it
>>> in some cxa hooks. I applied the following patch, and it works fine.
>> Thanks for the analysis David!
>>
>>> I think the bug depends on the linking order in the Qt library. If
>>> qthread_unix.cpp is linked as the last module, the key will be deleted
>>> after all cxa hooks have run, and everything will be fine; otherwise,
>>> it will crash.
>> Is this really possible?
> No, I do not think it is possible.
>
> The only possibility for something weird to happen is for atexit/__cxa_atexit
> functions to be registered from another atexit function, and then we
> indeed could call the newly registered function too late.
>
> I wonder if the following hack makes any change in the observed behaviour.
>
> diff --git a/lib/libc/stdlib/atexit.c b/lib/libc/stdlib/atexit.c
> index 511172a..bab850c 100644
> --- a/lib/libc/stdlib/atexit.c
> +++ b/lib/libc/stdlib/atexit.c
> @@ -72,6 +72,7 @@ struct atexit {
>   };
>
>   static struct atexit *__atexit;		/* points to head of LIFO stack */
> +static int atexit_gen;
>
>   /*
>    * Register the function described by 'fptr' to be called at application
> @@ -107,6 +108,7 @@ atexit_register(struct atexit_fn *fptr)
>   		__atexit = p;
>   	}
>   	p->fns[p->ind++] = *fptr;
> +	atexit_gen++;
>   	_MUTEX_UNLOCK(&atexit_mutex);
>   	return 0;
>   }
> @@ -162,7 +164,7 @@ __cxa_finalize(void *dso)
>   	struct dl_phdr_info phdr_info;
>   	struct atexit *p;
>   	struct atexit_fn fn;
> -	int n, has_phdr;
> +	int atexit_gen_prev, n, has_phdr;
>
>   	if (dso != NULL)
>   		has_phdr = _rtld_addr_phdr(dso,&phdr_info);
> @@ -170,6 +172,8 @@ __cxa_finalize(void *dso)
>   		has_phdr = 0;
>
>   	_MUTEX_LOCK(&atexit_mutex);
> +retry:
> +	atexit_gen_prev = atexit_gen;
>   	for (p = __atexit; p; p = p->next) {
>   		for (n = p->ind; --n>= 0;) {
>   			if (p->fns[n].fn_type == ATEXIT_FN_EMPTY)
> @@ -196,6 +200,8 @@ __cxa_finalize(void *dso)
>   			_MUTEX_LOCK(&atexit_mutex);
>   		}
>   	}
> +	if (atexit_gen_prev != atexit_gen)
> +		goto retry;
>   	_MUTEX_UNLOCK(&atexit_mutex);
>   	if (dso == NULL)
>   		_MUTEX_DESTROY(&atexit_mutex);
I have tried your patch, and it does not fix the problem. As I said, it is
a bug in Qt: the pthread key current_thread_data_key is deleted too early
by a global C++ object, while other global C++ objects still need it. The
following procedure shows how I found the problem:

davidxu@xyf:~%gdb qdbus
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...(no debugging symbols 
found)...
(gdb) break __cxa_finalize
Function "__cxa_finalize" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (__cxa_finalize) pending.
(gdb) run
Starting program: /usr/local/bin/qdbus
(no debugging symbols found)...(no debugging symbols found)...(no 
debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...(no debugging symbols found)...(no debugging symbols 
found)...[New LWP 100077]
(no debugging symbols found)...(no debugging symbols found)...(no 
debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...(no debugging symbols found)...(no debugging symbols 
found)...(no debugging symbols found)...Breakpoint 2 at 0x2864ac26
Pending breakpoint "__cxa_finalize" resolved
(no debugging symbols found)...[New Thread 29007300 (LWP 100077/qdbus)]
(no debugging symbols found)...:1.0
  org.gnome.SessionManager
:1.11
:1.111
:1.12
:1.13
  org.gtk.vfs.Daemon
:1.143
:1.15
  org.pulseaudio.Server
:1.17
  org.gnome.Panel
:1.18
:1.19
:1.20
  org.gtk.Private.HalVolumeMonitor
:1.21
  org.gtk.Private.GPhoto2VolumeMonitor
:1.22
:1.24
  org.gnome.ScreenSaver
:1.25
:1.27
:1.28
:1.29
:1.30
:1.31
  org.gnome.panel.applet.WnckletFactory
:1.32
:1.33
:1.34
:1.35
  org.gnome.panel.applet.CPUFreqAppletFactory
:1.36
  org.gnome.panel.applet.NotificationAreaAppletFactory
:1.37
  org.gnome.panel.applet.MultiLoadAppletFactory
:1.38
:1.39
:1.4
  org.gnome.GConf
:1.41
  org.gnome.panel.applet.ClockAppletFactory
:1.49
:1.5
  org.gnome.SettingsDaemon
:1.50
:1.53
:1.64
:1.7
  org.freedesktop.secrets
  org.gnome.keyring
:1.75
  org.gtk.vfs.Metadata
:1.76
  org.gnome.Terminal.Display_0_0
:1.77
org.freedesktop.DBus
[Switching to Thread 29007300 (LWP 100077/qdbus)]

Breakpoint 2, 0x2864ac26 in __cxa_finalize () from /lib/libc.so.7
(gdb) print current_thread_data_key
$1 = 0
(gdb) thread tsd
Key 0, destructor 0x281d77f0 <_Z27destroy_current_thread_dataPv>
Key 1, destructor 0x28732dc0 <g_thread_create_full>
Key 2, destructor 0x28726a00 <g_slice_get_config>
Key 3, destructor 0x0 <???>


Here you can see that the function destroy_current_thread_data() is
registered as the destructor of key 0.


(gdb) bt
#0  0x2864ac26 in __cxa_finalize () from /lib/libc.so.7
#1  0x285efe1a in exit () from /lib/libc.so.7
#2  0x08051db5 in main ()
(gdb) break QThreadData::current()
Breakpoint 3 at 0x281d7856
(gdb) info breakpoints
Num Type           Disp Enb Address    What
2   breakpoint     keep y   0x2864ac26 <__cxa_finalize+6>
     breakpoint already hit 1 time
3   breakpoint     keep y   0x281d7856 <QThreadData::current()+6>
(gdb) delete 2
(gdb) cont
Continuing.

Breakpoint 3, 0x281d7856 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
(gdb) bt
#0  0x281d7856 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#1  0x281d4747 in QThread::currentThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#2  0x28097248 in QDBusConnectionPrivate::deleteYourself () from 
/usr/local/lib/qt4/libQtDBus.so.4
#3  0x2808f2ea in QDBusConnection::~QDBusConnection () from 
/usr/local/lib/qt4/libQtDBus.so.4
#4  0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#5  0x285efe1a in exit () from /lib/libc.so.7
#6  0x08051db5 in main ()
(gdb) thread tsd
Key 1, destructor 0x0 <???>
Key 2, destructor 0x0 <???>
Key 3, destructor 0x0 <???>

Here you can see that destroy_current_thread_data() has been executed and
unregistered: the key current_thread_data_key, which is index 0, has been
deleted.


(gdb) cont
Continuing.

Breakpoint 3, 0x281d7856 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
(gdb) bt
#0  0x281d7856 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#1  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#2  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#3  0x281d5a9e in QAdoptedThread::QAdoptedThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#4  0x281d7934 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#5  0x281d4747 in QThread::currentThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#6  0x28097248 in QDBusConnectionPrivate::deleteYourself () from 
/usr/local/lib/qt4/libQtDBus.so.4
#7  0x2808f2ea in QDBusConnection::~QDBusConnection () from 
/usr/local/lib/qt4/libQtDBus.so.4
#8  0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#9  0x285efe1a in exit () from /lib/libc.so.7
#10 0x08051db5 in main ()

Now the stupid code starts to create a new thread object (QAdoptedThread)
from inside QThreadData::current()...

(gdb) cont
Continuing.

Breakpoint 3, 0x281d7856 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
(gdb) bt
#0  0x281d7856 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#1  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#2  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#3  0x281d5a9e in QAdoptedThread::QAdoptedThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#4  0x281d7934 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#5  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#6  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#7  0x281d5a9e in QAdoptedThread::QAdoptedThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#8  0x281d7934 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#9  0x281d4747 in QThread::currentThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#10 0x28097248 in QDBusConnectionPrivate::deleteYourself () from 
/usr/local/lib/qt4/libQtDBus.so.4
#11 0x2808f2ea in QDBusConnection::~QDBusConnection () from 
/usr/local/lib/qt4/libQtDBus.so.4
#12 0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#13 0x285efe1a in exit () from /lib/libc.so.7
#14 0x08051db5 in main ()
(gdb)
#0  0x281d7856 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#1  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#2  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#3  0x281d5a9e in QAdoptedThread::QAdoptedThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#4  0x281d7934 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#5  0x282f58e3 in QObject::QObject () from /usr/local/lib/qt4/libQtCore.so.4
#6  0x281d4710 in QThread::QThread () from /usr/local/lib/qt4/libQtCore.so.4
#7  0x281d5a9e in QAdoptedThread::QAdoptedThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#8  0x281d7934 in QThreadData::current () from 
/usr/local/lib/qt4/libQtCore.so.4
#9  0x281d4747 in QThread::currentThread () from 
/usr/local/lib/qt4/libQtCore.so.4
#10 0x28097248 in QDBusConnectionPrivate::deleteYourself () from 
/usr/local/lib/qt4/libQtDBus.so.4
#11 0x2808f2ea in QDBusConnection::~QDBusConnection () from 
/usr/local/lib/qt4/libQtDBus.so.4
#12 0x2864ad8f in __cxa_finalize () from /lib/libc.so.7
#13 0x285efe1a in exit () from /lib/libc.so.7
#14 0x08051db5 in main ()
(gdb)

The Qt library recurses endlessly here until the stack overflows.
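
To make the recursion in the backtraces easier to follow, here is a small
C model of it. It is only a sketch, not the Qt source: get_data(),
set_data(), construct_adopted_thread(), current_data(), model_depth() and
DEPTH_CAP are hypothetical names, and the deleted key is simulated with a
flag, because using a really deleted pthread key is undefined behaviour.
The assumption is that once the key is gone, storing the per-thread data
has no effect, so every nested call again finds no data and constructs
another adopted-thread object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; this is not the Qt source. */
static int key_valid;        /* 0 models "the TSD key has been deleted" */
static void *slot;           /* value stored under the key */
static int depth;            /* current nesting of current_data() */
static int max_depth_seen;   /* deepest nesting observed */
enum { DEPTH_CAP = 8 };      /* stop the model long before a real overflow */

static void *get_data(void) { return key_valid ? slot : NULL; }

/* With a deleted key the store is lost (assumption of this model). */
static void set_data(void *p) { if (key_valid) slot = p; }

static void *current_data(void);

/* Models QAdoptedThread -> QThread -> QObject construction, which asks
 * for the current thread data again. */
static void construct_adopted_thread(void)
{
    (void)current_data();
}

/* Models QThreadData::current(): create the data on first use. */
static void *current_data(void)
{
    static int dummy;
    void *data;

    depth++;
    if (depth > max_depth_seen)
        max_depth_seen = depth;
    data = get_data();
    if (data == NULL && depth < DEPTH_CAP) {
        set_data(&dummy);            /* has no effect once the key is gone */
        construct_adopted_thread();  /* re-enters current_data() */
        data = get_data();
    }
    depth--;
    return data;
}

/* Returns how deep current_data() nested for a valid or deleted key. */
int model_depth(int valid)
{
    key_valid = valid;
    slot = NULL;
    depth = 0;
    max_depth_seen = 0;
    (void)current_data();
    assert(depth == 0);              /* every entry was matched by an exit */
    return max_depth_seen;
}
```

With a valid key the nesting stops after the single re-entry made while
the adopted-thread object is constructed. With the key deleted, only the
artificial DEPTH_CAP stops it; the real code has no such cap, which matches
the ever-growing backtraces above.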

As I said, it depends on the order in which the global objects are
destructed. If the object that deletes current_thread_data_key is
destructed last, the problem won't happen, but now it is destructed too
early. I believe no specification says which C++ object should be
destructed first when the objects live in different compilation units
that are then linked together into a shared object (.so file).
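
The ordering dependency can be sketched with a tiny C model of a LIFO
exit-handler stack, which is how __cxa_atexit-style destructor
registration behaves: the handler registered last runs first. This is an
illustration only; register_handler(), delete_key(), use_key(),
run_handlers() and count_bad_uses() are hypothetical names, and the key is
again simulated with a flag.

```c
#include <assert.h>

/* Hypothetical model of a LIFO exit-handler stack; illustration only. */
typedef void (*handler_fn)(void);

enum { MAX_HANDLERS = 4 };
static handler_fn handlers[MAX_HANDLERS];
static int n_handlers;
static int key_alive;          /* 1 while the simulated key exists */
static int uses_after_delete;  /* times the key was needed after deletion */

static void register_handler(handler_fn fn)
{
    assert(n_handlers < MAX_HANDLERS);
    handlers[n_handlers++] = fn;
}

/* Models the destructor of the global object that deletes the key. */
static void delete_key(void) { key_alive = 0; }

/* Models the destructor of another global object that still needs the key. */
static void use_key(void)
{
    if (!key_alive)
        uses_after_delete++;
}

/* Run the handlers in reverse registration order, like exit() does. */
static void run_handlers(void)
{
    while (n_handlers > 0)
        handlers[--n_handlers]();
}

/* Returns how often the key was needed after it had been deleted. */
int count_bad_uses(int deleter_constructed_first)
{
    n_handlers = 0;
    key_alive = 1;
    uses_after_delete = 0;
    if (deleter_constructed_first) {
        register_handler(delete_key);  /* registered first, runs last */
        register_handler(use_key);
    } else {
        register_handler(use_key);
        register_handler(delete_key);  /* registered last, runs first */
    }
    run_handlers();
    return uses_after_delete;
}
```

When the deleting object is constructed first, its destructor runs last
and nothing uses the key after deletion. In the other order the key is
already gone when the second destructor needs it, which is the situation
seen in qdbus.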






