Date: Sat, 4 Jul 2015 00:11:11 +0300
From: Konstantin Belousov
To: Andre Meiser
Cc: freebsd-stable@freebsd.org
Subject: Re: Many core dumps in pthread_getspecific.
Message-ID: <20150703211111.GZ2080@kib.kiev.ua>
References: <20150603145838.GX2499@kib.kiev.ua> <20150614190504.GT2080@kib.kiev.ua> <20150616073637.GO2080@kib.kiev.ua>

On Fri, Jul 03, 2015 at 05:21:50PM +0200, Andre Meiser wrote:
> Hi,
>
> back again. Sorry, I accidentally deleted the core file and had to wait two weeks until vim crashed again. Xorg hasn't crashed so far with the debug libs.
>
> On Tue, Jun 16, 2015 at 09:36 +0200, Konstantin Belousov wrote:
> > Ok, so the vim fault is reproducible, I suppose ?
>
> No, I tried, but no chance to do it on purpose. But so far it always happens while resizing the xterm.
>
> Now the entire info you asked for (out of the new core file):
>
>
> % readelf -d vim | grep NEEDED
>  0x0000000000000001 (NEEDED) Shared library: [libm.so.5]
>  0x0000000000000001 (NEEDED) Shared library: [libncurses.so.8]
>  0x0000000000000001 (NEEDED) Shared library: [libintl.so.8]
>  0x0000000000000001 (NEEDED) Shared library: [libpython2.7.so.1]
>  0x0000000000000001 (NEEDED) Shared library: [libthr.so.3]
>  0x0000000000000001 (NEEDED) Shared library: [libc.so.7]
>
> (gdb) bt
> #0  0x000000080149e6a2 in check_deferred_signal (curthread=0x802406400) at /usr/src/lib/libthr/thread/thr_sig.c:331
> #1  0x000000080149e5ed in _thr_ast (curthread=0x802406400) at /usr/src/lib/libthr/thread/thr_sig.c:264
> #2  0x00000008014a33c7 in _thr_rtld_lock_release (lock=) at /usr/src/lib/libthr/thread/thr_rtld.c:162
> #3  0x000000080083d94d in _r_debug_postinit () from /libexec/ld-elf.so.1
> #4  0x000000080083b15d in .text () from /libexec/ld-elf.so.1
> #5  0x00000000004e4163 in preserve_exit ()
> #6  0x000000000051f118 in mch_libcall ()
> #7  0x000000080149f47a in handle_signal (actp=, sig=, info=, ucp=) at /usr/src/lib/libthr/thread/thr_sig.c:240
> #8  0x000000080149f062 in thr_sighandler (sig=, info=, _ucp=) at /usr/src/lib/libthr/thread/thr_sig.c:183
> #9
> #10 0x000000080149e6a2 in check_deferred_signal (curthread=0x802406400) at /usr/src/lib/libthr/thread/thr_sig.c:331
> #11 0x000000080149e5ed in _thr_ast (curthread=0x802406400) at /usr/src/lib/libthr/thread/thr_sig.c:264
> #12 0x00000008014a33c7 in _thr_rtld_lock_release (lock=) at /usr/src/lib/libthr/thread/thr_rtld.c:162
> #13 0x000000080083d94d in _r_debug_postinit () from /libexec/ld-elf.so.1
> #14 0x000000080083b15d in .text () from /libexec/ld-elf.so.1
> #15 0x000000080149f4e2 in handle_signal (actp=, sig=, info=, ucp=) at /usr/src/lib/libthr/thread/thr_sig.c:256
> #16 0x000000080149f062 in thr_sighandler (sig=, info=, _ucp=) at /usr/src/lib/libthr/thread/thr_sig.c:183
> #17
> #18 select () at select.S:3
> #19 0x000000080149cb32 in __select (numfds=1, readfds=0x7fffffffdfb0, writefds=0x0, exceptfds=0x7fffffffdf30, timeout=0x7fffffffe038) at /usr/src/lib/libthr/thread/thr_syscalls.c:561
> #20 0x000000000051ac4b in mch_write ()
> #21 0x000000000051ae0f in mch_inchar ()
> #22 0x00000000005b8647 in ui_inchar ()
> #23 0x00000000004aeb8a in inchar ()
> #24 0x00000000004b1ffb in vgetc ()
> #25 0x00000000004b0efa in vgetc ()
> #26 0x00000000004b27b9 in safe_vgetc ()
> #27 0x00000000004f59ef in normal_cmd ()
> #28 0x00000000005dfec7 in main_loop ()
> #29 0x00000000005df538 in main ()
>
>
> (gdb) info locals
> act = {__sigaction_u = {__sa_handler = 0, __sa_sigaction = 0}, sa_flags = 37875000, sa_mask = {__bits = {8, 4239276, 0, 0}}}
> info = {si_signo = 0, si_errno = 0, si_code = 37875000, si_pid = 8, si_uid = 37874640, si_status = 8, si_addr = 0x700000008, si_value = {sival_int = 37875104, sival_ptr = 0x80241eda0, sigval_int = 37875104, sigval_ptr = 0x80241eda0}, _reason = {_fault = {_trapno = 141},
>     _timer = {_timerid = 141, _overrun = 0}, _mesgq = {_mqd = 141}, _poll = {_band = 141}, __spare__ = {__spare1__ = 141, __spare2__ = {0, 0, 8744960, 8, 37874976, 8, 8641467}}}}
>
>
> (gdb) info registers
> rax     0xf0b470            15774832
> rbx     0x802406400         34397512704
> rcx     0x1                 1
> rdx     0x80085b800         34368501760
> rsi     0x80241ed38         34397613368
> rdi     0x8015137d0         34381838288
> rbp     0x80241ecd0         0x80241ecd0
> rsp     0x8015137d0         0x8015137d0
> r8      0x800856600         34368480768
> r9      0x8080808080808080  -9187201950435737472
> r10     0x41b778            4306808
> r11     0x5262              21090
> r12     0x1                 1
> r13     0x839888            8624264
> r14     0x8015137d0         34381838288
> r15     0x2                 2
> rip     0x80149e6a2         0x80149e6a2
> eflags  0x10202             66050
> cs      0x43                67
> ss      0x3b                59
> ds      0x0                 0
> es      0x0                 0
> fs      0x0                 0
> gs      0x0                 0
>
>
> (gdb) disassemble
> Dump of assembler code for function check_deferred_signal:
> 0x000000080149e650 : push %rbp
> 0x000000080149e651 : mov %rsp,%rbp
> 0x000000080149e654 : push %r15
> 0x000000080149e656 : push %r14
> 0x000000080149e658 : push %rbx
> 0x000000080149e659 : sub $0x78,%rsp
> 0x000000080149e65d : mov %rdi,%rbx
> 0x000000080149e660 : cmpl $0x0,0x100(%rbx)
> 0x000000080149e667 : je 0x80149e672
> 0x000000080149e669 : cmpl $0x0,0x180(%rbx)
> 0x000000080149e670 : je 0x80149e67d
> 0x000000080149e672 : lea -0x18(%rbp),%rsp
> 0x000000080149e676 : pop %rbx
> 0x000000080149e677 : pop %r14
> 0x000000080149e679 : pop %r15
> 0x000000080149e67b : pop %rbp
> 0x000000080149e67c : retq
> 0x000000080149e67d : movl $0x1,0x180(%rbx)
> 0x000000080149e687 : callq 0x801498e44 <__getcontextx_size@plt>
> 0x000000080149e68c : cltq
> 0x000000080149e68e : mov %rsp,%r14
> 0x000000080149e691 : add $0xf,%rax
> 0x000000080149e695 : and $0xfffffffffffffff0,%rax
> 0x000000080149e699 : sub %rax,%r14
> 0x000000080149e69c : mov %r14,%rsp
> 0x000000080149e69f : mov %r14,%rdi
> 0x000000080149e6a2 : callq 0x801499214
> 0x000000080149e6a7 : cmpl $0x0,0x100(%rbx)
> 0x000000080149e6ae : je 0x80149e73b
> 0x000000080149e6b4 : lea 0x100(%rbx),%r15
> 0x000000080149e6bb : mov %r14,%rdi
> 0x000000080149e6be : callq 0x801499064 <__fillcontextx2@plt>
> 0x000000080149e6c3 : movups 0x160(%rbx),%xmm0
> 0x000000080149e6ca : movups 0x170(%rbx),%xmm1
> 0x000000080149e6d1 : movaps %xmm1,-0x30(%rbp)
> 0x000000080149e6d5 : movaps %xmm0,-0x40(%rbp)
> 0x000000080149e6d9 : movups 0x150(%rbx),%xmm0
> 0x000000080149e6e0 : movups %xmm0,(%r14)
> 0x000000080149e6e4 : movups 0x40(%r15),%xmm0
> 0x000000080149e6e9 : movaps %xmm0,-0x50(%rbp)
> 0x000000080149e6ed : movups (%r15),%xmm0
> 0x000000080149e6f1 : movups 0x10(%r15),%xmm1
> 0x000000080149e6f6 : movups 0x20(%r15),%xmm2
> 0x000000080149e6fb : movups 0x30(%r15),%xmm3
> 0x000000080149e700 : movaps %xmm3,-0x60(%rbp)
> 0x000000080149e704 : movaps %xmm2,-0x70(%rbp)
> 0x000000080149e708 : movaps %xmm1,-0x80(%rbp)
> 0x000000080149e70c : movaps %xmm0,-0x90(%rbp)
> 0x000000080149e713 : movl $0x0,0x100(%rbx)
> 0x000000080149e71d : mov -0x90(%rbp),%esi
> 0x000000080149e723 : lea -0x40(%rbp),%rdi
> 0x000000080149e727 : lea -0x90(%rbp),%rdx
> 0x000000080149e72e : mov %r14,%rcx
> 0x000000080149e731 : callq 0x80149f390
> 0x000000080149e736 : jmpq 0x80149e672
> 0x000000080149e73b : movl $0x0,0x180(%rbx)
> 0x000000080149e745 : jmpq 0x80149e672
> End of assembler dump.
>
>
> I've kept a copy of the vim binary and also the core file, so this time I can answer any further questions much faster. ;)
>
> I can't help much with the assembler part. But I've looked into /usr/src/lib/libthr/thread/thr_sig.c and there is alloca used at line 330:
>
> 330 uc = alloca(uc_len);
> 331 getcontext(uc);
>
> I would bet using malloc and checking for NULL will help to fix this problem. Well, there will be a free needed before return and one at the end of check_deferred_signal, but that's better than an unsafe alloca.
>

You would be wrong. It seems that there is a recursion into rtld which
cannot work when returning from the signal. Try the following patch,
but I am unsure how easy it is to see whether the patch helps.

diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c
index a6d021f..ebb6c58 100644
--- a/lib/libthr/thread/thr_sig.c
+++ b/lib/libthr/thread/thr_sig.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -257,7 +258,7 @@ handle_signal(struct sigaction *actp, int sig, siginfo_t *info, ucontext_t *ucp)
 	/* reschedule cancellation */
 	check_cancel(curthread, &uc2);
 	errno = err;
-	__sys_sigreturn(&uc2);
+	syscall(SYS_sigreturn, &uc2);
 }
 
 void