Message-ID: <48E95D0E.50202@freebsd.org>
Date: Sun, 05 Oct 2008 17:34:22 -0700
From: Tim Kientzle
To: jos@catnook.com
Cc: Andrey Chernov, freebsd-current@freebsd.org
Subject: Re: firefox3-bin crashes near arc4random_buf()

> I watched it crash a bunch more times and the backtraces are the same. That's
> good, right? :-)

Yes. For a suitable definition of "good." ;-)

>>It might also be worth running it under ktrace,
>>forcing the crash, then sharing the last few dozen
>>lines from kdump output.
>
> Also attached is firefox3.kdump. The last few lines look like:
>
>   6855 firefox-bin RET   clock_gettime 0
>   6855 firefox-bin CALL  _umtx_op(0x8179760,0x8,0x1,0x8179740,0xbf8fdddc)
>   6855 firefox-bin PSIG  SIGSEGV caught handler=0x28237290 mask=0x0 code=0x1
>   6855 firefox-bin CALL  unlink(0x8179600)
>   6855 firefox-bin NAMI  "/home/jos/.mozilla/firefox/tosfxhak.default/lock"
>   6855 firefox-bin RET   unlink 0
>   6855 firefox-bin CALL  sigaction(SIGSEGV,0x2978dfb4,0)
>   6855 firefox-bin RET   sigaction 0
>   6855 firefox-bin CALL  sigprocmask(SIG_UNBLOCK,0xbf4f906c,0)
>   6855 firefox-bin RET   sigprocmask 0
>   6855 firefox-bin CALL  thr_kill(0x1878c,SIGSEGV)
>   6855 firefox-bin RET   thr_kill 0
>   6855 firefox-bin PSIG  SIGSEGV SIG_DFL
>
> This to me suggests that the segfault happens inside _umtx_op. Am I reading
> that correctly?

Not necessarily. Firefox is multi-threaded. The thread that
called _umtx_op() is not the thread that crashed (_umtx_op()
hadn't returned to userspace, so that thread was still in
the kernel).

This does, however, answer one puzzle: Firefox appears to
have a signal handler that catches SEGV, releases the lock
file, then re-throws SEGV to actually kill the program.
That explains stack frames #0-#4 in your backtrace; that's
the signal handler executing after the segfault but before
the program is terminated.

Something is still screwy about the backtrace. dbopen()
doesn't call arc4random_buf. However, it does call mkstemp()
which does call arc4random_uniform, which should be right
next to arc4random_buf in memory. GCC optimizations could
be obscuring the call stack here. It's certainly possible
that arc4random is involved somehow but I don't yet see it.

It does seem likely that we're looking at a libc problem,
so a debug version of libc might help. Replacing libc
on a running system is a little tricky. I believe the
following works, though I've not tried it:

  % cd /usr/src/lib/libc
  % make clean
  % make DEBUG_FLAGS=-g
  % cp /lib/libc.so.7 /lib/libc.so.7-backup
  ... reboot to single user, use /rescue/sh as your shell ...
  % cp /usr/src/lib/libc/libc.so.7 /lib/libc.so.7
  ... reboot ...

This should give you a standard libc with full debugging
symbols. Hopefully, the backtrace will now give more details.

I think we're getting closer.

Tim
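
For what it's worth, the two copy steps above could be wrapped in a small helper so that the backup is confirmed to exist before the live library is overwritten. This is only a sketch: the function name install_with_backup is made up for illustration, it takes the paths as arguments so it can be tried on scratch files first, and it has not been run against a real /lib.

```shell
#!/bin/sh
# Hypothetical helper (name and layout are assumptions, not from the thread):
# back up DST as DST-backup, check the backup, then copy SRC over DST.
install_with_backup() {
    src=$1
    dst=$2
    backup="${dst}-backup"

    # Refuse to continue if the replacement file was never built.
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }

    # Keep a copy of the current file, preserving mode and timestamps.
    cp -p "$dst" "$backup" || return 1

    # Make sure the backup exists and is non-empty before overwriting.
    [ -s "$backup" ] || { echo "backup failed: $backup" >&2; return 1; }

    # Overwrite the original with the replacement.
    cp "$src" "$dst"
}

# Intended use after rebooting to single user (not executed here):
#   install_with_backup /usr/src/lib/libc/libc.so.7 /lib/libc.so.7
```

Running it once on throwaway files in a temporary directory is a cheap way to check the logic before pointing it at libc.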