From: Marius Strobl
Date: Sat, 6 Aug 2011 20:05:37 +0200
To: Peter Jeremy
Cc: "freebsd-sparc64@freebsd.org"
Subject: Re: 'make -j16 universe' gives SIReset
Message-ID: <20110806180537.GB48988@alchemy.franken.de>
In-Reply-To: <20110706222851.GQ65891@pjdesk.au.alcatel-lucent.com>

On Thu, Jul 07, 2011
at 08:28:51AM +1000, Peter Jeremy wrote:
> On 2011-Jul-06 18:39:10 +0800, Marius Strobl wrote:
> >On Wed, Jul 06, 2011 at 02:26:34PM +1000, Peter Jeremy wrote:
> >> And DDB for one of the stuck processes shows
> >> db> trace 8881
> >> Tracing pid 8881 tid 195433 td 0xfffff8b0a2e72880
> >> mi_switch() at mi_switch+0x2a8
> >> sleepq_switch() at sleepq_switch+0x1cc
> >> sleepq_catch_signals() at sleepq_catch_signals+0x130
> >> sleepq_wait_sig() at sleepq_wait_sig+0x8
> >> _sleep() at _sleep+0x41c
> >> do_rw_rdlock() at do_rw_rdlock+0x7e4
> >> __umtx_op_rw_rdlock() at __umtx_op_rw_rdlock+0x1c
> >> _umtx_op() at _umtx_op+0x3c
> >> syscallenter() at syscallenter+0x270
> >> syscall() at syscall+0x74
> >> -- syscall (454, FreeBSD ELF64, _umtx_op) %o7=0x40479574 --
> >> userland() at 0x4047957c
> >> user trace: trap %o7=0x40479574
> >> pc 0x4047957c, sp 0x7fdffffc561
> >> pc 0x7fdffffd1c0, sp 0x40365a10
> >> pc 0x90000000000125a, sp 0xac00002d11220000
> >
> >What line does mi_switch+0x2a8 translate to?
>
> 0xc0503628 : call 0xc0528ba0
> 448 sched_switch(td, newtd, flags);
>
> The system is still running so I think the bigger issue is why
> none of the processes can grab the mutex.

Could you please give the below patch a try? This is just a shot in
the dark though.

>
> >sparc64 package of gdb53, which still has the '-k' option:
> >http://people.freebsd.org/~marius/gdb-5.3_1%2c1.tbz
>
> Unfortunately, it doesn't like me:
>
> # gdb53 -k /usr/obj/usr/src/sys/GENERIC/kernel.debug /dev/mem
> GNU gdb 5.3 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "sparc64-unknown-freebsd9.0"...
> panic messages:
> ---
> dmesg: kvm_nlist: No such file or directory
> ---
> ---Can't read userspace from dump, or kernel process---
>
> (kgdb) where
> ---Can't read userspace from dump, or kernel process---
>
> (kgdb) disas mi_switch
> Segmentation fault (core dumped)
>

FYI, as of r224687 I've basically unfucked thread debugging and kgdb(1)
on sparc64.

Marius

Index: lib/libthr/thread/thr_umtx.c
===================================================================
--- lib/libthr/thread/thr_umtx.c	(revision 223862)
+++ lib/libthr/thread/thr_umtx.c	(working copy)
@@ -152,13 +152,13 @@ __thr_umutex_timedlock(struct umutex *mtx, uint32_
 int
 __thr_umutex_unlock(struct umutex *mtx, uint32_t id)
 {
-#ifndef __ia64__
-	/* XXX this logic has a race-condition on ia64. */
+#if !defined(__ia64__) && !defined(__sparc64__)
+	/* XXX this logic has a race-condition on these architectures. */
 	if ((mtx->m_flags & (UMUTEX_PRIO_PROTECT | UMUTEX_PRIO_INHERIT)) == 0) {
 		atomic_cmpset_rel_32(&mtx->m_owner, id | UMUTEX_CONTESTED, UMUTEX_CONTESTED);
 		return _umtx_op_err(mtx, UMTX_OP_MUTEX_WAKE, 0, 0, 0);
 	}
-#endif /* __ia64__ */
+#endif /* !__ia64__ && !__sparc64__*/
 	return _umtx_op_err(mtx, UMTX_OP_MUTEX_UNLOCK, 0, 0, 0);
 }
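
Editorial illustration (not part of the original message): the patch only changes which unlock path is compiled in. The standalone sketch below mirrors the preprocessor guard so its effect can be checked on any machine. The names `enum unlock_path`, `select_unlock_path()` and `unlock_path_name()` are hypothetical stand-ins invented for this example; they are not libthr functions, and the sketch performs no locking or system calls.

```c
/* Hypothetical stand-ins for illustration only; not the libthr API. */
enum unlock_path {
	UNLOCK_FAST_WAKE,	/* userland cmpset, then UMTX_OP_MUTEX_WAKE */
	UNLOCK_KERNEL		/* plain UMTX_OP_MUTEX_UNLOCK */
};

/*
 * Mirror the guard from the patch: the fast path exists only when the
 * code is built for neither ia64 nor sparc64, and only for mutexes that
 * do not use a priority protocol.
 */
static enum unlock_path
select_unlock_path(int uses_prio_protocol)
{
#if !defined(__ia64__) && !defined(__sparc64__)
	if (!uses_prio_protocol)
		return (UNLOCK_FAST_WAKE);
#else
	/* Fast path compiled out; the argument is intentionally unused. */
	(void)uses_prio_protocol;
#endif
	return (UNLOCK_KERNEL);
}

/* Map a path to the umtx operation name used in the patch. */
static const char *
unlock_path_name(enum unlock_path p)
{
	return (p == UNLOCK_FAST_WAKE ?
	    "UMTX_OP_MUTEX_WAKE" : "UMTX_OP_MUTEX_UNLOCK");
}
```

With the patch applied, a build for sparc64 would always take the `UNLOCK_KERNEL` branch in this sketch, while a build for an architecture outside the guard (amd64, for example) would still select `UNLOCK_FAST_WAKE` for mutexes without a priority protocol.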