From owner-freebsd-sparc64@FreeBSD.ORG Tue May 20 15:38:02 2014
Return-Path:
Delivered-To: freebsd-sparc64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 61A98413
	for ; Tue, 20 May 2014 15:38:02 +0000 (UTC)
Received: from blaze.cs.jhu.edu (blaze.cs.jhu.edu [128.220.13.50])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(Client CN "smtp.cs.jhu.edu", Issuer "InCommon Server CA" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 312A921FC
	for ; Tue, 20 May 2014 15:38:01 +0000 (UTC)
Received: from gradx.cs.jhu.edu (gradx.cs.jhu.edu [128.220.13.52])
	by blaze.cs.jhu.edu (8.14.4/8.14.4) with ESMTP id s4KFbra7031420
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
	Tue, 20 May 2014 11:37:53 -0400
Received: from gradx.cs.jhu.edu (localhost [127.0.0.1])
	by gradx.cs.jhu.edu (8.14.8/8.14.5) with ESMTP id s4KFbrgK043931;
	Tue, 20 May 2014 11:37:53 -0400
Received: (from nwf@localhost)
	by gradx.cs.jhu.edu (8.14.8/8.14.8/Submit) id s4KFbrkl042041;
	Tue, 20 May 2014 11:37:53 -0400
Date: Tue, 20 May 2014 11:37:53 -0400
From: Nathaniel W Filardo
To: Chris Ross
Subject: Re: FreeBSD 10-STABLE/sparc64 panic
Message-ID: <20140520153753.GV24043@gradx.cs.jhu.edu>
References: <20140518083413.GK24043@gradx.cs.jhu.edu>
	<751F7778-95CE-40FC-857F-222FB37737C0@distal.com>
	<20140518235853.GM24043@gradx.cs.jhu.edu>
	<20140519145222.GN24043@gradx.cs.jhu.edu>
	<20140519193529.GO24043@gradx.cs.jhu.edu>
	<20140519205047.GP24043@gradx.cs.jhu.edu>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="l5oECiFRo5dp+2y7"
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Cc: freebsd-sparc64@freebsd.org
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Porting FreeBSD to the Sparc
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 20 May 2014 15:38:02 -0000

--l5oECiFRo5dp+2y7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Well, I am having a bad day; I tried to roll back to 9 but ended up building
a 10 kernel again (too many source checkouts, sigh).

In any case, I kicked WITNESS and INVARIANTS on this time and, while I cannot
say for sure that it is related, early in boot I am told

lock order reversal:
 1st 0xc0756998 entropy harvest mutex (entropy harvest mutex) @ /systank/src-git/sys/dev/random/random_harvestq.c:198
 2nd 0xfffff800055c7c38 uart_hwmtx (uart_hwmtx) @ /systank/src-git/sys/dev/uart/uart_cpu.h:94
KDB: stack backtrace:
_witness_debugger() at _witness_debugger+0x38
witness_checkorder() at witness_checkorder+0xea0
__mtx_lock_spin_flags() at __mtx_lock_spin_flags+0x134
uart_cnputc() at uart_cnputc+0x60
cnputc() at cnputc+0xac
putchar() at putchar+0xe8
kvprintf() at kvprintf+0x88
_vprintf() at _vprintf+0x38
vprintf() at vprintf+0x10
printf() at printf+0x20
witness_checkorder() at witness_checkorder+0xbf8
__mtx_lock_spin_flags() at __mtx_lock_spin_flags+0x134
sleepq_lock() at sleepq_lock+0x58
msleep_spin_sbt() at msleep_spin_sbt+0xb8
random_kthread() at random_kthread+0x294
fork_exit() at fork_exit+0xa4
fork_trampoline() at fork_trampoline+0x8
lock order reversal:
 1st 0xc0756998 entropy harvest mutex (entropy harvest mutex) @ /systank/src-git/sys/dev/random/random_harvestq.c:198
 2nd 0xc07b1510 sleepq chain (sleepq chain) @ /systank/src-git/sys/kern/subr_sleepqueue.c:237

The middle bit there looks like our favorite lockup:

spin lock 0xc07b3cb0 (smp rendezvous) held by 0xfffff80009116920 (tid 100335) too long
exclusive spin mutex smp rendezvous (smp rendezvous) r = 0 (0xc07b3cb0) locked @ /systank/src-git/sys/kern/subr_smp.c:497
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
vpanic() at vpanic+0x1b4
panic() at panic+0x20
_mtx_lock_spin_failed() at _mtx_lock_spin_failed+0x74
_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xb8
__mtx_lock_spin_flags() at __mtx_lock_spin_flags+0x190
tick_get_timecount_mp() at tick_get_timecount_mp+0x94
binuptime() at binuptime+0x3c
timercb() at timercb+0x6c
tick_intr() at tick_intr+0x220
-- interrupt level=0xe pil=0 %o7=0xc061d208 --
spinlock_exit() at spinlock_exit+0x2c
__mtx_unlock_spin_flags() at __mtx_unlock_spin_flags+0x138
uart_cnputc() at uart_cnputc+0xd0
cnputc() at cnputc+0xac
putchar() at putchar+0xe8
kvprintf() at kvprintf+0x890
_vprintf() at _vprintf+0x38
log() at log+0x48
do_link_state_change() at do_link_state_change+0x21c
taskqueue_run_locked() at taskqueue_run_locked+0x100
taskqueue_run() at taskqueue_run+0x64
taskqueue_swi_run() at taskqueue_swi_run+0x18
intr_event_execute_handlers() at intr_event_execute_handlers+0x154
ithread_loop() at ithread_loop+0x120
fork_exit() at fork_exit+0xa4
fork_trampoline() at fork_trampoline+0x8

So we might be seeing a deadlock between the console mutex and entropy
harvesting?

Cheers,
--nwf;

--l5oECiFRo5dp+2y7
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlN7dtEACgkQTeQabvr9Tc/9IgCfQfmqUGKgJde7Dq8hNgoRXp7q
cq0AniZcXnhXzt/ouf0bY0GWh4d8LT64
=vPGv
-----END PGP SIGNATURE-----

--l5oECiFRo5dp+2y7--
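
For readers unfamiliar with the "lock order reversal" reports quoted in the message, here is a toy illustration of the idea: remember every (held, acquired) pair that has been observed, and complain when the opposite pair turns up. This is only a sketch, not FreeBSD's WITNESS implementation. The OrderChecker class and the thread names are hypothetical, and the assumption that some other code path takes uart_hwmtx before the entropy harvest mutex is made up for the demonstration; the message does not say where the opposite order comes from.

```python
class OrderChecker:
    """Toy lock-order checker (hypothetical; not the real WITNESS code)."""

    def __init__(self):
        self.seen = set()     # (held, acquired) pairs observed so far
        self.held = {}        # thread name -> locks held, in acquisition order
        self.reversals = []   # (1st, 2nd) pairs that contradict an earlier order

    def acquire(self, thread, lock):
        stack = self.held.setdefault(thread, [])
        for earlier in stack:
            # A reversal: the opposite order was recorded previously.
            if (lock, earlier) in self.seen:
                self.reversals.append((earlier, lock))
            self.seen.add((earlier, lock))
        stack.append(lock)

    def release(self, thread, lock):
        self.held[thread].remove(lock)


checker = OrderChecker()

# ASSUMPTION: a hypothetical path "A" takes the console lock first.
checker.acquire("A", "uart_hwmtx")
checker.acquire("A", "entropy harvest mutex")
checker.release("A", "entropy harvest mutex")
checker.release("A", "uart_hwmtx")

# Path "B" mirrors the order in the report: harvest mutex held, then a
# console write needs uart_hwmtx.
checker.acquire("B", "entropy harvest mutex")
checker.acquire("B", "uart_hwmtx")

for first, second in checker.reversals:
    print("lock order reversal:")
    print(" 1st", first)
    print(" 2nd", second)
```

A reversal report by itself only says that two orders exist somewhere in the system. Whether the two paths can actually run concurrently and deadlock still has to be established separately, which is the open question the message ends on.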