Date:      Mon, 29 Sep 2014 00:00:28 -0400
From:      Chris Ross <cross+freebsd@distal.com>
To:        freebsd-sparc64@freebsd.org
Subject:   Re: FreeBSD 10-STABLE/sparc64 panic
Message-ID:  <456226AE-0712-4510-AEF5-2053F36F2181@distal.com>
In-Reply-To: <AF5EA0E6-860B-47DF-AC5E-6A45317C6092@distal.com>
References:  <20140518083413.GK24043@gradx.cs.jhu.edu> <751F7778-95CE-40FC-857F-222FB37737C0@distal.com> <20140518235853.GM24043@gradx.cs.jhu.edu> <20140519145222.GN24043@gradx.cs.jhu.edu> <A092DFEB-D5CF-473E-88BD-81B005C26C57@distal.com> <20140519193529.GO24043@gradx.cs.jhu.edu> <20140519205047.GP24043@gradx.cs.jhu.edu> <CA75738D-066D-4EDC-9018-89936EE861C6@distal.com> <AB5649B5-BBFB-4284-9CFF-4784D28A18F3@distal.com> <A9D37635-CA61-401B-BEAE-14C4F370BFD6@distal.com> <BC35853D-DA5E-4799-947C-4C64A0BC7D36@distal.com> <D9350E94-1F01-4FFD-A51E-AD8761F5C9CF@distal.com> <E48E7175-310B-4449-B3E1-2058F9E681D0@distal.com> <323A3936-DE55-459A-B8AA-CFF463922F22@distal.com> <7DD7D2DC-A265-40D6-9995-16ABAF79C1FB@distal.com> <AF5EA0E6-860B-47DF-AC5E-6A45317C6092@distal.com>


On Jun 30, 2014, at 10:40 , Chris Ross <cross+freebsd@distal.com> wrote:
> tl;dr : I've finished my testing and have a result, but see other things I
> don't understand.  Could use more help.

  Old thread, problem still exists.  Noticed in head around:

http://lists.freebsd.org/pipermail/freebsd-sparc64/2014-March/009261.html

  And in stable/10 as of revision 263676 (likely earlier).  Like numerous
other people, I have tried to narrow it down to a commit, or a small number
of commits, but the failure is sporadic.  I think looking at the current code,
which is still failing, may be most useful.

  I am right now seeing this on stable/10 code updated today, 10.1-BETA3,
r272264.  As noted earlier in these threads, I am running a Sun Fire v240.  At
least one or two other folks with v240's have seen this, and I think a variant
of SunBlade that also has bge's on it.

  Multiuser boot panics at:

Setting hostname: hostname.distal.com.
bge0: link state changed to DOWN
spin lock 0xc0c95330 (smp rendezvous) held by 0xfffff8000560a490 (tid 100347) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xc054a0d0 at _mtx_lock_spin_failed+0x50
#1 0xc054a198 at _mtx_lock_spin_cookie+0xb8
#2 0xc08b989c at tick_get_timecount_mp+0xdc
#3 0xc056c33c at binuptime+0x3c
#4 0xc08857ac at timercb+0x6c
#5 0xc08b9c00 at tick_intr+0x220
Uptime: 20s
Automatic reboot in 15 seconds - press a key on the console to abort

  In past kernels, ones more recent than March 2014, it will sometimes
boot to multiuser on the first try, but usually it crashes a few times
before eventually coming all the way up.  Given 30-40 minutes, it will
usually recover to multiuser, and is stable forever (in past testing) at
that point.  This evening, it was rebooting for about 40 minutes (11 panic
and reboot sequences), but then came up.

  I would be happy to dig into this further, but will need some advice and
instruction.  I fear I may not even have built the kernel with full debugging,
but can do so.  I'll look into that now that the machine is up again.
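
  For reference, a kernel with fuller debugging is typically built from a
config along the lines of the sketch below.  This is an illustration only,
not the config used on this machine: the file name and ident DEBUGV240 are
hypothetical, and some of these options may already be present in GENERIC
on a given branch.

```
# Hypothetical file: sys/sparc64/conf/DEBUGV240 (name is illustrative only)
include     GENERIC
ident       DEBUGV240

makeoptions DEBUG=-g            # build the kernel with debug symbols
options     KDB                 # kernel debugger framework
options     DDB                 # interactive in-kernel debugger
options     INVARIANTS          # extra run-time consistency checks
options     INVARIANT_SUPPORT   # support code required by INVARIANTS
options     WITNESS             # lock-order verification
```

  Such a config would be built with "make buildkernel KERNCONF=DEBUGV240".
Leaving out WITNESS_SKIPSPIN keeps spin mutexes under WITNESS checking,
which costs some speed but seems relevant to a spin lock panic.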

  Please let me know what I can do to help.  Thanks.

                                      - Chris





