Date: Mon, 29 Sep 2014 00:00:28 -0400
From: Chris Ross <cross+freebsd@distal.com>
To: freebsd-sparc64@freebsd.org
Subject: Re: FreeBSD 10-STABLE/sparc64 panic
Message-ID: <456226AE-0712-4510-AEF5-2053F36F2181@distal.com>
In-Reply-To: <AF5EA0E6-860B-47DF-AC5E-6A45317C6092@distal.com>
References: <20140518083413.GK24043@gradx.cs.jhu.edu>
 <751F7778-95CE-40FC-857F-222FB37737C0@distal.com>
 <20140518235853.GM24043@gradx.cs.jhu.edu>
 <20140519145222.GN24043@gradx.cs.jhu.edu>
 <A092DFEB-D5CF-473E-88BD-81B005C26C57@distal.com>
 <20140519193529.GO24043@gradx.cs.jhu.edu>
 <20140519205047.GP24043@gradx.cs.jhu.edu>
 <CA75738D-066D-4EDC-9018-89936EE861C6@distal.com>
 <AB5649B5-BBFB-4284-9CFF-4784D28A18F3@distal.com>
 <A9D37635-CA61-401B-BEAE-14C4F370BFD6@distal.com>
 <BC35853D-DA5E-4799-947C-4C64A0BC7D36@distal.com>
 <D9350E94-1F01-4FFD-A51E-AD8761F5C9CF@distal.com>
 <E48E7175-310B-4449-B3E1-2058F9E681D0@distal.com>
 <323A3936-DE55-459A-B8AA-CFF463922F22@distal.com>
 <7DD7D2DC-A265-40D6-9995-16ABAF79C1FB@distal.com>
 <AF5EA0E6-860B-47DF-AC5E-6A45317C6092@distal.com>

On Jun 30, 2014, at 10:40 , Chris Ross <cross+freebsd@distal.com> wrote:

> tl;dr : I’ve finished my testing and have a result, but see other things I
> don’t understand. Could use more help.

Old thread, problem still exists. Noticed in head around:

  http://lists.freebsd.org/pipermail/freebsd-sparc64/2014-March/009261.html

And in stable/10 as of revision 263676 (likely earlier).

Like numerous others, I have tried to narrow it down to a commit, or a small number of commits, but the failure is sporadic. I think looking at the current code, which is still failing, may be most useful. I am right now seeing this on stable/10 code updated today, 10.1-BETA3, r272264.

As noted earlier in these threads, I am running a Sun Fire v240. At least one or two other folks with v240's have seen this, and I think it has also been seen on a variant of SunBlade that has bge's on it.

Multiuser boot panics at:

Setting hostname: hostname.distal.com.
bge0: link state changed to DOWN
spin lock 0xc0c95330 (smp rendezvous) held by 0xfffff8000560a490 (tid 100347) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xc054a0d0 at _mtx_lock_spin_failed+0x50
#1 0xc054a198 at _mtx_lock_spin_cookie+0xb8
#2 0xc08b989c at tick_get_timecount_mp+0xdc
#3 0xc056c33c at binuptime+0x3c
#4 0xc08857ac at timercb+0x6c
#5 0xc08b9c00 at tick_intr+0x220
Uptime: 20s
Automatic reboot in 15 seconds - press a key on the console to abort

With past kernels more recent than March 2014, it will sometimes boot [to multiuser] on the first try, but usually it crashes a few times before eventually coming all the way up. Given 30-40 minutes, it will usually recover to multiuser, and is stable forever (in past testing) at that point. This evening, it was rebooting for about 40 minutes (11 panic and reboot sequences), but then came up.

I would be happy to dig into this further, but will need some advice and instruction. I fear I may not even have built the kernel with full debugging, but can do so. I'll look into that now that the machine is up again.

Please let me know what I can do to help. Thanks.

 - Chris
