FreeBSD Mail Archives

Date:      Wed, 1 Jul 2015 02:54:33 +0000
From:      Glen Barber <gjb@FreeBSD.org>
To:        Chris Ross <cross+freebsd@distal.com>
Cc:        freebsd-stable@freebsd.org, Kurt Lidl <lidl@pix.net>
Subject:   Re: New FreeBSD snapshots available: stable/10 (20150625 r284813)
Message-ID:  <20150701025433.GN5423@FreeBSD.org>
In-Reply-To: <29FAA191-D0E5-4127-B016-65B4AE42ABE8@distal.com>
References:  <20150626174927.GA69720@FreeBSD.org> <559330CF.2050606@pix.net> <20150701001613.GH5423@FreeBSD.org> <55934CFB.7050407@pix.net> <56A9EB91-2F97-4096-99C8-26D3EFC13D2D@distal.com> <20150701023640.GM5423@FreeBSD.org> <29FAA191-D0E5-4127-B016-65B4AE42ABE8@distal.com>


--WfGgNR53aAH+Ap/a
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jun 30, 2015 at 10:48:56PM -0400, Chris Ross wrote:
>
> On Jun 30, 2015, at 22:36 , Glen Barber <gjb@FreeBSD.org> wrote:
> > On Tue, Jun 30, 2015 at 10:27:21PM -0400, Chris Ross wrote:
> >>
> >>  Yeah, this is the same panic you, I, and others have been seeing on sparc64's
> >> with bge's, or at least v240's (and one other IIRC) for many many months.  Thanks
> >> for grabbing a core!
> >>
> >>  When I was trying to search for a commit that caused the change of behavior,
> >> I had difficulty doing it, but it was well back in 2014.  The "boots sometimes"
> >> makes this a hard one to track, but as I only have my production v240, also
> >> makes it one I haven't spent as much time trying to find as I'd like.
> >>
> >>  Thank you for letting me know this issue isn't fixed, though, despite the other
> >> success with this code.  :-)
> >>
> >>  Hopefully your stacktrace can help figure out what is wrong.
> >>
> >
> > A quick search through the PR system returned zero results for this.
> > Did you file a PR previously?  (If not, do you know of one that already
> > exists that Kurt can update?)
>
>   The "long" thread I see in my emails are with subject "FreeBSD 10-STABLE/sparc64 panic".  May/June, and then later September and October, and I don't see anyone to have created a PR.  I think I got confused and dismayed in June, from reading back, and then never got to trying hard again.
>
>   The first report I see is from Kurt, http://lists.freebsd.org/pipermail/freebsd-sparc64/2014-March/009261.html, so well over a year ago.  But, no mention in that thread about a PR either.
>

Thank you for the reference.

>   I think you may be right, Glen, that there isn't one, and that's on me as well as others.  Hopefully, some of the searching through various revisions of 10/stable I documented in the "FreeBSD 10-STABLE/sparc64 panic" thread in May 2014 may help in the end, though.
>

It's fine, it explains why I could not find one.

Kurt, could you please create a PR and point me to the PR number so RE
can put it on our watch list?

Thanks.

Glen

>   Thanks.  tl;dr; I don't know of an existing PR.
>
>                                                - Chris
>
> >>
> >> On Jun 30, 2015, at 22:14 , Kurt Lidl <lidl@pix.net> wrote:
> >>> I got all excited and decided to give it a try on my dual-cpu
> >>> V240 as well.  I was able to get it installed, but it panics
> >>> when booting off the mirrored ZFS drives.  (Note:  I have no
> >>> reason to believe this is ZFS related.)
> >>>
> >>> ---- snip, snip ----
> >>> Setting hostname: spork.pix.net.
> >>> bge0: link state changed to DOWN
> >>> spin lock 0xc0cb9e38 (smp rendezvous) held by 0xfffff80003e93240 (tid 100340) too long
> >>> timeout stopping cpus
> >>> panic: spin lock held too long
> >>> cpuid = 1
> >>> KDB: stack backtrace:
> >>> #0 0xc0575380 at panic+0x20
> >>> #1 0xc0558e10 at _mtx_lock_spin_failed+0x50
> >>> #2 0xc0558ed8 at _mtx_lock_spin_cookie+0xb8
> >>> #3 0xc08d7b9c at tick_get_timecount_mp+0xdc
> >>> #4 0xc0583c88 at binuptime+0x48
> >>> #5 0xc08a3b8c at timercb+0x6c
> >>> #6 0xc08d7f00 at tick_intr+0x220
> >>> Uptime: 29s
> >>> Dumping 8192 MB (4 chunks)
> >>> chunk at 0: 2147483648 bytes ... ok
> >>> chunk at 0x100000000: 2147483648 bytes ... ok
> >>> chunk at 0x1000000000: 2147483648 bytes ... ok
> >>> chunk at 0x1100000000: 2147483648 bytes ... ok
> >>>
> >>> Dump complete
> >>> ---- snip, snip ----
> >>>
> >>> Now the thing that amazes me is that this happened
> >>> the first three times after I did the install, and
> >>> on the fourth boot, it didn't panic.  And it was
> >>> able to 'savecore' the crashdump.
> >>>
> >>> Here's the stacktrace from the core.txt.0 file:
> >>>
> >>> -Kurt
> >>>
> >>> Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> >>> Loaded symbols for /boot/kernel/zfs.ko.symbols
> >>> Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> >>> Loaded symbols for /boot/kernel/opensolaris.ko.symbols
> >>> Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done.
> >>> Loaded symbols for /boot/kernel/geom_mirror.ko.symbols
> >>> Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
> >>> Loaded symbols for /boot/kernel/tmpfs.ko.symbols
> >>> #0  0x00000000c05745bc in doadump (textdump=<value optimized out>)
> >>>   at /usr/src/sys/kern/kern_shutdown.c:262
> >>> 262             savectx(&dumppcb);
> >>> (kgdb) #0  0x00000000c05745bc in doadump (textdump=<value optimized out>)
> >>>   at /usr/src/sys/kern/kern_shutdown.c:262
> >>> #1  0x00000000c0574fb0 in kern_reboot (howto=260)
> >>>   at /usr/src/sys/kern/kern_shutdown.c:451
> >>> #2  0x00000000c0575358 in vpanic (fmt=0xc0b22fe0 "spin lock held too long",
> >>>   ap=0x1fa2da638) at /usr/src/sys/kern/kern_shutdown.c:758
> >>> #3  0x00000000c0575388 in panic (fmt=0xc0b22fe0 "spin lock held too long")
> >>>   at /usr/src/sys/kern/kern_shutdown.c:687
> >>> #4  0x00000000c0558e18 in _mtx_lock_spin_failed (m=0xc0cb9e38)
> >>>   at /usr/src/sys/kern/kern_mutex.c:561
> >>> #5  0x00000000c0558ee0 in _mtx_lock_spin_cookie (c=0xfffff80003e93240,
> >>>   tid=18446735277669594832, opts=0, file=0x0, line=0)
> >>>   at /usr/src/sys/kern/kern_mutex.c:608
> >>> #6  0x00000000c08d7ba4 in tick_get_timecount_mp (tc=0xc0d13378) at smp.h:206
> >>> #7  0x00000000c0583c90 in binuptime (bt=0x1fa2da980)
> >>>   at /usr/src/sys/kern/kern_tc.c:188
> >>> #8  0x00000000c08a3b94 in timercb (et=0xc0d13308, arg=<value optimized out>)
> >>>   at time.h:418
> >>> #9  0x00000000c08d7f08 in tick_intr (tf=0x1fa2dab20)
> >>>   at /usr/src/sys/sparc64/sparc64/tick.c:252
> >>> #10 0x00000000c00a11bc in tl1_intr ()
> >>> #11 0x00000000c08c934c in spinlock_exit ()
> >>>   at /usr/src/sys/sparc64/sparc64/machdep.c:244
> >>> #12 0x00000000c08c9330 in spinlock_exit ()
> >>>   at /usr/src/sys/sparc64/sparc64/machdep.c:240
> >>> #13 0x00000000c051a194 in cnputs (p=0x1fa2db11a "")
> >>>   at /usr/src/sys/kern/kern_cons.c:530
> >>> #14 0x00000000c05c06e0 in putchar (c=10, arg=0x1fa2db0c8)
> >>>   at /usr/src/sys/kern/subr_prf.c:437
> >>> #15 0x00000000c05bee90 in kvprintf (fmt=0xc0b2fb95 "",
> >>>   func=0xc05c02e0 <putchar>, arg=0x1fa2db0c8, radix=10, ap=0x1fa2db300)
> >>>   at /usr/src/sys/kern/subr_prf.c:655
> >>> #16 0x00000000c05bfe80 in _vprintf (level=5, flags=1,
> >>>   fmt=0xc0b2fb78 "%s: link state changed to %s\n", ap=0x1fa2db2f0)
> >>>   at /usr/src/sys/kern/subr_prf.c:281
> >>> #17 0x00000000c05c0270 in log (level=5,
> >>>   fmt=0xc0b2fb78 "%s: link state changed to %s\n")
> >>>   at /usr/src/sys/kern/subr_prf.c:308
> >>> #18 0x00000000c064ec28 in do_link_state_change (arg=0xfffff80003396800,
> >>>   pending=1) at /usr/src/sys/net/if.c:2131
> >>> #19 0x00000000c05cab38 in taskqueue_run_locked (queue=0xfffff80003288000)
> >>>   at /usr/src/sys/kern/subr_taskqueue.c:342
> >>> #20 0x00000000c05cacec in taskqueue_run (queue=0xfffff80003288000)
> >>>   at /usr/src/sys/kern/subr_taskqueue.c:358
> >>> #21 0x00000000c05cae20 in taskqueue_swi_run (dummy=0x0)
> >>>   at /usr/src/sys/kern/subr_taskqueue.c:471
> >>> #22 0x00000000c0539cc4 in intr_event_execute_handlers (p=0xfffff80003295860,
> >>>   ie=0xfffff80003287e00) at /usr/src/sys/kern/kern_intr.c:1264
> >>> #23 0x00000000c053b86c in ithread_loop (arg=0xfffff8000324c080)
> >>>   at /usr/src/sys/kern/kern_intr.c:1277
> >>> #24 0x00000000c0536428 in fork_exit (callout=0xc053b780 <ithread_loop>,
> >>>   arg=0xfffff8000324c080, frame=0x1fa2db880)
> >>>   at /usr/src/sys/kern/kern_fork.c:1018
> >>> #25 0x00000000c00a1270 in fork_trampoline ()
> >>> #26 0x00000000c00a1270 in fork_trampoline ()
> >>> Previous frame identical to this frame (corrupt stack?)
> >>> (kgdb)
> >>>
> >>>
> >>> _______________________________________________
> >>> freebsd-stable@freebsd.org mailing list
> >>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> >>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
> >>>
> >>
> >
> >
>



--WfGgNR53aAH+Ap/a--

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150701025433.GN5423>
