Date:      Sun, 17 May 2015 16:58:31 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Ronald Klop <ronald-lists@klop.ws>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Random Kernel Panic on Dreamplug (FS related)
Message-ID:  <2D3CFA21-3DB5-41D3-9955-31E26151A54D@bsdimp.com>
In-Reply-To: <op.xysn99y5kndu52@ronaldradial.radialsg.local>
References:  <542559BC.7090100@gmail.com> <20140929040126.GG43300@funkthat.com> <54291B74.5010307@gmail.com> <20140930112937.GU43300@funkthat.com> <542A9EA4.70109@gmail.com> <20140930123010.GZ43300@funkthat.com> <542AB897.3020309@gmail.com> <1412086795.66615.363.camel@revolution.hippie.lan> <542ABE45.3020402@gmail.com> <1431814583.91685.39.camel@freebsd.org> <op.xysn99y5kndu52@ronaldradial.radialsg.local>



> On May 17, 2015, at 1:26 PM, Ronald Klop <ronald-lists@klop.ws> wrote:
>
> On Sun, 17 May 2015 00:16:23 +0200, Ian Lepore <ian@freebsd.org> wrote:
>
>> On Tue, 2014-09-30 at 16:29 +0200, Mattia Rossi wrote:
>>> On 30.09.2014 16:19, Ian Lepore wrote:
>>> > On Tue, 2014-09-30 at 16:05 +0200, Mattia Rossi wrote:
>>> >> On 30.09.2014 14:30, John-Mark Gurney wrote:
>>> >>> Mattia Rossi wrote this message on Tue, Sep 30, 2014 at 14:14 +0200:
>>> >>>> On 30.09.2014 13:29, John-Mark Gurney wrote:
>>> >>>>> Mattia Rossi wrote this message on Mon, Sep 29, 2014 at 10:42 +0200:
>>> >>>>>> On 29.09.2014 06:01, John-Mark Gurney wrote:
>>> >>>>>>> Mattia Rossi wrote this message on Fri, Sep 26, 2014 at 14:19 +0200:
>>> >>>>>>>> This might be part of the weird FFS issues the Dreamplug has and no-one
>>> >>>>>>>> knows why they're happening.
>>> >>>>>>> Are you running w/ FFS journaling?  If so, try turning it off, but
>>> >>>>>>> keeping softupdates on..
>>> >>>>>> No journaling, no softupdates. I'll try enabling softupdates next time.
>>> >>>>>> don't know if it will panic though
>>> >>>>>>>> data_abort_handler() at data_abort_handler+0x5c0
>>> >>>>>>>>            pc = 0xc0de7a28  lr = 0xc0dd711c (exception_exit)
>>> >>>>>>>>            sp = 0xde019898  fp = 0xde019a20
>>> >>>>>>>>            r4 = 0xffffffff  r5 = 0xffff1004
>>> >>>>>>>>            r6 = 0xc3f3f6c0  r7 = 0x00001000
>>> >>>>>>>>            r8 = 0xc443e880  r9 = 0x00000000
>>> >>>>>>>>           r10 = 0xc3d69000
>>> >>>>>>>> exception_exit() at exception_exit
>>> >>>>>>>>            pc = 0xc0dd711c  lr = 0xc0d53828 (ffs_truncate+0xaa8)
>>> >>>>>>>>            sp = 0xde0198e8  fp = 0xde019a20
>>> >>>>>>>>            r0 = 0xd0238120  r1 = 0x00000e60
>>> >>>>>>>>            r2 = 0x00000000  r3 = 0x00000000
>>> >>>>>>>>            r4 = 0x00000120  r5 = 0x00000000
>>> >>>>>>>>            r6 = 0xc3f3f6c0  r7 = 0x00001000
>>> >>>>>>>>            r8 = 0xc443e880  r9 = 0x00000000
>>> >>>>>>>>           r10 = 0xc3d69000 r12 = 0xd0238120
>>> >>>>>>>> memset() at memset+0x48
>>> >>>>>>>>            pc = 0xc0de521c  lr = 0xc0d53828 (ffs_truncate+0xaa8)
>>> >>>>>>>>            sp = 0xde0198e8  fp = 0xde019a20
>>> >>>>>>>> Unwind failure (no registers changed)
>>> >>>>>>> No more beyond this?   If you could run addr2line on 0xc0d53828 so
>>> >>>>>>> that we know where in ffs_truncate it's failing, that'd be very
>>> >>>>>>> nice...
>>> >>>>>> So I was trying to save the coredump in order to reboot and run
>>> >>>>>> addr2line, but that failed:
>>> >>>>>>
>>> >>>>>> Physical memory: 504 MB
>>> >>>>>> Dumping 67 MB:(da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 01 d5 1f 20 00 00 01 00
>>> >>>>>> (da0:umass-sim0:0:0:0): CAM status: Resource Unavailable
>>> >>>>>> (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
>>> >>>>>> Aborting dump due to I/O error.
>>> >>>>>>
>>> >>>>>> ** DUMP FAILED (ERROR 5) **
>>> >>>>>>
>>> >>>>>> So I guess this error is related to the CAM errors I'm getting from time
>>> >>>>>> to time. I was hoping that those errors were related to the INVARIANTS
>>> >>>>>> option that slowed down the system and thus might have triggered CAM
>>> >>>>>> errors, but obviously the SD Card seems to be the real issue here.
>>> >>>>>> So no crashdump for further analysis.
>>> >>>>> That's fine.. w/ the addr2line we have some lines to explore...
>>> >>>>>
>>> >>>>>> Interestingly the CAM errors didn't show up on the terminal as other
>>> >>>>>> times, the kernel just panicked straight away.
>>> >>>>> Hmm.. that is odd.. someone who knows the SD card layer should look
>>> >>>>> at this part...  It could be that the SD card driver doesn't handle
>>> >>>>> dumping (there is this global flag that gets set) properly and the driver
>>> >>>>> needs to behave differently when it's set...
>>> >>>> I also need to grab a new SD card, just to make sure it's really not the
>>> >>>> card.
>>> >>>>
>>> >>>>>> But I've got the addr2line output, even though I'm not sure it makes any
>>> >>>>>> difference:
>>> >>>>>>
>>> >>>>>> addr2line -f -e /mnt/kernel.debug 0xc0d53828
>>> >>>>>>
>>> >>>>>> ffs_truncate
>>> >>>>>> /usr/devel/dreamplug/sys/ufs/ffs/ffs_inode.c:321
>>> >>>>> can you give me the contents of the line?  and a few lines of context
>>> >>>>> around it?   In HEAD's source, this is DOINGASYNC, and there is no call
>>> >>>>> to memset, nor a variable assignment that would result in memset being
>>> >>>>> called...
>>> >>>> Same here.. The file hasn't been changed in a while (Fri, 31 May 2013):
>>> >>>>
>>> >>>>                   ip->i_size = length;
>>> >>>>                   DIP_SET(ip, i_size, length);
>>> >>>>                   if (bp->b_bufsize == fs->fs_bsize)
>>> >>>>                           bp->b_flags |= B_CLUSTEROK;
>>> >>>>                   if (flags & IO_SYNC)
>>> >>>>                           bwrite(bp);
>>> >>>> 321:        else if (DOINGASYNC(vp))
>>> >>>>                           bdwrite(bp);
>>> >>>>                   else
>>> >>>>                           bawrite(bp);
>>> >>>>                   ip->i_flag |= IN_CHANGE | IN_UPDATE;
>>> >>>>                   return (ffs_update(vp, !DOINGASYNC(vp)));
>>> >>>>
>>> >>>> No idea what's going on.
>>> >>> ok, could you send me the output of objdump -dSl, but you only need
>>> >>> to include the part from XXXXX <ffs_truncate>: to the next XXX<func>:
>>> >>> line...  probably off list as it'll be quite long...
>>> >> I'm sorry, but given that I just broke all my working worlds using fsck,
>>> >> I'm not going to be able to do that until I'm back from holidays....
>>> >> currently working on the stuff remotely and after today's work day, I'm
>>> >> not going to be able to get my hands on the dreamplug.
>>> >>
>>> >>
>>> > BTW, for anyone playing with this problem, step one is to edit
>>> > your /etc/fstab and set the fsck pass number to 0 for all filesystems.
>>> > There's a risk of filesystem corruption after a crash, but it's smaller
>>> > than the 100% corruption rate of letting fsck run. :)
>>> >
>>> Of course! Great idea :-) Sometimes just can't think of the right tweak
>>> to save a lot of pain...
>>>
>>> Anyhow, I just found out, that I was rebooting the dreamplug from the sd
>>> card instead of the usb stick the whole time, and the usb stick hasn't
>>> been damaged enough by fsck, so it actually booted :-) I'll send the
>>> objdump soon.
>>
>> A (very) late update on this.... It looks like we may have tracked the
>> change that started all this down to the introduction of unmapped IO,
>> almost 2 years ago now.  I still can't find the root cause, but I think
>> disabling unmapped IO on armv4/5 is a viable workaround, which Warner
>> committed this morning as r283014.
>>
>> --Ian
>
>
> This sounds promising for the use of my Sheevaplugs.
> I will try this soon. Thanks.

I plan on MFCing this change to 10 on Monday or Tuesday after fixing some
typos.

Warner
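
For anyone picking this thread up later: the fstab workaround Ian describes
in the quoted text (set the fsck pass number to 0 for all filesystems) could
be scripted roughly as below. This is only a sketch under assumptions. The
function name zero_fsck_pass and the sample device entries are made up for
illustration, and it assumes the usual six-field fstab layout where the pass
number is the last field. It works on text only and does not touch
/etc/fstab itself.

```python
def zero_fsck_pass(fstab_text):
    """Return fstab_text with the fsck pass number (6th field) set to 0.

    Hypothetical helper: assumes whitespace-separated entries in the order
    device, mountpoint, fstype, options, dump, pass.
    """
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        # Leave blank lines, comments, and short or malformed entries alone.
        if not stripped or stripped.startswith("#") or len(fields) < 6:
            out.append(line)
            continue
        fields[5] = "0"
        # Rewritten entries are re-joined with tabs.
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Sample entries are invented for the example, not taken from the thread.
    sample = (
        "# Device\tMountpoint\tFStype\tOptions\tDump\tPass#\n"
        "/dev/da0s1a\t/\tufs\trw\t1\t1\n"
        "/dev/da0s1d\t/usr\tufs\trw\t2\t2\n"
    )
    print(zero_fsck_pass(sample), end="")
```

Review the output before copying it over the real file, and keep a backup of
the original fstab.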




