Date: Sun, 17 May 2015 21:26:23 +0200
From: "Ronald Klop" <ronald-lists@klop.ws>
To: freebsd-arm@freebsd.org
Subject: Re: Random Kernel Panic on Dreamplug (FS related)
Message-ID: <op.xysn99y5kndu52@ronaldradial.radialsg.local>
In-Reply-To: <1431814583.91685.39.camel@freebsd.org>
References: <542559BC.7090100@gmail.com>
 <20140929040126.GG43300@funkthat.com>
 <54291B74.5010307@gmail.com>
 <20140930112937.GU43300@funkthat.com>
 <542A9EA4.70109@gmail.com>
 <20140930123010.GZ43300@funkthat.com>
 <542AB897.3020309@gmail.com>
 <1412086795.66615.363.camel@revolution.hippie.lan>
 <542ABE45.3020402@gmail.com>
 <1431814583.91685.39.camel@freebsd.org>
On Sun, 17 May 2015 00:16:23 +0200, Ian Lepore <ian@freebsd.org> wrote:

> On Tue, 2014-09-30 at 16:29 +0200, Mattia Rossi wrote:
>> On 30.09.2014 16:19, Ian Lepore wrote:
>> > On Tue, 2014-09-30 at 16:05 +0200, Mattia Rossi wrote:
>> >> On 30.09.2014 14:30, John-Mark Gurney wrote:
>> >>> Mattia Rossi wrote this message on Tue, Sep 30, 2014 at 14:14 +0200:
>> >>>> On 30.09.2014 13:29, John-Mark Gurney wrote:
>> >>>>> Mattia Rossi wrote this message on Mon, Sep 29, 2014 at 10:42 +0200:
>> >>>>>> On 29.09.2014 06:01, John-Mark Gurney wrote:
>> >>>>>>> Mattia Rossi wrote this message on Fri, Sep 26, 2014 at 14:19 +0200:
>> >>>>>>>> This might be part of the weird FFS issues the Dreamplug has and no-one
>> >>>>>>>> knows why they're happening.
>> >>>>>>> Are you running w/ FFS journaling? If so, try turning it off, but
>> >>>>>>> keeping softupdates on..
>> >>>>>> No journaling, no softupdates. I'll try enabling softupdates next time.
>> >>>>>> don't know if it will panic though
>> >>>>>>>> data_abort_handler() at data_abort_handler+0x5c0
>> >>>>>>>>     pc = 0xc0de7a28 lr = 0xc0dd711c (exception_exit)
>> >>>>>>>>     sp = 0xde019898 fp = 0xde019a20
>> >>>>>>>>     r4 = 0xffffffff r5 = 0xffff1004
>> >>>>>>>>     r6 = 0xc3f3f6c0 r7 = 0x00001000
>> >>>>>>>>     r8 = 0xc443e880 r9 = 0x00000000
>> >>>>>>>>     r10 = 0xc3d69000
>> >>>>>>>> exception_exit() at exception_exit
>> >>>>>>>>     pc = 0xc0dd711c lr = 0xc0d53828 (ffs_truncate+0xaa8)
>> >>>>>>>>     sp = 0xde0198e8 fp = 0xde019a20
>> >>>>>>>>     r0 = 0xd0238120 r1 = 0x00000e60
>> >>>>>>>>     r2 = 0x00000000 r3 = 0x00000000
>> >>>>>>>>     r4 = 0x00000120 r5 = 0x00000000
>> >>>>>>>>     r6 = 0xc3f3f6c0 r7 = 0x00001000
>> >>>>>>>>     r8 = 0xc443e880 r9 = 0x00000000
>> >>>>>>>>     r10 = 0xc3d69000 r12 = 0xd0238120
>> >>>>>>>> memset() at memset+0x48
>> >>>>>>>>     pc = 0xc0de521c lr = 0xc0d53828 (ffs_truncate+0xaa8)
>> >>>>>>>>     sp = 0xde0198e8 fp = 0xde019a20
>> >>>>>>>> Unwind failure (no registers changed)
>> >>>>>>> No more beyond this? If you could run addr2line on 0xc0d53828 so
>> >>>>>>> that we know where in ffs_truncate it's failing, that'd be very
>> >>>>>>> nice...
>> >>>>>> So I was trying to save the coredump in order to reboot and run
>> >>>>>> addr2line, but that failed:
>> >>>>>>
>> >>>>>> Physical memory: 504 MB
>> >>>>>> Dumping 67 MB:(da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 01 d5 1f20 00 00 01 00
>> >>>>>> (da0:umass-sim0:0:0:0): CAM status: Resource Unavailable
>> >>>>>> (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
>> >>>>>> Aborting dump due to I/O error.
>> >>>>>>
>> >>>>>> ** DUMP FAILED (ERROR 5) **
>> >>>>>>
>> >>>>>> So I guess this error is related to the CAM errors I'm getting from time
>> >>>>>> to time. I was hoping that those errors were related to the INVARIANTS
>> >>>>>> option that slowed down the system and thus might have triggered CAM
>> >>>>>> errors, but obviously the SD Card seems to be the real issue here.
>> >>>>>> So no crashdump for further analysis.
>> >>>>> That's fine.. w/ the addr2line we have some lines to explore...
>> >>>>>
>> >>>>>> Interestingly the CAM errors didn't show up on the terminal as other
>> >>>>>> times, the kernel just panicked straight away.
>> >>>>> Hmm.. that is odd.. someone who knows the SD card layer should look
>> >>>>> at this part... It could be that the SD card driver doesn't handle
>> >>>>> dumping (there is this global flag that gets set) properly and the driver
>> >>>>> needs to behave differently when it's set...
>> >>>> I also need to grab a new SD card, just to make sure it's really not the
>> >>>> card.
>> >>>>
>> >>>>>> But I've got the addr2line output, even though I'm not sure it makes any
>> >>>>>> difference:
>> >>>>>>
>> >>>>>> addr2line -f -e /mnt/kernel.debug 0xc0d53828
>> >>>>>>
>> >>>>>> ffs_truncate
>> >>>>>> /usr/devel/dreamplug/sys/ufs/ffs/ffs_inode.c:321
>> >>>>> can you give me the contents of the line? and a few lines of context
>> >>>>> around it? In HEAD's source, this is DOINGASYNC, and there is no call
>> >>>>> to memset, nor a variable assignment that would result in memset being
>> >>>>> called...
>> >>>> Same here.. The file hasn't been changed in a while (Fri, 31 May 2013):
>> >>>>
>> >>>>      ip->i_size = length;
>> >>>>      DIP_SET(ip, i_size, length);
>> >>>>      if (bp->b_bufsize == fs->fs_bsize)
>> >>>>          bp->b_flags |= B_CLUSTEROK;
>> >>>>      if (flags & IO_SYNC)
>> >>>>          bwrite(bp);
>> >>>> 321: else if (DOINGASYNC(vp))
>> >>>>          bdwrite(bp);
>> >>>>      else
>> >>>>          bawrite(bp);
>> >>>>      ip->i_flag |= IN_CHANGE | IN_UPDATE;
>> >>>>      return (ffs_update(vp, !DOINGASYNC(vp)));
>> >>>>
>> >>>> No idea what's going on.
>> >>> ok, could you send me the output of objdump -dSl, but you only need
>> >>> to include the part from XXXXX <ffs_truncate>: to the next XXX<func>:
>> >>> line... probably off list as it'll be quite long...
>> >> I'm sorry, but given that I just broke all my working worlds using fsck,
>> >> I'm not going to be able to do that until I'm back from holidays....
>> >> currently working on the stuff remotely and after today's work day, I'm
>> >> not going to be able to get my hands on the dreamplug.
>> >>
>> >>
>> > BTW, for anyone playing with this problem, step one is to edit
>> > your /etc/fstab and set the fsck pass number to 0 for all filesystems.
>> > There's a risk of filesystem corruption after a crash, but it's smaller
>> > than the 100% corruption rate of letting fsck run. :)
>> >
>> Of course! Great idea :-) Sometimes just can't think of the right tweak
>> to save a lot of pain...
>>
>> Anyhow, I just found out, that I was rebooting the dreamplug from the sd
>> card instead of the usb stick the whole time, and the usb stick hasn't
>> been damaged enough by fsck, so it actually booted :-) I'll send the
>> objdump soon.
>
> A (very) late update on this.... It looks like we may have tracked the
> change that started all this down to the introduction of unmapped IO,
> almost 2 years ago now. I still can't find the root cause, but I think
> disabling unmapped IO on armv4/5 is a viable workaround, which Warner
> committed this morning as r283014.
>
> --Ian

This sounds promising for the use of my Sheevaplugs. I will try this soon.

Thanks.
Ronald.
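
Editorial sketch, not part of the original message: the fstab tweak Ian describes in the quoted text (set the fsck pass number to 0 for every filesystem) could be applied roughly as below. It assumes the usual six-field fstab layout (device, mountpoint, fstype, options, dump, pass). The function name `zero_fsck_pass` and the sample device `/dev/da0s1a` are hypothetical and chosen only for illustration. It works on a string rather than editing /etc/fstab in place, so the result can be inspected before anything is written.

```python
def zero_fsck_pass(fstab_text: str) -> str:
    """Return fstab text with the fsck pass field (6th column) set to 0.

    Assumption: each entry has exactly six whitespace-separated fields.
    Comments, blank lines, and lines with any other field count are
    passed through unchanged.
    """
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        if not stripped or stripped.startswith("#") or len(fields) != 6:
            out.append(line)
            continue
        fields[5] = "0"  # pass number 0: fsck skips this filesystem at boot
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical sample entry, not taken from the thread.
    sample = (
        "# Device\tMountpoint\tFStype\tOptions\tDump\tPass#\n"
        "/dev/da0s1a\t/\tufs\trw\t1\t1\n"
    )
    print(zero_fsck_pass(sample), end="")
```

As Ian notes, skipping fsck trades one risk for another: a filesystem left dirty by a crash is no longer checked at boot.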