Date: Tue, 12 May 2015 12:25:40 -0400
From: Paul Mather <paul@gromit.dlib.vt.edu>
To: Ian Lepore <ian@freebsd.org>
Cc: Ralf Wenk <iz-rpi03@hs-karlsruhe.de>, freebsd-arm@freebsd.org
Subject: Re: state of FreeBSD ARM (less stable than 6 months ago)
Message-ID: <9F66E210-24E2-42C4-BEF9-5234F56686BA@gromit.dlib.vt.edu>
In-Reply-To: <1431438508.6170.258.camel@freebsd.org>
References: <5550C252.6030001@foxvalley.net> <1431357226.2428197.265704673.6A544F74@webmail.messagingengine.com> <555177D9.8080001@foxvalley.net> <E1Ys3w8-003kLQ-NY@smtp.hs-karlsruhe.de> <1431438508.6170.258.camel@freebsd.org>
On May 12, 2015, at 9:48 AM, Ian Lepore <ian@freebsd.org> wrote:

> On Tue, 2015-05-12 at 08:45 +0200, Ralf Wenk wrote:
>> On Mon, 11 May 2015, at 21:47:37, Dan Raymond wrote:
>>> On 5/11/2015 9:13 AM, Mark Felder wrote:
>>>> On Mon, May 11, 2015, at 09:53, Dan Raymond wrote:
>>>>> I've been running an email and web server using FreeBSD 11 on a
>>>>> Raspberry Pi B+ since November. It has crashed 3 times since then
>>>>> (roughly every two months). I'm currently running r277334. I thought
>>>>> I'd try the latest build to see if stability has improved. I purchased a
>>>>> Raspberry Pi 2 and used the latest crochet to built r282738. No
>>>>> problems building it and it booted up fine. However, it crashes about
>>>>> an hour into building some ports I use for my server (nginx, php,
>>>>> etc.). I tried twice last night and it crashed both times. Is anybody
>>>>> looking into these stability issues?
>>>>>
>>>> RPi2 support is something like less than a week old for SMP and DMA
>>>> transport. I'm not sure more than a handful of people have actually
>>>> tried it yet. The bugs here will be worked out in time, but if you have
>>>> any core dumps or info that can assist in tracking down issues you're
>>>> experiencing that would certainly be appreciated.
>>>>
>>>
>>> These panics always seem to be mmcsd related. I doubt it has anything
>>> to do with RPi2 or SMP.
>>>
>>> sdhci_bcm0-slot0: Controller timeout
>>> sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
>>> sdhci_bcm0-slot0: Sys addr: 0x4d295a00 | Version: 0x00009902
>>> sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt: 0x00000020
>>> sdhci_bcm0-slot0: Argument: 0x002d19c0 | Trn mode: 0x0000193a
>>> sdhci_bcm0-slot0: Present: 0x01ff0506 | Host ctl: 0x00000003
>>> sdhci_bcm0-slot0: Power: 0x0000000f | Blk gap: 0x00000000
>>> sdhci_bcm0-slot0: Wake-up: 0x00000000 | Clock: 0x00000507
>>> sdhci_bcm0-slot0: Timeout: 0x0000000e | Int stat: 0x00000010
>>> sdhci_b
>>>
>>>
>>>
>>> mmcsd0: Error indicated: 1 Timeout
>>> g_vfs_done():mmcsd0s2a[WRITE(offset=1460830208, length=24576)]error = 5
>>> panic: No b_bufobj 0xd767ca00
>>> cpuid = 1
>>> KDB: enter: panic
>>> [ thread pid 12 tid 100013 ]
>>> Stopped at $d.7: ldrb r15, [r15, r15, ror r15]!
>>> db>
>>
>> I see such panics every two to three months. They happen on a RPi B
>> and RPi B+ as well. I have tried different the SD-Cards on the B and
>> the B+ of course. So I think it is not related to SD-card, manufacturer
>> or RPi board.
>>
>> Usually they happen in the middle of the night when syslogd(8) tries to
>> write something. I have never seen them happen when the RPi has some work
>> to do, e.g. is compiling a port.
>>
>> Continuing out of the debugger prints the usual messages, but on reboot
>> the RPi freeze. Only a power cycle will get it back to operating.
>>
>> Very often after such a panic happened my RPi gets "unstable" and panics
>> within the next 48 hours again with the same cause. I found out that, if
>> that happened and I force an fsck ignoring the journal there will be some
>> minor issue fixed and the RPi is stable again. For the next 2 or 3 months.
>>
>>
>> Ralf
>
> IMO, the moral of that story is: Never use softupdates with journaling
> enabled.
> For years there have been reports on the mailing lists of fsck
> failures when journaling is enabled (not arm-specific). Sometimes a few
> months goes by without a report and you wonder if it got fixed with some
> checkin you didn't notice, then the reports crop up again. My
> conclusion is that journalling has never really worked right.
>
> The only advantage of journaling is to speed up fsck on huge
> filesystems. An sdcard with a handful of GB isn't huge.

It would be really nice if the default for ARM images was not to enable
soft updates journalling on the root file system. I've found it to
cause problems to the point that the first thing I do with a
newly-installed FreeBSD/arm image nowadays is to "tunefs -j disable" on
the SD card root file system. (I still use soft updates, which I don't
find to be a problem.)

(BTW, I do have sympathy for the point of view that says, "But if we
turn it off, and people aren't using it, how will we ever test it/fix
it?...")

Cheers,

Paul.
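
For readers who want to try the "tunefs -j disable" step mentioned above, here is a minimal sketch of what it might look like. It rests on a few assumptions that are not stated in the message: the device name /dev/mmcsd0s2a is simply the root partition that appears in the quoted panic log, and the DEV and DRY_RUN variables and the run and disable_suj helpers are hypothetical names introduced only for this illustration. By default the script only prints the commands. Since tunefs(8) cannot modify a file system that is mounted read-write, actually running them would normally mean booting to single-user mode (root still read-only) or working on the card from another machine.

```shell
#!/bin/sh
# Sketch only: print (or, with DRY_RUN=0, execute) the tunefs steps for
# turning off soft updates journaling while leaving soft updates enabled.
# ASSUMPTION: /dev/mmcsd0s2a is taken from the panic log in this thread;
# substitute the root partition of your own SD card.
DEV="${DEV:-/dev/mmcsd0s2a}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: show the flags, disable journaling, show them again.
disable_suj() {
    run tunefs -p "$1"           # print current file system settings
    run tunefs -j disable "$1"   # disable journaling only; -n (soft updates) is untouched
    run tunefs -p "$1"           # confirm the change
}

disable_suj "$DEV"
```

Run as-is, the script just lists the three tunefs invocations so they can be reviewed first; setting DRY_RUN=0 (as root, on an unmounted or read-only file system) would execute them.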