Date:      Sat, 28 Sep 2019 15:45:43 -0400
From:      Mike Tancsa <mike@sentex.net>
To:        Ian Lepore <ian@freebsd.org>, Warner Losh <imp@bsdimp.com>
Cc:        freebsd-embedded <freebsd-embedded@freebsd.org>
Subject:   Re: watchdogd stat location
Message-ID:  <ded95b5e-7f5d-6752-c8f9-9b3e26fc8d52@sentex.net>
In-Reply-To: <817c7ed712d6b7da3015b7312be485a9044b14e1.camel@freebsd.org>
References:  <5eba25eb-9ba4-0c93-27c8-e834491298ad@sentex.net> <CANCZdfp6bym5b6eFXFH0MxjYsAX%2B1Bi9fGXgp7sFM206zmsveQ@mail.gmail.com> <CAJ1Oi8FsG=nEBXdd0CS3U2zZSgh=SMcBO0ieY-KT5b1iDVFmJg@mail.gmail.com> <83831ae6-9275-4f0c-a23d-c9cca3dc28f4@sentex.net> <CANCZdfrRh7Ssf9vSSJ4Hopec1q7abLi9AdUqoPqZm4hPok6QUQ@mail.gmail.com> <fcdd9659-d7e4-c554-b501-6b8cd178f6d7@sentex.net> <CANCZdfpq424LcV04dBJfoid_KSSdYWGfq2StDCToxDzZXnAvfg@mail.gmail.com> <817c7ed712d6b7da3015b7312be485a9044b14e1.camel@freebsd.org>

On 9/28/2019 3:30 PM, Ian Lepore wrote:
> If we want to be sure to force physical IO, how about dd if=/
> of=/dev/null count=1 ?
>
> But I question the premise of forcing physical IO as being somehow a
> better indicator of a non-hung system.  I think it's just a better
> indicator of the sdcard problem that Mike is experiencing.  For anyone
> else, forcing periodic physical IO is going to do annoying things like
> spin up idle drives.


I think in my case I am going to need to do that.  I was hoping that a
simple stat on / or /boot would do the trick to recover from errors like

mmcsd0: Error indicated: 1 Timeout
g_vfs_done():mmcsd0s1a[READ(offset=267358208, length=4096)]error = 5
vnode_pager_generic_getpages_done: I/O read error 5
vm_fault: pager read error, pid 1 (init)
sdhci_pci0-slot0: Got AutoCMD12 error 0x0001, but there is no active command.
sdhci_pci0-slot0: ============== REGISTER DUMP ==============
sdhci_pci0-slot0: Sys addr: 0x74ee0000 | Version:  0x00001001
sdhci_pci0-slot0: Blk size: 0x00005200 | Blk cnt:  0x00000008
sdhci_pci0-slot0: Argument: 0x0007f817 | Trn mode: 0x00000037
sdhci_pci0-slot0: Present:  0x01ff0000 | Host ctl: 0x00000007
sdhci_pci0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
sdhci_pci0-slot0: Wake-up:  0x00000000 | Clock:    0x00000007
sdhci_pci0-slot0: Timeout:  0x0000000d | Int stat: 0x00000000
sdhci_pci0-slot0: Int enab: 0x01ff00fb | Sig enab: 0x01ff00fb
sdhci_pci0-slot0: AC12 err: 0x00000001 | Host ctl2:0x00000080
sdhci_pci0-slot0: Caps:     0x21fe32b2 | Caps2:    0x00000070
sdhci_pci0-slot0: Max curr: 0x00c80064 | ADMA err: 0x00000000
sdhci_pci0-slot0: ADMA addr:0x00000000 | Slot int: 0x000000ff
sdhci_pci0-slot0: ===========================================
g_vfs_done():mmcsd0s1a[READ(offset=267358208, length=4096)]error = 5
vnode_pager_generic_getpages_done: I/O read error 5

but it looks like no dice, at least in the one case I hit over the
weekend.  However, from the captured logs, I am not sure whether
watchdogd really got armed or not.

I think doing an actual raw read is the way to go, but putting that in
watchdogd itself feels like it would violate POLA.  Instead, I will make
it an external command, since that will meet my needs, or even roll my
own watchdogd, which might be even better for me.
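
For illustration only, here is a rough sketch of what such an external
check might look like.  It is not what I am actually running.  The
device path /dev/mmcsd0 and the 4096-byte read size are assumptions
taken from the log above, and the names raw_read_ok and main are
hypothetical.  The idea is to read one block straight from the device
node and report success or failure through the exit status.

```python
"""Hypothetical watchdog check: read one block straight from a device node."""
import os

DEFAULT_DEVICE = "/dev/mmcsd0"  # assumption: device name taken from the log
BLOCK_SIZE = 4096               # assumption: matches length=4096 in the log


def raw_read_ok(path, size=BLOCK_SIZE):
    """Return True if exactly `size` bytes can be read from the start of `path`."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        # Device node missing or not openable: treat as a failed check.
        return False
    try:
        return len(os.read(fd, size)) == size
    except OSError:
        # An I/O error (such as error 5 in the log) ends up here.
        return False
    finally:
        os.close(fd)


def main(argv):
    """Return 0 if the read succeeded, 1 otherwise (usable as an exit status)."""
    device = argv[1] if len(argv) > 1 else DEFAULT_DEVICE
    return 0 if raw_read_ok(device) else 1

# When installed as a script, finish the file with:
#   import sys; sys.exit(main(sys.argv))
```

Saved as a script (with the final sys.exit line added), something like
this could be passed to watchdogd via its -e option, so that a nonzero
exit status keeps the watchdog from being patted.  Whether a read that
hangs instead of failing is handled sensibly depends on the timeouts
watchdogd is started with, so that part would need testing on the real
hardware.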

    ---Mike

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   



