Date:      Mon, 20 Feb 2023 09:07:27 -0800
From:      bob prohaska <fbsd@www.zefox.net>
To:        John F Carr <jfc@mit.edu>
Cc:        Mark Millard <marklmi@yahoo.com>, "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Subject:   Re: fsck segfaults on rpi3 running 13-stable (and on 14-CURRENT analyzing the same file system that resulted from the 13-STABLE crash)
Message-ID:  <20230220170727.GC57936@www.zefox.net>
In-Reply-To: <1DB17CD4-63B5-4FA2-ADC6-6ED817A09CCB@mit.edu>
References:  <202302192054.31JKsq7w079295@chez.mckusick.com> <3DD8EEC2-6135-42A0-A80C-F195CAAC025E@yahoo.com> <20230219222328.GA55941@www.zefox.net> <2F5B20E9-AFF8-42F6-9E1F-50BBDF4E1B79@yahoo.com> <20230220044544.GB57936@www.zefox.net> <9CEF4E7A-2F13-454F-A04A-A6C5A80FD4B7@yahoo.com> <268392B4-58FE-49EE-9B1D-6DA632757DFA@yahoo.com> <1DB17CD4-63B5-4FA2-ADC6-6ED817A09CCB@mit.edu>

On Mon, Feb 20, 2023 at 11:47:30AM +0000, John F Carr wrote:
> 
> But we have an address from the SCSI command: READ(10). CDB: 28 00 43 29 d6 40 00 00 40 00 
> 
> Decoded that says read, starting block 0x4329d640, length 0x40 blocks.  If block size is 512 bytes that is about half a terabyte into the disk.
> 
> This shell command should replicate the read:
> 
> # dd if=/dev/da0 of=/dev/null bs=32768 count=1 skip=17606489
> 
> The device name (if=) comes from the error message "da0:umass-sim0:0:0:0".  The block size (bs=) matches the read request in the failed SCSI command.  The skip count is 0x4329d640 (disk block) / 64 (number of disk blocks per dd block).
> 
> If you reproduce the error with dd you can try a binary search over the 64 block range until you find the block that failed.
> 
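
For anyone following along, the arithmetic behind those dd parameters can be checked with a short sh sketch. The function names decode_read10 and dd_params are made up for illustration, and the 512-byte sector size is the same assumption John states above. It relies only on the standard READ(10) layout: opcode 0x28 in byte 0, a big-endian LBA in bytes 2-5, and a big-endian transfer length in bytes 7-8.

```shell
#!/bin/sh
# Decode a READ(10) CDB given as ten space-separated hex bytes.
# Prints "<lba> <length-in-blocks>" in decimal.
decode_read10() {
    set -- $1                       # split the CDB string into $1..$10
    [ "$#" -eq 10 ] || return 1     # READ(10) is always ten bytes
    [ "$1" = "28" ] || return 1     # opcode must be READ(10)
    lba=$(( 0x$3$4$5$6 ))           # bytes 2-5: logical block address
    len=$(( 0x$8$9 ))               # bytes 7-8: transfer length in blocks
    echo "$lba $len"
}

# Turn an LBA and length into dd arguments that issue one read of the
# same size at the same place. The sector size (default 512) is assumed.
dd_params() {
    lba=$1 len=$2 sector=${3:-512}
    # skip= counts in units of bs, so the LBA must be a multiple of len
    [ $(( lba % len )) -eq 0 ] || return 1
    echo "bs=$(( len * sector )) count=1 skip=$(( lba / len ))"
}

set -- $(decode_read10 "28 00 43 29 d6 40 00 00 40 00")
echo "lba=$1 blocks=$2 byte_offset=$(( $1 * 512 ))"
# -> lba=1126815296 blocks=64 byte_offset=576929431552
dd_params "$1" "$2"
# -> bs=32768 count=1 skip=17606489
```

The byte offset of roughly 577 GB agrees with "about half a terabyte", and the bs and skip values match the dd command quoted above.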

I can't reproduce the error using dd:

root@pelorus:/usr/ports/sysutils/smartmontools # dd if=/dev/da0 of=/dev/null bs=32768 count=1 skip=17606489
1+0 records in
1+0 records out
32768 bytes transferred in 0.004245 secs (7719010 bytes/sec)
root@pelorus:/usr/ports/sysutils/smartmontools # dd if=/dev/da0 of=/dev/null bs=32768 count=1 skip=17606489
1+0 records in
1+0 records out
32768 bytes transferred in 0.004151 secs (7893526 bytes/sec)
root@pelorus:/usr/ports/sysutils/smartmontools # dd if=/dev/da0 of=/dev/null bs=32768 count=1 skip=17606489
1+0 records in
1+0 records out
32768 bytes transferred in 0.004139 secs (7917764 bytes/sec)
root@pelorus:/usr/ports/sysutils/smartmontools # dd if=/dev/da0 of=/dev/null bs=32768 count=1 skip=17606489
1+0 records in
1+0 records out
32768 bytes transferred in 0.004070 secs (8052034 bytes/sec)
root@pelorus:/usr/ports/sysutils/smartmontools # dd if=/dev/da0 of=/dev/null bs=32768 count=1 skip=17606489
1+0 records in
1+0 records out
32768 bytes transferred in 0.004032 secs (8126081 bytes/sec)
root@pelorus:/usr/ports/sysutils/smartmontools # 

Not a peep from the console either. 
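
Since the 32 KB read succeeds as a whole there is nothing to bisect yet, but if the error does come back, the 64 sectors could be read one at a time. The sketch below is a plain linear sweep rather than the binary search John suggested. The function name sector_sweep is hypothetical, the 512-byte sector size is an assumption, and it only prints the dd commands instead of running them.

```shell
#!/bin/sh
# Print one dd command per sector in [start, start+count).
# Nothing is executed here; pipe the output to sh (as root) to do the reads.
sector_sweep() {
    dev=$1 start=$2 count=$3 sector=${4:-512}   # sector size is assumed
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "dd if=$dev of=/dev/null bs=$sector count=1 skip=$(( start + i ))"
        i=$(( i + 1 ))
    done
}

# First command of the sweep over the 64 blocks named in the failed READ(10)
sector_sweep /dev/da0 $(( 0x4329d640 )) 64 | head -n 1
# -> dd if=/dev/da0 of=/dev/null bs=512 count=1 skip=1126815296
```

With bs equal to the sector size, skip= is simply the disk block number, so a failing command would point straight at the bad sector.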

This is a $50 disk from Amazon, so Mark's skepticism isn't entirely unwarranted.

Trying the SMART self tests is a bit confusing:
root@pelorus:/usr/ports/sysutils/smartmontools # smartctl -t short /dev/da0
smartctl 7.3 2022-02-28 r5338 [FreeBSD 13.2-PRERELEASE arm64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
[I didn't ask for an off-line test] 
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Mon Feb 20 08:39:21 2023 PST
Use smartctl -X to abort test.
root@pelorus:/usr/ports/sysutils/smartmontools # 

The shell prompt returned immediately, and smartctl -a indicates
"Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run."
A long test asks me to wait 166 minutes, also returning a prompt immediately.
There's no activity on the disk LED, but since the tests are internal
maybe that's normal. I expected some sort of complaint that the test
couldn't be done on a mounted and booted disk, but we'll see. 
For the moment I'm skeptical any testing is being done, as there's no
audible activity from the disk. I'll move the disk to another Pi later 
today unless better ideas emerge from this discussion. 
 
Thanks for writing!

bob prohaska


