Date:      Mon, 22 Jun 2020 01:08:41 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Subject:   Re: USB [USB3 and USB2] problems when using UEFi v1.16 to boot RPi4: notes as I explore
Message-ID:  <CF81584E-75CE-4BFC-8ACC-AB95E561B28D@yahoo.com>
In-Reply-To: <B1FF8DD3-DFD1-4973-B0D2-6AC33BCAA59C@yahoo.com>
References:  <476DD0F0-2286-4B2C-8E44-4404AF17F5A8@yahoo.com> <B1FF8DD3-DFD1-4973-B0D2-6AC33BCAA59C@yahoo.com>



On 2020-Jun-21, at 13:04, Mark Millard <marklmi at yahoo.com> wrote:

> On 2020-Jun-21, at 09:02, Mark Millard <marklmi at yahoo.com> wrote:
>
>> This reports on intermediate results of some
>> 4 GiByte RPi4 use via UEFI based booting.
>>
>> Extracted from my reply to a different message:
>>
>>> The following may be a function of the conditions/configuration
>>> I'm experimenting with. For example over_voltage=6 and
>>> arm_freq=2000 and it is the 1st time using two USB3 devices (SSD
>>> and Ethernet): no powered hub involved (yet). I've not investigated
>>> variations yet. I am using a 5.1V 3.5A power supply. While
>>> I'm not generally where I can see/use it, an HDMI connection is
>>> present but nothing is logged in there.
>>>
>>> It appears that I get occasional USB SSD data corruption
>>> during writes: while building ports, a few later extractions of prior
>>> ports builds get ". . . from package: Lzma library error:
>>> Corrupted input data". Out of 419 ports built so far I've
>>> had 4 such failures (40 other ports skipped). The last port
>>> (llvm10) is still building and probably has 4 or more hours
>>> to go.
>>>
>>> Possibly going along with that is that, when I try to
>>> copy a large tar file during the poudriere bulk, the copy
>>> ends up corrupted (diff/cmp find differences). I've not
>>> yet tried when the RPi4 was basically idle. Using cmp shows
>>> that long sequences of bytes are different. Sometimes the
>>> new copy has large blocks of binary zeros but not always.
>>> It looks like the blocks might be 4096 in size. (Some bytes
>>> at the beginning or ends of 4096 might happen to match
>>> so the size of the mismatch can be somewhat less than
>>> 4096.) The alignment of the mismatched blocks also
>>> stays inside 4096 alignment boundaries, not crossing.
>>> (I've not seen back-to-back failed blocks yet.) The messed
>>> up blocks are rare.
>>>
>>> The poudriere bulk is using 4 builders, each allowed
>>> 4 processes. So much of the time there was/is a significant
>>> load average involved (4+) and there was such when I was
>>> testing copies.
>>>
>>> So far I've not seen variability in the read results of the
>>> files that were created. It appears to be a write-time
>>> variability.
>>>
>>> Of note:
>>>
>>> The USB SSD is the same media also used to boot and
>>> operate a Rock64. I've not observed any problems in
>>> that alternate usage context. But I should do more
>>> explicit checking now.
>>>
>>> My testing of NetBSD with the built-in Ethernet in use and
>>> only a USB3 SSD has not suggested problems for the
>>> over_voltage and arm_freq so far. But I need better
>>> checking than I did. NetBSD was using the same type of
>>> USB3 SSD on the same RPi4.
>>
>>
>> Of the 4 port builds that failed for ". . . from package:
>> Lzma library error: Corrupted input data", only 2 files are
>> involved. 3 of the 4 failures are attempted extractions
>> of the same package (llvm80-8.0.1_3) and the same file
>> fails for each of the 3.
>>
>> But, more interesting is that, prior to the failures, llvm80
>> was extracted 3 other times successfully after it was built.
>> This may be nothing more than in-memory copies of content
>> still being available at the time. (No USB-read required of
>> what ended up being written?)
>>
>> mesa-libs-19.0.8 , mesa-dri-19.0.8 , and xorg-server-1.20.8_1,1
>> had no failures. The later xf86-video-scfb-0.0.5_2 ,
>> xf86-input-libinput-0.30.0 , and xf86-video-vesa-2.4.0_3 had
>> failures while preparing to build.
>
> The llvm10 build finished.
>
> As for the bad large-file copy under a
> head -r360311 based context. . .
>
> Having the RPi4 otherwise idle made no
> difference.
>
> Having only the USB3 SSD as a USB device (on
> USB3) made no difference. Nor did also
> disconnecting HDMI.
>
> Changing the arm_freq in use made no difference.
> Using the default arm_freq (no assignment)
> and having no over_voltage assignment made no
> difference.
>
> Using an external powered hub instead of a
> direct plug-in for the USB3 SSD made no
> difference.
>
> All the above at the same time made no
> difference.
>
> Plugging in the USB SSD to a USB2 port instead
> of a USB3 port and booting that way made no
> difference.
>
> Booting the Rock64 with the media and doing
> the experiment had no problems.
>
> It looks like the v1.16 UEFI based context
> has a general problem that shows up in at
> least USB "disk" I/O.
>
> The file copied during the tests is:
>
> # ls -ldT /usr/obj/clang-cortexA53-installworld-poud.tar
> -rw-r--r--  1 root  wheel  4011026432 Apr 25 21:04:42 2020 /usr/obj/clang-cortexA53-installworld-poud.tar
>
> Note: diffing this file with the original on another
> machine consistently shows no differences. The above
> copy was established via copying to the Rock64. It is
> only attempting to write new copies via the RPi4 that
> end up with the new copies not fully matching this
> file.
>
> Copies over the network (scp and nfs) made to the RPi4
> from the machine holding the original file also end up partially
> corrupted on the RPi4. In this context, the RPi4 is
> using an external USB3 Ethernet device as the source
> of the data.
>
> Copies made from the RPi4 to the other machine end
> up with no differences (i.e., a good copy results).
>
> It looks like the problem is for writes to the USB
> media, not reads of the media.
>
> For reference on the RPi4:
>
> USB3 boot context:
>
> ugen0.3: <OWC Envoy Pro mini> at usbus0
> umass0 on uhub1
> umass0: <OWC Envoy Pro mini, class 0/0, rev 3.00/1.00, addr 2> on usbus0
> umass0:  SCSI over Bulk-Only; quirks = 0x0100
> umass0:0:0: Attached to scbus0
> . . . (Root mount waiting for: CAM notices) . . .
> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> da0: <OWC Envoy Pro mini 0> Fixed Direct Access SPC-4 SCSI device
> da0: Serial Number #
> da0: 400.000MB/s transfers
> da0: 228936MB (468862128 512 byte sectors)
> da0: quirks=0x2<NO_6_BYTE>
>
> USB2 boot context:
>
> ugen0.3: <OWC Envoy Pro mini> at usbus0
> umass0 on uhub2
> umass0: <OWC Envoy Pro mini, class 0/0, rev 2.10/1.00, addr 2> on usbus0
> umass0:  SCSI over Bulk-Only; quirks = 0x0100
> umass0:0:0: Attached to scbus0
> . . . (Root mount waiting for: CAM notices) . . .
> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> da0: <OWC Envoy Pro mini 0> Fixed Direct Access SPC-4 SCSI device
> da0: Serial Number #
> da0: 40.000MB/s transfers
> da0: 228936MB (468862128 512 byte sectors)
> da0: quirks=0x2<NO_6_BYTE>
>

I've checked NetBSD operation with large file
copies for:

# uname -ap
NetBSD NBSDRPi4 9.99.64 NetBSD 9.99.64 (GENERIC64) #1: Sun May 31 01:41:16 UTC 2020  root@NBSDRPi4:/usr/obj/sys/arch/evbarm/compile/GENERIC64 evbarm aarch64

There were no file differences found between
file copies.
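
A check of this kind can be sketched roughly as follows. This is
only an illustration, not the exact procedure used for the tests
above: the names sha256_of and copy_and_verify and the copyN.bin
file naming are made up here, and it compares SHA-256 digests
instead of running diff/cmp. A copy much smaller than RAM may be
re-read from the buffer cache instead of from the media, so a large
file is the more meaningful test.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a multi-GiB file need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dest_dir, copies=3):
    """Copy src into dest_dir several times.

    Returns the names of the copies whose digest differs from the
    digest of src (an empty list means every copy matched).
    The copyN.bin naming is purely illustrative.
    """
    expected = sha256_of(src)
    bad = []
    for i in range(copies):
        dest = Path(dest_dir) / f"copy{i}.bin"
        shutil.copyfile(src, dest)
        if sha256_of(dest) != expected:
            bad.append(dest.name)
    return bad
```

Hashing the same copy more than once would also show whether the
read results are stable, separating read-time from write-time
variability.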

The problem seems to be specific to the FreeBSD/RPi4
combination in some way.
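
For classifying the mismatches in a bad copy, something like the
following sketch could be used instead of reading raw cmp output.
It assumes the 4096 byte block size suggested by the earlier
observations, and mismatched_blocks is a hypothetical helper name,
not an existing tool. For each differing block it reports the
block offset, the first and last differing byte offsets, and
whether the copy's block is entirely binary zeros.

```python
BLOCK = 4096  # assumed block size, taken from the observed mismatch pattern


def mismatched_blocks(good_path, copy_path, block=BLOCK):
    """Yield (block_offset, first_diff, last_diff, copy_all_zero) per bad block."""
    with open(good_path, "rb") as good, open(copy_path, "rb") as copy:
        offset = 0
        while True:
            a = good.read(block)
            b = copy.read(block)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                n = min(len(a), len(b))
                diffs = [i for i in range(n) if a[i] != b[i]]
                if len(a) != len(b):
                    # One file is shorter: count the missing tail as differing.
                    diffs.extend(range(n, max(len(a), len(b))))
                copy_all_zero = len(b) > 0 and not any(b)
                yield (offset, offset + diffs[0], offset + diffs[-1],
                       copy_all_zero)
            offset += block
```

A result such as (4096, 4096, 8191, True) would mean the second
4096 byte block differs over its whole length and is all zeros in
the copy. Consecutive block offsets in the output would indicate
back-to-back failed blocks.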


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



