Date: Mon, 22 Jun 2020 01:08:41 -0700
From: Mark Millard <marklmi@yahoo.com>
To: "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Subject: Re: USB [USB3 and USB2] problems when using UEFi v1.16 to boot RPi4: notes as I explore
Message-ID: <CF81584E-75CE-4BFC-8ACC-AB95E561B28D@yahoo.com>
In-Reply-To: <B1FF8DD3-DFD1-4973-B0D2-6AC33BCAA59C@yahoo.com>
References: <476DD0F0-2286-4B2C-8E44-4404AF17F5A8@yahoo.com> <B1FF8DD3-DFD1-4973-B0D2-6AC33BCAA59C@yahoo.com>

On 2020-Jun-21, at 13:04, Mark Millard <marklmi at yahoo.com> wrote:

> On 2020-Jun-21, at 09:02, Mark Millard <marklmi at yahoo.com> wrote:
>
>> This reports on intermediate results of some
>> 4 GiByte RPi4 use via UEFI based booting.
>>
>> Extracted from my reply to a different message:
>>
>>> The following may be a function of the conditions/configuration
>>> I'm experimenting with. For example over_voltage=6 and
>>> arm_freq=2000 and it is the 1st time using two USB3 devices (SSD
>>> and Ethernet): no powered hub involved (yet). I've not investigated
>>> variations yet. I am using a 5.1V 3.5A power supply. While
>>> I'm not generally where I can see/use it, an HDMI connection is
>>> present but nothing is logged in there.
>>>
>>> It appears that I get occasional USB SSD data corruption
>>> during writes: when building ports, a few later extracts of prior
>>> ports builds get ". . . from package: Lzma library error:
>>> Corrupted input data". Out of 419 ports built so far I've
>>> had 4 such failures (40 other ports skipped). The last port
>>> (llvm10) is still building and probably has 4 or more hours
>>> to go.
>>>
>>> Possibly going along with that is that, when I try to
>>> copy a large tar file during the poudriere bulk, the copy
>>> ends up corrupted (diff/cmp find differences). I've not
>>> yet tried when the RPi4 was basically idle. Using cmp shows
>>> that long sequences of bytes are different. Sometimes the
>>> new copy has large blocks of binary zeros but not always.
>>> It looks like the blocks might be 4096 in size. (Some bytes
>>> at the beginning or ends of 4096 might happen to match
>>> so the size of the mismatch can be somewhat less than
>>> 4096.) The alignment of the mismatched blocks also
>>> stays inside 4096 alignment boundaries, not crossing.
>>> (I've not seen back-to-back failed blocks yet.) The messed
>>> up blocks are rare.
>>>
>>> The poudriere bulk is using 4 builders, each allowed
>>> 4 processes.
>>> So much of the time there was/is a significant
>>> load average involved (4+) and there was such when I was
>>> testing copies.
>>>
>>> So far I've not seen variability in the read results of the
>>> files that were created. It appears to be a write-time
>>> variability.
>>>
>>> Of note:
>>>
>>> The USB SSD is the same media also used to boot and
>>> operate a Rock64. I've not observed any problems in
>>> that alternate usage context. But I should do more
>>> explicit checking now.
>>>
>>> My testing of NetBSD with the built-in Ethernet in use and
>>> only a USB3 SSD has not suggested problems for the
>>> over_voltage and arm_freq so far. But I need better
>>> checking than I did. NetBSD was using the same type of
>>> USB3 SSD on the same RPi4.
>>
>>
>> Of the 4 port builds that failed for ". . . from package:
>> Lzma library error: Corrupted input data", only 2 files are
>> involved. 3 of the 4 failures are attempted extractions
>> of the same package (llvm80-8.0.1_3) and the same file
>> fails for each of the 3.
>>
>> But, more interesting is that, prior to the failures, llvm80
>> was extracted 3 other times successfully after it was built.
>> This may be nothing more than in-memory copies of content
>> still being available at the time. (No USB-read required of
>> what ended up being written?)
>>
>> mesa-libs-19.0.8 , mesa-dri-19.0.8 , and xorg-server-1.20.8_1,1
>> had no failures. The later xf86-video-scfb-0.0.5_2 ,
>> xf86-input-libinput-0.30.0 , and xf86-video-vesa-2.4.0_3 had
>> failures while preparing to build.
>
> The llvm10 build finished.
>
> As for the bad large-file copy under a
> head -r360311 based context. . .
>
> Having the RPi4 otherwise idle made no
> difference.
>
> Having only the USB3 SSD as a USB device (on
> USB3) made no difference. Nor did also having
> HDMI disconnected.
>
> Changing the arm_freq in use made no difference.
> Using the default arm_freq (no assignment)
> and having no over_voltage assignment made no
> difference.
>
> Using an external powered hub instead of a
> direct plug-in for the USB3 SSD made no
> difference.
>
> All the above at the same time made no
> difference.
>
> Plugging in the USB SSD to a USB2 port instead
> of a USB3 port and booting that way made no
> difference.
>
> Booting the Rock64 with the media and doing
> the experiment had no problems.
>
> It looks like the v1.16 UEFI based context
> has a general problem that shows up in at
> least USB "disk" I/O.
>
> The file copied during the tests is:
>
> # ls -ldT /usr/obj/clang-cortexA53-installworld-poud.tar
> -rw-r--r-- 1 root wheel 4011026432 Apr 25 21:04:42 2020 /usr/obj/clang-cortexA53-installworld-poud.tar
>
> Note: diffing this file with the original on another
> machine consistently shows no differences. The above
> copy was established via copying to the Rock64. It is
> only attempting to write new copies via the RPi4 that
> ends up with the new copies not fully matching this
> file.
>
> Copies over the network (scp and nfs) made to the RPi4
> from the machine holding the original file also end up
> partially corrupted on the RPi4. In this context, the RPi4 is
> using an external USB3 Ethernet device as the source
> of the data.
>
> Copies made from the RPi4 to the other machine end
> up with no differences (i.e., a good copy results).
>
> It looks like the problem is with writes to the USB
> media, not reads of the media.
>
> For reference on the RPi4:
>
> USB3 boot context:
>
> ugen0.3: <OWC Envoy Pro mini> at usbus0
> umass0 on uhub1
> umass0: <OWC Envoy Pro mini, class 0/0, rev 3.00/1.00, addr 2> on usbus0
> umass0: SCSI over Bulk-Only; quirks = 0x0100
> umass0:0:0: Attached to scbus0
> . . . (Root mount waiting for: CAM notices) . . .
> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> da0: <OWC Envoy Pro mini 0> Fixed Direct Access SPC-4 SCSI device
> da0: Serial Number #
> da0: 400.000MB/s transfers
> da0: 228936MB (468862128 512 byte sectors)
> da0: quirks=0x2<NO_6_BYTE>
>
> USB2 boot context:
>
> ugen0.3: <OWC Envoy Pro mini> at usbus0
> umass0 on uhub2
> umass0: <OWC Envoy Pro mini, class 0/0, rev 2.10/1.00, addr 2> on usbus0
> umass0: SCSI over Bulk-Only; quirks = 0x0100
> umass0:0:0: Attached to scbus0
> . . . (Root mount waiting for: CAM notices) . . .
> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> da0: <OWC Envoy Pro mini 0> Fixed Direct Access SPC-4 SCSI device
> da0: Serial Number #
> da0: 40.000MB/s transfers
> da0: 228936MB (468862128 512 byte sectors)
> da0: quirks=0x2<NO_6_BYTE>
>

I've checked NetBSD operation with large file copies for:

# uname -ap
NetBSD NBSDRPi4 9.99.64 NetBSD 9.99.64 (GENERIC64) #1: Sun May 31 01:41:16 UTC 2020 root@NBSDRPi4:/usr/obj/sys/arch/evbarm/compile/GENERIC64 evbarm aarch64

There were no file differences found between file copies.

The problem seems to be specific to the FreeBSD/RPi4
combination in some way.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
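
For reference, a minimal sketch of the kind of block-level comparison
described in the quoted notes (mismatches that stay inside 4096-byte
aligned blocks, some of them zero-filled in the copy). This is not the
procedure used in the tests reported here, which relied on diff/cmp.
The function name mismatched_blocks and the 4096 default are
illustrative assumptions, and Python is used only for convenience.

```python
BLOCK = 4096  # assumed block size, taken from the observed mismatch pattern


def mismatched_blocks(good_path, copy_path, block=BLOCK):
    """Yield (block_index, first_off, last_off, copy_is_zero) per differing block.

    first_off/last_off are byte offsets inside the block of the first and
    last differing bytes; copy_is_zero is True when the copy's block is
    entirely binary zeros.
    """
    index = 0
    with open(good_path, "rb") as good, open(copy_path, "rb") as copy:
        while True:
            a = good.read(block)
            b = copy.read(block)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                n = min(len(a), len(b))
                diffs = [i for i in range(n) if a[i] != b[i]]
                # If one file is shorter, count the missing tail as differing.
                diffs.extend(range(n, max(len(a), len(b))))
                copy_is_zero = len(b) > 0 and b.count(0) == len(b)
                yield index, diffs[0], diffs[-1], copy_is_zero
            index += 1
```

Iterating over mismatched_blocks(original, copy) and printing each tuple
gives one line per bad block. Back-to-back bad blocks would show up as
consecutive block_index values, and a first_off above 0 or a last_off
below 4095 corresponds to the case where bytes at the start or end of a
block happen to match.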