From: Mark Millard
To: freebsd-arm
Subject: Re: USB [USB3 and USB2] problems when using UEFi v1.16 to boot RPi4: Evidence of a read-time problem being involved
Date: Thu, 25 Jun 2020 20:40:32 -0700
References: <476DD0F0-2286-4B2C-8E44-4404AF17F5A8@yahoo.com>
Message-Id: <88B0E169-C42F-42D6-B2BA-957EAEC7DB8C@yahoo.com>

[Looks like it is a read-time failure in some new testing.]

On 2020-Jun-25, at 17:52, Mark Millard wrote:

> On 2020-Jun-25, at 15:40, Klaus Küchemann wrote:
>
>> On 25.06.2020 at 21:29, Mark Millard via freebsd-arm wrote:
>>> …
>>> .
>>> The test still failed to produce an accurate file copy
>>> but the kernel did not report anything either. I'm
>>> unsure how to get evidence of the context for the bad 4K
>>> chunks.
>>>
>> No clue if it has effects but maybe : dd if=xxx of=xxx bs=4k ?
>
> Something interesting does result from dd testing,
> even though doing file copies that way still gets
> the problem. In fact a couple of interesting points
> show up.
>
> Using dd to copy large files still gets corrupted copies.
> (Large files are used only because the corruptions are not
> frequent in the files, but a sufficiently large file
> seems to always have some corruption.)
>
> Interestingly, dd if=/dev/zero based large file
> generation has produced good files from what I
> can tell. (Generate separate files and diff them
> after a reboot.)
>
> The problem was originally discovered copying
> from another machine to a RPi4. But the Ethernet
> use involved USB in providing data (but not a
> local USB drive) --while /dev/zero does not
> involve USB as a data source and copies of
> data in memory via file content buffering. So
> the contrasting dd if=/dev/zero results may be
> indicating something.
>
> Another interesting point is that the following
> sequence seems repeatable for step (E)'s resultant
> property below:
>
> A) first do a couple of large dd if=/dev/zero file generations
> B) then do a (non-zero) large file copy (dd based or cp based)
> C) reboot
> D) diff the 2 files generated in (A): no differences
> E) diff the original large file and the temporary copy
>    from (B): there are differences and the temporary copy
>    has zero in every byte that is different.
>
> (E) suggests that the bad file copies via cp or
> via dd are sometimes picking up data from the wrong
> memory pages. (A) just made large numbers of
> pages zero, making it more likely a zero page
> would be used if the wrong page was referenced.
>
> An example of checking for (E) was:
>
> # diff clang-cortexA53-installworld-poud.tar mmjnk.other
> Binary files clang-cortexA53-installworld-poud.tar and mmjnk.other differ
>
> # cmp -l clang-cortexA53-installworld-poud.tar mmjnk.other | grep -v " 0$" | more
> --More--(END)
>
>
> Note about my example "large file" sizes:
>
> -rw-r--r--  1 root  wheel  4011026432 Apr 25 21:04:42 2020 clang-cortexA53-installworld-poud.tar
>
> and I've been mostly using 4 GiByte for the resultant size
> of large files generated via dd.
>
> I have not tried to find a minimum size for reliably
> getting corrupted file copies.
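
The (E) check quoted above can also be summarized numerically rather than by eye. What follows is only a minimal sketch, assuming the three-column cmp -l output shown in this thread (1-based byte number, then the two differing byte values in octal) and a 4096-byte page size. The function name summarize_cmp is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: summarize `cmp -l` output read from stdin.
# Input columns: 1-based byte number, byte in original (octal), byte in copy (octal).
# Prints the number of differing bytes, how many of those are zero on the
# copy side, and how many distinct 4096-byte pages contain a difference.
summarize_cmp() {
    awk '
        {
            total++
            if ($3 + 0 == 0) zeros++             # copy-side byte is zero
            page = int(($1 - 1) / 4096)          # cmp -l counts bytes from 1
            if (!(page in seen)) { seen[page] = 1; pages++ }
        }
        END {
            printf "differing_bytes=%d zero_on_copy_side=%d pages_touched=%d\n",
                   total, zeros, pages
        }
    '
}
```

Possible usage: cmp -l clang-cortexA53-installworld-poud.tar mmjnk.other | summarize_cmp. If zero_on_copy_side equals differing_bytes, every differing byte in the copy is zero, which is the property step (E) describes.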

I continued after the above with (no additional reboot):

# cpuset -l0 cp -aRx clang-cortexA53-installworld-poud.tar mmjnk.other2
# diff clang-cortexA53-installworld-poud.tar mmjnk.other2
Binary files clang-cortexA53-installworld-poud.tar and mmjnk.other2 differ
# cpuset -l2 diff clang-cortexA53-installworld-poud.tar mmjnk.other2
Binary files clang-cortexA53-installworld-poud.tar and mmjnk.other2 differ
# cpuset -l3 cp -aRx clang-cortexA53-installworld-poud.tar mmjnk.other3
# cpuset -l3 diff clang-cortexA53-installworld-poud.tar mmjnk.other3
Binary files clang-cortexA53-installworld-poud.tar and mmjnk.other3 differ

Note that the final mmjnk.other2 was via cpu 2.
Note that the mmjnk.other3 was via cpu 3.
Note that the original mmjnk.other was without limiting the cpu usage.

Then I went back and compared files that had not been written
since the reboot and that had shown zeros earlier.
First I show some of the output of a prior zeros-producing
compare:

# cmp -l clang-cortexA53-installworld-poud.tar mmjnk.other | more
1795768321 264   0
1795768322 167   0
1795768323 272   0
1795768324   6   0
1795768325   3   0
1795768326 370   0
1795768327  10   0
1795768328 112   0
. . .

(Yes, I did not lock down what cpu was to be used for
the cmp -l usage in this activity. In the future I
probably should experiment with that too.)

The new comparison looked like:

# cmp -l clang-cortexA53-installworld-poud.tar mmjnk.other | more
1442340865  15   0
1442340866 245   0
1442340867   1  30
1442340868   1 353
1442340869   0  11
1442340870 100  17
1442340871 226 271
1442340872  31 125
. . .

Not all-zeros being presented on the right any more!
And not the same offset either (so different left hand
side data). (Some bytes match the left side and so
do not produce a line at all.)
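
Since the cmp -l runs above were not pinned to a cpu, one way to control for that would be to pin both the copy and the comparison to the same cpu. This is a rough sketch only, assuming FreeBSD's cpuset(1) with the -l cpu-list option as used above. The function name pinned_copy_check, and the use of plain cp and cmp -s instead of cp -aRx and diff, are illustrative assumptions and not what was actually run.

```shell
#!/bin/sh
# Hypothetical helper: copy a file and compare it with the original,
# with both steps restricted to one cpu via cpuset(1).
#   $1 = cpu number, $2 = source file, $3 = destination file
pinned_copy_check() {
    cpu=$1
    src=$2
    dst=$3
    # Copy with the process limited to the given cpu.
    cpuset -l "$cpu" cp "$src" "$dst" || return 2
    # Compare on the same cpu; cmp -s is silent and reports via exit status.
    if cpuset -l "$cpu" cmp -s "$src" "$dst"; then
        echo "cpu $cpu: identical"
    else
        echo "cpu $cpu: DIFFERENT"
    fi
}
```

Possible usage: for c in 0 1 2 3; do pinned_copy_check "$c" clang-cortexA53-installworld-poud.tar "mmjnk.cpu$c"; done. A mismatch on every cpu would argue against a single bad core, though the comparison itself reads through the same path that is under suspicion.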

So I looked at the new copy made under cpuset -l2 :

# cmp -l clang-cortexA53-installworld-poud.tar mmjnk.other2 | more
1442340865  15   0
1442340866 245   0
1442340867   1  30
1442340868   1 353
1442340869   0  11
1442340870 100  17
1442340871 226 271
1442340872  31 125
. . .

Same offset in this file and *same* values on the
left and right. (Not just those shown above.)

So I looked at the new copy made under cpuset -l3 :

# cmp -l clang-cortexA53-installworld-poud.tar mmjnk.other3 | more
 981008385  62   0
 981008386 111   0
 981008387 157  30
 981008388  65 353
 981008389 123  11
 981008390 145  17
 981008391 164 271
 981008393 160   0
. . .

Different offset in this file but the *same* values
on the right. (Not just those shown above.) The left
values are different, matching up with the offset
difference. (Some bytes match the different data
on the left and so do not produce a line, but the right
side values appear to match the prior 2 examples even
where lines disappear differently because of left-side
content.)

So, apparently, the same page of content was used for
the right side material but at a different point in
the comparison. (Lack of controlling the cpu used for
cmp -l might be contributing?)

Note: 1795768321 % 4096 == 1
Note: 1442340865 % 4096 == 1
Note:  981008385 % 4096 == 1

cmp -l numbers bytes starting from 1, so the above all
align at 4096-byte boundaries.

Overall this indicates that an unmodified file can
have its content appear to change and that multiple
files got the same block of bad data showing up in
their respective comparisons, just not always at the
same offset in the files.

I've no clue if the roles of "left" and "right" could
swap. So far the right seems to be the one that gets
the bad data.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)
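
The alignment notes above can be checked mechanically, and the suspect page can be pulled out of each copy for a direct comparison. This is a minimal sketch, assuming a 4096-byte page size and a POSIX sh with dd and cksum available. The names page_index, page_offset, and page_cksum are hypothetical helpers, not existing tools.

```shell
#!/bin/sh
PAGE_SIZE=4096

# cmp -l numbers bytes from 1, so subtract 1 before dividing.
# Prints the 0-based page number containing the given cmp -l byte number.
page_index() {
    echo $(( ($1 - 1) / PAGE_SIZE ))
}

# Prints the 0-based offset of the byte within its page
# (0 means the byte is the first byte of a page).
page_offset() {
    echo $(( ($1 - 1) % PAGE_SIZE ))
}

# Prints the cksum (CRC and byte count) of one page of a file.
#   $1 = file, $2 = 0-based page number
page_cksum() {
    dd if="$1" bs="$PAGE_SIZE" skip="$2" count=1 2>/dev/null | cksum
}
```

Possible usage: compare page_cksum mmjnk.other2 "$(page_index 1442340865)" against page_cksum mmjnk.other3 "$(page_index 981008385)" to test whether both copies really hold the same bad page. These reads go through the same path that is under suspicion, so the results may vary between runs.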