List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
Sender: owner-freebsd-arm@freebsd.org
Subject: Re: fsck segfaults on rpi3 running 13-stable (and on 14-CURRENT analyzing the same file system that resulted from the 13-STABLE crash)
From: Mark Millard
In-Reply-To: <9CEF4E7A-2F13-454F-A04A-A6C5A80FD4B7@yahoo.com>
Date: Sun, 19 Feb 2023 22:00:35 -0800
Cc: freebsd-arm@freebsd.org
Message-Id: <268392B4-58FE-49EE-9B1D-6DA632757DFA@yahoo.com>
References: <202302192054.31JKsq7w079295@chez.mckusick.com>
 <3DD8EEC2-6135-42A0-A80C-F195CAAC025E@yahoo.com>
 <20230219222328.GA55941@www.zefox.net>
 <2F5B20E9-AFF8-42F6-9E1F-50BBDF4E1B79@yahoo.com>
 <20230220044544.GB57936@www.zefox.net>
 <9CEF4E7A-2F13-454F-A04A-A6C5A80FD4B7@yahoo.com>
To: bob prohaska

On Feb 19, 2023, at 21:50, Mark Millard wrote:

> On Feb 19, 2023, at 20:45, bob prohaska wrote:
>
>> On Sun, Feb 19, 2023 at 02:35:15PM -0800, Mark Millard wrote:
>>>
>>> Kirk likely monitors the freebsd-fs list.
>>
>> I didn't notice there was such a list 8-\
>>
>>> Kirk likely does not monitor the freebsd-arm list.
>>> None of us thought to switch to freebsd-fs at the
>>> time. The only part of your context that ended up
>>> to be arm specific was original buildworld crash.
>>> You definitely started in an appropriate place
>>> (freebsd-arm). After the crash, the rest was more
>>> general relative to platforms and more specific
>>> relative to file system handling (UFS support).
>>>
>>> I do not see any reason for any of this exchange
>>> to go to any lists, given the current status.
>>
>> Alas, the story's not over yet 8-(
>>
>> After getting the disk fsck'd and booting once more,
>> an attempt to buildworld using a fresh /usr/src
>> and empty /usr/obj crashed again,
>
> I'm confused. The original crash was reported to be
> on a RPi2B using a armv7 kernel, or so I thought.
> (The RPi3B was for later fsck_ffs activity for the
> media's UFS.)
>
> This new material indicates a RPi3B arm64 (aarch64)
> context for this buildworld failure. Is it the same
> media as for the prior buildworld failure?
>
>> in I think the
>> same way. This time some notes have been collected
>> at
>> http://www.zefox.net/~fbsd/rpi3/scsi_status_error/readme
>>
>> To a casual glance, it looks like a hardware error.
>> But, the machine seems to work fine until it's running
>> buildworld, and then crashes during a relatively easy
>> part of buildworld. The initial error message is:
>>
>> bob@pelorus:/usr/src % (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 43 29 d6 40 00 00 40 00
>> (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
>> (da0:umass-sim0:0:0:0): SCSI status: Check Condition
>> (da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
>> (da0:umass-sim0:0:0:0): Error 5, Unretryable error
>
> A description of "Media Error" from seagate is:
>
> Medium Error - Indicates the command terminated with a nonrecovered
> error condition, probably caused by a flaw in the medium or an error in
> the recorded data.
>
> To compare/contrast with other alternatives, see:
>
> https://www.seagate.com/support/kb/scsi-sense-key-chart-196259en/
>
> A more extensive list with asc/ascq involved as well is at:
>
> https://en.wikipedia.org/wiki/Key_Code_Qualifier/
>
> Allowing more comparison/contrast with other classifications.
>
> It indicates:
>
> 3 11 00 Medium Error - unrecovered read error
>
> (matching the reported text).
>
>> SCSI errors are not unknown, but they usually succeed on retry.
>> It's not obvious why this is treated as un-retryable.
>
> Because that is what the "3 11 00" combination involved
> means. The drive is reporting that. It is not a FreeBSD
> driver choice of handling.
>
> (I'm not expert at drive internals, so I take it at face
> value.)
>
>> Are there any simple tests that might help decide what's wrong?
>> It's likely that re-running buildworld will reproduce the crash.
>
> See the https://en.wikipedia.org/wiki/Key_Code_Qualifier/
> description material for some background information?
>
>> I've placed the results of smartctl -a at the end of the notes.
>> The interpretation isn't self evident, hopefully someone else
>> can lend an eye. I'll try smartctl -t after a good night's sleep.
>
> man smartctl reports:
>
> UNC: UNCorrectable Error in Data
>
> The 3 examples of:
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
>
> indicate UNC. All 3 list the same LBA value.

Turns out that the LBA value is likely garbage, given
the size of your drive (> 128 GiBytes):

Quoting the smartctl man page:

(Because of the limitations of the SMART error log, if the
LBA is greater than 0xfffffff, then either no error log
entry will be made, or the error log entry will have an
incorrect LBA. This may happen for drives with a capacity
greater than 128 GiB or 137 GB.)

Also, the more expanded material about UNC is:

UNC (UNCorrectable): data is uncorrectable. This refers
to data which has been read from the disk, but for which
the Error Checking and Correction (ECC) codes are
inconsistent. In effect, this means that the data can not
be read.
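
For what it is worth, the arithmetic behind that limit can be
sketched as below. This is only an illustration, not anything
taken from smartctl or the FreeBSD drivers: the 512-byte sector
size is an assumption, and the function names are made up for
the example. It also decodes the READ(10) CDB from the console
message, using the standard READ(10) layout (opcode 0x28, a
big-endian LBA in bytes 2-5, a transfer length in bytes 7-8),
to show that the failing read targets an LBA too large for a
28-bit field.

```python
# Sketch: why a SMART error-log LBA of 0x0fffffff is suspect on a large drive.
# Assumption (not from the thread): 512-byte logical sectors.

SECTOR_BYTES = 512          # assumed logical sector size
LBA28_MAX = 0x0FFFFFFF      # largest value a 28-bit LBA field can hold


def lba28_capacity_bytes(sector_bytes=SECTOR_BYTES):
    """Bytes addressable with 28-bit LBAs (LBA 0 through 0x0fffffff)."""
    return (LBA28_MAX + 1) * sector_bytes


def decode_read10(cdb_hex):
    """Split a READ(10) CDB given as hex text into (opcode, lba, blocks)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return cdb[0], lba, blocks


capacity = lba28_capacity_bytes()
opcode, lba, blocks = decode_read10("28 00 43 29 d6 40 00 00 40 00")

print(f"28-bit limit: {capacity} bytes = {capacity // 2**30} GiB")
print(f"READ(10): lba=0x{lba:08x}, blocks={blocks}")
print(f"LBA fits in 28 bits: {lba <= LBA28_MAX}")
```

Under the 512-byte assumption the 28-bit limit comes out at
128 GiB, matching the man page figure, and the READ(10) LBA
(0x4329d640) is well past 0x0fffffff. That is consistent with
the error log being unable to record the real LBA, though it
does not by itself tie the logged errors to this particular read.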
>
> Error 4 occurred at disk power-on lifetime: 11121 hours (463 days + 9 hours)
> Error 3 occurred at disk power-on lifetime: 11098 hours (462 days + 10 hours)
> Error 2 occurred at disk power-on lifetime: 11096 hours (462 days + 8 hours)
>
> So spread over a little over a day overall, with 2 and 3
> spread over a couple of hours.
>
> It suggests to me that the drive is no longer usable.
> But I'm no expert.

===
Mark Millard
marklmi at yahoo.com