From owner-freebsd-current@freebsd.org Thu Jan 25 20:22:51 2018
Date: Thu, 25 Jan 2018 21:16:56 +0100
From: "O. Hartmann"
To: FreeBSD CURRENT
Cc: Warner Losh, Mark Johnston, Michael Tuexen, Ed Maste
Subject: Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>
Message-ID: <20180125211723.6e65329f@thor.intern.walstatt.dynvpn.de>
References: <20171231004137.4f9ad496@thor.intern.walstatt.dynvpn.de> <23651B78-E31C-4BDD-BCA3-408B8F907884@freebsd.org> <20180108153356.GA2412@raichu>
Organization: WALSTATT
User-Agent: OutScare 3.1415926
X-Operating-System: ImNotAnOperatingSystem 3.141592527
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512; boundary="Sig_/DrIwn3ci4fBg94yiFsXxNhQ"; protocol="application/pgp-signature"
List-Id: Discussions about the use of FreeBSD-current

--Sig_/DrIwn3ci4fBg94yiFsXxNhQ
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On Mon, 8 Jan 2018 09:12:16 -0700, Warner Losh wrote:

> On Jan 8, 2018 8:34 AM, "Mark Johnston" wrote:
>
> On Thu, Jan 04, 2018 at 09:10:37AM +0100, Michael Tuexen wrote:
> > > On 31. Dec 2017, at 02:45, Warner Losh wrote:
> > >
> > > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann wrote:
> > >
> > >> On most recent CURRENT I face the error shown below on the /tmp
> > >> filesystem (UFS2) residing on a Samsung 850 Pro SSD:
> > >>
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >>
> > >> I've already formatted the /tmp filesystem, but obviously without any
> > >> success.
> > >>
> > >> Since I face such strange errors also on NanoBSD images dd'ed to SD
> > >> cards, I guess there is something fishy ...
> > >
> > >
> > > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > > them today) and we've not seen any mismatch when we unmount and fsck the
> > > filesystem.
> > Not sure this helps: But we have seen this also after system panics
> > when having soft update journaling enabled. Having soft update journaling
> > disabled, we did not observe this after several panics.
> > Just to be clear: The panics are not related to this issue,
> > but to other network development we do.
>
> I saw the same issue this morning on a mirrored root filesystem after my
> workstation came up following a power failure. fsck recovered using the
> journal, and I subsequently saw a number of these checksum failures.
> Upon shutdown, I saw the same handle_workitem_freefile errors as above.
> I then ran a full fsck from single-user mode, which didn't turn up any
> inconsistencies, and after that the checksum failure errors disappeared,
> presumably because fsck fixed them.
>
>
> Yes. Fsck automatically fixes issues like that. It does it silently. I have
> patched it to make it noisy, and in the dozen cases where I saw the errors,
> fsck was silent with my whiny patches. I can put them up for review if
> people want...
>
> Warner

Within the past couple of weeks, or rather since the first occurrence of
these strange reports, I have had mysterious crashes: when installing
FreeBSD, even the proper (recommended) way, the box suddenly crashes out of
the blue. The symptoms are always the same and so is the result: the box is
unusable, the boot process is stuck at "BTX halted" with a list of dumped
registers (I guess they are the CPU registers), and the filesystem is
corrupt.

I have had this strange problem on several hosts with SSDs; I reported
those crashes at the end of November/beginning of December 2017. On one
machine I reformatted the SSD and restored it from a 'dump' backup; since
then the crashes have gone away. The box now in question is the last one
that has not been treated that way. It seems there is a minefield hidden
somewhere and I have no clue what it could be :-(

I'm going to do the very same (dump and restore) with the SSD of the
remaining box soon. I just wanted to note this for the record.

The crash happened with FreeBSD 12.0-CURRENT #14 r328409: Thu Jan 25
20:40:27 CET 2018 amd64.

Kind regards,

Oliver

-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes or
for market or opinion research (§ 28 para. 4 BDSG).

--Sig_/DrIwn3ci4fBg94yiFsXxNhQ
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----

iLUEARMKAB0WIQQZVZMzAtwC2T/86TrS528fyFhYlAUCWmo7UwAKCRDS528fyFhY
lJeQAf0XYgWzSL4pWHomsqM9lPYnUYhN3hHA+pKBjv/BdPWKVsn4vLOjADwmn/Xn
f2nyB6LIabgQ9HnAThOCPeFXoEjyAf9Y5KQDS0n+6WC/TpL4HBSQYXjW9Kx2yTBu
EgDJ9XIRiZiSaQ3+unW/q7LmNZaNL7sj340RIxNJ1E8HgDbCHoza
=DIAb
-----END PGP SIGNATURE-----

--Sig_/DrIwn3ci4fBg94yiFsXxNhQ--