Date: Thu, 29 Aug 2019 13:37:06 +0700
From: Victor Sudakov <vas@mpeks.tomsk.su>
To: freebsd-questions@freebsd.org
Subject: Re: Kernel panic and ZFS corruption on 11.3-RELEASE
Message-ID: <20190829063706.GB34810@admin.sibptus.ru>
References: <20190828025728.GA1441@admin.sibptus.ru> <2964dd94-ad99-d0b8-c5d8-5d276cf02d06@gmail.com>
In-Reply-To: <2964dd94-ad99-d0b8-c5d8-5d276cf02d06@gmail.com>
User-Agent: Mutt/1.12.1 (2019-06-15)
MJ wrote:
>
>
> On 28/08/2019 12:57 pm, Victor Sudakov wrote:
> > Dear Colleagues,
> >
> > Shortly after upgrading to 11.3-RELEASE I had a kernel panic:
> >
> > Aug 28 00:01:40 vas kernel: panic: solaris assert: dmu_buf_hold_array(os, object, offset, size, 0, ((char *)(uintptr_t)__func__), &numbufs, &dbp) == 0 (0x5 == 0x0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c, line: 1022
> > Aug 28 00:01:40 vas kernel: cpuid = 0
> > Aug 28 00:01:40 vas kernel: KDB: stack backtrace:
> > Aug 28 00:01:40 vas kernel: #0 0xffffffff80b4c4d7 at kdb_backtrace+0x67
> > Aug 28 00:01:40 vas kernel: #1 0xffffffff80b054ee at vpanic+0x17e
> > Aug 28 00:01:40 vas kernel: #2 0xffffffff80b05363 at panic+0x43
> > Aug 28 00:01:40 vas kernel: #3 0xffffffff8260322c at assfail3+0x2c
> > Aug 28 00:01:40 vas kernel: #4 0xffffffff822a9585 at dmu_write+0xa5
> > Aug 28 00:01:40 vas kernel: #5 0xffffffff82302b38 at space_map_write+0x188
> > Aug 28 00:01:40 vas kernel: #6 0xffffffff822e31fd at metaslab_sync+0x41d
> > Aug 28 00:01:40 vas kernel: #7 0xffffffff8230b63b at vdev_sync+0xab
> > Aug 28 00:01:40 vas kernel: #8 0xffffffff822f776b at spa_sync+0xb5b
> > Aug 28 00:01:40 vas kernel: #9 0xffffffff82304420 at txg_sync_thread+0x280
> > Aug 28 00:01:40 vas kernel: #10 0xffffffff80ac8ac3 at fork_exit+0x83
> > Aug 28 00:01:40 vas kernel: #11 0xffffffff80f69d6e at fork_trampoline+0xe
> > Aug 28 00:01:40 vas kernel: Uptime: 14d3h42m57s
> >
> > after which the ZFS pool became corrupt:
> >
> >   pool: d02
> >  state: FAULTED
> > status: The pool metadata is corrupted and the pool cannot be opened.
> > action: Recovery is possible, but will result in some data loss.
> >         Returning the pool to its state as of Tuesday, 27 August 2019, 23:51:20
> >         should correct the problem.  Approximately 9 minutes of data
> >         must be discarded, irreversibly.  Recovery can be attempted
> >         by executing 'zpool clear -F d02'.  A scrub of the pool
> >         is strongly recommended after recovery.
> >    see: http://illumos.org/msg/ZFS-8000-72
> >   scan: resilvered 423K in 0 days 00:00:05 with 0 errors on Sat Sep 30 04:12:20 2017
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         d02         FAULTED      0     0     2
> >           ada2.eli  ONLINE       0     0    12
> >
> > However, "zpool clear -F d02" results in an error:
> > cannot clear errors for d02: I/O error
> >
> > Do you know if there is a way to recover the data, or should I say farewell to several hundred Gb of anime?
> >
> > PS I think I do have the vmcore file if someone is interested in debugging the panic.
>
> Do you have a backup? Then restore it.

No, it's much more interesting to try and recover the pool.

>
> If you don't, have you tried
> zpool import -F d02

I've tried "zpool clear -F d02" with no success (see above). Later I tried
"zpool import -Ff d02", but on an 11.2 system, as David Christensen advised,
and that was a success.

> Some references you might like to read:
> https://docs.oracle.com/cd/E19253-01/819-5461/gbctt/index.html
> Take note of this section:
> "If the damaged pool is in the zpool.cache file, the problem is discovered when the system is booted, and the damaged pool is reported in the zpool status command. If the pool isn't in the zpool.cache file, it won't successfully import or open and you'll see the damaged pool messages when you attempt to import the pool."
>
> I've not had your exact error, but in the case of disk corruption/failure, I've used import as the sledgehammer approach.

What do you think made all the difference: 11.2 vs 11.3, or "import -F" vs "clear -F"?

What is the difference between "import -F" and "clear -F" when it comes to fixing zpool errors?

-- 
Victor Sudakov,  VAS4-RIPE, VAS47-RIPN
2:5005/49@fidonet http://vas.tomsk.ru/
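[Editorial note: the recovery sequence the thread converges on can be sketched as the command outline below. The pool and device names (d02, ada2.eli) come from the messages above; the dry-run and read-only steps are cautious additions suggested by the zpool-import(8) manual, not something anyone in the thread reports running.]

    # "zpool clear -F" attempts a rewind on a pool the system already
    # knows about (listed in zpool.cache); it failed here with an I/O error.
    zpool clear -F d02

    # What eventually worked: rewind at import time instead.
    # -F discards the last few transaction groups to reach a consistent
    # state; -f forces the import if the pool appears in use elsewhere.
    zpool import -Ff d02

    # Cautious variants (assumptions, per the zpool-import(8) manual):
    # -n with -F reports whether the rewind would succeed without doing it,
    # and a read-only import avoids writing to a damaged pool.
    zpool import -F -n d02
    zpool import -o readonly=on -F d02

    # After a successful rewind, scrub as the status output recommends:
    zpool scrub d02

The practical difference the final question asks about: "clear -F" operates on a pool that is still attached/cached and tries to rewind it in place, while "import -F" performs the rewind as part of a fresh import, which is why it can succeed after "clear -F" fails with an I/O error.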