Date:      Wed, 28 Aug 2019 09:57:28 +0700
From:      Victor Sudakov <vas@mpeks.tomsk.su>
To:        freebsd-questions@freebsd.org
Subject:   Kernel panic and ZFS corruption on 11.3-RELEASE
Message-ID:  <20190828025728.GA1441@admin.sibptus.ru>



Dear Colleagues,

Shortly after upgrading to 11.3-RELEASE I had a kernel panic:

Aug 28 00:01:40 vas kernel: panic: solaris assert: dmu_buf_hold_array(os, object, offset, size, 0, ((char *)(uintptr_t)__func__), &numbufs, &dbp) == 0 (0x5 == 0x0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c, line: 1022
Aug 28 00:01:40 vas kernel: cpuid = 0
Aug 28 00:01:40 vas kernel: KDB: stack backtrace:
Aug 28 00:01:40 vas kernel: #0 0xffffffff80b4c4d7 at kdb_backtrace+0x67
Aug 28 00:01:40 vas kernel: #1 0xffffffff80b054ee at vpanic+0x17e
Aug 28 00:01:40 vas kernel: #2 0xffffffff80b05363 at panic+0x43
Aug 28 00:01:40 vas kernel: #3 0xffffffff8260322c at assfail3+0x2c
Aug 28 00:01:40 vas kernel: #4 0xffffffff822a9585 at dmu_write+0xa5
Aug 28 00:01:40 vas kernel: #5 0xffffffff82302b38 at space_map_write+0x188
Aug 28 00:01:40 vas kernel: #6 0xffffffff822e31fd at metaslab_sync+0x41d
Aug 28 00:01:40 vas kernel: #7 0xffffffff8230b63b at vdev_sync+0xab
Aug 28 00:01:40 vas kernel: #8 0xffffffff822f776b at spa_sync+0xb5b
Aug 28 00:01:40 vas kernel: #9 0xffffffff82304420 at txg_sync_thread+0x280
Aug 28 00:01:40 vas kernel: #10 0xffffffff80ac8ac3 at fork_exit+0x83
Aug 28 00:01:40 vas kernel: #11 0xffffffff80f69d6e at fork_trampoline+0xe
Aug 28 00:01:40 vas kernel: Uptime: 14d3h42m57s

after which the ZFS pool became corrupt:

  pool: d02
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Recovery is possible, but will result in some data loss.
	Returning the pool to its state as of Tuesday, August 27, 2019, 23:51:20
	should correct the problem.  Approximately 9 minutes of data
	must be discarded, irreversibly.  Recovery can be attempted
	by executing 'zpool clear -F d02'.  A scrub of the pool
	is strongly recommended after recovery.
   see: http://illumos.org/msg/ZFS-8000-72
  scan: resilvered 423K in 0 days 00:00:05 with 0 errors on Sat Sep 30 04:12:20 2017
config:

	NAME	    STATE     READ WRITE CKSUM
	d02	    FAULTED	 0     0     2
	  ada2.eli  ONLINE	 0     0    12

However, "zpool clear -F d02" results in an error:
cannot clear errors for d02: I/O error

Do you know if there is a way to recover the data, or should I say farewell to several hundred GB of anime?

PS I think I do have the vmcore file if someone is interested in debugging the panic.

-- 
Victor Sudakov,  VAS4-RIPE, VAS47-RIPN
2:5005/49@fidonet http://vas.tomsk.ru/



