Date:      Sat, 17 Jun 2017 22:32:26 +0000
From:      kc atgb <kisscoolandthegangbang@hotmail.fr>
To:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: Problem with zpool remove of log device
Message-ID:  <AM4PR05MB17143E6527B1CEC99B950BD6A0C60@AM4PR05MB1714.eurprd05.prod.outlook.com>
In-Reply-To: <AM4PR05MB17148E0F416FDF75EBC8616AA0CD0@AM4PR05MB1714.eurprd05.prod.outlook.com>
References:  <9188a169-cd81-f64d-6b9e-0e3c6b4af1bb@wasikowski.net> <0410af$1dldvp4@ipmail04.adl6.internode.on.net> <AM4PR05MB17148E0F416FDF75EBC8616AA0CD0@AM4PR05MB1714.eurprd05.prod.outlook.com>


On Mon, 12 Jun 2017 22:16:02 CEST
kc atgb <kisscoolandthegangbang@hotmail.fr> wrote:

>
> On Wed, 07 Jun 2017 08:21:09 CEST
> Stephen McKay <mckay@FreeBSD.org> wrote:
>
> > On Friday, 26th May 2017, lukasz@wasikowski.net wrote:
> >
> > >I can't remove the log device from the pool - the operation ends OK, but
> > >the log device is still in the pool (bug?).
> > >
> > ># uname -a
> > >FreeBSD xxx.yyy.com 11.0-STABLE FreeBSD 11.0-STABLE #0 r316543: Thu Apr
> > >6 08:22:43 CEST 2017     root@xxx.yyy.com:/usr/obj/usr/src/sys/YYY  amd64
> > >
> > ># zpool status tank
> > >[..snip..]
> > >
> > >        NAME                   STATE     READ WRITE CKSUM
> > >        tank                 ONLINE       0     0     0
> > >          mirror-0             ONLINE       0     0     0
> > >            ada2p3             ONLINE       0     0     0
> > >            ada3p3             ONLINE       0     0     0
> > >        logs
> > >          mirror-1             ONLINE       0     0     0
> > >            gpt/tankssdzil0  ONLINE       0     0     0  block size: 512B configured, 4096B native
> > >            gpt/tankssdzil1  ONLINE       0     0     0  block size: 512B configured, 4096B native
> >
> > >When I try to remove log device operation ends without errors:
> > >
> > ># zpool remove tank mirror-1; echo $?
> > >0
> > >
> > >But the log device is still there:
> > >[..snip..]
> > >I'd like to remove it - how should I proceed?
> >
> > Does your system still write to the log?  Use "zpool iostat -v 1" to
> > check.  I think it is probably no longer in use and only the final
> > disconnection failed.
> >
> > What does "zpool list -v" tell you?  If you have a non-zero ALLOC
> > column for your log mirror and the log is no longer being used then
> > you may have hit an accounting bug in zfs that the zfsonlinux people
> > ran into a while ago.
> >
> > I had this problem when I tried to remove a log mirror from a pool
> > I have been using for years.  I solved it by tweaking the zfsonlinux
> > hack a bit and slotting it into 9.3.
> >
> > If you apply this hack be sure to have a full backup first!  When I
> > used it, I did my backup and a scrub then booted the hacked kernel,
> > issued the zpool remove command (which succeeded), reverted the kernel,
> > then scrubbed again.  All went well.
> >
> > Good luck!
> >
> > Here's the patch against 9.3 (should be close even for 11.0):
> >
> > Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
> > ===================================================================
> > --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(revision 309860)
> > +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(working copy)
> > @@ -5446,6 +5446,18 @@
> >  	ASSERT(vd == vd->vdev_top);
> >  
> >  	/*
> > +	 * slog stuck hack - barnes333@gmail.com
> > +	 * https://github.com/zfsonlinux/zfs/issues/1422
> > +	 */
> > +	if (vd->vdev_islog && vd->vdev_removing
> > +	    && vd->vdev_state == VDEV_STATE_OFFLINE
> > +	    && vd->vdev_stat.vs_alloc > 0) {
> > +		printf("ZFS: slog stuck hack - clearing vs_alloc: %llu\n",
> > +		    (unsigned long long)vd->vdev_stat.vs_alloc);
> > +		vd->vdev_stat.vs_alloc = 0;
> > +	}
> > +
> > +	/*
> >  	 * Only remove any devices which are empty.
> >  	 */
> >  	if (vd->vdev_stat.vs_alloc != 0)
> >
> > Cheers,
> >=20
> > Stephen.
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >
>
> I have hit this case once again. The first time was one month ago.
> I had to back up my data, then destroy and recreate the pool to remove the
> "faulted" log device.
>
> I'll try your patch. I hope I'll be luckier than the OP. I have to back up
> first again.
>
> In my opinion, this problem may be related to a certain type of data or
> activity. I have had my pool for a few years now and added a log only some
> months ago.
> It is a little strange that it happened to me twice in such a short span
> of time while others are not affected.
>
> K.

I have successfully applied the patch and built the kernel. The removal of
the log device worked too. It was in the offline state; then I had to remove
the drive, so it was marked as unavailable before the removal.

My FreeBSD version:
FreeBSD my.host.name 9.3-STABLE FreeBSD 9.3-STABLE #0 r315141: Sun Mar 12
16:00:24 CET 2017     root@my.host.name:/usr/obj/usr/src/sys/GENERIC amd64

I'm still curious about why it is happening. Any ideas?

K.
