Date: Fri, 7 Jun 2013 00:32:34 +0200
From: mxb <mxb@alumni.chalmers.se>
To: Outback Dingo <outbackdingo@gmail.com>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: zpool export/import on failover - The pool metadata is corrupted
Message-ID: <4AEF62BB-BC52-41F7-9460-8FF9388AAEA9@alumni.chalmers.se>
In-Reply-To: <CAKYr3zxs2BzivStP+B9u9tdx38m0s-kO66jGuaMmtcHo-i576w@mail.gmail.com>
References: <D7F099CB-855F-43F8-ACB5-094B93201B4B@alumni.chalmers.se>
 <CAKYr3zyPLpLau8xsv3fCkYrpJVzS0tXkyMn4E2aLz29EMBF9cA@mail.gmail.com>
 <016B635E-4EDC-4CDF-AC58-82AC39CBFF56@alumni.chalmers.se>
 <CAKYr3zxs2BzivStP+B9u9tdx38m0s-kO66jGuaMmtcHo-i576w@mail.gmail.com>
Should I increase the sleep?

On 7 jun 2013, at 00:23, Outback Dingo <outbackdingo@gmail.com> wrote:

> On Thu, Jun 6, 2013 at 6:12 PM, mxb <mxb@alumni.chalmers.se> wrote:
>
> When MASTER goes down, CARP on the second node goes MASTER (devd.conf, and the script for lifting the pool):
>
> root@nfs2:/root # cat /etc/devd.conf
>
> notify 30 {
>     match "system" "IFNET";
>     match "subsystem" "carp0";
>     match "type" "LINK_UP";
>     action "/etc/zfs_switch.sh active";
> };
>
> notify 30 {
>     match "system" "IFNET";
>     match "subsystem" "carp0";
>     match "type" "LINK_DOWN";
>     action "/etc/zfs_switch.sh backup";
> };
>
> root@nfs2:/root # cat /etc/zfs_switch.sh
> #!/bin/sh
>
> DATE=`date +%Y%m%d`
> HOSTNAME=`hostname`
>
> ZFS_POOL="jbod"
>
> case $1 in
>     active)
>         echo "Switching to ACTIVE and importing ZFS" | mail -s ''$DATE': '$HOSTNAME' switching to ACTIVE' root
>         sleep 10
>         /sbin/zpool import -f jbod
>         /etc/rc.d/mountd restart
>         /etc/rc.d/nfsd restart
>         ;;
>     backup)
>         echo "Switching to BACKUP and exporting ZFS" | mail -s ''$DATE': '$HOSTNAME' switching to BACKUP' root
>         /sbin/zpool export jbod
>         /etc/rc.d/mountd restart
>         /etc/rc.d/nfsd restart
>         ;;
>     *)
>         exit 0
>         ;;
> esac
>
> This works most of the time, but sometimes I'm forced to re-create the pool. Those machines are supposed to go into production.
> Losing the pool (and the data inside it) stops me from deploying this setup.
>
>
> Right, this is standard fare for this type of setup, but what tells nodeA that nodeB has completed the export of the pool? "sleep 10" isn't likely to cut it in all scenarios; there is no true handshake mechanism, so it is feasible that nodeA is already trying to import the pool before nodeB has exported it fully.
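Rather than guessing at a delay, the "active" branch could poll until the pool actually shows up as importable before touching it. A minimal sketch of that idea (the wait_for helper and the 60-second timeout are illustrative assumptions, not part of the original scripts):

```shell
#!/bin/sh
# wait_for CMD TIMEOUT: run CMD once per second until it succeeds
# or TIMEOUT seconds have elapsed; returns 0 on success, 1 on timeout.
wait_for() {
    cmd=$1
    timeout=$2
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if eval "$cmd"; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# In zfs_switch.sh's "active" branch, instead of "sleep 10":
# "zpool import" with no arguments lists pools available for import,
# so wait until jbod appears there before importing it, e.g.:
#
# if wait_for '/sbin/zpool import 2>/dev/null | grep -q "pool: jbod"' 60; then
#     /sbin/zpool import jbod
# else
#     echo "jbod never became importable" | mail -s "failover aborted" root
# fi
```

This still isn't a true handshake between the nodes, but it at least stops the importing node from racing a peer whose export hasn't finished within the window.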
> Therein lies the potential for a problem.
>
> //mxb
>
> On 6 jun 2013, at 22:06, Outback Dingo <outbackdingo@gmail.com> wrote:
>
>> On Thu, Jun 6, 2013 at 3:24 PM, mxb <mxb@alumni.chalmers.se> wrote:
>>
>> Hello list,
>>
>> I have a two-head ZFS setup with an external disk enclosure behind a SAS expander.
>> This is a failover setup with CARP and devd triggering pool export/import.
>> One of the two nodes is the preferred master.
>>
>> When the master is rebooted, devd kicks in as CARP on the second node becomes master, and the second node picks up the ZFS disks from the external enclosure.
>> When the master comes back, CARP makes it master again, devd kicks in, and the pool gets exported from the second node and imported on the first one.
>>
>> However, I have experienced metadata corruption several times with this setup.
>> Note that the ZIL (mirrored) resides on the external enclosure. Only the L2ARC is both local and external - da1, da2, da13s2, da14s2.
>>
>> root@nfs2:/root # zpool import
>>    pool: jbod
>>      id: 17635654860276652744
>>   state: FAULTED
>>  status: The pool metadata is corrupted.
>>  action: The pool cannot be imported due to damaged devices or data.
>>    see: http://illumos.org/msg/ZFS-8000-72
>>  config:
>>
>>         jbod          FAULTED  corrupted data
>>           raidz3-0    ONLINE
>>             da3       ONLINE
>>             da4       ONLINE
>>             da5       ONLINE
>>             da6       ONLINE
>>             da7       ONLINE
>>             da8       ONLINE
>>             da9       ONLINE
>>             da10      ONLINE
>>             da11      ONLINE
>>             da12      ONLINE
>>         cache
>>           da1
>>           da2
>>           da13s2
>>           da14s2
>>         logs
>>           mirror-1    ONLINE
>>             da13s1    ONLINE
>>             da14s1    ONLINE
>>
>> Any ideas what is going on?
>> Best-case scenario: both nodes tried to import the disks simultaneously (a split-brain condition), or the disks appear out of order and you're not using labels. We had a similar situation with geom_multipath. There is no quorum-disk support yet for FreeBSD using ZFS in this configuration. We ran it similarly for a while, until we realized through research that it was bad karma to let CARP and devd control the nodes without a foolproof way to be sure the nodes were ready to export/import. Though you did state:
>>
>> "Then master comes back, CARP becomes master, devd kicks in and pool gets exported from the second node and imported on the first one."
>>
>> It would be nice to see how you managed that with your scripts... even if both nodes boot simultaneously, they're both going to "fight" for master and try to import that pool.
>>
>> //mxb
>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
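One way to narrow the split-brain window described above is to gate the import on the node's actual CARP state rather than on the link event alone, and to drop -f so ZFS's own "pool in use by another system" check can reject a pool the peer never cleanly exported. A sketch only; the interface name and the parsing of ifconfig's "carp:" line are assumptions about a typical FreeBSD carp(4) setup:

```shell
#!/bin/sh
# parse_carp_state: read ifconfig output on stdin and print the CARP
# state (MASTER/BACKUP/INIT) from the "carp:" line.
parse_carp_state() {
    awk '/carp:/ { print $2; exit }'
}

# carp_state IFACE: report the live CARP state of IFACE.
carp_state() {
    ifconfig "$1" 2>/dev/null | parse_carp_state
}

# Guard the import: only a confirmed MASTER may take the pool, and
# without -f the import fails if the peer still holds it, e.g.:
#
# if [ "$(carp_state carp0)" = "MASTER" ]; then
#     /sbin/zpool import jbod
# else
#     echo "not CARP master, refusing import" >&2
#     exit 1
# fi
```

This adds no real quorum, which FreeBSD lacks for this configuration, but it does mean a node that lost the CARP election can no longer force-import the pool out from under its peer.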