From: mxb <mxb@alumni.chalmers.se>
Subject: Re: zpool export/import on failover - The pool metadata is corrupted
Date: Fri, 7 Jun 2013 00:32:34 +0200
To: Outback Dingo
Cc: "freebsd-fs@freebsd.org"
List-Id: Filesystems <freebsd-fs.freebsd.org>

Should I increase the sleep?
On 7 jun 2013, at 00:23, Outback Dingo wrote:

> On Thu, Jun 6, 2013 at 6:12 PM, mxb wrote:
>
> When MASTER goes down, CARP on the second node becomes MASTER (devd.conf, and a script for lifting):
>
> root@nfs2:/root # cat /etc/devd.conf
>
> notify 30 {
>         match "system"          "IFNET";
>         match "subsystem"       "carp0";
>         match "type"            "LINK_UP";
>         action "/etc/zfs_switch.sh active";
> };
>
> notify 30 {
>         match "system"          "IFNET";
>         match "subsystem"       "carp0";
>         match "type"            "LINK_DOWN";
>         action "/etc/zfs_switch.sh backup";
> };
>
> root@nfs2:/root # cat /etc/zfs_switch.sh
> #!/bin/sh
>
> DATE=`date +%Y%m%d`
> HOSTNAME=`hostname`
>
> ZFS_POOL="jbod"
>
> case $1 in
> active)
>         echo "Switching to ACTIVE and importing ZFS" | mail -s "$DATE: $HOSTNAME switching to ACTIVE" root
>         sleep 10
>         /sbin/zpool import -f $ZFS_POOL
>         /etc/rc.d/mountd restart
>         /etc/rc.d/nfsd restart
>         ;;
> backup)
>         echo "Switching to BACKUP and exporting ZFS" | mail -s "$DATE: $HOSTNAME switching to BACKUP" root
>         /sbin/zpool export $ZFS_POOL
>         /etc/rc.d/mountd restart
>         /etc/rc.d/nfsd restart
>         ;;
> *)
>         exit 0
>         ;;
> esac
>
> This works most of the time, but sometimes I'm forced to re-create the pool. These machines are supposed to go into production.
> Losing the pool (and the data inside it) stops me from deploying this setup.
>
>
> Right, this is the type of standard fare for the setup, but what tells nodeA that nodeB has completed the export of the pool?? sleep 10 isn't likely to cut it in all scenarios, hence there is no true mechanism; therefore it is feasible that nodeA is already trying to import said pool before nodeB has fully exported it. Therein lies the potential for a problem.
>
> //mxb
>
> On 6 jun 2013, at 22:06, Outback Dingo wrote:
>
>> On Thu, Jun 6, 2013 at 3:24 PM, mxb wrote:
>>
>> Hello list,
>>
>> I have a two-head ZFS setup with an external disk enclosure over a SAS expander.
>> This is a failover setup, with CARP and devd triggering pool export/import.
>> One of the two nodes is the preferred master.
>>
>> When the master is rebooted, devd kicks in as CARP on the second node becomes master, and the second node picks up the ZFS disks from the external enclosure.
>> When the master comes back, CARP becomes master, devd kicks in, and the pool gets exported from the second node and imported on the first one.
>>
>> However, I have experienced metadata corruption several times with this setup.
>> Note that the ZIL (mirrored) resides on the external enclosure. Only the L2ARC is both local and external - da1, da2, da13s2, da14s2.
>>
>> root@nfs2:/root # zpool import
>>    pool: jbod
>>      id: 17635654860276652744
>>   state: FAULTED
>>  status: The pool metadata is corrupted.
>>  action: The pool cannot be imported due to damaged devices or data.
>>     see: http://illumos.org/msg/ZFS-8000-72
>>  config:
>>
>>         jbod          FAULTED  corrupted data
>>           raidz3-0    ONLINE
>>             da3       ONLINE
>>             da4       ONLINE
>>             da5       ONLINE
>>             da6       ONLINE
>>             da7       ONLINE
>>             da8       ONLINE
>>             da9       ONLINE
>>             da10      ONLINE
>>             da11      ONLINE
>>             da12      ONLINE
>>         cache
>>           da1
>>           da2
>>           da13s2
>>           da14s2
>>         logs
>>           mirror-1    ONLINE
>>             da13s1    ONLINE
>>             da14s1    ONLINE
>>
>> Any ideas what is going on?
>>
>> Best case scenario: both nodes tried to import the disks simultaneously (a split-brain condition), or the disks appear out of order and you are not using labels. We had a similar situation with geom_multipath. There's no quorum-disk knowledge yet for FreeBSD using ZFS in this configuration. We ran it similarly for a while, until we realized through research that it was bad karma to let CARP and devd control the nodes without a foolproof way to be sure the nodes were ready to export/import. Though you did state:
>>
>> "When the master comes back, CARP becomes master, devd kicks in, and the pool gets exported from the second node and imported on the first one." It would be nice to see how you managed that with scripts... even if both nodes booted simultaneously, they're both going to "fight" for master and try to import that pool.
>>
>> //mxb
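
The race Outback Dingo points out is the crux: sleep 10 only hopes the peer has finished its export. Below is a minimal sketch of an active handshake for the "active)" branch instead, assuming key-based ssh root login between the heads; the peer hostname nfs1 is hypothetical (the thread only shows nfs2):

#!/bin/sh
# Sketch: poll the peer until it no longer has the pool imported,
# instead of sleeping for a fixed time. PEER is a hypothetical name.
PEER="nfs1"
ZFS_POOL="jbod"
TIMEOUT=60

waited=0
# "zpool list <pool>" exits non-zero when the pool is not imported there,
# and ssh itself fails if the peer is down.
while ssh -o ConnectTimeout=5 root@${PEER} \
        "zpool list -H -o name ${ZFS_POOL}" >/dev/null 2>&1; do
        sleep 2
        waited=$((waited + 2))
        if [ ${waited} -ge ${TIMEOUT} ]; then
                echo "peer ${PEER} never released ${ZFS_POOL}" | \
                    mail -s "`hostname`: refusing to import ${ZFS_POOL}" root
                exit 1  # refuse to force-import while the peer may hold it
        fi
done

# Try a normal import first; force only if ZFS still sees a stale
# ownership mark left by the peer.
/sbin/zpool import ${ZFS_POOL} || /sbin/zpool import -f ${ZFS_POOL}

Note this still cannot tell a dead peer from a partitioned one: if ssh fails because the link between the heads is down while the peer still holds the pool, the import proceeds and you are back to split brain. That is exactly the quorum problem raised above; only an independent witness (a third host, or a reservation on the shared disks) can break that tie.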
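On the disks-appearing-out-of-order point: the import can be made independent of da* numbering. zpool import accepts the pool's numeric id (shown in the zpool import output above), and -d can point it at a directory of stable, label-based device nodes. This assumes the disks carry GPT labels, which the bare da3..da12 names in the thread suggest they do not yet:

# Import by pool id rather than by name, scanning label-based nodes.
# /dev/gpt only exists if the providers have GPT labels (an assumption here).
/sbin/zpool import -d /dev/gpt 17635654860276652744

This does not fix the handshake race, but it removes one failure mode: a pool assembled from devices that renumbered between boots.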