Date: Wed, 28 Oct 2015 21:57:21 -0400
From: "Michael W. Lucas" <mwlucas@michaelwlucas.com>
To: fs@freebsd.org
Subject: iSCSI/ZFS strangeness
Message-ID: <20151029015721.GA95057@mail.michaelwlucas.com>

Hi,

I'm experimenting with iSCSI HA with FreeBSD 10.2 amd64. I know people
do this sort of thing, but I can't figure out what I'm missing. (Most
of the tutorials cover HAST instead.) I suspect the real problem is
"Lucas doesn't know the right search terms."

The goal is to make an iSCSI-based ZFS pool that's available to two
separate hosts, and remains available even if one of the iSCSI servers
fails. Instead, the pool hangs when either of the iSCSI servers goes
down.

My test environment has two iSCSI servers, iscsi1 and iscsi2. They
each export three drives as a single target.

There are two iSCSI initiators, zfs1 and zfs2. Both of them have
active connections to the iSCSI targets.

On another host I've created a ZFS pool of striped mirrors. Each
mirror has one drive from each iSCSI server.

The initiators can both access the iSCSI-based pool--not
simultaneously, of course. But CARP, devd, and some shell scripting
should get me a highly available pool that can withstand the demise of
any one iSCSI server and any one initiator.

The hope is that the pool would continue to work even if an iSCSI host
shuts down. When the downed iSCSI host returns, the initiators should
log back in and the pool auto-resilver.

Some ten minutes ago, I killed iscsi2. The pool is live on zfs1. And
the drives really have disappeared.

    # iscsictl
    Target name                  Target portal                State
    iqn.2013-11.io.mwl:target0   iscsi2.blackhelicopters.org  Operation timed out
    iqn.2013-11.io.mwl:target0   iscsi1.blackhelicopters.org  Connected: da2 da3 da4

I would expect to see the pool appear degraded. But instead, I have:

    # zpool status iscsi
      pool: iscsi
     state: ONLINE
      scan: resilvered 1.16G in 0h3m with 0 errors on Wed Oct 28 14:13:08 2015
    config:

            NAME              STATE     READ WRITE CKSUM
            iscsi             ONLINE       0     0     0
              mirror-0        ONLINE       0     0     0
                gpt/iscsi1-0  ONLINE       0     0     0
                gpt/iscsi2-0  ONLINE       0     0     0
              mirror-1        ONLINE       0     0     0
                gpt/iscsi1-1  ONLINE       0     0     0
                gpt/iscsi2-1  ONLINE       0     0     0
              mirror-2        ONLINE       0     0     0
                gpt/iscsi1-2  ONLINE       0     0     0
                gpt/iscsi2-2  ONLINE       0     0     0

    errors: No known data errors

To try to make ZFS realize the pool is degraded, I write to the iSCSI
pool (tar -xvpf ports.tar.gz). Each time, the extract gets to a
certain point and hangs. Can't ^C or ^Z out of it. This latest time,
the extract reaches:

    x ports/www/firefox-esr/files/patch-media-mtransport-third_party-nICEr-src-util-mbslen.c

I can still SSH into the machine, but if I try to look in any
directories under /iscsi/ports/* my terminal hangs.

So I restart the downed iSCSI server. The initiators log back into the
target. And the hung tar extract picks up where it left off.

So, I haven't achieved HA. The pool stays up, but it's not exactly
usable.

Any hints on what I'm missing?

Thanks,
==ml

-- 
Michael W. Lucas - mwlucas@michaelwlucas.com, Twitter @mwlauthor
http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/
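
For reference, a pool with the striped-mirror layout shown in the zpool
status output above could be created with a command along the lines of
the one this sketch prints. It is only an illustration, not the command
actually used in the message: the pool name and gpt/ labels are taken
from the status output, the labels are assumed to already exist on the
initiator, and the build_zpool_cmd helper is hypothetical. The script
only prints the command; it does not run zpool.

```shell
#!/bin/sh
# Sketch only: prints (does not execute) a zpool create command for the
# striped-mirror layout seen in the zpool status output.
# Assumptions: pool name "iscsi" and labels gpt/iscsiN-M come from that
# output; build_zpool_cmd is a hypothetical helper for illustration.

POOL=iscsi
MIRRORS=3   # one mirror vdev per pair of exported drives

build_zpool_cmd() {
    cmd="zpool create $POOL"
    i=0
    while [ "$i" -lt "$MIRRORS" ]; do
        # Each mirror pairs drive i from iscsi1 with drive i from iscsi2,
        # so losing one iSCSI server leaves one side of every mirror.
        cmd="$cmd mirror gpt/iscsi1-$i gpt/iscsi2-$i"
        i=$((i + 1))
    done
    echo "$cmd"
}

build_zpool_cmd
```

Pairing one drive from each server in every mirror is what should let
the pool survive the loss of either iSCSI server, at least in principle.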