From: Kevin Day
Date: Tue, 9 Mar 2010 17:03:41 -0600
To: freebsd-fs@freebsd.org
Subject: iscsi over HAST backed storage partial success
Message-Id: <7418ECC2-55C1-4A28-82EA-0972AFE745EF@dragondata.com>

I'm running istgt (iscsi target) using HAST backed storage. For the most part, it seems to work really well. I have ucarp running to change the IP that istgt is bound to, and modified the ucarp scripts to start/stop istgt depending on which side is the master. If I shut down the primary, the secondary takes over and all seems well.
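For readers trying to picture the failover hooks described above, here is a rough sketch of what such ucarp up/down logic might look like. It is not the actual script from this setup. The function names, the rc.d script path, and the decision to switch the HAST role in the same hook are all assumptions made for illustration; the post only says that istgt is started and stopped. The resource name iscsi1 is taken from the log output.

```python
import subprocess

# Assumed location of the istgt rc script (hypothetical for this setup).
ISTGT_RC = "/usr/local/etc/rc.d/istgt"


def failover_commands(event, resource="iscsi1"):
    """Return, in order, the commands a ucarp hook might run.

    event is "up" when this node becomes master, "down" when it
    stops being master. Nothing is executed here.
    """
    if event == "up":
        # Take over the HAST resource first, then export it over iSCSI.
        return [
            ["hastctl", "role", "primary", resource],
            [ISTGT_RC, "onestart"],
        ]
    if event == "down":
        # Stop serving iSCSI before giving up the HAST resource.
        return [
            [ISTGT_RC, "onestop"],
            ["hastctl", "role", "secondary", resource],
        ]
    raise ValueError("event must be 'up' or 'down', got %r" % (event,))


def run_hook(event, resource="iscsi1", dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for argv in failover_commands(event, resource):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

Calling run_hook("up") prints the two commands without running them, which makes the ordering easy to check before wiring anything into ucarp's up and down scripts.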
However, if I reboot the secondary, the primary starts freezing up for long periods:

Mar 9 22:46:27 cs04 hastd: [iscsi1] (primary) Unable to r: Socket is not connected.
Mar 9 22:46:27 cs04 hastd: [iscsi1] (primary) Unable to co: Connection refused.
Mar 9 22:46:42 cs04 last message repeated 3 times
Mar 9 22:46:53 cs04 istgt[14298]: ABORT_TASK
Mar 9 22:47:35 cs04 last message repeated 3 times
Mar 9 22:48:02 cs04 hastd: [iscsi1] (primary) Unable to co: Operation timed out.
Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(45748), OP=0x2a, ElapsedTime=74 cleared
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c: 640:istgt_iscsi_write_pdu: ***ERROR*** iscsi_write() failed (errno=32)
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:3327:istgt_iscsi_op_task: ***ERROR*** iscsi_write_pdu() failed
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:3867:istgt_iscsi_execute: ***ERROR*** iscsi_op_task() failed
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:4337:worker: ***ERROR*** iscsi_execute() failed
Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(490802), OP=0x2a, ElapsedTime=73 cleared
Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(28387), OP=0x2a, ElapsedTime=73 cleared
Mar 9 22:48:14 cs04 istgt[14298]: ABORT_TASK
Mar 9 22:48:52 cs04 last message repeated 2 times
Mar 9 22:49:22 cs04 hastd: [iscsi1] (primary) Unable to co: Operation timed out.

As soon as the secondary comes back online, everything starts behaving again and all is well.

Is this expected behavior at this point, or should hastd not block like this?

-- Kevin