From owner-freebsd-bugs@freebsd.org Fri Aug 19 09:41:14 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 211990] iscsi fails to reconnect and does not release devices
Date: Fri, 19 Aug 2016 09:41:14 +0000
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
X-List-Received-Date: Fri, 19 Aug 2016 09:41:15 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211990

            Bug ID: 211990
           Summary: iscsi fails to reconnect and does not release devices
           Product: Base System
           Version: 10.3-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: ben.rubson@gmail.com

Hello,

I'm facing an issue where iscsictl does not want to remove devices.
Here is how I can reproduce this.

### Initiator :
# iscsictl -Aa
iscsictl then reports the 17 targets as connected, perfect.

### Target :
Let's switch down the network interface
# ifconfig mlxen1 down

### Initiator :
iscsictl reports the 17 targets as disconnected, perfect.

### Target :
Let's switch up the network interface
# ifconfig mlxen1 up

### Initiator :
iscsictl reports the 17 targets as connected, however, for 4 devices, I get
the following :
09:59:43 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:lg2): timed out waiting for iscsid(8) for 11 seconds; reconnecting
09:59:54 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:lg2): timed out waiting for iscsid(8) for 11 seconds; reconnecting
09:59:57 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:lg2): handoff on already connected session
07:59:57 srv1 iscsid[1372]: 192.168.2.2 (iqn.2012-06.srv2:lg2): ISCSIDHANDOFF: Device busy
09:59:57 srv1 iscsid[581]: child process 1372 terminated with exit status 1
09:59:57 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:lg2): connection error; reconnecting
09:59:57 srv1 kernel: (da21:iscsi8:0:0:0): got CAM status 0x8
09:59:57 srv1 kernel: (da21:iscsi8:0:0:0): fatal error, failed to attach to device
10:00:07 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:lg2): no ping reply (NOP-In) after 10 seconds; reconnecting
10:00:08 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:lg2): no ping reply (NOP-In) after 10 seconds; reconnecting

### Target :
09:58:50 srv2 kernel: mlxen1: link state changed to DOWN
09:58:50 srv2 kernel: mlx4_en: mlxen1: Link Down
09:58:53 srv2 kernel: WARNING: 192.168.2.1 (iqn.1994-09.org.freebsd:srv1): no ping reply (NOP-Out) after 5 seconds; dropping connection
09:58:53 srv2 last message repeated 16 times
09:59:49 srv2 kernel: mlx4_en: mlxen1: Link Up
09:59:49 srv2 kernel: mlxen1: link state changed to UP
09:59:49 srv2 devd: Executing '/etc/rc.d/dhclient quietstart mlxen1'
09:59:59 srv2 kernel: WARNING: 192.168.2.1 (iqn.1994-09.org.freebsd:srv1): connection error; dropping connection
09:59:59 srv2 last message repeated 3 times

### Initiator :
# iscsictl -Ra
# iscsictl -L
Target name                          Target portal    State
iqn.2012-06.srv2:sW1                 192.168.2.2      Connected: da18
iqn.2012-06.srv2:sW2                 192.168.2.2      Connected: da23
iqn.2012-06.srv2:rT3                 192.168.2.2      Connected: da17
iqn.2012-06.srv2:lg2                 192.168.2.2      Connected: da21

As you can see, the 4 problematic devices remain "connected", nodes exist in
/dev/, but they are unusable.

Each time I "iscsictl -Ra", I get the following on initiator side :
10:09:35 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:lg2): connection error; reconnecting
10:09:35 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:sW1): connection error; reconnecting
10:09:35 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:rT3): connection error; reconnecting
10:09:35 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:sW2): connection error; reconnecting

No logs however on target side, even if I start ctld with -d.

The only workaround I found is to reboot, or to change the target name to
properly reconnect...

# uname -v
FreeBSD 10.3-RELEASE-p7 #0: Thu Aug 11 18:38:15 UTC 2016
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC

Thank you for your support,

Best regards,

Ben

-- 
You are receiving this mail because:
You are the assignee for the bug.
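
For anyone reproducing this, here is a minimal sketch of how the sessions left over after "iscsictl -Ra" could be listed. The helper name list_connected is hypothetical (it is not part of FreeBSD), and the column layout is assumed from the "iscsictl -L" listing in the report; other State values may need different handling.

```sh
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: reads `iscsictl -L` output on
# stdin and prints "target device" for every session still marked Connected.
# Assumes whitespace-separated columns: target name, portal, "Connected:", device.
list_connected() {
    awk '$3 == "Connected:" { print $1, $4 }'
}

# Sample input copied from the listing in the report. On a live initiator
# one would instead run:  iscsictl -Ra; iscsictl -L | list_connected
sample='Target name                          Target portal    State
iqn.2012-06.srv2:sW1                 192.168.2.2      Connected: da18
iqn.2012-06.srv2:sW2                 192.168.2.2      Connected: da23
iqn.2012-06.srv2:rT3                 192.168.2.2      Connected: da17
iqn.2012-06.srv2:lg2                 192.168.2.2      Connected: da21'

printf '%s\n' "$sample" | list_connected
```

Any line this prints after "iscsictl -Ra" names a session that was not released, which is the symptom described in the report.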