From: bugzilla-noreply@freebsd.org
To: freebsd-scsi@FreeBSD.org
Subject: [Bug 211990] iscsi fails to reconnect and does not release devices
Date: Thu, 10 Nov 2016 09:57:56 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.3-RELEASE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: julien@perdition.city
X-Bugzilla-Status: Open
List-Id: SCSI subsystem
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211990

--- Comment #18 from Julien Cigar ---
The problem appeared again today, after ~15 days of uptime, still on:

FreeBSD filer1.prod.lan 10.3-RELEASE-p11 FreeBSD 10.3-RELEASE-p11 #0: Mon Oct 24 18:49:24 UTC 2016 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): no ping reply (NOP-In) after 5 seconds; reconnecting
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In) after 5 seconds; reconnecting
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 01 ef ec 90 00 00 01 00
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 01 ef ec 8e 00 00 01 00
(da3:(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
iscsi1:0:(da2:0:iscsi2:0:0): 0:Retrying command
0): Retrying command
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: s/n MYSERIAL 0 detached
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: s/n MYSERIAL 1 detached
(da2:iscsi2:0:0:0): Periph destroyed
(da3:iscsi1:0:0:0): Periph destroyed
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL 1
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In) after 5 seconds; reconnecting
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 1c 14 10 0f 00 00 01 00
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): Retrying command
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: s/n MYSERIAL 1 detached
(da2:iscsi2:0:0:0): Periph destroyed
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL 1
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): login timed out after 6 seconds; reconnecting
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: Fixed Direct Access SPC-4 SCSI device
da3: Serial Number MYSERIAL 0
da3: 150.000MB/s transfers
da3: Command Queueing enabled
da3: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In) after 5 seconds; reconnecting
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 1c 14 10 0f 00 00 01 00
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): Retrying command
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: s/n MYSERIAL 1 detached
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In) after 5 seconds; reconnecting
(da2:iscsi2:0:0:0): Periph destroyed
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL 1
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): handoff on already connected session
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): connection error; reconnecting
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: s/n MYSERIAL 0 detached
(da3:iscsi1:0:0:0): Periph destroyed
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: Fixed Direct Access SPC-4 SCSI device
da3: Serial Number MYSERIAL 0
da3: 150.000MB/s transfers
da3: Command Queueing enabled
da3: 1840144MB (471076881 4096 byte sectors)

After a zpool online, and with vfs.zfs.scrub_delay = 0 and
vfs.zfs.resilver_delay = 0, I issued a zpool scrub and again got a timeout:

WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): no ping reply (NOP-In) after 5 seconds; reconnecting
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In) after 5 seconds; reconnecting
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 9a 67 00 00 01 00
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 98 3c 00 00 01 00
(da3:(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
iscsi1:0:(da2:0:iscsi2:0:0): 0:Retrying command
0): Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 9f e4 00 00 01 00
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 a8 cc 00 00 01 00
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:(da2:iscsi1:0:iscsi2:0:0:0:0): 0): Retrying command
Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 a2 42 00 00 01 00
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 95 6e 00 00 20 00
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:(da2:iscsi1:0:iscsi2:0:0:0:0): 0): Retrying command
Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 96 4e 00 00 20 00
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 95 8e 00 00 20 00
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:(da2:iscsi1:0:iscsi2:0:0:0:0): 0): Retrying command
Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 96 6e 00 00 20 00
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 a9 b2 00 00 01 00
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:(da2:iscsi1:0:iscsi2:0:0:0:0): 0): Retrying command
Retrying command
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: s/n MYSERIAL 0 detached
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: s/n MYSERIAL 1 detached
(da3:iscsi1:0:0:0): Periph destroyed
(da2:iscsi2:0:0:0): Periph destroyed
da2 at iscsi1 bus 0 scbus4 target 0 lun 0
da2: Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL 0
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
da3 at iscsi2 bus 0 scbus3 target 0 lun 0
da3: Fixed Direct Access SPC-4 SCSI device
da3: Serial Number MYSERIAL 1
da3: 150.000MB/s transfers
da3: Command Queueing enabled
da3: 1840144MB (471076881 4096 byte sectors)

I've raised those timeouts a little bit:

kern.iscsi.login_timeout: 30
kern.iscsi.iscsid_timeout: 30
kern.iscsi.ping_timeout: 30

and will see if it makes any difference.
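For anyone trying to correlate the aborted reads in the logs above, the READ(10) CDBs can be decoded into a starting block and a length. The following is only a minimal Python sketch, assuming the standard 10-byte READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and the 4096-byte sector size that the da2/da3 probe lines report. The function name decode_read10 is hypothetical and not part of any FreeBSD tool.

```python
def decode_read10(cdb_hex, sector_size=4096):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    sector_size defaults to 4096 because the probe lines in the log
    report "4096 byte sectors"; adjust it for other devices.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return {
        "lba": lba,
        "blocks": blocks,
        "byte_offset": lba * sector_size,
        "byte_length": blocks * sector_size,
    }


if __name__ == "__main__":
    # Two CDBs copied from the log excerpts in this comment.
    for cdb in ("28 00 01 ef ec 90 00 00 01 00",
                "28 00 00 b9 95 6e 00 00 20 00"):
        print(cdb, "->", decode_read10(cdb))
```

Decoded this way, the aborted commands in the log are reads of either 1 block (length bytes 00 01) or 32 blocks (length bytes 00 20), that is, 4 KiB or 128 KiB at the reported sector size.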