From owner-freebsd-scsi@freebsd.org Tue Oct 3 06:19:08 2017
Subject: Re: ZFS stalled after some mirror disks were lost
From: Ben RUBSON
In-Reply-To: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com>
Date: Tue, 3 Oct 2017 08:19:04 +0200
References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com>
To: Freebsd fs, FreeBSD-scsi

Hi,

Putting scsi list as it could be related.

> On 02 Oct 2017, at 20:12, Ben RUBSON wrote:
>
> Hi,
>
> On a FreeBSD 11 server, the following online/healthy zpool :
>
> home
>   mirror-0
>     label/local1
>     label/local2
>     label/iscsi1
>     label/iscsi2
>   mirror-1
>     label/local3
>     label/local4
>     label/iscsi3
>     label/iscsi4
> cache
>   label/local5
>   label/local6
>
> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
> according to "zpool iostat", nothing on local disks (strange but I
> noticed that IOs always prefer iscsi disks to local disks).
> No write IOs.
>
> Let's disconnect all iSCSI disks :
> iscsictl -Ra
>
> Expected behavior :
> IO activity flawlessly continues on local disks.
>
> What happened :
> All IOs stalled, server only answers to IOs made to its zroot pool.
> All commands related to the iSCSI disks (iscsictl), or to ZFS (zfs/zpool),
> don't return.
>
> Questions :
> Why this behavior ?
> How to know what happens ? (/var/log/messages says almost nothing)
>
> I already disconnected the iSCSI disks without any issue in the past,
> several times, but there were almost no IOs running.
>
> Thank you for your help !
>
> Ben

> On 02 Oct 2017, at 22:55, Andriy Gapon wrote:
>
>> On 02/10/2017 22:13, Ben RUBSON wrote:
>>
>>> On 02 Oct 2017, at 20:45, Andriy Gapon wrote:
>>>
>>>> On 02/10/2017 21:17, Ben RUBSON wrote:
>>>>
>>>> Unfortunately the zpool command stalls / does not return :/
>>>
>>> Try to take procstat -kk -a.
>>
>> Here is the procstat output :
>> https://benrubson.github.io/zfs/procstat01.log
>
> First, it seems that there are some iscsi threads stuck on a lock like:
>     0 100291 kernel iscsimt mi_switch+0xd2 sleepq_wait+0x3a _sx_xlock_hard+0x592 iscsi_maintenance_thread+0x316 fork_exit+0x85 fork_trampoline+0xe
>
> or like
>
>  8580 102077 iscsictl - mi_switch+0xd2 sleepq_wait+0x3a _sx_slock_hard+0x325 iscsi_ioctl+0x7ea devfs_ioctl_f+0x13f kern_ioctl+0x2d4 sys_ioctl+0x171 amd64_syscall+0x4ce Xfast_syscall+0xfb
>
> Also, there is a thread in cam_sim_free():
>     0 100986 kernel iscsimt mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 cam_sim_free+0x48 iscsi_session_cleanup+0x1bd iscsi_maintenance_thread+0x388 fork_exit+0x85 fork_trampoline+0xe
>
> So, it looks like there could be a problem in the iscsi teardown path.
>
> Maybe that caused a domino effect in ZFS code. I see a lot of threads waiting
> either for spa_namespace_lock or a spa config lock (a highly specialized ZFS
> lock). But it is hard to untangle their inter-dependencies.
>
> Some of the ZFS I/O threads are also affected, for example:
>     0 101538 kernel zio_write_issue_ mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 taskqueue_run_locked+0x14a taskqueue_thread_loop+0xe8 fork_exit+0x85 fork_trampoline+0xe
>  8716 101319 sshd - mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 zio_nowait+0x49 arc_read+0x8e4 dbuf_read+0x6c2 dmu_buf_hold_array_by_dnode+0x1d3 dmu_read_uio_dnode+0x41 dmu_read_uio_dbuf+0x3b zfs_freebsd_read+0x5fc VOP_READ_APV+0x89 vn_read+0x157 vn_io_fault1+0x1c2 vn_io_fault+0x197 dofileread+0x98
> 71181 101141 encfs - mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 zio_nowait+0x49 arc_read+0x8e4 dbuf_read+0x6c2 dmu_buf_hold+0x3d zap_lockdir+0x43 zap_cursor_retrieve+0x171 zfs_freebsd_readdir+0x3f3 VOP_READDIR_APV+0x8f kern_getdirentries+0x21b sys_getdirentries+0x28 amd64_syscall+0x4ce Xfast_syscall+0xfb
> 71181 101190 encfs - mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 zio_nowait+0x49 arc_read+0x8e4 dbuf_prefetch_indirect_done+0xcc arc_read+0x425 dbuf_prefetch+0x4f7 dmu_zfetch+0x418 dmu_buf_hold_array_by_dnode+0x34d dmu_read_uio_dnode+0x41 dmu_read_uio_dbuf+0x3b zfs_freebsd_read+0x5fc VOP_READ_APV+0x89 vn_read+0x157
>
> Note that the first of these threads executes a write zio.
>
> It would be nice to determine an owner of spa_namespace_lock.
> If you have debug symbols then it can be easily done in kgdb on the live system:
> (kgdb) p spa_namespace_lock

So as said a few minutes ago I lost access to the server and had to recycle it.
Thankfully I managed to reproduce the issue, re-playing exactly the same steps.
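
The manual triage quoted above (spotting which threads sleep where in the procstat -kk -a output) can be approximated with a small script. This is only a sketch under stated assumptions: the column layout (PID, TID, COMM, TDNAME, then the kernel stack) is taken from the lines quoted in this thread, and the GENERIC_FRAMES set and the function names parse_line and summarize are hypothetical helpers chosen for illustration, not part of any FreeBSD tool.

```python
from collections import defaultdict

# Frames that only say "this thread is asleep or taking a lock". The first
# frame that is NOT in this set is treated as the place the thread is stuck.
# The set is an assumption built from the traces quoted in this thread.
GENERIC_FRAMES = {
    "mi_switch", "sleepq_wait", "_sleep", "_cv_wait",
    "_sx_xlock_hard", "_sx_slock_hard",
}

def parse_line(line):
    """Split one procstat -kk line into (pid, tid, comm, tdname, frames)."""
    tokens = line.split()
    # The kernel stack starts at the first token that looks like func+0xNN.
    first = next((i for i, t in enumerate(tokens) if "+0x" in t), None)
    if first is None or first < 3:
        return None  # header line, blank line, or a thread without a stack
    if not (tokens[0].isdigit() and tokens[1].isdigit()):
        return None
    frames = [t.split("+0x")[0] for t in tokens[first:]]
    tdname = " ".join(tokens[3:first]) or "-"
    return int(tokens[0]), int(tokens[1]), tokens[2], tdname, frames

def summarize(text):
    """Group thread IDs by the first non-generic frame of their stack."""
    groups = defaultdict(list)
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        _pid, tid, _comm, _tdname, frames = parsed
        where = next((f for f in frames if f not in GENERIC_FRAMES), None)
        if where is not None:
            groups[where].append(tid)
    return dict(groups)

# Four of the traces quoted in this thread, one thread per line.
SAMPLE = """\
    0 100291 kernel iscsimt mi_switch+0xd2 sleepq_wait+0x3a _sx_xlock_hard+0x592 iscsi_maintenance_thread+0x316 fork_exit+0x85 fork_trampoline+0xe
 8580 102077 iscsictl - mi_switch+0xd2 sleepq_wait+0x3a _sx_slock_hard+0x325 iscsi_ioctl+0x7ea devfs_ioctl_f+0x13f kern_ioctl+0x2d4 sys_ioctl+0x171 amd64_syscall+0x4ce Xfast_syscall+0xfb
    0 100986 kernel iscsimt mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 cam_sim_free+0x48 iscsi_session_cleanup+0x1bd iscsi_maintenance_thread+0x388 fork_exit+0x85 fork_trampoline+0xe
    0 101538 kernel zio_write_issue_ mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 taskqueue_run_locked+0x14a taskqueue_thread_loop+0xe8 fork_exit+0x85 fork_trampoline+0xe
"""

if __name__ == "__main__":
    for where, tids in summarize(SAMPLE).items():
        print(where, tids)
```

Run against a full procstat log, a grouping like this makes it quicker to see how many threads pile up behind spa_config_enter versus the iscsi teardown functions, though it says nothing about which thread actually holds each lock.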
Curious line in /var/log/messages :
kernel: g_access(918): provider da18 has error
(da18 is the remaining iSCSI target device which did not disconnect properly)

procstat -kk -a :
https://benrubson.github.io/zfs/procstat02.log

(kgdb) p spa_namespace_lock
$1 = -2110867066

Thank you !

Ben

From owner-freebsd-scsi@freebsd.org Tue Oct 3 06:22:10 2017
Subject: Re: ZFS stalled after some mirror disks were lost
From: Ben RUBSON
Date: Tue, 3 Oct 2017 08:22:06 +0200
Cc: Steven Hartland, Freebsd fs, FreeBSD-scsi
References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com>
To: Andriy Gapon

> On 03 Oct 2017, at 08:12, Andriy Gapon wrote:
>
> On 02/10/2017 21:12, Ben RUBSON wrote:
>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>> according to "zpool iostat", nothing on local disks (strange but I
>> noticed that IOs always prefer iscsi disks to local disks).
>
> Are your local disks SSD or HDD?

HDD.
> Could it be that iSCSI disks appear to be faster than the local disks to the
> smart ZFS mirror code?

Or because their /dev/da numbers are greater than the local ones ?
(as they are attached after the local disks)
(my 2 cents...)

For sure we could have expected the local disks to be preferred,
or at least the load to be spread among all (local & iscsi) disks.

> Steve, what do you think?

From owner-freebsd-scsi@freebsd.org Tue Oct 3 07:31:53 2017
Subject: Re: ZFS stalled after some mirror disks were lost
From: Ben RUBSON
In-Reply-To: <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org>
Date: Tue, 3 Oct 2017 09:31:50 +0200
Cc: Freebsd fs, FreeBSD-scsi
Message-Id: <1990B359-FC8D-4D6A-992B-7F77A07D83A6@gmail.com>
References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org>
To: Steven Hartland, Andriy Gapon

> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
>
> On 03/10/2017 07:12, Andriy Gapon wrote:
>> On 02/10/2017 21:12, Ben RUBSON wrote:
>>
>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>>> according to "zpool iostat", nothing on local disks (strange but I
>>> noticed that IOs always prefer iscsi disks to local disks).
>>>
>> Are your local disks SSD or HDD?
>> Could it be that iSCSI disks appear to be faster than the local disks to the
>> smart ZFS mirror code?
>>
>> Steve, what do you think?
>>
> Yes, that's quite possible. The mirror balancing uses the queue depth plus a rotating bias to determine the load of each disk, so if your iSCSI host is processing well and / or is reporting non-rotating (vs rotating for the local disks), it could well be that the mirror is preferring reads from the less loaded iSCSI devices.

Note that local & iscsi disks are _exactly_ the same (same model number, same SAS adapter...).
So iSCSI ones should be a little bit slower due to network latency (even if it's very low in my case).

Once production is back, after having analysed the main issue of this thread, I should then
try to find out whether or not iSCSI disks are seen as rotating disks.

Thanks for the hint !
Ben

From owner-freebsd-scsi@freebsd.org Tue Oct 3 07:39:37 2017
Subject: Re: ZFS stalled after some mirror disks were lost
To: Ben RUBSON, Andriy Gapon
Cc: Freebsd fs, FreeBSD-scsi
References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <1990B359-FC8D-4D6A-992B-7F77A07D83A6@gmail.com>
From: Steven Hartland
Message-ID: <9bce89eb-4d6f-aec1-df44-ebf794a3123b@multiplay.co.uk>
Date: Tue, 3 Oct 2017 08:39:36 +0100
In-Reply-To: <1990B359-FC8D-4D6A-992B-7F77A07D83A6@gmail.com>

On 03/10/2017 08:31, Ben RUBSON wrote:
>> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
>>
>> On 03/10/2017 07:12, Andriy Gapon wrote:
>>> On 02/10/2017 21:12, Ben RUBSON wrote:
>>>
>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>>>> according to "zpool iostat", nothing on local disks (strange but I
>>>> noticed that IOs always prefer iscsi disks to local disks).
>>>>
>>> Are your local disks SSD or HDD?
>>> Could it be that iSCSI disks appear to be faster than the local disks to the
>>> smart ZFS mirror code?
>>>
>>> Steve, what do you think?
>>>
>> Yes, that's quite possible. The mirror balancing uses the queue depth plus a rotating bias to determine the load of each disk, so if your iSCSI host is processing well and / or is reporting non-rotating (vs rotating for the local disks), it could well be that the mirror is preferring reads from the less loaded iSCSI devices.
> Note that local & iscsi disks are _exactly_ the same (same model number, same SAS adapter...).
> So iSCSI ones should be a little bit slower due to network latency (even if it's very low in my case).
> Once production is back, after having analysed the main issue of this thread, I should then
> try to find out whether or not iSCSI disks are seen as rotating disks.
>
> Thanks for the hint !

Hmm, the output from gstat -dp on a loaded machine would be interesting to see too.
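
To make the "queue depth + rotating bias" idea quoted above concrete, here is a minimal sketch of such a selection rule. It is not the ZFS mirror code: the Child class, the pick_child function and the bias constants are hypothetical and chosen only for illustration, and ties simply go to the first child listed.

```python
from dataclasses import dataclass

# Hypothetical bias values: a non-sequential read costs more on a disk that
# reports itself as rotating than on one that reports non-rotating.
ROTATING_SEEK_BIAS = 5
NON_ROTATING_SEEK_BIAS = 1

@dataclass
class Child:
    name: str
    pending: int      # I/Os currently queued on this disk (queue depth)
    rotating: bool    # what the device reports, not what it physically is
    last_offset: int  # offset at which the previous I/O on this disk ended

def load(child, offset):
    """Queue depth plus a bias that is only charged for non-sequential reads."""
    if child.last_offset == offset:
        return child.pending  # sequential read: no seek penalty in this sketch
    bias = ROTATING_SEEK_BIAS if child.rotating else NON_ROTATING_SEEK_BIAS
    return child.pending + bias

def pick_child(children, offset):
    """Return the least loaded child; min() keeps the first one on ties."""
    return min(children, key=lambda c: load(c, offset))

if __name__ == "__main__":
    local = Child("label/local1", pending=0, rotating=True, last_offset=0)
    iscsi = Child("label/iscsi1", pending=0, rotating=False, last_offset=0)
    # Same idle queues, but the iSCSI disk is reported as non-rotating.
    print(pick_child([local, iscsi], offset=10 << 20).name)
```

Under these assumptions, two idle disks that both report as rotating tie and the first one wins, while an iSCSI disk that reports as non-rotating wins every non-sequential read even though it is physically the same model. That is why checking how the iSCSI devices are reported is a reasonable next step.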
Regards
Steve

From owner-freebsd-scsi@freebsd.org Tue Oct 3 11:43:14 2017
Subject: Re: ZFS stalled after some mirror disks were lost
From: Ben RUBSON
Date: Tue, 3 Oct 2017 13:43:09 +0200
Message-Id: <63B239EB-47F0-4DDA-982A-794E5B5FC56F@gmail.com>
References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com>
To: Freebsd fs, FreeBSD-scsi

> On 03 Oct 2017, at 08:19, Ben RUBSON wrote:
>
> Hi,
>
> Putting scsi list as it could be related.
>
>> On 02 Oct 2017, at 20:12, Ben RUBSON wrote:
>>
>> Hi,
>>
>> On a FreeBSD 11 server, the following online/healthy zpool :
>>
>> home
>>   mirror-0
>>     label/local1
>>     label/local2
>>     label/iscsi1
>>     label/iscsi2
>>   mirror-1
>>     label/local3
>>     label/local4
>>     label/iscsi3
>>     label/iscsi4
>> cache
>>   label/local5
>>   label/local6
>>
>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>> according to "zpool iostat", nothing on local disks (strange but I
>> noticed that IOs always prefer iscsi disks to local disks).
>> No write IOs.
>>
>> Let's disconnect all iSCSI disks :
>> iscsictl -Ra
>>
>> Expected behavior :
>> IO activity flawlessly continues on local disks.
>>
>> What happened :
>> All IOs stalled, server only answers to IOs made to its zroot pool.
>> All commands related to the iSCSI disks (iscsictl), or to ZFS (zfs/zpool),
>> don't return.
>>
>> Questions :
>> Why this behavior ?
>> How to know what happens ? (/var/log/messages says almost nothing)
>>
>> I already disconnected the iSCSI disks without any issue in the past,
>> several times, but there were almost no IOs running.
>>
>> Thank you for your help !
>>
>> Ben
>
>> On 02 Oct 2017, at 22:55, Andriy Gapon wrote:
>>
>>> On 02/10/2017 22:13, Ben RUBSON wrote:
>>>
>>>> On 02 Oct 2017, at 20:45, Andriy Gapon wrote:
>>>>
>>>>> On 02/10/2017 21:17, Ben RUBSON wrote:
>>>>>
>>>>> Unfortunately the zpool command stalls / does not return :/
>>>>
>>>> Try to take procstat -kk -a.
>>>
>>> Here is the procstat output :
>>> https://benrubson.github.io/zfs/procstat01.log
>>
>> First, it seems that there are some iscsi threads stuck on a lock like:
>>     0 100291 kernel iscsimt mi_switch+0xd2 sleepq_wait+0x3a _sx_xlock_hard+0x592 iscsi_maintenance_thread+0x316 fork_exit+0x85 fork_trampoline+0xe
>>
>> or like
>>
>>  8580 102077 iscsictl - mi_switch+0xd2 sleepq_wait+0x3a _sx_slock_hard+0x325 iscsi_ioctl+0x7ea devfs_ioctl_f+0x13f kern_ioctl+0x2d4 sys_ioctl+0x171 amd64_syscall+0x4ce Xfast_syscall+0xfb
>>
>> Also, there is a thread in cam_sim_free():
>>     0 100986 kernel iscsimt mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 cam_sim_free+0x48 iscsi_session_cleanup+0x1bd iscsi_maintenance_thread+0x388 fork_exit+0x85 fork_trampoline+0xe
>>
>> So, it looks like there could be a problem in the iscsi teardown path.
>>
>> Maybe that caused a domino effect in ZFS code. I see a lot of threads waiting
>> either for spa_namespace_lock or a spa config lock (a highly specialized ZFS
>> lock). But it is hard to untangle their inter-dependencies.
>>
>> Some of the ZFS I/O threads are also affected, for example:
>>     0 101538 kernel zio_write_issue_ mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 taskqueue_run_locked+0x14a taskqueue_thread_loop+0xe8 fork_exit+0x85 fork_trampoline+0xe
>>  8716 101319 sshd - mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 zio_nowait+0x49 arc_read+0x8e4 dbuf_read+0x6c2 dmu_buf_hold_array_by_dnode+0x1d3 dmu_read_uio_dnode+0x41 dmu_read_uio_dbuf+0x3b zfs_freebsd_read+0x5fc VOP_READ_APV+0x89 vn_read+0x157 vn_io_fault1+0x1c2 vn_io_fault+0x197 dofileread+0x98
>> 71181 101141 encfs - mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 zio_nowait+0x49 arc_read+0x8e4 dbuf_read+0x6c2 dmu_buf_hold+0x3d zap_lockdir+0x43 zap_cursor_retrieve+0x171 zfs_freebsd_readdir+0x3f3 VOP_READDIR_APV+0x8f kern_getdirentries+0x21b sys_getdirentries+0x28 amd64_syscall+0x4ce Xfast_syscall+0xfb
>> 71181 101190 encfs - mi_switch+0xd2 sleepq_wait+0x3a _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236 zio_nowait+0x49 arc_read+0x8e4 dbuf_prefetch_indirect_done+0xcc arc_read+0x425 dbuf_prefetch+0x4f7 dmu_zfetch+0x418 dmu_buf_hold_array_by_dnode+0x34d dmu_read_uio_dnode+0x41 dmu_read_uio_dbuf+0x3b zfs_freebsd_read+0x5fc VOP_READ_APV+0x89 vn_read+0x157
>>
>> Note that the first of these threads executes a write zio.
>>
>> It would be nice to determine an owner of spa_namespace_lock.
>> If you have debug symbols then it can be easily done in kgdb on the live system:
>> (kgdb) p spa_namespace_lock
>
> So as said a few minutes ago I lost access to the server and had to recycle it.
> Thankfully I managed to reproduce the issue, re-playing exactly the same steps.
>
> Curious line in /var/log/messages :
> kernel: g_access(918): provider da18 has error
> (da18 is the remaining iSCSI target device which did not disconnect properly)
>
> procstat -kk -a :
> https://benrubson.github.io/zfs/procstat02.log
>
> (kgdb) p spa_namespace_lock
> $1 = -2110867066

This time with debug symbols.

procstat -kk -a :
https://benrubson.github.io/zfs/procstat03.log

(kgdb) p spa_namespace_lock
$1 = {
  lock_object = {
    lo_name = 0xffffffff822eb986 "spa_namespace_lock",
    lo_flags = 40960000,
    lo_data = 0,
    lo_witness = 0x0
  },
  sx_lock = 18446735285324580100
}

Easily reproducible.
No issue however if there is no IO load.
As soon as there is IO load, I can reproduce the issue.
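
The two kgdb values above can be decoded by hand, and the arithmetic is sketched below. It rests on assumptions that should be checked against sys/sx.h for the running kernel: that the low four bits of sx_lock are flag bits with the values listed (as I understand the FreeBSD 11 layout), and that the remaining bits hold the owning thread pointer when the lock is held exclusively. The function names decode_sx_lock and as_signed32 are hypothetical helpers.

```python
# Flag bits in the low nibble of sx_lock. These values are an assumption
# about the FreeBSD 11 layout; verify them in sys/sx.h before relying on them.
SX_LOCK_SHARED = 0x01
SX_LOCK_SHARED_WAITERS = 0x02
SX_LOCK_EXCLUSIVE_WAITERS = 0x04
SX_LOCK_RECURSED = 0x08
SX_LOCK_FLAGMASK = 0x0F

def decode_sx_lock(value):
    """Split an sx_lock word into its flag bits and the owner field."""
    flags = value & SX_LOCK_FLAGMASK
    return {
        "shared": bool(flags & SX_LOCK_SHARED),
        "shared_waiters": bool(flags & SX_LOCK_SHARED_WAITERS),
        "exclusive_waiters": bool(flags & SX_LOCK_EXCLUSIVE_WAITERS),
        "recursed": bool(flags & SX_LOCK_RECURSED),
        # Only meaningful as a thread pointer when "shared" is False.
        "owner": value & ~SX_LOCK_FLAGMASK,
    }

def as_signed32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low

if __name__ == "__main__":
    info = decode_sx_lock(18446735285324580100)
    print(hex(info["owner"]), info)
    # Low 32 bits of the lo_name pointer, read as a plain int.
    print(as_signed32(0xFFFFFFFF822EB986))
```

Under these assumptions the lock is held exclusively, with waiters queued, by the thread whose struct thread sits at 0xfffff801cb6f8500. The second call suggests why the earlier output without debug symbols was not useful: -2110867066 is exactly the low 32 bits of the lo_name pointer read as a signed int, which is consistent with kgdb treating the symbol as a plain int rather than a struct sx.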
Ben

From owner-freebsd-scsi@freebsd.org Tue Oct 3 14:40:21 2017
Subject: ZFS prefers iSCSI disks over local ones ?
From: Ben RUBSON
In-Reply-To: <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org>
Date: Tue, 3 Oct 2017 16:40:17 +0200
Cc: Andriy Gapon
Message-Id: <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com>
References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org>
To: Freebsd fs, FreeBSD-scsi, Steven Hartland

Hi,

I'm starting a new thread to avoid confusion in the main one.
(ZFS stalled after some mirror disks were lost)

> On 03 Oct 2017, at 09:39, Steven Hartland wrote:
>
>> On 03/10/2017 08:31, Ben RUBSON wrote:
>>
>>> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
>>>
>>>> On 03/10/2017 07:12, Andriy Gapon wrote:
>>>>
>>>>> On 02/10/2017 21:12, Ben RUBSON wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On a FreeBSD 11 server, the following online/healthy zpool :
>>>>>
>>>>> home
>>>>>   mirror-0
>>>>>     label/local1
>>>>>     label/local2
>>>>>     label/iscsi1
>>>>>     label/iscsi2
>>>>>   mirror-1
>>>>>     label/local3
>>>>>     label/local4
>>>>>     label/iscsi3
>>>>>     label/iscsi4
>>>>>   cache
>>>>>     label/local5
>>>>>     label/local6
>>>>>
>>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>>>>> according to "zpool iostat", nothing on local disks (strange but I
>>>>> noticed that IOs always prefer iscsi disks to local disks).
>>>>
>>>> Are your local disks SSD or HDD?
>>>> Could it be that iSCSI disks appear to be faster than the local disks
>>>> to the smart ZFS mirror code?
>>>>
>>>> Steve, what do you think?
>>>
>>> Yes that quite possible, the mirror balancing uses the queue depth +
>>> rotating bias to determine the load of the disk so if your iSCSI host
>>> is processing well and / or is reporting non-rotating vs rotating for
>>> the local disks it could well be the mirror is preferring reads from
>>> the the less loaded iSCSI devices.
>>
>> Note that local & iscsi disks are _exactly_ the same HDD (same model number,
>> same SAS adapter...). So iSCSI ones should be a little bit slower due to
>> network latency (even if it's very low in my case).
>
> The output from gstat -dp on a loaded machine would be interesting to see too.

So here is the gstat -dp :

 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da0
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da1
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da2
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da3
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da4
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da5
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da6
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da7
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da8
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da9
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da10
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da11
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da12
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da13
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da14
    1    370    370  47326    0.7      0      0    0.0      0      0    0.0   23.2| da15
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da16
    0    357    357  45698    1.4      0      0    0.0      0      0    0.0   39.3| da17
    0    348    348  44572    0.7      0      0    0.0      0      0    0.0   22.5| da18
    0    432    432  55339    0.7      0      0    0.0      0      0    0.0   27.5| da19
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da20
    0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da21

The 4 active drives are the iSCSI targets of the above quoted pool.

A local disk :

Geom name: da7
Providers:
1. Name: da7
   Mediasize: 4000787030016 (3.6T)
   Sectorsize: 512
   Mode: r0w0e0
   descr: HGSTxxx
   lunid: 5000xxx
   ident: NHGDxxx
   rotationrate: 7200
   fwsectors: 63
   fwheads: 255

A iSCSI disk :

Geom name: da19
Providers:
1. Name: da19
   Mediasize: 3999688294912 (3.6T)
   Sectorsize: 512
   Mode: r1w1e2
   descr: FREEBSD CTLDISK
   lunname: FREEBSD MYDEVID 12
   lunid: FREEBSD MYDEVID 12
   ident: iscsi4
   rotationrate: 0
   fwsectors: 63
   fwheads: 255

Sounds like then the faulty thing is the rotationrate set to 0 ?

Thx, Ben From owner-freebsd-scsi@freebsd.org Tue Oct 3 14:58:24 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 81BAAE3F69A for ; Tue, 3 Oct 2017 14:58:24 +0000 (UTC) (envelope-from steven@multiplay.co.uk) Received: from mail-wm0-x235.google.com (mail-wm0-x235.google.com [IPv6:2a00:1450:400c:c09::235]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0B8BE77C8B for ; Tue, 3 Oct 2017 14:58:24 +0000 (UTC) (envelope-from steven@multiplay.co.uk) Received: by mail-wm0-x235.google.com with SMTP id m72so11295318wmc.0 for ; Tue, 03 Oct 2017 07:58:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=multiplay-co-uk.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language; bh=2l8y+WmCvuvkO6MuO4FpU34eu/iNFPJi0uNP5KiYwPs=; b=Xbec0zqdBk8hIoxaX/K3OTrJt3fiZMYfjNxXQHWbGSizY1wUTrFjnuy7gOTvTUiKl3 kKO0Hk0EOIkSRt/Ee45+Y7S5UM+0PVLnzoFJGz2CtnHddtu60oZlF4UiN12T+pDN53bI mF2bIf/AehZyQc25DYA+o1ng6YKRPQxrnKKdQjiFrIr5JTa5ZG7HJJsejOMBjSdjNRGu f4p20oFUc6vwc8ngYb79f2Gli5KavSJtRsrDuE6SVdzPacb/SYkrsa5Gn0bS48WWblOY NpzhCfoV/3WsBM7WahmTCfGRIij4obr7vIrwzHMBZmvC5VWlhAEvluss/1ed16gRlfwm hzPg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language; bh=2l8y+WmCvuvkO6MuO4FpU34eu/iNFPJi0uNP5KiYwPs=; b=PuRw2X/yg8vriPxkTf+z/16PObeRMi5GOM0kG7tSAK2616p9PODQ2uD+TWDxIawvmD Xy/bYQaygdD1iKs07bJ5+Oj/cTWX0KEMXMWammalYAz28Kyj6feGpKRj5+5yJcK48yWf DSQgyoePVi/jfCTiSgF/WOu4Z54F5ux8L6ZXdv2biCC1Bs8845XJvvw+BYa2P/mY0aoy 
f3Y3YFpWjmrK/GtwgwA+21uUgIA76Im4DbBAVcjs4of8r/0l1hIIPZlWHj3rORrJklEB ntFg0mgRIQCp0WBnlsAevTbO7MzHLDveXc4WBjRvHA4iA2FHgrvYTV5Vaz0mvCV8XQpC tzlA== X-Gm-Message-State: AHPjjUj36g77lvYGa3srGOhrQijo5+CwgDvA2nvoDI94AETYGg9X3sTw Q3sgv0nPo7Ogolv8zzaxzCHhGX/4dzw= X-Google-Smtp-Source: AOwi7QCGgjl4OXFqHcAb50LuUNJlUhSTB8G8mhp+HEmCsw/lRYlkoblbF5eeQlZDK6ImLfdyUDSmfg== X-Received: by 10.28.109.77 with SMTP id i74mr14805242wmc.67.1507042701050; Tue, 03 Oct 2017 07:58:21 -0700 (PDT) Received: from [10.10.1.111] ([185.97.61.1]) by smtp.gmail.com with ESMTPSA id m138sm9043048wmd.29.2017.10.03.07.58.19 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 03 Oct 2017 07:58:19 -0700 (PDT) Subject: Re: ZFS prefers iSCSI disks over local ones ? To: Ben RUBSON , Freebsd fs , FreeBSD-scsi Cc: Andriy Gapon References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> From: Steven Hartland Message-ID: Date: Tue, 3 Oct 2017 15:58:22 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> Content-Language: en-US Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 14:58:24 -0000 On 03/10/2017 15:40, Ben RUBSON wrote: > Hi, > > I start a new thread to avoid confusion in the main one. 
> (ZFS stalled after some mirror disks were lost)
>
>> On 03 Oct 2017, at 09:39, Steven Hartland wrote:
>>
>>> On 03/10/2017 08:31, Ben RUBSON wrote:
>>>
>>>> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
>>>>
>>>>> On 03/10/2017 07:12, Andriy Gapon wrote:
>>>>>
>>>>>> On 02/10/2017 21:12, Ben RUBSON wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On a FreeBSD 11 server, the following online/healthy zpool :
>>>>>>
>>>>>> home
>>>>>>   mirror-0
>>>>>>     label/local1
>>>>>>     label/local2
>>>>>>     label/iscsi1
>>>>>>     label/iscsi2
>>>>>>   mirror-1
>>>>>>     label/local3
>>>>>>     label/local4
>>>>>>     label/iscsi3
>>>>>>     label/iscsi4
>>>>>>   cache
>>>>>>     label/local5
>>>>>>     label/local6
>>>>>>
>>>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>>>>>> according to "zpool iostat", nothing on local disks (strange but I
>>>>>> noticed that IOs always prefer iscsi disks to local disks).
>>>>> Are your local disks SSD or HDD?
>>>>> Could it be that iSCSI disks appear to be faster than the local disks
>>>>> to the smart ZFS mirror code?
>>>>>
>>>>> Steve, what do you think?
>>>> Yes that quite possible, the mirror balancing uses the queue depth +
>>>> rotating bias to determine the load of the disk so if your iSCSI host
>>>> is processing well and / or is reporting non-rotating vs rotating for
>>>> the local disks it could well be the mirror is preferring reads from
>>>> the the less loaded iSCSI devices.
>>> Note that local & iscsi disks are _exactly_ the same HDD (same model number,
>>> same SAS adapter...). So iSCSI ones should be a little bit slower due to
>>> network latency (even if it's very low in my case).
>> The output from gstat -dp on a loaded machine would be interesting to see too.
> So here is the gstat -dp :
>
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da0
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da4
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da5
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da6
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da7
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da8
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da9
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da10
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da11
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da12
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da13
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da14
>     1    370    370  47326    0.7      0      0    0.0      0      0    0.0   23.2| da15
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da16
>     0    357    357  45698    1.4      0      0    0.0      0      0    0.0   39.3| da17
>     0    348    348  44572    0.7      0      0    0.0      0      0    0.0   22.5| da18
>     0    432    432  55339    0.7      0      0    0.0      0      0    0.0   27.5| da19
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da20
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da21
>
> The 4 active drives are the iSCSI targets of the above quoted pool.
>
> A local disk :
>
> Geom name: da7
> Providers:
> 1. Name: da7
>    Mediasize: 4000787030016 (3.6T)
>    Sectorsize: 512
>    Mode: r0w0e0
>    descr: HGSTxxx
>    lunid: 5000xxx
>    ident: NHGDxxx
>    rotationrate: 7200
>    fwsectors: 63
>    fwheads: 255
>
> A iSCSI disk :
>
> Geom name: da19
> Providers:
> 1. Name: da19
>    Mediasize: 3999688294912 (3.6T)
>    Sectorsize: 512
>    Mode: r1w1e2
>    descr: FREEBSD CTLDISK
>    lunname: FREEBSD MYDEVID 12
>    lunid: FREEBSD MYDEVID 12
>    ident: iscsi4
>    rotationrate: 0
>    fwsectors: 63
>    fwheads: 255
>
> Sounds like then the faulty thing is the rotationrate set to 0 ?
>

Absolutely and from the looks you're not stressing the iSCSI disks so they get high queuing depths hence the preference.

As load increased I would expect the local disks to start seeing activity.

Regards     Steve From owner-freebsd-scsi@freebsd.org Tue Oct 3 15:03:22 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 243E8E3F999; Tue, 3 Oct 2017 15:03:22 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: from mail-wm0-x233.google.com (mail-wm0-x233.google.com [IPv6:2a00:1450:400c:c09::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A98907C4AB; Tue, 3 Oct 2017 15:03:21 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: by mail-wm0-x233.google.com with SMTP id i82so15205575wmd.3; Tue, 03 Oct 2017 08:03:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=hCrCM1uH3lHhMowCEUEfzBE9AYQxWOTzX0bceRvaYzA=; b=TKoTJTM/ji+z9gD0WsHrPQ7DA4hc4HUxAVpQuQVXKVUELu5R9BQnAvaKN2eVK52cSb yRr858lJ5muzWHDWEIqrd47JQ4lOUNpbTEqGhNqlbhItyxKL0Yf/F/AbvUMorNWylIF4 fTM+lxh0BKsYQICw89FSw/6tnlhDUGOBWvtApvPL0k2ogN3MfY9LxlfDGeXnEyzXbOEP 092H6ZwCIjDqrycWOTqtjyrZFHHe3SACeDI+AKl033dVuZGqBKPPyDzwoPBX6mCcb6if cSQCaV/SLF2qvFx8FBZEIy9Pz4IkGrcJ6RZkFTQVZohRXY+Kvj00JlZpMwj4h1qth2nA 3F3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=hCrCM1uH3lHhMowCEUEfzBE9AYQxWOTzX0bceRvaYzA=; b=bnoDmMOtLGd7AY5ANfqmGWDBpTnkJMUwnvAjuhyUV0NE2UfyWP4D304Q2UjIQ+M9sa vkeqhPAIB8xwb9Az494FprBpN+ffoi997Rii7IkUfXwgIg0u3f3hqPtuyerYdsTRvr24 Duqcj9/tXZ5cynUJSPA1k7wQx3EZHqXWlelAu7uaw5uceGp3Hg2z2c8ICqD5Rpl+RtOQ H7pB+QLgZ7eWX5skdLiEU0xwAO8+OfFzLRP63u8zkhaUjoAx4lDxYmcjDHn96KlMU/aT 
h+m6US2r/d2fRklFzEGKeqic5uEfhh9i7VTNxSr/pOeN4hO65gQ3F36MxsSyypNgK47D FY8Q== X-Gm-Message-State: AMCzsaXX+z98ElmHrxF7WdZZjvrBTKrWLN2KuoIcuVZ486k3a6hj4gSy sgobxpR5WoP0LkgBsFquJFuRcmWv X-Google-Smtp-Source: AOwi7QC4rO1x6Z979fymPYKCC/QWMczBWFNllPnUrHDkiirjL7UqkllZobQZ8VRMTtgA8hKwMzrDDA== X-Received: by 10.28.232.138 with SMTP id f10mr2683080wmi.130.1507043000017; Tue, 03 Oct 2017 08:03:20 -0700 (PDT) Received: from bens-mac.home (LFbn-MAR-1-445-220.w2-15.abo.wanadoo.fr. [2.15.38.220]) by smtp.gmail.com with ESMTPSA id p78sm23655244wma.11.2017.10.03.08.03.19 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 03 Oct 2017 08:03:19 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: ZFS prefers iSCSI disks over local ones ? From: Ben RUBSON In-Reply-To: Date: Tue, 3 Oct 2017 17:03:18 +0200 Cc: Andriy Gapon Content-Transfer-Encoding: quoted-printable Message-Id: References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> To: Steven Hartland , FreeBSD-scsi , Freebsd fs X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 15:03:22 -0000 > On 03 Oct 2017, at 16:58, Steven Hartland = wrote: >=20 > On 03/10/2017 15:40, Ben RUBSON wrote: >> Hi, >>=20 >> I start a new thread to avoid confusion in the main one. 
>> (ZFS stalled after some mirror disks were lost)
>>
>>> On 03 Oct 2017, at 09:39, Steven Hartland wrote:
>>>
>>>> On 03/10/2017 08:31, Ben RUBSON wrote:
>>>>
>>>>> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
>>>>>
>>>>>> On 03/10/2017 07:12, Andriy Gapon wrote:
>>>>>>
>>>>>>> On 02/10/2017 21:12, Ben RUBSON wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On a FreeBSD 11 server, the following online/healthy zpool :
>>>>>>>
>>>>>>> home
>>>>>>>   mirror-0
>>>>>>>     label/local1
>>>>>>>     label/local2
>>>>>>>     label/iscsi1
>>>>>>>     label/iscsi2
>>>>>>>   mirror-1
>>>>>>>     label/local3
>>>>>>>     label/local4
>>>>>>>     label/iscsi3
>>>>>>>     label/iscsi4
>>>>>>>   cache
>>>>>>>     label/local5
>>>>>>>     label/local6
>>>>>>>
>>>>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>>>>>>> according to "zpool iostat", nothing on local disks (strange but I
>>>>>>> noticed that IOs always prefer iscsi disks to local disks).
>>>>>>>
>>>>>> Are your local disks SSD or HDD?
>>>>>> Could it be that iSCSI disks appear to be faster than the local disks
>>>>>> to the smart ZFS mirror code?
>>>>>>
>>>>>> Steve, what do you think?
>>>>>>
>>>>> Yes that quite possible, the mirror balancing uses the queue depth +
>>>>> rotating bias to determine the load of the disk so if your iSCSI host
>>>>> is processing well and / or is reporting non-rotating vs rotating for
>>>>> the local disks it could well be the mirror is preferring reads from
>>>>> the the less loaded iSCSI devices.
>>>>>
>>>> Note that local & iscsi disks are _exactly_ the same HDD (same model number,
>>>> same SAS adapter...). So iSCSI ones should be a little bit slower due to
>>>> network latency (even if it's very low in my case).
>>>>
>>> The output from gstat -dp on a loaded machine would be interesting to see too.
>>>
>> So here is the gstat -dp :
>>
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da0
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da1
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da2
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da3
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da4
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da5
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da6
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da7
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da8
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da9
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da10
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da11
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da12
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da13
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da14
>>     1    370    370  47326    0.7      0      0    0.0      0      0    0.0   23.2| da15
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da16
>>     0    357    357  45698    1.4      0      0    0.0      0      0    0.0   39.3| da17
>>     0    348    348  44572    0.7      0      0    0.0      0      0    0.0   22.5| da18
>>     0    432    432  55339    0.7      0      0    0.0      0      0    0.0   27.5| da19
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da20
>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da21
>>
>> The 4 active drives are the iSCSI targets of the above quoted pool.
>>
>> A local disk :
>>
>> Geom name: da7
>> Providers:
>> 1. Name: da7
>>    Mediasize: 4000787030016 (3.6T)
>>    Sectorsize: 512
>>    Mode: r0w0e0
>>    descr: HGSTxxx
>>    lunid: 5000xxx
>>    ident: NHGDxxx
>>    rotationrate: 7200
>>    fwsectors: 63
>>    fwheads: 255
>>
>> A iSCSI disk :
>>
>> Geom name: da19
>> Providers:
>> 1. Name: da19
>>    Mediasize: 3999688294912 (3.6T)
>>    Sectorsize: 512
>>    Mode: r1w1e2
>>    descr: FREEBSD CTLDISK
>>    lunname: FREEBSD MYDEVID 12
>>    lunid: FREEBSD MYDEVID 12
>>    ident: iscsi4
>>    rotationrate: 0
>>    fwsectors: 63
>>    fwheads: 255
>>
>> Sounds like then the faulty thing is the rotationrate set to 0 ?
>
> Absolutely

Good catch then, thank you !

> and from the looks you're not stressing the iSCSI disks so they get high queuing depths hence the preference.
> As load increased I would expect the local disks to start seeing activity.

Yes this is also what I see. Any way however to set rotationrate to 7200 (or to a slightly greater = value) as well for iSCSI drives ? I looked through ctl.conf(5) and iscsi.conf(5) but did not found = anything related. Many thanks ! Ben From owner-freebsd-scsi@freebsd.org Tue Oct 3 15:07:37 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 176EFE3FB48; Tue, 3 Oct 2017 15:07:37 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: from mail-wm0-x22e.google.com (mail-wm0-x22e.google.com [IPv6:2a00:1450:400c:c09::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9B2BD7C6DF; Tue, 3 Oct 2017 15:07:36 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: by mail-wm0-x22e.google.com with SMTP id b189so13483935wmd.4; Tue, 03 Oct 2017 08:07:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=H+N/UGMzAC7gjaMlOP9gGicMWAUPQPwxkCg1OWuqB8Y=; b=bKMP5ilKNmkLzj+iTrpm26KJ6hFJNbLFB4Hd9EW2PMgJOD5CO2hqT+EWZzH+ZMOWu8 G2PvlDden0jHlYIwWW1w8WyjCaex5btTaZKvx/ib22VTwAyQJgtHqOt6+D0ruRKIKM8S SiCN+PAsYcZWST1TOaz1QduJmnaIPY5O337DQ+lkyLrojxoF7idMi9MgZ/wisRz/hfvj MvHhB8ObC5wbdkS/h1Wr3n1jRuCd6rDnYi7bBiuPKOPzp3jf3KKQWWPdsj2NwIxbhXJ0 W0ojeuzqNxw67E15s0l0r2UxVD4j0C8ONazSnOXm5SfIu374WZimAF8pX8McWfKf8atp 4wkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=H+N/UGMzAC7gjaMlOP9gGicMWAUPQPwxkCg1OWuqB8Y=; b=oUyNu9ZWoSIfNqJDmssjOcwHCJYfD21C8a0SSW1yRPsUN01k4J+Ju75EuTo6cuJPLZ 
XxTRMbeu0vwi6tvz89Rjb34NoFsbkKdYIkGRhws6ju8i6SuWMp6aDcFeVpCLJdoD2wsl ykeycw3btO0dGDXuJ2AiDZ6AFaLG0CrBVCTkojlqPRjIL8Ognxmhy+292eqOaWVJs3BI zOaWv3bbMTWnUXTHxHHX07hhkFyOqgs0RqxS2DqU/qI6QzWbC0LNXnoLko7ecRn8Bo0y kl5iN+J7+LxxsxznSkQTx/HnQ1lhMMpfPmyyceW86F4AsBA3a6c3kbveMXzgilh0X29v uF8Q== X-Gm-Message-State: AMCzsaWMpAq9M/c5GrYzdyJ9pZ+fHHthOa+iECHgf/Ue0TT8DuOxFi85 BQbj4YnomTzHpI0hd1g/pNVp9k2w X-Google-Smtp-Source: AOwi7QC8FzAIEnk5HznczseaUCUU/NeoDBVCSmWkcb4pqLXDyP3LvFdzB0fzvRp//rTyQWEZMGR5TA== X-Received: by 10.28.209.2 with SMTP id i2mr4235886wmg.153.1507043254956; Tue, 03 Oct 2017 08:07:34 -0700 (PDT) Received: from bens-mac.home (LFbn-MAR-1-445-220.w2-15.abo.wanadoo.fr. [2.15.38.220]) by smtp.gmail.com with ESMTPSA id m8sm2724283wrg.55.2017.10.03.08.07.34 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 03 Oct 2017 08:07:34 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: ZFS prefers iSCSI disks over local ones ? From: Ben RUBSON In-Reply-To: Date: Tue, 3 Oct 2017 17:07:33 +0200 Cc: Andriy Gapon Content-Transfer-Encoding: quoted-printable Message-Id: <49ADB654-E68B-4B88-AE8E-49F755092848@gmail.com> References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> To: Steven Hartland , FreeBSD-scsi , Freebsd fs X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 15:07:37 -0000 > On 03 Oct 2017, at 17:03, Ben RUBSON wrote: >=20 >> On 03 Oct 2017, at 16:58, Steven Hartland = wrote: >>=20 >> On 03/10/2017 15:40, Ben RUBSON wrote: >>> Hi, >>>=20 >>> I start a new thread to avoid confusion in the main one. 
>>> (ZFS stalled after some mirror disks were lost)
>>>
>>>> On 03 Oct 2017, at 09:39, Steven Hartland wrote:
>>>>
>>>>> On 03/10/2017 08:31, Ben RUBSON wrote:
>>>>>
>>>>>> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
>>>>>>
>>>>>>> On 03/10/2017 07:12, Andriy Gapon wrote:
>>>>>>>
>>>>>>>> On 02/10/2017 21:12, Ben RUBSON wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On a FreeBSD 11 server, the following online/healthy zpool :
>>>>>>>>
>>>>>>>> home
>>>>>>>>   mirror-0
>>>>>>>>     label/local1
>>>>>>>>     label/local2
>>>>>>>>     label/iscsi1
>>>>>>>>     label/iscsi2
>>>>>>>>   mirror-1
>>>>>>>>     label/local3
>>>>>>>>     label/local4
>>>>>>>>     label/iscsi3
>>>>>>>>     label/iscsi4
>>>>>>>>   cache
>>>>>>>>     label/local5
>>>>>>>>     label/local6
>>>>>>>>
>>>>>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>>>>>>>> according to "zpool iostat", nothing on local disks (strange but I
>>>>>>>> noticed that IOs always prefer iscsi disks to local disks).
>>>>>>>>
>>>>>>> Are your local disks SSD or HDD?
>>>>>>> Could it be that iSCSI disks appear to be faster than the local disks
>>>>>>> to the smart ZFS mirror code?
>>>>>>>
>>>>>>> Steve, what do you think?
>>>>>>>
>>>>>> Yes that quite possible, the mirror balancing uses the queue depth +
>>>>>> rotating bias to determine the load of the disk so if your iSCSI host
>>>>>> is processing well and / or is reporting non-rotating vs rotating for
>>>>>> the local disks it could well be the mirror is preferring reads from
>>>>>> the the less loaded iSCSI devices.
>>>>>>
>>>>> Note that local & iscsi disks are _exactly_ the same HDD (same model number,
>>>>> same SAS adapter...). So iSCSI ones should be a little bit slower due to
>>>>> network latency (even if it's very low in my case).
>>>>>
>>>> The output from gstat -dp on a loaded machine would be interesting to see too.
>>>>
>>> So here is the gstat -dp :
>>>
>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da0
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da1
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da2
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da3
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da4
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da5
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da6
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da7
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da8
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da9
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da10
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da11
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da12
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da13
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da14
>>>     1    370    370  47326    0.7      0      0    0.0      0      0    0.0   23.2| da15
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da16
>>>     0    357    357  45698    1.4      0      0    0.0      0      0    0.0   39.3| da17
>>>     0    348    348  44572    0.7      0      0    0.0      0      0    0.0   22.5| da18
>>>     0    432    432  55339    0.7      0      0    0.0      0      0    0.0   27.5| da19
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da20
>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da21
>>>
>>> The 4 active drives are the iSCSI targets of the above quoted pool.
>>>
>>> A local disk :
>>>
>>> Geom name: da7
>>> Providers:
>>> 1. Name: da7
>>>    Mediasize: 4000787030016 (3.6T)
>>>    Sectorsize: 512
>>>    Mode: r0w0e0
>>>    descr: HGSTxxx
>>>    lunid: 5000xxx
>>>    ident: NHGDxxx
>>>    rotationrate: 7200
>>>    fwsectors: 63
>>>    fwheads: 255
>>>
>>> A iSCSI disk :
>>>
>>> Geom name: da19
>>> Providers:
>>> 1. Name: da19
>>>    Mediasize: 3999688294912 (3.6T)
>>>    Sectorsize: 512
>>>    Mode: r1w1e2
>>>    descr: FREEBSD CTLDISK
>>>    lunname: FREEBSD MYDEVID 12
>>>    lunid: FREEBSD MYDEVID 12
>>>    ident: iscsi4
>>>    rotationrate: 0
>>>    fwsectors: 63
>>>    fwheads: 255
>>>
>>> Sounds like then the faulty thing is the rotationrate set to 0 ?
>>
>> Absolutely
>
> Good catch then, thank you !
>
>> and from the looks you're not stressing the iSCSI disks so they get high queuing depths hence the preference.
>> As load increased I would expect the local disks to start seeing = activity. >=20 > Yes this is also what I see. >=20 > Any way however to set rotationrate to 7200 (or to a slightly greater = value (*)) as well for iSCSI drives ? > I looked through ctl.conf(5) and iscsi.conf(5) but did not found = anything related. Sorry, (*) or to a slightly lower value (of course...). I forgot to mention that as the initiator, target is a FreeBSD 11.0 = server. Ben From owner-freebsd-scsi@freebsd.org Tue Oct 3 15:18:53 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9F172E3FEB4; Tue, 3 Oct 2017 15:18:53 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from mail.in-addr.com (mail.in-addr.com [IPv6:2a01:4f8:191:61e8::2525:2525]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 68DA97CC16; Tue, 3 Oct 2017 15:18:53 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from gjp by mail.in-addr.com with local (Exim 4.89 (FreeBSD)) (envelope-from ) id 1dzOyB-0003Rn-1F; Tue, 03 Oct 2017 16:18:51 +0100 Date: Tue, 3 Oct 2017 16:18:50 +0100 From: Gary Palmer To: Ben RUBSON Cc: Steven Hartland , FreeBSD-scsi , Freebsd fs , Andriy Gapon Subject: Re: ZFS prefers iSCSI disks over local ones ? 
Message-ID: <20171003151850.GA65538@in-addr.com> References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: gpalmer@freebsd.org X-SA-Exim-Scanned: No (on mail.in-addr.com); SAEximRunCond expanded to false X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 15:18:53 -0000 On Tue, Oct 03, 2017 at 05:03:18PM +0200, Ben RUBSON wrote: > > On 03 Oct 2017, at 16:58, Steven Hartland wrote: > > > > On 03/10/2017 15:40, Ben RUBSON wrote: > >> Hi, > >> > >> I start a new thread to avoid confusion in the main one. > >> (ZFS stalled after some mirror disks were lost) > >> > >> > >>> On 03 Oct 2017, at 09:39, Steven Hartland wrote: > >>> > >>> > >>>> On 03/10/2017 08:31, Ben RUBSON wrote: > >>>> > >>>> > >>>>> On 03 Oct 2017, at 09:25, Steven Hartland wrote: > >>>>> > >>>>> > >>>>>> On 03/10/2017 07:12, Andriy Gapon wrote: > >>>>>> > >>>>>> > >>>>>>> On 02/10/2017 21:12, Ben RUBSON wrote: > >>>>>>> > >>>>>>> Hi, > >>>>>>> > >>>>>>> On a FreeBSD 11 server, the following online/healthy zpool : > >>>>>>> > >>>>>>> home > >>>>>>> mirror-0 > >>>>>>> label/local1 > >>>>>>> label/local2 > >>>>>>> label/iscsi1 > >>>>>>> label/iscsi2 > >>>>>>> mirror-1 > >>>>>>> label/local3 > >>>>>>> label/local4 > >>>>>>> label/iscsi3 > >>>>>>> label/iscsi4 > >>>>>>> cache > >>>>>>> label/local5 > >>>>>>> label/local6 > >>>>>>> > >>>>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk > >>>>>>> according to "zpool iostat", nothing on local disks (strange but I > >>>>>>> noticed that IOs always prefer iscsi disks to local disks). 
> >>>>>>>
> >>>>>> Are your local disks SSD or HDD?
> >>>>>> Could it be that iSCSI disks appear to be faster than the local disks
> >>>>>> to the smart ZFS mirror code?
> >>>>>>
> >>>>>> Steve, what do you think?
> >>>>>>
> >>>>> Yes, that's quite possible: the mirror balancing uses the queue depth +
> >>>>> rotating bias to determine the load of the disk, so if your iSCSI host
> >>>>> is processing well and / or is reporting non-rotating vs rotating for
> >>>>> the local disks, it could well be that the mirror is preferring reads
> >>>>> from the less loaded iSCSI devices.
> >>>>>
> >>>> Note that local & iscsi disks are _exactly_ the same HDD (same model number,
> >>>> same SAS adapter...). So iSCSI ones should be a little bit slower due to
> >>>> network latency (even if it's very low in my case).
> >>>>
> >>> The output from gstat -dp on a loaded machine would be interesting to see too.
> >>>
> >> So here is the gstat -dp :
> >>
> >>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da0
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da1
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da2
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da3
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da4
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da5
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da6
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da7
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da8
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da9
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da10
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da11
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da12
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da13
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da14
> >>     1    370    370  47326    0.7      0      0    0.0      0      0    0.0   23.2| da15
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da16
> >>     0    357    357  45698    1.4      0      0    0.0      0      0    0.0   39.3| da17
> >>     0    348    348  44572    0.7      0      0    0.0      0      0    0.0   22.5| da18
> >>     0    432    432  55339    0.7      0      0    0.0      0      0    0.0   27.5| da19
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da20
> >>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da21
> >>
> >> The 4 active drives are the iSCSI targets of the above quoted pool.
> >>
> >> A local disk :
> >>
> >> Geom name: da7
> >> Providers:
> >> 1. Name: da7
> >>    Mediasize: 4000787030016 (3.6T)
> >>    Sectorsize: 512
> >>    Mode: r0w0e0
> >>    descr: HGSTxxx
> >>    lunid: 5000xxx
> >>    ident: NHGDxxx
> >>    rotationrate: 7200
> >>    fwsectors: 63
> >>    fwheads: 255
> >>
> >> An iSCSI disk :
> >>
> >> Geom name: da19
> >> Providers:
> >> 1. Name: da19
> >>    Mediasize: 3999688294912 (3.6T)
> >>    Sectorsize: 512
> >>    Mode: r1w1e2
> >>    descr: FREEBSD CTLDISK
> >>    lunname: FREEBSD MYDEVID 12
> >>    lunid: FREEBSD MYDEVID 12
> >>    ident: iscsi4
> >>    rotationrate: 0
> >>    fwsectors: 63
> >>    fwheads: 255
> >>
> >> Sounds like then the faulty thing is the rotationrate set to 0 ?
> >
> > Absolutely
>
> Good catch then, thank you !
>
> > and from the looks you're not stressing the iSCSI disks enough for them
> > to reach high queue depths, hence the preference.
> > As load increases I would expect the local disks to start seeing activity.
>
> Yes this is also what I see.
>
> Any way however to set rotationrate to 7200 (or to a slightly greater value) as well for iSCSI drives ?
> I looked through ctl.conf(5) and iscsi.conf(5) but did not find anything related.
>
> Many thanks !

Use the "option" setting in ctl.conf to change the rpm value (documented
in the OPTIONS section of ctladm(8)).
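
To make the balancing argument quoted above concrete, here is a minimal Python sketch of a "queue depth + rotating bias" load score. It is only a toy model under stated assumptions: the function names and the two penalty constants are hypothetical and are not taken from the ZFS source or from this thread; the only facts borrowed from the thread are that the local disks report rotationrate 7200 and the iSCSI disks report 0.

```python
# Toy model of mirror read balancing as described in the thread:
# load = pending queue depth + a bias that depends on rotating vs non-rotating.
# The penalty values below are illustrative assumptions, not ZFS tunables.
ROTATING_SEEK_PENALTY = 5      # assumed extra cost of a seek on a spinning disk
NON_ROTATING_SEEK_PENALTY = 1  # assumed extra cost on a non-rotating device


def child_load(queue_depth, rotationrate, sequential):
    """Return the load score of one mirror child (lower is preferred)."""
    if sequential:
        # No seek involved, so only the queue depth counts in this model.
        return queue_depth
    if rotationrate == 0:
        # A displayed rotationrate of 0 is treated as non-rotating here,
        # which is what the thread concludes for the CTL-backed disks.
        return queue_depth + NON_ROTATING_SEEK_PENALTY
    return queue_depth + ROTATING_SEEK_PENALTY


def pick_child(children, sequential=False):
    """children: list of (name, queue_depth, rotationrate) tuples."""
    return min(children, key=lambda c: child_load(c[1], c[2], sequential))[0]


# Lightly loaded iSCSI disk versus an idle local HDD.
print(pick_child([("label/local1", 0, 7200), ("label/iscsi1", 1, 0)]))
# Same disks once the iSCSI queue has grown.
print(pick_child([("label/local1", 0, 7200), ("label/iscsi1", 5, 0)]))
```

In this model the idle local disk still loses to a lightly loaded iSCSI disk because of the rotating penalty, and only wins once the iSCSI queue grows, which matches the behaviour reported in the thread.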
Regards, Gary From owner-freebsd-scsi@freebsd.org Tue Oct 3 15:33:28 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 14D8CE4030C for ; Tue, 3 Oct 2017 15:33:28 +0000 (UTC) (envelope-from fbsd-lists@dudes.ch) Received: from mail.dudes.ch (mail.dudes.ch [193.73.211.25]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "*.dudes.ch", Issuer "StartCom Class 3 OV Server CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A144B7D2F9 for ; Tue, 3 Oct 2017 15:33:26 +0000 (UTC) (envelope-from fbsd-lists@dudes.ch) Received: from mwoffice.virtualtec.office (pippin.virtualtec.ch [93.189.66.120]) (authenticated bits=0) by mail.dudes.ch (8.15.2/8.15.2) with ESMTPSA id v93FSvpK096593 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Tue, 3 Oct 2017 17:28:57 +0200 (CEST) (envelope-from fbsd-lists@dudes.ch) X-Authentication-Warning: mail.dudes.ch: Host pippin.virtualtec.ch [93.189.66.120] claimed to be mwoffice.virtualtec.office Date: Tue, 3 Oct 2017 17:28:57 +0200 From: Markus Wild To: freebsd-scsi@freebsd.org Subject: Re: ZFS prefers iSCSI disks over local ones ? 
Message-ID: <20171003172857.2497b931@mwoffice.virtualtec.office> In-Reply-To: <49ADB654-E68B-4B88-AE8E-49F755092848@gmail.com> References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> <49ADB654-E68B-4B88-AE8E-49F755092848@gmail.com> X-Mailer: Claws Mail 3.15.1 (GTK+ 2.24.31; amd64-portbld-freebsd11.0) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.78 on 193.73.211.25 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 15:33:28 -0000

> > Any way however to set rotationrate to 7200 (or to a slightly greater value (*)) as well for iSCSI drives ?
> > I looked through ctl.conf(5) and iscsi.conf(5) but did not find anything related.
>
> Sorry, (*) or to a slightly lower value (of course...).
> I forgot to mention that, like the initiator, the target is a FreeBSD 11.0 server.

We use this in our ctl.conf to ensure vmware doesn't consider the iscsi volumes to be ssd drives:

[...]
lun 1 { path /dev/zvol/data/volumes/zvol1 ; option rpm 10000 }
[...]
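
One way to check what the initiator actually sees is to look at the rotationrate field in "geom disk list" output, as in the listing quoted earlier in the thread. The following Python sketch parses such output; the embedded sample is abbreviated from the thread, and the function name is hypothetical. Whether the initiator reports the configured value after "option rpm" is set on the target is an expectation, not something verified in this thread.

```python
import re

# Abbreviated sample of "geom disk list" output, taken from the thread.
SAMPLE = """\
Geom name: da7
Providers:
1. Name: da7
   Mediasize: 4000787030016 (3.6T)
   descr: HGSTxxx
   rotationrate: 7200

Geom name: da19
Providers:
1. Name: da19
   Mediasize: 3999688294912 (3.6T)
   descr: FREEBSD CTLDISK
   rotationrate: 0
"""


def rotation_rates(geom_output):
    """Map each geom name to the rotationrate string reported for it."""
    rates = {}
    current = None
    for raw in geom_output.splitlines():
        line = raw.strip()
        match = re.match(r"Geom name:\s*(\S+)", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"rotationrate:\s*(\S+)", line)
        if match and current is not None:
            rates[current] = match.group(1)
    return rates


rates = rotation_rates(SAMPLE)
for name, rate in rates.items():
    note = " (reported as non-rotating)" if rate == "0" else ""
    print(f"{name}: rotationrate {rate}{note}")
```

On a real initiator the sample string would be replaced by the captured output of "geom disk list".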
Cheers, Markus From owner-freebsd-scsi@freebsd.org Tue Oct 3 15:37:35 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1BB29E407E9; Tue, 3 Oct 2017 15:37:35 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: from mail-wr0-x233.google.com (mail-wr0-x233.google.com [IPv6:2a00:1450:400c:c0c::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8FF477E0A4; Tue, 3 Oct 2017 15:37:34 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: by mail-wr0-x233.google.com with SMTP id r79so1290727wrb.13; Tue, 03 Oct 2017 08:37:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=kf3mYkpEpeo5BfnWkpMJiMD5eg9W97WYhG8Brs50U0w=; b=aBsoKuGdj9e1wmVyB32IGlKFEnm4hwtklujvk8EiOeOcjb0utxKwoODnjutnFTBFG4 8rOCQwfH3KCrB8lMMTiY7x/TPKxZzSUSbCjpxacy4/3pD4Nq6PzVkfEAY3Vx47yUPLW+ sq2iZ6Vi/+3l8hEsZwzSZTzxuiawHzodOrpEc+NAeQV0hEQlrOFV9MEr8bk8Mcqoxc26 WgsoGtlqOPjI+ddk0M3ax1xslTZz8KsnCj2f2srBTpIqM13Dkb6eDM/mTFmBpKKQaNnw SEnVpOMiqpSeumLmdDwe+W5jv0Vk23VDC7RadkcQQsgCMWD+NT2/iqOPZaYCo0avt8Jn 3J1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=kf3mYkpEpeo5BfnWkpMJiMD5eg9W97WYhG8Brs50U0w=; b=cEaeo853jRwsEA1KJyFBawp5xUI31KrQVC0vC5jMVIDjHlgGQgOT2RtoMxQ2sLHygI 7YQd23Jcdjjv7ESjidAI6eVyjVNd/yEHl/s/0pdHa3oRcGKf3O1+re7tey95DHDXrCUI 6B89p3XWgv2HSRBoc6D7vDliC0oHCILg8zK4Hx9hJRjcE5zG3o61rw3blwn6E/FgAL0e gxdP4FYe6O7p0wx6GHeiZn/17v61inS7ObPjqHb0bBoFv0aI8nyI+dibppPoEa21TnDM 
UNmD471TOC05C2QWJu1f4p/9k1P8R5e9JI3LPnxHFKka2cbr3OH4QHcS0xFZf5GYWw63 u9aQ== X-Gm-Message-State: AHPjjUgRmHQGKi+5gYOFLm3Mjq3zOcXWTcABfWnQQI2ar997ikXW7LUt f5Bu8KYnnA5UMLQvceBHzcA= X-Google-Smtp-Source: AOwi7QCBkPbOPxQSsNRGVn8jb0rk/dd/6RhGemrHMigk48Ya5jJ+W+kJP+GAHJo2Y7UjsrIlUGIiag== X-Received: by 10.223.155.203 with SMTP id e11mr13482670wrc.218.1507045053090; Tue, 03 Oct 2017 08:37:33 -0700 (PDT) Received: from bens-mac.home (LFbn-MAR-1-445-220.w2-15.abo.wanadoo.fr. [2.15.38.220]) by smtp.gmail.com with ESMTPSA id n57sm19561773wrn.29.2017.10.03.08.37.32 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 03 Oct 2017 08:37:32 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: ZFS prefers iSCSI disks over local ones ? From: Ben RUBSON In-Reply-To: <20171003172857.2497b931@mwoffice.virtualtec.office> Date: Tue, 3 Oct 2017 17:37:30 +0200 Cc: Andriy Gapon , Steven Hartland Content-Transfer-Encoding: quoted-printable Message-Id: <919C4A38-5192-4AED-BC6A-FBED8EFD6B31@gmail.com> References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> <49ADB654-E68B-4B88-AE8E-49F755092848@gmail.com> <20171003172857.2497b931@mwoffice.virtualtec.office> To: Markus Wild , FreeBSD-scsi , Freebsd fs X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 15:37:35 -0000 > On 03 Oct 2017, at 17:28, Markus Wild wrote: >=20 >>> Any way however to set rotationrate to 7200 (or to a slightly = greater value (*)) as well for iSCSI drives ? >>> I looked through ctl.conf(5) and iscsi.conf(5) but did not found = anything related. =20 >>=20 >> Sorry, (*) or to a slightly lower value (of course...). 
>> I forgot to mention that as the initiator, target is a FreeBSD 11.0 = server. >=20 > We use this in our ctl.conf to ensure vmware doesn't consider the = iscsi volumes to be ssd drives: >=20 > [...] > lun 1 { path /dev/zvol/data/volumes/zvol1 ; option rpm 10000 } > [...] Markus, thank you very much for the tip ! I'll test this as soon as my production will be fully online. Perfect ! :) Best, Ben= From owner-freebsd-scsi@freebsd.org Tue Oct 3 15:40:25 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 183B8E40A14; Tue, 3 Oct 2017 15:40:25 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: from mail-wr0-x233.google.com (mail-wr0-x233.google.com [IPv6:2a00:1450:400c:c0c::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B396A7E6C6; Tue, 3 Oct 2017 15:40:24 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: by mail-wr0-x233.google.com with SMTP id 54so6649055wrz.10; Tue, 03 Oct 2017 08:40:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=FSE6usK7Y8BeHJ1vB2YCw29sv4py6eOBtrt4Z1muMPs=; b=tjQLJ7HEyqC3Ne8lBwZ2X/zmKkev5ktDsQ40Kwl3ZNfeTjxQi2SU7NJfDa+klfKiv0 OIwDz24K0+ETcObhfbvUezspResLUnEhCxUQ0Nl8c0Y5lpzqEofkw54jxb1CYEQqP5K1 uREb58aCoKkdWuWh0IgigAU3TR0zecjqMOHU5YEpWpEBvBYxngHWXHqWd+r3sxhvJuVB XsTbajoojVSg6hjj9SGxIQTOCK74NSkTq4xZ48HDIzK07NzCokTP6rwOmKLPqjSaMnv9 QUyyjELdG+jOeB8qnPwuRLqhlQlu6W58moiNMnSozx3KIA8BU93JnXyqYggAXzXMVmJp Ujpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc 
:content-transfer-encoding:message-id:references:to; bh=FSE6usK7Y8BeHJ1vB2YCw29sv4py6eOBtrt4Z1muMPs=; b=dKM7qifoWMImcXBSTTRxFdnXjCiIXUJQe2sZq5cFrZ+9ifB1JnfRL3D8BAW/zbMpNb vlhUVERHtSJ4dYUFb8ebPH5G8C4nJYSqu9+uDxSTY4fHlHG56CN0Hw92eoHI63ua9LuN 5rfzeK31dJPJbx0Zp/JdrbiC2FgVrCrAR65VhxPYPpJ48jFFdDV5RNfxIqi7ZS4qtWz2 0Hkm9vyMySbvWpwBmEbn5G2z8JZPtvsMpvWxn3sfcRhCCZMOHdHcTSoSp9m3NzKsJK3v PeiyiG6ZXj3Abk/fiFBZs/LbCKRRo+gMWAfP45jvO2RDCTlcY3OrusOMzLdVEPL2qFtI Og4g== X-Gm-Message-State: AMCzsaW0653F+xyassHVQ2C+pnrmd/DWcs/p3eOtx4QOgtEBueNrBkKV eF5VmHXoBLyvtP7FzxacaaaxY4gnh/g= X-Google-Smtp-Source: AOwi7QBu1cG2mB/ax3xgwxZYYa3fd6Cy6w9FPD5NqP+PSqnqx0tuHYrM4BOnL49KSqFFr1UujujjWQ== X-Received: by 10.223.171.73 with SMTP id r9mr3244715wrc.118.1507045222910; Tue, 03 Oct 2017 08:40:22 -0700 (PDT) Received: from bens-mac.home (LFbn-MAR-1-445-220.w2-15.abo.wanadoo.fr. [2.15.38.220]) by smtp.gmail.com with ESMTPSA id d18sm7277435wra.89.2017.10.03.08.40.22 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 03 Oct 2017 08:40:22 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: ZFS prefers iSCSI disks over local ones ? 
From: Ben RUBSON In-Reply-To: <20171003151850.GA65538@in-addr.com> Date: Tue, 3 Oct 2017 17:40:21 +0200 Cc: Steven Hartland , FreeBSD-scsi , Freebsd fs , Andriy Gapon Content-Transfer-Encoding: quoted-printable Message-Id: References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> <20171003151850.GA65538@in-addr.com> To: Gary Palmer X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 15:40:25 -0000

> On 03 Oct 2017, at 17:18, Gary Palmer wrote:
>
> On Tue, Oct 03, 2017 at 05:03:18PM +0200, Ben RUBSON wrote:
>>> On 03 Oct 2017, at 16:58, Steven Hartland wrote:
>>>
>>> On 03/10/2017 15:40, Ben RUBSON wrote:
>>>> Hi,
>>>>
>>>> I'm starting a new thread to avoid confusion in the main one.
>>>> (ZFS stalled after some mirror disks were lost)
>>>>
>>>>
>>>>> On 03 Oct 2017, at 09:39, Steven Hartland wrote:
>>>>>
>>>>>
>>>>>> On 03/10/2017 08:31, Ben RUBSON wrote:
>>>>>>
>>>>>>
>>>>>>> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On 03/10/2017 07:12, Andriy Gapon wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 02/10/2017 21:12, Ben RUBSON wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On a FreeBSD 11 server, the following online/healthy zpool :
>>>>>>>>>
>>>>>>>>> home
>>>>>>>>>   mirror-0
>>>>>>>>>     label/local1
>>>>>>>>>     label/local2
>>>>>>>>>     label/iscsi1
>>>>>>>>>     label/iscsi2
>>>>>>>>>   mirror-1
>>>>>>>>>     label/local3
>>>>>>>>>     label/local4
>>>>>>>>>     label/iscsi3
>>>>>>>>>     label/iscsi4
>>>>>>>>> cache
>>>>>>>>>   label/local5
>>>>>>>>>   label/local6
>>>>>>>>>
>>>>>>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iscsi disk
>>>>>>>>> according to "zpool iostat", nothing on local disks (strange but I
>>>>>>>>> noticed that IOs always prefer iscsi disks to local disks).
>>>>>>>>>
>>>>>>>> Are your local disks SSD or HDD?
>>>>>>>> Could it be that iSCSI disks appear to be faster than the local disks
>>>>>>>> to the smart ZFS mirror code?
>>>>>>>>
>>>>>>>> Steve, what do you think?
>>>>>>>>
>>>>>>> Yes, that's quite possible: the mirror balancing uses the queue depth +
>>>>>>> rotating bias to determine the load of the disk, so if your iSCSI host
>>>>>>> is processing well and / or is reporting non-rotating vs rotating for
>>>>>>> the local disks, it could well be that the mirror is preferring reads
>>>>>>> from the less loaded iSCSI devices.
>>>>>>>
>>>>>> Note that local & iscsi disks are _exactly_ the same HDD (same model number,
>>>>>> same SAS adapter...). So iSCSI ones should be a little bit slower due to
>>>>>> network latency (even if it's very low in my case).
>>>>>>
>>>>> The output from gstat -dp on a loaded machine would be interesting to see too.
>>>>>
>>>> So here is the gstat -dp :
>>>>
>>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da0
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da1
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da2
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da3
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da4
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da5
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da6
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da7
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da8
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da9
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da10
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da11
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da12
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da13
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da14
>>>>     1    370    370  47326    0.7      0      0    0.0      0      0    0.0   23.2| da15
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da16
>>>>     0    357    357  45698    1.4      0      0    0.0      0      0    0.0   39.3| da17
>>>>     0    348    348  44572    0.7      0      0    0.0      0      0    0.0   22.5| da18
>>>>     0    432    432  55339    0.7      0      0    0.0      0      0    0.0   27.5| da19
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da20
>>>>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| da21
>>>>
>>>> The 4 active drives are the iSCSI targets of the above quoted pool.
>>>>
>>>> A local disk :
>>>>
>>>> Geom name: da7
>>>> Providers:
>>>> 1. Name: da7
>>>>    Mediasize: 4000787030016 (3.6T)
>>>>    Sectorsize: 512
>>>>    Mode: r0w0e0
>>>>    descr: HGSTxxx
>>>>    lunid: 5000xxx
>>>>    ident: NHGDxxx
>>>>    rotationrate: 7200
>>>>    fwsectors: 63
>>>>    fwheads: 255
>>>>
>>>> An iSCSI disk :
>>>>
>>>> Geom name: da19
>>>> Providers:
>>>> 1. Name: da19
>>>>    Mediasize: 3999688294912 (3.6T)
>>>>    Sectorsize: 512
>>>>    Mode: r1w1e2
>>>>    descr: FREEBSD CTLDISK
>>>>    lunname: FREEBSD MYDEVID 12
>>>>    lunid: FREEBSD MYDEVID 12
>>>>    ident: iscsi4
>>>>    rotationrate: 0
>>>>    fwsectors: 63
>>>>    fwheads: 255
>>>>
>>>> Sounds like then the faulty thing is the rotationrate set to 0 ?
>>>
>>> Absolutely
>>
>> Good catch then, thank you !
>>
>>> and from the looks you're not stressing the iSCSI disks enough for them
>>> to reach high queue depths, hence the preference.
>>> As load increases I would expect the local disks to start seeing activity.
>>
>> Yes this is also what I see.
>>
>> Any way however to set rotationrate to 7200 (or to a slightly greater value) as well for iSCSI drives ?
>> I looked through ctl.conf(5) and iscsi.conf(5) but did not find anything related.
>>
>> Many thanks !
>
> Use the "option" setting in ctl.conf to change the rpm value (documented
> in the OPTIONS section of ctladm(8)).

Thank you too, Gary, and sorry, your mail went to spam :/

Ben

From owner-freebsd-scsi@freebsd.org Wed Oct 4 19:05:14 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2E25BE4055B for ; Wed, 4 Oct 2017 19:05:14 +0000 (UTC) (envelope-from freebsd@omnilan.de) Received: from mx0.gentlemail.de (mx0.gentlemail.de [IPv6:2a00:e10:2800::a130]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B3FCC6DCF7 for ; Wed, 4 Oct 2017 19:05:13 +0000 (UTC) (envelope-from freebsd@omnilan.de) Received: from mh0.gentlemail.de (ezra.dcm1.omnilan.net [78.138.80.135]) by mx0.gentlemail.de (8.14.5/8.14.5) with ESMTP id v94J5AGb017613 for ; Wed, 4 Oct 2017 21:05:10 +0200 (CEST) (envelope-from freebsd@omnilan.de) Received: from titan.inop.mo1.omnilan.net (s1.omnilan.de [217.91.127.234]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mh0.gentlemail.de (Postfix) with ESMTPSA id AFDD5487; Wed, 4 Oct 2017 21:05:10 +0200 (CEST) Message-ID: <59D530E6.6090506@omnilan.de> Date: Wed, 04 Oct 2017 21:05:10 +0200 From: Harry Schmalzbauer Organization: OmniLAN User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; de-DE; rv:1.9.2.8)
Gecko/20100906 Lightning/1.0b2 Thunderbird/3.1.2 MIME-Version: 1.0 To: freebsd-scsi@freebsd.org Subject: Re: ZFS prefers iSCSI disks over local ones ? References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> <69fbca90-9a18-ad5d-a2f7-ad527d79f8ba@freebsd.org> <9342D2A7-CE29-445B-9C40-7B6A9C960D59@gmail.com> <49ADB654-E68B-4B88-AE8E-49F755092848@gmail.com> <20171003172857.2497b931@mwoffice.virtualtec.office> In-Reply-To: <20171003172857.2497b931@mwoffice.virtualtec.office> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Greylist: ACL 129 matched, not delayed by milter-greylist-4.2.7 (mx0.gentlemail.de [78.138.80.130]); Wed, 04 Oct 2017 21:05:10 +0200 (CEST) X-Milter: Spamilter (Reciever: mx0.gentlemail.de; Sender-ip: 78.138.80.135; Sender-helo: mh0.gentlemail.de; ) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Oct 2017 19:05:14 -0000

Regarding Markus Wild's message of 03.10.2017 17:28 (localtime):
>>> Any way however to set rotationrate to 7200 (or to a slightly greater value (*)) as well for iSCSI drives ?
>>> I looked through ctl.conf(5) and iscsi.conf(5) but did not find anything related.
>> Sorry, (*) or to a slightly lower value (of course...).
>> I forgot to mention that, like the initiator, the target is a FreeBSD 11.0 server.
> We use this in our ctl.conf to ensure vmware doesn't consider the iscsi volumes to be ssd drives:
>
> [...]
> lun 1 { path /dev/zvol/data/volumes/zvol1 ; option rpm 10000 }

mav@ also added the formfactor option in r273687, which is configurable via
ctl.conf(5) and documented in ctladm(8).
Another not very well known option is "product". This is significant for
WindowsServerBackup, for example.

A ctl.conf(5) LUN specification example I generally use:
lun 0 {
	# blocksize 4096 doesn't work for WSB2008, vhdx is prerequisite (2012+)!
	blocksize 4096
	device-id "da5"
	option vendor "FreeBSD-ctl"
	option product "BackVOL15-1"
	# RPM 0=not reported, 1=non-rotating(SSD), n>1024 rpm
	option rpm 7200
	# FormFactor 0=not reported, 1=5.25, 2=3.5, 3=2.5, 4=1.8, 5=less 1.8 inch
	option formfactor 2
	path /dev/da5
	serial "10000001"
}

-harry

From owner-freebsd-scsi@freebsd.org Fri Oct 6 10:09:00 2017 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 05F44E32C69; Fri, 6 Oct 2017 10:09:00 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: from mail-wm0-x244.google.com (mail-wm0-x244.google.com [IPv6:2a00:1450:400c:c09::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9652C7C264; Fri, 6 Oct 2017 10:08:59 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: by mail-wm0-x244.google.com with SMTP id q132so6924175wmd.2; Fri, 06 Oct 2017 03:08:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=UpHVYrJAyRFA/okHlePHWZbTVK8Y5iq4PgSPAvQf7wM=; b=UUODtDqWhtKIt7hgjHIh2lgw6PDjBGRRdbp9v6rQTAyXIIyTfT835M7t+/hnIEgtaN +6V0Ar/APAR/lFMDJyg9obpXsd1YnG/6ziy3DDqbpWvrfzB6/Urpt4eWX8A76zuagaKG Cuxd4YbUMOLtQLjCX2fJ4akMUL3nJgVxFMVTcbFlh0bdK71UFhu4N2f/rWfNfhccw2kw kKN9RvNeZjckqztmAIKn3e5OBC94EL3/qzRuGT2u70STsuXw3qAtYM3fauo3phN9dFSy op5aF7UIXxL7ZrgkqpFW6sntmmvzjZj9VLIPnZDmjtdoBv8wF54ySoRzGTspjhkMQZ2e XjIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=UpHVYrJAyRFA/okHlePHWZbTVK8Y5iq4PgSPAvQf7wM=;
b=R9LfkjmUmIEO/WjPJMGIe4Wk7B+iej5fKJIVnGp83tEZpdrANBWbf8tWOOcA4zPY9B hkUoAzLJta+NaVLRws7Iv9AHjIDrz3vic2tkyNHTqqJnJesR6J7aIYxMlo7dGjrftbO8 AYxumqSmXgf9lghskqPIxDeihmAgBL8U5qkI/xrmsuFx5rpPuDb7nLEdVQmBqGMjU0Ck LOtSKpdrOFf7mrZfS9a46eU9SWrKq0QBmt7QyMbwWQsDpry27l0GJA98k6vTIa26yklu oVT0xAAyLfEJszNCsQFNnCawsSYa8AiKK5j/y2G893o+WB1TAFdYZBC3db+MBkEMIrCZ 5OiQ== X-Gm-Message-State: AMCzsaVImHJhkq59FyGwsq/6rlndaqM9Sw6srdAdvyvXOBnVphNOwwAv K2YeOHctD4CxfO7obCBZnLhNolMq X-Google-Smtp-Source: AOwi7QDrbUNE/rsqNrs4O7gWR+HaVkzwREPc1nFG0/NchrPl3oW79X6VUfI4ae6tT+deecQzd4TLvA== X-Received: by 10.28.153.85 with SMTP id b82mr1125513wme.121.1507284537762; Fri, 06 Oct 2017 03:08:57 -0700 (PDT) Received: from bens-mac.home (LFbn-MAR-1-445-220.w2-15.abo.wanadoo.fr. [2.15.38.220]) by smtp.gmail.com with ESMTPSA id d17sm985661wrc.13.2017.10.06.03.08.56 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 06 Oct 2017 03:08:57 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: ZFS stalled after some mirror disks were lost From: Ben RUBSON In-Reply-To: Date: Fri, 6 Oct 2017 12:08:55 +0200 Cc: Freebsd fs Content-Transfer-Encoding: quoted-printable Message-Id: <82632887-E9D4-42D0-AC05-3764ABAC6B86@gmail.com> References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com> To: FreeBSD-scsi , =?utf-8?Q?Edward_Tomasz_Napiera=C5=82a?= X-Mailer: Apple Mail (2.3124) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Oct 2017 10:09:00 -0000 > On 02 Oct 2017, at 20:12, Ben RUBSON wrote: >=20 > Hi, >=20 > On a FreeBSD 11 server, the following online/healthy zpool : >=20 > home > mirror-0 > label/local1 > label/local2 > label/iscsi1 > label/iscsi2 > mirror-1 > label/local3 > label/local4 > label/iscsi3 > label/iscsi4 > cache > label/local5 > label/local6 >=20 > A sustained read throughput 
of 180 MB/s, 45 MB/s on each iscsi disk
> according to "zpool iostat", nothing on local disks.
> No write IOs.
>
> Let's disconnect all iSCSI disks :
> iscsictl -Ra
>
> Expected behavior :
> IO activity flawlessly continues on local disks.
>
> What happened :
> All IOs stalled, server only answers to IOs made to its zroot pool.
> All commands related to the iSCSI disks (iscsictl), or to ZFS (zfs/zpool),
> don't return.
>
> Questions :
> Why this behavior ?
> How to know what happens ? (/var/log/messages says almost nothing)
>
> I already disconnected the iSCSI disks without any issue in the past,
> several times, but there were almost no IOs running.
>
> Thank you for your help !
>
> Ben

Hello,

So first, many thanks again to Andriy: we spent almost 3 hours debugging
the stalled server to find the root cause of the issue.

It sounds like I need help from the iSCSI dev team (Edward perhaps ?), as
the issue seems to be on this side.

Here is Andriy's conclusion after the debug session, I quote him :

> So, it seems that the root cause of all evil is this outstanding zio (it might
> not be the only one).
> In other words, it looks like iscsi stack bailed out without completing all
> outstanding i/o requests that it had.
> It should either return success or error for every request, it can not simply
> drop a request.
> And that appears to be what happened here.
> It looks like ZFS is fragile in the face of this type of errors.
> Essentially, each logical i/o request obtains a configuration lock of type 'zio'
> in shared mode to prevent certain configuration changes from happening while
> there are any outstanding zio-s.
> If a zio is lost, then this lock is leaked.
> Then, the code that deals with vdev failures tries to take this lock in
> exclusive mode while holding a few other configuration locks also in exclusive
> mode so, any other thread needing those locks would block.
> And there are code paths where a configuration lock is taken while
> spa_namespace_lock is held.
> And when spa_namespace_lock is never dropped then the system is close to toast,
> because all pool lookups would get stuck.
> I don't see how this can be fixed in ZFS.
> It seems that when the initiator is being removed it doesn't properly terminate
> in-flight requests.
> It would be interesting to see what happens if you test other scenarios.

So I tested the following other scenarios :
1 - drop all iSCSI traffic using ipfw on the target
2 - ifdown the iSCSI NIC on the target
3 - ifdown the iSCSI NIC on the initiator
4 - stop ctld (on the target of course)

I tested all of them several times, 5 or 6 times each ?
I managed to trigger a kernel panic (!) twice.
First time in case 2.
Second time in case 4.
I can't rule out that the other test cases could panic as well, though.

Stack traces :
https://s1.postimg.org/2hfdpsvban/panic_case2.png
https://s1.postimg.org/2ac5ud9t0f/panic_case4.png

(kgdb) list *g_io_request+0x4a7
0xffffffff80a14dc7 is in g_io_request (/usr/src/sys/geom/geom_io.c:638).
633			g_bioq_unlock(&g_bio_run_down);
634			/* Pass it on down. */
635			if (first)
636				wakeup(&g_wait_down);
637		}
638	}
639	
640	void
641	g_io_deliver(struct bio *bp, int error)
642	{

I had some kernel panics on the same servers a few months ago, when losing
iSCSI targets which were used in a gmirror with local disks.
gmirror should have continued to work flawlessly (as ZFS) using local disks
but the server crashed.

Stack traces :
https://s1.postimg.org/14v4sabhv3/panic_g_destroy1.png
https://s1.postimg.org/437evsk6rz/panic_g_destroy2.png
https://s1.postimg.org/8pt1whiy5b/panic_g_destroy3.png

(kgdb) list *g_destroy_consumer+0x53
0xffffffff80a18563 is in g_destroy_consumer (geom.h:369).
364			KASSERT(g_valid_obj(ptr) == 0,
365			    ("g_free(%p) of live object, type %d", ptr,
366			    g_valid_obj(ptr)));
367		}
368	#endif
369		free(ptr, M_GEOM);
370	}
371	
372	#define g_topology_lock() 					\
373		do {							\

> I think that all problems that you have seen are different sides of the same
> underlying issue. It looks like iscsi does not properly depart from geom and
> leaves behind some dangling pointers...
>
> The panics you got today most likely occurred here:
> bp->bio_to->geom->start(bp);
>
> And the most likely reason is that bio_to points to a destroyed geom provider.
>
> I wonder if you'd be able to get into direct contact with a developer
> responsible for iscsi in FreeBSD. I think that it is a relatively recent
> addition and it was under a FreeBSD Foundation project. So, I'd expect that the
> developer should be responsive.

Feel free then to contact me if you need, so that we can go further on this !

Thank you very much for your help,

Ben
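
As an illustration of the lock leak Andriy describes above (each zio holds a configuration lock in shared mode, and a dropped zio never releases it), here is a minimal Python sketch. The class and function names are hypothetical and this is a toy model with non-blocking acquires, not the ZFS spa_config locking code.

```python
class ConfigLock:
    """Toy shared/exclusive lock with non-blocking acquires (illustration only)."""

    def __init__(self):
        self.readers = 0      # number of shared holders (in-flight zios)
        self.writer = False   # True while held in exclusive mode

    def enter_shared(self):
        if self.writer:
            return False
        self.readers += 1
        return True

    def exit_shared(self):
        assert self.readers > 0
        self.readers -= 1

    def try_exclusive(self):
        # Exclusive mode is only possible once every shared holder has left.
        if self.writer or self.readers:
            return False
        self.writer = True
        return True


def simulate(issued, completed):
    """Issue `issued` zios, complete `completed` of them, then try exclusive."""
    lock = ConfigLock()
    for _ in range(issued):
        lock.enter_shared()
    for _ in range(completed):
        lock.exit_shared()   # a zio that returned success or error
    # zios that were silently dropped never reach exit_shared().
    return lock.try_exclusive()


print(simulate(issued=3, completed=3))  # every request was answered
print(simulate(issued=3, completed=2))  # one request was dropped
```

In the second case the exclusive acquire can never succeed, however often it is retried, which is the toy equivalent of the vdev-failure path blocking forever while holding other locks.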