Date: Tue, 3 Oct 2017 08:19:04 +0200
From: Ben RUBSON <ben.rubson@gmail.com>
To: Freebsd fs <freebsd-fs@freebsd.org>, FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject: Re: ZFS stalled after some mirror disks were lost
Message-ID: <DDCFAC80-2D72-4364-85B2-7F4D7D70BCEE@gmail.com>
In-Reply-To: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com>
References: <4A0E9EB8-57EA-4E76-9D7E-3E344B2037D2@gmail.com>
Hi,

Putting the scsi list in copy as it could be related.

> On 02 Oct 2017, at 20:12, Ben RUBSON <ben.rubson@gmail.com> wrote:
> 
> Hi,
> 
> On a FreeBSD 11 server, the following online/healthy zpool :
> 
> home
>   mirror-0
>     label/local1
>     label/local2
>     label/iscsi1
>     label/iscsi2
>   mirror-1
>     label/local3
>     label/local4
>     label/iscsi3
>     label/iscsi4
> cache
>   label/local5
>   label/local6
> 
> A sustained read throughput of 180 MB/s, 45 MB/s on each iSCSI disk
> according to "zpool iostat", nothing on the local disks (strange, but I
> noticed that IOs always prefer the iSCSI disks to the local ones).
> No write IOs.
> 
> Let's disconnect all the iSCSI disks :
> iscsictl -Ra
> 
> Expected behavior :
> IO activity flawlessly continues on the local disks.
> 
> What happened :
> All IOs stalled, the server only answers to IOs made to its zroot pool.
> All commands related to the iSCSI disks (iscsictl) or to ZFS (zfs/zpool)
> do not return.
> 
> Questions :
> Why this behavior ?
> How to know what is happening ? (/var/log/messages says almost nothing)
> 
> I have already disconnected the iSCSI disks without any issue in the past,
> several times, but there were almost no IOs running then.
> 
> Thank you for your help !
> 
> Ben

> On 02 Oct 2017, at 22:55, Andriy Gapon <avg@FreeBSD.org> wrote:
> 
>> On 02/10/2017 22:13, Ben RUBSON wrote:
>> 
>>> On 02 Oct 2017, at 20:45, Andriy Gapon <avg@FreeBSD.org> wrote:
>>> 
>>>> On 02/10/2017 21:17, Ben RUBSON wrote:
>>>> 
>>>> Unfortunately the zpool command stalls / does not return :/
>>> 
>>> Try to take procstat -kk -a.
>> 
>> Here is the procstat output :
>> https://benrubson.github.io/zfs/procstat01.log
> 
> First, it seems that there are some iscsi threads stuck on a lock, like:
>     0 100291 kernel   iscsimt   mi_switch+0xd2 sleepq_wait+0x3a
> _sx_xlock_hard+0x592 iscsi_maintenance_thread+0x316 fork_exit+0x85
> fork_trampoline+0xe
> 
> or like
> 
>  8580 102077 iscsictl -         mi_switch+0xd2 sleepq_wait+0x3a
> _sx_slock_hard+0x325 iscsi_ioctl+0x7ea devfs_ioctl_f+0x13f kern_ioctl+0x2d4
> sys_ioctl+0x171 amd64_syscall+0x4ce Xfast_syscall+0xfb
> 
> Also, there is a thread in cam_sim_free():
>     0 100986 kernel   iscsimt   mi_switch+0xd2 sleepq_wait+0x3a
> _sleep+0x2a1 cam_sim_free+0x48 iscsi_session_cleanup+0x1bd
> iscsi_maintenance_thread+0x388 fork_exit+0x85 fork_trampoline+0xe
> 
> So, it looks like there could be a problem in the iscsi teardown path.
> 
> Maybe that caused a domino effect in the ZFS code. I see a lot of threads
> waiting either for spa_namespace_lock or for a spa config lock (a highly
> specialized ZFS lock). But it is hard to untangle their inter-dependencies.
> 
> Some of the ZFS I/O threads are also affected, for example:
>     0 101538 kernel   zio_write_issue_ mi_switch+0xd2 sleepq_wait+0x3a
> _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236
> taskqueue_run_locked+0x14a taskqueue_thread_loop+0xe8 fork_exit+0x85
> fork_trampoline+0xe
>  8716 101319 sshd     -         mi_switch+0xd2 sleepq_wait+0x3a
> _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236
> zio_nowait+0x49 arc_read+0x8e4 dbuf_read+0x6c2 dmu_buf_hold_array_by_dnode+0x1d3
> dmu_read_uio_dnode+0x41 dmu_read_uio_dbuf+0x3b zfs_freebsd_read+0x5fc
> VOP_READ_APV+0x89 vn_read+0x157 vn_io_fault1+0x1c2 vn_io_fault+0x197
> dofileread+0x98
> 71181 101141 encfs    -         mi_switch+0xd2 sleepq_wait+0x3a
> _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236
> zio_nowait+0x49 arc_read+0x8e4 dbuf_read+0x6c2 dmu_buf_hold+0x3d
> zap_lockdir+0x43 zap_cursor_retrieve+0x171 zfs_freebsd_readdir+0x3f3
> VOP_READDIR_APV+0x8f kern_getdirentries+0x21b sys_getdirentries+0x28
> amd64_syscall+0x4ce Xfast_syscall+0xfb
> 71181 101190 encfs    -         mi_switch+0xd2 sleepq_wait+0x3a
> _cv_wait+0x194 spa_config_enter+0x9b zio_vdev_io_start+0x1c2 zio_execute+0x236
> zio_nowait+0x49 arc_read+0x8e4 dbuf_prefetch_indirect_done+0xcc arc_read+0x425
> dbuf_prefetch+0x4f7 dmu_zfetch+0x418 dmu_buf_hold_array_by_dnode+0x34d
> dmu_read_uio_dnode+0x41 dmu_read_uio_dbuf+0x3b zfs_freebsd_read+0x5fc
> VOP_READ_APV+0x89 vn_read+0x157
> 
> Note that the first of these threads executes a write zio.
> 
> It would be nice to determine the owner of spa_namespace_lock.
> If you have debug symbols, then it can easily be done in kgdb on the live system:
> (kgdb) p spa_namespace_lock
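
(To spot these blocked threads in such a dump, one can simply filter the
procstat output on the function names quoted above ; only the symbols appear
in the stacks, not the lock names. A rough sketch, the patterns being nothing
more than the functions from Andriy's traces :

procstat -kk -a | egrep 'spa_config_enter|iscsi_maintenance|cam_sim_free'

Each matching line is one thread, with its full kernel stack.)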
So as said a few minutes ago, I lost access to the server and had to recycle it.
Thankfully I managed to reproduce the issue, re-playing exactly the same steps.

Curious line in /var/log/messages :
kernel: g_access(918): provider da18 has error
(da18 is the remaining iSCSI target device, which did not disconnect properly)

procstat -kk -a :
https://benrubson.github.io/zfs/procstat02.log

(kgdb) p spa_namespace_lock
$1 = -2110867066

Thank you !

Ben
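
PS : the $1 value above looks like a raw word rather than the lock itself,
kgdb probably misses the ZFS debug symbols. With symbols loaded, something
along these lines should reveal the owning thread. A sketch only, assuming
the FreeBSD opensolaris compat layer implements kmutex_t as a struct sx,
whose sx_lock word holds the owner's thread pointer with a few low flag
bits set (hence the ~0xf mask) :

(kgdb) p spa_namespace_lock.lock_object.lo_name
(kgdb) p/x spa_namespace_lock.sx_lock
(kgdb) set $owner = (struct thread *)(spa_namespace_lock.sx_lock & ~0xf)
(kgdb) p $owner->td_tid
(kgdb) p $owner->td_name

The td_tid printed there should match one of the thread ids in the
procstat output, which would tell us who holds spa_namespace_lock.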