Date:      Sun, 24 Nov 2019 06:54:01 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 241980] panic: I/O to pool appears to be hung on vdev
Message-ID:  <bug-241980-227-RQMzKqf3Fh@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-241980-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-241980-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241980

--- Comment #21 from Eugene Grosbein <eugen@freebsd.org> ---
It took me unexpectedly long to get the debugging output, as I struggled to
make the loader perform a one-time (nextboot) load of the patched zfs.ko while
being able to do only one reboot per day on this production machine. It turns
out we have a documented way to do this for the kernel, but not for modules.
After several attempts, setting zfs_name="/boot/nextboot/zfs.ko" in
/boot/nextboot.conf finally did it.
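
For the record, the whole one-time sequence looks roughly like this (the source
path of the patched module is only a placeholder, and I assume the usual
nextboot_enable gate that nextboot(8) itself writes for the kernel case):

# mkdir -p /boot/nextboot
# cp /path/to/patched/zfs.ko /boot/nextboot/zfs.ko
# cat > /boot/nextboot.conf <<'EOF'
nextboot_enable="YES"
zfs_name="/boot/nextboot/zfs.ko"
EOF
# shutdown -r now

On the next boot the loader should pick up /boot/nextboot/zfs.ko instead of the
stock module, and the nextboot_enable flag should be cleared again so later
boots are unaffected.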

# TZ=UTC sysctl kern.boottime
kern.boottime: { sec = 1574393119, usec = 301171 } Fri Nov 22 03:25:19 2019

Local time is UTC+3. After two days of uptime, I got this in the log:

Nov 24 06:24:36 col02 kernel: sata SLOW IO: zio io_type 3 timestamp 171750435607373ns, delta 1006734932093ns, last io 172757121535759ns
I/O to pool 'sata' appears to be hung on vdev guid 3313589389580178043 at '/dev/da2.eli' active zio 65
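
For scale, the reported delta of 1006734932093 ns is roughly 1006.7 seconds, so
this zio had supposedly been outstanding for almost 17 minutes when the hang
detection fired.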

I've used this script to convert ZFS timestamps (getnanouptime expressed in
nanoseconds) to readable time:

#!/bin/sh
# Strip the nine low-order digits (nanoseconds) from the argument, leaving
# whole seconds of uptime, and add them to the "sec" field of kern.boottime.
date -jr $(sysctl -n kern.boottime | awk -vt=${1%?????????} -F'[, ]' '{print t+$4}')
#EOF

First, there was no hardware hang: the system is alive, both ZFS pools still
work, and dd if=/dev/da2.eli works too. Second, io_type 3 is ZIO_TYPE_FREE from
/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h.
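
For anyone who wants to double-check the mapping, the enum can be printed
straight from that header (assuming the typedef there is still spelled
zio_type_t):

# awk '/typedef enum zio_type/,/} zio_type_t;/' \
    /sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h

ZIO_TYPE_NULL is 0, so ZIO_TYPE_FREE ends up with the value 3.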

# gethrtime 172757121535759
Sun Nov 24 06:24:36 MSK 2019
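
That is, 172757121535759 ns is about 172757 seconds of uptime; 1574393119 +
172757 = 1574565876, which is Sun Nov 24 03:24:36 UTC, i.e. 06:24:36 MSK, the
very second the SLOW IO message above was logged.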

The 65 active zios correspond to the L(q) column shown by gstat:

# gstat -adI1s -f 'da[2-6].*'
dT: 1.001s  w: 1.000s  filter: da[2-6].*
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
   65    774     37   3924    9.0    540  25707    0.7    193   8532  212.6    84.5| da2
   65    780     39   4144   14.3    544  26007    0.9    193   7724  241.7    91.0| da3
   65    771     33   3345    9.3    542  24448    0.7    192   8100  255.5    90.7| da4
   65    851     38   4004    7.8    554  28840    1.4    255   9679  236.0    97.1| da5
   65    766     36   4100    4.5    534  22730    0.5    192   8911  208.8    75.6| da6
   65    693     23   4144   14.8    474  26007    1.2    193   7724  243.0    95.0| da3.eli
   65    691     20   3345    9.4    476  24448    1.0    192   8100  255.8    95.1| da4.eli
   65    688     22   3924    9.8    470  25707    1.0    193   8532  213.0    89.1| da2.eli
   65    753     22   4004   11.6    473  28840    1.7    255   9679  238.3   100.7| da5.eli
   65    691     20   4100    5.6    476  22730    0.7    192   8911  210.6    80.7| da6.eli



