Date: Sat, 26 May 2018 21:54:10 +0200
From: Alexander Leidinger <Alexander@leidinger.net>
To: freebsd-current@freebsd.org
Subject: Re: Deadlocks / hangs in ZFS
Message-ID: <20180526215410.Horde.TLpIgePvctlYUqw9QcqlgGR@webmail.leidinger.net>
In-Reply-To: <fa263af4-9bf7-88f8-8d23-21456daf7960@FreeBSD.org>
References: <20180522101749.Horde.Wxz9gSxx1xArxkYMQqTL0iZ@webmail.leidinger.net> <fa263af4-9bf7-88f8-8d23-21456daf7960@FreeBSD.org>
Quoting Steve Wills <swills@freebsd.org> (from Tue, 22 May 2018
08:17:00 -0400):
> I may be seeing similar issues. Have you tried leaving top -SHa
> running and seeing what threads are using CPU when it hangs? I did
> and saw pid 17 [zfskern{txg_thread_enter}] using lots of CPU but no
> disk activity happening. Do you see similar?

For me it is a different ZFS kernel thread: l2arc_feed_thread.
Please note that there are still 31 GB free, so it doesn't look like
resource exhaustion. What I consider strange is the swap usage: I
watched the system, and it started to use swap while more than 30 GB
were listed as free (swap in/out rates were visible from time to time,
despite plenty of free RAM).

last pid: 93392;  load averages:  0.16,  0.44,  1.03    up 1+15:36:34  22:35:45
1509 processes: 17 running, 1392 sleeping, 3 zombie, 97 waiting
CPU:  0.1% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.9% idle
Mem: 597M Active, 1849M Inact, 6736K Laundry, 25G Wired, 31G Free
ARC: 20G Total, 9028M MFU, 6646M MRU, 2162M Anon, 337M Header, 1935M Other
     14G Compressed, 21G Uncompressed, 1.53:1 Ratio
Swap: 4096M Total, 1640M Used, 2455M Free, 40% Inuse

  PID JID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   10   0 root     155 ki31     0K   256K CPU1    1  35.4H 100.00% [idle{idle: cpu1}]
   10   0 root     155 ki31     0K   256K CPU11  11  35.2H 100.00% [idle{idle: cpu11}]
   10   0 root     155 ki31     0K   256K CPU3    3  35.2H 100.00% [idle{idle: cpu3}]
   10   0 root     155 ki31     0K   256K CPU15  15  35.1H 100.00% [idle{idle: cpu15}]
   10   0 root     155 ki31     0K   256K RUN     9  35.1H 100.00% [idle{idle: cpu9}]
   10   0 root     155 ki31     0K   256K CPU5    5  35.0H 100.00% [idle{idle: cpu5}]
   10   0 root     155 ki31     0K   256K CPU14  14  35.0H 100.00% [idle{idle: cpu14}]
   10   0 root     155 ki31     0K   256K CPU0    0  35.8H  99.12% [idle{idle: cpu0}]
   10   0 root     155 ki31     0K   256K CPU6    6  35.3H  98.79% [idle{idle: cpu6}]
   10   0 root     155 ki31     0K   256K CPU8    8  35.1H  98.31% [idle{idle: cpu8}]
   10   0 root     155 ki31     0K   256K CPU12  12  35.0H  97.24% [idle{idle: cpu12}]
   10   0 root     155 ki31     0K   256K CPU4    4  35.4H  96.71% [idle{idle: cpu4}]
   10   0 root     155 ki31     0K   256K CPU10  10  35.0H  92.37% [idle{idle: cpu10}]
   10   0 root     155 ki31     0K   256K CPU7    7  35.2H  92.20% [idle{idle: cpu7}]
   10   0 root     155 ki31     0K   256K CPU13  13  35.1H  91.90% [idle{idle: cpu13}]
   10   0 root     155 ki31     0K   256K CPU2    2  35.4H  90.97% [idle{idle: cpu2}]
   11   0 root     -60    -     0K   816K WAIT    0  15:08   0.82% [intr{swi4: clock (0)}]
   31   0 root     -16    -     0K    80K pwait   0  44:54   0.60% [pagedaemon{dom0}]
45453   0 root      20    0 16932K  7056K CPU9    9   4:12   0.24% top -SHaj
   24   0 root      -8    -     0K   256K l2arc_  0   4:12   0.21% [zfskern{l2arc_feed_thread}]
 2375   0 root      20    0 16872K  6868K select 11   3:52   0.20% top -SHua
 7007  12 235       20    0 18017M   881M uwait  12   0:00   0.19% [java{ESH-thingHandler-35}]
   32   0 root     -16    -     0K    16K psleep 15   5:03   0.11% [vmdaemon]
41037   0 netchild  27    0 18036K  9136K select  4   2:20   0.09% tmux: server (/tmp/tmux-1001/default) (t
   36   0 root     -16    -     0K    16K -       6   2:02   0.09% [racctd]
 7007  12 235       20    0 18017M   881M uwait   9   1:24   0.07% [java{java}]
 4746   0 root      20    0 13020K  3792K nanslp  8   0:52   0.05% zpool iostat space 1
    0   0 root     -76    -     0K 10304K -       4   0:16   0.05% [kernel{if_io_tqg_4}]
 5550   8 933       20    0  2448M   607M uwait   8   0:41   0.03% [java{java}]
 5550   8 933       20    0  2448M   607M uwait  13   0:03   0.03% [java{Timer-1}]
 7007  12 235       20    0 18017M   881M uwait   0   0:39   0.02% [java{java}]
 5655   8 560       20    0 21524K  4840K select  6   0:21   0.02% /usr/local/sbin/hald{hald}
   30   0 root     -16    -     0K    16K -       4   0:25   0.01% [rand_harvestq]
 1259   0 root      20    0 18780K 18860K select 14   0:19   0.01% /usr/sbin/ntpd -c /etc/ntp.conf -p /var/
    0   0 root     -76    -     0K 10304K -      12   0:19   0.01% [kernel{if_config_tqg_0}]
   31   0 root     -16    -     0K    80K psleep  0   0:38   0.01% [pagedaemon{dom1}]
    0   0 root     -76    -     0K 10304K -       5   0:04   0.01% [kernel{if_io_tqg_5}]
 7007  12 235       20    0 18017M   881M uwait   1   0:16   0.01% [java{Karaf Lock Monitor }]
12622   2 88        20    0  1963M   247M uwait   7   0:13   0.01% [mysqld{mysqld}]
27043   0 netchild  20    0 18964K  9124K select  6   0:01   0.01% sshd: netchild@pts/0 (sshd)
 7007  12 235       20    0 18017M   881M uwait   8   0:10   0.01% [java{openHAB-job-schedul}]
 7007  12 235       20    0 18017M   881M uwait   6   0:10   0.01% [java{openHAB-job-schedul}]
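
The header figures in the top output above can be cross-checked with a
few lines of Python. This is only a minimal sketch, assuming the "Mem:"
and "Swap:" lines keep the format shown here; the function names
(parse_sizes, summarize, looks_odd) and the 1 GiB threshold are
hypothetical and chosen purely for illustration. It flags the situation
described above: swap in use while plenty of RAM is listed as free.

```python
import re

# Header lines copied verbatim from the top(1) output in this message.
TOP_HEADER = """\
Mem: 597M Active, 1849M Inact, 6736K Laundry, 25G Wired, 31G Free
Swap: 4096M Total, 1640M Used, 2455M Free, 40% Inuse
"""

# Scale factors that convert top's K/M/G suffixes to MiB.
UNITS = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_sizes(line):
    """Return {label: size in MiB} for every '<number><K|M|G> <Label>' pair."""
    return {
        label: float(num) * UNITS[unit]
        for num, unit, label in re.findall(r"(\d+)([KMG]) (\w+)", line)
    }


def summarize(header):
    """Extract free RAM and swap usage from the Mem: and Swap: lines."""
    mem = swap = None
    for line in header.splitlines():
        if line.startswith("Mem:"):
            mem = parse_sizes(line)
        elif line.startswith("Swap:"):
            swap = parse_sizes(line)
    return {
        "free_mib": mem["Free"],
        "swap_used_mib": swap["Used"],
        "swap_used_pct": 100.0 * swap["Used"] / swap["Total"],
    }


def looks_odd(summary, free_threshold_mib=1024.0):
    """Hypothetical heuristic: swap is in use although RAM is listed as free."""
    return summary["swap_used_mib"] > 0 and summary["free_mib"] > free_threshold_mib


if __name__ == "__main__":
    s = summarize(TOP_HEADER)
    print(f"free: {s['free_mib']:.0f} MiB, "
          f"swap used: {s['swap_used_mib']:.0f} MiB ({s['swap_used_pct']:.1f}%)")
    print("swap in use despite free RAM:", looks_odd(s))
```

Fed with periodic snapshots of the header, the same check could log when
swap usage starts to grow while free memory stays high.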
> On 05/22/18 04:17, Alexander Leidinger wrote:
>> Hi,
>>
>> does someone else experience deadlocks / hangs in ZFS?
>>
>> What I see is that if on a 2 socket / 4 cores -> 16 threads system
>> I do a lot in parallel (e.g. updating ports in several jails), then
>> the system may get into a state where I can log in, but any exit
>> (e.g. from top) or logout from the shell blocks somewhere. Sometimes
>> it helps to CTRL-C all updates to get the system into good shape
>> again, but most of the time it doesn't.
>>
>> On another system at the same rev (333966) with a lot less CPUs
>> (and AMD instead of Intel), I don't see such a behavior.
>>
>> Bye,
>> Alexander.
>>
--
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org netchild@FreeBSD.org : PGP 0x8F31830F9F2772BF
