From owner-freebsd-hackers@freebsd.org Fri Nov 29 11:05:39 2019
From: Willem Jan Withagen <wjw@digiware.nl>
To: Eugene Grosbein, Konstantin Belousov
Cc: FreeBSD Hackers, Alexander Motin, Andriy Gapon
Subject: Re: Process in T state does not want to die.....
Date: Fri, 29 Nov 2019 12:05:34 +0100

On 29-11-2019 11:43, Eugene Grosbein wrote:
> 29.11.2019 16:24, Eugene Grosbein wrote:
>
>> 29.11.2019 4:46, Konstantin Belousov wrote:
>>
>>>> sys_extattr_set_fd+0xee amd64_syscall+0x364 fast_syscall_common+0x101
>>>
>>> This is an example of the cause for your problem.
>>
>> I observe this problem too, but my use case is different.
>>
>> I have several bhyve instances running Windows guests over ZVOLs on an SSD-only RAIDZ1 pool.
>> "zfs destroy" for snapshots with large "used" numbers takes a long time (several minutes) due to slow TRIM.
>> Sometimes this makes the virtual guest unresponsive, and an attempt to restart the bhyve instance may bring it
>> into the Exiting (E) state for several minutes, after which it finishes successfully. But sometimes the bhyve
>> process hangs in the T state indefinitely.
>>
>> This is 11.3-STABLE/amd64 r354667. Should I try your patch too?
>
> OTOH, the same system has several FreeBSD jails on mounted ZFS file systems on the same pool.
> These file systems have snapshots created and removed too, and the snapshots are large (up to 10G).

From what I get from Konstantin, this problem is due to memory pressure built up by both ZFS and the buffer
cache used by UFS, and the buffer cache is waiting for some buffer memory to become free before it can do
its work.

If wanted, I can try to put a ZFS filesystem on /dev/ggate0 so that any buffering would be done by ZFS and
not by UFS.

But even with the patch I still have:

root 3471  0.0  5.8  646768 480276  -  TsJ  11:16   0:10.74 ceph-osd -i 0
root 3530  0.0 11.8 1153860 985020  -  TsJ  11:17   0:11.51 ceph-osd -i 1
root 3532  0.0  5.3  608760 438676  -  TsJ  11:17   0:07.31 ceph-osd -i 2
root 3534  0.0  3.2  435564 266328  -  IsJ  11:17   0:07.35 ceph-osd -i 3
root 3536  0.0  4.8  565792 398392  -  IsJ  11:17   0:08.73 ceph-osd -i 5
root 3553  0.0  2.3  362892 192348  -  TsJ  11:17   0:04.21 ceph-osd -i 6
root 3556  0.0  3.0  421516 246956  -  TsJ  11:17   0:04.81 ceph-osd -i 4

And from the procstat -kk output further below it looks like things are still stuck in bwillwrite, but now
with another set of functions: I guess this time not writing an extended attribute (extattr_set_fd()) but
writing a file.
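To illustrate what Konstantin described: a bwillwrite()-style throttle simply puts a writing thread to sleep
while the amount of outstanding dirty buffer space is above a high watermark, and only wakes it once the
flusher has pushed enough of it back out. The sketch below is only a rough userland model of that idea; every
name and number in it is made up for illustration and it is not the actual FreeBSD buffer-cache code.

    /*
     * Rough userland model of a bwillwrite()-style write throttle.
     * NOT the actual FreeBSD buffer-cache code: all names and numbers
     * below are made up purely for illustration.
     */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  buf_cv   = PTHREAD_COND_INITIALIZER;
    static long dirty_space;                /* outstanding dirty buffer space */
    static const long hi_watermark = 8;     /* writers sleep above this */
    static const long lo_watermark = 2;     /* flusher wakes writers below this */

    /* What a write(2) path would call before dirtying another buffer. */
    static void
    will_write(void)
    {
            pthread_mutex_lock(&buf_lock);
            while (dirty_space >= hi_watermark)     /* the "bwillwrite" sleep */
                    pthread_cond_wait(&buf_cv, &buf_lock);
            dirty_space++;                          /* dirty one more buffer */
            pthread_mutex_unlock(&buf_lock);
    }

    /* Models the flusher pushing dirty buffers to (slow) backing storage. */
    static void *
    flusher(void *arg)
    {
            for (;;) {
                    sleep(1);                       /* slow backing store */
                    pthread_mutex_lock(&buf_lock);
                    if (dirty_space > 0)
                            dirty_space--;
                    if (dirty_space <= lo_watermark)
                            pthread_cond_broadcast(&buf_cv);
                    pthread_mutex_unlock(&buf_lock);
            }
            return (NULL);
    }

    static void *
    writer(void *arg)
    {
            for (int i = 0; i < 20; i++) {
                    will_write();
                    printf("wrote block %d\n", i);
            }
            return (NULL);
    }

    int
    main(void)
    {
            pthread_t fl, wr;

            pthread_create(&fl, NULL, flusher, NULL);
            pthread_create(&wr, NULL, writer, NULL);
            pthread_join(wr, NULL);   /* writer advances only as fast as the flusher */
            return (0);
    }

In the model the writer makes quick progress until it hits the watermark and then advances only as fast as
the deliberately slow flusher, which is roughly what a write()-heavy ceph-osd sees when the buffer cache is
saturated.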
# ps -o pid,lwp,flags,flags2,state,tracer,command -p 3471
 PID    LWP        F       F2 STAT TRACER COMMAND
3471 104097 11080081 00000000 TsJ       0 ceph-osd -i 0

# procstat -kk 3471
 3471 104310 ceph-osd         journal_write    mi_switch+0xe0 sleepq_wait+0x2c _sleep+0x247 bwillwrite+0x97 dofilewrite+0x93 sys_writev+0x6e amd64_syscall+0x362 fast_syscall_common+0x101
 3471 104311 ceph-osd         fn_jrn_objstore  mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104312 ceph-osd         tp_fstore_op     mi_switch+0xe0 sleepq_wait+0x2c _sleep+0x247 bwillwrite+0x97 dofilewrite+0x93 sys_write+0xc1 amd64_syscall+0x362 fast_syscall_common+0x101
 3471 104313 ceph-osd         tp_fstore_op     mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104314 ceph-osd         fn_odsk_fstore   mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104315 ceph-osd         fn_appl_fstore   mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104316 ceph-osd         safe_timer       mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104355 ceph-osd         ms_dispatch      mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104356 ceph-osd         ms_local         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104357 ceph-osd         safe_timer       mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104358 ceph-osd         fn_anonymous     mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104359 ceph-osd         safe_timer       mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104360 ceph-osd         ms_dispatch      mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104361 ceph-osd         ms_local         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104362 ceph-osd         ms_dispatch      mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104363 ceph-osd         ms_local         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104364 ceph-osd         ms_dispatch      mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104365 ceph-osd         ms_local         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104366 ceph-osd         ms_dispatch      mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104367 ceph-osd         ms_local         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104368 ceph-osd         ms_dispatch      mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104369 ceph-osd         ms_local         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104370 ceph-osd         ms_dispatch      mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104371 ceph-osd         ms_local         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104372 ceph-osd         fn_anonymous     mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104373 ceph-osd         finisher         mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104374 ceph-osd         safe_timer       mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104375 ceph-osd         safe_timer       mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104376 ceph-osd         osd_srv_agent    mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104377 ceph-osd         tp_osd_tp        mi_switch+0xe0 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 3471 104378 ceph-osd         tp_osd_tp        mi_switch+0xe0 thread_suspend_switch+0x140 thread_single+0x47b sigexit+0x53 postsig+0x304 ast+0x327 fast_syscall_common+0x198
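The last thread (104378) is the one handling the kill: it is in postsig() -> sigexit() -> thread_single(),
waiting for every other thread of the process to park in thread_suspend_check(). It looks like the threads
sleeping in bwillwrite() never reach that check, so the single-threading never completes and the process sits
in the T state. Below is a very rough userland model of that interaction; again, all names are made up and
this is not the real kernel logic, just the shape of the hang.

    /*
     * Rough userland model of the exit/suspend interaction seen above.
     * NOT the real kernel logic: all names are made up for illustration.
     */
    #include <pthread.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NTHREADS 4

    static pthread_mutex_t proc_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  proc_cv   = PTHREAD_COND_INITIALIZER;
    static bool exiting;            /* set by the thread handling the kill */
    static int  suspended;          /* threads parked in the suspend check */

    /* Models thread_suspend_check(): only a thread that gets here can park. */
    static void
    suspend_check(void)
    {
            pthread_mutex_lock(&proc_lock);
            if (exiting) {
                    suspended++;
                    pthread_cond_broadcast(&proc_cv);
                    while (exiting)                 /* parked "in T state" */
                            pthread_cond_wait(&proc_cv, &proc_lock);
                    suspended--;
            }
            pthread_mutex_unlock(&proc_lock);
    }

    static void *
    worker(void *arg)
    {
            bool stuck = ((intptr_t)arg == 0);      /* worker 0 plays the bwillwrite thread */

            for (;;) {
                    if (stuck)
                            sleep(3600);            /* long sleep, never reaches the check */
                    suspend_check();                /* the other workers park here */
                    usleep(1000);
            }
            return (NULL);
    }

    /* Models thread_single() as called from sigexit(): wait for everyone to park. */
    static void
    single_thread(void)
    {
            pthread_mutex_lock(&proc_lock);
            exiting = true;
            printf("waiting for %d threads to suspend...\n", NTHREADS);
            while (suspended < NTHREADS)            /* never true while worker 0 sleeps */
                    pthread_cond_wait(&proc_cv, &proc_lock);
            pthread_mutex_unlock(&proc_lock);
            printf("all threads suspended, exit can proceed\n");
    }

    int
    main(void)
    {
            pthread_t tids[NTHREADS];

            for (intptr_t i = 0; i < NTHREADS; i++)
                    pthread_create(&tids[i], NULL, worker, (void *)i);
            sleep(1);
            single_thread();        /* hangs for as long as worker 0 stays asleep */
            return (0);
    }

In this model single_thread() never returns while worker 0 is asleep; on the real system the exit can only
finish once the bwillwrite() sleep ends, i.e. once buffer space becomes available again.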