From owner-freebsd-current@freebsd.org Sat May 23 10:56:15 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 7DC232FE679 for ; Sat, 23 May 2020 10:56:15 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 49TgJZ2p3Gz3VVt; Sat, 23 May 2020 10:56:13 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id 04NAu29M054220 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Sat, 23 May 2020 13:56:05 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 04NAu29M054220 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id 04NAu11J054219; Sat, 23 May 2020 13:56:01 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 23 May 2020 13:56:01 +0300 From: Konstantin Belousov To: Rick Macklem Cc: Ryan Libby , "freebsd-current@FreeBSD.org" Subject: Re: r358252 causes intermittent hangs where processes are stuck sleeping on btalloc Message-ID: <20200523105601.GN64045@kib.kiev.ua> References: <20200521101428.GC64045@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on tom.home X-Rspamd-Queue-Id: 49TgJZ2p3Gz3VVt X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=gmail.com (policy=none); spf=softfail (mx1.freebsd.org: 2001:470:d5e7:1::1 is neither permitted nor denied by domain of kostikbel@gmail.com) smtp.mailfrom=kostikbel@gmail.com X-Spamd-Result: default: False [-0.82 / 15.00]; ARC_NA(0.00)[]; TO_DN_EQ_ADDR_SOME(0.00)[]; RCVD_TLS_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; FREEMAIL_FROM(0.00)[gmail.com]; NEURAL_HAM_LONG(-0.32)[-0.325]; MIME_GOOD(-0.10)[text/plain]; HAS_XAW(0.00)[]; NEURAL_HAM_MEDIUM(-0.27)[-0.270]; R_SPF_SOFTFAIL(0.00)[~all:c]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_HAM_SHORT(-0.23)[-0.228]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:6939, ipnet:2001:470::/32, country:US]; RCVD_COUNT_TWO(0.00)[2]; FREEMAIL_ENVFROM(0.00)[gmail.com]; DMARC_POLICY_SOFTFAIL(0.10)[gmail.com : No valid SPF, No valid DKIM,none] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 May 2020 10:56:15 -0000 On Fri, May 22, 2020 at 11:46:26PM +0000, Rick Macklem wrote: > Konstantin Belousov wrote: > >On Wed, May 20, 2020 at 11:58:50PM -0700, Ryan Libby wrote: > >> On Wed, May 20, 2020 at 6:04 PM Rick Macklem wrote: > >> > > >> > Hi, > >> > > >> > Since I hadn't upgraded a kernel through the winter, it took me a while > >> > to bisect this, but r358252 seems to be the culprit. > >> > > >> > If I do a kernel build over NFS using my not so big Pentium 4 (single core, > >> > 1.25Gbytes RAM, i386), about every second attempt will hang. > >> > When I do a "ps" in the debugger, I see processes sleeping on btalloc. > >> > If I revert to r358251, I cannot reproduce this. > >> > > >> > Any ideas? > >> > > >> > I can easily test any change you might suggest to see if it fixes the > >> > problem. > >> > > >> > If you want more debug info, let me know, since I can easily > >> > reproduce it. > >> > > >> > Thanks, rick > >> > >> Nothing obvious to me. I can maybe try a repro on a VM... > >> > >> ddb ps, acttrace, alltrace, show all vmem, show page would be welcome. > >> > >> "btalloc" is "We're either out of address space or lost a fill race." > > > >Yes, I would be not surprised to be out of something on 1G i386 machine. > >Please also add 'show alllocks'. > Ok, I used an up to date head kernel and it took longer to reproduce a hang. > This time, none of the processes are stuck on "btalloc". > I'll try and give you most of the above, but since I have to type it in by hand > from the screen, I might not get it all. (I'm no real typist;-) > > show alllocks > exclusive lockmgr ufs (ufs) r = 0 locked @ kern/vfs_subr.c: 3259 > exclusive lockmgr nfs (nfs) r = 0 locked @ kern/vfs_lookup.c:737 > exclusive sleep mutex kernel area domain (kernel arena domain) r = 0 locked @ kern/subr_vmem.c:1343 > exclusive lockmgr bufwait (bufwait) r = 0 locked @ kern/vfs_bio.c:1663 > exclusive lockmgr ufs (ufs) r = 0 locked @ kern/vfs_subr.c:2930 > exclusive lockmgr syncer (syncer) r = 0 locked @ kern/vfs_subr.c:2474 > Process 12 (intr) thread 0x.. (1000008) > exclusive sleep mutex Giant (Giant) r = 0 locked @ kern/kern_intr.c:1152 > > > ps > - Not going to list them all, but here are the ones that seem interesting... > 18 0 0 0 DL vlruwt 0x11d939cc [vnlru] > 16 0 0 0 DL (threaded) [bufdaemon] > 100069 D qsleep [bufdaemon] > 100074 D - [bufspacedaemon-0] > 100084 D sdflush 0x11923284 [/ worker] > - and more of these for the other UFS file systems > 9 0 0 0 DL psleep 0x1e2f830 [vmdaemon] > 8 0 0 0 DL (threaded) [pagedaemon] > 100067 D psleep 0x1e2e95c [dom0] > 100072 D launds 0x1e2e968 [laundry: dom0] > 100073 D umarcl 0x12cc720 [uma] > … a bunch of usb and cam ones > 100025 D - 0x1b2ee40 [doneq0] > … > 12 0 0 0 RL (threaded) [intr] > 100007 I [swi6: task queue] > 100008 Run CPU 0 [swi6: Giant taskq] > … > 100000 D swapin 0x1d96dfc [swapper] > - and a bunch more in D state. > Does this mean the swapper was trying to swap in? > > > acttrace > - just shows the keyboard > kdb_enter() at kdb_enter+0x35/frame > vt_kbdevent() at vt_kdbevent+0x329/frame > kdbmux_intr() at kbdmux_intr+0x19/frame > taskqueue_run_locked() at taskqueue_run_locked+0x175/frame > taskqueue_run() at taskqueue_run+0x44/frame > taskqueue_swi_giant_run(0) at taskqueue_swi_giant_run+0xe/frame > ithread_loop() at ithread_loop+0x237/frame > fork_exit() at fork_exit+0x6c/frame > fork_trampoline() at 0x../frame > > > show all vmem > vmem 0x.. 'transient arena' > quantum: 4096 > size: 23592960 > inuse: 0 > free: 23592960 > busy tags: 0 > free tags: 2 > inuse size free size > 16777216 0 0 1 23592960 > vmem 0x.. 'buffer arena' > quantum: 4096 > size: 94683136 > inuse: 94502912 > free: 180224 > busy tags: 1463 > free tags: 3 > inuse size free size > 16384 2 32768 1 16384 > 32768 39 1277952 1 32768 > 65536 1422 93192192 0 0 > 131072 0 0 1 131072 > vmem 0x.. 'i386trampoline' > quantum: 1 > size: 24576 > inuse: 20860 > free: 3716 > busy tags: 9 > free tags: 3 > inuse size free size > 32 1 48 1 52 > 64 2 208 0 0 > 128 2 280 0 0 > 2048 1 2048 1 3664 > 4096 2 8192 0 0 > 8192 1 10084 0 0 > vmem 0x.. 'kernel rwx arena' > quantum: 4096 > size: 0 > inuse: 0 > free: 0 > busy tags: 0 > free tags: 0 > vmem 0x.. 'kernel area dom' > quantum: 4096 > size: 56623104 > inuse: 56582144 > free: 40960 > busy tags: 11224 > free tags: 3 I think this is the trouble. Did you tried to reduce kern.maxvnodes ? What is the default value for the knob on your machine ? We scaled maxvnodes for ZFS and UFS, might be NFS is even more demanding, having larger node size. > inuse size free size > 4096 11091 45428736 0 0 > 8192 63 516096 0 0 > 16384 12 196608 0 0 > 32768 6 196608 0 0 > 40960 2 81920 1 40960 > 65536 16 1048576 0 0 > 94208 1 94208 0 0 > 110592 1 110592 0 0 > 131072 15 2441216 0 0 > 262144 15 3997696 0 0 > 524288 1 524288 0 0 > 1048576 1 1945600 0 0 > vmem 0x.. 'kernel arena' > quantum: 4096 > size: 390070272 > inuse: 386613248 > free: 3457024 > busy tags: 873 > free tags: 3 > inuse size free size > 4096 35 143360 1 4096 > 8192 2 16384 2 16384 > 12288 1 12288 0 0 > 16384 30 491520 0 0 > 20480 140 2867200 0 0 > 65536 1 65536 0 0 > 131072 631 82706432 0 0 > 1048576 0 0 1 1339392 > 2097152 27 56623104 1 2097152 > 8388608 1 13774848 0 0 > 16777216 3 74883072 0 0 > 33554432 1 36753408 0 0 > 67108864 1 118276096 0 0 > > > alltrace > - I can't face typing too much more, but I'll put a few > here that look interesting > > - for csh > sched_switch() > mi_switch() > kern_yield() > getblkx() > breadn_flags() > ffs_update() > ufs_inactive() > VOP_INACTIVE() > vinactivef() > vput_final() > vm_object_deallocate() > vm_map_process_deferred() > kern_munmap() > sys_munmap() > > - For cc > sched_switch() > mi_switch() > sleepq_switch() > sleepq_timedwait() > _sleep() > pause_sbt() > vmem_bt_alloc() > keg_alloc_slab() > zone_import() > cache_alloc() > cache_alloc_retry() > uma_zalloc_arg() > bt_fill() > vmem_xalloc() > vmem_alloc() > kmem_alloc() > kmem_malloc_domainset() > page_alloc() > keg_alloc_slab() > zone_import() > cache_alloc() > cache_alloc_retry() > uma_zalloc_arg() > nfscl_nget() > nfs_create() > vop_sigdefer() > nfs_vnodeops_bypass() > VOP_CREATE_APV() > vn_open_cred() > vn_open() > kern_openat() > sys_openat() > > Then there are a bunch of these... for cc, make > sched_switch() > mi_switch() > sleepq_switch() > sleepq_catch_signals() > sleepq_wait_sig() > kern_wait6() > sys_wait4() > > - for vnlru > sched_switch() > mi_switch() > sleepq_switch() > sleepq_timedwait() > _sleep() > vnlru_proc() > fork_exit() > fork_trampoline() > > - for syncer > sched_switch() > mi_switch() > critical_exit_preempt() > intr_event_handle() > intr_execute_handlers() > lapic_handle_intr() > Xapic_isr1() > - interrupt > memset() > cache_alloc() > cache_alloc_retry() > uma_zalloc_arg() > vmem_xalloc() > vmem_bt_alloc() > keg_alloc_slab() > zone_import() > cache_alloc() > cache_alloc_retry() > uma_zalloc_arg() > bt_fill() > vmem_xalloc() > vmem_alloc() > bufkva_alloc() > getnewbuf() > getblkx() > breadn_flags() > ffs_update() > ffs_sync() > sync_fsync() > VOP_FSYNC_APV() > sched_sync() > fork_exit() > fork_trampoline() > > - For bufdaemon (a bunch of these) > sched_switch() > mi_switch() > sleepq_switch() > sleepq_timedwait() > _sleep() > buf_daemon() > fork_exit() > fork_trampoline() > > vmdaemon and pagedaemon are basically just like above, > sleeping in > vm_daemon() > or > vm_pageout_worker() > or > vm_pageout_laundry_worker() > or > uma_reclaim_worker() > > That's all the typing I can take right now. > I can probably make this happen again if you want more specific stuff. > > rick > > > >