Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 22 May 2020 23:46:26 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Konstantin Belousov <kostikbel@gmail.com>, Ryan Libby <rlibby@freebsd.org>
Cc:        "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>, Konstantin Belousov <kib@FreeBSD.org>
Subject:   Re: r358252 causes intermittent hangs where processes are stuck sleeping on btalloc
Message-ID:  <QB1PR01MB3649E55965E4C1A449A2D4D1DDB40@QB1PR01MB3649.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20200521101428.GC64045@kib.kiev.ua>
References:  <QB1PR01MB36496B9C1B6CEE512757B039DDB70@QB1PR01MB3649.CANPRD01.PROD.OUTLOOK.COM> <CAHgpiFzfzF5jLvTP4fMnGhkerMUqgKgVeHWQEMkwg19naTUTwA@mail.gmail.com>, <20200521101428.GC64045@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
Konstantin Belousov wrote:=0A=
>On Wed, May 20, 2020 at 11:58:50PM -0700, Ryan Libby wrote:=0A=
>> On Wed, May 20, 2020 at 6:04 PM Rick Macklem <rmacklem@uoguelph.ca> wrot=
e:=0A=
>> >=0A=
>> > Hi,=0A=
>> >=0A=
>> > Since I hadn't upgraded a kernel through the winter, it took me a whil=
e=0A=
>> > to bisect this, but r358252 seems to be the culprit.=0A=
>> >=0A=
>> > If I do a kernel build over NFS using my not so big Pentium 4 (single =
core,=0A=
>> > 1.25Gbytes RAM, i386), about every second attempt will hang.=0A=
>> > When I do a "ps" in the debugger, I see processes sleeping on btalloc.=
=0A=
>> > If I revert to r358251, I cannot reproduce this.=0A=
>> >=0A=
>> > Any ideas?=0A=
>> >=0A=
>> > I can easily test any change you might suggest to see if it fixes the=
=0A=
>> > problem.=0A=
>> >=0A=
>> > If you want more debug info, let me know, since I can easily=0A=
>> > reproduce it.=0A=
>> >=0A=
>> > Thanks, rick=0A=
>>=0A=
>> Nothing obvious to me.  I can maybe try a repro on a VM...=0A=
>>=0A=
>> ddb ps, acttrace, alltrace, show all vmem, show page would be welcome.=
=0A=
>>=0A=
>> "btalloc" is "We're either out of address space or lost a fill race."=0A=
>=0A=
>Yes, I would be not surprised to be out of something on 1G i386 machine.=
=0A=
>Please also add 'show alllocks'.=0A=
Ok, I used an up to date head kernel and it took longer to reproduce a hang=
.=0A=
This time, none of the processes are stuck on "btalloc".=0A=
I'll try and give you most of the above, but since I have to type it in by =
hand=0A=
from the screen, I might not get it all. (I'm no real typist;-)=0A=
> show alllocks=0A=
exclusive lockmgr ufs (ufs) r =3D 0 locked @ kern/vfs_subr.c: 3259=0A=
exclusive lockmgr nfs (nfs) r =3D 0 locked @ kern/vfs_lookup.c:737=0A=
exclusive sleep mutex kernel area domain (kernel arena domain) r =3D 0 lock=
ed @ kern/subr_vmem.c:1343=0A=
exclusive lockmgr bufwait (bufwait) r =3D 0 locked @ kern/vfs_bio.c:1663=0A=
exclusive lockmgr ufs (ufs) r =3D 0 locked @ kern/vfs_subr.c:2930=0A=
exclusive lockmgr syncer (syncer) r =3D 0 locked @ kern/vfs_subr.c:2474=0A=
Process 12 (intr) thread 0x.. (1000008)=0A=
exclusive sleep mutex Giant (Giant) r =3D 0 locked @ kern/kern_intr.c:1152=
=0A=
=0A=
> ps=0A=
- Not going to list them all, but here are the ones that seem interesting..=
.=0A=
18 0 0 0 DL vlruwt 0x11d939cc [vnlru]=0A=
16 0 0 0 DL (threaded)               [bufdaemon]=0A=
100069  D  qsleep                      [bufdaemon]=0A=
100074  D  -                               [bufspacedaemon-0]=0A=
100084  D  sdflush  0x11923284 [/ worker]=0A=
- and more of these for the other UFS file systems=0A=
9 0 0 0   DL psleep  0x1e2f830  [vmdaemon]=0A=
8 0 0 0   DL (threaded)               [pagedaemon]=0A=
100067  D   psleep 0x1e2e95c   [dom0]=0A=
100072  D   launds 0x1e2e968   [laundry: dom0]=0A=
100073  D   umarcl 0x12cc720   [uma]=0A=
=85 a bunch of usb and cam ones=0A=
100025  D   -           0x1b2ee40  [doneq0]=0A=
=85=0A=
12 0 0 0 RL  (threaded)               [intr]=0A=
100007  I                                     [swi6: task queue]=0A=
100008  Run           CPU 0           [swi6: Giant taskq]=0A=
=85=0A=
100000  D   swapin 0x1d96dfc    [swapper]=0A=
- and a bunch more in D state.=0A=
Does this mean the swapper was trying to swap in?=0A=
=0A=
> acttrace=0A=
- just shows the keyboard=0A=
kdb_enter() at kdb_enter+0x35/frame=0A=
vt_kbdevent() at vt_kdbevent+0x329/frame=0A=
kdbmux_intr() at kbdmux_intr+0x19/frame=0A=
taskqueue_run_locked() at taskqueue_run_locked+0x175/frame=0A=
taskqueue_run() at taskqueue_run+0x44/frame=0A=
taskqueue_swi_giant_run(0) at taskqueue_swi_giant_run+0xe/frame=0A=
ithread_loop() at ithread_loop+0x237/frame=0A=
fork_exit() at fork_exit+0x6c/frame=0A=
fork_trampoline() at 0x../frame=0A=
=0A=
> show all vmem=0A=
vmem 0x.. 'transient arena'=0A=
  quantum: 4096=0A=
  size:  23592960=0A=
  inuse: 0=0A=
  free: 23592960=0A=
  busy tags:   0=0A=
  free tags:    2=0A=
                     inuse        size       free        size=0A=
  16777216   0               0           1            23592960=0A=
vmem 0x.. 'buffer arena'=0A=
  quantum:  4096=0A=
  size:   94683136=0A=
  inuse: 94502912=0A=
  free: 180224=0A=
  busy tags:    1463=0A=
  free tags:      3=0A=
                       inuse      size         free        size=0A=
  16384           2             32768     1             16384=0A=
  32768           39           1277952 1              32768=0A=
  65536            1422       93192192 0           0=0A=
  131072          0             0            1              131072=0A=
vmem 0x.. 'i386trampoline'=0A=
  quantum:      1=0A=
  size:               24576=0A=
  inuse:             20860=0A=
  free:               3716=0A=
  busy tags:     9=0A=
  free tags:      3=0A=
                       inuse      size      free      size=0A=
  32                 1             48        1           52=0A=
  64                  2            208       0           0=0A=
  128                2            280       0            0=0A=
  2048              1            2048     1           3664=0A=
  4096              2            8192     0           0=0A=
  8192              1            10084   0           0=0A=
vmem 0x.. 'kernel rwx arena'=0A=
  quantum:    4096=0A=
  size: 0=0A=
  inuse: 0=0A=
  free:   0=0A=
  busy tags: 0=0A=
  free tags:  0=0A=
vmem 0x.. 'kernel area dom'=0A=
  quantum:  4096=0A=
  size: 56623104=0A=
  inuse: 56582144=0A=
  free: 40960=0A=
  busy tags: 11224=0A=
  free tags: 3=0A=
                inuse     size          free              size=0A=
  4096      11091    45428736 0                 0=0A=
  8192       63         516096     0                 0=0A=
  16384     12          196608    0                  0=0A=
  32768      6           196608    0                  0=0A=
  40960      2           81920      1                  40960=0A=
  65536      16         1048576   0                 0=0A=
  94208      1            94208      0                 0=0A=
  110592     1          110592     0                 0=0A=
  131072      15       2441216    0                0=0A=
  262144      15       3997696    0                 0=0A=
  524288      1         524288      0                 0=0A=
  1048576     1        1945600     0                0=0A=
vmem 0x.. 'kernel arena'=0A=
  quantum: 4096=0A=
  size: 390070272=0A=
  inuse: 386613248=0A=
  free: 3457024=0A=
  busy tags:  873=0A=
  free tags:   3=0A=
                   inuse       size         free        size=0A=
4096           35           143360    1            4096=0A=
8192           2              16384      2           16384=0A=
12288         1              12288      0           0=0A=
16384         30             491520   0           0=0A=
20480         140           2867200  0          0=0A=
65536          1              65536     0           0=0A=
131072        631          82706432 0         0=0A=
1048576      0              0               1         1339392=0A=
2097152      27            56623104   1        2097152=0A=
8388608      1              13774848    0       0=0A=
16777216    3              74883072   0        0=0A=
33554432    1              36753408   0        0=0A=
67108864    1              118276096  0       0=0A=
=0A=
> alltrace=0A=
- I can't face typing too much more, but I'll put a few=0A=
  here that look interesting=0A=
=0A=
- for csh=0A=
sched_switch()=0A=
mi_switch()=0A=
kern_yield()=0A=
getblkx()=0A=
breadn_flags()=0A=
ffs_update()=0A=
ufs_inactive()=0A=
VOP_INACTIVE()=0A=
vinactivef()=0A=
vput_final()=0A=
vm_object_deallocate()=0A=
vm_map_process_deferred()=0A=
kern_munmap()=0A=
sys_munmap()=0A=
=0A=
- For cc=0A=
sched_switch()=0A=
mi_switch()=0A=
sleepq_switch()=0A=
sleepq_timedwait()=0A=
_sleep()=0A=
pause_sbt()=0A=
vmem_bt_alloc()=0A=
keg_alloc_slab()=0A=
zone_import()=0A=
cache_alloc()=0A=
cache_alloc_retry()=0A=
uma_zalloc_arg()=0A=
bt_fill()=0A=
vmem_xalloc()=0A=
vmem_alloc()=0A=
kmem_alloc()=0A=
kmem_malloc_domainset()=0A=
page_alloc()=0A=
keg_alloc_slab()=0A=
zone_import()=0A=
cache_alloc()=0A=
cache_alloc_retry()=0A=
uma_zalloc_arg()=0A=
nfscl_nget()=0A=
nfs_create()=0A=
vop_sigdefer()=0A=
nfs_vnodeops_bypass()=0A=
VOP_CREATE_APV()=0A=
vn_open_cred()=0A=
vn_open()=0A=
kern_openat()=0A=
sys_openat()=0A=
=0A=
Then there are a bunch of these... for cc, make=0A=
sched_switch()=0A=
mi_switch()=0A=
sleepq_switch()=0A=
sleepq_catch_signals()=0A=
sleepq_wait_sig()=0A=
kern_wait6()=0A=
sys_wait4()=0A=
=0A=
- for vnlru=0A=
sched_switch()=0A=
mi_switch()=0A=
sleepq_switch()=0A=
sleepq_timedwait()=0A=
_sleep()=0A=
vnlru_proc()=0A=
fork_exit()=0A=
fork_trampoline()=0A=
=0A=
- for syncer=0A=
sched_switch()=0A=
mi_switch()=0A=
critical_exit_preempt()=0A=
intr_event_handle()=0A=
intr_execute_handlers()=0A=
lapic_handle_intr()=0A=
Xapic_isr1()=0A=
- interrupt=0A=
memset()=0A=
cache_alloc()=0A=
cache_alloc_retry()=0A=
uma_zalloc_arg()=0A=
vmem_xalloc()=0A=
vmem_bt_alloc()=0A=
keg_alloc_slab()=0A=
zone_import()=0A=
cache_alloc()=0A=
cache_alloc_retry()=0A=
uma_zalloc_arg()=0A=
bt_fill()=0A=
vmem_xalloc()=0A=
vmem_alloc()=0A=
bufkva_alloc()=0A=
getnewbuf()=0A=
getblkx()=0A=
breadn_flags()=0A=
ffs_update()=0A=
ffs_sync()=0A=
sync_fsync()=0A=
VOP_FSYNC_APV()=0A=
sched_sync()=0A=
fork_exit()=0A=
fork_trampoline()=0A=
=0A=
- For bufdaemon (a bunch of these)=0A=
sched_switch()=0A=
mi_switch()=0A=
sleepq_switch()=0A=
sleepq_timedwait()=0A=
_sleep()=0A=
buf_daemon()=0A=
fork_exit()=0A=
fork_trampoline()=0A=
=0A=
vmdaemon and pagedaemon are basically just like above,=0A=
sleeping in=0A=
vm_daemon()=0A=
or=0A=
vm_pageout_worker()=0A=
or=0A=
vm_pageout_laundry_worker()=0A=
or=0A=
uma_reclaim_worker()=0A=
=0A=
That's all the typing I can take right now.=0A=
I can probably make this happen again if you want more specific stuff.=0A=
=0A=
rick=0A=
=0A=
=0A=
=0A=
=0A=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?QB1PR01MB3649E55965E4C1A449A2D4D1DDB40>