Date:      Tue, 21 Mar 2023 13:54:12 -0400
From:      Ken Merry <ken@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        hackers@freebsd.org
Subject:   Re: Getting v_wire_count from a kernel core dump?
Message-ID:  <F6115CA8-AA0A-4DA3-9009-DCDC569D5B97@freebsd.org>
In-Reply-To: <ZBnPHsuFy6uwbMkR@kib.kiev.ua>
References:  <66742036-C8DF-4A13-9D4A-CDA71217E574@freebsd.org> <ZBnPHsuFy6uwbMkR@kib.kiev.ua>

> On Mar 21, 2023, at 11:37, Konstantin Belousov <kostikbel@gmail.com> wrote:
> 
> On Tue, Mar 21, 2023 at 11:11:30AM -0400, Ken Merry wrote:
>> I have kernel core dumps from several machines out in the field (customer sites) that hit out-of-memory panics, and I'm trying to figure out, from the kernel core dumps, whether we're dealing with a potential page leak.
>>
>> For context, these machines are running stable/13 from April 2021, but they do have the fix for this bug:
>>
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256507
>>
>> Which is this commit in stable/13:
>>
>> https://cgit.freebsd.org/src/commit/?id=6094749a1a5dafb8daf98deab23fc968070bc695
>>
>> On a running system, I can get a rough idea of whether there is a page leak by looking at the VM system page counters:
>>
>> # sysctl vm.stats |grep count
>> vm.stats.vm.v_cache_count: 0
>> vm.stats.vm.v_user_wire_count: 0
>> vm.stats.vm.v_laundry_count: 991626
>> vm.stats.vm.v_inactive_count: 39733216
>> vm.stats.vm.v_active_count: 11821309
>> vm.stats.vm.v_wire_count: 11154113
>> vm.stats.vm.v_free_count: 1599981
>> vm.stats.vm.v_page_count: 65347213
>>
>> So the five non-zero counters there (laundry, inactive, active, wire and free) add up to 65300245 in this case, for a difference of 46968 from v_page_count.
>>
>> Am I off base here as far as the various counts adding up to the page count?  (e.g. is the wire count just an additional attribute of a page and not a separate state like active, inactive or laundry?)
>>
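(Side note: that arithmetic is easy to script on a live box; a rough sketch, using the stable/13 counter names above:

  # sysctl -n vm.stats.vm.v_laundry_count vm.stats.vm.v_inactive_count \
        vm.stats.vm.v_active_count vm.stats.vm.v_wire_count \
        vm.stats.vm.v_free_count vm.stats.vm.v_page_count | \
    awk 'NR <= 5 { sum += $1 } NR == 6 { pages = $1 }
         END { print "in states:", sum, "v_page_count:", pages, "delta:", pages - sum }'

The first five values go into the sum, the last is v_page_count, so on the numbers above the delta comes out to the 46968 mentioned.)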
>> Looking at the kernel core dump for one of the systems I see:
>>
>> (kgdb) print vm_cnt
>> $1 = {v_swtch = 0xfffffe022158f2f8, v_trap = 0xfffffe022158f2f0,
>>  v_syscall = 0xfffffe022158f2e8, v_intr = 0xfffffe022158f2e0,
>>  v_soft = 0xfffffe022158f2d8, v_vm_faults = 0xfffffe022158f2d0,
>>  v_io_faults = 0xfffffe022158f2c8, v_cow_faults = 0xfffffe022158f2c0,
>>  v_cow_optim = 0xfffffe022158f2b8, v_zfod = 0xfffffe022158f2b0,
>>  v_ozfod = 0xfffffe022158f2a8, v_swapin = 0xfffffe022158f2a0,
>>  v_swapout = 0xfffffe022158f298, v_swappgsin = 0xfffffe022158f290,
>>  v_swappgsout = 0xfffffe022158f288, v_vnodein = 0xfffffe022158f280,
>>  v_vnodeout = 0xfffffe022158f278, v_vnodepgsin = 0xfffffe022158f270,
>>  v_vnodepgsout = 0xfffffe022158f268, v_intrans = 0xfffffe022158f260,
>>  v_reactivated = 0xfffffe022158f258, v_pdwakeups = 0xfffffe022158f250,
>>  v_pdpages = 0xfffffe022158f248, v_pdshortfalls = 0xfffffe022158f240,
>>  v_dfree = 0xfffffe022158f238, v_pfree = 0xfffffe022158f230,
>>  v_tfree = 0xfffffe022158f228, v_forks = 0xfffffe022158f220,
>>  v_vforks = 0xfffffe022158f218, v_rforks = 0xfffffe022158f210,
>>  v_kthreads = 0xfffffe022158f208, v_forkpages = 0xfffffe022158f200,
>>  v_vforkpages = 0xfffffe022158f1f8, v_rforkpages = 0xfffffe022158f1f0,
>>  v_kthreadpages = 0xfffffe022158f1e8, v_wire_count = 0xfffffe022158f1e0,
>>  v_page_size = 4096, v_page_count = 65342843, v_free_reserved = 85343,
>>  v_free_target = 1392195, v_free_min = 412056, v_inactive_target = 2088292,
>>  v_pageout_free_min = 136, v_interrupt_free_min = 8, v_free_severe = 248698}
>> (kgdb) print vm_ndomains
>> $2 = 4
>> (kgdb) print vm_dom[0].vmd_pagequeues[0].pq_cnt
>> $3 = 6298704
>> (kgdb) print vm_dom[0].vmd_pagequeues[1].pq_cnt
>> $4 = 3423939
>> (kgdb) print vm_dom[0].vmd_pagequeues[2].pq_cnt
>> $5 = 629834
>> (kgdb) print vm_dom[0].vmd_pagequeues[3].pq_cnt
>> $6 = 0
>> (kgdb) print vm_dom[1].vmd_pagequeues[0].pq_cnt
>> $7 = 2301793
>> (kgdb) print vm_dom[1].vmd_pagequeues[1].pq_cnt
>> $8 = 7130193
>> (kgdb) print vm_dom[1].vmd_pagequeues[2].pq_cnt
>> $9 = 701495
>> (kgdb) print vm_dom[1].vmd_pagequeues[3].pq_cnt
>> $10 = 0
>> (kgdb) print vm_dom[2].vmd_pagequeues[0].pq_cnt
>> $11 = 464429
>> (kgdb) print vm_dom[2].vmd_pagequeues[1].pq_cnt
>> $12 = 9123532
>> (kgdb) print vm_dom[2].vmd_pagequeues[2].pq_cnt
>> $13 = 1037423
>> (kgdb) print vm_dom[2].vmd_pagequeues[3].pq_cnt
>> $14 = 0
>> (kgdb) print vm_dom[3].vmd_pagequeues[0].pq_cnt
>> $15 = 5444946
>> (kgdb) print vm_dom[3].vmd_pagequeues[1].pq_cnt
>> $16 = 4466782
>> (kgdb) print vm_dom[3].vmd_pagequeues[2].pq_cnt
>> $17 = 785195
>> (kgdb) print vm_dom[3].vmd_pagequeues[3].pq_cnt
>> $18 = 0
>> (kgdb)
>>
>> Adding up the page queue counts, per domain and as a running total:
>>
>> domain 0: 6298704 + 3423939 +  629834 + 0 = 10352477
>> domain 1: 2301793 + 7130193 +  701495 + 0 = 10133481  (running total 20485958)
>> domain 2:  464429 + 9123532 + 1037423 + 0 = 10625384  (running total 31111342)
>> domain 3: 5444946 + 4466782 +  785195 + 0 = 10696923  (running total 41808265)
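(If I'm reading vm_pagequeue.h right, the four queues per domain are inactive, active, laundry and unswappable, and wired and free pages don't sit on any of them, so the queue totals alone shouldn't be expected to reach v_page_count. The per-domain free counts are plain integers and can be read straight out of the dump, assuming this branch still has vmd_free_count:

  (kgdb) print vm_dom[0].vmd_free_count
  (kgdb) print vm_dom[1].vmd_free_count
  (kgdb) print vm_dom[2].vmd_free_count
  (kgdb) print vm_dom[3].vmd_free_count

That still leaves the wired pages unaccounted for, which is the counter I can't get at below.)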
>>
>> So the queue totals are about 23M pages short of v_page_count.
>>
>> v_wire_count is a per-CPU counter, and on a running system the sysctl adds up the per-CPU values.  But trying to access it in the kernel core dump yields:
>>
>> (kgdb) print vm_cnt.v_wire_count
>> $2 = (counter_u64_t) 0xfffffe022158f1e0
>> (kgdb) print *$2
>> Cannot access memory at address 0xfffffe022158f1e0
>>
>> Does anyone have any ideas about how I can figure out, from the core dump, whether there is a page leak?
>>
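(For context on why the dereference fails: a counter(9) counter is per-CPU storage; the pointer is the base of a set of per-CPU replicas spaced UMA_PCPU_ALLOC_SIZE apart (one 4096-byte page on amd64), and counter_u64_fetch() sums them at read time. If those per-CPU pages were in the dump, a rough kgdb loop along these lines could total the counter by hand -- just a sketch, assuming the amd64 layout -- but the "Cannot access memory" above suggests they didn't make it into this minidump:

  (kgdb) set $base = (char *)vm_cnt.v_wire_count
  (kgdb) set $sum = 0
  (kgdb) set $i = 0
  (kgdb) while ($i <= mp_maxid)
   >set $sum = $sum + *(unsigned long *)($base + $i * 4096)
   >set $i = $i + 1
   >end
  (kgdb) print $sum
)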
>
> Did you look at UMA/malloc stats?  It could be the VM genuinely leaking pages,
> but more often it is some kernel subsystem leaking its own allocations.
> For the latter, try both vmstat -z and vmstat -m on the kernel.debug + vmcore.
> Often the leakage is immediately obvious.

So, vmstat -m doesn't work on a kernel core dump, at least on this vintage of stable/13:

# vmstat -m -N /usr/lib/debug/boot/kernel/kernel.debug -M 2023-03-15_21-23-27.vmcore.0
vmstat: memstat_kvm_malloc:
         Type InUse MemUse Requests  Size(s)

As for vmstat -z, we've got a script that adds up all the zone memory, and the total is ~80GB on a system with 256GB of RAM.
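(A rough version of that kind of total, assuming the usual vmstat -z column layout where per-zone memory is roughly SIZE * (USED + FREE):

  # vmstat -z -N /usr/lib/debug/boot/kernel/kernel.debug -M 2023-03-15_21-23-27.vmcore.0 | \
    awk -F'[:,][ \t]*' 'NF >= 5 { total += $2 * ($4 + $5) }
        END { printf "%.1f GB\n", total / (1024 * 1024 * 1024) }'

That ignores per-keg overhead, so it will understate the footprint a bit.)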

The largest processes in the system add up to approximately 24GB of RAM used, although that is very difficult to measure precisely because we're using Postgres, and the processes that Postgres uses share varying amounts of RAM.
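(One quick way to bound it is to sum resident sizes, e.g.:

  # ps -axo rss= | awk '{ sum += $1 } END { printf "%.1f GB\n", sum / (1024 * 1024) }'

but with Postgres that counts the shared buffers once per backend, so it overstates the real usage.)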

But it doesn't seem like we're approaching 256GB.  If I can rule out a page leak, then the reason behind this change could be the issue:

https://cgit.freebsd.org/src/commit/sys/vm?h=stable/13&id=555baef969a17a7cbcd6af3ee5bcf854ecd4de7c

The ARC in our case is at about 65GB.

Thanks,

Ken
—
Ken Merry
ken@FreeBSD.ORG




