Date: Tue, 21 Mar 2023 13:54:12 -0400
From: Ken Merry <ken@freebsd.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: hackers@freebsd.org
Subject: Re: Getting v_wire_count from a kernel core dump?
Message-ID: <F6115CA8-AA0A-4DA3-9009-DCDC569D5B97@freebsd.org>
In-Reply-To: <ZBnPHsuFy6uwbMkR@kib.kiev.ua>
References: <66742036-C8DF-4A13-9D4A-CDA71217E574@freebsd.org> <ZBnPHsuFy6uwbMkR@kib.kiev.ua>
> On Mar 21, 2023, at 11:37, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> On Tue, Mar 21, 2023 at 11:11:30AM -0400, Ken Merry wrote:
>> I have kernel core dumps from several machines out in the field (customer sites) that hit out-of-memory panics, and I'm trying to figure out, from the kernel core dumps, whether we're dealing with a potential page leak.
>>
>> For context, these machines are running stable/13 from April 2021, but they do have the fix for this bug:
>>
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256507
>>
>> Which is this commit in stable/13:
>>
>> https://cgit.freebsd.org/src/commit/?id=6094749a1a5dafb8daf98deab23fc968070bc695
>>
>> On a running system, I can get a rough idea of whether there is a page leak by looking at the VM system page counters:
>>
>> # sysctl vm.stats | grep count
>> vm.stats.vm.v_cache_count: 0
>> vm.stats.vm.v_user_wire_count: 0
>> vm.stats.vm.v_laundry_count: 991626
>> vm.stats.vm.v_inactive_count: 39733216
>> vm.stats.vm.v_active_count: 11821309
>> vm.stats.vm.v_wire_count: 11154113
>> vm.stats.vm.v_free_count: 1599981
>> vm.stats.vm.v_page_count: 65347213
>>
>> So the five non-zero counts (laundry, inactive, active, wire, and free) add up to 65300245 in this case, for a difference of 46968 from v_page_count.
>>
>> Am I off base here as far as the various counts adding up to the page count? (E.g., is the wire count just an additional attribute of a page, and not a separate state like active, inactive, or laundry?)
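[The bookkeeping above can be mechanized. This is a small editorial sketch that feeds the counter values quoted above through awk; on a live system, one could replace the here-document with the output of `sysctl vm.stats` directly:]

```shell
# Sum the five page-state counters and compare them to v_page_count.
# The numbers below are the ones from the sysctl output quoted above.
diff_pages=$(awk '
    $1 ~ /\.v_(laundry|inactive|active|wire|free)_count:$/ { sum += $2 }
    $1 ~ /\.v_page_count:$/                                { pages = $2 }
    END { print pages - sum }' <<'EOF'
vm.stats.vm.v_cache_count: 0
vm.stats.vm.v_user_wire_count: 0
vm.stats.vm.v_laundry_count: 991626
vm.stats.vm.v_inactive_count: 39733216
vm.stats.vm.v_active_count: 11821309
vm.stats.vm.v_wire_count: 11154113
vm.stats.vm.v_free_count: 1599981
vm.stats.vm.v_page_count: 65347213
EOF
)
echo "unaccounted pages: $diff_pages"   # prints: unaccounted pages: 46968
```

Note the patterns deliberately anchor on the leading dot so that v_user_wire_count and v_cache_count are excluded from the sum.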
>>
>> Looking at the kernel core dump for one of the systems, I see:
>>
>> (kgdb) print vm_cnt
>> $1 = {v_swtch = 0xfffffe022158f2f8, v_trap = 0xfffffe022158f2f0,
>>   v_syscall = 0xfffffe022158f2e8, v_intr = 0xfffffe022158f2e0,
>>   v_soft = 0xfffffe022158f2d8, v_vm_faults = 0xfffffe022158f2d0,
>>   v_io_faults = 0xfffffe022158f2c8, v_cow_faults = 0xfffffe022158f2c0,
>>   v_cow_optim = 0xfffffe022158f2b8, v_zfod = 0xfffffe022158f2b0,
>>   v_ozfod = 0xfffffe022158f2a8, v_swapin = 0xfffffe022158f2a0,
>>   v_swapout = 0xfffffe022158f298, v_swappgsin = 0xfffffe022158f290,
>>   v_swappgsout = 0xfffffe022158f288, v_vnodein = 0xfffffe022158f280,
>>   v_vnodeout = 0xfffffe022158f278, v_vnodepgsin = 0xfffffe022158f270,
>>   v_vnodepgsout = 0xfffffe022158f268, v_intrans = 0xfffffe022158f260,
>>   v_reactivated = 0xfffffe022158f258, v_pdwakeups = 0xfffffe022158f250,
>>   v_pdpages = 0xfffffe022158f248, v_pdshortfalls = 0xfffffe022158f240,
>>   v_dfree = 0xfffffe022158f238, v_pfree = 0xfffffe022158f230,
>>   v_tfree = 0xfffffe022158f228, v_forks = 0xfffffe022158f220,
>>   v_vforks = 0xfffffe022158f218, v_rforks = 0xfffffe022158f210,
>>   v_kthreads = 0xfffffe022158f208, v_forkpages = 0xfffffe022158f200,
>>   v_vforkpages = 0xfffffe022158f1f8, v_rforkpages = 0xfffffe022158f1f0,
>>   v_kthreadpages = 0xfffffe022158f1e8, v_wire_count = 0xfffffe022158f1e0,
>>   v_page_size = 4096, v_page_count = 65342843, v_free_reserved = 85343,
>>   v_free_target = 1392195, v_free_min = 412056, v_inactive_target = 2088292,
>>   v_pageout_free_min = 136, v_interrupt_free_min = 8, v_free_severe = 248698}
>> (kgdb) print vm_ndomains
>> $2 = 4
>> (kgdb) print vm_dom[0].vmd_pagequeues[0].pq_cnt
>> $3 = 6298704
>> (kgdb) print vm_dom[0].vmd_pagequeues[1].pq_cnt
>> $4 = 3423939
>> (kgdb) print vm_dom[0].vmd_pagequeues[2].pq_cnt
>> $5 = 629834
>> (kgdb) print vm_dom[0].vmd_pagequeues[3].pq_cnt
>> $6 = 0
>> (kgdb) print vm_dom[1].vmd_pagequeues[0].pq_cnt
>> $7 = 2301793
>> (kgdb) print vm_dom[1].vmd_pagequeues[1].pq_cnt
>> $8 = 7130193
>> (kgdb) print vm_dom[1].vmd_pagequeues[2].pq_cnt
>> $9 = 701495
>> (kgdb) print vm_dom[1].vmd_pagequeues[3].pq_cnt
>> $10 = 0
>> (kgdb) print vm_dom[2].vmd_pagequeues[0].pq_cnt
>> $11 = 464429
>> (kgdb) print vm_dom[2].vmd_pagequeues[1].pq_cnt
>> $12 = 9123532
>> (kgdb) print vm_dom[2].vmd_pagequeues[2].pq_cnt
>> $13 = 1037423
>> (kgdb) print vm_dom[2].vmd_pagequeues[3].pq_cnt
>> $14 = 0
>> (kgdb) print vm_dom[3].vmd_pagequeues[0].pq_cnt
>> $15 = 5444946
>> (kgdb) print vm_dom[3].vmd_pagequeues[1].pq_cnt
>> $16 = 4466782
>> (kgdb) print vm_dom[3].vmd_pagequeues[2].pq_cnt
>> $17 = 785195
>> (kgdb) print vm_dom[3].vmd_pagequeues[3].pq_cnt
>> $18 = 0
>> (kgdb)
>>
>> Adding up the page queue counts (RPN calculator session, per-domain subtotals and a running total):
>>
>> 6298704
>> 3423939
>> 629834
>> ++p
>> 10352477
>> 2301793
>> 7130193
>> 701495
>> ++p
>> 10133481
>> +p
>> 20485958
>> 464429
>> 9123532
>> 1037423
>> ++p
>> 10625384
>> +p
>> 31111342
>> 5444946
>> 4466782
>> 785195
>> ++p
>> 10696923
>> +p
>> 41808265
>>
>> So, about 23M pages short of v_page_count.
>>
>> v_wire_count is a per-CPU counter, and on a running system it gets added up. But trying to access it in the kernel core dump yields:
>>
>> (kgdb) print vm_cnt.v_wire_count
>> $2 = (counter_u64_t) 0xfffffe022158f1e0
>> (kgdb) print *$2
>> Cannot access memory at address 0xfffffe022158f1e0
>>
>> Does anyone have any ideas on how I can figure out whether there is a page leak from the core dump?
>>
>
> Did you look at UMA/malloc stats? It could be the VM genuinely leaking pages, but more often it is some kernel subsystem leaking its own allocations. For the latter, try both vmstat -z and vmstat -m on the kernel.debug + vmcore. Often the leakage is immediately obvious.
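[Editorial aside: counter(9) counters live in per-CPU pages, so if the dump actually contains those pages, an untested kgdb sketch for summing one by hand is a loop like the following. It assumes the counter(9) layout in which each CPU's slot sits one 4096-byte page apart from the base pointer; the "Cannot access memory" error above may instead mean a minidump simply did not capture those pages, in which case no summation will help.]

```
(kgdb) set $sum = 0
(kgdb) set $cpu = 0
(kgdb) while ($cpu < mp_ncpus)
 >set $sum = $sum + *(unsigned long *)((char *)vm_cnt.v_wire_count + $cpu * 4096)
 >set $cpu = $cpu + 1
 >end
(kgdb) print $sum
```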
So, vmstat -m doesn't work on a kernel core dump, at least on this vintage of stable/13:

# vmstat -m -N /usr/lib/debug/boot/kernel/kernel.debug -M 2023-03-15_21-23-27.vmcore.0
vmstat: memstat_kvm_malloc:
         Type InUse MemUse Requests  Size(s)

As for vmstat -z, we've got a script that adds it all up, and it comes to ~80GB on a system with 256GB of RAM.

The largest processes in the system add up to approximately 24GB of RAM used, although that is very difficult to measure precisely because we're using Postgres, and the processes that Postgres uses share varying amounts of RAM.

But it doesn't seem like we're approaching 256GB. If I can rule out a page leak, then the problem that motivated this change could be the issue:

https://cgit.freebsd.org/src/commit/sys/vm?h=stable/13&id=555baef969a17a7cbcd6af3ee5bcf854ecd4de7c

The ARC in our case is at about 65GB.

Thanks,

Ken

-- 
Ken Merry
ken@FreeBSD.ORG
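[Editorial sketch of the vmstat -z accounting mentioned above: such a script boils down to multiplying each zone's item size by its used (and optionally free) item counts. This assumes the stock `name: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP` output format of vmstat -z; the two input zones below are invented examples, not real data:]

```shell
# Approximate bytes held by each UMA zone: SIZE * (USED + FREE),
# summed over all zones.  On a real system, pipe in:
#   vmstat -z | tail -n +3
total_bytes=$(awk -F',' '
    NF >= 4 {
        split($1, a, ":"); size = a[2] + 0   # $1 is "zone name:   SIZE"
        used = $3 + 0; free = $4 + 0         # $2 is LIMIT, unused here
        total += size * (used + free)
    }
    END { print total }' <<'EOF'
mbuf_example:            256,       0,       4,       2,      100,   0,   0
vnode_example:           128,       0,      10,       0,      500,   0,   0
EOF
)
echo "approx zone bytes: $total_bytes"   # prints: approx zone bytes: 2816
```

Whether FREE items should count toward the leak estimate is a judgment call; they are cached by UMA but not returned to the page allocator, so including them matches what the VM sees.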