Date:      Fri, 24 Nov 2017 01:38:03 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Edward Tomasz Napierala <trasz@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r326125 - head/usr.sbin/kgmon
Message-ID:  <20171124002239.T1335@besplex.bde.org>
In-Reply-To: <201711231241.vANCf58n091345@repo.freebsd.org>
References:  <201711231241.vANCf58n091345@repo.freebsd.org>

On Thu, 23 Nov 2017, Edward Tomasz Napierala wrote:

> Log:
>  Mark kgmon(8) obsolete, redirecting users to pmcstat(8).

It isn't obsolete.  At the least, pmcstat can't do full (non-statistical)
call graphs or high-resolution profiling.

gmon in the kernel is slow to use in the SMP case and dangerous to use
in all cases, but usually works for call graphs.  In the SMP case, it
uses a giant spinlock which is slow and gives deadlock when a
non-maskable trap like an NMI or debugger trap occurs while the lock
is held by the same CPU.  In the !SMP case, it uses interrupt disabling
to lock.  This races instead of deadlocking for non-maskable traps while
it is held.

Ordinary mutexes have the same problems, and only work if NMI and debugger
trap handlers don't use any mutexes that might be held by the interrupted
context.  Broken cases mostly involve broken locking in printf() and console
drivers.

High-resolution profiling was broken by gcc-4.2.1 and is more broken
for clang.  Only the parts written in asm sort of work, and no parts
ever worked right for SMP.

All these bugs except the slowness from giant locking are fixed for
gcc-4.2.1 in some of my versions, using better giant locking with a timeout
on it to avoid deadlocking.  When deadlock is detected, profiling is
skipped.  My fixes for printf() and console drivers are similar except
for trying harder to not skip (switch to alternative methods).
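
As a rough illustration of the "giant lock with a timeout, skip on
deadlock" idea above, here is a minimal userland sketch in C11.  It is not
the actual kernel code: the names (prof_lock_timed, prof_record,
PROF_SPIN_LIMIT) and the spin limit are hypothetical, and the early exit
when the lock is already held by the same CPU is an added assumption
rather than something described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PROF_NO_OWNER   (-1)
#define PROF_SPIN_LIMIT 100000L	/* hypothetical timeout, in spin iterations */

static atomic_int prof_owner = PROF_NO_OWNER;	/* CPU id holding the lock */
static atomic_ulong prof_skipped;		/* events dropped on timeout */

/*
 * Try to take the giant profiling lock, giving up after a bounded number
 * of spins instead of spinning forever.
 */
static bool
prof_lock_timed(int cpu)
{
	for (long spins = 0; spins < PROF_SPIN_LIMIT; spins++) {
		int expected = PROF_NO_OWNER;

		if (atomic_compare_exchange_weak_explicit(&prof_owner,
		    &expected, cpu, memory_order_acquire,
		    memory_order_relaxed))
			return (true);
		/*
		 * Assumption: if this CPU already holds the lock, we were
		 * entered from a trap that interrupted the holder, so
		 * waiting cannot help.  Report failure immediately.
		 */
		if (expected == cpu)
			return (false);
	}
	return (false);		/* timed out: treat as (near-)deadlock */
}

static void
prof_unlock(int cpu)
{
	assert(atomic_load(&prof_owner) == cpu);
	atomic_store_explicit(&prof_owner, PROF_NO_OWNER,
	    memory_order_release);
}

/*
 * Record one profiling event.  If the lock cannot be taken in time, the
 * event is skipped and counted rather than risking a deadlock.
 */
static bool
prof_record(int cpu, unsigned long *counter)
{
	if (!prof_lock_timed(cpu)) {
		atomic_fetch_add(&prof_skipped, 1);
		return (false);
	}
	(*counter)++;
	prof_unlock(cpu);
	return (true);
}
```

The cost of this approach is lost samples: an event that arrives while the
lock is unavailable is dropped, so the skip counter is worth keeping to
judge how complete the resulting profile is.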

pmc also can't do better than nothing for cases involving NMIs even
when they aren't near deadlock.  Ordinary profiling can do better, and
high resolution profiling can do better still.  E.g., for profiling
an NMI handler, pmc can't generate NMIs to even sample it statistically,
but ordinary profiling can see it whenever the NMI doesn't occur while
the profiling lock is held.  Ordinary profiling then gives an exact
call graph, but its statistical sampling of times is broken, since
generating hardclock interrupts in an NMI (and other contexts) is even
more impossible than generating an NMI for pmc.  High-resolution profiling
gives almost exact times (or perfmon counts) even in NMI handlers
except in the near-deadlock case.

Bruce