Date:      Sun, 1 Sep 2019 02:03:48 +0530
From:      shreyank amartya <shreyankfbsd@gmail.com>
To:        freebsd-drivers@freebsd.org
Cc:        freebsd-hackers@freebsd.org, avg@freebsd.org, mmacy@freebsd.org
Subject:   hwpmc NMI/cpuxx ... going to debugger
Message-ID:  <CAD9jf8Ddzyiu8cTup790UGaM4Y9d17Cu%2BTPwXnJFXFWisz7Eew@mail.gmail.com>

Hi,

I'm trying to profile a Java application using hwpmc. While the profiler is
running, I get these errors:

NMI ISA 30, EISA ff
NMI/cpu123 ... going to debugger
NMI ISA 30, EISA ff
NMI/cpu58 ... going to debugger
NMI ISA 20, EISA ff
NMI/cpu59 ... going to debugger

Because of this, pmclog contains very few samples, and sometimes none at
all. I enabled the hwpmc kernel traces, and it appears that a CPU that fails
to claim the PMC interrupt is hitting one of these conditions in
amd_intr() (hwpmc_amd.c):

                if ((pm = pac->pc_amdpmcs[i].phw_pmc) == NULL ||
                    !PMC_IS_SAMPLING_MODE(PMC_TO_MODE(pm))) {
                        isnull++;
                        continue;
                }

                if (!AMD_PMC_HAS_OVERFLOWED(i)) {
                        ovrflw++;
                        continue;
                }
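
To make the suspected failure mode concrete, here is a minimal standalone model of the two skip conditions quoted above. It is only a sketch, not the kernel code: struct pmc_slot, struct intr_result, model_intr() and NPMCS are hypothetical names invented for illustration, and the three booleans merely stand in for the phw_pmc, PMC_IS_SAMPLING_MODE and AMD_PMC_HAS_OVERFLOWED checks. The assumption is that the handler claims the NMI only when at least one counter passes both checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counter count; 16 matches the isnull=16 seen in one log line. */
#define NPMCS 16

/* Hypothetical stand-in for the per-counter state the real handler inspects. */
struct pmc_slot {
    bool allocated;   /* models phw_pmc != NULL */
    bool sampling;    /* models PMC_IS_SAMPLING_MODE(PMC_TO_MODE(pm)) */
    bool overflowed;  /* models AMD_PMC_HAS_OVERFLOWED(i) */
};

struct intr_result {
    int retval;  /* 1 if at least one counter claimed the interrupt */
    int isnull;  /* counters skipped by the first condition */
    int ovrflw;  /* counters skipped by the second condition */
};

/* Walk all slots and apply the two skip conditions in the quoted order. */
struct intr_result model_intr(const struct pmc_slot *slots, size_t n)
{
    struct intr_result r = {0, 0, 0};

    for (size_t i = 0; i < n; i++) {
        if (!slots[i].allocated || !slots[i].sampling) {
            r.isnull++;
            continue;
        }
        if (!slots[i].overflowed) {
            r.ovrflw++;
            continue;
        }
        r.retval = 1;  /* this counter would be processed as a sample */
    }
    return r;
}
```

With every slot unallocated, the model returns retval=0 and isnull equal to the number of slots, which is the pattern that precedes the "going to debugger" message in the logs.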

Here are some of the logs:

Aug 31 20:01:11 amd kernel: cpu245 MDP:INT:1: cpu=245 tf=0xfffffe00015cbf30 um=0
Aug 31 20:01:11 amd kernel: cpu245 MDP:INT:2: retval=1 isnull=0 ovrflw=0
Aug 31 20:01:11 amd kernel: cpu245 MDP:INT:1: cpu=245 tf=0xfffffe00015cbf30 um=1
Aug 31 20:01:11 amd kernel: cpu245 MDP:INT:2: retval=1 isnull=0 ovrflw=0
Aug 31 20:01:11 amd kernel: cpu245 MDP:INT:1: cpu=245 tf=0xfffffe00015cbf30 um=0
Aug 31 20:01:11 amd kernel: cpu245 MDP:INT:2: retval=0 isnull=16 ovrflw=0
Aug 31 20:01:11 amd kernel: NMI ISA 3c, EISA ff
Aug 31 20:01:11 amd kernel: NMI/cpu245 ... going to debugger
Aug 31 20:01:11 amd kernel: cpu236 MDP:INT:1: cpu=236 tf=0xfffffe0001595f30 um=0
Aug 31 20:01:11 amd kernel: cpu236 MDP:INT:2: retval=1 isnull=0 ovrflw=0
Aug 31 20:01:11 amd kernel: cpu252 MDP:INT:1: cpu=252 tf=0xfffffe00015f5f30 um=0
Aug 31 20:01:11 amd kernel: cpu252 MDP:INT:2: retval=1 isnull=0 ovrflw=0

Other observations:
If I enable all the md=* flags, I do not see this issue. When I run the
profiler on a C++ application, I do not encounter this problem either.
No panic is observed.
Any idea what could be causing this, or pointers on how to debug it
further?

Thanks
Shreyank


