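On Monday, March 23, 2026, Mitchell Horne <mhorne@freebsd.org> wrote:

> The branch main has been updated by mhorne:
>
> URL: https://cgit.FreeBSD.org/src/commit/?id=df47355fae720fd8f63f36a50c8933f8342483d2
>
> commit df47355fae720fd8f63f36a50c8933f8342483d2
> Author:     Ali Mashtizadeh <mashti@uwaterloo.ca>
> AuthorDate: 2026-03-18 04:27:09 +0000
> Commit:     Mitchell Horne <mhorne@FreeBSD.org>
> CommitDate: 2026-03-23 20:21:28 +0000
>
>     libpmc: Add support for IBS qualifiers
>
>     Add support to libpmc for parsing the IBS qualifiers and computing the
>     ctl register value as a function of the qualifiers and the sample rate.
>     This includes all of the flags available up to AMD Zen 5.  Along side
>     these user facing changes I included the documentation for AMD IBS.
>
>     Reviewed by:    mhorne
>     Sponsored by:   Netflix
>     Pull Request:   https://github.com/freebsd/freebsd-src/pull/2081
> ---
>  lib/libpmc/Makefile       |   1 +
>  lib/libpmc/libpmc.c       |  71 ++++++++++++++++++----
>  lib/libpmc/pmc.3          |   7 +++
>  lib/libpmc/pmc.amd.3      |   1 +
>  lib/libpmc/pmc.core.3     |   1 +
>  lib/libpmc/pmc.core2.3    |   1 +
>  lib/libpmc/pmc.iaf.3      |   1 +
>  lib/libpmc/pmc.ibs.3      | 150 ++++++++++++++++++++++++++++++++++++++++++++++
>  lib/libpmc/pmc.soft.3     |   1 +
>  lib/libpmc/pmc.tsc.3      |   1 +
>  lib/libpmc/pmc.ucf.3      |   1 +
>  sys/dev/hwpmc/hwpmc_ibs.h |  19 +++++-
>  12 files changed, 244 insertions(+), 11 deletions(-)
>
> diff --git a/lib/libpmc/Makefile b/lib/libpmc/Makefile
> index 590f719ebff4..442efdc3d9c0 100644
> --- a/lib/libpmc/Makefile
> +++ b/lib/libpmc/Makefile
> @@ -74,6 +74,7 @@ MAN+= pmc.haswell.3
>  MAN+=  pmc.haswelluc.3
>  MAN+=  pmc.haswellxeon.3
>  MAN+=  pmc.iaf.3
> +MAN+=  pmc.ibs.3
>  MAN+=  pmc.ivybridge.3
>  MAN+=  pmc.ivybridgexeon.3
>  MAN+=  pmc.sandybridge.3
> diff --git a/lib/libpmc/libpmc.c b/lib/libpmc/libpmc.c
> index ceba40aa7b39..ebb642e8d16b 100644
> --- a/lib/libpmc/libpmc.c
> +++ b/lib/libpmc/libpmc.c
> @@ -696,7 +696,7 @@ ibs_allocate_pmc(enum pmc_event pe, char *ctrspec,
>      struct pmc_op_pmcallocate *pmc_config)
>  {
>         char *e, *p, *q;
> -       uint64_t ctl;
> +       uint64_t ctl, ldlat;
>
>         pmc_config->pm_caps |=
>             (PMC_CAP_SYSTEM | PMC_CAP_EDGE | PMC_CAP_PRECISE);
> @@ -714,23 +714,74 @@ ibs_allocate_pmc(enum pmc_event pe, char *ctrspec,
>                 return (-1);
>         }
>
> +       /* IBS only supports sampling mode */
> +       if (!PMC_IS_SAMPLING_MODE(pmc_config->pm_mode)) {
> +               return (-1);
> +       }
> +
>         /* parse parameters */
> -       while ((p = strsep(&ctrspec, ",")) != NULL) {
> -               if (KWPREFIXMATCH(p, "ctl=")) {
> -                       q = strchr(p, '=');
> -                       if (*++q == '\0') /* skip '=' */
> +       ctl = 0;
> +       if (pe == PMC_EV_IBS_FETCH) {
> +               while ((p = strsep(&ctrspec, ",")) != NULL) {
> +                       if (KWMATCH(p, "l3miss")) {
> +                               ctl |= IBS_FETCH_CTL_L3MISSONLY;
> +                       } else if (KWMATCH(p, "randomize")) {
> +                               ctl |= IBS_FETCH_CTL_RANDOMIZE;
> +                       } else {
>                                 return (-1);
> +                       }
> +               }
>
> -                       ctl = strtoull(q, &e, 0);
> -                       if (e == q || *e != '\0')
> +               if (pmc_config->pm_count < IBS_FETCH_MIN_RATE ||
> +                   pmc_config->pm_count > IBS_FETCH_MAX_RATE)
> +                       return (-1);
> +
> +               ctl |= IBS_FETCH_INTERVAL_TO_CTL(pmc_config->pm_count);
> +       } else {
> +               while ((p = strsep(&ctrspec, ",")) != NULL) {
> +                       if (KWMATCH(p, "l3miss")) {
> +                               ctl |= IBS_OP_CTL_L3MISSONLY;
> +                       } else if (KWPREFIXMATCH(p, "ldlat=")) {
> +                               q = strchr(p, '=');
> +                               if (*++q == '\0') /* skip '=' */
> +                                       return (-1);
> +
> +                               ldlat = strtoull(q, &e, 0);
> +                               if (e == q || *e != '\0')
> +                                       return (-1);
> +
> +                               /*
> +                                * IBS load latency filtering requires the
> +                                * latency to be a multiple of 128 and between
> +                                * 128 and 2048.  The latency is stored in the
> +                                * IbsOpLatThrsh field, which only contains
> +                                * four bits so the processor computes
> +                                * (IbsOpLatThrsh+1)*128 as the value.
> +                                *
> +                                * AMD PPR Vol 1 for AMD Family 1Ah Model 02h
> +                                * C1 (57238) 2026-03-06 Revision 0.49.
> +                                */
> +                               if (ldlat < 128 || ldlat > 2048)
> +                                       return (-1);
> +                               ctl |= IBS_OP_CTL_LDLAT_TO_CTL(ldlat);
> +                               ctl |= IBS_OP_CTL_L3MISSONLY | IBS_OP_CTL_LATFLTEN;
> +                       } else if (KWMATCH(p, "randomize")) {
> +                               ctl |= IBS_OP_CTL_COUNTERCONTROL;
> +                       } else {
>                                 return (-1);
> +                       }
> +               }
>
> -                       pmc_config->pm_md.pm_ibs.ibs_ctl |= ctl;
> -               } else {
> +               if (pmc_config->pm_count < IBS_OP_MIN_RATE ||
> +                   pmc_config->pm_count > IBS_OP_MAX_RATE)
>                         return (-1);
> -               }
> +
> +               ctl |= IBS_OP_INTERVAL_TO_CTL(pmc_config->pm_count);
>         }
>
> +
> +       pmc_config->pm_md.pm_ibs.ibs_ctl |= ctl;
> +
>         return (0);
>  }
>
> diff --git a/lib/libpmc/pmc.3 b/lib/libpmc/pmc.3
> index 9a5b599759ff..cb28e0b786b9 100644
> --- a/lib/libpmc/pmc.3
> +++ b/lib/libpmc/pmc.3
> @@ -224,6 +224,11 @@ performance measurement architecture version 2 and later.
>  Programmable hardware counters present in CPUs conforming to the
>  .Tn Intel
>  performance measurement architecture version 1 and later.
> +.It Li PMC_CLASS_IBS
> +.Tn AMD
> +Instruction Based Sampling (IBS) counters present in
> +.Tn AMD
> +Family 10h and above.
>  .It Li PMC_CLASS_K8
>  Programmable hardware counters present in
>  .Tn "AMD Athlon64"
> @@ -491,6 +496,7 @@ following manual pages:
>  .It Em "PMC Class"      Ta Em "Manual Page"
>  .It Li PMC_CLASS_IAF    Ta Xr pmc.iaf 3
>  .It Li PMC_CLASS_IAP    Ta Xr pmc.atom 3 , Xr pmc.core 3 , Xr pmc.core2 3
> +.It Li PMC_CLASS_IBS    Ta Xr pmc.ibs 3
>  .It Li PMC_CLASS_K8     Ta Xr pmc.amd 3
>  .It Li PMC_CLASS_TSC    Ta Xr pmc.tsc 3
>  .El
> @@ -542,6 +548,7 @@ Doing otherwise is unsupported.
>  .Xr pmc.haswelluc 3 ,
>  .Xr pmc.haswellxeon 3 ,
>  .Xr pmc.iaf 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.ivybridge 3 ,
>  .Xr pmc.ivybridgexeon 3 ,
>  .Xr pmc.sandybridge 3 ,
> diff --git a/lib/libpmc/pmc.amd.3 b/lib/libpmc/pmc.amd.3
> index 047b31aa78bb..75c6331b000f 100644
> --- a/lib/libpmc/pmc.amd.3
> +++ b/lib/libpmc/pmc.amd.3
> @@ -777,6 +777,7 @@ and the underlying hardware events used.
>  .Xr pmc.core 3 ,
>  .Xr pmc.core2 3 ,
>  .Xr pmc.iaf 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.soft 3 ,
>  .Xr pmc.tsc 3 ,
>  .Xr pmclog 3 ,
> diff --git a/lib/libpmc/pmc.core.3 b/lib/libpmc/pmc.core.3
> index b4fa9ab661a4..4c41e7c7ad3b 100644
> --- a/lib/libpmc/pmc.core.3
> +++ b/lib/libpmc/pmc.core.3
> @@ -786,6 +786,7 @@ may not count some transitions.
>  .Xr pmc.atom 3 ,
>  .Xr pmc.core2 3 ,
>  .Xr pmc.iaf 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.soft 3 ,
>  .Xr pmc.tsc 3 ,
>  .Xr pmclog 3 ,
> diff --git a/lib/libpmc/pmc.core2.3 b/lib/libpmc/pmc.core2.3
> index 86604b7ff16c..7e544fad43b6 100644
> --- a/lib/libpmc/pmc.core2.3
> +++ b/lib/libpmc/pmc.core2.3
> @@ -1101,6 +1101,7 @@ and the underlying hardware events used.
>  .Xr pmc.atom 3 ,
>  .Xr pmc.core 3 ,
>  .Xr pmc.iaf 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.soft 3 ,
>  .Xr pmc.tsc 3 ,
>  .Xr pmc_cpuinfo 3 ,
> diff --git a/lib/libpmc/pmc.iaf.3 b/lib/libpmc/pmc.iaf.3
> index eaf45db140f5..c3528e472103 100644
> --- a/lib/libpmc/pmc.iaf.3
> +++ b/lib/libpmc/pmc.iaf.3
> @@ -125,6 +125,7 @@ CPU, use the event specifier
>  .Xr pmc.atom 3 ,
>  .Xr pmc.core 3 ,
>  .Xr pmc.core2 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.soft 3 ,
>  .Xr pmc.tsc 3 ,
>  .Xr pmc_cpuinfo 3 ,
> diff --git a/lib/libpmc/pmc.ibs.3 b/lib/libpmc/pmc.ibs.3
> new file mode 100644
> index 000000000000..69b90b84556c
> --- /dev/null
> +++ b/lib/libpmc/pmc.ibs.3
> @@ -0,0 +1,150 @@
> +.\" Copyright (c) 2016 Ali Mashtizadeh.  All rights reserved.

Isn't this 2026?

> +.\"
> +.\" Redistribution and use in source and binary forms, with or without
> +.\" modification, are permitted provided that the following conditions
> +.\" are met:
> +.\" 1. Redistributions of source code must retain the above copyright
> +.\"    notice, this list of conditions and the following disclaimer.
> +.\" 2. Redistributions in binary form must reproduce the above copyright
> +.\"    notice, this list of conditions and the following disclaimer in the
> +.\"    documentation and/or other materials provided with the distribution.
> +.\"
> +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> +.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> +.\" SUCH DAMAGE.
> +.\"
> +.Dd March 15, 2026
> +.Dt PMC.IBS 3
> +.Os
> +.Sh NAME
> +.Nm pmc.ibs
> +.Nd Instruction Based Sampling for
> +.Tn AMD
> +CPUs
> +.Sh LIBRARY
> +.Lb libpmc
> +.Sh SYNOPSIS
> +.In pmc.h
> +.Sh DESCRIPTION
> +AMD Instruction Based Sampling (IBS) was introduced with the K10 family of
> +CPUs.
> +AMD IBS is an alternative approach that samples instructions or micro-ops and
> +provides a per-instruction or micro-op breakdown of the sources of stalls.
> +.Pp
> +Unlike traditional counters, IBS can only be used in the sampling mode and
> +provides extra data embedded in the callchain.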
> +IBS events set the PMC_F_MULTIPART flag to signify multiple payload types are
> +contained in the callchain.
> +The first 8 bytes of the callchain contain four tuples with a one byte type and
> +a one byte length field.
> +The regular PMC callchain can be found following the multipart payload.
> +.Pp
> +IBS only provides two events that analyze instruction fetches and instruction
> +execution.
> +The instruction fetch (ibs-fetch) event provides data on the processor
> +front-end including reporting instruction cache and TLB events.
> +The instruction execution (ibs-op) event provides data on the processor
> +execution including reporting mispredictions, data cache and TLB events.
> +You should use the AMD PMC counters documented in
> +.Xr pmc.amd 3
> +to analyze stalls relating instruction issue including reservation contention.
> +.Pp
> +A guide to analyzing IBS data is provided in Appendix G of the
> +.Rs
> +.%B "Software Optimization Guide for AMD Family 10h and 12h Processors"
> +.%N "Publication No. 40546"
> +.%D "February 2011"
> +.%Q "Advanced Micro Devices, Inc."
> +.Re
> +A more recent document should be used for decoding all of the flags and fields
> +in the IBS data.
> +For example, see the AMD Zen 5 documentation
> +.Rs
> +.%B "Processor Programming Reference (PPR) for AMD Family 1Ah Model 02h"
> +.%N "Publication No. 57238"
> +.%D "March 6, 2026"
> +.%Q "Advanced Micro Devices, Inc."
> +.Re
> +.Ss PMC Features
> +AMD IBS supports the following capabilities.
> +.Bl -column "PMC_CAP_INTERRUPT" "Support"
> +.It Em Capability Ta Em Support
> +.It PMC_CAP_CASCADE Ta \&No
> +.It PMC_CAP_EDGE Ta Yes
> +.It PMC_CAP_INTERRUPT Ta Yes
> +.It PMC_CAP_INVERT Ta \&No
> +.It PMC_CAP_READ Ta \&No
> +.It PMC_CAP_PRECISE Ta Yes
> +.It PMC_CAP_SYSTEM Ta Yes
> +.It PMC_CAP_TAGGING Ta \&No
> +.It PMC_CAP_THRESHOLD Ta \&No
> +.It PMC_CAP_USER Ta \&No
> +.It PMC_CAP_WRITE Ta \&No
> +.El
> +.Pp
> +By default AMD IBS enables the edge, interrupt, system and precise flags.
> +.Ss Event Qualifiers
> +Event specifiers for AMD IBS can have the following optional
> +qualifiers:
> +.Bl -tag -width "ldlat=value"
> +.It Li l3miss
> +Configure IBS to only sample if an l3miss occurred.
> +.It Li ldlat= Ns Ar value
> +Configure the counter to only sample events with load latencies above
> +.Ar ldlat .
> +IBS only supports filtering latencies that are a multiple of 128 and between
> +128 and 2048.
> +Load latency filtering can only be used with ibs-op events and imply the
> +l3miss qualifier.
> +.It Li randomize
> +Randomize the sampling rate.
> +.El
> +.Ss AMD IBS Events Specifiers
> +The IBS event class provides only two event specifiers:
> +.Bl -tag -width indent
> +.It Li ibs-fetch Xo
> +.Op ,l3miss
> +.Op ,randomize
> +.Xc
> +Collect performance samples during instruction fetch.
> +The
> +.Ar randomize
> +qualifier randomly sets the bottom four bits of the sample rate.
> +.It Li ibs-op Xo
> +.Op ,l3miss
> +.Op ,ldlat= Ns Ar ldlat
> +.Op ,randomize
> +.Xc
> +Collect performance samples during instruction execution.
> +The
> +.Ar randomize
> +qualifier, upon reaching the maximum count, restarts the count with a value
> +between 1 and 127.
> +.El
> +.Pp
> +You may collect both events at the same time.
> +N.B. AMD discouraged doing so with certain older processors, stating that
> +sampling both simultaneously perturbs the results.
> +Please see the processor programming reference for your specific processor.
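
The ldlat rule in the manual page text quoted above (a multiple of 128, between 128 and 2048) could be sketched as below. This is only an illustration under stated assumptions: the helper names are hypothetical and not part of libpmc. The (field + 1) * 128 relation and the shift of 59 are taken from the libpmc.c comment and the hwpmc_ibs.h macro in this patch. The multiple-of-128 check is an assumption about intent, since the range check in the patch only tests the bounds. Note also that IBS_OP_CTL_LDLAT_TO_CTL(_c), as quoted in this patch, expands ldlat rather than its parameter _c; the sketch uses its own argument.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not part of libpmc.  Only the 128..2048 rule,
 * the (field + 1) * 128 relation and the shift of 59 come from the patch.
 */
#define LDLAT_MIN   128
#define LDLAT_MAX   2048
#define LDLAT_SHIFT 59

/* Returns 0 and stores the ctl bits on success, -1 on invalid input. */
static inline int
ldlat_to_ctl(uint64_t ldlat, uint64_t *ctl)
{
    /* Assumption: values that are not a multiple of 128 are rejected. */
    if (ldlat < LDLAT_MIN || ldlat > LDLAT_MAX || (ldlat % 128) != 0)
        return (-1);
    *ctl = ((ldlat >> 7) - 1) << LDLAT_SHIFT;
    return (0);
}

/* Recover the latency threshold implied by the four-bit field. */
static inline uint64_t
ctl_to_ldlat(uint64_t ctl)
{
    uint64_t field = (ctl >> LDLAT_SHIFT) & 0xF;

    return ((field + 1) * 128);
}
```

With these helpers, ldlat=128 maps to a field value of 0 and ldlat=2048 to 15, which is the largest value the four-bit field can hold.
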
> +.Sh SEE ALSO
> +.Xr pmc 3 ,
> +.Xr pmc.amd 3 ,
> +.Xr pmc.soft 3 ,
> +.Xr pmc.tsc 3 ,
> +.Xr pmclog 3 ,
> +.Xr hwpmc 4
> +.Sh HISTORY
> +AMD IBS support was first introduced in
> +.Fx 16.0 .
> +.Sh AUTHORS
> +AMD IBS support and this manual page were written
> +.An Ali Mashtizadeh Aq Mt ali@mashtizadeh.com
> +and sponsored by Netflix, Inc.
> diff --git a/lib/libpmc/pmc.soft.3 b/lib/libpmc/pmc.soft.3
> index 08d5af63d02d..f58b3e8ffa26 100644
> --- a/lib/libpmc/pmc.soft.3
> +++ b/lib/libpmc/pmc.soft.3
> @@ -90,6 +90,7 @@ Write page fault.
>  .Xr pmc.corei7 3 ,
>  .Xr pmc.corei7uc 3 ,
>  .Xr pmc.iaf 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.tsc 3 ,
>  .Xr pmc.ucf 3 ,
>  .Xr pmc.westmereuc 3 ,
> diff --git a/lib/libpmc/pmc.tsc.3 b/lib/libpmc/pmc.tsc.3
> index 4834d897f90c..73e2377df0c7 100644
> --- a/lib/libpmc/pmc.tsc.3
> +++ b/lib/libpmc/pmc.tsc.3
> @@ -62,6 +62,7 @@ maps to the TSC.
>  .Xr pmc.core 3 ,
>  .Xr pmc.core2 3 ,
>  .Xr pmc.iaf 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.soft 3 ,
>  .Xr pmclog 3 ,
>  .Xr hwpmc 4
> diff --git a/lib/libpmc/pmc.ucf.3 b/lib/libpmc/pmc.ucf.3
> index a7cea6bb57f9..37ee0f87a951 100644
> --- a/lib/libpmc/pmc.ucf.3
> +++ b/lib/libpmc/pmc.ucf.3
> @@ -88,6 +88,7 @@ offset C0H under device number 0 and Function 0.
>  .Xr pmc.corei7 3 ,
>  .Xr pmc.corei7uc 3 ,
>  .Xr pmc.iaf 3 ,
> +.Xr pmc.ibs 3 ,
>  .Xr pmc.soft 3 ,
>  .Xr pmc.tsc 3 ,
>  .Xr pmc.westmere 3 ,
> diff --git a/sys/dev/hwpmc/hwpmc_ibs.h b/sys/dev/hwpmc/hwpmc_ibs.h
> index 4449b44c8368..01fc88648558 100644
> --- a/sys/dev/hwpmc/hwpmc_ibs.h
> +++ b/sys/dev/hwpmc/hwpmc_ibs.h
> @@ -67,6 +67,18 @@
>  #define IBS_CTL_LVTOFFSETVALID         (1ULL << 8)
>  #define IBS_CTL_LVTOFFSETMASK          0x0000000F
>
> +/*
> + * The minimum sampling rate was selected to match the default used by other
> + * counters that was also found to be experimentally stable by providing enough
> + * time between consecutive NMIs.  The maximum sample rate is determined by
> + * setting all available counter bits, i.e., all available bits except the
> + * bottom four that are zero extended.
> + */
> +#define IBS_FETCH_MIN_RATE             65536
> +#define IBS_FETCH_MAX_RATE             1048560
> +#define IBS_OP_MIN_RATE                65536
> +#define IBS_OP_MAX_RATE                134217712
> +
>  /* IBS Fetch Control */
>  #define IBS_FETCH_CTL                  0xC0011030 /* IBS Fetch Control */
>  #define IBS_FETCH_CTL_L3MISS           (1ULL << 61) /* L3 Cache Miss */
> @@ -82,7 +94,8 @@
>  #define IBS_FETCH_CTL_ENABLE           (1ULL << 48) /* Enable */
>  #define IBS_FETCH_CTL_MAXCNTMASK       0x0000FFFFULL
>
> -#define IBS_FETCH_CTL_TO_LAT(_c)       ((_c >> 32) & 0x0000FFFF)
> +#define IBS_FETCH_INTERVAL_TO_CTL(_c)  (((_c) >> 4) & 0x0000FFFF)
> +#define IBS_FETCH_CTL_TO_LAT(_c)       (((_c) >> 32) & 0x0000FFFF)
>
>  #define IBS_FETCH_LINADDR              0xC0011031 /* Fetch Linear Address */
>  #define IBS_FETCH_PHYSADDR             0xC0011032 /* Fetch Physical Address */
> @@ -95,12 +108,16 @@
>
>  /* IBS Execution Control */
>  #define IBS_OP_CTL                     0xC0011033 /* IBS Execution Control */
> +#define IBS_OP_CTL_LATFLTEN            (1ULL << 63) /* Load Latency Filtering */
>  #define IBS_OP_CTL_COUNTERCONTROL      (1ULL << 19) /* Counter Control */
>  #define IBS_OP_CTL_VALID               (1ULL << 18) /* Valid */
>  #define IBS_OP_CTL_ENABLE              (1ULL << 17) /* Enable */
>  #define IBS_OP_CTL_L3MISSONLY          (1ULL << 16) /* L3 Miss Filtering */
>  #define IBS_OP_CTL_MAXCNTMASK          0x0000FFFFULL
>
> +#define IBS_OP_CTL_LDLAT_TO_CTL(_c)    ((((ldlat) >> 7) - 1) << 59)
> +#define IBS_OP_INTERVAL_TO_CTL(_c)     ((((_c) >> 4) & 0x0000FFFFULL) | ((_c) & 0x07F00000))
> +
>  #define IBS_OP_RIP                     0xC0011034 /* IBS Op RIP */
>  #define IBS_OP_DATA                    0xC0011035 /* IBS Op Data */
>  #define IBS_OP_DATA_RIPINVALID         (1ULL << 38) /* RIP Invalid */
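
For reference, the sample-rate handling in the hwpmc_ibs.h hunk of this patch can be exercised with a small sketch. The rate limits and the two INTERVAL_TO_CTL macros are copied from the quoted diff. The wrapper functions are hypothetical and only mirror the range checks that ibs_allocate_pmc() performs in the libpmc.c hunk. This is a rough illustration, not the library's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Values and macros copied from the quoted hwpmc_ibs.h hunk. */
#define IBS_FETCH_MIN_RATE  65536
#define IBS_FETCH_MAX_RATE  1048560
#define IBS_OP_MIN_RATE     65536
#define IBS_OP_MAX_RATE     134217712

#define IBS_FETCH_INTERVAL_TO_CTL(_c)  (((_c) >> 4) & 0x0000FFFF)
#define IBS_OP_INTERVAL_TO_CTL(_c) \
    ((((_c) >> 4) & 0x0000FFFFULL) | ((_c) & 0x07F00000))

/* Hypothetical wrapper: 0 on success, -1 if the rate is out of range. */
static inline int
fetch_rate_to_ctl(uint64_t rate, uint64_t *ctl)
{
    if (rate < IBS_FETCH_MIN_RATE || rate > IBS_FETCH_MAX_RATE)
        return (-1);
    /* The bottom four bits of the rate are dropped by the shift. */
    *ctl = IBS_FETCH_INTERVAL_TO_CTL(rate);
    return (0);
}

/* Hypothetical wrapper: 0 on success, -1 if the rate is out of range. */
static inline int
op_rate_to_ctl(uint64_t rate, uint64_t *ctl)
{
    if (rate < IBS_OP_MIN_RATE || rate > IBS_OP_MAX_RATE)
        return (-1);
    /* Bits 19:4 go to the low 16 bits; bits 26:20 stay in place. */
    *ctl = IBS_OP_INTERVAL_TO_CTL(rate);
    return (0);
}
```

Both maxima are "all counter bits set" with the bottom four bits zero: 1048560 is 0xFFFF0 and 134217712 is 0x7FFFFF0, so they encode to 0xFFFF and 0x07F0FFFF respectively.
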
