Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 14 Mar 2019 14:24:14 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Johannes Lundberg <johalun@FreeBSD.org>
Cc:        John Baldwin <jhb@FreeBSD.org>, cem@freebsd.org, Hans Petter Selasky <hselasky@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r345103 - head/sys/compat/linuxkpi/common/include/linux
Message-ID:  <20190314122414.GF2492@kib.kiev.ua>
In-Reply-To: <21dcbc21-e3a2-64a2-97ba-a612163ebb9d@FreeBSD.org>
References:  <201903131915.x2DJFbRk002502@repo.freebsd.org> <CAG6CVpU0feif5Z99By7R93cVnNi-MjCsoGh_t9j3xhqhNR%2B%2BQA@mail.gmail.com> <fcfcbb36-a41d-6dda-0940-1e492d6dc676@FreeBSD.org> <0521e2af-03ae-4642-f88c-60d8b8ede968@FreeBSD.org> <20190314114529.GE2492@kib.kiev.ua> <21dcbc21-e3a2-64a2-97ba-a612163ebb9d@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Mar 14, 2019 at 12:04:30PM +0000, Johannes Lundberg wrote:
> 
> On 3/14/19 11:45 AM, Konstantin Belousov wrote:
> > On Thu, Mar 14, 2019 at 11:07:50AM +0000, Johannes Lundberg wrote:
> >> On 3/13/19 11:39 PM, John Baldwin wrote:
> >>> On 3/13/19 1:36 PM, Conrad Meyer wrote:
> >>>> Hi,
> >>>>
> >>>> A lot of the information about PCIe devices is read by PCI probe and
> >>>> cached on the (BSD) device.  You could access it out of
> >>>> device_get_ivars(bsddev)->cfg.pcie and avoid the MMIO latency.
> >>> Agreed when possible, though I'm not sure caching lnkcap is really
> >>> all that useful.
> >>>
> >>>> On Wed, Mar 13, 2019 at 12:15 PM Hans Petter Selasky
> >>>> <hselasky@freebsd.org> wrote:
> >>>>> +static inline enum pci_bus_speed
> >>>>> +pcie_get_speed_cap(struct pci_dev *dev)
> >>>>> +{
> >>>>> +       device_t root;
> >>>>> +       uint32_t lnkcap, lnkcap2;
> >>>>> +       int error, pos;
> >>>>> +
> >>>>> +       root = device_get_parent(dev->dev.bsddev);
> >>>>> +       if (root == NULL)
> >>>>> +               return (PCI_SPEED_UNKNOWN);
> >>>>> +       root = device_get_parent(root);
> >>>>> +       if (root == NULL)
> >>>>> +               return (PCI_SPEED_UNKNOWN);
> >>>>> +       root = device_get_parent(root);
> >>>>> +       if (root == NULL)
> >>>>> +               return (PCI_SPEED_UNKNOWN);
> >>>> What is this mechanism trying to accomplish?  It seems incredibly
> >>>> fragile.  Looking for pci0? pcib0?
> >>> It does seem broken, or maybe that it only works for DRM drivers and nothing else.
> >>> For non-DRM drivers, 'bsddev' is a PCI-e endpoint.  You would then go up two layers
> >>> (pciX and pcibX) to get to the parent bridge.  However, this is assuming a DRM layout
> >>> where it has to go drm0 -> vgapci0 -> pciX -> pcibX due to the vgapci pseudo-driver.
> >>> As a result, this function can't possibly work on anything other than a DRM driver.
> >>> Since it would try to use pci ivars on some PCI bus instance (or worse), it's likely
> >>> to cause random panics if used from a non-DRM driver.
> >>>
> >>> Furthermore, it's not at all clear to me why you can't just read lnkcap/lnkcap2 from
> >>> the endpoint device directly vs heading up to the parent bridge though.  Hmm, I guess
> >>> it's trying to infer the capabilities of the "slot", so that's why it needs the
> >>> grandparent.  You will have to do something less fragile to find the grandparent.
> >>> The simplest way may be to walk up to the first "pcib" device you see and then
> >>> stop.  You will still want to see if it's a "real" PCI device by seeing if it's
> >>> parent is a "pci" device (meaning that the "pcib" device will have valid PCI IVARs
> >>> and PCI config registers you can read).  pci_find_pcie_root_port has some similar
> >>> code for this, but that function would walk too far up for what you want here, so you
> >>> can't reuse it as-is.
> >>>
> >>>>> +       if ((error = pci_find_cap(root, PCIY_EXPRESS, &pos)) != 0)
> >>>>> +               return (PCI_SPEED_UNKNOWN);
> >>>> Cached as non-zero cfg.pcie.pcie_location value in ivars.
> >>>>
> >>>>> +       lnkcap2 = pci_read_config(root, pos + PCIER_LINK_CAP2, 4);
> >>>> This one we don't cache, unfortunately, but could.  Ditto LINK_CAP
> >>>> below.  Usually "pos + PCIEfoo" is spelled "pcie_read_config(...,
> >>>> PCIEfoo...)."
> >>> Yes, pcie_read_config is what you normally would use.  That uses the cached
> >>> pcie_location for you.
> >>>
> >>>      pcie_read_config(dev->dev.bsddev, PCIER_LINK_CAP2, 2);
> >>>
> >>> But also, you should be checking the PCIe version to see if this register
> >>> exists instead of reading it unconditionally.  Specifically, you should read
> >>> the version field (PCIEM_FLAGS_VERSION) from PCIER_FLAGS.  LINK_CAP2 only
> >>> exists if the version >= 2.
> >>>
> >>>>> +
> >>>>> +       if (lnkcap2) {  /* PCIe r3.0-compliant */
> >>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
> >>>>> +                       return (PCIE_SPEED_2_5GT);
> >>>> Seems like these definitions would be better suited as native
> >>>> PCIEM_LINK_CAP2_foo definitions in pcireg.h
> >>> I feel less strongly about those since we have to provide the constants
> >>> anyway for consumers to use.
> >>>
> >>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
> >>>>> +                       return (PCIE_SPEED_5_0GT);
> >>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
> >>>>> +                       return (PCIE_SPEED_8_0GT);
> >>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_16_0GB)
> >>>>> +                       return (PCIE_SPEED_16_0GT);
> >>>>> +       } else {        /* pre-r3.0 */
> >>>>> +               lnkcap = pci_read_config(root, pos + PCIER_LINK_CAP, 4);
> >>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_2_5GB)
> >>>>> +                       return (PCIE_SPEED_2_5GT);
> >>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_5_0GB)
> >>>>> +                       return (PCIE_SPEED_5_0GT);
> >>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_8_0GB)
> >>>>> +                       return (PCIE_SPEED_8_0GT);
> >>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_16_0GB)
> >>>>> +                       return (PCIE_SPEED_16_0GT);
> >>>>> +       }
> >>>>> +       return (PCI_SPEED_UNKNOWN);
> >>>>> +}
> >>>>> +
> >>>>> +static inline enum pcie_link_width
> >>>>> +pcie_get_width_cap(struct pci_dev *dev)
> >>>>> +{
> >>>>> +       uint32_t lnkcap;
> >>>>> +
> >>>>> +       pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> >>>> Better spelled as PCIER_LINK_CAP.
> >>>>
> >>>>> +       if (lnkcap)
> >>>>> +               return ((lnkcap & PCI_EXP_LNKCAP_MLW) >> 4);
> >>>> And PCIEM_LINK_CAP_MAX_WIDTH.
> >>> Given that this is using a linuxkpi API (pcie_capability_read_dword
> >>> instead of pcie_read_config()) then the rest of this function is
> >>> written in the Linux KPI, so I think using Linux' native constants
> >>> is actually more consistent for this function.
> >> Hi
> >>
> >> Thanks for the feedback.
> >>
> >> This code used to exist in the drm driver but was moved into kernel pci
> >> code on Linux. It's more or less as it existed in the drm driver with
> >> FreeBSD patches having roots probably going back to drm2 from base. If
> > No, I assure you that it did not appeared in drm2.
> 
> For a bit of history:
> 
> drm2:
> https://github.com/freebsd/freebsd/blob/release/12.0.0/sys/dev/drm2/drm_pci.c#L439
So it is vestiges from drm.

> 
> later drm-next:
> https://github.com/FreeBSDDesktop/kms-drm/blob/drm-v4.16/drivers/gpu/drm/drm_pci.c#L441
> 
> and now linuxkpi:
> https://github.com/freebsd/freebsd/blob/master/sys/compat/linuxkpi/common/include/linux/pci.h#L866
> 
> 
> Upstream patch: https://patchwork.kernel.org/patch/10487273/
> 
> 
> >
> >> this function won't be compatible with anything but drm drivers, we
> >> could just as well move it to linuxkpi in ports instead.
> > It only works for GPUs which are either pseudo-PCIe devices, i.e.
> > they represent itself as PCIe-attached devices for configuration purposes
> > but really attached to the CPU ring directly.  This is true for CPU-
> > integrated graphics.  Or for lucky circumstances where external
> > GPU is plugged into CPU PCIe slot without any PCIe bridges.
> >
> > If PCIe GPU is connected through a switch (which happen quite often
> > these days), or into the slot managed by south bridge (I believe this
> > is theoretically possible) then the code would not work.
> >
> > Perhaps you want dmar_get_requester() which does exactly what you
> > (seems) try to get for PCIe devices attached without non-PCIe bridges.
> >
> >> I'll try to implement the suggested improvements. Keep on eye out for
> >> updates here https://reviews.freebsd.org/D19565
> >>
> >> /Johannes
> >>
> >>
> >>
> >>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190314122414.GF2492>