Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 14 Mar 2019 12:04:30 +0000
From:      Johannes Lundberg <johalun@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        John Baldwin <jhb@FreeBSD.org>, cem@freebsd.org, Hans Petter Selasky <hselasky@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r345103 - head/sys/compat/linuxkpi/common/include/linux
Message-ID:  <21dcbc21-e3a2-64a2-97ba-a612163ebb9d@FreeBSD.org>
In-Reply-To: <20190314114529.GE2492@kib.kiev.ua>
References:  <201903131915.x2DJFbRk002502@repo.freebsd.org> <CAG6CVpU0feif5Z99By7R93cVnNi-MjCsoGh_t9j3xhqhNR%2B%2BQA@mail.gmail.com> <fcfcbb36-a41d-6dda-0940-1e492d6dc676@FreeBSD.org> <0521e2af-03ae-4642-f88c-60d8b8ede968@FreeBSD.org> <20190314114529.GE2492@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help

On 3/14/19 11:45 AM, Konstantin Belousov wrote:
> On Thu, Mar 14, 2019 at 11:07:50AM +0000, Johannes Lundberg wrote:
>> On 3/13/19 11:39 PM, John Baldwin wrote:
>>> On 3/13/19 1:36 PM, Conrad Meyer wrote:
>>>> Hi,
>>>>
>>>> A lot of the information about PCIe devices is read by PCI probe and
>>>> cached on the (BSD) device.  You could access it out of
>>>> device_get_ivars(bsddev)->cfg.pcie and avoid the MMIO latency.
>>> Agreed when possible, though I'm not sure caching lnkcap is really
>>> all that useful.
>>>
>>>> On Wed, Mar 13, 2019 at 12:15 PM Hans Petter Selasky
>>>> <hselasky@freebsd.org> wrote:
>>>>> +static inline enum pci_bus_speed
>>>>> +pcie_get_speed_cap(struct pci_dev *dev)
>>>>> +{
>>>>> +       device_t root;
>>>>> +       uint32_t lnkcap, lnkcap2;
>>>>> +       int error, pos;
>>>>> +
>>>>> +       root = device_get_parent(dev->dev.bsddev);
>>>>> +       if (root == NULL)
>>>>> +               return (PCI_SPEED_UNKNOWN);
>>>>> +       root = device_get_parent(root);
>>>>> +       if (root == NULL)
>>>>> +               return (PCI_SPEED_UNKNOWN);
>>>>> +       root = device_get_parent(root);
>>>>> +       if (root == NULL)
>>>>> +               return (PCI_SPEED_UNKNOWN);
>>>> What is this mechanism trying to accomplish?  It seems incredibly
>>>> fragile.  Looking for pci0? pcib0?
>>> It does seem broken, or maybe that it only works for DRM drivers and nothing else.
>>> For non-DRM drivers, 'bsddev' is a PCI-e endpoint.  You would then go up two layers
>>> (pciX and pcibX) to get to the parent bridge.  However, this is assuming a DRM layout
>>> where it has to go drm0 -> vgapci0 -> pciX -> pcibX due to the vgapci pseudo-driver.
>>> As a result, this function can't possibly work on anything other than a DRM driver.
>>> Since it would try to use pci ivars on some PCI bus instance (or worse), it's likely
>>> to cause random panics if used from a non-DRM driver.
>>>
>>> Furthermore, it's not at all clear to me why you can't just read lnkcap/lnkcap2 from
>>> the endpoint device directly vs heading up to the parent bridge though.  Hmm, I guess
>>> it's trying to infer the capabilities of the "slot", so that's why it needs the
>>> grandparent.  You will have to do something less fragile to find the grandparent.
>>> The simplest way may be to walk up to the first "pcib" device you see and then
>>> stop.  You will still want to see if it's a "real" PCI device by seeing if it's
>>> parent is a "pci" device (meaning that the "pcib" device will have valid PCI IVARs
>>> and PCI config registers you can read).  pci_find_pcie_root_port has some similar
>>> code for this, but that function would walk too far up for what you want here, so you
>>> can't reuse it as-is.
>>>
>>>>> +       if ((error = pci_find_cap(root, PCIY_EXPRESS, &pos)) != 0)
>>>>> +               return (PCI_SPEED_UNKNOWN);
>>>> Cached as non-zero cfg.pcie.pcie_location value in ivars.
>>>>
>>>>> +       lnkcap2 = pci_read_config(root, pos + PCIER_LINK_CAP2, 4);
>>>> This one we don't cache, unfortunately, but could.  Ditto LINK_CAP
>>>> below.  Usually "pos + PCIEfoo" is spelled "pcie_read_config(...,
>>>> PCIEfoo...)."
>>> Yes, pcie_read_config is what you normally would use.  That uses the cached
>>> pcie_location for you.
>>>
>>>      pcie_read_config(dev->dev.bsddev, PCIER_LINK_CAP2, 2);
>>>
>>> But also, you should be checking the PCIe version to see if this register
>>> exists instead of reading it unconditionally.  Specifically, you should read
>>> the version field (PCIEM_FLAGS_VERSION) from PCIER_FLAGS.  LINK_CAP2 only
>>> exists if the version >= 2.
>>>
>>>>> +
>>>>> +       if (lnkcap2) {  /* PCIe r3.0-compliant */
>>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
>>>>> +                       return (PCIE_SPEED_2_5GT);
>>>> Seems like these definitions would be better suited as native
>>>> PCIEM_LINK_CAP2_foo definitions in pcireg.h
>>> I feel less strongly about those since we have to provide the constants
>>> anyway for consumers to use.
>>>
>>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
>>>>> +                       return (PCIE_SPEED_5_0GT);
>>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
>>>>> +                       return (PCIE_SPEED_8_0GT);
>>>>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_16_0GB)
>>>>> +                       return (PCIE_SPEED_16_0GT);
>>>>> +       } else {        /* pre-r3.0 */
>>>>> +               lnkcap = pci_read_config(root, pos + PCIER_LINK_CAP, 4);
>>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_2_5GB)
>>>>> +                       return (PCIE_SPEED_2_5GT);
>>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_5_0GB)
>>>>> +                       return (PCIE_SPEED_5_0GT);
>>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_8_0GB)
>>>>> +                       return (PCIE_SPEED_8_0GT);
>>>>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_16_0GB)
>>>>> +                       return (PCIE_SPEED_16_0GT);
>>>>> +       }
>>>>> +       return (PCI_SPEED_UNKNOWN);
>>>>> +}
>>>>> +
>>>>> +static inline enum pcie_link_width
>>>>> +pcie_get_width_cap(struct pci_dev *dev)
>>>>> +{
>>>>> +       uint32_t lnkcap;
>>>>> +
>>>>> +       pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
>>>> Better spelled as PCIER_LINK_CAP.
>>>>
>>>>> +       if (lnkcap)
>>>>> +               return ((lnkcap & PCI_EXP_LNKCAP_MLW) >> 4);
>>>> And PCIEM_LINK_CAP_MAX_WIDTH.
>>> Given that this is using a linuxkpi API (pcie_capability_read_dword
>>> instead of pcie_read_config()) then the rest of this function is
>>> written in the Linux KPI, so I think using Linux' native constants
>>> is actually more consistent for this function.
>> Hi
>>
>> Thanks for the feedback.
>>
>> This code used to exist in the drm driver but was moved into kernel pci
>> code on Linux. It's more or less as it existed in the drm driver with
>> FreeBSD patches having roots probably going back to drm2 from base. If
> No, I assure you that it did not appeared in drm2.

For a bit of history:

drm2:
https://github.com/freebsd/freebsd/blob/release/12.0.0/sys/dev/drm2/drm_pci.c#L439

later drm-next:
https://github.com/FreeBSDDesktop/kms-drm/blob/drm-v4.16/drivers/gpu/drm/drm_pci.c#L441

and now linuxkpi:
https://github.com/freebsd/freebsd/blob/master/sys/compat/linuxkpi/common/include/linux/pci.h#L866


Upstream patch: https://patchwork.kernel.org/patch/10487273/


>
>> this function won't be compatible with anything but drm drivers, we
>> could just as well move it to linuxkpi in ports instead.
> It only works for GPUs which are either pseudo-PCIe devices, i.e.
> they represent itself as PCIe-attached devices for configuration purposes
> but really attached to the CPU ring directly.  This is true for CPU-
> integrated graphics.  Or for lucky circumstances where external
> GPU is plugged into CPU PCIe slot without any PCIe bridges.
>
> If PCIe GPU is connected through a switch (which happen quite often
> these days), or into the slot managed by south bridge (I believe this
> is theoretically possible) then the code would not work.
>
> Perhaps you want dmar_get_requester() which does exactly what you
> (seems) try to get for PCIe devices attached without non-PCIe bridges.
>
>> I'll try to implement the suggested improvements. Keep on eye out for
>> updates here https://reviews.freebsd.org/D19565
>>
>> /Johannes
>>
>>
>>
>>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?21dcbc21-e3a2-64a2-97ba-a612163ebb9d>