Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 31 Jul 2016 17:40:18 +0200
From:      Michal Meloun <mmel@FreeBSD.org>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        "freebsd-arm@freebsd.org" <freebsd-arm@FreeBSD.org>, Svatopluk Kraus <skra@FreeBSD.org>, "freebsd-arch@freebsd.org" <freebsd-arch@FreeBSD.org>
Subject:   Re: INTRNG (Was: svn commit: r301453....)
Message-ID:  <579E1BE2.7020500@FreeBSD.org>
In-Reply-To: <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org>
References:  <201606051620.u55GKD5S066398@repo.freebsd.org> <57950005.6070403@FreeBSD.org> <f82018ee-51e7-60fa-2682-f0ef307a52b5@freebsd.org> <57961549.4020105@FreeBSD.org> <e2cace17-0924-2084-5fcf-626f87e41cc3@freebsd.org> <CANCZdfr%2BZ4XxXRY0yMiWXwp=8iKq54y3uJ9-OfAOdfxAs1qdtw@mail.gmail.com> <f94bfd25-fabf-efc3-55c9-cfdfd9e4d6e6@freebsd.org> <CANCZdfpz=z3gc3pyb_Qssa3vGJSnPv_r6J-SWDPPpE9zPYB9=w@mail.gmail.com> <ab44ddb1-515b-94ac-6b12-673b7c53d658@freebsd.org> <57976867.6080705@FreeBSD.org> <f2edac8f-2859-cd98-754e-881e2b2d1e63@freebsd.org> <5798E104.5020104@FreeBSD.org> <a5d43044-1733-6cc7-2e99-e85b60b0fcf3@freebsd.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <eb603349-eb88-866d-7a26-9e026518fd39@freebsd.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a):
[snip]
>
>> The current API is certainly a bit of a hack and I would be pleased to
>>> see it go; but the replacement needs to be more functional and more
>>> robust. As far as I can tell, I literally *cannot* move PowerPC over
>>> to this architecture.
>> Yes, at this time, I agree. If you can accept enhancement of
>> (owf_)bus_map_intr() then we can move to next, significantly less
>> hackish, state.
>
> OK, sure. I wrote some code this afternoon (attached) to sketch this.
> It does not compile or run or anything useful, but it should
> illustrate the important parts of what I think is an API that should
> work for everyone. I'm happy to finish it if it looks reasonable -- I
> may in any case just to replace PowerPC's increasingly teetering
> intr_machdep.c.
>
> The OF part is essentially unchanged from how it currently works:
> there's a method that you pass the interrupt specifier and interrupt
> parent to, and it spits back an IRQ that means that combination of
> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current
> API, can be implemented in terms of it in one line. Whenever the
> relevant PIC attaches, it is told to use the given IRQ to refer to
> that opaque bytestring.
>
> I've added a set of equivalent functions for ACPI that take the
> contents of an ACPI interrupt specifier and do the same things. It
> tries to have the IRQ be human-meaningful, if possible, by matching
> the ACPI interrupt number. Having separate functions seemed a little
> cleaner than exposing the enums and unions as part of the KPI, but I'm
> not wedded to it -- this is just an example.
>
> PICs register (and deregister) themselves with either an OF phandle or
> an ACPI resource string or (god help us) both as needed. The PICs have
> corresponding methods for interpreting various kinds of interrupt
> specifiers.
>
> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to
> succeed before the PICs are attached is gated on an #ifdef. That can
> probably be off by default on ARM. A PowerPC version of this code
> would have to keep it on until various bus drivers are fixed.
>
> Here's the general flow:
> - Parent bus enumerates children, maps IRQs as needed, adds to
> resource list. The struct resources involved aren't special (just a
> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to
> implement (and already are implemented, in general, for OF/FDT drivers).
> - bus_alloc_resource() does nothing special
> - bus_setup_intr() calls into the PIC and does what is needed, as in
> r301453 (to within that #ifdef)
>
> This should keep all the best pieces of everything:
> - ACPI and OF are supported together, and easy to extend for other
> types of mapping technologies without breaking either KBI or KPI (just
> add new functions)
> - No APIs change for existing Open Firmware code
> - The support code can live entirely in machine-dependent directories,
> with no code changes at all required in kern/ or dev/ofw/
> - bus_setup_intr() and friends behave as in r301453 and succeed or
> fail for real
> - PCI code is not an issue
> - No disconnected interrupt state maintained in mapping table (unless
> the transitional #ifdef is on) and the mapping table content is only
> larger than r301453 by having a copy of the original interrupt specifier.
>
> If and when we manage to fix the kernel APIs to allow non-scalar
> interrupt specifiers, this should also be easy to gradually
> transmogrify to support that case since there exists a 1:1 mapping of
> scalar IRQs to rich data and the control flow would be identical:
> interrupt specifiers are read at enumeration time and a resulting
> handle -- struct resource or scalar IRQ -- used afterward to refer to it.
>
Nice, nice, i think that we are very close at this point. 
The problem with the above workflow is that maps IRQ function is called
at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at
this time, PICs are not exist.
Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping
table,  it doesn't provide this functionality.
But yes, i understand that mapping table is important and necessary for you.

So, if i can extend our  concept a little bit, then:
We can add new table (map_data_tbl for example)  that holds  a copy of
the original interrupt specifier, index to irq_sources table and
probably some flags.
And we can modify the above workflow to:
- Parent bus enumerates children, allocates entries from map_data_tbl,
adds to resource list. The struct resources involved aren't special
(just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are
trivial to implement (and already are implemented, in general, for
OF/FDT drivers). Index to map_data_tbl is used as resource ID.
- bus_alloc_resource()  sets 'ALLOCATED' in map_data_tbl flags field,
*maps IRQs* and stores result in 'index' field. 
- bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls
into the PIC and does what is needed, as in r301453. (to within that
#ifdef)

And, in PPC case, newly attached PIC scans whole map_data_tbl table,
finds his entries and makes what needs.

Does that make sense?
I hope that postponing of map IRQ call is not a problem for PPC, 
everything else looks easy.
Moreover,  this additional level of indirection also solves all my
'hypothetical corner cases' with N:1 mappings (between original
interrupt specifier and real interrupt number).

> Some more comments below.
>
>>> The ideal API would be one in which the resource enumeration code
>>> could assign, instead of numbers, some structured information that
>>> contains the full firmware specifier (string parent + interrupt pin
>>> for ACPI, interrupt parent phandle + specifier for device tree, bare
>>> numbers for simple systems).
>> Yes, exactly. I am happy that at least we agree on something. The only
>> difference is that we want triplet <OFW, interrupt parent phandle,
>> specifier> or <ACPI, string parent, interrupt pin>
>
> Which makes perfect sense and is a good idea.
>
>>> That seemed, and seems, too invasive. I think we agree on this and
>>> have chosen to approximate that API in two different ways.
>> I still think that we found way how we can realize the above idea in
>> non-invasive way.
>>  From my point of view the solution is:
>> Put/attach the above triplet to RLE and then copy it (within
>> bus_resource_attach()  to struct resource.
>> But yes, I understand that I am not able to explain clearly enough.
>
> I think we were just talking past eachother somehow.
>
> Anyway, my concern about putting this in struct resource * at
> bus_alloc_resource() time is basically that there are a whole bunch of
> cases in which (a) it's hard to reverse engineer which interrupt from
> the device tree (or whatever) was meant to correspond to a particular
> arbitrary IRQ in an allocation request unless you keep logs and (b)
> there are a bunch of cases, mostly involving PCI, where interrupt
> resource allocation doesn't proceed through struct resource * at all,
> so you pretty much have to keep logs. Once you start keeping logs of
> which IRQ corresponds to which specifier at the device level, you
> might as well just do it once in a centralized place.
Oki, i agree.
>
>>
>>> When we designed the interrupt mapping code, we evaluated a large
>>> number of possible approaches. Something very much like this was one
>>> of them. It was rejected as being too fragile (it requires resource
>>> allocation and enumeration to be coupled) and to have too many corner
>>> cases (the PCI MSI and LSI APIs, for instance). The interrupt mapping
>>> stuff seemed at the time, and so far still seems, like the least-bad
>>> approximation to the ideal case: it is essentially that ideal API, in
>>> that it happens at bus enumeration and involves just assigning the
>>> firmware specifiers, but with lookup keys to that information stuffed
>>> into the 32-bit numbers that the rest of the system uses.
>> The raw PCI (or MSIX)  case is relatively simple. The PCI domain uses
>> raw numbers and it's host-to-pci bridge job to translate raw numbers
>> from PCI domain to (for example) OFW based resource.
>> At this point, I'm still not sure about MSI...
>
> Right, you can keep a big table in the PCI driver that maps whatever
> IRQs you are handing out to some richer interrupt specifiers and
> consult that when you get a bus_alloc_resource() request. Which is
> basically the approach I'm advocating, except that the table is
> centralized, which reduces code duplication and fixes a number of
> weird corner cases where children assign extra interrupts to
> themselves, etc. and the bus parent has no ability to do something
> sensible.
>
> Please take a look at the attached interface and see if it (or
> something like it) meets your needs. It meets all of mine, at least,
> and I think fixes all the things you were concerned about.
Can you, please,  give me also example for GPIO (these are identified by
<device_t, pin number> pair).
Or, in other words, GPIO needs his own sets of functions?

And thanks for your effort and patience,
Michal





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?579E1BE2.7020500>