Date: Tue, 2 Aug 2016 08:31:39 -0700 From: Nathan Whitehorn <nwhitehorn@freebsd.org> To: Michal Meloun <mmel@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" <freebsd-arm@FreeBSD.org>, Svatopluk Kraus <skra@FreeBSD.org>, "freebsd-arch@freebsd.org" <freebsd-arch@FreeBSD.org> Subject: Re: INTRNG (Was: svn commit: r301453....) Message-ID: <ad1e6337-468e-f35d-7454-444a561cb103@freebsd.org> In-Reply-To: <57A09F34.4050400@FreeBSD.org> References: <201606051620.u55GKD5S066398@repo.freebsd.org> <CANCZdfr%2BZ4XxXRY0yMiWXwp=8iKq54y3uJ9-OfAOdfxAs1qdtw@mail.gmail.com> <f94bfd25-fabf-efc3-55c9-cfdfd9e4d6e6@freebsd.org> <CANCZdfpz=z3gc3pyb_Qssa3vGJSnPv_r6J-SWDPPpE9zPYB9=w@mail.gmail.com> <ab44ddb1-515b-94ac-6b12-673b7c53d658@freebsd.org> <57976867.6080705@FreeBSD.org> <f2edac8f-2859-cd98-754e-881e2b2d1e63@freebsd.org> <5798E104.5020104@FreeBSD.org> <a5d43044-1733-6cc7-2e99-e85b60b0fcf3@freebsd.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <eb603349-eb88-866d-7a26-9e026518fd39@freebsd.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <cefdfaab-a95f-2a92-89bd-3d0cef2a75ab@freebsd.org> <57A09F34.4050400@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 08/02/16 06:25, Michal Meloun wrote: > I'm sorry for delayed response. We have serious trouble @work so I > cannot spend to much time on FreeBSB for next 2 or 3 days. Understood. > > Dne 02.08.2016 v 6:00 Nathan Whitehorn napsal(a): >> >> On 07/31/16 11:43, Nathan Whitehorn wrote: >>> >>> On 07/31/16 08:40, Michal Meloun wrote: >>>> Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): >>>> [snip] >>>>>> The current API is certainly a bit of a hack and I would be >>>>>> pleased to >>>>>>> see it go; but the replacement needs to be more functional and more >>>>>>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>>>>>> to this architecture. >>>>>> Yes, at this time, I agree. If you can accept enhancement of >>>>>> (owf_)bus_map_intr() then we can move to next, significantly less >>>>>> hackish, state. >>>>> OK, sure. I wrote some code this afternoon (attached) to sketch this. >>>>> It does not compile or run or anything useful, but it should >>>>> illustrate the important parts of what I think is an API that should >>>>> work for everyone. I'm happy to finish it if it looks reasonable -- I >>>>> may in any case just to replace PowerPC's increasingly teetering >>>>> intr_machdep.c. >>>>> >>>>> The OF part is essentially unchanged from how it currently works: >>>>> there's a method that you pass the interrupt specifier and interrupt >>>>> parent to, and it spits back an IRQ that means that combination of >>>>> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current >>>>> API, can be implemented in terms of it in one line. Whenever the >>>>> relevant PIC attaches, it is told to use the given IRQ to refer to >>>>> that opaque bytestring. >>>>> >>>>> I've added a set of equivalent functions for ACPI that take the >>>>> contents of an ACPI interrupt specifier and do the same things. It >>>>> tries to have the IRQ be human-meaningful, if possible, by matching >>>>> the ACPI interrupt number. Having separate functions seemed a little >>>>> cleaner than exposing the enums and unions as part of the KPI, but I'm >>>>> not wedded to it -- this is just an example. >>>>> >>>>> PICs register (and deregister) themselves with either an OF phandle or >>>>> an ACPI resource string or (god help us) both as needed. The PICs have >>>>> corresponding methods for interpreting various kinds of interrupt >>>>> specifiers. >>>>> >>>>> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >>>>> succeed before the PICs are attached is gated on an #ifdef. That can >>>>> probably be off by default on ARM. A PowerPC version of this code >>>>> would have to keep it on until various bus drivers are fixed. >>>>> >>>>> Here's the general flow: >>>>> - Parent bus enumerates children, maps IRQs as needed, adds to >>>>> resource list. The struct resources involved aren't special (just a >>>>> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >>>>> implement (and already are implemented, in general, for OF/FDT >>>>> drivers). >>>>> - bus_alloc_resource() does nothing special >>>>> - bus_setup_intr() calls into the PIC and does what is needed, as in >>>>> r301453 (to within that #ifdef) >>>>> >>>>> This should keep all the best pieces of everything: >>>>> - ACPI and OF are supported together, and easy to extend for other >>>>> types of mapping technologies without breaking either KBI or KPI (just >>>>> add new functions) >>>>> - No APIs change for existing Open Firmware code >>>>> - The support code can live entirely in machine-dependent directories, >>>>> with no code changes at all required in kern/ or dev/ofw/ >>>>> - bus_setup_intr() and friends behave as in r301453 and succeed or >>>>> fail for real >>>>> - PCI code is not an issue >>>>> - No disconnected interrupt state maintained in mapping table (unless >>>>> the transitional #ifdef is on) and the mapping table content is only >>>>> larger than r301453 by having a copy of the original interrupt >>>>> specifier. >>>>> >>>>> If and when we manage to fix the kernel APIs to allow non-scalar >>>>> interrupt specifiers, this should also be easy to gradually >>>>> transmogrify to support that case since there exists a 1:1 mapping of >>>>> scalar IRQs to rich data and the control flow would be identical: >>>>> interrupt specifiers are read at enumeration time and a resulting >>>>> handle -- struct resource or scalar IRQ -- used afterward to refer >>>>> to it. >>>>> >>>> Nice, nice, i think that we are very close at this point. >>>> The problem with the above workflow is that maps IRQ function is called >>>> at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at >>>> this time, PICs are not exist. >>>> Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping >>>> table, it doesn't provide this functionality. >>>> But yes, i understand that mapping table is important and necessary >>>> for you. >>>> >>>> So, if i can extend our concept a little bit, then: >>>> We can add new table (map_data_tbl for example) that holds a copy of >>>> the original interrupt specifier, index to irq_sources table and >>>> probably some flags. >>>> And we can modify the above workflow to: >>>> - Parent bus enumerates children, allocates entries from map_data_tbl, >>>> adds to resource list. The struct resources involved aren't special >>>> (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are >>>> trivial to implement (and already are implemented, in general, for >>>> OF/FDT drivers). Index to map_data_tbl is used as resource ID. >>>> - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, >>>> *maps IRQs* and stores result in 'index' field. >>>> - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls >>>> into the PIC and does what is needed, as in r301453. (to within that >>>> #ifdef) >>>> >>>> And, in PPC case, newly attached PIC scans whole map_data_tbl table, >>>> finds his entries and makes what needs. >>>> >>>> Does that make sense? >>>> I hope that postponing of map IRQ call is not a problem for PPC, >>>> everything else looks easy. >>>> Moreover, this additional level of indirection also solves all my >>>> 'hypothetical corner cases' with N:1 mappings (between original >>>> interrupt specifier and real interrupt number). >>> Yes, I think we are converging. >>> >>> My qualm about bus_alloc_resource() is twofold: >>> 1. Traditionally, with interrupts, bus_alloc_resource() has no side >>> effects and I'm not sure it propagates cleanly down the tree in all >>> cases. >>> 2. There are a few bus APIs (bus_config_intr() comes to mind) that >>> use raw IRQ IDs and, so far as I know, can be, and sometimes are >>> called before bus_alloc_resource(). If the PIC doesn't know about the >>> IRQ yet when bus_config_intr() (etc.) is called, things will just break. >>> >>> So you would need to make sure that any bus method handling a >>> resource ID causes it to be mapped on the PIC at that time. It's OK >>> if that happens in bus_alloc_resource() so long as bus_config_intr(), >>> bus_setup_intr(), etc. cause that to happen if it hasn't happened yet >>> -- I don't care when these calls are made to the PIC driver so long >>> as *what* calls will be made is set at enumeration time. >>> >>> I am also totally fine with having, on ARM, bus_config_intr(), >>> bus_setup_intr() etc. fail if the PIC hasn't attached yet (On >>> PowerPC, we can't do that yet, but after this conversation, I regard >>> that as a bug and would fix that later), as well as delaying setup on >>> the PIC to the first time any bus driver actually tries to *use* the >>> interrupt (alloc_resource(), setup_intr(), config_intr(), whatever) >>> rather than when the interrupt is originally allocated. >>> > I understand. And yes, bus_alloc_resource() isn't propagated cleanly > down the tree in all cases. That's why we have 'INTRNG' hook in > subr_bus.c, which is suboptimal. > > About bus_config_intr(). The only consumer of bus_config_intr() is ACPI > code, in little hackish manner, as workaround for problem which is > fully solved by INTRNG. > Also, the semantic of bus_config_intr() is not clear, from INTRNG point > of view. > So i have tendency to ignore it :) Heh. Fair enough. I hadn't noticed that -- though I can see legitimate uses for it in the context of GPIOs. In any case, I'm a little wary of bus_alloc_resource() having side effects. Usually bus_activate_resource() would do that kind of thing. > >>> ------------ The following is a large parenthesis ------------------- >>> >>> One other possible route here that would also work well would be to >>> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't >>> gain a lot by sharing it) and implement ofw_bus_map_intr() locally as >>> an ofwbus bus method. Then you could have the mapping table stored in >>> ofwbus's softc -- the API was designed for this initially. You would >>> need MD extensions for doing PIC registration there (which is fine), >>> but that segregates all the OFW-specific information into >>> OFW-specific code and would let the bus methods and the OFW interrupt >>> mapping table interact cleanly in the same place. This still >>> preserves the pre-r301453 API, makes PowerPC work, and maybe >>> address's skra@'s concern about extensibility and letting core >>> interrupt code know about FDT (or ACPI). I'd be happy to mock this up >>> as well if you think it's a good approach. >>> >> [snip] >>> This has the following features: >>> - Existing OFW API and semantics unchanged >>> - As such, PowerPC, PCI, etc. work fine with no changes >>> - Details encapsulated in MD code, so individual platforms can >>> implement this however they like >>> - arm/arm/intr.c (or whatever) only needs a method to allocate a >>> fresh interrupt, with no state, and anoter to set the device_t for an >>> interrupt sometime later. >>> - The internal table in the platform interrupt code has no knowledge >>> of any mappings whatsoever except having the appropriate device_t for >>> the PIC stored with the interrupt vector. >>> - Device tree bits handled purely in device tree code >>> - No action need be taken on any mapping until the interrupt is >>> actually allocated/set up, like r301453 >>> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar >>> enumeration-mechanism-specific code in the root bus for that mapping >>> mechanism. >>> >>> -------------- End parenthesis ------------------------------- >> Here's an implementation of the parenthesis I wrote on an airplane >> this afternoon. It should be complete, though has not been tested. The >> code is short and simple (+70 lines in ofwbus.c). This preserves the >> pre-r301453 API and semantics relative to drivers, which means PowerPC >> and PCI work out of the box, while keeping the semantics relative to >> the interrupt layer of r301453 (PIC methods only called on resource >> allocation, no allocatable IRQs on unattached PICs, encapsulation of >> OFW-specific code in OFW-specific bits of the tree). It turns out >> those two things are compatible, somewhat to my surprise, and that >> makes the result very clean. I like this approach and would be happy >> to move forward with it. There are five functions of interest: >> >> 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you >> pass an interrupt specifier and parent, you get back an IRQ. No >> changes. This is the core of the normal OFW interrupt API. >> >> 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a >> new function that PIC drivers are supposed to use to register control >> of an interrupt domain. This replaces machine-specific code like >> powerpc_register_pic() to allow the PIC table to be in a bus parent >> rather than in the interrupt core. >> >> 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This >> is a new function that PIC drivers that know how to handle device-tree >> interrupt descriptors implement (analogous to various existing ones >> that vary by platform). It tells the PIC that the given abstract IRQ >> means the given opaque interrupt specifier. >> >> 4. arm_allocate_irq(int suggested). This allocates a new IRQ number >> not (yet) attached to a PIC, as in r301453. I've added a parameter >> that lets you pass a suggested number to try in case it is possible to >> make it match an interrupt pin or something for human-readability. >> >> 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD interrupt >> layer to associate a given PIC device_t with an IRQ. That is all the >> information the MD layer ever has about the IRQ mapping. >> >> Functions #1 and #2 are now implemented completely in ofwbus.c: there >> are no callouts anywhere else and the interrupt mapping table is >> maintained entirely internally to ofwbus (in its softc). In order to >> implement ACPI, or some other strategy, you would implement analogs to >> functions #1 and #2 that live somewhere in the bus hierarchy that is >> guaranteed to be above all devices that might want that type of >> interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers >> implementing the mapping scheme would provide. >> >> Since the system interrupt code has no knowledge at all of interrupt >> mapping, of any type, in this scheme, adding new mapping types is >> trivial and can be done on a driver-by-driver basis if necessary >> without changing KPI and without any other part of the system even >> being aware. For example, GPIOs can use a completely different >> mechanism if they want and can do setup purely in the GPIO controller >> driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a GPIO >> controller in which the GPIO controller allocates a generic IRQ, >> assigns through some internal table just in the GPIO driver, and >> returns to it to a consumer in some other device driver -- without a >> GPIO mapping type, new bus functions, or modifications to the platform >> interrupt code. >> >> The control flow goes like this: >> - Bus driver enumerates children, parses interrupts properties, calls >> OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to >> interrupt list. >> - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank >> disconnected IRQ from the MD interrupt layer, and stores the mapping >> from the new IRQ to the given interrupt specifier and phandle in an >> internal table in ofwbus's softc. >> NB: Nothing else happens here, like post-r301453. Changing this does >> not change any semantics of the API pre-r301453, which means it >> remains fully compatible with PCI and PowerPC. Also, like >> post-r301453, there is no involvement of nexus. >> - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these >> messages and adds a (device_t, phandle_t) mapping to a second internal >> table. Note that the interrupt layer does not need to handle PIC >> registration anymore at all (except for the root PIC). >> - Bus child eventually calls a function that tries to set the >> interrupt up (e.g. bus_setup_intr()). That propagates up the bus >> hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, >> looks it up in the table, looks up the appropriate PIC from the PIC >> table, then: >> A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the >> interrupt layer's only interaction with the mapping code. All it deals >> with is device_ts and abstract IRQ numbers. >> B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, >> interrupt-specifier) to tell the PIC that the interrupt layer's IRQ >> irq_number means the given specifier >> C) finally, passes the call onto nexus, which will do whatever would >> normally happen (unmasking the interrupt, setting handlers, etc.) in >> terms only of the abstract IRQ and the device_t assigned by ofwbus. >> >> You would implement ACPI just by doing a s/OFW/ACPI/g >> search-and-replace above -- since the interrupt layer doesn't know >> about OFW or ACPI or anything else, there is no need to touch it. This >> seems clean, simple, compartmentalized, preserves the existing API, >> and should work on all of our various hardware. PowerPC can't quite >> work with it yet without some multipass foo, but, since the API is >> preserved, that transition can happen gradually without KPI changes. >> For the same reason that it is API-preserving, I think this code is >> also MFC-able. >> -Nathan > I think that we are slightly diverges in this place: > - please note that PIC API (in kern/pic_if.m) is different from the one > that PPC uses. Yes, which is fine (this is machine-dependent code anyway). On consideration, the PIC_MAP_OFW_INTR() function should probably be called OFW_BUS_PIC_MAP_INTR() and live in ofw_bus_if.m anyway, then be implemented by PICs. > - we must be able to store configuration data (original interrupt > specifier) in all cases, not only for OFW. That's reason why I have > proposed > to create 'mapping table'. It is stored in all cases, just not in the core interrupt code. I've only mocked this up for OFW here. For ACPI, you would have the equivalent table in acpi.c, along with ACPI_MAP_INTR(), ACPI_REGISTER_PIC(), etc. in acpi_if.m and implemented in dev/acpica/acpi.c. For GPIOs, you would have a mechanism that just traces however you are representing GPIOs anyway. > Anyway, i think that we both talking about +/- same solution, only i > want to move it from OFW specific level implemented at OFW bus level to > bus/interrupt specifier format independent level implemented in > subr_intr.c. This implements the same API in any case (identical to that pre-r301453), so the implementation doesn't really matter at all in terms of my needs. That said, having it in the root bus for the mapping domain (ofwbus0, acpi0) etc. seems cleaner to me, with no loss of functionality. The core interrupt code (subr_intr.c or whatever) doesn't have any obvious need to know anything at all about the mapping information so long as it knows the PIC device_t that corresponds to every abstract IRQ so that it can mask it or do other operations. Since, presumably, nothing in an ACPI device tree references an interrupt in the OFW device tree (if you even had both -- and how would you specify that, anyway?), implementing the relevant bits one step up the bus hierarchy doesn't change any behavior. And nothing can possibly be more flexible in terms of the mapping table in subr_intr.c than not having a mapping table: you don't need enums for mapping types, or unions that need to be expanded, breaking KBI, or anything. You can delete all the PIC registration (and MSI) code, all the switches, and all the unions from subr_intr.c and have a totally mapping-mechanism-agnostic KPI. > Most of your control flow can be implemented at general level as is, or > already exist [intr_map_irq(), intr_pic_register()] . > Also, impact for current PPC code is, by me, minimal. > I can sketch my idea in more details, if you found it acceptable. All ideas are fine -- and the solution does not need to apply to all platforms anyway, so long as the OFW_BUS_MAP_INTR() semantics (and, ideally, API) are preserved. > Again, I'm sorry for delayed and very brief response. No worries; good luck with the work emergencies. -Nathan > Michal >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?ad1e6337-468e-f35d-7454-444a561cb103>