From owner-freebsd-arm@freebsd.org Sun Jul 31 15:40:22 2016 Return-Path: Delivered-To: freebsd-arm@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B465BBA9F22; Sun, 31 Jul 2016 15:40:22 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from mail.miracle.cz (mail.miracle.cz [193.84.128.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.miracle.cz", Issuer "Miracle Group Root CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6EB651224; Sun, 31 Jul 2016 15:40:21 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from [193.84.128.50] (meloun.ad.miracle.cz [193.84.128.50]) by mail.miracle.cz (Postfix) with ESMTPSA id B6BBC3AC81; Sun, 31 Jul 2016 17:40:18 +0200 (CEST) Subject: Re: INTRNG (Was: svn commit: r301453....) To: Nathan Whitehorn References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57950005.6070403@FreeBSD.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Michal Meloun X-Enigmail-Draft-Status: N1110 Message-ID: <579E1BE2.7020500@FreeBSD.org> Date: Sun, 31 Jul 2016 17:40:18 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.miracle.cz); Sun, 31 Jul 2016 17:40:18 +0200 (CEST) X-BeenThere: freebsd-arm@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: "Porting FreeBSD to ARM processors." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Jul 2016 15:40:22 -0000 Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): [snip] > >> The current API is certainly a bit of a hack and I would be pleased to >>> see it go; but the replacement needs to be more functional and more >>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>> to this architecture. >> Yes, at this time, I agree. If you can accept enhancement of >> (owf_)bus_map_intr() then we can move to next, significantly less >> hackish, state. > > OK, sure. I wrote some code this afternoon (attached) to sketch this. > It does not compile or run or anything useful, but it should > illustrate the important parts of what I think is an API that should > work for everyone. I'm happy to finish it if it looks reasonable -- I > may in any case just to replace PowerPC's increasingly teetering > intr_machdep.c. > > The OF part is essentially unchanged from how it currently works: > there's a method that you pass the interrupt specifier and interrupt > parent to, and it spits back an IRQ that means that combination of > things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current > API, can be implemented in terms of it in one line. Whenever the > relevant PIC attaches, it is told to use the given IRQ to refer to > that opaque bytestring. > > I've added a set of equivalent functions for ACPI that take the > contents of an ACPI interrupt specifier and do the same things. It > tries to have the IRQ be human-meaningful, if possible, by matching > the ACPI interrupt number. Having separate functions seemed a little > cleaner than exposing the enums and unions as part of the KPI, but I'm > not wedded to it -- this is just an example. > > PICs register (and deregister) themselves with either an OF phandle or > an ACPI resource string or (god help us) both as needed. The PICs have > corresponding methods for interpreting various kinds of interrupt > specifiers. > > Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to > succeed before the PICs are attached is gated on an #ifdef. That can > probably be off by default on ARM. A PowerPC version of this code > would have to keep it on until various bus drivers are fixed. > > Here's the general flow: > - Parent bus enumerates children, maps IRQs as needed, adds to > resource list. The struct resources involved aren't special (just a > single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to > implement (and already are implemented, in general, for OF/FDT drivers). > - bus_alloc_resource() does nothing special > - bus_setup_intr() calls into the PIC and does what is needed, as in > r301453 (to within that #ifdef) > > This should keep all the best pieces of everything: > - ACPI and OF are supported together, and easy to extend for other > types of mapping technologies without breaking either KBI or KPI (just > add new functions) > - No APIs change for existing Open Firmware code > - The support code can live entirely in machine-dependent directories, > with no code changes at all required in kern/ or dev/ofw/ > - bus_setup_intr() and friends behave as in r301453 and succeed or > fail for real > - PCI code is not an issue > - No disconnected interrupt state maintained in mapping table (unless > the transitional #ifdef is on) and the mapping table content is only > larger than r301453 by having a copy of the original interrupt specifier. > > If and when we manage to fix the kernel APIs to allow non-scalar > interrupt specifiers, this should also be easy to gradually > transmogrify to support that case since there exists a 1:1 mapping of > scalar IRQs to rich data and the control flow would be identical: > interrupt specifiers are read at enumeration time and a resulting > handle -- struct resource or scalar IRQ -- used afterward to refer to it. > Nice, nice, i think that we are very close at this point. The problem with the above workflow is that maps IRQ function is called at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at this time, PICs are not exist. Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping table, it doesn't provide this functionality. But yes, i understand that mapping table is important and necessary for you. So, if i can extend our concept a little bit, then: We can add new table (map_data_tbl for example) that holds a copy of the original interrupt specifier, index to irq_sources table and probably some flags. And we can modify the above workflow to: - Parent bus enumerates children, allocates entries from map_data_tbl, adds to resource list. The struct resources involved aren't special (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to implement (and already are implemented, in general, for OF/FDT drivers). Index to map_data_tbl is used as resource ID. - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, *maps IRQs* and stores result in 'index' field. - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls into the PIC and does what is needed, as in r301453. (to within that #ifdef) And, in PPC case, newly attached PIC scans whole map_data_tbl table, finds his entries and makes what needs. Does that make sense? I hope that postponing of map IRQ call is not a problem for PPC, everything else looks easy. Moreover, this additional level of indirection also solves all my 'hypothetical corner cases' with N:1 mappings (between original interrupt specifier and real interrupt number). > Some more comments below. > >>> The ideal API would be one in which the resource enumeration code >>> could assign, instead of numbers, some structured information that >>> contains the full firmware specifier (string parent + interrupt pin >>> for ACPI, interrupt parent phandle + specifier for device tree, bare >>> numbers for simple systems). >> Yes, exactly. I am happy that at least we agree on something. The only >> difference is that we want triplet > specifier> or > > Which makes perfect sense and is a good idea. > >>> That seemed, and seems, too invasive. I think we agree on this and >>> have chosen to approximate that API in two different ways. >> I still think that we found way how we can realize the above idea in >> non-invasive way. >> From my point of view the solution is: >> Put/attach the above triplet to RLE and then copy it (within >> bus_resource_attach() to struct resource. >> But yes, I understand that I am not able to explain clearly enough. > > I think we were just talking past eachother somehow. > > Anyway, my concern about putting this in struct resource * at > bus_alloc_resource() time is basically that there are a whole bunch of > cases in which (a) it's hard to reverse engineer which interrupt from > the device tree (or whatever) was meant to correspond to a particular > arbitrary IRQ in an allocation request unless you keep logs and (b) > there are a bunch of cases, mostly involving PCI, where interrupt > resource allocation doesn't proceed through struct resource * at all, > so you pretty much have to keep logs. Once you start keeping logs of > which IRQ corresponds to which specifier at the device level, you > might as well just do it once in a centralized place. Oki, i agree. > >> >>> When we designed the interrupt mapping code, we evaluated a large >>> number of possible approaches. Something very much like this was one >>> of them. It was rejected as being too fragile (it requires resource >>> allocation and enumeration to be coupled) and to have too many corner >>> cases (the PCI MSI and LSI APIs, for instance). The interrupt mapping >>> stuff seemed at the time, and so far still seems, like the least-bad >>> approximation to the ideal case: it is essentially that ideal API, in >>> that it happens at bus enumeration and involves just assigning the >>> firmware specifiers, but with lookup keys to that information stuffed >>> into the 32-bit numbers that the rest of the system uses. >> The raw PCI (or MSIX) case is relatively simple. The PCI domain uses >> raw numbers and it's host-to-pci bridge job to translate raw numbers >> from PCI domain to (for example) OFW based resource. >> At this point, I'm still not sure about MSI... > > Right, you can keep a big table in the PCI driver that maps whatever > IRQs you are handing out to some richer interrupt specifiers and > consult that when you get a bus_alloc_resource() request. Which is > basically the approach I'm advocating, except that the table is > centralized, which reduces code duplication and fixes a number of > weird corner cases where children assign extra interrupts to > themselves, etc. and the bus parent has no ability to do something > sensible. > > Please take a look at the attached interface and see if it (or > something like it) meets your needs. It meets all of mine, at least, > and I think fixes all the things you were concerned about. Can you, please, give me also example for GPIO (these are identified by pair). Or, in other words, GPIO needs his own sets of functions? And thanks for your effort and patience, Michal