From owner-freebsd-arch@freebsd.org Sun Jul 31 06:11:39 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D8997BA9C0E; Sun, 31 Jul 2016 06:11:39 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from c.mail.sonic.net (c.mail.sonic.net [64.142.111.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BA55011EB; Sun, 31 Jul 2016 06:11:39 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net (75-101-50-44.static.sonic.net [75.101.50.44]) (authenticated bits=0) by c.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u6V6BTAl015001 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Sat, 30 Jul 2016 23:11:30 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Michal Meloun References: <201606051620.u55GKD5S066398@repo.freebsd.org> <5794720F.4050303@FreeBSD.org> <8bfd8668-bc49-e109-e610-b5cd470be3ec@freebsd.org> <57950005.6070403@FreeBSD.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> Date: Sat, 30 Jul 2016 23:11:29 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <579CF7C8.1040302@FreeBSD.org> Content-Type: multipart/mixed; boundary="------------B749752E283B3662897D27B5" X-Sonic-CAuth: UmFuZG9tSVY7JnB2hp4xwcXi2+33i82Up9sMkb7R78poh7TNRC/sop5gnkeK+TCcQg5Jo5M5noycmkDqPWgzI7KwZOyRBCsx2IhbssMPEeY= X-Sonic-ID: C;wl/enuVW5hGIlaDx2xNB0g== M;oJ45n+VW5hGIlaDx2xNB0g== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Jul 2016 06:11:39 -0000 This is a multi-part message in MIME format. --------------B749752E283B3662897D27B5 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit On 07/30/16 11:54, Michal Meloun wrote: [snip] > >>>>>> Or: you could skip all of that and use the fully functional mechanism >>>>>> that already exists. >>>>> The problem is, at least from my point of view, that we don't have it. >>>>> The actual MIPS code doesn't work on ARM for numerous reasons. >>>>> This is >>>>> first one -> >>>>> >>>>> https://svnweb.freebsd.org/base/head/sys/powerpc/powerpc/nexus.c?revision=297000&view=markup#l186 >>>>> The pre-r301453 ARM INTRNG doesn't support detached interrupts. >>>>> And r301453 doesn't changed nothing about this. >>>> I'm happy to write whatever code is missing. The ARM implementation of >>>> ofw_bus_map_intr() does seem fairly braindead and should be replaced. >>>> >>>> So here is where I am right now: >>>> - The perceived advantage of the new API seems to be that the mapping >>>> information (interrupt parent, etc.) ends up in a struct resource >>>> instead of in a centralized mapping table >>> True. And, in optimal case, also in RLE. See [1]. >> Your code also has a centralized table (static struct intr_irqsrc >> *irq_sources[NIRQ];), basically the same one, so I'm actually really >> confused on this point. The only actual difference seems to be that >> the firmware interrupt descriptor is stored in the RLE at >> bus_alloc_resource() time instead of in the table at bus enumeration >> time and that the new code requires some extra bus methods. > The irq_sources holds only real interrupts (all interrupts pin from all > interrupts controllers). Your effort is move all 'interrupt description > data' out of INTRNG code - > to RLE (at bus enumeration time) (see [1]) and to struct resource (at > bus_alloc_resource() time). The new bus method is only temporary > shortcut to bypass 11 release. > > But, to be fair, I was hoping it might be necessary. > > [1] This is last missing step in whole patch series and, in optimal > case, it need new filed(s) in RLE structure. So we chose to postpone > discussion about this to time after 11 release. That's a nice idea. I think it will require a big amount of API retooling, though, since we don't use RLEs consistently everywhere in the tree (the PCI code, for example). > >>>> - Additionally, it gives you a better shot at having the PIC online >>>> before the PIC's interrupts are parsed (which is not required, but >>>> nice). >>> Nope, we simply fount that detached pics is very dangerous and not >>> needed. But, if you love it, it can be added as MD extension. >> Dangerous how? I don't "love it"; it's an unfortunate necessity. > Imho, some drivers may rely on correct return code from > bus_alloc_resource() or from bus_setup_intr(). > For example to detect if interrupt exist, or if given interrupt > configuration is supported. These can be decoupled. The thing you need for sure is to be able to assign resources before the PIC attaches. I think you've convinced me that with some care in bus passes, it is possible to postpone the need for bus_alloc_resource() and bus_setup_intr() to succeed until after the PIC[s] have attached in all cases. That will require work, but it's in principle possible. >>>> - Beyond these aesthetic points, there are no concrete examples of new >>>> functionality added by this API, aside from some minor implementation >>>> bugs of the old one on ARM that are easily fixed. >>> Really? Or you just ignore it? >> Which ones? I've been asking for a week and a half now and you have >> responded with some hypothetical examples, all of which either (a) do >> not seem to occur in the real world and (b) were trivially >> implementable with the existing code. >> >>>> - There are, conversely, a number of concrete cases where this new API >>>> would not be able to do the right thing. Some of these can be worked >>>> around or fixed with significant restructuring in the MI parts of the >>>> kernel. >>> And i offer you a fix, without "significant restructuring in the MI >>> parts of the kernel". Unfortunately, you did not want to accept anything >>> other than the old API. >> For one of them, partially. For the others, not so much. For example, >> I have no idea how to implement PCI interrupts (PCIB_ROUTE_INTERRUPT) >> in this framework. I am perfectly happy to accept a new API. What I am >> not perfectly happy to accept is an API that makes it impossible for >> me to support the platforms that I need to support and that, >> simultaneously, doesn't even provide any new capabilities on other >> platforms. > From one of previous mail: > > But again, all what I want in this area is (for me) simple change of > your ofw_bus_map_intr() method to something like: > > enum intr_map_data_type { > INTR_MAP_DATA_OFW, > INTR_MAP_DATA_GPIO, > ... > }; > > int > bus_map_intr(device_t dev, enum intr_map_data_type type, uintptr_t > pic_id, void *config_data, int config_size) > >> The current API is certainly a bit of a hack and I would be pleased to >> see it go; but the replacement needs to be more functional and more >> robust. As far as I can tell, I literally *cannot* move PowerPC over >> to this architecture. > Yes, at this time, I agree. If you can accept enhancement of > (owf_)bus_map_intr() then we can move to next, significantly less > hackish, state. OK, sure. I wrote some code this afternoon (attached) to sketch this. It does not compile or run or anything useful, but it should illustrate the important parts of what I think is an API that should work for everyone. I'm happy to finish it if it looks reasonable -- I may in any case just to replace PowerPC's increasingly teetering intr_machdep.c. The OF part is essentially unchanged from how it currently works: there's a method that you pass the interrupt specifier and interrupt parent to, and it spits back an IRQ that means that combination of things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current API, can be implemented in terms of it in one line. Whenever the relevant PIC attaches, it is told to use the given IRQ to refer to that opaque bytestring. I've added a set of equivalent functions for ACPI that take the contents of an ACPI interrupt specifier and do the same things. It tries to have the IRQ be human-meaningful, if possible, by matching the ACPI interrupt number. Having separate functions seemed a little cleaner than exposing the enums and unions as part of the KPI, but I'm not wedded to it -- this is just an example. PICs register (and deregister) themselves with either an OF phandle or an ACPI resource string or (god help us) both as needed. The PICs have corresponding methods for interpreting various kinds of interrupt specifiers. Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to succeed before the PICs are attached is gated on an #ifdef. That can probably be off by default on ARM. A PowerPC version of this code would have to keep it on until various bus drivers are fixed. Here's the general flow: - Parent bus enumerates children, maps IRQs as needed, adds to resource list. The struct resources involved aren't special (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to implement (and already are implemented, in general, for OF/FDT drivers). - bus_alloc_resource() does nothing special - bus_setup_intr() calls into the PIC and does what is needed, as in r301453 (to within that #ifdef) This should keep all the best pieces of everything: - ACPI and OF are supported together, and easy to extend for other types of mapping technologies without breaking either KBI or KPI (just add new functions) - No APIs change for existing Open Firmware code - The support code can live entirely in machine-dependent directories, with no code changes at all required in kern/ or dev/ofw/ - bus_setup_intr() and friends behave as in r301453 and succeed or fail for real - PCI code is not an issue - No disconnected interrupt state maintained in mapping table (unless the transitional #ifdef is on) and the mapping table content is only larger than r301453 by having a copy of the original interrupt specifier. If and when we manage to fix the kernel APIs to allow non-scalar interrupt specifiers, this should also be easy to gradually transmogrify to support that case since there exists a 1:1 mapping of scalar IRQs to rich data and the control flow would be identical: interrupt specifiers are read at enumeration time and a resulting handle -- struct resource or scalar IRQ -- used afterward to refer to it. Some more comments below. >> The ideal API would be one in which the resource enumeration code >> could assign, instead of numbers, some structured information that >> contains the full firmware specifier (string parent + interrupt pin >> for ACPI, interrupt parent phandle + specifier for device tree, bare >> numbers for simple systems). > Yes, exactly. I am happy that at least we agree on something. The only > difference is that we want triplet specifier> or Which makes perfect sense and is a good idea. >> That seemed, and seems, too invasive. I think we agree on this and >> have chosen to approximate that API in two different ways. > I still think that we found way how we can realize the above idea in > non-invasive way. > From my point of view the solution is: > Put/attach the above triplet to RLE and then copy it (within > bus_resource_attach() to struct resource. > But yes, I understand that I am not able to explain clearly enough. I think we were just talking past eachother somehow. Anyway, my concern about putting this in struct resource * at bus_alloc_resource() time is basically that there are a whole bunch of cases in which (a) it's hard to reverse engineer which interrupt from the device tree (or whatever) was meant to correspond to a particular arbitrary IRQ in an allocation request unless you keep logs and (b) there are a bunch of cases, mostly involving PCI, where interrupt resource allocation doesn't proceed through struct resource * at all, so you pretty much have to keep logs. Once you start keeping logs of which IRQ corresponds to which specifier at the device level, you might as well just do it once in a centralized place. > >> When we designed the interrupt mapping code, we evaluated a large >> number of possible approaches. Something very much like this was one >> of them. It was rejected as being too fragile (it requires resource >> allocation and enumeration to be coupled) and to have too many corner >> cases (the PCI MSI and LSI APIs, for instance). The interrupt mapping >> stuff seemed at the time, and so far still seems, like the least-bad >> approximation to the ideal case: it is essentially that ideal API, in >> that it happens at bus enumeration and involves just assigning the >> firmware specifiers, but with lookup keys to that information stuffed >> into the 32-bit numbers that the rest of the system uses. > The raw PCI (or MSIX) case is relatively simple. The PCI domain uses > raw numbers and it's host-to-pci bridge job to translate raw numbers > from PCI domain to (for example) OFW based resource. > At this point, I'm still not sure about MSI... Right, you can keep a big table in the PCI driver that maps whatever IRQs you are handing out to some richer interrupt specifiers and consult that when you get a bus_alloc_resource() request. Which is basically the approach I'm advocating, except that the table is centralized, which reduces code duplication and fixes a number of weird corner cases where children assign extra interrupts to themselves, etc. and the bus parent has no ability to do something sensible. Please take a look at the attached interface and see if it (or something like it) meets your needs. It meets all of mine, at least, and I think fixes all the things you were concerned about. -Nathan --------------B749752E283B3662897D27B5 Content-Type: text/plain; charset=UTF-8; name="intr.h" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="intr.h" #ifndef _ARM_INTR_H #define _ARM_INTR_H /* Open Firmware-type mappings */ int map_ofw_irq(uint32_t iparent, uint32_t *ispec, size_t icells); void register_pic_ofw(device_t dev, uint32_t phandle); void unregister_pic_ofw(device_t dev); /* ACPI-type mappings */ int map_acpi_irq(const char *provider, int interrupt, uint8_t polarity, uint8_t trigger); void register_pic_acpi(device_t dev, const char *provider); void unregister_pic_acpi(device_t dev); /* PIC interface */ void dispatch_intr(u_int vector, struct trapframe *tf); #endif --------------B749752E283B3662897D27B5 Content-Type: text/plain; charset=UTF-8; name="intr.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="intr.c" /*- * Copyright 2016 Nathan Whitehorn * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static MALLOC_DEFINE(M_INTR, "intr", "interrupt handler data"); struct intr_rec { struct intr_event *event; long *cntp; u_int cntindex; device_t pic; cpuset_t cpu; u_int vector; RB_ENTRY(intr_rec) intr_rec_link; #ifdef ALLOW_LATE_PIC_ATTACH enum intr_trigger trig; enum intr_polarity pol; #endif union { struct { void *ofw_spec; size_t ofw_spec_len; uint32_t ofw_iparent; } ofw_spec; struct { uint8_t acpi_triggering; uint8_t acpi_polarity; uint32_t acpi_intline; const char *acpi_source; } acpi_spec; } ispec; enum { IRQ_TYPE_OFW, IRQ_TYPE_ACPI, IRQ_TYPE_IPI, } ispec_type; }; /* Tree of interrupts: indexes IRQ# (vector) to specifier */ static int intr_rec_compare(struct intr_rec *a, struct intr_rec *b); static RB_HEAD(intr_rec_tree, intr_rec) intr_list = RB_INITIALIZER(&intr_list); RB_GENERATE_STATIC(intr_rec_tree, intr_rec, intr_rec_link, intr_rec_compare); static struct rmlock intr_list_lock; /* Maps of PICs by ID to device_t */ struct ofw_pic { LIST_ENTRY(ofw_pic) ofw_pic_link; uint32_t phandle; device_t dev; }; static LIST_HEAD(ofw_pic_head, ofw_pic) ofw_pic_list = LIST_HEAD_INITIALIZER(&ofw_pic_list); struct acpi_pic { LIST_ENTRY(acpi_pic) acpi_pic_link; const char *acpi_source; device_t dev; }; static LIST_HEAD(acpi_pic_head, acpi_pic) acpi_pic_list = LIST_HEAD_INITIALIZER(&acpi_pic_list); static struct mtx pic_list_lock; device_t root_pic = NULL; /* * Locking semantics: * - intr_list_lock is held for reading whenever an interrupt is being * issued. No non-atomic fields can be updated when held for reading. * - intr_list_lock is held for writing whenever non-atomic fields need * to be changed. * - When entries are added to or deleted from the table, the intr_list_lock * must be held for writing *and* pic_list_lock must be held. * - pic_list_lock is outer to intr_list_lock */ static void intr_init(void *dummy __unused) { rm_init(&intr_list_lock, "intr list"); mtx_init(&pic_list_lock, "PIC list", NULL, MTX_DEF); } SYSINIT(intr_init, SI_SUB_INTR, SI_ORDER_FIRST, intr_init, NULL); /* Utility functions */ static struct intr_rec * allocate_irq(void) { struct intr_rec *rec = malloc(sizeof(struct intr_rec), M_INTR, M_ZERO | M_WAITOK); #ifdef SMP rec->cpu = all_cpus; #else CPU_SETOF(0, &rec->cpu); #endif #ifdef ALLOW_LATE_PIC_ATTACH rec->trig = INTR_TRIGGER_CONFORM; rec->pol = INTR_POLARITY_CONFORM; #endif return rec; } static int insert_irq(struct intr_rec *rec) { struct intr_rec *existing = NULL; rm_assert(&intr_list_lock, RA_WLOCKED); /* Allow code to suggest a vector ID. Honor if possible. */ if (rec->vector != 0) { existing = RB_FIND(intr_rec_tree, &intr_list, rec); if (existing == NULL) goto out; } /* Otherwise, use min(100, max_irq + 1) */ existing = RB_MAX(intr_rec_tree, &intr_list); if (existing == NULL) rec->vector = 100; else rec->vector = min(existing->vector + 1, 100); out: RB_INSERT(intr_rec_tree, &intr_list, rec); return (rec->vector); } static struct intr_rec * find_irq(u_int vector) { struct intr_rec *i, tmp; /* * Note on locking: we don't destroy any IRQ mappings once made * (just reuse them if asked for again), so below code is safe */ tmp.vec = vector; rm_rlock(&intr_list_lock); i = RB_FIND(intr_rec_tree, &intr_list, &tmp); rm_runlock(&intr_list_lock); return (i); } /* Open Firmware-type mappings */ int map_ofw_irq(uint32_t iparent, uint32_t *ispec, size_t icells) { struct intr_rec *existing, *newintr; struct ofw_pic *pic; int irq; /* * Allocate early for locking reasons. Allocation is the usual path * anyway. */ newintr = allocate_irq(); newintr->ispec_type = IRQ_TYPE_OFW; newintr->ispec.ofw_spec.ofw_spec_len = sizeof(ispec[0])*icells; newintr->ispec.ofw_spec.ofw_spec = malloc(sizeof(ispec[0])*icells, M_INTR, M_WAITOK); memcpy(newintr->ispec.ofw_spec.ofw_spec, ispec, newintr->ispec.ofw_spec.ofw_spec_len); newintr->ispec.ofw_iparent = iparent; newintr->vector = ispec[0]; /* Usually the interrupt pin. */ mtx_lock(&pic_list_lock); LIST_FOREACH(pic, &ofw_pic_list, ofw_pic_link) { if (pic->phandle == iparent) break; } if (pic != NULL) newintr->pic = pic->dev; if (pic == NULL) panic("Open Firmware PIC %#d not registered", iparent); rm_wlock(&intr_list_lock); existing = NULL; RB_FOREACH(existing, intr_rec_tree, &intr_list) { if (existing->ispec_type != IRQ_TYPE_OFW) continue; if (existing->ispec.ofw_spec.ofw_iparent != iparent) continue; if (existing->ispec.ofw_spec.ofw_spec_len != newintr->ispec.ofw_spec.ofw_spec_len) continue; if (memcmp(existing->ispec.ofw_spec, ispec, newintr->ispec.ofw_spec_len) == 0) break; } if (existing == NULL) irq = insert_irq(newintr); else irq = existing->vector; rm_wunlock(&intr_list_lock); if (existing != NULL) { /* Already had one */ free(newintr->ofw_spec.ofw_spec); free(newintr, M_INTR); } if (pic != NULL && !existing) PIC_MAP_OFW(pic->dev, irq, ispec, icells); mtx_unlock(&pic_list_lock); return (irq); } void register_pic_ofw(device_t dev, uint32_t phandle) { struct intr_rec *existing; struct ofw_pic *pic; pic = malloc(sizeof(*pic), M_INTR, M_WAITOK); pic->dev = dev; pic->phandle = phandle; mtx_lock(&pic_list_lock); LIST_INSERT_HEAD(pic, &ofw_pic_list, ofw_pic_link); begin_intr_search: /* * Associate any previously-mapped interrupts on this PIC with this * PIC. */ rm_wlock(&intr_list_lock); RB_FOREACH(existing, intr_rec_tree, &intr_list) { if (existing->ispec_type != IRQ_TYPE_OFW) continue; if (existing->ispec.ofw_spec.ofw_iparent == iparent && existing->pic == NULL) { existing->pic = dev; rm_wunlock(&intr_list_lock); /* PIC lock still held */ PIC_MAP_OFW(dev, existing->vec, existing->ispec.ofw_spec.ofw_ispec, existing->ispec.ofw_spec.ofw_icells); #ifdef ALLOW_LATE_PIC_ATTACH PIC_CONFIG(dev, existing->vec, existing->trig, existing->pol); #endif goto begin_intr_search; } } rm_wunlock(&intr_list_lock); mtx_unlock(&pic_list_lock); } void unregister_pic_ofw(device_t dev) { struct intr_rec *existing; struct ofw_pic *pic; mtx_lock(&pic_list_lock); LIST_FOREACH(pic, &ofw_pic_list, ofw_pic_link) { if (pic->dev == dev) break; } if (pic != NULL) { LIST_REMOVE(pic, ofw_pic_link); free(pic, M_INTR); } mtx_unlock(&pic_list_lock); rm_rlock(&intr_list_lock); RB_FOREACH(existing, intr_rec_tree, &intr_list) { if (existing->pic == dev) panic("Removing PIC with existing interrupt"); } rm_runlock(&intr_list_lock); } /* ACPI-type mappings */ int map_acpi_irq(const char *provider, int interrupt, uint8_t polarity, uint8_t trigger) { struct intr_rec *existing, *newintr; char *newprov; struct acpi_pic *pic; int irq; /* * Allocate early for locking reasons. Allocation is the usual path * anyway. */ newintr = allocate_irq(); newintr->ispec_type = IRQ_TYPE_ACPI; newintr->ispec.acpi_spec.acpi_triggering = trigger; newintr->ispec.acpi_spec.acpi_polarity = polarity; newintr->ispec.acpi_spec.acpi_intline = interrupt; newprov = malloc(strlen(provider) + 1, M_INTR, M_WAITOK); strcpy(newprov, provider); newintr->ispec.acpi_spec.acpi_source = newprov; newintr->vector = interrupt; /* Try for this to make dmesg nice */ mtx_lock(&pic_list_lock); LIST_FOREACH(pic, &acpi_pic_list, acpi_pic_link) { if (strcmp(pic->acpi_spec.acpi_source, provider) == 0) break; } if (pic != NULL) newintr->pic = pic->dev; if (pic == NULL) panic("ACPI PIC %s not registered", provider); rm_wlock(&intr_list_lock); existing = NULL; RB_FOREACH(existing, intr_rec_tree, &intr_list) { if (existing->ispec_type != IRQ_TYPE_ACPI) continue; if (strcmp(existing->ispec.acpi_spec.acpi_source, provider) != 0) continue; if (existing->ispec.acpi_spec.acpi_intline == interrupt) break; } if (existing == NULL) irq = insert_irq(newintr); else irq = existing->vector; rm_wunlock(&intr_list_lock); if (existing != NULL) { /* Already had one */ free(newintr->acpi_spec.acpi_source); free(newintr, M_INTR); } if (pic != NULL && !existing) PIC_MAP_ACPI(pic->dev, irq, interrupt, polarity, trigger); mtx_unlock(&pic_list_lock); return (irq); } void register_pic_acpi(device_t dev, const char *provider) { struct intr_rec *existing; struct acpi_pic *pic; char *newprov; pic = malloc(sizeof(*pic), M_INTR, M_WAITOK); pic->dev = dev; newprov = malloc(strlen(provider) + 1, M_INTR, M_WAITOK); strcpy(newprov, provider) pic->acpi_source = newprov; mtx_lock(&pic_list_lock); LIST_INSERT_HEAD(pic, &acpi_pic_list, acpi_pic_link); mtx_unlock(&pic_list_lock); begin_intr_search: /* * Associate any previously-mapped interrupts on this PIC with this * PIC. */ rm_wlock(&intr_list_lock); RB_FOREACH(existing, intr_rec_tree, &intr_list) { if (existing->ispec_type != IRQ_TYPE_ACPI) continue; if (strcmp(existing->ispec.acpi_spec.acpi_source, provider) == 0 && existing->pic == NULL) { existing->pic = dev; rm_wunlock(&intr_list_lock); PIC_MAP_ACPI(dev, existing->vec, existing->ispec.acpi_spec.acpi_intline, existing->ispec.acpi_spec.acpi_polarity, existing->ispec.acpi_spec.acpi_triggering); #ifdef ALLOW_LATE_PIC_ATTACH PIC_CONFIG(dev, existing->vec, existing->trig, existing->pol); #endif goto begin_intr_search; } } rm_wunlock(&intr_list_lock); } void unregister_pic_acpi(device_t dev) { struct intr_rec *existing; struct acpi_pic *pic; mtx_lock(&pic_list_lock); LIST_FOREACH(pic, &acpi_pic_list, acpi_pic_link) { if (pic->dev == dev) break; } if (pic != NULL) { LIST_REMOVE(pic, acpi_pic_link); free(pic->acpi_source, M_INTR); free(pic, M_INTR); } mtx_unlock(&pic_list_lock); rm_rlock(&intr_list_lock); RB_FOREACH(existing, intr_rec_tree, &intr_list) { if (existing->pic == dev) panic("Removing PIC with existing interrupt"); } rm_runlock(&intr_list_lock); } void dispatch_intr(u_int vector, struct trapframe *tf) { struct intr_rec *i; struct intr_event *ie; i = find_irq(vector); KASSERT(i != NULL, ("Dispatching unmapped vector %d", vector)); ie = i->event; KASSERT(ie != NULL, ("%s: interrupt without an event", __func__)) intr_event_handle(ie, tf); /* XXX strays? */ } --------------B749752E283B3662897D27B5-- From owner-freebsd-arch@freebsd.org Sun Jul 31 14:28:13 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1676ABA9A61; Sun, 31 Jul 2016 14:28:13 +0000 (UTC) (envelope-from onwahe@gmail.com) Received: from mail-it0-f45.google.com (mail-it0-f45.google.com [209.85.214.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D50CC15B1; Sun, 31 Jul 2016 14:28:12 +0000 (UTC) (envelope-from onwahe@gmail.com) Received: by mail-it0-f45.google.com with SMTP id f6so225997255ith.1; Sun, 31 Jul 2016 07:28:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=Pj2jQxUWLmt2WZq9vCHqAI5q26TIvxQmEsRZ8HfXsZs=; b=bJdupqX9gVFXsae0OdwLaHcKMqW+ZSGaOVmREyse1wYoiD1Hy35t7kc5DgcLQ+pvIg +pHmcryK5o58MCaw/yf/jgPG49+xoGcmmcnRLHmwhNBpEwfe0UayM6tMfn44LRH3gRmZ 13e2IdCxO3CPR0s/jl6fAn+nnhOAJfVji+OB0+tiT8JYgQPcqv2Xwtdw6XTMAtVJGtai 2qlGJVgA8Nb5ehQBFp3s3fqceutAcjYNPTGjtWDF7unpN1w2UNdOBtUE3w0bB9AmV0KI c8P5ucZigUfJA/JsasIQt16KvvqnPwOwimazHNTIlaX/jxeTV5V0rCwS3iNapvFDSdeu Oo3Q== X-Gm-Message-State: AEkoouvm0PKX7lRp8fI3EVMZcK19DdGcTe7trNjHC6s5qM3973ceg9zp6dqsmV/FDnsLfw== X-Received: by 10.36.110.66 with SMTP id w63mr43498478itc.56.1469974936138; Sun, 31 Jul 2016 07:22:16 -0700 (PDT) Received: from mail-io0-f174.google.com (mail-io0-f174.google.com. [209.85.223.174]) by smtp.gmail.com with ESMTPSA id d71sm11579550ioj.33.2016.07.31.07.22.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 31 Jul 2016 07:22:15 -0700 (PDT) Received: by mail-io0-f174.google.com with SMTP id q83so167181531iod.1; Sun, 31 Jul 2016 07:22:15 -0700 (PDT) X-Received: by 10.107.26.16 with SMTP id a16mr52149745ioa.30.1469974935361; Sun, 31 Jul 2016 07:22:15 -0700 (PDT) MIME-Version: 1.0 Received: by 10.64.162.75 with HTTP; Sun, 31 Jul 2016 07:22:13 -0700 (PDT) In-Reply-To: <8efe4828-ab2a-02a6-902b-8614b1f4b24e@freebsd.org> References: <201606051620.u55GKD5S066398@repo.freebsd.org> <4e7a3e8f-cc21-f5f2-e3e0-4dbd554a4cd0@freebsd.org> <5794720F.4050303@FreeBSD.org> <8bfd8668-bc49-e109-e610-b5cd470be3ec@freebsd.org> <57950005.6070403@FreeBSD.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <8efe4828-ab2a-02a6-902b-8614b1f4b24e@freebsd.org> From: Svatopluk Kraus Date: Sun, 31 Jul 2016 16:22:13 +0200 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: INTRNG (Was: svn commit: r301453....) To: Nathan Whitehorn Cc: Michal Meloun , "freebsd-arm@freebsd.org" , "freebsd-arch@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Jul 2016 14:28:13 -0000 Unfortunatelly, I have no time now to read back all this message thread and related ones. Nevertheless, I found time to summarize the main ideas about INTRNG design related to this discussion. Svata ----------------------------------------------------------- INTRNG is designed with three main ideas: (1) An interrupt is identified by one and only unique number. (2) This unique number is assigned only to registered interrupt. (3) The framework does not know any interrupt mapping encoding semantics. A general intr_map_irq() serves to get this unique number for an interrupt mapping. As INTRNG itself does not know any interrupt mapping encoding semantics, it pushes the request to a PIC. Of course, a PIC specification must be provided as arguments to this function. A PIC is specified by either device_t or/and xref (intptr_t). It means that such PIC should be already registered in INTRNG. The idea that INTRNG does not know any interrupt mapping encoding semantics is crucial. It makes INTRNG true general framework. Any interrupt mapping is fully transparent for INTRNG. Any new interrupt mapping semantics can be added without INTRNG concern. That said, centralized interrupt mapping table is no concern for INTRNG as INTRNG works with its interrupt numbers, not with numbers associated with something else. In general, an interrupt mapping table can be established anywhere and clients of this table may use indexes to this table as interrupt specifiers. But if INTRNG bus interface functions are used, a translation to INTRNG interrupt numbers must be done (see paragraph above about intr_map_irq()). In fact, this is a natural case of bridges where interrupts must be remapped. The main reasons why INTRNG does not use a centralized interrupt mapping table instead of an interrupt table are the following: (1) As not only one interrupt mappings can exist for an interrupt in the very same time, it's far more simple, sane, and framework like. One interrupt, one interrupt number, one data associated, no need to know any interrupt mapping semantics. (2) An interrupt mapping is associated with consumer driver and belongs to this driver or some of its parent driver. It's not both sane and newbus like to store an interrupt mapping in some third party - in INTRNG in this case. ----------------------------------------------------------- From owner-freebsd-arch@freebsd.org Sun Jul 31 14:54:39 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AEE19BA9363; Sun, 31 Jul 2016 14:54:39 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9466A15B2; Sun, 31 Jul 2016 14:54:39 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net (75-101-50-44.static.sonic.net [75.101.50.44]) (authenticated bits=0) by d.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u6VEsaOF003749 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Sun, 31 Jul 2016 07:54:36 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Svatopluk Kraus References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57950005.6070403@FreeBSD.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <8efe4828-ab2a-02a6-902b-8614b1f4b24e@freebsd.org> Cc: "freebsd-arm@freebsd.org" , Michal Meloun , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: <17f46646-95b7-3bcd-f652-942c802f7cf7@freebsd.org> Date: Sun, 31 Jul 2016 07:54:36 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVbhWuT1q6TGz1sR6C7THvdIp5QWRpLpclfbQOJbhHXV1+oYyy/lalE9IO7Ttd+aa5lUutAdpEhZAorif4r1XwcnWUVlZFSepZ0= X-Sonic-ID: C;Ip7Osi5X5hGDHa/hcgQksw== M;MlA0sy5X5hGDHa/hcgQksw== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Jul 2016 14:54:39 -0000 On 07/31/16 07:22, Svatopluk Kraus wrote: > Unfortunatelly, I have no time now to read back all this message > thread and related ones. Nevertheless, I found time to summarize the > main ideas about INTRNG design related to this discussion. > > Svata Thank you for the summary. It would be great if you could look at the wiki page at https://wiki.freebsd.org/Complicated_Interrupts at some point and see how it corresponds to this. In particular, there are five cases at the end that I cannot figure out to implement with your API and that are essential to support PowerPC. I have a few comments inline below. -Nathan > > ----------------------------------------------------------- > > INTRNG is designed with three main ideas: > > (1) An interrupt is identified by one and only unique number. > (2) This unique number is assigned only to registered interrupt. > (3) The framework does not know any interrupt mapping encoding semantics. This is also true of the previous system; these are good design principles. > A general intr_map_irq() serves to get this unique number for an > interrupt mapping. As INTRNG itself does not know any interrupt > mapping encoding semantics, it pushes the request to a PIC. Of course, > a PIC specification must be provided as arguments to this function. A > PIC is specified by either device_t or/and xref (intptr_t). It means > that such PIC should be already registered in INTRNG. > > The idea that INTRNG does not know any interrupt mapping encoding > semantics is crucial. It makes INTRNG true general framework. Any > interrupt mapping is fully transparent for INTRNG. Any new interrupt > mapping semantics can be added without INTRNG concern. > > That said, centralized interrupt mapping table is no concern for > INTRNG as INTRNG works with its interrupt numbers, not with numbers > associated with something else. All of this, again, is true of the old code. You provide a PIC specification and data meaningful to that PIC, you get back an abstract IRQ. The core interrupt code neither knows nor cares about what that information means. > > In general, an interrupt mapping table can be established anywhere and > clients of this table may use indexes to this table as interrupt > specifiers. But if INTRNG bus interface functions are used, a > translation to INTRNG interrupt numbers must be done (see paragraph > above about intr_map_irq()). In fact, this is a natural case of > bridges where interrupts must be remapped. Yes, of course. > The main reasons why INTRNG does not use a centralized interrupt > mapping table instead of an interrupt table are the following: > > (1) As not only one interrupt mappings can exist for an interrupt in > the very same time, it's far more simple, sane, and framework like. > One interrupt, one interrupt number, one data associated, no need to > know any interrupt mapping semantics. > > (2) An interrupt mapping is associated with consumer driver and > belongs to this driver or some of its parent driver. It's not both > sane and newbus like to store an interrupt mapping in some third party > - in INTRNG in this case. I agree with this completely. The sticking point is whether you associate PIC-specific interrupt data with the interrupt when the bus parent enumerates the bus or when resources are allocated with bus_alloc_resource(). Doing it at enumeration time means that there is a robust relationship between what the interrupt specifier the bus parent meant to assign and what actually *is* assigned and fixes a large number of issues where interrupts are enumerated by code over which the bus parent does not have full control: its children, the MI PCI code, etc. Doing it when the resources are allocated, as r301453 does, makes it much more fragile (as well as a little confusing, since what interrupts refer to is only set long after they are allocated). For example, I have no idea how to implement PCIB_ROUTE_INTERRUPT(), and so non-MSI PCI interrupts, with this framework and MSI interrupts work only in simple cases. This is a critical thing: It means I can't support PCI. I could hack around the issue with a mapping table and API that records which abstract OFW IRQ meant what specifier and iparent and put that in, say, nexus, using it to do the right thing when bus_alloc_resource() is later called on those specifiers -- but we are immediately back to the pre-310453 API at that point. There are a number of examples (not complete, but illustrative) of other such cases on the referenced wiki page. -Nathan > ----------------------------------------------------------- > _______________________________________________ > freebsd-arch@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@freebsd.org Sun Jul 31 15:40:22 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B465BBA9F22; Sun, 31 Jul 2016 15:40:22 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from mail.miracle.cz (mail.miracle.cz [193.84.128.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.miracle.cz", Issuer "Miracle Group Root CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6EB651224; Sun, 31 Jul 2016 15:40:21 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from [193.84.128.50] (meloun.ad.miracle.cz [193.84.128.50]) by mail.miracle.cz (Postfix) with ESMTPSA id B6BBC3AC81; Sun, 31 Jul 2016 17:40:18 +0200 (CEST) Subject: Re: INTRNG (Was: svn commit: r301453....) To: Nathan Whitehorn References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57950005.6070403@FreeBSD.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Michal Meloun X-Enigmail-Draft-Status: N1110 Message-ID: <579E1BE2.7020500@FreeBSD.org> Date: Sun, 31 Jul 2016 17:40:18 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.miracle.cz); Sun, 31 Jul 2016 17:40:18 +0200 (CEST) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Jul 2016 15:40:22 -0000 Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): [snip] > >> The current API is certainly a bit of a hack and I would be pleased to >>> see it go; but the replacement needs to be more functional and more >>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>> to this architecture. >> Yes, at this time, I agree. If you can accept enhancement of >> (owf_)bus_map_intr() then we can move to next, significantly less >> hackish, state. > > OK, sure. I wrote some code this afternoon (attached) to sketch this. > It does not compile or run or anything useful, but it should > illustrate the important parts of what I think is an API that should > work for everyone. I'm happy to finish it if it looks reasonable -- I > may in any case just to replace PowerPC's increasingly teetering > intr_machdep.c. > > The OF part is essentially unchanged from how it currently works: > there's a method that you pass the interrupt specifier and interrupt > parent to, and it spits back an IRQ that means that combination of > things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current > API, can be implemented in terms of it in one line. Whenever the > relevant PIC attaches, it is told to use the given IRQ to refer to > that opaque bytestring. > > I've added a set of equivalent functions for ACPI that take the > contents of an ACPI interrupt specifier and do the same things. It > tries to have the IRQ be human-meaningful, if possible, by matching > the ACPI interrupt number. Having separate functions seemed a little > cleaner than exposing the enums and unions as part of the KPI, but I'm > not wedded to it -- this is just an example. > > PICs register (and deregister) themselves with either an OF phandle or > an ACPI resource string or (god help us) both as needed. The PICs have > corresponding methods for interpreting various kinds of interrupt > specifiers. > > Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to > succeed before the PICs are attached is gated on an #ifdef. That can > probably be off by default on ARM. A PowerPC version of this code > would have to keep it on until various bus drivers are fixed. > > Here's the general flow: > - Parent bus enumerates children, maps IRQs as needed, adds to > resource list. The struct resources involved aren't special (just a > single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to > implement (and already are implemented, in general, for OF/FDT drivers). > - bus_alloc_resource() does nothing special > - bus_setup_intr() calls into the PIC and does what is needed, as in > r301453 (to within that #ifdef) > > This should keep all the best pieces of everything: > - ACPI and OF are supported together, and easy to extend for other > types of mapping technologies without breaking either KBI or KPI (just > add new functions) > - No APIs change for existing Open Firmware code > - The support code can live entirely in machine-dependent directories, > with no code changes at all required in kern/ or dev/ofw/ > - bus_setup_intr() and friends behave as in r301453 and succeed or > fail for real > - PCI code is not an issue > - No disconnected interrupt state maintained in mapping table (unless > the transitional #ifdef is on) and the mapping table content is only > larger than r301453 by having a copy of the original interrupt specifier. > > If and when we manage to fix the kernel APIs to allow non-scalar > interrupt specifiers, this should also be easy to gradually > transmogrify to support that case since there exists a 1:1 mapping of > scalar IRQs to rich data and the control flow would be identical: > interrupt specifiers are read at enumeration time and a resulting > handle -- struct resource or scalar IRQ -- used afterward to refer to it. > Nice, nice, i think that we are very close at this point. The problem with the above workflow is that maps IRQ function is called at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at this time, PICs are not exist. Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping table, it doesn't provide this functionality. But yes, i understand that mapping table is important and necessary for you. So, if i can extend our concept a little bit, then: We can add new table (map_data_tbl for example) that holds a copy of the original interrupt specifier, index to irq_sources table and probably some flags. And we can modify the above workflow to: - Parent bus enumerates children, allocates entries from map_data_tbl, adds to resource list. The struct resources involved aren't special (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to implement (and already are implemented, in general, for OF/FDT drivers). Index to map_data_tbl is used as resource ID. - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, *maps IRQs* and stores result in 'index' field. - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls into the PIC and does what is needed, as in r301453. (to within that #ifdef) And, in PPC case, newly attached PIC scans whole map_data_tbl table, finds his entries and makes what needs. Does that make sense? I hope that postponing of map IRQ call is not a problem for PPC, everything else looks easy. Moreover, this additional level of indirection also solves all my 'hypothetical corner cases' with N:1 mappings (between original interrupt specifier and real interrupt number). > Some more comments below. > >>> The ideal API would be one in which the resource enumeration code >>> could assign, instead of numbers, some structured information that >>> contains the full firmware specifier (string parent + interrupt pin >>> for ACPI, interrupt parent phandle + specifier for device tree, bare >>> numbers for simple systems). >> Yes, exactly. I am happy that at least we agree on something. The only >> difference is that we want triplet > specifier> or > > Which makes perfect sense and is a good idea. > >>> That seemed, and seems, too invasive. I think we agree on this and >>> have chosen to approximate that API in two different ways. >> I still think that we found way how we can realize the above idea in >> non-invasive way. >> From my point of view the solution is: >> Put/attach the above triplet to RLE and then copy it (within >> bus_resource_attach() to struct resource. >> But yes, I understand that I am not able to explain clearly enough. > > I think we were just talking past eachother somehow. > > Anyway, my concern about putting this in struct resource * at > bus_alloc_resource() time is basically that there are a whole bunch of > cases in which (a) it's hard to reverse engineer which interrupt from > the device tree (or whatever) was meant to correspond to a particular > arbitrary IRQ in an allocation request unless you keep logs and (b) > there are a bunch of cases, mostly involving PCI, where interrupt > resource allocation doesn't proceed through struct resource * at all, > so you pretty much have to keep logs. Once you start keeping logs of > which IRQ corresponds to which specifier at the device level, you > might as well just do it once in a centralized place. Oki, i agree. > >> >>> When we designed the interrupt mapping code, we evaluated a large >>> number of possible approaches. Something very much like this was one >>> of them. It was rejected as being too fragile (it requires resource >>> allocation and enumeration to be coupled) and to have too many corner >>> cases (the PCI MSI and LSI APIs, for instance). The interrupt mapping >>> stuff seemed at the time, and so far still seems, like the least-bad >>> approximation to the ideal case: it is essentially that ideal API, in >>> that it happens at bus enumeration and involves just assigning the >>> firmware specifiers, but with lookup keys to that information stuffed >>> into the 32-bit numbers that the rest of the system uses. >> The raw PCI (or MSIX) case is relatively simple. The PCI domain uses >> raw numbers and it's host-to-pci bridge job to translate raw numbers >> from PCI domain to (for example) OFW based resource. >> At this point, I'm still not sure about MSI... > > Right, you can keep a big table in the PCI driver that maps whatever > IRQs you are handing out to some richer interrupt specifiers and > consult that when you get a bus_alloc_resource() request. Which is > basically the approach I'm advocating, except that the table is > centralized, which reduces code duplication and fixes a number of > weird corner cases where children assign extra interrupts to > themselves, etc. and the bus parent has no ability to do something > sensible. > > Please take a look at the attached interface and see if it (or > something like it) meets your needs. It meets all of mine, at least, > and I think fixes all the things you were concerned about. Can you, please, give me also example for GPIO (these are identified by pair). Or, in other words, GPIO needs his own sets of functions? And thanks for your effort and patience, Michal From owner-freebsd-arch@freebsd.org Sun Jul 31 18:43:26 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CF766BAA253; Sun, 31 Jul 2016 18:43:26 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from c.mail.sonic.net (c.mail.sonic.net [64.142.111.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B0C6E173F; Sun, 31 Jul 2016 18:43:26 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net (75-101-50-44.static.sonic.net [75.101.50.44]) (authenticated bits=0) by c.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u6VIhN9G004097 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Sun, 31 Jul 2016 11:43:23 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Michal Meloun References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57950005.6070403@FreeBSD.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> Date: Sun, 31 Jul 2016 11:43:23 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <579E1BE2.7020500@FreeBSD.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVZvVQZDE005RznR1LrYjBmC0ls8/5+kiod/s4Zcd7U+W9XBY02KiwpryW8xl2M4AQJAJYZCOuOJsr3Nh3yxbD936UULR3sDrWw= X-Sonic-ID: C;fGfMqE5X5hG4haDx2xNB0g== M;uHosqU5X5hG4haDx2xNB0g== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Jul 2016 18:43:26 -0000 On 07/31/16 08:40, Michal Meloun wrote: > Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): > [snip] >>> The current API is certainly a bit of a hack and I would be pleased to >>>> see it go; but the replacement needs to be more functional and more >>>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>>> to this architecture. >>> Yes, at this time, I agree. If you can accept enhancement of >>> (owf_)bus_map_intr() then we can move to next, significantly less >>> hackish, state. >> OK, sure. I wrote some code this afternoon (attached) to sketch this. >> It does not compile or run or anything useful, but it should >> illustrate the important parts of what I think is an API that should >> work for everyone. I'm happy to finish it if it looks reasonable -- I >> may in any case just to replace PowerPC's increasingly teetering >> intr_machdep.c. >> >> The OF part is essentially unchanged from how it currently works: >> there's a method that you pass the interrupt specifier and interrupt >> parent to, and it spits back an IRQ that means that combination of >> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current >> API, can be implemented in terms of it in one line. Whenever the >> relevant PIC attaches, it is told to use the given IRQ to refer to >> that opaque bytestring. >> >> I've added a set of equivalent functions for ACPI that take the >> contents of an ACPI interrupt specifier and do the same things. It >> tries to have the IRQ be human-meaningful, if possible, by matching >> the ACPI interrupt number. Having separate functions seemed a little >> cleaner than exposing the enums and unions as part of the KPI, but I'm >> not wedded to it -- this is just an example. >> >> PICs register (and deregister) themselves with either an OF phandle or >> an ACPI resource string or (god help us) both as needed. The PICs have >> corresponding methods for interpreting various kinds of interrupt >> specifiers. >> >> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >> succeed before the PICs are attached is gated on an #ifdef. That can >> probably be off by default on ARM. A PowerPC version of this code >> would have to keep it on until various bus drivers are fixed. >> >> Here's the general flow: >> - Parent bus enumerates children, maps IRQs as needed, adds to >> resource list. The struct resources involved aren't special (just a >> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >> implement (and already are implemented, in general, for OF/FDT drivers). >> - bus_alloc_resource() does nothing special >> - bus_setup_intr() calls into the PIC and does what is needed, as in >> r301453 (to within that #ifdef) >> >> This should keep all the best pieces of everything: >> - ACPI and OF are supported together, and easy to extend for other >> types of mapping technologies without breaking either KBI or KPI (just >> add new functions) >> - No APIs change for existing Open Firmware code >> - The support code can live entirely in machine-dependent directories, >> with no code changes at all required in kern/ or dev/ofw/ >> - bus_setup_intr() and friends behave as in r301453 and succeed or >> fail for real >> - PCI code is not an issue >> - No disconnected interrupt state maintained in mapping table (unless >> the transitional #ifdef is on) and the mapping table content is only >> larger than r301453 by having a copy of the original interrupt specifier. >> >> If and when we manage to fix the kernel APIs to allow non-scalar >> interrupt specifiers, this should also be easy to gradually >> transmogrify to support that case since there exists a 1:1 mapping of >> scalar IRQs to rich data and the control flow would be identical: >> interrupt specifiers are read at enumeration time and a resulting >> handle -- struct resource or scalar IRQ -- used afterward to refer to it. >> > Nice, nice, i think that we are very close at this point. > The problem with the above workflow is that maps IRQ function is called > at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at > this time, PICs are not exist. > Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping > table, it doesn't provide this functionality. > But yes, i understand that mapping table is important and necessary for you. > > So, if i can extend our concept a little bit, then: > We can add new table (map_data_tbl for example) that holds a copy of > the original interrupt specifier, index to irq_sources table and > probably some flags. > And we can modify the above workflow to: > - Parent bus enumerates children, allocates entries from map_data_tbl, > adds to resource list. The struct resources involved aren't special > (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are > trivial to implement (and already are implemented, in general, for > OF/FDT drivers). Index to map_data_tbl is used as resource ID. > - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, > *maps IRQs* and stores result in 'index' field. > - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls > into the PIC and does what is needed, as in r301453. (to within that > #ifdef) > > And, in PPC case, newly attached PIC scans whole map_data_tbl table, > finds his entries and makes what needs. > > Does that make sense? > I hope that postponing of map IRQ call is not a problem for PPC, > everything else looks easy. > Moreover, this additional level of indirection also solves all my > 'hypothetical corner cases' with N:1 mappings (between original > interrupt specifier and real interrupt number). Yes, I think we are converging. My qualm about bus_alloc_resource() is twofold: 1. Traditionally, with interrupts, bus_alloc_resource() has no side effects and I'm not sure it propagates cleanly down the tree in all cases. 2. T. , x n nn b n /here are a few bus APIs (bus_config_intr() comes to mind) that use raw IRQ IDs and, so far as I know, can be, and sometimes are called before bus_alloc_resource(). If the PIC doesn't know about the IRQ yet when bus_config_intr() (etc.) is called, things will just break. So you would need to make sure that any bus method handling a resource ID causes it to be mapped on the PIC at that time. It's OK if that happens in bus_alloc_resource() so long as bus_config_intr(), bus_setup_intr(), etc. cause that to happen if it hasn't happened yet -- I don't care when these calls are made to the PIC driver so long as *what* calls will be made is set at enumeration time. I am also totally fine with having, on ARM, bus_config_intr(), bus_setup_intr() etc. fail if the PIC hasn't attached yet (On PowerPC, we can't do that yet, but after this conversation, I regard that as a bug and would fix that later), as well as delaying setup on the PIC to the first time any bus driver actually tries to *use* the interrupt (alloc_resource(), setup_intr(), config_intr(), whatever) rather than when the interrupt is originally allocated. ------------ The following is a large parenthesis ------------------- One other possible route here that would also work well would be to make ofwbus.c MD (it's a trivial piece of code anyway, so we don't gain a lot by sharing it) and implement ofw_bus_map_intr() locally as an ofwbus bus method. Then you could have the mapping table stored in ofwbus's softc -- the API was designed for this initially. You would need MD extensions for doing PIC registration there (which is fine), but that segregates all the OFW-specific information into OFW-specific code and would let the bus methods and the OFW interrupt mapping table interact cleanly in the same place. This still preserves the pre-r301453 API, makes PowerPC work, and maybe address's skra@'s concern about extensibility and letting core interrupt code know about FDT (or ACPI). I'd be happy to mock this up as well if you think it's a good approach. So you would have something in arm/arm/ofwbus.c like (pseudo-code, missing unlocks, etc.): struct ofwbus_softc { (list of PICs) (list of IRQ mapping from # to specifier, iparent, and device_t) } static int ofw_bus_map_if_unmapped(struct ofwbus_softc *sc, int irq) { struct ofw_bus_irq_mapping *i; struct ofw_bus_pic *j; int err = 0; mtx_lock(sc->ofw_bus_irqmap_lock); LIST_FOREACH(i, sc->irq_mappings) { if (i->irq == irq) break; } if (i == NULL) return (ENXIO); /* PANIC? */ if (i->dev != NULL) return (0); /* Mapped already */ LIST_FOREACH(j, sc->ofw_bus_registered_pics) { if (j->phandle == i->phandle) { arm_intr_set_pic(irq, j->dev); err = PIC_SET_OFW_ISPEC(j->dev, i->ispec, i->icells); return (err); } } return (ENXIO); /* No matching PIC. Panic? */ } static int ofw_bus_setup_intr(device_t dev, device_t child, int irq, blah...) { int err; /* This section at the beginning of anything allocating/using/modifying interrupts */ err = ofw_bus_map_if_unmapped(sc, irq); if (err != 0) /* Explode if the PIC isn't online yet return (err); bus_setup_intr(dev, child, irq, blah...); /* onward to nexus */ } This has the following features: - Existing OFW API and semantics unchanged - As such, PowerPC, PCI, etc. work fine with no changes - Details encapsulated in MD code, so individual platforms can implement this however they like - arm/arm/intr.c (or whatever) only needs a method to allocate a fresh interrupt, with no state, and anoter to set the device_t for an interrupt sometime later. - The internal table in the platform interrupt code has no knowledge of any mappings whatsoever except having the appropriate device_t for the PIC stored with the interrupt vector. - Device tree bits handled purely in device tree code - No action need be taken on any mapping until the interrupt is actually allocated/set up, like r301453 - Easy to add more mapping mechanisms (e.g. ACPI) by having similar enumeration-mechanism-specific code in the root bus for that mapping mechanism. -------------- End parenthesis ------------------------------- > `/ > >> Some more comments below. >> >>>> The ideal API would be one in which the resource enumeration code >>>> could assign, instead of numbers, some structured information that >>>> contains the full firmware specifier (string parent + interrupt pin >>>> for ACPI, interrupt parent phandle + specifier for device tree, bare >>>> numbers for simple systems). >>> Yes, exactly. I am happy that at least we agree on something. The only >>> difference is that we want triplet >> specifier> or >> Which makes perfect sense and is a good idea. >> >>>> That seemed, and seems, too invasive. I think we agree on this and >>>> have chosen to approximate that API in two different ways. >>> I still think that we found way how we can realize the above idea in >>> non-invasive way. >>> From my point of view the solution is: >>> Put/attach the above triplet to RLE and then copy it (within >>> bus_resource_attach() to struct resource. >>> But yes, I understand that I am not able to explain clearly enough. >> I think we were just talking past eachother somehow. >> >> Anyway, my concern about putting this in struct resource * at >> bus_alloc_resource() time is basically that there are a whole bunch of >> cases in which (a) it's hard to reverse engineer which interrupt from >> the device tree (or whatever) was meant to correspond to a particular >> arbitrary IRQ in an allocation request unless you keep logs and (b) >> there are a bunch of cases, mostly involving PCI, where interrupt >> resource allocation doesn't proceed through struct resource * at all, >> so you pretty much have to keep logs. Once you start keeping logs of >> which IRQ corresponds to which specifier at the device level, you >> might as well just do it once in a centralized place. > Oki, i agree. o Great, I'm glad we're on the same page. >>> gd ` >>>> When we designed the interrupt mapping code, we evaluated a large >>>> number of possible approaches. Something very much like this was one >>>> of them. It was rejected as being too fragile (it requires resource >>>> allocation and enumeration to be coupled) and to have too many corner >>>> cases (the PCI MSI and LSI APIs, for instance). The interrupt mapping >>>> stuff seemed at the time, and so far still seems, like the least-bad >>>> approximation to the ideal case: it is essentially that ideal API, in >>>> that it happens at bus enumeration and involves just assigning the >>>> firmware specifiers, but with lookup keys to that information stuffed >>>> into the 32-bit numbers that the rest of the system uses. >>> The raw PCI (or MSIX) case is relatively simple. The PCI domain uses >>> raw numbers and it's host-to-pci bridge job to translate raw numbers >>> from PCI domain to (for example) OFW based resource. >>> At this point, I'm still not sure about MSI... >> Right, you can keep a big table in the PCI driver that maps whatever >> IRQs you are handing out to some richer interrupt specifiers and >> consult that when you get a bus_alloc_resource() request. Which is >> basically the approach I'm advocating, except that the table is >> centralized, which reduces code duplication and fixes a number of >> weird corner cases where children assign extra interrupts to >> themselves, etc. and the bus parent has no ability to do something >> sensible. >> >> Please take a look at the attached interface and see if it (or >> something like it) meets your needs. It meets all of mine, at least, >> and I think fixes all the things you were concerned about. > Can you, please, give me also example for GPIO (these are identified by > pair). > Or, in other words, GPIO needs his own sets of functions? You have more experience with GPIOs than I do, but I could imagine a couple different ways of doing this for GPIO interrupts not in standard interrupts properties: 1. You have an extra interrupt class (like r301453) called "GPIO" or something that either takes a OFW/ACPI handle or a device_t and a GPIO specifier. 2. You have some global translator function that makes an OFW/ACPI interrupt specifier out of a GPIO property. I don't like this one. 3. You add a method to the GPIO controller (GPIO_GET_INTERRUPT_FOR_PIN or something) that does something driver specific (e.g. checks if you can have such an interrupt, then fabricates an OF- or ACPI-compatible specifier and maps it) to return a usable IRQ or an error. This also abstracts out how you want to describe the GPIO interrupt domain. Aside from not liking #2, I don't have strong opinions. #1 and #3 aren't mutually exclusive (#3 could be implemented in terms of #1), and #3 seems a little cleaner, so I would have a weak preference for #3 as the canonical mechanism. But anything is OK, really. > And thanks for your effort and patience, Of course -- apologies for being somewhat strident. Thanks for taking the time to discuss this! -Nathan > Michal > > From owner-freebsd-arch@freebsd.org Tue Aug 2 04:00:58 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E1BF7BAC75D; Tue, 2 Aug 2016 04:00:57 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C3DB31779; Tue, 2 Aug 2016 04:00:57 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net ([172.58.136.26]) (authenticated bits=0) by d.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u7240dAU032222 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Mon, 1 Aug 2016 21:00:43 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Michal Meloun References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: Date: Mon, 1 Aug 2016 21:00:37 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> Content-Type: multipart/mixed; boundary="------------B09F4E0CEA3FCACA621BF022" X-Sonic-CAuth: UmFuZG9tSVYN5MKl+H41n8ajRjvdGOkQE8VGoD/VmZXpxyHhNF4SVvgYCM7BQ/S1DAiSzD8cnqftsu2pF7kZe70ehtwuBVerc8oh6Yssfdg= X-Sonic-ID: C;EvrDrGVY5hG9zq/hcgQksw== M;ij3trmVY5hG9zq/hcgQksw== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Aug 2016 04:00:58 -0000 This is a multi-part message in MIME format. --------------B09F4E0CEA3FCACA621BF022 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit On 07/31/16 11:43, Nathan Whitehorn wrote: > > > On 07/31/16 08:40, Michal Meloun wrote: >> Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): >> [snip] >>>> The current API is certainly a bit of a hack and I would be pleased to >>>>> see it go; but the replacement needs to be more functional and more >>>>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>>>> to this architecture. >>>> Yes, at this time, I agree. If you can accept enhancement of >>>> (owf_)bus_map_intr() then we can move to next, significantly less >>>> hackish, state. >>> OK, sure. I wrote some code this afternoon (attached) to sketch this. >>> It does not compile or run or anything useful, but it should >>> illustrate the important parts of what I think is an API that should >>> work for everyone. I'm happy to finish it if it looks reasonable -- I >>> may in any case just to replace PowerPC's increasingly teetering >>> intr_machdep.c. >>> >>> The OF part is essentially unchanged from how it currently works: >>> there's a method that you pass the interrupt specifier and interrupt >>> parent to, and it spits back an IRQ that means that combination of >>> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current >>> API, can be implemented in terms of it in one line. Whenever the >>> relevant PIC attaches, it is told to use the given IRQ to refer to >>> that opaque bytestring. >>> >>> I've added a set of equivalent functions for ACPI that take the >>> contents of an ACPI interrupt specifier and do the same things. It >>> tries to have the IRQ be human-meaningful, if possible, by matching >>> the ACPI interrupt number. Having separate functions seemed a little >>> cleaner than exposing the enums and unions as part of the KPI, but I'm >>> not wedded to it -- this is just an example. >>> >>> PICs register (and deregister) themselves with either an OF phandle or >>> an ACPI resource string or (god help us) both as needed. The PICs have >>> corresponding methods for interpreting various kinds of interrupt >>> specifiers. >>> >>> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >>> succeed before the PICs are attached is gated on an #ifdef. That can >>> probably be off by default on ARM. A PowerPC version of this code >>> would have to keep it on until various bus drivers are fixed. >>> >>> Here's the general flow: >>> - Parent bus enumerates children, maps IRQs as needed, adds to >>> resource list. The struct resources involved aren't special (just a >>> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >>> implement (and already are implemented, in general, for OF/FDT >>> drivers). >>> - bus_alloc_resource() does nothing special >>> - bus_setup_intr() calls into the PIC and does what is needed, as in >>> r301453 (to within that #ifdef) >>> >>> This should keep all the best pieces of everything: >>> - ACPI and OF are supported together, and easy to extend for other >>> types of mapping technologies without breaking either KBI or KPI (just >>> add new functions) >>> - No APIs change for existing Open Firmware code >>> - The support code can live entirely in machine-dependent directories, >>> with no code changes at all required in kern/ or dev/ofw/ >>> - bus_setup_intr() and friends behave as in r301453 and succeed or >>> fail for real >>> - PCI code is not an issue >>> - No disconnected interrupt state maintained in mapping table (unless >>> the transitional #ifdef is on) and the mapping table content is only >>> larger than r301453 by having a copy of the original interrupt >>> specifier. >>> >>> If and when we manage to fix the kernel APIs to allow non-scalar >>> interrupt specifiers, this should also be easy to gradually >>> transmogrify to support that case since there exists a 1:1 mapping of >>> scalar IRQs to rich data and the control flow would be identical: >>> interrupt specifiers are read at enumeration time and a resulting >>> handle -- struct resource or scalar IRQ -- used afterward to refer >>> to it. >>> >> Nice, nice, i think that we are very close at this point. >> The problem with the above workflow is that maps IRQ function is called >> at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at >> this time, PICs are not exist. >> Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping >> table, it doesn't provide this functionality. >> But yes, i understand that mapping table is important and necessary >> for you. >> >> So, if i can extend our concept a little bit, then: >> We can add new table (map_data_tbl for example) that holds a copy of >> the original interrupt specifier, index to irq_sources table and >> probably some flags. >> And we can modify the above workflow to: >> - Parent bus enumerates children, allocates entries from map_data_tbl, >> adds to resource list. The struct resources involved aren't special >> (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are >> trivial to implement (and already are implemented, in general, for >> OF/FDT drivers). Index to map_data_tbl is used as resource ID. >> - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, >> *maps IRQs* and stores result in 'index' field. >> - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls >> into the PIC and does what is needed, as in r301453. (to within that >> #ifdef) >> >> And, in PPC case, newly attached PIC scans whole map_data_tbl table, >> finds his entries and makes what needs. >> >> Does that make sense? >> I hope that postponing of map IRQ call is not a problem for PPC, >> everything else looks easy. >> Moreover, this additional level of indirection also solves all my >> 'hypothetical corner cases' with N:1 mappings (between original >> interrupt specifier and real interrupt number). > > Yes, I think we are converging. > > My qualm about bus_alloc_resource() is twofold: > 1. Traditionally, with interrupts, bus_alloc_resource() has no side > effects and I'm not sure it propagates cleanly down the tree in all > cases. > 2. T. , x n nn b > n /here are a few bus APIs (bus_config_intr() comes to mind) that > use raw IRQ IDs and, so far as I know, can be, and sometimes are > called before bus_alloc_resource(). If the PIC doesn't know about the > IRQ yet when bus_config_intr() (etc.) is called, things will just break. > > So you would need to make sure that any bus method handling a resource > ID causes it to be mapped on the PIC at that time. It's OK if that > happens in bus_alloc_resource() so long as bus_config_intr(), > bus_setup_intr(), etc. cause that to happen if it hasn't happened yet > -- I don't care when these calls are made to the PIC driver so long as > *what* calls will be made is set at enumeration time. > > I am also totally fine with having, on ARM, bus_config_intr(), > bus_setup_intr() etc. fail if the PIC hasn't attached yet (On PowerPC, > we can't do that yet, but after this conversation, I regard that as a > bug and would fix that later), as well as delaying setup on the PIC to > the first time any bus driver actually tries to *use* the interrupt > (alloc_resource(), setup_intr(), config_intr(), whatever) rather than > when the interrupt is originally allocated. > > ------------ The following is a large parenthesis ------------------- > > One other possible route here that would also work well would be to > make ofwbus.c MD (it's a trivial piece of code anyway, so we don't > gain a lot by sharing it) and implement ofw_bus_map_intr() locally as > an ofwbus bus method. Then you could have the mapping table stored in > ofwbus's softc -- the API was designed for this initially. You would > need MD extensions for doing PIC registration there (which is fine), > but that segregates all the OFW-specific information into OFW-specific > code and would let the bus methods and the OFW interrupt mapping table > interact cleanly in the same place. This still preserves the > pre-r301453 API, makes PowerPC work, and maybe address's skra@'s > concern about extensibility and letting core interrupt code know about > FDT (or ACPI). I'd be happy to mock this up as well if you think it's > a good approach. > [snip] > > This has the following features: > - Existing OFW API and semantics unchanged > - As such, PowerPC, PCI, etc. work fine with no changes > - Details encapsulated in MD code, so individual platforms can > implement this however they like > - arm/arm/intr.c (or whatever) only needs a method to allocate a fresh > interrupt, with no state, and anoter to set the device_t for an > interrupt sometime later. > - The internal table in the platform interrupt code has no knowledge > of any mappings whatsoever except having the appropriate device_t for > the PIC stored with the interrupt vector. > - Device tree bits handled purely in device tree code > - No action need be taken on any mapping until the interrupt is > actually allocated/set up, like r301453 > - Easy to add more mapping mechanisms (e.g. ACPI) by having similar > enumeration-mechanism-specific code in the root bus for that mapping > mechanism. > > -------------- End parenthesis ------------------------------- Here's an implementation of the parenthesis I wrote on an airplane this afternoon. It should be complete, though has not been tested. The code is short and simple (+70 lines in ofwbus.c). This preserves the pre-r301453 API and semantics relative to drivers, which means PowerPC and PCI work out of the box, while keeping the semantics relative to the interrupt layer of r301453 (PIC methods only called on resource allocation, no allocatable IRQs on unattached PICs, encapsulation of OFW-specific code in OFW-specific bits of the tree). It turns out those two things are compatible, somewhat to my surprise, and that makes the result very clean. I like this approach and would be happy to move forward with it. There are five functions of interest: 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you pass an interrupt specifier and parent, you get back an IRQ. No changes. This is the core of the normal OFW interrupt API. 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a new function that PIC drivers are supposed to use to register control of an interrupt domain. This replaces machine-specific code like powerpc_register_pic() to allow the PIC table to be in a bus parent rather than in the interrupt core. 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This is a new function that PIC drivers that know how to handle device-tree interrupt descriptors implement (analogous to various existing ones that vary by platform). It tells the PIC that the given abstract IRQ means the given opaque interrupt specifier. 4. arm_allocate_irq(int suggested). This allocates a new IRQ number not (yet) attached to a PIC, as in r301453. I've added a parameter that lets you pass a suggested number to try in case it is possible to make it match an interrupt pin or something for human-readability. 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD interrupt layer to associate a given PIC device_t with an IRQ. That is all the information the MD layer ever has about the IRQ mapping. Functions #1 and #2 are now implemented completely in ofwbus.c: there are no callouts anywhere else and the interrupt mapping table is maintained entirely internally to ofwbus (in its softc). In order to implement ACPI, or some other strategy, you would implement analogs to functions #1 and #2 that live somewhere in the bus hierarchy that is guaranteed to be above all devices that might want that type of interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers implementing the mapping scheme would provide. Since the system interrupt code has no knowledge at all of interrupt mapping, of any type, in this scheme, adding new mapping types is trivial and can be done on a driver-by-driver basis if necessary without changing KPI and without any other part of the system even being aware. For example, GPIOs can use a completely different mechanism if they want and can do setup purely in the GPIO controller driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a GPIO controller in which the GPIO controller allocates a generic IRQ, assigns through some internal table just in the GPIO driver, and returns to it to a consumer in some other device driver -- without a GPIO mapping type, new bus functions, or modifications to the platform interrupt code. The control flow goes like this: - Bus driver enumerates children, parses interrupts properties, calls OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to interrupt list. - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank disconnected IRQ from the MD interrupt layer, and stores the mapping from the new IRQ to the given interrupt specifier and phandle in an internal table in ofwbus's softc. NB: Nothing else happens here, like post-r301453. Changing this does not change any semantics of the API pre-r301453, which means it remains fully compatible with PCI and PowerPC. Also, like post-r301453, there is no involvement of nexus. - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these messages and adds a (device_t, phandle_t) mapping to a second internal table. Note that the interrupt layer does not need to handle PIC registration anymore at all (except for the root PIC). - Bus child eventually calls a function that tries to set the interrupt up (e.g. bus_setup_intr()). That propagates up the bus hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, looks it up in the table, looks up the appropriate PIC from the PIC table, then: A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the interrupt layer's only interaction with the mapping code. All it deals with is device_ts and abstract IRQ numbers. B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, interrupt-specifier) to tell the PIC that the interrupt layer's IRQ irq_number means the given specifier C) finally, passes the call onto nexus, which will do whatever would normally happen (unmasking the interrupt, setting handlers, etc.) in terms only of the abstract IRQ and the device_t assigned by ofwbus. You would implement ACPI just by doing a s/OFW/ACPI/g search-and-replace above -- since the interrupt layer doesn't know about OFW or ACPI or anything else, there is no need to touch it. This seems clean, simple, compartmentalized, preserves the existing API, and should work on all of our various hardware. PowerPC can't quite work with it yet without some multipass foo, but, since the API is preserved, that transition can happen gradually without KPI changes. For the same reason that it is API-preserving, I think this code is also MFC-able. -Nathan --------------B09F4E0CEA3FCACA621BF022 Content-Type: text/plain; charset=UTF-8; name="ofwbus.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="ofwbus.c" /*- * Copyright 2016 Nathan Whitehorn * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include #include #include #include #include #include #include #include #include #include #include #include /* * The ofwbus forms the root of the OF bus hierarchy and binds the device tree * "/" node to newbus. Since the root node has most of the same bindings and * semantics as the FDT simplebus, this inherits from it. It adds termination * of some resource requests, notably interrupt mapping, that cannot be pushed * further down the bus hierarchy. */ struct ofwbus_intrtabent { int irq; device_t pic; phandle_t iparent; pcell_t *ispec; int icells; SLIST_ENTRY(ofwbus_intrtabent) entry; }; struct ofwbus_pic { phandle_t phandle; device_t dev; SLIST_ENTRY(ofwbus_pic) entry; }; struct ofwbus_softc { struct simplebus_softc simplebus_sc; struct rman intr_rman; struct rman mem_rman; struct mtx intr_tab_lock; SLIST_HEAD(ofwbus_intrtabs, ofwbus_intrtabent) intr_tab; SLIST_HEAD(ofwbus_pics, ofwbus_pic) pic_tab; }; #ifndef __aarch64__ static device_identify_t ofwbus_identify; #endif static device_probe_t ofwbus_probe; static device_attach_t ofwbus_attach; static bus_alloc_resource_t ofwbus_alloc_resource; static bus_adjust_resource_t ofwbus_adjust_resource; static bus_release_resource_t ofwbus_release_resource; static bus_bind_intr_t ofwbus_bind_intr; static bus_config_intr_t ofwbus_config_intr; static bus_setup_intr_t ofwbus_setup_intr; static ofw_bus_map_intr_t ofwbus_ofw_map_intr; static ofw_bus_register_pic_t ofwbus_register_pic; /* Register IRQs with PIC if not done yet */ static int ofwbus_setup_irq_with_pic(device_t dev, int irq); static device_method_t ofwbus_methods[] = { /* Device interface */ #ifndef __aarch64__ DEVMETHOD(device_identify, ofwbus_identify), #endif DEVMETHOD(device_probe, ofwbus_probe), DEVMETHOD(device_attach, ofwbus_attach), /* Bus interface */ DEVMETHOD(bus_alloc_resource, ofwbus_alloc_resource), DEVMETHOD(bus_adjust_resource, ofwbus_adjust_resource), DEVMETHOD(bus_release_resource, ofwbus_release_resource), /* ofw_bus interface */ DEVMETHOD(ofw_bus_map_intr, ofwbus_map_intr) DEVMETHOD(ofw_bus_register_pic, ofwbus_register_pic) DEVMETHOD_END }; DEFINE_CLASS_1(ofwbus, ofwbus_driver, ofwbus_methods, sizeof(struct ofwbus_softc), simplebus_driver); static devclass_t ofwbus_devclass; EARLY_DRIVER_MODULE(ofwbus, nexus, ofwbus_driver, ofwbus_devclass, 0, 0, BUS_PASS_BUS + BUS_PASS_ORDER_MIDDLE); MODULE_VERSION(ofwbus, 1); #ifndef __aarch64__ static void ofwbus_identify(driver_t *driver, device_t parent) { /* Check if Open Firmware has been instantiated */ if (OF_peer(0) == 0) return; if (device_find_child(parent, "ofwbus", -1) == NULL) BUS_ADD_CHILD(parent, 0, "ofwbus", -1); } #endif static int ofwbus_probe(device_t dev) { #ifdef __aarch64__ if (OF_peer(0) == 0) return (ENXIO); #endif device_set_desc(dev, "Open Firmware Device Tree"); return (BUS_PROBE_NOWILDCARD); } static int ofwbus_attach(device_t dev) { struct ofwbus_softc *sc; phandle_t node; struct ofw_bus_devinfo obd; sc = device_get_softc(dev); node = OF_peer(0); /* * If no Open Firmware, bail early */ if (node == -1) return (ENXIO); /* * Allocate root rmans for ofwbus. */ sc->sc_intr_rman.rm_type = RMAN_ARRAY; sc->sc_intr_rman.rm_descr = "Interrupts"; sc->sc_mem_rman.rm_type = RMAN_ARRAY; sc->sc_mem_rman.rm_descr = "Device Memory"; if (rman_init(&sc->sc_intr_rman) != 0 || rman_init(&sc->sc_mem_rman) != 0 || rman_manage_region(&sc->sc_intr_rman, 0, ~0) != 0 || rman_manage_region(&sc->sc_mem_rman, 0, BUS_SPACE_MAXADDR) != 0) panic("%s: failed to set up rmans.", __func__); /* * Init interrupt mapping table */ SLIST_INIT(&sc->pic_tab); SLIST_INIT(&sc->intr_tab); mtx_init(&sc->intr_tab_lock, "ofwbus intr", NULL, MTX_DEF); /* * This is the root of the tree, so simplebus_attach() will fail * when it asks our parent (nexus) for its OFW node. Pass node to * simplebus_init directly. */ simplebus_init(dev, node); /* * Now walk the OFW tree and attach top-level devices. */ for (node = OF_child(node); node > 0; node = OF_peer(node)) simplebus_add_device(dev, node, 0, NULL, -1, NULL); return (bus_generic_attach(dev)); } static struct resource * ofwbus_alloc_resource(device_t bus, device_t child, int type, int *rid, rman_res_t start, rman_res_t end, rman_res_t count, u_int flags) { struct ofwbus_softc *sc; struct rman *rm; struct resource *rv; struct resource_list_entry *rle; int isdefault, passthrough, err; isdefault = RMAN_IS_DEFAULT_RANGE(start, end); passthrough = (device_get_parent(child) != bus); sc = device_get_softc(bus); rle = NULL; if (!passthrough && isdefault) { rle = resource_list_find(BUS_GET_RESOURCE_LIST(bus, child), type, *rid); if (rle == NULL) { if (bootverbose) device_printf(bus, "no default resources for " "rid = %d, type = %d\n", *rid, type); return (NULL); } start = rle->start; count = ummax(count, rle->count); end = ummax(rle->end, start + count - 1); } switch (type) { case SYS_RES_IRQ: rm = &sc->sc_intr_rman; break; case SYS_RES_MEMORY: rm = &sc->sc_mem_rman; break; default: return (NULL); } rv = rman_reserve_resource(rm, start, end, count, flags & ~RF_ACTIVE, child); if (rv == NULL) return (NULL); rman_set_rid(rv, *rid); if ((flags & RF_ACTIVE) != 0 && bus_activate_resource(child, type, *rid, rv) != 0) { rman_release_resource(rv); return (NULL); } if (!passthrough && rle != NULL) { rle->res = rv; rle->start = rman_get_start(rv); rle->end = rman_get_end(rv); rle->count = rle->end - rle->start + 1; } return (rv); } static int ofwbus_adjust_resource(device_t bus, device_t child __unused, int type, struct resource *r, rman_res_t start, rman_res_t end) { struct ofwbus_softc *sc; struct rman *rm; device_t ofwbus; ofwbus = bus; while (strcmp(device_get_name(device_get_parent(ofwbus)), "root") != 0) ofwbus = device_get_parent(ofwbus); sc = device_get_softc(ofwbus); switch (type) { case SYS_RES_IRQ: rm = &sc->sc_intr_rman; break; case SYS_RES_MEMORY: rm = &sc->sc_mem_rman; break; default: return (EINVAL); } if (rm == NULL) return (ENXIO); if (rman_is_region_manager(r, rm) == 0) return (EINVAL); return (rman_adjust_resource(r, start, end)); } static int ofwbus_release_resource(device_t bus, device_t child, int type, int rid, struct resource *r) { struct resource_list_entry *rle; int passthrough; int error; passthrough = (device_get_parent(child) != bus); if (!passthrough) { /* Clean resource list entry */ rle = resource_list_find(BUS_GET_RESOURCE_LIST(bus, child), type, rid); if (rle != NULL) rle->res = NULL; } if ((rman_get_flags(r) & RF_ACTIVE) != 0) { error = bus_deactivate_resource(child, type, rid, r); if (error) return (error); } return (rman_release_resource(r)); } static int ofwbus_map_intr(device_t dev, device_t child, phandle_t iparent, int icells, pcell_t *ispec) { struct ofwbus_intrtabent *intr; struct ofwbus_softc *sc; sc = device_get_softc(dev); mtx_lock(&sc->intr_tab_lock); SLIST_FOREACH(intr, &sc->intr_tab, entry) { if (intr->iparent != iparent) continue; if (intr->icells != icells) continue; if (memcmp(ispec, intr->ispec, sizeof(*ispec)*icells) == 0) break; } if (intr == NULL) { intr = malloc(sizeof(*intr), M_DEVBUF, M_ZERO | M_WAITOK); intr->iparent = iparent; intr->icells = icells; intr->ispec = malloc(sizeof(*ispec)*icells, M_DEVBUF, M_WAITOK); memcpy(intr->ispec, ispec, sizeof(*ispec)*icells); /* * Try to set ispec[0] as the IRQ ID. This is often the * interrupt pin on the PIC, so getting that as the IRQ ID * makes dmesg a little more human-friendly and consistent. No * big deal if that fails and we get something random, though. */ intr->irq = arm_allocate_irq(ispec[0]); SLIST_INSERT_HEAD(&sc->intr_tab, intr, entry); } mtx_unlock(&sc->intr_tab_lock); /* Entries never deleted, so safe */ return (intr->irq); } static int ofwbus_register_pic(device_t dev, device_t pic, phandle_t phandle) { struct ofwbus_pic *picent; struct ofwbus_softc *sc; sc = device_get_softc(dev); mtx_lock(&sc->intr_tab_lock); SLIST_FOREACH(picent, &sc->pic_tab, entry) { if (picent->phandle == phandle) panic("PIC %#x already registered as %s", phandle, device_get_nameunit(picent->dev)); } picent = malloc(sizeof(*picent), M_DEVBUF, M_ZERO | M_WAITOK); picent->phandle = phandle; picent->dev = dev; SLIST_INSERT_HEAD(&sc->pic_tab, picent, entry); mtx_unlock(&sc->intr_tab_lock); return (0); } static int ofwbus_setup_irq_with_pic(device_t dev, int irq) { struct ofwbus_intrtabent *intr; struct ofwbus_pic *picent; struct ofwbus_softc *sc; int err = 0; sc = device_get_softc(dev); mtx_lock(&sc->intr_tab_lock); SLIST_FOREACH(intr, &sc->intr_tab, entry) { if (intr->irq == irq) break; } KASSERT(intr != NULL, ("Setting up unmapped IRQ %d", irq)); if (intr->pic == NULL) { SLIST_FOREACH(picent, &sc->pic_tab, entry) { if (picent->phandle == intr->phandle) break; } if (picent == NULL) panic("Mapping IRQ %d to non-existant PIC %#x", irq, intr->phandle); intr->pic = picent->dev; /* * NB: this ordering is safe so long as the MD interrupt * code never iterates through all interrupts and calls PIC * methods on them: * 1. No bus code (bus_config_intr()), etc. can happen until * this function releases intr_tab_lock. * 2. The PIC can't fire the interrupt until PIC_MAP_OFW_IRQ() * at the very least (hopefully the PIC brings it up masked * anyway and the MD code unmasks later). */ arm_set_pic_for_irq(intr->irq, intr->pic); err = PIC_MAP_OFW_IRQ(intr->pic, intr->irq, intr->icells, intr->ispec); if (err != 0) intr->pic = NULL; } mtx_unlock(&sc->intr_tab_lock); return (err); } static int ofwbus_setup_intr(device_t dev, device_t child, struct resource *r, int flags, driver_filter_t *filt, driver_intr_t *intr, void *arg, void **cookiep) { struct ofwbus_softc *sc = device_get_softc(dev); int err; if (r == NULL) panic("%s: NULL interrupt resource!", __func__); /* Make sure we're good to go */ err = ofwbus_setup_irq_with_pic(rman_get_start(r)); if (err != 0) return (err); return (BUS_SETUP_INTR(device_get_parent(dev), child, r, flags, filt, intr, arg, cookiep)); } static int ofwbus_bind_intr(device_t dev, device_t child, struct resource *r, int cpu) { struct ofwbus_softc *sc = device_get_softc(dev); int err; if (r == NULL) panic("%s: NULL interrupt resource!", __func__); /* Make sure we're good to go */ err = ofwbus_setup_irq_with_pic(rman_get_start(r)); if (err != 0) return (err); return (BUS_BIND_INTR(device_get_parent(dev), child, r, cpu)); } static int ofwbus_config_intr(device_t dev, int irq, enum intr_trigger trig, enum intr_polarity pol) { struct ofwbus_softc *sc = device_get_softc(dev); int err; /* Make sure we're good to go */ err = ofwbus_setup_irq_with_pic(irq) if (err != 0) return (err); return (BUS_CONFIG_INTR(device_get_parent(dev), irq, trig, pol)); } --------------B09F4E0CEA3FCACA621BF022 Content-Type: text/plain; charset=UTF-8; name="ofwdiff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="ofwdiff" Index: dev/ofw/ofw_bus.h =================================================================== --- dev/ofw/ofw_bus.h (revision 303312) +++ dev/ofw/ofw_bus.h (working copy) @@ -76,4 +76,10 @@ return (OFW_BUS_MAP_INTR(dev, dev, iparent, icells, intr)); } +static __inline int +ofw_bus_register_pic(device_t dev, phandle_t phandle) +{ + return (OFW_BUS_REGISTER_PIC(dev, dev, phandle)); +} + #endif /* !_DEV_OFW_OFW_BUS_H_ */ Index: dev/ofw/ofw_bus_if.m =================================================================== --- dev/ofw/ofw_bus_if.m (revision 303312) +++ dev/ofw/ofw_bus_if.m (working copy) @@ -113,8 +113,23 @@ /* If that fails, then assume a one-domain system */ return (interrupt[0]); } + + int + ofw_bus_default_register_pic(device_t bus, device_t dev, + phandle_t phandle) + { + /* Propagate up the bus hierarchy until someone handles it. */ + if (device_get_parent(bus) != NULL) + return OFW_BUS_REGISTER_PIC(device_get_parent(bus), dev, + phandle); + + /* If that fails, then it doesn't matter, I guess */ + return (0); + } }; +}; + # Get the ofw_bus_devinfo struct for the device dev on the bus. Used for bus # drivers which use the generic methods in ofw_bus_subr.c to implement the # reset of this interface. The default method will return NULL, which means @@ -159,9 +174,9 @@ device_t dev; } DEFAULT ofw_bus_default_get_type; -# Map an (interrupt parent, IRQ) pair to a unique system-wide interrupt number. -# If the interrupt encoding includes a sense field, the interrupt sense will -# also be configured. +# Map an (interrupt parent, interrupt specifier) pair to a unique system-wide +# interrupt number. If the interrupt encoding includes metadata (sense field, +# for example) that will be passed to the PIC at bring-up. METHOD int map_intr { device_t bus; device_t dev; @@ -169,3 +184,11 @@ int icells; pcell_t *interrupt; } DEFAULT ofw_bus_default_map_intr; + +# Register a device as handling interrupts for a given interrupt-parent value. +METHOD int register_pic { + device_t bus; + device_t dev; + phandle_t phandle; +} DEFAULT ofw_bus_default_register_pic; + Index: powerpc/powerpc/nexus.c =================================================================== --- powerpc/powerpc/nexus.c (revision 303312) +++ powerpc/powerpc/nexus.c (working copy) @@ -72,6 +72,7 @@ #endif static bus_config_intr_t nexus_config_intr; static ofw_bus_map_intr_t nexus_ofw_map_intr; +static ofw_bus_register_pic_t nexus_ofw_register_pic; static device_method_t nexus_methods[] = { /* Device interface */ @@ -92,6 +93,7 @@ /* ofw_bus interface */ DEVMETHOD(ofw_bus_map_intr, nexus_ofw_map_intr), + DEVMETHOD(ofw_bus_register_pic, nexus_ofw_register_pic), DEVMETHOD_END }; @@ -193,6 +195,15 @@ } static int +nexus_ofw_register_pic(device_t dev, device_t pic, phandle_t phandle) +{ + + /* XXX: make up random numbers until intr_machdep.c is updated */ + powerpc_register_pic(pic, phandle, 124, 4, FALSE); + return (0); +} + +static int nexus_activate_resource(device_t bus __unused, device_t child __unused, int type, int rid __unused, struct resource *r) { --------------B09F4E0CEA3FCACA621BF022-- From owner-freebsd-arch@freebsd.org Tue Aug 2 13:25:18 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 67660BAC0FB; Tue, 2 Aug 2016 13:25:18 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from mail.miracle.cz (mail.miracle.cz [193.84.128.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.miracle.cz", Issuer "Miracle Group Root CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id DA3B5124B; Tue, 2 Aug 2016 13:25:17 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from [193.84.128.50] (meloun.ad.miracle.cz [193.84.128.50]) by mail.miracle.cz (Postfix) with ESMTPSA id 8BCE23ACC0; Tue, 2 Aug 2016 15:25:08 +0200 (CEST) Subject: Re: INTRNG (Was: svn commit: r301453....) To: Nathan Whitehorn References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Michal Meloun X-Enigmail-Draft-Status: N1110 Message-ID: <57A09F34.4050400@FreeBSD.org> Date: Tue, 2 Aug 2016 15:25:08 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.miracle.cz); Tue, 02 Aug 2016 15:25:08 +0200 (CEST) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Aug 2016 13:25:18 -0000 I'm sorry for delayed response. We have serious trouble @work so I cannot spend to much time on FreeBSB for next 2 or 3 days. Dne 02.08.2016 v 6:00 Nathan Whitehorn napsal(a): > > > On 07/31/16 11:43, Nathan Whitehorn wrote: >> >> >> On 07/31/16 08:40, Michal Meloun wrote: >>> Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): >>> [snip] >>>>> The current API is certainly a bit of a hack and I would be >>>>> pleased to >>>>>> see it go; but the replacement needs to be more functional and more >>>>>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>>>>> to this architecture. >>>>> Yes, at this time, I agree. If you can accept enhancement of >>>>> (owf_)bus_map_intr() then we can move to next, significantly less >>>>> hackish, state. >>>> OK, sure. I wrote some code this afternoon (attached) to sketch this. >>>> It does not compile or run or anything useful, but it should >>>> illustrate the important parts of what I think is an API that should >>>> work for everyone. I'm happy to finish it if it looks reasonable -- I >>>> may in any case just to replace PowerPC's increasingly teetering >>>> intr_machdep.c. >>>> >>>> The OF part is essentially unchanged from how it currently works: >>>> there's a method that you pass the interrupt specifier and interrupt >>>> parent to, and it spits back an IRQ that means that combination of >>>> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current >>>> API, can be implemented in terms of it in one line. Whenever the >>>> relevant PIC attaches, it is told to use the given IRQ to refer to >>>> that opaque bytestring. >>>> >>>> I've added a set of equivalent functions for ACPI that take the >>>> contents of an ACPI interrupt specifier and do the same things. It >>>> tries to have the IRQ be human-meaningful, if possible, by matching >>>> the ACPI interrupt number. Having separate functions seemed a little >>>> cleaner than exposing the enums and unions as part of the KPI, but I'm >>>> not wedded to it -- this is just an example. >>>> >>>> PICs register (and deregister) themselves with either an OF phandle or >>>> an ACPI resource string or (god help us) both as needed. The PICs have >>>> corresponding methods for interpreting various kinds of interrupt >>>> specifiers. >>>> >>>> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >>>> succeed before the PICs are attached is gated on an #ifdef. That can >>>> probably be off by default on ARM. A PowerPC version of this code >>>> would have to keep it on until various bus drivers are fixed. >>>> >>>> Here's the general flow: >>>> - Parent bus enumerates children, maps IRQs as needed, adds to >>>> resource list. The struct resources involved aren't special (just a >>>> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >>>> implement (and already are implemented, in general, for OF/FDT >>>> drivers). >>>> - bus_alloc_resource() does nothing special >>>> - bus_setup_intr() calls into the PIC and does what is needed, as in >>>> r301453 (to within that #ifdef) >>>> >>>> This should keep all the best pieces of everything: >>>> - ACPI and OF are supported together, and easy to extend for other >>>> types of mapping technologies without breaking either KBI or KPI (just >>>> add new functions) >>>> - No APIs change for existing Open Firmware code >>>> - The support code can live entirely in machine-dependent directories, >>>> with no code changes at all required in kern/ or dev/ofw/ >>>> - bus_setup_intr() and friends behave as in r301453 and succeed or >>>> fail for real >>>> - PCI code is not an issue >>>> - No disconnected interrupt state maintained in mapping table (unless >>>> the transitional #ifdef is on) and the mapping table content is only >>>> larger than r301453 by having a copy of the original interrupt >>>> specifier. >>>> >>>> If and when we manage to fix the kernel APIs to allow non-scalar >>>> interrupt specifiers, this should also be easy to gradually >>>> transmogrify to support that case since there exists a 1:1 mapping of >>>> scalar IRQs to rich data and the control flow would be identical: >>>> interrupt specifiers are read at enumeration time and a resulting >>>> handle -- struct resource or scalar IRQ -- used afterward to refer >>>> to it. >>>> >>> Nice, nice, i think that we are very close at this point. >>> The problem with the above workflow is that maps IRQ function is called >>> at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at >>> this time, PICs are not exist. >>> Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping >>> table, it doesn't provide this functionality. >>> But yes, i understand that mapping table is important and necessary >>> for you. >>> >>> So, if i can extend our concept a little bit, then: >>> We can add new table (map_data_tbl for example) that holds a copy of >>> the original interrupt specifier, index to irq_sources table and >>> probably some flags. >>> And we can modify the above workflow to: >>> - Parent bus enumerates children, allocates entries from map_data_tbl, >>> adds to resource list. The struct resources involved aren't special >>> (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are >>> trivial to implement (and already are implemented, in general, for >>> OF/FDT drivers). Index to map_data_tbl is used as resource ID. >>> - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, >>> *maps IRQs* and stores result in 'index' field. >>> - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls >>> into the PIC and does what is needed, as in r301453. (to within that >>> #ifdef) >>> >>> And, in PPC case, newly attached PIC scans whole map_data_tbl table, >>> finds his entries and makes what needs. >>> >>> Does that make sense? >>> I hope that postponing of map IRQ call is not a problem for PPC, >>> everything else looks easy. >>> Moreover, this additional level of indirection also solves all my >>> 'hypothetical corner cases' with N:1 mappings (between original >>> interrupt specifier and real interrupt number). >> >> Yes, I think we are converging. >> >> My qualm about bus_alloc_resource() is twofold: >> 1. Traditionally, with interrupts, bus_alloc_resource() has no side >> effects and I'm not sure it propagates cleanly down the tree in all >> cases. >> 2. There are a few bus APIs (bus_config_intr() comes to mind) that >> use raw IRQ IDs and, so far as I know, can be, and sometimes are >> called before bus_alloc_resource(). If the PIC doesn't know about the >> IRQ yet when bus_config_intr() (etc.) is called, things will just break. >> >> So you would need to make sure that any bus method handling a >> resource ID causes it to be mapped on the PIC at that time. It's OK >> if that happens in bus_alloc_resource() so long as bus_config_intr(), >> bus_setup_intr(), etc. cause that to happen if it hasn't happened yet >> -- I don't care when these calls are made to the PIC driver so long >> as *what* calls will be made is set at enumeration time. >> >> I am also totally fine with having, on ARM, bus_config_intr(), >> bus_setup_intr() etc. fail if the PIC hasn't attached yet (On >> PowerPC, we can't do that yet, but after this conversation, I regard >> that as a bug and would fix that later), as well as delaying setup on >> the PIC to the first time any bus driver actually tries to *use* the >> interrupt (alloc_resource(), setup_intr(), config_intr(), whatever) >> rather than when the interrupt is originally allocated. >> I understand. And yes, bus_alloc_resource() isn't propagated cleanly down the tree in all cases. That's why we have 'INTRNG' hook in subr_bus.c, which is suboptimal. About bus_config_intr(). The only consumer of bus_config_intr() is ACPI code, in little hackish manner, as workaround for problem which is fully solved by INTRNG. Also, the semantic of bus_config_intr() is not clear, from INTRNG point of view. So i have tendency to ignore it :) >> ------------ The following is a large parenthesis ------------------- >> >> One other possible route here that would also work well would be to >> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't >> gain a lot by sharing it) and implement ofw_bus_map_intr() locally as >> an ofwbus bus method. Then you could have the mapping table stored in >> ofwbus's softc -- the API was designed for this initially. You would >> need MD extensions for doing PIC registration there (which is fine), >> but that segregates all the OFW-specific information into >> OFW-specific code and would let the bus methods and the OFW interrupt >> mapping table interact cleanly in the same place. This still >> preserves the pre-r301453 API, makes PowerPC work, and maybe >> address's skra@'s concern about extensibility and letting core >> interrupt code know about FDT (or ACPI). I'd be happy to mock this up >> as well if you think it's a good approach. >> > [snip] >> >> This has the following features: >> - Existing OFW API and semantics unchanged >> - As such, PowerPC, PCI, etc. work fine with no changes >> - Details encapsulated in MD code, so individual platforms can >> implement this however they like >> - arm/arm/intr.c (or whatever) only needs a method to allocate a >> fresh interrupt, with no state, and anoter to set the device_t for an >> interrupt sometime later. >> - The internal table in the platform interrupt code has no knowledge >> of any mappings whatsoever except having the appropriate device_t for >> the PIC stored with the interrupt vector. >> - Device tree bits handled purely in device tree code >> - No action need be taken on any mapping until the interrupt is >> actually allocated/set up, like r301453 >> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar >> enumeration-mechanism-specific code in the root bus for that mapping >> mechanism. >> >> -------------- End parenthesis ------------------------------- > > Here's an implementation of the parenthesis I wrote on an airplane > this afternoon. It should be complete, though has not been tested. The > code is short and simple (+70 lines in ofwbus.c). This preserves the > pre-r301453 API and semantics relative to drivers, which means PowerPC > and PCI work out of the box, while keeping the semantics relative to > the interrupt layer of r301453 (PIC methods only called on resource > allocation, no allocatable IRQs on unattached PICs, encapsulation of > OFW-specific code in OFW-specific bits of the tree). It turns out > those two things are compatible, somewhat to my surprise, and that > makes the result very clean. I like this approach and would be happy > to move forward with it. There are five functions of interest: > > 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you > pass an interrupt specifier and parent, you get back an IRQ. No > changes. This is the core of the normal OFW interrupt API. > > 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a > new function that PIC drivers are supposed to use to register control > of an interrupt domain. This replaces machine-specific code like > powerpc_register_pic() to allow the PIC table to be in a bus parent > rather than in the interrupt core. > > 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This > is a new function that PIC drivers that know how to handle device-tree > interrupt descriptors implement (analogous to various existing ones > that vary by platform). It tells the PIC that the given abstract IRQ > means the given opaque interrupt specifier. > > 4. arm_allocate_irq(int suggested). This allocates a new IRQ number > not (yet) attached to a PIC, as in r301453. I've added a parameter > that lets you pass a suggested number to try in case it is possible to > make it match an interrupt pin or something for human-readability. > > 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD interrupt > layer to associate a given PIC device_t with an IRQ. That is all the > information the MD layer ever has about the IRQ mapping. > > Functions #1 and #2 are now implemented completely in ofwbus.c: there > are no callouts anywhere else and the interrupt mapping table is > maintained entirely internally to ofwbus (in its softc). In order to > implement ACPI, or some other strategy, you would implement analogs to > functions #1 and #2 that live somewhere in the bus hierarchy that is > guaranteed to be above all devices that might want that type of > interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers > implementing the mapping scheme would provide. > > Since the system interrupt code has no knowledge at all of interrupt > mapping, of any type, in this scheme, adding new mapping types is > trivial and can be done on a driver-by-driver basis if necessary > without changing KPI and without any other part of the system even > being aware. For example, GPIOs can use a completely different > mechanism if they want and can do setup purely in the GPIO controller > driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a GPIO > controller in which the GPIO controller allocates a generic IRQ, > assigns through some internal table just in the GPIO driver, and > returns to it to a consumer in some other device driver -- without a > GPIO mapping type, new bus functions, or modifications to the platform > interrupt code. > > The control flow goes like this: > - Bus driver enumerates children, parses interrupts properties, calls > OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to > interrupt list. > - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank > disconnected IRQ from the MD interrupt layer, and stores the mapping > from the new IRQ to the given interrupt specifier and phandle in an > internal table in ofwbus's softc. > NB: Nothing else happens here, like post-r301453. Changing this does > not change any semantics of the API pre-r301453, which means it > remains fully compatible with PCI and PowerPC. Also, like > post-r301453, there is no involvement of nexus. > - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these > messages and adds a (device_t, phandle_t) mapping to a second internal > table. Note that the interrupt layer does not need to handle PIC > registration anymore at all (except for the root PIC). > - Bus child eventually calls a function that tries to set the > interrupt up (e.g. bus_setup_intr()). That propagates up the bus > hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, > looks it up in the table, looks up the appropriate PIC from the PIC > table, then: > A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the > interrupt layer's only interaction with the mapping code. All it deals > with is device_ts and abstract IRQ numbers. > B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, > interrupt-specifier) to tell the PIC that the interrupt layer's IRQ > irq_number means the given specifier > C) finally, passes the call onto nexus, which will do whatever would > normally happen (unmasking the interrupt, setting handlers, etc.) in > terms only of the abstract IRQ and the device_t assigned by ofwbus. > > You would implement ACPI just by doing a s/OFW/ACPI/g > search-and-replace above -- since the interrupt layer doesn't know > about OFW or ACPI or anything else, there is no need to touch it. This > seems clean, simple, compartmentalized, preserves the existing API, > and should work on all of our various hardware. PowerPC can't quite > work with it yet without some multipass foo, but, since the API is > preserved, that transition can happen gradually without KPI changes. > For the same reason that it is API-preserving, I think this code is > also MFC-able. > -Nathan I think that we are slightly diverges in this place: - please note that PIC API (in kern/pic_if.m) is different from the one that PPC uses. - we must be able to store configuration data (original interrupt specifier) in all cases, not only for OFW. That's reason why I have proposed to create 'mapping table'. Anyway, i think that we both talking about +/- same solution, only i want to move it from OFW specific level implemented at OFW bus level to bus/interrupt specifier format independent level implemented in subr_intr.c. Most of your control flow can be implemented at general level as is, or already exist [intr_map_irq(), intr_pic_register()] . Also, impact for current PPC code is, by me, minimal. I can sketch my idea in more details, if you found it acceptable. Again, I'm sorry for delayed and very brief response. Michal From owner-freebsd-arch@freebsd.org Tue Aug 2 15:31:44 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5F877BACA99; Tue, 2 Aug 2016 15:31:44 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 40CEB15B2; Tue, 2 Aug 2016 15:31:43 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net ([192.5.85.48]) (authenticated bits=0) by d.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u72FVdew018035 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Tue, 2 Aug 2016 08:31:40 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Michal Meloun References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <57A09F34.4050400@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: Date: Tue, 2 Aug 2016 08:31:39 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <57A09F34.4050400@FreeBSD.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVYLbx0DfpdB0lge4gSoDIGN4Mxfrle5ZyKi1R1qp3sg5EfEUSgS/w9/jusrHNWE7pwA11gjPWyFR/izgeMcFy+K X-Sonic-ID: C;CKLiNMZY5hGhja/hcgQksw== M;ACN4NcZY5hGhja/hcgQksw== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Aug 2016 15:31:44 -0000 On 08/02/16 06:25, Michal Meloun wrote: > I'm sorry for delayed response. We have serious trouble @work so I > cannot spend to much time on FreeBSB for next 2 or 3 days. Understood. > > Dne 02.08.2016 v 6:00 Nathan Whitehorn napsal(a): >> >> On 07/31/16 11:43, Nathan Whitehorn wrote: >>> >>> On 07/31/16 08:40, Michal Meloun wrote: >>>> Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): >>>> [snip] >>>>>> The current API is certainly a bit of a hack and I would be >>>>>> pleased to >>>>>>> see it go; but the replacement needs to be more functional and more >>>>>>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>>>>>> to this architecture. >>>>>> Yes, at this time, I agree. If you can accept enhancement of >>>>>> (owf_)bus_map_intr() then we can move to next, significantly less >>>>>> hackish, state. >>>>> OK, sure. I wrote some code this afternoon (attached) to sketch this. >>>>> It does not compile or run or anything useful, but it should >>>>> illustrate the important parts of what I think is an API that should >>>>> work for everyone. I'm happy to finish it if it looks reasonable -- I >>>>> may in any case just to replace PowerPC's increasingly teetering >>>>> intr_machdep.c. >>>>> >>>>> The OF part is essentially unchanged from how it currently works: >>>>> there's a method that you pass the interrupt specifier and interrupt >>>>> parent to, and it spits back an IRQ that means that combination of >>>>> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current >>>>> API, can be implemented in terms of it in one line. Whenever the >>>>> relevant PIC attaches, it is told to use the given IRQ to refer to >>>>> that opaque bytestring. >>>>> >>>>> I've added a set of equivalent functions for ACPI that take the >>>>> contents of an ACPI interrupt specifier and do the same things. It >>>>> tries to have the IRQ be human-meaningful, if possible, by matching >>>>> the ACPI interrupt number. Having separate functions seemed a little >>>>> cleaner than exposing the enums and unions as part of the KPI, but I'm >>>>> not wedded to it -- this is just an example. >>>>> >>>>> PICs register (and deregister) themselves with either an OF phandle or >>>>> an ACPI resource string or (god help us) both as needed. The PICs have >>>>> corresponding methods for interpreting various kinds of interrupt >>>>> specifiers. >>>>> >>>>> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >>>>> succeed before the PICs are attached is gated on an #ifdef. That can >>>>> probably be off by default on ARM. A PowerPC version of this code >>>>> would have to keep it on until various bus drivers are fixed. >>>>> >>>>> Here's the general flow: >>>>> - Parent bus enumerates children, maps IRQs as needed, adds to >>>>> resource list. The struct resources involved aren't special (just a >>>>> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >>>>> implement (and already are implemented, in general, for OF/FDT >>>>> drivers). >>>>> - bus_alloc_resource() does nothing special >>>>> - bus_setup_intr() calls into the PIC and does what is needed, as in >>>>> r301453 (to within that #ifdef) >>>>> >>>>> This should keep all the best pieces of everything: >>>>> - ACPI and OF are supported together, and easy to extend for other >>>>> types of mapping technologies without breaking either KBI or KPI (just >>>>> add new functions) >>>>> - No APIs change for existing Open Firmware code >>>>> - The support code can live entirely in machine-dependent directories, >>>>> with no code changes at all required in kern/ or dev/ofw/ >>>>> - bus_setup_intr() and friends behave as in r301453 and succeed or >>>>> fail for real >>>>> - PCI code is not an issue >>>>> - No disconnected interrupt state maintained in mapping table (unless >>>>> the transitional #ifdef is on) and the mapping table content is only >>>>> larger than r301453 by having a copy of the original interrupt >>>>> specifier. >>>>> >>>>> If and when we manage to fix the kernel APIs to allow non-scalar >>>>> interrupt specifiers, this should also be easy to gradually >>>>> transmogrify to support that case since there exists a 1:1 mapping of >>>>> scalar IRQs to rich data and the control flow would be identical: >>>>> interrupt specifiers are read at enumeration time and a resulting >>>>> handle -- struct resource or scalar IRQ -- used afterward to refer >>>>> to it. >>>>> >>>> Nice, nice, i think that we are very close at this point. >>>> The problem with the above workflow is that maps IRQ function is called >>>> at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at >>>> this time, PICs are not exist. >>>> Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping >>>> table, it doesn't provide this functionality. >>>> But yes, i understand that mapping table is important and necessary >>>> for you. >>>> >>>> So, if i can extend our concept a little bit, then: >>>> We can add new table (map_data_tbl for example) that holds a copy of >>>> the original interrupt specifier, index to irq_sources table and >>>> probably some flags. >>>> And we can modify the above workflow to: >>>> - Parent bus enumerates children, allocates entries from map_data_tbl, >>>> adds to resource list. The struct resources involved aren't special >>>> (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are >>>> trivial to implement (and already are implemented, in general, for >>>> OF/FDT drivers). Index to map_data_tbl is used as resource ID. >>>> - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, >>>> *maps IRQs* and stores result in 'index' field. >>>> - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls >>>> into the PIC and does what is needed, as in r301453. (to within that >>>> #ifdef) >>>> >>>> And, in PPC case, newly attached PIC scans whole map_data_tbl table, >>>> finds his entries and makes what needs. >>>> >>>> Does that make sense? >>>> I hope that postponing of map IRQ call is not a problem for PPC, >>>> everything else looks easy. >>>> Moreover, this additional level of indirection also solves all my >>>> 'hypothetical corner cases' with N:1 mappings (between original >>>> interrupt specifier and real interrupt number). >>> Yes, I think we are converging. >>> >>> My qualm about bus_alloc_resource() is twofold: >>> 1. Traditionally, with interrupts, bus_alloc_resource() has no side >>> effects and I'm not sure it propagates cleanly down the tree in all >>> cases. >>> 2. There are a few bus APIs (bus_config_intr() comes to mind) that >>> use raw IRQ IDs and, so far as I know, can be, and sometimes are >>> called before bus_alloc_resource(). If the PIC doesn't know about the >>> IRQ yet when bus_config_intr() (etc.) is called, things will just break. >>> >>> So you would need to make sure that any bus method handling a >>> resource ID causes it to be mapped on the PIC at that time. It's OK >>> if that happens in bus_alloc_resource() so long as bus_config_intr(), >>> bus_setup_intr(), etc. cause that to happen if it hasn't happened yet >>> -- I don't care when these calls are made to the PIC driver so long >>> as *what* calls will be made is set at enumeration time. >>> >>> I am also totally fine with having, on ARM, bus_config_intr(), >>> bus_setup_intr() etc. fail if the PIC hasn't attached yet (On >>> PowerPC, we can't do that yet, but after this conversation, I regard >>> that as a bug and would fix that later), as well as delaying setup on >>> the PIC to the first time any bus driver actually tries to *use* the >>> interrupt (alloc_resource(), setup_intr(), config_intr(), whatever) >>> rather than when the interrupt is originally allocated. >>> > I understand. And yes, bus_alloc_resource() isn't propagated cleanly > down the tree in all cases. That's why we have 'INTRNG' hook in > subr_bus.c, which is suboptimal. > > About bus_config_intr(). The only consumer of bus_config_intr() is ACPI > code, in little hackish manner, as workaround for problem which is > fully solved by INTRNG. > Also, the semantic of bus_config_intr() is not clear, from INTRNG point > of view. > So i have tendency to ignore it :) Heh. Fair enough. I hadn't noticed that -- though I can see legitimate uses for it in the context of GPIOs. In any case, I'm a little wary of bus_alloc_resource() having side effects. Usually bus_activate_resource() would do that kind of thing. > >>> ------------ The following is a large parenthesis ------------------- >>> >>> One other possible route here that would also work well would be to >>> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't >>> gain a lot by sharing it) and implement ofw_bus_map_intr() locally as >>> an ofwbus bus method. Then you could have the mapping table stored in >>> ofwbus's softc -- the API was designed for this initially. You would >>> need MD extensions for doing PIC registration there (which is fine), >>> but that segregates all the OFW-specific information into >>> OFW-specific code and would let the bus methods and the OFW interrupt >>> mapping table interact cleanly in the same place. This still >>> preserves the pre-r301453 API, makes PowerPC work, and maybe >>> address's skra@'s concern about extensibility and letting core >>> interrupt code know about FDT (or ACPI). I'd be happy to mock this up >>> as well if you think it's a good approach. >>> >> [snip] >>> This has the following features: >>> - Existing OFW API and semantics unchanged >>> - As such, PowerPC, PCI, etc. work fine with no changes >>> - Details encapsulated in MD code, so individual platforms can >>> implement this however they like >>> - arm/arm/intr.c (or whatever) only needs a method to allocate a >>> fresh interrupt, with no state, and anoter to set the device_t for an >>> interrupt sometime later. >>> - The internal table in the platform interrupt code has no knowledge >>> of any mappings whatsoever except having the appropriate device_t for >>> the PIC stored with the interrupt vector. >>> - Device tree bits handled purely in device tree code >>> - No action need be taken on any mapping until the interrupt is >>> actually allocated/set up, like r301453 >>> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar >>> enumeration-mechanism-specific code in the root bus for that mapping >>> mechanism. >>> >>> -------------- End parenthesis ------------------------------- >> Here's an implementation of the parenthesis I wrote on an airplane >> this afternoon. It should be complete, though has not been tested. The >> code is short and simple (+70 lines in ofwbus.c). This preserves the >> pre-r301453 API and semantics relative to drivers, which means PowerPC >> and PCI work out of the box, while keeping the semantics relative to >> the interrupt layer of r301453 (PIC methods only called on resource >> allocation, no allocatable IRQs on unattached PICs, encapsulation of >> OFW-specific code in OFW-specific bits of the tree). It turns out >> those two things are compatible, somewhat to my surprise, and that >> makes the result very clean. I like this approach and would be happy >> to move forward with it. There are five functions of interest: >> >> 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you >> pass an interrupt specifier and parent, you get back an IRQ. No >> changes. This is the core of the normal OFW interrupt API. >> >> 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a >> new function that PIC drivers are supposed to use to register control >> of an interrupt domain. This replaces machine-specific code like >> powerpc_register_pic() to allow the PIC table to be in a bus parent >> rather than in the interrupt core. >> >> 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This >> is a new function that PIC drivers that know how to handle device-tree >> interrupt descriptors implement (analogous to various existing ones >> that vary by platform). It tells the PIC that the given abstract IRQ >> means the given opaque interrupt specifier. >> >> 4. arm_allocate_irq(int suggested). This allocates a new IRQ number >> not (yet) attached to a PIC, as in r301453. I've added a parameter >> that lets you pass a suggested number to try in case it is possible to >> make it match an interrupt pin or something for human-readability. >> >> 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD interrupt >> layer to associate a given PIC device_t with an IRQ. That is all the >> information the MD layer ever has about the IRQ mapping. >> >> Functions #1 and #2 are now implemented completely in ofwbus.c: there >> are no callouts anywhere else and the interrupt mapping table is >> maintained entirely internally to ofwbus (in its softc). In order to >> implement ACPI, or some other strategy, you would implement analogs to >> functions #1 and #2 that live somewhere in the bus hierarchy that is >> guaranteed to be above all devices that might want that type of >> interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers >> implementing the mapping scheme would provide. >> >> Since the system interrupt code has no knowledge at all of interrupt >> mapping, of any type, in this scheme, adding new mapping types is >> trivial and can be done on a driver-by-driver basis if necessary >> without changing KPI and without any other part of the system even >> being aware. For example, GPIOs can use a completely different >> mechanism if they want and can do setup purely in the GPIO controller >> driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a GPIO >> controller in which the GPIO controller allocates a generic IRQ, >> assigns through some internal table just in the GPIO driver, and >> returns to it to a consumer in some other device driver -- without a >> GPIO mapping type, new bus functions, or modifications to the platform >> interrupt code. >> >> The control flow goes like this: >> - Bus driver enumerates children, parses interrupts properties, calls >> OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to >> interrupt list. >> - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank >> disconnected IRQ from the MD interrupt layer, and stores the mapping >> from the new IRQ to the given interrupt specifier and phandle in an >> internal table in ofwbus's softc. >> NB: Nothing else happens here, like post-r301453. Changing this does >> not change any semantics of the API pre-r301453, which means it >> remains fully compatible with PCI and PowerPC. Also, like >> post-r301453, there is no involvement of nexus. >> - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these >> messages and adds a (device_t, phandle_t) mapping to a second internal >> table. Note that the interrupt layer does not need to handle PIC >> registration anymore at all (except for the root PIC). >> - Bus child eventually calls a function that tries to set the >> interrupt up (e.g. bus_setup_intr()). That propagates up the bus >> hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, >> looks it up in the table, looks up the appropriate PIC from the PIC >> table, then: >> A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the >> interrupt layer's only interaction with the mapping code. All it deals >> with is device_ts and abstract IRQ numbers. >> B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, >> interrupt-specifier) to tell the PIC that the interrupt layer's IRQ >> irq_number means the given specifier >> C) finally, passes the call onto nexus, which will do whatever would >> normally happen (unmasking the interrupt, setting handlers, etc.) in >> terms only of the abstract IRQ and the device_t assigned by ofwbus. >> >> You would implement ACPI just by doing a s/OFW/ACPI/g >> search-and-replace above -- since the interrupt layer doesn't know >> about OFW or ACPI or anything else, there is no need to touch it. This >> seems clean, simple, compartmentalized, preserves the existing API, >> and should work on all of our various hardware. PowerPC can't quite >> work with it yet without some multipass foo, but, since the API is >> preserved, that transition can happen gradually without KPI changes. >> For the same reason that it is API-preserving, I think this code is >> also MFC-able. >> -Nathan > I think that we are slightly diverges in this place: > - please note that PIC API (in kern/pic_if.m) is different from the one > that PPC uses. Yes, which is fine (this is machine-dependent code anyway). On consideration, the PIC_MAP_OFW_INTR() function should probably be called OFW_BUS_PIC_MAP_INTR() and live in ofw_bus_if.m anyway, then be implemented by PICs. > - we must be able to store configuration data (original interrupt > specifier) in all cases, not only for OFW. That's reason why I have > proposed > to create 'mapping table'. It is stored in all cases, just not in the core interrupt code. I've only mocked this up for OFW here. For ACPI, you would have the equivalent table in acpi.c, along with ACPI_MAP_INTR(), ACPI_REGISTER_PIC(), etc. in acpi_if.m and implemented in dev/acpica/acpi.c. For GPIOs, you would have a mechanism that just traces however you are representing GPIOs anyway. > Anyway, i think that we both talking about +/- same solution, only i > want to move it from OFW specific level implemented at OFW bus level to > bus/interrupt specifier format independent level implemented in > subr_intr.c. This implements the same API in any case (identical to that pre-r301453), so the implementation doesn't really matter at all in terms of my needs. That said, having it in the root bus for the mapping domain (ofwbus0, acpi0) etc. seems cleaner to me, with no loss of functionality. The core interrupt code (subr_intr.c or whatever) doesn't have any obvious need to know anything at all about the mapping information so long as it knows the PIC device_t that corresponds to every abstract IRQ so that it can mask it or do other operations. Since, presumably, nothing in an ACPI device tree references an interrupt in the OFW device tree (if you even had both -- and how would you specify that, anyway?), implementing the relevant bits one step up the bus hierarchy doesn't change any behavior. And nothing can possibly be more flexible in terms of the mapping table in subr_intr.c than not having a mapping table: you don't need enums for mapping types, or unions that need to be expanded, breaking KBI, or anything. You can delete all the PIC registration (and MSI) code, all the switches, and all the unions from subr_intr.c and have a totally mapping-mechanism-agnostic KPI. > Most of your control flow can be implemented at general level as is, or > already exist [intr_map_irq(), intr_pic_register()] . > Also, impact for current PPC code is, by me, minimal. > I can sketch my idea in more details, if you found it acceptable. All ideas are fine -- and the solution does not need to apply to all platforms anyway, so long as the OFW_BUS_MAP_INTR() semantics (and, ideally, API) are preserved. > Again, I'm sorry for delayed and very brief response. No worries; good luck with the work emergencies. -Nathan > Michal > From owner-freebsd-arch@freebsd.org Thu Aug 4 09:31:40 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E4C49BAE0CD; Thu, 4 Aug 2016 09:31:40 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from mail.miracle.cz (mail.miracle.cz [193.84.128.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.miracle.cz", Issuer "Miracle Group Root CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 7F4FE1BCA; Thu, 4 Aug 2016 09:31:40 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from [193.84.128.50] (meloun.ad.miracle.cz [193.84.128.50]) by mail.miracle.cz (Postfix) with ESMTPSA id 160693ACCB; Thu, 4 Aug 2016 11:31:31 +0200 (CEST) From: Michal Meloun Subject: Re: INTRNG (Was: svn commit: r301453....) To: Nathan Whitehorn References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <57A09F34.4050400@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" X-Enigmail-Draft-Status: N1110 Message-ID: <57A30B72.7070809@FreeBSD.org> Date: Thu, 4 Aug 2016 11:31:30 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.miracle.cz); Thu, 04 Aug 2016 11:31:31 +0200 (CEST) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Aug 2016 09:31:41 -0000 Dne 02.08.2016 v 17:31 Nathan Whitehorn napsal(a): > > > On 08/02/16 06:25, Michal Meloun wrote: >> I'm sorry for delayed response. We have serious trouble @work so I >> cannot spend to much time on FreeBSB for next 2 or 3 days. > > Understood. > >> >> Dne 02.08.2016 v 6:00 Nathan Whitehorn napsal(a): >>> >>> On 07/31/16 11:43, Nathan Whitehorn wrote: >>>> >>>> On 07/31/16 08:40, Michal Meloun wrote: >>>>> Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): >>>>> [snip] >>>>>>> The current API is certainly a bit of a hack and I would be >>>>>>> pleased to >>>>>>>> see it go; but the replacement needs to be more functional and >>>>>>>> more >>>>>>>> robust. As far as I can tell, I literally *cannot* move PowerPC >>>>>>>> over >>>>>>>> to this architecture. >>>>>>> Yes, at this time, I agree. If you can accept enhancement of >>>>>>> (owf_)bus_map_intr() then we can move to next, significantly less >>>>>>> hackish, state. >>>>>> OK, sure. I wrote some code this afternoon (attached) to sketch >>>>>> this. >>>>>> It does not compile or run or anything useful, but it should >>>>>> illustrate the important parts of what I think is an API that should >>>>>> work for everyone. I'm happy to finish it if it looks reasonable >>>>>> -- I >>>>>> may in any case just to replace PowerPC's increasingly teetering >>>>>> intr_machdep.c. >>>>>> >>>>>> The OF part is essentially unchanged from how it currently works: >>>>>> there's a method that you pass the interrupt specifier and interrupt >>>>>> parent to, and it spits back an IRQ that means that combination of >>>>>> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its >>>>>> current >>>>>> API, can be implemented in terms of it in one line. Whenever the >>>>>> relevant PIC attaches, it is told to use the given IRQ to refer to >>>>>> that opaque bytestring. >>>>>> >>>>>> I've added a set of equivalent functions for ACPI that take the >>>>>> contents of an ACPI interrupt specifier and do the same things. It >>>>>> tries to have the IRQ be human-meaningful, if possible, by matching >>>>>> the ACPI interrupt number. Having separate functions seemed a little >>>>>> cleaner than exposing the enums and unions as part of the KPI, >>>>>> but I'm >>>>>> not wedded to it -- this is just an example. >>>>>> >>>>>> PICs register (and deregister) themselves with either an OF >>>>>> phandle or >>>>>> an ACPI resource string or (god help us) both as needed. The PICs >>>>>> have >>>>>> corresponding methods for interpreting various kinds of interrupt >>>>>> specifiers. >>>>>> >>>>>> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >>>>>> succeed before the PICs are attached is gated on an #ifdef. That can >>>>>> probably be off by default on ARM. A PowerPC version of this code >>>>>> would have to keep it on until various bus drivers are fixed. >>>>>> >>>>>> Here's the general flow: >>>>>> - Parent bus enumerates children, maps IRQs as needed, adds to >>>>>> resource list. The struct resources involved aren't special (just a >>>>>> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >>>>>> implement (and already are implemented, in general, for OF/FDT >>>>>> drivers). >>>>>> - bus_alloc_resource() does nothing special >>>>>> - bus_setup_intr() calls into the PIC and does what is needed, as in >>>>>> r301453 (to within that #ifdef) >>>>>> >>>>>> This should keep all the best pieces of everything: >>>>>> - ACPI and OF are supported together, and easy to extend for other >>>>>> types of mapping technologies without breaking either KBI or KPI >>>>>> (just >>>>>> add new functions) >>>>>> - No APIs change for existing Open Firmware code >>>>>> - The support code can live entirely in machine-dependent >>>>>> directories, >>>>>> with no code changes at all required in kern/ or dev/ofw/ >>>>>> - bus_setup_intr() and friends behave as in r301453 and succeed or >>>>>> fail for real >>>>>> - PCI code is not an issue >>>>>> - No disconnected interrupt state maintained in mapping table >>>>>> (unless >>>>>> the transitional #ifdef is on) and the mapping table content is only >>>>>> larger than r301453 by having a copy of the original interrupt >>>>>> specifier. >>>>>> >>>>>> If and when we manage to fix the kernel APIs to allow non-scalar >>>>>> interrupt specifiers, this should also be easy to gradually >>>>>> transmogrify to support that case since there exists a 1:1 >>>>>> mapping of >>>>>> scalar IRQs to rich data and the control flow would be identical: >>>>>> interrupt specifiers are read at enumeration time and a resulting >>>>>> handle -- struct resource or scalar IRQ -- used afterward to refer >>>>>> to it. >>>>>> >>>>> Nice, nice, i think that we are very close at this point. >>>>> The problem with the above workflow is that maps IRQ function is >>>>> called >>>>> at parent bus enumeration time, generally at BUS_PASS_BUS pass. >>>>> And at >>>>> this time, PICs are not exist. >>>>> Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a >>>>> mapping >>>>> table, it doesn't provide this functionality. >>>>> But yes, i understand that mapping table is important and necessary >>>>> for you. >>>>> >>>>> So, if i can extend our concept a little bit, then: >>>>> We can add new table (map_data_tbl for example) that holds a >>>>> copy of >>>>> the original interrupt specifier, index to irq_sources table and >>>>> probably some flags. >>>>> And we can modify the above workflow to: >>>>> - Parent bus enumerates children, allocates entries from >>>>> map_data_tbl, >>>>> adds to resource list. The struct resources involved aren't special >>>>> (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are >>>>> trivial to implement (and already are implemented, in general, for >>>>> OF/FDT drivers). Index to map_data_tbl is used as resource ID. >>>>> - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, >>>>> *maps IRQs* and stores result in 'index' field. >>>>> - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, >>>>> calls >>>>> into the PIC and does what is needed, as in r301453. (to within that >>>>> #ifdef) >>>>> >>>>> And, in PPC case, newly attached PIC scans whole map_data_tbl table, >>>>> finds his entries and makes what needs. >>>>> >>>>> Does that make sense? >>>>> I hope that postponing of map IRQ call is not a problem for PPC, >>>>> everything else looks easy. >>>>> Moreover, this additional level of indirection also solves all my >>>>> 'hypothetical corner cases' with N:1 mappings (between original >>>>> interrupt specifier and real interrupt number). >>>> Yes, I think we are converging. >>>> >>>> My qualm about bus_alloc_resource() is twofold: >>>> 1. Traditionally, with interrupts, bus_alloc_resource() has no side >>>> effects and I'm not sure it propagates cleanly down the tree in all >>>> cases. >>>> 2. There are a few bus APIs (bus_config_intr() comes to mind) that >>>> use raw IRQ IDs and, so far as I know, can be, and sometimes are >>>> called before bus_alloc_resource(). If the PIC doesn't know about the >>>> IRQ yet when bus_config_intr() (etc.) is called, things will just >>>> break. >>>> >>>> So you would need to make sure that any bus method handling a >>>> resource ID causes it to be mapped on the PIC at that time. It's OK >>>> if that happens in bus_alloc_resource() so long as bus_config_intr(), >>>> bus_setup_intr(), etc. cause that to happen if it hasn't happened yet >>>> -- I don't care when these calls are made to the PIC driver so long >>>> as *what* calls will be made is set at enumeration time. >>>> >>>> I am also totally fine with having, on ARM, bus_config_intr(), >>>> bus_setup_intr() etc. fail if the PIC hasn't attached yet (On >>>> PowerPC, we can't do that yet, but after this conversation, I regard >>>> that as a bug and would fix that later), as well as delaying setup on >>>> the PIC to the first time any bus driver actually tries to *use* the >>>> interrupt (alloc_resource(), setup_intr(), config_intr(), whatever) >>>> rather than when the interrupt is originally allocated. >>>> >> I understand. And yes, bus_alloc_resource() isn't propagated cleanly >> down the tree in all cases. That's why we have 'INTRNG' hook in >> subr_bus.c, which is suboptimal. >> >> About bus_config_intr(). The only consumer of bus_config_intr() is ACPI >> code, in little hackish manner, as workaround for problem which is >> fully solved by INTRNG. >> Also, the semantic of bus_config_intr() is not clear, from INTRNG point >> of view. >> So i have tendency to ignore it :) > > Heh. Fair enough. I hadn't noticed that -- though I can see legitimate > uses for it in the context of GPIOs. In any case, I'm a little wary of > bus_alloc_resource() having side effects. Usually > bus_activate_resource() would do that kind of thing. Yes, good catch!! bus_activate_resource() is definitively best method. > >> >>>> ------------ The following is a large parenthesis ------------------- >>>> >>>> One other possible route here that would also work well would be to >>>> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't >>>> gain a lot by sharing it) and implement ofw_bus_map_intr() locally as >>>> an ofwbus bus method. Then you could have the mapping table stored in >>>> ofwbus's softc -- the API was designed for this initially. You would >>>> need MD extensions for doing PIC registration there (which is fine), >>>> but that segregates all the OFW-specific information into >>>> OFW-specific code and would let the bus methods and the OFW interrupt >>>> mapping table interact cleanly in the same place. This still >>>> preserves the pre-r301453 API, makes PowerPC work, and maybe >>>> address's skra@'s concern about extensibility and letting core >>>> interrupt code know about FDT (or ACPI). I'd be happy to mock this up >>>> as well if you think it's a good approach. >>>> >>> [snip] >>>> This has the following features: >>>> - Existing OFW API and semantics unchanged >>>> - As such, PowerPC, PCI, etc. work fine with no changes >>>> - Details encapsulated in MD code, so individual platforms can >>>> implement this however they like >>>> - arm/arm/intr.c (or whatever) only needs a method to allocate a >>>> fresh interrupt, with no state, and anoter to set the device_t for an >>>> interrupt sometime later. >>>> - The internal table in the platform interrupt code has no knowledge >>>> of any mappings whatsoever except having the appropriate device_t for >>>> the PIC stored with the interrupt vector. >>>> - Device tree bits handled purely in device tree code >>>> - No action need be taken on any mapping until the interrupt is >>>> actually allocated/set up, like r301453 >>>> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar >>>> enumeration-mechanism-specific code in the root bus for that mapping >>>> mechanism. >>>> >>>> -------------- End parenthesis ------------------------------- >>> Here's an implementation of the parenthesis I wrote on an airplane >>> this afternoon. It should be complete, though has not been tested. The >>> code is short and simple (+70 lines in ofwbus.c). This preserves the >>> pre-r301453 API and semantics relative to drivers, which means PowerPC >>> and PCI work out of the box, while keeping the semantics relative to >>> the interrupt layer of r301453 (PIC methods only called on resource >>> allocation, no allocatable IRQs on unattached PICs, encapsulation of >>> OFW-specific code in OFW-specific bits of the tree). It turns out >>> those two things are compatible, somewhat to my surprise, and that >>> makes the result very clean. I like this approach and would be happy >>> to move forward with it. There are five functions of interest: >>> >>> 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you >>> pass an interrupt specifier and parent, you get back an IRQ. No >>> changes. This is the core of the normal OFW interrupt API. >>> >>> 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a >>> new function that PIC drivers are supposed to use to register control >>> of an interrupt domain. This replaces machine-specific code like >>> powerpc_register_pic() to allow the PIC table to be in a bus parent >>> rather than in the interrupt core. >>> >>> 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This >>> is a new function that PIC drivers that know how to handle device-tree >>> interrupt descriptors implement (analogous to various existing ones >>> that vary by platform). It tells the PIC that the given abstract IRQ >>> means the given opaque interrupt specifier. >>> >>> 4. arm_allocate_irq(int suggested). This allocates a new IRQ number >>> not (yet) attached to a PIC, as in r301453. I've added a parameter >>> that lets you pass a suggested number to try in case it is possible to >>> make it match an interrupt pin or something for human-readability. >>> >>> 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD interrupt >>> layer to associate a given PIC device_t with an IRQ. That is all the >>> information the MD layer ever has about the IRQ mapping. >>> >>> Functions #1 and #2 are now implemented completely in ofwbus.c: there >>> are no callouts anywhere else and the interrupt mapping table is >>> maintained entirely internally to ofwbus (in its softc). In order to >>> implement ACPI, or some other strategy, you would implement analogs to >>> functions #1 and #2 that live somewhere in the bus hierarchy that is >>> guaranteed to be above all devices that might want that type of >>> interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers >>> implementing the mapping scheme would provide. >>> >>> Since the system interrupt code has no knowledge at all of interrupt >>> mapping, of any type, in this scheme, adding new mapping types is >>> trivial and can be done on a driver-by-driver basis if necessary >>> without changing KPI and without any other part of the system even >>> being aware. For example, GPIOs can use a completely different >>> mechanism if they want and can do setup purely in the GPIO controller >>> driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a GPIO >>> controller in which the GPIO controller allocates a generic IRQ, >>> assigns through some internal table just in the GPIO driver, and >>> returns to it to a consumer in some other device driver -- without a >>> GPIO mapping type, new bus functions, or modifications to the platform >>> interrupt code. >>> >>> The control flow goes like this: >>> - Bus driver enumerates children, parses interrupts properties, calls >>> OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to >>> interrupt list. >>> - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank >>> disconnected IRQ from the MD interrupt layer, and stores the mapping >>> from the new IRQ to the given interrupt specifier and phandle in an >>> internal table in ofwbus's softc. >>> NB: Nothing else happens here, like post-r301453. Changing this does >>> not change any semantics of the API pre-r301453, which means it >>> remains fully compatible with PCI and PowerPC. Also, like >>> post-r301453, there is no involvement of nexus. >>> - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these >>> messages and adds a (device_t, phandle_t) mapping to a second internal >>> table. Note that the interrupt layer does not need to handle PIC >>> registration anymore at all (except for the root PIC). >>> - Bus child eventually calls a function that tries to set the >>> interrupt up (e.g. bus_setup_intr()). That propagates up the bus >>> hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, >>> looks it up in the table, looks up the appropriate PIC from the PIC >>> table, then: >>> A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the >>> interrupt layer's only interaction with the mapping code. All it deals >>> with is device_ts and abstract IRQ numbers. >>> B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, >>> interrupt-specifier) to tell the PIC that the interrupt layer's IRQ >>> irq_number means the given specifier >>> C) finally, passes the call onto nexus, which will do whatever would >>> normally happen (unmasking the interrupt, setting handlers, etc.) in >>> terms only of the abstract IRQ and the device_t assigned by ofwbus. >>> >>> You would implement ACPI just by doing a s/OFW/ACPI/g >>> search-and-replace above -- since the interrupt layer doesn't know >>> about OFW or ACPI or anything else, there is no need to touch it. This >>> seems clean, simple, compartmentalized, preserves the existing API, >>> and should work on all of our various hardware. PowerPC can't quite >>> work with it yet without some multipass foo, but, since the API is >>> preserved, that transition can happen gradually without KPI changes. >>> For the same reason that it is API-preserving, I think this code is >>> also MFC-able. >>> -Nathan >> I think that we are slightly diverges in this place: >> - please note that PIC API (in kern/pic_if.m) is different from the >> one >> that PPC uses. > > Yes, which is fine (this is machine-dependent code anyway). On > consideration, the PIC_MAP_OFW_INTR() function should probably be > called OFW_BUS_PIC_MAP_INTR() and live in ofw_bus_if.m anyway, then be > implemented by PICs. > >> - we must be able to store configuration data (original interrupt >> specifier) in all cases, not only for OFW. That's reason why I have >> proposed >> to create 'mapping table'. > > It is stored in all cases, just not in the core interrupt code. I've > only mocked this up for OFW here. For ACPI, you would have the > equivalent table in acpi.c, along with ACPI_MAP_INTR(), > ACPI_REGISTER_PIC(), etc. in acpi_if.m and implemented in > dev/acpica/acpi.c. > > For GPIOs, you would have a mechanism that just traces however you are > representing GPIOs anyway. > >> Anyway, i think that we both talking about +/- same solution, only i >> want to move it from OFW specific level implemented at OFW bus level to >> bus/interrupt specifier format independent level implemented in >> subr_intr.c. > > This implements the same API in any case (identical to that > pre-r301453), so the implementation doesn't really matter at all in > terms of my needs. Perfect. > > That said, having it in the root bus for the mapping domain (ofwbus0, > acpi0) etc. seems cleaner to me, with no loss of functionality. The > core interrupt code (subr_intr.c or whatever) doesn't have any obvious > need to know anything at all about the mapping information so long as > it knows the PIC device_t that corresponds to every abstract IRQ so > that it can mask it or do other operations. Since, presumably, nothing > in an ACPI device tree references an interrupt in the OFW device tree > (if you even had both -- and how would you specify that, anyway?), > implementing the relevant bits one step up the bus hierarchy doesn't > change any behavior. > > And nothing can possibly be more flexible in terms of the mapping > table in subr_intr.c than not having a mapping table: you don't need > enums for mapping types, or unions that need to be expanded, breaking > KBI, or anything. You can delete all the PIC registration (and MSI) > code, all the switches, and all the unions from subr_intr.c and have a > totally mapping-mechanism-agnostic KPI. > Where you see "all the switches, and all the unions" in subr_intr.c? subr_intr.c uses nested structures approach, in exactly same way as is used in OFW bus. Moreover, subr_intr.c *IS* currently totally mapping-mechanism-agnostic KPI, and any form of mapping table must follow this rule. But allow me to make short summary: - we must support 3 'main' cases - OFW, ACPI and HINTS (for older arm and mips boards) based systems - single kernel must support all above cases, but only one can by 'active' (kernel must be able to select right one at runtime). - we must support 'direct' (interrupt specifier is defined by one of above methods) or 'indirect' (where interrupt is associated with other function - GPIO pin, but don't have direct description). - we must be able to access single physical pin by both methods, 'direct' and/or 'indirect'. - we must be able to add new interrupt type as simple as possible For interrupt controllers: - single controller must be able to parse multiple formats. 'direct' or 'indirect'. OFW or ACPI. ARM GIC must accept OFW or ACPI arguments (in single kernel), GPIO based interrupt controller must accept 'direct' or 'indirect' formats. For interrupt consumers: - 'direct' interrupts are easy - driver must be able to consume 'indirect' interrupt in 'root bus'/'mapping-mechanism' agnostic manner. For example, MMC driver must be able to get interrupt from CD gpio pin in all possible cases/combinations . Are we still connected? I don't think that all this is possible without single universal format of interrupt mapping specifier. And, if we must have universal format then why we needs different mapping databases? >> Most of your control flow can be implemented at general level as is, or >> already exist [intr_map_irq(), intr_pic_register()] . >> Also, impact for current PPC code is, by me, minimal. >> I can sketch my idea in more details, if you found it acceptable. > > All ideas are fine -- and the solution does not need to apply to all > platforms anyway, so long as the OFW_BUS_MAP_INTR() semantics (and, > ideally, API) are preserved. > >> Again, I'm sorry for delayed and very brief response. > > No worries; good luck with the work emergencies. > -Nathan All problems have been solved, sleep deficit remains :) Michal From owner-freebsd-arch@freebsd.org Thu Aug 4 12:35:07 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D6338BAF85D; Thu, 4 Aug 2016 12:35:07 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B814C13D1; Thu, 4 Aug 2016 12:35:07 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net ([172.56.13.252]) (authenticated bits=0) by d.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u74CYowF011910 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Thu, 4 Aug 2016 05:34:56 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Michal Meloun References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <57A09F34.4050400@FreeBSD.org> <57A30B72.7070809@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: <1946069a-d0f9-2c19-80a5-0b490682574b@freebsd.org> Date: Thu, 4 Aug 2016 05:34:50 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <57A30B72.7070809@FreeBSD.org> Content-Type: multipart/mixed; boundary="------------3EFCE033AD3743BB7374BC07" X-Sonic-CAuth: UmFuZG9tSVYTMvib3iK8t8OtonMY5p5XZ9h+jydHp6pTgJjTe9flGBKGidKY6bMTMv300NtwZ45DQAPm4TnMP/b9QGcG10/7b/Nr44LYeWI= X-Sonic-ID: C;PLZz1j9a5hGCN6/hcgQksw== M;/rHB2T9a5hGCN6/hcgQksw== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Aug 2016 12:35:07 -0000 This is a multi-part message in MIME format. --------------3EFCE033AD3743BB7374BC07 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit On 08/04/16 02:31, Michal Meloun wrote: > Dne 02.08.2016 v 17:31 Nathan Whitehorn napsal(a): >> >> On 08/02/16 06:25, Michal Meloun wrote: >>> I'm sorry for delayed response. We have serious trouble @work so I >>> cannot spend to much time on FreeBSB for next 2 or 3 days. >> Understood. >> >>> Dne 02.08.2016 v 6:00 Nathan Whitehorn napsal(a): >>>> On 07/31/16 11:43, Nathan Whitehorn wrote: >>>>> On 07/31/16 08:40, Michal Meloun wrote: >>>>>> Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): >>>>>> [snip] >>>>>>>> The current API is certainly a bit of a hack and I would be >>>>>>>> pleased to >>>>>>>>> see it go; but the replacement needs to be more functional and >>>>>>>>> more >>>>>>>>> robust. As far as I can tell, I literally *cannot* move PowerPC >>>>>>>>> over >>>>>>>>> to this architecture. >>>>>>>> Yes, at this time, I agree. If you can accept enhancement of >>>>>>>> (owf_)bus_map_intr() then we can move to next, significantly less >>>>>>>> hackish, state. >>>>>>> OK, sure. I wrote some code this afternoon (attached) to sketch >>>>>>> this. >>>>>>> It does not compile or run or anything useful, but it should >>>>>>> illustrate the important parts of what I think is an API that should >>>>>>> work for everyone. I'm happy to finish it if it looks reasonable >>>>>>> -- I >>>>>>> may in any case just to replace PowerPC's increasingly teetering >>>>>>> intr_machdep.c. >>>>>>> >>>>>>> The OF part is essentially unchanged from how it currently works: >>>>>>> there's a method that you pass the interrupt specifier and interrupt >>>>>>> parent to, and it spits back an IRQ that means that combination of >>>>>>> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its >>>>>>> current >>>>>>> API, can be implemented in terms of it in one line. Whenever the >>>>>>> relevant PIC attaches, it is told to use the given IRQ to refer to >>>>>>> that opaque bytestring. >>>>>>> >>>>>>> I've added a set of equivalent functions for ACPI that take the >>>>>>> contents of an ACPI interrupt specifier and do the same things. It >>>>>>> tries to have the IRQ be human-meaningful, if possible, by matching >>>>>>> the ACPI interrupt number. Having separate functions seemed a little >>>>>>> cleaner than exposing the enums and unions as part of the KPI, >>>>>>> but I'm >>>>>>> not wedded to it -- this is just an example. >>>>>>> >>>>>>> PICs register (and deregister) themselves with either an OF >>>>>>> phandle or >>>>>>> an ACPI resource string or (god help us) both as needed. The PICs >>>>>>> have >>>>>>> corresponding methods for interpreting various kinds of interrupt >>>>>>> specifiers. >>>>>>> >>>>>>> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >>>>>>> succeed before the PICs are attached is gated on an #ifdef. That can >>>>>>> probably be off by default on ARM. A PowerPC version of this code >>>>>>> would have to keep it on until various bus drivers are fixed. >>>>>>> >>>>>>> Here's the general flow: >>>>>>> - Parent bus enumerates children, maps IRQs as needed, adds to >>>>>>> resource list. The struct resources involved aren't special (just a >>>>>>> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >>>>>>> implement (and already are implemented, in general, for OF/FDT >>>>>>> drivers). >>>>>>> - bus_alloc_resource() does nothing special >>>>>>> - bus_setup_intr() calls into the PIC and does what is needed, as in >>>>>>> r301453 (to within that #ifdef) >>>>>>> >>>>>>> This should keep all the best pieces of everything: >>>>>>> - ACPI and OF are supported together, and easy to extend for other >>>>>>> types of mapping technologies without breaking either KBI or KPI >>>>>>> (just >>>>>>> add new functions) >>>>>>> - No APIs change for existing Open Firmware code >>>>>>> - The support code can live entirely in machine-dependent >>>>>>> directories, >>>>>>> with no code changes at all required in kern/ or dev/ofw/ >>>>>>> - bus_setup_intr() and friends behave as in r301453 and succeed or >>>>>>> fail for real >>>>>>> - PCI code is not an issue >>>>>>> - No disconnected interrupt state maintained in mapping table >>>>>>> (unless >>>>>>> the transitional #ifdef is on) and the mapping table content is only >>>>>>> larger than r301453 by having a copy of the original interrupt >>>>>>> specifier. >>>>>>> >>>>>>> If and when we manage to fix the kernel APIs to allow non-scalar >>>>>>> interrupt specifiers, this should also be easy to gradually >>>>>>> transmogrify to support that case since there exists a 1:1 >>>>>>> mapping of >>>>>>> scalar IRQs to rich data and the control flow would be identical: >>>>>>> interrupt specifiers are read at enumeration time and a resulting >>>>>>> handle -- struct resource or scalar IRQ -- used afterward to refer >>>>>>> to it. >>>>>>> >>>>>> Nice, nice, i think that we are very close at this point. >>>>>> The problem with the above workflow is that maps IRQ function is >>>>>> called >>>>>> at parent bus enumeration time, generally at BUS_PASS_BUS pass. >>>>>> And at >>>>>> this time, PICs are not exist. >>>>>> Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a >>>>>> mapping >>>>>> table, it doesn't provide this functionality. >>>>>> But yes, i understand that mapping table is important and necessary >>>>>> for you. >>>>>> >>>>>> So, if i can extend our concept a little bit, then: >>>>>> We can add new table (map_data_tbl for example) that holds a >>>>>> copy of >>>>>> the original interrupt specifier, index to irq_sources table and >>>>>> probably some flags. >>>>>> And we can modify the above workflow to: >>>>>> - Parent bus enumerates children, allocates entries from >>>>>> map_data_tbl, >>>>>> adds to resource list. The struct resources involved aren't special >>>>>> (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are >>>>>> trivial to implement (and already are implemented, in general, for >>>>>> OF/FDT drivers). Index to map_data_tbl is used as resource ID. >>>>>> - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, >>>>>> *maps IRQs* and stores result in 'index' field. >>>>>> - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, >>>>>> calls >>>>>> into the PIC and does what is needed, as in r301453. (to within that >>>>>> #ifdef) >>>>>> >>>>>> And, in PPC case, newly attached PIC scans whole map_data_tbl table, >>>>>> finds his entries and makes what needs. >>>>>> >>>>>> Does that make sense? >>>>>> I hope that postponing of map IRQ call is not a problem for PPC, >>>>>> everything else looks easy. >>>>>> Moreover, this additional level of indirection also solves all my >>>>>> 'hypothetical corner cases' with N:1 mappings (between original >>>>>> interrupt specifier and real interrupt number). >>>>> Yes, I think we are converging. >>>>> >>>>> My qualm about bus_alloc_resource() is twofold: >>>>> 1. Traditionally, with interrupts, bus_alloc_resource() has no side >>>>> effects and I'm not sure it propagates cleanly down the tree in all >>>>> cases. >>>>> 2. There are a few bus APIs (bus_config_intr() comes to mind) that >>>>> use raw IRQ IDs and, so far as I know, can be, and sometimes are >>>>> called before bus_alloc_resource(). If the PIC doesn't know about the >>>>> IRQ yet when bus_config_intr() (etc.) is called, things will just >>>>> break. >>>>> >>>>> So you would need to make sure that any bus method handling a >>>>> resource ID causes it to be mapped on the PIC at that time. It's OK >>>>> if that happens in bus_alloc_resource() so long as bus_config_intr(), >>>>> bus_setup_intr(), etc. cause that to happen if it hasn't happened yet >>>>> -- I don't care when these calls are made to the PIC driver so long >>>>> as *what* calls will be made is set at enumeration time. >>>>> >>>>> I am also totally fine with having, on ARM, bus_config_intr(), >>>>> bus_setup_intr() etc. fail if the PIC hasn't attached yet (On >>>>> PowerPC, we can't do that yet, but after this conversation, I regard >>>>> that as a bug and would fix that later), as well as delaying setup on >>>>> the PIC to the first time any bus driver actually tries to *use* the >>>>> interrupt (alloc_resource(), setup_intr(), config_intr(), whatever) >>>>> rather than when the interrupt is originally allocated. >>>>> >>> I understand. And yes, bus_alloc_resource() isn't propagated cleanly >>> down the tree in all cases. That's why we have 'INTRNG' hook in >>> subr_bus.c, which is suboptimal. >>> >>> About bus_config_intr(). The only consumer of bus_config_intr() is ACPI >>> code, in little hackish manner, as workaround for problem which is >>> fully solved by INTRNG. >>> Also, the semantic of bus_config_intr() is not clear, from INTRNG point >>> of view. >>> So i have tendency to ignore it :) >> Heh. Fair enough. I hadn't noticed that -- though I can see legitimate >> uses for it in the context of GPIOs. In any case, I'm a little wary of >> bus_alloc_resource() having side effects. Usually >> bus_activate_resource() would do that kind of thing. > Yes, good catch!! bus_activate_resource() is definitively best method. > >>>>> ------------ The following is a large parenthesis ------------------- >>>>> >>>>> One other possible route here that would also work well would be to >>>>> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't >>>>> gain a lot by sharing it) and implement ofw_bus_map_intr() locally as >>>>> an ofwbus bus method. Then you could have the mapping table stored in >>>>> ofwbus's softc -- the API was designed for this initially. You would >>>>> need MD extensions for doing PIC registration there (which is fine), >>>>> but that segregates all the OFW-specific information into >>>>> OFW-specific code and would let the bus methods and the OFW interrupt >>>>> mapping table interact cleanly in the same place. This still >>>>> preserves the pre-r301453 API, makes PowerPC work, and maybe >>>>> address's skra@'s concern about extensibility and letting core >>>>> interrupt code know about FDT (or ACPI). I'd be happy to mock this up >>>>> as well if you think it's a good approach. >>>>> >>>> [snip] >>>>> This has the following features: >>>>> - Existing OFW API and semantics unchanged >>>>> - As such, PowerPC, PCI, etc. work fine with no changes >>>>> - Details encapsulated in MD code, so individual platforms can >>>>> implement this however they like >>>>> - arm/arm/intr.c (or whatever) only needs a method to allocate a >>>>> fresh interrupt, with no state, and anoter to set the device_t for an >>>>> interrupt sometime later. >>>>> - The internal table in the platform interrupt code has no knowledge >>>>> of any mappings whatsoever except having the appropriate device_t for >>>>> the PIC stored with the interrupt vector. >>>>> - Device tree bits handled purely in device tree code >>>>> - No action need be taken on any mapping until the interrupt is >>>>> actually allocated/set up, like r301453 >>>>> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar >>>>> enumeration-mechanism-specific code in the root bus for that mapping >>>>> mechanism. >>>>> >>>>> -------------- End parenthesis ------------------------------- >>>> Here's an implementation of the parenthesis I wrote on an airplane >>>> this afternoon. It should be complete, though has not been tested. The >>>> code is short and simple (+70 lines in ofwbus.c). This preserves the >>>> pre-r301453 API and semantics relative to drivers, which means PowerPC >>>> and PCI work out of the box, while keeping the semantics relative to >>>> the interrupt layer of r301453 (PIC methods only called on resource >>>> allocation, no allocatable IRQs on unattached PICs, encapsulation of >>>> OFW-specific code in OFW-specific bits of the tree). It turns out >>>> those two things are compatible, somewhat to my surprise, and that >>>> makes the result very clean. I like this approach and would be happy >>>> to move forward with it. There are five functions of interest: >>>> >>>> 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you >>>> pass an interrupt specifier and parent, you get back an IRQ. No >>>> changes. This is the core of the normal OFW interrupt API. >>>> >>>> 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a >>>> new function that PIC drivers are supposed to use to register control >>>> of an interrupt domain. This replaces machine-specific code like >>>> powerpc_register_pic() to allow the PIC table to be in a bus parent >>>> rather than in the interrupt core. >>>> >>>> 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This >>>> is a new function that PIC drivers that know how to handle device-tree >>>> interrupt descriptors implement (analogous to various existing ones >>>> that vary by platform). It tells the PIC that the given abstract IRQ >>>> means the given opaque interrupt specifier. >>>> >>>> 4. arm_allocate_irq(int suggested). This allocates a new IRQ number >>>> not (yet) attached to a PIC, as in r301453. I've added a parameter >>>> that lets you pass a suggested number to try in case it is possible to >>>> make it match an interrupt pin or something for human-readability. >>>> >>>> 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD interrupt >>>> layer to associate a given PIC device_t with an IRQ. That is all the >>>> information the MD layer ever has about the IRQ mapping. >>>> >>>> Functions #1 and #2 are now implemented completely in ofwbus.c: there >>>> are no callouts anywhere else and the interrupt mapping table is >>>> maintained entirely internally to ofwbus (in its softc). In order to >>>> implement ACPI, or some other strategy, you would implement analogs to >>>> functions #1 and #2 that live somewhere in the bus hierarchy that is >>>> guaranteed to be above all devices that might want that type of >>>> interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers >>>> implementing the mapping scheme would provide. >>>> >>>> Since the system interrupt code has no knowledge at all of interrupt >>>> mapping, of any type, in this scheme, adding new mapping types is >>>> trivial and can be done on a driver-by-driver basis if necessary >>>> without changing KPI and without any other part of the system even >>>> being aware. For example, GPIOs can use a completely different >>>> mechanism if they want and can do setup purely in the GPIO controller >>>> driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a GPIO >>>> controller in which the GPIO controller allocates a generic IRQ, >>>> assigns through some internal table just in the GPIO driver, and >>>> returns to it to a consumer in some other device driver -- without a >>>> GPIO mapping type, new bus functions, or modifications to the platform >>>> interrupt code. >>>> >>>> The control flow goes like this: >>>> - Bus driver enumerates children, parses interrupts properties, calls >>>> OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to >>>> interrupt list. >>>> - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank >>>> disconnected IRQ from the MD interrupt layer, and stores the mapping >>>> from the new IRQ to the given interrupt specifier and phandle in an >>>> internal table in ofwbus's softc. >>>> NB: Nothing else happens here, like post-r301453. Changing this does >>>> not change any semantics of the API pre-r301453, which means it >>>> remains fully compatible with PCI and PowerPC. Also, like >>>> post-r301453, there is no involvement of nexus. >>>> - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these >>>> messages and adds a (device_t, phandle_t) mapping to a second internal >>>> table. Note that the interrupt layer does not need to handle PIC >>>> registration anymore at all (except for the root PIC). >>>> - Bus child eventually calls a function that tries to set the >>>> interrupt up (e.g. bus_setup_intr()). That propagates up the bus >>>> hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, >>>> looks it up in the table, looks up the appropriate PIC from the PIC >>>> table, then: >>>> A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the >>>> interrupt layer's only interaction with the mapping code. All it deals >>>> with is device_ts and abstract IRQ numbers. >>>> B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, >>>> interrupt-specifier) to tell the PIC that the interrupt layer's IRQ >>>> irq_number means the given specifier >>>> C) finally, passes the call onto nexus, which will do whatever would >>>> normally happen (unmasking the interrupt, setting handlers, etc.) in >>>> terms only of the abstract IRQ and the device_t assigned by ofwbus. >>>> >>>> You would implement ACPI just by doing a s/OFW/ACPI/g >>>> search-and-replace above -- since the interrupt layer doesn't know >>>> about OFW or ACPI or anything else, there is no need to touch it. This >>>> seems clean, simple, compartmentalized, preserves the existing API, >>>> and should work on all of our various hardware. PowerPC can't quite >>>> work with it yet without some multipass foo, but, since the API is >>>> preserved, that transition can happen gradually without KPI changes. >>>> For the same reason that it is API-preserving, I think this code is >>>> also MFC-able. >>>> -Nathan >>> I think that we are slightly diverges in this place: >>> - please note that PIC API (in kern/pic_if.m) is different from the >>> one >>> that PPC uses. >> Yes, which is fine (this is machine-dependent code anyway). On >> consideration, the PIC_MAP_OFW_INTR() function should probably be >> called OFW_BUS_PIC_MAP_INTR() and live in ofw_bus_if.m anyway, then be >> implemented by PICs. >> >>> - we must be able to store configuration data (original interrupt >>> specifier) in all cases, not only for OFW. That's reason why I have >>> proposed >>> to create 'mapping table'. >> It is stored in all cases, just not in the core interrupt code. I've >> only mocked this up for OFW here. For ACPI, you would have the >> equivalent table in acpi.c, along with ACPI_MAP_INTR(), >> ACPI_REGISTER_PIC(), etc. in acpi_if.m and implemented in >> dev/acpica/acpi.c. >> >> For GPIOs, you would have a mechanism that just traces however you are >> representing GPIOs anyway. >> >>> Anyway, i think that we both talking about +/- same solution, only i >>> want to move it from OFW specific level implemented at OFW bus level to >>> bus/interrupt specifier format independent level implemented in >>> subr_intr.c. >> This implements the same API in any case (identical to that >> pre-r301453), so the implementation doesn't really matter at all in >> terms of my needs. > Perfect. >> That said, having it in the root bus for the mapping domain (ofwbus0, >> acpi0) etc. seems cleaner to me, with no loss of functionality. The >> core interrupt code (subr_intr.c or whatever) doesn't have any obvious >> need to know anything at all about the mapping information so long as >> it knows the PIC device_t that corresponds to every abstract IRQ so >> that it can mask it or do other operations. Since, presumably, nothing >> in an ACPI device tree references an interrupt in the OFW device tree >> (if you even had both -- and how would you specify that, anyway?), >> implementing the relevant bits one step up the bus hierarchy doesn't >> change any behavior. >> >> And nothing can possibly be more flexible in terms of the mapping >> table in subr_intr.c than not having a mapping table: you don't need >> enums for mapping types, or unions that need to be expanded, breaking >> KBI, or anything. You can delete all the PIC registration (and MSI) >> code, all the switches, and all the unions from subr_intr.c and have a >> totally mapping-mechanism-agnostic KPI. >> > Where you see "all the switches, and all the unions" in subr_intr.c? > subr_intr.c uses nested structures approach, in exactly same way as is > used in OFW bus. Moreover, subr_intr.c *IS* currently totally > mapping-mechanism-agnostic KPI, and any form of mapping table must > follow this rule. That was hyperbolic; my apologies. The point was that you don't need intr_map_data, or the intr_map_data_type enum, this way. One of the nice things about that is that you don't need to add more entries to the enum to add new one-off mapping types. For example, the PS3 platform on PPC has its own non-device-tree, non-ACPI bus description and interrupt system. Needing to make new enums (or anything) like INTR_MAP_DATA_PS3 in sys/bus.h seems a little silly, though hardly very bad. I've attached a version of PowerPC's intr_machdep.c and pic_if.m that fully implements this distributed-table scheme. The code ends up being very short and clean (a third the number of lines of code as kern/subr_intr.c, for example) and the diff to PowerPC, including the new ofwbus.c code, ends up being net-negative. > But allow me to make short summary: > - we must support 3 'main' cases - OFW, ACPI and HINTS (for older arm > and mips boards) based systems > - single kernel must support all above cases, but only one can by > 'active' (kernel must be able to select right one at runtime). > - we must support 'direct' (interrupt specifier is defined by one of > above methods) or 'indirect' (where interrupt is associated with other > function - GPIO pin, but don't have direct description). > - we must be able to access single physical pin by both methods, > 'direct' and/or 'indirect'. > - we must be able to add new interrupt type as simple as possible Agreed with all of this. > > For interrupt controllers: > - single controller must be able to parse multiple formats. 'direct' or > 'indirect'. OFW or ACPI. ARM GIC must accept OFW or ACPI arguments (in > single kernel), GPIO based interrupt controller must accept 'direct' or > 'indirect' formats. > > For interrupt consumers: > - 'direct' interrupts are easy > - driver must be able to consume 'indirect' interrupt in 'root > bus'/'mapping-mechanism' agnostic manner. > For example, MMC driver must be able to get interrupt from CD gpio > pin in all possible cases/combinations . > Are we still connected? And this. > I don't think that all this is possible without single universal format > of interrupt mapping specifier. > And, if we must have universal format then why we needs different > mapping databases? It actually works just fine in this mapping-table-in-the-root-bus model. If you have an OFW-type of interrupt, it is set up by ofwbus.c. Since everything that would use an OFW interrupt is a child of ofwbus, this is totally indistinguishable from a global table from the perspective of client code. For ACPI interrupts, they are set up by acpi.c, which is indistinguishable from a global table for the same reason. For hints -- and single-domain systems generally -- you can just have the machine-dependent code assume that interrupts it doesn't explicitly know about belong to the root PIC. This is what you call the "direct" case. For the "indirect" case, there are as usual a few things you could do. They match the set of things you would do in any implementation: 1. Set up one or more global registries for "indirect" PICs, with corresponding mapping methods, either as bus methods on some high-level bus or on the machine-specific nexus, or just a global function that you call. 2. Use the global registry that you already have to have for GPIO controllers and provide a mapping method on the GPIO controller (GPIO_GET_INTERRUPT_FOR_PIN() or something) that returns a mapped IRQ. One of the nice things about distributing the tables this way is that you have lots of flexibility with things like these "indirect" interrupts and are free to do things like #2 at will locally in some machine-specific set of drivers. For example, in the PS3 code, I don't need to add a new "PS3" mapping type *anywhere*. The ps3bus root driver just has a table when it hands out interrupts and talks to ps3pic internally. -Nathan >>> Most of your control flow can be implemented at general level as is, or >>> already exist [intr_map_irq(), intr_pic_register()] . >>> Also, impact for current PPC code is, by me, minimal. >>> I can sketch my idea in more details, if you found it acceptable. >> All ideas are fine -- and the solution does not need to apply to all >> platforms anyway, so long as the OFW_BUS_MAP_INTR() semantics (and, >> ideally, API) are preserved. >> >>> Again, I'm sorry for delayed and very brief response. >> No worries; good luck with the work emergencies. >> -Nathan > All problems have been solved, sleep deficit remains :) > Michal > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > --------------3EFCE033AD3743BB7374BC07 Content-Type: text/plain; charset=UTF-8; name="intr_machdep.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="intr_machdep.c" /*- * Copyright (c) 1991 The Regents of the University of California. * All rights reserved. * * This code is derived from software contributed to Berkeley by * William Jolitz. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /*- * Copyright (c) 2002 Benno Rice. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)isa.c 7.2 (Berkeley) 5/13/91 * form: src/sys/i386/isa/intr_machdep.c,v 1.57 2001/07/20 * * $FreeBSD: head/sys/powerpc/powerpc/intr_machdep.c 296177 2016-02-29 03:38:00Z jhibbits $ */ #include "opt_isa.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pic_if.h" #define MAX_STRAY_LOG 5 static MALLOC_DEFINE(M_INTR, "intr", "interrupt handler data"); struct powerpc_intr { struct intr_event *event; long *cntp; u_int irq; u_int vector; device_t pic; u_int cntindex; cpuset_t cpu; int ipi; }; static u_int intrcnt_index = 0; static struct mtx intr_table_lock; static struct powerpc_intr *powerpc_intrs[INTR_VECTORS]; static u_int nvectors; /* Allocated vectors */ static u_int nirqs = 0; /* Allocated IRQs. */ static u_int stray_count; u_long intrcnt[INTR_VECTORS]; char intrnames[INTR_VECTORS * MAXCOMLEN]; size_t sintrcnt = sizeof(intrcnt); size_t sintrnames = sizeof(intrnames); device_t root_pic; #ifdef SMP static void *ipi_cookie; #endif static void intr_init(void *dummy __unused) { mtx_init(&intr_table_lock, "intr sources lock", NULL, MTX_DEF); } SYSINIT(intr_init, SI_SUB_INTR, SI_ORDER_FIRST, intr_init, NULL); #ifdef SMP static void smp_intr_init(void *dummy __unused) { struct powerpc_intr *i; int vector; for (vector = 0; vector < nvectors; vector++) { i = powerpc_intrs[vector]; if (i != NULL && i->event != NULL && i->pic == root_pic) PIC_BIND(i->pic, i->irq, i->cpu); } } SYSINIT(smp_intr_init, SI_SUB_SMP, SI_ORDER_ANY, smp_intr_init, NULL); #endif static void intrcnt_setname(const char *name, int index) { snprintf(intrnames + (MAXCOMLEN + 1) * index, MAXCOMLEN + 1, "%-*s", MAXCOMLEN, name); } void intrcnt_add(const char *name, u_long **countp) { int idx; idx = atomic_fetchadd_int(&intrcnt_index, 1); *countp = &intrcnt[idx]; intrcnt_setname(name, idx); } static struct powerpc_intr * intr_lookup(u_int irq); { struct powerpc_intr *i; int vector; mtx_lock(&intr_table_lock); for (vector = 0; vector < nvectors; vector++) { i = powerpc_intrs[vector]; if (i != NULL && i->irq == irq) { mtx_unlock(&intr_table_lock); return (i); } } mtx_unlock(&intr_table_lock); return (NULL); } int powerpc_alloc_intr(int suggested) { char intrname[16]; struct powerpc_intr *i, *iscan; int vector, irq; mtx_lock(&intr_table_lock); if (suggested <= 0) { for (vector = 0; vector < nvectors; vector++) { i = powerpc_intrs[vector]; if (i != NULL && i->irq == suggested) { suggested = 0; break; } } } /* * If no suggestion, or we couldn't provide it, use max(1000, max_irq+1) */ if (suggested <= 0) { irq = 1000; for (vector = 0; vector < nvectors; vector++) { i = powerpc_intrs[vector]; if (i != NULL && i->irq == suggested) irq = max(irq, i->irq + 1); } } i = malloc(sizeof(*i), M_INTR, M_NOWAIT); if (i == NULL) { mtx_unlock(&intr_table_lock); return (NULL); } i->event = NULL; i->cntp = NULL; i->irq = irq; i->pic = NULL; i->vector = -1; i->ipi = 0; #ifdef SMP i->cpu = all_cpus; #else CPU_SETOF(0, &i->cpu); #endif for (vector = 0; vector < INTR_VECTORS; vector++) { iscan = powerpc_intrs[vector]; if (iscan == NULL && i->vector == -1) { i->vector = vector; break; } } if (i->vector != -1) { powerpc_intrs[i->vector] = i; i->cntindex = atomic_fetchadd_int(&intrcnt_index, 1); i->cntp = &intrcnt[i->cntindex]; sprintf(intrname, "irq%u:", i->irq); intrcnt_setname(intrname, i->cntindex); nvectors++; } mtx_unlock(&intr_table_lock); if (i->vector == -1) { free(i, M_INTR); i = NULL; } return (i); } int powerpc_set_pic_for_irq(int irq, device_t pic) { struct powerpc_intr *i; i = irq_lookup(irq); KASSERT(i != NULL, ("%s: NULL IRQ %d", __func__, irq)); i->pic = pic; return (0); } static void powerpc_intr_eoi(void *arg) { struct powerpc_intr *i = arg; PIC_EOI(i->pic, i->irq); } static void powerpc_intr_pre_ithread(void *arg) { struct powerpc_intr *i = arg; PIC_MASK(i->pic, i->irq); PIC_EOI(i->pic, i->irq); } static void powerpc_intr_post_ithread(void *arg) { struct powerpc_intr *i = arg; PIC_UNMASK(i->pic, i->irq); } static int powerpc_assign_intr_cpu(void *arg, int cpu) { #ifdef SMP struct powerpc_intr *i = arg; if (cpu == NOCPU) i->cpu = all_cpus; else CPU_SETOF(cpu, &i->cpu); if (!cold && i->pic != NULL && i->pic == root_pic) PIC_BIND(i->pic, i->irq, i->cpu); return (0); #else return (EOPNOTSUPP); #endif } int powerpc_enable_intr(void) { struct powerpc_intr *i; int error, vector; #ifdef SMP int ipi_irq; #endif if (npics == 0) panic("no PIC detected\n"); if (root_pic == NULL) panic("no root PIC\n"); #ifdef SMP /* Install an IPI handler. */ if (mp_ncpus > 1) { ipi_irq = powerpc_alloc_intr(4096); error = PIC_MAP_IPI(root_pic, ipi_irq); KASSERT(error == 0, ("%s: SMP root PIC failed to map IPI", __func__, error)); error = powerpc_setup_intr("IPI", ipi_irq, powerpc_ipi_handler, NULL, NULL, INTR_TYPE_MISC | INTR_EXCL, &ipi_cookie); if (error) { printf("unable to setup IPI handler\n"); return (error); } /* * Some subterfuge: disable late EOI and mark this * as an IPI to the dispatch layer. */ i = intr_lookup(ipi_irq, FALSE); i->event->ie_post_filter = NULL; i->ipi = 1; } #endif for (vector = 0; vector < nvectors; vector++) { i = powerpc_intrs[vector]; if (i == NULL) continue; if (i->pic == NULL) continue; if (i->event != NULL) PIC_ENABLE(i->pic, i->irq, vector); } return (0); } int powerpc_setup_intr(const char *name, u_int irq, driver_filter_t filter, driver_intr_t handler, void *arg, enum intr_type flags, void **cookiep) { struct powerpc_intr *i; int error, enable = 0; i = intr_lookup(irq, FALSE); if (i == NULL) return (ENOMEM); if (i->event == NULL) { error = intr_event_create(&i->event, (void *)i, 0, irq, powerpc_intr_pre_ithread, powerpc_intr_post_ithread, powerpc_intr_eoi, powerpc_assign_intr_cpu, "irq%u:", irq); if (error) return (error); enable = 1; } error = intr_event_add_handler(i->event, name, filter, handler, arg, intr_priority(flags), flags, cookiep); mtx_lock(&intr_table_lock); intrcnt_setname(i->event->ie_fullname, i->cntindex); mtx_unlock(&intr_table_lock); if (!cold && i->pic != NULL) if (i->pic == root_pic) PIC_BIND(i->pic, i->irq, i->cpu); if (enable) PIC_ENABLE(i->pic, i->irq, i->vector); } return (error); } int powerpc_teardown_intr(void *cookie) { return (intr_event_remove_handler(cookie)); } #ifdef SMP int powerpc_bind_intr(u_int irq, u_char cpu) { struct powerpc_intr *i; i = intr_lookup(irq); if (i == NULL) return (ENOMEM); return (intr_event_bind(i->event, cpu)); } #endif int powerpc_config_intr(int irq, enum intr_trigger trig, enum intr_polarity pol) { struct powerpc_intr *i; i = intr_lookup(irq); if (i == NULL) return (ENOMEM); i->trig = trig; i->pol = pol; if (i->pic != NULL) PIC_CONFIG(i->pic, i->irq, trig, pol); return (0); } void powerpc_dispatch_intr(u_int vector, struct trapframe *tf) { struct powerpc_intr *i; struct intr_event *ie; i = powerpc_intrs[vector]; if (i == NULL) goto stray; (*i->cntp)++; ie = i->event; KASSERT(ie != NULL, ("%s: interrupt without an event", __func__)); /* * IPIs are magical and need to be EOI'ed before filtering. * This prevents races in IPI handling. */ if (i->ipi) PIC_EOI(i->pic, i->irq); if (intr_event_handle(ie, tf) != 0) { goto stray; } return; stray: stray_count++; if (stray_count <= MAX_STRAY_LOG) { printf("stray irq %d\n", i ? i->irq : -1); if (stray_count >= MAX_STRAY_LOG) { printf("got %d stray interrupts, not logging anymore\n", MAX_STRAY_LOG); } } if (i != NULL) PIC_MASK(i->pic, i->irq); } void powerpc_intr_mask(u_int irq) { struct powerpc_intr *i; i = intr_lookup(irq); if (i == NULL || i->pic == NULL) return; PIC_MASK(i->pic, i->irq); } void powerpc_intr_unmask(u_int irq) { struct powerpc_intr *i; i = intr_lookup(irq); if (i == NULL || i->pic == NULL) return; PIC_UNMASK(i->pic, i->irq); } --------------3EFCE033AD3743BB7374BC07 Content-Type: text/plain; charset=UTF-8; name="pic_if.m" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="pic_if.m" #- # Copyright (c) 1998 Doug Rabson # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # from: src/sys/kern/bus_if.m,v 1.21 2002/04/21 11:16:10 markm Exp # $FreeBSD: head/sys/powerpc/powerpc/pic_if.m 257059 2013-10-24 15:37:32Z nwhitehorn $ # #include #include #include INTERFACE pic; METHOD void bind { device_t dev; u_int irq; cpuset_t cpumask; }; METHOD void config { device_t dev; u_int irq; enum intr_trigger trig; enum intr_polarity pol; }; METHOD void dispatch { device_t dev; struct trapframe *tf; }; METHOD void enable { device_t dev; u_int irq; u_int vector; }; METHOD void eoi { device_t dev; u_int irq; }; METHOD void ipi { device_t dev; u_int cpu; }; METHOD void mask { device_t dev; u_int irq; }; METHOD void unmask { device_t dev; u_int irq; }; METHOD int map_ipi { device_t dev; }; --------------3EFCE033AD3743BB7374BC07-- From owner-freebsd-arch@freebsd.org Fri Aug 5 15:32:46 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B98EBBB03B0; Fri, 5 Aug 2016 15:32:46 +0000 (UTC) (envelope-from jmg@gold.funkthat.com) Received: from gold.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gold.funkthat.com", Issuer "gold.funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 98E0618D4; Fri, 5 Aug 2016 15:32:46 +0000 (UTC) (envelope-from jmg@gold.funkthat.com) Received: from gold.funkthat.com (localhost [127.0.0.1]) by gold.funkthat.com (8.15.2/8.15.2) with ESMTPS id u75FWb4A016924 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 5 Aug 2016 08:32:37 -0700 (PDT) (envelope-from jmg@gold.funkthat.com) Received: (from jmg@localhost) by gold.funkthat.com (8.15.2/8.15.2/Submit) id u75FWawU016922; Fri, 5 Aug 2016 08:32:36 -0700 (PDT) (envelope-from jmg) Date: Fri, 5 Aug 2016 08:32:36 -0700 From: John-Mark Gurney To: Adrian Chadd Cc: Michelle Sullivan , Anna Wilcox , freebsd-arch , Marius Strobl , Sean Bruno , Jordan Hubbard , sparc64@freebsd.org, Warner Losh Subject: Re: Sparc64 doesn't care about you, and you shouldn't care about Sparc64 Message-ID: <20160805153236.GA16908@funkthat.com> Mail-Followup-To: Adrian Chadd , Michelle Sullivan , Anna Wilcox , freebsd-arch , Marius Strobl , Sean Bruno , Jordan Hubbard , sparc64@freebsd.org, Warner Losh References: <64302b19-9f33-4267-af44-7fc30ea4bf3d@email.android.com> <56434F34.6040707@sorbs.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Operating-System: FreeBSD 9.1-PRERELEASE amd64 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? User-Agent: Mutt/1.6.1 (2016-04-27) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (gold.funkthat.com [127.0.0.1]); Fri, 05 Aug 2016 08:32:38 -0700 (PDT) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Aug 2016 15:32:46 -0000 Adrian Chadd wrote this message on Wed, Nov 11, 2015 at 07:29 -0800: > > FreeBSD has an opportunity to fill the void in the secondhand Oracle > > (Sun) Hardware market if people want to take it.... but I seriously > > don't see why anyone would buy new Oracle boxes and shove FreeBSD on > > them - unless their admins already manage FreeBSD on older + Intel hardware. > > > > As always, someone has to write and debug the code. > > Users are only as good as the revenue to generate to cover > development/debugging/deployment costs. > > So if you're a sparc64 user, how much would you be willing to pay for > freebsd/sparc64 support and development? > > I'm all for keeping an architecture like sparc around, as long as > there's active development and active users. MIPS has both. ARM has > both. Powerpc has both. Sparc is missing some active developers, but > it has plenty of FreeBSD users that speak up (and more users that only > speak up privately.) So, if you want to see sparc64 support continue, > this requires a grass roots effort to get more development happening - > either users need to step up, or someone has to start contributing > money. If you care, then give donations to the FreeBSD Foundation, and say you want/need sparc64 and/or sun4v support... Or find a developer to pay for the work... These are the straight forward ways to keep sparc64 alive... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." From owner-freebsd-arch@freebsd.org Fri Aug 5 22:27:28 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AC9D7BB003D; Fri, 5 Aug 2016 22:27:28 +0000 (UTC) (envelope-from jhs@berklix.com) Received: from slim.berklix.org (slim.berklix.org [94.185.90.68]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "slim.berklix.org", Issuer "slim.berklix.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 2E11C1764; Fri, 5 Aug 2016 22:27:27 +0000 (UTC) (envelope-from jhs@berklix.com) Received: from mart.js.berklix.net (pD9FE9C82.dip0.t-ipconnect.de [217.254.156.130]) (authenticated bits=128) by slim.berklix.org (8.15.2/8.15.2) with ESMTPSA id u75MRDsq085194 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 6 Aug 2016 00:27:20 +0200 (CEST) (envelope-from jhs@berklix.com) Received: from fire.js.berklix.net (fire.js.berklix.net [192.168.91.41]) by mart.js.berklix.net (8.14.3/8.14.3) with ESMTP id u75MRAHS006682; Sat, 6 Aug 2016 00:27:11 +0200 (CEST) (envelope-from jhs@berklix.com) Received: from fire.js.berklix.net (localhost [127.0.0.1]) by fire.js.berklix.net (8.14.7/8.14.7) with ESMTP id u75MQGbj075918; Sat, 6 Aug 2016 00:26:28 +0200 (CEST) (envelope-from jhs@berklix.com) Message-Id: <201608052226.u75MQGbj075918@fire.js.berklix.net> To: John-Mark Gurney cc: Adrian Chadd , sparc64@freebsd.org, Anna Wilcox , freebsd-arch , Sean Bruno , Michelle Sullivan , Marius Strobl , Jordan Hubbard Subject: Re: Sparc64 doesn't care about you, and you shouldn't care about Sparc64 From: "Julian H. Stacey" Organization: http://berklix.eu BSD Unix Linux Consultants, Munich Germany User-agent: EXMH on FreeBSD http://berklix.eu/free/ X-URL: http://www.berklix.eu/~jhs/ In-reply-to: Your message "Fri, 05 Aug 2016 08:32:36 -0700." <20160805153236.GA16908@funkthat.com> Date: Sat, 06 Aug 2016 00:26:16 +0200 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Aug 2016 22:27:28 -0000 Hi, Reference: > From: John-Mark Gurney > Date: Fri, 5 Aug 2016 08:32:36 -0700 John-Mark Gurney wrote: > Adrian Chadd wrote this message on Wed, Nov 11, 2015 at 07:29 -0800: > > > FreeBSD has an opportunity to fill the void in the secondhand Oracle > > > (Sun) Hardware market if people want to take it.... but I seriously > > > don't see why anyone would buy new Oracle boxes and shove FreeBSD on > > > them - unless their admins already manage FreeBSD on older + Intel hardware. > > > > > > > As always, someone has to write and debug the code. > > > > Users are only as good as the revenue to generate to cover > > development/debugging/deployment costs. > > > > So if you're a sparc64 user, how much would you be willing to pay for > > freebsd/sparc64 support and development? > > > > I'm all for keeping an architecture like sparc around, as long as > > there's active development and active users. MIPS has both. ARM has > > both. Powerpc has both. Sparc is missing some active developers, but > > it has plenty of FreeBSD users that speak up (and more users that only > > speak up privately.) So, if you want to see sparc64 support continue, > > this requires a grass roots effort to get more development happening - > > either users need to step up, or someone has to start contributing > > money. > > If you care, then give donations to the FreeBSD Foundation, and say > you want/need sparc64 and/or sun4v support... > > Or find a developer to pay for the work... FYI Global geograhic index: http://www.berklix.com/consultants/table.html > These are the straight > forward ways to keep sparc64 alive... > > -- > John-Mark Gurney Voice: +1 415 225 5579 Cheers, Julian -- Julian Stacey, BSD Linux Unix Sys Eng Consultant Munich Reply below, Prefix '> '. Plain text, No .doc, base64, HTML, quoted-printable. http://berklix.eu/brexit/#stolen_votes From owner-freebsd-arch@freebsd.org Sat Aug 6 05:19:23 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B0047BB0C18; Sat, 6 Aug 2016 05:19:23 +0000 (UTC) (envelope-from kmacybsd@gmail.com) Received: from mail-io0-x22d.google.com (mail-io0-x22d.google.com [IPv6:2607:f8b0:4001:c06::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6E33B1C9F; Sat, 6 Aug 2016 05:19:23 +0000 (UTC) (envelope-from kmacybsd@gmail.com) Received: by mail-io0-x22d.google.com with SMTP id b62so317198573iod.3; Fri, 05 Aug 2016 22:19:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to; bh=XYBgyGTph1tIWCYxLUYGKn+j1s7aeIWKVESQom1RAUY=; b=GCz9wrEEqVDX/GWLFA1UHwQe9dJoNtVFl0tz01IAImGxS+1E9V/WtO3f8MPqt2s5QB nsJY4M556/MU/+mvQcrkiOHUM9KVtnuMBqR65MTVgrFMM2T7V7d67Tn2chJ+Fv18CbEG HkjZAYtbJQfFj7Eg2CDxCUAn3YM0qo/wJSB8Oqtjg0AnHOIMZMS7Gheo6fUGRWyS/FmC n1HCIsjXsDYvtPYDUgpOhmvZqGlAS+od+HzAVwGEbzmRQujp6ERGdLJlwmZzSQrRTTmt XYkcQHxn1dzm0xrSPhq29nm7nKGoAwQIbxmrCuYZ4JhjUqQuFCYWU0GMFFhkjp8PDbyD LbRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to; bh=XYBgyGTph1tIWCYxLUYGKn+j1s7aeIWKVESQom1RAUY=; b=f2YsZS4BIo9EqW0NIDgSVzo2BBWtvPKqcrsPemFY2o2pvetH2cOL1WzjCfcl64Lpxv 3KFEjDJScb/kSxL2gfGDlVErWBSV61zdYumFKSzW/So1GA/woYT5CnUvvGXjOuPYGoGs la3jyP2rh/aWtD9rVWnyrwbMg9yomTa0MkIw+LaJKBZGRUDNuDVKaRhv4HE4usQ5TAH5 Y67MOxKpvTeE21LaRKtdzr3pLHKXaMTlF92yTzSurCuHD9hm990Q9PaU8EPSSuz0Viwz J4snBZn1j51QBrsBCS1Q9ZEhfh0OoMdz0kVKyn2U26BzAjRBePkE1zyP/7K2APjpRaRf kJkg== X-Gm-Message-State: AEkoouvtdfFGpjk5WxJYQFHEFdMEavllMxnggX+3wV9BqHHN8VmcwokQjGG+TS5pBMigPKW2L4JmkIdAco0Bvg== X-Received: by 10.107.140.205 with SMTP id o196mr93192214iod.42.1470460762766; Fri, 05 Aug 2016 22:19:22 -0700 (PDT) MIME-Version: 1.0 Sender: kmacybsd@gmail.com Received: by 10.107.143.11 with HTTP; Fri, 5 Aug 2016 22:19:22 -0700 (PDT) In-Reply-To: <20160805153236.GA16908@funkthat.com> References: <64302b19-9f33-4267-af44-7fc30ea4bf3d@email.android.com> <56434F34.6040707@sorbs.net> <20160805153236.GA16908@funkthat.com> From: "K. Macy" Date: Fri, 5 Aug 2016 22:19:22 -0700 X-Google-Sender-Auth: SzeFmZuqNUEqWPJuQ6DkDzV032A Message-ID: Subject: Re: Sparc64 doesn't care about you, and you shouldn't care about Sparc64 To: Adrian Chadd , Michelle Sullivan , Anna Wilcox , freebsd-arch , Marius Strobl , Sean Bruno , Jordan Hubbard , sparc64@freebsd.org, Warner Losh Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Aug 2016 05:19:23 -0000 >> I'm all for keeping an architecture like sparc around, as long as >> there's active development and active users. MIPS has both. ARM has >> both. Powerpc has both. Sparc is missing some active developers, but >> it has plenty of FreeBSD users that speak up (and more users that only >> speak up privately.) So, if you want to see sparc64 support continue, >> this requires a grass roots effort to get more development happening - >> either users need to step up, or someone has to start contributing >> money. > > If you care, then give donations to the FreeBSD Foundation, and say > you want/need sparc64 and/or sun4v support... That isn't how a c3 works. There are plenty of useful functions that the FF provides. Specifically directing donations is not one of them. > Or find a developer to pay for the work... These are the straight > forward ways to keep sparc64 alive... This is the only path that's guaranteed success. But it's unlikely that the user base is going to be able to offset much more than a small portion of the financial opportunity cost. From owner-freebsd-arch@freebsd.org Sat Aug 6 14:30:34 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 31BEABB0D24; Sat, 6 Aug 2016 14:30:34 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from mail.miracle.cz (mail.miracle.cz [193.84.128.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.miracle.cz", Issuer "Miracle Group Root CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id D276A1A4C; Sat, 6 Aug 2016 14:30:33 +0000 (UTC) (envelope-from mmel@FreeBSD.org) Received: from [193.84.128.50] (meloun.ad.miracle.cz [193.84.128.50]) by mail.miracle.cz (Postfix) with ESMTPSA id 270AE3AC81; Sat, 6 Aug 2016 16:30:25 +0200 (CEST) Subject: Re: INTRNG (Was: svn commit: r301453....) To: Nathan Whitehorn References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <57A09F34.4050400@FreeBSD.org> <57A30B72.7070809@FreeBSD.org> <1946069a-d0f9-2c19-80a5-0b490682574b@freebsd.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Michal Meloun X-Enigmail-Draft-Status: N1110 Message-ID: <57A5F480.20309@FreeBSD.org> Date: Sat, 6 Aug 2016 16:30:24 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <1946069a-d0f9-2c19-80a5-0b490682574b@freebsd.org> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.miracle.cz); Sat, 06 Aug 2016 16:30:25 +0200 (CEST) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Aug 2016 14:30:34 -0000 Dne 04.08.2016 v 14:34 Nathan Whitehorn napsal(a): [snip] > >> >>>>>> ------------ The following is a large parenthesis >>>>>> ------------------- >>>>>> >>>>>> One other possible route here that would also work well would be to >>>>>> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't >>>>>> gain a lot by sharing it) and implement ofw_bus_map_intr() >>>>>> locally as >>>>>> an ofwbus bus method. Then you could have the mapping table >>>>>> stored in >>>>>> ofwbus's softc -- the API was designed for this initially. You would >>>>>> need MD extensions for doing PIC registration there (which is fine), >>>>>> but that segregates all the OFW-specific information into >>>>>> OFW-specific code and would let the bus methods and the OFW >>>>>> interrupt >>>>>> mapping table interact cleanly in the same place. This still >>>>>> preserves the pre-r301453 API, makes PowerPC work, and maybe >>>>>> address's skra@'s concern about extensibility and letting core >>>>>> interrupt code know about FDT (or ACPI). I'd be happy to mock >>>>>> this up >>>>>> as well if you think it's a good approach. >>>>>> >>>>> [snip] >>>>>> This has the following features: >>>>>> - Existing OFW API and semantics unchanged >>>>>> - As such, PowerPC, PCI, etc. work fine with no changes >>>>>> - Details encapsulated in MD code, so individual platforms can >>>>>> implement this however they like >>>>>> - arm/arm/intr.c (or whatever) only needs a method to allocate a >>>>>> fresh interrupt, with no state, and anoter to set the device_t >>>>>> for an >>>>>> interrupt sometime later. >>>>>> - The internal table in the platform interrupt code has no knowledge >>>>>> of any mappings whatsoever except having the appropriate device_t >>>>>> for >>>>>> the PIC stored with the interrupt vector. >>>>>> - Device tree bits handled purely in device tree code >>>>>> - No action need be taken on any mapping until the interrupt is >>>>>> actually allocated/set up, like r301453 >>>>>> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar >>>>>> enumeration-mechanism-specific code in the root bus for that mapping >>>>>> mechanism. >>>>>> >>>>>> -------------- End parenthesis ------------------------------- >>>>> Here's an implementation of the parenthesis I wrote on an airplane >>>>> this afternoon. It should be complete, though has not been tested. >>>>> The >>>>> code is short and simple (+70 lines in ofwbus.c). This preserves the >>>>> pre-r301453 API and semantics relative to drivers, which means >>>>> PowerPC >>>>> and PCI work out of the box, while keeping the semantics relative to >>>>> the interrupt layer of r301453 (PIC methods only called on resource >>>>> allocation, no allocatable IRQs on unattached PICs, encapsulation of >>>>> OFW-specific code in OFW-specific bits of the tree). It turns out >>>>> those two things are compatible, somewhat to my surprise, and that >>>>> makes the result very clean. I like this approach and would be happy >>>>> to move forward with it. There are five functions of interest: >>>>> >>>>> 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you >>>>> pass an interrupt specifier and parent, you get back an IRQ. No >>>>> changes. This is the core of the normal OFW interrupt API. >>>>> >>>>> 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a >>>>> new function that PIC drivers are supposed to use to register control >>>>> of an interrupt domain. This replaces machine-specific code like >>>>> powerpc_register_pic() to allow the PIC table to be in a bus parent >>>>> rather than in the interrupt core. >>>>> >>>>> 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This >>>>> is a new function that PIC drivers that know how to handle >>>>> device-tree >>>>> interrupt descriptors implement (analogous to various existing ones >>>>> that vary by platform). It tells the PIC that the given abstract IRQ >>>>> means the given opaque interrupt specifier. >>>>> >>>>> 4. arm_allocate_irq(int suggested). This allocates a new IRQ number >>>>> not (yet) attached to a PIC, as in r301453. I've added a parameter >>>>> that lets you pass a suggested number to try in case it is >>>>> possible to >>>>> make it match an interrupt pin or something for human-readability. >>>>> >>>>> 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD >>>>> interrupt >>>>> layer to associate a given PIC device_t with an IRQ. That is all the >>>>> information the MD layer ever has about the IRQ mapping. >>>>> >>>>> Functions #1 and #2 are now implemented completely in ofwbus.c: there >>>>> are no callouts anywhere else and the interrupt mapping table is >>>>> maintained entirely internally to ofwbus (in its softc). In order to >>>>> implement ACPI, or some other strategy, you would implement >>>>> analogs to >>>>> functions #1 and #2 that live somewhere in the bus hierarchy that is >>>>> guaranteed to be above all devices that might want that type of >>>>> interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers >>>>> implementing the mapping scheme would provide. >>>>> >>>>> Since the system interrupt code has no knowledge at all of interrupt >>>>> mapping, of any type, in this scheme, adding new mapping types is >>>>> trivial and can be done on a driver-by-driver basis if necessary >>>>> without changing KPI and without any other part of the system even >>>>> being aware. For example, GPIOs can use a completely different >>>>> mechanism if they want and can do setup purely in the GPIO controller >>>>> driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a >>>>> GPIO >>>>> controller in which the GPIO controller allocates a generic IRQ, >>>>> assigns through some internal table just in the GPIO driver, and >>>>> returns to it to a consumer in some other device driver -- without a >>>>> GPIO mapping type, new bus functions, or modifications to the >>>>> platform >>>>> interrupt code. >>>>> >>>>> The control flow goes like this: >>>>> - Bus driver enumerates children, parses interrupts properties, calls >>>>> OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to >>>>> interrupt list. >>>>> - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank >>>>> disconnected IRQ from the MD interrupt layer, and stores the mapping >>>>> from the new IRQ to the given interrupt specifier and phandle in an >>>>> internal table in ofwbus's softc. >>>>> NB: Nothing else happens here, like post-r301453. Changing >>>>> this does >>>>> not change any semantics of the API pre-r301453, which means it >>>>> remains fully compatible with PCI and PowerPC. Also, like >>>>> post-r301453, there is no involvement of nexus. >>>>> - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these >>>>> messages and adds a (device_t, phandle_t) mapping to a second >>>>> internal >>>>> table. Note that the interrupt layer does not need to handle PIC >>>>> registration anymore at all (except for the root PIC). >>>>> - Bus child eventually calls a function that tries to set the >>>>> interrupt up (e.g. bus_setup_intr()). That propagates up the bus >>>>> hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, >>>>> looks it up in the table, looks up the appropriate PIC from the PIC >>>>> table, then: >>>>> A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the >>>>> interrupt layer's only interaction with the mapping code. All it >>>>> deals >>>>> with is device_ts and abstract IRQ numbers. >>>>> B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, >>>>> interrupt-specifier) to tell the PIC that the interrupt layer's IRQ >>>>> irq_number means the given specifier >>>>> C) finally, passes the call onto nexus, which will do whatever >>>>> would >>>>> normally happen (unmasking the interrupt, setting handlers, etc.) in >>>>> terms only of the abstract IRQ and the device_t assigned by ofwbus. >>>>> >>>>> You would implement ACPI just by doing a s/OFW/ACPI/g >>>>> search-and-replace above -- since the interrupt layer doesn't know >>>>> about OFW or ACPI or anything else, there is no need to touch it. >>>>> This >>>>> seems clean, simple, compartmentalized, preserves the existing API, >>>>> and should work on all of our various hardware. PowerPC can't quite >>>>> work with it yet without some multipass foo, but, since the API is >>>>> preserved, that transition can happen gradually without KPI changes. >>>>> For the same reason that it is API-preserving, I think this code is >>>>> also MFC-able. >>>>> -Nathan >>>> I think that we are slightly diverges in this place: >>>> - please note that PIC API (in kern/pic_if.m) is different from the >>>> one >>>> that PPC uses. >>> Yes, which is fine (this is machine-dependent code anyway). On >>> consideration, the PIC_MAP_OFW_INTR() function should probably be >>> called OFW_BUS_PIC_MAP_INTR() and live in ofw_bus_if.m anyway, then be >>> implemented by PICs. >>> >>>> - we must be able to store configuration data (original interrupt >>>> specifier) in all cases, not only for OFW. That's reason why I have >>>> proposed >>>> to create 'mapping table'. >>> It is stored in all cases, just not in the core interrupt code. I've >>> only mocked this up for OFW here. For ACPI, you would have the >>> equivalent table in acpi.c, along with ACPI_MAP_INTR(), >>> ACPI_REGISTER_PIC(), etc. in acpi_if.m and implemented in >>> dev/acpica/acpi.c. >>> >>> For GPIOs, you would have a mechanism that just traces however you are >>> representing GPIOs anyway. >>> >>>> Anyway, i think that we both talking about +/- same solution, only i >>>> want to move it from OFW specific level implemented at OFW bus >>>> level to >>>> bus/interrupt specifier format independent level implemented in >>>> subr_intr.c. >>> This implements the same API in any case (identical to that >>> pre-r301453), so the implementation doesn't really matter at all in >>> terms of my needs. >> Perfect. >>> That said, having it in the root bus for the mapping domain (ofwbus0, >>> acpi0) etc. seems cleaner to me, with no loss of functionality. The >>> core interrupt code (subr_intr.c or whatever) doesn't have any obvious >>> need to know anything at all about the mapping information so long as >>> it knows the PIC device_t that corresponds to every abstract IRQ so >>> that it can mask it or do other operations. Since, presumably, nothing >>> in an ACPI device tree references an interrupt in the OFW device tree >>> (if you even had both -- and how would you specify that, anyway?), >>> implementing the relevant bits one step up the bus hierarchy doesn't >>> change any behavior. >>> >>> And nothing can possibly be more flexible in terms of the mapping >>> table in subr_intr.c than not having a mapping table: you don't need >>> enums for mapping types, or unions that need to be expanded, breaking >>> KBI, or anything. You can delete all the PIC registration (and MSI) >>> code, all the switches, and all the unions from subr_intr.c and have a >>> totally mapping-mechanism-agnostic KPI. >>> >> Where you see "all the switches, and all the unions" in subr_intr.c? >> subr_intr.c uses nested structures approach, in exactly same way as is >> used in OFW bus. Moreover, subr_intr.c *IS* currently totally >> mapping-mechanism-agnostic KPI, and any form of mapping table must >> follow this rule. > > That was hyperbolic; my apologies. > > The point was that you don't need intr_map_data, or the > intr_map_data_type enum, this way. One of the nice things about that > is that you don't need to add more entries to the enum to add new > one-off mapping types. For example, the PS3 platform on PPC has its > own non-device-tree, non-ACPI bus description and interrupt system. > Needing to make new enums (or anything) like INTR_MAP_DATA_PS3 in > sys/bus.h seems a little silly, though hardly very bad. Hmm, right. Is something like solution for you? > I've attached a version of PowerPC's intr_machdep.c and pic_if.m that > fully implements this distributed-table scheme. The code ends up being > very short and clean (a third the number of lines of code as > kern/subr_intr.c, for example) and the diff to PowerPC, including the > new ofwbus.c code, ends up being net-negative. Yep, I read it very carefully. >> But allow me to make short summary: >> - we must support 3 'main' cases - OFW, ACPI and HINTS (for older arm >> and mips boards) based systems >> - single kernel must support all above cases, but only one can by >> 'active' (kernel must be able to select right one at runtime). >> - we must support 'direct' (interrupt specifier is defined by one of >> above methods) or 'indirect' (where interrupt is associated with other >> function - GPIO pin, but don't have direct description). >> - we must be able to access single physical pin by both methods, >> 'direct' and/or 'indirect'. >> - we must be able to add new interrupt type as simple as possible > > Agreed with all of this. > >> For interrupt controllers: >> - single controller must be able to parse multiple formats. 'direct' or >> 'indirect'. OFW or ACPI. ARM GIC must accept OFW or ACPI arguments (in >> single kernel), GPIO based interrupt controller must accept 'direct' or >> 'indirect' formats. >> >> For interrupt consumers: >> - 'direct' interrupts are easy >> - driver must be able to consume 'indirect' interrupt in 'root >> bus'/'mapping-mechanism' agnostic manner. >> For example, MMC driver must be able to get interrupt from CD gpio >> pin in all possible cases/combinations . >> Are we still connected? > > And this. > >> I don't think that all this is possible without single universal format >> of interrupt mapping specifier. >> And, if we must have universal format then why we needs different >> mapping databases? > > It actually works just fine in this mapping-table-in-the-root-bus > model. If you have an OFW-type of interrupt, it is set up by ofwbus.c. > Since everything that would use an OFW interrupt is a child of ofwbus, > this is totally indistinguishable from a global table from the > perspective of client code. For ACPI interrupts, they are set up by > acpi.c, which is indistinguishable from a global table for the same > reason. For hints -- and single-domain systems generally -- you can > just have the machine-dependent code assume that interrupts it doesn't > explicitly know about belong to the root PIC. This is what you call > the "direct" case. > > For the "indirect" case, there are as usual a few things you could do. > They match the set of things you would do in any implementation: > 1. Set up one or more global registries for "indirect" PICs, with > corresponding mapping methods, either as bus methods on some > high-level bus or on the machine-specific nexus, or just a global > function that you call. > 2. Use the global registry that you already have to have for GPIO > controllers and provide a mapping method on the GPIO controller > (GPIO_GET_INTERRUPT_FOR_PIN() or something) that returns a mapped IRQ. > I still don't think that this can works. Assume that i have driver with 2 interrupts, one is standard (OFW property based), second is GPIO, targeting same interrupt controller. First is allocated by bus_alloc_resource(), second by gpio_alloc_intr_resource(), so each resource is stored in different table. Then bus_setup_intr() is called, on both. So bus_setup_intr() must be able to select right table, right? But how? > One of the nice things about distributing the tables this way is that > you have lots of flexibility with things like these "indirect" > interrupts and are free to do things like #2 at will locally in some > machine-specific set of drivers. For example, in the PS3 code, I don't > need to add a new "PS3" mapping type *anywhere*. The ps3bus root > driver just has a table when it hands out interrupts and talks to > ps3pic internally. > -Nathan >>>> Most of your control flow can be implemented at general level as >>>> is, or >>>> already exist [intr_map_irq(), intr_pic_register()] . >>>> Also, impact for current PPC code is, by me, minimal. >>>> I can sketch my idea in more details, if you found it acceptable. >>> All ideas are fine -- and the solution does not need to apply to all >>> platforms anyway, so long as the OFW_BUS_MAP_INTR() semantics (and, >>> ideally, API) are preserved. I have prepared working patch (it's not full, but it works on Tegra), https://github.com/strejda/tegra/commit/2cf72e248877fb917c4fc618bcb6e46b7c1058a4 can you, please, take look on it? Also, please, take in account: - we have, currently, 20+ interrupt controllers converted to new INTRNG PIC API. - universal format of interrupt resources is generic part of this API. - I'm not ready to commit any PIC API change together with above patch - i hope that this is understandable. also, "we must be able to add new interrupt type as simple as possible" rule is very important for me. Adding new table (with implementation), one bus method and one PIC method for each new type is not *simple*. In any case, can we concentrate to above patch first? I'm ready to finish it in next few hours, then put it to phabricator for real review. Michal From owner-freebsd-arch@freebsd.org Sat Aug 6 16:44:38 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 08954BB01C2; Sat, 6 Aug 2016 16:44:38 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D1DEB1DBE; Sat, 6 Aug 2016 16:44:37 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net (75-101-50-44.static.sonic.net [75.101.50.44]) (authenticated bits=0) by d.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u76GiRfI009180 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Sat, 6 Aug 2016 09:44:28 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Michal Meloun References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <57A09F34.4050400@FreeBSD.org> <57A30B72.7070809@FreeBSD.org> <1946069a-d0f9-2c19-80a5-0b490682574b@freebsd.org> <57A5F480.20309@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: <1d63e3aa-1a2f-992f-ae83-656eb185d386@freebsd.org> Date: Sat, 6 Aug 2016 09:44:27 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <57A5F480.20309@FreeBSD.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVaj1vDjAsIVaTSc89rfeKrjKezA8+PlavUyNWZYQ8IhS2paX50XUnK+e7jU+O+73wK23wQ1MK3EbzRJzxTvQQYaNUgnlMGRjvc= X-Sonic-ID: C;ysNCCvVb5hGydK/hcgQksw== M;4JiOCvVb5hGydK/hcgQksw== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Aug 2016 16:44:38 -0000 On 08/06/16 07:30, Michal Meloun wrote: > Dne 04.08.2016 v 14:34 Nathan Whitehorn napsal(a): > [snip] >>>>>>> ------------ The following is a large parenthesis >>>>>>> ------------------- >>>>>>> >>>>>>> One other possible route here that would also work well would be to >>>>>>> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't >>>>>>> gain a lot by sharing it) and implement ofw_bus_map_intr() >>>>>>> locally as >>>>>>> an ofwbus bus method. Then you could have the mapping table >>>>>>> stored in >>>>>>> ofwbus's softc -- the API was designed for this initially. You would >>>>>>> need MD extensions for doing PIC registration there (which is fine), >>>>>>> but that segregates all the OFW-specific information into >>>>>>> OFW-specific code and would let the bus methods and the OFW >>>>>>> interrupt >>>>>>> mapping table interact cleanly in the same place. This still >>>>>>> preserves the pre-r301453 API, makes PowerPC work, and maybe >>>>>>> address's skra@'s concern about extensibility and letting core >>>>>>> interrupt code know about FDT (or ACPI). I'd be happy to mock >>>>>>> this up >>>>>>> as well if you think it's a good approach. >>>>>>> >>>>>> [snip] >>>>>>> This has the following features: >>>>>>> - Existing OFW API and semantics unchanged >>>>>>> - As such, PowerPC, PCI, etc. work fine with no changes >>>>>>> - Details encapsulated in MD code, so individual platforms can >>>>>>> implement this however they like >>>>>>> - arm/arm/intr.c (or whatever) only needs a method to allocate a >>>>>>> fresh interrupt, with no state, and anoter to set the device_t >>>>>>> for an >>>>>>> interrupt sometime later. >>>>>>> - The internal table in the platform interrupt code has no knowledge >>>>>>> of any mappings whatsoever except having the appropriate device_t >>>>>>> for >>>>>>> the PIC stored with the interrupt vector. >>>>>>> - Device tree bits handled purely in device tree code >>>>>>> - No action need be taken on any mapping until the interrupt is >>>>>>> actually allocated/set up, like r301453 >>>>>>> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar >>>>>>> enumeration-mechanism-specific code in the root bus for that mapping >>>>>>> mechanism. >>>>>>> >>>>>>> -------------- End parenthesis ------------------------------- >>>>>> Here's an implementation of the parenthesis I wrote on an airplane >>>>>> this afternoon. It should be complete, though has not been tested. >>>>>> The >>>>>> code is short and simple (+70 lines in ofwbus.c). This preserves the >>>>>> pre-r301453 API and semantics relative to drivers, which means >>>>>> PowerPC >>>>>> and PCI work out of the box, while keeping the semantics relative to >>>>>> the interrupt layer of r301453 (PIC methods only called on resource >>>>>> allocation, no allocatable IRQs on unattached PICs, encapsulation of >>>>>> OFW-specific code in OFW-specific bits of the tree). It turns out >>>>>> those two things are compatible, somewhat to my surprise, and that >>>>>> makes the result very clean. I like this approach and would be happy >>>>>> to move forward with it. There are five functions of interest: >>>>>> >>>>>> 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you >>>>>> pass an interrupt specifier and parent, you get back an IRQ. No >>>>>> changes. This is the core of the normal OFW interrupt API. >>>>>> >>>>>> 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a >>>>>> new function that PIC drivers are supposed to use to register control >>>>>> of an interrupt domain. This replaces machine-specific code like >>>>>> powerpc_register_pic() to allow the PIC table to be in a bus parent >>>>>> rather than in the interrupt core. >>>>>> >>>>>> 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This >>>>>> is a new function that PIC drivers that know how to handle >>>>>> device-tree >>>>>> interrupt descriptors implement (analogous to various existing ones >>>>>> that vary by platform). It tells the PIC that the given abstract IRQ >>>>>> means the given opaque interrupt specifier. >>>>>> >>>>>> 4. arm_allocate_irq(int suggested). This allocates a new IRQ number >>>>>> not (yet) attached to a PIC, as in r301453. I've added a parameter >>>>>> that lets you pass a suggested number to try in case it is >>>>>> possible to >>>>>> make it match an interrupt pin or something for human-readability. >>>>>> >>>>>> 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD >>>>>> interrupt >>>>>> layer to associate a given PIC device_t with an IRQ. That is all the >>>>>> information the MD layer ever has about the IRQ mapping. >>>>>> >>>>>> Functions #1 and #2 are now implemented completely in ofwbus.c: there >>>>>> are no callouts anywhere else and the interrupt mapping table is >>>>>> maintained entirely internally to ofwbus (in its softc). In order to >>>>>> implement ACPI, or some other strategy, you would implement >>>>>> analogs to >>>>>> functions #1 and #2 that live somewhere in the bus hierarchy that is >>>>>> guaranteed to be above all devices that might want that type of >>>>>> interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers >>>>>> implementing the mapping scheme would provide. >>>>>> >>>>>> Since the system interrupt code has no knowledge at all of interrupt >>>>>> mapping, of any type, in this scheme, adding new mapping types is >>>>>> trivial and can be done on a driver-by-driver basis if necessary >>>>>> without changing KPI and without any other part of the system even >>>>>> being aware. For example, GPIOs can use a completely different >>>>>> mechanism if they want and can do setup purely in the GPIO controller >>>>>> driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a >>>>>> GPIO >>>>>> controller in which the GPIO controller allocates a generic IRQ, >>>>>> assigns through some internal table just in the GPIO driver, and >>>>>> returns to it to a consumer in some other device driver -- without a >>>>>> GPIO mapping type, new bus functions, or modifications to the >>>>>> platform >>>>>> interrupt code. >>>>>> >>>>>> The control flow goes like this: >>>>>> - Bus driver enumerates children, parses interrupts properties, calls >>>>>> OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to >>>>>> interrupt list. >>>>>> - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank >>>>>> disconnected IRQ from the MD interrupt layer, and stores the mapping >>>>>> from the new IRQ to the given interrupt specifier and phandle in an >>>>>> internal table in ofwbus's softc. >>>>>> NB: Nothing else happens here, like post-r301453. Changing >>>>>> this does >>>>>> not change any semantics of the API pre-r301453, which means it >>>>>> remains fully compatible with PCI and PowerPC. Also, like >>>>>> post-r301453, there is no involvement of nexus. >>>>>> - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these >>>>>> messages and adds a (device_t, phandle_t) mapping to a second >>>>>> internal >>>>>> table. Note that the interrupt layer does not need to handle PIC >>>>>> registration anymore at all (except for the root PIC). >>>>>> - Bus child eventually calls a function that tries to set the >>>>>> interrupt up (e.g. bus_setup_intr()). That propagates up the bus >>>>>> hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number, >>>>>> looks it up in the table, looks up the appropriate PIC from the PIC >>>>>> table, then: >>>>>> A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the >>>>>> interrupt layer's only interaction with the mapping code. All it >>>>>> deals >>>>>> with is device_ts and abstract IRQ numbers. >>>>>> B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells, >>>>>> interrupt-specifier) to tell the PIC that the interrupt layer's IRQ >>>>>> irq_number means the given specifier >>>>>> C) finally, passes the call onto nexus, which will do whatever >>>>>> would >>>>>> normally happen (unmasking the interrupt, setting handlers, etc.) in >>>>>> terms only of the abstract IRQ and the device_t assigned by ofwbus. >>>>>> >>>>>> You would implement ACPI just by doing a s/OFW/ACPI/g >>>>>> search-and-replace above -- since the interrupt layer doesn't know >>>>>> about OFW or ACPI or anything else, there is no need to touch it. >>>>>> This >>>>>> seems clean, simple, compartmentalized, preserves the existing API, >>>>>> and should work on all of our various hardware. PowerPC can't quite >>>>>> work with it yet without some multipass foo, but, since the API is >>>>>> preserved, that transition can happen gradually without KPI changes. >>>>>> For the same reason that it is API-preserving, I think this code is >>>>>> also MFC-able. >>>>>> -Nathan >>>>> I think that we are slightly diverges in this place: >>>>> - please note that PIC API (in kern/pic_if.m) is different from the >>>>> one >>>>> that PPC uses. >>>> Yes, which is fine (this is machine-dependent code anyway). On >>>> consideration, the PIC_MAP_OFW_INTR() function should probably be >>>> called OFW_BUS_PIC_MAP_INTR() and live in ofw_bus_if.m anyway, then be >>>> implemented by PICs. >>>> >>>>> - we must be able to store configuration data (original interrupt >>>>> specifier) in all cases, not only for OFW. That's reason why I have >>>>> proposed >>>>> to create 'mapping table'. >>>> It is stored in all cases, just not in the core interrupt code. I've >>>> only mocked this up for OFW here. For ACPI, you would have the >>>> equivalent table in acpi.c, along with ACPI_MAP_INTR(), >>>> ACPI_REGISTER_PIC(), etc. in acpi_if.m and implemented in >>>> dev/acpica/acpi.c. >>>> >>>> For GPIOs, you would have a mechanism that just traces however you are >>>> representing GPIOs anyway. >>>> >>>>> Anyway, i think that we both talking about +/- same solution, only i >>>>> want to move it from OFW specific level implemented at OFW bus >>>>> level to >>>>> bus/interrupt specifier format independent level implemented in >>>>> subr_intr.c. >>>> This implements the same API in any case (identical to that >>>> pre-r301453), so the implementation doesn't really matter at all in >>>> terms of my needs. >>> Perfect. >>>> That said, having it in the root bus for the mapping domain (ofwbus0, >>>> acpi0) etc. seems cleaner to me, with no loss of functionality. The >>>> core interrupt code (subr_intr.c or whatever) doesn't have any obvious >>>> need to know anything at all about the mapping information so long as >>>> it knows the PIC device_t that corresponds to every abstract IRQ so >>>> that it can mask it or do other operations. Since, presumably, nothing >>>> in an ACPI device tree references an interrupt in the OFW device tree >>>> (if you even had both -- and how would you specify that, anyway?), >>>> implementing the relevant bits one step up the bus hierarchy doesn't >>>> change any behavior. >>>> >>>> And nothing can possibly be more flexible in terms of the mapping >>>> table in subr_intr.c than not having a mapping table: you don't need >>>> enums for mapping types, or unions that need to be expanded, breaking >>>> KBI, or anything. You can delete all the PIC registration (and MSI) >>>> code, all the switches, and all the unions from subr_intr.c and have a >>>> totally mapping-mechanism-agnostic KPI. >>>> >>> Where you see "all the switches, and all the unions" in subr_intr.c? >>> subr_intr.c uses nested structures approach, in exactly same way as is >>> used in OFW bus. Moreover, subr_intr.c *IS* currently totally >>> mapping-mechanism-agnostic KPI, and any form of mapping table must >>> follow this rule. >> That was hyperbolic; my apologies. >> >> The point was that you don't need intr_map_data, or the >> intr_map_data_type enum, this way. One of the nice things about that >> is that you don't need to add more entries to the enum to add new >> one-off mapping types. For example, the PS3 platform on PPC has its >> own non-device-tree, non-ACPI bus description and interrupt system. >> Needing to make new enums (or anything) like INTR_MAP_DATA_PS3 in >> sys/bus.h seems a little silly, though hardly very bad. > Hmm, right. > Is something like solution for you? Or just the existing machine/intr_whatever_it_is.h. >> I've attached a version of PowerPC's intr_machdep.c and pic_if.m that >> fully implements this distributed-table scheme. The code ends up being >> very short and clean (a third the number of lines of code as >> kern/subr_intr.c, for example) and the diff to PowerPC, including the >> new ofwbus.c code, ends up being net-negative. > Yep, I read it very carefully. >>> But allow me to make short summary: >>> - we must support 3 'main' cases - OFW, ACPI and HINTS (for older arm >>> and mips boards) based systems >>> - single kernel must support all above cases, but only one can by >>> 'active' (kernel must be able to select right one at runtime). >>> - we must support 'direct' (interrupt specifier is defined by one of >>> above methods) or 'indirect' (where interrupt is associated with other >>> function - GPIO pin, but don't have direct description). >>> - we must be able to access single physical pin by both methods, >>> 'direct' and/or 'indirect'. >>> - we must be able to add new interrupt type as simple as possible >> Agreed with all of this. >> >>> For interrupt controllers: >>> - single controller must be able to parse multiple formats. 'direct' or >>> 'indirect'. OFW or ACPI. ARM GIC must accept OFW or ACPI arguments (in >>> single kernel), GPIO based interrupt controller must accept 'direct' or >>> 'indirect' formats. >>> >>> For interrupt consumers: >>> - 'direct' interrupts are easy >>> - driver must be able to consume 'indirect' interrupt in 'root >>> bus'/'mapping-mechanism' agnostic manner. >>> For example, MMC driver must be able to get interrupt from CD gpio >>> pin in all possible cases/combinations . >>> Are we still connected? >> And this. >> >>> I don't think that all this is possible without single universal format >>> of interrupt mapping specifier. >>> And, if we must have universal format then why we needs different >>> mapping databases? >> It actually works just fine in this mapping-table-in-the-root-bus >> model. If you have an OFW-type of interrupt, it is set up by ofwbus.c. >> Since everything that would use an OFW interrupt is a child of ofwbus, >> this is totally indistinguishable from a global table from the >> perspective of client code. For ACPI interrupts, they are set up by >> acpi.c, which is indistinguishable from a global table for the same >> reason. For hints -- and single-domain systems generally -- you can >> just have the machine-dependent code assume that interrupts it doesn't >> explicitly know about belong to the root PIC. This is what you call >> the "direct" case. >> >> For the "indirect" case, there are as usual a few things you could do. >> They match the set of things you would do in any implementation: >> 1. Set up one or more global registries for "indirect" PICs, with >> corresponding mapping methods, either as bus methods on some >> high-level bus or on the machine-specific nexus, or just a global >> function that you call. >> 2. Use the global registry that you already have to have for GPIO >> controllers and provide a mapping method on the GPIO controller >> (GPIO_GET_INTERRUPT_FOR_PIN() or something) that returns a mapped IRQ. >> > I still don't think that this can works. > Assume that i have driver with 2 interrupts, one is standard (OFW > property based), second is GPIO, targeting same interrupt controller. > First is allocated by bus_alloc_resource(), second by > gpio_alloc_intr_resource(), so each resource is stored in different table. > Then bus_setup_intr() is called, on both. So bus_setup_intr() must be > able to select right table, right? But how? The interrupt defined by the GPIO controller from a GPIO property (as opposed to an interrupt property) could be defined, in general, in four ways: 1. You have a bus method on the GPIO controller that returns a pre-mapped IRQ. Something like GPIO_INTERRUPT_FOR_PIN(). The GPIO properties I have seen don't have standard names so this would need to be done by the child. As a result, you don't need it to work before PIC attachment (or GPIO attachment) and can directly call methods on the GPIO controller. This kind of API has the nice feature that the GPIO controller can return errors early if that pin doesn't actually support interrupts, which is not a concern with normal interrupts property. The IRQ would be mapped internally by the controller when you do GPIO_INTERRUPT_FOR_PIN() and so ofwbus.c would pass it through to nexus (you'd need to remove a KASSERT from my code), which would then go to the interrupt code, which would see the device_t assigned earlier and pass through all operations appropriately. 2. Identical to the first paragraph of (1), except the bus method resolves to an interrupt-parent, specifier pair passed to OFW_BUS_MAP_INTR() rather than an internal mapping in the GPIO controller. This proceeds as normal for OF interrupts thereafter. API is the same as (1). 3. Some global helper function that transforms GPIO specifiers into some other domain (OF, ACPI) without talking to the GPIO controller. I think this is fragile and a bad idea in any case, though would also work as well (or as badly) here as in any other scheme. 4. A GPIO interrupt domain with specifier, like you have currently. You'd have to implement this mapping table in nexus (potentially as a thin wrapper for some code in intr.c or intr_gpio.c or whatever) as a global table as now or as in the first patch I sent. ofwbus.c would, like (1) pass through such requests. This is a perfectly good interface. So I don't think you lose anything by putting the table in ofwbus.c. All four cases would have the same API no matter where the mapping table is. The only non-idiomatic case is #4, which would require some shims, but those shims are wholly identical to the shims you would need if you didn't do OF/FDT interrupts this way. The larger point is that, since all four cases would have identical APIs in either case, it doesn't much matter how it is implemented now. If we picked one method and wanted to change later, we lose nothing and the change can be MFC-ed at will since it has no effect on KPI. > >> One of the nice things about distributing the tables this way is that >> you have lots of flexibility with things like these "indirect" >> interrupts and are free to do things like #2 at will locally in some >> machine-specific set of drivers. For example, in the PS3 code, I don't >> need to add a new "PS3" mapping type *anywhere*. The ps3bus root >> driver just has a table when it hands out interrupts and talks to >> ps3pic internally. >> -Nathan >>>>> Most of your control flow can be implemented at general level as >>>>> is, or >>>>> already exist [intr_map_irq(), intr_pic_register()] . >>>>> Also, impact for current PPC code is, by me, minimal. >>>>> I can sketch my idea in more details, if you found it acceptable. >>>> All ideas are fine -- and the solution does not need to apply to all >>>> platforms anyway, so long as the OFW_BUS_MAP_INTR() semantics (and, >>>> ideally, API) are preserved. > I have prepared working patch (it's not full, but it works on Tegra), > https://github.com/strejda/tegra/commit/2cf72e248877fb917c4fc618bcb6e46b7c1058a4 > can you, please, take look on it? Change looks great to me at first pass. Thank you for working on it! > Also, please, take in account: > - we have, currently, 20+ interrupt controllers converted to new INTRNG > PIC API. > - universal format of interrupt resources is generic part of this API. > - I'm not ready to commit any PIC API change together with above patch - > i hope that this is understandable. Yes, of course. Also, as far as I can tell, there's no reason to change the PIC API on ARM/MIPS anyway from the current INTRNG API. I think there are some minor simplifications you could do, but there's no need for them, and you can implement absolutely any of the APIs we have discussed in this thread in terms of the current pic_if.m with no problems. > > also, > "we must be able to add new interrupt type as simple as possible" rule > is very important for me. Adding new table (with implementation), one > bus method and one PIC method for each new type is not *simple*. Fair enough! I don't think we need that for, e.g., GPIOs (see cases 1-2 above), just for bus enumeration schemes (ACPI, OFW are probably the only ones) that usually require a ton of this kind of thing anyway. But, fundamentally, it doesn't matter. There are three important things from my end: 1. That it is possible to, at bus enumeration time, permanently assign an IRQ to an interrupt specifier from OFW/ACPI. 2. That that assignment not depend on having the PIC attached yet. 3. That the implementation details of that mechanism be reasonably abstracted so that they can change later or vary platform to platform. Whether mapping tables are in some central place (subr_intr.c) or in the parent bus, how the PIC API works, whether they are stored in that table in the form of a union or in different tables, doesn't matter for those three at all. And, with a constant API (3) we can even change our minds later without a lot of hassle. > In any case, can we concentrate to above patch first? I'm ready to > finish it in next few hours, then put it to phabricator for real review. Sounds good. As I said, first pass looks perfect to me. I'll look in detail once it hits phabricator. -Nathan > > Michal > From owner-freebsd-arch@freebsd.org Sat Aug 6 16:58:50 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6A513BB0553 for ; Sat, 6 Aug 2016 16:58:50 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-io0-x22c.google.com (mail-io0-x22c.google.com [IPv6:2607:f8b0:4001:c06::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2C88315EE for ; Sat, 6 Aug 2016 16:58:50 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by mail-io0-x22c.google.com with SMTP id q83so326227493iod.1 for ; Sat, 06 Aug 2016 09:58:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=eGdiIVVZoY7D8RF1q9koNMFhBK16ze1HkKAN9Jb9Ids=; b=V0xo72+Kk8pa6zzvnvabxqUozxITJxxkVnx+3lnbZqTEigCuCrTOmFUIfMhWkyOulp klAl3v3NcQZBexfQJz0mjJ3BBX1cwFcHGaIHASdHLVMHr4XWM8G7E1gcZBOA/LWfaoL/ 1/s2bcn8UOZvMjMKvUm+6lPXts47wtINXD1WGKhwKbO/6h+umoRJE79dr8gkKo93Se+M 93RmdK5CTuPqnD/0vew1jTXT38MEFzDq0b+is6JggXqMpe8yUMvUxLwerW10RAMygfil MnB58NXfU3NJ0njKDLh/zWA07kbjJtHiY0rLHNTaF2ONhrREWAnGQRFfnsDWzBsDid6K fSJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=eGdiIVVZoY7D8RF1q9koNMFhBK16ze1HkKAN9Jb9Ids=; b=DL+hGebEXzV65kWwkJcZ8jMOoyJ1b2uWoByt5qI8hW4AF2/HY/jQltVl/b45ge8n06 E/c/HQKtxCVhz91XqF51ZFZdXX9a5znhM0w+f1FeC1XupmrI6y2mbU7d7cc6xgb2hwj5 DZV0aQvls3aaWw64o6pg4wjm4tU0Vz7aZLDoKb63pjesD6Q/XVqr0MwHM1msQlM8ELPo b8AbKXckks/v/yzuHk1Hv/RwXJDYGkSLROXtAJ94+FAg6C/UYijpgTrA8Y7PWtEpWSNM EdHzFhs7vWV+1O+movzM7ItNJLfcJmI78gEvVx7VCDQTvmydUNxl2Q+tmScL48OHqybl AXow== X-Gm-Message-State: AEkoouvIXR+QD76cr/KaRMpz4ErJeQQwxKHMLmzkVU1EOEOq4z8SAG34ypejhCNWE6I+7I81/DPB6CEMM5Ssbg== X-Received: by 10.107.9.39 with SMTP id j39mr84327957ioi.73.1470502729575; Sat, 06 Aug 2016 09:58:49 -0700 (PDT) MIME-Version: 1.0 Sender: wlosh@bsdimp.com Received: by 10.36.65.105 with HTTP; Sat, 6 Aug 2016 09:58:48 -0700 (PDT) X-Originating-IP: [69.53.245.200] In-Reply-To: <1d63e3aa-1a2f-992f-ae83-656eb185d386@freebsd.org> References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <57A09F34.4050400@FreeBSD.org> <57A30B72.7070809@FreeBSD.org> <1946069a-d0f9-2c19-80a5-0b490682574b@freebsd.org> <57A5F480.20309@FreeBSD.org> <1d63e3aa-1a2f-992f-ae83-656eb185d386@freebsd.org> From: Warner Losh Date: Sat, 6 Aug 2016 10:58:48 -0600 X-Google-Sender-Auth: _0BEYsZVKtprYz0QGNWpalXtUEk Message-ID: Subject: Re: INTRNG (Was: svn commit: r301453....) To: Nathan Whitehorn Cc: Michal Meloun , "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Aug 2016 16:58:50 -0000 On Sat, Aug 6, 2016 at 10:44 AM, Nathan Whitehorn wrote: > Fair enough! I don't think we need that for, e.g., GPIOs (see cases 1-2 > above), just for bus enumeration schemes (ACPI, OFW are probably the only > ones) that usually require a ton of this kind of thing anyway. But, > fundamentally, it doesn't matter. There are three important things from my > end: > 1. That it is possible to, at bus enumeration time, permanently assign an > IRQ to an interrupt specifier from OFW/ACPI. > 2. That that assignment not depend on having the PIC attached yet. > 3. That the implementation details of that mechanism be reasonably > abstracted so that they can change later or vary platform to platform. > > Whether mapping tables are in some central place (subr_intr.c) or in the > parent bus, how the PIC API works, whether they are stored in that table in > the form of a union or in different tables, doesn't matter for those three > at all. And, with a constant API (3) we can even change our minds later > without a lot of hassle. First, I hate mapping tables at the nexus, unless they are created dynamically at run time. There's too much variation between boards, SoCs, etc to have that code live in the nexus otherwise. They simply don't scale. This board has interrupts 1-16 wired this way, but that board didn't do that and has an external PIC. This SoC based on Cortext A uses the GPIC, while that one based on the same Cortext A chose to use Atmel's PIC. Perhaps I'm misunderstanding something here as to what is meant by a table though. Next, In your list there's another dependency that's implicit but maybe not called out. You can have PICs that cascade into other PICs, or GPIO controllers that need to enable external PIC-like things before they can route interrupts from things that are downstream (interrupt wise) from them. Maybe I'm just hung up on the phrase "the PIC" and it really means "whatever complex thing or things handles getting the interrupt routed to the CPU." I don't see this design so much on basic eval boards, but do see it in more complex boards that control complicated things. Generally, though, I like the direction things are going. Warner From owner-freebsd-arch@freebsd.org Sat Aug 6 19:51:52 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4B417BB15F9; Sat, 6 Aug 2016 19:51:52 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3022E1E52; Sat, 6 Aug 2016 19:51:51 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net (75-101-50-44.static.sonic.net [75.101.50.44]) (authenticated bits=0) by d.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u76JpmSS029471 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Sat, 6 Aug 2016 12:51:49 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Warner Losh References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> <57A09F34.4050400@FreeBSD.org> <57A30B72.7070809@FreeBSD.org> <1946069a-d0f9-2c19-80a5-0b490682574b@freebsd.org> <57A5F480.20309@FreeBSD.org> <1d63e3aa-1a2f-992f-ae83-656eb185d386@freebsd.org> Cc: Michal Meloun , "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: <2e638a0a-0d99-1aa7-4912-1015c8e2e947@freebsd.org> Date: Sat, 6 Aug 2016 12:51:48 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVYiBw9w+Tgg90SzBC0zIR0jsgnJmBRZw2UjnNvjoQTh/ZrmBTY+xZx8J7p6LHudlKRSLuC8Sx1P4ymrcy4zijtl7D6465MoAnc= X-Sonic-ID: C;2CBQNg9c5hGs+K/hcgQksw== M;jsGjNg9c5hGs+K/hcgQksw== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Aug 2016 19:51:52 -0000 On 08/06/16 09:58, Warner Losh wrote: > On Sat, Aug 6, 2016 at 10:44 AM, Nathan Whitehorn > wrote: > >> Fair enough! I don't think we need that for, e.g., GPIOs (see cases 1-2 >> above), just for bus enumeration schemes (ACPI, OFW are probably the only >> ones) that usually require a ton of this kind of thing anyway. But, >> fundamentally, it doesn't matter. There are three important things from my >> end: >> 1. That it is possible to, at bus enumeration time, permanently assign an >> IRQ to an interrupt specifier from OFW/ACPI. >> 2. That that assignment not depend on having the PIC attached yet. >> 3. That the implementation details of that mechanism be reasonably >> abstracted so that they can change later or vary platform to platform. >> >> Whether mapping tables are in some central place (subr_intr.c) or in the >> parent bus, how the PIC API works, whether they are stored in that table in >> the form of a union or in different tables, doesn't matter for those three >> at all. And, with a constant API (3) we can even change our minds later >> without a lot of hassle. > First, I hate mapping tables at the nexus, unless they are created > dynamically at run time. There's too much variation between boards, > SoCs, etc to have that code live in the nexus otherwise. They simply > don't scale. This board has interrupts 1-16 wired this way, but that > board didn't do that and has an external PIC. This SoC based on > Cortext A uses the GPIC, while that one based on the > same Cortext A chose to use Atmel's PIC. Perhaps I'm > misunderstanding something here as to what is meant by a table > though. The table in question is just a mapping of abstract IRQ numbers to the corresponding interrupt specifier and parent. For example, 5: <&gicX, 7 2> Where 5 is the (potentially arbitrary) assigned number for the system corresponding to whatever <7 2> means on gic0. The table is built on the fly so that OFW bus nodes can specify interrupts as something <&gicX, 7 2> and get back scalars that work with rman and the PCI APIs and all the other places that want definitions of interrupts as scalars. It naturally handles all of your examples. The discussion here is whether to: a) keep an OFW/FDT-specific table like this in ofwbus.c (and analogs for ACPI, etc. in their respective places) b) keep a table that maps numbers to some union of OFW/ACPI/whatever data in nexus or intr.c or some global place It doesn't matter much from a functionality perspective (or an API perspective) either way. The question is more about which is easier to maintain and extend long-term. Since the API doesn't change either way, we're free to decide incorrectly and revisit it at will later with little consequence. > Next, In your list there's another dependency that's implicit > but maybe not called out. You can have PICs that cascade into > other PICs, or GPIO controllers that need to enable external > PIC-like things before they can route interrupts from things > that are downstream (interrupt wise) from them. Maybe I'm > just hung up on the phrase "the PIC" and it really means > "whatever complex thing or things handles getting the > interrupt routed to the CPU." I don't see this design so much > on basic eval boards, but do see it in more complex boards > that control complicated things. Yes. We've supported this forever on PPC (where it is very common) and it works here too. The implementation is really simple. First, here's an explanation of the general interrupt mapping system, for context: Let's suppose you have some interrupt hierarchy like this: CPU <- pic0 <- some device PIC0 has a bunch of interrupt pins. As you enumerate the system, the code encounters interrupt specifiers like <&pic0, 3 1>, <&pic0, 4, 1>, <&pic1, 3, 2>. Through OFW_BUS_MAP_INTR(), the bus enumeration code turns these into some arbitrary scalar IRQs assigned by OFW_BUS_MAP_INTR(). The mapping from the (interrupt parent, specifier) tuple to that assigned scalar IRQ is stored somewhere (see above) for later. When the bus devices that use those IRQs attach, they call bus_activate_resource() and bus_setup_intr(). At this point, some code close to the root of the hierarchy (again, see above for exactly where) looks up the corresponding vector specifier in the table, looks up the registered PIC corresponding to the interrupt parent key, and tells the driver attached to the interrupt parent to think of the scalar IRQ as meaning whatever string of numbers it saw earlier. It also tells the machine-dependent interrupt layer that the given scalar IRQ is handled by the device_t for the PIC so that it can ask the PIC to mask/unmask/bind/etc. the interrupt. When PICs attach, they register themselves with whatever bus layer is doing this mapping to say that a given interrupt parent key (the phandle for OFW/FDT) corresponds to the device_t for the PIC. The PIC attached directly to the CPU's interrupt pin[s] (pic0 here) also registers itself somehow with the MD interrupt system. On PPC, there is a global device_t called "root_pic" that the driver sets. When "some device" signals an interrupt, pic0 (the hardware) signals the CPU. The CPU signals the kernel. The MD interrupt layer notices and calls the driver for pic0 (the root PIC). That code inspects whatever registers on PIC0 give it the interrupt line and then signals the MD interrupt layer (let's say by calling a function called arm_dispatch_irq(int irq)) with the corresponding scalar IRQ. That then invokes the corresponding filters and/or ithreads. So how does this change with cascaded PICs? Very little. Let's suppose you have some interrupt hierarchy like this: CPU <- pic0 <- pic1 <- some device PIC0 has a bunch of interrupt pins, one of which is connected to pic1, which provides a bunch more interrupt pins. Both pic0 and pic1 register themselves with the bus layer by the appropriate handles and get their mask/unmask/bind functions called by the interrupt layer during interrupt setup for interrupts they handle. PIC0 registers itself as the root during attachment because it notices that it does not have any interrupts in its resource list. Child PICs, on the other hand, have interrupt properties in the FDT corresponding to the pin on pic0 that is attached to pic1. These get assigned as normal through newbus and the child PIC (pic1) calls bus_setup_intr and registers a filter handler. When "some device" signals an interrupt, pic1 drives a line on pic0, which drives a line on the CPU. The driver for PIC0 calls arm_dispatch_irq() with the IRQ corresponding to pic1, which calls pic1's interrupt handler. Since filter handlers run in primary interrupt context, pic1's interrupt handler can look exactly like what pic0 does when signalled by the interrupt layer: it runs through its registers, finds any lines with active interrupts, then calls arm_dispatch_irq() with the appropriate corresponding scalar IRQ. The nice thing here is that cascaded PICs require zero special handling by the system. You can just treat them as normal interrupts, with no special methods required in the bus layer, the interrupt system, or the PIC driver. -Nathan > > Generally, though, I like the direction things are going. > > Warner >