Date: Thu, 30 Oct 2003 17:34:47 -0500 (EST)
From: John Baldwin <jhb@FreeBSD.org>
To: arch@FreeBSD.org
Subject: HEADSUP: New i386 interrupt and SMP code..
Message-ID: <XFMail.20031030173447.jhb@FreeBSD.org>
Coming very soon to a CVS tree near you are some very large changes to
the i386 interrupt and SMP code. New features include:

- Runtime selection of using the I/O APICs or the AT PICs to route
  interrupts.
- I/O APICs can be used in a UP kernel or on a UP system that supplies
  either an MP Table or ACPI APIC Table.
- An SMP kernel can run on a UP machine. This means that SMP can now be
  enabled in GENERIC and the SMP kernel config can die.
- The ACPI MADT table can be used to enumerate CPUs instead of the MP
  Table if ACPI is enabled. This will add true HT support in that we
  will finally support the BIOS setting for HT.
- I/O APIC interrupts are no longer forced into 8 IRQs. Thus, when
  using APICs, each PCI interrupt really gets its own IRQ and isn't
  shared with anyone else.
- Multiple fast interrupt handlers can be attached to a given interrupt
  source provided that all of the handlers are fast. (Note: at this
  point, fast is a poor name; INTR_DIRECT might be a better one.)
- Logical APIC IDs are used to route APIC interrupts from the I/O APICs
  to CPUs. In theory the APIC interrupt code can now support 60 CPUs.
  The hardware is still limited to 16, however.
- We now correctly route PCI interrupts when using APICs by using the
  PCI interrupt routing infrastructure instead of a gross hack in
  pci_cfgregread(). This means that we can route interrupts across
  bridges, support MP Tables that only list interrupts for chassis
  devices, etc. We also correctly route PCI interrupts when using APICs
  and ACPI.
- The new interrupt source abstraction should make it substantially
  easier to add support for MSI interrupts.
- We properly support mixed mode by EOI'ing the AT PIC and not EOI'ing
  the local APIC for mixed mode interrupts (just irq 0: clk right now).
- This code can largely be pulled over to amd64 to support APICs and SMP
  on that arch.
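
As an illustration of the "multiple fast interrupt handlers" item in the
feature list, here is a minimal sketch of an attach-time check. It is not
the code from the tree: the struct and function names are hypothetical,
and it assumes (beyond what the list states) that non-fast handlers may
share a source with each other and that mixing the two kinds is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the kernel's real structures. */
struct handler {
    bool fast;              /* true for a "fast" (direct) handler */
    struct handler *next;   /* next handler on the same source */
};

struct intr_source_sketch {
    struct handler *handlers;   /* singly linked list, NULL when empty */
};

/*
 * Attach 'h' to 'src'. Returns true on success, false if the new handler's
 * kind differs from the handlers already attached. Because every handler
 * on the list has the same kind, comparing against the head is enough.
 */
bool sketch_attach(struct intr_source_sketch *src, struct handler *h)
{
    assert(src != NULL && h != NULL);

    if (src->handlers != NULL && src->handlers->fast != h->fast)
        return false;       /* assumed policy: never mix fast and non-fast */

    h->next = src->handlers;    /* push onto the front of the list */
    src->handlers = h;
    return true;
}
```

With this sketch, a second fast handler attaches to a source that already
has one, while a non-fast handler is refused on that same source.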
Some implementation details include:

- The APIC interrupt code uses only one entry point per 32 vectors and
  uses the APIC ISR registers to determine which interrupt triggered in
  that range. This means that the APIC code only has to provide 5 entry
  points instead of 159.
- Because we now support up to 159 different IRQs, the critical section
  optimization code no longer scales well, especially since the new
  APIC code does not use a separate entry point for each IRQ. Thus, for
  the time being at least, critical sections have been reverted to
  disabling interrupts. I do have a WIP for optimizing critical
  sections using a more scalable algorithm should the need arise.
- Each IRQ is actually a cookie tied to an interrupt source. Each
  interrupt source is tied to a PIC driver. The PIC driver supports
  several operations on each interrupt source including disabling the
  source, enabling it for the first time, etc. Each PIC driver is free
  to store private per-source data with each source and private per-PIC
  data with each PIC.
- APICs (both I/O and local (CPUs)) are enumerated by APIC enumerator
  drivers, of which 2 are provided: one to use the ACPI MADT table and
  one to use the MP Table.
- The SMP code no longer knows anything specific to the MP Table.
  Instead, the APIC enumerators inform the SMP code of CPUs via a simple
  cpu_add() interface and the SMP code takes it from there. The SMP code
  is now much easier to read. Also, all of the APIC code has been split
  out into separate I/O and local APIC files, aiding in the cleanup.
- Almost all of the interrupt dispatch code now happens in C rather than
  assembly. Notably, fast interrupt handlers no longer have a separate
  entry point.

Downsides:

- ACPI will no longer work as a module for now. The reason for this is
  that ACPI's APIC enumerator needs to be able to hook into a
  SI_SUB_TUNABLES - 1 SYSINIT() due to existing code that wants to know
  the available CPUs in the system very early (specifically, UMA).
  However, code in kernel modules cannot be executed until SI_SUB_KLD,
  which is much too late. This might be able to be addressed later with
  some creative hacking.
- I haven't ported the changes over to PC98 yet.

Code:

The code lives in p4 under //depot/user/jhb/acpipci/... Note that
several files have moved around so you might want to check the 'notes'
file and 'setup.sh' file. If you want to try it out you can check out
the tree using p4 and build a kernel. Just be sure to:

1) Run setup.sh first to create needed symlinks for moved files.
2) Use 'device apic' instead of 'options APIC_IO'.

I'm sure there are more details that I've forgotten, but that's a start
at least.

--
John Baldwin <jhb@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
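
A note on the "one entry point per 32 vectors" item in the
implementation details: the idea can be sketched in C as below. This is
an illustration only, not the code from the tree. The function name and
the block numbering are hypothetical, and it assumes the entry point has
already read the 32-bit in-service (ISR) register word for its block and
takes the highest set bit as the interrupt being serviced. Five such
blocks cover 5 * 32 = 160 vectors, which is enough for the 159 mentioned
in the message.

```c
#include <assert.h>
#include <stdint.h>

#define VECTORS_PER_ENTRY 32    /* one 32-bit ISR word per entry point */

/*
 * Given the ISR word for a 32-vector block and the block's index
 * (hypothetical numbering: block 0 covers vectors 0-31, and so on),
 * return the number of the highest in-service vector in that block,
 * or -1 if no bit is set.
 */
int sketch_find_vector(uint32_t isr_word, unsigned entry)
{
    int bit = 31;

    assert(entry < 8);          /* the ISR is 256 bits wide: 8 words */

    if (isr_word == 0)
        return -1;              /* nothing in service in this block */

    /* Scan down from bit 31 to find the highest set bit. */
    while ((isr_word & (UINT32_C(1) << bit)) == 0)
        bit--;

    return (int)(entry * VECTORS_PER_ENTRY) + bit;
}
```

For example, calling sketch_find_vector(0x10, 1) yields vector 36 (bit 4
of block 1), so a single shared entry point can recover which vector
fired without needing one stub per vector.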