From: John Baldwin <jhb@freebsd.org>
To: Julian Elischer <julian@elischer.org>
Date: Wed, 7 Dec 2005 09:25:45 -0500
User-Agent: KMail/1.8.3
References: <43961758.4020407@elischer.org>
	<1B4F46C2-C424-45F8-9328-BEE2AA6E0DC6@FreeBSD.org>
	<4396918D.9060109@elischer.org>
In-Reply-To: <4396918D.9060109@elischer.org>
Message-Id: <200512070925.47141.jhb@freebsd.org>
Cc: current@freebsd.org
Subject: Re: can someone explain...[ PCI interrupts]

On Wednesday 07 December 2005 02:38 am, Julian Elischer wrote:
> > have been used already, then the kernel starts assigning multiple
> > (apic, pin) tuples to the same IRQ resulting in interrupts being shared
> > in software because of the cpl limitation even though they aren't
> > shared in hardware.  This is why your IRQ values are different on 4.x
> > than on FreeBSD 5.2+ and Linux which use the ACPI global interrupt
> > number model.
>
> but if I change the code that does this, I may be able to get my devices
> that collide with the 'boot interrupt' to go elsewhere? That would be
> good..

No, probably not.  The "boot interrupt" collisions happen on all versions of
FreeBSD currently.  I do have a workaround in my head, and if it works, it
might even be backportable to 4.x.  You can't change how the interrupts are
physically wired though, and the boot interrupt collisions happen because of
issues in hardware.  You might be able to pull a trick where you map the two
colliding interrupts to the same IRQ cookie on 4.x, but that'd be ugly, and
the fix I'm considering would be a lot simpler and do the same thing.  (I
need to check, but I think that the INTx swizzle the PXH's do might match the
standard PCI-PCI bridge swizzle, and if so, we can just depend on the boot
interrupt and route the interrupts via the boot destination by ignoring the
_PRT (for ACPI) on such bridges, and ignoring any MP Table entries (on
non-ACPI), so that it falls back to using the PCI-PCI swizzle.)
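
For anyone not familiar with the swizzle, here is a minimal sketch of the
standard PCI-PCI bridge rule: the pin seen on the bridge's parent bus is the
device's slot number plus its own pin, modulo 4.  The function and enum names
are made up for illustration, and whether the PXH bridges actually follow
this rule is exactly the part that still needs checking.

```c
#include <assert.h>

/* PCI interrupt pins, numbered as in the Interrupt Pin config register. */
enum pci_intpin { PCI_INTA = 1, PCI_INTB = 2, PCI_INTC = 3, PCI_INTD = 4 };

/*
 * Standard PCI-PCI bridge swizzle (hypothetical helper name): a device in
 * slot `slot` on the bridge's secondary bus that asserts `pin` shows up on
 * the bridge's parent bus as the pin returned here.
 */
static int
bridge_swizzle(int slot, int pin)
{
	assert(slot >= 0 && slot < 32);
	assert(pin >= PCI_INTA && pin <= PCI_INTD);
	/* Pins are 1-based, so rotate on (pin - 1) and convert back. */
	return (((slot + (pin - 1)) % 4) + 1);
}
```

So a device in slot 1 asserting INTA arrives at the parent bus as INTB, and
the same calculation is repeated at each bridge on the way up to the root.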

> > for the (apic, pin) tuple being used.  (Thus, IRQs are just a cookie
> > that is the index into the global array of interrupt sources on x86.)
> > Note that interrupts routed this way are hardwired into the motherboard
> > design.  There's no chance for the OS to change which (pic, pin) a PCI
> > device interrupt is hooked up to.
>
> but from my memory, many PCI devices can select between A,B,C and D
> so maybe by going to the device and selecting a different one of those
> you can force it to go elsewhere...

The devices don't really get to choose; it's a read-only config register that
is set in silicon.  Even then, IIRC, PCI mandates that single-function
devices use INTA, and that multi-function devices use INTA if they have one
interrupt, INT[AB] if they have two, etc.  (I'm less certain about the
multi-function part, but single-function devices must use INTA.)
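
To make the register concrete, here is a rough sketch of how the value in the
Interrupt Pin config register (offset 0x3d) decodes: 0 means the function
uses no interrupt pin, and 1 through 4 mean INTA through INTD.  The helper
names are hypothetical, and the single-function check only encodes the rule
as stated above.

```c
#include <assert.h>
#include <stdint.h>

#define INTPIN_REG_OFFSET	0x3d	/* Interrupt Pin config register */

/*
 * Decode an Interrupt Pin register value into its pin letter.
 * Returns '\0' when the function does not use an interrupt pin.
 */
static char
intpin_letter(uint8_t intpin)
{
	assert(intpin <= 4);	/* values above 4 are reserved */
	return (intpin == 0 ? '\0' : (char)('A' + intpin - 1));
}

/* A single-function device may use no pin at all, or INTA, nothing else. */
static int
single_function_pin_ok(uint8_t intpin)
{
	return (intpin == 0 || intpin == 1);
}
```

Since the register is read-only, software can only inspect this value; it
cannot write a different pin into it.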

> > already.  If so, that's the IRQ that that PCI device interrupt is
> > assigned to.  If an IRQ isn't routed already, then it has to use an
> > algorithm to pick one, make a BIOS call to route the link to the chosen
> > IRQ, and then assign the PCI device interrupt to that IRQ.
>
> so, is a "link device" a physical piece of hardware or a software
> abstraction?

It's a physical piece of hardware in that it represents a pin on a
programmable interrupt router.  You basically have a chip that has several
input pins (each of which is a link device), and the chip can programmably
route each input pin to one of several output pins.  Thus, you might have a
single chip with multiple pins (like an APIC with 24 different pins), and
each input pin is considered a link device.
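
As a rough model of the above, assuming a router whose outputs are the 16 ISA
IRQs: each link device carries the set of IRQs it can be routed to plus its
current routing, and routing a PCI interrupt either reuses the existing
routing or picks a new one.  The struct, the names, and the lowest-IRQ policy
are all invented for illustration and are not how the kernel actually chooses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IRQ_NONE	(-1)

/* One input pin on a programmable interrupt router (hypothetical model). */
struct link_dev {
	uint16_t possible;	/* bit n set: link can be routed to IRQ n */
	int	 irq;		/* current routing, or IRQ_NONE */
};

/* Illustrative policy only: choose the lowest IRQ the link supports. */
static int
link_pick_irq(const struct link_dev *l)
{
	for (int irq = 0; irq < 16; irq++)
		if (l->possible & (1u << irq))
			return (irq);
	return (IRQ_NONE);
}

/*
 * Return the IRQ for a PCI interrupt wired to this link.  An existing
 * routing is reused; otherwise one is chosen and recorded here, which is
 * the point where a real kernel would ask the BIOS to program the router.
 */
static int
link_route(struct link_dev *l)
{
	assert(l != NULL);
	if (l->irq == IRQ_NONE)
		l->irq = link_pick_irq(l);
	return (l->irq);
}
```

Every PCI device interrupt wired to the same link ends up on whatever IRQ
that link is routed to, which is why the routing is a property of the link
and not of the individual device.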

> > Hopefully this at least answers some questions and gives a good
> > overview of what PCI interrupt routing is and how it works, etc.
>
> My head hurts, but a lot makes more sense now.
> I'll need to read this a few more times however.
> if you made this into a web page, and added a few diagrams that would be
> amazing.. also you use a few Acronyms without saying what they are..

Yeah, I should probably put this in the arch-handbook, but I'd need to learn
pic to draw the diagrams (or perhaps I could draw them in something else and
export them as .eps?).

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org