Date:      Tue, 31 Mar 2015 12:51:04 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r280866 - in head/sys: amd64/amd64 i386/i386
Message-ID:  <3149031.gmIvmB3vKt@ralph.baldwin.cx>
In-Reply-To: <20150331003850.GL2379@kib.kiev.ua>
References:  <201503302013.t2UKDNCo093442@svn.freebsd.org> <20150331003850.GL2379@kib.kiev.ua>

On Tuesday, March 31, 2015 03:38:50 AM Konstantin Belousov wrote:
> But apparently, that did not helped, and it seems that there are
> sporadic reports of Linux having similar issues with x2APIC on similar
> mobile SandyBridge, which are proof-less charged to BIOS bugs.
>
> Mostly, my question is, should we increase DELAYS() in addition to
> lapic_ipi_wait() timeouts ?

Hmm, those delays also come from the MP 1.4 spec.  The INIT delay
is already quite long (10 ms).  The STARTUP delays are more a matter
of how long our mpboot code takes to get to the point of incrementing
mp_ncpus from my understanding.  In the MP spec the delays are only
mentioned in the pseudo-code in B.4, not in the text:

    BSP sends AP an INIT IPI
    BSP DELAYs (10mSec)
    If (APIC_VERSION is not an 82489DX) {
        BSP sends AP a STARTUP IPI
        BSP DELAYs (200μSEC)
        BSP sends AP a STARTUP IPI
        BSP DELAYs (200μSEC)
    }
    BSP verifies synchronization with executing AP

    Example B-1. Universal Start-up Algorithm
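
A minimal C sketch of the ordering in Example B-1 follows.  It only records
the steps and their delays in an array; it does not touch any APIC.  The
names step_kind, step, and universal_startup_steps are invented for this
illustration and come from neither the MP spec nor FreeBSD.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step kinds, invented for this sketch. */
enum step_kind { STEP_INIT_IPI, STEP_STARTUP_IPI, STEP_DELAY_US };

struct step {
    enum step_kind kind;
    unsigned delay_us;      /* only meaningful for STEP_DELAY_US */
};

/*
 * Fill 'out' with the steps of Example B-1 and return how many were
 * written.  'out' must have room for at least 6 entries.
 */
static size_t
universal_startup_steps(bool apic_is_82489dx, struct step *out)
{
    size_t n = 0;

    out[n++] = (struct step){ STEP_INIT_IPI, 0 };
    out[n++] = (struct step){ STEP_DELAY_US, 10000 };   /* 10 ms */
    if (!apic_is_82489dx) {
        /* Two STARTUP IPIs, each followed by a 200 us delay. */
        for (int i = 0; i < 2; i++) {
            out[n++] = (struct step){ STEP_STARTUP_IPI, 0 };
            out[n++] = (struct step){ STEP_DELAY_US, 200 };
        }
    }
    return n;
}
```

For anything other than an 82489DX this yields six steps with a fixed delay
budget of 10 ms + 2 * 200 us = 10.4 ms before the BSP checks on the AP.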

Hmm, the SDM also mentions similar delays in Vol3 8.4.4 (Feb 2014
version):

8.4.4.1 Typical BSP Initialization Sequence

...

14. Performs the following operation to set up the BSP to detect the presence
    of APs in the system and the number of processors:

    - Sets the value of the COUNT variable to 1.

    - Starts a timer (set for an approximate interval of 100 milliseconds).
      In the AP BIOS initialization code, the AP will increment the COUNT
      variable to indicate its presence.  When the timer expires, the BSP
      checks the value of the COUNT variable.  If the timer expires and the
      COUNT variable has not been incremented, no APs are present or some
      error has occurred.

15. Broadcasts an INIT-SIPI-SIPI IPI sequence to the APs to wake them up and
    initialize them:

      MOV ESI, ICR_LOW; Load address of ICR low dword into ESI.
      MOV EAX, 000C4500H; Load ICR encoding for broadcast INIT IPI
      ; to all APs into EAX.
      MOV [ESI], EAX; Broadcast INIT IPI to all APs
      ; 10-millisecond delay loop.
      MOV EAX, 000C46XXH; Load ICR encoding for broadcast SIPI IP
      ; to all APs into EAX, where xx is the vector computed in step 10.
      MOV [ESI], EAX; Broadcast SIPI IPI to all APs
      ; 200-microsecond delay loop
      MOV [ESI], EAX; Broadcast second SIPI IPI to all APs
      ; 200-microsecond delay loop

16. Waits for the timer interrupt.

17. Reads and evaluates the COUNT variable and establishes a processor count.

...

Note that this algorithm specifically refers to BIOS startup and not OS
startup.  I can't find any clear mention in the SDM of what the OS is supposed
to do to bootstrap APs.  One bread crumb is that 8.4.3 implies that the BIOS
should leave APs in a state that requires an INIT during OS bootstrap, which
in turn implies that OSes have to use INIT as well:

9. While the BSP is executing operating-system boot-strap and start-up code,
   the APs remain in the halted state. In this state they will respond only to
   INITs, NMIs, and SMIs. They will also respond to snoops and to assertions
   of the STPCLK# pin.

Also, I believe that the 100 millisecond timer referred to in step 14 above is
a timeout on the entire AP-enumeration process and is the timer waited for in
step 16.  It also seems that the BIOS uses broadcast (all-but-self) IPIs,
whereas FreeBSD uses targeted (wake up a single AP at a time) IPIs.
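
To make the broadcast/targeted difference concrete, here is a small C sketch
that builds the xAPIC ICR values.  The broadcast results match the 000C4500H
and 000C46XXH encodings in the SDM listing quoted above.  The macro and
function names are made up for this example, and the targeted variants only
show where the destination would go (shorthand bits cleared, APIC ID in bits
31:24 of the high dword); the flags FreeBSD actually passes may differ.

```c
#include <stdint.h>

/* Low ICR dword fields (xAPIC).  Names are hypothetical. */
#define ICR_DELMODE_INIT       0x00000500u  /* delivery mode 101b */
#define ICR_DELMODE_STARTUP    0x00000600u  /* delivery mode 110b */
#define ICR_LEVEL_ASSERT       0x00004000u  /* level = assert */
#define ICR_DEST_FIELD         0x00000000u  /* no shorthand: use ICR high */
#define ICR_DEST_ALL_BUT_SELF  0x000c0000u  /* shorthand 11b */

/* Low dword for an INIT IPI with the given destination shorthand. */
static uint32_t
icr_lo_init(uint32_t shorthand)
{
    return shorthand | ICR_LEVEL_ASSERT | ICR_DELMODE_INIT;
}

/* Low dword for a STARTUP IPI; 'vector' is the start page number. */
static uint32_t
icr_lo_startup(uint32_t shorthand, uint8_t vector)
{
    return shorthand | ICR_LEVEL_ASSERT | ICR_DELMODE_STARTUP | vector;
}

/* High dword for a targeted IPI: destination APIC ID in bits 31:24. */
static uint32_t
icr_hi_dest(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}
```

With ICR_DEST_ALL_BUT_SELF a single write reaches every AP at once; with
ICR_DEST_FIELD the high dword has to be programmed per AP, which is what
waking one AP at a time amounts to.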

I don't really know if we need to increase the delays or not.  I have no idea
what Intel's source for those numbers in the two documents is.  I don't think
I've ever seen a rationale for why they were chosen.

BTW, Linux seems to use the equivalent of 100 milliseconds for the
lapic_ipi_wait() stage before doing the other delays (see
native_safe_apic_wait_icr_idle() for the non-X2APIC case).
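
As a rough illustration of what a bounded wait for the ICR to go idle looks
like, here is a C sketch that polls the xAPIC delivery-status bit (bit 12 of
the low ICR dword) a fixed number of times.  The budget of 1000 polls at 100
microseconds each is an assumption chosen to add up to 100 milliseconds; the
function names, the callback interface, and the fake register are all
invented for this example and are neither the Linux nor the FreeBSD code.

```c
#include <stdbool.h>
#include <stdint.h>

#define ICR_DELIVERY_PENDING 0x00001000u    /* delivery status, bit 12 */

/*
 * Poll the low ICR dword until the pending bit clears or the budget of
 * 'polls' reads is used up.  Returns true if the ICR went idle in time.
 */
static bool
wait_icr_idle(uint32_t (*read_icr_lo)(void *), void (*delay_us)(void *, unsigned),
    void *ctx, unsigned polls, unsigned step_us)
{
    for (unsigned i = 0; i < polls; i++) {
        if ((read_icr_lo(ctx) & ICR_DELIVERY_PENDING) == 0)
            return true;
        delay_us(ctx, step_us);
    }
    return false;
}

/* Test double: an ICR that reports "pending" for the first N reads. */
struct fake_icr {
    unsigned busy_reads;    /* reads that still return pending */
    unsigned waited_us;     /* total simulated delay */
};

static uint32_t
fake_read(void *ctx)
{
    struct fake_icr *f = ctx;

    if (f->busy_reads > 0) {
        f->busy_reads--;
        return ICR_DELIVERY_PENDING;
    }
    return 0;
}

static void
fake_delay(void *ctx, unsigned us)
{
    struct fake_icr *f = ctx;

    f->waited_us += us;     /* account for the delay instead of sleeping */
}
```

Calling wait_icr_idle(fake_read, fake_delay, &f, 1000, 100) gives up after
100000 simulated microseconds if the pending bit never clears.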

-- 
John Baldwin


