FreeBSD Mail Archives

Date:      Tue, 31 Mar 2015 12:51:04 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r280866 - in head/sys: amd64/amd64 i386/i386
Message-ID:  <3149031.gmIvmB3vKt@ralph.baldwin.cx>
In-Reply-To: <20150331003850.GL2379@kib.kiev.ua>
References:  <201503302013.t2UKDNCo093442@svn.freebsd.org> <20150331003850.GL2379@kib.kiev.ua>


On Tuesday, March 31, 2015 03:38:50 AM Konstantin Belousov wrote:
> But apparently, that did not helped, and it seems that there are
> sporadic reports of Linux having similar issues with x2APIC on similar
> mobile SandyBridge, which are proof-less charged to BIOS bugs.
> 
> Mostly, my question is, should we increase DELAYS() in addition to
> lapic_ipi_wait() timeouts ?

Hmm, those delays also come from the MP 1.4 spec.  The INIT delay
is already quite long (10 ms).  The STARTUP delays are more a matter
of how long our mpboot code takes to get to the point of incrementing
mp_ncpus from my understanding.  In the MP spec the delays are only
mentioned in the pseudo-code in B.4, not in the text:

    BSP sends AP an INIT IPI
    BSP DELAYs (10mSec)
    If (APIC_VERSION is not an 82489DX) {
        BSP sends AP a STARTUP IPI
        BSP DELAYs (200μSEC)
        BSP sends AP a STARTUP IPI
        BSP DELAYs (200μSEC)
    }
    BSP verifies synchronization with executing AP

    Example B-1. Universal Start-up Algorithm
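
As a rough illustration only, the ordering and delays in Example B-1 can be
written down in C.  This is a sketch that records the steps into a trace
rather than touching hardware; the names ap_trace, trace_add, and
universal_startup are hypothetical and do not come from the MP spec or from
FreeBSD's mp_machdep code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step kinds; a real kernel would write the local APIC here. */
enum ap_step { STEP_INIT_IPI, STEP_STARTUP_IPI, STEP_DELAY };

struct ap_event {
	enum ap_step step;
	unsigned usec;		/* only meaningful for STEP_DELAY */
};

#define MAX_EVENTS 8

struct ap_trace {
	struct ap_event ev[MAX_EVENTS];
	size_t n;
	unsigned total_delay_us;
};

/* Record one step and accumulate the time the BSP spends in DELAYs. */
static void
trace_add(struct ap_trace *t, enum ap_step step, unsigned usec)
{
	if (t->n < MAX_EVENTS) {
		t->ev[t->n].step = step;
		t->ev[t->n].usec = usec;
		t->n++;
	}
	if (step == STEP_DELAY)
		t->total_delay_us += usec;
}

/*
 * Order of operations from Example B-1.  The final "BSP verifies
 * synchronization" step is left to the caller.
 */
static void
universal_startup(struct ap_trace *t, bool is_82489dx)
{
	trace_add(t, STEP_INIT_IPI, 0);
	trace_add(t, STEP_DELAY, 10000);		/* 10 ms */
	if (!is_82489dx) {
		trace_add(t, STEP_STARTUP_IPI, 0);
		trace_add(t, STEP_DELAY, 200);		/* 200 us */
		trace_add(t, STEP_STARTUP_IPI, 0);
		trace_add(t, STEP_DELAY, 200);		/* 200 us */
	}
}
```

With an integrated APIC the BSP spends 10.4 ms in fixed delays per AP under
this sequence; with an 82489DX only the 10 ms INIT delay applies.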

Hmm, the SDM also mentions similar delays in Vol3 8.4.4 (Feb 2014
version):

8.4.4.1 Typical BSP Initialization Sequence

...

14. Performs the following operation to set up the BSP to detect the presence
    of APs in the system and the number of processors:

    — Sets the value of the COUNT variable to 1.

    — Starts a timer (set for an approximate interval of 100 milliseconds).
      In the AP BIOS initialization code, the AP will increment the COUNT
      variable to indicate its presence.  When the timer expires, the BSP
      checks the value of the COUNT variable.  If the timer expires and the
      COUNT variable has not been incremented, no APs are present or some
      error has occurred.

15. Broadcasts an INIT-SIPI-SIPI IPI sequence to the APs to wake them up and
    initialize them:

      MOV ESI, ICR_LOW      ; Load address of ICR low dword into ESI.
      MOV EAX, 000C4500H    ; Load ICR encoding for broadcast INIT IPI
                            ; to all APs into EAX.
      MOV [ESI], EAX        ; Broadcast INIT IPI to all APs
      ; 10-millisecond delay loop.
      MOV EAX, 000C46XXH    ; Load ICR encoding for broadcast SIPI IP
                            ; to all APs into EAX, where xx is the vector
                            ; computed in step 10.
      MOV [ESI], EAX        ; Broadcast SIPI IPI to all APs
      ; 200-microsecond delay loop
      MOV [ESI], EAX        ; Broadcast second SIPI IPI to all APs
      ; 200-microsecond delay loop

16. Waits for the timer interrupt.

17. Reads and evaluates the COUNT variable and establishes a processor count.

...

Note that this algorithm specifically refers to BIOS startup and not OS
startup.  I can't find any clear mention in the SDM of what the OS is supposed
to do to bootstrap APs.  One bread crumb is that 8.4.3 implies that the BIOS
should leave APs in a state that requires an INIT during OS bootstrap, which
in turn implies that OSes have to use INIT as well:

9. While the BSP is executing operating-system boot-strap and start-up code,
   the APs remain in the halted state. In this state they will respond only to
   INITs, NMIs, and SMIs. They will also respond to snoops and to assertions
   of the STPCLK# pin.

Also, I believe that the 100 millisecond timer referred to in step 14 above is
a timeout on the entire AP-enumeration process and is the timer waited for in
step 16.  It also seems that the BIOS uses broadcast (all-but-self) IPIs,
whereas FreeBSD uses targeted (wake up a single AP at a time) IPIs.
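
To make the broadcast versus targeted distinction concrete, here is a small
sketch of how the ICR low dword values quoted from the SDM (000C4500H and
000C46XXH) break down into fields, and what changes for a targeted IPI.  The
macro and function names are made up for illustration and are not FreeBSD's
apicreg.h names; the bit positions follow the xAPIC ICR layout in the SDM.

```c
#include <stdint.h>

/* ICR low dword fields (xAPIC layout). */
#define ICR_MODE_INIT		(5u << 8)	/* delivery mode 101b */
#define ICR_MODE_STARTUP	(6u << 8)	/* delivery mode 110b */
#define ICR_LEVEL_ASSERT	(1u << 14)
#define ICR_DEST_FIELD		(0u << 18)	/* no shorthand: use ICR high */
#define ICR_DEST_ALL_BUT_SELF	(3u << 18)	/* shorthand 11b */

/* Compose the low dword from delivery mode, shorthand, and vector. */
static uint32_t
icr_low(uint32_t mode, uint32_t shorthand, uint8_t vector)
{
	return (shorthand | ICR_LEVEL_ASSERT | mode | vector);
}

/* For a targeted IPI in xAPIC mode, the APIC ID goes in bits 24-31 of ICR high. */
static uint32_t
icr_high_xapic(uint8_t apic_id)
{
	return ((uint32_t)apic_id << 24);
}

/* A SIPI vector selects the 4 KB page at which the AP starts executing. */
static uint8_t
sipi_vector(uint32_t trampoline_paddr)
{
	return ((uint8_t)(trampoline_paddr >> 12));
}
```

For example, icr_low(ICR_MODE_INIT, ICR_DEST_ALL_BUT_SELF, 0) yields the
000C4500H from the SDM listing, while a targeted INIT uses ICR_DEST_FIELD
(low dword 00004500H) together with icr_high_xapic() for the chosen AP.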

I don't really know if we need to increase the delays or not.  I have no idea
what Intel's source for those numbers in the two documents is.  I don't think
I've ever seen a rationale for why they were chosen.

BTW, Linux seems to use the equivalent of 100 milliseconds for the
lapic_ipi_wait() stage before doing the other delays (see
native_safe_apic_wait_icr_idle() for the non-X2APIC case).
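
For reference, a bounded wait of that kind can be sketched as a polling loop
on the ICR delivery-status bit (bit 12 in xAPIC mode).  This is a minimal
sketch, not the code of lapic_ipi_wait() or of the Linux function: the
function-pointer hooks, the sim_* helpers, and the choice of 100-microsecond
steps are assumptions made so the example can run without hardware.

```c
#include <stdint.h>

#define ICR_DELIVERY_PENDING	(1u << 12)	/* delivery status, xAPIC mode */

typedef uint32_t (*icr_read_fn)(void);
typedef void (*delay_fn)(unsigned usec);

/*
 * Poll until the ICR reports idle or the budget of max_polls * step_us
 * microseconds is spent.  Returns 1 if idle, 0 on timeout.
 */
static int
wait_icr_idle(icr_read_fn read_icr, delay_fn delay_us, unsigned step_us,
    unsigned max_polls)
{
	unsigned i;

	for (i = 0; i < max_polls; i++) {
		if ((read_icr() & ICR_DELIVERY_PENDING) == 0)
			return (1);
		delay_us(step_us);
	}
	/* One last look after the final delay. */
	return ((read_icr() & ICR_DELIVERY_PENDING) == 0);
}

/* Simulated hardware: the ICR reads busy for the next sim_busy_reads reads. */
static unsigned sim_busy_reads;
static unsigned sim_elapsed_us;

static uint32_t
sim_read_icr(void)
{
	if (sim_busy_reads > 0) {
		sim_busy_reads--;
		return (ICR_DELIVERY_PENDING);
	}
	return (0);
}

static void
sim_delay_us(unsigned usec)
{
	sim_elapsed_us += usec;
}
```

Calling wait_icr_idle(sim_read_icr, sim_delay_us, 100, 1000) gives a
100-millisecond ceiling; raising max_polls is the only change needed to
experiment with a longer timeout.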

-- 
John Baldwin


Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3149031.gmIvmB3vKt>
