Date:      Wed, 12 Jan 2005 04:49:19 -0600 (CST)
From:      Scott Bennett <bennett@cs.niu.edu>
To:        atkielski.anthony@wanadoo.fr
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Hyperthreading hurts 5.3?
Message-ID:  <200501121049.j0CAnJQe028309@mp.cs.niu.edu>

     On Wed, 12 Jan 2005 06:21:18 +0100 Anthony Atkielski
<atkielski.anthony@wanadoo.fr> wrote:

>Olivier Nicole writes:
>
>ON> Maybe for the same reason you should better not use a non-SMP kernel
>ON> if you have 2 CPU in your box.
>
>Is a hyperthreading CPU identical to a second CPU from the software's
>standpoint?  If not, what are the differences?
>
     Well, no, not exactly.  The two logical processors of an HT CPU share
certain resources on the chip that are not shared in a multi-CPU situation,
and that sharing means certain operations have to be handled differently.
An MP setup has separate cache and TLB management in each CPU, whereas P4
w/HT logical processors share this memory management circuitry.  Alteration
of a cache line requires notification of the other processor(s) in an MP
situation to mark any corresponding line in its(their) cache(s) as invalid,
because multiple separate caches are involved, but notification is not
necessary in the P4 w/HT situation because it's the same cache being seen by
both logical processors.
     Alteration/invalidation of TLB entries requires notification to
invalidate in an MP, so that the other CPU(s) can purge any corresponding TLB
entries it(they) may have, but notification is not required in the P4 w/HT
situation because both logical processors are referring to the same TLB.  Again,
unnecessary purging would be a performance hit.
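     To make the difference concrete, here is a toy model of the TLB case
described above.  It is only a sketch under the assumptions already stated
(separate TLBs per CPU in an MP, one shared TLB for both logical processors
in the P4 w/HT); the names Tlb, invalidate_mp, and invalidate_ht are made up
for illustration and do not correspond to any kernel or hardware interface.

```python
# Toy model only: real TLBs are hardware structures, and all names here
# (Tlb, invalidate_mp, invalidate_ht) are hypothetical.

class Tlb:
    """A trivially simple TLB: maps virtual page number -> physical frame."""

    def __init__(self):
        self.entries = {}

    def fill(self, vpn, pfn):
        self.entries[vpn] = pfn

    def purge(self, vpn):
        # Dropping an entry that is not present is harmless.
        self.entries.pop(vpn, None)


def invalidate_mp(tlbs, initiator, vpn):
    """Separate TLB per CPU: every other CPU must be told to purge.

    Returns the number of cross-CPU notifications sent.
    """
    tlbs[initiator].purge(vpn)
    notifications = 0
    for cpu, tlb in enumerate(tlbs):
        if cpu != initiator:
            tlb.purge(vpn)  # stands in for a notification to that CPU
            notifications += 1
    return notifications


def invalidate_ht(shared_tlb, vpn):
    """One TLB seen by both logical processors: a single local purge."""
    shared_tlb.purge(vpn)
    return 0  # nobody else to notify


# Two physical CPUs, each with its own TLB caching the same page.
mp_tlbs = [Tlb(), Tlb()]
for t in mp_tlbs:
    t.fill(0x1000, 0x42)
mp_notifications = invalidate_mp(mp_tlbs, initiator=0, vpn=0x1000)

# Two logical processors sharing one TLB.
ht_tlb = Tlb()
ht_tlb.fill(0x1000, 0x42)
ht_notifications = invalidate_ht(ht_tlb, vpn=0x1000)

print(mp_notifications, ht_notifications)
```

     The notification count grows with the number of other CPUs in the MP
case and stays at zero in the shared case, which is the whole point of not
treating the two logical processors as if they were separate CPUs.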
     There must be some special handling of TLB entries in the P4 w/HT that
I haven't seen documented.  (There almost certainly is documentation; I just
haven't seen it yet.)  There must be some way to distinguish TLB entries
filled on behalf of one logical processor from those filled on behalf of the
other logical processor.  If there weren't, then one logical
processor would use TLB entries for the address space running on the other
logical processor, which would, of course, be Very Bad.  But, to improve
performance, there should be some way to share TLBs for the case of two
threads running concurrently in the same address space.  If anyone reading
this knows the details of how this is handled in these chips, please post them
here.
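     For what it's worth, one way such a scheme could work in principle is
to tag each TLB entry with an identifier for the address space it belongs
to.  The sketch below is purely hypothetical: the TaggedTlb class and the
"asid" tag are assumptions made up for illustration, and nothing here is a
claim about what the P4 actually does.

```python
# Hypothetical sketch: tagging entries with an address-space identifier is
# an assumption made for illustration, not a description of the P4.

class TaggedTlb:
    """Shared TLB whose entries are keyed by (address-space tag, page)."""

    def __init__(self):
        self.entries = {}  # (asid, vpn) -> pfn

    def fill(self, asid, vpn, pfn):
        self.entries[(asid, vpn)] = pfn

    def lookup(self, asid, vpn):
        """Return the frame number, or None on a miss."""
        return self.entries.get((asid, vpn))


tlb = TaggedTlb()

# Logical processor 0 runs a thread in address space 7 and fills an entry.
tlb.fill(asid=7, vpn=0x1000, pfn=0x42)

# Logical processor 1 running in the same address space can reuse it...
same_space = tlb.lookup(asid=7, vpn=0x1000)

# ...but a thread in a different address space misses on the same page.
other_space = tlb.lookup(asid=9, vpn=0x1000)
```

     With a tag like this, entries are never used across address spaces,
yet two threads in the same address space share them for free.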


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:       bennett at cs.niu.edu                              *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************


