Date: Wed, 12 Jan 2005 04:49:19 -0600 (CST)
From: Scott Bennett <bennett@cs.niu.edu>
To: atkielski.anthony@wanadoo.fr
Cc: freebsd-questions@freebsd.org
Subject: Re: Hyperthreading hurts 5.3?
Message-ID: <200501121049.j0CAnJQe028309@mp.cs.niu.edu>

On Wed, 12 Jan 2005 06:21:18 +0100 Anthony Atkielski
<atkielski.anthony@wanadoo.fr> wrote:
>Olivier Nicole writes:
>
>ON> Maybe for the same reason you should better not use a non-SMP kernel
>ON> if you have 2 CPU in your box.
>
>Is a hyperthreading CPU identical to a second CPU from the software's
>standpoint?  If not, what are the differences?
>
     Well, no, not exactly.  The two logical processors of a
hyperthreaded CPU share certain resources on the chip that are not
shared in a multi-CPU situation, and that sharing means certain
operations have to be handled differently.  An MP setup has separate
cache and TLB management in each CPU, whereas P4 w/HT logical processors
share this memory management circuitry.

     Alteration of a cache line requires notification of the other
processor(s) in an MP situation, so that they can mark any corresponding
line in their own cache(s) as invalid, because multiple separate caches
are involved.  Notification is not necessary in the P4 w/HT situation
because it's the same cache being seen by both logical processors.

     Alteration/invalidation of TLB entries likewise requires
notification in an MP, so that the other CPU(s) can purge any
corresponding TLB entries they may have, but notification is not
required in the P4 w/HT situation because both logical processors are
referring to the same TLB.  In both cases, unnecessary notification and
purging would be a performance hit.

     There must be some special handling of TLB entries in the P4 w/HT
that I haven't seen documented.  (There almost certainly is
documentation; I just haven't seen it yet.)  There must be some way to
distinguish TLB entries filled on behalf of one logical processor from
those filled on behalf of the other.  If there weren't, then one logical
processor would use TLB entries for the address space running on the
other logical processor, which would, of course, be Very Bad.  But, to
improve performance, there should be some way to share TLB entries for
the case of two threads running concurrently in the same address space.
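
     To make the tagging question concrete, here is a toy model in
Python.  It is only a sketch under stated assumptions, not a description
of how the P4 actually does it: the class SharedTLB, the three tag modes,
and the notion of an address space ID ("asid") are all made up for
illustration.

```python
# Toy model only: NOT a description of the real P4 w/HT hardware.
# SharedTLB, tag_mode and asid are hypothetical names for illustration.

class SharedTLB:
    """One TLB shared by two logical processors, keyed by (tag, page)."""

    def __init__(self, tag_mode):
        # "none": entries carry no tag at all
        # "cpu":  entries are tagged with the logical processor number
        # "asid": entries are tagged with an address space identifier
        if tag_mode not in ("none", "cpu", "asid"):
            raise ValueError("unknown tag mode: %r" % (tag_mode,))
        self.tag_mode = tag_mode
        self.entries = {}

    def _key(self, cpu, asid, vpage):
        if self.tag_mode == "cpu":
            return (cpu, vpage)
        if self.tag_mode == "asid":
            return (asid, vpage)
        return (None, vpage)

    def fill(self, cpu, asid, vpage, frame):
        """Record a translation on behalf of one logical processor."""
        self.entries[self._key(cpu, asid, vpage)] = frame

    def lookup(self, cpu, asid, vpage):
        """Return the cached frame, or None on a TLB miss."""
        return self.entries.get(self._key(cpu, asid, vpage))


def demo(tag_mode):
    tlb = SharedTLB(tag_mode)
    # Logical CPU 0 runs address space 1 and maps page 0x10 to frame 0xA.
    tlb.fill(cpu=0, asid=1, vpage=0x10, frame=0xA)
    # Logical CPU 1 running a DIFFERENT address space touches page 0x10.
    other_space = tlb.lookup(cpu=1, asid=2, vpage=0x10)
    # Logical CPU 1 running a thread in the SAME address space does too.
    same_space = tlb.lookup(cpu=1, asid=1, vpage=0x10)
    return other_space, same_space


if __name__ == "__main__":
    for mode in ("none", "cpu", "asid"):
        print(mode, demo(mode))
```

     In this model, with no tag the second logical processor gets a hit
on the other address space's entry (the Very Bad case).  Tagging by
logical processor makes that a miss, but the same-address-space lookup
misses too, so nothing is shared.  Tagging by address space gives a miss
in the first case and a hit in the second.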

     If anyone reading this knows the details of how this is handled in
these chips, please post them here.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet: bennett at cs.niu.edu                                    *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
* -- Gov. John Hancock, New York Journal, 28 January 1790            *
**********************************************************************