Date: Wed, 12 Jan 2005 15:29:28 -0600 (CST)
From: Scott Bennett <bennett@cs.niu.edu>
To: freebsd-questions@freebsd.org
Cc: atkielski.anthony@wanadoo.fr
Subject: Re: Hyperthreading hurts 5.3?
Message-ID: <200501122129.j0CLTSQo008877@mp.cs.niu.edu>

On Wed, 12 Jan 2005 18:45:56 +0100 Anthony Atkielski
<atkielski.anthony@wanadoo.fr> wrote:
>Scott Bennett writes:
>
>SB> Well, no, not exactly. The dual-cored CPUs share certain resources
>SB> on the chip that are not shared in a multi-CPU situation, and that sharing
>SB> means certain operations have to be handled differently. An MP setup has
>SB> separate cache and TLB management in each CPU ...
>
>What's TLB?

Translation Lookaside Buffer. (I know. It's a weird name. I think its
origin was at IBM, which would explain the weirdness completely.) It's a
collection of registers that contains the results of the most recent
address translations. These results are kept around in order to avoid
going through the full address translation process for addresses in pages
for which address translation has already occurred. It's a big time-saver.
When a virtual address is encountered that isn't in the TLB, then address
translation proceeds from scratch, and the result replaces the least
recently used entry in the TLB.

>
>SB> ... whereas P4 w/HT logical processors share this memory management
>SB> circuitry. Alteration of a cache line requires notification of the
>SB> other processor(s) in an MP situation to mark any corresponding line
>SB> in its(their) cache(s) because multiple separate caches are
>SB> involved, but notification is not necessary in the P4 w/HT situation
>SB> because it's the same cache being seen by both logical processors.
>SB> Alteration/invalidation of TLB entries requires notification to
>SB> invalidate in an MP, so that the other CPU(s) can purge any corresponding TLB
>SB> entries it(they) may have, but notification is not required in the P4 w/HT
>SB> situation because both logical processors are referring to the same TLB. Again,
>SB> unnecessary purging would be a performance hit.
>SB> There must be some special handling of TLB entries in the P4 w/HT that
>SB> I haven't seen documented. (There almost certainly is documentation; I just
>SB> haven't seen it yet.) There must be some way to distinguish between TLB
>SB> entries filled per orders of one logical processor from those filled per
>SB> orders of the other logical processor. If there weren't, then one logical
>SB> processor would use TLB entries for the address space running on the other
>SB> logical processor, which would, of course, be Very Bad. But, to improve
>SB> performance, there should be some way to share TLBs for the case of two
>SB> threads running concurrently in the same address space. If anyone reading
>SB> this knows the details of how this is handled in these chips, please post them
>SB> here.
>
>From what you say and from what I've read today, it sounds like
>hyperthreading comes close to providing two separate processors for
>heterogeneous system loads (where each hyperthread is using slightly
>different processor resources at any given instant), but it may not buy
>much of anything for massively parallel compute-bound work, since both
>threads may want nearly the same things at the same time and will thus
>effectively be forced to spend a lot of time waiting for each other.

I think that's probably close to the truth. For logic and number
crunching, the two logical processors can proceed in parallel. But they
will compete for any non-register memory access, for address translation
time, and possibly other resources.

I notice that the 5.2.1 boot messages refer to the second core as an AP,
which I'm guessing stands for "attached processor". If that guess is
correct, then it means that only the first core is able to perform certain
functions, and the AP core has to get the first core to do those things
for it when it needs them done. Typically, such restricted functions
include things like starting I/O operations, handling I/O interrupts,
setting the system clock, etc. Whether these restrictions are the actual
ones, if there are any at all, in this situation, I do not know.
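
To make the TLB behavior described above a little more concrete, here is
a minimal sketch in Python. It is not a description of how the P4 actually
does it. It models a small TLB with least-recently-used replacement, and it
tags each entry with the ID of the logical processor that filled it, which
is one conceivable way of keeping one logical processor from using the
other's translations. The class name TaggedTLB, the tagging scheme, the
4 KB page size, and the toy page tables are all assumptions made up for
illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size, for illustration only


class TaggedTLB:
    """Toy TLB: LRU replacement, entries tagged by logical processor ID."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Maps (cpu_id, virtual page number) -> physical frame number.
        # Insertion order doubles as recency order (oldest first).
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, cpu_id, vaddr, page_table):
        """Return the physical address for vaddr as seen by cpu_id."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        key = (cpu_id, vpn)
        if key in self.entries:
            # Hit: reuse the cached translation and mark it most recent.
            self.hits += 1
            self.entries.move_to_end(key)
        else:
            # Miss: do the "full" translation (here, a dict lookup) and
            # replace the least recently used entry if the TLB is full.
            self.misses += 1
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)
            self.entries[key] = page_table[vpn]
        return self.entries[key] * PAGE_SIZE + offset

    def invalidate(self, cpu_id, vpn):
        """Purge one entry, e.g. after the mapping for that page changes."""
        self.entries.pop((cpu_id, vpn), None)


# Hypothetical page tables for two different address spaces.
table_a = {0: 7, 1: 8}
table_b = {0: 3}

tlb = TaggedTLB(capacity=2)
tlb.translate(0, 0x0010, table_a)  # miss: first touch of page 0 by CPU 0
tlb.translate(0, 0x0020, table_a)  # hit: same page, same logical CPU
tlb.translate(1, 0x0010, table_b)  # miss: same page number, different tag
```

Because the tag is part of the lookup key, logical processor 1 never
picks up the translation that logical processor 0 cached for the same
virtual page. The sketch does not attempt the sharing case (two threads in
one address space); that would need a tag based on the address space
rather than on the logical processor.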

>
>Fortunately, my server has a very mixed load, as one would expect for a
>generic domain server, so hopefully it will profit from hyperthreading.
>

What Intel claims is essentially that the HT-enabled CPUs allow snappier
responses in interactive processes when a CPU-bound process is running.

>And hopefully no weird stuff will happen because I've turned on HT
>(although offhand I'm not sure what would happen, unless there are
>hidden hardware conflicts or something specific and software-visible
>about HT in normal operation that might expose a bug). I'm not sure
>that I see how HT could affect Serial ATA disks, for example, any more
>than having two separate physical processors would.
>

                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:       bennett at cs.niu.edu                              *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************