From owner-freebsd-ia32@FreeBSD.ORG Wed Dec 6 04:28:35 2006
Return-Path:
X-Original-To: freebsd-ia32@freebsd.org
Delivered-To: freebsd-ia32@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AA7C316A407 for ; Wed, 6 Dec 2006 04:28:35 +0000 (UTC) (envelope-from ranjith_kumar_b4u@yahoo.com)
Received: from web58611.mail.re3.yahoo.com (web58611.mail.re3.yahoo.com [68.142.236.209]) by mx1.FreeBSD.org (Postfix) with SMTP id D81A443C9D for ; Wed, 6 Dec 2006 04:27:51 +0000 (GMT) (envelope-from ranjith_kumar_b4u@yahoo.com)
Received: (qmail 59295 invoked by uid 60001); 6 Dec 2006 04:28:34 -0000
Message-ID: <20061206042834.59293.qmail@web58611.mail.re3.yahoo.com>
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=cgLK5asv3pzUEFukFAWUPZ5cGWStGmsPfBl99IyEvNDrrMnNJDJlfOcCR0z4r8EODPw7Pa15ihOYJsl/XNPxJlIBjkqSzBWEFGBRdGZi9n/215hJkjbBeKfu7UNOoN1AOYuCHOuGTbz4uKtmdpJtV5V3RqQ6Vs/4T3M84+C4Zfc=;
X-YMail-OSG: PbcZiO0VM1kibaftMpLcBxTsfJQOpB3D1Gvu3VfbGsG1Pd2YLjCCfX1wNrQnouAHYXDj0yavad2N_UJlDDaa_.jVWwaDK5WWy888HlQBmIFwIOZECp0j6A--
Received: from [59.163.25.48] by web58611.mail.re3.yahoo.com via HTTP; Tue, 05 Dec 2006 20:28:34 PST
Date: Tue, 5 Dec 2006 20:28:34 -0800 (PST)
From: ranjith kumar
To: freebsd-ia32@freebsd.org
In-Reply-To: <3bbf2fe10611160753q3303d81bw515bffe9af4ee0c9@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailman-Approved-At: Wed, 06 Dec 2006 05:27:27 +0000
Subject: prefetching on pentium4
X-BeenThere: freebsd-ia32@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD on the IA-32 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 06 Dec 2006 04:28:35 -0000

Hi,

There are 4 types of prefetch instructions on the Pentium 4 (IA-32)
processor: prefetchnta, prefetcht0, prefetcht1, and prefetcht2.

In the case of the Pentium 4, the IA-32 optimization manuals say that prefetcht0, prefetcht1, and prefetcht2 are identical.

They also say that ONLY the prefetchnta instruction prefetches data into the L2 cache without polluting the caches.

When all four instructions prefetch data into the L2 cache (not into the L1 cache), what is the meaning of saying that prefetchnta does not pollute the caches?

That is, what is the difference between prefetchnta and the other instructions?

Thanks in advance.

From owner-freebsd-ia32@FreeBSD.ORG Wed Dec 6 08:56:20 2006
Return-Path:
X-Original-To: freebsd-ia32@freebsd.org
Delivered-To: freebsd-ia32@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E680316A4C2 for ; Wed, 6 Dec 2006 08:56:20 +0000 (UTC) (envelope-from olivier.certner@free.fr)
Received: from smtp6-g19.free.fr (smtp6-g19.free.fr [212.27.42.36]) by mx1.FreeBSD.org (Postfix) with ESMTP id BAEDD43D8E for ; Wed, 6 Dec 2006 08:54:48 +0000 (GMT) (envelope-from olivier.certner@free.fr)
Received: from lon92-4-82-226-188-149.fbx.proxad.net (lon92-4-82-226-188-149.fbx.proxad.net [82.226.188.149]) by smtp6-g19.free.fr (Postfix) with ESMTP id 7A50B434E0 for ; Wed, 6 Dec 2006 09:55:31 +0100 (CET)
From: Olivier Certner
To: freebsd-ia32@freebsd.org
Date: Wed, 6 Dec 2006 09:54:48 +0100
User-Agent: KMail/1.9.1
References: <20061206042834.59293.qmail@web58611.mail.re3.yahoo.com>
In-Reply-To: <20061206042834.59293.qmail@web58611.mail.re3.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200612060954.48736.>
Subject: Re: prefetching on pentium4
X-BeenThere: freebsd-ia32@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD on the
IA-32 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 06 Dec 2006 08:56:21 -0000

Hi,

On a Pentium 4, prefetcht0, prefetcht1 and prefetcht2 are identical, at least if you don't have a level 3 cache. Intel's documentation is not very clear about what happens with one more cache in the hierarchy.

The prefetchnta instruction does the same thing (it fetches some bytes of memory into the 2nd level cache), but it is supposed to place these bytes in only one way of the cache. I don't know how the way is chosen.

Unless you are trying to fetch a relatively large volume of data, or data with a special pattern (i.e., data that would be put at the same index in the cache, thus using more than one way), you won't see much difference from the prefetchtX variants. You'll have to determine the characteristics of the L2 cache on your particular P4 target processor in order to check that.

Olivier

From owner-freebsd-ia32@FreeBSD.ORG Wed Dec 6 19:20:39 2006
Return-Path:
X-Original-To: freebsd-ia32@freebsd.org
Delivered-To: freebsd-ia32@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5CC5116A49E for ; Wed, 6 Dec 2006 19:20:39 +0000 (UTC) (envelope-from asmrookie@gmail.com)
Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.188]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1E4BA43EEB for ; Wed, 6 Dec 2006 18:56:51 +0000 (GMT) (envelope-from asmrookie@gmail.com)
Received: by nf-out-0910.google.com with SMTP id x37so615211nfc for ; Wed, 06 Dec 2006 10:57:37 -0800 (PST)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth;
b=eHOUJD43jfIThGKcWIgawSLOSzFj9u7/wSOwuowDqkttLpUetIaXu/LZfbo4lh9hUfRcld0o+2cwnYjqmu89rbk5M4PFcPenUxQc66hiuYhfn0ZB62J7vwCIzf0t0qgQISzBIhryvBvW3RC8AEbN3nrVr+WZ8eHVzIuP7FsApTQ=
Received: by 10.82.131.1 with SMTP id e1mr280419bud.1165431055135; Wed, 06 Dec 2006 10:50:55 -0800 (PST)
Received: by 10.82.189.18 with HTTP; Wed, 6 Dec 2006 10:50:55 -0800 (PST)
Message-ID: <3bbf2fe10612061050y6fa458abw3b1ace0cd1bebd37@mail.gmail.com>
Date: Wed, 6 Dec 2006 19:50:55 +0100
From: "Attilio Rao"
Sender: asmrookie@gmail.com
To: "ranjith kumar"
In-Reply-To: <20061206042834.59293.qmail@web58611.mail.re3.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <3bbf2fe10611160753q3303d81bw515bffe9af4ee0c9@mail.gmail.com> <20061206042834.59293.qmail@web58611.mail.re3.yahoo.com>
X-Google-Sender-Auth: d0830bb9088d0b51
Cc: freebsd-ia32@freebsd.org
Subject: Re: prefetching on pentium4
X-BeenThere: freebsd-ia32@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD on the IA-32 platform
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 06 Dec 2006 19:20:39 -0000

2006/12/6, ranjith kumar :
> Hi,
> There are 4 types of prefetch instructions on the
> Pentium 4 (IA-32) processor:
> prefetchnta, prefetcht0, prefetcht1, and prefetcht2.
>
> In the case of the Pentium 4, the IA-32 optimization manuals say
> that prefetcht0, prefetcht1, and prefetcht2 are identical.
>
> They also say that ONLY the prefetchnta instruction prefetches
> data into the L2 cache without polluting the caches.
>
> When all four instructions prefetch data into the
> L2 cache (not into the L1 cache), what is the meaning of
> saying that prefetchnta does not pollute the caches?
>
> That is, what is the difference between prefetchnta and
> the other instructions?

First of all, it is important to say that a prefetch* instruction is only a hint to the CPU, not a *command*, so the CPU decides (in an unspecified way) whether or not to honor the caching request.
From this point of view, the prefetch* instructions are meant to be as accommodating as possible for the CPU.
The different numbers indicate how 'critical' the data is for the CPU (0 - highly critical, 2 - less critical); the more critical the data, the higher in the cache hierarchy the cache line is prefetched.
Hypothetically, this would mean:

prefetcht0 -> L1 prefetching
prefetcht1 -> L2 prefetching
prefetcht2 -> L3 prefetching

And this is what really happens, for example, on the P3 (since the P3 has no L3 cache, prefetcht2 == prefetcht1).
On the P4 things are different because you cannot manipulate the L1 cache directly, so what happens is:

prefetcht0 -> L2 prefetching
prefetcht1 -> L2 prefetching
prefetcht2 -> L3 prefetching

(if no L3 cache is present, prefetcht2 is the same as the others; hence the statement that all three instructions behave the same).

prefetchnta is completely different because it fetches a cache line into the NT cache structure. Non-temporal caches are global caches which are particularly powerful because they don't need snooping messages between CPUs (and, in this way, they reduce the CPU<->cache traffic), and they are used by the NTI family.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
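
As a rough illustration of the hint levels discussed in this thread, here is a minimal C sketch that uses the GCC/Clang builtin __builtin_prefetch instead of raw assembly. The function names and the PREFETCH_DISTANCE value are made up for illustration and would need tuning on real hardware. On x86 these compilers typically emit prefetchnta for locality 0 and prefetcht0 for locality 3, but which cache level the line actually lands in depends on the processor, and the CPU remains free to ignore the hint.

```c
#include <stddef.h>

/* Hypothetical look-ahead distance, in elements; tune per machine. */
#define PREFETCH_DISTANCE 16

/* Sum data that is read exactly once. Locality 0 means "no temporal
 * locality", which GCC/Clang typically emit as prefetchnta on x86. */
long sum_streaming(const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);
        total += data[i];
    }
    return total;
}

/* Same loop for data that will be reused soon. Locality 3 means "high
 * temporal locality", typically emitted as prefetcht0 on x86. */
long sum_reused(const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        total += data[i];
    }
    return total;
}
```

Both functions return the same result; only the hint differs. Whether either one is faster than a plain loop can only be established by measuring on the target processor.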