Date:      Sun, 7 Apr 1996 06:01:26 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        bde@zeta.org.au, tege@matematik.su.se
Cc:        asami@cs.berkeley.edu, current@FreeBSD.org, hasty@rah.star-gate.com, mrami@minerva.cis.yale.edu, nisha@cs.berkeley.edu
Subject:   Re: optimized bzeros found harmful (was: fast memory copy ...)
Message-ID:  <199604062001.GAA08258@godzilla.zeta.org.au>

>  This behaviour is consistent with the data being zeroed usually not being
>  in the L2 cache.  RBW is 33% slower in that case on my system.  Other
>  cases: if the data is in the L2 cache but not in the L1 cache, then RBW
>  is between 0% and 33% faster; if the data is in the L1 cache, then
>  RBW is 8.5 times faster (740MB/s!).

>This must be a misunderstanding!

>If the data is really in the L1 cache, the read-before-write is wasted and
>just contributes to the overhead.

It must not be in the L1 cache.  (Why not?)  `perfmon' in -current shows
much more bus activity for write test 3 than for write test 4.  E.g.,
counter 25 (PMC5_WRITE_BACKUP_STALL) is about 117e6 events for test 3
and only 5e6 for test 4.  This is for copying a total amount of 100e6
bytes.

Let's see your output for `./w -5' and your explanation of it.

>The read-before-write is effective if and only if the data is not in the L1
>cache.  In that case, it forces allocation of the cache line in the L1
>cache, and thereby allows a 14x peak speedup.

>If other behaviours are observed, the timing framework confuses you.

Let's see your output for `./w -l 65536 -5'.  64K should fit in the L2
cache (512K).  Why does read-before-write give only a 25% speedup?
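
For concreteness, a read-before-write zeroing loop of the kind being
discussed might look roughly like the sketch below.  This is not the code
from the test program; the name bzero_rbw and the LINE_SIZE constant are
made up for illustration, and the 32-byte line size is an assumption that
matches the Pentium's L1 data cache.  It also assumes that the buffer
starts on a line boundary if one read per line is to be enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_SIZE 32	/* assumed L1 line size (Pentium); hypothetical name */

/*
 * Zero len bytes at buf, reading one byte of each line-sized chunk before
 * writing it.  On a cache that does not allocate on write, the read is
 * what brings the line into the L1 cache so that the stores can hit.
 */
static void
bzero_rbw(void *buf, size_t len)
{
	unsigned char *p = buf;
	size_t i, n;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i += LINE_SIZE) {
		n = len - i < LINE_SIZE ? len - i : LINE_SIZE;
		/* Volatile read so that the compiler can't drop the load. */
		(void)*(volatile unsigned char *)(p + i);
		memset(p + i, 0, n);
	}
}
```

Whether the extra read helps or hurts is exactly what is in dispute above;
the sketch only shows the access pattern, not a tuned implementation.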

>All other CPUs I know of have caches that do allocate-on-write.

Perhaps the Pentium behaviour is best.  It seems to penalize writing to
the same location without reading it, but this is abnormal behaviour.

Bruce

