Date: Fri, 19 Feb 1999 12:47:41 +1100
From: Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>
To: hackers@FreeBSD.ORG
Subject: Re: vm_page_zero_fill
Message-ID: <99Feb19.123711est.40325@border.alcanet.com.au>

Alfred Perlstein <bright@cygnus.rush.net> wrote:
>After playing with "gcc -O -S bcmp.c" on several platforms, i386,
>sparc32, alpha. It seems to me that the function ought to be
>replaced with this:
[deleted]

The code given is portable, but not optimal for any of these
architectures, especially the Alpha. The original Alpha chips don't
have byte load/store instructions, so character handling is quite poor
(and gcc 2.7.x doesn't include support for the new character
instructions). Optimal code for the Alpha would read aligned 8-byte
chunks from memory, then re-align them as needed and compare them.
(There's some discussion about this, though not actual code, in the
early Alpha white papers.) A similar strategy probably holds for the
SPARC (but with 4-byte loads, except on UltraSPARCs). Something similar
could be done on the ix86, but I'm not certain about the advantages.

This _is_ one area where carefully hand-crafted code is worth the
effort (especially on the RISC architectures).

>it uses the "rep cmpsl" opcode, i have heard that using "movs/lods/cmps"
>was no longer optimal after the 486 line, but i'm unsure.

Sort of true. In theory, an explicit loop is faster than "rep cmps".
In practice, limited CPU<->RAM bandwidth tends to make this less of an
issue unless both strings are in L1 cache.

Peter

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
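
For illustration only, here is a minimal C sketch of the word-at-a-time idea discussed above. It is not the code from the thread and not a tuned Alpha or SPARC routine. The name bcmp_words is hypothetical, and the sketch uses memcpy for each 8-byte load, leaving alignment to the compiler, rather than the explicit aligned-load-and-realign scheme that hand-crafted code would use. It follows the bcmp convention of returning 0 when the buffers match and nonzero otherwise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (the name is an assumption, not from the thread).
 * Returns 0 if the first len bytes of b1 and b2 are identical,
 * nonzero otherwise.
 */
int bcmp_words(const void *b1, const void *b2, size_t len)
{
    const unsigned char *p1 = b1;
    const unsigned char *p2 = b2;

    /*
     * Compare one 8-byte word per iteration. Copying into locals with
     * memcpy keeps the loads well-defined for any pointer alignment;
     * compilers commonly lower a fixed-size memcpy to a plain load
     * where the target permits it.
     */
    while (len >= sizeof(uint64_t)) {
        uint64_t w1, w2;

        memcpy(&w1, p1, sizeof w1);
        memcpy(&w2, p2, sizeof w2);
        if (w1 != w2)
            return 1;
        p1 += sizeof w1;
        p2 += sizeof w2;
        len -= sizeof w1;
    }

    /* Tail: fewer than 8 bytes remain, compare them one at a time. */
    while (len-- > 0) {
        if (*p1++ != *p2++)
            return 1;
    }
    return 0;
}
```

Whether this beats a byte loop or "rep cmps" on a given CPU would need measuring; as noted above, memory bandwidth tends to dominate unless both buffers are already in L1 cache.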