Date: Thu, 7 Apr 2005 13:42:52 -0500
From: Alan Cox <alc@cs.rice.edu>
To: Alexey Dokuchaev <danfe@FreeBSD.ORG>, Alan Cox <alc@FreeBSD.ORG>, src-committers@FreeBSD.ORG, cvs-src@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject: Re: cvs commit: src/lib/libc/amd64/string Makefile.inc bcopy.S bzero.S memcpy.S memmove.S memset.S
Message-ID: <20050407184252.GE20275@cs.rice.edu>
In-Reply-To: <20050407132932.GA27083@VARK.MIT.EDU>
References: <200504070356.j373u3MP005490@repoman.freebsd.org> <20050407093046.GC1049@FreeBSD.org> <20050407132932.GA27083@VARK.MIT.EDU>

On Thu, Apr 07, 2005 at 09:29:33AM -0400, David Schultz wrote:
> On Thu, Apr 07, 2005, Alexey Dokuchaev wrote:
> > On Thu, Apr 07, 2005 at 03:56:03AM +0000, Alan Cox wrote:
> > > alc         2005-04-07 03:56:03 UTC
> > >
> > >   FreeBSD src repository
> > >
> > >   Added files:
> > >     lib/libc/amd64/string Makefile.inc bcopy.S bzero.S memcpy.S
> > >                           memmove.S memset.S
> > >   Log:
> > >   Add machine-specific, optimized implementations of bcopy, bzero, memcpy,
> > >   memmove, and memset.
> >
> > Great!  Are we going to see something like this for ia32?
>
> i386 has had them since the beginning of time, and the code
> Alan committed is a port of the i386 versions.

Yes, exactly.  That said, the benefits are profound on
microbenchmarks and measurable on macrobenchmarks, like buildworld.

As for more "exotic" copy routines, these are user-space routines,
so SSE registers can already be used if desired.  However, the AMD
optimization manual only recommends their use for very large copies.
One reason is that even the simple methods already move or zero data
64 bits at a time.  So, switching to 128-bit SSE registers has a less
dramatic effect than on i386, where Matt is benchmarking.

Alan
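
To illustrate the "64 bits at a time" point, here is a minimal C sketch
of a word-at-a-time zeroing loop.  It is not the committed assembly, and
the name sketch_bzero64 is hypothetical; it only shows the idea of
storing eight bytes per iteration and finishing the tail byte by byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration only; not the FreeBSD libc bzero.
 * Zeroes len bytes at dst, eight bytes per iteration where possible.
 */
static void
sketch_bzero64(void *dst, size_t len)
{
	unsigned char *p = dst;
	const uint64_t zero = 0;

	/*
	 * Bulk loop: one 8-byte store per iteration.  Using memcpy with
	 * a fixed 8-byte size sidesteps alignment and aliasing problems;
	 * compilers commonly turn it into a single 64-bit store.
	 */
	while (len >= sizeof(zero)) {
		memcpy(p, &zero, sizeof(zero));
		p += sizeof(zero);
		len -= sizeof(zero);
	}

	/* Tail: fewer than eight bytes remain, clear them one by one. */
	while (len > 0) {
		*p++ = 0;
		len--;
	}
}
```

With 8-byte stores as the baseline, a 16-byte SSE store can at best
halve the number of store instructions, whereas against a 4-byte
baseline on i386 it cuts them to a quarter.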