Date: Thu, 18 Jan 2007 14:47:55 -0800
From: Chuck Swiger <cswiger@mac.com>
To: Maxim Sobolev <sobomax@FreeBSD.org>
Cc: freebsd-current@FreeBSD.org, freebsd-arch@FreeBSD.org
Subject: Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
Message-ID: <0B6D259B-618B-466C-844E-3F79FDE272BB@mac.com>
In-Reply-To: <45AFF47E.3020905@FreeBSD.org>
References: <3bbf2fe10607250813w8ff9e34pc505bf290e71758@mail.gmail.com>
 <3bbf2fe10607281004o6727e976h19ee7e054876f914@mail.gmail.com>
 <3bbf2fe10701160851r79b04464m2cbdbb7f644b22b6@mail.gmail.com>
 <20070116154258.568e1aaf@pleiades.nextvenue.com>
 <b1fa29170701161355lc021b90o35fa5f9acb5749d@mail.gmail.com>
 <eoji7s$cit$2@sea.gmane.org>
 <b1fa29170701161425n7bcfe1e5m1b8c671caf3758db@mail.gmail.com>
 <eojlnb$qje$1@sea.gmane.org>
 <3bbf2fe10701161525j6ad9292y93502b8df0f67aa9@mail.gmail.com>
 <45AD6DFA.6030808@FreeBSD.org>
 <3bbf2fe10701161655p5e686b52n7340b3100ecfab93@mail.gmail.com>
 <200701172022.l0HKMYV8053837@apollo.backplane.com>
 <20070118113831.A11834@delplex.bde.org>
 <200701181948.l0IJmdfn061671@apollo.backplane.com>
 <45AFED63.7020009@FreeBSD.org>
 <25EB3FED-71A9-4AE1-9A38-5D2DC27D0DF7@mac.com>
 <45AFF47E.3020905@FreeBSD.org>

On Jan 18, 2007, at 2:28 PM, Maxim Sobolev wrote:
>> Unfortunately, there are simply different tradeoffs between
>> mechanisms for copying depending on whether you want to use or
>> avoid using/thrashing the L1/L2 caches, whether the data is cache-
>> aligned, and so forth; the CPU can't infer what you want to
>> occur-- you have to tell it.  I find it interesting that some of
>> the architectures (PA-RISC,
>
> Well, of course there are some special cases, but in general there
> should be some baseline suitable for most of uses. That's why we
> (and most other operating systems) only provide single version for
> the mem*(3) APIs.

Well, a truly generic version is in lib/libc/string/bcopy.c; it's
architecture-neutral (i.e., it's pure C code) and it handles all kinds of
things like overlapping source and destination addresses, non-aligned
access, and so forth.  The downside is that it's slower than using
movl/movsl, let alone some of the fancier variants that Bruce and Matt
have been discussing (in considerable, interesting detail) earlier:

http://now.cs.berkeley.edu/Td/bcopy.html

If you're only moving, say, 5 bytes, the overhead of fancy loop
unrolling and prefetching and so forth isn't going to help compared
with a simple movb/movl combination, so it really depends.

-- 
-Chuck
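
To make the "generic version" point concrete, here is a minimal sketch of
the kind of overlap-safe, alignment-agnostic copy a pure C routine has to
provide.  This is an illustration only and is not the code in
lib/libc/string/bcopy.c; the name generic_copy is hypothetical, and it
copies one byte at a time for clarity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration: a byte-at-a-time copy that tolerates
 * overlapping buffers.  Byte accesses have no alignment requirement,
 * so unaligned source or destination addresses are fine.
 */
void *
generic_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (d == s || len == 0)
		return (dst);

	if ((uintptr_t)d < (uintptr_t)s) {
		/* Destination is below source: copy forward. */
		while (len--)
			*d++ = *s++;
	} else {
		/*
		 * Destination is above source: copy backward so that
		 * overlapping source bytes are read before they are
		 * overwritten.
		 */
		d += len;
		s += len;
		while (len--)
			*--d = *--s;
	}
	return (dst);
}
```

The direction check is what makes overlapping buffers safe; the
per-byte loop is what an optimized version would replace with wider
moves once the size and alignment justify the setup cost.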