Date: Mon, 14 Jul 2014 09:20:55 -0700
From: Justin Hibbits <jrh29@alumni.cwru.edu>
To: Alexey Dokuchaev <danfe@nsu.ru>
Cc: powerpc@freebsd.org
Subject: Re: How to convert SSEish _mm_set1_ps() into AltiVec correctly?
Message-ID: <CAHSQbTAG8rJbfyYG-FaQjuVm0ZYWAOLN6UcY-ycM%2Byuw0OvESw@mail.gmail.com>
In-Reply-To: <20140714154224.GA28612@regency.nsu.ru>
On Mon, Jul 14, 2014 at 8:42 AM, Alexey Dokuchaev <danfe@nsu.ru> wrote:
> Hi there,
>
> I'm a bit confused about how to convert _mm_set1_ps() [1] SSE function into
> its AltiVec equivalent. To start with, I need to set all four floats of a
> vector to the same value. So far, I've come up with two versions that work
> with GCC or Clang, but I want code that works with any compiler
> and is technically correct (works not just by accident).
>
> On PowerPC, there are two altivec.h files provided by GCC 4.2 and Clang:
>
> /usr/include/clang/3.4.1/altivec.h
> /usr/include/gcc/4.2/altivec.h
>
> The problem is that they are substantially different (read: offer different
> APIs). For Clang, I can simply write something like this:
>
>     union {
>         float f1, f2, f3, f4;
>         vector float f;
>     } foo;
>
>     foo.f = vec_splats(42.f);
>     // all f1, f2, f3, f4 are 42.f now
>
> But this does not work with GCC: it simply does not offer vec_splats(float);
> however, I can do this:
>
> float init = 42.f;
> foo.f = vec_ld(0, &init);
>
> And it will set all four components to 42.f. Yet this is technically wrong,
> as apparently I'm supposed to pass an entire array of floats, e.g. if built
> with Clang all floats are "nan". Let's change the code to this:
>
> float init[4] = { 42.f };
> foo.f = vec_ld(0, init);
>
> This works with both compilers, but I'm not sure if it is correct. Can any
> of our AltiVec experts give me some hint here? Thanks.
>
> ./danfe
>
> [1] http://msdn.microsoft.com/en-us/library/vstudio/2x1se8ha%28v=vs.100%29.aspx

I just tried the following:

    vector float a = (vector float){42.0f};
    vector float b = vec_splat(a, 0);

I haven't done anything more than compile-test it, but it builds with
both GCC and Clang. GCC uses vspltw, while Clang uses vperm.
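
The same two steps could be wrapped in a small helper along these lines.
This is only a sketch, not tested on real hardware: the name splat4 and the
scalar fallback are made up for illustration and come from neither
altivec.h, and the AltiVec path assumes the compiler is run with -maltivec
and accepts a non-constant value in the brace initializer (a GNU vector
extension).

```c
#include <assert.h>
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* Hypothetical helper (not part of any altivec.h): writes `value`
 * into all four elements of out[]. */
void splat4(float value, float out[4])
{
#if defined(__ALTIVEC__)
    /* Put the value in element 0 (the remaining elements are
     * zero-initialized), then replicate element 0 across the vector. */
    vector float a = (vector float){value};
    vector float b = vec_splat(a, 0);

    assert(out != NULL);
    /* Copy the 16 bytes out; memcpy has no alignment requirement,
     * unlike vec_st. */
    memcpy(out, &b, sizeof b);
#else
    /* Scalar fallback so the sketch also builds without AltiVec. */
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = value;
#endif
}
```

Calling splat4(42.0f, buf) with a float buf[4] should leave 42.0f in every
element on either path. Copying the result out with memcpy sidesteps the
union and the alignment that vec_ld/vec_st expect.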
- Justin
