Date:      Mon, 7 May 2018 22:57:13 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Oliver Pinter <oliver.pinter@hardenedbsd.org>
Cc:        Mateusz Guzik <mjg@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r333324 - in head/sys: amd64/amd64 conf
Message-ID:  <CAGudoHFwyeXmvQwFyhD_JG15uS8_DuEB3h_J1JtWERrRRm5OJg@mail.gmail.com>
In-Reply-To: <CAPQ4ffsPOCWwOYgPiwsUkewv5E7umBTbHA_pWPhTQsj41wt2vA@mail.gmail.com>
References:  <201805071507.w47F7SOs035073@repo.freebsd.org> <CAPQ4ffsPOCWwOYgPiwsUkewv5E7umBTbHA_pWPhTQsj41wt2vA@mail.gmail.com>

On Mon, May 7, 2018 at 9:03 PM, Oliver Pinter <oliver.pinter@hardenedbsd.org> wrote:

> On 5/7/18, Mateusz Guzik <mjg@freebsd.org> wrote:
> > Author: mjg
> > Date: Mon May  7 15:07:28 2018
> > New Revision: 333324
> > URL: https://svnweb.freebsd.org/changeset/base/333324
> >
> > Log:
> >   amd64: replace libkern's memset and memmove with assembly variants
> >
> >   memmove is repurposed bcopy (arguments swapped, return value added)
> >   The libkern variant is a wrapper around bcopy, so this is a big
> >   improvement.
> >
> >   memset is repurposed memcpy. The libkern variant is doing fishy stuff,
> >   including branching on 0 and calling bzero.
> >
> >   Both functions are rather crude and subject to partial depessimization.
> >
> >   This is a soft prerequisite to adding variants utilizing the
> >   'Enhanced REP MOVSB/STOSB' bit and let the kernel patch at runtime.
> >
> > +
> > +/*
> > + * memset(dst, c,   len)
> > + *        rdi, rsi, rdx
> > + */
> > +ENTRY(memset)
> > +     PUSH_FRAME_POINTER
> > +     movq    %rdi,%r9
> > +     movq    %rdx,%rcx
> > +     movq    %rsi,%rax
> > +     shrq    $3,%rcx
> > +     rep
> > +     stosq
>
> According to the Intel SDM, stosq stores the whole of RAX into the
> destination and then increments the destination register by 8. This
> implementation is wrong: c is a char, so RAX looks like
> 00000000000000CC, and the destination buffer would receive that
> quadword SIZE / 8 times, followed by SIZE % 8 bytes of CC.
>

Yes, my bad. I forgot to expand the argument with the multiplication trick. Fixed:
https://svnweb.freebsd.org/base?view=revision&revision=333332
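
For anyone unfamiliar with the multiplication trick, here is a minimal C sketch of the idea. It is not the assembly committed in r333332. The names expand_byte and memset_model are hypothetical, and the split into a quadword loop plus a byte tail is an assumption based on the SIZE / 8 and SIZE % 8 description quoted above. Multiplying the fill byte by 0x0101010101010101 copies it into every byte of a 64-bit word, which is the value rep stosq needs in RAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Replicate the low byte of c into all eight bytes of a 64-bit word.
 * 0xCC * 0x0101010101010101 == 0xCCCCCCCCCCCCCCCC
 */
static uint64_t
expand_byte(int c)
{
	return ((uint64_t)(unsigned char)c * 0x0101010101010101ULL);
}

/*
 * Hypothetical portable model of a memset built from a quadword fill
 * followed by a byte tail. Illustrative only, not the kernel routine.
 */
static void *
memset_model(void *dst, int c, size_t len)
{
	unsigned char *p = dst;
	uint64_t pattern = expand_byte(c);
	size_t i;

	assert(dst != NULL || len == 0);

	/* Models "rep stosq": len / 8 stores of the expanded pattern. */
	for (i = 0; i < len / 8; i++) {
		memcpy(p, &pattern, sizeof(pattern));
		p += sizeof(pattern);
	}
	/* Models the byte tail: the remaining len % 8 bytes. */
	for (i = 0; i < len % 8; i++)
		*p++ = (unsigned char)c;
	return (dst);
}
```

Without the expand_byte step, the quadword loop would write the byte CC followed by seven zero bytes on each iteration, which is the broken pattern pointed out above.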

-- 
Mateusz Guzik <mjguzik gmail.com>


