Date: Mon, 7 May 2018 21:05:13 +0200 From: Oliver Pinter <oliver.pinter@hardenedbsd.org> To: Mateusz Guzik <mjg@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r333324 - in head/sys: amd64/amd64 conf Message-ID: <CAPQ4ffvTLV2VTii7vthwKMT7jXyOrzkpGrAN4RYKF9tqaJkv_w@mail.gmail.com> In-Reply-To: <CAPQ4ffsPOCWwOYgPiwsUkewv5E7umBTbHA_pWPhTQsj41wt2vA@mail.gmail.com> References: <201805071507.w47F7SOs035073@repo.freebsd.org> <CAPQ4ffsPOCWwOYgPiwsUkewv5E7umBTbHA_pWPhTQsj41wt2vA@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On 5/7/18, Oliver Pinter <oliver.pinter@hardenedbsd.org> wrote: > On 5/7/18, Mateusz Guzik <mjg@freebsd.org> wrote: >> Author: mjg >> Date: Mon May 7 15:07:28 2018 >> New Revision: 333324 >> URL: https://svnweb.freebsd.org/changeset/base/333324 >> >> Log: >> amd64: replace libkern's memset and memmove with assembly variants >> >> memmove is repurposed bcopy (arguments swapped, return value added) >> The libkern variant is a wrapper around bcopy, so this is a big >> improvement. >> >> memset is repurposed memcpy. The librkern variant is doing fishy stuff, >> including branching on 0 and calling bzero. >> >> Both functions are rather crude and subject to partial depessimization. >> >> This is a soft prerequisite to adding variants utilizing the >> 'Enhanced REP MOVSB/STOSB' bit and let the kernel patch at runtime. >> >> Modified: >> head/sys/amd64/amd64/support.S >> head/sys/conf/files.amd64 >> >> Modified: head/sys/amd64/amd64/support.S >> ============================================================================== >> --- head/sys/amd64/amd64/support.S Mon May 7 15:07:26 2018 (r333323) >> +++ head/sys/amd64/amd64/support.S Mon May 7 15:07:28 2018 (r333324) >> @@ -162,6 +162,58 @@ ENTRY(bcopy) >> END(bcopy) >> >> /* >> + * memmove(dst, src, cnt) >> + * rdi, rsi, rdx >> + * Original by: >> + * ws@tools.de (Wolfgang Solfrank, TooLs GmbH) +49-228-985800 >> + */ >> +ENTRY(memmove) >> + PUSH_FRAME_POINTER >> + movq %rdi,%r9 >> + movq %rdx,%rcx >> + >> + movq %rdi,%rax >> + subq %rsi,%rax >> + cmpq %rcx,%rax /* overlapping && src < dst? */ >> + jb 1f >> + >> + shrq $3,%rcx /* copy by 64-bit words */ >> + rep >> + movsq >> + movq %rdx,%rcx >> + andq $7,%rcx /* any bytes left? */ >> + rep >> + movsb >> + movq %r9,%rax >> + POP_FRAME_POINTER >> + ret >> + >> + /* ALIGN_TEXT */ >> +1: >> + addq %rcx,%rdi /* copy backwards */ >> + addq %rcx,%rsi >> + decq %rdi >> + decq %rsi >> + andq $7,%rcx /* any fractional bytes? */ >> + std >> + rep >> + movsb >> + movq %rdx,%rcx /* copy remainder by 32-bit words */ >> + shrq $3,%rcx >> + subq $7,%rsi >> + subq $7,%rdi >> + rep >> + movsq >> + cld >> + movq %r9,%rax >> + POP_FRAME_POINTER >> + ret >> +END(memmove) >> + >> +/* >> + * memcpy(dst, src, len) >> + * rdi, rsi, rdx >> + * >> * Note: memcpy does not support overlapping copies >> */ >> ENTRY(memcpy) >> @@ -178,6 +230,27 @@ ENTRY(memcpy) >> POP_FRAME_POINTER >> ret >> END(memcpy) >> + >> +/* >> + * memset(dst, c, len) >> + * rdi, rsi, rdx >> + */ >> +ENTRY(memset) >> + PUSH_FRAME_POINTER >> + movq %rdi,%r9 >> + movq %rdx,%rcx >> + movq %rsi,%rax >> + shrq $3,%rcx >> + rep >> + stosq > > According to Intel SDM stosq stores the whole RAX into destination, > and then increments the destination register with 8. This > implementation is wrong, since the c is a char, and the The RAX looks > like 000000CC, so the stored patter would be 000000CC * SIZE / 8 * 8 + > CC * SIZE % 8 in destination buffer. Attached the proof. > >> + movq %rdx,%rcx >> + andq $7,%rcx >> + rep >> + stosb >> + movq %r9,%rax >> + POP_FRAME_POINTER >> + ret >> +END(memset) >> >> /* >> * pagecopy(%rdi=from, %rsi=to) >> >> Modified: head/sys/conf/files.amd64 >> ============================================================================== >> --- head/sys/conf/files.amd64 Mon May 7 15:07:26 2018 (r333323) >> +++ head/sys/conf/files.amd64 Mon May 7 15:07:28 2018 (r333324) >> @@ -620,8 +620,6 @@ isa/vga_isa.c optional vga >> kern/kern_clocksource.c standard >> kern/link_elf_obj.c standard >> libkern/x86/crc32_sse42.c standard >> -libkern/memmove.c standard >> -libkern/memset.c standard >> # >> # IA32 binary support >> # >> _______________________________________________ >> svn-src-head@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/svn-src-head >> To unsubscribe, send any mail to "svn-src-head-unsubscribe@freebsd.org" >> >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAPQ4ffvTLV2VTii7vthwKMT7jXyOrzkpGrAN4RYKF9tqaJkv_w>