Date: Thu, 9 Feb 2023 19:36:48 +0000
From: David Chisnall <theraven@FreeBSD.org>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: freebsd-hackers <freebsd-hackers@freebsd.org>
Subject: Re: CFT: snmalloc as libc malloc
Message-ID: <E140A3A2-5C4A-4458-B365-AD693AB853E8@FreeBSD.org>
In-Reply-To: <CAGudoHFYMLk6EDrSxLiWFNBoYyTKXfHLAUhZC%2BRF4eUE-rip8Q@mail.gmail.com>
References: <2f3dcda0-5135-290a-2dff-683b2e9fe271@FreeBSD.org> <CAGudoHFYMLk6EDrSxLiWFNBoYyTKXfHLAUhZC%2BRF4eUE-rip8Q@mail.gmail.com>

On 9 Feb 2023, at 19:15, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> it fails to build for me:
>
> /usr/src/lib/libc/stdlib/snmalloc/malloc.cc:35:10: fatal error:
> 'override/jemalloc_compat.cc' file not found
> #include "override/jemalloc_compat.cc"
>          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 1 error generated.
> --- malloc.o ---
> *** [malloc.o] Error code 1
>
> make[4]: stopped in /usr/src/lib/libc
> /usr/src/lib/libc/stdlib/snmalloc/memcpy.cc:25:10: fatal error:
> 'global/memcpy.h' file not found
> #include <global/memcpy.h>
>          ^~~~~~~~~~~~~~~~~
> 1 error generated.
> --- memcpy.o ---
> *** [memcpy.o] Error code 1

This looks as if you haven’t got the submodule? Is there anything in
contrib/snmalloc?

> this is a fresh world, top of snmalloc2 branch:
> commit a5c83c69817d03943b8be982dd815c7e263d1a83
> Author: David Chisnall <theraven@FreeBSD.org>
> Date:   Fri Jan 21 15:13:09 2022 +0000
>
>     Initial commit of snmalloc2 in libc.
>
> anyway, I wanted to say I find the memcpy thing incredibly suspicious.
> I found one article in
> https://github.com/microsoft/snmalloc/blob/main/docs/security/GuardedMemcpy.md
> which benches it and that made it even more suspicious. How did the
> benched memcpy look like inside?

Perhaps you could share what you are suspicious about? I don’t really
know how to respond to something so vague. The document you linked to
has the benchmark that we used (though the graphs in it appear to be
based on an older version of the memcpy). The PR that added PowerPC
tuning has some additional graphs of measurements.

If you compile the memcpy file, you can see the assembly. The C++
provides a set of building blocks for producing efficient memcpy
implementations. The fastest on x86 is roughly:

- A jump table for small sizes that uses power-of-two-sized small
  copies (double-word, word, half-word, and byte) to perform the copy.
- A vectorised copy for medium-sized copies using a loop of SSE copies.
- rep movsb for large copies.

The compiler does some quite complex layout for the jump table.

David
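
As a rough illustration of the three tiers listed above, here is a
minimal sketch. It is not the snmalloc implementation: the names
(sketch_memcpy, copy_pow2, kSmallMax, kLargeMin) and both size
thresholds are hypothetical, std::memcpy with a constant size stands in
for the fixed-size building blocks, the jump table is written as an
explicit array of function pointers rather than whatever layout the
compiler would choose, and the rep movsb path assumes GCC or Clang on
x86-64 (other targets fall back to std::memcpy). It needs C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical thresholds, chosen only for illustration.
constexpr std::size_t kSmallMax = 16;  // sizes below this use the table
constexpr std::size_t kLargeMin = 512; // sizes from here up use rep movsb

// Copy exactly N bytes. With a constant N, compilers typically lower
// this to plain loads and stores (an SSE move for N == 16 on x86-64).
template<std::size_t N>
inline void copy_fixed(char* d, const char* s)
{
  std::memcpy(d, s, N);
}

// Copy N bytes as a sequence of power-of-two pieces: 8, 4, 2, 1.
template<std::size_t N>
inline void copy_pow2(char* d, const char* s)
{
  (void)d;
  (void)s;
  if constexpr (N >= 8) { copy_fixed<8>(d, s); copy_pow2<N - 8>(d + 8, s + 8); }
  else if constexpr (N >= 4) { copy_fixed<4>(d, s); copy_pow2<N - 4>(d + 4, s + 4); }
  else if constexpr (N >= 2) { copy_fixed<2>(d, s); copy_pow2<N - 2>(d + 2, s + 2); }
  else if constexpr (N == 1) { copy_fixed<1>(d, s); }
}

using CopyFn = void (*)(char*, const char*);

template<std::size_t... I>
constexpr std::array<CopyFn, sizeof...(I)> make_table(std::index_sequence<I...>)
{
  return {{&copy_pow2<I>...}};
}

// One entry per size in [0, kSmallMax): the explicit jump table.
constexpr auto kSmallTable = make_table(std::make_index_sequence<kSmallMax>{});

inline void copy_small(char* d, const char* s, std::size_t len)
{
  assert(len < kSmallTable.size());
  kSmallTable[len](d, s);
}

// Medium sizes: a loop of 16-byte copies, then the table for the tail.
inline void copy_medium(char* d, const char* s, std::size_t len)
{
  std::size_t i = 0;
  for (; i + 16 <= len; i += 16)
    copy_fixed<16>(d + i, s + i);
  copy_small(d + i, s + i, len - i);
}

// Large sizes: rep movsb where available (assumes the direction flag is
// clear, as the ABI requires at function boundaries).
inline void copy_large(char* d, const char* s, std::size_t len)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
  __asm__ volatile("rep movsb" : "+D"(d), "+S"(s), "+c"(len) : : "memory");
#else
  std::memcpy(d, s, len);
#endif
}

void* sketch_memcpy(void* dst, const void* src, std::size_t len)
{
  char* d = static_cast<char*>(dst);
  const char* s = static_cast<const char*>(src);
  if (len < kSmallMax)
    copy_small(d, s, len);
  else if (len < kLargeMin)
    copy_medium(d, s, len);
  else
    copy_large(d, s, len);
  return dst;
}
```

Calling sketch_memcpy(dst, src, len) on non-overlapping buffers should
behave like memcpy. Compiling with optimisation and reading the
generated assembly shows how the compiler lowers each tier.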