Date:      Thu, 9 Feb 2023 19:36:48 +0000
From:      David Chisnall <theraven@FreeBSD.org>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   Re: CFT: snmalloc as libc malloc
Message-ID:  <E140A3A2-5C4A-4458-B365-AD693AB853E8@FreeBSD.org>
In-Reply-To: <CAGudoHFYMLk6EDrSxLiWFNBoYyTKXfHLAUhZC%2BRF4eUE-rip8Q@mail.gmail.com>
References:  <2f3dcda0-5135-290a-2dff-683b2e9fe271@FreeBSD.org> <CAGudoHFYMLk6EDrSxLiWFNBoYyTKXfHLAUhZC%2BRF4eUE-rip8Q@mail.gmail.com>

On 9 Feb 2023, at 19:15, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> it fails to build for me:
>
> /usr/src/lib/libc/stdlib/snmalloc/malloc.cc:35:10: fatal error:
> 'override/jemalloc_compat.cc' file not found
> #include "override/jemalloc_compat.cc"
>         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 1 error generated.
> --- malloc.o ---
> *** [malloc.o] Error code 1
>
> make[4]: stopped in /usr/src/lib/libc
> /usr/src/lib/libc/stdlib/snmalloc/memcpy.cc:25:10: fatal error:
> 'global/memcpy.h' file not found
> #include <global/memcpy.h>
>         ^~~~~~~~~~~~~~~~~
> 1 error generated.
> --- memcpy.o ---
> *** [memcpy.o] Error code 1

This looks as if you haven’t got the submodule?  Is there anything in
contrib/snmalloc?

> this is a fresh world, top of snmalloc2 branch:
> commit a5c83c69817d03943b8be982dd815c7e263d1a83
> Author: David Chisnall <theraven@FreeBSD.org>
> Date:   Fri Jan 21 15:13:09 2022 +0000
>
>    Initial commit of snmalloc2 in libc.
>
> anyway, I wanted to say I find the memcpy thing incredibly suspicious.
> I found one article in
> https://github.com/microsoft/snmalloc/blob/main/docs/security/GuardedMemcpy.md
> which benches it and that made it even more suspicious. How did the
> benched memcpy look like inside?

Perhaps you could share what you are suspicious about?  I don’t really
know how to respond to something so vague.  The document you linked to
has the benchmark that we used (though the graphs in it appear to be
based on an older version of the memcpy).  The PR that added PowerPC
tuning has some additional graphs of measurements.

If you compile the memcpy file, you can see the assembly.  The C++
provides a set of building blocks for producing efficient memcpy
implementations.  The fastest on x86 is roughly:

 - A jump table for small sizes, where each entry performs the copy as
   a sequence of power-of-two-sized copies (double-word, word,
   half-word, and byte).
 - A vectorised copy for medium-sized copies using a loop of SSE copies.
 - rep movsb for large copies.

The compiler does some quite complex layout for the jump table.
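
As a rough illustration of the three-tier shape described above, here is
a minimal sketch.  It is not the snmalloc source: the names
(sketch_memcpy, copy_block, and so on), the 16-byte block size, and the
512-byte threshold for switching to rep movsb are all assumptions picked
for the example, and the rep movsb path is only compiled on x86-64 with
GCC or Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy exactly Size bytes. With a compile-time size, compilers typically
// lower this std::memcpy to a few fixed-width loads and stores.
template <std::size_t Size>
inline void copy_block(unsigned char* dst, const unsigned char* src) {
    std::memcpy(dst, src, Size);
}

// Hypothetical threshold for the sketch; a real one would be tuned per CPU.
constexpr std::size_t kLargeThreshold = 512;

// Small sizes: a dense switch, which compilers usually turn into a jump
// table whose targets are short fixed-size copy sequences.
#define SMALL_CASE(n) case n: copy_block<n>(dst, src); break;
inline void copy_small(unsigned char* dst, const unsigned char* src,
                       std::size_t len) {
    assert(len < 16);
    switch (len) {
        SMALL_CASE(1)  SMALL_CASE(2)  SMALL_CASE(3)  SMALL_CASE(4)
        SMALL_CASE(5)  SMALL_CASE(6)  SMALL_CASE(7)  SMALL_CASE(8)
        SMALL_CASE(9)  SMALL_CASE(10) SMALL_CASE(11) SMALL_CASE(12)
        SMALL_CASE(13) SMALL_CASE(14) SMALL_CASE(15)
        default: break;  // len == 0: nothing to copy
    }
}
#undef SMALL_CASE

// Medium sizes: a loop of 16-byte blocks (one SSE register each on x86),
// then one final block that overlaps the previous one to cover the tail.
inline void copy_medium(unsigned char* dst, const unsigned char* src,
                        std::size_t len) {
    assert(len >= 16);
    std::size_t i = 0;
    for (; i + 16 <= len; i += 16) {
        copy_block<16>(dst + i, src + i);
    }
    if (i < len) {
        copy_block<16>(dst + len - 16, src + len - 16);
    }
}

// Large sizes: rep movsb where available, otherwise fall back to the
// platform memcpy so the sketch still builds elsewhere.
inline void copy_large(unsigned char* dst, const unsigned char* src,
                       std::size_t len) {
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(len)
                     :
                     : "memory");
#else
    std::memcpy(dst, src, len);
#endif
}

// Dispatch on size. Like memcpy, the buffers must not overlap.
void* sketch_memcpy(void* dst, const void* src, std::size_t len) {
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    if (len < 16) {
        copy_small(d, s, len);
    } else if (len < kLargeThreshold) {
        copy_medium(d, s, len);
    } else {
        copy_large(d, s, len);
    }
    return dst;
}
```

Compiling something like this with optimisation and reading the
generated assembly is a quick way to see how the switch is laid out.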

David



