Date: Thu, 6 Nov 2025 20:39:49 +0100
From: Aurélien Couderc <aurelien.couderc2002@gmail.com>
To: freebsd-hackers@freebsd.org
Subject: Implementing VOP_READPLUS() in FreeBSD 15?
Message-ID: <CA+1jF5rCb8Kx=9pPXtC=dwoCz88waBJeSkADeCwtZOONrKi2Ug@mail.gmail.com>
This is a followup to a discussion with the nfs-ganesha developers. Could
FreeBSD implement a VOP_READPLUS() in FreeBSD 15, please?

Citing Lionel Cons/CERN:

> But the point is to optimise the read(). First, you have less traffic
> over the wire (which is a thing if your reads are in the gigabyte range
> for large VMs), and it tells the VM host that it can just map all those
> MMU pages representing the hole to the "default zero page", which in
> turn saves lots of space in the L3 and L2 caches ----> THIS DOES WONDERS
> for VM performance.
>
> Example:
> The performance benefit here comes from the fact that instead of mapping
> a 1TB hole (1099511627776 bytes) to 524288 individual 2M pages (x86 2M
> hugepage size), and then potentially reading from them, you just have
> ONE 2M page in the cache, and all reads come from that.
>
> READ_PLUS is THE game changer for that kind of application, especially
> in our case (HPC simulations).

I just played with that:
1. Intel Xeon with 512GB of RAM
2. loading 16 sparse files of 64GB each, consisting entirely of holes
3. creating a kernel core dump

Result: almost all pages in the file cache contain only zero bytes.

VOP_READPLUS() would optimize this case: it would map all ranges
belonging to sparse file holes onto the same read-only MMU page, backed
by a physical address range containing zero bytes. Because it is the
same physical memory, it would consume very little L2/L3 cache space,
and it would save space in the filesystem cache too.

To make this concrete, a rough sketch of such a vnode operation and a
small userland demo follow after my signature.

Aurélien

-- 
Aurélien Couderc <aurelien.couderc2002@gmail.com>
Big Data/Data mining expert, chess enthusiast
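P.S. Here is one possible shape for the vnode operation, modeled on the
existing VOP_READ(9) interface. Everything in it (vop_readplus_args,
VOP_READPLUS(), the a_holelen out-parameter) is hypothetical, a sketch
of what the op could look like, not an existing FreeBSD API:

/*
 * HYPOTHETICAL sketch, modeled on VOP_READ(9); nothing below exists
 * in FreeBSD today.
 *
 * Semantics: the caller asks for up to a_uio->uio_resid bytes at
 * a_uio->uio_offset.  If that offset falls inside a hole, nothing is
 * copied and *a_holelen is set to the remaining length of the hole,
 * so a consumer (say, an NFSv4.2 READ_PLUS implementation, or code
 * populating the page cache) can represent the whole range with the
 * shared read-only zero page instead of copying zero bytes around.
 */
struct vop_readplus_args {
	struct vnode	*a_vp;		/* file vnode */
	struct uio	*a_uio;		/* destination for data bytes */
	int		 a_ioflag;
	struct ucred	*a_cred;
	off_t		*a_holelen;	/* out: 0 = data was copied;
					 * >0 = uio_offset is in a hole
					 * of this many bytes */
};

int	VOP_READPLUS(struct vnode *vp, struct uio *uio, int ioflag,
	    struct ucred *cred, off_t *holelen);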
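P.P.S. And a small userland demo reproducing step 2 of the experiment
above: it creates a 64GB file that is entirely a hole, then enumerates
data/hole segments with lseek(2)'s SEEK_DATA/SEEK_HOLE, which FreeBSD
supports on filesystems that track holes (e.g. UFS and ZFS). This is
the primitive any READ_PLUS-style code would build on; the file name
and size are arbitrary choices for the demo.

#include <sys/types.h>

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	const char *path = "sparse.img";
	const off_t size = 64LL * 1024 * 1024 * 1024;	/* 64GB */
	off_t data, hole;
	int fd;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		err(1, "open");
	/* Extending with ftruncate() allocates no blocks: all hole. */
	if (ftruncate(fd, size) == -1)
		err(1, "ftruncate");

	/* Walk the file; each data segment is [data, hole). */
	for (data = lseek(fd, 0, SEEK_DATA); data != -1;
	    data = lseek(fd, hole, SEEK_DATA)) {
		hole = lseek(fd, data, SEEK_HOLE);
		if (hole == -1)
			err(1, "lseek(SEEK_HOLE)");
		printf("data %jd..%jd\n", (intmax_t)data, (intmax_t)hole);
	}
	/* SEEK_DATA fails with ENXIO once no data remains. */
	if (errno != ENXIO)
		err(1, "lseek(SEEK_DATA)");
	printf("everything else up to %jd is hole\n", (intmax_t)size);

	close(fd);
	return (0);
}

On the all-hole file this prints no data segments at all, only the
final line: exactly the information READ_PLUS lets a server put on the
wire instead of shipping 64GB of zero bytes.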
