List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
From: Aurélien Couderc
Date: Thu, 6 Nov 2025 20:39:49 +0100
Subject: Implementing VOP_READPLUS() in FreeBSD 15?
To: freebsd-hackers@freebsd.org

This is a followup to a discussion with the nfs-ganesha developers.

Could FreeBSD implement a VOP_READPLUS() in FreeBSD 15, please?

Citing Lionel Cons/CERN:
> But the point is to optimise the read(). First, you have less traffic over the wire (which is a
> thing if your reads are in the gigabyte range for large VMs), and it tells the VM host that it
> can just map all those MMU pages representing the hole to the "default zero page", which
> in turn saves lots of space in the L3 and L2 caches ----> THIS DOES WONDERS to VM
> performance.
>
> Example:
> The performance benefit here comes from the fact that instead of mapping a 1TB hole
> (1099511627776 bytes) to individual 524288 2M pages (x86 2M hugepage size), and then
> potentially reading from them, you just have ONE 2M page in the cache, and all reads come
> from that.
>
> READ_PLUS is THE game changer for that kind of application, especially in our case (HPC
> simulations).

I just played with that:
1. Intel Xeon system with 512GB of RAM
2. load 16 sparse files of 64GB each that consist only of holes
3. create a kernel core dump

Result: almost all pages in the file cache contain only zero bytes.

VOP_READPLUS() would optimize this case by mapping all ranges that belong
to sparse-file holes onto the same read-only MMU page, backed by a single
physical page containing only zero bytes. Because it is the same physical
memory, it would consume very little L2/L3 cache space and save space in
the filesystem cache too.

Aurélien
-- 
Aurélien Couderc
Big Data/Data mining expert, chess enthusiast
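
To make the arithmetic in the quoted example concrete, here is a minimal sketch in Python. It only models the bookkeeping: it splits a file into data and hole segments and counts how many distinct physical pages would be needed if all hole pages shared one zero page. The names `pages_needed`, `split_segments` and `distinct_pages` are hypothetical and exist only for this illustration. They are not a FreeBSD interface, and nothing here reflects how a VOP_READPLUS() would actually be implemented.

```python
HUGEPAGE = 2 * 1024 * 1024   # x86 2M hugepage size, as in the quoted example
ONE_TB_HOLE = 1099511627776  # the 1TB hole from the quoted example


def pages_needed(length, page_size=HUGEPAGE):
    """Number of pages needed to cover `length` bytes, rounded up."""
    return (length + page_size - 1) // page_size


def split_segments(file_size, data_extents):
    """Turn sorted, non-overlapping (offset, length) data extents into a
    list of ("data" | "hole", offset, length) segments covering the file."""
    segments = []
    pos = 0
    for offset, length in data_extents:
        if offset > pos:
            # Gap before this extent: nothing is allocated here, so it is a hole.
            segments.append(("hole", pos, offset - pos))
        segments.append(("data", offset, length))
        pos = offset + length
    if pos < file_size:
        # Trailing hole up to end of file.
        segments.append(("hole", pos, file_size - pos))
    return segments


def distinct_pages(segments, page_size=HUGEPAGE):
    """Assumed accounting model: each data page needs its own physical page,
    while all hole pages are mapped to one shared zero page."""
    data_pages = sum(pages_needed(n, page_size)
                     for kind, _, n in segments if kind == "data")
    has_hole = any(kind == "hole" for kind, _, _ in segments)
    return data_pages + (1 if has_hole else 0)
```

For a file that is one 1TB hole, `pages_needed(ONE_TB_HOLE)` gives the 524288 individual 2M pages from the quote, while `distinct_pages(split_segments(ONE_TB_HOLE, []))` gives 1, the single shared zero page.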