Date: Wed, 3 Sep 2025 15:55:37 GMT
Message-Id: <202509031555.583FtbZD058239@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org,
        dev-commits-src-main@FreeBSD.org
From: Warner Losh
Subject: git: dc74f3003c2d - main - nvme: Call vm_fault_hold_pages instead of vmapbuf
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: imp
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: dc74f3003c2d1deea654f24b76a1dd932d428ca0
Auto-Submitted: auto-generated

The branch main has been updated by imp:

URL: https://cgit.FreeBSD.org/src/commit/?id=dc74f3003c2d1deea654f24b76a1dd932d428ca0

commit dc74f3003c2d1deea654f24b76a1dd932d428ca0
Author:     Warner Losh
AuthorDate: 2025-09-03 15:06:37 +0000
Commit:     Warner Losh
CommitDate: 2025-09-03 15:55:24 +0000

    nvme: Call vm_fault_hold_pages instead of vmapbuf

    Use the underlying mechanism of vmapbuf instead of using this legacy
    interface. This means we don't have to allocate a buf, and can store the
    page array on the stack as it will be small enough for transfers that
    the vast majority of cards can do. And those that can do larger (> 512k)
    have provisions to split up requests.

    Sponsored by:           Netflix
    Reviewed by:            kib, markj
    Differential Revision:  https://reviews.freebsd.org/D52149
---
 sys/dev/nvme/nvme_ctrlr.c | 98 ++++++++++++++++++++++++++++-------------------
 1 file changed, 59 insertions(+), 39 deletions(-)

diff --git a/sys/dev/nvme/nvme_ctrlr.c b/sys/dev/nvme/nvme_ctrlr.c
index 49960b0f920a..fc912c1342f4 100644
--- a/sys/dev/nvme/nvme_ctrlr.c
+++ b/sys/dev/nvme/nvme_ctrlr.c
@@ -41,6 +41,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 
 #include "nvme_private.h"
 #include "nvme_linux.h"
@@ -1265,6 +1268,34 @@ nvme_ctrlr_shared_handler(void *arg)
 	nvme_mmio_write_4(ctrlr, intmc, 1);
 }
 
+#define NVME_MAX_PAGES (int)(1024 / sizeof(vm_page_t))
+
+static int
+nvme_user_ioctl_req(vm_offset_t addr, size_t len, bool is_read,
+    vm_page_t *upages, int max_pages, int *npagesp, struct nvme_request **req,
+    nvme_cb_fn_t cb_fn, void *cb_arg)
+{
+	vm_prot_t prot = VM_PROT_READ;
+	int err;
+
+	if (is_read)
+		prot |= VM_PROT_WRITE;	/* Device will write to host memory */
+	err = vm_fault_hold_pages(&curproc->p_vmspace->vm_map,
+	    addr, len, prot, upages, max_pages, npagesp);
+	if (err != 0)
+		return (err);
+	*req = nvme_allocate_request_null(M_WAITOK, cb_fn, cb_arg);
+	(*req)->payload = memdesc_vmpages(upages, len, addr & PAGE_MASK);
+	(*req)->payload_valid = true;
+	return (0);
+}
+
+static void
+nvme_user_ioctl_free(vm_page_t *pages, int npage)
+{
+	vm_page_unhold_pages(pages, npage);
+}
+
 static void
 nvme_pt_done(void *arg, const struct nvme_completion *cpl)
 {
@@ -1287,30 +1318,28 @@ nvme_pt_done(void *arg, const struct nvme_completion *cpl)
 
 int
 nvme_ctrlr_passthrough_cmd(struct nvme_controller *ctrlr,
-    struct nvme_pt_command *pt, uint32_t nsid, int is_user_buffer,
+    struct nvme_pt_command *pt, uint32_t nsid, int is_user,
     int is_admin_cmd)
 {
-	struct nvme_request	*req;
-	struct mtx		*mtx;
-	struct buf		*buf = NULL;
-	int			ret = 0;
+	struct nvme_request *req;
+	struct mtx *mtx;
+	int ret = 0;
+	int npages = 0;
+	vm_page_t upages[NVME_MAX_PAGES];
 
 	if (pt->len > 0) {
 		if (pt->len > ctrlr->max_xfer_size) {
-			nvme_printf(ctrlr, "pt->len (%d) "
-			    "exceeds max_xfer_size (%d)\n", pt->len,
-			    ctrlr->max_xfer_size);
-			return EIO;
+			nvme_printf(ctrlr,
+			    "len (%d) exceeds max_xfer_size (%d)\n",
+			    pt->len, ctrlr->max_xfer_size);
+			return (EIO);
 		}
-		if (is_user_buffer) {
-			buf = uma_zalloc(pbuf_zone, M_WAITOK);
-			buf->b_iocmd = pt->is_read ? BIO_READ : BIO_WRITE;
-			if (vmapbuf(buf, pt->buf, pt->len, 1) < 0) {
-				ret = EFAULT;
-				goto err;
-			}
-			req = nvme_allocate_request_vaddr(buf->b_data, pt->len,
-			    M_WAITOK, nvme_pt_done, pt);
+		if (is_user) {
+			ret = nvme_user_ioctl_req((vm_offset_t)pt->buf, pt->len,
+			    pt->is_read, upages, nitems(upages), &npages, &req,
+			    nvme_pt_done, pt);
+			if (ret != 0)
+				return (ret);
 		} else
 			req = nvme_allocate_request_vaddr(pt->buf, pt->len,
 			    M_WAITOK, nvme_pt_done, pt);
@@ -1344,11 +1373,8 @@ nvme_ctrlr_passthrough_cmd(struct nvme_controller *ctrlr,
 		mtx_sleep(pt, mtx, PRIBIO, "nvme_pt", 0);
 	mtx_unlock(mtx);
 
-	if (buf != NULL) {
-		vunmapbuf(buf);
-err:
-		uma_zfree(pbuf_zone, buf);
-	}
+	if (npages > 0)
+		nvme_user_ioctl_free(upages, npages);
 
 	return (ret);
 }
@@ -1374,8 +1400,9 @@ nvme_ctrlr_linux_passthru_cmd(struct nvme_controller *ctrlr,
 {
 	struct nvme_request	*req;
 	struct mtx		*mtx;
-	struct buf		*buf = NULL;
 	int			ret = 0;
+	int			npages = 0;
+	vm_page_t		upages[NVME_MAX_PAGES];
 
 	/*
 	 * We don't support metadata.
@@ -1386,7 +1413,7 @@ nvme_ctrlr_linux_passthru_cmd(struct nvme_controller *ctrlr,
 	if (npc->data_len > 0 && npc->addr != 0) {
 		if (npc->data_len > ctrlr->max_xfer_size) {
 			nvme_printf(ctrlr,
-			    "npc->data_len (%d) exceeds max_xfer_size (%d)\n",
+			    "data_len (%d) exceeds max_xfer_size (%d)\n",
 			    npc->data_len, ctrlr->max_xfer_size);
 			return (EIO);
 		}
@@ -1399,15 +1426,11 @@ nvme_ctrlr_linux_passthru_cmd(struct nvme_controller *ctrlr,
 		if ((npc->opcode & 0x3) == 3)
 			return (EINVAL);
 		if (is_user) {
-			buf = uma_zalloc(pbuf_zone, M_WAITOK);
-			buf->b_iocmd = npc->opcode & 1 ? BIO_WRITE : BIO_READ;
-			if (vmapbuf(buf, (void *)(uintptr_t)npc->addr,
-			    npc->data_len, 1) < 0) {
-				ret = EFAULT;
-				goto err;
-			}
-			req = nvme_allocate_request_vaddr(buf->b_data,
-			    npc->data_len, M_WAITOK, nvme_npc_done, npc);
+			ret = nvme_user_ioctl_req(npc->addr, npc->data_len,
+			    npc->opcode & 0x1, upages, nitems(upages), &npages,
+			    &req, nvme_npc_done, npc);
+			if (ret != 0)
+				return (ret);
 		} else
 			req = nvme_allocate_request_vaddr(
 			    (void *)(uintptr_t)npc->addr, npc->data_len,
@@ -1442,11 +1465,8 @@ nvme_ctrlr_linux_passthru_cmd(struct nvme_controller *ctrlr,
 		mtx_sleep(npc, mtx, PRIBIO, "nvme_npc", 0);
 	mtx_unlock(mtx);
 
-	if (buf != NULL) {
-		vunmapbuf(buf);
-err:
-		uma_zfree(pbuf_zone, buf);
-	}
+	if (npages > 0)
+		nvme_user_ioctl_free(upages, npages);
 
 	return (ret);
 }
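
The "> 512k" figure in the commit message appears to follow from the size of the on-stack page array: NVME_MAX_PAGES is 1024 / sizeof(vm_page_t), which is 128 entries when a pointer is 8 bytes, and 128 pages of 4 KiB each cover 512 KiB. The userland sketch below reproduces only that arithmetic. It is not driver code. The sketch_* names are hypothetical, the 4 KiB page size is an assumption, and the page capacity is passed in as a parameter rather than derived from vm_page_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel uses PAGE_SIZE and PAGE_MASK. */
#define SKETCH_PAGE_SIZE	((uintptr_t)4096)
#define SKETCH_PAGE_MASK	(SKETCH_PAGE_SIZE - 1)

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * A buffer that does not start on a page boundary can touch one
 * page more than len / SKETCH_PAGE_SIZE.
 */
static size_t
sketch_pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return (0);
	/* Guard against address wrap-around in this illustration. */
	assert(addr <= UINTPTR_MAX - (len - 1));
	first = addr & ~SKETCH_PAGE_MASK;
	last = (addr + (len - 1)) & ~SKETCH_PAGE_MASK;
	return ((size_t)((last - first) / SKETCH_PAGE_SIZE) + 1);
}

/*
 * True if a buffer fits in a page array with max_pages entries
 * (hypothetical helper, 128 entries in the 8-byte-pointer case).
 */
static bool
sketch_fits(uintptr_t addr, size_t len, size_t max_pages)
{
	return (sketch_pages_spanned(addr, len) <= max_pages);
}
```

With these definitions, a page-aligned 512 KiB buffer spans exactly 128 pages and fits a 128-entry array, while the same length starting mid-page spans 129 and does not. How the driver reacts when the array is too small is determined by vm_fault_hold_pages and is not described in the commit message.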