From nobody Tue Oct 29 19:28:14 2024
Date: Tue, 29 Oct 2024 19:28:14 GMT
Message-Id: <202410291928.49TJSE29094392@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Ed Maste
Subject: git: 3981cf108773 - stable/14 - bhyve ahci: Improve robustness of TRIM handling
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
X-BeenThere: dev-commits-src-all@freebsd.org
Sender: owner-dev-commits-src-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: emaste
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/14
X-Git-Reftype: branch
X-Git-Commit: 3981cf108773d6b29c8e100bc3b4a105eae681ec
Auto-Submitted: auto-generated

The branch stable/14 has been updated by emaste:

URL: https://cgit.FreeBSD.org/src/commit/?id=3981cf108773d6b29c8e100bc3b4a105eae681ec

commit 3981cf108773d6b29c8e100bc3b4a105eae681ec
Author:     John Baldwin
AuthorDate: 2024-10-24 14:18:09 +0000
Commit:     Ed Maste
CommitDate: 2024-10-29 19:19:51 +0000

    bhyve ahci: Improve robustness of TRIM handling

    The previous fix for a stack buffer leak in the ahci device model
    actually broke the handling of TRIM as one of the checks it added
    caused TRIM commands to never be completed.  This resulted in command
    timeouts if a guest OS did a 'newfs -E' of an AHCI disk, for example.

    Also, for the invalid case the previous check was handling, the device
    model should be failing with an error rather than claiming success.

    To resolve this, validate the length of a TRIM request and fail with
    an error if it exceeds the maximum number of supported blocks
    advertised via IDENTIFY.  In addition, if the PRDT does not provide
    enough data, fail the command with an error rather than performing a
    partial completion.

    This is somewhat complicated by the implementation of TRIM in the ahci
    device model.  A single TRIM request can specify multiple LBA ranges.
    The device model handles this by dispatching blockif_delete() requests
    one at a time.  When a blockif_delete() request completes, the device
    model locates the TRIM buffer and searches for the next LBA range to
    handle.

    Previously, the device model would re-read the trim buffer from guest
    memory each time.  However, this was subject to some unpleasant races
    if the guest changed the PRDT entries or CFIS while a command was in
    flight.
    Instead, read the buffer of trim ranges once and cache it across
    multiple internal blockif requests.

    Reviewed by:    mav
    Fixes:          71fa171c6480 bhyve: Initialize stack buffer in pci_ahci
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D47224

    (cherry picked from commit 8c8ebbb045185396083cd3e4d333fe1851930ee7)
---
 usr.sbin/bhyve/pci_ahci.c | 145 ++++++++++++++++++++++++++++++----------------
 1 file changed, 95 insertions(+), 50 deletions(-)

diff --git a/usr.sbin/bhyve/pci_ahci.c b/usr.sbin/bhyve/pci_ahci.c
index e4c877229425..2c55431c0275 100644
--- a/usr.sbin/bhyve/pci_ahci.c
+++ b/usr.sbin/bhyve/pci_ahci.c
@@ -127,6 +127,7 @@ struct ahci_ioreq {
 	STAILQ_ENTRY(ahci_ioreq) io_flist;
 	TAILQ_ENTRY(ahci_ioreq) io_blist;
 	uint8_t *cfis;
+	uint8_t *dsm;
 	uint32_t len;
 	uint32_t done;
 	int slot;
@@ -214,6 +215,8 @@ struct pci_ahci_softc {
 };
 #define	ahci_ctx(sc)	((sc)->asc_pi->pi_vmctx)
 
+static void ahci_handle_next_trim(struct ahci_port *p, int slot, uint8_t *cfis,
+    uint8_t *buf, uint32_t len, uint32_t done);
 static void ahci_handle_port(struct ahci_port *p);
 
 static inline void lba_to_msf(uint8_t *buf, int lba)
@@ -813,18 +816,14 @@ read_prdt(struct ahci_port *p, int slot, uint8_t *cfis, void *buf,
 }
 
 static void
-ahci_handle_dsm_trim(struct ahci_port *p, int slot, uint8_t *cfis, uint32_t done)
+ahci_handle_dsm_trim(struct ahci_port *p, int slot, uint8_t *cfis)
 {
-	struct ahci_ioreq *aior;
-	struct blockif_req *breq;
-	uint8_t *entry;
-	uint64_t elba;
-	uint32_t len, elen;
-	int err, first, ncq;
-	uint8_t buf[512];
-	unsigned int written;
+	uint32_t len;
+	int ncq;
+	uint8_t *buf;
+	unsigned int nread;
 
-	first = (done == 0);
+	buf = NULL;
 	if (cfis[2] == ATA_DATA_SET_MANAGEMENT) {
 		len = (uint16_t)cfis[13] << 8 | cfis[12];
 		len *= 512;
@@ -834,39 +833,84 @@ ahci_handle_dsm_trim(struct ahci_port *p, int slot, uint8_t *cfis, uint32_t done
 		len *= 512;
 		ncq = 1;
 	}
-	written = read_prdt(p, slot, cfis, buf, sizeof(buf));
-	memset(buf + written, 0, sizeof(buf) - written);
 
-next:
-	if (done >= sizeof(buf) - 8)
-		return;
-	entry = &buf[done];
-	elba = ((uint64_t)entry[5] << 40) |
-		((uint64_t)entry[4] << 32) |
-		((uint64_t)entry[3] << 24) |
-		((uint64_t)entry[2] << 16) |
-		((uint64_t)entry[1] << 8) |
-		entry[0];
-	elen = (uint16_t)entry[7] << 8 | entry[6];
-	done += 8;
-	if (elen == 0) {
-		if (done >= len) {
-			if (ncq) {
-				if (first)
-					ahci_write_fis_d2h_ncq(p, slot);
-				ahci_write_fis_sdb(p, slot, cfis,
-				    ATA_S_READY | ATA_S_DSC);
-			} else {
-				ahci_write_fis_d2h(p, slot, cfis,
-				    ATA_S_READY | ATA_S_DSC);
-			}
+	/* Support for only a single block is advertised via IDENTIFY. */
+	if (len > 512) {
+		goto invalid_command;
+	}
+
+	buf = malloc(len);
+	nread = read_prdt(p, slot, cfis, buf, len);
+	if (nread != len) {
+		goto invalid_command;
+	}
+	ahci_handle_next_trim(p, slot, cfis, buf, len, 0);
+	return;
+
+invalid_command:
+	free(buf);
+	if (ncq) {
+		ahci_write_fis_d2h_ncq(p, slot);
+		ahci_write_fis_sdb(p, slot, cfis,
+		    (ATA_E_ABORT << 8) | ATA_S_READY | ATA_S_ERROR);
+	} else {
+		ahci_write_fis_d2h(p, slot, cfis,
+		    (ATA_E_ABORT << 8) | ATA_S_READY | ATA_S_ERROR);
+	}
+}
+
+static void
+ahci_handle_next_trim(struct ahci_port *p, int slot, uint8_t *cfis,
+    uint8_t *buf, uint32_t len, uint32_t done)
+{
+	struct ahci_ioreq *aior;
+	struct blockif_req *breq;
+	uint8_t *entry;
+	uint64_t elba;
+	uint32_t elen;
+	int err;
+	bool first, ncq;
+
+	first = (done == 0);
+	if (cfis[2] == ATA_DATA_SET_MANAGEMENT) {
+		ncq = false;
+	} else { /* ATA_SEND_FPDMA_QUEUED */
+		ncq = true;
+	}
+
+	/* Find the next range to TRIM. */
+	while (done < len) {
+		entry = &buf[done];
+		elba = ((uint64_t)entry[5] << 40) |
+		    ((uint64_t)entry[4] << 32) |
+		    ((uint64_t)entry[3] << 24) |
+		    ((uint64_t)entry[2] << 16) |
+		    ((uint64_t)entry[1] << 8) |
+		    entry[0];
+		elen = (uint16_t)entry[7] << 8 | entry[6];
+		done += 8;
+		if (elen != 0)
+			break;
+	}
+
+	/* All remaining ranges were empty. */
+	if (done == len) {
+		free(buf);
+		if (ncq) {
+			if (first)
+				ahci_write_fis_d2h_ncq(p, slot);
+			ahci_write_fis_sdb(p, slot, cfis,
+			    ATA_S_READY | ATA_S_DSC);
+		} else {
+			ahci_write_fis_d2h(p, slot, cfis,
+			    ATA_S_READY | ATA_S_DSC);
+		}
+		if (!first) {
 			p->pending &= ~(1 << slot);
 			ahci_check_stopped(p);
-			if (!first)
-				ahci_handle_port(p);
-			return;
+			ahci_handle_port(p);
 		}
-		goto next;
+		return;
 	}
 
 	/*
@@ -879,6 +923,7 @@ next:
 	aior->slot = slot;
 	aior->len = len;
 	aior->done = done;
+	aior->dsm = buf;
 	aior->more = (len != done);
 
 	breq = &aior->io_req;
@@ -1756,7 +1801,7 @@ ahci_handle_cmd(struct ahci_port *p, int slot, uint8_t *cfis)
 	case ATA_DATA_SET_MANAGEMENT:
 		if (cfis[11] == 0 && cfis[3] == ATA_DSM_TRIM &&
 		    cfis[13] == 0 && cfis[12] == 1) {
-			ahci_handle_dsm_trim(p, slot, cfis, 0);
+			ahci_handle_dsm_trim(p, slot, cfis);
 			break;
 		}
 		ahci_write_fis_d2h(p, slot, cfis,
@@ -1766,7 +1811,7 @@ ahci_handle_cmd(struct ahci_port *p, int slot, uint8_t *cfis)
 		if ((cfis[13] & 0x1f) == ATA_SFPDMA_DSM &&
 		    cfis[17] == 0 && cfis[16] == ATA_DSM_TRIM &&
 		    cfis[11] == 0 && cfis[3] == 1) {
-			ahci_handle_dsm_trim(p, slot, cfis, 0);
+			ahci_handle_dsm_trim(p, slot, cfis);
 			break;
 		}
 		ahci_write_fis_d2h(p, slot, cfis,
@@ -1904,12 +1949,12 @@ ata_ioreq_cb(struct blockif_req *br, int err)
 	struct ahci_port *p;
 	struct pci_ahci_softc *sc;
 	uint32_t tfd;
-	uint8_t *cfis;
-	int slot, ncq, dsm;
+	uint8_t *cfis, *dsm;
+	int slot, ncq;
 
 	DPRINTF("%s %d", __func__, err);
 
-	ncq = dsm = 0;
+	ncq = 0;
 	aior = br->br_param;
 	p = aior->io_pr;
 	cfis = aior->cfis;
@@ -1921,10 +1966,8 @@ ata_ioreq_cb(struct blockif_req *br, int err)
 	    cfis[2] == ATA_READ_FPDMA_QUEUED ||
 	    cfis[2] == ATA_SEND_FPDMA_QUEUED)
 		ncq = 1;
-	if (cfis[2] == ATA_DATA_SET_MANAGEMENT ||
-	    (cfis[2] == ATA_SEND_FPDMA_QUEUED &&
-	    (cfis[13] & 0x1f) == ATA_SFPDMA_DSM))
-		dsm = 1;
+	dsm = aior->dsm;
+	aior->dsm = NULL;
 
 	pthread_mutex_lock(&sc->mtx);
 
@@ -1942,8 +1985,9 @@ ata_ioreq_cb(struct blockif_req *br, int err)
 	hdr->prdbc = aior->done;
 
 	if (!err && aior->more) {
-		if (dsm)
-			ahci_handle_dsm_trim(p, slot, cfis, aior->done);
+		if (dsm != NULL)
+			ahci_handle_next_trim(p, slot, cfis, dsm,
+			    aior->len, aior->done);
 		else
 			ahci_handle_rw(p, slot, cfis, aior->done);
 		goto out;
@@ -1965,6 +2009,7 @@ ata_ioreq_cb(struct blockif_req *br, int err)
 
 	ahci_check_stopped(p);
 	ahci_handle_port(p);
+	free(dsm);
 out:
 	pthread_mutex_unlock(&sc->mtx);
 	DPRINTF("%s exit", __func__);
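
For readers following the range scan in ahci_handle_next_trim(), here is a
minimal standalone sketch of the same decoding step. It assumes the entry
layout used in the diff: 8 bytes per range, a 6-byte little-endian LBA
followed by a 2-byte little-endian block count, with a count of zero
meaning "empty entry". The names struct trim_range and next_trim_range()
are hypothetical and are not part of pci_ahci.c; the sketch does not touch
blockif or the FIS handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded TRIM range (illustrative type, not from pci_ahci.c). */
struct trim_range {
	uint64_t lba;		/* 48-bit starting LBA */
	uint16_t nblocks;	/* number of blocks; 0 marks an empty entry */
};

/*
 * Hypothetical helper: starting at byte offset *done in a cached copy of
 * the guest's TRIM buffer, skip empty entries and decode the next
 * non-empty one.  *done is advanced past every entry examined.
 * Returns true and fills *out if a range was found, false if only empty
 * entries (or nothing) remained.
 */
static bool
next_trim_range(const uint8_t *buf, uint32_t len, uint32_t *done,
    struct trim_range *out)
{
	assert(buf != NULL && done != NULL && out != NULL);

	while (*done + 8 <= len) {
		const uint8_t *entry = &buf[*done];
		uint64_t lba = ((uint64_t)entry[5] << 40) |
		    ((uint64_t)entry[4] << 32) |
		    ((uint64_t)entry[3] << 24) |
		    ((uint64_t)entry[2] << 16) |
		    ((uint64_t)entry[1] << 8) |
		    entry[0];
		uint16_t nblocks = (uint16_t)(entry[7] << 8 | entry[6]);

		*done += 8;
		if (nblocks != 0) {
			out->lba = lba;
			out->nblocks = nblocks;
			return (true);
		}
	}
	return (false);
}
```

A caller would invoke next_trim_range() once per completed request with the
same cached buffer and the saved offset, which mirrors how the commit passes
aior->dsm and aior->done back in from ata_ioreq_cb(). Returning a bool here
is a design choice of the sketch: it keeps "a range was found" separate from
the value of the offset.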