From nobody Sun May  1 17:10:56 2022
Date: Sun, 1 May 2022 17:10:56 GMT
Message-Id: <202205011710.241HAuaL042980@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Warner Losh
Subject: git: 6c8ab086fed3 - main - ada: Retry commands with retries left on CAM_SEL_TIMEOUT
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
X-Git-Committer: imp
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 6c8ab086fed37a6b44fa84377e48c499f223ae80

The branch main has been updated by imp:

URL: https://cgit.FreeBSD.org/src/commit/?id=6c8ab086fed37a6b44fa84377e48c499f223ae80

commit 6c8ab086fed37a6b44fa84377e48c499f223ae80
Author:     Warner Losh
AuthorDate: 2022-05-01 16:39:04 +0000
Commit:     Warner Losh
CommitDate: 2022-05-01 17:08:56 +0000

    ada: Retry commands with retries left on CAM_SEL_TIMEOUT

    The AHCI and ATA SIMs will return CAM_SEL_TIMEOUT when an underlying
    device has stopped responding. This is usually seen after a timed-out
    command and can be a transient event. Rather than fail the peripheral
    immediately after seeing this, queue a retry. For transient events, this
    allows drives to continue to provide data, though with some added
    latency, just like we do when we have some other kind of retriable
    error. If the error isn't transient (the drive is truly gone), then
    we'll discover that eventually, fail the transaction, and invalidate
    the drive like we do today.

    This helps us avoid a panic at the end of camperiphfree when
    CAM_PERIPH_NEW_DEV_FOUND is set. However, the deferred callback should
    be queued to xpt_async_td instead of being made inline there. That
    issue will be solved in a separate patch. PR 263703.

    This also helps us avoid another bug where we can drop all references to
    the device (causing us to go through camperiphfree and destroy the path)
    while we have an I/O pending in the ata_da state machine (usually in
    state ADA_STATE_RAHEAD with an ATA_SETFEATURES ATA_SF_ENAB_RCACHE
    command). It's not clear why the reference that we take out to do the
    reprobe isn't effective at blocking this.
    By retrying this condition, though, we avoid this bug (at least more
    often; I don't have a good reproduction test case, I just see this panic
    a few times a month at work on systems that have transient disk errors
    on AHCI-connected SATA SSDs). PR 263704. It's too soon to know how much
    this helps us avoid this bug.

    Sponsored by:           Netflix
    Differential Revision:  https://reviews.freebsd.org/D34977
---
 sys/cam/ata/ata_da.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sys/cam/ata/ata_da.c b/sys/cam/ata/ata_da.c
index b82671315138..b76058c8f19d 100644
--- a/sys/cam/ata/ata_da.c
+++ b/sys/cam/ata/ata_da.c
@@ -2872,7 +2872,7 @@ adadone(struct cam_periph *periph, union ccb *done_ccb)
 		cam_periph_lock(periph);
 		bp = (struct bio *)done_ccb->ccb_h.ccb_bp;
 		if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) {
-			error = adaerror(done_ccb, 0, 0);
+			error = adaerror(done_ccb, CAM_RETRY_SELTO, 0);
 			if (error == ERESTART) {
 				/* A retry was scheduled, so just return. */
 				cam_periph_unlock(periph);
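
For readers unfamiliar with the flag, the retry-or-fail decision described in the commit message can be sketched roughly as below. This is a standalone illustration and not the kernel's error handler: every `sk_`/`SK_` name is hypothetical, the enum values stand in for CAM status codes and errno values without matching them, and the real path also involves delays and peripheral invalidation that are left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CAM status codes; not the kernel's values. */
enum sk_status { SK_REQ_CMP, SK_SEL_TIMEOUT, SK_OTHER_ERROR };

/* Hypothetical stand-in for the CAM_RETRY_SELTO flag bit. */
#define SK_RETRY_SELTO 0x1u

/* Hypothetical results: success, "retry queued", "device gone", other error. */
enum sk_result { SK_OK, SK_RESTART, SK_GONE, SK_IOERR };

/* Minimal model of a request: its completion status and retries left. */
struct sk_ccb {
	enum sk_status status;
	unsigned retry_count;
};

/*
 * Decide what to do with a completed request.  A selection timeout is
 * retried only when the caller opted in via SK_RETRY_SELTO and the
 * request still has retries left; otherwise the device is treated as gone.
 */
enum sk_result
sk_error(struct sk_ccb *ccb, unsigned camflags)
{
	assert(ccb != NULL);

	switch (ccb->status) {
	case SK_REQ_CMP:
		return SK_OK;
	case SK_SEL_TIMEOUT:
		if ((camflags & SK_RETRY_SELTO) != 0 && ccb->retry_count > 0) {
			ccb->retry_count--;	/* consume one retry */
			return SK_RESTART;	/* caller re-queues the request */
		}
		return SK_GONE;			/* no opt-in or retries exhausted */
	default:
		return SK_IOERR;
	}
}
```

In this model, passing `0` for the flags (the behaviour before the change) fails a selection timeout on the first occurrence, while passing `SK_RETRY_SELTO` lets a transient timeout be retried until the request's retry budget runs out.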