From nobody Tue Dec 12 15:32:10 2023 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4SqMzt2s0dz5462s; Tue, 12 Dec 2023 15:32:10 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4SqMzt2fScz4VJm; Tue, 12 Dec 2023 15:32:10 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1702395130; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=5KKH21HSY4NnX8XgPza8ZnFTkRwgJMm9Pq3rRp4Wfic=; b=pjrEQ1otLLdBlfLGUQsTa5mLUwfLm7Q5VgL3kkLjOMQ6X6K7/p/7uqV5N3G1G7aqX+Khjz 5RXS2/urK0lmIoIe3Gx2NEPnkHZ7AmcnUVegquNTXl3ShSdzTZPfW6KSQp/QKsY6NJzCPs rKVG5ap93AuBAhC2UEaTcheuwUKEI/y/VMO3+fhlnBAQzI96TYqmFK6gHcI125+wQxs9S0 CuBNeihWZkWieQKY6rNdRwpQw0BFZYo6TF9g9v/g27lrf1CMYOqRpcd+0aZHxPNEzvQB4s JKlxiMnzbz7/QzU5Wh/RfRJnbzsL8L9oubnsQN9xJpETxHViFQxm5mBuxEn1AA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1702395130; a=rsa-sha256; cv=none; b=UWdOGKb/JcaqlfnfvcKHeD3jv/T1HI+Czk9Z5tCIQFq9Yt0fdGhOfRkqJsleqh8JtLhNu1 1PwvgzzcINrov9o2N+MTC5lA7qfnhQw3PYka1hgZMSVXspLjmiBRX0K2mA47+A6VtZhsi2 ZGM1fahLGPWrNK8/ejOKX8FGU4zxPlk6T8gnf03WnRqMltwIUdfqglHAsX4DuKxbMXmgEp DitJFGeIgGD+Ig1gI0jSM6VSic/uVrxWyDHK0Lip860+FTS0dCpHz8xwkJ//lrW8QEY8rX PVO/TQ3WrnQT8R8Mz+7VbotjS3Xd4lgcXPnFiM8pTwIwF66iirj/uI9sVwVC9w== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1702395130; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=5KKH21HSY4NnX8XgPza8ZnFTkRwgJMm9Pq3rRp4Wfic=; b=vmGhU2kZvzzMwYWr7wWEEg2Z4qsfialvZPGHJBYyU1a72+mXRb+SC9JksKQHrT8vrE9POP ASCtxBOq1gJzmImW0IOAQGftIM0H3MXofjxT7Gol9WGyitZ90q9BQEZ2H6aVOZT1ZPzmH0 IVMI5ERX5Pwf2xD32bZ7c0qzq6dk464Idsdjjt+uQLQ+ysYETCLDcqJgQP3UaOlTDCeSZ7 xp1WhcodTriAN91Y8DhZt1P1p6mOMidpVq93YxANTLjO3IkhigyRoyWu7Pt+LgAF73Xud7 FTqNcly0lMwfID88kIRfdgMkX++QYbzfYkC0bntIseanVglUMVcoteMq0vCoZQ== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4SqMzt1ksGz4wn; Tue, 12 Dec 2023 15:32:10 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.17.1/8.17.1) with ESMTP id 3BCFWAGP016625; Tue, 12 Dec 2023 15:32:10 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.17.1/8.17.1/Submit) id 3BCFWA57016590; Tue, 12 Dec 2023 15:32:10 GMT (envelope-from git) Date: Tue, 12 Dec 2023 15:32:10 GMT Message-Id: <202312121532.3BCFWA57016590@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: "Kenneth D. Merry" Subject: git: 70547544ce93 - main - ocs_fc: IO timeout handling and error reporting fix. List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: ken X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: 70547544ce931357c980be47d937e5b57a2d7f49 Auto-Submitted: auto-generated The branch main has been updated by ken: URL: https://cgit.FreeBSD.org/src/commit/?id=70547544ce931357c980be47d937e5b57a2d7f49 commit 70547544ce931357c980be47d937e5b57a2d7f49 Author: Ram Kishore Vegesna AuthorDate: 2023-12-12 15:22:58 +0000 Commit: Kenneth D. Merry CommitDate: 2023-12-12 15:22:58 +0000 ocs_fc: IO timeout handling and error reporting fix. Hardware timeout uses a 8-bit timeout value and expects the timeout to be less than 255 seconds. Added software timer implemetation to timeout and abort the IOs with timeout more than 255 seconds. Fix the timeout problem by dividing CAM timeouts by 1000 as hardware expects timeout value in seconds. Before this change, CAM timeouts in milliseconds were getting truncated to 8 bits and converted to seconds. So the actual timeout used when going down to the card would depend on the bottom 8 bits of the timeout used. Add the mapping of ocs_fc error status to CAM status. Reported by: ken Reviewed by: ken Tested by: ken, ram Approved by: ken MFC after: 1 week --- sys/dev/ocs_fc/ocs_cam.c | 87 ++++++++++++++++++++++++++++++++++++++++++++-- sys/dev/ocs_fc/ocs_hw.c | 70 +++++++++++++++++++++++-------------- sys/dev/ocs_fc/ocs_hw.h | 10 +++--- sys/dev/ocs_fc/ocs_scsi.c | 6 +++- sys/dev/ocs_fc/ocs_xport.c | 8 +++-- sys/dev/ocs_fc/sli4.h | 2 +- 6 files changed, 144 insertions(+), 39 deletions(-) diff --git a/sys/dev/ocs_fc/ocs_cam.c b/sys/dev/ocs_fc/ocs_cam.c index 4da1b6669047..0fa94083e898 100644 --- a/sys/dev/ocs_fc/ocs_cam.c +++ b/sys/dev/ocs_fc/ocs_cam.c @@ -44,6 +44,7 @@ #include "ocs.h" #include "ocs_scsi.h" #include "ocs_device.h" +#include /* Default IO timeout value for initiators is 30 seconds */ #define OCS_CAM_IO_TIMEOUT 30 @@ -55,6 +56,27 @@ typedef struct { int32_t rc; } ocs_dmamap_load_arg_t; +struct ocs_scsi_status_desc { + ocs_scsi_io_status_e status; + const char *desc; +} ocs_status_desc[] = { + { OCS_SCSI_STATUS_GOOD, "Good" }, + { OCS_SCSI_STATUS_ABORTED, "Aborted" }, + { OCS_SCSI_STATUS_ERROR, "Error" }, + { OCS_SCSI_STATUS_DIF_GUARD_ERROR, "DIF Guard Error" }, + { OCS_SCSI_STATUS_DIF_REF_TAG_ERROR, "DIF REF Tag Error" }, + { OCS_SCSI_STATUS_DIF_APP_TAG_ERROR, "DIF App Tag Error" }, + { OCS_SCSI_STATUS_DIF_UNKNOWN_ERROR, "DIF Unknown Error" }, + { OCS_SCSI_STATUS_PROTOCOL_CRC_ERROR, "Proto CRC Error" }, + { OCS_SCSI_STATUS_NO_IO, "No IO" }, + { OCS_SCSI_STATUS_ABORT_IN_PROGRESS, "Abort in Progress" }, + { OCS_SCSI_STATUS_CHECK_RESPONSE, "Check Response" }, + { OCS_SCSI_STATUS_COMMAND_TIMEOUT, "Command Timeout" }, + { OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED, "Timed out and Aborted" }, + { OCS_SCSI_STATUS_SHUTDOWN, "Shutdown" }, + { OCS_SCSI_STATUS_NEXUS_LOST, "Nexus Lost" } +}; + static void ocs_action(struct cam_sim *, union ccb *); static void ocs_poll(struct cam_sim *); @@ -1497,7 +1519,7 @@ static int32_t ocs_scsi_initiator_io_cb(ocs_io_t *io, * If we've already got a SCSI error, prefer that because it * will have more detail. */ - if ((rsp->residual < 0) && (ccb_status == CAM_REQ_CMP)) { + if ((rsp->residual < 0) && (ccb_status == CAM_REQ_CMP)) { ccb_status = CAM_DATA_RUN_ERR; } @@ -1517,7 +1539,62 @@ static int32_t ocs_scsi_initiator_io_cb(ocs_io_t *io, ocs_memcpy(&csio->sense_data, rsp->sense_data, sense_len); } } else if (scsi_status != OCS_SCSI_STATUS_GOOD) { - ccb_status = CAM_REQ_CMP_ERR; + const char *err_desc = NULL; + char path_str[64]; + char err_str[224]; + struct sbuf sb; + size_t i; + + sbuf_new(&sb, err_str, sizeof(err_str), 0); + + xpt_path_string(ccb->ccb_h.path, path_str, sizeof(path_str)); + sbuf_cat(&sb, path_str); + + for (i = 0; i < (sizeof(ocs_status_desc) / + sizeof(ocs_status_desc[0])); i++) { + if (scsi_status == ocs_status_desc[i].status) { + err_desc = ocs_status_desc[i].desc; + break; + } + } + if (ccb->ccb_h.func_code == XPT_SCSI_IO) { + scsi_command_string(&ccb->csio, &sb); + sbuf_printf(&sb, "length %d ", ccb->csio.dxfer_len); + } + sbuf_printf(&sb, "error status %d (%s)\n", scsi_status, + (err_desc != NULL) ? err_desc : "Unknown"); + sbuf_finish(&sb); + printf("%s", sbuf_data(&sb)); + + switch (scsi_status) { + case OCS_SCSI_STATUS_ABORTED: + case OCS_SCSI_STATUS_ABORT_IN_PROGRESS: + ccb_status = CAM_REQ_ABORTED; + break; + case OCS_SCSI_STATUS_DIF_GUARD_ERROR: + case OCS_SCSI_STATUS_DIF_REF_TAG_ERROR: + case OCS_SCSI_STATUS_DIF_APP_TAG_ERROR: + case OCS_SCSI_STATUS_DIF_UNKNOWN_ERROR: + case OCS_SCSI_STATUS_PROTOCOL_CRC_ERROR: + ccb_status = CAM_IDE; + break; + case OCS_SCSI_STATUS_ERROR: + case OCS_SCSI_STATUS_NO_IO: + ccb_status = CAM_REQ_CMP_ERR; + break; + case OCS_SCSI_STATUS_COMMAND_TIMEOUT: + case OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED: + ccb_status = CAM_CMD_TIMEOUT; + break; + case OCS_SCSI_STATUS_SHUTDOWN: + case OCS_SCSI_STATUS_NEXUS_LOST: + ccb_status = CAM_SCSI_IT_NEXUS_LOST; + break; + default: + ccb_status = CAM_REQ_CMP_ERR; + break; + } + } else { ccb_status = CAM_REQ_CMP; } @@ -1842,7 +1919,11 @@ ocs_initiator_io(struct ocs_softc *ocs, union ccb *ccb) } else if (ccb->ccb_h.timeout == CAM_TIME_DEFAULT) { io->timeout = OCS_CAM_IO_TIMEOUT; } else { - io->timeout = ccb->ccb_h.timeout; + if (ccb->ccb_h.timeout < 1000) + io->timeout = 1; + else { + io->timeout = ccb->ccb_h.timeout / 1000; + } } switch (csio->tag_action) { diff --git a/sys/dev/ocs_fc/ocs_hw.c b/sys/dev/ocs_fc/ocs_hw.c index 4db6634d3d90..cfb9d4f8f536 100644 --- a/sys/dev/ocs_fc/ocs_hw.c +++ b/sys/dev/ocs_fc/ocs_hw.c @@ -148,17 +148,29 @@ static void ocs_hw_check_sec_hio_list(ocs_hw_t *hw); static void target_wqe_timer_cb(void *arg); static void shutdown_target_wqe_timer(ocs_hw_t *hw); +/* WQE timeout for initiator IOs */ +static inline uint8_t +ocs_hw_set_io_wqe_timeout(ocs_hw_io_t *io, uint32_t timeout) +{ + if (timeout > 255) { + io->wqe_timeout = timeout; + return 0; + } else { + return timeout; + } +} + static inline void ocs_hw_add_io_timed_wqe(ocs_hw_t *hw, ocs_hw_io_t *io) { - if (hw->config.emulate_tgt_wqe_timeout && io->tgt_wqe_timeout) { + if (hw->config.emulate_wqe_timeout && io->wqe_timeout) { /* * Active WQE list currently only used for * target WQE timeouts. */ ocs_lock(&hw->io_lock); ocs_list_add_tail(&hw->io_timed_wqe, io); - io->submit_ticks = ocs_get_os_ticks(); + getmicrouptime(&io->submit_time); ocs_unlock(&hw->io_lock); } } @@ -166,7 +178,7 @@ ocs_hw_add_io_timed_wqe(ocs_hw_t *hw, ocs_hw_io_t *io) static inline void ocs_hw_remove_io_timed_wqe(ocs_hw_t *hw, ocs_hw_io_t *io) { - if (hw->config.emulate_tgt_wqe_timeout) { + if (hw->config.emulate_wqe_timeout) { /* * If target wqe timeouts are enabled, * remove from active wqe list. @@ -965,7 +977,7 @@ ocs_hw_init(ocs_hw_t *hw) } /* finally kick off periodic timer to check for timed out target WQEs */ - if (hw->config.emulate_tgt_wqe_timeout) { + if (hw->config.emulate_wqe_timeout) { ocs_setup_timer(hw->os, &hw->wqe_timer, target_wqe_timer_cb, hw, OCS_HW_WQ_TIMER_PERIOD_MS); } @@ -1695,8 +1707,8 @@ ocs_hw_get(ocs_hw_t *hw, ocs_hw_property_e prop, uint32_t *value) case OCS_HW_EMULATE_I_ONLY_AAB: *value = hw->config.i_only_aab; break; - case OCS_HW_EMULATE_TARGET_WQE_TIMEOUT: - *value = hw->config.emulate_tgt_wqe_timeout; + case OCS_HW_EMULATE_WQE_TIMEOUT: + *value = hw->config.emulate_wqe_timeout; break; case OCS_HW_VPD_LEN: *value = sli_get_vpd_len(&hw->sli); @@ -1996,8 +2008,8 @@ ocs_hw_set(ocs_hw_t *hw, ocs_hw_property_e prop, uint32_t value) case OCS_HW_EMULATE_I_ONLY_AAB: hw->config.i_only_aab = value; break; - case OCS_HW_EMULATE_TARGET_WQE_TIMEOUT: - hw->config.emulate_tgt_wqe_timeout = value; + case OCS_HW_EMULATE_WQE_TIMEOUT: + hw->config.emulate_wqe_timeout = value; break; case OCS_HW_BOUNCE: hw->config.bounce = value; @@ -3324,7 +3336,7 @@ ocs_hw_init_free_io(ocs_hw_io_t *io) io->type = 0xFFFF; io->wq = NULL; io->ul_io = NULL; - io->tgt_wqe_timeout = 0; + io->wqe_timeout = 0; } /** @@ -3738,7 +3750,7 @@ ocs_hw_check_sec_hio_list(ocs_hw_t *hw) flags &= ~SLI4_IO_CONTINUATION; } - io->tgt_wqe_timeout = io->sec_iparam.fcp_tgt.timeout; + io->wqe_timeout = io->sec_iparam.fcp_tgt.timeout; /* Complete (continue) TRECV IO */ if (io->xbusy) { @@ -4041,6 +4053,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io, ocs_hw_rtn_e rc = OCS_HW_RTN_SUCCESS; uint32_t rpi; uint8_t send_wqe = TRUE; + uint8_t timeout = 0; CPUTRACE(""); @@ -4075,6 +4088,8 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io, */ switch (type) { case OCS_HW_IO_INITIATOR_READ: + timeout = ocs_hw_set_io_wqe_timeout(io, iparam->fcp_ini.timeout); + /* * If use_dif_quarantine workaround is in effect, and dif_separates then mark the * initiator read IO for quarantine @@ -4090,12 +4105,14 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io, if (sli_fcp_iread64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size, &io->def_sgl, io->first_data_sge, len, io->indicator, io->reqtag, SLI4_CQ_DEFAULT, rpi, rnode, iparam->fcp_ini.dif_oper, iparam->fcp_ini.blk_size, - iparam->fcp_ini.timeout)) { + timeout)) { ocs_log_err(hw->os, "IREAD WQE error\n"); rc = OCS_HW_RTN_ERROR; } break; case OCS_HW_IO_INITIATOR_WRITE: + timeout = ocs_hw_set_io_wqe_timeout(io, iparam->fcp_ini.timeout); + ocs_hw_io_ini_sge(hw, io, iparam->fcp_ini.cmnd, iparam->fcp_ini.cmnd_size, iparam->fcp_ini.rsp); @@ -4104,18 +4121,20 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io, io->indicator, io->reqtag, SLI4_CQ_DEFAULT, rpi, rnode, iparam->fcp_ini.dif_oper, iparam->fcp_ini.blk_size, - iparam->fcp_ini.timeout)) { + timeout)) { ocs_log_err(hw->os, "IWRITE WQE error\n"); rc = OCS_HW_RTN_ERROR; } break; case OCS_HW_IO_INITIATOR_NODATA: + timeout = ocs_hw_set_io_wqe_timeout(io, iparam->fcp_ini.timeout); + ocs_hw_io_ini_sge(hw, io, iparam->fcp_ini.cmnd, iparam->fcp_ini.cmnd_size, iparam->fcp_ini.rsp); if (sli_fcp_icmnd64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size, &io->def_sgl, io->indicator, io->reqtag, SLI4_CQ_DEFAULT, - rpi, rnode, iparam->fcp_ini.timeout)) { + rpi, rnode, timeout)) { ocs_log_err(hw->os, "ICMND WQE error\n"); rc = OCS_HW_RTN_ERROR; } @@ -4137,7 +4156,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io, flags &= ~SLI4_IO_CONTINUATION; } - io->tgt_wqe_timeout = iparam->fcp_tgt.timeout; + io->wqe_timeout = iparam->fcp_tgt.timeout; /* * If use_dif_quarantine workaround is in effect, and this is a DIF enabled IO @@ -4227,7 +4246,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io, flags &= ~SLI4_IO_CONTINUATION; } - io->tgt_wqe_timeout = iparam->fcp_tgt.timeout; + io->wqe_timeout = iparam->fcp_tgt.timeout; if (sli_fcp_tsend64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size, &io->def_sgl, io->first_data_sge, iparam->fcp_tgt.offset, len, io->indicator, io->reqtag, SLI4_CQ_DEFAULT, @@ -4260,7 +4279,7 @@ ocs_hw_io_send(ocs_hw_t *hw, ocs_hw_io_type_e type, ocs_hw_io_t *io, } } - io->tgt_wqe_timeout = iparam->fcp_tgt.timeout; + io->wqe_timeout = iparam->fcp_tgt.timeout; if (sli_fcp_trsp64_wqe(&hw->sli, io->wqe.wqebuf, hw->sli.config.wqe_size, &io->def_sgl, len, @@ -11173,9 +11192,8 @@ target_wqe_timer_nop_cb(ocs_hw_t *hw, int32_t status, uint8_t *mqe, void *arg) { ocs_hw_io_t *io = NULL; ocs_hw_io_t *io_next = NULL; - uint64_t ticks_current = ocs_get_os_ticks(); - uint32_t sec_elapsed; ocs_hw_rtn_e rc; + struct timeval cur_time; sli4_mbox_command_header_t *hdr = (sli4_mbox_command_header_t *)mqe; @@ -11188,27 +11206,28 @@ target_wqe_timer_nop_cb(ocs_hw_t *hw, int32_t status, uint8_t *mqe, void *arg) /* loop through active WQE list and check for timeouts */ ocs_lock(&hw->io_lock); ocs_list_foreach_safe(&hw->io_timed_wqe, io, io_next) { - sec_elapsed = ((ticks_current - io->submit_ticks) / ocs_get_os_tick_freq()); /* * If elapsed time > timeout, abort it. No need to check type since * it wouldn't be on this list unless it was a target WQE */ - if (sec_elapsed > io->tgt_wqe_timeout) { - ocs_log_test(hw->os, "IO timeout xri=0x%x tag=0x%x type=%d\n", - io->indicator, io->reqtag, io->type); + getmicrouptime(&cur_time); + timevalsub(&cur_time, &io->submit_time); + if (cur_time.tv_sec > io->wqe_timeout) { + ocs_log_info(hw->os, "IO timeout xri=0x%x tag=0x%x type=%d elasped time:%u\n", + io->indicator, io->reqtag, io->type, cur_time.tv_sec); /* remove from active_wqe list so won't try to abort again */ ocs_list_remove(&hw->io_timed_wqe, io); /* save status of "timed out" for when abort completes */ io->status_saved = 1; - io->saved_status = SLI4_FC_WCQE_STATUS_TARGET_WQE_TIMEOUT; + io->saved_status = SLI4_FC_WCQE_STATUS_WQE_TIMEOUT; io->saved_ext = 0; io->saved_len = 0; /* now abort outstanding IO */ - rc = ocs_hw_io_abort(hw, io, FALSE, NULL, NULL); + rc = ocs_hw_io_abort(hw, io, TRUE, NULL, NULL); if (rc) { ocs_log_test(hw->os, "abort failed xri=%#x tag=%#x rc=%d\n", @@ -11237,7 +11256,6 @@ target_wqe_timer_cb(void *arg) /* delete existing timer; will kick off new timer after checking wqe timeouts */ hw->in_active_wqe_timer = TRUE; - ocs_del_timer(&hw->wqe_timer); /* Forward timer callback to execute in the mailbox completion processing context */ if (ocs_hw_async_call(hw, target_wqe_timer_nop_cb, hw)) { @@ -11250,7 +11268,7 @@ shutdown_target_wqe_timer(ocs_hw_t *hw) { uint32_t iters = 100; - if (hw->config.emulate_tgt_wqe_timeout) { + if (hw->config.emulate_wqe_timeout) { /* request active wqe timer shutdown, then wait for it to complete */ hw->active_wqe_timer_shutdown = TRUE; diff --git a/sys/dev/ocs_fc/ocs_hw.h b/sys/dev/ocs_fc/ocs_hw.h index d4ee85c3f52a..671aa40871f2 100644 --- a/sys/dev/ocs_fc/ocs_hw.h +++ b/sys/dev/ocs_fc/ocs_hw.h @@ -211,7 +211,7 @@ typedef enum { OCS_HW_WAR_VERSION, OCS_HW_DISABLE_AR_TGT_DIF, OCS_HW_EMULATE_I_ONLY_AAB, /**< emulate IAAB=0 for initiator-commands only */ - OCS_HW_EMULATE_TARGET_WQE_TIMEOUT, /**< enable driver timeouts for target WQEs */ + OCS_HW_EMULATE_WQE_TIMEOUT, /**< enable driver timeouts for WQEs */ OCS_HW_LINK_CONFIG_SPEED, OCS_HW_CONFIG_TOPOLOGY, OCS_HW_BOUNCE, @@ -520,7 +520,7 @@ typedef union ocs_hw_io_param_u { ocs_hw_dif_blk_size_e blk_size; uint32_t cmnd_size; uint16_t flags; - uint8_t timeout; + uint32_t timeout; uint32_t first_burst; } fcp_ini; } ocs_hw_io_param_t; @@ -576,8 +576,8 @@ struct ocs_hw_io_s { void *abort_arg; /**< argument passed to "abort done" callback */ ocs_ref_t ref; /**< refcount object */ size_t length; /**< needed for bug O127585: length of IO */ - uint8_t tgt_wqe_timeout; /**< timeout value for target WQEs */ - uint64_t submit_ticks; /**< timestamp when current WQE was submitted */ + uint32_t wqe_timeout; /**< timeout value for WQEs */ + struct timeval submit_time; /**< timestamp when current WQE was submitted */ uint32_t status_saved:1, /**< if TRUE, latched status should be returned */ abort_in_progress:1, /**< if TRUE, abort is in progress */ @@ -915,7 +915,7 @@ struct ocs_hw_s { uint16_t auto_xfer_rdy_app_tag_value; uint8_t dif_mode; /**< DIF mode to use */ uint8_t i_only_aab; /** Enable initiator-only auto-abort */ - uint8_t emulate_tgt_wqe_timeout; /** Enable driver target wqe timeouts */ + uint8_t emulate_wqe_timeout; /** Enable driver wqe timeouts */ uint32_t bounce:1; const char *queue_topology; /**< Queue topology string */ uint8_t auto_xfer_rdy_t10_enable; /** Enable t10 PI for auto xfer ready */ diff --git a/sys/dev/ocs_fc/ocs_scsi.c b/sys/dev/ocs_fc/ocs_scsi.c index 0e87cc0bed4b..af9fc798b01c 100644 --- a/sys/dev/ocs_fc/ocs_scsi.c +++ b/sys/dev/ocs_fc/ocs_scsi.c @@ -413,7 +413,7 @@ ocs_target_io_cb(ocs_hw_io_t *hio, ocs_remote_node_t *rnode, uint32_t length, } break; - case SLI4_FC_WCQE_STATUS_TARGET_WQE_TIMEOUT: + case SLI4_FC_WCQE_STATUS_WQE_TIMEOUT: /* target IO timed out */ scsi_status = OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED; break; @@ -2209,6 +2209,10 @@ ocs_initiator_io_cb(ocs_hw_io_t *hio, ocs_remote_node_t *rnode, uint32_t length, scsi_status = OCS_SCSI_STATUS_ERROR; } break; + case SLI4_FC_WCQE_STATUS_WQE_TIMEOUT: + /* IO timed out */ + scsi_status = OCS_SCSI_STATUS_TIMEDOUT_AND_ABORTED; + break; case SLI4_FC_WCQE_STATUS_DI_ERROR: if (ext_status & 0x01) { scsi_status = OCS_SCSI_STATUS_DIF_GUARD_ERROR; diff --git a/sys/dev/ocs_fc/ocs_xport.c b/sys/dev/ocs_fc/ocs_xport.c index d6b0d8740906..d997ea245132 100644 --- a/sys/dev/ocs_fc/ocs_xport.c +++ b/sys/dev/ocs_fc/ocs_xport.c @@ -524,9 +524,11 @@ ocs_xport_initialize(ocs_xport_t *xport) } } - if (ocs->target_io_timer_sec) { - ocs_log_debug(ocs, "setting target io timer=%d\n", ocs->target_io_timer_sec); - ocs_hw_set(&ocs->hw, OCS_HW_EMULATE_TARGET_WQE_TIMEOUT, TRUE); + if (ocs->target_io_timer_sec || ocs->enable_ini) { + if (ocs->target_io_timer_sec) + ocs_log_debug(ocs, "setting target io timer=%d\n", ocs->target_io_timer_sec); + + ocs_hw_set(&ocs->hw, OCS_HW_EMULATE_WQE_TIMEOUT, TRUE); } ocs_hw_callback(&ocs->hw, OCS_HW_CB_DOMAIN, ocs_domain_cb, ocs); diff --git a/sys/dev/ocs_fc/sli4.h b/sys/dev/ocs_fc/sli4.h index 4d8686ce8841..e9271df1530f 100644 --- a/sys/dev/ocs_fc/sli4.h +++ b/sys/dev/ocs_fc/sli4.h @@ -5345,7 +5345,7 @@ typedef struct sli4_fc_wqec_s { #define SLI4_FC_WCQE_STATUS_RX_ABORT_REQUEST 0x1b /* driver generated status codes; better not overlap with chip's status codes! */ -#define SLI4_FC_WCQE_STATUS_TARGET_WQE_TIMEOUT 0xff +#define SLI4_FC_WCQE_STATUS_WQE_TIMEOUT 0xff #define SLI4_FC_WCQE_STATUS_SHUTDOWN 0xfe #define SLI4_FC_WCQE_STATUS_DISPATCH_ERROR 0xfd