From owner-dev-commits-src-main@freebsd.org Fri Aug 20 14:03:37 2021
Return-Path:
Delivered-To: dev-commits-src-main@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mailman.nyi.freebsd.org (Postfix) with ESMTP id 618BC6610E5;
	Fri, 20 Aug 2021 14:03:37 +0000 (UTC)
	(envelope-from git@FreeBSD.org)
Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	 client-signature RSA-PSS (4096 bits) client-digest SHA256)
	(Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4GrjzF2C9kz3sTS;
	Fri, 20 Aug 2021 14:03:37 +0000 (UTC)
	(envelope-from git@FreeBSD.org)
Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
	(Client did not present a certificate)
	by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 340E31A615;
	Fri, 20 Aug 2021 14:03:37 +0000 (UTC)
	(envelope-from git@FreeBSD.org)
Received: from gitrepo.freebsd.org ([127.0.1.44])
	by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 17KE3bMs071445;
	Fri, 20 Aug 2021 14:03:37 GMT
	(envelope-from git@gitrepo.freebsd.org)
Received: (from git@localhost)
	by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 17KE3brs071444;
	Fri, 20 Aug 2021 14:03:37 GMT
	(envelope-from git)
Date: Fri, 20 Aug 2021 14:03:37 GMT
Message-Id: <202108201403.17KE3brs071444@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org,
	dev-commits-src-main@FreeBSD.org
From: Alexander Motin
Subject: git: e3c5965c259f - main - mpr(4): Handle mprsas_alloc_tm() errors on device removal.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: mav
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: e3c5965c259f7029afe01612b248c3acf9f5b3e0
Auto-Submitted: auto-generated
X-BeenThere: dev-commits-src-main@freebsd.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id: Commit messages for the main branch of the src repository
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 20 Aug 2021 14:03:37 -0000

The branch main has been updated by mav:

URL: https://cgit.FreeBSD.org/src/commit/?id=e3c5965c259f7029afe01612b248c3acf9f5b3e0

commit e3c5965c259f7029afe01612b248c3acf9f5b3e0
Author:     Alexander Motin
AuthorDate: 2021-08-20 13:46:51 +0000
Commit:     Alexander Motin
CommitDate: 2021-08-20 14:03:32 +0000

    mpr(4): Handle mprsas_alloc_tm() errors on device removal.

    The SAS9305-16e with firmware 16.00.01.00 reports a HighPriorityCredit
    of only 8, while, for comparison, some other combinations I have report
    100 or even 128.  When a large JBOD is detached, the requirement to send
    a target reset command to every target at the same time overflows that
    limit and, without adequate handling, leaves devices stuck in a
    half-detached state, preventing later re-attach.

    To handle that, on allocation error mark the target with the new
    MPRSAS_TARGET_TOREMOVE flag, and retry the removal the next time
    something else frees a high priority command.  With this patch I can
    successfully detach/attach a 102-disk JBOD from/to the SAS9305-16e.

    MFC after:      2 weeks
    Sponsored by:   iXsystems, Inc.
---
 sys/dev/mpr/mpr_sas.c     | 36 ++++++++++++++++++++++++++++++++----
 sys/dev/mpr/mpr_sas.h     |  4 ++--
 sys/dev/mpr/mpr_sas_lsi.c |  1 +
 sys/dev/mpr/mprvar.h      |  5 +++++
 4 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/sys/dev/mpr/mpr_sas.c b/sys/dev/mpr/mpr_sas.c
index f529fdf23d52..e1739028dd8f 100644
--- a/sys/dev/mpr/mpr_sas.c
+++ b/sys/dev/mpr/mpr_sas.c
@@ -412,6 +412,34 @@ mprsas_remove_volume(struct mpr_softc *sc, struct mpr_command *tm)
 	mprsas_free_tm(sc, tm);
 }
 
+/*
+ * Retry mprsas_prepare_remove() if some previous attempt failed to allocate
+ * high priority command due to limit reached.
+ */
+void
+mprsas_prepare_remove_retry(struct mprsas_softc *sassc)
+{
+	struct mprsas_target *target;
+	int i;
+
+	if ((sassc->flags & MPRSAS_TOREMOVE) == 0)
+		return;
+
+	for (i = 0; i < sassc->maxtargets; i++) {
+		target = &sassc->targets[i];
+		if ((target->flags & MPRSAS_TARGET_TOREMOVE) == 0)
+			continue;
+		if (TAILQ_EMPTY(&sassc->sc->high_priority_req_list))
+			return;
+		target->flags &= ~MPRSAS_TARGET_TOREMOVE;
+		if (target->flags & MPR_TARGET_FLAGS_VOLUME)
+			mprsas_prepare_volume_remove(sassc, target->handle);
+		else
+			mprsas_prepare_remove(sassc, target->handle);
+	}
+	sassc->flags &= ~MPRSAS_TOREMOVE;
+}
+
 /*
  * No Need to call "MPI2_SAS_OP_REMOVE_DEVICE" For Volume removal.
  * Otherwise Volume Delete is same as Bare Drive Removal.
@@ -440,8 +468,8 @@ mprsas_prepare_volume_remove(struct mprsas_softc *sassc, uint16_t handle)
 
 	cm = mprsas_alloc_tm(sc);
 	if (cm == NULL) {
-		mpr_dprint(sc, MPR_ERROR,
-		    "%s: command alloc failure\n", __func__);
+		targ->flags |= MPRSAS_TARGET_TOREMOVE;
+		sassc->flags |= MPRSAS_TOREMOVE;
 		return;
 	}
 
@@ -506,8 +534,8 @@ mprsas_prepare_remove(struct mprsas_softc *sassc, uint16_t handle)
 
 	tm = mprsas_alloc_tm(sc);
 	if (tm == NULL) {
-		mpr_dprint(sc, MPR_ERROR, "%s: command alloc failure\n",
-		    __func__);
+		targ->flags |= MPRSAS_TARGET_TOREMOVE;
+		sassc->flags |= MPRSAS_TOREMOVE;
 		return;
 	}
 
diff --git a/sys/dev/mpr/mpr_sas.h b/sys/dev/mpr/mpr_sas.h
index ea427ca8f821..4ec6be15613c 100644
--- a/sys/dev/mpr/mpr_sas.h
+++ b/sys/dev/mpr/mpr_sas.h
@@ -57,8 +57,7 @@ struct mprsas_target {
 #define MPR_TARGET_FLAGS_RAID_COMPONENT	(1 << 4)
 #define MPR_TARGET_FLAGS_VOLUME	(1 << 5)
 #define MPR_TARGET_IS_SATA_SSD	(1 << 6)
-#define MPRSAS_TARGET_INRECOVERY (MPRSAS_TARGET_INABORT | \
-    MPRSAS_TARGET_INRESET | MPRSAS_TARGET_INCHIPRESET)
+#define MPRSAS_TARGET_TOREMOVE	(1 << 7)
 
 	uint16_t	tid;
 	SLIST_HEAD(, mprsas_lun) luns;
@@ -95,6 +94,7 @@ struct mprsas_softc {
 #define MPRSAS_DISCOVERY_TIMEOUT_PENDING	(1 << 2)
 #define MPRSAS_QUEUE_FROZEN	(1 << 3)
 #define MPRSAS_SHUTDOWN	(1 << 4)
+#define MPRSAS_TOREMOVE	(1 << 5)
 	u_int	maxtargets;
 	struct mprsas_target	*targets;
 	struct cam_devq	*devq;
diff --git a/sys/dev/mpr/mpr_sas_lsi.c b/sys/dev/mpr/mpr_sas_lsi.c
index 0800fd0385a7..025395f6eedd 100644
--- a/sys/dev/mpr/mpr_sas_lsi.c
+++ b/sys/dev/mpr/mpr_sas_lsi.c
@@ -1428,6 +1428,7 @@ mprsas_volume_add(struct mpr_softc *sc, u16 handle)
 	targ->tid = id;
 	targ->handle = handle;
 	targ->devname = wwid;
+	targ->flags = MPR_TARGET_FLAGS_VOLUME;
 	TAILQ_INIT(&targ->commands);
 	TAILQ_INIT(&targ->timedout_commands);
 	while (!SLIST_EMPTY(&targ->luns)) {
diff --git a/sys/dev/mpr/mprvar.h b/sys/dev/mpr/mprvar.h
index 524c93861b70..93386f1f58d0 100644
--- a/sys/dev/mpr/mprvar.h
+++ b/sys/dev/mpr/mprvar.h
@@ -668,6 +668,8 @@ mpr_alloc_command(struct mpr_softc *sc)
 	return (cm);
 }
 
+void mprsas_prepare_remove_retry(struct mprsas_softc *sassc);
+
 static __inline void
 mpr_free_high_priority_command(struct mpr_softc *sc, struct mpr_command *cm)
 {
@@ -691,6 +693,9 @@ mpr_free_high_priority_command(struct mpr_softc *sc, struct mpr_command *cm)
 		mpr_free_chain(sc, chain);
 	}
 	TAILQ_INSERT_TAIL(&sc->high_priority_req_list, cm, cm_link);
+
+	if (sc->sassc)
+		mprsas_prepare_remove_retry(sc->sassc);
 }
 
 static __inline struct mpr_command *