From owner-dev-commits-src-branches@freebsd.org Mon May 24 14:43:54 2021 Return-Path: Delivered-To: dev-commits-src-branches@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 2F018636407; Mon, 24 May 2021 14:43:54 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Fpg2K60FNz4dtZ; Mon, 24 May 2021 14:43:53 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id AC0DC40BE; Mon, 24 May 2021 14:43:53 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 14OEhrC6045374; Mon, 24 May 2021 14:43:53 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 14OEhr0l045373; Mon, 24 May 2021 14:43:53 GMT (envelope-from git) Date: Mon, 24 May 2021 14:43:53 GMT Message-Id: <202105241443.14OEhr0l045373@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Alexander Motin Subject: git: 22d8055aa216 - stable/13 - mpr/mps(4): Make device mapping some more robust. MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: mav X-Git-Repository: src X-Git-Refname: refs/heads/stable/13 X-Git-Reftype: branch X-Git-Commit: 22d8055aa216b2e39ec48ddaeab31d542fb96bee Auto-Submitted: auto-generated X-BeenThere: dev-commits-src-branches@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Commits to the stable branches of the FreeBSD src repository List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 May 2021 14:43:54 -0000 The branch stable/13 has been updated by mav: URL: https://cgit.FreeBSD.org/src/commit/?id=22d8055aa216b2e39ec48ddaeab31d542fb96bee commit 22d8055aa216b2e39ec48ddaeab31d542fb96bee Author: Alexander Motin AuthorDate: 2021-04-24 03:18:01 +0000 Commit: Alexander Motin CommitDate: 2021-05-24 14:43:39 +0000 mpr/mps(4): Make device mapping some more robust. Allow new enclosure to replace previously existing one if there is no completely unused table entry, same as it is done for devices. If we can not process DPM due to corruption -- wipe it and restart from scratch. Otherwise I don't see a way to recover persistence if something go wrong and there is no BIOS to recover it for us. Together this solves a problem that appeared when 9300-8i firmware update to 16.00.10.00 somehow switched its mapping mode from Device Persistence to Enclosure/Slot without wiping the DPM table. It made HBA completely unusable, since overflowed and conflicting mapping table was unable to map any of enclosures and so devices. Also while there make some enclosure mapping errors more informative. MFC after: 1 month Sponsored by: iXsystems, Inc. (cherry picked from commit b99419aee49e2cc53747730be4d0ec4f9b330eb2) --- sys/dev/mpr/mpr_mapping.c | 71 ++++++++++++++++++++++++++++++----------------- sys/dev/mps/mps_mapping.c | 39 +++++++++++++++++--------- 2 files changed, 71 insertions(+), 39 deletions(-) diff --git a/sys/dev/mpr/mpr_mapping.c b/sys/dev/mpr/mpr_mapping.c index ef97080fb175..6c191672c24e 100644 --- a/sys/dev/mpr/mpr_mapping.c +++ b/sys/dev/mpr/mpr_mapping.c @@ -1213,9 +1213,10 @@ _mapping_get_dev_info(struct mpr_softc *sc, phy_change->is_processed = 1; mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: " "failed to add the device with handle " - "0x%04x because the enclosure is not in " - "the mapping table\n", __func__, - phy_change->dev_handle); + "0x%04x because enclosure handle 0x%04x " + "is not in the mapping table\n", __func__, + phy_change->dev_handle, + topo_change->enc_handle); continue; } if (!((phy_change->device_info & @@ -1368,9 +1369,10 @@ _mapping_get_pcie_dev_info(struct mpr_softc *sc, port_change->is_processed = 1; mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: " "failed to add the device with handle " - "0x%04x because the enclosure is not in " - "the mapping table\n", __func__, - port_change->dev_handle); + "0x%04x because enclosure handle 0x%04x " + "is not in the mapping table\n", __func__, + port_change->dev_handle, + topo_change->enc_handle); continue; } if (!(port_change->device_info & @@ -1609,9 +1611,10 @@ _mapping_add_new_device(struct mpr_softc *sc, phy_change->is_processed = 1; mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: " "failed to add the device with handle " - "0x%04x because the enclosure is not in " - "the mapping table\n", __func__, - phy_change->dev_handle); + "0x%04x because enclosure handle 0x%04x " + "is not in the mapping table\n", __func__, + phy_change->dev_handle, + topo_change->enc_handle); continue; } @@ -1865,9 +1868,10 @@ _mapping_add_new_pcie_device(struct mpr_softc *sc, port_change->is_processed = 1; mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: " "failed to add the device with handle " - "0x%04x because the enclosure is not in " - "the mapping table\n", __func__, - port_change->dev_handle); + "0x%04x because enclosure handle 0x%04x " + "is not in the mapping table\n", __func__, + port_change->dev_handle, + topo_change->enc_handle); continue; } @@ -2205,7 +2209,7 @@ mpr_mapping_free_memory(struct mpr_softc *sc) free(sc->dpm_pg0, M_MPR); } -static bool +static void _mapping_process_dpm_pg0(struct mpr_softc *sc) { u8 missing_cnt, enc_idx; @@ -2331,8 +2335,8 @@ _mapping_process_dpm_pg0(struct mpr_softc *sc) MPR_DPM_BAD_IDX) { mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: Conflict in mapping table for " - " enclosure %d\n", __func__, - enc_idx); + "enclosure %d device %d\n", + __func__, enc_idx, map_idx); goto fail; } physical_id = @@ -2371,20 +2375,26 @@ _mapping_process_dpm_pg0(struct mpr_softc *sc) mt_entry->device_info = MPR_DEV_RESERVED; } } /*close the loop for DPM table */ - return (true); + return; fail: + mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: " + "Wiping DPM to start from scratch\n", __func__); + dpm_entry = (Mpi2DriverMap0Entry_t *) ((uint8_t *) sc->dpm_pg0 + + sizeof(MPI2_CONFIG_EXTENDED_PAGE_HEADER)); + bzero(dpm_entry, sizeof(Mpi2DriverMap0Entry_t) * sc->max_dpm_entries); for (entry_num = 0; entry_num < sc->max_dpm_entries; entry_num++) { + sc->dpm_flush_entry[entry_num] = 1; sc->dpm_entry_used[entry_num] = 0; /* * for IR firmware, it may be necessary to wipe out * sc->mapping_table volumes tooi */ } + _mapping_flush_dpm_pages(sc); for (enc_idx = 0; enc_idx < sc->num_enc_table_entries; enc_idx++) _mapping_clear_enc_entry(sc->enclosure_table + enc_idx); sc->num_enc_table_entries = 0; - return (false); } /* @@ -2624,10 +2634,8 @@ retry_read_dpm: } } - if (sc->is_dpm_enable) { - if (!_mapping_process_dpm_pg0(sc)) - sc->is_dpm_enable = 0; - } + if (sc->is_dpm_enable) + _mapping_process_dpm_pg0(sc); if (! sc->is_dpm_enable) { mpr_dprint(sc, MPR_MAPPING, "%s: DPM processing is disabled. " "Device mappings will not persist across reboots or " @@ -2823,14 +2831,24 @@ mpr_mapping_enclosure_dev_status_change_event(struct mpr_softc *sc, * and the enclosure has enough space in the Mapping * Table to map its devices. */ - enc_idx = sc->num_enc_table_entries; - if (enc_idx >= sc->max_enclosures) { + if (sc->num_enc_table_entries < sc->max_enclosures) { + enc_idx = sc->num_enc_table_entries; + sc->num_enc_table_entries++; + } else { + enc_idx = _mapping_get_high_missing_et_idx(sc); + if (enc_idx != MPR_ENCTABLE_BAD_IDX) { + et_entry = &sc->enclosure_table[enc_idx]; + _mapping_add_to_removal_table(sc, + et_entry->dpm_entry_num); + _mapping_clear_enc_entry(et_entry); + } + } + if (enc_idx == MPR_ENCTABLE_BAD_IDX) { mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: " "Enclosure cannot be added to mapping " "table because it's full.\n", __func__); goto out; } - sc->num_enc_table_entries++; et_entry = &sc->enclosure_table[enc_idx]; et_entry->enc_handle = le16toh(event_data-> EnclosureHandle); @@ -2857,8 +2875,9 @@ mpr_mapping_enclosure_dev_status_change_event(struct mpr_softc *sc, le16toh(event_data->EnclosureHandle)); if (enc_idx == MPR_ENCTABLE_BAD_IDX) { mpr_dprint(sc, MPR_ERROR | MPR_MAPPING, "%s: Cannot " - "unmap enclosure %d because it has already been " - "deleted.\n", __func__, enc_idx); + "unmap enclosure with handle 0x%04x because it " + "has already been deleted.\n", __func__, + le16toh(event_data->EnclosureHandle)); goto out; } et_entry = &sc->enclosure_table[enc_idx]; diff --git a/sys/dev/mps/mps_mapping.c b/sys/dev/mps/mps_mapping.c index 71cdb399c271..9e1e0726c574 100644 --- a/sys/dev/mps/mps_mapping.c +++ b/sys/dev/mps/mps_mapping.c @@ -1175,9 +1175,10 @@ _mapping_get_dev_info(struct mps_softc *sc, phy_change->is_processed = 1; mps_dprint(sc, MPS_ERROR | MPS_MAPPING, "%s: " "failed to add the device with handle " - "0x%04x because the enclosure is not in " - "the mapping table\n", __func__, - phy_change->dev_handle); + "0x%04x because enclosure handle 0x%04x " + "is not in the mapping table\n", __func__, + phy_change->dev_handle, + topo_change->enc_handle); continue; } if (!((phy_change->device_info & @@ -1420,9 +1421,10 @@ _mapping_add_new_device(struct mps_softc *sc, phy_change->is_processed = 1; mps_dprint(sc, MPS_ERROR | MPS_MAPPING, "%s: " "failed to add the device with handle " - "0x%04x because the enclosure is not in " - "the mapping table\n", __func__, - phy_change->dev_handle); + "0x%04x because enclosure handle 0x%04x " + "is not in the mapping table\n", __func__, + phy_change->dev_handle, + topo_change->enc_handle); continue; } @@ -1884,8 +1886,8 @@ _mapping_process_dpm_pg0(struct mps_softc *sc) MPS_DPM_BAD_IDX) { mps_dprint(sc, MPS_ERROR | MPS_MAPPING, "%s: Conflict in mapping table for " - " enclosure %d\n", __func__, - enc_idx); + "enclosure %d device %d\n", + __func__, enc_idx, map_idx); break; } physical_id = @@ -2359,14 +2361,24 @@ mps_mapping_enclosure_dev_status_change_event(struct mps_softc *sc, * and the enclosure has enough space in the Mapping * Table to map its devices. */ - enc_idx = sc->num_enc_table_entries; - if (enc_idx >= sc->max_enclosures) { + if (sc->num_enc_table_entries < sc->max_enclosures) { + enc_idx = sc->num_enc_table_entries; + sc->num_enc_table_entries++; + } else { + enc_idx = _mapping_get_high_missing_et_idx(sc); + if (enc_idx != MPS_ENCTABLE_BAD_IDX) { + et_entry = &sc->enclosure_table[enc_idx]; + _mapping_add_to_removal_table(sc, + et_entry->dpm_entry_num); + _mapping_clear_enc_entry(et_entry); + } + } + if (enc_idx == MPS_ENCTABLE_BAD_IDX) { mps_dprint(sc, MPS_ERROR | MPS_MAPPING, "%s: " "Enclosure cannot be added to mapping " "table because it's full.\n", __func__); goto out; } - sc->num_enc_table_entries++; et_entry = &sc->enclosure_table[enc_idx]; et_entry->enc_handle = le16toh(event_data-> EnclosureHandle); @@ -2393,8 +2405,9 @@ mps_mapping_enclosure_dev_status_change_event(struct mps_softc *sc, le16toh(event_data->EnclosureHandle)); if (enc_idx == MPS_ENCTABLE_BAD_IDX) { mps_dprint(sc, MPS_ERROR | MPS_MAPPING, "%s: Cannot " - "unmap enclosure %d because it has already been " - "deleted.\n", __func__, enc_idx); + "unmap enclosure with handle 0x%04x because it " + "has already been deleted.\n", __func__, + le16toh(event_data->EnclosureHandle)); goto out; } et_entry = &sc->enclosure_table[enc_idx];