Date: Sat, 11 Feb 2017 04:56:05 +0000
From: John <jwd@FreeBSD.org>
To: FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject: multipath device never failing - loops over providers instead
Message-ID: <20170211045605.GA43225@FreeBSD.org>
Hi Folks,
Running 10.3-STABLE r308246 from Nov 3, 2016.
I thought I saw a commit in this area a while back, but I
cannot find it and Google is not helping.
I have SAS drives behind 2 multiplexers (4 paths total), which
are all configured similarly to the following:
# gmultipath status Z76
         Name   Status  Components
multipath/Z76  OPTIMAL  da92 (ACTIVE)
                        da236 (PASSIVE)
                        da428 (PASSIVE)
                        da572 (PASSIVE)
For each path on the components above, the following sequence occurs:
kernel: (da92:mpr0:0:399:0): READ(10). CDB: 28 00 0b a7 20 c0 00 00 10 00
kernel: (da92:mpr0:0:399:0): CAM status: SCSI Status Error
kernel: (da92:mpr0:0:399:0): SCSI status: Check Condition
kernel: (da92:mpr0:0:399:0): SCSI sense: HARDWARE FAILURE asc:32,0 (No defect spare location available)
kernel: (da92:mpr0:0:399:0): Info: 0xba720c0
kernel: (da92:mpr0:0:399:0): Field Replaceable Unit: 157
kernel: (da92:mpr0:0:399:0): Command Specific Info: 0x80010000
kernel: (da92:mpr0:0:399:0): Actual Retry Count: 255
kernel: (da92:mpr0:0:399:0): Retrying command (per sense data)
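As a cross-check on the log above, the CDB can be decoded by hand. The sketch below assumes the standard READ(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8); the helper name decode_read10 is made up for illustration.

```python
def decode_read10(cdb_hex):
    """Split a READ(10) CDB, given as space-separated hex bytes, into (LBA, block count)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, blocks


lba, blocks = decode_read10("28 00 0b a7 20 c0 00 00 10 00")
print(hex(lba), blocks)  # prints: 0xba720c0 16
```

The decoded LBA matches the "Info: 0xba720c0" line, which suggests the error is reported against the first block of a 16-block read and is tied to the drive's media rather than to any particular path.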
After each path has failed, the following is seen:
kernel: GEOM_MULTIPATH: Error 5, da92 in Z76 marked FAIL
kernel: GEOM_MULTIPATH: all paths in Z76 were marked FAIL, restore da572
kernel: GEOM_MULTIPATH: all paths in Z76 were marked FAIL, restore da428
kernel: GEOM_MULTIPATH: all paths in Z76 were marked FAIL, restore da236
kernel: GEOM_MULTIPATH: da572 is now active path in Z76
and the entire failure loop occurs again. The multipath device
itself is never failed, so the ZFS pool can never go into degraded mode,
the faulty drive cannot be replaced with a spare, etc.
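The loop can be reproduced with a toy model. This is only a sketch of the behaviour visible in the log messages, not the geom_multipath source: the class MultipathModel, its fault() method, and the order in which a new active path is picked are all assumptions made for illustration, and it assumes two or more paths.

```python
class MultipathModel:
    """Toy model of the observed behaviour; assumes at least two paths."""

    def __init__(self, name, paths):
        self.name = name
        self.paths = list(paths)
        self.failed = set()
        self.active = self.paths[0]
        self.log = []

    def fault(self, path):
        """Handle an I/O error reported on `path`."""
        self.failed.add(path)
        self.log.append(f"{path} in {self.name} marked FAIL")
        others = [p for p in self.paths if p != path]
        if all(p in self.failed for p in others):
            # Every path has now failed: the others are restored
            # instead of the device being failed.
            for p in reversed(others):
                self.failed.discard(p)
                self.log.append(
                    f"all paths in {self.name} were marked FAIL, restore {p}")
        # Pick any path that is not marked FAIL as the new active path.
        self.active = [p for p in reversed(self.paths)
                       if p not in self.failed][0]
        self.log.append(f"{self.active} is now active path in {self.name}")


def run(model, bad_reads):
    """Issue reads that fail on whichever path is active (the media is bad)."""
    for _ in range(bad_reads):
        model.fault(model.active)
    return model


m = run(MultipathModel("Z76", ["da92", "da236", "da428", "da572"]), 100)
print(m.active, len(m.failed))
```

Because the error comes from the drive and not from a path, every restored path fails again, and in this model there is no state in which the device as a whole is reported as failed.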
Once I pulled the drive, the multipath device Z76 failed and
things went as expected.
It seems g_multipath_fault() in this instance should just fail the device.
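A minimal sketch of what "just fail the device" could look like, written as a standalone model and not as a patch to g_multipath_fault(). The function fault_and_fail and its return convention are hypothetical, and the sketch ignores the question of telling transient transport errors apart from errors reported by the drive itself.

```python
def fault_and_fail(paths, failed, path):
    """Mark `path` failed; return (failed_set, new_active, device_failed).

    Unlike the restore behaviour seen in the log, once every path is
    marked FAIL the device is reported as failed and nothing is restored.
    """
    failed = set(failed) | {path}
    remaining = [p for p in paths if p not in failed]
    if not remaining:
        return failed, None, True
    return failed, remaining[0], False


paths = ["da92", "da236", "da428", "da572"]
failed, active, device_failed = set(), paths[0], False
while not device_failed:
    # The bad block fails on whichever path is currently active.
    failed, active, device_failed = fault_and_fail(paths, failed, active)
print(sorted(failed), active, device_failed)
```

With this policy the error would propagate upward after the fourth failure, which would let the pool mark the drive as faulted.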
Does anyone have any pointers on this issue?
Thanks,
John
