Date: Thu, 12 Jul 2018 12:00:41 +0200
From: Oliver Sech <crimsonthunder@gmx.net>
To: Ken Merry <ken@freebsd.org>, Stephen Mcconnell <stephen.mcconnell@broadcom.com>
Cc: FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject: Re: problems with SAS JBODs 2
Message-ID: <9e0bf18f-0689-b2a0-1da4-b70c497b2f14@gmx.net>
In-Reply-To: <54B10B7C-CDCE-4428-B584-59CE8F38B120@freebsd.org>
References: <trinity-14d18077-ea73-40f6-9e87-d2d4000b1f7e-1530620937871@3c-app-gmx-bs01>
 <CAOtMX2h8r31AeNCKyckK2P0VLn1CKFogo9bWom2So1x2ngpa4A@mail.gmail.com>
 <237f77ab-89e2-188b-b2b1-84c6d88609b0@gmx.net>
 <b785fe02-9242-c95f-56cb-2130f90e17f5@gmx.net>
 <3caf8ccd6fde8cfc4db25bae5327c46b@mail.gmail.com>
 <0af047d477d15ec364140653bd967c89@mail.gmail.com>
 <54B10B7C-CDCE-4428-B584-59CE8F38B120@freebsd.org>
On 07/11/2018 10:35 PM, Ken Merry wrote:
> Oliver, what happens when you try to do I/O to the devices that don’t go away after you pull the cable? Does that cause the devices to go away?

I tried 'dd if=/dev/daX of=/dev/null bs=1k count=1', and at least the "da" device disappears.

> Looking at the mprutil output, it also shows the devices sticking around from the adapter’s standpoint.
>
> You can also try a ‘camcontrol rescan all’ or a ‘camcontrol rescan N’ (where N is the scbus number shown by ‘camcontrol devlist -v’). That will do some basic probes for each of the devices and should in theory cause them to go away if they aren’t accessible.
>
> It seems like the adapter may not be recognizing that the devices in question have gone.

I'm pretty sure that I tried 'camcontrol rescan all' a few times. While I am not sure anymore whether that cleans up the non-working devices, I am sure that no new devices were added.

Unfortunately I haven't gotten to Steve's 'clear controller mapping' script yet, but I did a few other things:

* The last time I tried to upgrade the firmware I had all sorts of problems. "sas3flash" reported bad checksums while flashing some of the files. So I reflashed both controllers with the DOS version of sas3flash. This was basically a challenge in itself because the DOS version of this utility does not seem to run on computers of this decade. (ERROR: Failed to initialize PAL. Exiting program.) The equivalent sas3flash.EFI version seems to be out of date and caused the checksum problems described before. (This time I wiped them before flashing with "sas3flash -o -e 6".)
* I tried to change the mpr tunable "use_phy_num" after that, but this has not improved the situation. I will retry and collect logs with Steve's script.
* I retried with the latest "mpr.ko" from the Broadcom download page. (Same problems, no "use_phy_num" tunable.)
* I retested this hardware with Linux (4.15 and 4.17)
** Some shelves could be replugged reliably (i.e. 45 disks show up, 45 disks disappear, 45 disks reappear)
** With the newest shelf, 2 disks were missing after the replugging (i.e. 44 disks show up, 44 disks disappear, 42 disks reappear) (kernel log mpt3sas_cm0: "device is not present handle)
* I tried a different controller
** So far I used a Broadcom LSI SAS 9305-16e (Controller: SAS3216) (Firmware 16.00.01.00 or 15.00.00.00)
** Yesterday I switched to a new, fresh out-of-the-box Broadcom LSI 9305-24i (Controller: SAS3224) (Firmware 09.00.00.00 (or something similar with 09*))

With the new controller everything seems to work on Linux. It might be the old firmware?...

It is better with the new controller on FreeBSD in the sense that I at least get one out of two /dev/sesX devices back. But disks are still missing and are not getting completely cleaned up...

This whole thing is a bit frustrating, especially since up until now I thought that HBAs are kind of "connect and forget" devices.

Next step is to set up a separate test environment and try to get it to work there. I will keep you updated and try to provide logs for all FreeBSD-related problems.
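
For reference, the read probe mentioned at the top of this mail could be scripted roughly as follows. This is only a sketch under assumptions: the function name probe_disks and the DRY_RUN switch are made up for illustration, and it assumes the disks appear as /dev/daN. It issues the same one-block dd read per device, and with DRY_RUN=1 it only prints the commands.

```shell
#!/bin/sh
# Sketch only: probe_disks and DRY_RUN are hypothetical names.
# Reads one 1k block from each named disk so that I/O hits stale devices.

probe_disks() {
    # Arguments: device names without the /dev/ prefix, e.g. da0 da1
    for disk in "$@"; do
        cmd="dd if=/dev/$disk of=/dev/null bs=1k count=1"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: show the command instead of running it
            echo "$cmd"
        elif $cmd 2>/dev/null; then
            echo "$disk: readable"
        else
            echo "$disk: read failed"
        fi
    done
}
```

It would be called with the affected device names, e.g. 'probe_disks da0 da1', followed by 'camcontrol rescan all' and 'camcontrol devlist -v' to see which devices are still listed.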