Date: Thu, 1 Jun 2017 11:36:35 -0600
From: Stephen Mcconnell <stephen.mcconnell@broadcom.com>
To: Harry Schmalzbauer <freebsd@omnilan.de>
Cc: freebsd-scsi@freebsd.org, Scott Long <scottl@freebsd.org>
Subject: RE: mps(4) blocks panic-reboot
Message-ID: <e6fe7cc17fb1302caf2122eaa11d10ba@mail.gmail.com>
In-Reply-To: <59303484.1040609@omnilan.de>
References: <592FDE8C.1090609@omnilan.de> <12a36df9eff99c77ec621987efbe75fe@mail.gmail.com> <ff9342e2e1eb541f347d9f683cfc8214@mail.gmail.com> <59303484.1040609@omnilan.de>
[-- Attachment #1 --]
Can you try the attached patch and let me know how it goes? I haven't tested
it, but since you already know how to reproduce the hang, it might be easier
this way. It was diffed against the latest mps files in stable/11, which I
updated today.
Thanks,
Steve
> -----Original Message-----
> From: Harry Schmalzbauer [mailto:freebsd@omnilan.de]
> Sent: Thursday, June 01, 2017 9:37 AM
> To: Stephen Mcconnell
> Cc: freebsd-scsi@freebsd.org; Scott Long
> Subject: Re: mps(4) blocks panic-reboot
>
> Regarding Stephen Mcconnell's message of 01.06.2017 17:25 (localtime):
> > I found a couple of emails between me and Scott a while back and we
> > talked about this. The problem is that the SSU handling relies on
> > interrupts, but interrupts stop due to the panic, so it hangs. Scott
> > came up with a way around it but we never decided on a final fix and
> > then it was forgotten about. If you have a way to reproduce this, I can
> > try to find a fix here.
> > Or, I might be able to force the system to panic at the right time.
>
> Thank you very much for your attention!
>
> I remember having read some discussion about that topic but thought it was
> fixed and haven't searched any further; thanks for doing that job :-)
>
> I can reproduce at any time, willing to test anything (which I can get to
> compile on stable/11)!
> This is a semi-production machine where I evaluate some netmap/bhyve
> options/strategies.
>
> Thanks,
>
> -harry
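The hang described in the quoted message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `fake_hba`, `fake_isr`, `fake_tick`, and `wait_for_completions` are hypothetical names, not mps driver code, and the bounded loop exists only so the model terminates. It shows a wait that is satisfied solely by an interrupt handler and therefore makes no progress once interrupts stop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the mps driver. */
struct fake_hba {
    bool intr_enabled; /* false models a panicked system: no interrupts arrive */
    int  hw_done;      /* completions the hardware has posted */
    int  refcount;     /* commands the driver is still waiting for */
};

/* Stand-in for the interrupt handler: consumes every posted completion. */
void fake_isr(struct fake_hba *hba)
{
    while (hba->hw_done > 0) {
        hba->hw_done--;
        hba->refcount--;
    }
}

/* One time slice: the ISR runs only if interrupts are being delivered. */
void fake_tick(struct fake_hba *hba)
{
    if (hba->intr_enabled)
        fake_isr(hba);
}

/*
 * Interrupt-dependent wait: loop until refcount reaches zero or max_ticks
 * is used up. Returns the number of commands still outstanding.
 */
int wait_for_completions(struct fake_hba *hba, int max_ticks)
{
    assert(hba != NULL);
    for (int t = 0; t < max_ticks && hba->refcount > 0; t++)
        fake_tick(hba);
    return hba->refcount;
}
```

In this model, a `fake_hba` with `intr_enabled` set and three posted completions leaves `wait_for_completions` with 0 outstanding commands. The same state with `intr_enabled` cleared still has 3 outstanding when `max_ticks` runs out, because nothing ever decrements `refcount`.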
[-- Attachment #2 --]
mps_ssu_polled.diff
Index: mps_sas.c
===================================================================
--- mps_sas.c (revision 319446)
+++ mps_sas.c (working copy)
@@ -2211,18 +2211,6 @@
}
}
- /*
- * If this is a Start Stop Unit command and it was issued by the driver
- * during shutdown, decrement the refcount to account for all of the
- * commands that were sent. All SSU commands should be completed before
- * shutdown completes, meaning SSU_refcount will be 0 after SSU_started
- * is TRUE.
- */
- if (sc->SSU_started && (csio->cdb_io.cdb_bytes[0] == START_STOP_UNIT)) {
- mps_dprint(sc, MPS_INFO, "Decrementing SSU count.\n");
- sc->SSU_refcount--;
- }
-
/* Take the fast path to completion */
if (cm->cm_reply == NULL) {
if (mpssas_get_ccbstatus(ccb) == CAM_REQ_INPROG) {
Index: mps_sas_lsi.c
===================================================================
--- mps_sas_lsi.c (revision 319446)
+++ mps_sas_lsi.c (working copy)
@@ -1117,13 +1117,13 @@
target_id_t targetid;
struct mpssas_target *target;
char path_str[64];
- struct timeval cur_time, start_time;
/*
- * For each target, issue a StartStopUnit command to stop the device.
+ * Disable interrupts now because shutdown is in progress and don't want
+ * to rely on ISR to complete these. So polling must be done here as
+ * well.
*/
- sc->SSU_started = TRUE;
- sc->SSU_refcount = 0;
+ mps_mask_intr(sc);
for (targetid = 0; targetid < sc->max_devices; targetid++) {
target = &sassc->targets[targetid];
if (target->handle == 0x0) {
@@ -1157,12 +1157,9 @@
"handle %d\n", path_str, target->handle);
/*
- * Issue a START STOP UNIT command for the target.
- * Increment the SSU counter to be used to count the
- * number of required replies.
+ * Issue a START STOP UNIT command for the target and
+ * poll for completion.
*/
- mps_dprint(sc, MPS_INFO, "Incrementing SSU count\n");
- sc->SSU_refcount++;
ccb->ccb_h.target_id =
xpt_path_target_id(ccb->ccb_h.path);
ccb->ccb_h.ppriv_ptr1 = sassc;
@@ -1175,27 +1172,9 @@
/*immediate*/FALSE,
MPS_SENSE_LEN,
/*timeout*/10000);
- xpt_action(ccb);
+ xpt_polled_action(ccb);
}
}
-
- /*
- * Wait until all of the SSU commands have completed or time has
- * expired (60 seconds). Pause for 100ms each time through. If any
- * command times out, the target will be reset in the SCSI command
- * timeout routine.
- */
- getmicrotime(&start_time);
- while (sc->SSU_refcount) {
- pause("mpswait", hz/10);
-
- getmicrotime(&cur_time);
- if ((cur_time.tv_sec - start_time.tv_sec) > 60) {
- mps_dprint(sc, MPS_FAULT, "Time has expired waiting "
- "for SSU commands to complete.\n");
- break;
- }
- }
}
static void
@@ -1214,9 +1193,7 @@
path_str);
/*
- * Nothing more to do except free the CCB and path. If the command
- * timed out, an abort reset, then target reset will be issued during
- * the SCSI Command process.
+ * Nothing more to do except free the CCB and path.
*/
xpt_free_path(done_ccb->ccb_h.path);
xpt_free_ccb(done_ccb);
Index: mpsvar.h
===================================================================
--- mpsvar.h (revision 319446)
+++ mpsvar.h (working copy)
@@ -421,10 +421,6 @@
char exclude_ids[80];
struct timeval lastfail;
-
- /* StartStopUnit command handling at shutdown */
- uint32_t SSU_refcount;
- uint8_t SSU_started;
};
struct mps_config_params {
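
The approach taken in the patch above (mask interrupts once, then issue each START STOP UNIT command and poll for its completion) can be sketched in user-space C. This is a simplified model, not the CAM implementation of `xpt_polled_action`: `fake_cmd`, `fake_hba`, `fake_poll_once`, `fake_polled_action`, and `fake_stop_all` are hypothetical names, and hardware latency is modelled as a plain countdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one command; latency is counted in poll iterations. */
struct fake_cmd {
    int  polls_until_done; /* polls that must pass before the reply is ready */
    bool completed;        /* set by the completion handler */
};

struct fake_hba {
    bool intr_masked;      /* true once the driver stops relying on the ISR */
};

/* Completion handler: normally reached via the ISR, here via the poll loop. */
void fake_complete(struct fake_cmd *cmd)
{
    cmd->completed = true;
}

/* Check the "hardware" once; complete the command in the caller's context. */
bool fake_poll_once(struct fake_cmd *cmd)
{
    if (cmd->polls_until_done > 0) {
        cmd->polls_until_done--;
        return false;
    }
    fake_complete(cmd);
    return true;
}

/* Issue one command and poll for it. Returns false if max_polls is used up. */
bool fake_polled_action(struct fake_cmd *cmd, int max_polls)
{
    assert(cmd != NULL);
    for (int i = 0; i < max_polls; i++)
        if (fake_poll_once(cmd))
            return true;
    return false;
}

/* Mask interrupts once, then stop each target by polling. Returns the count done. */
int fake_stop_all(struct fake_hba *hba, struct fake_cmd *cmds, size_t n,
    int max_polls)
{
    int done = 0;

    assert(hba != NULL && cmds != NULL);
    hba->intr_masked = true;
    for (size_t i = 0; i < n; i++)
        if (fake_polled_action(&cmds[i], max_polls))
            done++;
    return done;
}
```

Because completion runs in the issuing context, no reference count or separate wait loop is needed in this model: each command is either finished or timed out by the time the next one is issued.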
