Date:      Thu, 1 Jun 2017 11:36:35 -0600
From:      Stephen Mcconnell <stephen.mcconnell@broadcom.com>
To:        Harry Schmalzbauer <freebsd@omnilan.de>
Cc:        freebsd-scsi@freebsd.org, Scott Long <scottl@freebsd.org>
Subject:   RE: mps(4) blocks panic-reboot
Message-ID:  <e6fe7cc17fb1302caf2122eaa11d10ba@mail.gmail.com>
In-Reply-To: <59303484.1040609@omnilan.de>
References:  <592FDE8C.1090609@omnilan.de> <12a36df9eff99c77ec621987efbe75fe@mail.gmail.com> <ff9342e2e1eb541f347d9f683cfc8214@mail.gmail.com> <59303484.1040609@omnilan.de>


[-- Attachment #1 --]
Can you try the attached patch and let me know how it goes? I didn't test
it, but since you know how, it might be easier this way. This was diff'd
from the latest mps files in stable/11, which I recently updated (today).

Thanks,
Steve

> -----Original Message-----
> From: Harry Schmalzbauer [mailto:freebsd@omnilan.de]
> Sent: Thursday, June 01, 2017 9:37 AM
> To: Stephen Mcconnell
> Cc: freebsd-scsi@freebsd.org; Scott Long
> Subject: Re: mps(4) blocks panic-reboot
>
> Regarding Stephen Mcconnell's message of 01.06.2017 17:25 (localtime):
> > I found a couple of emails between me and Scott a while back and we
> > talked about this. The problem is that the SSU handling relies on
> > interrupts, but interrupts stop due to the panic, so it hangs. Scott
> > came up with a way around it but we never decided on a final fix and
> > then it was forgotten about. If you have a way to reproduce this, I can
> > try to find a fix here.
> > Or, I might be able to force the system to panic at the right time.
>
> Thank you very much for your attention!
>
> I remember having read some discussion about that topic but thought it
> was fixed and didn't search any further; thanks for doing that job :-)
>
> I can reproduce at any time and am willing to test anything (which I can
> get to compile on stable/11)!
> This is a semi-production machine where I evaluate some netmap/bhyve
> options/strategies.
>
> Thanks,
>
> -harry
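
For readers following the thread, the hang described above (completion
relies on interrupts, interrupts stop after a panic) and the polled
alternative can be illustrated with a small userland model. This is only a
rough sketch under stated assumptions: every name in it (fake_hba,
fake_cmd, wait_interrupt_driven, wait_polled and the helpers) is
hypothetical and made up for illustration; none of it is mps(4) or CAM
code, and it says nothing about how the real driver reaps replies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one outstanding command (not an mps(4) type). */
struct fake_cmd {
	int	ticks_left;	/* simulated device latency, in loop iterations */
	bool	done;		/* set once the completion has been reaped */
};

/* Hypothetical stand-in for the controller state (not an mps(4) type). */
struct fake_hba {
	bool		 intr_enabled;	/* false models "interrupts stopped" */
	struct fake_cmd	*inflight;	/* the single command being tracked */
};

/* The simulated device makes progress whether or not anyone is listening. */
static void
fake_hw_advance(struct fake_hba *hba)
{
	if (hba->inflight != NULL && hba->inflight->ticks_left > 0)
		hba->inflight->ticks_left--;
}

/* Reap the completion if the simulated device has finished the command. */
static void
fake_reap(struct fake_hba *hba)
{
	struct fake_cmd *cm = hba->inflight;

	if (cm != NULL && cm->ticks_left == 0) {
		cm->done = true;
		hba->inflight = NULL;
	}
}

/*
 * Interrupt-driven wait: the completion is only reaped when the "ISR"
 * runs, which happens only while interrupts are being delivered.
 */
static bool
wait_interrupt_driven(struct fake_hba *hba, struct fake_cmd *cm, int max_ticks)
{
	hba->inflight = cm;
	for (int i = 0; i < max_ticks && !cm->done; i++) {
		fake_hw_advance(hba);
		if (hba->intr_enabled)
			fake_reap(hba);		/* models the ISR firing */
	}
	return (cm->done);
}

/*
 * Polled wait: the waiter reaps completions itself on every iteration,
 * so it does not depend on interrupt delivery at all.
 */
static bool
wait_polled(struct fake_hba *hba, struct fake_cmd *cm, int max_ticks)
{
	hba->inflight = cm;
	for (int i = 0; i < max_ticks && !cm->done; i++) {
		fake_hw_advance(hba);
		fake_reap(hba);
	}
	return (cm->done);
}
```

With intr_enabled set to false, wait_interrupt_driven() runs out its tick
budget without ever seeing the completion, while wait_polled() returns as
soon as the simulated device finishes. That is the general reason a polled
submission path is attractive during panic or shutdown.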

[-- Attachment #2 --]
mps_ssu_polled.diff
Index: mps_sas.c
===================================================================
--- mps_sas.c	(revision 319446)
+++ mps_sas.c	(working copy)
@@ -2211,18 +2211,6 @@
 		}
 	}
 
-	/*
-	 * If this is a Start Stop Unit command and it was issued by the driver
-	 * during shutdown, decrement the refcount to account for all of the
-	 * commands that were sent.  All SSU commands should be completed before
-	 * shutdown completes, meaning SSU_refcount will be 0 after SSU_started
-	 * is TRUE.
-	 */
-	if (sc->SSU_started && (csio->cdb_io.cdb_bytes[0] == START_STOP_UNIT)) {
-		mps_dprint(sc, MPS_INFO, "Decrementing SSU count.\n");
-		sc->SSU_refcount--;
-	}
-
 	/* Take the fast path to completion */
 	if (cm->cm_reply == NULL) {
 		if (mpssas_get_ccbstatus(ccb) == CAM_REQ_INPROG) {
Index: mps_sas_lsi.c
===================================================================
--- mps_sas_lsi.c	(revision 319446)
+++ mps_sas_lsi.c	(working copy)
@@ -1117,13 +1117,13 @@
 	target_id_t targetid;
 	struct mpssas_target *target;
 	char path_str[64];
-	struct timeval cur_time, start_time;
 
 	/*
-	 * For each target, issue a StartStopUnit command to stop the device.
+	 * Mask interrupts now because shutdown is in progress and we don't want
+	 * to rely on the ISR to complete these commands, so each one is polled
+	 * for completion below.
 	 */
-	sc->SSU_started = TRUE;
-	sc->SSU_refcount = 0;
+	mps_mask_intr(sc);
 	for (targetid = 0; targetid < sc->max_devices; targetid++) {
 		target = &sassc->targets[targetid];
 		if (target->handle == 0x0) {
@@ -1157,12 +1157,9 @@
 			    "handle %d\n", path_str, target->handle);
 			
 			/*
-			 * Issue a START STOP UNIT command for the target.
-			 * Increment the SSU counter to be used to count the
-			 * number of required replies.
+			 * Issue a START STOP UNIT command for the target and
+			 * poll for completion.
 			 */
-			mps_dprint(sc, MPS_INFO, "Incrementing SSU count\n");
-			sc->SSU_refcount++;
 			ccb->ccb_h.target_id =
 			    xpt_path_target_id(ccb->ccb_h.path);
 			ccb->ccb_h.ppriv_ptr1 = sassc;
@@ -1175,27 +1172,9 @@
 			    /*immediate*/FALSE,
 			    MPS_SENSE_LEN,
 			    /*timeout*/10000);
-			xpt_action(ccb);
+			xpt_polled_action(ccb);
 		}
 	}
-
-	/*
-	 * Wait until all of the SSU commands have completed or time has
-	 * expired (60 seconds).  Pause for 100ms each time through.  If any
-	 * command times out, the target will be reset in the SCSI command
-	 * timeout routine.
-	 */
-	getmicrotime(&start_time);
-	while (sc->SSU_refcount) {
-		pause("mpswait", hz/10);
-		
-		getmicrotime(&cur_time);
-		if ((cur_time.tv_sec - start_time.tv_sec) > 60) {
-			mps_dprint(sc, MPS_FAULT, "Time has expired waiting "
-			    "for SSU commands to complete.\n");
-			break;
-		}
-	}
 }
 
 static void
@@ -1214,9 +1193,7 @@
 	    path_str);
 
 	/*
-	 * Nothing more to do except free the CCB and path.  If the command
-	 * timed out, an abort reset, then target reset will be issued during
-	 * the SCSI Command process.
+	 * Nothing more to do except free the CCB and path.
 	 */
 	xpt_free_path(done_ccb->ccb_h.path);
 	xpt_free_ccb(done_ccb);
Index: mpsvar.h
===================================================================
--- mpsvar.h	(revision 319446)
+++ mpsvar.h	(working copy)
@@ -421,10 +421,6 @@
 
 	char				exclude_ids[80];
 	struct timeval			lastfail;
-
-	/* StartStopUnit command handling at shutdown */
-	uint32_t			SSU_refcount;
-	uint8_t				SSU_started;
 };
 
 struct mps_config_params {
