Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 27 Aug 2019 16:57:52 -0400
From:      Alexander Motin <mav@FreeBSD.org>
To:        Scott Long <scottl@samsco.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r351550 - head/sys/cam/scsi
Message-ID:  <3c2aa0be-3d42-881e-87e1-675499a7bc5f@FreeBSD.org>
In-Reply-To: <99271565-F168-48C8-90E0-749417C7C974@samsco.org>
References:  <201908271641.x7RGf6LC075849@repo.freebsd.org> <99271565-F168-48C8-90E0-749417C7C974@samsco.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Some FreeNAS user reported panic after updating to newer version.  On
the screenshot provided were several BUSY statuses for SATA disk on
mps(4), followed by panic "Attempt to remove out-of-bounds index -1 from
queue ...".  In his case I blame ancient LSI firmware or some broken
hardware, but I was able to reproduce the panic on FreeBSD head debug
kernel by hacking mps(4) driver to always report BUSY (appeared except
IDENTIFY and REPORT LUNS).  To diagnose it I inserted assertion into
xpt_free_ccb(), checking ccb->ccb_h.pinfo.index for values used for
requests still in send queue.  Not sure it is to be persistent, but in
this case it lead me directly to this place.

On 27.08.2019 16:23, Scott Long wrote:
> This is very concerning, and I wonder if it’s the cause of the mystery use-after-free / double-complete that I’ve seen for years and have never been able to catch.  Can you say more about how you found it?
> 
> Scott
> 
> 
>> On Aug 27, 2019, at 10:41 AM, Alexander Motin <mav@FreeBSD.org> wrote:
>>
>> Author: mav
>> Date: Tue Aug 27 16:41:06 2019
>> New Revision: 351550
>> URL: https://svnweb.freebsd.org/changeset/base/351550
>>
>> Log:
>>  Always check cam_periph_error() status for ERESTART.
>>
>>  Even if we do not expect retries, we better be sure, since otherwise it
>>  may result in use after free kernel panic.  I've noticed that it retries
>>  SCSI_STATUS_BUSY even with SF_NO_RECOVERY | SF_NO_RETRY.
>>
>>  MFC after:	1 week
>>  Sponsored by:	iXsystems, Inc.
>>
>> Modified:
>>  head/sys/cam/scsi/scsi_xpt.c
>>
>> Modified: head/sys/cam/scsi/scsi_xpt.c
>> ==============================================================================
>> --- head/sys/cam/scsi/scsi_xpt.c	Tue Aug 27 15:42:08 2019	(r351549)
>> +++ head/sys/cam/scsi/scsi_xpt.c	Tue Aug 27 16:41:06 2019	(r351550)
>> @@ -1684,8 +1684,9 @@ probe_device_check:
>> 	case PROBE_TUR_FOR_NEGOTIATION:
>> 	case PROBE_DV_EXIT:
>> 		if (cam_ccb_status(done_ccb) != CAM_REQ_CMP) {
>> -			cam_periph_error(done_ccb, 0,
>> -			    SF_NO_PRINT | SF_NO_RECOVERY | SF_NO_RETRY);
>> +			if (cam_periph_error(done_ccb, 0, SF_NO_PRINT |
>> +			    SF_NO_RECOVERY | SF_NO_RETRY) == ERESTART)
>> +				goto outr;
>> 		}
>> 		if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
>> 			/* Don't wedge the queue */
>> @@ -1735,8 +1736,9 @@ probe_device_check:
>> 		struct ccb_scsiio *csio;
>>
>> 		if (cam_ccb_status(done_ccb) != CAM_REQ_CMP) {
>> -			cam_periph_error(done_ccb, 0,
>> -			    SF_NO_PRINT | SF_NO_RECOVERY | SF_NO_RETRY);
>> +			if (cam_periph_error(done_ccb, 0, SF_NO_PRINT |
>> +			    SF_NO_RECOVERY | SF_NO_RETRY) == ERESTART)
>> +				goto outr;
>> 		}
>> 		if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
>> 			/* Don't wedge the queue */
>>
> 

-- 
Alexander Motin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3c2aa0be-3d42-881e-87e1-675499a7bc5f>