Date:      Fri, 27 Apr 2007 13:19:43 -0600
From:      Scott Long <scottl@samsco.org>
To:        Nikolay Pavlov <quetzal@zone3000.net>, Thomas Quinot <thomas@FreeBSD.ORG>,  "Ganbold.TS" <ganbold@micom.mng.net>, freebsd-stable@FreeBSD.ORG, mjacob@FreeBSD.ORG, linimon@FreeBSD.ORG, bug-followup@FreeBSD.ORG
Subject:   Re: kern/112119: system hangs when starts k3b on RELENG_6
Message-ID:  <46324CCF.7040109@samsco.org>
In-Reply-To: <20070427174922.GA5655@zone3000.net>
References:  <20070427150134.64D3713C448@mx1.freebsd.org>	<20070427153218.GA9091@melamine.cuivre.fr.eu.org> <20070427174922.GA5655@zone3000.net>

Nikolay Pavlov wrote:
> On Friday, 27 April 2007 at 17:32:18 +0200, Thomas Quinot wrote:
>> * Ganbold.TS, 2007-04-27 :
>>
>>> I tried your patch at
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=103602&getpatch=12 and the
>>> problem is still the same. The system freezes when k3b starts.
>>>
>>> I also tried your attached patch, which reverts part of rev. 1.42.2.3,
>>> and the problem is still the same: the system hangs when k3b starts.
>> Thanks, that's useful info. Please try the attached patch instead, which
>> reverts another part of 1.42.2.3 (I'm trying to figure out exactly
>> *which* part of this change is causing the problem).
>>
>> Also, were you able to capture system console output at the point where
>> the crash occurs? We might have some indications there.
> 
> This patch works for me. The system does not reboot and I am able to
> successfully burn a CD.
> 
>> Thomas.
>>
> 
>> Index: atapi-cam.c
>> ===================================================================
>> RCS file: /space/mirror/ncvs/src/sys/dev/ata/atapi-cam.c,v
>> retrieving revision 1.42.2.3
>> retrieving revision 1.42.2.2
>> diff -u -r1.42.2.3 -r1.42.2.2
>> --- atapi-cam.c	29 Mar 2007 20:08:32 -0000	1.42.2.3
>> +++ atapi-cam.c	6 Mar 2007 16:56:50 -0000	1.42.2.2
>> @@ -697,39 +680,32 @@
>>  	    csio->ccb_h.status |= CAM_AUTOSNS_VALID;
>>  	}
>>      } else if (request->result != 0) {
>> -	if ((request->flags & ATA_R_TIMEOUT) != 0) {
>> -	    rc = CAM_CMD_TIMEOUT;
>> -	} else {
>> -	    rc = CAM_SCSI_STATUS_ERROR;
>> -	    csio->scsi_status = SCSI_STATUS_CHECK_COND;
>> +	rc = CAM_SCSI_STATUS_ERROR;
>> +	csio->scsi_status = SCSI_STATUS_CHECK_COND;
>>  
>> -	    if ((csio->ccb_h.flags & CAM_DIS_AUTOSENSE) == 0) {
>> +	if ((csio->ccb_h.flags & CAM_DIS_AUTOSENSE) == 0) {
>>  #if 0
>> -		static const int8_t ccb[16] = { ATAPI_REQUEST_SENSE, 0, 0, 0,
>> -		    sizeof(struct atapi_sense), 0, 0, 0, 0, 0, 0,
>> -		    0, 0, 0, 0, 0 };
>> -
>> -		bcopy (ccb, request->u.atapi.ccb, sizeof ccb);
>> -		request->data = (caddr_t)&csio->sense_data;
>> -		request->bytecount = sizeof(struct atapi_sense);
>> -		request->transfersize = min(request->bytecount, 65534);
>> -		request->timeout = csio->ccb_h.timeout / 1000;
>> -		request->retries = 2;
>> -		request->flags = ATA_R_QUIET|ATA_R_ATAPI|ATA_R_IMMEDIATE;
>> -		hcb->flags |= AUTOSENSE;
>> +	    static const int8_t ccb[16] = { ATAPI_REQUEST_SENSE, 0, 0, 0,
>> +		sizeof(struct atapi_sense), 0, 0, 0, 0, 0, 0,
>> +		0, 0, 0, 0, 0 };
>> +
>> +	    bcopy (ccb, request->u.atapi.ccb, sizeof ccb);
>> +	    request->data = (caddr_t)&csio->sense_data;
>> +	    request->bytecount = sizeof(struct atapi_sense);
>> +	    request->transfersize = min(request->bytecount, 65534);
>> +	    request->timeout = csio->ccb_h.timeout / 1000;
>> +	    request->retries = 2;
>> +	    request->flags = ATA_R_QUIET|ATA_R_ATAPI|ATA_R_IMMEDIATE;
>> +	    hcb->flags |= AUTOSENSE;
>>  
>> -		ata_queue_request(request);
>> -		return;
>> +	    ata_queue_request(request);
>> +	    return;
>>  #else
>> -		/*
>> -		 * Use auto-sense data from the ATA layer, if it has
>> -		 * issued a REQUEST SENSE automatically and that operation
>> -		 * returned without error.
>> -		 */
>> -		if (request->u.atapi.saved_cmd != 0 && request->error == 0) {
>> -		    bcopy (&request->u.atapi.sense, &csio->sense_data, sizeof(struct atapi_sense));
>> -		    csio->ccb_h.status |= CAM_AUTOSNS_VALID;
>> -		}
>> +	    /* The ATA driver has already requested sense for us. */
>> +	    if (request->error == 0) {
>> +		/* The ATA autosense suceeded. */
>> +		bcopy (&request->u.atapi.sense, &csio->sense_data, sizeof(struct atapi_sense));
>> +		csio->ccb_h.status |= CAM_AUTOSNS_VALID;
>>  	    }
>>  #endif
>>  	}
> 

My best guess is that request->u.atapi.saved_cmd isn't getting preserved
when ata_completed() does an automatic REQUEST_SENSE.  Not sure if this
is true or why it would happen.  But if that's the case, then CAM is
going to manually request sense, which atapi-cam and ata will likely
treat as a normal DMA-capable command.  Note that the autosense code in
the ATA driver disables DMA for the REQUEST_SENSE command.  This might
be a key issue; the drive might be getting very unhappy with a
DMA-flagged REQUEST_SENSE command, especially if it's already in a
CHECK_CONDITION state.  This unhappiness might be leading to the
interrupt storm and observed deadlock on UP systems.

With the patch above, sense info is reported to CAM regardless of the
contents of saved_cmd, preventing CAM from generating the troublesome
REQUEST_SENSE on its own.

Oh hell, I know exactly what the problem is!  The opcode for a
TEST_UNIT_READY is 0x00.  This is probably the command that is
generating the CHECK_CONDITION, so saved_cmd is legitimately 0 after
the autosense and the "saved_cmd != 0" check throws the sense data
away.  The test for saved_cmd is entirely bogus.  What really needs to
happen is for ATA to have an "autosense valid" flag in the request.
But without that, the best that you can do is to just ignore the
contents of saved_cmd and also zero out request->u.atapi.sense before
issuing every command.
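
To make the opcode collision concrete, here is a minimal sketch of the
three checks discussed above.  It is not the real driver code:
struct toy_request, the toy_* helpers, SENSE_LEN and the autosense_valid
field are all hypothetical stand-ins invented for illustration.  Only
the two conditions taken from the diff (with and without the saved_cmd
test) mirror the actual source.

```c
#include <stdint.h>
#include <string.h>

#define OP_TEST_UNIT_READY 0x00   /* the opcode that collides with "unset" */
#define OP_REQUEST_SENSE   0x03

#define SENSE_LEN 18              /* illustrative size, not sizeof(struct atapi_sense) */

/* Hypothetical, simplified stand-in for the ATAPI part of an ATA request. */
struct toy_request {
    uint8_t saved_cmd;        /* opcode of the command that failed */
    int     error;            /* 0 if the automatic REQUEST SENSE succeeded */
    int     autosense_valid;  /* hypothetical flag; assumed, not in the driver */
    uint8_t sense[SENSE_LEN];
};

/* Check from rev. 1.42.2.3: saved_cmd == 0 is read as "no autosense done". */
static int sense_usable_saved_cmd(const struct toy_request *r)
{
    return r->saved_cmd != 0 && r->error == 0;
}

/* Check from rev. 1.42.2.2: ignore saved_cmd.  Only safe if the sense
 * buffer is zeroed before every command (see toy_prepare). */
static int sense_usable_ignore_saved_cmd(const struct toy_request *r)
{
    return r->error == 0;
}

/* Suggested direction: an explicit "autosense valid" flag. */
static int sense_usable_flag(const struct toy_request *r)
{
    return r->autosense_valid && r->error == 0;
}

/* Reset per-command state, including the sense buffer. */
static void toy_prepare(struct toy_request *r)
{
    memset(r->sense, 0, sizeof r->sense);
    r->saved_cmd = 0;
    r->error = 0;
    r->autosense_valid = 0;
}

/* Simulate the ATA layer doing an automatic REQUEST SENSE after the
 * command with the given opcode returned CHECK CONDITION. */
static void toy_autosense(struct toy_request *r, uint8_t opcode)
{
    r->saved_cmd = opcode;
    r->sense[0] = 0x70;       /* fixed-format sense, current error */
    r->sense[2] = 0x02;       /* sense key: NOT READY */
    r->autosense_valid = 1;
    r->error = 0;
}
```

Calling toy_prepare() and then toy_autosense() with OP_TEST_UNIT_READY
leaves saved_cmd at 0, so sense_usable_saved_cmd() rejects sense data
that is actually valid, while the other two checks accept it.  The flag
variant is the only one that does not depend on the buffer having been
zeroed beforehand.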

Scott


