Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 31 Dec 2021 10:41:23 -0800
From:      John Baldwin <jhb@FreeBSD.org>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        =?UTF-8?Q?Edward_Tomasz_Napiera=c5=82a?= <trasz@freebsd.org>, scsi@FreeBSD.org, Ken Merry <ken@freebsd.org>
Subject:   Re: iSCSI target: Handling in-flight requests during ctld shutdown
Message-ID:  <10687910-7500-008c-aed5-61f76ae90b3d@FreeBSD.org>
In-Reply-To: <2ce80c64-6954-21d0-74eb-eeb88e289350@FreeBSD.org>
References:  <fd383f6f-5a19-e2bb-5383-e559271eb3cd@FreeBSD.org> <b6c090ac-6cb0-6173-422d-9aef0b37b8ee@FreeBSD.org> <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org> <7f1a1a65-199d-858a-792c-42871d1df13e@FreeBSD.org> <68ad0ffd-79fb-973d-e5ba-92deee352d44@FreeBSD.org> <2ce80c64-6954-21d0-74eb-eeb88e289350@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 12/30/21 3:06 PM, Alexander Motin wrote:
> On 30.12.2021 17:41, John Baldwin wrote:
>> On 12/30/21 12:10 PM, Alexander Motin wrote:
>>> The checks were in all backends, until in 2c7dc6bae9fd I've moved them
>>> to ctl_datamove_done_process().  Backends don't have any special
>>> knowledge now to process specific data move error codes.  It was
>>> complete code duplication between backends and I've tried to organize
>>> that a bit.  Unfortunately we do not have formalized data move error
>>> statuses in CTL to do more.  It would be much more interesting (and
>>> complicated ;)) if iSCSI (for example) could report data transfer error
>>> without killing connection, but that area in general is not very
>>> specified in SCSI, since usually each transport has own means of
>>> recovery.
>>
>> Hmmm, but this commit seems to have actually broken this case.  That is,
>> now the possibly uninitialized data is written to the backing store:
>>
>>           /*
>> -        * We set status at this point for read commands, and write
>> -        * commands with errors.
>> +        * We set status at this point for read and compare commands.
>>            */
>> -       if (io->io_hdr.flags & CTL_FLAG_ABORT) {
>> -               ;
>> -       } else if ((io->io_hdr.port_status != 0) &&
>> -           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE ||
>> -            (io->io_hdr.status & CTL_STATUS_MASK) == CTL_SUCCESS)) {
>> -               ctl_set_internal_failure(&io->scsiio, /*sks_valid*/ 1,
>> -                   /*retry_count*/ io->io_hdr.port_status);
>> -       } else if (io->scsiio.kern_data_resid != 0 &&
>> -           (io->io_hdr.flags & CTL_FLAG_DATA_MASK) == CTL_FLAG_DATA_OUT &&
>> -           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE ||
>> -            (io->io_hdr.status & CTL_STATUS_MASK) == CTL_SUCCESS)) {
>> -               ctl_set_invalid_field_ciu(&io->scsiio);
>> -       } else if ((io->io_hdr.port_status == 0) &&
>> -           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE)) {
>> +       if ((io->io_hdr.flags & CTL_FLAG_ABORT) == 0 &&
>> +           (io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE) {
>>                   lbalen = ARGS(beio->io);
>>                   if (lbalen->flags & CTL_LLF_READ) {
>>                           ctl_set_success(&io->scsiio);
>>
>> As ctl_datamove_done_process() will call ctl_set_internal_failure(), but
>> ctl_datamove_done() will still call be_move_done which runs the code above.
>> The code above doesn't check port_status anywhere, so it will still happily
>> write the stale data to the backend unless CTL_FLAG_ABORT is set.  Do the
>> backends need to be checking port_status == 0 again in addition to
>> CTL_FLAG_ABORT then?
> 
> No. Backends check for ((io->io_hdr.status & CTL_STATUS_MASK) ==
> CTL_STATUS_NONE) .  Notice that you've received that HARDWARE ERROR
> status, not garbage.

Ah, via the ctl_set_sense().

>>>>     I do see checks for CTL_FLAG_ABORT, and the handler
>>>> for
>>>> the CTL_TASK_I_T_NEXUS_RESET does set CTL_FLAG_ABORT on pending
>>>> requests.
>>>>
>>>> For the tasks in sciscsi_session_terminate_tasks(), those should already
>>>> have
>>>> CTL_FLAG_ABORT set anyway, but it wouldn't hurt if it were set again by
>>>> cfiscsi_data_wait_abort().  For the the cfiscsi_task_management_done
>>>> case I'm
>>>> less certain, but I suspect there too that returning an internal error
>>>> status
>>>> back to the initiator is not expected and that it would be better to
>>>> just set
>>>> CTL_FLAG_ABORT and drop any response?
>>>
>>> cfiscsi_data_wait_abort() does only one specific thing -- it aborts
>>> waiting for data and completes the data move operation, allowing backend
>>> to continue (or at least free resources, etc).  It is only
>>> cfiscsi_session_terminate_tasks() knows that it should abort data moves
>>> while it aborts tasks, since it knows that the connection is dead.
>>> cfiscsi_task_management_done() is another special case, since iSCSI
>>> explicitly mentions that there will be no further exchanges related to
>>> that task, so the data move should explicitly return.  But in general
>>> case I think it would be nice to not combine those two facts of aborting
>>> task and aborting data transfer, since the last may not necessary imply
>>> the first, as the first does not imply the last.
>>>
>>> It is tempting to just set the flag, but I suspect it may cause more
>>> problems (I am worrying about HA peer update).  I'd prefer the race
>>> inside the transport to be handled inside the transport.  May be we
>>> could just add another flag to be set under the session lock inside
>>> cfiscsi_session_terminate_tasks() after it aborted all the tasks, but
>>> before it start to abort transfers itself and use it inside
>>> cfiscsi_datamove_out() instead of cs_terminating.
>>
>> I think though that even that flag wouldn't quite help as you would still
>> have the problem that the internal_error response would still be sent on
>> the wire if we don't abort the entire task vs aborting just the move.
> 
> No.  cfiscsi_datamove_out() called before the new flag is set would
> still try to send R2T over the dying connection to be aborted by the
> cfiscsi_session_terminate_tasks() few milliseconds later.
> cfiscsi_data_wait_abort() would only be needed if
> cfiscsi_session_terminate_tasks() has already passed through the data
> waiters list and waiting for the last tasks completion.

So I think what I was missing is that I had assumed in the race case
that the task was not visible when the NEXUS_I_T_RESET ran, but I think
from re-reading now that the task has to have been in the lun's OOA
queue as we don't permit queueing more tasks due to LUN_RESERVED being
cleared, so I think that means that even in the race case the task
has been aborted.  Perhaps then the code in cfiscsi_datamove_out can
just check CTL_FLAG_ABORT instead of cs_terminating?  That would
function similar to your proposed new flag I think assuming it is correct
that the task for any I/O being passed to cfiscsi_datamove_out concurrent
with cfiscsi_session_terminate_tasks must have been "visible" on the OAA
queue and thus aborted by the handler?

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?10687910-7500-008c-aed5-61f76ae90b3d>