Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 30 Dec 2021 15:10:16 -0500
From:      Alexander Motin <mav@FreeBSD.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        =?UTF-8?Q?Edward_Tomasz_Napiera=c5=82a?= <trasz@freebsd.org>, scsi@FreeBSD.org, Ken Merry <ken@freebsd.org>
Subject:   Re: iSCSI target: Handling in-flight requests during ctld shutdown
Message-ID:  <7f1a1a65-199d-858a-792c-42871d1df13e@FreeBSD.org>
In-Reply-To: <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org>
References:  <fd383f6f-5a19-e2bb-5383-e559271eb3cd@FreeBSD.org> <b6c090ac-6cb0-6173-422d-9aef0b37b8ee@FreeBSD.org> <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 30.12.2021 14:29, John Baldwin wrote:
> On 12/29/21 1:57 PM, Alexander Motin wrote:
>> The HARDWARE ERROR is obviously not expected by the initiator.  It
>> should better not be leaked after we decided to kill the connection.
>> Initiator may retry it and still work happily after reconnect, but
>> cleaner would be to not rely on that.  cfiscsi_session_terminate_tasks()
>> aborts all running commands by CTL_TASK_I_T_NEXUS_RESET, that make them
>> not return statuses to initiator, but I suppose this is the other side
>> of the race now.
> 
> Hmm, I wonder if we should be setting CTL_FLAG_ABORT instead of setting the
> port_status when aborting an I/O?  The comment in ctl_frontend_iscsi.c 
> claims
> the backends check the port_status, but I don't see any checks for 
> port_status
> at all in backends.

The checks were in all backends, until in 2c7dc6bae9fd I've moved them 
to ctl_datamove_done_process().  Backends don't have any special 
knowledge now to process specific data move error codes.  It was 
complete code duplication between backends and I've tried to organize 
that a bit.  Unfortunately we do not have formalized data move error 
statuses in CTL to do more.  It would be much more interesting (and 
complicated ;)) if iSCSI (for example) could report data transfer error 
without killing connection, but that area in general is not very 
specified in SCSI, since usually each transport has own means of recovery.

>  I do see checks for CTL_FLAG_ABORT, and the handler 
> for
> the CTL_TASK_I_T_NEXUS_RESET does set CTL_FLAG_ABORT on pending requests.
> 
> For the tasks in sciscsi_session_terminate_tasks(), those should already 
> have
> CTL_FLAG_ABORT set anyway, but it wouldn't hurt if it were set again by
> cfiscsi_data_wait_abort().  For the the cfiscsi_task_management_done 
> case I'm
> less certain, but I suspect there too that returning an internal error 
> status
> back to the initiator is not expected and that it would be better to 
> just set
> CTL_FLAG_ABORT and drop any response?

cfiscsi_data_wait_abort() does only one specific thing -- it aborts 
waiting for data and completes the data move operation, allowing backend 
to continue (or at least free resources, etc).  It is only 
cfiscsi_session_terminate_tasks() knows that it should abort data moves 
while it aborts tasks, since it knows that the connection is dead. 
cfiscsi_task_management_done() is another special case, since iSCSI 
explicitly mentions that there will be no further exchanges related to 
that task, so the data move should explicitly return.  But in general 
case I think it would be nice to not combine those two facts of aborting 
task and aborting data transfer, since the last may not necessary imply 
the first, as the first does not imply the last.

It is tempting to just set the flag, but I suspect it may cause more 
problems (I am worrying about HA peer update).  I'd prefer the race 
inside the transport to be handled inside the transport.  May be we 
could just add another flag to be set under the session lock inside 
cfiscsi_session_terminate_tasks() after it aborted all the tasks, but 
before it start to abort transfers itself and use it inside 
cfiscsi_datamove_out() instead of cs_terminating.

-- 
Alexander Motin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?7f1a1a65-199d-858a-792c-42871d1df13e>