Date: Wed, 04 Jan 2012 17:50:06 +0200
From: Alexander Motin <mav@FreeBSD.org>
To: freebsd-scsi@freebsd.org
Subject: Thoughts about CAM SWI
Message-ID: <4F04752E.8010805@FreeBSD.org>
Hi.

The question of the extra context switch in CAM, from the interrupt
thread to the CAM SWI on command completion, has been raised many times.
I've tried to analyze how it can be avoided.

The main problem I see is reentrancy of CAM itself, the peripheral
drivers and the SIMs. In the general case it looks unsafe to handle
command completion directly from the xpt_action() call, as the caller
may not expect state changes at that point. It also looks unsafe to
handle command completion directly from xpt_done() in the interrupt
thread, as the SIM may not expect new requests coming in at that point,
for example if some error recovery is in progress, or if it needs its
state to be consistent to handle other completed (possibly with errors)
requests. The most complicated case I see is xpt_done() being called
from inside the SIM's sim_action() method. In that case direct command
completion may recursively cause problems for all sides.

I've tried to find places where this call loop can be broken. The first
place I thought about was the xpt_action() call. It looks possible to
turn the code upside down: handle completions directly, but not submit
new requests to the controller immediately if we are called from the SIM
context (how to identify that is another question). The problem is that
there is a set of non-queued SIM requests that may affect its state, for
example XPT_SET_TRAN_SETTINGS or XPT_RESET_BUS. The positive side is
that we may hope to avoid modifying the SIMs.

The second idea is to allow SIMs to be more reentrant. It can be done in
two ways, which can coexist, at the SIM author's choice:

- adding another version of xpt_done(), like xpt_done_direct(), that
  would allow immediate command completion if the SIM is sure it can
  handle reentrancy at this point.
- adding some functions, like xpt_batch_start() and xpt_batch_done(),
  instructing CAM to queue xpt_done() calls between them as usual, but
  not to run the SWI; instead they are handled directly in the
  xpt_batch_done() call, which is supposed to be made at a point where
  the SIM state is consistent and permits reentrancy.

This approach is very simple from the CAM point of view, but needs small
SIM modifications, while keeping full compatibility with unmodified
SIMs.

I've made a patch implementing the last approach for all ATA SIMs:
http://people.freebsd.org/~mav/cam_batch.patch

The results can be illustrated by a simple synthetic test:

dd if=/dev/ada0 of=/dev/null bs=512 count=500000

where ada0 is an Intel SATA SSD:

x before
+ after
[ministat distribution plot]
    N           Min           Max        Median           Avg        Stddev
x   8      16186671      16270294      16222511      16227889     28506.033
+   8      16802958      16929107      16832671      16843965     43561.511
Difference at 95.0% confidence
        616076 +/- 39480.5
        3.7964% +/- 0.243288%
        (Student's t, pooled s = 36811.7)

The total number of context switches in the system dropped from 315K to
260K. Removing those context switches gave a 3.8% speedup. Maybe that is
not much, and it affects only the specific case of sequential workloads
with a very high request rate, but it is almost free.

-- 
Alexander Motin
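
The batching idea from the second list item above can be modeled in user space roughly as follows. This is only a minimal sketch under stated assumptions, not the code from cam_batch.patch: struct toy_sim and the toy_*() functions are hypothetical stand-ins invented for illustration, and they do not reproduce the real CAM structures or the signatures of xpt_done(), xpt_batch_start() or xpt_batch_done(). The counter swi_wakeups stands for the hand-off to the SWI thread that the batch calls are meant to avoid.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_MAX 64

/* Hypothetical stand-in for a SIM; NOT the real struct cam_sim. */
struct toy_sim {
    int      pending[TOY_QUEUE_MAX]; /* ids of finished requests not yet processed */
    size_t   npending;
    int      batching;               /* nonzero between batch_start and batch_done */
    unsigned swi_wakeups;            /* times the SWI thread would have been woken */
    unsigned completed;              /* completions delivered to the upper layer */
};

/* Drain the completion queue, delivering each entry to the upper layer. */
static void toy_run_completions(struct toy_sim *sim)
{
    for (size_t i = 0; i < sim->npending; i++)
        sim->completed++;            /* a real system would call the CCB callback */
    sim->npending = 0;
}

/* Models xpt_done(): always queue; outside a batch, also hand off to the SWI. */
static void toy_done(struct toy_sim *sim, int id)
{
    assert(sim->npending < TOY_QUEUE_MAX);
    sim->pending[sim->npending++] = id;
    if (!sim->batching) {
        sim->swi_wakeups++;          /* the extra context switch happens here */
        toy_run_completions(sim);    /* stands for the SWI thread running later */
    }
}

/* Models xpt_batch_start(): from now on completions are only queued. */
static void toy_batch_start(struct toy_sim *sim)
{
    sim->batching = 1;
}

/*
 * Models xpt_batch_done(): the caller promises its state is consistent,
 * so queued completions are processed directly in the caller's context
 * without waking the SWI.
 */
static void toy_batch_done(struct toy_sim *sim)
{
    sim->batching = 0;
    toy_run_completions(sim);
}
```

In this model an interrupt handler that reports three completions through toy_done() alone counts three SWI wakeups, while the same three calls wrapped in toy_batch_start() and toy_batch_done() count none; an unmodified caller that never uses the batch calls keeps the old behavior.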