From: Scott Long
Date: Fri, 20 Apr 2007 03:03:45 -0600
To: scsi@freebsd.org
Subject: MPSAFE CAM, MPSAFE drivers

All,

I'm happy to announce that CAM is now MPSAFE, thanks to the help of many
people and sponsorship by Yahoo! The work is in FreeBSD CVS now and can
be obtained by checking out the HEAD/7-CURRENT branch. It will be part
of the upcoming FreeBSD 7.0 release this year.
Only the AHC and AHD drivers are MPSAFE at the moment, but hopefully
more will follow in the coming months. Below is a document describing
the locking approach, and instructions for locking CAM/SIM drivers that
are not yet MPSAFE.

Locking theory
--------------

The following describes the basics of the locking strategy in CAM itself
and how that applies to the SIM drivers (SCSI hardware drivers)
underneath it. While CAM is MPSAFE, only a few SIMs have been made
MPSAFE so far. The rest are mostly unchanged and are allowed to continue
to operate just as they did before. I hope that other developers and
interested users will step in and help make these drivers MPSAFE, as
it's too much work for me alone.

Being MPSAFE doesn't necessarily make the CAM subsystem itself faster.
The locking is still fairly monolithic on a per SIM instance level, and
there isn't much parallelism for operations within each instance.
Multiple SIM instances, i.e. multiple buses, do operate almost
completely independently of each other now, so there is full parallelism
there. However, being MPSAFE does eliminate contention with the other
parts of the OS that are still under Giant, and this is still a huge
win. Testing moderate to heavy loads on multi-core systems has shown a
significant decrease in contention on the Giant lock, while showing only
minimal new contention on the CAM locks. This lowered contention
translates into less system time wasted by the CPUs, and thus more
cycles for useful work as well as less latency.

There are now 4 basic locks in CAM, 3 of which are:

xpt_lock      - Protects the XPT softc, periph, and SIM instances
xpt_topo_lock - Protects the global peripheral and bus lists
cam_simq_lock - Protects the list of SIMs to be processed in the camisr

These 3 locks are internal to the CAM core and have little bearing on
the operation of SIMs. None of these locks will be held when calling
into a SIM, and the SIM has no need to access them either.

The 4th lock is the SIM lock.
This is a non-recursive sleep mutex (MTX_DEF) that the SIM instance uses
to protect its internal data structures and operations. It is also
exported up to CAM when calling cam_sim_alloc(), and is used by CAM to
protect target, device, and peripheral objects, as well as SIM and
device queues. Every entry from CAM into the SIM will be done with this
lock held. The SIM is welcome to unlock it when it needs to, but it must
be held when calling back into most CAM functions. It is the primary
lock for normal I/O flow throughout CAM starting at the top of the stack
in the periph driver. The flow looks like this:

periph_strategy      sim->mtx
     |                  |
xpt_schedule            |
     |                  |
periph_start            |
     |                  |
xpt_action              |
     |                  |
sim_action              +

On completion:

sim_isr              sim->mtx
   |                    |
xpt_done                |cam_simq_lock
   |                    |
swi_sched               +

camisr               cam_simq_lock
   |
camisr_runqueue      sim->mtx
   |                    |
periph_done             +

A SIM that is not MPSAFE exports the Giant mutex (&Giant) in
cam_sim_alloc(). Giant is then treated as a normal mutex by CAM and is
locked and unlocked in the same place as for MPSAFE SIMs. This does not
put all of CAM back under Giant; multiple SIM instances can be
registered, some MPSAFE and some not, and CAM will treat the locking of
each instance separately.

Driver changes
--------------

For non-MPSAFE drivers, a single change was made to the API in the
cam_sim_alloc() function. The function now looks like this:

    struct cam_sim *
    cam_sim_alloc(sim_action_func sim_action, sim_poll_func sim_poll,
                  const char *sim_name, void *softc, u_int32_t unit,
                  struct mtx *mtx, int max_dev_transactions,
                  int max_tagged_dev_transactions,
                  struct cam_devq *queue);

For the "mtx" argument, "&Giant" is used. Everything else in the SIM
stays the same. Some structures have also changed sizes, most notably
"cam_sim", but that is not an issue since source level compatibility is
already affected.

MPSAFE drivers must do the following things:

1. Provide a pointer to a MTX_DEF mutex in cam_sim_alloc().
   The mutex must be allocated and initialized before calling
   cam_sim_alloc(), and must not be destroyed until after calling
   cam_sim_free(). It should not be held while calling cam_sim_alloc().

2. The timeout_ch field in the ccb_hdr structure is no longer available
   for use by the SIM. SIMs must now allocate, initialize, and manage
   their own callout structures. All uses of the timeout() API must be
   switched to the callout() API. See the callout manpage for details
   on this.

3. Add the INTR_MPSAFE flag to bus_setup_intr(). This will prevent
   Giant from being automatically acquired before the driver interrupt
   handler is called.

4. Any busdma tags that allow load deferrals (i.e. return EINPROGRESS)
   must register a non-Giant mutex in bus_dma_tag_create(). This field
   is not inherited from parent tags.

5. If the driver registers a character device with make_dev(), the
   D_NEEDSGIANT flag should be dropped, and appropriate locking added
   to the device entry vectors.

6. If the driver registers any sysctls, all locks must be dropped and
   Giant must be held explicitly when registering and deregistering the
   sysctl nodes. Sysctl handlers will be called with Giant held, and
   appropriate locking should be added under that. No calls into CAM
   should be made from these contexts.

7. Provide appropriate locking in the interrupt handler as well as any
   taskqueue handlers, callout handlers, kthreads, or other detached
   contexts, as appropriate.

8. Ensure that the registered SIM mutex is held when calling all CAM
   entry points. Until recently, the xpt_done() entry point provided
   its own locking and did not require Giant to be held. It still does
   not require Giant, but it does require the SIM lock to be held when
   calling it.

9. Do not hold the SIM mutex or any other mutex when calling
   malloc(M_WAITOK), bus_dmamem_alloc(), or bus_dmamap_create().

10. Any uses of tsleep must be changed to msleep.
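
As a rough illustration of items 7 and 8, here is a userland model of
the rule that a detached context (such as the interrupt handler) takes
the SIM mutex before calling back into CAM. This is not kernel code: a
pthread mutex stands in for the MTX_DEF mutex, and every model_* name is
hypothetical and exists only for this sketch.

```c
/*
 * Userland model only. A pthread mutex stands in for the kernel's
 * MTX_DEF SIM mutex; all model_* names are hypothetical.
 */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct model_sim {
	pthread_mutex_t	mtx;		/* stands in for the registered SIM mutex */
	bool		held;		/* crude ownership flag, single-threaded use only */
	int		completed;	/* completions handed back to "CAM" */
};

static void
model_sim_init(struct model_sim *sim)
{
	pthread_mutex_init(&sim->mtx, NULL);
	sim->held = false;
	sim->completed = 0;
}

static void
model_sim_lock(struct model_sim *sim)
{
	pthread_mutex_lock(&sim->mtx);
	sim->held = true;
}

static void
model_sim_unlock(struct model_sim *sim)
{
	assert(sim->held);	/* unlocking a mutex we do not hold is a bug */
	sim->held = false;
	pthread_mutex_unlock(&sim->mtx);
}

/*
 * Stand-in for a CAM entry point such as xpt_done(): it refuses the
 * call unless the SIM mutex is held (item 8).
 */
static int
model_xpt_done(struct model_sim *sim)
{
	if (!sim->held)
		return (-1);
	sim->completed++;
	return (0);
}

/*
 * Stand-in for an interrupt handler that runs without Giant: it
 * provides its own locking around the CAM callback (item 7).
 */
static int
model_sim_intr(struct model_sim *sim)
{
	int error;

	model_sim_lock(sim);
	error = model_xpt_done(sim);
	model_sim_unlock(sim);
	return (error);
}
```

In this model, calling model_xpt_done() directly on a freshly
initialized model_sim returns -1, while going through model_sim_intr()
returns 0 and bumps the completion count. In a real driver the
equivalent check would presumably be an ownership assertion on the SIM
mutex rather than an error return.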

For multi-function PCI devices where each function represents a bus, a
separate SIM and SIM mutex should be allocated and managed for each
function. Functions that register multiple SIMs should coordinate
locking between those SIMs as needed; the same lock can be registered
for these separate SIMs, at the cost of reduced parallelism between
SIMs. Functions that register a single SIM for multiple buses will have
all of those buses under a single mutex as far as CAM is concerned.

The simplest strategy is to use a single lock per SIM instance. More
complex multi-level or pipelined locking is allowed; the registered SIM
lock can be dropped by the SIM at any point without disrupting the rest
of CAM, so long as no CAM entry points are called with it unlocked. This
will be an area for further research.

Userland changes
----------------

Efforts were made to keep the userland API and ABI unchanged. Thus,
there are no source level changes needed for any tools, libraries, or
apps, nor any need to recompile any of these either.

Future work
-----------

The CAM API will likely undergo some more small changes to support
future work with newbus integration and SAS/SATA/FC transport
modularization. These changes will hopefully be done before FreeBSD 7.0
is released.