From: "Kenneth D. Merry"
To: current@FreeBSD.org, scsi@FreeBSD.org
Date: Wed, 4 Jan 2012 21:39:34 -0700
Subject: CAM Target Layer available
Message-ID: <20120105043934.GA37322@nargothrond.kdm.org>

The CAM Target Layer (CTL) is now available for testing. I am planning to
commit it to head next week, barring any major objections.

CTL is a disk and processor device emulation subsystem originally written
for Copan Systems under Linux starting in 2003. It has been shipping in
Copan (now SGI) products since 2005.
It was ported to FreeBSD in 2008, and thanks to an agreement between SGI
(who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is
available under a BSD-style license. The intent behind the agreement was
that Spectra would work to get CTL into the FreeBSD tree.

The attached patches are against FreeBSD/head as of SVN change 229516. They
are also located here:

http://people.freebsd.org/~ken/ctl/ctl_diffs.20120104.4.txt.gz

The code is not "perfect" (few pieces of software are), but is in good
shape from a functional standpoint. My intent is to get it out there for
other folks to use, and perhaps help with improvements.

There are a few other CAM changes included with these diffs, some of which
will be committed separately from CTL, some concurrently. This is a quick
summary:

 - Fix a panic in the da(4) driver when a drive disappears on boot.
 - Fix locking in the CAM EDT traversal code.
 - Add an optional sysctl/tunable (disabled by default) to suppress
   "duplicate" devices. This most frequently shows up with dual ported
   SAS drives.
 - Add some very basic error injection into the da(4) driver.
 - Bump the length field in the SCSI INQUIRY CDB to 2 bytes to line up with
   more recent SCSI specs.

CTL Features:
============

 - Disk and processor device emulation.
 - Tagged queueing
 - SCSI task attribute support (ordered, head of queue, simple tags)
 - SCSI implicit command ordering support. (e.g. if a read follows a mode
   select, the read will be blocked until the mode select completes.)
 - Full task management support (abort, LUN reset, target reset, etc.)
 - Support for multiple ports
 - Support for multiple simultaneous initiators
 - Support for multiple simultaneous backing stores
 - Persistent reservation support
 - Mode sense/select support
 - Error injection support
 - High Availability support (1)
 - All I/O handled in-kernel, no userland context switch overhead.

(1) HA Support is just an API stub, and needs much more to be fully
    functional. See the to-do list below.

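To make the implicit command ordering item in the feature list above a bit
more concrete, here is a toy Python model of the one case mentioned there (a
read queued behind a mode select). It is only a sketch under stated
assumptions: the LunQueue and Command classes and the SERIALIZING set are
hypothetical names made up for this example, and the single rule it applies
is far simpler than whatever CTL actually does in the kernel.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified rule for this sketch only: a command named here
# must finish before any later command on the same LUN may start.
# This is NOT CTL's real serialization table.
SERIALIZING = {"MODE SELECT"}


@dataclass
class Command:
    name: str
    state: str = "queued"  # queued -> running -> done


@dataclass
class LunQueue:
    commands: list = field(default_factory=list)

    def submit(self, name):
        """Queue a command in arrival order and start it if allowed."""
        cmd = Command(name)
        self.commands.append(cmd)
        self._schedule()
        return cmd

    def complete(self, cmd):
        """Mark a command finished and re-check anything it was blocking."""
        cmd.state = "done"
        self._schedule()

    def _schedule(self):
        # Walk in arrival order. A queued command may start only if no
        # earlier, unfinished command is a serializing one.
        barrier = False
        for cmd in self.commands:
            if cmd.state == "done":
                continue
            if cmd.state == "queued" and not barrier:
                cmd.state = "running"
            if cmd.name in SERIALIZING:
                barrier = True


lun = LunQueue()
mode_select = lun.submit("MODE SELECT")
read = lun.submit("READ")
print(mode_select.state, read.state)  # running queued
lun.complete(mode_select)
print(read.state)                     # running
```

The read stays queued until the mode select completes, which is the behavior
the feature list describes. The sketch only models that one direction; it
does not try to cover task attributes, aborts, or multiple initiators.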
Configuring and Running CTL:
===========================

 - After applying the CTL patchset to your tree, build world and install it
   on your target system.

 - Add 'device ctl' to your kernel configuration file.

 - If you're running with an 8Gb or 4Gb QLogic FC board, add
   'options ISP_TARGET_MODE' to your kernel config file. 'device ispfw' or
   loading the ispfw module is also recommended.

 - Rebuild and install a new kernel.

 - Reboot with the new kernel.

 - To add a LUN with the RAM disk backend:

        ctladm create -b ramdisk -s 10485760000000000000
        ctladm port -o on

 - You should now see the CTL disk LUN through camcontrol devlist:

        scbus6 on ctl2cam0 bus 0:
                                           at scbus6 target 1 lun 0 (da24,pass32)
        <>                                 at scbus6 target -1 lun -1 ()

   This is visible through the CTL CAM SIM. This allows using CTL without
   any physical hardware. You should be able to issue any normal SCSI
   commands to the device via the pass(4)/da(4) devices.

   If any target-capable HBAs are in the system (e.g. isp(4)), and have
   target mode enabled, you should now also be able to see the CTL LUNs via
   that target interface.

   Note that all CTL LUNs are presented to all frontends. There is no LUN
   masking, or separate, per-port configuration.

 - Note that the ramdisk backend is a "fake" ramdisk. That is, it is
   backed by a small amount of RAM that is used for all I/O requests. This
   is useful for performance testing, but not for any data integrity tests.

 - To add a LUN with the block/file backend:

        truncate -s +1T myfile
        ctladm create -b block -o file=myfile
        ctladm port -o on

 - You can also see a list of LUNs and their backends like this:

        # ctladm devlist
        LUN Backend       Size (Blocks)   BS Serial Number    Device ID
          0 block            2147483648  512 MYSERIAL   0     MYDEVID   0
          1 block            2147483648  512 MYSERIAL   1     MYDEVID   1
          2 block            2147483648  512 MYSERIAL   2     MYDEVID   2
          3 block            2147483648  512 MYSERIAL   3     MYDEVID   3
          4 block            2147483648  512 MYSERIAL   4     MYDEVID   4
          5 block            2147483648  512 MYSERIAL   5     MYDEVID   5
          6 block            2147483648  512 MYSERIAL   6     MYDEVID   6
          7 block            2147483648  512 MYSERIAL   7     MYDEVID   7
          8 block            2147483648  512 MYSERIAL   8     MYDEVID   8
          9 block            2147483648  512 MYSERIAL   9     MYDEVID   9
         10 block            2147483648  512 MYSERIAL  10     MYDEVID  10
         11 block            2147483648  512 MYSERIAL  11     MYDEVID  11

 - You can see the LUN type and backing store for block/file backend LUNs
   like this:

        # ctladm devlist -v
        LUN Backend       Size (Blocks)   BS Serial Number    Device ID
          0 block            2147483648  512 MYSERIAL   0     MYDEVID   0
              lun_type=0
              num_threads=14
              file=testdisk0
          1 block            2147483648  512 MYSERIAL   1     MYDEVID   1
              lun_type=0
              num_threads=14
              file=testdisk1
          2 block            2147483648  512 MYSERIAL   2     MYDEVID   2
              lun_type=0
              num_threads=14
              file=testdisk2
          3 block            2147483648  512 MYSERIAL   3     MYDEVID   3
              lun_type=0
              num_threads=14
              file=testdisk3
          4 block            2147483648  512 MYSERIAL   4     MYDEVID   4
              lun_type=0
              num_threads=14
              file=testdisk4
          5 block            2147483648  512 MYSERIAL   5     MYDEVID   5
              lun_type=0
              num_threads=14
              file=testdisk5
          6 block            2147483648  512 MYSERIAL   6     MYDEVID   6
              lun_type=0
              num_threads=14
              file=testdisk6
          7 block            2147483648  512 MYSERIAL   7     MYDEVID   7
              lun_type=0
              num_threads=14
              file=testdisk7
          8 block            2147483648  512 MYSERIAL   8     MYDEVID   8
              lun_type=0
              num_threads=14
              file=testdisk8
          9 block            2147483648  512 MYSERIAL   9     MYDEVID   9
              lun_type=0
              num_threads=14
              file=testdisk9
         10 ramdisk                   0    0 MYSERIAL   0     MYDEVID   0
              lun_type=3
         11 ramdisk     204800000000000  512 MYSERIAL   1     MYDEVID   1
              lun_type=0

 - To see system throughput, use ctlstat(8):

        # ctlstat -t
                System Read             System Write            System Total
          ms   KB/t  tps   MB/s     ms   KB/t  tps   MB/s     ms   KB/t  tps   MB/s
        1.71  50.64    0   0.00   1.24 512.00    0   0.03   2.05 245.20    0   0.03   1.0%
        0.00   0.00    0   0.00   1.12 512.00  564 282.00   1.12 512.00  564 282.00   8.4%
        0.00   0.00    0   0.00   1.27 512.00  536 268.00   1.27 512.00  536 268.00  10.0%
        0.00   0.00    0   0.00   1.27 512.00  535 267.50   1.27 512.00  535 267.50   7.6%
        0.00   0.00    0   0.00   1.12 512.00  520 260.00   1.12 512.00  520 260.00  10.9%
        0.00   0.00    0   0.00   1.02 512.00  538 269.00   1.02 512.00  538 269.00  10.9%
        0.00   0.00    0   0.00   1.10 512.00  557 278.50   1.10 512.00  557 278.50   9.6%
        0.00   0.00    0   0.00   1.12 512.00  561 280.50   1.12 512.00  561 280.50  10.4%
        0.00   0.00    0   0.00   1.14 512.00  502 251.00   1.14 512.00  502 251.00   6.5%
        0.00   0.00    0   0.00   1.31 512.00  527 263.50   1.31 512.00  527 263.50  10.5%
        0.00   0.00    0   0.00   1.07 512.00  560 280.00   1.07 512.00  560 280.00  10.3%

CTL To Do List:
==============

 - Use devstat(9) for CTL's statistics collection. CTL uses a home-grown
   statistics collection system that is similar to devstat(9). ctlstat
   should be retired in favor of iostat, etc., once aggregation modes are
   available in iostat to match the behavior of ctlstat -t and dump modes
   are available to match the behavior of ctlstat -d/ctlstat -J.

 - ZFS ARC backend for CTL. Since ZFS copies all I/O into the ARC
   (Adaptive Replacement Cache), running the block/file backend on top of a
   ZFS-backed zvol or file will involve an extra set of copies. The
   optimal solution for backing targets served by CTL with ZFS would be to
   allocate buffers out of the ARC directly, and DMA to/from them directly.
   That would eliminate an extra data buffer allocation and copy.

 - Switch CTL over to using CAM CCBs instead of its own union ctl_io. This
   will likely require a significant amount of work, but will eliminate
   another data structure in the stack, more memory allocations, etc. This
   will also require changes to the CAM CCB structure to support CTL.

 - Full-featured High Availability support. The HA API that is in ctl_ha.h
   is essentially a renamed version of Copan's HA API.
   There is no substance to it, but it remains in CTL to show what needs
   to be done to implement active/active HA from a CTL standpoint. The
   things that would need to be done include:

        - A kernel level software API for message passing as well as DMA
          between at least two nodes.
        - Hardware support and drivers for inter-node communication. This
          could be as simple as Ethernet hardware and drivers.
        - A "supervisor", or startup framework to control and coordinate
          HA startup, failover (going from active/active to single mode),
          and failback (going from single mode to active/active).
        - HA support in other components of the stack. The goal behind HA
          is that one node can fail and another node can seamlessly take
          over handling I/O requests. This requires support from pretty
          much every component in the storage stack, from top to bottom.
          CTL is one piece of it, but you also need support in the RAID
          stack/filesystem/backing store. You also need full configuration
          mirroring, and all peer nodes need to be able to talk to the
          underlying storage hardware.

Thanks,

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG