Date:      Wed, 4 Jan 2012 21:53:11 -0700
From:      "Kenneth D. Merry" <ken@FreeBSD.org>
To:        current@FreeBSD.org, scsi@FreeBSD.org
Subject:   CAM Target Layer available
Message-ID:  <20120105045311.GA40378@nargothrond.kdm.org>

The CAM Target Layer (CTL) is now available for testing.  I am planning to
commit it to head next week, barring any major objections.

CTL is a disk and processor device emulation subsystem originally written
for Copan Systems under Linux starting in 2003.  It has been shipping in
Copan (now SGI) products since 2005.

It was ported to FreeBSD in 2008, and thanks to a 2010 agreement between SGI
(which acquired Copan's assets that year) and Spectra Logic, CTL is available
under a BSD-style license.  The intent behind the agreement was that Spectra
would work to get CTL into the FreeBSD tree.

The patches are against FreeBSD/head as of SVN change 229516 and are
located here:

http://people.freebsd.org/~ken/ctl/ctl_diffs.20120104.4.txt.gz

The code is not "perfect" (few pieces of software are), but is in good
shape from a functional standpoint.  My intent is to get it out there for
other folks to use, and perhaps help with improvements.

There are a few other CAM changes included with these diffs, some of which
will be committed separately from CTL, some concurrently.  This is a quick
summary:

 - Fix a panic in the da(4) driver when a drive disappears on boot.
 - Fix locking in the CAM EDT traversal code.
 - Add an optional sysctl/tunable (disabled by default) to suppress
   "duplicate" devices.  This most frequently shows up with dual ported SAS
   drives.
 - Add some very basic error injection into the da(4) driver.
 - Bump the length field in the SCSI INQUIRY CDB to 2 bytes to line up with
   more recent SCSI specs.

CTL Features:
=============

 - Disk and processor device emulation.
 - Tagged queueing
 - SCSI task attribute support (ordered, head of queue, simple tags)
 - SCSI implicit command ordering support.  (e.g. if a read follows a mode
   select, the read will be blocked until the mode select completes.)
 - Full task management support (abort, LUN reset, target reset, etc.)
 - Support for multiple ports
 - Support for multiple simultaneous initiators
 - Support for multiple simultaneous backing stores
 - Persistent reservation support
 - Mode sense/select support
 - Error injection support
 - High Availability support (1)
 - All I/O handled in-kernel, no userland context switch overhead.

(1) HA Support is just an API stub, and needs much more to be fully
    functional.  See the to-do list below.

Configuring and Running CTL:
============================

 - After applying the CTL patchset to your tree, build world and install it
   on your target system.

 - Add 'device ctl' to your kernel configuration file.

 - If you're running with an 8Gb or 4Gb QLogic FC board, add
   'options ISP_TARGET_MODE' to your kernel config file.  Adding 'device ispfw'
   to the config or loading the ispfw module is also recommended.
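
   For example, a kernel config fragment covering both of the bullets above
   might look like this (only the ctl line is strictly required; the isp,
   ispfw, and ISP_TARGET_MODE lines apply to the QLogic FC case):

        device          ctl
        device          isp
        device          ispfw
        options         ISP_TARGET_MODE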

 - Rebuild and install a new kernel.

 - Reboot with the new kernel.

 - To add a LUN with the RAM disk backend:

	ctladm create -b ramdisk -s 10485760000000000000
	ctladm port -o on
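
   You can double-check that the LUN was created using ctladm's devlist
   command, which is shown in more detail further below:

        ctladm devlist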

 - You should now see the CTL disk LUN through camcontrol devlist:

scbus6 on ctl2cam0 bus 0:
<FREEBSD CTLDISK 0001>             at scbus6 target 1 lun 0 (da24,pass32)
<>                                 at scbus6 target -1 lun -1 ()

   The LUN is visible through the CTL CAM SIM, which allows using CTL without
   any physical hardware.  You should be able to issue any normal SCSI
   commands to the device via the pass(4)/da(4) devices.
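
   For example, using the da24 device name from the devlist output above
   (the unit number will differ on your system), a few basic sanity checks
   might look like this:

        camcontrol tur da24
        camcontrol inquiry da24
        dd if=/dev/da24 of=/dev/null bs=1m count=16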

   If any target-capable HBAs are in the system (e.g. isp(4)), and have
   target mode enabled, you should now also be able to see the CTL LUNs via
   that target interface.
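
   For isp(4), the port role generally has to be switched to target mode
   explicitly.  Depending on the driver version, that is done with a loader
   hint along the lines of the one below; check isp(4) on your system for
   the exact hint name and role values:

        hint.isp.0.role="1"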

   Note that all CTL LUNs are presented to all frontends.  There is no LUN
   masking or separate per-port configuration.

 - Note that the ramdisk backend is a "fake" ramdisk.  That is, it is
   backed by a small amount of RAM that is used for all I/O requests.  This
   is useful for performance testing, but not for any data integrity tests.

 - To add a LUN with the block/file backend:

	truncate -s +1T myfile
	ctladm create -b block -o file=myfile
	ctladm port -o on
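
   The block backend is not limited to regular files; handing it a character
   device should also work.  For example, on top of a ZFS zvol (the pool and
   dataset names here are only placeholders):

        zfs create -V 1T tank/ctlvol
        ctladm create -b block -o file=/dev/zvol/tank/ctlvol
        ctladm port -o on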

 - You can also see a list of LUNs and their backends like this:

# ctladm devlist
LUN Backend       Size (Blocks)   BS Serial Number    Device ID       
  0 block            2147483648  512 MYSERIAL   0     MYDEVID   0     
  1 block            2147483648  512 MYSERIAL   1     MYDEVID   1     
  2 block            2147483648  512 MYSERIAL   2     MYDEVID   2     
  3 block            2147483648  512 MYSERIAL   3     MYDEVID   3     
  4 block            2147483648  512 MYSERIAL   4     MYDEVID   4     
  5 block            2147483648  512 MYSERIAL   5     MYDEVID   5     
  6 block            2147483648  512 MYSERIAL   6     MYDEVID   6     
  7 block            2147483648  512 MYSERIAL   7     MYDEVID   7     
  8 block            2147483648  512 MYSERIAL   8     MYDEVID   8     
  9 block            2147483648  512 MYSERIAL   9     MYDEVID   9     
 10 block            2147483648  512 MYSERIAL  10     MYDEVID  10     
 11 block            2147483648  512 MYSERIAL  11     MYDEVID  11    
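
 - LUNs can be removed again as well; something like the following should
   remove LUN 11 from the list above (check ctladm's usage output for the
   exact syntax):

        ctladm remove -b block -l 11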

 - You can see the LUN type (the SCSI peripheral device type; 0 is a disk,
   3 is a processor device) and backing store for block/file backend LUNs
   like this:

# ctladm devlist -v
LUN Backend       Size (Blocks)   BS Serial Number    Device ID       
  0 block            2147483648  512 MYSERIAL   0     MYDEVID   0     
      lun_type=0
      num_threads=14
      file=testdisk0
  1 block            2147483648  512 MYSERIAL   1     MYDEVID   1     
      lun_type=0
      num_threads=14
      file=testdisk1
  2 block            2147483648  512 MYSERIAL   2     MYDEVID   2     
      lun_type=0
      num_threads=14
      file=testdisk2
  3 block            2147483648  512 MYSERIAL   3     MYDEVID   3     
      lun_type=0
      num_threads=14
      file=testdisk3
  4 block            2147483648  512 MYSERIAL   4     MYDEVID   4     
      lun_type=0
      num_threads=14
      file=testdisk4
  5 block            2147483648  512 MYSERIAL   5     MYDEVID   5     
      lun_type=0
      num_threads=14
      file=testdisk5
  6 block            2147483648  512 MYSERIAL   6     MYDEVID   6     
      lun_type=0
      num_threads=14
      file=testdisk6
  7 block            2147483648  512 MYSERIAL   7     MYDEVID   7     
      lun_type=0
      num_threads=14
      file=testdisk7
  8 block            2147483648  512 MYSERIAL   8     MYDEVID   8     
      lun_type=0
      num_threads=14
      file=testdisk8
  9 block            2147483648  512 MYSERIAL   9     MYDEVID   9     
      lun_type=0
      num_threads=14
      file=testdisk9
 10 ramdisk                   0    0 MYSERIAL   0     MYDEVID   0     
      lun_type=3
 11 ramdisk     204800000000000  512 MYSERIAL   1     MYDEVID   1     
      lun_type=0

 - To see system throughput, use ctlstat(8):

# ctlstat -t
          System Read          System Write          System Total
   ms  KB/t tps  MB/s    ms  KB/t tps  MB/s    ms  KB/t tps  MB/s 
 1.71 50.64   0  0.00  1.24 512.00   0  0.03  2.05 245.20   0  0.03    1.0%
 0.00  0.00   0  0.00  1.12 512.00 564 282.00  1.12 512.00 564 282.00    8.4%
 0.00  0.00   0  0.00  1.27 512.00 536 268.00  1.27 512.00 536 268.00   10.0%
 0.00  0.00   0  0.00  1.27 512.00 535 267.50  1.27 512.00 535 267.50    7.6%
 0.00  0.00   0  0.00  1.12 512.00 520 260.00  1.12 512.00 520 260.00   10.9%
 0.00  0.00   0  0.00  1.02 512.00 538 269.00  1.02 512.00 538 269.00   10.9%
 0.00  0.00   0  0.00  1.10 512.00 557 278.50  1.10 512.00 557 278.50    9.6%
 0.00  0.00   0  0.00  1.12 512.00 561 280.50  1.12 512.00 561 280.50   10.4%
 0.00  0.00   0  0.00  1.14 512.00 502 251.00  1.14 512.00 502 251.00    6.5%
 0.00  0.00   0  0.00  1.31 512.00 527 263.50  1.31 512.00 527 263.50   10.5%
 0.00  0.00   0  0.00  1.07 512.00 560 280.00  1.07 512.00 560 280.00   10.3%

CTL To Do List:
===============

 - Use devstat(9) for CTL's statistics collection.  CTL uses a home-grown
   statistics collection system that is similar to devstat(9).  ctlstat
   should be retired in favor of iostat, etc., once aggregation modes are
   available in iostat to match the behavior of ctlstat -t and dump modes
   are available to match the behavior of ctlstat -d/ctlstat -J.

 - ZFS ARC backend for CTL.  Since ZFS copies all I/O into the ARC
   (Adaptive Replacement Cache), running the block/file backend on top of a
   ZFS-backed zvol or file will involve an extra set of copies.  The
   optimal solution for backing targets served by CTL with ZFS would be to
   allocate buffers out of the ARC directly, and DMA to/from them directly.
   That would eliminate an extra data buffer allocation and copy.

 - Switch CTL over to using CAM CCBs instead of its own union ctl_io.  This
   will likely require a significant amount of work, but will eliminate
   another data structure in the stack, more memory allocations, etc.  This
   will also require changes to the CAM CCB structure to support CTL.

 - Full-featured High Availability support.  The HA API that is in ctl_ha.h
   is essentially a renamed version of Copan's HA API.  There is no
   substance to it, but it remains in CTL to show what needs to be done to
   implement active/active HA from a CTL standpoint.  The things that would
   need to be done include:
	- A kernel level software API for message passing as well as DMA
	  between at least two nodes.
	- Hardware support and drivers for inter-node communication.  This
          could be as simple as Ethernet hardware and drivers.
	- A "supervisor", or startup framework to control and coordinate
	  HA startup, failover (going from active/active to single mode),
	  and failback (going from single mode to active/active).
	- HA support in other components of the stack.  The goal behind HA
	  is that one node can fail and another node can seamlessly take
	  over handling I/O requests.  This requires support from pretty
	  much every component in the storage stack, from top to bottom.
	  CTL is one piece of it, but you also need support in the RAID
	  stack/filesystem/backing store.  You also need full configuration
	  mirroring, and all peer nodes need to be able to talk to the
	  underlying storage hardware.

Thanks,

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG
