From owner-freebsd-scsi@FreeBSD.ORG  Mon Jan  9 11:07:13 2012
Date: Mon, 9 Jan 2012 11:07:12 GMT
Message-Id: <201201091107.q09B7COI042299@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-scsi@FreeBSD.org
Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker       Resp. Description
--------------------------------------------------------------------------------
o kern/163812   scsi  [mpt] problem with mpt driver for lsi controlled conne
o kern/163713   scsi  [aic7xxx] [patch] Add Adaptec29329LPE to aic79xx_pci.c
f kern/163130   scsi  [mpt] cannot dumpon to mpt connected disk
o kern/162256   scsi  [mpt] QUEUE FULL EVENT and 'mpt_cam_event: 0x0'
o kern/161809   scsi  [cam] [patch] set kern.cam.boot_delay via build option
o kern/159412   scsi  [ciss] 7.3 RELEASE: ciss0 ADAPTER HEARTBEAT FAILED err
o kern/157770   scsi  [iscsi] [panic] iscsi_initiator panic
o kern/154432   scsi  [xpt] run_interrupt_driven_hooks: still waiting after
o kern/153514   scsi  [cam] [panic] CAM related panic
o kern/153361   scsi  [ciss] Smart Array 5300 boot/detect drive problem
o kern/152250   scsi  [ciss] [patch] Kernel panic when hw.ciss.expose_hidden
o kern/151564   scsi  [ciss] ciss(4) should increase CISS_MAX_LOGICAL to 10
o docs/151336   scsi  Missing documentation of scsi_ and ata_ functions in c
s kern/149927   scsi  [cam] hard drive not stopped before removing power dur
o kern/148083   scsi  [aac] Strange device reporting
o kern/147704   scsi  [mpt] sys/dev/mpt: new chip revision, partially unsupp
o kern/146287   scsi  [ciss] ciss(4) cannot see more than one SmartArray con
o kern/145768   scsi  [mpt] can't perform I/O on SAS based SAN disk in freeb
o kern/144648   scsi  [aac] Strange values of speed and bus width in dmesg
o kern/144301   scsi  [ciss] [hang] HP proliant server locks when using ciss
o kern/142351   scsi  [mpt] LSILogic driver performance problems
o kern/134488   scsi  [mpt] MPT SCSI driver probes max. 8 LUNs per device
o kern/132250   scsi  [ciss] ciss driver does not support more then 15 drive
o kern/132206   scsi  [mpt] system panics on boot when mirroring and 2nd dri
o kern/130621   scsi  [mpt] tranfer rate is inscrutable slow when use lsi213
o kern/129602   scsi  [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452   scsi  [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245   scsi  [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927   scsi  [isp] isp(4) target driver crashes kernel when set up
o kern/127717   scsi  [ata] [patch] [request] - support write cache toggling
o kern/123674   scsi  [ahc] ahc driver dumping
o kern/123520   scsi  [ahd] unable to boot from net while using ahd
o sparc/121676  scsi  [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487   scsi  [sg] scsi_sg incompatible with scanners
o kern/120247   scsi  [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/114597   scsi  [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847   scsi  [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954    scsi  [ahc] reading from DVD failes on 6.x [regression]
o kern/92798    scsi  [ahc] SCSI problem with timeouts
o kern/90282    scsi  [sym] SCSI bus resets cause loss of ch device
o kern/76178    scsi  [ahd] Problem with ahd and large SCSI Raid system
o kern/74627    scsi  [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165    scsi  [panic] kernel page fault after calling cam_send_ccb
o kern/60641    scsi  [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598    scsi  wire down of scsi devices conflicts with config
s kern/57398    scsi  [mly] Current fails to install on mly(4) based RAID di
o bin/57088     scsi  [cam] [patch] for a possible fd leak in libcam.c
o kern/52638    scsi  [panic] SCSI U320 on SMP server won't run faster than
o kern/44587    scsi  dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/39388    scsi  ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/35234    scsi  World access to /dev/pass? (for scanner) requires acce

51 problems total.

From owner-freebsd-scsi@FreeBSD.ORG  Wed Jan 11 12:13:25 2012
Date: Wed, 11 Jan 2012 13:59:02 +0200
Message-ID: <4F0D7986.5080309@FreeBSD.org>
From: Andriy Gapon
To: freebsd-scsi@FreeBSD.org
Subject: dadump: missing cam_periph_unlock for the EIO branch?

It looks like, if dadump() returns EIO, it fails to unlock the CAM periph
lock (the SIM lock, really).  The leaked lock could lead to unnecessary
secondary panics.  What do you think?

Example:
http://sunner.semmy.ru/~az/avg/12.JPG
http://sunner.semmy.ru/~az/avg/13.JPG

-- 
Andriy Gapon

From owner-freebsd-scsi@FreeBSD.ORG  Wed Jan 11 19:58:11 2012
Date: Wed, 11 Jan 2012 21:58:08 +0200
Message-ID: <4F0DE9D0.40406@FreeBSD.org>
In-Reply-To: <4F0D7986.5080309@FreeBSD.org>
From: Andriy Gapon
To: freebsd-scsi@FreeBSD.org
Subject: Re: dadump: missing cam_periph_unlock for the EIO branch?

on 11/01/2012 13:59 Andriy Gapon said the following:
>
> It looks like, if dadump() returns EIO, it fails to unlock the CAM periph
> lock (the SIM lock, really).  The leaked lock could lead to unnecessary
> secondary panics.  What do you think?
>
> Example:
> http://sunner.semmy.ru/~az/avg/12.JPG
> http://sunner.semmy.ru/~az/avg/13.JPG

So how about something like this?

diff --git a/sys/cam/scsi/scsi_da.c b/sys/cam/scsi/scsi_da.c
index 29a93ae..985a501 100644
--- a/sys/cam/scsi/scsi_da.c
+++ b/sys/cam/scsi/scsi_da.c
@@ -1094,6 +1094,7 @@ dadump(void *arg, void *virtual, vm_offset_t physical, off_t offset, size_t leng
 	    /*sense_len*/SSD_FULL_SIZE, DA_DEFAULT_TIMEOUT * 1000);
 
 	xpt_polled_action((union ccb *)&csio);
+	cam_periph_unlock(periph);
 
 	if ((csio.ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) {
 		printf("Aborting dump due to I/O error.\n");
@@ -1105,7 +1106,6 @@ dadump(void *arg, void *virtual, vm_offset_t physical, off_t offset, size_t leng
 		       csio.ccb_h.status, csio.scsi_status);
 		return(EIO);
 	}
-	cam_periph_unlock(periph);
 	return(0);
 }

-- 
Andriy Gapon
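
Spelled out, the patched tail of dadump() behaves as follows (a sketch
paraphrasing the diff above, not the complete function):

	/*
	 * Tail of dadump() after the patch: the periph/SIM lock is dropped
	 * as soon as the polled command completes, so the EIO path and the
	 * success path both return with the lock released.
	 */
	xpt_polled_action((union ccb *)&csio);
	cam_periph_unlock(periph);	/* covers every return below */

	if ((csio.ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) {
		printf("Aborting dump due to I/O error.\n");
		return (EIO);		/* previously returned with the lock held */
	}
	return (0);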

From owner-freebsd-scsi@FreeBSD.ORG  Thu Jan 12 05:04:59 2012
Date: Wed, 11 Jan 2012 22:04:58 -0700
Message-ID: <20120112050458.GA24148@nargothrond.kdm.org>
In-Reply-To: <20120105045311.GA40378@nargothrond.kdm.org>
From: "Kenneth D. Merry"
To: current@FreeBSD.org, scsi@FreeBSD.org
Subject: Re: CAM Target Layer available

On Wed, Jan 04, 2012 at 21:53:11 -0700, Kenneth D. Merry wrote:
>
> The CAM Target Layer (CTL) is now available for testing.  I am planning
> to commit it to head next week, barring any major objections.
>
> CTL is a disk and processor device emulation subsystem originally written
> for Copan Systems under Linux starting in 2003.  It has been shipping in
> Copan (now SGI) products since 2005.
>
> It was ported to FreeBSD in 2008, and thanks to an agreement between SGI
> (who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is
> available under a BSD-style license.  The intent behind the agreement was
> that Spectra would work to get CTL into the FreeBSD tree.
>
> The patches are against FreeBSD/head as of SVN change 229516 and are
> located here:
>
> http://people.freebsd.org/~ken/ctl/ctl_diffs.20120104.4.txt.gz
>
> The code is not "perfect" (few pieces of software are), but is in good
> shape from a functional standpoint.  My intent is to get it out there for
> other folks to use, and perhaps help with improvements.
>
> There are a few other CAM changes included with these diffs, some of
> which will be committed separately from CTL, some concurrently.  This is
> a quick summary:
>
> - Fix a panic in the da(4) driver when a drive disappears on boot.
> - Fix locking in the CAM EDT traversal code.
> - Add an optional sysctl/tunable (disabled by default) to suppress
>   "duplicate" devices.  This most frequently shows up with dual ported
>   SAS drives.
> - Add some very basic error injection into the da(4) driver.
> - Bump the length field in the SCSI INQUIRY CDB to 2 bytes to line up
>   with more recent SCSI specs.
>
> CTL Features:
> ============
>
> - Disk and processor device emulation.
> - Tagged queueing
> - SCSI task attribute support (ordered, head of queue, simple tags)
> - SCSI implicit command ordering support.  (e.g. if a read follows a mode
>   select, the read will be blocked until the mode select completes.)
> - Full task management support (abort, LUN reset, target reset, etc.)
> - Support for multiple ports
> - Support for multiple simultaneous initiators
> - Support for multiple simultaneous backing stores
> - Persistent reservation support
> - Mode sense/select support
> - Error injection support
> - High Availability support (1)
> - All I/O handled in-kernel, no userland context switch overhead.
>
> (1) HA Support is just an API stub, and needs much more to be fully
>     functional.  See the to-do list below.
>
> Configuring and Running CTL:
> ===========================
>
> - After applying the CTL patchset to your tree, build world and install
>   it on your target system.
>
> - Add 'device ctl' to your kernel configuration file.
>
> - If you're running with an 8Gb or 4Gb Qlogic FC board, add
>   'options ISP_TARGET_MODE' to your kernel config file.  'device ispfw'
>   or loading the ispfw module is also recommended.
>
> - Rebuild and install a new kernel.
>
> - Reboot with the new kernel.
>
> - To add a LUN with the RAM disk backend:
>
> 	ctladm create -b ramdisk -s 10485760000000000000
> 	ctladm port -o on
>
> - You should now see the CTL disk LUN through camcontrol devlist:
>
> scbus6 on ctl2cam0 bus 0:
>    at scbus6 target 1 lun 0 (da24,pass32)
> <> at scbus6 target -1 lun -1 ()
>
>   This is visible through the CTL CAM SIM.  This allows using CTL without
>   any physical hardware.  You should be able to issue any normal SCSI
>   commands to the device via the pass(4)/da(4) devices.
>
>   If any target-capable HBAs are in the system (e.g. isp(4)), and have
>   target mode enabled, you should now also be able to see the CTL LUNs
>   via that target interface.
>
>   Note that all CTL LUNs are presented to all frontends.  There is no
>   LUN masking, or separate, per-port configuration.
>
> - Note that the ramdisk backend is a "fake" ramdisk.  That is, it is
>   backed by a small amount of RAM that is used for all I/O requests.
>   This is useful for performance testing, but not for any data integrity
>   tests.
>
> - To add a LUN with the block/file backend:
>
> 	truncate -s +1T myfile
> 	ctladm create -b block -o file=myfile
> 	ctladm port -o on
>
> - You can also see a list of LUNs and their backends like this:
>
> # ctladm devlist
> LUN Backend       Size (Blocks)   BS Serial Number    Device ID
>   0 block            2147483648  512 MYSERIAL   0     MYDEVID   0
>   1 block            2147483648  512 MYSERIAL   1     MYDEVID   1
>   2 block            2147483648  512 MYSERIAL   2     MYDEVID   2
>   3 block            2147483648  512 MYSERIAL   3     MYDEVID   3
>   4 block            2147483648  512 MYSERIAL   4     MYDEVID   4
>   5 block            2147483648  512 MYSERIAL   5     MYDEVID   5
>   6 block            2147483648  512 MYSERIAL   6     MYDEVID   6
>   7 block            2147483648  512 MYSERIAL   7     MYDEVID   7
>   8 block            2147483648  512 MYSERIAL   8     MYDEVID   8
>   9 block            2147483648  512 MYSERIAL   9     MYDEVID   9
>  10 block            2147483648  512 MYSERIAL  10     MYDEVID  10
>  11 block            2147483648  512 MYSERIAL  11     MYDEVID  11
>
> - You can see the LUN type and backing store for block/file backend LUNs
>   like this:
>
> # ctladm devlist -v
> LUN Backend       Size (Blocks)   BS Serial Number    Device ID
>   0 block            2147483648  512 MYSERIAL   0     MYDEVID   0
>       lun_type=0
>       num_threads=14
>       file=testdisk0
>   1 block            2147483648  512 MYSERIAL   1     MYDEVID   1
>       lun_type=0
>       num_threads=14
>       file=testdisk1
>   2 block            2147483648  512 MYSERIAL   2     MYDEVID   2
>       lun_type=0
>       num_threads=14
>       file=testdisk2
>   3 block            2147483648  512 MYSERIAL   3     MYDEVID   3
>       lun_type=0
>       num_threads=14
>       file=testdisk3
>   4 block            2147483648  512 MYSERIAL   4     MYDEVID   4
>       lun_type=0
>       num_threads=14
>       file=testdisk4
>   5 block            2147483648  512 MYSERIAL   5     MYDEVID   5
>       lun_type=0
>       num_threads=14
>       file=testdisk5
>   6 block            2147483648  512 MYSERIAL   6     MYDEVID   6
>       lun_type=0
>       num_threads=14
>       file=testdisk6
>   7 block            2147483648  512 MYSERIAL   7     MYDEVID   7
>       lun_type=0
>       num_threads=14
>       file=testdisk7
>   8 block            2147483648  512 MYSERIAL   8     MYDEVID   8
>       lun_type=0
>       num_threads=14
>       file=testdisk8
>   9 block            2147483648  512 MYSERIAL   9     MYDEVID   9
>       lun_type=0
>       num_threads=14
>       file=testdisk9
>  10 ramdisk                   0    0 MYSERIAL   0     MYDEVID   0
>       lun_type=3
>  11 ramdisk     204800000000000  512 MYSERIAL   1     MYDEVID   1
>       lun_type=0
>
> - To see system throughput, use ctlstat(8):
>
> # ctlstat -t
>         System Read             System Write            System Total
>   ms   KB/t  tps   MB/s    ms   KB/t  tps   MB/s    ms   KB/t  tps   MB/s
> 1.71  50.64    0   0.00  1.24 512.00    0   0.03  2.05 245.20    0   0.03   1.0%
> 0.00   0.00    0   0.00  1.12 512.00  564 282.00  1.12 512.00  564 282.00   8.4%
> 0.00   0.00    0   0.00  1.27 512.00  536 268.00  1.27 512.00  536 268.00  10.0%
> 0.00   0.00    0   0.00  1.27 512.00  535 267.50  1.27 512.00  535 267.50   7.6%
> 0.00   0.00    0   0.00  1.12 512.00  520 260.00  1.12 512.00  520 260.00  10.9%
> 0.00   0.00    0   0.00  1.02 512.00  538 269.00  1.02 512.00  538 269.00  10.9%
> 0.00   0.00    0   0.00  1.10 512.00  557 278.50  1.10 512.00  557 278.50   9.6%
> 0.00   0.00    0   0.00  1.12 512.00  561 280.50  1.12 512.00  561 280.50  10.4%
> 0.00   0.00    0   0.00  1.14 512.00  502 251.00  1.14 512.00  502 251.00   6.5%
> 0.00   0.00    0   0.00  1.31 512.00  527 263.50  1.31 512.00  527 263.50  10.5%
> 0.00   0.00    0   0.00  1.07 512.00  560 280.00  1.07 512.00  560 280.00  10.3%
>
> CTL To Do List:
> ==============
>
> - Use devstat(9) for CTL's statistics collection.  CTL uses a home-grown
>   statistics collection system that is similar to devstat(9).  ctlstat
>   should be retired in favor of iostat, etc., once aggregation modes are
>   available in iostat to match the behavior of ctlstat -t, and dump modes
>   are available to match the behavior of ctlstat -d/ctlstat -J.
>
> - ZFS ARC backend for CTL.
>   Since ZFS copies all I/O into the ARC (Adaptive Replacement Cache),
>   running the block/file backend on top of a ZFS-backed zvol or file will
>   involve an extra set of copies.  The optimal solution for backing
>   targets served by CTL with ZFS would be to allocate buffers out of the
>   ARC directly, and DMA to/from them directly.  That would eliminate an
>   extra data buffer allocation and copy.
>
> - Switch CTL over to using CAM CCBs instead of its own union ctl_io.
>   This will likely require a significant amount of work, but will
>   eliminate another data structure in the stack, more memory allocations,
>   etc.  This will also require changes to the CAM CCB structure to
>   support CTL.
>
> - Full-featured High Availability support.  The HA API that is in
>   ctl_ha.h is essentially a renamed version of Copan's HA API.  There is
>   no substance to it, but it remains in CTL to show what needs to be done
>   to implement active/active HA from a CTL standpoint.  The things that
>   would need to be done include:
>   - A kernel level software API for message passing as well as DMA
>     between at least two nodes.
>   - Hardware support and drivers for inter-node communication.  This
>     could be as simple as Ethernet hardware and drivers.
>   - A "supervisor", or startup framework, to control and coordinate
>     HA startup, failover (going from active/active to single mode),
>     and failback (going from single mode to active/active).
>   - HA support in other components of the stack.  The goal behind HA
>     is that one node can fail and another node can seamlessly take
>     over handling I/O requests.  This requires support from pretty
>     much every component in the storage stack, from top to bottom.
>     CTL is one piece of it, but you also need support in the RAID
>     stack/filesystem/backing store.  You also need full configuration
>     mirroring, and all peer nodes need to be able to talk to the
>     underlying storage hardware.

I checked CTL into head today, along with most of the CAM changes I
mentioned above.  My plan is to MFC CTL into stable/9 in a month.

If there is enough interest, I can probably MFC CTL into stable/8 as well.
The only potential hiccup there is the change in the size of the inquiry
CDB length field.  I doubt many, if any, ports are using that data
structure, but it is a small API change.  (Albeit one brought on by a
standards change.)  In any case, if anyone sees any ports breakage as a
result, please let me know.

I'm planning on MFCing the other CAM changes in 2 weeks.  I decided not to
put in the duplicate suppression code for now; it's a little kludgy.  If
people think it would be valuable, I can put it in.  It's really just a
stopgap until we get actual multipath and SAS probing support in CAM.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG
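
A side note on the INQUIRY change Ken mentions as the one MFC risk: in
SPC-3 and later, the allocation length in the 6-byte INQUIRY CDB occupies
bytes 3-4 rather than byte 4 alone.  A sketch of the layout (field names
are illustrative, not the exact FreeBSD struct scsi_inquiry declaration):

	/*
	 * 6-byte INQUIRY CDB per SPC-3.  Older specs defined a one-byte
	 * allocation length in byte 4; bytes 3-4 now form a 16-bit,
	 * big-endian length, which is why the length field grew to 2 bytes.
	 */
	struct inquiry_cdb_sketch {
		uint8_t opcode;		/* 0x12 = INQUIRY */
		uint8_t byte2;		/* EVPD flag in bit 0 */
		uint8_t page_code;	/* VPD page code when EVPD is set */
		uint8_t length[2];	/* allocation length (big-endian) */
		uint8_t control;
	};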

From owner-freebsd-scsi@FreeBSD.ORG  Sat Jan 14 05:16:18 2012
Date: Sat, 14 Jan 2012 05:16:18 +0000
Message-ID: <20120114051618.GA41288@FreeBSD.org>
From: John
To: freebsd-scsi@FreeBSD.org
Subject: mps driver chain_alloc_fail / performance ?

Hi Folks,

I've started poking through the source for this, but thought I'd go ahead
and post to ask others for their opinion.

I have a system with 3 LSI SAS HBA cards installed:

mps0: port 0x5000-0x50ff mem 0xf5ff0000-0xf5ff3fff,0xf5f80000-0xf5fbffff irq 30 at device 0.0 on pci13
mps0: Firmware: 05.00.13.00
mps0: IOCCapabilities: 285c
mps1: port 0x7000-0x70ff mem 0xfbef0000-0xfbef3fff,0xfbe80000-0xfbebffff irq 48 at device 0.0 on pci33
mps1: Firmware: 07.00.00.00
mps1: IOCCapabilities: 1285c
mps2: port 0x6000-0x60ff mem 0xfbcf0000-0xfbcf3fff,0xfbc80000-0xfbcbffff irq 56 at device 0.0 on pci27
mps2: Firmware: 07.00.00.00
mps2: IOCCapabilities: 1285c

Basically, one for internal and two for external drives, for a total of
about 200 drives, i.e.:

# camcontrol inquiry da10
pass21: Fixed Direct Access SCSI-5 device
pass21: Serial Number 6XR14KYV0000B148LDKM
pass21: 600.000MB/s transfers, Command Queueing Enabled

When running the system under load, I see the following reported:

hw.mps.0.allow_multiple_tm_cmds: 0
hw.mps.0.io_cmds_active: 0
hw.mps.0.io_cmds_highwater: 772
hw.mps.0.chain_free: 2048
hw.mps.0.chain_free_lowwater: 1832
hw.mps.0.chain_alloc_fail: 0      <--- Ok

hw.mps.1.allow_multiple_tm_cmds: 0
hw.mps.1.io_cmds_active: 0
hw.mps.1.io_cmds_highwater: 1019
hw.mps.1.chain_free: 2048
hw.mps.1.chain_free_lowwater: 0
hw.mps.1.chain_alloc_fail: 14369  <---- ??

hw.mps.2.allow_multiple_tm_cmds: 0
hw.mps.2.io_cmds_active: 0
hw.mps.2.io_cmds_highwater: 1019
hw.mps.2.chain_free: 2048
hw.mps.2.chain_free_lowwater: 0
hw.mps.2.chain_alloc_fail: 13307  <---- ??

So finally my question (sorry, I'm long winded): What is the correct way
to increase the number of elements in sc->chain_list so mps_alloc_chain()
won't run out?

static __inline struct mps_chain *
mps_alloc_chain(struct mps_softc *sc)
{
	struct mps_chain *chain;

	if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) {
		TAILQ_REMOVE(&sc->chain_list, chain, chain_link);
		sc->chain_free--;
		if (sc->chain_free < sc->chain_free_lowwater)
			sc->chain_free_lowwater = sc->chain_free;
	} else
		sc->chain_alloc_fail++;
	return (chain);
}

A few layers up, it seems like it would be nice if the buffer exhaustion
were reported outside of debug being enabled... at least maybe the first
time.  It looks like changing the related #define is the only way.

Does anyone have any experience with tuning this driver for high
throughput/large disk arrays?  The shelves are all dual pathed, and with
the new gmultipath active/active support, I've still only been able to
achieve about 500 MBytes per second across the controllers/drives.

I appreciate any thoughts.

Thanks,
John

ps: I currently have a ccd on top of these drives, which seems to perform
more consistently than ZFS.  But that's an email for a different day :-)
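
John's report-it-once idea, sketched against the else branch of
mps_alloc_chain() above (a hypothetical change, not the shipped driver;
sc->mps_dev is assumed to be the softc's device handle, so check the names
against your tree):

	} else {
		/*
		 * Sketch: note chain-frame exhaustion the first time it
		 * happens, even without debugging enabled, then just keep
		 * counting silently.
		 */
		if (sc->chain_alloc_fail++ == 0)
			device_printf(sc->mps_dev,
			    "out of chain frames; consider increasing "
			    "MPS_CHAIN_FRAMES\n");
	}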

From owner-freebsd-scsi@FreeBSD.ORG  Sat Jan 14 23:46:23 2012
Date: Sat, 14 Jan 2012 16:22:45 -0700
Message-ID: <20120114232245.GA57880@nargothrond.kdm.org>
In-Reply-To: <20120114051618.GA41288@FreeBSD.org>
From: "Kenneth D. Merry"
To: John
Cc: freebsd-scsi@freebsd.org
Subject: Re: mps driver chain_alloc_fail / performance ?

On Sat, Jan 14, 2012 at 05:16:18 +0000, John wrote:
> I have a system with 3 LSI SAS HBA cards installed:
>
> mps0: port 0x5000-0x50ff mem 0xf5ff0000-0xf5ff3fff,0xf5f80000-0xf5fbffff irq 30 at device 0.0 on pci13
> mps0: Firmware: 05.00.13.00
> mps0: IOCCapabilities: 285c
> mps1: port 0x7000-0x70ff mem 0xfbef0000-0xfbef3fff,0xfbe80000-0xfbebffff irq 48 at device 0.0 on pci33
> mps1: Firmware: 07.00.00.00
> mps1: IOCCapabilities: 1285c
> mps2: port 0x6000-0x60ff mem 0xfbcf0000-0xfbcf3fff,0xfbc80000-0xfbcbffff irq 56 at device 0.0 on pci27
> mps2: Firmware: 07.00.00.00
> mps2: IOCCapabilities: 1285c

The firmware on those boards is a little old.  You might consider
upgrading.

> Basically, one for internal and two for external drives, for a total of
> about 200 drives, i.e.:
>
> # camcontrol inquiry da10
> pass21: Fixed Direct Access SCSI-5 device
> pass21: Serial Number 6XR14KYV0000B148LDKM
> pass21: 600.000MB/s transfers, Command Queueing Enabled

That's a lot of drives!  I've only run up to 60 drives.

> When running the system under load, I see the following reported:
>
> hw.mps.1.io_cmds_highwater: 1019
> hw.mps.1.chain_free: 2048
> hw.mps.1.chain_free_lowwater: 0
> hw.mps.1.chain_alloc_fail: 14369  <---- ??
>
> hw.mps.2.io_cmds_highwater: 1019
> hw.mps.2.chain_free: 2048
> hw.mps.2.chain_free_lowwater: 0
> hw.mps.2.chain_alloc_fail: 13307  <---- ??
>
> So finally my question (sorry, I'm long winded): What is the correct way
> to increase the number of elements in sc->chain_list so
> mps_alloc_chain() won't run out?

Bump MPS_CHAIN_FRAMES to something larger.  You can try 4096 and see what
happens.

> A few layers up, it seems like it would be nice if the buffer exhaustion
> were reported outside of debug being enabled... at least maybe the first
> time.

It used to report being out of chain frames every time it happened, which
wound up being too much.  You're right, doing it once might be good.

> It looks like changing the related #define is the only way.

Yes, that is currently the only way.  Yours is by far the largest setup
I've seen so far.  I've run the driver with 60 drives attached.

> Does anyone have any experience with tuning this driver for high
> throughput/large disk arrays?  The shelves are all dual pathed, and with
> the new gmultipath active/active support, I've still only been able to
> achieve about 500 MBytes per second across the controllers/drives.

Once you bump up the number of chain frames to the point where you aren't
running out, I doubt the driver will be the big bottleneck.  It'll
probably be other things higher up the stack.

> ps: I currently have a ccd on top of these drives, which seems to
> perform more consistently than ZFS.  But that's an email for a
> different day :-)

What sort of ZFS topology did you try?  I know for raidz2, and perhaps for
raidz, ZFS is faster if your number of data disks is a power of 2.  If you
want raidz2 protection, try creating arrays in groups of 10, so you wind
up having 8 data disks.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG
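
A sketch of the change Ken suggests: MPS_CHAIN_FRAMES is compile-time
only, and in mps driver sources of this vintage it lives in
sys/dev/mps/mpsvar.h (worth verifying against your tree), so applying it
means editing the header and rebuilding the kernel or the mps module:

	/*
	 * sys/dev/mps/mpsvar.h (sketch): the chain-frame pool is sized from
	 * this constant at attach time; as Ken notes above, there is no
	 * sysctl or loader tunable for it yet.
	 */
	#define MPS_CHAIN_FRAMES	4096	/* default was 2048 */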