From owner-freebsd-scsi@FreeBSD.ORG  Sat Jan 14 05:16:18 2012
Date: Sat, 14 Jan 2012 05:16:18 +0000
From: John <jwd@FreeBSD.org>
To: freebsd-scsi@FreeBSD.org
Message-ID: <20120114051618.GA41288@FreeBSD.org>
Subject: mps driver chain_alloc_fail / performance ?

Hi Folks,

   I've started poking through the source for this, but thought I'd go
ahead and post to ask others their opinion.

   I have a system with 3 LSI SAS HBA cards installed:

mps0: port 0x5000-0x50ff mem 0xf5ff0000-0xf5ff3fff,0xf5f80000-0xf5fbffff irq 30 at device 0.0 on pci13
mps0: Firmware: 05.00.13.00
mps0: IOCCapabilities: 285c
mps1: port 0x7000-0x70ff mem 0xfbef0000-0xfbef3fff,0xfbe80000-0xfbebffff irq 48 at device 0.0 on pci33
mps1: Firmware: 07.00.00.00
mps1: IOCCapabilities: 1285c
mps2: port 0x6000-0x60ff mem 0xfbcf0000-0xfbcf3fff,0xfbc80000-0xfbcbffff irq 56 at device 0.0 on pci27
mps2: Firmware: 07.00.00.00
mps2: IOCCapabilities: 1285c

Basically, one for internal drives and two for external drives, for a
total of about 200 drives, e.g.:

# camcontrol inquiry da10
pass21: Fixed Direct Access SCSI-5 device
pass21: Serial Number 6XR14KYV0000B148LDKM
pass21: 600.000MB/s transfers, Command Queueing Enabled

When running the system under load, I see the following reported:

hw.mps.0.allow_multiple_tm_cmds: 0
hw.mps.0.io_cmds_active: 0
hw.mps.0.io_cmds_highwater: 772
hw.mps.0.chain_free: 2048
hw.mps.0.chain_free_lowwater: 1832
hw.mps.0.chain_alloc_fail: 0          <--- Ok

hw.mps.1.allow_multiple_tm_cmds: 0
hw.mps.1.io_cmds_active: 0
hw.mps.1.io_cmds_highwater: 1019
hw.mps.1.chain_free: 2048
hw.mps.1.chain_free_lowwater: 0
hw.mps.1.chain_alloc_fail: 14369      <---- ??

hw.mps.2.allow_multiple_tm_cmds: 0
hw.mps.2.io_cmds_active: 0
hw.mps.2.io_cmds_highwater: 1019
hw.mps.2.chain_free: 2048
hw.mps.2.chain_free_lowwater: 0
hw.mps.2.chain_alloc_fail: 13307      <---- ??

So, finally, my question (sorry, I'm long-winded): what is the correct way
to increase the number of elements in sc->chain_list so that
mps_alloc_chain() won't run out?
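The only knob I've spotted so far is the compile-time pool size. Here is
roughly what I'm considering trying -- an untested sketch only:
MPS_CHAIN_FRAMES is the constant I see in my copy of sys/dev/mps/mpsvar.h,
while the max_chains softc field and the hw.mps.max_chains tunable are
things I'd be adding, not something already in the tree:

/*
 * Sketch (untested): size the chain pool from a loader tunable instead of
 * the hard-coded MPS_CHAIN_FRAMES, so it can be bumped from loader.conf
 * without rebuilding.  "max_chains" would be a new int field in
 * struct mps_softc; it does not exist today.
 */
#include <sys/param.h>
#include <sys/kernel.h>         /* TUNABLE_INT_FETCH() */

#include <dev/mps/mpsvar.h>     /* MPS_CHAIN_FRAMES, struct mps_softc */

static void
mps_get_chain_tunable(struct mps_softc *sc)
{

        /* Default to the current compile-time value (2048 in my tree). */
        sc->max_chains = MPS_CHAIN_FRAMES;

        /* Allow an override, e.g. hw.mps.max_chains="4096" in loader.conf. */
        TUNABLE_INT_FETCH("hw.mps.max_chains", &sc->max_chains);
}

mps_alloc_requests() would then allocate sc->max_chains chain frames (and
size the busdma tag accordingly) instead of looping up to MPS_CHAIN_FRAMES.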
For reference, this is the allocation path where the failures are counted:

static __inline struct mps_chain *
mps_alloc_chain(struct mps_softc *sc)
{
        struct mps_chain *chain;

        if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) {
                TAILQ_REMOVE(&sc->chain_list, chain, chain_link);
                sc->chain_free--;
                if (sc->chain_free < sc->chain_free_lowwater)
                        sc->chain_free_lowwater = sc->chain_free;
        } else
                sc->chain_alloc_fail++;
        return (chain);
}

A few layers up, it seems like it would be nice if the buffer exhaustion
were reported even without debugging enabled... at least the first time it
happens (I've tacked a rough sketch of what I mean onto the end of this
mail). As far as I can tell, changing the related #define is the only way
to grow the pool.

Does anyone have any experience with tuning this driver for high
throughput / large disk arrays? The shelves are all dual-pathed, and even
with the new gmultipath active/active support I've only been able to
achieve about 500 MBytes per second across the controllers/drives.

I appreciate any thoughts.

Thanks,
John

ps: I currently have a ccd on top of these drives, which seems to perform
more consistently than ZFS. But that's an email for a different day :-)
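pps: Since I mentioned it above, here's roughly what I had in mind for
making the exhaustion visible without MPS_DEBUG -- the same routine with a
one-time device_printf() when the free list is empty. Untested sketch; I'm
assuming the device_t handle in the softc is mps_dev, which is what I see
in my copy of mpsvar.h:

static __inline struct mps_chain *
mps_alloc_chain(struct mps_softc *sc)
{
        struct mps_chain *chain;

        if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) {
                TAILQ_REMOVE(&sc->chain_list, chain, chain_link);
                sc->chain_free--;
                if (sc->chain_free < sc->chain_free_lowwater)
                        sc->chain_free_lowwater = sc->chain_free;
        } else {
                /* Complain the first time we run dry, then just count. */
                if (sc->chain_alloc_fail == 0)
                        device_printf(sc->mps_dev,
                            "out of chain frames; consider increasing "
                            "the chain frame pool size\n");
                sc->chain_alloc_fail++;
        }
        return (chain);
}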