Date: Sat, 14 Jan 2012 05:16:18 +0000
From: John <jwd@FreeBSD.org>
To: freebsd-scsi@FreeBSD.org
Subject: mps driver chain_alloc_fail / performance ?
Message-ID: <20120114051618.GA41288@FreeBSD.org>
Hi Folks,

   I've started poking through the source for this, but thought I'd go
ahead and post to ask others for their opinions.

   I have a system with 3 LSI SAS HBA cards installed:

mps0: <LSI SAS2116> port 0x5000-0x50ff mem 0xf5ff0000-0xf5ff3fff,0xf5f80000-0xf5fbffff irq 30 at device 0.0 on pci13
mps0: Firmware: 05.00.13.00
mps0: IOCCapabilities: 285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay>

mps1: <LSI SAS2116> port 0x7000-0x70ff mem 0xfbef0000-0xfbef3fff,0xfbe80000-0xfbebffff irq 48 at device 0.0 on pci33
mps1: Firmware: 07.00.00.00
mps1: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>

mps2: <LSI SAS2116> port 0x6000-0x60ff mem 0xfbcf0000-0xfbcf3fff,0xfbc80000-0xfbcbffff irq 56 at device 0.0 on pci27
mps2: Firmware: 07.00.00.00
mps2: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>

Basically, one for internal drives and two for external drives, for a
total of about 200 drives, e.g.:

# camcontrol inquiry da10
pass21: <HP EG0600FBLSH HPD2> Fixed Direct Access SCSI-5 device
pass21: Serial Number 6XR14KYV0000B148LDKM
pass21: 600.000MB/s transfers, Command Queueing Enabled

When running the system under load, I see the following reported:

hw.mps.0.allow_multiple_tm_cmds: 0
hw.mps.0.io_cmds_active: 0
hw.mps.0.io_cmds_highwater: 772
hw.mps.0.chain_free: 2048
hw.mps.0.chain_free_lowwater: 1832
hw.mps.0.chain_alloc_fail: 0          <--- Ok

hw.mps.1.allow_multiple_tm_cmds: 0
hw.mps.1.io_cmds_active: 0
hw.mps.1.io_cmds_highwater: 1019
hw.mps.1.chain_free: 2048
hw.mps.1.chain_free_lowwater: 0
hw.mps.1.chain_alloc_fail: 14369      <---- ??

hw.mps.2.allow_multiple_tm_cmds: 0
hw.mps.2.io_cmds_active: 0
hw.mps.2.io_cmds_highwater: 1019
hw.mps.2.chain_free: 2048
hw.mps.2.chain_free_lowwater: 0
hw.mps.2.chain_alloc_fail: 13307      <---- ??

So finally my question (sorry, I'm long winded): what is the correct way
to increase the number of elements in sc->chain_list so that
mps_alloc_chain() won't run out? (A rough sketch of what I was thinking
is in the pps below.)

static __inline struct mps_chain *
mps_alloc_chain(struct mps_softc *sc)
{
        struct mps_chain *chain;

        if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) {
                TAILQ_REMOVE(&sc->chain_list, chain, chain_link);
                sc->chain_free--;
                if (sc->chain_free < sc->chain_free_lowwater)
                        sc->chain_free_lowwater = sc->chain_free;
        } else
                sc->chain_alloc_fail++;
        return (chain);
}

A few layers up, it seems like it would be nice if the chain-buffer
exhaustion were reported even when debugging isn't enabled... at least
the first time it happens. As things stand, it looks like changing the
related #define is the only way to raise the limit.

Does anyone have any experience with tuning this driver for high
throughput / large disk arrays? The shelves are all dual-pathed, and
even with the new gmultipath active/active support I've still only been
able to achieve about 500 MBytes per second across the
controllers/drives.

I appreciate any thoughts.

Thanks,
John

ps: I currently have a ccd on top of these drives, which seems to
perform more consistently than zfs. But that's an email for a different
day :-)
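pps: For what it's worth, below is the sort of change I had in mind. It
is a rough, untested sketch (meant to live in mps.c alongside the attach
code) that would size the chain list from a loader tunable instead of
hard-coding MPS_CHAIN_FRAMES. The tunable names and the max_chains softc
field are purely my invention, nothing that exists in the driver today;
mps_alloc_requests() would then allocate sc->max_chains chain frames
rather than MPS_CHAIN_FRAMES.

/*
 * Untested sketch: let loader tunables override the default chain
 * frame count (currently the MPS_CHAIN_FRAMES constant in mpsvar.h).
 * "max_chains" would be a new int field in struct mps_softc, and the
 * tunable names below are just guesses on my part.
 */
static void
mps_get_tunables(struct mps_softc *sc)
{
        char tmpstr[80];

        /* Start from the existing compile-time default. */
        sc->max_chains = MPS_CHAIN_FRAMES;

        /* Global override for all mps(4) controllers... */
        TUNABLE_INT_FETCH("hw.mps.max_chains", &sc->max_chains);

        /* ...and a per-controller override, e.g. dev.mps.1.max_chains. */
        snprintf(tmpstr, sizeof(tmpstr), "dev.mps.%d.max_chains",
            device_get_unit(sc->mps_dev));
        TUNABLE_INT_FETCH(tmpstr, &sc->max_chains);
}

Then it would just be a matter of setting something like
hw.mps.max_chains="4096" in /boot/loader.conf and watching
chain_free_lowwater / chain_alloc_fail under load.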