Date:      Wed, 18 Jan 2012 10:00:05 +0530
From:      "Desai, Kashyap" <Kashyap.Desai@lsi.com>
To:        John <jwd@freebsd.org>, "Kenneth D. Merry" <ken@freebsd.org>
Cc:        "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject:   RE: mps driver chain_alloc_fail / performance ?
Message-ID:  <B2FD678A64EAAD45B089B123FDFC3ED7299D0AA748@inbmail01.lsi.com>
In-Reply-To: <20120117020218.GA59053@FreeBSD.org>
References:  <20120114051618.GA41288@FreeBSD.org> <20120114232245.GA57880@nargothrond.kdm.org> <B2FD678A64EAAD45B089B123FDFC3ED7299CF90E7C@inbmail01.lsi.com> <20120117020218.GA59053@FreeBSD.org>



> -----Original Message-----
> From: John [mailto:jwd@freebsd.org]
> Sent: Tuesday, January 17, 2012 7:32 AM
> To: Desai, Kashyap; Kenneth D. Merry
> Cc: freebsd-scsi@freebsd.org
> Subject: Re: mps driver chain_alloc_fail / performance ?
>
> ----- Desai, Kashyap's Original Message -----
> > Which driver version is this? Our 09.00.00.00 driver (which is in the
> > pipeline to be committed) has a 2048 chain buffer count.
>
>    I'm not sure how to answer your question directly. We're using the
> driver
> that comes with FreeBSD. Not a driver directly from LSI. If we can get a
> copy
> of your 9.0 driver we can try testing against it.

If you type "sysctl -a | grep mps" you can see the driver version.

>
> > And our test team has verified it with 150+ drives.
>
>    Currently, we have 8 shelves, 25 drives per shelf, dual attached
> configured with geom multipath using Active/Active. Ignoring SSDs and
> OS disks on the internal card, we see 400 da devices on mps1 & mps2.
> For the record, the shelves are:
>
> ses0 at mps1 bus 0 scbus7 target 0 lun 0
> ses0: <HP D2700 SAS AJ941A 0131> Fixed Enclosure Services SCSI-5 device
> ses0: 600.000MB/s transfers
> ses0: Command Queueing enabled
> ses0: SCSI-3 SES Device
>
>
> > As suggested by Ken, can you try increasing MPS_CHAIN_FRAMES to 4096
> > or 2048?
>
>    Absolutely. The current value is 2048. We are currently running with
> this patch to increase the value and output a single alerting message:
>
> --- sys/dev/mps/mpsvar.h.orig	2012-01-15 19:28:51.000000000 -0500
> +++ sys/dev/mps/mpsvar.h	2012-01-15 20:14:07.000000000 -0500
> @@ -34,7 +34,7 @@
>  #define MPS_REQ_FRAMES		1024
>  #define MPS_EVT_REPLY_FRAMES	32
>  #define MPS_REPLY_FRAMES	MPS_REQ_FRAMES
> -#define MPS_CHAIN_FRAMES	2048
> +#define MPS_CHAIN_FRAMES	4096
>  #define MPS_SENSE_LEN		SSD_FULL_SIZE
>  #define MPS_MSI_COUNT		1
>  #define MPS_SGE64_SIZE		12
> @@ -242,8 +242,11 @@
>  		sc->chain_free--;
>  		if (sc->chain_free < sc->chain_free_lowwater)
>  		sc->chain_free_lowwater = sc->chain_free;
> -	} else
> +	} else {
>  		sc->chain_alloc_fail++;
> +		if (sc->chain_alloc_fail == 1)
> +			device_printf(sc->mps_dev, "Insufficient chain_list buffers.");
> +	}
>  	return (chain);
>  }
>
>
>    If the logic for outputting the message is appropriate I think
> it would be nice to get it committed.

If this works for you and you really want to commit it, I would suggest
having a module parameter to pass the chain_max value. Basically, the
current implementation is not the correct way to handle the out-of-chain
scenario.

The driver should calculate the max chains required per HBA at run time
from the IOC Facts reply from the firmware, and it should try to allocate
that many chain buffers at run time (instead of having a #define for the
chain max).

If the driver cannot get that much memory from the system at run time, it
should fail to detect the HBA at load time.

From our Linux driver logs, I find that we need 29700 chain buffers per
HBA (SAS2008 PCI-Express). So it is better to increase MPS_CHAIN_FRAMES
to (24 * 1024) until we have more robust support in the driver.


Hope this helps you.

~ Kashyap

>
> > ~ Kashyap
> >
> > > Kenneth D. Merry said:
> > >
> > > The firmware on those boards is a little old.  You might consider
> > > upgrading.
>
>    We updated the FW this morning and we're now showing:
>=20
> mps0: <LSI SAS2116> port 0x5000-0x50ff mem 0xf5ff0000-
> 0xf5ff3fff,0xf5f80000-0xf5fbffff irq 30 at device 0.0 on pci13
> mps0: Firmware: 12.00.00.00
> mps0: IOCCapabilities:
> 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDis
> c>
> mps1: <LSI SAS2116> port 0x7000-0x70ff mem 0xfbef0000-
> 0xfbef3fff,0xfbe80000-0xfbebffff irq 48 at device 0.0 on pci33
> mps1: Firmware: 12.00.00.00
> mps1: IOCCapabilities:
> 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDis
> c>
> mps2: <LSI SAS2116> port 0x6000-0x60ff mem 0xfbcf0000-
> 0xfbcf3fff,0xfbc80000-0xfbcbffff irq 56 at device 0.0 on pci27
> mps2: Firmware: 12.00.00.00
> mps2: IOCCapabilities:
> 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDis
> c>
>
>    We last updated around November of last year.
>
> > > > # camcontrol inquiry da10
> > > > pass21: <HP EG0600FBLSH HPD2> Fixed Direct Access SCSI-5 device
> > > > pass21: Serial Number 6XR14KYV0000B148LDKM
> > > > pass21: 600.000MB/s transfers, Command Queueing Enabled
> > >
> > > That's a lot of drives!  I've only run up to 60 drives.
>
>    See above. In general, I'm relatively pleased with how the system
> responds with all these drives.
>
> > > >    When running the system under load, I see the following
> reported:
> > > >
> > > > hw.mps.2.allow_multiple_tm_cmds: 0
> > > > hw.mps.2.io_cmds_active: 0
> > > > hw.mps.2.io_cmds_highwater: 1019
> > > > hw.mps.2.chain_free: 2048
> > > > hw.mps.2.chain_free_lowwater: 0
> > > > hw.mps.2.chain_alloc_fail: 13307     <---- ??
>
>    The current test case run is showing:
>=20
> hw.mps.2.debug_level: 0
> hw.mps.2.allow_multiple_tm_cmds: 0
> hw.mps.2.io_cmds_active: 109
> hw.mps.2.io_cmds_highwater: 1019
> hw.mps.2.chain_free: 4042
> hw.mps.2.chain_free_lowwater: 3597
> hw.mps.2.chain_alloc_fail: 0
>
>    It may be a few hours before it progresses to the point where it
> ran low last time.
>
> > > Bump MPS_CHAIN_FRAMES to something larger.  You can try 4096 and see
> > > what happens.
>
>    Agreed. Let me know if you think there is anything we should add to
> the patch above.
>
> > > >    A few layers up, it seems like it would be nice if the buffer
> > > > exhaustion was reported outside of debug being enabled... at least
> > > > maybe the first time.
> > >
> > > It used to report being out of chain frames every time it happened,
> > > which wound up being too much.  You're right, doing it once might be
> good.
>
> Thanks, that's how I tried to put the patch together.
>
> > > Once you bump up the number of chain frames to the point where you
> aren't
> > > running out, I doubt the driver will be the big bottleneck.  It'll
> probably
> > > be other things higher up the stack.
>
> Question. What "should" the layer of code above the mps driver do if the
> driver
> returns ENOBUFS? I'm wondering if it might explain some incorrect
> results.
>
> > > What sort of ZFS topology did you try?
> > >
> > > I know for raidz2, and perhaps for raidz, ZFS is faster if your
> number
> > > of data disks is a power of 2.
> > >
> > > If you want raidz2 protection, try creating arrays in groups of 10,
> so
> > > you wind up having 8 data disks.
>
> The fastest we've seen is with a pool made of mirrors, though this uses
> up the most space. It also caused the most alloc fails (and leads to my
> question about ENOBUFS).
>
> Thank you both for your help. Any comments are always welcome! If I
> haven't
> answered a question, or otherwise said something that doesn't make
> sense, let me know.
>
> Thanks,
> John



