Date:      Tue, 25 Mar 2014 19:13:53 +0000
From:      "Desai, Kashyap" <Kashyap.Desai@lsi.com>
To:        Borja Marcos <borjam@sarenet.es>
Cc:        "scottl@netflix.com" <scottl@netflix.com>, "Radford, Adam" <Adam.Radford@lsi.com>, "sean_bruno@yahoo.com" <sean_bruno@yahoo.com>, "Mankani, Krishnaraddi" <Krishnaraddi.Mankani@lsi.com>, "dwhite@ixsystems.com" <dwhite@ixsystems.com>, "Maloy, Joe" <Joe.Maloy@lsi.com>, "jpaetzel@freebsd.org" <jpaetzel@freebsd.org>, "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>, Doug Ambrisko <ambrisko@cisco.com>, "Kenneth D. Merry" <ken@kdm.org>, "McConnell, Stephen" <Stephen.McConnell@lsi.com>
Subject:   RE: LSI - MR-Fusion controller driver <mrsas> patch and man page
Message-ID:  <45cbdd9366aa4a19997d4ca306d0cdcc@BN1PR07MB247.namprd07.prod.outlook.com>
In-Reply-To: <B13F319C-09B9-48A3-B082-A0936D714F12@sarenet.es>
References:  <e59396595152456dbcde63d48f70aa8f@BN1PR07MB247.namprd07.prod.outlook.com> <20140107181139.GC2080@cisco.com> <20140124185356.GA28724@ambrisko.com> <20140124190047.GA34975@ambrisko.com> <9c3fd2b15e9b4c2cb967519a3b7f98ad@BN1PR07MB247.namprd07.prod.outlook.com> <20140318143738.GA65955@cisco.com> <20140320235534.GA92797@cisco.com> <C698AC2A-06A7-4408-9790-20B69FAF31C6@sarenet.es> <20140321160954.GB99545@cisco.com> <5C32A3C7-B28B-4E69-9DF0-EE53181085F7@sarenet.es> <20140324174519.GA30345@cisco.com> <AE6FAC75-D2EA-457F-8654-2ECD57EDAA13@sarenet.es> <8bd5b88321704b49baaf4538c6941292@BN1PR07MB247.namprd07.prod.outlook.com> <B13F319C-09B9-48A3-B082-A0936D714F12@sarenet.es>

Borja:

I have read your comments. First of all, thanks for explaining in so much
technical detail. I would definitely like to take this as feedback and will
work internally to find out how best we can handle it. As of now, you cannot
use the mrsas driver for mfi-style pass-through.

I have observed that most of the pass-through benefits you mention matter
mainly to some manufacturing divisions, and for specific reasons we provide
them a temporary driver drop. That driver exposes unconfigured drives to the
OS so that they can upgrade drive firmware without a lot of manual work.

Let me explain one fundamental problem with pass-through drives.

Let's say you have 4 drives and all are exposed to the OS as pass-through
drives. The user then cannot tell (using LSI-provided configuration utilities
such as storcli/MegaCli) whether those drives are in use by an end user. As
far as the LSI config utils are concerned, each is still an unconfigured
drive and valid for creating a RAID volume. So managing physical disks
becomes a big issue if unconfigured drives are exposed and used by the user.

You can run the MR controller in JBOD mode, where all drives are converted
to JBOD by default and are visible to the OS.

Also, LSI controllers support only the T10 thin provisioning standards. For
all JBOD drives the command will go to the actual drive, but for volumes the
controller disables it by setting values in VPD page 0xb0.
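
As an illustration of the VPD point, here is a rough Python sketch of how
a host could read the UNMAP-related fields of a Block Limits VPD page
(0xb0). The field offsets follow the SBC-3 page layout. The idea that a
volume "disables" thin provisioning by reporting zeros in these fields is
an assumption about what "setting values" means here, not a statement
about the actual firmware. make_page() is a hypothetical helper that
builds synthetic test data; it is not controller output.

```python
import struct


def unmap_limits(page: bytes) -> dict:
    """Decode the UNMAP-related fields of a Block Limits VPD page (0xB0)."""
    if len(page) < 32 or page[1] != 0xB0:
        raise ValueError("not a Block Limits VPD page with UNMAP fields")
    # Bytes 20-23: MAXIMUM UNMAP LBA COUNT
    # Bytes 24-27: MAXIMUM UNMAP BLOCK DESCRIPTOR COUNT
    # Bytes 28-31: OPTIMAL UNMAP GRANULARITY
    max_lba_count, max_desc_count, granularity = struct.unpack_from(
        ">III", page, 20
    )
    return {
        "max_unmap_lba_count": max_lba_count,
        "max_unmap_descriptors": max_desc_count,
        "optimal_unmap_granularity": granularity,
        # Assumption: zero in either count means UNMAP is not offered.
        "unmap_advertised": max_lba_count > 0 and max_desc_count > 0,
    }


def make_page(max_lba_count: int, max_desc_count: int) -> bytes:
    """Hypothetical helper: build a synthetic 64-byte Block Limits page."""
    page = bytearray(64)
    page[1] = 0xB0                           # page code
    struct.pack_into(">H", page, 2, 0x003C)  # page length (60 bytes follow)
    struct.pack_into(">II", page, 20, max_lba_count, max_desc_count)
    return bytes(page)


jbod_like = unmap_limits(make_page(0xFFFFFFFF, 1))
volume_like = unmap_limits(make_page(0, 0))
print(jbod_like["unmap_advertised"], volume_like["unmap_advertised"])
```

On a real system the page bytes would have to come from an INQUIRY with
EVPD set, for example through the pass(4) device; that part is left out.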

<mrsas> maps volumes on bus 0 and syspd on bus 1, so you can easily tell
RAID from JBOD.
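
A small sketch of how that bus split could be used from a script: the
Python below parses text in the format printed by "camcontrol devlist" and
labels each da device by the bus it sits on. The scbus numbers that CAM
assigns depend on the system, so the two bus numbers are passed in as
parameters rather than assumed to be 0 and 1. The sample text, including
the device identification strings, is made up for illustration.

```python
import re

# Matches lines such as:
#   <VENDOR MODEL REV>   at scbus0 target 0 lun 0 (pass0,da0)
DEVLIST_LINE = re.compile(
    r"^<(?P<ident>[^>]*)>\s+at scbus(?P<bus>\d+) target (?P<target>\d+) "
    r"lun (?P<lun>\w+) \((?P<periphs>[^)]*)\)"
)


def classify(devlist_text: str, volume_bus: int, syspd_bus: int) -> dict:
    """Map each daX device to 'RAID volume' or 'JBOD (syspd)' by its bus."""
    roles = {volume_bus: "RAID volume", syspd_bus: "JBOD (syspd)"}
    result = {}
    for line in devlist_text.splitlines():
        match = DEVLIST_LINE.match(line.strip())
        if not match:
            continue
        role = roles.get(int(match.group("bus")))
        if role is None:
            continue  # device on some other controller's bus
        for periph in match.group("periphs").split(","):
            if periph.startswith("da"):
                result[periph] = role
    return result


# Made-up sample input, not captured from real hardware.
SAMPLE = """\
<EXAMPLE VOLUME 1.0>   at scbus0 target 0 lun 0 (pass0,da0)
<EXAMPLE DISK 0001>    at scbus1 target 4 lun 0 (pass1,da1)
<EXAMPLE DISK 0001>    at scbus1 target 5 lun 0 (da2,pass2)
<EXAMPLE OTHER 2.0>    at scbus7 target 0 lun 0 (pass3,da3)
"""

for name, role in sorted(classify(SAMPLE, volume_bus=0, syspd_bus=1).items()):
    print(name, role)
```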

LSI developed the CAM-based HBA device driver <mps> under the guidance of
key FreeBSD folks. Our first goal is to bring the <mrsas> driver up to date
with all the latest features (those the Linux <megaraid_sas> driver
supports) and to use the CAM-based interface, the same as the <mps> driver.
We will add new features as and when they are requested and prioritized.

Doug:
I still have to look into your query regarding the difference between
Thunderbolt and Invader.



~ Kashyap


> -----Original Message-----
> From: Borja Marcos [mailto:borjam@sarenet.es]
> Sent: Tuesday, March 25, 2014 8:01 PM
> To: Desai, Kashyap
> Cc: Doug Ambrisko; scottl@netflix.com; Radford, Adam; Kenneth D. Merry;
> sean_bruno@yahoo.com; Mankani, Krishnaraddi; dwhite@ixsystems.com;
> Maloy, Joe; jpaetzel@freebsd.org; freebsd-scsi@freebsd.org; McConnell,
> Stephen
> Subject: Re: LSI - MR-Fusion controller driver <mrsas> patch and man page
>
>
> On Mar 25, 2014, at 12:42 PM, Desai, Kashyap wrote:
>
> > Borja:
> >
> > <mrsas> driver will attach Raid volume and JBOD (SysPD) to the CAM
> > layer. It is not good to expose hidden raid volume or what we called as
> > pass-through device here to the OS for many reason..  Other than
> > management things like SMART monitor, we cannot/should not do file
> > system IO on pass-through devices.
>
> Of course it's not a good idea to expose drives that are part of a logical
> volume. But unconfigured drives should be. Read on, please ;)
>
> > With <mfi> it might be true that user always do file system IO on
> > <mfiX> device and consider /dev/daX as pass-through device... With
> > <mrsas> all device will be seen as <daX>. You cannot identify which will
> > be a pass-through and which is configured device by LSI config utils.
>
> Exposing devices as "da" should not be a merely "esthetic" decision. The
> "da" driver has some stuff intended for direct access to disks, but not
> for logical volumes created by other devices such as advanced RAID cards.
> For example, the "da" device can issue TRIM commands, it reads device
> serial numbers (which, now, can be used by GEOM to identify disks), etc.
> Disks are more complicated now with that "advanced format" thing and so I
> think it's very important for disks to be directly accessible if you
> want/need it. Of course other features might be introduced in the future.
> Features that may be added to the "da" driver but which will probably be
> useless for a logical device, even outright inappropriate.
>
> I would suggest that you offer a choice, and, most critically, that you
> offer a _clear_ _choice_, as you have different kinds of customers. Some
> will want/need logical volumes and advanced RAID stuff, others won't. In
> some machines I have I am actually doing *both* things at the same time. I
> may have a RAID card based mirror for certain tasks, maybe with a UFS
> filesystem on it, relying on pass-through to the rest of the devices on
> which I use ZFS.
>
> I think you should use a specific name for the logical devices, such as
> the mfi driver does. If I see a "mfid" device name it's clear that it's a
> logical device, not a "bare metal" hard disk, and that its behavior and
> features depend mainly on the logical device magic in the card.
>
> And you should offer a perfectly transparent pass-through option, maybe
> restricted to disks not configured as "RAID" ones (to avoid accidents), I
> mean, what you now call "syspd" mode. These disks, ideally, should not be
> assigned to a special logical-volume-like "mfisyspd" driver (or its
> equivalent), but to the "da" driver so that all of the features I expect
> from a bare metal hard disk would work. SMART, access to mode pages,
> detecting sector sizes, serial numbers, whatever, would work without
> hiccups.
>
> Doing it the current "syspd" way means that any new feature added to
> disks must be added to the card firmware and to the "syspd" portion of the
> driver, while keeping a clear access to the SAS (or SATA-on-SAS) devices
> with no other manipulation would mean that the "da" driver would have
> immediate access to those features with no need to add support to the card
> firmware and driver.
>
>
> > It is not a complex code change if pass-through device is required for
> > <mrsas>, but it is just a matter of no use and more error prone to
> > expose devices as pass-through.
>
> It is certainly error prone if you are using logical devices. But if you
> are not using them (my case and there are many others in this situation)
> the lack of a well supported pass-through device can be error prone.
>
> From a mere engineering point of view, it's a bad idea to add unnecessary
> software layers. Advanced RAID card features are a lifesaver for "classic"
> filesystems such as UFS/FFS, EXTwhateverFS, NTFS, etc, but can get in the
> way of other filesystems such as ZFS. ZFS intends to perform the functions
> of a RAID device itself.
>
> > None of the LSI driver does this including <mps> and <mrsas> in FreeBSD
> > + <megaraid_sas> and <mpt2sas>/<mpt3sas> in Linux.
>
> I've been using pass-through disks on Adaptec RAID cards (aac), and LSI
> Logic (mps and mfi) with different levels of success for years. It can be
> tricky, but ZFS works best with direct access to the disks.
>
> > If you can express what functionality you think it is missing, if there
> > is not pass-through device ?
>
> Of course. Some of the functionality I would miss by not using a
> pass-through:
>
> - Inability to support problematic disks with "quirks". The "da" driver
> offers a flexible mechanism for that. If not using the da driver I lose
> that ability, and you will agree with me that getting a manufacturer (LSI)
> to update a card's firmware is much harder than doing it myself if needed.
>
> - Inability to support future/special features without a firmware update
> for the card. An example is the diversity of block sizes in SSDs, or, more
> recently, TRIM for SSDs. ZFS on FreeBSD now supports TRIM, and it's
> important for performance and drive health. How does "syspd" handle it
> currently?
>
> - Again I will insist on how additional software layers are a bad idea.
>
> - Also, one of the "features" of LSI cards represents a serious
> operational issue: the persistent assignment of target numbers to disk
> serial numbers, keeping a table of target-serial number mappings in NVRAM.
> There were some recent messages on this list regarding that problem. And
> it seems to happen even when using pass-through devices.
>
> In the past I have had problems with ZFS and the "old" way of creating
> "pseudo JBOD" devices on LSI cards by creating a RAID 0 logical volume for
> each disk. For example, hot swapping a broken disk can be more error prone
> if, apart from just extracting a disk and adding a new one, I need to run
> a certain tool to have it effectively recognised by the card firmware. It
> adds unnecessary complexity. Moreover, in some cases (I can't recall the
> exact details, as it happened several years ago) it requires a reboot,
> which defeats the purpose of hot swappable disks in the first place.
>
> Please don't underestimate the operational impact of all this. An operator
> swapping a disk at 3 am should not need to do any complex check to
> determine the disk to extract. Nor should he/she require additional
> actions such as "mfiutil online this", activate that or, of course, a
> reboot, to have it recognised. ZFS (and, I presume, other advanced
> filesystems) has its own commands for that, which include their own sanity
> checks, doing their best to avoid trouble.
>
> > Are you doing ZFS (File system IO) on Pass-through device. ?
>
> Indeed I am. And I know there are many successful setups doing the same.
>
> > If yes, then why can't you create JBOD/SysPD  for that purpose?
>
> It's explained above but I will summarize.
>
> - Plain simple good engineering practice (avoiding unneeded software
> layers),
> - Access to special/future features on disks
> - Better observability (monitoring, etc)
> - Simpler operational procedures which means safer systems operations and
> better reliability.
>
> Let me be brutally honest here and, please, take no offense but take it as
> feedback from a customer. Right now, advanced RAID cards can be more a
> liability than a desirable feature. Look at all the places where people
> repurpose RAID cards to be simple HBAs doing all sorts of unsupported
> voodoo.
>
> Ideally this shouldn't happen, but we are somewhat forced by server
> manufacturers. At some point at least, for example, Dell refused to sell
> "IT mode" LSI2008 cards for internal devices, selling them just with
> external SAS connectors. So many people just repurpose the internal, "IR
> firmware" cards to "IT mode" so that they can be simple HBAs even though
> they still pose a problem with that target-serial number feature in NVRAM.
> I have an IBM server here with an onboard Invader card which, obviously,
> has many more features.
>
> By defining some design guidelines for your hardware, firmware, and
> drives, however, you can get to a win-win solution. If a card can fulfill
> both roles perfectly (advanced RAID features and plain HBA) it will no
> longer be a liability. The same hardware will be appropriate for many
> purposes, and it will be even better for the purchasing departments of us,
> your final customers. No need to be keeping track of several SKUs
> depending on the intended purpose. Same card usable for, say, NTFS and ZFS
> depending only on configuration.
>
> And those design guidelines I am suggesting are simple:
>
> - Fully functioning pass-through mode with a minimal surprise component,
> with the simplest, most transparent possible access from the CAM layer to
> the SAS/SATA commands so that those true pass-through devices get
> assigned to the right drivers such as "ses", "da", "sa", etc. This should
> be a core feature, not an add-on to somewhat ease monitoring.
>
> - Making that transparent pass-through mode clearly distinguishable from
> the logical volume magic, so that the device name reflects its nature and
> purpose. "mfid" (or "mrsasd", or whatever you like) would be the logical
> devices, avoiding attaching them to the standard CAM drivers.
>
>
> You could just repurpose the "syspd" configuration in the newer
> cards/firmware versions so that drives marked as "syspd" become perfectly
> transparent pass throughs.
>
> Please consider it, I am sure you will have many happy customers.
>
> (And I hope you endured reading this message until the end!!)
>
>
> Thank you!
>
>
> Borja.
>



