From: Todd Denniston <Todd.Denniston@ssa.crane.navy.mil>
Organization: Code 6067, NSWC Crane
Date: Wed, 17 Aug 2005 09:04:39 -0500
To: Don
Cc: aic7xxx@freebsd.org
Subject: Re: FC3 / aic7xxx / Promise VTrak 12100 RAID woes

Don wrote:
>
> Hi,
>
> From looking through the archives, it doesn't look like I'm the first
> person to have "fun" with Promise RAID systems... although I didn't
> notice any specific discussions about the VTrak.
>
> I have an FC3 stock system (kernel 2.6.9) with an Adaptec 7899 card,
> and it is connected to a Promise VTrak 12100 RAID loaded up with about
> 2.5 TB of disks. The system "works", however it is slow.

You mention sequential "char" write being the really slow test, and I
notice that your Tagged Queuing Depth (TQD) is set to 4 below; I would
think that a higher TQD might help in this situation.

> In digging around I've seen some people claim that the performance of
> SCSI RAID disks can have to do with the card configuration. I have
> checked "dmesg" and noted:
>
> scsi3 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
> aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
> scsi3:A:10:0: DV failed to configure device. Please file a bug report
> against this driver.
> (scsi3:A:10): 160.000MB/s transfers (80.000MHz DT, offset 62, 16bit)
>   Vendor: Promise   Model: 11 Disk RAID5   Rev: V0.0
>   Type:   Direct-Access                    ANSI SCSI revision: 04
> scsi3:A:10:0: Tagged Queuing enabled.  Depth 4
> SCSI device sdc: 2449856640 1024-byte hdwr sectors (2508653 MB)
> SCSI device sdc: drive cache: write through
>  sdc: sdc1 sdc2
> Attached scsi disk sdc at scsi3, channel 0, id 10, lun 0
>
> I've seen this "DV failed" message in this list's archives, but for
> most people it seemed to have more serious consequences, whereas this
> system seems to roll right past the error. I'm not sure if the error
> is relevant or if it's a red herring.

With the RM8000 I have to maintain, yes, it would pass on through the
"DV failed" message, but later on the RM8000 would stop talking (or
listening) on the SCSI bus until the RM8000 was cold booted.
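As a side note (not something you strictly need, and the device names
and SCSI IDs below are just lifted from your dmesg output, so adjust
if yours differ), this is roughly how I double-check what the driver
actually negotiated and what tag depth is currently in effect:

   # negotiated speed / DV / tag depth lines for the Promise target
   dmesg | grep 'scsi3:A:10'
   # what the SCSI mid-layer thinks is attached
   cat /proc/scsi/scsi
   # if your 2.6 kernel exposes it, the current queue depth for sdc
   cat /sys/block/sdc/device/queue_depth

It is a quick way to confirm, after a reboot, whether any module
option changes actually took effect.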
If I was driving the unit with global_tag_depth:212 and DV on in my
environment [1], I would get lockups ~8 hours after a cold boot,
consistently. On the same hardware:
  - leaving DV on and setting the tag depth to 1 [0] would get ~12 hours;
  - turning DV off [2] would get ~24 hours;
  - turning DV off and setting the tag depth to 16 [3] would get ~3 days;
  - turning DV off, setting the tag depth to 224 [4] AND setting the bus
    speed (in the card's BIOS) to 40.000MHz gets 71+ days (at this
    setting I have not had a lockup, but I have had other reasons to
    reboot the Linux system, like power outages).
{If I understood DV correctly, a working DV implementation would
probably also have slowed down the bus speed to the device.}

One other fun variable: these numbers are from after I had gotten the
units into production and they were seeing bursts of use from the ~30
users. A constant load like `badblocks -w` or a full sync from the
other drbd host (all 0.75 TB, at a maximum write speed of ~22 MB/s)
did not cause trouble.

If I understood domain validation [5] correctly, an Ultra160 device is
required to do DV, but it seems the Promise products either a) don't
do it, b) do it wrong, or c) do it differently than the Adaptec
drivers/hardware expect. When I questioned the Promise rep about this
(I thought he was a technical type) while I was having lots of
problems, it seemed to be something he was unaware of and unconcerned
about, even when I pointed him to the page I found describing it [5].

In short, I don't think the Promise equipment likes the DV that
Adaptec does, and because of that disliked DV it MAY become unstable.
You would probably be best off shutting DV off for that device, since
it has already told you it is not working anyway.

>
> I've put a message in to Promise support but am not holding my breath.

Definitely don't hold your breath. It took (at the time I asked my
support questions) ~5 days to 4 weeks for them to even pull your email
out of the IMAP server (yes, I set the return receipts for both the
servers and the clients when dealing with them). The only advice I
ever got from them (other than getting the latest firmware installed,
which I had already done) was to slow down the bus, and I only got
that after CALLING them.

>
> If anyone has any advice, or recommendations on how to track this
> problem down further, I would really appreciate it.

Get a product from a different vendor and compare the
performance/reliability.

> Thanks in advance,
>
> Don

The lines below go in /etc/modules.conf. Note that dv:{0} assumes the
Promise drive is the first drive on the first bus; IIRC, to set it off
for the third drive on the first bus it would be dv:{1,1,0}.

[0] options aic7xxx aic7xxx=verbose.global_tag_depth:1
[1] options aic7xxx aic7xxx=verbose.global_tag_depth:212
[2] options aic7xxx aic7xxx=verbose.global_tag_depth:214.dv:{0}
[3] options aic7xxx aic7xxx=verbose.global_tag_depth:16.dv:{0}
[4] options aic7xxx aic7xxx=verbose.global_tag_depth:224.dv:{0}

A note on the above tag_depth settings: the reason I changed some of
them around (212, 214, 224) was so I could see in the syslog which
line was being used at boot. I also found that after changing the
line, one must rebuild the initrd on an FC1 system for the change to
take effect.
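To pull that together for your FC3 box, here is a rough sketch of what
I would try first. The depth of 64 is just a starting guess, and I am
assuming the Promise array sits behind the first aic7xxx controller
the driver registers; also note that on an FC3 (2.6) system the
options line goes in /etc/modprobe.conf rather than /etc/modules.conf:

   # /etc/modprobe.conf on FC3 (2.6 kernel); /etc/modules.conf on 2.4
   # turn DV off for the first aic7xxx controller and raise the tag depth
   options aic7xxx aic7xxx=verbose.global_tag_depth:64.dv:{0}

   # rebuild the initrd so the option is seen at boot (aic7xxx is
   # loaded from the initrd on Fedora kernels), then reboot:
   mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)

After the reboot, grep dmesg/syslog for the "Tagged Queuing enabled.
Depth" line to confirm the new depth took, and check whether the
"DV failed" message has gone away for that target.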
[5] http://www.storagereview.com/guide2000/ref/hdd/if/scsi/protDomain.html

--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter