Date: Mon, 11 Aug 1997 01:25:01 -0500 From: Doug Ledford <dledford@dialnet.net> To: welbon@bga.com Cc: Doug Ledford <dledford@dialnet.net>, "Daniel M. Eischen" <deischen@iworks.InterWorks.org>, aic7xxx@FreeBSD.ORG Subject: Re: aic7xxx / AHA2940 worries... anyone? Message-ID: <199708110625.BAA20205@dledford.dialnet.net> In-Reply-To: Your message of "Mon, 11 Aug 1997 01:00:41 CDT." <Pine.LNX.3.96.970811002603.272H-200000@tarantula.arachnid.com>
--------
> One thing I did not make clear was that the crashes occur more frequently
> when the file sizes are larger and also occur more frequently when the CPU
> speed increases.  At a 2GB file size, a single pass of Bonnie often won't
> complete.  This evening I have been running Bonnie in the background on
> 2GB files with Symbios 875 based controllers, so I don't think it is my
> drives, except that the 875 will not queue deeper than 12 commands.

This goes in line with what I was talking about.  With a larger file size,
you increase the statistical probability that a command will get lost
somewhere in that operation.  With faster CPU speeds, you can create/destroy
SCSI blocks and commands faster (meaning we put stuff out to the cards
faster, and they correspondingly respond with command completes faster).
Note the various seemingly CPU bound operations in the bonnie tests.  These
operations will run faster and have a higher likelihood of outrunning the
CPU when you up the CPU speed.  This seems backwards, but when you consider
the following, maybe not:

1: At a faster CPU speed, we can create more read/write requests in a
   given time slice.
2: Certain operations are nearly always fixed length, regardless of CPU
   speed (these include bus accesses on devices other than our own aic7xxx
   devices, such as the regular timer interrupt accesses, etc.).
3: Windows exist in the code during which interrupts are turned off while
   the CPU reads and writes to other devices on the bus.  These windows
   change based on a combination of CPU speed and the percentage of the
   operation that is CPU bound vs. bus bound.
4: For windows that have a high bus/CPU bounding ratio, the time we spend
   at an interrupts-off level can be held nearly constant with increasing
   CPU speed.
5: During those same windows, we have previously sent out commands to the
   drives at a somewhat faster rate due to CPU speed increases.  These
   commands can then be returning at a comparably faster rate, depending on
   drive capability and the luck of the draw in regards to the drive cache
   and controller firmware.
6: It is entirely possible that during any such fixed-length window, faster
   CPU speeds can result in more commands returning complete, increasing
   the risk that the aic7xxx controller's QOUTFIFO might overflow while the
   card can not be serviced.

> > > driver firmware, it may just be that under the kind of load 9 drives can
> > > create on a controller, we are losing commands and getting hosed.
>
> I have three controllers with three drives per controller.  The mdadd is
> set up such that in sequential accesses each next card is accessed.  I
> did this thinking that it would distribute the load across the controllers
> a little better.  There are cases where it is faster.  The mdadd is:
>
> /sbin/mdadd /dev/md0 /dev/sdb2 /dev/sde2 /dev/sdh2 \
>                      /dev/sdc2 /dev/sdf2 /dev/sdi2 \
>                      /dev/sdd2 /dev/sdg2 /dev/sdj2
> /sbin/mdrun -p0 -c64k /dev/md0
>
> disks sdb, sdc and sdd are on scsi2.
> disks sde, sdf and sdg are on scsi3.
> disks sdh, sdi and sdj are on scsi4.
>
> So in a sequential access the code hops from controller to controller.  It
> makes a measurable difference when I read from cylinders near the outer
> edge.  I don't have a direct comparison to a case without the interleaving
> at my fingertips, but check out the block read and block write rates in
> the attachment.  Not too shabby IMHO.
In my experience, these types of things will help during testing, but once
you get into a production environment with lots of file reads and writes,
and a reasonably used filesystem that isn't almost all free space, these
advantages quickly disappear, as the reads and writes will start to disperse
amongst the various drives on their own.  A news server is a good example of
this: over time the filesystem has been written to and read from enough that
any given file is randomly located on the device, so any attempt at ordering
the drives for optimization doesn't really gain anything (although, by the
same token, any given ordering of the drives isn't going to hurt things any
either).

--
*****************************************************************************
* Doug Ledford                      * Unix, Novell, Dos, Windows 3.x,       *
*   dledford@dialnet.net  873-DIAL  * WfW, Windows 95 & NT Technician       *
*   PPP access $14.95/month         *****************************************
*   Springfield, MO and surrounding * Usenet news, e-mail and shell account.*
*   communities.  Sign-up online at * Web page creation and hosting, other  *
*   873-9000 V.34                   * services available, call for info.    *
*****************************************************************************