Date: Thu, 10 May 2007 17:14:01 -0600
From: Scott Long <scottl@samsco.org>
To: David Wolfskill <david@catwhisker.org>, stable@freebsd.org
Subject: Re: 6.2-R on Dell Poweredge 2950 with Dell PERC 5/i [mfi(4)]
Message-ID: <4643A739.3080601@samsco.org>
In-Reply-To: <20070510200211.GM64542@bunrab.catwhisker.org>
References: <20070510200211.GM64542@bunrab.catwhisker.org>

David Wolfskill wrote:
> From a quick look in the lists, I get the impression that the Dell PERC
> 5/i may be a bit problematic.  Since I hadn't any plans on using that
> hardware, though, I've paid more attention to other things.
>

Not sure that this impression is entirely accurate.  The biggest problem
with MFI machines is online RAID management.  The storage driver itself
matured very quickly and has been very reliable.

> Well, now a colleague is trying to run 6.2-R on one of these 2950s; dmesg
> says the controller is:
>
> mfi0: <Dell PERC 5/i> mem 0xd80f0000-0xd80fffff,0xfc4e0000-0xfc4fffff irq 78 at device 14.0 on pci2
> mfi0: 817 (224963336s/0x0020/0) - Shutdown command received from host
> mfi0: 818 (4278190080s/0x0020/0) - PCI 0x041028 0x0415 0x041028 0x041f03: Firmware initialization started (PCI ID 0015/1028/1f03/1028)
> mfi0: 819 (4278190080s/0x0020/0) - Type 18: Firmware version 1.00.02-0157
> mfi0: 820 (4278190096s/0x0008/0) - Battery Present
> mfi0: 821 (4278190124s/0x0004/0) - PD 08(e1/s255) event: Enclosure (SES) discovered on PD 08(e1/s255)
> mfi0: 822 (4278190124s/0x0002/0) - PD 08(e1/s255) event: Inserted: PD 08(e1/s255)
> mfi0: 823 (4278190124s/0x0002/0) - Type 29: Inserted: PD 08(e1/s255) Info: enclPd=08, scsiType=d, portMap=00, sasAddr=500180b04413ce00,0000000000000000
> mfi0: 824 (4278190124s/0x0002/0) - PD 00(e1/s0) event: Inserted: PD 00(e1/s0)
> mfi0: 825 (4278190124s/0x0002/0) - Type 29: Inserted: PD 00(e1/s0) Info: enclPd=08, scsiType=0, portMap=01, sasAddr=50010b900046038e,0000000000000000
> mfi0: 826 (4278190124s/0x0002/0) - PD 01(e1/s1) event: Inserted: PD 01(e1/s1)
> mfi0: 827 (4278190124s/0x0002/0) - Type 29: Inserted: PD 01(e1/s1) Info: enclPd=08, scsiType=0, portMap=02, sasAddr=50010b9000460376,0000000000000000
> mfi0: 828 (4278190124s/0x0002/0) - PD 02(e1/s2) event: Inserted: PD 02(e1/s2)
> mfi0: 829 (4278190124s/0x0002/0) - Type 29: Inserted: PD 02(e1/s2) Info: enclPd=08, scsiType=0, portMap=04, sasAddr=50010b900046035a,0000000000000000
> mfi0: 830 (4278190124s/0x0002/0) - PD 03(e1/s3) event: Inserted: PD 03(e1/s3)
> mfi0: 831 (4278190124s/0x0002/0) - Type 29: Inserted: PD 03(e1/s3) Info: enclPd=08, scsiType=0, portMap=08, sasAddr=50010b90004603be,0000000000000000
> mfi0: 832 (4278190124s/0x0002/0) - PD 04(e1/s4) event: Inserted: PD 04(e1/s4)
> mfi0: 833 (4278190124s/0x0002/0) - Type 29: Inserted: PD 04(e1/s4) Info: enclPd=08, scsiType=0, portMap=10, sasAddr=50010b900045f6d6,0000000000000000
> mfi0: 834 (4278190124s/0x0002/0) - PD 05(e1/s5) event: Inserted: PD 05(e1/s5)
> mfi0: 835 (4278190124s/0x0002/0) - Type 29: Inserted: PD 05(e1/s5) Info: enclPd=08, scsiType=0, portMap=20, sasAddr=50010b9000460246,0000000000000000
> mfi0: 836 (224964238s/0x0020/0) - Adapter ticks 224964238 elapsed 45s: Time established as 02/16/07 18:03:58; (45 seconds since power on)
>
> and the disks looks like:
>
> mfid0: <MFI Logical Disk> on mfi0
> mfid0: 418176MB (856424448 sectors) RAID volume '' is optimal
>

Looks A OK to me.

>
> The intended production workload involves creation and deletion of
> a large number of files rather rapidly.
>
> I recalled that for the first year or two with Soft Updates, there
> were problems with that kind of workload, such that there was enough
> hysteresis in making free blocks actually available for subsequent
> allocation that processes that were trying to write to new blocks
> on such file systems would often fail, reporting ENOSPC.  Un-mounting
> and re-mounting the file system would clean things up, but that
> doesn't tend to be a viable approach for keeping a long-running
> application happy. :-}
>

sysctl vfs.ffs.doasyncfree=0 might help.  Running the syncer more
frequently might also help, but I don't recall the sysctl node for that.
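
One way to watch for the delayed release of free blocks described above is
to compare the file system's free-space figure before and after a burst of
creates and deletes. The following is only a minimal sketch, assuming Python
is available on the test box; free_bytes() and churn() are hypothetical
helper names for illustration, not part of any FreeBSD tool.

```python
import os
import tempfile


def free_bytes(path):
    """Bytes available to unprivileged users on the file system holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def churn(directory, count=100, size=4096):
    """Create and then delete `count` files of `size` bytes each.

    Returns (free_before, free_after) so the caller can see whether the
    space freed by the deletes has become available again.
    """
    before = free_bytes(directory)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"churn-{i}.tmp")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
        paths.append(path)
    for path in paths:
        os.unlink(path)
    return before, free_bytes(directory)


if __name__ == "__main__":
    # Hypothetical usage: run against a scratch directory on the file
    # system under test and print the two free-space readings.
    with tempfile.TemporaryDirectory() as scratch:
        print(churn(scratch))
```

If the second figure stays well below the first across repeated runs, that
would suggest freed blocks are not being returned promptly, though other
activity on the same file system can also move the numbers.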

> I reminded my colleague of this, since she also reported that an
> un-mount/re-mount sequence caused a lot of free space to show up
> on the file system in question, and she responded that she had been
> aware of this, and had been turning off Soft Updates on the file
> systems for the application in question, but she had forgotten that
> Soft Updates was on by default when she set up this (test) system.
>
> She then turned off Soft Updates and started the test workload again.
> And instead of failing with ENOSPC after 3 days, it only took 2.

Very strange.  No chance that it was due to files that were deleted but
still referenced by open apps?

>
> Hmmm... well; that wasn't exactly what I had expected.
>
> Any hints, here?  The machine is running the i386 arch, with a pair of
> dual-core 2.33GHz Xeons.
>
> I have a recent dmesg.boot, but I'd rather keep list messages fairly
> short.
>
> We have a local private mirror of the FreeBSD CVS repository, so we have
> some flexibility in what we can do for testing, but the objective is to
> put the box in production -- and I'd rather not run CURRENT as part of a
> customer-visible production workload. :-}  [My laptop is a different
> matter, of course....]
>

This sounds purely like a filesystem issue, not an MFI driver issue.

Scott
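
The question above about deleted files still referenced by open
applications can be illustrated with a short sketch. On POSIX systems,
unlinking a file removes its name, but its blocks stay allocated until the
last open descriptor is closed. The helper name unlinked_but_open() is
hypothetical and only for illustration.

```python
import os
import tempfile


def unlinked_but_open(directory, size=1 << 20):
    """Write `size` bytes, unlink the file while it is still open, and
    return (link_count, size_in_bytes) as seen through the open descriptor."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        # The name goes away here, but the data is still held by `f`.
        os.unlink(path)
        st = os.fstat(f.fileno())
        return st.st_nlink, st.st_size
    # Leaving the `with` block closes the descriptor; only then can the
    # file system reclaim the blocks.
```

A long-running process that keeps many such descriptors open could
plausibly run a file system into ENOSPC even though the directory tree
looks nearly empty; whether that is what happened here is not established
in this thread.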