Date: Mon, 07 Nov 2011 10:36:15 -0500
From: Douglas Gilbert <dgilbert@interlog.com>
To: Rich <rercola@acm.jhu.edu>
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>, "Kenneth D. Merry" <ken@freebsd.org>, "fs@freebsd.org" <fs@freebsd.org>, Karli Sjöberg <Karli.Sjoberg@slu.se>
Subject: Re: AOC-USAS2-L8i zfs panics and SCSI errors in messages
Message-ID: <4EB7FAEF.30505@interlog.com>
In-Reply-To: <CAOeNLuqFuA-Ewfj0xyNmfGdbznsoRAYb6GNgGDzN8PtPck0yUw@mail.gmail.com>
References: <82B38DBF-DD3A-46CD-93F6-02CDB6506E05@slu.se> <20111025193302.GA30409@nargothrond.kdm.org> <B4D81944-39F5-4053-ACBA-78EBB7DD70EB@slu.se> <20111026101602.GA9768@icarus.home.lan> <75BDE9FA-6130-4BB4-8518-275D68BB3E49@slu.se> <CAOeNLuqFuA-Ewfj0xyNmfGdbznsoRAYb6GNgGDzN8PtPck0yUw@mail.gmail.com>
On 11-11-07 03:56 AM, Rich wrote:
> Observation - the LSI SAS expanders, in my experience, sometimes
> misbehave when there are drives which respond slower than some timeout
> to commands (as far as I've seen it's only SATA drives it does this
> for, but I don't have many SAS drives for comparison), leading to all
> further commands to that drive for a bit not working, and then what
> happens depending on the OS varies dramatically.
>
> If you could try without an expander (e.g. with 1->4 SAS->SATA fanout
> cables), you may be surprised (and/or annoyed) to find your life gets
> better.

SAS-2 expanders are better than the original generation. [LSI makes
both.] SAS-2 added the CONFIGURE GENERAL SMP function which contains
various timeout tweaks for the STP protocol (i.e. the protocol that
tunnels (S)ATA commands between a SAS HBA (initiator) and an expander).

If you are using SAS-2 expanders and FreeBSD 9.0 then you can fetch my
smp_utils package and use the smp_conf_general utility to change those
timeout settings. If you have SAS-2 expanders but an older version of
FreeBSD then you will need Solaris or Linux to run my smp_utils package
in order to change those timeout values on the expander.

Doug Gilbert

BTW smp_rep_general will show the current settings of those STP
timeouts.

> On Mon, Nov 7, 2011 at 3:48 AM, Karli Sjöberg<Karli.Sjoberg@slu.se> wrote:
>> As a test, I have copied in about 1.5TB and scrubbed several times
>> without any panic. It stayed solid until periodic weekly:( Same panic
>> as with daily.
>>
>> /Karli Sjöberg
>>
>> On 26 Oct 2011, at 12:16, Jeremy Chadwick wrote:
>>
>> On Wed, Oct 26, 2011 at 11:36:44AM +0200, Karli Sjöberg wrote:
>> Hi all,
>>
>> I tracked down what causes the panics!
>>
>> I got a tip from aragon and phoenix at the forum about
>> /etc/periodic/security/100.chksetuid
>>
>> And to put:
>> daily_status_security_chksetuid_enable="NO"
>> into /etc/periodic.conf
>>
>> This is not truly the cause of the panic, it simply exacerbates it.
>>
>> Many of the periodic scripts will do things like iterate over all files
>> on the filesystem looking for specific attributes, etc. This tends to
>> stress filesystems heavily. This isn't the only one. :-)
>>
>> I can now run periodic daily without any panics. I'm still wondering
>> about the cause of this, the explanation from the forum was that that
>> phase is too demanding for multi TB systems. But I have several multi
>> TB servers with FreeBSD and ZFS, and none of them has ever behaved
>> this way. Besides, the panic is instantaneous, not degenerative. I
>> imagine that a run like that would start out OK and then just get
>> worse and worse, getting gradually slower and slower until it just
>> wouldn't cope any more and hang. This feels more like hitting a wall.
>> As if it found something that it couldn't deal with and has no choice
>> but to panic immediately.
>>
>> It may be possible that you have some underlying filesystem corruption
>> that triggers this situation. Have you actually tried doing a "zpool
>> scrub" of your pools and seeing if any errors happen or if the panic
>> occurs there?
>>
>> I'm inclined to think what you're experiencing is probably a bug or
>> "quirk" in the storage controller driver you're using. There are other
>> drivers that have had fixes applied to them "to make them work decently
>> with ZFS", meaning the kind of stressful I/O ZFS puts on them results in
>> the controller driver behaving oddly or freaking out, case in point. It
>> could also be a controller firmware bug/quirk/design issue. Seriously.
>>
>> I believe the AOC-USAS2-L8i controller has been discussed on
>> freebsd-stable, re: mps(4) driver problems or equivalent, but I'm not
>> going to CC that list given that there would be 3 cross-posted lists
>> involved and that is liable to upset some folks. You should search the
>> mailing lists for discussion of Supermicro controllers that work
>> reliably with FreeBSD.
>>
>> It would be worthwhile to discuss this condition on -stable, mainly with
>> something like "Anyone else using the AOC-USAS2-L8i reliably with ZFS?"
>> You get the idea.
>>
>> --
>> | Jeremy Chadwick                jdc at parodius.com<http://parodius.com> |
>> | Parodius Networking                       http://www.parodius.com/ |
>> | UNIX Systems Administrator                   Mountain View, CA, US |
>> | Making life hard for others since 1977.               PGP 4BD6C0CB |
>>
>>
>>
>>
>> Best regards
>> -------------------------------------------------------------------------------
>> Karli Sjöberg
>> Swedish University of Agricultural Sciences
>> Box 7079 (Visiting Address Kronåsvägen 8)
>> S-750 07 Uppsala, Sweden
>> Phone: +46-(0)18-67 15 66
>> karli.sjoberg@slu.se<mailto:karli.sjoberg@adm.slu.se>
>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>>
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>
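
The smp_utils steps described in the reply (inspect the STP timeouts
with smp_rep_general, then adjust them with smp_conf_general) could be
sketched roughly as below. This is a dry-run sketch and not part of the
original thread: it only builds and prints command lines, it does not
run them. The device path /dev/ses0 is a hypothetical placeholder, the
--inactivity option name is taken from my recollection of the
smp_conf_general man page and should be checked against
"smp_conf_general --help", and the value 5000 is an arbitrary
placeholder, not a recommended setting.

```python
import shlex


def smp_rep_general_cmd(device):
    """Build the command that reports the expander's general settings,
    which include the current STP timeout values."""
    return ["smp_rep_general", device]


def smp_conf_general_cmd(device, inactivity=None):
    """Build (but do not run) an smp_conf_general command line.

    ASSUMPTION: '--inactivity' is the option for the STP bus inactivity
    time limit; verify the name and its units in the man page first.
    """
    argv = ["smp_conf_general"]
    if inactivity is not None:
        argv.append(f"--inactivity={inactivity}")
    argv.append(device)
    return argv


if __name__ == "__main__":
    dev = "/dev/ses0"  # hypothetical expander device node
    # Step 1: look at the current settings before changing anything.
    print(shlex.join(smp_rep_general_cmd(dev)))
    # Step 2: propose a new timeout (placeholder value, not advice).
    print(shlex.join(smp_conf_general_cmd(dev, inactivity=5000)))
```

Printing the commands rather than executing them keeps the sketch safe
to run on a machine without an expander; the printed lines can be
reviewed and then run by hand as root once the options are confirmed.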