Date: Wed, 08 Jul 2009 12:24:17 +0100
From: Ian J Hart <ianjhart@ntlworld.com>
To: Kip Macy <kmacy@freebsd.org>
Cc: freebsd-current@freebsd.org, Ian J Hart <ianjhart@ntlworld.com>
Subject: Re: zpool scrub errors on 3ware 9550SXU
Message-ID: <20090708122417.14619w86w7wfu4ms@10.248.192.16>
In-Reply-To: <3c1674c90907071412t346b1591rfecfae22bb60a8f5@mail.gmail.com>
References: <20090624153442.137934uzyotkb5og@10.248.192.16> <20090707210345.13681mi2dwvan78k@webmail.private.lan> <3c1674c90907071412t346b1591rfecfae22bb60a8f5@mail.gmail.com>

Quoting Kip Macy <kmacy@freebsd.org>:

> Did you answer my question of whether or not this can be reproduced
> on 7-STABLE?

Yes I did, but the threading is a little broken; sorry, that's my fault.

To reiterate: with 7-STABLE circa Jun 25th, scrubs complete okay on the
exact same hardware and v6 zpool that fails under 8.0-BETA1. I'm scrubbing
under 7 every time a run under 8 fails.

A reminder of the setup.

3ware 9550SXU-16
16x 1.5TB Seagate. These drives throw bad sectors!
Two 8-disk raidz2 vdevs combined into one pool. 21.8TB.
Test file system with compression on, copies=2.

I don't think this is a zfs error as such; it looks like the card gives
up, which then spawns a whole series of bogus checksum errors (but what
do I know).

It's odd that it seems to take 40m+ to fail. Offsets are always large.

How can I test for/eliminate any LBA error?

What else might cause the card to fail (after 40m)?

BTW I have to put this into production soon, so I can start testing all
the other stuff which might not work (i.e. samba).

Thanks for your help.

>
> -Kip
>
> On Tue, Jul 7, 2009 at 1:03 PM, Ian J Hart<ianjhart@ntlworld.com> wrote:
>> Quoting ianjhart@ntlworld.com:
>>
>>> Quoting ianjhart@ntlworld.com:
>>>
>>>> Quoting Kip Macy <kmacy@freebsd.org>:
>>>>
>>>>>>
>>>>>> As usual scrubs cleanly on 7.2. Started throwing errors within a few
>>>>>> minutes under 8. Then it panicked, possibly due to scrub -s.
>>>>>>
>>>>>> It's sat at the DB prompt if there's anything I can do. I'll need
>>>>>> idiot's guide level instruction. I have a screen dump if someone
>>>>>> wants to step up. Off list?
>>>>>>
>>>>>> Highlight seems to be...
>>>>>>
>>>>>> Memory modified after free 0xffffff0004da0c00(248) val=3000000 @
>>>>>> 0xffffff0004dc00
>>>>>> Panic: most recently used by none
>>>>>
>>>>> Can you test with recent 7-STABLE? That would tell me whether or not
>>>>> you're hitting a general HEAD issue or problems with the v13 import.
>>>>
>>>> It's doing a scrub under 7.2 following another failed test. I'll pull it
>>>> up to stable after that.
>>>>
>>>> Have more data; will post that once I've done a couple of jobs.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Kip
>>>
>>> Here's that extra data.
>>>
>>> Updated 3ware/AMCC card firmware.
>>>
>>> Enable onboard SATA and fit a 300GB SATA disk. Remove the floppy and fit a
>>> second 300GB SATA disk.
>>>
>>> Remove the two 500GB disks and replace with 1.5TB units. I can now create
>>> two 8 disk raidz2 giving the same 12 disks worth of storage I had with one
>>> 14 disk raidz2.
>>>
>>> Reinstall the two O/S on the 300GB drives.
>>>
>>> <slight tangent>
>>> May be of use to someone, so bear with me.
>>>
>>> Reset to BIOS defaults. Some issues! Disabling sound helps.
>>>
>>> Now suspect motherboard BIOS may be part of the problem. Removed both
>>> cards and tested each version in turn.
>>>
>>> ref: http://www.tyan.com.tw/support_download_bios.aspx?model=S.S2895
>>>
>>> Started with 1.04, ended up with 1.04. Later versions detect the internal
>>> SATA disks as 150 not 300. Most versions lock the keyboard (KVM) when
>>> legacy USB is enabled. That's a PITA when you've just taken the floppy
>>> disk out. No internal SATA disk settings. Be nice to check the geometry
>>> as 7 and 8 sysinstall seem to be behaving differently.
>>>
>>> With the cards back in.
>>>
>>> Add an ATA disk and CDROM while testing. Easyboot order is SATA0 ATA0
>>> SATA1. Fdisk the so far blank ATA disk :)
>>>
>>> On board audio clashes with something. BIOS 1.03 and later supports 16
>>> SCSI boot devices. I disabled booting from the RAID card to allow the
>>> onboard SATA drives to boot.
>>>
>>> Out of space for option ROM error has gone.
>>>
>>> AFAIK CPUs are late enough to support DDR400. Check anyway. Clock down to
>>> 333MHz. Still fails.
>>>
>>> </slight tangent>
>>>
>>> There's one last thing, this BIOS (1.04) does not supply the fix for AMD
>>> errata 169.
>>> Later BIOS versions incorrectly detect the onboard SATA disks.
>>>
>>> Northbridge System Request Queue may stall.
>>>
>>> ref:
>>> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf
>>>
>>> We don't seem to have /dev/msr. Could I fix this using (the shiny new)
>>> cpucontrol?
>>>
>>> Thanks
>>>
>>> ----------------------------------------------------------------
>>> This message was sent using IMP, the Internet Messaging Program.
>>>
>>> _______________________________________________
>>> freebsd-current@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>>>
>>
>> FWIW this is still reproducible with 8.0-BETA1.
>>
>> --
>> ian j hart
>>
>
> --
> When bad men combine, the good must associate; else they will fall one
> by one, an unpitied sacrifice in a contemptible struggle.
>
> Edmund Burke
>

--
ian j hart
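
On the LBA question in the message above: one quick sanity check is plain arithmetic on where the common sector-addressing limits fall relative to a 1.5TB drive. The sketch below is illustrative only. The 512-byte sector size, the nominal drive size and the helper names are assumptions, not values taken from the thread, and it neither touches the hardware nor shows that an LBA bug exists.

```python
# Assumption: 512-byte logical sectors (not stated in the thread).
SECTOR_BYTES = 512

# Sector-count limits that commonly matter for addressing bugs.
LBA_LIMITS = {
    "28-bit LBA": 2**28,
    "signed 32-bit": 2**31,
    "unsigned 32-bit": 2**32,
}


def limit_in_bytes(name, sector_bytes=SECTOR_BYTES):
    """Byte offset at which the named sector-count limit is reached."""
    return LBA_LIMITS[name] * sector_bytes


def limits_crossed(byte_offset, sector_bytes=SECTOR_BYTES):
    """Names of the limits that the sector holding byte_offset has reached."""
    lba = byte_offset // sector_bytes
    return [name for name, limit in LBA_LIMITS.items() if lba >= limit]


if __name__ == "__main__":
    # Assumption: nominal 1.5 TB drive in decimal bytes.
    disk_bytes = 1_500_000_000_000
    for name in LBA_LIMITS:
        print(f"{name:16s} boundary at {limit_in_bytes(name):>16,d} bytes")
    print("last byte of disk crosses:", limits_crossed(disk_bytes - 1))
```

Under these assumptions a 1.5TB drive extends past the 2^31-sector mark (1 TiB) but not the 2^32-sector mark (2 TiB), so failing offsets could be compared against the 1 TiB figure. Offsets reported by ZFS may not map directly onto per-disk LBAs, so treat any match as a hint rather than a diagnosis.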