Date: Wed, 25 Mar 2009 13:49:07 +0000 (GMT)
From: "Mark Powell" <M.S.Powell@salford.ac.uk>
To: Alexander Leidinger <Alexander@Leidinger.net>
Cc: kevin <kevinxlinuz@163.com>, FreeBSD Current <freebsd-current@freebsd.org>,
    Daniel Eriksson <daniel@toomuchdata.com>
Subject: Re: Apparently spurious ZFS CRC errors (was Re: ZFS data error without reasons)
Message-ID: <20090325125930.U73916@rust.salford.ac.uk>
In-Reply-To: <20090325135528.21416hzpozpjst8g@webmail.leidinger.net>
References: <49BD117B.2080706@163.com> <4F9C9299A10AE74E89EA580D14AA10A635E68A@royal64.emp.zapto.org> <49BE4EC1.90207@163.com> <20090320102824.W75873@rust.salford.ac.uk> <20090320152737.D641@rust.salford.ac.uk> <20090325105613.55624rkkgf2xkr6s@webmail.leidinger.net> <20090325103721.G67233@rust.salford.ac.uk> <20090325135528.21416hzpozpjst8g@webmail.leidinger.net>
On Wed, 25 Mar 2009, Alexander Leidinger wrote:
>> Can prefetch really cause these problems? And if so why?
>
> I don't think so. I missed the part where you explained this before. In
> this case it's really the write cache. The interesting question is whether
> this is because of the hard disks you use, or because of a bug in the
> software.
>
> You run a very recent current? 1-2 weeks ago there was a bug (not in
> ZFS) which caused CRC errors, but it was fixed shortly after it was
> noticed. If you haven't updated your system, it may be best to update it
> and try again. Please report back.
I'm running recent current. I too saw that there were bugs causing CRC
errors, and hoped that the relevant fixes would help me out. Unfortunately
not.
I most recently remade the whole array with current from last Thursday,
19th March.
I tried it with WC disabled, but performance is awful. Obviously I expected
it to be a little worse, but I assumed the difference wouldn't be noticeable
without benchmarks. Well, restoring my 1st LTO2 200GB tape (which should
take 1h45-2h), after 3h30 it was only about halfway through the tape, so I
gave up, hoping, possibly in vain, that it was a ZFS option causing the
issue.
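For reference, this is roughly how I toggled the write cache; on FreeBSD the
legacy ata(4) driver's write cache is a boot-time loader tunable rather than
a runtime setting (the commands below are a sketch of that procedure):

```shell
# Check the ata(4) write-cache setting:
# 1 = write caching enabled (default), 0 = disabled.
sysctl hw.ata.wc

# The tunable cannot be changed at runtime; to disable write caching,
# add the following line to /boot/loader.conf and reboot:
#   hw.ata.wc="0"
```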
The drives in question are:
ad24  Device Model: WDC WD10EADS-00L5B1
ad22  Device Model: WDC WD10EADS-00L5B1
ad20  Device Model: WDC WD10EADS-00L5B1
ad18  Device Model: WDC WD10EADS-00L5B1
ad16  Device Model: WDC WD10EADS-00L5B1
ad14  Device Model: WDC WD10EADS-00L5B1
ad10  Device Model: WDC WD5000AAKS-22TMA0
ad8   Device Model: WDC WD5000AAKS-65TMA0
The WD5000AAKS were used for around 18 months in the previous 9x500GB
RAIDZ2 on FreeBSD 7, so I would expect them to be ok.
I've had the WD10EADS for about 2 months. However, I did replace drives in
the old 9x500GB RAIDZ2 with each of the new drives to check they were ok,
resilvering them one at a time into the array, i.e. eventually I was
running 3x500GB + 6x1TB in the still logically 9x500GB RAIDZ2. Yes, that
would only exercise the lower 500GB of each 1TB drive, but surely that's
enough of a test?
AFAICT, I had WC off in 7 though.
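For the record, each of those burn-in replacements was along these lines
(pool and device names here are illustrative, not the exact commands I ran):

```shell
# Swap a new 1TB drive in for an existing 500GB member and let ZFS
# resilver onto it; repeat for each new drive in turn.
zpool replace pool ad8 ad14

# Watch progress and wait for the resilver to complete before
# moving on to the next drive.
zpool status pool
```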
On my most recent failure I do see:
-----
# zpool status -v
  pool: pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub in progress, 40.02% done, 3h54m to go
config:

        NAME             STATE     READ WRITE CKSUM
        pool             ONLINE       0     0    42
          raidz2         ONLINE       0     0    42
            stripe/str0  ONLINE       0     0     0
            ad14         ONLINE       0     0     4
            ad16         ONLINE       0     0     2
            ad18         ONLINE       0     0     3
            ad20         ONLINE       0     0     7
            ad22         ONLINE       0     0     4
            ad24         ONLINE       0     0     5
-----
i.e. no errors on the 2x500GB stripe. That would seem to suggest firmware
write caching bugs on the 1TB drives. However, my other error report had:
-----
  pool: pool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h51m with 0 errors on Fri Mar 20 10:57:18 2009
config:

        NAME             STATE     READ WRITE CKSUM
        pool             ONLINE       0     0     0
          raidz2         ONLINE       0     0    23
            stripe/str0  ONLINE       0     0   489  12.3M repaired
            ad14         ONLINE       0     0   786  19.7M repaired
            ad16         ONLINE       0     0   804  20.1M repaired
            ad18         ONLINE       0     0   754  18.8M repaired
            ad20         ONLINE       0     0   771  19.3M repaired
            ad22         ONLINE       0     0   808  20.2M repaired
            ad24         ONLINE       0     0   848  21.2M repaired

errors: No known data errors
-----
i.e. errors on the stripe, but the stripe error count seems to be just
over half that of a 1TB drive. If the errors were spread evenly per
physical drive, one would expect the stripe, being two drives, to show 2x
the CRC errors of a single 1TB drive?
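The arithmetic behind that, using the CKSUM counts from the second report
above (just a quick sanity check):

```shell
# Mean CKSUM count across the six 1TB drives, and the ratio of the
# 2x500GB stripe's count (489) to that mean.
echo "786 804 754 771 808 848" | awk '
{
    sum = 0
    for (i = 1; i <= NF; i++)
        sum += $i
    mean = sum / NF
    printf "mean per 1TB drive: %.1f\n", mean
    printf "stripe/mean ratio:  %.2f\n", 489 / mean
}'
# mean per 1TB drive: 795.2
# stripe/mean ratio:  0.61
```

So the stripe sits at roughly 0.6x a single 1TB drive, not the ~2x one
would expect if the errors scaled with the number of physical drives.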
>>> If you want to get more out of zfs, maybe vfs.zfs.vdev.max_pending could
>>> help if you are using SATA (as I read the zfs tuning guide, it makes sense
>>> to have a high value when you have command queueing, which we have with
>>> SCSI drives, but not yet with SATA drives and probably not at all with
>>> PATA drives).
>>
>> I'm running completely SATA with NCQ supporting drives. However, and
>> possibly as you say, NCQ is not really/properly supported in FBSD?
>
> NCQ is not supported yet in FreeBSD. Alexander Motin said he is interested in
> implementing it, but I don't know about the status of this.
Ok. So vfs.zfs.vdev.max_pending is irrelevant for SATA currently?
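If/when it does become useful, the knob is just a sysctl; a sketch (the
value 4 below is an arbitrary example for drives without working command
queueing, not a recommended setting):

```shell
# Inspect the current per-vdev queue-depth cap.
sysctl vfs.zfs.vdev.max_pending

# Lower it at runtime; persist the value via /etc/sysctl.conf
# if it helps.
sysctl vfs.zfs.vdev.max_pending=4
```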
Cheers.
--
Mark Powell - UNIX System Administrator - The University of Salford
Information & Learning Services, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 6843 Fax: +44 161 295 5888 www.pgp.com for PGP key
