Date:      Sun, 19 Feb 2012 12:10:39 -0800
From:      Artem Belevich <art@freebsd.org>
To:        Ask Bjørn Hansen <ask@develooper.com>
Cc:        freebsd-stable@freebsd.org, freebsd-zfs@freebsd.org
Subject:   Re: Can't read a full block, only got 8193 bytes.
Message-ID:  <CAFqOu6iLe=a9Xi9qi2rsLz_KL_P2jjK1BZKUu0dBPnzj-9ET-Q@mail.gmail.com>
In-Reply-To: <770EEEFF-B41D-4851-AD74-C3F96FFB1683@develooper.com>
References:  <770EEEFF-B41D-4851-AD74-C3F96FFB1683@develooper.com>

On Sat, Feb 18, 2012 at 10:10 PM, Ask Bjørn Hansen <ask@develooper.com> wrote:
> Hi everyone,
>
> We're recycling an old database server with room for 16 disks as a
> backup server (our old database servers had 12-20 15k disks; the new
> ones one or two SSDs and they're faster).
>
> We have a box running FreeBSD 8.2 with 7 disks in a ZFS raidz2 (and a
> spare). It's using an older 3ware card with all the disks (2TB WD green
> "ears" ones) setup as a "single" unit on the 3ware controller and though
> slow is basically working great. We have a small program to smartly
> purge old snapshots that I wrote after a year and tens of thousands of
> snapshots: https://github.com/abh/zfs-snapshot-cleaner
>
> The new box is running 9.0 with a 3ware 9690SA-4I4E card with the
> latest firmware (4.10.00.024). We're using Seagate 3TB barracuda disks
> (big and cheap; good for backups).
>
> Now for the problem: When running bonnie++ we get a few ZFS checksum
> errors and (weirder) we get this error from bonnie:
>
> "Can't read a full block, only got 8193 bytes."

That's probably just a side effect of the ZFS checksum errors. ZFS will
happily read the file until it hits a record with a checksum error. If
redundant info is available (raidz or mirror), ZFS will attempt to
recover your data. If there's no redundancy, you will get a read error.
If you run "zpool status -v" you should see a list of the files affected
by corruption.
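
To make the read behaviour concrete, here is a rough toy model in
Python. It is only a sketch and not how ZFS is implemented: the names
(read_file, make_record, ChecksumError) are made up for illustration,
zlib.crc32 merely stands in for whatever checksum ZFS really uses, and
a plain second copy of the records stands in for raidz/mirror
redundancy.

```python
import zlib


class ChecksumError(Exception):
    """Hypothetical error: a record failed verification and no good copy exists."""

    def __init__(self, index, bytes_read):
        super().__init__(
            f"record {index} failed its checksum after {bytes_read} good bytes"
        )
        self.bytes_read = bytes_read


def make_record(data):
    """Store the data together with the checksum computed at write time."""
    return (data, zlib.crc32(data))


def verified(record):
    """Recompute the checksum and compare it with the stored one."""
    data, stored = record
    return zlib.crc32(data) == stored


def read_file(records, mirror=None):
    """Read records in order; fall back to the mirror copy on a bad checksum.

    Raises ChecksumError at the first record that cannot be recovered,
    reporting how many bytes were read successfully before it.
    """
    out = bytearray()
    for i, rec in enumerate(records):
        if not verified(rec):
            if mirror is None or not verified(mirror[i]):
                raise ChecksumError(i, len(out))
            rec = mirror[i]  # redundant copy is good, use it instead
        out += rec[0]
    return bytes(out)


# Four 8-byte records, then flip one bit in record 2 of the "damaged" copy
# while leaving its stored checksum untouched.
intact = [make_record(bytes([n]) * 8) for n in range(4)]
damaged = list(intact)
data, stored = intact[2]
damaged[2] = (bytes([data[0] ^ 1]) + data[1:], stored)
```

In this model, read_file(damaged, mirror=intact) returns all 32 bytes,
while read_file(damaged) raises ChecksumError after 16 good bytes. That
is the same shape as what you describe: a short read on a single disk,
and only a noted checksum error on raidz.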

>
> This seems to only be when testing a single ZFS disk or a UFS
> partition. Testing a raidz1 we just get checksum errors noted in zpool
> status, but no errors reading (though read speeds are ~10MB/second
> across four disks -- writing sequentially was ~230MB/second).
>
> Any ideas where to start looking?

You need to figure out why you're getting checksum errors. Alas,
there's probably no easy way to troubleshoot it. The issue could be
hardware related, and possible culprits include bad RAM, bad SATA
cables, or quirks of a particular firmware revision on the disk
controller and/or hard drive.

> Our best guess is that the 3ware controller can't play nicely with the
> disks; we're planning to try some older/smaller disks on Monday and
> then trying the same system and disks with Linux to see if the 3ware
> driver there works differently.

--Artem


