Date: Sun, 24 Jan 2010 00:51:13 -0500
From: jhell <jhell@DataIX.net>
To: Rich <rincebrain@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: Errors on a file on a zpool: How to remove?
Message-ID: <alpine.BSF.2.00.1001240043350.19303@pragry.qngnvk.ybpny>
In-Reply-To: <5da0588e1001232128w5a551674od0805c2ff0b884ad@mail.gmail.com>
References: <5da0588e1001222223m773648am907267235bdcf882@mail.gmail.com>
 <alpine.BSF.2.00.1001231733570.2160@ibyngvyr>
 <5da0588e1001231541l246769eao410c5ea6ccca0de4@mail.gmail.com>
 <A43CB93C-06D6-406D-A8C0-4E10E85661A2@gmail.com>
 <5da0588e1001231615t37c22575uedaae938be40f530@mail.gmail.com>
 <4B5B94B8.7070509@modulus.org>
 <5da0588e1001231638i349f8f17t297e970b08825441@mail.gmail.com>
 <alpine.BSF.2.00.1001232307590.83451@pragry.qngnvk.ybpny>
 <5da0588e1001232017m6c67731fwaa1d71cd86800017@mail.gmail.com>
 <alpine.BSF.2.00.1001232341590.19303@pragry.qngnvk.ybpny>
 <5da0588e1001232128w5a551674od0805c2ff0b884ad@mail.gmail.com>

On Sun, 24 Jan 2010 00:28, rincebrain@ wrote:
> On Sun, Jan 24, 2010 at 12:15 AM, jhell <jhell@dataix.net> wrote:
>> From what I see and what was already mentioned earlier in this thread is
>> meta data corruption but the checksum errors do not span across the whole
>> pool of vdevs. These are, correct me if I am wrong USB mass storage devices
>> ? SSD ?
>
> 1.5T Seagate 7200RPM drives.
>
>> In the arrangement of the devices on the system are da2,4,5 on the same hub
>> and da6,7 on another ? If this is the case you may have consolidated your
>> errors down to being a USB problem and narrowed down to where they are
>> connected to.
>
> ...no.
>
> All five are on the same SATA controller. These behaviors persist
> independent of which SATA controller they are plugged into, and I've
> tried all seven in the machine.
>
>> What happened to da1,3 ? Were these once connected to the system ? and if so
>> did you start noticing this problem occur roughly about the same period they
>> were removed ?
>
> da1,3 are being used in another disk pool, and were never a part of this pool.
>
> This is not an issue of a faulty SATA controller or SATA drives.
>
> This is an issue of "there was a single faulty stick of RAM in the machine".
>

Yeah, I read this earlier. My apologies, it slipped my mind while I was
writing: "mind went into multi-write single read mode".

> I have sixteen disks in this machine. These three are having issues
> only on these particular files, and only on these files, not on random
> portions of the disk. The disks never report read errors - the ZFS
> layer is what reports them. SMART is not reporting any difficulties in
> reading any sectors of these disks.
>
>
> I could be mistaken, but I do not believe there to be a faulty
> controller in play at this time. I've rotated the drives among the
> spares of the 24 ports on the SATA controller in question, as well as
> the on-motherboard controller, and this behavior has persisted.
>
> - Rich
>

As I was thinking earlier... you mentioned you scrubbed multiple times
with no difference. When I mentioned the attempt to remove/replace, I was
thinking it would cause a "re-silvering" of the drives, possibly fixing
metadata for the affected disks if good metadata still exists somewhere.

Might be worth a shot, but I would start by replacing the devices that
are showing the errors, and keep going until you can clear the errors
without them showing up again and/or until you have replaced all disks.

Best of luck.

-- 
jhell
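
P.S. For anyone following along in the archive, the replace-then-clear sequence suggested above could be sketched roughly as below. This is only a dry-run planner that prints the commands in order; it does not run them. The pool name "tank" and the spare device "da8" are placeholders I made up, not names from this thread, and `replace_plan` is a hypothetical helper. Whether a resilver actually repairs the damaged metadata is not guaranteed.

```python
def replace_plan(pool, replacements):
    """Return, in order, the zpool commands for swapping out suspect disks.

    pool         -- pool name (placeholder, e.g. "tank")
    replacements -- list of (old_device, new_device) pairs
    """
    # Start by listing which files and devices currently show errors.
    plan = [["zpool", "status", "-v", pool]]
    for old, new in replacements:
        # Each replace starts a resilver onto the new device. Let it
        # finish (watch `zpool status`) before replacing the next disk.
        plan.append(["zpool", "replace", pool, old, new])
        plan.append(["zpool", "status", "-v", pool])
    # Reset the error counters, then scrub to see whether errors return.
    plan.append(["zpool", "clear", pool])
    plan.append(["zpool", "scrub", pool])
    return plan


if __name__ == "__main__":
    # Print the plan for one hypothetical replacement; nothing is executed.
    for cmd in replace_plan("tank", [("da2", "da8")]):
        print(" ".join(cmd))
```

Doing one disk at a time keeps the pool's redundancy intact while each resilver runs, which is why the plan checks status after every replace instead of batching them.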