From: Artem Belevich <artemb@gmail.com>
To: Nicolas Rachinsky
Cc: freebsd-fs
Date: Thu, 10 Jan 2013 13:12:54 -0800
Subject: Re: slowdown of zfs (tx->tx)

On Thu, Jan 10, 2013 at 11:39 AM, Nicolas Rachinsky wrote:
> There is a UDMA_CRC_Error_Count of 17 and 20 for the two disks with
> checksum errors. The other disks have values between 0 and 5.
>
> And yes, there have been timeouts some time ago. Since the problem did
> occur without the timeout occurring again, I considered the timeouts to
> be unrelated. And then I forgot them. :(
>
>
> But shouldn't timeouts either produce correct data after a retry or
> a read/write error otherwise?

If I see the CRC counter incrementing often enough, that's a good
indication that something is wrong. It does not mean that those
transactions were the ones that corrupted data; rather, it indicates
that things are not right with that particular device. It may be a
false alarm, as CRC errors can happen under normal conditions, but a
non-trivial number of them is a good sign of trouble.

--Artem
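
One way to watch for an "incrementing" CRC counter is to save the output of `smartctl -A` for each disk at two points in time and compare the raw values. The following is only a rough sketch, not something from this thread: it assumes the usual ten-column attribute table that `smartctl -A` prints, and the helper names, the device names and the `min_growth` threshold are hypothetical.

```python
def crc_error_count(smartctl_output):
    """Return the raw UDMA_CRC_Error_Count from `smartctl -A` text, or None.

    Assumes the usual ten-column attribute table, where the raw value
    is the last column (ID, name, flag, value, worst, thresh, type,
    updated, when_failed, raw_value).
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "UDMA_CRC_Error_Count":
            return int(fields[9])
    return None


def suspicious_disks(before, after, min_growth=1):
    """Return the sorted names of disks whose counter grew between snapshots.

    `before` and `after` map a device name to its raw CRC error count.
    `min_growth` is an arbitrary, hypothetical threshold; pick one that
    matches what "often enough" means for your hardware.
    """
    return sorted(
        disk
        for disk in after
        if disk in before and after[disk] - before[disk] >= min_growth
    )


if __name__ == "__main__":
    # Illustrative numbers only; the device names are made up.
    earlier = {"ada0": 17, "ada1": 3}
    later = {"ada0": 20, "ada1": 3}
    print(suspicious_disks(earlier, later))
```

A disk that shows up here repeatedly deserves a closer look at its cabling and controller port, even though a single increment on its own may be a false alarm.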