From: "Ronald Klop"
To: freebsd-fs@freebsd.org
Subject: Re: slowdown of zfs (tx->tx)
Date: Wed, 09 Jan 2013 19:35:04 +0100
References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <20130109162613.GA34276@mid.pc5.i.0x5.de>
In-Reply-To: <20130109162613.GA34276@mid.pc5.i.0x5.de>

On Wed, 09 Jan 2013 17:26:13 +0100, Nicolas Rachinsky wrote:

> * Artem Belevich [2013-01-08 12:47 -0800]:
>> On Tue, Jan 8, 2013 at 9:42 AM, Nicolas Rachinsky wrote:
>> >         NAME                      STATE     READ WRITE CKSUM
>> >         pool1                     DEGRADED     0     0     0
>> >           raidz2-0                DEGRADED     0     0     0
>> >             ada5                  ONLINE       0     0     0
>> >             ada8                  ONLINE       0     0     0
>> >             ada2                  ONLINE       0     0     0
>> >             ada3                  ONLINE       0     0     0
>> >             11846390416703086268  UNAVAIL      0     0     0  was /dev/dsk/ada1
>> >             ada6                  ONLINE       0     0     0
>> >             ada0                  ONLINE       0     0     1
>> >             ada7                  ONLINE       0     0     0
>> >             ada4                  ONLINE       0     0     3
>>
>> You seem to have some checksum errors which does suggest hardware
>> troubles.
>
> I somehow missed these. Is there any way to learn when these checksum
> errors happen?
>
>> For starters, check smart info for all drives and see if they have any
>> relocated sectors.
>
> There are some disks with relocated sectors, but for both ada0 and
> ada4 Reallocated_Sector_Ct is 0.
>
>> Use gstat during your workload to see if any of the drives takes much
>> longer than others to handle its job.
>
> There is one disk sticking out a bit.
>
>> > There is almost no disk activity during this time.
>>
>> What kind of disk activity *is* there?
>
> What would be interesting?
>
>
>> > sync is disabled for the whole pool.
>>
>> If that's the case (assuming you're talking about sync=disabled zfs
>> property), then synchronous writes are probably not the cause of
>> slowdown. My guess would be either failing HDD or something funky with
>> cabling or sata controller.
>
> Yes, sync=disabled for pool1.
>
>
> Ok, I will start swapping hardware (sadly the machine is quite a drive
> away).
>
> Thank you very much for your help.
>
> Nicolas

If you are driving there anyway, replace this one:
>> >             11846390416703086268  UNAVAIL      0     0     0  was /dev/dsk/ada1

If the pool is healthy, checksum errors will be noticed earlier by the
sysadmin.

Ronald.
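
On the question of how to learn when checksum errors appear: a rough sketch, not something posted in this thread, is to save the output of `zpool status` at intervals (for example from cron) and compare the CKSUM column over time. The Python below only parses the device table. The names parse_status and report are made up for illustration, the sample is the table quoted above, and it assumes plain integer counters (zpool may abbreviate large counts, which this does not handle).

```python
import re

# Device table as quoted in this thread (output of `zpool status`).
SAMPLE = """\
        NAME                      STATE     READ WRITE CKSUM
        pool1                     DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            ada5                  ONLINE       0     0     0
            ada8                  ONLINE       0     0     0
            ada2                  ONLINE       0     0     0
            ada3                  ONLINE       0     0     0
            11846390416703086268  UNAVAIL      0     0     0  was /dev/dsk/ada1
            ada6                  ONLINE       0     0     0
            ada0                  ONLINE       0     0     1
            ada7                  ONLINE       0     0     0
            ada4                  ONLINE       0     0     3
"""

# name, state, then three integer counters; the header row does not match
# because READ/WRITE/CKSUM are not digits.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)")


def parse_status(text):
    """Return (name, state, read, write, cksum) tuples for each table row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            name, state, rd, wr, ck = m.groups()
            rows.append((name, state, int(rd), int(wr), int(ck)))
    return rows


def report(rows):
    """Return ({name: cksum} for nonzero counts, [names whose state is not ONLINE])."""
    cksum = {name: ck for name, _state, _rd, _wr, ck in rows if ck > 0}
    not_online = [name for name, state, *_ in rows if state != "ONLINE"]
    return cksum, not_online


if __name__ == "__main__":
    cksum, not_online = report(parse_status(SAMPLE))
    for name, count in cksum.items():
        print(f"{name}: {count} checksum error(s)")
    for name in not_online:
        print(f"{name}: not ONLINE")
```

Running the same parse on each saved snapshot and comparing the counts with the previous run would show roughly when a counter increased, which is hard to tell from a single look at a degraded pool.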