From: Artem Belevich
To: Nicolas Rachinsky
Cc: freebsd-fs
Date: Mon, 14 Jan 2013 11:13:40 -0800
Subject: Re: slowdown of zfs (tx->tx)

On Mon, Jan 14, 2013 at 1:40 AM, Nicolas Rachinsky wrote:
>   5 Reallocated_Sector_Ct   0x0033   094   094   010    Pre-fail  Always       -       166
> 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       1259614646
> 196 Reallocated_Event_Count 0x0032   096   096   000    Old_age   Always       -       166
>
> Reallocated_Sector_Ct did not increase during the last days.

It does not matter, IMHO. That drive has already developed quite a few
bad sectors that ECC could not deal with. There are apparently more
marginally bad sectors, but ECC is coping with them for now. Once enough
bits rot, you'll get more bad sectors. I personally would replace the
drive.

>> Could you run gstat with a 1-second interval? Some of the 5-second
>> samples show that ada8 is the bottleneck -- it has its request queue
>> full (L(q)=10) when all other drives were done with their jobs. And
>> that's a 5-sec average. Its write service time also seems to be a lot
>> higher than for other drives.
>
> Attached. I have replaced ada8 with ada9, which is a Western Digital
> Caviar Black.
>
> Now ada0 and ada4 seem to be the bottleneck.
>
> But I don't understand the intervals without any disk activity.

It is puzzling. Is rsync still sleeping in the tx->tx state? Try running
"procstat -kk <pid>" periodically. It will print the in-kernel stack
trace and may give a clue as to where/why rsync is stuck.

--Artem
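
The periodic procstat sampling suggested above could be scripted roughly as follows. This is only a sketch: the function name sample_stacks, its arguments, the PROCSTAT override variable and the default interval and count are all made up for illustration, and it will probably need to run as root (or as the owner of the rsync process) for both kill -0 and procstat -kk to work.

```sh
#!/bin/sh
# Hypothetical helper: sample_stacks PID [INTERVAL] [COUNT]
# Prints a timestamp followed by the in-kernel stack of PID, once every
# INTERVAL seconds, COUNT times or until the process goes away.
# PROCSTAT is an assumed override hook; it defaults to procstat(1).
sample_stacks() {
    pid=$1
    interval=${2:-5}    # seconds between samples (arbitrary default)
    count=${3:-60}      # maximum number of samples (arbitrary default)
    i=0
    # kill -0 only checks that the process exists and can be signalled
    while [ "$i" -lt "$count" ] && kill -0 "$pid" 2>/dev/null; do
        date '+%Y-%m-%d %H:%M:%S'
        ${PROCSTAT:-procstat} -kk "$pid"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `sample_stacks "$(pgrep -o rsync)" 5 120 > rsync-stacks.log` would record the oldest rsync process every 5 seconds for up to 10 minutes. Comparing the timestamps of samples stuck in the same stack against the idle intervals in the gstat output may show whether the two coincide.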