From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 03:57:01 2010
From: Jurgen Weber <jurgen@ish.com.au>
Date: Mon, 07 Jun 2010 13:45:24 +1000
Subject: zfs filesystem problem

Hello

I have a FreeBSD 8.0-p2 system which runs two pools: one with six disks, all
mirrored, for our data, and another mirrored pool for the OS. The system has
16GB of RAM.

I have a nightly cron script which takes a snapshot of a particular file
system within the storage pool. This had been running for just over a month
without any issues, until this weekend.

Now we cannot access that file system. If we try to `ls` or `cd` into it,
the shell locks up (not even kill -9 can stop the `ls` processes, etc.) and
top shows the process state as `zfs`.

This file system is the root of a jail. While the jailed system works fine
right now, I cannot help but feel its time is limited.

Any suggestions on how to get this file system functioning normally again?

Thanks

Jurgen
-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001 fax +61 2 9550 4001
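The kind of nightly snapshot job described above usually amounts to a
one-line cron entry; a sketch, with the pool/dataset names and schedule
invented for illustration (not Jurgen's actual script):

    # Hypothetical /etc/crontab entry: snapshot the jail's dataset at 02:00
    # nightly. Note that % must be escaped as \% inside a crontab.
    0 2 * * * root /sbin/zfs snapshot storage/jail@$(date +\%Y\%m\%d)

When a dataset starts wedging processes in state `zfs`, two early things
worth checking are how many snapshots have accumulated (`zfs list -t
snapshot`) and whether the pool is close to full.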
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 06:10:39 2010
From: Sergiy Suprun <sergiy.suprun@gmail.com>
Date: Mon, 7 Jun 2010 08:46:10 +0300
Subject: Re: zfs filesystem problem

On Mon, Jun 7, 2010 at 06:45, Jurgen Weber wrote:
> Now we cannot access that file system. If we try to `ls` or `cd` into it,
> the shell locks up (not even kill -9 can stop the `ls` processes, etc.)
> and top shows the process state as `zfs`.

Hello.
How about a scrub? Also, what size are your pools, and how much space is
used by data plus snapshots?
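Both suggestions map to standard commands; a sketch, with the pool name
assumed for illustration:

    # zpool scrub storage                             # start a scrub; watch it with zpool status
    # zpool list                                      # total size and capacity per pool
    # zfs list -r -o name,used,avail,refer storage    # per-dataset usage
    # zfs list -t snapshot                            # space held by individual snapshots

On the ZFS v13 code in 8.0-RELEASE, a snapshot's USED column only counts
space unique to that snapshot, so the total held by snapshots can be larger
than any single line suggests.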
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 08:15:57 2010
From: Andriy Gapon <avg@icyb.net.ua>
Date: Mon, 07 Jun 2010 11:15:54 +0300
Subject: zfs i/o error, no driver error

During a recent zpool scrub one read error was detected and "128K repaired".

In the system log I see the following message:

ZFS: vdev I/O failure, zpool=tank
path=/dev/gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff offset=284456910848
size=131072 error=5

On the other hand, there are no other errors -- nothing from geom, ahci, etc.
Why would that happen? What kind of error could this be?

Thanks!
-- 
Andriy Gapon
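Mapping a gptid label back to its physical provider is a useful first step
here; a sketch using glabel(8) (the ada1p4 result shown is inferred from
later messages in this thread, not stated in the original post):

    # glabel status | grep 536c6f78
    gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff     N/A  ada1p4

With the device name known, driver logs and SMART data can be checked
against the same disk that ZFS is complaining about.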
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 08:34:30 2010
From: Jeremy Chadwick
Date: Mon, 7 Jun 2010 01:34:28 -0700
Subject: Re: zfs i/o error, no driver error

On Mon, Jun 07, 2010 at 11:15:54AM +0300, Andriy Gapon wrote:
> During a recent zpool scrub one read error was detected and "128K repaired".
> [...]
> On the other hand, there are no other errors -- nothing from geom, ahci, etc.
> Why would that happen? What kind of error could this be?

I believe this indicates silent data corruption[1], which ZFS can
auto-correct if the pool is a mirror or raidz (otherwise it can detect the
problem but not fix it). This can happen for a lot of reasons, but tracking
down the source is often difficult. Usually it indicates the disk itself has
some kind of problem (cache going bad, sector remaps which didn't happen or
failed, etc.).

What I'd need to determine the cause:

- Full "zpool status tank" output before the scrub
- Full "zpool status tank" output after the scrub
- Full "smartctl -a /dev/XXX" output for all disk members of zpool "tank"

Furthermore, what made you decide to scrub the pool on a whim?

[1]: http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta
     http://blogs.sun.com/bonwick/entry/zfs_end_to_end_data
     http://blogs.sun.com/bonwick/entry/raid_z

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
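Collecting the requested output is mechanical; a sketch, with device names
assumed for a two-disk mirror:

    # zpool status -v tank > /tmp/zpool-status-after-scrub.txt
    # smartctl -a /dev/ada0 > /tmp/smart-ada0.txt
    # smartctl -a /dev/ada1 > /tmp/smart-ada1.txt

The "before" output only exists if it was captured in advance, which is an
argument for routinely dumping `zpool status` to a log from cron.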
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 08:55:30 2010
From: Andriy Gapon <avg@icyb.net.ua>
Date: Mon, 07 Jun 2010 11:55:24 +0300
Subject: Re: zfs i/o error, no driver error

on 07/06/2010 11:34 Jeremy Chadwick said the following:
> I believe this indicates silent data corruption[1], which ZFS can
> auto-correct if the pool is a mirror or raidz (otherwise it can detect
> the problem but not fix it).

This pool is a mirror.

> This can happen for a lot of reasons, but tracking down the source is
> often difficult. Usually it indicates the disk itself has some kind of
> problem (cache going bad, sector remaps which didn't happen or failed,
> etc.).

Please note that this is not a CKSUM error, but a READ error.

> - Full "zpool status tank" output before the scrub

This was "all clear".

> - Full "zpool status tank" output after the scrub

zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 5h0m with 0 errors on Sat Jun  5 05:05:43 2010
config:

        NAME                                          STATE  READ WRITE CKSUM
        tank                                          ONLINE    0     0     0
          mirror                                      ONLINE    0     0     0
            ada0p4                                    ONLINE    0     0     0
            gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff  ONLINE  1     0     0  128K repaired

> - Full "smartctl -a /dev/XXX" output for all disk members of zpool "tank"

That output is "perfect" for both disks. I monitor them regularly; smartd is
also running and there are no complaints from it.

> Furthermore, what made you decide to scrub the pool on a whim?

Why on a whim? It was a regularly scheduled scrub (bi-weekly).

-- 
Andriy Gapon
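For reference, the "action" line in the status output above corresponds to a
single command once the error has been investigated; a sketch:

    # zpool clear tank gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff

Run without a device argument, `zpool clear tank` resets the error counters
on every vdev in the pool.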
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 09:08:52 2010
From: Jeremy Chadwick
Date: Mon, 7 Jun 2010 02:08:50 -0700
Subject: Re: zfs i/o error, no driver error

On Mon, Jun 07, 2010 at 11:55:24AM +0300, Andriy Gapon wrote:
> Please note that this is not a CKSUM error, but a READ error.

Okay, then it indicates that reading some data off the disk failed. ZFS
auto-corrected it by reading the data from the other member in the pool
(ada0p4). That's confirmed here:

> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are unaffected.
> [...]
>             gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff  ONLINE  1     0     0  128K repaired

> That output is "perfect" for both disks. I monitor them regularly;
> smartd is also running and there are no complaints from it.

Most people I know of do not know how to interpret SMART statistics, and
that's not their fault -- that's why I requested them. :-) In this case, I'd
like to see "smartctl -a" output for the disk that's associated with the
above GPT ID. There may be some attributes or data in the SMART error log
which could indicate what's going on. smartd does not know how to interpret
data; it just logs what it sees.

> Why on a whim? It was a regularly scheduled scrub (bi-weekly).

I'm still trying to figure out why people do this. ZFS will automatically
detect and correct errors of this sort when it encounters them during normal
operation. It's good that you caught an error ahead of time, but ZFS would
have dealt with this on its own.

It's important to remember that scrubs are *highly* intensive on both the
system itself and on all pool members. Disk I/O activity is very heavy
during a scrub; it's not considered "normal use".

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 09:28:49 2010
From: Andriy Gapon <avg@icyb.net.ua>
Date: Mon, 07 Jun 2010 12:28:42 +0300
Subject: Re: zfs i/o error, no driver error

on 07/06/2010 12:08 Jeremy Chadwick said the following:
> Okay, then it indicates that reading some data off the disk failed. ZFS
> auto-corrected it by reading the data from the other member in the pool
> (ada0p4). That's confirmed here:

Yes, right, of course.
If you read my original post you'll see that my question was: why did ZFS
see an I/O error when the disk/controller/geom/etc. drivers didn't see one?
I do not see us moving towards an answer to that.

> Most people I know of do not know how to interpret SMART statistics, and
> that's not their fault -- that's why I requested them. :-)

I'll leave this without a comment.

> In this case, I'd like to see "smartctl -a" output for the disk that's
> associated with the above GPT ID. There may be some attributes or data
> in the SMART error log which could indicate what's going on.
repaired". >>>> >>>> In system log I see the following message: >>>> ZFS: vdev I/O failure, zpool=tank >>>> path=/dev/gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff offset=284456910848 >>>> size=131072 error=5 >>>> >>>> On the other hand, there are no other errors, nothing from geom, ahci, etc. >>>> Why would that happen? What kind of error could this be? >>> I believe this indicates silent data corruption[1], which ZFS can >>> auto-correct if the pool is a mirror or raidz (otherwise it can detect >>> the problem but not fix it). >> This pool is a mirror. >> >>> This can happen for a lot of reasons, but >>> tracking down the source is often difficult. Usually it indicates the >>> disk itself has some kind of problem (cache going bad, some sector >>> remaps which didn't happen or failed, etc.). >> Please note that this is not a CKSUM error, but READ error. > > Okay, then it indicates reading some data off the disk failed. ZFS > auto-corrected it by reading the data from the other member in the pool > (ada0p4). That's confirmed here: Yes, right, of course. If you read my original post you'll see that my question was: why ZFS saw I/O error, but disk/controller/geom/etc driver didn't see it. I do not see us moving towards an answer to that. >> status: One or more devices has experienced an unrecoverable error. An >> attempt was made to correct the error. Applications are unaffected. >> >> NAME STATE READ WRITE CKSUM >> tank ONLINE 0 0 0 >> mirror ONLINE 0 0 0 >> ada0p4 ONLINE 0 0 0 >> gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff ONLINE 1 0 0 128K repaired > >>> - Full "smartctl -a /dev/XXX" for all disk members of zpool "tank" >> Those output for both disks are "perfect". >> I monitor them regularly, also smartd is running and complaints from it. > > Most people I know if do not know how to interpret SMART statistics, and > that's not their fault -- and that's why I requested them. :-) I'll leave this without a comment. > In this > case, I'd like to see "smartctl -a" output for the disk that's > associated with the above GPT ID. There may be some attributes or data > in the SMART error log which could indicate what's going on. smartd > does not know how to interpret data; it just logs what it sees. $ smartctl -a /dev/ada1 smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 8.1-PRERELEASE amd64] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Model Family: Western Digital Caviar Blue Serial ATA family Device Model: WDC WD5000AAKS-00A7B2 Serial Number: WD-WMASY6905909 Firmware Version: 01.03B01 User Capacity: 500,107,862,016 bytes Device is: In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Mon Jun 7 11:53:50 2010 EEST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (11160) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. 
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 131) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303f) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   169   160   021    Pre-fail  Always       -       4516
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       53
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       10385
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       30
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       25
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       52
194 Temperature_Celsius     0x0022   102   088   000    Old_age   Always       -       45
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         10331    -
# 2  Extended offline    Completed without error       00%         10237    -
# 3  Short offline       Completed without error       00%         10165    -
# 4  Short offline       Completed without error       00%          9999    -
# 5  Short offline       Completed without error       00%          9830    -
# 6  Short offline       Completed without error       00%          9662    -
# 7  Extended offline    Completed without error       00%          9496    -
# 8  Short offline       Completed without error       00%          9327    -
# 9  Short offline       Completed without error       00%          9159    -
#10  Short offline       Completed without error       00%          8992    -
#11  Short offline       Completed without error       00%          8824    -
#12  Extended offline    Completed without error       00%          8778    -
#13  Short offline       Completed without error       00%          8657    -
#14  Short offline       Completed without error       00%          8489    -
#15  Short offline       Completed without error       00%          8154    -
#16  Extended offline    Completed without error       00%          8036    -
#17  Short offline       Completed without error       00%          7986    -
#18  Short offline       Completed without error       00%          7819    -
#19  Short offline       Completed without error       00%          7651    -
#20  Extended offline    Completed without error       00%          7366    -
#21  Short offline       Completed without error       00%          7316    -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

-- 
Andriy Gapon
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 10:38:32 2010
From: Jeremy Chadwick
Date: Mon, 7 Jun 2010 03:38:29 -0700
Subject: Re: zfs i/o error, no driver error

On Mon, Jun 07, 2010 at 12:28:42PM +0300, Andriy Gapon wrote:
> If you read my original post you'll see that my question was: why did
> ZFS see an I/O error when the disk/controller/geom/etc. drivers didn't
> see one? I do not see us moving towards an answer to that.

My understanding is that a "vdev I/O error" indicates some sort of
communication failure with a member in the pool, or some other layer within
FreeBSD (GEOM I think, like you said). I don't think there has to be a 1:1
ratio between vdev I/O errors and controller/disk errors.

For AHCI and storage controllers, I/O errors are messages that are returned
from the controller to the OS, or from the disk through the controller to
the OS. I suppose it's possible ZFS could be throwing an error for something
that isn't actually block/disk-level.

I'm interested to see what this turns out to be!

I agree that your SMART statistics look fine -- the only test that isn't
working is a manual or automatic offline data collection test, but this one
fails (gets aborted) pretty often when the system is in use. You can see
that here:

> Offline data collection status:  (0x84) Offline data collection activity
>                                         was suspended by an interrupting
>                                         command from host.
>                                         Auto Offline Data Collection: Enabled.

This is the test that "-t offline" induces (not -t short/long). It takes a
very long time to run, which is why it often gets aborted:

> Total time to complete Offline
> data collection:                 (11160) seconds.

That's the only thing that looks even remotely of concern with ada1, and
it's not even worth focusing on.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
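The offline data collection test Jeremy refers to can be started and watched
by hand; a sketch:

    # smartctl -t offline /dev/ada1    # begin offline data collection (~11160 s on this drive)
    # smartctl -c /dev/ada1            # re-check the "Offline data collection status" field

Unlike "-t short" or "-t long", this test updates the Offline-updated
attributes (e.g. Offline_Uncorrectable) and is easily preempted by ordinary
host I/O, which is why it so often shows up as suspended.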
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 11:06:54 2010
From: FreeBSD bugmaster <owner-bugmaster@FreeBSD.org>
Date: Mon, 7 Jun 2010 11:06:53 GMT
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org

Note: to view an individual PR, use:
http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including experimental
development code and obsolete releases.

S Tracker      Resp. Description
--------------------------------------------------------------------------------
o kern/147420 fs [nfs] [panic] kldload nfs modules causes nfs-aware ker
o kern/147292 fs [nfs] [patch] readahead missing in nfs client options
o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528 fs [zfs] Severe memory leak in ZFS on i386
o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server
o kern/146375 fs [nfs] [patch] Typos in macro variables names in sys/fs
o kern/145778 fs [zfs] [panic] panic in zfs_fuid_map_id (known issue fi
s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat
s kern/145424 fs [zfs] [patch] move source closer to v15
o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an
o kern/145309 fs [disklabel]: Editing disk label invalidates the whole
o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank
o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189 fs [nfs] nfsd performs abysmally under load
o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c
o kern/144458 fs [nfs] [patch] nfsd fails as a kld
p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416 fs [panic] Kernel panic on online filesystem optimization
s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825 fs [nfs] [panic] Kernel panic on NFS client
o kern/143345 fs [ext2fs] [patch] extfs minor header cleanups to better
o kern/143212 fs [nfs] NFSv4 client strange work ...
o kern/143184 fs [zfs] [lor] zfs/bufwait LOR
o kern/142924 fs [ext2fs] [patch] Small cleanup for the inode struct in
o kern/142914 fs [zfs] ZFS performance degradation over time
o kern/142878 fs [zfs] [vfs] lock order reversal
o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real
o kern/142489 fs [zfs] [lor] allproc/zfs LOR
o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142401 fs [ntfs] [patch] Minor updates to NTFS from NetBSD
o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068 fs [ufs] BSD labels are got deleted spontaneously
o kern/141897 fs [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri
o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640 fs [zfs] snapshot crash
o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity
o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs
o bin/139651  fs [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot
o kern/139363 fs [nfs] diskless root nfs mount from non FreeBSD server
o kern/138790 fs [zfs] ZFS ceases caching when mem demand is high
o kern/138421 fs [ufs] [patch] remove UFS label limitations
o kern/138202 fs mount_msdosfs(1) see only 2Gb
f kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873 fs [ntfs] Missing directories/files on NTFS volume
o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS
o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot
o kern/134491 fs [zfs] Hot spares are rather cold...
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614 fs [panic] panic: ffs_truncate: read-only filesystem
o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397 fs reboot causes filesystem corruption (failure to sync b
o kern/132331 fs [ufs] [lor] LOR ufs and syncer
o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145 fs [panic] File System Hard Crashes
o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo
o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341  fs makefs: error "Bad file descriptor" on the mount poin
o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229 fs [iconv] usermount fails on fs that need iconv
o kern/130210 fs [nullfs] Error by check nullfs
o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8)
o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE
o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127029 fs [panic] mount(8): trying to mount a write protected zi
o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS
p kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition
f bin/124424  fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939 fs [msdosfs] corrupts new files
o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172  fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898  fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779  fs [ufs] snapinfo(8) (and related tools?) only work for t
o bin/121366  fs [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072  fs [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati
o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes
o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F
f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912 fs [2tb] disk sizing/geometry problem with large array
o kern/118713 fs [minidump] [patch] Display media size required for a k
o bin/118249  fs mv(1): moving a directory changes its mtime
o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N
o bin/117315  fs [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980  fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with
o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block
p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583 fs [ffs] [hang] System freezes for short time when using
o kern/116170 fs [panic] Kernel panic when mounting /tmp
o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex
o bin/115361  fs [zfs] mount(8) gets into a state where it won't set/un
o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468  fs [patch] [request] add -d option to umount(8) to detach
o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral
o bin/113838  fs [patch] [request] mount(8): add support for relative p
o bin/113049  fs [patch] [request] make quot(8) use getopt(3) and show
o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843 fs [msdosfs] Long Names of files are incorrectly created
o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems
s bin/111146  fs [2tb] fsck(8) fails on 6T filesystem
o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not
o kern/109010 fs [msdosfs] can't mv directory within fat32 file system
o bin/107829  fs [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro
o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk
o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist
o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear
o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290  fs [ntfs] mount_ntfs ignorant of cluster sizes
o kern/97377  fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222  fs [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849  fs [ufs] rename on UFS filesystem is not atomic
o kern/94769  fs [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733  fs [smbfs] smbfs may cause double unlock
o kern/93942  fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272  fs [ffs] [hang] Filling a filesystem while creating a sna
f kern/91568  fs [ufs] [panic] writing to UFS/softupdates DVD media in
o kern/91134  fs [smbfs] [patch] Preserve access and modification time
a kern/90815  fs [smbfs] [patch] SMBFS with character conversions somet
o kern/88657  fs [smbfs] windows client hang when browsing a samba shar
o kern/88266  fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o kern/87859  fs [smbfs] System reboot while umount smbfs.
o kern/86587  fs [msdosfs] rm -r /PATH fails with lots of small files
o kern/85326  fs [smbfs] [panic] saving a file via samba to an overquot
o kern/84589  fs [2TB] 5.4-STABLE unresponsive during background fsck 2
o kern/80088  fs [smbfs] Incorrect file time setting on NTFS mounted vi
o kern/73484  fs [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019   fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774  fs [ntfs] NTFS cannot "see" files on a WinXP filesystem
o kern/68978  fs [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920  fs [nwfs] Mounted Netware filesystem behaves strange
o kern/65901  fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503  fs [smbfs] mount_smbfs does not work as non-root
o kern/55617  fs [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/53137  fs [ffs] [panic] background fscking causing ffs_valloc pa
o kern/51685  fs [hang] Unbounded inode allocation causes kernel to loc
o kern/51583  fs [nullfs] [patch] allow to work with devices and socket
o kern/36566  fs [smbfs] System reboot with dead smb mount and umount
o kern/33464  fs [ufs] soft update inconsistencies after system crash
o kern/18874  fs [2TB] 32bit NFS servers export wrong negative values t

175 problems total.

From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 11:12:19 2010
From: Martin Simmons <martin@lispworks.com>
Date: Mon, 7 Jun 2010 12:12:16 +0100
Subject: Re: zfs i/o error, no driver error

>>>>> On Mon, 7 Jun 2010 02:08:50 -0700, Jeremy Chadwick said:
> I'm still trying to figure out why people do this.

Maybe because the ZFS Best Practices Guide suggests it? ("Run zpool scrub on
a regular basis to identify data integrity problems...")

It makes sense to detect errors while there is still a healthy mirror,
rather than waiting until two drives are failing :-)

> It's important to remember that scrubs are *highly* intensive on both
> the system itself and on all pool members. Disk I/O activity is very
> heavy during a scrub; it's not considered "normal use".

Is it worse than a full backup? I guess scrub does read all drives, but OTOH
backup will typically read all data non-linearly, which adds a different
kind of stress.

__Martin
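For anyone wanting to follow the Best Practices recommendation, a plain cron
entry is enough on FreeBSD 8.x; there is no periodic(8) knob for this yet
(see bin/121366 in the PR list above). Pool name and interval here are
illustrative:

    # /etc/crontab -- weekly scrub, Sunday 03:00
    0 3 * * 0 root /sbin/zpool scrub tank

`zpool scrub` returns immediately; progress and results appear in the
`zpool status` output.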
("Run zpool scrub on a regular basis to identify data integrity problems...") It makes sense to detect errors when there is still a healthy mirror, rather than waiting until two drives are failing :-) > It's important to remember that scrubs are *highly* intensive on both > the system itself as well as on all pool members. Disk I/O activity is > very heavy during a scrub; it's not considered "normal use". Is it worse that a full backup? I guess scrub does read all drives, but OTOH backup will typically read all data non-linearly, which adds a different kind of stress. __Martin From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 11:43:37 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C0C4E106566B for ; Mon, 7 Jun 2010 11:43:37 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 10DDF8FC0C for ; Mon, 7 Jun 2010 11:43:36 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA23524; Mon, 07 Jun 2010 14:43:33 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4C0CDB64.6090304@icyb.net.ua> Date: Mon, 07 Jun 2010 14:43:32 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100517) MIME-Version: 1.0 To: Jeremy Chadwick References: <4C0CAABA.2010506@icyb.net.ua> <20100607083428.GA48419@icarus.home.lan> <4C0CB3FC.8070001@icyb.net.ua> <20100607090850.GA49166@icarus.home.lan> <4C0CBBCA.3050304@icyb.net.ua> <20100607103829.GA50106@icarus.home.lan> In-Reply-To: <20100607103829.GA50106@icarus.home.lan> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: zfs i/o error, no driver error X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 11:43:37 -0000 on 07/06/2010 13:38 Jeremy Chadwick said the following: > My understanding is that a "vdev I/O error" indicates some sort of > communication failure with a member in the pool, or some other layer > within FreeBSD (GEOM I think, like you said). I don't think there has > to be a 1:1 ratio between vdev I/O errors and controller/disk errors. > > For AHCI and storage controllers, I/O errors are messages that are > returned from the controller to the OS, or from the disk through the > controller to the OS. I suppose it's possible ZFS could be throwing > an error for something that isn't actually block/disk-level. > > I'm interested to see what this turns out to be! Yes, me too :) I skimmed through the sources and so far I see at least two possibilities: 1) Decompression error for a filesystem with compression. Again, I don't know why that could happen if there are no checksum errors or hardware errors. 2) Successful but short read from disk. Same thing - I don't know why that could happen. And I am sure that there are other possibilities too. 
From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 12:19:56 2010
From: Jeremy Chadwick
Date: Mon, 7 Jun 2010 05:19:54 -0700
Subject: Re: zfs i/o error, no driver error

On Mon, Jun 07, 2010 at 12:12:16PM +0100, Martin Simmons wrote:
> Maybe because the ZFS Best Practices Guide suggests it? ("Run zpool
> scrub on a regular basis to identify data integrity problems...")
>
> It makes sense to detect errors while there is still a healthy mirror,
> rather than waiting until two drives are failing :-)

The official quote from the ZFS Best Practices Guide[1] is:

"Run zpool scrub on a regular basis to identify data integrity problems. If
you have consumer-quality drives, consider a weekly scrubbing schedule. If
you have datacenter-quality drives, consider a monthly scrubbing schedule."

The first line of the paragraph seems reasonable; the concept being, do this
process often so that you catch potential data-threatening errors before
your entire pool explodes. Cool, I can accept that, but it gets us into a
discussion about how often this is necessary (keep reading for more on
that).

However, the second part of the paragraph -- total rubbish.
"Datacenter-quality drives?" Oh, I think they mean "enterprise-grade
drives", which really don't offer much more than high-end consumer-grade
drives at this point in time[2]. One of the key points of ZFS's creation was
to provide a reliable filesystem using cheap disks[3][4].

The only thing I can find in the ZFS Administration Guide[5] is this:
"The simplest way to check your data integrity is to initiate an explicit
scrubbing of all data within the pool. This operation traverses all the data
in the pool once and verifies that all blocks can be read. Scrubbing
proceeds as fast as the devices allow, though the priority of any I/O
remains below that of normal operations. This operation might negatively
impact performance, though the file system should remain usable and nearly
as responsive while the scrubbing occurs."

"Performing routine scrubbing also guarantees continuous I/O to all disks on
the system. Routine scrubbing has the side effect of preventing power
management from placing idle disks in low-power mode. If the system is
generally performing I/O all the time, or if power consumption is not a
concern, then this issue can safely be ignored."

What's confusing about this is the phrase that pool verification is done by
"verifying all the blocks can be read". Doesn't that happen when a standard
read operation comes down the pipe for a file? What I'm getting at is that
there's no explanation (that I can find) which states why scrubbing
regularly "ensures" anything, other than allowing a person to see an error
sooner rather than later. Which brings us to the topic of scrub interval...

This exact question was asked on the ZFS OpenSolaris list[6] in late 2008,
and nobody there provided any concrete evidence either. The closest thing to
evidence is this: "...in normal operation, ZFS only checks data as it's read
back from the disks. If you don't periodically scrub, errors that happen
over time won't be caught until I next read that actual data, which might be
inconvenient if it's a long since the initial data was written".

The topic of scrub intervals was also brought up a month later[7]. Someone
said: "We did a study on re-write scrubs which showed that once per year was
a good interval for modern, enterprise-class disks. However, ZFS does a
read-only scrub, so you might want to scrub more often". The first part
conflicts with what the guide recommends (I'd also like to see the results
of the study!), while the last half of the paragraph makes no sense
("because it reads, do it more often!"). So if you take the first sentence
and apply it to what the ZFS Best Practices Guide says, you come out with...
"scrub consumer-grade disks every 6 months".

In the same thread, we have this quote from a different person: "Even that
is probably more frequent than necessary. I'm sure somebody has done the
MTTDL math. IIRC, the big win is doing any scrubbing at all. The difference
between scrubbing every 2 weeks and every 2 months may be negligible.
(IANAMathematician tho)"

So the justification seems, well, unjustified. It's almost as if, because
the filesystem is new, there's an underlying sense of paranoia, so everyone
scrubs often. I understand the "pre-emptive" argument, just not the
technical argument.

So how often do *I* scrub our pools? Rarely. I tend to look at SMART stats
much more aggressively; "uh oh, uncorrected sector, better scrub..." Or if
while using the system it feels sluggish on I/O, or cron jobs take way
longer than they need to.

> > It's important to remember that scrubs are *highly* intensive on both
> > the system itself and on all pool members. Disk I/O activity is very
> > heavy during a scrub; it's not considered "normal use".
>
> Is it worse than a full backup? I guess scrub does read all drives, but
> OTOH backup will typically read all data non-linearly, which adds a
> different kind of stress.

I'd guess it'd depend greatly on the type of backup. I'd imagine that a ZFS
snapshot (non-incremental) + zfs send would be less intensive than a scrub,
and the same (but even more so) with an incremental snapshot.
I'd imagine rsync/tar/cp/etc. would be somewhere in-between. I don't use ZFS snapshots because I don't know if they've stabilised on FreeBSD. [1]: http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pools [2]: http://lists.freebsd.org/pipermail/freebsd-fs/2010-May/008508.html [3]: http://blogs.sun.com/bonwick/entry/zfs_end_to_end_data [4]: http://www.sun.com/software/solaris/zfs_lc_preso.pdf [5]: http://docs.sun.com/app/docs/doc/819-5461/gbbwa?l=en&a=view [6]: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg20995.html [7]: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg21728.html [8]: http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSPeriodicScrubbing -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 13:22:09 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 83C101065676 for ; Mon, 7 Jun 2010 13:22:09 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id D39B48FC1F for ; Mon, 7 Jun 2010 13:22:08 +0000 (UTC) Received: by iwn5 with SMTP id 5so4102027iwn.13 for ; Mon, 07 Jun 2010 06:22:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:openpgp:content-type:content-transfer-encoding; bh=xYeM61mLdIcJ1eV7H9kWRIrPWMEskPThLMcodtAhMrw=; b=It4gEZIAx+Qxrt4ai9tIdQAQAjVP+TgMuVRanBJYSnGC5QACeLkcuuzDflzGZN4F1V g6k1boIihKlOBcrJqGvWaJgiReIruWKl2OoAqTWjgjOj11gJC4cDMwglwQO9Ool00kds m+XePBR2z2U+kRK3oiI97w7If2UQB+wf/V9IY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:openpgp:content-type :content-transfer-encoding; b=cp2Fb3nq7ZCQRPq6/GwmNRwch/aVR7vD9Zg1NaxE/vp+QtilgAuSA4tdImk7q2/f/E t0q8QUPwYnYP7srjuiim5sgNyoXJMhSz0+WvHyFxFvzbG56LcgS/ji13k5w/th8zCdUH ZiahzcND0G1SwZz21h/Ow3w0yY5Wi4OI7fyrA= Received: by 10.231.125.87 with SMTP id x23mr17307915ibr.88.1275916927560; Mon, 07 Jun 2010 06:22:07 -0700 (PDT) Received: from centel.dataix.local (adsl-99-181-128-180.dsl.klmzmi.sbcglobal.net [99.181.128.180]) by mx.google.com with ESMTPS id f1sm20702856ibg.21.2010.06.07.06.22.04 (version=SSLv3 cipher=RC4-MD5); Mon, 07 Jun 2010 06:22:05 -0700 (PDT) Sender: "J.
Hellenthal" Message-ID: <4C0CF27B.1050402@dataix.net> Date: Mon, 07 Jun 2010 09:22:03 -0400 From: jhell User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.9) Gecko/20100515 Thunderbird MIME-Version: 1.0 To: Sergiy Suprun References: <4C0C6B54.8020005@ish.com.au> In-Reply-To: X-Enigmail-Version: 1.0.1 OpenPGP: id=89D8547E Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: zfs filesystem problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 13:22:09 -0000 On 06/07/2010 01:46, Sergiy Suprun wrote: > On Mon, Jun 7, 2010 at 06:45, Jurgen Weber wrote: > >> Hello >> >> I have a FreeBSD 8.0-p2 system, which runs two pools. One with 6 disks all >> mirrored for our data and another mirrored pool for the OS. The system has >> 16GB of RAM. >> >> I have a nightly cron script running which takes a snapshot of a particular >> file system within the storage pool. This has been running for just over a >> month now without any issues until this weekend. >> >> Now we can not access the mentioned file system. If we try to `ls` to it or >> `cd` into it the shell locks up (not even kill -9 can stop the `ls` >> processes, etc) and top shows that the process state is `zfs`. This is most likely caused by some bugs that were found and fixed in stable/8. One of the commits that mm@ made has touched that zio->iowait that you should see your processes are stuck in. There still seems "at least in my case" some zio->iowait problems going on but I have not pinned that down to the cause yet, but they have not caused any of my system proccesses to freeze in that state. Grab a kernel from one of the snapshots that were made sometime last month to test this out just to be sure so your not upgrading for no reason. When I say kernel I mean kernel & modules that go with it as ZFS is a module and you will obviously need that. Please report back on your findings if the kernel from stable fixed your problem. URL to retrieve snapshots: http://bit.ly/aLoXXV Good Luck!, > Hello. > How about scrub ? > And which size of your pools and how many place used by data+snapshots? . 
-- jhell From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 13:31:25 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 91C68106567B for ; Mon, 7 Jun 2010 13:31:25 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 2255D8FC1F for ; Mon, 7 Jun 2010 13:31:24 +0000 (UTC) Received: by fxm20 with SMTP id 20so2493393fxm.13 for ; Mon, 07 Jun 2010 06:31:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=ElPHcB02FihrllgSxW0It4b4Wp1czmGJo0ZJDDmPOCo=; b=BwktnhfGURh89CjKAoSKAOPuNOuVIUuJGXLxwv+ufolJrPGV3qZHEYBj3pvVNsa3j7 Vo03yBYBd5OH7DSRDyhKbll+KqcW4GjO49lVRyTqtNZRELk8iSlZ/XG2kXBjtbsjidW3 i+PEuvMgM4gEfh1LAW097Rs+dJWYpgw5I/Dbk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=sWt6hnyzxlH9J/E2q4SXhgPqizkYfirbDxuW8qIyvzRNQehxHfcegEFbH7BwV26HT5 j5Ty4d28wkDM4soZBR00FuV7U4cOJWJ+qhMCSXFco9rNDoLXI0pw2XhauA5LFPKjgwa7 bo7QwBcNe3jRVVcagLPoGUq5FD0b7o3bmIxKs= MIME-Version: 1.0 Received: by 10.239.185.72 with SMTP id b8mr984872hbh.99.1275917483862; Mon, 07 Jun 2010 06:31:23 -0700 (PDT) Received: by 10.239.185.1 with HTTP; Mon, 7 Jun 2010 06:31:23 -0700 (PDT) In-Reply-To: <20100607121954.GA52932@icarus.home.lan> References: <4C0CAABA.2010506@icyb.net.ua> <20100607083428.GA48419@icarus.home.lan> <4C0CB3FC.8070001@icyb.net.ua> <20100607090850.GA49166@icarus.home.lan> <201006071112.o57BCGMf027496@higson.cam.lispworks.com> <20100607121954.GA52932@icarus.home.lan> Date: Mon, 7 Jun 2010 14:31:23 +0100 Message-ID: From: Tom Evans To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: zfs i/o error, no driver error X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 13:31:25 -0000 On Mon, Jun 7, 2010 at 1:19 PM, Jeremy Chadwick wrote: > What's confusing about this is the phrase that pool verification is done > by "verifying all the blocks can be read". Doesn't that happen when a > standard read operation comes down the pipe for a file? What I'm > getting at is that there's no explanation (that I can find) which states > why scrubbing regularly "ensures" anything, other than allowing a person > to see an error sooner than later. > The purpose is to avoid unrecoverable double failures. Assume you have a raidz, and you do not periodically scrub the pool. One of the disks develops a silent problem with reading a file. Later, a second disk completely fails. You replace the disk, and then during the resilver discover that your raidz is FAULTED, because it cannot reconstruct files from the silently dodgy first disk. With periodic scrubs, you are ensuring that at that point you can recover from a single disk failure. Regularly running a scrub increases your confidence that you will be able to recover.
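If you want that on a schedule, a minimal sketch (assuming a pool named tank; adjust the interval to taste) is a cron-driven script like:

#!/bin/sh
# start a scrub; "zpool status -x" only prints detail if a pool is unhealthy
zpool scrub tank
zpool status -x | mail -s "scrub started on tank" root

# e.g. in root's crontab, monthly:
# 0 3 1 * * /usr/local/sbin/scrub-tank.sh

Note that zpool scrub returns as soon as the scrub has been kicked off, so check zpool status again later for the actual result.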
The ZFS best practices guide suggests a shorter interval for consumer grade hard drives because there is less confidence in their remaining error-free. As I understand it, the scrub is just an attempt to ensure that everything on the pool is readable, attempting to reconstruct it if there are any issues. I guess it is slightly more clever than 'find /tank -type f -print0 | xargs -0 cat > /dev/null'. Cheers Tom From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 16:55:38 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D512B106567B for ; Mon, 7 Jun 2010 16:55:38 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [193.34.186.230]) by mx1.freebsd.org (Postfix) with ESMTP id 6719D8FC23 for ; Mon, 7 Jun 2010 16:55:34 +0000 (UTC) Received: from higson.cam.lispworks.com (IDENT:U2FsdGVkX1+YcVTcmnofzCojPtvAIFHitg0ZVF99qqI@higson [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.3/8.14.3) with ESMTP id o57GtTne086545; Mon, 7 Jun 2010 17:55:29 +0100 (BST) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com by higson.cam.lispworks.com (8.13.1) id o57GtSxK029970; Mon, 7 Jun 2010 17:55:28 +0100 Received: (from martin@localhost) by higson.cam.lispworks.com (8.13.1/8.13.1/Submit) id o57GtSBg029967; Mon, 7 Jun 2010 17:55:28 +0100 Date: Mon, 7 Jun 2010 17:55:28 +0100 Message-Id: <201006071655.o57GtSBg029967@higson.cam.lispworks.com> From: Martin Simmons To: freebsd-fs@freebsd.org In-reply-to: <20100607121954.GA52932@icarus.home.lan> (message from Jeremy Chadwick on Mon, 7 Jun 2010 05:19:54 -0700) References: <4C0CAABA.2010506@icyb.net.ua> <20100607083428.GA48419@icarus.home.lan> <4C0CB3FC.8070001@icyb.net.ua> <20100607090850.GA49166@icarus.home.lan> <201006071112.o57BCGMf027496@higson.cam.lispworks.com> <20100607121954.GA52932@icarus.home.lan> Subject: Re: zfs i/o error, no driver error X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 16:55:38 -0000 >>>>> On Mon, 7 Jun 2010 05:19:54 -0700, Jeremy Chadwick said: > > Which brings us to the topic of scrub interval... > > This exact question was asked on the ZFS OpenSolaris list[6] in late > 2008, and nobody there provided any concrete evidence either. The > closest thing to evidence is this: > > "...in normal operation, ZFS only checks data as it's read back from the > disks. If you don't periodically scrub, errors that happen over time > won't be caught until I next read that actual data, which might be > inconvenient if it's a long time since the initial data was written". The question can't be answered with absolute numbers, because it depends on other factors such as environmental effects. > The topic of scrub intervals was also brought up a month later[7]. > Someone said: > > "We did a study on re-write scrubs which showed that once per year was a > good interval for modern, enterprise-class disks. However, ZFS does a > read-only scrub, so you might want to scrub more often". > > The first part conflicts with what the guide recommends (I'd also like > to see the results of the study!), while the last half of the paragraph > makes no sense ("because it reads, do it more often!"). So if you take > the first sentence and apply it to what the ZFS Best Practices Guide > says, you come out with...
"scrub consumer-grade disks every 6 months". It doesn't conflict if you agree that freshly written data is more likely to be readable that data written long ago (with some curve in between). The re-write scrub they are talking about will write all of the data back to the disks during the scrubbing operation, which makes it fresher. ZFS OTOH performs read-only scrubs, i.e. it just checks that the data can be read. It only writes if there was a problem reading from one of the disks. I don't know if there is any science behind that theory... __Martin From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 17:11:47 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7B0DD1065674 for ; Mon, 7 Jun 2010 17:11:47 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 1F1AF8FC0A for ; Mon, 7 Jun 2010 17:11:46 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.13.8+Sun/8.13.8) with ESMTP id o57HBj2P023334; Mon, 7 Jun 2010 12:11:46 -0500 (CDT) Date: Mon, 7 Jun 2010 12:11:45 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Jeremy Chadwick In-Reply-To: <20100607121954.GA52932@icarus.home.lan> Message-ID: References: <4C0CAABA.2010506@icyb.net.ua> <20100607083428.GA48419@icarus.home.lan> <4C0CB3FC.8070001@icyb.net.ua> <20100607090850.GA49166@icarus.home.lan> <201006071112.o57BCGMf027496@higson.cam.lispworks.com> <20100607121954.GA52932@icarus.home.lan> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Mon, 07 Jun 2010 12:11:46 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: zfs i/o error, no driver error X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 17:11:47 -0000 On Mon, 7 Jun 2010, Jeremy Chadwick wrote: > rubbish. "Datacenter-quality drives?" Oh, I think they mean > "enterprise-grade drives", which really don't offer much more than > high-end consumer-grade drives at this point in time[2]. One of the key > points of ZFS's creation was to provide a reliable filesystem using > cheap disks[3][4]. There are differences between disks. High-grade enterprise disks offer uncorrected error rates at least an order of magnitude better than typical tier-2 "SATA" disks and sometimes two orders of magnitude better than a cheap maximum-density drive. Yes, there are tier-2 drives that come with SAS interfaces, and you can immediately distinguish what they are since they offer high storage capacities and more reasonable prices. > What's confusing about this is the phrase that pool verification is done > by "verifying all the blocks can be read". Doesn't that happen when a > standard read operation comes down the pipe for a file? What I'm No. A standard read does not verify that all data and metadata can be read. Only one copy of the data and metadata is read and there may be several such copies. Metadata is always stored multiple times, even if the vdev does not offer additional redundancy. 
> The topic of scrub intervals was also brought up a month later[7]. > Someone said: > > "We did a study on re-write scrubs which showed that once per year was a > good interval for modern, enterprise-class disks. However, ZFS does a > read-only scrub, so you might want to scrub more often". The concept of "bit rot" on modern disk drives is largely unproven. The magnetism will surely last 1000+ years so the issue is mostly with stability of the media material and the heads. The idea that scrub should re-write the data assumes that magnetic hysteresis is lost over time. This is all very silly for a device with an expected service life of 5 years. It is much more likely for the drive heads to lose their function or for a mechanical defect to appear. Given the above, it makes sense to scrub more often on pools which see a lot of writes (to verify the recently written data), and less often on pools which are rarely updated. More levels of redundancy diminish the value of the scrub. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 17:16:19 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A34B61065676 for ; Mon, 7 Jun 2010 17:16:19 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 668AA8FC17 for ; Mon, 7 Jun 2010 17:16:19 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.13.8+Sun/8.13.8) with ESMTP id o57HGISp023377; Mon, 7 Jun 2010 12:16:18 -0500 (CDT) Date: Mon, 7 Jun 2010 12:16:18 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Martin Simmons In-Reply-To: <201006071655.o57GtSBg029967@higson.cam.lispworks.com> Message-ID: References: <4C0CAABA.2010506@icyb.net.ua> <20100607083428.GA48419@icarus.home.lan> <4C0CB3FC.8070001@icyb.net.ua> <20100607090850.GA49166@icarus.home.lan> <201006071112.o57BCGMf027496@higson.cam.lispworks.com> <20100607121954.GA52932@icarus.home.lan> <201006071655.o57GtSBg029967@higson.cam.lispworks.com> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Mon, 07 Jun 2010 12:16:18 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: zfs i/o error, no driver error X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 17:16:19 -0000 On Mon, 7 Jun 2010, Martin Simmons wrote: > > It doesn't conflict if you agree that freshly written data is more likely to > be readable than data written long ago (with some curve in between). Depending on the actual failure mechanism, the inverse may actually be true. Freshly written data may be trash while old data still reads fine. > I don't know if there is any science behind that theory... The science is continually changing. A study done even 5 or 7 years ago may no longer be relevant. Regardless, actual results seen in the field count more than any theory.
Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 22:59:16 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D4636106564A for ; Mon, 7 Jun 2010 22:59:16 +0000 (UTC) (envelope-from brad@duttonbros.com) Received: from uno.mnl.com (uno.mnl.com [64.221.209.136]) by mx1.freebsd.org (Postfix) with ESMTP id 8E2BF8FC08 for ; Mon, 7 Jun 2010 22:59:16 +0000 (UTC) Received: from uno.mnl.com (localhost [127.0.0.1]) by uno.mnl.com (Postfix) with ESMTP id 6C5C619E8 for ; Mon, 7 Jun 2010 15:42:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=duttonbros.com; h= message-id:date:from:to:subject:mime-version:content-type :content-transfer-encoding; s=mail; bh=B49dhHgjMmD3QCQJxbuCP7QVk zg=; b=hTnMeva4DwL6QyyWXn7/Q0YYc0dGQhxUTseavB6QFsKjS+CgmkPahOckA hoRjLJ2U0KifDvXDuw3Q7Fon/7di58iwLWQK0doKYcw0A/Juu+7dKWl9mcKAYrcK FnQaSjQ6FhPRNgGDdMpJEQpAQYlCP0QBkIY+hWQQQykdyHXxGk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=duttonbros.com; h=message-id :date:from:to:subject:mime-version:content-type :content-transfer-encoding; q=dns; s=mail; b=E7+vgDYUshOdj/bNGGB OPEcadGJGK/8q7LAutTk4TBgIcthbSMAw+kFhyfhkrG2Oy45T6101QT5xNERQIwz ShvaM2VPd86fpmcBIP4tFYFDGcw0HPo0nPDUxKV/FOjQHXTeS1yblQ7ESX98ura4 Zzbeu8ysWA8hTQ7aEr3UCLis= Received: from localhost (localhost [127.0.0.1]) by uno.mnl.com (Postfix) with ESMTP id 291E019E7 for ; Mon, 7 Jun 2010 15:42:56 -0700 (PDT) Received: from noah.mnl.com (noah.mnl.com [192.168.0.31]) by duttonbros.com (Horde Framework) with HTTP; Mon, 07 Jun 2010 15:42:56 -0700 Message-ID: <20100607154256.941428ovaq2hha0g@duttonbros.com> Date: Mon, 07 Jun 2010 15:42:56 -0700 From: "Bradley W. Dutton" To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.7) / FreeBSD-8.1 Subject: ZFS performance of various vdevs (long post) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 22:59:16 -0000 Hi, I just upgraded a 5x500GB raidz (no NCQ) array to an 8x2TB raidz2 (NCQ) array. In the process I was expecting my new setup to absolutely tear through data due to having faster and additional drives. While the new setup is considerably faster than the old, some of the throughput rates weren't as high as I was expecting. I was hoping I could get some help to understand how ZFS is working or possibly identify some bottlenecks. My goal is to have ZFS on FreeBSD be the best it can. Below are benchmarks of the old 5 drive array (normal/raidz1/raidz2) and raidz2 of the new 8 drive array. As I'm using the new array I can't reformat it to test the other vdev types. Sorry in advance if this format is hard to read. Let me know if I omitted any key information. I did several runs of each of these commands and the results were close enough to each other that I didn't think any numbers were out of line due to caching.
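(For what it's worth, repeating a run was nothing fancier than a loop along these lines -- a sketch rather than the exact script:)

#!/bin/sh
# repeat the sequential write test; dd prints its transfer summary on stderr
for run in 1 2 3
do
    dd if=/dev/zero of=/bench/test.file bs=1m count=12000 2>> ddwrite.log
    rm /bench/test.file
done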
The PC I'm using to test: FreeBSD backup 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #0: Mon May 24 18:45:38 PDT 2010 root@backup:/usr/obj/usr/src/sys/BACKUP amd64 AMD Athlon X2 5600 4 gigs of RAM 5 SATA drives are Western Digital RE2 (7200rpm) using on board controller (Nvidia nForce 570 SLI MCP, no NCQ): WD5001ABYS (3 of these) WD5000YS (2 of these) Supermicro AOC-USAS-L8i PCI Express x8 controller (with NCQ): 8 Hitachi 2TB 7200rpm drives Relevant /boot/loader.conf settings: vm.kmem_size="3G" vfs.zfs.arc_max="2100M" vfs.zfs.arc_meta_limit="700M" vfs.zfs.prefetch_disable="0" My CPU metrics aren't anything official, just me monitoring top while these commands are running. I mostly kept track of CPU to see if any processes were CPU bound. These are a percentage of total CPU time on the box, so 50% would be 1 core maxed out. Changing the dd blocksize didn't seem to affect anything so I left it at 1M. Also, if the machine was running for a while and had various items cached in the ARC the speeds could be much slower, as much as half. The first ZFS benchmark was half as fast as the below numbers on a warm box (running for several days), so I rebooted to get max speed. The faster numbers weren't due to the data being cached; I observed higher throughput numbers using gstat. Instead of 30Mbytes/sec I would see 60 or 70. The RE2 drives do between 70-80Mbytes/sec sequential reading/writing:

#!/bin/sh
for disk in "ad4" "ad6" "ad10" "ad12" "ad14"
do
    dd if=/dev/${disk} of=/dev/null bs=1m count=4000 &
done

4194304000 bytes transferred in 49.603534 secs (84556556 bytes/sec) 4194304000 bytes transferred in 51.679365 secs (81160130 bytes/sec) 4194304000 bytes transferred in 52.642995 secs (79674494 bytes/sec) 4194304000 bytes transferred in 57.742892 secs (72637581 bytes/sec) 4194304000 bytes transferred in 58.189738 secs (72079789 bytes/sec) CPU usage is low when doing these 5 reads, <10% The Hitachi drives do 120-130Mbytes/sec sequential read/write:

#!/bin/sh
for disk in "da0" "da1" "da2" "da3" "da4" "da5" "da6" "da7"
do
    dd if=/dev/${disk} of=/dev/null bs=1m count=4000 &
done

4194304000 bytes transferred in 31.980469 secs (131152048 bytes/sec) 4194304000 bytes transferred in 32.349440 secs (129656155 bytes/sec) 4194304000 bytes transferred in 32.776024 secs (127968664 bytes/sec) 4194304000 bytes transferred in 32.951440 secs (127287427 bytes/sec) 4194304000 bytes transferred in 33.048651 secs (126913017 bytes/sec) 4194304000 bytes transferred in 33.057686 secs (126878331 bytes/sec) 4194304000 bytes transferred in 33.374149 secs (125675234 bytes/sec) 4194304000 bytes transferred in 35.226584 secs (119066441 bytes/sec) CPU usage is around 25-30% Now on to the ZFS benchmarks:

#
# a regular ZFS pool for the 5 drive array
#
zpool create bench /dev/ad4 /dev/ad6 /dev/ad10 /dev/ad12 /dev/ad14
dd if=/dev/zero of=/bench/test.file bs=1m count=12000

12582912000 bytes transferred in 39.687730 secs (317047913 bytes/sec) 30-35% CPU All 5 drives are written to so we have: 317/5 = ~63Mbytes/sec This is close to 70Mbytes/sec so I'm ok with these numbers. I'm not sure how much overhead the checksumming adds, so perhaps that accounts for the throughput gap here?

dd if=/bench/test.file of=/dev/null bs=1m

12582912000 bytes transferred in 34.668165 secs (362952928 bytes/sec) around 30% CPU All 5 drives are read from so we have: 362/5 = ~72Mbytes/sec This seems to be max speed considering the slowest drives in the pool run at this speed.
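(To capture the per-disk rates I keep quoting from gstat without sitting at the console, something like this also works -- a sketch using the 5 drive array:)

#!/bin/sh
# sample extended per-device statistics every 5 seconds while a test runs
iostat -x ad4 ad6 ad10 ad12 ad14 5 > iostat.log &
IOSTAT=$!
dd if=/bench/test.file of=/dev/null bs=1m
kill $IOSTAT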
#
# a ZFS raidz pool for the 5 drive array
#
zpool destroy bench
zpool create bench raidz /dev/ad4 /dev/ad6 /dev/ad10 /dev/ad12 /dev/ad14
dd if=/dev/zero of=/bench/test.file bs=1m count=12000

12582912000 bytes transferred in 54.357053 secs (231486281 bytes/sec) CPU varied widely, between 30 and 70%, kernel process using most, then dd Only 4 of 5 are writing actual data, correct? So we have: 231/4 = ~58Mbytes/sec (this seems to be similar to gstat) We are getting a bit slower here than our reference 70Mbytes/sec and compared to 63 in the regular vdev.

dd if=/bench/test.file of=/dev/null bs=1m

12582912000 bytes transferred in 45.825533 secs (274582993 bytes/sec) around 40% CPU, kernel then dd using the most CPU Again only 4 of 5 have data, so is the throughput this? 274/4 = ~68Mbytes/sec (looks to be similar to gstat) This is good and close to max speed.

#
# a ZFS raidz2 pool for the 5 drive array
#
zpool destroy bench
zpool create bench raidz2 /dev/ad4 /dev/ad6 /dev/ad10 /dev/ad12 /dev/ad14
dd if=/dev/zero of=/bench/test.file bs=1m count=12000

12582912000 bytes transferred in 97.491160 secs (129067210 bytes/sec) CPU varied a lot 15-50%, a burst or two to 75% Only 3 of 5 are writing actual data, correct? So we have: 129/3 = ~43Mbytes/sec (gstat was varying quite a bit here, as low as 5, as high as 60) These speeds are now quite a bit lower than I would expect. Is calculation overhead causing the discrepancy here? Is the CPU too slow?

dd if=/bench/test.file of=/dev/null bs=1m

12582912000 bytes transferred in 58.947959 secs (213457976 bytes/sec) around 30% CPU Only 3 of 5 have data and I'm not sure how to calculate throughput. I'm guessing the round robin reads help boost these numbers (read 3 data disks + 1 parity so only 4 of 5 drives are in use for any given read?). gstat shows rates around 40Mbytes/sec even though I would expect closer to 60-70. 213/3 = ~71Mbytes/sec (although I don't think we can do this calculation this way)

#
# ZFS raidz2 pool on the 8 drive array
# this pool is about 15% used so the read/write tests aren't necessarily
# on the fastest part of the disks.
#
zpool create tank raidz2 /dev/da0 /dev/da1 /dev/da2 /dev/da3 /dev/da4 /dev/da5 /dev/da6 /dev/da7
dd if=/dev/zero of=/tank/test.file bs=1m count=12000

12582912000 bytes transferred in 40.878876 secs (307809638 bytes/sec) varying 40-70% CPU (a few bursts into the 90s), kernel then dd using most of it 307/6 = ~51Mbytes/sec (gstat varied quite a bit, 20-80; it seems to average in the 50s as dd reported) Per disk this isn't much faster than the old array, 51 compared to 43. With a few bursts to 95% CPU it seems as though some of this could be CPU bound.

dd if=/tank/test.file of=/dev/null bs=1m

12582912000 bytes transferred in 32.911291 secs (382328118 bytes/sec) around 55% CPU, mostly kernel then dd Similar to the raidz2 test above, I don't think we can calculate throughput this way. In any case, this is actually slower per disk than the old array. 382/6 = ~64Mbytes/sec (gstat seemed to be around 50 so I'm guessing the round robin reading is creating more throughput)

#
# wrap up
#
So the normal vdev performs closest to raw drive speeds. Raidz1 is slower and raidz2 even more so. This is observable in the dd tests and in gstat. Any ideas why the raid numbers are slower? I've tried to account for the fact that the raid vdevs have fewer data disks. Would a faster CPU help here? Unfortunately I migrated all of my data to the new array so I can't run all of my tests on there.
It would have been nice to see if a normal pool (non raid) on these disks would have come close to max speeds of 120-130Mbytes/sec (giving a total pool throughput close to 1Gbyte/sec) as the smaller array did with respect to its max speed. I noticed scrubbing the big array is CPU bound as the kernel process is at 99% when running (total CPU is 50% as the scrub doesn't multithread/process). The disks are running around 45-50Mbytes/sec in gstat. Scrubbing the smaller/slower array isn't CPU bound and the disks run at close to max speed. Thanks for your time, Brad From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 23:19:18 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0A3831065678 for ; Mon, 7 Jun 2010 23:19:18 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id C4FC18FC14 for ; Mon, 7 Jun 2010 23:19:17 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.13.8+Sun/8.13.8) with ESMTP id o57NJFIq004113; Mon, 7 Jun 2010 18:19:16 -0500 (CDT) Date: Mon, 7 Jun 2010 18:19:15 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: "Bradley W. Dutton" In-Reply-To: <20100607154256.941428ovaq2hha0g@duttonbros.com> Message-ID: References: <20100607154256.941428ovaq2hha0g@duttonbros.com> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Mon, 07 Jun 2010 18:19:16 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS performance of various vdevs (long post) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 23:19:18 -0000 On Mon, 7 Jun 2010, Bradley W. Dutton wrote: > So the normal vdev performs closest to raw drive speeds. Raidz1 is slower and > raidz2 even more so. This is observable in the dd tests and in gstat. > Any ideas why the raid numbers are slower? I've tried to account for the fact > that the raid vdevs have fewer data disks. Would a faster CPU help here? The sequential throughput on your new drives is faster than the old drives, but it is likely that the seek and rotational latencies are longer. ZFS is transaction-oriented and must tell all the drives to sync their write cache before proceeding to the next transaction group. Drives with more latency will slow down this step. Likewise, ZFS always reads and writes full filesystem blocks (default 128K) and this may cause more overhead when using raidz. Using 'dd' from /dev/zero is not a very good benchmark test since zfs could potentially compress zero-filled blocks down to just a few bytes (I think recent versions of zfs do this) and of course Unix supports files with holes. The higher CPU usage might be due to the device driver or the interface card being used. If you could afford to do so, you will likely see considerably better performance by using mirrors instead of raidz since then 128K blocks will be sent to each disk, with fewer seeks.
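For example (only a sketch, reusing your device names), the eight disks as four striped 2-way mirrors:

zpool create tank \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5 \
  mirror da6 da7

You give up half the raw capacity relative to raidz2, but each 128K block lands on a single mirror with no parity computation, and reads can be spread across both sides of every mirror.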
Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Mon Jun 7 23:29:14 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3FC5106566B for ; Mon, 7 Jun 2010 23:29:14 +0000 (UTC) (envelope-from peterjeremy@acm.org) Received: from mail14.syd.optusnet.com.au (mail14.syd.optusnet.com.au [211.29.132.195]) by mx1.freebsd.org (Postfix) with ESMTP id 54F788FC08 for ; Mon, 7 Jun 2010 23:29:13 +0000 (UTC) Received: from server.vk2pj.dyndns.org (c211-30-160-13.mirnd2.nsw.optusnet.com.au [211.30.160.13] (may be forged)) by mail14.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id o57NTAj3000531 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 8 Jun 2010 09:29:12 +1000 X-Bogosity: Ham, spamicity=0.000000 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.4/8.14.4) with ESMTP id o57NTApC058778 for ; Tue, 8 Jun 2010 09:29:10 +1000 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.4/8.14.4/Submit) id o57NTACe058777 for freebsd-fs@freebsd.org; Tue, 8 Jun 2010 09:29:10 +1000 (EST) (envelope-from peter) Date: Tue, 8 Jun 2010 09:29:10 +1000 From: Peter Jeremy To: freebsd-fs@freebsd.org Message-ID: <20100607232909.GA57423@server.vk2pj.dyndns.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="jI8keyz6grp/JLjh" Content-Disposition: inline X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.20 (2009-06-14) Subject: ZFS memory usage X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Jun 2010 23:29:14 -0000 --jI8keyz6grp/JLjh Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Currently, ZFS does not appear to be able to steal memory from the "inactive" list, whereas NFS and UFS both return "freed" pages to the "inactive" list. Over time, unless you have a pure ZFS box (with no NFS), this tends to result in ZFS reporting a memory shortage (kstat.zfs.misc.arcstats.memory_throttle_count increasing), whilst there is plenty of "inactive" space. What is involved in correcting this? At least part of the problem is that cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:arc_memory_throttle() only looks at cnt.v_free_count (number of free pages) when deciding whether to throttle or not. Is the fix as simple as changing the test to check (cnt.v_free_count + cnt.v_inactive_count)? Assuming that the fix is non-trivial, is there an easy way to transfer "inactive" memory to the "free" list? The perl hack: perl -e '$x = "x" x 1000000;' sort-of works - by forcing the VM system into real memory shortage. Is there a better work-around?
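For reference, the counters I am watching are roughly these (a sketch; names as they appear on 8.x):

#!/bin/sh
# free and inactive page-queue sizes, plus the ARC throttle counter
sysctl vm.stats.vm.v_free_count vm.stats.vm.v_inactive_count \
    kstat.zfs.misc.arcstats.memory_throttle_count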
-- Peter Jeremy --jI8keyz6grp/JLjh Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEUEARECAAYFAkwNgMUACgkQ/opHv/APuIdQ3gCgvipKI+Dalgu9JATA2CHohjy1 8U8AljXf+S28MzAjT0It336mNQGC0wQ= =jsoE -----END PGP SIGNATURE----- --jI8keyz6grp/JLjh-- From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 00:14:29 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1026B106564A for ; Tue, 8 Jun 2010 00:14:29 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id D7CCD8FC08 for ; Tue, 8 Jun 2010 00:14:28 +0000 (UTC) Received: by pwj1 with SMTP id 1so2168144pwj.13 for ; Mon, 07 Jun 2010 17:14:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=docQGOGeooXh2/Wb2yeWf/H4pTSreSPXF478hBzazG4=; b=ntqQp2WTRrhODBUbKvpH4DnJz5cXXSJoGpFRQ43mSIR/oPAZMTGJVbPxD4azYWjc3c AhBVNJw0OuMDWes083ds9oLElyMX9oJT3TDHGUY/BeIXPPwi51mNa0yGTDR3XVn0Crs4 +0yl+q2c4lZCzgcHN5CjgvcIYG5YcXlW1/Qxc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=kBfAKGJkfOJYrCPa4/h4CLbMloVy0hVUIEGFV0S4JPUac2NKAIUoRFtXk2nI69g+5X /b3212oaT3vyTOYu0F/X+xPDkdkD9J/sJxeX1O77NRBmmjbrryL1UMYCqzOV+8BJrGgF A4u4UzkOGO1wHmymSiPIsbDGdRcWe39SrIEzA= MIME-Version: 1.0 Received: by 10.140.248.7 with SMTP id v7mr12523527rvh.252.1275956068293; Mon, 07 Jun 2010 17:14:28 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.141.40.4 with HTTP; Mon, 7 Jun 2010 17:14:28 -0700 (PDT) In-Reply-To: <20100607232909.GA57423@server.vk2pj.dyndns.org> References: <20100607232909.GA57423@server.vk2pj.dyndns.org> Date: Mon, 7 Jun 2010 17:14:28 -0700 X-Google-Sender-Auth: RzqYfPu2vObfOqlhkkxvNiPUqXI Message-ID: From: Artem Belevich To: Peter Jeremy Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS memory usage X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 00:14:29 -0000 I believe it's pagedaemon's job to push pages from active list to inactive and from inactive down to cache and free. I have a really ugly hack to arc.c which forces pagedaemon wakeup if ARC sees too much memory on inactive list. How much is too much is defined by a sysctl value. http://pastebin.com/ZCkzkWcs Be warned: it's ugly, it may not work, it assumes too much, it's plain broken, it may ... I'm serious -- I have seen my box locking up when I did manage to exhaust memory. The only reason I'm posting this ugliness at all is because of the hope that someone more familiar with memory allocation in FreeBSD may be able to suggest a better approach. --Artem On Mon, Jun 7, 2010 at 4:29 PM, Peter Jeremy wrote: > Currently, ZFS does not appear to be able to steal memory from the > "inactive" list, whereas NFS and UFS both return "freed" pages to the > "inactive" list.
Over time, unless you have a pure ZFS box (with no > NFS), this tends to result in ZFS reporting a memory shortage > (kstat.zfs.misc.arcstats.memory_throttle_count increasing), whilst > there is plenty of "inactive" space. > > What is involved in correcting this? > > At least part of the problem is that > cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:arc_memory_throttle() > only looks at cnt.v_free_count (number of free pages) when deciding > whether to throttle or not. Is the fix as simple as changing the > test to check (cnt.v_free_count + cnt.v_inactive_count)? > > Assuming that the fix is non-trivial, is there an easy way to transfer > "inactive" memory to the "free" list? The perl hack: > perl -e '$x = "x" x 1000000;' > sort-of works - by forcing the VM system into real memory shortage. > Is there a better work-around? > > -- > Peter Jeremy > From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 00:30:13 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 35EA6106566C for ; Tue, 8 Jun 2010 00:30:13 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id EA96A8FC1B for ; Tue, 8 Jun 2010 00:30:12 +0000 (UTC) Received: by email.octopus.com.au (Postfix, from userid 1002) id 995795CB94D; Tue, 8 Jun 2010 10:23:10 +1000 (EST) X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on email.octopus.com.au X-Spam-Level: **** X-Spam-Status: No, score=4.4 required=10.0 tests=ALL_TRUSTED, DNS_FROM_OPENWHOIS,FH_DATE_PAST_20XX autolearn=no version=3.2.3 Received: from [10.1.50.144] (142.19.96.58.static.exetel.com.au [58.96.19.142]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id 86DA95CB938 for ; Tue, 8 Jun 2010 10:23:06 +1000 (EST) Message-ID: <4C0D8F09.4090009@modulus.org> Date: Tue, 08 Jun 2010 10:30:01 +1000 From: Andrew Snow User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <20100607232909.GA57423@server.vk2pj.dyndns.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: ZFS memory usage X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 00:30:13 -0000 I think that for most ZFS users, any UFS partition will only be a root/boot partition anyway. So in a mixed ZFS/UFS system, ideally the Inactive pages should be slowly returned to ARC. It is desirable for ZFS to get a majority of the available (inactive) memory, to improve its performance. Currently we have the opposite situation in effect.
- Andrew From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 00:32:20 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1366A106567B for ; Tue, 8 Jun 2010 00:32:20 +0000 (UTC) (envelope-from brad@duttonbros.com) Received: from uno.mnl.com (uno.mnl.com [64.221.209.136]) by mx1.freebsd.org (Postfix) with ESMTP id C80F28FC0C for ; Tue, 8 Jun 2010 00:32:19 +0000 (UTC) Received: from uno.mnl.com (localhost [127.0.0.1]) by uno.mnl.com (Postfix) with ESMTP id 027DE1A2B; Mon, 7 Jun 2010 17:32:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=duttonbros.com; h= message-id:date:from:to:cc:subject:references:in-reply-to :mime-version:content-type:content-transfer-encoding; s=mail; bh=l7lEN0CgegfYT9yurIpXOq9Yu+4=; b=tZyAKeXj1DuHQJCEdl6Rx3+l6ACd KbtuAw2KN9SXGd1ZXhy2uFndD2MLrDKO5CqNZDGJwETDRgiPSIz0jfKPf+uRVj4x aOsrSXWUjorHSWIAa79pTTLTDHGBatHhf0l+BMb/uw/6aYyexh8C43PVt0f+KOTo ScfxdmzTbZC+o2Y= DomainKey-Signature: a=rsa-sha1; c=nofws; d=duttonbros.com; h=message-id :date:from:to:cc:subject:references:in-reply-to:mime-version :content-type:content-transfer-encoding; q=dns; s=mail; b=LONbBj Ys6vyu0eQ7KcvkUUylJnXVXhMBtWqlzAz9aBRZT/aMso/3IaijZwZgXHwBh4bm6T GCRc8qi+2wcQLsLGYH4/2/yDoLBiO2VwZwEvzGUAVC408u1j3iH+Iaip4ODSWZfr cuNmZ6JNRZO+ZEvUZRP7By2z5b2VRXrP+wFkE= Received: from localhost (localhost [127.0.0.1]) by uno.mnl.com (Postfix) with ESMTP id E1E821A29; Mon, 7 Jun 2010 17:32:18 -0700 (PDT) Received: from noah.mnl.com (noah.mnl.com [192.168.0.31]) by duttonbros.com (Horde Framework) with HTTP; Mon, 07 Jun 2010 17:32:18 -0700 Message-ID: <20100607173218.11716iopp083dbpu@duttonbros.com> Date: Mon, 07 Jun 2010 17:32:18 -0700 From: "Bradley W. Dutton" To: Bob Friesenhahn References: <20100607154256.941428ovaq2hha0g@duttonbros.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.7) / FreeBSD-8.1 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS performance of various vdevs (long post) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 00:32:20 -0000 Quoting Bob Friesenhahn : > On Mon, 7 Jun 2010, Bradley W. Dutton wrote: >> So the normal vdev performs closest to raw drive speeds. Raidz1 is >> slower and raidz2 even more so. This is observable in the dd tests >> and in gstat. Any ideas why the raid numbers are slower? >> I've tried to account for the fact that the raid vdevs have fewer >> data disks. Would a faster CPU help here? > > The sequential throughput on your new drives is faster than the old > drives, but it is likely that the seek and rotational latencies are > longer. ZFS is transaction-oriented and must tell all the drives to > sync their write cache before proceeding to the next transaction > group. Drives with more latency will slow down this step. > Likewise, ZFS always reads and writes full filesystem blocks > (default 128K) and this may cause more overhead when using raidz. The details are a little lacking on the Hitachi site, but the HDS722020ALA330 says 8.2 ms seek time.
http://www.hitachigst.com/tech/techlib.nsf/techdocs/5F2DC3B35EA0311386257634000284AD/$file/USA7K2000_DS7K2000_OEMSpec_r1.2.pdf The WDC drives say 8.9 ms, so we should be in the same ballpark on seek times. http://www.wdc.com/en/products/products.asp?driveid=399 I thought the NCQ vs no NCQ might tip the scales in favor of the Hitachi array as well. Are there any tools to check the latencies of the disks? > Using 'dd' from /dev/zero is not a very good benchmark test since > zfs could potentially compress zero-filled blocks down to just a few > bytes (I think recent versions of zfs do this) and of course Unix > supports files with holes. I know it's pretty simple but for checking throughput I thought it would be ok. I don't have compression on and based on the drive lights and gstat, the drives definitely aren't idle. > The higher CPU usage might be due to the device driver or the > interface card being used. Definitely a plausible explanation. If this was the case would the 8 parallel dd processes exhibit the same behavior? or is the type of IO affecting how much CPU the driver is using? > If you could afford to do so, you will likely see considerably > better performance by using mirrors instead of raidz since then 128K > blocks will be sent to each disk, with fewer seeks. I agree with you, but at this point I value the extra space more, as I don't have a lot of random IO. I read the following and decided to stick with raidz2 when ditching my old raidz1 setup: http://blogs.sun.com/roch/entry/when_to_and_not_to Thanks for the feedback, Brad From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 01:37:53 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0922B1065672 for ; Tue, 8 Jun 2010 01:37:53 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id C46E78FC1B for ; Tue, 8 Jun 2010 01:37:52 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.13.8+Sun/8.13.8) with ESMTP id o581bopE004728; Mon, 7 Jun 2010 20:37:51 -0500 (CDT) Date: Mon, 7 Jun 2010 20:37:50 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: "Bradley W. Dutton" In-Reply-To: <20100607173218.11716iopp083dbpu@duttonbros.com> Message-ID: References: <20100607154256.941428ovaq2hha0g@duttonbros.com> <20100607173218.11716iopp083dbpu@duttonbros.com> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Mon, 07 Jun 2010 20:37:51 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS performance of various vdevs (long post) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 01:37:53 -0000 On Mon, 7 Jun 2010, Bradley W. Dutton wrote: > > Are there any tools to check the latencies of the disks? There might be something better, but 'iostat -x' is definitely your friend when it comes to looking at latencies under load. Use a sample time of 30 seconds ('iostat -x 30'). Check if a few disks are much slower than the others.
If they are all about the same, then the disks are likely operating ok. Sometimes it is found that one or two disks are abnormally slow, and this slows down the whole raidz. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 04:47:09 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 479C3106566C for ; Tue, 8 Jun 2010 04:47:09 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta05.emeryville.ca.mail.comcast.net (qmta05.emeryville.ca.mail.comcast.net [76.96.30.48]) by mx1.freebsd.org (Postfix) with ESMTP id 2C66A8FC13 for ; Tue, 8 Jun 2010 04:47:08 +0000 (UTC) Received: from omta17.emeryville.ca.mail.comcast.net ([76.96.30.73]) by qmta05.emeryville.ca.mail.comcast.net with comcast id TG4f1e0051afHeLA5Gn8eh; Tue, 08 Jun 2010 04:47:08 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta17.emeryville.ca.mail.comcast.net with comcast id TGn71e00C3S48mS8dGn7A8; Tue, 08 Jun 2010 04:47:08 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 526C29B418; Mon, 7 Jun 2010 21:47:07 -0700 (PDT) Date: Mon, 7 Jun 2010 21:47:07 -0700 From: Jeremy Chadwick To: "Bradley W. Dutton" Message-ID: <20100608044707.GA78147@icarus.home.lan> References: <20100607154256.941428ovaq2hha0g@duttonbros.com> <20100607173218.11716iopp083dbpu@duttonbros.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100607173218.11716iopp083dbpu@duttonbros.com> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS performance of various vdevs (long post) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 04:47:09 -0000 On Mon, Jun 07, 2010 at 05:32:18PM -0700, Bradley W. Dutton wrote: > Quoting Bob Friesenhahn : > > >On Mon, 7 Jun 2010, Bradley W. Dutton wrote: > >>So the normal vdev performs closest to raw drive speeds. Raidz1 > >>is slower and raidz2 even more so. This is observable in the dd > >>tests and in gstat. Any ideas why the raid numbers are > >>slower? I've tried to account for the fact that the raid vdevs > >>have fewer data disks. Would a faster CPU help here? > > > >The sequential throughput on your new drives is faster than the > >old drives, but it is likely that the seek and rotational > >latencies are longer. ZFS is transaction-oriented and must tell > >all the drives to sync their write cache before proceeding to the > >next transaction group. Drives with more latency will slow down > >this step. Likewise, ZFS always reads and writes full filesystem > >blocks (default 128K) and this may cause more overhead when using > >raidz. > > The details are a little lacking on the Hitachi site, but the > HDS722020ALA330 says 8.2 ms seek time. > http://www.hitachigst.com/tech/techlib.nsf/techdocs/5F2DC3B35EA0311386257634000284AD/$file/USA7K2000_DS7K2000_OEMSpec_r1.2.pdf > > The WDC drives say 8.9 ms, so we should be in the same ballpark on seek times. > http://www.wdc.com/en/products/products.asp?driveid=399 > > I thought the NCQ vs no NCQ might tip the scales in favor of the > Hitachi array as well. I'm not sure you understand NCQ.
What you're doing in your dd test is individual dd's on each disk. NCQ is a per-disk thing. What you need to test is multiple concurrent transactions *per disk*. What I'm trying to say is that NCQ vs. no-NCQ isn't the culprit here, because your testbench model isn't making use of it. > I know it's pretty simple but for checking throughput I thought it > would be ok. I don't have compression on and based on the drive > lights and gstat, the drives definitely aren't idle. Try disabling prefetch (you have it enabled) and try setting vfs.zfs.txg.timeout="5". Some people have reported a "sweet spot" with regards to the last parameter (needing to be adjusted if your disks are extremely fast, etc.), as otherwise ZFS would be extremely "bursty" in its I/O (stalling/deadlocking the system at set intervals). By decreasing the value you essentially do disk writes more regularly (with less data), and depending upon the load and controller, this may even out performance. > >The higher CPU usage might be due to the device driver or the > >interface card being used. > > Definitely a plausible explanation. If this was the case would the 8 > parallel dd processes exhibit the same behavior? or is the type of > IO affecting how much CPU the driver is using? It would be the latter. Also, I believe this Supermicro controller has been discussed in the past. I can't remember if people had outright failures/issues with it or if people were complaining about sub-par performance. I could also be remembering a different Supermicro controller. If I had to make a recommendation, it would be to reproduce the same setup on a system using an Intel ICH9/ICH9R or ICH10/ICH10R controller in AHCI mode (with ahci.ko loaded, not ataahci.ko) and see if things improve. But start with the loader.conf tunables I mentioned above -- segregate each test. I would also recommend you re-run your tests with a different blocksize for dd. I don't know why people keep using 1m (Linux websites?). Test the following increments: 4k, 8k, 16k, 32k, 64k, 128k, 256k. That's about where you should stop. Otherwise, consider installing ports/benchmarks/bonnie++ and try that. That will also get you concurrent I/O tests, I believe. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 04:47:36 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 53E591065670 for ; Tue, 8 Jun 2010 04:47:36 +0000 (UTC) (envelope-from jurgen@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id 8B1B78FC1E for ; Tue, 8 Jun 2010 04:47:35 +0000 (UTC) Received: from ip-211.ish.com.au ([203.29.62.211]:29587 helo=ish.com.au) by fish.ish.com.au with esmtp (Exim 4.69) (envelope-from ) id 1OLqiq-0000C5-0l; Tue, 08 Jun 2010 14:47:32 +1000 Received: from [203.29.62.154] (HELO ip-154.ish.com.au) by ish.com.au (CommuniGate Pro SMTP 5.3.7) with ESMTP id 5951910; Tue, 08 Jun 2010 14:47:32 +1000 Message-ID: <4C0DCB64.5090002@ish.com.au> Date: Tue, 08 Jun 2010 14:47:32 +1000 From: Jurgen Weber User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.8) Gecko/20100310 Shredder/3.0.4pre MIME-Version: 1.0 To: jhell References: <4C0C6B54.8020005@ish.com.au> <4C0CF27B.1050402@dataix.net> In-Reply-To: <4C0CF27B.1050402@dataix.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: zfs filesystem problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 04:47:36 -0000 Thanks. I was able to reboot this machine last night, which solved the immediate problem. I'll let you know how I go. On 7/06/10 11:22 PM, jhell wrote: > On 06/07/2010 01:46, Sergiy Suprun wrote: >> On Mon, Jun 7, 2010 at 06:45, Jurgen Weber wrote: >> >>> Hello >>> >>> I have a FreeBSD 8.0-p2 system, which runs two pools. One with 6 disks all >>> mirrored for our data and another mirrored pool for the OS. The system has >>> 16GB of RAM. >>> >>> I have a nightly cron script running which takes a snapshot of a particular >>> file system within the storage pool. This has been running for just over a >>> month now without any issues until this weekend. >>> >>> Now we can not access the mentioned file system. If we try to `ls` to it or >>> `cd` into it the shell locks up (not even kill -9 can stop the `ls` >>> processes, etc) and top shows that the process state is `zfs`. > > This is most likely caused by some bugs that were found and fixed in > stable/8. One of the commits that mm@ made touched the zio->iowait > path that you should see your processes stuck in. > > There still seem to be (at least in my case) some zio->iowait problems going > on that I have not pinned down to a cause yet, but they have not > caused any of my system processes to freeze in that state. > > Grab a kernel from one of the snapshots that were made sometime last > month to test this out, just to be sure you're not upgrading for no > reason. When I say kernel I mean the kernel& the modules that go with it, as ZFS > is a module and you will obviously need that. > > Please report back on your findings and whether the kernel from stable fixes your > problem. > > URL to retrieve snapshots: http://bit.ly/aLoXXV > > Good Luck!, > > > >> Hello. >> How about scrub ? >> And what size are your pools, and how much space is used by data+snapshots?
> -- --------------------------> ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001 From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 07:26:04 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AED6A1065672 for ; Tue, 8 Jun 2010 07:26:04 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id F2B808FC1F for ; Tue, 8 Jun 2010 07:26:03 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA13343; Tue, 08 Jun 2010 10:25:57 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1OLtC9-000KOw-4E; Tue, 08 Jun 2010 10:25:57 +0300 Message-ID: <4C0DF084.6090106@icyb.net.ua> Date: Tue, 08 Jun 2010 10:25:56 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100603) MIME-Version: 1.0 To: Artem Belevich References: <20100607232909.GA57423@server.vk2pj.dyndns.org> In-Reply-To: X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS memory usage X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 07:26:04 -0000 on 08/06/2010 03:14 Artem Belevich said the following: > I believe it's pagedaemon's job to push pages from the active list to > inactive, and from inactive down to cache and free. > I have a really ugly hack to arc.c which forces a pagedaemon wakeup if > ARC sees too much memory on the inactive list. > How much is too much is defined by a sysctl value. > > http://pastebin.com/ZCkzkWcs > > Be warned: it's ugly, it may not work, it assumes too much, it's plain > broken, it may ... > I'm serious -- I have seen my box locking up when I did manage to > exhaust memory. The only reason I'm posting this ugliness at all is > because of the hope that someone more familiar with memory allocation in > FreeBSD may be able to suggest a better approach. I think it's a good start. I did a much more primitive thing locally and even that improved things for me a lot - I simply dropped the "vm_paging_target() > -2048" check. Kip Macy is aware of this situation, perhaps he'll look into resolving it.
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 08:56:15 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A05B1065673 for ; Tue, 8 Jun 2010 08:56:15 +0000 (UTC) (envelope-from anders@FreeBSD.org) Received: from fupp.net (totem.fix.no [80.91.36.20]) by mx1.freebsd.org (Postfix) with ESMTP id D79D98FC14 for ; Tue, 8 Jun 2010 08:56:14 +0000 (UTC) Received: from localhost (totem.fix.no [80.91.36.20]) by fupp.net (Postfix) with ESMTP id 19B0E47114 for ; Tue, 8 Jun 2010 10:36:50 +0200 (CEST) Received: from fupp.net ([80.91.36.20]) by localhost (totem.fix.no [80.91.36.20]) (amavisd-new, port 10024) with LMTP id uE60sxk492mZ for ; Tue, 8 Jun 2010 10:36:49 +0200 (CEST) Received: by fupp.net (Postfix, from userid 1000) id E670747113; Tue, 8 Jun 2010 10:36:49 +0200 (CEST) Date: Tue, 8 Jun 2010 10:36:49 +0200 From: Anders Nordby To: freebsd-fs@freebsd.org Message-ID: <20100608083649.GA77452@fupp.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline User-Agent: Mutt/1.4.2.3i X-PGP-Key: http://anders.fix.no/pgp/ X-PGP-Key-FingerPrint: 1E0F C53C D8DF 6A8F EAAD 19C5 D12A BC9F 0083 5956 Subject: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 08:56:15 -0000 Hi! I have a file server running 8.1-PRERELEASE amd64, where I share some filesystems using NFS and Samba. After running for a day or two, the server starts to get around 25% packet loss, browsing directories across NFS gets really slow etc. Rebooting solves it until it happens again. Has anyone experienced anything similar? I had this issue in FreeBSD 7 as well, upgrading did not help. PS: I used mountd and /etc/exports to share the filesystems. I also regularly run zpool status from monitoring systems. I also replaced the server physically, changed switch ports, cables etc. So it does not seem to be a problem with hardware. Bye, -- Anders. 
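For illustration, a minimal sketch of the kind of mountd/exports setup described in the PS above; the filesystem path and client network here are hypothetical, not taken from this server:

---snip---
# /etc/exports (sketch, hypothetical path and network)
/tank/export -maproot=root -network 192.168.120.0 -mask 255.255.255.0
---snip---

After editing the file, mountd re-reads it on SIGHUP: kill -HUP $(cat /var/run/mountd.pid).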
From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 10:01:21 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E0E611065672 for ; Tue, 8 Jun 2010 10:01:20 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta14.westchester.pa.mail.comcast.net (qmta14.westchester.pa.mail.comcast.net [76.96.59.212]) by mx1.freebsd.org (Postfix) with ESMTP id A381E8FC1E for ; Tue, 8 Jun 2010 10:01:20 +0000 (UTC) Received: from omta07.westchester.pa.mail.comcast.net ([76.96.62.59]) by qmta14.westchester.pa.mail.comcast.net with comcast id TMir1e0021GhbT85EMl6ha; Tue, 08 Jun 2010 09:45:06 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta07.westchester.pa.mail.comcast.net with comcast id TMl51e00A3S48mS3TMl6qS; Tue, 08 Jun 2010 09:45:06 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 15FBD9B418; Tue, 8 Jun 2010 02:45:04 -0700 (PDT) Date: Tue, 8 Jun 2010 02:45:04 -0700 From: Jeremy Chadwick To: Anders Nordby Message-ID: <20100608094504.GA86086@icarus.home.lan> References: <20100608083649.GA77452@fupp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100608083649.GA77452@fupp.net> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 10:01:21 -0000 On Tue, Jun 08, 2010 at 10:36:49AM +0200, Anders Nordby wrote: > I have a file server running 8.1-PRERELEASE amd64, where I share some > filesystems using NFS and Samba. After running for a day or two, the > server starts to get around 25% packet loss, browsing directories across > NFS gets really slow etc. Rebooting solves it until it happens again. > Has anyone experienced anything similar? I had this issue in FreeBSD 7 > as well, upgrading did not help. > > PS: I used mountd and /etc/exports to share the filesystems. I also > regularly run zpool status from monitoring systems. I also replaced the > server physically, changed switch ports, cables etc. So it does not seem > to be a problem with hardware. For what it's worth, we have a similar setup, but without Samba. The machine happens to be running 8.0-STABLE (world/kernel Mon Apr 26 02:26:36). No packet loss seen, and no overall issues aside from some input errors on our em1 NIC (which do not correlate with errors on the switch it's connected to). NFS is not used heavily, aside from daily backup jobs across a gigE network. The machine has been up 42 days.
$ netstat -ibn Name Mtu Network Address Ipkts Ierrs Idrop Ibytes Opkts Oerrs Obytes Coll em0 1500 XX:XX:XX:XX:XX:XX 1541235 0 0 344966704 359378 0 255337127 0 em0 1500 XX.XX.XX.XX/X XX.XX.XX.XX 424637 - - 271854885 359033 - 250287803 - em1 1500 XX:XX:XX:XX:XX:XX 62851814 59 0 81941673228 43418668 0 3520821042 0 em1 1500 XX.XX.XX.XX/X XX.XX.XX.XX 62813816 - - 81059999660 43464778 - 2912408432 - lo0 16384 3288 0 0 435566 3288 0 435566 0 lo0 16384 127.0.0.0/8 127.0.0.1 3288 - - 435566 3288 - 435566 - $ netstat -m 3085/2165/5250 mbufs in use (current/cache/total) 2549/2433/4982/25600 mbuf clusters in use (current/cache/total/max) 2048/896 mbuf+clusters out of packet secondary zone in use (current/cache) 0/104/104/12800 4k (page size) jumbo clusters in use (current/cache/total/max) 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) 5869K/5823K/11692K bytes allocated to network (current/cache/total) 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0/0/0 sfbufs in use (current/peak/max) 0 requests for sfbufs denied 0 requests for sfbufs delayed 0 requests for I/O initiated by sendfile 0 calls to protocol drain routines switch# show interfaces 2 Status and Counters - Port Counters for port 2 Name : XXXXXXXXXX Link Status : Up Totals (Since boot or last clear) : Bytes Rx : 2,512,873,738 Bytes Tx : 1,117,402,842 Unicast Rx : 169,702,228 Unicast Tx : 237,760,196 Bcast/Mcast Rx : 12,667 Bcast/Mcast Tx : 75,726 Errors (Since boot or last clear) : FCS Rx : 0 Drops Rx : 0 Alignment Rx : 0 Collisions Tx : 0 Runts Rx : 0 Late Colln Tx : 0 Giants Rx : 0 Excessive Colln : 0 Total Rx Errors : 0 Deferred Tx : 0 Rates (5 minute weighted average) : Total Rx (bps) : 1449296 Total Tx (bps) : 1504528 Unicast Rx (Pkts/sec) : 0 Unicast Tx (Pkts/sec) : 0 B/Mcast Rx (Pkts/sec) : 0 B/Mcast Tx (Pkts/sec) : 0 Utilization Rx : 00.04 % Utilization Tx : 00.04 % -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 15:20:07 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF1BD1065674 for ; Tue, 8 Jun 2010 15:20:07 +0000 (UTC) (envelope-from brad@duttonbros.com) Received: from uno.mnl.com (uno.mnl.com [64.221.209.136]) by mx1.freebsd.org (Postfix) with ESMTP id A05CA8FC15 for ; Tue, 8 Jun 2010 15:20:07 +0000 (UTC) Received: from uno.mnl.com (localhost [127.0.0.1]) by uno.mnl.com (Postfix) with ESMTP id 1DD501F04; Tue, 8 Jun 2010 08:20:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=duttonbros.com; h= message-id:date:from:to:cc:subject:references:in-reply-to :mime-version:content-type:content-transfer-encoding; s=mail; bh=UK/HH0x79TxQQwm/y3vO9uNunDs=; b=YR1uMMY3wzuG9aItq9LtESzI7f5G aQJ8oENrDeY1flS9yb2t/XKVJEqanll4GGX3D/L2sLmHEZKdfxe1Rhfac1x12Eo/ P7NVimFDL/pf90i1e/MFaQP+lHdR6IDU7xBnr/xqlgYjHqsolqtBr1c57qkBkIPV NNU0VIPSJyLiM3I= DomainKey-Signature: a=rsa-sha1; c=nofws; d=duttonbros.com; h=message-id :date:from:to:cc:subject:references:in-reply-to:mime-version :content-type:content-transfer-encoding; q=dns; s=mail; b=V6d4Ac LlJtYifoPgABSStCMcijEGy40Pe5QJiol24gfej3IVSSUUH1otIsWjik520jTlcB sk1smdsYvBXcFqlwdpJvAwZjD4BPnjLIvd71S1zNCpoJvY0hBs5AKzOGDDvPkknd zUgtN1+pWvv/1dVtSxN3GVoZhtMW5ruff46Ro= Received: from localhost (localhost [127.0.0.1]) by uno.mnl.com (Postfix) with ESMTP id 0739E1F03; Tue, 8 Jun 2010 08:20:07 -0700 (PDT) Received: from c-98-210-178-102.hsd1.ca.comcast.net (c-98-210-178-102.hsd1.ca.comcast.net [98.210.178.102]) by duttonbros.com (Horde Framework) with HTTP; Tue, 08 Jun 2010 08:20:06 -0700 Message-ID: <20100608082006.5006764hokcpvzqe@duttonbros.com> Date: Tue, 08 Jun 2010 08:20:06 -0700 From: "Bradley W. Dutton" To: Jeremy Chadwick References: <20100607154256.941428ovaq2hha0g@duttonbros.com> <20100607173218.11716iopp083dbpu@duttonbros.com> <20100608044707.GA78147@icarus.home.lan> In-Reply-To: <20100608044707.GA78147@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.7) / FreeBSD-8.1 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS performance of various vdevs (long post) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 15:20:07 -0000 Quoting Jeremy Chadwick : >> On Mon, Jun 07, 2010 at 05:32:18PM -0700, Bradley W. Dutton wrote: >> I know it's pretty simple but for checking throughput I thought it >> would be ok. I don't have compression on and based on the drive >> lights and gstat, the drives definitely aren't idle. > > Try disabling prefetch (you have it enabled) and try setting > vfs.zfs.txg.timeout="5". Some people have reported a "sweet spot" with > regards to the last parameter (needing to be adjusted if your disks are > extremely fast, etc.), as otherwise ZFS would be extremely "bursty" in > its I/O (stalling/deadlocking the system at set intervals). By > decreasing the value you essentially do disk writes more regularly (with > less data), and depending upon the load and controller, this may even > out performance. I tested some of these settings. With the timeout set to 5 not much changed write wise. 
(keep in mind these results are the Nvidia/WDRE2 combo): With txg=5 and prefetch disabled I saw read speeds go down considerably: # normal/jbod txg=5 no prefetch zpool create bench /dev/ad4 /dev/ad6 /dev/ad10 /dev/ad12 /dev/ad14 dd if=/bench/test.file of=/dev/null bs=1m 12582912000 bytes transferred in 59.330330 secs (212082286 bytes/sec) compared to 12582912000 bytes transferred in 34.668165 secs (362952928 bytes/sec) zpool create bench raidz /dev/ad4 /dev/ad6 /dev/ad10 /dev/ad12 /dev/ad14 dd if=/bench/test.file of=/dev/null bs=1m 12582912000 bytes transferred in 71.135696 secs (176886046 bytes/sec) compared to 12582912000 bytes transferred in 45.825533 secs (274582993 bytes/sec) Running the same tests on the raidz2 Supermicro/Hitachi setup didn't yield any difference in writes, the reads were slower: zpool create tank raidz2 /dev/da0 /dev/da1 /dev/da2 /dev/da3 /dev/da4 /dev/da5 /dev/da6 /dev/da7 dd if=/tank/test.file of=/dev/null bs=1m 12582912000 bytes transferred in 44.118409 secs (285207745 bytes/sec) compared to 12582912000 bytes transferred in 32.911291 secs (382328118 bytes/sec) I rebooted and reran these numbers just to make sure they were consistent. >> >The higher CPU usage might be due to the device driver or the >> >interface card being used. >> >> Definitely a plausible explanation. If this was the case would the 8 >> parallel dd processes exhibit the same behavior? or is the type of >> IO affecting how much CPU the driver is using? > > It would be the latter. > > Also, I believe this Supermicro controller has been discussed in the > past. I can't remember if people had outright failures/issues with it > or if people were complaining about sub-par performance. I could also > be remembering a different Supermicro controller. > > If I had to make a recommendation, it would be to reproduce the same > setup on a system using an Intel ICH9/ICH9R or ICH10/ICH10R controller > in AHCI mode (with ahci.ko loaded, not ataahci.ko) and see if things > improve. But start with the loader.conf tunables I mentioned above -- > segregate each test. > > I would also recommend you re-run your tests with a different blocksize > for dd. I don't know why people keep using 1m (Linux websites?). Test > the following increments: 4k, 8k, 16k, 32k, 64k, 128k, 256k. That's > about where you should stop. I tested with 8, 16, 32, 64, 128, 1m and the results all looked similar. As such I stuck with bs=1m because it's easier to change count. > Otherwise, consider installing ports/benchmarks/bonnie++ and try that. > That will also get you concurrent I/O tests, I believe. I may give this a shot but I'm most interested in less concurrency as I have larger files with only a couple of readers/writers. As Bob noted a bunch of mirrors in the pool would definitely be faster for concurrent IO. 
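For reference, the block-size sweep described above is easy to script; a minimal sketch, assuming the same /bench/test.file scratch file used in the tests in this thread:

---snip---
#!/bin/sh
# Re-read the test file at each block size and let dd report throughput.
for bs in 8k 16k 32k 64k 128k 1m; do
        echo "bs=${bs}:"
        dd if=/bench/test.file of=/dev/null bs=${bs}
done
---snip---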
Thanks for the help, Brad From owner-freebsd-fs@FreeBSD.ORG Tue Jun 8 23:39:28 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A1A801065670 for ; Tue, 8 Jun 2010 23:39:28 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 5749A8FC1F for ; Tue, 8 Jun 2010 23:39:27 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEADdxDkyDaFvK/2dsb2JhbACeRnG/ZoUWBA X-IronPort-AV: E=Sophos;i="4.53,387,1272859200"; d="scan'208";a="80004278" Received: from fraser.cs.uoguelph.ca ([131.104.91.202]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 08 Jun 2010 19:39:25 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id 526E8109C2C3; Tue, 8 Jun 2010 19:39:27 -0400 (EDT) X-Virus-Scanned: amavisd-new at fraser.cs.uoguelph.ca Received: from fraser.cs.uoguelph.ca ([127.0.0.1]) by localhost (fraser.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id s1hizAzPQcVt; Tue, 8 Jun 2010 19:39:26 -0400 (EDT) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id D485B109C24A; Tue, 8 Jun 2010 19:39:26 -0400 (EDT) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o58NtWP09898; Tue, 8 Jun 2010 19:55:32 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Tue, 8 Jun 2010 19:55:32 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Anders Nordby In-Reply-To: <20100608083649.GA77452@fupp.net> Message-ID: References: <20100608083649.GA77452@fupp.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Jun 2010 23:39:28 -0000 On Tue, 8 Jun 2010, Anders Nordby wrote: > Hi! > > I have a file server running 8.1-PRERELEASE amd64, where I share some > filesystems using NFS and Samba. After running for a day or two, the > server starts to get around 25% packet loss, browsing directories across > NFS gets really slow etc. Rebooting solves it until it happens again. > Has anyone experienced anything similar? I had this issue in FreeBSD 7 > as well, upgrading did not help. > Well, here's a few things you might try. (I know nothing about ZFS, except what I see discussed on the mailing lists.) - "netstat -m" will show you mbuf allocations. Might give you a hint w.r.t. mbuf/mbuf cluster exhaustion. - I'd try setting zio_use_uma = 0, since there have been reports of issues related to ZFS using the uma allocator and mbuf allocation uses the uma allocator now, too. (I think this is fairly recent, so might not be relevant to FreeBSD7.) - You can try the experimental NFS server to see if that affects the behaviour. ("-e" option on both mountd and nfsd) - If you have some different network hardware, you could try a different net interface. This would isolate the problem, if it happens to be related to the network device driver for the hardware you have. 
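A sketch of how the second and third suggestions might be wired up; the vfs.zfs.zio.use_uma spelling of the zio_use_uma tunable is an assumption, and the rc.conf flags shown simply add -e to typical defaults:

---snip---
# /boot/loader.conf (sketch; assumed tunable name, takes effect on reboot)
vfs.zfs.zio.use_uma="0"

# /etc/rc.conf (sketch; switch mountd and nfsd to the experimental server)
mountd_flags="-e -r"
nfs_server_flags="-e -u -t -n 4"
---snip---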
There are lots of email messages in the archive related to tuning the arc for zfs. I know nothing about it, but I'd look for a message that describes what the current recommendations are for amd64 w.r.t. this. Hopefully others can suggest other things to check. It smells like some sort of resource exhaustion problem, but who knows??? Good luck with it, rick From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 02:44:52 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6DD3C1065675 for ; Wed, 9 Jun 2010 02:44:52 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id 2D49F8FC18 for ; Wed, 9 Jun 2010 02:44:51 +0000 (UTC) Received: by email.octopus.com.au (Postfix, from userid 1002) id E24795CB93A; Wed, 9 Jun 2010 12:37:47 +1000 (EST) X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on email.octopus.com.au X-Spam-Level: **** X-Spam-Status: No, score=4.4 required=10.0 tests=ALL_TRUSTED, DNS_FROM_OPENWHOIS,FH_DATE_PAST_20XX autolearn=no version=3.2.3 Received: from [10.1.50.144] (142.19.96.58.static.exetel.com.au [58.96.19.142]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id E64B25CB94B; Wed, 9 Jun 2010 12:37:43 +1000 (EST) Message-ID: <4C0F0017.5000002@modulus.org> Date: Wed, 09 Jun 2010 12:44:39 +1000 From: Andrew Snow User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: "Bradley W. Dutton" , freebsd-fs@freebsd.org References: <20100607154256.941428ovaq2hha0g@duttonbros.com> <20100607173218.11716iopp083dbpu@duttonbros.com> <20100608044707.GA78147@icarus.home.lan> <20100608082006.5006764hokcpvzqe@duttonbros.com> In-Reply-To: <20100608082006.5006764hokcpvzqe@duttonbros.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: ZFS performance of various vdevs (long post) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 02:44:52 -0000 Under OpenSolaris, the LSI-based controllers seem to use interrupt coalescing, and you can even tweak the settings via lsiutil. When you disable it, things go a lot slower. I suspect this is the reason for the difference in sequential transfer rates versus FreeBSD.
- Andrew From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 08:26:18 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87D4A106567D; Wed, 9 Jun 2010 08:26:18 +0000 (UTC) (envelope-from az@FreeBSD.org) Received: from freefall.freebsd.org (unknown [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6037A8FC23; Wed, 9 Jun 2010 08:26:18 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o598QI5G026982; Wed, 9 Jun 2010 08:26:18 GMT (envelope-from az@freefall.freebsd.org) Received: (from az@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o598QFYq026978; Wed, 9 Jun 2010 08:26:15 GMT (envelope-from az) Date: Wed, 9 Jun 2010 08:26:15 GMT Message-Id: <201006090826.o598QFYq026978@freefall.freebsd.org> To: andrey.zverev@electro-com.ru, az@FreeBSD.org, freebsd-fs@FreeBSD.org From: az@FreeBSD.org Cc: Subject: Re: kern/130979: [smbfs] [panic] boot/kernel/smbfs.ko X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 08:26:18 -0000 Synopsis: [smbfs] [panic] boot/kernel/smbfs.ko State-Changed-From-To: open->closed State-Changed-By: az State-Changed-When: Wed Jun 9 08:26:15 UTC 2010 State-Changed-Why: not occurs anymore http://www.freebsd.org/cgi/query-pr.cgi?pr=130979 From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 12:25:19 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E20341065670 for ; Wed, 9 Jun 2010 12:25:19 +0000 (UTC) (envelope-from anders@FreeBSD.org) Received: from fupp.net (totem.fix.no [80.91.36.20]) by mx1.freebsd.org (Postfix) with ESMTP id 2CAC18FC13 for ; Wed, 9 Jun 2010 12:25:18 +0000 (UTC) Received: from localhost (totem.fix.no [80.91.36.20]) by fupp.net (Postfix) with ESMTP id A691C47321; Wed, 9 Jun 2010 14:25:17 +0200 (CEST) Received: from fupp.net ([80.91.36.20]) by localhost (totem.fix.no [80.91.36.20]) (amavisd-new, port 10024) with LMTP id 7URuBGWSvEX8; Wed, 9 Jun 2010 14:25:17 +0200 (CEST) Received: by fupp.net (Postfix, from userid 1000) id 295B447320; Wed, 9 Jun 2010 14:25:17 +0200 (CEST) Date: Wed, 9 Jun 2010 14:25:17 +0200 From: Anders Nordby To: Rick Macklem Message-ID: <20100609122517.GA16231@fupp.net> References: <20100608083649.GA77452@fupp.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-PGP-Key: http://anders.fix.no/pgp/ X-PGP-Key-FingerPrint: 1E0F C53C D8DF 6A8F EAAD 19C5 D12A BC9F 0083 5956 Cc: freebsd-fs@FreeBSD.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 12:25:20 -0000 Hi, On Tue, Jun 08, 2010 at 07:55:32PM -0400, Rick Macklem wrote: > Well, here's a few things you might try. (I know nothing about ZFS, > except what I see discussed on the mailing lists.) > > - "netstat -m" will show you mbuf allocations. Might give you a hint > w.r.t. mbuf/mbuf cluster exhaustion. 
> - I'd try setting zio_use_uma = 0, since there have been reports of > issues related to ZFS using the uma allocator and mbuf allocation > uses the uma allocator now, too. (I think this is fairly recent, so > might not be relevant to FreeBSD7.) > - You can try the experimental NFS server to see if that affects the > behaviour. ("-e" option on both mountd and nfsd) > - If you have some different network hardware, you could try a different > net interface. This would isolate the problem, if it happens to be > related to the network device driver for the hardware you have. > > There are lots of email messages in the archive related to tuning the > arc for zfs. I know nothing about it, but I'd look for a message that > describes what the current recommendations are for amd64 w.r.t. this. > > Hopefully others can suggest other things to check. It smells like some > sort of resource exhaustion problem, but who knows??? Thanks. The only thing that (temporarily) solves this issue so far is rebooting, which helps only for a day or so. I have tried different NICs, replacing the physical server, replacing cables, changing and resetting switch ports. But it did not help, so I think this is a software problem. I will try zio_use_uma = 0 I think, and then try to limit vfs.zfs.arc_max to 100 MB or so. On the ZFS+NFS server while having these issues: root@unixfile:~# netstat -m 1293/4602/5895 mbufs in use (current/cache/total) 1109/3619/4728/65536 mbuf clusters in use (current/cache/total/max) 257/1023 mbuf+clusters out of packet secondary zone in use (current/cache) 0/104/104/12800 4k (page size) jumbo clusters in use (current/cache/total/max) 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) 2541K/8804K/11345K bytes allocated to network (current/cache/total) 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0/0/0 sfbufs in use (current/peak/max) 0 requests for sfbufs denied 0 requests for sfbufs delayed 0 requests for I/O initiated by sendfile 0 calls to protocol drain routines Packet loss seen from my workstation: anders@noname:~$ ping unixfile PING unixfile.aftenposten.no (192.168.120.33) 56(84) bytes of data. 
64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=1 ttl=63 time=0 .230 ms 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=3 ttl=63 time=0 .262 ms 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=5 ttl=63 time=0 .272 ms 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=6 ttl=63 time=0 .203 ms 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=7 ttl=63 time=0 .306 ms 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=9 ttl=63 time=0 .309 ms ^C --- unixfile.aftenposten.no ping statistics --- 10 packets transmitted, 6 received, 40% packet loss, time 9017ms rtt min/avg/max/mdev = 0.203/0.263/0.309/0.042 ms Here is also vmstat -z from the server: ITEM SIZE LIMIT USED FREE REQUESTS FAILURES UMA Kegs: 208, 0, 175, 12, 175, 0 UMA Zones: 320, 0, 175, 5, 175, 0 UMA Slabs: 568, 0, 20339, 7535, 162600, 0 UMA RCntSlabs: 568, 0, 2468, 3, 2468, 0 UMA Hash: 256, 0, 5, 85, 81, 0 16 Bucket: 152, 0, 558, 292, 1115, 0 32 Bucket: 280, 0, 269, 25, 491, 0 64 Bucket: 536, 0, 254, 5, 391, 17 128 Bucket: 1048, 0, 3598, 47, 6823, 914 VM OBJECT: 216, 0, 47009, 9529, 2668554, 0 MAP: 232, 0, 7, 25, 7, 0 KMAP ENTRY: 120, 119815, 4881, 606, 376403, 0 MAP ENTRY: 120, 0, 1797, 683, 4683855, 0 DP fakepg: 120, 0, 0, 0, 0, 0 SG fakepg: 120, 0, 0, 0, 0, 0 mt_zone: 2056, 0, 196, 3, 196, 0 16: 16, 0, 14932, 7916, 3237030, 0 32: 32, 0, 2438, 1703, 2411143, 0 64: 64, 0, 32128, 18216, 93399160, 0 128: 128, 0, 28706, 55075, 12071701, 0 256: 256, 0, 3831, 7104, 58010086, 0 512: 512, 0, 1753, 578, 32140172, 0 1024: 1024, 0, 93, 123, 201330, 0 2048: 2048, 0, 529, 375, 36122797, 0 4096: 4096, 0, 253, 184, 185892, 0 Files: 80, 0, 424, 386, 1078416, 0 TURNSTILE: 136, 0, 297, 63, 297, 0 umtx pi: 96, 0, 0, 0, 0, 0 MAC labels: 40, 0, 0, 0, 0, 0 PROC: 1120, 0, 66, 114, 107003, 0 THREAD: 984, 0, 267, 29, 294, 0 SLEEPQUEUE: 80, 0, 297, 80, 297, 0 VMSPACE: 392, 0, 45, 155, 107030, 0 cpuset: 72, 0, 2, 98, 2, 0 audit_record: 952, 0, 0, 0, 0, 0 mbuf_packet: 256, 0, 259, 1021, 34278617, 0 mbuf: 256, 0, 1025, 3590, 131614064, 0 mbuf_cluster: 2048, 65536, 2278, 2450, 16615870, 0 mbuf_jumbo_page: 4096, 12800, 0, 104, 153927, 0 mbuf_jumbo_9k: 9216, 6400, 0, 0, 0, 0 mbuf_jumbo_16k: 16384, 3200, 0, 0, 0, 0 mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0 g_bio: 232, 0, 0, 8544, 1690094, 0 ttyinq: 160, 0, 135, 81, 300, 0 ttyoutq: 256, 0, 72, 48, 160, 0 ata_request: 320, 0, 0, 24, 1, 0 ata_composite: 336, 0, 0, 0, 0, 0 VNODE: 472, 0, 69327, 4057, 12604560, 0 VNODEPOLL: 112, 0, 0, 0, 0, 0 S VFS Cache: 108, 0, 70366, 6821, 12297146, 0 L VFS Cache: 328, 0, 179, 25369, 544759, 0 NAMEI: 1024, 0, 0, 96, 18824297, 0 NFSMOUNT: 616, 0, 0, 0, 0, 0 NFSNODE: 656, 0, 0, 0, 0, 0 DIRHASH: 1024, 0, 1147, 37, 1147, 0 pipe: 728, 0, 19, 86, 85332, 0 ksiginfo: 112, 0, 166, 890, 4901, 0 itimer: 344, 0, 0, 22, 1, 0 KNOTE: 128, 0, 0, 145, 622, 0 socket: 680, 131076, 53, 79, 20777, 0 unpcb: 240, 131072, 10, 182, 6269, 0 ipq: 56, 2079, 0, 189, 159, 0 udp_inpcb: 336, 131076, 11, 66, 5487, 0 udpcb: 16, 131208, 11, 661, 5487, 0 tcp_inpcb: 336, 131076, 32, 111, 9019, 0 tcpcb: 880, 131072, 32, 96, 9019, 0 tcptw: 72, 26250, 0, 200, 51, 0 syncache: 144, 15366, 0, 130, 8229, 0 hostcache: 136, 15372, 8, 132, 61, 0 tcpreass: 40, 4116, 3, 501, 662733, 0 sackhole: 32, 0, 0, 202, 11, 0 ripcb: 336, 131076, 0, 22, 1, 0 rtentry: 200, 0, 4, 34, 4, 0 selfd: 56, 0, 262, 683, 704729, 0 SWAPMETA: 288, 116519, 0, 0, 0, 0 ip4flow: 56, 99351, 16, 551, 11254, 0 ip6flow: 80, 99360, 0, 0, 0, 0 Mountpoints: 752, 0, 5, 20, 5, 0 FFS 
inode: 168, 0, 43495, 25739, 526228, 0 FFS1 dinode: 128, 0, 0, 0, 0, 0 FFS2 dinode: 256, 0, 43495, 25610, 526228, 0 taskq_zone: 56, 0, 0, 819, 299535, 0 zio_cache: 776, 0, 0, 2830, 7902766, 0 zio_buf_512: 512, 0, 73281, 39083, 2179139, 0 zio_data_buf_512: 512, 0, 41, 260, 86233, 0 zio_buf_1024: 1024, 0, 64, 624, 31885, 0 zio_data_buf_1024: 1024, 0, 33, 815, 14631, 0 zio_buf_1536: 1536, 0, 15, 161, 6621, 0 zio_data_buf_1536: 1536, 0, 9, 179, 666, 0 zio_buf_2048: 2048, 0, 10, 352, 13371, 0 zio_data_buf_2048: 2048, 0, 4, 82, 518, 0 zio_buf_2560: 2560, 0, 6, 76, 4631, 0 zio_data_buf_2560: 2560, 0, 8, 79, 751, 0 zio_buf_3072: 3072, 0, 3, 146, 8829, 0 zio_data_buf_3072: 3072, 0, 4, 107, 1160, 0 zio_buf_3584: 3584, 0, 5, 273, 22944, 0 zio_data_buf_3584: 3584, 0, 5, 82, 418, 0 zio_buf_4096: 4096, 0, 10, 192, 21812, 0 zio_data_buf_4096: 4096, 0, 7, 141, 1628, 0 zio_buf_5120: 5120, 0, 2, 236, 49783, 0 zio_data_buf_5120: 5120, 0, 14, 366, 2686, 0 zio_buf_6144: 6144, 0, 3, 127, 26343, 0 zio_data_buf_6144: 6144, 0, 20, 629, 1944, 0 zio_buf_7168: 7168, 0, 3, 85, 7341, 0 zio_data_buf_7168: 7168, 0, 31, 690, 2953, 0 zio_buf_8192: 8192, 0, 5, 98, 6653, 0 zio_data_buf_8192: 8192, 0, 47, 712, 3562, 0 zio_buf_10240: 10240, 0, 10, 109, 5628, 0 zio_data_buf_10240: 10240, 0, 80, 846, 5494, 0 zio_buf_12288: 12288, 0, 9, 81, 2704, 0 zio_data_buf_12288: 12288, 0, 59, 972, 4714, 0 zio_buf_14336: 14336, 0, 0, 293, 79024, 0 zio_data_buf_14336: 14336, 0, 64, 770, 5474, 0 zio_buf_16384: 16384, 0, 3409, 613, 42927, 0 zio_data_buf_16384: 16384, 0, 53, 615, 36196, 0 zio_buf_20480: 20480, 0, 0, 72, 1000, 0 zio_data_buf_20480: 20480, 0, 50, 761, 5383, 0 zio_buf_24576: 24576, 0, 3, 42, 702, 0 zio_data_buf_24576: 24576, 0, 24, 312, 3207, 0 zio_buf_28672: 28672, 0, 1, 54, 784, 0 zio_data_buf_28672: 28672, 0, 10, 157, 1538, 0 zio_buf_32768: 32768, 0, 0, 61, 1079, 0 zio_data_buf_32768: 32768, 0, 8, 129, 22324, 0 zio_buf_36864: 36864, 0, 3, 71, 486, 0 zio_data_buf_36864: 36864, 0, 11, 92, 1506, 0 zio_buf_40960: 40960, 0, 1, 53, 324, 0 zio_data_buf_40960: 40960, 0, 7, 58, 728, 0 zio_buf_45056: 45056, 0, 1, 43, 319, 0 zio_data_buf_45056: 45056, 0, 3, 55, 530, 0 zio_buf_49152: 49152, 0, 0, 65, 1224, 0 zio_data_buf_49152: 49152, 0, 1, 140, 17837, 0 zio_buf_53248: 53248, 0, 0, 53, 364, 0 zio_data_buf_53248: 53248, 0, 0, 54, 349, 0 zio_buf_57344: 57344, 0, 2, 52, 381, 0 zio_data_buf_57344: 57344, 0, 6, 97, 2164, 0 zio_buf_61440: 61440, 0, 0, 44, 267, 0 zio_data_buf_61440: 61440, 0, 1, 50, 594, 0 zio_buf_65536: 65536, 0, 172, 92, 41829, 0 zio_data_buf_65536: 65536, 0, 0, 119, 14319, 0 zio_buf_69632: 69632, 0, 0, 35, 194, 0 zio_data_buf_69632: 69632, 0, 0, 38, 195, 0 zio_buf_73728: 73728, 0, 0, 44, 525, 0 zio_data_buf_73728: 73728, 0, 3, 75, 718, 0 zio_buf_77824: 77824, 0, 0, 58, 462, 0 zio_data_buf_77824: 77824, 0, 6, 74, 557, 0 zio_buf_81920: 81920, 0, 1, 53, 422, 0 zio_data_buf_81920: 81920, 0, 0, 118, 12825, 0 zio_buf_86016: 86016, 0, 1, 34, 308, 0 zio_data_buf_86016: 86016, 0, 5, 50, 957, 0 zio_buf_90112: 90112, 0, 1, 48, 481, 0 zio_data_buf_90112: 90112, 0, 1, 29, 44, 0 zio_buf_94208: 94208, 0, 0, 49, 1036, 0 zio_data_buf_94208: 94208, 0, 0, 57, 177, 0 zio_buf_98304: 98304, 0, 0, 44, 348, 0 zio_data_buf_98304: 98304, 0, 0, 112, 12362, 0 zio_buf_102400: 102400, 0, 0, 58, 388, 0 zio_data_buf_102400: 102400, 0, 0, 20, 45, 0 zio_buf_106496: 106496, 0, 1, 35, 477, 0 zio_data_buf_106496: 106496, 0, 1, 57, 482, 0 zio_buf_110592: 110592, 0, 1, 72, 884, 0 zio_data_buf_110592: 110592, 0, 0, 71, 930, 0 zio_buf_114688: 114688, 0, 0, 61, 656, 0 
zio_data_buf_114688: 114688, 0, 1, 146, 10626, 0 zio_buf_118784: 118784, 0, 0, 67, 532, 0 zio_data_buf_118784: 118784, 0, 0, 10, 29, 0 zio_buf_122880: 122880, 0, 1, 86, 1444, 0 zio_data_buf_122880: 122880, 0, 0, 50, 176, 0 zio_buf_126976: 126976, 0, 1, 59, 1029, 0 zio_data_buf_126976: 126976, 0, 0, 42, 325, 0 zio_buf_131072: 131072, 0, 0, 717, 119915, 0 zio_data_buf_131072: 131072, 0, 474, 981, 214146, 0 dmu_buf_impl_t: 224, 0, 77939, 46739, 2664713, 0 dnode_t: 776, 0, 73767, 45043, 2094869, 0 arc_buf_hdr_t: 208, 0, 27195, 24519, 605620, 0 arc_buf_t: 72, 0, 4901, 14949, 677129, 0 zil_lwb_cache: 200, 0, 2, 1233, 118944, 0 zfs_znode_cache: 376, 0, 25805, 4005, 12077350, 0 Regards, -- Anders. From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 13:35:23 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58A531065670; Wed, 9 Jun 2010 13:35:23 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-yw0-f182.google.com (mail-yw0-f182.google.com [209.85.211.182]) by mx1.freebsd.org (Postfix) with ESMTP id DD23F8FC26; Wed, 9 Jun 2010 13:35:22 +0000 (UTC) Received: by ywh12 with SMTP id 12so4550306ywh.14 for ; Wed, 09 Jun 2010 06:35:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=lCNZqEZFgdHyY52qMl5pRpfjTgPbsAnOLT3D80M4zC0=; b=KeLRRAfd1QlKzP3QjpXT3dYARH9yyNZV9on6/noUAeLm1+v1C/1Yg5H1EnmPK8dYFr 6Iea5lB27lEC47QBR92LejqAkZRvtpMP8nqvnUis9aqXEC9qahNQjk4ZZchjLle9jSax y6ieD7KP0RhVnaAidaz/IZ+1kDla4Lhweu8ZE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=axTqZuVIhEJawz4DpYhFl5aS9p/kn9oWObFs7KeXllbfNXsMVAWDFrPhsEcG1uZ5ni 78/hDd9zkh4WOadE+YC661Vsc6ocIUrQb9G/xAWIXoCPTk6uPXolN+KTV86Ltl5yxe6m U4JO12TmHDnZntpdPFG4NzJs8I/1Tb1MIOxgo= MIME-Version: 1.0 Received: by 10.229.223.201 with SMTP id il9mr5906267qcb.89.1276090521180; Wed, 09 Jun 2010 06:35:21 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.229.183.213 with HTTP; Wed, 9 Jun 2010 06:35:21 -0700 (PDT) In-Reply-To: <20100605190659.GA3369@a91-153-117-195.elisa-laajakaista.fi> References: <20100603143501.GA3176@a91-153-117-195.elisa-laajakaista.fi> <20100605190659.GA3369@a91-153-117-195.elisa-laajakaista.fi> Date: Wed, 9 Jun 2010 15:35:21 +0200 X-Google-Sender-Auth: FtESSOhMuEGNWUGUcLzN1Fx-kpY Message-ID: From: Attilio Rao To: Jaakko Heinonen Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, kib@freebsd.org Subject: Re: syncer vnode leak because of nmount() race X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 13:35:23 -0000 2010/6/5 Jaakko Heinonen : > > Thank you for the reply. > > On 2010-06-04, Attilio Rao wrote: >> I think that, luckilly, it is not a very common condition to have the >> mount still in flight and get updates... :) > > Agreed, but mountd(8) increases chances because it does an update mount > for all local file systems when it receives SIGHUP. 
> >> However, I think that the real breakage here is that the check on >> mnt->mnt_syncer is done lockless and it is unsafe. > >> I also found this bug when rewriting the syncer, and I resolved it by >> using a separate flag (in my case it was simpler and more >> beneficial actually for some other reasons, but you may do the same >> thing with a mnt_kern_flag entry). > > OK, I will take a look at this approach. > >> Additionally, note that vfs_busy() here is not intended to protect >> against such situations but against unmount. >> >> > PS. The vfs_unbusy(9) manual page is out of date after r184554 and IMO >> > the vfs_busy(9) manual page is misleading because it talks about >> > synchronizing access to a mount point. >> >> Could you be more precise about what is misleading, please? > > As you wrote above, it protects only against unmount. At least I got the > feeling that it does more than that when I read this: "The purpose of > this function is to synchronize access to a mount point. It also delays > unmounting by sleeping on mp if the MNTK_UNMOUNT flag is set in > mp->mnt_kern_flag and the LK_NOWAIT flag is not set.". > > I did some updates for the manual pages: > > http://people.freebsd.org/~jh/patches/vfs_busy-vfs_unbusy.diff That patch is fine. I'd just avoid mentioning the mnt_lockref field name; just use the generic word 'refcount'. Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 14:26:40 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C46F8106567A for ; Wed, 9 Jun 2010 14:26:40 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id 70F308FC1A for ; Wed, 9 Jun 2010 14:26:40 +0000 (UTC) Received: from outgoing.leidinger.net (pD954FA9A.dip.t-dialin.net [217.84.250.154]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 976DB84400A for ; Wed, 9 Jun 2010 16:26:36 +0200 (CEST) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 8BC035143 for ; Wed, 9 Jun 2010 16:26:30 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1276093590; bh=3xXXXXyAjxubPmzOSt0BIzdRKaxNyf0qQLuGo44Mels=; h=Message-ID:Date:From:To:Subject:MIME-Version:Content-Type: Content-Transfer-Encoding; b=MLVA6KFMWKK+AF1hISUKUP70DzPjl6y24Syk0SwcxHtVhtETSOzJ1tmqiUf69Iln/ JNIGkGiOSv2l5mUPtO8wP7E5MFTelW8ct5HYOxnAfO/h61rCwWhr7w1+I6GC1mjdqj gAzg9E8JNvLJSoOUsHlvnV5JUYAwj2wqnTeTNn/LW24ZCZyMBHzgt4jTjFulZaJ8oh LfKI7L6nckBFea/MyvvWudaTp4j6tbNgyVZBoi5YkIHGgTEqN+bSdPfI5MGTj9pyS1 gestUg5R+A+flF/SFT/NFbsLREmL0Dbh0kNeQkjbNe4eTPhD2J/FaS2Ed0bCmRvbzL /8PwmzwPLrcxQ== Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o59EQSD0013312 for fs@freebsd.org; Wed, 9 Jun 2010 16:26:28 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Wed, 09 Jun 2010 16:26:27 +0200 Message-ID: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> Date: Wed, 09 Jun 2010 16:26:27 +0200 From: Alexander Leidinger To: fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed"
Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 976DB84400A.A77FA X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-1.023, required 6, autolearn=disabled, ALL_TRUSTED -1.00, DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10, TW_ZF 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1276698397.30206@6t8LK8M5dwh/ciFgKfrgmQ X-EBL-Spam-Status: No Cc: Subject: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 14:26:40 -0000 Hi, I noticed that we do not have an automatism to scrub a ZFS pool periodically. Is there interest in something like this, or shall I keep it local? Here's the main part of the monthly periodic script I quickly created: ---snip--- case "$monthly_scrub_zfs_enable" in [Yy][Ee][Ss]) echo echo 'Scrubbing of zfs pools:' if [ -z "${monthly_scrub_zfs_pools}" ]; then monthly_scrub_zfs_pools="$(zpool list -H -o name)" fi for pool in ${monthly_scrub_zfs_pools}; do # successful only if there is at least one pool to scrub rc=0 echo " starting scrubbing of pool '${pool}'" zpool scrub ${pool} echo " consult 'zpool status ${pool}' for the result" echo " or wait for the daily_status_zfs mail, if enabled" done ;; ---snip--- Bye, Alexander. -- Fuch's Warning: If you actually look like your passport photo, you aren't well enough to travel. http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 15:10:30 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 347D3106564A for ; Wed, 9 Jun 2010 15:10:30 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id 8EC9F8FC2B for ; Wed, 9 Jun 2010 15:10:29 +0000 (UTC) Received: from mail.cicely.de ([10.1.1.37]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id o59EhwES040479 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 9 Jun 2010 16:43:58 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by mail.cicely.de (8.14.3/8.14.3) with ESMTP id o59EhtwG005367 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 9 Jun 2010 16:43:55 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id o59EhtYe074772; Wed, 9 Jun 2010 16:43:55 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id o59EhtIr074771; Wed, 9 Jun 2010 16:43:55 +0200 (CEST) (envelope-from ticso) Date: Wed, 9 Jun 2010 16:43:55 +0200 From: Bernd Walter To: Alexander Leidinger Message-ID: <20100609144355.GL72453@cicely7.cicely.de> References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED=-1, BAYES_00=-1.9, T_RP_MATCHES_RCVD=-0.01 autolearn=ham version=3.3.0 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on spamd.cicely.de Cc: fs@freebsd.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 15:10:30 -0000 On Wed, Jun 09, 2010 at 04:26:27PM +0200, Alexander Leidinger wrote: > Hi, > > I noticed that we do not have an automatism to scrub a ZFS pool > periodically. Is there interest in something like this, or shall I > keep it local? For me, scrubbing takes several days even though my pool is not especially big, and starting another scrub restarts everything. You should at least check whether another one is still running. I think resilvering is also a collision case to check for. > Here's the main part of the monthly periodic script I quickly created: > ---snip--- > case "$monthly_scrub_zfs_enable" in > [Yy][Ee][Ss]) > echo > echo 'Scrubbing of zfs pools:' > > if [ -z "${monthly_scrub_zfs_pools}" ]; then > monthly_scrub_zfs_pools="$(zpool list -H -o name)" > fi > > for pool in ${monthly_scrub_zfs_pools}; do > # successful only if there is at least one pool to scrub > rc=0 > > echo " starting scrubbing of pool '${pool}'" > zpool scrub ${pool} > echo " consult 'zpool status ${pool}' for the result" > echo " or wait for the daily_status_zfs mail, if > enabled" > done > ;; > ---snip--- -- B.Walter http://www.bwct.de Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD machines, and more.
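A minimal sketch of the guard Bernd asks for, assuming the "scrub in progress" / "resilver in progress" strings that zpool status prints on this vintage of ZFS; it skips the pool rather than restarting a scrub that is already underway:

---snip---
#!/bin/sh
# scrub_guard.sh <pool> -- start a scrub only if the pool is idle
pool="$1"
if zpool status "${pool}" | grep -qE '(scrub|resilver) in progress'; then
        echo "pool '${pool}' is already scrubbing or resilvering, skipping"
else
        zpool scrub "${pool}"
fi
---snip---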
From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 15:12:48 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 10A21106566B; Wed, 9 Jun 2010 15:12:48 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 9C48B8FC12; Wed, 9 Jun 2010 15:12:47 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEAPJLD0yDaFvK/2dsb2JhbACeS3G+HIUYBA X-IronPort-AV: E=Sophos;i="4.53,391,1272859200"; d="scan'208";a="79387669" Received: from fraser.cs.uoguelph.ca ([131.104.91.202]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 09 Jun 2010 11:12:44 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id C0744109C2C9; Wed, 9 Jun 2010 11:12:45 -0400 (EDT) X-Virus-Scanned: amavisd-new at fraser.cs.uoguelph.ca Received: from fraser.cs.uoguelph.ca ([127.0.0.1]) by localhost (fraser.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id n5r7430M-ZgK; Wed, 9 Jun 2010 11:12:45 -0400 (EDT) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id 04749109C327; Wed, 9 Jun 2010 11:12:45 -0400 (EDT) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o59FSqn27257; Wed, 9 Jun 2010 11:28:52 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Wed, 9 Jun 2010 11:28:52 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Anders Nordby In-Reply-To: <20100609122517.GA16231@fupp.net> Message-ID: References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 15:12:48 -0000 On Wed, 9 Jun 2010, Anders Nordby wrote: > > Thanks. The only thing that (temporarily) solves this issue so far is > rebooting, which helps only for a day or so. I have tried different > NICs, replacing the physical server, replacing cables, changing and > resetting switch ports. But it did not help, so I think this is a > software problem. I will try zio_use_uma = 0 I think, and then try to > limit vfs.zfs.arc_max to 100 MB or so. > When you tried a different NIC, was it a different type (i.e., a different chipset that uses a different device driver)? I suggested that not because I thought the hardware was broken, but because I thought it might be related to the network interface's device driver, and switching to a different device driver would isolate that possibility.
> On the ZFS+NFS server while having these issues: > > root@unixfile:~# netstat -m > 1293/4602/5895 mbufs in use (current/cache/total) > 1109/3619/4728/65536 mbuf clusters in use (current/cache/total/max) > 257/1023 mbuf+clusters out of packet secondary zone in use > (current/cache) > 0/104/104/12800 4k (page size) jumbo clusters in use > (current/cache/total/max) > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > 2541K/8804K/11345K bytes allocated to network (current/cache/total) > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0/0/0 sfbufs in use (current/peak/max) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 0 requests for I/O initiated by sendfile > 0 calls to protocol drain routines > > Packet loss seen from my workstation: > > anders@noname:~$ ping unixfile > PING unixfile.aftenposten.no (192.168.120.33) 56(84) bytes of data. > 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=1 > ttl=63 time=0 > .230 ms > 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=3 > ttl=63 time=0 > .262 ms > 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=5 > ttl=63 time=0 > .272 ms > 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=6 > ttl=63 time=0 > .203 ms > 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=7 > ttl=63 time=0 > .306 ms > 64 bytes from unixfile.aftenposten.no (192.168.120.33): icmp_seq=9 > ttl=63 time=0 > .309 ms Well, it doesn't seem to be mbuf exhaustion (I don't know what "out of packet secondary zone" means, I'll have to look at that) and if it doesn't handle pings it seems really hosed. Have you done a "vmstat 5" + "ps axlH" (or similar) to try and see what it's doing? ("top" and "netstat" might also help?) If you can figure out where it's spinning its wheels, that might at least give us a hint w.r.t. the problem. 
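A small sketch of capturing those snapshots while the packet loss is happening (the directory and file names are arbitrary):

---snip---
#!/bin/sh
# Dump a few diagnostics into a timestamped directory for later comparison.
d="/var/tmp/diag.$(date +%Y%m%d-%H%M%S)"
mkdir -p "${d}"
vmstat 5 5 > "${d}/vmstat.txt" 2>&1
ps axlH > "${d}/ps.txt" 2>&1
netstat -m > "${d}/netstat-m.txt" 2>&1
top -b -d 1 > "${d}/top.txt" 2>&1
---snip---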
Good luck with it, rick From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 15:23:15 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1FEF31065672 for ; Wed, 9 Jun 2010 15:23:15 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id C25378FC19 for ; Wed, 9 Jun 2010 15:23:14 +0000 (UTC) Received: by vws1 with SMTP id 1so1303981vws.13 for ; Wed, 09 Jun 2010 08:23:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:openpgp:content-type:content-transfer-encoding; bh=svHz9HjkzfVMAY2N38DZeqY4A5nlEmdWy//f2aHo5jE=; b=bXGj9gInM60USS1xy/0Ltf2IyjwuKpv/fGaCqXlarkR11Ab23whtpX35bBYpto5b6O lG0TgaFJy5CrKfK2capWutsUW790nqes/8BE015Hm4sC0XWTJAhZ31n/Mnkz5xwi5yA4 pR7+R21bu49mtrUZPyaXwoQLKlAkTS6HMVil0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:openpgp:content-type :content-transfer-encoding; b=F4KeFOIqllciuu28djKHrqIY+CyBgmz2rQA1Lzj/U0zJd2e4NF0nM625mTj4p1yDdT hahRdEbUpjejDhqHD8Zst0CdkGSfVCKN9vlpaJA/T35GmDYKrrF3zHt3wyGaD9MYSU/Y pjNaPc6ELXnqL9XY3T1Lrdyedu2mhAOkEMWzQ= Received: by 10.224.64.76 with SMTP id d12mr2572727qai.208.1276096992919; Wed, 09 Jun 2010 08:23:12 -0700 (PDT) Received: from centel.dataix.local (adsl-99-181-128-180.dsl.klmzmi.sbcglobal.net [99.181.128.180]) by mx.google.com with ESMTPS id m29sm9513928qck.16.2010.06.09.08.23.11 (version=SSLv3 cipher=RC4-MD5); Wed, 09 Jun 2010 08:23:12 -0700 (PDT) Sender: "J. Hellenthal" Message-ID: <4C0FB1DE.9080508@dataix.net> Date: Wed, 09 Jun 2010 11:23:10 -0400 From: jhell User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.9) Gecko/20100515 Thunderbird MIME-Version: 1.0 To: Alexander Leidinger References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <4C0FAE2A.7050103@dataix.net> In-Reply-To: <4C0FAE2A.7050103@dataix.net> X-Enigmail-Version: 1.0.1 OpenPGP: id=89D8547E Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: fs@freebsd.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 15:23:15 -0000 On 06/09/2010 11:07, jhell wrote: > On 06/09/2010 10:26, Alexander Leidinger wrote: >> Hi, >> >> I noticed that we do not have an automatism to scrub a ZFS pool >> periodically. Is there interest in something like this, or shall I keep >> it local? 
>> Here's the main part of the monthly periodic script I quickly created: >> ---snip--- >> case "$monthly_scrub_zfs_enable" in >> [Yy][Ee][Ss]) >> echo >> echo 'Scrubbing of zfs pools:' >> >> if [ -z "${monthly_scrub_zfs_pools}" ]; then >> monthly_scrub_zfs_pools="$(zpool list -H -o name)" >> fi >> >> for pool in ${monthly_scrub_zfs_pools}; do >> # successful only if there is at least one pool to scrub >> rc=0 >> >> echo " starting scrubbing of pool '${pool}'" >> zpool scrub ${pool} >> echo " consult 'zpool status ${pool}' for the result" >> echo " or wait for the daily_status_zfs mail, if >> enabled" >> done >> ;; >> ---snip--- >> >> Bye, >> Alexander. >> > > Please add a check to see if any resilvering is being done on the pool > that the scrub is being executed on (just in case); I would hope that > the scrub would fail silently in this case. > > Please also check whether a scrub is already running on one of the pools, > and if so, and another pool exists, start a background loop to wait for the > first scrub to finish, or die silently. > > I had a scrub fully restart from calling scrub a second time after being > more than 50% complete; it's frustrating. > > > Thanks!, > I should probably suggest one check that comes to mind. zpool history ${pool} | grep scrub | tail -1 | cut -f1 -d. Then compare the output with today's date to make sure today is >= 30 days from the date of the last scrub. With the above this could be turned into a daily_zfs_scrub_enable with a default daily_zfs_scrub_threshold="30", ensuring that if one check is missed it will not take another 30 days to run the check again. Food for thought. Thanks!, -- jhell From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 15:31:19 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E23B1065673 for ; Wed, 9 Jun 2010 15:31:19 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 09AD18FC12 for ; Wed, 9 Jun 2010 15:31:18 +0000 (UTC) Received: by vws1 with SMTP id 1so1314192vws.13 for ; Wed, 09 Jun 2010 08:31:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:openpgp:content-type:content-transfer-encoding; bh=rw4sqFwf1+DasB+/2YgKSYCW/vvQGWmyymVOqNtTR6s=; b=GQ9IXn1xtQeJk0Hn5vYXN6+jdWrGh+dDX2AoQmrR+homPOp6Xg/TzM9DWpyFQZd3fq VhBo4san6crvbzpgsksiCGUniUedbbTqmDvaNC/c/H4j7YurK5K7F7VNYYqJcEyZDlsV PNzjLixQfJmFgbi6TEK8cl8ywwJ122fjhQ3Y8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:openpgp:content-type :content-transfer-encoding; b=b0UsKjTTc18Bd7hqgrpOjxyXNR2N76ANRNrFWaVIpboM+FdXcRnmorRpEIIKccX2aw oTfDYizMzSt0jGvGDUEgLFXLQbq16hSda5RbwhJwO2sqZ6zCdhat5LwZZeEmuBlmKkiD A2BMeoXMI+iLxc/L3nqKQG4hhjvDF4M7DOJ0Q= Received: by 10.224.26.154 with SMTP id e26mr2616905qac.247.1276096046379; Wed, 09 Jun 2010 08:07:26 -0700 (PDT) Received: from centel.dataix.local (adsl-99-181-128-180.dsl.klmzmi.sbcglobal.net [99.181.128.180]) by mx.google.com with ESMTPS id i10sm9462951qcb.23.2010.06.09.08.07.23 (version=SSLv3 cipher=RC4-MD5); Wed, 09 Jun 2010 08:07:24 -0700 (PDT) Sender: "J.
Hellenthal" Message-ID: <4C0FAE2A.7050103@dataix.net> Date: Wed, 09 Jun 2010 11:07:22 -0400 From: jhell User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.9) Gecko/20100515 Thunderbird MIME-Version: 1.0 To: Alexander Leidinger References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> In-Reply-To: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> X-Enigmail-Version: 1.0.1 OpenPGP: id=89D8547E Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: fs@freebsd.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 15:31:19 -0000 On 06/09/2010 10:26, Alexander Leidinger wrote: > Hi, > > I noticed that we do not have an automatism to scrub a ZFS pool > periodically. Is there interest in something like this, or shall I keep > it local? > > Here's the main part of the monthly periodic script I quickly created: > ---snip--- > case "$monthly_scrub_zfs_enable" in > [Yy][Ee][Ss]) > echo > echo 'Scrubbing of zfs pools:' > > if [ -z "${monthly_scrub_zfs_pools}" ]; then > monthly_scrub_zfs_pools="$(zpool list -H -o name)" > fi > > for pool in ${monthly_scrub_zfs_pools}; do > # successful only if there is at least one pool to scrub > rc=0 > > echo " starting scrubbing of pool '${pool}'" > zpool scrub ${pool} > echo " consult 'zpool status ${pool}' for the result" > echo " or wait for the daily_status_zfs mail, if > enabled" > done > ;; > ---snip--- > > Bye, > Alexander. > Please add a check to see if any resilerving is being done on the pool that the scub is being executed on. (Just in case), I would hope that the scrub would fail silently in this case. Please also check whether a scrub is already running on one of the pools and if so & another pool exists start a background loop to wait for the first scrub to finish or die silently. I had a scrub fully restart from calling scrub a second time after being more than 50% complete, its frustrating. 
Thanks!, -- jhell From owner-freebsd-fs@FreeBSD.ORG Wed Jun 9 23:22:17 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 67D12106566C for ; Wed, 9 Jun 2010 23:22:17 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 1E6748FC17 for ; Wed, 9 Jun 2010 23:22:16 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlcHADO/D0yDaFvI/2dsb2JhbACSSAEBjBJxv1KFGAQ X-IronPort-AV: E=Sophos;i="4.53,395,1272859200"; d="scan'208";a="80144053" Received: from darling.cs.uoguelph.ca ([131.104.91.200]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 09 Jun 2010 19:22:14 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by darling.cs.uoguelph.ca (Postfix) with ESMTP id 0FC0E940138 for ; Wed, 9 Jun 2010 19:22:16 -0400 (EDT) X-Virus-Scanned: amavisd-new at darling.cs.uoguelph.ca Received: from darling.cs.uoguelph.ca ([127.0.0.1]) by localhost (darling.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zMehGE+SBbU3 for ; Wed, 9 Jun 2010 19:22:15 -0400 (EDT) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by darling.cs.uoguelph.ca (Postfix) with ESMTP id 1B5D29400E6 for ; Wed, 9 Jun 2010 19:22:15 -0400 (EDT) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o59NcOC23052 for ; Wed, 9 Jun 2010 19:38:24 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Wed, 9 Jun 2010 19:38:24 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: freebsd-fs@freebsd.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Subject: Testers: NFSv3 support for pxeboot for nfs diskless root X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2010 23:22:17 -0000 I put 3 patches (you need to apply them all) here: http://people.freebsd.org/~rmacklem/nfsdiskless-patches/ They convert lib/libstand/nfs.c and pxeboot to use NFSv3 instead of NFSv2 (unless built with OLD_NFSV2 defined). Initial test reports have been good. 
(one has it working ok and the other has a problem in an area not related to the patches, it appears) So, if others are interested in testing these, it would be appreciated, rick From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 08:17:16 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C2421065672 for ; Thu, 10 Jun 2010 08:17:16 +0000 (UTC) (envelope-from peterjeremy@acm.org) Received: from mail10.syd.optusnet.com.au (mail10.syd.optusnet.com.au [211.29.132.191]) by mx1.freebsd.org (Postfix) with ESMTP id A86A08FC16 for ; Thu, 10 Jun 2010 08:17:14 +0000 (UTC) Received: from server.vk2pj.dyndns.org (c211-30-160-13.mirnd2.nsw.optusnet.com.au [211.30.160.13] (may be forged)) by mail10.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id o5A8HBce013897 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 10 Jun 2010 18:17:12 +1000 X-Bogosity: Ham, spamicity=0.000000 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.4/8.14.4) with ESMTP id o5A8HAgm064684; Thu, 10 Jun 2010 18:17:10 +1000 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.4/8.14.4/Submit) id o5A8HAFU064683; Thu, 10 Jun 2010 18:17:10 +1000 (EST) (envelope-from peter) Date: Thu, 10 Jun 2010 18:17:10 +1000 From: Peter Jeremy To: Anders Nordby Message-ID: <20100610081710.GA64350@server.vk2pj.dyndns.org> References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="82I3+IH0IqGh5yIs" Content-Disposition: inline In-Reply-To: <20100609122517.GA16231@fupp.net> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@FreeBSD.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 08:17:16 -0000

On 2010-Jun-09 14:25:17 +0200, Anders Nordby wrote:
>Thanks. The only thing that (temporarily) solves this issue so far is
>rebooting, which helps only for a day or so. I have tried different
>NICs, replacing the physical server, replacing cables, changing and
>resetting switch ports. But it did not help, so I think this is a
>software problem. I will try zio_use_uma = 0 I think, and then try to
>limit vfs.zfs.arc_max to 100 MB or so.

I wonder if your system is running out of free RAM. How would you like to monitor "inactive", "cache" and "free" from either "systat -v" or "vmstat -s" whilst the problem is occurring.

Does something like

perl -e '$x = "x" x 10000000;'

temporarily correct the problem?
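If watching the curses display is awkward, the same three counters can also be polled directly; a minimal sketch, assuming the stock vm.stats sysctls (note they report page counts, not bytes):

---snip---
# print inactive/cache/free page counts every 5 seconds
while :; do
    sysctl -n vm.stats.vm.v_inactive_count \
        vm.stats.vm.v_cache_count \
        vm.stats.vm.v_free_count
    sleep 5
done
---snip---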
-- 
Peter Jeremy

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 09:23:53 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1FEE2106564A for ; Thu, 10 Jun 2010 09:23:53 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id BE34B8FC17 for ; Thu, 10 Jun 2010 09:23:52 +0000 (UTC) Received: from outgoing.leidinger.net (pD954FE15.dip.t-dialin.net [217.84.254.21]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id D938484400A; Thu, 10 Jun 2010 11:23:48 +0200 (CEST) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 0C919510B; Thu, 10 Jun 2010 11:23:46 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1276161826; bh=Dcxo6zVx6ibwiiWk7SE1m5A5gJR3uG18MmLUn7x7DGM=; h=Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To: MIME-Version:Content-Type:Content-Transfer-Encoding; b=HG/1+XO8bHWMMmYgcTpiCBQO1GfHtreM/wGKsZ9whWluJQqvSKxbHENo26HjKMpQ5 SN7j7nZ8mlCfaNE3Uc6koloroMrfHw8XaCYsqfJffq60gGpS6rltyXEeH9CmdUeQue UIsRlVOlbnxbDIO1RkqDU8aK+MalM5JhDoUJeIK0IKNKi9y0e8vJTZPH9Se2pI2SaS Pm1MwHUK+ILG9uMzxCRHoKku34i/uAz9K8S5RhwrDKr//9ztJ/05LrITgFgnHrVojU Wd2IN7gjkTGnqM+ITCWHg2XOg19OU5vgNzTW0lb7aXGAZEPHp5slsJ+7KTmXNaJmUr onAraupNhwVeQ== Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o5A9Njiw090741; Thu, 10 Jun 2010 11:23:45 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Thu, 10 Jun 2010 11:23:45 +0200 Message-ID: <20100610112345.644960lrau3mxfk0@webmail.leidinger.net> Date: Thu, 10 Jun 2010 11:23:45 +0200 From: Alexander Leidinger To: ticso@cicely.de References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <20100609144355.GL72453@cicely7.cicely.de> In-Reply-To: <20100609144355.GL72453@cicely7.cicely.de> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: D938484400A.A6C5E X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-0.423, required 6, autolearn=disabled, ALL_TRUSTED -1.00, DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10, J_CHICKENPOX_53 0.60, TW_ZF 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1276766630.67689@y3jyMiMwQ2yXop2mseZ7gg X-EBL-Spam-Status: No Cc: fs@FreeBSD.org Subject: Re: Do we want a periodic script for a zfs scrub?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 09:23:53 -0000 Quoting Bernd Walter (from Wed, 9 Jun 2010 16:43:55 +0200): > On Wed, Jun 09, 2010 at 04:26:27PM +0200, Alexander Leidinger wrote: >> Hi, >> >> I noticed that we do not have an automatism to scrub a ZFS pool >> periodically. Is there interest in something like this, or shall I >> keep it local? > > For me scrub'ing takes several days without having a special big > pool size and starting another scrub restarts everything. > You should at least check if another one is still running. Good point, I will have a look at this... But I'm a little bit surprised, when I scrub a pool of 3 times 250 GB disks in RAIDZ configuration, it is finished fast (a fraction of a day... maybe an hour or two). Initially it displays a very long time (>400 hours), but this is reducing after a while drastically. The pool is filled up to 3/4 of the entire capacity. > I think resilvering is also a collision case to check for. No. Resilvering has higher priority than a scrub. From the man-page: ---snip--- If a resilver is in progress, ZFS does not allow a scrub to be started until the resilver completes. ---snip--- Bye, Alexander. -- Fairy Tale, n.: A horror story to prepare children for the newspapers. http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 09:27:17 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AB3C3106567C for ; Thu, 10 Jun 2010 09:27:17 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id 55A378FC1B for ; Thu, 10 Jun 2010 09:27:17 +0000 (UTC) Received: from outgoing.leidinger.net (pD954FE15.dip.t-dialin.net [217.84.254.21]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 3FE2584400A; Thu, 10 Jun 2010 11:27:14 +0200 (CEST) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 684E1510C; Thu, 10 Jun 2010 11:27:11 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1276162031; bh=kFfA/Kv+EvB0dZDuJIUymozwCEMUx49yqdzcGWPdi2I=; h=Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To: MIME-Version:Content-Type:Content-Transfer-Encoding; b=NzjKK/B1LT7NGS/rjUc/gIQg5ce6hLwb/Gvf8JqXMNVZVhJuLQ8N+/JeQITNw56IJ JipT2RVHiA2al+m3jsZmVl57S9HY4HZWDiEJTrJrW7KoE+ipFKQtWjmlH/W2hBRbcn WGUwdOmJkmeYVF76rn7bKVk+edMY9AENfVq28XaOefnM7MAdPVq0nWIKyUiiia2lhl 5EHN2gT6DhFOX4vvr61JrcOL8Qj52Y9d4cltwb0hU+MTp5HNKvd7bY3WvBIBAcv6iO JIfGr27csMKq+kha4iPODfmdQpauJyh9XqM+iDZOUXWj1fY9yEfo099HCKnvMmy7pI Q/cYRn1ln15wQ== Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o5A9RADE091502; Thu, 10 Jun 2010 11:27:10 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Thu, 10 Jun 2010 11:27:10 +0200 Message-ID: <20100610112710.20215zznvaqdai88@webmail.leidinger.net> Date: Thu, 10 Jun 2010 11:27:10 +0200 From: Alexander Leidinger To: jhell References: 
<20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <4C0FAE2A.7050103@dataix.net> In-Reply-To: <4C0FAE2A.7050103@dataix.net> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 3FE2584400A.A5FD7 X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-1.023, required 6, autolearn=disabled, ALL_TRUSTED -1.00, DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10, TW_ZF 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1276766835.33222@eRKmZy5hXQLo6fdhk/1uog X-EBL-Spam-Status: No Cc: fs@freebsd.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 09:27:17 -0000

Quoting jhell (from Wed, 09 Jun 2010 11:07:22 -0400):

> Please add a check to see if any resilvering is being done on the pool
> that the scrub is being executed on. (Just in case), I would hope that
> the scrub would fail silently in this case.

It does. No need to check for the resilvering.

> Please also check whether a scrub is already running on one of the pools,
> and if so & another pool exists, start a background loop to wait for the
> first scrub to finish or die silently.

I do not want a background job running forever in the periodic script. If a scrub is in progress, it should print the fact and do nothing (a second scrub right after the running one finishes is superfluous). I will have a look at this.

Bye,
Alexander.

-- 
You look tired.
http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 09:32:49 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A591B1065673 for ; Thu, 10 Jun 2010 09:32:49 +0000 (UTC) (envelope-from admin@kkip.pl) Received: from mainframe.kkip.pl (kkip.pl [87.105.164.78]) by mx1.freebsd.org (Postfix) with ESMTP id 15E898FC16 for ; Thu, 10 Jun 2010 09:32:48 +0000 (UTC) Received: from static-78-8-144-74.ssp.dialog.net.pl ([78.8.144.74] helo=[192.168.0.2]) by mainframe.kkip.pl with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.71 (FreeBSD)) (envelope-from ) id 1OMe7s-000Kho-9n for freebsd-fs@freebsd.org; Thu, 10 Jun 2010 11:32:47 +0200 Message-ID: <4C10B136.3030404@kkip.pl> Date: Thu, 10 Jun 2010 11:32:38 +0200 From: Bartosz Stec User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.9) Gecko/20100406 Shredder/3.0.4 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> In-Reply-To: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Authenticated-User: admin@kkip.pl X-Authenticator: plain X-Sender-Verify: SUCCEEDED (sender exists & accepts mail) X-Spam-Score: -8.1 X-Spam-Score-Int: -80 X-Exim-Version: 4.71 (build at 02-Feb-2010 20:10:28) X-Date: 2010-06-10 11:32:47 X-Connected-IP: 78.8.144.74:63299 X-Message-Linecount: 58 X-Body-Linecount: 46 X-Message-Size: 1943 X-Body-Size: 1400 X-Received-Count: 1 X-Recipient-Count: 1 X-Local-Recipient-Count: 1 X-Local-Recipient-Defer-Count: 0 X-Local-Recipient-Fail-Count: 0 Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 09:32:49 -0000

On 2010-06-09 16:26, Alexander Leidinger wrote:
> Hi,
>
> I noticed that we do not have an automatism to scrub a ZFS pool
> periodically. Is there interest in something like this, or shall I
> keep it local?
>
> Here's the main part of the monthly periodic script I quickly created:
> ---snip---
> case "$monthly_scrub_zfs_enable" in
> [Yy][Ee][Ss])
>     echo
>     echo 'Scrubbing of zfs pools:'
>
>     if [ -z "${monthly_scrub_zfs_pools}" ]; then
>         monthly_scrub_zfs_pools="$(zpool list -H -o name)"
>     fi
>
>     for pool in ${monthly_scrub_zfs_pools}; do
>         # successful only if there is at least one pool to scrub
>         rc=0
>
>         echo "   starting scrubbing of pool '${pool}'"
>         zpool scrub ${pool}
>         echo "   consult 'zpool status ${pool}' for the result"
>         echo "   or wait for the daily_status_zfs mail, if enabled"
>     done
>     ;;
> ---snip---
>
> Bye,
> Alexander.

Ross-at-neces-dot-com already did what you're searching for. I've been using his periodic scripts for some months now; check here: http://www.neces.com/blog/technology/integrating-freebsd-zfs-and-periodic-snapshots-and-scrubs. They do all the necessary stuff, like checking for a scrub in progress, too. Hope you'll find them helpful.
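Whichever variant is used, enabling it comes down to a couple of periodic.conf knobs; for the script quoted above, the (hypothetical) settings would be:

---snip---
# /etc/periodic.conf
monthly_scrub_zfs_enable="YES"
# optional: limit scrubbing to specific pools; empty means all pools.
# "tank" and "backup" are example pool names, not defaults.
monthly_scrub_zfs_pools="tank backup"
---snip---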
Cheers :) -- Bartosz Stec From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 09:41:17 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 353B9106566B for ; Thu, 10 Jun 2010 09:41:17 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta02.westchester.pa.mail.comcast.net (qmta02.westchester.pa.mail.comcast.net [76.96.62.24]) by mx1.freebsd.org (Postfix) with ESMTP id EBCFA8FC13 for ; Thu, 10 Jun 2010 09:41:16 +0000 (UTC) Received: from omta05.westchester.pa.mail.comcast.net ([76.96.62.43]) by qmta02.westchester.pa.mail.comcast.net with comcast id U9Na1e0050vyq2s529U1M7; Thu, 10 Jun 2010 09:28:01 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta05.westchester.pa.mail.comcast.net with comcast id U9Tz1e0053S48mS3R9U0h2; Thu, 10 Jun 2010 09:28:01 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6C8BA9B418; Thu, 10 Jun 2010 02:27:58 -0700 (PDT) Date: Thu, 10 Jun 2010 02:27:58 -0700 From: Jeremy Chadwick To: Alexander Leidinger Message-ID: <20100610092758.GA67752@icarus.home.lan> References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <20100609144355.GL72453@cicely7.cicely.de> <20100610112345.644960lrau3mxfk0@webmail.leidinger.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100610112345.644960lrau3mxfk0@webmail.leidinger.net> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: ticso@cicely.de, fs@FreeBSD.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 09:41:17 -0000 On Thu, Jun 10, 2010 at 11:23:45AM +0200, Alexander Leidinger wrote: > But I'm a little bit surprised, when I scrub a pool of 3 times 250 > GB disks in RAIDZ configuration, it is finished fast (a fraction of > a day... maybe an hour or two). Initially it displays a very long > time (>400 hours), but this is reducing after a while drastically. For what it's worth, Solaris does the exact same thing (initially shows a very long duration, which keeps getting longer, but then reduces after some time and begins catching up quickly). It didn't originally behave this way (on FreeBSD or Solaris) so there's probably a justified reason for it. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 09:53:32 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DF6E01065677 for ; Thu, 10 Jun 2010 09:53:32 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id 8756A8FC1C for ; Thu, 10 Jun 2010 09:53:32 +0000 (UTC) Received: from outgoing.leidinger.net (pD954FE15.dip.t-dialin.net [217.84.254.21]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 2B0D284400A; Thu, 10 Jun 2010 11:53:27 +0200 (CEST) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 4ED045110; Thu, 10 Jun 2010 11:53:24 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1276163604; bh=NtdeI0CCaxYciry4aAL6cPV4BSDW65K80po0wn2HWok=; h=Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To: MIME-Version:Content-Type:Content-Transfer-Encoding; b=XnN6P8OFET/smuwvY9i/+sYNptWyC2gll0vdxVWubLVHDvMFM6lAARw8Ekpx3ReDd SinQCyNxtHknjL4+NRHz5fysrqkd+OIZwE87ZDCbVQhMQL689uPFpOEwSwG1h8OAPi +wCYZVbT677YU/LbGJx2OgAqSN/3/ZFMYacbDjEgXblSO7y/TogCv2qGkFBpzLjUGx xpYioa0vk26yZhzCwFOP8JK4k7RkgJynmo4Vnui4c0rW9t8BpFplgBuX2kXGlzLhvP SbNxKFwit3p9AMg//rbdr8dKDnxe9Jwk8mVd9Q1lGH2vnnZiFKHlq8+vGS0dqZwK25 HenngjEHBMd0A== Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o5A9rO9F097574; Thu, 10 Jun 2010 11:53:24 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Thu, 10 Jun 2010 11:53:24 +0200 Message-ID: <20100610115324.10161biomkjndvy8@webmail.leidinger.net> Date: Thu, 10 Jun 2010 11:53:24 +0200 From: Alexander Leidinger To: jhell References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <4C0FAE2A.7050103@dataix.net> <4C0FB1DE.9080508@dataix.net> In-Reply-To: <4C0FB1DE.9080508@dataix.net> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 2B0D284400A.A5DAF X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-1.023, required 6, autolearn=disabled, ALL_TRUSTED -1.00, DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10, TW_ZF 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1276768410.15697@8r0q2ZXhaXHjW+sOx5mwbQ X-EBL-Spam-Status: No Cc: fs@freebsd.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 09:53:33 -0000 Quoting jhell (from Wed, 09 Jun 2010 11:23:10 -0400): > On 06/09/2010 11:07, jhell wrote: >> On 06/09/2010 10:26, Alexander Leidinger wrote: >>> Hi, >>> >>> I noticed that we do not have an automatism to scrub a ZFS pool >>> periodically. Is there interest in something like this, or shall I keep >>> it local? 
>>>
>>> Here's the main part of the monthly periodic script I quickly created:
>>> ---snip---
>>> case "$monthly_scrub_zfs_enable" in
>>> [Yy][Ee][Ss])
>>>     echo
>>>     echo 'Scrubbing of zfs pools:'
>>>
>>>     if [ -z "${monthly_scrub_zfs_pools}" ]; then
>>>         monthly_scrub_zfs_pools="$(zpool list -H -o name)"
>>>     fi
>>>
>>>     for pool in ${monthly_scrub_zfs_pools}; do
>>>         # successful only if there is at least one pool to scrub
>>>         rc=0
>>>
>>>         echo "   starting scrubbing of pool '${pool}'"
>>>         zpool scrub ${pool}
>>>         echo "   consult 'zpool status ${pool}' for the result"
>>>         echo "   or wait for the daily_status_zfs mail, if enabled"
>>>     done
>>>     ;;
>>> ---snip---
>>>
>>> Bye,
>>> Alexander.
>>
>> Please add a check to see if any resilvering is being done on the pool
>> that the scrub is being executed on. (Just in case), I would hope that
>> the scrub would fail silently in this case.
>>
>> Please also check whether a scrub is already running on one of the pools,
>> and if so & another pool exists, start a background loop to wait for the
>> first scrub to finish or die silently.
>>
>> I had a scrub fully restart from calling scrub a second time after being
>> more than 50% complete; it's frustrating.
>>
>> Thanks!,
>
> I should probably suggest one check that comes to mind:
>
> zpool history ${pool} | grep scrub | tail -1 | cut -f1 -d.
>
> Then compare the output with today's date to make sure today is >= 30
> days past the date of the last scrub.
>
> With the above, this could be turned into a daily_zfs_scrub_enable with a
> default daily_zfs_scrub_threshold="30", ensuring that if one check is
> missed it will not take another 30 days to run the check again.

Good idea! I even found a command line which does the calculation for the number of days between "now" and the last run (not taking a leap year into account, but an off-by-one day error here does not matter).

Bye,
Alexander.

-- 
"He's a businessman. I'll make him an offer he can't refuse."
-- Vito Corleone, "Chapter 1", page 39 http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 10:24:13 2010 Return-Path: Delivered-To: fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5FD3D1065676 for ; Thu, 10 Jun 2010 10:24:13 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id 05F768FC0A for ; Thu, 10 Jun 2010 10:24:12 +0000 (UTC) Received: from mail.cicely.de ([10.1.1.37]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id o5AAOAMJ004753 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 10 Jun 2010 12:24:10 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by mail.cicely.de (8.14.3/8.14.3) with ESMTP id o5AAO1AZ058643 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 10 Jun 2010 12:24:01 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id o5AAO0tx080480; Thu, 10 Jun 2010 12:24:00 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id o5AAO01W080479; Thu, 10 Jun 2010 12:24:00 +0200 (CEST) (envelope-from ticso) Date: Thu, 10 Jun 2010 12:24:00 +0200 From: Bernd Walter To: Alexander Leidinger Message-ID: <20100610102350.GP72453@cicely7.cicely.de> References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <20100609144355.GL72453@cicely7.cicely.de> <20100610112345.644960lrau3mxfk0@webmail.leidinger.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100610112345.644960lrau3mxfk0@webmail.leidinger.net> X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED=-1, BAYES_00=-1.9, T_RP_MATCHES_RCVD=-0.01 autolearn=ham version=3.3.0 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on spamd.cicely.de Cc: ticso@cicely.de, fs@FreeBSD.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 10:24:13 -0000 On Thu, Jun 10, 2010 at 11:23:45AM +0200, Alexander Leidinger wrote: > > Quoting Bernd Walter (from Wed, 9 Jun 2010 > 16:43:55 +0200): > > >On Wed, Jun 09, 2010 at 04:26:27PM +0200, Alexander Leidinger wrote: > >>Hi, > >> > >>I noticed that we do not have an automatism to scrub a ZFS pool > >>periodically. Is there interest in something like this, or shall I > >>keep it local? > > > >For me scrub'ing takes several days without having a special big > >pool size and starting another scrub restarts everything. > >You should at least check if another one is still running. > > Good point, I will have a look at this... > > But I'm a little bit surprised, when I scrub a pool of 3 times 250 GB > disks in RAIDZ configuration, it is finished fast (a fraction of a > day... maybe an hour or two). Initially it displays a very long time > (>400 hours), but this is reducing after a while drastically. 
> The pool is filled up to 3/4 of the entire capacity.

Well - my system is not idle during scrub and I don't have very fast disks either. My system runs with 2x 4x500G RAIDZ. Disks are consumer-grade SATA. Controllers are onboard Intel AHCI and SiI 3132. OS is 8.0RC1 (r198183), therefore I'm still using the ata driver.

That's at scrub start:

[115]cicely14# zpool status
  pool: data
 state: ONLINE
 scrub: scrub in progress for 0h0m, 0.00% done, 2275h55m to go
config:

        NAME             STATE   READ WRITE CKSUM
        data             ONLINE     0     0     0
          raidz1         ONLINE     0     0     0
            ad34         ONLINE     0     0     0
            ad12         ONLINE     0     0     0
            ad28         ONLINE     0     0     0
            ad26         ONLINE     0     0     0
          raidz1         ONLINE     0     0     0
            ad4          ONLINE     0     0     0
            ad6          ONLINE     0     0     0
            ad36         ONLINE     0     0     0
            ad10         ONLINE     0     0     0
        cache
          label/cache6   ONLINE     0     0     0
          label/cache7   ONLINE     0     0     0
          label/cache8   ONLINE     0     0     0
          label/cache9   ONLINE     0     0     0
          label/cache10  ONLINE     0     0     0

errors: No known data errors

ETA first increases:

[116]cicely14# zpool status
  pool: data
 state: ONLINE
 scrub: scrub in progress for 0h0m, 0.00% done, 2539h19m to go

Then gets smaller:

[117]cicely14# zpool status
  pool: data
 state: ONLINE
 scrub: scrub in progress for 0h1m, 0.00% done, 1551h38m to go

[120]cicely14# zpool status
  pool: data
 state: ONLINE
 scrub: scrub in progress for 0h2m, 0.00% done, 1182h20m to go

But it may get higher again:

[121]cicely14# zpool status
  pool: data
 state: ONLINE
 scrub: scrub in progress for 0h6m, 0.01% done, 1346h41m to go

I don't remember the time it took for the last scrub, but IIRC it took about 2-3 days, so the initial ETA is much higher than reality too.

-- 
B.Walter                http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD machines, and more.

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 10:29:22 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FFAD106566C for ; Thu, 10 Jun 2010 10:29:22 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta01.westchester.pa.mail.comcast.net (qmta01.westchester.pa.mail.comcast.net [76.96.62.16]) by mx1.freebsd.org (Postfix) with ESMTP id F15298FC0A for ; Thu, 10 Jun 2010 10:29:21 +0000 (UTC) Received: from omta08.westchester.pa.mail.comcast.net ([76.96.62.12]) by qmta01.westchester.pa.mail.comcast.net with comcast id U9wc1e0060Fqzac51AVMfS; Thu, 10 Jun 2010 10:29:21 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta08.westchester.pa.mail.comcast.net with comcast id UAVL1e0023S48mS3UAVLo5; Thu, 10 Jun 2010 10:29:21 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id CFDC99B418; Thu, 10 Jun 2010 03:29:18 -0700 (PDT) Date: Thu, 10 Jun 2010 03:29:18 -0700 From: Jeremy Chadwick To: ticso@cicely.de Message-ID: <20100610102918.GA69770@icarus.home.lan> References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <20100609144355.GL72453@cicely7.cicely.de> <20100610112345.644960lrau3mxfk0@webmail.leidinger.net> <20100610102350.GP72453@cicely7.cicely.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100610102350.GP72453@cicely7.cicely.de> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: Alexander Leidinger , fs@FreeBSD.org Subject: Re: Do we want a periodic script for a zfs scrub?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 10:29:22 -0000

On Thu, Jun 10, 2010 at 12:24:00PM +0200, Bernd Walter wrote:
> On Thu, Jun 10, 2010 at 11:23:45AM +0200, Alexander Leidinger wrote:
> >
> > Quoting Bernd Walter (from Wed, 9 Jun 2010
> > 16:43:55 +0200):
> >
> > >On Wed, Jun 09, 2010 at 04:26:27PM +0200, Alexander Leidinger wrote:
> > >>Hi,
> > >>
> > >>I noticed that we do not have an automatism to scrub a ZFS pool
> > >>periodically. Is there interest in something like this, or shall I
> > >>keep it local?
> > >
> > >For me scrub'ing takes several days without having a special big
> > >pool size and starting another scrub restarts everything.
> > >You should at least check if another one is still running.
> >
> > Good point, I will have a look at this...
> >
> > But I'm a little bit surprised, when I scrub a pool of 3 times 250 GB
> > disks in RAIDZ configuration, it is finished fast (a fraction of a
> > day... maybe an hour or two). Initially it displays a very long time
> > (>400 hours), but this is reducing after a while drastically. The pool
> > is filled up to 3/4 of the entire capacity.
>
> Well - my system is not idle during scrub and I don't have very
> fast disks either.
> My system runs with 2x 4x500G RAIDZ.
> Disks are consumer-grade SATA.
> Controllers are onboard Intel AHCI and SiI 3132.
> OS is 8.0RC1 (r198183), therefore I'm still using the ata driver.
>
> That's at scrub start:
> [115]cicely14# zpool status
>   pool: data
>  state: ONLINE
>  scrub: scrub in progress for 0h0m, 0.00% done, 2275h55m to go
> config:
>
>         NAME             STATE   READ WRITE CKSUM
>         data             ONLINE     0     0     0
>           raidz1         ONLINE     0     0     0
>             ad34         ONLINE     0     0     0
>             ad12         ONLINE     0     0     0
>             ad28         ONLINE     0     0     0
>             ad26         ONLINE     0     0     0
>           raidz1         ONLINE     0     0     0
>             ad4          ONLINE     0     0     0
>             ad6          ONLINE     0     0     0
>             ad36         ONLINE     0     0     0
>             ad10         ONLINE     0     0     0
>         cache
>           label/cache6   ONLINE     0     0     0
>           label/cache7   ONLINE     0     0     0
>           label/cache8   ONLINE     0     0     0
>           label/cache9   ONLINE     0     0     0
>           label/cache10  ONLINE     0     0     0
>
> errors: No known data errors
>
> ETA first increases:
> [116]cicely14# zpool status
>   pool: data
>  state: ONLINE
>  scrub: scrub in progress for 0h0m, 0.00% done, 2539h19m to go
>
> Then gets smaller:
> [117]cicely14# zpool status
>   pool: data
>  state: ONLINE
>  scrub: scrub in progress for 0h1m, 0.00% done, 1551h38m to go
>
> [120]cicely14# zpool status
>   pool: data
>  state: ONLINE
>  scrub: scrub in progress for 0h2m, 0.00% done, 1182h20m to go
>
> But it may get higher again:
> [121]cicely14# zpool status
>   pool: data
>  state: ONLINE
>  scrub: scrub in progress for 0h6m, 0.01% done, 1346h41m to go
>
> I don't remember the time it took for the last scrub, but IIRC
> it took about 2-3 days, so the initial ETA is much higher
> than reality too.

You're running an 8.0 release candidate. There have been some changes to scrubbing and other whatnots with ZFS between then and now. I'd recommend trying RELENG_8 and seeing if the behaviour remains. You don't have to use ahci.ko (you can stick with ataahci.ko).

By "behaviour" I'm referring to how long the scrub is taking. The variance you see in ETA is normal. You can blindly verify that things aren't stalled by using "zpool iostat" (there should be fairly intensive I/O).

-- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977.
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 11:06:13 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7343F1065672 for ; Thu, 10 Jun 2010 11:06:13 +0000 (UTC) (envelope-from anders@FreeBSD.org) Received: from fupp.net (totem.fix.no [80.91.36.20]) by mx1.freebsd.org (Postfix) with ESMTP id 2D6898FC16 for ; Thu, 10 Jun 2010 11:06:12 +0000 (UTC) Received: from localhost (totem.fix.no [80.91.36.20]) by fupp.net (Postfix) with ESMTP id D0188471D3; Thu, 10 Jun 2010 13:06:11 +0200 (CEST) Received: from fupp.net ([80.91.36.20]) by localhost (totem.fix.no [80.91.36.20]) (amavisd-new, port 10024) with LMTP id qtOxo--HCVD5; Thu, 10 Jun 2010 13:06:09 +0200 (CEST) Received: by fupp.net (Postfix, from userid 1000) id B89E2471D2; Thu, 10 Jun 2010 13:06:09 +0200 (CEST) Date: Thu, 10 Jun 2010 13:06:09 +0200 From: Anders Nordby To: Peter Jeremy Message-ID: <20100610110609.GA87243@fupp.net> References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> <20100610081710.GA64350@server.vk2pj.dyndns.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20100610081710.GA64350@server.vk2pj.dyndns.org> User-Agent: Mutt/1.4.2.3i X-PGP-Key: http://anders.fix.no/pgp/ X-PGP-Key-FingerPrint: 1E0F C53C D8DF 6A8F EAAD 19C5 D12A BC9F 0083 5956 Cc: freebsd-fs@FreeBSD.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 11:06:13 -0000 Hi, On Thu, Jun 10, 2010 at 06:17:10PM +1000, Peter Jeremy wrote: > I wonder if your system is running out of free RAM. How would you > like to monitor "inactive", "cache" and "free" from either "systat -v" > or "vmstat -s" whilst the problem is occurring. > > Does something like > perl -e '$x = "x" x 10000000;' > temporarily correct the problem? 
While the problem is happening:

root@unixfile:~# vmstat -s
511745441 cpu context switches
151635080 device interrupts
 14028218 software interrupts
 11549957 traps
974939023 system calls
       22 kernel threads created
    77512 fork() calls
     6097 vfork() calls
        0 rfork() calls
        0 swap pager pageins
        0 swap pager pages paged in
        0 swap pager pageouts
        0 swap pager pages paged out
      699 vnode pager pageins
     4777 vnode pager pages paged in
     2024 vnode pager pageouts
     2471 vnode pager pages paged out
        0 page daemon wakeups
        0 pages examined by the page daemon
      318 pages reactivated
  4738808 copy-on-write faults
     4957 copy-on-write optimized faults
  3843376 zero fill pages zeroed
        0 zero fill pages prezeroed
     2273 intransit blocking page faults
 11236873 total VM faults taken
        0 pages affected by kernel thread creation
 20699066 pages affected by fork()
  1707164 pages affected by vfork()
        0 pages affected by rfork()
      363 pages cached
 27229532 pages freed
        0 pages freed by daemon
  6618712 pages freed by exiting processes
     6054 pages active
    37307 pages inactive
       28 pages in VM cache
   261148 pages wired down
   456560 pages free
     4096 bytes per page
 43744208 total name lookups
          cache hits (19% pos + 1% neg) system 0% per-directory
          deletions 2%, falsehits 0%, toolong 0%

And from systat -v:

Disks   da0   da1 pass0 pass1    1045240 wire
KB/t   0.00  0.00  0.00  0.00      25240 act
tps       0     0     0     0     149344 inact
MB/s   0.00  0.00  0.00  0.00        112 cache
%busy     0     0     0     0    1824452 free
                                  323680 buf

> Does something like
> perl -e '$x = "x" x 10000000;'
> temporarily correct the problem?

No.

Regards,

-- 
Anders.

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 11:13:17 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 22FC3106564A for ; Thu, 10 Jun 2010 11:13:17 +0000 (UTC) (envelope-from anders@FreeBSD.org) Received: from fupp.net (totem.fix.no [80.91.36.20]) by mx1.freebsd.org (Postfix) with ESMTP id D0C508FC19 for ; Thu, 10 Jun 2010 11:13:16 +0000 (UTC) Received: from localhost (totem.fix.no [80.91.36.20]) by fupp.net (Postfix) with ESMTP id 58B484720E; Thu, 10 Jun 2010 13:13:16 +0200 (CEST) Received: from fupp.net ([80.91.36.20]) by localhost (totem.fix.no [80.91.36.20]) (amavisd-new, port 10024) with LMTP id jRzrMs13SDTp; Thu, 10 Jun 2010 13:13:16 +0200 (CEST) Received: by fupp.net (Postfix, from userid 1000) id 2C2934720D; Thu, 10 Jun 2010 13:13:16 +0200 (CEST) Date: Thu, 10 Jun 2010 13:13:16 +0200 From: Anders Nordby To: Rick Macklem Message-ID: <20100610111316.GB87243@fupp.net> References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-PGP-Key: http://anders.fix.no/pgp/ X-PGP-Key-FingerPrint: 1E0F C53C D8DF 6A8F EAAD 19C5 D12A BC9F 0083 5956 Cc: freebsd-fs@FreeBSD.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 11:13:17 -0000

Hi,

On Wed, Jun 09, 2010 at 11:28:52AM -0400, Rick Macklem wrote:
> When you tried a different NIC, was a different type (ie. different
> chipset that uses a different device driver)?
> I suggested that not because I thought the hardware was broken but
> because I thought it might be related to the network interface's device
> driver and switching to a different device driver would isolate that
> possibility.

Nope. I switched from NIC 1 to 2, and switched the server to an identical one. They both use bge NICs, a very common interface. I somehow doubt this is related to the NIC or driver; I have many machines with the same bge NIC (HP NC7782) that do not have any problems like this.

> Well, it doesn't seem to be mbuf exhaustion (I don't know what
> "out of packet secondary zone" means, I'll have to look at that) and
> if it doesn't handle pings it seems really hosed. Have you done a
> "vmstat 5" + "ps axlH" (or similar) to try and see what it's doing?
> ("top" and "netstat" might also help?)

root@unixfile:~# vmstat 5
 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 da1   in    sy   cs us sy id
 0 0 0    410M  1781M   279   0   0   0   481   0   0   0 1918 12338 6476  0  2 98
 0 0 0    410M  1781M     1   0   0   0     0   0   0   0  497    34 2268  0  1 99
 0 0 0    410M  1781M   123   0   0   0   116   0   0   0  455  1787 2071  0  0 99
 0 0 0    410M  1781M     0   0   0   0     4   0   0   0  292    38 1459  0  1 99
^C

root@unixfile:~# top -b 5
last pid: 86306;  load averages: 0.04, 0.13, 0.07  up 0+22:01:28  13:09:31
46 processes: 1 running, 45 sleeping

Mem: 25M Active, 147M Inact, 1021M Wired, 112K Cache, 316M Buf, 1780M Free
Swap: 6144M Total, 6144M Free

  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
  786 root         4  44    0  5804K  1276K rpcsvc  1  51:04  0.00% nfsd
  839 nagios       1  44    0 10880K  3228K select  0   0:04  0.00% nrpe2
  847 root         1  44    0 20852K  8356K select  0   0:04  0.00% perl5.10.1
 1076 root         1  44    0 11968K  4188K select  0   0:01  0.00% sendmail
81645 root         1  44    0 10220K  2920K wait    2   0:01  0.00% bash

The server doesn't have many connections, 16 in ESTABLISHED state. As you can see from top, the server has 1780 MB free memory.

Regards,

-- 
Anders.
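If it wedges again, a few more data points would help show where nfsd is blocked; a sketch using stock 8.x tools (double-check the flags on your system):

---snip---
netstat -m                   # mbuf/cluster usage; zone exhaustion shows up here
ps axlH | grep nfsd          # wait channels of the individual nfsd threads
procstat -kk $(pgrep nfsd)   # in-kernel stack traces of the nfsd threads
---snip---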
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 11:46:16 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C31141065676 for ; Thu, 10 Jun 2010 11:46:16 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta07.emeryville.ca.mail.comcast.net (qmta07.emeryville.ca.mail.comcast.net [76.96.30.64]) by mx1.freebsd.org (Postfix) with ESMTP id A830F8FC1A for ; Thu, 10 Jun 2010 11:46:15 +0000 (UTC) Received: from omta01.emeryville.ca.mail.comcast.net ([76.96.30.11]) by qmta07.emeryville.ca.mail.comcast.net with comcast id UBbX1e0040EPchoA7BmFsj; Thu, 10 Jun 2010 11:46:15 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta01.emeryville.ca.mail.comcast.net with comcast id UBmE1e00A3S48mS8MBmEYg; Thu, 10 Jun 2010 11:46:15 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 692D99B418; Thu, 10 Jun 2010 04:46:14 -0700 (PDT) Date: Thu, 10 Jun 2010 04:46:14 -0700 From: Jeremy Chadwick To: Anders Nordby Message-ID: <20100610114614.GA71432@icarus.home.lan> References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> <20100610111316.GB87243@fupp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100610111316.GB87243@fupp.net> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@FreeBSD.org Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 11:46:16 -0000

On Thu, Jun 10, 2010 at 01:13:16PM +0200, Anders Nordby wrote:
> On Wed, Jun 09, 2010 at 11:28:52AM -0400, Rick Macklem wrote:
> > When you tried a different NIC, was a different type (ie. different
> > chipset that uses a different device driver)? I suggested that not
> > because I thought the hardware was broken but because I thought it
> > might be related to the network interface's device driver and switching
> > to a different device driver would isolate that possibility.
>
> Nope. I switched from NIC 1 to 2, and switched the server to an identical
> one. They both use bge NICs, a very common interface. I somehow doubt
> this is related to the NIC or driver; I have many machines with the same
> bge NIC (HP NC7782) that do not have any problems like this.

This may not be the problem of course, but are they the *exact* same model and revision of NIC? pciconf -lvc on both boxes, and looking for the relevant bgeX interfaces, would determine that. I believe Rick was recommending you switch to another model of NIC that doesn't fall under the same driver, e.g. do you see this behaviour when using em(4)?

Also, can you provide uname -a output, or specifically the build date of the kernel? There have been bge(4) changes happening regularly throughout the lifetime of RELENG_8, including into the -PRERELEASE stage.

> Mem: 25M Active, 147M Inact, 1021M Wired, 112K Cache, 316M Buf, 1780M Free
> Swap: 6144M Total, 6144M Free
>
> The server doesn't have many connections, 16 in ESTABLISHED state. As
> you can see from top, the server has 1780 MB free memory.

Clarification: I believe it actually has 1927MB (147M Inact + 1780M Free) available.
I've always understood top's "Free" field to mean "number/amount of pages which have never been touched/used since the kernel was started", while "Inact" to mean "number/amount of pages which have been touched/used but are not actively being used, thus available for use". If someone more familiar with the VM and top could expand on this, that'd be helpful.

-- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 11:48:33 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3762C106566B for ; Thu, 10 Jun 2010 11:48:33 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta07.emeryville.ca.mail.comcast.net (qmta07.emeryville.ca.mail.comcast.net [76.96.30.64]) by mx1.freebsd.org (Postfix) with ESMTP id 1E61E8FC19 for ; Thu, 10 Jun 2010 11:48:33 +0000 (UTC) Received: from omta04.emeryville.ca.mail.comcast.net ([76.96.30.35]) by qmta07.emeryville.ca.mail.comcast.net with comcast id UBXx1e0060lTkoCA7BoY5Q; Thu, 10 Jun 2010 11:48:32 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta04.emeryville.ca.mail.comcast.net with comcast id UBoY1e0013S48mS8QBoYXq; Thu, 10 Jun 2010 11:48:32 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 09CE09B418; Thu, 10 Jun 2010 04:48:32 -0700 (PDT) Date: Thu, 10 Jun 2010 04:48:32 -0700 From: Jeremy Chadwick To: Anders Nordby Message-ID: <20100610114831.GB71432@icarus.home.lan> References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> <20100610081710.GA64350@server.vk2pj.dyndns.org> <20100610110609.GA87243@fupp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100610110609.GA87243@fupp.net> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@FreeBSD.org, Peter Jeremy Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 11:48:33 -0000

On Thu, Jun 10, 2010 at 01:06:09PM +0200, Anders Nordby wrote:
> Hi,
>
> On Thu, Jun 10, 2010 at 06:17:10PM +1000, Peter Jeremy wrote:
> > I wonder if your system is running out of free RAM. How would you
> > like to monitor "inactive", "cache" and "free" from either "systat -v"
> > or "vmstat -s" whilst the problem is occurring.
> >
> > Does something like
> > perl -e '$x = "x" x 10000000;'
> > temporarily correct the problem?
>
> While the problem is happening:
>
> root@unixfile:~# vmstat -s

Can you also provide "vmstat -i" output, both when the issue is happening and after the machine has been rebooted (but been up for 5-10 minutes)? Thanks.

-- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977.
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 11:54:46 2010 Return-Path: Delivered-To: fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E07E2106564A for ; Thu, 10 Jun 2010 11:54:46 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id 4A6238FC1C for ; Thu, 10 Jun 2010 11:54:45 +0000 (UTC) Received: from mail.cicely.de ([10.1.1.37]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id o5ABsiTj010250 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 10 Jun 2010 13:54:44 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by mail.cicely.de (8.14.3/8.14.3) with ESMTP id o5ABsWt8061742 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 10 Jun 2010 13:54:32 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id o5ABsWZJ080841; Thu, 10 Jun 2010 13:54:32 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id o5ABsVw4080840; Thu, 10 Jun 2010 13:54:31 +0200 (CEST) (envelope-from ticso) Date: Thu, 10 Jun 2010 13:54:31 +0200 From: Bernd Walter To: Jeremy Chadwick Message-ID: <20100610115429.GQ72453@cicely7.cicely.de> References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <20100609144355.GL72453@cicely7.cicely.de> <20100610112345.644960lrau3mxfk0@webmail.leidinger.net> <20100610102350.GP72453@cicely7.cicely.de> <20100610102918.GA69770@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100610102918.GA69770@icarus.home.lan> X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED=-1, BAYES_00=-1.9, T_RP_MATCHES_RCVD=-0.01 autolearn=unavailable version=3.3.0 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on spamd.cicely.de Cc: Alexander Leidinger , ticso@cicely.de, fs@FreeBSD.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 11:54:47 -0000

On Thu, Jun 10, 2010 at 03:29:18AM -0700, Jeremy Chadwick wrote:
> You're running an 8.0 release candidate. There have been some changes
> to scrubbing and other whatnots with ZFS between then and now. I'd
> recommend trying RELENG_8 and seeing if the behaviour remains. You
> don't have to use ahci.ko (you can stick with ataahci.ko).

Good to know. Updating to a more recent 8, or maybe current, is already on my TODO list for ataahci, but since the system runs, it is quite low on that list.

My wishlist also has reboot-persistent cache devices. For me they work pretty well, but they are empty after a reboot and it takes several days to fill them. So far no one could tell me whether that has changed.

> By "behaviour" I'm referring to how long the scrub is taking. The
> variance you see in ETA is normal. You can blindly verify that things
> aren't stalled by using "zpool iostat" (there should be fairly
> intensive I/O).
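For the record, watching a scrub that way is just the following (using the pool name "data" from the status output above):

---snip---
zpool iostat data 5      # pool-wide operations/bandwidth every 5 seconds
zpool iostat -v data 5   # the same, broken down per vdev
---snip---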
-- 
B.Walter                http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD machines, and more.

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 13:03:09 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A2D8B106566B for ; Thu, 10 Jun 2010 13:03:09 +0000 (UTC) (envelope-from anders@FreeBSD.org) Received: from fupp.net (totem.fix.no [80.91.36.20]) by mx1.freebsd.org (Postfix) with ESMTP id 5B4F48FC08 for ; Thu, 10 Jun 2010 13:03:08 +0000 (UTC) Received: from localhost (totem.fix.no [80.91.36.20]) by fupp.net (Postfix) with ESMTP id 02AE34766B; Thu, 10 Jun 2010 15:03:08 +0200 (CEST) Received: from fupp.net ([80.91.36.20]) by localhost (totem.fix.no [80.91.36.20]) (amavisd-new, port 10024) with LMTP id rlna0lJRHmku; Thu, 10 Jun 2010 15:03:07 +0200 (CEST) Received: by fupp.net (Postfix, from userid 1000) id CC6BE4766A; Thu, 10 Jun 2010 15:03:07 +0200 (CEST) Date: Thu, 10 Jun 2010 15:03:07 +0200 From: Anders Nordby To: Jeremy Chadwick Message-ID: <20100610130307.GA33285@fupp.net> References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> <20100610081710.GA64350@server.vk2pj.dyndns.org> <20100610110609.GA87243@fupp.net> <20100610114831.GB71432@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20100610114831.GB71432@icarus.home.lan> User-Agent: Mutt/1.4.2.3i X-PGP-Key: http://anders.fix.no/pgp/ X-PGP-Key-FingerPrint: 1E0F C53C D8DF 6A8F EAAD 19C5 D12A BC9F 0083 5956 Cc: freebsd-fs@FreeBSD.org, Peter Jeremy Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jun 2010 13:03:09 -0000

Hi,

On Thu, Jun 10, 2010 at 04:48:32AM -0700, Jeremy Chadwick wrote:
> Can you also provide "vmstat -i" output, both when the issue is
> happening and after the machine has been rebooted (but been up for 5-10
> minutes)? Thanks.

While having issues:

root@unixfile:~# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           6          0
irq14: ata0                            1          0
irq18: uhci2                    78164874        953
irq19: uhci1                      643047          7
irq26: bge1                     73830825        900
irq51: ciss0                      642774          7
cpu0: timer                    163861455       1998
cpu1: timer                    163853438       1998
cpu3: timer                    163906515       1999
cpu2: timer                    163906515       1999
Total

5 minutes after a reboot:

root@unixfile:~# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           6          0
irq14: ata0                            1          0
irq18: uhci2                        5813         19
irq19: uhci1                        2503          8
irq26: bge1                         1997          6
irq51: ciss0                        2503          8
cpu0: timer                       592619       1995
cpu1: timer                       584601       1968
cpu2: timer                       584605       1968
cpu3: timer                       584606       1968
Total                            2359254       7943

Bye,

-- 
Anders.
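Since the "total" column above is cumulative since boot and the "rate" column is a long-term average, a quick way to see which IRQs are firing right now is to diff two snapshots (sketch only):

---snip---
vmstat -i > /tmp/vmstat-i.0
sleep 10
vmstat -i > /tmp/vmstat-i.1
diff /tmp/vmstat-i.0 /tmp/vmstat-i.1   # or: systat -vmstat for live rates
---snip---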
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 13:39:01 2010
Date: Thu, 10 Jun 2010 06:38:59 -0700
From: Jeremy Chadwick
To: Anders Nordby
Cc: freebsd-fs@FreeBSD.org, Peter Jeremy
Message-ID: <20100610133859.GA74094@icarus.home.lan>
In-Reply-To: <20100610130307.GA33285@fupp.net>
Subject: Re: Odd network issues on ZFS based NFS server

On Thu, Jun 10, 2010 at 03:03:07PM +0200, Anders Nordby wrote:
> While having issues:
>
> root@unixfile:~# vmstat -i
> interrupt                          total       rate
> [...]
> irq18: uhci2                    78164874        953
> irq26: bge1                     73830825        900
> [...]
>
> 5 minutes after a reboot:
>
> [...]
> irq18: uhci2                        5813         19
> irq26: bge1                         1997          6
> [...]

The interrupt rate for bge1 (irq26) is very high during the problem, while otherwise it is only ~6/sec. Shot in the dark, but this is probably the cause of the packet loss you see. Oddly, your uhci2 interface (used for USB) is also firing at a very high rate. I don't know if this is the sign of a NIC problem, a driver problem, or an interrupt (think APIC) routing problem.

Debugging this is beyond my capability, but folks like John Baldwin may have some ideas on where to go from here.

Also, have you used "netstat -ibn -I bge1" (to look at byte counters) or "tcpdump -l -n -s 0 -i bge1" to watch network traffic live when this is happening? The reason I ask is to determine if there's any chance this box starts seeing problems due to DoS attacks or excessive LAN traffic which is unexpected. Basically, be sure that all the network I/O going on across bge1 is expected.

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 14:26:38 2010
Date: Thu, 10 Jun 2010 16:26:29 +0200
From: Alexander Leidinger
To: fs@freebsd.org
Message-ID: <20100610162629.38992mazf0sfdqg0@webmail.leidinger.net>
Subject: CFT: periodic scrubbing of ZFS pools

Hi,

as there seems to be interest in a periodic script to scrub zpools, I modified my monthly-POC into a daily script with parameters for which pools to scrub, how many days between scrubs (even different per pool, if required), and several error checks (non-existing pool specified, scrub in progress). You can find it at

  http://www.Leidinger.net/FreeBSD/current-patches/600.scrub-zfs

Please put it into /etc/periodic/daily and test it. Possible periodic.conf variables are (the per-pool form takes the pool name in the variable name):

daily_scrub_zfs_enable="YES"
daily_scrub_zfs_pools="name1 name2 name3"    # all if unset or empty
daily_scrub_zfs_default_threshold=""         # default: 30
daily_scrub_zfs_<poolname>_threshold=""

If there is no specific threshold for a pool (= days between scrubs), the default threshold is used.

Bye,
Alexander.

--
Hear about... the guru who refused Novocaine while having a tooth pulled because he wanted to transcend dental medication?

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org      netchild @ FreeBSD.org    : PGP ID = 72077137
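For example, a periodic.conf fragment that scrubs the pool "tank" every two weeks and any other imported pool at the default interval could look like this (a sketch; the per-pool variable is assumed to follow the daily_scrub_zfs_<poolname>_threshold pattern described above):

---snip---
# /etc/periodic.conf
daily_scrub_zfs_enable="YES"
daily_scrub_zfs_pools=""                  # empty: consider all pools
daily_scrub_zfs_default_threshold="30"    # days between scrubs
daily_scrub_zfs_tank_threshold="14"       # scrub "tank" more often
---snip---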
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 14:59:18 2010
Date: Thu, 10 Jun 2010 16:59:09 +0200
From: Alexander Leidinger
To: Bartosz Stec
Cc: freebsd-fs@freebsd.org
Message-ID: <20100610165909.19296dpe2uxbeqo0@webmail.leidinger.net>
In-Reply-To: <4C10B136.3030404@kkip.pl>
Subject: Re: Do we want a periodic script for a zfs scrub?

Quoting Bartosz Stec (from Thu, 10 Jun 2010 11:32:38 +0200):

> Ross-at-neces-dot-com already did what you're searching for. I'm
> using his periodic scripts for some months now, check here:
> http://www.neces.com/blog/technology/integrating-freebsd-zfs-and-periodic-snapshots-and-scrubs.
> They're doing all necessary stuff, like checking for scrub in progress
> too. Hope you'll find them helpful.

They cannot be imported as-is into FreeBSD; the way he shares common code between several scripts is a little bit outside of what *I* would agree to do in FreeBSD. My polished-up script also has some more features, and the code in question is not difficult to get right. So... no reuse of what he did. You can find the script referenced in another mail I wrote to fs@.

Bye,
Alexander.

--
http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org      netchild @ FreeBSD.org    : PGP ID = 72077137

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 14:59:55 2010
Date: Thu, 10 Jun 2010 07:59:46 -0700
From: Artem Belevich
To: Alexander Leidinger
Cc: fs@freebsd.org
In-Reply-To: <20100610115324.10161biomkjndvy8@webmail.leidinger.net>
Subject: Re: Do we want a periodic script for a zfs scrub?
> Good idea! I even found a command line which does the calculation for the
> number of days between "now" and the last run (not taking a leap year into
> account, but an off-by-one day error here does not matter).

You can get exactly one month difference by using the -v option of the 'date' command to figure out the time/date offset by an arbitrary amount. Combined with the +"%s" format to print the number of seconds since the Epoch, and -r to specify the reference point in time, it makes 'date' pretty useful in scripts.

--Artem

On Thu, Jun 10, 2010 at 2:53 AM, Alexander Leidinger wrote:
> Quoting jhell (from Wed, 09 Jun 2010 11:23:10 -0400):
>
>> On 06/09/2010 11:07, jhell wrote:
>>>
>>> On 06/09/2010 10:26, Alexander Leidinger wrote:
>>>>
>>>> Hi,
>>>>
>>>> I noticed that we do not have an automatism to scrub a ZFS pool
>>>> periodically. Is there interest in something like this, or shall I keep
>>>> it local?
>>>>
>>>> Here's the main part of the monthly periodic script I quickly created:
>>>> ---snip---
>>>> case "$monthly_scrub_zfs_enable" in
>>>>     [Yy][Ee][Ss])
>>>>         echo
>>>>         echo 'Scrubbing of zfs pools:'
>>>>
>>>>         if [ -z "${monthly_scrub_zfs_pools}" ]; then
>>>>                 monthly_scrub_zfs_pools="$(zpool list -H -o name)"
>>>>         fi
>>>>
>>>>         for pool in ${monthly_scrub_zfs_pools}; do
>>>>                 # successful only if there is at least one pool to scrub
>>>>                 rc=0
>>>>
>>>>                 echo "   starting scrubbing of pool '${pool}'"
>>>>                 zpool scrub ${pool}
>>>>                 echo "      consult 'zpool status ${pool}' for the result"
>>>>                 echo "      or wait for the daily_status_zfs mail, if enabled"
>>>>         done
>>>>         ;;
>>>> ---snip---
>>>>
>>>> Bye,
>>>> Alexander.
>>>>
>>>
>>> Please add a check to see if any resilvering is being done on the pool
>>> that the scrub is being executed on. (Just in case), I would hope that
>>> the scrub would fail silently in this case.
>>>
>>> Please also check whether a scrub is already running on one of the pools
>>> and if so & another pool exists start a background loop to wait for the
>>> first scrub to finish or die silently.
>>>
>>> I had a scrub fully restart from calling scrub a second time after being
>>> more than 50% complete, it's frustrating.
>>>
>>> Thanks!,
>>>
>>
>> I should probably suggest one check that comes to mind.
>>
>> zpool history ${pool} | grep scrub | tail -1 | cut -f1 -d.
>>
>> Then compare the output with today's date to make sure today is >= 30
>> days from the date of the last scrub.
>>
>> With the above this could be turned into a daily_zfs_scrub_enable with a
>> default daily_zfs_scrub_threshold="30" and ensuring that if one check is
>> missed it will not take another 30 days to run the check again.
>
> Good idea! I even found a command line which does the calculation for the
> number of days between "now" and the last run (not taking a leap year into
> account, but an off-by-one day error here does not matter).
>
> Bye,
> Alexander.
>
> --
> "He's a businessman. I'll make him an offer he can't refuse."
>                 -- Vito Corleone, "Chapter 1", page 39
>
> http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
> http://www.FreeBSD.org      netchild @ FreeBSD.org    : PGP ID = 72077137
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
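Concretely, the kind of 'date' one-liner being discussed, using the example timestamp from the thread (a sketch with a 30-day offset rather than Artem's one-month example):

---snip---
# seconds since the Epoch, 30 days after the recorded scrub time
date -j -f "%Y-%m-%d.%H:%M:%S" -v+30d +"%s" "2010-06-08.20:51:12"
# compare against the current time in the same unit
date +"%s"
---snip---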
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 15:38:37 2010
Date: Thu, 10 Jun 2010 17:38:25 +0200
From: Alexander Leidinger
To: Artem Belevich
Cc: fs@freebsd.org
Message-ID: <20100610173825.164930ekkryr5tes@webmail.leidinger.net>
Subject: Re: Do we want a periodic script for a zfs scrub?

Quoting Artem Belevich (from Thu, 10 Jun 2010 07:59:46 -0700):

>> Good idea! I even found a command line which does the calculation for the
>> number of days between "now" and the last run (not taking a leap year into
>> account, but an off-by-one day error here does not matter).
>
> You can get exactly one month difference by using the -v option of the 'date'
> command to figure out the time/date offset by an arbitrary amount.
> Combined with the +"%s" format to print the number of seconds since the Epoch
> and -r to specify the reference point in time, it makes 'date' pretty
> useful in scripts.

What we have is the date of the last scrub (e.g. 2010-06-08.20:51:12), and what we want to know is whether between the last scrub and now a specific number of days has passed or not.

What I do is take the year multiplied by 365, plus the day of the year. I compute both of these for the date of the last scrub and for "now". The difference is the number of days between those two dates. This value I can use with -le or -ge for the test command.

This is only off by one once in a leap year, when the leap-day is in-between the two dates (those people who want to scrub every 4 years are off by two when both leap-days are in-between, but a scrub every 4 years or more looks unreasonable to me, so I do not care much about this).

This is done in one line with two calls to date (once for the last scrub, once for "now") and a little bit of shell-builtin arithmetic. If you have a more correct version which is not significantly more complex, feel free to share it here.

Bye,
Alexander.

--
"Who would have thought hell would really exist? And that it would be in New Jersey?" -Leela
"Actually..." - Fry

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org      netchild @ FreeBSD.org    : PGP ID = 72077137
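A minimal sketch of that day-number comparison (not the actual 600.scrub-zfs code, just the idea as described; the %j output is routed through awk so that a leading zero is not read as octal by the shell):

---snip---
#!/bin/sh
last="2010-06-08.20:51:12"    # e.g. taken from 'zpool history'
last_yd=$(date -j -f "%Y-%m-%d.%H:%M:%S" "+%Y %j" "$last")
now_yd=$(date "+%Y %j")
# day number = year*365 + day-of-year; the difference is ~days between
days=$(echo "$last_yd $now_yd" | awk '{ print ($3*365+$4) - ($1*365+$2) }')
if [ "$days" -ge 30 ]; then
        echo "scrub due ($days days since the last one)"
fi
---snip---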
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 16:34:55 2010
Date: Thu, 10 Jun 2010 09:34:49 -0700
From: Artem Belevich
To: Alexander Leidinger
Cc: fs@freebsd.org
In-Reply-To: <20100610173825.164930ekkryr5tes@webmail.leidinger.net>
Subject: Re: Do we want a periodic script for a zfs scrub?

You can do something like this:

#SCRUB_TS="2010-06-08.20:51:12"
SCRUB_TS=$1
# parse timestamp, move it forward by 1 month and print in seconds since Epoch
NEXT_SCRUB_DATE_S=`date -j -f "%Y-%m-%d.%H:%M:%S" -v+1m +"%s" $SCRUB_TS`
# for debugging purposes convert epoch time into something human-readable
NEXT_SCRUB_DATE=`date -r $NEXT_SCRUB_DATE_S`
# current time in secs since Epoch.
NOW_S=`date +"%s"`
# Compare two times to figure out if next scrub time is still in the future
if [ $NOW_S -gt $NEXT_SCRUB_DATE_S ]; then
    echo yup.
else
    echo nope.
fi

--Artem

On Thu, Jun 10, 2010 at 8:38 AM, Alexander Leidinger wrote:
> Quoting Artem Belevich (from Thu, 10 Jun 2010 07:59:46 -0700):
>
>>> Good idea! I even found a command line which does the calculation for the
>>> number of days between "now" and the last run (not taking a leap year
>>> into account, but an off-by-one day error here does not matter).
>>
>> You can get exactly one month difference by using the -v option of 'date'
>> [...]
>
> What we have is the date of the last scrub (e.g. 2010-06-08.20:51:12), and
> what we want to know is if between the last scrub and now we passed a
> specific amount of days or not.
>
> What I do is taking the year multiplied with 365 plus the day of the year.
> Both of these for the last date of the scrub and "now". The difference is the
> number of days between those two dates. This value I can use with -le or -ge
> for the test command.
>
> This is only off by one once in a leap year when the leap-day is in-between
> the two dates [...]
>
> This is done in one line with two calls to date (once for the last scrub,
> once for "now") and a little bit of shell-builtin arithmetic. If you have a
> more correct version which is not significantly more complex, feel free to
> share it here.
>
> Bye,
> Alexander.
>
> [...]
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 17:36:32 2010
Date: Thu, 10 Jun 2010 13:36:29 -0400
From: Gary Palmer
To: Jeremy Chadwick
Cc: freebsd-fs@FreeBSD.org, Anders Nordby
Message-ID: <20100610173629.GA70716@in-addr.com>
In-Reply-To: <20100610114614.GA71432@icarus.home.lan>
Subject: Re: Odd network issues on ZFS based NFS server

On Thu, Jun 10, 2010 at 04:46:14AM -0700, Jeremy Chadwick wrote:
> Clarification: I believe it actually has 1927MB (147M Inact + 1780M
> Free) available. I've always understood top's "Free" field to mean
> "number/amount of pages which have never been touched/used since the
> kernel was started", while "Inact" to mean "number/amount of pages which
> have been touched/used but are not actively being used, thus available
> for use".
>
> If someone more familiar with the VM and top could expand on this,
> that'd be helpful.

I'm not a VM guru, however here is my understanding:

- "Free" are pages that have been reclaimed by the page daemon and are
  ready for immediate use without further action. The page daemon always
  tries to keep a few pages in the "Free" state to avoid problems with
  page starvation.

- "Inactive" pages are pages that are candidates for reclamation by the
  page daemon if so needed. I believe some amount of work is needed to
  move an inactive page to the free list, including zeroing it, I think, as
  well as removing any references still pointing to it (e.g. it could be
  a cached copy of data from local storage).

Probably not completely 100% accurate, as I haven't kept up with VM changes in the last few years, but close enough for government work :)

Regards,

Gary

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 17:58:36 2010
Date: Thu, 10 Jun 2010 20:58:32 +0300
From: Andriy Gapon
To: Gary Palmer
Cc: freebsd-fs@freebsd.org
Message-ID: <4C1127C8.9040207@icyb.net.ua>
In-Reply-To: <20100610173629.GA70716@in-addr.com>
Subject: Re: Odd network issues on ZFS based NFS server

on 10/06/2010 20:36 Gary Palmer said the following:
> - "Free" are pages that have been reclaimed by the page daemon and are
>   ready for immediate use without further action. [...]
>
> - "Inactive" pages are pages that are candidates for reclamation by the
>   page daemon if so needed. [...]

Something like that, right. My understanding: Active pages are also candidates for reclamation, but Inactive are the primary ones. The only difference is how much time has passed since they were last "referenced". Cached pages are pages that are effectively free, i.e. they can be reclaimed at any moment, but their content is still valid and so they can be re-used.

--
Andriy Gapon
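The page counts behind those top fields can also be read directly from the standard VM counters; a quick look (values are pages, so multiply by the page size for bytes):

---snip---
sysctl vm.stats.vm.v_active_count vm.stats.vm.v_inactive_count \
       vm.stats.vm.v_cache_count vm.stats.vm.v_free_count
---snip---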
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 18:48:57 2010
Date: Fri, 11 Jun 2010 04:48:44 +1000
From: Peter Jeremy
To: Anders Nordby
Cc: freebsd-fs@FreeBSD.org
Message-ID: <20100610184844.GA64544@server.vk2pj.dyndns.org>
In-Reply-To: <20100610110609.GA87243@fupp.net>
Subject: Re: Odd network issues on ZFS based NFS server

On 2010-Jun-10 13:06:09 +0200, Anders Nordby wrote:
>On Thu, Jun 10, 2010 at 06:17:10PM +1000, Peter Jeremy wrote:
>> I wonder if your system is running out of free RAM. How would you
>> like to monitor "inactive", "cache" and "free" from either "systat -v"
>> or "vmstat -s" whilst the problem is occurring.
>>
>> Does something like
>> perl -e '$x = "x" x 10000000;'
>> temporarily correct the problem?
>
>While the problem is happening:
>...
>And from systat -v:
>
>Disks   da0   da1 pass0 pass1        1045240 wire
>KB/t   0.00  0.00  0.00  0.00          25240 act
>tps       0     0     0     0         149344 inact
>MB/s   0.00  0.00  0.00  0.00            112 cache
>%busy     0     0     0     0        1824452 free
>                                      323680 buf

>> Does something like
>> perl -e '$x = "x" x 10000000;'
>> temporarily correct the problem?
>
>No.

OK, it's not the issue I was considering. I can't offer any further suggestions at this point.

--
Peter Jeremy

PS: Sorry about the confused 'From' address last time.

From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 19:11:22 2010
Date: Thu, 10 Jun 2010 15:11:12 -0400
From: jhell
Cc: Alexander Leidinger, fs@freebsd.org
Message-ID: <4C1138D0.7070901@dataix.net>
Subject: Re: Do we want a periodic script for a zfs scrub?
On 06/10/2010 12:34, Artem Belevich wrote:
> You can do something like this:
>
> #SCRUB_TS="2010-06-08.20:51:12"
> SCRUB_TS=$1
> # parse timestamp, move it forward by 1 month and print in seconds since Epoch
> NEXT_SCRUB_DATE_S=`date -j -f "%Y-%m-%d.%H:%M:%S" -v+1m +"%s" $SCRUB_TS`
> # for debugging purposes convert epoch time into something human-readable
> NEXT_SCRUB_DATE=`date -r $NEXT_SCRUB_DATE_S`
> # current time in secs since Epoch.
> NOW_S=`date +"%s"`
> # Compare two times to figure out if next scrub time is still in the future
> if [ $NOW_S -gt $NEXT_SCRUB_DATE_S ]; then
>     echo yup.
> else
>     echo nope.
> fi
>
> --Artem

#!/bin/sh

lastscrub=$(zpool history exports | grep scrub | tail -1 | cut -f1 -d.)
todaypoch=$(date -j -f "%Y-%m-%d" "+%s" $(date "+%Y-%m-%d"))
scrubpoch=$(date -j -f "%Y-%m-%d" "+%s" $lastscrub)

echo $lastscrub Last Scrub From zpool history
echo $todaypoch Today converted to seconds since epoch
echo $scrubpoch Last scrub converted to seconds since epoch

expired=$((((($todaypoch-$scrubpoch)/60)/60)/24))

if [ ${expired:=30} -ge ${daily_scrub_zfs_threshold:=30} ]; then
        echo "Performing Scrub...."
else
        echo "SORRY its only been $expired days since your last scrub."
fi

My reasoning for giving expired a default value of 30 is that a pool may have just been created, in which case a scrub would never have been performed; with this value equal to the default threshold, such a pool gets scrubbed on the first day it exists.

I considered just doing ${expired:=${daily_scrub_zfs_threshold:=30}}, which would also allow it to be set to whatever value a user configured before the pool was created, and adds another layer of redundancy on that variable in a fail-safe sort of way.

Regards, and nice work on this. I just noticed the CFT after writing this, but still, have a look at the above; it may simplify the testing while providing some fallback for what I stated above.

> On Thu, Jun 10, 2010 at 8:38 AM, Alexander Leidinger wrote:
>> [...]
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

--
jhell
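The ${expired:=30} fallback discussed above is the shell's assign-default expansion; a tiny illustration of the behaviour (separate from the script itself):

---snip---
unset expired
echo "${expired:=30}"   # prints 30 and assigns it, so a pool with no
echo "$expired"         # scrub in its history passes the -ge test
---snip---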
From owner-freebsd-fs@FreeBSD.ORG Thu Jun 10 23:32:41 2010
Date: Thu, 10 Jun 2010 19:48:49 -0400 (EDT)
From: Rick Macklem
To: Jeremy Chadwick
Cc: freebsd-fs@FreeBSD.org, Peter Jeremy, Anders Nordby
In-Reply-To: <20100610133859.GA74094@icarus.home.lan>
Subject: Re: Odd network issues on ZFS based NFS server

On Thu, 10 Jun 2010, Jeremy Chadwick wrote:

> The interrupt rate for bge1 (irq26) is very high during the problem,
> while otherwise it is only ~6/sec. [...]
>
> Also, have you used "netstat -ibn -I bge1" (to look at byte counters) or
> "tcpdump -l -n -s 0 -i bge1" to watch network traffic live when this is
> happening? The reason I ask is to determine if there's any chance this
> box starts seeing problems due to DoS attacks or excessive LAN traffic
> which is unexpected. Basically, be sure that all the network I/O going
> on across bge1 is expected.

Yes, I think Jeremy is on the right track. I'd second the recommendation to look at traffic when it is happening. I might choose:

  tcpdump -s 0 -w <file> -i bge1

and then load <file> into wireshark, since wireshark is much better at making sense of NFS traffic. (Since the nfsd is at the top of the process list, it hints that there may be heavy NFS traffic being received by bge1.)

If you do this tcpdump for a short period of time and then email <file> to me as an attachment, I can take a look at it. (If the traffic isn't NFS, then there's not much point in doing this.) We might have a case where a client is retrying the same RPC (or RPC sequence) over and over and over again, my friend (sorry I couldn't resist:-).

Given that you stated FreeBSD8.1-Prerelease I think you should have the patch, but please make sure that your sys/nfsserver/nfs_srvsubs.c is at least r206406.

Let me know how it goes, rick
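For instance, a capture narrowed to NFS traffic keeps the file small enough to mail (a sketch; it assumes NFS on the standard port 2049 and an arbitrary output path of /tmp/nfs.pcap):

---snip---
# full frames (-s 0) on bge1, NFS traffic only, written for wireshark
tcpdump -s 0 -w /tmp/nfs.pcap -i bge1 port 2049
---snip---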
From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 03:18:12 2010
Date: Thu, 10 Jun 2010 20:18:09 -0700
From: Jeremy Chadwick
To: Rick Macklem
Cc: freebsd-fs@FreeBSD.org, Peter Jeremy, Anders Nordby, PYUN Yong-Hyeon
Message-ID: <20100611031809.GA93666@icarus.home.lan>
Subject: Re: Odd network issues on ZFS based NFS server

On Thu, Jun 10, 2010 at 07:48:49PM -0400, Rick Macklem wrote:
> Yes, I think Jeremy is on the right track. I'd second the recommendation
> to look at traffic when it is happening. [...]
>
> Given that you stated FreeBSD8.1-Prerelease I think you should have the
> patch, but please make sure that your sys/nfsserver/nfs_srvsubs.c is
> at least r206406.
>
> Let me know how it goes, rick

Also for Anders -- with regards to possible bge(4) issues, Yong-Hyeon works on this driver fairly often. If it turns out to be a driver issue of some sort, he can probably help. Relevant commits are here (to give you some idea of activity):

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/bge/if_bge.c

One commit caught my eye (rev 1.226.2.15), but that seems to be more focused on mbuf issues (your system doesn't appear to be having any, given your netstat -m output). CC'ing Yong-Hyeon, as he might know of some edge case where bge(4) could go crazy with interrupts. :-)

Yong-Hyeon, the entire thread is here:
http://lists.freebsd.org/pipermail/freebsd-fs/2010-June/008654.html

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 06:03:50 2010
Date: Fri, 11 Jun 2010 09:03:44 +0300
From: Mikolaj Golub
To: freebsd-fs@freebsd.org
Message-ID: <86mxv22ji7.fsf@zhuzha.ua1>
Subject: '#ifndef DIAGNOSTIC' in nfsclient code looks like a typo
Hi:

'#ifndef DIAGNOSTIC' in sys/nfsclient/nfs_vnops.c and sys/fs/nfsclient/nfs_clvnops.c looks like a typo, and '#ifdef' should be used instead (see the attached patch).

--
Mikolaj Golub

Content-Disposition: inline; filename=nfsclient.ifdef_DIAGNOSTIC.patch

Index: sys/nfsclient/nfs_vnops.c
===================================================================
--- sys/nfsclient/nfs_vnops.c	(revision 209021)
+++ sys/nfsclient/nfs_vnops.c	(working copy)
@@ -1348,7 +1348,7 @@ nfs_writerpc(struct vnode *vp, struct uio *uiop, s
 	int v3 = NFS_ISV3(vp), committed = NFSV3WRITE_FILESYNC;
 	int wsize;
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if (uiop->uio_iovcnt != 1)
 		panic("nfs: writerpc iovcnt > 1");
 #endif
@@ -1708,7 +1708,7 @@ nfs_remove(struct vop_remove_args *ap)
 	int error = 0;
 	struct vattr vattr;
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if ((cnp->cn_flags & HASBUF) == 0)
 		panic("nfs_remove: no name");
 	if (vrefcnt(vp) < 1)
@@ -1814,7 +1814,7 @@ nfs_rename(struct vop_rename_args *ap)
 	struct componentname *fcnp = ap->a_fcnp;
 	int error;
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if ((tcnp->cn_flags & HASBUF) == 0 ||
 	    (fcnp->cn_flags & HASBUF) == 0)
 		panic("nfs_rename: no name");
@@ -2277,7 +2277,7 @@ nfs_readdirrpc(struct vnode *vp, struct uio *uiop,
 	int attrflag;
 	int v3 = NFS_ISV3(vp);
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
 	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
 		panic("nfs readdirrpc bad uio");
@@ -2482,7 +2482,7 @@ nfs_readdirplusrpc(struct vnode *vp, struct uio *u
 #ifndef nolint
 	dp = NULL;
 #endif
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
 	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
 		panic("nfs readdirplusrpc bad uio");
@@ -2752,7 +2752,7 @@ nfs_sillyrename(struct vnode *dvp, struct vnode *v
 	cache_purge(dvp);
 	np = VTONFS(vp);
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if (vp->v_type == VDIR)
 		panic("nfs: sillyrename dir");
 #endif
Index: sys/fs/nfsclient/nfs_clvnops.c
===================================================================
--- sys/fs/nfsclient/nfs_clvnops.c	(revision 209021)
+++ sys/fs/nfsclient/nfs_clvnops.c	(working copy)
@@ -1564,7 +1564,7 @@ nfs_remove(struct vop_remove_args *ap)
 	int error = 0;
 	struct vattr vattr;
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if ((cnp->cn_flags & HASBUF) == 0)
 		panic("nfs_remove: no name");
 	if (vrefcnt(vp) < 1)
@@ -1676,7 +1676,7 @@ nfs_rename(struct vop_rename_args *ap)
 	struct nfsv4node *newv4 = NULL;
 	int error;
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if ((tcnp->cn_flags & HASBUF) == 0 ||
 	    (fcnp->cn_flags & HASBUF) == 0)
 		panic("nfs_rename: no name");
@@ -2137,7 +2137,7 @@ ncl_readdirrpc(struct vnode *vp, struct uio *uiop,
 	struct nfsmount *nmp = VFSTONFS(vp->v_mount);
 	int error = 0, eof, attrflag;
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
 	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
 		panic("nfs readdirrpc bad uio");
@@ -2198,7 +2198,7 @@ ncl_readdirplusrpc(struct vnode *vp, struct uio *u
 	struct nfsmount *nmp = VFSTONFS(vp->v_mount);
 	int error = 0, attrflag, eof;
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
 	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
 		panic("nfs readdirplusrpc bad uio");
@@ -2264,7 +2264,7 @@ nfs_sillyrename(struct vnode *dvp, struct vnode *v
 	cache_purge(dvp);
 	np = VTONFS(vp);
 
-#ifndef DIAGNOSTIC
+#ifdef DIAGNOSTIC
 	if (vp->v_type == VDIR)
 		panic("nfs: sillyrename dir");
 #endif

From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 08:42:29 2010
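For context on why the inverted sense matters: these sanity panics are meant to fire only on kernels built with diagnostic checks enabled, i.e. with

---snip---
# in the kernel configuration file
options 	DIAGNOSTIC
---snip---

so with '#ifndef', stock kernels (built without DIAGNOSTIC) were executing the extra checks while DIAGNOSTIC kernels skipped them.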
Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E8E2106567D for ; Fri, 11 Jun 2010 08:42:29 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id 0B2098FC0A for ; Fri, 11 Jun 2010 08:42:28 +0000 (UTC) Received: from outgoing.leidinger.net (pD954FC95.dip.t-dialin.net [217.84.252.149]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 7BC1F84405C; Fri, 11 Jun 2010 10:42:23 +0200 (CEST) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 44A2251EB; Fri, 11 Jun 2010 10:42:20 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1276245740; bh=g5+qKNsN4COx6/MHIHxAUCSoCZrvGqp5tGOMhyaZwxo=; h=Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To: MIME-Version:Content-Type:Content-Transfer-Encoding; b=zbTfE+frFODz5EsHOQYFhkfNuBVfMFP3fASLFkeHPLOGn+ukdLyndPKlbwqD0/cLG 9sjMFUGCO7N0t2TA6cwIsUgbPfgmnbfPXqpwxXw6P08w/Z5v8kAfZo+1laG0w+Fnvx k9ayTyGcEDFhkUEPhp5HAV/Ta60SSbW98XoxgRBRXtDg7rCRgBkF4Wm5itc4HPOMap ZHmXPgSkMziPAlewnFfGLwiWppnpM77kyAIXAiHcCn90X3TjVCdEag1PKUBoFVNO3q jO9BfKoiobSbSm+8uqSIIR6Caj9H1IilSr+L/I1Yjd4zXvoayPus5neKkz51A//lCL aahlJVF+ROw2w== Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o5B8gJ58059155; Fri, 11 Jun 2010 10:42:19 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Fri, 11 Jun 2010 10:42:19 +0200 Message-ID: <20100611104219.51344ag1ah7br4kk@webmail.leidinger.net> Date: Fri, 11 Jun 2010 10:42:19 +0200 From: Alexander Leidinger To: jhell References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <4C0FAE2A.7050103@dataix.net> <4C0FB1DE.9080508@dataix.net> <20100610115324.10161biomkjndvy8@webmail.leidinger.net> <20100610173825.164930ekkryr5tes@webmail.leidinger.net> <4C1138D0.7070901@dataix.net> In-Reply-To: <4C1138D0.7070901@dataix.net> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 7BC1F84405C.A6A65 X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-1.023, required 6, autolearn=disabled, ALL_TRUSTED -1.00, DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10, TW_ZF 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1276850545.62292@GjhW0Arrhz5WypsI9PqVPw X-EBL-Spam-Status: No Cc: fs@freebsd.org Subject: Re: Do we want a periodic script for a zfs scrub? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 08:42:29 -0000

Quoting jhell (from Thu, 10 Jun 2010 15:11:12 -0400):

> On 06/10/2010 12:34, Artem Belevich wrote:
>> You can do something like this:
>>
>> #SCRUB_TS="2010-06-08.20:51:12"
>> SCRUB_TS=$1
>> # parse timestamp, move it forward by 1 month and print in seconds since Epoch
>> NEXT_SCRUB_DATE_S=`date -j -f "%Y-%m-%d.%H:%M:%S" -v+1m +"%s" $SCRUB_TS`
>> # for debugging purposes convert epoch time into something human-readable
>> NEXT_SCRUB_DATE=`date -r $NEXT_SCRUB_DATE_S`
>> # current time in secs since Epoch.
>> NOW_S=`date +"%s"`
>> # Compare two times to figure out if next scrub time is still in the future
>> if [ $NOW_S -gt $NEXT_SCRUB_DATE_S ]; then
>>     echo yup.
>> else
>>     echo nope.
>> fi
>>
>> --Artem
>
> #!/bin/sh
>
> lastscrub=$(zpool history exports |grep scrub |tail -1 |cut -f1 -d.)
> todaypoch=$(date -j -f "%Y-%m-%d" "+%s" $(date "+%Y-%m-%d"))
> scrubpoch=$(date -j -f "%Y-%m-%d" "+%s" $lastscrub)
>
> echo $lastscrub Last Scrub From zpool history
> echo $todaypoch Today converted to seconds since epoch
> echo $scrubpoch Last scrub converted to seconds since epoch
>
> expired=$((((($todaypoch-$scrubpoch)/60)/60)/24))

Apart from the fact that we can do this with one $(( ))... what happens if/when time_t is extended to 64 bits on 32-bit platforms? Can we get into trouble with the shell arithmetic or not? It depends upon the bit-size of the shell's integers, and their signedness. Jilles (our shell maintainer) also suggested using seconds since the Epoch, and I asked him the same question; I'm waiting for an answer from him. The same concerns apply to test(1) (or the corresponding builtin) in Artem's solution.

By calculating with days everywhere (as in my solution), I'm sure that it takes longer to hit a wall than by calculating with seconds since the Epoch (which can cause a problem in 2038, or during a transition when this problem is tackled in time_t but not here; and that is not that far away). The off-by-one day once every 4 years shouldn't be a problem.

If someone can assure me, with some solid facts, that using seconds since the Epoch will not cause problems in the described cases, I have no problem switching to them.

Bye,
Alexander.

> if [ ${expired:=30} -ge ${daily_scrub_zfs_threshold:=30} ]; then
>     echo "Performing Scrub...."
> else
>     echo "SORRY, it's only been $expired days since your last scrub."
> fi
>
> My reasoning for setting expired to have a default value of 30 depended
> on whether a pool may have just been created, in which case a scrub would
> never have been performed; with this value equal to that of the default
> threshold, that pool would be scrubbed on the first day it was created.
>
> I considered just doing ${expired:=${daily_scrub_zfs_threshold:=30}},
> which would also allow it to be set to whatever a user set their value to
> before the pool was created, and adds another layer of redundancy on that
> variable in a fail-safe sort of way.
>
> Regards, and nice work on this. I just noticed the CFT just after writing
> this, but still, have a look at the above; it may simplify the testing
> while providing some fallback for what I stated above.
>
>>
>> On Thu, Jun 10, 2010 at 8:38 AM, Alexander Leidinger
>> wrote:
>>> Quoting Artem Belevich (from Thu, 10 Jun 2010 07:59:46 -0700):
>>>
>>>>> Good idea! I even found a command line which does the calculation for
>>>>> the number of days between "now" and the last run (not taking a leap
>>>>> year into account, but an off-by-one day error here does not matter).
>>>>
>>>> You can get exactly one month of difference by using the -v option of
>>>> the 'date' command to figure out the time/date offset by an arbitrary
>>>> amount. Combined with the +"%s" format to print the number of seconds
>>>> since the Epoch, and -r to specify the reference point in time, it
>>>> makes 'date' pretty useful in scripts.
>>>
>>> What we have is the date of the last scrub (e.g. 2010-06-08.20:51:12),
>>> and what we want to know is whether between the last scrub and now we
>>> have passed a specific number of days or not.
>>>
>>> What I do is take the year multiplied by 365, plus the day of the year;
>>> both of these for the date of the last scrub and for "now". The
>>> difference is the number of days between those two dates. This value I
>>> can use with -le or -ge for the test command.
>>>
>>> This is only off by one once in a leap year, when the leap-day is
>>> in-between the two dates (those people who want to scrub every 4 years
>>> are off by two when both leap-days are in-between, but a scrub every 4
>>> years or more looks unreasonable to me, so I do not care much about
>>> this).
>>>
>>> This is done in one line with two calls to date (once for the last
>>> scrub, once for "now") and a little bit of shell builtin arithmetic. If
>>> you have a more correct version which is not significantly more
>>> complex, feel free to share it here.
>>>
>>> Bye,
>>> Alexander.
>>>
>>> --
>>> "Who would have thought hell would really exist? And that it would be
>>> in New Jersey?" -Leela
>>> "Actually..." - Fry
>>>
>>> http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
>>> http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137
>>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> --
>
> jhell
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
>

--
Before marriage the three little words are "I love you," after marriage they are "Let's eat out."
http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 10:38:07 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03008106567D for ; Fri, 11 Jun 2010 10:38:07 +0000 (UTC) (envelope-from simon@comsys.ntu-kpi.kiev.ua) Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42]) by mx1.freebsd.org (Postfix) with ESMTP id AC61D8FC19 for ; Fri, 11 Jun 2010 10:38:06 +0000 (UTC) Received: from pm513-1.comsys.kpi.ua ([10.18.52.101] helo=pm513-1.comsys.ntu-kpi.kiev.ua) by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63) (envelope-from ) id 1ON1ci-00085B-7r; Fri, 11 Jun 2010 13:38:04 +0300 Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001) id E37B11CC0B; Fri, 11 Jun 2010 13:38:03 +0300 (EEST) Date: Fri, 11 Jun 2010 13:38:03 +0300 From: Andrey Simonenko To: Rick Macklem Message-ID: <20100611103803.GA1855@pm513-1.comsys.ntu-kpi.kiev.ua> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua X-Authenticator: plain X-Sender-Verify: SUCCEEDED (sender exists & accepts mail) X-Exim-Version: 4.63 (build at 06-Jan-2007 23:14:37) X-Date: 2010-06-11 13:38:04 X-Connected-IP: 10.18.52.101:11436 X-Message-Linecount: 30 X-Body-Linecount: 15 X-Message-Size: 1422 X-Body-Size: 729 Cc: freebsd-fs@freebsd.org Subject: Re: Testers: NFSv3 support for pxeboot for nfs diskless root X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 10:38:07 -0000 On Wed, Jun 09, 2010 at 07:38:24PM -0400, Rick Macklem wrote: > I put 3 patches (you need to apply them all) here: > http://people.freebsd.org/~rmacklem/nfsdiskless-patches/ > > They convert lib/libstand/nfs.c and pxeboot to use NFSv3 instead > of NFSv2 (unless built with OLD_NFSV2 defined). Initial test > reports have been good. (one has it working ok and the other has > a problem in an area not related to the patches, it appears) > > So, if others are interested in testing these, it would be > appreciated, rick Shouldn't return values from malloc() calls be checked? Also additional checks for NULL values before free() calls can be removed, at least this will reduce size of code. There is PR/83424 related to this. 
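To illustrate the pattern Andrey describes, a minimal sketch only: the names below are invented for the example and are not taken from lib/libstand/nfs.c, and (as the replies below confirm) libstand's free(), like the kernel's and userland's, accepts a NULL pointer, so a guard before free() is redundant.

    #include <stand.h>

    /* Hypothetical reply buffer, invented for this sketch. */
    struct reply_buf {
            void    *data;
            size_t   len;
    };

    static int
    reply_buf_init(struct reply_buf *rb, size_t len)
    {

            rb->data = malloc(len);
            if (rb->data == NULL) {
                    /* Check malloc()'s return; a loader can do little
                       more than report the failure and bail out. */
                    printf("reply_buf_init: out of memory\n");
                    return (ENOMEM);
            }
            rb->len = len;
            return (0);
    }

    static void
    reply_buf_fini(struct reply_buf *rb)
    {

            /* free(NULL) is a no-op, so no explicit NULL guard is needed. */
            free(rb->data);
            rb->data = NULL;
    }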
From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 12:04:40 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D5F41065676 for ; Fri, 11 Jun 2010 12:04:40 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id D8F0D8FC14 for ; Fri, 11 Jun 2010 12:04:39 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1ON2yU-0001Bx-Fi for freebsd-fs@freebsd.org; Fri, 11 Jun 2010 14:04:38 +0200 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 11 Jun 2010 14:04:38 +0200 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 11 Jun 2010 14:04:38 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org connect(): No such file or directory From: Ivan Voras Date: Fri, 11 Jun 2010 14:04:24 +0200 Lines: 28 Message-ID: References: <20100610162629.38992mazf0sfdqg0@webmail.leidinger.net> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.9) Gecko/20100518 Thunderbird/3.0.4 In-Reply-To: <20100610162629.38992mazf0sfdqg0@webmail.leidinger.net> X-Enigmail-Version: 1.0.1 Subject: Re: CFT: periodic scrubbing of ZFS pools X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 12:04:40 -0000

On 06/10/10 16:26, Alexander Leidinger wrote:
> Hi,
>
> as there seems to be interest in a periodic script to scrub zpools, I
> modified my monthly-POC into a daily script with parameters for which
> pools to scrub, how many days between scrubs (even different per pool,
> if required), and several error checks (non-existing pool specified,
> scrub in progress).
>
> You can find it at
> http://www.Leidinger.net/FreeBSD/current-patches/600.scrub-zfs
>
> Please put it into /etc/periodic/daily and test it. Possible
> periodic.conf variables are:
> daily_scrub_zfs_enable="YES"
> daily_scrub_zfs_pools="name1 name2 name3" # all if unset or empty
> daily_scrub_zfs_default_threshold="" # default: 30
> daily_scrub_zfs__threshold=""
>
> If there is no specific threshold for a pool (= days between scrubs),
> the default threshold is used.

Fairly good and useful, but could you add a small check of "zpool status" information before scrubbing that would a) complain LOUDLY AND VISIBLY if a previous scrub failed and b) skip issuing a new scrub command if there is such an error, to avoid stressing possibly broken hardware?
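One possible shape for such a check, as a sketch only: the health lookup and the strings matched against the scrub line are assumptions here, since zpool's status wording varies between ZFS versions, so the patterns would need verifying against the version in use.

    #!/bin/sh
    # Sketch: skip the scrub, loudly, if the pool is unhealthy or the
    # previous scrub did not complete cleanly.
    pool=$1

    health=$(zpool list -H -o health "${pool}")
    if [ "${health}" != "ONLINE" ]; then
            echo "WARNING: pool ${pool} is ${health}; not starting a scrub." >&2
            exit 2
    fi

    scrub_line=$(zpool status "${pool}" | grep 'scrub:')
    case "${scrub_line}" in
    ""|*"none requested"*|*"with 0 errors"*)
            zpool scrub "${pool}"
            ;;
    *)
            echo "WARNING: previous scrub of ${pool} did not complete cleanly:" >&2
            echo "    ${scrub_line}" >&2
            exit 2
            ;;
    esac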
From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 14:51:46 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B96091065678 for ; Fri, 11 Jun 2010 14:51:46 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 6A4F28FC13 for ; Fri, 11 Jun 2010 14:51:45 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEAKvqEUyDaFvK/2dsb2JhbACee3G/EYUYBA X-IronPort-AV: E=Sophos;i="4.53,403,1272859200"; d="scan'208";a="80337382" Received: from fraser.cs.uoguelph.ca ([131.104.91.202]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 11 Jun 2010 10:51:43 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id 74E1A109C358; Fri, 11 Jun 2010 10:51:45 -0400 (EDT) X-Virus-Scanned: amavisd-new at fraser.cs.uoguelph.ca Received: from fraser.cs.uoguelph.ca ([127.0.0.1]) by localhost (fraser.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id i6wLuQqUrRwm; Fri, 11 Jun 2010 10:51:45 -0400 (EDT) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by fraser.cs.uoguelph.ca (Postfix) with ESMTP id 101DE109C34B; Fri, 11 Jun 2010 10:51:45 -0400 (EDT) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o5BF7v320443; Fri, 11 Jun 2010 11:07:57 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Fri, 11 Jun 2010 11:07:57 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Andrey Simonenko In-Reply-To: <20100611103803.GA1855@pm513-1.comsys.ntu-kpi.kiev.ua> Message-ID: References: <20100611103803.GA1855@pm513-1.comsys.ntu-kpi.kiev.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: Testers: NFSv3 support for pxeboot for nfs diskless root X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 14:51:46 -0000 On Fri, 11 Jun 2010, Andrey Simonenko wrote: > > Shouldn't return values from malloc() calls be checked? Yea, I suppose that's a good idea, although I think all that can be done is print a failure message, since it's "dead in the water" at that point. > Also additional checks for NULL values before free() calls can be removed, > at least this will reduce size of code. There is PR/83424 related to this. > My only concern here would be if someone were to change Free() so it doesn't check for a null pointer, but since it does now, I suppose it's a feature and shouldn't be changed. Anyone else have an opinion on this? (ie. Whether I should just assume that Free() checks for the NULL ptr.) 
rick From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 15:20:42 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48F301065677; Fri, 11 Jun 2010 15:20:42 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id C89408FC17; Fri, 11 Jun 2010 15:20:41 +0000 (UTC) Received: from outgoing.leidinger.net (pD954FC95.dip.t-dialin.net [217.84.252.149]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 8EE4F84400A; Fri, 11 Jun 2010 17:20:37 +0200 (CEST) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 3FE81521B; Fri, 11 Jun 2010 17:20:34 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1276269634; bh=fqWjhwATLlDUTnUUymky33S3bCX3IngdN484M52/M1A=; h=Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To: MIME-Version:Content-Type:Content-Transfer-Encoding; b=MwAKfWquhD0EcPouOe5m3/y0UQ2sOPpv9U6nXXa3Cq67IBDMm5l/Zg0z3q6rXvxlr aKxZFUBWl9o3fOsDrz4VJJB+gShqvyZKl+yZh2KJdxpU7DIKz6ZgKAyQoWsVV+lzpD LJPaMPPnld6ZZTyXq74Zj1CJqWJDH1dy/upExwSoV5Jl67GiAnyMLu1D0dX8iP1JIN A+nyx4CxAgRODuUEiRNOdjJpO/jvHbsQRS5EbR5OtW74mKtRu1paRrOIOA+sH3d5Rb idHMim9w/tz/d+G5FXQVtqwpD/zakgONiE5hAjPlDkKuVFXzU1aWMNiyzeE1KSku6s tepOC0ClbGuwA== Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o5BFKXeI022633; Fri, 11 Jun 2010 17:20:33 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Fri, 11 Jun 2010 17:20:33 +0200 Message-ID: <20100611172033.42001s90ahe57oe8@webmail.leidinger.net> Date: Fri, 11 Jun 2010 17:20:33 +0200 From: Alexander Leidinger To: Ivan Voras References: <20100610162629.38992mazf0sfdqg0@webmail.leidinger.net> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 8EE4F84400A.A78D4 X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-1.023, required 6, autolearn=disabled, ALL_TRUSTED -1.00, DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10, TW_ZF 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1276874438.51835@75x/WBkoEbA9pDIrtOCYzA X-EBL-Spam-Status: No Cc: freebsd-fs@freebsd.org Subject: Re: CFT: periodic scrubbing of ZFS pools X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 15:20:42 -0000 Quoting Ivan Voras (from Fri, 11 Jun 2010 14:04:24 +0200): > On 06/10/10 16:26, Alexander Leidinger wrote: >> Hi, >> >> as there seems to be interest in a periodic script to scrub zpools, I >> modified my monthly-POC into a daily script with parameters for which >> pools to scrub, how many days between scrubs (even different per pool, >> if required), and several error checks (non-existing pool specified, >> scrub in progress). 
>>
>> You can find it at
>> http://www.Leidinger.net/FreeBSD/current-patches/600.scrub-zfs
>>
>> Please put it into /etc/periodic/daily and test it. Possible
>> periodic.conf variables are:
>> daily_scrub_zfs_enable="YES"
>> daily_scrub_zfs_pools="name1 name2 name3" # all if unset or empty
>> daily_scrub_zfs_default_threshold="" # default: 30
>> daily_scrub_zfs__threshold=""
>>
>> If there is no specific threshold for a pool (= days between scrubs),
>> the default threshold is used.
>
> Fairly good and useful, but could you add a small check of "zpool
> status" information before scrubbing that would a) complain LOUDLY AND
> VISIBLY if a previous scrub failed and b) skip issuing a new scrub
> command if there is such an error, to avoid stressing possibly broken
> hardware?

Can you please provide an example of such a failed scrub?

Things I fixed so far:
- use the creation time of the pool if no scrub was done before
- rename the script via s/600/800/ (this is an I/O-intensive task and we
  want to have this done late in the periodic run, so that other stuff is
  not slowed down too much)

Bye,
Alexander.

--
Winning isn't everything, but losing isn't anything.

http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137

From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 15:53:11 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A3E51065670 for ; Fri, 11 Jun 2010 15:53:11 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 499D48FC12 for ; Fri, 11 Jun 2010 15:53:11 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id ECFC146C13; Fri, 11 Jun 2010 11:53:10 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id EF5B78A03C; Fri, 11 Jun 2010 11:53:09 -0400 (EDT) From: John Baldwin To: freebsd-fs@freebsd.org Date: Fri, 11 Jun 2010 11:51:21 -0400 User-Agent: KMail/1.12.1 (FreeBSD/7.3-CBSD-20100217; KDE/4.3.1; amd64; ; ) References: <20100611103803.GA1855@pm513-1.comsys.ntu-kpi.kiev.ua> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201006111151.21925.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Fri, 11 Jun 2010 11:53:10 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Subject: Re: Testers: NFSv3 support for pxeboot for nfs diskless root X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 15:53:11 -0000

On Friday 11 June 2010 11:07:57 am Rick Macklem wrote:
> > On Fri, 11 Jun 2010, Andrey Simonenko wrote:
> > >
> > > Shouldn't return values from malloc() calls be checked?
> >
> > Yea, I suppose that's a good idea, although I think all that can be
> > done is print a failure message, since it's "dead in the water" at
> > that point.
> > > Also additional checks for NULL values before free() calls can be removed, > > at least this will reduce size of code. There is PR/83424 related to this. > > > My only concern here would be if someone were to change Free() so it > doesn't check for a null pointer, but since it does now, I suppose > it's a feature and shouldn't be changed. > > Anyone else have an opinion on this? (ie. Whether I should just assume > that Free() checks for the NULL ptr.) free() in the kernel and userland also check for NULL, so I think it's ok to assume the same behavior for libstand. -- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 16:12:08 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 30CCA106564A for ; Fri, 11 Jun 2010 16:12:08 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.155]) by mx1.freebsd.org (Postfix) with ESMTP id 7F1B78FC0C for ; Fri, 11 Jun 2010 16:12:06 +0000 (UTC) Received: by fg-out-1718.google.com with SMTP id d23so246201fga.13 for ; Fri, 11 Jun 2010 09:12:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:sender:received :in-reply-to:references:from:date:x-google-sender-auth:message-id :subject:to:cc:content-type; bh=f1RokaG63qog1hhpCLdXgKCdteQnUmKa38pqTmn1z28=; b=Zf2r3UiQ+ntNOh+NFD3keVzb44FSoMMAP8bcnryyCpMUxla8kQtBnm/AVYl5P5sQ5W RIKCwqbqyrwWLf7VOOfe6ILWcGWnhPtZktGakTNG2nD7e7FXBpLXVYiQHdsu3ASTp6g0 0AL40UIbHVyzqwo6Ejd6vkX7dZ66b5KPCeNnc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; b=Z3Y+ScGOsOVTzOp7Wmt1a+h4iRCaRPAZilUjrvK7Fb1o9ukpWf0h0PUiuCnMGGQlsV Qys4CqR4O/9BM+rMWS8IeG2h0tHuTrmk6S6uxWXd3Cg1ZdnK6b/Xs44veAUG1c0bOe1i TUG7xAJf4XTabADz15ZgKsw5F8iPh4KalzLWw= Received: by 10.216.179.138 with SMTP id h10mr1179376wem.49.1276272725571; Fri, 11 Jun 2010 09:12:05 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.216.89.197 with HTTP; Fri, 11 Jun 2010 09:11:45 -0700 (PDT) In-Reply-To: <20100611172033.42001s90ahe57oe8@webmail.leidinger.net> References: <20100610162629.38992mazf0sfdqg0@webmail.leidinger.net> <20100611172033.42001s90ahe57oe8@webmail.leidinger.net> From: Ivan Voras Date: Fri, 11 Jun 2010 18:11:45 +0200 X-Google-Sender-Auth: cBtPpfA_UZZbY1XRQP_1Mch67qI Message-ID: To: Alexander Leidinger Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org Subject: Re: CFT: periodic scrubbing of ZFS pools X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 16:12:08 -0000 On 11 June 2010 17:20, Alexander Leidinger wrote: > Quoting Ivan Voras (from Fri, 11 Jun 2010 14:04:24 > +0200): >> Fairly good and useful, but could you add a small check of "zpool >> status" information before scrubbing that would a) complain LOUDLY AND >> VISIBLY if a previous scrub failed and b) skip issuing a new scrub >> command if there is such an error, to avoid stressing possibly broken >> hardware? > > Can you please provide an example of such a failed scrub? You should probably treat any status message that doesn't have "none requested" or "scrub completed with 0 errors..." as failed. 
I could set up a gnop device with errors to prove it if you'd like :)

From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 16:33:17 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B39461065673 for ; Fri, 11 Jun 2010 16:33:17 +0000 (UTC) (envelope-from anders@FreeBSD.org) Received: from fupp.net (totem.fix.no [80.91.36.20]) by mx1.freebsd.org (Postfix) with ESMTP id 510298FC14 for ; Fri, 11 Jun 2010 16:33:16 +0000 (UTC) Received: from localhost (totem.fix.no [80.91.36.20]) by fupp.net (Postfix) with ESMTP id 5036547C36; Fri, 11 Jun 2010 18:33:15 +0200 (CEST) Received: from fupp.net ([80.91.36.20]) by localhost (totem.fix.no [80.91.36.20]) (amavisd-new, port 10024) with LMTP id apHlpcw7u46C; Fri, 11 Jun 2010 18:33:14 +0200 (CEST) Received: by fupp.net (Postfix, from userid 1000) id C22D947C35; Fri, 11 Jun 2010 18:33:14 +0200 (CEST) Date: Fri, 11 Jun 2010 18:33:14 +0200 From: Anders Nordby To: Jeremy Chadwick Message-ID: <20100611163314.GA84574@fupp.net> References: <20100608083649.GA77452@fupp.net> <20100609122517.GA16231@fupp.net> <20100610081710.GA64350@server.vk2pj.dyndns.org> <20100610110609.GA87243@fupp.net> <20100610114831.GB71432@icarus.home.lan> <20100610130307.GA33285@fupp.net> <20100610133859.GA74094@icarus.home.lan> <20100611031809.GA93666@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20100611031809.GA93666@icarus.home.lan> User-Agent: Mutt/1.4.2.3i X-PGP-Key: http://anders.fix.no/pgp/ X-PGP-Key-FingerPrint: 1E0F C53C D8DF 6A8F EAAD 19C5 D12A BC9F 0083 5956 Cc: freebsd-fs@FreeBSD.org, Peter Jeremy Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 16:33:17 -0000

Hi,

On Thu, Jun 10, 2010 at 08:18:09PM -0700, Jeremy Chadwick wrote:
>> Given that you stated FreeBSD8.1-Prerelease I think you should have the
>> patch, but please make sure that your sys/nfsserver/nfs_srvsubs.c is
>> at least r206406.

I didn't have any time to dump and look at the network traffic much yet (life is busy). But, the issue in this thread also happens/happened in FreeBSD 7.3-RELEASE, so I don't see how it's a recent change that makes this happen. Last night I made some progress: by switching to an old 100 Mbps USB NIC of mine (nerds sure do have lots of handy things at home, eh?) I got rid of the packet loss:

Jun 11 01:25:14 unixfile kernel: rue0: on usbus3
Jun 11 01:25:14 unixfile kernel: miibus2: on rue0
Jun 11 01:25:14 unixfile kernel: ruephy0: PHY 0 on miibus2

Performance is quite lousy however. Just in case, I am trying to get hold of a PCI-X Intel NIC to see how that goes, as this is a production server after all (or is supposed to be).

> With regards to possible bge(4) issues, Yong-Hyeon works on this driver
> fairly often. If it turns out to be a driver issue of some sort, he can
> probably help. Relevant commits are here (to give you some idea of
> activity):
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/bge/if_bge.c
>
> One commit caught my eye (rev 1.226.2.15), but that seems to be more
> focused on mbuf issues (your system doesn't appear to be having any,
> given your netstat -m output).
>
> CC'ing Yong-Hyeon, as he might know of some edge case where bge(4)
> could go crazy with interrupts.
:-) Yong-Hyeon, the entire thread is > here: > > http://lists.freebsd.org/pipermail/freebsd-fs/2010-June/008654.html Let me know if there's anything bge related I can try/test. It might take a day or two or more. Customer is sort of getting annoyed by these problems, so the room for testing is getting smaller. But of course I want to help get a fix for this. Regards, -- Anders. From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 18:30:43 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03B101065678; Fri, 11 Jun 2010 18:30:43 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 6F21F8FC16; Fri, 11 Jun 2010 18:30:42 +0000 (UTC) Received: by pwj1 with SMTP id 1so976169pwj.13 for ; Fri, 11 Jun 2010 11:30:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:received:from:date:to:cc :subject:message-id:reply-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=cb6N/hi4tPsh8RXsxGECMvteyup7Qjnfqn9C6dF610c=; b=mzBcOiKtVrB9Me0LSvEwQo1GWU7qzlYWqrjwwMTlx4unTzbjVR98URePda4Izr3ACC zcdqsfnRflDbq4RENz0kxCqSL7PCXoxe5+FTty8FIHmbyaoGBU9eJmiz7fLwlNWPhiSZ wLQ+tevt04A8N85P1BigIOjg+VZ2KhuTu02HI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=nXb1jkn2T7RQ+yq3o4iuziTYxoDQnpy9nsYlQbESm4Glou7RLEsssfY9UoI8vZD+il UWlBIeR9mmeC4sKEaCC53i+BJSVyaUdwRDfIHMHo2DPuKoLuKLW96wkmcAmh/5ECY87A tKPv9NIw/LJ+86CJ9LvBpbV9G8WlfL9+de2aI= Received: by 10.140.55.13 with SMTP id d13mr1674983rva.119.1276279167128; Fri, 11 Jun 2010 10:59:27 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id b12sm1435975rvn.22.2010.06.11.10.59.25 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 11 Jun 2010 10:59:26 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Fri, 11 Jun 2010 10:58:05 -0700 From: Pyun YongHyeon Date: Fri, 11 Jun 2010 10:58:05 -0700 To: Anders Nordby Message-ID: <20100611175805.GE13776@michelle.cdnetworks.com> References: <20100609122517.GA16231@fupp.net> <20100610081710.GA64350@server.vk2pj.dyndns.org> <20100610110609.GA87243@fupp.net> <20100610114831.GB71432@icarus.home.lan> <20100610130307.GA33285@fupp.net> <20100610133859.GA74094@icarus.home.lan> <20100611031809.GA93666@icarus.home.lan> <20100611163314.GA84574@fupp.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100611163314.GA84574@fupp.net> User-Agent: Mutt/1.4.2.3i Cc: freebsd-fs@freebsd.org, Peter Jeremy Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 18:30:43 -0000 On Fri, Jun 11, 2010 at 06:33:14PM +0200, Anders Nordby wrote: > Hi, > > On Thu, Jun 10, 2010 at 08:18:09PM -0700, Jeremy Chadwick wrote: > >> Given that you stated FreeBSD8.1-Prerelease I think you should have the > >> patch, but please make sure that your sys/nfsserver/nfs_srvsubs.c is > >> at least r206406. 
>
> I didn't have any time to dump and look at the network traffic much yet
> (life is busy). But, the issue in this thread also happens/happened in
> FreeBSD 7.3-RELEASE, so I don't see how it's a recent change that makes
> this happen. Last night I made some progress: by switching to an old 100
> Mbps USB NIC of mine (nerds sure do have lots of handy things at home,
> eh?) I got rid of the packet loss:
>
> Jun 11 01:25:14 unixfile kernel: rue0: 0/0, rev 1.10/1.00, addr 2> on usbus3
> Jun 11 01:25:14 unixfile kernel: miibus2: on rue0
> Jun 11 01:25:14 unixfile kernel: ruephy0: media interface> PHY 0 on miibus2
>
> Performance is quite lousy however. Just in case, I am trying to get hold
> of a PCI-X Intel NIC to see how that goes, as this is a production
> server after all (or is supposed to be).
>
> > With regards to possible bge(4) issues, Yong-Hyeon works on this driver
> > fairly often. If it turns out to be a driver issue of some sort, he can
> > probably help. Relevant commits are here (to give you some idea of
> > activity):
> >
> > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/bge/if_bge.c
> >
> > One commit caught my eye (rev 1.226.2.15), but that seems to be more
> > focused on mbuf issues (your system doesn't appear to be having any,
> > given your netstat -m output).
> >
> > CC'ing Yong-Hyeon, as he might know of some edge case where bge(4)
> > could go crazy with interrupts. :-) Yong-Hyeon, the entire thread is
> > here:
> >
> > http://lists.freebsd.org/pipermail/freebsd-fs/2010-June/008654.html
>
> Let me know if there's anything bge related I can try/test. It might
> take a day or two or more. The customer is sort of getting annoyed by
> these problems, so the room for testing is getting smaller. But of
> course I want to help get a fix for this.
>

Show me the dmesg output so I know which bge(4) controller you have. Also show me the output of "netstat -ndI bge0". Some bge(4) controllers support detailed MAC counters, which are exported via sysctl. If your controller is one of these, you can check its statistics with "sysctl dev.bge.0.stats" and post them if you can see them.

> Regards,
>
> --
> Anders.
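For convenience, those three requests collected into one snippet. This is a sketch that assumes the interface is bge1, as in Anders' earlier mails, and that the stats node exists only on controllers that export the MAC counters:

    #!/bin/sh
    # Gather the bge(4) diagnostics requested above.
    ifc=bge1                        # adjust to the bge(4) unit in use
    unit=${ifc#bge}
    grep "^${ifc}" /var/run/dmesg.boot
    netstat -ndI "${ifc}"
    # Only some bge(4) controllers export the per-MAC counters:
    sysctl "dev.bge.${unit}.stats" 2>/dev/null ||
        echo "no dev.bge.${unit}.stats node on this controller"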
From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 19:11:04 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EACAB1065674 for ; Fri, 11 Jun 2010 19:11:04 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 686888FC15 for ; Fri, 11 Jun 2010 19:11:03 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o5BJB050036792 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 11 Jun 2010 22:11:00 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o5BJB0N5060004; Fri, 11 Jun 2010 22:11:00 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o5BJAxAf060003; Fri, 11 Jun 2010 22:10:59 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 11 Jun 2010 22:10:59 +0300 From: Kostik Belousov To: Mikolaj Golub Message-ID: <20100611191059.GF13238@deviant.kiev.zoral.com.ua> References: <86mxv22ji7.fsf@zhuzha.ua1> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="wtjvnLv0o8UUzur2" Content-Disposition: inline In-Reply-To: <86mxv22ji7.fsf@zhuzha.ua1> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.6 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org Subject: Re: '#ifndef DIAGNOSTIC' in nfsclient code looks like a typo X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 19:11:05 -0000

--wtjvnLv0o8UUzur2
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Jun 11, 2010 at 09:03:44AM +0300, Mikolaj Golub wrote:
> Hi:
>
> '#ifndef DIAGNOSTIC' in sys/nfsclient/nfs_vnops.c and
> sys/fs/nfsclient/nfs_clvnops.c looks like a typo and '#ifdef' should be
> used instead (see the attached patch).

All the changes should be converted to KASSERTs. There is no point in doing "if (something) panic();" for diagnostics; use "KASSERT(something, ("panic message"));" instead.

--wtjvnLv0o8UUzur2
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (FreeBSD)

iEYEARECAAYFAkwSikMACgkQC3+MBN1Mb4ht8wCg4Lo/kk++XQFke4I56+CCH46v
O1cAnRHlBUAYSiDN3fKNYfxaT989cDOo
=qNG7
-----END PGP SIGNATURE-----

--wtjvnLv0o8UUzur2--
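As a concrete sketch of that conversion, using the first hunk of Mikolaj's patch as the example (an illustration, not a committed change): KASSERT states the condition that must hold, so the sense of the old panic test is inverted, and KASSERT is compiled in under INVARIANTS rather than DIAGNOSTIC.

    /* The DIAGNOSTIC-style check from the patch: */
    #ifdef DIAGNOSTIC
            if (uiop->uio_iovcnt != 1)
                    panic("nfs: writerpc iovcnt > 1");
    #endif

    /* The suggested KASSERT equivalent: */
            KASSERT(uiop->uio_iovcnt == 1, ("nfs: writerpc iovcnt > 1"));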
From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 23:01:21 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CBF5A106566B for ; Fri, 11 Jun 2010 23:01:21 +0000 (UTC) (envelope-from anders@FreeBSD.org) Received: from fupp.net (totem.fix.no [80.91.36.20]) by mx1.freebsd.org (Postfix) with ESMTP id 52AF38FC12 for ; Fri, 11 Jun 2010 23:01:20 +0000 (UTC) Received: from localhost (totem.fix.no [80.91.36.20]) by fupp.net (Postfix) with ESMTP id C16B347194; Sat, 12 Jun 2010 01:01:20 +0200 (CEST) Received: from fupp.net ([80.91.36.20]) by localhost (totem.fix.no [80.91.36.20]) (amavisd-new, port 10024) with LMTP id Lpuwp0wZHxGg; Sat, 12 Jun 2010 01:01:20 +0200 (CEST) Received: by fupp.net (Postfix, from userid 1000) id 545BD47193; Sat, 12 Jun 2010 01:01:20 +0200 (CEST) Date: Sat, 12 Jun 2010 01:01:20 +0200 From: Anders Nordby To: Pyun YongHyeon Message-ID: <20100611230120.GA89356@fupp.net> References: <20100609122517.GA16231@fupp.net> <20100610081710.GA64350@server.vk2pj.dyndns.org> <20100610110609.GA87243@fupp.net> <20100610114831.GB71432@icarus.home.lan> <20100610130307.GA33285@fupp.net> <20100610133859.GA74094@icarus.home.lan> <20100611031809.GA93666@icarus.home.lan> <20100611163314.GA84574@fupp.net> <20100611175805.GE13776@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20100611175805.GE13776@michelle.cdnetworks.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key: http://anders.fix.no/pgp/ X-PGP-Key-FingerPrint: 1E0F C53C D8DF 6A8F EAAD 19C5 D12A BC9F 0083 5956 Cc: freebsd-fs@freebsd.org, Peter Jeremy Subject: Re: Odd network issues on ZFS based NFS server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 23:01:21 -0000

Hi,

On Fri, Jun 11, 2010 at 10:58:05AM -0700, Pyun YongHyeon wrote:
>> Let me know if there's anything bge related I can try/test. It might
>> take a day or two or more. The customer is sort of getting annoyed by
>> these problems, so the room for testing is getting smaller. But of
>> course I want to help get a fix for this.
> Show me the dmesg output so I know which bge(4) controller you have.
> Also show me the output of "netstat -ndI bge0". Some bge(4) controllers
> support detailed MAC counters, which are exported via sysctl. If your
> controller is one of these, you can check its statistics with
> "sysctl dev.bge.0.stats" and post them if you can see them.

Since switching to the rue NIC I haven't retried bge. I also did not reboot after the problems last time; I just changed the NIC from bge1 to ue0. So I'm not sure if these numbers are interesting or if I should retry using a bge NIC, but here goes:

anders@unixfile:~$ grep ^bge1 /var/run/dmesg.boot
bge1: mem 0xfdce0000-0xfdceffff irq 26 at device 1.1 on pci3
bge1: Ethernet address: 00:16:35:03:e6:3e
bge1: [ITHREAD]

anders@unixfile:~$ netstat -ndI bge1
Name    Mtu Network Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll  Drop
bge1*  1500         00:16:35:03:e6:3e 21417404     0     0 20313076     0     0     0

anders@unixfile:~$ sysctl dev.bge.1.stats
dev.bge.1.stats.FramesDroppedDueToFilters: 0
dev.bge.1.stats.DmaWriteQueueFull: 34
dev.bge.1.stats.DmaWriteHighPriQueueFull: 0
dev.bge.1.stats.NoMoreRxBDs: 0
dev.bge.1.stats.InputDiscards: 0
dev.bge.1.stats.InputErrors: 0
dev.bge.1.stats.RecvThresholdHit: 12086131
dev.bge.1.stats.DmaReadQueueFull: 957280
dev.bge.1.stats.DmaReadHighPriQueueFull: 4835
dev.bge.1.stats.SendDataCompQueueFull: 0
dev.bge.1.stats.RingSetSendProdIndex: 20515417
dev.bge.1.stats.RingStatusUpdate: 20492506
dev.bge.1.stats.Interrupts: 20492506
dev.bge.1.stats.AvoidedInterrupts: 0
dev.bge.1.stats.SendThresholdHit: 0
dev.bge.1.stats.rx.Octets: 0
dev.bge.1.stats.rx.Fragments: 0
dev.bge.1.stats.rx.UcastPkts: 0
dev.bge.1.stats.rx.MulticastPkts: 0
dev.bge.1.stats.rx.FCSErrors: 0
dev.bge.1.stats.rx.AlignmentErrors: 0
dev.bge.1.stats.rx.xonPauseFramesReceived: 0
dev.bge.1.stats.rx.xoffPauseFramesReceived: 0
dev.bge.1.stats.rx.ControlFramesReceived: 0
dev.bge.1.stats.rx.xoffStateEntered: 0
dev.bge.1.stats.rx.FramesTooLong: 0
dev.bge.1.stats.rx.Jabbers: 0
dev.bge.1.stats.rx.UndersizePkts: 0
dev.bge.1.stats.rx.inRangeLengthError: 0
dev.bge.1.stats.rx.outRangeLengthError: 0
dev.bge.1.stats.tx.Octets: 0
dev.bge.1.stats.tx.Collisions: 0
dev.bge.1.stats.tx.XonSent: 0
dev.bge.1.stats.tx.XoffSent: 0
dev.bge.1.stats.tx.flowControlDone: 0
dev.bge.1.stats.tx.InternalMacTransmitErrors: 0
dev.bge.1.stats.tx.SingleCollisionFrames: 0
dev.bge.1.stats.tx.MultipleCollisionFrames: 0
dev.bge.1.stats.tx.DeferredTransmissions: 0
dev.bge.1.stats.tx.ExcessiveCollisions: 0
dev.bge.1.stats.tx.LateCollisions: 0
dev.bge.1.stats.tx.UcastPkts: 0
dev.bge.1.stats.tx.MulticastPkts: 0
dev.bge.1.stats.tx.BroadcastPkts: 0
dev.bge.1.stats.tx.CarrierSenseErrors: 0
dev.bge.1.stats.tx.Discards: 0
dev.bge.1.stats.tx.Errors: 0

Regards,

--
Anders.
From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 23:19:21 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ADA39106567D for ; Fri, 11 Jun 2010 23:19:21 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 46D638FC20 for ; Fri, 11 Jun 2010 23:19:21 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id 6EE966ABBB; Thu, 10 Jun 2010 09:40:54 +0000 (UTC) Message-ID: <4C10B526.4040908@soupacific.com> Date: Thu, 10 Jun 2010 18:49:26 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <20100416065126.GG1705@garage.freebsd.pl> <4BCD3979.8050107@soupacific.com> <4BCD5AD7.8070502@soupacific.com> <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> In-Reply-To: <4BD17B0D.5080601@soupacific.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 23:19:21 -0000

Thanks for adding timeout support; it works great on 9.0. One of two servers shut down; then, rebooting only one server, it works as primary.

And now I try to run HAST on FreeBSD 8.0. Exact same configuration, but something is wrong.

On the primary server:

sv01A# hastctl create zfshast
sv01A# hastd
sv01A# hastctl role primary zfshast

On the secondary:

sv01B# hastctl create zfshast
sv01B# hastd
sv01B# hastctl role secondary zfshast

Then the secondary shows the following:

Jun ..... [zfshast] (secondary) Unable to receive request header: socket is not connected.
Jun ..... [zfshast] (secondary) worker process exited

I checked and found that the proto_recv() function always returns "socket is not connected". sv01A and sv01B both look like they are working; from before "hastctl role secondary zfshast", hastd shows:

Jun.... sv01B hastd: [zfshast] (init) we act as init for the resource and not as secondary as requested by tcp4://192.168.0.240:56279

The above message is shown twice. hast.conf is:

#global section
control /var/run/hastctl
listen tcp:/0.0.0.0.:8547
## timeout 50

resource zfshast {
    on sv01A {
        local /dev/ad8
        remote 192.168.0.241
    }
    on sv01B {
        local /dev/ad8
        remote 192.168.0.240
    }
}

I changed the timeout value, but there is no difference. 8.1 gives the same result. What shall I do?

Thanks, Hiroshi

P.S. I have to change the postfix IP soon.
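For reference, a cleaned-up version of that configuration. This sketch rests on two assumptions: that the single slash and the stray dot in the listen address are transcription typos (hastd's own log lines above use the tcp4:// form), and that the non-default port 8547 is intentional; the addresses and devices are kept exactly as posted.

    # global section
    control /var/run/hastctl
    listen tcp4://0.0.0.0:8547

    resource zfshast {
        on sv01A {
            local /dev/ad8
            remote 192.168.0.241
        }
        on sv01B {
            local /dev/ad8
            remote 192.168.0.240
        }
    }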
From owner-freebsd-fs@FreeBSD.ORG Fri Jun 11 23:44:21 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B03B5106567E for ; Fri, 11 Jun 2010 23:44:21 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 48A5D8FC16 for ; Fri, 11 Jun 2010 23:44:20 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id ED45A6B30E; Fri, 11 Jun 2010 11:46:57 +0000 (UTC) Message-ID: <4C122435.4020409@soupacific.com> Date: Fri, 11 Jun 2010 20:55:33 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <20100416065126.GG1705@garage.freebsd.pl> <4BCD3979.8050107@soupacific.com> <4BCD5AD7.8070502@soupacific.com> <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> In-Reply-To: <4C10B526.4040908@soupacific.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-fs@freebsd.org Subject: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Jun 2010 23:44:21 -0000

Thanks for adding timeout support; it works great on 9.0. One of two servers shut down; then, rebooting only one server, it works as primary.

And now I try to run HAST on FreeBSD 8.0. Exact same configuration, but something is wrong.

On the primary server:

sv01A# hastctl create zfshast
sv01A# hastd
sv01A# hastctl role primary zfshast

On the secondary:

sv01B# hastctl create zfshast
sv01B# hastd
sv01B# hastctl role secondary zfshast

Then the secondary shows the following:

Jun ..... [zfshast] (secondary) Unable to receive request header: socket is not connected.
Jun ..... [zfshast] (secondary) worker process exited

I checked and found that the proto_recv() function always returns "socket is not connected", except on the first loop of recv_thread(). sv01A and sv01B both look like they are working; from before "hastctl role secondary zfshast", hastd shows:

Jun.... sv01B hastd: [zfshast] (init) we act as init for the resource and not as secondary as requested by tcp4://192.168.0.240:56279

The above message is shown twice. hast.conf is:

#global section
control /var/run/hastctl
listen tcp:/0.0.0.0.:8547
## timeout 50

resource zfshast {
    on sv01A {
        local /dev/ad8
        remote 192.168.0.241
    }
    on sv01B {
        local /dev/ad8
        remote 192.168.0.240
    }
}

I changed the timeout value, but there is no difference. 8.1 gives the same result. What shall I do?

Thanks, Hiroshi

P.S. This is a second mail, caused by a postfix failure.
From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 06:02:28 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9992E106566B; Sat, 12 Jun 2010 06:02:28 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 2A73C8FC0A; Sat, 12 Jun 2010 06:02:27 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id 0A98C6B638; Sat, 12 Jun 2010 05:53:50 +0000 (UTC) Message-ID: <4C1322EE.8070704@soupacific.com> Date: Sat, 12 Jun 2010 15:02:22 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: Pawel Jakub Dawidek , freebsd-fs@freebsd.org References: <20100416065126.GG1705@garage.freebsd.pl><4BCD3979.8050107@soupacific.com> <4BCD5AD7.8070502@soupacific.com><4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com><4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com><4BD0E432.1000108@soupacific.com><20100423061521.GC1670@garage.freebsd.pl><4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> <4C122435.4020409@soupacific.com> In-Reply-To: <4C122435.4020409@soupacific.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 06:02:28 -0000

I added some log messages to trace the trouble of HAST on 8.1. The modified code is:

/*
 * Thread receives requests from the primary node.
 */
static void *
recv_thread(void *arg)
{
	struct hast_resource *res = arg;
	struct hio *hio;
	bool wakeup;

	pjdlog_warning("recv_thread");
	for (;;) {
		pjdlog_debug(2, "recv: Taking free request.");
		mtx_lock(&hio_free_list_lock);
		while ((hio = TAILQ_FIRST(&hio_free_list)) == NULL) {
			pjdlog_debug(2, "recv: No free requests, waiting.");
			cv_wait(&hio_free_list_cond, &hio_free_list_lock);
		}
		TAILQ_REMOVE(&hio_free_list, hio, hio_next);
		mtx_unlock(&hio_free_list_lock);
		pjdlog_debug(2, "recv: (%p) Got request.", hio);
		pjdlog_warning("wooooo");
		if (hast_proto_recv_hdr(res->hr_remotein, &hio->hio_nv) < 0) {
			pjdlog_exit(EX_TEMPFAIL,
			    "Unable to receive request header.");
		}
		if (requnpack(res, hio) != 0) {
			pjdlog_warning("requnpack");
			goto send_queue;
		}
		reqlog(LOG_DEBUG, 2, -1, hio,
		    "recv: (%p) Got request header: ", hio);
		if (hio->hio_cmd == HIO_WRITE) {
			if (hast_proto_recv_data(res, res->hr_remotein,
			    hio->hio_nv, hio->hio_data, MAXPHYS) < 0) {
				pjdlog_exit(EX_TEMPFAIL,
				    "Unable to receive reply data");
			}
			pjdlog_warning("HIO_WRITE");
		}
		pjdlog_debug(2, "recv: (%p) Moving request to the disk queue.",
		    hio);
		mtx_lock(&hio_disk_list_lock);
		wakeup = TAILQ_EMPTY(&hio_disk_list);
		TAILQ_INSERT_TAIL(&hio_disk_list, hio, hio_next);
		mtx_unlock(&hio_disk_list_lock);
		if (wakeup) {
			pjdlog_warning("wakeup");
			cv_signal(&hio_disk_list_cond);
		}
		continue;
send_queue:
		pjdlog_debug(2, "recv: (%p) Moving request to the send queue.",
		    hio);
		mtx_lock(&hio_send_list_lock);
		wakeup = TAILQ_EMPTY(&hio_send_list);
		TAILQ_INSERT_TAIL(&hio_send_list, hio, hio_next);
		mtx_unlock(&hio_send_list_lock);
		if (wakeup)
			cv_signal(&hio_send_list_cond);
	}
	/* NOTREACHED */
	return (NULL);
}

/*
 * Thread sends requests back to primary node.
 */
static void *
send_thread(void *arg)
{
	struct hast_resource *res = arg;
	struct nv *nvout;
	struct hio *hio;
	void *data;
	size_t length;
	bool wakeup;

	for (;;) {
		pjdlog_warning("send_thread for loop");
		pjdlog_debug(2, "send: Taking request.");
		mtx_lock(&hio_send_list_lock);
		while ((hio = TAILQ_FIRST(&hio_send_list)) == NULL) {
			pjdlog_debug(2, "send: No requests, waiting.");
			cv_wait(&hio_send_list_cond, &hio_send_list_lock);
		}
		TAILQ_REMOVE(&hio_send_list, hio, hio_next);
		mtx_unlock(&hio_send_list_lock);

The 9.0 logs show:

Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) send_thread for loop
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) HIO_WRITE
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) wakup
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) woooo
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) send_thread for loop
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) HIO_WRITE
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) wakup
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) woooo
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) send_thread for loop
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) HIO_WRITE
Jun 12 12:49:33 fw01B hastd: [zfshast] (secondary) wakup

repeated forever.

8.1:

Jun 12 14:07:18 sv01B hastd: [zfshast] (init) We act as init for the resource and not as secondary as requested by tcp4://192.168.0.240:59254.
Jun 12 14:07:23 sv01B hastd: [zfshast] (init) We act as init for the resource and not as secondary as requested by tcp4://192.168.0.240:56349.
Jun 12 14:07:28 sv01B hastd: [zfshast] (secondary) recv_thread
Jun 12 14:07:28 sv01B hastd: [zfshast] (secondary) send_thread for loop
Jun 12 14:07:28 sv01B hastd: [zfshast] (secondary) wooooo
Jun 12 14:07:28 sv01B hastd: [zfshast] (secondary) HIO_WRITE
Jun 12 14:07:28 sv01B hastd: [zfshast] (secondary) wakeup
Jun 12 14:07:28 sv01B hastd: [zfshast] (secondary) wooooo
Jun 12 14:07:28 sv01B hastd: [zfshast] (secondary) send_thread for loop
Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected.
Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=757, exitcode=75).
Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) recv_thread Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) send_thread for loop Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) wooooo Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) HIO_WRITE Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) wakeup Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) wooooo Jun 12 14:07:33 sv01B hastd: [zfshast] (secondary) send_thread for loop Jun 12 14:07:38 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. I hope this simple trace gives you some idea. Thanks Hiroshi From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 08:23:01 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4B3E8106566B for ; Sat, 12 Jun 2010 08:23:01 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (core.vx.sk [188.40.32.143]) by mx1.freebsd.org (Postfix) with ESMTP id 8008A8FC1A for ; Sat, 12 Jun 2010 08:23:00 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 70947162D7 for ; Sat, 12 Jun 2010 10:22:59 +0200 (CEST) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id hkLYKna0c+rp for ; Sat, 12 Jun 2010 10:22:57 +0200 (CEST) Received: from [10.9.8.1] (chello089173000055.chello.sk [89.173.0.55]) by mail.vx.sk (Postfix) with ESMTPSA id D5519162C8 for ; Sat, 12 Jun 2010 10:22:56 +0200 (CEST) Message-ID: <4C1343E2.1000102@FreeBSD.org> Date: Sat, 12 Jun 2010 10:22:58 +0200 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk; rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.5.0 MIME-Version: 1.0 To: freebsd-fs@FreeBSD.org X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=windows-1250 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Subject: ZFS vendor bugfix patches on my TODO-list X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 08:23:01 -0000 Here is a list of not yet committed vendor ZFS patches on my todo list. All patches are from OpenSolaris and fix known bugs. delphij@ has already reviewed these; I am currently waiting for pjd's final words. If you run into any of these issues please try the corresponding patch. All 9 patches bundled: http://people.freebsd.org/~mm/patches/zfs/8.1/head-aggregated.patch Individual patches and problem descriptions: 1. http://people.freebsd.org/~mm/patches/zfs/8.1/head-8890.patch Synopsis: Unable to remove a file over NFS after hitting refquota limit Bug-ID: 6798878 Onnv revision: 8890:8c2bd5f17bf2 2. http://people.freebsd.org/~mm/patches/zfs/8.1/head-9409.patch Synopsis: zfs destroy fails to free object in open context, stops up txg train Bug-ID: 6809683 Onnv revision: 9409:9dc3f17354ed 3. http://people.freebsd.org/~mm/patches/zfs/8.1/head-9434.patch Synopsis: incomplete resilvering after disk replacement (raidz) Bug-ID: 6794570 Onnv revision: 9434:3bebded7c76a 4. http://people.freebsd.org/~mm/patches/8.1/head-9722.patch Synopsis: vdev_probe() starvation brings txg train to a screeching halt Bug-ID: 6844069 Onnv revision: 9722:e3866bad4e96 5.
http://people.freebsd.org/~mm/patches/zfs/8.1/head-9774.patch Synopsis: ZFS panic deadlock: cycle in blocking chain via zfs_zget Bug-ID: 6788152 Onnv revision: 9774:0bb234ab2287 6. http://people.freebsd.org/~mm/patches/zfs/8.1/head-9997.patch Synopsis: zpool resilver stalls with spa_scrub_thread in a 3 way deadlock Bug-ID: 6843235 Onnv revision: 9997:174d75a29a1c 7. http://people.freebsd.org/~mm/patches/zfs/8.1/head-10040.patch Synopsis: zfs panics on zpool import Bug-ID: 6857012 Onnv revision: 10040:38b25aeeaf7a 8. http://people.freebsd.org/~mm/patches/zfs/8.1/head-10295.patch Synopsis: panic in zfs_getsecattr Bug-ID: 6870564 Onnv revision: 10295:f7a18a1e9610 9. http://people.freebsd.org/~mm/patches/zfs/8.1/head-10839.patch Synopsis: arc_read_done may try to byteswap undefined data (sparc-related) Bug-ID: 6836714 Onnv revision: 10839:cf83b553a2ab Cheers, mm From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 10:47:50 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C7AAE1065679 for ; Sat, 12 Jun 2010 10:47:50 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 0DACB8FC1D for ; Sat, 12 Jun 2010 10:47:48 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 9C03845CBA; Sat, 12 Jun 2010 12:47:46 +0200 (CEST) Received: from localhost (gate.wheel.pl [10.0.0.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 8F588456B1; Sat, 12 Jun 2010 12:47:40 +0200 (CEST) Date: Sat, 12 Jun 2010 12:47:32 +0200 From: Pawel Jakub Dawidek To: "hiroshi@soupacific.com" Message-ID: <20100612104336.GA2253@garage.freebsd.pl> References: <4BCD3979.8050107@soupacific.com> <4BCD5AD7.8070502@soupacific.com> <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="UHN/qo2QbUvPLonB" Content-Disposition: inline In-Reply-To: <4C10B526.4040908@soupacific.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT amd64 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00, TO_ADDRESS_EQ_REAL autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 10:47:50 -0000

On Thu, Jun 10, 2010 at 06:49:26PM +0900, hiroshi@soupacific.com wrote:
> Thanks for supporting the timeout; it works great on 9.0.
> One of two servers shut down; then, rebooting only one server, it works as
> primary.
>
>
> And now I try to run HAST on FreeBSD 8.0.

Is this 8.0 or 8-STABLE?
> Exact same configuration but something wrong.
>
> On Primary server
> sv01A#hastctl crate zfshast
> sv01A#hastd
> sv01A#hastctl role primary zfshast
>
> On secondary
>
> sv01B#hastctl create zfshast
> sv01B#hastd
> sv01B#hastctl role secondary zfshast
>
> Then
> Secondary shows following
>
> Jun ..... [zfshast] (secondary) Unable to receive request header: socket
> is not connected.
> Jun...... [zfshast] (secondary) worker process exited
>
> I checked and found the proto_recv() function always returns socket is not
> connected.
>
> sv01A and sv01B look to be working, since before hastctl role secondary
> zfshast.
>
> hastd shows
> Jun.... sv01B hastd: [zfshast] (init) we act as init for the resource
> and not as secondary as requested by tcp4://192.168.0.240:56279

So is the resource configured as secondary or not? Could you stop hastd and start it manually with debug turned on? Don't forget to mark it as secondary. -- Pawel Jakub Dawidek http://www.wheelsystems.com pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 11:43:34 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 990961065670; Sat, 12 Jun 2010 11:43:34 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 222F68FC19; Sat, 12 Jun 2010 11:43:33 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id 98E316B702; Sat, 12 Jun 2010 11:34:56 +0000 (UTC) Message-ID: <4C1372E0.1000903@soupacific.com> Date: Sat, 12 Jun 2010 20:43:28 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4BCD3979.8050107@soupacific.com> <4BCD5AD7.8070502@soupacific.com> <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> <20100612104336.GA2253@garage.freebsd.pl> In-Reply-To: <20100612104336.GA2253@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 11:43:34 -0000 Thanks for your quick response! > Is this 8.0 or 8-STABLE? First I did it on 8.0-Release and have now csup'ed to 8.1-Prerelease. Both behave the same. hastd -dd log info: The following is the debug.log. Sorry it's a bit long! 9.0 current: Jun 12 18:53:52 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457.
Jun 12 18:53:52 fw01B hastd: tcp4://192.168.0.240:32772: resource=zfshast Jun 12 18:53:57 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:53:57 fw01B hastd: tcp4://192.168.0.240:35907: resource=zfshast Jun 12 18:54:02 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:54:02 fw01B hastd: tcp4://192.168.0.240:41046: resource=zfshast Jun 12 18:54:07 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:54:07 fw01B hastd: tcp4://192.168.0.240:24170: resource=zfshast Jun 12 18:54:12 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:54:12 fw01B hastd: tcp4://192.168.0.240:58260: resource=zfshast Jun 12 18:54:17 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:54:17 fw01B hastd: tcp4://192.168.0.240:62353: resource=zfshast Jun 12 18:54:22 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:54:22 fw01B hastd: tcp4://192.168.0.240:45572: resource=zfshast Jun 12 18:54:27 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:54:27 fw01B hastd: tcp4://192.168.0.240:40139: resource=zfshast Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) Initial connection from tcp4://192.168.0.240:40139. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) Incoming connection from tcp4://192.168.0.240:40139 configured. Jun 12 18:54:27 fw01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 18:54:27 fw01B hastd: tcp4://192.168.0.240:16787: resource=zfshast Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) Outgoing connection to tcp4://192.168.0.240:16787 configured. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) Obtained info about /dev/ad4p4. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) Locked /dev/ad4p4. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: No requests, waiting. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x8014132e0) Got request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x8014132e0) Got request header: WRITE(0, 131072). Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x8014132e0) Moving request to the disk queue. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x801413290) Got request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) Local activemap cleared. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: (0x8014132e0) Got request: WRITE(0, 131072). Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: (0x8014132e0) Moving request to the send queue. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: No requests, waiting. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: (0x8014132e0) Got request: WRITE(0, 131072). Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: (0x8014132e0) Moving request to the free queue. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x801413290) Got request header: WRITE(131072, 131072). 
Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x801413290) Moving request to the disk queue. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x801413240) Got request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: (0x801413290) Got request: WRITE(131072, 131072). Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: (0x801413290) Moving request to the send queue. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: No requests, waiting. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: (0x801413290) Got request: WRITE(131072, 131072). Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) disk: (0x801413290) Moving request to the free queue. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 18:54:27 fw01B hastd: [zfshast] (secondary) recv: (0x801413240) Got request header: WRITE(262144, 131072). 8.1-Prerelease Jun 12 20:00:07 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 20:00:07 sv01B hastd: tcp4://192.168.0.240:63762: resource=zfshast Jun 12 20:00:12 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 20:00:12 sv01B hastd: tcp4://192.168.0.240:22890: resource=zfshast Jun 12 20:00:17 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 20:00:17 sv01B hastd: tcp4://192.168.0.240:36449: resource=zfshast Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) Initial connection from tcp4://192.168.0.240:36449. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) Incoming connection from tcp4://192.168.0.240:36449 configured. Jun 12 20:00:17 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 20:00:17 sv01B hastd: tcp4://192.168.0.240:39312: resource=zfshast Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) Outgoing connection to tcp4://192.168.0.240:39312 configured. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) Obtained info about /dev/ad4p4. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) Locked /dev/ad4p4. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) disk: No requests, waiting. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) recv: (0x8011f52e0) Got request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) recv: (0x8011f52e0) Got request header: WRITE(0, 131072). Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) recv: (0x8011f52e0) Moving request to the disk queue. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) recv: (0x8011f5290) Got request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) Local activemap cleared. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) disk: (0x8011f52e0) Got request: WRITE(0, 131072). Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) disk: (0x8011f52e0) Moving request to the send queue. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) disk: No requests, waiting. 
Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) send: (0x8011f52e0) Got request: WRITE(0, 131072). Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) disk: (0x8011f52e0) Moving request to the free queue. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 20:00:17 sv01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 20:00:22 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 20:00:22 sv01B hastd: tcp4://192.168.0.240:55777: resource=zfshast Jun 12 20:00:22 sv01B hastd: [zfshast] (secondary) Initial connection from tcp4://192.168.0.240:55777. Jun 12 20:00:22 sv01B hastd: [zfshast] (secondary) Worker process exists (pid=768), stopping it. Jun 12 20:00:22 sv01B hastd: [zfshast] (secondary) Incoming connection from tcp4://192.168.0.240:55777 configured. Jun 12 20:00:22 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 20:00:22 sv01B hastd: tcp4://192.168.0.240:38559: resource=zfshast

> So is the resource configured as secondary or not?
> Could you stop hastd and start it manually with debug turned on?
> Don't forget to mark it as secondary.

Yes, I did it as secondary manually. Thanks Hiroshi From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 12:03:52 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8EDF61065674; Sat, 12 Jun 2010 12:03:52 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id EBC0F8FC12; Sat, 12 Jun 2010 12:03:51 +0000 (UTC) Received: by iwn7 with SMTP id 7so2757516iwn.13 for ; Sat, 12 Jun 2010 05:03:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:openpgp:content-type:content-transfer-encoding; bh=9ZatU3tWlk1pC8R6lFZNJj8e/M82Zd+KnVRDq22mb5o=; b=tvPziFEhqbhjsVaTP7FGqXQVlR3JxArtPeT4FJzjY8VMKBFzdBB/qsoqvh5G9rf87D kqw8p/5mtq6P1Sfn0AYcYCbpy40FlePF0JbCDq5BCvxRKCPLW+sW2K9K4Gv0oXqJK/br X/8ra6nJovZLYTDuP6aapmXcaMOOWO8+A0HMg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:openpgp:content-type :content-transfer-encoding; b=L5iZBofDC7KuP48FPFxwDTsuiqUTEzAjKXyj2uoqOZIK1iySp2EYePF9bWglnjbA7h knX876JGi61jG+RjrjAByoTzweAXCgIl4/gAvXBf+z+Wy8h3b7wB7bnb+aigTGf9Lbl8 HwZdyC//LPQOr8ETOFyAMD1BwC14b8xoTwuMo= Received: by 10.231.139.21 with SMTP id c21mr3176762ibu.160.1276344230877; Sat, 12 Jun 2010 05:03:50 -0700 (PDT) Received: from centel.dataix.local (adsl-99-181-128-180.dsl.klmzmi.sbcglobal.net [99.181.128.180]) by mx.google.com with ESMTPS id b3sm10149581ibf.13.2010.06.12.05.03.48 (version=SSLv3 cipher=RC4-MD5); Sat, 12 Jun 2010 05:03:49 -0700 (PDT) Sender: "J.
Hellenthal" Message-ID: <4C1377A3.2040408@dataix.net> Date: Sat, 12 Jun 2010 08:03:47 -0400 From: jhell User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.9) Gecko/20100515 Thunderbird MIME-Version: 1.0 To: Ivan Voras References: <20100610162629.38992mazf0sfdqg0@webmail.leidinger.net> <20100611172033.42001s90ahe57oe8@webmail.leidinger.net> In-Reply-To: X-Enigmail-Version: 1.0.1 OpenPGP: id=89D8547E Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Alexander Leidinger Subject: Re: CFT: periodic scrubbing of ZFS pools X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 12:03:52 -0000

On 06/11/2010 12:11, Ivan Voras wrote:
> On 11 June 2010 17:20, Alexander Leidinger wrote:
>> Quoting Ivan Voras (from Fri, 11 Jun 2010 14:04:24
>> +0200):
>
>>> Fairly good and useful, but could you add a small check of "zpool
>>> status" information before scrubbing that would a) complain LOUDLY AND
>>> VISIBLY if a previous scrub failed and b) skip issuing a new scrub
>>> command if there is such an error, to avoid stressing possibly broken
>>> hardware?
>>
>> Can you please provide an example of such a failed scrub?
>
> You should probably treat any status message that doesn't have "none
> requested" or "scrub completed with 0 errors..." as failed.

I disagree with this, as it conflicts with your previous request:

none requested = no error, and the next scrub should be allowed
scrub completed with 0 errors = no errors

Why shouldn't the next scrub that is being determined in the script take place if there are no errors? I would only see doing this if you wanted the scrub to be performed once and never again thereafter.

On the other hand, I do agree that if the scrub status has any form of [fF][aA][iI][lL] in it then the scrub should not be performed, and likewise if one of the devices is [fF][aA][uU][lL][tT][eE][dD] or in some other such state.
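A minimal sketch of such a guard, assuming the status wording discussed above (the exact zpool status strings vary between ZFS versions, so the patterns are illustrative only, and the pool name "exports" is simply the one from the earlier example):

#!/bin/sh
# Sketch: complain loudly and skip the scrub when the pool status looks bad.
pool=${1:-exports}
status=$(zpool status "$pool" 2>&1)
case "$status" in
*[fF][aA][iI][lL]*|*FAULTED*|*DEGRADED*)
	# a) complain LOUDLY AND VISIBLY ...
	echo "WARNING: skipping scrub of $pool, status reports a problem:" >&2
	echo "$status" >&2
	# b) ... and skip issuing a new scrub command
	exit 1
	;;
esac
zpool scrub "$pool"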
Regards, -- jhell From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 12:17:22 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 988371065676 for ; Sat, 12 Jun 2010 12:17:22 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 52CFF8FC1C for ; Sat, 12 Jun 2010 12:17:22 +0000 (UTC) Received: by iwn7 with SMTP id 7so2768066iwn.13 for ; Sat, 12 Jun 2010 05:17:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:openpgp:content-type:content-transfer-encoding; bh=xhE12Tler9mfTf7hv2jzvQaIo7S/bN9OXWeTKiGwpZc=; b=cAaZ1TWSIn6Oj48QufzHobzbmKiX08msW40WQc6nr4HfAgKisP7wnbI7TW49rkAXqT GsF3kebYuq1krZlBZ1gLSP5ovTcC6XrgCkhwGiXNDuKNCWnF1Q+IE5gsMNrvMacR/uCW 2RQcYrcIC4tteRFVnaRLldizsUfKBC6B7kzRk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:openpgp:content-type :content-transfer-encoding; b=bYd7Xa2NbvM8QtW7qwA8cyKYiI36xtFzRQ1Z6RBNDRohpp2GlsrvQ8COyhbkB+wCHK a9lFe8Ew8f7Kfhwf3/sU1H0HZ9CuSh13noccWVxsgUkvPh5oZB7zIORp8AxODGhyrUqa mG0qiUG+j5jQWMfS9bZ0TReEW9RooiLW7Lhd4= Received: by 10.231.168.129 with SMTP id u1mr3302111iby.49.1276345040957; Sat, 12 Jun 2010 05:17:20 -0700 (PDT) Received: from centel.dataix.local (adsl-99-181-128-180.dsl.klmzmi.sbcglobal.net [99.181.128.180]) by mx.google.com with ESMTPS id a8sm10195975ibi.11.2010.06.12.05.17.18 (version=SSLv3 cipher=RC4-MD5); Sat, 12 Jun 2010 05:17:19 -0700 (PDT) Sender: "J. Hellenthal" Message-ID: <4C137ACE.9080900@dataix.net> Date: Sat, 12 Jun 2010 08:17:18 -0400 From: jhell User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.9) Gecko/20100515 Thunderbird MIME-Version: 1.0 To: Alexander Leidinger References: <20100609162627.11355zjzwnf7nj8k@webmail.leidinger.net> <4C0FAE2A.7050103@dataix.net> <4C0FB1DE.9080508@dataix.net> <20100610115324.10161biomkjndvy8@webmail.leidinger.net> <20100610173825.164930ekkryr5tes@webmail.leidinger.net> <4C1138D0.7070901@dataix.net> <20100611104219.51344ag1ah7br4kk@webmail.leidinger.net> In-Reply-To: <20100611104219.51344ag1ah7br4kk@webmail.leidinger.net> X-Enigmail-Version: 1.0.1 OpenPGP: id=89D8547E Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: fs@freebsd.org Subject: Re: Do we want a periodic script for a zfs scrub? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 12:17:22 -0000

On 06/11/2010 04:42, Alexander Leidinger wrote:
: #!/bin/sh
:
: lastscrub=$(zpool history exports |grep scrub |tail -1 |cut -f1 -d.)
: todayjul=$(date -j -f "%Y-%m-%d" "+%j" $(date "+%Y-%m-%d"))
: scrubjul=$(date -j -f "%Y-%m-%d" "+%j" $lastscrub)
:
: echo $lastscrub Last Scrub From zpool history
: echo $todayjul Today converted to julian
: echo $scrubjul Last scrub converted to julian
:
: expired=$(($todayjul-$scrubjul))
>
> Apart from the fact that we can do this with one $(( ))... what happens
> if/when time_t is extended to 64 bits on 32 bit platforms? Can we get
> into trouble with the shell-arithmetic or not?
> It depends upon the bit-size of the shell integers, and the signedness
> of them. Jilles (our shell maintainer) suggested also to use the
> seconds since epoch and I asked him the same question. I'm waiting for
> an answer from him.

I do not think this would be a problem for the script, as the script relies on date for the conversion, except for the subtraction that is taking place. If there were a problem then I believe it would have to be corrected in date(1) & possibly sh(1); I could be wrong though.

> The same concerns apply to test(1) (or the corresponding builtin) in the
> solution of Artem.

I agree.

> By calculating with days everywhere (like in my solution), I'm sure that
> it takes longer to hit a wall than by calculating with seconds since
> epoch (which can cause a problem in 2038 or during a transition when
> this problem is tackled in time_t but not here, which is not that far
> away). The off-by-one day once every 4 years shouldn't be a problem. If
> someone can assure with some nice facts, that using the seconds since
> epoch will not cause problems in the described cases, I have no problem
> to switch to use them.

I agree with this; please see the corrected example above. Another situation that came to mind, and is certainly possible, is that the system time could drop back to before the last scrub took place, causing (using the above script for example) todayjul to be less than scrubjul; that would yield a negative integer and skew the results until the system time was restored to its correct & current date & time.

if [ $todayjul -gt $scrubjul ]; then
	expired=$(($todayjul-$scrubjul))
else
	expired=$(($scrubjul-$todayjul))
fi

-- jhell From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 13:28:19 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 084E11065673; Sat, 12 Jun 2010 13:28:19 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 7433F8FC15; Sat, 12 Jun 2010 13:28:18 +0000 (UTC) Received: by iwn7 with SMTP id 7so2826055iwn.13 for ; Sat, 12 Jun 2010 06:28:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:openpgp:content-type:content-transfer-encoding; bh=BxKOviUiOtBJ1fd1sfklvc+5zpmrN2xLQkgXacW2qEQ=; b=xsyMK3DcAbEZ6dfyLRn58aX8aZ18TD7ciJzoEml+yQmrin5DVre6l5tCItPAGGeWGR SQkqYZ4iVWzBC5HC+WJwL1Xmo2uNhTd8luqfCN5m741coVvW5WQkmP7SN9k0DWmkLYty m06V+EdeKoxCadd/kthpp2lIljlhctYwT5Dc0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:openpgp:content-type :content-transfer-encoding; b=M0ZTOQShAB9LRiLk3btAJjGGONJEPoB8pujjW6GUzJ0niFGEznCKXGE3MyoioiCLFk I7AexSz5sQ4YykfbvOUZaaBZjqP1BcTnzdHiEXj+MSjX1sw9TZZSa+53z0cS3jMN9gc8 WQkFM6baQLzVJ9vgiJV7VG/b1WeGyPYoH4xa8= Received: by 10.231.190.132 with SMTP id di4mr3385990ibb.41.1276349297993; Sat, 12 Jun 2010 06:28:17 -0700 (PDT) Received: from centel.dataix.local (adsl-99-181-128-180.dsl.klmzmi.sbcglobal.net [99.181.128.180]) by mx.google.com with ESMTPS id t28sm10442362ibg.18.2010.06.12.06.28.16 (version=SSLv3 cipher=RC4-MD5); Sat, 12 Jun 2010 06:28:16 -0700
(PDT) Sender: "J. Hellenthal" Message-ID: <4C138B6F.3060107@dataix.net> Date: Sat, 12 Jun 2010 09:28:15 -0400 From: jhell User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.9) Gecko/20100515 Thunderbird MIME-Version: 1.0 To: "hiroshi@soupacific.com" References: <20100416065126.GG1705@garage.freebsd.pl> <4BCD3979.8050107@soupacific.com> <4BCD5AD7.8070502@soupacific.com> <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> <4C122435.4020409@soupacific.com> In-Reply-To: <4C122435.4020409@soupacific.com> X-Enigmail-Version: 1.0.1 OpenPGP: id=89D8547E Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 13:28:19 -0000

On 06/11/2010 07:55, hiroshi@soupacific.com wrote:
> On Primary server
> sv01A#hastctl crate zfshast

Just to be sure: are you sure the above command is intended to be "crate" and not "create"? I have seen you mention it twice now. -- jhell From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 13:35:50 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3A5C91065672; Sat, 12 Jun 2010 13:35:50 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 040438FC18; Sat, 12 Jun 2010 13:35:49 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id A0A776B779; Sat, 12 Jun 2010 13:27:12 +0000 (UTC) Message-ID: <4C138D30.30302@soupacific.com> Date: Sat, 12 Jun 2010 22:35:44 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: jhell References: <20100416065126.GG1705@garage.freebsd.pl> <4BCD3979.8050107@soupacific.com> <4BCD5AD7.8070502@soupacific.com> <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> <4C122435.4020409@soupacific.com> <4C138B6F.3060107@dataix.net> In-Reply-To: <4C138B6F.3060107@dataix.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 13:35:50 -0000

On 6/12/2010 10:28 PM, jhell wrote:
> On 06/11/2010 07:55, hiroshi@soupacific.com wrote:
>> On Primary server
>> sv01A#hastctl crate zfshast
>
> Just to be sure: are you sure the above command is intended to be
> "crate" and not "create"?
>
> I have seen you mention it twice now.

Sorry, those are mistypes! Sure, create!
Thanks Hiroshi From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 14:23:28 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4AF7B1065678 for ; Sat, 12 Jun 2010 14:23:28 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 851E88FC18 for ; Sat, 12 Jun 2010 14:23:27 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 8872345E48; Sat, 12 Jun 2010 16:23:25 +0200 (CEST) Received: from localhost (gate.wheel.pl [10.0.0.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id B010745CDC; Sat, 12 Jun 2010 16:23:20 +0200 (CEST) Date: Sat, 12 Jun 2010 16:23:11 +0200 From: Pawel Jakub Dawidek To: "hiroshi@soupacific.com" Message-ID: <20100612142311.GF2253@garage.freebsd.pl> References: <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> <20100612104336.GA2253@garage.freebsd.pl> <4C1372E0.1000903@soupacific.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="gTtJ75FAzB1T2CN6" Content-Disposition: inline In-Reply-To: <4C1372E0.1000903@soupacific.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT amd64 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00, TO_ADDRESS_EQ_REAL autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 14:23:28 -0000

On Sat, Jun 12, 2010 at 08:43:28PM +0900, hiroshi@soupacific.com wrote:
> > Is this 8.0 or 8-STABLE?
> First I did it on 8.0-Release and have now csup'ed to 8.1-Prerelease.
> Both behave the same.
>
> hastd -dd log info:
> The following is the debug.log. Sorry it's a bit long!
[...]

Could you send the debug output from the whole session, including when the problem appears? I don't see those "socket is not connected" errors in the output you sent. If possible, could you turn off line wrapping, or maybe send the debug output as an attachment? -- Pawel Jakub Dawidek http://www.wheelsystems.com pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
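One way to capture such a full, unwrapped debug session on the secondary is sketched below; it assumes the -F (foreground) and -d (debug, may be given twice) flags of hastd(8), and the log path is an arbitrary choice:

# Stop the supervised daemon, then rerun hastd in the foreground with
# extra debugging so the output is neither wrapped nor filtered by syslog.
/etc/rc.d/hastd stop                 # or kill the running hastd
hastd -F -dd > /tmp/hastd-debug.log 2>&1 &
hastctl role secondary zfshast       # don't forget to mark it as secondary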
From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 14:54:27 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BB9171065674; Sat, 12 Jun 2010 14:54:27 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 05A788FC20; Sat, 12 Jun 2010 14:54:26 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id 5450C6B7B6; Sat, 12 Jun 2010 14:45:49 +0000 (UTC) Message-ID: <4C139F9C.2090305@soupacific.com> Date: Sat, 12 Jun 2010 23:54:20 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> <20100612104336.GA2253@garage.freebsd.pl> <4C1372E0.1000903@soupacific.com> <20100612142311.GF2253@garage.freebsd.pl> In-Reply-To: <20100612142311.GF2253@garage.freebsd.pl> Content-Type: multipart/mixed; boundary="------------040800000505090801010307" Cc: freebsd-fs@freebsd.org Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 14:54:27 -0000

> Could you send the debug output from the whole session, including when
> the problem appears? I don't see those "socket is not connected"
> errors in the output you sent.

The "socket is not connected" errors are in the messages log.

> If possible, could you turn off line wrapping, or maybe send the
> debug output as an attachment?

I attach the debug.log and messages files here. Thanks Hiroshi

Content-Disposition: attachment; filename="messages"

Jun 12 23:37:33 sv01B newsyslog[433]: logfile first created Jun 12 23:37:33 sv01B syslogd: kernel boot file is /boot/kernel/kernel Jun 12 23:37:33 sv01B kernel: Copyright (c) 1992-2010 The FreeBSD Project. Jun 12 23:37:33 sv01B kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 Jun 12 23:37:33 sv01B kernel: The Regents of the University of California. All rights reserved. Jun 12 23:37:33 sv01B kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Jun 12 23:37:33 sv01B kernel: FreeBSD 8.1-PRERELEASE #3: Mon Jun 7 21:35:44 UTC 2010 Jun 12 23:37:33 sv01B kernel: root@sv01B:/usr/obj/usr/src/sys/GENERIC amd64 Jun 12 23:37:33 sv01B kernel: Timecounter "i8254" frequency 1193182 Hz quality 0 Jun 12 23:37:33 sv01B kernel: CPU: AMD Athlon(tm) II X3 440 Processor (3000.14-MHz K8-class CPU) Jun 12 23:37:33 sv01B kernel: Origin = "AuthenticAMD" Id = 0x100f52 Family = 10 Model = 5 Stepping = 2 Jun 12 23:37:33 sv01B kernel: Features=0x178bfbff Jun 12 23:37:33 sv01B kernel: Features2=0x802009 Jun 12 23:37:33 sv01B kernel: AMD Features=0xee500800 Jun 12 23:37:33 sv01B kernel: AMD Features2=0x37ff Jun 12 23:37:33 sv01B kernel: TSC: P-state invariant Jun 12 23:37:33 sv01B kernel: real memory = 4294967296 (4096 MB) Jun 12 23:37:33 sv01B kernel: avail memory = 3845677056 (3667 MB) Jun 12 23:37:33 sv01B kernel: ACPI APIC Table: <7623MS A7623200> Jun 12 23:37:33 sv01B kernel: FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs Jun 12 23:37:33 sv01B kernel: FreeBSD/SMP: 1 package(s) x 3 core(s) Jun 12 23:37:33 sv01B kernel: cpu0 (BSP): APIC ID: 0 Jun 12 23:37:33 sv01B kernel: cpu1 (AP): APIC ID: 1 Jun 12 23:37:33 sv01B kernel: cpu2 (AP): APIC ID: 2 Jun 12 23:37:33 sv01B kernel: ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0x 0 0/0x1 (20100331/tbfadt-655) Jun 12 23:37:33 sv01B kernel: ioapic0 irqs 0-23 on motherboard Jun 12 23:37:33 sv01B kernel: kbd1 at kbdmux0 Jun 12 23:37:33 sv01B kernel: acpi0: <7623MS A7623200> on motherboard Jun 12 23:37:33 sv01B kernel: acpi0: [ITHREAD] Jun 12 23:37:33 sv01B kernel: acpi0: Power Button (fixed) Jun 12 23:37:33 sv01B kernel: acpi0: reservation of fee00000, 1000 (3) failed Jun 12 23:37:33 sv01B kernel: acpi0: reservation of ffb80000, 80000 (3) failed Jun 12 23:37:33 sv01B kernel: acpi0: reservation of fec10000, 20 (3) failed Jun 12 23:37:33 sv01B kernel: acpi0: reservation of 0, a0000 (3) failed Jun 12 23:37:33 sv01B kernel: acpi0: reservation of 100000, cff00000 (3) failed Jun 12 23:37:33 sv01B kernel: ACPI HPET table warning: Sequence is non-zero (2) Jun 12 23:37:33 sv01B kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 Jun 12 23:37:33 sv01B kernel: acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 Jun 12 23:37:33 sv01B kernel: cpu0: on acpi0 Jun 12 23:37:33 sv01B kernel: cpu1: on acpi0 Jun 12 23:37:33 sv01B kernel: cpu2: on acpi0 Jun 12 23:37:33 sv01B kernel: acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0 Jun 12 23:37:33 sv01B kernel: Timecounter "HPET" frequency 14318180 Hz quality 900 Jun 12 23:37:33 sv01B kernel: pcib0: port 0xcf8-0xcff on acpi0 Jun 12 23:37:33 sv01B kernel: pci0: on pcib0 Jun 12 23:37:33 sv01B kernel: pcib1: at device 1.0 on pci0 Jun 12 23:37:33 sv01B kernel: pci1: on pcib1 Jun 12 23:37:33 sv01B kernel: vgapci0: port 0xd000-0xd0ff mem 0xd0000000-0xdfffffff,0xfeaf0000-0xfeafffff,0xfe900000-0xfe9fffff irq 18 at device 5.0 on pci1 Jun 12 23:37:33 sv01B kernel: pci1: at device 5.1 (no driver attached) Jun 12 23:37:33 sv01B kernel: pcib2: irq 17 at device 5.0 on pci0 Jun 12 23:37:33 sv01B kernel: pci2: on pcib2 Jun 12 23:37:33 sv01B kernel: alc0: port 0xe800-0xe87f mem 0xfebc0000-0xfebfffff irq 17 at device 0.0 on pci2 Jun 12 23:37:33 sv01B kernel: alc0: 15872 Tx FIFO, 15360 Rx FIFO Jun 12 23:37:33 sv01B kernel: alc0: Using 1 MSI message(s). 
Jun 12 23:37:33 sv01B kernel: miibus0: on alc0 Jun 12 23:37:33 sv01B kernel: atphy0: PHY 0 on miibus0 Jun 12 23:37:33 sv01B kernel: atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto Jun 12 23:37:33 sv01B kernel: alc0: Ethernet address: 40:61:86:cc:e9:34 Jun 12 23:37:33 sv01B kernel: alc0: [FILTER] Jun 12 23:37:33 sv01B kernel: atapci0: port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe8ffc00-0xfe8fffff irq 22 at device 17.0 on pci0 Jun 12 23:37:33 sv01B kernel: atapci0: [ITHREAD] Jun 12 23:37:33 sv01B kernel: atapci0: AHCI v1.10 controller with 4 3Gbps ports, PM supported Jun 12 23:37:33 sv01B kernel: ata2: on atapci0 Jun 12 23:37:33 sv01B kernel: ata2: [ITHREAD] Jun 12 23:37:33 sv01B kernel: ata3: on atapci0 Jun 12 23:37:33 sv01B kernel: ata3: [ITHREAD] Jun 12 23:37:33 sv01B kernel: ata4: on atapci0 Jun 12 23:37:33 sv01B kernel: ata4: [ITHREAD] Jun 12 23:37:33 sv01B kernel: ata5: on atapci0 Jun 12 23:37:33 sv01B kernel: ata5: [ITHREAD] Jun 12 23:37:33 sv01B kernel: ohci0: mem 0xfe8fe000-0xfe8fefff irq 16 at device 18.0 on pci0 Jun 12 23:37:33 sv01B kernel: ohci0: [ITHREAD] Jun 12 23:37:33 sv01B kernel: usbus0: on ohci0 Jun 12 23:37:33 sv01B kernel: ohci1: mem 0xfe8fd000-0xfe8fdfff irq 16 at device 18.1 on pci0 Jun 12 23:37:33 sv01B kernel: ohci1: [ITHREAD] Jun 12 23:37:33 sv01B kernel: usbus1: on ohci1 Jun 12 23:37:33 sv01B kernel: ehci0: mem 0xfe8ff800-0xfe8ff8ff irq 17 at device 18.2 on pci0 Jun 12 23:37:33 sv01B kernel: ehci0: [ITHREAD] Jun 12 23:37:33 sv01B kernel: usbus2: EHCI version 1.0 Jun 12 23:37:33 sv01B kernel: usbus2: on ehci0 Jun 12 23:37:33 sv01B kernel: ohci2: mem 0xfe8fc000-0xfe8fcfff irq 18 at device 19.0 on pci0 Jun 12 23:37:33 sv01B kernel: ohci2: [ITHREAD] Jun 12 23:37:33 sv01B kernel: usbus3: on ohci2 Jun 12 23:37:33 sv01B kernel: ohci3: mem 0xfe8fb000-0xfe8fbfff irq 18 at device 19.1 on pci0 Jun 12 23:37:33 sv01B kernel: ohci3: [ITHREAD] Jun 12 23:37:33 sv01B kernel: usbus4: on ohci3 Jun 12 23:37:33 sv01B kernel: ehci1: mem 0xfe8ff400-0xfe8ff4ff irq 19 at device 19.2 on pci0 Jun 12 23:37:33 sv01B kernel: ehci1: [ITHREAD] Jun 12 23:37:33 sv01B kernel: usbus5: EHCI version 1.0 Jun 12 23:37:33 sv01B kernel: usbus5: on ehci1 Jun 12 23:37:33 sv01B kernel: pci0: at device 20.0 (no driver attached) Jun 12 23:37:33 sv01B kernel: atapci1: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 20.1 on pci0 Jun 12 23:37:33 sv01B kernel: ata0: on atapci1 Jun 12 23:37:33 sv01B kernel: ata0: [ITHREAD] Jun 12 23:37:33 sv01B kernel: ata1: on atapci1 Jun 12 23:37:33 sv01B kernel: ata1: [ITHREAD] Jun 12 23:37:33 sv01B kernel: isab0: at device 20.3 on pci0 Jun 12 23:37:33 sv01B kernel: isa0: on isab0 Jun 12 23:37:33 sv01B kernel: pcib3: at device 20.4 on pci0 Jun 12 23:37:33 sv01B kernel: pci3: on pcib3 Jun 12 23:37:33 sv01B kernel: ohci4: mem 0xfe8fa000-0xfe8fafff irq 18 at device 20.5 on pci0 Jun 12 23:37:33 sv01B kernel: ohci4: [ITHREAD] Jun 12 23:37:33 sv01B kernel: usbus6: on ohci4 Jun 12 23:37:33 sv01B kernel: acpi_button0: on acpi0 Jun 12 23:37:33 sv01B kernel: atrtc0: port 0x70-0x71 irq 8 on acpi0 Jun 12 23:37:33 sv01B kernel: sc0: at flags 0x100 on isa0 Jun 12 23:37:33 sv01B kernel: sc0: VGA <16 virtual consoles, flags=0x300> Jun 12 23:37:33 sv01B kernel: vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 Jun 12 23:37:33 sv01B kernel: atkbdc0: at port 0x60,0x64 on isa0 Jun 12 23:37:33 sv01B kernel: atkbd0: irq 1 on atkbdc0 Jun 12 23:37:33 sv01B kernel: kbd0 at atkbd0 Jun 12 23:37:33 sv01B kernel: 
atkbd0: [GIANT-LOCKED] Jun 12 23:37:33 sv01B kernel: atkbd0: [ITHREAD] Jun 12 23:37:33 sv01B kernel: ppc0: cannot reserve I/O port range Jun 12 23:37:33 sv01B kernel: acpi_throttle0: on cpu0 Jun 12 23:37:33 sv01B kernel: hwpstate0: on cpu0 Jun 12 23:37:33 sv01B kernel: ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present; Jun 12 23:37:33 sv01B kernel: to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf. Jun 12 23:37:33 sv01B kernel: ZFS filesystem version 3 Jun 12 23:37:33 sv01B kernel: ZFS storage pool version 14 Jun 12 23:37:33 sv01B kernel: Timecounters tick every 1.000 msec Jun 12 23:37:33 sv01B kernel: usbus0: 12Mbps Full Speed USB v1.0 Jun 12 23:37:33 sv01B kernel: usbus1: 12Mbps Full Speed USB v1.0 Jun 12 23:37:33 sv01B kernel: usbus2: 480Mbps High Speed USB v2.0 Jun 12 23:37:33 sv01B kernel: usbus3: 12Mbps Full Speed USB v1.0 Jun 12 23:37:33 sv01B kernel: usbus4: 12Mbps Full Speed USB v1.0 Jun 12 23:37:33 sv01B kernel: usbus5: 480Mbps High Speed USB v2.0 Jun 12 23:37:33 sv01B kernel: usbus6: 12Mbps Full Speed USB v1.0 Jun 12 23:37:33 sv01B kernel: ad4: 953869MB at ata2-master UDMA100 SATA 3Gb/s Jun 12 23:37:33 sv01B kernel: ugen0.1: at usbus0 Jun 12 23:37:33 sv01B kernel: uhub0: on usbus0 Jun 12 23:37:33 sv01B kernel: ugen1.1: at usbus1 Jun 12 23:37:33 sv01B kernel: uhub1: on usbus1 Jun 12 23:37:33 sv01B kernel: ugen2.1: at usbus2 Jun 12 23:37:33 sv01B kernel: uhub2: on usbus2 Jun 12 23:37:33 sv01B kernel: ugen3.1: at usbus3 Jun 12 23:37:33 sv01B kernel: uhub3: on usbus3 Jun 12 23:37:33 sv01B kernel: ugen4.1: at usbus4 Jun 12 23:37:33 sv01B kernel: uhub4: on usbus4 Jun 12 23:37:33 sv01B kernel: ugen5.1: at usbus5 Jun 12 23:37:33 sv01B kernel: uhub5: on usbus5 Jun 12 23:37:33 sv01B kernel: ugen6.1: at usbus6 Jun 12 23:37:33 sv01B kernel: uhub6: on usbus6 Jun 12 23:37:33 sv01B kernel: SMP: AP CPU #1 Launched! Jun 12 23:37:33 sv01B kernel: SMP: AP CPU #2 Launched! Jun 12 23:37:33 sv01B kernel: Root mount waiting for: usbus6 usbus5 usbus4 usbus3 usbus2 usbus1 usbus0 Jun 12 23:37:33 sv01B kernel: uhub6: 2 ports with 2 removable, self powered Jun 12 23:37:33 sv01B kernel: uhub1: 3 ports with 3 removable, self powered Jun 12 23:37:33 sv01B kernel: uhub0: 3 ports with 3 removable, self powered Jun 12 23:37:33 sv01B kernel: uhub3: 3 ports with 3 removable, self powered Jun 12 23:37:33 sv01B kernel: uhub4: 3 ports with 3 removable, self powered Jun 12 23:37:33 sv01B kernel: Root mount waiting for: usbus5 usbus2 Jun 12 23:37:33 sv01B kernel: Root mount waiting for: usbus5 usbus2 Jun 12 23:37:33 sv01B kernel: uhub2: 6 ports with 6 removable, self powered Jun 12 23:37:33 sv01B kernel: uhub5: 6 ports with 6 removable, self powered Jun 12 23:37:33 sv01B kernel: Trying to mount root from zfs:zsv01B/ROOT/zsv01B Jun 12 23:37:33 sv01B kernel: ugen1.2: at usbus1 Jun 12 23:37:33 sv01B kernel: ukbd0: on usbus1 Jun 12 23:37:33 sv01B kernel: kbd2 at ukbd0 Jun 12 23:37:35 sv01B kernel: alc0: link state changed to UP Jun 12 23:38:45 sv01B login: ROOT LOGIN (root) ON ttyv0 Jun 12 23:38:53 sv01B login: ROOT LOGIN (root) ON ttyv1 Jun 12 23:39:00 sv01B login: ROOT LOGIN (root) ON ttyv2 Jun 12 23:39:59 sv01B hastd: [zfshast] (init) We act as init for the resource and not as secondary as requested by tcp4://192.168.0.240:41687. Jun 12 23:40:04 sv01B hastd: [zfshast] (init) We act as init for the resource and not as secondary as requested by tcp4://192.168.0.240:49067. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Unable to receive request header. 
: Socket is not connected. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=763, exitcode=75). Jun 12 23:40:19 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:19 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=765, exitcode=75). Jun 12 23:40:24 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:24 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=767, exitcode=75). Jun 12 23:40:29 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:29 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=769, exitcode=75). Jun 12 23:40:34 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:34 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=771, exitcode=75). Jun 12 23:40:40 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:40 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=772, exitcode=75). Jun 12 23:40:45 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:45 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=773, exitcode=75). Jun 12 23:40:50 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:50 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=774, exitcode=75). Jun 12 23:40:55 sv01B hastd: [zfshast] (secondary) Unable to receive request header. : Socket is not connected. Jun 12 23:40:55 sv01B hastd: [zfshast] (secondary) Worker process exited ungracefully (pid=775, exitcode=75).

Content-Disposition: attachment; filename="debug.log"

Jun 12 23:37:33 sv01B newsyslog[433]: logfile first created Jun 12 23:39:59 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 23:39:59 sv01B hastd: tcp4://192.168.0.240:41687: resource=zfshast Jun 12 23:40:04 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 23:40:04 sv01B hastd: tcp4://192.168.0.240:49067: resource=zfshast Jun 12 23:40:09 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 23:40:09 sv01B hastd: tcp4://192.168.0.240:25069: resource=zfshast Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) Initial connection from tcp4://192.168.0.240:25069. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) Incoming connection from tcp4://192.168.0.240:25069 configured. Jun 12 23:40:09 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 23:40:09 sv01B hastd: tcp4://192.168.0.240:65280: resource=zfshast Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) Outgoing connection to tcp4://192.168.0.240:65280 configured. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) Obtained info about /dev/ad4p4. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) Locked /dev/ad4p4. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: No requests, waiting. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) recv: (0x8011f52e0) Got request.
Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) recv: (0x8011f52e0) Got request header: WRITE(0, 131072). Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) recv: (0x8011f52e0) Moving request to the disk queue. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) recv: (0x8011f5290) Got request. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) Local activemap cleared. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: (0x8011f52e0) Got request: WRITE(0, 131072). Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: (0x8011f52e0) Moving request to the send queue. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: No requests, waiting. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: (0x8011f52e0) Got request: WRITE(0, 131072). Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: (0x8011f52e0) Moving request to the free queue. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 23:40:14 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 23:40:14 sv01B hastd: tcp4://192.168.0.240:32992: resource=zfshast Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Initial connection from tcp4://192.168.0.240:32992. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Worker process exists (pid=763), stopping it. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Incoming connection from tcp4://192.168.0.240:32992 configured. Jun 12 23:40:14 sv01B hastd: Accepting connection to tcp4://0.0.0.0:8457. Jun 12 23:40:14 sv01B hastd: tcp4://192.168.0.240:26238: resource=zfshast Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Outgoing connection to tcp4://192.168.0.240:26238 configured. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Obtained info about /dev/ad4p4. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Locked /dev/ad4p4. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) disk: No requests, waiting. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) recv: (0x8011f32e0) Got request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) send: No requests, waiting. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) recv: (0x8011f32e0) Got request header: WRITE(0, 131072). Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) recv: (0x8011f32e0) Moving request to the disk queue. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) recv: Taking free request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) recv: (0x8011f3290) Got request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) Local activemap cleared. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) disk: (0x8011f32e0) Got request: WRITE(0, 131072). Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) disk: (0x8011f32e0) Moving request to the send queue. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) disk: Taking request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) disk: No requests, waiting. 
Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) send: (0x8011f32e0) Got request: WRITE(0, 131072). Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) disk: (0x8011f32e0) Moving request to the free queue. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) send: Taking request. Jun 12 23:40:14 sv01B hastd: [zfshast] (secondary) send: No requests, waiting.
[... the same five-second cycle repeats from Jun 12 23:40:19 through Jun 12 23:41:35: hastd accepts a new connection from 192.168.0.240, stops the existing worker process (pids 765 through 786), configures the incoming and outgoing connections, obtains info about and locks /dev/ad4p4, and services exactly one WRITE(0, 131072) request; only the timestamps, remote port numbers, and worker pids differ between cycles ...]
--------------040800000505090801010307-- From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 19:15:57 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0BC6106564A for ; Sat, 12 Jun 2010 19:15:57 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 49AB08FC08 for ; Sat, 12 Jun 2010 19:15:56 +0000 (UTC) Received: by bwz2 with SMTP id 2so1500790bwz.13 for ; Sat, 12 Jun 2010 12:15:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject:references :x-comment-to:date:in-reply-to:message-id:user-agent:mime-version :content-type; bh=cEljRfu8oPr58A2o+cBkmZlt4VTiCNhD+nzTUHonS+k=; b=ikySyJKP1B8GtY70iFJ57zU76GeVgWNrCMEpx93pd2SjEoU4GRJBwP+JEG3seDT9P+ DbtlJyCvjoD/bPFbwe7zTfBYblW1JSI0rDuBPUt7xpD/uwkaWWfndzre/84gNOlDpjG6 l0cM9Ldsjgm9kaL3lTT3P924pMK3bhmlrxdXU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=siBhcADzwLnlSC1u4W/48qFm4CjM/XEgDBSAjhglxl610WnkHxr4ix+OOZdBxV6+7T 4AG/99UGC2CnI5fPfaBvVfvqwk6EnrVriDpYTz33v4jMezAKBHhjJ4fIz6vApwbDfvE0 524VckJiFlCcUUwxeJD4RvxM6QWpH1v7g6dk8= Received: by 10.204.74.2 with SMTP id s2mr2618085bkj.28.1276370154970; Sat, 12 Jun 2010 12:15:54 -0700 (PDT) Received: from localhost ([95.69.160.52]) by mx.google.com with ESMTPS id z20sm11017279bkx.9.2010.06.12.12.15.53 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 12 Jun 2010 12:15:54 -0700 (PDT) From: Mikolaj Golub To: Kostik Belousov References: <86mxv22ji7.fsf@zhuzha.ua1> <20100611191059.GF13238@deviant.kiev.zoral.com.ua> X-Comment-To: Kostik Belousov Date: Sat, 12 Jun 2010 22:15:52 +0300 In-Reply-To: <20100611191059.GF13238@deviant.kiev.zoral.com.ua> (Kostik Belousov's message of "Fri, 11 Jun 2010 22:10:59 +0300") Message-ID: <86mxv0cb9z.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Cc: freebsd-fs@freebsd.org Subject: Re: '#ifndef DIAGNOSTIC' in nfsclient code looks like a typo X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 19:15:58 -0000 --=-=-= On Fri, 11 Jun 2010 22:10:59 +0300 Kostik Belousov wrote: KB> All the changes should be converted to the KASSERTs. There is no point KB> in doing KB> if (something) KB> panic(); KB> for diagnostic; use KB> KASSERT(something, (panic message)); Please look at the attached patch. 
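For readers skimming past the diff, every hunk makes the same transformation. Reconstructed from the first hunk of the attached patch (shown here only as an illustrative sketch, not as a new change):

    /* Before: the check exists only in some kernel builds, and with
     * the "#ifndef DIAGNOSTIC" typo this thread is about, the panic
     * was compiled into exactly the kernels that did not ask for
     * diagnostics. */
    #ifndef DIAGNOSTIC
            if (uiop->uio_iovcnt != 1)
                    panic("nfs: writerpc iovcnt > 1");
    #endif

    /* After: one line, no preprocessor wrapper needed. */
    KASSERT(uiop->uio_iovcnt == 1, ("nfs: writerpc iovcnt > 1"));

The KASSERT checks are controlled by "options INVARIANTS" rather than "options DIAGNOSTIC". Roughly (an approximation of the sys/systm.h definition of the era; the exact macro text varies between FreeBSD versions), the macro expands as:

    #ifdef INVARIANTS
    #define KASSERT(exp, msg) do {                                  \
            if (__predict_false(!(exp)))                            \
                    panic msg;                                      \
    } while (0)
    #else
    #define KASSERT(exp, msg) do {} while (0)
    #endif

so in a kernel built without INVARIANTS the asserted expression is not evaluated at all, which is why expressions moved into KASSERT must be free of side effects.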
-- Mikolaj Golub --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=nfs.KASSERT.patch Index: sys/nfsclient/nfs_vnops.c =================================================================== --- sys/nfsclient/nfs_vnops.c (revision 208960) +++ sys/nfsclient/nfs_vnops.c (working copy) @@ -1348,10 +1348,7 @@ int v3 = NFS_ISV3(vp), committed = NFSV3WRITE_FILESYNC; int wsize; -#ifndef DIAGNOSTIC - if (uiop->uio_iovcnt != 1) - panic("nfs: writerpc iovcnt > 1"); -#endif + KASSERT(uiop->uio_iovcnt == 1, ("nfs: writerpc iovcnt > 1")); *must_commit = 0; tsiz = uiop->uio_resid; mtx_lock(&nmp->nm_mtx); @@ -1708,12 +1705,8 @@ int error = 0; struct vattr vattr; -#ifndef DIAGNOSTIC - if ((cnp->cn_flags & HASBUF) == 0) - panic("nfs_remove: no name"); - if (vrefcnt(vp) < 1) - panic("nfs_remove: bad v_usecount"); -#endif + KASSERT(cnp->cn_flags & HASBUF, ("nfs_remove: no name")); + KASSERT(vrefcnt(vp) > 0, ("nfs_remove: bad v_usecount")); if (vp->v_type == VDIR) error = EPERM; else if (vrefcnt(vp) == 1 || (np->n_sillyrename && @@ -1814,11 +1807,8 @@ struct componentname *fcnp = ap->a_fcnp; int error; -#ifndef DIAGNOSTIC - if ((tcnp->cn_flags & HASBUF) == 0 || - (fcnp->cn_flags & HASBUF) == 0) - panic("nfs_rename: no name"); -#endif + KASSERT((tcnp->cn_flags & HASBUF) && (fcnp->cn_flags & HASBUF), + ("nfs_rename: no name")); /* Check for cross-device rename */ if ((fvp->v_mount != tdvp->v_mount) || (tvp && (fvp->v_mount != tvp->v_mount))) { @@ -2277,11 +2267,10 @@ int attrflag; int v3 = NFS_ISV3(vp); -#ifndef DIAGNOSTIC - if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) || - (uiop->uio_resid & (DIRBLKSIZ - 1))) - panic("nfs readdirrpc bad uio"); -#endif + KASSERT(uiop->uio_iovcnt == 1 && + !(uiop->uio_offset & (DIRBLKSIZ - 1)) && + !(uiop->uio_resid & (DIRBLKSIZ - 1)), + ("nfs readdirrpc bad uio")); /* * If there is no cookie, assume directory was stale. 
@@ -2482,11 +2471,10 @@ #ifndef nolint dp = NULL; #endif -#ifndef DIAGNOSTIC - if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) || - (uiop->uio_resid & (DIRBLKSIZ - 1))) - panic("nfs readdirplusrpc bad uio"); -#endif + KASSERT(uiop->uio_iovcnt == 1 && + !(uiop->uio_offset & (DIRBLKSIZ - 1)) && + !(uiop->uio_resid & (DIRBLKSIZ - 1)), + ("nfs readdirplusrpc bad uio")); ndp->ni_dvp = vp; newvp = NULLVP; @@ -2752,10 +2740,7 @@ cache_purge(dvp); np = VTONFS(vp); -#ifndef DIAGNOSTIC - if (vp->v_type == VDIR) - panic("nfs: sillyrename dir"); -#endif + KASSERT(vp->v_type != VDIR, ("nfs: sillyrename dir")); sp = malloc(sizeof (struct sillyrename), M_NFSREQ, M_WAITOK); sp->s_cred = crhold(cnp->cn_cred); Index: sys/nfsclient/nfs_bio.c =================================================================== --- sys/nfsclient/nfs_bio.c (revision 208960) +++ sys/nfsclient/nfs_bio.c (working copy) @@ -453,10 +453,7 @@ int seqcount; int nra, error = 0, n = 0, on = 0; -#ifdef DIAGNOSTIC - if (uio->uio_rw != UIO_READ) - panic("nfs_read mode"); -#endif + KASSERT(uio->uio_rw == UIO_READ, ("nfs_read mode")); if (uio->uio_resid == 0) return (0); if (uio->uio_offset < 0) /* XXX VDIR cookies can be negative */ @@ -875,12 +872,9 @@ int bcount; int n, on, error = 0; -#ifdef DIAGNOSTIC - if (uio->uio_rw != UIO_WRITE) - panic("nfs_write mode"); - if (uio->uio_segflg == UIO_USERSPACE && uio->uio_td != curthread) - panic("nfs_write proc"); -#endif + KASSERT(uio->uio_rw == UIO_WRITE, ("nfs_write mode")); + KASSERT(uio->uio_segflg != UIO_USERSPACE || uio->uio_td == curthread, + ("nfs_write proc")); if (vp->v_type != VREG) return (EIO); mtx_lock(&np->n_mtx); Index: sys/nfsclient/nfs_subs.c =================================================================== --- sys/nfsclient/nfs_subs.c (revision 208960) +++ sys/nfsclient/nfs_subs.c (working copy) @@ -199,10 +199,7 @@ int uiosiz, clflg, rem; char *cp; -#ifdef DIAGNOSTIC - if (uiop->uio_iovcnt != 1) - panic("nfsm_uiotombuf: iovcnt != 1"); -#endif + KASSERT(uiop->uio_iovcnt == 1, ("nfsm_uiotombuf: iovcnt != 1")); if (siz > MLEN) /* or should it >= MCLBYTES ?? 
*/ clflg = 1; @@ -789,10 +786,7 @@ pos = (uoff_t)off / NFS_DIRBLKSIZ; if (pos == 0 || off < 0) { -#ifdef DIAGNOSTIC - if (add) - panic("nfs getcookie add at <= 0"); -#endif + KASSERT(!add, ("nfs getcookie add at <= 0")); return (&nfs_nullcookie); } pos--; @@ -843,10 +837,7 @@ { struct nfsnode *np = VTONFS(vp); -#ifdef DIAGNOSTIC - if (vp->v_type != VDIR) - panic("nfs: invaldir not dir"); -#endif + KASSERT(vp->v_type == VDIR, ("nfs: invaldir not dir")); nfs_dircookie_lock(np); np->n_direofoffset = 0; np->n_cookieverf.nfsuquad[0] = 0; Index: sys/fs/nfsclient/nfs_clbio.c =================================================================== --- sys/fs/nfsclient/nfs_clbio.c (revision 208960) +++ sys/fs/nfsclient/nfs_clbio.c (working copy) @@ -453,10 +453,7 @@ int seqcount; int nra, error = 0, n = 0, on = 0; -#ifdef DIAGNOSTIC - if (uio->uio_rw != UIO_READ) - panic("ncl_read mode"); -#endif + KASSERT(uio->uio_rw == UIO_READ, ("ncl_read mode")); if (uio->uio_resid == 0) return (0); if (uio->uio_offset < 0) /* XXX VDIR cookies can be negative */ @@ -881,12 +878,9 @@ int bcount; int n, on, error = 0; -#ifdef DIAGNOSTIC - if (uio->uio_rw != UIO_WRITE) - panic("ncl_write mode"); - if (uio->uio_segflg == UIO_USERSPACE && uio->uio_td != curthread) - panic("ncl_write proc"); -#endif + KASSERT(uio->uio_rw == UIO_WRITE, ("ncl_write mode")); + KASSERT(uio->uio_segflg != UIO_USERSPACE || uio->uio_td == curthread, + ("ncl_write proc")); if (vp->v_type != VREG) return (EIO); mtx_lock(&np->n_mtx); Index: sys/fs/nfsclient/nfs_clcomsubs.c =================================================================== --- sys/fs/nfsclient/nfs_clcomsubs.c (revision 208960) +++ sys/fs/nfsclient/nfs_clcomsubs.c (working copy) @@ -194,10 +194,7 @@ int uiosiz, clflg, rem; char *cp, *tcp; -#ifdef DIAGNOSTIC - if (uiop->uio_iovcnt != 1) - panic("nfsm_uiotombuf: iovcnt != 1"); -#endif + KASSERT(uiop->uio_iovcnt == 1, ("nfsm_uiotombuf: iovcnt != 1")); if (siz > ncl_mbuf_mlen) /* or should it >= MCLBYTES ?? 
*/ clflg = 1; @@ -346,10 +343,7 @@ pos = off / NFS_DIRBLKSIZ; if (pos == 0) { -#ifdef DIAGNOSTIC - if (add) - panic("nfs getcookie add at 0"); -#endif + KASSERT(!add, ("nfs getcookie add at 0")); return (&nfs_nullcookie); } pos--; Index: sys/fs/nfsclient/nfs_clsubs.c =================================================================== --- sys/fs/nfsclient/nfs_clsubs.c (revision 208960) +++ sys/fs/nfsclient/nfs_clsubs.c (working copy) @@ -282,10 +282,7 @@ pos = (uoff_t)off / NFS_DIRBLKSIZ; if (pos == 0 || off < 0) { -#ifdef DIAGNOSTIC - if (add) - panic("nfs getcookie add at <= 0"); -#endif + KASSERT(!add, ("nfs getcookie add at <= 0")); return (&nfs_nullcookie); } pos--; @@ -336,10 +333,7 @@ { struct nfsnode *np = VTONFS(vp); -#ifdef DIAGNOSTIC - if (vp->v_type != VDIR) - panic("nfs: invaldir not dir"); -#endif + KASSERT(vp->v_type == VDIR, ("nfs: invaldir not dir")); ncl_dircookie_lock(np); np->n_direofoffset = 0; np->n_cookieverf.nfsuquad[0] = 0; Index: sys/fs/nfsclient/nfs_clvnops.c =================================================================== --- sys/fs/nfsclient/nfs_clvnops.c (revision 208960) +++ sys/fs/nfsclient/nfs_clvnops.c (working copy) @@ -1564,12 +1564,8 @@ int error = 0; struct vattr vattr; -#ifndef DIAGNOSTIC - if ((cnp->cn_flags & HASBUF) == 0) - panic("nfs_remove: no name"); - if (vrefcnt(vp) < 1) - panic("nfs_remove: bad v_usecount"); -#endif + KASSERT(cnp->cn_flags & HASBUF, ("nfs_remove: no name")); + KASSERT(vrefcnt(vp) > 0, ("nfs_remove: bad v_usecount")); if (vp->v_type == VDIR) error = EPERM; else if (vrefcnt(vp) == 1 || (np->n_sillyrename && @@ -1676,11 +1672,8 @@ struct nfsv4node *newv4 = NULL; int error; -#ifndef DIAGNOSTIC - if ((tcnp->cn_flags & HASBUF) == 0 || - (fcnp->cn_flags & HASBUF) == 0) - panic("nfs_rename: no name"); -#endif + KASSERT((tcnp->cn_flags & HASBUF) && (fcnp->cn_flags & HASBUF), + ("nfs_rename: no name")); /* Check for cross-device rename */ if ((fvp->v_mount != tdvp->v_mount) || (tvp && (fvp->v_mount != tvp->v_mount))) { @@ -2137,11 +2130,10 @@ struct nfsmount *nmp = VFSTONFS(vp->v_mount); int error = 0, eof, attrflag; -#ifndef DIAGNOSTIC - if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) || - (uiop->uio_resid & (DIRBLKSIZ - 1))) - panic("nfs readdirrpc bad uio"); -#endif + KASSERT(uiop->uio_iovcnt == 1 && + !(uiop->uio_offset & (DIRBLKSIZ - 1)) && + !(uiop->uio_resid & (DIRBLKSIZ - 1)), + ("nfs readdirrpc bad uio")); /* * If there is no cookie, assume directory was stale. @@ -2198,11 +2190,10 @@ struct nfsmount *nmp = VFSTONFS(vp->v_mount); int error = 0, attrflag, eof; -#ifndef DIAGNOSTIC - if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) || - (uiop->uio_resid & (DIRBLKSIZ - 1))) - panic("nfs readdirplusrpc bad uio"); -#endif + KASSERT(uiop->uio_iovcnt == 1 && + !(uiop->uio_offset & (DIRBLKSIZ - 1)) && + !(uiop->uio_resid & (DIRBLKSIZ - 1)), + ("nfs readdirplusrpc bad uio")); /* * If there is no cookie, assume directory was stale. 
@@ -2264,10 +2255,7 @@ cache_purge(dvp); np = VTONFS(vp); -#ifndef DIAGNOSTIC - if (vp->v_type == VDIR) - panic("nfs: sillyrename dir"); -#endif + KASSERT(vp->v_type != VDIR, ("nfs: sillyrename dir")); MALLOC(sp, struct sillyrename *, sizeof (struct sillyrename), M_NEWNFSREQ, M_WAITOK); sp->s_cred = crhold(cnp->cn_cred); Index: sys/fs/nfsclient/nfs_clrpcops.c =================================================================== --- sys/fs/nfsclient/nfs_clrpcops.c (revision 208960) +++ sys/fs/nfsclient/nfs_clrpcops.c (working copy) @@ -1445,10 +1445,7 @@ struct nfsrv_descript *nd = &nfsd; nfsattrbit_t attrbits; -#ifdef DIAGNOSTIC - if (uiop->uio_iovcnt != 1) - panic("nfs: writerpc iovcnt > 1"); -#endif + KASSERT(uiop->uio_iovcnt == 1, ("nfs: writerpc iovcnt > 1")); *attrflagp = 0; tsiz = uio_uio_resid(uiop); NFSLOCKMNT(nmp); @@ -2501,10 +2498,9 @@ u_int32_t *tl2 = NULL; size_t tresid; -#ifdef DIAGNOSTIC - if (uiop->uio_iovcnt != 1 || (uio_uio_resid(uiop) & (DIRBLKSIZ - 1))) - panic("nfs readdirrpc bad uio"); -#endif + KASSERT(uiop->uio_iovcnt == 1 && + !(uio_uio_resid(uiop) & (DIRBLKSIZ - 1)), + ("nfs readdirrpc bad uio")); /* * There is no point in reading a lot more than uio_resid, however @@ -2939,10 +2935,9 @@ size_t tresid; u_int32_t *tl2 = NULL, fakefileno = 0xffffffff, rderr; -#ifdef DIAGNOSTIC - if (uiop->uio_iovcnt != 1 || (uio_uio_resid(uiop) & (DIRBLKSIZ - 1))) - panic("nfs readdirplusrpc bad uio"); -#endif + KASSERT(uiop->uio_iovcnt == 1 && + !(uio_uio_resid(uiop) & (DIRBLKSIZ - 1)), + ("nfs readdirplusrpc bad uio")); *attrflagp = 0; if (eofp != NULL) *eofp = 0; Index: sys/fs/nfsserver/nfs_nfsdsocket.c =================================================================== --- sys/fs/nfsserver/nfs_nfsdsocket.c (revision 208960) +++ sys/fs/nfsserver/nfs_nfsdsocket.c (working copy) @@ -364,10 +364,7 @@ * Get a locked vnode for the first file handle */ if (!(nd->nd_flag & ND_NFSV4)) { -#ifdef DIAGNOSTIC - if (nd->nd_repstat) - panic("nfsrvd_dorpc"); -#endif + KASSERT(nd->nd_repstat == 0, ("nfsrvd_dorpc")); /* * For NFSv3, if the malloc/mget allocation is near limits, * return NFSERR_DELAY. 
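/*
 * [Illustrative note, not part of the patch.] The second argument to
 * KASSERT() is written inside its own set of parentheses, e.g.
 * ("nfs: sillyrename dir"), because the macro pastes it verbatim as
 * the argument list of panic(), so it can also carry printf-style
 * arguments. A hypothetical example (the variable names here are
 * made up, not taken from the patch):
 *
 *     KASSERT(len <= wsize, ("len %d exceeds wsize %d", len, wsize));
 *
 * expands, in an INVARIANTS kernel, to roughly:
 *
 *     if (__predict_false(!(len <= wsize)))
 *             panic("len %d exceeds wsize %d", len, wsize);
 */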
--=-=-=-- From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 19:40:27 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 42AE31065674 for ; Sat, 12 Jun 2010 19:40:27 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id B26C28FC17 for ; Sat, 12 Jun 2010 19:40:26 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o5CJeMpG049663 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 12 Jun 2010 22:40:22 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o5CJeMHA078076; Sat, 12 Jun 2010 22:40:22 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o5CJeMne078075; Sat, 12 Jun 2010 22:40:22 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 12 Jun 2010 22:40:22 +0300 From: Kostik Belousov To: Mikolaj Golub Message-ID: <20100612194022.GP13238@deviant.kiev.zoral.com.ua> References: <86mxv22ji7.fsf@zhuzha.ua1> <20100611191059.GF13238@deviant.kiev.zoral.com.ua> <86mxv0cb9z.fsf@kopusha.home.net> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="hHiQ9nAwW5IGN2dL" Content-Disposition: inline In-Reply-To: <86mxv0cb9z.fsf@kopusha.home.net> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.6 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org Subject: Re: '#ifndef DIAGNOSTIC' in nfsclient code looks like a typo X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 19:40:27 -0000 --hHiQ9nAwW5IGN2dL Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Jun 12, 2010 at 10:15:52PM +0300, Mikolaj Golub wrote: > > On Fri, 11 Jun 2010 22:10:59 +0300 Kostik Belousov wrote: > > KB> All the changes should be converted to the KASSERTs. There is no point > KB> in doing > KB> if (something) > KB> panic(); > KB> for diagnostic; use > KB> KASSERT(something, (panic message)); > > Please look at the attached patch. Almost there. According to style(9), the values should be explicitly compared with 0, unless the value is of the boolean type. I suggest you change, e.g.,
+ KASSERT(uiop->uio_iovcnt == 1 && + !(uio_uio_resid(uiop) & (DIRBLKSIZ - 1)), to + KASSERT(uiop->uio_iovcnt == 1 && + (uio_uio_resid(uiop) & (DIRBLKSIZ - 1)) == 0, and change + KASSERT((tcnp->cn_flags & HASBUF) && (fcnp->cn_flags & HASBUF), + ("nfs_rename: no name")); to + KASSERT((tcnp->cn_flags & HASBUF) != 0 && (fcnp->cn_flags & HASBUF) != 0, + ("nfs_rename: no name")); --hHiQ9nAwW5IGN2dL Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAkwT4qUACgkQC3+MBN1Mb4gAiwCfW3Cm3vzXSk2wnlnbg5pjlpv4 rNEAoIoNtjZAiTBUTch547aZn+DZruqP =EUq2 -----END PGP SIGNATURE----- --hHiQ9nAwW5IGN2dL-- From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 20:25:00 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7C0ED106566C; Sat, 12 Jun 2010 20:25:00 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8F08C8FC0C; Sat, 12 Jun 2010 20:24:59 +0000 (UTC) Received: by bwz2 with SMTP id 2so1535531bwz.13 for ; Sat, 12 Jun 2010 13:24:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject:references :x-comment-to:date:in-reply-to:message-id:user-agent:mime-version :content-type; bh=jgLuAZLEqB+ZDfyE0vubpbDmMxxbmhZTrKJQIDRl2rg=; b=jAauPJJHRkTGYNneTy6S+foFKx/bEIWGq6cKwGiGM94BzkPQBm5J692naYTbL9PK6Y YJTXb8zSrE/SIeQq2bF3NzKdNGGj3N68o7TemHhcgI/FTCHYXSW8anl/zPn/ADLiGlqx NRdoXGUTI527tokf3mBGo5zIHTwrwiM+Bo5Jk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=xIJc91kHr3XVI+FeOWHUPQceU0ByO9CqavQL7nrovO3IMXhEVkA6Jf4Ww62dNIvKOS kA3iuLYp9xrYZzm07FhxNa2tYmFVE6lIv7bTaGf51vpEtqnHcC/LHCU/uLPdXhFcVIS3 G1nU5ENbtG0QlB88ytebwjcSCByk/4ZqOX1ag= Received: by 10.204.74.35 with SMTP id s35mr2669221bkj.33.1276374298416; Sat, 12 Jun 2010 13:24:58 -0700 (PDT) Received: from localhost ([95.69.160.52]) by mx.google.com with ESMTPS id z17sm11246814bkx.12.2010.06.12.13.24.56 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 12 Jun 2010 13:24:57 -0700 (PDT) From: Mikolaj Golub To: "hiroshi\@soupacific.com" References: <4BCFA4C2.6000109@soupacific.com> <4BCFB1C5.5000908@soupacific.com> <4BD01800.9040901@soupacific.com> <4BD0438B.5080308@soupacific.com> <4BD0E432.1000108@soupacific.com> <20100423061521.GC1670@garage.freebsd.pl> <4BD17B0D.5080601@soupacific.com> <4C10B526.4040908@soupacific.com> <20100612104336.GA2253@garage.freebsd.pl> <4C1372E0.1000903@soupacific.com> <20100612142311.GF2253@garage.freebsd.pl> <4C139F9C.2090305@soupacific.com> X-Comment-To: hiroshi@soupacific.com Date: Sat, 12 Jun 2010 23:24:53 +0300 In-Reply-To: <4C139F9C.2090305@soupacific.com> (hiroshi@soupacific.com's message of "Sat, 12 Jun 2010 23:54:20 +0900") Message-ID: <86iq5oc82y.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: FreeBSD 8.1 and HAST X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2010 20:25:00
-0000 --=-=-= On Sat, 12 Jun 2010 23:54:20 +0900 hiroshi@soupacific.com wrote: >> Could you send the debug from the whole session, but when the problem >> appears? I don't see those "socket is not connected" errors in the >> output you sent. h> "socket is not connected" errors is in message log >> If it is possible could you turn off lines wrapping or maybe send the >> debug output as an attachment? >> h> I here attache debug.log and message files. It would be good to have all.log enabled in newsyslog.conf and provide the output from there so all lines are in one log and it is clear which message appeared earlier. Also the logs from the primary could be useful too. h> Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: (0x8011f52e0) Got request: WRITE(0, 131072). h> Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) disk: (0x8011f52e0) Moving request to the free queue. BTW, this message lies that it is from the disk thread. It is from the send thread too. See the attached patch. h> Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: Taking request. h> Jun 12 23:40:09 sv01B hastd: [zfshast] (secondary) send: No requests, waiting. -- Mikolaj Golub --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=secondary.c.send.patch Index: sbin/hastd/secondary.c =================================================================== --- sbin/hastd/secondary.c (revision 208960) +++ sbin/hastd/secondary.c (working copy) @@ -687,7 +687,7 @@ send_thread(void *arg) pjdlog_exit(EX_TEMPFAIL, "Unable to send reply."); } nv_free(nvout); - pjdlog_debug(2, "disk: (%p) Moving request to the free queue.", + pjdlog_debug(2, "send: (%p) Moving request to the free queue.", hio); nv_free(hio->hio_nv); hio->hio_error = 0; --=-=-=-- From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 21:33:48 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8F522106564A for ; Sat, 12 Jun 2010 21:33:48 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id E21C48FC1A for ; Sat, 12 Jun 2010 21:33:47 +0000 (UTC) Received: by bwz2 with SMTP id 2so1569795bwz.13 for ; Sat, 12 Jun 2010 14:33:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject:references :x-comment-to:date:in-reply-to:message-id:user-agent:mime-version :content-type; bh=1n/t+5IAfsVr1RwhpqW8Ksnip3bkG9Z90+eqDI7aeOk=; b=NTYRCZRvIYvV98wYzW1Cw9pNIxChcketOpNJ2v2oEeTEt/6nNQh7StVI4E2jWke5HB eh6wNK+Etn1UEnDqNwt5t4ISA8S5vjIuEBd03UlFaUvtN/eYuh4pIStA4CMRMeue6+4l alfYAoukOiRbTraeQbPjwQPEKvw+7O1xKq3Uw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=gr4v7PeLm1NIweSPHgtPNsir2xNbu1yfmuwpyzi6LJ7MONaXVlez9iMiTU2kzKTbIN a2C9R3VfVJqxGRdQZ4C39tb4or1bRx3BLk1v8wIi/7S1P4K4HtTiUUwV70IXXIAG08pM LkmgEoEBrFrv3f+goSwv14MyBQCCDvD61azGY= Received: by 10.204.132.211 with SMTP id c19mr2640088bkt.184.1276378426515; Sat, 12 Jun 2010 14:33:46 -0700 (PDT) Received: from localhost ([95.69.160.52]) by mx.google.com with ESMTPS id z17sm11478440bkx.12.2010.06.12.14.33.45 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 12 Jun 2010 14:33:45 -0700 (PDT) From: Mikolaj Golub To: Kostik Belousov References: <86mxv22ji7.fsf@zhuzha.ua1> 
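A note on the all.log suggestion above: FreeBSD ships the catch-all log
disabled. Enabling it amounts to uncommenting the stock entries; the
sketch below is from memory of the 8.x configuration files, so the exact
fields may differ on your release (check /etc/syslog.conf and
/etc/newsyslog.conf, and create the file first, e.g. mode 600):

# /etc/syslog.conf: duplicate every facility/level into one
# time-ordered file, so interleaved hastd messages keep their order
*.*						/var/log/all.log

# /etc/newsyslog.conf: matching rotation entry for the catch-all log
/var/log/all.log		600  7	   *	@T00  J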
From owner-freebsd-fs@FreeBSD.ORG Sat Jun 12 21:33:48 2010
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8F522106564A for ;
	Sat, 12 Jun 2010 21:33:48 +0000 (UTC)
	(envelope-from to.my.trociny@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id E21C48FC1A for ;
	Sat, 12 Jun 2010 21:33:47 +0000 (UTC)
Received: by bwz2 with SMTP id 2so1569795bwz.13 for ;
	Sat, 12 Jun 2010 14:33:46 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:from:to:cc:subject:references
	:x-comment-to:date:in-reply-to:message-id:user-agent:mime-version
	:content-type;
	bh=1n/t+5IAfsVr1RwhpqW8Ksnip3bkG9Z90+eqDI7aeOk=;
	b=NTYRCZRvIYvV98wYzW1Cw9pNIxChcketOpNJ2v2oEeTEt/6nNQh7StVI4E2jWke5HB
	eh6wNK+Etn1UEnDqNwt5t4ISA8S5vjIuEBd03UlFaUvtN/eYuh4pIStA4CMRMeue6+4l
	alfYAoukOiRbTraeQbPjwQPEKvw+7O1xKq3Uw=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=from:to:cc:subject:references:x-comment-to:date:in-reply-to
	:message-id:user-agent:mime-version:content-type;
	b=gr4v7PeLm1NIweSPHgtPNsir2xNbu1yfmuwpyzi6LJ7MONaXVlez9iMiTU2kzKTbIN
	a2C9R3VfVJqxGRdQZ4C39tb4or1bRx3BLk1v8wIi/7S1P4K4HtTiUUwV70IXXIAG08pM
	LkmgEoEBrFrv3f+goSwv14MyBQCCDvD61azGY=
Received: by 10.204.132.211 with SMTP id c19mr2640088bkt.184.1276378426515;
	Sat, 12 Jun 2010 14:33:46 -0700 (PDT)
Received: from localhost ([95.69.160.52]) by mx.google.com with ESMTPS id
	z17sm11478440bkx.12.2010.06.12.14.33.45 (version=TLSv1/SSLv3 cipher=RC4-MD5);
	Sat, 12 Jun 2010 14:33:45 -0700 (PDT)
From: Mikolaj Golub
To: Kostik Belousov
References: <86mxv22ji7.fsf@zhuzha.ua1>
	<20100611191059.GF13238@deviant.kiev.zoral.com.ua>
	<86mxv0cb9z.fsf@kopusha.home.net>
	<20100612194022.GP13238@deviant.kiev.zoral.com.ua>
X-Comment-To: Kostik Belousov
Date: Sun, 13 Jun 2010 00:33:43 +0300
In-Reply-To: <20100612194022.GP13238@deviant.kiev.zoral.com.ua> (Kostik
	Belousov's message of "Sat, 12 Jun 2010 22:40:22 +0300")
Message-ID: <86eigcc4w8.fsf@kopusha.home.net>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: freebsd-fs@freebsd.org
Subject: Re: '#ifndef DIAGNOSTIC' in nfsclient code looks like a typo
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 12 Jun 2010 21:33:48 -0000

--=-=-=

On Sat, 12 Jun 2010 22:40:22 +0300 Kostik Belousov wrote:

KB> Almost there. According to style(9), the values should be explicitly
KB> compared with 0, unless the value is of the boolean type. I suggest
KB> you change, e.g.,

KB> +	KASSERT(uiop->uio_iovcnt == 1 &&
KB> +	    !(uio_uio_resid(uiop) & (DIRBLKSIZ - 1)),

KB> to

KB> +	KASSERT(uiop->uio_iovcnt == 1 &&
KB> +	    (uio_uio_resid(uiop) & (DIRBLKSIZ - 1)) == 0,

KB> and change

KB> +	KASSERT((tcnp->cn_flags & HASBUF) && (fcnp->cn_flags & HASBUF),
KB> +	    ("nfs_rename: no name"));

KB> to

KB> +	KASSERT((tcnp->cn_flags & HASBUF) != 0 && (fcnp->cn_flags & HASBUF) != 0,
KB> +	    ("nfs_rename: no name"));

Updated.

-- 
Mikolaj Golub

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=nfs.KASSERT.patch

Index: sys/nfsclient/nfs_vnops.c
===================================================================
--- sys/nfsclient/nfs_vnops.c	(revision 208960)
+++ sys/nfsclient/nfs_vnops.c	(working copy)
@@ -1348,10 +1348,7 @@
 	int v3 = NFS_ISV3(vp), committed = NFSV3WRITE_FILESYNC;
 	int wsize;
 
-#ifndef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1)
-		panic("nfs: writerpc iovcnt > 1");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1, ("nfs: writerpc iovcnt > 1"));
 	*must_commit = 0;
 	tsiz = uiop->uio_resid;
 	mtx_lock(&nmp->nm_mtx);
@@ -1708,12 +1705,8 @@
 	int error = 0;
 	struct vattr vattr;
 
-#ifndef DIAGNOSTIC
-	if ((cnp->cn_flags & HASBUF) == 0)
-		panic("nfs_remove: no name");
-	if (vrefcnt(vp) < 1)
-		panic("nfs_remove: bad v_usecount");
-#endif
+	KASSERT((cnp->cn_flags & HASBUF) != 0, ("nfs_remove: no name"));
+	KASSERT(vrefcnt(vp) > 0, ("nfs_remove: bad v_usecount"));
 	if (vp->v_type == VDIR)
 		error = EPERM;
 	else if (vrefcnt(vp) == 1 || (np->n_sillyrename &&
@@ -1814,11 +1807,9 @@
 	struct componentname *fcnp = ap->a_fcnp;
 	int error;
 
-#ifndef DIAGNOSTIC
-	if ((tcnp->cn_flags & HASBUF) == 0 ||
-	    (fcnp->cn_flags & HASBUF) == 0)
-		panic("nfs_rename: no name");
-#endif
+	KASSERT((tcnp->cn_flags & HASBUF) != 0 &&
+	    (fcnp->cn_flags & HASBUF) != 0,
+	    ("nfs_rename: no name"));
 	/* Check for cross-device rename */
 	if ((fvp->v_mount != tdvp->v_mount) ||
 	    (tvp && (fvp->v_mount != tvp->v_mount))) {
@@ -2277,11 +2268,10 @@
 	int attrflag;
 	int v3 = NFS_ISV3(vp);
 
-#ifndef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
-	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
-		panic("nfs readdirrpc bad uio");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1 &&
+	    (uiop->uio_offset & (DIRBLKSIZ - 1)) == 0 &&
+	    (uiop->uio_resid & (DIRBLKSIZ - 1)) == 0,
+	    ("nfs readdirrpc bad uio"));
 
 	/*
 	 * If there is no cookie, assume directory was stale.
@@ -2482,11 +2472,10 @@
 #ifndef nolint
 	dp = NULL;
 #endif
-#ifndef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
-	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
-		panic("nfs readdirplusrpc bad uio");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1 &&
+	    (uiop->uio_offset & (DIRBLKSIZ - 1)) == 0 &&
+	    (uiop->uio_resid & (DIRBLKSIZ - 1)) == 0,
+	    ("nfs readdirplusrpc bad uio"));
 	ndp->ni_dvp = vp;
 	newvp = NULLVP;
 
@@ -2752,10 +2741,7 @@
 
 	cache_purge(dvp);
 	np = VTONFS(vp);
-#ifndef DIAGNOSTIC
-	if (vp->v_type == VDIR)
-		panic("nfs: sillyrename dir");
-#endif
+	KASSERT(vp->v_type != VDIR, ("nfs: sillyrename dir"));
 	sp = malloc(sizeof (struct sillyrename),
 	    M_NFSREQ, M_WAITOK);
 	sp->s_cred = crhold(cnp->cn_cred);
Index: sys/nfsclient/nfs_bio.c
===================================================================
--- sys/nfsclient/nfs_bio.c	(revision 208960)
+++ sys/nfsclient/nfs_bio.c	(working copy)
@@ -453,10 +453,7 @@
 	int seqcount;
 	int nra, error = 0, n = 0, on = 0;
 
-#ifdef DIAGNOSTIC
-	if (uio->uio_rw != UIO_READ)
-		panic("nfs_read mode");
-#endif
+	KASSERT(uio->uio_rw == UIO_READ, ("nfs_read mode"));
 	if (uio->uio_resid == 0)
 		return (0);
 	if (uio->uio_offset < 0)	/* XXX VDIR cookies can be negative */
@@ -875,12 +872,9 @@
 	int bcount;
 	int n, on, error = 0;
 
-#ifdef DIAGNOSTIC
-	if (uio->uio_rw != UIO_WRITE)
-		panic("nfs_write mode");
-	if (uio->uio_segflg == UIO_USERSPACE && uio->uio_td != curthread)
-		panic("nfs_write proc");
-#endif
+	KASSERT(uio->uio_rw == UIO_WRITE, ("nfs_write mode"));
+	KASSERT(uio->uio_segflg != UIO_USERSPACE || uio->uio_td == curthread,
+	    ("nfs_write proc"));
 	if (vp->v_type != VREG)
 		return (EIO);
 	mtx_lock(&np->n_mtx);
Index: sys/nfsclient/nfs_subs.c
===================================================================
--- sys/nfsclient/nfs_subs.c	(revision 208960)
+++ sys/nfsclient/nfs_subs.c	(working copy)
@@ -199,10 +199,7 @@
 	int uiosiz, clflg, rem;
 	char *cp;
 
-#ifdef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1)
-		panic("nfsm_uiotombuf: iovcnt != 1");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1, ("nfsm_uiotombuf: iovcnt != 1"));
 	if (siz > MLEN)		/* or should it >= MCLBYTES ?? */
 		clflg = 1;
@@ -789,10 +786,7 @@
 
 	pos = (uoff_t)off / NFS_DIRBLKSIZ;
 	if (pos == 0 || off < 0) {
-#ifdef DIAGNOSTIC
-		if (add)
-			panic("nfs getcookie add at <= 0");
-#endif
+		KASSERT(!add, ("nfs getcookie add at <= 0"));
 		return (&nfs_nullcookie);
 	}
 	pos--;
@@ -843,10 +837,7 @@
 {
 	struct nfsnode *np = VTONFS(vp);
 
-#ifdef DIAGNOSTIC
-	if (vp->v_type != VDIR)
-		panic("nfs: invaldir not dir");
-#endif
+	KASSERT(vp->v_type == VDIR, ("nfs: invaldir not dir"));
 	nfs_dircookie_lock(np);
 	np->n_direofoffset = 0;
 	np->n_cookieverf.nfsuquad[0] = 0;
Index: sys/fs/nfsclient/nfs_clbio.c
===================================================================
--- sys/fs/nfsclient/nfs_clbio.c	(revision 208960)
+++ sys/fs/nfsclient/nfs_clbio.c	(working copy)
@@ -453,10 +453,7 @@
 	int seqcount;
 	int nra, error = 0, n = 0, on = 0;
 
-#ifdef DIAGNOSTIC
-	if (uio->uio_rw != UIO_READ)
-		panic("ncl_read mode");
-#endif
+	KASSERT(uio->uio_rw == UIO_READ, ("ncl_read mode"));
 	if (uio->uio_resid == 0)
 		return (0);
 	if (uio->uio_offset < 0)	/* XXX VDIR cookies can be negative */
@@ -881,12 +878,9 @@
 	int bcount;
 	int n, on, error = 0;
 
-#ifdef DIAGNOSTIC
-	if (uio->uio_rw != UIO_WRITE)
-		panic("ncl_write mode");
-	if (uio->uio_segflg == UIO_USERSPACE && uio->uio_td != curthread)
-		panic("ncl_write proc");
-#endif
+	KASSERT(uio->uio_rw == UIO_WRITE, ("ncl_write mode"));
+	KASSERT(uio->uio_segflg != UIO_USERSPACE || uio->uio_td == curthread,
+	    ("ncl_write proc"));
 	if (vp->v_type != VREG)
 		return (EIO);
 	mtx_lock(&np->n_mtx);
Index: sys/fs/nfsclient/nfs_clcomsubs.c
===================================================================
--- sys/fs/nfsclient/nfs_clcomsubs.c	(revision 208960)
+++ sys/fs/nfsclient/nfs_clcomsubs.c	(working copy)
@@ -194,10 +194,7 @@
 	int uiosiz, clflg, rem;
 	char *cp, *tcp;
 
-#ifdef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1)
-		panic("nfsm_uiotombuf: iovcnt != 1");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1, ("nfsm_uiotombuf: iovcnt != 1"));
 	if (siz > ncl_mbuf_mlen)	/* or should it >= MCLBYTES ?? */
 		clflg = 1;
@@ -346,10 +343,7 @@
 
 	pos = off / NFS_DIRBLKSIZ;
 	if (pos == 0) {
-#ifdef DIAGNOSTIC
-		if (add)
-			panic("nfs getcookie add at 0");
-#endif
+		KASSERT(!add, ("nfs getcookie add at 0"));
 		return (&nfs_nullcookie);
 	}
 	pos--;
Index: sys/fs/nfsclient/nfs_clsubs.c
===================================================================
--- sys/fs/nfsclient/nfs_clsubs.c	(revision 208960)
+++ sys/fs/nfsclient/nfs_clsubs.c	(working copy)
@@ -282,10 +282,7 @@
 
 	pos = (uoff_t)off / NFS_DIRBLKSIZ;
 	if (pos == 0 || off < 0) {
-#ifdef DIAGNOSTIC
-		if (add)
-			panic("nfs getcookie add at <= 0");
-#endif
+		KASSERT(!add, ("nfs getcookie add at <= 0"));
 		return (&nfs_nullcookie);
 	}
 	pos--;
@@ -336,10 +333,7 @@
 {
 	struct nfsnode *np = VTONFS(vp);
 
-#ifdef DIAGNOSTIC
-	if (vp->v_type != VDIR)
-		panic("nfs: invaldir not dir");
-#endif
+	KASSERT(vp->v_type == VDIR, ("nfs: invaldir not dir"));
 	ncl_dircookie_lock(np);
 	np->n_direofoffset = 0;
 	np->n_cookieverf.nfsuquad[0] = 0;
Index: sys/fs/nfsclient/nfs_clvnops.c
===================================================================
--- sys/fs/nfsclient/nfs_clvnops.c	(revision 208960)
+++ sys/fs/nfsclient/nfs_clvnops.c	(working copy)
@@ -1564,12 +1564,8 @@
 	int error = 0;
 	struct vattr vattr;
 
-#ifndef DIAGNOSTIC
-	if ((cnp->cn_flags & HASBUF) == 0)
-		panic("nfs_remove: no name");
-	if (vrefcnt(vp) < 1)
-		panic("nfs_remove: bad v_usecount");
-#endif
+	KASSERT((cnp->cn_flags & HASBUF) != 0, ("nfs_remove: no name"));
+	KASSERT(vrefcnt(vp) > 0, ("nfs_remove: bad v_usecount"));
 	if (vp->v_type == VDIR)
 		error = EPERM;
 	else if (vrefcnt(vp) == 1 || (np->n_sillyrename &&
@@ -1676,11 +1672,9 @@
 	struct nfsv4node *newv4 = NULL;
 	int error;
 
-#ifndef DIAGNOSTIC
-	if ((tcnp->cn_flags & HASBUF) == 0 ||
-	    (fcnp->cn_flags & HASBUF) == 0)
-		panic("nfs_rename: no name");
-#endif
+	KASSERT((tcnp->cn_flags & HASBUF) != 0 &&
+	    (fcnp->cn_flags & HASBUF) != 0,
+	    ("nfs_rename: no name"));
 	/* Check for cross-device rename */
 	if ((fvp->v_mount != tdvp->v_mount) ||
 	    (tvp && (fvp->v_mount != tvp->v_mount))) {
@@ -2137,11 +2131,10 @@
 	struct nfsmount *nmp = VFSTONFS(vp->v_mount);
 	int error = 0, eof, attrflag;
 
-#ifndef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
-	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
-		panic("nfs readdirrpc bad uio");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1 &&
+	    (uiop->uio_offset & (DIRBLKSIZ - 1)) == 0 &&
+	    (uiop->uio_resid & (DIRBLKSIZ - 1)) == 0,
+	    ("nfs readdirrpc bad uio"));
 
 	/*
 	 * If there is no cookie, assume directory was stale.
@@ -2198,11 +2191,10 @@
 	struct nfsmount *nmp = VFSTONFS(vp->v_mount);
 	int error = 0, attrflag, eof;
 
-#ifndef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1 || (uiop->uio_offset & (DIRBLKSIZ - 1)) ||
-	    (uiop->uio_resid & (DIRBLKSIZ - 1)))
-		panic("nfs readdirplusrpc bad uio");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1 &&
+	    (uiop->uio_offset & (DIRBLKSIZ - 1)) == 0 &&
+	    (uiop->uio_resid & (DIRBLKSIZ - 1)) == 0,
+	    ("nfs readdirplusrpc bad uio"));
 
 	/*
 	 * If there is no cookie, assume directory was stale.
@@ -2264,10 +2256,7 @@
 
 	cache_purge(dvp);
 	np = VTONFS(vp);
-#ifndef DIAGNOSTIC
-	if (vp->v_type == VDIR)
-		panic("nfs: sillyrename dir");
-#endif
+	KASSERT(vp->v_type != VDIR, ("nfs: sillyrename dir"));
 	MALLOC(sp, struct sillyrename *, sizeof (struct sillyrename),
 	    M_NEWNFSREQ, M_WAITOK);
 	sp->s_cred = crhold(cnp->cn_cred);
Index: sys/fs/nfsclient/nfs_clrpcops.c
===================================================================
--- sys/fs/nfsclient/nfs_clrpcops.c	(revision 208960)
+++ sys/fs/nfsclient/nfs_clrpcops.c	(working copy)
@@ -1445,10 +1445,7 @@
 	struct nfsrv_descript *nd = &nfsd;
 	nfsattrbit_t attrbits;
 
-#ifdef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1)
-		panic("nfs: writerpc iovcnt > 1");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1, ("nfs: writerpc iovcnt > 1"));
 	*attrflagp = 0;
 	tsiz = uio_uio_resid(uiop);
 	NFSLOCKMNT(nmp);
@@ -2501,10 +2498,9 @@
 	u_int32_t *tl2 = NULL;
 	size_t tresid;
 
-#ifdef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1 || (uio_uio_resid(uiop) & (DIRBLKSIZ - 1)))
-		panic("nfs readdirrpc bad uio");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1 &&
+	    (uio_uio_resid(uiop) & (DIRBLKSIZ - 1)) == 0,
+	    ("nfs readdirrpc bad uio"));
 
 	/*
 	 * There is no point in reading a lot more than uio_resid, however
@@ -2939,10 +2935,9 @@
 	size_t tresid;
 	u_int32_t *tl2 = NULL, fakefileno = 0xffffffff, rderr;
 
-#ifdef DIAGNOSTIC
-	if (uiop->uio_iovcnt != 1 || (uio_uio_resid(uiop) & (DIRBLKSIZ - 1)))
-		panic("nfs readdirplusrpc bad uio");
-#endif
+	KASSERT(uiop->uio_iovcnt == 1 &&
+	    (uio_uio_resid(uiop) & (DIRBLKSIZ - 1)) == 0,
+	    ("nfs readdirplusrpc bad uio"));
 	*attrflagp = 0;
 	if (eofp != NULL)
 		*eofp = 0;
Index: sys/fs/nfsserver/nfs_nfsdsocket.c
===================================================================
--- sys/fs/nfsserver/nfs_nfsdsocket.c	(revision 208960)
+++ sys/fs/nfsserver/nfs_nfsdsocket.c	(working copy)
@@ -364,10 +364,7 @@
 	 * Get a locked vnode for the first file handle
 	 */
 	if (!(nd->nd_flag & ND_NFSV4)) {
-#ifdef DIAGNOSTIC
-		if (nd->nd_repstat)
-			panic("nfsrvd_dorpc");
-#endif
+		KASSERT(nd->nd_repstat == 0, ("nfsrvd_dorpc"));
 		/*
 		 * For NFSv3, if the malloc/mget allocation is near limits,
 		 * return NFSERR_DELAY.

--=-=-=--
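A footnote to the thread above: the conversion pattern is easy to try in
isolation. Below is a minimal user-space sketch, not part of the
committed patch; the KASSERT stand-in macro and all names except the
borrowed HASBUF/cn_flags are invented for illustration. It shows the
idiom the patch settles on, with the non-boolean flag test compared
explicitly against 0 as style(9) asks:

#include <stdio.h>
#include <stdlib.h>

/*
 * User-space stand-in for the kernel's KASSERT(exp, msg): the second
 * argument is a parenthesized printf()-style argument list, so a failed
 * assertion prints the message and aborts, roughly what panic() did in
 * the old #ifdef DIAGNOSTIC blocks.
 */
#define	KASSERT(exp, msg) do {						\
	if (!(exp)) {							\
		printf msg;						\
		printf("\n");						\
		abort();						\
	}								\
} while (0)

#define	HASBUF	0x0400		/* stand-in for the namei(9) flag */

int
main(void)
{
	int cn_flags = HASBUF;	/* invented example value */

	/*
	 * style(9) form: cn_flags & HASBUF is not a boolean, so it is
	 * compared explicitly with 0 rather than tested with '!'.
	 */
	KASSERT((cn_flags & HASBUF) != 0, ("example: no name"));
	printf("assertion holds\n");
	return (0);
}

With a false condition the program prints the message and aborts; in the
kernel the real macro instead compiles away entirely unless the kernel
is built with options INVARIANTS, which is why the patch can drop the
#ifdef scaffolding around each check.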