From owner-freebsd-fs@FreeBSD.ORG  Sun Jun 21 08:21:04 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 82CAF395
 for <freebsd-fs@hub.freebsd.org>; Sun, 21 Jun 2015 08:21:04 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6CFF1D8B
 for <freebsd-fs@FreeBSD.org>; Sun, 21 Jun 2015 08:21:04 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t5L8L4MM012752
 for <freebsd-fs@FreeBSD.org>; Sun, 21 Jun 2015 08:21:04 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 198242] [zfs] L2ARC degraded. Checksum errors, I/O errors
Date: Sun, 21 Jun 2015 08:21:01 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.1-RELEASE
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: avg@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: avg@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: assigned_to
Message-ID: <bug-198242-3630-nAORFIXgnQ@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-198242-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-198242-3630@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 21 Jun 2015 08:21:04 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242

Andriy Gapon <avg@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-fs@FreeBSD.org      |avg@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun 21 14:01:28 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id C8E52DDE
 for <freebsd-fs@hub.freebsd.org>; Sun, 21 Jun 2015 14:01:28 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from smtp.digiware.nl (smtp.digiware.nl [31.223.170.169])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6DB9D8B3
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 14:01:28 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from rack1.digiware.nl (unknown [127.0.0.1])
 by smtp.digiware.nl (Postfix) with ESMTP id E453516A407;
 Sun, 21 Jun 2015 16:01:23 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from smtp.digiware.nl ([127.0.0.1])
 by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id oMI3wgyVHyb9; Sun, 21 Jun 2015 16:00:53 +0200 (CEST)
Received: from [IPv6:2001:4cb8:3:1:ccdf:1bc4:d42f:fddb] (unknown
 [IPv6:2001:4cb8:3:1:ccdf:1bc4:d42f:fddb])
 by smtp.digiware.nl (Postfix) with ESMTPA id AF83016A402;
 Sun, 21 Jun 2015 16:00:53 +0200 (CEST)
Message-ID: <5586C396.9010100@digiware.nl>
Date: Sun, 21 Jun 2015 16:00:54 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Daryl Richards <daryl@isletech.net>, freebsd-fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <558590BD.40603@isletech.net>
In-Reply-To: <558590BD.40603@isletech.net>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 21 Jun 2015 14:01:28 -0000

On 20/06/2015 18:11, Daryl Richards wrote:
> Check the failmode setting on your pool. From man zpool:
> 
>        failmode=wait | continue | panic
> 
>            Controls the system behavior in the event of catastrophic
> pool failure.  This  condition  is  typically  a
>            result  of  a  loss of connectivity to the underlying storage
> device(s) or a failure of all devices within
>            the pool. The behavior of such an event is determined as
> follows:
> 
>            wait        Blocks all I/O access until the device
> connectivity is recovered and the errors  are  cleared.
>                        This is the default behavior.
> 
>            continue    Returns  EIO  to  any  new write I/O requests but
> allows reads to any of the remaining healthy
>                        devices. Any write requests that have yet to be
> committed to disk would be blocked.
> 
>            panic       Prints out a message to the console and generates
> a system crash dump.

'mmm

Did not know about this setting. Nice one, but alas my current setting is:
zfsboot  failmode         wait                           default
zfsraid  failmode         wait                           default

So either the setting is not working, or something else is up?
Is waiting only meant to wait a limited time? And then panic anyways?

But then still I wonder why even in the 'continue'-case the ZFS system
ends in a state where the filesystem is not able to continue in its
standard functioning ( read and write ) and disconnects the disk???

All failmode settings result in a seriously handicapped system...
On a raidz2 system I would perhaps expected this to occur when the
second disk goes into thin space??

The other question is: The man page talks about
'Controls the system behavior in the event of catastrophic pool failure'
And is a hung disk a 'catastrophic pool failure'?

Still very puzzled?

--WjW

> 
> 
> On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
>> Hi,
>>
>> Found my system rebooted this morning:
>>
>> Jun 20 05:28:33 zfs kernel: sonewconn: pcb 0xfffff8011b6da498: Listen
>> queue overflow: 8 already in queue awaiting acceptance (48 occurrences)
>> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid' appears to be
>> hung on vdev guid 18180224580327100979 at '/dev/da0'.
>> Jun 20 05:28:33 zfs kernel: cpuid = 0
>> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
>> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
>> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>
>> Which leads me to believe that /dev/da0 went out on vacation, leaving
>> ZFS into trouble.... But the array is:
>> ----
>> NAME               SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP
>> zfsraid           32.5T  13.3T  19.2T         -     7%    41%  1.00x
>> ONLINE  -
>>    raidz2          16.2T  6.67T  9.58T         -     8%    41%
>>      da0               -      -      -         -      -      -
>>      da1               -      -      -         -      -      -
>>      da2               -      -      -         -      -      -
>>      da3               -      -      -         -      -      -
>>      da4               -      -      -         -      -      -
>>      da5               -      -      -         -      -      -
>>    raidz2          16.2T  6.67T  9.58T         -     7%    41%
>>      da6               -      -      -         -      -      -
>>      da7               -      -      -         -      -      -
>>      ada4              -      -      -         -      -      -
>>      ada5              -      -      -         -      -      -
>>      ada6              -      -      -         -      -      -
>>      ada7              -      -      -         -      -      -
>>    mirror           504M  1.73M   502M         -    39%     0%
>>      gpt/log0          -      -      -         -      -      -
>>      gpt/log1          -      -      -         -      -      -
>> cache                 -      -      -      -      -      -
>>    gpt/raidcache0   109G  1.34G   107G         -     0%     1%
>>    gpt/raidcache1   109G   787M   108G         -     0%     0%
>> ----
>>
>> And thus I'd would have expected that ZFS would disconnect /dev/da0 and
>> then switch to DEGRADED state and continue, letting the operator fix the
>> broken disk.
>> Instead it chooses to panic, which is not a nice thing to do. :)
>>
>> Or do I have to high hopes of ZFS?
>>
>> Next question to answer is why this WD RED on:
>>
>> arcmsr0@pci0:7:14:0:    class=0x010400 card=0x112017d3 chip=0x112017d3
>> rev=0x00 hdr=0x00
>>      vendor     = 'Areca Technology Corp.'
>>      device     = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
>>      class      = mass storage
>>      subclass   = RAID
>>
>> got hung, and nothing for this shows in SMART....


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun 21 14:30:44 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 826EEA1E
 for <freebsd-fs@hub.freebsd.org>; Sun, 21 Jun 2015 14:30:44 +0000 (UTC)
 (envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca
 [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id ED166F48
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 14:30:43 +0000 (UTC)
 (envelope-from rmacklem@uoguelph.ca)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A2D5BABUyYZV/95baINbg2RfBoMYvEkKhS5KAoFYEQEBAQEBAQGBCoQiAQEBAwEBAQEgBCcgCwUWGBEZAgQlAQkmBggHBAEcBIgGCA2xJ5V1AQEBAQEBAQMBAQEBAQEBAQEZi0WENAEBBRcZGweCaIFDBYwOh2+CJIIyhC+EA0GDTIgoikImY4FZgVkiMQeBBTqBAgEBAQ
X-IronPort-AV: E=Sophos;i="5.13,654,1427774400"; d="scan'208";a="219649125"
Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
 ([131.104.91.222])
 by esa-annu.net.uoguelph.ca with ESMTP; 21 Jun 2015 10:30:35 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
 by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id ED97AB3F84;
 Sun, 21 Jun 2015 10:30:34 -0400 (EDT)
Date: Sun, 21 Jun 2015 10:30:34 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "alex.burlyga.ietf alex.burlyga.ietf" <alex.burlyga.ietf@gmail.com>
Cc: freebsd-fs@freebsd.org
Message-ID: <1969046464.61534041.1434897034960.JavaMail.root@uoguelph.ca>
In-Reply-To: <CA+JhTNTSC-xPVdpUGcQemVMLUwuQB6D8-3d2HD6WjU+jd1SMNQ@mail.gmail.com>
Subject: Re: [nfs][client] - Question about handling of the NFS3_EEXIST
 error in SYMLINK rpc
MIME-Version: 1.0
Content-Type: multipart/mixed; 
 boundary="----=_Part_61534039_1057484672.1434897034958"
X-Originating-IP: [172.17.95.10]
X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 21 Jun 2015 14:30:44 -0000

------=_Part_61534039_1057484672.1434897034958
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Alex Burlyga wrote:
> Hi,
> 
> NFS client code in nfsrpc_symlink() masks server returned NFS3_EEXIST
> error
> code
> by returning 0 to the upper layers. I'm assuming this was an attempt
> to
> work around
> some server's broken replay cache out there, however, it breaks a
> more
> common
> case where server is returning EEXIST for legitimate reason and
> application
> is expecting this error code and equipped to deal with it.
> 
> To fix it I see three ways of doing this:
>  * Remove offending code
>  * Make it optional, sysctl?
>  * On NFS3_EEXIST send READLINK rpc to make sure symlink content is
>  right
> 
> Which of the ways will maximize the chances of getting this fix
> upstream?
> 
I've attached a patch for testing/review that does essentially #2.
It has no effect on trivial tests, since the syscall does a Lookup
before trying to create the symlink and fails with EEXIST.
Do you have a case where competing clients are trying to create
the symlink or something like that, which runs into this?

Please test the attached patch, since I don't know how to do that, rick

> One more point, old client circa FreeBSD 7.0 does not exhibit this
> problem.
> 
> Alex
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> 

------=_Part_61534039_1057484672.1434897034958
Content-Type: text/x-patch; name=eexist.patch
Content-Disposition: attachment; filename=eexist.patch
Content-Transfer-Encoding: base64

LS0tIGZzL25mc2NsaWVudC9uZnNfY2xycGNvcHMuYy5zYXYyCTIwMTUtMDYtMjEgMDk6Mjc6Mzgu
NjQwOTQ3MDAwIC0wNDAwCisrKyBmcy9uZnNjbGllbnQvbmZzX2NscnBjb3BzLmMJMjAxNS0wNi0y
MSAwOTo1Mzo0Mi43MjMwODUwMDAgLTA0MDAKQEAgLTQ2LDYgKzQ2LDEzIEBAIF9fRkJTRElEKCIk
RnJlZUJTRDogaGVhZC9zeXMvZnMvbmZzY2xpZW4KICNpbmNsdWRlICJvcHRfaW5ldDYuaCIKIAog
I2luY2x1ZGUgPGZzL25mcy9uZnNwb3J0Lmg+CisjaW5jbHVkZSA8c3lzL3N5c2N0bC5oPgorCitT
WVNDVExfREVDTChfdmZzX25mcyk7CisKK3N0YXRpYyBpbnQJbmZzaWdub3JlX2VleGlzdCA9IDA7
CitTWVNDVExfSU5UKF92ZnNfbmZzLCBPSURfQVVUTywgaWdub3JlX2VleGlzdCwgQ1RMRkxBR19S
VywKKyAgICAmbmZzaWdub3JlX2VleGlzdCwgMCwgIk5GUyBpZ25vcmUgRUVYSVNUIHJlcGxpZXMg
Zm9yIG1rZGlyL3N5bWxpbmsiKTsKIAogLyoKICAqIEdsb2JhbCB2YXJpYWJsZXMKQEAgLTI1MzAs
OCArMjUzNywxMiBAQCBuZnNycGNfc3ltbGluayh2bm9kZV90IGR2cCwgY2hhciAqbmFtZSwgCiAJ
bWJ1Zl9mcmVlbShuZC0+bmRfbXJlcCk7CiAJLyoKIAkgKiBLbHVkZ2U6IE1hcCBFRVhJU1QgPT4g
MCBhc3N1bWluZyB0aGF0IGl0IGlzIGEgcmVwbHkgdG8gYSByZXRyeS4KKwkgKiBPbmx5IGRvIHRo
aXMgaWYgdmZzLm5mcy5pZ25vcmVfZWV4aXN0IGlzIHNldC4KKwkgKiBOZXZlciBkbyB0aGlzIGZv
ciBORlN2NC4xIG9yIGxhdGVyIG1pbm9yIHZlcnNpb25zLCBzaW5jZSBzZXNzaW9ucworCSAqIHNo
b3VsZCBndWFyYW50ZWUgImV4YWN0bHkgb25jZSIgUlBDIHNlbWFudGljcy4KIAkgKi8KLQlpZiAo
ZXJyb3IgPT0gRUVYSVNUKQorCWlmIChlcnJvciA9PSBFRVhJU1QgJiYgbmZzaWdub3JlX2VleGlz
dCAhPSAwICYmICghTkZTSEFTTkZTVjQobm1wKSB8fAorCSAgICBubXAtPm5tX21pbm9ydmVycyA9
PSAwKSkKIAkJZXJyb3IgPSAwOwogCXJldHVybiAoZXJyb3IpOwogfQpAQCAtMjU1MCwxMCArMjU2
MSwxMiBAQCBuZnNycGNfbWtkaXIodm5vZGVfdCBkdnAsIGNoYXIgKm5hbWUsIGluCiAJbmZzYXR0
cmJpdF90IGF0dHJiaXRzOwogCWludCBlcnJvciA9IDA7CiAJc3RydWN0IG5mc2ZoICpmaHA7CisJ
c3RydWN0IG5mc21vdW50ICpubXA7CiAKIAkqbmZocHAgPSBOVUxMOwogCSphdHRyZmxhZ3AgPSAw
OwogCSpkYXR0cmZsYWdwID0gMDsKKwlubXAgPSBWRlNUT05GUyh2bm9kZV9tb3VudChkdnApKTsK
IAlmaHAgPSBWVE9ORlMoZHZwKS0+bl9maHA7CiAJaWYgKG5hbWVsZW4gPiBORlNfTUFYTkFNTEVO
KQogCQlyZXR1cm4gKEVOQU1FVE9PTE9ORyk7CkBAIC0yNjA1LDkgKzI2MTgsMTMgQEAgbmZzcnBj
X21rZGlyKHZub2RlX3QgZHZwLCBjaGFyICpuYW1lLCBpbgogbmZzbW91dDoKIAltYnVmX2ZyZWVt
KG5kLT5uZF9tcmVwKTsKIAkvKgotCSAqIEtsdWRnZTogTWFwIEVFWElTVCA9PiAwIGFzc3VtaW5n
IHRoYXQgeW91IGhhdmUgYSByZXBseSB0byBhIHJldHJ5LgorCSAqIEtsdWRnZTogTWFwIEVFWElT
VCA9PiAwIGFzc3VtaW5nIHRoYXQgaXQgaXMgYSByZXBseSB0byBhIHJldHJ5LgorCSAqIE9ubHkg
ZG8gdGhpcyBpZiB2ZnMubmZzLmlnbm9yZV9lZXhpc3QgaXMgc2V0LgorCSAqIE5ldmVyIGRvIHRo
aXMgZm9yIE5GU3Y0LjEgb3IgbGF0ZXIgbWlub3IgdmVyc2lvbnMsIHNpbmNlIHNlc3Npb25zCisJ
ICogc2hvdWxkIGd1YXJhbnRlZSAiZXhhY3RseSBvbmNlIiBSUEMgc2VtYW50aWNzLgogCSAqLwot
CWlmIChlcnJvciA9PSBFRVhJU1QpCisJaWYgKGVycm9yID09IEVFWElTVCAmJiBuZnNpZ25vcmVf
ZWV4aXN0ICE9IDAgJiYgKCFORlNIQVNORlNWNChubXApIHx8CisJICAgIG5tcC0+bm1fbWlub3J2
ZXJzID09IDApKQogCQllcnJvciA9IDA7CiAJcmV0dXJuIChlcnJvcik7CiB9Cg==
------=_Part_61534039_1057484672.1434897034958--

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun 21 18:24:02 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 4A1C3FB4
 for <freebsd-fs@hub.freebsd.org>; Sun, 21 Jun 2015 18:24:02 +0000 (UTC)
 (envelope-from nobody@ws1.emirates.net.ae)
Received: from dsrmail2.emirates.net.ae (dsrmail2.emirates.net.ae
 [194.170.201.252])
 by mx1.freebsd.org (Postfix) with ESMTP id 9F37ABD3
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 18:24:00 +0000 (UTC)
 (envelope-from nobody@ws1.emirates.net.ae)
Received: from ws1.emirates.net.ae ([194.170.187.5])
 by dsrmail2.emirates.net.ae (I&ES Mail Server 4.2)
 with ESMTP id <0NQB00GHH4FXM4C0@dsrmail2.emirates.net.ae> for
 freebsd-fs@freebsd.org; Sun, 21 Jun 2015 22:23:57 +0400 (GST)
Received: from ws1.emirates.net.ae (localhost [127.0.0.1])
 by ws1.emirates.net.ae (8.14.5+Sun/8.14.5) with ESMTP id t5LINvkG023894	for
 <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 22:23:57 +0400 (GST)
Received: (from nobody@localhost)
 by ws1.emirates.net.ae (8.14.5+Sun/8.14.5/Submit) id t5LINvph023890;	Sun,
 21 Jun 2015 22:23:57 +0400 (GST)
To: freebsd-fs@freebsd.org
Subject: Notice to Appear
Date: Sun, 21 Jun 2015 22:23:57 +0400
From: State Court <karl.weber@wmc-e.ae>
Reply-to: State Court <karl.weber@wmc-e.ae>
Message-id: <134a52e2ce858f5170de33ee9f15f858@wmc-e.ae>
X-Priority: 3
MIME-version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 21 Jun 2015 18:24:02 -0000

Notice to Appear,

You have to appear in the Court on the June 25.
Please, do not forget to bring all the documents related to the case.
Note: The case will be heard by the judge in your absence if you do not come.

The copy of Court Notice is attached to this email.

Yours faithfully,
Karl Weber,
Court Secretary.


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun 21 19:50:18 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 691AA10A
 for <freebsd-fs@hub.freebsd.org>; Sun, 21 Jun 2015 19:50:18 +0000 (UTC)
 (envelope-from thomasrcurry@gmail.com)
Received: from mail-oi0-x230.google.com (mail-oi0-x230.google.com
 [IPv6:2607:f8b0:4003:c06::230])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 2E9B097
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 19:50:18 +0000 (UTC)
 (envelope-from thomasrcurry@gmail.com)
Received: by oigx81 with SMTP id x81so109591111oig.1
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 12:50:17 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=L8a9ADRzLN8Rsm2g877jh9MfaJbptDYU8BEsnPlKCmo=;
 b=flwiczJEzUQ3ryLHHdZayrlC1l4zhvkmUgM5tR6I9OE1a7FSP98eTYB+P55eVtcBa1
 F8eigYnL2EsQ6dex2Vkik2tLcXUWjL31tOYHahb/fRU1j6qSgvb4OfYy9wXOTGiWzUYT
 O4vFdTPBIxxDq/ffWg+mxxN62j4A2fX5dfKrRCm4k+FbKIo4GsPg6xBoEGngkXmjNGJd
 Tc6qbkThjmZt7igPPCqvtelOM2Tg4cipooZ9/Osp3DBuKUPZpPh+WmCyiJpGtNl0iErL
 k7T8pt58E+Icl8nbQWAZ+I17JwYiIERsiV3wGKvfDbMEH9tGIwJtaZy1SHCEDWqyJEzI
 LJrQ==
MIME-Version: 1.0
X-Received: by 10.60.155.132 with SMTP id vw4mr8044581oeb.51.1434916217248;
 Sun, 21 Jun 2015 12:50:17 -0700 (PDT)
Received: by 10.202.77.138 with HTTP; Sun, 21 Jun 2015 12:50:17 -0700 (PDT)
In-Reply-To: <5586C396.9010100@digiware.nl>
References: <5585767B.4000206@digiware.nl> <558590BD.40603@isletech.net>
 <5586C396.9010100@digiware.nl>
Date: Sun, 21 Jun 2015 15:50:17 -0400
Message-ID: <CAGtEZUAO5-rBoz0YBcYfvZ6tx_sj0MEFuxGSYk+z0XHrJySk2A@mail.gmail.com>
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
From: Tom Curry <thomasrcurry@gmail.com>
To: Willem Jan Withagen <wjw@digiware.nl>
Cc: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 21 Jun 2015 19:50:18 -0000

Was there by chance a lot of disk activity going on when this occurred?

On Sun, Jun 21, 2015 at 10:00 AM, Willem Jan Withagen <wjw@digiware.nl>
wrote:

> On 20/06/2015 18:11, Daryl Richards wrote:
> > Check the failmode setting on your pool. From man zpool:
> >
> >        failmode=wait | continue | panic
> >
> >            Controls the system behavior in the event of catastrophic
> > pool failure.  This  condition  is  typically  a
> >            result  of  a  loss of connectivity to the underlying storage
> > device(s) or a failure of all devices within
> >            the pool. The behavior of such an event is determined as
> > follows:
> >
> >            wait        Blocks all I/O access until the device
> > connectivity is recovered and the errors  are  cleared.
> >                        This is the default behavior.
> >
> >            continue    Returns  EIO  to  any  new write I/O requests but
> > allows reads to any of the remaining healthy
> >                        devices. Any write requests that have yet to be
> > committed to disk would be blocked.
> >
> >            panic       Prints out a message to the console and generates
> > a system crash dump.
>
> 'mmm
>
> Did not know about this setting. Nice one, but alas my current setting is:
> zfsboot  failmode         wait                           default
> zfsraid  failmode         wait                           default
>
> So either the setting is not working, or something else is up?
> Is waiting only meant to wait a limited time? And then panic anyways?
>
> But then still I wonder why even in the 'continue'-case the ZFS system
> ends in a state where the filesystem is not able to continue in its
> standard functioning ( read and write ) and disconnects the disk???
>
> All failmode settings result in a seriously handicapped system...
> On a raidz2 system I would perhaps expected this to occur when the
> second disk goes into thin space??
>
> The other question is: The man page talks about
> 'Controls the system behavior in the event of catastrophic pool failure'
> And is a hung disk a 'catastrophic pool failure'?
>
> Still very puzzled?
>
> --WjW
>
> >
> >
> > On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
> >> Hi,
> >>
> >> Found my system rebooted this morning:
> >>
> >> Jun 20 05:28:33 zfs kernel: sonewconn: pcb 0xfffff8011b6da498: Listen
> >> queue overflow: 8 already in queue awaiting acceptance (48 occurrences)
> >> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid' appears to be
> >> hung on vdev guid 18180224580327100979 at '/dev/da0'.
> >> Jun 20 05:28:33 zfs kernel: cpuid = 0
> >> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
> >> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
> >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> >>
> >> Which leads me to believe that /dev/da0 went out on vacation, leaving
> >> ZFS into trouble.... But the array is:
> >> ----
> >> NAME               SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP
> >> zfsraid           32.5T  13.3T  19.2T         -     7%    41%  1.00x
> >> ONLINE  -
> >>    raidz2          16.2T  6.67T  9.58T         -     8%    41%
> >>      da0               -      -      -         -      -      -
> >>      da1               -      -      -         -      -      -
> >>      da2               -      -      -         -      -      -
> >>      da3               -      -      -         -      -      -
> >>      da4               -      -      -         -      -      -
> >>      da5               -      -      -         -      -      -
> >>    raidz2          16.2T  6.67T  9.58T         -     7%    41%
> >>      da6               -      -      -         -      -      -
> >>      da7               -      -      -         -      -      -
> >>      ada4              -      -      -         -      -      -
> >>      ada5              -      -      -         -      -      -
> >>      ada6              -      -      -         -      -      -
> >>      ada7              -      -      -         -      -      -
> >>    mirror           504M  1.73M   502M         -    39%     0%
> >>      gpt/log0          -      -      -         -      -      -
> >>      gpt/log1          -      -      -         -      -      -
> >> cache                 -      -      -      -      -      -
> >>    gpt/raidcache0   109G  1.34G   107G         -     0%     1%
> >>    gpt/raidcache1   109G   787M   108G         -     0%     0%
> >> ----
> >>
> >> And thus I'd would have expected that ZFS would disconnect /dev/da0 and
> >> then switch to DEGRADED state and continue, letting the operator fix the
> >> broken disk.
> >> Instead it chooses to panic, which is not a nice thing to do. :)
> >>
> >> Or do I have to high hopes of ZFS?
> >>
> >> Next question to answer is why this WD RED on:
> >>
> >> arcmsr0@pci0:7:14:0:    class=0x010400 card=0x112017d3 chip=0x112017d3
> >> rev=0x00 hdr=0x00
> >>      vendor     = 'Areca Technology Corp.'
> >>      device     = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
> >>      class      = mass storage
> >>      subclass   = RAID
> >>
> >> got hung, and nothing for this shows in SMART....
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 00:43:27 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id CF390506
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 00:43:27 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from hub.freebsd.org (hub.freebsd.org
 [IPv6:2001:1900:2254:206c::16:88])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "hub.freebsd.org", Issuer "hub.freebsd.org" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 4727396B
 for <freebsd-fs@FreeBSD.ORG>; Mon, 22 Jun 2015 00:11:02 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: by hub.freebsd.org (Postfix)
 id 2BA80272; Mon, 22 Jun 2015 00:11:02 +0000 (UTC)
Delivered-To: fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2876A270
 for <fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 00:11:02 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from smtp.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 60DD61F6F
 for <fs@freebsd.org>; Mon, 22 Jun 2015 00:10:28 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from rack1.digiware.nl (unknown [127.0.0.1])
 by smtp.digiware.nl (Postfix) with ESMTP id 1E81516A409;
 Mon, 22 Jun 2015 01:30:56 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from smtp.digiware.nl ([127.0.0.1])
 by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id q1W7ZvfcA2rj; Mon, 22 Jun 2015 01:30:45 +0200 (CEST)
Received: from [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69] (unknown
 [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69])
 by smtp.digiware.nl (Postfix) with ESMTPA id 7CF3916A407;
 Mon, 22 Jun 2015 01:30:45 +0200 (CEST)
Message-ID: <55874927.80807@digiware.nl>
Date: Mon, 22 Jun 2015 01:30:47 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Quartz <quartz@sneakertech.com>
CC: fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com>
In-Reply-To: <5587236A.6020404@sneakertech.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 00:43:27 -0000

On 21/06/2015 22:49, Quartz wrote:
> Also:
> 
>> And thus I'd would have expected that ZFS would disconnect /dev/da0 and
>> then switch to DEGRADED state and continue, letting the operator fix the
>> broken disk.
> 
>> Next question to answer is why this WD RED on:
> 
>> got hung, and nothing for this shows in SMART....
> 
> You have a raidz2, which means THREE disks need to go down before the
> pool is unwritable. The problem is most likely your controller or power
> supply, not your disks.

But still I would expect the volume to become degraded if one of the
disks goes into the error state? It is real nice that it is still
'raidz1' but it does need to get fixed...

> Also2: don't rely too much on SMART for determining drive health. Google
> released a paper a few years ago revealing that half of all drives die
> without reporting SMART errors.
> 
> http://research.google.com/archive/disk_failures.pdf

This article is mainly about forcasting disk failure based on SMART
numbers.... Because first "failures" in SMART do nor require one to
immediately replace the disk. The common idea is, if the numbers grow,
expect the device to break.

I was just looking at the counters to see if the disk had logged just
any fact of info/warning/error that could have anything to do with the
problem I have.

Thanx,
--WjW


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 00:46:48 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 27F0B571
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 00:46:48 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (unknown [IPv6:2607:f440::d144:5b3])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 00971F56
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 00:46:47 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id C03A33F70D
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 17:06:18 -0400 (EDT)
Message-ID: <5587274A.2020205@sneakertech.com>
Date: Sun, 21 Jun 2015 17:06:18 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com> <5584F83D.1040702@egr.msu.edu>
In-Reply-To: <5584F83D.1040702@egr.msu.edu>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 00:46:48 -0000

> man gvirstor which lets you create an arbitrarily large storage device
> backed by chunks of storage based on how much you are actually using.  I
> have not used it.

So, effectively a sparse disk image?

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 00:46:48 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 28457572
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 00:46:48 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (unknown [IPv6:2607:f440::d144:5b3])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 009C3F57
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 00:46:47 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 9371B3F715;
 Sun, 21 Jun 2015 20:28:27 -0400 (EDT)
Message-ID: <558756AB.405@sneakertech.com>
Date: Sun, 21 Jun 2015 20:28:27 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Willem Jan Withagen <wjw@digiware.nl>
CC: freebsd-fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <558590BD.40603@isletech.net>
 <5586C396.9010100@digiware.nl> <55871F4C.5010103@sneakertech.com>
 <55874772.4090607@digiware.nl>
In-Reply-To: <55874772.4090607@digiware.nl>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 00:46:48 -0000

> But especially the hung disk during reading

Writing is the issue moreso. At least, if you set your failmode to 
'continue' ZFS will to try to honor reads as long as it's able, but 
writes will block. (In practice though it'll usually only give you an 
extra minute or so before everything locks up).


> We'll the pool did not die, (at least not IMHO)

Sorry, that's bad wording on my part. What I meant was that IO to the 
pool died.


>just one disk stopt
> working....

It would have to be 3+ disks in your case, with a raidz2.


> I guess that if I like to live dangerously, I could set enabled to 0,
> and run the risk... ??

Well, that will just disable the auto panic. If the IO disappeared into 
a black hole due to a hardware issue the machine will just stay hung 
forever until you manually press the reset button on the front. ZFS will 
prevent any major corruption of the pool so it's not really "dangerous". 
(Outside of further hardware failures).


> But still I would expect the volume to become degraded if one of the
> disks goes into the error state?

If *one* of the disks drops out, yes. If a second drops out later, also 
yes, because ZFS can still handle IO to the pool. But as soon as that 
third disk drops out in a way that locks up IO, ZFS freezes.

For reference, I had a raidz2 test case with 6 drives. I could yank the 
sata cable off two of the drives and the pool would be marked as 
degraded, but as soon as I yanked that third drive everything froze. 
This is why I heavily suspect in your case that your controller or PSU 
is failing and dropping multiple disks at a time. The fact that the log 
reports da0 is probably just because that was the last disk ZFS tried to 
fall back on when they all dropped out at once.

Ideally, the system *should* handle this situation gracefully, but the 
reality is that it doesn't. If the last disk fails in a way that hangs 
IO, it takes the whole machine with it. No system configuration change 
can prevent this, not with how things are currently designed.


> This article is mainly about forcasting disk failure based on SMART
> numbers....

> I was just looking at the counters to see if the disk had logged just
> any fact of info/warning/error

What Google found out is that a lot of disks *don't* report errors or 
warnings before experiencing problems. In other words, SMART saying "all 
good" doesn't really mean much in practice, so you shouldn't really rely 
on it for diagnostics.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 00:57:18 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id C0A317A2
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 00:57:18 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from hub.freebsd.org (hub.freebsd.org [8.8.178.136])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "hub.freebsd.org", Issuer "hub.freebsd.org" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id A413D656
 for <freebsd-fs@FreeBSD.ORG>; Mon, 22 Jun 2015 00:57:18 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: by hub.freebsd.org (Postfix)
 id 99E837A1; Mon, 22 Jun 2015 00:57:18 +0000 (UTC)
Delivered-To: fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 996167A0
 for <fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 00:57:18 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6187A653;
 Mon, 22 Jun 2015 00:57:18 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
 [65.66.246.65])
 by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t5LL1tkE017983;
 Sun, 21 Jun 2015 16:01:55 -0500 (CDT)
Date: Sun, 21 Jun 2015 16:01:55 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Steve Wills <swills@freebsd.org>
cc: Willem Jan Withagen <wjw@digiware.nl>, fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
In-Reply-To: <20150620221431.GB26416@mouf.net>
Message-ID: <alpine.GSO.2.01.1506211558510.4186@freddy.simplesystems.org>
References: <5585767B.4000206@digiware.nl> <20150620221431.GB26416@mouf.net>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (blade.simplesystems.org [65.66.246.90]);
 Sun, 21 Jun 2015 16:01:56 -0500 (CDT)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 00:57:18 -0000

On Sat, 20 Jun 2015, Steve Wills wrote:
>> rev=0x00 hdr=0x00
>>     vendor     = 'Areca Technology Corp.'
>>     device     = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
>>     class      = mass storage
>>     subclass   = RAID
>
> You may be hitting the zfs deadman panic, which is triggered when the
> controller hangs. This can in some cases be caused by disks that die in unusual
> ways.

Notice that the RAID controller is a PCI-X device (shared parallel, 
not dedicated serial like PCIe).  The whole PCI backplane could have 
hung.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 00:57:22 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 06FE87D4
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 00:57:22 +0000 (UTC)
 (envelope-from thomasrcurry@gmail.com)
Received: from mail-oi0-x235.google.com (mail-oi0-x235.google.com
 [IPv6:2607:f8b0:4003:c06::235])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6D707668
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 00:57:21 +0000 (UTC)
 (envelope-from thomasrcurry@gmail.com)
Received: by oigb199 with SMTP id b199so70083251oig.3
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 17:57:20 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=ciGPUTCfvAP5nIIsgQvFQ5r+Kj4lpYwTsH7ktql4oqM=;
 b=gtI7WfHf0MvGceYSBQ2CdSVBJeq90e6790k7DGRE7psTjofmn/ayXMAZlOOdZ6/GwL
 mBnxmBgr1WmFDeb0army6aFnD/dRfibnlOpYuWWcd1++VzozWdleUpc6YYf7/LMGUi75
 GG0x0yDxp2x7xEeLcxBGj/X4OmKNAe3wpQg5RoBuJcqar6OSGK8XfQDGVBuqOZQFDUEt
 73oT9SqYjm1aVNUsJ9euxQCzCbsxWXrqq6nDdpN2JnF8Znpc/HJGysxIwdeFlMB3BkBq
 RWy5uAE2QZDbMxjZR4xHxROSYwagp+wSyC2s+GsbgDyF3hV6btTU6OobwWB5ATybDfAu
 dXKg==
MIME-Version: 1.0
X-Received: by 10.60.118.193 with SMTP id ko1mr7671514oeb.38.1434932779902;
 Sun, 21 Jun 2015 17:26:19 -0700 (PDT)
Received: by 10.202.77.138 with HTTP; Sun, 21 Jun 2015 17:26:19 -0700 (PDT)
In-Reply-To: <55874C8A.4090405@digiware.nl>
References: <5585767B.4000206@digiware.nl> <558590BD.40603@isletech.net>
 <5586C396.9010100@digiware.nl>
 <CAGtEZUAO5-rBoz0YBcYfvZ6tx_sj0MEFuxGSYk+z0XHrJySk2A@mail.gmail.com>
 <55873E1D.9010401@digiware.nl>
 <CAGtEZUBexzwjTGMXY+Mg5knNsC+f35TXhAqhL0vdOKoOUO1F3A@mail.gmail.com>
 <55874C8A.4090405@digiware.nl>
Date: Sun, 21 Jun 2015 20:26:19 -0400
Message-ID: <CAGtEZUBx=EwM7bx5VL+pn9AsV-66mAbXbCjQq23=E2zAJnEtMA@mail.gmail.com>
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
From: Tom Curry <thomasrcurry@gmail.com>
To: Willem Jan Withagen <wjw@digiware.nl>
Cc: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 00:57:22 -0000

Yes, currently I am not using the patch from that PR. But I have lowered
the ARC max size, I am confident if I left it default I would have panics
again.


On Sun, Jun 21, 2015 at 7:45 PM, Willem Jan Withagen <wjw@digiware.nl>
wrote:

> On 22/06/2015 01:34, Tom Curry wrote:
> > I asked because recently I had similar trouble. Lots of kernel panics,
> > sometimes they were just like yours, sometimes they were general
> > protection faults. But they would always occur when my nightly backups
> > took place where VMs on iSCSI zvol luns were read and then written over
> > smb to another pool on the same machine over 10GbE.
> >
> > I nearly went out of my mind trying to figure out what was going on,
> > I'll spare you the gory details, but I stumbled across this PR
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594 and as I read
>
> So this is "the Karl Denninger ZFS patch"....
> I tried to follow the discussion at the moment, keeping it in the back
> of my head.....
> I concluded that the ideas where sort of accepted, but a different
> solution was implemented?
>
> > through it little light bulbs starting coming on. Luckily it was easy
> > for me to reproduce the problem so I kicked off the backups and watched
> > the system memory. Wired would grow, ARC would shrink, and then the
> > system would start swapping. If I stopped the IO right then it would
> > recover after a while. But if I let it go it would always panic, and
> > half the time it would be the same message as yours. So I applied the
> > patch from that PR, rebooted, and kicked off the backup. No more panic.
> > Recently I rebuilt a vanilla kernel from stable/10 but explicitly set
> > vfs.zfs.arc_max to 24G (I have 32G) and ran my torture tests and it is
> > stable.
>
> So you've (almost) answered my question, but English is not my native
> language and hence my question for certainty: You did not add the patch
> to your recently build stable/10 kernel...
>
> > So I don't want to send you on a wild goose chase, but it's entirely
> > possible this problem you are having is not hardware related at all, but
> > is a memory starvation issue related to the ARC under periods of heavy
> > activity.
>
> Well rsync will do that for you... And since a few months I've also
> loaded some iSCSI zvols as remote disks to some windows stations.
>
> Your suggestions are highly appreciated. Especially since I do not have
> space PCI-X parts... (It the current hardware blows up, I'm getting
> monder new stuff.) So other than checking some cabling and likes there
> is very little I could swap.
>
> Thanx,
> --WjW
>
> > On Sun, Jun 21, 2015 at 6:43 PM, Willem Jan Withagen <wjw@digiware.nl
> > <mailto:wjw@digiware.nl>> wrote:
> >
> >     On 21/06/2015 21:50, Tom Curry wrote:
> >     > Was there by chance a lot of disk activity going on when this
> occurred?
> >
> >     Define 'a lot'??
> >     But very likely, since the system is also a backup location for
> several
> >     external service which backup thru rsync. And they can generate
> generate
> >     quite some traffic. Next to the fact that it also serves a NVR with a
> >     ZVOL trhu iSCSI...
> >
> >     --WjW
> >
> >     >
> >     > On Sun, Jun 21, 2015 at 10:00 AM, Willem Jan Withagen <
> wjw@digiware.nl <mailto:wjw@digiware.nl>
> >     > <mailto:wjw@digiware.nl <mailto:wjw@digiware.nl>>> wrote:
> >     >
> >     >     On 20/06/2015 18:11, Daryl Richards wrote:
> >     >     > Check the failmode setting on your pool. From man zpool:
> >     >     >
> >     >     >        failmode=wait | continue | panic
> >     >     >
> >     >     >            Controls the system behavior in the event of
> >     catastrophic
> >     >     > pool failure.  This  condition  is  typically  a
> >     >     >            result  of  a  loss of connectivity to the
> >     underlying storage
> >     >     > device(s) or a failure of all devices within
> >     >     >            the pool. The behavior of such an event is
> >     determined as
> >     >     > follows:
> >     >     >
> >     >     >            wait        Blocks all I/O access until the device
> >     >     > connectivity is recovered and the errors  are  cleared.
> >     >     >                        This is the default behavior.
> >     >     >
> >     >     >            continue    Returns  EIO  to  any  new write I/O
> >     requests but
> >     >     > allows reads to any of the remaining healthy
> >     >     >                        devices. Any write requests that have
> >     yet to be
> >     >     > committed to disk would be blocked.
> >     >     >
> >     >     >            panic       Prints out a message to the console
> >     and generates
> >     >     > a system crash dump.
> >     >
> >     >     'mmm
> >     >
> >     >     Did not know about this setting. Nice one, but alas my current
> >     >     setting is:
> >     >     zfsboot  failmode         wait
>  default
> >     >     zfsraid  failmode         wait
>  default
> >     >
> >     >     So either the setting is not working, or something else is up?
> >     >     Is waiting only meant to wait a limited time? And then panic
> >     anyways?
> >     >
> >     >     But then still I wonder why even in the 'continue'-case the
> >     ZFS system
> >     >     ends in a state where the filesystem is not able to continue
> >     in its
> >     >     standard functioning ( read and write ) and disconnects the
> >     disk???
> >     >
> >     >     All failmode settings result in a seriously handicapped
> system...
> >     >     On a raidz2 system I would perhaps expected this to occur when
> the
> >     >     second disk goes into thin space??
> >     >
> >     >     The other question is: The man page talks about
> >     >     'Controls the system behavior in the event of catastrophic
> >     pool failure'
> >     >     And is a hung disk a 'catastrophic pool failure'?
> >     >
> >     >     Still very puzzled?
> >     >
> >     >     --WjW
> >     >
> >     >     >
> >     >     >
> >     >     > On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
> >     >     >> Hi,
> >     >     >>
> >     >     >> Found my system rebooted this morning:
> >     >     >>
> >     >     >> Jun 20 05:28:33 zfs kernel: sonewconn: pcb
> >     0xfffff8011b6da498: Listen
> >     >     >> queue overflow: 8 already in queue awaiting acceptance (48
> >     >     occurrences)
> >     >     >> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid'
> >     appears
> >     >     to be
> >     >     >> hung on vdev guid 18180224580327100979 at '/dev/da0'.
> >     >     >> Jun 20 05:28:33 zfs kernel: cpuid = 0
> >     >     >> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
> >     >     >> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
> >     >     >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> >     >     >>
> >     >     >> Which leads me to believe that /dev/da0 went out on
> >     vacation, leaving
> >     >     >> ZFS into trouble.... But the array is:
> >     >     >> ----
> >     >     >> NAME               SIZE  ALLOC   FREE  EXPANDSZ   FRAG
> >     CAP  DEDUP
> >     >     >> zfsraid           32.5T  13.3T  19.2T         -     7%
> >     41%  1.00x
> >     >     >> ONLINE  -
> >     >     >>    raidz2          16.2T  6.67T  9.58T         -     8%
> 41%
> >     >     >>      da0               -      -      -         -      -
>   -
> >     >     >>      da1               -      -      -         -      -
>   -
> >     >     >>      da2               -      -      -         -      -
>   -
> >     >     >>      da3               -      -      -         -      -
>   -
> >     >     >>      da4               -      -      -         -      -
>   -
> >     >     >>      da5               -      -      -         -      -
>   -
> >     >     >>    raidz2          16.2T  6.67T  9.58T         -     7%
> 41%
> >     >     >>      da6               -      -      -         -      -
>   -
> >     >     >>      da7               -      -      -         -      -
>   -
> >     >     >>      ada4              -      -      -         -      -
>   -
> >     >     >>      ada5              -      -      -         -      -
>   -
> >     >     >>      ada6              -      -      -         -      -
>   -
> >     >     >>      ada7              -      -      -         -      -
>   -
> >     >     >>    mirror           504M  1.73M   502M         -    39%
>  0%
> >     >     >>      gpt/log0          -      -      -         -      -
>   -
> >     >     >>      gpt/log1          -      -      -         -      -
>   -
> >     >     >> cache                 -      -      -      -      -      -
> >     >     >>    gpt/raidcache0   109G  1.34G   107G         -     0%
>  1%
> >     >     >>    gpt/raidcache1   109G   787M   108G         -     0%
>  0%
> >     >     >> ----
> >     >     >>
> >     >     >> And thus I'd would have expected that ZFS would disconnect
> >     >     /dev/da0 and
> >     >     >> then switch to DEGRADED state and continue, letting the
> >     operator
> >     >     fix the
> >     >     >> broken disk.
> >     >     >> Instead it chooses to panic, which is not a nice thing to
> >     do. :)
> >     >     >>
> >     >     >> Or do I have to high hopes of ZFS?
> >     >     >>
> >     >     >> Next question to answer is why this WD RED on:
> >     >     >>
> >     >     >> arcmsr0@pci0:7:14:0:    class=0x010400 card=0x112017d3
> >     >     chip=0x112017d3
> >     >     >> rev=0x00 hdr=0x00
> >     >     >>      vendor     = 'Areca Technology Corp.'
> >     >     >>      device     = 'ARC-1120 8-Port PCI-X to SATA RAID
> >     Controller'
> >     >     >>      class      = mass storage
> >     >     >>      subclass   = RAID
> >     >     >>
> >     >     >> got hung, and nothing for this shows in SMART....
> >     >
> >     >     _______________________________________________
> >     >     freebsd-fs@freebsd.org <mailto:freebsd-fs@freebsd.org>
> >     <mailto:freebsd-fs@freebsd.org <mailto:freebsd-fs@freebsd.org>>
> >     mailing list
> >     >     http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >     >     To unsubscribe, send any mail to "
> freebsd-fs-unsubscribe@freebsd.org
> >     <mailto:freebsd-fs-unsubscribe@freebsd.org>
> >     >     <mailto:freebsd-fs-unsubscribe@freebsd.org
> >     <mailto:freebsd-fs-unsubscribe@freebsd.org>>"
> >     >
> >     >
> >
> >
>
>

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:05:32 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 13483AB6
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:05:32 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from smtp.digiware.nl (smtp.digiware.nl [31.223.170.169])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id AC4CEC99
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 01:05:31 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from rack1.digiware.nl (unknown [127.0.0.1])
 by smtp.digiware.nl (Postfix) with ESMTP id ACC1F16A40B;
 Mon, 22 Jun 2015 01:45:28 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from smtp.digiware.nl ([127.0.0.1])
 by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id QpwdSG2hukCH; Mon, 22 Jun 2015 01:45:16 +0200 (CEST)
Received: from [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69] (unknown
 [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69])
 by smtp.digiware.nl (Postfix) with ESMTPA id 03F8416A40A;
 Mon, 22 Jun 2015 01:45:16 +0200 (CEST)
Message-ID: <55874C8A.4090405@digiware.nl>
Date: Mon, 22 Jun 2015 01:45:14 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Tom Curry <thomasrcurry@gmail.com>
CC: freebsd-fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl>	<558590BD.40603@isletech.net>	<5586C396.9010100@digiware.nl>	<CAGtEZUAO5-rBoz0YBcYfvZ6tx_sj0MEFuxGSYk+z0XHrJySk2A@mail.gmail.com>	<55873E1D.9010401@digiware.nl>
 <CAGtEZUBexzwjTGMXY+Mg5knNsC+f35TXhAqhL0vdOKoOUO1F3A@mail.gmail.com>
In-Reply-To: <CAGtEZUBexzwjTGMXY+Mg5knNsC+f35TXhAqhL0vdOKoOUO1F3A@mail.gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:05:32 -0000

On 22/06/2015 01:34, Tom Curry wrote:
> I asked because recently I had similar trouble. Lots of kernel panics,
> sometimes they were just like yours, sometimes they were general
> protection faults. But they would always occur when my nightly backups
> took place where VMs on iSCSI zvol luns were read and then written over
> smb to another pool on the same machine over 10GbE. 
> 
> I nearly went out of my mind trying to figure out what was going on,
> I'll spare you the gory details, but I stumbled across this PR
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594 and as I read

So this is "the Karl Denninger ZFS patch"....
I tried to follow the discussion at the moment, keeping it in the back
of my head.....
I concluded that the ideas where sort of accepted, but a different
solution was implemented?

> through it little light bulbs starting coming on. Luckily it was easy
> for me to reproduce the problem so I kicked off the backups and watched
> the system memory. Wired would grow, ARC would shrink, and then the
> system would start swapping. If I stopped the IO right then it would
> recover after a while. But if I let it go it would always panic, and
> half the time it would be the same message as yours. So I applied the
> patch from that PR, rebooted, and kicked off the backup. No more panic.
> Recently I rebuilt a vanilla kernel from stable/10 but explicitly set
> vfs.zfs.arc_max to 24G (I have 32G) and ran my torture tests and it is
> stable. 

So you've (almost) answered my question, but English is not my native
language and hence my question for certainty: You did not add the patch
to your recently build stable/10 kernel...

> So I don't want to send you on a wild goose chase, but it's entirely
> possible this problem you are having is not hardware related at all, but
> is a memory starvation issue related to the ARC under periods of heavy
> activity.

Well rsync will do that for you... And since a few months I've also
loaded some iSCSI zvols as remote disks to some windows stations.

Your suggestions are highly appreciated. Especially since I do not have
space PCI-X parts... (It the current hardware blows up, I'm getting
monder new stuff.) So other than checking some cabling and likes there
is very little I could swap.

Thanx,
--WjW

> On Sun, Jun 21, 2015 at 6:43 PM, Willem Jan Withagen <wjw@digiware.nl
> <mailto:wjw@digiware.nl>> wrote:
> 
>     On 21/06/2015 21:50, Tom Curry wrote:
>     > Was there by chance a lot of disk activity going on when this occurred?
> 
>     Define 'a lot'??
>     But very likely, since the system is also a backup location for several
>     external service which backup thru rsync. And they can generate generate
>     quite some traffic. Next to the fact that it also serves a NVR with a
>     ZVOL trhu iSCSI...
> 
>     --WjW
> 
>     >
>     > On Sun, Jun 21, 2015 at 10:00 AM, Willem Jan Withagen <wjw@digiware.nl <mailto:wjw@digiware.nl>
>     > <mailto:wjw@digiware.nl <mailto:wjw@digiware.nl>>> wrote:
>     >
>     >     On 20/06/2015 18:11, Daryl Richards wrote:
>     >     > Check the failmode setting on your pool. From man zpool:
>     >     >
>     >     >        failmode=wait | continue | panic
>     >     >
>     >     >            Controls the system behavior in the event of
>     catastrophic
>     >     > pool failure.  This  condition  is  typically  a
>     >     >            result  of  a  loss of connectivity to the
>     underlying storage
>     >     > device(s) or a failure of all devices within
>     >     >            the pool. The behavior of such an event is
>     determined as
>     >     > follows:
>     >     >
>     >     >            wait        Blocks all I/O access until the device
>     >     > connectivity is recovered and the errors  are  cleared.
>     >     >                        This is the default behavior.
>     >     >
>     >     >            continue    Returns  EIO  to  any  new write I/O
>     requests but
>     >     > allows reads to any of the remaining healthy
>     >     >                        devices. Any write requests that have
>     yet to be
>     >     > committed to disk would be blocked.
>     >     >
>     >     >            panic       Prints out a message to the console
>     and generates
>     >     > a system crash dump.
>     >
>     >     'mmm
>     >
>     >     Did not know about this setting. Nice one, but alas my current
>     >     setting is:
>     >     zfsboot  failmode         wait                           default
>     >     zfsraid  failmode         wait                           default
>     >
>     >     So either the setting is not working, or something else is up?
>     >     Is waiting only meant to wait a limited time? And then panic
>     anyways?
>     >
>     >     But then still I wonder why even in the 'continue'-case the
>     ZFS system
>     >     ends in a state where the filesystem is not able to continue
>     in its
>     >     standard functioning ( read and write ) and disconnects the
>     disk???
>     >
>     >     All failmode settings result in a seriously handicapped system...
>     >     On a raidz2 system I would perhaps expected this to occur when the
>     >     second disk goes into thin space??
>     >
>     >     The other question is: The man page talks about
>     >     'Controls the system behavior in the event of catastrophic
>     pool failure'
>     >     And is a hung disk a 'catastrophic pool failure'?
>     >
>     >     Still very puzzled?
>     >
>     >     --WjW
>     >
>     >     >
>     >     >
>     >     > On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
>     >     >> Hi,
>     >     >>
>     >     >> Found my system rebooted this morning:
>     >     >>
>     >     >> Jun 20 05:28:33 zfs kernel: sonewconn: pcb
>     0xfffff8011b6da498: Listen
>     >     >> queue overflow: 8 already in queue awaiting acceptance (48
>     >     occurrences)
>     >     >> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid'
>     appears
>     >     to be
>     >     >> hung on vdev guid 18180224580327100979 at '/dev/da0'.
>     >     >> Jun 20 05:28:33 zfs kernel: cpuid = 0
>     >     >> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
>     >     >> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
>     >     >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>     >     >>
>     >     >> Which leads me to believe that /dev/da0 went out on
>     vacation, leaving
>     >     >> ZFS into trouble.... But the array is:
>     >     >> ----
>     >     >> NAME               SIZE  ALLOC   FREE  EXPANDSZ   FRAG   
>     CAP  DEDUP
>     >     >> zfsraid           32.5T  13.3T  19.2T         -     7%   
>     41%  1.00x
>     >     >> ONLINE  -
>     >     >>    raidz2          16.2T  6.67T  9.58T         -     8%    41%
>     >     >>      da0               -      -      -         -      -      -
>     >     >>      da1               -      -      -         -      -      -
>     >     >>      da2               -      -      -         -      -      -
>     >     >>      da3               -      -      -         -      -      -
>     >     >>      da4               -      -      -         -      -      -
>     >     >>      da5               -      -      -         -      -      -
>     >     >>    raidz2          16.2T  6.67T  9.58T         -     7%    41%
>     >     >>      da6               -      -      -         -      -      -
>     >     >>      da7               -      -      -         -      -      -
>     >     >>      ada4              -      -      -         -      -      -
>     >     >>      ada5              -      -      -         -      -      -
>     >     >>      ada6              -      -      -         -      -      -
>     >     >>      ada7              -      -      -         -      -      -
>     >     >>    mirror           504M  1.73M   502M         -    39%     0%
>     >     >>      gpt/log0          -      -      -         -      -      -
>     >     >>      gpt/log1          -      -      -         -      -      -
>     >     >> cache                 -      -      -      -      -      -
>     >     >>    gpt/raidcache0   109G  1.34G   107G         -     0%     1%
>     >     >>    gpt/raidcache1   109G   787M   108G         -     0%     0%
>     >     >> ----
>     >     >>
>     >     >> And thus I'd would have expected that ZFS would disconnect
>     >     /dev/da0 and
>     >     >> then switch to DEGRADED state and continue, letting the
>     operator
>     >     fix the
>     >     >> broken disk.
>     >     >> Instead it chooses to panic, which is not a nice thing to
>     do. :)
>     >     >>
>     >     >> Or do I have to high hopes of ZFS?
>     >     >>
>     >     >> Next question to answer is why this WD RED on:
>     >     >>
>     >     >> arcmsr0@pci0:7:14:0:    class=0x010400 card=0x112017d3
>     >     chip=0x112017d3
>     >     >> rev=0x00 hdr=0x00
>     >     >>      vendor     = 'Areca Technology Corp.'
>     >     >>      device     = 'ARC-1120 8-Port PCI-X to SATA RAID
>     Controller'
>     >     >>      class      = mass storage
>     >     >>      subclass   = RAID
>     >     >>
>     >     >> got hung, and nothing for this shows in SMART....
>     >
>     >     _______________________________________________
>     >     freebsd-fs@freebsd.org <mailto:freebsd-fs@freebsd.org>
>     <mailto:freebsd-fs@freebsd.org <mailto:freebsd-fs@freebsd.org>>
>     mailing list
>     >     http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>     >     To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org
>     <mailto:freebsd-fs-unsubscribe@freebsd.org>
>     >     <mailto:freebsd-fs-unsubscribe@freebsd.org
>     <mailto:freebsd-fs-unsubscribe@freebsd.org>>"
>     >
>     >
> 
> 


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:06:01 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2A0B0B34
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:06:01 +0000 (UTC)
 (envelope-from bugzilla-noreply@FreeBSD.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 015F0EDD
 for <freebsd-fs@FreeBSD.org>; Mon, 22 Jun 2015 01:06:01 +0000 (UTC)
 (envelope-from bugzilla-noreply@FreeBSD.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t5LL0BnU027262
 for <freebsd-fs@FreeBSD.org>; Sun, 21 Jun 2015 21:00:11 GMT
 (envelope-from bugzilla-noreply@FreeBSD.org)
Message-Id: <201506212100.t5LL0BnU027262@kenobi.freebsd.org>
From: bugzilla-noreply@FreeBSD.org
To: freebsd-fs@FreeBSD.org
Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
Date: Sun, 21 Jun 2015 21:00:11 +0000
Content-Type: text/plain; charset="UTF-8"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:06:01 -0000

To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering
all versions including experimental development code and obsolete releases.

Status      |    Bug Id | Description
------------+-----------+---------------------------------------------------
Open        |    136470 | [nfs] Cannot mount / in read-only, over NFS       
Open        |    139651 | [nfs] mount(8): read-only remount of NFS volume d 
Open        |    144447 | [zfs] sharenfs fsunshare() & fsshare_main() non f 

3 problems total for which you should take action.

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:10:26 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 29DD2CC5
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:10:26 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from smtp.digiware.nl (smtp.digiware.nl [31.223.170.169])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id C69D8A7D
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 01:10:25 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from rack1.digiware.nl (unknown [127.0.0.1])
 by smtp.digiware.nl (Postfix) with ESMTP id B891D16A403;
 Mon, 22 Jun 2015 00:43:51 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from smtp.digiware.nl ([127.0.0.1])
 by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id gUsaS4ddV81U; Mon, 22 Jun 2015 00:43:39 +0200 (CEST)
Received: from [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69] (unknown
 [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69])
 by smtp.digiware.nl (Postfix) with ESMTPA id 91B3D16A402;
 Mon, 22 Jun 2015 00:43:39 +0200 (CEST)
Message-ID: <55873E1D.9010401@digiware.nl>
Date: Mon, 22 Jun 2015 00:43:41 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Tom Curry <thomasrcurry@gmail.com>
CC: freebsd-fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl>	<558590BD.40603@isletech.net>	<5586C396.9010100@digiware.nl>
 <CAGtEZUAO5-rBoz0YBcYfvZ6tx_sj0MEFuxGSYk+z0XHrJySk2A@mail.gmail.com>
In-Reply-To: <CAGtEZUAO5-rBoz0YBcYfvZ6tx_sj0MEFuxGSYk+z0XHrJySk2A@mail.gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:10:26 -0000

On 21/06/2015 21:50, Tom Curry wrote:
> Was there by chance a lot of disk activity going on when this occurred? 

Define 'a lot'??
But very likely, since the system is also a backup location for several
external service which backup thru rsync. And they can generate generate
quite some traffic. Next to the fact that it also serves a NVR with a
ZVOL trhu iSCSI...

--WjW

> 
> On Sun, Jun 21, 2015 at 10:00 AM, Willem Jan Withagen <wjw@digiware.nl
> <mailto:wjw@digiware.nl>> wrote:
> 
>     On 20/06/2015 18:11, Daryl Richards wrote:
>     > Check the failmode setting on your pool. From man zpool:
>     >
>     >        failmode=wait | continue | panic
>     >
>     >            Controls the system behavior in the event of catastrophic
>     > pool failure.  This  condition  is  typically  a
>     >            result  of  a  loss of connectivity to the underlying storage
>     > device(s) or a failure of all devices within
>     >            the pool. The behavior of such an event is determined as
>     > follows:
>     >
>     >            wait        Blocks all I/O access until the device
>     > connectivity is recovered and the errors  are  cleared.
>     >                        This is the default behavior.
>     >
>     >            continue    Returns  EIO  to  any  new write I/O requests but
>     > allows reads to any of the remaining healthy
>     >                        devices. Any write requests that have yet to be
>     > committed to disk would be blocked.
>     >
>     >            panic       Prints out a message to the console and generates
>     > a system crash dump.
> 
>     'mmm
> 
>     Did not know about this setting. Nice one, but alas my current
>     setting is:
>     zfsboot  failmode         wait                           default
>     zfsraid  failmode         wait                           default
> 
>     So either the setting is not working, or something else is up?
>     Is waiting only meant to wait a limited time? And then panic anyways?
> 
>     But then still I wonder why even in the 'continue'-case the ZFS system
>     ends in a state where the filesystem is not able to continue in its
>     standard functioning ( read and write ) and disconnects the disk???
> 
>     All failmode settings result in a seriously handicapped system...
>     On a raidz2 system I would perhaps expected this to occur when the
>     second disk goes into thin space??
> 
>     The other question is: The man page talks about
>     'Controls the system behavior in the event of catastrophic pool failure'
>     And is a hung disk a 'catastrophic pool failure'?
> 
>     Still very puzzled?
> 
>     --WjW
> 
>     >
>     >
>     > On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
>     >> Hi,
>     >>
>     >> Found my system rebooted this morning:
>     >>
>     >> Jun 20 05:28:33 zfs kernel: sonewconn: pcb 0xfffff8011b6da498: Listen
>     >> queue overflow: 8 already in queue awaiting acceptance (48
>     occurrences)
>     >> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid' appears
>     to be
>     >> hung on vdev guid 18180224580327100979 at '/dev/da0'.
>     >> Jun 20 05:28:33 zfs kernel: cpuid = 0
>     >> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
>     >> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
>     >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>     >>
>     >> Which leads me to believe that /dev/da0 went out on vacation, leaving
>     >> ZFS into trouble.... But the array is:
>     >> ----
>     >> NAME               SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP
>     >> zfsraid           32.5T  13.3T  19.2T         -     7%    41%  1.00x
>     >> ONLINE  -
>     >>    raidz2          16.2T  6.67T  9.58T         -     8%    41%
>     >>      da0               -      -      -         -      -      -
>     >>      da1               -      -      -         -      -      -
>     >>      da2               -      -      -         -      -      -
>     >>      da3               -      -      -         -      -      -
>     >>      da4               -      -      -         -      -      -
>     >>      da5               -      -      -         -      -      -
>     >>    raidz2          16.2T  6.67T  9.58T         -     7%    41%
>     >>      da6               -      -      -         -      -      -
>     >>      da7               -      -      -         -      -      -
>     >>      ada4              -      -      -         -      -      -
>     >>      ada5              -      -      -         -      -      -
>     >>      ada6              -      -      -         -      -      -
>     >>      ada7              -      -      -         -      -      -
>     >>    mirror           504M  1.73M   502M         -    39%     0%
>     >>      gpt/log0          -      -      -         -      -      -
>     >>      gpt/log1          -      -      -         -      -      -
>     >> cache                 -      -      -      -      -      -
>     >>    gpt/raidcache0   109G  1.34G   107G         -     0%     1%
>     >>    gpt/raidcache1   109G   787M   108G         -     0%     0%
>     >> ----
>     >>
>     >> And thus I'd would have expected that ZFS would disconnect
>     /dev/da0 and
>     >> then switch to DEGRADED state and continue, letting the operator
>     fix the
>     >> broken disk.
>     >> Instead it chooses to panic, which is not a nice thing to do. :)
>     >>
>     >> Or do I have to high hopes of ZFS?
>     >>
>     >> Next question to answer is why this WD RED on:
>     >>
>     >> arcmsr0@pci0:7:14:0:    class=0x010400 card=0x112017d3
>     chip=0x112017d3
>     >> rev=0x00 hdr=0x00
>     >>      vendor     = 'Areca Technology Corp.'
>     >>      device     = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
>     >>      class      = mass storage
>     >>      subclass   = RAID
>     >>
>     >> got hung, and nothing for this shows in SMART....
> 
>     _______________________________________________
>     freebsd-fs@freebsd.org <mailto:freebsd-fs@freebsd.org> mailing list
>     http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>     To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org
>     <mailto:freebsd-fs-unsubscribe@freebsd.org>"
> 
> 


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:19:19 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id C6368ECD
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:19:19 +0000 (UTC)
 (envelope-from thomasrcurry@gmail.com)
Received: from mail-oi0-x22b.google.com (mail-oi0-x22b.google.com
 [IPv6:2607:f8b0:4003:c06::22b])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 895951479
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 01:19:19 +0000 (UTC)
 (envelope-from thomasrcurry@gmail.com)
Received: by oiyy130 with SMTP id y130so94848515oiy.0
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 18:19:18 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=Hd++X6M5ph1MSGvLPQPjsRhO2SKJlqnpgaiU00lvJ1E=;
 b=YD1bEby2rb+sCf2Cghph1Pm/J5U0BFSLTusNnDkwh2+J//epeje94c+ELRT3GFydl+
 6I1ydFMgLh7BPGTH+PXX0kZmZVWxLuLz+T2IAvdd6r7adarlIFLz4giFXv7TIuuTT4Xr
 cRIcHTnMl3J2vzyETv+HrYB0TfpIJQqLGRi0hT1VfCh4xqMMXKJ/nNZ1aRLexHvIE0l7
 U2LeevDge6Q4NM+BYYTNGf7E/6b+FOjEAr61BvPcHM1W0rEQ6NEWdA5pfo4Q7WXz1luQ
 GVd6vU0AABUgUOuWnsL521yFqeZZ+BaXgno9TF5Y09zSnPWiD2x39Jw0IkfNcOly9qF5
 5bbA==
MIME-Version: 1.0
X-Received: by 10.182.22.33 with SMTP id a1mr22568524obf.41.1434929697833;
 Sun, 21 Jun 2015 16:34:57 -0700 (PDT)
Received: by 10.202.77.138 with HTTP; Sun, 21 Jun 2015 16:34:57 -0700 (PDT)
In-Reply-To: <55873E1D.9010401@digiware.nl>
References: <5585767B.4000206@digiware.nl> <558590BD.40603@isletech.net>
 <5586C396.9010100@digiware.nl>
 <CAGtEZUAO5-rBoz0YBcYfvZ6tx_sj0MEFuxGSYk+z0XHrJySk2A@mail.gmail.com>
 <55873E1D.9010401@digiware.nl>
Date: Sun, 21 Jun 2015 19:34:57 -0400
Message-ID: <CAGtEZUBexzwjTGMXY+Mg5knNsC+f35TXhAqhL0vdOKoOUO1F3A@mail.gmail.com>
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
From: Tom Curry <thomasrcurry@gmail.com>
To: Willem Jan Withagen <wjw@digiware.nl>
Cc: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:19:19 -0000

I asked because recently I had similar trouble. Lots of kernel panics,
sometimes they were just like yours, sometimes they were general protection
faults. But they would always occur when my nightly backups took place
where VMs on iSCSI zvol luns were read and then written over smb to another
pool on the same machine over 10GbE.

I nearly went out of my mind trying to figure out what was going on, I'll
spare you the gory details, but I stumbled across this PR
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594 and as I read
through it little light bulbs starting coming on. Luckily it was easy for
me to reproduce the problem so I kicked off the backups and watched the
system memory. Wired would grow, ARC would shrink, and then the system
would start swapping. If I stopped the IO right then it would recover after
a while. But if I let it go it would always panic, and half the time it
would be the same message as yours. So I applied the patch from that PR,
rebooted, and kicked off the backup. No more panic. Recently I rebuilt a
vanilla kernel from stable/10 but explicitly set vfs.zfs.arc_max to 24G (I
have 32G) and ran my torture tests and it is stable.

So I don't want to send you on a wild goose chase, but it's entirely
possible this problem you are having is not hardware related at all, but is
a memory starvation issue related to the ARC under periods of heavy
activity.


On Sun, Jun 21, 2015 at 6:43 PM, Willem Jan Withagen <wjw@digiware.nl>
wrote:

> On 21/06/2015 21:50, Tom Curry wrote:
> > Was there by chance a lot of disk activity going on when this occurred?
>
> Define 'a lot'??
> But very likely, since the system is also a backup location for several
> external service which backup thru rsync. And they can generate generate
> quite some traffic. Next to the fact that it also serves a NVR with a
> ZVOL trhu iSCSI...
>
> --WjW
>
> >
> > On Sun, Jun 21, 2015 at 10:00 AM, Willem Jan Withagen <wjw@digiware.nl
> > <mailto:wjw@digiware.nl>> wrote:
> >
> >     On 20/06/2015 18:11, Daryl Richards wrote:
> >     > Check the failmode setting on your pool. From man zpool:
> >     >
> >     >        failmode=wait | continue | panic
> >     >
> >     >            Controls the system behavior in the event of
> catastrophic
> >     > pool failure.  This  condition  is  typically  a
> >     >            result  of  a  loss of connectivity to the underlying
> storage
> >     > device(s) or a failure of all devices within
> >     >            the pool. The behavior of such an event is determined as
> >     > follows:
> >     >
> >     >            wait        Blocks all I/O access until the device
> >     > connectivity is recovered and the errors  are  cleared.
> >     >                        This is the default behavior.
> >     >
> >     >            continue    Returns  EIO  to  any  new write I/O
> requests but
> >     > allows reads to any of the remaining healthy
> >     >                        devices. Any write requests that have yet
> to be
> >     > committed to disk would be blocked.
> >     >
> >     >            panic       Prints out a message to the console and
> generates
> >     > a system crash dump.
> >
> >     'mmm
> >
> >     Did not know about this setting. Nice one, but alas my current
> >     setting is:
> >     zfsboot  failmode         wait                           default
> >     zfsraid  failmode         wait                           default
> >
> >     So either the setting is not working, or something else is up?
> >     Is waiting only meant to wait a limited time? And then panic anyways?
> >
> >     But then still I wonder why even in the 'continue'-case the ZFS
> system
> >     ends in a state where the filesystem is not able to continue in its
> >     standard functioning ( read and write ) and disconnects the disk???
> >
> >     All failmode settings result in a seriously handicapped system...
> >     On a raidz2 system I would perhaps expected this to occur when the
> >     second disk goes into thin space??
> >
> >     The other question is: The man page talks about
> >     'Controls the system behavior in the event of catastrophic pool
> failure'
> >     And is a hung disk a 'catastrophic pool failure'?
> >
> >     Still very puzzled?
> >
> >     --WjW
> >
> >     >
> >     >
> >     > On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
> >     >> Hi,
> >     >>
> >     >> Found my system rebooted this morning:
> >     >>
> >     >> Jun 20 05:28:33 zfs kernel: sonewconn: pcb 0xfffff8011b6da498:
> Listen
> >     >> queue overflow: 8 already in queue awaiting acceptance (48
> >     occurrences)
> >     >> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid' appears
> >     to be
> >     >> hung on vdev guid 18180224580327100979 at '/dev/da0'.
> >     >> Jun 20 05:28:33 zfs kernel: cpuid = 0
> >     >> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
> >     >> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
> >     >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> >     >>
> >     >> Which leads me to believe that /dev/da0 went out on vacation,
> leaving
> >     >> ZFS into trouble.... But the array is:
> >     >> ----
> >     >> NAME               SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP
> DEDUP
> >     >> zfsraid           32.5T  13.3T  19.2T         -     7%    41%
> 1.00x
> >     >> ONLINE  -
> >     >>    raidz2          16.2T  6.67T  9.58T         -     8%    41%
> >     >>      da0               -      -      -         -      -      -
> >     >>      da1               -      -      -         -      -      -
> >     >>      da2               -      -      -         -      -      -
> >     >>      da3               -      -      -         -      -      -
> >     >>      da4               -      -      -         -      -      -
> >     >>      da5               -      -      -         -      -      -
> >     >>    raidz2          16.2T  6.67T  9.58T         -     7%    41%
> >     >>      da6               -      -      -         -      -      -
> >     >>      da7               -      -      -         -      -      -
> >     >>      ada4              -      -      -         -      -      -
> >     >>      ada5              -      -      -         -      -      -
> >     >>      ada6              -      -      -         -      -      -
> >     >>      ada7              -      -      -         -      -      -
> >     >>    mirror           504M  1.73M   502M         -    39%     0%
> >     >>      gpt/log0          -      -      -         -      -      -
> >     >>      gpt/log1          -      -      -         -      -      -
> >     >> cache                 -      -      -      -      -      -
> >     >>    gpt/raidcache0   109G  1.34G   107G         -     0%     1%
> >     >>    gpt/raidcache1   109G   787M   108G         -     0%     0%
> >     >> ----
> >     >>
> >     >> And thus I'd would have expected that ZFS would disconnect
> >     /dev/da0 and
> >     >> then switch to DEGRADED state and continue, letting the operator
> >     fix the
> >     >> broken disk.
> >     >> Instead it chooses to panic, which is not a nice thing to do. :)
> >     >>
> >     >> Or do I have to high hopes of ZFS?
> >     >>
> >     >> Next question to answer is why this WD RED on:
> >     >>
> >     >> arcmsr0@pci0:7:14:0:    class=0x010400 card=0x112017d3
> >     chip=0x112017d3
> >     >> rev=0x00 hdr=0x00
> >     >>      vendor     = 'Areca Technology Corp.'
> >     >>      device     = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
> >     >>      class      = mass storage
> >     >>      subclass   = RAID
> >     >>
> >     >> got hung, and nothing for this shows in SMART....
> >
> >     _______________________________________________
> >     freebsd-fs@freebsd.org <mailto:freebsd-fs@freebsd.org> mailing list
> >     http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >     To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org
> >     <mailto:freebsd-fs-unsubscribe@freebsd.org>"
> >
> >
>
>

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:21:53 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 74001F09
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:21:53 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 51A2A18E2
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 01:21:53 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id F2B173F71B;
 Sun, 21 Jun 2015 16:32:12 -0400 (EDT)
Message-ID: <55871F4C.5010103@sneakertech.com>
Date: Sun, 21 Jun 2015 16:32:12 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Willem Jan Withagen <wjw@digiware.nl>
CC: freebsd-fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <558590BD.40603@isletech.net>
 <5586C396.9010100@digiware.nl>
In-Reply-To: <5586C396.9010100@digiware.nl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:21:53 -0000

> Or do I have to high hopes of ZFS?
> And is a hung disk a 'catastrophic pool failure'?

Yes to both.

I encountered this exact same issue a couple years ago (and complained 
about it to this list as well, although I didn't get a complete answer 
at the time. I can provide links to the conversation if interested).

Basically, the heart of the issue is the way the kernel/drivers/ZFS 
deals with IO and DMA. There's currently no way to tell what's going on 
with the disks and what outstanding IO to the pool can be dropped or 
ignored. As-currently-designed there's no safe way to just kick out the 
pool and keep going, so the only options are to wait, panic, or wait and 
then panic. Fixing this would require a major rewrite of a lot of code, 
which isn't going to happen any time soon. The failmode setting and 
deadman timer were implemented as a bandage to prevent the system from 
hanging forever.

See this page for more info:
http://comments.gmane.org/gmane.os.illumos.zfs/61


> All failmode settings result in a seriously handicapped system...

Yes. Again, this is a design issue/flaw with how DMA works. There's no 
real way to continue on gracefully when a pool completely dies due to 
hung IO.

We're all pretty much stuck with this problem, at least for quite a while.


> Is waiting only meant to wait a limited time? And then panic anyways?

By default yes. However, if you know that on your system the issue will 
eventually resolve itself given several hours (and you want to wait that 
long) you can change the deadman timeout or disable it completely. Look 
at "vfs.zfs.deadman_enabled" and "vfs.zfs.deadman_synctime".


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:36:47 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 95E8C73
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:36:47 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from hub.freebsd.org (hub.freebsd.org [8.8.178.136])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "hub.freebsd.org", Issuer "hub.freebsd.org" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 798DE1E3E
 for <freebsd-fs@FreeBSD.ORG>; Mon, 22 Jun 2015 01:36:47 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: by hub.freebsd.org (Postfix)
 id 6EFFD72; Mon, 22 Jun 2015 01:36:47 +0000 (UTC)
Delivered-To: fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 6D37071
 for <fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:36:47 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4B0B71E3D
 for <fs@freebsd.org>; Mon, 22 Jun 2015 01:36:47 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id CABD33F71F;
 Sun, 21 Jun 2015 16:49:46 -0400 (EDT)
Message-ID: <5587236A.6020404@sneakertech.com>
Date: Sun, 21 Jun 2015 16:49:46 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Willem Jan Withagen <wjw@digiware.nl>
CC: fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl>
In-Reply-To: <5585767B.4000206@digiware.nl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:36:47 -0000

Also:

> And thus I'd would have expected that ZFS would disconnect /dev/da0 and
> then switch to DEGRADED state and continue, letting the operator fix the
> broken disk.

> Next question to answer is why this WD RED on:

> got hung, and nothing for this shows in SMART....

You have a raidz2, which means THREE disks need to go down before the 
pool is unwritable. The problem is most likely your controller or power 
supply, not your disks.

Also2: don't rely too much on SMART for determining drive health. Google 
released a paper a few years ago revealing that half of all drives die 
without reporting SMART errors.

http://research.google.com/archive/disk_failures.pdf

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:49:49 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id DF46D22E
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:49:49 +0000 (UTC)
 (envelope-from michelle@sorbs.net)
Received: from hub.freebsd.org (hub.freebsd.org
 [IPv6:2001:1900:2254:206c::16:88])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "hub.freebsd.org", Issuer "hub.freebsd.org" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id C4D18A2F
 for <freebsd-fs@FreeBSD.ORG>; Mon, 22 Jun 2015 01:49:49 +0000 (UTC)
 (envelope-from michelle@sorbs.net)
Received: by hub.freebsd.org (Postfix)
 id BAAE122D; Mon, 22 Jun 2015 01:49:49 +0000 (UTC)
Delivered-To: fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B8E0B22C
 for <fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:49:49 +0000 (UTC)
 (envelope-from michelle@sorbs.net)
Received: from hades.sorbs.net (hades.sorbs.net [67.231.146.201])
 by mx1.freebsd.org (Postfix) with ESMTP id A6D6DA2E
 for <fs@freebsd.org>; Mon, 22 Jun 2015 01:49:49 +0000 (UTC)
 (envelope-from michelle@sorbs.net)
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; CHARSET=US-ASCII
Received: from isux.com (firewall.isux.com [213.165.190.213])
 by hades.sorbs.net
 (Oracle Communications Messaging Server 7.0.5.29.0 64bit (built Jul 9 2013))
 with ESMTPSA id <0NQB00IYKPCDW900@hades.sorbs.net> for fs@freebsd.org; Sun,
 21 Jun 2015 18:55:27 -0700 (PDT)
Message-id: <558769B5.601@sorbs.net>
Date: Mon, 22 Jun 2015 03:49:41 +0200
From: Michelle Sullivan <michelle@sorbs.net>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.24)
 Gecko/20100301 SeaMonkey/1.1.19
To: Quartz <quartz@sneakertech.com>
Cc: Willem Jan Withagen <wjw@digiware.nl>, fs@freebsd.org
Subject: Re: This diskfailure should not panic a system,
 but just disconnect disk from ZFS
References: <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com>
In-reply-to: <5587236A.6020404@sneakertech.com>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:49:50 -0000

Quartz wrote:
> Also:
>
>> And thus I'd would have expected that ZFS would disconnect /dev/da0 and
>> then switch to DEGRADED state and continue, letting the operator fix the
>> broken disk.
>
>> Next question to answer is why this WD RED on:
>
>> got hung, and nothing for this shows in SMART....
>
> You have a raidz2, which means THREE disks need to go down before the
> pool is unwritable. The problem is most likely your controller or
> power supply, not your disks.
>
Never make such assumptions...

I have worked in a professional environment where 9 of 12 disks failed
within 24 hours of each other....  They were all supposed to be from
different batches but due to an error they came from the same batch and
the environment was so tightly controlled and the work-load was so
similar that MTBF was almost identical on all 11 disks in the array...
the only disk that lasted more than 2 weeks over the failure was the
hotspare...!

-- 
Michelle Sullivan
http://www.mhix.org/


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:50:25 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id A0B8A267
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:50:25 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from smtp.digiware.nl (smtp.digiware.nl [31.223.170.169])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 30C85AFF
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 01:50:24 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from rack1.digiware.nl (unknown [127.0.0.1])
 by smtp.digiware.nl (Postfix) with ESMTP id 1DF1416A408;
 Mon, 22 Jun 2015 01:23:40 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from smtp.digiware.nl ([127.0.0.1])
 by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id lZryYmhesiXL; Mon, 22 Jun 2015 01:23:28 +0200 (CEST)
Received: from [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69] (unknown
 [IPv6:2001:4cb8:3:1:a079:ce8f:c2bf:e69])
 by smtp.digiware.nl (Postfix) with ESMTPA id BE2CF16A407;
 Mon, 22 Jun 2015 01:23:28 +0200 (CEST)
Message-ID: <55874772.4090607@digiware.nl>
Date: Mon, 22 Jun 2015 01:23:30 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Quartz <quartz@sneakertech.com>
CC: freebsd-fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <558590BD.40603@isletech.net>
 <5586C396.9010100@digiware.nl> <55871F4C.5010103@sneakertech.com>
In-Reply-To: <55871F4C.5010103@sneakertech.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:50:25 -0000

On 21/06/2015 22:32, Quartz wrote:
>> Or do I have to high hopes of ZFS?
>> And is a hung disk a 'catastrophic pool failure'?
> 
> Yes to both.
> 
> I encountered this exact same issue a couple years ago (and complained
> about it to this list as well, although I didn't get a complete answer
> at the time. I can provide links to the conversation if interested).
> 
> Basically, the heart of the issue is the way the kernel/drivers/ZFS
> deals with IO and DMA. There's currently no way to tell what's going on
> with the disks and what outstanding IO to the pool can be dropped or
> ignored. As-currently-designed there's no safe way to just kick out the
> pool and keep going, so the only options are to wait, panic, or wait and
> then panic. Fixing this would require a major rewrite of a lot of code,
> which isn't going to happen any time soon. The failmode setting and
> deadman timer were implemented as a bandage to prevent the system from
> hanging forever.
> 
> See this page for more info:
> http://comments.gmane.org/gmane.os.illumos.zfs/61
> 

Yes, I know of this discussion already as long as I'm working with ZFS.
But reading it like this does make it much more sense.
Perhaps the text should suggest some thing more painfull :), since now
it suggest that chopping the disk would help... (Hence my reaction)

But especially the hung disk during reading is sort of a ticking
timebomb... On the other hand, if it is already outstanding for 100's of
seconds, then it is never going to arrive. So the chance of running into
corrupted memory is going to be 0.00000......

But I do agree that in this case a panic might be next best solution.

>From another response I conclude that htere could be something in the
driver/hardware combo that could run me into trouble...

>> All failmode settings result in a seriously handicapped system...
> 
> Yes. Again, this is a design issue/flaw with how DMA works. There's no
> real way to continue on gracefully when a pool completely dies due to
> hung IO.
> 
> We're all pretty much stuck with this problem, at least for quite a while.

We'll the pool did not die, (at least not IMHO) just one disk stopt
working.... The pool is still resilient, so it
	could continue
	alert the operator
	have the operator fix/reboot/.....

But it the fact that a stalled DMA action could "corrupt" memory after
the fact. Just because a command outstanding for way too long, does all
of a sudden completes.

>> Is waiting only meant to wait a limited time? And then panic anyways?
> 
> By default yes. However, if you know that on your system the issue will
> eventually resolve itself given several hours (and you want to wait that
> long) you can change the deadman timeout or disable it completely. Look
> at "vfs.zfs.deadman_enabled" and "vfs.zfs.deadman_synctime".

I see:
vfs.zfs.deadman_enabled: 1
vfs.zfs.deadman_checktime_ms: 5000
vfs.zfs.deadman_synctime_ms: 1000000

So the "hung" on this I/O action has taken 1000 secs, and did not
complete...

I guess that if I like to live dangerously, I could set enabled to 0,
and run the risk... ??

Probably the better solution is to see if it occurs more often, and in
that case upgrade to modern hardware with newer HD controllers.
Current config has worked for me for already quite some time.
For the time being I'll offload a few disks to a mvs controller that is
the only PCIe slot in the MB.

Thanx for all the info,
--WjW


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 01:51:47 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 58BB82AF
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 01:51:47 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 36951D52
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 01:51:46 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 5053B3F721
 for <freebsd-fs@freebsd.org>; Sun, 21 Jun 2015 17:05:23 -0400 (EDT)
Message-ID: <55872712.2090800@sneakertech.com>
Date: Sun, 21 Jun 2015 17:05:22 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Freebsd fs <freebsd-fs@freebsd.org>
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
 <alpine.GSO.2.01.1506192111050.4186@freddy.simplesystems.org>
In-Reply-To: <alpine.GSO.2.01.1506192111050.4186@freddy.simplesystems.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 01:51:47 -0000

> You can use 'send' and 'receive' to send all the data and the metadata
> associated with that data. I believe that you are correct that
> filesystem properties (like 'compression') are not preserved.


Hmm... I may have to just create a bunch of dummy pools and see what is 
and isn't copied. Was trying to avoid that.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 02:31:49 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id EE8A3545
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 02:31:49 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from hub.freebsd.org (hub.freebsd.org
 [IPv6:2001:1900:2254:206c::16:88])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "hub.freebsd.org", Issuer "hub.freebsd.org" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id D3D5215F
 for <freebsd-fs@FreeBSD.ORG>; Mon, 22 Jun 2015 02:31:49 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: by hub.freebsd.org (Postfix)
 id C9969544; Mon, 22 Jun 2015 02:31:49 +0000 (UTC)
Delivered-To: fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id C9027543
 for <fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 02:31:49 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id A693C15B
 for <fs@freebsd.org>; Mon, 22 Jun 2015 02:31:49 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id DD3E83F6E0;
 Sun, 21 Jun 2015 22:31:47 -0400 (EDT)
Message-ID: <55877393.3040704@sneakertech.com>
Date: Sun, 21 Jun 2015 22:31:47 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Michelle Sullivan <michelle@sorbs.net>
CC: fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com>
 <558769B5.601@sorbs.net>
In-Reply-To: <558769B5.601@sorbs.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 02:31:50 -0000

>> You have a raidz2, which means THREE disks need to go down before the
>> pool is unwritable. The problem is most likely your controller or
>> power supply, not your disks.
>>
> Never make such assumptions...
>
> I have worked in a professional environment where 9 of 12 disks failed
> within 24 hours of each other....

Right... but if that was his problem there should be some logs of the 
other drives going down first, and typically ZFS would correctly mark 
the pool as degraded (at least, it would in my testing). The fact that 
ZFS didn't get a chance to log anything and the pool came back up 
healthy leads me to believe the controller went south, taking several 
disks with it all at once and totally borking all IO. (Either that or 
what Tom Curry mentioned about the Arc issue, which I wasn't previously 
aware of).

Of course, if it issue isn't repeatable then who knows....

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 07:16:45 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2794240B
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 07:16:45 +0000 (UTC)
 (envelope-from borjam@sarenet.es)
Received: from cu1176c.smtpx.saremail.com (cu1176c.smtpx.saremail.com
 [195.16.148.151])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id DA52CFC
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 07:16:44 +0000 (UTC)
 (envelope-from borjam@sarenet.es)
Received: from [172.16.2.2] (izaro.sarenet.es [192.148.167.11])
 by proxypop02.sare.net (Postfix) with ESMTPSA id C45F49DC6A7;
 Mon, 22 Jun 2015 09:16:34 +0200 (CEST)
Subject: Re: ZFS pool restructuring and emergency repair
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: text/plain; charset=utf-8
From: Borja Marcos <borjam@sarenet.es>
In-Reply-To: <5584F83D.1040702@egr.msu.edu>
Date: Mon, 22 Jun 2015 09:16:31 +0200
Cc: freebsd-fs@freebsd.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <D97E0465-AE25-4429-9434-5EB449E5EEB0@sarenet.es>
References: <5584C0BC.9070707@sneakertech.com> <5584F83D.1040702@egr.msu.edu>
To: Adam McDougall <mcdouga9@egr.msu.edu>
X-Mailer: Apple Mail (2.1283)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 07:16:45 -0000


On Jun 20, 2015, at 7:21 AM, Adam McDougall wrote:

> The manpage for zfs says: (under zfs send)
>=20
> -R      Generate a replication stream package, which will replicate
>        the specified filesystem, and all descendent file systems, up
>        to the named snapshot. When received, all properties, snap=E2=80=90=

>        shots, descendent file systems, and clones are preserved.

And that includes compression. However, the send format was committed =
before
snapshot holds were introduced.=20

If you rely on holds to avoid accidental snapshot deletion, remember =
that holds
will not be replicated.


Borja.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 07:43:20 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 8C9B25CB
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 07:43:20 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6B8D59F
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 07:43:20 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id AD7E63F6AB
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 03:43:18 -0400 (EDT)
Message-ID: <5587BC96.9090601@sneakertech.com>
Date: Mon, 22 Jun 2015 03:43:18 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Freebsd fs <freebsd-fs@freebsd.org>
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
In-Reply-To: <5584C0BC.9070707@sneakertech.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 07:43:20 -0000

> - A server is set up with a pool created a certain way, for the sake of
> argument let's say it's a raidz-2 comprised of 6x 2TB disks. There's
> only actually ~1TB of data currently on the server though. Let's say
> there's a catastrophic emergency where one of the disks needs to be
> replaced, but the only available spare is an old 500GB. As I understand
> it, you're basically SOL. Even though a 6x500 (really 4x500) is more
> than enough to hold 1Tb of data, you can't do anything in this situation
> since although ZFS can expand a pool to fit larger disks, it can't
> shrink one under any circumstance. Is my understanding still correct or
> is there a way around this issue now?

So I take it that, aside from messing with a gvirstor/ sparse disk 
image, there's still no way to really handle this because there's still 
no way to shrink a pool after creation?


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 08:14:57 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 7771187D
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 08:14:57 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 5839C1148
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 08:14:57 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 47A2C3F6CF
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 04:14:56 -0400 (EDT)
Message-ID: <5587C3FF.9070407@sneakertech.com>
Date: Mon, 22 Jun 2015 04:14:55 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: ZFS raid write performance?
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 08:14:57 -0000

What's sequential write performance like these days for ZFS raidzX? 
Someone suggested to me that I set up a single not-raid disk to act as a 
fast 'landing pad' for receiving files, then move them to the pool later 
in the background. Is that actually necessary? (Assume generic sata 
drives, 250mb-4gb sized files, and transfers are across a LAN using 
single unbonded GigE).


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 08:38:25 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 1F8AB941
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 08:38:25 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "anubis.delphij.net",
 Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 05E111C66
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 08:38:24 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from Xins-MBP.home.us.delphij.net
 (c-71-202-112-39.hsd1.ca.comcast.net [71.202.112.39])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by anubis.delphij.net (Postfix) with ESMTPSA id 510071A0B9;
 Mon, 22 Jun 2015 01:38:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=delphij.net;
 s=anubis; t=1434962304; x=1434976704;
 bh=Y3NEldf89aiDa/+PxsZYkArRrEneAOaFr7bHntnjBaA=;
 h=Date:From:To:Subject:References:In-Reply-To;
 b=wWg/4XlgMZs4ce/LtrISQnKuJyvV9Dz71WP7pKvbwh8rTlnxLjKhIrYrjvByXfbbl
 OvpvXgJEmLku9OUoWX9qO0zYnlZGHRJUFOS6nFXAJ49eiR7DCuwzyRJWvoVGUsfvUN
 hdbr0ot9Vk14FKTyQD318dlCDUAbEh/KXMxF01k8=
Message-ID: <5587C97F.2000407@delphij.net>
Date: Mon, 22 Jun 2015 01:38:23 -0700
From: Xin Li <delphij@delphij.net>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Quartz <quartz@sneakertech.com>, FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com>
In-Reply-To: <5587C3FF.9070407@sneakertech.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 08:38:25 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512


On 6/22/15 01:14, Quartz wrote:
> What's sequential write performance like these days for ZFS
> raidzX? Someone suggested to me that I set up a single not-raid
> disk to act as a fast 'landing pad' for receiving files, then move
> them to the pool later in the background. Is that actually
> necessary? (Assume generic sata drives, 250mb-4gb sized files, and
> transfers are across a LAN using single unbonded GigE).

That sounds really weird recommendation IMHO.  Did "someone" explained
with the reasoning/benefit of that "landing pad"?

I don't have hardware for testing handy, but IIRC even with 10,000 RPM
hard drives, a single hard drive won't do much beyond 100MB/s (maybe
120MB/s max) for sequential 128kB blocks, so that "landing pad" would
probably not very helpful assuming you can saturate your GigE network
(and keep in mind that with a file system in place it's not a perfect
sequential operation; plus if there is something wrong with that hard
drive you will have to start over rather than just replacing the bad
drive).

Cheers,
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJVh8l+AAoJEJW2GBstM+nsD8EP/RHR8Oiqf6FFVG4LT+CSqXLc
GIsSqaR/6/l04Ah0ixTkaubNvOELPlFZdFKQDtNd2u71G2Z7XtMbNvOK3G7whOxC
6a5xdNfdIYs7lq3jatN79BP9dygtgICsb1oMrCyAzd/tQc+cTvPabC/OxR4TtEJn
ZumP6LworIDGp1ruMrmQ7VvcOKhCxzs4VO7G8Lcj/WkhzR3TDEsZuzzqefWg1RlO
SBWJEwMGUugKWOCvgm8eQ2Hmw3btYbee1wfzuojtRN+d+IS8PtmsFpGBo8PCRSb8
lPz1Cf1fY4/zwruiG4EI+0CFvfr/05rN6DBRolyctdCGY1zX4rgKu6DT62kFkUR7
1nQdwxQ9slsQck1vyfAv2nIlGU530E696ZoS8/Ppqi/P8IqktYDLXKMn9+l0s+y+
EDzfvITasvwa6GRp5oxD2wagMjhvJ9iwELBLsppbjNH2i6n6k7EUSD1WGDHyQI2O
irzm7ecRd5mym14Ruk0PxOAkuRrWhIdkSEHWrK1V5MZolIMw7MTf/gzNJPDIG0tZ
MP4JmaOlysmHwIxoDLwAVlfuwweT3496miRbDvjzBrexkBvOVcIQdtymhZJmGe/z
DoejzWQvub5CbsDVbNAVW6HBppbW2MEqby4zyzl/Ae/IzsvYKAdVTQdmICO7wqNz
XWCqRSAjysOM5RDHoyXf
=Newc
-----END PGP SIGNATURE-----

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 12:21:57 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 27EFC510
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 12:21:57 +0000 (UTC)
 (envelope-from kraduk@gmail.com)
Received: from mail-wi0-x22e.google.com (mail-wi0-x22e.google.com
 [IPv6:2a00:1450:400c:c05::22e])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id A4928686
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 12:21:56 +0000 (UTC)
 (envelope-from kraduk@gmail.com)
Received: by wibdq8 with SMTP id dq8so73456303wib.1
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 05:21:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=wBzx43O/qR5X0aWL9TieLLYZUa+aZXD0oGWkEMHv0XQ=;
 b=KB7x/ECJa+uaKCGiwQGjru99p0Nmh/c//gGsKaytPv3ZBrMR7Nxb9XTNOB+kUJEKki
 Wn2cKfjFqspeqCOC5nGEzqw8c8eam6EM92bwUBZz1KKZHUIDe0SBhvbILTqP1MYOLygV
 HSaQ8OZfx0V/gK90MzOKTmn+WlHX+0Yw7b/gRjkC3Knlyg8mmHGnYlFpZajhuQXsjBc0
 inbGAvMQJGeB3iGtkEgKD/rgyaFgJ/HHVZnh6ex4uyRKH4qkPZaZvq3xN7IjWzOOof12
 3CblWvYgESyiTJ2jHfxgCPw6Z8/qnXRAsvQ9fsGxVENw3WNo56R65gl1+fdnwV3spm65
 WG7Q==
MIME-Version: 1.0
X-Received: by 10.194.176.68 with SMTP id cg4mr51298644wjc.106.1434975715106; 
 Mon, 22 Jun 2015 05:21:55 -0700 (PDT)
Received: by 10.180.73.5 with HTTP; Mon, 22 Jun 2015 05:21:55 -0700 (PDT)
In-Reply-To: <20150622121343.GB60684@neutralgood.org>
References: <5587C3FF.9070407@sneakertech.com>
 <20150622121343.GB60684@neutralgood.org>
Date: Mon, 22 Jun 2015 13:21:55 +0100
Message-ID: <CALfReye=TBmxmYJ5h4m++_3y+ZyRDh+a2+W8PWt-di7Yb5H5_Q@mail.gmail.com>
Subject: Re: ZFS raid write performance?
From: krad <kraduk@gmail.com>
To: kpneal@pobox.com
Cc: Quartz <quartz@sneakertech.com>, FreeBSD FS <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 12:21:57 -0000

also ask yourself how big the data transfer is going to be. If its only a
few gigs or 10s of gigs at a time and not streaming you could well find its
all dumped to the ram on the box anyhow before its committed to the disk.
With regards to 10k disks, be careful there as more modern higher platter
capacity 7k disks might give better throughput due to the higher data
density.


On 22 June 2015 at 13:13, <kpneal@pobox.com> wrote:

> On Mon, Jun 22, 2015 at 04:14:55AM -0400, Quartz wrote:
> > What's sequential write performance like these days for ZFS raidzX?
> > Someone suggested to me that I set up a single not-raid disk to act as a
> > fast 'landing pad' for receiving files, then move them to the pool later
> > in the background. Is that actually necessary? (Assume generic sata
> > drives, 250mb-4gb sized files, and transfers are across a LAN using
> > single unbonded GigE).
>
> Tests were posted to ZFS lists a few years ago. That was a while ago, but
> at a fundamental level ZFS hasn't changed since then so the results should
> still be valid.
>
> For both reads and writes all levels of raidz* perform slightly faster
> than the speed of a single drive. _Slightly_ faster, like, the speed of
> a single drive * 1.1 or so roughly speaking.
>
> For mirrors, writes perform about the same as a single drive, and as more
> drives are added they get slightly worse. But reads scale pretty well as
> you add drives because reads can be spread across all the drives in the
> mirror in parallel.
>
> Having multiple vdevs helps because ZFS does striping across the vdevs.
> However, this striping only happens with writes that are done _after_ new
> vdevs are added. There is no rebalancing of data after new vdevs are added.
> So adding new vdevs won't change the read performance of data already on
> disk.
>
> ZFS does try to strip across vdevs, but if your old vdevs are nearly full
> then adding new ones results in data mostly going to the new, nearly empty
> vdevs. So if you only added a single new vdev to expand the pool then
> you'll see write performance roughly equal to the performance of that
> single vdev.
>
> Rebalancing can be done roughly with "zfs send | zfs receive". If you do
> this enough times, and destroy old, sent datasets after an iteration, then
> you can to some extent rebalance a pool. You won't achieve a perfect
> rebalance, though.
>
> We can thank Oracle for the destruction of the archives at sun.com which
> made it pretty darn difficult to find those posts.
>
> Finally, single GigE is _slow_. I see no point in a "landing pad" when
> using unbonded GigE.
>
> --
> Kevin P. Neal                                http://www.pobox.com/~kpn/
>
> Seen on bottom of IBM part number 1887724:
> DO NOT EXPOSE MOUSE PAD TO DIRECT SUNLIGHT FOR EXTENDED PERIODS OF TIME.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 12:30:35 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 3979F5B8
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 12:30:35 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from hub.freebsd.org (hub.freebsd.org [8.8.178.136])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "hub.freebsd.org", Issuer "hub.freebsd.org" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 1A4FFBF6
 for <freebsd-fs@FreeBSD.ORG>; Mon, 22 Jun 2015 12:30:35 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: by hub.freebsd.org (Postfix)
 id 0FA475B7; Mon, 22 Jun 2015 12:30:35 +0000 (UTC)
Delivered-To: fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 0ED265B6
 for <fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 12:30:35 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from smtp.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id C718CBF4
 for <fs@freebsd.org>; Mon, 22 Jun 2015 12:30:34 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from rack1.digiware.nl (unknown [127.0.0.1])
 by smtp.digiware.nl (Postfix) with ESMTP id B8E3516A403;
 Mon, 22 Jun 2015 14:30:29 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from smtp.digiware.nl ([127.0.0.1])
 by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id rV8O8kcg-_kh; Mon, 22 Jun 2015 14:30:02 +0200 (CEST)
Received: from [192.168.101.176] (vpn.ecoracks.nl [31.223.170.173])
 by smtp.digiware.nl (Postfix) with ESMTPA id 0AFAB16A401;
 Mon, 22 Jun 2015 14:30:02 +0200 (CEST)
Message-ID: <5587FFCC.3080100@digiware.nl>
Date: Mon, 22 Jun 2015 14:30:04 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Quartz <quartz@sneakertech.com>, 
 Michelle Sullivan <michelle@sorbs.net>
CC: fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com>
 <558769B5.601@sorbs.net> <55877393.3040704@sneakertech.com>
In-Reply-To: <55877393.3040704@sneakertech.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 12:30:35 -0000

On 22/06/2015 04:31, Quartz wrote:
>>> You have a raidz2, which means THREE disks need to go down before the
>>> pool is unwritable. The problem is most likely your controller or
>>> power supply, not your disks.
>>>
>> Never make such assumptions...
>>
>> I have worked in a professional environment where 9 of 12 disks failed
>> within 24 hours of each other....
> 
> Right... but if that was his problem there should be some logs of the
> other drives going down first, and typically ZFS would correctly mark
> the pool as degraded (at least, it would in my testing). The fact that
> ZFS didn't get a chance to log anything and the pool came back up
> healthy leads me to believe the controller went south, taking several
> disks with it all at once and totally borking all IO. (Either that or
> what Tom Curry mentioned about the Arc issue, which I wasn't previously
> aware of).
> 
> Of course, if it issue isn't repeatable then who knows....

I do not think it was a full out failure, but just one transaction that
got hit by an alpha-particle...

Well, remember that the hung-diagnostics timeout is 1000 sec.
In the time-span before the panic nothing else was logged about
disks/controllers/etc... not functioning..

Only the few secs before the panic ctl/iSCSI and the network interface
started complaining that the was a memory shortage and the
networkinterafce started dumping packets....

But all that was logged really nicely in syslog. So I think that in the
1000sec it took for the deadman switch to trigger, the zpool just
functioned as was expected.... And the hardware somewhere lost one
transaction.

So I'll be crossing my fingers, and we'll see when/what/where the next
crash in going to occur. And work from there....

--WjW


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 12:53:29 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 445E76F0
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 12:53:29 +0000 (UTC)
 (envelope-from killing@multiplay.co.uk)
Received: from mail-wg0-f51.google.com (mail-wg0-f51.google.com [74.125.82.51])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id D5C328D0
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 12:53:28 +0000 (UTC)
 (envelope-from killing@multiplay.co.uk)
Received: by wguu7 with SMTP id u7so68475535wgu.3
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 05:53:26 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:subject:to:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-type
 :content-transfer-encoding;
 bh=+gWXM2SXFUDiF+0aE9RCIdvoancTM7m3gVVrfhTTINw=;
 b=ZA64jQLRMOnI8SgrnRI/OCrAgTYMmZnhyL93RjfEgdARMxcVgWWTil/yDc+f7XFgRI
 w5xh9DAL63LlRUkz79hxt/Vo4XT9db/uLs3FHkmlrekA/Wvdozlb7PSYshQRVauW1/Ii
 nZa5oJC89eTHk9pOuQ4x/yuVwrJCUGdTe6z8pUcW4OHO1EUlCBQ4wO3FZr7dWufkTcHO
 InZAO2qAQl1asKB++7mAGxyNQaEL+McyJMqxmtw38teGmo1kgv/WUWiEuP1bqKZqZtme
 O6YoihwxUF7sA0DIbLOYQSKVjfPQDvDVBIFAy/kNqvnX3zzU4x0JIiW9HR1m5sbFBmVm
 hEfg==
X-Gm-Message-State: ALoCoQltwhKJ+g9Z6A7gu1sAuw89l9Loefwvyv7KU4HR2m9JEu0c+O4hGYSV9JqzNweJxc8lv28M
X-Received: by 10.180.36.4 with SMTP id m4mr31616356wij.34.1434977606470;
 Mon, 22 Jun 2015 05:53:26 -0700 (PDT)
Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk.
 [82.69.141.170])
 by mx.google.com with ESMTPSA id hn7sm30422531wjc.16.2015.06.22.05.53.25
 for <freebsd-fs@freebsd.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 22 Jun 2015 05:53:25 -0700 (PDT)
Subject: Re: ZFS raid write performance?
To: freebsd-fs@freebsd.org
References: <5587C3FF.9070407@sneakertech.com>
 <20150622121343.GB60684@neutralgood.org>
From: Steven Hartland <killing@multiplay.co.uk>
Message-ID: <55880544.70907@multiplay.co.uk>
Date: Mon, 22 Jun 2015 13:53:24 +0100
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101
 Thunderbird/38.0.1
MIME-Version: 1.0
In-Reply-To: <20150622121343.GB60684@neutralgood.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 12:53:29 -0000

On 22/06/2015 13:13, kpneal@pobox.com wrote:
> On Mon, Jun 22, 2015 at 04:14:55AM -0400, Quartz wrote:
>> What's sequential write performance like these days for ZFS raidzX?
>> Someone suggested to me that I set up a single not-raid disk to act as a
>> fast 'landing pad' for receiving files, then move them to the pool later
>> in the background. Is that actually necessary? (Assume generic sata
>> drives, 250mb-4gb sized files, and transfers are across a LAN using
>> single unbonded GigE).
> Tests were posted to ZFS lists a few years ago. That was a while ago, but
> at a fundamental level ZFS hasn't changed since then so the results should
> still be valid.
>
> For both reads and writes all levels of raidz* perform slightly faster
> than the speed of a single drive. _Slightly_ faster, like, the speed of
> a single drive * 1.1 or so roughly speaking.
>
> For mirrors, writes perform about the same as a single drive, and as more
> drives are added they get slightly worse. But reads scale pretty well as
> you add drives because reads can be spread across all the drives in the
> mirror in parallel.
>
> Having multiple vdevs helps because ZFS does striping across the vdevs.
> However, this striping only happens with writes that are done _after_ new
> vdevs are added. There is no rebalancing of data after new vdevs are added.
> So adding new vdevs won't change the read performance of data already on
> disk.
>
> ZFS does try to strip across vdevs, but if your old vdevs are nearly full
> then adding new ones results in data mostly going to the new, nearly empty
> vdevs. So if you only added a single new vdev to expand the pool then
> you'll see write performance roughly equal to the performance of that
> single vdev.
>
> Rebalancing can be done roughly with "zfs send | zfs receive". If you do
> this enough times, and destroy old, sent datasets after an iteration, then
> you can to some extent rebalance a pool. You won't achieve a perfect
> rebalance, though.
>
> We can thank Oracle for the destruction of the archives at sun.com which
> made it pretty darn difficult to find those posts.
>
> Finally, single GigE is _slow_. I see no point in a "landing pad" when
> using unbonded GigE.
>
Actually it has had some significant changes which are likely to effect 
the results as it now has
an entirely new IO scheduler, so retesting would be wise.

     Regards
     Steve

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 13:17:38 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 312E1804
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 13:17:38 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id ECB842B1
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 13:17:37 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
 [65.66.246.65])
 by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t5MDHZlJ004931;
 Mon, 22 Jun 2015 08:17:36 -0500 (CDT)
Date: Mon, 22 Jun 2015 08:17:35 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Quartz <quartz@sneakertech.com>
cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
In-Reply-To: <5587C3FF.9070407@sneakertech.com>
Message-ID: <alpine.GSO.2.01.1506220802100.4186@freddy.simplesystems.org>
References: <5587C3FF.9070407@sneakertech.com>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (blade.simplesystems.org [65.66.246.90]);
 Mon, 22 Jun 2015 08:17:36 -0500 (CDT)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 13:17:38 -0000

On Mon, 22 Jun 2015, Quartz wrote:

> What's sequential write performance like these days for ZFS raidzX? Someone 
> suggested to me that I set up a single not-raid disk to act as a fast 
> 'landing pad' for receiving files, then move them to the pool later in the 
> background. Is that actually necessary? (Assume generic sata drives, 
> 250mb-4gb sized files, and transfers are across a LAN using single unbonded 
> GigE).

The primary determinant of write performance is if the writes are 
synchronous or not, With synchronous writes, the data is comitted to 
non-volatile storage before responding to the requestor.  With 
asyncronous writes, the data only needs to be written into RAM before 
responding to the requestor.

Writes over NFS 3 are synchronous.  Writes over CIFS/Samba are likely 
not.  For good performance with synchronous writes, some sort of 
non-volatile write cache (e.g. dedicated zfs intent log "slog", 
controller NVRAM) is advised.

Use multiple sets of mirrors for maximum write performance with 
multiple clients.

Even 10 years old hardware should be able to keep up with gigabit 
Ethernet rates (< 100MB/s) given a reasonable disk subsystem.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 14:40:32 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B79F4CC9;
 Mon, 22 Jun 2015 14:40:32 +0000 (UTC) (envelope-from ler@lerctr.org)
Received: from thebighonker.lerctr.org (thebighonker.lerctr.org
 [IPv6:2001:470:1f0f:3ad:223:7dff:fe9e:6e8a])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "thebighonker.lerctr.org",
 Issuer "COMODO RSA Domain Validation Secure Server CA" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 35D5DF75;
 Mon, 22 Jun 2015 14:40:32 +0000 (UTC) (envelope-from ler@lerctr.org)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lerctr.org;
 s=lerami; 
 h=Message-ID:References:In-Reply-To:Subject:Cc:To:From:Date:Content-Transfer-Encoding:Content-Type:MIME-Version;
 bh=Yi41SDOQERjINs7dO92ot8H+LI5VBrnGPXNmmPBB1UY=; 
 b=FGf2pUx52k7tgYZPoZfvQrZgxCwk9Zu7dYgMLNC/xc7VBERSasW77qovs9b5ZBbLCC+IOVr+8w/S+C2c2ZCTkNNZnd7ww5u0hSY6fKhvLsq+cOVLNn4r5iHf/ry81NMfWioV21oirGrGYES26wFAxF6wst9bmWXtpA4kw51KH/U=;
Received: from thebighonker.lerctr.org
 ([2001:470:1f0f:3ad:223:7dff:fe9e:6e8a]:43395 helo=webmail.lerctr.org)
 by thebighonker.lerctr.org with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128)
 (Exim 4.85 (FreeBSD)) (envelope-from <ler@lerctr.org>)
 id 1Z72th-0005w4-8N; Mon, 22 Jun 2015 09:40:29 -0500
Received: from 104-54-221-134.lightspeed.austtx.sbcglobal.net
 ([104.54.221.134]) by webmail.lerctr.org
 with HTTP (HTTP/1.1 POST); Mon, 22 Jun 2015 09:40:29 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII;
 format=flowed
Content-Transfer-Encoding: 7bit
Date: Mon, 22 Jun 2015 09:40:29 -0500
From: Larry Rosenman <ler@lerctr.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Freebsd fs <freebsd-fs@freebsd.org>, rmacklem@freebsd.org, Freebsd
 current <freebsd-current@freebsd.org>
Subject: Re: NFS Mount and LARGE amounts of "INACT" memory
In-Reply-To: <7f8b3449973cff790d996bb1f169b8e0@thebighonker.lerctr.org>
References: <228350188.61172889.1434758295576.JavaMail.root@uoguelph.ca>
 <bc8c541ebdc25e23b63f34d81c9cc2fc@thebighonker.lerctr.org>
 <7f8b3449973cff790d996bb1f169b8e0@thebighonker.lerctr.org>
Message-ID: <06abcbf4fab73f3c0ba711269934e0ea@thebighonker.lerctr.org>
X-Sender: ler@lerctr.org
User-Agent: Roundcube Webmail/1.1.1
X-Spam-Score: -1.0 (-)
X-LERCTR-Spam-Score: -1.0 (-)
X-Spam-Report: SpamScore (-1.0/5.0) ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001
X-LERCTR-Spam-Report: SpamScore (-1.0/5.0) ALL_TRUSTED=-1, SHORTCIRCUIT=-0.0001
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 14:40:32 -0000

On 2015-06-19 19:30, Larry Rosenman wrote:
> On 2015-06-19 19:00, Larry Rosenman wrote:
>> On 2015-06-19 18:58, Rick Macklem wrote:
>>> Larry Rosenman wrote:
>>>> On 2015-06-17 07:26, Larry Rosenman wrote:
>>>> > I have a 64G memory FreeBSD 11-CURRENT system that has a couple of
>>>> > mounts to a FreeNAS (FreeBSD 9.3) system.
>>>> >
>>>> > When my rsync from a different system to one of the NFS mounts
>>>> > runs, I
>>>> > get like 48G of Inactive memory that goes back to
>>>> > free if I umount the share.
>>>> >
>>>> > I'm wondering why this memory moves from ZFS ARC to INACT.
>>>> >
>>>> > And, is this expected?
>>> A wild ass guess would be yes. Assuming you are referring to the NFS
>>> client (and not FreeNAS server) and guessing that rsync uses mmap'd
>>> I/O...
>>> - The pages will be associated with the file's vnode until that vnode
>>>   is recycled. (mmap'd I/O can continue after the file is closed.)
>>>   This could take a long time.
>>> I am not knowledgible w.r.t. the VM subsystem, but I'm guessing that
>>> there is some way for these pages to be reused if memory is limited?
>>> (Hopefully someone with VM knowledge can comment on this?)
>>> 
>> Yes, this is the NFS Client, not sure on mmap(2), but that would make 
>> sense
>> 
>> BUT, I don't like that it kills my ZFS ARC....
>> 
>> VM Guys?
>> 
> BTW, a quick grep if the rsync sources shows it does NOT use mmap, but
> has some mmap-like routines,
> so I'm at a loss....

Adding in -CURRENT for the VM guys.....
> 
>>> rick
>>> 
>>>> I've posted screenshots at:
>>>> 
>>>> http://www.lerctr.org/~ler/FreeBSD_inact/
>>>> 
>>>> 
>>>> --
>>>> Larry Rosenman                     http://www.lerctr.org/~ler
>>>> Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
>>>> US Mail: 108 Turvey Cove, Hutto, TX 78634-5688
>>>> _______________________________________________
>>>> freebsd-fs@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>> To unsubscribe, send any mail to 
>>>> "freebsd-fs-unsubscribe@freebsd.org"
>>>> 

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 108 Turvey Cove, Hutto, TX 78634-5688


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 15:50:55 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9DE6E1DB
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 15:50:55 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from hub.freebsd.org (hub.freebsd.org
 [IPv6:2001:1900:2254:206c::16:88])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "hub.freebsd.org", Issuer "hub.freebsd.org" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 7F2A4625
 for <freebsd-fs@FreeBSD.ORG>; Mon, 22 Jun 2015 15:50:55 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: by hub.freebsd.org (Postfix)
 id 74C031DA; Mon, 22 Jun 2015 15:50:55 +0000 (UTC)
Delivered-To: fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 73E0F1D9
 for <fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 15:50:55 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from smtp.digiware.nl (smtp.digiware.nl [31.223.170.169])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3512A623
 for <fs@freebsd.org>; Mon, 22 Jun 2015 15:50:54 +0000 (UTC)
 (envelope-from wjw@digiware.nl)
Received: from rack1.digiware.nl (unknown [127.0.0.1])
 by smtp.digiware.nl (Postfix) with ESMTP id 11BA916A401;
 Mon, 22 Jun 2015 17:50:51 +0200 (CEST)
X-Virus-Scanned: amavisd-new at digiware.nl
Received: from smtp.digiware.nl ([127.0.0.1])
 by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id WXS0Lf198po6; Mon, 22 Jun 2015 17:50:23 +0200 (CEST)
Received: from [192.168.101.176] (vpn.ecoracks.nl [31.223.170.173])
 by smtp.digiware.nl (Postfix) with ESMTPA id E44FB16A402;
 Mon, 22 Jun 2015 17:41:17 +0200 (CEST)
Message-ID: <55882C9F.8020507@digiware.nl>
Date: Mon, 22 Jun 2015 17:41:19 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Michelle Sullivan <michelle@sorbs.net>, 
 Quartz <quartz@sneakertech.com>
CC: fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect
 disk from ZFS
References: <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com>
 <558769B5.601@sorbs.net>
In-Reply-To: <558769B5.601@sorbs.net>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 15:50:55 -0000

On 22/06/2015 03:49, Michelle Sullivan wrote:
> Quartz wrote:
>> Also:
>>
>>> And thus I'd would have expected that ZFS would disconnect /dev/da0 and
>>> then switch to DEGRADED state and continue, letting the operator fix the
>>> broken disk.
>>
>>> Next question to answer is why this WD RED on:
>>
>>> got hung, and nothing for this shows in SMART....
>>
>> You have a raidz2, which means THREE disks need to go down before the
>> pool is unwritable. The problem is most likely your controller or
>> power supply, not your disks.
>>
> Never make such assumptions...
> 
> I have worked in a professional environment where 9 of 12 disks failed
> within 24 hours of each other....  They were all supposed to be from
> different batches but due to an error they came from the same batch and
> the environment was so tightly controlled and the work-load was so
> similar that MTBF was almost identical on all 11 disks in the array...
> the only disk that lasted more than 2 weeks over the failure was the
> hotspare...!
> 

Scary (non)-statistics....
Theories are always nice, but this sort of experiences make your hair go
grey overnight.

--WjW

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 16:04:53 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9BA7B2E9
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 16:04:53 +0000 (UTC)
 (envelope-from killing@multiplay.co.uk)
Received: from mail-wg0-f49.google.com (mail-wg0-f49.google.com [74.125.82.49])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 38A50D90
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 16:04:52 +0000 (UTC)
 (envelope-from killing@multiplay.co.uk)
Received: by wguu7 with SMTP id u7so73463155wgu.3
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 09:04:45 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-type
 :content-transfer-encoding;
 bh=nKSgCEWwSNgLooc2Y/DvM1oukiioE6h8hy1ROiLHXJk=;
 b=hr31UO5j5gWk6gqVDjB631G/j+Whhlz1jO5sXUCJoVWklPhCwaxFQ4tQDqG4Xv1amV
 QHr8iN3pMlskaiLdq+iqpEEh7kDVt++s2ESa/baw/Y8Jebo96TR4EROpkp6ZRwRN/erN
 m300+vF8DBppFBOOyuv4hsZ7rkl3YLtzVDbhimKX78eK7qKHpmM5v2ydbZeXx9FmRpd+
 DKDW0A1PNHwt0lB00PLDrT4d4hcNmwLuPXZ3FScq4mATD+IlVkIYA23l1vtCB/2Q31Te
 IT4TJMSuMbxAmCDD2et3VRy4XfZVszSGpiKcPYXt4bjYK0db2FfAoqLKX8BWjJUJZmGy
 /alg==
X-Gm-Message-State: ALoCoQmsC8i+sFc0uSIzO4MuPxOXg0Tvl8c/I2qrK1yFRKtuAVEbB+DfAHeuj/sPIuqNJujgz27T
X-Received: by 10.194.109.36 with SMTP id hp4mr51614154wjb.4.1434989085051;
 Mon, 22 Jun 2015 09:04:45 -0700 (PDT)
Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk.
 [82.69.141.170])
 by mx.google.com with ESMTPSA id fo13sm17870049wic.0.2015.06.22.09.04.43
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 22 Jun 2015 09:04:43 -0700 (PDT)
Subject: Re: ZFS raid write performance?
To: kpneal@pobox.com
References: <5587C3FF.9070407@sneakertech.com>
 <20150622121343.GB60684@neutralgood.org> <55880544.70907@multiplay.co.uk>
 <20150622153056.GA96798@neutralgood.org>
Cc: freebsd-fs@freebsd.org
From: Steven Hartland <killing@multiplay.co.uk>
Message-ID: <5588321A.4060102@multiplay.co.uk>
Date: Mon, 22 Jun 2015 17:04:42 +0100
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101
 Thunderbird/38.0.1
MIME-Version: 1.0
In-Reply-To: <20150622153056.GA96798@neutralgood.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 16:04:53 -0000


On 22/06/2015 16:30, kpneal@pobox.com wrote:
> On Mon, Jun 22, 2015 at 01:53:24PM +0100, Steven Hartland wrote:
>> On 22/06/2015 13:13, kpneal@pobox.com wrote:
>>> On Mon, Jun 22, 2015 at 04:14:55AM -0400, Quartz wrote:
>>>> What's sequential write performance like these days for ZFS raidzX?
>>>> Someone suggested to me that I set up a single not-raid disk to act as a
>>>> fast 'landing pad' for receiving files, then move them to the pool later
>>>> in the background. Is that actually necessary? (Assume generic sata
>>>> drives, 250mb-4gb sized files, and transfers are across a LAN using
>>>> single unbonded GigE).
>>> Tests were posted to ZFS lists a few years ago. That was a while ago, but
>>> at a fundamental level ZFS hasn't changed since then so the results should
>>> still be valid.
>>>
>>> For both reads and writes all levels of raidz* perform slightly faster
>>> than the speed of a single drive. _Slightly_ faster, like, the speed of
>>> a single drive * 1.1 or so roughly speaking.
>>>
>>> For mirrors, writes perform about the same as a single drive, and as more
>>> drives are added they get slightly worse. But reads scale pretty well as
>>> you add drives because reads can be spread across all the drives in the
>>> mirror in parallel.
>>>
>>> Having multiple vdevs helps because ZFS does striping across the vdevs.
>>> However, this striping only happens with writes that are done _after_ new
>>> vdevs are added. There is no rebalancing of data after new vdevs are added.
>>> So adding new vdevs won't change the read performance of data already on
>>> disk.
>>>
>>> ZFS does try to strip across vdevs, but if your old vdevs are nearly full
>>> then adding new ones results in data mostly going to the new, nearly empty
>>> vdevs. So if you only added a single new vdev to expand the pool then
>>> you'll see write performance roughly equal to the performance of that
>>> single vdev.
>>>
>>> Rebalancing can be done roughly with "zfs send | zfs receive". If you do
>>> this enough times, and destroy old, sent datasets after an iteration, then
>>> you can to some extent rebalance a pool. You won't achieve a perfect
>>> rebalance, though.
>>>
>>> We can thank Oracle for the destruction of the archives at sun.com which
>>> made it pretty darn difficult to find those posts.
>>>
>>> Finally, single GigE is _slow_. I see no point in a "landing pad" when
>>> using unbonded GigE.
>>>
>> Actually it has had some significant changes which are likely to effect
>> the results as it now has
>> an entirely new IO scheduler, so retesting would be wise.
> And this affects which parts of my post?
>
> Reading and writing to a raidz* requires touching all or almost all of
> the disks.
>
> Writing to a mirror requires touching all the disks. Reading from a mirror
> requires touching one disk.
Yes however if you get say a 10% improvement on scheduling said writes / 
reads then the overall impact will be noticeable.
> That hasn't changed. I'm skeptical that a new way of doing the same thing
> would change the results that much, especially for a large stream of
> data.
>
> I can see a new I/O scheduler being more _fair_, but that only applies
> when the box has multiple things going on.
A concrete example for mirrors will performance when dealing with 3 
readers demonstrated an increased in throughput from 168MB/s to 320MB/s 
with prefetch and without prefetch that was 95MB/s increased to 284MB/s 
in our testing, so significant differences.

This is a rather extreme example, but there's never any harm in 
re-testing to avoid using incorrect assumptions ;-)

     Regards
     Steve

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 18:58:23 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 08A01F12
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 18:58:23 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id C5625339
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 18:58:22 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
 [65.66.246.65])
 by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t5MIwJar014990;
 Mon, 22 Jun 2015 13:58:20 -0500 (CDT)
Date: Mon, 22 Jun 2015 13:58:19 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: kpneal@pobox.com
cc: freebsd-fs@freebsd.org
Subject: Re:  ZFS raid write performance?
In-Reply-To: <20150622153056.GA96798@neutralgood.org>
Message-ID: <alpine.GSO.2.01.1506221355420.12519@freddy.simplesystems.org>
References: <5587C3FF.9070407@sneakertech.com>
 <20150622121343.GB60684@neutralgood.org> <55880544.70907@multiplay.co.uk>
 <20150622153056.GA96798@neutralgood.org>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (blade.simplesystems.org [65.66.246.90]);
 Mon, 22 Jun 2015 13:58:20 -0500 (CDT)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 18:58:23 -0000

On Mon, 22 Jun 2015, kpneal@pobox.com wrote:

> Reading and writing to a raidz* requires touching all or almost all of
> the disks.
>
> Writing to a mirror requires touching all the disks. Reading from a mirror
> requires touching one disk.

Keep in mind that for the same number of disks, using mirrors results 
in more vdevs and less use of precious IOPS.  Also, using mirrors 
results in larger I/O requests since zfs blocks don't need to be 
fragmented.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 20:46:59 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 6B3DF723
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 20:46:59 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 46E271C8
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 20:46:58 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 903A63F732;
 Mon, 22 Jun 2015 16:46:57 -0400 (EDT)
Message-ID: <55887441.70605@sneakertech.com>
Date: Mon, 22 Jun 2015 16:46:57 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Todd Russell <trussell@sjasc.edu>
CC: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
 <CAFXz1NuBDYjE8WT4QFYO4z-4+AfJxr5uYnEFJ-mDX2TZet8yFA@mail.gmail.com>
In-Reply-To: <CAFXz1NuBDYjE8WT4QFYO4z-4+AfJxr5uYnEFJ-mDX2TZet8yFA@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 20:46:59 -0000

> I hate to jump in and be "that guy" but, seriously, if you are using
> this for something crucial, are you really going to risk it all with "an
> old 500GB"? Drives aren't so expensive that you can't afford to buy a
> spare match to keep on the side until such a day occurs.

That's a fair point, but the question here is "catastrophic emergency" 
that takes out all of your spares and you have to limp by on something 
you found under the couch for a day or two until you can get new drives 
in. You can imagine whatever contrived situation is most likely in your 
case.

My main concern is that ZFS is just kinda inflexible about some things, 
(especially disk/pool configurations) and that has the potential to 
cause real problems in some situations. I've seen a lot of things happen 
against the odds through the years, so I like to plan ahead as much as 
possible and try to figure out what my options are for mitigating those 
risks. Part of that means periodically asking around to see what's 
changed that I might have missed.

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 21:02:23 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 5C5B19C2
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 21:02:23 +0000 (UTC)
 (envelope-from alex.burlyga.ietf@gmail.com)
Received: from mail-yk0-x233.google.com (mail-yk0-x233.google.com
 [IPv6:2607:f8b0:4002:c07::233])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 17760FE9
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 21:02:23 +0000 (UTC)
 (envelope-from alex.burlyga.ietf@gmail.com)
Received: by ykfy125 with SMTP id y125so22371148ykf.1
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 14:02:22 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=BvHgdACUUfkUJRRFCYXCAPPQfz+knpZwZkYbMrBS34g=;
 b=TQ964hBhA15gIHnC3fVQM1HX/3iJGUSxVzd5xpQ854Js7S6OsMm3o8DZavUOscoAMU
 snvOYePogNLJAXPlCx5/3fNcsuWjIXXnF0GjPoBzbrCiKkjySuVf4iNjERf5ApXc6fYR
 w9QKjW6/Hs5WmZwlQ+7JSblE6qO5jLvTuaqJgFo0dTFnLpZGJ8BQ2415Zv/0ySIHXxKt
 dJrfKKyC5CL/ENUDepBSf1rq09mP7vf+NMRMq3Z2tYjzP+ZCj0V8TuFmQQfaHi2AMiHN
 aePS4JimbQHKkiABvH74mqj35bKnb8fvH6NCokkJo7ZrHobh+HYK+qm8OjqSn24hKGSr
 7uoA==
MIME-Version: 1.0
X-Received: by 10.170.223.131 with SMTP id p125mr38768155ykf.47.1435006942126; 
 Mon, 22 Jun 2015 14:02:22 -0700 (PDT)
Received: by 10.13.244.65 with HTTP; Mon, 22 Jun 2015 14:02:22 -0700 (PDT)
In-Reply-To: <1969046464.61534041.1434897034960.JavaMail.root@uoguelph.ca>
References: <CA+JhTNTSC-xPVdpUGcQemVMLUwuQB6D8-3d2HD6WjU+jd1SMNQ@mail.gmail.com>
 <1969046464.61534041.1434897034960.JavaMail.root@uoguelph.ca>
Date: Mon, 22 Jun 2015 14:02:22 -0700
Message-ID: <CA+JhTNS4XYrmrJeQEQh85d0PHkGG7a4Yt3ZL1E8-RLwNxWA1+Q@mail.gmail.com>
Subject: Re: [nfs][client] - Question about handling of the NFS3_EEXIST error
 in SYMLINK rpc
From: "alex.burlyga.ietf alex.burlyga.ietf" <alex.burlyga.ietf@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-fs <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 21:02:23 -0000

Rick,

Thank you for a quick turn around, see answers inline:

On Sun, Jun 21, 2015 at 7:30 AM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> Alex Burlyga wrote:
>> Hi,
>>
>> NFS client code in nfsrpc_symlink() masks server returned NFS3_EEXIST
>> error
>> code
>> by returning 0 to the upper layers. I'm assuming this was an attempt
>> to
>> work around
>> some server's broken replay cache out there, however, it breaks a
>> more
>> common
>> case where server is returning EEXIST for legitimate reason and
>> application
>> is expecting this error code and equipped to deal with it.
>>
>> To fix it I see three ways of doing this:
>>  * Remove offending code
>>  * Make it optional, sysctl?
>>  * On NFS3_EEXIST send READLINK rpc to make sure symlink content is
>>  right
>>
>> Which of the ways will maximize the chances of getting this fix
>> upstream?
>>
> I've attached a patch for testing/review that does essentially #2.
> It has no effect on trivial tests, since the syscall does a Lookup
> before trying to create the symlink and fails with EEXIST.
> Do you have a case where competing clients are trying to create
> the symlink or something like that, which runs into this?

That's exactly failing test case we are running into.
>
> Please test the attached patch, since I don't know how to do that, rick
Great! I'll test it. I was leaning towards option 3 for SYMLINK and
option 2 for MKDIR.
This will work. Thanks for taking your time to generate the patch!

>
>> One more point, old client circa FreeBSD 7.0 does not exhibit this
>> problem.
>>
>> Alex
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>>

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 21:03:14 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 59B78A1C
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 21:03:14 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3838FFC
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 21:03:14 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 237F93F6E8;
 Mon, 22 Jun 2015 17:03:13 -0400 (EDT)
Message-ID: <55887810.3080301@sneakertech.com>
Date: Mon, 22 Jun 2015 17:03:12 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Xin Li <delphij@delphij.net>
CC: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
In-Reply-To: <5587C97F.2000407@delphij.net>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 21:03:14 -0000

>> What's sequential write performance like these days for ZFS
>> raidzX? Someone suggested to me that I set up a single not-raid
>> disk to act as a fast 'landing pad' for receiving files, then move
>> them to the pool later in the background. Is that actually
>> necessary? (Assume generic sata drives, 250mb-4gb sized files, and
>> transfers are across a LAN using single unbonded GigE).
>
> That sounds really weird recommendation IMHO.  Did "someone" explained
> with the reasoning/benefit of that "landing pad"?

Sort of. Something about the checksum calculations causing too much 
overhead. I think they were confused about sequential write vs random 
write, and possibly mdadm vs zfs. It was just something mentioned in 
passing that I didn't want to start a debate about at the time, since I 
wasn't 100% sure.


>a single hard drive won't do much beyond 100MB/s (maybe
> 120MB/s max) for sequential 128kB blocks, so that "landing pad" would
> probably not very helpful assuming you can saturate your GigE network

Wait, I'm confused. A single GigE has a theoretical max of like 
100mb/sec. That would imply the drive is probably about the same speed?


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 21:10:54 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 8DD00A95
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 21:10:54 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6B8EB349
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 21:10:54 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 8D7A43F725
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 17:10:53 -0400 (EDT)
Message-ID: <558879DD.2090005@sneakertech.com>
Date: Mon, 22 Jun 2015 17:10:53 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com>
 <alpine.GSO.2.01.1506220802100.4186@freddy.simplesystems.org>
In-Reply-To: <alpine.GSO.2.01.1506220802100.4186@freddy.simplesystems.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 21:10:54 -0000

> Writes over NFS 3 are synchronous. Writes over CIFS/Samba are likely
> not.

> write performance with multiple
> clients.

> Finally, single GigE is _slow_.


I realize I've left out some possibly critical information.

This box is a dump space that needs to receive files from widely mixed 
"clients" (*nix/Win/Mac/desktop/laptop/etc) across a LAN, so the file 
share software is Samba and the client machines will be connecting with 
(at most) a single GigE.

Dunno if that changes anything.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 21:15:40 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2D7A0B0B
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 21:15:40 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 0A5E28D7
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 21:15:39 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 8FBEA3F740
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 17:15:38 -0400 (EDT)
Message-ID: <55887AFA.30101@sneakertech.com>
Date: Mon, 22 Jun 2015 17:15:38 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com>
 <alpine.GSO.2.01.1506220802100.4186@freddy.simplesystems.org>
 <558879DD.2090005@sneakertech.com>
In-Reply-To: <558879DD.2090005@sneakertech.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 21:15:40 -0000


>> Writes over NFS 3 are synchronous. Writes over CIFS/Samba are likely
>> not.
>
>> write performance with multiple
>> clients.
>
>> Finally, single GigE is _slow_.
>
>
> I realize I've left out some possibly critical information.
>
> This box is a dump space that needs to receive files from widely mixed
> "clients" (*nix/Win/Mac/desktop/laptop/etc) across a LAN, so the file
> share software is Samba and the client machines will be connecting with
> (at most) a single GigE.

... and, these files will not need to be read or copied back off the 
server for a few days.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 21:19:37 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id A8BFBB5E
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 21:19:37 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 83895A62
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 21:19:37 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id CEA843F740
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 17:19:36 -0400 (EDT)
Message-ID: <55887BE8.2090305@sneakertech.com>
Date: Mon, 22 Jun 2015 17:19:36 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Freebsd fs <freebsd-fs@freebsd.org>
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
 <5587BC96.9090601@sneakertech.com> <20150622115856.GA60684@neutralgood.org>
In-Reply-To: <20150622115856.GA60684@neutralgood.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 21:19:37 -0000

>> So I take it that, aside from messing with a gvirstor/ sparse disk
>> image, there's still no way to really handle this because there's still
>> no way to shrink a pool after creation?
>
> Correct. There's no way to shrink a pool ever.

Drat, that's what I thought. Oh well.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 21:46:46 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id C8224DDE
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 21:46:46 +0000 (UTC)
 (envelope-from m.seaman@infracaninophile.co.uk)
Received: from smtp.infracaninophile.co.uk (smtp.infracaninophile.co.uk
 [IPv6:2001:8b0:151:1:3cd3:cd67:fafa:3d78])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.infracaninophile.co.uk",
 Issuer "infracaninophile.co.uk" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 69E56C12
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 21:46:46 +0000 (UTC)
 (envelope-from m.seaman@infracaninophile.co.uk)
Received: from liminal.local ([192.168.100.2]) (authenticated bits=0)
 by smtp.infracaninophile.co.uk (8.15.1/8.15.1) with ESMTPSA id t5MLkc10092529
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO)
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 22:46:38 +0100 (BST)
 (envelope-from m.seaman@infracaninophile.co.uk)
Authentication-Results: smtp.infracaninophile.co.uk;
 dmarc=none header.from=infracaninophile.co.uk
DKIM-Filter: OpenDKIM Filter v2.9.2 smtp.infracaninophile.co.uk t5MLkc10092529
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=infracaninophile.co.uk; s=201001-infracaninophile; t=1435009598;
 bh=R60+BhHbggf+N0YBN5nZ9fdo2f/+NeGA4scLZYXrB3Y=;
 h=Date:From:To:Subject:References:In-Reply-To;
 z=Date:=20Mon,=2022=20Jun=202015=2022:46:29=20+0100|From:=20Matthew
 =20Seaman=20<m.seaman@infracaninophile.co.uk>|To:=20freebsd-fs@fre
 ebsd.org|Subject:=20Re:=20ZFS=20pool=20restructuring=20and=20emerg
 ency=20repair|References:=20<5584C0BC.9070707@sneakertech.com>=20<
 5587BC96.9090601@sneakertech.com>=20<20150622115856.GA60684@neutra
 lgood.org>=20<55887BE8.2090305@sneakertech.com>|In-Reply-To:=20<55
 887BE8.2090305@sneakertech.com>;
 b=gRoYfF7VNm2IzDC6eAXeq9Azgnrm2kL25dCWYKHkzP5x8+yRwefcWIKpyLW7K8jnE
 yx9rW6rA5WU1xBNsyhM3w6aBjtCaEwKDqOp6tgQ6kqGSJyJ+m30M5OX51Uu7N0JC5Z
 y8IAUkt2gFS+smZdZOU2uQmZn2WHpoSERyOKPGH0=
X-Authentication-Warning: lucid-nonsense.infracaninophile.co.uk: Host
 [192.168.100.2] claimed to be liminal.local
Message-ID: <55888235.5000100@infracaninophile.co.uk>
Date: Mon, 22 Jun 2015 22:46:29 +0100
From: Matthew Seaman <m.seaman@infracaninophile.co.uk>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
 <5587BC96.9090601@sneakertech.com> <20150622115856.GA60684@neutralgood.org>
 <55887BE8.2090305@sneakertech.com>
In-Reply-To: <55887BE8.2090305@sneakertech.com>
Content-Type: multipart/signed; micalg=pgp-sha512;
 protocol="application/pgp-signature";
 boundary="9DMiUK28hUiFvVtASlkbdsOo5nBiBRee5"
X-Virus-Scanned: clamav-milter 0.98.7 at lucid-nonsense.infracaninophile.co.uk
X-Virus-Status: Clean
X-Spam-Status: No, score=-1.5 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU autolearn=ham autolearn_force=no
 version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on
 lucid-nonsense.infracaninophile.co.uk
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 21:46:47 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--9DMiUK28hUiFvVtASlkbdsOo5nBiBRee5
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

On 22/06/2015 22:19, Quartz wrote:
>>> So I take it that, aside from messing with a gvirstor/ sparse disk
>>> image, there's still no way to really handle this because there's sti=
ll
>>> no way to shrink a pool after creation?
>>
>> Correct. There's no way to shrink a pool ever.
>=20
> Drat, that's what I thought. Oh well.

Although in one of Matt Ahrens talks at BSDCan he spoke of plans to
change this.  Essentially you'ld be able to offline a vdev, and a
background process (like scrub) would copy all the data blocks from that
device to elsewhere in the pool.  Once finished, the devices making up
the vdev could be physically removed.

	Cheers,

	Matthew


--9DMiUK28hUiFvVtASlkbdsOo5nBiBRee5
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.20 (Darwin)

iQJ8BAEBCgBmBQJViII7XxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXQ2NTNBNjhCOTEzQTRFNkNGM0UxRTEzMjZC
QjIzQUY1MThFMUE0MDEzAAoJELsjr1GOGkATGHIP/2TlXQYW+nZbX3t1Sm7evmC7
zxN48bsUk3j1vIIR9X0OVvDcfAuuVAq5gxvBisteBOV3jJCEvIofjmbxx+4bEkDe
NLg1hp4tvqKyoFEDrwq/pOzorgKCd8JOXEKXIvNthTRKM4LLkZUebQ09yIykSYy1
ldSs/5YPL3taN/L9aTs+ibuS+FIpCdprZ7qhm9o434KkuagIo4GwqOM/kd0fzpAg
m7uxIytfw7mtDydCGDJ+tDjjcPEnToNkd2Xkl6QyEfpG3oUHpaqsZZuIDgRDlIY7
9RbUcSWym8cLqjpxYmeQbxLdCmNaxuhTZARiFx33N5oD0C7btce8A5+YuVzHu/0n
YO1ETXvTgHy0C96wLsd/jx22rROvRIB79YoY28nZfrKB6l4pAJuwGFQfx9oeJBjT
NQ8NdoLFlGmvhcQ4L66fEbeYDvnG1m64UpbvYeiNKX3NkNjcBV4NrIRbSRds79t9
+9reSjshk0bht0AfWSiABJeikzXan/JAoDV+4P3WvIbdFRcADD0dmkxrnQRPdrv0
P1y9ksh1WDNN95mofQ0U+i/UZuB9lsX42ciVV1/JL2VzNoP5oxF3qYX6V5IFAqjR
hxF3Beui0Ut2vmfBl5HeHW5eHtwzaq4RFStrYXmwVP7a4TVTcnxIfLemV4I9PnxS
JsgD47/wW3vuBoN0ZjqI
=W/cs
-----END PGP SIGNATURE-----

--9DMiUK28hUiFvVtASlkbdsOo5nBiBRee5--

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 21:53:19 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 713D2E4E
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 21:53:19 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4E2F6FAC
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 21:53:18 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id B6C3C3F753;
 Mon, 22 Jun 2015 17:53:17 -0400 (EDT)
Message-ID: <558883CD.3080006@sneakertech.com>
Date: Mon, 22 Jun 2015 17:53:17 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Matthew Seaman <m.seaman@infracaninophile.co.uk>
CC: freebsd-fs@freebsd.org
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
 <5587BC96.9090601@sneakertech.com> <20150622115856.GA60684@neutralgood.org>
 <55887BE8.2090305@sneakertech.com> <55888235.5000100@infracaninophile.co.uk>
In-Reply-To: <55888235.5000100@infracaninophile.co.uk>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 21:53:19 -0000

> Although in one of Matt Ahrens talks at BSDCan he spoke of plans to
> change this.  Essentially you'ld be able to offline a vdev, and a
> background process (like scrub) would copy all the data blocks from that
> device to elsewhere in the pool.  Once finished, the devices making up
> the vdev could be physically removed.

Oh, that would be nice. Was there a timeline guesstimate for when that 
would be implemented, or was it more a "maybe someday" thing?


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 22:37:07 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 25E522AE
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 22:37:07 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 03DF7C8
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 22:37:06 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id F08D83F760;
 Mon, 22 Jun 2015 18:37:05 -0400 (EDT)
Message-ID: <55888E0D.6040704@sneakertech.com>
Date: Mon, 22 Jun 2015 18:37:01 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: kpneal@pobox.com
CC: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
 <55887810.3080301@sneakertech.com> <20150622221422.GA71520@neutralgood.org>
In-Reply-To: <20150622221422.GA71520@neutralgood.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 22:37:07 -0000

>>> a single hard drive won't do much beyond 100MB/s (maybe
>>> 120MB/s max) for sequential 128kB blocks, so that "landing pad" would
>>> probably not very helpful assuming you can saturate your GigE network
>>
>> Wait, I'm confused. A single GigE has a theoretical max of like
>> 100mb/sec. That would imply the drive is probably about the same speed?
>
> You won't get the theoretical max what with the overhead of Ethernet
> packets, TCP/IP overhead, and SMB protocol overhead.

Right, I know that, that's why I don't understand what Xin Li was trying 
to say.

I guess a better way to word the question is: would a raidzX using 
generic drives, samba, and 500mb-4gb files be notably slower at writing 
than ~70mb/sec. I have a feeling not, but I wanted to double check.

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 22:51:08 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 77D6038D
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 22:51:08 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "anubis.delphij.net",
 Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 578058D9
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 22:51:08 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from zeta.ixsystems.com (c-71-202-112-39.hsd1.ca.comcast.net
 [71.202.112.39])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by anubis.delphij.net (Postfix) with ESMTPSA id 027EC182F3;
 Mon, 22 Jun 2015 15:51:06 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=delphij.net;
 s=anubis; t=1435013467; x=1435027867;
 bh=2+Av8ovk6mp7Ug5J3pZg31q1KXM38hnVZVkmPlL323k=;
 h=Date:From:Reply-To:To:CC:Subject:References:In-Reply-To;
 b=V4O55+zj4iUSqFK6PHt7FCkf8Y7sCV60mvq5rpIs80sXEWaV+rg5Ot6NY+d/9pjny
 cNB/EtImMmc31gmVbF3+thssEHAfA4jPw/NWMwCskWCQc8tLMoxW1XQZJ1UJD4koVt
 zRymYQ65KnZtVhRkLIxfhIC2O6UC+xiMctcyQHEo=
Message-ID: <5588915A.700@delphij.net>
Date: Mon, 22 Jun 2015 15:51:06 -0700
From: Xin Li <delphij@delphij.net>
Reply-To: d@delphij.net
Organization: The FreeBSD Project
MIME-Version: 1.0
To: Quartz <quartz@sneakertech.com>
CC: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
 <55887810.3080301@sneakertech.com>
In-Reply-To: <55887810.3080301@sneakertech.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 22:51:08 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 06/22/15 14:03, Quartz wrote:
>>> What's sequential write performance like these days for ZFS 
>>> raidzX? Someone suggested to me that I set up a single not-raid
>>> disk to act as a fast 'landing pad' for receiving files, then
>>> move them to the pool later in the background. Is that actually
>>> necessary? (Assume generic sata drives, 250mb-4gb sized files,
>>> and transfers are across a LAN using single unbonded GigE).
>> 
>> That sounds really weird recommendation IMHO.  Did "someone" 
>> explained with the reasoning/benefit of that "landing pad"?
> 
> Sort of. Something about the checksum calculations causing too much
> overhead. I think they were confused about sequential write vs 
> random

There are some overhead but it won't be the bottleneck if you are
using one GigE connection (not to mention that the default ZFS
checksum is not
SHA256 but a much faster algorithm), where network is the bottleneck.

> write, and possibly mdadm vs zfs. It was just something mentioned 
> in passing that I didn't want to start a debate about at the time, 
> since I wasn't 100% sure.
> 
>> a single hard drive won't do much beyond 100MB/s (maybe 120MB/s 
>> max) for sequential 128kB blocks, so that "landing pad" would 
>> probably not very helpful assuming you can saturate your GigE 
>> network
> 
> Wait, I'm confused. A single GigE has a theoretical max of like 
> 100mb/sec. That would imply the drive is probably about the same 
> speed?

No, what I'm trying to say is that since reading from the single drive
can't do much better than the network, it's likely that you wouldn't
be benefited by having it.

If the drive is much faster than the network and RAID-Z is much slower
than it, you could get some benefits because the unit you are using as
source of the replication can be re-purposed once data is on that
drive (which if I was to do the operation, I would probably never do
because that means less redundancy during migration).

Cheers,
- -- 
Xin LI <delphij@delphij.net>    https://www.delphij.net/
FreeBSD - The Power to Serve!           Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.1.5 (FreeBSD)

iQIcBAEBCgAGBQJViJFXAAoJEJW2GBstM+nsr3IP/AhXwYqg0MwXg+hbhl2kFhh2
w0lIbWwGe2KzttphJZDv+FORlnkUtynOS5YULiwavldup91DHyOZvru4HjuugBeR
BOW3Bkq4xOBVlzn9oW/BTMWbutevhmTBXG18iVxj2qwsy9NIuGL+1wyrYI5r5bl/
BaHBHYF6UXtr8Um77qZ8neKuv+ePGCCqYLei/paTc56XRnq5nlreulW8fxuHN4Pz
b3JPLzoaPdQOkcXtBe9V6ZlmdLvfBAmrCbD0gL0BDAeLsvkjRlQifwl+ZTSLeOtF
ja3bJ8tfCMFeGuRsL0RginiIn21if2rjZRuhWfUY0cDsPXgLVjseLLxc7F8NMwDt
rigkEuTTIfZy6UKD+70g05O2suN963Orqy1L6tfoAG0bEk9qH5ZoNl50F/fboRu/
68bAwTEMNo0x7h7XlCgB2GYS5qdDgsIeNbJLcDmXHmgTAyK/XM5/5pvSvXY2dYWN
/z/cYVHB8cVSwugcYZP/NQk8Eeldy2P+uZlUVqUSiWmk3m0x51VPyFJUtnnNIEf+
E4TupH/kyfZoiTgbsdCvfYqWm6YViNrjeZ8qa5qeGQnjDiNf1hCqyd/YbaCE0rHX
ACV4PyDkyW56uf+89uoKbn6QQMwb3FsL/6epODzLlSQYYDI+hvwN7PpKHQTnVFAS
gQutdcHlR3ZNiiLIO0ji
=bigs
-----END PGP SIGNATURE-----

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 23:04:58 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 3FE845AB
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 23:04:58 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "anubis.delphij.net",
 Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 26B2B132
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 23:04:57 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from zeta.ixsystems.com (c-71-202-112-39.hsd1.ca.comcast.net
 [71.202.112.39])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by anubis.delphij.net (Postfix) with ESMTPSA id 522071839B;
 Mon, 22 Jun 2015 16:04:57 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=delphij.net;
 s=anubis; t=1435014297; x=1435028697;
 bh=CwINWeWEfXWL5C6yjw1YlEYPjk/+4gqrCFXiwHzrN88=;
 h=Date:From:Reply-To:To:CC:Subject:References:In-Reply-To;
 b=U4hdbUqBJjwrvgsKB5jFDJzTqHxpcOvktenTbDAiFN8C4nWvapL4oR2vlUc59DZtt
 kq2HrrHWIlhVN7IMKKahMqHBOrtAlysP96Kf7kSpGuG9p8vb4L0hNlUb08UZV6LTi2
 FpgD0PNGTsFfC/JqWEB/j+46e2ipvymCpNQE1eOY=
Message-ID: <55889498.3090405@delphij.net>
Date: Mon, 22 Jun 2015 16:04:56 -0700
From: Xin Li <delphij@delphij.net>
Reply-To: d@delphij.net
Organization: The FreeBSD Project
MIME-Version: 1.0
To: kpneal@pobox.com, Quartz <quartz@sneakertech.com>
CC: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com>
 <20150622121343.GB60684@neutralgood.org>
In-Reply-To: <20150622121343.GB60684@neutralgood.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 23:04:58 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 06/22/15 05:13, kpneal@pobox.com wrote:
> For both reads and writes all levels of raidz* perform slightly
> faster than the speed of a single drive. _Slightly_ faster, like,
> the speed of a single drive * 1.1 or so roughly speaking.

How big is the data block for each read-write?  For large blocks
RAID-Z is likely to perform nearly as well as stripped disks (e.g. 3
disks RAID-Z is slightly slower than 2 disk stripped, but would be
much better than single disk pool).  Typically copying data would use
larger data blocks.

For smaller writes it's likely to have worse results.

> Finally, single GigE is _slow_. I see no point in a "landing pad"
> when using unbonded GigE.

How will a "landing pad" help when let's say we have 10GigE or even
faster network connection?

Cheers,
- -- 
Xin LI <delphij@delphij.net>    https://www.delphij.net/
FreeBSD - The Power to Serve!           Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.1.5 (FreeBSD)

iQIcBAEBCgAGBQJViJSYAAoJEJW2GBstM+nsVR0QAILGeNt4iT+mT1NeEEBiFtng
wcdmzNHeUueSjRl/ecJl4O6UbDH/OAxrUwLTyj6/mP8J60JhfIZisrcnSYXCSYQL
6INTAFy8u+eD7ewMNYXr0PddDku3bsTKSC7zlSZKURctlkqX1gEatGLJDDhDMqJj
KCcGpBnNX5CFS9y6UrCxbezoPwYlGf1CrEQooin5s5bLKWBwjBnG+XsaURtCOvXo
aY6ctTHyKDhuDWfBlaSU73eaFAw6zjcjVvJh6BHVA3JZSwx5F4vFT9ahjpPSimvS
h2byxrtSEi6PAIF+f7T+4zRoCqy+i2yYmnZlqHRQtGBtipF1cnzFlGQsGQtussE/
mamcXhcZDm2HbmxLyoUV15vNG4m/zvgMJK6VpMJrdbO5u/DfCDer/zuJyWJt6N/B
Ytldb/a24WLpKEDtdUtkFw774GPOgXk8YEU/TN6lyxRx5Ua6wb8kB66npEZi3eMN
tvdD45gKKVXmB5ooQjAiRzuOanKhDR40OBpCD1ZgNl513mSGJ0iNeJVGMzz2gakj
2r1GcRi+5DZTfcupc2NOLwe+8JM5B0QQzXCmuHS/eTdTGBBR4tfyIX8D5uxZs3wq
2CHPRg3yQxy0JOk14+q6g1uJfdiBjBt1SKF+gFD0TMuIFGEnREx6DpcNrJp3uPMF
INYxVG0U2UUmTYIeM2iJ
=CIqL
-----END PGP SIGNATURE-----

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 22 23:13:30 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 7FA165FF
 for <freebsd-fs@nevdull.freebsd.org>; Mon, 22 Jun 2015 23:13:30 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 471BC655
 for <freebsd-fs@freebsd.org>; Mon, 22 Jun 2015 23:13:29 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
 [65.66.246.65])
 by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t5MNDRhR023038;
 Mon, 22 Jun 2015 18:13:27 -0500 (CDT)
Date: Mon, 22 Jun 2015 18:13:27 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Quartz <quartz@sneakertech.com>
cc: freebsd-fs@freebsd.org
Subject: Re: ZFS pool restructuring and emergency repair
In-Reply-To: <558883CD.3080006@sneakertech.com>
Message-ID: <alpine.GSO.2.01.1506221812370.12519@freddy.simplesystems.org>
References: <5584C0BC.9070707@sneakertech.com>
 <5587BC96.9090601@sneakertech.com> <20150622115856.GA60684@neutralgood.org>
 <55887BE8.2090305@sneakertech.com> <55888235.5000100@infracaninophile.co.uk>
 <558883CD.3080006@sneakertech.com>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (blade.simplesystems.org [65.66.246.90]);
 Mon, 22 Jun 2015 18:13:27 -0500 (CDT)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Jun 2015 23:13:30 -0000

On Mon, 22 Jun 2015, Quartz wrote:

>> Although in one of Matt Ahrens talks at BSDCan he spoke of plans to
>> change this.  Essentially you'ld be able to offline a vdev, and a
>> background process (like scrub) would copy all the data blocks from that
>> device to elsewhere in the pool.  Once finished, the devices making up
>> the vdev could be physically removed.
>
> Oh, that would be nice. Was there a timeline guesstimate for when that would 
> be implemented, or was it more a "maybe someday" thing?

This has been planned for perhaps 8 years already.  Still in the 
original status.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 00:40:16 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 5B027A43
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 00:40:16 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from anubis.delphij.net (anubis.delphij.net
 [IPv6:2001:470:1:117::25])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "anubis.delphij.net",
 Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 3E941188
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 00:40:16 +0000 (UTC)
 (envelope-from delphij@delphij.net)
Received: from zeta.ixsystems.com (c-71-202-112-39.hsd1.ca.comcast.net
 [71.202.112.39])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by anubis.delphij.net (Postfix) with ESMTPSA id B20E2186DE;
 Mon, 22 Jun 2015 17:40:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=delphij.net;
 s=anubis; t=1435020014; x=1435034414;
 bh=s2MrOc4nMp2Xx+QZgChKN+UjBSQ1Vhq5ptkj2Ce9LPk=;
 h=Date:From:Reply-To:To:CC:Subject:References:In-Reply-To;
 b=g+9c6NbahS2nmCYJLY+ieFw+frM9V/O/mGo1Wev80S6sx2stfALCMXaew9FndDbv/
 K4Wm5R6YH7fuGHCOaV8mR+yO1gthXfbD1fgpHRbwjnniGK5GuKZsi2ts9aBo8L53H2
 6Y+v1H1GwZBGiD9RgSHLaxcwBkgZLEPLOf2ZbEvs=
Message-ID: <5588AAED.9030003@delphij.net>
Date: Mon, 22 Jun 2015 17:40:13 -0700
From: Xin Li <delphij@delphij.net>
Reply-To: d@delphij.net
Organization: The FreeBSD Project
MIME-Version: 1.0
To: kpneal@pobox.com, Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
CC: freebsd-fs@freebsd.org, Quartz <quartz@sneakertech.com>
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
 <5587BC96.9090601@sneakertech.com> <20150622115856.GA60684@neutralgood.org>
 <55887BE8.2090305@sneakertech.com> <55888235.5000100@infracaninophile.co.uk>
 <558883CD.3080006@sneakertech.com>
 <alpine.GSO.2.01.1506221812370.12519@freddy.simplesystems.org>
 <20150623000453.GA92931@neutralgood.org>
In-Reply-To: <20150623000453.GA92931@neutralgood.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 00:40:16 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 06/22/15 17:04, kpneal@pobox.com wrote:
> On Mon, Jun 22, 2015 at 06:13:27PM -0500, Bob Friesenhahn wrote:
>> On Mon, 22 Jun 2015, Quartz wrote:
>> 
>>>> Although in one of Matt Ahrens talks at BSDCan he spoke of
>>>> plans to change this.  Essentially you'ld be able to offline
>>>> a vdev, and a background process (like scrub) would copy all
>>>> the data blocks from that device to elsewhere in the pool.
>>>> Once finished, the devices making up the vdev could be
>>>> physically removed.
>>> 
>>> Oh, that would be nice. Was there a timeline guesstimate for
>>> when that would be implemented, or was it more a "maybe
>>> someday" thing?
>> 
>> This has been planned for perhaps 8 years already.  Still in the
>>  original status.
> 
> Is this via "block pointer rewrite"?

Actually the vdev removal feature is implemented back in last December
(bcc'ed Matt in case he want to chime in) by Delphix.  If I remember
correctly, it's almost finished at the time we had OpenZFS developer
summit last year.

The initial changeset is about 5000 or 5500 lines of changes and is
not integrated into Illumos repository yet.

==

The block pointer rewrite is something that would complicate the ZFS
code quite a lot (and possibly also break many layering design) so
don't expect it happening anytime soon.

Cheers,
- -- 
Xin LI <delphij@delphij.net>    https://www.delphij.net/
FreeBSD - The Power to Serve!           Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.1.5 (FreeBSD)

iQIcBAEBCgAGBQJViKrtAAoJEJW2GBstM+nspCYP/RSXZT9Ni/Asc17hkuro/0jR
lwDrkQkDrGin8/ACZ8MKNnVpdRIysuMvPD9fsi5pq7N9/nnGFf1Xq0EF7dYDn+bl
UpxnXJ678lnpwTls0NXo93RoPxzsBEzAbMjmJ4YWEWOe0iKnwj+hL4d7WoHYu0tM
mqFWpBM4kefd0QDjMLOMK58z20qdNqIPFxTMP+pTiVycl4x8lb284hLEWmi6u1g/
1u57PowRwCOWPxISuunUgeKpkz2c05YTG4vQzm2p9kzhjV2lrqNiNLSxPMv4FEfI
NTKSoscyfznm6GAOT+yV9HfepzZiWDQaG2l8epRA9hn+KhzMUsium3kX/3JHwL97
ybFqvPj46QzkVjnaTgAw2rsYqaYlDcBmJ6xKU/J+u+aq55VKnyN2sLYLYxD576QS
IgN7LYgMCp+6YCU+oOGhmwzcAlF4kykjeW//om3Kjr4VY7Fk7jEBC20vMn5bBobj
jtluxyDk2t3ccjbdNzAjHsgmzDSwQodgfsMjj7U35pTI6YkWG3Ywc/D7oLoc9C6K
oVZSJsh11tjCO0D6XZx2Nv3hy1Y3Lr8AAZ7SJnpm4zEBKx3HYyPWCtwjA3quSPxx
OSW3I7AlUUYaDfYrTIM3mrm4XOd5IBxGKfAbgdF/hQDTRQZUQXchqMxzfC6rEtv/
Djz/XVE1Ad9RgST3gzA+
=e4aP
-----END PGP SIGNATURE-----

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 03:29:29 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B968DDCB
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 03:29:29 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 9432CF5
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 03:29:29 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 2703B3F6DD;
 Mon, 22 Jun 2015 23:29:22 -0400 (EDT)
Message-ID: <5588D291.4030806@sneakertech.com>
Date: Mon, 22 Jun 2015 23:29:21 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: kpneal@pobox.com
CC: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
 <55887810.3080301@sneakertech.com> <20150622221422.GA71520@neutralgood.org>
 <55888E0D.6040704@sneakertech.com> <20150623002854.GB96928@neutralgood.org>
In-Reply-To: <20150623002854.GB96928@neutralgood.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 03:29:29 -0000

>> I guess a better way to word the question is: would a raidzX using
>> generic drives, samba, and 500mb-4gb files be notably slower at writing
>> than ~70mb/sec. I have a feeling not, but I wanted to double check.
>
> Gut feeling: I'm sticking with the network being the limiting factor.
> But the only real way to know is to setup a test system and, well, test.

Question: I'm not super familiar with the way freebsd+zfs handles IO and 
caching under the hood, especially not when you throw drive caching into 
the mix too. Does something simple like "dd if=/dev/zero of=/pool/foo 
bs=1m count=500" give me a reasonably accurate number for write speed?


From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 06:07:40 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 76446944
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 06:07:40 +0000 (UTC)
 (envelope-from m.seaman@infracaninophile.co.uk)
Received: from smtp.infracaninophile.co.uk (smtp.infracaninophile.co.uk
 [IPv6:2001:8b0:151:1:3cd3:cd67:fafa:3d78])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.infracaninophile.co.uk",
 Issuer "infracaninophile.co.uk" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 16AED209
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 06:07:39 +0000 (UTC)
 (envelope-from m.seaman@infracaninophile.co.uk)
Received: from liminal.local ([192.168.100.2]) (authenticated bits=0)
 by smtp.infracaninophile.co.uk (8.15.1/8.15.1) with ESMTPSA id t5N67PcA003958
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO);
 Tue, 23 Jun 2015 07:07:27 +0100 (BST)
 (envelope-from m.seaman@infracaninophile.co.uk)
Authentication-Results: smtp.infracaninophile.co.uk;
 dmarc=none header.from=infracaninophile.co.uk
DKIM-Filter: OpenDKIM Filter v2.9.2 smtp.infracaninophile.co.uk t5N67PcA003958
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=infracaninophile.co.uk; s=201001-infracaninophile; t=1435039648;
 bh=gT7CALiFZrOUYcvHS7/IH59TOCU3NGjn9WYZ4C3vP4E=;
 h=Date:From:To:CC:Subject:References:In-Reply-To;
 z=Date:=20Tue,=2023=20Jun=202015=2007:07:15=20+0100|From:=20Matthew
 =20Seaman=20<m.seaman@infracaninophile.co.uk>|To:=20Quartz=20<quar
 tz@sneakertech.com>|CC:=20freebsd-fs@freebsd.org|Subject:=20Re:=20
 ZFS=20pool=20restructuring=20and=20emergency=20repair|References:=
 20<5584C0BC.9070707@sneakertech.com>=20<5587BC96.9090601@sneakerte
 ch.com>=20<20150622115856.GA60684@neutralgood.org>=20<55887BE8.209
 0305@sneakertech.com>=20<55888235.5000100@infracaninophile.co.uk>=
 20<558883CD.3080006@sneakertech.com>|In-Reply-To:=20<558883CD.3080
 006@sneakertech.com>;
 b=ski5sHqUy6P/KHCfyjUgRBH1MogQz6cIvseG7zECEhJTb5x1UiZ3znwU0BJeZk0E+
 /eXhxPhROUhyyEBcBkCzyyrf42vckN4LujXZF2ZISEZ9fuc8fFXcqJ0JS7t1Et8PRy
 XtIRWsXww9n0DczZ5vnL8O0N+SGZR57wBCNWcFMY=
X-Authentication-Warning: lucid-nonsense.infracaninophile.co.uk: Host
 [192.168.100.2] claimed to be liminal.local
Message-ID: <5588F793.4050802@infracaninophile.co.uk>
Date: Tue, 23 Jun 2015 07:07:15 +0100
From: Matthew Seaman <m.seaman@infracaninophile.co.uk>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Quartz <quartz@sneakertech.com>
CC: freebsd-fs@freebsd.org
Subject: Re: ZFS pool restructuring and emergency repair
References: <5584C0BC.9070707@sneakertech.com>
 <5587BC96.9090601@sneakertech.com> <20150622115856.GA60684@neutralgood.org>
 <55887BE8.2090305@sneakertech.com> <55888235.5000100@infracaninophile.co.uk>
 <558883CD.3080006@sneakertech.com>
In-Reply-To: <558883CD.3080006@sneakertech.com>
Content-Type: multipart/signed; micalg=pgp-sha512;
 protocol="application/pgp-signature";
 boundary="nrFQqdjcIEWsOmaCfcTeJw2phUG9c0iMl"
X-Virus-Scanned: clamav-milter 0.98.7 at lucid-nonsense.infracaninophile.co.uk
X-Virus-Status: Clean
X-Spam-Status: No, score=-1.5 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU autolearn=ham autolearn_force=no
 version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on
 lucid-nonsense.infracaninophile.co.uk
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 06:07:40 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--nrFQqdjcIEWsOmaCfcTeJw2phUG9c0iMl
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

On 22/06/2015 22:53, Quartz wrote:
>> Although in one of Matt Ahrens talks at BSDCan he spoke of plans to
>> change this.  Essentially you'ld be able to offline a vdev, and a
>> background process (like scrub) would copy all the data blocks from th=
at
>> device to elsewhere in the pool.  Once finished, the devices making up=

>> the vdev could be physically removed.
>=20
> Oh, that would be nice. Was there a timeline guesstimate for when that
> would be implemented, or was it more a "maybe someday" thing?


He didn't specify any sort of timeline I'm afraid.

	Cheers,

	Matthew


--nrFQqdjcIEWsOmaCfcTeJw2phUG9c0iMl
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.20 (Darwin)

iQJ8BAEBCgBmBQJViPebXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXQ2NTNBNjhCOTEzQTRFNkNGM0UxRTEzMjZC
QjIzQUY1MThFMUE0MDEzAAoJELsjr1GOGkAT/6kP/RbBSBKTWOUOKaWGmjaZeksC
kSUajB43dEthUdfTIGmEShyuv7f8qhGjpFHUvI3V9B3//SQvoiWxeTOaVtjgS9sX
XxdNckK8jqk6xC14kqwLoU0yraARv0rftiC90p8b31OOHVJq3xMm3V/byBRWK5nH
cX4PvDX4q9FURutLPISAXxoHFm56jPWZ3HtjAYl5Ecu3SLTJi/5GcDQv8ELI3a62
Av9VNhJKMFjZIS2as+yjHWzGhMWMnWiBfo/ZZCNzp0q5jOtT8AOI3luby2AHLxbe
S/U+t2E7o4eIUUd5GztTtLRDCsi1eZ6AQuHmZ91LtKGgThalZ3MyJH8Pfr32QAA5
dzVNlMZEApavheD7NdP9W62av4qAqL6JfAUHga4qt+5MPQB0tAWxyqUp0plncwek
ThKezhgLfHEe76y3VsS7PAObUwNtTyfYnHKnQ7pnqq2Ct6qML+iGYbOXz3NhVobt
giF1Q4Hhb1Il47zGEFyHa9QzAwkulx4sySMgJbtS/KjUMC4g6tIifgH5oGMAnrT7
uK6wqhqPbcUJqSvU4Y/B8Y6tuNlx5hRqk3CK1Q/pOfHhmR2kVPqAGQrKSkRDsW3o
qdUyqnUVaSCWKXYODyv2BUrxKUWKHLOXj4FSPNsI8MybVdMxOV1IgxTAGJRVCfc8
jsJk1veTzROlLmyzO/kP
=e3AH
-----END PGP SIGNATURE-----

--nrFQqdjcIEWsOmaCfcTeJw2phUG9c0iMl--

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 10:06:56 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 0A2C2B50
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 10:06:56 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id DA93C2E9
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 10:06:55 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id F1ECB3F6DD;
 Tue, 23 Jun 2015 06:06:53 -0400 (EDT)
Message-ID: <55892FBD.7030204@sneakertech.com>
Date: Tue, 23 Jun 2015 06:06:53 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: kpneal@pobox.com
CC: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
 <55887810.3080301@sneakertech.com> <20150622221422.GA71520@neutralgood.org>
 <55888E0D.6040704@sneakertech.com> <20150623002854.GB96928@neutralgood.org>
 <5588D291.4030806@sneakertech.com> <20150623042234.GA66734@neutralgood.org>
In-Reply-To: <20150623042234.GA66734@neutralgood.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 10:06:56 -0000

> I'd go with something similar to how the machine will be used in production.

Hrmm, I was hoping for a quickie synthetic test just to gauge if the 
write speed was anywhere near as low as the network.

> Also, ZFS has the ability to detect sequences of zeros and optimize the
> writing of them.

Yeah, I had a vague memory of something like that, which is why I asked.

>Make sure the amount copied to the machine is, oh, say, at least
> twice (or maybe thrice?) the amount of RAM in the server just to be sure
> you've defeated any caching.

That's... not going to be easy. I don't have that much data broken into 
files of that size kicking around yet. It won't necessarily be a 
"production" test either, given that most clients will be transferring 
only a few gigs over at a time.


From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 10:55:04 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id D23C1D6B
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 10:55:04 +0000 (UTC)
 (envelope-from karli.sjoberg@slu.se)
Received: from exch2-4.slu.se (webmail.slu.se [77.235.224.124])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "webmail.slu.se", Issuer "TERENA SSL CA" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 606CBC82
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 10:55:04 +0000 (UTC)
 (envelope-from karli.sjoberg@slu.se)
Received: from exch2-4.slu.se (77.235.224.124) by exch2-4.slu.se
 (77.235.224.124) with Microsoft SMTP Server (TLS) id 15.0.1076.9; Tue, 23 Jun
 2015 12:39:23 +0200
Received: from exch2-4.slu.se ([fe80::3117:818f:aa48:9d9b]) by exch2-4.slu.se
 ([fe80::3117:818f:aa48:9d9b%22]) with mapi id 15.00.1076.000;
 Tue, 23 Jun 2015 12:39:23 +0200
From: =?utf-8?B?S2FybGkgU2rDtmJlcmc=?= <karli.sjoberg@slu.se>
To: Quartz <quartz@sneakertech.com>
CC: "kpneal@pobox.com" <kpneal@pobox.com>, FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: ZFS raid write performance?
Thread-Topic: ZFS raid write performance?
Thread-Index: AQHQraDeZ5iISfttbk29ogJBEcFyIA==
Date: Tue, 23 Jun 2015 10:39:23 +0000
Message-ID: <ad25d51b-ff7d-4b62-94b6-b873990e155b@email.android.com>
Accept-Language: sv-SE, en-US
Content-Language: sv-SE
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-ms-exchange-transport-fromentityheader: Hosted
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 10:55:04 -0000

DQpEZW4gMjMganVuIDIwMTUgMTI6MDYgZW0gc2tyZXYgUXVhcnR6IDxxdWFydHpAc25lYWtlcnRl
Y2guY29tPjoNCj4NCj4gPiBJJ2QgZ28gd2l0aCBzb21ldGhpbmcgc2ltaWxhciB0byBob3cgdGhl
IG1hY2hpbmUgd2lsbCBiZSB1c2VkIGluIHByb2R1Y3Rpb24uDQo+DQo+IEhybW0sIEkgd2FzIGhv
cGluZyBmb3IgYSBxdWlja2llIHN5bnRoZXRpYyB0ZXN0IGp1c3QgdG8gZ2F1Z2UgaWYgdGhlDQo+
IHdyaXRlIHNwZWVkIHdhcyBhbnl3aGVyZSBuZWFyIGFzIGxvdyBhcyB0aGUgbmV0d29yay4NCg0K
YmVuY2htYXJrcy9ib25uaWUrKw0KDQpHb29kIHN5bnRoZXRpYyB0ZXN0aW5nIHRvb2wgZm9yIGZp
bGVzeXN0ZW0gcGVyZm9ybWFuY2UuDQoNCi9LDQoNCj4NCj4gPiBBbHNvLCBaRlMgaGFzIHRoZSBh
YmlsaXR5IHRvIGRldGVjdCBzZXF1ZW5jZXMgb2YgemVyb3MgYW5kIG9wdGltaXplIHRoZQ0KPiA+
IHdyaXRpbmcgb2YgdGhlbS4NCj4NCj4gWWVhaCwgSSBoYWQgYSB2YWd1ZSBtZW1vcnkgb2Ygc29t
ZXRoaW5nIGxpa2UgdGhhdCwgd2hpY2ggaXMgd2h5IEkgYXNrZWQuDQo+DQo+ID5NYWtlIHN1cmUg
dGhlIGFtb3VudCBjb3BpZWQgdG8gdGhlIG1hY2hpbmUgaXMsIG9oLCBzYXksIGF0IGxlYXN0DQo+
ID4gdHdpY2UgKG9yIG1heWJlIHRocmljZT8pIHRoZSBhbW91bnQgb2YgUkFNIGluIHRoZSBzZXJ2
ZXIganVzdCB0byBiZSBzdXJlDQo+ID4geW91J3ZlIGRlZmVhdGVkIGFueSBjYWNoaW5nLg0KPg0K
PiBUaGF0J3MuLi4gbm90IGdvaW5nIHRvIGJlIGVhc3kuIEkgZG9uJ3QgaGF2ZSB0aGF0IG11Y2gg
ZGF0YSBicm9rZW4gaW50bw0KPiBmaWxlcyBvZiB0aGF0IHNpemUga2lja2luZyBhcm91bmQgeWV0
LiBJdCB3b24ndCBuZWNlc3NhcmlseSBiZSBhDQo+ICJwcm9kdWN0aW9uIiB0ZXN0IGVpdGhlciwg
Z2l2ZW4gdGhhdCBtb3N0IGNsaWVudHMgd2lsbCBiZSB0cmFuc2ZlcnJpbmcNCj4gb25seSBhIGZl
dyBnaWdzIG92ZXIgYXQgYSB0aW1lLg0KPg0KPg0KPiBfX19fX19fX19fX19fX19fX19fX19fX19f
X19fX19fX19fX19fX19fX19fX19fXw0KPiBmcmVlYnNkLWZzQGZyZWVic2Qub3JnIG1haWxpbmcg
bGlzdA0KPiBodHRwOi8vbGlzdHMuZnJlZWJzZC5vcmcvbWFpbG1hbi9saXN0aW5mby9mcmVlYnNk
LWZzDQo+IFRvIHVuc3Vic2NyaWJlLCBzZW5kIGFueSBtYWlsIHRvICJmcmVlYnNkLWZzLXVuc3Vi
c2NyaWJlQGZyZWVic2Qub3JnIg0K

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 13:32:33 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 004C39BA
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 13:32:32 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 9F6C8BAD
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 13:32:31 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
 [65.66.246.65])
 by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t5NDWO69019797;
 Tue, 23 Jun 2015 08:32:24 -0500 (CDT)
Date: Tue, 23 Jun 2015 08:32:24 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: kpneal@pobox.com
cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re:  ZFS raid write performance?
In-Reply-To: <20150623042234.GA66734@neutralgood.org>
Message-ID: <alpine.GSO.2.01.1506230812550.4186@freddy.simplesystems.org>
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
 <55887810.3080301@sneakertech.com> <20150622221422.GA71520@neutralgood.org>
 <55888E0D.6040704@sneakertech.com> <20150623002854.GB96928@neutralgood.org>
 <5588D291.4030806@sneakertech.com>
 <20150623042234.GA66734@neutralgood.org>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (blade.simplesystems.org [65.66.246.90]);
 Tue, 23 Jun 2015 08:32:24 -0500 (CDT)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 13:32:33 -0000

On Tue, 23 Jun 2015, kpneal@pobox.com wrote:
>
> When I was testing read speeds I tarred up a tree that was 700+GB in size
> on a server with 64GB of memory.

Tar (and cpio) are only single-threaded.  They open and read input 
files one by one.  Zfs's read-ahead algorithm ramps up the amount of 
read-ahead each time the program goes to read data and it is not 
already in memory.  Due to this ramp-up, input file size has a 
significant impact on the apparent read performance.  The ramp-up 
occurs on a per-file basis.  Large files (still much smaller than RAM) 
will produce a higher data rate than small files.  If read requests 
are pending for several files at once (or several read requests for 
different parts of the same file), then the observed data rate would 
be higher.

Tar/cpio read tests are often more impacted by disk latencies and zfs 
read-ahead algorithms than the peak performance of the data path.  A 
very large server with many disks may produce similar timings to a 
very small server.

Long ago I wrote a test script 
(http://www.simplesystems.org/users/bfriesen/zfs-discuss/zfs-cache-test.ksh) 
which was intended to expose a zfs bug existing at that time, but is 
still a very useful test for zfs caching and read-ahead by testing 
initial sequential read performance from a filesystem.  This script 
was written for Solaris and might need some small adaptation to be 
used for FreeBSD.

Extracting a tar file (particularly on a network client) is a very 
interesting test of network server write performance.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 15:17:15 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id DAA0BE1
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 15:17:15 +0000 (UTC)
 (envelope-from lkateley@kateley.com)
Received: from mail-ig0-f172.google.com (mail-ig0-f172.google.com
 [209.85.213.172])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id A76F7D8D
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 15:17:15 +0000 (UTC)
 (envelope-from lkateley@kateley.com)
Received: by igblr2 with SMTP id lr2so55825120igb.0
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 08:17:14 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to
 :subject:references:in-reply-to:content-type
 :content-transfer-encoding;
 bh=H3lD+6PGax4zErBlAI70lqdJ2FBQCAZ7WoLYx/KjR/g=;
 b=dF58tihMVHqO7NWZYWGtVlSyRRtwX1UUOhxG1dCsZMje+wz0ZUobW4ZIkqHGYpTBNt
 1kVhufoywSbrPPVyCtmnLDjm5V89S4PkzIY5dAz5JevukCF/KRGe61f2nRbMucN8hF1v
 1y51gFxCP8jUrAEiKLfTuZEnBaD6QTCTx6PnnyTnT+IQxA091zOybRdYuCWQtjXkKhBs
 aZCMHVRC5IrvIuBiyRgjC9Q/GslNk693J/Se9zv/dFcliv3Gp0HQ6GdBjXYDYtKHwGre
 eVGnBsYpFRbxtH07d+gqN2YHjXEhXC5Vi4f9mncOIOCCcoEmZsukl7A8JffVx6yd67ew
 2TCg==
X-Gm-Message-State: ALoCoQnqNS8LiYXbBAUuBOl1Y8Rnz8WDfBzzF+AXRnBKk+k2LNLkoUqFoPDn9m75qo6yYB4jXdSm
X-Received: by 10.107.28.202 with SMTP id c193mr45802637ioc.90.1435072634336; 
 Tue, 23 Jun 2015 08:17:14 -0700 (PDT)
Received: from kateleycoimac.local ([63.231.252.189])
 by mx.google.com with ESMTPSA id c12sm13487992ioj.39.2015.06.23.08.17.13
 for <freebsd-fs@freebsd.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 23 Jun 2015 08:17:13 -0700 (PDT)
Message-ID: <55897878.30708@kateley.com>
Date: Tue, 23 Jun 2015 10:17:12 -0500
From: Linda Kateley <lkateley@kateley.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10;
 rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
 <55887810.3080301@sneakertech.com> <20150622221422.GA71520@neutralgood.org>
 <55888E0D.6040704@sneakertech.com> <20150623002854.GB96928@neutralgood.org>
 <5588D291.4030806@sneakertech.com> <20150623042234.GA66734@neutralgood.org>
 <alpine.GSO.2.01.1506230812550.4186@freddy.simplesystems.org>
In-Reply-To: <alpine.GSO.2.01.1506230812550.4186@freddy.simplesystems.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 15:17:16 -0000

Is it possible that the suggestion for the "landing pad" could be 
recommending a smaller ssd pool? Then replicating back to a slower pool? 
I actually do that kind of architecture once in awhile, especially for 
uses like large cad drawings, where there is a tendency to work on one 
big file at a time... With lower costs and higher densities of ssd, this 
is a nice way to use them

On 6/23/15 8:32 AM, Bob Friesenhahn wrote:
> On Tue, 23 Jun 2015, kpneal@pobox.com wrote:
>>
>> When I was testing read speeds I tarred up a tree that was 700+GB in 
>> size
>> on a server with 64GB of memory.
>
> Tar (and cpio) are only single-threaded.  They open and read input 
> files one by one.  Zfs's read-ahead algorithm ramps up the amount of 
> read-ahead each time the program goes to read data and it is not 
> already in memory.  Due to this ramp-up, input file size has a 
> significant impact on the apparent read performance.  The ramp-up 
> occurs on a per-file basis.  Large files (still much smaller than RAM) 
> will produce a higher data rate than small files.  If read requests 
> are pending for several files at once (or several read requests for 
> different parts of the same file), then the observed data rate would 
> be higher.
>
> Tar/cpio read tests are often more impacted by disk latencies and zfs 
> read-ahead algorithms than the peak performance of the data path.  A 
> very large server with many disks may produce similar timings to a 
> very small server.
>
> Long ago I wrote a test script 
> (http://www.simplesystems.org/users/bfriesen/zfs-discuss/zfs-cache-test.ksh) 
> which was intended to expose a zfs bug existing at that time, but is 
> still a very useful test for zfs caching and read-ahead by testing 
> initial sequential read performance from a filesystem. This script was 
> written for Solaris and might need some small adaptation to be used 
> for FreeBSD.
>
> Extracting a tar file (particularly on a network client) is a very 
> interesting test of network server write performance.
>
> Bob

-- 
Linda Kateley
Kateley Company
Skype ID-kateleyco
http://kateleyco.com


From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 15:29:42 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 1FE011B0
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 15:29:42 +0000 (UTC)
 (envelope-from ben@altesco.nl)
Received: from altus-escon.com (altescovd.xs4all.nl [82.95.116.106])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (Client CN "proxy.altus-escon.com", Issuer "PositiveSSL CA 2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9AFD626A
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 15:29:40 +0000 (UTC)
 (envelope-from ben@altesco.nl)
Received: from daneel.altus-escon.com (daneel.altus-escon.com [193.78.231.7])
 (authenticated bits=0)
 by altus-escon.com (8.14.9/8.14.9) with ESMTP id t5NFQCAq058793
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT)
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 17:26:13 +0200 (CEST)
 (envelope-from ben@altesco.nl)
From: Ben Stuyts <ben@altesco.nl>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Panic on removing corrupted file on zfs
Message-Id: <BC0509E5-E86E-4D50-AB5A-308434C4C1CB@altesco.nl>
Date: Tue, 23 Jun 2015 17:26:12 +0200
To: freebsd-fs@freebsd.org
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2102\))
X-Mailer: Apple Mail (2.2102)
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3
 (altus-escon.com [193.78.231.142]); Tue, 23 Jun 2015 17:26:13 +0200 (CEST)
X-Virus-Scanned: clamav-milter 0.98.7 at mars.altus-escon.com
X-Virus-Status: Clean
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 15:29:42 -0000

Hello,

I have a corrupted file on a zfs file system. It is a backup store for =
an rsync job, and rsync errors with:

rsync: failed to read xattr rsync.%stat for =
"/home1/vwa/rsync/tank3/cam/jpg/487-20150224180950-05.jpg": Input/output =
error (5)
Corrupt rsync.%stat xattr attached to =
"/home1/vwa/rsync/tank3/cam/jpg/487-20150224180950-04.jpg": "100644 0,0 =
\#007:1001"
rsync error: error in file IO (code 11) at xattrs.c(1003) =
[generator=3D3.1.1]

This is a file from februari, and it hasn=E2=80=99t changed since. =
Smartctl shows no errors. No ECC memory on this system, so maybe caused =
by a memory problem. I am currently running a scrub for the second time. =
First time didn=E2=80=99t help.

Output from zpool status -v:

  pool: home1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Tue Jun 23 15:37:31 2015
        462G scanned out of 2.47T at 80.8M/s, 7h16m to go
        0 repaired, 18.29% done
config:

	NAME                                          STATE     READ =
WRITE CKSUM
	home1                                         ONLINE       0     =
0     0
	  gptid/14032b0b-7f05-11e3-8797-54bef70d8314  ONLINE       0     =
0     0

errors: Permanent errors have been detected in the following files:

        =
/home1/vwa/rsync/tank3/cam/jpg/487-20150224180950-05.jpg/<xattrdir>

When I try to rm the file the system panics. =46rom /var/crash:

tera8 dumped core - see /var/crash/vmcore.1

Tue Jun 23 15:37:11 CEST 2015

FreeBSD tera8 10.1-STABLE FreeBSD 10.1-STABLE #2 r284317: Fri Jun 12 =
17:07:21 CEST 2015     root@tera8:/usr/obj/usr/src/sys/GENERIC  amd64

panic: acl_from_aces: a_type is 0x4d00

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you =
are
welcome to change it and/or distribute copies of it under certain =
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for =
details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: acl_from_aces: a_type is 0x4d00
cpuid =3D 1
KDB: stack backtrace:
#0 0xffffffff8097d890 at kdb_backtrace+0x60
#1 0xffffffff809410e9 at vpanic+0x189
#2 0xffffffff80940f53 at panic+0x43
#3 0xffffffff81aaa209 at acl_from_aces+0x1c9
#4 0xffffffff81b61546 at zfs_freebsd_getacl+0xa6
#5 0xffffffff80e5de77 at VOP_GETACL_APV+0xa7
#6 0xffffffff809c7a3c at vacl_get_acl+0xdc
#7 0xffffffff809c7bd2 at sys___acl_get_link+0x72
#8 0xffffffff80d35817 at amd64_syscall+0x357
#9 0xffffffff80d1a89b at Xfast_syscall+0xfb

Is there any other way of getting rid of this file (except destroying =
the fs/pool)?=20

Thanks,
Ben


From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 17:24:09 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id E68CF799
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 17:24:08 +0000 (UTC)
 (envelope-from vitaoxru@vip12.sweb.ru)
Received: from vip12.sweb.ru (vip12.sweb.ru [77.222.40.88])
 (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id A88BCF9D
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 17:24:07 +0000 (UTC)
 (envelope-from vitaoxru@vip12.sweb.ru)
Received: from vitaoxru by vip12.sweb.ru with local (Exim 4.84)
 (envelope-from <vitaoxru@vip12.sweb.ru>) id 1Z7RvS-000WrP-RY
 for freebsd-fs@freebsd.org; Tue, 23 Jun 2015 20:23:58 +0300
To: freebsd-fs@freebsd.org
Subject: Pay for driving on toll road, invoice #000395230
X-PHP-Originating-Script: 10130:post.php(13) : eval()'d code
Date: Tue, 23 Jun 2015 20:23:58 +0300
From: "E-ZPass Support" <jonathan.kaplan@vip12.sweb.ru>
Reply-To: "E-ZPass Support" <jonathan.kaplan@vip12.sweb.ru>
Message-ID: <134a52e2ce858f5170de33ee9f15f858@vip12.sweb.ru>
X-Priority: 3
MIME-Version: 1.0
X-Sender-Uid: 10130
Content-Type: text/plain; charset=us-ascii
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 17:24:09 -0000

Notice to Appear,

You have a unpaid bill for using toll road.
Please, do not forget to service your debt.

You can find the invoice is in the attachment.

Sincerely,
Jonathan Kaplan,
E-ZPass Agent.


From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 23 18:50:55 2015
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@nevdull.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2430CC7A
 for <freebsd-fs@nevdull.freebsd.org>; Tue, 23 Jun 2015 18:50:55 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id F3608FE8
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 18:50:54 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id E79F83F6D9
 for <freebsd-fs@freebsd.org>; Tue, 23 Jun 2015 14:50:52 -0400 (EDT)
Message-ID: <5589AA8C.30304@sneakertech.com>
Date: Tue, 23 Jun 2015 14:50:52 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: ZFS raid write performance?
References: <5587C3FF.9070407@sneakertech.com> <5587C97F.2000407@delphij.net>
 <55887810.3080301@sneakertech.com> <20150622221422.GA71520@neutralgood.org>
 <55888E0D.6040704@sneakertech.com> <20150623002854.GB96928@neutralgood.org>
 <5588D291.4030806@sneakertech.com> <20150623042234.GA66734@neutralgood.org>
 <alpine.GSO.2.01.1506230812550.4186@freddy.simplesystems.org>
 <55897878.30708@kateley.com>
In-Reply-To: <55897878.30708@kateley.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Jun 2015 18:50:55 -0000

> Is it possible that the suggestion for the "landing pad" could be
> recommending a smaller ssd pool? Then replicating back to a slower pool?

It was for a single not-raid disk (then presumably rsyncing the files 
over to the pool, or something). The thought process seemed to be that a 
single disk always beat a raid-with-parity (ie; raid5, raidz2, etc) when 
it came to write speed.


> This is another argument for Quartz to test like he(?) would use in
> production.

Yeah, it's just that that's not terribly convenient at the moment. I 
think I'll just toss another drive in there and do some limited testing 
when we start copying things over.


From owner-freebsd-fs@freebsd.org  Wed Jun 24 20:48:47 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0DC6D915CFD
 for <freebsd-fs@mailman.ysv.freebsd.org>; Wed, 24 Jun 2015 20:48:47 +0000 (UTC)
 (envelope-from ronald-lists@klop.ws)
Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl
 [195.190.28.81]) (using TLSv1 with cipher AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id C793B1D98;
 Wed, 24 Jun 2015 20:48:46 +0000 (UTC)
 (envelope-from ronald-lists@klop.ws)
Received: from smtp.greenhost.nl ([213.108.104.138])
 by smarthost1.greenhost.nl with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16)
 (Exim 4.72) (envelope-from <ronald-lists@klop.ws>)
 id 1Z7rax-0005D2-To; Wed, 24 Jun 2015 22:48:37 +0200
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes
To: "Andriy Gapon" <avg@freebsd.org>, "Warren Block" <wblock@wonkity.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: 11-CURRENT does not mount my root ZFS
References: <op.x0dshmgbkndu52@53555e88.cm-6-6b.dynamic.ziggo.nl>
 <5581A7EF.5080606@FreeBSD.org>
 <alpine.BSF.2.20.1506171119530.4982@wonkity.com>
Date: Wed, 24 Jun 2015 22:48:30 +0200
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Ronald Klop" <ronald-lists@klop.ws>
Message-ID: <op.x0q5e4zukndu52@53555e88.cm-6-6b.dynamic.ziggo.nl>
In-Reply-To: <alpine.BSF.2.20.1506171119530.4982@wonkity.com>
User-Agent: Opera Mail/12.16 (FreeBSD)
X-Authenticated-As-Hash: 398f5522cb258ce43cb679602f8cfe8b62a256d1
X-Virus-Scanned: by clamav at smarthost1.samage.net
X-Spam-Level: /
X-Spam-Score: -0.2
X-Spam-Status: No, score=-0.2 required=5.0 tests=ALL_TRUSTED, BAYES_50,
 URIBL_BLOCKED autolearn=disabled version=3.3.1
X-Scan-Signature: dfea3049d3b923820beb462d65569822
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 24 Jun 2015 20:48:47 -0000

On Wed, 17 Jun 2015 19:22:23 +0200, Warren Block <wblock@wonkity.com>  
wrote:

> On Wed, 17 Jun 2015, Andriy Gapon wrote:
>
>> On 17/06/2015 18:40, Ronald Klop wrote:
>>> Hello,
>>>
>>> I'm running 10-STABLE on my laptop on ZFS for a while already.
>>> Today I compiled and installed a 11-CURRENT kernel. After boot the  
>>> kernel gives
>>> this error at the moment of mountroot.
>> [snip]
>>>
>>> What could be the cause of this? Can I provide more information?
>>
>> That would be very weird but perhaps the problem is caused by a  
>> mismatch in pool
>> features?  Of course, it's hard to imagine that the CURRENT kernel  
>> would not
>> support something that 10-STABLE supported...
>> However, zpool get all output might still be informative.
>
> Outdated boot code on all but one drive?  (Sorry, no experience booting  
> ZFS from MBR.)

Hi,

Thanks for your responses. I just started to binary search for the  
revision which breaks when I figured that the kernel I was trying was  
really old. I had used my svn checkout to look at some old versions of  
drivers for porting to 10.
So I'm now running from a recent 11 kernel and things work like they  
should!

Cheers,
Ronald.

From owner-freebsd-fs@freebsd.org  Thu Jun 25 01:15:22 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 34E5C91563B
 for <freebsd-fs@mailman.ysv.freebsd.org>; Thu, 25 Jun 2015 01:15:22 +0000 (UTC)
 (envelope-from egor.gabin@outlook.com)
Received: from COL004-OMC4S15.hotmail.com (col004-omc4s15.hotmail.com
 [65.55.34.217])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "*.outlook.com", Issuer "MSIT Machine Auth CA 2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id DED0C185C
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 01:15:21 +0000 (UTC)
 (envelope-from egor.gabin@outlook.com)
Received: from COL129-W60 ([65.55.34.199]) by COL004-OMC4S15.hotmail.com over
 TLS secured channel with Microsoft SMTPSVC(7.5.7601.22751); 
 Wed, 24 Jun 2015 18:14:15 -0700
X-TMN: [qBMGu3hg0DxmP83WQZrU/j/YSDxG7rK+]
X-Originating-Email: [egor.gabin@outlook.com]
Message-ID: <COL129-W6079B919E4A8AE8E1A56588EAE0@phx.gbl>
From: Bob Void <egor.gabin@outlook.com>
To: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Faulted pool
Date: Wed, 24 Jun 2015 21:14:15 -0400
Importance: Normal
MIME-Version: 1.0
X-OriginalArrivalTime: 25 Jun 2015 01:14:15.0723 (UTC)
 FILETIME=[40D51FB0:01D0AEE4]
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 01:15:22 -0000

Hi All=2C Yes the mess I am in is all my fault. I have no one to blame but =
myself and I am trying to fix this myself. I have reached a point where i n=
eed help from the community.=20
Here is my situation. Freenas 9.2.0.1 running on Freebsd 9.2-Release-p4
Faulted raidz pool called bubbapool is made up of 2TB SATA seagate 1/4 driv=
es is not being recognized by the bios.I was able to clone two of the drive=
 using DDRESCUE and the third had bad 1.5mb in bad sectors but was cloned. =
in the process of using ddrescue for the first time I overwrote the USB car=
d where freenas was run from so I had to go to the next version and importe=
d a config backup from a few months past.
I think I need to go back to a good uberblock but i dont know how to compil=
e code.=20
Details below. =20
antnas: ~ # camcontrol devlist=0A=
=0A=
<ST2000DM001-1ER164 CC25>          at scbus0 target 0 lun 0 (ada0=2Cpass0)=
=0A=
=0A=
<ST2000DM001-1ER164 CC25>          at scbus1 target 0 lun 0 (ada1=2Cpass1)=
=0A=
=0A=
<ST2000DM001-1ER164 CC25>          at scbus2 target 0 lun 0 (ada2=2Cpass2)=
=0A=
=0A=
<ST2000DM001-1ER164 CC25>          at scbus3 target 0 lun 0 (ada3=2Cpass3)=
=0A=
=0A=
<ST4000VN000-1H4168=0A=
SC46>          at scbus4 target 0 lun=0A=
0 (ada4=2Cpass4)=0A=
=0A=
<ST4000VN000-1H4168=0A=
SC46>          at scbus5 target 0 lun=0A=
0 (ada5=2Cpass5)=0A=
=0A=
<ST4000VN000-1H4168=0A=
SC46>          at scbus6 target 0 lun=0A=
0 (ada6=2Cpass6)=0A=
=0A=
<ST4000VN000-1H4168=0A=
SC46>          at scbus7 target 0 lun=0A=
0 (ada7=2Cpass7)=0A=
=0A=
<ST31500341AS=0A=
LC1A>                at scbus8 target=0A=
1 lun 0 (ada8=2Cpass8)=0A=
=0A=
<ST31500341AS=0A=
LC1A>                at scbus9 target=0A=
0 lun 0 (ada9=2Cpass9)=0A=
=0A=
<ST31500341AS=0A=
LC1A>                at scbus9 target=0A=
1 lun 0 (ada10=2Cpass10)=0A=
=0A=
<PNY USB 2.0 FD=0A=
1100>              at scbus10 target 0=0A=
lun 0 (da0=2Cpass11)
antnas: ~ # zpool=0A=
import -fV 2272410887342933893
antnas: ~ # zpool=0A=
status bubbapool=0A=
=0A=
  pool: bubbapool=0A=
=0A=
 state: FAULTED=0A=
=0A=
status: One or more=0A=
devices could not be opened.  There are=0A=
insufficient=0A=
=0A=
        replicas for the pool to continue=0A=
functioning.=0A=
=0A=
action: Attach the=0A=
missing device and online it using 'zpool online'.=0A=
=0A=
   see: http://illumos.org/msg/ZFS-8000-3C=0A=
=0A=
  scan: none requested=0A=
=0A=
config:=0A=
=0A=
 =0A=
=0A=
        NAME                      STATE     READ WRITE CKSUM=0A=
=0A=
        bubbapool                 FAULTED      0    =0A=
0     1=0A=
=0A=
          raidz1-0                DEGRADED     0    =0A=
0     6=0A=
=0A=
            ada0                  ONLINE       0    =0A=
0     0  block size: 512B configured=2C 4096B native=0A=
=0A=
            ada2                  ONLINE       0    =0A=
0     0  block size: 512B configured=2C 4096B native=0A=
=0A=
            15427384508884946962  UNAVAIL     =0A=
0     0     0 =0A=
was /dev/ada1=0A=
=0A=
            ada1                  ONLINE       0    =0A=
0     0  block size: 512B configured=2C 4096B native=0A=
=0A=
antnas: ~ #

antnas: ~ # zdb -l=0A=
/dev/ada0--------------------------------------------LABEL 0---------------=
-----------------------------    version: 5000    name: 'bubbapool'    stat=
e: 0    txg: 15723793    pool_guid: 2272410887342933893    hostid: 25391348=
34    hostname: 'antnas.local'    top_guid: 12206387516572927959    guid: 5=
263395365568228054    vdev_children: 1    vdev_tree:        type: 'raidz'  =
      id: 0        guid: 12206387516572927959        nparity: 1        meta=
slab_array: 14        metaslab_shift: 31        ashift: 9        asize: 800=
1576501248        is_log: 0        children[0]:            type: 'disk'    =
        id: 0            guid: 2847030196120806336            path: '/dev/a=
da3'            phys_path: '/dev/ada3'            whole_disk: 0            =
DTL: 21        children[1]:            type: 'disk'            id: 1       =
     guid: 5263395365568228054            path: '/dev/ada0'            phys=
_path: '/dev/ada0'            whole_disk: 0            DTL: 20        child=
ren[2]:            type: 'disk'            id: 2            guid: 154273845=
08884946962            path: '/dev/ada1'            phys_path: '/dev/ada1' =
           whole_disk: 0            DTL: 19        children[3]:            =
type: 'disk'            id: 3            guid: 17279438588802848693        =
    path: '/dev/ada2'            phys_path: '/dev/ada2'            whole_di=
sk: 0            DTL: 18            removed: 1    features_for_read:-------=
-------------------------------------LABEL 1-------------------------------=
-------------    version: 5000    name: 'bubbapool'    state: 0    txg: 157=
23793    pool_guid: 2272410887342933893    hostid: 2539134834    hostname: =
'antnas.local'    top_guid: 12206387516572927959    guid: 52633953655682280=
54    vdev_children: 1    vdev_tree:        type: 'raidz'        id: 0     =
   guid: 12206387516572927959        nparity: 1        metaslab_array: 14  =
      metaslab_shift: 31        ashift: 9        asize: 8001576501248      =
  is_log: 0        children[0]:            type: 'disk'            id: 0   =
         guid: 2847030196120806336            path: '/dev/ada3'            =
phys_path: '/dev/ada3'            whole_disk: 0            DTL: 21        c=
hildren[1]:            type: 'disk'            id: 1            guid: 52633=
95365568228054            path: '/dev/ada0'            phys_path: '/dev/ada=
0'            whole_disk: 0            DTL: 20        children[2]:         =
   type: 'disk'            id: 2            guid: 15427384508884946962     =
       path: '/dev/ada1'            phys_path: '/dev/ada1'            whole=
_disk: 0            DTL: 19        children[3]:            type: 'disk'    =
        id: 3            guid: 17279438588802848693            path: '/dev/=
ada2'            phys_path: '/dev/ada2'            whole_disk: 0           =
 DTL: 18            removed: 1    features_for_read:-----------------------=
---------------------LABEL 2--------------------------------------------fai=
led to unpack=0A=
label 2--------------------------------------------LABEL 3-----------------=
---------------------------failed to unpack=0A=
label 3antnas: ~ #  antnas: ~ # dmesg | grep -i ADA0ada0 at siisch0 bus=0A=
0 scbus0 target 0 lun 0ada0:=0A=
<ST2000DM001-1ER164 CC25> ATA-9 SATA 3.x deviceada0: Serial Number Z4Z26ATW=
ada0: 300.000MB/s=0A=
transfers (SATA 2.x=2C UDMA6=2C PIO 8192bytes)ada0: Command=0A=
Queueing enabledada0: 1907728MB=0A=
(3907027055 512 byte sectors: 16H 63S/T 16383C)ada0:=0A=
quirks=3D0x1<4K>ada0: Previously was=0A=
known as ad4=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
antnas: ~ #
antnas: ~ # zdb -l=0A=
/dev/ada1--------------------------------------------LABEL 0---------------=
-----------------------------failed to unpack=0A=
label 0--------------------------------------------LABEL 1-----------------=
---------------------------failed to unpack=0A=
label 1--------------------------------------------LABEL 2-----------------=
---------------------------failed to unpack=0A=
label 2--------------------------------------------LABEL 3-----------------=
---------------------------=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
failed to unpack=0A=
label 3
antnas: ~ #  dmesg | grep -i ADA1ada1 at siisch1 bus=0A=
0 scbus1 target 0 lun 0ada1:=0A=
<ST2000DM001-1ER164 CC25> ATA-9 SATA 3.x deviceada1: Serial Number Z8E00BZ3=
ada1: 300.000MB/s=0A=
transfers (SATA 2.x=2C UDMA6=2C PIO 8192bytes)ada1: Command=0A=
Queueing enabledada1: 1907729MB=0A=
(3907029168 512 byte sectors: 16H 63S/T 16383C)ada1:=0A=
quirks=3D0x1<4K>ada1: Previously was=0A=
known as ad6ada10 at ata1 bus 0=0A=
scbus9 target 1 lun 0ada10:=0A=
<ST31500341AS LC1A> ATA-8 SATA 2.x deviceada10: Serial Number=0A=
9VS0RWK6ada10: 150.000MB/s=0A=
transfers (SATA=2C UDMA5=2C PIO 8192bytes)ada10: 1430799MB=0A=
(2930277168 512 byte sectors: 16H 63S/T 16383C)ada10: Previously=0A=
was known as ad3=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
antnas: ~ #
antnas: ~ # zdb -l=0A=
/dev/ada2--------------------------------------------LABEL 0---------------=
-----------------------------    version: 5000    name: 'bubbapool'    stat=
e: 0    txg: 15516933    pool_guid: 2272410887342933893    hostname: ''    =
top_guid: 12206387516572927959    guid: 17279438588802848693    vdev_childr=
en: 1    vdev_tree:        type: 'raidz'        id: 0        guid: 12206387=
516572927959        nparity: 1        metaslab_array: 14        metaslab_sh=
ift: 31        ashift: 9        asize: 8001576501248        is_log: 0      =
  children[0]:            type: 'disk'            id: 0            guid: 28=
47030196120806336            path: '/dev/ada3'            phys_path: '/dev/=
ada3'            whole_disk: 0            DTL: 21        children[1]:      =
      type: 'disk'            id: 1            guid: 5263395365568228054   =
         path: '/dev/ada0'            phys_path: '/dev/ada0'            who=
le_disk: 0            DTL: 20        children[2]:            type: 'disk'  =
          id: 2            guid: 15427384508884946962            path: '/de=
v/ada1'            phys_path: '/dev/ada1'            whole_disk: 0         =
   DTL: 19        children[3]:            type: 'disk'            id: 3    =
        guid: 17279438588802848693            path: '/dev/ada2'            =
phys_path: '/dev/ada2'            whole_disk: 0            DTL: 18    featu=
res_for_read:--------------------------------------------LABEL 1-----------=
---------------------------------    version: 5000    name: 'bubbapool'    =
state: 0    txg: 15516933    pool_guid: 2272410887342933893    hostname: ''=
    top_guid: 12206387516572927959    guid: 17279438588802848693    vdev_ch=
ildren: 1    vdev_tree:        type: 'raidz'        id: 0        guid: 1220=
6387516572927959        nparity: 1        metaslab_array: 14        metasla=
b_shift: 31        ashift: 9        asize: 8001576501248        is_log: 0  =
      children[0]:            type: 'disk'            id: 0            guid=
: 2847030196120806336            path: '/dev/ada3'            phys_path: '/=
dev/ada3'            whole_disk: 0            DTL: 21        children[1]:  =
          type: 'disk'            id: 1            guid: 526339536556822805=
4            path: '/dev/ada0'            phys_path: '/dev/ada0'           =
 whole_disk: 0            DTL: 20        children[2]:            type: 'dis=
k'            id: 2            guid: 15427384508884946962            path: =
'/dev/ada1'            phys_path: '/dev/ada1'            whole_disk: 0     =
       DTL: 19        children[3]:            type: 'disk'            id: 3=
            guid: 17279438588802848693            path: '/dev/ada2'        =
    phys_path: '/dev/ada2'            whole_disk: 0            DTL: 18    f=
eatures_for_read:--------------------------------------------LABEL 2-------=
-------------------------------------failed to unpack=0A=
label 2--------------------------------------------LABEL 3-----------------=
---------------------------failed to unpack=0A=
label 3antnas: ~ # antnas: ~ # dmesg | grep -i ADA2ada2 at siisch2 bus=0A=
0 scbus2 target 0 lun 0ada2:=0A=
<ST2000DM001-1ER164 CC25> ATA-9 SATA 3.x deviceada2: Serial Number Z4Z21B4K=
ada2: 300.000MB/s=0A=
transfers (SATA 2.x=2C UDMA6=2C PIO 8192bytes)ada2: Command=0A=
Queueing enabledada2: 1907728MB=0A=
(3907027055 512 byte sectors: 16H 63S/T 16383C)ada2:=0A=
quirks=3D0x1<4K>ada2: Previously was=0A=
known as ad8=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
antnas: ~ #

antnas: ~ # zdb -l=0A=
/dev/ada3--------------------------------------------LABEL 0---------------=
-----------------------------    version: 5000    name: 'bubbapool'    stat=
e: 0    txg: 15723793    pool_guid: 2272410887342933893    hostid: 25391348=
34    hostname: 'antnas.local'    top_guid: 12206387516572927959    guid: 2=
847030196120806336    vdev_children: 1    vdev_tree:        type: 'raidz'  =
      id: 0        guid: 12206387516572927959        nparity: 1        meta=
slab_array: 14        metaslab_shift: 31        ashift: 9        asize: 800=
1576501248        is_log: 0        children[0]:            type: 'disk'    =
        id: 0            guid: 2847030196120806336            path: '/dev/a=
da3'            phys_path: '/dev/ada3'            whole_disk: 0            =
DTL: 21        children[1]:            type: 'disk'            id: 1       =
     guid: 5263395365568228054            path: '/dev/ada0'            phys=
_path: '/dev/ada0'            whole_disk: 0            DTL: 20        child=
ren[2]:            type: 'disk'            id: 2            guid: 154273845=
08884946962            path: '/dev/ada1'            phys_path: '/dev/ada1' =
           whole_disk: 0            DTL: 19        children[3]:            =
type: 'disk'            id: 3            guid: 17279438588802848693        =
    path: '/dev/ada2'            phys_path: '/dev/ada2'            whole_di=
sk: 0            DTL: 18            removed: 1    features_for_read:-------=
-------------------------------------LABEL 1-------------------------------=
-------------    version: 5000    name: 'bubbapool'    state: 0    txg: 157=
23793    pool_guid: 2272410887342933893    hostid: 2539134834    hostname: =
'antnas.local'    top_guid: 12206387516572927959    guid: 28470301961208063=
36    vdev_children: 1    vdev_tree:        type: 'raidz'        id: 0     =
   guid: 12206387516572927959        nparity: 1        metaslab_array: 14  =
      metaslab_shift: 31        ashift: 9        asize: 8001576501248      =
  is_log: 0        children[0]:            type: 'disk'            id: 0   =
         guid: 2847030196120806336            path: '/dev/ada3'            =
phys_path: '/dev/ada3'            whole_disk: 0            DTL: 21        c=
hildren[1]:            type: 'disk'            id: 1            guid: 52633=
95365568228054            path: '/dev/ada0'            phys_path: '/dev/ada=
0'            whole_disk: 0            DTL: 20        children[2]:         =
   type: 'disk'            id: 2            guid: 15427384508884946962     =
       path: '/dev/ada1'            phys_path: '/dev/ada1'            whole=
_disk: 0            DTL: 19        children[3]:            type: 'disk'    =
        id: 3            guid: 17279438588802848693            path: '/dev/=
ada2'            phys_path: '/dev/ada2'            whole_disk: 0           =
 DTL: 18            removed: 1    features_for_read:-----------------------=
---------------------LABEL 2--------------------------------------------fai=
led to unpack=0A=
label 2--------------------------------------------LABEL 3-----------------=
---------------------------failed to unpack=0A=
label 3antnas: ~ # antnas: ~ # dmesg | grep -i ADA3ada3 at siisch3 bus=0A=
0 scbus3 target 0 lun 0ada3:=0A=
<ST2000DM001-1ER164 CC25> ATA-9 SATA 3.x deviceada3: Serial Number Z4Z276N0=
ada3: 300.000MB/s=0A=
transfers (SATA 2.x=2C UDMA6=2C PIO 8192bytes)ada3: Command=0A=
Queueing enabledada3: 1907728MB=0A=
(3907027055 512 byte sectors: 16H 63S/T 16383C)ada3:=0A=
quirks=3D0x1<4K>=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
=0A=
ada3: Previously was=0A=
known as ad10 		 	   		  =

From owner-freebsd-fs@freebsd.org  Thu Jun 25 12:32:03 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8855B98CD7F
 for <freebsd-fs@mailman.ysv.freebsd.org>; Thu, 25 Jun 2015 12:32:03 +0000 (UTC)
 (envelope-from mjguzik@gmail.com)
Received: from mail-wi0-x229.google.com (mail-wi0-x229.google.com
 [IPv6:2a00:1450:400c:c05::229])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id EFC0C1610
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 12:32:02 +0000 (UTC)
 (envelope-from mjguzik@gmail.com)
Received: by wiwl6 with SMTP id l6so16589561wiw.0
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 05:32:01 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=YJ4es8Qs8l+PimabFp9qms7h05LRfscde5bSIfyQev4=;
 b=oc/j+QA7L9eklWwB6sS36arYkjpSW71LC5ggP6KlFbhjC/ZXXKb2yVr4C6NdJ/IdHf
 gN2FC09jyjkmSoPK5EfzH4OXBoQa9LT3Umry6bOhtZd/Mu4+P+7a+momPiIGdUhDtBxl
 DyPIQ2Pnqp5QmY11PRmMNDjvkSCSg9SkhEuaEnMU/SRYONqisVmhrEVAAjtOSiFk2FG1
 sezHeV2nj5UWW6f+ybCkek/z5X7z0nKlBWxPOMvqm6pPr6YWi0Te9mE62+H7Dft0Iv75
 qMPMZGXMy3oJ3aEsUVkDQIPM+iQCKraf5AIjbAwioVRwWl1t1/4CuzIL+JomXvOkPwr4
 r4eA==
X-Received: by 10.180.106.195 with SMTP id gw3mr5346562wib.25.1435235521492;
 Thu, 25 Jun 2015 05:32:01 -0700 (PDT)
Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net.
 [2001:470:1f08:1f7::2])
 by mx.google.com with ESMTPSA id g15sm7405619wiv.22.2015.06.25.05.31.59
 (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
 Thu, 25 Jun 2015 05:31:59 -0700 (PDT)
Date: Thu, 25 Jun 2015 14:31:57 +0200
From: Mateusz Guzik <mjguzik@gmail.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: atomic v_usecount and v_holdcnt
Message-ID: <20150625123156.GA29667@dft-labs.eu>
References: <20141122002812.GA32289@dft-labs.eu>
 <20141122092527.GT17068@kib.kiev.ua>
 <20141122211147.GA23623@dft-labs.eu>
 <20141124095251.GH17068@kib.kiev.ua>
 <20150314225226.GA15302@dft-labs.eu>
 <20150316094643.GZ2379@kib.kiev.ua>
 <20150317014412.GA10819@dft-labs.eu>
 <20150318104442.GS2379@kib.kiev.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20150318104442.GS2379@kib.kiev.ua>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 12:32:03 -0000

On Wed, Mar 18, 2015 at 12:44:42PM +0200, Konstantin Belousov wrote:
> On Tue, Mar 17, 2015 at 02:44:12AM +0100, Mateusz Guzik wrote:
> > On Mon, Mar 16, 2015 at 11:46:43AM +0200, Konstantin Belousov wrote:
> > > On Sat, Mar 14, 2015 at 11:52:26PM +0100, Mateusz Guzik wrote:
> > > > On Mon, Nov 24, 2014 at 11:52:52AM +0200, Konstantin Belousov wrote:
> > > > > On Sat, Nov 22, 2014 at 10:11:47PM +0100, Mateusz Guzik wrote:
> > > > > > On Sat, Nov 22, 2014 at 11:25:27AM +0200, Konstantin Belousov wrote:
> > > > > > > On Sat, Nov 22, 2014 at 01:28:12AM +0100, Mateusz Guzik wrote:
> > > > > > > > The idea is that we don't need an interlock as long as we don't
> > > > > > > > transition either counter 1->0 or 0->1.
> > > > > > > I already said that something along the lines of the patch should work.
> > > > > > > In fact, you need vnode lock when hold count changes between 0 and 1,
> > > > > > > and probably the same for use count.
> > > > > > > 
> > > > > > 
> > > > > > I don't see why this would be required (not that I'm an VFS expert).
> > > > > > vnode recycling seems to be protected with the interlock.
> > > > > > 
> > > > > > In fact I would argue that if this is really needed, current code is
> > > > > > buggy.
> > > > > Yes, it is already (somewhat) buggy.
> > > > > 
> > > > > Most need of the lock is for the case of counts coming from 1 to 0.
> > > > > The reason is the handling of the active vnode list, which is used
> > > > > for limiting the amount of vnode list walking in syncer.  When hold
> > > > > count is decremented to 0, vnode is removed from the active list.
> > > > > When use count is decremented to 0, vnode is supposedly inactivated,
> > > > > and vinactive() cleans the cached pages belonging to vnode.  In other
> > > > > words, VI_OWEINACT for dirty vnode is sort of bug.
> > > > > 
> > > > 
> > > > Modified the patch to no longer have the usecount + interlock dropped +
> > > > VI_OWEINACT set window.
> > > > 
> > > > Extended 0->1 hold count + vnode not locked window remains. I can fix
> > > > that if it is really necessary by having _vhold return with interlock
> > > > held if it did such transition.
> > > 
> > > In v_upgrade_usecount(), you call v_incr_devcount() without without interlock
> > > held.  What prevents the devfs vnode from being recycled, in particular,
> > > from invalidation of v_rdev pointer ?
> > > 
> > 
> > Right, that was buggy. Fixed in the patch below.
> Why non-atomicity of updates to several counters is safe ?  This at least
> requires an explanation in the comment, I mean holdcnt/usecnt pair.
> 

The patch below was tested with make -j 40 buildworld in a loop for 7 hours
and it survived.

I started a comment above vget, unfinished yet.

Further playing around revealed that zfs will vref a vnode with no
usecount (zfs_lookup -> zfs_dirlook -> zfs_dirent_lock -> zfs_zget ->
VN_HOLD) and it is possible that it will have VI_OWEINACT set (tested on
a kernel without my patch). VN_HOLD is defined as vref(). The code can
sleep, so some shuffling around can be done to call vinactive() if it
happens to be exclusively locked (but most of the time it is locked
shared).

However, it seems that vputx deals with such consumers:
if (vp->v_usecount > 0)
	vp->v_iflag &= ~VI_OWEINACT;

Given that there are possibly more consumers like zfs how about:
In vputx assert that the flag is unset if the usecount went to > 0. Clear
the flag in vref and vget if transitioning 0->1 and assert it is unset
otherwise.

The way I read it is that in the stock kernel with properly timed vref
the flag would be cleared anyway, with vinactive() only called if it was
done by vget and only with the vnode exclusively locked.

With a aforementioned change likelyhood of vinactive() remains the same,
but now the flag state can be asserted.

> Assume the thread increased the v_usecount, but did not managed to
> acquire dev_mtx. Another thread performs vrele() and progressed to
> v_decr_devcount(). It decreases the si_usecount, which might allow yet
> another thread to see the si_usecount as too low and start unwanted
> action. I think that the tests for VCHR must be done at the very
> start of the functions, and devfs vnodes must hold vnode interlock
> unconditionally.
> 

Inserted v_type != VCHR checks in relevant places, vi_usecount
manipulation functions now assert that the interlock is held.

> > 
> > > I think that refcount_acquire_if_greater() KPI is excessive.  You always
> > > calls acquire with val == 0, and release with val == 1.
> > > 
> > 
> > Yea i noted in my prevoius e-mail it should be changed (see below).
> > 
> > I replaced them with refcount_acquire_if_not_zero and
> > refcount_release_if_not_last.
> I dislike the length of the names.  Can you propose something shorter ?
> 

Unfortunately the original API is alreday quite verbose and I don't have
anything readable which would retain "refcount_acquire" (instead of a
"ref_get" or "ref_acq"). Adding "_nz" as a suffix does not look good
("refcount_acquire_if_nz").

> The type for the local variable old in both functions should be u_int.
> 

Done.

> > 
> > > WRT to _refcount_release_lock, why is lock_object->lc_lock/lc_unlock KPI
> > > cannot be used ? This allows to make refcount_release_lock() a function
> > > instead of gcc extension macros.  Not to mention that the macro is unused.
> > 
> > These were supposed to be used by other code, forgot to remove it from
> > the patch I sent here.
> > 
> > We can discuss this in another thread.
> > 
> > Striclty speaking we could use it here for vnode interlock, but I did
> > not want to get around VI_LOCK macro (which right now is just a
> > mtx_lock, but this may change).
> > 
> > Updated patch is below:
> Do not introduce ASSERT_VI_LOCK, the name difference between
> ASSERT_VI_LOCKED and ASSERT_VI_LOCK is only in the broken grammar.
> I do not see anything wrong with explicit if() statements where needed,
> in all four places.

Done.

> 
> In vputx(), wrap the long line (if (refcount_release() || VI_DOINGINACT)).

Done.


diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/vnode.c b/sys/cddl/contrib/opensolaris/uts/common/fs/vnode.c
index 83f29c1..b587ebd 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/vnode.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/vnode.c
@@ -99,6 +99,6 @@ vn_rele_async(vnode_t *vp, taskq_t *taskq)
 		    (task_func_t *)vn_rele_inactive, vp, TQ_SLEEP) != 0);
 		return;
 	}
-	vp->v_usecount--;
+	refcount_release(&vp->v_usecount);
 	vdropl(vp);
 }
diff --git a/sys/kern/vfs_cache.c b/sys/kern/vfs_cache.c
index 19ef783..cb4ea94 100644
--- a/sys/kern/vfs_cache.c
+++ b/sys/kern/vfs_cache.c
@@ -661,12 +661,12 @@ success:
 		ltype = VOP_ISLOCKED(dvp);
 		VOP_UNLOCK(dvp, 0);
 	}
-	VI_LOCK(*vpp);
+	vhold(*vpp);
 	if (wlocked)
 		CACHE_WUNLOCK();
 	else
 		CACHE_RUNLOCK();
-	error = vget(*vpp, cnp->cn_lkflags | LK_INTERLOCK, cnp->cn_thread);
+	error = vget(*vpp, cnp->cn_lkflags | LK_VNHELD, cnp->cn_thread);
 	if (cnp->cn_flags & ISDOTDOT) {
 		vn_lock(dvp, ltype | LK_RETRY);
 		if (dvp->v_iflag & VI_DOOMED) {
@@ -1366,9 +1366,9 @@ vn_dir_dd_ino(struct vnode *vp)
 		if ((ncp->nc_flag & NCF_ISDOTDOT) != 0)
 			continue;
 		ddvp = ncp->nc_dvp;
-		VI_LOCK(ddvp);
+		vhold(ddvp);
 		CACHE_RUNLOCK();
-		if (vget(ddvp, LK_INTERLOCK | LK_SHARED | LK_NOWAIT, curthread))
+		if (vget(ddvp, LK_SHARED | LK_NOWAIT | LK_VNHELD, curthread))
 			return (NULL);
 		return (ddvp);
 	}
diff --git a/sys/kern/vfs_hash.c b/sys/kern/vfs_hash.c
index 930fca1..48601e7 100644
--- a/sys/kern/vfs_hash.c
+++ b/sys/kern/vfs_hash.c
@@ -84,9 +84,9 @@ vfs_hash_get(const struct mount *mp, u_int hash, int flags, struct thread *td, s
 				continue;
 			if (fn != NULL && fn(vp, arg))
 				continue;
-			VI_LOCK(vp);
+			vhold(vp);
 			rw_runlock(&vfs_hash_lock);
-			error = vget(vp, flags | LK_INTERLOCK, td);
+			error = vget(vp, flags | LK_VNHELD, td);
 			if (error == ENOENT && (flags & LK_NOWAIT) == 0)
 				break;
 			if (error)
@@ -128,9 +128,9 @@ vfs_hash_insert(struct vnode *vp, u_int hash, int flags, struct thread *td, stru
 				continue;
 			if (fn != NULL && fn(vp2, arg))
 				continue;
-			VI_LOCK(vp2);
+			vhold(vp2);
 			rw_wunlock(&vfs_hash_lock);
-			error = vget(vp2, flags | LK_INTERLOCK, td);
+			error = vget(vp2, flags | LK_VNHELD, td);
 			if (error == ENOENT && (flags & LK_NOWAIT) == 0)
 				break;
 			rw_wlock(&vfs_hash_lock);
diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 1f1a7b6..a8cd2cb 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -68,6 +68,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/pctrie.h>
 #include <sys/priv.h>
 #include <sys/reboot.h>
+#include <sys/refcount.h>
 #include <sys/rwlock.h>
 #include <sys/sched.h>
 #include <sys/sleepqueue.h>
@@ -102,9 +103,8 @@ static int	flushbuflist(struct bufv *bufv, int flags, struct bufobj *bo,
 static void	syncer_shutdown(void *arg, int howto);
 static int	vtryrecycle(struct vnode *vp);
 static void	v_incr_usecount(struct vnode *);
-static void	v_decr_usecount(struct vnode *);
-static void	v_decr_useonly(struct vnode *);
-static void	v_upgrade_usecount(struct vnode *);
+static void	v_incr_devcount(struct vnode *);
+static void	v_decr_devcount(struct vnode *);
 static void	vnlru_free(int);
 static void	vgonel(struct vnode *);
 static void	vfs_knllock(void *arg);
@@ -868,7 +868,7 @@ vnlru_free(int count)
 		 */
 		freevnodes--;
 		vp->v_iflag &= ~VI_FREE;
-		vp->v_holdcnt++;
+		refcount_acquire(&vp->v_holdcnt);
 
 		mtx_unlock(&vnode_free_list_mtx);
 		VI_UNLOCK(vp);
@@ -2079,78 +2079,68 @@ reassignbuf(struct buf *bp)
 
 /*
  * Increment the use and hold counts on the vnode, taking care to reference
- * the driver's usecount if this is a chardev.  The vholdl() will remove
- * the vnode from the free list if it is presently free.  Requires the
- * vnode interlock and returns with it held.
+ * the driver's usecount if this is a chardev.  The _vhold() will remove
+ * the vnode from the free list if it is presently free.
  */
 static void
 v_incr_usecount(struct vnode *vp)
 {
 
+	ASSERT_VI_UNLOCKED(vp, __func__);
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-	vholdl(vp);
-	vp->v_usecount++;
-	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
-		dev_lock();
-		vp->v_rdev->si_usecount++;
-		dev_unlock();
-	}
-}
 
-/*
- * Turn a holdcnt into a use+holdcnt such that only one call to
- * v_decr_usecount is needed.
- */
-static void
-v_upgrade_usecount(struct vnode *vp)
-{
+	if (vp->v_type == VCHR) {
+		VI_LOCK(vp);
+		_vhold(vp, true);
+		if (vp->v_iflag & VI_OWEINACT) {
+			VNASSERT(vp->v_usecount == 0, vp,
+			    ("vnode with usecount and VI_OWEINACT set"));
+			vp->v_iflag &= ~VI_OWEINACT;
+		}
+		refcount_acquire(&vp->v_usecount);
+		v_incr_devcount(vp);
+		VI_UNLOCK(vp);
+		return;
+	}
 
-	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-	vp->v_usecount++;
-	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
-		dev_lock();
-		vp->v_rdev->si_usecount++;
-		dev_unlock();
+	_vhold(vp, false);
+	if (refcount_acquire_if_not_zero(&vp->v_usecount)) {
+		VNASSERT((vp->v_iflag & VI_OWEINACT) == 0, vp,
+		    ("vnode with usecount and VI_OWEINACT set"));
+	} else {
+		VI_LOCK(vp);
+		if (vp->v_iflag & VI_OWEINACT)
+			vp->v_iflag &= ~VI_OWEINACT;
+		refcount_acquire(&vp->v_usecount);
+		VI_UNLOCK(vp);
 	}
 }
 
 /*
- * Decrement the vnode use and hold count along with the driver's usecount
- * if this is a chardev.  The vdropl() below releases the vnode interlock
- * as it may free the vnode.
+ * Increment si_usecount of the associated device, if any.
  */
 static void
-v_decr_usecount(struct vnode *vp)
+v_incr_devcount(struct vnode *vp)
 {
 
-	ASSERT_VI_LOCKED(vp, __FUNCTION__);
-	VNASSERT(vp->v_usecount > 0, vp,
-	    ("v_decr_usecount: negative usecount"));
-	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-	vp->v_usecount--;
+	ASSERT_VI_LOCKED(vp, __func__);
+
 	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
 		dev_lock();
-		vp->v_rdev->si_usecount--;
+		vp->v_rdev->si_usecount++;
 		dev_unlock();
 	}
-	vdropl(vp);
 }
 
 /*
- * Decrement only the use count and driver use count.  This is intended to
- * be paired with a follow on vdropl() to release the remaining hold count.
- * In this way we may vgone() a vnode with a 0 usecount without risk of
- * having it end up on a free list because the hold count is kept above 0.
+ * Decrement si_usecount of the associated device, if any.
  */
 static void
-v_decr_useonly(struct vnode *vp)
+v_decr_devcount(struct vnode *vp)
 {
 
-	ASSERT_VI_LOCKED(vp, __FUNCTION__);
-	VNASSERT(vp->v_usecount > 0, vp,
-	    ("v_decr_useonly: negative usecount"));
-	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-	vp->v_usecount--;
+	ASSERT_VI_LOCKED(vp, __func__);
+
 	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
 		dev_lock();
 		vp->v_rdev->si_usecount--;
@@ -2164,21 +2154,38 @@ v_decr_useonly(struct vnode *vp)
  * is being destroyed.  Only callers who specify LK_RETRY will
  * see doomed vnodes.  If inactive processing was delayed in
  * vput try to do it here.
+ *
+ * Notes on lockless counter manipulation:
+ * The hold count prevents the vnode from being freed, while the
+ * use count prevents it from being recycled.
+ *
+ * Only 1->0 and 0->1 transitions require atomicity with respect to
+ * other operations (e.g. taking the vnode off of a free list).
+ * In such a case the interlock is taken, which provides mutual
+ * exclusion against threads transitioning the other way.
  */
 int
 vget(struct vnode *vp, int flags, struct thread *td)
 {
-	int error;
+	int error, oweinact;
 
-	error = 0;
 	VNASSERT((flags & LK_TYPE_MASK) != 0, vp,
 	    ("vget: invalid lock operation"));
+
+	if ((flags & LK_INTERLOCK) != 0)
+		ASSERT_VI_LOCKED(vp, __func__);
+	else
+		ASSERT_VI_UNLOCKED(vp, __func__);
+	if ((flags & LK_VNHELD) != 0)
+		VNASSERT((vp->v_holdcnt > 0), vp,
+		    ("vget: LK_VNHELD passed but vnode not held"));
+
 	CTR3(KTR_VFS, "%s: vp %p with flags %d", __func__, vp, flags);
 
-	if ((flags & LK_INTERLOCK) == 0)
-		VI_LOCK(vp);
-	vholdl(vp);
-	if ((error = vn_lock(vp, flags | LK_INTERLOCK)) != 0) {
+	if ((flags & LK_VNHELD) == 0)
+		_vhold(vp, (flags & LK_INTERLOCK) != 0);
+
+	if ((error = vn_lock(vp, flags)) != 0) {
 		vdrop(vp);
 		CTR2(KTR_VFS, "%s: impossible to lock vnode %p", __func__,
 		    vp);
@@ -2186,22 +2193,34 @@ vget(struct vnode *vp, int flags, struct thread *td)
 	}
 	if (vp->v_iflag & VI_DOOMED && (flags & LK_RETRY) == 0)
 		panic("vget: vn_lock failed to return ENOENT\n");
-	VI_LOCK(vp);
-	/* Upgrade our holdcnt to a usecount. */
-	v_upgrade_usecount(vp);
+
 	/*
 	 * We don't guarantee that any particular close will
 	 * trigger inactive processing so just make a best effort
 	 * here at preventing a reference to a removed file.  If
 	 * we don't succeed no harm is done.
+	 *
+	 * Upgrade our holdcnt to a usecount.
 	 */
-	if (vp->v_iflag & VI_OWEINACT) {
-		if (VOP_ISLOCKED(vp) == LK_EXCLUSIVE &&
+	if (vp->v_type != VCHR &&
+	    refcount_acquire_if_not_zero(&vp->v_usecount)) {
+		VNASSERT((vp->v_iflag & VI_OWEINACT) == 0, vp,
+		    ("vnode with usecount and VI_OWEINACT set"));
+	} else {
+		VI_LOCK(vp);
+		if ((vp->v_iflag & VI_OWEINACT) == 0) {
+			oweinact = 0;
+		} else {
+			oweinact = 1;
+			vp->v_iflag &= ~VI_OWEINACT;
+		}
+		refcount_acquire(&vp->v_usecount);
+		v_incr_devcount(vp);
+		if (oweinact && VOP_ISLOCKED(vp) == LK_EXCLUSIVE &&
 		    (flags & LK_NOWAIT) == 0)
 			vinactive(vp, td);
-		vp->v_iflag &= ~VI_OWEINACT;
+		VI_UNLOCK(vp);
 	}
-	VI_UNLOCK(vp);
 	return (0);
 }
 
@@ -2213,36 +2232,34 @@ vref(struct vnode *vp)
 {
 
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-	VI_LOCK(vp);
 	v_incr_usecount(vp);
-	VI_UNLOCK(vp);
 }
 
 /*
  * Return reference count of a vnode.
  *
- * The results of this call are only guaranteed when some mechanism other
- * than the VI lock is used to stop other processes from gaining references
- * to the vnode.  This may be the case if the caller holds the only reference.
- * This is also useful when stale data is acceptable as race conditions may
- * be accounted for by some other means.
+ * The results of this call are only guaranteed when some mechanism is used to
+ * stop other processes from gaining references to the vnode.  This may be the
+ * case if the caller holds the only reference.  This is also useful when stale
+ * data is acceptable as race conditions may be accounted for by some other
+ * means.
  */
 int
 vrefcnt(struct vnode *vp)
 {
-	int usecnt;
 
-	VI_LOCK(vp);
-	usecnt = vp->v_usecount;
-	VI_UNLOCK(vp);
-
-	return (usecnt);
+	return (vp->v_usecount);
 }
 
 #define	VPUTX_VRELE	1
 #define	VPUTX_VPUT	2
 #define	VPUTX_VUNREF	3
 
+/*
+ * Decrement the use and hold counts for a vnode.
+ *
+ * See an explanation near vget() as to why atomic operation is safe.
+ */
 static void
 vputx(struct vnode *vp, int func)
 {
@@ -2255,33 +2272,44 @@ vputx(struct vnode *vp, int func)
 		ASSERT_VOP_LOCKED(vp, "vput");
 	else
 		KASSERT(func == VPUTX_VRELE, ("vputx: wrong func"));
+	ASSERT_VI_UNLOCKED(vp, __func__);
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-	VI_LOCK(vp);
-
-	/* Skip this v_writecount check if we're going to panic below. */
-	VNASSERT(vp->v_writecount < vp->v_usecount || vp->v_usecount < 1, vp,
-	    ("vputx: missed vn_close"));
-	error = 0;
 
-	if (vp->v_usecount > 1 || ((vp->v_iflag & VI_DOINGINACT) &&
-	    vp->v_usecount == 1)) {
+	if (vp->v_type != VCHR &&
+	    refcount_release_if_not_last(&vp->v_usecount)) {
 		if (func == VPUTX_VPUT)
 			VOP_UNLOCK(vp, 0);
-		v_decr_usecount(vp);
+		vdrop(vp);
 		return;
 	}
 
-	if (vp->v_usecount != 1) {
-		vprint("vputx: negative ref count", vp);
-		panic("vputx: negative ref cnt");
-	}
-	CTR2(KTR_VFS, "%s: return vnode %p to the freelist", __func__, vp);
+	VI_LOCK(vp);
+
 	/*
 	 * We want to hold the vnode until the inactive finishes to
 	 * prevent vgone() races.  We drop the use count here and the
 	 * hold count below when we're done.
 	 */
-	v_decr_useonly(vp);
+	if (!refcount_release(&vp->v_usecount) ||
+	    (vp->v_iflag & VI_DOINGINACT)) {
+		if (func == VPUTX_VPUT)
+			VOP_UNLOCK(vp, 0);
+		v_decr_devcount(vp);
+		vdropl(vp);
+		return;
+	}
+
+	v_decr_devcount(vp);
+
+	error = 0;
+
+	if (vp->v_usecount != 0) {
+		vprint("vputx: usecount not zero", vp);
+		panic("vputx: usecount not zero");
+	}
+
+	CTR2(KTR_VFS, "%s: return vnode %p to the freelist", __func__, vp);
+
 	/*
 	 * We must call VOP_INACTIVE with the node locked. Mark
 	 * as VI_DOINGINACT to avoid recursion.
@@ -2307,7 +2335,8 @@ vputx(struct vnode *vp, int func)
 		break;
 	}
 	if (vp->v_usecount > 0)
-		vp->v_iflag &= ~VI_OWEINACT;
+		VNASSERT((vp->v_iflag & VI_OWEINACT) == 0, vp,
+		    ("vnode with usecount and VI_OWEINACT set"));
 	if (error == 0) {
 		if (vp->v_iflag & VI_OWEINACT)
 			vinactive(vp, curthread);
@@ -2351,36 +2380,36 @@ vunref(struct vnode *vp)
 }
 
 /*
- * Somebody doesn't want the vnode recycled.
- */
-void
-vhold(struct vnode *vp)
-{
-
-	VI_LOCK(vp);
-	vholdl(vp);
-	VI_UNLOCK(vp);
-}
-
-/*
  * Increase the hold count and activate if this is the first reference.
  */
 void
-vholdl(struct vnode *vp)
+_vhold(struct vnode *vp, bool locked)
 {
 	struct mount *mp;
 
+	if (locked)
+		ASSERT_VI_LOCKED(vp, __func__);
+	else
+		ASSERT_VI_UNLOCKED(vp, __func__);
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-#ifdef INVARIANTS
-	/* getnewvnode() calls v_incr_usecount() without holding interlock. */
-	if (vp->v_type != VNON || vp->v_data != NULL)
-		ASSERT_VI_LOCKED(vp, "vholdl");
-#endif
-	vp->v_holdcnt++;
-	if ((vp->v_iflag & VI_FREE) == 0)
+	if (!locked && refcount_acquire_if_not_zero(&vp->v_holdcnt)) {
+		VNASSERT((vp->v_iflag & VI_FREE) == 0, vp,
+		    ("_vhold: vnode with holdcnt is free"));
 		return;
-	VNASSERT(vp->v_holdcnt == 1, vp, ("vholdl: wrong hold count"));
-	VNASSERT(vp->v_op != NULL, vp, ("vholdl: vnode already reclaimed."));
+	}
+
+	if (!locked)
+		VI_LOCK(vp);
+	if ((vp->v_iflag & VI_FREE) == 0) {
+		refcount_acquire(&vp->v_holdcnt);
+		if (!locked)
+			VI_UNLOCK(vp);
+		return;
+	}
+	VNASSERT(vp->v_holdcnt == 0, vp,
+	    ("%s: wrong hold count", __func__));
+	VNASSERT(vp->v_op != NULL, vp,
+	    ("%s: vnode already reclaimed.", __func__));
 	/*
 	 * Remove a vnode from the free list, mark it as in use,
 	 * and put it on the active list.
@@ -2396,18 +2425,9 @@ vholdl(struct vnode *vp)
 	TAILQ_INSERT_HEAD(&mp->mnt_activevnodelist, vp, v_actfreelist);
 	mp->mnt_activevnodelistsize++;
 	mtx_unlock(&vnode_free_list_mtx);
-}
-
-/*
- * Note that there is one less who cares about this vnode.
- * vdrop() is the opposite of vhold().
- */
-void
-vdrop(struct vnode *vp)
-{
-
-	VI_LOCK(vp);
-	vdropl(vp);
+	refcount_acquire(&vp->v_holdcnt);
+	if (!locked)
+		VI_UNLOCK(vp);
 }
 
 /*
@@ -2416,20 +2436,28 @@ vdrop(struct vnode *vp)
  * (marked VI_DOOMED) in which case we will free it.
  */
 void
-vdropl(struct vnode *vp)
+_vdrop(struct vnode *vp, bool locked)
 {
 	struct bufobj *bo;
 	struct mount *mp;
 	int active;
 
-	ASSERT_VI_LOCKED(vp, "vdropl");
+	if (locked)
+		ASSERT_VI_LOCKED(vp, __func__);
+	else
+		ASSERT_VI_UNLOCKED(vp, __func__);
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
-	if (vp->v_holdcnt <= 0)
+	if ((int)vp->v_holdcnt <= 0)
 		panic("vdrop: holdcnt %d", vp->v_holdcnt);
-	vp->v_holdcnt--;
-	VNASSERT(vp->v_holdcnt >= vp->v_usecount, vp,
-	    ("hold count less than use count"));
-	if (vp->v_holdcnt > 0) {
+	if (refcount_release_if_not_last(&vp->v_holdcnt)) {
+		if (locked)
+			VI_UNLOCK(vp);
+		return;
+	}
+
+	if (!locked)
+		VI_LOCK(vp);
+	if (refcount_release(&vp->v_holdcnt) == 0) {
 		VI_UNLOCK(vp);
 		return;
 	}
diff --git a/sys/sys/lockmgr.h b/sys/sys/lockmgr.h
index ff0473d..a74d5f5 100644
--- a/sys/sys/lockmgr.h
+++ b/sys/sys/lockmgr.h
@@ -159,6 +159,7 @@ _lockmgr_args_rw(struct lock *lk, u_int flags, struct rwlock *ilk,
 #define	LK_SLEEPFAIL	0x000800
 #define	LK_TIMELOCK	0x001000
 #define	LK_NODDLKTREAT	0x002000
+#define	LK_VNHELD	0x004000
 
 /*
  * Operations for lockmgr().
diff --git a/sys/sys/refcount.h b/sys/sys/refcount.h
index 4611664..d3f817c 100644
--- a/sys/sys/refcount.h
+++ b/sys/sys/refcount.h
@@ -64,4 +64,32 @@ refcount_release(volatile u_int *count)
 	return (old == 1);
 }
 
+static __inline int
+refcount_acquire_if_not_zero(volatile u_int *count)
+{
+	u_int old;
+
+	for (;;) {
+		old = *count;
+		if (old == 0)
+			return (0);
+		if (atomic_cmpset_int(count, old, old + 1))
+			return (1);
+	}
+}
+
+static __inline int
+refcount_release_if_not_last(volatile u_int *count)
+{
+	u_int old;
+
+	for (;;) {
+		old = *count;
+		if (old == 1)
+			return (0);
+		if (atomic_cmpset_int(count, old, old - 1))
+			return (1);
+	}
+}
+
 #endif	/* ! __SYS_REFCOUNT_H__ */
diff --git a/sys/sys/vnode.h b/sys/sys/vnode.h
index 36ef8af..9286a4e 100644
--- a/sys/sys/vnode.h
+++ b/sys/sys/vnode.h
@@ -162,8 +162,8 @@ struct vnode {
 	daddr_t	v_lastw;			/* v last write  */
 	int	v_clen;				/* v length of cur. cluster */
 
-	int	v_holdcnt;			/* i prevents recycling. */
-	int	v_usecount;			/* i ref count of users */
+	u_int	v_holdcnt;			/* i prevents recycling. */
+	u_int	v_usecount;			/* i ref count of users */
 	u_int	v_iflag;			/* i vnode flags (see below) */
 	u_int	v_vflag;			/* v vnode flags */
 	int	v_writecount;			/* v ref count of writers */
@@ -652,13 +652,15 @@ int	vaccess_acl_posix1e(enum vtype type, uid_t file_uid,
 	    struct ucred *cred, int *privused);
 void	vattr_null(struct vattr *vap);
 int	vcount(struct vnode *vp);
-void	vdrop(struct vnode *);
-void	vdropl(struct vnode *);
+#define	vdrop(vp)	_vdrop((vp), 0)
+#define	vdropl(vp)	_vdrop((vp), 1)
+void	_vdrop(struct vnode *, bool);
 int	vflush(struct mount *mp, int rootrefs, int flags, struct thread *td);
 int	vget(struct vnode *vp, int lockflag, struct thread *td);
 void	vgone(struct vnode *vp);
-void	vhold(struct vnode *);
-void	vholdl(struct vnode *);
+#define	vhold(vp)	_vhold((vp), 0)
+#define	vholdl(vp)	_vhold((vp), 1)
+void	_vhold(struct vnode *, bool);
 void	vinactive(struct vnode *, struct thread *);
 int	vinvalbuf(struct vnode *vp, int save, int slpflag, int slptimeo);
 int	vtruncbuf(struct vnode *vp, struct ucred *cred, off_t length,

From owner-freebsd-fs@freebsd.org  Thu Jun 25 17:22:15 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2239B98E520
 for <freebsd-fs@mailman.ysv.freebsd.org>; Thu, 25 Jun 2015 17:22:15 +0000 (UTC)
 (envelope-from javocado@gmail.com)
Received: from mail-lb0-x232.google.com (mail-lb0-x232.google.com
 [IPv6:2a00:1450:4010:c04::232])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9C0531D5B
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 17:22:14 +0000 (UTC)
 (envelope-from javocado@gmail.com)
Received: by lbbvz5 with SMTP id vz5so49762981lbb.0
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 10:22:12 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :content-type; bh=5hm0urbBmUiJjGYeExsM33yf2d49LyNC//I3OqKxf0g=;
 b=xmi3lVwO4/E3CQYg4/b+Ukb9dKkH1N26x+LJlN019VXimpkQ18scNuvPdRcZXhu99T
 kQIkYZ7aP/BdObUD8KX3QRtcOeO4lTJoNBKmdc77lc0h4jZl3XiDQuBVX0xrQ/7HDjiw
 hQqLYTvsvXBi79ksC4ihoNoyMnC/Jf2+YADiiiaF5DSOMjja71YeDEtyPlcti/3S+Yz4
 Q3ceQtIPSPJ2OP9JmDXjyrT9VkGzcPi4dg4XYQkouz330GrrYkRMxal6G9QvvYntNQ3u
 g7vWuq07R9LWG6lNUZErMjPZKiG27r37YPzOrIGgH4iw9Nksqh2nP8h7vzNuHC2xzYHk
 muqw==
MIME-Version: 1.0
X-Received: by 10.112.154.71 with SMTP id vm7mr44934253lbb.96.1435252932451;
 Thu, 25 Jun 2015 10:22:12 -0700 (PDT)
Received: by 10.114.96.8 with HTTP; Thu, 25 Jun 2015 10:22:12 -0700 (PDT)
In-Reply-To: <CAP1HOmSj5nNn7+so=D_2qXrSh+tE=cTTbgjzCBkCCim1vZUTGg@mail.gmail.com>
References: <CAP1HOmSj5nNn7+so=D_2qXrSh+tE=cTTbgjzCBkCCim1vZUTGg@mail.gmail.com>
Date: Thu, 25 Jun 2015 10:22:12 -0700
Message-ID: <CAP1HOmTbmamehpaTE-A=d+rHFMYdZiuEebOaLCS3u1WurxdnrA@mail.gmail.com>
Subject: Fwd: ZFS pool within FreeBSD bhyve guest
From: javocado <javocado@gmail.com>
To: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 17:22:15 -0000

(I'm posting here because I think this may be more of a zfs issue rather
than a bhyve issue)

Hi,

I would like to create a zfs filesystem within my bhyve (FreeBSD 10.1 as
the guest and host) allowing users of the VM to run zfs send/receive
commands on the zfs filesystem within their bhyve VM.

Is this possible and what is/are the methods and options for creating the
zfs filesystem (or volume) within the VM? If there is a way to do this,
would any of the proposed methods depend on whether the VM lives in a file
versus a zfs volume? My VM is file-based.

Thanks!

From owner-freebsd-fs@freebsd.org  Thu Jun 25 17:41:52 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D72C198C72E
 for <freebsd-fs@mailman.ysv.freebsd.org>; Thu, 25 Jun 2015 17:41:52 +0000 (UTC)
 (envelope-from rah.lists@gmail.com)
Received: from mail-lb0-x232.google.com (mail-lb0-x232.google.com
 [IPv6:2a00:1450:4010:c04::232])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 5DA851658
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 17:41:52 +0000 (UTC)
 (envelope-from rah.lists@gmail.com)
Received: by lbnk3 with SMTP id k3so50106129lbn.1
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 10:41:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=m5Ye4TMqxSpIJu61UrbTJo/Dy+xNpI58mtzWHlC2QMM=;
 b=SlZZDqmHKTDkrkwIbxUmEWiG+umb+CCn2a2rh5TcrrqzI2SsaluS5tdMZ3z2+MFGz/
 hYK22rh00tIGm8I1lKgqLsaXD4ji3/oc0Lq5wcR1FgK9+iXmD5ZNJahRP/RrWvgIWrNN
 fexIV3YSwt8E2Oo4i7V95dQJc0QEq3Us35xTMGfvqigMc8HjVkP78WXmSgiZB/j7A9kR
 wM6/EJToxFp4qk/is5KRl1Hxz9X50vK3FVmjlxBDXCFdYtvxYZouGbbTjds8Hq7h21ku
 1N3uPTSnaQvyaJOdodkg0CQAGdi/jJ1TQgGrRckx+x6/1mCfWFUqlZrwSQ8+HDqCRPVO
 PNmg==
MIME-Version: 1.0
X-Received: by 10.153.4.12 with SMTP id ca12mr4616968lad.20.1435254110519;
 Thu, 25 Jun 2015 10:41:50 -0700 (PDT)
Received: by 10.25.218.66 with HTTP; Thu, 25 Jun 2015 10:41:50 -0700 (PDT)
Date: Thu, 25 Jun 2015 13:41:50 -0400
Message-ID: <CAFzyudixCBu17NfmFyhoB_1d7g5qfH9p-53WEpOykXOjsUD7Yw@mail.gmail.com>
Subject: VFS buffering issues with UFS + soft-updates journaling
From: RA H <rah.lists@gmail.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 17:41:52 -0000

I was directed here by a moderator on the official forums, who suggested
the behaviour I'm seeing may be a bug. I don't have much experience with
mailing lists, so please be gentle :)


I'm experiencing data loss on a UFS filesystem on an iSCSI disk when the
iSCSI connection is terminated abruptly. I know the issue isn't that the
data doesn't have time to flush to disk; unmounting the filesystem right
after the copy completes always returns immediately.

Details:
FreeBSD 10.1-RELEASE
iscsictl(8) (ie. the new iSCSI initiator)
single GPT partition on disk
UFS with soft-update journaling

I mount the fs, copy a 1G file (have tried source file on tmpfs and a
local SATA disk), wait ~10 seconds, then pull the Ethernet cable on the
NIC which is connected to the iSCSI disk. I then reboot gracefully with
shutdown -r now. After the system comes back up a fsck is necessary; I
answer y to all the questions. After mounting, I either find no evidence
the file ever existed, a file of zero size, or a truncated file. Even
calling sync before terminating the connection does not prevent data loss.

The first indication the problem had something to do with buffering was
that during the reboot, the buffer sync
(ie. Syncing disks, buffers remaining...) always indicates something in
the range of 20-50 buffers that need syncing, all of which are eventually
given up on.

As a workaround, I set the sysctl variable vfs.lodirtybuffers to 1. With
this setting, it takes 2-3 seconds for the sysctl variable
vfs.numdirtybuffers to return to the level it was before the I started
the copy. At that point I can pull the ethernet cable, reboot (there are
still a few buffers that don't get synced), fsck, etc. and the file is
intact. I haven't seen any side-effects to this, but I expect setting it
so low is not exactly best practice.

Another workaround is using UFS without soft-updates or journaling,
provided I sync before terminating the iSCSI connection. The sync actually
does what's expected in this case, and although fsck is still required,
AFAICT it only needs to mark the fs clean. Initial testing with UFS
and gjournal seems to work without OOTB, but I'm not sure I want to go
that route.

From owner-freebsd-fs@freebsd.org  Thu Jun 25 18:29:11 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A107D98C5C7
 for <freebsd-fs@mailman.ysv.freebsd.org>; Thu, 25 Jun 2015 18:29:11 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au
 [211.29.132.53]) by mx1.freebsd.org (Postfix) with ESMTP id 631F21FDD
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 18:29:11 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au
 (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197])
 by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id BCEA7D43C1C;
 Fri, 26 Jun 2015 04:29:09 +1000 (AEST)
Date: Fri, 26 Jun 2015 04:29:07 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Mateusz Guzik <mjguzik@gmail.com>
cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-fs@freebsd.org
Subject: Re: atomic v_usecount and v_holdcnt
In-Reply-To: <20150625123156.GA29667@dft-labs.eu>
Message-ID: <20150626042546.Q2820@besplex.bde.org>
References: <20141122002812.GA32289@dft-labs.eu>
 <20141122092527.GT17068@kib.kiev.ua>
 <20141122211147.GA23623@dft-labs.eu> <20141124095251.GH17068@kib.kiev.ua>
 <20150314225226.GA15302@dft-labs.eu> <20150316094643.GZ2379@kib.kiev.ua>
 <20150317014412.GA10819@dft-labs.eu> <20150318104442.GS2379@kib.kiev.ua>
 <20150625123156.GA29667@dft-labs.eu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=XMDNMlVE c=1 sm=1 tr=0
 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10
 a=JzwRw_2MAAAA:8 a=dfNNiiqOaOqD_QZmin8A:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 18:29:11 -0000

On Thu, 25 Jun 2015, Mateusz Guzik wrote:

> On Wed, Mar 18, 2015 at 12:44:42PM +0200, Konstantin Belousov wrote:
>> On Tue, Mar 17, 2015 at 02:44:12AM +0100, Mateusz Guzik wrote:

>>> I replaced them with refcount_acquire_if_not_zero and
>>> refcount_release_if_not_last.
>> I dislike the length of the names.  Can you propose something shorter ?
>
> Unfortunately the original API is alreday quite verbose and I don't have
> anything readable which would retain "refcount_acquire" (instead of a
> "ref_get" or "ref_acq"). Adding "_nz" as a suffix does not look good
> ("refcount_acquire_if_nz").

refcount -> rc
acquire -> acq

The "acq" abbreviation is already used a lot for atomic ops.

Bruce

From owner-freebsd-fs@freebsd.org  Thu Jun 25 19:53:57 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D9A3898D67B
 for <freebsd-fs@mailman.ysv.freebsd.org>; Thu, 25 Jun 2015 19:53:57 +0000 (UTC)
 (envelope-from etnapierala@gmail.com)
Received: from mail-wi0-x230.google.com (mail-wi0-x230.google.com
 [IPv6:2a00:1450:400c:c05::230])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6D901185F
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 19:53:57 +0000 (UTC)
 (envelope-from etnapierala@gmail.com)
Received: by wiwl6 with SMTP id l6so27630167wiw.0
 for <freebsd-fs@freebsd.org>; Thu, 25 Jun 2015 12:53:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=sender:date:from:to:cc:subject:message-id:mail-followup-to
 :references:mime-version:content-type:content-disposition
 :in-reply-to:user-agent;
 bh=2xGe5pJY7xYT8p5exM2iKaKua5/XJeO2FIp6KLBwemY=;
 b=kiIIva1sHh1Cf94KuBnmSP5tNQUohvU25fTPfh4A2CGLpJsN4F7Cr0OQFUuLLjHHyl
 riyrGaZ/KaKx1flJGOVUqxUsnWRtNVHP8Ni+B0HCesJSZ1tnfjrkjEHt/TQSCgUuRGpT
 Dn3WWQFSMH2AsJoPw1NDXOsQ7rmmXrO+oanzUg0L6QkHQ+nBkAXwftACytL/95/3DxB7
 +R+DTO41gw8O9Q9NlZIX4XqiyvCdMrAK9InGOnhDVFdu1tnKMZiLypH+el/CkKa68D4x
 Zzn1KbtRJqP1y/8/3yHJc+z+09NxJAlyTF/tTTUe1B1CgkSlNwQ/AjfZQR5P70vgKBc6
 +C8w==
X-Received: by 10.180.88.8 with SMTP id bc8mr8563493wib.19.1435262035910;
 Thu, 25 Jun 2015 12:53:55 -0700 (PDT)
Received: from brick.home (adje188.neoplus.adsl.tpnet.pl. [79.184.212.188])
 by mx.google.com with ESMTPSA id d3sm9080449wic.1.2015.06.25.12.53.54
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 25 Jun 2015 12:53:55 -0700 (PDT)
Sender: =?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= <etnapierala@gmail.com>
Date: Thu, 25 Jun 2015 21:53:52 +0200
From: Edward Tomasz =?utf-8?Q?Napiera=C5=82a?= <trasz@FreeBSD.org>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Mateusz Guzik <mjguzik@gmail.com>, freebsd-fs@freebsd.org
Subject: Re: atomic v_usecount and v_holdcnt
Message-ID: <20150625195352.GB1042@brick.home>
Mail-Followup-To: Bruce Evans <brde@optusnet.com.au>,
 Mateusz Guzik <mjguzik@gmail.com>, freebsd-fs@freebsd.org
References: <20141122002812.GA32289@dft-labs.eu>
 <20141122092527.GT17068@kib.kiev.ua>
 <20141122211147.GA23623@dft-labs.eu>
 <20141124095251.GH17068@kib.kiev.ua>
 <20150314225226.GA15302@dft-labs.eu>
 <20150316094643.GZ2379@kib.kiev.ua>
 <20150317014412.GA10819@dft-labs.eu>
 <20150318104442.GS2379@kib.kiev.ua>
 <20150625123156.GA29667@dft-labs.eu>
 <20150626042546.Q2820@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150626042546.Q2820@besplex.bde.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 19:53:57 -0000

On 0626T0429, Bruce Evans wrote:
> On Thu, 25 Jun 2015, Mateusz Guzik wrote:
> 
> > On Wed, Mar 18, 2015 at 12:44:42PM +0200, Konstantin Belousov wrote:
> >> On Tue, Mar 17, 2015 at 02:44:12AM +0100, Mateusz Guzik wrote:
> 
> >>> I replaced them with refcount_acquire_if_not_zero and
> >>> refcount_release_if_not_last.
> >> I dislike the length of the names.  Can you propose something shorter ?
> >
> > Unfortunately the original API is alreday quite verbose and I don't have
> > anything readable which would retain "refcount_acquire" (instead of a
> > "ref_get" or "ref_acq"). Adding "_nz" as a suffix does not look good
> > ("refcount_acquire_if_nz").
> 
> refcount -> rc
> acquire -> acq
> 
> The "acq" abbreviation is already used a lot for atomic ops.

How about refcount_acquire_gt_0() and refcount_release_gt_1()1?


From owner-freebsd-fs@freebsd.org  Thu Jun 25 22:15:10 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 824CA98C569
 for <freebsd-fs@mailman.ysv.freebsd.org>; Thu, 25 Jun 2015 22:15:10 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 677A41CDA
 for <freebsd-fs@FreeBSD.org>; Thu, 25 Jun 2015 22:15:10 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t5PMFAqH058864
 for <freebsd-fs@FreeBSD.org>; Thu, 25 Jun 2015 22:15:10 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 200663] zfs allow/unallow doesn't show numeric UID when the ID
 no longer exists in the password file
Date: Thu, 25 Jun 2015 22:15:09 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.0-CURRENT
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: delphij@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-200663-3630-DngPkQFcF9@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-200663-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-200663-3630@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jun 2015 22:15:10 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200663

Xin LI <delphij@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |delphij@FreeBSD.org

--- Comment #2 from Xin LI <delphij@FreeBSD.org> ---
I have submitted an issue at Illumos: https://www.illumos.org/issues/6037 with
a proposed patch against FreeBSD.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-fs@freebsd.org  Fri Jun 26 07:01:24 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9477F98C536
 for <freebsd-fs@mailman.ysv.freebsd.org>; Fri, 26 Jun 2015 07:01:24 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from douhisi.pair.com (douhisi.pair.com [209.68.5.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6F34A1263
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 07:01:23 +0000 (UTC)
 (envelope-from quartz@sneakertech.com)
Received: from [10.2.2.1] (pool-173-48-121-235.bstnma.fios.verizon.net
 [173.48.121.235])
 by douhisi.pair.com (Postfix) with ESMTPSA id 0E6623F743
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 03:01:21 -0400 (EDT)
Message-ID: <558CF8BC.6050807@sneakertech.com>
Date: Fri, 26 Jun 2015 03:01:16 -0400
From: Quartz <quartz@sneakertech.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: The "myth" of zfs stripe width
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 26 Jun 2015 07:01:24 -0000

http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/

This blog claims that the old rule of "power of two, plus parity" when 
considering the number of disks in a raid doesn't really apply to zfs. 
What does everyone else think?

From owner-freebsd-fs@freebsd.org  Fri Jun 26 13:40:14 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id BD01F98CAF1
 for <freebsd-fs@mailman.ysv.freebsd.org>; Fri, 26 Jun 2015 13:40:14 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 8B2EA1ADE
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 13:40:14 +0000 (UTC)
 (envelope-from bfriesen@simple.dallas.tx.us)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
 [65.66.246.65])
 by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t5QDe6L5004701;
 Fri, 26 Jun 2015 08:40:06 -0500 (CDT)
Date: Fri, 26 Jun 2015 08:40:06 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Quartz <quartz@sneakertech.com>
cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: The "myth" of zfs stripe width
In-Reply-To: <558CF8BC.6050807@sneakertech.com>
Message-ID: <alpine.GSO.2.01.1506260834370.4186@freddy.simplesystems.org>
References: <558CF8BC.6050807@sneakertech.com>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (blade.simplesystems.org [65.66.246.90]);
 Fri, 26 Jun 2015 08:40:06 -0500 (CDT)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 26 Jun 2015 13:40:14 -0000

On Fri, 26 Jun 2015, Quartz wrote:

> http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/
>
> This blog claims that the old rule of "power of two, plus parity" when 
> considering the number of disks in a raid doesn't really apply to zfs. What 
> does everyone else think?

Are you suggesting that we might question the authority (the person 
who invented the technology) on this topic?

This happens to be a blog that you can trust.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@freebsd.org  Fri Jun 26 15:39:14 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8389298D05A
 for <freebsd-fs@mailman.ysv.freebsd.org>; Fri, 26 Jun 2015 15:39:14 +0000 (UTC)
 (envelope-from ben@altesco.nl)
Received: from altus-escon.com (altescovd.xs4all.nl [82.95.116.106])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (Client CN "proxy.altus-escon.com", Issuer "PositiveSSL CA 2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 2631011FD
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 15:39:13 +0000 (UTC)
 (envelope-from ben@altesco.nl)
Received: from daneel.altus-escon.com (daneel.altus-escon.com [193.78.231.7])
 (authenticated bits=0)
 by altus-escon.com (8.14.9/8.14.9) with ESMTP id t5QFd3eb014942
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT)
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 17:39:03 +0200 (CEST)
 (envelope-from ben@altesco.nl)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2102\))
Subject: Re: Panic on removing corrupted file on zfs
From: Ben Stuyts <ben@altesco.nl>
In-Reply-To: <BC0509E5-E86E-4D50-AB5A-308434C4C1CB@altesco.nl>
Date: Fri, 26 Jun 2015 17:39:03 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <2CC1E621-687B-4F4A-97D4-2DCCB620E17A@altesco.nl>
References: <BC0509E5-E86E-4D50-AB5A-308434C4C1CB@altesco.nl>
To: freebsd-fs@freebsd.org
X-Mailer: Apple Mail (2.2102)
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3
 (altus-escon.com [193.78.231.142]); Fri, 26 Jun 2015 17:39:03 +0200 (CEST)
X-Virus-Scanned: clamav-milter 0.98.7 at mars.altus-escon.com
X-Virus-Status: Clean
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 26 Jun 2015 15:39:14 -0000

Anybody? Otherwise I=E2=80=99ll just wipe the pool (it=E2=80=99s a =
backup so no big loss).

(To be safe I also ran memtest86 on this system, but it found no =
errors.)

Ben

> On 23 Jun 2015, at 17:26, Ben Stuyts <ben@altesco.nl> wrote:
>=20
> Hello,
>=20
> I have a corrupted file on a zfs file system. It is a backup store for =
an rsync job, and rsync errors with:
>=20
> rsync: failed to read xattr rsync.%stat for =
"/home1/vwa/rsync/tank3/cam/jpg/487-20150224180950-05.jpg": Input/output =
error (5)
> Corrupt rsync.%stat xattr attached to =
"/home1/vwa/rsync/tank3/cam/jpg/487-20150224180950-04.jpg": "100644 0,0 =
\#007:1001"
> rsync error: error in file IO (code 11) at xattrs.c(1003) =
[generator=3D3.1.1]
>=20
> This is a file from februari, and it hasn=E2=80=99t changed since. =
Smartctl shows no errors. No ECC memory on this system, so maybe caused =
by a memory problem. I am currently running a scrub for the second time. =
First time didn=E2=80=99t help.
>=20
> Output from zpool status -v:
>=20
>  pool: home1
> state: ONLINE
> status: One or more devices has experienced an error resulting in data
> 	corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore =
the
> 	entire pool from backup.
>   see: http://illumos.org/msg/ZFS-8000-8A
>  scan: scrub in progress since Tue Jun 23 15:37:31 2015
>        462G scanned out of 2.47T at 80.8M/s, 7h16m to go
>        0 repaired, 18.29% done
> config:
>=20
> 	NAME                                          STATE     READ =
WRITE CKSUM
> 	home1                                         ONLINE       0     =
0     0
> 	  gptid/14032b0b-7f05-11e3-8797-54bef70d8314  ONLINE       0     =
0     0
>=20
> errors: Permanent errors have been detected in the following files:
>=20
>        =
/home1/vwa/rsync/tank3/cam/jpg/487-20150224180950-05.jpg/<xattrdir>
>=20
> When I try to rm the file the system panics. =46rom /var/crash:
>=20
> tera8 dumped core - see /var/crash/vmcore.1
>=20
> Tue Jun 23 15:37:11 CEST 2015
>=20
> FreeBSD tera8 10.1-STABLE FreeBSD 10.1-STABLE #2 r284317: Fri Jun 12 =
17:07:21 CEST 2015     root@tera8:/usr/obj/usr/src/sys/GENERIC  amd64
>=20
> panic: acl_from_aces: a_type is 0x4d00
>=20
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and =
you are
> welcome to change it and/or distribute copies of it under certain =
conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for =
details.
> This GDB was configured as "amd64-marcel-freebsd"...
>=20
> Unread portion of the kernel message buffer:
> panic: acl_from_aces: a_type is 0x4d00
> cpuid =3D 1
> KDB: stack backtrace:
> #0 0xffffffff8097d890 at kdb_backtrace+0x60
> #1 0xffffffff809410e9 at vpanic+0x189
> #2 0xffffffff80940f53 at panic+0x43
> #3 0xffffffff81aaa209 at acl_from_aces+0x1c9
> #4 0xffffffff81b61546 at zfs_freebsd_getacl+0xa6
> #5 0xffffffff80e5de77 at VOP_GETACL_APV+0xa7
> #6 0xffffffff809c7a3c at vacl_get_acl+0xdc
> #7 0xffffffff809c7bd2 at sys___acl_get_link+0x72
> #8 0xffffffff80d35817 at amd64_syscall+0x357
> #9 0xffffffff80d1a89b at Xfast_syscall+0xfb
>=20
> Is there any other way of getting rid of this file (except destroying =
the fs/pool)?=20
>=20
> Thanks,
> Ben
>=20
>=20
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>=20


From owner-freebsd-fs@freebsd.org  Fri Jun 26 16:50:16 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3494798DBC8
 for <freebsd-fs@mailman.ysv.freebsd.org>; Fri, 26 Jun 2015 16:50:16 +0000 (UTC)
 (envelope-from schittenden@groupon.com)
Received: from mail-yk0-x230.google.com (mail-yk0-x230.google.com
 [IPv6:2607:f8b0:4002:c07::230])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id E441A1C92
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 16:50:15 +0000 (UTC)
 (envelope-from schittenden@groupon.com)
Received: by ykdt186 with SMTP id t186so63645685ykd.0
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 09:50:14 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=groupon.com; s=google;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=66WPSAAiYcKHuQPcltY0o8SO0V8YtWAhYbrSPD9To7M=;
 b=WoXJTGTtV6HAH/Gq4zCK5lDPfs5mbLftsfmSCEhCf0bu0DqOqiYiEDzwIAASKu2Yev
 Ce/BaazYwZ3pWi0tEdeD6HDSt5wKthofTlRF0LvBsJl3VKidkUMVKeL0USbNvnipnkGY
 /r2ab4cNJH9C2auKDViEaT6/Xr+92Vr3K0L3Y=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-type;
 bh=66WPSAAiYcKHuQPcltY0o8SO0V8YtWAhYbrSPD9To7M=;
 b=J+LAYYdtucJfD9WRFCIQe5bJO28t2ug/xy6LyH4JNTALO+7wNgMe/PB30cSjH3s/Tw
 SI7Yst90fqidJI43L1J1Dzku47bprWp7I2rHjnhLR0asUJTZNi0gFSiv2SmcLu1tBe/u
 HzaQjZDGkOPTuNM5V0kwAJAqZdViL6VwOmIVxRO80/4uDU062oyHRjxHL3irpoCVP3cZ
 kMbrxTv6n4JePm3xfrTy9ED5/5DXBOJuFGd9qfYCPNoS4DqmY+B/BYl4PjkbVoxtZzQD
 RnP6QlETVrYGAEBkPyc0JL2PiX3ik/riVeTsQlDl+/gBOYz1q+luLQJ2e/LVlOdych0L
 LNiQ==
X-Gm-Message-State: ALoCoQnqUywkHYvqaWcA4zuytqcI+U3i/G5rKxRwLdUwxZveormWArYJ6ojmoWNuIZ5bApcw644F
MIME-Version: 1.0
X-Received: by 10.13.236.5 with SMTP id v5mr3147227ywe.138.1435337414398; Fri,
 26 Jun 2015 09:50:14 -0700 (PDT)
Received: by 10.13.242.7 with HTTP; Fri, 26 Jun 2015 09:50:14 -0700 (PDT)
In-Reply-To: <CAP1HOmTbmamehpaTE-A=d+rHFMYdZiuEebOaLCS3u1WurxdnrA@mail.gmail.com>
References: <CAP1HOmSj5nNn7+so=D_2qXrSh+tE=cTTbgjzCBkCCim1vZUTGg@mail.gmail.com>
 <CAP1HOmTbmamehpaTE-A=d+rHFMYdZiuEebOaLCS3u1WurxdnrA@mail.gmail.com>
Date: Fri, 26 Jun 2015 09:50:14 -0700
Message-ID: <CACfj5vL7tUy_+Y74T71D1xc-SBv9enJA+xNGMa5xyYttMK6ryw@mail.gmail.com>
Subject: Re: ZFS pool within FreeBSD bhyve guest
From: Sean Chittenden <seanc@groupon.com>
To: javocado <javocado@gmail.com>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 26 Jun 2015 16:50:16 -0000

Look at iohyve as an easy way to do this.  You need to export a zvol to
your guest.  You can't export a "file system" but you can export a
ZFS-backed volume.  -sc

https://github.com/pr1ntf/iohyve


On Thu, Jun 25, 2015 at 10:22 AM, javocado <javocado@gmail.com> wrote:

> (I'm posting here because I think this may be more of a zfs issue rather
> than a bhyve issue)
>
> Hi,
>
> I would like to create a zfs filesystem within my bhyve (FreeBSD 10.1 as
> the guest and host) allowing users of the VM to run zfs send/receive
> commands on the zfs filesystem within their bhyve VM.
>
> Is this possible and what is/are the methods and options for creating the
> zfs filesystem (or volume) within the VM? If there is a way to do this,
> would any of the proposed methods depend on whether the VM lives in a file
> versus a zfs volume? My VM is file-based.
>
> Thanks!
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>


-- 
Sean Chittenden

From owner-freebsd-fs@freebsd.org  Fri Jun 26 22:00:51 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2393998DFB2
 for <freebsd-fs@mailman.ysv.freebsd.org>; Fri, 26 Jun 2015 22:00:51 +0000 (UTC)
 (envelope-from javocado@gmail.com)
Received: from mail-la0-x22a.google.com (mail-la0-x22a.google.com
 [IPv6:2a00:1450:4010:c03::22a])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 972D01CEE
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 22:00:50 +0000 (UTC)
 (envelope-from javocado@gmail.com)
Received: by lagx9 with SMTP id x9so71112847lag.1
 for <freebsd-fs@freebsd.org>; Fri, 26 Jun 2015 15:00:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :content-type; bh=DIytguKNpnjWPJtJ9W9xr3nT4XGKh2ePybL33LvY0KU=;
 b=yNF42nYaypa8BTA/zISE9WpSN0TyjQWNDvPS4QEijTvNCo8xyHwOlaO23i3D6gkaf1
 GElmolYDFLTIq3TLYExE6r04vh7WmgZwelJTf8TFTt4FwW/PlLt++B+iQO7p1i8EM62E
 bQprXAKeGfyXO7eR6F/7gBU5hMeVjEbi4+HgL7HX/guyoTGPhOqfMbUoxd63u6NsK7Ie
 a5PTI8jYY6hBCAk42suE4rucmxlQ5kb5Y7lJ7uvZfnvD+aExHEfoRzvU3S4hxk9saAYY
 3lzLqrxJh8gXZhlXov71JQzyhW1RuUuE5jvgmLk3B/3lw+Z9VWubt0z7oOcPMUlyJ6aJ
 nFWg==
MIME-Version: 1.0
X-Received: by 10.112.162.38 with SMTP id xx6mr3517498lbb.110.1435356048550;
 Fri, 26 Jun 2015 15:00:48 -0700 (PDT)
Received: by 10.114.96.8 with HTTP; Fri, 26 Jun 2015 15:00:48 -0700 (PDT)
In-Reply-To: <20150613094244.GC37870@brick.home>
References: <CAP1HOmTTPhN24-XXy6Vq3dD261xeUTyUvncgXKh=mE6jPACChA@mail.gmail.com>
 <20150613094244.GC37870@brick.home>
Date: Fri, 26 Jun 2015 15:00:48 -0700
Message-ID: <CAP1HOmT96=sfeyMwA5ZmoRouGPqQsTH3jEE7h3w4e12w40x2ag@mail.gmail.com>
Subject: Re: growfs failure
From: javocado <javocado@gmail.com>
To: javocado <javocado@gmail.com>, FreeBSD Filesystems <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 26 Jun 2015 22:00:51 -0000

Thanks for the suggestion, the original system is 8.3 amd64. The growfs did
work when I moved the image file over to a 10.1 amd64 system

On Sat, Jun 13, 2015 at 2:42 AM, Edward Tomasz Napiera=C5=82a <trasz@freebs=
d.org>
wrote:

> On 0603T1619, javocado wrote:
> > While trying to growfs a filesystem, I receive the following error:
> >
> > growfs: rdfs: read error: 5812093147771869908: Input/output error
> >
> > Here were the steps taken leading up to this point:
> >
> > (original file is 300 GB, growing to 500 GB)
> >
> > (the filesystem is clean with fsck_ufs /dev/md1)
> >
> > geli detach /dev/md1.eli
> >
> > mdconfig -d -u 1
> >
> > truncate -s +200G geli.img
> >
> > mdconfig -f geli.img -u 1
> >
> > geli resize -s 300G /dev/md1
> >
> > geli attach /dev/md1
> >
> > growfs /dev/md1.eli
> >
> > new file systemsize is: 262143999 frags
> > Warning: 326780 sector(s) cannot be allocated.
> > growfs: 511840.4MB (1048249216 sectors) block size 16384, fragment size
> 2048
> >         using 2786 cylinder groups of 183.72MB, 11758 blks, 23552 inode=
s.
> > super-block backups (for fsck -b #) at:
> >  629476448, 629852704, 630228960, 630605216, 630981472, 631357728,
> > 631733984, 632110240,
> > ....
> > growfs: rdfs: read error: 5812093147771869908: Input/output error
>
> I can't reproduce it.  What's the FreeBSD version?  The output messages
> above don't match current versions of growfs(8); could you try to upgrade
> and see if the problem is fixed?
>
>

From owner-freebsd-fs@freebsd.org  Sat Jun 27 18:37:03 2015
Return-Path: <owner-freebsd-fs@freebsd.org>
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9C83498CE32
 for <freebsd-fs@mailman.ysv.freebsd.org>; Sat, 27 Jun 2015 18:37:03 +0000 (UTC)
 (envelope-from postmaster+1557035@post.webmailer.de)
Received: from cg6-p07-ob.smtp.rzone.de (cg6-p07-ob.smtp.rzone.de
 [IPv6:2a01:238:20a:202:5317::8])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "*.smtp.rzone.de", Issuer "TeleSec ServerPass DE-2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 39C401692
 for <freebsd-fs@freebsd.org>; Sat, 27 Jun 2015 18:36:59 +0000 (UTC)
 (envelope-from postmaster+1557035@post.webmailer.de)
X-RZG-CLASS-ID: cg07
Received: from coates.store ([192.168.42.140]) by jored.store (RZmta 37.8 OK)
 with ESMTP id e0588ar5RIapBvQ for <freebsd-fs@freebsd.org>;
 Sat, 27 Jun 2015 20:36:51 +0200 (CEST)
Received: (from Unknown UID 1557035@localhost)
 by post.webmailer.de (8.13.7/8.13.7) id t5RIanVn013242;
 Sat, 27 Jun 2015 18:36:49 GMT
To: freebsd-fs@freebsd.org
Subject: Unable to deliver your item, #0000983036
Date: Sat, 27 Jun 2015 20:36:49 +0200
From: "FedEx 2Day A.M." <walter.downs@w80.rzone.de>
Reply-To: "FedEx 2Day A.M." <walter.downs@w80.rzone.de>
Message-ID: <5be2a11df09a5f8a18d3037b5f8bdb44@w80.rzone.de>
X-Priority: 3
MIME-Version: 1.0
X-RZG-SCRIPT: :P28WfFC8JrA0JY4UkyfhUWv+YuCloWhyOLk77zZraDNPI4MwvWp5TFVn98vE2ZAeOn0rJsg57o36NfRf8EGnT2ai4NheKJSwUcX6sHUGkwZOJLxuC0pIvLlmH4ZdoyOrMyb5poN6A7VVbVwP5SYqnE0RvUDrmA/N
Content-Type: text/plain; charset=us-ascii
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs/>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 27 Jun 2015 18:37:03 -0000

Dear Customer,

Courier was unable to deliver the parcel to you.
Please, open email attachment to print shipment label.

Yours faithfully,
Walter Downs,
Sr. Support Agent.