From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 09:17:14 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 06347106566B;
	Mon,  8 Nov 2010 09:17:14 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id DA85A8FC19;
	Mon,  8 Nov 2010 09:17:12 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA25662;
	Mon, 08 Nov 2010 11:17:08 +0200 (EET) (envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PFNqd-0003iD-Uy; Mon, 08 Nov 2010 11:17:08 +0200
Message-ID: <4CD7C013.8010502@freebsd.org>
Date: Mon, 08 Nov 2010 11:17:07 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: jhell <jhell@DataIX.net>
References: <4C91F031.1010801@freebsd.org>
	<20101010134717.GA59922@deviant.kiev.zoral.com.ua>
	<4CD2E5AA.1090208@freebsd.org> <4CD48259.6030402@DataIX.net>
In-Reply-To: <4CD48259.6030402@DataIX.net>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: Alan Cox <alc@freebsd.org>, Pawel Jakub Dawidek <pjd@freebsd.org>,
	freebsd-fs@freebsd.org
Subject: Re: vop_getpages for zfs
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 09:17:14 -0000

on 06/11/2010 00:16 jhell said the following:
> Thought I would give you a heads up on this after seeing the post about
> zfs_getpages.diff.
> 
> I patched up to this before after seeing it posted to <you_know_where>@
> and got reliable dumpage from it. Basically a vm_page_unwire fault or
> something like that. I believe the following is the backtrace from that.
> 
> (kgdb) #0  doadump () at pcpu.h:231
> #1  0x80675251 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:419
> #2  0x806754e5 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:592
> #3  0x808e27ce in vm_page_unwire (m=0x816b46e0, activate=1)
>     at /usr/src/sys/vm/vm_page.c:1564
> #4  0x808d123a in vm_fault_unwire (map=0x81690088, start=2568372224,
>     end=2568433664, fictitious=0) at /usr/src/sys/vm/vm_fault.c:1123
> #5  0x808d8a33 in vm_map_delete (map=0x81690088, start=2568372224,
>     end=2568433664) at /usr/src/sys/vm/vm_map.c:2619
> #6  0x808d8d0d in vm_map_remove (map=0x81690088, start=2568372224,
> end=Variable "end" is not available.
> )
>     at /usr/src/sys/vm/vm_map.c:2801
> #7  0x808d6360 in kmem_free (map=0x81690088, addr=2568372224, size=61440)
>     at /usr/src/sys/vm/vm_kern.c:211
> #8  0x808cb601 in page_free (mem=0x99164000, size=61440, flags=34 '"')
>     at /usr/src/sys/vm/uma_core.c:1069
> #9  0x808ccf92 in uma_large_free (slab=0x85be7e6c)
>     at /usr/src/sys/vm/uma_core.c:3021
> #10 0x8065f7a5 in free (addr=0x99164000, mtp=0x80d8e114)
>     at /usr/src/sys/kern/kern_malloc.c:506
> #11 0x80cd49db in zil_lwb_write_done () from /boot/kernel/zfs.ko
> #12 0x80cd99b1 in zio_done () from /boot/kernel/zfs.ko
> #13 0x80cd7d3a in zio_execute () from /boot/kernel/zfs.ko
> #14 0x80cd7f2e in zio_notify_parent () from /boot/kernel/zfs.ko
> #15 0x80cd9a31 in zio_done () from /boot/kernel/zfs.ko
> #16 0x80cd7d3a in zio_execute () from /boot/kernel/zfs.ko
> #17 0x80c6656b in taskq_run_safe () from /boot/kernel/zfs.ko
> #18 0x806af812 in taskqueue_run (queue=0x85a0ac40)
>     at /usr/src/sys/kern/subr_taskqueue.c:239
> #19 0x806afa07 in taskqueue_thread_loop (arg=0x85a3f830)
>     at /usr/src/sys/kern/subr_taskqueue.c:360
> #20 0x80647377 in fork_exit (callout=0x806af94a <taskqueue_thread_loop>,
>     arg=0x85a3f830, frame=0xb4439d38) at /usr/src/sys/kern/kern_fork.c:845
> #21 0x809126a4 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:273
> (kgdb)
> 
> And that coincided with the dates that I added the patch once seeing it
> on the list.
> changeset:   351:f1ca4eb51520
> user:        J. Hellenthal <jhell@DataIX.net>
> date:        Sun Oct 10 22:57:24 2010 -0400
> summary:     Remove the zfs_getpages patch from Andriy Gapon
> 
> changeset:   350:bb885c047f0a
> user:        J. Hellenthal <jhell@DataIX.net>
> date:        Sun Oct 10 22:27:31 2010 -0400
> summary:     zfs_getpages improvement from Andriy Gapon
> 
> If you would like, I can patch back up to this patch to provide more
> information if it's needed, but at the moment I do not have it available
> nor do I have the core that was generated.
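
As a sanity check on the trace, the numbers in frames #4 through #10 appear to
describe one and the same buffer: the start/end range handed to
vm_fault_unwire() matches the addr/size that free() and kmem_free() were asked
to release.  A minimal C sketch of that arithmetic follows.  The helper names
are made up for illustration, and the 4 KiB page size is an assumption about
the i386 machine in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages on the i386 system that produced the trace. */
#define PAGE_SIZE_I386 4096u

/* Values copied verbatim from the backtrace quoted above. */
const uint32_t bt_start = 2568372224u; /* start= in vm_fault_unwire (#4)   */
const uint32_t bt_end   = 2568433664u; /* end=   in vm_fault_unwire (#4)   */
const uint32_t bt_addr  = 0x99164000u; /* addr=  in free (#10)             */
const uint32_t bt_size  = 61440u;      /* size=  in kmem_free (#7)         */

/* True when [start, end) is exactly the region free() was asked to release. */
bool range_matches_free(void)
{
    return bt_start == bt_addr && (bt_end - bt_start) == bt_size;
}

/* Whole pages in the freed region, or 0 if the size is not page-aligned. */
uint32_t freed_pages(void)
{
    return (bt_size % PAGE_SIZE_I386 == 0) ? bt_size / PAGE_SIZE_I386 : 0;
}
```

Under those assumptions the region is a page-aligned, 15-page allocation, which
fits the uma_large_free() path seen in frame #9.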

Thanks a lot for this report.
Actually, the original patch/commit was intended for head only, as there are some
differences in page locking between head and other branches.
I had almost forgotten about that, and certainly would have if not for your report.

See r214936 and r214941.
Thanks!
-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 09:55:11 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4B2431065672;
	Mon,  8 Nov 2010 09:55:11 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 5DAC08FC0A;
	Mon,  8 Nov 2010 09:55:10 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA26760;
	Mon, 08 Nov 2010 11:55:09 +0200 (EET) (envelope-from avg@icyb.net.ua)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PFORQ-0003lV-M8; Mon, 08 Nov 2010 11:55:08 +0200
Message-ID: <4CD7C8FC.900@icyb.net.ua>
Date: Mon, 08 Nov 2010 11:55:08 +0200
From: Andriy Gapon <avg@icyb.net.ua>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: freebsd-current@FreeBSD.ORG, freebsd-fs@FreeBSD.ORG
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=X-VIET-VPS
Content-Transfer-Encoding: 7bit
Cc: 
Subject: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 09:55:11 -0000


JFYI.
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80372a64
stack pointer           = 0x28:0xffffff81265486f0
frame pointer           = 0x28:0xffffff8126548700
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 4080 (initial thread)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801b9b8a = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff803b36ba = kdb_backtrace+0x3a
panic() at 0xffffffff8037c8b2 = panic+0x1d2
trap_fatal() at 0xffffffff8055b35d = trap_fatal+0x39d
trap_pfault() at 0xffffffff8055b638 = trap_pfault+0x2b8
trap() at 0xffffffff8055bd33 = trap+0x603
calltrap() at 0xffffffff80545f78 = calltrap+0x8
--- trap 0xc, rip = 0xffffffff80372a64, rsp = 0xffffff81265486f0, rbp = 0xffffff8126548700 ---
crhold() at 0xffffffff80372a64 = crhold+0x4
fdata_alloc() at 0xffffffff80e17a9f = fdata_alloc+0xcf
fusedev_open() at 0xffffffff80e1896e = fusedev_open+0xae
devfs_open() at 0xffffffff802e8fa7 = devfs_open+0x117
VOP_OPEN_APV() at 0xffffffff805bb0c4 = VOP_OPEN_APV+0x74
vn_open_cred() at 0xffffffff804222bd = vn_open_cred+0x4ad
vn_open() at 0xffffffff804223dc = vn_open+0x1c
kern_openat() at 0xffffffff80420bad = kern_openat+0x15d
kern_open() at 0xffffffff80420f29 = kern_open+0x19
open() at 0xffffffff80420f48 = open+0x18
syscallenter() at 0xffffffff803c0f9e = syscallenter+0x3be
syscall() at 0xffffffff8055b6b1 = syscall+0x41
Xfast_syscall() at 0xffffffff80546252 = Xfast_syscall+0xe2

A NULL pointer is passed as an argument to crhold().

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 11:06:57 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0AB571065712
	for <freebsd-fs@FreeBSD.org>; Mon,  8 Nov 2010 11:06:57 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id C9A4B8FC13
	for <freebsd-fs@FreeBSD.org>; Mon,  8 Nov 2010 11:06:56 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oA8B6urQ088083
	for <freebsd-fs@FreeBSD.org>; Mon, 8 Nov 2010 11:06:56 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oA8B6u6O088081
	for freebsd-fs@FreeBSD.org; Mon, 8 Nov 2010 11:06:56 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 8 Nov 2010 11:06:56 GMT
Message-Id: <201011081106.oA8B6u6O088081@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-fs@FreeBSD.org
Cc: 
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 11:06:57 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/151942  fs         [zfs] panic during ls(1) zfs snapshot directory
o kern/151910  fs         [zfs] booting from raidz/raidz2 on ciss(4) doesn't wor
o kern/151905  fs         [zfs] page fault under load in /sbin/zfs
o kern/151845  fs         [smbfs] [patch] smbfs should be upgraded to support Un
o kern/151648  fs         [zfs] disk wait bug
o kern/151629  fs         [fs] [patch] Skip empty directory entries during name 
o kern/151330  fs         [zfs] will unshare all zfs filesystem after execute a 
o kern/151326  fs         [nfs] nfs exports fail if netgroups contain duplicate 
o kern/151251  fs         [ufs] Can not create files on filesystem with heavy us
o kern/151226  fs         [zfs] can't delete zfs snapshot
o kern/151111  fs         [zfs] vnodes leakage during zfs unmount
o kern/151082  fs         [zfs] [patch] sappend-flaged files on ZFS not working 
o kern/150796  fs         [panic] [suj] [ufs] [softupdates] Panic on portbuild
o kern/150503  fs         [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501  fs         [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390  fs         [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336  fs         [nfs] mountd/nfsd became confused; refused to reload n
o kern/150207  fs         zpool(1): zpool import -d /dev tries to open weird dev
o kern/149855  fs         [gvinum] growfs causes fsck to report errors in Filesy
o kern/149495  fs         [zfs] chflags sappend on zfs not working right
o kern/149208  fs         mksnap_ffs(8) hang/deadlock
o kern/149173  fs         [patch] [zfs] make OpenSolaris <sys/nvpair.h> installa
o kern/149022  fs         [hang] File system operations hangs with suspfs state
o kern/149015  fs         [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014  fs         [zfs] [patch] declarations in ZFS libraries/utilities 
o kern/149013  fs         [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504  fs         [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490  fs         [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368  fs         [zfs] ZFS hanging forever on 8.1-PRERELEASE
o bin/148296   fs         [zfs] [loader] [patch] Very slow probe in /usr/src/sys
o kern/148204  fs         [nfs] UDP NFS causes overload
o kern/148138  fs         [zfs] zfs raidz pool commands freeze
o kern/147903  fs         [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881  fs         [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147790  fs         [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147560  fs         [zfs] [boot] Booting 8.1-PRERELEASE raidz system take 
o kern/147420  fs         [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt 
o kern/146941  fs         [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786  fs         [zfs] zpool import hangs with checksum errors
o kern/146708  fs         [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528  fs         [zfs] Severe memory leak in ZFS on i386
o kern/146502  fs         [nfs] FreeBSD 8 NFS Client Connection to Server
o kern/146375  fs         [nfs] [patch] Typos in macro variables names in sys/fs
s kern/145712  fs         [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411  fs         [xfs] [panic] Kernel panics shortly after mounting an 
o bin/145309   fs         bsdlabel: Editing disk label invalidates the whole dev
o kern/145272  fs         [zfs] [panic] Panic during boot when accessing zfs on 
o kern/145246  fs         [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238  fs         [zfs] [panic] kernel panic on zpool clear tank
o kern/145229  fs         [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189  fs         [nfs] nfsd performs abysmally under load
o kern/144929  fs         [ufs] [lor] vfs_bio.c + ufs_dirhash.c
o kern/144458  fs         [nfs] [patch] nfsd fails as a kld
p kern/144447  fs         [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416  fs         [panic] Kernel panic on online filesystem optimization
s kern/144415  fs         [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234  fs         [zfs] Cannot boot machine with recent gptzfsboot code 
o kern/143825  fs         [nfs] [panic] Kernel panic on NFS client
o bin/143572   fs         [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143345  fs         [ext2fs] [patch] extfs minor header cleanups to better
o kern/143212  fs         [nfs] NFSv4 client strange work ...
o kern/143184  fs         [zfs] [lor] zfs/bufwait LOR
o kern/142924  fs         [ext2fs] [patch] Small cleanup for the inode struct in
o kern/142914  fs         [zfs] ZFS performance degradation over time
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142401  fs         [ntfs] [patch] Minor updates to NTFS from NetBSD
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two 
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140661  fs         [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640  fs         [zfs] snapshot crash
o kern/140134  fs         [msdosfs] write and fsck destroy filesystem integrity
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
o bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/138790  fs         [zfs] ZFS ceases caching when mem demand is high
o kern/138662  fs         [panic] ffs_blkfree: freeing free block
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135667  fs         [lor] LORs causing ufs filesystem corruption on XEN Do
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [panic] panic: ffs_truncate: read-only filesystem
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor"  on the mount poin
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: 
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/127787  fs         [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
o bin/127270   fs         fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895  fs         [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
p kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
f bin/124424   fs         [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779   fs         [ufs] snapinfo(8) (and related tools?) only work for t
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991  fs         [panic] [ffs] [snapshot] System crashes when manipulat
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
f kern/119735  fs         [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o kern/117954  fs         [ufs] dirhash on very large directories blocks the mac
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931  fs         lack of fsck_cd9660 prevents mounting iso images with 
p kern/116608  fs         [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583  fs         [ffs] [hang] System freezes for short time when using 
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
f kern/115645  fs         [ffs] [snapshots] [panic] lockmgr: thread 0xc4c00d80, 
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala 
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show 
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created 
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107  fs         [ufs] left-over fsck_snapshot after unfinished backgro
o kern/106030  fs         [ufs] [panic] panic in ufs from geom when a dead disk 
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear 
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
s bin/97498    fs         [request] newfs(8) has no option to clear the first 12
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o bin/94810    fs         fsck(8) incorrectly reports 'file system marked clean'
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o bin/94635    fs         snapinfo(8)/libufs only works for disk-backed filesyst
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time 
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88555   fs         [panic] ffs_blkfree: freeing free frag on AMD 64
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o bin/87966    fs         [patch] newfs(8): introduce -A flag for newfs to enabl
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o bin/85494    fs         fsck_ffs: unchecked use of cg_inosused macro etc.
f kern/85326   fs         [smbfs] [panic] saving a file via samba to an overquot
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o bin/74779    fs         Background-fsck checks one filesystem twice and omits 
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o bin/70600    fs         fsck(8) throws files away when it can't grow lost+foun
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/33464   fs         [ufs] soft update inconsistencies after system crash
o bin/27687    fs         fsck(8) wrapper is not properly passing options to fsc
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

214 problems total.


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 11:55:05 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1810A106566B;
	Mon,  8 Nov 2010 11:55:05 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id E8F6E8FC0C;
	Mon,  8 Nov 2010 11:55:03 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA29560;
	Mon, 08 Nov 2010 13:55:02 +0200 (EET) (envelope-from avg@icyb.net.ua)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PFQJS-0003vD-4w; Mon, 08 Nov 2010 13:55:02 +0200
Message-ID: <4CD7E515.5040209@icyb.net.ua>
Date: Mon, 08 Nov 2010 13:55:01 +0200
From: Andriy Gapon <avg@icyb.net.ua>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Ivan Voras <ivoras@freebsd.org>
References: <4CD7C8FC.900@icyb.net.ua> <ib8nas$9de$1@dough.gmane.org>
In-Reply-To: <ib8nas$9de$1@dough.gmane.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 11:55:05 -0000

on 08/11/2010 13:35 Ivan Voras said the following:
> On 11/08/10 10:55, Andriy Gapon wrote:
>>
>> JFYI.
>> Fatal trap 12: page fault while in kernel mode
> 
> Can you find any set of circumstances which make this repeatable?
> 
> This panic apparently goes like this:
> 
> 1) used by devfs_open():
>  47 static struct cdevsw fuse_cdevsw = {
>  48         .d_open = fusedev_open,
> 
> 2) in fusedev_open():
> 119         fdata = fdata_alloc(dev, td->td_ucred);
> 
> 3) in fdata_alloc():
> 297         data->daemoncred = crhold(cred);
> 
> in other words, td->td_ucred from td passed to fusedev_open (presumably
> when the device is opened from the userland) appears to be NULL.
> 
> I don't know if there is any normal set of circumstances under which
> this is expected.

I reliably got this panic when all I was doing was saving an attachment in
Thunderbird 3 running in a KDE 4 environment.  I am not sure what was going on
behind the scenes, but it shouldn't have been anything out of the ordinary.
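
For reference, the call chain quoted above can be modelled in user space to
show why a NULL td_ucred would be consistent with a supervisor write fault at
virtual address 0x0 inside crhold().  This is only a sketch: the *_model types
and function names are hypothetical stand-ins, and it assumes that the
reference count is the first member of the credential structure and that the
kernel routine does not check its argument for NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in; the real credential structure has many more fields. */
struct ucred_model {
    unsigned int cr_ref;            /* assumed first member, i.e. offset 0 */
};

struct thread_model {
    struct ucred_model *td_ucred;   /* credentials of the opening thread */
};

/*
 * Conceptually what crhold() does: bump the reference count and return the
 * credential.  Unlike the kernel routine, this model returns NULL instead of
 * writing through a NULL pointer.
 */
struct ucred_model *crhold_checked(struct ucred_model *cr)
{
    if (cr == NULL)
        return NULL;                /* the unchecked version would fault here */
    cr->cr_ref++;
    return cr;
}

/* Address a refcount write would touch if the credential pointer were NULL. */
size_t crhold_fault_offset(void)
{
    return offsetof(struct ucred_model, cr_ref);
}

/* Steps 2 and 3 of the chain: the open handler passes td->td_ucred down. */
int open_model(struct thread_model *td)
{
    struct ucred_model *held = crhold_checked(td->td_ucred);
    return held == NULL ? -1 : 0;
}
```

With a valid credential the model just takes a reference; with a NULL one the
write would land at offset 0, which matches the reported fault address.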

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 12:13:23 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 683B61065679;
	Mon,  8 Nov 2010 12:13:23 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 4D5C08FC26;
	Mon,  8 Nov 2010 12:13:21 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA29823;
	Mon, 08 Nov 2010 14:13:20 +0200 (EET) (envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PFQbA-0003x3-Ia; Mon, 08 Nov 2010 14:13:20 +0200
Message-ID: <4CD7E960.1070200@freebsd.org>
Date: Mon, 08 Nov 2010 14:13:20 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Ivan Voras <ivoras@freebsd.org>
References: <4CD7C8FC.900@icyb.net.ua> <ib8nas$9de$1@dough.gmane.org>
	<4CD7E515.5040209@icyb.net.ua>
In-Reply-To: <4CD7E515.5040209@icyb.net.ua>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 12:13:23 -0000

on 08/11/2010 13:55 Andriy Gapon said the following:
> I reliably got this panic when all I was doing was saving an attachment in
> Thunderbird 3 running in a KDE 4 environment.  I'm not sure what was going on
> behind the scenes, but it shouldn't have been anything out of the ordinary.

Perhaps this is a local mistake on my side.  I can't see from the code and the
crash dump how a NULL pointer is possible there.  So perhaps I have an ABI
mismatch between the kernel and the fuse module.
I will rebuild the fuse kmod and re-test.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 12:28:42 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5C459106564A
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 12:28:42 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 11D758FC15
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 12:28:41 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1PFQq0-0004bC-6N
	for freebsd-fs@freebsd.org; Mon, 08 Nov 2010 13:28:40 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 13:28:40 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 13:28:40 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 08 Nov 2010 13:28:26 +0100
Lines: 19
Message-ID: <ib8qda$nat$1@dough.gmane.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
X-Enigmail-Version: 1.1.2
Subject: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 12:28:42 -0000

I was looking at fusefs sources and there is a dance it does with the
Giant lock which looks fishy.

Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
mentions, but if I understand it correctly only these are "active" instances:

1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
out what they're guarding
2) some manual locking and unlocking in nfsclient which appears to only
guard printf() (???)
3) some more locking in nfsserver which apparently is only there to
guard the underlying local file system
4) coda, which appears to be the only one marked with D_NEEDGIANT, but
doesn't do much of its own interfacing with it

Except for these, is there any more magic that would need to be resolved
to excise Giant from VFS?

Would it be correct to think that coda is the single biggest obstacle?


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 12:41:04 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 95CE110656C5
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 12:41:04 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 4C5818FC25
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 12:41:03 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1PFR1y-0001F5-Sd
	for freebsd-fs@freebsd.org; Mon, 08 Nov 2010 13:41:02 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 13:41:02 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 13:41:02 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 08 Nov 2010 13:40:51 +0100
Lines: 14
Message-ID: <ib8r4k$qkm$1@dough.gmane.org>
References: <4CD7C8FC.900@icyb.net.ua>
	<ib8nas$9de$1@dough.gmane.org>	<4CD7E515.5040209@icyb.net.ua>
	<4CD7E960.1070200@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <4CD7E960.1070200@freebsd.org>
X-Enigmail-Version: 1.1.2
Cc: freebsd-current@freebsd.org
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 12:41:04 -0000

On 11/08/10 13:13, Andriy Gapon wrote:
> on 08/11/2010 13:55 Andriy Gapon said the following:
>> I reliably got this panic when all I was doing was saving an attachment in
>> Thunderbird 3 running in a KDE 4 environment.  I'm not sure what was going on
>> behind the scenes, but it shouldn't have been anything out of the ordinary.
> 
> Perhaps this is a local mistake on my side.  I can't see from the code and the
> crash dump how a NULL pointer is possible there.  So perhaps I have an ABI
> mismatch between the kernel and the fuse module.
> I will rebuild the fuse kmod and re-test.

OTOH it could be one of those supposed memory corruptions in FUSE, but
this particular backtrace is pretty tight; I can't see anything in it that
could NULL-ify a field in struct thread.


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 14:32:35 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E40BF106566B;
	Mon,  8 Nov 2010 14:32:35 +0000 (UTC)
	(envelope-from gleb.kurtsou@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 4423B8FC19;
	Mon,  8 Nov 2010 14:32:34 +0000 (UTC)
Received: by bwz3 with SMTP id 3so4924375bwz.13
	for <multiple recipients>; Mon, 08 Nov 2010 06:32:34 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:references:mime-version:content-type:content-disposition
	:in-reply-to:user-agent;
	bh=7VW5+8jQ+fRzpA3ohr1dYewahGI6mBkL1elER2pkx0c=;
	b=eCzFYxwhNTr6AaFtTgYTwRSF33lWCsyabha8NyJaJzz07FWCFshXMhgK/WLHLwefFW
	iCrNAjTpJHGDQrTeWbANn3KiTtd6xkUFdfkzHOYSbf8hqo/nQX7nSphkT9Fu8NmWwEnm
	nJHNeHOUg88HNZVxrbBmLYdj+1/99lXIfvrt4=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	b=F/Xcusv9TFJ3BJHYORP5j8QDMI8dWIwJYlWh5l3PDGKcPuOzoE9tVkbfQly/gOQtE/
	82Rqnn1SsXzp2kypTFqYQw5vzd7OPw2prag7xjfqNM9gVV2F6k17WkUMq0G2Zb3RwwSK
	x3bL5a5fx8+FqJFzW9bqDGu/1NGDAtSvuZ1jc=
Received: by 10.204.62.193 with SMTP id y1mr4957723bkh.131.1289226695087;
	Mon, 08 Nov 2010 06:31:35 -0800 (PST)
Received: from localhost ([91.187.5.20])
	by mx.google.com with ESMTPS id r21sm3911608bkj.22.2010.11.08.06.31.33
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Mon, 08 Nov 2010 06:31:33 -0800 (PST)
Date: Mon, 8 Nov 2010 16:31:30 +0200
From: Gleb Kurtsou <gleb.kurtsou@gmail.com>
To: Ivan Voras <ivoras@freebsd.org>
Message-ID: <20101108143130.GA2799@tops>
References: <ib8qda$nat$1@dough.gmane.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <ib8qda$nat$1@dough.gmane.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 14:32:36 -0000

On (08/11/2010 13:28), Ivan Voras wrote:
> I was looking at fusefs sources and there is a dance it does with the
> Giant lock which looks fishy.
It's intended to be fishy. No kernel-level locks should be held when
returning to userland. In other words, on each syscall the vnode is locked
(plus the Giant lock for the fs if needed), then it's unlocked by the
filesystem and relocked upon callback from userspace. puffs is MPSAFE, if
that is of any help to you.
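
The unlock/relock pattern described above can be sketched in plain userland
C. This is only a model under stated assumptions: fake_vnode, wait_for_daemon,
fs_op and syscall_path are hypothetical names, the "lock" is a plain flag
rather than a kernel lock, and none of this is taken from the fuse sources.

```c
#include <assert.h>

/* Hypothetical stand-in for a vnode; only the lock state matters here. */
struct fake_vnode {
	int lock_held;		/* 1 while the "kernel" holds the vnode lock */
};

/* Hypothetical upcall: succeeds only if no lock is held while we wait. */
static int
wait_for_daemon(const struct fake_vnode *vp)
{
	return (vp->lock_held == 0);
}

/*
 * Models one filesystem operation that is entered with the vnode locked:
 * drop the lock, let the userland daemon answer, relock before returning.
 */
static int
fs_op(struct fake_vnode *vp)
{
	int ok;

	assert(vp->lock_held);	/* the caller locked the vnode */
	vp->lock_held = 0;	/* unlock before going to userland */
	ok = wait_for_daemon(vp);
	vp->lock_held = 1;	/* relock on the way back */
	return (ok);
}

/* Models the syscall layer, which locks and unlocks around the operation. */
static int
syscall_path(struct fake_vnode *vp)
{
	int ok;

	vp->lock_held = 1;
	ok = fs_op(vp);
	vp->lock_held = 0;
	return (ok);
}
```

The caller still sees the vnode locked on return from fs_op(), so the
lock is only absent for the duration of the userland round trip.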

> Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
> mentions, but if I understand it correctly only these are "active" instances:
> 
> 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
> out what they're guarding
> 2) some manual locking and unlocking in nfsclient which appears to only
> guard printf() (???)
Somewhat unrelated, but does the NFS client unlock vnodes while sending a
request or waiting for the RPC reply? I thought it does, but I'm not sure.

> 3) some more locking in nfsserver which apparently is only there to
> guard the underlying local file system
> 4) coda, which appears to be the only one marked with D_NEEDGIANT, but
> doesn't do much of its own interfacing with it
> 
> Except for these, is there any more magic that would need to be resolved
> to excise Giant from VFS?
Kostik was working on it.

> Would it be correct to think that coda is the single biggest obstacle?
A filesystem should be marked as MPSAFE; the relevant flag is not D_NEEDGIANT
but MNTK_MPSAFE. A lot of filesystems are still locked by Giant, e.g. ext2fs,
smbfs, nwfs, ntfs, etc.


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 14:38:43 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6B03F10656AB
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 14:38:43 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 2335D8FC20
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 14:38:42 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1PFSrp-0005bJ-I7
	for freebsd-fs@freebsd.org; Mon, 08 Nov 2010 15:38:41 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 15:38:41 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 15:38:41 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 08 Nov 2010 15:38:30 +0100
Lines: 21
Message-ID: <ib9216$rk8$1@dough.gmane.org>
References: <ib8qda$nat$1@dough.gmane.org> <20101108143130.GA2799@tops>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <20101108143130.GA2799@tops>
X-Enigmail-Version: 1.1.2
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 14:38:43 -0000

On 11/08/10 15:31, Gleb Kurtsou wrote:
> On (08/11/2010 13:28), Ivan Voras wrote:
>> I was looking at fusefs sources and there is a dance it does with the
>> Giant lock which looks fishy.
> It's intended to be fishy. No kernel-level locks should be held when
> returning to userland. In other words, on each syscall the vnode is locked
> (plus the Giant lock for the fs if needed), then it's unlocked by the
> filesystem and relocked upon callback from userspace. puffs is MPSAFE, if
> that is of any help to you.

I don't think we're talking completely about the same thing here.

I'm talking about fuse's DO_GIANT_MANUALLY flag, with awareness that
fuse does:

473 #ifdef MNTK_MPSAFE
474         mp->mnt_kern_flag |= MNTK_MPSAFE;
475 #endif

What are you talking about?


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 15:05:05 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D93851065697;
	Mon,  8 Nov 2010 15:05:05 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id A67D68FC1D;
	Mon,  8 Nov 2010 15:05:05 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id 590B446B06;
	Mon,  8 Nov 2010 10:05:05 -0500 (EST)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 85A328A029;
	Mon,  8 Nov 2010 10:05:04 -0500 (EST)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-current@freebsd.org
Date: Mon, 8 Nov 2010 09:42:41 -0500
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <4CD7C8FC.900@icyb.net.ua> <ib8nas$9de$1@dough.gmane.org>
In-Reply-To: <ib8nas$9de$1@dough.gmane.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <201011080942.41546.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6
	(bigwig.baldwin.cx); Mon, 08 Nov 2010 10:05:04 -0500 (EST)
X-Virus-Scanned: clamav-milter 0.96.3 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-1.9 required=4.2 tests=BAYES_00 autolearn=ham
	version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on bigwig.baldwin.cx
Cc: freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 15:05:06 -0000

On Monday, November 08, 2010 6:35:55 am Ivan Voras wrote:
> On 11/08/10 10:55, Andriy Gapon wrote:
> > 
> > JFYI.
> > Fatal trap 12: page fault while in kernel mode
> 
> Can you find any set of circumstances which make this repeatable?
> 
> This panic apparently goes like this:
> 
> 1) used by devfs_open():
>  47 static struct cdevsw fuse_cdevsw = {
>  48         .d_open = fusedev_open,
> 
> 2) in fusedev_open():
> 119         fdata = fdata_alloc(dev, td->td_ucred);
> 
> 3) in fdata_alloc():
> 297         data->daemoncred = crhold(cred);
> 
> in other words, td->td_ucred from td passed to fusedev_open (presumably
> when the device is opened from the userland) appears to be NULL.
> 
> I don't know if there is any normal set of circumstances under which
> this is expected.

No, td_ucred should never be NULL.
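
To make the failure mode in the quoted call chain concrete, here is a small
userland model. It is a sketch under assumptions: the structures are cut down
to the fields mentioned in the thread, crhold_model and fusedev_open_model are
hypothetical names, and the NULL check is an illustrative defensive variant,
not what the fuse module does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Minimal stand-ins; the real kernel structures are far larger. */
struct ucred     { unsigned int cr_ref; };
struct thread    { struct ucred *td_ucred; };
struct fuse_data { struct ucred *daemoncred; };

/* Models crhold(): taking a reference has to dereference the credential. */
static struct ucred *
crhold_model(struct ucred *cred)
{
	assert(cred != NULL);	/* in the kernel a NULL cred page-faults here */
	cred->cr_ref++;
	return (cred);
}

/* Hypothetical open path that rejects a thread without credentials. */
static int
fusedev_open_model(struct thread *td, struct fuse_data *data)
{
	if (td == NULL || td->td_ucred == NULL)
		return (EINVAL);	/* invariant violated; never expected */
	data->daemoncred = crhold_model(td->td_ucred);
	return (0);
}
```

Since the invariant is that the credential pointer is always valid, a check
like this would only hide the real problem (for example a stale module built
against different kernel structures); it is shown purely to mark where the
dereference happens.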

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 15:05:10 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6744A10656BD;
	Mon,  8 Nov 2010 15:05:10 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 3282C8FC12;
	Mon,  8 Nov 2010 15:05:10 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id CB41046B35;
	Mon,  8 Nov 2010 10:05:09 -0500 (EST)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id A65268A027;
	Mon,  8 Nov 2010 10:05:08 -0500 (EST)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-fs@freebsd.org
Date: Mon, 8 Nov 2010 10:04:59 -0500
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <ib8qda$nat$1@dough.gmane.org>
In-Reply-To: <ib8qda$nat$1@dough.gmane.org>
MIME-Version: 1.0
Message-Id: <201011081004.59640.jhb@freebsd.org>
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6
	(bigwig.baldwin.cx); Mon, 08 Nov 2010 10:05:08 -0500 (EST)
X-Virus-Scanned: clamav-milter 0.96.3 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-1.9 required=4.2 tests=BAYES_00 autolearn=ham
	version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on bigwig.baldwin.cx
Cc: Ivan Voras <ivoras@freebsd.org>
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 15:05:10 -0000

On Monday, November 08, 2010 7:28:26 am Ivan Voras wrote:
> I was looking at fusefs sources and there is a dance it does with the
> Giant lock which looks fishy.
> 
> Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
> mentions, but if I understand it correctly only these are "active" instances:
> 
> 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
> out what they're guarding
> 2) some manual locking and unlocking in nfsclient which appears to only
> guard printf() (???)
> 3) some more locking in nfsserver which apparently is only there to
> guard the underlying local file system
> 4) coda, which appears to be the only one marked with D_NEEDGIANT, but
> doesn't do much of its own interfacing with it
> 
> Except for these, is there any more magic that would need to be resolved
> to excise Giant from VFS?
> 
> Would it be correct to think that coda is the single biggest obstacle?

Err, all the VFS_LOCK_GIANT() stuff for filesystems that do not have 
MNTK_MPSAFE set.  I believe the currently MPSAFE fs's are UFS, ZFS, MSDOSFS,
CD9660, UDF, NFS client, and devfs.  I think all others are !MPSAFE still.
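
The conditional locking referred to above can be modelled roughly as follows.
This is a userland sketch under assumptions: the names ending in _model are
hypothetical, the flag value is illustrative rather than the kernel's, and a
counter stands in for the Giant mutex.

```c
#include <assert.h>
#include <stddef.h>

#define MNTK_MPSAFE_MODEL 0x1u	/* illustrative bit, not the kernel's value */

struct mount_model { unsigned int mnt_kern_flag; };

static int giant_depth;		/* stands in for the Giant mutex */

/* Returns 1 if Giant was taken, so the caller knows whether to drop it. */
static int
vfs_lock_giant_model(const struct mount_model *mp)
{
	if (mp != NULL && (mp->mnt_kern_flag & MNTK_MPSAFE_MODEL) == 0) {
		giant_depth++;
		return (1);
	}
	return (0);
}

/* Drops Giant only if the matching lock call actually took it. */
static void
vfs_unlock_giant_model(int locked)
{
	if (locked) {
		assert(giant_depth > 0);
		giant_depth--;
	}
}
```

Every VFS entry point that brackets its work with such a pair pays nothing
for an MPSAFE mount, but serializes on Giant for any filesystem that has not
set the flag.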

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 15:08:12 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C0C091065672;
	Mon,  8 Nov 2010 15:08:12 +0000 (UTC)
	(envelope-from ivoras@gmail.com)
Received: from mail-pv0-f182.google.com (mail-pv0-f182.google.com
	[74.125.83.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 8C3578FC0C;
	Mon,  8 Nov 2010 15:08:12 +0000 (UTC)
Received: by pvc22 with SMTP id 22so1343640pvc.13
	for <multiple recipients>; Mon, 08 Nov 2010 07:08:12 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:mime-version:sender:received
	:in-reply-to:references:from:date:x-google-sender-auth:message-id
	:subject:to:cc:content-type:content-transfer-encoding;
	bh=1y/z5S8bvr0wnQxCmQAL8gHVfZt+to3OEni7ST3Gy3M=;
	b=uPWFt+qy1zDiShhx/XOYSk+eyqxYjBiVN15ujbFO9UPY8FRAiGDc5KC/PLm70CLh27
	eQmV9/4DKyDWG+5tz1rgf6MhMUtzOJ1109dpl945vNxVebRqxd3RUbzViHGH1Cg1ayE8
	r0uRbC1PqUHbxs5V4guraqlKQe4FMnF18KsZY=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:from:date
	:x-google-sender-auth:message-id:subject:to:cc:content-type
	:content-transfer-encoding;
	b=fUyMbRYYbIjo9CfOMVZo4F1P74ntuncrHlKkvaH3C/2hBs5Hb/9/XY3QTbyDqEdLm3
	GswIJC97SrYA1lBeEUp6UP4j4GWEtRBfmkJKbsbtS7frzDZk9qF1VSf3eKB9ajBVjaNk
	FEQCIZvcBPaei9ZkslQ8BML33NSSaQiB6rhVA=
Received: by 10.229.246.136 with SMTP id ly8mr5134771qcb.237.1289228891427;
	Mon, 08 Nov 2010 07:08:11 -0800 (PST)
MIME-Version: 1.0
Sender: ivoras@gmail.com
Received: by 10.229.40.145 with HTTP; Mon, 8 Nov 2010 07:07:29 -0800 (PST)
In-Reply-To: <201011081004.59640.jhb@freebsd.org>
References: <ib8qda$nat$1@dough.gmane.org> <201011081004.59640.jhb@freebsd.org>
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 8 Nov 2010 16:07:29 +0100
X-Google-Sender-Auth: x8ab4LFurqRWZNp0ap7bTcPEfmA
Message-ID: <AANLkTi=B3D=tv81ewPQbWMm4xH5sYzWFX3hbPY72R=ug@mail.gmail.com>
To: John Baldwin <jhb@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 15:08:12 -0000

On 8 November 2010 16:04, John Baldwin <jhb@freebsd.org> wrote:
> On Monday, November 08, 2010 7:28:26 am Ivan Voras wrote:
>> I was looking at fusefs sources and there is a dance it does with the
>> Giant lock which looks fishy.
>>
>> Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
>> mentions, but if I understand it correctly only these are "active" instances:
>>
>> 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
>> out what they're guarding
>> 2) some manual locking and unlocking in nfsclient which appears to only
>> guard printf() (???)
>> 3) some more locking in nfsserver which apparently is only there to
>> guard the underlying local file system
>> 4) coda, which appears to be the only one marked with D_NEEDGIANT, but
>> doesn't do much of its own interfacing with it
>>
>> Except for these, is there any more magic that would need to be resolved
>> to excise Giant from VFS?
>>
>> Would it be correct to think that coda is the single biggest obstacle?
>
> Err, all the VFS_LOCK_GIANT() stuff for filesystems that do not have
> MNTK_MPSAFE set.  I believe the currently MPSAFE fs's are UFS, ZFS, MSDOSFS,
> CD9660, UDF, NFS client, and devfs.  I think all others are !MPSAFE still.

Thanks!

It seemed too easy to be true.

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 15:10:33 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6734D1065675;
	Mon,  8 Nov 2010 15:10:33 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id D41788FC2B;
	Mon,  8 Nov 2010 15:10:32 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua
	[10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id oA8FATQW000494
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 8 Nov 2010 17:10:29 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id
	oA8FASWf017413; Mon, 8 Nov 2010 17:10:28 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id oA8FASPR017412; 
	Mon, 8 Nov 2010 17:10:28 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Mon, 8 Nov 2010 17:10:28 +0200
From: Kostik Belousov <kostikbel@gmail.com>
To: John Baldwin <jhb@freebsd.org>
Message-ID: <20101108151028.GI2392@deviant.kiev.zoral.com.ua>
References: <ib8qda$nat$1@dough.gmane.org> <201011081004.59640.jhb@freebsd.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="SvH2i/6Q4qfAo9X5"
Content-Disposition: inline
In-Reply-To: <201011081004.59640.jhb@freebsd.org>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_20,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 15:10:33 -0000


--SvH2i/6Q4qfAo9X5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Nov 08, 2010 at 10:04:59AM -0500, John Baldwin wrote:
> On Monday, November 08, 2010 7:28:26 am Ivan Voras wrote:
> > I was looking at fusefs sources and there is a dance it does with the
> > Giant lock which looks fishy.
> >
> > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
> > mentions, but if I understand it correctly only these are "active" instances:
> >
> > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
> > out what they're guarding
> > 2) some manual locking and unlocking in nfsclient which appears to only
> > guard printf() (???)
> > 3) some more locking in nfsserver which apparently is only there to
> > guard the underlying local file system
> > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but
> > doesn't do much of its own interfacing with it
> >
> > Except for these, is there any more magic that would need to be resolved
> > to excise Giant from VFS?
> >
> > Would it be correct to think that coda is the single biggest obstacle?
>
> Err, all the VFS_LOCK_GIANT() stuff for filesystems that do not have
> MNTK_MPSAFE set.  I believe the currently MPSAFE fs's are UFS, ZFS, MSDOSFS,
> CD9660, UDF, NFS client, and devfs.  I think all others are !MPSAFE still.
pseudofs-based filesystems are MPSAFE too.

I have already said several times that I will remove VFS_LOCK_GIANT
once smbfs is locked. The patch for the removal has been sitting in my
repository for almost a year.

--SvH2i/6Q4qfAo9X5
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEARECAAYFAkzYEuQACgkQC3+MBN1Mb4j9awCgjT/sNXF737obeLixZFmTJgZU
mwwAoMXrnVc+Jxaoz0R9iXmeV0TKyDXn
=5/2O
-----END PGP SIGNATURE-----

--SvH2i/6Q4qfAo9X5--

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 15:15:51 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 461731065670;
	Mon,  8 Nov 2010 15:15:51 +0000 (UTC) (envelope-from pjd@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 1BA7E8FC1A;
	Mon,  8 Nov 2010 15:15:51 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oA8FFoG4053454;
	Mon, 8 Nov 2010 15:15:50 GMT (envelope-from pjd@freefall.freebsd.org)
Received: (from pjd@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oA8FFoB1053450;
	Mon, 8 Nov 2010 15:15:50 GMT (envelope-from pjd)
Date: Mon, 8 Nov 2010 15:15:50 GMT
Message-Id: <201011081515.oA8FFoB1053450@freefall.freebsd.org>
To: am@raisa.eu.org, pjd@FreeBSD.org, freebsd-fs@FreeBSD.org, pjd@FreeBSD.org
From: pjd@FreeBSD.org
Cc: 
Subject: Re: kern/151910: [zfs] booting from raidz/raidz2 on ciss(4) doesn't
	work
X-List-Received-Date: Mon, 08 Nov 2010 15:15:51 -0000

Synopsis: [zfs] booting from raidz/raidz2 on ciss(4) doesn't work

State-Changed-From-To: open->feedback
State-Changed-By: pjd
State-Changed-When: Mon Nov 8 15:08:24 UTC 2010
State-Changed-Why: 
Could you take a look at two files in FreeBSD HEAD:

	sys/boot/i386/zfsboot/zfsboot.c
	sys/boot/i386/libi386/biosdisk.c

Look for VIRTUALBOX in there and apply the same changes to your stable/8 code,
or just modify the code so that the parts guarded by VIRTUALBOX are always
compiled in. There is a bug in VirtualBox where the BIOS reports only one disk
available, but if you ignore that and just look for more, you will find them.
Maybe there is a similar bug in your BIOS?
Please try it out and let me know. If it doesn't work, we can add more
debugging to see exactly where and why it fails.


Responsible-Changed-From-To: freebsd-fs->pjd
Responsible-Changed-By: pjd
Responsible-Changed-When: Mon Nov 8 15:08:24 UTC 2010
Responsible-Changed-Why: 
I'll take this one.

http://www.freebsd.org/cgi/query-pr.cgi?pr=151910

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 15:29:11 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8B31C106566C;
	Mon,  8 Nov 2010 15:29:11 +0000 (UTC)
	(envelope-from ivoras@gmail.com)
Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com
	[209.85.210.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 4D4D68FC18;
	Mon,  8 Nov 2010 15:29:10 +0000 (UTC)
Received: by pzk12 with SMTP id 12so437004pzk.13
	for <multiple recipients>; Mon, 08 Nov 2010 07:29:10 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:mime-version:sender:received
	:in-reply-to:references:from:date:x-google-sender-auth:message-id
	:subject:to:cc:content-type;
	bh=rYmgPQS2EXgeBYxtrofZIzZkdPKATIa4SCiEhDI8wuU=;
	b=D+xoy9r5b2bQY2gvqrG/XDCaKjj3BPMh9pQbHUhfjxMeur4oj861hLBR0ep/xVQw5p
	CiRK/ipzX3o8hjnwFyOwsdj2YronuA+P3PRTMxOOeR6EMCHF7hFW58b0pmnRIF4WOsas
	Kbnnm+dazRAGJ0X+HwKdKgVBCHSRkg7IPR3eo=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:from:date
	:x-google-sender-auth:message-id:subject:to:cc:content-type;
	b=Ft2W1uvTBhJrM8vr3z79spSP8jR/SdyuMc/kMO/3b2OQMHoVY6q1SWnW5hQj5rVuqN
	Vb4pfLuDcWIp0zgsiEi9QYelAB6cg7tIDHCFtAvpwnm7qVPv1SUXifcjvxa2cJC4L5pl
	wMAFhwPhfoEfPKfeEvxHRX4+X5qwW7ZRmxC5Y=
Received: by 10.229.212.5 with SMTP id gq5mr5139312qcb.275.1289230150158; Mon,
	08 Nov 2010 07:29:10 -0800 (PST)
MIME-Version: 1.0
Sender: ivoras@gmail.com
Received: by 10.229.40.145 with HTTP; Mon, 8 Nov 2010 07:28:29 -0800 (PST)
In-Reply-To: <20101108151028.GI2392@deviant.kiev.zoral.com.ua>
References: <ib8qda$nat$1@dough.gmane.org> <201011081004.59640.jhb@freebsd.org>
	<20101108151028.GI2392@deviant.kiev.zoral.com.ua>
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 8 Nov 2010 16:28:29 +0100
X-Google-Sender-Auth: ItHWOQKCrhtLyU7It-eoPpq9uN4
Message-ID: <AANLkTi=KtGJjhEzRnQN78koFQqvtFC1c+qVfunUdae+w@mail.gmail.com>
To: Kostik Belousov <kostikbel@gmail.com>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-fs@freebsd.org
Subject: Re: The state of Giant lock in the file systems?
X-List-Received-Date: Mon, 08 Nov 2010 15:29:11 -0000

On 8 November 2010 16:10, Kostik Belousov <kostikbel@gmail.com> wrote:

> I have already said several times that I will remove VFS_LOCK_GIANT
> once smbfs is locked. The removal patch has been sitting in my repository
> for almost a year.

OK, I've made a little table here:

http://wiki.freebsd.org/MPSAFE_VFS

Just to make sure I understand this correctly: in the case of VFS, does
Giant prevent concurrent execution of all vfsops, or does it protect
something more than (or other than) that?
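
For readers following the thread, the conditional locking John describes
(take Giant only for file systems that do not have MNTK_MPSAFE set) can be
modelled in userspace. This is only a sketch: struct model_mount,
MODEL_MPSAFE, model_giant and both functions are hypothetical stand-ins,
and a pthread mutex plays the role of Giant. It does not answer what the
lock protects inside a given file system.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct mount and MNTK_MPSAFE. */
#define MODEL_MPSAFE 0x1u

struct model_mount {
	unsigned int kern_flag;
};

/* One global mutex playing the role of Giant in this model. */
static pthread_mutex_t model_giant = PTHREAD_MUTEX_INITIALIZER;

/*
 * Take the global lock only when the file system is not marked MPSAFE.
 * Returns 1 if the lock was taken and 0 otherwise, so that the caller
 * can hand the value back to model_unlock_giant().
 */
static int
model_lock_giant(const struct model_mount *mp)
{
	if (mp != NULL && (mp->kern_flag & MODEL_MPSAFE) == 0) {
		pthread_mutex_lock(&model_giant);
		return (1);
	}
	return (0);
}

/* Release the global lock only if the matching lock call took it. */
static void
model_unlock_giant(int locked)
{
	if (locked)
		pthread_mutex_unlock(&model_giant);
}
```

A caller would bracket each call into the file system with
model_lock_giant() and model_unlock_giant(), so MPSAFE file systems never
touch the global mutex at all.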

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 16:02:57 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 08A881065674;
	Mon,  8 Nov 2010 16:02:57 +0000 (UTC) (envelope-from jh@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id D2B958FC15;
	Mon,  8 Nov 2010 16:02:56 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oA8G2uuS004790;
	Mon, 8 Nov 2010 16:02:56 GMT (envelope-from jh@freefall.freebsd.org)
Received: (from jh@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oA8G2ugq004786;
	Mon, 8 Nov 2010 16:02:56 GMT (envelope-from jh)
Date: Mon, 8 Nov 2010 16:02:56 GMT
Message-Id: <201011081602.oA8G2ugq004786@freefall.freebsd.org>
To: anatoly.borodin@gmail.com, jh@FreeBSD.org, freebsd-fs@FreeBSD.org
From: jh@FreeBSD.org
Cc: 
Subject: Re: bin/124424: [zfs] zfs(8): zfs list -r shows strange snapshots'
	size
X-List-Received-Date: Mon, 08 Nov 2010 16:02:57 -0000

Synopsis: [zfs] zfs(8): zfs list -r shows strange snapshots' size

State-Changed-From-To: feedback->closed
State-Changed-By: jh
State-Changed-When: Mon Nov 8 16:02:56 UTC 2010
State-Changed-Why: 
Feedback timeout.

http://www.freebsd.org/cgi/query-pr.cgi?pr=124424

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 16:32:27 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2F377106566C
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 16:32:27 +0000 (UTC)
	(envelope-from monthadar@gmail.com)
Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com
	[209.85.210.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 0734E8FC0C
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 16:32:26 +0000 (UTC)
Received: by pzk12 with SMTP id 12so497681pzk.13
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 08:32:26 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:date:message-id
	:subject:from:to:content-type;
	bh=kWS+q2CIU2lyQgKhVEyK9bL1WqM66n/QNaVYFYvGShs=;
	b=ZhrAMS7FF/4WSvs3lV3a6GJi3pXRH3x/RlLCOR/A6w6HlBmEAMywYMbwBPiwsyZbhB
	E4ZFDW8IGmV1FTS2Bz/ZnuijwFhfLNl9OHLFdyfj3NVYW+QFesWr2ze/IY4uwdCGL1zi
	EwlR5Cgfta+UjPcEe6iFLmDsCi2oGMvbnvj0Q=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:date:message-id:subject:from:to:content-type;
	b=B5OiWTEV6gjYW//duRaGd8zt2BUmZ7afSAkpqp1UcyIYKo8Q9KnKihmjvxRbKOYEQ3
	OT8+EuttQb6jmLZ1+MHeGgHZt7JmgXfPPwDrcrjJKy0RRXw0DTW9IuEfwVDB5jCTsPun
	HxixWKYDCSs7SXXvuR+JD/U/CUCgRl6nHtEbQ=
MIME-Version: 1.0
Received: by 10.229.225.199 with SMTP id it7mr5377936qcb.33.1289232617849;
	Mon, 08 Nov 2010 08:10:17 -0800 (PST)
Received: by 10.229.182.77 with HTTP; Mon, 8 Nov 2010 08:10:17 -0800 (PST)
Date: Mon, 8 Nov 2010 17:10:17 +0100
Message-ID: <AANLkTinJEZ+QBVWuEFfjsyo8FXb1_iFCa0aHOjT2MVwB@mail.gmail.com>
From: Monthadar Al Jaberi <monthadar@gmail.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
	error=22]
X-List-Received-Date: Mon, 08 Nov 2010 16:32:27 -0000

Hi,

I don't know if I am asking in the wrong place, but this has to do with
the filesystem and the onboard flash (16MB) on a RouterStation Pro board.

I am running FreeBSD-CURRENT 201010, with the kernel configuration
file specified in /usr/src/sys/mips/conf/AR71XX with device
geom_redboot.

But I get this error when I try to mount from flash:
mount /dev/redboot/fs /var/fs
mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size 8192: Invalid argument


So I guessed it has to do with the flash being configured in 64k sectors,
according to the boot output.
...
mx25l0: <M25Pxx Flash Family> at cs 0 on spibus0
mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
...

So I just tried changing SBLOCKSIZE from 8192 to 65536 in
/usr/src/sys/ufs/ffs/fs.h, but then I got this error:
mount /dev/redboot/fs /mnt/fs
g_vfs_done():redboot/fs[READ(offset=8192, length=65536)]error = 22
mount: /dev/redboot/fs : Invalid argument

The filesystem is generated from an empty skeleton using:
makefs -t ffs -B big -s 128k image-name directory-path

Then I transfer the image to the flash using the RedBoot bootloader.

Am I generating an incorrect filesystem image? I don't understand the
offset and length in the last error message.
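
One possible reading of the two errors, sketched under assumptions rather
than checked against the kernel sources: the first mount fails because the
superblock size (8192) is not a multiple of the sector size (65536), and the
second fails because the read at offset 8192 does not start on a 64k sector
boundary, which would explain error 22 (EINVAL). Both helper names below are
hypothetical; they only reproduce the arithmetic.

```c
#include <stdint.h>

/*
 * Return 1 when a request of `length` bytes at byte `offset` both starts
 * and ends on a sector boundary, 0 otherwise.
 */
static int
io_is_sector_aligned(uint64_t offset, uint64_t length, uint32_t sectorsize)
{
	if (sectorsize == 0)
		return (0);
	return (offset % sectorsize == 0 && length % sectorsize == 0);
}

/*
 * Return 1 when a superblock of `sbsize` bytes is made up of a whole
 * number of sectors, 0 otherwise.
 */
static int
sb_fits_sectors(uint32_t sbsize, uint32_t sectorsize)
{
	return (sectorsize != 0 && sbsize % sectorsize == 0);
}
```

With the numbers above, sb_fits_sectors(8192, 65536) is 0, matching the
first error, and io_is_sector_aligned(8192, 65536, 65536) is 0, matching
the READ(offset=8192, length=65536) failure.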

I couldn't use cat to dump the contents of /dev/redboot/fs; it gives an
invalid argument error.
But I can use read(fd, buf, 65536) to read data. The size has to be 64k (hint
from http://wiki.freebsd.org/AdrianChadd/UbiquityRouterstationPro).
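
A minimal sketch of the read(fd, buf, 65536) approach, assuming the device
only accepts requests of exactly one 64k sector. The function name
read_by_sector and the FLASH_SECTOR constant are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define FLASH_SECTOR 65536	/* sector size reported in the boot output */

/*
 * Read `path` using FLASH_SECTOR-sized requests only. Returns the total
 * number of bytes read, or -1 if the open, the allocation or a read fails.
 */
static long
read_by_sector(const char *path)
{
	char *buf;
	ssize_t n;
	long total = 0;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	buf = malloc(FLASH_SECTOR);
	if (buf == NULL) {
		close(fd);
		return (-1);
	}
	/* Every request asks for exactly one sector. */
	while ((n = read(fd, buf, FLASH_SECTOR)) > 0)
		total += n;
	free(buf);
	close(fd);
	return (n < 0 ? -1 : total);
}
```

Calling read_by_sector("/dev/redboot/fs") would walk the whole partition
one sector at a time; a tool that issues smaller reads would be expected to
fail the way cat does here.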

Any help is much appreciated.
-- 
//Monthadar Al Jaberi

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 17:41:56 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 852471065673;
	Mon,  8 Nov 2010 17:41:56 +0000 (UTC)
	(envelope-from sarawgi.aditya@gmail.com)
Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com
	[209.85.160.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 4744D8FC1A;
	Mon,  8 Nov 2010 17:41:56 +0000 (UTC)
Received: by pwj5 with SMTP id 5so53605pwj.13
	for <multiple recipients>; Mon, 08 Nov 2010 09:41:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:references:mime-version:content-type:content-disposition
	:in-reply-to:user-agent;
	bh=UEkQ27ifyK9gVdHmvIkulyOdojh5gOSz4OrW8Vy+kxs=;
	b=EcdO/CPPKVdQNU7fC0zc5Pu/nT8X/MVirF0q4P+l/G7I3VtPmpGEPsIV2EKmQqgXWd
	3ZCLIO/s13i8WLWok4CDmA1E1hfJRLC5P2Ar9kQUiWoQ9AsO34YcXW6+Vrz59icSasoz
	a4Z+pyorIcvY6JvSB1vCqcHNCkTKVkHTSk7Fs=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	b=v303C99e1ZdPfVwapyBaD804PoTQLm5Li4QF0FtowSgRBHuoqu918rhxthqvLShZNE
	P8TgZa+tx/4JXHYxn4Wzyeyi2rOd1HL57158K8Rlcn4ZvTz3JObRFTIHyodRNGfW1GZu
	X1KrgVYdT/zXeSXwPxmJLkzKHoKIG3IiYR4/c=
Received: by 10.142.237.4 with SMTP id k4mr5118835wfh.171.1289238115655;
	Mon, 08 Nov 2010 09:41:55 -0800 (PST)
Received: from earth ([183.87.49.109])
	by mx.google.com with ESMTPS id x35sm175623wfd.1.2010.11.08.09.41.52
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Mon, 08 Nov 2010 09:41:54 -0800 (PST)
Date: Mon, 8 Nov 2010 23:13:32 +0530
From: Aditya Sarawgi <sarawgi.aditya@gmail.com>
To: Doug Barton <dougb@FreeBSD.org>
Message-ID: <20101108174327.GC2066@earth>
References: <20100929031825.L683@besplex.bde.org>
	<20100929084801.M948@besplex.bde.org>
	<20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org>
	<20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org>
	<AANLkTi=iTCG4aO-KO_gy7fp_96KcZ_TCyNk5OkLZUHV3@mail.gmail.com>
	<4CD201AE.3040409@FreeBSD.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BXVAT5kNtrzKuDFl"
Content-Disposition: inline
In-Reply-To: <4CD201AE.3040409@FreeBSD.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: fs@freebsd.org
Subject: Re: ext2fs now extremely slow
X-List-Received-Date: Mon, 08 Nov 2010 17:41:56 -0000


--BXVAT5kNtrzKuDFl
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Nov 03, 2010 at 05:43:26PM -0700, Doug Barton wrote:
> On 11/03/10 16:38, Aditya Sarawgi wrote:
> > On Wed, Nov 3, 2010 at 5:21 AM, Doug Barton<dougb@freebsd.org>  wrote:
> 
> >> Is anything happening with this?  I recently built a new system that is
> >> multi-booting windows, freebsd, and ubuntu. I chose ext[23]fs for my /home
> >> partition so that I could share unix'y stuff between freebsd and linux, but
> >> I'm having both performance and stability problems, and today (fortunately
> >> for the first time, and fortunately recoverable) I had actual data loss. I'm
> >> happy to be a guinea pig for new code if people are reasonably sure that it
> >> will help, but if the situation doesn't improve I will have to reformat.
> >>
> >
> > Are you suffering from these problems on CURRENT ?
> 
> Yes.
> 
> > Can you please elaborate
> > on the performance and stability issue you are facing ? Any specific scenario ?
> 
> What I did was create a fairly large (37G) /home and put all the stuff 
> I'd like to have access to from all 3 systems, like svn, my ports tree, 
> etc. I also ended up putting my obj directory there because I created my 
> /usr/local a little smaller than I should have and after installing 
> gnome I ran out of room. :)
> 
> I should also point out that this is on a brand new desktop system that 
> was donated by a FreeBSD user. It's a C2D running at 3.17G, 4G RAM, and 
> a fast 250G disk. I'm running amd64 -current. Everything disk intensive 
> (updating ports with csup, updating my svn trees, etc.) is slower on 
> this system than it was on my laptop where all the same stuff was on 
> UFS2. Bruce's message that started this thread alluded to the problems, 
> my experience has been similar.
> 
> Regarding stability, sometimes (but not always) when I'm doing the above 
> listed disk-intensive things on an otherwise idle system I've had the 
> system lock up. Not panic, not reboot, just wedge. I'm running X when 
> this happens, so I'm not 100% sure that the disk activity is the 
> culprit, but it seems very suspicious. Yesterday was a very bad day, I 
> had to do 3 tries to get all the way through a buildworld/kernel, mostly 
> because the last 2 crashes resulted in my /usr/src (which is actually 
> /home/svn/head) and /usr/obj (/home/obj-9) directories getting corrupted 
> respectively. Today (running r214694) has actually been quite good, 
> although I haven't tried a buildworld yet.
>

I am not sure if this is the right use case for ext2fs.
 
> > You can test Zheng's preallocation patch for ext2fs, there is a
> > serious lack of testers for that.
> 
> I would be happy to do that, but my reading of this thread last month 
> didn't produce a clear "try this version of the patch" neon sign. 
> Various people referred to suggestions, updates, etc. If someone could 
> provide a URL for the right patch to try, as well as a suggestion for 
> benchmarking methodology, I'll be glad to do so.
>

I have attached the patch. Some primitive testing, like copying files and
untarring archives and comparing the results with the existing ext2fs, will
do. If you want to do full-fledged benchmarking, then I would suggest
iozone, blogbench, dbench, etc.
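
As a rough picture of one thing to watch while testing: the attached patch
grows an inode's reservation window when more than half of the current
window was actually used (see ext2_alloc_new_rsv() in the diff). The
standalone model below mirrors only that rule. The struct, the function
name and both constants are hypothetical; the real limits are defined in
ext2_rsv_win.h, which is not part of this message.

```c
/* Assumed values for illustration; the real ones are in ext2_rsv_win.h. */
#define MODEL_RSV_DEFAULT_BLKS	8
#define MODEL_RSV_MAX_BLKS	1024

struct model_rsv_win {
	int start;	/* first block of the window */
	int end;	/* last block of the window, inclusive */
	int goal_size;	/* size to request for the next window */
	int alloc_hit;	/* allocations satisfied from this window */
};

/*
 * Return the goal size for the next window: double it when more than
 * half of the current window was used, but never exceed the cap.
 */
static int
model_next_goal(struct model_rsv_win *rp)
{
	int size = rp->goal_size;

	if (rp->alloc_hit > (rp->end - rp->start + 1) / 2) {
		size *= 2;
		if (size > MODEL_RSV_MAX_BLKS)
			size = MODEL_RSV_MAX_BLKS;
		rp->goal_size = size;
	}
	return (rp->goal_size);
}
```

A workload that writes large files sequentially should therefore end up
with large windows, while one that touches many small files should stay
near the default size.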
 
> >> On a related note, is there any way to use the journaling features of ext3fs
> >> in FreeBSD? When I boot the linux partition it's treating the fs as ext3fs,
> >> but AFAICS we only have ext2fs capabilities.
> >>
> >
> > Journaling is difficult to bring in, especially if one is planning to
> > have a BSDL version.
> 
> Ok. I can live with accessing the stuff as ext2 from FreeBSD, and I can 
> even live with a minor performance penalty. What I can't live with is 
> instability and/or data corruption; and it should go without saying that 
> our users should not have to live with that either.
>

We were planning to use gjournal, but it is too closely tied to UFS and it
wouldn't be compatible with ext2fs journaling. Haiku seems to have journaling
for ext2fs, but that depends a lot on BFS journaling.
Bringing in the journaling code is not an option here, since they have their
own separate journaling layer.
 
> 
> Thanks for the response,
> 
> Doug
> 
> -- 
> 
> 	Nothin' ever doesn't change, but nothin' changes much.
> 			-- OK Go
> 
> 	Breadth of IT experience, and depth of knowledge in the DNS.
> 	Yours for the right price.  :)  http://SupersetSolutions.com/
> 

--BXVAT5kNtrzKuDFl
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="ext2fs_prealloc.diff"

diff -urN /usr/src/sys/fs/ext2fs/ext2_alloc.c new/ext2_alloc.c
--- /usr/src/sys/fs/ext2fs/ext2_alloc.c	2010-01-14 22:30:54.000000000 +0800
+++ new/ext2_alloc.c	2010-08-19 02:47:29.000000000 +0800
@@ -50,6 +50,9 @@
 #include <fs/ext2fs/ext2fs.h>
 #include <fs/ext2fs/fs.h>
 #include <fs/ext2fs/ext2_extern.h>
+#include <fs/ext2fs/ext2_rsv_win.h>
+
+#define phy_blk(cg, fs) (((cg) * (fs->e2fs->e2fs_fpg)) + fs->e2fs->e2fs_first_dblock)
 
 static daddr_t	ext2_alloccg(struct inode *, int, daddr_t, int);
 static u_long	ext2_dirpref(struct inode *);
@@ -59,37 +62,524 @@
 						int));
 static daddr_t	ext2_nodealloccg(struct inode *, int, daddr_t, int);
 static daddr_t  ext2_mapsearch(struct m_ext2fs *, char *, daddr_t);
+
+/* For reservation window */
+static u_long   ext2_alloc_blk(struct inode *, int, struct buf *, int32_t, struct ext2_rsv_win *);
+static int      ext2_alloc_new_rsv(struct inode *, int, struct buf *, int32_t);
+static int      ext2_bpref_in_rsv(struct ext2_rsv_win *, int32_t);
+static int      ext2_find_rsv(struct ext2_rsv_win *, struct ext2_rsv_win *,
+                              struct m_ext2fs *, int32_t, int);
+static void	ext2_remove_rsv_win(struct m_ext2fs *, struct ext2_rsv_win *);
+static u_long   ext2_rsvalloc(struct m_ext2fs *, struct inode *,
+                              int, struct buf *, int32_t, int);
+static daddr_t  ext2_search_next_block(struct m_ext2fs *, char *, int, int);
+static struct ext2_rsv_win *ext2_search_rsv(struct ext2_rsv_win_tree *, int32_t);
+
+RB_GENERATE(ext2_rsv_win_tree, ext2_rsv_win, rsv_link, ext2_rsv_win_cmp);
+
 /*
  * Allocate a block in the file system.
  *
- * A preference may be optionally specified. If a preference is given
- * the following hierarchy is used to allocate a block:
- *   1) allocate the requested block.
- *   2) allocate a rotationally optimal block in the same cylinder.
- *   3) allocate a block in the same cylinder group.
- *   4) quadradically rehash into other cylinder groups, until an
- *        available block is located.
- * If no block preference is given the following hierarchy is used
- * to allocate a block:
- *   1) allocate a block in the cylinder group that contains the
- *        inode for the file.
- *   2) quadradically rehash into other cylinder groups, until an
- *        available block is located.
- *
- * A preference may be optionally specified. If a preference is given
- * the following hierarchy is used to allocate a block:
- *   1) allocate the requested block.
- *   2) allocate a rotationally optimal block in the same cylinder.
- *   3) allocate a block in the same cylinder group.
- *   4) quadradically rehash into other cylinder groups, until an
- *        available block is located.
- * If no block preference is given the following hierarchy is used
- * to allocate a block:
- *   1) allocate a block in the cylinder group that contains the
- *        inode for the file.
- *   2) quadradically rehash into other cylinder groups, until an
- *        available block is located.
+ * By given preference:
+ *   Check whether inode has a reservation window and preference
+ *   is within it and try to allocate a free block from
+ *   this reservation window.
+ *   If not, traverse RB tree to find a place, which is not in
+ *   any window and insert it to RB tree to try to allocate a
+ *   free block again.
+ *   If it fails, try to allocate a free block in other cylinder
+ *   groups without preference.
+ */
+
+/*
+ * Allocate a free block.
+ *
+ * First check whether reservation window is used.
+ * If reservation window is used, try to allocate a free
+ * block from the reservation window. If it fails, traverse
+ * the bitmap to find a free block.
+ * If reservation window is not used, try to allocate
+ * a free block by bpref. If it fails, traverse the bitmap
+ * to find a free block.
  */
+static u_long
+ext2_alloc_blk(struct inode *ip, int cg, struct buf *bp,
+    int32_t bpref, struct ext2_rsv_win *rp)
+{
+	struct m_ext2fs *fs;
+	struct ext2mount *ump;
+	int bno, start, end;
+	char *bbp;
+
+	fs = ip->i_e2fs;
+	ump = ip->i_ump;
+	bbp = (char *)bp->b_data;
+
+	if (fs->e2fs_gd[cg].ext2bgd_nbfree == 0)
+		return (0);
+
+        if (bpref < 0)
+                bpref = 0;
+
+        /* Check whether a reservation window is in use */
+        if (rp != NULL) {
+                /*
+                 * If window's start is not in this cylinder group,
+                 * try to allocate from the beginning, otherwise
+                 * try to allocate from the beginning of the
+                 * window.
+                 */
+                if (dtog(fs, rp->rsv_start) < cg)
+                        start = 0;
+                else
+                        start = rp->rsv_start;
+
+                /*
+                 * If window's end crosses the end of this group,
+                 * set end variable to the end of this group.
+                 * Otherwise, set it to the window's end.
+                 */
+                if (dtog(fs, rp->rsv_end) > cg)
+                        end = phy_blk(cg + 1, fs) - 1;
+                else
+                        end = rp->rsv_end;
+
+                /* If preference block is within the window, try to allocate it. */
+                if (start <= bpref && bpref <= end) {
+                        bpref = dtogd(fs, bpref);
+                        if (isclr(bbp, bpref)) {
+                                rp->rsv_alloc_hit++;
+                                bno = bpref;
+                                goto gotit;
+                        }
+                } else
+                        if (dtog(fs, rp->rsv_start) == cg)
+                                bpref = dtogd(fs, rp->rsv_start);
+                        else
+                                bpref = 0;
+        } else {
+                if (dtog(fs, bpref) != cg)
+                        bpref = 0;
+                if (bpref != 0) {
+                        bpref = dtogd(fs, bpref);
+                        if (isclr(bbp, bpref)) {
+                                bno = bpref;
+                                goto gotit;
+                        }
+                }
+        }
+
+	bno = ext2_mapsearch(fs, bbp, bpref);
+	if (bno < 0)
+		return (0);
+
+gotit:
+	setbit(bbp, (daddr_t)bno);
+	EXT2_LOCK(ump);
+	fs->e2fs->e2fs_fbcount--;
+	fs->e2fs_gd[cg].ext2bgd_nbfree--;
+	fs->e2fs_fmod = 1;
+	EXT2_UNLOCK(ump);
+	bdwrite(bp);
+	bno = phy_blk(cg, fs) + bno;
+        return (bno);
+}
+
+/*
+ * Initialize reservation window per inode.
+ */
+void
+ext2_init_rsv(struct inode *ip)
+{
+	struct ext2_rsv_win *rp;
+
+	rp = malloc(sizeof(struct ext2_rsv_win),
+	    M_EXT2NODE, M_WAITOK | M_ZERO);
+
+	/* 
+         * If malloc failed, we just do not use the
+	 * reservation window mechanism.
+	 */
+	if (rp == NULL)
+		return;
+
+	rp->rsv_start = EXT2_RSV_NOT_ALLOCATED;
+	rp->rsv_end = EXT2_RSV_NOT_ALLOCATED;
+
+	rp->rsv_goal_size = EXT2_RSV_DEFAULT_RESERVE_BLKS;
+	rp->rsv_alloc_hit = 0;
+
+	ip->i_rsv = rp;
+} 
+
+/*
+ * Discard reservation window.
+ *
+ * It is called during the following situations:
+ * 1. free an inode
+ * 2. sync inode
+ * 3. truncate a file
+ */
+void
+ext2_discard_rsv(struct inode *ip)
+{
+	struct ext2_rsv_win *rp;
+
+	if (ip->i_rsv == NULL) 
+                return;
+
+	rp = ip->i_rsv;
+
+        /* If reservation window is empty, nothing to do */
+	if (rp->rsv_end == EXT2_RSV_NOT_ALLOCATED)
+                return;
+
+        EXT2_TREE_LOCK(ip->i_e2fs);
+        ext2_remove_rsv_win(ip->i_e2fs, rp);
+        EXT2_TREE_UNLOCK(ip->i_e2fs);
+        rp->rsv_goal_size = EXT2_RSV_DEFAULT_RESERVE_BLKS;
+}
+
+/*
+ * Remove an ext2_rsv_win structure from the RB tree.
+ */
+static void
+ext2_remove_rsv_win(struct m_ext2fs *fs, struct ext2_rsv_win *rp)
+{
+	RB_REMOVE(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rp);
+	rp->rsv_start = EXT2_RSV_NOT_ALLOCATED;
+	rp->rsv_end = EXT2_RSV_NOT_ALLOCATED;
+	rp->rsv_alloc_hit = 0;
+}
+
+/*
+ * Check whether bpref is in the reservation window.
+ */
+static int
+ext2_bpref_in_rsv(struct ext2_rsv_win *rp, int32_t bpref)
+{
+        if (bpref >= 0 && (bpref < rp->rsv_start || bpref > rp->rsv_end))
+                return (0);
+
+        return (1);
+}
+
+/*
+ * Search the RB tree for the window that includes bpref, or
+ * the previous one if bpref is not in any window.
+ */
+static struct ext2_rsv_win *
+ext2_search_rsv(struct ext2_rsv_win_tree *root, int32_t start)
+{
+        struct ext2_rsv_win *prev, *next;
+
+        if (RB_EMPTY(root))
+                return (NULL);
+
+        next = RB_ROOT(root);
+        do {
+                prev = next;
+                if (start < next->rsv_start)
+                        next = RB_LEFT(next, rsv_link);
+                else if (start > next->rsv_end)
+                        next = RB_RIGHT(next, rsv_link);
+                else
+                        return (next);
+        } while (next != NULL);
+
+        if (prev->rsv_start > start) {
+                next = RB_PREV(ext2_rsv_win_tree, root, prev);
+                if (next != NULL)
+                        prev = next;
+        }
+
+        return (prev);
+}
+
+/*
+ * Find a reservation window by given range from start to
+ * the end of this cylinder group.
+ */
+static int
+ext2_find_rsv(struct ext2_rsv_win *search, struct ext2_rsv_win *rp,
+    struct m_ext2fs *fs, int32_t start, int cg)
+{
+        struct ext2_rsv_win *rsv, *prev;
+        int32_t cur;
+        int size = rp->rsv_goal_size;
+
+        if (search == NULL) {
+                rp->rsv_start = start & ~7;
+                rp->rsv_end = start + size - 1;
+                rp->rsv_alloc_hit = 0;
+
+                RB_INSERT(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rp);
+
+                return (0);
+        }
+
+        /*
+         * Make the start of the reservation window byte-aligned
+         * so that a free block can be found with bit operations
+         * in the ext2_search_next_block() function.
+         */
+        cur = start & ~7;
+        rsv = search;
+        prev = NULL;
+
+        while (1) {
+                if (cur <= rsv->rsv_end)
+                        cur = rsv->rsv_end + 1;
+
+                if (dtog(fs, cur) != cg)
+                        return (-1);
+
+                prev = rsv;
+                rsv = RB_NEXT(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rsv);
+
+                if (rsv == NULL)
+                        break;
+
+                if (cur + size <= rsv->rsv_start)
+                        break;
+        }
+
+        if (prev != rp && rp->rsv_end != EXT2_RSV_NOT_ALLOCATED)
+                ext2_remove_rsv_win(fs, rp);
+
+        rp->rsv_start = cur;
+        rp->rsv_end = cur + size - 1;
+        rp->rsv_alloc_hit = 0;
+
+        if (prev != rp)
+                RB_INSERT(ext2_rsv_win_tree, fs->e2fs_rsv_tree, rp);
+
+        return (0);
+}
+
+/*
+ * Find a free block by given range from bpref to
+ * the end of this cylinder group.
+ */
+static daddr_t
+ext2_search_next_block(struct m_ext2fs *fs, char *bbp, int bpref, int cg)
+{
+        daddr_t bno;
+        int start, loc, len, map, i;
+
+        start = bpref / NBBY;
+        len = howmany(fs->e2fs->e2fs_fpg, NBBY) - start;
+        loc = skpc(0xff, len, &bbp[start]);
+        if (loc == 0)
+                return (-1);
+
+        i = start + len - loc;
+        map = bbp[i];
+        bno = i * NBBY;
+        for (i = 1; i < (1 << NBBY); i <<= 1, bno++) {
+                if ((map & i) == 0)
+                        return (bno);
+        }
+
+        return (-1);
+}
+
+/*
+ * Allocate a new reservation window.
+ */
+static int
+ext2_alloc_new_rsv(struct inode *ip, int cg, struct buf *bp, int32_t bpref)
+{
+        struct m_ext2fs *fs;
+        struct ext2_rsv_win *rp, *search;
+        char *bbp;
+        int start, size, ret;
+
+        fs = ip->i_e2fs;
+        rp = ip->i_rsv;
+        bbp = bp->b_data;
+        size = rp->rsv_goal_size;
+
+        if (bpref <= 0)
+                start = phy_blk(cg, fs);
+        else
+                start = bpref;
+
+        /* Dynamically increase the size of window */
+        if (rp->rsv_end != EXT2_RSV_NOT_ALLOCATED) {
+                if (rp->rsv_alloc_hit >
+                    ((rp->rsv_end - rp->rsv_start + 1) / 2)) {
+                        size = size * 2;
+                        if (size > EXT2_RSV_MAX_RESERVE_BLKS)
+                                size = EXT2_RSV_MAX_RESERVE_BLKS;
+                        rp->rsv_goal_size = size;
+                }
+        }
+
+        EXT2_TREE_LOCK(fs);
+
+        search = ext2_search_rsv(fs->e2fs_rsv_tree, start);
+
+repeat:
+        ret = ext2_find_rsv(search, rp, fs, start, cg);
+        if (ret < 0) {
+                if (rp->rsv_end != EXT2_RSV_NOT_ALLOCATED)
+                        ext2_remove_rsv_win(fs, rp);
+                EXT2_TREE_UNLOCK(fs);
+                return (-1);
+        }
+        EXT2_TREE_UNLOCK(fs);
+
+        start = dtogd(fs, rp->rsv_start);
+        start = ext2_search_next_block(fs, bbp, start, cg);
+        if (start < 0) {
+                EXT2_TREE_LOCK(fs);
+                if (rp->rsv_end != EXT2_RSV_NOT_ALLOCATED)
+                        ext2_remove_rsv_win(fs, rp);
+                EXT2_TREE_UNLOCK(fs);
+                return (-1);
+        }
+
+        start = phy_blk(cg, fs) + start;
+        if (start >= rp->rsv_start && start <= rp->rsv_end)
+                return (0);
+
+        search = rp;
+        EXT2_TREE_LOCK(fs);
+        goto repeat;
+}
+
+/*
+ * Allocate a free block from reservation window.
+ */
+static u_long
+ext2_rsvalloc(struct m_ext2fs *fs, struct inode *ip, int cg,
+    struct buf *bp, int32_t bpref, int size)
+{
+        struct ext2_rsv_win *rp;
+        int ret;
+
+        rp = ip->i_rsv;
+        if (rp == NULL)
+                return (ext2_alloc_blk(ip, cg, bp, bpref, NULL));
+
+        if (rp->rsv_end == EXT2_RSV_NOT_ALLOCATED ||
+            !ext2_bpref_in_rsv(rp, bpref)) {
+                ret = ext2_alloc_new_rsv(ip, cg, bp, bpref);
+                if (ret < 0)
+                        return (0);
+        }
+
+        return (ext2_alloc_blk(ip, cg, bp, bpref, rp));
+}
+
+/*
+ * Allocate a block using reservation window in ext2 file system.
+ *
+ * NOTE: This function will replace the ext2_alloc() function.
+ */
+int
+ext2_alloc_rsv(struct inode *ip, int32_t lbn, int32_t bpref,
+    int size, struct ucred *cred, int32_t *bnp)
+{
+	struct m_ext2fs *fs;
+	struct ext2mount *ump;
+        struct buf *bp;
+	int32_t bno = 0;
+	int i, cg, error;
+
+	*bnp = 0;
+	fs = ip->i_e2fs;
+	ump = ip->i_ump;
+	mtx_assert(EXT2_MTX(ump), MA_OWNED);
+
+	if (size == fs->e2fs_bsize && fs->e2fs->e2fs_fbcount == 0)
+		goto nospace;
+	if (cred->cr_uid != 0 && 
+	    fs->e2fs->e2fs_fbcount < fs->e2fs->e2fs_rbcount)
+		goto nospace;
+
+	if (bpref >= fs->e2fs->e2fs_bcount)
+		bpref = 0;
+	if (bpref == 0)
+		cg = ino_to_cg(fs, ip->i_number);
+	else
+		cg = dtog(fs, bpref);
+
+        /* If cg has some free blocks, then try to allocate a free block from this cg */
+        if (fs->e2fs_gd[cg].ext2bgd_nbfree > 0) {
+                /* Read block bitmap from buffer */
+                EXT2_UNLOCK(ump);
+                error = bread(ip->i_devvp,
+                    fsbtodb(fs, fs->e2fs_gd[cg].ext2bgd_b_bitmap),
+                    (int)fs->e2fs_bsize, NOCRED, &bp);
+                if (error) {
+                        brelse(bp);
+                        goto ioerror;
+                }
+
+                EXT2_RSV_LOCK(ip);
+                /* Try to allocate from reservation window */
+                bno = ext2_rsvalloc(fs, ip, cg, bp, bpref, size);
+                EXT2_RSV_UNLOCK(ip);
+                if (bno > 0)
+                        goto allocated;
+
+                brelse(bp);
+                EXT2_LOCK(ump);
+        }
+
+        /* Otherwise, try to allocate a free block from the remaining groups. */
+        cg = (cg + 1) % fs->e2fs_gcount;
+        for (i = 1; i < fs->e2fs_gcount; i++) {
+                if (fs->e2fs_gd[cg].ext2bgd_nbfree > 0) {
+                        /* Read block bitmap from buffer */
+                        EXT2_UNLOCK(ump);
+                        error = bread(ip->i_devvp,
+                            fsbtodb(fs, fs->e2fs_gd[cg].ext2bgd_b_bitmap),
+                            (int)fs->e2fs_bsize, NOCRED, &bp);
+                        if (error) {
+                                brelse(bp);
+                                goto ioerror;
+                        }
+
+                        EXT2_RSV_LOCK(ip);
+                        bno = ext2_rsvalloc(fs, ip, cg, bp, -1, size);
+                        EXT2_RSV_UNLOCK(ip);
+                        if (bno > 0)
+                                goto allocated;
+
+                        brelse(bp);
+                        EXT2_LOCK(ump);
+                }
+
+                cg++;
+                if (cg == fs->e2fs_gcount)
+                        cg = 0;
+        }
+
+allocated:
+        if (bno > 0) {
+                ip->i_next_alloc_block = lbn;
+                ip->i_next_alloc_goal = bno;
+
+                ip->i_blocks += btodb(fs->e2fs_bsize);
+                ip->i_flag |= IN_CHANGE | IN_UPDATE;
+                *bnp = bno;
+                return (0);
+        }
+
+nospace:
+	EXT2_UNLOCK(ump);
+	ext2_fserr(fs, cred->cr_uid, "file system full");
+	uprintf("\n%s: write failed, file system is full\n", fs->e2fs_fsmnt);
+	return (ENOSPC);
+
+ioerror:
+        ext2_fserr(fs, cred->cr_uid, "file system IO error");
+        uprintf("\n%s: write failed, file system IO error\n", fs->e2fs_fsmnt);
+        return (EIO);
+}
 
 int
 ext2_alloc(ip, lbn, bpref, size, cred, bnp)
@@ -923,9 +1413,11 @@
 		start = 0;
 		loc = skpc(0xff, len, &bbp[start]);
 		if (loc == 0) {
-			printf("start = %d, len = %d, fs = %s\n",
-				start, len, fs->e2fs_fsmnt);
-			panic("ext2fs_alloccg: map corrupted");
+                        /* XXX: just for reservation window */
+                        return -1;
+			/*printf("start = %d, len = %d, fs = %s\n",*/
+				/*start, len, fs->e2fs_fsmnt);*/
+			/*panic("ext2fs_alloccg: map corrupted");*/
 			/* NOTREACHED */
 		}
 	}
diff -urN /usr/src/sys/fs/ext2fs/ext2_balloc.c new/ext2_balloc.c
--- /usr/src/sys/fs/ext2fs/ext2_balloc.c	2010-01-14 22:30:54.000000000 +0800
+++ new/ext2_balloc.c	2010-08-19 02:47:29.000000000 +0800
@@ -49,6 +49,7 @@
 #include <fs/ext2fs/fs.h>
 #include <fs/ext2fs/ext2_extern.h>
 #include <fs/ext2fs/ext2_mount.h>
+#include <fs/ext2fs/ext2_rsv_win.h>
 /*
  * Balloc defines the structure of file system storage
  * by allocating the physical blocks on a device given
@@ -78,6 +79,9 @@
 	fs = ip->i_e2fs;
 	ump = ip->i_ump;
 
+        if (ip->i_rsv == NULL)
+                ext2_init_rsv(ip);
+
 	/*
 	 * check if this is a sequential block allocation. 
 	 * If so, increment next_alloc fields to allow ext2_blkpref 
@@ -136,9 +140,9 @@
 			else
 				nsize = fs->e2fs_bsize;
 			EXT2_LOCK(ump);
-			error = ext2_alloc(ip, lbn,
-			    ext2_blkpref(ip, lbn, (int)lbn, &ip->i_db[0], 0),
-			    nsize, cred, &newb);
+			error = ext2_alloc_rsv(ip, lbn,
+				    ext2_blkpref(ip, lbn, (int)lbn, &ip->i_db[0], 0),
+				    nsize, cred, &newb);
 			if (error)
 				return (error);
 			bp = getblk(vp, lbn, nsize, 0, 0, 0);
@@ -170,9 +174,9 @@
 		EXT2_LOCK(ump);
 		pref = ext2_blkpref(ip, lbn, indirs[0].in_off + 
 					     EXT2_NDIR_BLOCKS, &ip->i_db[0], 0);
-	        if ((error = ext2_alloc(ip, lbn, pref, 
-			(int)fs->e2fs_bsize, cred, &newb)))
-			return (error);
+	        if ((error = ext2_alloc_rsv(ip, lbn, pref, 
+				(int)fs->e2fs_bsize, cred, &newb)))
+				return (error);
 		nb = newb;
 		bp = getblk(vp, indirs[1].in_lbn, fs->e2fs_bsize, 0, 0, 0);
 		bp->b_blkno = fsbtodb(fs, newb);
@@ -211,7 +215,7 @@
 		if (pref == 0)
 			pref = ext2_blkpref(ip, lbn, indirs[i].in_off, bap,
 						bp->b_lblkno);
-		error =  ext2_alloc(ip, lbn, pref, (int)fs->e2fs_bsize, cred, &newb);
+		error =  ext2_alloc_rsv(ip, lbn, pref, (int)fs->e2fs_bsize, cred, &newb);
 		if (error) {
 			brelse(bp);
 			return (error);
@@ -250,8 +254,8 @@
 		EXT2_LOCK(ump);
 		pref = ext2_blkpref(ip, lbn, indirs[i].in_off, &bap[0], 
 				bp->b_lblkno);
-		if ((error = ext2_alloc(ip,
-		    lbn, pref, (int)fs->e2fs_bsize, cred, &newb)) != 0) {
+		if ((error = ext2_alloc_rsv(ip, lbn, pref,
+				(int)fs->e2fs_bsize, cred, &newb)) != 0) {
 			brelse(bp);
 			return (error);
 		}
diff -urN /usr/src/sys/fs/ext2fs/ext2_inode.c new/ext2_inode.c
--- /usr/src/sys/fs/ext2fs/ext2_inode.c	2010-01-14 22:30:54.000000000 +0800
+++ new/ext2_inode.c	2010-08-19 02:47:29.000000000 +0800
@@ -52,6 +52,7 @@
 #include <fs/ext2fs/ext2fs.h>
 #include <fs/ext2fs/fs.h>
 #include <fs/ext2fs/ext2_extern.h>
+#include <fs/ext2fs/ext2_rsv_win.h>
 
 static int ext2_indirtrunc(struct inode *, int32_t, int32_t, int32_t, int,
 	    long *);
@@ -153,6 +154,11 @@
 	}
 	fs = oip->i_e2fs;
 	osize = oip->i_size;
+
+        EXT2_RSV_LOCK(oip);
+	ext2_discard_rsv(oip);
+        EXT2_RSV_UNLOCK(oip);
+
 	/*
 	 * Lengthen the size of the file. We must ensure that the
 	 * last byte of the file is allocated. Since the smallest
@@ -484,6 +490,10 @@
 	if (prtactive && vrefcnt(vp) != 0)
 		vprint("ext2_inactive: pushing active", vp);
 
+        EXT2_RSV_LOCK(ip);
+        ext2_discard_rsv(ip);
+        EXT2_RSV_UNLOCK(ip);
+
 	/*
 	 * Ignore inodes related to stale file handles.
 	 */
@@ -525,11 +535,21 @@
 	if (prtactive && vrefcnt(vp) != 0)
 		vprint("ufs_reclaim: pushing active", vp);
 	ip = VTOI(vp);
+
 	if (ip->i_flag & IN_LAZYMOD) {
 		ip->i_flag |= IN_MODIFIED;
 		ext2_update(vp, 0);
 	}
 	vfs_hash_remove(vp);
+
+        EXT2_RSV_LOCK(ip);
+        if (ip->i_rsv != NULL) {
+                free(ip->i_rsv, M_EXT2NODE);
+                ip->i_rsv = NULL;
+        }
+        EXT2_RSV_UNLOCK(ip);
+        mtx_destroy(&ip->i_rsv_lock);
+
 	free(vp->v_data, M_EXT2NODE);
 	vp->v_data = 0;
 	vnode_destroy_vobject(vp);
diff -urN /usr/src/sys/fs/ext2fs/ext2_rsv_win.h new/ext2_rsv_win.h
--- /usr/src/sys/fs/ext2fs/ext2_rsv_win.h	1970-01-01 08:00:00.000000000 +0800
+++ new/ext2_rsv_win.h	2010-08-19 02:47:29.000000000 +0800
@@ -0,0 +1,78 @@
+/*-
+ * Copyright (c) 2010, 2010 Zheng Liu <lz@freebsd.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * $FreeBSD: src/sys/fs/ext2fs/ext2_rsv_win.h,v 0.1 2010/05/08 12:41:51 lz Exp $
+ */
+#ifndef _FS_EXT2FS_EXT2_RSV_WIN_H_
+#define _FS_EXT2FS_EXT2_RSV_WIN_H_
+
+#include <sys/tree.h>
+
+#define EXT2_RSV_DEFAULT_RESERVE_BLKS 8
+#define EXT2_RSV_MAX_RESERVE_BLKS     1024
+#define EXT2_RSV_NOT_ALLOCATED        0
+
+#define EXT2_RSV_LOCK(ip)   mtx_lock(&(ip)->i_rsv_lock)
+#define EXT2_RSV_UNLOCK(ip) mtx_unlock(&(ip)->i_rsv_lock)
+
+#define EXT2_TREE_LOCK(fs)   mtx_lock(&(fs)->e2fs_rsv_lock)
+#define EXT2_TREE_UNLOCK(fs) mtx_unlock(&(fs)->e2fs_rsv_lock)
+
+/*
+ * Reservation window entry
+ */
+struct ext2_rsv_win {
+	RB_ENTRY(ext2_rsv_win) rsv_link; /* RB tree links */
+
+	int32_t rsv_goal_size; /* Default reservation window size */
+	int32_t rsv_alloc_hit; /* Number of blocks allocated from this window */
+
+	int32_t rsv_start; /* First block of window */
+	int32_t rsv_end;   /* Last block of window */
+};
+
+RB_HEAD(ext2_rsv_win_tree, ext2_rsv_win);
+
+static __inline int
+ext2_rsv_win_cmp(const struct ext2_rsv_win *a,
+		 const struct ext2_rsv_win *b)
+{
+	if (a->rsv_start < b->rsv_start)
+		return (-1);
+	if (a->rsv_start == b->rsv_start)
+		return (0);
+
+	return (1);
+}
+RB_PROTOTYPE(ext2_rsv_win_tree, ext2_rsv_win, rsv_link, ext2_rsv_win_cmp);
+
+/* predefine */
+struct inode;
+/* ext2_alloc.c */
+void    ext2_init_rsv(struct inode *ip);
+void    ext2_discard_rsv(struct inode *ip);
+int     ext2_alloc_rsv(struct inode *, int32_t, int32_t, int, struct ucred *, int32_t *);
+
+#endif /* !_FS_EXT2FS_EXT2_RSV_WIN_H_ */
diff -urN /usr/src/sys/fs/ext2fs/ext2_vfsops.c new/ext2_vfsops.c
--- /usr/src/sys/fs/ext2fs/ext2_vfsops.c	2010-01-14 22:30:54.000000000 +0800
+++ new/ext2_vfsops.c	2010-08-19 02:47:29.000000000 +0800
@@ -1,4 +1,4 @@
-/*-
+/*
  *  modified for EXT2FS support in Lites 1.1
  *
  *  Aug 1995, Godmar Back (gback@cs.utah.edu)
@@ -61,6 +61,7 @@
 #include <fs/ext2fs/fs.h>
 #include <fs/ext2fs/ext2_extern.h>
 #include <fs/ext2fs/ext2fs.h>
+#include <fs/ext2fs/ext2_rsv_win.h>
 
 static int	ext2_flushfiles(struct mount *mp, int flags, struct thread *td);
 static int	ext2_mountfs(struct vnode *, struct mount *);
@@ -95,9 +96,9 @@
 static int	compute_sb_data(struct vnode * devvp,
 		    struct ext2fs * es, struct m_ext2fs * fs);
 
-static const char *ext2_opts[] = { "from", "export", "acls", "noexec",
-    "noatime", "union", "suiddir", "multilabel", "nosymfollow",
-    "noclusterr", "noclusterw", "force", NULL };
+static const char *ext2_opts[] = { "acls", "async", "export", "force",
+    "from", "multilabel", "noatime", "noclusterr", "noclusterw",
+    "noexec", "nosymfollow", "suiddir", "union", NULL };
 
 /*
  * VFS Operations.
@@ -581,6 +582,14 @@
 	if ((error = compute_sb_data(devvp, ump->um_e2fs->e2fs, ump->um_e2fs)))
 		goto out;
 
+	/* Initialize the reservation window index and lock */
+	bzero(&ump->um_e2fs->e2fs_rsv_lock, sizeof(struct mtx));
+	mtx_init(&ump->um_e2fs->e2fs_rsv_lock,
+            "rsv tree lock", NULL, MTX_DEF);
+        ump->um_e2fs->e2fs_rsv_tree = malloc(sizeof(struct ext2_rsv_win_tree),
+            M_EXT2MNT, M_WAITOK | M_ZERO);
+	RB_INIT(ump->um_e2fs->e2fs_rsv_tree);
+
 	brelse(bp);
 	bp = NULL;
 	fs = ump->um_e2fs;
@@ -680,6 +689,8 @@
 	g_topology_unlock();
 	PICKUP_GIANT();
 	vrele(ump->um_devvp);
+        free(fs->e2fs_rsv_tree, M_EXT2MNT);
+	mtx_destroy(&fs->e2fs_rsv_lock);
 	free(fs->e2fs_gd, M_EXT2MNT);
 	free(fs->e2fs_contigdirs, M_EXT2MNT);
 	free(fs->e2fs, M_EXT2MNT);
@@ -919,6 +930,10 @@
 	ip->i_prealloc_count = 0;
 	ip->i_prealloc_block = 0;
 
+	bzero(&ip->i_rsv_lock, sizeof(struct mtx));
+	mtx_init(&ip->i_rsv_lock, "inode rsv lock", NULL, MTX_DEF);
+        ip->i_rsv = NULL;
+
 	/*
 	 * Now we want to make sure that block pointers for unused
 	 * blocks are zeroed out - ext2_balloc depends on this
diff -urN /usr/src/sys/fs/ext2fs/ext2fs.h new/ext2fs.h
--- /usr/src/sys/fs/ext2fs/ext2fs.h	2010-01-14 22:30:54.000000000 +0800
+++ new/ext2fs.h	2010-08-19 02:47:29.000000000 +0800
@@ -38,6 +38,7 @@
 #define _FS_EXT2FS_EXT2_FS_H
 
 #include <sys/types.h>
+#include <sys/lock.h>
 
 /*
  * Special inode numbers
@@ -174,6 +175,9 @@
 	char e2fs_wasvalid;       /* valid at mount time */
 	off_t e2fs_maxfilesize;
 	struct ext2_gd *e2fs_gd; /* Group Descriptors */
+
+	struct mtx e2fs_rsv_lock;                /* Protect reservation window RB tree */
+	struct ext2_rsv_win_tree *e2fs_rsv_tree; /* Reservation window index */
 };
 
 /*
diff -urN /usr/src/sys/fs/ext2fs/inode.h new/inode.h
--- /usr/src/sys/fs/ext2fs/inode.h	2010-01-14 22:30:54.000000000 +0800
+++ new/inode.h	2010-08-19 02:47:29.000000000 +0800
@@ -100,6 +100,10 @@
 	int32_t		i_gen;		/* Generation number. */
 	u_int32_t	i_uid;		/* File owner. */
 	u_int32_t	i_gid;		/* File group. */
+
+        /* Fields for reservation window */
+        struct mtx          i_rsv_lock; /* Protects i_rsv */
+	struct ext2_rsv_win *i_rsv;     /* Reservation window */
 };
 
 /*


--BXVAT5kNtrzKuDFl--

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 17:42:06 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C80361065670;
	Mon,  8 Nov 2010 17:42:06 +0000 (UTC)
	(envelope-from sarawgi.aditya@gmail.com)
Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com
	[209.85.212.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 6706D8FC1C;
	Mon,  8 Nov 2010 17:42:06 +0000 (UTC)
Received: by vws20 with SMTP id 20so304346vws.13
	for <multiple recipients>; Mon, 08 Nov 2010 09:42:06 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:references:mime-version:content-type:content-disposition
	:in-reply-to:user-agent;
	bh=mVm82mMOju9/lRR1t+H4KvygCcVHzO9IC7P0G4Xoec0=;
	b=mZl+1r8gul7aUFomCPoCIrBmGirOVQbTSCNUwtQDVYK9xVDch6YhyXCOtODUoNuHfq
	AD23+OICwH28W+zu14ms8/EFuP7+suDQ8dpxSj6hzMXWA4ArXHfHUaS2Jo0okfvjKh8E
	dBJ19bsxCKYu8NwTOhc0x3ef8GKf0WU6ayuYc=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	b=QikkURkohMdyJYDueRVpcaH+sCbxKmtABQgy8zSaRRUyMWRuscE7O7gRGQ5bQbxMOy
	vOvpHJGOJJdObvAOfDkt3ZTwdrVr03fTtu6OKsS719Wcv65cX+Hj9zMHNKuGR0RqwJML
	oiyc/KkDKkPzMyi6kkwwG3OTpF//7QA9Nm5ZA=
Received: by 10.224.199.6 with SMTP id eq6mr4503915qab.272.1289236803254;
	Mon, 08 Nov 2010 09:20:03 -0800 (PST)
Received: from earth ([183.87.49.109])
	by mx.google.com with ESMTPS id y21sm79987yhc.14.2010.11.08.09.20.00
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Mon, 08 Nov 2010 09:20:02 -0800 (PST)
Date: Mon, 8 Nov 2010 22:51:39 +0530
From: Aditya Sarawgi <sarawgi.aditya@gmail.com>
To: Gleb Kurtsou <gleb.kurtsou@gmail.com>
Message-ID: <20101108172136.GA2066@earth>
References: <ib8qda$nat$1@dough.gmane.org>
 <20101108143130.GA2799@tops>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20101108143130.GA2799@tops>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 17:42:07 -0000

On Mon, Nov 08, 2010 at 04:31:30PM +0200, Gleb Kurtsou wrote:
> On (08/11/2010 13:28), Ivan Voras wrote:
> > I was looking at fusefs sources and there is a dance it does with the
> > Giant lock which looks fishy.
> It's intended to be fishy. No kernel level locks should be held before
> returning to userland, in other words on each syscall vnode is locked (+
> Giant lock for fs if needed), then it's unlocked by filesystem and
> relocked upon callback from userspace. puffs is MPSAFE if that could be
> of any help for you.
> 
> > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
> > mentions, but if I understand it correctly only these "active" instances:
> > 
> > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
> > out what they're guarding
> > 2) some manual locking and unlocking in nfsclient which appears to only
> > guard printf() (???)
> Somewhat unrelated, but does the NFS client unlock vnodes while
> sending/waiting for an RPC reply? I thought it does, but I'm not sure.
> 
> > 3) some more locking in nfsserver which apparently is only there to
> > guard the underlying local file system
> > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but
> > doesn't do much of its own interfacing with it
> > 
> > Except for these, is there any more magic that would need to be resolved
> > to excise Giant from VFS?
> Kostik was working on it.
> 
> > Would it be correct to think that coda is the single biggest obstacle?
> Filesystem should be marked as MPSAFE, it's not D_NEEDGIANT flag but
> MNTK_MPSAFE. A lot of filesystems are still locked by Giant, e.g. ext2fs,
> smbfs, nwfs, ntfs, etc.
>

ext2fs on 9-CURRENT is MPSAFE.  
 
> > 
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 17:49:07 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 803A71065675
	for <freebsd-fs@FreeBSD.ORG>; Mon,  8 Nov 2010 17:49:07 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id EF7368FC1D
	for <freebsd-fs@FreeBSD.ORG>; Mon,  8 Nov 2010 17:49:06 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA8Hmnsq085404;
	Mon, 8 Nov 2010 18:49:04 +0100 (CET)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA8HmnLS085403;
	Mon, 8 Nov 2010 18:48:49 +0100 (CET) (envelope-from olli)
Date: Mon, 8 Nov 2010 18:48:49 +0100 (CET)
Message-Id: <201011081748.oA8HmnLS085403@lurza.secnetix.de>
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com
In-Reply-To: <AANLkTinJEZ+QBVWuEFfjsyo8FXb1_iFCa0aHOjT2MVwB@mail.gmail.com>
X-Newsgroups: list.freebsd-fs
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX)
	(FreeBSD/6.4-PRERELEASE-20080904 (i386))
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Mon, 08 Nov 2010 18:49:05 +0100 (CET)
Cc: 
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
	?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 17:49:07 -0000

Monthadar Al Jaberi <monthadar@gmail.com> wrote:
 > I don't know if I am asking in the wrong place, but it has to do with
 > a filesystem and onboard flash (16MB) on a RouterStation Pro board.
 > 
 > I am running a FreeBSD Current 201010, with the kernel configuration
 > file specified in /usr/src/sys/mips/conf/AR71XX with device
 > geom_redboot.
 > 
 > but I get this error when I try to mount from flash:
 > mount /dev/redboot/fs /var/fs
 > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size
 > 8192: Invalid argument
 > 
 > So I guessed it has to do with the flash configured in 64k sectors
 > according to the boot output.
 > ...
 > mx25l0: <M25Pxx Flash Family> at cs 0 on spibus0
 > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
 > ...

Historically UFS/FFS supports only 512 bytes per sector.
I think it was patched at some point in the past to support
2048 bytes per sector, too, which is used by some MOD media
and DVD-RAM.  I'm pretty sure it does _not_ support 65536
bytes per sector (someone please correct me if I'm wrong).

 > So I just tried to change SBLOCKSIZE from 8192 to 65536 in
 > /usr/src/sys/ufs/ffs/fs.h, but then I got this error:

That won't work.  The media sector size is a hard limit;
the driver will refuse to read or write anything that is
not aligned to the media sector size.  Changing the size
of the super block (SBLOCKSIZE) won't help much.

 > mount /dev/redboot/fs /mnt/fs
 > g_vfs_done():redboot/fs[READ(offset=8192, length=65536)]error = 22
 > mount: /dev/redboot/fs : Invalid argument

The UFS code tries to read the super block at offset 8192,
which is not aligned correctly (it's not a multiple of the
sector size).
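
As a rough illustration of that alignment rule (only a sketch, not the
actual GEOM or UFS code; the helper name io_is_aligned is made up for
this example), the check amounts to a modulo test on offset and length:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: returns 1 if both the
 * offset and the length of a request are multiples of the media
 * sector size, 0 otherwise.
 */
static int
io_is_aligned(uint64_t offset, uint64_t length, uint64_t sectorsize)
{
	if (sectorsize == 0)
		return (0);
	return (offset % sectorsize == 0 && length % sectorsize == 0);
}
```

With the values from the error message, io_is_aligned(8192, 65536, 65536)
is 0, because 8192 is not a multiple of 65536.  The same request against
a 512-byte sector size would pass the test.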

I think UFS is not the right file system to put on a flash
media that has 256 sectors of 65536 bytes.  In theory you
could insert a translation layer that converts 512-byte
access to 65536-byte access (requiring a read-modify-write
operation when writing).  Maybe gnop(8) can do this, it
has a sector size option, but I haven't tried it.  Anyway,
that would be extremely inefficient.
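
To show why such a layer is expensive, here is a minimal sketch of the
read-modify-write step.  It is not how gnop(8) or any existing GEOM
class is implemented; the flash is simulated by a small array in memory,
and all names (write_small, dev_read_big, dev_write_big) are
hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SMALL_SECTOR	512u	/* sector size the file system expects */
#define BIG_SECTOR	65536u	/* native sector size of the flash */
#define NBIG		4u	/* tiny simulated device for this sketch */

/* Simulated flash: only whole BIG_SECTOR transfers are allowed. */
static uint8_t flash[NBIG * BIG_SECTOR];
static unsigned big_writes;	/* number of whole-sector writes issued */

static void
dev_read_big(uint32_t idx, uint8_t *buf)
{
	memcpy(buf, &flash[(size_t)idx * BIG_SECTOR], BIG_SECTOR);
}

static void
dev_write_big(uint32_t idx, const uint8_t *buf)
{
	memcpy(&flash[(size_t)idx * BIG_SECTOR], buf, BIG_SECTOR);
	big_writes++;
}

/*
 * Write one 512-byte sector by reading the enclosing 64k sector,
 * patching 512 bytes in a bounce buffer and writing all 64k back.
 * Returns 0 on success, -1 if lba lies beyond the simulated device.
 */
static int
write_small(uint32_t lba, const uint8_t *data)
{
	static uint8_t bounce[BIG_SECTOR];
	uint64_t byteoff = (uint64_t)lba * SMALL_SECTOR;
	uint32_t idx = (uint32_t)(byteoff / BIG_SECTOR);
	uint32_t within = (uint32_t)(byteoff % BIG_SECTOR);

	if (idx >= NBIG)
		return (-1);
	dev_read_big(idx, bounce);			/* read */
	memcpy(&bounce[within], data, SMALL_SECTOR);	/* modify */
	dev_write_big(idx, bounce);			/* write */
	return (0);
}
```

Every 512-byte write moves 65536 bytes in each direction, i.e. 128 times
the payload, and on real flash the write-back would also involve an
erase of the whole sector.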

Another possibility is to create a memory device with
mdconfig(8) and use "dd bs=65536" to copy the contents
of the flash device to the memory disk, then you can mount
the memory disk.  When you need to write any modifications
back to the flash device, you have to umount the memory disk
and use dd again (with if= and of= reversed, of course).
Make sure that the size of the memory disk is a multiple
of 65536, too.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht München,
HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"Documentation is like sex; when it's good, it's very, very good,
and when it's bad, it's better than nothing."
        -- Dick Brandon

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 17:53:49 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4837A10656B0
	for <fs@freebsd.org>; Mon,  8 Nov 2010 17:53:49 +0000 (UTC)
	(envelope-from julian@freebsd.org)
Received: from out-0.mx.aerioconnect.net (out-0-37.mx.aerioconnect.net
	[216.240.47.97])
	by mx1.freebsd.org (Postfix) with ESMTP id 274668FC19
	for <fs@freebsd.org>; Mon,  8 Nov 2010 17:53:48 +0000 (UTC)
Received: from idiom.com (postfix@mx0.idiom.com [216.240.32.160])
	by out-0.mx.aerioconnect.net (8.13.8/8.13.8) with ESMTP id
	oA8HrBZl019638; Mon, 8 Nov 2010 09:53:31 -0800
X-Client-Authorized: MaGic Cook1e
X-Client-Authorized: MaGic Cook1e
Received: from julian-mac.elischer.org
	(h-67-100-89-137.snfccasy.static.covad.net [67.100.89.137])
	by idiom.com (Postfix) with ESMTP id 66F7E2D610B;
	Fri,  5 Nov 2010 22:15:58 -0700 (PDT)
Message-ID: <4CD4E492.3090002@freebsd.org>
Date: Fri, 05 Nov 2010 22:16:02 -0700
From: Julian Elischer <julian@freebsd.org>
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US;
	rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6
MIME-Version: 1.0
To: "Mikhail T." <mi+thunw@aldan.algebra.com>
References: <4CD04AEC.8040607@aldan.algebra.com> <4CD051A9.7090200@freebsd.org>
	<4CD0660E.2000102@aldan.algebra.com> <4CD06C4B.80100@freebsd.org>
	<4CD0895A.5030402@aldan.algebra.com> <4CD09830.3030400@freebsd.org>
	<4CD48F81.1080201@aldan.algebra.com>
In-Reply-To: <4CD48F81.1080201@aldan.algebra.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.67 on 216.240.47.51
Cc: fs@freebsd.org
Subject: Re: iozone-ing an SSD (Re: Using an SSD "disk" for /)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 17:53:49 -0000

On 11/5/10 4:13 PM, Mikhail T. wrote:
> Hello!
>
> So, after an earlier inquiry, I went ahead and purchased an SSD
> (Crucial's CTFDDAC128MAG-1G1) and put it to some testing today.
>
> The computer is Dell Poweredge 2900, running FreeBSD-8.1/amd64 (the
> October 10th snapshot). Generic kernel. The system drive (for now) is
> traditional "real" HD -- a 15K RPM by Fujitsu (MAX3073RC), I ran `iozone
> -a' 4 times:
>
>      1. On /var/tmp -- freshly newfs-ed by the sysinstall on the Fujitsu
> drive (/dev/da0).
>      2. On the SSD (/dev/ad4) freshly newfs-ed by me without ANY options
> (no softupdates).
>      3. On the SSD (/dev/ad4) freshly newfs-ed by me with very large -e
> and -a options. Reading the man-page, I figured, any parameters
> mentioning "cylinders" can be set to very large values...
>      4. On the SSD (/dev/da1) connected to the server's mpt-controller,
> rather than the plain SATA port -- using the same filesystem created in
> 3. above (no reformatting). (The 2.5" can't be secured in the 3.5" slot
> and is simply hanging in the air on the SATA/SAS connectors.)
>
> The results can be found in 4 HTML files found at:
> http://aldan.algebra.com/~mi/io/ (The original iozone-created Excel
> files are there too.)
>
> They puzzle... Fujitsu, for example, is not an OBVIOUS loser -- it beats
> the SSD in a number of file-size record-length combinations. I also
> can't explain the differences between different takes on the SSD.
>
> And, lastly, there is a surprising (to me) spike in "Record Rewrite"
> throughput -- for both SSD and HD -- for large files when the reclen is
> 64. Using reclen of 128 results in much worsening throughput --
> especially for the Fujitsu.
>
> I wonder, if these data can be exploited to come up with better newfs
> parameters for the modern disks (SSD and not)... Comments? Thanks!

I have no idea about that brand of SSD, but the industry benchmark for
disk-replacement SSDs at the moment is the newest Intel drives.
>      -mi
>


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 18:00:35 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 144271065673;
	Mon,  8 Nov 2010 18:00:35 +0000 (UTC)
	(envelope-from gleb.kurtsou@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 6797D8FC18;
	Mon,  8 Nov 2010 18:00:34 +0000 (UTC)
Received: by bwz3 with SMTP id 3so5134163bwz.13
	for <multiple recipients>; Mon, 08 Nov 2010 10:00:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:references:mime-version:content-type:content-disposition
	:in-reply-to:user-agent;
	bh=umDWEwINQ2QVEKnfcknVxs77tOY/6h8O7vg+W2V2o+E=;
	b=wwcR35J9uSsP0o3UJhgBZRFKTLJepY8bKGhpZ9m9yyq5fbhMaT+t4KqbMaraDB0QS3
	O8ek0BlF9OhEgzlEAi6vJySDAemQOJ+bsge0ThmW2WXVbaN84pO/q21SQpILLnAgw9pS
	xfOSFhz4limUZzrSK5YeuVpeOBHmLMT4fWoS8=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	b=XTNB99wqrByXu2Kh8E98jTiKx2M5iMg5XCNLXL20UFG1WnFKJOimOO9QnB6Vd7ANVm
	wydXOJJYq0VfSrzFVfn1fwLevTolImus0p75wcFWtRguxYqPREtynFg/2I1qIPkJDng8
	JEdt8A+GvpC+aqVuDpWZI6qiGytBn3ctOdios=
Received: by 10.204.120.136 with SMTP id d8mr5093014bkr.152.1289239233254;
	Mon, 08 Nov 2010 10:00:33 -0800 (PST)
Received: from localhost ([91.187.5.20])
	by mx.google.com with ESMTPS id v25sm148936bkt.18.2010.11.08.10.00.31
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Mon, 08 Nov 2010 10:00:32 -0800 (PST)
Date: Mon, 8 Nov 2010 20:00:28 +0200
From: Gleb Kurtsou <gleb.kurtsou@gmail.com>
To: Aditya Sarawgi <sarawgi.aditya@gmail.com>
Message-ID: <20101108180028.GA3964@tops>
References: <ib8qda$nat$1@dough.gmane.org> <20101108143130.GA2799@tops>
	<20101108172136.GA2066@earth>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20101108172136.GA2066@earth>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 18:00:35 -0000

On (08/11/2010 22:51), Aditya Sarawgi wrote:
> On Mon, Nov 08, 2010 at 04:31:30PM +0200, Gleb Kurtsou wrote:
> > On (08/11/2010 13:28), Ivan Voras wrote:
> > > I was looking at fusefs sources and there is a dance it does with the
> > > Giant lock which looks fishy.
> > It's intended to be fishy. No kernel-level locks should be held when
> > returning to userland. In other words, on each syscall the vnode is
> > locked (plus the Giant lock for the fs if needed), then it's unlocked by
> > the filesystem and relocked upon callback from userspace. puffs is
> > MPSAFE, if that is of any help to you.
> > 
> > > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
> > > mentions, but if I understand it correctly only these "active" instances:
> > > 
> > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
> > > out what they're guarding
> > > 2) some manual locking and unlocking in nfsclient which appears to only
> > > guard printf() (???)
> > Somewhat unrelated, but does the NFS client unlock vnodes while
> > sending/waiting for an RPC reply? I thought it does, but I'm not sure.
> > 
> > > 3) some more locking in nfsserver which apparently is only there to
> > > guard the underlying local file system
> > > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but
> > > doesn't do much of its own interfacing with it
> > > 
> > > Except for these, is there any more magic that would need to be resolved
> > > to excise Giant from VFS?
> > Kostik was working on it.
> > 
> > > Would it be correct to think that coda is the single biggest obstacle?
> > A filesystem should be marked as MPSAFE; the flag is not D_NEEDGIANT but
> > MNTK_MPSAFE. A lot of filesystems are still locked by Giant, e.g. ext2fs,
> > smbfs, nwfs, ntfs, etc.
> >
> 
> ext2fs on 9-CURRENT is MPSAFE.  
Didn't check it for a while, sorry.

But there's a deadlock in ext2_rename: it doesn't follow the vnode
locking order (parent -> child) by doing vn_lock(fvp). The problem can't
be fixed in a generic way at the moment; the best solution would
probably be to follow UFS and unlock all vnodes, lock them one by one, and
relookup.  The same applies to tmpfs.


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 18:16:14 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 62D8B1065693;
	Mon,  8 Nov 2010 18:16:14 +0000 (UTC)
	(envelope-from sarawgi.aditya@gmail.com)
Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com
	[209.85.161.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 057F58FC12;
	Mon,  8 Nov 2010 18:16:13 +0000 (UTC)
Received: by gxk9 with SMTP id 9so3767691gxk.13
	for <multiple recipients>; Mon, 08 Nov 2010 10:16:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:references:mime-version:content-type:content-disposition
	:in-reply-to:user-agent;
	bh=ZvfRvxD+AWPjqTosS2n/4a1mmbAGx28wQ3OTCTC3S1s=;
	b=gqGWVnl/HTXj/fV0zAOnXzy3Ze9cuGUcgqXLcld5H0JsqN6Mwzwy8Jx3S1oguJ/yth
	RING6XwWByB4n4jXZdTp3vS7ub25/GIxacBGApLWB+4C0ABo4gBSiO82KKHAMxWZ9qwb
	l+LRoKMGRymYL6EQSk8poziNg4pM38RxWqFQ4=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	b=kbKh8QwRk7RQkI/zkr/7M0BH59f6PqZzGOCSSQsXQQAVvI/uwB2AMY+1fyh/0FX3oj
	0wwBh13TsxMP/geP+3C0ufP6joj2abTuU8gN2Ogd3nVnZQDMSpKyoHfKVbZwMylqkAZa
	rIRp3sX2OfQnEFEoVBS5VtEv1Cc4ndtyrtO74=
Received: by 10.150.146.17 with SMTP id t17mr939166ybd.337.1289240173058;
	Mon, 08 Nov 2010 10:16:13 -0800 (PST)
Received: from earth ([183.87.49.109])
	by mx.google.com with ESMTPS id v8sm113910yba.14.2010.11.08.10.16.10
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Mon, 08 Nov 2010 10:16:12 -0800 (PST)
Date: Mon, 8 Nov 2010 23:47:49 +0530
From: Aditya Sarawgi <sarawgi.aditya@gmail.com>
To: Gleb Kurtsou <gleb.kurtsou@gmail.com>
Message-ID: <20101108181748.GD2066@earth>
References: <ib8qda$nat$1@dough.gmane.org> <20101108143130.GA2799@tops>
	<20101108172136.GA2066@earth> <20101108180028.GA3964@tops>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20101108180028.GA3964@tops>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 18:16:14 -0000

On Mon, Nov 08, 2010 at 08:00:28PM +0200, Gleb Kurtsou wrote:
> On (08/11/2010 22:51), Aditya Sarawgi wrote:
> > On Mon, Nov 08, 2010 at 04:31:30PM +0200, Gleb Kurtsou wrote:
> > > On (08/11/2010 13:28), Ivan Voras wrote:
> > > > I was looking at fusefs sources and there is a dance it does with the
> > > > Giant lock which looks fishy.
> > > It's intended to be fishy. No kernel-level locks should be held when
> > > returning to userland. In other words, on each syscall the vnode is
> > > locked (plus the Giant lock for the fs if needed), then it's unlocked
> > > by the filesystem and relocked upon callback from userspace. puffs is
> > > MPSAFE, if that is of any help to you.
> > > 
> > > > Grepping for "-ir giant" in /sys/fs on 8-stable shows only a handful of
> > > > mentions, but if I understand it correctly only these "active" instances:
> > > > 
> > > > 1) one set of mtx_assert() calls on it in pseudofs, which I can't figure
> > > > out what they're guarding
> > > > 2) some manual locking and unlocking in nfsclient which appears to only
> > > > guard printf() (???)
> > > Somewhat unrelated, but does the NFS client unlock vnodes while
> > > sending/waiting for an RPC reply? I thought it does, but I'm not sure.
> > > 
> > > > 3) some more locking in nfsserver which apparently is only there to
> > > > guard the underlying local file system
> > > > 4) coda, which appears to be the only one marked with D_NEEDGIANT, but
> > > > doesn't do much of its own interfacing with it
> > > > 
> > > > Except for these, is there any more magic that would need to be resolved
> > > > to excise Giant from VFS?
> > > Kostik was working on it.
> > > 
> > > > Would it be correct to think that coda is the single biggest obstacle?
> > > A filesystem should be marked as MPSAFE; the flag is not D_NEEDGIANT but
> > > MNTK_MPSAFE. A lot of filesystems are still locked by Giant, e.g. ext2fs,
> > > smbfs, nwfs, ntfs, etc.
> > >
> > 
> > ext2fs on 9-CURRENT is MPSAFE.  
> Didn't check it for a while, sorry.
>

No Problem :)
 
> But there's a deadlock in ext2_rename: it doesn't follow the vnode
> locking order (parent -> child) by doing vn_lock(fvp). The problem can't
> be fixed in a generic way at the moment; the best solution would
> probably be to follow UFS and unlock all vnodes, lock them one by one, and
> relookup.  The same applies to tmpfs.
>

Thanks for pointing this out. Saw some mails related to this earlier.
Will take a look.  

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 19:01:33 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E87C3106564A
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 19:01:33 +0000 (UTC)
	(envelope-from carlson39@llnl.gov)
Received: from nspiron-1.llnl.gov (nspiron-1.llnl.gov [128.115.41.81])
	by mx1.freebsd.org (Postfix) with ESMTP id D32CA8FC13
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 19:01:33 +0000 (UTC)
X-Attachments: None
Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135])
	by nspiron-1.llnl.gov with ESMTP; 08 Nov 2010 10:32:55 -0800
Message-ID: <4CD84258.6090404@llnl.gov>
Date: Mon, 08 Nov 2010 10:32:56 -0800
From: Mike Carlson <carlson39@llnl.gov>
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
	rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org, pjd@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: 
Subject: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 19:01:34 -0000

I'm having a problem with striping seven 18TB RAID6 (hardware SAN) volumes
together.

Here is a quick rundown of the hardware:
* HP DL180 G6 w/12GB ram
* QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
* Winchester Hardware SAN,

    da2 at isp0 bus 0 scbus2 target 0 lun 0
    da2: <WINSYS SX2318R 373O> Fixed Direct Access SCSI-5 device
    da2: 800.000MB/s transfers
    da2: Command Queueing enabled
    da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
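
As a sanity check on the probe output above, the capacity can be recomputed
from the sector count. This is only a small sketch in plain sh arithmetic; it
assumes a shell with 64-bit integers (typical, but not guaranteed by POSIX),
and the variable names are illustrative.

```shell
#!/bin/sh
# Sector count and sector size as printed in the da2 probe line.
sectors=39064944640
sector_bytes=512

bytes=$((sectors * sector_bytes))

# The "MB" figure in the probe line is in units of 2^20 bytes.
mib=$((bytes / 1048576))

# TiB scaled by 100 so that two decimals survive integer division.
tib_x100=$((bytes * 100 / 1099511627776))

printf '%d MiB (about %d.%02d TiB)\n' \
    "$mib" "$((tib_x100 / 100))" "$((tib_x100 % 100))"
```

The MiB value agrees with the 19074680MB in the probe line, so each LUN is a
little over 18 TiB, in line with the "18TB" figure quoted at the top.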


As soon as I create the volume and write data to it, it is reported as 
being corrupted:

    write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
    write# zpool scrub filevol001dd if=/dev/random
    of=/filevol001/random.dat.1 bs=1m count=1000
    write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
    1000+0 records in
    1000+0 records out
    1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
    write# cd /filevol001/
    write# ls
    random.dat.1
    write# md5 *
    MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
    write# cp random.dat.1 random.dat.2
    cp: random.dat.1: Input/output error
    write# zpool status
       pool: filevol001
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         filevol001  ONLINE       0     0     0
           da2       ONLINE       0     0     0
           da3       ONLINE       0     0     0
           da4       ONLINE       0     0     0
           da5       ONLINE       0     0     0
           da6       ONLINE       0     0     0
           da7       ONLINE       0     0     0
           da8       ONLINE       0     0     0

    errors: No known data errors
    write# zpool scrub filevol001
    write# zpool status
       pool: filevol001
      state: ONLINE
    status: One or more devices has experienced an error resulting in data
         corruption.  Applications may be affected.
    action: Restore the file in question if possible.  Otherwise restore the
         entire pool from backup.
        see: http://www.sun.com/msg/ZFS-8000-8A
      scrub: scrub completed after 0h0m with 2437 errors on Mon Nov  8
    10:14:20 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         filevol001  ONLINE       0     0 2.38K
           da2       ONLINE       0     0 1.24K  12K repaired
           da3       ONLINE       0     0 1.12K
           da4       ONLINE       0     0 1.13K
           da5       ONLINE       0     0 1.27K
           da6       ONLINE       0     0     0
           da7       ONLINE       0     0     0
           da8       ONLINE       0     0     0

    errors: 2437 data errors, use '-v' for a list
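
For reading the CKSUM column: the pool-level 2.38K and the 2437 errors on the
last line appear to be the same number. The sketch below assumes that zpool
abbreviates counters by dividing by 1024 and rounding to two decimals; that
scaling is an assumption here, not something the output states.

```shell
#!/bin/sh
# Error count from the "errors:" line of the zpool status output.
errors=2437

# Assumed abbreviation rule: divide by 1024, round to two decimals.
# Work in hundredths; adding 512 (half of 1024) rounds to nearest.
k_x100=$(( (errors * 100 + 512) / 1024 ))

printf '%d.%02dK\n' "$((k_x100 / 100))" "$((k_x100 % 100))"
```

Under that assumption the script prints the same 2.38K that appears in the
pool-level CKSUM column.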

However, if I create a 'raidz' volume, no errors occur:

    write# zpool destroy filevol001
    write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
    write# zpool status
       pool: filevol001
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         filevol001  ONLINE       0     0     0
           raidz1    ONLINE       0     0     0
             da2     ONLINE       0     0     0
             da3     ONLINE       0     0     0
             da4     ONLINE       0     0     0
             da5     ONLINE       0     0     0
             da6     ONLINE       0     0     0
             da7     ONLINE       0     0     0
             da8     ONLINE       0     0     0

    errors: No known data errors
    write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
    1000+0 records in
    1000+0 records out
    1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
    write# zpool scrub filevol001

    dmesg output:
    write# zpool status
       pool: filevol001
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    09:54:51 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         filevol001  ONLINE       0     0     0
           raidz1    ONLINE       0     0     0
             da2     ONLINE       0     0     0
             da3     ONLINE       0     0     0
             da4     ONLINE       0     0     0
             da5     ONLINE       0     0     0
             da6     ONLINE       0     0     0
             da7     ONLINE       0     0     0
             da8     ONLINE       0     0     0

    errors: No known data errors
    write# ls
    random.dat.1
    write# cp random.dat.1 random.dat.2
    write# cp random.dat.1 random.dat.3
    write# cp random.dat.1 random.dat.4
    write# cp random.dat.1 random.dat.5
    write# cp random.dat.1 random.dat.6
    write# cp random.dat.1 random.dat.7
    write# md5 *
    MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
    MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
    MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
    MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
    MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
    MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
    MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2

What is also odd is that if I create 7 separate ZFS volumes, they do not
report any data corruption:

    write# zpool destroy filevol001
    write# zpool create test01 da2
    write# zpool create test02 da3
    write# zpool create test03 da4
    write# zpool create test04 da5
    write# zpool create test05 da6
    write# zpool create test06 da7
    write# zpool create test07 da8
    write# zpool status
       pool: test01
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test01      ONLINE       0     0     0
           da2       ONLINE       0     0     0

    errors: No known data errors

       pool: test02
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test02      ONLINE       0     0     0
           da3       ONLINE       0     0     0

    errors: No known data errors

       pool: test03
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test03      ONLINE       0     0     0
           da4       ONLINE       0     0     0

    errors: No known data errors

       pool: test04
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test04      ONLINE       0     0     0
           da5       ONLINE       0     0     0

    errors: No known data errors

       pool: test05
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test05      ONLINE       0     0     0
           da6       ONLINE       0     0     0

    errors: No known data errors

       pool: test06
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test06      ONLINE       0     0     0
           da7       ONLINE       0     0     0

    errors: No known data errors

       pool: test07
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test07      ONLINE       0     0     0
           da8       ONLINE       0     0     0

    errors: No known data errors
    write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
    1000+0 records in
    1000+0 records out
    1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
    write# cd /tmp/
    write# md5 /tmp/random.dat.1
    MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp
    random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1
    /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
    write# md5 /test*/*
    MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03
    ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ;
    zpool scrub test07
    write# zpool status
       pool: test01
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    10:27:49 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test01      ONLINE       0     0     0
           da2       ONLINE       0     0     0

    errors: No known data errors

       pool: test02
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    10:27:52 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test02      ONLINE       0     0     0
           da3       ONLINE       0     0     0

    errors: No known data errors

       pool: test03
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    10:27:54 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test03      ONLINE       0     0     0
           da4       ONLINE       0     0     0

    errors: No known data errors

       pool: test04
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    10:27:57 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test04      ONLINE       0     0     0
           da5       ONLINE       0     0     0

    errors: No known data errors

       pool: test05
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    10:28:00 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test05      ONLINE       0     0     0
           da6       ONLINE       0     0     0

    errors: No known data errors

       pool: test06
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    10:28:02 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test06      ONLINE       0     0     0
           da7       ONLINE       0     0     0

    errors: No known data errors

       pool: test07
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    10:28:05 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test07      ONLINE       0     0     0
           da8       ONLINE       0     0     0

    errors: No known data errors

Based on these results, I've drawn the following conclusion:
* ZFS single pool per device = OKAY
* ZFS raidz of all devices = OKAY
* ZFS stripe of all devices = NOT OKAY

The results are immediate, and I know ZFS will self-heal, so is that 
what it is doing behind my back and just not reporting it? Is this a ZFS 
bug with striping vs. raidz?
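
For anyone who wants to repeat the write/copy/verify steps above on another
layout, here is a minimal sketch. TARGET, SIZE_MB and COPIES are hypothetical
names introduced for this example; TARGET defaults to a scratch directory so
the script runs anywhere, and would be pointed at the pool's mountpoint (for
example /filevol001) for a real test. It uses cmp instead of md5 so that it
does not depend on which checksum tool is installed.

```shell
#!/bin/sh
# Write a random file, copy it several times, and compare each copy
# byte-for-byte with the source. This only detects read/copy problems;
# it does not replace "zpool scrub" and "zpool status".
set -e

TARGET=${TARGET:-$(mktemp -d)}   # hypothetical: set to the pool mountpoint
SIZE_MB=${SIZE_MB:-8}            # the test in this message used 1000
COPIES=${COPIES:-3}

src="$TARGET/random.dat.1"
# bs is given in bytes so the same command works with BSD and GNU dd.
dd if=/dev/urandom of="$src" bs=1048576 count="$SIZE_MB" 2>/dev/null

failures=0
i=2
while [ "$i" -le $((COPIES + 1)) ]; do
    dst="$TARGET/random.dat.$i"
    # cp fails if the source cannot be read back (e.g. Input/output error);
    # cmp -s catches copies that differ silently.
    if cp "$src" "$dst" && cmp -s "$src" "$dst"; then
        echo "copy $i: OK"
    else
        echo "copy $i: MISMATCH or I/O error"
        failures=$((failures + 1))
    fi
    i=$((i + 1))
done

echo "failures: $failures"
```

Usage would be something like `TARGET=/filevol001 SIZE_MB=1000 sh verify.sh`
(the script name is made up), followed by a scrub and `zpool status -v` to
see whether the pool itself reports checksum errors.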

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 19:06:43 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A19011065673
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 19:06:43 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta01.westchester.pa.mail.comcast.net
	(qmta01.westchester.pa.mail.comcast.net [76.96.62.16])
	by mx1.freebsd.org (Postfix) with ESMTP id 4C3DF8FC20
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 19:06:42 +0000 (UTC)
Received: from omta19.westchester.pa.mail.comcast.net ([76.96.62.98])
	by qmta01.westchester.pa.mail.comcast.net with comcast
	id UbpN1f00127AodY51j6jJx; Mon, 08 Nov 2010 19:06:43 +0000
Received: from koitsu.dyndns.org ([98.248.41.155])
	by omta19.westchester.pa.mail.comcast.net with comcast
	id Uj6i1f0043LrwQ23fj6ius; Mon, 08 Nov 2010 19:06:43 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id DA0469B427; Mon,  8 Nov 2010 11:06:40 -0800 (PST)
Date: Mon, 8 Nov 2010 11:06:40 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Mike Carlson <carlson39@llnl.gov>
Message-ID: <20101108190640.GA15661@icarus.home.lan>
References: <4CD84258.6090404@llnl.gov>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4CD84258.6090404@llnl.gov>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, pjd@freebsd.org
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 19:06:43 -0000

On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote:
> I'm having a problem with striping seven 18TB RAID6 (hardware SAN)
> volumes together.
> 
> Here is a quick rundown of the hardware:
> * HP DL180 G6 w/12GB ram
> * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
> * Winchester Hardware SAN,
> 
>    da2 at isp0 bus 0 scbus2 target 0 lun 0
>    da2: <WINSYS SX2318R 373O> Fixed Direct Access SCSI-5 device
>    da2: 800.000MB/s transfers
>    da2: Command Queueing enabled
>    da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
> 
> 
> As soon as I create the volume and write data to it, it is reported
> as being corrupted:
> 
>    write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
>    write# zpool scrub filevol001dd if=/dev/random
>    of=/filevol001/random.dat.1 bs=1m count=1000
>    write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>    1000+0 records in
>    1000+0 records out
>    1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
>    write# cd /filevol001/
>    write# ls
>    random.dat.1
>    write# md5 *
>    MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
>    write# cp random.dat.1 random.dat.2
>    cp: random.dat.1: Input/output error
>    write# zpool status
>       pool: filevol001
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         filevol001  ONLINE       0     0     0
>           da2       ONLINE       0     0     0
>           da3       ONLINE       0     0     0
>           da4       ONLINE       0     0     0
>           da5       ONLINE       0     0     0
>           da6       ONLINE       0     0     0
>           da7       ONLINE       0     0     0
>           da8       ONLINE       0     0     0
> 
>    errors: No known data errors
>    write# zpool scrub filevol001
>    write# zpool status
>       pool: filevol001
>      state: ONLINE
>    status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
>    action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>        see: http://www.sun.com/msg/ZFS-8000-8A
>      scrub: scrub completed after 0h0m with 2437 errors on Mon Nov  8
>    10:14:20 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         filevol001  ONLINE       0     0 2.38K
>           da2       ONLINE       0     0 1.24K  12K repaired
>           da3       ONLINE       0     0 1.12K
>           da4       ONLINE       0     0 1.13K
>           da5       ONLINE       0     0 1.27K
>           da6       ONLINE       0     0     0
>           da7       ONLINE       0     0     0
>           da8       ONLINE       0     0     0
> 
>    errors: 2437 data errors, use '-v' for a list
> 
> However, if I create a 'raidz' volume, no errors occur:
> 
>    write# zpool destroy filevol001
>    write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
>    write# zpool status
>       pool: filevol001
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         filevol001  ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             da2     ONLINE       0     0     0
>             da3     ONLINE       0     0     0
>             da4     ONLINE       0     0     0
>             da5     ONLINE       0     0     0
>             da6     ONLINE       0     0     0
>             da7     ONLINE       0     0     0
>             da8     ONLINE       0     0     0
> 
>    errors: No known data errors
>    write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>    1000+0 records in
>    1000+0 records out
>    1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
>    write# zpool scrub filevol001
> 
>    dmesg output:
>    write# zpool status
>       pool: filevol001
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    09:54:51 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         filevol001  ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             da2     ONLINE       0     0     0
>             da3     ONLINE       0     0     0
>             da4     ONLINE       0     0     0
>             da5     ONLINE       0     0     0
>             da6     ONLINE       0     0     0
>             da7     ONLINE       0     0     0
>             da8     ONLINE       0     0     0
> 
>    errors: No known data errors
>    write# ls
>    random.dat.1
>    write# cp random.dat.1 random.dat.2
>    write# cp random.dat.1 random.dat.3
>    write# cp random.dat.1 random.dat.4
>    write# cp random.dat.1 random.dat.5
>    write# cp random.dat.1 random.dat.6
>    write# cp random.dat.1 random.dat.7
>    write# md5 *
>    MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
>    MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
>    MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
>    MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
>    MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
>    MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
>    MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2
> 
> What is also odd is that if I create 7 separate ZFS volumes, they do not
> report any data corruption:
> 
>    write# zpool destroy filevol001
>    write# zpool create test01 da2
>    write# zpool create test02 da3
>    write# zpool create test03 da4
>    write# zpool create test04 da5
>    write# zpool create test05 da6
>    write# zpool create test06 da7
>    write# zpool create test07 da8
>    write# zpool status
>       pool: test01
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test01      ONLINE       0     0     0
>           da2       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test02
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test02      ONLINE       0     0     0
>           da3       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test03
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test03      ONLINE       0     0     0
>           da4       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test04
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test04      ONLINE       0     0     0
>           da5       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test05
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test05      ONLINE       0     0     0
>           da6       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test06
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test06      ONLINE       0     0     0
>           da7       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test07
>      state: ONLINE
>      scrub: none requested
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test07      ONLINE       0     0     0
>           da8       ONLINE       0     0     0
> 
>    errors: No known data errors
>    write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
>    1000+0 records in
>    1000+0 records out
>    1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
>    write# cd /tmp/
>    write# md5 /tmp/random.dat.1
>    MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp
>    random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1
>    /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
>    write# md5 /test*/*
>    MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>    write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03
>    ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ;
>    zpool scrub test07
>    write# zpool status
>       pool: test01
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    10:27:49 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test01      ONLINE       0     0     0
>           da2       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test02
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    10:27:52 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test02      ONLINE       0     0     0
>           da3       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test03
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    10:27:54 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test03      ONLINE       0     0     0
>           da4       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test04
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    10:27:57 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test04      ONLINE       0     0     0
>           da5       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test05
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    10:28:00 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test05      ONLINE       0     0     0
>           da6       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test06
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    10:28:02 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test06      ONLINE       0     0     0
>           da7       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
>       pool: test07
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>    10:28:05 2010
>    config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         test07      ONLINE       0     0     0
>           da8       ONLINE       0     0     0
> 
>    errors: No known data errors
> 
> Based on these results, I've drawn the following conclusion:
> * ZFS single pool per device = OKAY
> * ZFS raidz of all devices = OKAY
> * ZFS stripe of all devices = NOT OKAY
> 
> The results are immediate, and I know ZFS will self-heal, so is that
> what it is doing behind my back and just not reporting it? Is this a
> ZFS bug with striping vs. raidz?

Can you reproduce this problem using RELENG_8?  Please try one of the
below snapshots.

ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
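
When re-running the stripe test on a snapshot, it may help to reduce the
CKSUM column of `zpool status` to a short per-device summary so the two
runs can be compared directly. A minimal sketch, assuming the status
layout shown in the quoted output above; the helper name `cksum_report`
is made up for illustration and is not part of any ZFS tool.

```shell
#!/bin/sh
# Sketch: list vdevs whose CKSUM counter is non-zero in `zpool status` output.
# Reads the status text on stdin; cksum_report is a hypothetical helper name.

cksum_report() {
    awk '
        # The config table starts at the NAME/STATE/READ/WRITE/CKSUM header.
        $1 == "NAME" && $5 == "CKSUM" { in_table = 1; next }
        # A blank line or the "errors:" line ends the table.
        in_table && (NF == 0 || $1 == "errors:") { in_table = 0; next }
        # Rows are: name state read write cksum [notes...]
        in_table && NF >= 5 && $5 != "0" { print $1, $5 }
    '
}
```

Usage would be along the lines of `zpool status filevol001 | cksum_report`.
An empty result means no device in the pool reported checksum errors.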

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 19:11:30 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E9C0A1065670;
	Mon,  8 Nov 2010 19:11:30 +0000 (UTC)
	(envelope-from carlson39@llnl.gov)
Received: from nspiron-1.llnl.gov (nspiron-1.llnl.gov [128.115.41.81])
	by mx1.freebsd.org (Postfix) with ESMTP id C9E918FC1B;
	Mon,  8 Nov 2010 19:11:30 +0000 (UTC)
X-Attachments: None
Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135])
	by nspiron-1.llnl.gov with ESMTP; 08 Nov 2010 11:11:30 -0800
Message-ID: <4CD84B63.4030800@llnl.gov>
Date: Mon, 08 Nov 2010 11:11:31 -0800
From: Mike Carlson <carlson39@llnl.gov>
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
	rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
References: <4CD84258.6090404@llnl.gov>
	<20101108190640.GA15661@icarus.home.lan>
In-Reply-To: <20101108190640.GA15661@icarus.home.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>,
	"pjd@freebsd.org" <pjd@freebsd.org>
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 19:11:31 -0000

On 11/08/2010 11:06 AM, Jeremy Chadwick wrote:
> On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote:
>> I'm having a problem with striping 7 18TB RAID6 (hardware SAN)
>> volumes together.
>>
>> Here is a quick rundown of the hardware:
>> * HP DL180 G6 w/12GB ram
>> * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
>> * Winchester Hardware SAN,
>>
>>     da2 at isp0 bus 0 scbus2 target 0 lun 0
>>     da2:<WINSYS SX2318R 373O>  Fixed Direct Access SCSI-5 device
>>     da2: 800.000MB/s transfers
>>     da2: Command Queueing enabled
>>     da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
>>
>>
>> As soon as I create the volume and write data to it, it is reported
>> as being corrupted:
>>
>>     write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
>>     write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>>     1000+0 records in
>>     1000+0 records out
>>     1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
>>     write# cd /filevol001/
>>     write# ls
>>     random.dat.1
>>     write# md5 *
>>     MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
>>     write# cp random.dat.1 random.dat.2
>>     cp: random.dat.1: Input/output error
>>     write# zpool status
>>        pool: filevol001
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          filevol001  ONLINE       0     0     0
>>            da2       ONLINE       0     0     0
>>            da3       ONLINE       0     0     0
>>            da4       ONLINE       0     0     0
>>            da5       ONLINE       0     0     0
>>            da6       ONLINE       0     0     0
>>            da7       ONLINE       0     0     0
>>            da8       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>     write# zpool scrub filevol001
>>     write# zpool status
>>        pool: filevol001
>>       state: ONLINE
>>     status: One or more devices has experienced an error resulting in data
>>          corruption.  Applications may be affected.
>>     action: Restore the file in question if possible.  Otherwise restore the
>>          entire pool from backup.
>>         see: http://www.sun.com/msg/ZFS-8000-8A
>>       scrub: scrub completed after 0h0m with 2437 errors on Mon Nov  8
>>     10:14:20 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          filevol001  ONLINE       0     0 2.38K
>>            da2       ONLINE       0     0 1.24K  12K repaired
>>            da3       ONLINE       0     0 1.12K
>>            da4       ONLINE       0     0 1.13K
>>            da5       ONLINE       0     0 1.27K
>>            da6       ONLINE       0     0     0
>>            da7       ONLINE       0     0     0
>>            da8       ONLINE       0     0     0
>>
>>     errors: 2437 data errors, use '-v' for a list
>>
>> However, if I create a 'raidz' volume, no errors occur:
>>
>>     write# zpool destroy filevol001
>>     write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
>>     write# zpool status
>>        pool: filevol001
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          filevol001  ONLINE       0     0     0
>>            raidz1    ONLINE       0     0     0
>>              da2     ONLINE       0     0     0
>>              da3     ONLINE       0     0     0
>>              da4     ONLINE       0     0     0
>>              da5     ONLINE       0     0     0
>>              da6     ONLINE       0     0     0
>>              da7     ONLINE       0     0     0
>>              da8     ONLINE       0     0     0
>>
>>     errors: No known data errors
>>     write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>>     1000+0 records in
>>     1000+0 records out
>>     1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
>>     write# zpool scrub filevol001
>>
>>     dmesg output:
>>     write# zpool status
>>        pool: filevol001
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     09:54:51 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          filevol001  ONLINE       0     0     0
>>            raidz1    ONLINE       0     0     0
>>              da2     ONLINE       0     0     0
>>              da3     ONLINE       0     0     0
>>              da4     ONLINE       0     0     0
>>              da5     ONLINE       0     0     0
>>              da6     ONLINE       0     0     0
>>              da7     ONLINE       0     0     0
>>              da8     ONLINE       0     0     0
>>
>>     errors: No known data errors
>>     write# ls
>>     random.dat.1
>>     write# cp random.dat.1 random.dat.2
>>     write# cp random.dat.1 random.dat.3
>>     write# cp random.dat.1 random.dat.4
>>     write# cp random.dat.1 random.dat.5
>>     write# cp random.dat.1 random.dat.6
>>     write# cp random.dat.1 random.dat.7
>>     write# md5 *
>>     MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>     MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>     MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>     MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>     MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>     MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>     MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>
>> What is also odd, is if I create 7 separate ZFS volumes, they do not
>> report any data corruption:
>>
>>     write# zpool destroy filevol001
>>     write# zpool create test01 da2
>>     write# zpool create test02 da3
>>     write# zpool create test03 da4
>>     write# zpool create test04 da5
>>     write# zpool create test05 da6
>>     write# zpool create test06 da7
>>     write# zpool create test07 da8
>>     write# zpool status
>>        pool: test01
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test01      ONLINE       0     0     0
>>            da2       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test02
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test02      ONLINE       0     0     0
>>            da3       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test03
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test03      ONLINE       0     0     0
>>            da4       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test04
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test04      ONLINE       0     0     0
>>            da5       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test05
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test05      ONLINE       0     0     0
>>            da6       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test06
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test06      ONLINE       0     0     0
>>            da7       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test07
>>       state: ONLINE
>>       scrub: none requested
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test07      ONLINE       0     0     0
>>            da8       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>     write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
>>     1000+0 records in
>>     1000+0 records out
>>     1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
>>     write# cd /tmp/
>>     write# md5 /tmp/random.dat.1
>>     MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp
>>     random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1
>>     /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
>>     write# md5 /test*/*
>>     MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>     write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03
>>     ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ;
>>     zpool scrub test07
>>     write# zpool status
>>        pool: test01
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     10:27:49 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test01      ONLINE       0     0     0
>>            da2       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test02
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     10:27:52 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test02      ONLINE       0     0     0
>>            da3       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test03
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     10:27:54 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test03      ONLINE       0     0     0
>>            da4       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test04
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     10:27:57 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test04      ONLINE       0     0     0
>>            da5       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test05
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     10:28:00 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test05      ONLINE       0     0     0
>>            da6       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test06
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     10:28:02 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test06      ONLINE       0     0     0
>>            da7       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>>        pool: test07
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>     10:28:05 2010
>>     config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test07      ONLINE       0     0     0
>>            da8       ONLINE       0     0     0
>>
>>     errors: No known data errors
>>
>> Based on these results, I've drawn the following conclusion:
>> * ZFS single pool per device = OKAY
>> * ZFS raidz of all devices = OKAY
>> * ZFS stripe of all devices = NOT OKAY
>>
>> The results are immediate, and I know ZFS will self-heal, so is that
>> what it is doing behind my back and just not reporting it? Is this a
>> ZFS bug with striping vs. raidz?
> Can you reproduce this problem using RELENG_8?  Please try one of the
> below snapshots.
>
> ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
>
The server is in a data center with limited access control. Do I have the
option of using a particular CVS tag (checking out via csup) and then
performing a make world/kernel?

If so, I can report back later today, otherwise it might take longer :(
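
For reference, tracking RELENG_8 from source usually follows the
Handbook sequence sketched below. This is only a sketch under stated
assumptions: the mirror name `cvsup.FreeBSD.org`, the GENERIC kernel
config, and the `run` wrapper with its `DRY_RUN` switch are illustrative
choices, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch: update an 8.1-RELEASE box to RELENG_8 via csup, then rebuild.
# Assumptions (not from the thread): the mirror host, KERNCONF=GENERIC, and
# the DRY_RUN switch, which only prints the commands instead of running them.

SUPHOST="${SUPHOST:-cvsup.FreeBSD.org}"
KERNCONF="${KERNCONF:-GENERIC}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_to_releng8() {
    # The example stable-supfile on an 8.x system should already carry
    # tag=RELENG_8; check the file before relying on that.
    run csup -h "$SUPHOST" /usr/share/examples/cvsup/stable-supfile || return 1
    run make -C /usr/src buildworld || return 1
    run make -C /usr/src buildkernel KERNCONF="$KERNCONF" || return 1
    run make -C /usr/src installkernel KERNCONF="$KERNCONF" || return 1
    echo "Reboot into the new kernel, then: make -C /usr/src installworld"
}
```

Sourcing the file and calling `update_to_releng8` with `DRY_RUN=1` set
prints the commands without touching the system. The mergemaster steps
from the Handbook are left out here and would still be needed.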

Mike C

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 19:29:54 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 52DE41065673
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 19:29:53 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta02.westchester.pa.mail.comcast.net
	(qmta02.westchester.pa.mail.comcast.net [76.96.62.24])
	by mx1.freebsd.org (Postfix) with ESMTP id F14618FC17
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 19:29:52 +0000 (UTC)
Received: from omta12.westchester.pa.mail.comcast.net ([76.96.62.44])
	by qmta02.westchester.pa.mail.comcast.net with comcast
	id UcB91f0040xGWP852jVtav; Mon, 08 Nov 2010 19:29:53 +0000
Received: from koitsu.dyndns.org ([98.248.41.155])
	by omta12.westchester.pa.mail.comcast.net with comcast
	id UjVs1f0013LrwQ23YjVskQ; Mon, 08 Nov 2010 19:29:53 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id CD4379B427; Mon,  8 Nov 2010 11:29:50 -0800 (PST)
Date: Mon, 8 Nov 2010 11:29:50 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Mike Carlson <carlson39@llnl.gov>
Message-ID: <20101108192950.GA15902@icarus.home.lan>
References: <4CD84258.6090404@llnl.gov>
	<20101108190640.GA15661@icarus.home.lan>
	<4CD84B63.4030800@llnl.gov>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4CD84B63.4030800@llnl.gov>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>,
	"pjd@freebsd.org" <pjd@freebsd.org>
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 19:29:55 -0000

On Mon, Nov 08, 2010 at 11:11:31AM -0800, Mike Carlson wrote:
> On 11/08/2010 11:06 AM, Jeremy Chadwick wrote:
> >On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote:
> >>I'm having a problem with striping 7 18TB RAID6 (hardware SAN)
> >>volumes together.
> >>
> >>Here is a quick rundown of the hardware:
> >>* HP DL180 G6 w/12GB ram
> >>* QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
> >>* Winchester Hardware SAN,
> >>
> >>    da2 at isp0 bus 0 scbus2 target 0 lun 0
> >>    da2:<WINSYS SX2318R 373O>  Fixed Direct Access SCSI-5 device
> >>    da2: 800.000MB/s transfers
> >>    da2: Command Queueing enabled
> >>    da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
> >>
> >>
> >>As soon as I create the volume and write data to it, it is reported
> >>as being corrupted:
> >>
> >>    write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
> >>    write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
> >>    1000+0 records in
> >>    1000+0 records out
> >>    1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
> >>    write# cd /filevol001/
> >>    write# ls
> >>    random.dat.1
> >>    write# md5 *
> >>    MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
> >>    write# cp random.dat.1 random.dat.2
> >>    cp: random.dat.1: Input/output error
> >>    write# zpool status
> >>       pool: filevol001
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         filevol001  ONLINE       0     0     0
> >>           da2       ONLINE       0     0     0
> >>           da3       ONLINE       0     0     0
> >>           da4       ONLINE       0     0     0
> >>           da5       ONLINE       0     0     0
> >>           da6       ONLINE       0     0     0
> >>           da7       ONLINE       0     0     0
> >>           da8       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>    write# zpool scrub filevol001
> >>    write# zpool status
> >>       pool: filevol001
> >>      state: ONLINE
> >>    status: One or more devices has experienced an error resulting in data
> >>         corruption.  Applications may be affected.
> >>    action: Restore the file in question if possible.  Otherwise restore the
> >>         entire pool from backup.
> >>        see: http://www.sun.com/msg/ZFS-8000-8A
> >>      scrub: scrub completed after 0h0m with 2437 errors on Mon Nov  8
> >>    10:14:20 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         filevol001  ONLINE       0     0 2.38K
> >>           da2       ONLINE       0     0 1.24K  12K repaired
> >>           da3       ONLINE       0     0 1.12K
> >>           da4       ONLINE       0     0 1.13K
> >>           da5       ONLINE       0     0 1.27K
> >>           da6       ONLINE       0     0     0
> >>           da7       ONLINE       0     0     0
> >>           da8       ONLINE       0     0     0
> >>
> >>    errors: 2437 data errors, use '-v' for a list
> >>
> >>However, if I create a 'raidz' volume, no errors occur:
> >>
> >>    write# zpool destroy filevol001
> >>    write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
> >>    write# zpool status
> >>       pool: filevol001
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         filevol001  ONLINE       0     0     0
> >>           raidz1    ONLINE       0     0     0
> >>             da2     ONLINE       0     0     0
> >>             da3     ONLINE       0     0     0
> >>             da4     ONLINE       0     0     0
> >>             da5     ONLINE       0     0     0
> >>             da6     ONLINE       0     0     0
> >>             da7     ONLINE       0     0     0
> >>             da8     ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>    write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
> >>    1000+0 records in
> >>    1000+0 records out
> >>    1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
> >>    write# zpool scrub filevol001
> >>
> >>    dmesg output:
> >>    write# zpool status
> >>       pool: filevol001
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    09:54:51 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         filevol001  ONLINE       0     0     0
> >>           raidz1    ONLINE       0     0     0
> >>             da2     ONLINE       0     0     0
> >>             da3     ONLINE       0     0     0
> >>             da4     ONLINE       0     0     0
> >>             da5     ONLINE       0     0     0
> >>             da6     ONLINE       0     0     0
> >>             da7     ONLINE       0     0     0
> >>             da8     ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>    write# ls
> >>    random.dat.1
> >>    write# cp random.dat.1 random.dat.2
> >>    write# cp random.dat.1 random.dat.3
> >>    write# cp random.dat.1 random.dat.4
> >>    write# cp random.dat.1 random.dat.5
> >>    write# cp random.dat.1 random.dat.6
> >>    write# cp random.dat.1 random.dat.7
> >>    write# md5 *
> >>    MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>    MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>    MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>    MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>    MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>    MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>    MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2
> >>
> >>What is also odd, is if I create 7 separate ZFS volumes, they do not
> >>report any data corruption:
> >>
> >>    write# zpool destroy filevol001
> >>    write# zpool create test01 da2
> >>    write# zpool create test02 da3
> >>    write# zpool create test03 da4
> >>    write# zpool create test04 da5
> >>    write# zpool create test05 da6
> >>    write# zpool create test06 da7
> >>    write# zpool create test07 da8
> >>    write# zpool status
> >>       pool: test01
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test01      ONLINE       0     0     0
> >>           da2       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test02
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test02      ONLINE       0     0     0
> >>           da3       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test03
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test03      ONLINE       0     0     0
> >>           da4       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test04
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test04      ONLINE       0     0     0
> >>           da5       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test05
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test05      ONLINE       0     0     0
> >>           da6       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test06
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test06      ONLINE       0     0     0
> >>           da7       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test07
> >>      state: ONLINE
> >>      scrub: none requested
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test07      ONLINE       0     0     0
> >>           da8       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>    write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
> >>    1000+0 records in
> >>    1000+0 records out
> >>    1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
> >>    write# cd /tmp/
> >>    write# md5 /tmp/random.dat.1
> >>    MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp
> >>    random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1
> >>    /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
> >>    write# md5 /test*/*
> >>    MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
> >>    write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03
> >>    ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ;
> >>    zpool scrub test07
> >>    write# zpool status
> >>       pool: test01
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    10:27:49 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test01      ONLINE       0     0     0
> >>           da2       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test02
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    10:27:52 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test02      ONLINE       0     0     0
> >>           da3       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test03
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    10:27:54 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test03      ONLINE       0     0     0
> >>           da4       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test04
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    10:27:57 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test04      ONLINE       0     0     0
> >>           da5       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test05
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    10:28:00 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test05      ONLINE       0     0     0
> >>           da6       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test06
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    10:28:02 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test06      ONLINE       0     0     0
> >>           da7       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>       pool: test07
> >>      state: ONLINE
> >>      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
> >>    10:28:05 2010
> >>    config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         test07      ONLINE       0     0     0
> >>           da8       ONLINE       0     0     0
> >>
> >>    errors: No known data errors
> >>
> >>Based on these results, I've drawn the following conclusion:
> >>* ZFS single pool per device = OKAY
> >>* ZFS raidz of all devices = OKAY
> >>* ZFS stripe of all devices = NOT OKAY
> >>
> >>The results are immediate, and I know ZFS will self-heal, so is that
> >>what it is doing behind my back and just not reporting it? Is this a
> >>ZFS bug with striping vs. raidz?
> >Can you reproduce this problem using RELENG_8?  Please try one of the
> >below snapshots.
> >
> >ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
> >
> The server is in a data center with limited access control, do I
> have the option of using a particular CVS tag (checking out via csup)
> and then performing a make world/kernel?

Doing this is more painful than, say, downloading a livefs image and
seeing if you can reproduce the problem (e.g. you won't be modifying
your existing OS installation), especially since I can't guarantee that
the problem you're seeing is fixed in RELENG_8 (hence my request to
begin with).  But if you can't boot livefs, then here you go:

You'll need some form of console access (either serial or VGA) to do the
upgrade reliably.  "Rolling back" may also not be an option since
RELENG_8 is newer than RELENG_8_1 and may have introduced some new
binaries or executables into the fray.  If you don't have console access
to this machine, if things go awry you may be SOL.  The vagueness of my
statement is intentional; I can't cover every situation that might come
to light.

Please be sure to back up your kernel configuration file before doing
the following, and make sure that the supfile shown below has
tag=RELENG_8 in it (it should).  And yes, the rm commands below are
recommended; failure to use them could result in some oddities given
that your /usr/src tree refers to RELENG_8_1 version numbers which
differ from RELENG_8.  You *do not* have to do this for ports (since for
ports, tag=. is used by default).

rm -fr /var/db/sup/src-all
rm -fr /usr/src/*
rm -fr /usr/obj/*
csup -h cvsupserver -L 2 /usr/share/examples/cvsup/stable-supfile

At this point you can restore your kernel configuration file to the
appropriate place (/sys/i386/conf, /sys/amd64/conf, etc.) and build
world/kernel as per the instructions in /usr/src/Makefile (see lines
~51-62).  ***Please do not skip any of the steps***.  Good luck.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 19:32:04 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 54A6C1065672;
	Mon,  8 Nov 2010 19:32:04 +0000 (UTC)
	(envelope-from carlson39@llnl.gov)
Received: from nspiron-1.llnl.gov (nspiron-1.llnl.gov [128.115.41.81])
	by mx1.freebsd.org (Postfix) with ESMTP id 3D4CA8FC25;
	Mon,  8 Nov 2010 19:32:04 +0000 (UTC)
X-Attachments: None
Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135])
	by nspiron-1.llnl.gov with ESMTP; 08 Nov 2010 11:32:03 -0800
Message-ID: <4CD85034.5000909@llnl.gov>
Date: Mon, 08 Nov 2010 11:32:04 -0800
From: Mike Carlson <carlson39@llnl.gov>
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
	rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
References: <4CD84258.6090404@llnl.gov>
	<20101108190640.GA15661@icarus.home.lan>
	<4CD84B63.4030800@llnl.gov>
	<20101108192950.GA15902@icarus.home.lan>
In-Reply-To: <20101108192950.GA15902@icarus.home.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>,
	"pjd@freebsd.org" <pjd@freebsd.org>
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 19:32:04 -0000

On 11/08/2010 11:29 AM, Jeremy Chadwick wrote:
> On Mon, Nov 08, 2010 at 11:11:31AM -0800, Mike Carlson wrote:
>> On 11/08/2010 11:06 AM, Jeremy Chadwick wrote:
>>> On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote:
>>>> I'm having a problem with striping 7 18TB RAID6 (hardware SAN)
>>>> volumes together.
>>>>
>>>> Here is a quick rundown of the hardware:
>>>> * HP DL180 G6 w/12GB ram
>>>> * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
>>>> * Winchester Hardware SAN,
>>>>
>>>>     da2 at isp0 bus 0 scbus2 target 0 lun 0
>>>>     da2:<WINSYS SX2318R 373O>   Fixed Direct Access SCSI-5 device
>>>>     da2: 800.000MB/s transfers
>>>>     da2: Command Queueing enabled
>>>>     da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
>>>>
>>>>
>>>> As soon as I create the volume and write data to it, it is reported
>>>> as being corrupted:
>>>>
>>>>     write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
>>>>     write# zpool scrub filevol001dd if=/dev/random
>>>>     of=/filevol001/random.dat.1 bs=1m count=1000
>>>>     write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>>>>     1000+0 records in
>>>>     1000+0 records out
>>>>     1048576000 bytes transferred in 16.472807 secs (63654968 bytes/sec)
>>>>     write# cd /filevol001/
>>>>     write# ls
>>>>     random.dat.1
>>>>     write# md5 *
>>>>     MD5 (random.dat.1) = 629f8883d6394189a1658d24a5698bb3
>>>>     write# cp random.dat.1 random.dat.2
>>>>     cp: random.dat.1: Input/output error
>>>>     write# zpool status
>>>>        pool: filevol001
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          filevol001  ONLINE       0     0     0
>>>>            da2       ONLINE       0     0     0
>>>>            da3       ONLINE       0     0     0
>>>>            da4       ONLINE       0     0     0
>>>>            da5       ONLINE       0     0     0
>>>>            da6       ONLINE       0     0     0
>>>>            da7       ONLINE       0     0     0
>>>>            da8       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>     write# zpool scrub filevol001
>>>>     write# zpool status
>>>>        pool: filevol001
>>>>       state: ONLINE
>>>>     status: One or more devices has experienced an error resulting in data
>>>>          corruption.  Applications may be affected.
>>>>     action: Restore the file in question if possible.  Otherwise restore the
>>>>          entire pool from backup.
>>>>         see: http://www.sun.com/msg/ZFS-8000-8A
>>>>       scrub: scrub completed after 0h0m with 2437 errors on Mon Nov  8
>>>>     10:14:20 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          filevol001  ONLINE       0     0 2.38K
>>>>            da2       ONLINE       0     0 1.24K  12K repaired
>>>>            da3       ONLINE       0     0 1.12K
>>>>            da4       ONLINE       0     0 1.13K
>>>>            da5       ONLINE       0     0 1.27K
>>>>            da6       ONLINE       0     0     0
>>>>            da7       ONLINE       0     0     0
>>>>            da8       ONLINE       0     0     0
>>>>
>>>>     errors: 2437 data errors, use '-v' for a list
>>>>
>>>> However, if I create a 'raidz' volume, no errors occur:
>>>>
>>>>     write# zpool destroy filevol001
>>>>     write# zpool create filevol001 raidz da2 da3 da4 da5 da6 da7 da8
>>>>     write# zpool status
>>>>        pool: filevol001
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          filevol001  ONLINE       0     0     0
>>>>            raidz1    ONLINE       0     0     0
>>>>              da2     ONLINE       0     0     0
>>>>              da3     ONLINE       0     0     0
>>>>              da4     ONLINE       0     0     0
>>>>              da5     ONLINE       0     0     0
>>>>              da6     ONLINE       0     0     0
>>>>              da7     ONLINE       0     0     0
>>>>              da8     ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>     write# dd if=/dev/random of=/filevol001/random.dat.1 bs=1m count=1000
>>>>     1000+0 records in
>>>>     1000+0 records out
>>>>     1048576000 bytes transferred in 17.135045 secs (61194821 bytes/sec)
>>>>     write# zpool scrub filevol001
>>>>
>>>>     dmesg output:
>>>>     write# zpool status
>>>>        pool: filevol001
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     09:54:51 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          filevol001  ONLINE       0     0     0
>>>>            raidz1    ONLINE       0     0     0
>>>>              da2     ONLINE       0     0     0
>>>>              da3     ONLINE       0     0     0
>>>>              da4     ONLINE       0     0     0
>>>>              da5     ONLINE       0     0     0
>>>>              da6     ONLINE       0     0     0
>>>>              da7     ONLINE       0     0     0
>>>>              da8     ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>     write# ls
>>>>     random.dat.1
>>>>     write# cp random.dat.1 random.dat.2
>>>>     write# cp random.dat.1 random.dat.3
>>>>     write# cp random.dat.1 random.dat.4
>>>>     write# cp random.dat.1 random.dat.5
>>>>     write# cp random.dat.1 random.dat.6
>>>>     write# cp random.dat.1 random.dat.7
>>>>     write# md5 *
>>>>     MD5 (random.dat.1) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>>>     MD5 (random.dat.2) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>>>     MD5 (random.dat.3) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>>>     MD5 (random.dat.4) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>>>     MD5 (random.dat.5) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>>>     MD5 (random.dat.6) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>>>     MD5 (random.dat.7) = f5e3467f61a954bc2e0bcc35d49ac8b2
>>>>
>>>> What is also odd, is if I create 7 separate ZFS volumes, they do not
>>>> report any data corruption:
>>>>
>>>>     write# zpool destroy filevol001
>>>>     write# zpool create test01 da2
>>>>     write# zpool create test02 da3
>>>>     write# zpool create test03 da4
>>>>     write# zpool create test04 da5
>>>>     write# zpool create test05 da6
>>>>     write# zpool create test06 da7
>>>>     write# zpool create test07 da8
>>>>     write# zpool status
>>>>        pool: test01
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test01      ONLINE       0     0     0
>>>>            da2       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test02
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test02      ONLINE       0     0     0
>>>>            da3       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test03
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test03      ONLINE       0     0     0
>>>>            da4       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test04
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test04      ONLINE       0     0     0
>>>>            da5       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test05
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test05      ONLINE       0     0     0
>>>>            da6       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test06
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test06      ONLINE       0     0     0
>>>>            da7       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test07
>>>>       state: ONLINE
>>>>       scrub: none requested
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test07      ONLINE       0     0     0
>>>>            da8       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>     write# dd if=/dev/random of=/tmp/random.dat.1 bs=1m count=1000
>>>>     1000+0 records in
>>>>     1000+0 records out
>>>>     1048576000 bytes transferred in 19.286735 secs (54367730 bytes/sec)
>>>>     write# cd /tmp/
>>>>     write# md5 /tmp/random.dat.1
>>>>     MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     write# cp random.dat.1 /test01 ; cp random.dat.1 /test02 ;cp
>>>>     random.dat.1 /test03 ; cp random.dat.1 /test04 ; cp random.dat.1
>>>>     /test05 ; cp random.dat.1 /test06 ; cp random.dat.1 /test07
>>>>     write# md5 /test*/*
>>>>     MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     MD5 (/test02/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     MD5 (/test03/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     MD5 (/test04/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     MD5 (/test05/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     MD5 (/test06/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     MD5 (/test07/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>>>>     write# zpool scrub test01 ; zpool scrub test02 ;zpool scrub test03
>>>>     ;zpool scrub test04 ; zpool scrub test05 ; zpool scrub test06 ;
>>>>     zpool scrub test07
>>>>     write# zpool status
>>>>        pool: test01
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     10:27:49 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test01      ONLINE       0     0     0
>>>>            da2       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test02
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     10:27:52 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test02      ONLINE       0     0     0
>>>>            da3       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test03
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     10:27:54 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test03      ONLINE       0     0     0
>>>>            da4       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test04
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     10:27:57 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test04      ONLINE       0     0     0
>>>>            da5       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test05
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     10:28:00 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test05      ONLINE       0     0     0
>>>>            da6       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test06
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     10:28:02 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test06      ONLINE       0     0     0
>>>>            da7       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>>        pool: test07
>>>>       state: ONLINE
>>>>       scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
>>>>     10:28:05 2010
>>>>     config:
>>>>
>>>>          NAME        STATE     READ WRITE CKSUM
>>>>          test07      ONLINE       0     0     0
>>>>            da8       ONLINE       0     0     0
>>>>
>>>>     errors: No known data errors
>>>>
>>>> Based on these results, I've drawn the following conclusion:
>>>> * ZFS single pool per device = OKAY
>>>> * ZFS raidz of all devices = OKAY
>>>> * ZFS stripe of all devices = NOT OKAY
>>>>
>>>> The results are immediate, and I know ZFS will self-heal, so is that
>>>> what it is doing behind my back and just not reporting it? Is this a
>>>> ZFS bug with striping vs. raidz?
>>> Can you reproduce this problem using RELENG_8?  Please try one of the
>>> below snapshots.
>>>
>>> ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/201011/
>>>
>> The server is in a data center with limited access control, do I
>> have the option of using a particular CVS tag (checking out via csup)
>> and then performing a make world/kernel?
> Doing this is more painful than, say, downloading a livefs image and
> seeing if you can reproduce the problem (e.g. you won't be modifying
> your existing OS installation), especially since I can't guarantee that
> the problem you're seeing is fixed in RELENG_8 (hence my request to
> begin with).  But if you can't boot livefs, then here you go:
>
> You'll need some form of console access (either serial or VGA) to do the
> upgrade reliably.  "Rolling back" may also not be an option since
> RELENG_8 is newer than RELENG_8_1 and may have introduced some new
> binaries or executables into the fray.  If you don't have console access
> to this machine, if things go awry you may be SOL.  The vagueness of my
> statement is intentional; I can't cover every situation that might come
> to light.
>
> Please be sure to back up your kernel configuration file before doing
> the following, and make sure that the supfile shown below has
> tag=RELENG_8 in it (it should).  And yes, the rm commands below are
> recommended; failure to use them could result in some oddities given
> that your /usr/src tree refers to RELENG_8_1 version numbers which
> differ from RELENG_8.  You *do not* have to do this for ports (since for
> ports, tag=. is used by default).
>
> rm -fr /var/db/sup/src-all
> rm -fr /usr/src/*
> rm -fr /usr/obj/*
> csup -h cvsupserver -L 2 /usr/share/examples/cvsup/stable-supfile
>
> At this point you can restore your kernel configuration file to the
> appropriate place (/sys/i386/conf, /sys/amd64/conf, etc.) and build
> world/kernel as per the instructions in /usr/src/Makefile (see lines
> ~51-62).  ***Please do not skip any of the steps***.  Good luck.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>
>
>
Ahh, point taken :) I think I'll take a trip to the datacenter and boot 
off of a thumb drive...

Thanks, Jeremy, I'll report back later!

Mike C

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 20:39:06 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 317E91065695
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 20:39:06 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from fallbackmx06.syd.optusnet.com.au
	(fallbackmx06.syd.optusnet.com.au [211.29.132.8])
	by mx1.freebsd.org (Postfix) with ESMTP id BD1A58FC0A
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 20:39:05 +0000 (UTC)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au
	[211.29.132.186])
	by fallbackmx06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	oA8KNSDh016903
	for <freebsd-fs@freebsd.org>; Tue, 9 Nov 2010 07:23:29 +1100
Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au
	(c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73])
	by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	oA8KNPJ2032284
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Tue, 9 Nov 2010 07:23:26 +1100
Date: Tue, 9 Nov 2010 07:23:25 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: freebsd-fs@freebsd.org, monthadar@gmail.com
In-Reply-To: <201011081748.oA8HmnLS085403@lurza.secnetix.de>
Message-ID: <20101109065842.R2343@besplex.bde.org>
References: <201011081748.oA8HmnLS085403@lurza.secnetix.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: 
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
 ?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 20:39:06 -0000

On Mon, 8 Nov 2010, Oliver Fromme wrote:

> Monthadar Al Jaberi <monthadar@gmail.com> wrote:
> > I don't know if I am asking in the wrong place, but it has to do with
> > filesystem and onboard flash (16MB) on a RouterStation Pro board.
> >
> > I am running a FreeBSD Current 201010, with the kernel configuration
> > file specified in /usr/src/sys/mips/conf/AR71XX with device
> > geom_redboot.
> >
> > but I get this error when I try to mount from flash:
> > mount /dev/redboot/fs /var/fs
> > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size
> > 8192: Invalid argument
> >
> > So I guessed it has to do with the flash being configured in 64k sectors
> > according to the boot output.
> > ...
> > mx25l0: <M25Pxx Flash Family> at cs 0 on spibus0
> > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
> > ...
>
> Historically UFS/FFS supports only 512 bytes per sector.
> I think it was patched at some point in the past to support
> 2048 bytes per sector, too, which is used by some MOD media
> and DVD-RAM.  I'm pretty sure it does _not_ support 65536
> bytes per sector (someone please correct me if I'm wrong).

Maybe 25 years ago, but since Net/2 (about 20 years ago) ffs hasn't really
even used sectors.  It just has a buggy superblock probe which
prevents it from determining its correct i/o size when that size
exceeds SBLOCKSIZE = 8192.

>
> > So I just tried to change SBLOCKSIZE from 8192 to 65536 in
> > /usr/src/sys/ufs/ffs/fs.h, but then I got this error:
>
> That won't work.  The media sector size is a hard limit;
> the driver will refuse to read or write anything that is
> not aligned to the media sector size.  Changing the size
> of the super block (SBLOCKSIZE) won't help much.
>
> > mount /dev/redboot/fs /mnt/fs
> > g_vfs_done():redboot/fs[READ(offset=8192, length=65536)]error = 22
> > mount: /dev/redboot/fs : Invalid argument
>
> The UFS code tries to read the super block at offset 8192,
> which is not aligned correctly (it's not a multiple of the
> sector size).

The ffs code (and also utility code like fsck_ffs) bogusly aborts the
search on the first i/o error.  Otherwise, changing just the size should
work for ffs2.

Changing both the offset and the size should work.  I think it might
just work for ffs2 once you recompile all utilities using the
changed SBLOCKSEARCH and SBLOCKSIZE.  The change will necessarily
break ffs1: from ffs/fs.h:

%  * Depending on the architecture and the media, the superblock may
%  * reside in any one of four places. For tiny media where every block 
%  * counts, it is placed at the very front of the partition. Historically,
%  * UFS1 placed it 8K from the front to leave room for the disk label and
%  * a small bootstrap. For UFS2 it got moved to 64K from the front to leave

It can't be at 8K if the media sector size exceeds 8K.  Except, with more
changes to the probe, perhaps it can be (e.g., always start by reading 128K
at offset 0, and check what is at various offsets within the 128K.  This
covers all the usual cases in 1 i/o).
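
The single-read probe described above could be sketched roughly as follows.
This is a user-space Python model, not kernel code.  It assumes a 64K media
sector size, and it looks for a made-up 4-byte marker (MAGIC) at the start of
each candidate offset instead of doing the real fs_magic check; make_reader()
and probe_one_read() are hypothetical names.

```python
WINDOW = 128 * 1024            # one aligned read that covers 0, 8K and 64K
CANDIDATES = (65536, 8192, 0)  # UFS2, UFS1 and floppy offsets from fs.h
MAGIC = b"SBLK"                # hypothetical marker, NOT the real fs_magic


def make_reader(image, sector_size):
    """Return a read_at(offset, length) that rejects unaligned I/O."""
    def read_at(offset, length):
        if offset % sector_size or length % sector_size:
            raise OSError(22, "unaligned I/O")  # 22 is EINVAL
        return bytes(image[offset:offset + length])
    return read_at


def probe_one_read(read_at):
    """Read WINDOW bytes at offset 0, then search the buffer in memory."""
    buf = read_at(0, WINDOW)
    for offset in CANDIDATES:
        if buf[offset:offset + len(MAGIC)] == MAGIC:
            return offset
    return None
```

Because the only device access is an aligned 128K read at offset 0, the 8K
candidate is examined without ever issuing an unaligned read.  The 256K
(SBLOCK_PIGGY) location lies outside the window and would still need a
separate read.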

%  * room for the disk label and a bigger bootstrap, and for really piggy
%  * systems we check at 256K from the front if the first three fail. In
%  * all cases the size of the superblock will be SBLOCKSIZE. All values are

Actually, the size of the superblock will never be SBLOCKSIZE.  It will
always be sizeof(struct fs), which is about 1500.  SBLOCKSIZE is just the
i/o size used in buggy probes for the superblock.

%  * given in byte-offset form, so they do not imply a sector size. The
%  * SBLOCKSEARCH specifies the order in which the locations should be searched.
%  */
% #define SBLOCK_FLOPPY	     0

I don't remember this ever working.  Floppies normally start with a 512-byte
boot sector.  Thus the superblock cannot start at 0 for a normal floppy,
and I don't remember anything ever supporting a sufficiently abnormal floppy
for this to work.  (My kernel has incomplete support for this, putting the
superblock at offset 512, which involves reducing MINBSIZE to 512 and using
512-blocks for everything.  Everything works except superblock size and
probing issues.  The superblock should end up beginning in the first
fs_bsize block after the boot blocks (normally 512, but could be 0 with
more work).)

% #define SBLOCK_UFS1	  8192
% #define SBLOCK_UFS2	 65536
% #define SBLOCK_PIGGY	262144
% #define SBLOCKSIZE	  8192
% #define SBLOCKSEARCH \
% 	{ SBLOCK_UFS2, SBLOCK_UFS1, SBLOCK_FLOPPY, SBLOCK_PIGGY, -1 }

Try changing SBLOCK_UFS1 to 65536 too.  Obviously this breaks the normal
ffs1 case.  Better, try changing only SBLOCKSIZE and not aborting on i/o
error.
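
A rough model of the second suggestion (a 64K probe size, and a search that
keeps going after an i/o error) might look like this.  It is a Python sketch
under stated assumptions, not the ffs mount code: SectorDevice and
find_superblock() are hypothetical, and MAGIC is a made-up marker standing in
for the real superblock validation.

```python
import errno

SBLOCK_FLOPPY = 0
SBLOCK_UFS1 = 8192
SBLOCK_UFS2 = 65536
SBLOCK_PIGGY = 262144
SBLOCKSEARCH = (SBLOCK_UFS2, SBLOCK_UFS1, SBLOCK_FLOPPY, SBLOCK_PIGGY)

MAGIC = b"SBLK"  # hypothetical marker, NOT the real fs_magic


class SectorDevice:
    """Byte store that only accepts sector-aligned reads."""

    def __init__(self, data, sector_size):
        self.data = data
        self.sector_size = sector_size

    def read(self, offset, length):
        if offset % self.sector_size or length % self.sector_size:
            raise OSError(errno.EINVAL, "unaligned I/O")
        return self.data[offset:offset + length]


def find_superblock(dev, io_size, skip_errors):
    """Try each SBLOCKSEARCH offset in order; return the first match."""
    for offset in SBLOCKSEARCH:
        try:
            block = dev.read(offset, io_size)
        except OSError:
            if skip_errors:
                continue  # move on to the next candidate location
            raise         # abort on the first i/o error
        if block.startswith(MAGIC):
            return offset
    return None
```

With 64K sectors and io_size=65536, the read at offset 8192 is the only one
that fails.  When skip_errors is false the search stops there, which matches
the READ(offset=8192, length=65536) error = 22 reported earlier; when it is
true, the remaining locations are still examined.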

> I think UFS is not the right file system to put on a flash
> media that has 256 sectors of 65536 bytes.  In theory you
> could insert a translation layer that converts 512-byte
> access to 65536-byte access (requiring a read-modify-write
> operation when writing).  Maybe gnop(8) can do this, it
> has a sector size option, but I haven't tried it.  Anyway,
> that would be extremely inefficient.
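
The read-modify-write translation mentioned in the quoted text can be
illustrated with a small Python model.  This is only a sketch of the idea, not
how gnop(8) or any GEOM class is implemented; rmw_write() and the bytearray
standing in for the flash are hypothetical.

```python
def rmw_write(media, sector_size, offset, data):
    """Write data at any byte offset using only whole-sector transfers.

    Returns the number of bytes that had to be read, which is also the
    number of bytes written back.
    """
    start = (offset // sector_size) * sector_size
    end = -(-(offset + len(data)) // sector_size) * sector_size  # round up
    buf = bytearray(media[start:end])           # read the whole sectors
    rel = offset - start
    buf[rel:rel + len(data)] = data             # modify the small range
    media[start:end] = buf                      # write whole sectors back
    return end - start
```

For a 512-byte update inside one 65536-byte sector this moves 64K in each
direction, 128 times the payload, which is the inefficiency being described.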

Hmm, 256 sectors is small.  It might have negative space for
data after using > 256 sectors for metadata.

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Mon Nov  8 22:31:18 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 97B08106564A
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 22:31:18 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 189F68FC26
	for <freebsd-fs@freebsd.org>; Mon,  8 Nov 2010 22:31:17 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1PFaF8-0006WG-A0
	for freebsd-fs@freebsd.org; Mon, 08 Nov 2010 23:31:14 +0100
Received: from cpe-188-129-102-227.dynamic.amis.hr ([188.129.102.227])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 23:31:14 +0100
Received: from ivoras by cpe-188-129-102-227.dynamic.amis.hr with local
	(Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Mon, 08 Nov 2010 23:31:14 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 08 Nov 2010 23:30:59 +0100
Lines: 40
Message-ID: <ib9tn7$hcr$1@dough.gmane.org>
References: <4CD04AEC.8040607@aldan.algebra.com>
	<4CD051A9.7090200@freebsd.org>	<4CD0660E.2000102@aldan.algebra.com>
	<4CD06C4B.80100@freebsd.org>	<4CD0895A.5030402@aldan.algebra.com>
	<4CD09830.3030400@freebsd.org>	<4CD48F81.1080201@aldan.algebra.com>
	<4CD5A5B7.4040006@aldan.algebra.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: cpe-188-129-102-227.dynamic.amis.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <4CD5A5B7.4040006@aldan.algebra.com>
Subject: Re: iozone-ing an SSD (Re: Using an SSD "disk" for /)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Nov 2010 22:31:18 -0000

On 11/06/10 20:00, Mikhail T. wrote:
> On 11/5/2010 7:13 PM, Mikhail T. wrote:
>> The results can be found in 4 HTML files found at:
>> http://aldan.algebra.com/~mi/io/ (The original iozone-created Excel
>> files are there too.)

That server doesn't respond!

> I added some more iozone runs, as well as those of rawio. These are much
> fewer (as file-system parameters don't affect rawio) and easier to
> interpret:
>
>      * It makes no difference to the SSD, whether your access is random
>        or sequential

And this is their biggest strength. Everything else, like "raw" IO speed, 
is secondary to that in the majority of serious use cases.

>      * SSD clearly beats the HD in rawrite, although, at "only" 88Mb/sec,
>        the results are far from the marketing...

Basically, you should be looking at IOPS, not MB/s.
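
To illustrate why the unit matters, here is a rough sketch in plain sh
arithmetic. The function name is made up for this example, and it reads
the "88Mb/sec" figure as megabytes per second with 1 MB = 1024 KiB, which
is an assumption. The same throughput corresponds to very different
operation counts depending on the request size:

```sh
# Convert a throughput figure into operations per second.
# usage: iops_from_throughput <MB_per_s> <request_size_KiB>
# (hypothetical helper; integer division rounds down)
iops_from_throughput() {
    echo $(( $1 * 1024 / $2 ))
}

iops_from_throughput 88 4      # 4 KiB requests   -> 22528
iops_from_throughput 88 128    # 128 KiB requests -> 704
```

A drive that sustains the first figure under random access is doing
something a rotating disk cannot, even if the MB/s number looks modest.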

>      * SSD connected to plain SATA port strongly beats the same SSD
>        connected to the fancy SAS controller (mpt)

Not really surprising. The controller might be too smart for its own 
good in this simple case.

But there could be other, more important reasons, like the controller 
disabling the drive's (in this case, SSD's) write caching, which the 
majority of "real" RAID controllers do by default. Leaving the drive's 
write cache turned on puts your data at risk (which is important if you 
are running servers). You can sort of verify this hypothesis by setting 
this loader tunable:

hw.ata.wc=0

- this should disable the disk write cache for (S)ATA drives.


From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 01:05:15 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5E1EA1065674;
	Tue,  9 Nov 2010 01:05:15 +0000 (UTC)
	(envelope-from carlson39@llnl.gov)
Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83])
	by mx1.freebsd.org (Postfix) with ESMTP id 3E8D68FC37;
	Tue,  9 Nov 2010 01:05:15 +0000 (UTC)
X-Attachments: None
Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135])
	by smtp.llnl.gov with ESMTP; 08 Nov 2010 17:05:14 -0800
Message-ID: <4CD89E4A.6000902@llnl.gov>
Date: Mon, 08 Nov 2010 17:05:14 -0800
From: Mike Carlson <carlson39@llnl.gov>
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
	rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
References: <4CD84258.6090404@llnl.gov>
	<20101108190640.GA15661@icarus.home.lan>
	<4CD84B63.4030800@llnl.gov>
	<20101108192950.GA15902@icarus.home.lan>
In-Reply-To: <20101108192950.GA15902@icarus.home.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>,
	"pjd@freebsd.org" <pjd@freebsd.org>
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 01:05:15 -0000

On 11/08/2010 11:29 AM, Jeremy Chadwick wrote:
> On Mon, Nov 08, 2010 at 11:11:31AM -0800, Mike Carlson wrote:
>> On 11/08/2010 11:06 AM, Jeremy Chadwick wrote:
>>> On Mon, Nov 08, 2010 at 10:32:56AM -0800, Mike Carlson wrote:
>>>> I'm having a problem with striping 7 18TB RAID6 (hardware SAN)
>>>> volumes together.
>>>>
>>>> Here is a quick rundown of the hardware:
>>>> * HP DL180 G6 w/12GB ram
>>>> * QLogic FC HBA (Qlogic ISP 2532 PCI FC-AL Adapter)
>>>> * Winchester Hardware SAN,
>>>>
>>>>     da2 at isp0 bus 0 scbus2 target 0 lun 0
>>>>     da2:<WINSYS SX2318R 373O>   Fixed Direct Access SCSI-5 device
>>>>     da2: 800.000MB/s transfers
>>>>     da2: Command Queueing enabled
>>>>     da2: 19074680MB (39064944640 512 byte sectors: 255H 63S/T 2431680C)
>>>>
>>>
>> The server is in a data center with limited access control, do I
>> have the option of using a particular CVS tag (checking out via csup)
>> and then perform a make world/kernel?
> Doing this is more painful than, say, downloading a livefs image and
> seeing if you can reproduce the problem (e.g. you won't be modifying
> your existing OS installation), especially since I can't guarantee that
> the problem you're seeing is fixed in RELENG_8 (hence my request to
> begin with).  But if you can't boot livefs, then here you go:
>
> You'll need some form of console access (either serial or VGA) to do the
> upgrade reliably.  "Rolling back" may also not be an option since
> RELENG_8 is newer than RELENG_8_1 and may have introduced some new
> binaries or executables into the fray.  If you don't have console access
> to this machine, if things go awry you may be SOL.  The vagueness of my
> statement is intentional; I can't cover every situation that might come
> to light.
>
> Please be sure to back up your kernel configuration file before doing
> the following, and make sure that the supfile shown below has
> tag=RELENG_8 in it (it should).  And yes, the rm commands below are
> recommended; failure to use them could result in some oddities given
> that your /usr/src tree refers to RELENG_8_1 version numbers which
> differ from RELENG_8.  You *do not* have to do this for ports (since for
> ports, tag=. is used by default).
>
> rm -fr /var/db/sup/src-all
> rm -fr /usr/src/*
> rm -fr /usr/obj/*
> csup -h cvsupserver -L 2 /usr/share/examples/cvsup/stable-supfile
>
> At this point you can restore your kernel configuration file to the
> appropriate place (/sys/i386/conf, /sys/amd64/conf, etc.) and build
> world/kernel as per the instructions in /usr/src/Makefile (see lines
> ~51-62).  ***Please do not skip any of the steps***.  Good luck.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>
>
>
I wasn't able to make it to the Data Center to boot off of a USB/CD, but 
I did follow your steps to upgrade to RELENG_8. So far, things are stable:

    write# uname -a
    FreeBSD write.llnl.gov 8.1-STABLE FreeBSD 8.1-STABLE #0: Mon Nov  8
    16:38:06 PST 2010    
    root@write.llnl.gov:/usr/obj/usr/src/sys/GENERIC  amd64
    write# kldstat
    Id Refs Address            Size     Name
      1   15 0xffffffff80100000 d86d18   kernel
      2    1 0xffffffff80e87000 f058     aio.ko
      3    1 0xffffffff80e97000 16ea40   ispfw.ko
      4    1 0xffffffff81006000 5568     geom_multipath.ko
      5    1 0xffffffff81222000 104ac5   zfs.ko
      6    1 0xffffffff81327000 1a15     opensolaris.ko
    write# zpool create test01 da2 da3 da4 da5 da6 da7 da8
    write# zpool status
    write# cd /tmp
    write# clear
    write# cp random.dat.1 /test01/
    write# cp random.dat.1 /test01/random.dat.2
    write# cp random.dat.1 /test01/random.dat.3
    write# cp random.dat.1 /test01/random.dat.4
    write# cp random.dat.1 /test01/random.dat.5
    write# cp random.dat.1 /test01/random.dat.6
    write# md5 random.dat.1
    MD5 (random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    write# md5 /test01/random.dat.*
    MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36
    write# zpool status
       pool: test01
      state: ONLINE
      scrub: none requested
    config:

         NAME        STATE     READ WRITE CKSUM
         test01      ONLINE       0     0     0
           da2       ONLINE       0     0     0
           da3       ONLINE       0     0     0
           da4       ONLINE       0     0     0
           da5       ONLINE       0     0     0
           da6       ONLINE       0     0     0
           da7       ONLINE       0     0     0
           da8       ONLINE       0     0     0

    errors: No known data errors
    write# zpool scrub test01
    write# zpool status
       pool: test01
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Mon Nov  8
    17:00:01 2010
    config:

         NAME        STATE     READ WRITE CKSUM
         test01      ONLINE       0     0     0
           da2       ONLINE       0     0     0
           da3       ONLINE       0     0     0
           da4       ONLINE       0     0     0
           da5       ONLINE       0     0     0
           da6       ONLINE       0     0     0
           da7       ONLINE       0     0     0
           da8       ONLINE       0     0     0

    errors: No known data errors


Any ideas for further testing to narrow down the culprit? Oh, one other 
thing that I modified was /boot/loader.conf. I had previously limited 
vfs.zfs.arc_max to 1024M, and I have now commented that out.

Thanks again, I'm going to continue writing files and scrubbing the 
array until I have a level of confidence with the file system.
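
The write-and-verify part of that loop could be scripted roughly like
this. It is only a sketch: the function name is made up, and it uses
cmp(1) for a byte-for-byte comparison instead of the md5 runs shown
above, which is a substitution on my part rather than what was actually
run.

```sh
# Compare each copy byte-for-byte against a reference file.
# Prints the name of every mismatching copy and returns non-zero
# if at least one copy differs. (hypothetical helper)
verify_copies() {
    ref=$1
    shift
    bad=0
    for copy in "$@"; do
        if ! cmp -s "$ref" "$copy"; then
            echo "MISMATCH: $copy"
            bad=1
        fi
    done
    return "$bad"
}

# Example, using the paths from the transcript above:
#   verify_copies /tmp/random.dat.1 /test01/random.dat.* && zpool scrub test01
```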

Mike C


From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 11:23:20 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9B4C01065695
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 11:23:20 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 51C7C8FC14
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 11:23:20 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1PFmII-0001HT-VD
	for freebsd-fs@freebsd.org; Tue, 09 Nov 2010 12:23:18 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Tue, 09 Nov 2010 12:23:18 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Tue, 09 Nov 2010 12:23:18 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 09 Nov 2010 12:23:04 +0100
Lines: 13
Message-ID: <ibbauo$27m$1@dough.gmane.org>
References: <4CD84258.6090404@llnl.gov>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <4CD84258.6090404@llnl.gov>
X-Enigmail-Version: 1.1.2
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 11:23:20 -0000

On 11/08/10 19:32, Mike Carlson wrote:

> As soon as I create the volume and write data to it, it is reported as
> being corrupted:
> 
>    write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8

> However, if I create a 'raidz' volume, no errors occur:

A very interesting problem. Can you check with some other kind of volume
manager that striping the data doesn't cause some unusual hardware
interaction? Can you try, as an experiment, striping them all with
gstripe (but you'll have to use a small stripe size like 16 KiB or 8 KiB)?
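
A sketch of what that experiment could look like. The helper below only
prints the gstripe command instead of running it, since labeling
overwrites metadata on the providers. The helper name and the stripe
name "teststripe" are made up; the device names are the ones from the
original report. gstripe takes the stripe size in bytes.

```sh
# Print (do not run) a gstripe command for a stripe size given in KiB.
# usage: gstripe_cmd <stripe_KiB> <provider>...
# "teststripe" is a hypothetical name for the resulting device.
gstripe_cmd() {
    kib=$1
    shift
    echo "gstripe label -v -s $(( $kib * 1024 )) teststripe $*"
}

gstripe_cmd 16 da2 da3 da4 da5 da6 da7 da8
```

If the command is run, the striped device should show up under
/dev/stripe/ and can be given a file system and the same copy/checksum
test.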


From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 13:13:49 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 926D2106566C
	for <freebsd-fs@FreeBSD.ORG>; Tue,  9 Nov 2010 13:13:49 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id ECC868FC18
	for <freebsd-fs@FreeBSD.ORG>; Tue,  9 Nov 2010 13:13:48 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA9DDVDT077097;
	Tue, 9 Nov 2010 14:13:47 +0100 (CET)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA9DDUoc077095;
	Tue, 9 Nov 2010 14:13:30 +0100 (CET) (envelope-from olli)
Date: Tue, 9 Nov 2010 14:13:30 +0100 (CET)
Message-Id: <201011091313.oA9DDUoc077095@lurza.secnetix.de>
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com, brde@optusnet.com.au
In-Reply-To: <20101109065842.R2343@besplex.bde.org>
X-Newsgroups: list.freebsd-fs
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX)
	(FreeBSD/6.4-PRERELEASE-20080904 (i386))
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Tue, 09 Nov 2010 14:13:47 +0100 (CET)
Cc: 
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
	?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com, brde@optusnet.com.au
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 13:13:49 -0000

Bruce Evans wrote:
 > On Mon, 8 Nov 2010, Oliver Fromme wrote:
 > > Monthadar Al Jaberi <monthadar@gmail.com> wrote:
 > > > [...]
 > > > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size
 > > > 8192: Invalid argument
 > > > 
 > > > So I guessed it has todo with the flash configured in 64k sectors
 > > > according to the boot output.
 > > > ...
 > > > mx25l0: <M25Pxx Flash Family> at cs 0 on spibus0
 > > > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
 > > > ...
 > > 
 > > Historically UFS/FFS supports only 512 bytes per sector.
 > > I think it was patched at some point in the past to support
 > > 2048 bytes per sector, too, which is used by some MOD media
 > > and DVD-RAM.  I'm pretty sure it does _not_ support 65536
 > > bytes per sector (someone please correct me if I'm wrong).
 > 
 > Maybe 25 years ago, but in Net2/ 20 years ago ffs doesn't really
 > even use sectors.  It just has a buggy superblock probe which
 > prevents it determining its correct i/o size when that size
 > exceeds SBLOCKSIZE = 8192.

In the second half of the 90s it did *not* support the
2048-byte sectors of the larger MOD media that became
popular at that time.  I owned several of those drives
(still have one of them), so I remember it quite well.
FreeBSD's file system code needed some patches in order
to be able to use those disks.  Before that, only 512-
byte sectors worked.

I don't know what sector sizes are supported today, but
I wouldn't be surprised if only 512 to 2048 works out
of the box.  I'm not aware of any widely used media
that has sectors smaller than 512 or larger than 2048.
(Those new 4k drives translate accesses to/from 512 byte
sectors, so it looks like a 512-byte sector drive.)

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

Python is executable pseudocode.  Perl is executable line noise.

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 13:29:16 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 16E84106566C
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 13:29:16 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id C2CBA8FC15
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 13:29:15 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1PFoFm-0007jp-CW
	for freebsd-fs@freebsd.org; Tue, 09 Nov 2010 14:28:50 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Tue, 09 Nov 2010 14:28:50 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Tue, 09 Nov 2010 14:28:50 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 09 Nov 2010 14:28:10 +0100
Lines: 18
Message-ID: <ibbi9a$5c3$1@dough.gmane.org>
References: <20101109065842.R2343@besplex.bde.org>
	<201011091313.oA9DDUoc077095@lurza.secnetix.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <201011091313.oA9DDUoc077095@lurza.secnetix.de>
X-Enigmail-Version: 1.1.2
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
 ?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 13:29:16 -0000

On 11/09/10 14:13, Oliver Fromme wrote:

> I don't know what sector sizes are supported today, but
> I wouldn't be surprised if only 512 to 2048 works out
> of the box.  I'm not aware of any widely used media
> that has sectors smaller than 512 or larger than 2048.
> (Those new 4k drives translate accesses to/from 512 byte
> sectors, so it looks like a 512-byte sector drive.)

Can't say much about the Olden Times, but now it's trivial to show that
UFS, in fact, works ok with various sector sizes using gnop.

I've tested it with at least 4 KiB sectors recently, and I think I
remember successfully trying 8 KiB sectors a few years ago in a
project.
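
Roughly, such a test could go like this. This is a sketch, not a
transcript of what I ran: the helper name, the md unit number 9, the
64m size and the /mnt mount point are all arbitrary choices for
illustration. The helper only prints the commands so they can be
reviewed before being run as root.

```sh
# Print the command sequence for trying UFS on a gnop provider that
# presents a given sector size on top of a memory disk.
# usage: ufs_secsize_test_cmds <sector_size_bytes>
# (hypothetical helper; unit 9, 64m and /mnt are arbitrary)
ufs_secsize_test_cmds() {
    secsize=$1
    echo "mdconfig -a -t swap -s 64m -u 9"
    echo "gnop create -S $secsize /dev/md9"
    echo "newfs /dev/md9.nop"
    echo "mount /dev/md9.nop /mnt"
}

ufs_secsize_test_cmds 4096
```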

What might or might not work well, I think, is having fragment/block
ratio other than "8".


From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 13:59:35 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9D111106564A;
	Tue,  9 Nov 2010 13:59:35 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 1FDB58FC17;
	Tue,  9 Nov 2010 13:59:34 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA9DxIuw079253;
	Tue, 9 Nov 2010 14:59:33 +0100 (CET)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA9DxI3U079252;
	Tue, 9 Nov 2010 14:59:18 +0100 (CET) (envelope-from olli)
Date: Tue, 9 Nov 2010 14:59:18 +0100 (CET)
Message-Id: <201011091359.oA9DxI3U079252@lurza.secnetix.de>
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-fs@FreeBSD.ORG, ivoras@FreeBSD.ORG
In-Reply-To: <ibbi9a$5c3$1@dough.gmane.org>
X-Newsgroups: list.freebsd-fs
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX)
	(FreeBSD/6.4-PRERELEASE-20080904 (i386))
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Tue, 09 Nov 2010 14:59:34 +0100 (CET)
Cc: 
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
	?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: freebsd-fs@FreeBSD.ORG, ivoras@FreeBSD.ORG
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 13:59:35 -0000

Ivan Voras <ivoras@freebsd.org> wrote:
 > What might or might not work well, I think, is having fragment/block
 > ratio other than "8".

I've also successfully used fsize == bsize (i.e. ratio 1).
But I also think that ratios other than 1 and 8 will not
work well.
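
For clarity, "ratio" here is the block size divided by the fragment
size, which newfs(8) takes as -b and -f. The helper below (the name is
made up) only does that arithmetic; it makes no claim about which
ratios behave well in practice.

```sh
# Print the block/fragment ratio for a pair of newfs -b / -f values.
# Prints "invalid" and returns non-zero if fsize does not divide bsize.
# usage: frag_ratio <bsize_bytes> <fsize_bytes>   (hypothetical helper)
frag_ratio() {
    bsize=$1
    fsize=$2
    if [ "$fsize" -le 0 ] || [ $(( $bsize % $fsize )) -ne 0 ]; then
        echo "invalid"
        return 1
    fi
    echo $(( $bsize / $fsize ))
}

frag_ratio 16384 2048     # -> 8
frag_ratio 16384 16384    # -> 1
```

So "newfs -b 16384 -f 16384" would be the ratio 1 case, and
"newfs -b 16384 -f 2048" the ratio 8 case.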

Another question is whether other FS-related tools like
fsck(8) and dump(8) work well with unusual file systems.
Having FFS support for, say, 65536 byte sectors won't buy
you much if fsck can't reliably handle it.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"If Java had true garbage collection, most programs
would delete themselves upon execution."
        -- Robert Sewell

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 14:16:39 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id BD3041065673
	for <freebsd-fs@FreeBSD.ORG>; Tue,  9 Nov 2010 14:16:39 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au
	[211.29.132.187])
	by mx1.freebsd.org (Postfix) with ESMTP id 56AFE8FC1F
	for <freebsd-fs@FreeBSD.ORG>; Tue,  9 Nov 2010 14:16:39 +0000 (UTC)
Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au
	(c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73])
	by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	oA9EGaJC026668
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 10 Nov 2010 01:16:37 +1100
Date: Wed, 10 Nov 2010 01:16:36 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com, brde@optusnet.com.au
In-Reply-To: <201011091313.oA9DDUoc077095@lurza.secnetix.de>
Message-ID: <20101110004642.H1101@besplex.bde.org>
References: <201011091313.oA9DDUoc077095@lurza.secnetix.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: 
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
 ?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 14:16:39 -0000

On Tue, 9 Nov 2010, Oliver Fromme wrote:

> Bruce Evans wrote:
> > On Mon, 8 Nov 2010, Oliver Fromme wrote:
> > > Monthadar Al Jaberi <monthadar@gmail.com> wrote:
> > > > [...]
> > > > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock size
> > > > 8192: Invalid argument
> > > >
> > > > So I guessed it has todo with the flash configured in 64k sectors
> > > > according to the boot output.
> > > > ...
> > > > mx25l0: <M25Pxx Flash Family> at cs 0 on spibus0
> > > > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
> > > > ...
> > >
> > > Historically UFS/FFS supports only 512 bytes per sector.
> > > I think it was patched at some point in the past to support
> > > 2048 bytes per sector, too, which is used by some MOD media
> > > and DVD-RAM.  I'm pretty sure it does _not_ support 65536
> > > bytes per sector (someone please correct me if I'm wrong).
> >
> > Maybe 25 years ago, but in Net2/ 20 years ago ffs doesn't really
> > even use sectors.  It just has a buggy superblock probe which
> > prevents it determining its correct i/o size when that size
> > exceeds SBLOCKSIZE = 8192.
>
> In the second half of the 90s it did *not* support the
> 2048-byte sectors of the larger MOD media that became
> popular at that time.  I owned several of those drives
> (still have one of them), so I remember it quite well.
> FreeBSD's file system code needed some patches in order
> to be able to use those disks.  Before that, only 512-
> byte sectors worked.

That's strange, since in the initial slice code in 1994 or 1995, I
emulated 4K-sectors in uncommitted patches in the floppy driver and
thought I tested ffs with them.

> I don't know what sector sizes are supported today, but
> I wouldn't be surprised if only 512 to 2048 works out
> of the box.  I'm not aware of any widely used media
> that has sectors smaller than 512 or larger than 2048.
> (Those new 4k drives translate accesses to/from 512 byte
> sectors, so it looks like a 512-byte sector drive.)

I have a DVD drive that only supports writing 32K-blocks on DVD-R.  In
2005 I gave up trying to get ffs to work on this.  The drive supports
reading the usual 2K-blocks, so the ffs probe worked, and the ffs block
size just needed to be set to 32K or 64K so that writes worked too.
But this block size wastes a lot of space and time for small files.
FreeBSD's buffering is bad for read-mostly media, and I never found
any file system that works well for small files on DVDs or CDROMs
(images that can be written in 10 minutes take more like 10 hours
to read back if they contain a few hundred thousand small files).

mdconfig allows any representable sector size except 0 and possibly non-
power-of-2 ones (mdconfig(8) uses strtoul() with null error handling;
md(4) checks for a power of 2 in the malloc-backed case but has null
arg checking in other cases (if other cases are reached now -- this
feature used to be limited to the malloc-backed case)).  Power of 2
sizes that cannot work because they exceed MAXBSIZE can certainly be
configured.
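
A sketch of the kind of check that could be done by hand before
passing a sector size to "mdconfig -S". The helper name is made up,
and the value 65536 for MAXBSIZE is an assumption that should be
checked against sys/param.h on the system in question.

```sh
MAXBSIZE=65536   # assumed value of the kernel constant; verify locally

# Classify a candidate sector size (in bytes).
# usage: check_secsize <bytes>   (hypothetical helper)
check_secsize() {
    n=$1
    if [ "$n" -le 0 ] || [ $(( $n & ($n - 1) )) -ne 0 ]; then
        echo "not a power of 2"
    elif [ "$n" -gt "$MAXBSIZE" ]; then
        echo "exceeds MAXBSIZE"
    else
        echo "ok"
    fi
}

check_secsize 4096      # -> ok
check_secsize 131072    # -> exceeds MAXBSIZE
check_secsize 3000      # -> not a power of 2
```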

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 15:03:50 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DA62D1065673
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 15:03:50 +0000 (UTC)
	(envelope-from monthadar@gmail.com)
Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com
	[209.85.160.54])
	by mx1.freebsd.org (Postfix) with ESMTP id A64AA8FC0C
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 15:03:50 +0000 (UTC)
Received: by pwi10 with SMTP id 10so217601pwi.13
	for <multiple recipients>; Tue, 09 Nov 2010 07:03:50 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=GWSu1ZlD4YFGZ8BW0It1eIYC8Jrc3KpgCgW9rLajbrc=;
	b=ZUdpHsb13HYUncHUFil2S4Ljqzdc744c/XyuGol2blQ8pX87S4Oh+wCGY8c2XqAG2K
	j1z1QPVxG7DvnwSpjjrIpAAZ/iChEPxKRn1Bp0HJJJKF0TXkPRjQmM8ltvhUX7LZdEL+
	BihRJYa0VIUPYT6HlCrrII9zt+yZ0O4R2ElIk=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=whvzaoLbgVQtaltfGEpaRPEzj1ItndNzaZCO0eVFrkT1IowlSG51W+bkwGOexW4YDc
	7o3KmlvYuFBcgxRqdWbTNTltyVfbE/lxN8xAvXjy/9UYouweQKJIgy4a/9urrXSrG5t3
	iH5EldU0z6K9dLYnZ0w0iY/68CWX64atXB3+E=
MIME-Version: 1.0
Received: by 10.229.225.199 with SMTP id it7mr6556763qcb.33.1289315029694;
	Tue, 09 Nov 2010 07:03:49 -0800 (PST)
Received: by 10.229.182.77 with HTTP; Tue, 9 Nov 2010 07:03:49 -0800 (PST)
In-Reply-To: <20101110004642.H1101@besplex.bde.org>
References: <201011091313.oA9DDUoc077095@lurza.secnetix.de>
	<20101110004642.H1101@besplex.bde.org>
Date: Tue, 9 Nov 2010 16:03:49 +0100
Message-ID: <AANLkTi=MuiB6Xt3uztB7Yz82TvA2GJ9q6ncquU1=HTj9@mail.gmail.com>
From: Monthadar Al Jaberi <monthadar@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>, olli@lurza.secnetix.de,
	ivoras@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
	?error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 15:03:50 -0000

Thank you for the clarifications! Definitely the right place to ask!

I successfully tried the mdconfig version, but gnop gives this error message:
gnop: Invalid secsize for provider redboot/fs.

for any sector size that I give other than 64K or a multiple of 64K :/
RSPRO3# gnop create -s 128k -S 256k /dev/redboot/fs
GEOM_NOP: Device redboot/fs.nop created.
but that won't help with mounting...
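
My understanding of why the two do not fit together, written as a
small sketch. Both rules are guesses inferred from the error messages
above, not read from the kernel source: gnop appears to accept only
multiples of the provider's sector size, and mount appears to require
that the sector size divides the 8192-byte superblock size. The
function names are made up.

```sh
SBLOCKSIZE=8192   # from the mount error message

# Assumed gnop rule: requested size must be a multiple of the
# provider's sector size.
# usage: gnop_accepts <provider_secsize> <requested_secsize>
gnop_accepts() {
    [ $(( $2 % $1 )) -eq 0 ]
}

# Assumed mount rule: sector size must divide SBLOCKSIZE.
# usage: ffs_probe_accepts <secsize>
ffs_probe_accepts() {
    [ $(( $SBLOCKSIZE % $1 )) -eq 0 ]
}

for size in 512 4096 8192 65536 131072; do
    if gnop_accepts 65536 "$size"; then g=yes; else g=no; fi
    if ffs_probe_accepts "$size"; then m=yes; else m=no; fi
    echo "$size gnop=$g mount=$m"
done
```

If those two rules are right, no sector size passes both checks on top
of a 64K provider, which would match what I see.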


I looked into the datasheet for the flash (MX25125805D) and it seems like
it can work with both 64K and 4K sectors? Confused.
(http://www.macronix.com/QuickPlace/hq/PageLibrary482576EF002A2699.nsf/h_Index/30D2368B704F50B9482576EF002D070F/?OpenDocument&Type=Serial%20Flash&Density=128Mb)

Also, I think the mx25 driver in FreeBSD is not configured correctly for
my flash. Adrian Chadd has some diffs for his flash, but they seem to be
for another, slightly different flash
(http://people.freebsd.org/~adrian/rspro/). For example, there is no
CMD_BLOCK_ERASE_32K in the datasheet for my flash.

Would it help if I changed the flash driver to work with 4K? Or would I
still need to use mdconfig or gnop, or modify the UFS/UFS2 code (hard
for me)?

Basically, I have a cross-compiled kernel+mdroot with a tinyBSD wireless
configuration, zipped and stored on the flash. So I am trying to have
a filesystem on the flash that will shadow changes.

When zipped it takes ~10M instead of 47M!

br,


On Tue, Nov 9, 2010 at 3:16 PM, Bruce Evans <brde@optusnet.com.au> wrote:
> On Tue, 9 Nov 2010, Oliver Fromme wrote:
>
>> Bruce Evans wrote:
>> > On Mon, 8 Nov 2010, Oliver Fromme wrote:
>> > > Monthadar Al Jaberi <monthadar@gmail.com> wrote:
>> > > > [...]
>> > > > mount: /dev/redboot/fs Invalid sectorsize 65536 for superblock siz=
e
>> > > > 8192: Invalid argument
>> > > >
>> > > > So I guessed it has todo with the flash configured in 64k sectors
>> > > > according to the boot output.
>> > > > ...
>> > > > mx25l0: <M25Pxx Flash Family> at cs 0 on spibus0
>> > > > mx25l0: mx25ll128, sector 65536 bytes, 256 sectors
>> > > > ...
>> > >
>> > > Historically UFS/FFS supports only 512 bytes per sector.
>> > > I think it was patched at some point in the past to support
>> > > 2048 bytes per sector, too, which is used by some MOD media
>> > > and DVD-RAM. =A0I'm pretty sure it does _not_ support 65536
>> > > bytes per sector (someone please correct me if I'm wrong).
>> >
>> > Maybe 25 years ago, but in Net2/ 20 years ago ffs doesn't really
>> > even use sectors. =A0It just has a buggy superblock probe which
>> > prevents it determining its correct i/o size when that size
>> > exceeds SBLOCKSIZE =3D 8192.
>>
>> In the second half of the 90s it did *not* support the
>> 2048-byte sectors of the larger MOD media that became
>> popular at that time. =A0I owned several of those drives
>> (still have one of them), so I remember it quite well.
>> FreeBSD's file system code needed some patches in order
>> to be able to use those disks. =A0Before that, only 512-
>> byte sectors worked.
>
> That's strange, since in the initial slice code in 1994 or 1995, I
> emulated 4K-sectors in uncommitted patches in the floppy driver and
> thought I tested ffs with them.
>
>> I don't know what sector sizes are supported today, but
>> I wouldn't be surprised if only 512 to 2048 works out
>> of the box. =A0I'm not aware of any widely used media
>> that has sectors smaller than 512 or larger than 2048.
>> (Those new 4k drives translate accesses to/from 512 byte
>> sectors, so it looks like a 512-byte sector drive.)
>
> I have a DVD drive that only supports writing 32K-blocks on DVD-R. =A0In
> 2005 I gave up trying to get ffs to work on this. =A0The drive supports
> reading the usual 2K-blocks, so the ffs probe worked, and the ffs block
> size just needed to be set to 32K or 64K so that writes worked too.
> But this block size wastes a lot of space and time for small files.
> FreeBSD's buffering is bad for read-mostly media, and I never found
> any file system that works well for small files on DVDs or CDROMs
> (images that can be written in 10 minutes take more like 10 hours
> to read back if they contain a few hundred thousand small files).
>
> mdconfig allows any representable sector size except 0 and possibly non-
> power-of-2 ones (mdconfig(8) uses strtoul() with null error handling;
> md(4) checks for a power of 2 in the malloc-backed case but has null
> arg checking in other cases (if other cases are reached now -- this
> feature used to be limited to the malloc-backed case)). =A0Power of 2
> sizes that cannot work because they exceed MAXBSIZE can certainly be
> configured.
>
> Bruce
>


--=20
//Monthadar Al Jaberi

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 15:11:46 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 40B591065693;
	Tue,  9 Nov 2010 15:11:46 +0000 (UTC)
	(envelope-from bruce@cran.org.uk)
Received: from muon.cran.org.uk (muon.cran.org.uk
	[IPv6:2a01:348:0:15:5d59:5c40:0:1])
	by mx1.freebsd.org (Postfix) with ESMTP id 6B83E8FC20;
	Tue,  9 Nov 2010 15:11:45 +0000 (UTC)
Received: from muon.cran.org.uk (localhost [127.0.0.1])
	by muon.cran.org.uk (Postfix) with ESMTP id BB53BE7208;
	Tue,  9 Nov 2010 15:11:44 +0000 (GMT)
Received: from core.nessbank (client-82-26-212-122.pete.adsl.virginmedia.com
	[82.26.212.122])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by muon.cran.org.uk (Postfix) with ESMTPSA;
	Tue,  9 Nov 2010 15:11:43 +0000 (GMT)
From: Bruce Cran <bruce@cran.org.uk>
To: freebsd-fs@freebsd.org
Date: Tue, 9 Nov 2010 15:11:42 +0000
User-Agent: KMail/1.13.5 (FreeBSD/9.0-CURRENT; KDE/4.5.2; amd64; ; )
MIME-Version: 1.0
Content-Type: Multipart/Mixed;
  boundary="Boundary-00=_uSW2MSQc6oN+6RL"
Message-Id: <201011091511.42902.bruce@cran.org.uk>
Cc: 
Subject: Fwd: Re: Corruption of UFS filesystems after using md(4)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 15:11:46 -0000

--Boundary-00=_uSW2MSQc6oN+6RL
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit

I'm forwarding this because I guess someone from fs@ might be interested. It 
seems creating sparse files now causes UFS filesystems to become corrupt on 
CURRENT.

-- 
Bruce Cran

--Boundary-00=_uSW2MSQc6oN+6RL
Content-Type: message/rfc822;
  name="forwarded message"
Content-Transfer-Encoding: 7bit
Content-Description: Peter Holm <pho@freebsd.org>: Re: Corruption of UFS
	filesystems after using md(4)
Content-Disposition: inline

Return-Path: <owner-freebsd-current@freebsd.org>
X-Original-To: bruce@cran.org.uk
Delivered-To: brucec@muon.cran.org.uk
Received: from muon.cran.org.uk (localhost [127.0.0.1])
	by muon.cran.org.uk (Postfix) with ESMTP id AFD0DE7207
	for <bruce@cran.org.uk>; Wed,  3 Nov 2010 07:16:07 +0000 (GMT)
X-Spam-Flag: NO
X-Spam-Score: -1.911
X-Spam-Level: 
X-Spam-Status: No, score=-1.911 tagged_above=-999 required=10
	tests=[BAYES_00=-1.9, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01]
	autolearn=ham
Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35])
	by muon.cran.org.uk (Postfix) with ESMTP
	for <bruce@cran.org.uk>; Wed,  3 Nov 2010 07:16:06 +0000 (GMT)
Received: from hub.freebsd.org (hub.freebsd.org [IPv6:2001:4f8:fff6::36])
	by mx2.freebsd.org (Postfix) with ESMTP id 107A6178D32;
	Wed,  3 Nov 2010 07:15:54 +0000 (UTC)
Received: from hub.freebsd.org (localhost [127.0.0.1])
	by hub.freebsd.org (Postfix) with ESMTP id 2876C1065780;
	Wed,  3 Nov 2010 07:15:52 +0000 (UTC)
	(envelope-from owner-freebsd-current@freebsd.org)
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6AC061065670
	for <freebsd-current@freebsd.org>; Wed,  3 Nov 2010 07:15:46 +0000 (UTC)
	(envelope-from pho@holm.cc)
Received: from relay00.pair.com (relay00.pair.com [209.68.5.9])
	by mx1.freebsd.org (Postfix) with SMTP id 1B4648FC17
	for <freebsd-current@freebsd.org>; Wed,  3 Nov 2010 07:15:46 +0000 (UTC)
Received: (qmail 82243 invoked from network); 3 Nov 2010 06:49:05 -0000
Received: from 93.166.52.54 (HELO x2.osted.lan) (93.166.52.54)
	by relay00.pair.com with SMTP; 3 Nov 2010 06:49:05 -0000
X-pair-Authenticated: 93.166.52.54
Received: from x2.osted.lan (localhost [127.0.0.1])
	by x2.osted.lan (8.14.3/8.14.3) with ESMTP id oA36n5Xg039844;
	Wed, 3 Nov 2010 07:49:05 +0100 (CET) (envelope-from pho@x2.osted.lan)
Received: (from pho@localhost)
	by x2.osted.lan (8.14.3/8.14.3/Submit) id oA36n5SE039843;
	Wed, 3 Nov 2010 07:49:05 +0100 (CET) (envelope-from pho)
Date: Wed, 3 Nov 2010 07:49:04 +0100
From: Peter Holm <pho@freebsd.org>
To: Bruce Cran <bruce@cran.org.uk>
Message-ID: <20101103064904.GA39407@x2.osted.lan>
References: <201011021912.14281.bruce@cran.org.uk>
	<201011021933.51052.bruce@cran.org.uk>
Mime-Version: 1.0
Content-Type: text/plain;
  charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201011021933.51052.bruce@cran.org.uk>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-current@freebsd.org
Subject: Re: Corruption of UFS filesystems after using md(4)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
Sender: owner-freebsd-current@freebsd.org
Errors-To: owner-freebsd-current@freebsd.org
X-UID: 38219
X-Length: 5168

On Tue, Nov 02, 2010 at 07:33:50PM +0000, Bruce Cran wrote:
> On Tuesday 02 November 2010 19:12:14 Bruce Cran wrote:
> > I've noticed in recent months that I appear to be getting silent corruption
> > of my UFS filesystems - and I think it may be linked to using md(4) or
> > creating sparse files.
> 
> I've confirmed this is a UFS bug related to sparse files: "truncate -s20G f1 
> && rm f1" is enough to trigger the error and start generating .viminfo files 
> that appear to be 20GB. When running fsck I get an "Invalid block count" error 
> if I just reboot without removing the .viminfo file; if I do remove it, I get 
> a "Partially allocated inode" error.
> 

I'm able to verify this by:

"m.sh" 49L, 1917C written
$ ./m.sh
Local config: x4
+ mdconfig -a -t swap -s 1g -u 5
+ bsdlabel -w md5 auto
+ newfs -U md5a
+ mount /dev/md5a /mnt
+ truncate -s20G /mnt/f1
+ rm /mnt/f1
+ umount /mnt
+ fsck -t ufs -y /dev/md5a
** /dev/md5a
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
PARTIALLY ALLOCATED INODE I=4
UNEXPECTED SOFT UPDATE INCONSISTENCY

CLEAR? yes

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

2 files, 2 used, 506481 free (25 frags, 63307 blocks, 0.0%
fragmentation)

***** FILE SYSTEM IS CLEAN *****

***** FILE SYSTEM WAS MODIFIED *****
+ mdconfig -d -u 5
$ 
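
The steps visible in the trace above can be condensed into a short
script. This is a reduced sketch reconstructed from the trace only, not
the original 49-line m.sh; the run helper and the DRY_RUN switch are
additions made for this sketch so that the commands are only printed
unless DRY_RUN=0 is set (which needs root on a FreeBSD system).

```shell
#!/bin/sh
# Reduced sketch of the reproduction; not the original m.sh.
UNIT=${UNIT:-5}          # md unit number, as in the trace
DRY_RUN=${DRY_RUN:-1}    # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

repro() {
    # Create a scratch UFS file system on a swap-backed memory disk,
    # create and remove a 20G sparse file, then check the file system.
    run mdconfig -a -t swap -s 1g -u "$UNIT" &&
    run bsdlabel -w "md$UNIT" auto &&
    run newfs -U "md${UNIT}a" &&
    run mount "/dev/md${UNIT}a" /mnt &&
    run truncate -s20G /mnt/f1 &&
    run rm /mnt/f1 &&
    run umount /mnt &&
    run fsck -t ufs -y "/dev/md${UNIT}a"
    # Always try to detach the memory disk afterwards.
    run mdconfig -d -u "$UNIT"
}

repro
```

On an affected system, the fsck step is where the "PARTIALLY ALLOCATED
INODE" message shown above would be expected to appear.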

- Peter
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

--Boundary-00=_uSW2MSQc6oN+6RL--

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 16:55:31 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 201EE1065672
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 16:55:31 +0000 (UTC)
	(envelope-from v.velox@vvelox.net)
Received: from vulpes.vvelox.net (sula-ki.vvelox.net [99.69.115.46])
	by mx1.freebsd.org (Postfix) with ESMTP id D9FD28FC1D
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 16:55:30 +0000 (UTC)
Received: from vixen42.vulpes.vvelox.net (unknown [192.168.14.2])
	(Authenticated sender: v.velox)
	by vulpes.vvelox.net (Postfix) with ESMTPA id 230F2B885
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 10:22:18 -0600 (CST)
Date: Tue, 9 Nov 2010 10:17:40 -0600
From: "Zane C.B." <v.velox@vvelox.net>
To: freebsd-fs@freebsd.org
Message-ID: <20101109101740.2e8e80ce@vixen42.vulpes.vvelox.net>
X-Mailer: Claws Mail 3.7.6 (GTK+ 2.20.1; amd64-portbld-freebsd8.1)
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=PGP-SHA1;
	boundary="Sig_/dMzSZwx/X+f8WKtpRC_qITB";
	protocol="application/pgp-signature"
Subject: NFS locking
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 16:55:31 -0000

--Sig_/dMzSZwx/X+f8WKtpRC_qITB
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

What does it take to get NFS locking working?

Any time I start lockd on the server, I get the message below in
dmesg.

NLM: failed to contact remote rpcbind, stat =3D 0, port =3D 0
NLM: failed to contact remote rpcbind, stat =3D 0, port =3D 0
Can't start NLM - unable to contact NSM

Any ideas?

--Sig_/dMzSZwx/X+f8WKtpRC_qITB
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)

iEYEARECAAYFAkzZdC4ACgkQqrJJy0yxYQDGXgCdHb52sCUd6LH79+PnaByWhtdH
VtkAn0QzUR41d7rp6Fgg1/uKlnXFQoxn
=etQ9
-----END PGP SIGNATURE-----

--Sig_/dMzSZwx/X+f8WKtpRC_qITB--

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 17:01:06 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DC7D31065675
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 17:01:06 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta07.emeryville.ca.mail.comcast.net
	(qmta07.emeryville.ca.mail.comcast.net [76.96.30.64])
	by mx1.freebsd.org (Postfix) with ESMTP id C1C558FC1D
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 17:01:06 +0000 (UTC)
Received: from omta07.emeryville.ca.mail.comcast.net ([76.96.30.59])
	by qmta07.emeryville.ca.mail.comcast.net with comcast
	id V3rl1f0041GXsucA7516s0; Tue, 09 Nov 2010 17:01:06 +0000
Received: from koitsu.dyndns.org ([98.248.41.155])
	by omta07.emeryville.ca.mail.comcast.net with comcast
	id V5141f00R3LrwQ28U515Sn; Tue, 09 Nov 2010 17:01:05 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id A14D39B427; Tue,  9 Nov 2010 09:01:04 -0800 (PST)
Date: Tue, 9 Nov 2010 09:01:04 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: "Zane C.B." <v.velox@vvelox.net>
Message-ID: <20101109170104.GA37882@icarus.home.lan>
References: <20101109101740.2e8e80ce@vixen42.vulpes.vvelox.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20101109101740.2e8e80ce@vixen42.vulpes.vvelox.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS locking
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 17:01:06 -0000

On Tue, Nov 09, 2010 at 10:17:40AM -0600, Zane C.B. wrote:
> What does it take to get NFS locking working?
> 
> Any time I start lockd on the server, I get the message below in
> dmesg.
> 
> NLM: failed to contact remote rpcbind, stat = 0, port = 0
> NLM: failed to contact remote rpcbind, stat = 0, port = 0
> Can't start NLM - unable to contact NSM
> 
> Any ideas?

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2010-03/msg00484.html

Solution (for me):

http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/056043.html

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 17:27:07 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1BD841065693
	for <freebsd-fs@FreeBSD.ORG>; Tue,  9 Nov 2010 17:27:07 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 8C68D8FC0A
	for <freebsd-fs@FreeBSD.ORG>; Tue,  9 Nov 2010 17:27:06 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id oA9HQoXl088175;
	Tue, 9 Nov 2010 18:27:05 +0100 (CET)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id oA9HQngL088174;
	Tue, 9 Nov 2010 18:26:49 +0100 (CET) (envelope-from olli)
Date: Tue, 9 Nov 2010 18:26:49 +0100 (CET)
Message-Id: <201011091726.oA9HQngL088174@lurza.secnetix.de>
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com
In-Reply-To: <AANLkTi=MuiB6Xt3uztB7Yz82TvA2GJ9q6ncquU1=HTj9@mail.gmail.com>
X-Newsgroups: list.freebsd-fs
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX)
	(FreeBSD/6.4-PRERELEASE-20080904 (i386))
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Tue, 09 Nov 2010 18:27:05 +0100 (CET)
Cc: 
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
	??error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: freebsd-fs@FreeBSD.ORG, monthadar@gmail.com
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 17:27:07 -0000

Monthadar Al Jaberi <monthadar@gmail.com> wrote:
 > I successfully tried mdconfig version, but gnop gives error message:
 > gnop: Invalid secsize for provider redboot/fs.

I'm sorry, I shouldn't have mentioned gnop.  It's probably
not helpful in this case.

 > I looked into datasheet for the flash (MX25L12805D) and it seems like
 > it can work in both 64K and 4K? Confused.
 > (http://www.macronix.com/QuickPlace/hq/PageLibrary482576EF002A2699.nsf/h_Index/30D2368B704F50B9482576EF002D070F/?OpenDocument&Type=Serial%20Flash&Density=128Mb)

I'm not familiar with that kind of hardware, but if it can
be switched to 4k sector mode, then that would make things
a lot easier.

 > Would it help me if I changed the flash driver to work with 4K?

Yes, definitely.
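
A rough illustration of why a 4K sector size would get past the mount
error: this assumes that the check behind "Invalid sectorsize 65536 for
superblock size 8192" simply requires the 8192-byte superblock size to
be a whole multiple of the provider's sector size. The function name
check_sectorsize is made up for this sketch and is not a kernel
interface.

```shell
#!/bin/sh
# Assumed model of the mount-time check: the superblock size (8192,
# as printed in the error message) must be a whole multiple of the
# sector size reported by the device.
SBLOCKSIZE=8192

# Prints "ok" if the given sector size passes the assumed check,
# "invalid" otherwise.
check_sectorsize() {
    if [ $((SBLOCKSIZE % $1)) -eq 0 ]; then
        echo ok
    else
        echo invalid
    fi
}

for size in 512 2048 4096 65536; do
    echo "$size: $(check_sectorsize "$size")"
done
```

Under that assumption 512, 2048 and 4096 pass, while 65536 is rejected
because 8192 is smaller than one sector.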

 > Or do I still need to either, mdconfig, gnop or play UFS/UFS2 code
 > (hard for me)?

If everything else fails, I would simply create a memory
disk with mdconfig (if you have enough RAM), copy the file
system from flash to the memory disk (use "dd bs=64k ...")
and mount it from there.  That's not hard.
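
The fallback just described could look roughly like this. It is a
sketch under assumptions: the md unit number and the size are
placeholders that have to be adapted (the size must be at least that of
the file system on the flash), and the run helper and DRY_RUN switch
exist only in this sketch so that nothing is executed unless DRY_RUN=0
is set.

```shell
#!/bin/sh
# Sketch only: unit number and size are assumptions, adjust as needed.
FLASH=${FLASH:-/dev/redboot/fs}   # flash partition named in the thread
UNIT=${UNIT:-9}                   # hypothetical free md unit number
SIZE=${SIZE:-8m}                  # hypothetical; must fit the FS image
DRY_RUN=${DRY_RUN:-1}             # 1 = only print commands, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

flash_to_md() {
    # Create the memory disk, copy the flash contents into it in
    # 64k chunks, then mount the copy.
    run mdconfig -a -t swap -s "$SIZE" -u "$UNIT" &&
    run dd if="$FLASH" of="/dev/md$UNIT" bs=64k &&
    run mount "/dev/md$UNIT" /mnt
}

flash_to_md
```

Note that changes made to the mounted copy live only in memory; they
would have to be written back to the flash separately.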

 > Basically I have a cross compiled kernel+mdroot with tinyBSD wireless
 > configuration, zipped and stored on the flash. So I am trying to have
 > a filesystem on the flash that will shadow changes.
 > 
 > When I zipp it takes ~10M instead of 47M!

Wait a second ...  I don't understand ...  Are you saying
that you've put a compressed FS image on the flash?  Is
that the file system that you're trying to mount?  Or are
we talking about two distinct pieces of flash?

If you just want to save changes (e.g. configuration files,
log files and similar) to the flash, you can also simply
use a tar archive:

# tar -cb 128 -f /dev/<your flash> <your files>
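
The blocking factor of 128 in the command above is presumably meant to
match the 64k sector size of the flash, since tar's -b option counts
512-byte records. A quick check of that arithmetic (tar_block_bytes is
a helper invented for this sketch):

```shell
#!/bin/sh
# tar's -b option is given in 512-byte records, so the number of bytes
# per write is the blocking factor times 512.
tar_block_bytes() {
    echo $(( $1 * 512 ))
}

tar_block_bytes 128    # prints 65536, i.e. 64k
```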

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"The last good thing written in C was
Franz Schubert's Symphony number 9."
        -- Erwin Dieterich

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 17:37:32 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8C40D106564A
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 17:37:32 +0000 (UTC)
	(envelope-from carlson39@llnl.gov)
Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83])
	by mx1.freebsd.org (Postfix) with ESMTP id 765E78FC19
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 17:37:32 +0000 (UTC)
X-Attachments: None
Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135])
	by smtp.llnl.gov with ESMTP; 09 Nov 2010 09:37:31 -0800
Message-ID: <4CD986DC.1070401@llnl.gov>
Date: Tue, 09 Nov 2010 09:37:32 -0800
From: Mike Carlson <carlson39@llnl.gov>
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
	rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
References: <4CD84258.6090404@llnl.gov> <ibbauo$27m$1@dough.gmane.org>
In-Reply-To: <ibbauo$27m$1@dough.gmane.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 17:37:32 -0000

On 11/09/2010 03:23 AM, Ivan Voras wrote:
> On 11/08/10 19:32, Mike Carlson wrote:
>
>> As soon as I create the volume and write data to it, it is reported as
>> being corrupted:
>>
>>     write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
>> However, if I create a 'raidz' volume, no errors occur:
> A very interesting problem. Can you check with some other kind of volume
> manager that striping the data doesn't cause some unusual hardware
> interaction? Can you try, as an experiment, striping them all with
> gstripe (but you'll have to use a small stripe size like 16 KiB or 8 KiB)?
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
Sure:

    write# gstripe label -v -s 16384  data /dev/da2 /dev/da3 /dev/da4
    /dev/da5 /dev/da6 /dev/da7 /dev/da8
    Metadata value stored on /dev/da2.
    Metadata value stored on /dev/da3.
    Metadata value stored on /dev/da4.
    Metadata value stored on /dev/da5.
    Metadata value stored on /dev/da6.
    Metadata value stored on /dev/da7.
    Metadata value stored on /dev/da8.
    Done.
    write# newfs -O2 -U /dev/stripe/data
    /dev/stripe/data: 133522760.0MB (273454612256 sectors) block size
    16384, fragment size 2048
         using 627697 cylinder groups of 212.72MB, 13614 blks, 6848 inodes.
         with soft updates
    super-block backups (for fsck -b #) at:
    ...
    write# mount /dev/stripe/data /mnt
    write# df -h
    Filesystem            Size    Used   Avail Capacity  Mounted on
    /dev/da0s1a           1.7T     22G    1.6T     1%    /
    devfs                 1.0K    1.0K      0B   100%    /dev
    /dev/stripe/data      126T    4.0K    116T     0%    /mnt
    write# cd /tmp
    write# md5 /tmp/random.dat.1
    MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    write# cp random.dat.1 /mnt/
    write# cp /mnt/random.dat.1 /mnt/random.dat.2
    write# cp /mnt/random.dat.1 /mnt/random.dat.3
    write# cp /mnt/random.dat.1 /mnt/random.dat.4
    write# cp /mnt/random.dat.1 /mnt/random.dat.5
    write# cp /mnt/random.dat.1 /mnt/random.dat.6
    write# md5 /mnt/*
    MD5 (/mnt/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/mnt/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/mnt/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/mnt/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/mnt/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/mnt/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36
    write# fsck /mnt
    fsck: Could not determine filesystem type
    write# fsck_ufs  /mnt
    ** /dev/stripe/data (NO WRITE)
    ** Last Mounted on /mnt
    ** Phase 1 - Check Blocks and Sizes
    Segmentation fault
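
The manual checksum comparison in the transcript above could also be
scripted. This is a minimal sketch, not what was actually run: the
function name compare_copies is made up, and cksum is used instead of
md5 only to keep the sketch portable (on FreeBSD, "md5 -q" could be
substituted).

```shell
#!/bin/sh
# Compare every listed file against the first one. Prints
# "MISMATCH: <file>" for each file whose checksum differs, or "match"
# if all of them agree with the first file.
compare_copies() {
    ref=$(cksum < "$1") || return 1
    status=match
    for f in "$@"; do
        if [ "$(cksum < "$f")" != "$ref" ]; then
            echo "MISMATCH: $f"
            status=mismatch
        fi
    done
    [ "$status" = match ] && echo match
}

# Hypothetical usage:
#   compare_copies /tmp/random.dat.1 /mnt/random.dat.*
```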

So, the data appears to be okay. I also wanted to run it through fsck,
but that segfaulted. Otherwise, the data looks good.

Question: why did you recommend using a smaller stripe size? Is that to
ensure a sample 1GB test file gets written across ALL disk members?

From owner-freebsd-fs@FreeBSD.ORG  Tue Nov  9 17:42:46 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AC0851065670
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 17:42:46 +0000 (UTC)
	(envelope-from carlson39@llnl.gov)
Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83])
	by mx1.freebsd.org (Postfix) with ESMTP id 8B1518FC1B
	for <freebsd-fs@freebsd.org>; Tue,  9 Nov 2010 17:42:46 +0000 (UTC)
X-Attachments: None
Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135])
	by smtp.llnl.gov with ESMTP; 09 Nov 2010 09:42:46 -0800
Message-ID: <4CD98816.1020306@llnl.gov>
Date: Tue, 09 Nov 2010 09:42:46 -0800
From: Mike Carlson <carlson39@llnl.gov>
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
	rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
References: <4CD84258.6090404@llnl.gov> <ibbauo$27m$1@dough.gmane.org>
	<4CD986DC.1070401@llnl.gov>
In-Reply-To: <4CD986DC.1070401@llnl.gov>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2010 17:42:46 -0000

On 11/09/2010 09:37 AM, Mike Carlson wrote:
> On 11/09/2010 03:23 AM, Ivan Voras wrote:
>> On 11/08/10 19:32, Mike Carlson wrote:
>>
>>> As soon as I create the volume and write data to it, it is reported as
>>> being corrupted:
>>>
>>>      write# zpool create filevol001 da2 da3 da4 da5 da6 da7 da8
>>> However, if I create a 'raidz' volume, no errors occur:
>> A very interesting problem. Can you check with some other kind of volume
>> manager that striping the data doesn't cause some unusual hardware
>> interaction? Can you try, as an experiment, striping them all with
>> gstripe (but you'll have to use a small stripe size like 16 KiB or 8 KiB)?
>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>>
> Sure:
>
>      write# gstripe label -v -s 16384  data /dev/da2 /dev/da3 /dev/da4
>      /dev/da5 /dev/da6 /dev/da7 /dev/da8
>      Metadata value stored on /dev/da2.
>      Metadata value stored on /dev/da3.
>      Metadata value stored on /dev/da4.
>      Metadata value stored on /dev/da5.
>      Metadata value stored on /dev/da6.
>      Metadata value stored on /dev/da7.
>      Metadata value stored on /dev/da8.
>      Done.
>      write# newfs -O2 -U /dev/stripe/data
>      /dev/stripe/data: 133522760.0MB (273454612256 sectors) block size
>      16384, fragment size 2048
>           using 627697 cylinder groups of 212.72MB, 13614 blks, 6848 inodes.
>           with soft updates
>      super-block backups (for fsck -b #) at:
>      ...
>      write# mount /dev/stripe/data /mnt
>      write# df -h
>      Filesystem            Size    Used   Avail Capacity  Mounted on
>      /dev/da0s1a           1.7T     22G    1.6T     1%    /
>      devfs                 1.0K    1.0K      0B   100%    /dev
>      /dev/stripe/data      126T    4.0K    116T     0%    /mnt
>      write# cd /tmp
>      write# md5 /tmp/random.dat.1
>      MD5 (/tmp/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>      write# cp random.dat.1 /mnt/
>      write# cp /mnt/random.dat.1 /mnt/random.dat.2
>      write# cp /mnt/random.dat.1 /mnt/random.dat.3
>      write# cp /mnt/random.dat.1 /mnt/random.dat.4
>      write# cp /mnt/random.dat.1 /mnt/random.dat.5
>      write# cp /mnt/random.dat.1 /mnt/random.dat.6
>      write# md5 /mnt/*
>      MD5 (/mnt/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
>      MD5 (/mnt/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36
>      MD5 (/mnt/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36
>      MD5 (/mnt/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36
>      MD5 (/mnt/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36
>      MD5 (/mnt/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36
>      write# fsck /mnt
>      fsck: Could not determine filesystem type
>      write# fsck_ufs  /mnt
>      ** /dev/stripe/data (NO WRITE)
>      ** Last Mounted on /mnt
>      ** Phase 1 - Check Blocks and Sizes
>      Segmentation fault
>
> So, the data appears to be okay, I wanted to run through a FSCK just to
> do it but that seg faulted. Otherwise, that data looks good.
>
> Question, why did you recommend using a smaller stripe size? Is that to
> ensure a sample 1GB test file gets written across ALL disk members?
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
Oh, I almost forgot, here is the ZFS version of that gstripe array:

    write# zpool create test01 /dev/stripe/data
    write# cp /tmp/random.dat.1 /test01/
    write# cp /test01/random.dat.1 /test01/random.dat.2
    write# cp /test01/random.dat.1 /test01/random.dat.3
    write# cp /test01/random.dat.1 /test01/random.dat.4
    write# cp /test01/random.dat.1 /test01/random.dat.5
    write# cp /test01/random.dat.1 /test01/random.dat.6
    write# md5 /test01/*
    MD5 (/test01/random.dat.1) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.2) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.3) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.4) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.5) = f795fa09e1b0975c0da0ec6e49544a36
    MD5 (/test01/random.dat.6) = f795fa09e1b0975c0da0ec6e49544a36
    write# zpool scrub test01
    write# zpool status
       pool: test01
      state: ONLINE
      scrub: scrub completed after 0h0m with 0 errors on Tue Nov  9 09:41:34 2010
    config:

         NAME           STATE     READ WRITE CKSUM
         test01         ONLINE       0     0     0
           stripe/data  ONLINE       0     0     0

    errors: No known data errors

Again, no errors.
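
The copy-and-compare test above can be scripted. This is a minimal sketch, not what was actually run: the function name verify_copies and the copy.N file names are made up for illustration, and POSIX cksum stands in for md5 so that it runs on any system. Reading a file back right after writing it may be served from cache, so a match here does not prove the on-disk blocks are good.

```shell
#!/bin/sh
# verify_copies SRC DIR N (hypothetical helper):
# copy SRC into DIR N times and compare each copy's checksum with the source.
verify_copies() {
    src=$1
    dir=$2
    n=$3
    want=$(cksum < "$src")
    i=1
    while [ "$i" -le "$n" ]; do
        cp "$src" "$dir/copy.$i" || return 1
        got=$(cksum < "$dir/copy.$i")
        if [ "$got" != "$want" ]; then
            echo "MISMATCH: $dir/copy.$i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $n copies match"
}
```

Example use, with the paths from the transcript: verify_copies /tmp/random.dat.1 /test01 6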

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 00:20:08 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D9B1E106564A
	for <fs@freebsd.org>; Wed, 10 Nov 2010 00:20:08 +0000 (UTC)
	(envelope-from dougb@FreeBSD.org)
Received: from mail2.fluidhosting.com (mx23.fluidhosting.com [204.14.89.6])
	by mx1.freebsd.org (Postfix) with ESMTP id 785BF8FC20
	for <fs@freebsd.org>; Wed, 10 Nov 2010 00:20:07 +0000 (UTC)
Received: (qmail 26517 invoked by uid 399); 10 Nov 2010 00:20:07 -0000
Received: from localhost (HELO doug-optiplex.ka9q.net)
	(dougb@dougbarton.us@127.0.0.1)
	by localhost with ESMTPAM; 10 Nov 2010 00:20:07 -0000
X-Originating-IP: 127.0.0.1
X-Sender: dougb@dougbarton.us
Message-ID: <4CD9E535.8000801@FreeBSD.org>
Date: Tue, 09 Nov 2010 16:20:05 -0800
From: Doug Barton <dougb@FreeBSD.org>
Organization: http://SupersetSolutions.com/
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101028 Thunderbird/3.1.6
MIME-Version: 1.0
To: Aditya Sarawgi <sarawgi.aditya@gmail.com>
References: <20100929031825.L683@besplex.bde.org>
	<20100929084801.M948@besplex.bde.org>
	<20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org>
	<20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org>
	<AANLkTi=iTCG4aO-KO_gy7fp_96KcZ_TCyNk5OkLZUHV3@mail.gmail.com>
	<4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth>
In-Reply-To: <20101108174327.GC2066@earth>
X-Enigmail-Version: 1.1.2
OpenPGP: id=1A1ABC84
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: fs@freebsd.org
Subject: Re: ext2fs now extremely slow
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 00:20:08 -0000

On 11/08/2010 09:43, Aditya Sarawgi wrote:
> On Wed, Nov 03, 2010 at 05:43:26PM -0700, Doug Barton wrote:

>> Regarding stability, sometimes (but not always) when I'm doing the above
>> listed disk-intensive things on an otherwise idle system I've had the
>> system lock up. Not panic, not reboot, just wedge. I'm running X when
>> this happens, so I'm not 100% sure that the disk activity is the
>> culprit, but it seems very suspicious. Yesterday was a very bad day, I
>> had to do 3 tries to get all the way through a buildworld/kernel, mostly
>> because the last 2 crashes resulted in my /usr/src (which is actually
>> /home/svn/head) and /usr/obj (/home/obj-9) directories getting corrupted
>> respectively. Today (running r214694) has actually been quite good,
>> although I haven't tried a buildworld yet.
>>
>
> I am not sure if this is the right use case for ext2fs

Can you expand on that? What about it do you see as problematic?

>>> You can test Zheng's preallocation patch for ext2fs, there is a
>>> serious lack of testers for that.
>>
>> I would be happy to do that, but my reading of this thread last month
>> didn't produce a clear "try this version of the patch" neon sign.
>> Various people referred to suggestions, updates, etc. If someone could
>> provide a URL for the right patch to try, as well as a suggestion for
>> benchmarking methodology, I'll be glad to do so.
>>
>
> I have attached the patch.

Thanks for that. I'm curious though whether this is the latest version 
of the patch with the suggested improvements from earlier in this thread?

> Some primitive testing like copying files,
> untaring etc and comparing with the existing ext2fs will do. If you
> are looking to do a full fledged benchmarking then I would suggest
> iozone, blogbench, dbench etc.

Sorry, I am not a filesystem person, so if you want me to do any real 
benchmarking you're going to have to give me details ... Install this 
program, run this test, etc.

Meanwhile I finally got around to setting up my 8.1-RELEASE partition on 
this same system and particularly with cvsup it's very noticeable that 
ext2fs in -current is MUCH slower than in RELENG_8. I'll do some before 
and after tests on -current, then I'll do the same thing on 8.1 and see 
how the numbers compare.


Doug

-- 

	Nothin' ever doesn't change, but nothin' changes much.
			-- OK Go

	Breadth of IT experience, and depth of knowledge in the DNS.
	Yours for the right price.  :)  http://SupersetSolutions.com/


From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 06:34:08 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A5303106566B;
	Wed, 10 Nov 2010 06:34:08 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 892F28FC0C;
	Wed, 10 Nov 2010 06:34:07 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id IAA10495;
	Wed, 10 Nov 2010 08:34:06 +0200 (EET) (envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PG4Fx-000Agy-QS; Wed, 10 Nov 2010 08:34:05 +0200
Message-ID: <4CDA3CDD.5000404@freebsd.org>
Date: Wed, 10 Nov 2010 08:34:05 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Ivan Voras <ivoras@freebsd.org>
References: <4CD7C8FC.900@icyb.net.ua> <ib8nas$9de$1@dough.gmane.org>
	<4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org>
In-Reply-To: <4CD7E960.1070200@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 06:34:08 -0000

on 08/11/2010 14:13 Andriy Gapon said the following:
> on 08/11/2010 13:55 Andriy Gapon said the following:
>> I reliably got this panic when all I was doing was saving an attachment in
>> thunderbird 3 that ran in KDE 4 environment.  Not sure what was going on behind
>> the scenes, but shouldn't have been anything out of the ordinary.
> 
> Perhaps this is my local mistake.  I can't see from code and crash dump how NULL
> pointer is possible there.  So perhaps I have some ABI mismatch between kernel
> and fuse module.
> I will rebuild fuse kmod and re-test again.

Yes, the rebuild has helped.
I wish this could be nicely automated.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 07:44:38 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8978E106564A
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 07:44:38 +0000 (UTC)
	(envelope-from monthadar@gmail.com)
Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com
	[209.85.216.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 408888FC1B
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 07:44:37 +0000 (UTC)
Received: by qwg8 with SMTP id 8so394654qwg.13
	for <freebsd-fs@freebsd.org>; Tue, 09 Nov 2010 23:44:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:content-type
	:content-transfer-encoding;
	bh=FVKfQrOebaQb6qQ0NBhIEsDJCsuXt2IsH5SVO21Nu6E=;
	b=YH2hYWphPwiagS1fUUS3xD1kqMoFiu8UP4WD+sto9mGvZGJhwjrfJQyePeuidbspP2
	WdmizMSI9qUIT7PTmC42zgxrjsf1F/YXSoKeoHUgEkMggIZqnwsD+LLSKdpqdG1onR04
	ZGfpXputK8bkEW6WLdPKHZWIw92YZGZ22zf4c=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:content-type:content-transfer-encoding;
	b=bjeXSjO5aiUPRo/MtC6G3xCv/ku2UGtaYTvCq5mJBS7k0ukDqQEPn8tZVHVs7EaMaX
	Vh47jZ16u7X7Vyq1ptiEY4wOwGbThsR1dC+p3Ryr8MbZXJhePdfZfRtkrblWgOtfUJ/f
	sqFe2zG9lS65qla55liED70qh7qbDfqX567xk=
MIME-Version: 1.0
Received: by 10.224.193.201 with SMTP id dv9mr6212274qab.125.1289375077306;
	Tue, 09 Nov 2010 23:44:37 -0800 (PST)
Received: by 10.229.182.77 with HTTP; Tue, 9 Nov 2010 23:44:37 -0800 (PST)
In-Reply-To: <201011091726.oA9HQngL088174@lurza.secnetix.de>
References: <AANLkTi=MuiB6Xt3uztB7Yz82TvA2GJ9q6ncquU1=HTj9@mail.gmail.com>
	<201011091726.oA9HQngL088174@lurza.secnetix.de>
Date: Wed, 10 Nov 2010 08:44:37 +0100
Message-ID: <AANLkTimBPD1ma6OnUyvaMb0sjKPKM8Hm4jD5Om4APH2p@mail.gmail.com>
From: Monthadar Al Jaberi <monthadar@gmail.com>
To: freebsd-fs@freebsd.org, olli@lurza.secnetix.de
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: 
Subject: Re: problem mounting from flash [Invalid sectorsize] [g_vfs_done()
	??error=22]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 07:44:38 -0000

>  > Would it help me if I changed the flash driver to work with 4K?
>
> Yes, definitely.

Ok, I will look into it.

>
>  > Or do I still need to either, mdconfig, gnop or play UFS/UFS2 code
>  > (hard for me)?
>
> If everything else fails, I would simply create a memory
> disk with mdconfig (if you have enough RAM), copy the file
> system from flash to the memory disk (use "dd bs=64k ...")
> and mount it from there.  That's not hard.

Yep, works like a charm, thank you =)
My board has 128MB RAM and 16MB flash.
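
The memory-disk workaround quoted above can be sketched roughly as follows. The flash device name, the md unit number and the mount point are placeholders that do not appear in this thread; the 16 MB size matches the flash size mentioned above. With DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only. FLASH_DEV, MD_UNIT and MNT are hypothetical placeholders.
FLASH_DEV=${FLASH_DEV:-/dev/flash0}
MD_UNIT=${MD_UNIT:-1}
MNT=${MNT:-/mnt}
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

flash_to_md() {
    # Create a 16 MB swap-backed memory disk as /dev/md$MD_UNIT.
    run mdconfig -a -t swap -s 16m -u "$MD_UNIT" || return 1
    # Copy the raw file system image from flash in 64 KB blocks.
    run dd if="$FLASH_DEV" of="/dev/md$MD_UNIT" bs=64k || return 1
    # Mount the copy; a memory disk has 512-byte sectors by default.
    run mount "/dev/md$MD_UNIT" "$MNT"
}

flash_to_md
```

Set DRY_RUN=0 only after replacing the placeholder device name with the real one.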

>
>  > Basically I have a cross compiled kernel+mdroot with tinyBSD wireless
>  > configuration, zipped and stored on the flash. So I am trying to have
>  > a filesystem on the flash that will shadow changes.
>  >
>  > When I zip it, it takes ~10M instead of 47M!
>
> Wait a second ...  I don't understand ...  Are you saying
> that you've put a compressed FS image on the flash?  Is
> that the file system that you're trying to mount?  Or are
> we talking about two distinct pieces of flash?

No, it's another filesystem that I want to mount.

The root filesystem is inside the kernel image (MD_ROOT option); the
kernel mounts it as /dev/md0.
When the kernel image was generated I zipped it and stored it on the
flash, so a bootloader like RedBoot unzips it to RAM and runs the
kernel.
So I guess it's messy to touch the filesystem that is inside a zipped
kernel image. The filesystem itself is not zipped.

So the idea is to create another directory in the flash FIS (Flash Image
System) where I store new versions of some files. And your tar advice
works nicely; that makes it easier than having a filesystem on flash,
and it will be zipped too! Thank you :)

Maybe there is no need to fiddle with the flash driver and UFS code, who knows... :P
I guess the ideal scenario is having a separate, directly read/write
filesystem on the flash for some paths like /etc, while /var and /tmp
are mounted as /dev/mdX and the rest is read-only.
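
For the /var and /tmp part, the stock FreeBSD rc scripts have knobs for memory-backed filesystems. A minimal sketch, assuming those rc scripts are present in the TinyBSD image; the sizes are illustrative guesses for a 128MB board, not values from this thread:

```shell
# /etc/rc.conf fragment (sketch; sizes are assumptions)
tmpmfs="YES"     # mount a memory filesystem on /tmp at boot
tmpsize="8m"
varmfs="YES"     # likewise for /var
varsize="16m"
```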

Thank you!

>
> --
> Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
> Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
> secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
> München, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
>
> FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd
>
> "The last good thing written in C was
> Franz Schubert's Symphony number 9."
>        -- Erwin Dieterich
>


-- 
//Monthadar Al Jaberi

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 11:03:18 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 28B89106564A
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 11:03:18 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id A86C58FC15
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 11:03:17 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1PG8SN-0005zW-Ah
	for freebsd-fs@freebsd.org; Wed, 10 Nov 2010 12:03:11 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 12:03:11 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 12:03:11 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Wed, 10 Nov 2010 12:03:00 +0100
Lines: 58
Message-ID: <ibdu54$fd1$1@dough.gmane.org>
References: <4CD84258.6090404@llnl.gov>
	<ibbauo$27m$1@dough.gmane.org>	<4CD986DC.1070401@llnl.gov>
	<4CD98816.1020306@llnl.gov>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <4CD98816.1020306@llnl.gov>
X-Enigmail-Version: 1.1.2
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 11:03:18 -0000

On 11/09/10 18:42, Mike Carlson wrote:

>>      write# gstripe label -v -s 16384  data /dev/da2 /dev/da3 /dev/da4
>>      /dev/da5 /dev/da6 /dev/da7 /dev/da8

>>      write# df -h
>>      Filesystem            Size    Used   Avail Capacity  Mounted on
>>      /dev/da0s1a           1.7T     22G    1.6T     1%    /
>>      devfs                 1.0K    1.0K      0B   100%    /dev
>>      /dev/stripe/data      126T    4.0K    116T     0%    /mnt

>>      write# fsck /mnt
>>      fsck: Could not determine filesystem type
>>      write# fsck_ufs  /mnt
>>      ** /dev/stripe/data (NO WRITE)
>>      ** Last Mounted on /mnt
>>      ** Phase 1 - Check Blocks and Sizes
>>      Segmentation fault

>> So, the data appears to be okay, I wanted to run through a FSCK just to
>> do it but that seg faulted. Otherwise, that data looks good.

Hmm, probably it tried to allocate a gazillion internal structures to
check it and didn't take no for an answer.

>> Question, why did you recommend using a smaller stripe size? Is that to
>> ensure a sample 1GB test file gets written across ALL disk members?

Yes, it's the surest way since MAXPHYS=128 KiB / 8 = 16 KiB.
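
The arithmetic behind that figure can be sketched as below: pick the largest power-of-two stripe size such that one MAXPHYS-sized request still covers every disk. The function name max_stripe_kib is made up for illustration, and the 128 KiB MAXPHYS value is simply the one quoted above.

```shell
#!/bin/sh
# max_stripe_kib MAXPHYS_KIB NDISKS (hypothetical helper):
# largest power-of-two stripe size in KiB with stripe * NDISKS <= MAXPHYS.
max_stripe_kib() {
    maxphys_kib=$1
    ndisks=$2
    limit=$((maxphys_kib / ndisks))
    stripe=1
    while [ $((stripe * 2)) -le "$limit" ]; do
        stripe=$((stripe * 2))
    done
    echo "$stripe"
}

max_stripe_kib 128 8   # prints 16
```

gstripe takes the stripe size in bytes, so 16 KiB corresponds to the "-s 16384" used in the command quoted above.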

Well, as far as I'm concerned this probably shows that there isn't
anything wrong with the hardware or GEOM, though more testing, like
running a couple of bonnie++ rounds on the UFS on the stripe volume for
a few hours, would probably be better.

Btw. what bandwidth do you get from this combination (gstripe + UFS)?

> Oh, I almost forgot, here is the ZFS version of that gstripe array:
> 
>    write# zpool create test01 /dev/stripe/data

>    write# zpool scrub test01
>    write# zpool status
>       pool: test01
>      state: ONLINE
>      scrub: scrub completed after 0h0m with 0 errors on Tue Nov  9 09:41:34 2010
>    config:
> 
>         NAME           STATE     READ WRITE CKSUM
>         test01         ONLINE       0     0     0
>           stripe/data  ONLINE       0     0     0

"scrub" verifies only written data, not the whole file system space
(that's why it finishes so fast), so it isn't really doing any load on
the array, but I agree that it looks more and more like there really is
an issue in ZFS.


From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 17:05:37 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 484B31065673;
	Wed, 10 Nov 2010 17:05:37 +0000 (UTC)
	(envelope-from sarawgi.aditya@gmail.com)
Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com
	[209.85.160.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 0E96D8FC12;
	Wed, 10 Nov 2010 17:05:36 +0000 (UTC)
Received: by pwi10 with SMTP id 10so193682pwi.13
	for <multiple recipients>; Wed, 10 Nov 2010 09:05:35 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:references:mime-version:content-type:content-disposition
	:in-reply-to:user-agent;
	bh=doHDc3Be3vx7yab1G/OcJncfuNovjKpeU+Flb4/uKlM=;
	b=JnNtUfxvt0j3a8aifyoTbRW3guaws2mGb44PW6o32bNFEsBOScpubiKvwITxnbWrxy
	ONhx7ae0jtDDi6GPhxbas19cd7o3eEcI4Y0foz3mMXN27s6/+GbN/nDDYVmBwXPLf5Jb
	hR1biIFMH6HD6InArH1lxLrwIMsnARxy9DcYQ=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	b=q/nwmuknIjTdD9DtVV6kJfIyKWKMI1pOAj3VTvQ1UCh3jPZJXBgI1GjTGFI+M1/ua9
	rW17iWTlgQ5XS2IhEYt3qMTNSDRi6YziD/AW8kr9foDAI2D640MJNSmJ926Dby0ihS6I
	lFnAVAmbus+Mb6zypcYpvKQNg+2mXmtGSZs4s=
Received: by 10.42.177.73 with SMTP id bh9mr1657654icb.162.1289408734429;
	Wed, 10 Nov 2010 09:05:34 -0800 (PST)
Received: from earth ([183.87.49.208])
	by mx.google.com with ESMTPS id gy41sm1082604ibb.23.2010.11.10.09.05.32
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Wed, 10 Nov 2010 09:05:33 -0800 (PST)
Date: Wed, 10 Nov 2010 22:37:22 +0530
From: Aditya Sarawgi <sarawgi.aditya@gmail.com>
To: Doug Barton <dougb@FreeBSD.org>
Message-ID: <20101110170719.GA1573@earth>
References: <20100929031825.L683@besplex.bde.org>
	<20100929084801.M948@besplex.bde.org>
	<20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org>
	<20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org>
	<AANLkTi=iTCG4aO-KO_gy7fp_96KcZ_TCyNk5OkLZUHV3@mail.gmail.com>
	<4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth>
	<4CD9E535.8000801@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4CD9E535.8000801@FreeBSD.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: fs@freebsd.org
Subject: Re: ext2fs now extremely slow
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 17:05:37 -0000

On Tue, Nov 09, 2010 at 04:20:05PM -0800, Doug Barton wrote:
> On 11/08/2010 09:43, Aditya Sarawgi wrote:
> > On Wed, Nov 03, 2010 at 05:43:26PM -0700, Doug Barton wrote:
> 
> >> Regarding stability, sometimes (but not always) when I'm doing the above
> >> listed disk-intensive things on an otherwise idle system I've had the
> >> system lock up. Not panic, not reboot, just wedge. I'm running X when
> >> this happens, so I'm not 100% sure that the disk activity is the
> >> culprit, but it seems very suspicious. Yesterday was a very bad day, I
> >> had to do 3 tries to get all the way through a buildworld/kernel, mostly
> >> because the last 2 crashes resulted in my /usr/src (which is actually
> >> /home/svn/head) and /usr/obj (/home/obj-9) directories getting corrupted
> >> respectively. Today (running r214694) has actually been quite good,
> >> although I haven't tried a buildworld yet.
> >>
> >
> > I am not sure if this is the right use case for ext2fs
> 
> Can you expand on that? What about it do you see as problematic?
>

ext2fs is not a native filesystem. It is relatively slower than UFS and
may have other problems, such as deadlocks, that have not been discovered yet.
It may also leave data inconsistent because it lacks facilities like journaling.
It is only meant to make the data in your Linux partitions accessible.
 
> >>> You can test Zheng's preallocation patch for ext2fs, there is a
> >>> serious lack of testers for that.
> >>
> >> I would be happy to do that, but my reading of this thread last month
> >> didn't produce a clear "try this version of the patch" neon sign.
> >> Various people referred to suggestions, updates, etc. If someone could
> >> provide a URL for the right patch to try, as well as a suggestion for
> >> benchmarking methodology, I'll be glad to do so.
> >>
> >
> > I have attached the patch.
> 
> Thanks for that. I'm curious though whether this is the latest version 
> of the patch with the suggested improvements from earlier in this thread?
>

There will be only style and some comment fixes in the new patch.
 
> > Some primitive testing like copying files,
> > untaring etc and comparing with the existing ext2fs will do. If you
> > are looking to do a full fledged benchmarking then I would suggest
> > iozone, blogbench, dbench etc.
> 
> Sorry, I am not a filesystem person, so if you want me to do any real 
> benchmarking you're going to have to give me details ... Install this 
> program, run this test, etc.
>

Install dbench and blogbench from ports or packages and follow this:
http://wiki.freebsd.org/SOC2010ZhengLiu
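
Alongside dbench and blogbench, the primitive copy test mentioned earlier could be timed roughly like this. It is a sketch under assumptions: the helper names elapsed and copy_test are made up, timing is in whole seconds only, and a sync is included so that buffered writes are counted.

```shell
#!/bin/sh
# elapsed CMD... (hypothetical helper): run CMD and print wall-clock seconds.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo $((end - start))
    return $status
}

# copy_test SRC DSTDIR: copy one file into the filesystem under test,
# flush it to disk, and report how long that took.
copy_test() {
    src=$1
    dstdir=$2
    elapsed sh -c 'cp "$1" "$2"/ && sync' sh "$src" "$dstdir"
}
```

Run it with the same large file against the ext2fs mount point before and after applying the patch, for example: copy_test /path/to/bigfile /path/to/ext2mount (both paths are placeholders).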
 
> Meanwhile I finally got around to setting up my 8.1-RELEASE partition on 
> this same system and particularly with cvsup it's very noticeable that 
> ext2fs in -current is MUCH slower than in RELENG_8. I'll do some before 
> and after tests on -current, then I'll do the same thing on 8.1 and see 
> how the numbers compare.
>

Yes, we know there is scope for improvement. The BSDL ext2fs lacks the
preallocation that used to reserve some blocks ahead of time and so improved
sequential write performance. Zheng's reservation window work solves this problem.
The other issue, as Bruce mentioned, is that the block allocator algorithm skips
some of the blocks in between. We intend to have fixes for both of
these before 9 is released.
Can you also mail me privately the output of dumpe2fs <unmounted /dev/ext2_partition>?

Thanks
Aditya Sarawgi  

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 18:56:01 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 23D0A106564A;
	Wed, 10 Nov 2010 18:56:01 +0000 (UTC)
	(envelope-from pluknet@gmail.com)
Received: from mail-qy0-f182.google.com (mail-qy0-f182.google.com
	[209.85.216.182])
	by mx1.freebsd.org (Postfix) with ESMTP id BA2F98FC12;
	Wed, 10 Nov 2010 18:56:00 +0000 (UTC)
Received: by qyk30 with SMTP id 30so744085qyk.13
	for <multiple recipients>; Wed, 10 Nov 2010 10:56:00 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=jJXPW5Gg9GsBZGAn+awKJIbu1ydDK9tuxD64unKKGM0=;
	b=FU7SlrNdE5y9LiKpPOnMSHmCyuKvbAMoTOsbKZW273rdBEm2F9Wo7pHfmlU1yQO+xB
	qTT/oAm3pjpXTwhi4ZJHsI459AorzAPwvb7NDWWt86frBLYx9SBwc807vrNE17tY3eSa
	1V8HY0KlGwwmMH+z/7Rc20x4/naFWBnpw9jzE=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=lJHWAgGvYAb8is7hKUx3JemolyJu2JABok9CxR3qnTloN5HF77lbUrQr7C5K9XKBq5
	SFdyP4tb9i0w80OfeBAgTuIiBMNJURHFECjWec3wmmpHgOE4kO0/CYr9EDJnUnj57p9o
	lJTwYPy0QkOTjZE+/Q3nMyGPuHKuaHkwVhH8U=
MIME-Version: 1.0
Received: by 10.229.229.135 with SMTP id ji7mr5330367qcb.100.1289413566167;
	Wed, 10 Nov 2010 10:26:06 -0800 (PST)
Received: by 10.229.69.135 with HTTP; Wed, 10 Nov 2010 10:26:06 -0800 (PST)
In-Reply-To: <4CDA3CDD.5000404@freebsd.org>
References: <4CD7C8FC.900@icyb.net.ua> <ib8nas$9de$1@dough.gmane.org>
	<4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org>
	<4CDA3CDD.5000404@freebsd.org>
Date: Wed, 10 Nov 2010 21:26:06 +0300
Message-ID: <AANLkTi=LcnhXNb+PrkvykvWKoFyHU79dH2F=g5vweS4X@mail.gmail.com>
From: Sergey Kandaurov <pluknet@gmail.com>
To: Andriy Gapon <avg@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 18:56:01 -0000

On 10 November 2010 09:34, Andriy Gapon <avg@freebsd.org> wrote:
> on 08/11/2010 14:13 Andriy Gapon said the following:
>> on 08/11/2010 13:55 Andriy Gapon said the following:
>>> I reliably got this panic when all I was doing was saving an attachment in
>>> thunderbird 3 that ran in KDE 4 environment.  Not sure what was going on behind
>>> the scenes, but shouldn't have been anything out of the ordinary.
>>
>> Perhaps this is my local mistake.  I can't see from code and crash dump how NULL
>> pointer is possible there.  So perhaps I have some ABI mismatch between kernel
>> and fuse module.
>> I will rebuild fuse kmod and re-test again.
>
> Yes, the rebuild has helped.
> I wish this could be nicely automated.

Hi.
If I understood you correctly, then you need
PORTS_MODULES set in /etc/make.conf.
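
A minimal sketch of that setting, assuming the FUSE kernel module was installed from the sysutils/fusefs-kmod port (the port origin is an assumption; the thread only says "fuse kmod"):

```shell
# /etc/make.conf fragment (sketch)
# Ports listed here are rebuilt and reinstalled whenever the kernel is
# built and installed, so the module stays in step with the kernel.
PORTS_MODULES=sysutils/fusefs-kmod
```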

-- 
wbr,
pluknet

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 19:08:39 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 58C73106566C;
	Wed, 10 Nov 2010 19:08:39 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 182408FC1B;
	Wed, 10 Nov 2010 19:08:37 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA26702;
	Wed, 10 Nov 2010 21:08:35 +0200 (EET) (envelope-from avg@freebsd.org)
Message-ID: <4CDAEDB2.4010704@freebsd.org>
Date: Wed, 10 Nov 2010 21:08:34 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Sergey Kandaurov <pluknet@gmail.com>
References: <4CD7C8FC.900@icyb.net.ua>	<ib8nas$9de$1@dough.gmane.org>	<4CD7E515.5040209@icyb.net.ua>	<4CD7E960.1070200@freebsd.org>	<4CDA3CDD.5000404@freebsd.org>
	<AANLkTi=LcnhXNb+PrkvykvWKoFyHU79dH2F=g5vweS4X@mail.gmail.com>
In-Reply-To: <AANLkTi=LcnhXNb+PrkvykvWKoFyHU79dH2F=g5vweS4X@mail.gmail.com>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 19:08:39 -0000

on 10/11/2010 20:26 Sergey Kandaurov said the following:
> Hi.
> If I understood you correctly, then you need
> PORTS_MODULES set in /etc/make.conf.

It has been a long time since I last tried it, but I remember having problems
with it during upgrades.
-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 19:42:46 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 348FB106566C;
	Wed, 10 Nov 2010 19:42:46 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 0DA538FC0C;
	Wed, 10 Nov 2010 19:42:44 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA27160;
	Wed, 10 Nov 2010 21:42:42 +0200 (EET) (envelope-from avg@freebsd.org)
Message-ID: <4CDAF5B1.4040501@freebsd.org>
Date: Wed, 10 Nov 2010 21:42:41 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Sergey Kandaurov <pluknet@gmail.com>
References: <4CD7C8FC.900@icyb.net.ua>	<ib8nas$9de$1@dough.gmane.org>	<4CD7E515.5040209@icyb.net.ua>	<4CD7E960.1070200@freebsd.org>	<4CDA3CDD.5000404@freebsd.org>
	<AANLkTi=LcnhXNb+PrkvykvWKoFyHU79dH2F=g5vweS4X@mail.gmail.com>
	<4CDAEDB2.4010704@freebsd.org>
In-Reply-To: <4CDAEDB2.4010704@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 19:42:46 -0000

on 10/11/2010 21:08 Andriy Gapon said the following:
> on 10/11/2010 20:26 Sergey Kandaurov said the following:
>> Hi.
>> If I understood you correctly, then you need
>> PORTS_MODULES set in /etc/make.conf.
> 
> It was a long time ago when I tried it last time, but I remember having problems
> with it during upgrades.

I think this is what it was/is.
If a port in PORTS_MODULES has dependencies, then buildkernel would try to install
those dependencies even if they are already installed.  And that, obviously, would
fail.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 19:49:29 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A6390106564A
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 19:49:29 +0000 (UTC)
	(envelope-from carlson39@llnl.gov)
Received: from smtp.llnl.gov (nspiron-3.llnl.gov [128.115.41.83])
	by mx1.freebsd.org (Postfix) with ESMTP id 85BD68FC17
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 19:49:29 +0000 (UTC)
X-Attachments: None
Received: from bagua.llnl.gov (HELO [134.9.197.135]) ([134.9.197.135])
	by smtp.llnl.gov with ESMTP; 10 Nov 2010 11:49:28 -0800
Message-ID: <4CDAF749.4000805@llnl.gov>
Date: Wed, 10 Nov 2010 11:49:29 -0800
From: Mike Carlson <carlson39@llnl.gov>
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
	rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
References: <4CD84258.6090404@llnl.gov>	<ibbauo$27m$1@dough.gmane.org>	<4CD986DC.1070401@llnl.gov>	<4CD98816.1020306@llnl.gov>
	<ibdu54$fd1$1@dough.gmane.org>
In-Reply-To: <ibdu54$fd1$1@dough.gmane.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Re: 8.1-RELEASE: ZFS data errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 19:49:29 -0000

On 11/10/2010 03:03 AM, Ivan Voras wrote:
> On 11/09/10 18:42, Mike Carlson wrote:
>
>>>       write# gstripe label -v -s 16384  data /dev/da2 /dev/da3 /dev/da4
>>>       /dev/da5 /dev/da6 /dev/da7 /dev/da8
>>>       write# df -h
>>>       Filesystem            Size    Used   Avail Capacity  Mounted on
>>>       /dev/da0s1a           1.7T     22G    1.6T     1%    /
>>>       devfs                 1.0K    1.0K      0B   100%    /dev
>>>       /dev/stripe/data      126T    4.0K    116T     0%    /mnt
>>>       write# fsck /mnt
>>>       fsck: Could not determine filesystem type
>>>       write# fsck_ufs  /mnt
>>>       ** /dev/stripe/data (NO WRITE)
>>>       ** Last Mounted on /mnt
>>>       ** Phase 1 - Check Blocks and Sizes
>>>       Segmentation fault
>>> So, the data appears to be okay, I wanted to run through a FSCK just to
>>> do it but that seg faulted. Otherwise, that data looks good.
> Hmm, probably it tried to allocate a gazillion internal structures to
> check it and didn't take no for an answer.
>
>>> Question, why did you recommend using a smaller stripe size? Is that to
>>> ensure a sample 1GB test file gets written across ALL disk members?
> Yes, it's the surest way since MAXPHYS=128 KiB / 8 = 16 KiB.
>
> Well, as far as I'm concerned this probably shows that there isn't
> something wrong about hardware or GEOM, though more testing, like
> running a couple of bonnie++ rounds on the UFS on the stripe volume for
> a few hours, would probably be better.
>
> Btw. what bandwidth do you get from this combination (gstripe + UFS)?
>
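
To make the stripe-size arithmetic quoted above concrete: with a 16 KiB stripe, one MAXPHYS-sized (128 KiB) request covers eight consecutive stripes, which is enough to reach each of the seven component disks named in the gstripe command. The sketch below assumes a plain round-robin layout. `stripe_disk` and `disks_touched` are hypothetical helper names, and geom_stripe's real mapping may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the component disk holding the byte at
 * `offset`, assuming simple round-robin striping (an assumption, not
 * taken from the geom_stripe sources). */
unsigned stripe_disk(uint64_t offset, uint64_t stripesize, unsigned ndisks)
{
    assert(stripesize > 0 && ndisks > 0);
    return (unsigned)((offset / stripesize) % ndisks);
}

/* Hypothetical helper: number of distinct disks touched by one I/O of
 * `length` bytes starting at `offset`, under the same assumption. */
unsigned disks_touched(uint64_t offset, uint64_t length,
                       uint64_t stripesize, unsigned ndisks)
{
    assert(stripesize > 0 && ndisks > 0);
    if (length == 0)
        return 0;
    uint64_t first = offset / stripesize;
    uint64_t last = (offset + length - 1) / stripesize;
    uint64_t nstripes = last - first + 1;
    /* Consecutive stripes land on consecutive disks, so the count is
     * capped at the number of disks. */
    return nstripes >= ndisks ? ndisks : (unsigned)nstripes;
}
```

Calling `disks_touched(0, 131072, 16384, 7)` models one aligned 128 KiB request on the array above, while passing 131072 as the stripe size models a stripe as large as the request itself.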

The bandwidth for geom_stripe + UFS2 was very nice:

    write# mount
    /dev/da0s1a on / (ufs, local, soft-updates)
    devfs on /dev (devfs, local, multilabel)
    filevol002 on /filevol002 (zfs, local)
    /dev/stripe/data on /mnt (ufs, local, soft-updates)

Simple DD write:

    write# dd if=/dev/zero of=/mnt/zero.dat bs=1m count=5000
    5000+0 records in
    5000+0 records out
    5242880000 bytes transferred in 13.503850 secs (388250759 bytes/sec)
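
As a cross-check of the dd figure above, the rate is just bytes divided by elapsed seconds, and converting to binary megabytes gives roughly 370 MiB/s. A minimal sketch of that arithmetic follows; `throughput_bps` and `to_mib` are illustrative names, not part of any tool used here.

```c
#include <assert.h>

/* Bytes per second for a transfer of `bytes` bytes taking `seconds`. */
double throughput_bps(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds;
}

/* Convert a byte count (or byte rate) to MiB (or MiB/s). */
double to_mib(double bytes)
{
    return bytes / (1024.0 * 1024.0);
}
```

For the run above the inputs are 5242880000 bytes and 13.503850 seconds.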

running bonnie++

    write# bonnie++ -u 100 -s24576 -d. -n64
    Using uid:100, gid:65533.
    Writing a byte at a time...done
    Writing intelligently...done
    Rewriting...done
    Reading a byte at a time...done
    Reading intelligently...done
    start 'em...done...done...done...done...done...
    Create files in sequential order...done.
    Stat files in sequential order...done.
    Delete files in sequential order...done.
    Create files in random order...done.
    Stat files in random order...done.
    Delete files in random order...done.
    Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
    Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
    Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
    write.llnl.gov  24G   730  99 343750  63 106157  26  1111  86 174698  26 219.2   3
    Latency             11492us     149ms     227ms   70274us   66776us     766ms
    Version  1.96       ------Sequential Create------ --------Random Create--------
    write.llnl.gov      -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
                   files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                      64 18681  47 +++++ +++ 99516  97 26297  40 +++++ +++ 113937  96
    Latency               310ms     149us     152us   68841us     144us     146us
    1.96,1.96,write.llnl.gov,1,1289416723,24G,,730,99,343750,63,106157,26,1111,86,174698,26,219.2,3,64,,,,,18681,47,+++++,+++,99516,97,26297,40,+++++,+++,113937,96,11492us,149ms,227ms,70274us,66776us,766ms,310ms,149us,152us,68841us,144us,146us

The system immediately and mysteriously rebooted after running bonnie++, 
though; that doesn't seem like a good sign...

I've got an iozone benchmark, gstripe + multipath + UFS vs. multipath + 
ZFS. I can email the gzip'd file to you, as I don't want to clutter the 
mailing list with file attachments.

Another question, for anyone really: will gmultipath ever have an 
'active/active' model? I'm happy that I have some type of redundancy for 
my SAN, but if it were possible to aggregate the bandwidth of both 
controllers, that would be pretty cool as well.

>> Oh, I almost forgot, here is the ZFS version of that gstripe array:
>>
>>     write# zpool create test01 /dev/stripe/data
>>     write# zpool scrub
>>     write# zpool status
>>        pool: test01
>>       state: ONLINE
>>       scrub: scrub completed after 0h0m with 0 errors on Tue Nov  9
>>     09:41:34 2010
>>     config:
>>
>>          NAME           STATE     READ WRITE CKSUM
>>          test01         ONLINE       0     0     0
>>            stripe/data  ONLINE       0     0     0
> "scrub" verifies only written data, not the whole file system space
> (that's why it finishes so fast), so it isn't really doing any load on
> the array, but I agree that it looks more and more like there really is
> an issue in ZFS.
>
Yeah, I ran scrub when there was around 20GB of random data. In 
8.1-RELEASE, that was the way I would trigger ZFS's acknowledgment that 
the pool had a problem.

I also dug through my logs and saw these:

    Nov  8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da5 offset=749207552 size=131072
    Nov  8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da5 offset=749338624 size=131072
    Nov  8 15:09:51 write root: ZFS: zpool I/O failure, zpool=test01
    error=86
    Nov  8 15:09:51 write root: ZFS: zpool I/O failure, zpool=test01
    error=86
    Nov  8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da3 offset=748421120 size=131072
    Nov  8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da4 offset=746586112 size=131072
    Nov  8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da4 offset=746455040 size=131072
    Nov  8 15:09:51 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da4 offset=746717184 size=131072
    Nov  8 15:09:52 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da3 offset=748290048 size=131072
    Nov  8 15:09:52 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da3 offset=748421120 size=131072
    Nov  8 15:09:52 write root: ZFS: checksum mismatch, zpool=test01
    path=/dev/da4 offset=746586112 size=131072
    Nov  8 15:09:52 write root: ZFS: zpool I/O failure, zpool=test01
    error=86

I'm inclined to believe it is an issue with ZFS.

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 20:18:44 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C5C731065697;
	Wed, 10 Nov 2010 20:18:44 +0000 (UTC)
	(envelope-from yanegomi@gmail.com)
Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com
	[74.125.82.182])
	by mx1.freebsd.org (Postfix) with ESMTP id D44748FC1B;
	Wed, 10 Nov 2010 20:18:43 +0000 (UTC)
Received: by wya21 with SMTP id 21so1218791wya.13
	for <multiple recipients>; Wed, 10 Nov 2010 12:18:42 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:sender:received
	:in-reply-to:references:date:x-google-sender-auth:message-id:subject
	:from:to:cc:content-type:content-transfer-encoding;
	bh=ZBBfufQUZ146StgDszFB2/UvQATpM1G14gpMtjs5ZU0=;
	b=KQ2N+7AMsTqTSFeGJ158mPfHrAS9TDoNNX93u9bDP4ii13Pp7yrnuqEd3xccBPJXqK
	ZFJFkMGO9f9BhuQWeElnDATtyBap+d9m1YwONQycvOQuEJEwO2sW9RWtF+rjpHIF/JTE
	fSGFiXW+53qrc8J8z2S+GpafGCG2JnV3e0yWw=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	b=FOu7eS65HP4mY777/xXOXW9NP2NT8gIcbmE5Po+9lxopgWy2jGvd5scIZm0ROso1V8
	450tR/mmHwhXjOKZWw0Tzis0OxqrAkSA4oGpO9XabysHQZtVGk3xvG+PoVFet+C8fzf6
	n2S3tEHXqeh/2FslcWShDFPv3WPG0eSKW35SM=
MIME-Version: 1.0
Received: by 10.216.175.18 with SMTP id y18mr8885463wel.30.1289420322665; Wed,
	10 Nov 2010 12:18:42 -0800 (PST)
Sender: yanegomi@gmail.com
Received: by 10.216.198.27 with HTTP; Wed, 10 Nov 2010 12:18:42 -0800 (PST)
In-Reply-To: <4CDAF5B1.4040501@freebsd.org>
References: <4CD7C8FC.900@icyb.net.ua> <ib8nas$9de$1@dough.gmane.org>
	<4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org>
	<4CDA3CDD.5000404@freebsd.org>
	<AANLkTi=LcnhXNb+PrkvykvWKoFyHU79dH2F=g5vweS4X@mail.gmail.com>
	<4CDAEDB2.4010704@freebsd.org> <4CDAF5B1.4040501@freebsd.org>
Date: Wed, 10 Nov 2010 12:18:42 -0800
X-Google-Sender-Auth: -JRvsaMOZ3PeCq9259iCRjUctSY
Message-ID: <AANLkTikr0OZef1--ycTyZENbCuLNcQOnYDCWuLiJkWDz@mail.gmail.com>
From: Garrett Cooper <gcooper@FreeBSD.org>
To: Andriy Gapon <avg@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>,
	freebsd-current@freebsd.org
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 20:18:44 -0000

On Wed, Nov 10, 2010 at 11:42 AM, Andriy Gapon <avg@freebsd.org> wrote:
> on 10/11/2010 21:08 Andriy Gapon said the following:
>> on 10/11/2010 20:26 Sergey Kandaurov said the following:
>>> Hi.
>>> If I understood you correctly, then you need
>>> PORTS_MODULES set in /etc/make.conf.
>>
>> It was a long time ago when I tried it last time, but I remember having problems
>> with it during upgrades.
>
> I think this is what it was/is.
> If a port in PORTS_MODULES has dependencies, then buildkernel would try to install
> those dependencies even if they are already installed.  And that, obviously, would
> fail.

Didn't know about this knob -- cool!

And FWIW, all it does is a:

all
install: deinstall reinstall (huh?)
reinstall: deinstall reinstall (huh?)
clean

Seems like it should be:

clean
all
[deinstall]
install
clean

or:

clean
all
install -DFORCE_PKG_REGISTER
clean

the first clean is just in case the PORTSWORKDIR is dirty.

Thanks!
-Garrett

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 20:21:39 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5100A1065673;
	Wed, 10 Nov 2010 20:21:39 +0000 (UTC)
	(envelope-from yanegomi@gmail.com)
Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com
	[74.125.82.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 5D01D8FC17;
	Wed, 10 Nov 2010 20:21:37 +0000 (UTC)
Received: by wya21 with SMTP id 21so1221731wya.13
	for <multiple recipients>; Wed, 10 Nov 2010 12:21:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:sender:received
	:in-reply-to:references:date:x-google-sender-auth:message-id:subject
	:from:to:cc:content-type:content-transfer-encoding;
	bh=7Rnux6gzTQowlwVRDy8ov7o4lqrdqINruw1qCpoEXyE=;
	b=m3E+GFhLd+O+gnQc8gXZI93mKN3dDN+/rJf0VKyCKZs7QnE1QspZy5ivsFOkG6cHTg
	yz7rISTZKDjsaq5BHcL1UINmWjuc/Scm+AnoilXMZPBz4Cm5ESqn2o1ccqPtY3nLK0qG
	AtiV4iFsR6NoVvs5GX/n8BF6WbyJt7vTsItCY=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	b=fs28eHzdxA85xV6vwXX6+o5vYhr9r3CsPzhz0ODn7b8nC5WMwUJv/8WPrtuARWfTEK
	l9eZuvxPKJx21qdyoTPyt+QZfd/E27VCT8WSYpLkeNg3s23gOWriVn7u2zXV6e7heUBv
	SboaJcO+WiVQp8EJpMaWgBAl9KV706t7G1XlI=
MIME-Version: 1.0
Received: by 10.216.7.210 with SMTP id 60mr1544969wep.30.1289420496884; Wed,
	10 Nov 2010 12:21:36 -0800 (PST)
Sender: yanegomi@gmail.com
Received: by 10.216.198.27 with HTTP; Wed, 10 Nov 2010 12:21:36 -0800 (PST)
In-Reply-To: <AANLkTikr0OZef1--ycTyZENbCuLNcQOnYDCWuLiJkWDz@mail.gmail.com>
References: <4CD7C8FC.900@icyb.net.ua> <ib8nas$9de$1@dough.gmane.org>
	<4CD7E515.5040209@icyb.net.ua> <4CD7E960.1070200@freebsd.org>
	<4CDA3CDD.5000404@freebsd.org>
	<AANLkTi=LcnhXNb+PrkvykvWKoFyHU79dH2F=g5vweS4X@mail.gmail.com>
	<4CDAEDB2.4010704@freebsd.org> <4CDAF5B1.4040501@freebsd.org>
	<AANLkTikr0OZef1--ycTyZENbCuLNcQOnYDCWuLiJkWDz@mail.gmail.com>
Date: Wed, 10 Nov 2010 12:21:36 -0800
X-Google-Sender-Auth: V2owgFY86dIsGIPTYZZW1uYjaJ8
Message-ID: <AANLkTimtRExk5g9hGhREpPjTGW5azLA=_E6YuoWboxXA@mail.gmail.com>
From: Garrett Cooper <gcooper@FreeBSD.org>
To: Andriy Gapon <avg@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>,
	freebsd-current@freebsd.org
Subject: Re: another fuse panic
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 20:21:39 -0000

On Wed, Nov 10, 2010 at 12:18 PM, Garrett Cooper <gcooper@freebsd.org> wrote:
> On Wed, Nov 10, 2010 at 11:42 AM, Andriy Gapon <avg@freebsd.org> wrote:
>> on 10/11/2010 21:08 Andriy Gapon said the following:
>>> on 10/11/2010 20:26 Sergey Kandaurov said the following:
>>>> Hi.
>>>> If I understood you correctly, then you need
>>>> PORTS_MODULES set in /etc/make.conf.
>>>
>>> It was a long time ago when I tried it last time, but I remember having problems
>>> with it during upgrades.
>>
>> I think this is what it was/is.
>> If a port in PORTS_MODULES has dependencies, then buildkernel would try to install
>> those dependencies even if they are already installed.  And that, obviously, would
>> fail.
>
> Didn't know about this knob -- cool!
>
> And FWIW, all it does is a:
>
> all
> install: deinstall reinstall (huh?)
> reinstall: deinstall reinstall (huh?)
> clean
>
> Seems like it should be:
>
> clean
> all
> [deinstall]
> install
> clean
>
> or:
>
> clean
> all
> install -DFORCE_PKG_REGISTER
> clean
>
> the first clean is just in case the PORTSWORKDIR is dirty.

And FWIW an even better idea might be to align the port with the
process in use, i.e.

clean (i.e. NO_CLEAN, KERNFAST, etc not specified) ->
[${PORTSDIR}/${PORT}] clean
buildkernel -> [${PORTSDIR}/${PORT}] all
installkernel -> [${PORTSDIR}/${PORT}] deinstall install

*shrugs*
-Garrett

From owner-freebsd-fs@FreeBSD.ORG  Wed Nov 10 23:00:45 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A1BEB106564A
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 23:00:45 +0000 (UTC)
	(envelope-from mark@exonetric.com)
Received: from relay0.exonetric.net (relay0.exonetric.net [82.138.248.161])
	by mx1.freebsd.org (Postfix) with ESMTP id 6F23D8FC0C
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 23:00:45 +0000 (UTC)
Received: from [192.168.0.7] (unknown [78.86.207.85])
	by relay0.exonetric.net (Postfix) with ESMTP id 2876E57004
	for <freebsd-fs@freebsd.org>; Wed, 10 Nov 2010 22:28:28 +0000 (GMT)
From: Mark Blackman <mark@exonetric.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Wed, 10 Nov 2010 22:28:27 +0000
Message-Id: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com>
To: freebsd-fs@freebsd.org
Mime-Version: 1.0 (Apple Message framework v1081)
X-Mailer: Apple Mail (2.1081)
Subject: ZFS and pathconf(_PC_NO_TRUNC)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Nov 2010 23:00:45 -0000

Hi, 

I note that when testing the pathconf(2) NO_TRUNC property 
on a ZFS filesystem, I get a ENOENT, "No such file or directory".

I'm not sure if this qualifies as correct behaviour, but thought
a learned soul on this list could enlighten me.

I've attached the C snippet I used for testing.

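#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[]){
  long result;  /* pathconf(2) returns a long */

  /* Require a path argument instead of dereferencing a NULL argv[1]. */
  if (argc < 2) {
    fprintf(stderr, "usage: %s path\n", argv[0]);
    return 1;
  }

  result=pathconf(argv[1], _PC_NO_TRUNC);
  printf("for %s: no_trunc is %ld\n",argv[1],result);
  if (result<0)
    perror(NULL);
  return 0;
}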
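
One thing that may be worth ruling out when reading the result: pathconf(2) is documented to return -1 without modifying errno when the variable has no limit or the functionality is not associated with the file, so an errno value seen afterwards can be stale. The usual pattern is to clear errno before the call and capture it straight afterwards, before anything like printf() gets a chance to change it. A minimal sketch follows; `query_pathconf` is a hypothetical wrapper name, and nothing here says what ZFS itself reports.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper around pathconf(2).
 * Returns  1 and stores the result in *value when a value is reported,
 *          0 when pathconf() returned -1 but left errno untouched
 *            (no limit / option not supported for this file),
 *         -1 on a real error, storing the errno value in *err. */
int query_pathconf(const char *path, int name, long *value, int *err)
{
    assert(path != NULL && value != NULL && err != NULL);

    errno = 0;
    long r = pathconf(path, name);
    int saved = errno;  /* capture before any other call can change it */

    if (r != -1) {
        *value = r;
        return 1;
    }
    if (saved == 0)
        return 0;
    *err = saved;
    return -1;
}
```

With this, a caller can print strerror() of the stored error only in the -1 case and report "not supported" in the 0 case.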

- Mark


From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 02:51:46 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 22ECE1065673;
	Thu, 11 Nov 2010 02:51:46 +0000 (UTC)
	(envelope-from kevlo@FreeBSD.org)
Received: from ns.kevlo.org (kevlo.org [220.128.136.52])
	by mx1.freebsd.org (Postfix) with ESMTP id A07AC8FC0A;
	Thu, 11 Nov 2010 02:51:45 +0000 (UTC)
Received: from [127.0.0.1] (kevlo@kevlo.org [220.128.136.52])
	by ns.kevlo.org (8.14.3/8.14.3) with ESMTP id oAB2OAAh020778;
	Thu, 11 Nov 2010 10:24:12 +0800 (CST)
From: Kevin Lo <kevlo@FreeBSD.org>
To: delphij@FreeBSD.org
Content-Type: text/plain; charset="UTF-8"
Date: Thu, 11 Nov 2010 10:24:56 +0800
Message-ID: <1289442296.2128.16.camel@monet>
Mime-Version: 1.0
X-Mailer: Evolution 2.30.3 FreeBSD GNOME Team Port 
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org
Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 02:51:46 -0000

Xin Li wrote:
> (cc'ed to freebsd-fs@)
>
> I think it's important that someone familiar with the code review and
> evaluate the current patches and commit it against -HEAD...

> MSDOSFS patch (against 7.1):
> http://btload.googlegroups.com/web/msdosfs.patch?gda=MzIscT8AAABs_gmy4a1S9lRiXjEy-V5OpwtI67JnIGlz0zr18tjObOtoi5oIt3BJMRGeqGBbbj-ccyFKn-rNKC-d1pM_IdV0
> NTFS patch:
> http://btload.googlegroups.com/web/ntfs.patch?gda=OqsHoDwAAABs_gmy4a1S9lRiXjEy-V5O7RN7t-m4MjZ-5dQn_EvaqDVCWO9_HyYEQJyRQYPtRCL9Wm-ajmzVoAFUlE7c_fAt

Hi Xin,

The MSDOSFS patch looks good to me. I've been testing this patch against -HEAD
for years and it seems to be working great. 
I hope this patch will be committed soon. Thanks!

> Cheers,
> - --
> Xin LI <delphij at delphij.net>	http://www.delphij.net/
> FreeBSD - The Power to Serve!

	Kevin


From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 04:00:35 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C0137106564A
	for <fs@freebsd.org>; Thu, 11 Nov 2010 04:00:35 +0000 (UTC)
	(envelope-from dougb@FreeBSD.org)
Received: from mail2.fluidhosting.com (mx23.fluidhosting.com [204.14.89.6])
	by mx1.freebsd.org (Postfix) with ESMTP id 641018FC12
	for <fs@freebsd.org>; Thu, 11 Nov 2010 04:00:35 +0000 (UTC)
Received: (qmail 25363 invoked by uid 399); 11 Nov 2010 04:00:32 -0000
Received: from localhost (HELO doug-optiplex.ka9q.net)
	(dougb@dougbarton.us@127.0.0.1)
	by localhost with ESMTPAM; 11 Nov 2010 04:00:32 -0000
X-Originating-IP: 127.0.0.1
X-Sender: dougb@dougbarton.us
Message-ID: <4CDB6A5F.2000908@FreeBSD.org>
Date: Wed, 10 Nov 2010 20:00:31 -0800
From: Doug Barton <dougb@FreeBSD.org>
Organization: http://SupersetSolutions.com/
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101028 Thunderbird/3.1.6
MIME-Version: 1.0
To: Aditya Sarawgi <sarawgi.aditya@gmail.com>
References: <20100929031825.L683@besplex.bde.org>
	<20100929084801.M948@besplex.bde.org>
	<20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org>
	<20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org>
	<AANLkTi=iTCG4aO-KO_gy7fp_96KcZ_TCyNk5OkLZUHV3@mail.gmail.com>
	<4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth>
	<4CD9E535.8000801@FreeBSD.org> <20101110170719.GA1573@earth>
In-Reply-To: <20101110170719.GA1573@earth>
X-Enigmail-Version: 1.1.2
OpenPGP: id=1A1ABC84
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: fs@freebsd.org
Subject: Re: ext2fs now extremely slow
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 04:00:35 -0000

On 11/10/2010 09:07, Aditya Sarawgi wrote:
> On Tue, Nov 09, 2010 at 04:20:05PM -0800, Doug Barton wrote:

>> Can you expand on that? What about it do you see as problematic?
>>
>
> ext2fs is not a native filesystem. It is relatively slower than UFS

Like I already said, I can live with a little bit slower.

> may have some other problems like deadlocks which are not yet discovered.

What better way to discover them than actual day to day use? :)

> May make the data inconsistent due to lack of facilities like journaling.

Well that's just plain unacceptable. Either the fs works reliably (which 
obviously includes safely) or it should be removed. At bare minimum if 
it can't reliably write data then support should be changed to read-only.

> It is only meant to make your data in linux partitions accessible.

Sorry, I'm not buying that. Don't get me wrong, I really appreciate the 
help you're providing, and I don't want to get you in the middle of my 
own personal vendetta, but we can't tolerate this perspective. We either 
need to support something, or not support it.

>>>>> You can test Zheng's preallocation patch for ext2fs, there is a
>>>>> serious lack of testers for that.
>>>>
>>>> I would be happy to do that, but my reading of this thread last month
>>>> didn't produce a clear "try this version of the patch" neon sign.
>>>> Various people referred to suggestions, updates, etc. If someone could
>>>> provide a URL for the right patch to try, as well as a suggestion for
>>>> benchmarking methodology, I'll be glad to do so.
>>>>
>>>
>>> I have attached the patch.
>>
>> Thanks for that. I'm curious though whether this is the latest version
>> of the patch with the suggested improvements from earlier in this thread?
>>
>
> There will be only style and some comment fixes in the new patch.

Ok, thanks.

> Yes, we know there are some scope of improvements. So the BSDL ext2fs lacks
> preallocation which used to preallocate some blocks which improved the sequential
> write performance. This problem is solved by Zheng's reservation window work.
> The other issue like Bruce mentioned is that some of the blocks in between are
> skipped by the block allocator algorithm. We intend to have fixes for both of
> these before 9 is released.

Ok, but time is running out. :)


Doug

-- 

	Nothin' ever doesn't change, but nothin' changes much.
			-- OK Go

	Breadth of IT experience, and depth of knowledge in the DNS.
	Yours for the right price.  :)  http://SupersetSolutions.com/


From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 05:49:17 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7B0B51065670;
	Thu, 11 Nov 2010 05:49:17 +0000 (UTC)
	(envelope-from buganini@gmail.com)
Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 2A8198FC12;
	Thu, 11 Nov 2010 05:49:16 +0000 (UTC)
Received: by iwn39 with SMTP id 39so1729243iwn.13
	for <multiple recipients>; Wed, 10 Nov 2010 21:49:16 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type;
	bh=YpXW1Py7o+5PTRqUc7qzg5uPksj9v0CxH/tCHbtFR28=;
	b=gwjya343sweUBebDqrGGldmabUtnN15Obr+/efI3ak2MBUUgvzCixDZJDi1AXHhQOU
	Iqo8LKIbNq23zpd7M4Dm2fQs75X5E1W4E7QxIazZMysk+VUwkm/24XtsAyhsosXN/4iO
	AbxaoKLKZcgEKBV6Yc9VJ5kxQJzv6AB9O3Ny8=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	b=L/sF2tfrGe+z5uzoqd1/z2mG1OboKK8s67ukMvdD7o9LLpjWCK/N2uTRS3wkn920q3
	NLGQ8L9h9qmb8734PXRv7vgY3Lw7cprg9zBh/hzmhL8WWVcbxnGB3i09yvc8vwmYgF0L
	il44czTOHIIhTpfATluc6zHgrPv6IuCLjEWH4=
MIME-Version: 1.0
Received: by 10.231.13.136 with SMTP id c8mr220627iba.19.1289452852095; Wed,
	10 Nov 2010 21:20:52 -0800 (PST)
Received: by 10.231.32.194 with HTTP; Wed, 10 Nov 2010 21:20:52 -0800 (PST)
In-Reply-To: <1289442296.2128.16.camel@monet>
References: <1289442296.2128.16.camel@monet>
Date: Thu, 11 Nov 2010 13:20:52 +0800
Message-ID: <AANLkTi=V5AhaPFG9EFnopiiEXUyA0-HsoiMfT=RKxhZF@mail.gmail.com>
From: Buganini <buganini@gmail.com>
To: Kevin Lo <kevlo@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-fs@freebsd.org, delphij@freebsd.org
Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 05:49:17 -0000

I'm using these two patches on CURRENT
http://security-hole.info/~buganini/patches/kiconv_msdosfs/

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 06:09:23 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 00A3A106566B
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 06:09:23 +0000 (UTC)
	(envelope-from dougb@FreeBSD.org)
Received: from mail2.fluidhosting.com (mx23.fluidhosting.com [204.14.89.6])
	by mx1.freebsd.org (Postfix) with ESMTP id 85CB48FC0A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 06:09:20 +0000 (UTC)
Received: (qmail 20813 invoked by uid 399); 11 Nov 2010 06:09:20 -0000
Received: from localhost (HELO doug-optiplex.ka9q.net)
	(dougb@dougbarton.us@127.0.0.1)
	by localhost with ESMTPAM; 11 Nov 2010 06:09:20 -0000
X-Originating-IP: 127.0.0.1
X-Sender: dougb@dougbarton.us
Message-ID: <4CDB888E.6030005@FreeBSD.org>
Date: Wed, 10 Nov 2010 22:09:18 -0800
From: Doug Barton <dougb@FreeBSD.org>
Organization: http://SupersetSolutions.com/
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101028 Thunderbird/3.1.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
X-Enigmail-Version: 1.1.2
OpenPGP: id=1A1ABC84
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Minor problem re-mounting ext2fs system
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 06:09:23 -0000

Even after a clean shutdown I get this error at nearly every reboot:

kernel: last write time is in the future.
kernel: (by less than a day, probably due to the hardware clock being incorrectly set).
kernel: FIXED.

It does get fixed, but it requires the fsck program from e2fsprogs to do 
it.


Doug

-- 

	Nothin' ever doesn't change, but nothin' changes much.
			-- OK Go

	Breadth of IT experience, and depth of knowledge in the DNS.
	Yours for the right price.  :)  http://SupersetSolutions.com/


From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 10:06:37 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 26BE2106567A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 10:06:37 +0000 (UTC)
	(envelope-from rs@bytecamp.net)
Received: from mail.bytecamp.net (mail.bytecamp.net [212.204.60.9])
	by mx1.freebsd.org (Postfix) with ESMTP id 5DA938FC13
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 10:06:35 +0000 (UTC)
Received: (qmail 95076 invoked by uid 89); 11 Nov 2010 10:39:55 +0100
Received: from stella.bytecamp.net (HELO ?212.204.60.37?)
	(rs%bytecamp.net@212.204.60.37)
	by mail.bytecamp.net with CAMELLIA256-SHA encrypted SMTP;
	11 Nov 2010 10:39:55 +0100
Message-ID: <4CDBB9EB.8010908@bytecamp.net>
Date: Thu, 11 Nov 2010 10:39:55 +0100
From: Robert Schulze <rs@bytecamp.net>
Organization: bytecamp GmbH
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
	rv:1.9.1.15) Gecko/20101027 Thunderbird/3.0.10
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: nfsd stuck in *rc_lock state
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 10:06:37 -0000

Hello everybody,

We are running several 8.1 NFS clients connected to an 8.0 NFS server.
The system ran fine, but today the NFS server locked up.

nfsd ate 100% of CPU, with one {nfsd: service} thread in state 
"*rc_lock". I tried to find others with the same issue and finally ended 
up at a patch from Rick:

http://people.freebsd.org/~rmacklem/freebsd8.1-patches/replay.patch

May I apply this patch to an 8.0 system to fix this issue, or are there 
any other patches/commits that affect this?

with kind regards,

Robert Schulze

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 10:19:27 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 62EA5106564A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 10:19:27 +0000 (UTC)
	(envelope-from gljennjohn@googlemail.com)
Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com
	[209.85.161.54])
	by mx1.freebsd.org (Postfix) with ESMTP id E27728FC0A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 10:19:26 +0000 (UTC)
Received: by fxm19 with SMTP id 19so1160724fxm.13
	for <multiple recipients>; Thu, 11 Nov 2010 02:19:25 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=googlemail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:in-reply-to:references:reply-to:x-mailer:mime-version
	:content-type:content-transfer-encoding;
	bh=UW+revAW1FFfXwu6BGHdliMXC85Rvb1K5URo5T3/CJk=;
	b=RqunH0l5ZZr2t3wK+tueHR/yc2Wj0ZwF+Owro4+xiL5zQpb6GWdbItsimYkUKX5sTM
	LfbTc24UJLDXIhpQsio4gaoj/kG8b4VLgSvn7xGnG1IzqM+mFHYtJFWh9mIJMSSq4l2U
	sWXILZrz/pHZIF4LhIQoI+1p9KBxZOAxmfhqM=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to
	:x-mailer:mime-version:content-type:content-transfer-encoding;
	b=T1r/tjoeMu3itCSNeUK4/egKrkBTqssVUQJfBVkyFs9uPpW66Luyu/niVDDbOlGC9x
	uLsqbt62hDx04M9shCeb59En2lzzArEx1P6HKaE/2aQSgxKEmin0LQIn3OxJEvxYNIox
	gdCck53Y+kHTHZUNQGysR2Nf4/1VsgQFxHMUU=
Received: by 10.223.106.210 with SMTP id y18mr110008fao.108.1289470765068;
	Thu, 11 Nov 2010 02:19:25 -0800 (PST)
Received: from ernst.jennejohn.org (p578E1734.dip.t-dialin.net [87.142.23.52])
	by mx.google.com with ESMTPS id o7sm836079fal.3.2010.11.11.02.19.23
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Thu, 11 Nov 2010 02:19:24 -0800 (PST)
Date: Thu, 11 Nov 2010 11:19:22 +0100
From: Gary Jennejohn <gljennjohn@googlemail.com>
To: delphij@FreeBSD.org
Message-ID: <20101111111922.7fe8ab19@ernst.jennejohn.org>
In-Reply-To: <1289442296.2128.16.camel@monet>
References: <1289442296.2128.16.camel@monet>
X-Mailer: Claws Mail 3.7.6 (GTK+ 2.18.7; amd64-portbld-freebsd9.0)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org
Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: gljennjohn@googlemail.com
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 10:19:27 -0000

Xin Li wrote:
> > (cc'ed to freebsd-fs@)
> >
> > I think it's important that someone familiar with the code review and
> > evaluate the current patches and commit it against -HEAD...
> 
> > MSDOSFS patch (against 7.1):
> > http://btload.googlegroups.com/web/msdosfs.patch?gda=MzIscT8AAABs_gmy4a1S9lRiXjEy-V5OpwtI67JnIGlz0zr18tjObOtoi5oIt3BJMRGeqGBbbj-ccyFKn-rNKC-d1pM_IdV0
> > NTFS patch:
> > http://btload.googlegroups.com/web/ntfs.patch?gda=OqsHoDwAAABs_gmy4a1S9lRiXjEy-V5O7RN7t-m4MjZ-5dQn_EvaqDVCWO9_HyYEQJyRQYPtRCL9Wm-ajmzVoAFUlE7c_fAt
> 

Just FYI.

The NTFS patch is no longer found and the page is reported as no longer
existing.

-- 
Gary Jennejohn

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 12:06:36 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 03356106564A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 12:06:36 +0000 (UTC)
	(envelope-from martin@lispworks.com)
Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com
	[193.34.186.230])
	by mx1.freebsd.org (Postfix) with ESMTP id 920AD8FC14
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 12:06:34 +0000 (UTC)
Received: from higson.cam.lispworks.com
	(IDENT:U2FsdGVkX1+e19cTDG6C3wTBNk2namDyp3/fFqQ2vxY@higson
	[192.168.1.7])
	by lwfs1-cam.cam.lispworks.com (8.14.3/8.14.3) with ESMTP id
	oABC6VkA002088; Thu, 11 Nov 2010 12:06:31 GMT
	(envelope-from martin@lispworks.com)
Received: from higson.cam.lispworks.com by higson.cam.lispworks.com (8.13.1)
	id oABC6VGe027666; Thu, 11 Nov 2010 12:06:31 GMT
Received: (from martin@localhost)
	by higson.cam.lispworks.com (8.13.1/8.13.1/Submit) id oABC6VYG027663;
	Thu, 11 Nov 2010 12:06:31 GMT
Date: Thu, 11 Nov 2010 12:06:31 GMT
Message-Id: <201011111206.oABC6VYG027663@higson.cam.lispworks.com>
From: Martin Simmons <martin@lispworks.com>
To: mark@exonetric.com
In-reply-to: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com> (message
	from Mark Blackman on Wed, 10 Nov 2010 22:28:27 +0000)
References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and pathconf(_PC_NO_TRUNC)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 12:06:36 -0000

>>>>> On Wed, 10 Nov 2010 22:28:27 +0000, Mark Blackman said:
> 
> I note that when testing the pathconf(2) NO_TRUNC property 
> on a ZFS filesystem, I get a ENOENT, "No such file or directory".
> 
> I'm not sure if this qualifies as correct behaviour, but thought
> a learned soul on this list could enlighten me.
> 
> I've attached the C snippet I used for testing.
> 
> #include <unistd.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> #include <errno.h>
> 
> int main(int argc, char *argv[]){
>   int result;
> 
>   result=pathconf(argv[1], _PC_NO_TRUNC);
>   printf("for %s: no_trunc is %d\n",argv[1],result);
>   if (result<0)
>     perror(NULL);
>   1;
> }

Your call to printf is clobbering the real errno, which is EINVAL.  That is an
allowed value according to the pathconf man page:

     [EINVAL]           The implementation does not support an association of
                        the variable name with the associated file.

So it is correct, but maybe not useful.
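
A minimal sketch of a test that avoids the clobbering, assuming the only
goal is to report the errno left by pathconf() itself.  The helper names
query_no_trunc() and report_no_trunc() are invented for illustration.  errno
is cleared before the call and copied right after it, before printf gets a
chance to change it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Query _PC_NO_TRUNC for `path` and hand back the errno value that
 * pathconf() itself left behind.  (Hypothetical helper, not a libc API.)
 */
long
query_no_trunc(const char *path, int *saved_errno)
{
	long result;

	errno = 0;			/* so "unchanged" is detectable */
	result = pathconf(path, _PC_NO_TRUNC);
	*saved_errno = errno;		/* copy before any other library call */
	return (result);
}

/* Print the result using the saved errno, not the live one. */
void
report_no_trunc(const char *path)
{
	int err;
	long result;

	result = query_no_trunc(path, &err);
	printf("for %s: no_trunc is %ld\n", path, result);
	if (result == -1 && err != 0)
		printf("pathconf: %s\n", strerror(err));
}
```

Calling report_no_trunc(argv[1]) from main() would then show the error that
pathconf() reported, whatever printf does to errno afterwards.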

__Martin

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 12:10:53 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D5131106564A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 12:10:53 +0000 (UTC)
	(envelope-from mark@exonetric.com)
Received: from relay0.exonetric.net (relay0.exonetric.net [82.138.248.161])
	by mx1.freebsd.org (Postfix) with ESMTP id 9EC1A8FC15
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 12:10:53 +0000 (UTC)
Received: from [192.168.111.107] (unknown [62.244.179.66])
	by relay0.exonetric.net (Postfix) with ESMTP id 990BF57228;
	Thu, 11 Nov 2010 12:10:52 +0000 (GMT)
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: text/plain; charset=us-ascii
From: Mark Blackman <mark@exonetric.com>
In-Reply-To: <201011111206.oABC6VYG027663@higson.cam.lispworks.com>
Date: Thu, 11 Nov 2010 12:10:36 +0000
Content-Transfer-Encoding: quoted-printable
Message-Id: <86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com>
References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com>
	<201011111206.oABC6VYG027663@higson.cam.lispworks.com>
To: Martin Simmons <martin@lispworks.com>
X-Mailer: Apple Mail (2.1081)
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and pathconf(_PC_NO_TRUNC)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 12:10:53 -0000


On 11 Nov 2010, at 12:06, Martin Simmons wrote:

>>>>>> On Wed, 10 Nov 2010 22:28:27 +0000, Mark Blackman said:
>>
>> #include <unistd.h>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <errno.h>
>>
>> int main(int argc, char *argv[]){
>>  int result;
>>
>>  result=pathconf(argv[1], _PC_NO_TRUNC);
>>  printf("for %s: no_trunc is %d\n",argv[1],result);
>>  if (result<0)
>>    perror(NULL);
>>  1;
>> }
>
> Your call to printf is clobbering the real errno, which is EINVAL.

Doh! thanks for pointing that out. :)

> That is an
> allowed value according to the pathconf man page:
>
>     [EINVAL]           The implementation does not support an association of
>                        the variable name with the associated file.
>
> So it is correct, but maybe not useful.

hmm. this is popping up in the context of building perl 5.12 on a zfs-only
filesystem. One of the POSIX::* tests fails because of the above.

- Mark


From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 12:25:04 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7C46B106564A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 12:25:04 +0000 (UTC)
	(envelope-from gleb.kurtsou@gmail.com)
Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com
	[209.85.215.182])
	by mx1.freebsd.org (Postfix) with ESMTP id CD3FD8FC1D
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 12:25:03 +0000 (UTC)
Received: by eyb7 with SMTP id 7so1020245eyb.13
	for <multiple recipients>; Thu, 11 Nov 2010 04:25:02 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:cc:subject
	:message-id:references:mime-version:content-type:content-disposition
	:in-reply-to:user-agent;
	bh=Y0K/XQGdHdjpHnZ8NZaSyIOdOcHMMP1v7v4eFcrgfHU=;
	b=V0EEWOSjCv3YGY0Vlz4q5tUIw3BIFW3j+C97eYt7qxBASOwhO6hGh7JnI0RPw5uYw/
	xBKOmqabQMYQ03zz75XCtM2TGThcFXwzNJEs50tb6KAVzY7wVyc0SC6myEW7LBI0Twwh
	vRzrIL8zIIdlUhbYz3TSwP0putztjyiP7SEG8=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	b=bPA5ac9Iip3ioTDRKxPZiz2eizM/D0201lmXn7ep+RSIJxaP4oslTuaNyIul/2tWza
	2//de9/yzQnjqoQMP9PHktr9SZTNsxlKBQWaaTMaYk5hq7DOCNL8rXPXYL+JUGPdzBO3
	V5zJ5bNoCwLEQLYW5aYY0miFU2ETGnHdnf+BQ=
Received: by 10.213.33.78 with SMTP id g14mr1654254ebd.3.1289478301577;
	Thu, 11 Nov 2010 04:25:01 -0800 (PST)
Received: from localhost ([212.98.186.134])
	by mx.google.com with ESMTPS id q58sm1867625eeh.3.2010.11.11.04.24.59
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Thu, 11 Nov 2010 04:25:00 -0800 (PST)
Date: Thu, 11 Nov 2010 14:24:55 +0200
From: Gleb Kurtsou <gleb.kurtsou@gmail.com>
To: Buganini <buganini@gmail.com>
Message-ID: <20101111122455.GA2098@tops>
References: <1289442296.2128.16.camel@monet>
	<AANLkTi=V5AhaPFG9EFnopiiEXUyA0-HsoiMfT=RKxhZF@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <AANLkTi=V5AhaPFG9EFnopiiEXUyA0-HsoiMfT=RKxhZF@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, Kevin Lo <kevlo@freebsd.org>, delphij@freebsd.org
Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 12:25:04 -0000

On (11/11/2010 13:20), Buganini wrote:
> I'm using these two patches on CURRENT
> http://security-hole.info/~buganini/patches/kiconv_msdosfs/

The patch looks worth committing.  I once started working on a very similar
solution but never had time to finish it.

What do you think about importing the lower/upper-case character Unicode
ranges and extending kiconv to remove the locale option (-L) from msdosfs?

Thanks,
Gleb.

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 13:22:59 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0FCA7106566B
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 13:22:59 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au
	[211.29.132.188])
	by mx1.freebsd.org (Postfix) with ESMTP id A228A8FC0C
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 13:22:58 +0000 (UTC)
Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au
	(c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73])
	by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	oABDMoZN016053
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 12 Nov 2010 00:22:51 +1100
Date: Fri, 12 Nov 2010 00:22:50 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Martin Simmons <martin@lispworks.com>
In-Reply-To: <201011111206.oABC6VYG027663@higson.cam.lispworks.com>
Message-ID: <20101112000011.A1372@besplex.bde.org>
References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com>
	<201011111206.oABC6VYG027663@higson.cam.lispworks.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and pathconf(_PC_NO_TRUNC)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 13:22:59 -0000

On Thu, 11 Nov 2010, Martin Simmons wrote:

>>>>>> On Wed, 10 Nov 2010 22:28:27 +0000, Mark Blackman said:
>>
>> I note that when testing the pathconf(2) NO_TRUNC property
>> on a ZFS filesystem, I get a ENOENT, "No such file or directory".
>>
>> I'm not sure if this qualifies as correct behaviour, but thought
>> a learned soul on this list could enlighten me.
>>
>> I've attached the C snippet I used for testing.
>>
>> #include <unistd.h>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <errno.h>
>>
>> int main(int argc, char *argv[]){
>>   int result;
>>
>>   result=pathconf(argv[1], _PC_NO_TRUNC);
>>   printf("for %s: no_trunc is %d\n",argv[1],result);
>>   if (result<0)
>>     perror(NULL);
>>   1;
>> }
>
> Your call to printf is clobbering the real errno, which is EINVAL.  That is an
> allowed value according to the pathconf man page:
>
>     [EINVAL]           The implementation does not support an association of
>                        the variable name with the associated file.
>
> So it is correct, but maybe not useful.

I think POSIX requires (in cross-references that are hard to read in
ASCII versions) _PC_NO_TRUNC to be supported for directories (only).
POSIX clearly says that whether _PC_NO_TRUNC is supported for
non-directories is implementation-defined.

For some feature test variables, it may be necessary to test both the
pathconf variable and the compile-time variable (_POSIX_NO_TRUNC here).
Under FreeBSD, _POSIX_NO_TRUNC is 1, which means that this feature
applies to all files, but this is just wrong since individual file
systems can and do return 0 for _PC_NO_TRUNC (msdosfs is one example
-- it has to allow truncation since file names are often too long for
8.3 format, and truncating them is what msdos would do).  So applications
shouldn't trust _POSIX_NO_TRUNC.  Fortunately, it is easiest to never
use it (except in programs that test for bugs like the FreeBSD one).
It is not useful, since even if it is correct then in most cases it
will be 0 (meaning that whether the feature applies is fs-dependent,
so that _PC_NO_TRUNC must be used).  It is easiest to always just use
_PC_NO_TRUNC and not use the ifdef tangle involving _POSIX_NO_TRUNC that
is needed to sometimes avoid using _PC_NO_TRUNC.
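
A rough sketch of the "always just ask at run time" approach, ignoring
_POSIX_NO_TRUNC entirely.  It assumes that a return of 0, or of -1 with
errno left untouched, both mean the option is not in effect for that
directory; the enum and the function name classify_no_trunc() are made up
for this example.

```c
#include <errno.h>
#include <unistd.h>

enum no_trunc_state {
	NO_TRUNC_ENFORCED,	/* over-long names are rejected */
	NO_TRUNC_NOT_ENFORCED,	/* over-long names may be truncated */
	NO_TRUNC_ERROR		/* pathconf() failed; see *errp */
};

/*
 * Ask the file system holding `dir` about _PC_NO_TRUNC and classify the
 * answer.  *errp receives the errno of a failed call, or 0 otherwise.
 * (Hypothetical helper; the mapping of 0 and -1 is an assumption.)
 */
enum no_trunc_state
classify_no_trunc(const char *dir, int *errp)
{
	long v;

	errno = 0;
	v = pathconf(dir, _PC_NO_TRUNC);
	*errp = (v == -1) ? errno : 0;
	if (v == -1)
		return (*errp != 0 ? NO_TRUNC_ERROR : NO_TRUNC_NOT_ENFORCED);
	return (v != 0 ? NO_TRUNC_ENFORCED : NO_TRUNC_NOT_ENFORCED);
}
```

Keeping the error case separate matters here, since an EINVAL from the file
system is not the same answer as "truncation is allowed".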

msdosfs may also be wrong in returning 0 for _PC_NO_TRUNC in both the 8.3
and longnames cases.  I don't know what msdos does in the longnames case,
but with long names there is no need to truncate.

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 13:54:41 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 74444106564A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 13:54:41 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au
	[211.29.132.188])
	by mx1.freebsd.org (Postfix) with ESMTP id 10A118FC0C
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 13:54:40 +0000 (UTC)
Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au
	(c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73])
	by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	oABDsYZg013200
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 12 Nov 2010 00:54:35 +1100
Date: Fri, 12 Nov 2010 00:54:34 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Mark Blackman <mark@exonetric.com>
In-Reply-To: <86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com>
Message-ID: <20101112002522.V1372@besplex.bde.org>
References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com>
	<201011111206.oABC6VYG027663@higson.cam.lispworks.com>
	<86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and pathconf(_PC_NO_TRUNC)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 13:54:41 -0000

On Thu, 11 Nov 2010, Mark Blackman wrote:

> On 11 Nov 2010, at 12:06, Martin Simmons wrote:
>> Your call to printf is clobbering the real errno, which is EINVAL.
>
> Doh! thanks for pointing that out. :)
>
>> That is an
>> allowed value according to the pathconf man page:
>>
>>     [EINVAL]           The implementation does not support an association of
>>                        the variable name with the associated file.
>>
>> So it is correct, but maybe not useful.
>
> hmm. this is popping up in the context of building perl 5.12 on a zfs-only
> filesystem. One of the POSIX::* tests fails because of the above.

zfs_vnops.c:zfs_pathconf() is missing _PC_NO_TRUNC, so it seems to be just
broken (it returns EOPNOTSUPP for cases not in the switch, so there seems
to be no way for another level to support _PC_NO_TRUNC).  It apparently
depends on another layer providing defaults.

Other basic things missing in it:
_PC_NAME_MAX
_PC_CHOWN_RESTRICTED
_PC_PIPE_BUF
[several other things that are in the switch statement in vop_stdpathconf(),
  but which are nonsense there since they only apply to device files and
  should depend on the file anyway, and which don't apply to zfs or any
  normal file system since device files on normal file systems are no longer
  supported]

_PC_PIPE_BUF is not quite like the features that only apply to device
files.  It applies to named pipes, and since there is no defaulting of
_PC_* in FreeBSD, all file systems that support named pipes must support
it in their pathconf vop although it has nothing to do with file systems.
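
One way to check from userland which of these variables a given file system
actually answers for is to probe them one by one.  This is only a sketch:
the table and the name probe_pathconf() are invented here, and the set of
variables is just the ones mentioned in this thread.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

struct pc_var {
	int name;
	const char *label;
};

/* Variables discussed in the thread; extend as needed. */
static const struct pc_var pc_vars[] = {
	{ _PC_NO_TRUNC, "_PC_NO_TRUNC" },
	{ _PC_NAME_MAX, "_PC_NAME_MAX" },
	{ _PC_CHOWN_RESTRICTED, "_PC_CHOWN_RESTRICTED" },
	{ _PC_PIPE_BUF, "_PC_PIPE_BUF" },
};

/*
 * Query each variable for `path`, print one line per variable to `out`,
 * and return how many queries failed with an error.
 */
int
probe_pathconf(const char *path, FILE *out)
{
	size_t i;
	int err, failures = 0;
	long v;

	for (i = 0; i < sizeof(pc_vars) / sizeof(pc_vars[0]); i++) {
		errno = 0;
		v = pathconf(path, pc_vars[i].name);
		err = errno;	/* save before fprintf can change it */
		if (v == -1 && err != 0) {
			fprintf(out, "%-22s error: %s\n", pc_vars[i].label,
			    strerror(err));
			failures++;
		} else if (v == -1)
			fprintf(out, "%-22s no limit or not supported\n",
			    pc_vars[i].label);
		else
			fprintf(out, "%-22s %ld\n", pc_vars[i].label, v);
	}
	return (failures);
}
```

Running probe_pathconf() against a directory and against a fifo on the same
file system would show whether the answers differ by file type.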

Fortunately, pathconf() is never used except by naive programs like perl :-).

_PC_NAME_MAX is used by patch(1) in FreeBSD, but patch(1) also has an ifdef
tangle using _POSIX_NAME_MAX and other messes which I think allows patch
to work accidentally if zfs returns EOPNOTSUPP: from backupfile.c:

% void
% addext(char *filename, char *ext, int e)
% {
%   char *s = (char *)(uintptr_t)(const void *)basename (filename);
%   int slen = strlen (s), extlen = strlen (ext);
%   long slen_max = -1;
% 
% #if HAVE_PATHCONF && defined (_PC_NAME_MAX)
% #ifndef _POSIX_NAME_MAX
% #define _POSIX_NAME_MAX 14
% #endif

_POSIX_NAME_MAX is always 14 on POSIX systems, so this ifdef is no help.

%   if (slen + extlen <= _POSIX_NAME_MAX)
%     /* The file name is so short there's no need to call pathconf.  */
%     slen_max = _POSIX_NAME_MAX;
%   else if (s == filename)
%     slen_max = pathconf (".", _PC_NAME_MAX);

I think we get here and pathconf() fails for names of length just 15 or
greater.

%   else
%     {
%       char c = *s;
%       *s = 0;
%       slen_max = pathconf (filename, _PC_NAME_MAX);
%       *s = c;
%     }
% #endif
%   if (slen_max == -1) {
% #ifdef HAVE_LONG_FILE_NAMES
%     slen_max = 255;

We get here on error (since although FreeBSD only has long file names on
some file systems, patch is misconfigured, possibly by configuring it on
a normal file system that has long names, so HAVE_LONG_FILE_NAMES is set
unconditionally in the hard-configured config.h), so the max is essentially 
hard-coded as 255 if pathconf() fails.

% #else
%     slen_max = 14;
% #endif
%   }
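
Stripped of the ifdefs, the logic above amounts to "ask pathconf(), and use
a hard-coded maximum if it returns -1".  A minimal sketch of that, assuming
a fixed fallback of 255 and leaving out the short-name shortcut;
name_max_for() and FALLBACK_NAME_MAX are hypothetical names, not taken from
patch(1).

```c
#include <unistd.h>

/* Assumed default, mirroring the HAVE_LONG_FILE_NAMES branch above. */
#define FALLBACK_NAME_MAX 255

/*
 * Return the maximum file name length for directory `dir`, or the
 * fallback when pathconf() returns -1 (an error or "no limit").
 */
long
name_max_for(const char *dir)
{
	long v;

	v = pathconf(dir, _PC_NAME_MAX);
	if (v == -1)
		return (FALLBACK_NAME_MAX);
	return (v);
}
```

Like the original, this cannot tell a failed pathconf() from a file system
with no limit, which is why a broken pathconf vop goes unnoticed here.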

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 14:17:58 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7241D1065693
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 14:17:58 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au
	[211.29.132.185])
	by mx1.freebsd.org (Postfix) with ESMTP id 0DBCC8FC16
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 14:17:57 +0000 (UTC)
Received: from c122-107-121-73.carlnfd1.nsw.optusnet.com.au
	(c122-107-121-73.carlnfd1.nsw.optusnet.com.au [122.107.121.73])
	by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	oABEHpoa020457
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 12 Nov 2010 01:17:52 +1100
Date: Fri, 12 Nov 2010 01:17:51 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20101112002522.V1372@besplex.bde.org>
Message-ID: <20101112005719.O1598@besplex.bde.org>
References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com>
	<201011111206.oABC6VYG027663@higson.cam.lispworks.com>
	<86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com>
	<20101112002522.V1372@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and pathconf(_PC_NO_TRUNC)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 14:17:58 -0000

On Fri, 12 Nov 2010, Bruce Evans wrote:

> On Thu, 11 Nov 2010, Mark Blackman wrote:
>
>> On 11 Nov 2010, at 12:06, Martin Simmons wrote:
>>> Your call to printf is clobbering the real errno, which is EINVAL.
>> 
>> Doh! thanks for pointing that out. :)
>> 
>>> That is an
>>> allowed value according to the pathconf man page:
>>> 
>>>     [EINVAL]           The implementation does not support an association of
>>>                        the variable name with the associated file.
>>> 
>>> So it is correct, but maybe not useful.
>> 
>> hmm. this is popping up in the context of building perl 5.12 on a zfs-only
>> filesystem. One of the POSIX::* tests fails because of the above.
>
> zfs_vnops.c:zfs_pathconf() is missing _PC_NO_TRUNC, so it seems to be just
> broken (it returns EOPNOTSUPP for cases not in the switch, so there seems
> to be no way for another level to support _PC_NO_TRUNC).  It apparently
> depends on another layer providing defaults.
>
> Other basic things missing in it:
> _PC_NAME_MAX
> _PC_CHOWN_RESTRICTED
> _PC_PIPE_BUF
> [several other things that are in the switch statement in vop_stdpathconf(),

Oops, more careful grepping shows that zfs uses its own layer
(zfs_freebsd_pathconf()) to provide defaults by calling vop_stdpathconf()
if zfs_pathconf() fails with error EOPNOTSUPP, so it should never fail
for _PC_NO_TRUNC or the other basic ones.

But there seem to be problems somewhere.  EOPNOTSUPP is not a possible error
for pathconf().  EINVAL must be returned for unsupported feature tests.
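
The layering described above can be modelled in userland roughly as follows.
This is not the kernel code: the three model_*() functions and the values
in their tables are invented for illustration, and the only point is the
control flow (try the fs-specific handler, fall back on EOPNOTSUPP, and
never let EOPNOTSUPP escape to the caller).

```c
#include <errno.h>
#include <unistd.h>

/* Stand-in for a fs-specific handler that knows only some variables. */
int
model_fs_pathconf(int name, long *retval)
{
	switch (name) {
	case _PC_LINK_MAX:
		*retval = 32767;	/* arbitrary illustrative value */
		return (0);
	default:
		return (EOPNOTSUPP);	/* "not handled at this layer" */
	}
}

/* Stand-in for a generic layer that supplies defaults. */
int
model_std_pathconf(int name, long *retval)
{
	switch (name) {
	case _PC_NO_TRUNC:
		*retval = 1;		/* illustrative value */
		return (0);
	case _PC_NAME_MAX:
		*retval = 255;		/* illustrative value */
		return (0);
	default:
		return (EINVAL);
	}
}

/* Wrapper: fs handler first, defaults second, EINVAL as the last word. */
int
model_layered_pathconf(int name, long *retval)
{
	int error;

	error = model_fs_pathconf(name, retval);
	if (error == EOPNOTSUPP)
		error = model_std_pathconf(name, retval);
	if (error == EOPNOTSUPP)
		error = EINVAL;	/* pathconf() callers expect EINVAL */
	return (error);
}
```

With this shape, a variable unknown to both layers comes back as EINVAL
rather than EOPNOTSUPP.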

ffs handles all the cases directly, except it is more careful for fifos --
it calls fifo_pathconf() for these, where zfs seems to succeed for a lot
of features that shouldn't apply to fifos, and then falls back to
vop_stdpathconf() which succeeds for even more features where it shouldn't.
fifo_pathconf() only succeeds for _PC_LINK_MAX, _PC_PIPE_BUF and
_PC_CHOWN_RESTRICTED.  Its setting of _PC_LINK_MAX is wrong, since the
value of this is fs-dependent.  _PC_PIPE_BUF is of course pipe+fifo-dependent
so its setting belongs here and somewhere for pipes (I can't see where it
is supported for fpathconf() on pipes).  _PC_CHOWN_RESTRICTED also shouldn't
be set here, but perhaps fifos can know a system-wide setting and repeat it,
as other file systems do.

Correct layering would probably result in vop_stdpathconf() not existing.
It is currently more or less correct only for devfs, and is only used by
zfs, coda, devfs, fdescfs and portalfs.

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 14:32:28 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DD3991065670
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 14:32:28 +0000 (UTC)
	(envelope-from mark@exonetric.com)
Received: from relay0.exonetric.net (relay0.exonetric.net [82.138.248.161])
	by mx1.freebsd.org (Postfix) with ESMTP id A503A8FC17
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 14:32:28 +0000 (UTC)
Received: from [192.168.111.107] (unknown [62.244.179.66])
	by relay0.exonetric.net (Postfix) with ESMTP id 79F3A57228;
	Thu, 11 Nov 2010 14:32:27 +0000 (GMT)
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: text/plain; charset=us-ascii
From: Mark Blackman <mark@exonetric.com>
In-Reply-To: <20101112005719.O1598@besplex.bde.org>
Date: Thu, 11 Nov 2010 14:32:25 +0000
Content-Transfer-Encoding: quoted-printable
Message-Id: <43792CD8-6127-4402-8A4F-5DFD894F7256@exonetric.com>
References: <871369D9-7D63-4CE0-BB87-B8C46A62B271@exonetric.com>
	<201011111206.oABC6VYG027663@higson.cam.lispworks.com>
	<86371A88-1474-4A51-8C84-05C4A71A9135@exonetric.com>
	<20101112002522.V1372@besplex.bde.org>
	<20101112005719.O1598@besplex.bde.org>
To: Bruce Evans <brde@optusnet.com.au>
X-Mailer: Apple Mail (2.1081)
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and pathconf(_PC_NO_TRUNC)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 14:32:28 -0000


On 11 Nov 2010, at 14:17, Bruce Evans wrote:
>  where zfs seems to succeed for a lot
> of features that shouldn't apply to fifos, and then falls back to
> vop_stdpathconf() which succeeds for even more features where it shouldn't.
> fifo_pathconf() only succeeds for _PC_LINK_MAX, _PC_PIPE_BUF and
> _PC_CHOWN_RESTRICTED.  Its setting of _PC_LINK_MAX is wrong, since the
> value of this is fs-dependent.  _PC_PIPE_BUF is of course pipe+fifo-dependent
> so its setting belongs here and somewhere for pipes (I can't see where it
> is supported for fpathconf() on pipes).  _PC_CHOWN_RESTRICTED also shouldn't
> be set here, but perhaps fifos can know a system-wide setting and repeat it,
> as other file systems do.
>
> Correct layering would probably result in vop_stdpathconf() not existing.
> It is currently more or less correct only for devfs, and is only used by
> zfs, coda, devfs, fdescfs and portalfs.
>
> Bruce

Ok, I'll file that bug then. I wonder how Solaris handles pathconf, but that's
another question.

Cheers,
Mark

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 20:23:11 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EACC0106566C;
	Thu, 11 Nov 2010 20:23:11 +0000 (UTC) (envelope-from lists@mawer.org)
Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com
	[209.85.212.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 95AE58FC19;
	Thu, 11 Nov 2010 20:23:11 +0000 (UTC)
Received: by vws20 with SMTP id 20so623653vws.13
	for <multiple recipients>; Thu, 11 Nov 2010 12:23:10 -0800 (PST)
MIME-Version: 1.0
Received: by 10.229.236.196 with SMTP id kl4mr1110283qcb.109.1289505467057;
	Thu, 11 Nov 2010 11:57:47 -0800 (PST)
Received: by 10.229.91.66 with HTTP; Thu, 11 Nov 2010 11:57:46 -0800 (PST)
In-Reply-To: <AANLkTi=KtGJjhEzRnQN78koFQqvtFC1c+qVfunUdae+w@mail.gmail.com>
References: <ib8qda$nat$1@dough.gmane.org> <201011081004.59640.jhb@freebsd.org>
	<20101108151028.GI2392@deviant.kiev.zoral.com.ua>
	<AANLkTi=KtGJjhEzRnQN78koFQqvtFC1c+qVfunUdae+w@mail.gmail.com>
Date: Fri, 12 Nov 2010 06:57:46 +1100
Message-ID: <AANLkTikpOs9TzjqV3TUZCTYW9zmzvNr2FtYaDWhMY+=R@mail.gmail.com>
From: Antony Mawer <lists@mawer.org>
To: Ivan Voras <ivoras@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-fs@freebsd.org
Subject: Re: The state of Giant lock in the file systems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 20:23:12 -0000

On Tue, Nov 9, 2010 at 2:28 AM, Ivan Voras <ivoras@freebsd.org> wrote:
> On 8 November 2010 16:10, Kostik Belousov <kostikbel@gmail.com> wrote:
>
>> I already claimed several times that I will remove VFS_LOCK_GIANT
>> after smbfs is locked. Patch for removal is sitting in my repository
>> for almost a year.
>
> Ok, I've made a little table here:
>
> http://wiki.freebsd.org/MPSAFE_VFS

FYI - NWFS is still functional in 8.x (there are some minor but
annoying bugs, e.g. the root node path resolution occasionally trips
over itself causing the mount point to become inaccessible, but that's
been there since 4.x days), and I am happy to test any locking changes
to it.

>From memory NWFS and SMBFS share similar locking strategies so what
gets done to one typically gets applied to the other. This got hit
early in the 6.0 beta series where SMBFS had VFS locking changes which
hadn't been applied to NWFS. On that occasion we were able to work
with truckman@ to isolate the problem and get the right locking
changes made in time for 6.0's release.

On the SMBFS front, it's largely unmaintained as well (sadly) -- there
are patches to add Unicode support to SMBFS which have been floating
around since 2005, but so far they have (to my knowledge) never seen
any reviews:

http://people.freebsd.org/~imura/kiconv/

SMBFS and NWFS share a lot of similar design in their FreeBSD
implementations because both were implemented by the same
developer (bp@).

--Antony

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 20:28:14 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 37FFC106566C;
	Thu, 11 Nov 2010 20:28:14 +0000 (UTC) (envelope-from lists@mawer.org)
Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com
	[209.85.216.54])
	by mx1.freebsd.org (Postfix) with ESMTP id BB9038FC16;
	Thu, 11 Nov 2010 20:28:13 +0000 (UTC)
Received: by qwj8 with SMTP id 8so1501535qwj.13
	for <multiple recipients>; Thu, 11 Nov 2010 12:28:13 -0800 (PST)
MIME-Version: 1.0
Received: by 10.229.82.85 with SMTP id a21mr1121489qcl.71.1289505610127; Thu,
	11 Nov 2010 12:00:10 -0800 (PST)
Received: by 10.229.91.66 with HTTP; Thu, 11 Nov 2010 12:00:09 -0800 (PST)
In-Reply-To: <20101111122455.GA2098@tops>
References: <1289442296.2128.16.camel@monet>
	<AANLkTi=V5AhaPFG9EFnopiiEXUyA0-HsoiMfT=RKxhZF@mail.gmail.com>
	<20101111122455.GA2098@tops>
Date: Fri, 12 Nov 2010 07:00:09 +1100
Message-ID: <AANLkTikVjr5XmV1RwaKcDRoFXfM+9W7U=YtFfcgS7x2H@mail.gmail.com>
From: Antony Mawer <lists@mawer.org>
To: Gleb Kurtsou <gleb.kurtsou@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-fs@freebsd.org, Kevin Lo <kevlo@freebsd.org>, delphij@freebsd.org
Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 20:28:14 -0000

On Thu, Nov 11, 2010 at 11:24 PM, Gleb Kurtsou <gleb.kurtsou@gmail.com> wrote:
> On (11/11/2010 13:20), Buganini wrote:
>> I'm using these two patches on CURRENT
>> http://security-hole.info/~buganini/patches/kiconv_msdosfs/
>
> Patch looks worth committing, I've once started working on very similar
> solution but never had time to finish it.
>
> What do you think about importing lower-upper case characters unicode ranges
> and extending kiconv to remove locale option (-L) from msdosfs?

While we're on the topic of looking at filesystems' Unicode support,
would anyone with the appropriate knowledge have a chance to look at
these patches?

http://people.freebsd.org/~imura/kiconv/

It would make smbfs against modern systems so much more usable...

--Antony

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 20:30:16 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C7913106566B
	for <freebsd-fs@hub.freebsd.org>; Thu, 11 Nov 2010 20:30:16 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id B71A48FC0A
	for <freebsd-fs@hub.freebsd.org>; Thu, 11 Nov 2010 20:30:16 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oABKUFH5059845
	for <freebsd-fs@freefall.freebsd.org>; Thu, 11 Nov 2010 20:30:15 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oABKUFoF059840;
	Thu, 11 Nov 2010 20:30:15 GMT (envelope-from gnats)
Date: Thu, 11 Nov 2010 20:30:15 GMT
Message-Id: <201011112030.oABKUFoF059840@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Antony Mawer <lists@mawer.org>
Cc: 
Subject: Re: kern/151845: [smbfs] [patch] smbfs should be upgraded to
	support Unicode
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Antony Mawer <lists@mawer.org>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 20:30:16 -0000

The following reply was made to PR kern/151845; it has been noted by GNATS.

From: Antony Mawer <lists@mawer.org>
To: bug-followup@FreeBSD.org, m.meelis@easybow.com
Cc:  
Subject: Re: kern/151845: [smbfs] [patch] smbfs should be upgraded to support Unicode
Date: Fri, 12 Nov 2010 06:55:35 +1100

 There were some patches floating around to add Unicode support to
 smbfs as long as 5 years ago, apparently inspired by work done on Mac
 OS X. These are still available here:
 
 http://people.freebsd.org/~imura/kiconv/
 
 The smbfs code hasn't changed too much in that time (it's pretty much
 unmaintained), so I don't think it would be too much work to dust them
 off and get them to apply against 8.x or -CURRENT.
 
 If someone out there with some knowledge of this area were able to
 spare a few hours to look at this, it would be a huge step in bringing
 SMBFS up to a modern usable level - at the moment it is largely
 useless as soon as you hit files with non-ASCII characters in their names.

From owner-freebsd-fs@FreeBSD.ORG  Thu Nov 11 22:54:44 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6DD6A106564A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 22:54:44 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca
	[131.104.91.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 2B4EF8FC0A
	for <freebsd-fs@freebsd.org>; Thu, 11 Nov 2010 22:54:43 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ApwEANcC3EyDaFvO/2dsb2JhbACDO6ABsRCQb4EigVKBY3MEhFqFfoUP
X-IronPort-AV: E=Sophos;i="4.59,185,1288584000"; d="scan'208";a="100457969"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 11 Nov 2010 17:54:42 -0500
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 06E9FB3EB0;
	Thu, 11 Nov 2010 17:54:43 -0500 (EST)
Date: Thu, 11 Nov 2010 17:54:43 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Robert Schulze <rs@bytecamp.net>
Message-ID: <865951441.196904.1289516082967.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <4CDBB9EB.8010908@bytecamp.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [12.16.49.138]
X-Mailer: Zimbra 6.0.7_GA_2476.RHEL4 (ZimbraWebClient - IE8
	(Win)/6.0.7_GA_2473.RHEL4_64)
Cc: freebsd-fs@freebsd.org
Subject: Re: nfsd stuck in *rc_lock state
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2010 22:54:44 -0000

> Hello everybody,
> 
> we are running several 8.1 NFS Clients connected to an 8.0 NFS Server.
> The system ran fine, but today the NFS-Server locked up.
> 
> nfsd ate 100% of CPU, with one {nfsd: service} thread in state
> "*rc_lock". I tried to find others with the same issue and finally
> ended up at a patch from Rick:
> 
> http://people.freebsd.org/~rmacklem/freebsd8.1-patches/replay.patch
> 
> May I apply this patch to an 8.0 system to fix this issue, or are there
> any other patches/commits which affect this?
> 
That patch is "self contained", so I think it should be fine to apply it
to an 8.0 server.

You might also want
   http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-svc-mbufleak.patch
which plugged an mbuf leak in the regular FreeBSD8.0 server.

Good luck with it, rick

From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 07:44:04 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 63E1C106564A;
	Fri, 12 Nov 2010 07:44:04 +0000 (UTC)
	(envelope-from kevlo@FreeBSD.org)
Received: from ns.kevlo.org (kevlo.org [220.128.136.52])
	by mx1.freebsd.org (Postfix) with ESMTP id 0273F8FC12;
	Fri, 12 Nov 2010 07:44:03 +0000 (UTC)
Received: from [127.0.0.1] (kevlo@kevlo.org [220.128.136.52])
	by ns.kevlo.org (8.14.3/8.14.3) with ESMTP id oAC7hFsS018602;
	Fri, 12 Nov 2010 15:43:16 +0800 (CST)
From: Kevin Lo <kevlo@FreeBSD.org>
To: gljennjohn@googlemail.com
In-Reply-To: <20101111111922.7fe8ab19@ernst.jennejohn.org>
References: <1289442296.2128.16.camel@monet>
	<20101111111922.7fe8ab19@ernst.jennejohn.org>
Content-Type: text/plain; charset="UTF-8"
Date: Fri, 12 Nov 2010 15:44:02 +0800
Message-ID: <1289547842.6426.6.camel@monet>
Mime-Version: 1.0
X-Mailer: Evolution 2.30.3 FreeBSD GNOME Team Port 
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org, delphij@FreeBSD.org
Subject: Re: patch: let msdosfs(vfat)/ntfs to support UTF-8 locale well
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 07:44:04 -0000

Gary Jennejohn wrote:
> Xin Li wrote:
> > > (cc'ed to freebsd-fs@)
> > >
> > > I think it's important that someone familiar with the code review and
> > > evaluate the current patches and commit it against -HEAD...
> > 
> > > MSDOSFS patch (against 7.1):
> > > http://btload.googlegroups.com/web/msdosfs.patch?gda=MzIscT8AAABs_gmy4a1S9lRiXjEy-V5OpwtI67JnIGlz0zr18tjObOtoi5oIt3BJMRGeqGBbbj-ccyFKn-rNKC-d1pM_IdV0
> > > NTFS patch:
> > > http://btload.googlegroups.com/web/ntfs.patch?gda=OqsHoDwAAABs_gmy4a1S9lRiXjEy-V5O7RN7t-m4MjZ-5dQn_EvaqDVCWO9_HyYEQJyRQYPtRCL9Wm-ajmzVoAFUlE7c_fAt
> > 
> 
> Just FYI.
> 
> The NTFS patch is no longer found and the page is reported as no longer
> existing.

Hmm... You can download the NTFS patch from
http://btload.googlegroups.com/web/ntfs-utf8-patch.diff?gda=ImG5zEYAAABTKdAk9D4djfQOfSDW4ZV9M0mZmSlyAz1mAL3Bd_FXPiRk0z41TwnNjIphZItxmHHoNShIR6xqBMu6AvwilW_uE-Ea7GxYMt0t6nY0uV5FIQ

	Kevin


From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 09:29:56 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B85F0106566C;
	Fri, 12 Nov 2010 09:29:56 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 8DE808FC15;
	Fri, 12 Nov 2010 09:29:56 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAC9Tu2v000812;
	Fri, 12 Nov 2010 09:29:56 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oAC9TuFv000808;
	Fri, 12 Nov 2010 09:29:56 GMT (envelope-from linimon)
Date: Fri, 12 Nov 2010 09:29:56 GMT
Message-Id: <201011120929.oAC9TuFv000808@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/152022: [nfs] nfs service hangs with linux client
	[regression]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 09:29:56 -0000

Old Synopsis: nfs service hangs with linux client
New Synopsis: [nfs] nfs service hangs with linux client [regression]

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Fri Nov 12 09:29:15 UTC 2010
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=152022

From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 09:32:55 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 66D2E1065675;
	Fri, 12 Nov 2010 09:32:55 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 3BE578FC13;
	Fri, 12 Nov 2010 09:32:55 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAC9WtHG002521;
	Fri, 12 Nov 2010 09:32:55 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oAC9Wtjl002517;
	Fri, 12 Nov 2010 09:32:55 GMT (envelope-from linimon)
Date: Fri, 12 Nov 2010 09:32:55 GMT
Message-Id: <201011120932.oAC9Wtjl002517@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/152079: [msdosfs] [patch] Small cleanups from the other
	NetBSD/OpenBSD
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 09:32:55 -0000

Old Synopsis: msdosfs: Small cleanups from the other NetBSD/OpenBSD
New Synopsis: [msdosfs] [patch] Small cleanups from the other NetBSD/OpenBSD

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Fri Nov 12 09:32:29 UTC 2010
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=152079

From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 11:58:01 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 09B201065670;
	Fri, 12 Nov 2010 11:58:01 +0000 (UTC) (envelope-from alexz@visp.ru)
Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2])
	by mx1.freebsd.org (Postfix) with ESMTP id AF5828FC18;
	Fri, 12 Nov 2010 11:58:00 +0000 (UTC)
Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin)
	by mail.visp.ru with esmtp (Exim 4.72 (FreeBSD))
	(envelope-from <alexz@visp.ru>)
	id 1PGsGU-0004rl-Dw; Fri, 12 Nov 2010 14:57:58 +0300
From: "Alexander Zagrebin" <alexz@visp.ru>
To: <freebsd-stable@freebsd.org>,
	<freebsd-fs@freebsd.org>
Date: Fri, 12 Nov 2010 14:57:58 +0300
Message-ID: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="koi8-r"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
Thread-Index: AcuCYNju4THmbqyLQdC782dwcgeoxA==
Cc: 
Subject: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 11:58:01 -0000

I have found that there is an issue with unmounting ZFS snapshots:
the /sbin/umount "hangs" after unmounting.

The test system is i386, but I can reproduce this issue on amd64 too.

# uname -a
FreeBSD alpha.vosz.local 8.1-STABLE FreeBSD 8.1-STABLE #0: Tue Oct 19
18:47:05 MSD 2010     root@alpha.vosz.local:/usr/obj/usr/src/sys/GENERIC
i386

How to reproduce:

# zfs snapshot pool/var@test

# zfs list -t all -r pool/var
NAME            USED  AVAIL  REFER  MOUNTPOINT
pool/var       4,86M  2,99G  4,86M  /var
pool/var@test      0      -  4,86M  -

# mount -t zfs pool/var@test /mnt

# mount
...
pool/var@test on /mnt (zfs, local, noatime, read-only)

# umount /mnt

At this point umount hangs and it's impossible to kill it
even with the `kill -9`.

>From the working console I can see that:
1. snapshot is unmounted successfully

# mount
pool/root on / (zfs, local)
devfs on /dev (devfs, local, multilabel)
pool/home on /home (zfs, local)
pool/tmp on /tmp (zfs, local)
pool/usr on /usr (zfs, local)
pool/usr/src on /usr/src (zfs, local)
pool/var on /var (zfs, local)

2. the umount is waiting for disk
#ps | egrep 'PID|umount'
  PID  TT  STAT      TIME COMMAND
  958   0  D+     0:00,04 umount /mnt
# procstat -t 958
  PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
  958 100731 umount           -                  3  133 sleep   mntref

Can anybody confirm this issue?
Any suggestions?

-- 
Alexander Zagrebin


From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 12:13:24 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 57B1C1065694;
	Fri, 12 Nov 2010 12:13:24 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 577A08FC31;
	Fri, 12 Nov 2010 12:13:23 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA28407;
	Fri, 12 Nov 2010 14:13:20 +0200 (EET) (envelope-from avg@freebsd.org)
Message-ID: <4CDD2F5F.2000902@freebsd.org>
Date: Fri, 12 Nov 2010 14:13:19 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Alexander Zagrebin <alexz@visp.ru>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>
In-Reply-To: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=KOI8-R
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 12:13:24 -0000

on 12/11/2010 13:57 Alexander Zagrebin said the following:
> 2. the umount is waiting for disk
> #ps | egrep 'PID|umount'
>   PID  TT  STAT      TIME COMMAND
>   958   0  D+     0:00,04 umount /mnt
> # procstat -t 958
>   PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
>   958 100731 umount           -                  3  133 sleep   mntref

procstat -kk <pid>

> Can anybody confirm this issue?
> Any suggestions?
> 

ktrace-ing umount could also be useful.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 14:00:24 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3896F10656A8;
	Fri, 12 Nov 2010 14:00:24 +0000 (UTC) (envelope-from alexz@visp.ru)
Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 2EF348FC15;
	Fri, 12 Nov 2010 14:00:14 +0000 (UTC)
Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin)
	by mail.visp.ru with esmtp (Exim 4.72 (FreeBSD))
	(envelope-from <alexz@visp.ru>)
	id 1PGuAl-000Lo1-Nd; Fri, 12 Nov 2010 17:00:11 +0300
From: "Alexander Zagrebin" <alexz@visp.ru>
To: "'Andriy Gapon'" <avg@freebsd.org>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>
	<4CDD2F5F.2000902@freebsd.org>
Date: Fri, 12 Nov 2010 17:00:11 +0300
Keywords: freebsd-stable
Message-ID: <FD7FC6ED159249338A04BE125941D146@vosz.local>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="koi8-r"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
Thread-Index: AcuCYyqzfZYrwX5jRtKA2oGCQvAKogADjN1w
In-Reply-To: <4CDD2F5F.2000902@freebsd.org>
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: RE: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 14:00:24 -0000

Thanks for your reply!

> > 2. the umount is waiting for disk
> > #ps | egrep 'PID|umount'
> >   PID  TT  STAT      TIME COMMAND
> >   958   0  D+     0:00,04 umount /mnt
> > # procstat -t 958
> >   PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
> >   958 100731 umount           -                  3  133 sleep   mntref
> 
> procstat -kk <pid>

$ ps a | grep umount
86874   2- D      0:00,06 umount /mnt
90433   3  S+     0:00,01 grep umount

$ sudo procstat -kk 86874
  PID    TID COMM             TDNAME           KSTACK
86874 100731 umount           -                mi_switch+0x176
sleepq_wait+0x42 _sleep+0x317 vfs_mount_destroy+0x5a dounmount+0x4d4
unmount+0x38b syscall+0x1cf Xfast_syscall+0xe2

-- 
Alexander Zagrebin


From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 14:27:05 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6DDA9106566B;
	Fri, 12 Nov 2010 14:27:05 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 562148FC18;
	Fri, 12 Nov 2010 14:27:03 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA00315;
	Fri, 12 Nov 2010 16:27:01 +0200 (EET) (envelope-from avg@freebsd.org)
Message-ID: <4CDD4EB4.40004@freebsd.org>
Date: Fri, 12 Nov 2010 16:27:00 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Alexander Zagrebin <alexz@visp.ru>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>
	<4CDD2F5F.2000902@freebsd.org>
	<FD7FC6ED159249338A04BE125941D146@vosz.local>
In-Reply-To: <FD7FC6ED159249338A04BE125941D146@vosz.local>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=KOI8-R
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 14:27:05 -0000

on 12/11/2010 16:00 Alexander Zagrebin said the following:
> Thanks for your reply!
> 
>>> 2. the umount is waiting for disk
>>> #ps | egrep 'PID|umount'
>>>   PID  TT  STAT      TIME COMMAND
>>>   958   0  D+     0:00,04 umount /mnt
>>> # procstat -t 958
>>>   PID    TID COMM             TDNAME           CPU  PRI 
>> STATE   WCHAN
>>>   958 100731 umount           -                  3  133 
>> sleep   mntref
>>
>> procstat -kk <pid>
> 
> $ ps a | grep umount
> 86874   2- D      0:00,06 umount /mnt
> 90433   3  S+     0:00,01 grep umount
> 
> $ sudo procstat -kk 86874
>   PID    TID COMM             TDNAME           KSTACK
> 86874 100731 umount           -                mi_switch+0x176
> sleepq_wait+0x42 _sleep+0x317 vfs_mount_destroy+0x5a dounmount+0x4d4
> unmount+0x38b syscall+0x1cf Xfast_syscall+0xe2
> 


This looks like a possible mnt_ref leak.
I think something like this was fixed not long ago.
Perhaps you don't have that fix, or there is another leak.
What revision do you have?

Perhaps Martin has an insight here.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 17:43:44 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0D9DB106566B;
	Fri, 12 Nov 2010 17:43:44 +0000 (UTC)
	(envelope-from sarawgi.aditya@gmail.com)
Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id BA7448FC17;
	Fri, 12 Nov 2010 17:43:43 +0000 (UTC)
Received: by iwn39 with SMTP id 39so3747096iwn.13
	for <multiple recipients>; Fri, 12 Nov 2010 09:43:43 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=cp0173sigjDNVvCPuetu4pnYqA2uRnHVLIN2JtltlCI=;
	b=REPCLxRLYMoj/tRqLJ2JckzEMIXwyB3o5MPyKsiIuVLzBI2818Dn/jQyABIR1zGB2y
	nrZpWSG8NQWciGTTAU0DnQWDEmuHAECHpL0Z5o4K5AW59MpERsaNKHJ18Gjz1BIjPyrM
	Sp2OAGCXISALPM3W2VONkcuCSmovKjSiYvISA=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=ddX2x7SLfOIlErFnrk3fj90t0GZvGH32RS4npSUH4HzvaJTeKhmmBfJz0TTUS+8Umq
	sLTdkmrqfvjYGpl+FuBztrXW2TWb/SB9LNK70XIIstHOT1ZrOuZt3cTDOkRKSCREJLMd
	6t2uGSLR3DLFpUakcmWXOXvXCvA/IxGNlwtLE=
MIME-Version: 1.0
Received: by 10.231.183.136 with SMTP id cg8mr2267704ibb.114.1289583822776;
	Fri, 12 Nov 2010 09:43:42 -0800 (PST)
Received: by 10.231.253.81 with HTTP; Fri, 12 Nov 2010 09:43:42 -0800 (PST)
In-Reply-To: <4CDB888E.6030005@FreeBSD.org>
References: <4CDB888E.6030005@FreeBSD.org>
Date: Fri, 12 Nov 2010 23:13:42 +0530
Message-ID: <AANLkTik0aiXJXmjWjZz_4k1ZvjtVd0ZqJq5Bx9O_JDE1@mail.gmail.com>
From: Aditya Sarawgi <sarawgi.aditya@gmail.com>
To: Doug Barton <dougb@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org
Subject: Re: Minor problem re-mounting ext2fs system
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 17:43:44 -0000

Hi,


On Thu, Nov 11, 2010 at 11:39 AM, Doug Barton <dougb@freebsd.org> wrote:
> Even after a clean shutdown I get this error at nearly every reboot:
>
> kernel: last write time is in the future.
> kernel: (by less than a day, probably due to the hardware clock being
> incorrectly set).
> kernel: FIXED.
>

The ext2fs superblock (the structure that holds summary and other important
information about the filesystem) has a field that records the last write time.
Clearly it makes no sense for a filesystem that is being mounted to have a
timestamp in the future. The check treats this as an inconsistency, which is
why the filesystem needs an fsck. You need to figure out why it is picking up
a future timestamp.
You can analyse the superblock with dumpe2fs from e2fsprogs.
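
If you would rather look at the raw field than at dumpe2fs output, the check can be sketched roughly as below. This is only an illustration, not what fsck does internally: it assumes the standard on-disk layout (primary superblock 1024 bytes into the device, s_mtime at byte 44, s_wtime at byte 48, s_magic at byte 56, all little-endian), and the function names are made up for this example.

```python
import struct
import time

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1024 bytes into the device
EXT2_MAGIC = 0xEF53


def read_superblock_times(image):
    """Return (s_mtime, s_wtime) from the first 2048 raw bytes of an ext2 image."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    # Last mount time and last write time are adjacent 32-bit fields.
    mtime, wtime = struct.unpack_from("<II", sb, 44)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2 superblock")
    return mtime, wtime


def write_time_in_future(image, now=None):
    """True if the superblock's last write time is later than `now` (seconds)."""
    now = time.time() if now is None else now
    _, wtime = read_superblock_times(image)
    return wtime > now
```

To try it, read the first 2048 bytes of the partition (as root) and pass them in. Comparing the result right after boot with the system clock should show whether the hardware clock or the timezone setting is what pushes the timestamp ahead.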


> It does get fixed, but it requires the fsck program from e2fsprogs to do it.
>
>
> Doug
>
> --
>
>        Nothin' ever doesn't change, but nothin' changes much.
>                        -- OK Go
>
>        Breadth of IT experience, and depth of knowledge in the DNS.
>        Yours for the right price.  :)  http://SupersetSolutions.com/
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>


-- 
Cheers,
Aditya Sarawgi

From owner-freebsd-fs@FreeBSD.ORG  Fri Nov 12 22:35:20 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 10623106566C
	for <fs@freebsd.org>; Fri, 12 Nov 2010 22:35:20 +0000 (UTC)
	(envelope-from gull@gull.us)
Received: from mail-ew0-f54.google.com (mail-ew0-f54.google.com
	[209.85.215.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 9F2488FC17
	for <fs@freebsd.org>; Fri, 12 Nov 2010 22:35:19 +0000 (UTC)
Received: by ewy3 with SMTP id 3so665709ewy.13
	for <fs@freebsd.org>; Fri, 12 Nov 2010 14:35:18 -0800 (PST)
MIME-Version: 1.0
Received: by 10.14.124.201 with SMTP id x49mr2089424eeh.7.1289599652793; Fri,
	12 Nov 2010 14:07:32 -0800 (PST)
Received: by 10.14.127.1 with HTTP; Fri, 12 Nov 2010 14:07:32 -0800 (PST)
X-Originating-IP: [69.91.159.208]
In-Reply-To: <4CDB6A5F.2000908@FreeBSD.org>
References: <20100929031825.L683@besplex.bde.org>
	<20100929084801.M948@besplex.bde.org>
	<20100929041650.GA1553@aditya> <201009290917.05269.jhb@freebsd.org>
	<20100929202526.GA1564@aditya> <4CD0A3E8.4080304@FreeBSD.org>
	<AANLkTi=iTCG4aO-KO_gy7fp_96KcZ_TCyNk5OkLZUHV3@mail.gmail.com>
	<4CD201AE.3040409@FreeBSD.org> <20101108174327.GC2066@earth>
	<4CD9E535.8000801@FreeBSD.org> <20101110170719.GA1573@earth>
	<4CDB6A5F.2000908@FreeBSD.org>
Date: Fri, 12 Nov 2010 14:07:32 -0800
Message-ID: <AANLkTikD18Fy2KrN8QSd50HtN2v48B8tCXFzPHh0c1HB@mail.gmail.com>
From: David Brodbeck <gull@gull.us>
To: fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Cc: 
Subject: Re: ext2fs now extremely slow
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Nov 2010 22:35:20 -0000

On Wed, Nov 10, 2010 at 8:00 PM, Doug Barton <dougb@freebsd.org> wrote:
>> May make the data inconsistent due to lack of facilities like journaling.
>
> Well that's just plain unacceptable. Either the fs works reliably (which
> obviously includes safely) or it should be removed. At bare minimum if it
> can't reliably write data then support should be changed to read-only.

ext2fs has never included journaling, so if you lose power while
you're writing to it, it will be inconsistent and need to be fsck'd.
This isn't unique to the FreeBSD implementation; it's just part of the
design.  Most Linux systems now use ext3fs, which is basically ext2
with journaling added.

I kind of share Aditya's perspective that what you're trying to do is
a bit odd, although it might be a good way to squash bugs.  Still,
what's next...trying to run make world on an msdos filesystem? ;)

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 02:27:12 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 03C751065672;
	Sat, 13 Nov 2010 02:27:12 +0000 (UTC) (envelope-from mm@FreeBSD.org)
Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3])
	by mx1.freebsd.org (Postfix) with ESMTP id 459958FC16;
	Sat, 13 Nov 2010 02:27:11 +0000 (UTC)
Received: from core.vx.sk (localhost [127.0.0.1])
	by mail.vx.sk (Postfix) with ESMTP id 4405E11C005;
	Sat, 13 Nov 2010 03:27:08 +0100 (CET)
X-Virus-Scanned: amavisd-new at mail.vx.sk
Received: from mail.vx.sk ([127.0.0.1])
	by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024)
	with LMTP id Qe5JL+6kd74J; Sat, 13 Nov 2010 03:27:06 +0100 (CET)
Received: from [10.9.8.1] (188-167-78-139.dynamic.chello.sk [188.167.78.139])
	by mail.vx.sk (Postfix) with ESMTPSA id B0C8511BFFB;
	Sat, 13 Nov 2010 03:27:05 +0100 (CET)
Message-ID: <4CDDF77B.90708@FreeBSD.org>
Date: Sat, 13 Nov 2010 03:27:07 +0100
From: Martin Matuska <mm@FreeBSD.org>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk;
	rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23
	Mnenhy/0.7.5.0
MIME-Version: 1.0
To: Andriy Gapon <avg@freebsd.org>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>	<4CDD2F5F.2000902@freebsd.org>	<FD7FC6ED159249338A04BE125941D146@vosz.local>
	<4CDD4EB4.40004@freebsd.org>
In-Reply-To: <4CDD4EB4.40004@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 02:27:12 -0000

Yes, this is indeed a leak introduced by importing onnv revision 9214
and it exists in perforce as well - very easy to reproduce.

# mount -t zfs test@t1 /mnt
# umount /mnt (-> hang)

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367

That code is not compatible with mounting snapshots outside a mounted ZFS
filesystem, and I was not able to reproduce the errors described in 6604992
and 6810367 (they are Solaris-specific). I suggest we comment out this code
(in head first, then MFC, and in p4 as well).

Patch (should work with HEAD and 8-STABLE):
http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch

On 12.11.2010 15:27, Andriy Gapon wrote:
> on 12/11/2010 16:00 Alexander Zagrebin said the following:
>> Thanks for your reply!
>>
>>>> 2. the umount is waiting for disk
>>>> #ps | egrep 'PID|umount'
>>>>   PID  TT  STAT      TIME COMMAND
>>>>   958   0  D+     0:00,04 umount /mnt
>>>> # procstat -t 958
>>>>   PID    TID COMM             TDNAME           CPU  PRI 
>>> STATE   WCHAN
>>>>   958 100731 umount           -                  3  133 
>>> sleep   mntref
>>>
>>> procstat -kk <pid>
>>
>> $ ps a | grep umount
>> 86874   2- D      0:00,06 umount /mnt
>> 90433   3  S+     0:00,01 grep umount
>>
>> $ sudo procstat -kk 86874
>>   PID    TID COMM             TDNAME           KSTACK
>> 86874 100731 umount           -                mi_switch+0x176
>> sleepq_wait+0x42 _sleep+0x317 vfs_mount_destroy+0x5a dounmount+0x4d4
>> unmount+0x38b syscall+0x1cf Xfast_syscall+0xe2
>>
> 
> 
> Looks like possible mnt_ref leak.
> I think that something like that was fixed some not long time ago.
> Perhaps you either don't have the fix or there is another leak.
> What revision do you have?
> 
> Perhaps Martin has an insight here.
> 

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 06:30:53 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 86829106566B;
	Sat, 13 Nov 2010 06:30:53 +0000 (UTC) (envelope-from alexz@visp.ru)
Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2])
	by mx1.freebsd.org (Postfix) with ESMTP id EBA288FC1B;
	Sat, 13 Nov 2010 06:30:52 +0000 (UTC)
Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin)
	by mail.visp.ru with esmtp (Exim 4.72 (FreeBSD))
	(envelope-from <alexz@visp.ru>)
	id 1PH9dS-000Ivu-4I; Sat, 13 Nov 2010 09:30:50 +0300
From: "Alexander Zagrebin" <alexz@visp.ru>
To: "'Martin Matuska'" <mm@FreeBSD.org>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>	<4CDD2F5F.2000902@freebsd.org>	<FD7FC6ED159249338A04BE125941D146@vosz.local><4CDD4EB4.40004@freebsd.org>
	<4CDDF77B.90708@FreeBSD.org>
Date: Sat, 13 Nov 2010 09:30:49 +0300
Keywords: freebsd-stable
Message-ID: <8EEEFFFCCE94428992BE20F4A2EB8362@vosz.local>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="koi8-r"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
Thread-Index: AcuC2oQL0Tr2Iry0RMqsIX6jZG2/hgAGlavQ
In-Reply-To: <4CDDF77B.90708@FreeBSD.org>
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org,
	'Andriy Gapon' <avg@freebsd.org>
Subject: RE: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 06:30:53 -0000

> Yes, this is indeed a leak introduced by importing onnv revision 9214
> and it exists in perforce as well - very easy to reproduce.
> 
> # mount -t zfs test@t1 /mnt
> # umount /mnt (-> hang)
> 
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367
> 
> This is not compatible with mounting snapshots outside 
> mounted ZFS and I
> was not able to reproduce the errors defined in 6604992 and 6810367
> (they are Solaris-specific). I suggest we comment out this code (from
> head, later MFC and p4 as well).
> 
> Patch (should work with HEAD and 8-STABLE):
> http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch
> 

The patch applied cleanly to the latest stable.
umount doesn't hang now. Thanks.

Let me ask a question...
I'm updating the source tree via csup/cvs.
Is there a way to determine the SVN revision in this case?
If not, would it be possible to add (and automatically maintain during
svn -> cvs replication) a special file in the cvs tree
(for example, /usr/src/revision) containing the current svn revision?

-- 
Alexander Zagrebin


From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 09:57:11 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E9347106566B;
	Sat, 13 Nov 2010 09:57:11 +0000 (UTC)
	(envelope-from arundel@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id BE71E8FC17;
	Sat, 13 Nov 2010 09:57:11 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAD9vBeE031642;
	Sat, 13 Nov 2010 09:57:11 GMT
	(envelope-from arundel@freefall.freebsd.org)
Received: (from arundel@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oAD9vBnF031638;
	Sat, 13 Nov 2010 09:57:11 GMT (envelope-from arundel)
Date: Sat, 13 Nov 2010 09:57:11 GMT
Message-Id: <201011130957.oAD9vBnF031638@freefall.freebsd.org>
To: arundel@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org
From: arundel@FreeBSD.org
Cc: 
Subject: Re: bin/151713: [patch] Bug in growfs(8) with respect to 32-bit
	overflow
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 09:57:12 -0000

Synopsis: [patch] Bug in growfs(8) with respect to 32-bit overflow

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: arundel
Responsible-Changed-When: Sat Nov 13 09:55:47 UTC 2010
Responsible-Changed-Why: 
Assign to freebsd-fs@, since they should have an opinion regarding this issue.

http://www.freebsd.org/cgi/query-pr.cgi?pr=151713

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 10:29:10 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AECBF1065673;
	Sat, 13 Nov 2010 10:29:10 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 921488FC12;
	Sat, 13 Nov 2010 10:29:09 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA11999;
	Sat, 13 Nov 2010 12:29:08 +0200 (EET) (envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PHDM4-000Mtu-1O; Sat, 13 Nov 2010 12:29:08 +0200
Message-ID: <4CDE6823.6080907@freebsd.org>
Date: Sat, 13 Nov 2010 12:27:47 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Martin Matuska <mm@freebsd.org>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>	<4CDD2F5F.2000902@freebsd.org>	<FD7FC6ED159249338A04BE125941D146@vosz.local>
	<4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org>
In-Reply-To: <4CDDF77B.90708@FreeBSD.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 10:29:10 -0000

on 13/11/2010 04:27 Martin Matuska said the following:
> Yes, this is indeed a leak introduced by importing onnv revision 9214
> and it exists in perforce as well - very easy to reproduce.
> 
> # mount -t zfs test@t1 /mnt
> # umount /mnt (-> hang)
> 
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367
> 
> This is not compatible with mounting snapshots outside mounted ZFS and I
> was not able to reproduce the errors defined in 6604992 and 6810367
> (they are Solaris-specific). I suggest we comment out this code (from
> head, later MFC and p4 as well).
> 
> Patch (should work with HEAD and 8-STABLE):
> http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch

I'm not quite sure, but perhaps it would be better to make the logic in the two
places match.  That is, I see that the code takes a hold on the filesystem of
the covered vnode, but does the rele on the parent ZFS filesystem.
Or is this kind of protection not needed at all on FreeBSD?
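
To illustrate the mismatch I mean, here is a toy model. It is not the zfs_vfsops.c code: the class and function names are invented, and `ref` merely stands in for a mount reference count with an assumed baseline of 1. A hold on one mount paired with a release on a different one leaves both counts wrong.

```python
class Mount:
    """Toy stand-in for a mounted filesystem; `ref` plays the role of mnt_ref."""

    def __init__(self, name, ref=1):
        self.name = name
        self.ref = ref


def hold(mp):
    mp.ref += 1


def rele(mp):
    if mp.ref <= 0:
        raise RuntimeError("release without a matching hold on " + mp.name)
    mp.ref -= 1


def snapshot_cycle(covered, parent, matched):
    """Mount then unmount a snapshot; return the resulting (covered, parent) counts."""
    hold(covered)                           # mount: hold the covered vnode's filesystem
    rele(covered if matched else parent)    # unmount: release one of the two
    return covered.ref, parent.ref
```

With matched=True both counts return to their baseline. With matched=False the covered filesystem ends one above its baseline and the parent one below, so anything that waits for the first count to drain would never wake up, and the second filesystem has lost a reference it still needs.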

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 11:06:33 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0BAFB106564A;
	Sat, 13 Nov 2010 11:06:33 +0000 (UTC) (envelope-from mm@FreeBSD.org)
Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3])
	by mx1.freebsd.org (Postfix) with ESMTP id 906F18FC0A;
	Sat, 13 Nov 2010 11:06:32 +0000 (UTC)
Received: from core.vx.sk (localhost [127.0.0.1])
	by mail.vx.sk (Postfix) with ESMTP id B9FBE122D76;
	Sat, 13 Nov 2010 12:06:31 +0100 (CET)
X-Virus-Scanned: amavisd-new at mail.vx.sk
Received: from mail.vx.sk ([127.0.0.1])
	by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024)
	with LMTP id 8gpd45QMWnIz; Sat, 13 Nov 2010 12:06:27 +0100 (CET)
Received: from [10.9.8.1] (188-167-78-139.dynamic.chello.sk [188.167.78.139])
	by mail.vx.sk (Postfix) with ESMTPSA id 0D0E3122D5E;
	Sat, 13 Nov 2010 12:06:27 +0100 (CET)
Message-ID: <4CDE7133.6010803@FreeBSD.org>
Date: Sat, 13 Nov 2010 12:06:27 +0100
From: Martin Matuska <mm@FreeBSD.org>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk;
	rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23
	Mnenhy/0.7.5.0
MIME-Version: 1.0
To: Andriy Gapon <avg@freebsd.org>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>	<4CDD2F5F.2000902@freebsd.org>	<FD7FC6ED159249338A04BE125941D146@vosz.local>	<4CDD4EB4.40004@freebsd.org>
	<4CDDF77B.90708@FreeBSD.org> <4CDE6823.6080907@freebsd.org>
In-Reply-To: <4CDE6823.6080907@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 11:06:33 -0000

No, this is not good for us. Solaris does not allow "mounting"
snapshots on an arbitrary vnode, like we do. Solaris has them only in
.zfs/snapshots. Our approach allows read-only snapshot mounts without even
mounting the parent zfs.

Before v15 we were happy with that code and had no issues :-)

I have a very simple testcase where just fixing the VFS_RELE breaks our
forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c:
VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp);

Now let's say you have a mounted filesystem (e.g. md) under /mnt:
/dev/md5 on /mnt (ufs, local)

# mkdir /mnt/test
# mount -t zfs tank@t2 /mnt/test
# umount -f /mnt

Now you will hang because of the second VFS_HOLD. So I stick to my opinion
that this "extra protection" is more of a problem than a solution in our
case and it should be commented out.

On 13.11.2010 11:27, Andriy Gapon wrote:
> on 13/11/2010 04:27 Martin Matuska said the following:
>> Yes, this is indeed a leak introduced by importing onnv revision 9214
>> and it exists in perforce as well - very easy to reproduce.
>>
>> # mount -t zfs test@t1 /mnt
>> # umount /mnt (-> hang)
>>
>> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6604992
>> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6810367
>>
>> This is not compatible with mounting snapshots outside mounted ZFS and I
>> was not able to reproduce the errors defined in 6604992 and 6810367
>> (they are Solaris-specific). I suggest we comment out this code (from
>> head, later MFC and p4 as well).
>>
>> Patch (should work with HEAD and 8-STABLE):
>> http://people.freebsd.org/~mm/patches/zfs/zfs_vfsops.c.patch
> 
> Not quite sure, but perhaps it's better to make the logic in each place match
> the other.  That is, I see that the code does hold on a filesystem of a covered
> vnode, but does rele on a parent ZFS filesystem.
> Or is this kind of protection not needed at all for FreeBSD?
> 

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 11:11:19 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 856F21065701;
	Sat, 13 Nov 2010 11:11:19 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 6626C8FC08;
	Sat, 13 Nov 2010 11:11:17 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA12371;
	Sat, 13 Nov 2010 13:11:16 +0200 (EET) (envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PHE0q-000Mwy-MH; Sat, 13 Nov 2010 13:11:16 +0200
Message-ID: <4CDE7203.7090507@freebsd.org>
Date: Sat, 13 Nov 2010 13:09:55 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Martin Matuska <mm@freebsd.org>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>	<4CDD2F5F.2000902@freebsd.org>	<FD7FC6ED159249338A04BE125941D146@vosz.local>	<4CDD4EB4.40004@freebsd.org>
	<4CDDF77B.90708@FreeBSD.org> <4CDE6823.6080907@freebsd.org>
	<4CDE7133.6010803@FreeBSD.org>
In-Reply-To: <4CDE7133.6010803@FreeBSD.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 11:11:19 -0000

on 13/11/2010 13:06 Martin Matuska said the following:
> No, this is not good for us. Solaris does not allow "mounting" of
> snapshots on any vnode, like we do. Solaris has them only in
> .zfs/snapshots. This allows us to have read-only mounts without even
> mounting the parent zfs.
> 
> Before v15 we have been happy with that code and had no issues :-)
> 
> I have a very simple testcase where just fixing the VFS_RELE breaks our
> forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c:
> VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp);
> 
> Now let's say you have a mounted filesystem (e.g. md) under /mnt:
> /dev/md5 on /mnt (ufs, local)
> 
> # mkdir /mnt/test
> # mount -t zfs tank@t2 /mnt/test
> # umount -f /mnt
> 
> Now you will hang because the second VFS_HOLD.

A hang here would be bad, I agree.
But I think that the umount shouldn't succeed either in this case.

> So I stick to my opinion
> that this "extra protection" is more a problem than a solution in our
> case and it should be commented out.


-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 11:21:18 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A350D1065679;
	Sat, 13 Nov 2010 11:21:18 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id 3794E8FC28;
	Sat, 13 Nov 2010 11:21:17 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua
	[10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id oADBL47a079627
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 13 Nov 2010 13:21:04 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id
	oADBL4hj036568; Sat, 13 Nov 2010 13:21:04 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id oADBL4tO036567; 
	Sat, 13 Nov 2010 13:21:04 +0200 (EET)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Sat, 13 Nov 2010 13:21:04 +0200
From: Kostik Belousov <kostikbel@gmail.com>
To: Andriy Gapon <avg@freebsd.org>
Message-ID: <20101113112104.GE2392@deviant.kiev.zoral.com.ua>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>
	<4CDD2F5F.2000902@freebsd.org>
	<FD7FC6ED159249338A04BE125941D146@vosz.local>
	<4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org>
	<4CDE6823.6080907@freebsd.org> <4CDE7133.6010803@FreeBSD.org>
	<4CDE7203.7090507@freebsd.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="QrrxbCYKnJeJBlX9"
Content-Disposition: inline
In-Reply-To: <4CDE7203.7090507@freebsd.org>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-3.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 11:21:18 -0000


--QrrxbCYKnJeJBlX9
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sat, Nov 13, 2010 at 01:09:55PM +0200, Andriy Gapon wrote:
> on 13/11/2010 13:06 Martin Matuska said the following:
> > No, this is not good for us. Solaris does not allow "mounting" of
> > snapshots on any vnode, like we do. Solaris has them only in
> > .zfs/snapshot. This allows us to have read-only mounts without even
> > mounting the parent zfs.
> >
> > Before v15 we have been happy with that code and had no issues :-)
> >
> > I have a very simple testcase where just fixing the VFS_RELE breaks our
> > forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c:
> > VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp);
> >
> > Now let's say you have a mounted filesystem (e.g. md) under /mnt:
> > /dev/md5 on /mnt (ufs, local)
> >
> > # mkdir /mnt/test
> > # mount -t zfs tank@t2 /mnt/test
> > # umount -f /mnt
> >
> > Now you will hang because of the second VFS_HOLD.
>
> Hang here would be bad, I agree.
> But I think that the umount shouldn't succeed either, in this case.
Normal unmount indeed shall not succeed in this case, because mount
adds a reference to the covered vnode. But forced unmount should be
allowed to proceed.

After unmount, you can use fsid to unmount the lower mount point.
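
A minimal sketch of the unmount-by-fsid step, assuming FreeBSD's unmount(2)
with MNT_BYFSID, where the first argument is the string "FSID:val0:val1"
built from the two fsid_t values in decimal. The helper names
fsid_to_string() and force_unmount_by_fsid() are made up for illustration,
and the fsid is assumed to have been recorded (e.g. with statfs(2)) while
the mount point was still reachable by path.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/mount.h>
#endif

/*
 * Encode the two fsid words as "FSID:val0:val1" (decimal), the form that
 * unmount(2) expects when MNT_BYFSID is set.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
fsid_to_string(char *buf, size_t len, int32_t val0, int32_t val1)
{
	int n = snprintf(buf, len, "FSID:%d:%d", (int)val0, (int)val1);

	return ((n > 0 && (size_t)n < len) ? 0 : -1);
}

#ifdef __FreeBSD__
/*
 * Hypothetical helper: force-unmount a file system identified only by an
 * fsid that was saved earlier, for use once its path can no longer be
 * looked up.
 */
int
force_unmount_by_fsid(const fsid_t *fsid)
{
	char fsidbuf[64];

	if (fsid_to_string(fsidbuf, sizeof(fsidbuf),
	    fsid->val[0], fsid->val[1]) != 0)
		return (-1);
	return (unmount(fsidbuf, MNT_BYFSID | MNT_FORCE));
}
#endif
```

The string formatting is kept separate from the system call so that it can
be checked on its own; only the guarded part depends on FreeBSD headers.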
>
> > So I stick to my opinion
> > that this "extra protection" is more a problem than a solution in our
> > case and it should be commented out.
>
>
> -- 
> Andriy Gapon
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

--QrrxbCYKnJeJBlX9
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEARECAAYFAkzedJ8ACgkQC3+MBN1Mb4iVbACg9BjzaWe4CKTTgoiDq/g3eJab
gxIAoPIu6gsaPqGxSYGORw1XUPtuAgSx
=P5Rh
-----END PGP SIGNATURE-----

--QrrxbCYKnJeJBlX9--

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 11:26:48 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B7D6D1065695;
	Sat, 13 Nov 2010 11:26:48 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 8C9258FC20;
	Sat, 13 Nov 2010 11:26:47 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA12514;
	Sat, 13 Nov 2010 13:26:45 +0200 (EET) (envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1PHEFo-000MyQ-NM; Sat, 13 Nov 2010 13:26:44 +0200
Message-ID: <4CDE75A4.8050702@freebsd.org>
Date: Sat, 13 Nov 2010 13:25:24 +0200
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Kostik Belousov <kostikbel@gmail.com>
References: <D9ABDE54892A4D9285FE7FFA6E1B1B69@vosz.local>
	<4CDD2F5F.2000902@freebsd.org>
	<FD7FC6ED159249338A04BE125941D146@vosz.local>
	<4CDD4EB4.40004@freebsd.org> <4CDDF77B.90708@FreeBSD.org>
	<4CDE6823.6080907@freebsd.org> <4CDE7133.6010803@FreeBSD.org>
	<4CDE7203.7090507@freebsd.org>
	<20101113112104.GE2392@deviant.kiev.zoral.com.ua>
In-Reply-To: <20101113112104.GE2392@deviant.kiev.zoral.com.ua>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.1-STABLE: problem with unmounting ZFS snapshots
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 11:26:48 -0000

on 13/11/2010 13:21 Kostik Belousov said the following:
> On Sat, Nov 13, 2010 at 01:09:55PM +0200, Andriy Gapon wrote:
>> on 13/11/2010 13:06 Martin Matuska said the following:
>>> No, this is not good for us. Solaris does not allow "mounting" of
>>> snapshots on any vnode, like we do. Solaris has them only in
>>> .zfs/snapshot. This allows us to have read-only mounts without even
>>> mounting the parent zfs.
>>>
>>> Before v15 we have been happy with that code and had no issues :-)
>>>
>>> I have a very simple testcase where just fixing the VFS_RELE breaks our
>>> forced unmount. Let's say we use the correct VFS_RELE in zfs_vfsops.c:
>>> VFS_RELE(vfsp->mnt_vnodecovered->v_vfsp);
>>>
>>> Now let's say you have a mounted filesystem (e.g. md) under /mnt:
>>> /dev/md5 on /mnt (ufs, local)
>>>
>>> # mkdir /mnt/test
>>> # mount -t zfs tank@t2 /mnt/test
>>> # umount -f /mnt
>>>
>>> Now you will hang because of the second VFS_HOLD.
>>
>> Hang here would be bad, I agree.
>> But I think that the umount shouldn't succeed either, in this case.
> Normal unmount indeed shall not succeed in this case, because mount
> adds a reference to the covered vnode. But forced unmount should be
> allowed to proceed.
> 
> After unmount, you can use fsid to unmount the lower mount point.

Ah, I see now, thank you for the explanation.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Sat Nov 13 23:35:47 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6F283106566B
	for <freebsd-fs@freebsd.org>; Sat, 13 Nov 2010 23:35:47 +0000 (UTC)
	(envelope-from TERRY@tmk.com)
Received: from server.tmk.com (server.tmk.com [204.141.35.63])
	by mx1.freebsd.org (Postfix) with ESMTP id 4BC398FC0A
	for <freebsd-fs@freebsd.org>; Sat, 13 Nov 2010 23:35:46 +0000 (UTC)
Received: from tmk.com by tmk.com (PMDF V6.4 #37010)
	id <01NU7T8XRYKW00BCHX@tmk.com>; Sat, 13 Nov 2010 18:01:59 -0500 (EST)
Date: Sat, 13 Nov 2010 18:00:29 -0500 (EST)
From: Terry Kennedy <TERRY@tmk.com>
To: freebsd-stable@freebsd.org, freebsd-fs@freebsd.org
Message-id: <01NU7TBBN3D000BCHX@tmk.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=us-ascii
Cc: 
Subject: ZFS panic after replacing log device
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 13 Nov 2010 23:35:47 -0000

I'm posting this to the freebsd-stable and freebsd-fs mailing lists. Followups
should probably happen on freebsd-fs.

I have a ZFS pool configured as:

zpool create data raidz da1 da2 da3 da4 da5 raidz da6 da7 da8 da9 da10 
raidz da11 da12 da13 da14 da15 spare da16 log da0

where da1-16 are WD2003FYYS drives (2TB RE4) and da0 is a 256GB PCI-Express
SSD (name omitted to protect the guilty).

The SSD has been dropping offline randomly - it seems that one or more flash 
modules pop out of their sockets and need to be re-seated frequently for some 
reason.

The most recent time it did that, I replaced the SSD with another one (for some 
reason, the manufacturer ties the flash modules to a particular controller, so 
just moving the modules results in an offline SSD and inability to manage it 
due to "license limits exceeded" or some such nonsense).

ZFS wasn't happy with the log device being changed, and reported it as 
corrupted, with the suggested corrective action being to "zpool clear" it. I 
did that, and then did a "zpool replace data da0 da0" and it claimed to 
successfully resilver it. I then did a "zpool scrub" and the scrub completed 
with no errors. So far, so good.

However, any attempt to write to the array results in a near-immediate panic:

panic: solaris assert: sm->sm_space + size <= sm->sm_size, file: 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, 
line: 93 cpuid=2

(Screenshot at http://www.tmk.com/transient/zfs-panic.png in case I mis-typed
something).

This is repeatable across reboot / scrub / test cycles. System is 8-STABLE as 
of Fri Nov  5 19:08:35 EDT 2010, on-disk pool is version 4/15, same as the 
kernel.

I know that certain operations on log devices aren't supported until pool 
version 19 or thereabouts, but the error messages and zpool command results 
gave the impression that what I was doing was supported and worked (when it 
didn't). If this is truly a case of "you can't do that in pool version 15",
perhaps a warning could be added so users don't get fooled into thinking it
worked?

I can give a developer remote console / root access to the box if that would 
help. I have a couple days before I will need to nuke the pool and restore it 
from backups.

        Terry Kennedy             http://www.tmk.com
        terry@tmk.com             New York, NY USA